SQL Server 2012 PDW
Ryan Simpson, Technical Solution Professional PDW, Microsoft
Microsoft SQL Server 2012 Parallel Data Warehouse



Transcription:


Massively Parallel Processing Platform
Delivers Big Data
HDFS Delivers Scale Out
But we also need: Highly Concurrent, Complex Workloads, Appliance

Hadoop Ecosystem
Hadoop = MapReduce + HDFS
HDFS (Hadoop Distributed File System)
MapReduce (Job Scheduling/Execution System)
Other components: Zookeeper, Ambari, Apache HCatalog, Oozie, HBase/Cassandra/Couch/MongoDB, Hive, Mahout, R, Cascading, Pig, Flume, Sqoop, HBase (Column DB), Avro

Scale Out SQL Server
Microsoft SQL Server 2012: Massive Parallel Processing Platform, Scale Out
Nodes: Control node (Co-ordinator), Management node, Redundant node, SQL Instance #1 to SQL Instance #6
But we need: Highly Concurrent, Complex Workloads, True Appliance
A single SQL instance represents a single pipe for workloads from storage through RAM and cores. PDW has multiple instances, so multiple pipes run in parallel.
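The "multiple pipes in parallel" idea can be sketched in a few lines of Python. This is only an illustration, not how PDW is implemented: the hash function, the `customer` distribution key, and the helper names are all assumptions made for the example. Each of six simulated instances scans only its own slice of the rows, and the partial results are combined at the end.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_INSTANCES = 6  # the slide shows SQL Instance #1 to #6


def instance_for(key: str) -> int:
    """Pick an instance by hashing a distribution key (assumed scheme)."""
    return crc32(key.encode("utf-8")) % NUM_INSTANCES


def distribute(rows):
    """Split rows into one bucket per simulated instance."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[instance_for(row["customer"])].append(row)
    return [buckets[i] for i in range(NUM_INSTANCES)]


def local_total(bucket):
    """Work done by one instance: scan only its own slice."""
    return sum(row["amount"] for row in bucket)


def parallel_total(rows):
    """Run the local scans side by side, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=NUM_INSTANCES) as pool:
        partials = list(pool.map(local_total, distribute(rows)))
    return sum(partials)


# Hypothetical sample data.
rows = [{"customer": f"c{i}", "amount": i} for i in range(1000)]
total = parallel_total(rows)
```

The point of the sketch is that no single worker touches all 1000 rows, which is the single-pipe bottleneck the slide describes.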

How do you run a parallel query?
1. User issues a query.
2. Query is sent to the Shell through the sp_showmemo_xml stored procedure. SQL Server (shell) performs the parsing, binding, authorization. The SQL optimizer generates execution alternatives.
3. MEMO containing candidate plans, histograms, data types is generated.
4. Parallel execution plan is generated.
5. Parallel plan is executed on the compute nodes.
6. Result gets returned to the user.
Diagram: Control Node (Shell Appliance (SQL Server), Engine Service, MEMO) sends Plan Steps to each Compute Node (SQL Server); the SELECT goes in and the result is returned.
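Steps 4 to 6 above follow a scatter-and-gather pattern, which the toy Python sketch below tries to show. It does not model the MEMO or the optimizer; the node contents, the `sum_by_region` operation, and all function names are hypothetical stand-ins for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data already spread over three compute nodes.
COMPUTE_NODES = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("north", 3)],
    [("west", 8), ("north", 1)],
]


def make_plan_steps(node_count):
    """Step 4: one plan step per compute node (toy stand-in for a real plan)."""
    return [{"node": n, "op": "sum_by_region"} for n in range(node_count)]


def run_step(step):
    """Step 5: a compute node aggregates only the rows it stores."""
    partial = Counter()
    for region, amount in COMPUTE_NODES[step["node"]]:
        partial[region] += amount
    return partial


def control_node_query():
    """Steps 4 to 6: build steps, run them in parallel, merge, return."""
    steps = make_plan_steps(len(COMPUTE_NODES))
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        partials = list(pool.map(run_step, steps))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds the counts together
    return dict(merged)


result = control_node_query()
```

Here the control node never reads raw rows itself; it only hands out steps and merges the small per-node results.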

Appliance: Run Sooner, Faster for Longer
Predictable
DW Best Practices in a box
Deploy Fast and Drive Value
HA, Keeps on going
Add Capacity

BI on steroids
In Memory Performance
Next Gen Column Store
Increased Parallel RAM
Columns & Cubes map
ROLAP

Demo
PDW the Appliance
MPP PDW
Getting Data In
BI on steroids
Big Data on Your Terms

Complete Platform: Cloud or Appliance
New: Unstructured Big Data, PolyBase
Self Serve Reporting & Analysis
SQL, SQL Integration Services, HDInsight, Hadoop
SQL spoke: Advanced Analytics, Data mining, Multiple Business Units, Prototyping

Scale Out Big Data on SQL Server
Nodes: Control node (Co-ordinator), Management node, Redundant node, SQL Instance #1 to SQL Instance #6, HDFS Node #1, HDFS Node #2
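Placing HDFS nodes next to the SQL instances suggests that file data and warehouse tables can be queried together. The Python sketch below illustrates only that general idea and is not the PolyBase API: the pipe-delimited text, the `customers` table, and the function names are invented for the example.

```python
import csv
import io

# Hypothetical relational table held in the warehouse.
customers = {1: "Contoso", 2: "Fabrikam"}

# Hypothetical delimited text standing in for a file stored on HDFS.
hdfs_file = io.StringIO("1|click\n2|view\n1|purchase\n")


def external_rows(handle):
    """Expose the delimited file as typed rows, as an external table might."""
    for customer_id, event in csv.reader(handle, delimiter="|"):
        yield int(customer_id), event


def join_events(customers, handle):
    """Join file rows to warehouse rows on customer id."""
    return [
        (customers[cid], event)
        for cid, event in external_rows(handle)
        if cid in customers
    ]


joined = join_events(customers, hdfs_file)
```

The file is read in place and joined at query time, so nothing has to be loaded into the warehouse first in this toy version.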

Large Scale Success
NASDAQ: equity trading compliance (450 TB)
Direct Edge (3rd largest NA stock exchange): analysis of trade execution effectiveness (150 TB)
GroupM (world's largest media company): analysis and optimization of online advertising campaigns
AMD: wafer electrical test data adds an additional terabyte every week. 50 sophisticated business analysts mine data and run thousands of ad hoc queries
NHS: consolidation of data warehouses to provide regional / national BI and data services. Live within 3 months
Walmart: On Shelf Availability solution delivering intraday stock alerts. 1000s of stores x 10,000s of SKUs & sales every hour

10 X 42 X 100 in 3 Weeks
Perform 10 times faster over 42 times more data with 100x more concurrent users.
Existing DL980:
3 databases: RAW, Staged & Data Warehouse
8 hours to load raw data
24 hours to initialise data warehouse
Single-user query on 8 weeks of data took >120 mins
After MPP:
17 minutes to load raw data
1 hour to initialise data warehouse (end to end)
200 concurrent users running same query: <12 mins on 7 years of data
Rapidly built prototyping datasets
"I can run queries I only dreamed of on PDW"
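The timing figures on this slide can be turned into speedup ratios with a little arithmetic, sketched below in Python. The dictionary keys are just labels chosen for the example. The query times on the slide are bounds (>120 mins before, <12 mins after), so the query ratio computed here should be read as "at least", not as an exact measurement.

```python
MINUTES_PER_HOUR = 60

# Times in minutes, taken from the slide.
before = {
    "load": 8 * MINUTES_PER_HOUR,   # 8 hours to load raw data
    "init": 24 * MINUTES_PER_HOUR,  # 24 hours to initialise the warehouse
    "query": 120,                   # slide says ">120 mins", so a lower bound
}
after = {
    "load": 17,                     # 17 minutes
    "init": 1 * MINUTES_PER_HOUR,   # 1 hour
    "query": 12,                    # slide says "<12 mins", so an upper bound
}

# Speedup = time before / time after, per task.
speedup = {name: before[name] / after[name] for name in before}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.1f}x faster")
```

The query ratio matches the "10 times faster" headline; the load and initialise ratios are larger than that headline figure.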

Big Data on Your Terms
Single Cohesive SQL Based Big Data Solution
Ready now and future proofed: PolyBase will just get better
Truly Big Data Architecture:
Familiar tooling
Familiar way of working for users
Easy to manage and scale

More Info
www.microsoft.com/casestudies
Search products > SQLServer 2008 -> PDW
1. Hy-Vee Retailer
2. AMD
3. CROSSMARK Retail Analytics
4. DIRECT EDGE Stock Exchange
5. Smartro Financial Services
6. Microsoft Clickstream Analysis
www.microsoft.com/pdw