SQL Server 2012 PDW
Ryan Simpson, Technical Solution Professional PDW, Microsoft
Microsoft SQL Server 2012 Parallel Data Warehouse
Massively Parallel Processing Platform Delivers Big Data
HDFS delivers scale out.
But we also need:
Highly Concurrent Complex Workloads
Appliance
Hadoop Ecosystem
Hadoop = MapReduce + HDFS
HDFS (Hadoop Distributed File System)
MapReduce (Job Scheduling/Execution System)
HBase (Column DB)
Components: Zookeeper, Ambari, Apache HCatalog, Oozie, HBase/Cassandra/Couch/MongoDB, Hive, Mahout, R, Cascading, Pig, Flume, Sqoop, Avro
Scale OUT SQL Server
Massive Parallel Processing Platform: Scale Out
But we need: Highly Concurrent Complex Workloads, True Appliance
Control node (Co-ordinator), Management node, Redundant node
SQL Instance #1, SQL Instance #2, SQL Instance #3, SQL Instance #4, SQL Instance #5, SQL Instance #6
A single SQL instance represents a single pipe for workloads from storage through RAM and cores. PDW has multiple instances, so there are multiple pipes in parallel.
How do you run a parallel query?
1. User issues a query.
2. Query is sent to the Shell through the sp_showmemo_xml stored procedure. SQL Server (shell) performs the parsing, binding and authorization; the SQL optimizer generates execution alternatives.
3. MEMO containing candidate plans, histograms and data types is generated.
4. Parallel execution plan is generated.
5. Parallel plan is executed on the compute nodes.
6. Result gets returned to the user.
Diagram: Control Node (Engine Service, Shell Appliance (SQL Server), MEMO) sends Plan Steps to three Compute Nodes (SQL Server); the SELECT goes in and the result is returned.
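The flow on this slide (a control node hands plan steps to compute nodes and combines what they return) can be sketched as a toy scatter-gather in Python. This is an illustration only, not PDW's engine: the names `distribute`, `compute_node_step` and `control_node_query`, the three-node count, the hash-on-`store_id` distribution and the sample `sales` rows are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_COMPUTE_NODES = 3  # hypothetical; matches the three nodes drawn on the slide


def distribute(rows, key, num_nodes):
    """Assign each row to a node by hashing a column (assumed distribution scheme)."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes


def compute_node_step(rows):
    """Plan step run on one compute node: partial SUM(amount) GROUP BY store_id."""
    partial = defaultdict(int)
    for row in rows:
        partial[row["store_id"]] += row["amount"]
    return dict(partial)


def control_node_query(rows, num_nodes=NUM_COMPUTE_NODES):
    """Control node: scatter the plan step, then merge the partial results."""
    slices = distribute(rows, "store_id", num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(compute_node_step, slices))
    result = defaultdict(int)
    for partial in partials:
        for store_id, total in partial.items():
            result[store_id] += total
    return dict(result)


# Hypothetical sample data standing in for a distributed fact table.
sales = [
    {"store_id": 1, "amount": 10},
    {"store_id": 2, "amount": 5},
    {"store_id": 1, "amount": 7},
    {"store_id": 3, "amount": 2},
    {"store_id": 2, "amount": 1},
]

totals = control_node_query(sales)
print(sorted(totals.items()))  # [(1, 17), (2, 6), (3, 2)]
```

Each node only touches its own slice of the rows, which is the "multiple pipes in parallel" idea: adding nodes shrinks the slice each one has to scan, while the control node does only the cheap final merge.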
Appliance: Run Sooner, Faster for Longer
Predictable DW Best Practices in a box
Deploy Fast and Drive Value
HA, Keeps on going
Add Capacity
BI on steroids
In Memory Performance
Next Gen Column Store
Increased Parallel RAM
Columns & Cubes map ROLAP
Demo
PDW the Appliance
MPP PDW
Getting Data In
BI on steroids
Big Data on Your Terms
Complete Platform: Cloud or Appliance
Diagram labels: New Unstructured BigData, Polybase, HDInsight, Hadoop, SQL Integration Services, SQL spoke, Self Serve Reporting & Analysis, Advanced Analytics, Data mining, Multiple Business Units, Prototyping
Scale OUT Big Data on SQL Server
Control node (Co-ordinator), Management node, Redundant node
SQL Instance #1, SQL Instance #2, SQL Instance #3, SQL Instance #4, SQL Instance #5, SQL Instance #6
HDFS Node #1, HDFS Node #2
Large Scale Success
NASDAQ: equity trading compliance (450 TB)
Direct Edge (3rd largest NA stock exchange): analysis of trade execution effectiveness (150 TB)
GroupM (world's largest media company): analysis and optimization of online advertising campaigns
AMD: wafer electrical test data adds an additional terabyte every week; 50 sophisticated business analysts mine data and run thousands of ad hoc queries
NHS: consolidation of data warehouses to provide regional / national BI and data services; live within 3 months
Walmart: On Shelf Availability solution delivering intraday stock alerts; 1000s of stores x 10,000s of SKUs and sales every hour
10 X 42 X 100 in 3 Weeks
Perform 10 times faster over 42 times more data with 100x more concurrent users.
Existing DL980
3 databases: RAW, Staged and Data Warehouse
8 hours to load raw data
24 hours to initialise data warehouse
Single-user query on 8 weeks of data took >120 mins
After MPP
17 minutes to load raw data
1 hour to initialise data warehouse (end to end)
200 concurrent users running the same query: < 12 mins on 7 years of data
Rapidly built prototyping datasets
"I can run queries I only dreamed of on PDW"
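The before and after timings on this slide can be turned into speedup ratios with a few lines of Python. This is a small worked calculation using only the figures quoted above; the dictionary keys are hypothetical labels, and the query ratio is a lower bound because the slide gives ">120 mins" before and "< 12 mins" after.

```python
# Timings in minutes, taken from the slide (key names are illustrative).
before = {"raw_load": 8 * 60, "dw_init": 24 * 60, "query": 120}
after = {"raw_load": 17, "dw_init": 60, "query": 12}

# Speedup = old time / new time for each stage.
speedup = {stage: before[stage] / after[stage] for stage in before}

for stage, ratio in speedup.items():
    print(f"{stage}: {ratio:.1f}x faster")
```

The query figure understates the change, since the "after" run covers 7 years of data with 200 concurrent users, while the "before" run covered 8 weeks with a single user.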
Big Data on Your Terms
Single cohesive SQL-based Big Data solution
Ready now and future proofed: Polybase will just get better
Truly Big Data architecture:
Familiar tooling
Familiar way of working for users
Easy to manage and scale
More Info
www.microsoft.com/casestudies
Search products > SQLServer 2008 -> PDW
1. Hyvee Retailer
2. AMD
3. CROSSMARK Retail Analytics
4. DIRECT EDGE Stock Exchange
5. Smartro Financial Services
6. Microsoft Clickstream Analysis
www.microsoft.com/pdw