Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860
Unlock Insights on Any Data: Taking an End-to-End Approach to BI and Analytics. Modernizing Your Data Warehouse for Hadoop
The traditional data warehouse: "Data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing." Gartner, The State of Data Warehousing in 2012
The traditional data warehouse: (1) Increasing data volumes (2) Real-time data (3) New data sources and types (4) Cloud-born data
The modern data warehouse
Microsoft's modern data warehouse SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform
Scale out relational data to petabytes: From terabytes to multi-petabytes. Scale-out technologies in Analytics Platform System. [Diagram: APS / HDInsight units added side by side, with APS capacity growing from 0 TB to 6 PB]
Scale Out non-relational data Scale out big data Scale out non-relational data in HDInsight (for Microsoft Azure or APS)
In-memory performance In-memory Columnstore for next-generation performance Columnstore index representation
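As a rough illustration of why a column-oriented layout suits analytic queries, the sketch below holds the same small table row-wise and column-wise, then aggregates one column. The table, the column names, and the run_length_encode helper are hypothetical, and this is not a description of how the SQL Server columnstore index is implemented, only of the general idea of columnar storage with compression.

```python
from itertools import groupby

# Hypothetical sales rows (row-oriented layout): one tuple per record.
rows = [
    ("2014-01-01", "WA", 100),
    ("2014-01-01", "WA", 250),
    ("2014-01-02", "CO", 75),
    ("2014-01-02", "CO", 125),
]

# Column-oriented layout: one list per column.
columns = {
    "date": [r[0] for r in rows],
    "state": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def run_length_encode(values):
    """Compress consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# An aggregate over one column reads only that column's values.
total_amount = sum(columns["amount"])

# Repetitive columns compress well, which keeps more data in memory.
encoded_state = run_length_encode(columns["state"])
```

A query that sums amounts never has to read the date or state values, and a column with long runs of repeated values shrinks considerably once encoded.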
Concurrency and mixed workloads Great performance for mixed workloads Query Results
Near real-time insights Real-time with complex event processing Event Sources Event Targets
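One common building block of complex event processing is a windowed aggregate over an event stream. The sketch below is a minimal, assumed example: it groups timestamped readings into fixed (tumbling) windows and averages each window. The function name, the event shape, and the sample readings are hypothetical, and a production engine would process events incrementally as they arrive rather than over a finished list.

```python
from collections import defaultdict


def tumbling_window_averages(events, window_seconds):
    """Group (timestamp, value) events into fixed windows and average each."""
    buckets = defaultdict(list)
    for timestamp, value in events:
        # Every event belongs to exactly one window, keyed by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# Hypothetical sensor readings: (seconds since start, reading).
readings = [(0, 10.0), (3, 20.0), (7, 30.0), (12, 50.0)]
averages = tumbling_window_averages(readings, window_seconds=5)
```

Here the event sources would feed the readings and the event targets would receive one average per five-second window.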
What is big data? [Chart: data volume (up to petabytes) against data complexity (variety and velocity)]
What is Hadoop? Distributed, scalable system on commodity hardware. Operational services and data services: Ambari, Oozie, Falcon, Flume, Sqoop, HBase, Pig, Hive & HCatalog. Load & extract: NFS, WebHDFS. Core services: MapReduce, YARN, HDFS. Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware. [Diagram: a Hadoop cluster as a row of compute & storage nodes]
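To make the MapReduce processing model concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It only imitates the programming model: on a real cluster the map and reduce tasks run in parallel across the compute & storage nodes and read their input from HDFS. The function names and the input lines are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


# Hypothetical input: each line could live on a different node.
lines = ["big data small data", "all data"]
mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because each map call sees only its own line and each reduce call sees only one key, the work can be spread over as many nodes as the data requires.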
IT infrastructure optimization; Legal discovery; Social network analysis; Traffic flow optimization; Web app optimization; Churn analysis; Natural resource exploration; Weather forecasting; Healthcare outcomes; Fraud detection; Life sciences research; Advertising analysis; Equipment monitoring; Smart meter monitoring
Hadoop offerings on-premise and cloud Real-time with complex event processing Microsoft Azure
Integrate relational data and Hadoop: Integrated query with PolyBase in APS. [Diagram: a SQL Select issued through PolyBase in Analytics Platform System returns a single result set; Hadoop sources shown are Microsoft HDInsight, Microsoft Azure HDInsight, Hortonworks (Windows, Linux), and Cloudera]
Microsoft's modern data warehouse SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform
Freedom of deployment options and hybrid solutions
Appliance vs. Reference Architecture
Buying an appliance: Order SKU from a list of configuration options; Factory builds hardware & tests; Hardware vendor installs & connects; Microsoft validates function & performance; Hands over the keys to the customer; Microsoft is the single point of contact for support
Reference Architecture: Order from a BOM; Customer builds & configures; Installs software, drivers, firmware, etc.; Customer manages multiple support channels
Sign up for a free architectural design session for APS with your Microsoft rep
Visit Analytics Platform System at http://www.microsoft.com/aps
Try HDInsight at http://www.windowsazure.com/bigdata
Try SQL Server for data warehousing in Microsoft Azure VMs at http://www.windowsazure.com
Try Hortonworks Data Platform for Windows at http://www.hortonworks.com/products/hdp-windows/
Try SQL Server 2014 at http://www.microsoft.com/sql/sql-server-2014.aspx
Growth Topology: PDW Region Only. Scale Unit, Base Unit, Base Unit Extension
Growth Topologies Hadoop Region Extend Min
About Analytics Platform System SQL Server Parallel Data Warehouse PolyBase Microsoft HDInsight
About Hortonworks Data Platform For Windows
About Microsoft Azure HDInsight Microsoft Azure
Microsoft Contributions to Hadoop: 6,000+ engineering hours; Hive (improve performance 40x with Stinger); Contributed FileSystem implementation for Microsoft Azure Storage; HDFS permissions model mapped to Windows; HDP 2.0; 25,000+ code line contributions; Windows, a first-class OS for Hadoop; REEF for creation and execution of machine learning jobs
Hortonworks and Microsoft Engineering alignment Corporate alignment Field Alignment
✓ Data sources Non-Relational Data
Microsoft Azure ✓
PDW Customers
HDInsight Customers