Modern Data Warehousing




Modern Data Warehousing
Cem Kubilay, Microsoft CEE, Turkey & Israel

Time is FY15

Gartner Survey, April 2014
[Chart: Hadoop adoption status. Categories: piloting on-premises, piloting in the cloud, production on-premises with cluster, production on-premises with appliance. Values shown: 15%, 10%, 4%, 14%, 57%.]
2014: 5% think Hadoop will replace the existing DW solution (2013: 10%). That is a 50% decline, even with the presence of Hadoop 2.0.

Data sources

Data sources: non-relational data

The EDW also powers BI and other analytical solutions that offer business insight
[Diagram: the Enterprise Data Warehouse feeds reports and dashboards for the Fraud Department, the Analytics Department, the Finance Department and information workers.]

Reducing Cycle Time in the Enterprise
[Figure: value plotted against time. Business event, then data latency, then data captured, then analysis latency, then intelligence delivered, then decision latency, then action taken. The whole span from business event to action taken is the action time, or action distance. The slide is shown in four builds; the later builds move "intelligence delivered" and "action taken" earlier, and the last build adds the EDW label.]
Source: TDWI, The Business Case for Real-Time BI. Based on a concept developed by Richard Hackathorn, Bolder Technology.

SQL Server Parallel Data Warehouse
Microsoft Analytics Platform System

Storage managed by Windows Storage Spaces
Each server has 256 GB RAM
Can scale up to 6 PB
Starts with ¼ rack (2 servers)
In-memory analytics
Integrated "Big Data" analytics

PARALLEL QUERY EXECUTION
Table with 10,000 distinct cust_ids, distributed on cust_id

Query sent to the Control Node:
    SELECT cust_id, SUM(units) FROM [sales] GROUP BY [cust_id]

Queries run on the compute nodes (Compute Node 1 ... Compute Node 10, each a SQL Server instance):
    Dist. 1:   SELECT cust_id, SUM(units) FROM sales_1 GROUP BY [cust_id]    (125 rows)
    ...
    Dist. 8:   SELECT cust_id, SUM(units) FROM sales_8 GROUP BY [cust_id]    (125 rows)
    ...
    Dist. 73:  SELECT cust_id, SUM(units) FROM sales_73 GROUP BY [cust_id]   (125 rows)
    ...
    Dist. 80:  SELECT cust_id, SUM(units) FROM sales_80 GROUP BY [cust_id]   (125 rows)

DIRECT RESULTS
Fully parallel query execution
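The slide above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how PDW is implemented: the function names are hypothetical, the sample data is invented, and `cust_id % 80` stands in for the real distribution hash so that the 10,000 customer ids spread evenly at 125 per distribution, as on the slide.

```python
from collections import defaultdict

NUM_NODES = 10                      # compute nodes on the slide
DIST_PER_NODE = 8                   # Dist. 1..8 on node 1, ..., Dist. 73..80 on node 10
NUM_DIST = NUM_NODES * DIST_PER_NODE


def distribution_of(cust_id):
    """Assumed stand-in for the distribution hash on cust_id."""
    return cust_id % NUM_DIST


def load_sales(rows):
    """Place each (cust_id, units) row in the distribution its cust_id maps to."""
    dists = [[] for _ in range(NUM_DIST)]
    for cust_id, units in rows:
        dists[distribution_of(cust_id)].append((cust_id, units))
    return dists


def local_group_by(dist_rows):
    """SELECT cust_id, SUM(units) ... GROUP BY cust_id on one distribution."""
    totals = defaultdict(int)
    for cust_id, units in dist_rows:
        totals[cust_id] += units
    return dict(totals)


def run_query(dists):
    """Run the same aggregate on every distribution and concatenate the results.

    Because the table is distributed on the GROUP BY column, a cust_id never
    appears in two distributions, so no second aggregation step is needed.
    """
    partials = [local_group_by(d) for d in dists]
    result = {}
    for partial in partials:
        result.update(partial)
    return partials, result


# Invented sample data: three sales rows per customer, 10,000 customers.
rows = [(cust_id, units) for cust_id in range(10_000) for units in (1, 2, 3)]
dists = load_sales(rows)
partials, result = run_query(dists)
```

Each of the 80 partial results holds 125 groups, and their union is the full 10,000-row answer. The control node only has to pass the rows through, which is what "direct results" refers to.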

Access Data Faster: In-Memory for Real-Time
Powered by in-memory columnstore in PDW and SQL Server
[Diagram: tables Customer, Products, Sales, Supplier, Country.]

Connecting Islands of Data with PolyBase
Bringing Hadoop point solutions and the data warehouse together for users and IT

Single T-SQL query model for PDW and Hadoop
Rich features of T-SQL, including joins, without ETL
Enhanced query execution performance using the power of massively parallel processing

Open and collaborative platform
Supports Windows Azure HDInsight to enable new hybrid cloud scenarios
Queries non-Microsoft Hadoop distributions such as Hortonworks and Cloudera

[Diagram: a SELECT is sent to SQL Server Parallel Data Warehouse, which uses PolyBase to reach Hadoop (Hortonworks on Windows Server, Windows Azure HDInsight / Microsoft HDInsight, Cloudera and Hortonworks on Linux) and returns a result set.]

1. Export cold data to Hadoop
[Diagram: SQL Server PDW and Hadoop.]

5. Combine data from different sources
[Diagram: Hadoop and SQL Server PDW.]

Query: join between an HDFS table and a PDW table
    select c.*, o.*
    from pdwcustomer c, hdfsorders o, Cloud_Twitter ct
    where c.c_custkey = o.o_custkey and ct.name = c.name and o_orderdate < '9/1/2010'

Execution plan:
    1. Run Map job on Hadoop: apply filter to hdfsorders and Cloud_Twitter, put data into temp tables
    2. CREATE otemp, cttemp on PDW compute nodes, distributed on o_custkey
    3. DMS SHUFFLE FROM HDFS on o_custkey: read hdfstemp into otemp, partitioned on o_custkey
    4. RETURN OPERATION:
       select c.*, o.* from Customer c, otemp o, cttemp ct
       where c.c_custkey = o.o_custkey and ct.name = c.name
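The four plan steps above can be walked through in a small simulation. It is a sketch only, with several assumptions: all function names and sample rows are hypothetical, four distributions with a modulo hash stand in for the real layout, the date 9/1/2010 is read as 1 September 2010, and the Cloud_Twitter table is left out to keep the example short. It does not use any PolyBase API.

```python
from datetime import date

NUM_DIST = 4  # hypothetical; small so the example is easy to follow


def dist_of(custkey):
    """Assumed stand-in for the distribution hash on the customer key."""
    return custkey % NUM_DIST


def hadoop_filter(hdfs_orders, cutoff):
    """Step 1: the filter runs on the Hadoop side, so only matching rows move."""
    return [o for o in hdfs_orders if o["o_orderdate"] < cutoff]


def shuffle(rows, key):
    """Steps 2-3: create per-distribution temp tables and route rows by key."""
    parts = [[] for _ in range(NUM_DIST)]
    for row in rows:
        parts[dist_of(row[key])].append(row)
    return parts


def local_join(customers, orders):
    """Step 4, on one distribution: join customers to orders on the customer key."""
    by_key = {c["c_custkey"]: c for c in customers}
    return [
        {**by_key[o["o_custkey"]], **o}
        for o in orders
        if o["o_custkey"] in by_key
    ]


def run(pdw_customers, hdfs_orders, cutoff):
    otemp = shuffle(hadoop_filter(hdfs_orders, cutoff), "o_custkey")
    # The PDW customer table is assumed to be distributed on c_custkey already.
    cust_parts = shuffle(pdw_customers, "c_custkey")
    result = []
    for cust, orders in zip(cust_parts, otemp):
        result.extend(local_join(cust, orders))  # each distribution joins locally
    return result


customers = [{"c_custkey": k, "name": f"cust{k}"} for k in range(1, 7)]
orders = [
    {"o_custkey": 1, "o_orderdate": date(2010, 8, 15), "o_total": 10},
    {"o_custkey": 2, "o_orderdate": date(2010, 9, 15), "o_total": 20},  # fails the date filter
    {"o_custkey": 5, "o_orderdate": date(2010, 1, 3), "o_total": 30},
    {"o_custkey": 9, "o_orderdate": date(2009, 1, 3), "o_total": 40},   # no matching customer
]
joined = run(customers, orders, date(2010, 9, 1))
```

Shuffling the filtered orders on `o_custkey` puts every order in the same distribution as its customer, so each distribution can join its own slice without moving customer rows.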

Microsoft Analytics Platform System

Performance
Queries 10-100x faster than traditional DW systems
Optimized for mixed workload and near real-time data analysis
Enhanced loading, 2+ TB/hour

Simplicity
Ease of installation, one throat to choke
Ready to go for immediate load and query = fast time to value
No indexing, tuning, data sorting or materialized view maintenance

Value
Non-proprietary
Standards-based architecture reduces risk and cost
Minimal implementation and ongoing administration cost
Lowest full life cycle TCO

[Diagram: source systems load through SSIS or an existing ETL tool into the Microsoft Analytics Platform System, which serves Microsoft SSAS (ROLAP, MOLAP), Microsoft SSRS and direct access; these feed SharePoint dashboards, SharePoint scorecards, Excel workbooks and PowerPivot applications.]

APS & Power BI better together: APS as an on-premises data hub for Power BI
1. APS integrates Data Mgmt Gateway as region out of box, enabling Tier 1 gateway hub for the enterprise.
2. Gateway registers with Azure and discovers on-premises assets, enabling users to query on-premises data via Power BI.
3. APS scales gateway workload performance/concurrency or complex mashups across sources for Power BI, enabling:
   1. Multiple joins
   2. Large amounts of data
   3. Different data formats/sources
   4. Compute pushdown for Hadoop etc.
4. APS scales the gateway platform, improving resiliency, HA and management.
[Diagram: O365 Power BI with its metadata catalog, HDI and Azure DB sit on the public internet in Azure; the secure gateway connects them to the intranet, where APS (PDW, HDI) fronts SQL assets, Hadoop and 3rd-party sources.]

Some Customer References (Microsoft Internal)

Value of APS