How to Create a Fact Table on Hadoop with SQL Server 2012 PDW and PolyBase




The learning curve is too steep, and people end up acting as 24x7 systems
Analysts have to learn MapReduce.
IT has to move data from HDFS into the warehouse or data mart manually before any analysis can start.

End to end (diagram labels): sensors, devices, bots, crawler; HDInsight (Hadoop) on Windows Azure and Windows Server; SQL Server 2012 PDW with MPP + PolyBase (join relational data with data from Hadoop); SQL Server; New Model, Tabular; Excel, Report, Dashboard.

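The rise of new applications
Large volumes of non-relational data are being generated.
Advanced data analysis faces new challenges.
Technology is needed to integrate non-relational data.

Non-relational data (Hadoop): Social Apps, Mobile Apps, Sensor & RFID, Web Apps
Relational data (RDBMS): traditional schema-based DW applications

How do we bridge the gap between the two?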

R&D background
Developed in close collaboration between Microsoft's Jim Gray Systems Lab, led by database pioneer David DeWitt, and the PDW team.

SQL Server 2012 PDW feature goals
1. Simple
   o Query data in Hadoop with standard T-SQL
2. Performance
   o Read from and write to Hadoop in parallel
3. Open
   o Support for multiple Hadoop distributions
4. Tight integration with Microsoft Office BI tools
   o Excel's PowerPivot, Power View, SQL Server Reporting Services and Analysis Services

1. External Table: defines the schema
2. Enhanced PDW query engine: generates the DSQL and Hadoop HDFS statements
3. HDFS Bridge: moves the data

Diagram labels: Social Apps, Mobile Apps, Sensor & RFID, Web Apps (non-relational data on the Hadoop data nodes); traditional schema-based DW applications (relational data in PDW V2); regular T-SQL; enhanced PDW query engine; external table; HDFS bridge; results.

Create the External Table from HDFS: FactSales_HX

Select the External Table: FactSales_HX

Creating an External Table

I. Query data in HDFS through an external table and present it as a table.
II. Join data in HDFS with relational data in PDW.

Running example: creating the external table ClickStream over a text file in HDFS, with the field delimiter given by FIELD_TERMINATOR.

    -- Define the schema over the HDFS file; the data itself stays in Hadoop.
    CREATE EXTERNAL TABLE ClickStream (
        url        varchar(50),
        event_date date,
        user_ip    varchar(50)
    )
    WITH (
        LOCATION = 'hdfs://myhadoop:5000/tpch1gb/employee.tbl',
        FORMAT_OPTIONS (FIELD_TERMINATOR = ' ')
    );

Query examples

1. Filter query against data in HDFS:

    SELECT TOP 10 url
    FROM ClickStream
    WHERE user_ip = '192.168.0.1';

2. Join data from various files in HDFS (Url_Descr is a second text file):

    SELECT url.description
    FROM ClickStream cs, Url_Descr url
    WHERE cs.url = url.name AND cs.url = 'www.cars.com';

3. Join data from HDFS with data in PDW (User is a distributed PDW table):

    -- User is a reserved word in T-SQL, so the table name is bracketed.
    SELECT user_name
    FROM ClickStream cs, [User] u
    WHERE cs.user_ip = u.user_ip AND cs.url = 'www.microsoft.com';

Creating an External Table with SELECT (CETAS)

Parallelized. Example:

    -- Export rows from a PDW table to HDFS and expose them as an external table.
    CREATE EXTERNAL TABLE ClickStream
    WITH (
        LOCATION = 'hdfs://myhadoop:5000/users/outputdir',
        FORMAT_OPTIONS (FIELD_TERMINATOR = ' ')
    )
    AS SELECT url, event_date, user_ip FROM ClickStream_PDW;

Diagram labels: retrieval of PDW data; CETAS; external table; enhanced PDW query engine; results; DMS Reader 1 to DMS Reader N; parallel export; HDFS bridge; parallel HDFS writes to the data nodes; relational data in PDW V2 (traditional schema-based DW applications); non-relational data (Social Apps, Mobile Apps, Sensor & RFID, Web Apps).

Create an External Dimension Table on HDFS on the Fly

Select from the External Dimension Table on HDFS on the Fly

Check the Table's Volume: FactSales

Create an External Fact Table on HDFS on the Fly
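
A minimal sketch of what this step could look like, assuming the CETAS syntax from the ClickStream example applies unchanged. Only the table names FactSales and FactSales_H come from the slides; the HDFS output path and the '|' delimiter are hypothetical choices made for illustration.

```sql
-- Sketch only: the LOCATION path and the field delimiter are assumptions.
CREATE EXTERNAL TABLE FactSales_H
WITH (
    LOCATION = 'hdfs://myhadoop:5000/users/outputdir/FactSales_H',
    FORMAT_OPTIONS (FIELD_TERMINATOR = '|')
)
AS SELECT * FROM FactSales;  -- copy every row of the PDW fact table to HDFS
```

Because the export runs in parallel, the data may land on HDFS as several files rather than one.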

Check & Compare the External Fact Table's Volume: FactSales_H
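
One way such a comparison might be written, using only the table names shown on the slides. If both counts match, the external table on HDFS holds as many rows as the PDW fact table.

```sql
-- Row counts of the PDW table and its HDFS-backed counterpart, side by side.
SELECT 'FactSales (PDW)' AS source, COUNT_BIG(*) AS row_count FROM FactSales
UNION ALL
SELECT 'FactSales_H (HDFS)', COUNT_BIG(*) FROM FactSales_H;
```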

Select the External Fact Table: FactSales_H

External Table Located on Hadoop

View the Structure of External Dimension Table: Single File

View the Structure of External Fact Table: 16 Files

Join the PDW & External Table:

External Shuffle Move from Hadoop

Round-Tripping Data Through HDFS with the Export Capability

1. Import data from HDFS in parallel.
2. Join the HDFS data with data in PDW.
3. Export the data to HDFS in parallel.

Example:

    -- 3. New external table created with the results of the join
    CREATE EXTERNAL TABLE ClickStream_UserAnalytics
    WITH (
        LOCATION = 'hdfs://myhadoop:5000/users/outputdir',
        FORMAT_OPTIONS (FIELD_TERMINATOR = ' ')
    )
    AS SELECT user_name, user_location, event_date, user_ip
    FROM ClickStream c,       -- 1. External table referring to data in HDFS
         User_PDW u           -- 2. PDW data
    WHERE c.user_id = u.user_id;  -- 2. Joining incoming data from HDFS with PDW data

Join the PDW & External Table, Then Write the Result to Hadoop

Select the Round-Tripping External Table

Work with Big Data using simple tools
Big Data results in the cloud
A new approach that meets the need for real-time queries

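Business unit executives and frontline staff <BI data viewing tools>: browsers and devices

BI presentation layer (on-premises and cloud): business intelligence portal, Excel Services, Power Pivot, reports, dashboards, decomposition tree, ad hoc analysis, Power BI

BU application managers and analysts <report and performance design tools layer>:
   Professional users: professional performance design tools (Report Builder, Power View)
   Users: simple report and performance design tools

Microsoft BI / core database platform (BI core integrated database layer):
   Reporting: Reporting Services
   Online analytical processing and data mining: Cube/Tabular/Data Mining
   Database and data warehouse: SQL Server DBMS/PDW
   Data integration: SSIS

Diverse heterogeneous data sources: SQL, DB2, Oracle, Access, XML, text file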

1. Power Query
2. PolyBase
3. Power Pivot
4. Power View

Combining familiar Office self-service BI tools with a powerful cloud platform
1 billion Office users
1/4 of enterprise customers are on Office 365
Discover, Analyze, Visualize, Share, Find, Q&A, Mobile
Scalable, Manageable, Trusted

DirectQuery Mode (ROLAP)
Benefits of DirectQuery: real-time data, with no batch processing required

Familiar Tools to Analyze Structured and Unstructured Data
Hadoop data and structured data
High adoption of Excel
No IT intervention
Analyze all data types