SQL Server 2014: Faster Insights from Any Data (Level 300)
Data Explorer Preview for Excel
- Enable self-service data discovery, query, transformation, and mashup experiences for information workers through Excel and PowerPivot
- Discover and connect to a wide range of data sources that span volume and variety of data
- Engage in a highly interactive and intuitive experience for rapidly and iteratively building queries for virtually any data source of virtually any size
- Enjoy a consistent experience and parity of query capabilities over all data sources
- Join data across different sources; create custom views of data that can then be shared with a team or department
Discover, combine, and refine Big Data, small data, and any data
Excel add-in to enhance self-service BI
- Identify and import external data: relational database, Excel, text, XML, OData, webpages, Hadoop (HDFS)
- Discover relevant data by using search
- Combine and transform multiple data sources
Azure SQL Database, Azure HDInsight, Windows Azure Marketplace, Windows Active Directory
[Figure: data volume plotted against velocity, variety, and variability]
Volume axis: gigabytes (10^9), terabytes (10^12), petabytes (10^15), exabytes (10^18)
Storage cost per GB: 1980 $190,000; 1990 $9,000; 2000 $15; 2010 $0.07
Eras shown: ERP / CRM, Web 2.0, Internet of things
Example data types: social sentiment, clickstream, mobile, advertising, payables, payroll, inventory, ecommerce, contacts, deal-tracking, sales pipeline, sensors / RFID / devices, collaboration, digital marketing, search marketing, web logs, recommendations, wikis and blogs, audio and video, log files, spatial and GPS coordinates, data market feeds, egov feeds, weather, text and images
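To make the storage-cost trend on this slide concrete, the sketch below multiplies the per-gigabyte prices quoted above by a data volume. It is only an arithmetic illustration: the function names are hypothetical, and decimal units (1 TB = 1,000 GB) are assumed to match the slide's powers of ten.

```python
# Cost per gigabyte of storage by year, as quoted on the slide (US dollars).
COST_PER_GB = {1980: 190_000.0, 1990: 9_000.0, 2000: 15.0, 2010: 0.07}

# Assumption: decimal units, matching the slide's powers of ten.
GB_PER_TB = 10**3


def cost_to_store(terabytes, year):
    """Return the cost in dollars of storing `terabytes` at that year's price."""
    return terabytes * GB_PER_TB * COST_PER_GB[year]


def price_drop_factor(start_year, end_year):
    """Return how many times cheaper a gigabyte became between two years."""
    return COST_PER_GB[start_year] / COST_PER_GB[end_year]


if __name__ == "__main__":
    for year in sorted(COST_PER_GB):
        print(f"{year}: storing 1 TB costs ${cost_to_store(1, year):,.2f}")
    print(f"1980 to 2010 price drop: {price_drop_factor(1980, 2010):,.0f}x")
```

At these prices, a terabyte that would have cost about $190 million to store in 1980 costs about $70 in 2010, which is the economic shift the slide points to.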
[Figure: Hadoop stack]
Layers: query (Hive), distributed processing (MapReduce), distributed storage (HDFS); ODBC connectivity
Legend (colors red, blue, gray, orange, green, in order): core Hadoop, data processing, Microsoft integration points and value-adds, data movement, packages
MapReduce job stages, in order: record reader, map, combiner, partitioner, shuffle and sort, reduce, output format
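The stages listed above can be sketched as a single-process word count. This is a minimal illustration, not Hadoop's API: every function name is hypothetical, each input string stands in for one input split, and the partitioner is a simple stand-in for a real hash partitioner.

```python
from collections import defaultdict
from itertools import groupby


def record_reader(split):
    """Record reader: turn a raw input split into (key, value) records."""
    for line_no, line in enumerate(split.splitlines()):
        yield line_no, line


def map_fn(_key, line):
    """Map: emit (word, 1) for every word in the record."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-aggregate one map task's output before it is shuffled."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(word, num_reducers):
    """Partitioner: choose a reducer for a key (deterministic stand-in hash)."""
    return sum(word.encode("utf-8")) % num_reducers


def shuffle_and_sort(pairs, num_reducers):
    """Shuffle and sort: route pairs to reducers, then sort each bucket by key."""
    buckets = [[] for _ in range(num_reducers)]
    for word, count in pairs:
        buckets[partition(word, num_reducers)].append((word, count))
    return [sorted(bucket) for bucket in buckets]


def reduce_fn(word, counts):
    """Reduce: sum all partial counts for one key."""
    return word, sum(counts)


def output_format(results):
    """Output format: render results as tab-separated lines."""
    return "\n".join(f"{word}\t{count}" for word, count in results)


def run_job(splits, num_reducers=2):
    """Run all stages over the given splits and return sorted (word, count)."""
    combined = []
    for split in splits:  # each split is handled by one simulated map task
        mapped = [kv for key, line in record_reader(split)
                  for kv in map_fn(key, line)]
        combined.extend(combine(mapped))
    results = []
    for bucket in shuffle_and_sort(combined, num_reducers):
        for word, group in groupby(bucket, key=lambda kv: kv[0]):
            results.append(reduce_fn(word, [count for _, count in group]))
    return sorted(results)


if __name__ == "__main__":
    print(output_format(run_job(["big data big", "data small data"])))
```

The combiner is optional in a real job; it is included here because the slide lists it, and it reduces the number of pairs that have to be shuffled.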
- Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus, and more
- C#, F# MapReduce, LINQ to Hive, and Microsoft .NET management clients
- JavaScript MapReduce, browser-hosted console, and Node.js management clients
- Windows PowerShell and cross-platform CLI tools
Authoring jobs; app integration; extends breadth and depth; enables new scenarios; integrates with current tool chains; lightweight; low cost to extend; scenario-oriented; innovation flows upward; new compute models; performance enhancements; authoring frameworks and languages; connectivity; programmability; security; loosely coupled
Insights to all users by activating new types of data
SQL Server PDW querying HDFS data in place [Diagram: HDFS and DB]
(a) PDW query in, results out [Diagram: Hadoop HDFS and DB]
(b) PDW query in, results stored in HDFS [Diagram: Hadoop HDFS and DB]
Introducing PolyBase
How to overcome the impedance mismatch
- Hadoop (unstructured data): increasingly massive amounts of unstructured data driven by new sources (sensor and RFID, web apps, social apps, mobile apps)
- RDBMS (structured data): at the same time, vast amounts of corporate data and data sources, and the bulk of their data analysis (traditional schema-based data warehouse applications)
- PolyBase addresses this challenge for advanced data analytics by allowing native query across PDW and Hadoop to integrate structured and unstructured data
PolyBase features in SQL Server PDW
- Full SQL query access to data stored in HDFS, represented as external tables in PDW
- Basic statistics support for data coming from HDFS
- Query across PDW and Hadoop tables (joining on the fly)
- Fully parallelized, high-performance import of data from HDFS files into PDW tables
- Fully parallelized, high-performance export of data in PDW tables into HDFS files
- Hadoop on Windows Server, Hortonworks, and Cloudera
- Hadoop 1.0 and 2.0
Creating external tables

    CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
    {WITH (LOCATION = <URI>, [FORMAT_OPTIONS = (<VALUES>)])}
    [;]

1. EXTERNAL indicates an external table
2. LOCATION: required location of Hadoop cluster and file
3. FORMAT_OPTIONS: optional format options associated with data import from HDFS

- Internal representation of data residing in Hadoop / HDFS (delimited text files only)
- High-level permissions required for creating external tables: ADMINISTER BULK OPERATIONS and ALTER SCHEMA
- Different from regular SQL tables: essentially read-only (no DML support)
Querying unstructured data
1. Querying data in HDFS and displaying results in table form (using external tables)
2. Joining data from HDFS with relational PDW data

Example: creating external table ClickStream (a text file in HDFS whose field delimiter is the FIELD_TERMINATOR character)

    CREATE EXTERNAL TABLE ClickStream (
        url varchar(50),
        event_date date,
        user_ip varchar(50))
    WITH (LOCATION = 'hdfs://myhadoop:5000/tpch1gb/employee.tbl',
          FORMAT_OPTIONS (FIELD_TERMINATOR = ' '));

Query examples

1. Filter query against data in HDFS

    SELECT TOP 10 (url) FROM ClickStream WHERE user_ip = '192.168.0.1';

2. Join data coming from files in HDFS (Url_Description is a second text file in HDFS)

    SELECT url.description
    FROM ClickStream cs, Url_Description url
    WHERE cs.url = url.name AND cs.url = 'www.cars.com';

3. Join data from HDFS with a relational PDW table (Users is a distributed PDW table)

    SELECT user_name
    FROM ClickStream cs, Users u
    WHERE cs.user_ip = u.user_ip AND cs.url = 'www.microsoft.com';
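The idea behind the third query, joining rows that originate in a delimited text file with rows in a relational table, can be tried locally with Python's built-in sqlite3 module. This is a rough sketch under stated assumptions, not PolyBase: SQLite stands in for PDW, the file is loaded into a table rather than queried in place, the sample rows and user names are invented, and a pipe is assumed as the field terminator.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for the delimited text file stored in HDFS.
CLICKSTREAM_TEXT = (
    "www.cars.com|2013-06-01|192.168.0.1\n"
    "www.microsoft.com|2013-06-02|192.168.0.2\n"
    "www.microsoft.com|2013-06-03|192.168.0.1\n"
)


def load_delimited(conn, text, field_terminator="|"):
    """Parse delimited text and load it into a ClickStream table."""
    conn.execute(
        "CREATE TABLE ClickStream (url TEXT, event_date TEXT, user_ip TEXT)")
    rows = csv.reader(io.StringIO(text), delimiter=field_terminator)
    conn.executemany("INSERT INTO ClickStream VALUES (?, ?, ?)", rows)


def create_users(conn):
    """Create the relational side of the join with invented sample users."""
    conn.execute("CREATE TABLE Users (user_ip TEXT, user_name TEXT)")
    conn.executemany(
        "INSERT INTO Users VALUES (?, ?)",
        [("192.168.0.1", "alice"), ("192.168.0.2", "bob")])


def urls_for_ip(conn, user_ip):
    """Filter query: URLs visited from one IP address (at most 10)."""
    cur = conn.execute(
        "SELECT url FROM ClickStream WHERE user_ip = ? ORDER BY url LIMIT 10",
        (user_ip,))
    return [row[0] for row in cur]


def users_who_visited(conn, url):
    """Join query: names of users whose IP address appears for the URL."""
    cur = conn.execute(
        "SELECT u.user_name FROM ClickStream cs "
        "JOIN Users u ON cs.user_ip = u.user_ip "
        "WHERE cs.url = ? ORDER BY u.user_name",
        (url,))
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
load_delimited(conn, CLICKSTREAM_TEXT)
create_users(conn)

if __name__ == "__main__":
    print(urls_for_ip(conn, "192.168.0.1"))
    print(users_who_visited(conn, "www.microsoft.com"))
```

The point of the external-table approach on the slide is that the load step disappears: the query engine reads the HDFS file at query time, while the SQL for the filter and the join stays the same.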
Parallel data import from HDFS into PDW
- Persistently storing data from HDFS in PDW tables
- Fully parallelized via CREATE TABLE AS SELECT (CTAS) with external tables as source table and PDW tables (either distributed or replicated) as destination
- Retrieval of data in HDFS on the fly

    CREATE TABLE ClickStream_PDW
    WITH (DISTRIBUTION = HASH(url))
    AS SELECT url, event_date, user_ip FROM ClickStream;

[Diagram labels: sensor and RFID, web apps, social apps, mobile apps; Hadoop (unstructured data); parallel HDFS reads; CTAS; external table; enhanced PDW query engine; HDFS bridge; DMS reader 1 through DMS reader N; parallel importing; results; traditional data warehouse applications; PDW (structured data)]
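The HASH(url) distribution in the CTAS statement means each imported row is placed according to a hash of its url value, so rows with the same url always land together. The sketch below illustrates only that placement idea. It makes several assumptions: zlib.crc32 is a stand-in for PDW's internal hash function, which the slide does not describe, and the count of four distributions and the sample rows are invented.

```python
import zlib
from collections import defaultdict


def distribution_for(value, num_distributions):
    """Map a distribution-column value to a distribution number.

    crc32 is a stand-in hash chosen because it is stable across runs.
    """
    return zlib.crc32(value.encode("utf-8")) % num_distributions


def distribute(rows, column, num_distributions):
    """Group rows by the distribution chosen for their `column` value."""
    placement = defaultdict(list)
    for row in rows:
        target = distribution_for(row[column], num_distributions)
        placement[target].append(row)
    return dict(placement)


if __name__ == "__main__":
    sample_rows = [
        {"url": "www.cars.com", "user_ip": "192.168.0.1"},
        {"url": "www.microsoft.com", "user_ip": "192.168.0.2"},
        {"url": "www.cars.com", "user_ip": "192.168.0.3"},
    ]
    for number, rows in sorted(distribute(sample_rows, "url", 4).items()):
        print(number, [row["url"] for row in rows])
```

Because equal keys are co-located, an aggregation or join on url can run on each distribution independently, which is what lets the import and later queries proceed in parallel.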
Parallel data export from PDW into HDFS
- Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS) with external tables as destination table and PDW tables as source
- Round-trip of data by first importing it from HDFS, joining it with relational data, and then exporting results back to HDFS

    CREATE EXTERNAL TABLE ClickStream (url, event_date, user_ip)
    WITH (LOCATION = 'hdfs://myhadoop:5000/users/outputdir',
          FORMAT_OPTIONS (FIELD_TERMINATOR = ' '))
    AS SELECT url, event_date, user_ip FROM ClickStream_PDW;

[Diagram labels: sensor and RFID, web apps, social apps, mobile apps; HDFS data nodes (unstructured data); parallel HDFS writes; CETAS; external table; enhanced PDW query engine; HDFS bridge; DMS writer 1 through DMS writer N; parallel reading; results; traditional DW applications; PDW (structured data)]
Interactive analytics over Big Data
- SQL Server Analysis Services scaled out to large data volumes
- Sourced from Big Data sources
- Built on the xVelocity analytics engine: in-memory, column store, 10x compression
- Manage, deploy, monitor
- Deployment vehicles: box, appliance, and cloud
- Reliable persistent storage
[Diagram labels: Excel, PV, third-party apps, tools, etc.; web services; GW; XMLA; AS instances; Hadoop; Isotope; enterprise data sources (SQL Server, Oracle, SAP); external data sources]
Potential customers: Skype, Klout, Halo 4, UBS, adCenter, Windows Update
Code-name GeoFlow for Microsoft Excel enables information workers to discover and share new insights from geographical and temporal data through three-dimensional storytelling
Map data; discover insights; share stories
3-D Guided Temporal geospatial tours
Sales performance, distribution of crime data, disease control, weather patterns, seasonality analysis, voting trends, real-estate assessment
Transform data into fluid, three-dimensional stories to unlock new insights for everyone
Excel add-in to enhance data visualization
Virtually anytime, anywhere
- Boost agility with real-time access to apps and data from virtually anywhere
- Engage customers with smart, contextual mobile experiences
Deliver immersive, connected customer experiences
- Beautiful experiences plus security and performance
- Connection through social apps and networks
- Real-time content and updates
- Optimization for discovery and reach
Deliver familiar, connected experiences to a mobile workforce while ensuring enterprise security, manageability, and compliance
Browser-based corporate BI solutions on iOS, Android, and Windows
- SharePoint Mobile enhancements
- PerformancePoint Services
- Excel Services
- SQL Server Reporting Services
"Ultimately, the new Microsoft mobile BI solution leads to more revenue for Recall and gives us deeper customer insight, helping us stay ahead of our competitors."
Recall, "Records Management Company Gets Real-Time BI, Boosts Sales with Mobile Solution" case study
- Mobile browser: support across different mobile devices, including touch for tablets and phones
- Native apps: rich experience through native apps for business and social interactions and for collaboration
- Office hub: a hub for all your document storage services plus a rich editing experience in Office
Learn more: http://blogs.office.com/b/sharepoint/archive/2013/03/06/out-and-about-new-sharepoint-mobile-offerings.aspx
- Clean user experience
- Large touch targets
- Filtering and navigation are as simple as working on the desktop
Never be without the tools you need; access and share with confidence
Excel 2010 plus reports, dashboards, PivotTables, and PivotCharts in a browser
- Optimized for touch
- High-fidelity viewing and interaction with advanced analysis views
- External connections and refresh connections
- Simultaneous coauthoring
- Embedding and the Excel button
Quick Explore
Managing streaming data in-memory
[Diagram labels: event, input stream, complex event processing, output stream]
- Ideal for continuous streaming data (web clickstreams, stock trading data, and smart grid data)
- Runs completely in-memory
- Low latency with sub-second processing of large event streams
- Rich developer experience with Microsoft .NET 4.0 and Visual Studio
- Ease of management
- Flexible deployment from embedded devices, regional hubs, or a centralized location
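One common building block of complex event processing is a windowed aggregate computed as events arrive, with nothing written to disk. The sketch below shows a tumbling (fixed, non-overlapping) window in plain Python. It is not the StreamInsight API, which is a .NET library; the Event type, the function name, and the sample trades are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, namedtuple

# Hypothetical event shape: a timestamp in seconds, a grouping key, a value.
Event = namedtuple("Event", "timestamp key value")


def tumbling_window_sums(events, window_seconds):
    """Yield (window_start, {key: total}) each time a window closes.

    Assumes events arrive in non-decreasing timestamp order. Only the
    totals for the currently open window are held in memory.
    """
    current_start = None
    totals = defaultdict(float)
    for event in events:
        start = event.timestamp - (event.timestamp % window_seconds)
        if current_start is None:
            current_start = start
        elif start != current_start:
            # The event belongs to a later window: emit the finished one.
            yield current_start, dict(totals)
            totals = defaultdict(float)
            current_start = start
        totals[event.key] += event.value
    if current_start is not None:
        yield current_start, dict(totals)  # flush the final open window


if __name__ == "__main__":
    trades = [
        Event(0, "MSFT", 100), Event(3, "MSFT", 50), Event(4, "AAPL", 10),
        Event(5, "MSFT", 25), Event(12, "AAPL", 5),
    ]
    for window_start, sums in tumbling_window_sums(trades, window_seconds=5):
        print(window_start, sums)
```

Because the function is a generator, each result is produced as soon as its window closes rather than after the whole stream has been read, which is the property that keeps latency low for continuous inputs such as clickstreams or trades.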
StreamInsight on Windows Azure
Cloud-scale services for complex event processing
- Ideal for analyzing streaming data that originates in the cloud or is distributed globally
- Insights from data in motion
- Elastic scale-out in the cloud
- Simplified management through built-in connectivity
- Low TCO for cloud services
[Diagram labels: third-party applications, Reporting Services (Power View), Excel, PowerPivot, SharePoint Insights; databases, LOB applications, files, OData feeds, cloud services]
BISM-MD object                               Tabular object
Cube                                         Model
Cube dimension                               Table
Attributes (key(s), name)                    Columns
Measure Group                                Table
Measure                                      Measure
Measure without MeasureGroup                 Within table, called measures
MeasureGroup cube dimension relationship     Relationship
Perspective                                  Perspective
KPI                                          KPI
User and parent-child hierarchies            Hierarchies
Types
- Children of All with a single real member
- Calculated members on user hierarchies
Additional constraints
- Attribute may have an optional unknown member
- Attribute cannot be key unless it's the only attribute
- Not a parent-child attribute
Call to action
- Download SQL Server 2014 CTP1
- Stay tuned for availability
www.microsoft.com/sqlserver
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.