Building Blocks of Cortana Intelligence Suite

Building Blocks of Cortana Intelligence Suite Carolina IT Pro Group 6/20/2016 Melissa Coates Solution Architect, BlueGranite blue-granite.com Blog: sqlchick.com Twitter: @sqlchick Slide content last updated: 6/19/2016

Building Blocks of Cortana Intelligence Suite Agenda 1. Azure overview 2. Introduce individual components -Purpose -Use cases -Building blocks for solutions 3. Getting started -Skills & expertise

Cortana Intelligence Suite Azure Overview

Microsoft Azure Platform Microsoft Azure is a cloud computing platform and infrastructure for building, deploying, and managing applications and services through a global network of Microsoft-managed datacenters. Azure provides services and supports many different programming languages, tools and frameworks, including both Microsoft-specific and third-party software and systems. http://azureplatform.azurewebsites.net/

Azure Objectives Extend data center infrastructure Scalability Separate compute from storage Shared infrastructure = reduced cost of ownership Self-service provisioning Simplified management = reduced cost of ownership Open source interoperability Simplified development = faster time to value Integration of services Built-in high availability & disaster recovery Shared code base with (some) on-premises resources

Types of Cloud Deployments
On-Premises: Dedicated infrastructure (higher cost of ownership); more control (higher administration effort); limits to scalability
Cloud: Shared infrastructure (lower cost); less control (lower administration effort); high scalability

Types of Cloud Deployments IaaS Azure Virtual Machines PaaS Azure SQL Data Warehouse Azure SQL Database Azure HDInsight Azure Data Lake Store Azure Stream Analytics SaaS Power BI SharePoint Online Office 365 Exchange Online Azure Data Catalog Azure Data Factory Azure Machine Learning IaaS = Infrastructure-as-a-Service; PaaS = Platform-as-a-Service; SaaS = Software-as-a-Service

Cortana Intelligence Suite Formerly known as Cortana Analytics Suite

Cortana Intelligence Suite Objectives Big Data and Advanced Analytics with less cost and effort Intelligent action from people or automated systems Enable opportunities for automation and innovation: Templates Preconfigured solutions Interoperability Easier to operationalize solutions Open standards Cortana Intelligence Suite = a marketing term for a bundle of integrated services

Cortana Intelligence Suite Sample Scenarios

Cortana Intelligence Suite Gallery Also some samples built into the Azure Portal + Azure documentation area + Visual Studio projects https://gallery.cortanaintelligence.com

Cortana Intelligence Components
Information Management: Data Factory, Data Catalog, Event Hub
Big Data Stores: SQL Data Warehouse, Data Lake Store
Machine Learning & Analytics: HDInsight, Machine Learning, Data Lake Analytics, Stream Analytics
Intelligence: Cortana Assistant, Bot Framework, Cognitive Services
Visualization: Power BI
Plus other solution components such as: IoT Hub, Blob Storage, Document DB, Azure SQL Database, Azure Virtual Machine, R Analytics, Excel, Applications, Azure Automation, Azure Express Route, Active Directory, Virtual Network

Cortana Intelligence Suite Current State Young Set of Services Some services have become generally available in last several months Some services still in public preview Pre-Built Solutions Still Evolving Objective is to have a lot of solution templates to accelerate development time Pricing Still Evolving Still focused on individual services vs bundle Integration Still Evolving The ultimate goal is deep integration between many Azure services


Cortana Intelligence Suite Azure Data Catalog Generally Available as of April 2016

Azure Data Catalog Register, manage, search, and explore organizational data sources Data Preview (first x rows) Search by: Tag, Object Type, Source Type, Expert Name Data Profiling Data Profile Info

Azure Data Catalog Register, manage, search, and explore organizational data sources Descriptions for each column Tags for each column How to connect & how to request access

Azure Data Catalog Alternative to ADC interface: APIs in conjunction with custom portal

Azure Data Catalog Purpose Data Documentation & Discovery Enterprise-wide metadata catalog for data assets Simplified data source discovery via search Enrich & understand assets with tags & annotations Collaboration between data producers & data consumers Important to Know One Data Catalog per organization (not one per subscription) Authentication only accepts an organizational account (cannot use a Microsoft account)
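The register, annotate, and search cycle described on the Data Catalog slides can be illustrated with a small in-memory sketch. This is not the Azure Data Catalog API: the `DataAsset` and `Catalog` classes, their methods, and the sample assets are all hypothetical and exist only to show the idea that the catalog stores metadata about sources and lets users search it by tag, object type, source type, or expert.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """Metadata about a data source; the data itself stays where it lives."""
    name: str
    object_type: str   # e.g. "Table"
    source_type: str   # e.g. "SQL Server"
    tags: set = field(default_factory=set)
    experts: set = field(default_factory=set)


class Catalog:
    """Hypothetical in-memory catalog, for illustration only."""

    def __init__(self):
        self._assets = {}

    def register(self, asset):
        # Registration stores metadata only, keyed by asset name.
        self._assets[asset.name] = asset

    def annotate(self, name, tag=None, expert=None):
        # Producers and consumers enrich an asset after it is registered.
        asset = self._assets[name]
        if tag:
            asset.tags.add(tag)
        if expert:
            asset.experts.add(expert)

    def search(self, tag=None, object_type=None, source_type=None, expert=None):
        # Every supplied criterion must match; omitted criteria are ignored.
        matches = []
        for asset in self._assets.values():
            if tag is not None and tag not in asset.tags:
                continue
            if object_type is not None and asset.object_type != object_type:
                continue
            if source_type is not None and asset.source_type != source_type:
                continue
            if expert is not None and expert not in asset.experts:
                continue
            matches.append(asset.name)
        return sorted(matches)


catalog = Catalog()
catalog.register(DataAsset("DimCustomer", "Table", "SQL Server", {"customer"}))
catalog.register(DataAsset("FactSales", "Table", "SQL Server", {"sales"}))
catalog.register(DataAsset("Campaigns", "Table", "Azure SQL Database", {"marketing"}))
catalog.annotate("FactSales", tag="finance", expert="Data Warehouse team")

finance_tables = catalog.search(tag="finance", source_type="SQL Server")
```

Because only metadata is registered, a search answers "where is the data and who knows about it" without copying the data.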

Azure Data Catalog Common Use Cases Documentation for Centralized Data Sources Line of business systems Data warehouse, marts Analytic systems Reporting Services File system Data lake Facilitate Self-Service BI Data dictionary Assist combining data from multiple sources Reduce duplication of effort Capture Tribal Knowledge Documentation about the data is maintained by subject matter experts Enhance understanding Data Discovery & Provisioning Users search to discover data assets Who to contact to request access

Azure Data Catalog Building Blocks
1. Register metadata for on-premises data sources (SQL Server: Data Warehouse, SQL Server: CRM, SQL Server: Inventory)
2. Register cloud & 3rd party data sources (Azure SQL DB: Marketing)
3. View database schema in SSDT (SQL Server Data Tools)
4. Connect & view a table of data in client tools (Power BI Desktop, Excel)

Cortana Intelligence Suite Azure Data Factory Generally Available as of August 2015

Azure Data Factory A service for building and operating data pipelines Factory analogy: Raw Materials Acquire Raw Materials Integration and Preparation Finished Goods Deliver Finished Goods

Azure Data Factory Purpose Data Orchestration Automation of data ingestion, orchestration, and data processing Serves as the glue for stitching together services Sources Activities grouped in a Pipeline Destination (Sink)

Azure Data Factory How Does ADF compare to Integration Services? Integration Services Data source connection Azure Data Factory Linked service Control Flow Data Flow Source / destination Built-in transformations Custom transformations SQL Server Agent Scheduling Dataset Pipeline Activities Monitoring Scheduling Hive, Pig, C# scripts ----- Stored proc ----- Copy

Azure Data Factory Key Differences from SSIS
-Transformations: ADF does *not* have built-in transformation capabilities; it relies on Hive, Pig, C#, SQL DB stored procs, etc.
-Pipeline intent and scheduling are combined together in ADF (not modular)
-JSON: most ADF elements are hand-written JSON scripts
-One Data Factory: frequently just one Data Factory per subscription (a resource can store just one key for use by ADF)
-ADF diagram can get large (no concept of one package per destination, though pipelines to group activities help)
-Data lineage can be tracked across the ADF diagram

Azure Data Factory JSON JavaScript Object Notation Data-interchange format Alternative to XML JSON used significantly in Azure
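To make the "hand-written JSON" point concrete, here is a rough sketch of what a pipeline definition with a single copy activity might look like, built as a Python dictionary and serialized. The overall shape is loosely modeled on ADF pipeline definitions from the time of this deck; the pipeline, activity, and dataset names, the dates, and the exact property set are assumptions for illustration and should be checked against the ADF documentation.

```python
import json

# Hypothetical pipeline: names, dates, and property values are illustrative.
pipeline = {
    "name": "CopySalesToBlobPipeline",
    "properties": {
        "description": "Copy a subset of warehouse data to blob storage",
        "activities": [
            {
                "name": "CopySales",
                "type": "Copy",
                # Datasets are referenced by name and defined separately.
                "inputs": [{"name": "SalesTableDataset"}],
                "outputs": [{"name": "SalesBlobDataset"}],
                "typeProperties": {
                    "source": {"type": "SqlSource"},
                    "sink": {"type": "BlobSink"},
                },
                # Scheduling sits inside the activity definition itself.
                "scheduler": {"frequency": "Day", "interval": 1},
            }
        ],
        "start": "2016-06-01T00:00:00Z",
        "end": "2016-06-30T00:00:00Z",
    },
}

pipeline_json = json.dumps(pipeline, indent=2)
activity = json.loads(pipeline_json)["properties"]["activities"][0]
```

Note how the schedule and the active period live in the same document as the activity, which is the "intent and scheduling combined" difference from SSIS mentioned earlier.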

Azure Data Factory ADF Monitoring & Management App

Azure Data Factory Common Use Cases Big Data Processing HDInsight & big data stores are its strength Provide the script you want to run (Hive, Pig, etc) & it will spin up/tear down an HDInsight cluster on-demand Operationalize Solutions Scheduling of data processing operations Stage Data In Supported Data Source Certain Azure services only support cloud data sources Multiple Cloud Sources Integrate or relocate disparate data Combine or copy multiple sources

Azure Data Factory Building Blocks: Data Movement with ADF
Components: Data Warehouse, Azure SQL Database, Azure Machine Learning
1. ADF extracts a subset of data into a cloud source supported by the Azure service to be utilized
2. ADF invokes API to execute Machine Learning model
3. ADF outputs scored results back to database for further analysis and integration with other data
Legend: ADF icon = Azure Data Factory (ADF)


Cortana Intelligence Suite Azure SQL Data Warehouse In Public Preview

Relational Data Warehousing in Azure*
SQL Server in a Virtual Machine (IaaS): Run SQL Server workloads (including SSAS, SSRS, SSIS, etc) in an Azure Virtual Machine. Best for:
-Migrating/extending existing database
-Administer all aspects
-Bring your own license
-Dev/test scenarios
Azure SQL Database (PaaS): A relational database-as-a-service (DBaaS). Close feature parity with SQL Server. Best for:
-Data volume < 1TB (sharding across DBs is not suitable for DW workloads)
-OLTP with scaling & pooling needs (unpredictable workloads)
-Reduced administration of DB, O/S, and hardware
Azure SQL Data Warehouse (PaaS): A DW-as-a-service (DWaaS) optimized for performance and large scale, distributed workloads. Best for:
-Larger data volumes on MPP architecture
-Ability to scale up/down/pause on-demand
-Combining relational + nonrelational data
*Excluding: Other technologies in a VM such as Oracle, and Big Data technologies like Hive, etc

Azure SQL DW SQL DW is suitable for large-scale data warehousing workloads Massively parallel processing (MPP) architecture across distributions (SQL DBs) SQL DW data stored in blob storage (*not* SQL DB) Part of the SQL Server family (with differences) https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-overview-what-is/

Azure SQL DW Purpose
MPP Scale-Out Query Engine:
-Cloud-based, multi-tenant, platform-as-a-service (PaaS)
-Massively parallel processing (MPP)
-Built on SQL Server
-Clustered columnstore indexes used by default
Elastic Scale:
-Scale up/down on demand or on schedule
PolyBase:
-T-SQL for Hadoop queries & data loads
Storage + Compute:
-Storage and compute are decoupled
-Separate billing & scaling for storage vs. compute
-Data Warehouse Units (DWUs) control compute billing
-Increase/decrease/pause compute ability independently of data storage

Azure SQL DW Common Use Cases Analytical and Ad Hoc Workloads Batch inserts and updates OLTP workloads are not suitable for SQL DW Varying Workloads Workloads which suit the ability to scale compute up/down (ex: during data load processing or intensive analytical operations) Large Scale Data Warehouse Easier to provision large-scale environments in cloud than on-premises Data Variety Integration with various data source types and data structures (i.e., takes advantage of PolyBase)

Azure SQL DW Two Ways of Using PolyBase
1. Querying of relational + semi/unstructured data in a single consolidated query. Objective: avoid data movement from where the data currently resides.
Azure SQL Data Warehouse:
Driver Name  | Car Model | State
Amanda Green | Civic     | WA
Joe Brown    | Escort    | WA
Azure Blob Storage: Telemetry data
T-SQL query results:
Driver Name  | Car Model | State | Avg Speed | Miles Driven
Amanda Green | Civic     | WA    | 58        | 91
Joe Brown    | Escort    | WA    | 25        | 10
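The consolidated query on this slide combines relational driver rows with telemetry held outside the database. The sketch below mimics that logic in plain Python rather than T-SQL or PolyBase: the driver rows come from the slide, while the individual telemetry records are invented (hypothetical) values chosen so that they aggregate to the averages and totals shown in the slide's result table.

```python
from collections import defaultdict

# Relational rows, as they appear in the SQL DW table on the slide.
drivers = [
    {"driver": "Amanda Green", "car_model": "Civic", "state": "WA"},
    {"driver": "Joe Brown", "car_model": "Escort", "state": "WA"},
]

# Hypothetical telemetry records, standing in for files in blob storage.
telemetry = [
    {"driver": "Amanda Green", "speed": 56, "miles": 40},
    {"driver": "Amanda Green", "speed": 60, "miles": 51},
    {"driver": "Joe Brown", "speed": 20, "miles": 4},
    {"driver": "Joe Brown", "speed": 30, "miles": 6},
]


def summarize(telemetry_rows):
    """Aggregate telemetry per driver: average speed and total miles."""
    grouped = defaultdict(list)
    for row in telemetry_rows:
        grouped[row["driver"]].append(row)
    return {
        driver: {
            "avg_speed": sum(r["speed"] for r in rows) / len(rows),
            "miles_driven": sum(r["miles"] for r in rows),
        }
        for driver, rows in grouped.items()
    }


def join(driver_rows, summary):
    """Inner join of relational rows with the telemetry summary by driver."""
    return [
        {**row, **summary[row["driver"]]}
        for row in driver_rows
        if row["driver"] in summary
    ]


result = join(drivers, summarize(telemetry))
```

In the real service the join is expressed once in T-SQL and the engine reads the external data in place; the two Python functions only stand in for that single query.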

Azure SQL DW Two Ways of Using PolyBase 2 Data movement from source to target. Currently PolyBase is the recommended method for loading SQL DW due to its parallel processing behavior. Control Node Compute Node Compute Node Compute Node Azure Blob Storage

Azure SQL DW Building Blocks: Modern Data Warehouse
Components: Line of Business Systems, Web Data, Blob Storage, SQL Data Warehouse, Data Catalog
1. Ingest data into Azure (copy of raw data)
2. Process and transform data
3. Load processed data to SQL DW in parallel with PolyBase
4. Load relational data to SQL DW (via BCP, SSIS, ADF, etc)
5. Register data source(s) which users are permitted to access
Legend: ADF icon = Azure Data Factory (ADF)


Azure Data Lake A collection of 3 services:
-Data Lake Analytics: Big Data queries-as-a-service
-HDInsight: Big Data cluster-as-a-service
-Data Lake Store: Big Data storage-as-a-service
https://info.microsoft.com/co-azure-wbnr-fy16-01jan-19-azure-data-lake-overview-registration.html

Cortana Intelligence Suite Azure Data Lake Store In Public Preview

Azure Data Lake Store Big Data storage-as-a-service

Azure Data Lake Store Purpose Big Data Storage Storage to support Hadoop applications HDFS (Hadoop Distributed File System) for the cloud, with no size limitations Stores data in its native format: objective is to not reformat File System Optimized for Analytics An alternative to general purpose Azure Blob Storage Parallel read scans Scaled out over multiple machines Low latency writes Large file sizes WebHDFS-Compatibility Accessible to all HDFS-compliant projects (if integrated with HDInsight)

Azure Data Lake Store Common Use Cases Big Data Analytic Workloads Optimized to work with ADL Analytics Also supports HDInsight Influx of Data Persist large variety of data formats Retain original data in native format before it's processed Large File Sizes No limits on account or file sizes Can handle PB size files Low Latency Read files as they are being written Active Archive Availability of archival data Analytics Sandbox Working area for data scientists and analysts to store files

Cortana Intelligence Suite Azure Data Lake Analytics In Public Preview

Azure Data Lake Analytics Big Data queries-as-a-service

Azure Data Lake Analytics Purpose Big Data Processing Ability to process any data, regardless of size or structure YARN application built on open standards Query scalability: resources allocated for each query Optimized to work with Azure Data Lake Store Simplification Abstracts away the cluster nodes and focuses on convenience, efficiency, and scalability U-SQL = familiar SQL and C# to reduce learning curve Separation of ADL Analytics from ADL Store: easier to manage, debug, and optimize

Azure Data Lake Analytics U-SQL SQL + C# New big data query processing language Applies schema on read logic Mix of multiple SQL dialects (T-SQL and ANSI SQL) Native extensibility of user code written in C# Full C# expressions Reuse in assemblies Define custom types, functions, etc. Automatically scales and parallelizes across nodes https://info.microsoft.com/co-azure-wbnr-fy16-01jan-19-azure-data-lake-overview-registration.html

Azure Data Lake Analytics Common Use Cases Focus on Business Logic Focus on jobs rather than on infrastructure for a cluster Abstracts away the cluster nodes and focuses on convenience, efficiency, and scalability Proving Value of Data Initial experimentation or infrequent analysis Can be a precursor to integrating data into relational store Cloud Data Data resides in the cloud U-SQL is a Fit Skillsets and preferences are a fit for U-SQL (SQL + C#) Various Size Workloads Scalability on an individual job basis Objective is to not reserve capacity that's not needed

Azure Data Lake Analytics Building Blocks: Augment a Data Warehouse with a Data Lake
Components: Line of Business Systems, Web Data, Device Data, Blob Storage, Data Lake Store, Data Lake Analytics, SQL Data Warehouse
1. Ingest new data into ADL Store (via ADF, copy, streaming, etc)
2. Relocate existing data into ADL Store (optional)
3. U-SQL: read data from SQL DW & ADL Store to analyze data
4. U-SQL: write results back to ADL Store
5. Integrate results back into SQL DW (PolyBase)
Legend: ADF icon = Azure Data Factory (ADF)

Cortana Intelligence Suite Azure HDInsight Generally Available as of October 2013

Azure HDInsight Hadoop-based distribution for Big Data solutions

Azure HDInsight Purpose Big Data Processing A Big Data cluster-as-a-service for distributed data processing, scaling, and querying capabilities Supports the Apache Hadoop open source ecosystem: Hive, Spark, R, Solr, Storm, etc Considered a compute service rather than as a big data store Linux or Windows Hortonworks Partnership Based on Hortonworks Data Platform (HDP) distribution (Microsoft + Hortonworks joint engineering team with joint roadmap)

Azure HDInsight Common Use Cases Big Data Scenarios Volume Variety Velocity Leveraging the Hadoop ecosystem You want to manage a cluster & go beyond what U-SQL can easily do with ADL Analytics Integration with other open source projects Development/POC Inexpensive way to test out proof of concept before investing in a big data cluster Data Processing Engine Computations and aggregations for refined data sent to DW and analytics systems Component of ETL operations Data Exploration Part of data scientist's toolbox Event Processing Storm

Azure HDInsight Important to Know A running cluster will charge for compute hours on a per-node basis whether it's being actively used or not. This is why clusters are frequently created and deleted (there's no pause/shutdown). It can take close to 30 minutes to create a cluster on-demand though. Because the data is stored separately from the HDInsight compute service, data is preserved when a cluster is deleted. Azure Blob Storage is currently the default storage of HDInsight, though it will change to Azure Data Lake Store in the future.
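The per-node, per-hour billing described above is why create-and-delete patterns are common. A back-of-the-envelope comparison is sketched below; the node count, the hourly rate, and the daily job length are made-up numbers (not Azure prices), and counting the roughly 30-minute creation time as billable is a conservative assumption rather than a statement about how Azure bills.

```python
def monthly_compute_cost(nodes, rate_per_node_hour, hours_per_day, days=30):
    """Compute cost for a cluster billed per node for every hour it exists."""
    return nodes * rate_per_node_hour * hours_per_day * days


NODES = 4                  # hypothetical cluster size
RATE_PER_NODE_HOUR = 0.50  # hypothetical price, not an Azure rate
JOB_HOURS_PER_DAY = 2      # hypothetical daily processing window
CREATION_HOURS = 0.5       # "close to 30 minutes" to create a cluster

# Cluster left running all month, whether used or not.
always_on = monthly_compute_cost(NODES, RATE_PER_NODE_HOUR, 24)

# Cluster created for the daily job and deleted afterwards.
on_demand = monthly_compute_cost(
    NODES, RATE_PER_NODE_HOUR, JOB_HOURS_PER_DAY + CREATION_HOURS
)

savings = always_on - on_demand
```

With these assumed inputs the always-on cluster costs 1440 per month against 150 for the on-demand pattern; the data itself is unaffected because it lives in storage, separate from the cluster.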

Azure HDInsight Building Blocks: On-Demand HDInsight Cluster
Components: Web Data, Blob Storage, HDInsight, SQL Data Warehouse
1. Ingest new data into Blob Storage (via ADF, copy, streaming, import/export, etc)
2. ADF creates an HDInsight cluster on demand, then executes Hive script to perform data aggregations & calculations
3. Output processed data to database for further analysis
4. HDInsight cluster is deleted by ADF (to save cost) once its time to live expires
Legend: ADF icon = Azure Data Factory (ADF)
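The on-demand pattern hinges on a time-to-live setting that tells ADF when an idle cluster may be deleted. The sketch below shows roughly how such a definition might look as JSON and how the time-to-live string can be read. The property names are recalled from ADF's on-demand HDInsight linked service of that period and should be treated as assumptions to verify; the service names and values are hypothetical.

```python
import json
from datetime import timedelta

# Hypothetical on-demand cluster definition; verify property names in ADF docs.
on_demand_cluster = {
    "name": "OnDemandHDInsight",
    "properties": {
        "type": "HDInsightOnDemand",
        "typeProperties": {
            "clusterSize": 4,
            # Idle time after which the cluster may be deleted (hh:mm:ss).
            "timeToLive": "00:30:00",
            # Storage account that outlives the cluster and keeps the data.
            "linkedServiceName": "RawDataBlobStorage",
        },
    },
}


def parse_time_to_live(value):
    """Convert an hh:mm:ss string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


type_properties = on_demand_cluster["properties"]["typeProperties"]
ttl = parse_time_to_live(type_properties["timeToLive"])
linked_service_json = json.dumps(on_demand_cluster, indent=2)
```

A short time-to-live keeps cost down for a single daily job, while a longer one lets back-to-back activities reuse the same cluster instead of waiting for a new one to be created.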


Cortana Intelligence Suite Azure Machine Learning Generally Available as of February 2015

Azure Machine Learning A service for building predictive analytics solutions Data preprocessing modules Algorithms Execution of algorithms Custom R or Python

Azure Machine Learning Purpose Predictive Analytics Build predictive models using statistical techniques Learn from existing data to forecast future behaviors, outcomes & trends Minimize learning curve with predefined algorithms and drag & drop authoring environment Extensible with R and Python

Azure Machine Learning Common Use Cases Finding Anomalies Examining patterns for detection of fraud Locating unusual or abnormal equipment readings Descriptive Analytics Analysis of returns Customer segmentation (ex: by buying habits or age group) to improve customer service Personalized offer recommendations Predictive Analytics Credit risk Product demand & revenue predictions Customer retention Weather predictions Machine maintenance & smart buildings Hospital readmissions Student dropouts

Azure Machine Learning Building Blocks: Operationalizing an ML Model
Components: Line of Business Systems, Azure SQL Database, Azure Machine Learning API, SQL Data Warehouse
1. Relocate on-premises data to a supported online data source
2. Execute Machine Learning model by invoking the deployed machine learning API
3. Integrate scored results to SQL DW for further analysis
Legend: ADF icon = Azure Data Factory (ADF)
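Invoking a deployed model (step 2) is an HTTP POST with a JSON body and an API key. The sketch below builds such a request with the Python standard library without sending it. The body layout (`Inputs`, `ColumnNames`, `Values`, `GlobalParameters`) follows the request-response format that Azure Machine Learning Studio generated for web services at the time, as best recalled; the URL, API key, input name `input1`, and column names are placeholders, so treat the whole payload as an assumption to check against the sample code of the actual web service.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a deployed scoring API."""
    body = {
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": rows},
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder endpoint, key, and columns: all hypothetical.
request = build_scoring_request(
    "https://example.invalid/score",
    "<api-key>",
    ["age", "tenure_months"],
    [["42", "18"]],
)
payload = json.loads(request.data)

# Sending is left out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     scored = json.load(response)
```

In the building-block diagram ADF makes this call on a schedule and writes the scored rows to SQL DW; the same request could be issued from any client that holds the key.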

Cortana Intelligence Suite Azure Stream Analytics + Azure Event Hub Generally Available as of April 2015

Azure Stream Analytics Real-time analytics for high velocity streaming events

Azure Stream Analytics Purpose Stream Analytics Analytic processing engine for streaming events Internet of Things (IoT) solutions for data in motion from devices, sensors, social media, etc Event Hub A publish-subscribe service that handles high volume & high velocity data streams Allows events to be ingested into Azure from many platforms & devices (another option: IoT Hub) The preferred method of event ingestion for Stream Analytics Simplification Alternative to batch loading processes Lower bar to entry for developers by using SQL-like language More straightforward than an HDInsight Storm cluster

Azure Stream Analytics Common Use Cases Device Monitoring & Telemetry Level beyond acceptable threshold Business continuity Traffic Accident & traffic conditions Web Logs Clickstream analysis A/B testing Errors or degraded experience Identity Protection Real-time fraud alerts Identity theft scenarios Social Media Real-time sentiment analysis Demand-Based Pricing Bookings in past x minutes Inventory Levels Shelf volume vs. register checkouts

Azure Stream Analytics Building Blocks: Equipment Maintenance Predictions
Components: Machine Sensor Output, Event Hub, Stream Analytics, Azure SQL Database, HDInsight, Azure Machine Learning API, Power BI, SQL Data Warehouse
1. Events ingested to the raw event data queue
2. Consume & process data for a window of time
3. Obtain reference data
4. Real-time reporting
5. Persist streams of data for historical reporting & analysis
6. Processing and aggregations of sensor data
7. Predictions for maintenance and remaining useful life
8. Integrate results for further analysis
Legend: ADF icon = Azure Data Factory (ADF)
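Step 2, processing data "for a window of time", is the core idea of stream analytics. The sketch below shows one common form, a tumbling (fixed, non-overlapping) window, in plain Python rather than the Stream Analytics query language. The event tuples, device names, window size, and threshold are all hypothetical.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds):
    """Average readings per device within fixed, non-overlapping windows.

    events: iterable of (timestamp_seconds, device_id, value) tuples.
    Returns {(window_start, device_id): average_value}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for timestamp, device, value in events:
        # Each event belongs to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        bucket = totals[(window_start, device)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def over_threshold(averages, threshold):
    """Windows whose average exceeds the acceptable level."""
    return sorted(key for key, avg in averages.items() if avg > threshold)


# Hypothetical sensor readings: (seconds since start, device, temperature).
events = [
    (0, "pump-1", 70.0),
    (20, "pump-1", 74.0),
    (65, "pump-1", 95.0),
    (70, "pump-1", 97.0),
    (10, "pump-2", 60.0),
]

averages = tumbling_window_average(events, window_seconds=60)
alerts = over_threshold(averages, threshold=90.0)
```

Here `pump-1` averages 72.0 in the first minute and 96.0 in the second, so only the second window raises an alert, which matches the "level beyond acceptable threshold" use case listed earlier.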


Cortana Intelligence Suite Power BI Generally Available as of July 2015

Power BI Data analysis and visualization tools

Power BI Purpose Data Preparation, Data Modeling, Data Visualization Set of desktop, web, and mobile tools

Power BI Common Use Cases Self-Service BI Mashup of data into a small data model which is imported & refreshed in Power BI Service Data preparation & cleansing Data visualizations Front-End Reporting Reports and dashboards from Corporate BI sources via direct connect Third Party Reporting 3rd party content packs: quick start for isolated reporting scenarios Prototyping Test out ideas for data structure, calculations, and reports Embed in Custom App Power BI Embedded (separate Azure Service)

Power BI Building Blocks: Bimodal Business Intelligence
Components: On-Prem Source Database, Enterprise Gateway, Power BI Service, Power BI Dataset Connection, Power BI Report A, SQL Data Warehouse, Flat File, Power BI Desktop, Power BI Imported Dataset, Power BI Report B
1. User runs Report A
2. Data is requested via gateway to present in report
A. Data mashup prepared
B. Data model and reports published
C. Data refresh schedule created for imported dataset
D. User runs Report B & results are returned from the imported dataset


Cortana Intelligence Suite Cognitive Services In Public Preview

Cognitive Services Purpose Give Applications a Human Side Designed to make applications more personalized, intelligent and engaging Set of APIs, SDKs, and services to see, hear, and interpret Vision Emotion Speech Face Recognition Language etc Some APIs supported by Azure Machine Learning or Bing services

Cognitive Services Common Use Cases Sentiment Analysis Detect key phrases & topics being discussed Identify feedback posted to website Convert speech to text Emotion recognition Security Systems Facial detection and recognition Auto-Suggestions Completion of partial search queries Personalized Shopping Experience Recommend items likely to be purchased together

Cortana Intelligence Suite Bot Framework In Public Preview

Bot Framework Purpose Conversation Agent Automated interactions with users Enable applications and services to have a conversational user interface (CUI) http://docs.botframework.com/

Bot Framework Common Use Cases Messaging Bots Web chats Text/SMS conversation Content Bots Share relevant content, such as news or weather, with you Watcher Bots for Alerting Flight is delayed Dental appointment reminder E-Commerce Bots Order food Book a flight Check inventory for a product

Cortana Intelligence Suite Cortana Assistant

Cortana Assistant Personal digital assistant

Cortana Assistant Purpose Apply Language More Pervasively Virtual personal assistant for: Asking questions Finding things on PC Managing calendar Tracking packages Integration Reminders are integrated between Windows devices Search within other apps, such as a calendar Interact with bots to make requests

Cortana Assistant Common Use Cases Reminders Help with remembering appointments Power BI Natural Language Type or speak a question such as "Show me sales for this quarter in the East region" and a Power BI chart will render Work With Bots Interact with a bot to place an order or schedule a meeting

Cortana Assistant Important To Know Three requirements to render Power BI reports via the Cortana Assistant 1. Permission in the Power BI Dataset for Cortana to access the data 2. Settings for PC are associated to the organizational account (Settings > Accounts > Work Access) 3. Cortana on PC is connected to Office 365 and the organizational account (Cortana > Notebook > Connected Accounts > Office 365)

Cortana Assistant Why Is the Suite Named After Cortana? "As for Cortana, which is the Microsoft voice-driven personal assistant tool in Windows 10, it's a small part of the solution, but Sirosh says Microsoft named the suite after it because it symbolizes the contextualized intelligence that the company hopes to deliver across the entire suite." http://techcrunch.com/2015/07/13/microsoft-unifies-big-data-and-analytics-in-newly-launched-suite

Cortana Intelligence Suite Getting Started: Skills & Expertise Needed

Azure Portal Azure Portal Different services use the Azure Portal to varying degrees Other Related Portals Some services have their own specific portal for specific functionality: AzureDataCatalog.com DataFactory.Azure.com Studio.AzureML.net

Azure Portal
Azure Resource Manager (ARM)
The new portal is based on the ARM deployment model. ARM allows you to:
  Logically organize related resources which have the same lifecycle
  Manage related resources as a group
  Secure related resources as a group
  Deploy resources via declarative JSON templates (Dev, Test, Prod)
  Source control Azure resources with JSON templates ("infrastructure as code")
  Simplify deployment & rollback
  Track cost of an overall solution
Tags
  Categorize resources for billing or management
  Specify owner and/or support for a resource
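To make the "declarative JSON templates" and tags above more concrete, here is a minimal Python sketch that assembles an ARM-style template as a dictionary and serializes it to JSON. The storage account name, tag keys, and tag values are hypothetical, and the resource type and apiVersion are illustrative choices rather than anything prescribed by the slides.

```python
import json


def build_template(storage_name, tags):
    """Return an ARM-style deployment template as a plain dict.

    Assumption: a single storage account resource is used only as an
    illustration; the name and tags are supplied by the caller.
    """
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            # One template can be reused across environments (Dev, Test, Prod).
            "environment": {
                "type": "string",
                "allowedValues": ["Dev", "Test", "Prod"],
            }
        },
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2016-01-01",
                "name": storage_name,
                "location": "[resourceGroup().location]",
                "sku": {"name": "Standard_LRS"},
                "kind": "Storage",
                # Tags categorize the resource for billing or management.
                "tags": tags,
                "properties": {},
            }
        ],
    }


# Hypothetical values, for illustration only.
template = build_template(
    "examplestorageacct",
    {"owner": "data-team", "costCenter": "analytics"},
)

# The JSON text is what would be kept in source control
# ("infrastructure as code").
template_json = json.dumps(template, indent=2)
print(template_json)
```

Keeping the environment as a template parameter, rather than hard-coding it, is what lets the same file be deployed unchanged to each stage.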

Azure Data Factory Skills & Expertise Needed
Tools
  Authoring and development: Azure Portal; ADF Tools for Visual Studio (VS Extension)
  Monitoring & management: ADF Monitoring & Management App (DataFactory.Azure.com); Data Management Gateway (access to on-prem data)
  Management of source & destination files in Azure Storage: Azure Storage Explorer; AzCopy
Languages & Scripting Options
  JSON
  Appropriate source system language (HiveQL, Pig Latin, C#, U-SQL, T-SQL)
  PowerShell
  REST API

Azure SQL DW Skills & Expertise Needed
Tools
  Authoring, development, management: Azure Portal; SQL Server Data Tools for Visual Studio; SQL Server Management Studio (coming soon)
  Data movement options: PolyBase; Azure Data Factory; SQL Server Integration Services (SSIS); BCP
  Management of source & destination files in Azure Storage: Azure Storage Explorer; AzCopy
Languages & Scripting Options
  T-SQL; PolyBase
  REST APIs
  PowerShell

Azure Data Lake Store Skills & Expertise Needed
Tools
  Data movement options: Azure Portal; U-SQL (Azure Data Lake Analytics); Azure Data Factory; ADLCopy (command line tool); DistCp (distributed copy, if integrated with HDInsight); Sqoop (if integrated with HDInsight)
Languages & Scripting Options
  U-SQL
  PowerShell
  Various SDKs (.NET, Node.js, Java, Python)
  REST APIs (for unsupported languages & platforms)
  HDFS projects (e.g., Sqoop, Storm, if integrated with HDInsight)

Azure Data Lake Analytics Skills & Expertise Needed
Tools
  Authoring and development: Azure Portal; Azure Data Lake Tools for Visual Studio (VS Extension)
Languages & Scripting Options
  U-SQL
  Various SDKs (Java, .NET, Node.js, Python)
  PowerShell

Azure HDInsight Skills & Expertise Needed
Tools
  Authoring and development: Azure Portal; HDInsight Tools for Visual Studio (VS Extension)
  Monitoring & management: Ambari
  Data movement options: SQL Server Integration Services (SSIS); DistCp (Distributed Copy); Sqoop, Storm, Flume, etc.
  Management of source & destination files in Azure Storage: Azure Storage Explorer; AzCopy
Languages & Scripting Options
  Default programming language support: Java, Python (additional languages can be installed)
  HiveQL, Pig Latin, Sqoop, etc.
  PowerShell

Azure Machine Learning Skills & Expertise Needed
Tools
  Authoring & development: Azure Portal; Azure Machine Learning Studio (Studio.AzureML.net); Jupyter Notebooks
  Deployment: Azure ML API Service; Data Management Gateway (access to on-prem data)
  Library of machine learning algorithms: pre-defined algorithms; sample experiments; Cortana Intelligence Gallery
Languages & Scripting Options
  R (for creation of custom modules)
  Python (for execution of scripts)

Azure Stream Analytics Skills & Expertise Needed
Tools
  Azure Portal
Languages & Scripting Options
  Stream Analytics Query Language (closely related to SQL)
  PowerShell
  .NET SDK
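Because the Stream Analytics Query Language is closely related to SQL, a windowed aggregation reads much like a GROUP BY. As a rough illustration, a query along the lines of `SELECT DeviceId, COUNT(*) FROM input GROUP BY DeviceId, TumblingWindow(second, 10)` counts events per device in fixed, non-overlapping 10-second windows. The Python sketch below mimics that grouping logic locally; the event tuples and device names are made up, and windows are labeled by their start time purely for simplicity.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) in fixed, non-overlapping windows.

    events: iterable of (timestamp_in_seconds, key) pairs.
    Assumption: timestamps are non-negative integers.
    """
    counts = Counter()
    for timestamp, key in events:
        # Every timestamp falls into exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample events: (seconds since start, device id).
sample_events = [(1, "device-a"), (5, "device-a"), (12, "device-a"), (13, "device-b")]

for (window_start, device), count in sorted(tumbling_window_counts(sample_events, 10).items()):
    print(window_start, device, count)
```

This only illustrates the windowing idea; the service itself evaluates such queries continuously over a live stream rather than over a list held in memory.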

Power BI Skills & Expertise Needed
Tools
  Power BI Desktop (authoring)
  Excel (authoring)
  Power BI Service (consumption, collaboration & some authoring)
  Power BI Mobile Apps (consumption)
  Power BI Visuals Gallery (published custom visuals)
  Enterprise Gateway + Personal Gateway (access to on-prem data)
Languages & Scripting Options
  M, or Power Query Formula Language (data preparation)
  DAX, Data Analysis Expressions (data model calculations & expressions)
  D3.js + related technologies (authoring of custom visuals)
  Power BI Embedded REST APIs and SDK (integration with custom application)
  Power BI APIs (integration with custom application)

Cognitive Services Skills & Expertise Needed
Tools
  Azure Portal
  Visual Studio, or developer IDE of choice
  GitHub
Languages & Scripting Options
  Source language applicable to the API (Python, C#, etc.)

Bot Framework Skills & Expertise Needed
Tools
  Azure Portal
  Visual Studio, or developer IDE of choice
  GitHub
  Bot Framework Emulator
Languages & Scripting Options
  Bot Builder SDKs in C# and Node.js
  Bot Connector API

Cortana Intelligence Suite Conclusion

Cortana Intelligence Components
Information Management: Data Factory; Data Catalog; Event Hub
Big Data Stores: SQL Data Warehouse; Data Lake Store
Machine Learning & Analytics: Machine Learning; HDInsight; Data Lake Analytics; Stream Analytics
Intelligence: Cortana Assistant; Bot Framework; Cognitive Services
Visualization: Power BI
Plus other solution components such as: IoT Hub; Blob Storage; DocumentDB; Azure SQL Database; Azure Virtual Machine; R Analytics; Excel Applications; Azure Automation; Azure Active Directory; ExpressRoute; Virtual Network

Additional Information
Setting up a PC for Azure Cortana Intelligence Suite Development
  http://www.sqlchick.com/entries/2016/6/17/setting-up-a-pc-for-azure-cortana-intelligence-suite-development
What is the Cortana Intelligence Suite?
  http://www.sqlchick.com/entries/2015/8/22/what-is-the-cortana-analytics-suite
Should You Use a SQL Server Marketplace Image for an Azure Virtual Machine?
  http://www.sqlchick.com/entries/2016/4/2/should-you-use-a-sql-server-marketplace-image-for-an-azure-virtual-machine
How to Build a Demo/Test Environment for Azure Data Catalog
  http://www.sqlchick.com/entries/2016/4/20/how-to-create-a-demo-test-environment-for-azure-data-catalog
Overview of Azure Data Catalog in the Cortana Intelligence Suite
  http://www.sqlchick.com/entries/2015/9/15/overview-of-azure-data-catalog-in-the-cortana-analytics-suite

Thank You for Attending!
To download a copy of this presentation: SQLChick.com, Presentations & Downloads page
Melissa Coates, Solution Architect, BlueGranite (blue-granite.com)
Blog: sqlchick.com
Twitter: @sqlchick
Creative Commons License: Attribution-NonCommercial-NoDerivative Works 3.0