Building Blocks of Cortana Intelligence Suite Carolina IT Pro Group 6/20/2016 Melissa Coates Solution Architect, BlueGranite blue-granite.com Blog: sqlchick.com Twitter: @sqlchick Slide content last updated: 6/19/2016
Building Blocks of Cortana Intelligence Suite Agenda 1. Azure overview 2. Introduce individual components -Purpose -Use cases -Building blocks for solutions 3. Getting started -Skills & expertise
Cortana Intelligence Suite Azure Overview
Microsoft Azure Platform Microsoft Azure is a cloud computing platform and infrastructure for building, deploying, and managing applications and services through a global network of Microsoft-managed datacenters. Azure provides services and supports many different programming languages, tools and frameworks, including both Microsoft-specific and third-party software and systems. http://azureplatform.azurewebsites.net/
Azure Objectives Extend data center infrastructure Scalability Separate compute from storage Shared infrastructure = reduced cost of ownership Self-service provisioning Simplified management = reduced cost of ownership Open source interoperability Simplified development = faster time to value Integration of services Built-in high availability & disaster recovery Shared code base with (some) on-premises resources
Types of Cloud Deployments Shared Infrastructure (Lower Cost) On-Premises Cloud High Scalability Dedicated Infrastructure (Higher Cost of Ownership) More Control (Higher Administration Effort) Less Control (Lower Administration Effort) Limits to Scalability
Types of Cloud Deployments IaaS Azure Virtual Machines PaaS Azure SQL Data Warehouse Azure SQL Database Azure HDInsight Azure Data Lake Store Azure Stream Analytics SaaS Power BI SharePoint Online Office 365 Exchange Online Azure Data Catalog Azure Data Factory Azure Machine Learning IaaS Infrastructure-as-a-Service PaaS Platform-as-a-Service SaaS Software-as-a-Service
Cortana Intelligence Suite Formerly known as Cortana Analytics Suite
Cortana Intelligence Suite Objectives Big Data and Advanced Analytics with less cost and effort Intelligent action from people or automated systems Enable opportunities for automation and innovation: Templates Preconfigured solutions Interoperability Easier to operationalize solutions Open standards Cortana Intelligence Suite = a marketing term for a bundle of integrated services
Cortana Intelligence Suite Sample Scenarios
Cortana Intelligence Suite Gallery Also some samples built into the Azure Portal + Azure documentation area + Visual Studio projects https://gallery.cortanaintelligence.com
Cortana Intelligence Components Information Management Big Data Stores Machine Learning & Analytics Visualization Power BI Data Factory Data Catalog SQL Data Warehouse HDInsight Machine Learning Intelligence Event Hub Data Lake Store Data Lake Analytics Stream Analytics Cortana Assistant Bot Framework Cognitive Services Plus other solution components such as: IoT Hub Blob Storage DocumentDB Azure SQL Database Azure Virtual Machine R Analytics Excel Applications Azure Automation Azure Active Directory ExpressRoute Virtual Network
Cortana Intelligence Suite Current State Young Set of Services Some services have become generally available in last several months Some services still in public preview Pre-Built Solutions Still Evolving Objective is to have a lot of solution templates to accelerate development time Pricing Still Evolving Still focused on individual services vs bundle Integration Still Evolving The ultimate goal is deep integration between many Azure services
Cortana Intelligence Suite Information Management Big Data Stores Machine Learning & Analytics Visualization Power BI Data Factory Data Catalog SQL Data Warehouse HDInsight Machine Learning Intelligence Event Hub Data Lake Store Data Lake Analytics Stream Analytics Cortana Assistant Bot Framework Cognitive Services
Cortana Intelligence Suite Azure Data Catalog Generally Available as of April 2016
Azure Data Catalog Register, manage, search, and explore organizational data sources Data Preview (first x rows) Search by: Tag, Object Type, Source Type, Expert Name Data Profiling Data Profile Info
Azure Data Catalog Register, manage, search, and explore organizational data sources Descriptions for each column Tags for each column How to connect & how to request access
Azure Data Catalog Alternative to ADC interface: APIs in conjunction with custom portal
Azure Data Catalog Purpose Data Documentation & Discovery Enterprise-wide metadata catalog for data assets Simplified data source discovery via search Enrich & understand assets with tags & annotations Collaboration between data producers & data consumers Important to Know One Data Catalog per organization (not one per subscription) Authentication only accepts an organizational account (cannot use a Microsoft account)
Azure Data Catalog Common Use Cases Documentation for Centralized Data Sources Line of business systems Data warehouse, marts Analytic systems Reporting Services File system Data lake Facilitate Self-Service BI Data dictionary Assist combining data from multiple sources Reduce duplication of effort Capture Tribal Knowledge Documentation about the data is maintained by subject matter experts Enhance understanding Data Discovery & Provisioning Users search to discover data assets Who to contact to request access
Azure Data Catalog Building Blocks 1. Register metadata for on-premises data sources (SQL Server: Data Warehouse, SQL Server: CRM, SQL Server: Inventory) 2. Register cloud & 3rd party data sources (Azure SQL DB: Marketing) 3. View database schema in SSDT (SQL Server Data Tools) 4. Connect & view a table of data in client tools (Power BI Desktop, Excel)
Cortana Intelligence Suite Azure Data Factory Generally Available as of August 2015
Azure Data Factory A service for building and operating data pipelines Factory analogy: Raw Materials Acquire Raw Materials Integration and Preparation Finished Goods Deliver Finished Goods
Azure Data Factory Purpose Data Orchestration Automation of data ingestion, orchestration, and data processing Serves as the glue for stitching together services Sources Activities grouped in a Pipeline Destination (Sink)
Azure Data Factory How Does ADF compare to Integration Services? Integration Services Data source connection Azure Data Factory Linked service Control Flow Data Flow Source / destination Built-in transformations Custom transformations SQL Server Agent Scheduling Dataset Pipeline Activities Monitoring Scheduling Hive, Pig, C# scripts ----- Stored proc ----- Copy
Azure Data Factory Key Differences from SSIS ADF does *not* have built-in transformation capabilities (transformations are delegated to Hive, Pig, C#, SQL DB stored procs, etc) Pipeline intent and scheduling are all combined together in ADF (not modular) JSON Most ADF elements are hand-written JSON scripts One Data Factory Frequently just one Data Factory per subscription (a resource can store just one key for use by ADF) ADF diagram can get large (no concept of one package per destination, though pipelines to group activities help) Data lineage can be tracked across the ADF diagram
Azure Data Factory JSON JavaScript Object Notation Data-interchange format Alternative to XML JSON used significantly in Azure
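To make the JSON authoring style concrete, here is a minimal Python sketch that builds a pipeline-like document and serializes it to JSON. The layout is loosely modeled on the general shape of an ADF pipeline definition (activities with inputs and outputs, plus a schedule window). It is simplified and has not been validated against the service. All names (`CopySalesPipeline`, `BlobToSql`, `SalesBlobDataset`, `SalesSqlDataset`) and the helper `referenced_datasets` are hypothetical.

```python
import json

# Illustrative pipeline document; names and dates are hypothetical and the
# structure is a simplified approximation, not a complete ADF schema.
pipeline = {
    "name": "CopySalesPipeline",
    "properties": {
        "description": "Copy raw sales files into a relational table",
        "activities": [
            {
                "name": "BlobToSql",
                "type": "Copy",
                "inputs": [{"name": "SalesBlobDataset"}],
                "outputs": [{"name": "SalesSqlDataset"}],
                # Scheduling lives next to the activity definition, which
                # reflects the "intent and scheduling combined" point above.
                "scheduler": {"frequency": "Day", "interval": 1},
            }
        ],
        "start": "2016-06-01T00:00:00Z",
        "end": "2016-06-02T00:00:00Z",
    },
}


def referenced_datasets(doc):
    """Return the sorted names of all datasets used by the pipeline's activities."""
    names = set()
    for activity in doc["properties"]["activities"]:
        for ref in activity.get("inputs", []) + activity.get("outputs", []):
            names.add(ref["name"])
    return sorted(names)


# Serialize to the JSON text that would be stored in source control.
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
print(referenced_datasets(pipeline))
```

Keeping the definition as plain data makes it easy to lint or diff before deployment, which helps when most elements are written by hand.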
Azure Data Factory ADF Monitoring & Management App
Azure Data Factory Common Use Cases Big Data Processing HDInsight & big data stores are its strength Provide the script you want to run (Hive, Pig, etc) & it will spin up/tear down an HDInsight cluster on-demand Operationalize Solutions Scheduling of data processing operations Stage Data In Supported Data Source Certain Azure services only support cloud data sources Multiple Cloud Sources Integrate or relocate disparate data Combine or copy multiple sources
Azure Data Factory Building Blocks Data Movement with ADF 1. ADF extracts a subset of data into a cloud source supported by the Azure service to be utilized 2. ADF invokes API to execute Machine Learning model 3. ADF outputs scored results back to database for further analysis and integration with other data Components: Data Warehouse, Azure SQL Database, Azure Machine Learning, Azure Data Factory (ADF)
Cortana Intelligence Suite Azure SQL Data Warehouse In Public Preview
Relational Data Warehousing in Azure* SQL Server in a Virtual Machine (IaaS) Run SQL Server workloads (including SSAS, SSRS, SSIS, etc) in an Azure Virtual Machine. Best for: Migrating/extending existing database Administer all aspects Bring your own license Dev/test scenarios *Excluding: Other technologies in a VM such as Oracle, and Big Data technologies like Hive, etc Azure SQL Database (PaaS) A relational database-as-a-service (DBaaS). Close feature parity with SQL Server. Best for: < 1TB data volume (sharding across DBs is not suitable for DW workloads) OLTP with scaling & pooling needs (unpredictable workloads) Reduced administration of DB, O/S, and hardware Azure SQL Data Warehouse (PaaS) A DW-as-a-service (DWaaS) optimized for performance and large scale, distributed workloads. Best for: Larger data volumes on MPP architecture Ability to scale up/down/pause on-demand Combining relational + nonrelational data
Azure SQL DW SQL DW is suitable for large-scale data warehousing workloads Massively parallel processing (MPP) architecture across distributions (SQL DBs) SQL DW data stored in blob storage (*not* SQL DB) Part of the SQL Server family (with differences) https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-overview-what-is/
Azure SQL DW Purpose MPP Scale-Out Query Engine Cloud-based, multi-tenant, platform-as-a-service (PaaS) Massively parallel processing (MPP) Built on SQL Server Clustered columnstore indexes used by default Elastic Scale Scale up/down on-demand or on schedule PolyBase T-SQL for Hadoop queries & data loads Storage + Compute Storage and compute are decoupled Separate billing & scaling for storage vs. compute Data Warehouse Units (DWUs) control compute billing Increase/decrease/pause compute ability independently of data storage
Azure SQL DW Common Use Cases Analytical and Ad Hoc Workloads Batch inserts and updates OLTP workloads are not suitable for SQL DW Varying Workloads Workloads which suit ability to scale compute up/down (ex: during data load processing or intensive analytical operations) Large Scale Data Warehouse Easier to provision large-scale environments in cloud than on-premises Data Variety Integration with various data source types and data structures (i.e., takes advantage of PolyBase)
Azure SQL DW Two Ways of Using PolyBase 1. Querying of relational + semi/unstructured data in a single consolidated query. Objective: avoid data movement from where the data currently resides. Azure SQL Data Warehouse table (Driver Name, Car Model, State): Amanda Green, Civic, WA; Joe Brown, Escort, WA. Azure Blob Storage: telemetry data. T-SQL query results (Driver Name, Car Model, State, Avg Speed, Miles Driven): Amanda Green, Civic, WA, 58, 91; Joe Brown, Escort, WA, 25, 10.
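The consolidated query on this slide can be pictured as a join between relational rows and aggregated telemetry. The following is a plain Python illustration of that logic only. It is not PolyBase or T-SQL. The individual telemetry records are hypothetical, chosen so the aggregates match the results shown on the slide, and the function name `join_with_telemetry` is an assumption.

```python
from collections import defaultdict

# Relational rows as shown on the slide.
drivers = [
    {"driver": "Amanda Green", "car_model": "Civic", "state": "WA"},
    {"driver": "Joe Brown", "car_model": "Escort", "state": "WA"},
]

# Hypothetical telemetry records standing in for files in blob storage.
telemetry = [
    {"driver": "Amanda Green", "speed": 56, "miles": 40},
    {"driver": "Amanda Green", "speed": 60, "miles": 51},
    {"driver": "Joe Brown", "speed": 20, "miles": 4},
    {"driver": "Joe Brown", "speed": 30, "miles": 6},
]


def join_with_telemetry(driver_rows, telemetry_rows):
    """Aggregate telemetry per driver, then attach the aggregates to each driver row."""
    speeds = defaultdict(list)
    miles = defaultdict(int)
    for rec in telemetry_rows:
        speeds[rec["driver"]].append(rec["speed"])
        miles[rec["driver"]] += rec["miles"]

    joined = []
    for row in driver_rows:
        driver_speeds = speeds[row["driver"]]
        avg_speed = sum(driver_speeds) / len(driver_speeds) if driver_speeds else None
        joined.append({**row, "avg_speed": avg_speed, "miles_driven": miles[row["driver"]]})
    return joined


results = join_with_telemetry(drivers, telemetry)
for r in results:
    print(r["driver"], r["car_model"], r["state"], r["avg_speed"], r["miles_driven"])
```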
Azure SQL DW Two Ways of Using PolyBase 2 Data movement from source to target. Currently PolyBase is the recommended method for loading SQL DW due to its parallel processing behavior. Control Node Compute Node Compute Node Compute Node Azure Blob Storage
Azure SQL DW Building Blocks Modern Data Warehouse 1. Ingest data into Azure (copy of raw data) 2. Process and transform data 3. Load processed data to SQL DW in parallel with PolyBase 4. Load relational data to SQL DW (via BCP, SSIS, ADF, etc) 5. Register data source(s) which users are permitted to access Components: Line of Business Systems, Web Data, Blob Storage, SQL Data Warehouse, Data Catalog, Azure Data Factory (ADF)
Azure Data Lake A collection of 3 services: Data Lake Analytics Big Data queries-as-a-service HDInsight Big Data cluster-as-a-service Data Lake Store Big Data storage as-a-service https://info.microsoft.com/co-azure-wbnr-fy16-01jan-19-azure-data-lake-overview-registration.html
Cortana Intelligence Suite Azure Data Lake Store In Public Preview
Azure Data Lake Store Big Data storage-as-a-service
Azure Data Lake Store Purpose Big Data Storage Storage to support Hadoop applications HDFS (Hadoop Distributed File System) for the cloud, with no size limitations Stores data in its native format: objective is to not reformat File System Optimized for Analytics An alternative to general purpose Azure Blob Storage Parallel read scans Scaled out over multiple machines Low latency writes Large file sizes WebHDFS-Compatibility Accessible to all HDFS-compliant projects (if integrated with HDInsight)
Azure Data Lake Store Common Use Cases Big Data Analytic Workloads Optimized to work with ADL Analytics Also supports HDInsight Influx of Data Persist large variety of data formats Retain original data in native format before it's processed Large File Sizes No limits on account or file sizes Can handle PB size files Low Latency Read files as they are being written Active Archive Availability of archival data Analytics Sandbox Working area for data scientists and analysts to store files
Cortana Intelligence Suite Azure Data Lake Analytics In Public Preview
Azure Data Lake Analytics Big Data queries-as-a-service
Azure Data Lake Analytics Purpose Big Data Processing Ability to process any data, regardless of size or structure YARN application built on open standards Query scalability: resources allocated for each query Optimized to work with Azure Data Lake Store Simplification Abstracts away the cluster nodes and focuses on convenience, efficiency, and scalability U-SQL = familiar SQL and C# to reduce learning curve Separation of ADL Analytics from ADL Store: easier to manage, debug, and optimize
Azure Data Lake Analytics U-SQL SQL + C# New big data query processing language Applies schema on read logic Mix of multiple SQL dialects (T-SQL and ANSI SQL) Native extensibility of user code written in C# Full C# expressions Reuse in assemblies Define custom types, functions, etc. Automatically scales and parallelizes across nodes https://info.microsoft.com/co-azure-wbnr-fy16-01jan-19-azure-data-lake-overview-registration.html
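The "schema on read" idea can be sketched outside of U-SQL: data is stored as raw text, and column names and types are applied only when it is read. The Python snippet below illustrates that concept. It is not U-SQL. The sample rows, the `SCHEMA` definition, and the function `read_with_schema` are assumptions made for this example.

```python
import csv
import io

# Raw file content kept in its native, untyped form (hypothetical sample rows).
RAW = "1,Civic,58.5\n2,Escort,25.0\n"

# The schema is supplied by the reader at query time, not by the storage layer.
SCHEMA = [("id", int), ("model", str), ("avg_speed", float)]


def read_with_schema(raw_text, schema):
    """Parse delimited text and cast each field according to the given schema."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        rows.append({name: cast(value) for (name, cast), value in zip(schema, fields)})
    return rows


rows = read_with_schema(RAW, SCHEMA)
print(rows)
```

A different reader could apply a different schema to the same stored bytes, which is the practical benefit of deferring the schema until read time.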
Azure Data Lake Analytics Common Use Cases Focus on Business Logic Focus on jobs rather than on infrastructure for a cluster Abstracts away the cluster nodes and focuses on convenience, efficiency, and scalability Proving Value of Data Initial experimentation or infrequent analysis Can be a precursor to integrating data into relational store Cloud Data Data resides in the cloud U-SQL is a Fit Skillsets and preferences are a fit for U-SQL (SQL + C#) Various Size Workloads Scalability on an individual job basis Objective is to not reserve capacity that's not needed
Azure Data Lake Analytics Building Blocks Augment a Data Warehouse with a Data Lake 1. Ingest new data into ADL Store (via ADF, copy, streaming, etc) 2. Relocate existing data into ADL Store (optional) 3. U-SQL: read data from SQL DW & ADL Store to analyze data 4. U-SQL: write results back to ADL Store 5. Integrate results back into SQL DW (via PolyBase) Components: Line of Business Systems, Web Data, Device Data, Blob Storage, Data Lake Store, Data Lake Analytics, SQL Data Warehouse, Azure Data Factory (ADF)
Cortana Intelligence Suite Azure HDInsight Generally Available as of October 2013
Azure HDInsight Hadoop-based distribution for Big Data solutions
Azure HDInsight Purpose Big Data Processing A Big Data cluster-as-a-service for distributed data processing, scaling, and querying capabilities Supports the Apache Hadoop open source ecosystem: Hive, Spark, R, Solr, Storm, etc Considered a compute service rather than as a big data store Linux or Windows Hortonworks Partnership Based on Hortonworks Data Platform (HDP) distribution (Microsoft + Hortonworks joint engineering team with joint roadmap)
Azure HDInsight Common Use Cases Big Data Scenarios Volume Variety Velocity Leveraging the Hadoop ecosystem You want to manage a cluster & go beyond what U-SQL can easily do with ADL Analytics Integration with other open source projects Development/POC Inexpensive way to test out proof of concept before investing in a big data cluster Data Processing Engine Computations and aggregations for refined data sent to DW and analytics systems Component of ETL operations Data Exploration Part of data scientist's toolbox Event Processing Storm
Azure HDInsight Important to Know A running cluster will charge for compute hours on a per-node basis whether it's being actively used or not. This is why clusters are frequently created and deleted (there's no pause/shutdown). It can take close to 30 minutes to create a cluster on-demand though. Because the data is stored separately from the HDInsight compute service, data is preserved when a cluster is deleted. Azure Blob Storage is currently the default storage of HDInsight, though it will change to Azure Data Lake Store in the future.
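The per-node billing point can be made concrete with a little arithmetic. The sketch below compares a cluster left running all month with one created only for a daily job. The hourly rate, node count, and job length are hypothetical values picked for illustration and are not actual Azure prices. Cluster creation time is ignored.

```python
def compute_cost(nodes, hours, rate_per_node_hour):
    """Compute charge for a cluster: billed per node for every hour it exists."""
    return nodes * hours * rate_per_node_hour


RATE = 0.50      # hypothetical price per node-hour, not an Azure list price
NODES = 4        # hypothetical cluster size
DAYS = 30

# Cluster left running all month, whether or not it is doing work.
always_on = compute_cost(NODES, 24 * DAYS, RATE)

# Cluster created for a 2-hour daily job and deleted afterwards.
on_demand = compute_cost(NODES, 2 * DAYS, RATE)

print(f"always on: {always_on:.2f}")
print(f"on demand: {on_demand:.2f}")
```

Under these assumed numbers the on-demand pattern is billed for 60 hours per node rather than 720, which is why deleting idle clusters is common practice.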
Azure HDInsight Building Blocks On-Demand HDInsight Cluster 1. Ingest new data into Blob Storage (via ADF, copy, streaming, import/export, etc) 2. ADF creates an HDInsight cluster on demand, then executes Hive script to perform data aggregations & calculations 3. Output processed data to database for further analysis 4. HDInsight cluster is deleted by ADF (to save cost) once its time to live expires Components: Web Data, Blob Storage, HDInsight, SQL Data Warehouse, Azure Data Factory (ADF)
Cortana Intelligence Suite Azure Machine Learning Generally Available as of February 2015
Azure Machine Learning A service for building predictive analytics solutions Data preprocessing modules Algorithms Execution of algorithms Custom R or Python
Azure Machine Learning Purpose Predictive Analytics Build predictive models using statistical techniques Learn from existing data to forecast future behaviors, outcomes & trends Minimize learning curve with predefined algorithms and drag & drop authoring environment Extensible with R and Python
Azure Machine Learning Common Use Cases Finding Anomalies Examining patterns for detection of fraud Locating unusual or abnormal equipment readings Descriptive Analytics Analysis of returns Customer segmentation (ex: by buying habits or age group) to improve customer service Personalized offer recommendations Predictive Analytics Credit risk Product demand & revenue predictions Customer retention Weather predictions Machine maintenance & smart buildings Hospital readmissions Student dropouts
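As a rough illustration of the "unusual or abnormal equipment readings" use case, the sketch below flags readings that sit far from the mean using a simple z-score rule. This is a stand-in technique chosen for brevity and is not how Azure Machine Learning modules work internally. The readings, the threshold, and the function name `find_anomalies` are all hypothetical.

```python
import statistics


def find_anomalies(readings, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(readings)
    stdev = statistics.pstdev(readings)
    if stdev == 0:
        return []  # all readings identical, nothing stands out
    return [(i, x) for i, x in enumerate(readings) if abs(x - mean) / stdev > threshold]


# Hypothetical temperature readings from one machine sensor.
readings = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 95.0, 70.1]
anomalies = find_anomalies(readings)
print(anomalies)
```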
Azure Machine Learning Building Blocks Operationalizing an ML Model 1. Relocate on-premises data to a supported online data source 2. Execute Machine Learning model by invoking the deployed machine learning API 3. Integrate scored results to SQL DW for further analysis Components: Line of Business Systems, Azure SQL Database, Azure Machine Learning API, SQL Data Warehouse, Azure Data Factory (ADF)
Cortana Intelligence Suite Azure Stream Analytics + Azure Event Hub Generally Available as of April 2015
Azure Stream Analytics Real-time analytics for high velocity streaming events
Azure Stream Analytics Purpose Stream Analytics Analytic processing engine for streaming events Internet of Things (IoT) solutions for data in motion from devices, sensors, social media, etc Event Hub A publish-subscribe service that handles high volume & high velocity data streams Allows events to be ingested into Azure from many platforms & devices (another option: IoT Hub) The preferred method of event ingestion for Stream Analytics Simplification Alternative to batch loading processes Lower bar to entry for developers by using SQL-like language More straightforward than an HDInsight Storm cluster
Azure Stream Analytics Common Use Cases Device Monitoring & Telemetry Level beyond acceptable threshold Business continuity Traffic Accident & traffic conditions Web Logs Clickstream analysis A/B testing Errors or degraded experience Identity Protection Real-time fraud alerts Identity theft scenarios Social Media Real-time sentiment analysis Demand-Based Pricing Bookings in past x minutes Inventory Levels Shelf volume vs. register checkouts
Azure Stream Analytics Building Blocks Equipment Maintenance Predictions 1. Events ingested to the raw event data queue 2. Consume & process data for a window of time 3. Obtain reference data 4. Real-time reporting 5. Persist streams of data for historical reporting & analysis 6. Processing and aggregations of sensor data 7. Predictions for maintenance and remaining useful life 8. Integrate results for further analysis Components: Machine Sensor Output, Event Hub, Stream Analytics, Azure SQL Database, HDInsight, Azure Machine Learning API, Power BI, SQL Data Warehouse, Azure Data Factory (ADF)
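Processing data "for a window of time" can be sketched with a tumbling window: fixed-size, non-overlapping time buckets, each aggregated independently. The Python below illustrates that one windowing style only. It is not the Stream Analytics Query Language. The event data and the function name `tumbling_window_avg` are assumptions.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average (timestamp_seconds, value) events per fixed, non-overlapping window.

    Returns a dict mapping each window's start time to the mean value in it.
    """
    buckets = defaultdict(list)
    for ts, value in events:
        window_start = (ts // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical sensor events: (seconds since start, temperature).
events = [(0, 70.0), (4, 72.0), (10, 90.0), (14, 94.0), (21, 71.0)]
averages = tumbling_window_avg(events, 10)
print(averages)
```

Each event lands in exactly one window, so the per-window averages can be compared against a threshold or joined with reference data downstream.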
Cortana Intelligence Suite Power BI Generally Available as of July 2015
Power BI Data analysis and visualization tools
Power BI Purpose Data Preparation, Data Modeling, Data Visualization Set of desktop, web, and mobile tools
Power BI Common Use Cases Self-Service BI Mashup of data into a small data model which is imported & refreshed in Power BI Service Data preparation & cleansing Data visualizations Front-End Reporting Reports and dashboards from Corporate BI sources via direct connect Third Party Reporting 3rd party content packs: quick start for isolated reporting scenarios Prototyping Test out ideas for data structure, calculations, and reports Embed in Custom App Power BI Embedded (separate Azure Service)
Power BI Building Blocks Bimodal Business Intelligence 1. User runs Report A 2. Data is requested via gateway to present in report A. Data mashup prepared B. Data model and reports published C. Data refresh schedule created for imported dataset D. User runs Report B & results are returned from the imported dataset Components: On-Prem Source Database, SQL Data Warehouse, Flat File, Enterprise Gateway, Power BI Desktop, Power BI Service (Power BI Dataset Connection, Power BI Report A, Power BI Imported Dataset, Power BI Report B)
Cortana Intelligence Suite Cognitive Services In Public Preview
Cognitive Services Purpose Give Applications a Human Side Designed to make applications more personalized, intelligent and engaging Set of APIs, SDKs, and services to see, hear, and interpret Vision Emotion Speech Face Recognition Language etc Some APIs supported by Azure Machine Learning or Bing services
Cognitive Services Common Use Cases Sentiment Analysis Detect key phrases & topics being discussed Identify feedback posted to website Convert speech to text Emotion recognition Security Systems Facial detection and recognition Auto-Suggestions Completion of partial search queries Personalized Shopping Experience Recommend items likely to be purchased together
Cortana Intelligence Suite Bot Framework In Public Preview
Bot Framework Purpose Conversation Agent Automated interactions with users Enable applications and services to have a conversational user interface (CUI) http://docs.botframework.com/
Bot Framework Common Use Cases Messaging Bots Web chats Text/SMS conversation Content Bots Share relevant content, such as news or weather, with you Watcher Bots for Alerting Flight is delayed Dental appointment reminder E-Commerce Bots Order food Book a flight Check inventory for a product
Cortana Intelligence Suite Cortana Assistant
Cortana Assistant Personal digital assistant
Cortana Assistant Purpose Apply Language More Pervasively Virtual personal assistant for: Asking questions Finding things on PC Managing calendar Tracking packages Integration Reminders are integrated between Windows devices Search within other apps, such as a calendar Interact with bots to make requests
Cortana Assistant Common Use Cases Reminders Help with remembering appointments Power BI Natural Language Type or talk question: "Show me sales for this quarter in the East region" and a Power BI chart will render Work With Bots Interact with a bot to place an order or schedule a meeting
Cortana Assistant Important To Know Three requirements to render Power BI reports via the Cortana Assistant 1. Permission in the Power BI Dataset for Cortana to access the data 2. Settings for PC are associated to the organizational account (Settings > Accounts > Work Access) 3. Cortana on PC is connected to Office 365 and the organizational account (Cortana > Notebook > Connected Accounts > Office 365)
Cortana Assistant Why Is the Suite Named After Cortana? "As for Cortana, which is the Microsoft voice-driven personal assistant tool in Windows 10, it's a small part of the solution, but Sirosh says Microsoft named the suite after it because it symbolizes the contextualized intelligence that the company hopes to deliver across the entire suite." http://techcrunch.com/2015/07/13/microsoft-unifies-big-data-and-analytics-in-newly-launched-suite
Cortana Intelligence Suite Getting Started: Skills & Expertise Needed
Azure Portal Azure Portal Different services use the Azure Portal to varying degrees Other Related Portals Some services have their own specific portal for specific functionality: AzureDataCatalog.com DataFactory.Azure.com Studio.AzureML.net
Azure Portal Azure Resource Manager (ARM) The new portal is based on the ARM deployment model ARM allows you to: Logically organize related resources which have the same lifecycle Manage related resources as a group Secure related resources as a group Deploy resources via declarative JSON templates (Dev, Test, Prod) Source control Azure resources with JSON templates ("infrastructure as code") Simplify deployment & rollback Track cost of an overall solution Tags Categorize resources for billing or management Specify owner and/or support for a resource
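A declarative template with tags might look roughly like the skeleton below, built here as a Python dict and serialized to JSON. The top-level keys follow the general ARM template layout. The resource name, tag values, parameter, and `apiVersion` are illustrative assumptions, and the skeleton omits the properties a real deployment would need. The helper `resources_with_tag` is hypothetical.

```python
import json

# Skeleton of a declarative template. Resource details are illustrative only
# and incomplete; this is not a deployable template.
template = {
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "environment": {"type": "string", "allowedValues": ["Dev", "Test", "Prod"]}
    },
    "resources": [
        {
            "type": "Microsoft.Storage/storageAccounts",
            "name": "examplestorageacct",        # hypothetical name
            "apiVersion": "2015-06-15",           # assumed version, verify before use
            "location": "[resourceGroup().location]",
            "tags": {"owner": "data-team", "costCenter": "analytics"},
        }
    ],
}


def resources_with_tag(tpl, key, value):
    """List names of resources in the template carrying the given tag value."""
    return [r["name"] for r in tpl["resources"] if r.get("tags", {}).get(key) == value]


# The JSON text is what would be kept under source control.
template_json = json.dumps(template, indent=2)
print(template_json)
print(resources_with_tag(template, "owner", "data-team"))
```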
Azure Data Factory Skills & Expertise Needed Tools Authoring and development: Azure Portal ADF Tools for Visual Studio (VS Extension) Monitoring & management: ADF Monitoring & Management App (DataFactory.Azure.com) Data Management Gateway (access of on-prem data) Management of source & destination files in Azure Storage: Azure Storage Explorer AzCopy Languages & Scripting Options JSON Appropriate source system language (HiveQL, Pig Latin, C#, U-SQL, T-SQL) PowerShell REST API
Azure SQL DW Skills & Expertise Needed Tools Authoring, development, management: Azure Portal SQL Server Data Tools for Visual Studio SQL Server Management Studio (coming soon) Data movement options: PolyBase Azure Data Factory SQL Server Integration Services (SSIS) BCP Management of source & destination files in Azure Storage: Azure Storage Explorer AzCopy Languages & Scripting Options T-SQL; PolyBase REST APIs PowerShell
Azure Data Lake Store Skills & Expertise Needed Tools Data movement options: Azure Portal U-SQL (Azure Data Lake Analytics) Azure Data Factory ADLCopy (command line tool) DistCp (distributed copy, if integrated with HDInsight) Sqoop (if integrated with HDInsight) Languages & Scripting Options U-SQL PowerShell Various SDKs (.NET, Node.js, Java, Python) REST APIs (for unsupported languages & platforms) HDFS Projects (ex: Sqoop, Storm, if integrated with HDInsight)
Azure Data Lake Analytics Skills & Expertise Needed Tools Authoring and development: Azure Portal Azure Data Lake Tools for Visual Studio (VS Extension) Languages & Scripting Options U-SQL Various SDKs (Java, .NET, Node.js, Python) PowerShell
Azure HDInsight Skills & Expertise Needed Tools Authoring and development: Azure Portal HDInsight Tools for Visual Studio (VS Extension) Monitoring & management: Ambari Data movement options: SQL Server Integration Services (SSIS) DistCp (Distributed Copy) Sqoop, Storm, Flume, etc Management of source & destination files in Azure Storage: Azure Storage Explorer AzCopy Languages & Scripting Options Default programming language support: Java, Python (Additional languages can be installed) HiveQL, Pig Latin, Sqoop, etc PowerShell
Azure Machine Learning Skills & Expertise Needed Tools Authoring & Development: Azure Portal Azure Machine Learning Studio (Studio.AzureML.net) Jupyter Notebooks Deployment: Azure ML API Service Data Management Gateway (access of on-prem data) Library of Machine Learning Algorithms Pre-defined algorithms Sample experiments Cortana Intelligence Gallery Languages & Scripting Options R (for creation of custom modules) Python (for execution of scripts)
Azure Stream Analytics Skills & Expertise Needed Tools Azure Portal Languages & Scripting Options Stream Analytics Query Language (closely related to SQL) PowerShell .NET SDK
Power BI Skills & Expertise Needed Tools Power BI Desktop (authoring) Excel (authoring) Power BI Service (consumption, collaboration & some authoring) Power BI Mobile Apps (consumption) Power BI Visuals Gallery (published custom visuals) Enterprise Gateway + Personal Gateway (access on-prem data) Languages & Scripting Options M or Power Query Formula Language (data preparation) DAX - Data Analysis expressions (data model calculations & expressions) D3.js + related technologies (authoring of custom visuals) Power BI Embedded REST APIs and SDK (integration with custom application) Power BI APIs (integration with custom application)
Cognitive Services Skills & Expertise Needed Tools Azure Portal Visual Studio, or developer IDE of choice GitHub Languages & Scripting Options Source language applicable to the API (Python, C#, etc)
Bot Framework Skills & Expertise Needed Tools Azure Portal Visual Studio, or developer IDE of choice GitHub Bot Framework Emulator Languages & Scripting Options Bot Builder SDKs in C# and Node.js Bot Connector API
Cortana Intelligence Suite Conclusion
Cortana Intelligence Components Information Management Big Data Stores Machine Learning & Analytics Visualization Power BI Data Factory Data Catalog SQL Data Warehouse HDInsight Machine Learning Intelligence Event Hub Data Lake Store Data Lake Analytics Stream Analytics Cortana Assistant Bot Framework Cognitive Services Plus other solution components such as: IoT Hub Blob Storage DocumentDB Azure SQL Database Azure Virtual Machine R Analytics Excel Applications Azure Automation Azure Active Directory ExpressRoute Virtual Network
Additional Information Setting up a PC for Azure Cortana Intelligence Suite Development http://www.sqlchick.com/entries/2016/6/17/setting-up-a-pc-for-azure-cortanaintelligence-suite-development What is the Cortana Intelligence Suite? http://www.sqlchick.com/entries/2015/8/22/what-is-the-cortana-analytics-suite Should You Use a SQL Server Marketplace Image for an Azure Virtual Machine? http://www.sqlchick.com/entries/2016/4/2/should-you-use-a-sql-server-marketplaceimage-for-an-azure-virtual-machine How to Build a Demo/Test Environment for Azure Data Catalog http://www.sqlchick.com/entries/2016/4/20/how-to-create-a-demo-test-environmentfor-azure-data-catalog Overview of Azure Data Catalog in the Cortana Intelligence Suite http://www.sqlchick.com/entries/2015/9/15/overview-of-azure-data-catalog-in-thecortana-analytics-suite
Thank You for Attending! To download a copy of this presentation: SQLChick.com Presentations & Downloads page Melissa Coates Solution Architect, BlueGranite blue-granite.com Blog: sqlchick.com Twitter: @sqlchick Creative Commons License: Attribution-NonCommercial-NoDerivative Works 3.0