Columnstore in SQL Server 2016



Similar documents
ENHANCEMENTS TO SQL SERVER COLUMN STORES. Anuhya Mallempati #

SQL Server and MicroStrategy: Functional Overview Including Recommendations for Performance Optimization. MicroStrategy World 2016

Performance Verbesserung von SAP BW mit SQL Server Columnstore

SAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here

Inge Os Sales Consulting Manager Oracle Norway

SQL Server 2016 New Features!

Microsoft Data Platform Evolution

Optimizing the Performance of the Oracle BI Applications using Oracle Datawarehousing Features and Oracle DAC

Microsoft Analytics Platform System. Solution Brief

Who am I? Copyright 2014, Oracle and/or its affiliates. All rights reserved. 3

SQL Server 2012 Performance White Paper

In-Memory Data Management for Enterprise Applications

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server

Oracle Database In-Memory The Next Big Thing

THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS

Retail POS Data Analytics Using MS Bi Tools. Business Intelligence White Paper

Updating Your Skills to SQL Server 2016

Using SQL Server 2014 In-Memory Optimized Columnstore with SAP BW

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

PostgreSQL Business Intelligence & Performance Simon Riggs CTO, 2ndQuadrant PostgreSQL Major Contributor

Integrating Apache Spark with an Enterprise Data Warehouse

Microsoft SQL Database Administrator Certification

INTRODUCING DRUID: FAST AD-HOC QUERIES ON BIG DATA MICHAEL DRISCOLL - CEO ERIC TSCHETTER - LEAD METAMARKETS

Expert Oracle Exadata

Microsoft SQL Server Decision Support (DSS) Load Testing

Performance Tuning and Optimizing SQL Databases 2016

Optimize Oracle Business Intelligence Analytics with Oracle 12c In-Memory Database Option

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW

Course Outline. Upgrading Your Skills to SQL Server 2016 Course 10986A: 5 days Instructor Led

Apache Kylin Introduction Dec 8,

2009 Oracle Corporation 1

IN-MEMORY DATABASE SYSTEMS. Prof. Dr. Uta Störl Big Data Technologies: In-Memory DBMS - SoSe

Oracle InMemory Database

Oracle Architecture, Concepts & Facilities

Parallel Data Warehouse

SQL Server 2008 Performance and Scale

MOC 20467B: Designing Business Intelligence Solutions with Microsoft SQL Server 2012

SQL Server 2014 In-Memory Tables (Extreme Transaction Processing)

SQL Server In-Memory OLTP and Columnstore Feature Comparison

Managing the PowerPivot for SharePoint Environment

Driving Peak Performance IBM Corporation

SAS BI Course Content; Introduction to DWH / BI Concepts

Vertica Live Aggregate Projections

Updating Your SQL Server Skills to Microsoft SQL Server 2014

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

W I S E. SQL Server 2008/2008 R2 Advanced DBA Performance & WISE LTD.

In-memory databases and innovations in Business Intelligence

Course 10977A: Updating Your SQL Server Skills to Microsoft SQL Server 2014

Business Intelligence in SharePoint 2013

MS Designing and Optimizing Database Solutions with Microsoft SQL Server 2008

SQL Server Analysis Services Complete Practical & Real-time Training

Would-be system and database administrators. PREREQUISITES: At least 6 months experience with a Windows operating system.

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

Real-Time Analytical Processing with SQL Server

Safe Harbor Statement

LearnFromGuru Polish your knowledge

In-Memory Databases MemSQL

Implementing a SQL Data Warehouse 2016

ROLAP with Column Store Index Deep Dive. Alexei Khalyako SQL CAT Program Manager

Course Outline. SQL Server 2014 Performance Tuning and Optimization Course 55144: 5 days Instructor Led

MS SQL Performance (Tuning) Best Practices:

SAP BW Columnstore Optimized Flat Cube on Microsoft SQL Server

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

LEARNING SOLUTIONS website milner.com/learning phone

Azure Scalability Prescriptive Architecture using the Enzo Multitenant Framework

Developing Microsoft SQL Server Databases (20464) H8N64S

Course 20464: Developing Microsoft SQL Server Databases

Database Performance with In-Memory Solutions

Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot

Developing Microsoft SQL Server Databases 20464C; 5 Days

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering


The Vertica Analytic Database Technical Overview White Paper. A DBMS Architecture Optimized for Next-Generation Data Warehousing

SQL 2016 and SQL Azure

Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich

Data Warehousing and Data Mining

MOC 20461C: Querying Microsoft SQL Server. Course Overview

Building a BI Solution in the Cloud

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

Entity store. Microsoft Dynamics AX 2012 R3

RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems CLOUD COMPUTING GROUP - LITAO DENG

Microsoft SQL Server performance tuning for Microsoft Dynamics NAV

Tiber Solutions. Understanding the Current & Future Landscape of BI and Data Storage. Jim Hadley

Online Courses. Version 9 Comprehensive Series. What's New Series

20464C: Developing Microsoft SQL Server Databases

Exploring the Synergistic Relationships Between BPC, BW and HANA

Updating Your SQL Server Skills to Microsoft SQL Server 2014

SQL Server 2012 Business Intelligence Boot Camp

Unlock your data for fast insights: dimensionless modeling with in-memory column store. By Vadim Orlov

SQL Server Administrator Introduction - 3 Days Objectives

<Insert Picture Here> Best Practices for Extreme Performance with Data Warehousing on Oracle Database

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

Innovative technology for big data analytics

David Dye. Extract, Transform, Load

low-level storage structures e.g. partitions underpinning the warehouse logical table structures

Power BI Performance. Tips and Techniques

Transcription:

Columnstore in SQL Server 2016 Niko Neugebauer 3 Sponsor Sessions at 11:30 Don t miss them, they might be getting distributing some awesome prizes! HP SolidQ Pyramid Analytics Also Raffle prizes at the end of the event provided by HP, SolidQ, Pyramid Analytics, Altran & Microsoft 1

Our Main Sponsors: 2

Niko Neugebauer Microsoft Data Platform Professional OH22 (http://www.oh22.net) SQL Server MVP Founder of a couple of Portuguese PASS Chapters (SQLPort, BITuga, Porto.Data) Creator of CISL Open Source Columnstore Indexes Script Library (https://github.com/nikoneugebauer/cisl) Blog: http://www.nikoport.com Twitter: @NikoNeugebauer LinkedIn: http://pt.linkedin.com/in/webcaravela Email: info@webcaravela.com Our Columnstore Agenda Operational Analytics DataWarehousing High Availability Batch Mode Data Loading Improvements Performance Improvements Change Tracking Maintenance Improvements Monitorting Improvements (DMV, Extended Events, Performance Counters) 3

1. Analytics What is it all about: Predictive Analytics, Prescriptive Analytics, Business Analytics, Machine Learning, Data Mining OLAP - Online Analytical Processing Big Data But technically it all goes down to the one thing we treasure the most: the DATA 1. Analytics Faces Analytics is the process of discovery and communication of meaningful patterns in data. Reporting - extraction of the aggregated information for further analysis. Querying - data extraction process 4

Traditional operational/analytics architecture Key issues IIS Server BI analysts Complex implementation Requires two servers (capital expenditures and operational expenditures) Data latency in analytics More businesses demand; requires real-time analytics Minimizing data latency for analytics BI analysts Benefits No data latency No ETL IIS Server No separate data warehouse Challenges Analytics queries are resource intensive and can cause blocking Minimizing impact on operational workloads Sub-optimal execution of analytics on relational schema 5

Operational Analytics IIS Server BI analysts What is operational analytics and what does it mean to you? Operational analytics with disk-based tables Operational analytics with In-Memory OLTP 2. Traditional Operational Analytics Problems: Costs Integration Problems (data types, constraints, network problems, etc) Delay for getting the actual data 6

2. Modern Operational Analytics notes: Nothing substitutes analytics queries performance possible using schemas customized (Star/Snowflake) and/or pre-aggregated cubes 3. Database Trends Right now, 5 types of Major Investments: Analytics Big Data High Availability In-Memory Columnstore 7

4. Operational Analytics in SQL Server 2016 There are now 2 types of Operational Analytics: Operational Analytics for RowStore Operational Analytics for InMemory Operational Analytics Structure 8

COLUMNAR STORAGE Columnstore Indexes are: Verytically separated Grouped into Segments Extremely compressed Tuned for processing large volumes of data 9

Row Store vs Column Store ID First Name Last Name Salary 1 Jody Philipps 43.03 2 Mark Johnson 37.08 3 Matt Markensen 16.81 4 Gail Lindberg 24.90 Row Store Index: 1, Jody, Philips, 43.04; 2, Mark, Johnson, 37.08;... Column Store Index: 1, 2, 3, 4; Jody, Mark, Matt, Gail;... Row Store vs Column Store Row Store Column Store ID First Name Last Name Salary ID First Name Last Name Salary 1 Jody Philips 43.03 2 Mark Johnson 37.08 3 Matt Markensen 16.81 4 Gail Lindberg 24.90 1 23 4 Jody Mark Matt Gail Philips Johnson Markense Lindberg n 43.03 37.08 16.81 24.90 Data is stored on the disk tuple by tuple Data is stored on the disk column by column 10

Row Store vs Column Store Row Store Easy to add & modify tuples Might read unnecessary data Column Store Reading only relevant data Tuple writes require multiple accesses Phases of Columnstore Index creation: 1. Row Groups separation 2. Segment creation 3. Compression 11

1. Row Groups creation } } } } ~ 1 Million Rows ~ 1 Million Rows ~ 1 Million Rows ~ 1 Million Rows 2. Segments separation Column Segment 12

3. Compression 1 A...... 3 C...... 2 B...... 2 B...... 3 C...... 4 D.... 4 D...... 1 A..... SEGMENTS 13

A Segment... Contains around 1 Million Rows of unordered data Logical unit of data operations (no matter how many 8K Pages or Extents are involved) (8 MB ~= 1000 Pages ~= 130 Extents) Agressive Read-Ahead Large Object Cache (New since SQL Server 2012) New Memory Broker separates between Row Store & Column Store Segment column selection SELECT sum(c2), sum(c3) FROM dbo.factonlinesales WHERE C2 > 10 and C2 <=20; C1 C2 C3 C4 14

Segment elimination SELECT sum(c2), sum(c3) FROM dbo.factonlinesales WHERE C2 > 10 and C2 <=20; Min Value Max Value 1 9 C1 C2 1 C3 C4 9 10 20 15 30 31 45 Columnstore in SQL Server 2014 15

Operational Analytics Query Processing with Operational Analytics 16

DEMO TIME! Filtered Columnstore You can actually create a filtered Nonclustered Columnstore 17

DEMO TIME! 5. In-Memory Operational Analytics 18

5. In-Memory Operational Analytics sys.sp_memory_optimized_cs_migration compresses data from InMemory OLTP Table into In- Memory Columnstore Runs in interop mode only so far (it is changing with every release) Single core execution only (so far), but hey, it s a Batch Mode! DEMO TIME! 19

Clustered Columnstore (DWH) Nonclustered Indexes Primary & Foreign Keys Nonclustered Secondary Rowstore Locking Further Columnstore Improvements Data Loading Improvements High Availability Batch Mode Performance Improvements Change Tracking Maintenance Improvements Monitorting Improvements (DMV, Extended Events, Performance Counters) 20

Data Loading Improvements Parallel Data Loading finally enabled SIMD support Delta-Stores are not Page-Compressed!!! Under NCCI Delta-Stores are automatically increasing their max size with each iteration (1,2,4,8,16,32 Million Rows) Parallel Data Loading 21

High Availability Readable Secondaries for Availability Groups through Snapshot & Read Committed Snapshot Isolation Levels support Batch Mode Improvements Batch Mode support for 1 core execution plan operators Batch Mode support for the Sort operator Batch Mode support for the Multiple Distinct Count operations Batch Mode support for the Left Anti-Semi Join operators 22

Batch Mode for the Windowing functions Batch Performance Improvements: Simple Aggregate Predicate Pushdown String Predicate Pushdown for the Clustered Columnstore Index Scan operator in Batch Mode 23

String Predicate Pushdown Change Tracking in SQL Server 2016 Columnstore Change Tracking Change Data Capture Temporal (2016) Clustered no no yes Nonclustered yes yes yes 24

Maintenance Improvements: Better Index Reorganize (removes deleted rows, less memory pressure) New DMVs: sys.dm_column_store_object_pool sys.dm_db_column_store_row_group_physical_stats sys.dm_db_column_store_row_group_operational_stats sys.dm_db_index_operational_stats sys.dm_db_index_physical_stats sys.internal_partitions 25

Muchas Gracias! Resources: My Columnstore Blogpost Series (70+): http://www.nikoport.com/columnstore CISL Open Source Columnstore Library: https://github.com/nikoneugebauer/cisl 52 26