SSIS Scaling and Performance



Similar documents
LearnFromGuru Polish your knowledge

SQL Server Administrator Introduction - 3 Days Objectives

W I S E. SQL Server 2008/2008 R2 Advanced DBA Performance & WISE LTD.

SSIS Training: Introduction to SQL Server Integration Services Duration: 3 days

Agenda. SSIS - enterprise ready ETL

SQL SERVER DEVELOPER Available Features and Tools New Capabilities SQL Services Product Licensing Product Editions Will teach in class room

Microsoft. Microsoft SQL Server Integration Services. Wee-Hyong Tok. Rakesh Parida Matt Masson. Xiaoning Ding. Kaarthik Sivashanmugam

COPYRIGHTED MATERIAL. Welcome to SQL Server Integration Services. What s New in SQL Server 2005 SSIS

Implementing a Data Warehouse with Microsoft SQL Server 2012

High-Volume Data Warehousing in Centerprise. Product Datasheet

Big Data With Hadoop

Guerrilla Warfare? Guerrilla Tactics - Performance Testing MS SQL Server Applications

Load Testing Analysis Services Gerhard Brückl

Would-be system and database administrators. PREREQUISITES: At least 6 months experience with a Windows operating system.

MICROSOFT EXAM QUESTIONS & ANSWERS

SQL SERVER BUSINESS INTELLIGENCE (BI) - INTRODUCTION

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR

Data Integrator Performance Optimization Guide

PUBLIC Performance Optimization Guide

Implementing a Data Warehouse with Microsoft SQL Server 2012

Implementing a Data Warehouse with Microsoft SQL Server

SQL SERVER TRAINING CURRICULUM

Building a BI Solution in the Cloud

Unit & Live Testing for SSIS

SAP Data Services 4.X. An Enterprise Information management Solution

Implementing a Data Warehouse with Microsoft SQL Server 2012

DBMS / Business Intelligence, SQL Server

MS SQL Server 2005 Data Collector. Status: 12/5/2008

Course Outline. Module 1: Introduction to Data Warehousing

AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014

Correct Answer: J Explanation. Explanation/Reference: According to these references, this answer looks correct.

IBM WebSphere DataStage Online training from Yes-M Systems

COURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001

In-Memory Analytics: A comparison between Oracle TimesTen and Oracle Essbase

Implementing a Data Warehouse with Microsoft SQL Server

PassTest. Bessere Qualität, bessere Dienstleistungen!

Data Integration and ETL with Oracle Warehouse Builder: Part 1

Implementing and Maintaining Microsoft SQL Server 2008 Integration Services

Real-time Data Replication

Course Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning

SQL Server 2012 Optimization, Performance Tuning and Troubleshooting

Module 2: Database Architecture

Implementing a Data Warehouse with Microsoft SQL Server MOC 20463

COURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777

SQL Server Performance Tuning and Optimization

Optimizing Performance. Training Division New Delhi

Designing a Dimensional Model

Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days

Business Benefits From Microsoft SQL Server Business Intelligence Solutions How Can Business Intelligence Help You? PTR Associates Limited

Performance Counters. Microsoft SQL. Technical Data Sheet. Overview:

ETL Overview. Extract, Transform, Load (ETL) Refreshment Workflow. The ETL Process. General ETL issues. MS Integration Services

MOC 20461C: Querying Microsoft SQL Server. Course Overview

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February ISSN

East Asia Network Sdn Bhd

SharePoint 2010 Performance and Capacity Planning Best Practices

The safer, easier way to help you pass any IT exams. Exam : C_HANASUP_1. SAP Certified Support Associate - SAP HANA 1.0.

Implementing a Data Warehouse with Microsoft SQL Server

SQL Server Integration Services. Design Patterns. Andy Leonard. Matt Masson Tim Mitchell. Jessica M. Moss. Michelle Ufford

Microsoft SQL Server: MS Performance Tuning and Optimization Digital

SQL Server 2016 New Features!

Course Syllabus. Maintaining a Microsoft SQL Server 2005 Database. At Course Completion

SQL Server Enterprise Edition

SQL Server 2014 New Features/In- Memory Store. Juergen Thomas Microsoft Corporation

Apache Spark 11/10/15. Context. Reminder. Context. What is Spark? A GrowingStack

SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011

MS SQL Server 2000 Data Collector. Status: 12/8/2008

In-Memory Data Management for Enterprise Applications

Solving Performance Problems In SQL Server by Michal Tinthofer

Creating Power BI solutions using Power BI Desktop

Bigdata High Availability (HA) Architecture

Implementing a Data Warehouse with Microsoft SQL Server 2012 (70-463)

Tier Architectures. Kathleen Durant CS 3200

Enterprise Performance Tuning: Best Practices with SQL Server 2008 Analysis Services. By Ajay Goyal Consultant Scalability Experts, Inc.

Top 10 Performance Tips for OBI-EE

Module 1: Getting Started with Databases and Transact-SQL in SQL Server 2008

Enterprise and Standard Feature Compare

Enterprise Data Integration for Microsoft Dynamics CRM

SQL Server Integration Services Design Patterns

Data Warehouse in the Cloud Marketing or Reality? Alexei Khalyako Sr. Program Manager Windows Azure Customer Advisory Team

Microsoft. Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server

COPYRIGHTED MATERIAL. 1Welcome to SQL Server Integration Services

Microsoft SQL Server for Oracle DBAs Course 40045; 4 Days, Instructor-led

Course 10777A: Implementing a Data Warehouse with Microsoft SQL Server 2012

TS:MS SQL Server 2005 Business Intelligence-Implem & Mainte

COURSE SYLLABUS COURSE TITLE:

CRM Magic with Data Migration & Integration

SQL Server Integration Services Using Visual Studio 2005

SQL Server 2005 Features Comparison

Writing Queries Using Microsoft SQL Server 2008 Transact-SQL

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Implementing a Data Warehouse with Microsoft SQL Server 2012

DigiVault Online Backup Manager. Microsoft SQL Server Backup/Restore Guide

Introduction. Part I: Finding Bottlenecks when Something s Wrong. Chapter 1: Performance Tuning 3

Exam Number/Code : Exam Name: Name: PRO:MS SQL Serv. 08,Design,Optimize, and Maintain DB Admin Solu. Version : Demo.

Cloud Computing at Google. Architecture

EII - ETL - EAI What, Why, and How!

Implementing a Data Warehouse with Microsoft SQL Server 2012

Moving the Web Security Log Database

Transcription:

SSIS Scaling and Performance Erik Veerman Atlanta MDF member SQL Server MVP, Microsoft MCT Mentor, Solid Quality Learning Agenda Buffers Transformation Types, Execution Trees General Optimization Techniques Scaling Sources and Destinations Execution location Monitoring and Logging 1

Pipeline Buffers Data extracted into the Data Flow is passed into data buffer groupings Transformations logic flows over buffers for optimal performance Buffers allow a streaming process Each Data Flow can have multiple buffer profiles defined for different types of data Pipeline Buffers 2

Buffer Scheduler Slide Courtesy of Jeff Bernhardt (SSIS Team Lead) Data Flow Architecture Buffers based on design time metadata The width of a row determines the size of the buffer Smaller rows = more rows in memory = greater efficiency Memory copies are expensive! Pointer magic where possible E.g. Multicast logical vs. actual 3

Column <LineageID> Each buffer profile assigns a <LineageID> to each column (not to be confused with ETL Lineage tracking) A single column coming in from a source can end up having multiple LineageIDs through its life in the pipeline LineageIDs are viewable in the Advanced Input and Output Properties Component Types Streaming (synchronous) Logically works at a row level Buffer Reused Examples: Data Convert, Derived Column, Lookup Partially blocking (asynchronous) May logically work at a row level Data copied to new buffers Examples: Pivot, Un-pivot, Merge, Merge Join, Union All Blocking (asynchronous) Needs all input buffers before producing any output rows Data copied to new buffers Examples: Aggregate, Sort, Row Sampling, Fuzzy Grouping 4

Synchronous and Asynchronous Think transformation communication Definitions apply to transformation outputs Synchronous transformation outputs Same buffers immediately passed onto next transformation No rows added, no rows removed Asynchronous transformation outputs Data is copied to new buffer Downstream transformation works independently of upstream asynchronous transformation Synchronous and Asynchronous A Transform is not limited to a single synchronous output Multicast and Conditional Split have multiple synchronous outputs Synchronous outputs preserve the sort order of the input rows Identifying Synchronicity See SynchronousInputID property in Advanced Editor Output property Entry of 0 identifies an asynchronous transformation 5

Execution Trees and Threads Execution Trees Start from a source or an asynchronous output Ends at a destination or an input that has no sync outputs Different buffer profiles per tree Execution Threads Each Source can get a thread Each Execution Tree can get a thread Use EngineThreads to control parallelism Value applies to Execution Trees, not Sources Execution Trees and Threads Each component within an execution tree applies work on the same set of buffers Data in a new execution tree requires existing buffer data to be copied into new buffers 6

Buffer Scheduler Slide Courtesy of Jeff Bernhardt (SSIS Team Lead) General Optimization Increase Engine Threads Breakup large Execution Trees with a Union All transformation to allow a more process threads to handle operations Remove columns in pipeline not used downstream (avoid pipeline warnings) 7

General Optimization Limit Rowbased operations Limit Blocking Transforms (presort if possible) Perform data correlation in the Data Flow Handle staging requirements with a Multicast transformation Use strategic staging to optimize pipeline Optimize temporary storage locations General Optimization Manage and monitor memory! Isolate long running task durations Tuning buffers DefaultBufferMaxRows DefaultBufferSize Buffer size is the lesser of DefaultBufferMaxRows * row-width DefaultBufferSize 8

When to Stage Data Restartability requirements Process window times and precedence Intense downstream transformations causes source back pressure (slows extraction) Data Flow optimization Eases complexity Source and Destination Optimization Drop indexes, load, re-add indexes Increase destination DB filegroup size FastParse to True in the Flat File Source! Use Row Count in various data flow places to understand bottlenecks Use Fast Load! MaxInsertCommitSize 0 = all rows committed at one time >0 = each commit will lesser than buffer rowcount or MaxInsertCommitSize be careful! 9

Package Execution Location Storage location Execution on source server Execution on destination server Execution on third server Distributed execution Package Storage and Execution Location Package Load Path File System Storage or MSDB Storage Resource Impact Package Execution BIDS DTExecUI DTExec (cmd/bat) SQL Agent Programmatic Execution Location 10

Package Execution on Source Server Execution Impact Extraction Impact Load Impact Destination Data Path Execution Location Destination files/data Source files/data Package Execution on Destination Server Extraction Impact Execution Impact Load Impact Source Data Path Source files/data Execution Location Destination files/data 11

Package Execution on Secondary Server Execution Impact Extraction Impact Load Impact Source Data Path Network Impact Destination Data Path Execution Location Source files/data Destination files/data Package Execution on Tertiary Server Extraction Impact Load Impact Source files/data Destination files/data Execution Impact Execution Location 12

Distributed Package Execution File System Storage or MSDB Storage Master SQL Agent, Scheduling App Resource Impact Resource Impact Remote Package Execution SQL Agent Programmatic Service Broker MSMQ Execution Location SQL Agent, Scheduling Agent Remote Package Execution SQL Agent Programmatic Service Broker MSMQ Execution Location SQL Agent, Scheduling Agent Monitoring the Data Flow Pipeline logging events Pipeline Execution Trees Pipeline Execution Plan Performance Monitor counters Object = SQLServer: SSIS Pipeline Blob counters Buffer counters Row counts 13

Pipeline PerfMon Counters Buffer types Flat Buffers: primary buffers used in the pipeline Private Buffers: used by individual transformations to perform operations (Sort, Aggregate, Lookup cache) Buffer Counters (by Total, Flat, or Private) Buffer memory: Amount of memory used by buffers Buffers in use: number of buffers Buffers spooled: # of buffers spooled to disk KATMAI SQL 2008 Buffer threading scheduler changes MERGE/UPSERT Change Data Capture Better lookup component 14