Tuning Tableau and Your Database for Great Performance
Presented by Matt Higgins, Tableau Software, and Robert Morton, Tableau Software

Tuning Tableau and Your Database for Great Performance
- Understand Tableau's Query Generation
- Identify Database Tuning Opportunities
- Author Optimized Visualizations

Agenda: Technical* Guidelines for Performance
* Occasional marketing slides required

Understand Tableau's Query Generation
Motivation:
- Direct connect & go!
- No up-front, complex ETL
- Push the computation close to the data
- Leverage customer investment in fast databases
Topics covered:
- Filters
- Subqueries and temporary tables
- Custom SQL
- Discovery queries

Tableau Architecture
Data sources: Databases, Data Marts, Files, Cubes
Components: User Interface, VizQL Compiler, SQL Connector, MDX Connector, Data Engine
Tableau Native Query Technology
+ Interactive queries to the database
+ Leverages IT policies for security
+ Avoids data silos
+ Ensures fresh data results
+ Values the EDW strategy
Tableau Data Engine Technology
+ Memory efficient
+ Column store
+ Highly compressed
+ Optimized API specific for Tableau
+ 64-bit (32-bit version as well)
Use this option:
+ When the source DB query performance is slow
+ To offload iterative query workload from the source DB
+ To work offline from the network
+ To keep an archive of the data

Understand Tableau's Query Generation

Query Generation: Filters
- Filters are very expressive
- Different performance characteristics
  - Range filter vs. itemized list
  - Context filter
  - Quick filters with "show relevant values"
  - Slicing filter

Filters: Range vs. Itemized List
Several ways to filter to the selection:
- Keep-only the selection
- Exclude the inverse selection
- Filter by range: Avg. FarePerMile < 1.0 and Avg. Distance < 5,000

Filters: Range vs. Itemized List
Performance:
- 10.0s: Unfiltered
- 19.5s: Keep-only the selection
- 15.9s: Exclude the inverse selection
- 9.5s: Filter by range

Filters: Range vs. Itemized List
Explanation:
- Discrete lists are expensive to evaluate
- Keep-only can easily result in large, discrete lists
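
The difference between the two filter shapes can be sketched with Python's built-in sqlite3 module standing in for the database. This is only an illustration: the flights table, its columns and its values are made up, and the SQL Tableau actually generates will differ. The point is that a keep-only filter carries one literal per selected mark, while the range filter stays two comparisons however many marks qualify.

```python
import sqlite3


def make_flights(conn):
    """Create a tiny, made-up flights table (all names and values are illustrative)."""
    conn.execute(
        "CREATE TABLE flights (carrier TEXT, fare_per_mile REAL, distance REAL)"
    )
    conn.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [
            ("AA", 0.4, 1000.0),
            ("AA", 0.6, 1400.0),
            ("BB", 0.8, 4000.0),
            ("CC", 1.4, 800.0),
            ("DD", 0.9, 6000.0),
        ],
    )


def keep_only(conn, carriers):
    """Itemized list: the IN (...) predicate grows with the size of the selection."""
    placeholders = ", ".join("?" for _ in carriers)
    sql = (
        "SELECT carrier FROM flights "
        f"WHERE carrier IN ({placeholders}) "
        "GROUP BY carrier ORDER BY carrier"
    )
    return [row[0] for row in conn.execute(sql, list(carriers))]


def by_range(conn, max_fare, max_distance):
    """Range filter: two comparisons on the aggregates, whatever the selection size."""
    sql = (
        "SELECT carrier FROM flights GROUP BY carrier "
        "HAVING AVG(fare_per_mile) < ? AND AVG(distance) < ? "
        "ORDER BY carrier"
    )
    return [row[0] for row in conn.execute(sql, (max_fare, max_distance))]


conn = sqlite3.connect(":memory:")
make_flights(conn)
print(keep_only(conn, ["AA", "BB"]))
print(by_range(conn, 1.0, 5000))
```

Both calls select the same carriers here; with thousands of selected marks the first query would carry thousands of literals, while the second would not change.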

Filters: Add to Context
Purpose of context filters:
- Establish baseline criteria that are evaluated first
- All subsequent filters are performed on this resultset
How does Tableau process this?
- Instantiate the context filter resultset into a temp table, or
- Express the context filter resultset as a subquery
Performance side-effects of context filters:
- May have no impact
- May reduce performance if the filter is not very selective
  - The resultset will be nearly as big as the original table
- May boost performance if the filter is highly selective
  - The resultset will be small, making all subsequent filters, aggregations, etc. much faster
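
The two strategies named above (temp table or subquery) can be sketched as follows, again with sqlite3 as a stand-in database. The orders table, the context_west temp table and the choice of region as the context filter are assumptions made for illustration only; they are not Tableau's actual generated names or SQL.

```python
import sqlite3


def make_orders(conn):
    """Create a small, hypothetical orders table."""
    conn.execute("CREATE TABLE orders (region TEXT, category TEXT, sales REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("West", "Coffee", 10.0),
            ("West", "Tea", 5.0),
            ("East", "Coffee", 7.0),
            ("East", "Tea", 3.0),
        ],
    )


def coffee_sales_via_temp_table(conn):
    """Materialize the context filter once, then run later filters against it."""
    conn.execute("DROP TABLE IF EXISTS context_west")
    conn.execute(
        "CREATE TEMP TABLE context_west AS "
        "SELECT * FROM orders WHERE region = 'West'"
    )
    return conn.execute(
        "SELECT category, SUM(sales) FROM context_west "
        "WHERE category = 'Coffee' GROUP BY category"
    ).fetchall()


def coffee_sales_via_subquery(conn):
    """Express the same context filter inline as a subquery."""
    return conn.execute(
        "SELECT category, SUM(sales) "
        "FROM (SELECT * FROM orders WHERE region = 'West') AS ctx "
        "WHERE category = 'Coffee' GROUP BY category"
    ).fetchall()


conn = sqlite3.connect(":memory:")
make_orders(conn)
print(coffee_sales_via_temp_table(conn))
print(coffee_sales_via_subquery(conn))
```

The temp table pays a one-time cost to materialize the context, which is why a context filter that is not very selective can hurt: the copy is nearly as large as the original table.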

Quick Filters
- Easy-access UI for filtering
- Compact forms:
  - Wildcard Match
  - Compact List
  - Slider
- Only Relevant Values
  - Filter items reflect choices made for other filters
  - e.g. [State] should not list California if [Region] excludes West
  - Requires querying the database for the domain of every filter any time a user modifies any of the filters
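
A rough sketch of the kind of domain query that "only relevant values" implies, using sqlite3 and a hypothetical stores table. The relevant_values helper and its signature are invented for this example; the real queries Tableau issues are not shown in the slides.

```python
import sqlite3


def make_stores(conn):
    """Create a hypothetical stores table with a region and a state column."""
    conn.execute("CREATE TABLE stores (region TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO stores VALUES (?, ?)",
        [
            ("West", "California"),
            ("West", "Oregon"),
            ("East", "New York"),
            ("East", "Ohio"),
        ],
    )


def relevant_values(conn, field, excluded):
    """Return the domain of `field` given exclusions made on other filters.

    `field` and the keys of `excluded` are trusted column names in this sketch;
    `excluded` maps a column name to the single value excluded on it.
    """
    sql = f"SELECT DISTINCT {field} FROM stores"
    if excluded:
        sql += " WHERE " + " AND ".join(f"{col} <> ?" for col in excluded)
    sql += f" ORDER BY {field}"
    return [row[0] for row in conn.execute(sql, list(excluded.values()))]


conn = sqlite3.connect(":memory:")
make_stores(conn)
# With West excluded on [Region], California drops out of the [State] list.
print(relevant_values(conn, "state", {"region": "West"}))
```

One such query is needed per quick filter, and all of them must be re-run whenever any filter changes, which is where the cost comes from.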

Slicing Filter
- Filter criteria is not part of the visual level of detail
- Inspired by MDX slicers
- Not trivial to express in SQL
- Tableau must isolate the slicing filter level of detail
  - Temp table
  - Subquery
Subquery example:
    SELECT "Staples"."Prod Type1" AS "none:prod Type1:nk",
           SUM("Staples"."Sales Total") AS "sum:sales Total:qk"
    FROM "TestV1"."Staples" "Staples"
      INNER JOIN (
        SELECT "Staples"."Prod Type4" AS "none:prod Type4:nk",
               COUNT(1) AS "_Tableau_join_flag"
        FROM "TestV1"."Staples" "Staples"
        GROUP BY 1
        HAVING (SUM(CAST(1 AS BIGINT)) > 1000)
      ) "t0" ON ("Staples"."Prod Type4" = "t0"."none:prod Type4:nk")
    GROUP BY 1

Query Generation: Custom SQL
What is Custom SQL?
- A way to define a connection based on complex join conditions, filtering and pre-aggregation
Why does Tableau wrap Custom SQL? e.g.:
    SELECT * FROM (
      SELECT `starbucks`.`market` AS `Market`,
             `starbucks`.`product Type` AS `Product Type`,
             SUM(`starbucks`.`Sales`) AS `Sum of Sales`
      FROM `starbucks`
      WHERE `starbucks`.`state` <> 'California'
      GROUP BY 1, 2
    ) `TableauSQL`

Query Generation: Custom SQL
- Custom SQL represents a resultset
- Tableau visualizations build upon the resultset
  - Queries are not part of the resultset
  - Instead they are layered on top of the resultset
    SELECT `TableauSQL`.`Market` AS `none_market_nk`,
           SUM(1) AS `sum_number of Records_qk`
    FROM (
      SELECT `starbucks`.`market` AS `Market`,
             `starbucks`.`product Type` AS `Product Type`,
             SUM(`starbucks`.`Sales`) AS `Sum of Sales`
      FROM `starbucks`
      WHERE `starbucks`.`state` <> 'California'
      GROUP BY 1, 2
    ) `TableauSQL`
    WHERE (`TableauSQL`.`Product Type` = 'Coffee')
    GROUP BY 1
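
The layering can be tried out with sqlite3 as a stand-in database. This is a sketch under assumptions: the starbucks rows are invented, the column is named product_type rather than `product Type`, and records_per_market is a hypothetical helper, not part of Tableau. It shows only that the user's Custom SQL is left untouched and every visualization query is wrapped around it.

```python
import sqlite3

# The user's Custom SQL, treated as an opaque resultset definition.
CUSTOM_SQL = '''
SELECT market AS "Market",
       product_type AS "Product Type",
       SUM(sales) AS "Sum of Sales"
FROM starbucks
WHERE state <> 'California'
GROUP BY market, product_type
'''


def make_starbucks(conn):
    """Create a small, made-up starbucks table."""
    conn.execute(
        "CREATE TABLE starbucks (market TEXT, product_type TEXT, state TEXT, sales REAL)"
    )
    conn.executemany(
        "INSERT INTO starbucks VALUES (?, ?, ?, ?)",
        [
            ("East", "Coffee", "New York", 10.0),
            ("East", "Coffee", "Ohio", 5.0),
            ("West", "Coffee", "California", 8.0),
            ("West", "Coffee", "Oregon", 4.0),
            ("West", "Tea", "Oregon", 2.0),
        ],
    )


def records_per_market(conn, custom_sql, product_type):
    """Wrap the Custom SQL as a subquery and layer a filter and aggregate on top."""
    inner = custom_sql.strip().rstrip(";")
    sql = f'''
        SELECT "TableauSQL"."Market", SUM(1)
        FROM ({inner}) "TableauSQL"
        WHERE "TableauSQL"."Product Type" = ?
        GROUP BY "TableauSQL"."Market"
        ORDER BY "TableauSQL"."Market"
    '''
    return conn.execute(sql, (product_type,)).fetchall()


conn = sqlite3.connect(":memory:")
make_starbucks(conn)
print(records_per_market(conn, CUSTOM_SQL, "Coffee"))
```

Because the outer filter is applied to the subquery's output, the database may have to compute the whole inner resultset first, depending on how well its optimizer pushes predicates down.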

Query Generation: Discovery Queries
- Connect-time probes
  - Validate joins / custom SQL: LIMIT 1
  - Metadata discovery: LIMIT 0, WHERE 1=0
- Side-band queries
  - Domain size checks
  - Contains NULL?
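
The probes listed above might look roughly like this when written by hand, with sqlite3 as a stand-in database. The helper names and the sales table are hypothetical, and the exact probe text Tableau sends depends on the connector; only the LIMIT 1 and WHERE 1=0 patterns come from the slide.

```python
import sqlite3


def make_sales(conn):
    """Create a tiny, made-up sales table with one NULL in it."""
    conn.execute("CREATE TABLE sales (market TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("East", 1.0), ("West", None), ("East", 2.0)],
    )


def is_valid(conn, relation_sql):
    """Validation probe: fetch at most one row to check the SQL runs at all."""
    try:
        conn.execute(f"SELECT * FROM ({relation_sql}) AS probe LIMIT 1").fetchall()
        return True
    except sqlite3.Error:
        return False


def discover_columns(conn, relation_sql):
    """Metadata probe: WHERE 1 = 0 returns no rows but still describes the columns."""
    cur = conn.execute(f"SELECT * FROM ({relation_sql}) AS probe WHERE 1 = 0")
    return [col[0] for col in cur.description]


def domain_size(conn, table, column):
    """Side-band query: how many distinct non-NULL values does the column have?"""
    return conn.execute(f"SELECT COUNT(DISTINCT {column}) FROM {table}").fetchone()[0]


def contains_null(conn, table, column):
    """Side-band query: does the column contain any NULL?"""
    row = conn.execute(
        f"SELECT 1 FROM {table} WHERE {column} IS NULL LIMIT 1"
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
make_sales(conn)
print(discover_columns(conn, "SELECT market, amount FROM sales"))
print(domain_size(conn, "sales", "market"), contains_null(conn, "sales", "amount"))
```

These probes are cheap on a plain table but can be expensive when the relation is a complex Custom SQL statement the database cannot short-circuit.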

Identify Database Tuning Opportunities
- Directly used by Tableau
  - PK / FK (join culling)
  - NOT NULL constraint (filtering)
  - Temp table permissions (filtering)
- Indirectly used by Tableau
  - Indexes
    - hash index for grouping
    - range index for dates
  - Partitioning
- Alternatives to Custom SQL
  - Database view (possibly materialized)
  - Initial SQL
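
As one illustration of the "database view" alternative, the sketch below defines a view once in the database and then queries it by name, instead of repeating the statement as Custom SQL in every query. It uses sqlite3 as a stand-in; the starbucks table, the non_california_sales view and the helper names are all assumptions for this example. SQLite has no materialized views, so that variant is not shown.

```python
import sqlite3


def make_starbucks(conn):
    """Create a small, made-up starbucks table."""
    conn.execute(
        "CREATE TABLE starbucks (market TEXT, product_type TEXT, state TEXT, sales REAL)"
    )
    conn.executemany(
        "INSERT INTO starbucks VALUES (?, ?, ?, ?)",
        [
            ("East", "Coffee", "New York", 10.0),
            ("East", "Coffee", "Ohio", 5.0),
            ("West", "Coffee", "California", 8.0),
            ("West", "Coffee", "Oregon", 4.0),
            ("West", "Tea", "Oregon", 2.0),
        ],
    )


def create_view(conn):
    """Define the filtering and pre-aggregation once, inside the database."""
    conn.execute(
        """
        CREATE VIEW non_california_sales AS
        SELECT market, product_type, SUM(sales) AS sum_of_sales
        FROM starbucks
        WHERE state <> 'California'
        GROUP BY market, product_type
        """
    )


def sales_by_market(conn):
    """Query the view like an ordinary table; no wrapped subquery text is needed."""
    return conn.execute(
        "SELECT market, SUM(sum_of_sales) FROM non_california_sales "
        "GROUP BY market ORDER BY market"
    ).fetchall()


conn = sqlite3.connect(":memory:")
make_starbucks(conn)
create_view(conn)
print(sales_by_market(conn))
```

A view also gives the DBA a single place to tune or materialize the logic without touching any workbook.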

Author Optimized Visualizations
- Visualizations are targeted to the human brain
  - Must be a digestible quantity of information
  - Challenge: condensing insight from large data sets
- Working with Big Data
  - Aggregation
  - Filtering
  - Extracts
  - Avoiding joins
  - Advanced techniques

Authoring: Aggregation
- Tableau aggregates by default
- Pre-computing aggregations can lead to huge performance gains
- Certain things not preserved (Count Distinct)
- More than one aggregated extract can be used to great advantage
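
Why Count Distinct is "not preserved" can be shown with a small sketch, using sqlite3 and an invented orders table (the table, the orders_agg summary and all values are assumptions for illustration). Sums can be re-aggregated from a pre-computed summary, but distinct counts stored per group cannot simply be added up, because the same customer may appear in several groups.

```python
import sqlite3


def make_orders(conn):
    """Create a made-up detail table: one row per order."""
    conn.execute(
        "CREATE TABLE orders (region TEXT, month TEXT, customer TEXT, sales REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            ("West", "Jan", "ann", 10.0),
            ("West", "Jan", "bob", 5.0),
            ("West", "Feb", "ann", 7.0),
            ("East", "Jan", "cy", 3.0),
        ],
    )


def build_aggregate(conn):
    """Pre-compute a summary at the (region, month) level."""
    conn.execute(
        "CREATE TABLE orders_agg AS "
        "SELECT region, month, SUM(sales) AS sales, "
        "COUNT(DISTINCT customer) AS customers "
        "FROM orders GROUP BY region, month"
    )


def west_from_detail(conn):
    """Sales and distinct customers for West, computed from the detail rows."""
    return conn.execute(
        "SELECT SUM(sales), COUNT(DISTINCT customer) FROM orders WHERE region = 'West'"
    ).fetchone()


def west_from_aggregate(conn):
    """The same measures re-aggregated from the summary table."""
    return conn.execute(
        "SELECT SUM(sales), SUM(customers) FROM orders_agg WHERE region = 'West'"
    ).fetchone()


conn = sqlite3.connect(":memory:")
make_orders(conn)
build_aggregate(conn)
print(west_from_detail(conn))
print(west_from_aggregate(conn))
```

The sales totals agree, but the customer count from the summary is too high because "ann" is counted once in each month.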

Authoring: Filtering
- Filter first, then viz
- Filtering High-Cardinality Fields (Skip)
  - Wildcard match
  - Type-in List
- Dashboard actions
  - Can replace Quick Filters
  - Edit your Action to exclude on deselect
  - Create None filter first

Authoring: Avoiding Joins
- Aliases
  - Rather than look up table
  - Potential drawbacks:
    - Blending
    - Filters are done on original field
  - Good for values that rarely change
- Create Primary Group
- Edit Primary Aliases
- Connect to same DB, separate connection for each table
- Ad-hoc groups
  - On the fly, visually

Authoring: Extracts
- Extract creation (Demo)
  - Filter
  - Aggregate (summary view)
  - Incremental
  - Sampled
  - Hide unused fields
- Data types/characteristics matter
  - Integer better than String
  - Date better than Timestamp (if detail not needed)
- Optimize extracts

Authoring: Advanced Techniques
- Initial SQL
  - Can be used to prep data for faster querying
- Query Banding
  - Teradata only
  - Can set query priority

Please evaluate this session (TCC11 202): Tuning Tableau and Your Database
Text to 32075. In the body of the message, type TCC11<space>202, then letters from the table below to indicate each response. Provide additional comments after an asterisk (*).
Sample text: TCC11 202aho*That was great!
Please give your response to the following:
Question | Excellent | Great | Good | Average | Poor | Bad | Very Bad
What was the value of this session to you? | a | b | c | d | e | f | g
What are the chances you will apply what you learned in this session in your work? | h | i | j | k | l | m | n
What are the chances you would recommend this session to a colleague? | o | p | q | r | s | t | u
Each text evaluation you send enters you into a drawing for an iPad!