Top Ten Qlik Performance Tips




Rob Wunderlich
Panalytics, Inc

About Me
- Rob Wunderlich
- QlikView Consultant and Trainer
- Using QlikView since 2006
- Author of Document Analyzer and other tools
- Founder of QlikView Components script library
- Qlik Luminary and MVP in QlikCommunity
- Blogger at QlikviewCookbook.com
- Presenter at Masters Summit for Qlik
- Instructor at q-on.bi
- Tweets as @QVCookbook

Copyright 2016 Rob Wunderlich

Please ask questions. Don't assume you are the only one wondering.

Define Performance
- Response time after a click
  - What is fast and what is slow? Depends who you ask.
- Reload time
- Utilization of hardware
  - Cost of purchase, upgrade and management
- Development effort

When (Not) to Performance Tune
- "premature optimization is the root of all evil" (Donald Knuth)
- Performance tuning takes time, and time is usually money.
- Best practices are frequently free.
- Have a problem to solve.

The Tuning Volume Curve

Rows                                            Data Model, Expressions   Hardware          Required Knowledge
< Few million                                   Doesn't matter            Unimportant       Get the numbers right!
Few million to tens of millions                 Best practices            Has an impact     Best practices
Many tens of millions to low hundred millions   Intentional               Very important    Senior consultant
Many hundreds of millions                       Critical                  Critical          Expert
Billions                                        Specialized techniques    Custom planning   QV internals, custom tooling

Measuring Performance
- Script: Document Log + ScriptLogAnalyzer from QvCookbook
- Charts: Sheet, Object Properties, Calc Time
  - Understand the impacts of cache and multi-processing on these numbers
- Charts: Document Analyzer from QvCookbook

Remove Unused Fields
- Remove fields that are not being referenced in the front end.
- Use Document Analyzer to identify unused fields.
- Don't get obsessive about this. Focus on the fields with high cardinality.

DROP FIELDS
Question: From a performance perspective, is there a difference between:

1.  DROP FIELD [AccountNumber];
    DROP FIELD [BillToAddressID];
    DROP FIELD [City];

2.  DROP FIELD [AccountNumber], [BillToAddressID], [City];

Remove Unneeded Fact Rows
- Don't load rows that are not required for analysis.
- Limit in SQL where possible:

    SQL SELECT * FROM Orders
    WHERE OrderDate >= '2012-01-01';

Limiting QVD Fact Rows
This will be slow:

    LOAD * FROM data.qvd (qvd)
    WHERE Date > MakeDate(2012);

When limiting rows from a QVD, WHERE Exists() is usually the fastest choice.

    TempDates:
    LOAD MakeDate(2012) + RecNo()-1 as Date
    AutoGenerate 10000;

    data:
    LOAD * FROM data.qvd (qvd)
    WHERE Exists(Date);

    DROP TABLE TempDates;

QVD Subset Performance
[Chart: QV11 vs. QV12]

Segment QVDs
- Four years of facts in a single QVD: ~815M rows, ~80 fields, 17M rows per month.
- Loading one year of data took 40 minutes:

    LOAD * FROM OneBig.qvd (qvd)
    WHERE Date >= '2015-01-01';

- Modified the extract to create one QVD per month, named Facts_YYYYMM. Loading the same year took 6 minutes:

    LOAD * FROM Facts_2015*.qvd (qvd);

Remove Unneeded Dimension Rows
- A subset ratio of less than 100% indicates an opportunity to eliminate some Dimension rows. Use the Table Viewer to identify.
- Limit Dimensions using WHERE Exists() or KEEP.

    Product:
    LOAD * FROM Product.qvd (qvd)
    WHERE Exists(ProductID);
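As a rough illustration of the subset ratio idea and of what WHERE Exists() achieves, here is a small Python sketch (not Qlik script). The sample rows, the `subset_ratio` helper and the variable names are assumptions made up for the example; the Table Viewer reports the real figure for you.

```python
# Hypothetical sample data: the fact rows reference only some products.
fact_product_ids = [1, 2, 2, 5, 1]
products = [
    {"ProductID": 1, "Name": "Bike"},
    {"ProductID": 2, "Name": "Helmet"},
    {"ProductID": 3, "Name": "Lock"},
    {"ProductID": 4, "Name": "Pump"},
    {"ProductID": 5, "Name": "Bell"},
]

def subset_ratio(table_values, all_values):
    """Share of all distinct key values that occur in one table."""
    return len(set(table_values)) / len(set(all_values))

# All distinct ProductID values across both tables.
all_ids = fact_product_ids + [p["ProductID"] for p in products]
fact_ratio = subset_ratio(fact_product_ids, all_ids)

# Rough equivalent of WHERE Exists(ProductID): keep only the dimension
# rows whose key was already loaded by the fact table.
loaded_keys = set(fact_product_ids)
kept_products = [p for p in products if p["ProductID"] in loaded_keys]

print(f"subset ratio in facts: {fact_ratio:.0%}, "
      f"dimension rows kept: {len(kept_products)} of {len(products)}")
```

A ratio below 100% on the fact side of the key means the dimension table carries rows that no fact row uses, which is exactly what the filter removes.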

Reduce Cardinality
- Fields with a high number of values impact RAM footprint and document save/open time.
- Split timestamps into two fields, Date and Time.
- AutoNumber keys and non-display Id fields.
- Always use the second AutoNumber parameter to ensure sequential integers for optimum efficiency.

    AutoNumber(%KeyField, '%KeyField') as %KeyField

Cardinality Challenge: Excessive Detail
Complex fields such as browser UserAgent can have a lot of unique values:

    Mozilla/5.0 (Linux; Android 4.4.2; Lenovo X2-EU Build/KOT49H) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36

May be used for filtering like:

    =Sum({<UserAgent-={'*Bot*','*bot*','*Spider*'}>}Clicks)
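To see why the timestamp split from the Reduce Cardinality slide helps, the following Python sketch (not Qlik script) counts distinct values before and after the split. The data is synthetic: one event every 97 seconds for a year is an assumption chosen only to make the effect visible.

```python
from datetime import datetime, timedelta

# Synthetic data (assumed workload): one event every 97 seconds during 2015.
start = datetime(2015, 1, 1)
n = 365 * 24 * 3600 // 97
timestamps = [start + timedelta(seconds=97 * i) for i in range(n)]

# Cardinality of the single combined timestamp field.
distinct_timestamps = len(set(timestamps))

# Cardinality after splitting into a Date field and a Time field.
distinct_dates = len({ts.date() for ts in timestamps})
distinct_times = len({ts.time() for ts in timestamps})

print("timestamp values:", distinct_timestamps)
print("date values + time values:", distinct_dates + distinct_times)
```

Every timestamp is unique, but the Date field can hold at most one value per day and the Time field at most 86,400 values at one-second resolution, so the two split fields together store far fewer distinct values.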

Cardinality Challenge: Excessive Detail
- Lots of unique values, but not many unique components.
- Pre-parse in the script to extract components such as OS, Browser, Device.

FieldName          #Values   Avg Width
UserAgent          234,173   156
UserAgentDevice          6    16
UserAgentOS             99    19
UserAgentType           11    16
UserAgentVersion     2,180    16

* 36M rows, 4 months

Beware the Cost of Preceding Load

Direct load:

    LOAD *, A&B as B2
    FROM data.qvd (qvd);

Preceding load:

    LOAD *, A&B as B2;
    LOAD * FROM data.qvd (qvd);

Understand Caching
- The results of a chart calculation are stored and remembered in the cache.
- If the "same calculation" is called for again in that chart or another chart, the results are retrieved from the cache and the computation is skipped, thus saving time.
- Cache is global on the server.
- A calculation is "the same" if:
  - The same selections are in force
  - The chart has the same Dimensions
  - The expression is identical

Weird Cache Facts
- Expression text must be exactly the same to be considered equivalent. Each of these is a different expression to the cache:

    sum(LineTotal)
    SUM(LineTotal)
    sum( LineTotal )
    sum(LineTotal) // a comment

- Document Analyzer can help identify logically equivalent expressions.
- Avoid expression variations by utilizing:
  - Variables or Master Measures
  - Linked Objects

Weird Cache Facts 2
- A calculation is "the same" if the same selections are in force.
  - Any change in selections, even in data not connected to the chart, will cause recalculation.
- You must close QlikView, not just the document, to reset the cache.
- Caching can be turned off via an easter egg, but your computer will turn to dust.

Data Island Objects
- Data Islands are tables that have no linking field to the main model.
- Commonly used for UI switches like currency selection or "select your metrics".

Impact of Data Island Objects
- Selections made in the Pick Dimensions listbox update the Reports chart.
- This is a data change that causes every object on every sheet to be recalculated.
- Every object in the same state, that is.

Put Data Island Objects in Alt State
- Put the listbox in an Alternate State and reference that State in objects or expressions as required.

Measures Should Be In Same Table
- An expression that requires values from two tables will generally run slower than if all values are available in the same table.
- Expressions that utilize one table are called "single row operations". The advantage is that QV can process the existing data row by row, and avoid the phase 2 step of assembling intermediate composite tables.
- ListPrice is in the Product table. Can we get all fields in a single table?

    Sum(OrderQty * ListPrice) - Sum(LineTotal)

If Necessary, Make a Copy of the Field
- ListPrice cannot be JOINed to SalesOrderHeader without losing some Product rows. What to do?
- We can create an additional copy of the ListPrice field for use in this chart:

    Sum(OrderQty * SalesListPrice) - Sum(LineTotal)

* Note: Pre-calculating the expression in the script is an alternative (and preferred) solution.

Control Detail Table Objects
- Straight and Pivot tables with millions of output rows will take a long time to calculate, and impact the entire sheet.
- You generally can't make them calculate faster, but you can make design choices about when you will calculate them.

Control Detail Table Objects
- Hidden objects don't get calculated.
- Minimized objects don't get calculated.
- All Container objects get calculated.
- Export Objects can stay minimized.
- Use a Calculation Condition to limit to a reasonable number of rows.
- Calculation Condition is an expression, so you can always override it with a button and variable.
- Prefer Straight table to Pivot table when possible.

No Short Circuit
Expression evaluation does not skip the false branch:

    if(1=1
      ,sum(x)  // Always evaluated
      ,sum(y)  // Always evaluated
    )

- Both expressions are always evaluated!
- Usually not a big deal.
- Except when there are many expensive choices of which only one is true for the chart, such as choosing an expression to match UI switches.

Short Circuit Optimization
- Use the Expression Conditional property.
- Or move the if() into a variable calculation:

    =if(ui_currency='usd','sum(sales_usd)','sum(sales_eur)' )

Further Reading
http://qlikviewcookbook.com/2014/12/how-not-to-choose-an-expression/
http://qlikviewcookbook.com/2014/12/how-to-choose-an-expression/
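The difference between the two patterns can be mimicked in Python, where function arguments are likewise computed before the function body runs. This is an analogy only; `eager_if` and the two stand-in "aggregation" functions are hypothetical names invented for the sketch.

```python
calls = []

def sum_sales_usd():
    calls.append("usd")  # record that this aggregation was computed
    return 100.0

def sum_sales_eur():
    calls.append("eur")
    return 90.0

def eager_if(condition, then_value, else_value):
    """Both arguments are already computed by the time this body runs."""
    return then_value if condition else else_value

ui_currency = "usd"

# Pattern 1: if() wrapped around the aggregations, so both are calculated.
result_eager = eager_if(ui_currency == "usd", sum_sales_usd(), sum_sales_eur())
eager_calls = list(calls)

# Pattern 2: choose the expression text first, then calculate only that one.
calls.clear()
expressions = {"sum(sales_usd)": sum_sales_usd, "sum(sales_eur)": sum_sales_eur}
chosen_text = "sum(sales_usd)" if ui_currency == "usd" else "sum(sales_eur)"
result_chosen = expressions[chosen_text]()
chosen_calls = list(calls)

print("pattern 1 computed:", eager_calls)
print("pattern 2 computed:", chosen_calls)
```

Both patterns return the same number; the second simply never touches the aggregation that was not selected, which is the effect the variable-based if() is after.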

Sum(If()), the Performance Killer
- Extremely slow. The IF condition is evaluated for every row in the dataset.
- Better alternatives:
  - Move the IF condition to the load script and generate a flag (0/1).
  - In 99% of cases, Set Analysis can be used.
  - In rare cases when Set Analysis is not possible, multiply by a flag.
  - As the last resort, use a numeric condition.

Prefer Numeric to String Comparison
Numeric comparisons are faster (double or better) than string comparisons:

    sum(if(ExpressShip=1, LineTotal))
    sum(if(ExpressShip='yes', LineTotal))

(Even better, use Set Analysis when possible!)

Set Analysis Modifier
In a Set Analysis modifier, there is no performance difference between string and numeric values:

    sum({<ExpressShipNum={1}>} LineTotal)
    sum({<ExpressShipText={'yes'}>} LineTotal)

Set Analysis uses search logic, which is always string based.

Using Flags
Flag fields with a value of "0" or "1" can improve the performance of expressions.

    sum(if(ExpressShipNum=1, LineTotal))   // Wrong!
    sum(LineTotal * ExpressShipNum)        // Right!
    sum({<ExpressShipNum={1}>} LineTotal)  // Right also!

Which is faster, multiplication or Set Analysis? It depends. Is the time required to create the set offset by the savings of a faster calculation?
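All three forms above return the same total; only the amount of per-row work differs. A small Python sketch with made-up rows shows the equivalence. It says nothing about relative speed inside Qlik, which has to be measured in the actual document.

```python
# Hypothetical order lines with a 0/1 flag generated in the load script.
rows = [
    {"LineTotal": 120.0, "ExpressShipNum": 1},
    {"LineTotal": 80.0, "ExpressShipNum": 0},
    {"LineTotal": 50.0, "ExpressShipNum": 1},
]

# sum(if(ExpressShipNum=1, LineTotal)): a condition is tested on every row.
total_if = sum(r["LineTotal"] for r in rows if r["ExpressShipNum"] == 1)

# sum(LineTotal * ExpressShipNum): plain arithmetic, no branching.
total_flag = sum(r["LineTotal"] * r["ExpressShipNum"] for r in rows)

# sum({<ExpressShipNum={1}>} LineTotal): build the set once, then sum it.
express_rows = [r for r in rows if r["ExpressShipNum"] == 1]
total_set = sum(r["LineTotal"] for r in express_rows)

print(total_if, total_flag, total_set)
```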

Pre-calculate in Script When Possible
- Replace calculated dimensions.
  - =Date(Floor(ShipTime)) should be done in the script
  - FirstName & LastName as Name
  - Calculated dimensions don't cache well!
- Do business logic in script. Instead of:

    Expr:   Sum({<OrderID={"=ShipDate=OrderDate"}>} Sales)

  use:

    Script: If(ShipDate=OrderDate, 1, 0) as Flag_SameDayShip
    Expr:   Sum(Sales * Flag_SameDayShip)

Challenge: "I can't pre-aggregate."
- Evaluate all use cases, not just the lowest granularity.
- Example: Web Advertising, 100M facts per month
  - Measures: #Impressions (Views), #Clicks
  - Dimensions: AdId, Date, Category, Publisher

Pre-Aggregation Example
Requirements:
- I need to see overall Click% (#Clicks / #Impressions) for my selections.
- I need to trend measures over Months and Category.
- I may want to filter to a single AdId, Publisher or Category.
- I need to show the day by day activity for a single AdId.

There is only one requirement for Day level data!

We can pre-aggregate additional fields, sum(#Impressions) as Monthly#Impressions and sum(#Clicks) as Monthly#Clicks, at the AdId, Month level.

Assuming an average lifetime of 100 days per Ad, aggregation in the front end will use fewer rows: roughly 4 monthly rows per Ad instead of 100 daily rows (4/100).

Imagine the pre-aggregation opportunity if the monthly data did not need to filter by AdId!

Q & A
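The 4/100 figure can be checked with a small Python sketch of the rollup. The single ad, its start date and the daily numbers are synthetic assumptions; the point is only that the monthly table answers the Click% question with far fewer rows.

```python
from collections import defaultdict
from datetime import date, timedelta

# Synthetic daily facts for one hypothetical ad that runs for 100 days.
start = date(2016, 1, 15)
daily = [
    {"AdId": "A1", "Date": start + timedelta(days=i),
     "Impressions": 1000 + i, "Clicks": 10 + (i % 5)}
    for i in range(100)
]

# Pre-aggregate to one row per (AdId, Month).
monthly = defaultdict(lambda: {"Impressions": 0, "Clicks": 0})
for row in daily:
    key = (row["AdId"], row["Date"].strftime("%Y-%m"))
    monthly[key]["Impressions"] += row["Impressions"]
    monthly[key]["Clicks"] += row["Clicks"]

def click_pct(rows):
    """Overall Click% = total clicks / total impressions."""
    rows = list(rows)
    return sum(r["Clicks"] for r in rows) / sum(r["Impressions"] for r in rows)

print("daily rows:", len(daily), "monthly rows:", len(monthly))
print("same Click%:", click_pct(daily) == click_pct(monthly.values()))
```

The day level table is still needed for the one day-by-day requirement, so this is an additional table next to the detail rather than a replacement for it.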