PostgreSQL Business Intelligence & Performance
Simon Riggs, CTO, 2ndQuadrant
PostgreSQL Major Contributor




The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 318633

AXLE Project
Analytics on Xtremely Large European data
- Secure
- Big
- Fast
- Hardware optimised
- Visual Analytics
axleproject.eu

Topics
- Business Intelligence & Architecture
- BI Performance Feature Effectiveness
- Benchmark Analysis & Opportunities
- New Features in Progress

Business Intelligence
- ETL
- Reporting
- Ad-hoc queries
- Data Mining
Many query types
- Counting
- Summarisation
- Strategic Analysis
- Analytics

BI Architecture
- SQL was invented for Business Intelligence
- Classic DW: DB2 v Teradata
- Specialist Databases: OLTP, DW

Specialist OLTP Problems
- MongoDB
  - Joins don't scale!
- VoltDB
  - No concurrency
  - All SQL must run in same duration
  - Partial SQL implementation

Specialist DW Problems
- Second specialist system required
- ETL middleware also often required for loading
- Data delayed en route to second system
- Frequently highly compressed, so read only or difficult to maintain

Get Real
- Big Data
  - 99% of databases are <100GB
- Real Time results
  - Business Intelligence required 24x7
  - Closed loop processing requires fast response
- SQL is much easier to use than alternatives
  - More expressive and easier to use
  - Already the de facto standard for BI

Minimal Approach
- Emphasise that additional BI technology will not reduce costs and may not offer solutions
- Keep Business Intelligence on PostgreSQL
- Use Hot Standby to expand capacity and isolate Business Intelligence workloads
- Minimise ETL whenever possible
- Gain benefits of SQL and concurrency
- Immediate access to data

Things To Learn
- Query performance is important
- Custom/special data structures are important in increasing performance
- Stale answers are acceptable for many situations

BI Feature Effectiveness
- Problem 1: Get the work into the database
- Problem 2: Speed up the work in the database
    +++++ Work Avoidance
    ++    Algorithmic Improvement
    +     Brute Force

Orange Data Mining
- Orange 3.0 generates SQL for all data flows
- Directly utilises the power of databases
- Integrates with PostgreSQL

BI Tuning Opportunities
- COPY
  - Batch optimisations defeated
  - Btree insert bottlenecks on large data loads
- Aggregate Optimisation
  - Use sum()/count() not avg()
- Join Estimate/Actual Mismatch
  - Use enable_nestloop = off
- Plan Pushdown
  - Manual SQL rewrite
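The aggregate rewrite suggested above can be sketched as follows. This is only an illustration of the equivalence, not a benchmark: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the `sales` table and `amount` column are hypothetical names. The multiplication by 1.0 is there because sum() and count() on an integer column both return integers, so a plain division would truncate.

```python
import sqlite3


def average_two_ways(values):
    """Return (avg(amount), sum(amount)/count(amount)) for the given values.

    `values` may contain None, which is stored as SQL NULL.
    The table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?)", [(v,) for v in values])

    # Original form: the built-in average aggregate.
    avg_result = conn.execute("SELECT avg(amount) FROM sales").fetchone()[0]

    # Rewritten form: count(amount) skips NULLs just as avg() does,
    # and * 1.0 avoids integer division.
    rewritten = conn.execute(
        "SELECT sum(amount) * 1.0 / count(amount) FROM sales"
    ).fetchone()[0]

    conn.close()
    return avg_result, rewritten
```

Note that count(amount), not count(*), is needed to match avg() when the column contains NULLs.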

Speed Up: Work Avoidance
- Caching
- Result Cache
- Materialized Views
- Approximation
- Partition Elimination
- Improved Optimisation
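As a rough illustration of work avoidance through a result cache, the sketch below keeps query results for a limited time and re-runs a query only when its cached answer is older than an allowed staleness. It relies on the earlier point that stale answers are acceptable in many situations. The class name, the `run_query` callback and the `max_age_seconds` parameter are all assumptions made for this example; this is not how PostgreSQL implements any caching feature.

```python
import time


class ResultCache:
    """Cache query results by query text, accepting answers up to max_age_seconds old."""

    def __init__(self, run_query, max_age_seconds, clock=time.monotonic):
        self._run_query = run_query      # hypothetical callback that executes the query
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so staleness can be tested
        self._entries = {}               # query text -> (time stored, result)
        self.executions = 0              # how often the query really ran

    def get(self, query):
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]              # work avoided: reuse the stored answer
        result = self._run_query(query)
        self.executions += 1
        self._entries[query] = (now, result)
        return result
```

A materialized view follows the same idea at the database level: the stored answer is reused until someone decides it is too stale and refreshes it.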

Speed Up: Algorithms
- Compression
- Column Orientation
- Vectors
- Hardware approaches

Speed Ups: Brute Force
- Parallel Query is a brute force approach
- Gains in performance come from additional utilisation of resources, not from being smarter
- Reduces overall concurrency
- Still requires extensive optimizer changes
- The industry thinks we need it
- Some queries do require it
- PostgreSQL should do this, 2ndQuadrant can, will and has already helped

9.4 BI Features In-Progress
- Min Max Indexes
- Parallel Sort & Parallel Query infrastructure
- Materialized Views++
- Multi-core scalability gains (lwlocks)
- (DDL Locking impact reductions)
- (Row Level Security)

Min Max Indexes (9.4)
- Automatic Partitioning
- Store min and max tuples for each page range
- Use theorem proving to avoid sections of scan
- Covers all columns, not just defined partition key
- Can be added easily to existing applications
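The idea of storing a min and max per page range and using them to skip parts of a scan can be sketched in a few lines. This is a simplified model under stated assumptions, not the PostgreSQL implementation: a "page range" is modelled as a fixed-size slice of a Python list holding a single column, and the function names are invented for the example.

```python
def build_minmax_summary(values, range_size):
    """Summarise consecutive blocks of `range_size` values as (start, end, min, max)."""
    summary = []
    for start in range(0, len(values), range_size):
        block = values[start:start + range_size]
        summary.append((start, start + len(block), min(block), max(block)))
    return summary


def scan_between(values, summary, low, high):
    """Return (matches, ranges_scanned) for the predicate low <= v <= high."""
    matches = []
    ranges_scanned = 0
    for start, end, block_min, block_max in summary:
        # If the range's min/max cannot overlap [low, high], skip it unread.
        if block_max < low or block_min > high:
            continue
        ranges_scanned += 1
        matches.extend(v for v in values[start:end] if low <= v <= high)
    return matches, ranges_scanned
```

With values stored in roughly increasing order (for example, load timestamps), most ranges are skipped. If every range spans the whole value domain, nothing can be skipped and the whole table is still scanned.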

Min Max Index results (2 GB table)

                      MinMax    B-Tree
Index build time      11s       96s
Index size            24kB      1.1GB
Load time w index     1         x2-3
Index SEL (1 row)     x2-3      1
Index SEL (many)      same      same

MinMax Indexes
- Does not require complex DDL
- Generate almost no index inserts
- Fits in RAM even for Petabytes of data
- Generate almost no additional WAL
- Works well with Hot Standby data warehousing
- Only works with some data distributions
- Additional indexing may be needed
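The "fits in RAM even for Petabytes" claim can be sanity-checked against the 24kB index reported for the 2 GB table. The sketch below assumes, purely for illustration, that index size grows linearly with table size and that the units are binary (1 kB = 1024 bytes). Neither assumption is stated in the slides, and the function name is invented here.

```python
KIB = 1024
GIB = 1024 ** 3
PIB = 1024 ** 5


def extrapolate_index_size(table_bytes, measured_table_bytes, measured_index_bytes):
    """Scale a measured index size linearly to another table size (an assumption)."""
    return table_bytes * measured_index_bytes / measured_table_bytes


# Measured point from the results: 24kB of index for a 2 GB table.
one_petabyte_index = extrapolate_index_size(PIB, 2 * GIB, 24 * KIB)
print(one_petabyte_index / GIB, "GiB of MinMax index for 1 PiB of table")
```

Under those assumptions a one-petabyte table would need an index of about 12 GB, which is consistent with the slide's claim.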

PostgreSQL BI Roadmap
9.4, 10.0, 10.1
- Advanced Business Intelligence
- High Security
- Online Change
- Very Large Database

2ndQuadrant
- Consulting, Migration
- Open Source Development
- Training
- Support, RemoteDBA
