Look Out The Window Functions




and free your SQL
2ndQuadrant Italia
PostgreSQL Conference Europe 2011, October 18-21, Amsterdam

Outline
1. Concepts: Aggregates; Different aggregations; Partitions
2. Syntax: Frames from 9.0; Frames in 8.4
3. Other: A larger example; Question time

Aggregates 1: Example of an aggregate
Problem 1: How many rows are there in table a?
Solution:
    SELECT count(*) FROM a;
Here count is an aggregate function (SQL keyword AGGREGATE).

Aggregates 2: Functions and aggregates
FUNCTIONs:
  - input: one row
  - output: either one row or a set of rows
AGGREGATEs:
  - input: a set of rows
  - output: one row

Different aggregations 1: Without window functions, and with them
GROUP BY col_1, ..., col_n
  - any supported PostgreSQL version
  - compute aggregates by creating groups
  - output is one row for each group
Window functions
  - only PostgreSQL version 8.4+
  - compute aggregates via partitions and window frames
  - output is one row for each input row

Different aggregations 2: Without window functions, and with them
GROUP BY col_1, ..., col_n
  - only one way of aggregating for each group
  - only one way of grouping for each SELECT
Window functions
  - different rows in the same partition can have different window frames
  - aggregates in the same SELECT can use different partitions

Dataset #1 for the next examples
    SELECT * FROM a ORDER BY x;

     x  | y
    ----+---
      1 | 1
      2 | 2
      3 | 3
      4 | 1
      5 | 2
      6 | 3
      7 | 1
      8 | 2
      9 | 3
     10 | 1

Example 1, with GROUP BY
    SELECT y, sum(x) FROM a GROUP BY y ORDER BY y;

     y | sum
    ---+-----
     1 |  22
     2 |  15
     3 |  18

Example 2, with window functions
    SELECT x, y, sum(x) OVER (PARTITION BY y) FROM a ORDER BY y, x;

     x  | y | sum
    ----+---+-----
      1 | 1 |  22
      4 | 1 |  22
      7 | 1 |  22
     10 | 1 |  22
      2 | 2 |  15
      5 | 2 |  15
      8 | 2 |  15
      3 | 3 |  18
      6 | 3 |  18
      9 | 3 |  18

Example 3, with window functions (reordered)
    SELECT x, y, sum(x) OVER (PARTITION BY y) FROM a ORDER BY x;

     x  | y | sum
    ----+---+-----
      1 | 1 |  22
      2 | 2 |  15
      3 | 3 |  18
      4 | 1 |  22
      5 | 2 |  15
      6 | 3 |  18
      7 | 1 |  22
      8 | 2 |  15
      9 | 3 |  18
     10 | 1 |  22
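The contrast between Examples 1 and 3 can be checked with a short script. This is a minimal sketch, not part of the original slides: it assumes SQLite (3.25 or later, through Python's sqlite3 module) as a stand-in for PostgreSQL, and it rebuilds Dataset #1 with a hypothetical helper named make_dataset.

```python
import sqlite3

def make_dataset():
    """Rebuild Dataset #1: x = 1..10, y cycling through 1, 2, 3."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (x INTEGER, y INTEGER)")
    conn.executemany(
        "INSERT INTO a VALUES (?, ?)",
        [(x, (x - 1) % 3 + 1) for x in range(1, 11)],
    )
    return conn

conn = make_dataset()

# GROUP BY: one output row for each group.
grouped = conn.execute(
    "SELECT y, sum(x) FROM a GROUP BY y ORDER BY y"
).fetchall()

# Window function: one output row for each input row.
windowed = conn.execute(
    "SELECT x, y, sum(x) OVER (PARTITION BY y) FROM a ORDER BY x"
).fetchall()

print(len(grouped), "rows from GROUP BY:", grouped)
print(len(windowed), "rows from the window function, first three:", windowed[:3])
```

The GROUP BY query collapses the table to three rows, while the window query keeps all ten input rows and attaches the partition sum to each of them.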

Partitions: Basics
PARTITION BY E_1, E_2, ..., E_n
  - each row belongs to one partition
  - similar to GROUP BY
  - with GROUP BY you can partition in one way only for each SELECT
  - with window functions, you can partition in several ways at once for each SELECT (up to a different partitioning for each aggregate function)
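Partitioning in several ways at once can be sketched as follows. As an illustration only, this assumes SQLite (3.25 or later, via Python's sqlite3) in place of PostgreSQL; the second partitioning expression, x % 2, and the column aliases are made up for the example and do not come from the slides.

```python
import sqlite3

# Rebuild Dataset #1: x = 1..10, y cycling through 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(x, (x - 1) % 3 + 1) for x in range(1, 11)],
)

# Two aggregates in the same SELECT, each with its own partitioning:
# one by y, the other by the parity of x (hypothetical second partitioning).
rows = conn.execute("""
    SELECT x,
           sum(x) OVER (PARTITION BY y)     AS sum_by_y,
           sum(x) OVER (PARTITION BY x % 2) AS sum_by_parity
    FROM a
    ORDER BY x
""").fetchall()

for row in rows:
    print(row)
```

A single GROUP BY could not produce both columns, because it would have to group the same SELECT in two different ways.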

Window frames: Basics
  - Each row has its window frame
  - The window frame is included in that row's partition
  - Default (when no ORDER BY is given): the window frame is equal to that row's partition
  - The result of the aggregate for that row is computed using all the rows in its window frame
  - The frame can move as the row moves

Partitions and frames: in a picture
[Figure: red lines = partitions, blue lines = frames]

Frame movement and ORDER BY
  - Notions like "the current row plus the previous three rows" depend on a notion of ordering
  - You can specify an ordering with ORDER BY
  - If you don't, all the rows will be peers

RANGE or ROWS
Frames are of two kinds:
  - RANGE frames, according to the ordering
  - ROWS frames, according to the actual row number
Difference:
  - in ROWS frames, two rows will never be peers
  - in RANGE frames, they might be
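The difference between the two kinds of frame shows up as soon as the ordering has ties. The following is a hedged sketch, assuming SQLite (3.25 or later, via Python's sqlite3) behaves like PostgreSQL for these standard frames; it orders Dataset #1 by y, so rows with equal y are peers, and counts the rows in each frame.

```python
import sqlite3

# Rebuild Dataset #1: x = 1..10, y cycling through 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(x, (x - 1) % 3 + 1) for x in range(1, 11)],
)

# Same frame bounds, once as ROWS and once as RANGE, ordered by y only.
rows = conn.execute("""
    SELECT x, y,
           count(*) OVER (ORDER BY y
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS rows_count,
           count(*) OVER (ORDER BY y
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS range_count
    FROM a
    ORDER BY x
""").fetchall()

for row in rows:
    print(row)
```

With ROWS, every row gets a distinct count from 1 to 10 (which peer comes first is not specified). With RANGE, the frame extends to the last peer of the current row, so all rows with the same y share one count: 4, 7, or 10.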

Examples of possible frames
Some interesting frames:
  - the current row and all its peers
  - from the start of the partition to the current row
  - the previous row only (the first row has an empty frame)
  - the next row only (the last row has an empty frame)
  - from two rows before to the next row
  - ...
The frame is trimmed to fit into the partition.
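The "previous row only" and "next row only" frames, including their empty cases, can be sketched like this. It is an illustration under assumptions: SQLite (3.25 or later, via Python's sqlite3) stands in for PostgreSQL, and the aliases prev_only and next_only are invented for the example. An aggregate such as sum over an empty frame yields NULL, which Python shows as None.

```python
import sqlite3

# Rebuild Dataset #1: x = 1..10, y cycling through 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(x, (x - 1) % 3 + 1) for x in range(1, 11)],
)

rows = conn.execute("""
    SELECT x, y,
           -- frame = the previous row only
           sum(x) OVER (PARTITION BY y ORDER BY x
                        ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_only,
           -- frame = the next row only
           sum(x) OVER (PARTITION BY y ORDER BY x
                        ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS next_only
    FROM a
    ORDER BY x
""").fetchall()

for row in rows:
    print(row)
```

The first row of each partition has no previous row and the last row has no next row, so those frames are trimmed to nothing and the sum comes back as NULL.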

Example 4: trivial partition and trivial window frame
    SELECT sum(x) OVER () FROM a;
  - The whole table is one partition
  - The window frame is the whole partition
  - Very similar to:
        SELECT sum(x) FROM a;

Example 5: frame is current row plus previous row
    SELECT x, y, sum(x) OVER (PARTITION BY y ORDER BY x ROWS 1 PRECEDING)
    FROM a ORDER BY x;

     x  | y | sum
    ----+---+-----
      1 | 1 |   1
      2 | 2 |   2
      3 | 3 |   3
      4 | 1 |   5
      5 | 2 |   7
      6 | 3 |   9
      7 | 1 |  11
      8 | 2 |  13
      9 | 3 |  15
     10 | 1 |  17

Example 6: frame is current row plus previous row and next row
    SELECT x, y, sum(x) OVER (PARTITION BY y ORDER BY x
                              ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
    FROM a ORDER BY x;

     x  | y | sum
    ----+---+-----
      1 | 1 |   5
      2 | 2 |   7
      3 | 3 |   9
      4 | 1 |  12
      5 | 2 |  15
      6 | 3 |  18
      7 | 1 |  21
      8 | 2 |  13
      9 | 3 |  15
     10 | 1 |  17
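Examples 5 and 6 can be replayed in a few lines. This is a sketch assuming SQLite (3.25 or later, via Python's sqlite3) rather than the PostgreSQL used in the talk; the helper name window_sums is hypothetical.

```python
import sqlite3

# Rebuild Dataset #1: x = 1..10, y cycling through 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(x, (x - 1) % 3 + 1) for x in range(1, 11)],
)

def window_sums(frame):
    """Return sum(x) per row, ordered by x, for the given frame clause."""
    sql = (
        "SELECT sum(x) OVER (PARTITION BY y ORDER BY x " + frame + ") "
        "FROM a ORDER BY x"
    )
    return [total for (total,) in conn.execute(sql)]

# Example 5: current row plus previous row.
example5 = window_sums("ROWS 1 PRECEDING")
# Example 6: current row plus previous row and next row.
example6 = window_sums("ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING")

print(example5)
print(example6)
```

Within the partition y = 1 (x = 1, 4, 7, 10), for instance, the row x = 4 gets 1 + 4 = 5 in Example 5 and 1 + 4 + 7 = 12 in Example 6, while the last row x = 10 gets 7 + 10 = 17 in both because its frame is trimmed at the end of the partition.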

Syntax for Window Frames from 9.0
    [ RANGE | ROWS ] BETWEEN FrameStart AND FrameEnd
FrameStart and FrameEnd can be chosen among:
    UNBOUNDED PRECEDING
    n PRECEDING (only ROWS for now)
    CURRENT ROW
    n FOLLOWING (only ROWS for now)
    UNBOUNDED FOLLOWING
Abbreviation:
    [ RANGE | ROWS ] FrameStart
meaning
    [ RANGE | ROWS ] BETWEEN FrameStart AND CURRENT ROW
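The abbreviated form can be compared with its expansion directly. The sketch below assumes SQLite (3.25 or later, via Python's sqlite3), which accepts the same abbreviation as PostgreSQL; the aliases short_form and long_form are invented for the example.

```python
import sqlite3

# Rebuild Dataset #1: x = 1..10, y cycling through 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(x, (x - 1) % 3 + 1) for x in range(1, 11)],
)

rows = conn.execute("""
    SELECT x,
           sum(x) OVER (PARTITION BY y ORDER BY x
                        ROWS 1 PRECEDING) AS short_form,
           sum(x) OVER (PARTITION BY y ORDER BY x
                        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS long_form
    FROM a
    ORDER BY x
""").fetchall()

# Both columns should agree on every row.
same = all(short == long for _, short, long in rows)
print("abbreviation matches expansion:", same)
```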

Syntax for Window Frames before 9.0
  - no window functions before 8.4
  - in 8.4, only these frames allowed:
        [ RANGE | ROWS ] BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    (the window frame is from the start of the partition to the current row)
        [ RANGE | ROWS ] BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    (the window frame is the whole partition)

A larger example: Heat diffusion on a 2D grid
Window functions over a larger dataset
  - We consider a square N × N grid
  - Heat flows along each horizontal or vertical segment
  - The speed on a given segment is proportional to the temperature gap between its two extremes
Our plan:
  1. we produce a dataset which simulates heat flow using window functions on PostgreSQL;
  2. snapshots of that dataset are plotted with gnuplot and assembled into a movie using mencoder;
  3. we view the movie to confirm that heat flows as we would expect.

The discrete 2D heat equation
The following equation encodes our model:

\[
\Delta z(x, y) = C \Big[ z(x-1, y) - 2z(x, y) + z(x+1, y) + z(x, y-1) - 2z(x, y) + z(x, y+1) \Big] \tag{1}
\]

  - z(x, y) is the temperature at the point (x, y), and \Delta z(x, y) is its change over one simulation step
  - Coefficient C controls the speed of the simulation
  - Equation (1) can be expressed using window functions

Discrete heat equation via window functions
The expression
    z(x-1, y) - 2z(x, y) + z(x+1, y)
can be written as a window function:
    sum(z) OVER (
        PARTITION BY y
        ORDER BY x
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) - 3 * z
The same idea applies to the other half of Equation (1):
    z(x, y-1) - 2z(x, y) + z(x, y+1)
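One simulation step built from these two window expressions can be sketched as follows. This is not the talk's actual script: it assumes SQLite (3.25 or later, via Python's sqlite3) instead of PostgreSQL, a hypothetical table h(x, y, z), a 5 × 5 grid, C = 0.1, and an explicit update in which the right-hand side of Equation (1) is added to z. Because frames are trimmed at partition edges, points outside the grid act as if their temperature were 0.

```python
import sqlite3

N = 5    # hypothetical grid size for this sketch
C = 0.1  # hypothetical speed coefficient

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE h (x INTEGER, y INTEGER, z REAL)")
# Scenario "a hot point": 1.0 at the centre of the grid, 0.0 elsewhere.
old_z = {
    (x, y): (1.0 if (x, y) == (N // 2, N // 2) else 0.0)
    for x in range(N) for y in range(N)
}
conn.executemany(
    "INSERT INTO h VALUES (?, ?, ?)",
    [(x, y, z) for (x, y), z in old_z.items()],
)

# One step: z plus C times the two window expressions of Equation (1).
STEP_SQL = """
    SELECT x, y,
           z + ? * (
               sum(z) OVER (PARTITION BY y ORDER BY x
                            ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) - 3 * z
             + sum(z) OVER (PARTITION BY x ORDER BY y
                            ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) - 3 * z
           ) AS z_new
    FROM h
"""
new_z = {(x, y): z for x, y, z in conn.execute(STEP_SQL, (C,))}

def reference_step(grid, c):
    """Same update in plain Python; missing neighbours count as 0."""
    result = {}
    for (x, y), z in grid.items():
        neighbours = sum(
            grid.get(p, 0.0)
            for p in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
        )
        result[(x, y)] = z + c * (neighbours - 4 * z)
    return result

expected = reference_step(old_z, C)
max_gap = max(abs(new_z[p] - expected[p]) for p in old_z)
print("largest difference from the reference:", max_gap)
```

After one step the hot point has cooled and each of its four neighbours has warmed by the same amount, which is what the plain Python reference computes as well.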

Heat flow scenarios
We set initial data according to one of the following scenarios:
  - a hot point
  - a hot segment
  - a hot square
  - a cold square
"Hot" and "cold" are meant in comparison to the rest of the points.
Then we run the simulation and view the video...

Benchmark (sort of) with N = 16
    Subquery Scan on h0  (cost=153.32..210.49 rows=1089 width=36)
      ->  WindowAgg  (cost=153.32..175.10 rows=1089 width=20)
            InitPlan 2 (returns $1)
              ->  Result  (cost=0.05..0.06 rows=1 width=0)
                    InitPlan 1 (returns $0)
                      ->  Limit  (cost=0.00..0.05 rows=1 width=4)
                            ->  Index Scan Backward using h1_t_idx on h1  (cost=0.00..57.03 rows=1089 width=4)
                                  Index Cond: ((t IS NOT NULL) AND (t < 500))
            ->  Sort  (cost=153.26..155.98 rows=1089 width=20)
                  Sort Key: pg_temp_2.h1.x, pg_temp_2.h1.y
                  ->  WindowAgg  (cost=76.55..98.33 rows=1089 width=20)
                        ->  Sort  (cost=76.55..79.27 rows=1089 width=20)
                              Sort Key: pg_temp_2.h1.y, pg_temp_2.h1.x
                              ->  Seq Scan on h1  (cost=0.00..21.61 rows=1089 width=20)
                                    Filter: (t = $1)

Question time
Any questions?

Thank you for your attention!
Feedback: http://2011.pgconf.eu/feedback

Licence
This document is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported licence. A copy of the licence is available at the URL http://creativecommons.org/licenses/by-nc-sa/3.0/ or you can write to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.