General Incremental Sliding-Window Aggregation


Kanat Tangwongsan, Mahidol University International College, Thailand (part of this work done at IBM Research, Watson). Joint work with Martin Hirzel, Scott Schneider, and Kun-Lung Wu, IBM Research, Watson.

Dynamic data is everywhere: a data stream is a sequence of records, e.g. (caller_id, len, timestamp), arriving over time up to the present moment t_now. We are often interested in a sliding window over the stream, and in computing some aggregation on the data in that window.

Build a system to aggregate in a stream: records (caller_id, len, timestamp) flow in, a window operator maintains the current window, and results flow to an output stream. Answer the following questions every minute about the past 4 hrs: How long was the longest call? How many calls have that duration? Who is a caller with that duration?

Answer the following questions every minute about the past 4 hrs: How long was the longest call? How many calls have that duration? Who is a caller with that duration? Basic solution: maintain a sliding window of the past 4 hours and walk through the entire window every minute. Simple but slow: O(n) per query.

Improvement Opportunities. Idea: when the window slides, it shares most of its contents with the window of the most recent query. How can that work be reused? If the aggregation is invertible, keep a running sum: add on arrival, subtract on departure. Partial sums: bundle items that arrive and depart together, reducing the number of items in the window.
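The running-sum idea for invertible aggregations can be sketched as follows (a minimal illustration; class and method names are hypothetical, not from the paper):

```python
from collections import deque

class RunningSum:
    """Sliding-window sum: O(1) per insert/evict because + is invertible."""
    def __init__(self):
        self.window = deque()
        self.total = 0

    def insert(self, v):          # add on arrival
        self.window.append(v)
        self.total += v

    def evict(self):              # subtract on departure
        self.total -= self.window.popleft()

    def query(self):
        return self.total

w = RunningSum()
for v in [3, 1, 4, 1, 5]:
    w.insert(v)
w.evict()                         # the oldest item (3) leaves the window
print(w.query())                  # 1 + 4 + 1 + 5 = 11
```

This trick fails for non-invertible aggregations such as max: once the maximum departs, there is no "subtract" that recovers the new maximum, which is what motivates a more general structure.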

This Work: how to engineer sliding-window aggregation so that users can add new aggregation operations easily and get good performance with little hassle, using a combination of simple, known ideas.

Comparison with prior work (performance vs. generality): Prior work that is good for small updates (e.g., Arasu-Widom VLDB '04, Moon et al. ICDE '00) requires invertibility, commutativity, or associativity. Prior work that is good for large updates (e.g., Cranor et al. SIGMOD '03, Krishnamurthy et al. SIGMOD '06) requires FIFO windows. This work is good for both large and small updates: if m updates are made to a window of size n, it uses O(m log(n/m)) time to compute the aggregate. It requires only associativity (neither invertibility nor commutativity), and the window need not be FIFO.

Our Approach, high-level idea: the user writes aggregation code following an interface; that code is injected into a data structure template and linked with a window-management library, against which the user declares queries. The aggregation interface: lift maps each input item to a partial aggregate; an associative combine reduces partial aggregates w_1, w_2, ..., w_n; lower maps the combined partial aggregate to the final answer.
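The lift/combine/lower interface can be sketched like this (an illustrative rendering, not the paper's actual API; Max is used because its partial aggregate is trivial):

```python
# A user-supplied aggregation is three functions:
#   lift:    raw input -> partial aggregate
#   combine: associative merge of two partial aggregates
#   lower:   partial aggregate -> final answer
class MaxAgg:
    @staticmethod
    def lift(x):
        return x                      # a partial max is just a value
    @staticmethod
    def combine(a, b):
        return a if a >= b else b
    @staticmethod
    def lower(p):
        return p

def aggregate(agg, window):
    """Non-incremental reference: fold combine over the lifted items."""
    parts = [agg.lift(x) for x in window]
    acc = parts[0]
    for p in parts[1:]:
        acc = agg.combine(acc, p)
    return agg.lower(acc)

print(aggregate(MaxAgg, [2, 9, 4]))   # 9
```

Because combine is only required to be associative, the library is free to regroup partial aggregates however it likes, which is what the tree structure below exploits.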

Example: StdDev. StdDev = sqrt( (1/n) Σ_{i=1..n} (w_i − x̄)² ), which is computable from the count, the sum of values, and the sum of squares. lift: x ↦ {c: 1, s: x, q: x²}; combine: component-wise add; lower: sqrt(q/c − (s/c)²).
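The StdDev aggregation above can be written out directly (field names c/s/q follow the slide; the fold is the non-incremental reference, not the tree):

```python
import math

# StdDev as lift / combine / lower:
def lift(x):
    return {"c": 1, "s": x, "q": x * x}   # count, sum, sum of squares

def combine(a, b):                        # component-wise add; associative
    return {k: a[k] + b[k] for k in ("c", "s", "q")}

def lower(p):                             # sqrt(E[x^2] - E[x]^2)
    c, s, q = p["c"], p["s"], p["q"]
    return math.sqrt(q / c - (s / c) ** 2)

parts = [lift(x) for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]]
acc = parts[0]
for p in parts[1:]:
    acc = combine(acc, p)
print(lower(acc))   # population std dev of this data: 2.0
```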

Perfect Binary Tree: keep a partial sum in each internal node, computed as comb(left child, right child); the leaves hold the window items w[0], w[1], ..., w[n-1]. Depth: log n. Changing k leaves affects O(k log(n/k)) nodes.

Data Structure Engineering: the user writes code following an interface; it is injected into a data structure template and linked with the window-management library. Engineering goals: minimize pointer chasing, allocate memory in bulk, and place data to help the cache.

Main idea: keep a perfect binary tree, treating the leaves as a circular buffer, all laid out as one array in binary-heap encoding (internal nodes first, then leaves; node i's children sit at 2i and 2i+1), so the physical layout is pointer-free. Q: Non power of 2? Dynamic window size? Non-FIFO? The queue's front and back locations give a natural demarcation; resize and rebuild as necessary. Amortized bounds (see paper).
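A minimal sketch of the flat-array tree with circular-buffer leaves (the layout follows the standard binary-heap encoding the slide describes; class and method names, and the use of an identity element for empty leaves, are illustrative assumptions):

```python
# Internal nodes occupy array indices 1..n-1, leaves occupy n..2n-1,
# and node i's children are at 2i and 2i+1. The leaves act as a
# circular buffer indexed by front/back pointers.
class FlatAggTree:
    def __init__(self, n, combine, identity):
        self.n, self.combine, self.identity = n, combine, identity
        self.tree = [identity] * (2 * n)
        self.front = self.back = 0        # circular-buffer pointers
        self.size = 0

    def _update(self, leaf, value):
        i = self.n + leaf
        self.tree[i] = value
        i //= 2
        while i >= 1:                     # refresh O(log n) partial sums
            self.tree[i] = self.combine(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def insert(self, v):
        self._update(self.back, v)
        self.back = (self.back + 1) % self.n
        self.size += 1

    def evict(self):
        self._update(self.front, self.identity)
        self.front = (self.front + 1) % self.n
        self.size -= 1

    def query(self):
        return self.tree[1]               # root combines all leaves

t = FlatAggTree(8, max, float("-inf"))
for v in [3, 7, 2, 5]:
    t.insert(v)
t.evict()                                 # the 3 departs
print(t.query())                          # max of [7, 2, 5] = 7
```

Reading the answer off the root works here only because max is commutative; in general the leaf order differs from window order once the buffer wraps, so the query must respect the front/back demarcation.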

But circular-buffer leaves != window-ordered data. As items are inserted at the back B and evicted from the front F, the buffer eventually wraps, and the leaf view becomes inverted relative to the window view: the window in order is a suffix of the leaf array (from F to the end) followed by a prefix (from the start to B). Fix: ans = combine(suffix, prefix).
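The combine(suffix, prefix) fix can be sketched as follows (function name and layout are illustrative; concatenation is used as the combine because it is associative but not commutative, so the window order is visible in the answer):

```python
# When the window wraps past the end of the leaf array, window order is
# leaves[front:] followed by leaves[:back]; for a non-commutative combine
# the answer must be assembled as combine(suffix, prefix).
def window_query(leaves, front, back, combine, identity):
    def fold(xs):
        acc = identity
        for x in xs:
            acc = combine(acc, x)
        return acc
    if front <= back:                     # not wrapped: one contiguous run
        return fold(leaves[front:back])
    suffix = fold(leaves[front:])         # older items, at the array's end
    prefix = fold(leaves[:back])          # newer items, wrapped to the start
    return combine(suffix, prefix)

# Buffer wrapped: window order "a","b","c","d" is stored split in two runs.
leaves = ["c", "d", "x", "a", "b"]        # front=3, back=2 ("x" is stale)
print(window_query(leaves, 3, 2, lambda a, b: a + b, ""))   # "abcd"
```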

Experimental Analysis: (1) What's the throughput relative to non-incremental? (2) What's the performance trend under updates of different sizes? (3) How does a wildly-changing window size affect the overall throughput?

What's the throughput relative to non-incremental? [Plot: million tuples/second vs. window size, comparing the recompute-all baseline and ISS against RA (our scheme) for Max and StdDev.] Small windows see a modest slowdown; there are crossover points for StdDev and Max at small window sizes, beyond which our scheme is increasingly faster, reaching an order-of-magnitude advantage once n is in the thousands.

What's the performance trend under updates of different sizes? [Plot: average per-tuple cost (microseconds) vs. window size n up to 4M, for batch sizes m = 1, 4, 64, n/16, n/4, and n.] Matches our theory: processing k events takes O(1 + log(n/k)) time per event.

How does a wildly-changing window size affect the overall throughput? [Plot: average cost per tuple (microseconds) vs. window size up to 4M, comparing RA StdDev and RA Max with fixed window sizes against the same aggregations with oscillating window sizes.]

Take-Home Points. This work makes sliding-window aggregation easily extensible by users, with good performance, through careful systems and algorithmic design and engineering (a blend of known ideas). It is general (non-FIFO, needs only associativity) and fast for both large and small windows: if m updates have been made on a window of size n, it uses O(m log(n/m)) time to derive the aggregate. More details: see the paper / come to the poster session.