External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13



Similar documents
External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Overview of Storage and Indexing

Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8

Chapter 13: Query Processing. Basic Steps in Query Processing

Lecture 1: Data Storage & Index

Unit Storage Structures 1. Storage Structures. Unit 4.3

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

Physical Data Organization

Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)

Outline. Principles of Database Management Systems. Memory Hierarchy: Capacities and access times. CPU vs. Disk Speed

6 March Array Implementation of Binary Trees

Storage in Database Systems. CMPSCI 445 Fall 2010

COS 318: Operating Systems

EFFICIENT EXTERNAL SORTING ON FLASH MEMORY EMBEDDED DEVICES

Big Data and Scripting. Part 4: Memory Hierarchies

Binary Heaps * * * * * * * / / \ / \ / \ / \ / \ * * * * * * * * * * * / / \ / \ / / \ / \ * * * * * * * * * *

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

Seeking Fast, Durable Data Management: A Database System and Persistent Storage Benchmark

Benchmarking Cassandra on Violin

File Management. Chapter 12

DATABASE DESIGN - 1DL400

The Google File System

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Introduction to IR Systems: Supporting Boolean Text Search. Information Retrieval. IR vs. DBMS. Chapter 27, Part A

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

GraySort on Apache Spark by Databricks

Data storage Tree indexes

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

Data Compression in Blackbaud CRM Databases

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011

Magento & Zend Benchmarks Version 1.2, 1.3 (with & without Flat Catalogs)

DBMS / Business Intelligence, SQL Server

AP Computer Science AB Syllabus 1

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

CPSC 211 Data Structures & Implementations (c) Texas A&M University [ 313]

Query Processing C H A P T E R12. Practice Exercises

Google File System. Web and scalability

MSU Tier 3 Usage and Troubleshooting. James Koll

Solid State Drive Architecture

6. Storage and File Structures

CSE 326: Data Structures B-Trees and B+ Trees

SQL Query Evaluation. Winter Lecture 23

SQL Server 2012 Optimization, Performance Tuning and Troubleshooting

Performance And Scalability In Oracle9i And SQL Server 2000

MCTS Guide to Microsoft Windows 7. Chapter 10 Performance Tuning

VirtualCenter Database Performance for Microsoft SQL Server 2005 VirtualCenter 2.5

Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb

Binary Heap Algorithms

Databases and Information Systems 1 Part 3: Storage Structures and Indices

Energy Efficient MapReduce

Rethinking SIMD Vectorization for In-Memory Databases

Buffer Management 5. Buffer Management

A Study on Workload Imbalance Issues in Data Intensive Distributed Computing

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) Total 92.

Virtuoso and Database Scalability

Analysis of Algorithms I: Binary Search Trees

CS473 - Algorithms I

Hardware Configuration Guide

Exploring the Efficiency of Big Data Processing with Hadoop MapReduce

Data Structures and Algorithm Analysis (CSC317) Intro/Review of Data Structures Focus on dynamic sets

Converting a Number from Decimal to Binary

External Memory Geometric Data Structures

Performance Verbesserung von SAP BW mit SQL Server Columnstore

Tables so far. set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n)

Analyze Database Optimization Techniques

Understanding the Benefits of IBM SPSS Statistics Server

1Z0-117 Oracle Database 11g Release 2: SQL Tuning. Oracle

Memory Hierarchy. Arquitectura de Computadoras. Centro de Investigación n y de Estudios Avanzados del IPN. adiaz@cinvestav.mx. MemoryHierarchy- 1

Maximizing VMware ESX Performance Through Defragmentation of Guest Systems. Presented by

Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0

OPERATING SYSTEM - VIRTUAL MEMORY

Application of Predictive Analytics for Better Alignment of Business and IT

CIS 631 Database Management Systems Sample Final Exam

How to recover a failed Storage Spaces

Actian Vector in Hadoop

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

The Quest for Speed - Memory. Cache Memory. A Solution: Memory Hierarchy. Memory Hierarchy

Chapter 11: Input/Output Organisation. Lesson 06: Programmed IO

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles

Computer Organization and Architecture. Characteristics of Memory Systems. Chapter 4 Cache Memory. Location CPU Registers and control unit memory

University of Dublin Trinity College. Storage Hardware.

Four Orders of Magnitude: Running Large Scale Accumulo Clusters. Aaron Cordova Accumulo Summit, June 2014

Binary Heaps. CSE 373 Data Structures

LLamasoft K2 Enterprise 8.1 System Requirements

Chapter 13. Disk Storage, Basic File Structures, and Hashing

IBM DB2: LUW Performance Tuning and Monitoring for Single and Multiple Partition DBs

DB2 for Linux, UNIX, and Windows Performance Tuning and Monitoring Workshop

Outline. CS 245: Database System Principles. Notes 02: Hardware. Hardware DBMS Data Storage

Effective Java Programming. measurement as the basis

Whitepaper: performance of SqlBulkCopy

Oracle EXAM - 1Z Oracle Database 11g Release 2: SQL Tuning. Buy Full Product.

Binary search tree with SIMD bandwidth optimization using SSE

File Systems Management and Examples

Transcription:

External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing gpa order Sorting is first step in bulk loading B+ tree index. Sorting useful for eliminating duplicate copies in a collection of records (Why?) Sort-merge join algorithm involves sorting. Problem: sort 1Gb of data with 1Mb of RAM. why not virtual memory? Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 2 2-Way Sort: Requires 3 Buffers Pass 1: Read a page, sort it, write it. only one buffer page is used Pass 2, 3,, etc.: three buffer pages used. INPUT 1 INPUT 2 OUTPUT Main memory buffers Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 3

Two-Way External Merge Sort Each pass we read + write each page in file. N pages in the file => the number of passes = log 2 N + 1 So toal cost is: ( log 2 N + ) 2 N 1 Idea: Divide and conquer: sort subfiles and merge 3,4 6,2 9,4 8,7 5,6 3,1 2 3,4 2,6 4,9 7,8 5,6 1,3 2 2,3 4,6 Input file PASS 0 1-page runs PASS 1 2-page runs PASS 2 4-page runs PASS 3 8-page runs 9 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 4 2,3 4,4 6,7 8,9 4,7 8,9 1,2 2,3 3,4 4,5 6,6 7,8 1,3 5,6 2 1,2 3,5 6 General External Merge Sort * More than 3 buffer pages. How can we utilize them? To sort a file with N pages using B buffer pages: Pass 0: use B buffer pages. Produce N / B sorted runs of B pages each. Pass 2,, etc.: merge B-1 runs. INPUT 1... INPUT 2 OUTPUT...... INPUT B-1 B Main memory buffers Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 5 Cost of External Merge Sort Number of passes: 1 + log B 1 N / B Cost = 2N * (# of passes) E.g., with 5 buffer pages, to sort 108 page file: Pass 0: 108 / 5 = 22 sorted runs of 5 pages each (last run is only 3 pages) Pass 1: 22 / 4 = 6 sorted runs of 20 pages each (last run is only 8 pages) Pass 2: 2 sorted runs, 80 pages and 28 pages Pass 3: Sorted file of 108 pages Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 6

Number of Passes of External Sort N B=3 B=5 B=9 B=17 B=129 B=257 100 7 4 3 2 1 1 1,000 10 5 4 3 2 2 10,000 13 7 5 4 2 2 100,000 17 9 6 5 3 3 1,000,000 20 10 7 5 3 3 10,000,000 23 12 8 6 4 3 100,000,000 26 14 9 7 4 4 1,000,000,000 30 15 10 8 5 4 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7 Internal Sort Algorithm Quicksort is a fast way to sort in memory. An alternative is tournament sort (a.k.a. heapsort ) Top: Read in B blocks Output: move smallest record to output buffer Read in a new record r insert r into heap if r not smallest, then GOTO Output else remove r from heap output heap in order; GOTO Top Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8 More on Heapsort Fact: average length of a run in heapsort is 2B The snowplow analogy Worst-Case: What is min length of a run? How does this arise? Best-Case: B What is max length of a run? How does this arise? Quicksort is faster, but... Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9

I/O for External Merge Sort longer runs often means fewer passes! Actually, do I/O a page at a time In fact, read a block of pages sequentially! Suggests we should make each buffer (input/output) be a block of pages. But this will reduce fan-out during merge passes! In practice, most files still sorted in 2-3 passes. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 10 Number of Passes of Optimized Sort N B=1,000 B=5,000 B=10,000 100 1 1 1 1,000 1 1 1 10,000 2 2 1 100,000 3 2 2 1,000,000 3 2 2 10,000,000 4 3 3 100,000,000 5 3 3 1,000,000,000 5 4 3 * Block size = 32, initial pass produces runs of size 2B. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 11 Double Buffering To reduce wait time for I/O request to complete, can prefetch into `shadow block. Potentially, more passes; in practice, most files still sorted in 2-3 passes. INPUT 1 INPUT 1' INPUT 2 OUTPUT INPUT 2' OUTPUT' INPUT k INPUT k' b block size B main memory buffers, k-way merge Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 12

Sorting Records! Sorting has become a blood sport! Parallel sorting is the name of the game... Datamation: Sort 1M records of size 100 bytes Typical DBMS: 15 minutes World record: 3.5 seconds 12-CPU SGI machine, 96 disks, 2GB of RAM New benchmarks proposed: Minute Sort: How many can you sort in 1 minute? Dollar Sort: How many can you sort for $1.00? Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 13 Using B+ Trees for Sorting Scenario: Table to be sorted has B+ tree index on sorting column(s). Idea: Can retrieve records in order by traversing leaf pages. Is this a good idea? Cases to consider: B+ tree is clustered Good idea! B+ tree is not clustered Could be a very bad idea! Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 14 Clustered B+ Tree Used for Sorting Cost: root to the leftmost leaf, then retrieve all leaf pages (Alternative 1) If Alternative 2 is used? Additional cost of retrieving data records: each page fetched just once. Data Records Index (Directs search) Data Entries ("Sequence set") * Always better than external sorting! Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 15

Unclustered B+ Tree Used for Sorting Alternative (2) for data entries; each data entry contains rid of a data record. In general, one I/O per data record! Index (Directs search) Data Entries ("Sequence set") Data Records Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 16 External Sorting vs. Unclustered Index N Sorting p=1 p=10 p=100 100 200 100 1,000 10,000 1,000 2,000 1,000 10,000 100,000 10,000 40,000 10,000 100,000 1,000,000 100,000 600,000 100,000 1,000,000 10,000,000 1,000,000 8,000,000 1,000,000 10,000,000 100,000,000 10,000,000 80,000,000 10,000,000 100,000,000 1,000,000,000 * p: # of records per page * B=1,000 and block size=32 for sorting * p=100 is the more realistic value. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 17 Summary External sorting is important; DBMS may dedicate part of buffer pool for sorting! External merge sort minimizes disk I/O cost: Pass 0: Produces sorted runs of size B (# buffer pages). Later passes: merge runs. # of runs merged at a time depends on B, and block size. Larger block size means less I/O cost per page. Larger block size means smaller # runs merged. In practice, # of runs rarely more than 2 or 3. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 18

Summary, cont. Choice of internal sort algorithm may matter: Quicksort: Quick! Heap/tournament sort: slower (2x), longer runs The best sorts are wildly fast: Despite 40+ years of research, we re still improving! Clustered B+ tree is good for sorting; unclustered tree is usually very bad. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 19