Query Processing Steps
- Betty Porter
- 7 years ago
Transcription
1 Query Processing
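2 Query Processing Steps
- Example query: SELECT balance FROM account WHERE balance < 2500
- Step 1: equivalent relational algebra expressions for the query, $\sigma_{balance<2500}(\Pi_{balance}(account))$ and $\Pi_{balance}(\sigma_{balance<2500}(account))$
- Steps 2 and 3 (figure labels); the figure also notes a B+-tree index on balance
CMPT 454: Database II -- Query Processing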
3 Query Cost Measures
- Query processing as an optimization problem
  - Search space: all possible equivalent relational algebra expressions and all possible query execution plans
  - Goal: find the most efficient query execution plan
- Efficiency: I/O or CPU?
  - CPU processing time is often much smaller than I/O cost and is hard to estimate (real systems do consider it)
  - The cost of each I/O access may differ slightly
  - The number of block transfers measures the dominant component of query answering cost
- Communication cost matters in distributed database systems
4 Selection
- Table-scan: read the blocks one by one from disk
  - Linear search: scan each file block. Cost = $t_S + b_r \cdot t_T$, where $b_r$ is the number of blocks in the file, $t_S$ is the average seek time, and $t_T$ is the average block transfer time
    - Search on a key attribute: the scan can stop at the first match, for an average of $b_r / 2$ block transfers
    - Can be used in any case
  - Binary search: applies when the file is ordered on an attribute and the selection condition is an equality comparison on that attribute
    - Cost: $\lceil \log_2(b_r) \rceil \cdot (t_S + t_T)$
    - If the attribute is not a key, some extra blocks may need to be read
- Index-scan: use an index in the selection
  - Primary index, equality on key: if a B+-tree is used, $(h_i + 1) \cdot (t_S + t_T)$, where $h_i$ is the height of the tree
  - Primary index, equality on nonkey: $h_i \cdot (t_T + t_S) + t_S + t_T \cdot b$, where $b$ is the number of blocks containing records with the specified search key
  - Secondary index, equality on key: $(h_i + 1) \cdot (t_S + t_T)$
  - Secondary index, equality on nonkey: $(h_i + n) \cdot (t_T + t_S)$, where $n$ is the number of matching records; this can be worse than linear search
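The selection cost formulas can be compared numerically. The sketch below simply encodes them as functions; the function names and the sample figures (times in microseconds, 10,000 blocks, tree height 3, 500 matching records) are made up for illustration and do not come from the slides.

```python
import math

def linear_search_cost(b_r, t_S, t_T):
    # One seek, then transfer every block of the file.
    return t_S + b_r * t_T

def binary_search_cost(b_r, t_S, t_T):
    # Each probed block costs one seek plus one transfer.
    return math.ceil(math.log2(b_r)) * (t_S + t_T)

def index_equality_on_key_cost(h_i, t_S, t_T):
    # Traverse the tree, then fetch one record (primary or secondary index).
    return (h_i + 1) * (t_S + t_T)

def secondary_index_nonkey_cost(h_i, n, t_S, t_T):
    # Each of the n matching records may sit on a different block.
    return (h_i + n) * (t_S + t_T)

# Hypothetical figures: t_S = 4000 us, t_T = 100 us, b_r = 10000 blocks.
t_S, t_T, b_r = 4000, 100, 10000
print(linear_search_cost(b_r, t_S, t_T))
print(binary_search_cost(b_r, t_S, t_T))
print(index_equality_on_key_cost(3, t_S, t_T))
print(secondary_index_nonkey_cost(3, 500, t_S, t_T))
```

With these assumed figures the secondary-index lookup for 500 matching records costs more than a full linear scan, which is the situation the last bullet warns about.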
5 Selection Involving Comparisons
- Primary index, comparison
  - Case A > v or A ≥ v: use the index to find the first tuple having A ≥ v, then scan the file from there to the end; cost $h_i \cdot (t_T + t_S) + t_S + t_T \cdot b$
  - Case A < v or A ≤ v: scan from the beginning of the file until the condition is violated; the index is not used
- Secondary index, comparison: use the index to find pointers to the records, then retrieve the data blocks
  - Sort the pointers to ensure each block is read once
  - Can be costly if the selectivity of a query is low (i.e., many tuples satisfy the condition)
- Conjunction $\sigma_{\theta_1 \wedge \theta_2 \wedge \dots \wedge \theta_n}(r)$
  - Selection and test using one index on one attribute of a composite search key
  - Selection by intersection of identifiers
- Disjunction $\sigma_{\theta_1 \vee \theta_2 \vee \dots \vee \theta_n}(r)$: by union of identifiers
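Selection by intersection or union of identifiers can be sketched with Python sets. This is an illustration only: each index is modeled as a dict from attribute value to a set of record identifiers, and the index names and contents are invented for the example.

```python
def rids_for(index, value):
    # Record identifiers the index returns for one equality condition.
    return set(index.get(value, ()))

def conjunction(rid_sets):
    # A record satisfies all conditions only if every index returned it.
    return set.intersection(*rid_sets) if rid_sets else set()

def disjunction(rid_sets):
    # A record satisfies some condition if any index returned it.
    return set.union(*rid_sets) if rid_sets else set()

# Hypothetical secondary indexes on two attributes of the same relation.
by_branch = {"north": {1, 2, 5}, "south": {3, 4}}
by_band = {"low": {2, 3, 5}, "high": {1, 4}}

matches = [rids_for(by_branch, "north"), rids_for(by_band, "low")]
print(conjunction(matches))   # records to fetch for the AND condition
print(disjunction(matches))   # records to fetch for the OR condition
```

Only the identifiers that survive the set operation are fetched from the data file, so the data blocks are touched once, after the index lookups.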
6 Nested-Loop Join
- To compute the theta join $r \bowtie_\theta s$:
    for each tuple t_r in r do begin
      for each tuple t_s in s do begin
        test whether pair (t_r, t_s) satisfies the join condition θ
        if so, add t_r · t_s to the result
      end
    end
- r: the outer relation of the join
- s: the inner relation of the join
- Needs no indexes and can be used with any kind of join condition
- Cost
  - Worst case, only one block of each relation fits in memory: $n_r \cdot b_s + b_r$
  - Best case, both relations fit in memory: $b_r + b_s$
  - If one relation fits entirely in main memory, use that relation as the inner relation: $b_r + b_s$
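The nested-loop pseudocode translates almost line for line into Python. In this sketch relations are in-memory lists of tuples and `theta` is any Python predicate; that representation, and the sample numbers in the cost call, are assumptions made for illustration.

```python
def nested_loop_join(r, s, theta):
    """Theta join of r (outer) and s (inner) by exhaustive pairing."""
    result = []
    for t_r in r:                      # outer relation
        for t_s in s:                  # inner relation, rescanned per outer tuple
            if theta(t_r, t_s):
                result.append(t_r + t_s)   # concatenate the two tuples
    return result

def nested_loop_worst_case(n_r, b_r, b_s):
    # One full scan of s per tuple of r, plus one scan of r.
    return n_r * b_s + b_r

r = [(1, "a"), (2, "b")]
s = [(1, "x"), (3, "y"), (2, "z")]
print(nested_loop_join(r, s, lambda t_r, t_s: t_r[0] == t_s[0]))
print(nested_loop_worst_case(n_r=5000, b_r=100, b_s=400))
```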
7 Block Nested-Loop Join
- Idea: once a block is read into main memory, the records in the block should be used as much as possible
    for each block B_r of r do begin
      for each block B_s of s do begin
        for each tuple t_r in B_r do begin
          for each tuple t_s in B_s do begin
            check if (t_r, t_s) satisfies the join condition
            if so, add t_r · t_s to the result
          end
        end
      end
    end
- Cost
  - Worst case, only one block for each relation: $b_r \cdot b_s + b_r$
  - Best case: $b_r + b_s$
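A small sketch can confirm the worst-case block count. Here a relation is modeled as a list of blocks, each block a list of tuples, and a counter stands in for disk reads; both the representation and the counter are illustrative assumptions.

```python
def block_nested_loop_join(r_blocks, s_blocks, theta):
    """Join block by block and count simulated block reads."""
    result, block_reads = [], 0
    for B_r in r_blocks:
        block_reads += 1               # read one block of the outer relation
        for B_s in s_blocks:
            block_reads += 1           # read one block of the inner relation
            for t_r in B_r:
                for t_s in B_s:
                    if theta(t_r, t_s):
                        result.append(t_r + t_s)
    return result, block_reads

r_blocks = [[(1,), (2,)], [(3,)]]              # b_r = 2
s_blocks = [[(2,), (3,)], [(4,)], [(1,)]]      # b_s = 3
pairs, reads = block_nested_loop_join(r_blocks, s_blocks, lambda a, b: a == b)
print(sorted(pairs), reads)
```

The counter ends at b_r * b_s + b_r because the inner relation is rescanned once per outer block rather than once per outer tuple.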
8 Further Improvements
- In a natural join or equi-join, if the join attributes form a key on the inner relation, then for each outer relation tuple the inner loop can stop as soon as the first match is found
- In the block nested-loop algorithm, if M blocks are available, use M-2 blocks for the outer relation (why?)
  - Total cost: $\lceil b_r / (M-2) \rceil \cdot b_s + b_r$
- Scan the inner loop alternately forward and backward (similar to the elevator algorithm) to reuse the blocks remaining in the buffer
  - How and why is it good?
- Indexed nested-loop join
  - An index exists on the inner relation's join attribute: use index lookups to replace file scans
  - Cost: $b_r \cdot (t_T + t_S) + n_r \cdot c$, where $c$ is the cost of a single selection on s using the join condition and $n_r$ is the number of records in r
  - If indices are available on the join attributes of both r and s, use the relation with fewer tuples as the outer relation
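Two of these improvements are easy to try out. The sketch below encodes the M-2 cost formula and an indexed nested-loop join in which a Python dict stands in for the index on s; the dict, the function names, and the sample sizes are assumptions for illustration.

```python
import math
from collections import defaultdict

def block_nested_loop_cost(b_r, b_s, M):
    # M - 2 buffer blocks hold a chunk of the outer relation;
    # the inner relation is scanned once per chunk.
    return math.ceil(b_r / (M - 2)) * b_s + b_r

def indexed_nested_loop_join(r, s, key_r, key_s):
    index = defaultdict(list)          # stand-in for an existing index on s
    for t_s in s:
        index[key_s(t_s)].append(t_s)
    result = []
    for t_r in r:                      # one index lookup per outer tuple
        for t_s in index.get(key_r(t_r), ()):
            result.append(t_r + t_s)
    return result

print(block_nested_loop_cost(b_r=100, b_s=400, M=12))
print(block_nested_loop_cost(b_r=100, b_s=400, M=3))
r = [(1, "a"), (2, "b")]
s = [(2, "x"), (2, "y"), (3, "z")]
print(indexed_nested_loop_join(r, s, lambda t: t[0], lambda t: t[0]))
```

With M = 3 the formula reduces to the worst case of the plain block nested-loop join, since only one block of the outer relation is held at a time.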
9 Merge Join
- Can be used only for equi-joins and natural joins
- Sort both relations on their join attributes (if not already sorted on them)
- Merge the sorted relations to join them
  - The join step is similar to the merge stage of the sort-merge algorithm
  - Every pair with the same value on the join attribute must be matched
- Cost: $b_r + b_s$ block transfers + $\lceil b_r / b_b \rceil + \lceil b_s / b_b \rceil$ seeks + the cost of sorting if the relations are unsorted, where $b_b$ is the number of buffer blocks allocated to each relation
  - After sorting, each block needs to be read only once
  - This assumes all tuples for any given value of the join attributes fit in memory
- Can be further improved by combining the merge phase of merge-sort with the merge phase of merge-join: merge-join multiple sorted sublists
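The merge step, including the requirement that every pair with the same join value be matched, can be sketched as follows. Relations are in-memory lists of tuples and the key functions are supplied by the caller; these are simplifying assumptions, and the sort here is Python's in-memory sort rather than an external sort.

```python
def merge_join(r, s, key_r, key_s):
    r = sorted(r, key=key_r)
    s = sorted(s, key=key_s)
    i = j = 0
    result = []
    while i < len(r) and j < len(s):
        k_r, k_s = key_r(r[i]), key_s(s[j])
        if k_r < k_s:
            i += 1
        elif k_r > k_s:
            j += 1
        else:
            # Find the run of s tuples sharing this join value.
            j_end = j
            while j_end < len(s) and key_s(s[j_end]) == k_r:
                j_end += 1
            # Pair every r tuple with this value against the whole run.
            while i < len(r) and key_r(r[i]) == k_r:
                for t_s in s[j:j_end]:
                    result.append(r[i] + t_s)
                i += 1
            j = j_end
    return result

r = [(2, "b"), (1, "a"), (2, "c")]
s = [(2, "x"), (3, "y"), (2, "z")]
print(merge_join(r, s, lambda t: t[0], lambda t: t[0]))
```

Holding the run `s[j:j_end]` while the matching r tuples are scanned corresponds to the assumption that all tuples for one join value fit in memory.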
10 Hash Join: the Idea
11 Hash Join
- For equi-joins and natural joins only
- A hash function h depending only on the join attributes is used to partition the tuples of both relations; h maps JoinAttrs values to {0, 1, ..., n}
  - $r_0, r_1, \dots, r_n$ are partitions of r tuples; each tuple $t_r \in r$ is put in partition $r_i$, where $i = h(t_r[JoinAttrs])$
  - $s_0, s_1, \dots, s_n$ are partitions of s tuples; each tuple $t_s \in s$ is put in partition $s_i$, where $i = h(t_s[JoinAttrs])$
- r tuples in $r_i$ need only be compared with s tuples in $s_i$
12 Setting Parameters of Hash Joins
- Algorithm: relation s is the build input, relation r is the probe input
  - Partition the relation s using hash function h; when partitioning a relation, one block of memory is reserved as the output buffer for each partition
  - Partition r similarly
  - For each i do
    - Load $s_i$ into memory and build an in-memory hash index on it using the join attribute; this hash index uses a different hash function than the earlier one, h
    - Read the tuples in $r_i$ from disk one by one
    - For each tuple $t_r$, locate each matching tuple $t_s$ in $s_i$ using the in-memory hash index
    - Output the concatenation of their attributes
- n and the hash function h are chosen such that each $s_i$ fits in memory
  - Use the smaller input relation as the build relation
  - The probe relation partitions $r_i$ need not fit in memory
  - Typically n is chosen as $\lceil b_s / M \rceil \cdot f$, where f is a fudge factor, typically around 1.2
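The partition, build, and probe phases can be sketched in a few lines. Everything lives in memory here, Python's built-in `hash` plays the role of h, and a dict serves as the in-memory hash index (so the two hash functions are not truly independent as the algorithm asks); partitions are numbered 0 to n-1. All of these are simplifications for illustration.

```python
from collections import defaultdict

def partition(relation, key, n):
    # Split a relation into n partitions by hashing the join attribute.
    parts = [[] for _ in range(n)]
    for t in relation:
        parts[hash(key(t)) % n].append(t)
    return parts

def hash_join(r, s, key_r, key_s, n):
    r_parts = partition(r, key_r, n)   # probe input
    s_parts = partition(s, key_s, n)   # build input
    result = []
    for r_i, s_i in zip(r_parts, s_parts):
        index = defaultdict(list)      # build: in-memory index on s_i
        for t_s in s_i:
            index[key_s(t_s)].append(t_s)
        for t_r in r_i:                # probe: only s_i can match r_i
            for t_s in index.get(key_r(t_r), ()):
                result.append(t_r + t_s)
    return result

r = [(1, "a"), (2, "b"), (3, "c"), (2, "d")]
s = [(2, "x"), (3, "y"), (4, "z")]
print(sorted(hash_join(r, s, lambda t: t[0], lambda t: t[0], n=3)))
```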
13 Recursive Partitioning, Overflow
- Recursive partitioning: if the number of partitions n is greater than the number of pages M of memory, then instead of partitioning n ways, use M-1 partitions for s
  - Further partition the M-1 partitions using a different hash function; use the same partitioning method on r
  - (Rarely required)
- Hash table overflow: a partition cannot fit in memory
  - Caused by many tuples with the same value for the join attributes, or by a bad hash function
  - Partitioning is said to be skewed if some partitions have significantly more tuples than others
- Overflow resolution in the build phase
  - Partition $s_i$ is further partitioned using a different hash function
  - Partition $r_i$ must be similarly partitioned
- Overflow avoidance
  - Perform the partitioning carefully to avoid overflows during the build phase
  - E.g., partition the build relation into many partitions, then combine them
- Both approaches fail with large numbers of duplicates
  - Fallback option: use block nested-loop join on the overflowed partitions
14 Performance Analysis
- Without recursive partitioning: $3(b_r + b_s) + 4 n_h$ block transfers, where $n_h$ is the number of partitions
- With recursive partitioning: $2(b_r + b_s) \lceil \log_{M-1}(b_s) - 1 \rceil + b_r + b_s$
  - Passes needed to partition s: $\lceil \log_{M-1}(b_s) - 1 \rceil$
  - The cost of partitioning r is similar, so it is best to choose the smaller relation as the build relation
- If the entire build input can be kept in main memory, do not partition the relations into temporary files
  - Cost: $b_r + b_s$
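The three hash-join cost cases can be put side by side. The sketch below only encodes the formulas from this slide; the function names and the sample relation sizes are invented for illustration.

```python
import math

def hash_join_cost(b_r, b_s, n_h):
    # Read and write both relations while partitioning, read them again to join,
    # plus up to 4 * n_h transfers for partially filled partition blocks.
    return 3 * (b_r + b_s) + 4 * n_h

def recursive_hash_join_cost(b_r, b_s, M):
    # Each partitioning pass reads and writes both relations once.
    passes = math.ceil(math.log(b_s, M - 1) - 1)
    return 2 * (b_r + b_s) * passes + b_r + b_s

def in_memory_hash_join_cost(b_r, b_s):
    # Build input fits in memory: each relation is read exactly once.
    return b_r + b_s

print(hash_join_cost(b_r=400, b_s=100, n_h=5))
print(recursive_hash_join_cost(b_r=2000, b_s=500, M=11))
print(in_memory_hash_join_cost(b_r=400, b_s=100))
```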
15 Hybrid Hash Join
- Join the first partitions while partitioning the tables
  - Partition relation s, keeping the first partition $s_0$ in main memory
  - Partition relation r, joining the tuples in $r_0$ with $s_0$ in main memory
  - No need to store $s_0$ and $r_0$ on disk
- Most useful if $M \gg \sqrt{b_s}$
16 Complex Joins
- Join with a conjunctive condition $r \bowtie_{\theta_1 \wedge \theta_2 \wedge \dots \wedge \theta_n} s$
  - Either use nested loops/block nested loops, or
  - Compute the result of one of the simpler joins $r \bowtie_{\theta_i} s$
    - The final result comprises those tuples in the intermediate result that satisfy the remaining conditions $\theta_1 \wedge \dots \wedge \theta_{i-1} \wedge \theta_{i+1} \wedge \dots \wedge \theta_n$
- Join with a disjunctive condition $r \bowtie_{\theta_1 \vee \theta_2 \vee \dots \vee \theta_n} s$
  - Either use nested loops/block nested loops, or
  - Compute as the union of the records in the individual joins $r \bowtie_{\theta_i} s$: $(r \bowtie_{\theta_1} s) \cup (r \bowtie_{\theta_2} s) \cup \dots \cup (r \bowtie_{\theta_n} s)$
17 Sort-Based Algorithms
Operators                                         Memory cost               I/O cost
Duplicate elimination, grouping and aggregation   SQRT(B)                   3B
Union, intersection and difference                SQRT(B(R)+B(S))           3(B(R)+B(S))
Merge-join                                        SQRT(MAX(B(R), B(S)))     5(B(R)+B(S))
Merge-join (improved)                             SQRT(B(R)+B(S))           3(B(R)+B(S))
18 Hash-Based Algorithms
Operators                                         Memory cost               I/O cost
Duplicate elimination, grouping and aggregation   SQRT(B)                   3B
Union, intersection and difference                SQRT(MIN(B(R),B(S)))      3(B(R)+B(S))
Simple hash-join                                  SQRT(MIN(B(R),B(S)))      3(B(R)+B(S))
Hash-join (improved)                              SQRT(MIN(B(R),B(S)))
19 Duplicate Elimination Using Sorting
- Pass 1: sort the tuples into sublists
- Pass 2: use the available main memory to hold one block from each sorted sublist; repeatedly copy one tuple to the output and ignore all tuples identical to it
- I/O cost: 3B(R)
  - B(R) to read each block of R when creating the sorted sublists
  - B(R) to write each of the sorted sublists to disk
  - B(R) to read each block from the sublists back to generate the final results
- Memory usage
  - Each sublist can have up to M blocks (why?)
  - Up to M sublists can be processed in the second pass
20 Example
[Figure: duplicate elimination by sorting. The inputs 2, 5, 2, 1, 2, 2; 4, 5, 4, 3, 4, 2; and 1, 5, 2, 1, 3 are sorted into sublists on disk, then merged through memory, with each distinct value output once.]
Answer: 1, 2, 3, 4, 5
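The two-pass procedure can be replayed on the example's input values. In this sketch a sublist is just a Python list and `heapq.merge` stands in for reading one block from each sublist at a time; the function name and the `sublist_size` parameter are illustrative.

```python
import heapq

def distinct_two_pass(tuples, sublist_size):
    # Pass 1: cut the input into memory-sized chunks and sort each one.
    sublists = [sorted(tuples[i:i + sublist_size])
                for i in range(0, len(tuples), sublist_size)]
    # Pass 2: merge the sorted sublists, emitting each value only once.
    output = []
    for t in heapq.merge(*sublists):
        if not output or output[-1] != t:
            output.append(t)
    return output

data = [2, 5, 2, 1, 2, 2, 4, 5, 4, 3, 4, 2, 1, 5, 2, 1, 3]
print(distinct_two_pass(data, sublist_size=6))
```

Because the merged stream is sorted, duplicates are always adjacent, so comparing against the last emitted value is enough.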
21 Duplicate Elimination Using Hashing
- Hash R to M-1 buckets
  - Two duplicate tuples will be hashed to the same bucket
- Eliminate duplicates in each bucket
  - Assumption: each bucket can fit into main memory
- Memory usage: each bucket holds about B(R)/(M-1) blocks and must fit in M blocks, so M must be about SQRT(B(R))
- I/O cost: 3B(R), that is, B(R) in each of the three phases: reading in, writing the hashing result, and processing each bucket
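A minimal in-memory sketch of the hash-based method follows. Buckets are Python lists rather than disk files, and the built-in `hash` plays the role of the hash function; both are assumptions for illustration.

```python
def distinct_by_hashing(tuples, M):
    # Phase 1: hash every tuple into one of M - 1 buckets.
    buckets = [[] for _ in range(M - 1)]
    for t in tuples:
        buckets[hash(t) % (M - 1)].append(t)
    # Phase 2: duplicates share a bucket, so each bucket is cleaned on its own.
    output = []
    for bucket in buckets:
        seen = set()
        for t in bucket:
            if t not in seen:
                seen.add(t)
                output.append(t)
    return output

data = [2, 5, 2, 1, 2, 2, 4, 5, 4, 3, 4, 2, 1, 5, 2, 1, 3]
print(sorted(distinct_by_hashing(data, M=4)))
```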
22 Grouping and Aggregation
- The sort-based method is similar to duplicate elimination
  - Please study the algorithm and analysis by yourself
- The hash-based method
  - Use a hash function depending only on the grouping attributes to hash all tuples to M-1 buckets
  - Scan each bucket once to compute the groups
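The hash-based method can be sketched for a SUM aggregate. The choice of SUM, the in-memory buckets, and the function and parameter names are assumptions for illustration; the slides do not fix an aggregate function.

```python
def hash_group_sum(tuples, group_key, value, M):
    # Hash on the grouping attribute only, so a group never spans buckets.
    buckets = [[] for _ in range(M - 1)]
    for t in tuples:
        buckets[hash(group_key(t)) % (M - 1)].append(t)
    # Scan each bucket once, accumulating one running total per group.
    totals = {}
    for bucket in buckets:
        for t in bucket:
            g = group_key(t)
            totals[g] = totals.get(g, 0) + value(t)
    return totals

rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(hash_group_sum(rows, lambda t: t[0], lambda t: t[1], M=4))
```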
23 Union Algorithms
- How to compute $R \cup S$?
- The sort-based algorithm
  - Sort R and S respectively using the same order
  - Merge the sorted sublists, removing duplicates
- The hash-based method
  - Hash S and R into buckets $R_1, \dots, R_{M-1}$ and $S_1, \dots, S_{M-1}$ using the same hash function
  - Compute $R_i \cup S_i$ for each i; the union of the buckets is the result
- Intersection and difference can be computed in a similar way
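The hash-based union can be sketched with one set per bucket. Buckets are in-memory Python sets and the built-in `hash` is used for both relations; these are illustrative simplifications, and set semantics (duplicates removed) is assumed.

```python
def hash_union(R, S, M):
    n = M - 1
    # Hash both relations with the same function so equal tuples meet in bucket i.
    R_buckets = [set() for _ in range(n)]
    S_buckets = [set() for _ in range(n)]
    for t in R:
        R_buckets[hash(t) % n].add(t)
    for t in S:
        S_buckets[hash(t) % n].add(t)
    # Union bucket by bucket; buckets are disjoint, so results simply concatenate.
    result = []
    for R_i, S_i in zip(R_buckets, S_buckets):
        result.extend(R_i | S_i)
    return result

R = [1, 2, 2, 3]
S = [3, 4, 4, 5]
print(sorted(hash_union(R, S, M=4)))
```

Replacing `R_i | S_i` with `R_i & S_i` or `R_i - S_i` gives the corresponding sketch for set intersection or difference.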
More informationChapter 13. Disk Storage, Basic File Structures, and Hashing
Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible Hashing
More informationQuestions 1 through 25 are worth 2 points each. Choose one best answer for each.
Questions 1 through 25 are worth 2 points each. Choose one best answer for each. 1. For the singly linked list implementation of the queue, where are the enqueues and dequeues performed? c a. Enqueue in
More informationIntroduction to Querying & Reporting with SQL Server
1800 ULEARN (853 276) www.ddls.com.au Introduction to Querying & Reporting with SQL Server Length 5 days Price $4169.00 (inc GST) Overview This five-day instructor led course provides students with the
More informationFHE DEFINITIVE GUIDE. ^phihri^^lv JEFFREY GARBUS. Joe Celko. Alvin Chang. PLAMEN ratchev JONES & BARTLETT LEARN IN G. y ti rvrrtuttnrr i t i r
: 1. FHE DEFINITIVE GUIDE fir y ti rvrrtuttnrr i t i r ^phihri^^lv ;\}'\^X$:^u^'! :: ^ : ',!.4 '. JEFFREY GARBUS PLAMEN ratchev Alvin Chang Joe Celko g JONES & BARTLETT LEARN IN G Contents About the Authors
More informationPartitioning under the hood in MySQL 5.5
Partitioning under the hood in MySQL 5.5 Mattias Jonsson, Partitioning developer Mikael Ronström, Partitioning author Who are we? Mikael is a founder of the technology behind NDB
More informationChapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification
Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries
More informationA Comparison of Dictionary Implementations
A Comparison of Dictionary Implementations Mark P Neyer April 10, 2009 1 Introduction A common problem in computer science is the representation of a mapping between two sets. A mapping f : A B is a function
More informationBinary Heap Algorithms
CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005 2009 Glenn G. Chappell
More informationProTrack: A Simple Provenance-tracking Filesystem
ProTrack: A Simple Provenance-tracking Filesystem Somak Das Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology das@mit.edu Abstract Provenance describes a file
More information1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.
1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D. base address 2. The memory address of fifth element of an array can be calculated
More informationMapReduce for Data Warehouses
MapReduce for Data Warehouses Data Warehouses: Hadoop and Relational Databases In an enterprise setting, a data warehouse serves as a vast repository of data, holding everything from sales transactions
More informationHeaps & Priority Queues in the C++ STL 2-3 Trees
Heaps & Priority Queues in the C++ STL 2-3 Trees CS 3 Data Structures and Algorithms Lecture Slides Friday, April 7, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks
More informationWhy Query Optimization? Access Path Selection in a Relational Database Management System. How to come up with the right query plan?
Why Query Optimization? Access Path Selection in a Relational Database Management System P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, T. Price Peyman Talebifard Queries must be executed and execution
More informationBig Data With Hadoop
With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
More informationQuestion 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks]
EXAMINATIONS 2005 MID-YEAR COMP 302 Database Systems Time allowed: Instructions: 3 Hours Answer all questions. Make sure that your answers are clear and to the point. Write your answers in the spaces provided.
More informationPrevious Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles
B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:
More informationENHANCEMENTS TO SQL SERVER COLUMN STORES. Anuhya Mallempati #2610771
ENHANCEMENTS TO SQL SERVER COLUMN STORES Anuhya Mallempati #2610771 CONTENTS Abstract Introduction Column store indexes Batch mode processing Other Enhancements Conclusion ABSTRACT SQL server introduced
More informationConverting a Number from Decimal to Binary
Converting a Number from Decimal to Binary Convert nonnegative integer in decimal format (base 10) into equivalent binary number (base 2) Rightmost bit of x Remainder of x after division by two Recursive
More informationOutline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb
Binary Search Trees Lecturer: Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 27 1 Properties of Binary Search Trees 2 Basic BST operations The worst-case time complexity of BST operations
More informationINTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium.
Chapter 4: Record Storage and Primary File Organization 1 Record Storage and Primary File Organization INTRODUCTION The collection of data that makes up a computerized database must be stored physically
More informationCopyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1
Slide 13-1 Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible
More informationB-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees
B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:
More information