I/O-Efficient Spatial Data Structures for Range Queries
|
|
- Emmeline Walters
- 1 years ago
- Views:
Transcription
1 I/O-Efficient Spatial Data Structures for Range Queries Lars Arge Kasper Green Larsen MADALGO, Department of Computer Science, Aarhus University, Denmark 1 Introduction Range reporting is a one of the most fundamental topics in spatial databases and computational geometry. In this class of problems, the input consists of a set of geometric objects, such as points, line segments, rectangles etc. The goal is to preprocess the input set into a data structure, such that given a query range, one can efficiently report all input objects intersecting the range. The ranges most commonly considered are axis-parallel rectangles, halfspaces, points, simplices and balls. In this survey, we focus on the planar orthogonal range reporting problem in the external memory model of Aggarwal and Vitter [2]. Here the input consists of a set of N points in the plane, and the goal is to support reporting all points inside an axis-parallel query rectangle. We use B to denote the disk block size in number of points. The cost of answering a query is measured in the number of I/Os performed and the space of the data structure is measured in the number of disk blocks occupied, hence linear space is O(N/B) disk blocks. Outline. In Section 2, we set out by reviewing the classic B-tree for solving onedimensional orthogonal range reporting, i.e. given N points on the real line and a query interval q = [q 1, q 2 ], report all T points inside q. In Section 3 we present optimal solutions for planar orthogonal range reporting, and finally in Section 4, we briefly discuss related range searching problems. 2 The B-tree A B-tree [10] is constructed in the following manner from a sequence of N points on the real line: First sort the points according to their coordinates and partition them Center for Massive Data Algorithmics, a Center of the Danish National Research Foundation. 1
2 into N/B groups of B consecutive points each. Each group is stored in one disk block and we think of these blocks as the leaves of the B-tree. Each leaf node v is naturally associated to an x-range X v = (x 1 v, x 2 v] where x 1 v is the largest coordinate stored in the leaf preceding v (x 1 v = for the first leaf) and x 2 v is the largest coordinate of a point in v (we let x 2 v = for the last leaf). We call the coordinates defining these intervals splitter values and we assume without loss of generality that all coordinates are distinct. To guide searches, we construct a B-ary tree on top of the sorted sequence of leaves. Each internal node v is again associated to an x-range X v, which is the union of the x-ranges of the leaves in the subtree rooted at v. We augment each internal node with the O(B) splitter values of its children and store the splitter values in O(1) disk blocks. To answer a query q = [q 1, q 2 ], we start at the root node of the B-tree and traverse the two paths towards the leaves whose x-ranges contain q 1 and q 2, respectively. These paths are easily determined from the splitter values stored in the nodes along the paths. For each node v on the paths we do the following: We examine the x-range of all children of v, and for each child with an x-range completely inside q, we traverse all the leaves of the corresponding subtree and report all points stored there. Clearly this procedure correctly reports all points in the query range. Following the two paths towards the leaves containing q 1 and q 2 costs O(log B N) I/Os. For every subtree that is traversed because its x-range is inside q, we charge the traversal to the output size, i.e. this costs at most O(T/B) I/Os. Hence we conclude that the B-tree uses linear space and answers queries in O(log B N + T/B) I/Os. When solving higher-dimensional orthogonal range reporting, the aim is to obtain a performance that is comparable to that of the B-tree, i.e. O(log B N + T/B) query cost and as close to linear space as possible. 3 Orthogonal Range Queries Solutions for planar orthogonal range reporting are typically obtained by first obtaining a solution for a restricted type of query ranges, known as 3-sided queries. A 3-sided query q = (q 1, q 2, q 3 ) asks to report all input points (x, y) with q 1 x q 2 and y q 3. In Section 3.1, we show how to answer these 3-sided queries in external memory. In Section 3.2 we then extend these results to standard query ranges, which we refer to as 4-sided queries. 3.1 Three-Sided Queries In the following, we present a data structure for solving 3-sided planar range reporting: Preprocess a set S of point in the plane such that given a 3-sided query q, we can report all points in S that are inside q. 2
3 A solution to the 3-sided range query problem can be obtained using an external priority search tree [5]. A basic building block in the external priority search tree is an efficient solution for 3-sided range queries when the number of input points is only O(B 2 ). More specifically, we assume for now that for a set of K = O(B 2 ) points, one can construct a linear space data structure such that 3-sided range queries can be answered in O(1 + T/B) I/Os. We refer to such a base data structure as a B 2 structure. In the following, we first describe the external priority search tree simply assuming the availability of this B 2 structure. We then move on to describe the B 2 structure at the end of the section. Structure. An external priority search tree consists of a base B-tree T on the x-coordinates of the points in S. Recall from Section 2 that T stores all points in the leaves and that internal nodes store only splitter values. For the external priority search tree, we keep the splitter values in the internal nodes, but also move the points from the leaves into the internal nodes by the following recursive procedure: Start at the root node v of T and collect for each child v i of v, the B points with highest y-coordinates (if existing) amongst all points stored in the leaves of the subtree rooted at v i. We move these points out of their leaves and store the O(B 2 ) points collected for all children of v in a B 2 structure. This B 2 structure is stored at v. We finally recurse on all O(B) children of v. Observe that when this procedure terminates at the leaves, all points stored in the subtree rooted at a node v still have an x-coordinate in the x-range of v. Overall the external priority search tree uses linear space since T uses linear space and since each point is stored in precisely one B 2 structure. Query. To answer a 3-sided query q = (q 1, q 2, q 3 ) we start at the root of T and proceed recursively to the appropriate subtrees: When visiting a node v we first query the B 2 structure and report the relevant points. Then we advance the search to some of the children of v. The search is advanced to a child v i if it is either along the search path for q 1 or q 2, or if the entire set of points corresponding to v i in the B 2 structure were reported. The query procedure reports all points in the query range, since if we do not visit a child v i with an x-range completely contained in the interval [q 1, q 2 ], it means that at least one of the points in the B 2 structure corresponding to v i is not in q. Since all points corresponding to v i have an x- coordinate in the query range, there must be at least one of the points that have a y-coordinate smaller than q 3. This in turn means that none of the points in the subtree rooted at v i can be in q since they all have even smaller y-coordinates. That we use O(log B N + T/B) I/Os to answer a query can be seen as follows. In each internal node v of T visited by the query procedure we spend O(1 + T v /B) I/Os, where T v is the number of points reported at v. There are O(log B N) nodes visited on the search paths in T to the leaf containing q 1 and the leaf containing q 2, and thus the number of I/Os used in these nodes adds up to O(log B N +T/B). Each remaining visited internal node v in T is not on the search paths but it is visited because Θ(B) points corresponding to v were reported in its parent. Thus the cost 3
4 of visiting these nodes adds up to O(T/B). Theorem 1. An external priority search tree on a set of N points in the plane uses linear space and answers 3-sided range queries in O(log B N + T/B) I/Os. Three-Sided Queries on O(B 2 ) Points. In the following, we describe the B 2 structure. The data structure is based on a sweep line approach commonly used in solving geometric problems. Structure. We start by sorting the K = O(B 2 ) input points according to their x-coordinates. We then partition this sequence into O(K/B) groups of Θ(B) consecutive points. We store the points of each group in one disk block. We think of the blocks as being ordered in the natural way (i.e. according to the x-coordinates of the points inside). Then we conceptually sweep a horizontal line from y = to and merge the groups as the sweep line passes through the input points. More specifically, we start by marking all groups as active. We then raise the sweep line from y = and each time it passes through a point we check whether there are any two consecutive active groups for which both groups contain less than B/2 points above the sweep line. If this is the case, we form a new active group and fill it with the (at most B) points that are above the sweep line in the two groups. We then store the points in the newly formed group in one disk block and mark the two old groups as inactive. This process continues until no point remains above the sweep line. Since we start out with O(K/B) active groups, and we reduce the total number of active groups by one each time we create one new group, it follows that the total number of groups formed is O(K/B). Thus the stored disk blocks occupy only linear space. Each group g formed by the above procedure, including the initial groups, have a natural associated active y-interval [y 1, y 2 ] where y 1 is the y-coordinate of the point that caused g to be formed (we let y 1 = for the initial groups) and y 2 is the y-coordinate of the point that caused g to be inactive (y 2 = for the last group). Each group also has an associated x-range, which is simply the range containing the x-coordinates of the points in the group. Query. To answer the 3-sided query q = (q 1, q 2, q 3 ), we consider the groups whose active y-interval include q 3, that is, the groups that were active when the sweep line was at y = q 3. To answer the query, we assume for now that we can find the subset of these groups whose x-ranges intersect the interval [q 1, q 2 ]. If there are k such groups, we know that at least k 2 of them have an x-range completely inside [q 1, q 2 ]. Since we merge two consecutive groups when both have less than B/2 points above the sweep line, it follows that the query range contains Ω((k 2)B) points. Thus we can afford to spend O(1) I/Os for each intersected group to retrieve the corresponding disk block and report the subset of points that are also in the query range. What remains is to describe how we find the intersected groups. For this, recall that the total number of groups formed is only O(K/B) = O(B). Thus we can 4
5 store O(1) catalog blocks containing the active y-intervals and x-ranges of all the groups plus pointers to the corresponding disk blocks. We thus answer the query by first reading these catalog blocks to find the desired groups. We then follow the stored pointers to the corresponding disk blocks and finish as described above. This completes the description of the B 2 structure. 3.2 Four-Sided Queries In this section we show how the results for 3-sided queries can be extended to solve 4-sided queries. This extension is based on the classic data structures known as range trees [8]. Structure. The data structure consists of a balanced binary tree T with the N input points stored in sorted order of x-coordinates in the leaves. Each node is naturally associated to an x-range as in Section 2. For each internal node v of T, we store two data structures for answering 3-sided queries on the points stored in the leaves of the subtree rooted at v, one for query ranges of the form [q 1, ) [q 2, q 3 ] and one for query ranges of the form (, q 1 ] [q 2, q 3 ]. Note that such data structures can be obtained from the data structure presented in Section 3.1 by a simple geometric transformation of the input points. Since each point is stored in at most two linear space 3-sided data structures for each node on a root-to-leaf path in T, we conclude that the space usage is O(N/B log N). Queries. To answer a 4-sided query q = [q 1, q 2 ] [q 3, q 4 ] we start at the root node v of T. If q 1 and q 2 are contained in the x-range of the same child of v, we recursively visit that child. If this procedure ends in a leaf node, we answer the query simply by examining the associated point. Otherwise, let v be the internal node for which q 1 and q 2 lies in the x-range of different children. At v, the query decomposes into two 3-sided queries, one of the form [q 1, ) [q 3, q 4 ] on the 3-sided data structure stored for the left child of v, and one of the form (, q 2 ] [q 3, q 4 ] on the 3-sided data structure stored for the right child of v. By blocking the tree T appropriately, finding the node v can be done in O(log B N) I/Os, hence the total query cost is at most O(log B N + T/B) I/Os. Space Improvements. The solution for 4-sided queries presented above can be slightly improved. The idea is to change from a binary range tree, to a tree of degree α for a parameter α > 2. This decreases the height of the tree to log N/ log α and thus the space becomes O(N/B log N/ log α). Increasing the degree raises another problem however; at the node v where q 1 and q 2 lie in the x-range of two different children there might be up to α children whose x-range is intersected by [q 1, q 2 ]. Thus the query decomposes into two 3-sided queries and up to α one-dimensional queries. By elegant ideas, the one-dimensional queries can be solved jointly in O(α + T/B) I/Os [5]. Thus by setting α = log B N one arrives at a data structure using O(N/B log N/ log log B N) space and answering queries in O(log B N + T/B) I/Os. Surprisingly, this space bound has been shown to be optimal for any query time of the form O(log c B N + T/B), where c > 0 is any arbitrary constant [12, 5]. 5
6 Theorem 2. There exists a data structure that uses O(N/B log N/ log log B N) space and answers 4-sided range queries in O(log B N + T/B) I/Os on a set of N points in the plane. Linear Space. There also exists a number of linear space solutions for 4-sided queries. The simplest solution is a generalization of the kd-tree [9] structure to external memory [7]. This data structure stores each input point exactly once and answers queries in O( N/B + T/B) I/Os. This query time has been shown to be optimal when each point can be stored only once [13]. Dynamization. In the above, we have only focused on a static set of input points that we must preprocess into a data structure. If insertions and deletions of points are also to be supported, several new ideas are needed to make the above solutions efficient. For 3-sided queries, Arge et al. [5] showed how to dynamize the external priority search tree such that insertions and deletions can be supported in O(log B N) I/Os while maintaining the same space and query cost. Using the range tree idea, this also gave a dynamic data structure for 4-sided queries, which supports insertions and deletions in O(log B N log N/ log log B N) I/Os. The external memory kd-tree can be dynamized to support insertions and deletions in O(log 2 B N) I/Os, while maintaining O( N/B + T/B) query I/Os and still storing each point exactly once. This update cost can be improved to O(log B N) using the O-tree structure [13]. 3.3 Higher Dimensions For d-dimensional orthogonal range reporting, i.e. d-dimensional points and d- dimensional axis-parallel query rectangles, one can adapt the O-tree such that it answers queries in O((N/B) 1 1/d + T/B) I/Os. This is optimal if points can be stored only once [13]. If one is willing to spend super-linear space, the best known data structure uses O(N/B (log N/ log log B N) d 1 ) space and answers queries in O(log B N(log N/ log log B N) d 2 + T/B) I/Os [1]. The space bound is optimal if the query cost is O(log c B N +T/B), where c > 0 is any arbitrary constant [1]. For threedimensions, there is also another tradeoff with optimal O(log B N +T/B) query cost, using O(N(log N/ log log B N) 3 ) space [1]. 4 Other Range Searching Problems In many applications, the input data set does not consist of points, but rather of line segments, axis-parallel rectangles etc. Here we mention two fundamental problems where the input consists of a set of two-dimensional axis-parallel rectangles. 6
7 In the rectangle stabbing problem, we are to support reporting all rectangles containing a query point. Very surprisingly, it has been shown that this problem cannot be solved in near-linear space and O(log B N + T/B) query cost. In fact, the best possible query time with O(N/B log c N) space is just O(log N/ log log N +T/B) I/Os, where c > 0 is an arbitrary constant [6]. This is an interesting example where it is not possible to take advantage of blocking. In the rectangle-rectangle reporting problem, a query is specified by an axisparallel rectangle and the goal is to report all rectangles intersecting it. The PRtree [4] is a data structure that stores each input rectangle exactly once and supports queries in O( N/B + T/B) I/Os. Since a point is a special case of a rectangle, this bound is obviously optimal when rectangles can be stored only once. For further details and discussion of other range searching problems, we refer the reader to surveys [3, 11, 14, 7]. References [1] P. Afshani, L. Arge, and K. D. Larsen. Orthogonal range reporting in three and higher dimensions. In Proc. 50th IEEE Symposium on Foundations of Computer Science, pages , [2] A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31: , [3] L. Arge. External memory geometric data structures, Lecture notes available at large/ios06/ionotes.pdf. [4] L. Arge, M. D. Berg, H. Haverkort, and K. Yi. The priority r-tree: A practically efficient and worst-case optimal r-tree. ACM Transactions on Algorithms, 4(1):9:1 9:30, [5] L. Arge, V. Samoladas, and J. S. Vitter. On two-dimensional indexability and optimal range search indexing. In Proc. 18th ACM Symposium on Principles of Database Systems, pages , [6] L. Arge, V. Samoladas, and K. Yi. Optimal external-memory planar point enclosure. In Proc. 12th European Symposium on Algorithms, pages 40 52, [7] M. J. Atallah and M. Blanton, editors. Algorithms and theory of computation handbook: general concepts and techniques. Chapman & Hall/CRC, 2 edition, [8] J. L. Bentley. Multidimensional divide-and-conquer. Communications of the ACM, 23(4): ,
8 [9] M. de Berg and O. Cheong and M. van Kreveld and M. Overmars. Computational Geometry: Algorithms and Applications. Springer-Verlag, 3rd edition, Chapter 10. [10] D. Comer. The ubiquitous B-tree. ACM Computing Surveys, 11(2): , [11] V. Gaede and O. Günther. Multidimensional access methods. ACM Computing Surveys, 30(2): , [12] J. M. Hellerstein, E. Koutsoupias, D. P. Miranker, C. H. Papadimitriou, and V. Samoladas. On a model of indexability and its bounds for range queries. Journal of the ACM, 49(1):35 55, [13] K. V. R. Kanth and A. K. Singh. Optimal dynamic range searching in nonreplicating index structures. In Proc. International Conference on Database Theory, pages , [14] J. S. Vitter. Algorithms and data structures for external memory. Foundations and Trends in Theoretical Computer Science, 2(4): ,
External Memory Geometric Data Structures
External Memory Geometric Data Structures Lars Arge Department of Computer Science University of Aarhus and Duke University Augues 24, 2005 1 Introduction Many modern applications store and process datasets
Computational Geometry
Range trees 1 Range queries Database queries salary G. Ometer born: Aug 16, 1954 salary: $3,500 A database query may ask for all employees with age between a 1 and a 2, and salary between s 1 and s 2 2
Dynamic 3-sided Planar Range Queries with Expected Doubly Logarithmic Time
Dynamic 3-sided Planar Range Queries with Expected Doubly Logarithmic Time Gerth Stølting Brodal 1, Alexis C. Kaporis 2, Spyros Sioutas 3, Konstantinos Tsakalidis 1, Kostas Tsichlas 4 1 MADALGO, Department
Range Searching using Kd Tree.
Range Searching using Kd Tree. Hemant M. Kakde. 25-08-2005 1 Introduction to Range Searching. At first sight it seems that database has little to do with geometry. The queries about data in database can
A Note on Maximum Independent Sets in Rectangle Intersection Graphs
A Note on Maximum Independent Sets in Rectangle Intersection Graphs Timothy M. Chan School of Computer Science University of Waterloo Waterloo, Ontario N2L 3G1, Canada tmchan@uwaterloo.ca September 12,
Computational Geometry. Lecture 1: Introduction and Convex Hulls
Lecture 1: Introduction and convex hulls 1 Geometry: points, lines,... Plane (two-dimensional), R 2 Space (three-dimensional), R 3 Space (higher-dimensional), R d A point in the plane, 3-dimensional space,
Computational Geometry
and range trees Lecture 6: and range trees Lecture 7: and range trees Database queries 1D range trees Databases Databases store records or objects Personnel database: Each employee has a name, id code,
Cache-Oblivious Red-Blue Line Segment Intersection
Cache-Oblivious Red-Blue Line Segment Intersection Lars Arge 1,, Thomas Mølhave 1,, and Norbert Zeh 2, 1 MADALGO, Department of Computer Science, University of Aarhus, Denmark. E-mail: {large,thomasm}@madalgo.au.dk
Higher-dimensional Orthogonal Range Reporting and Rectangle Stabbing in the Pointer Machine Model
Higher-dimensional Orthogonal Range Reporting and Rectangle Stabbing in the Pointer Machine Model Peyman Afshani peyman@cs.au.dk Lars Arge large@cs.au.dk MADALGO Department of Computer Science Aarhus University
Internet Packet Filter Management and Rectangle Geometry
Internet Packet Filter Management and Rectangle Geometry David Eppstein S. Muthukrishnan Abstract We consider rule sets for internet packet routing and filtering, where each rule consists of a range of
SOLVING DATABASE QUERIES USING ORTHOGONAL RANGE SEARCHING
SOLVING DATABASE QUERIES USING ORTHOGONAL RANGE SEARCHING CPS234: Computational Geometry Prof. Pankaj Agarwal Prepared by: Khalid Al Issa (khalid@cs.duke.edu) Fall 2005 BACKGROUND : RANGE SEARCHING, a
Data Warehousing und Data Mining
Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data
Big Data and Scripting. Part 4: Memory Hierarchies
1, Big Data and Scripting Part 4: Memory Hierarchies 2, Model and Definitions memory size: M machine words total storage (on disk) of N elements (N is very large) disk size unlimited (for our considerations)
Quadtrees. Yuping(Allan) Lu CS Graph Theory April 23, 2014
Quadtrees Yuping(Allan) Lu CS 594 - Graph Theory April 23, 2014 Agenda 1. 2. 3. 4. 5. 6. 7. Definitions History Examples Applications Open Problems References Homework Definitions A quadtree is a tree
Data Structures for Moving Objects
Data Structures for Moving Objects Pankaj K. Agarwal Department of Computer Science Duke University Geometric Data Structures S: Set of geometric objects Points, segments, polygons Ask several queries
Orthogonal Range Searching in 2D with Ball Inheritance Mads Ravn,
Orthogonal Range Searching in 2D with Ball Inheritance Mads Ravn, 20071580 Master s Thesis, Computer Science June 2015 Advisor: Kasper Green Larsen AARHUS AU UNIVERSITY DEPARTMENT OF COMPUTER SCIENCE ii
Converting a Number from Decimal to Binary
Converting a Number from Decimal to Binary Convert nonnegative integer in decimal format (base 10) into equivalent binary number (base 2) Rightmost bit of x Remainder of x after division by two Recursive
Binary Space Partitions
Title: Binary Space Partitions Name: Adrian Dumitrescu 1, Csaba D. Tóth 2,3 Affil./Addr. 1: Computer Science, Univ. of Wisconsin Milwaukee, Milwaukee, WI, USA Affil./Addr. 2: Mathematics, California State
Colored Range Searching on Internal Memory
Colored Range Searching on Internal Memory Haritha Bellam, Saladi Rahul, and Krishnan Rajan Lab for Spatial Informatics, IIIT-Hyderabad, Hyderabad, India Univerity of Minnesota, Minneapolis, MN, USA Abstract.
Binary Space Partitions for Axis-Parallel Line Segments: Size-Height Tradeoffs
Binary Space Partitions for Axis-Parallel Line Segments: Size-Height Tradeoffs Sunil Arya Department of Computer Science The Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong
Divide-and-Conquer. Three main steps : 1. divide; 2. conquer; 3. merge.
Divide-and-Conquer Three main steps : 1. divide; 2. conquer; 3. merge. 1 Let I denote the (sub)problem instance and S be its solution. The divide-and-conquer strategy can be described as follows. Procedure
1 Point Inclusion in a Polygon
Comp 163: Computational Geometry Tufts University, Spring 2005 Professor Diane Souvaine Scribe: Ryan Coleman Point Inclusion and Processing Simple Polygons into Monotone Regions 1 Point Inclusion in a
Multi-Way Search Trees (B Trees)
Multi-Way Search Trees (B Trees) Multiway Search Trees An m-way search tree is a tree in which, for some integer m called the order of the tree, each node has at most m children. If n
Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child
Binary Search Trees Data in each node Larger than the data in its left child Smaller than the data in its right child FIGURE 11-6 Arbitrary binary tree FIGURE 11-7 Binary search tree Data Structures Using
Closest Pair Problem
Closest Pair Problem Given n points in d-dimensions, find two whose mutual distance is smallest. Fundamental problem in many applications as well as a key step in many algorithms. p q A naive algorithm
Optimal Binary Space Partitions in the Plane
Optimal Binary Space Partitions in the Plane Mark de Berg and Amirali Khosravi TU Eindhoven, P.O. Box 513, 5600 MB Eindhoven, the Netherlands Abstract. An optimal bsp for a set S of disjoint line segments
A traditional random access machine
How many Libraries of Congress per second can your server handle? Avery Brooks, IBM television commercial, 2001 1 Introduction Computer scientists traditionally analyze the running time of an algorithm
EE602 Algorithms GEOMETRIC INTERSECTION CHAPTER 27
EE602 Algorithms GEOMETRIC INTERSECTION CHAPTER 27 The Problem Given a set of N objects, do any two intersect? Objects could be lines, rectangles, circles, polygons, or other geometric objects Simple to
Persistent Data Structures and Planar Point Location
Persistent Data Structures and Planar Point Location Inge Li Gørtz Persistent Data Structures Ephemeral Partial persistence Full persistence Confluent persistence V1 V1 V1 V1 V2 q ue V2 V2 V5 V2 V4 V4
Persistent Data Structures and Planar Point Location. Inge Li Gørtz
Persistent Data Structures and Planar Point Location Inge Li Gørtz Persistent Data Structures Ephemeral Partial persistence Full persistence Confluent persistence V1 V1 V1 V1 V2 q ue V2 V2 V5 V2 V4 V4
Two General Methods to Reduce Delay and Change of Enumeration Algorithms
ISSN 1346-5597 NII Technical Report Two General Methods to Reduce Delay and Change of Enumeration Algorithms Takeaki Uno NII-2003-004E Apr.2003 Two General Methods to Reduce Delay and Change of Enumeration
1 2-3 Trees: The Basics
CS10: Data Structures and Object-Oriented Design (Fall 2013) November 1, 2013: 2-3 Trees: Inserting and Deleting Scribes: CS 10 Teaching Team Lecture Summary In this class, we investigated 2-3 Trees in
Line Segment Intersection
Chapter 1 Line Segment Intersection (with material from [1], [3], and [5], pictures are missing) 1.1 Interval case We first think of having n intervals (e.g., the x-range of horizontal line segments) and
CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992
CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992 Original Lecture #6: 28 January 1991 Topics: Triangulating Simple Polygons
External-Memory Algorithms for Processing Line Segments in Geographic Information Systems
BRICS RS-96-12 Arge et al.: External-Memory Algorithms for Processing Line Segments in GIS BRICS Basic Research in Computer Science External-Memory Algorithms for Processing Line Segments in Geographic
Data storage Tree indexes
Data storage Tree indexes Rasmus Pagh February 7 lecture 1 Access paths For many database queries and updates, only a small fraction of the data needs to be accessed. Extreme examples are looking or updating
Fast Sequential Summation Algorithms Using Augmented Data Structures
Fast Sequential Summation Algorithms Using Augmented Data Structures Vadim Stadnik vadim.stadnik@gmail.com Abstract This paper provides an introduction to the design of augmented data structures that offer
CS Indexing for Data Models with Constraints and Classes. Paris Kanellakis, Sridhar Ramaswamy, Darren Vengroff, Jeffrey Scott Vitter
CS--1994--20 Indexing for Data Models with Constraints and Classes Paris Kanellakis, Sridhar Ramaswamy, Darren Vengroff, Jeffrey Scott Vitter Department of Computer Science Duke University Durham, North
Analysis of Algorithms I: Binary Search Trees
Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary
R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants
R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions
Out-Of-Core Algorithms for Scientific Visualization and Computer Graphics
Out-Of-Core Algorithms for Scientific Visualization and Computer Graphics Course Notes for IEEE Visualization 2002 Boston, Massachusetts October 2002 Organizer: Cláudio T. Silva Oregon Health & Science
Models and Techniques for Proving Data Structure Lower Bounds
Models and Techniques for Proving Data Structure Lower Bounds Kasper Green Larsen PhD Dissertation Department of Computer Science Aarhus University Denmark Models and Techniques for Proving Data Structure
Masters Thesis Computing the Visibility Graph of Points Within a Polygon. Jesper Buch Hansen
Masters Thesis Computing the Visibility Graph of Points Within a Polygon Jesper Buch Hansen December 22, 2005 Department of Computer Science, University of Aarhus Student Jesper Buch Hansen Supervisor
Vector storage and access; algorithms in GIS. This is lecture 6
Vector storage and access; algorithms in GIS This is lecture 6 Vector data storage and access Vectors are built from points, line and areas. (x,y) Surface: (x,y,z) Vector data access Access to vector
Solving Geometric Problems with the Rotating Calipers *
Solving Geometric Problems with the Rotating Calipers * Godfried Toussaint School of Computer Science McGill University Montreal, Quebec, Canada ABSTRACT Shamos [1] recently showed that the diameter of
Binary Heap Algorithms
CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005 2009 Glenn G. Chappell
Data Structures Using C++ 2E. Chapter 11 Binary Trees and B-Trees
Data Structures Using C++ 2E Chapter 11 Binary Trees and B-Trees Binary Trees Definition: a binary tree, T, is either empty or such that T has a special node called the root node T has two sets of nodes,
Computational Geometry
Motivation Motivation Polygons and visibility Visibility in polygons Triangulation Proof of the Art gallery theorem Two points in a simple polygon can see each other if their connecting line segment is
Computational Geometry
Motivation 1 Motivation Polygons and visibility Visibility in polygons Triangulation Proof of the Art gallery theorem Two points in a simple polygon can see each other if their connecting line segment
Persistent Data Structures
6.854 Advanced Algorithms Lecture 2: September 9, 2005 Scribes: Sommer Gentry, Eddie Kohler Lecturer: David Karger Persistent Data Structures 2.1 Introduction and motivation So far, we ve seen only ephemeral
Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb
Binary Search Trees Lecturer: Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 27 1 Properties of Binary Search Trees 2 Basic BST operations The worst-case time complexity of BST operations
Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)
Sorting revisited How did we use a binary search tree to sort an array of elements? Tree Sort Algorithm Given: An array of elements to sort 1. Build a binary search tree out of the elements 2. Traverse
Tufts University, Spring 2005 Michael Horn, Julie Weber (2004) Voronoi Diagrams
Comp 163: Computational Geometry Professor Diane Souvaine Tufts University, Spring 2005 Michael Horn, Julie Weber (2004) Voronoi Diagrams 1 Voronoi Diagrams 1 Consider the following problem. To determine
Geometric Algorithms. range search quad and kd trees intersection search VLSI rules check. References: Algorithms in C (2nd edition), Chapters 26-27
Geometric Algorithms range search quad and kd trees intersection search VLSI rules check References: Algorithms in C (2nd edition), Chapters 26-27 http://www.cs.princeton.edu/introalgsds/73range http://www.cs.princeton.edu/introalgsds/74intersection
Balanced Trees Part One
Balanced Trees Part One Balanced Trees Balanced trees are surprisingly versatile data structures. Many programming languages ship with a balanced tree library. C++: std::map / std::set Java: TreeMap /
Homework Exam 1, Geometric Algorithms, 2016
Homework Exam 1, Geometric Algorithms, 2016 1. (3 points) Let P be a convex polyhedron in 3-dimensional space. The boundary of P is represented as a DCEL, storing the incidence relationships between the
Problem. Indexing with B-trees. Indexing. Primary Key Indexing. B-trees. B-trees: Example. primary key indexing
Problem Given a large collection of records, Indexing with B-trees find similar/interesting things, i.e., allow fast, approximate queries Anastassia Ailamaki http://www.cs.cmu.edu/~natassa 2 Indexing Primary
Existence of Simple Tours of Imprecise Points
Existence of Simple Tours of Imprecise Points Maarten Löffler Department of Information and Computing Sciences, Utrecht University Technical Report UU-CS-007-00 www.cs.uu.nl ISSN: 09-7 Existence of Simple
Parametric R-Tree: An Index Structure for Moving Objects
Parametric R-Tree: An Index Structure for Moving Objects Mengchu Cai IBM Silicon Valley Laboratory 555 Bailey Ave. San Jose, CA 95141, USA mcai@us.ibm.com Peter Revesz Dept. of Computer Science and Engineering
Announcements. CSE332: Data Abstractions. Lecture 9: B Trees. Today. Our goal. M-ary Search Tree. M-ary Search Tree. Ruth Anderson Winter 2011
Announcements CSE2: Data Abstractions Project 2 posted! Partner selection due by 11pm Tues 1/25 at the latest. Homework due Friday Jan 28 st at the BEGINNING of lecture Lecture 9: B Trees Ruth Anderson
24: Polygon Partition - Faster Triangulations
24: Polygon Partition - Faster Triangulations CS 473u - Algorithms - Spring 2005 April 15, 2005 1 Previous lecture We described triangulations of simple polygons. See Figure 1 for an example of a polygon
MATHEMATICAL ENGINEERING TECHNICAL REPORTS. The Best-fit Heuristic for the Rectangular Strip Packing Problem: An Efficient Implementation
MATHEMATICAL ENGINEERING TECHNICAL REPORTS The Best-fit Heuristic for the Rectangular Strip Packing Problem: An Efficient Implementation Shinji IMAHORI, Mutsunori YAGIURA METR 2007 53 September 2007 DEPARTMENT
Cache-Oblivious Search Trees via Trees of Small Height
Cache-Oblivious Search Trees via Trees of Small Height Gerth Stølting Brodal BRICS University of Aarhus Joint work with Rolf Fagerberg and Riko Jacob To be presented at the 13th Annual ACM-SIAM Symposium
Balanced search trees
Lecture 8 Balanced search trees 8.1 Overview In this lecture we discuss search trees as a method for storing data in a way that supports fast insert, lookup, and delete operations. (Data structures handling
Symbol Tables. IE 496 Lecture 13
Symbol Tables IE 496 Lecture 13 Reading for This Lecture Horowitz and Sahni, Chapter 2 Symbol Tables and Dictionaries A symbol table is a data structure for storing a list of items, each with a key and
Indexing Techniques in Data Warehousing Environment The UB-Tree Algorithm
Indexing Techniques in Data Warehousing Environment The UB-Tree Algorithm Prepared by: Yacine ghanjaoui Supervised by: Dr. Hachim Haddouti March 24, 2003 Abstract The indexing techniques in multidimensional
Monotone Partitioning. Polygon Partitioning. Monotone polygons. Monotone polygons. Monotone Partitioning. ! Define monotonicity
Monotone Partitioning! Define monotonicity Polygon Partitioning Monotone Partitioning! Triangulate monotone polygons in linear time! Partition a polygon into monotone pieces Monotone polygons! Definition
A Scalable Algorithm for Maximizing Range Sum in Spatial Databases
A Scalable Algorithm for Maximizing Range Sum in Spatial Databases Dong-Wan Choi Chin-Wan Chung, Yufei Tao,3 Department of Computer Science, KAIST, Daejeon, Korea Division of Web Science and Technology,
From Last Time: Remove (Delete) Operation
CSE 32 Lecture : More on Search Trees Today s Topics: Lazy Operations Run Time Analysis of Binary Search Tree Operations Balanced Search Trees AVL Trees and Rotations Covered in Chapter of the text From
Lecture: #9-10. Binary Trees
Lecture: #9-10 Binary Trees Topics 1. Definition and Application of Binary Trees 2. Binary Search Tree Operations 3. Template Considerations for Binary Search Trees 19-2 1. Definition and Application of
Balanced Binary Search Tree
AVL Trees / Slide 1 Balanced Binary Search Tree Worst case height of binary search tree: N-1 Insertion, deletion can be O(N) in the worst case We want a tree with small height Height of a binary tree with
A Survey of 3sum-Hard Problems
A Survey of 3sum-Hard Problems James King king@cs.ubc.ca December 20, 2004 1 Introduction The 3sum problem is defined as follows: given a set S of n integers, do there exist three elements {a, b, c} S
Computational Geometry
Lecture 10: 1 Lecture 10: Voronoi diagram Spatial interpolation Given some trees, seen from above, which region will they occupy? 2 Lecture 10: Voronoi diagram Spatial interpolation Given some trees, seen
Binary Search Trees CMPSC 122
Binary Search Trees CMPSC 122 Note: This notes packet has significant overlap with the first set of trees notes I do in CMPSC 360, but goes into much greater depth on turning BSTs into pseudocode than
Query Processing, optimization, and indexing techniques
Query Processing, optimization, and indexing techniques What s s this tutorial about? From here: SELECT C.name AS Course, count(s.students) AS Cnt FROM courses C, subscription S WHERE C.lecturer = Calders
Chapter 7. Multiway Trees. Data Structures and Algorithms in Java
Chapter 7 Multiway Trees Data Structures and Algorithms in Java Objectives Discuss the following topics: The Family of B-Trees Tries Case Study: Spell Checker Data Structures and Algorithms in Java 2 Multiway
Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles
B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:
INTERSECTION OF LINE-SEGMENTS
INTERSECTION OF LINE-SEGMENTS Vera Sacristán Computational Geometry Facultat d Informàtica de Barcelona Universitat Politècnica de Catalunya Problem Input: n line-segments in the plane, s i = (p i, q
Binary Heaps * * * * * * * / / \ / \ / \ / \ / \ * * * * * * * * * * * / / \ / \ / / \ / \ * * * * * * * * * *
Binary Heaps A binary heap is another data structure. It implements a priority queue. Priority Queue has the following operations: isempty add (with priority) remove (highest priority) peek (at highest
ICS 434 Advanced Database Systems
ICS 434 Advanced Database Systems Dr. Abdallah Al-Sukairi sukairi@kfupm.edu.sa Second Semester 2003-2004 (032) King Fahd University of Petroleum & Minerals Information & Computer Science Department Outline
Algorithms. Algorithms GEOMETRIC APPLICATIONS OF BSTS. 1d range search line segment intersection kd trees interval search trees rectangle intersection
Algorithms ROBERT SEDGEWICK KEVIN WAYNE GEOMETRIC APPLICATIONS OF BSTS Algorithms F O U R T H E D I T I O N ROBERT SEDGEWICK KEVIN WAYNE 1d range search line segment intersection kd trees interval search
! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions
Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions Basic Steps in Query
Tree-Based Indexes for Image Data
Tree-Based Indexes for Image Data Leonard Brown Le Gruenwald The University of Oklahoma School of Computer Science Norman, OK, 73019 lbrown@cs.ou.edu gruenwal@cs.ou.edu Abstract As in conventional DataBase
15.1 Introduction The Disk-based Environment B-tree Definition B-tree Query B-tree Insertion B-tree Deletion
15 B Trees Donghui Zhang Northeastern University 15.1 Introduction............................................ 15-1 15.2 The Disk-based Environment........................ 15-2 15.3 The B-tree..............................................
Computational Geometry: Line segment intersection
: Line segment intersection Panos Giannopoulos Wolfgang Mulzer Lena Schlipf AG TI SS 2013 Tutorial room change: 055 this building!!! (from next monday on) Outline Motivation Line segment intersection (and
Shortest Inspection-Path. Queries in Simple Polygons
Shortest Inspection-Path Queries in Simple Polygons Christian Knauer, Günter Rote B 05-05 April 2005 Shortest Inspection-Path Queries in Simple Polygons Christian Knauer, Günter Rote Institut für Informatik,
Multiway Search Tree (MST)
Multiway Search Tree (MST) Generalization of BSTs Suitable for disk MST of order n: Each node has n or fewer sub-trees S1 S2. Sm, m n Each node has n-1 or fewer keys K1 Κ2 Κm-1 : m-1 keys in ascending
INTERSECTION OF LINE-SEGMENTS
INTERSECTION OF LINE-SEGMENTS Vera Sacristán Discrete and Algorithmic Geometry Facultat de Matemàtiques i Estadística Universitat Politècnica de Catalunya Problem Input: n line-segments in the plane,
root node level: internal node edge leaf node CS@VT Data Structures & Algorithms 2000-2009 McQuain
inary Trees 1 A binary tree is either empty, or it consists of a node called the root together with two binary trees called the left subtree and the right subtree of the root, which are disjoint from each
Spatial Partitioning and Indexing Responsible persons: Claudia Dolci, Dante Salvini, Michael Schrattner, Robert Weibel
Geographic Information Technology Training Alliance (GITTA) presents: Spatial Partitioning and Indexing Responsible persons: Claudia Dolci, Dante Salvini, Michael Schrattner, Robert Weibel Content 1.
Steven Skiena. skiena
Lecture 3: Recurrence Relations (1997) Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena Argue the solution to T(n) =
B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees
B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:
Comparison Sort. - In Merge Sort, the merge procedure chooses an item from one of two arrays after comparing the top items from both arrays.
Comparison Sort A Comparison Sort is a sorting algorithm where the final order is determined only by comparisons between the input elements. - In Insertion Sort, the proper position to insert the current
Lecture 1: Data Storage & Index
Lecture 1: Data Storage & Index R&G Chapter 8-11 Concurrency control Query Execution and Optimization Relational Operators File & Access Methods Buffer Management Disk Space Management Recovery Manager
Merge Sort. 2004 Goodrich, Tamassia. Merge Sort 1
Merge Sort 7 2 9 4 2 4 7 9 7 2 2 7 9 4 4 9 7 7 2 2 9 9 4 4 Merge Sort 1 Divide-and-Conquer Divide-and conquer is a general algorithm design paradigm: Divide: divide the input data S in two disjoint subsets
12. Visibility Graphs and 3-Sum Lecture on Monday 9 th
12. Visibility Graphs and 3-Sum Lecture on Monday 9 th November, 2009 by Michael Homann 12.1 Sorting all Angular Sequences. Theorem 12.1 Consider a set P of n points in the plane.
Database 2 Lecture II. Alessandro Artale
Free University of Bolzano Database 2. Lecture II, 2003/2004 A.Artale (1) Database 2 Lecture II Alessandro Artale Faculty of Computer Science Free University of Bolzano Room: 221 artale@inf.unibz.it http://www.inf.unibz.it/
Heaps & Priority Queues in the C++ STL 2-3 Trees
Heaps & Priority Queues in the C++ STL 2-3 Trees CS 3 Data Structures and Algorithms Lecture Slides Friday, April 7, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks
Binary Coded Web Access Pattern Tree in Education Domain
Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi
Data Structures Fibonacci Heaps, Amortized Analysis
Chapter 4 Data Structures Fibonacci Heaps, Amortized Analysis Algorithm Theory WS 2012/13 Fabian Kuhn Fibonacci Heaps Lacy merge variant of binomial heaps: Do not merge trees as long as possible Structure: