# Fast Sequential Summation Algorithms Using Augmented Data Structures

## Transcription

Vadim Stadnik

### Abstract

This paper provides an introduction to the design of augmented data structures that offer an efficient representation of a mathematical sequence and fast sequential summation algorithms, which guarantee both logarithmic running time and logarithmic computational cost in terms of the total number of operations. In parallel summation algorithms, logarithmic running time is achieved at a high, linear computational cost and with a linear number of processors. The important practical advantage of the fast sequential summation algorithms is that they do not require supercomputers and can run on the cheapest single processor systems.

### Layout

1. Motivation
2. Parallel Summation Algorithms
3. Choice of Data Structure
4. Geometric Interpretation
5. Representation of Sequence
6. Fast Sequential Summation Algorithms
7. Mean Value and Standard Deviation
8. Other Data Structures

### 1. Motivation

The original version of the fast sequential summation algorithms described here was implemented in C++ using data structures that support interfaces of STL containers [1]. These algorithms can be implemented in many other programming languages. This discussion provides an introduction to the design of data structures that significantly improve the performance of summation algorithms.

The fast summation algorithms have been developed by applying the method of augmenting data structures [2]. This powerful method can be used to solve a wide range of problems. It helps design advanced data structures that support more efficient algorithms than basic data structures. The main disadvantage of this method is its complexity, which makes it difficult to use in practice. For this reason, we consider the fast summation algorithms from perspectives that simplify the explanation.
The discussion should also help better understand the design principles for quite complex augmented data structures. Here we focus on the efficient representation of a mathematical sequence and the efficient summation algorithms. For the implementation of update operations on augmented data structures, we refer the reader to the detailed description of the method of augmenting in [2]. To highlight the main ideas, we use simplified variants of the data structures originally implemented in C++.

### 2. Parallel Summation Algorithms

Summation algorithms are fundamental to a wide range of applications, from scientific to commercial. The linear running time of the standard sequential summation is acceptable for processing small data sets, but can become a bottleneck for large and very large data sets. Any improvement in the running time of summation algorithms in performance critical applications is of significant importance.

Recent progress in computer technology has enabled the development of effective parallel summation algorithms. Most modern computers have a very small number of processors compared to a typical number of elements processed. The running time of parallel summation on these computers is linear, with the optimal improvement over the standard sequential summation proportional to the number of processors. This performance improvement might be significant for some applications, but not acceptable for others. If an algorithm has a performance bottleneck caused by the standard sequential summation, the bottleneck will not be removed by the parallel summation, which has the same asymptotic running time.

The barrier of linear running time can be broken using the ideal parallel machine. The summation algorithm shown in Figure 1 computes the sum in logarithmic time.

Figure 1: A diagram of a parallel summation algorithm for 16 elements. The algorithm takes 4 time steps and performs 15 additions.

A brief discussion of this simple parallel summation algorithm is useful for gaining insight into efficient sequential algorithms. The ideal parallel summation algorithm achieves a considerable performance gain by applying the divide and conquer technique, which avoids the sequential processing of a large number of elements and its linear running time. The algorithm uses system processors to divide a given data set into pairs of adjacent elements.
The sums of all of the pairs are computed simultaneously and become elements of a new data set to be processed at the next stage of the computations. Each stage of processing decreases the number of elements and involved processors by a factor of 2. The last stage computes the total sum of all of the elements in the given data set. The running time of this algorithm is determined by the number of stages, which is logarithmic in the number of elements.
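The staged pairwise scheme described above can be simulated on a single processor to check the numbers quoted for Figure 1. The following is a minimal sketch, not the author's code: the function name `pairwise_sum` and the handling of an odd leftover element are assumptions made for illustration. Each pass of the outer loop corresponds to one time step of the ideal parallel machine.

```python
def pairwise_sum(values):
    """Simulate staged pairwise summation; return (total, stages, additions)."""
    level = list(values)
    if not level:
        return 0, 0, 0
    stages = 0
    additions = 0
    while len(level) > 1:
        nxt = []
        # All pair sums within one stage are independent of each other, so an
        # ideal parallel machine could compute them in a single time step.
        for i in range(0, len(level) - 1, 2):
            nxt.append(level[i] + level[i + 1])
            additions += 1
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # assumption: an odd element is carried over
        level = nxt
        stages += 1
    return level[0], stages, additions


total, stages, additions = pairwise_sum(range(1, 17))
print(total, stages, additions)  # 136 4 15
```

For 16 elements the simulation reports 4 stages and 15 additions, which agrees with the caption of Figure 1. The number of additions stays linear in the number of elements even though the number of stages is logarithmic.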

One of the main features of the parallel summation algorithm is that its logarithmic running time is achieved at a high computational cost and, thus, high energy consumption. The total number of operations performed by the parallel algorithm may be even greater than the number of operations performed by the standard sequential algorithm [3].

Another drawback of the parallel summation algorithm is that it is designed to compute one value, the total sum of all of the elements in a data set, and throws away all of the results computed at intermediate stages. When the processed data set is modified by a small number of operations, such as appending new elements, many intermediate results remain valid and might be reused in subsequent computations. The recomputation of all of the valid data, which is performed by the parallel summation algorithm, incurs additional computational cost and seems to be unnecessary.

This observation leads to the problem of developing an optimal summation algorithm that performs the minimal number of operations by using precomputed data and avoiding the recomputation of valid data when a data set is modified. The fast sequential summation algorithms described below provide a solution to this problem. These algorithms guarantee both logarithmic running time and logarithmic computational cost in terms of the total number of operations. They do not require a supercomputer and can run on the cheapest single processor systems, including mobile devices.

### 3. Choice of Data Structure

The implementation of the new summation algorithms requires a data structure that stores and efficiently updates the precomputed data. The choice of the most suitable data structure plays a critical role in the design of such algorithms. In general, the maximum efficiency of an algorithm is limited by the complexity and effectiveness of structural relationships between data items stored in a data structure.
For example, both linked lists and balanced trees can store the same data set. A list provides search operations with linear running time only, whereas a tree guarantees significantly better logarithmic running time. The sequential summation algorithms that operate on linked lists have the same performance characteristics. These basic data structures cannot achieve a running time better than linear. The performance of the sequential summation can be improved using data structures that are more complex than a linked list or an array. The discussed parallel summation algorithm suggests a data structure that can support the fast sequential summation algorithms. The diagram of the dynamics of parallel computations (see Figure 1) shows the dependencies between the summation operations as a directed acyclic graph. The structural relationships between intermediate results computed by the parallel algorithm at current and next stages are the same as child-parent links in a binary tree. This property of the parallel summation algorithm implies that the implementation of the fast sequential summation algorithms might be based on a binary tree. This is a good option, but not the only one. There are several other data structures suitable for this task. In the context of the development of the fast sequential summation algorithms, a B+ tree has a number of advantages over other types of trees. One of them is the globally balanced structure. This feature of B+ trees simplifies the design and development of new summation algorithms by using the similarity with the parallel algorithm. A data set stored at one level of a B+ tree can be regarded as a counterpart

corresponding sum. This drawing method is not used here to make the geometric representation more compact.

In contrast to the horizontal dimension, the vertical dimension of the data structure shown does not exist physically. It is only a feature of the schematic representation of this data structure. This dimension is used to separate horizontally linked lists of different levels and to show the child-parent links that connect these levels. The vertical child-parent links are perpendicular to line segments at any level and, thus, their projections on the horizontal dimension do not have lengths. At any internal level, the start and end points of a parent line segment coincide with the start point of its first child segment and the end point of its last child segment, respectively. The relationship between a parent node and its children can be regarded as a partition of the parent line segment into a polygonal chain of its child line segments.

The deepest level of such a geometric data structure stores line segments that correspond to all of the elements in the original sequence. At all other levels, each line segment has a length equal to the sum of the lengths of its child line segments. When an algorithm moves up, the number of line segments at the current level quickly decreases and the length of each line segment increases. A line segment at an internal level covers a large number of consecutive line segments at the deepest level. The total length of a polyline at each level remains constant, even though the numbers and the lengths of line segments may vary quite significantly.

In terms of processing the original sequence, the length of a line segment at an internal level of the geometric data structure represents a sum of a large number of original consecutive elements. The line segment at the top level has a length equal to the total length of all line segments at the deepest level.
This result is equivalent to the top node in the original data structure storing the sum of all of the elements. The geometric data structure of line segments helps explain the following simple rules for the development of new sequential summation algorithms:

1. When an algorithm traverses a horizontal link from the current to the next node (from left to right in Figure 3), it adds the length of the visited line segment. This traversal is equivalent to the addition operation.
2. When an algorithm traverses a horizontal link from the current to the previous node (from right to left in Figure 3), it subtracts the length of the visited line segment. This traversal is equivalent to the subtraction operation.
3. A traversal of a vertical link has no effect on the value of an accumulated sum.

The fast sequential summation algorithms can be developed using the observation that the line segments at high levels of this data structure provide the total length for large numbers of consecutive line segments at the deepest level. Thus, using high levels would be beneficial for the performance of such algorithms. Figure 4 illustrates the idea of the top down processing that finds a sum of the first consecutive elements. For comparison, this figure also shows the sequential summation of elements in a basic linked list.
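The top down idea can be sketched with a deliberately simplified stand-in for the tree of sums. In the sketch below, which is an illustration under stated assumptions rather than the data structure of the original C++ implementation, each level is a plain Python list and every parent covers a fixed group of `group` children, so a node at level `k` spans `group ** k` elements and no stored counts are needed. The names `build_levels` and `prefix_sum` are hypothetical. Moving right along a level adds the stored sum (rule 1), and moving down changes nothing (rule 3).

```python
def build_levels(values, group=4):
    """Level 0 holds the elements; each higher level holds sums of groups."""
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + group])
                       for i in range(0, len(prev), group)])
    return levels


def prefix_sum(levels, n, group=4):
    """Sum of the first n elements; also return the number of segments used."""
    total = 0      # accumulated sum (length of the path so far)
    covered = 0    # number of elements already covered by the path
    idx = 0        # index of the current node within the current level
    segments = 0
    for level in range(len(levels) - 1, -1, -1):
        span = group ** level          # elements covered by a full node here
        nodes = levels[level]
        # Rule 1: moving right adds the whole segment, as long as it fits.
        while idx < len(nodes) and covered + span <= n:
            total += nodes[idx]
            covered += span
            idx += 1
            segments += 1
        idx *= group                   # rule 3: moving down adds nothing
    return total, segments


values = list(range(1, 65))
levels = build_levels(values)
print(prefix_sum(levels, 23))  # (276, 5)
```

With 64 elements and groups of four, the sum of the first 23 elements is obtained from 5 stored segments instead of 23 individual elements. The exact number of segments depends on the group size and on the shape of the tree, so it need not match the count shown in Figure 4.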

Figure 4: A geometric illustration of the idea of efficient sequential summation using a B+ tree with precomputed data. A sum of first elements can be computed by top down traversal of the tree. The path drawn in red has 4 horizontal line segments with the total length equal to the total length of 23 line segments in a linked list shown at the bottom.

The optimal implementation of the top down algorithm can be expected to find the minimum possible number of line segments that cover the corresponding polyline of the deepest level. The algorithm visits all of the levels of the data structure. The number of levels is logarithmic in the number of sequence elements. The number of nodes visited at each level does not exceed the maximum size of a single group. Thus, it can be expected that the proposed data structure supports an algorithm that computes the sum of any number of first consecutive elements of a sequence in logarithmic time.

### 5. Representation of Sequence

The proper implementation of summation algorithms based on the ideas discussed so far faces the following problems. First of all, it is desirable to develop a summation algorithm that operates on any consecutive elements, not just the first ones. The second problem is that the data structure, which stores summation data only, does not yet have a method to identify the elements to be processed. Unlike basic search trees, the value or key of an element cannot be used as an identifier for the element. The summation rules are formulated for sequences, which can contain elements with multiple and unique values in sorted or random order.

A mathematical sequence is formally defined as a function that maps an element of the set of nonnegative integers, which represent ranks, into an element of a given set. Each element in a sequence is uniquely identified by its rank. Note that the rank of an element is also called the subscript, the index or the position in linear order.
Any consecutive elements of a sequence are defined through the ranks of the first and last elements. The formal definition requires implementing an efficient data structure that stores elements of a given set and provides access to any element specified by its rank.

The proposed data structure meets the requirements for a sequence, since it stores elements in a linked list of the deepest level. A doubly linked list also provides optimal constant time access to previous and next elements. The problem for the data structure is how to support the rank of each element with efficiency sufficient for the fast summation algorithm. The straightforward method of counting elements in a sequence using the links between the nodes in the list of the deepest level is not acceptable. This method has linear running time. It is too slow for the fast summation algorithm, which is expected to run in logarithmic time.

An interesting and probably surprising fact is that the idea of the discussed fast sequential summation algorithm using the proposed data structure hints at the solution to this problem. This conclusion follows from the observation that the rank of an element in a sequence, by definition, is the number of elements preceding this element. In other words, the rank is a special sum, which is also called the count. With the aid of the proposed data structure and the fast summation algorithm, the count and, thus, the rank can be quickly computed if each node at the deepest level stores the integer value 1 (see Figure 5).

Figure 5: A B+ tree that stores counts of nodes at the deepest level. The bottom line shows the ranks of these nodes. Each parent node in this tree has the same number of child nodes as the matching parent node in the tree shown in Figure 2.

Due to the close similarity, the data structure of counts can be regarded as a special variant of the previously discussed data structure of sums (compare Figure 5 with Figure 2). It differs only in the type of data it stores. Its nodes store the counts of nodes at the deepest level, which are equal to the counts of sequence elements, instead of the sums of the values of elements in the same counted external nodes. At high levels, the count in a node represents the number of sequence elements in a sub-tree rooted at this node.
The algorithm for the computation of the rank of a sequence element can be implemented using the idea of the discussed fast sequential summation algorithm (see Figure 4) that operates on the data structure of counts shown in Figure 5. This algorithm represents a top down search for a node at the deepest level that stores a sequence element associated with a given rank. The algorithm uses the total count of nodes at the deepest level. As the algorithm scans through the linked list of a current level, it updates the total count using the counts stored in visited nodes. The algorithm initializes the total count to zero and starts the search at the first node at the top level. The algorithm scans forward through the current level while the total count is less than the given rank. If the current level is the deepest one, the search stops and the algorithm returns the current node. Otherwise the algorithm moves down to the next level and continues the forward scan and search.
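The top down search on counts can be sketched as follows. This is a hedged illustration, not the original implementation: the `Node` class, the `build` helper, the fixed group size and the use of 0-based ranks are all assumptions introduced here. The scan condition is written as "skip a node while its whole sub-tree lies before the requested rank", which is one concrete reading of the forward scan described above.

```python
class Node:
    """A node of a level-linked tree of counts (hypothetical layout)."""

    def __init__(self, count, value=None, children=()):
        self.count = count              # number of elements in the sub-tree
        self.value = value              # element value (deepest level only)
        self.children = list(children)
        self.next = None                # horizontal link within the level


def build(values, group=3):
    """Build the levels bottom-up and return the single top node."""
    level = [Node(1, v) for v in values]   # each deepest node stores count 1
    while True:
        for a, b in zip(level, level[1:]):
            a.next = b                     # link the nodes of this level
        if len(level) <= 1:
            return level[0] if level else None
        level = [Node(sum(c.count for c in level[i:i + group]),
                      children=level[i:i + group])
                 for i in range(0, len(level), group)]


def find_by_rank(top, rank):
    """Return the deepest-level node that holds the element of a 0-based rank."""
    if top is None or not 0 <= rank < top.count:
        raise IndexError("rank out of range")
    node, total = top, 0                   # total = elements skipped so far
    while True:
        # Scan forward while the whole sub-tree lies before the rank.
        while total + node.count <= rank:
            total += node.count
            node = node.next
        if not node.children:              # deepest level reached
            return node
        node = node.children[0]            # move down, total is unchanged


top = build(list("abcdefghij"))
print(find_by_rank(top, 0).value, find_by_rank(top, 9).value)  # a j
```

The search visits at most one group of nodes per level, so its running time is proportional to the number of levels times the group size.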

### 6. Fast Sequential Summation Algorithms

Efficient access to elements does not guarantee high performance of summation algorithms. If a data structure does not implement proper facilities, it can support only sequential summation, which has linear running time. The data structure that supports a fast sequential summation algorithm is more complex than the efficient data structure that implements a general sequence. In order to efficiently support both the summation operations and the rank method, the internal nodes of this data structure must store both sums of the values of elements in a sequence and counts of these elements. Such a data structure is a combination of the sub-structures of sums and counts (see Figure 7).

Figure 7: A B+ tree that provides an efficient representation of a sequence and supports fast sequential summation algorithms. The tree is a combination of the trees shown in Figures 2 and 5. Each node in this tree stores both a count and a sum of the values of elements in a sequence. All of the trees in Figures 2, 5 and 7 have an identical structure.

Despite the fact that, in theory, the sub-structure of sums supersedes the sub-structure of counts, the former does not replace the latter in a summation algorithm. These sub-structures store two different data sets and support different algorithms. The sub-structure of counts stores only non-negative integers, whereas the sub-structure of sums can store elements of any type that supports the addition operation, for example, complex numbers.

A fast sequential summation algorithm involves the facilities of both sub-structures. The sub-structure of counts plays an essential role, since the summation of sequence elements is specified through their ranks. Without the sub-structure of counts that supports the efficient rank method, a summation algorithm cannot efficiently identify the elements to be processed.
The proposed level-linked data structure shown in Figure 7 can support two variants of efficient sequential summation algorithms that are useful in applications. The fast sequential algorithm for a partial sum, which computes the sum of the first n consecutive elements, can be implemented as an extension to the efficient algorithm for the rank of an element in a sequence discussed in Section 5. This summation algorithm employs an additional variable, which represents the total sum of processed sequence elements. The algorithm starts at the first node at the top level and moves to the last node at the deepest level along a path detected by the rank algorithm. As the algorithm visits a node on the path, it uses the sum stored in the node to update the total sum.

With this partial sum algorithm, a sum of any consecutive elements in a sequence can be easily computed as the difference between the sum of the first n elements and the sum of the first m elements. This simple method offers good running time, which is logarithmic in the number of elements in a sequence, but it has a few issues. The method has excessive cost associated with the sum of the first m elements, which is computed twice. This computation is unnecessary to some degree, since this sum does not contribute to the result. In addition to this, there is a possibility that the sum of the first m elements has a huge value compared to the result value. This possibility may cause an accuracy problem in practical applications.

The optimal algorithm computes a sum of only the consecutive elements specified by the lower and upper limits of the summation (see Figure 8). Its implementation avoids the unnecessary computation of a sum of the first m elements by processing that starts and ends at the deepest level of the data structure, which stores all of the elements in a sequence.

Figure 8: A geometric illustration of a fast sequential summation algorithm using a shortest path between two nodes at the deepest level. The algorithm visits 8 nodes (excluding the node associated with the end of the last visited line segment) at the three lowest levels and finds 4 horizontal line segments with the total length equal to the total length of 22 line segments in a linked list shown at the bottom.

This algorithm uses the given summation limits to obtain the ranks of the first and last elements to be processed. These ranks enable the algorithm to identify the nodes at the deepest level that store the first and last elements. The fast summation algorithm uses the sub-structure of counts to find a shortest path (not necessarily unique) from the first node to the last node. This implementation detail of the summation algorithm is similar to the finger search algorithm in search trees [5, 6]. The level-linked data structure provides the set of links between nodes, which is optimized for local search and allows the algorithm to use the minimum number of low levels. The algorithm reaches the top level only when the number of elements to be processed approaches the total number of elements in a sequence.
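The property that matters here, namely that the work starts and ends at the deepest level and climbs only as high as the range requires, can be illustrated with a simplified bottom-up computation. The sketch below is an analogue of the shortest-path idea and not the finger search of the original implementation: it assumes a static array-based layout with a fixed group size `B`, so that ranks map to nodes by integer division, and the names `build_levels` and `range_sum` are hypothetical. It uses only additions; elements before the lower limit are never touched.

```python
B = 4  # assumed fixed group size of this simplified layout


def build_levels(values):
    """Level 0 holds the elements; each higher level holds sums of groups."""
    levels = [list(values)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + B]) for i in range(0, len(prev), B)])
    return levels


def range_sum(levels, m, n):
    """Sum of the elements with ranks m .. n-1; also count segments used."""
    total, segments = 0, 0
    level, lo, hi = 0, m, n
    while lo < hi:
        nodes = levels[level]
        # Add nodes on the left until lo is aligned to a group boundary.
        while lo < hi and lo % B:
            total += nodes[lo]
            lo += 1
            segments += 1
        # Add nodes on the right until hi is aligned to a group boundary.
        while lo < hi and hi % B:
            hi -= 1
            total += nodes[hi]
            segments += 1
        # The remaining nodes form whole groups: continue one level up.
        lo //= B
        hi //= B
        level += 1
    return total, segments


values = list(range(1, 65))
levels = build_levels(values)
print(range_sum(levels, 5, 27))  # (363, 10)
```

In this example 22 elements are summed from 10 stored segments, and only the two lowest levels are used. The number of levels visited grows with the length of the requested range rather than with the length of the whole sequence, which is the behaviour the text attributes to the algorithm based on finger search.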

A shortest path guarantees the most efficient computation of a sum of all of the elements between two end nodes at the deepest level. As the algorithm visits nodes on the path, it uses sums stored in the nodes to compute the total sum. The summation algorithm based on a shortest path performs the minimum number of computational operations.

The summation algorithm based on finger search addresses all of the issues of the partial sum algorithm. Generally, it is more efficient, since its running time is logarithmic in the number of specified consecutive elements rather than in the number of all of the elements in a sequence. The algorithm based on finger search does not sum up elements before the lower summation limit and, thus, provides basically the same accuracy of the result as the standard sequential summation. This advantage can be important in practice to minimize numerical errors.

Compared to the partial sum algorithm, the summation algorithm based on finger search is more general. It can be applied to any consecutive elements in a sequence, not just the first ones. From an implementation perspective, both summation algorithms are quite similar (see Figures 4 and 8). Each algorithm computes a sum using a shortest path from the first node to the last node. The main difference is in the choice of the first node. The partial sum algorithm starts at the first node at the top level, whereas the finger search algorithm starts at the node at the deepest level that corresponds to the lower limit of the summation. For both algorithms, a path terminates at the same last node at the deepest level that corresponds to the upper limit of the summation.

Note also that search operations on counts that support the efficient rank method and shortest path algorithms are similar to search operations on keys in a search tree.
The difference is that a search for a key involves comparisons only, while the search operations on counts use addition and comparison with an accumulated value.

### 7. Mean Value and Standard Deviation

The summation algorithms with logarithmic running time can significantly improve the efficiency of solutions in numerous application areas, such as numerical integration, statistics and data analysis. For example, the computation of important statistical parameters, such as the mean value and the standard deviation, using normal sequential summation takes linear time. With the fast summation algorithms, these parameters can be computed in logarithmic time. This result follows from the fact that the running times of the computation of both parameters are determined by summation operations. The number of involved elements is calculated in constant time through the specified summation limits or the ranks of corresponding elements. This operation has no significant effect on the total running time. The mean value of any consecutive elements is simply computed by dividing an obtained sum by the number of elements.

The standard deviation can be computed simultaneously with the mean value. The straightforward method is to create a new augmented data structure, which implements the fast summation of squared values of sequence elements. The more advanced method is to store the sums of squared values of elements in the nodes of the available data structure that implements a fast summation algorithm. These additional data enable the summation of squared values with the same efficiency as the summation of values of specified consecutive elements. The standard deviation is computed by taking the square root of the variance.

### 8. Other Data Structures

Efficient representations of a mathematical sequence and sequential summation algorithms with logarithmic running time can be devised with various data structures. Several well-known and widely used data structures are briefly considered here.

The proposed data structures and algorithms have been developed using the method of augmenting. A general explanation of this method is provided in [2]. The underlying data structure used in specific examples is a red-black tree. Another application of this method to binary trees is available in [7]. These books show how the method of augmenting a data structure improves the performance of order-statistic queries and range queries. There are also thorough descriptions of update operations on the augmented red-black and binary trees.

In terms of the method of augmenting, a data structure that efficiently supports a sequence implements single augmenting with counts of elements. A data structure that supports a fast sequential summation algorithm implements double augmenting with both counts and sums of element values. Every augmenting uses a basic data structure to build its own sub-structure with a specific data set. Multiple augmenting can be regarded as implementing several data structures within one. For example, the discussed fast computation of the standard deviation is achieved by triple augmenting. The method of augmenting can be applied several times to one data structure to achieve the best efficiency of an algorithm.

The augmented red-black and binary trees [2, 7] efficiently support counting and the rank of an element. These facilities can be used in algorithms for the representation of a mathematical sequence. The fast sequential summation algorithms can be obtained through augmenting that creates a sub-structure of sums of element values.
The second augmenting is not expected to be challenging because of the close similarity of the sub-structures of counts and sums. The augmented red-black and binary trees have, however, one important limitation, namely that they do not provide efficient finger search, which is quite useful in practice for summation algorithms. Another data structure suitable for implementing fast sequential summation algorithms is a randomized skip list. Its simple insertion, deletion and traversal operations significantly reduce the cost of development. In the context of implementing summation algorithms, this data structure has the important advantage that it provides both top down and finger searches. A skip list can support both variants of fast sequential summation algorithms described here, since the efficient rank and summation algorithms are based on these searches in augmented data structures. The basic data structure used here for the description of fast sequential summation algorithms represents a level-linked B+ tree with the minimal number of child-parent links. It is equivalent to a deterministic skip list [8, 9]. The methods of augmenting this basic data structure and a randomized skip list are identical. For augmenting a skip list with the counts of elements, see [10].
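The triple augmenting mentioned above for the standard deviation can be sketched in the same simplified style. The sketch assumes a static array-based layout with a fixed group size `B`, in which ranks are implied by positions, so only the sums and the sums of squares are stored explicitly. It also assumes the population form of the variance, \(\sigma^2 = \frac{1}{k}\sum x_i^2 - \bar{x}^2\), because the formula used in the original text is not available here. All names are hypothetical.

```python
import math

B = 4  # assumed fixed group size of this simplified layout


def build(values):
    """Each node stores a pair (sum, sum of squares) for its sub-tree."""
    levels = [[(v, v * v) for v in values]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([(sum(s for s, _ in prev[i:i + B]),
                        sum(q for _, q in prev[i:i + B]))
                       for i in range(0, len(prev), B)])
    return levels


def range_sums(levels, m, n):
    """Return (sum, sum of squares) of the elements with ranks m .. n-1."""
    s = q = 0
    level, lo, hi = 0, m, n
    while lo < hi:
        nodes = levels[level]
        while lo < hi and lo % B:      # peel unaligned nodes on the left
            s += nodes[lo][0]
            q += nodes[lo][1]
            lo += 1
        while lo < hi and hi % B:      # peel unaligned nodes on the right
            hi -= 1
            s += nodes[hi][0]
            q += nodes[hi][1]
        lo //= B                       # whole groups remain: go one level up
        hi //= B
        level += 1
    return s, q


def mean_std(levels, m, n):
    """Mean and population standard deviation of ranks m .. n-1."""
    k = n - m                          # element count from the limits, O(1)
    if k <= 0:
        raise ValueError("empty range")
    s, q = range_sums(levels, m, n)
    mean = s / k
    variance = max(q / k - mean * mean, 0.0)  # clamp tiny negative rounding
    return mean, math.sqrt(variance)


levels = build([2, 4, 4, 4, 5, 5, 7, 9])
print(mean_std(levels, 0, 8))  # (5.0, 2.0)
```

Both sums are gathered in one traversal, so the mean and the standard deviation cost no more than a single range summation. Note that subtracting the squared mean from the mean of squares can lose precision when the variance is small relative to the mean; this is a property of the formula assumed here, not a statement about the original implementation.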

The benefit of the low cost of the development of an augmented skip list comes at the expense of the disadvantages that might result from the randomization of its structure and loss of balance. A skip list guarantees logarithmic running time of the discussed algorithms on average only. In addition to this, random deviations from a shortest path may lead to a loss in accuracy of the summation result.

### References

1. V. Stadnik. STL extensions based on advanced data structures. https://github.com/vstadnik/stl_ext_adv_review
2. T.H. Cormen, C.E. Leiserson, R.L. Rivest and C. Stein. Introduction to Algorithms. Second Edition, MIT Press and McGraw-Hill, Cambridge (MA).
3. G.E. Blelloch and B.M. Maggs. Parallel Algorithms. In Computer Science Handbook, Second Edition, A.B. Tucker (Editor), CRC Press.
4. D. Comer. The Ubiquitous B-trees. ACM Computing Surveys, 11(2).
5. P. Brass. Advanced Data Structures. Cambridge University Press, Cambridge.
6. G.S. Brodal. Finger Search Trees. In Handbook of Data Structures and Applications, D. Mehta and S. Sahni (Editors), CRC Press.
7. R. Sedgewick and K. Wayne. Algorithms. Fourth Edition, Addison-Wesley, Boston.
8. P. Bose, K. Douïeb, S. Langerman. Dynamic Optimality for Skip Lists and B-Trees. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2008).
9. M.G. Lamoureux and B.G. Nickerson. On the Equivalence of B-trees and Deterministic Skip Lists. University of New Brunswick Technical Report, TR96-102, January.
10. W. Pugh. A skip list cookbook. Technical Report CS-TR, Department of Computer Science, University of Maryland, College Park, MD, July 1989.


### Multimedia Communications. Huffman Coding

Multimedia Communications Huffman Coding Optimal codes Suppose that a i -> w i C + is an encoding scheme for a source alphabet A={a 1,, a N }. Suppose that the source letter a 1,, a N occur with relative

### Optimal Binary Search Trees Meet Object Oriented Programming

Optimal Binary Search Trees Meet Object Oriented Programming Stuart Hansen and Lester I. McCann Computer Science Department University of Wisconsin Parkside Kenosha, WI 53141 {hansen,mccann}@cs.uwp.edu

### Algorithms. Theresa Migler-VonDollen CMPS 5P

Algorithms Theresa Migler-VonDollen CMPS 5P 1 / 32 Algorithms Write a Python function that accepts a list of numbers and a number, x. If x is in the list, the function returns the position in the list

### Lecture 2: The Complexity of Some Problems

IAS/PCMI Summer Session 2000 Clay Mathematics Undergraduate Program Basic Course on Computational Complexity Lecture 2: The Complexity of Some Problems David Mix Barrington and Alexis Maciel July 18, 2000

### A Non-Linear Schema Theorem for Genetic Algorithms

A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland

### A. V. Gerbessiotis CS Spring 2014 PS 3 Mar 24, 2014 No points

A. V. Gerbessiotis CS 610-102 Spring 2014 PS 3 Mar 24, 2014 No points Problem 1. Suppose that we insert n keys into a hash table of size m using open addressing and uniform hashing. Let p(n, m) be the

### Measurement with Ratios

Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical

### CHAPTER 3 Numbers and Numeral Systems

CHAPTER 3 Numbers and Numeral Systems Numbers play an important role in almost all areas of mathematics, not least in calculus. Virtually all calculus books contain a thorough description of the natural,

### - Easy to insert & delete in O(1) time - Don t need to estimate total memory needed. - Hard to search in less than O(n) time

Skip Lists CMSC 420 Linked Lists Benefits & Drawbacks Benefits: - Easy to insert & delete in O(1) time - Don t need to estimate total memory needed Drawbacks: - Hard to search in less than O(n) time (binary

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

### 2. Compressing data to reduce the amount of transmitted data (e.g., to save money).

Presentation Layer The presentation layer is concerned with preserving the meaning of information sent across a network. The presentation layer may represent (encode) the data in various ways (e.g., data

### College Algebra. Barnett, Raymond A., Michael R. Ziegler, and Karl E. Byleen. College Algebra, 8th edition, McGraw-Hill, 2008, ISBN: 978-0-07-286738-1

College Algebra Course Text Barnett, Raymond A., Michael R. Ziegler, and Karl E. Byleen. College Algebra, 8th edition, McGraw-Hill, 2008, ISBN: 978-0-07-286738-1 Course Description This course provides

### NOTES ON DISTANCE- BASED CODING METHODS FOR BINARY TREES. Juha Lehikoinen and Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE

N S ER E S I UN I VER NOTES ON DISTANCE- BASED CODING METHODS FOR BINARY TREES P M TA S A S I T Juha Lehikoinen and Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1994-2 UNIVERSITY

### Efficient Data Structures for Decision Diagrams

Artificial Intelligence Laboratory Efficient Data Structures for Decision Diagrams Master Thesis Nacereddine Ouaret Professor: Supervisors: Boi Faltings Thomas Léauté Radoslaw Szymanek Contents Introduction...

### Monotone Partitioning. Polygon Partitioning. Monotone polygons. Monotone polygons. Monotone Partitioning. ! Define monotonicity

Monotone Partitioning! Define monotonicity Polygon Partitioning Monotone Partitioning! Triangulate monotone polygons in linear time! Partition a polygon into monotone pieces Monotone polygons! Definition

### Introduction Number Systems and Conversion

UNIT 1 Introduction Number Systems and Conversion Objectives 1. Introduction The first part of this unit introduces the material to be studied later. In addition to getting an overview of the material

### VISUAL BASIC COLLECTIONS

VISUAL BASIC COLLECTIONS AND HASH TABLES Thomas Niemann Preface Hash tables offer a method for quickly storing and accessing data based on a key value. When you access a Visual Basic collection using a

### Interactive Math Glossary Terms and Definitions

Terms and Definitions Absolute Value the magnitude of a number, or the distance from 0 on a real number line Additive Property of Area the process of finding an the area of a shape by totaling the areas

### Binary Codes for Nonuniform

Binary Codes for Nonuniform Sources (dcc-2005) Alistair Moffat and Vo Ngoc Anh Presented by:- Palak Mehta 11-20-2006 Basic Concept: In many applications of compression, decoding speed is at least as important

### ascending order decimal denominator descending order Numbers listed from largest to smallest equivalent fraction greater than or equal to SOL 7.

SOL 7.1 ascending order Numbers listed in order from smallest to largest decimal The numbers in the base 10 number system, having one or more places to the right of a decimal point denominator The bottom

### 1. Sorting (assuming sorting into ascending order) a) BUBBLE SORT

DECISION 1 Revision Notes 1. Sorting (assuming sorting into ascending order) a) BUBBLE SORT Make sure you show comparisons clearly and label each pass First Pass 8 4 3 6 1 4 8 3 6 1 4 3 8 6 1 4 3 6 8 1

### CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

CS 2112 Spring 2014 Assignment 3 Data Structures and Web Filtering Due: March 4, 2014 11:59 PM Implementing spam blacklists and web filters requires matching candidate domain names and URLs very rapidly

### The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge, cheapest first, we had to determine whether its two endpoints

### Prentice Hall Mathematics: Course 1 2008 Correlated to: Arizona Academic Standards for Mathematics (Grades 6)

PO 1. Express fractions as ratios, comparing two whole numbers (e.g., ¾ is equivalent to 3:4 and 3 to 4). Strand 1: Number Sense and Operations Every student should understand and use all concepts and

### DATA STRUCTURES USING C

DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give

### CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.

Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure

### MARiO: Multi Attribute Routing in Open Street Map

MARiO: Multi Attribute Routing in Open Street Map Franz Graf Matthias Renz Hans-Peter Kriegel Matthias Schubert Institute for Informatics, Ludwig-Maximilians-Universität München, Oettingenstr. 67, D-80538

### A fairly quick tempo of solutions discussions can be kept during the arithmetic problems.

Distributivity and related number tricks Notes: No calculators are to be used Each group of exercises is preceded by a short discussion of the concepts involved and one or two examples to be worked out

### Symbol Tables. IE 496 Lecture 13

Symbol Tables IE 496 Lecture 13 Reading for This Lecture Horowitz and Sahni, Chapter 2 Symbol Tables and Dictionaries A symbol table is a data structure for storing a list of items, each with a key and

### CHAPTER 3 Number System and Codes

CHAPTER 3 Number System and Codes 3.1 Introduction On hearing the word number, we immediately think of familiar decimal number system with its 10 digits; 0,1, 2,3,4,5,6, 7, 8 and 9. these numbers are called

### Algorithm set of steps used to solve a mathematical computation. Area The number of square units that covers a shape or figure

Fifth Grade CCSS Math Vocabulary Word List *Terms with an asterisk are meant for teacher knowledge only students need to learn the concept but not necessarily the term. Addend Any number being added Algorithm

### 5. Binary objects labeling

Image Processing - Laboratory 5: Binary objects labeling 1 5. Binary objects labeling 5.1. Introduction In this laboratory an object labeling algorithm which allows you to label distinct objects from a

### Data Structure. Lecture 3

Data Structure Lecture 3 Data Structure Formally define Data structure as: DS describes not only set of objects but the ways they are related, the set of operations which may be applied to the elements

### Lecture Notes on Binary Search Trees

Lecture Notes on Binary Search Trees 15-122: Principles of Imperative Computation Frank Pfenning André Platzer Lecture 17 October 23, 2014 1 Introduction In this lecture, we will continue considering associative

### Linear Programming. March 14, 2014

Linear Programming March 1, 01 Parts of this introduction to linear programming were adapted from Chapter 9 of Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest and Stein [1]. 1

### COT5405 Analysis of Algorithms Homework 3 Solutions

COT0 Analysis of Algorithms Homework 3 Solutions. Prove or give a counter example: (a) In the textbook, we have two routines for graph traversal - DFS(G) and BFS(G,s) - where G is a graph and s is any

### Storing Measurement Data

Storing Measurement Data File I/O records or reads data in a file. A typical file I/O operation involves the following process. 1. Create or open a file. Indicate where an existing file resides or where

### BINARY SEARCH TREE VISUALIZATION ALGORITHM

BINARY SEARCH TREE VISUALIZATION ALGORITHM Vadym Borovskiy, Jürgen Müller, Matthieu-Patrick Schapranow, Alexander Zeier Hasso Plattner Institute for Software Systems Engineering {vadym.borovskiy, juergen.mueller,

### Data Structure [Question Bank]

Unit I (Analysis of Algorithms) 1. What are algorithms and how they are useful? 2. Describe the factor on best algorithms depends on? 3. Differentiate: Correct & Incorrect Algorithms? 4. Write short note:

### Data storage Tree indexes

Data storage Tree indexes Rasmus Pagh February 7 lecture 1 Access paths For many database queries and updates, only a small fraction of the data needs to be accessed. Extreme examples are looking or updating

### Diameter and Treewidth in Minor-Closed Graph Families, Revisited

Algorithmica manuscript No. (will be inserted by the editor) Diameter and Treewidth in Minor-Closed Graph Families, Revisited Erik D. Demaine, MohammadTaghi Hajiaghayi MIT Computer Science and Artificial

### 2 Exercises and Solutions

2 Exercises and Solutions Most of the exercises below have solutions but you should try first to solve them. Each subsection with solutions is after the corresponding subsection with exercises. 2.1 Sorting

### ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite

ALGEBRA Pupils should be taught to: Generate and describe sequences As outcomes, Year 7 pupils should, for example: Use, read and write, spelling correctly: sequence, term, nth term, consecutive, rule,

### Planar Tree Transformation: Results and Counterexample

Planar Tree Transformation: Results and Counterexample Selim G Akl, Kamrul Islam, and Henk Meijer School of Computing, Queen s University Kingston, Ontario, Canada K7L 3N6 Abstract We consider the problem

### Chapter 7. Multiway Trees. Data Structures and Algorithms in Java

Chapter 7 Multiway Trees Data Structures and Algorithms in Java Objectives Discuss the following topics: The Family of B-Trees Tries Case Study: Spell Checker Data Structures and Algorithms in Java 2 Multiway

### Shortest Inspection-Path. Queries in Simple Polygons

Shortest Inspection-Path Queries in Simple Polygons Christian Knauer, Günter Rote B 05-05 April 2005 Shortest Inspection-Path Queries in Simple Polygons Christian Knauer, Günter Rote Institut für Informatik,

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

### Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University

Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University File Management File management system has traditionally been considered part of the operating system. Applications

### Sorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Sorting Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 2003. Topic Overview Issues in Sorting on Parallel

### SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS

UDC: 004.8 Original scientific paper SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS Tonimir Kišasondi, Alen Lovren i University of Zagreb, Faculty of Organization and Informatics,

### Scheduling the Brazilian soccer tournament with fairness and broadcast objectives

Scheduling the Brazilian soccer tournament with fairness and broadcast objectives Celso C. Ribeiro 1 and Sebastián Urrutia 2 1 Department of Computer Science, Universidade Federal Fluminense, Rua Passo

### 5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

1. The advantage of.. is that they solve the problem if sequential storage representation. But disadvantage in that is they are sequential lists. [A] Lists [B] Linked Lists [A] Trees [A] Queues 2. The

### Binary search tree with SIMD bandwidth optimization using SSE

Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous

### 2013 Texas Education Agency. All Rights Reserved 2013 Introduction to the Revised Mathematics TEKS: Vertical Alignment Chart Kindergarten Algebra I 1

2013 Texas Education Agency. All Rights Reserved 2013 Introduction to the Revised Mathematics TEKS: Vertical Alignment Chart Kindergarten Algebra I 1 The materials are copyrighted (c) and trademarked (tm)

### Sequential Data Structures

Sequential Data Structures In this lecture we introduce the basic data structures for storing sequences of objects. These data structures are based on arrays and linked lists, which you met in first year

### Algebra 2 Chapter 1 Vocabulary. identity - A statement that equates two equivalent expressions.

Chapter 1 Vocabulary identity - A statement that equates two equivalent expressions. verbal model- A word equation that represents a real-life problem. algebraic expression - An expression with variables.

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

### Lecture 6. Randomized data structures Random number generation Skip lists: ideas and implementation Skip list time costs. Reading:

Lecture 6 Randomized data structures Random number generation Skip lists: ideas and implementation Skip list time costs Reading: Skip Lists: A Probabilistic Alternative to Balanced Trees (author William

### Problem of the Month: Once Upon a Time

Problem of the Month: The Problems of the Month (POM) are used in a variety of ways to promote problem solving and to foster the first standard of mathematical practice from the Common Core State Standards:

### Sequences with MS

Sequences 2008-2014 with MS 1. [4 marks] Find the value of k if. 2a. [4 marks] The sum of the first 16 terms of an arithmetic sequence is 212 and the fifth term is 8. Find the first term and the common

### Sample Questions Csci 1112 A. Bellaachia

Sample Questions Csci 1112 A. Bellaachia Important Series : o S( N) 1 2 N N i N(1 N) / 2 i 1 o Sum of squares: N 2 N( N 1)(2N 1) N i for large N i 1 6 o Sum of exponents: N k 1 k N i for large N and k

### Data Warehousing und Data Mining

Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data

### PES Institute of Technology-BSC QUESTION BANK

PES Institute of Technology-BSC Faculty: Mrs. R.Bharathi CS35: Data Structures Using C QUESTION BANK UNIT I -BASIC CONCEPTS 1. What is an ADT? Briefly explain the categories that classify the functions

### Lecture 10: Regression Trees

Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,

### Student Performance Q&A:

Student Performance Q&A: 2008 AP Calculus AB and Calculus BC Free-Response Questions The following comments on the 2008 free-response questions for AP Calculus AB and Calculus BC were written by the Chief

### Common Core Unit Summary Grades 6 to 8

Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations

### Pythagorean Theorem. Overview. Grade 8 Mathematics, Quarter 3, Unit 3.1. Number of instructional days: 15 (1 day = minutes) Essential questions

Grade 8 Mathematics, Quarter 3, Unit 3.1 Pythagorean Theorem Overview Number of instructional days: 15 (1 day = 45 60 minutes) Content to be learned Prove the Pythagorean Theorem. Given three side lengths,

### International Journal of Advanced Research in Computer Science and Software Engineering

Volume 3, Issue 7, July 23 ISSN: 2277 28X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Greedy Algorithm:

### CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992

CS268: Geometric Algorithms Handout #5 Design and Analysis Original Handout #15 Stanford University Tuesday, 25 February 1992 Original Lecture #6: 28 January 1991 Topics: Triangulating Simple Polygons

### arxiv: v5 [cs.ds] 9 Feb 2015

УДК 510.52 Valerii Sopin A new algorithm for solving the rsum problem arxiv:1407.4640v5 [cs.ds] 9 Feb 2015 A determined algorithm is presented for solving the rsum problem for any natural r with a sub-quadratic

### A Hybrid Chaining Model with AVL and Binary Search Tree to Enhance Search Speed in Hashing

, pp.185-194 http://dx.doi.org/10.14257/ijhit.2015.8.3.18 A Hybrid Chaining Model with AVL and Binary Search Tree to Enhance Search Speed in Hashing 1 Akshay Saxena, 2 Harsh Anand, 3 Tribikram Pradhan

### Binary Heap Algorithms

CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005 2009 Glenn G. Chappell

### Divide and Conquer Examples. Top-down recursive mergesort. Gravitational N-body problem.

Divide and Conquer Divide the problem into several subproblems of equal size. Recursively solve each subproblem in parallel. Merge the solutions to the various subproblems into a solution for the original

### Operation Count; Numerical Linear Algebra

10 Operation Count; Numerical Linear Algebra 10.1 Introduction Many computations are limited simply by the sheer number of required additions, multiplications, or function evaluations. If floating-point

### A New Polygon Based Algorithm for Filling Regions

Tamkang Journal of Science and Engineering, Vol. 2, No. 4, pp. 175-186 (2) 175 A New Polygon Based Algorithm for Filling Regions Hoo-Cheng *, Mu-Hwa Chen*, Shou-Yiing Hsu**, Chaoyin Chien*, Tsu-Feng Kuo*

### character E T A S R I O D frequency

Data Compression Data compression is any process by which a digital (e.g. electronic) file may be transformed to another ( compressed ) file, such that the original file may be fully recovered from the

### OPTIMUM PARTITION PARAMETER OF DIVIDE-AND-CONQUER ALGORITHM FOR SOLVING CLOSEST-PAIR PROBLEM

OPTIMUM PARTITION PARAMETER OF DIVIDE-AND-CONQUER ALGORITHM FOR SOLVING CLOSEST-PAIR PROBLEM Mohammad Zaidul Karim 1 and Nargis Akter 2 1 Department of Computer Science and Engineering, Daffodil International

### Chapter 10: Network Flow Programming

Chapter 10: Network Flow Programming Linear programming, that amazingly useful technique, is about to resurface: many network problems are actually just special forms of linear programs! This includes,

### Indexing Multiple- Inheritance Hierarchies

Indexing Multiple- Inheritance Hierarchies Faculty of Electronics and Information Technology Warsaw University of Technology Jacek Lewandowski prof. Henryk Rybiński Layout The context of research Introduction

### Algorithmic Thinking: The Key for Understanding Computer Science

Algorithmic Thinking: The Key for Understanding Computer Science Gerald Futschek Vienna University of Technology Institute of Software Technology and Interactive Systems Favoritenstrasse 9, 1040 Vienna,

### Qlik Sense scalability

Qlik Sense scalability Visual analytics platform Qlik Sense is a visual analytics platform powered by an associative, in-memory data indexing engine. Based on users selections, calculations are computed

### Common Core State Standards. Standards for Mathematical Practices Progression through Grade Levels

Standard for Mathematical Practice 1: Make sense of problems and persevere in solving them. Mathematically proficient students start by explaining to themselves the meaning of a problem and looking for

### Jones and Bartlett Publishers, LLC. NOT FOR SALE OR DISTRIBUTION

8498_CH08_WilliamsA.qxd 11/13/09 10:35 AM Page 347 Jones and Bartlett Publishers, LLC. NOT FOR SALE OR DISTRIBUTION C H A P T E R Numerical Methods 8 I n this chapter we look at numerical techniques for

### Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb

Binary Search Trees Lecturer: Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 27 1 Properties of Binary Search Trees 2 Basic BST operations The worst-case time complexity of BST operations