Parallelization: Binary Tree Traversal
By Aaron Weeden and Patrick Royal, Shodor Education Foundation, Inc. August 2012

Introduction:

According to Moore's Law, the number of transistors on a computer chip doubles roughly every two years. First formulated in 1965, this law remains amazingly accurate as a rule of thumb for predicting how chips will change in the future. The implications of this have led to a major conceptual shift in the limitations of computing. Whereas originally the primary limiting factor in a computer was storage, today enormous amounts of data storage are available at relatively low cost. As a result, storing all of that data is becoming less of a problem than analyzing and using it.

The most basic tool of data analysis, and the subject of this module, is to search a set of data for a particular value. On the face of things, searching might seem very simple: just look through bits of data one by one until the correct one is found. However, as computers store more and more information, this sort of linear search becomes more and more unwieldy. If you wanted to search 1 billion items, for example, on average you would have to examine 500 million of them. Instead, we can use something called a Binary Search Tree to organize the data more effectively. Once we write the data into a binary tree, we can search through, for example, 1 billion items and pick out the one we want in just 30 actions, almost 17 million times less work.

A binary tree is essentially an ordering of data into interconnected nodes. The root node of the tree contains a single data entry and references to two child nodes. Each of these nodes is then effectively the root of its own subtree, with a data entry and references to two more child nodes, and so on. Due to the nature of the tree, a tree with n nodes has a height (the number of levels in the tree) of log2(n + 1), assuming the tree is fully populated, which we will assume throughout this module.
The main advantage of this structure is that, in a properly sorted tree, searching requires looking at only one node from each level, so it is possible to search through 2^n - 1 nodes in just n actions. Figure 1 shows a picture of a binary search tree. Module Document, Page 1
Figure 1: A binary search tree of height 4 with 2^4 - 1 = 15 nodes.

Types of Traversals:

There are a number of orders in which a processor can traverse a binary tree, visiting each node exactly once. Some of the most common ones are listed and described in the table below.

Traversal Type            | Description                                              | Real-World Use
Pre-order (depth-first)   | Examine the root first, then the left child, then the right child | Creating (writing) a new tree
In-order (depth-first)    | Examine the left child, then the root, then the right child       | Searching a sorted binary tree
Post-order (depth-first)  | Examine the left child, then the right child, then the root       | Deleting an existing tree
Breadth-first             | Examine each level of the binary tree before moving to the next   | Finding the shortest path between two nodes

Pre-Order Traversal:

A pre-order traversal of a binary tree is a recursive algorithm in which the program first prints the label of the root of the tree and then sends the same command to the left and right subtrees. An implementation of this algorithm in C is as follows:

    void preordertraversal(struct node *root) {
        if (root != NULL) {
            printf("%d ", root->label);
            preordertraversal(root->left);
            preordertraversal(root->right);
        }
        return;
    }
Post-Order Traversal:

A post-order traversal of a binary tree is a recursive algorithm in which the program first invokes itself on the left and right subtrees, traversing them completely, and then prints the label of the root. When the traversal of the left or right subtree reaches a node with no children, it prints the label of that node. An implementation of this algorithm in C is as follows:

    void postordertraversal(struct node *root) {
        if (root != NULL) {
            postordertraversal(root->left);
            postordertraversal(root->right);
            printf("%d ", root->label);
        }
        return;
    }

In-Order Traversal:

An in-order traversal of a binary tree is a recursive algorithm in which the program first invokes itself on its left subtree, then prints the root when that finishes, and finally invokes itself on its right subtree. An implementation of this algorithm in C is as follows:

    void inordertraversal(struct node *root) {
        if (root != NULL) {
            inordertraversal(root->left);
            printf("%d ", root->label);
            inordertraversal(root->right);
        }
        return;
    }

Breadth-First Traversal:

A breadth-first traversal works differently from a depth-first traversal. The program loops downward through each level of the tree, printing all of the nodes in each row in the order in which they appear. It does this by using two stacks: one to hold the nodes to be printed, and one to hold their children. An implementation of this algorithm is as follows (it assumes a stack type providing push, pop, and a peek that returns NULL when the stack is empty):
    void breadthfirsttraversal(struct node *root) {
        stack activenodes;
        stack nextnodes;
        activenodes.push(root);
        while (activenodes.peek() != NULL) {
            /* Print the current level and collect its children. */
            while (activenodes.peek() != NULL) {
                struct node *current = activenodes.pop();
                printf("%d ", current->label);
                if (current->left != NULL) nextnodes.push(current->left);
                if (current->right != NULL) nextnodes.push(current->right);
            }
            /* Transfer the children to the active stack; the double
               reversal preserves their left-to-right order. */
            while (nextnodes.peek() != NULL) {
                activenodes.push(nextnodes.pop());
            }
        }
        return;
    }

Exercise 1

Complete questions 1 through 4 below.

1. Label the nodes of the tree below 1 through 7 in the order they are processed by a pre-order traversal.

2. Label the nodes of the tree below 1 through 7 in the order they are processed by a post-order traversal.
3. Label the nodes of the tree below 1 through 7 in the order they are processed by an in-order traversal.

4. Label the nodes of the tree below 1 through 7 in the order they are processed by a breadth-first traversal.
Exercise 2

Follow the steps below on a computer, from the command line. Example commands are shown below each instruction.

1. Change directories to the code directory (an attachment to this module).

   cd BinaryTreeTraversal/code

2. Compile the code.

   make traverse

3. Run the pre-order program and compare it to your result from Exercise 1, Question 1.

   ./traverse --order pre

4. Run the in-order program and compare it to your result from Exercise 1, Question 3.

   ./traverse --order in

5. Run the breadth-first program and compare it to your result from Exercise 1, Question 4.

   ./traverse --order breadth
Parallelism and Scaling

The current setup works well enough on small amounts of data, but at some point data sets can grow sufficiently large that a single computer cannot hold all of the data at once, or the tree becomes so large that it takes an unreasonable amount of time to complete a traversal. To overcome these issues, parallel computing, or parallelism, can be employed. In parallel computing, a problem is divided up and each of multiple processors performs a part of the problem. The processors work at the same time, in parallel.

There are three primary reasons to parallelize a problem. The first is to achieve speedup, or to solve the problem in less time. As one adds more processors to solve the problem, one may find that the work is finished faster [1]. This is known as strong scaling; the problem size stays constant, but the number of processors increases. The second reason to parallelize is to solve a bigger problem. More processors may be able to solve a bigger problem in the same amount of time that fewer processors are able to solve a smaller problem. If the problem size increases as the number of processors increases, we call it weak scaling. The third reason to parallelize is that it allows a problem that is too big to fit in the memory of one processor to be broken up such that it is able to fit in the memories of multiple processors.

In many ways, a tree is the perfect candidate for parallelism. In a tree, each node/subtree is independent. As a result, we can split up a large tree into 2, 4, 8, or more subtrees and hold one subtree on each processor. Then, the only duplicated data that must be kept on all processors is the tiny tip of the tree that is the parent of all of the individual subtrees. Mathematically speaking, for a tree divided among n processors (where n is a power of two), the processors only need to hold n - 1 nodes in common, no matter how big the tree itself is. A picture of this is shown below in Figure 2.
In this picture and throughout the module, a processor's rank is an identifier to keep it distinct from the other processors.

[1] An observation known as Amdahl's Law says that there is a theoretical limit to the amount of speedup that can be achieved from strong scaling. At some point, the number of processors exceeds the amount of work available to do in parallel, and adding processors results in no additional speedup. Worse, the time spent communicating between processors overwhelms the total time spent solving the problem, and it actually becomes slower to solve a problem with more processors than with fewer.
Figure 2: A tree of height 4 split among 4 processors.

Parallel Traversals:

The fact that trees are composed of independent subtrees makes parallelizing them very easy. Properly done, the parallelizable portion of these traversals grows at a rate of 2^n for an n-generation tree, while the processors only need to synchronize once, at the end, so that portion approaches 100% for large trees (but keep in mind Amdahl's Law, footnote 1). The basic steps for parallelizing these traversals are as follows:

1. Perform the traversal on the parent part of the tree.
2. Whenever you get to a node that is only present on one processor, ask that processor to execute the appropriate C algorithm detailed above.
3. The processor will return its result, which can be used exactly as if it came from a serial processor.

A breadth-first traversal is somewhat more complicated to implement as a parallel system because at each level it must access nodes from all of the parallel processors. Theoretically, a breadth-first traversal can approach the same 100% parallelizable portion as the pre-, in-, and post-order traversals. However, the amount of processor-to-processor data transmission adds a greater potential for delays, thus slowing down the algorithm. Nevertheless, as the size of the tree increases, the size of the generations grows at a rate of 2^n while the number of synchronizations grows at a rate of n for an n-generation tree, so the parallelizable portion of this traversal also approaches 100%. The basic steps for parallelizing this traversal are as follows:
1. Perform the traversal on the parent part of the tree.
2. Whenever you get to a node that is only present on one processor, ask that processor to execute the breadth-first C algorithm detailed above, but have it wait after it finishes one generation.
3. Combine the one-generation results from the different processors in the correct order.
4. Allow each processor to execute the next generation of the breadth-first C algorithm, and then wait again.
5. Repeat steps 3 and 4 until there are no nodes remaining.

Message Passing Interface (MPI)

The Message Passing Interface (MPI) is a standard for performing the type of distributed memory parallelism described above, in which each processor has its own memory. With MPI, parallel processes must communicate with each other by sending messages, as in the breadth-first algorithm above, in which processes must synchronize with each other before continuing to the next generation. MPI provides bindings in C, Fortran, and a host of other languages. When MPI processes are invoked, they each run the same copy of the source code, in which MPI functions are written. Some of the common MPI C functions are shown in the table below.

MPI_Init(&argc, &argv);
  Initialize the MPI environment and strip out the relevant MPI command line arguments. No other MPI functions may be called before this function.
  Arguments: argc, the number of arguments passed to the program on the command line; argv, the arguments passed to the program on the command line.

MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  Obtain the rank of the current process.
  Arguments: MPI_COMM_WORLD, the communicator (group) of MPI processes working together; rank, the rank (identifier) of the process currently running the program.

MPI_Comm_size(MPI_COMM_WORLD, &size);
  Obtain the total number of processes.
  Arguments: MPI_COMM_WORLD, the communicator (group) of MPI processes working together; size, the total number of MPI processes.
MPI_Send(buf, count, datatype, dest, tag, comm);
  Send a message to another process.
  Arguments: buf, a pointer to the memory to send; count, the number of values to send; datatype, the type of the values, e.g. MPI_INT, MPI_CHAR, MPI_FLOAT; dest, the rank of the receiving process; tag, an identifier for the message (must match the receive's tag); comm, the communicator, e.g. MPI_COMM_WORLD.

MPI_Recv(buf, count, datatype, source, tag, comm, status);
  Receive a message from a process.
  Arguments: buf, a pointer to the memory to receive into; count, the number of values to receive; datatype, the type of the values, e.g. MPI_INT, MPI_CHAR, MPI_FLOAT; source, the rank of the sending process; tag, an identifier for the message (must match the send's tag); comm, the communicator, e.g. MPI_COMM_WORLD; status, the status of the message, which can be ignored with MPI_STATUS_IGNORE.

MPI_Finalize();
  Finalize the MPI environment. No other MPI functions may be called after this function.

Using MPI requires including the MPI header file, mpi.h, using an MPI compiler such as mpicc, and running with an MPI execution command such as mpirun or mpiexec.
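Putting the functions above together, a minimal two-process exchange might look like the following sketch. It is not part of the module's attached code; the payload value 42 and the tag 0 are arbitrary choices for illustration. It must be compiled with mpicc and launched with mpirun -np 2 (or more), so it cannot run as an ordinary serial program.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2) {
        int value = 0;
        int tag = 0;                /* arbitrary; must match on both sides */
        if (rank == 0) {
            value = 42;             /* arbitrary payload */
            MPI_Send(&value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }
    }

    MPI_Finalize();
    return 0;
}
```

Note that MPI_Recv blocks until a matching message arrives, which is exactly the kind of synchronization the parallel breadth-first traversal relies on between generations.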
Exercise 3

In this exercise you will write and run a small MPI program. This exercise requires the use of a computer with MPI installed, preferably also a multi-processor machine (MPI can run on a single processor, but it is meant for parallelism). Example commands are shown below each instruction.

1. Open a new C source file in a text editor such as vi.

   vi hello.c

2. Create a small, serial Hello, World! program.

   #include <stdio.h>

   int main() {
       printf("Hello, World!\n");
       return 0;
   }

3. Save the file and compile the program.

   gcc -o hello hello.c

4. Run the program.

   ./hello

5. Open the source file again.

   vi hello.c
6. Add in the MPI code to initialize the environment, determine the process rank and size, and finalize the environment.

   #include <mpi.h>
   #include <stdio.h>

   int main(int argc, char **argv) {
       int rank = 0;
       int size = 0;
       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);
       printf("Hello, World!\n");
       MPI_Finalize();
       return 0;
   }

7. Compile the code with the MPI compiler [2].

   mpicc -o hello-parallel hello.c

8. Run the code on two processes with the MPI execution command.

   mpirun -np 2 hello-parallel

9. Did the result change?

10. Open the source file again.

    vi hello.c

[2] If you attempt to compile the code with a non-MPI compiler, it will be unable to find the mpi.h header file and will exit with an error.
11. This time add code to print the process rank and size.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank = 0;
        int size = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("Hello, World from rank %d of %d!\n", rank, size);
        MPI_Finalize();
        return 0;
    }

12. Save and compile.

    mpicc -o hello-mpi hello.c

13. Run the program.

    mpirun -np 2 hello-mpi

Exercise 4

Add a post-order parallel traversal to traversal.c in the code directory (an attachment to this module). Make changes to the depthfirstparalleltraversal function, using the pre-order and in-order codes as reference. Ensure the code compiles and runs correctly, producing the same result as the serial post-order traversal for different sizes of trees.
Scaling

The goal of scaling an algorithm is either to obtain a performance increase for a problem (strong scaling) or to solve a bigger problem with more processors with performance comparable to solving a smaller problem with fewer processors (weak scaling). An analogy is painting a house. If a single painter can paint only one room in an hour, the goal of adding more painters would be to either paint the same room in less than an hour or paint multiple rooms in the same hour. The task may require communication between the painters, and this communication may become appreciable, just as it does in the parallel breadth-first traversal. Thus, scaling is not always a be-all, end-all way of obtaining speedup. However, there is usually an advantage to adding processors so that one can solve a bigger problem.

Exercise 5

This exercise makes use of the Unix time command, which provides a simple, if not very accurate, measure of the amount of time it takes an algorithm to execute. By comparing the real times of the serial and parallel programs, we can get an idea of the speedup achieved by parallelism. Similarly, by comparing the real time of the parallel program running with different sizes of trees, we can get an idea of how the program scales. This exercise requires the use of a multi-processor computer with MPI installed.

1. Change directories to the code directory (an attachment to this module).

   cd BinaryTreeTraversal/code

2. Compile the code.

   make all

3. Time how long it takes to run the serial versions of each program.

   time ./traverse --order pre
   time ./traverse --order post
   time ./traverse --order in
   time ./traverse --order breadth

4. Time how long it takes to run the parallel versions of each program.
Note that the -np argument to mpirun specifies how many parallel processes to use, and the file ~/machines must exist and contain the hostname of each parallel processor, as well as how many processes MPI is allowed to spawn based on the number of processors and cores (if it is a multi-core system).
5. When you run the parallel traversal programs, choose a value for -np that matches the total number of cores available.

   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order pre
   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order post
   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order in
   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order breadth

6. Focusing only on the real time results, describe how the run times differ between each run of the program.

7. Run each of the parallel versions of the program for bigger trees.

   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order pre --num-rows 20
   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order post --num-rows 20
   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order in --num-rows 20
   time mpirun -np 5 -hostfile ~/machines ./traverse-parallel --order breadth --num-rows 20

8. Describe how these run times differ from the parallel run times obtained above.

Possible Extensions

This module has covered the traversing of binary trees. Binary search trees are just one way of organizing data, however. In fact, they are only one way of organizing data into trees; data can also be organized into trees whose nodes each have 3 children, or 4 children, or n children, where n may be fixed or may vary. Trees are also a special form of a graph, which consists of nodes and edges between them. Trees happen to be ordered hierarchically, as the search occurs top-to-bottom. This restriction is not placed on graphs in general, in which traversals can occur in multiple directions. Some graphs may not have roots, and some may not have leaves. Different kinds of data can be placed into different kinds of structures based on which traversal is more likely to quickly find specific elements of the data. Traversals through these various structures have different algorithms that may be more easily or less easily parallelized.
These algorithms may also scale in different ways from the algorithms explored in this module.
This module has considered the message passing paradigm, but the algorithm would also work under other parallel paradigms, such as shared memory, in which threads are able to synchronize without having to pass messages to each other.

Student Project Ideas

1. Generalize the algorithms in the module to include search trees whose nodes have more than 2 children. How does this change affect the algorithms involved? Is the parallelism affected? Include working code samples of the new serial and parallel algorithms. How do these algorithms scale compared to the binary search tree traversals?

2. Explore scaling the parallel algorithms over more processors and with bigger tree sizes. Create a graph of the scaling data, plotting number of processors against run time. Research the concepts of strong and weak scaling, and collect data for each type of scaling for each algorithm.

Rubric for Student Assessment

Students can be graded based on the following rubric, whose point values can be adjusted to fit a different scale or to assign a percent-based grade.

Assignment        | Number of Points
Exercise 1        |
Exercise 2        |
Exercise 3        |
Exercise 4        |
Exercise 5        |
Student Project   |
Total             |
Solutions to Exercises

Exercises 1 and 2: (given in the original module as labeled tree figures)

Exercise 3

Step 4 should print "Hello, World!"

Step 8 should print "Hello, World!" twice (thus, the answer to Step 9 is yes).

Step 13 should print:

    Hello, World from rank 0 of 2!
    Hello, World from rank 1 of 2!

However, it is a race condition as to which process finishes first, so it may instead print:

    Hello, World from rank 1 of 2!
    Hello, World from rank 0 of 2!

Exercise 4

(Available as an attachment)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
MAX = 5 Current = 0 'This will declare an array with 5 elements. Inserting a Value onto the Stack (Push) -----------------------------------------
=============================================================================================================================== DATA STRUCTURE PSEUDO-CODE EXAMPLES (c) Mubashir N. Mir - www.mubashirnabi.com
B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees
B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:
The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).
CHAPTER 5 The Tree Data Model There are many situations in which information has a hierarchical or nested structure like that found in family trees or organization charts. The abstraction that models hierarchical
Linux Driver Devices. Why, When, Which, How?
Bertrand Mermet Sylvain Ract Linux Driver Devices. Why, When, Which, How? Since its creation in the early 1990 s Linux has been installed on millions of computers or embedded systems. These systems may
WinBioinfTools: Bioinformatics Tools for Windows Cluster. Done By: Hisham Adel Mohamed
WinBioinfTools: Bioinformatics Tools for Windows Cluster Done By: Hisham Adel Mohamed Objective Implement and Modify Bioinformatics Tools To run under Windows Cluster Project : Research Project between
Parallel Scalable Algorithms- Performance Parameters
www.bsc.es Parallel Scalable Algorithms- Performance Parameters Vassil Alexandrov, ICREA - Barcelona Supercomputing Center, Spain Overview Sources of Overhead in Parallel Programs Performance Metrics for
22S:295 Seminar in Applied Statistics High Performance Computing in Statistics
22S:295 Seminar in Applied Statistics High Performance Computing in Statistics Luke Tierney Department of Statistics & Actuarial Science University of Iowa August 30, 2007 Luke Tierney (U. of Iowa) HPC
University of Dayton Department of Computer Science Undergraduate Programs Assessment Plan DRAFT September 14, 2011
University of Dayton Department of Computer Science Undergraduate Programs Assessment Plan DRAFT September 14, 2011 Department Mission The Department of Computer Science in the College of Arts and Sciences
Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child
Binary Search Trees Data in each node Larger than the data in its left child Smaller than the data in its right child FIGURE 11-6 Arbitrary binary tree FIGURE 11-7 Binary search tree Data Structures Using
Informatica e Sistemi in Tempo Reale
Informatica e Sistemi in Tempo Reale Introduction to C programming Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa October 25, 2010 G. Lipari (Scuola Superiore Sant Anna)
Libmonitor: A Tool for First-Party Monitoring
Libmonitor: A Tool for First-Party Monitoring Mark W. Krentel Dept. of Computer Science Rice University 6100 Main St., Houston, TX 77005 [email protected] ABSTRACT Libmonitor is a library that provides
Hodor and Bran - Job Scheduling and PBS Scripts
Hodor and Bran - Job Scheduling and PBS Scripts UND Computational Research Center Now that you have your program compiled and your input file ready for processing, it s time to run your job on the cluster.
OpenMP & MPI CISC 879. Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware
OpenMP & MPI CISC 879 Tristan Vanderbruggen & John Cavazos Dept of Computer & Information Sciences University of Delaware 1 Lecture Overview Introduction OpenMP MPI Model Language extension: directives-based
Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:
CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm
An Incomplete C++ Primer. University of Wyoming MA 5310
An Incomplete C++ Primer University of Wyoming MA 5310 Professor Craig C. Douglas http://www.mgnet.org/~douglas/classes/na-sc/notes/c++primer.pdf C++ is a legacy programming language, as is other languages
Binary Heap Algorithms
CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks [email protected] 2005 2009 Glenn G. Chappell
Persistent Binary Search Trees
Persistent Binary Search Trees Datastructures, UvA. May 30, 2008 0440949, Andreas van Cranenburgh Abstract A persistent binary tree allows access to all previous versions of the tree. This paper presents
Introduction to Cloud Computing
Introduction to Cloud Computing Distributed Systems 15-319, spring 2010 11 th Lecture, Feb 16 th Majd F. Sakr Lecture Motivation Understand Distributed Systems Concepts Understand the concepts / ideas
Charm++, what s that?!
Charm++, what s that?! Les Mardis du dev François Tessier - Runtime team October 15, 2013 François Tessier Charm++ 1 / 25 Outline 1 Introduction 2 Charm++ 3 Basic examples 4 Load Balancing 5 Conclusion
Grid 101. Grid 101. Josh Hegie. [email protected] http://hpc.unr.edu
Grid 101 Josh Hegie [email protected] http://hpc.unr.edu Accessing the Grid Outline 1 Accessing the Grid 2 Working on the Grid 3 Submitting Jobs with SGE 4 Compiling 5 MPI 6 Questions? Accessing the Grid Logging
10CS35: Data Structures Using C
CS35: Data Structures Using C QUESTION BANK REVIEW OF STRUCTURES AND POINTERS, INTRODUCTION TO SPECIAL FEATURES OF C OBJECTIVE: Learn : Usage of structures, unions - a conventional tool for handling a
Threads Scheduling on Linux Operating Systems
Threads Scheduling on Linux Operating Systems Igli Tafa 1, Stavri Thomollari 2, Julian Fejzaj 3 Polytechnic University of Tirana, Faculty of Information Technology 1,2 University of Tirana, Faculty of
Data Structures Using C++
Data Structures Using C++ 1.1 Introduction Data structure is an implementation of an abstract data type having its own set of data elements along with functions to perform operations on that data. Arrays
Symbol Tables. Introduction
Symbol Tables Introduction A compiler needs to collect and use information about the names appearing in the source program. This information is entered into a data structure called a symbol table. The
MPI Application Development Using the Analysis Tool MARMOT
MPI Application Development Using the Analysis Tool MARMOT HLRS High Performance Computing Center Stuttgart Allmandring 30 D-70550 Stuttgart http://www.hlrs.de 24.02.2005 1 Höchstleistungsrechenzentrum
Q. Consider a dynamic instruction execution (an execution trace, in other words) that consists of repeats of code in this pattern:
Pipelining HW Q. Can a MIPS SW instruction executing in a simple 5-stage pipelined implementation have a data dependency hazard of any type resulting in a nop bubble? If so, show an example; if not, prove
Why Use Binary Trees?
Binary Search Trees Why Use Binary Trees? Searches are an important application. What other searches have we considered? brute force search (with array or linked list) O(N) binarysearch with a pre-sorted
Load Balancing. Load Balancing 1 / 24
Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait
UIL Computer Science for Dummies by Jake Warren and works from Mr. Fleming
UIL Computer Science for Dummies by Jake Warren and works from Mr. Fleming 1 2 Foreword First of all, this book isn t really for dummies. I wrote it for myself and other kids who are on the team. Everything
Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus
Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus A simple C/C++ language extension construct for data parallel operations Robert Geva [email protected] Introduction Intel
OPTIMAL BINARY SEARCH TREES
OPTIMAL BINARY SEARCH TREES 1. PREPARATION BEFORE LAB DATA STRUCTURES An optimal binary search tree is a binary search tree for which the nodes are arranged on levels such that the tree cost is minimum.
Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees
Learning Outcomes COMP202 Complexity of Algorithms Binary Search Trees and Other Search Trees [See relevant sections in chapters 2 and 3 in Goodrich and Tamassia.] At the conclusion of this set of lecture
Exercises Software Development I. 11 Recursion, Binary (Search) Trees. Towers of Hanoi // Tree Traversal. January 16, 2013
Exercises Software Development I 11 Recursion, Binary (Search) Trees Towers of Hanoi // Tree Traversal January 16, 2013 Software Development I Winter term 2012/2013 Institute for Pervasive Computing Johannes
ER E P M A S S I CONSTRUCTING A BINARY TREE EFFICIENTLYFROM ITS TRAVERSALS DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1998-5
S I N S UN I ER E P S I T VER M A TA S CONSTRUCTING A BINARY TREE EFFICIENTLYFROM ITS TRAVERSALS DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1998-5 UNIVERSITY OF TAMPERE DEPARTMENT OF
2. (a) Explain the strassen s matrix multiplication. (b) Write deletion algorithm, of Binary search tree. [8+8]
Code No: R05220502 Set No. 1 1. (a) Describe the performance analysis in detail. (b) Show that f 1 (n)+f 2 (n) = 0(max(g 1 (n), g 2 (n)) where f 1 (n) = 0(g 1 (n)) and f 2 (n) = 0(g 2 (n)). [8+8] 2. (a)
Union-Find Algorithms. network connectivity quick find quick union improvements applications
Union-Find Algorithms network connectivity quick find quick union improvements applications 1 Subtext of today s lecture (and this course) Steps to developing a usable algorithm. Define the problem. Find
Computer Systems II. Unix system calls. fork( ) wait( ) exit( ) How To Create New Processes? Creating and Executing Processes
Computer Systems II Creating and Executing Processes 1 Unix system calls fork( ) wait( ) exit( ) 2 How To Create New Processes? Underlying mechanism - A process runs fork to create a child process - Parent
Big Data and Scripting. Part 4: Memory Hierarchies
1, Big Data and Scripting Part 4: Memory Hierarchies 2, Model and Definitions memory size: M machine words total storage (on disk) of N elements (N is very large) disk size unlimited (for our considerations)
Algorithms Chapter 12 Binary Search Trees
Algorithms Chapter 1 Binary Search Trees Outline Assistant Professor: Ching Chi Lin 林 清 池 助 理 教 授 [email protected] Department of Computer Science and Engineering National Taiwan Ocean University
Parallel Computing. Shared memory parallel programming with OpenMP
Parallel Computing Shared memory parallel programming with OpenMP Thorsten Grahs, 27.04.2015 Table of contents Introduction Directives Scope of data Synchronization 27.04.2015 Thorsten Grahs Parallel Computing
Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C
Tutorial#1 Q 1:- Explain the terms data, elementary item, entity, primary key, domain, attribute and information? Also give examples in support of your answer? Q 2:- What is a Data Type? Differentiate
Intro to GPU computing. Spring 2015 Mark Silberstein, 048661, Technion 1
Intro to GPU computing Spring 2015 Mark Silberstein, 048661, Technion 1 Serial vs. parallel program One instruction at a time Multiple instructions in parallel Spring 2015 Mark Silberstein, 048661, Technion
Agenda. Using HPC Wales 2
Using HPC Wales Agenda Infrastructure : An Overview of our Infrastructure Logging in : Command Line Interface and File Transfer Linux Basics : Commands and Text Editors Using Modules : Managing Software
Grid Engine Basics. Table of Contents. Grid Engine Basics Version 1. (Formerly: Sun Grid Engine)
Grid Engine Basics (Formerly: Sun Grid Engine) Table of Contents Table of Contents Document Text Style Associations Prerequisites Terminology What is the Grid Engine (SGE)? Loading the SGE Module on Turing
Any two nodes which are connected by an edge in a graph are called adjacent node.
. iscuss following. Graph graph G consist of a non empty set V called the set of nodes (points, vertices) of the graph, a set which is the set of edges and a mapping from the set of edges to a set of pairs
Chapter 14 The Binary Search Tree
Chapter 14 The Binary Search Tree In Chapter 5 we discussed the binary search algorithm, which depends on a sorted vector. Although the binary search, being in O(lg(n)), is very efficient, inserting a
Chapter 12: Multiprocessor Architectures. Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup
Chapter 12: Multiprocessor Architectures Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup Objective Be familiar with basic multiprocessor architectures and be able to
Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)
Sorting revisited How did we use a binary search tree to sort an array of elements? Tree Sort Algorithm Given: An array of elements to sort 1. Build a binary search tree out of the elements 2. Traverse
Embedded Software Development
Linköpings Tekniska Högskola Institutionen för Datavetanskap (IDA), Software and Systems (SaS) TDDI11, Embedded Software 2010-04-22 Embedded Software Development Host and Target Machine Typical embedded
Cloud-based OpenMP Parallelization Using a MapReduce Runtime. Rodolfo Wottrich, Rodolfo Azevedo and Guido Araujo University of Campinas
Cloud-based OpenMP Parallelization Using a MapReduce Runtime Rodolfo Wottrich, Rodolfo Azevedo and Guido Araujo University of Campinas 1 MPI_Init(NULL, NULL); MPI_Comm_size(MPI_COMM_WORLD, &comm_sz); MPI_Comm_rank(MPI_COMM_WORLD,
S. Muthusundari. Research Scholar, Dept of CSE, Sathyabama University Chennai, India e-mail: [email protected]. Dr. R. M.
A Sorting based Algorithm for the Construction of Balanced Search Tree Automatically for smaller elements and with minimum of one Rotation for Greater Elements from BST S. Muthusundari Research Scholar,
Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003
Interconnect Efficiency of Tyan PSC T-630 with Microsoft Compute Cluster Server 2003 Josef Pelikán Charles University in Prague, KSVI Department, [email protected] Abstract 1 Interconnect quality
A binary search tree is a binary tree with a special property called the BST-property, which is given as follows:
Chapter 12: Binary Search Trees A binary search tree is a binary tree with a special property called the BST-property, which is given as follows: For all nodes x and y, if y belongs to the left subtree
Interconnection Networks. Interconnection Networks. Interconnection networks are used everywhere!
Interconnection Networks Interconnection Networks Interconnection networks are used everywhere! Supercomputers connecting the processors Routers connecting the ports can consider a router as a parallel
1 File Management. 1.1 Naming. COMP 242 Class Notes Section 6: File Management
COMP 242 Class Notes Section 6: File Management 1 File Management We shall now examine how an operating system provides file management. We shall define a file to be a collection of permanent data with
