Lecture 12: Partitioning and Load Balancing
|
|
|
- Gertrude Sherman
- 10 years ago
- Views:
Transcription
1 Lecture 12: Partitioning and Load Balancing G /G November 16, 2010 thanks to Schloegel,Karypis and Kumar survey paper and Zoltan website for many of today s slides and pictures
2 Partitioning Decompose computation into tasks to equi-distribute the data and work, minimize processor idle time. applies to grid points, elements, matrix rows, particles, VLSI layout,,... Map to processors to keep interprocessor communication low. communication to computation ratio comes from both the partitioning and the algorithm.
3 Partitioning Data decomposition + Owner computes rule: Data distributed among the processors Data distribution defines work assignment Owner performs all computations on its data. Data dependencies for data items owned by different processors incur communication
4 Partitioning Static - all information available before computation starts use off-line algorithms to prepare before execution time; run as pre-processor, can be serial, can be slow and expensive, starts. Dynamic - information not known until runtime, work changes during computation (e.g. adaptive methods), or locality of objects change (e.g. particles move) use on-line algorithms to make decisions mid-execution; must run side-by-side with application, should be parallel, fast, scalable. Incremental algorithm preferred (small changes in input result in small changes in partitions) will look at some geometric methods, graph-based methods, spectral methods, multilevel methods, diffusion-based balancing,...
5 Recursive Coordinate Bisection Geometric Divide Partitioning work into two equal parts using cutting plane orthogonal to coordinate axis For good aspect ratios cut in longest dimension. Adaptive Mesh Refinement Pa 1st cut 3rd 3rd 2nd 2nd 3rd 3rd Can generalize to k-way partitions. Parallel Finding Volume optimal Rendering partitions is NP hard. (There are optimality results for a class of graphs as a graph partitioning problem.)
6 Recursive Coordinate Bisection + Conceptually simple, easy to implement, fast. + Regular subdomains, easy to describe Need coordinates of mesh points/particles. No control of communication costs. Can generate disconnected subdomains
7 Recursive Coordinate Bisection Implicitly incremental - small changes in data result in small movement of cuts
8 Recursive Inertial Bisection For domains not oriented along coordinate axes can do better if account for the angle of orientation of the mesh. Use bisection line orthogonal to principal inertial axis (treat mesh elements as point masses). Project centers-of-mass onto this axis; bisect this ordered list. Typically gives smaller subdomain boundary.
9 Space-filling Curves Linearly order a multidimensional mesh (nested hierarchically, preserves locality) Peano-Hilbert ordering Morton ordering
10 Space-filling Curves Easily extends to adaptively refined meshes
11 Space-filling Curves Partition work into equal chunks.
12 Space-filling Curves + Generalizes to uneven work loads - incorporate weights. + Dynamic on-the-fly partitioning for any number of nodes. + Good for cache performance
13 Space-filling Curves Red region has more communication - not compact Need coordinates
14 Space-filling Curves Generalizes to other non-finite difference problems, e.g. particle methods, patch-based adaptive mesh refinement, smooth particle hydro.,
15 Space-filling Curves Implicitly incremental - small changes in data results in small movement of cuts in linear ordering
16 Graph Model of Computation for computation on mesh nodes, graph of the mesh is the graph of the computation; if there is an edge between nodes there is an edge between the vertices in the graph. for computation on the mesh elements the element is a vertex; put an edge between vertices if the mesh elements share an edge. This is the dual of the node graph.
17 Graph Model of Computation for computation on mesh nodes, graph of the mesh is the graph of the computation; if there is an edge between nodes there is an edge between the vertices in the graph. for computation on the mesh elements the element is a vertex; put an edge between vertices if the mesh elements share an edge. This is the dual of the node graph. Partition vertices into disjoint subdomains so each has same number. Estimate total communication by counting number of edges that connect vertices in different subdomains (the edge-cut metric).
18 Greedy Bisection Algorithm (also LND) Put connected components together for min communication. Start with single vertex (peripheral vertex, lowest degree, endpoints of graph diameter) Incrementally grow partition by adding adjacent vertices (bfs) Stop when half the vertices counted (n/p for p partitions)
19 Greedy Bisection Algorithm (also LND) Put connected components together for min communication. Start with single vertex (peripheral vertex, lowest degree, endpoints of graph diameter) Incrementally grow partition by adding adjacent vertices (bfs) Stop when half the vertices counted (n/p for p partitions) + At least one component connected Not best quality partitioning; need multiple trials.
20 Breadth First Search All edges between nodes in same level or adjacent levels. Partitioning the graph into nodes <= level L and >= L+1 breaks only tree and interlevel edges; no extra edges.
21 Breadth First Search BFS of two dimensional grid starting at center node.
22 Graph Partitioning for Sparse Matrix Vector Mult. Compute y = Ax, A sparse symmetric matrix, Vertices v i represent x i, y i. Edge (i,j) for each nonzero A ij Black lines represent communication.
23 Graph Partitioning for Sparse Matrix Factorization Nested dissection for fill-reducing orderings for sparse matrix factorizations. Recursively repeat: Compute vertex separator, bisect graph, edge separator = smallest subset of edges such that removing them divided graph into 2 disconnected subgraphs) vertex separator = can extend edge separator by connecting each edge to one vertex, or compute directly. Split a graph into roughly equal halves using the vertex separator At each level of recursion number the vertices of the partitions, number the separator vertices last. Unknowns ordered from n to 1. Smaller separators less fill and less factorization work
24 Spectral Bisection Gold standard for graph partitioning (Pothen, Simon, Liou, 1990) Let x i = { 1 i A 1 i B (i,j) E (x i x j ) 2 = 4 # cut edges Goal: find x to minimize quadratic objective function (edge cuts) for integer-valued x = ±1. Uses Laplacian L of graph G: d(i) i = j l ij = 1 i j, (i, j) E 0 otherwise
25 Spectral Bisection L = = D A A = adjacency matrix; D diagonal matrix L is symmetric, so has real eigenvalues and orthogonal evecs. Since row sum is 0, Le = 0, where e = ( ) t Think of second eigenvector as first vibrational mode
26 Spectral Bisection Note that x t Lx = x t Dx x t Ax = n i=1 d i x 2 i 2 (i,j) E x i x j = (i,j) E (x i x j ) 2 x 2 + x 3 x 1 + x 5 Using previous example, x t Ax = (x 1 x 2 x 3 x 4 x 5 ) x 1 + x 4 + x 5 x 3 + x 4 x 2 + x 3 + x 5 So finding x to minimize cut edges looks like minimizing x t Lx over vectors x = ±1 and n i=1 x i = 0 (balance condition).
27 Spectral Bisection Integer programming problem difficult. Replace x i = ±1 with n i=1 x i 2 = n min x t Lx = x xi 2Lx t 2 =0 x 2 i =n = λ 2 x t 2 x 2 = λ 2 n λ 2 is the smallest positive eval of L, with evec x 2, (assuming G is connected, λ 1 = 0, x 1 = e) x 2 satisfies x i = 0 since orthogonal to x 1, e t x 1 = 0 x 2 called Fiedler vector (properties studied by Fiedler in 70 s).
28 Spectral Bisection Assign vertices according to the sign of the x 2. Almost always gives connected subdomains, with significantly fewer edge cuts than RCB. (Thrm. (Fiedler) If G is connected, then one of A,B is. If i, x 2i = 0 then other set is connected too). Recursively repeat (or use higher order evecs) v2 =
29 Spectral Bisection + High quality partitions How find second eval and evec? (Lanczos, or CG,... how do this in parallel, when you don t yet have the partition?)
30 Kernighan-Lin Algorithm Heuristic for graph partitioning (even 2 way partitioning with unit weights is NP complete) Needs initial partition to start, iteratively improve it by making small local changes to improve partition quality (vertex swaps that decrease edge-cut cost) cut cost 4 cut cost 2
31 Kernighan-Lin Algorithm More precisely, the problem is: Given: an undirected graph G(V, E) with 2n vertices, edges (a, b) E with weights w(a, b) Find: sets A and B, so that V = A B, A B = 0, and A = B = n that minimizes the cost (a,b) AxB w(a, b) Approach: Take initial partition and iteratively improve it. Exchange two vertices and see if cost of cut size is reduced. Select best pair of vertices, lock them, continue. When all vertices locked one iteration is done. Original algorithm O(n 3 ). Complicated improvement by Fiduccia-Mattheyses is O( E ).
32 Kernighan-Lin Algorithm Let C = cost(a,b) E(a) = external cost of a in A = b B w(a, b) I(a) = internal cost of a in A = a A,a a w(a, a ) D(a) = cost of a in A = E(a) - I(a) Consider swapping X={a} and Y={b}. (newa = A - X Y newb = B - Y X) newc = C - (D(a) + D(b) - 2*w(a,b)) = C - gain(a,b) D(6) = 1 D(1) = 1 newd(a ) = D(a ) + 2 w(a,a) - 2 w(a,b) for a A, a a newd(b ) = D(b ) + 2 w(b,b) - 2 w(b,a) for b B, b b D(3) = 0 newd(3) = -2
33 Let C = cost(a,b) Kernighan-Lin Algorithm A E(a) = external cost of a in A = b B w(a, b) I(a) = internal cost of a in A = a A,a a w(a, a ) D(a) = cost of a in A = E(a) - I(a) Consider swapping X={a} and Y={b}. (newa = A - X Y newb = B - Y X) newc = C - (D(a) + D(b) - 2*w(a,b)) = C - gain(a,b) X Y Z B D(Y) = 1 D(Z) = 1 newc = C newd(a ) = D(a ) + 2 w(a,a) - 2 w(a,b) for a A, a a newd(b ) = D(b ) + 2 w(b,b) - 2 w(b,a) for b B, b b W
34 Kernighan-Lin Algorithm Compute C = cost(a,b) for initial A,B Repeat Compute costs D for all verts Unmark all nodes While there are unmarked nodes Find unmarked pair (a,b) maximizing gain(a,b) Mark a and b (do not swap) Update D for all unmarked verts (as if a,b swapped) End Pick sets of pairs maximizing gain if (Gain>0) then actually swap Update A = A - {a1,a2,...am} + {b1,b2,...bm} B = B - {b1,b2,...bm} + {a1,a2,...,am} C = C - Gain Until Gain<0
35 Kernighan-Lin Algorithm KL can sometimes climb out of local minima...
36 Kernighan-Lin Algorithm gets better solution; but need good partitions to start
37 Graph Coarsening Adjacent vertices are combined to form a multinode at next level, with weight equal to the sum of the original weights. Edges are the union of edges of the original vertices, also weighted. Coarser graph still represents original graph Graph collapse uses maximal matching = set of edges, no two of which are incident on the same vertex. The matched vertices are collapsed into the multinode. Unmatched vertices copied to next level. Heuristics that combine 2 vertices sharing edge with heaviest weight, or randomly chosen unmatched vertex,...
38 Graph Coarsening Fewer remaining visible edges on coarsest grid easier to partition
39 Coarsen graph Multilevel Graph Partitioning Partition the coarse graph Refine graph, using local refinement algorithm (e.g.k-l) vertices in larger graph assigned to same set as coarser graph s vertex. since vertex weight conserved, balance preserved similarly for edge weights Moving one node with K-L on coarse graph equivalent to moving large number of vertices in original graph but much faster.
40 Re-Partitioning when workload changes dynamically, need to re-partition as well as minimizing redistribution cost. Options include: partition from scratch (use incremental partitioner, or try to map on to processors well) called scratch-remap give away excess, called cut-and-paste repartitioning diffusive repartitioning Should you minimize sum of vertices changing subdomains (total volume of communication = TotalV), or max volume per processor (called maxv).
41 Re-Partitioning (b) from scratch (c) cut-and-paste, (d) diffusive
42 Diffusion-based Partitioning Iterative method used for re-partitioning - migrate tasks from overutilized processors to underutilized ones. Variations on which nodes to move, how many to move at one time. Based on Cybenko model w t+1 i = w t i + j α ij (w t j w t i ) if w j w i > 0 processor j gives work to i, else other way around. At steady state the temperature is constant (computational load is equal) Slow to converge, use multilevel version, or recursive bisection verion. Solve optimization problem to minimize norm of data movement (1- or 2-norm).
43 Multiphase/Multiconstraint Graph Partitioning Many simulations have multiple phases - e.g. first compute fluid step, next compute the structural deformation, move geometry,... Each step has different CPU and memory requirements. Would like to load balance each phase. single partition that balances all phases? multiple partition with redistribution between phases?
44 Issues with Edge Cut Approximation 7 edges cut 9 items communicated vertex 1 in A connected to two vertices in B but it only needs to be sent once. Edge cuts Communication volume Communication volume Communication cost
45 Hypergraphs Hypergraph H = (V, E) where E is a hyperedge = subset of V, i.e. connects more than two vertices e 1 = {v 1, v 2, v 3 } e 2 = {v 2, v 3 } e 3 = {v 3, v 5, v 6 } e 4 = {v 4 } k-way partitioning: find P = {V o,..., V k 1 } to minimize cut(h,p) = Σ E 1 i=0 (λ i (H, P) 1) λ i (H, P) = number of partitions spanned by hyperedge i
46 Other Issues Heterogeneous machines Aspect ratio of subdomains (needed for convergence rate of iterative solvers)
47 Software Packages Also, graph partitioning archive at Univ. of Greenwich by Walshaw.
48 Slide 91 Test Data Xyce 680K ASIC Stripped Circuit Simulation 680K x 680K 2.3M nonzeros SLAC *LCLS Radio Frequency Gun 6.0M x 6.0M 23.4M nonzeros Cage15 DNA Electrophoresis 5.1M x 5.1M 99M nonzeros SLAC Linear Accelerator 2.9M x 2.9M 11.4M nonzeros from Zoltan tutorial slides, by Erik Boman and Karen Devine
49 SLAC 6.0M LCLS Communication Volume: Lower is Better Number of parts = number of processors. Xyce 680K circuit Slide 92 SLAC 2.9M Linear Accelerator RCB Graph Hypergraph HSFC Cage15 5.1M electrophoresis from Zoltan tutorial slides, by Erik Boman and Karen Devine
50 SLAC 6.0M LCLS Partitioning Time: Lower is better 1024 parts. Varying number of processors. Xyce 680K circuit Slide 93 SLAC 2.9M Linear Accelerator RCB Graph Hypergraph HSFC Cage15 5.1M electrophoresis from Zoltan tutorial slides, by Erik Boman and Karen Devine
51 SLAC 6.0M LCLS Repartitioning Results: Lower is Better Xyce 680K circuit Slide 95 Data Redistribution Volume Application Communication Volume Repartitioning Time (secs) from Zoltan tutorial slides, by Erik Boman and Karen Devine
52 References Graph Partitioning for High Performance Scientific Simulations by K. Schloegel, G. Karypis and V. Kumar. in CRPC Parallel Computing Handbook, (2000). (University of Minnesota TR 0018) Load Balancing Fictions, Falsehoods and Fallacie by Bruce Hendrickson Applied Math Modelling, (preprint from his website; many other relevant papers there too). Zoltan tutorial by E. Boman and K. Devine kddevin/papers/ Zoltan_Tutorial_Slides.pdf
Partitioning and Dynamic Load Balancing for Petascale Applications
Partitioning and Dynamic Load Balancing for Petascale Applications Karen Devine, Sandia National Laboratories Erik Boman, Sandia National Laboratories Umit Çatalyürek, Ohio State University Lee Ann Riesen,
A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS
SIAM J. SCI. COMPUT. Vol. 20, No., pp. 359 392 c 998 Society for Industrial and Applied Mathematics A FAST AND HIGH QUALITY MULTILEVEL SCHEME FOR PARTITIONING IRREGULAR GRAPHS GEORGE KARYPIS AND VIPIN
Partitioning and Load Balancing for Emerging Parallel Applications and Architectures
Chapter Partitioning and Load Balancing for Emerging Parallel Applications and Architectures Karen D. Devine, Erik G. Boman, and George Karypis. Introduction An important component of parallel scientific
Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations
Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Roy D. Williams, 1990 Presented by Chris Eldred Outline Summary Finite Element Solver Load Balancing Results Types Conclusions
Distributed Dynamic Load Balancing for Iterative-Stencil Applications
Distributed Dynamic Load Balancing for Iterative-Stencil Applications G. Dethier 1, P. Marchot 2 and P.A. de Marneffe 1 1 EECS Department, University of Liege, Belgium 2 Chemical Engineering Department,
Mesh Generation and Load Balancing
Mesh Generation and Load Balancing Stan Tomov Innovative Computing Laboratory Computer Science Department The University of Tennessee April 04, 2012 CS 594 04/04/2012 Slide 1 / 19 Outline Motivation Reliable
Fast Multipole Method for particle interactions: an open source parallel library component
Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,
DATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
New Challenges in Dynamic Load Balancing
New Challenges in Dynamic Load Balancing Karen D. Devine 1, Erik G. Boman, Robert T. Heaphy, Bruce A. Hendrickson Discrete Algorithms and Mathematics Department, Sandia National Laboratories, Albuquerque,
Dynamic Load Balancing for Cluster Computing Jaswinder Pal Singh, CSE @ Technische Universität München. e-mail: [email protected]
Dynamic Load Balancing for Cluster Computing Jaswinder Pal Singh, CSE @ Technische Universität München. e-mail: [email protected] Abstract: In parallel simulations, partitioning and load-balancing algorithms
Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations
Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations Umit V. Catalyurek, Erik G. Boman, Karen D. Devine, Doruk Bozdağ, Robert Heaphy, and Lee Ann Riesen Ohio State University Sandia
Tutorial: Partitioning, Load Balancing and the Zoltan Toolkit
Tutorial: Partitioning, Load Balancing and the Zoltan Toolkit Erik Boman and Karen Devine Discrete Algorithms and Math Dept. Sandia National Laboratories, NM CSCAPES Institute SciDAC Tutorial, MIT, June
Mesh Partitioning and Load Balancing
and Load Balancing Contents: Introduction / Motivation Goals of Load Balancing Structures Tools Slide Flow Chart of a Parallel (Dynamic) Application Partitioning of the initial mesh Computation Iteration
Characterizing the Performance of Dynamic Distribution and Load-Balancing Techniques for Adaptive Grid Hierarchies
Proceedings of the IASTED International Conference Parallel and Distributed Computing and Systems November 3-6, 1999 in Cambridge Massachusetts, USA Characterizing the Performance of Dynamic Distribution
A Comparison of Load Balancing Algorithms for AMR in Uintah
1 A Comparison of Load Balancing Algorithms for AMR in Uintah Qingyu Meng, Justin Luitjens, Martin Berzins UUSCI-2008-006 Scientific Computing and Imaging Institute University of Utah Salt Lake City, UT
Part 2: Community Detection
Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -
A scalable multilevel algorithm for graph clustering and community structure detection
A scalable multilevel algorithm for graph clustering and community structure detection Hristo N. Djidjev 1 Los Alamos National Laboratory, Los Alamos, NM 87545 Abstract. One of the most useful measures
CSE 4351/5351 Notes 7: Task Scheduling & Load Balancing
CSE / Notes : Task Scheduling & Load Balancing Task Scheduling A task is a (sequential) activity that uses a set of inputs to produce a set of outputs. A task (precedence) graph is an acyclic, directed
NETZCOPE - a tool to analyze and display complex R&D collaboration networks
The Task Concepts from Spectral Graph Theory EU R&D Network Analysis Netzcope Screenshots NETZCOPE - a tool to analyze and display complex R&D collaboration networks L. Streit & O. Strogan BiBoS, Univ.
Social Media Mining. Graph Essentials
Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures
Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008
A tutorial on: Iterative methods for Sparse Matrix Problems Yousef Saad University of Minnesota Computer Science and Engineering CRM Montreal - April 30, 2008 Outline Part 1 Sparse matrices and sparsity
A Refinement-tree Based Partitioning Method for Dynamic Load Balancing with Adaptively Refined Grids
A Refinement-tree Based Partitioning Method for Dynamic Load Balancing with Adaptively Refined Grids William F. Mitchell Mathematical and Computational Sciences Division National nstitute of Standards
Big Data Graph Algorithms
Christian Schulz CompSE seminar, RWTH Aachen, Karlsruhe 1 Christian Schulz: Institute for Theoretical www.kit.edu Informatics Algorithm Engineering design analyze Algorithms implement experiment 1 Christian
Dynamic Mapping and Load Balancing on Scalable Interconnection Networks Alan Heirich, California Institute of Technology Center for Advanced Computing Research The problems of mapping and load balancing
walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation
walberla: Towards an Adaptive, Dynamically Load-Balanced, Massively Parallel Lattice Boltzmann Fluid Simulation SIAM Parallel Processing for Scientific Computing 2012 February 16, 2012 Florian Schornbaum,
USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS
USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA [email protected] ABSTRACT This
Load Balancing. Load Balancing 1 / 24
Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA [email protected]
Graph/Network Visualization
Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation
Topological Properties
Advanced Computer Architecture Topological Properties Routing Distance: Number of links on route Node degree: Number of channels per node Network diameter: Longest minimum routing distance between any
Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit
Data Structures Page 1 of 24 A.1. Arrays (Vectors) n-element vector start address + ielementsize 0 +1 +2 +3 +4... +n-1 start address continuous memory block static, if size is known at compile time dynamic,
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
Load Balancing Strategies for Parallel SAMR Algorithms
Proposal for a Summer Undergraduate Research Fellowship 2005 Computer science / Applied and Computational Mathematics Load Balancing Strategies for Parallel SAMR Algorithms Randolf Rotta Institut für Informatik,
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes
Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications
Social Networks and Social Media
Social Networks and Social Media Social Media: Many-to-Many Social Networking Content Sharing Social Media Blogs Microblogging Wiki Forum 2 Characteristics of Social Media Consumers become Producers Rich
An Improved Spectral Load Balancing Method*
SAND93-016C An Improved Spectral Load Balancing Method* Bruce Hendrickson Robert Leland Abstract We describe an algorithm for the static load balancing of scientific computations that generalizes and improves
Small Maximal Independent Sets and Faster Exact Graph Coloring
Small Maximal Independent Sets and Faster Exact Graph Coloring David Eppstein Univ. of California, Irvine Dept. of Information and Computer Science The Exact Graph Coloring Problem: Given an undirected
Data Mining. Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based
Clustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
Optimizing Load Balance Using Parallel Migratable Objects
Optimizing Load Balance Using Parallel Migratable Objects Laxmikant V. Kalé, Eric Bohm Parallel Programming Laboratory University of Illinois Urbana-Champaign 2012/9/25 Laxmikant V. Kalé, Eric Bohm (UIUC)
Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations
C3P 913 June 1990 Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations Roy D. Williams Concurrent Supercomputing Facility California Institute of Technology Pasadena, California
FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG
FRIEDRICH-ALEXANDER-UNIVERSITÄT ERLANGEN-NÜRNBERG INSTITUT FÜR INFORMATIK (MATHEMATISCHE MASCHINEN UND DATENVERARBEITUNG) Lehrstuhl für Informatik 10 (Systemsimulation) Massively Parallel Multilevel Finite
Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits
Outline NP-completeness Examples of Easy vs. Hard problems Euler circuit vs. Hamiltonian circuit Shortest Path vs. Longest Path 2-pairs sum vs. general Subset Sum Reducing one problem to another Clique
Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms
Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms Amani AlOnazi, David E. Keyes, Alexey Lastovetsky, Vladimir Rychkov Extreme Computing Research Center,
CIS 700: algorithms for Big Data
CIS 700: algorithms for Big Data Lecture 6: Graph Sketching Slides at http://grigory.us/big-data-class.html Grigory Yaroslavtsev http://grigory.us Sketching Graphs? We know how to sketch vectors: v Mv
! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.
Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three
Overview Motivation and applications Challenges. Dynamic Volume Computation and Visualization on the GPU. GPU feature requests Conclusions
Module 4: Beyond Static Scalar Fields Dynamic Volume Computation and Visualization on the GPU Visualization and Computer Graphics Group University of California, Davis Overview Motivation and applications
Approximation Algorithms
Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms
! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.
Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of
Algorithm Design and Analysis
Algorithm Design and Analysis LECTURE 27 Approximation Algorithms Load Balancing Weighted Vertex Cover Reminder: Fill out SRTEs online Don t forget to click submit Sofya Raskhodnikova 12/6/2011 S. Raskhodnikova;
Mining Social-Network Graphs
342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is
Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations
Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Amara Keller, Martin Kelly, Aaron Todd 4 June 2010 Abstract This research has two components, both involving the
Load-Balancing Spatially Located Computations using Rectangular Partitions
Load-Balancing Spatially Located Computations using Rectangular Partitions Erik Saule a,, Erdeniz Ö. Başa,b, Ümit V. Çatalyüreka,c a Department of Biomedical Informatics. The Ohio State University b Department
Load Balancing Of Parallel Monte Carlo Transport Calculations
Load Balancing Of Parallel Monte Carlo Transport Calculations R.J. Procassini, M. J. O Brien and J.M. Taylor Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA 9551 The performance of
ParFUM: A Parallel Framework for Unstructured Meshes. Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008
ParFUM: A Parallel Framework for Unstructured Meshes Aaron Becker, Isaac Dooley, Terry Wilmarth, Sayantan Chakravorty Charm++ Workshop 2008 What is ParFUM? A framework for writing parallel finite element
CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on
CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #18: Dimensionality Reduc7on Dimensionality Reduc=on Assump=on: Data lies on or near a low d- dimensional subspace Axes of this subspace
1 2 3 1 1 2 x = + x 2 + x 4 1 0 1
(d) If the vector b is the sum of the four columns of A, write down the complete solution to Ax = b. 1 2 3 1 1 2 x = + x 2 + x 4 1 0 0 1 0 1 2. (11 points) This problem finds the curve y = C + D 2 t which
Multiphase Flow - Appendices
Discovery Laboratory Multiphase Flow - Appendices 1. Creating a Mesh 1.1. What is a geometry? The geometry used in a CFD simulation defines the problem domain and boundaries; it is the area (2D) or volume
CSC2420 Fall 2012: Algorithm Design, Analysis and Theory
CSC2420 Fall 2012: Algorithm Design, Analysis and Theory Allan Borodin November 15, 2012; Lecture 10 1 / 27 Randomized online bipartite matching and the adwords problem. We briefly return to online algorithms
NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES
NEW VERSION OF DECISION SUPPORT SYSTEM FOR EVALUATING TAKEOVER BIDS IN PRIVATIZATION OF THE PUBLIC ENTERPRISES AND SERVICES Silvija Vlah Kristina Soric Visnja Vojvodic Rosenzweig Department of Mathematics
State of Stress at Point
State of Stress at Point Einstein Notation The basic idea of Einstein notation is that a covector and a vector can form a scalar: This is typically written as an explicit sum: According to this convention,
Multimedia Databases. Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.
Multimedia Databases Wolf-Tilo Balke Philipp Wille Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 14 Previous Lecture 13 Indexes for Multimedia Data 13.1
TESLA Report 2003-03
TESLA Report 23-3 A multigrid based 3D space-charge routine in the tracking code GPT Gisela Pöplau, Ursula van Rienen, Marieke de Loos and Bas van der Geer Institute of General Electrical Engineering,
Calculation of Eigenmodes in Superconducting Cavities
Calculation of Eigenmodes in Superconducting Cavities W. Ackermann, C. Liu, W.F.O. Müller, T. Weiland Institut für Theorie Elektromagnetischer Felder, Technische Universität Darmstadt Status Meeting December
Load balancing in a heterogeneous computer system by self-organizing Kohonen network
Bull. Nov. Comp. Center, Comp. Science, 25 (2006), 69 74 c 2006 NCC Publisher Load balancing in a heterogeneous computer system by self-organizing Kohonen network Mikhail S. Tarkov, Yakov S. Bezrukov Abstract.
Chapter 11. 11.1 Load Balancing. Approximation Algorithms. Load Balancing. Load Balancing on 2 Machines. Load Balancing: Greedy Scheduling
Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one
Partitioning and Divide and Conquer Strategies
and Divide and Conquer Strategies Lecture 4 and Strategies Strategies Data partitioning aka domain decomposition Functional decomposition Lecture 4 and Strategies Quiz 4.1 For nuclear reactor simulation,
Lecture 7 - Meshing. Applied Computational Fluid Dynamics
Lecture 7 - Meshing Applied Computational Fluid Dynamics Instructor: André Bakker http://www.bakker.org André Bakker (2002-2006) Fluent Inc. (2002) 1 Outline Why is a grid needed? Element types. Grid types.
Guide to Partitioning Unstructured Meshes for Parallel Computing
Guide to Partitioning Unstructured Meshes for Parallel Computing Phil Ridley Numerical Algorithms Group Ltd, Wilkinson House, Jordan Hill Road, Oxford, OX2 8DR, UK, email: [email protected] April 17,
Medical Information Management & Mining. You Chen Jan,15, 2013 [email protected]
Medical Information Management & Mining You Chen Jan,15, 2013 [email protected] 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
Load Balancing Techniques
Load Balancing Techniques 1 Lecture Outline Following Topics will be discussed Static Load Balancing Dynamic Load Balancing Mapping for load balancing Minimizing Interaction 2 1 Load Balancing Techniques
Linear Algebra Review. Vectors
Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka [email protected] http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa Cogsci 8F Linear Algebra review UCSD Vectors The length
5.1 Bipartite Matching
CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson
Random graphs with a given degree sequence
Sourav Chatterjee (NYU) Persi Diaconis (Stanford) Allan Sly (Microsoft) Let G be an undirected simple graph on n vertices. Let d 1,..., d n be the degrees of the vertices of G arranged in descending order.
Customer Training Material. Lecture 4. Meshing in Mechanical. Mechanical. ANSYS, Inc. Proprietary 2010 ANSYS, Inc. All rights reserved.
Lecture 4 Meshing in Mechanical Introduction to ANSYS Mechanical L4-1 Chapter Overview In this chapter controlling meshing operations is described. Topics: A. Global Meshing Controls B. Local Meshing Controls
Nimble Algorithms for Cloud Computing. Ravi Kannan, Santosh Vempala and David Woodruff
Nimble Algorithms for Cloud Computing Ravi Kannan, Santosh Vempala and David Woodruff Cloud computing Data is distributed arbitrarily on many servers Parallel algorithms: time Streaming algorithms: sublinear
Best practices for efficient HPC performance with large models
Best practices for efficient HPC performance with large models Dr. Hößl Bernhard, CADFEM (Austria) GmbH PRACE Autumn School 2013 - Industry Oriented HPC Simulations, September 21-27, University of Ljubljana,
Complex Networks Analysis: Clustering Methods
Complex Networks Analysis: Clustering Methods Nikolai Nefedov Spring 2013 ISI ETH Zurich [email protected] 1 Outline Purpose to give an overview of modern graph-clustering methods and their applications
Clustering. Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016
Clustering Danilo Croce Web Mining & Retrieval a.a. 2015/201 16/03/2016 1 Supervised learning vs. unsupervised learning Supervised learning: discover patterns in the data that relate data attributes with
HPC Deployment of OpenFOAM in an Industrial Setting
HPC Deployment of OpenFOAM in an Industrial Setting Hrvoje Jasak [email protected] Wikki Ltd, United Kingdom PRACE Seminar: Industrial Usage of HPC Stockholm, Sweden, 28-29 March 2011 HPC Deployment
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics Zhao Wenbin 1, Zhao Zhengxu 2 1 School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu
Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary
Statistical machine learning, high dimension and big data
Statistical machine learning, high dimension and big data S. Gaïffas 1 14 mars 2014 1 CMAP - Ecole Polytechnique Agenda for today Divide and Conquer principle for collaborative filtering Graphical modelling,
