Cost-based Optimization of Graph Queries in Relational Database Management Systems


DISSERTATION

for the attainment of the academic degree Dr. rer. nat. in Computer Science (Informatik)

submitted to the Mathematisch-Naturwissenschaftliche Fakultät II, Humboldt-Universität zu Berlin

by Dipl.-Ing. (FH) Silke Trißl, M.Sc.

President of Humboldt-Universität zu Berlin: Prof. Dr. Jan-Hendrik Olbertz
Dean of the Mathematisch-Naturwissenschaftliche Fakultät II: Prof. Dr. Elmar Kulke

Reviewers:
1. Prof. Dr. Ulf Leser
2. Prof. Johann-Christoph Freytag, Ph.D.
3. Prof. Dr. Thorsten Grust

Submitted on:
Date of oral examination:
Everything has an end, only the sausage has two.
Stephan Remmler

Acknowledgement

This thesis would not have been possible without the help, support, and encouragement of many people. First of all, I would like to thank my supervisor Prof. Ulf Leser. He gave me the opportunity to start my PhD and provided a welcoming and pleasant working environment at Humboldt-Universität zu Berlin. I am greatly indebted to him for his patience, encouragement, and guidance during all these years with their ups and downs. I could not have imagined a more motivated or dedicated advisor for my PhD study.

I am grateful to all who gave me the opportunity to partly finance my PhD by teaching. I met committed and inquiring students in the courses and exercises I taught for Prof. Ulf Leser at HU Berlin and Prof. Felix Naumann at HPI Potsdam. Dr. Márta Gutsche and the Frauenförderung at HU Berlin gave me the opportunity to spark girls' interest in studying computer science. I thank Prof. Louiqa Raschid at the University of Maryland, who invited me for a research exchange to the US. I am also grateful to the BMBF, which supported my research.

I would not have finished this PhD thesis without the help and support of many colleagues and friends. Thanks to Jörg, Timo, and Philippe, who shared an office with me. Thanks to Jens, Melanie, Jana, Long, Roger, and Samira, who also accompanied me for a long time during my thesis. I want to acknowledge all researchers and students from the groups WBI and DBIS at HU, Informationssysteme at HPI, and Genetik und Biometrie at FBN. Many thanks for constructive criticism and helpful suggestions. I am greatly indebted to all colleagues who tried to cheer me up during common lunch and coffee breaks.

I also acknowledge some students whom I met during my time in Berlin. Raphael and Philipp did a lot of programming in my first project, Columba. Johannes, Christoph, Florian, and André used some ideas of GRIPP in their Studien- or Diplomarbeiten and gave feedback on the algorithm.
Last but not least, I would like to thank my family, who shared joy and sorrow with me during the entire time. My parents always had, and still have, an open ear for my worries and troubles; my heartfelt thanks for that. Also, many thanks to my sister. Whenever I needed to discuss a problem, she listened patiently and gave me good advice.
Abstract

Graphs occur in many areas of life. We are interested in graphs in biology, where nodes are chemical compounds, enzymes, reactions, or interactions, which are connected by either directed or undirected edges. Efficiently querying these graphs is a challenging task. In this thesis we present GRIcano, a system that efficiently executes graph queries. For GRIcano we assume that graphs are stored and queried using relational database management systems (RDBMS). We use an extended version of the Pathway Query Language (PQL) to express graph queries, for which we describe the syntax and semantics in this work. We employ ideas from RDBMS to improve the performance of query execution. Thus, the core of GRIcano is a cost-based query optimizer, which is created using the Volcano optimizer generator. This thesis makes contributions to all three required components of the optimizer: the relational algebra, the implementations, and the cost model.

Relational algebra operators alone are not sufficient to express graph queries. Thus, we first present new operators to rewrite PQL queries to algebra expressions. We propose the reachability operator φ, the distance operator Φ, the path length operator ψ, and the path operator Ψ. In addition, we provide rewrite rules for the newly proposed operators in combination with standard relational algebra operators.

Secondly, we present implementations for each proposed operator. The main contribution is GRIPP, an index structure that allows us to execute reachability queries on very large graphs containing directed edges. GRIPP has advantages over other existing index structures, which we review in this work. In addition, we show how to employ GRIPP and the recursive query strategy as implementations for all four proposed operators.

The third component of GRIcano is the cost model, which requires cardinality estimates for the proposed operators and cost functions for the implementations. Based on an extensive experimental evaluation of the proposed implementations, we present functions to estimate the cardinality of the φ, Φ, ψ, and Ψ operators and the cost of executing a query. The novelty of our approach is that these functions only use key figures of the graph. Finally, we demonstrate the effectiveness of GRIcano using exemplary graph queries on real biological networks.
Zusammenfassung (Summary)

Graphs can be found in many areas of life; we are specifically interested in graphs from biology. Nodes in such graphs are chemical compounds, enzymes, reactions, or interactions, which are connected by directed or undirected edges. Executing graph queries efficiently is a challenge. In this work we present GRIcano, a system that allows the efficient execution of graph queries. We assume that the graphs are stored in relational database management systems (RDBMS) and are also queried there. As graph query language we propose an extended version of the Pathway Query Language (PQL). The main component of GRIcano is a cost-based query optimizer, which is generated with the help of the Volcano optimizer generator. This work contains contributions to all three required components of the optimizer: the relational algebra, the implementations, and the cost models.

The operators of relational algebra alone are not sufficient to express PQL queries. Therefore, we first introduce the new operators reachability φ, distance Φ, path length ψ, and path operator Ψ. In addition, we give rules for transforming expressions that contain the new operators together with the standard operators of relational algebra.

Furthermore, we present implementations for each of the proposed operators. The main contribution here is GRIPP, an index structure that allows the efficient execution of reachability queries on very large graphs with directed edges. We show how GRIPP and the recursive query strategy can be used to provide implementations for all proposed operators.

The third component of GRIcano is the cost model, which requires cardinality estimates for the proposed operators and cost models for the implementations. Based on extensive experiments, we propose functions for estimating the cardinalities of the operators φ, Φ, ψ, and Ψ. In addition, we derive functions for estimating the cost of executing graph queries. The novelty of the cost models is that the functions only use key figures of the graphs. Finally, we show the effect of GRIcano with example queries on real biological networks.
Contents

1. Introduction
   Queries on Graphs
   Motivation
   Contribution
   Structure of this Work

2. Definitions and Terminology
   Graphs
   Definitions
   Storage and Traversal
   Relational Algebra
   Algebra and Relations
   Operators
   Equivalence Rules
   Cost-Based Query Optimization
   Query Processing
   Implementation of Operators
   Cost Function and Query Optimization
   Volcano

3. Graph Queries
   Data Model
   Graph Queries
   Query Graph
   Evaluation of Graph Queries
   Pathway Query Language
   Graphs in PQL
   Syntax
   PQL and Non-graph Relations
   PQL Semantics
   Semantics of Node Conditions
   Semantics of Path Conditions
   Semantics of HAVING Conditions
   Semantics of the Subgraph Specification
   Conversion to Relational Algebra
   Related Work

4. Operators for Graph Queries
   Operators for Nodes
   Operators for Paths
   Path Operator, Ψ
   Reachability Operator, φ
   Path Length Operator, ψ
   Distance Operator, Φ
   Summary
   Related Work

5. Implementations for Operators
   GRIPP
   Index Structure
   Reachability Queries
   Distance Queries
   Path Length and Path Queries
   Other Index Structures
   Transitive Closure
   Dual Labeling
   Label + SSPI
   RDBMS Capabilities
   Recursive Strategies
   Summary
   Related Work

6. Performance of GRIPP
   Experimental Setup
   Generated Graphs
   Real-world Graphs
   Implementation Details
   Index Creation
   Query Performance
   Reachability Queries
   Distance Queries
   Path Length Queries
   Path Queries
   Comparison of Query Types
   Summary

7. GRIcano
   Cardinality Estimates
   Reachability Operator
   Distance Operator
   Path Length Operator
   Path Operator
   Validation on Real World Graphs
   Cost Functions
   Reachability Queries
   Distance Queries
   Path Length Queries
   Path Queries
   Validation on Real World Graphs
   GRIcano
   Experimental Evaluation
   Related Work
   Cardinality and Cost Estimates
   Rule-based Query Optimization
   Cost-based Query Optimization

8. Conclusion and Outlook
   Summary
   Future Work

A. Strongly Connected Component
   A.1. Kosaraju's Algorithm

B. Rewrite Rules for Operators
   B.1. Path Operator
      B.1.1. Restriction on Start and End Node
      B.1.2. Path Operator and Other Operators
   B.2. Path Length Operator
      B.2.1. Restriction on Start and End Node
      B.2.2. From Path Operator Ψ to Path Length Operator ψ
      B.2.3. Path Length Operator and Other Operators
   B.3. Distance Operator
      B.3.1. Restriction on Start and End Node
      B.3.2. From Path Operator Ψ to Distance Operator Φ
      B.3.3. Distance Operator and Other Operators
   B.4. Reachability Operator
      B.4.1. Restriction on Start and End Node
      B.4.2. From Path Operator Ψ to Reachability Operator φ
      B.4.3. Reachability Operator and Other Operators

C. Additional Algorithms for GRIPP
   C.1. Relational Schema for Storing GRIPP
   C.2. Stop Node List for GRIPP
   C.3. Reachability for Sets of Nodes

D. Graph Properties

E. Model Specification for Volcano

F. Cost and Cardinality Functions for Volcano

G. Exemplary Queries for GRIcano
1. Introduction

The topic of this work is cost-based optimization of graph queries in relational database management systems. In Section 1.1 we first introduce the kind of graphs that led us to this topic, before we proceed in Section 1.2 with the motivation for our approach. In Section 1.3 we summarize our contribution in the area of cost-based optimization of graph queries. Finally, in Section 1.4 we give an overview of this work.

1.1. Queries on Graphs

Graphs occur in many areas of life. Examples are public transport plans, road maps, the World Wide Web (WWW), or social networks. Common to all these graphs is that they consist of nodes and edges. Nodes are stations, junctions, web pages, or people. Edges in such networks are tracks, roads, links, or personal relationships. All these graphs have interesting features, but we are interested in graphs in biology. To understand the content of these graphs we first make a short digression to cell biology. For a more comprehensive introduction we refer the reader to Alberts et al. [AJW+08].

All biological cells are built in a similar fashion, though there are differences in the structure of cells between the three major groups: prokaryotes, eukaryotes, and archaea. All have in common that they contain a cell membrane as boundary to the outside and a genome, which holds the information for building and maintaining the cell. In eukaryotes the genome is contained inside the nucleus, while in prokaryotes and archaea the genome is free in the cytoplasm. The genome consists of long stretches of DNA, the chromosomes. Genes are short regions of the genome that code for a functional product in the cell. During the transcription process genes are read and transcribed into RNA. Either the RNA itself is the functional product, or the RNA, possibly with some modifications, is translated into proteins.

Proteins are the workhorses of a cell, as they catalyze reactions, process signals, or transport molecules. One class of proteins, the enzymes, catalyze chemical reactions, such as the degradation of sugar or the production of essential amino acids. Another class, the membrane proteins, reside inside the cell membrane and react to outer stimuli or facilitate the transport of substances into and out of a cell. When an outer stimulus occurs, membrane proteins may activate or inactivate proteins inside the cell to enhance or suppress reactions. There exist other protein groups such as histones, which are concerned with packing the DNA in the nucleus of eukaryotes, collagens, which occur mainly in muscle cells, or antibodies, which are required in higher organisms for the immune response.
To give an impression of the complexity of the problem, every human has about 250,000 different proteins in his or her body, according to current estimates. Each protein may interact with numerous other proteins or with some of the hundreds of thousands of organic and inorganic substances. Biologists have studied these complex interactions involving proteins and other substances. Their knowledge is stored as graphs in publicly available data sources. Biological graphs may roughly be divided into three categories: metabolic networks, signaling pathways, and protein-protein interaction networks¹. For a review of different biological networks see [BN05].

Metabolic networks are graphs that represent the conversion of substances in a cell. Nodes in these networks are proteins, other molecules such as sugars or fatty acids, or reactions. Edges in such graphs are usually directed and indicate that a molecule participates in a reaction. The most familiar conversion is glycolysis, in which glucose is converted to pyruvate, producing energy during the conversion. Proteins and reactions participating in this conversion are said to be in the glycolysis pathway. In general, pathways in metabolic networks are subgraphs that stand for specific conversions defined by researchers. The pathways may overlap, i.e., they may share proteins or reactions. Data sources for metabolic networks are, for instance, KEGG [KGK+04], BioCyc [KOMK+05], and Reactome [JTGV+05]. Figure 1.1 shows the glycolysis as given by KEGG. Circles are molecules that are converted, rectangular boxes on edges stand for reactions catalyzed by enzymes that are identified by their EC number, and the boxes with rounded corners represent other pathways.

Signaling pathways are graphs that capture the information flow in a cell. Nodes in these graphs are usually proteins or reactions, while edges represent the flow of information.
For example, Figure 1.2 shows the activation of protein kinase A (PKA) by an outer stimulus as given by BioCarta [htt11b]. The activated form of PKA regulates several reactions, including one reaction of the glycolysis presented in Figure 1.1. Depending on the outer stimulus glucose, PKA phosphorylates or dephosphorylates the complex of the two enzymes phosphofructokinase 2 and fructose-2,6-bisphosphatase. The phosphorylation status influences the reaction rate of the glycolysis.

The third group of biological graphs are protein-protein interaction networks. In these graphs nodes are proteins, while edges represent interactions between proteins and are usually undirected. Figure 1.3 shows known interactions for the protein complex phosphofructokinase 2 and fructose-2,6-bisphosphatase (PFKFB1) as given by String [vMJS+05], a data source for protein-protein interactions. The red node in the center is PFKFB1. It interacts with protein kinase A (PKACA) and several other proteins. The different colors of the edges code for different kinds of evidence, e.g., interactions found in other data sources are represented by blue edges, while interactions derived using text mining methods are shown by light green edges. Other data sources that contain data about protein-protein interactions are, for instance, DIP [XSD+02], BIND [BBH03], Intact [XSD+02], and PubGene [JLKH01].

¹ See Pathguide: the pathway resource list, for a list of data sources.

Figure 1.1.: The glycolysis as given by KEGG. The circles are molecules that are converted, rectangular boxes on edges stand for reactions catalyzed by enzymes that are identified by their EC number, and the boxes with rounded corners stand for other pathways.

Figure 1.2.: The activation of PKA through an outer stimulus, from BioCarta.

1.2. Motivation

The examples in the last section show only small parts of different biological graphs. Table 1.1 shows the number of nodes and edges of selected data sources. For example, KEGG contains 42,002 nodes and 51,450 edges in its reference pathway as of March 2011. The reference pathway is a summarization of the pathways of all species. In contrast, BioCyc stores an individual metabolic network for each of its roughly 400 species. In addition, unlike KEGG, BioCyc also represents relationships between genes and proteins.

Biologists use specialized graph viewing tools to display those graphs. For a review of the tools see Suderman & Hallett [SH07]. The tools usually display parts of the entire graph, e.g., a single pathway of a metabolic network, possibly with links to other pathways as shown in Figure 1.1. With such tools a biologist is only able to navigate through graphs. Consider the question "How many steps does a cell require to produce the amino acid lysine given the substrate glucose?" A biologist may use the metabolic network of KEGG,
where she has to start at glucose in the glycolysis pathway, follow the link to the pathway of the citrate cycle, and then follow the link to the pathway of the lysine biosynthesis. This way, she will count that 25 steps are required to produce lysine from the substrate glucose. Clearly, when manually navigating through the images of pathways, a biologist might not find the shortest path, or occasionally might find no path at all although one exists. Thus, tools are required that allow users to pose queries such as the one presented above and that return an answer to the user.

Figure 1.3.: Known protein-protein interactions for the protein complex PFKFB1 in humans. The different colors of edges stand for different kinds of evidence, e.g., interactions found in other data sources are represented by blue edges, while interactions derived using text mining methods are shown by light green edges.

In [HNM+00] van Helden and colleagues identified several other questions that are interesting for biologists:

- Get all reactions catalyzed by a given gene product.
- Find all metabolic pathways that convert compound A into compound B in fewer than X steps.
- Retrieve all genes whose expression is directly or indirectly affected by a given compound.
- Find all compounds that can be synthesized from a given precursor in fewer than X steps.

Currently, researchers have to write specialized programs to traverse the graphs to
answer such queries. Whenever they want to pose a new query, these programs need to be adjusted. In this work we present GRIcano to overcome this problem.

Biological graph                          Number of nodes   Number of edges
Metabolic networks
  KEGG [KGK+04]                           42,002            51,450
  BioCyc A. thaliana [KOMK+05]            10,951            23,649
  Reactome [JTGV+05]                      11,795            23,649
Signaling pathways
  BioCarta [htt11b]                       only images
  NetPath TGFβ [KMR+10]
  TransPath [KPV+06]                      > 100,000         > 240,000
Protein-protein interaction networks
  String [vMJS+05]                        > 2,500,000       > 50,000,000
  DIP [XSD+02]                            23,201            71,276
  Intact                                  50,…              …,044

Table 1.1.: Sizes of biological graphs (in March 2011).

1.3. Contribution

In this work we present GRIcano, a novel tool that efficiently retrieves answers to graph queries. In GRIcano we employ ideas from query optimization in relational database management systems (RDBMS) and carry these ideas over to graph query optimization. In the following chapters we target several aspects of graph queries. We specifically make the following contributions:

Extend the existing query language PQL. We present and extend the Pathway Query Language (PQL) [Les05a], which was developed to express graph queries. Using PQL a user may express conditions of a graph query as predicates. In Chapter 3 we describe the syntax as well as the semantics of PQL.

Define relational operators to express PQL queries. In order to optimize a graph query we want to be able to alter the order in which the predicates of the query are evaluated. We may achieve this by rewriting the PQL query to an algebraic expression and applying rewrite rules for transformation. As standard operators from relational algebra are not sufficient for expressing PQL queries, which we discuss in Chapter 4, we develop new operators in this thesis. We define the path operator Ψ, the path length operator ψ, the distance operator Φ, and the reachability operator φ to express predicates of graph queries and provide rewrite rules for the exchange of operators.

Propose and experimentally evaluate implementations for operators. For each proposed operator we have to provide implementations to compute the result. Thus, in Chapter 5 we discuss implementations to answer reachability,
distance, path length, and path queries. We may use GRIPP, our newly developed index structure, for answering all four types of graph queries. Chapter 6 shows that we are able to compute the GRIPP index even for very large graphs, for which the transitive closure cannot be created. In addition, using GRIPP we are able to answer reachability queries on average in almost constant time, regardless of the size and shape of the graph.

Develop functions to estimate cardinality of operators and cost of implementations. For cost-based query optimization we require cardinality estimates for the different operators and cost functions for each implementation. In Chapter 7 we develop equations that are based on key figures of the graph, which is to our knowledge a novel approach. Using our cost functions we correctly predict the result sizes and the fastest implementations on generated as well as on real-world graphs.

Present and evaluate a prototypical implementation of GRIcano. In Chapter 7 we present GRIcano, the first system that performs cost-based query optimization for graph queries. The underlying cost-based query optimizer is generated using the Volcano framework [GM93]. Volcano requires as input the available operators and rewrite rules of the algebra, the available implementations for the different operators, and the equations for the cardinality and cost estimates. We show the effect of GRIcano using exemplary queries.

1.4. Structure of this Work

In Chapter 2 we introduce basic notation on graphs, relational algebra, and cost-based query optimization. Chapter 3 is devoted to a data model for storing graphs, graph queries, and PQL, a language to express graph queries. In Chapter 4 we first argue that PQL queries should be executed like standard SQL queries, i.e., by first transforming them to an algebraic expression. We motivate the need for new operators in the algebra and introduce the path operator Ψ, the path length operator ψ, the distance operator Φ, and the reachability operator φ. We also provide rewrite rules for exchanging operators. In Chapter 5 we provide implementations for the operators proposed in Chapter 4. We present GRIPP, an index structure to efficiently answer reachability queries even on large graphs. In Chapter 6 we experimentally evaluate the presented implementations. In Chapter 7 we devise functions to estimate the cardinality for the four newly defined operators and cost functions for the different implementations. In that chapter we also introduce GRIcano, our graph query optimizer. We show the capabilities of GRIcano using selected queries. Chapter 8 concludes the work.
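To make the setting of Section 1.2 concrete, the following is a minimal sketch of how one of the listed questions (all nodes that can be reached from a given start node in fewer than X steps) could be posed against a graph stored in a relational database using a recursive query. It is not GRIPP, PQL, or the implementation used in GRIcano. The table name `edge`, its columns `src` and `dst`, the function name `within_steps`, and the toy node labels are all assumptions made for illustration; the sketch relies on SQLite's `WITH RECURSIVE` support through Python's standard `sqlite3` module.

```python
import sqlite3


def within_steps(edge_rows, start, max_steps):
    """Return (node, minimal steps) for nodes reachable from `start`
    in fewer than `max_steps` steps, using a recursive SQL query.

    The schema edge(src, dst) is a hypothetical edge table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edge_rows)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node, steps) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, r.steps + 1
            FROM reach AS r JOIN edge AS e ON e.src = r.node
            WHERE r.steps + 1 < ?      -- the step bound also stops cycles
        )
        SELECT node, MIN(steps)
        FROM reach
        WHERE steps > 0
        GROUP BY node
        ORDER BY node
        """,
        (start, max_steps),
    ).fetchall()
    conn.close()
    return rows


# Hypothetical toy graph with a cycle between A and B.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "A")]
result = within_steps(toy_edges, "A", 3)
```

In this sketch the step bound is what guarantees termination on cyclic graphs; an unbounded reachability query written this way would rely on `UNION` removing duplicate rows instead.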
2. Definitions and Terminology

This chapter introduces basic notation on graphs, relational algebra, and query optimization. In Section 2.1 we formally define graphs and properties of graphs. Section 2.2 introduces fundamental concepts behind relational algebra. In Section 2.3 we present an introduction to cost-based query optimization in relational database management systems.

2.1. Graphs

This work mostly deals with graph-structured data. We therefore formally introduce graphs. For this purpose we adopt notation from Cormen et al. [CLR01].

2.1.1. Definitions

Definition 2.1 (Graph) A graph $G = (V(G), E(G))$ is a tuple consisting of a set of nodes $V(G)$ and a set of edges $E(G)$, with $E(G) \subseteq V(G) \times V(G)$.

Whenever the context of the graph is clear we may write $G = (V, E)$. There exist two types of graphs, directed and undirected graphs. Directed graphs have ordered pairs of nodes in $E$. In contrast, in undirected graphs the set $E$ contains unordered pairs of nodes. Consider $(u, v) \in E$ with $u, v \in V$ and $u \neq v$. In a directed graph only $v$ is adjacent to $u$, while in an undirected graph the relation is symmetric, i.e., $(u, v)$ is the same as $(v, u)$. If $(u, v) \in E$ in a directed graph we say node $u$ has the outgoing edge $(u, v)$ and therefore $u$ is the start node of $(u, v)$. In analogy, $(u, v)$ is an incoming edge of node $v$ and therefore $v$ is the target node of $(u, v)$. We call $u$ the parent of $v$ and $v$ the child of $u$.

Definition 2.2 (Size of a graph) Let $G = (V, E)$. The size of $G$ is the number of nodes $|V|$ plus the number of edges $|E|$ in $G$, i.e., $|G| = |V| + |E|$.

Based on the ratio between edges and nodes, which is called the density of a graph, we are able to divide graphs into two groups, sparse and dense graphs. The literature does not provide a clear distinction between the two types. As a rule of thumb, if the number of edges $|E|$ is close to $|V|^2$ the graphs are called dense, otherwise, if $|E| \ll |V|^2$, they are sparse.
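As an illustration of Definitions 2.1 and 2.2, the following minimal sketch stores a directed graph as a set of nodes and a set of ordered pairs and computes its size and the ratio $|E| / |V|^2$ used in the rule of thumb above. The toy graph and all function names are assumptions made for this example; they do not appear in the original work.

```python
# Hypothetical toy graph; node labels and edges are illustrative only.
nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("b", "d")}  # ordered pairs (u, v)


def is_valid_graph(nodes, edges):
    """Check E ⊆ V × V: every edge must connect two known nodes."""
    return all(u in nodes and v in nodes for (u, v) in edges)


def graph_size(nodes, edges):
    """Size of a graph as in Definition 2.2: |G| = |V| + |E|."""
    return len(nodes) + len(edges)


def edge_ratio(nodes, edges):
    """|E| / |V|^2; values near 1 suggest a dense, values near 0 a sparse graph."""
    return len(edges) / (len(nodes) ** 2)


def as_undirected(edges):
    """Undirected view: (u, v) and (v, u) collapse into one unordered pair."""
    return {frozenset(edge) for edge in edges}
```

Representing undirected edges as `frozenset` objects mirrors the statement that $(u, v)$ and $(v, u)$ denote the same edge; any other canonical ordering of the two endpoints would serve equally well.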
Figure 2.1.: A directed graph. Circles represent nodes; arrows between nodes represent edges. Nodes in this example are uniquely labeled. The size of the graph is 14 (6 nodes plus 8 edges). For example, the degree of node b is $\deg(b) = 3$.

To describe the shape of a graph we look at the distribution of node degrees. To do so, we first define the degree of a node.

Definition 2.3 (Degree of a node) Given a graph $G = (V, E)$. The degree $\deg(v)$ of a node $v \in V$ is the number of edges in which $v$ participates. If $G$ is directed we may distinguish between the indegree $\deg_{in}(v)$ and the outdegree $\deg_{out}(v)$ of a node $v$. The indegree is the number of edges with $v$ as target node and, in analogy, the outdegree is the number of edges with $v$ as start node.

Based on the distribution of the node degrees we distinguish between different graph topologies. The distribution of the node degrees of random graphs follows a binomial distribution. Graphs where the distribution of the node degrees follows a power law are called scale-free. Barabási and Oltvai describe these topologies in [BO04].

Nodes and edges are often labeled. Therefore we define a label function for nodes and edges of a graph.

Definition 2.4 (Label function, φ) Let $L$ be a set of labels. A label function $\varphi$ assigns labels to nodes and edges, $\varphi(v, L): V \to L$ and $\varphi(e, L): E \to L$.

In this work we assume each label $l \in L$ consists of a type and a value. Graphs also contain paths.

Definition 2.5 (Path and path length) Let $G = (V, E)$. A path $p$ is a sequence of nodes $v_0, v_1, v_2, \ldots, v_k$, $v_i \in V$, such that $(v_{i-1}, v_i) \in E$ for $i = 1, 2, \ldots, k$. The length of the path is the number of edges in the path. If there exists a path $p$ from $u$ to $w$ we say $w$ is reachable from $u$, written as $u \leadsto w$.
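Definitions 2.3 and 2.5 can be illustrated with a short sketch that computes node degrees and uses a breadth-first traversal to decide reachability and to find the length of a shortest path. This is a plain textbook traversal and not the GRIPP index; the toy edge set and all function names are assumptions made for this example. Following the definition with $k = 0$, every node is treated as reachable from itself by a path of length 0.

```python
from collections import deque

# Hypothetical toy graph given as a set of directed edges (u, v).
edges = {("a", "b"), ("b", "c"), ("c", "a"), ("b", "d")}


def in_degree(edges, v):
    """Number of edges with v as target node."""
    return sum(1 for (_, w) in edges if w == v)


def out_degree(edges, v):
    """Number of edges with v as start node."""
    return sum(1 for (u, _) in edges if u == v)


def degree(edges, v):
    """Number of edges in which v participates."""
    return sum(1 for (u, w) in edges if v in (u, w))


def distance(edges, start, target):
    """Length of a shortest path from start to target, or None if none exists."""
    successors = {}
    for u, w in edges:
        successors.setdefault(u, []).append(w)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in successors.get(u, []):
            if w not in dist:  # visit every node at most once
                dist[w] = dist[u] + 1
                queue.append(w)
    return None


def reachable(edges, start, target):
    """True if there is a path from start to target."""
    return distance(edges, start, target) is not None
```

Each such traversal touches every node and edge at most once, so answering a single query this way costs time linear in the size of the graph.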
More informationEfficiently Identifying Inclusion Dependencies in RDBMS
Efficiently Identifying Inclusion Dependencies in RDBMS Jana Bauckmann Department for Computer Science, HumboldtUniversität zu Berlin Rudower Chaussee 25, 12489 Berlin, Germany bauckmann@informatik.huberlin.de
More informationQ4. What are data model? Explain the different data model with examples. Q8. Differentiate physical and logical data independence data models.
FAQs Introduction to Database Systems and Design Module 1: Introduction Data, Database, DBMS, DBA Q2. What is a catalogue? Explain the use of it in DBMS. Q3. Differentiate File System approach and Database
More informationDatenbanksysteme II: Implementation of Database Systems Implementing Joins
Datenbanksysteme II: Implementation of Database Systems Implementing Joins Material von Prof. Johann Christoph Freytag Prof. KaiUwe Sattler Prof. Alfons Kemper, Dr. Eickler Prof. Hector GarciaMolina
More informationPractical Graph Mining with R. 5. Link Analysis
Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities
More informationComp 5311 Database Management Systems. 16. Review 2 (Physical Level)
Comp 5311 Database Management Systems 16. Review 2 (Physical Level) 1 Main Topics Indexing Join Algorithms Query Processing and Optimization Transactions and Concurrency Control 2 Indexing Used for faster
More informationA New Advanced Query Web Page and its query language
A New Advanced Query Web Page and its query language To replace the advanced query web form on www.biocyc.org Mario Latendresse Bioinformatics Research Group SRI International Mario@ai.sri.com 1 The Actual
More informationIE 680 Special Topics in Production Systems: Networks, Routing and Logistics*
IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti
More informationData Structure [Question Bank]
Unit I (Analysis of Algorithms) 1. What are algorithms and how they are useful? 2. Describe the factor on best algorithms depends on? 3. Differentiate: Correct & Incorrect Algorithms? 4. Write short note:
More informationGSPARQL: A Hybrid Engine for Querying Large Attributed Graphs
GSPARQL: A Hybrid Engine for Querying Large Attributed Graphs Sherif Sakr National ICT Australia UNSW, Sydney, Australia ssakr@cse.unsw.edu.eu Sameh Elnikety Microsoft Research Redmond, WA, USA samehe@microsoft.com
More informationV. Adamchik 1. Graph Theory. Victor Adamchik. Fall of 2005
V. Adamchik 1 Graph Theory Victor Adamchik Fall of 2005 Plan 1. Basic Vocabulary 2. Regular graph 3. Connectivity 4. Representing Graphs Introduction A.Aho and J.Ulman acknowledge that Fundamentally, computer
More informationCSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.
Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure
More informationApplication of Graphbased Data Mining to Metabolic Pathways
Application of Graphbased Data Mining to Metabolic Pathways Chang Hun You, Lawrence B. Holder, Diane J. Cook School of Electrical Engineering and Computer Science Washington State University Pullman,
More informationControl of Gene Expression
Home Gene Regulation Is Necessary? Control of Gene Expression By switching genes off when they are not needed, cells can prevent resources from being wasted. There should be natural selection favoring
More informationDefinition. E.g. : Attempting to represent a transport link data with a tree structure:
The ADT Graph Recall the ADT binary tree: a tree structure used mainly to represent 1 to 2 relations, i.e. each item has at most two immediate successors. Limitations of tree structures: an item in a tree
More informationSPANNING CACTI FOR STRUCTURALLY CONTROLLABLE NETWORKS NGO THI TU ANH NATIONAL UNIVERSITY OF SINGAPORE
SPANNING CACTI FOR STRUCTURALLY CONTROLLABLE NETWORKS NGO THI TU ANH NATIONAL UNIVERSITY OF SINGAPORE 2012 SPANNING CACTI FOR STRUCTURALLY CONTROLLABLE NETWORKS NGO THI TU ANH (M.Sc., SFU, Russia) A THESIS
More information5. The chart below indicates the elements contained in four different molecules and the number of atoms of each element in those molecules.
1. In the diagram below, which substance belongs in area Z? 5. The chart below indicates the elements contained in four different molecules and the number of atoms of each element in those molecules. A)
More informationLoad balancing Static Load Balancing
Chapter 7 Load Balancing and Termination Detection Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination detection
More informationGraph theoretic approach to analyze amino acid network
Int. J. Adv. Appl. Math. and Mech. 2(3) (2015) 3137 (ISSN: 23472529) Journal homepage: www.ijaamm.com International Journal of Advances in Applied Mathematics and Mechanics Graph theoretic approach to
More informationUnit 2 Metabolism and Survival Summary
Unit 2 Metabolism and Survival Summary 1 Metabolism pathways and their control (a) Introduction to metabolic pathways This involves integrated and controlled pathways of enzymecatalysed reactions within
More informationwww.gr8ambitionz.com
Data Base Management Systems (DBMS) Study Material (Objective Type questions with Answers) Shared by Akhil Arora Powered by www. your A to Z competitive exam guide Database Objective type questions Q.1
More informationBioinformatics: Network Analysis
Bioinformatics: Network Analysis Molecular Cell Biology: A Brief Review COMP 572 (BIOS 572 / BIOE 564)  Fall 2013 Luay Nakhleh, Rice University 1 The Tree of Life 2 Prokaryotic vs. Eukaryotic Cell Structure
More informationProteins. Molecular Physiology: Enzymes and Cell Signaling. Binding. Protein Specificity. Enzymes. Enzymatic Reactions
Proteins Molecular Physiology: Enzymes and Cell Signaling Polymers of amino acids Have complex 3D structures Are the basis of most of the structure and physiological function of cells Binding Much of protein
More informationData Structures in Java. Session 16 Instructor: Bert Huang
Data Structures in Java Session 16 Instructor: Bert Huang http://www.cs.columbia.edu/~bert/courses/3134 Announcements Homework 4 due next class Remaining grades: hw4, hw5, hw6 25% Final exam 30% Midterm
More informationGenetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism )
Biology 1406 Exam 3 Notes Structure of DNA Ch. 10 Genetic information (DNA) determines structure of proteins DNA RNA proteins cell structure 3.11 3.15 enzymes control cell chemistry ( metabolism ) Proteins
More informationDatabases and Information Systems 1 Part 3: Storage Structures and Indices
bases and Information Systems 1 Part 3: Storage Structures and Indices Prof. Dr. Stefan Böttcher Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents:  database buffer 
More informationDiscrete Mathematics & Mathematical Reasoning Chapter 10: Graphs
Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs Kousha Etessami U. of Edinburgh, UK Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 6) 1 / 13 Overview Graphs and Graph
More informationQuery Processing + Optimization: Outline
Query Processing + Optimization: Outline Operator Evaluation Strategies Query processing in general Selection Join Query Optimization Heuristic query optimization Costbased query optimization Query Tuning
More information11 Reconstruction of the stoichiometry of ATP and NADHproducing systems using evolutionary algorithms
11 Reconstruction of the stoichiometry of ATP and NADHproducing systems using evolutionary algorithms O. Ebenhöh and R. Heinrich HumboldtUniversität zu Berlin, Institut für Biologie, Theoretische Biophysik,
More informationHuman Biology Higher Homework: Topic Human Cells. Subtopic3: Cell Metabolism
Human Biology Higher Homework: Topic Human Cells Subtopic3: Cell Metabolism 1. During which of the following chemical conversions is A T P produced? A B C D Amino acids protein Glucose pyruvic acid Haemoglobin
More informationCh. 12: DNA and RNA 12.1 DNA Chromosomes and DNA Replication
Ch. 12: DNA and RNA 12.1 DNA A. To understand genetics, biologists had to learn the chemical makeup of the gene Genes are made of DNA DNA stores and transmits the genetic information from one generation
More information[Refer Slide Time: 05:10]
Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture
More informationData Lineage and Meta Data Analysis in Data Warehouse Environments
Department of Informatics, University of Zürich BSc Thesis Data Lineage and Meta Data Analysis in Data Warehouse Environments Martin Noack Matrikelnummer: 09222232 Email: martin.noack@uzh.ch January
More informationVector storage and access; algorithms in GIS. This is lecture 6
Vector storage and access; algorithms in GIS This is lecture 6 Vector data storage and access Vectors are built from points, line and areas. (x,y) Surface: (x,y,z) Vector data access Access to vector
More information2. Give the formula (with names) for the catabolic degradation of glucose by cellular respiration.
Chapter 9: Cellular Respiration: Harvesting Chemical Energy Name Period Overview: Before getting involved with the details of cellular respiration and photosynthesis, take a second to look at the big picture.
More informationGene Regulation  The Lac Operon
Gene Regulation  The Lac Operon Specific proteins are present in different tissues and some appear only at certain times during development. All cells of a higher organism have the full set of genes:
More informationChapter There are nonisomorphic rooted trees with four vertices. Ans: 4.
Use the following to answer questions 126: In the questions below fill in the blanks. Chapter 10 1. If T is a tree with 999 vertices, then T has edges. 998. 2. There are nonisomorphic trees with four
More informationLoad Balancing and Termination Detection
Chapter 7 Load Balancing and Termination Detection 1 Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination detection
More information1311. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 1311 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationTheorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path.
Chapter 3 Trees Section 3. Fundamental Properties of Trees Suppose your city is planning to construct a rapid rail system. They want to construct the most economical system possible that will meet the
More informationMaplike Wikipedia Visualization. Pang Cheong Iao. Master of Science in Software Engineering
Maplike Wikipedia Visualization by Pang Cheong Iao Master of Science in Software Engineering 2011 Faculty of Science and Technology University of Macau Maplike Wikipedia Visualization by Pang Cheong
More informationSubgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro
Subgraph Patterns: Network Motifs and Graphlets Pedro Ribeiro Analyzing Complex Networks We have been talking about extracting information from networks Some possible tasks: General Patterns Ex: scalefree,
More information2007 7.013 Problem Set 1 KEY
2007 7.013 Problem Set 1 KEY Due before 5 PM on FRIDAY, February 16, 2007. Turn answers in to the box outside of 68120. PLEASE WRITE YOUR ANSWERS ON THIS PRINTOUT. 1. Where in a eukaryotic cell do you
More informationBiochemistry Energy and Glycolysis
MIT Department of Biology 7.014 Introductory Biology, Spring 2005 Recitation Section 4 Answer Key February 1415, 2005 Biochemistry Energy and Glycolysis A. Why do we care In lecture we discussed the three
More informationWebBased Genomic Information Integration with Gene Ontology
WebBased Genomic Information Integration with Gene Ontology Kai Xu 1 IMAGEN group, National ICT Australia, Sydney, Australia, kai.xu@nicta.com.au Abstract. Despite the dramatic growth of online genomic
More informationAlgorithms in Computational Biology (236522) spring 2007 Lecture #1
Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:0012:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office
More informationVisualizing Networks: Cytoscape. Prat Thiru
Visualizing Networks: Cytoscape Prat Thiru Outline Introduction to Networks Network Basics Visualization Inferences Cytoscape Demo 2 Why (Biological) Networks? 3 Networks: An Integrative Approach Zvelebil,
More informationMetabolic Network Analysis
Metabolic Network nalysis Overview  modelling chemical reaction networks  Levels of modelling Lecture II: Modelling chemical reaction networks dr. Sander Hille shille@math.leidenuniv.nl http://www.math.leidenuniv.nl/~shille
More informationChapter 7 Active Reading Guide Cellular Respiration and Fermentation
Name: AP Biology Mr. Croft Chapter 7 Active Reading Guide Cellular Respiration and Fermentation Overview: Before getting involved with the details of cellular respiration and photosynthesis, take a second
More informationNetwork (Tree) Topology Inference Based on Prüfer Sequence
Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,
More informationInside the PostgreSQL Query Optimizer
Inside the PostgreSQL Query Optimizer Neil Conway neilc@samurai.com Fujitsu Australia Software Technology PostgreSQL Query Optimizer Internals p. 1 Outline Introduction to query optimization Outline of
More informationCELL ORGANELLES. Functions
CELL ORGANELLES Functions CELL WALL PLANT CELL ONLY The cell walls of plants provide strength and protection, keeping the cells from bursting or rupturing. They also protect against insects and parasites,
More informationAsking Hard Graph Questions. Paul Burkhardt. February 3, 2014
Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate  R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)
More informationFrom DNA to Protein. Proteins. Chapter 13. Prokaryotes and Eukaryotes. The Path From Genes to Proteins. All proteins consist of polypeptide chains
Proteins From DNA to Protein Chapter 13 All proteins consist of polypeptide chains A linear sequence of amino acids Each chain corresponds to the nucleotide base sequence of a gene The Path From Genes
More informationNetwork Analysis. BCH 5101: Analysis of Omics Data 1/34
Network Analysis BCH 5101: Analysis of Omics Data 1/34 Network Analysis Graphs as a representation of networks Examples of genomescale graphs Statistical properties of genomescale graphs The search
More informationRelational Databases
Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 18 Relational data model Domain domain: predefined set of atomic values: integers, strings,... every attribute
More informationChapter 18 Regulation of Gene Expression
Chapter 18 Regulation of Gene Expression 18.1. Gene Regulation Is Necessary By switching genes off when they are not needed, cells can prevent resources from being wasted. There should be natural selection
More informationTwincore  Zentrum für Experimentelle und Klinische Infektionsforschung Institut für Molekulare Bakteriologie
Twincore  Zentrum für Experimentelle und Klinische Infektionsforschung Institut für Molekulare Bakteriologie 0 HELMHOLTZ I ZENTRUM FÜR INFEKTIONSFORSCHUNG Technische Universität Braunschweig Institut
More informationQuery Processing C H A P T E R12. Practice Exercises
C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass
More informationEfficient Generation and Execution of DAGStructured Query Graphs
Efficient Generation and Execution of DAGStructured Query Graphs Inauguraldissertation zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften der Universität Mannheim vorgelegt von
More informationDistance Degree Sequences for Network Analysis
Universität Konstanz Computer & Information Science Algorithmics Group 15 Mar 2005 based on Palmer, Gibbons, and Faloutsos: ANF A Fast and Scalable Tool for Data Mining in Massive Graphs, SIGKDD 02. Motivation
More informationOptimization of SQL Queries in MainMemory Databases
Optimization of SQL Queries in MainMemory Databases Ladislav Vastag and Ján Genči Department of Computers and Informatics Technical University of Košice, Letná 9, 042 00 Košice, Slovakia lvastag@netkosice.sk
More informationDATA STRUCTURES USING C
DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give
More informationLoad Balancing and Termination Detection
Chapter 7 slides71 Load Balancing and Termination Detection slides72 Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination
More informationExtraction and Visualization of ProteinProtein Interactions from PubMed
Extraction and Visualization of ProteinProtein Interactions from PubMed Ulf Leser Knowledge Management in Bioinformatics HumboldtUniversität Berlin Finding Relevant Knowledge Find information about Much
More informationDynamics of Biological Systems
Dynamics of Biological Systems Part I  Biological background and mathematical modelling Paolo Milazzo (Università di Pisa) Dynamics of biological systems 1 / 53 Introduction The recent developments in
More informationQuerying ontologies in relational database systems
Querying ontologies in relational database systems Silke Trißl and Ulf Leser HumboldtUniversität zu Berlin, Institute of Computer Sciences, D10099 Berlin, Germany {trissl, leser}@informatik.huberlin.de
More informationBinary Trees and Huffman Encoding Binary Search Trees
Binary Trees and Huffman Encoding Binary Search Trees Computer Science E119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Motivation: Maintaining a Sorted Collection of Data A data dictionary
More informationBasic Characteristics of Cells. Cell Structure and Function. Each Cell Has Three Primary Regions. Basic Characteristics of Cells. The Plasma Membrane
Basic Characteristics of Cells Cell Structure and Function Chapter 3 Smallest living subdivision of the human body Diverse in structure and function Small Basic Characteristics of Cells Each Cell Has Three
More informationGeneral Network Analysis: Graphtheoretic. COMP572 Fall 2009
General Network Analysis: Graphtheoretic Techniques COMP572 Fall 2009 Networks (aka Graphs) A network is a set of vertices, or nodes, and edges that connect pairs of vertices Example: a network with 5
More informationAlison Stewart 11/12/06 Prokaryotic Cells, Eukaryotic cells and HIV: Structures, Transcription and Transport Section Handout Discussion Week #7
Alison Stewart 11/12/06 Prokaryotic Cells, Eukaryotic cells and HIV: Structures, Transcription and Transport Section Handout Discussion Week #7 Compare and contrast the organization of eukaryotic, prokaryotic
More informationWhy? A central concept in Computer Science. Algorithms are ubiquitous.
Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online
More informationLearning Objectives. Learning Objectives (cont.) Chapter 6: Metabolism  Energy & Enzymes 1. Lectures by Tariq Alalwan, Ph.D.
Biology, 10e Sylvia S. Mader Lectures by Tariq Alalwan, Ph.D. Learning Objectives Define energy, emphasizing how it is related to work and to heat State and apply two energy laws to energy transformations.
More informationMining SocialNetwork Graphs
342 Chapter 10 Mining SocialNetwork Graphs There is much information to be gained by analyzing the largescale data that is derived from social networks. The bestknown example of a social network is
More informationQuiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)?
Database Indexes How costly is this operation (naive solution)? course per weekday hour room TDA356 2 VR Monday 13:15 TDA356 2 VR Thursday 08:00 TDA356 4 HB1 Tuesday 08:00 TDA356 4 HB1 Friday 13:15 TIN090
More informationGraph Algorithms using MapReduce
Graph Algorithms using MapReduce Graphs are ubiquitous in modern society. Some examples: The hyperlink structure of the web 1/7 Graph Algorithms using MapReduce Graphs are ubiquitous in modern society.
More informationCustomer Intimacy Analytics
Customer Intimacy Analytics Leveraging Operational Data to Assess Customer Knowledge and Relationships and to Measure their Business Impact by Francois Habryn Scientific Publishing CUSTOMER INTIMACY ANALYTICS
More informationIndex Selection Techniques in Data Warehouse Systems
Index Selection Techniques in Data Warehouse Systems Aliaksei Holubeu as a part of a Seminar Databases and Data Warehouses. Implementation and usage. Konstanz, June 3, 2005 2 Contents 1 DATA WAREHOUSES
More informationfor High Performance Computing
Technische Universität München Institut für Informatik Lehrstuhl für Rechnertechnik und Rechnerorganisation Automatic Performance Engineering Workflows for High Performance Computing Ventsislav Petkov
More information