Network Analysis. BCH 5101: Analysis of -Omics Data 1/34



Similar documents
Graph Theory and Networks in Biology

Bioinformatics: Network Analysis

General Network Analysis: Graph-theoretic. COMP572 Fall 2009

Understanding the dynamics and function of cellular networks

Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations

Graph theoretic approach to analyze amino acid network

Protein Protein Interaction Networks

Healthcare Analytics. Aryya Gangopadhyay UMBC

Temporal Dynamics of Scale-Free Networks

Introduction to Networks and Business Intelligence

USE OF GRAPH THEORY AND NETWORKS IN BIOLOGY

Qualitative modeling of biological systems

Data Integration. Lectures 16 & 17. ECS289A, WQ03, Filkov

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Random graphs and complex networks

Integrating DNA Motif Discovery and Genome-Wide Expression Analysis. Erin M. Conlon

A Graph-Theoretic Analysis of the Human Protein-Interaction Network Using Multicore Parallel Algorithms

Structural constraints in complex networks

Visualizing Networks: Cytoscape. Prat Thiru

Systematic discovery of regulatory motifs in human promoters and 30 UTRs by comparison of several mammals

ProteinQuest user guide

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

Walk-Based Centrality and Communicability Measures for Network Analysis

NOVEL GENOME-SCALE CORRELATION BETWEEN DNA REPLICATION AND RNA TRANSCRIPTION DURING THE CELL CYCLE IN YEAST IS PREDICTED BY DATA-DRIVEN MODELS

A General Framework for Weighted Gene Co-expression Network Analysis

Feed Forward Loops in Biological Systems

SPANNING CACTI FOR STRUCTURALLY CONTROLLABLE NETWORKS NGO THI TU ANH NATIONAL UNIVERSITY OF SINGAPORE

Graphical Modeling for Genomic Data

InSyBio BioNets: Utmost efficiency in gene expression data and biological networks analysis

Bioinformatics: Network Analysis

Genetomic Promototypes

Research Article A Comparison of Online Social Networks and Real-Life Social Networks: A Study of Sina Microblogging

Network Analysis and System Biology with Omics Data. Zhenqiu Liu Samuel Oschin Comprehensive Cancer Institute Cedars Sinai Medical Center

Using graph theory to analyze biological networks

1. Introduction Gene regulation Genomics and genome analyses Hidden markov model (HMM)

Statistical Applications in Genetics and Molecular Biology

Unraveling protein networks with Power Graph Analysis

Graph models for the Web and the Internet. Elias Koutsoupias University of Athens and UCLA. Crete, July 2003

Dmitri Krioukov CAIDA/UCSD

How To Cluster Of Complex Systems

Some Examples of Network Measurements

Statistical Analysis of Network Data

The Topology of Large-Scale Engineering Problem-Solving Networks

A discussion of Statistical Mechanics of Complex Networks P. Part I

On Network Tools for Network Motif Finding: A Survey Study

Complex Networks Analysis: Clustering Methods

IC05 Introduction on Networks &Visualization Nov

Graph Mining Techniques for Social Media Analysis

Interaktionen von RNAs und Proteinen

How To Understand The Network Of A Network

The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth

Social Media Mining. Network Measures

DATA ANALYSIS IN PUBLIC SOCIAL NETWORKS

MINFS544: Business Network Data Analytics and Applications

Guide for Data Visualization and Analysis using ACSN

transcription networks

USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS

Exercise with Gene Ontology - Cytoscape - BiNGO

Quantitative proteomics background

Emergence of Complexity in Financial Networks

Applying Statistics Recommended by Regulatory Documents

NETWORK SCIENCE DEGREE CORRELATION

Distance Degree Sequences for Network Analysis

A box-covering algorithm for fractal scaling in scale-free networks

Comparing Methods for Identifying Transcription Factor Target Genes

Effects of node buffer and capacity on network traffic

BIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology

Network VisualizationS

Network/Graph Theory. What is a Network? What is network theory? Graph-based representations. Friendship Network. What makes a problem graph-like?

Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)

Statistical Inference for Networks Graduate Lectures. Hilary Term 2009 Prof. Gesine Reinert

MORPHEUS. Prediction of Transcription Factors Binding Sites based on Position Weight Matrix.

arxiv:cs.dm/ v1 30 Mar 2002

Transcription:

Network Analysis BCH 5101: Analysis of -Omics Data 1/34

Network Analysis Graphs as a representation of networks Examples of genome-scale graphs Statistical properties of genome-scale graphs The search for meso-scale subgraphs/modules Network motifs 2/34

Networks Mathematically, a network is often represented as a graph, having: A set of vertices, or nodes typically corresponding to genes, mrnas, proteins, complexes, etc. A set of edges representing interactions, regulation, co-expression, co-location, etc. Edges may be directed (A B) orundirected (A B) 3/34

Common edge types and meanings Prot1 Prot2 Protein-protein interaction, or co-expression, or co-localization Gene1 Prot1 Gene 1 codes for protein 1 Prot1 Gene1 Protein 1 regulates gene 1, or protein 1 activates gene 1 Gene1 Prot1 Protein 1 represses gene 1 4/34

Networks (again) Mathematically, a network is often represented as a graph, having: A set of vertices, or nodes typically corresponding to genes, mrnas, proteins, complexes, etc. A set of edges representing interactions, regulation, co-expression, co-location, etc. Edges may be directed (A B) orundirected (A B) Sometimes, we want to extend this traditional notion of graphs: Edges may have types (e.g., activation or repression) Edges or vertices may be weighted (assigned a real number, indicating importance, evidence, relevance, etc.) Edges may point to other edges 5/34

Example: Uetz et al. yeast PPI network Uetz et al. (Nature,2000) performed two variants of yeast two-hybrid screening to test for interactions between proteins in S. cerevisiae They found 957 interactions (far fewer than there really are!) involving 1004 proteins 6/34

Yeast transcriptome Lee et al. (Nature, 2002) used ChIP-chip to determine transcription factor gene promoter binding relationships 106 of 141 known transcription factors were successfully tested, resulting in 4000 relationships at a stringent p-value threshold (Picture from Beyer lab website.) 7/34

Human co-expression network Prieto et al. (PLoS One, 2008) used microarrays on a set of human tissue samples to determine co-expression. They found 15841 high-confidence relationships between 3327 genes. 8/34

Typical structure of large-scale newtorks Such networks are often found to be approximately scale-free : A few vertices ( hubs ) have many edges Most vertices ( leaves ) have just a few Formally, the number vertices with k edges is proportional to 1/k α for some real number α>1 (See previous graphs, especially Uetz et al.) 9/34

Yeast co-expression network scale-free van Noort et al. (EMBO Rep, 2004) analyzed co-expression in the data of Hughes et al. (Cell, 2000) 4077 genes (nodes) are connected by 65,430 undirected edges using a correlation threshold of 0.6 At that level of correlation, and at other levels, the degree distribution is approximately scale-free 10 / 34

Scale-free topology of Yeast transcriptional network (Lee et al., Science2002) In-degree and out-degree of resulting graph shows hubs and leaves. (Ignore the open red circles; they re a null model.) 11 / 34

Barabasi Modern excitement over scale-free networks began with work of Barabasi (Nature, 1999; Science, 1999) Many other natural and artificial networks show scale-free degree distributions, e.g., internet, www, friendship, roads, metabolism, etc. Barabasi proposed preferential attachment, or rich grow richer model of network growth to explain scale-free nature new nodes more likely to make connections to existing high degree nodes 12 / 34

Preferential attachment The Barabasi-Albert model works as follows: We begin with a core connected network that is small We repeatedly add a new vertex to the graph, along with m edges The probability that an edge is added between the new vertex and existing vertex i (P i ) is proportional to the degree of i (d i ) P i = d i j d j Asymptotically, this produces graphs with a power-law degree distribution; the number of vertices with degree d is N(d) d 3.(Independentofm.) Other variants can produce different exponents, different average degrees, etc. 13 / 34

Are high-degree nodes in genome networks important? There have been attempts to relate graph structure / high-degree nodes to biological properties: Robustness: does the network survival deletion or mutation of a gene? Conservation: are higher-degree nodes more constrained? Expression: are higher-degree nodes expression more often? Results are mixed... High-degree nodes may also be promiscuous, sticky, nonspecific or experimental artifacts.... still, high-degree nodes are a natural place to start an investigation. 14 / 34

Explaining scaling in biological networks Could preferential attachment explain powerlaw scaling in biological networks? 15 / 34

Explaining scaling in biological networks Another answer: evolutionary drift Assume genes duplicated at random (singly, or in blocks) carrying their regulatory or interacting links with them Assume genes deleted at random Assume interactions have random chance of creation or deletion between existing genes Then one can show a power-law degree distribution will result (See, e.g., Wagner (Proc. R. Soc. Lond. B, 2003) for protein interaction networks.) Note: no explicit evolutionary selection for individual genes or for network structure as a whole! 16 / 34