mmnet: Metagenomic analysis of microbiome metabolic network
Yang Cao, Fei Li, Xiaochen Bo
May 15, 2016

Contents
1 Introduction
1.1 Installation
2 Analysis Pipeline: from raw Metagenomic Sequence Data to Metabolic Network Analysis
2.1 Prepare metagenomic sequence data
2.2 Annotation of Metagenomic Sequence Reads
2.3 Estimating the abundance of enzymatic genes
2.4 Building reference metabolic dataset
2.5 Constructing State Specific Network
2.6 Topological network analysis
2.7 Differential network analysis
2.8 Network Visualization
3 Analysis in Cytoscape

1 Introduction
This manual briefly introduces the structure, functions and usage of the mmnet package. The package provides a set of functions to support systems analysis of metagenomic data in R, including annotation of raw metagenomic sequence data, construction of metabolic networks, and differential network analysis based on state specific metabolic networks and enzymatic gene abundance.
The package also supports an analysis pipeline for metagenomic systems biology. We can start from raw metagenomic sequence data, optionally use MG-RAST to rapidly retrieve metagenomic annotation, fetch pathway data from KEGG through its API to construct the metabolic network, and then apply the metagenomic systems biology computational framework described in [Greenblum et al., 2012] for differential network analysis.
The main features of mmnet:
- Annotation of metagenomic sequence reads with MG-RAST
- Estimating abundances of enzymatic genes based on functional annotation
- Constructing State Specific metabolic Network
- Topological network analysis
- Differential network analysis

1.1 Installation
mmnet requires these packages: KEGGREST, igraph, Biobase, XML, RCurl, RJSONIO, stringr, ggplot2 and biom. They should be installed automatically when you install mmnet with biocLite() as follows:

    ## install release version of mmnet
    source("
    biocLite("mmnet")
    ## install the latest development version
    useDevel()
    biocLite("mmnet")

Note that calling useDevel() switches to the development version of Bioconductor, so the latest development version of mmnet is installed. It has more functions available and many bugs fixed.
After installation, mmnet can be loaded into the current workspace by typing or pasting the following code:

    library(mmnet)

2 Analysis Pipeline: from raw Metagenomic Sequence Data to Metabolic Network Analysis
We will go through an analysis pipeline to illustrate some of the main functions in mmnet. The pipeline consists of several steps:
1. Raw metagenomic sequence data: prepare sequence data from metagenomic studies that is ready to analyze. You can obtain the data by shotgun sequencing or download raw data available online. Here we download sample data from the MG-RAST ftp server.
2. Functional annotation: raw reads must be processed for functional annotation, which maps them to enzyme libraries built on existing knowledge, such as the KEGG database.
3. State Specific Network: starting from an enzyme set, we map the enzymes onto metabolic pathways to construct a relatively small metabolic network, with abundances, that is involved in a specific biological state. This is named the State Specific Network (SSN).
4. Topological and differential network analysis: topological network analysis performs a series of correlation analyses between enzyme abundances and topological properties, e.g. betweenness or PageRank. Differential network analysis compares the enzyme abundances and structure of different state specific networks.
2.1 Prepare metagenomic sequence data
Acceptable sequence data can be in FASTA, FASTQ or SFF format. The format is recognized by the file name extension; the valid extensions are .fasta, .fna, .fastq, .fq, and .sff. FASTA and FASTQ files need to be in plain text ASCII. An easy way to get some sequence data is to download a public dataset from MG-RAST. Here we download two FASTA data sets from the MG-RAST ftp server, which correspond to two metagenomic samples with different physiological states (MG-RAST ID: and ). These data sets consist of several microbiomes from twin pairs and their mothers [Turnbaugh et al., 2008].

    download.file("ftp://ftp.metagenomics.anl.gov/projects/10/ /raw/507.fna.gz",
        destfile = "obesesample.fna.gz")
    download.file("ftp://ftp.metagenomics.anl.gov/projects/10/ /raw/687.fna.gz",
        destfile = "leansample.fna.gz")

2.2 Annotation of Metagenomic Sequence Reads
Functional annotation of the microbiome is a crucial step for subsequent analysis. Many tools can be used to identify the functions of sequences. However, local analysis tools and annotation databases need complicated installation and configuration, and they assume that users have a deep understanding of high performance computing and next generation sequencing. This requires sufficient preparation time, technical skills and computing resources.
To facilitate the analysis of microbiome metabolic networks, mmnet provides functions for sequence annotation that link R with MG-RAST. MG-RAST [Glass et al., 2010] is a stable, extensible, online platform for annotation and statistical analysis of metagenomes based on sequence data. Moreover, MG-RAST is freely available to all researchers. For more details, see
By integrating the web service provided by MG-RAST into the R environment, mmnet provides several functions for assigning functional annotation to reads: loginMgrast, uploadMgrast, submitMgrastJob, listMgrastProject and checkMgrastMetagenome.
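Before uploading files, it can help to check that their names carry one of the extensions listed in Section 2.1. The following is a minimal sketch in base R, not part of mmnet: the helper name guess_seq_format is hypothetical, and ignoring a trailing .gz suffix is an assumption made here because the sample files are gzip-compressed.

```r
# Hypothetical helper (not an mmnet function): guess the sequence format
# from the file name extension, ignoring an optional .gz suffix (assumption).
guess_seq_format <- function(path) {
  name <- tolower(basename(path))
  name <- sub("\\.gz$", "", name)   # drop the compression suffix, if any
  ext <- sub("^.*\\.", "", name)    # text after the last dot
  formats <- c(fasta = "FASTA", fna = "FASTA",
               fastq = "FASTQ", fq = "FASTQ", sff = "SFF")
  fmt <- unname(formats[ext])       # NA for extensions not in the table
  fmt[is.na(fmt)] <- "unknown"
  fmt
}

seq.file <- c("obesesample.fna.gz", "leansample.fna.gz")
print(data.frame(file = seq.file, format = guess_seq_format(seq.file)))
```

Files reported as "unknown" would need to be renamed or converted before they are submitted.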
Before using the MG-RAST services, we need to register and log in to an MG-RAST account. Registration has to be done on the website. Login is handled by the function loginMgrast. After this function is called successfully with a user name and password, a session variable login.info is returned for further access to the MG-RAST service. Note that the user name mmnet with password mmnet has already been registered for testing purposes. See ?loginMgrast for more details.

    ## login on MG-RAST
    login.info <- loginMgrast(user = "mmnet", userpwd = "mmnet")

After login, the next step is to upload the raw sequence data (here obesesample.fna.gz and leansample.fna.gz) to your MG-RAST inbox. The function uploadMgrast is designed for this task.

    ## select the sample data to upload
    seq.file <- c("obesesample.fna.gz", "leansample.fna.gz")
    ## upload sequence data to MG-RAST
    metagenome.id <- lapply(seq.file, uploadMgrast, login.info = login.info)
Note: According to the MG-RAST user manual, uploaded files may be removed from your inbox after 72 hours, so please submit your files within that time frame.
Once the sequence data is uploaded to your MG-RAST inbox, you can submit one MG-RAST job for each sequence file in your inbox with the function submitMgrastJob.

    ## submit MGRAST job
    metagenome.id <- lapply(seq.file, submitMgrastJob, login.info = login.info)
    show(metagenome.id)

Note that our sample metagenomic sequences downloaded from MG-RAST have already been annotated, so the call returns the corresponding metagenome IDs that already exist in MG-RAST without repeating the annotation. In other cases, the annotation process may take several hours or days. We therefore provide the function checkMgrastMetagenome to check whether your metagenome annotation is completed or still in progress.

    ## check MGRAST project status
    metagenome.status <- lapply(metagenome.id, checkMgrastMetagenome,
        login.info = login.info)
    ## the status of a metagenome with completed annotation is TRUE
    show(metagenome.status)

Once the annotation process completes, the metagenome ID can be passed to the function getMgrastAnnotation to load the functional annotation profile of each sample for subsequent metabolic analysis. For private data, you must log in to MG-RAST to get login.info and call getMgrastAnnotation with login.info. For more details see ?getMgrastAnnotation.

    ## private data
    private.annotation <- lapply(metagenome.id, getMgrastAnnotation,
        login.info = login.info)
    ## public annotation data, does not require login.info
    public.annotation <- lapply(metagenome.id, getMgrastAnnotation)

For convenience, we have saved these annotation profiles in mmnet as anno.rda. See ?anno for more details. The annotation profiles are loaded as follows:

    data(anno)
    summary(anno)
    ## Length Class Mode
    ## data.frame list
    ## data.frame list

Here we show the entire process, which integrates all the functions above, from uploading the sequences to retrieving the annotation profile after the MG-RAST annotation has completed.

    ## first login on MG-RAST
    login.info <- loginMgrast("mmnet", "mmnet")
    ## prepare the metagenomic sequence for annotation
    seq <- "obesesample.fna.gz"
    ## mgrast annotation
    uploadMgrast(login.info, seq)
    metagenome.id2 <- submitMgrastJob(login.info, seqfile = basename(seq))
    ## poll until the annotation has completed
    while (TRUE) {
        status <- checkMgrastMetagenome(metagenome.id = metagenome.id2)
        if (status)
            break
        Sys.sleep(5)
        cat("in annotation, please wait...")
        flush.console()
    }
    ## if the annotation profile is public, take login.info as NULL
    anno2 <- getMgrastAnnotation(metagenome.id2, login.info = login.info)

2.3 Estimating the abundance of enzymatic genes
Enzyme abundance varies between biological states. Enzymatic gene abundances are important for microbial marker-gene and disease studies, especially for linking microbial genes with the host state. Here, we estimate enzymatic gene abundances based on read counts with a normalization procedure.
To estimate the abundance of enzymatic genes, the function estimateAbundance should be called for each sample. It returns the result in Biological Observation Matrix (BIOM) format [McDonald et al., 2012], which is widely used in metagenomic analysis. The BIOM file format (canonically pronounced biome) is designed to be a general-use format for representing biological sample by observation contingency tables. More details on BIOM files can be found in

    mmnet.abund <- estimateAbundance(anno)
    show(mmnet.abund)
    ## biom object.
    ## type: enzymatic genes abundance
    ## matrix_type: dense
    ## 1034 rows and 2 columns

BIOM abundance files are also supported by other tools. With MG-RAST, for example, users can download the enzymatic gene abundance profile through the MG-RAST API. Moreover, when samples have been annotated with other tools (e.g. BLAST against KEGG locally), users can create their own BIOM files representing enzymatic gene abundance and import them for metagenomic systems biology analysis.
    ## download BIOM functional abundance profile of the
    ## two sample metagenomes from MG-RAST
    if (require(RCurl)) {
        function.api <- "
        mgrast.abund <- read_biom(getForm(function.api,
            .params = list(id = "mgm ", id = "mgm ", source = "KO",
                result_type = "abundance")))
    }
    ## obtain the KOs present in both the MG-RAST and the
    ## estimateAbundance results
    intersect.ko <- intersect(rownames(mgrast.abund), rownames(mmnet.abund))
    ## compare the two by taking one metagenome
    mgrast.abund1 <- biom_data(mgrast.abund)[, 1][intersect.ko]
    mmnet.abund1 <- biom_data(mmnet.abund)[, 1][intersect.ko]
    if (require(ggplot2)) {
        p <- qplot(mgrast.abund1, mmnet.abund1) +
            geom_abline(slope = 1, intercept = 0, color = "blue") +
            ylim(0, 400) + xlim(0, 400)
        print(p)
    }

Figure 1: enzymatic genes abundance comparison between MG-RAST and mmnet package (x axis: mgrast.abund1, y axis: mmnet.abund1)

The enzymatic gene abundances estimated by MG-RAST are uncalibrated and unoptimized compared to the method in mmnet. As shown in Figure 1, abundances in MG-RAST are over-estimated in most cases, especially for high-abundance enzymatic genes. The abundance calibration in mmnet works as follows:
- Sequences were annotated with all KOs (KEGG orthologous groups) of the top KO-associated match among the top 100 matches.
- Sequences with multiple top KO-associated matches with the same e-value were annotated with the union set of KOs.
- To strike a balance between identifying low-abundance genes and reducing the error rate of identification, a threshold of 2 reads was used to allow the inclusion of rare genes, and all KO abundances below this threshold were set to zero.
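The read threshold in the last calibration step can be sketched in base R as follows. This is an illustration rather than the mmnet implementation: the helper name filter_rare_kos and the KO counts are made up, and dividing by the total count is only one possible normalization, assumed here because the text does not spell out the procedure.

```r
# Hypothetical helper, not the mmnet implementation: set KOs supported by
# fewer than `min.reads` reads to zero, then scale to relative abundance
# (the scaling step is an assumption).
filter_rare_kos <- function(counts, min.reads = 2) {
  counts[counts < min.reads] <- 0
  total <- sum(counts)
  if (total == 0) return(counts)  # nothing left to normalize
  counts / total
}

# Made-up read counts per KO for one sample
ko.counts <- c(K00001 = 12, K00002 = 1, K00003 = 7, K00004 = 0, K00005 = 1)
ko.abund <- filter_rare_kos(ko.counts)
print(round(ko.abund, 3))
```

KOs with a single read drop out entirely, so they do not contribute to the network built in the later steps.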
For other tools, users can download the annotated data first. Taking IMG/M as an example, metagenomic annotation data from IMG/M already includes the abundance information, so we can construct the metabolic network directly without abundance estimation. This is done as follows:
1) In mmnet, the function that constructs the metabolic network takes a BIOM file as input. We first create the BIOM file for network construction:

    ## Load the IMG/M sample data
    IMGFile <- system.file("extdata/imgsample.tab", package = "mmnet")
    IMGSample <- read.delim2(IMGFile, quote = "")
    ## Create BIOM file for network construction
    abundance <- IMGSample$Gene.Count
    abundance <- abundance/sum(abundance)
    abundance <- as.data.frame(abundance)
    KO <- IMGSample$KO.ID
    KO <- as.data.frame(gsub("ko:", "", KO))
    biom.data <- make_biom(abundance, observation_metadata = KO)
    biom.data$type <- "enzymatic genes abundance"

2) Then we construct and analyze the SSN:

    ## Construct and analyze SSN
    ssn <- constructSSN(biom.data)
    topologicalAnalyzeNet(ssn)

Furthermore, users may have annotated the metagenomic reads locally. In that case the annotated profile should be preprocessed into the same format as the MG-RAST annotation profile, which has 13 columns:
1. Read id
2. Hit id, ref sequence id
3. Percentage identity
4. Alignment length
5. Number of mismatches
6. Number of gap openings
7. Query start
8. Query end
9. Hit start
10. Hit end
11. e-value, e.g. 1.7e
12. Score in bits
13. KEGG orthology
The annotation profile can then be analyzed in the same way as a profile from MG-RAST. We have also added this description to the vignette of mmnet to help users analyze annotation profiles from other tools, and we continue to improve the package's support for other tools.
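A locally produced profile can be checked against this 13-column layout before it is passed on. The sketch below uses base R only. The column names are illustrative assumptions (only the column order follows the list above), and the two records are made up to stand in for a real result file.

```r
# Column names are assumptions for illustration; only the order follows
# the 13-column description in the text.
profile.cols <- c("read.id", "hit.id", "identity", "align.length",
                  "mismatches", "gap.openings", "query.start", "query.end",
                  "hit.start", "hit.end", "e.value", "bit.score", "ko")

# Two made-up tab-separated records standing in for a local result file
raw <- c("read1\tref1\t98.5\t100\t1\t0\t1\t100\t5\t104\t1e-50\t180\tK00001",
         "read2\tref2\t91.0\t80\t7\t0\t3\t82\t10\t89\t1e-20\t95\tK00002")

profile <- read.delim(text = raw, header = FALSE, col.names = profile.cols,
                      stringsAsFactors = FALSE)

# Basic sanity checks before further analysis
stopifnot(ncol(profile) == 13, is.numeric(profile$e.value))
str(profile)
```

For a real file, the `text = raw` argument would be replaced by the path to the tab-separated output of the local annotation tool.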
2.4 Building reference metabolic dataset
This package uses the KEGG database to annotate enzymatic genes with metabolic reactions. KO is the basic representation for all proteins and functional RNAs corresponding to KEGG pathway nodes, BRITE hierarchy nodes, and KEGG module nodes, and it is used here to annotate the microbiome. The KEGG metabolic pathway is the best organized part of the KEGG PATHWAY database, and it is also a network of KO-KO relations [Kanehisa and Goto, 2000]. It is composed of KOs together with the substrates and products of each KO, and it has been applied widely to create genome-scale metabolic networks of various microbial species [Oberhardt et al., 2009].
The reference metabolic data, about 150 KEGG reference pathways, was obtained with the free KEGG REST API. The KEGG API is provided for academic use by academic users belonging to academic institutions, and the service should not be used for bulk data downloads. Our small download through the KEGG API is therefore in agreement with the license.
An initial reference dataset named RefDbcache.rda is saved in the data subdirectory of mmnet. It contains the KOs annotated with a metabolic reaction. A KO-based reference metabolic network is also saved in RefDbcache.rda, where nodes represent enzymes (KOs), and a directed edge from enzyme A to enzyme B indicates that a product metabolite of a reaction catalyzed by enzyme A is a substrate metabolite of a reaction catalyzed by enzyme B. As the KEGG database is constantly updated, the reference data can be updated manually with the function updateMetabolicNetwork and saved in a directory specified by the user. The reference dataset can be loaded with the function loadMetabolicData.

    loadMetabolicData()
    summary(RefDbcache)
    ## Length Class Mode
    ## ko -none- character
    ## substrate -none- list
    ## product -none- list
    ## user 1 -none- character
    ## date 1 -none- character
    ## version 14 -none- list
    ## network 9 igraph list

2.5 Constructing State Specific Network
Different biological states (e.g. obese or lean) can be identified based on different enzymatic gene sets and abundances. In the network view, different biological states are associated with different metabolic sub-networks with different abundances, named State Specific Networks (SSN).
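The edge rule of the reference network (a product of enzyme A is a substrate of enzyme B) and the idea of a state specific sub-network can be sketched in base R with toy data. This is not how mmnet builds its networks internally: the KO and compound identifiers are made up, build_edges is a hypothetical helper, skipping self-loops is a simplification, and keeping only edges between observed enzymes is an assumed reading of how the sub-network is obtained.

```r
# Toy reference data; KO and compound identifiers are made up.
substrate <- list(KA = "C1", KB = "C2", KC = c("C2", "C3"))
product   <- list(KA = "C2", KB = "C4", KC = "C5")

# Hypothetical helper: directed edge A -> B when a product of A is a
# substrate of B. Self-loops are skipped here as a simplification.
build_edges <- function(substrate, product) {
  kos <- names(product)
  do.call(rbind, lapply(kos, function(a) {
    is.target <- vapply(kos, function(b) {
      b != a && length(intersect(product[[a]], substrate[[b]])) > 0
    }, logical(1))
    hits <- kos[is.target]
    if (length(hits) == 0) return(NULL)
    data.frame(from = a, to = hits, stringsAsFactors = FALSE)
  }))
}

ref.edges <- build_edges(substrate, product)
print(ref.edges)

# Assumed sub-network rule: keep edges whose two ends were both observed
# in the sample.
observed <- c("KA", "KB")
ssn.edges <- ref.edges[ref.edges$from %in% observed &
                       ref.edges$to %in% observed, ]
print(ssn.edges)
```

In this toy case KA produces C2, which both KB and KC consume, so the reference network has the edges KA to KB and KA to KC, and only the first survives in the sub-network.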
We provide the function constructSSN, which uses the reference network to obtain the state specific metabolic network for each microbiome. It takes the output of the function estimateAbundance as input and returns the SSNs as output.

    ssn <- constructSSN(mmnet.abund)
    g <- ssn[[1]]
    summary(g)
    ## IGRAPH DN SSN
    ## + attr: name (g/c), name (v/c), abundance (v/n)
    abund <- get.vertex.attribute(g, "abundance", index = V(g))
    summary(abund)
    ## Min. 1st Qu. Median Mean 3rd Qu. Max.
    ##
Figure 2: Topological metabolic network analysis, linking topological features and enzymatic gene abundances (panels: betweennessCentrality, degree, clusteringCoefficient, pageRank; axes: value against abundance)

2.6 Topological network analysis
To examine whether enzymes that are associated with a specific host state exhibit particular topological features in the SSN, we provide the function topologicalAnalyzeNet to compute and illustrate the correlations between the topological properties of enzymatic genes and their abundances. It links differential abundance with topological features (Figure 2). Common topological features are supported. See ?topologicalAnalyzeNet for details.

    topo.net <- topologicalAnalyzeNet(g)
    ## network with topological features as attributes
    topo.net
    ## IGRAPH DN SSN
    ## + attr: name (g/c), name (v/c), abundance (v/n),
    ## betweennessCentrality (v/x), degree (v/x),
    ## clusteringCoefficient (v/x), pageRank (v/x)
    ## + edges (vertex names):
    ## [1] K01623->K00128 K01623->K00134 K01623->K01803 K01623->K00850
    ## [5] K01623->K01625 K01623->K01619 K01623->K00615 K01623->K00616
    ## [9] K01623->K01629 K01623->K00895 K01623->K00057 K01623->K01734
    ## [13] K01623->K01632 K01623->K08681 K01623->K06215 K01623->K03517
    ## [17] K01623->K01624 K01623->K04041 K01623->K03856 K00128->K01623
    ## [21] K00128->K00382 K00128->K01895 K00128->K00174 K00128->K00658
    ## +... omitted several edges
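The idea behind linking a topological feature to abundance can be illustrated on a toy network in base R. The sketch below computes the total degree of each node from an edge list and correlates it with abundance. The network and abundance values are made up, and the use of Spearman rank correlation is an assumption for illustration; the text does not state which correlation measure topologicalAnalyzeNet uses.

```r
# Toy directed network and abundances; all values are made up.
edges <- data.frame(from = c("KA", "KA", "KB", "KC", "KD"),
                    to   = c("KB", "KC", "KC", "KD", "KA"),
                    stringsAsFactors = FALSE)
abundance <- c(KA = 0.40, KB = 0.10, KC = 0.30, KD = 0.20)

# Total degree: number of edges touching each node (incoming plus outgoing)
nodes <- names(abundance)
degree <- vapply(nodes, function(k) {
  sum(edges$from == k) + sum(edges$to == k)
}, numeric(1))

# Rank correlation between the topological feature and abundance
# (Spearman is an assumption, chosen because it tolerates skewed values)
rho <- cor(degree, abundance, method = "spearman")
print(degree)
print(rho)
```

The same pattern applies to other node-level features such as betweenness or PageRank once they have been computed for every node.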
Figure 3: Differential metabolic network analysis, enzymatic genes that are associated with a specific state appear as colored nodes (red = enriched, green = depleted)

2.7 Differential network analysis
To compare the abundance of enzymatic genes across samples, we use three strategies to identify differential abundance (enriched or depleted): (1) odds ratio, (2) difference rank and (3) Jensen-Shannon Divergence (JSD). The corresponding function differentialAnalyzeNet outputs the comparative network with the nodes of differential abundance as its result (Figure 3).

    state <- c("obese", "lean")
    differential.net <- differentialAnalyzeNet(ssn, sample.state = state,
        method = "OR", cutoff = c(0.5, 2))
    summary(differential.net)
    ## IGRAPH DN refnet
    ## + attr: name (g/c), name (v/c), p.value (v/n), OR (v/n)

2.8 Network Visualization
Both topologicalAnalyzeNet and differentialAnalyzeNet use the function showMetagenomicNet to plot the metabolic network. showMetagenomicNet can also be used for a customized network display by specifying appropriate parameters (Figure 4).

    ## the reference network
    ## showMetagenomicNet(RefDbcache$network,mode='ref')
    ## the state specific metabolic network
    showMetagenomicNet(g, mode = "ssn", vertex.label = NA, edge.width = 0.3,
        edge.arrow.size = 0.1, edge.arrow.width = 0.1,
        layout = layout.fruchterman.reingold)

Figure 4: Visualization of State Specific Network using showMetagenomicNet with node size proportional to log(abundance)

3 Analysis in Cytoscape
Here is a simple example that shows the reference metabolic network in Cytoscape with the RCytoscape package (Figure 5). Besides the R package RCytoscape, Cytoscape 2.8 and the Cytoscape plugin CytoscapeRPC are required. See the RCytoscape package for details on how to transfer the network and attributes from R to Cytoscape.
Open Cytoscape, and then activate the CytoscapeRPC plugin in Cytoscape's Plugins menu. Click to start the XMLRPC server in the dialog, and Cytoscape will wait for commands from R. In the default setting, the communication port is 9000 on localhost. You can choose a different port; just be sure to use that port number when you call the RCytoscape constructor. Then type the following commands in R:

    if (require(RCytoscape)) {
        refnet <- ssn[[1]]
        net <- igraph.to.graphNEL(refnet)
        ## initialize the edge attribute
        ## edge.attr=list.edge.attributes(refnet)
        ## edge.attr.class = sapply(edge.attr, class)
        ## edge.attr.class[edge.attr.class=='character']='char'
        ## init node attributes
        node.attr = list.vertex.attributes(refnet)
        if (length(node.attr)) {
            node.attr.class = sapply(node.attr, class)
            node.attr.class[node.attr.class == "character"] = "char"
            for (i in 1:length(node.attr)) net <- initNodeAttribute(net,
                attribute.name = node.attr[i],
                attribute.type = node.attr.class[i], default.value = "0")
        }
        ## our metagenomic network does not have edge
        ## attributes, set them all to 1
        net <- initEdgeAttribute(net, attribute.name = "weight",
            attribute.type = "numeric", default.value = "1")
        ## create a network window in Cytoscape
        cw <- new.CytoscapeWindow("net", graph = net, overwriteWindow = TRUE)
        ## transmits the CytoscapeWindowClass's graph data,
        ## from R to Cytoscape, nodes, edges, node and edge
        ## attributes
        displayGraph(cw)
    }

Figure 5: Export network to Cytoscape with RCytoscape package

Session Information
The version numbers of R and of the packages loaded for generating the vignette were:

    ## R version ( )
    ## Platform: x86_64-pc-linux-gnu (64-bit)
    ## Running under: Ubuntu LTS
    ##
    ## locale:
    ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
    ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
    ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
    ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
    ## [9] LC_ADDRESS=C LC_TELEPHONE=C
    ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
    ##
    ## attached base packages:
    ## [1] stats graphics grDevices utils datasets methods
    ## [7] base
    ##
    ## other attached packages:
    ## [1] ggplot2_2.1.0 RCurl_ bitops_1.0-6 mmnet_
    ## [5] biom_ igraph_1.0.1 knitr_1.13
    ##
    ## loaded via a namespace (and not attached):
    ## [1] Rcpp_ formatR_1.4 plyr_1.8.3
    ## [4] highr_0.6 XVector_ tools_3.3.0
    ## [7] zlibbioc_ digest_0.6.9 evaluate_0.9
    ## [10] gtable_0.2.0 lattice_ png_0.1-7
    ## [13] Matrix_1.2-6 parallel_3.3.0 httr_
    ## [16] stringr_1.0.0 Biostrings_ S4Vectors_
    ## [19] IRanges_2.7.0 stats4_3.3.0 grid_3.3.0
    ## [22] nnet_ Biobase_ R6_2.1.2
    ## [25] flexmix_ XML_ RJSONIO_1.3-0
    ## [28] reshape2_1.4.1 magrittr_1.5 scales_0.4.0
    ## [31] codetools_ modeltools_ BiocGenerics_
    ## [34] KEGGREST_ colorspace_1.2-6 labeling_0.3
    ## [37] stringi_1.0-1 munsell_0.4.3

Cleanup
This is a cleanup step for the vignette on Windows; it is typically not needed by users.

    ## close any socket connections that are still open
    allCon <- showConnections()
    socketCon <- as.integer(rownames(allCon)[allCon[, "class"] == "sockconn"])
    sapply(socketCon, function(ii) close.connection(getConnection(ii)))
    ## list()

References
Elizabeth M Glass, Jared Wilkening, Andreas Wilke, Dionysios Antonopoulos, and Folker Meyer. Using the metagenomics RAST server (MG-RAST) for analyzing shotgun metagenomes. Cold Spring Harbor Protocols, 2010(1):pdb prot5368, 2010.
Sharon Greenblum, Peter J Turnbaugh, and Elhanan Borenstein. Metagenomic systems biology of the human gut microbiome reveals topological shifts associated with obesity and inflammatory bowel disease. Proceedings of the National Academy of Sciences, 109(2): , 2012.
Minoru Kanehisa and Susumu Goto. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Research, 28(1):27-30, 2000.
Daniel McDonald, Jose C Clemente, Justin Kuczynski, Jai Ram Rideout, Jesse Stombaugh, Doug Wendel, Andreas Wilke, Susan Huse, John Hufnagle, Folker Meyer, et al. The biological observation matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. GigaScience, 1(1):7, 2012.
Matthew A Oberhardt, Bernhard Ø Palsson, and Jason A Papin. Applications of genome-scale metabolic reconstructions. Molecular Systems Biology, 5(1), 2009.
Peter J Turnbaugh, Micah Hamady, Tanya Yatsunenko, Brandi L Cantarel, Alexis Duncan, Ruth E Ley, Mitchell L Sogin, William J Jones, Bruce A Roe, Jason P Affourtit, et al. A core gut microbiome in obese and lean twins. Nature, 457(7228): , 2008.
Minería de Datos ANALISIS DE UN SET DE DATOS.! Visualization Techniques! Combined Graph! Charts and Pies! Search for specific functions
Minería de Datos ANALISIS DE UN SET DE DATOS! Visualization Techniques! Combined Graph! Charts and Pies! Search for specific functions Data Mining on the DAG ü When working with large datasets, annotation
Guide for Data Visualization and Analysis using ACSN
Guide for Data Visualization and Analysis using ACSN ACSN contains the NaviCell tool box, the intuitive and user- friendly environment for data visualization and analysis. The tool is accessible from the
17 July 2014 WEB-SERVER MANUAL. Contact: Michael Hackenberg ([email protected])
WEB-SERVER MANUAL Contact: Michael Hackenberg ([email protected]) 1 1 Introduction srnabench is a free web-server tool and standalone application for processing small- RNA data obtained from next generation
The data between TC Monitor and remote devices is exchanged using HTTP protocol. Monitored devices operate either as server or client mode.
1. Introduction TC Monitor is easy to use Windows application for monitoring and control of some Teracom Ethernet (TCW) and GSM/GPRS (TCG) controllers. The supported devices are TCW122B-CM, TCW181B- CM,
Application of Graph-based Data Mining to Metabolic Pathways
Application of Graph-based Data Mining to Metabolic Pathways Chang Hun You, Lawrence B. Holder, Diane J. Cook School of Electrical Engineering and Computer Science Washington State University Pullman,
UGENE Quick Start Guide
Quick Start Guide This document contains a quick introduction to UGENE. For more detailed information, you can find the UGENE User Manual and other special manuals in project website: http://ugene.unipro.ru.
Getting Started With SAM Director SAM Director User Guide
Getting Started With SAM Director SAM Director User Guide Copyright 2014 License Dashboard Limited. License Dashboard Limited is a trading subsidiary of the Blenheim Group. License Dashboard Limited -
Getting Started with R and RStudio 1
Getting Started with R and RStudio 1 1 What is R? R is a system for statistical computation and graphics. It is the statistical system that is used in Mathematics 241, Engineering Statistics, for the following
Version 5.0 Release Notes
Version 5.0 Release Notes 2011 Gene Codes Corporation Gene Codes Corporation 775 Technology Drive, Ann Arbor, MI 48108 USA 1.800.497.4939 (USA) +1.734.769.7249 (elsewhere) +1.734.769.7074 (fax) www.genecodes.com
JustClust User Manual
JustClust User Manual Contents 1. Installing JustClust 2. Running JustClust 3. Basic Usage of JustClust 3.1. Creating a Network 3.2. Clustering a Network 3.3. Applying a Layout 3.4. Saving and Loading
Analysis of Illumina Gene Expression Microarray Data
Analysis of Illumina Gene Expression Microarray Data Asta Laiho, Msc. Tech. Bioinformatics research engineer The Finnish DNA Microarray Centre Turku Centre for Biotechnology, Finland The Finnish DNA Microarray
SBGNViz.js 1.1 User s Guide
i - Vis@Bilkent & cbio@mskcc SBGNViz.js 1.1 User s Guide i-vis@bilkent and cbio@mskcc2014 Bilkent University Ankara 06800, TURKEY Phone +90 312.290.1612 Fax +90 312.266.4047 Table of Contents 1. Introduction...
Bitrix Site Manager 4.1. User Guide
Bitrix Site Manager 4.1 User Guide 2 Contents REGISTRATION AND AUTHORISATION...3 SITE SECTIONS...5 Creating a section...6 Changing the section properties...8 SITE PAGES...9 Creating a page...10 Editing
Monitoring PostgreSQL database with Verax NMS
Monitoring PostgreSQL database with Verax NMS Table of contents Abstract... 3 1. Adding PostgreSQL database to device inventory... 4 2. Adding sensors for PostgreSQL database... 7 3. Adding performance
DeCyder Extended Data Analysis module Version 1.0
GE Healthcare DeCyder Extended Data Analysis module Version 1.0 Module for DeCyder 2D version 6.5 User Manual Contents 1 Introduction 1.1 Introduction... 7 1.2 The DeCyder EDA User Manual... 9 1.3 Getting
USING MYWEBSQL FIGURE 1: FIRST AUTHENTICATION LAYER (ENTER YOUR REGULAR SIMMONS USERNAME AND PASSWORD)
USING MYWEBSQL MyWebSQL is a database web administration tool that will be used during LIS 458 & CS 333. This document will provide the basic steps for you to become familiar with the application. 1. To
CD-HIT User s Guide. Last updated: April 5, 2010. http://cd-hit.org http://bioinformatics.org/cd-hit/
CD-HIT User s Guide Last updated: April 5, 2010 http://cd-hit.org http://bioinformatics.org/cd-hit/ Program developed by Weizhong Li s lab at UCSD http://weizhong-lab.ucsd.edu [email protected] 1. Introduction
Managing Users and Identity Stores
CHAPTER 8 Overview ACS manages your network devices and other ACS clients by using the ACS network resource repositories and identity stores. When a host connects to the network through ACS requesting
SeqArray: an R/Bioconductor Package for Big Data Management of Genome-Wide Sequencing Variants
SeqArray: an R/Bioconductor Package for Big Data Management of Genome-Wide Sequencing Variants Xiuwen Zheng Department of Biostatistics University of Washington Seattle Jan 14, 2015 Contents 1 Overview
Normalization of RNA-Seq
Normalization of RNA-Seq Davide Risso Modified: April 27, 2012. Compiled: April 27, 2012 1 Retrieving the data Usually, an RNA-Seq data analysis from scratch starts with a set of FASTQ files (see e.g.
When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want
1 When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want to search other databases as well. There are very
Please check www.milestonesys.com for updates to make sure you install the most recent version of our software.
Guide Contents Dear Milestone Customer, With the purchase of Milestone XProtect Central you have chosen a very powerful central monitoring solution, providing instant overview of any number of Milestone
BreezingForms Guide. 18 Forms: BreezingForms
BreezingForms 8/3/2009 1 BreezingForms Guide GOOGLE TRANSLATE FROM: http://openbook.galileocomputing.de/joomla15/jooml a_18_formulare_neu_001.htm#t2t32 18.1 BreezingForms 18.1.1 Installation and configuration
HP IMC User Behavior Auditor
HP IMC User Behavior Auditor Administrator Guide Abstract This guide describes the User Behavior Auditor (UBA), an add-on service module of the HP Intelligent Management Center. UBA is designed for IMC
Methods for network visualization and gene enrichment analysis July 17, 2013. Jeremy Miller Scientist I [email protected]
Methods for network visualization and gene enrichment analysis July 17, 2013 Jeremy Miller Scientist I [email protected] Outline Visualizing networks using R Visualizing networks using outside
IBM Operational Decision Manager Version 8 Release 5. Getting Started with Business Rules
IBM Operational Decision Manager Version 8 Release 5 Getting Started with Business Rules Note Before using this information and the product it supports, read the information in Notices on page 43. This
IBM Unica emessage Version 8 Release 6 February 13, 2015. User's Guide
IBM Unica emessage Version 8 Release 6 February 13, 2015 User's Guide Note Before using this information and the product it supports, read the information in Notices on page 403. This edition applies to
Package uptimerobot. October 22, 2015
Type Package Version 1.0.0 Title Access the UptimeRobot Ping API Package uptimerobot October 22, 2015 Provide a set of wrappers to call all the endpoints of UptimeRobot API which includes various kind
ProSightPC 3.0 Quick Start Guide
ProSightPC 3.0 Quick Start Guide The Thermo ProSightPC 3.0 application is the only proteomics software suite that effectively supports high-mass-accuracy MS/MS experiments performed on LTQ FT and LTQ Orbitrap
User Guide. Trade Finance Global. Reports Centre. October 2015. nordea.com/cm OR tradefinance Name of document 8/8 2015/V1
User Guide Trade Finance Global Reports Centre October 2015 nordea.com/cm OR tradefinance Name of document 2015/V1 8/8 Table of Contents 1 Trade Finance Global (TFG) Reports Centre Overview... 4 1.1 Key
Analysis of ChIP-seq data in Galaxy
Analysis of ChIP-seq data in Galaxy November, 2012 Local copy: https://galaxy.wi.mit.edu/ Joint project between BaRC and IT Main site: http://main.g2.bx.psu.edu/ 1 Font Conventions Bold and blue refers
Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova
Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel
How to test and debug an ASP.NET application
Chapter 4 How to test and debug an ASP.NET application 113 4 How to test and debug an ASP.NET application If you ve done much programming, you know that testing and debugging are often the most difficult
Microsoft Dynamics CRM 2013 Applications Introduction Training Material Version 2.0
Microsoft Dynamics CRM 2013 Applications Introduction Training Material Version 2.0 www.firebrandtraining.com Course content Module 0 Course Content and Plan... 4 Objectives... 4 Course Plan... 4 Course
Sequence Formats and Sequence Database Searches. Gloria Rendon SC11 Education June, 2011
Sequence Formats and Sequence Database Searches Gloria Rendon SC11 Education June, 2011 Sequence A is the primary structure of a biological molecule. It is a chain of residues that form a precise linear
Workplace Giving Guide
Workplace Giving Guide 042612 2012 Blackbaud, Inc. This publication, or any part thereof, may not be reproduced or transmitted in any form or by any means, electronic, or mechanical, including photocopying,
Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6
Introduction to Bioinformatics AS 250.265 Laboratory Assignment 6 In the last lab, you learned how to perform basic multiple sequence alignments. While useful in themselves for determining conserved residues
Blackbaud Sphere & The Raiser s Edge Integration Guide
Blackbaud Sphere & The Raiser s Edge Integration Guide 101311 Blackbaud Sphere 2011 Blackbaud, Inc. This publication, or any part thereof, may not be reproduced or transmitted in any form or by any means,
Project Zip Code. Version 13.0. CUNA s Powerful Grassroots Program. User Manual. Copyright 2012, All Rights Reserved
Project Zip Code Version 13.0 CUNA s Powerful Grassroots Program User Manual Copyright 2012, All Rights Reserved Project Zip Code Version 13.0 Page 1 Table of Contents Topic Page About Project Zip Code
Keeper Care System Data Manager Version 1.0
Automated Inventory Solutions, Inc. User Manual Keeper Care System Data Manager Version 1.0 Automated Inventory Solutions Phone: (304)725-4801 Fax: (304)725-6983 www.aisvendors.com Email: [email protected]
EUR-Lex 2012 Data Extraction using Web Services
DOCUMENT HISTORY DOCUMENT HISTORY Version Release Date Description 0.01 24/01/2013 Initial draft 0.02 01/02/2013 Review 1.00 07/08/2013 Version 1.00 -v1.00.doc Page 2 of 17 TABLE OF CONTENTS 1 Introduction...
Metadata Import Plugin User manual
Metadata Import Plugin User manual User manual for Metadata Import Plugin 1.0 Windows, Mac OS X and Linux August 30, 2013 This software is for research purposes only. CLC bio Silkeborgvej 2 Prismet DK-8000
CLC Bioinformatics Database
CLC Bioinformatics Database End User USER MANUAL Manual for CLC Bioinformatics Database 4.6 Windows, Mac OS X and Linux September 3, 2015 This software is for research purposes only. QIAGEN Aarhus A/S
Exiqon Array Software Manual. Quick guide to data extraction from mircury LNA microrna Arrays
Exiqon Array Software Manual Quick guide to data extraction from mircury LNA microrna Arrays March 2010 Table of contents Introduction Overview...................................................... 3 ImaGene
Evaluator s Guide. PC-Duo Enterprise HelpDesk v5.0. Copyright 2006 Vector Networks Ltd and MetaQuest Software Inc. All rights reserved.
Evaluator s Guide PC-Duo Enterprise HelpDesk v5.0 Copyright 2006 Vector Networks Ltd and MetaQuest Software Inc. All rights reserved. All third-party trademarks are the property of their respective owners.
EDASeq: Exploratory Data Analysis and Normalization for RNA-Seq
EDASeq: Exploratory Data Analysis and Normalization for RNA-Seq Davide Risso Modified: May 22, 2012. Compiled: October 14, 2013 1 Introduction In this document, we show how to conduct Exploratory Data
DocumentsCorePack for MS CRM 2011 Implementation Guide
DocumentsCorePack for MS CRM 2011 Implementation Guide Version 5.0 Implementation Guide (How to install/uninstall) The content of this document is subject to change without notice. Microsoft and Microsoft
SNMP Web Management. User s Manual For SNMP Web Card/Box
SNMP Web Management User s Manual For SNMP Web Card/Box Management Software for Off-Grid Inverter Version: 1.2 Table of Contents 1. Overview... 1 1.1 Introduction... 1 1.2 Features... 1 1.3 Overlook...
SysPatrol - Server Security Monitor
SysPatrol Server Security Monitor User Manual Version 2.2 Sep 2013 www.flexense.com www.syspatrol.com 1 Product Overview SysPatrol is a server security monitoring solution allowing one to monitor one or
Hands-On Lab: WSUS. Lab Manual Expediting WSUS Service for XP Embedded OS
Lab Manual Expediting WSUS Service for XP Embedded OS Summary In this lab, you will learn how to deploy the security update to your XP Pro or XP embedded images. You will also learn how to prepare the
When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want
1 When you install Mascot, it includes a copy of the Swiss-Prot protein database. However, it is almost certain that you and your colleagues will want to search other databases as well. There are very
Active Directory User Management System (ADUMS)
Active Directory User Management System (ADUMS) Release 2.9.3 User Guide Revision History Version Author Date Comments (MM/DD/YYYY) i RMA 08/05/2009 Initial Draft Ii RMA 08/20/09 Addl functionality and
Package RDAVIDWebService
Type Package Package RDAVIDWebService November 17, 2015 Title An R Package for retrieving data from DAVID into R objects using Web Services API. Version 1.9.0 Date 2014-04-15 Author Cristobal Fresno and
Note: With v3.2, the DocuSign Fetch application was renamed DocuSign Retrieve.
Quick Start Guide DocuSign Retrieve 3.2.2 Published April 2015 Overview DocuSign Retrieve is a windows-based tool that "retrieves" envelopes, documents, and data from DocuSign for use in external systems.
University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology
University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology Programme Structure - the MSc outcome will require 180 credits total (full-time only) - 60
How To Use Syntheticys User Management On A Pc Or Mac Or Macbook Powerbook (For Mac) On A Computer Or Mac (For Pc Or Pc) On Your Computer Or Ipa (For Ipa) On An Pc Or Ipad
SYNTHESYS MANAGEMENT User Management Synthesys.Net User Management 1 SYNTHESYS.NET USER MANAGEMENT INTRODUCTION...3 STARTING SYNTHESYS USER MANAGEMENT...4 Viewing User Details... 5 Locating individual
Table of Contents. Introduction... 3. 1. How to access the Safari Backoffice 3.11... 3. 2. How Safari corporate accounts are structured...
Safari Corporate Account Manager Instructions For Using Back Office 3 February 2006 Table of Contents Introduction... 3 1. How to access the Safari Backoffice 3.11... 3 2. How Safari corporate accounts
