A Comprehensive metatranscriptomics analysis pipeline and its validation using human small intestine microbiota metatranscriptome
1 A Comprehensive metatranscriptomics analysis pipeline and its validation using human small intestine microbiota metatranscriptome. NBIC: 3rd Metagenomics Seminar, Utrecht, September 25th, 2012. Javier Ramiro Garcia, Mark Davids, Peter Schaap, Wageningen University.
2 Aims. To develop a fast and robust bioinformatics pipeline for metatranscriptome analysis. To validate the pipeline using human small intestine microbiota metatranscriptome samples.
3 Human gastrointestinal tract microbiota. Microbial cells outnumber host cells by 10-fold. Associated with diabetes, obesity and intestinal disease. More than 1000 species of microbes in the gut ecosystem. Large (~80%) uncultured fraction.
4 Unexplored small intestinal microbiota. Colon: good accessibility, well-studied microbiota. Small intestine: poorly accessible, relatively unexplored.
5 Molecular approaches to study microbial communities Zoetendal et al., 2008, Gut 57:
6 Sampling the human small intestine. Surgical removal of the colon or small bowel transplant (SBT), e.g. for Crohn's disease, ulcerative colitis or cancer. Healthy subject: invasive (sampling with catheter); limited amount of material; single time point sampling. Versus ileostomy subject: non-invasive (sampling of luminal microbiota of the distal ileum); sufficient amount of material (up to 100 ml); repeated sampling, dietary intervention.
7 Experimental design for microbiota analysis. [Diagram] Ileostomy effluent. RNA extraction: mRNA, ds cDNA, RNA sequencing on Illumina (reads ~100 bp, 9-42 million reads per sample); 16S rRNA RT-PCR, pyrosequencing. DNA extraction: 16S rDNA PCR, pyrosequencing. Bioinformatics analysis: 16S rRNA (activity), 16S rDNA (community), bacterial (meta)genome, mRNA (function/pathways).
8 Samples. Single-end reads: A: 29,709,278 reads; A-rep (technical replicate): 8,951,083 reads. Paired-end reads: B-left and B-right: 42,211,887 reads each.
9 General layout of the RNA-seq data pipeline. Sequencing reads (pyrosequencing or Illumina); quality check; assignment of sequencing reads to genes or proteins using a blast algorithm; annotation of the identified genes or proteins (COG/KEGG). Accurate gene/protein assignment: blast algorithm (blastx) against the NCBI database of bacteria.
10 Illumina reads quality (FastQC). [Figure: four histograms, one each for A, A-rep, B-left and B-right, showing number of raw reads against mean sequence quality (Phred score), binned as poor, average and high.] Between 71% and 83% of the sequencing reads are high quality.
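The binning of reads by mean Phred score can be illustrated with a minimal Python sketch. It assumes Phred+33 encoded FASTQ quality strings, and the cut-offs of 20 and 28 are illustrative assumptions, not values stated on the slide; the function names are hypothetical.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of one FASTQ quality string (Phred+33 assumed)."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def quality_class(mean_q: float, poor_below: float = 20.0, high_from: float = 28.0) -> str:
    """Bin a mean score as poor / average / high (cut-offs are assumptions)."""
    if mean_q < poor_below:
        return "poor"
    if mean_q < high_from:
        return "average"
    return "high"


if __name__ == "__main__":
    for qual in ("IIIIIIII", "55555555", "########"):
        score = mean_phred(qual)
        print(qual, score, quality_class(score))
```

The fraction of reads landing in the "high" bin is the quantity the slide reports per sample.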
11 Computational time required by the RNA-seq pipeline. Workflow: Illumina reads (100 bp); QC; putative mRNA reads; BlastX against the NCBI database of bacteria; read assignment to gene/protein; phylogenetic profiling; metabolic mapping. Points of attention: large number of input sequences; determination of the database; BlastX is highly computationally demanding.
12 Reduction of the database size. Will not be performed because: the pipeline is general-purpose (it should be applicable to other environmental samples); a reduced database could exclude species that are present in the samples.
13 Reduction of the number of reads by pooling. Identical sequences are pooled into unique sequences (Uniq-Seq). Example: reads Header_a CACT, Header_b TACG, Header_c AACT, Header_d GCGC, Header_e CACT, Header_f AACT, Header_g TTAG are pooled to Header_a_2 CACT, Header_b TACG, Header_c_2 AACT, Header_d GCGC, Header_g TTAG, where the suffix _2 marks a sequence seen twice. [Figure: sequencing reads (%), total reads versus unique reads after pooling, for A, A-rep, B-left and B-right.] ~54% reduction for single-end reads; ~70% reduction for paired-end reads.
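The pooling step on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function name `pool_reads` and the in-memory list of `(header, sequence)` pairs are assumptions, and the count suffix follows the `_2` pattern in the slide's example.

```python
def pool_reads(reads):
    """Collapse identical sequences, keeping the first header seen.

    reads: iterable of (header, sequence) pairs.
    Returns a list of (header, sequence); headers of sequences seen
    n > 1 times get the suffix "_n", as in the slide's example.
    """
    counts = {}        # sequence -> number of occurrences
    first_header = {}  # sequence -> header of first occurrence
    for header, seq in reads:
        if seq not in counts:
            counts[seq] = 0
            first_header[seq] = header
        counts[seq] += 1
    pooled = []
    for seq, n in counts.items():  # dicts keep insertion order (Python 3.7+)
        header = first_header[seq]
        pooled.append((f"{header}_{n}" if n > 1 else header, seq))
    return pooled


if __name__ == "__main__":
    sample = [
        ("Header_a", "CACT"), ("Header_b", "TACG"), ("Header_c", "AACT"),
        ("Header_d", "GCGC"), ("Header_e", "CACT"), ("Header_f", "AACT"),
        ("Header_g", "TTAG"),
    ]
    unique = pool_reads(sample)
    print(unique)
    print(f"reduction: {1 - len(unique) / len(sample):.0%}")
```

Keeping the count in the header lets later steps weight each unique sequence by its original abundance without re-reading the raw file.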
14 Filtering process. Reads without biological function: ribosomal RNA, PhiX spike-in and Illumina adaptor sequences. Filtering procedure: filter database; Megablast; validation (FDR= %, minimum alignment of 28 nt). Megablast is not fast enough and consumes a lot of RAM, hence development of a new algorithm.
15 rRNA filter development. Filter database (28-mers); pooled Illumina reads are screened against the filter database and split into rRNA reads and non-rRNA reads. [Figure: rRNA reads distribution (%), total rRNA reads versus non-rRNA reads, for A, A (rep), B-left and B-right.] More than 75% of total reads are non-functional.
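The slides do not describe the new filtering algorithm beyond the use of 28-mers, so the following Python sketch is only one plausible reading: index every 28-mer of the filter database (both strands) in a set, and flag a read as soon as it shares one exact 28-mer with that index. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_kmer_index(filter_seqs, k: int = 28) -> set:
    """Set of all k-mers from the filter sequences and their reverse complements."""
    index = set()
    for seq in filter_seqs:
        seq = seq.upper()
        for strand in (seq, reverse_complement(seq)):
            for i in range(len(strand) - k + 1):
                index.add(strand[i:i + k])
    return index


def matches_filter(read: str, index: set, k: int = 28) -> bool:
    """True if the read shares at least one exact k-mer with the index."""
    read = read.upper()
    return any(read[i:i + k] in index for i in range(len(read) - k + 1))


def split_reads(reads, index: set, k: int = 28):
    """Split reads into (filtered, kept) lists."""
    filtered, kept = [], []
    for read in reads:
        (filtered if matches_filter(read, index, k) else kept).append(read)
    return filtered, kept
```

Set lookups make each read cost proportional to its length, independent of database size, which is the kind of speed-up over Megablast that the previous slide asks for; memory use grows with the number of distinct k-mers.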
16 Blast strategy. Workflow: Illumina reads; QC (FastQC); pooled Illumina reads; rRNA filter database, separating rRNA reads from non-rRNA reads; putative mRNA reads; BlastX against the NCBI database of bacteria (highly computationally demanding); read assignment to gene/protein; phylogenetic profiling; metabolic mapping.
17 Blast strategy for assignment of putative mRNA reads. Step 1: Megablast of putative mRNA reads against the bacterial genome database, giving reads assigned to genome and reads not assigned to genome. Step 2: Blastn of the non-assigned reads against the bacterial genome database, giving reads assigned to genome and reads not assigned to genome (NAG). Step 3: Blastx of 10% of NAG against the NCBI protein database and against the MetaHIT and small intestine (M&SI) databases, giving reads assigned to protein (NCBI), reads assigned to protein (M&SI) and unassigned reads.
18 Validation of cut-off value. Workflow: Illumina reads; QC (FastQC); pooled Illumina reads; rRNA filter database, separating rRNA reads from non-rRNA reads; putative mRNA reads; sequence of blasting; validation of cut-off value, separating significant hits from non-significant hits; phylogenetic profiling; metabolic mapping.
19 Validation of cut-off value. From NCBI complete bacterial coding regions, generate 10,000 random in silico reads of 100 bp; blast them and collect the hits per in silico read; group the hits by bit score; check whether COG and taxonomy of each hit match or mismatch the source of the read.
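Generating the in silico reads for this validation could look like the Python sketch below. It assumes the coding regions are already loaded as a dict of name to sequence, picks a gene uniformly at random and then a start position uniformly within it; the slide does not say how the sampling was actually done, and `sample_in_silico_reads` is a hypothetical name.

```python
import random


def sample_in_silico_reads(coding_seqs, n_reads=10_000, read_len=100, seed=0):
    """Draw fixed-length reads at random from coding sequences.

    coding_seqs: dict mapping gene name -> nucleotide sequence.
    Returns a list of (header, read); the header records gene and start
    position so each blast hit can later be checked against its true source.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    eligible = [(name, seq) for name, seq in coding_seqs.items()
                if len(seq) >= read_len]
    if not eligible:
        raise ValueError("no sequence is long enough for the requested read length")
    reads = []
    for _ in range(n_reads):
        name, seq = rng.choice(eligible)
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((f"{name}|{start}", seq[start:start + read_len]))
    return reads
```

Because the true gene of every read is known from its header, the fraction of hits with matching COG or taxonomy can be tabulated per bit-score group.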
20 Validation of gene and protein assignment. [Figure: % match against Megablast bit score.] For accurate assignment: COG, bit score 74 with 95% confidence; family, bit score 110 with 80% confidence; genus, bit score 148 with 80% confidence. Assignment at species level is not possible.
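Applying these cut-offs to a blast hit can be sketched as follows in Python. The sketch reads the slide's flattened table as COG at 74, family at 110 and genus at 148, and treats each threshold as inclusive; both points are assumptions, and `supported_levels` is a hypothetical helper.

```python
# Minimum bit score per assignment level, as read from the slide (assumed mapping).
BIT_SCORE_CUTOFFS = [("COG", 74.0), ("family", 110.0), ("genus", 148.0)]


def supported_levels(bit_score: float) -> list:
    """Assignment levels whose cut-off the bit score reaches (inclusive)."""
    return [level for level, cutoff in BIT_SCORE_CUTOFFS if bit_score >= cutoff]


if __name__ == "__main__":
    for score in (50, 74, 120, 148):
        print(score, supported_levels(score))
```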
21 Reads distribution based on blast procedure and bit score. [Figure: abundance (%) of reads for A, A-rep, B-left and B-right, split by blast procedure (genome by megablast, genome by blastn, unassigned reads) and by bit score class.] The majority (>75%) of the reads can be assigned to a genome using megablast and blastn, avoiding the use of blastx. The majority (>54%) of the genome-assigned reads have a bit score of 148, allowing assignment at genus (80%), family (97%) and COG (~100%) level.
22 Confidence of assignment of the mRNA reads at genus level. Selection of genes that belong to the gal operon of Streptococcus salivarius CCHS33 (galK, galT, galE, galM); blast all putative mRNA reads against these genes. [Figure: read coverage across coding and non-coding regions of the gal operon.] ~32% of the reads that can be assigned to those genes belong to S. salivarius CCHS33, while the rest come from other Streptococcus species; no other genus was detected. This increases the confidence of genus assignment.
23 Reads assignment to the genes. [Figure: distribution of reads that can be assigned to genome, per sample: number of reads (non-coding assigned reads, total coding assigned reads) and number of genes.] Total protein-encoding genes: A 60,773; A-rep 36,348; B-left 91,474; B-right 89,876. Normalisation and determination of significant genes: gene length normalisation; removal of genes with <0.0005% read abundance. [Figure: non-coding assigned reads, non-significant coding assigned reads and significant coding assigned reads, per sample.] Significant protein-encoding genes: A 10,556; A-rep 9,063; B-left 12,646; B-right 12,673. A large reduction in the number of genes, while discarding <8% of the total gene-assigned reads.
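The two steps named on this slide, gene length normalisation and removal of low-abundance genes, can be sketched in Python as below. The slide does not say whether the 0.0005% cut-off is applied to raw or normalised counts, nor which length unit is used; this sketch assumes the cut-off applies to the raw share of gene-assigned reads and normalises to reads per kilobase. The function name is hypothetical.

```python
def significant_genes(read_counts, gene_lengths, min_percent=0.0005):
    """Filter out rare genes, then length-normalise the rest.

    read_counts:  dict gene -> number of assigned reads.
    gene_lengths: dict gene -> gene length in nucleotides.
    min_percent:  minimum share of all gene-assigned reads, in percent.
    Returns dict gene -> reads per kilobase for the genes that pass.
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    kept = {}
    for gene, count in read_counts.items():
        if 100.0 * count / total < min_percent:
            continue  # below the abundance cut-off (assumed to use raw counts)
        kept[gene] = count / (gene_lengths[gene] / 1000.0)
    return kept


if __name__ == "__main__":
    counts = {"g1": 900_000, "g2": 1, "g3": 99_999}
    lengths = {"g1": 1500, "g2": 900, "g3": 1000}
    print(significant_genes(counts, lengths))
```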
24 Increase of gene identification accuracy using multiple-read assignment. [Figure: read counts (%) and gene counts (%) against gene length coverage (%) of protein-encoding genes, with reads grouped by average bit score.] Increased confidence of gene assignment.
25 Validation of technical replicates and paired-end reads. Pearson correlations: 9.84; paired-end reads = 1 (p<0.01); single-end replicates = (p<0.01). Robustness of the pipeline for functional annotation of the replicates and paired-end reads. [Figure: paired-end matched reads (same genome), paired-end matched reads (different genome), unique reads.]
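As a reminder of what is being compared here, the Pearson correlation between two per-gene (or per-COG) count vectors can be computed as in this minimal Python sketch. It does not compute the p-value quoted on the slide, and the input vectors in the usage example are made-up numbers, not data from the study.

```python
import math


def pearson(x, y) -> float:
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical per-category read counts for two replicates.
    replicate_1 = [120, 45, 300, 8]
    replicate_2 = [110, 50, 310, 6]
    print(round(pearson(replicate_1, replicate_2), 3))
```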
26 COG distribution of the genes. [Figure: abundance (%) of COG categories, including Metabolism and Information storage and processing, for A, A-rep, B-left and B-right.] Robustness of the pipeline for functional annotation.
27 Functional analysis (metabolic pathways): nucleotide metabolism, lipid metabolism, carbohydrate metabolism, amino acid metabolism, energy metabolism.
28 Experimental design for microbiota analysis. [Diagram] Ileostomy effluent. RNA extraction: mRNA, ds cDNA, RNA sequencing on Illumina (reads ~100 bp, 9-42 million reads per sample); 16S rRNA RT-PCR, pyrosequencing. DNA extraction: 16S rDNA PCR, pyrosequencing. Bioinformatics analysis: 16S rRNA (activity), 16S rDNA (community), bacterial (meta)genome, mRNA (function/pathways).
29 Taxonomic distribution at genus level. [Figure: relative abundance (%) by pyrosequencing (16S rDNA and 16S rRNA, samples A and B) and by RNA-seq (A, A-rep, B-left, B-right). Legend: Others, Unclassified, Haemophilus, Bifidobacterium, Turicibacter, Gemella, Streptococcus, Rothia, Lactobacillus, Lactococcus, Veillonella, Clostridium.] Correlation between microbiota composition, overall activity and specific activity of the community members.
30 Final set-up of the bioinformatics pipeline. [Diagram legend: input, processes, intermediate output, discarded output.] Pooled Illumina reads are filtered for rRNA against the filter database, separating out the rRNA reads. The remaining reads are assigned to the genome database, giving assigned and non-assigned reads. Assigned reads are classified into gene-assigned and non-gene-assigned reads; gene-assigned reads receive gene annotations from the COG/KEGG database. Of the non-assigned reads, 10% go through Blastx against the NCBI protein database; of the reads still non-assigned after that, 10% go through Blastx against the MetaHIT and SI protein databases, leaving the unassigned reads. Final step: metabolic mapping and biological interpretation.
31 Summary. Accurate COG functional assignment: >95% confidence level. Phylogenetic assignment: genus >80%, family >97%, for ~54% of the genome-assigned reads. Robustness of functional assignment across technical replicates and paired-end reads. Correlation between microbiota composition, overall activity and specific activity of the community members.
32 Acknowledgement: Milkha M Leimena, Mark Davids, Matthijn C Hesselman, Tom vd. Bogert, Jos Boekhorst, Eddy Smid, Erwin Zoetendal, Michiel Kleerebezem, Peter Schaap, Hauke Smidt.
Module 10: Bioinformatics 1.) Goal: To understand the general approaches for basic in silico (computer) analysis of DNA- and protein sequences. We are going to discuss sequence formatting required prior
More informationBLAST. Anders Gorm Pedersen & Rasmus Wernersson
BLAST Anders Gorm Pedersen & Rasmus Wernersson Database searching Using pairwise alignments to search databases for similar sequences Query sequence Database Database searching Most common use of pairwise
More informationAlgorithms in Computational Biology (236522) spring 2007 Lecture #1
Algorithms in Computational Biology (236522) spring 2007 Lecture #1 Lecturer: Shlomo Moran, Taub 639, tel 4363 Office hours: Tuesday 11:00-12:00/by appointment TA: Ilan Gronau, Taub 700, tel 4894 Office
More informationSpecific problems. The genetic code. The genetic code. Adaptor molecules match amino acids to mrna codons
Tutorial II Gene expression: mrna translation and protein synthesis Piergiorgio Percipalle, PhD Program Control of gene transcription and RNA processing mrna translation and protein synthesis KAROLINSKA
More informationBIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology http://tinyurl.com/bioinf525-w16
Course Director: Dr. Barry Grant (DCM&B, bjgrant@med.umich.edu) Description: This is a three module course covering (1) Foundations of Bioinformatics, (2) Statistics in Bioinformatics, and (3) Systems
More informationGene Expression Assays
APPLICATION NOTE TaqMan Gene Expression Assays A mpl i fic ationef ficienc yof TaqMan Gene Expression Assays Assays tested extensively for qpcr efficiency Key factors that affect efficiency Efficiency
More informationMeasuring gene expression (Microarrays) Ulf Leser
Measuring gene expression (Microarrays) Ulf Leser This Lecture Gene expression Microarrays Idea Technologies Problems Quality control Normalization Analysis next week! 2 http://learn.genetics.utah.edu/content/molecules/transcribe/
More informationLecture Outline. Introduction to Databases. Introduction. Data Formats Sample databases How to text search databases. Shifra Ben-Dor Irit Orr
Introduction to Databases Shifra Ben-Dor Irit Orr Lecture Outline Introduction Data and Database types Database components Data Formats Sample databases How to text search databases What units of information
More informationAccelerate genomic breakthroughs in microbiology. Gain deeper insights with powerful bioinformatic tools.
Accelerate genomic breakthroughs in microbiology. Gain deeper insights with powerful bioinformatic tools. Empowering microbial genomics. Extensive methods. Expansive possibilities. In microbiome studies
More informationCorrelation of microarray and quantitative real-time PCR results. Elisa Wurmbach Mount Sinai School of Medicine New York
Correlation of microarray and quantitative real-time PCR results Elisa Wurmbach Mount Sinai School of Medicine New York Microarray techniques Oligo-array: Affymetrix, Codelink, spotted oligo-arrays (60-70mers)
More informationExpression Quantification (I)
Expression Quantification (I) Mario Fasold, LIFE, IZBI Sequencing Technology One Illumina HiSeq 2000 run produces 2 times (paired-end) ca. 1,2 Billion reads ca. 120 GB FASTQ file RNA-seq protocol Task
More informationACAAGGGACTAGAGAAACCAAAA AGAAACCAAAACGAAAGGTGCAGAA AACGAAAGGTGCAGAAGGGGAAACAGATGCAGA CHAPTER 3
ACAAGGGACTAGAGAAACCAAAA AGAAACCAAAACGAAAGGTGCAGAA AACGAAAGGTGCAGAAGGGGAAACAGATGCAGA CHAPTER 3 GAAGGGGAAACAGATGCAGAAAGCATC AGAAAGCATC ACAAGGGACTAGAGAAACCAAAACGAAAGGTGCAGAAGGGGAAACAGATGCAGAAAGCATC Introduction
More informationMolecular Genetics: Challenges for Statistical Practice. J.K. Lindsey
Molecular Genetics: Challenges for Statistical Practice J.K. Lindsey 1. What is a Microarray? 2. Design Questions 3. Modelling Questions 4. Longitudinal Data 5. Conclusions 1. What is a microarray? A microarray
More informationRNA-Seq Tutorial 1. John Garbe Research Informatics Support Systems, MSI March 19, 2012
RNA-Seq Tutorial 1 John Garbe Research Informatics Support Systems, MSI March 19, 2012 Tutorial 1 RNA-Seq Tutorials RNA-Seq experiment design and analysis Instruction on individual software will be provided
More informationBiological Databases and Protein Sequence Analysis
Biological Databases and Protein Sequence Analysis Introduction M. Madan Babu, Center for Biotechnology, Anna University, Chennai 25, India Bioinformatics is the application of Information technology to
More informationUsability in bioinformatics mobile applications
Usability in bioinformatics mobile applications what we are working on Noura Chelbah, Sergio Díaz, Óscar Torreño, and myself Juan Falgueras App name Performs Advantajes Dissatvantajes Link The problem
More informationNext generation sequencing (NGS)
Next generation sequencing (NGS) Vijayachitra Modhukur BIIT modhukur@ut.ee 1 Bioinformatics course 11/13/12 Sequencing 2 Bioinformatics course 11/13/12 Microarrays vs NGS Sequences do not need to be known
More informationGenBank, Entrez, & FASTA
GenBank, Entrez, & FASTA Nucleotide Sequence Databases First generation GenBank is a representative example started as sort of a museum to preserve knowledge of a sequence from first discovery great repositories,
More information