RNAseq / ChipSeq / Methylseq and personalized genomics



Similar documents
Next generation DNA sequencing technologies. theory & prac-ce

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS)

Computational Genomics. Next generation sequencing (NGS)

Shouguo Gao Ph. D Department of Physics and Comprehensive Diabetes Center

Next Generation Sequencing

Introduction to NGS data analysis

New Technologies for Sensitive, Low-Input RNA-Seq. Clontech Laboratories, Inc.

Expression Quantification (I)

NGS data analysis. Bernardo J. Clavijo

G E N OM I C S S E RV I C ES

Next Generation Sequencing: Technology, Mapping, and Analysis

Frequently Asked Questions Next Generation Sequencing

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

Lectures 1 and February 7, Genomics 2012: Repetitorium. Peter N Robinson. VL1: Next- Generation Sequencing. VL8 9: Variant Calling

Data Analysis & Management of High-throughput Sequencing Data. Quoclinh Nguyen Research Informatics Genomics Core / Medical Research Institute

Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011) The ENCODE Consortium

Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation

FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem

Challenges associated with analysis and storage of NGS data

Basic processing of next-generation sequencing (NGS) data

Human Genome Organization: An Update. Genome Organization: An Update

Control of Gene Expression

BIOL 3200 Spring 2015 DNA Subway and RNA-Seq Data Analysis

Single-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples

PreciseTM Whitepaper

Core Facility Genomics

Discovery and Quantification of RNA with RNASeq Roderic Guigó Serra Centre de Regulació Genòmica (CRG)

GenomeStudio Data Analysis Software

Bioruptor NGS: Unbiased DNA shearing for Next-Generation Sequencing

Introduction to next-generation sequencing data

Advances in RainDance Sequence Enrichment Technology and Applications in Cancer Research. March 17, 2011 Rendez-Vous Séquençage

8/7/2012. Experimental Design & Intro to NGS Data Analysis. Examples. Agenda. Shoe Example. Breast Cancer Example. Rat Example (Experimental Design)

Gene Expression Analysis

RNA-seq. Quantification and Differential Expression. Genomics: Lecture #12

Next Generation Sequencing: Adjusting to Big Data. Daniel Nicorici, Dr.Tech. Statistikot Suomen Lääketeollisuudessa

Nazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office

CRAC: An integrated approach to analyse RNA-seq reads Additional File 3 Results on simulated RNA-seq data.

NEXT GENERATION SEQUENCING

GenomeStudio Data Analysis Software

Overview of Next Generation Sequencing platform technologies

Next Generation Sequencing; Technologies, applications and data analysis

Gene Models & Bed format: What they represent.

Next Generation Sequencing; Technologies, applications and data analysis

Measuring gene expression (Microarrays) Ulf Leser

Analysis of gene expression data. Ulf Leser and Philippe Thomas

Methods, tools, and pipelines for analysis of Ion PGM Sequencer mirna and gene expression data

School of Nursing. Presented by Yvette Conley, PhD

Introduction Bioo Scientific

Introduction To Epigenetic Regulation: How Can The Epigenomics Core Services Help Your Research? Maria (Ken) Figueroa, M.D. Core Scientific Director

Go where the biology takes you. Genome Analyzer IIx Genome Analyzer IIe

Biological Sequence Data Formats

An example of bioinformatics application on plant breeding projects in Rijk Zwaan

BIO 3352: BIOINFORMATICS II HYBRID COURSE SYLLABUS

Analysis of ChIP-seq data in Galaxy

Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources

Visualisation tools for next-generation sequencing

The Future of the Electronic Health Record. Gerry Higgins, Ph.D., Johns Hopkins

Introduction To Real Time Quantitative PCR (qpcr)

July 7th 2009 DNA sequencing

SEQUENCING. From Sample to Sequence-Ready

Services. Updated 05/31/2016

Simplifying Data Interpretation with Nexus Copy Number

Genetic Analysis. Phenotype analysis: biological-biochemical analysis. Genotype analysis: molecular and physical analysis

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)

TGC AT YOUR SERVICE. Taking your research to the next generation

Comparing Methods for Identifying Transcription Factor Target Genes

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

treatments) worked by killing cancerous cells using chemo or radiotherapy. While these techniques can

The Human Genome Project. From genome to health From human genome to other genomes and to gene function Structural Genomics initiative

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

How Sequencing Experiments Fail

NGS Technologies for Genomics and Transcriptomics

Genomic Testing: Actionability, Validation, and Standard of Lab Reports

Complex Systems BioMedicine: Molecules, Signals, Networks, Diseases

Vad är bioinformatik och varför behöver vi det i vården? a bioinformatician's perspectives

Protein Synthesis How Genes Become Constituent Molecules

RNA-Seq Tutorial 1. John Garbe Research Informatics Support Systems, MSI March 19, 2012

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

Biological Sciences Initiative. Human Genome

Understanding West Nile Virus Infection

The RNA strategy. RNA as a tool and target in human disease diagnosis and therapy.

Using Galaxy for NGS Analysis. Daniel Blankenberg Postdoctoral Research Associate The Galaxy Team

Text file One header line meta information lines One line : variant/position

Validation and Replication

Analysis and Integration of Big Data from Next-Generation Genomics, Epigenomics, and Transcriptomics

Lecture 1 MODULE 3 GENE EXPRESSION AND REGULATION OF GENE EXPRESSION. Professor Bharat Patel Office: Science 2, b.patel@griffith.edu.

Nuevas tecnologías basadas en biomarcadores para oncología

BIOO LIFE SCIENCE PRODUCTS

From Reads to Differentially Expressed Genes. The statistics of differential gene expression analysis using RNA-seq data

TECHNOLOGIES, PRODUCTS & SERVICES for MOLECULAR DIAGNOSTICS, MDx ABA 298

LifeScope Genomic Analysis Software 2.5

Big data in cancer research : DNA sequencing and personalised medicine

Transcription:

RNAseq / ChipSeq / Methylseq and personalized genomics 7711 Lecture Subhajyo) De, PhD Division of Biomedical Informa)cs and Personalized Biomedicine, Department of Medicine University of Colorado School of Medicine September 11, 2014

Outline > RNA sequencing The biology of RNA: types of RNA, intron- exon structure, alterna)ve splicing Microarrays to quan)fy RNAs RNAseq protocols Quan)fica)on of RNA abundance Quality control in RNAseq data Microarrays vs RNAseq SoRware and their basic principles RNA edi)ng Chipseq / methylseq The biology of chroma)n histone modifica)ons, CpG methyla)on, nucleosome Quan)fica)on of abundance of chroma)n signature SoRware and their basic principles Combina)on of chroma)n marks Variant detec=on Germ line and soma)c muta)ons Quality control issues SoRware and their basic principles Genomics and Personalized Medicine

RNA sequencing

RNA sequencing > The biology of RNA: types of RNA, intron- exon structure, alterna)ve splicing Transcrip;on start site Intron Types of RNA: (a) mrnas: codes for protein/pep)de sequences (b) trnas: small RNA, necessary component of protein transla)on (c) micrornas: small RNA that plays a regulatory role in major biological processes; typically suppress expression of target genes. (d) Ribosomal RNA: small RNAs that are key components of ribosome (e) Long non- coding RNAs: Large mul)- exonic RNAs. Emerging reports link them to major biological processes (f) Other non- coding RNAs: such as circular RNA Kapranov, BMC Biology, 2010 Exon

RNA sequencing > Microarrays to quan)fy RNAs 1. Quan)le normaliza)on 2. RMA normaliza)on 3. GCRMA 4. Mas5 5. Lim et al. Bioinforma)cs. 2007

RNA sequencing > RNAseq protocol Removal of Ribosomal and other types of RNAs. PolyA selec)on or Ribo- minus. Sequencing using: Pyrosequencing (454 Technologies) Solexa sequencing (Illumina HiSeq) Sequencing by liga)on (SOLiD) Ion Torrent seminonductor sequencing Nanopore sequencing Single molecule real )me sequencing (PacBio) Read length: 50bp 10,000bp Sequencing volume: ~ million reads/sec Sequencing accuracy: 90% - 99.99%

RNA sequencing > Quan)fica)on of RNA abundance RNAseq has excellent reproducibility RPKM (Reads per kilo base per million) is a measure of expression level of a genomic en)ty. Mortazavi et al. Nature Methods. 2008 Paired end read: FPKM (fragments per kilobase of exon per million fragments mapped) RNAseq has excellent dynamic range

RNA sequencing > Quan)fica)on of RNA abundance Mortazavi et al. Nature Methods. 2008

RNA sequencing > Quality control in RNAseq data; Microarrays vs RNAseq RNAseq has excellent reproducibility i.e. low technical varia)on Mortazavi et al. Nature Methods. 2008 The technical varia)on typically follows Poisson distribu)on Marioni et al. Genome Res. 2008 There is considerable amount of biological varia)on Choy et al. PLoS Gene)cs. 2008 There is robust concordance between microarray and RNAseq data for the same sample. Mortazavi et al. Nature Methods. 2008

RNA sequencing > Quality control in RNAseq data GC content bias the RNA expression level The GC- bias is not always straight forward. Hansen et al. Biosta)s)cs. 2011 A C G T Hexamar priming bias RNA expression level Hansen et al. Biosta)s)cs. 2011 Fragment bias also affect paired end RNAseq data Roberts et al. Genome Biol. 2011

RNA sequencing > SoRware

RNA sequencing > SoRware Integra)ve Genome Viewer / Broad Ins)tute Differen)al expression Allelic expression Variant detec)on Alterna)ve splicing iden)fica)on Rapaport et al. Genome Biol. 2013

RNA sequencing > RNA edi)ng Li et al. Science. 2011 Pachter. Nature Biotech. 2012

RNA sequencing > Brain- storming S1: Untreated sample: 6 million reads S2: Treated sample: 4 million reads Depth of coverage plot How do you know if the RNAseq data looks ok? Differen)al expression MA plot

Chipseq / methylseq

Chipseq / methylseq > Chroma)n biology CpG methyla)on, histone modifica)ons, nucleosome Cremer & Cremer. Nature Reviews Genet. 2001

Chipseq / methylseq > Quan)fica)on of abundance of chroma)n signature Gu et al. Nature Protocols. 2011

Chipseq / methylseq > Quan)fica)on of abundance of chroma)n signature Mardis. Nature Methods. 2007 Zhao. Nature Immunology. 2011 Shah. Nature Methods. 2009

Chipseq / methylseq > SoRware and their basic principles Zhao. Nature Immunology. 2011

Chipseq / methylseq > SoRware and their basic principles These programs produce very different peaks in terms of peak size, number, and posi)on rela)ve to genes. Malone et al. PLoS One. 2011

Chipseq / methylseq > Combina)on of chroma)n marks Chroma)n marks are associated with one another Chrom- HMM Earnst and Kellis. Nature Methods. 2012

Chipseq / methylseq > Brainstorming How do you compare the peaks?

Variant detec=on (recap from previous lectures...)

Variant detec=on > Germ line and soma)c muta)ons Biesecker and Spinner. Nature Reviews Gene)cs. 2013 Haemophilia Type- II diabetes Early onset obesity Neurofibromatosis type 1 (NF1) Cancer Soma)c muta)ons are common in healthy human )ssues De. Trends Genet. 2011

Variant detec=on > Quality control issues An example: GATK pipeline (Probably covered in previous lecture...) McKenna et al. Genome Res. 2011

Variant detec=on > SoRware and their basic principles There is reasonable overlap among the muta)on calling sorware O Rawe et al. Genome Medicine. 2013

Variant detec=on > SoRware and their basic principles Detec)on of variants from RNAseq / Chipseq / methylseq data RNAseq:: GATK RNAseq variant calling workflow for calling muta)on from RNAseq data Methylseq:: BisSNP for detec)ng SNP from methylseq ChIPseq:: Simultaneous SNP iden)fica)on and assessment of allele- specific bias from ChIP- seq data (Ni et al. BMC Genomics. 2012) Most of the normal sorware with suitable improvisa)on. Detec)on of low frequency muta)ons in heterogeneous samples Mutect Mutsig JointSNVmix Many more...

Genomics and personalized Medicine

Genomics and Personalized Medicine > Systema)c discovery from clinical samples 100-800 samples from cancer pa)ents RNAseq, muta)on detec)on, methyla)on New cancer genes New drug- development New way to stra)fy pa)ents TCGA, Nature, 2012 Applica)on of genome sequencing to detect disease progression Relapsed Acute Myeloid Leukemia 8 pa)ents Primary and relapse samples Novel cancer gene muta)ons Ding et al, Nature, 2011 Clonal evolu)on as a result of chemotherapy Genome sequencing shaping diagnosis and treatment Welch et al, JAMA, 2011 Acute promyelocy)c leukemia One pa)ent 7 weeks for genome sequencing and analysis Ac)onable gene fusion detected Changed the treatment plan for the pa)ent

Contact details > Subhajyo= De, PhD Assistant Professor Department of Medicine, University of Colorado School of Medicine Anschutz Medical Campus Email: subhajyo).de@ucdenver.edu Phone: +1 303 724 6461 Webpage: hup://www.sjdlab.org/