Core Facility Genomics

Similar documents
How many of you have checked out the web site on protein-dna interactions?

Go where the biology takes you. Genome Analyzer IIx Genome Analyzer IIe

G E N OM I C S S E RV I C ES

A Primer of Genome Science THIRD

School of Nursing. Presented by Yvette Conley, PhD

GenomeStudio Data Analysis Software

TECHNOLOGIES, PRODUCTS & SERVICES for MOLECULAR DIAGNOSTICS, MDx ABA 298

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS)

Discovery and Quantification of RNA with RNASeq Roderic Guigó Serra Centre de Regulació Genòmica (CRG)

SEQUENCING. From Sample to Sequence-Ready

Next Generation Sequencing: Technology, Mapping, and Analysis

Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation

GenomeStudio Data Analysis Software

Focusing on results not data comprehensive data analysis for targeted next generation sequencing

Comparative genomic hybridization Because arrays are more than just a tool for expression analysis

Genotyping by sequencing and data analysis. Ross Whetten North Carolina State University

NGS and complex genetics

Overview of Next Generation Sequencing platform technologies

TGC AT YOUR SERVICE. Taking your research to the next generation

Introduction To Real Time Quantitative PCR (qpcr)

Data Analysis for Ion Torrent Sequencing

Genetic Analysis. Phenotype analysis: biological-biochemical analysis. Genotype analysis: molecular and physical analysis

Analysis of gene expression data. Ulf Leser and Philippe Thomas

Frequently Asked Questions Next Generation Sequencing

Biomedical Big Data and Precision Medicine

AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE

MiSeq: Imaging and Base Calling

LifeScope Genomic Analysis Software 2.5

Epigenetic variation and complex disease risk

Molecular Genetics: Challenges for Statistical Practice. J.K. Lindsey

PreciseTM Whitepaper

Computational Genomics. Next generation sequencing (NGS)

Using Illumina BaseSpace Apps to Analyze RNA Sequencing Data

Challenges associated with analysis and storage of NGS data

Genomes and SNPs in Malaria and Sickle Cell Anemia

Single-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples

Next generation DNA sequencing technologies. theory & prac-ce

Delivering the power of the world s most successful genomics platform

Targeted. sequencing solutions. Accurate, scalable, fast TARGETED

Standards, Guidelines and Best Practices for RNA-Seq V1.0 (June 2011) The ENCODE Consortium

BRCA1 / 2 testing by massive sequencing highlights, shadows or pitfalls?

Validation and Replication

Sequencing and microarrays for genome analysis: complementary rather than competing?

Consistent Assay Performance Across Universal Arrays and Scanners

Advances in RainDance Sequence Enrichment Technology and Applications in Cancer Research. March 17, 2011 Rendez-Vous Séquençage

Disease gene identification with exome sequencing

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

ONLINE SUPPLEMENTAL MATERIAL. Allele-Specific Expression of Angiotensinogen in Human Subcutaneous Adipose Tissue

Simplifying Data Interpretation with Nexus Copy Number

The Future of the Electronic Health Record. Gerry Higgins, Ph.D., Johns Hopkins

Next Generation Sequencing

Introduction To Epigenetic Regulation: How Can The Epigenomics Core Services Help Your Research? Maria (Ken) Figueroa, M.D. Core Scientific Director

SNPbrowser Software v3.5

New generation sequencing: current limits and future perspectives. Giorgio Valle CRIBI - Università di Padova

Appendix 2 Molecular Biology Core Curriculum. Websites and Other Resources

Introduction to NGS data analysis

Data deluge (and it s applications) Gianluigi Zanetti. Data deluge. (and its applications) Gianluigi Zanetti

The Power of Next-Generation Sequencing in Your Hands On the Path towards Diagnostics

TruSeq Custom Amplicon v1.5

University of Glasgow - Programme Structure Summary C1G MSc Bioinformatics, Polyomics and Systems Biology

Overview of Genetic Testing and Screening

Globally, about 9.7% of cancers in men are prostate cancers, and the risk of developing the

Introduction Bioo Scientific

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications

Control of Gene Expression

Accelerate genomic breakthroughs in microbiology. Gain deeper insights with powerful bioinformatic tools.

Leading Genomics. Diagnostic. Discove. Collab. harma. Shanghai Cambridge, MA Reykjavik

RT 2 Profiler PCR Array: Web-Based Data Analysis Tutorial

Introduction to next-generation sequencing data

Information leaflet. Centrum voor Medische Genetica. Version 1/ Design by Ben Caljon, UZ Brussel. Universitair Ziekenhuis Brussel

Bioruptor NGS: Unbiased DNA shearing for Next-Generation Sequencing

Frequently Asked Questions (FAQ)

GeneSifter: Next Generation Data Management and Analysis for Next Generation Sequencing

Next Generation Sequencing: Adjusting to Big Data. Daniel Nicorici, Dr.Tech. Statistikot Suomen Lääketeollisuudessa

Services. Updated 05/31/2016

DNA-Analytik III. Genetische Variabilität

Nazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office

IKDT Laboratory. IKDT as Service Lab (CRO) for Molecular Diagnostics

Mir-X mirna First-Strand Synthesis Kit User Manual

SMRT Analysis v2.2.0 Overview. 1. SMRT Analysis v SMRT Analysis v2.2.0 Overview. Notes:

A Streamlined Workflow for Untargeted Metabolomics

2. True or False? The sequence of nucleotides in the human genome is 90.9% identical from one person to the next. False (it s 99.

Technical Note. Roche Applied Science. No. LC 18/2004. Assay Formats for Use in Real-Time PCR

Shouguo Gao Ph. D Department of Physics and Comprehensive Diabetes Center

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)

Gene Expression Analysis

OriGene Technologies, Inc. MicroRNA analysis: Detection, Perturbation, and Target Validation

Interaktionen von RNAs und Proteinen

Cloud-Based Big Data Analytics in Bioinformatics

Factors for success in big data science

Description: Molecular Biology Services and DNA Sequencing

INTERNATIONAL CONFERENCE ON HARMONISATION OF TECHNICAL REQUIREMENTS FOR REGISTRATION OF PHARMACEUTICALS FOR HUMAN USE E15

14.3 Studying the Human Genome

The Human Genome Project

Transcription:

Core Facility Genomics versatile genome or transcriptome analyses based on quantifiable highthroughput data ascertainment 1

Topics Collaboration with Harald Binder and Clemens Kreutz Project: Microarray Validation of Cardiovascular Risk Factors (NGFN) Principal Investigator: Prof. Dr. G. Walz Collaborator: Prof. Dr. R. Baumeister / Prof. Dr. J. Timmer Technologies and application spectra of the Core Facility Genomics (Sample processing / Data analysis (QC) / Statistical analysis) 2

Workflow Population with high risk for CVD Dialysis Patients Expression Profiling (Primary Screen) Correlate the abundance of these candidates with the occurrence of CVD. (Tertiary Screen) Risk Factors for Cardiovascular Disease (CVD) Candidate genes that affect the life span of C. Elegans will be subjected to further analysis. Candidate Genes RNAi approach to inhibit candidate genes in C. Elegans. (Secondary Screen) 3

Study Population Collecting blood samples from 324 patients RNA Isolation Follow up study Data acquisition of all relevant clinical data 4

Printing and Hybridization Printing of the human Unigene Set - RZPD 3.1 (RZPD-Nr. 983) 37 358 human cdna clones two slides Amplification of the RNA samples using the Ambion Kit (arna) Sample RNA labelling = Cy3 (green) Universal Reference (Human) = Cy5 (red) log-ratio of measured Cy3/Cy5 intensities 5

Stringency Criteria Clemens Kreutz, Institute of Physics Starting with 12 103 992 data points Normalization and flagging based on the Genedata Software (Refiner) Filter criteria (stringent): - Rejection of saturated data points / genes (Intensity > 65 000) - Rejection of genes with more than 50% missing data (samples) - Signal / Noise Ratio > 2 (Foreground > Background) - Maximum relative Error < 0.2 (standard deviation / mean x number of pixels) 2 882 888 data points do not fulfil the stringency criteria 6

Grouping based on clinical covariates Group-1 living (221) versus died (100) Group-2 no coronary heart disease (117) versus coronary heart disease (160) Group-3 living with no coronary heart disease (97) versus died by coronary heart disease (30) 7

Number of regulated Genes Group-1 (living versus died) fold > 1.5 = 20 fold < 1.5 = 15 p-value < 0.05 and fold > 1.5 = 14 p-value < 0.05 and fold < 1.5 = 9 Group-2 (no CHD versus CHD) fold > 1.5 = 9 fold < 1.5 = 11 p-value < 0.05 and fold > 1.5 = 2 p-value < 0.05 and fold < 1.5 = 5 Group-3 (living with no CHD versus died by CHD) fold > 1.5 = 96 fold < 1.5 = 17 p-value < 0.05 and fold > 1.5 = 78 p-value < 0.05 and fold < 1.5 = 13 8

Comparing different kinds of Cases!!! Cardiovascular Disease (CVD) Polygenic Multifactorial Disease Dialysis Patients Population with high risk for CVD Epigenetics Genetics Environment One and the same clinical phenotype in two patients (e.g. CHD) but different history!!! Different history could lead to inhomogeneous expression profile within two patients with the same clinical phenotype 9

Predictive Modeling / Harald Binder, FDM Subdivision of the study population based on risk strata into low risk (25%), middle risk (50%) and high risk (25%) individuals low risk (81) versus high risk (81) fold > 1.5 = 602 fold < 1.5 = 281 p-value < 0.05 and fold > 1.5 = 579 p-value < 0.05 and fold < 1.5 = 276 p-fdr < 0.05 and fold > 1.5 = 567 p-fdr < 0.05 and fold < 1.5 = 276 Grouping based on survival analyses lead to more homogenous groups!!! 10

Technologies and application spectra of the Core Facility Genomics Genome Transkriptome Proteome Organism 11

Printing microarrays cdna Library Printing 75µm/250nl 12

Agilent Platform Genome Analysis - Tiling Arrays: 1 x 244K, 2 x 105K, 4 x 44K, 8 x 15K - Comparative Genomic Hybridization (CGH) - Copy Number Variation (CNV) - ChIP-on-chip (e.g. DNA methylation, transcription factor binding sites) - Custom Arrays Trancriptome Analyses - Whole Genome Expression for every known or partly known genome - Micro RNAs - Splice Variants - Custom Arrays 13

Agilent Platform Scanner upgrade Scanner resolution after upgrade: 2µm, 3µm + 5µm, 10µm Available tiling arrays formats after upgrade: 1 x 1M, 2 x 400K, 4 x 180K, 8 x 60K + 1 x 244K, 2 x 105K, 4 x 44K, 8 x 15K 14

iscan (BeadStation) Illumina Genome Analysis - Comparative Genomic Hybridization (CGH) - Copy Number Variation (CNV) - ChIP-on-chip (e.g. DNA methylation, transcription factor binding sites) - Custom Arrays - Whole Genome Expression - Micro RNAs - Splice Variants - Custom Arrays Trancriptome Analyses 15

iscan (BeadStation) Illumina SNP Genotyping SNP-CGH: Loss of Heterozygosity (LOH) Copy neutral LOH (e.g uniparental disomy) Flexible custom genotyping for fine mapping studies or diagnostics: + 96, 384, 768, 1536, 3072.. 60.000 SNPs per sample 16

iscan (BeadStation) Illumina Whole Genome Association Studies with up to 1 Million SNPs Haplotype tag (tag) SNPs capture most of the haplotype structure / variation within an haplotype block International HAPMAP Project: tag SNPs that uniquely identify common haplotypes INCREASE STATISTICAL POWER New England Journal of Medicine; Medical Uses of Statistics; 3"'Edition; Identifying Disease Genes in Association Studies (Dan L. Nicolae, Thorsten Kurz, and Carole Ober). Barrett JC, Cardon LR (2006) Evaluating coverage of genome-wide association studies. Nat Genet 38:659-62 17

Genome Analyzer IIx Illumina (Solexa) Deep Sequencing Principle 18

Genome Analyzer IIx Illumina (Solexa) Deep Sequencing / Dimensions Flow Cell 17mm x 66mm 1.4mm wide channels 1000 molecules per 1µm cluster = up to 150.000 clusters per tile = 120 tiles per lane Per Flow Cell: 8 lanes = 960 tiles = 1 50.000.000 clusters = 120.000.000 reads = 14-18 GB high-quality output (2x75bp) = 5TB raw data = 9.5 days run time 19

Genome Analyzer IIx Illumina (Solexa) Deep Sequencing / Applications Category Complete genome resequencing Reduced representation sequencing Targeted genomic resequencing Paired end sequencing Metagenomic sequencing Transcriptome sequencing Small RNA sequencing Sequencing of bisulfite-treated DNA Chromatin immunoprecipitation sequencing (ChIP-Seq) Nuclease fragmentation and sequencing Molecular barcoding Examples of applications Comprehensive polymorphism and mutation discovery in individual human genomes Large-scale polymorphism discovery Targeted polymorphism and mutation discovery Discovery of inherited and acquired structural variation Discovery of infectious and commensal flora Quantification of gene expression and alternative splicing; transcript annotation; discovery of transcribed SNPs or somatic mutations microrna profiling Determining patterns of cytosine methylation in genomic DNA Genome-wide mapping of protein-dna interactions Nucleosome positioning Multiplex sequencing of samples from multiple individuals 20

The Core of the Core Facility Genomics 21