Sequencing and microarrays for genome analysis: complementary rather than competing?



Similar documents
Focusing on results not data comprehensive data analysis for targeted next generation sequencing

Delivering the power of the world s most successful genomics platform

Leading Genomics. Diagnostic. Discove. Collab. harma. Shanghai Cambridge, MA Reykjavik

Targeted. sequencing solutions. Accurate, scalable, fast TARGETED

School of Nursing. Presented by Yvette Conley, PhD

Information leaflet. Centrum voor Medische Genetica. Version 1/ Design by Ben Caljon, UZ Brussel. Universitair Ziekenhuis Brussel

Single-Cell DNA Sequencing with the C 1. Single-Cell Auto Prep System. Reveal hidden populations and genetic diversity within complex samples

Invasive Prenatal (Fetal) Diagnostic Testing

Roberto Ciccone, Orsetta Zuffardi Università di Pavia

The following chapter is called "Preimplantation Genetic Diagnosis (PGD)".

CHROMOSOMES Dr. Fern Tsien, Dept. of Genetics, LSUHSC, NO, LA

Next Generation Sequencing: Technology, Mapping, and Analysis

Overview of Genetic Testing and Screening

G E N OM I C S S E RV I C ES

Data Analysis for Ion Torrent Sequencing

Factors for success in big data science

The National Institute of Genomic Medicine (INMEGEN) was

Rapid Aneuploidy and CNV Detection in Single Cells using the MiSeq System

Core Facility Genomics

Biomedical Big Data and Precision Medicine

Simplifying Data Interpretation with Nexus Copy Number

Go where the biology takes you. Genome Analyzer IIx Genome Analyzer IIe

The Human Genome Project. From genome to health From human genome to other genomes and to gene function Structural Genomics initiative

Innovations in Molecular Epidemiology

Just the Facts: A Basic Introduction to the Science Underlying NCBI Resources

The Power of Next-Generation Sequencing in Your Hands On the Path towards Diagnostics

Nazneen Aziz, PhD. Director, Molecular Medicine Transformation Program Office

Genetic diagnostics the gateway to personalized medicine

Gene mutation and molecular medicine Chapter 15

Human Genome Organization: An Update. Genome Organization: An Update

Assuring the Quality of Next-Generation Sequencing in Clinical Laboratory Practice. Supplementary Guidelines

Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company Chapter 8: Recombinant DNA 2002 by W. H. Freeman and Company

Chromosomes, Karyotyping, and Abnormalities (Learning Objectives) Learn the components and parts of a metaphase chromosome.

SEQUENCING INITIATIVE SUOMI (SISU) SYMPOSIUM SPEAKERS August 26, 2014

Single-Cell Whole Genome Sequencing on the C1 System: a Performance Evaluation

Becker Muscular Dystrophy

SEQUENCING. From Sample to Sequence-Ready

Genetic Testing: Scientific Background for Policymakers

DNA-Analytik III. Genetische Variabilität

Umm AL Qura University MUTATIONS. Dr Neda M Bogari

European Genome-phenome Archive database of human data consented for use in biomedical research at the European Bioinformatics Institute

ITT Advanced Medical Technologies - A Programmer's Overview

European registered Clinical Laboratory Geneticist (ErCLG) Core curriculum

RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison

Human Genome and Human Genome Project. Louxin Zhang

MEDICAL GENETICS GENERAL OBJECTIVE SPECIFIC OBJECTIVES

NGS and complex genetics

A Primer of Genome Science THIRD

Introduction to transcriptome analysis using High Throughput Sequencing technologies (HTS)

The Human Genome Project

How many of you have checked out the web site on protein-dna interactions?

The genetic screening of preimplantation embryos by comparative genomic hybridisation

SNP Essentials The same SNP story

DNA Mapping/Alignment. Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky

Genomes and SNPs in Malaria and Sickle Cell Anemia

MUTATION, DNA REPAIR AND CANCER

How Can Institutions Foster OMICS Research While Protecting Patients?

BRCA1 / 2 testing by massive sequencing highlights, shadows or pitfalls?

escience and Post-Genome Biomedical Research

SeqScape Software Version 2.5 Comprehensive Analysis Solution for Resequencing Applications

Introduction To Epigenetic Regulation: How Can The Epigenomics Core Services Help Your Research? Maria (Ken) Figueroa, M.D. Core Scientific Director

From Immunotherapy of Cancer to the Discovery of Kidney Cancer Genes

Next Generation Sequencing

Molecular typing of VTEC: from PFGE to NGS-based phylogeny

Accelerate genomic breakthroughs in microbiology. Gain deeper insights with powerful bioinformatic tools.

Lecture 6: Single nucleotide polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)

Systematic discovery of regulatory motifs in human promoters and 30 UTRs by comparison of several mammals

An example of bioinformatics application on plant breeding projects in Rijk Zwaan

History of DNA Sequencing & Current Applications

Special Report: acgh for the Genetic Evaluation of Patients with Developmental Delay/ Mental Retardation or Autism Spectrum Disorder

Molecular Genetics: Challenges for Statistical Practice. J.K. Lindsey

Disease gene identification with exome sequencing

Revision of the Directive 98/79/EC on In Vitro Diagnostic Medical Devices. Response from Cancer Research UK to the Commission August 2010

Preimplantation Genetic Diagnosis (PGD) and Childhood Diagnostic Evaluation

Clinical Research Infrastructure

The Future of the Electronic Health Record. Gerry Higgins, Ph.D., Johns Hopkins

BlueFuse Multi Analysis Software for Molecular Cytogenetics

Fluorescence in situ hybridisation (FISH)

Survey of clinical data mining applications on big data in health informatics

Heritability: Twin Studies. Twin studies are often used to assess genetic effects on variation in a trait

Overview of Next Generation Sequencing platform technologies

ncounter Leukemia Fusion Gene Expression Assay Molecules That Count Product Highlights ncounter Leukemia Fusion Gene Expression Assay Overview

This fact sheet describes how genes affect our health when they follow a well understood pattern of genetic inheritance known as autosomal recessive.

Preimplantation genetic diagnosis new method of screening of 24 chromosomes with the Array CGH method...2

Integration of Genetic and Familial Data into. Electronic Medical Records and Healthcare Processes

Cancer Genomics: What Does It Mean for You?

ACMG clinical laboratory standards for next-generation sequencing

CANCER RESEARCH UK STRATIFIED MEDICINE PROGRAMME IAN WALKER PHD, MBA HEAD OF STRATIFIED MEDICINE

Cloud-Based Big Data Analytics in Bioinformatics

Algorithms in Computational Biology (236522) spring 2007 Lecture #1

Six Trends in Robotics in the Life Sciences

Genomic Medicine The Future of Cancer Care. Shayma Master Kazmi, M.D. Medical Oncology/Hematology Cancer Treatment Centers of America

March 19, Dear Dr. Duvall, Dr. Hambrick, and Ms. Smith,

Workshop on Establishing a Central Resource of Data from Genome Sequencing Projects

A leader in the development and application of information technology to prevent and treat disease.

Analysis of gene expression data. Ulf Leser and Philippe Thomas

Transcription:

Sequencing and microarrays for genome analysis: complementary rather than competing? Simon Hughes, Richard Capper, Sandra Lam and Nicole Sparkes Introduction The human genome is comprised of more than 3 billion base pairs with greater than 99% being identical between any two unrelated people. The remaining 1% contains a mixture of sequence variants that range in size from a single base (single nucleotide polymorphisms or SNPs) to indels (insertions/deletions) >1bp <1kb and copy number variations (CNVs) comprising duplications or deletions greater than 1kb in length 1. Many sequence variants have no associated disease phenotype whilst others, which include inherited and de novo changes, can predispose people to diseases such as autoimmune disease 2, asthma 3, schizophrenia 4, obesity 5 as well as a variety of cancers 6-8. The development of genome analysis technologies such as DNA microarrays and next generation sequencing (NGS) has provided the researcher with the unique ability to screen for sequence variants of clinical relevance. Although DNA microarrays and NGS might be viewed as competing platforms, this article examines how exome and targeted sequencing is being implemented in biomedical laboratories and how NGS and microarrays could complement each other to address a range of biological questions. Applications of genome analysis Array comparative genomic hybridisation (acgh) platforms, which allow the detection of known and de novo CNVs present in a cell or tissue, play an important role in genome analysis and have had a major impact on the diagnosis of genetic disorders, accelerating CNV discovery for many diseases 9. Genome-wide association studies (GWAS) using oligonucleotide acgh are considered the gold-standard for CNV detection 10. In a clinical setting they have been key for identifying novel disease loci 11 and recent data suggest that they will have a pivotal role in prenatal diagnosis 12. NGS offers an alternative approach for genome analysis, providing single base resolution that has permitted the successful identification of causal mutations for a number of monogenic disorders 13-15 as well as for cancer 16,17. Choice of platform for genome analysis As both microarrays and NGS offer the potential for discovery of sequence variants, when selecting a platform it is important to consider some initial questions. 1. Which platform is the best fit for addressing a particular research objective? 2. Which platform is most appropriate with respect to sample throughput and cost? 3. What data analysis is required? Which platform is the best fit for addressing a particular research objective? Microarrays are an established technology and can routinely detect aneuploidy, unbalanced chromosomal rearrangements, subchromosomal deletions or duplications, loss of heterozygosity and SNPs (Table 1). The power of microarrays for detecting such variants comes from the density, coverage and genomic distribution of oligonucleotides on the array. This is of particular relevance clinically and can be addressed by utilising high-density genome-wide array designs or designs combining probes for specific focus regions with lower density probes covering the genomic backbone. NGS offers the ability to detect the sequence variants outlined above but can also be used to screen for copy-neutral variants (e.g. balanced chromosomal

inversions or translocations), indels or single base variants (e.g. point mutations) (Table 1). NGS provides the user with the capability to scan for disease causing variants without a priori sequence information. When planning a NGS experiment there are several components to consider, including protocol (whole exome, custom sequencing, etc.), library preparation, capture method, instrumentation, coverage, etc., which are all essential to ensure that the desired information is obtained from a sequencing run. Table 1: DNA sequence variants detected by microarrays and NGS Sequence variant Microarray Aneuploidy Balanced chromosomal rearrangements Unbalanced chromosomal rearrangements Sub chromosomal deletions/ duplications Loss of heterozygosity Whole exome/ Custom sequencing SNPs Indels Control Consortium (WTCCC) CNV study 18. A major factor determining the cost of microarray processing is the array format utilised. The facility for parallel processing of samples on a single microarray slide allows significant time and cost savings to be made. In contrast, whole genome sequencing can be more costly with a long turn-around time. To address this, more focussed sequencing approaches can be applied: Whole exome sequencing, which focuses on just the 1.5% of the human genome corresponding to gene encoding regions that contain approximately 85% of diseasecausing mutations 19. Custom sequencing, to target specific region(s) of interest (ROI) ranging from 0.2 34 megabases. Focusing in on one or more ROI will enable increased depth of coverage for those regions and increased confidence for detecting causal mutations. These approaches present an attractive area for NGS diagnostic development and offer the advantage of shorter turn-around time, reduced sequencing costs, multiplexing and the potential to study larger numbers of patients. However, when considering NGS an awareness of hidden costs is important, as the data analysis hardware and infrastructure requirements as well as experienced bioinformaticians can significantly increase the overall price tag 20. What data-analysis is required? Which platform is most appropriate with respect to sample throughput and cost? The numbers of samples that can be processed using either microarrays or NGS vary; however, the most important points to consider are the labour, time and cost involved in sample analysis. Microarrays enable parallel analysis of large numbers of samples and thus offer the potential to classify patient cohorts. As an example of the throughput achievable using microarrays, Oxford Gene Technology (OGT) recently used its high-throughput technology to process 20,000 samples in 20 weeks as part of the Wellcome Trust Case Not all sequence variants detected in a microarray or NGS experiment are necessarily relevant, as the presence of sequence variants in the healthy population indicates that many of these will be benign rather than disease causing 21. The analysis tools for distinguishing relevant from irrelevant variants from microarray data are well developed and some service providers (e.g. OGT) make these tools available. However, for NGS the identification of important sequence variants can represent a serious obstacle for researchers due to; (i) the volume of data generated and (ii) relatively slow development of standardised off-the-shelf data analysis tools. This is of critical importance as incorrect data

analysis, due to problems with base calling, alignment and assembly, may result in clinically relevant information being missed. A robust analytical pathway for data analysis, such as that developed by OGT (Figure 1), can circumvent these problems and accelerate progress in understanding the biological meaning of complex NGS data. variants with biomedical relevance 15. This information could then be used to generate new diagnostic arrays or add additional content to existing diagnostic arrays. The correct combination is essential to ensure that the most information is obtained with a careful balance needed between cost and information required. Figure 1: Translating data into information with OGT s advanced analysis pipeline Microarrays or Sequencing? At present no single platform, either microarrays or NGS, can identify all sequence variants within the genome. Although both platforms function perfectly well in isolation each offers complementary qualities that can, in combination, be used to identify and screen for known or de novo sequence variants. The exact order in which the platforms are used depends on the types of questions that need to be answered. 1. If the requirement is to screen a large number of samples to identify a particular subset or genomic region 7 for more comprehensive analysis, microarrays will be more effective for screening followed by sequencing. 2. If the goal is discovery, sequencing could be used to identify sequence About OGT Oxford Gene Technology (OGT) has a proven track record in providing high-quality genomic analysis technology and services, which are designed to lead the researcher all the way from project conception to highquality results. To enable this, we offer our expertise and experience to help implement the most effective solution for your study and budget. Our Genefficiency high-throughput array service provides researchers with the choice of standard or custom arrays to achieve maximum resolution over either the whole genome or particular regions of interest, while our Genefficiency NGS service offers a tailored approach to targeted sequencing, providing the correct combination of components to deliver high-quality results.

Our team of highly experienced project scientists have a strong scientific background and therefore understand the importance of the biology underlying each experiment. By leveraging the skills available at OGT, researchers can focus on the biology rather than the technical aspects and data analysis. To find out how we can help advance your genomic research, visit www.ogt.co.uk/genefficiency or contact us on +44 (0)1865 856826. References 1. Gökçümen, O. and Lee, C. (2009) Copy number variants (CNVs) in primate species using array-based comparative genomic hybridization. Methods 49, 18-25 2. Fanciulli, M. et al (2007) FCGR3B copy number variation is associated with susceptibility to systemic, but not organspecific, autoimmunity. Nature Genetics 39, 721-723 3. Brasch-Andersen, C. et al (2004) Possible gene dosage effect of glutathione- S-transferases on atopic asthma: using realtime PCR for quantification of GSTM1 and GSTT1 gene copy numbers. Human Mutation 24, 208-214 4. Walsh, T. et al (2008) Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539-543 5. Walters, R.G. et al (2010) A new highly penetrant form of obesity due to deletions on chromosome 16p11.2. Nature 463, 671-675 6. Hughes, S. et al (2006) The use of whole genome amplification to study chromosomal changes in prostate cancer: insights into genome-wide signature of preneoplasia associated with cancer progression. BMC Genomics 7, 65 7. Ernst, T. et al (2010) Transcription factor mutations in myelodysplastic/ myeloproliferative neoplasms. Haematologica 95, 1473-1480 8. Dyrsø, T. et al (2011) Identification of chromosome aberrations in sporadic microsatellite stable and unstable colorectal cancers using array comparative genomic hybridization. Cancer Genetics 204, 84-95 9. Shaffer, L.G. et al (2007) The identification of microdeletion syndromes and other chromosome abnormalities: cytogenetic methods of the past, new technologies for the future. American Journal of Medical Genetics 145C, 335-345 10. Carter, N.P. (2007) Methods and strategies for analyzing copy number variation using DNA microarrays. Nature Genetics 39 (7 suppl), S16-S21 11. Slavotinek, A.M. (2008) Novel microdeletion syndromes detected by chromosome microarrays. Human Genetics 124, 1-17 12. Kleeman, L. et al (2009) Use of array comparative genomic hybridization for prenatal diagnosis of fetuses with sonographic anomalies and normal metaphase karyotype. Prenatal Diagnosis 29, 1213-1217 13. Choi, M. et al (2009) Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proceedings of the National Academy of Sciences of the United States of America 106, 19096 19101 14. Ng, S.B. et al (2010) Exome sequencing identifies the cause of a mendelian disorder. Nature Genetics 42, 30 35 15. Ng, S.B. et al (2009) Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272 276 16. Wei, X. et al (2011) Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nature Genetics 43, 442-446 17. Yan, X.J. et al (2011) Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nature Genetics 43, 309-315

18. Conrad, D.F. et al (2010) Origins and functional impact of copy number variation in the human genome. Nature 464, 704-712 19. Choi, M. et al (2010) Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proceedings of the National Academy of Sciences of the United States of America 106, 19096-19101. 20. McPherson, J.D. (2009) Next-generation gap. Nature Methods Supplement 6, S2-S5. 21. MacArthur, D.G. and Tyler-Smith C. (2010) Loss-of-function variants in the genomes of healthy humans. Human Molecular Genetics 19, R125-R130. Oxford Gene Technology Begbroke Science Park, Sandy Lane, Yarnton, Oxford OX5 1PF United Kingdom T: +44 (0)1865 856826 F: +44 (0)1865 848684 www.ogt.co.uk This document and its contents are Oxford Gene Technology IP Limited - May 2011 ESHG special. All rights reserved. OGT, Genefficiency and Oxford Gene Technology are trademarks of Oxford Gene Technology IP Limited.