Delivering the power of the world's most successful genomics platform

NextCODE Health is bringing the full power of the world's largest and most successful genomics platform to everyday clinical care.

NextCODE provides a comprehensive system to transform genomics data into medical understanding. We deliver fast, accurate, and actionable genomics insights to enable sequence-based diagnosis of diseases and improve patient care. Our system offers an automation standard that enables both public and private institutions to develop their best practices in bringing genetics into day-to-day medical use. Our integrated platform is designed for rapid mutation detection and clinical diagnostic applications using next-generation sequencing, novel bioinformatics, and proven data management technologies.

[Figure: platform overview in three stages, INPUT, PROCESSING, and RESULT. Diagram labels: Patient Samples; Clinical Grade NGS; Legacy Patient Data; Alignment & Variation Calling; Genome Interpretation CSA System; GOR Database Architecture; Big Data Solutions; Dynamic & Scalable Data Storage & Retrieval; Mutation Discovery; Sequence Miner for Deeper Analysis; Patient Data Management System; Clinical Interface; Rapid Integration & Customization; NextCODE Knowledge Base (40M validated variants annotated from >350K whole genomes, curated public data). Caption: NextCODE Health offers an end-to-end solution for rapid patient diagnosis.]

Our Platform: A Best-in-Class, End-to-End Solution for Rapid Mutation Detection

Key Benefits

A proven technology platform. Built over 16 years at deCODE genetics to meet the demands of datasets that are orders of magnitude larger than any found in public or private institutions, this unique platform has handled, analyzed, and stored whole genome sequence data from more than 350,000 individuals representing more than 40 million identified variants. In addition, the platform has proven scalability for storage and data queries, and has been used to power more than 350 original publications in gene discovery, diagnostics, and medical applications.

An integrated system backed by deCODE's best practices. The system includes clinical next-generation sequencing, calibrated data analysis, effective genome interpretation, bioinformatics, and IT systems to enable the streamlined management of hundreds of thousands of patients from samples to mutation detection and confirmation. These solutions represent deCODE's best practices derived from successful genetic discoveries using linkage, genome-wide association, and next-generation sequencing.

Reduce False Positives. We start from raw data, not VCF files, and run them using our analysis pipeline calibrated with data from thousands of whole genomes. This approach minimizes common alignment errors generated from other pipelines using sub-optimized parameters, resulting in significant reduction of false positives.

Gain Clinical Insights on Cases without a Diagnosis. With 40 million validated variants to add to those in the public domain, we enable detection and confirmation of high-impact variants.

NextCODE Knowledge Base
- 40 million validated variants annotated from >350,000 whole genomes
- 1.5 million indels and 6,600 loss-of-function mutations over 4,800 genes
- Variants with low allelic frequencies down to 0.01%
- Complements public databases, such as 1000 Genomes or the NHLBI exome project
- More samples and greater breadth and depth of sequence coverage

Our knowledge base enables users to quickly confirm rare mutations and filter out common variants that are unlikely to be high-impact disease variants.

Rapid Confirmation. Our Genomic Ordered Relational (GOR) architecture facilitates instantaneous viewing of any candidate mutation at the sequence read level, increasing confidence via on-the-fly visual confirmation.

Large-Scale Data Storage and Management. Raw data, analyzed data, and associated annotations can be stored in the GOR database tied to our Clinical Sequence Analyzer (CSA). No expensive hardware is required.

Pay-As-You-Go Pricing. Our pricing is affordable, enabling you to analyze, interpret, store, and manage genomics data without the need to invest in internal IT infrastructure.
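The frequency-based filtering mentioned above (confirming rare mutations and setting aside common variants) can be illustrated with a minimal Python sketch. This is not NextCODE's implementation: the `Variant` record, the `filter_rare` function, the 1% default threshold, and the example data are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions of this sketch."""
    chrom: str
    pos: int
    ref: str
    alt: str
    allele_freq: float  # fraction of alleles in a reference cohort, 0.0 to 1.0


def filter_rare(variants, max_freq=0.01):
    """Keep variants whose cohort allele frequency is at or below max_freq."""
    return [v for v in variants if v.allele_freq <= max_freq]


# Invented example data: coordinates and frequencies are not real observations.
candidates = [
    Variant("chr1", 1_000_100, "A", "G", 0.25),    # common, filtered out
    Variant("chr2", 2_000_200, "C", "T", 0.0001),  # 0.01%, kept
    Variant("chr7", 7_000_300, "G", "A", 0.004),   # 0.4%, kept
]

rare = filter_rare(candidates, max_freq=0.01)
print([(v.chrom, v.pos) for v in rare])
```

The usefulness of such a filter depends on how many genomes the frequency estimates come from; a frequency as low as 0.01% can only be estimated from a cohort of many thousands of individuals.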

NextCODE's Comprehensive Products and Services

Upstream: Sequencing

NextCODE provides next-generation sequencing services through deCODE's CLIA-, CAP-, and ISO 13485-certified laboratory. These accreditations are an assurance of the highest sequencing data quality standards. We also provide a variety of competitively-priced next-generation sequencing services that do not require CLIA-grade sequencing.

NextCODE provides clinical sequencing services through deCODE's CLIA-, CAP-, and ISO 13485-certified facility.

Mid-stream: Analysis

NextCODE accepts legacy data from our clients. We use raw sequence read files for alignment and variant calling, which are calibrated for optimal results using the large cohorts sequenced by deCODE and NextCODE. Problematic calls are flagged or filtered leveraging our extensive knowledge of the unstable regions of the human genome.

Benefits:
- Provides maximum yield of real variants by reducing both false positives and false negatives.
- Optimally-calibrated indel caller provides higher sensitivity and specificity.
- Tags or filters regions that are unstable to help users prioritize validation efforts.
- Enables instantaneous visualization of raw sequence data for confirmation of variants.

Downstream: Genome Interpretation

NextCODE's Clinical Sequence Analyzer (CSA) system enables users to quickly analyze their data through a clinically intuitive interface. The system operates on a petabyte scale using our GOR architecture, providing massively parallel ad hoc query and data analysis capabilities.

Features:
- Custom candidate gene lists can be generated by users with phenotype-gene tools to complement the standard gene lists already included.
- Variant annotations can be stratified according to frequency, VEP class, predicted functional effects of missense variants, inheritance patterns, and effects on phenotypes, mortality, or expression.
- Potential pathogenic mutations are rapidly confirmed by an instant visualization of aligned sequence reads and cross-checked with a continually updated, curated variation knowledge base made up of public domain, deCODE, and NextCODE data.
- User-friendly physician interfaces designed by clinicians enable users to manage their patient data, generate final reports, and notate diagnoses all within the same system, supporting the entire clinical team.

Our CSA collaborative analysis features allow simultaneous access to patient sequence data by multiple users in real time, enabling sharing of analytical results, comments, and annotations. These features greatly facilitate collaborative genome interpretation where multiple users can create studies, and edit and share results and clinical reports within their institution or with collaborators around the world.

Benefits:
- A clinically intuitive workflow enables users to rapidly analyze genomes, exomes, or transcriptomes, leading to de novo and rare mutation detection.

Figure: Instantaneous visualization of raw sequence reads. This example shows that by hitting the Father BAM button, the NextCODE Genome Browser instantaneously loads BAM sequences enabling users to check the assembly and variant calls between the affected patient and the father. This detailed display of raw reads provides rapid visual confirmation that the father is a heterozygous carrier of the mutant C and his affected child is homozygous for the mutation. Similar analysis can be done for other members of the family and even for large cohorts.
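The family comparison in the example above (a heterozygous carrier father and a homozygous affected child) amounts to a simple genotype check. The sketch below shows one way to express it in Python, assuming VCF-style genotype strings such as "0/1" where 0 is the reference allele and 1 the alternate allele. The function names are hypothetical, and this is only an illustration of the logic, not the way the CSA system or Genome Browser performs the check.

```python
def alt_allele_count(genotype):
    """Count alternate alleles in a diploid genotype string such as '0/1' or '1|1'."""
    alleles = genotype.replace("|", "/").split("/")
    return sum(1 for allele in alleles if allele == "1")


def fits_recessive_carrier_pattern(child, father, mother=None):
    """Return True if the child is homozygous for the alternate allele
    and every genotyped parent is a heterozygous carrier."""
    if alt_allele_count(child) != 2:
        return False
    parents = [p for p in (father, mother) if p is not None]
    return all(alt_allele_count(p) == 1 for p in parents)


# Pattern from the example: heterozygous father, homozygous affected child.
print(fits_recessive_carrier_pattern(child="1/1", father="0/1"))
# A father who carries no copy of the variant does not fit the pattern.
print(fits_recessive_carrier_pattern(child="1/1", father="0/0"))
```

A genotype call alone can be wrong, which is why the brochure stresses visual confirmation at the level of the aligned reads; a check like this only narrows down which candidates are worth inspecting.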

NextCODE's Bigger Data Solutions

Meeting the Big Data Challenge: The NextCODE Genomic Ordered Relational (GOR) Architecture

Unlike other sequence analysis solutions, the NextCODE informatics systems have already been successfully used to manage sequence variation data for hundreds of thousands of individuals. Central to this capability is the unique design of the NextCODE GOR architecture and database. Instead of creating multiple data silos, the GOR architecture greatly simplifies the comparisons of variation data across large sets of patients with similar signs and symptoms, thereby leveraging the diagnostic power in the future avalanche of variation data generated by next-gen sequencing technologies.

Equally important, our GOR database enables clients to dynamically retrieve, edit, and annotate their sequencing data on-the-fly without the common issues of substantial time-lag in retrieving and storing of data. Plus, user annotations on sequence variants from any given patient can be leveraged in the analysis of subsequent patients. Coupled with our web-based CSA system, our solution enables individual investigators and institutions to meet the challenge of managing big data without developing substantial IT infrastructure.

Accelerating In-House Development: Rapid Integrations

NextCODE provides an informatics infrastructure that allows medical centers to leverage their institutional experience and expertise in clinical genetics. It is our goal to enable in-house bioinformatics and IT efforts to take advantage of our advanced informatics platform. For example, experienced users and bioinformaticians can create their own data mining and analysis scripts, and deploy them to their own end users via our Sequence Miner tool. Also, our GOR database and data security design can be integrated to manage in-house data.

Benefits:
- With the GOR database, clients can store and manage large amounts of raw and analyzed data at very affordable levels without the need to develop extensive IT infrastructure.
- NextCODE components and capabilities can easily be integrated into existing systems to enable internal development of best practices adopted at each institution.
- NextCODE's professional services team will provide consultations upon request.
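The brochure does not describe how the GOR database works internally. As a rough illustration of why keeping variation data in genomic order helps with cross-patient comparison, the Python sketch below stores rows from several patients in one table sorted by chromosome and position, so that any region can be retrieved with a binary search instead of a full scan. The class name, row layout, and chromosome ordering are assumptions of this sketch and are not the GOR format or its API.

```python
import bisect

# Chromosome ordering used as the sort key; an assumption of this sketch.
CHROM_ORDER = {f"chr{n}": i for i, n in enumerate(list(range(1, 23)) + ["X", "Y"])}


def sort_key(row):
    """Genomic sort key for a row shaped (chrom, pos, patient_id, genotype)."""
    return (CHROM_ORDER[row[0]], row[1])


class OrderedVariantTable:
    """Toy in-memory table kept in genomic order (hypothetical, for illustration)."""

    def __init__(self, rows):
        self.rows = sorted(rows, key=sort_key)
        self.keys = [sort_key(r) for r in self.rows]

    def range_query(self, chrom, start, end):
        """Return all rows on chrom with start <= pos <= end, found by binary search."""
        lo = bisect.bisect_left(self.keys, (CHROM_ORDER[chrom], start))
        hi = bisect.bisect_right(self.keys, (CHROM_ORDER[chrom], end))
        return self.rows[lo:hi]


# Invented rows from three patients, deliberately supplied out of order.
table = OrderedVariantTable([
    ("chr2", 500, "P2", "0/1"),
    ("chr1", 150, "P1", "1/1"),
    ("chr1", 100, "P2", "0/1"),
    ("chr1", 900, "P3", "0/1"),
    ("chrX", 100, "P1", "0/1"),
])

# All patients' variants in one region come back together, in genomic order.
print(table.range_query("chr1", 100, 200))
```

Because every patient's rows share one ordering, a region of interest maps to a single contiguous slice regardless of how many patients contributed data, which is the general property that avoids per-patient silos.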

Specifications

Upstream: Sequencing

NextCODE provides sequencing services
Sample Submission Requirements:
- 2 µg of genomic DNA for exome and whole genome sequencing
- 3 µg to 4 µg of total RNA for transcriptome sequencing
- Minimal concentration 100 ng/µl

- OR -

NextCODE begins with client legacy data
Data Submission Formats:
- FASTQ file
- Sample manifest file

Mid-stream: Analysis

- NextCODE calibrated alignment and variation calling algorithms
- Calibrated indel caller provides higher sensitivity and specificity
- VEP variation effect prediction analysis based on gene context
- Missense mutation functional predictions
- Unstable region tagging or filtering
- Allelic frequencies, phenotype associations, and other variant annotations based on the NextCODE Knowledge Base, which includes both proprietary and public data

Downstream: Interpretation and Data Management

Informatics systems
A. NextCODE data analysis and CSA system
   - Sequencing quality control report
   - Candidate gene report
   - Report listing all known variations
   - A report to identify putative pathogenic variants including:
     - Autosomal dominant
     - Autosomal recessive
     - Autosomal compound heterozygosity
     - X-linked recessive
     - De novo variants
B. NextCODE Genome Browser for data visualization
C. NextCODE Sequence Miner for advanced sequence analysis and data mining
D. Collaborative genome analysis workflow
E. Clinical summary report

Data Storage using NextCODE GOR Architecture and Database
A. Data storage in NextCODE cloud- or server-based hosting services
B. Secondary analysis data is stored using BAM, VCF, and GOR formats
C. Tertiary analysis is generated in GOR format and can easily be viewed or downloaded

Recommended Computer Requirements
- CSA browser-based components require minimal computer resources for optimal performance
- NextCODE Genome Browser and NextCODE Sequence Miner, embedded in the CSA system, are Java Web Start modules which require 3 GB minimum memory for ideal performance
- NextCODE supports Windows, Linux, and Mac-based operating systems
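The sample submission requirements in the specifications above involve a small amount of arithmetic: total mass in micrograms is concentration (ng/µl) times volume (µl) divided by 1,000. The Python sketch below checks a sample against those figures. It assumes the stated amounts are minimums (2 µg for DNA, 3 µg for RNA) and that the 100 ng/µl concentration applies to both sample types; the function names and message wording are hypothetical.

```python
# Thresholds taken from the submission requirements; treating them as
# minimums for both sample types is an assumption of this sketch.
MIN_CONCENTRATION_NG_PER_UL = 100.0
MIN_MASS_UG = {"dna": 2.0, "rna": 3.0}


def total_mass_ug(concentration_ng_per_ul, volume_ul):
    """Total nucleic acid mass in micrograms (1 ug = 1000 ng)."""
    return concentration_ng_per_ul * volume_ul / 1000.0


def check_sample(kind, concentration_ng_per_ul, volume_ul):
    """Return a list of problems; an empty list means the sample passes."""
    problems = []
    if concentration_ng_per_ul < MIN_CONCENTRATION_NG_PER_UL:
        problems.append("concentration below 100 ng/ul")
    if total_mass_ug(concentration_ng_per_ul, volume_ul) < MIN_MASS_UG[kind]:
        problems.append(f"less than {MIN_MASS_UG[kind]:g} ug of material")
    return problems


# 20 ul at 100 ng/ul is exactly 2 ug, so a DNA sample just meets the requirement.
print(check_sample("dna", 100, 20))
# 20 ul at 120 ng/ul is 2.4 ug, which falls short of the 3 ug needed for RNA.
print(check_sample("rna", 120, 20))
```

At the minimum concentration of 100 ng/µl, the required volumes work out to at least 20 µl for DNA and 30 µl for RNA.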

There's big data, and then there's bigger data

We combine state-of-the-art sequencing and analytics with the world's largest genomics and phenotypic library. To learn more, contact info@nextcode.com

NextCODE Health
One Broadway, 14th Floor
Cambridge, MA 02142
www.nextcode.com

NextCODE Clinical Sequence Analyzer, NextCODE Genome Browser, NextCODE Sequence Miner, NextCODE GOR Architecture, and NextCODE GOR Database are registered trademarks of NextCODE Health.