The IT Challenges of Next-Gen Sequencing
- Darren Carson
Transcription
1 The IT Challenges of Next-Gen Sequencing
Tony Cox, Head of Sequencing Informatics
Sanger Institute, Cambridge, UK
24th November 2009

2 Outline
» Next generation sequencing presents big challenges in informatics and data management, driven by rapid change in:
  » Chemistry/instrumentation
  » Analysis techniques and software
  » Storage/processing requirements
» Problems we have faced at Sanger and some solutions we have implemented

3 Capillary Sequencing Limitations
» Number of samples per experiment (96)
» 0.0001 Gb/run
» 1000 base reads
» 1-2 hrs run time
» $100,000 / Gb
Since the human genome is 3 Gb, this approach is fundamentally limiting. A change was needed to make routine genome-scale sequencing viable.
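The capillary figures on slide 3 can be cross-checked with a little arithmetic. The sketch below uses only the numbers on the slide; the assumption of single-fold (1x) coverage is ours and understates a real project, which needs many-fold coverage.

```python
# Figures taken from slide 3; everything else is derived arithmetic.
SAMPLES_PER_RUN = 96        # samples per capillary experiment
BASES_PER_READ = 1000       # read length in bases
COST_PER_GB_USD = 100_000   # dollars per gigabase
HUMAN_GENOME_GB = 3         # human genome size in gigabases

# One run yields one read per sample.
bases_per_run = SAMPLES_PER_RUN * BASES_PER_READ
gb_per_run = bases_per_run / 1e9   # close to the 0.0001 Gb/run on the slide

# Assumption: single-fold (1x) coverage only.
runs_for_1x = HUMAN_GENOME_GB / gb_per_run
cost_for_1x = HUMAN_GENOME_GB * COST_PER_GB_USD

print(f"{gb_per_run:.6f} Gb per run")
print(f"{runs_for_1x:,.0f} runs for 1x coverage")
print(f"${cost_for_1x:,} for 1x coverage")
```

Even at 1x, this comes to tens of thousands of runs and hundreds of thousands of dollars per human genome, which is the limitation the slide describes.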
4 Moore's Law vs. Sequencing
Sequencing is a key research technique that drives biological discovery. Pressure to sequence faster and more cheaply has been relentless.

5 Next Generation Sequencing Instrumentation
» Illumina Genome Analyzer
» Life Sciences SOLiD
» Roche/454 Titanium

6 Single Base Sequencing
Cyclic process of:
» Incorporate a single, terminated, dye-labelled base
» Illuminate with laser and detect
» De-protect, then repeat until the chemistry becomes unreliable

7 GAIIx Optics

8 Illumina Single Base Sequencing
» Flowcell similar in size to a microscope slide
» 8 sample lanes
» Two lasers + two filters detect four bases/channels
» 120 image tiles per lane
» 1 image = 8Mb
» ~500k images
[Figure: flowcell layout showing lanes L1 to L8, tiles 1 to 120, and the four base channels A, C, G, T]
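The image counts on slide 8 give a rough sense of raw data volume per run. This is a minimal sketch that assumes "8Mb" means 8 megabytes per image and uses decimal units; neither assumption is stated on the slide.

```python
IMAGES_PER_RUN = 500_000   # "~500k images" from slide 8
MB_PER_IMAGE = 8           # "1 image = 8Mb", read here as megabytes (assumption)
LANES = 8                  # sample lanes per flowcell
TILES_PER_LANE = 120       # image tiles per lane

# Tiles imaged in one pass over the whole flowcell.
tiles_per_flowcell = LANES * TILES_PER_LANE

# Total raw image volume, using decimal units (1 TB = 1,000,000 MB).
total_mb = IMAGES_PER_RUN * MB_PER_IMAGE
total_tb = total_mb / 1_000_000

print(f"{tiles_per_flowcell} tiles per flowcell")
print(f"about {total_tb:.0f} TB of raw images per run")
```

Under these assumptions a single run produces raw images on the order of terabytes, which is why the later slides treat images as short-lived data.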
9 Raw Image Data to DNA Sequence
Images are acquired at each chemistry cycle, where one base is added.
Base sequence: T G C T A C G A T

10 Sanger Illumina Production Facility
40 x GAIIx / RTA

11 IT Challenges
» What are the IT challenges associated with running multiple next-generation sequencers in a high-throughput environment?
» Understanding the data
  » How much will we produce?
  » How much will we keep?
  » How much must we move?

12 How much data will we produce?
» Raw instrument data (huge number of large images)
» Intermediate pipeline processing data (product of image processing). Typically very many text files.
  » Run folder has >1 million files in it
» Results data: small number of large files. May be 100x smaller than raw data
  » QC and LIMS
  » Bases and qualities
  » Alignments

13 How much data will we keep?
» Images (raw data) are not interesting in the long term. Keep for only days or a few weeks (allows for re-analysis)
» Keep what intermediate data you need to validate the experiment as a success.
» QC data, LIMS and tracking information. May be stored longer term (years?).
» Results data: keep forever
  » Bases and qualities
  » Alignments, SNPs
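The retention tiers on slide 13 could be written down as a small policy table. The sketch below is hypothetical: the category names, the `is_expired` helper and the exact durations are illustrative placeholders chosen to fit the slide's wording ("days or a few weeks", "years?", "forever"), not the policy Sanger actually ran.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical retention limits per data category; None means keep forever.
RETENTION: dict[str, Optional[timedelta]] = {
    "images": timedelta(weeks=3),        # placeholder for "days or a few weeks"
    "qc_lims": timedelta(days=365 * 5),  # placeholder for "years?"
    "results": None,                     # bases, qualities, alignments, SNPs
}

def is_expired(category: str, age: timedelta) -> bool:
    """Return True if data of this category and age may be deleted."""
    limit = RETENTION[category]
    return limit is not None and age > limit

print(is_expired("images", timedelta(weeks=4)))
print(is_expired("results", timedelta(days=10_000)))
```

Intermediate pipeline data is left out on purpose: the slide gives no duration for it, only the rule of keeping what is needed to validate the experiment.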
14 How much data will we move?
» Data has to be separated from the instrument at some point (RTA now does this for us)
» May need to move to several locations for analysis, safe archive etc.
» Terabytes of data are likely to be involved
» Moving terabyte datasets around networks is non-trivial even in an advanced IT infrastructure
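To see why moving terabytes is non-trivial, it helps to estimate transfer times. This is a rough sketch only: the `transfer_hours` helper, the link speeds and the 70% efficiency factor are illustrative assumptions, not measurements from the talk.

```python
def transfer_hours(terabytes: float, link_gbit_per_s: float,
                   efficiency: float = 0.7) -> float:
    """Estimate hours needed to move a dataset over a network link.

    efficiency is the assumed fraction of nominal bandwidth actually achieved.
    """
    bits = terabytes * 1e12 * 8                       # decimal terabytes to bits
    effective_bits_per_s = link_gbit_per_s * 1e9 * efficiency
    return bits / effective_bits_per_s / 3600

for tb in (1, 10, 50):
    hours = transfer_hours(tb, link_gbit_per_s=1.0)
    print(f"{tb:>2} TB over 1 Gbit/s: {hours:.1f} h")
```

On these assumptions a single terabyte already takes hours on a 1 Gbit/s link, and a staging area of tens of terabytes takes days.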
15 Sanger NGS Data Output
[Chart: Sanger NGS data output, annotated with instrument upgrades and yearly capillary output]

16

17 Storage Planning
» This is difficult, and getting it wrong can break budgets and science projects
» Think first in terms of bases produced, not in bytes needed
» Work out bytes-per-base multipliers that are sensible for your scientific objectives

18 Storage Planning: An Example from Sanger
» We allow ~15 bytes/base for pipeline output storage.
  » Drive this down with more efficient storage formats!
» Allow 15x-20x inflation for analysis (e.g. alignments and SNP calling)
» Allow ~5x for long term storage of results
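The bytes-per-base approach on slides 17 and 18 can be turned into a small calculator. This sketch assumes every multiplier on slide 18 is a bytes-per-base figure (15 for pipeline output, 15 to 20 for analysis, 5 for long-term results); the slide does not spell that out, so treat it as one possible reading. The 50 Gbases run size is the current per-run yield quoted on slide 24.

```python
def storage_tb(gigabases: float, bytes_per_base: float) -> float:
    """Storage in decimal terabytes for a given yield and multiplier."""
    return gigabases * 1e9 * bytes_per_base / 1e12

RUN_GBASES = 50  # per-run yield quoted on slide 24

# Assumed reading of slide 18: each multiplier is bytes per base.
pipeline_tb = storage_tb(RUN_GBASES, 15)
analysis_tb_low = storage_tb(RUN_GBASES, 15)
analysis_tb_high = storage_tb(RUN_GBASES, 20)
archive_tb = storage_tb(RUN_GBASES, 5)

print(f"pipeline output: {pipeline_tb:.2f} TB per run")
print(f"analysis space:  {analysis_tb_low:.2f} to {analysis_tb_high:.2f} TB per run")
print(f"long-term store: {archive_tb:.2f} TB per run")
```

Multiplying the per-run figures by the number of instruments and runs per year gives a first-order budget, which is the point of planning in bases before bytes.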
19 Compute Planning
» Depends on type of analysis.
» Work out how many millions of short reads your preferred aligner can process per hour
» Extrapolate to the number of CPU days/day you will need to keep up
» Analysis is rarely a clean process. Much reanalysis takes place
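The extrapolation on slide 19 is a one-line calculation. In the sketch below the function name, the throughput numbers and the reanalysis factor are all hypothetical; substitute measured values for your own aligner and instruments.

```python
def cpu_days_per_day(reads_per_day: float, reads_per_cpu_hour: float,
                     reanalysis_factor: float = 1.0) -> float:
    """CPU days of work generated by each day of sequencing.

    reanalysis_factor > 1 accounts for data being processed more than once.
    """
    cpu_hours = reads_per_day / reads_per_cpu_hour * reanalysis_factor
    return cpu_hours / 24

# Hypothetical inputs: 2.4 billion reads/day, aligner handles 5 million reads per CPU hour.
baseline = cpu_days_per_day(2.4e9, 5e6)
with_rework = cpu_days_per_day(2.4e9, 5e6, reanalysis_factor=2.0)

print(f"baseline:        {baseline:.0f} CPU days/day")
print(f"with reanalysis: {with_rework:.0f} CPU days/day")
```

The result is the minimum number of cores that must be busy around the clock just to keep pace, before allowing for queueing or failures.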
20 Compute + Storage = I/O
» If your compute and storage requirements are big, your network and disk I/O will be critical to efficiency.
» Moving data around is very slow
» Keep compute and storage close and well connected.

21 Sequencing Data Flow
[Diagram. Labels: Sequencing farm; 1. RTA/CIFS; 10 x 50Tb NFS staging area; 2. pipeline analysis; Analysis farm; Lustre scratch storage; 3. archive; Archive (ENA); Oracle database (100Tb); 4. secondary analysis]

22 Instrument Data Management
[Diagram. Labels: Staging storage; RTA/CIFS; instruments IL1, IL2, IL3; Tb per instrument; 4-6 wk production buffer; staged data deletion policy; Incoming, Analysis, Outgoing; Pipeline monitor]

23 What have we learned?

24 Manufacturers are upgrading instruments constantly
» Illumina went from 10 Gbases per run in Q to 50 Gbases now, with a projected 95 Gbases per run by end 2009.
» Storage requirements increase 10-fold in one year.
» But real-world data yields rarely match those advertised
» At some point the informatics/IT budget passes the sequencing budget

25 Plan for Change
» We just have to accept that instruments, software and data processing requirements are changing very rapidly (month by month).
» Plan storage infrastructure carefully, or data management quickly gets out of control and projects will suffer

26 Precision is Difficult
» We almost always underestimate the informatics resources needed to support data production and analysis.
» Lab protocols and analysis techniques are changing rapidly. We need an agile approach to developing our software
» It will probably be obsolete in less than 12 months

27 In Conclusion
» Next gen sequencing is still a very rapidly moving field.
» Plan for change!
  » Keep infrastructure flexible
  » Keep disk space expandable
  » Keep software agile