Attacking the Biobank Bottleneck
Professor Jan-Eric Litton, BBMRI-ERIC
Big Data meets research biobanking
Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Big Data meets research biobanking
Access to high quality human biological samples and associated data
BioBankCloud
Seven Pillars of BBMRI-ERIC
Scientific excellence
One type of infrastructure for Europe
Access to high quality human biological samples and associated data
Access to high quality biomolecular resources
Ethical and legal compliance
Long-term sustainability
International integration
Work Programme 2014, prepared for the Assembly of Members, 2nd Session in Vienna on 2014/04/29. Final Version 3, HQ BBMRI-ERIC, 2014/04/15.
BBMRI-ERIC gets legal status
Founding Members of BBMRI-ERIC: Austria, Belgium, Czech Republic, Estonia, Finland, France, Germany, Greece, Italy, Malta, Netherlands, Sweden
Official Observers of BBMRI-ERIC: Norway, Poland, Switzerland, Turkey, IARC
BBMRI-ERIC is today the largest health-oriented ERIC ever launched in Europe!
BBMRI-ERIC has a total population of: 408 million individuals
How many samples? 1.6 billion tissue samples
Libraries of Flesh: The Sorry State of Human Tissue Storage
24 October 2014
Libraries of Flesh: The Sorry State of Human Tissue Storage
Every year, billions of dollars' worth of research into the genetic underpinnings of autism, schizophrenia, diabetes, Alzheimer's disease, and other devastating disorders hinges on scientists' ability to tap industrial quantities of cells and tissue. But Compton found that while our technology for decoding the inner workings of life is advancing dramatically, the protocols for collecting and storing specimens of human flesh have barely evolved in decades. At the same time, innovation in the field of biobanking has stalled for lack of funding and interest. The science of bio-preservation is still considered an arcane, musty specialty, more akin to taxidermy than medicine.
"You might have thought that doing the science would be the biggest challenge of a massive undertaking like the Cancer Genome Atlas," Compton told me last fall. "But acquiring the biospecimens turned out to be the hardest part, bar none. It's the Wild West out there."
Libraries of Flesh: The Sorry State of Human Tissue Storage
One bank at a major university claimed to have more than 12,000 samples of glioblastoma in its collection. Only 18 of those were good enough to use. The rate of unacceptable shipments from other institutions ran as high as 99 percent.
Nature Medicine, #3 March 2013
Catalogue
Data pyramid for biobank information
MIABIS 2.0
MIABIS has been slimmed down from 52 to 37 items to make it easier to implement (Biobank, Sample collection, Study).
MIABIS is modularized to make it easier to propose extensions and to adhere to it.
MIABIS is now accepted by a large community of BBMRI-ERIC countries.
MIABIS has a governance process to evolve it.
MIABIS is implemented in a wide range of applications.
MIABIS 2.0
[Diagram of MIABIS 2.0 components: Biobank; Sample Collection; Study; Donors (Aggregated); Samples (Aggregated); Omics; Participant; Sample]
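The three core MIABIS 2.0 components named on these slides (Biobank, Sample Collection, Study) can be pictured as linked records. The sketch below is illustrative only: the field names (`biobank_id`, `country`, and so on), the way a collection points to a study, and the helper `collection_names` are assumptions made for this example, not the attribute list defined by MIABIS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Study:
    # Hypothetical fields; MIABIS defines its own attribute list.
    study_id: str
    name: str


@dataclass
class SampleCollection:
    collection_id: str
    name: str
    # Assumption: a collection may be tied to at most one study.
    study: Optional[Study] = None


@dataclass
class Biobank:
    biobank_id: str
    name: str
    country: str
    collections: List[SampleCollection] = field(default_factory=list)


def collection_names(biobanks: List[Biobank], country: str) -> List[str]:
    """Return the names of all sample collections held in one country."""
    return [
        collection.name
        for biobank in biobanks
        if biobank.country == country
        for collection in biobank.collections
    ]


# Made-up example records, not real biobanks.
example = Biobank("bb-1", "Example Biobank", "SE")
example.collections.append(
    SampleCollection("col-1", "Example collection", Study("st-1", "Example study"))
)
print(collection_names([example], "SE"))
```

Keeping each component as its own record type mirrors the modular design described above: an extension can add a new record type without changing the existing ones.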
An emerging framework for sample-related standards
CEN working documents: Molecular in vitro diagnostic examinations. Specifications for pre-examination processes
For blood: circulating cell-free DNA
For blood: genomic DNA
For blood: cellular RNA
For metabolomics in urine, serum and plasma
For FFPE tissue: DNA
For FFPE tissue: RNA
For FFPE tissue: proteins
For frozen tissue: RNA
For frozen tissue: proteins
CEN = European Committee for Standardization
SOP-mapping
CoBiBa (Connecting BioBanks)
Big Data meets research biobanking
Access to high quality human biological samples and associated data
BioBankCloud
A Platform-as-a-Service for Biobanking
www.biobankcloud.com
Financed by the European Commission 7th Framework Programme.
The Biobank Bottleneck
We will soon be generating more genomic data than we can currently store and process securely.
There is an urgent need for better systems to store and process genomic data.
Numbers
Cost
Cross-linking Genomics with Clinical Data
Imagine you have genomic data for hundreds of thousands of Biobank samples.
Imagine you can potentially cross-link the genomic data to high-quality databases containing individualized data concerning common diseases.
[Diagram: Genomic Data linked to Phenotype Data]
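As a rough illustration of what cross-linking means in practice, the sketch below joins two small in-memory tables on a shared pseudonymous donor identifier. Everything in it is an assumption for the example: the function name `cross_link`, the identifier format, the variant labels and the phenotype fields are all made up, and a real system would add consent checks and access control that are not shown here.

```python
from typing import Dict, List


def cross_link(
    genomic: Dict[str, List[str]],
    phenotype: Dict[str, Dict[str, str]],
) -> Dict[str, dict]:
    """Inner-join genomic and phenotype records on a shared donor id.

    Donors that appear in only one of the two sources are left out.
    """
    shared_ids = sorted(genomic.keys() & phenotype.keys())
    return {
        donor_id: {
            "variants": genomic[donor_id],
            "phenotype": phenotype[donor_id],
        }
        for donor_id in shared_ids
    }


# Made-up toy data; the ids and values are placeholders only.
genomic_data = {
    "donor-001": ["variant-A", "variant-B"],
    "donor-002": ["variant-C"],
}
phenotype_data = {
    "donor-002": {"diagnosis": "type 2 diabetes"},
    "donor-003": {"diagnosis": "none recorded"},
}

linked = cross_link(genomic_data, phenotype_data)
print(linked)
```

Only `donor-002` appears in both sources, so it is the only record in the linked result. At the scale the slide describes, the same join would run in a database or a distributed processing framework rather than in memory.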
BiobankCloud PaaS
Running on lots of machines
[Diagram: a Data Center running many PaaS instances]
Scalable storage on commodity hardware
Many genomics research groups still run expensive SANs (storage area networks) or file systems that require expensive interconnects.
[Diagram labels: "My data please"; Reliable Storage Service; SAN; Unreliable Commodity Servers]
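One common way to build a reliable storage service on unreliable commodity servers is to keep several copies of every data block on different machines. The slides do not say which technique BiobankCloud uses, so the sketch below is a generic illustration of replication, and the names `place_replicas` and `is_readable`, the server names and the default of three replicas are all assumptions.

```python
import hashlib
from typing import List, Set


def place_replicas(block_id: str, servers: List[str], replicas: int = 3) -> List[str]:
    """Choose `replicas` distinct servers for a block, deterministically."""
    if replicas > len(servers):
        raise ValueError("not enough servers for the requested replica count")
    # Hash the block id to pick a starting server, then take the next ones in order.
    digest = hashlib.sha256(block_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(servers)
    return [servers[(start + k) % len(servers)] for k in range(replicas)]


def is_readable(block_id: str, servers: List[str], failed: Set[str], replicas: int = 3) -> bool:
    """A block stays readable while at least one of its replicas is on a live server."""
    return any(s not in failed for s in place_replicas(block_id, servers, replicas))


# Hypothetical cluster of five commodity servers.
cluster = ["server-0", "server-1", "server-2", "server-3", "server-4"]
placement = place_replicas("block-42", cluster)
print(placement)
print(is_readable("block-42", cluster, failed=set(placement[:2])))
```

With three replicas, a block survives the loss of any two of the servers that hold it, which is the property that lets cheap machines stand in for a SAN. Real systems also re-replicate blocks after a failure, which this sketch leaves out.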
Data Model
What immutable data should we store?
  Raw reads
  All reads (BAM file)
  Variations (VCF file)
  View of reads/variants?
What mutable data?
  Meta-data
Stored compressed data?
  Referential compression
  Other compression?
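Referential compression, mentioned above, stores a sequence as its differences from a shared reference instead of storing every base. The sketch below shows the idea in its simplest form, under strong assumptions: the sample and the reference have the same length and differ only by substitutions. Real tools also handle insertions, deletions and quality scores, and the function names `ref_compress` and `ref_decompress` are made up for this example.

```python
from typing import List, Tuple

# One difference: (0-based position, base found in the sample).
Diff = Tuple[int, str]


def ref_compress(reference: str, sample: str) -> List[Diff]:
    """Keep only the positions where the sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch handles equal-length sequences only")
    return [
        (pos, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def ref_decompress(reference: str, diffs: List[Diff]) -> str:
    """Rebuild the sample by applying the stored differences to the reference."""
    bases = list(reference)
    for pos, base in diffs:
        bases[pos] = base
    return "".join(bases)


# Toy sequences, not real genomic data.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"

diffs = ref_compress(reference, sample)
print(diffs)
print(ref_decompress(reference, diffs) == sample)
```

Because two human genomes agree at the vast majority of positions, the list of differences is far smaller than the full sequence. The trade-off is that the data can only be read back with access to the exact same reference.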
Flexible Computation Model
Ideally, we would like a computation model that is flexible enough to address many different stages of analysis:
  Sequence alignment
  Gene querying
  Individual-level operations
  Population-level operations
  Building analysis pipelines
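The difference between individual-level and population-level operations in the list above can be sketched with a small example: a per-individual step that counts non-reference alleles in one genotype, and a population step that aggregates those counts into an allele frequency. The sketch assumes diploid genotypes written in the VCF `GT` style (`0/1`, `1|1`) with no missing calls, and the function names are hypothetical.

```python
from typing import List


def alt_allele_count(genotype: str) -> int:
    """Individual-level step: count non-reference alleles in one genotype.

    Assumes a diploid call such as '0/1' or '1|1' with no missing alleles.
    """
    alleles = genotype.replace("|", "/").split("/")
    return sum(1 for allele in alleles if allele != "0")


def alt_allele_frequency(genotypes: List[str]) -> float:
    """Population-level step: aggregate individual counts into a frequency."""
    if not genotypes:
        raise ValueError("need at least one genotype")
    total_alleles = 2 * len(genotypes)  # diploid assumption
    return sum(alt_allele_count(g) for g in genotypes) / total_alleles


# Toy genotypes for four made-up individuals at a single site.
site_genotypes = ["0/0", "0/1", "1/1", "0|1"]
print([alt_allele_count(g) for g in site_genotypes])
print(alt_allele_frequency(site_genotypes))
```

The individual-level step is independent per sample and so can run in parallel across machines, while the population-level step needs results from all samples. A flexible computation model has to support both patterns.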
Summary
New systems are needed to solve the Biobank Bottlenecks:
BBMRI-ERIC is attacking this problem by building a Gateway for Health to find samples fit for purpose.
The BiobankCloud Project is attacking this problem by building an open-source PaaS for private clouds.
Thank you!
jan-eric.litton@bbmri-eric.eu