Three data delivery cases for EMBL-EBI's Embassy. Guy Cochrane


1 Three data delivery cases for EMBL-EBI's Embassy
Guy Cochrane

2 EMBL European Bioinformatics Institute
- Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, Ensembl Plants, European Genome-phenome Archive, Metagenomics portal, GWAS Catalog browser
- Protein sequences: InterPro, Pfam, UniProt
- Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
- Literature & ontology: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
- Expression: ArrayExpress, Expression Atlas, MetaboLights, PRIDE
- Reactions, interactions & pathways: IntAct, Reactome, MetaboLights
- Chemical biology: ChEMBL, ChEBI
- Systems: BioModels, Enzyme Portal, BioSamples

3 Sequence data at EMBL-EBI
[Diagram: data layers Sample/method, Read, Alignment, Assembly, Annotation]
European Nucleotide Archive
- Unrestricted data
- Pan-species and application
European Genome-phenome Archive
- Controlled access data
- Human data around molecular medicine

4 Sequence data at EMBL-EBI
[Diagram: data layers Sample/method, Read, Alignment, Assembly, Annotation]
European Nucleotide Archive
- Unrestricted data
- Pan-species and application
European Genome-phenome Archive
- Controlled access data
- Human data around molecular medicine
Infrastructure provision
- BBSRC: RNAcentral, MG Portal
- MRC: 100k Genomes data implementation
- EC: COMPARE, MicroB3, ESGI, BASIS
- etc.

5 Challenges
- Data have high volume and grow rapidly
- Data are dynamic (continuous feed) and their application has urgency
- Users require arbitrary and ad hoc access

6 Tara Oceans

7 Tara Oceans Capacity

8 Infectious disease
Opportunity: A methodological revolution in clinical and public health towards shotgun sequencing-based methods
Scientific power: Sequence harbours rich information
- Diagnostic: identification, typing, resistance profiling, etc.
- Public health: outbreak detection, response strategy, vaccine development
- Mechanistic: host interactions, pathogenicity, virulence, transmission, antimicrobial resistance
COMPARE: recently launched Horizon 2020 project in which EMBL-EBI is informatics provider
Global Microbial Identifier: Initiative with EMBL-EBI involvement supporting technologies, standards and data sharing for pathogen surveillance
Informatics roles for EMBL:
- COMPARE: Rapid global sharing of surveillance and outbreak data, systematic integrated analysis, compute provision (Embassy)
- Standards for reporting, analysis and the communication of results
- New algorithms and analysis methods
- User interfaces for surveillance data reporting, across the domains

9 COMPARE platform
[Diagram] Columns: Sources; Processes; Portals and environments
Labels: COMPARE Registry; COMPARE Data Resource; Public data; COMPARE workflow engine; Assembly & alignment; Food workflow development; COMPARE Portal; Default tools; API; INSDC data exchange; Managed access data; Private data; Annotation; Typing; Workflow integration; Clinical workflow development; Outbreak workflow development; Hosted tools; API; API
Infrastructure: EBI infrastructure; Embassy infrastructure; DTU infrastructure; Embassy virtual domain

10 COMPARE platform
[Diagram, with callout: Urgency] Columns: Sources; Processes; Portals and environments
Labels: COMPARE Registry; COMPARE Data Resource; Public data; COMPARE workflow engine; Assembly & alignment; Food workflow development; COMPARE Portal; Default tools; API; INSDC data exchange; Managed access data; Private data; Annotation; Typing; Workflow integration; Clinical workflow development; Outbreak workflow development; Hosted tools; API; API
Infrastructure: EBI infrastructure; Embassy infrastructure; DTU infrastructure; Embassy virtual domain

11 Personalised medicine
Motivation: Personalised studies of variation, cancer mutation, epigenetics, regulation and expression require references for comparison and interpretation
As part of GA4GH, EMBL-EBI is working on:
- Resources serving reference human genomic and transcriptomic data, including Google read API, variant Beacons, etc.
- CRAM compression supporting greater data fluidity, and APIs to allow direct computational access
- Delivery and synchronisation of high volume datasets to local Embassy and remote cloud infrastructures
Past and current FP7 projects include SLING, BASIS, ESGI

12 Personalised medicine
[Callout: Arbitrary access]
Motivation: Personalised studies of variation, cancer mutation, epigenetics, regulation and expression require references for comparison and interpretation
As part of GA4GH, EMBL-EBI is working on:
- Resources serving reference human genomic and transcriptomic data, including Google read API, variant Beacons, etc.
- CRAM compression supporting greater data fluidity, and APIs to allow direct computational access
- Delivery and synchronisation of high volume datasets to local Embassy and remote cloud infrastructures
Past and current FP7 projects include SLING, BASIS, ESGI

13 ENA conventional read data delivery
[Diagram] Conventional infrastructure (FTP, Aspera, GridFTP); ENA metadata; FIRE1; ENA data (NFS)

14 ENA Embassy read data delivery
[Diagram] Conventional infrastructure (FTP, Aspera, GridFTP); ENA metadata; FIRE2; ENA data (Cleversafe); FUSE; HTTP

15 ENA Embassy read data delivery
[Diagram]
- Conventional infrastructure (FTP, Aspera, GridFTP)
- Embassy cloud infrastructure (VMware -> OpenStack)
- ENA metadata; FIRE2; ENA data (Cleversafe); FUSE; HTTP
- Marine cache: Tara Oceans Embassy
- Pathogen cache: COMPARE Embassy
- CRAM cache: GA4GH Embassy

16 ENA external read data delivery phase II

17 EMBL-EBI Embassy Cloud Steven Newhouse Head of Technical Services

18 The Challenge Facing EMBL-EBI
Volume and variety of genomic data expanding
- EMBL-EBI data doubling every year - replication is challenging
- Infrastructure currently 50,000 CPUs & 60+ PB
Need to support complex analysis scenarios
- Web and programmatic access to services (3M unique users)
- Access to both public and managed access data sets
- Bespoke workflows and tools across a variety of domains
Hard for users to replicate data sets for local analysis
- Use the cloud to bring local analysis to EMBL-EBI data
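To make the "doubling every year" point on slide 18 concrete, here is a minimal sketch of the compound-growth arithmetic. It takes the slide's 60 PB figure as a starting point and assumes a strict one-year doubling period holds into the future, which the slides do not claim. The function name `projected_storage_pb` is illustrative only.

```python
def projected_storage_pb(start_pb: float, years: float,
                         doubling_period_years: float = 1.0) -> float:
    """Return the projected volume in PB after `years`.

    Assumes pure exponential growth: the volume doubles once per
    `doubling_period_years`. This is a simplification for illustration.
    """
    if doubling_period_years <= 0:
        raise ValueError("doubling_period_years must be positive")
    return start_pb * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # 60 PB is the figure quoted on the slide; the projection is hypothetical.
    for year in range(6):
        print(f"year {year}: {projected_storage_pb(60, year):,.0f} PB")
```

Under these assumptions the volume grows 32-fold in five years, which illustrates why replicating the full archive to each user site becomes impractical.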

19 EMBL-EBI Embassy Cloud
Service hosted at EMBL-EBI data centres
- Direct network access to public and managed data sets
- Direct network access to public services
- Expect both academic and commercial users
Technical Implementation
- Logically isolated outside EMBL-EBI's LANs
- Secure flexible infrastructure for both tenant and host
- Resources exposed using VMware's vCloud Director & OpenStack
- Provide isolated IaaS clouds to multiple users

20 Why Embassy Cloud?
An embassy is sovereign territory in a host country
- Host Country: EMBL-EBI Data Centre
- Sovereign Territory: Host Country not allowed to enter
Virtualisation provides the protection for tenant and host
- Host puts boundaries in place to protect it from the tenant
- Tenant has freedom and control within those boundaries

21 Embassy Cloud Concept
[Diagram] PanCancer; Public Data; Public Services; Managed Data; Embassy Cloud 1; Embassy Cloud 2; Embassy Cloud 3; Private Data; Virtualised EMBL-EBI Hardware

22 User Benefits for the IaaS Model
Tenant organisations get an empty virtual infrastructure
- They establish their own virtual machines and networks
- System administration performed by the tenant
- EMBL-EBI staff have no access to the VMs
Added value from EMBL-EBI over other clouds
- Machines and data hosted in known jurisdiction
- Direct network access to data sets (public & managed access)
- Direct network access to public EMBL-EBI services

23 Benefits to EMBL-EBI of the IaaS Model
A secure collaborative workspace
- Work does not contend with main EMBL-EBI resources
- Clearly define the committed IT resources and data
Explore how to build more data focused analysis services
- Move the analysis to where the big data is located
- Learn from and inform other big data scientific communities

24 Embassy Cloud: Typical Uses
Collaborative Environment
- Neutral ground outside internal network
- CTTV: Resources and VMs to host intranet, databases,
Data Staging
- Undertake submission from local machine (following data staging) rather than from remote location
- BRAEMBL: Remote submission unreliable due to file upload
Data Analysis
- Large scale management and analysis of data
- PanCancer: 1,000 cores, 2.5 TB RAM, 1.0 PB HDD

25 Issues
Object Store Storage Infrastructure
- Essential for scalable high-performance storage
- Applications need to adapt to flat model
- Current caching strategy will have a limit
Sharing resources between sites/communities/clouds
- Adopt a standards based model for federating resources
- Solutions for uploading and distributing VMs (+containers?)
- Replicating large data sets to attract workloads to a cloud
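The remark on slide 25 that applications "need to adapt to flat model" can be illustrated with a toy example. An object store has no directories, only keys, so a hierarchy is emulated by key prefixes. The sketch below is an in-memory stand-in, not the API of FIRE, Cleversafe or any real object store. The class `FlatObjectStore`, the helper `path_to_key` and the example paths are all hypothetical.

```python
from typing import Dict, List


class FlatObjectStore:
    """Toy in-memory object store: a single flat namespace of key -> bytes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list_prefix(self, prefix: str) -> List[str]:
        # There are no directories: "listing a folder" is a prefix scan over keys.
        return sorted(k for k in self._objects if k.startswith(prefix))


def path_to_key(path: str) -> str:
    """Map a POSIX-style path to a flat key by dropping empty and '.' segments."""
    parts = [p for p in path.split("/") if p not in ("", ".")]
    return "/".join(parts)


if __name__ == "__main__":
    store = FlatObjectStore()
    # Hypothetical file names, used only to show the mapping.
    store.put(path_to_key("/reads/run1/sample_a.cram"), b"...")
    store.put(path_to_key("/reads/run1/sample_b.cram"), b"...")
    store.put(path_to_key("/reads/run2/sample_c.cram"), b"...")
    print(store.list_prefix("reads/run1/"))
```

An application written against a file system typically relies on operations such as renaming a directory or appending to a file. In a flat model these become per-object copies or rewrites, which is one reason adaptation is needed.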

26 Gaps → Activities → Solutions?
Data Set Replication
- Strategic pre-positioning of data into clouds
- Leverage JANET/GEANT, GridFTP + Globus Transfers,
Cloud federation for mobile computing
- EGI has a federated cloud and VM distribution model
- ELIXIR plans to build on existing infrastructure where possible
Wide-area file access needed for collaborative data analysis
- High performance wide-area object-store
- Need access control for human related data
Coordinated investment in infrastructure
- Where is the UK coordination? What coordination is needed?
- Integrating commercial resources where they add value
- Integration with EU Infrastructure (ELIXIR)
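A rough calculation shows why slide 26 stresses strategic pre-positioning of data. The sketch below estimates how long a bulk transfer would take over a wide-area link. The 1.0 PB size echoes the PanCancer figure on slide 24. The 10 Gbit/s link speed and the efficiency factor are assumptions chosen for illustration, not figures from the slides, and the function name `transfer_days` is hypothetical.

```python
def transfer_days(size_pb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the days needed to move `size_pb` petabytes over a link.

    Uses decimal units (1 PB = 1e15 bytes, 1 Gbit/s = 1e9 bit/s).
    `efficiency` is the fraction of nominal bandwidth actually achieved.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    size_bits = size_pb * 1e15 * 8
    seconds = size_bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    # Assumed link speed and efficiencies; only the 1.0 PB size comes from the slides.
    for eff in (1.0, 0.5):
        days = transfer_days(1.0, 10.0, eff)
        print(f"1 PB at 10 Gbit/s, {eff:.0%} efficiency: {days:.1f} days")
```

Even at full line rate the copy takes more than a week under these assumptions, so replicating data ahead of demand, or moving the analysis to the data, avoids a long wait at the start of each project.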


Course Specification

Course Specification 1 Course Specification Program on which the course is given: Department offering the program: Department offering the course: Academic year /level: Date of specification approval: 2008/2009 Masters Degree

More information

School of Nursing. Presented by Yvette Conley, PhD

School of Nursing. Presented by Yvette Conley, PhD Presented by Yvette Conley, PhD What we will cover during this webcast: Briefly discuss the approaches introduced in the paper: Genome Sequencing Genome Wide Association Studies Epigenomics Gene Expression

More information

Attacking the Biobank Bottleneck

Attacking the Biobank Bottleneck Attacking the Biobank Bottleneck Professor Jan-Eric Litton BBMRI-ERIC BBMRI-ERIC Big Data meets research biobanking Big data is high-volume, high-velocity and highvariety information assets that demand

More information

Enabling a federated environment to support biomedical research. Gianmauro Cuccuru CRS4

Enabling a federated environment to support biomedical research. Gianmauro Cuccuru CRS4 Enabling a federated environment to support biomedical research Gianmauro Cuccuru CRS4 ELIXIR connects national bioinformatics centres and EMBL- EBI into a sustainable European infrastructure for biological

More information

Veeam Cloud Connect. Version 8.0. Administrator Guide

Veeam Cloud Connect. Version 8.0. Administrator Guide Veeam Cloud Connect Version 8.0 Administrator Guide April, 2015 2015 Veeam Software. All rights reserved. All trademarks are the property of their respective owners. No part of this publication may be

More information

European Molecular Biology Laboratory Case Example

European Molecular Biology Laboratory Case Example European Molecular Biology Laboratory Case Example Dr. Silke Schumacher Director International Relations EMBL Member States Austria 1974 Denmark 1974 France 1974 Germany 1974 Israel 1974 Italy 1974 Netherlands

More information

OpenStack IaaS. Rhys Oxenham OSEC.pl BarCamp, Warsaw, Poland November 2013

OpenStack IaaS. Rhys Oxenham OSEC.pl BarCamp, Warsaw, Poland November 2013 OpenStack IaaS 1 Rhys Oxenham OSEC.pl BarCamp, Warsaw, Poland November 2013 Disclaimer The information provided within this presentation is for educational purposes only and was prepared for a community

More information

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova

Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel

More information

VMware Cloud Automation Design and Deploy IaaS Service

VMware Cloud Automation Design and Deploy IaaS Service DATASHEET VMware Cloud Automation AT A GLANCE The VMware Cloud Automation Design and Deploy IaaS Service expands the power of virtualization and moves IT services away from existing infrastructure delivery

More information

vcloud Air Disaster Recovery Technical Presentation

vcloud Air Disaster Recovery Technical Presentation vcloud Air Disaster Recovery Technical Presentation Agenda 1 vcloud Air Disaster Recovery Overview 2 What s New 3 Architecture 4 Setup and Configuration 5 Considerations 6 Automation Options 2 vcloud Air

More information

Experiences and challenges in the development of the JASMIN cloud service for the environmental science community

Experiences and challenges in the development of the JASMIN cloud service for the environmental science community JASMIN (STFC/Stephen Kill) Experiences and challenges in the development of the JASMIN cloud service for the environmental science community ECMWF Visualisa-on in Meteorology Week, 28 September 2015 Philip

More information

Big Data Europe

Big Data Europe BIG DATA EUROPE SC1 Hangout Big Data Challenge in Health www.big-data-europe.eu Empowering Communities with Data Technologies Agenda for Today Welcome! Brief into and background (OPF) Introduction to the

More information

Preparing the scenario for the use of patient s genome sequences in clinic. Joaquín Dopazo

Preparing the scenario for the use of patient s genome sequences in clinic. Joaquín Dopazo Preparing the scenario for the use of patient s genome sequences in clinic Joaquín Dopazo Computational Medicine Institute, Centro de Investigación Príncipe Felipe (CIPF), Functional Genomics Node, (INB),

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop www.cloud.sara.nl Tutorial 2014-06-11 UvA HPC and Big Data Course June 2014 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

HEP Data-Intensive Distributed Cloud Computing System Requirements Specification Document

HEP Data-Intensive Distributed Cloud Computing System Requirements Specification Document HEP Data-Intensive Distributed Cloud Computing System Requirements Specification Document CANARIE NEP-101 Project University of Victoria HEP Computing Group December 18, 2013 Version 1.0 1 Revision History

More information

NIH Genomic Data Sharing (GDS) Policy Guidance Memo #2 1

NIH Genomic Data Sharing (GDS) Policy Guidance Memo #2 1 MEMORANDUM TO: Principal Investigators and Research Staff DATE: 2/22/15 FROM: Anne Klibanski, MD, Partners Chief Academic Officer (CAO) Paul Anderson, MD, PhD, BWH CAO Harry Orf, PhD, MGH Sr. Vice President-Research

More information

How To Choose Cloud Computing

How To Choose Cloud Computing IJSRD - International Journal for Scientific Research & Development Vol. 2, Issue 09, 2014 ISSN (online): 2321-0613 Comparison of Several IaaS Cloud Computing Platforms Amar Deep Gorai 1 Dr. Birendra Goswami

More information

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions 64% of organizations were investing or planning to invest on Big Data technology

More information

Institut Français de Bioinformatique, Un Cloud pour les Sciences du Vivant

Institut Français de Bioinformatique, Un Cloud pour les Sciences du Vivant Institut Français de Bioinformatique, Un Cloud pour les Sciences du Vivant Christophe Blanchet! Institut Français de Bioinformatique - IFB French Institute of Bioinformatics - ELIXIR French Node CNRS UMS3601

More information

Automated and Scalable Data Management System for Genome Sequencing Data

Automated and Scalable Data Management System for Genome Sequencing Data Automated and Scalable Data Management System for Genome Sequencing Data Michael Mueller NIHR Imperial BRC Informatics Facility Faculty of Medicine Hammersmith Hospital Campus Continuously falling costs

More information

COPO: Collaborative Open Plant Omics. Rob Davey Data Infrastructure and Algorithms Group Leader robert.davey@tgac.ac.

COPO: Collaborative Open Plant Omics. Rob Davey Data Infrastructure and Algorithms Group Leader robert.davey@tgac.ac. : Collaborative Open Plant Omics Rob Davey Data Infrastructure and Algorithms Group Leader robert.davey@tgac.ac.uk @froggleston Toni Etuk Felix Shaw Acknowledgements Oxford eresearch Centre Susanna Sansone

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop doc.hpccloud.surfsara.nl UvA workshop 2016-01-25 UvA HPC Course Jan 2016 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

M110.726 The Nucleus M110.727 The Cytoskeleton M340.703 Cell Structure and Dynamics

M110.726 The Nucleus M110.727 The Cytoskeleton M340.703 Cell Structure and Dynamics of Biochemistry and Molecular Biology 1. Master the knowledge base of current biochemistry, molecular biology, and cellular physiology Describe current knowledge in metabolic transformations conducted

More information

VMware Virtual SAN Design and Sizing Guide TECHNICAL MARKETING DOCUMENTATION V 1.0/MARCH 2014

VMware Virtual SAN Design and Sizing Guide TECHNICAL MARKETING DOCUMENTATION V 1.0/MARCH 2014 VMware Virtual SAN Design and Sizing Guide TECHNICAL MARKETING DOCUMENTATION V 1.0/MARCH 2014 Table of Contents Introduction... 3 1.1 VMware Virtual SAN...3 1.2 Virtual SAN Datastore Characteristics and

More information

HBC1533 - How to build your cloud - Steps to Extend your Datacenter

HBC1533 - How to build your cloud - Steps to Extend your Datacenter VMworld 2014 Page 1 HBC1533 - How to build your cloud - Steps to Extend your Datacenter Tuesday, 14 October 2014 14:00 Dave Hill, VMware 5 key steps to Hybrid DC A thing made by combining two different

More information

Deployment of BioXSDenabled services on a Cloud. christophe.blanchet@ibcp.fr

Deployment of BioXSDenabled services on a Cloud. christophe.blanchet@ibcp.fr Deployment of BioXSDenabled services on a Cloud Outline IBCP, provider of BioXSD-enabled services Cloud Computing RENABI GRISBI, French infrastructure Bioinformatics Integrated s gbio-pbil.ibcp.fr/ws GBIO

More information

Bacterial Next Generation Sequencing - nur mehr Daten oder auch mehr Wissen? Dag Harmsen Univ. Münster, Germany dharmsen@uni-muenster.

Bacterial Next Generation Sequencing - nur mehr Daten oder auch mehr Wissen? Dag Harmsen Univ. Münster, Germany dharmsen@uni-muenster. Bacterial Next Generation Sequencing - nur mehr Daten oder auch mehr Wissen? Dag Harmsen Univ. Münster, Germany dharmsen@uni-muenster.de Commercial Disclosure Dag Harmsen is co-founder and partial owner

More information

Software Defined Security Mechanisms for Critical Infrastructure Management

Software Defined Security Mechanisms for Critical Infrastructure Management Software Defined Security Mechanisms for Critical Infrastructure Management SESSION: CRITICAL INFRASTRUCTURE PROTECTION Dr. Anastasios Zafeiropoulos, Senior R&D Architect, Contact: azafeiropoulos@ubitech.eu

More information

SUSE Cloud 2.0. Pete Chadwick. Douglas Jarvis. Senior Product Manager pchadwick@suse.com. Product Marketing Manager djarvis@suse.

SUSE Cloud 2.0. Pete Chadwick. Douglas Jarvis. Senior Product Manager pchadwick@suse.com. Product Marketing Manager djarvis@suse. SUSE Cloud 2.0 Pete Chadwick Douglas Jarvis Senior Product Manager pchadwick@suse.com Product Marketing Manager djarvis@suse.com SUSE Cloud SUSE Cloud is an open source software solution based on OpenStack

More information

Deploying Business Virtual Appliances on Open Source Cloud Computing

Deploying Business Virtual Appliances on Open Source Cloud Computing International Journal of Computer Science and Telecommunications [Volume 3, Issue 4, April 2012] 26 ISSN 2047-3338 Deploying Business Virtual Appliances on Open Source Cloud Computing Tran Van Lang 1 and

More information

BIOLOGICAL SCIENCES REQUIREMENTS [63 75 UNITS]

BIOLOGICAL SCIENCES REQUIREMENTS [63 75 UNITS] Biological Sciences Major The Biological Sciences address many of the most important and fundamental questions about our world: What is life? How does our brain produce our ideas and emotions? What are

More information

Too Much Data or Too Little Cooperation? Tom Plasterer, PhD. Research & Development Information (RDI) Director, US Cross-Science

Too Much Data or Too Little Cooperation? Tom Plasterer, PhD. Research & Development Information (RDI) Director, US Cross-Science Too Much Data or Too Little Cooperation? Tom Plasterer, PhD. Research & Development Information (RDI) Director, US Cross-Science Approaching a Pharma Big Data Problem: Requirements of the CI Informatics

More information