Building the Systems Biology Knowledgebase


1 Building the Systems Biology Knowledgebase Tom Brettin Oak Ridge National Laboratory

2 Integrate science and the science community JGI Sequencing Genome Carbon Cycling Processes Bioenergy Research Integrate Science Across Metabolic Modeling Plant Feedstocks for Bioenergy Biology Research There is a tremendous wealth of data and information in the Genomic Sciences program. The Knowledgebase (KBase) is an opportunity to integrate this data and information both within individual activities and across different activities.

3 Everyone should be a contributor!
KBase: A. Professional Computational Biologists B. Data generators and basic analysts C. Knowledge Seekers D. Knowledge Generators
Instances of minimum inventory/maximum diversity systems, a term coined by Peter Pearce in his book Structure in Nature Is a Strategy for Design (MIT Press, 1978).
Therefore we aim to:
Create a powerful framework for programmatic access to the data and functions of KBase. (Users A, B) Ultimately provide stubs for use in Perl, Python, R, MATLAB, Galaxy, etc.
Create a set of packaged widgets that make placing and displaying KBase functions on web pages (or perhaps within other apps) easy and identifiable. (Users B)
Create a simplified portal for search and aggregation of data for data consumers and Knowledge Seekers. (Users C, D)
Create an innovative platform for knowledge creation, evolution and sharing.
DOE Office of Science, Office of Biological and Environmental Research
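The idea of language-level stubs for programmatic access can be illustrated with a minimal Python sketch. Nothing here is the actual KBase client: the `ServiceStub` class, the JSON message layout, the `Genomes.gene_count` method, and the in-process `fake_transport` are all hypothetical stand-ins chosen so the example runs without a network.

```python
import json
from itertools import count
from typing import Any, Callable


class ServiceStub:
    """Hypothetical client stub: turns attribute access into remote calls."""

    def __init__(self, service_name: str, transport: Callable[[str], str]) -> None:
        self._service = service_name
        self._transport = transport      # sends request text, returns response text
        self._ids = count(1)             # simple request counter

    def __getattr__(self, method: str) -> Callable[..., Any]:
        # Only reached for attributes that do not exist on the object.
        if method.startswith("_"):
            raise AttributeError(method)

        def remote_call(*params: Any) -> Any:
            request = json.dumps({
                "id": next(self._ids),
                "method": f"{self._service}.{method}",
                "params": list(params),
            })
            response = json.loads(self._transport(request))
            if response.get("error") is not None:
                raise RuntimeError(response["error"])
            return response["result"]

        return remote_call


def fake_transport(request_text: str) -> str:
    """Stand-in for a server (assumption): answers one made-up method."""
    request = json.loads(request_text)
    if request["method"] == "Genomes.gene_count":
        return json.dumps({"id": request["id"], "result": len(request["params"][0])})
    return json.dumps({"id": request["id"], "error": "unknown method"})


genomes = ServiceStub("Genomes", fake_transport)
print(genomes.gene_count(["geneA", "geneB"]))
```

A stub of this shape hides the wire format from the caller, which is what would let the same service be offered in several languages; replacing `fake_transport` with a real HTTP call would be the only change needed on the client side.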

4 An Integrated View of Modeling, Simulation, Experiment, and Bioinformatics Bioinformatics Analysis Tools Integrated Biological Databases Experimental Design High-throughput Experiments Analysis & Visualization

5 An Integrated View of Modeling, Simulation, Experiment, and Bioinformatics Problem Specification Modeling and Simulation Analysis & Visualization Bioinformatics Analysis Tools Integrated Biological Databases Experimental Design High-throughput Experiments Analysis & Visualization

6 Systems Biology Knowledge Base. Knowledgebase enabling predictive systems biology. Powerful modeling framework. Community driven, extensible and scalable open source software and system. Infrastructure for and of algorithms and data sources. Framework for search, and of data. Enable model based experimental design and interpretation of results. Microbes Communities Plants

7 Engineering a Microbe for Biofuel Production Annotated Genome Annotation algorithms Metabolic reconstruction Feed Stock Stresses Hydrolysate, pH, Salt, End product, intermediates Metabolic model generation Model optimization algorithms Biomass Regulatory network inference Isoprene Other functional modeling Fitting kinetic model parameters DNA replication transcription protein folding translation Regulation Predicting pathway fluxes KBase Tool Integration Proposing strain optimizations Genome Sequence Comparative Genomics KEGG Brenda BioCyc Published models Gene KO Phenotypes Transcriptomics Metabolomics Proteomics Growth curves Flux tracing experiments KBase Data Integration

8 Modifying Lignin Biosynthesis S G H S G H PolyPhen 2 Genome annotation algorithms Comparative genomics Genome wide Correlative analysis SNP influenced changes in protein structure and function Pathway predictions Network inference Pathway reconstruction Omics & SNP overlay Model optimization validation Phylogenomics Modeling phase I Plant systems modification Phenotype Mutant population Resequencing data Transcriptomics Proteomics Metabolomics

9 Culturing Recalcitrant Microbes from Communities Covariation Analysis, Phylogenetically and Functionally Interesting Keystone Species Phylogenetic Inference Gene Functional Annotation Trp N Differential Gene Expression Population Statistics Comparative Metagenomics Isolate Genomes and Models Genome Assembly from Metagenomics Annotation and Metabolic Reconstruction Regulation and Functional Modeling Predict Syntrophic Interactions Predict Culturing Conditions Isolate vs. Community Phenotype Species Abundance Functional Gene Abundance Phylo binning and scaffolding Transcriptomics Metabolomics Proteomics Temp pH Salinity Amino Acids Cofactors Syntrophies

10 What the KBase Needs To Provide?
Scalable compute and data capabilities beyond those available locally
Distributed infrastructure available 24x7 worldwide
Integration with local bioinformatics systems for seamless computing and data management
Enables leverage of remote systems administration and support via service providers
Enables access to state-of-the-art facilities at a fraction of the cost (SPs just add more servers)
Centralized support of tools and data
Bottom line: enable biologists to focus on biology

11 Leverage Existing Investments We leverage the considerable investments in existing integrated databases and analysis environments. Key challenge: how do we build on these systems yet provide the community with an integrated view for future development?

12 Microbes Online Model SEED MG-RAST 1000s Data Sets 300+ Daily Users Meta Microbes Online 6532 Models Users 41,000 Metagenomes 500+ Daily Users Phytozome 153 Metagenomes 100+ Daily Users RegFam 1000s Papers 100+ Daily Users 20,000+ users The SEED 1166 Subsystems 5859 Users 25 Plant Genomes 300 Daily Users RAST 39,000 Genomes Users

13 Infrastructure Goals Our vision is to put users in the driver's seat.

14 DOE Systems Biology Knowledgebase KBASE Data and modeling for predictive biology Overview of Infrastructure Tom Brettin and Rick Stevens Oak Ridge and Argonne National Laboratories

15 Working As One Team Plant CDM Design and Build (Jan 2012, ORNL); Hackathon (Jan 2012, LBL); First Internal KBase Build (Feb 2012, ANL)

16 Software Technical Reviews (May 2-3, 2012)

17 Energy Sciences Network (ESnet) KBase leverages ESnet for 10+ Gb/s data transfer between all nodes. BNL ESnet backbone (ESnet4) is a national 10 Gbps optical circuit infrastructure. ESnet shares its optical network with Internet2. ESnet's IP network functions as a Tier 1 internet service provider.
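To give a feel for what a 10 Gb/s link means for moving data between nodes, the arithmetic can be sketched as below. The function name, the 1 TB example size, and the `efficiency` factor are illustrative assumptions, not figures from the slides; real throughput depends on protocol overhead and on the storage at each end.

```python
def transfer_time_seconds(size_bytes: float, link_gbps: float,
                          efficiency: float = 1.0) -> float:
    """Idealized time to move size_bytes over a link of link_gbps gigabits/s.

    efficiency is the assumed fraction of the nominal rate actually achieved.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_bytes * 8                      # bytes to bits
    return bits / (link_gbps * 1e9 * efficiency)


ONE_TERABYTE = 1e12  # decimal terabyte, in bytes (illustrative data set size)

ideal = transfer_time_seconds(ONE_TERABYTE, 10)
half_rate = transfer_time_seconds(ONE_TERABYTE, 10, efficiency=0.5)
print(f"1 TB at 10 Gb/s: {ideal:.0f} s ideal, {half_rate:.0f} s at 50% efficiency")
```

Under these assumptions a terabyte moves in roughly a quarter of an hour at the nominal rate, which is why a shared 10 Gb/s backbone makes it practical to keep data at one site and compute at another.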

18 The DOE KBase Cloud Built on the DOE ASCR investment in the Magellan cloud infrastructure Current of 700 nodes homed at ANL for heterogeneous Open Stack Argonne Open Stack Oak Ridge Cluster Berkeley Cluster Brookhaven

19 The KBase Cloud Architecture Data Intensive Science KBase Application Development Large Scale Computation Method Development HPC Cluster Image MapReduce Image Ubuntu Image KBase Image OpenStack IaaS Cloud Software Stack (EC2/S3 APIs) Commodity Compute Cluster Hardware

20 The KBase Services Services Oriented Architecture: The KBase Unified API access to a highly diverse set of services ranging from quick retrieval of simple data to massive computations on the KBase Cloud. In a SOA the system is functionally decomposed into many services, each of which is implemented as one or more servers. Our long term goal includes community developed and contributed services. Our initial set of services will be backed by the following example servers: Genomic Servers, Protein Family Servers, Phenotype Servers, Polymorphism Servers, Compound and Reaction Data Servers, Metabolic Modeling Servers, Expression Data Servers, Regulatory Models Servers.
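The decomposition described on this slide, one entry point in front of many independent services, can be sketched as a small dispatcher. This is only an illustration of the pattern: the `UnifiedAPI` class, the `service.method` naming, and the two toy handlers with their data are hypothetical and do not reflect how KBase itself is implemented.

```python
from typing import Any, Callable, Dict

Handler = Callable[..., Any]


class UnifiedAPI:
    """Hypothetical single entry point that routes calls to registered services."""

    def __init__(self) -> None:
        self._services: Dict[str, Dict[str, Handler]] = {}

    def register(self, service: str, method: str, handler: Handler) -> None:
        # Each service owns its own table of methods; new services can be
        # contributed without touching existing ones.
        self._services.setdefault(service, {})[method] = handler

    def call(self, qualified_name: str, **params: Any) -> Any:
        service, _, method = qualified_name.partition(".")
        try:
            handler = self._services[service][method]
        except KeyError:
            raise LookupError(f"unknown service method: {qualified_name}") from None
        return handler(**params)


# Toy backends standing in for a "Genomic Server" and a "Metabolic Modeling
# Server"; the records below are made-up placeholders.
_GENOMES = {"g.1": {"name": "Example genome", "genes": 3}}


def get_genome(genome_id: str) -> dict:
    return _GENOMES[genome_id]


def describe_model(model_id: str) -> dict:
    return {"model": model_id, "status": "placeholder"}


api = UnifiedAPI()
api.register("genomic", "get_genome", get_genome)
api.register("modeling", "describe_model", describe_model)

print(api.call("genomic.get_genome", genome_id="g.1"))
print(api.call("modeling.describe_model", model_id="m.1"))
```

In a real deployment each handler would live in its own server process, and the dispatcher would forward requests over the network rather than call functions directly; the registry is what leaves room for community-contributed services.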

21 Concept: KBase User Experience

22 Development Schedule A series of system builds occurring every quarter will enable a graded process. Successive builds will expand community involvement.


Introduc)on of Pla/orm ISF. Weina Ma Weina.Ma@uoit.ca

Introduc)on of Pla/orm ISF. Weina Ma Weina.Ma@uoit.ca Introduc)on of Pla/orm ISF Weina Ma Weina.Ma@uoit.ca Agenda Pla/orm ISF Product Overview Pla/orm ISF Concepts & Terminologies Self- Service Applica)on Management Applica)on Example Deployment Examples

More information

Expanding Assessment of Analy3cal Skills among Biology Majors: From Introductory labs to Upper Division Elec3ves

Expanding Assessment of Analy3cal Skills among Biology Majors: From Introductory labs to Upper Division Elec3ves Expanding Assessment of Analy3cal Skills among Biology Majors: From Introductory labs to Upper Division Elec3ves Presented by Kathleen McAuley PI: Serena Moseman- Val3erra, Ph.D. Department of Biological

More information

Storage Solutions for Bioinformatics

Storage Solutions for Bioinformatics Storage Solutions for Bioinformatics Li Yan Director of FlexLab, Bioinformatics core technology laboratory liyan3@genomics.cn http://www.genomics.cn/flexlab/index.html Science and Technology Division,

More information

Visualizing Networks: Cytoscape. Prat Thiru

Visualizing Networks: Cytoscape. Prat Thiru Visualizing Networks: Cytoscape Prat Thiru Outline Introduction to Networks Network Basics Visualization Inferences Cytoscape Demo 2 Why (Biological) Networks? 3 Networks: An Integrative Approach Zvelebil,

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop doc.hpccloud.surfsara.nl UvA workshop 2016-01-25 UvA HPC Course Jan 2016 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

ENZO UNIFIED SOLVES THE CHALLENGES OF REAL-TIME DATA INTEGRATION

ENZO UNIFIED SOLVES THE CHALLENGES OF REAL-TIME DATA INTEGRATION ENZO UNIFIED SOLVES THE CHALLENGES OF REAL-TIME DATA INTEGRATION Enzo Unified Solves Real-Time Data Integration Challenges that Increase Business Agility and Reduce Operational Complexities CHALLENGES

More information

Delivering the power of the world s most successful genomics platform

Delivering the power of the world s most successful genomics platform Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE

More information

Making Sense of Big Data. Dr. Thomas E. Potok Computa2onal Data Analy2cs Group Leader Oak Ridge Na2onal Laboratory potokte@ornl.

Making Sense of Big Data. Dr. Thomas E. Potok Computa2onal Data Analy2cs Group Leader Oak Ridge Na2onal Laboratory potokte@ornl. Making Sense of Big Data Dr. Thomas E. Potok Computa2onal Data Analy2cs Group Leader Oak Ridge Na2onal Laboratory potokte@ornl.gov 865-574- 0834 ORNL s Big Data Legacy Science National Security Energy

More information

April 20 th 2011, Internet2 Spring Member Mee5ng Aaron Brown Internet2. Circuit Monitoring for DYNES

April 20 th 2011, Internet2 Spring Member Mee5ng Aaron Brown Internet2. Circuit Monitoring for DYNES April 20 th 2011, Internet2 Spring Member Mee5ng Aaron Brown Internet2 Circuit Monitoring for DYNES Dynamic Circuits Scien5fic disciplines require greater network capacity and predictably to cope with

More information

Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee

Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee Technology in Pedagogy, No. 8, April 2012 Written by Kiruthika Ragupathi (kiruthika@nus.edu.sg) Computational thinking is an emerging

More information

FACULTY OF MEDICAL SCIENCE

FACULTY OF MEDICAL SCIENCE Doctor of Philosophy Program in Microbiology FACULTY OF MEDICAL SCIENCE Naresuan University 171 Doctor of Philosophy Program in Microbiology The time is critical now for graduate education and research

More information

HFAA: A Generic Socket API for Hadoop File Systems

HFAA: A Generic Socket API for Hadoop File Systems Second Workshop on Architectures and Systems for Big Data (ASBD 2012) June 9 th, 2012 HFAA: A Generic Socket API for Hadoop File Systems Adam Yee and Jeffrey Shafer University of the Pacific 2 Hadoop MapReduce

More information

Hadoop- Based Data Explora1on for the Healthcare Safety- Net Technical & Sociocultural Challenges to Big Data Usability

Hadoop- Based Data Explora1on for the Healthcare Safety- Net Technical & Sociocultural Challenges to Big Data Usability Hadoop- Based Data Explora1on for the Healthcare Safety- Net Technical & Sociocultural Challenges to Big Data Usability David Hartzband, D.Sc. Research Affiliate, SSRC, MIT & Director, Technology Research

More information

May 13-14, 2015. Copyright 2015 Open Networking User Group. All Rights Reserved Confiden@al Not For Distribu@on

May 13-14, 2015. Copyright 2015 Open Networking User Group. All Rights Reserved Confiden@al Not For Distribu@on May 13-14, 2015 NSV Architecture Test Architecture System Under Test Mgmt, Orch, etc. Test Solution VM VM Hypervisor Hypervisor IP Network Methodology Each individual requirement had 1 test case associated

More information

Distributed Systems Interconnec=ng Them Fundamentals of Distributed Systems Alvaro A A Fernandes School of Computer Science University of Manchester

Distributed Systems Interconnec=ng Them Fundamentals of Distributed Systems Alvaro A A Fernandes School of Computer Science University of Manchester Distributed Systems Interconnec=ng Them Fundamentals of Distributed Systems lvaro Fernandes School of Computer Science University of Manchester Goals 1. To highlight the role of the interconnect in characterizing

More information

Copernicus Space Component Ground Segment and Opera4ons Concept

Copernicus Space Component Ground Segment and Opera4ons Concept Copernicus Space Component Ground Segment and Opera4ons Concept G. Kohlhammer, P. Bargellini, E. Monjoux Ground Segment and Mission Opera=ons Department, Earth Observa=on Programmes Directorate, European

More information

Big Data, Big Challenges

Big Data, Big Challenges Big Data, Big Challenges Big Data, Big Challenges DeIC Conference 2013 Michael Sullivan, M.D. Big Data Variety Volume Visualiza0on Velocity Variety Roger Ebert 1942-2013 Roger was diagnosed with cancer

More information

An introduction to bioinformatic tools for population genomic and metagenetic data analysis, 2.5 higher education credits Third Cycle

An introduction to bioinformatic tools for population genomic and metagenetic data analysis, 2.5 higher education credits Third Cycle An introduction to bioinformatic tools for population genomic and metagenetic data analysis, 2.5 higher education credits Third Cycle Faculty of Science; Department of Marine Sciences The Swedish Royal

More information

Linked Science as a producer and consumer of big data in the Earth Sciences

Linked Science as a producer and consumer of big data in the Earth Sciences Linked Science as a producer and consumer of big data in the Earth Sciences Line C. Pouchard,* Robert B. Cook,* Jim Green,* Natasha Noy,** Giri Palanisamy* Oak Ridge National Laboratory* Stanford Center

More information

2) Xen Hypervisor 3) UEC

2) Xen Hypervisor 3) UEC 5. Implementation Implementation of the trust model requires first preparing a test bed. It is a cloud computing environment that is required as the first step towards the implementation. Various tools

More information

Disaster Recovery Planning and Implementa6on. Chris Russel Director, IT Infrastructure and ISO Compu6ng and Network Services York University

Disaster Recovery Planning and Implementa6on. Chris Russel Director, IT Infrastructure and ISO Compu6ng and Network Services York University Disaster Recovery Planning and Implementa6on Chris Russel Director, IT Infrastructure and ISO Compu6ng and Network Services York University Agenda Background for York s I.T. Disaster Recovery Planning

More information

Linux Clusters Ins.tute: Turning HPC cluster into a Big Data Cluster. A Partnership for an Advanced Compu@ng Environment (PACE) OIT/ART, Georgia Tech

Linux Clusters Ins.tute: Turning HPC cluster into a Big Data Cluster. A Partnership for an Advanced Compu@ng Environment (PACE) OIT/ART, Georgia Tech Linux Clusters Ins.tute: Turning HPC cluster into a Big Data Cluster Fang (Cherry) Liu, PhD fang.liu@oit.gatech.edu A Partnership for an Advanced Compu@ng Environment (PACE) OIT/ART, Georgia Tech Targets

More information

Application of Graph-based Data Mining to Metabolic Pathways

Application of Graph-based Data Mining to Metabolic Pathways Application of Graph-based Data Mining to Metabolic Pathways Chang Hun You, Lawrence B. Holder, Diane J. Cook School of Electrical Engineering and Computer Science Washington State University Pullman,

More information

Toward a Unified Ontology of Cloud Computing

Toward a Unified Ontology of Cloud Computing Toward a Unified Ontology of Cloud Computing Lamia Youseff University of California, Santa Barbara Maria Butrico, Dilma Da Silva IBM T.J. Watson Research Center 1 In the Cloud Several Public Cloud Computing

More information

Lecture 11 Data storage and LIMS solutions. Stéphane LE CROM lecrom@biologie.ens.fr

Lecture 11 Data storage and LIMS solutions. Stéphane LE CROM lecrom@biologie.ens.fr Lecture 11 Data storage and LIMS solutions Stéphane LE CROM lecrom@biologie.ens.fr Various steps of a DNA microarray experiment Experimental steps Data analysis Experimental design set up Chips on catalog

More information

An Advanced Performance Architecture for Salesforce Native Applications

An Advanced Performance Architecture for Salesforce Native Applications An Advanced Performance Architecture for Salesforce Native Applications TABLE OF CONTENTS Introduction............................................... 3 Salesforce in the Digital Transformation Landscape...............

More information

Cloud Ready for Bioinformatics?

Cloud Ready for Bioinformatics? IDB acknowledges co-funding by the European Community's Seventh Framework Programme (INFSO-RI-261552) and the French National Research Agency's Arpege Programme (ANR-10-SEGI-001) Cloud Ready for Bioinformatics?

More information

Case Studies in Solving Testing Constraints using Service Virtualization

Case Studies in Solving Testing Constraints using Service Virtualization Case Studies in Solving Testing Constraints using Service Virtualization Rix.Groenboom@Parasoft.NL 2/21/14 1 Introduction Paraso& is supplier automated tes1ng solu1ons Since 1984, Los Angeles (US) and

More information

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof. CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Cloud Computing and Amazon Web Services Cloud Computing Amazon

More information

HP Converged Cloud Cloud Platform Overview. Shane Pearson Vice President, Portfolio & Product Management

HP Converged Cloud Cloud Platform Overview. Shane Pearson Vice President, Portfolio & Product Management HP Converged Cloud Cloud Platform Overview Shane Pearson Vice President, Portfolio & Product Management Cloud is the biggest disruption since the Internet 1970-80s Mainframe 1990s Client/Server 2000s The

More information