Big Data, Big Challenges


1 Big Data, Big Challenges DeIC Conference 2013 Michael Sullivan, M.D.

2 Big Data: Variety, Volume, Visualization, Velocity

3 Variety: Roger Ebert. Roger was diagnosed with cancer of a salivary gland in 2002 and died in [...]. "What I believe is that all clear-minded people should remain two things throughout their lifetimes: curious and teachable."

4 Testing for Cancer Today Diagnosis: Sublingual Adenocarcinoma (rare). Standard treatment: None. Tissue biopsies showed a rare cancer. Serial CT scans of the lung showed disease progression.

5 Cancer Diagnosis and Treatment in the Future Central Dogma of Molecular Biology

6 DNA Sequencing Results This Circos plot shows gene expression before (T1) and after (T2) treatment.

7 Signaling Network 3D RET protein structure. Disrupted signaling pathways drive tumor proliferation.

8 Heterogeneous Data

9 NIH/NCI Cancer Genomics Cloud Initiative
- The Cloud: move the compute, not the data.
- 3 pilot centers, maybe CGHub, BioNimbus, and Broad.
- Preloaded with tools and data (TCGA will have 2.5 PB).
- Assumes existing infrastructure base.
- Estimated $5 million per site per year.

10 Trans-NIH: BD2K and Infrastructure Plus
- Big Data to Knowledge (BD2K) has 4 parts:
  - Facilitating Broad Use of Data (catalog, metadata)
  - Analysis Methods and Software (access to HPC)
  - Enhancing Training (data science)
  - Centers of Excellence (6-8 centers)
- Infrastructure Plus: Intramural (campus) upgrade

11 Volume May 5, 2013
mega = $10^{6}$, giga = $10^{9}$, tera = $10^{12}$, peta = $10^{15}$, exa = $10^{18}$, zetta = $10^{21}$, yotta = $10^{24}$, googol = $10^{100}$, googolplex = $10^{\text{googol}}$

12 Growth of Storage at NCBI

13 NCBI Outbound Data (TB/Month) [Chart: total outbound data by month]

14 Information Content: $u = \log_2(n)$
in silico: $n = 2$, $u = 1$; 1 byte = 8 bits
in vivo: $n = 4$, $u = 2$; 1 byte = 4 base pairs
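The relation on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function names `bits_per_symbol` and `symbols_per_byte` are hypothetical, and it assumes every symbol in the alphabet is equally likely, which is what $u = \log_2(n)$ implies.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Information content u = log2(n) for n equally likely symbols."""
    return math.log2(alphabet_size)


def symbols_per_byte(alphabet_size: int, bits_per_byte: int = 8) -> float:
    """How many symbols of the given alphabet fit into one byte."""
    return bits_per_byte / bits_per_symbol(alphabet_size)


if __name__ == "__main__":
    # in silico: binary alphabet {0, 1}
    print("binary:", bits_per_symbol(2), "bit per symbol")
    # in vivo: DNA alphabet {A, C, G, T}
    print("DNA:", bits_per_symbol(4), "bits per base pair")
    print("DNA:", symbols_per_byte(4), "base pairs per byte")
```

With a four-letter alphabet each base pair carries 2 bits, so one 8-bit byte holds 4 base pairs, matching the slide.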

15 Storage in silico vs. in vivo
NCBI or EBI: 20 Petabytes
Human: 160 Zettabytes*
Ratio: 1 : 8,000,000
* Calculation: $\frac{6.4\ \text{Gbp}}{\text{cell}} \times \frac{1\ \text{Byte}}{4\ \text{bp}} \times \frac{10\ \text{microbe bp}}{1\ \text{human bp}} \times \frac{10^{13}\ \text{cells}}{\text{person}} = \frac{160\ \text{ZB}}{\text{person}}$
GB = Gigabyte ($10^{9}$), PB = Petabyte ($10^{15}$), ZB = Zettabyte ($10^{21}$), bp = base pair, Gbp = Gigabase pair ($10^{9}$)
Assume all cells are diploid.
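The arithmetic behind the 160 ZB figure and the 1 : 8,000,000 ratio can be reproduced as a rough sketch. The constant and function names are hypothetical. The cell count of 10^13 per person is an assumption: the exponent is not legible in the transcription, and this is the value for which the product equals the stated 160 ZB.

```python
# Slide values; all arithmetic is done with integers to avoid rounding.
BP_PER_CELL = 6_400_000_000       # 6.4 Gbp per diploid human cell
BP_PER_BYTE = 4                   # 2 bits per base pair, 8 bits per byte
MICROBE_BP_PER_HUMAN_BP = 10      # microbial DNA relative to human DNA
CELLS_PER_PERSON = 10**13         # ASSUMPTION: inferred from the 160 ZB result
ARCHIVE_BYTES = 20 * 10**15       # NCBI or EBI: 20 petabytes

ZETTABYTE = 10**21


def bytes_per_person() -> int:
    """Bytes needed to store all DNA carried by one person (slide's model)."""
    bytes_per_cell = BP_PER_CELL // BP_PER_BYTE
    return bytes_per_cell * MICROBE_BP_PER_HUMAN_BP * CELLS_PER_PERSON


def archive_to_person_ratio() -> int:
    """How many archives of ARCHIVE_BYTES equal one person's DNA storage."""
    return bytes_per_person() // ARCHIVE_BYTES


if __name__ == "__main__":
    print(bytes_per_person() // ZETTABYTE, "ZB per person")
    print("ratio 1 :", f"{archive_to_person_ratio():,}")
```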

16 Visualization
Physics: LHC Lead Ion Collision. Source: CERN (ALICE detector)
Life Sciences: MRI Monkey Brain. Source: Van Wedeen, M.D., Martinos Center and Dept. of Radiology, Massachusetts General Hospital and Harvard University Medical School
10/1/13, 2012 Internet2

17 Circos Mapping of Dog and Human Genes Human: 23 chromosomes, 3.1 Gb. Dog: 39 chromosomes, 2.4 Gb.

18 Human Dog Synteny by Dog Chromosome

19 Protein-Protein Interactions

20 The Human Connectome

21 Velocity

22 White House Champions of Open Science
- David Lipman, NCBI
- Atul Butte, Stanford University
- David Altshuler, Broad Institute
- Stephen Friend, Sage Bionetworks

23 Science DMZ http://fasterdata.es.net/science-dmz/science-dmz-security/

24 Growth of perfSONAR: ~30 Countries, ~200 Domains, ~850 Instances

25 NCBI EBI Performance
Problem: 10 Mb stream (expected 500 Mb)
Solution: perfSONAR at the endpoints localized the problem

26 US-China 10 Gbps Link
Dr. Lin Fang, Dr. Dawei Lin
Transfer of Sample.fa (24 GB):
- FedEx: 2 days
- Internet + FTP: 26 hours
- China-US 10G Link: 30 seconds
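To put the three transfer times for the 24 GB file on a common scale, the sketch below converts each into an effective throughput and compares the result with the ideal time on a fully used 10 Gbps link. The helper names are hypothetical, and it assumes decimal units (1 GB = 10^9 bytes, 1 Gbps = 10^9 bits per second) and no protocol overhead.

```python
FILE_BYTES = 24e9   # Sample.fa, 24 GB (decimal gigabytes assumed)
LINK_BPS = 10e9     # nominal capacity of the 10 Gbps link

# Transfer times from the slide, converted to seconds.
TRANSFER_SECONDS = {
    "FedEx": 2 * 24 * 3600,
    "Internet + FTP": 26 * 3600,
    "China-US 10G link": 30,
}


def effective_mbps(file_bytes: float, seconds: float) -> float:
    """Average throughput in megabits per second for a completed transfer."""
    return file_bytes * 8 / seconds / 1e6


def ideal_seconds(file_bytes: float, link_bps: float) -> float:
    """Lower bound on transfer time if the link were fully utilized."""
    return file_bytes * 8 / link_bps


if __name__ == "__main__":
    for method, seconds in TRANSFER_SECONDS.items():
        print(f"{method}: {effective_mbps(FILE_BYTES, seconds):.2f} Mbps")
    print(f"ideal at 10 Gbps: {ideal_seconds(FILE_BYTES, LINK_BPS):.1f} s")
```

Under these assumptions the 30-second transfer corresponds to about 6.4 Gbps, roughly two thirds of the nominal link capacity, while the 26-hour FTP transfer averages about 2 Mbps.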

27 Scalability 300 million light years ~ meter Galaxy Neuron

28 NIST Big Data Reference Architecture (NIST RA M0226v5)
[Diagram labels: Information Value Chain; IT Value Chain; System Orchestrator; Data Provider; Application Provider (Collection, Curation, Analytics, Visualization, Access); IT Provider: Big Data Processing Frameworks (analytic tools, etc.), Big Data Platforms (databases, etc.), Infrastructures (VM clusters), each horizontally scalable and vertically scalable; Physical and Virtual Resources (Networking, Computing, etc.); Data Consumer; Security & Privacy; Management; flows marked DATA and SW]
09/04/2013 NIST Big Data WG / Ref Arch Subgroup

29 Internet2 Network Infrastructure Topology

30 Innovation Platform
[Diagram labels: 100 GigE Layer 2 Connection; Science DMZ; Software Defined Networking; Internet2 innovation backbone delivered as 100G L1; High-Performance Layer 2/3 Switch/Router; SDN Control Server; Performance Node; Switches, data stores for data-intensive science; R&E IP; TR-CPS IP; Network Layer 3; Your Research; Static Layer 2; Dynamic Layer 2; GENI Experiments; GENI?; Traditional regional and commodity providers; Traditional Campus Border Router; Traditional L3 Campus Border Security; Campus Enterprise Network; Traditional Services; Traditional Switch Substrate; Optical System; Dark Fiber; Innovation Services; Software Defined Networking Substrate]
For more information, see fasterdata.es.net

31 Condo of Condos

32 Democratization of Sequencing: 2,386 Genome Sequencers Worldwide (30 May 2013). Source: Map of High-throughput Sequencers

33 National Cyberinfrastructure
[Diagram labels: XSEDE; NSF-funded Supercomputers; HPC resources; Internet2; 250 universities; XSEDEnet; NCGAS; Indiana University; TACC; SDSC; PSC]
Source: https://

34 NCGAS Virtual Instrument
[Diagram labels: Indiana University; 6 PB Storage; TACC; NSF-Funded or XSEDE Allocation; Federally Funded; NCGAS Galaxy Portal; POD Galaxy Portal; Mason; POD; 5 PB; D.C.; 100 Gig Internet2; 5.5 PB Storage; 4 PB Storage; SDSC; PSC; Sequencing Center; 10 Gig NLR; NCBI]
Source: Barnett, W.K., and R.D. LeDuc, Next Generation Cyberinfrastructures for Next Generation Sequencing and Genome Science, presented at 2013 AAMC GIR Conference, Vancouver, BC

35 Apache Hadoop Ecosystem

36 AWS Cluster Pricing

37 HPC and Cloud Convergence?

38 OpenStack


More information

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases

Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases NASA Ames NASA Advanced Supercomputing (NAS) Division California, May 24th, 2012 Cloud Computing Architecture with OpenNebula HPC Cloud Use Cases Ignacio M. Llorente Project Director OpenNebula Project.

More information

Cyber Security With Big Data

Cyber Security With Big Data Cyber Security With Big Data Fast. Complete. Cost-Effec1ve. Harry J Foxwell, PhD Principal Consultant Oracle Public Sector Oct 2015 Safe Harbor Statement The following is intended to outline our general

More information

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing An Alternative Storage Solution for MapReduce Eric Lomascolo Director, Solutions Marketing MapReduce Breaks the Problem Down Data Analysis Distributes processing work (Map) across compute nodes and accumulates

More information

Enterprise Data Center Networks

Enterprise Data Center Networks Enterprise Data Center Networks Isabelle Guis Big Switch Networks Vice President of Outbound Marketing ONF Market Education Committee Chair 1 This Session Objectives Leave with an understanding of Data

More information

UCLA Campus Cyberinfrastructure Plan

UCLA Campus Cyberinfrastructure Plan UCLA Campus Cyberinfrastructure Plan Executive Summary Excellence and Engagement: An Academic Plan for UCLA gives voice to UCLA s aspiration to be nothing less than [an] exemplar of the American public

More information

CASC Autumn Meeting 2015 Data Activities Panel 14 October 2015. ACI Data Program : an Integrative, Evolving Portfolio with a View towards the Horizon

CASC Autumn Meeting 2015 Data Activities Panel 14 October 2015. ACI Data Program : an Integrative, Evolving Portfolio with a View towards the Horizon CASC Autumn Meeting 2015 Data Activities Panel 14 October 2015 ACI Data Program : an Integrative, Evolving Portfolio with a View towards the Horizon Robert Chadduck Program Director, Data & CI Program

More information

Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery

Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery Center for Information Services and High Performance Computing (ZIH) Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery Richard Grunzke*, Jens Krüger, Sandra Gesing, Sonja

More information

BRINGING NETWORKS TO THE CLOUD ERA

BRINGING NETWORKS TO THE CLOUD ERA BRINGING NETWORKS TO THE CLOUD ERA SDN enables new business models Aruna Ravichandran VICE PRESIDENT, MARKETING AND STRATEGY ARAVICHANDRAN@JUNIPER.NET SOFTWARE DEFINED NETWORKING (SDN), JUNIPER NETWORKS

More information

PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD. Natasha Balac, Ph.D.

PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD. Natasha Balac, Ph.D. PACE Predictive Analytics Center of Excellence @ San Diego Supercomputer Center, UCSD Natasha Balac, Ph.D. Brief History of SDSC 1985-1997: NSF national supercomputer center; managed by General Atomics

More information

Big Data a threat or a chance?

Big Data a threat or a chance? Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but

More information

SC12 Cloud Compu,ng for Science Tutorial: Introduc,on to Infrastructure Clouds

SC12 Cloud Compu,ng for Science Tutorial: Introduc,on to Infrastructure Clouds SC12 Cloud Compu,ng for Science Tutorial: Introduc,on to Infrastructure Clouds John Breshnahan, Patrick Armstrong, Kate Keahey, Pierre Riteau Argonne National Laboratory Computation Institute, University

More information

Introduction to the Mathematics of Big Data. Philippe B. Laval

Introduction to the Mathematics of Big Data. Philippe B. Laval Introduction to the Mathematics of Big Data Philippe B. Laval Fall 2015 Introduction In recent years, Big Data has become more than just a buzz word. Every major field of science, engineering, business,

More information

High Performance Compu2ng Facility

High Performance Compu2ng Facility High Performance Compu2ng Facility Center for Health Informa2cs and Bioinforma2cs Accelera2ng Scien2fic Discovery and Innova2on in Biomedical Research at NYULMC through Advanced Compu2ng Efstra'os Efstathiadis,

More information

addition to upgrading connectivity between the PoPs to 100Gbps, GPN is pursuing additional collocation space in Kansas City and is in the pilot stage

addition to upgrading connectivity between the PoPs to 100Gbps, GPN is pursuing additional collocation space in Kansas City and is in the pilot stage Cyberinfrastructure Plan GPN GPN connects research and education networks in 6 states to the Internet2 backbone and to ESnet. GPN provides redundant connectivity for several other affiliated campuses and

More information

Behind the scene III Cloud computing

Behind the scene III Cloud computing Behind the scene III Cloud computing Athens, 15.11.2014 M. Dolenc / R. Klinc Why we do it? Engineering in the cloud is a combina3on of cloud based services and rich interac3ve applica3ons allowing engineers

More information

Apache Hadoop FileSystem and its Usage in Facebook

Apache Hadoop FileSystem and its Usage in Facebook Apache Hadoop FileSystem and its Usage in Facebook Dhruba Borthakur Project Lead, Apache Hadoop Distributed File System dhruba@apache.org Presented at Indian Institute of Technology November, 2010 http://www.facebook.com/hadoopfs

More information

Cisco Prime Network Services Controller. Sonali Kalje Sr. Product Manager Cloud and Virtualization, Cisco Systems

Cisco Prime Network Services Controller. Sonali Kalje Sr. Product Manager Cloud and Virtualization, Cisco Systems Cisco Prime Network Services Controller Sonali Kalje Sr. Product Manager Cloud and Virtualization, Cisco Systems Agenda Cloud Networking Challenges Prime Network Services Controller L4-7 Services Solutions

More information

Software Defined Networking - a new approach to network design and operation. Paul Horrocks Pre-Sales Strategist 8 th November 2012

Software Defined Networking - a new approach to network design and operation. Paul Horrocks Pre-Sales Strategist 8 th November 2012 Software Defined Networking - a new approach to network design and operation Paul Horrocks Pre-Sales Strategist 8 th November 2012 Agenda What is Software Defined Networking What is the value of Software

More information

www.basho.com Technical Overview Simple, Scalable, Object Storage Software

www.basho.com Technical Overview Simple, Scalable, Object Storage Software www.basho.com Technical Overview Simple, Scalable, Object Storage Software Table of Contents Table of Contents... 1 Introduction & Overview... 1 Architecture... 2 How it Works... 2 APIs and Interfaces...

More information

The Real Score of Cloud

The Real Score of Cloud The Real Score of Cloud Mayur Sahni Sr. Research Manger IDC Asia/Pacific msahni@idc.com @mayursahni Digital Transformation Changing Role of IT Innova&on Informa&on Business agility Changing role of the

More information

MapReduce and Hadoop Distributed File System V I J A Y R A O

MapReduce and Hadoop Distributed File System V I J A Y R A O MapReduce and Hadoop Distributed File System 1 V I J A Y R A O The Context: Big-data Man on the moon with 32KB (1969); my laptop had 2GB RAM (2009) Google collects 270PB data in a month (2007), 20000PB

More information

Core and Pod Data Center Design

Core and Pod Data Center Design Overview The Core and Pod data center design used by most hyperscale data centers is a dramatically more modern approach than traditional data center network design, and is starting to be understood by

More information