Mapping Informatics to the Cloud. 2012 AIRI Petabyte Challenge. Chris Dagdigian chris@bioteam.net




I'm Chris. I'm an infrastructure geek. I work for the BioTeam.

The C Word.

When I say "cloud" I'm talking IaaS.

Amazon AWS is the IaaS cloud. Most others are fooling themselves. (Has-beens, also-rans & delusional marketing zombies.)

A message for the pretenders

No APIs? Not a cloud.

No self-service? Not a cloud.

I have to email a human? Not a cloud.

~50% failure rate when provisioning new servers? Stupid cloud.

Block storage and virtual servers only? (Barely) a cloud.

Private Clouds: My $.02

Private Clouds in 2012: the hype vs. reality ratio is still wacky. Sensible only for certain shops. Have you seen what you have to do to your networks & gear? There are easier ways.

Private Clouds: My Advice for '12. Remain cynical (test vendor claims). Due diligence is still essential. I personally would not deploy/buy anything that does not explicitly provide Amazon API compatibility.

Private Clouds: My Advice for '12. Most people are better off: adding VM platforms to existing HPC clusters & environments; extending enterprise VM platforms to allow user self-service & server catalogs.

Enough Bloviating. Advice time.

Tip #1

HPC & Clouds: Whole New World

We have spent decades learning to tune research HPC systems for shared access & many users. The cloud upends this model.

Far more common to see: dedicated cloud resources spun up for each app or use case; each system gets individually tuned & optimized.

Tip #2

Hybrid Clouds & Cloud Bursting

Lots of aggressive marketing. Lots of carefully constructed case studies and prototypes. The truth? Less usable than you've been told. Possible? Heck yeah. Practical? Only sometimes.

Advice: Be cynical. Demand proof. Test carefully.

Still want to do it? Buy it, don't build it: Cycle Computing, Univa, Bright Computing.

Follow the crowd. In the real world we see separation between local and cloud HPC resources. Send your work to the system most suitable.

Tip #3

You can't rewrite EVERYTHING.

Salesfolk will glibly tell you to rewrite your apps so you can use whatever big-data analysis framework they happen to be selling today.

They have no clue.

In life science informatics we have hundreds of codes that will never be rewritten. We'll be needing them for years to come.

Advice: MapReduce-ish methods are the future for big-data informatics. It will take years to get there. We still have to deal with legacy algorithms and codes.

You will need: a process for figuring out when it's worthwhile to rewrite/re-architect, and tested cloud strategies for handling three use cases.

You need 3 cloud architectures: 1. Legacy HPC 2. Cloudy HPC 3. Big Data HPC (Hadoop)

Legacy HPC on the cloud: MIT StarCluster (http://web.mit.edu/star/cluster/). This is your baseline; extend as needed.
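As a hedged sketch of what "baseline" looks like in practice (the cluster name below is a placeholder, not from the slides), the typical StarCluster workflow is:

```shell
# install the toolkit; the first run of "starcluster help" offers to
# write a starter config at ~/.starcluster/config (AWS keys, templates)
pip install StarCluster
starcluster help

# launch an SGE cluster from a template defined in the config file;
# "mycluster" is an illustrative name for this sketch
starcluster start mycluster

# log in to the master, push input data, pull results back down
starcluster sshmaster mycluster
starcluster put mycluster ./input-data /home/sgeadmin/
starcluster get mycluster /home/sgeadmin/results ./

# shut everything down so the meter stops running
starcluster terminate mycluster
```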

Cloudy HPC. Use this method when it makes sense to rewrite or rearchitect an HPC workflow to better leverage modern cloud capabilities.

Cloudy HPC, continued. Ditch the legacy compute farm model. Leverage elastic scale-out tools: Spot Instances for elastic & cheap compute; SimpleDB for job statekeeping; SQS for job queues & workflow glue; SNS for message passing & monitoring; S3 for input & output data; etc.

Big Data HPC. It's gonna be a MapReduce world. Little need to roll your own: the ecosystem is already healthy, with multiple providers today. Often a slam-dunk cloud use case.
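The MapReduce contract the slide alludes to is easy to state in miniature. This toy single-process sketch (my own illustration, not any particular framework's API) shows the map and reduce phases that systems like Hadoop distribute across many nodes:

```python
from collections import defaultdict

def map_phase(records):
    # map: emit a (word, 1) pair for every word in every record
    for record in records:
        for word in record.split():
            yield (word, 1)

def reduce_phase(pairs):
    # shuffle + reduce: group pairs by key and sum the counts
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

reads = ["GATTACA GATTACA", "TAGGED GATTACA"]
results = reduce_phase(map_phase(reads))
print(results)  # {'GATTACA': 3, 'TAGGED': 1}
```

In a real framework the mappers and reducers run on different machines; the point here is only that workloads expressible in this shape parallelize almost for free.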

Tip #4

The Cloud was not designed for us

HPC is an edge case for the hyperscale IaaS clouds. We need to deal with this and engineer around it.

Many examples: eventual consistency, networking & subnets, latency, node placement.

Advice: Manage expectations. Benchmark & test. Evangelize (pester the cloud sales reps).

Tip #5

Data Movement Is Still Hard

Consistently getting easier. Amazon is not a bottleneck: AWS Import/Export, AWS Direct Connect. Aspera has some amazing stuff out right now.

Advice: AWS Import/Export works well. Size of pipe is not everything; sweat the small stuff: tracking, checksums, disk speed, dedicated workstations, secure media storage.
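One concrete piece of the "sweat the small stuff" advice is checksumming every file before and after it moves. A minimal streaming-checksum sketch in Python (the function name and chunk size are my own choices):

```python
import hashlib

def file_checksum(path, algo="sha256", chunk_size=1 << 20):
    # stream the file in 1 MiB chunks so terabyte-scale files
    # never have to fit in memory
    digest = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Record the digests in a manifest at the source, recompute them at the destination, and treat any mismatch as a failed transfer.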

Dedicated data movement station

naked Terabyte-scale data movement

Don't overlook media storage

Advice for 2012: BioTeam is dialing down our advocacy of physical data ingestion into the cloud. Why? Operationally hard, expensive, and no longer strictly needed.

Real world cross-country internet-based data movement March 2012

700Mb/sec into Amazon, stress-free & zero tuning March 2012

People trying to move data via physical media quickly realize the operational difficulties. Bandwidth is cheaper than hiring another body to manage physical data ingestion & movement. In 2012 we strongly recommend network-based data movement whenever possible.

u r doing it wrong

cool data movement, bro!

Tips #6 & 7

Cloud storage. Still slow.

Big shared storage. Still hard.

Not much we can do except engineer around it. AWS Cluster Compute instances are a huge step forward. AWS competitors, take note.

We are not database nerds. We care about more than just random I/O performance. We need it all: random I/O and long sequential reads/writes.

Faster storage options: software RAID on EBS; various GlusterFS options. Even if you optimize everything, the virtual NICs are still a bottleneck.

Big shared storage: 10GbE nodes and NFS; software RAID sets; GlusterFS or similar. 2012: pNFS, finally?

Tip #8

Things fail differently in the cloud.

Stuff breaks. It breaks in weird ways. Transient/temporary issues are more common than what we see at home.

Advice: Pessimism is good. Design for failure. Think hard about: How will you detect? How will you respond?

Advice: Remove humans from the loop. Automate recovery. Automate your backups.
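A minimal sketch of the "automate recovery" idea: retry transient failures with exponential backoff plus jitter, and only surface the error (to monitoring, not a human inbox) once retries are exhausted. The function name and defaults are illustrative assumptions, not from the slides:

```python
import random
import time

def with_retries(operation, attempts=5, base_delay=1.0):
    # retry a flaky operation with exponential backoff and jitter;
    # transient cloud failures usually clear after a short wait
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # give up: let monitoring/alerting take over
            # jittered backoff avoids a thundering herd of retries
            time.sleep(base_delay * (2 ** attempt) * random.random())
```

Wrapping every cloud API call and data transfer in something like this is usually the difference between a pipeline that self-heals overnight and one that pages you at 3 a.m.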

Tip #9

Serial/batch computing at scale

Loosely coupled workflows are ideal. Break the pipeline into discrete components. Components should be able to scale up/down independently.

Component = opportunity to: make a scaling decision (# of nodes in use); make a sizing decision (instance type in use).

Nirvana is

independent, loosely connected components that can self-scale and communicate asynchronously
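That nirvana can be sketched in-process: below, Python queues and threads stand in for SQS queues and worker instances, and each stage scales independently by changing its worker count. All names are illustrative:

```python
import queue
import threading

def stage(worker, inbox, outbox, workers=2):
    # each pipeline component reads from its own queue and writes to the
    # next one; the worker count is this component's scaling decision
    def loop():
        while True:
            item = inbox.get()
            if item is None:        # poison pill: shut this worker down
                inbox.task_done()
                break
            outbox.put(worker(item))
            inbox.task_done()
    threads = [threading.Thread(target=loop) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads

# wire up one stage with 3 workers and push 5 jobs through it
jobs, results = queue.Queue(), queue.Queue()
threads = stage(lambda x: x * x, jobs, results, workers=3)
for i in range(5):
    jobs.put(i)
for _ in threads:                   # one poison pill per worker
    jobs.put(None)
for t in threads:
    t.join()
```

Swap the in-memory queues for SQS and the threads for EC2 instances and the same shape becomes the asynchronous, self-scaling architecture the slide describes.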

Advice: Many people are already doing this, and best practices are well known. Steal from the best: RightScale, Opscode & Cycle Computing.

Phew. Think I'm done now.

Questions? Slides available at http://slideshare.net/chrisdag/

End;

Backup Slides

Private Clouds: Pick Your Poison. OpenStack - http://openstack.org Pro: super smart developers; significant mindshare; true open source. Con: commitment to AWS API compatibility (?) & stability.

Private Clouds: Pick Your Poison. CloudStack - http://cloudstack.org Pro: explicit AWS API support; very recent move away from the open-core model; usability. Con: developer mindshare? Sudden switch to Apache.

Private Clouds: Pick Your Poison. Eucalyptus - http://eucalyptus.com Pro: direct AWS API compatibility; lots of hypervisor support. Con: open-core model; mindshare; recent resurrection.