MAGELLAN (SciDAC Review, Spring 2010, www.scidacreview.org)


Exploring Cloud Computing for DOE's Scientific Mission

Cloud computing is gaining traction in the commercial world, with companies like Amazon, Google, and Yahoo offering pay-to-play cycles to help organizations meet cyclical demands for extra computing power. But can such an approach also meet the computing and data storage demands of the nation's scientific community?

A new $32 million program funded by the American Recovery and Reinvestment Act through the U.S. Department of Energy (DOE) will examine cloud computing as a cost-effective and energy-efficient computing paradigm for mid-range science users, with the aim of accelerating discoveries in a variety of disciplines, including the analysis of scientific datasets in biology, climate change, and physics.

DOE is a world leader in providing high-performance computing resources for science, with the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL) supporting the high-end computing needs of over 3,000 DOE Office of Science researchers, and the Leadership Computing Facilities at Argonne and Oak Ridge National Laboratories serving the largest-scale computing projects across the broader science community through the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. The focus of these facilities is on providing access to some of the world's most powerful supercomputing systems, which are specifically designed for high-end scientific computing.

Interestingly, some of the science demands for DOE computing resources do not require the scale of these well-balanced petascale machines. A great deal of computational science today is conducted on personal laptops or desktop computers, or on small private computing clusters set up by individual researchers or small collaborations at their home institutions. Local clusters have also been ideal for researchers who co-design complex problem-solving software infrastructures for the platforms in addition to running their simulations. Users with computational needs that fall between desktop and petascale systems are often referred to as mid-range, and they are the target for the Magellan cloud projects.

In the past, mid-range users were enticed to set up their own purpose-built clusters for developing codes, running custom software, or solving computationally inexpensive problems because hardware was relatively cheap. However, the costs incurred by ownership, including ever-rising energy bills, space constraints for hardware, ongoing software maintenance, security, operations, and a variety of other expenses, are forcing mid-range researchers and their funders to look for more cost-efficient alternatives. Some experts suspect that cloud computing may be a viable solution.

Cloud computing refers to a flexible model for on-demand access to a shared pool of configurable computing resources (such as networks, servers, storage, applications, services, and software) that can be easily provisioned as needed. Cloud computing centralizes the resources to gain efficiency of scale and permits scientists to scale up to solve larger science problems, while still allowing the system software to be configured as needed for individual application requirements.
To test cloud computing for scientific capability, NERSC and the Argonne Leadership Computing Facility (ALCF) will install similar mid-range computing hardware but will offer different computing environments (figure 1). The systems will create a cloud test bed that scientists can use for their computations while also testing the effectiveness of cloud computing for their particular research problems. Since the project is exploratory, it has been named Magellan in honor of the Portuguese explorer who led the first effort to sail around the globe. It is also named for the Magellanic Clouds, the two closest galaxies to our Milky Way, visible from the Southern Hemisphere.

Figure 1. Cloud control. The Magellan management and network control racks at NERSC. To test cloud computing for scientific capability, NERSC and the ALCF installed purpose-built test beds for running scientific applications on the IBM iDataPlex cluster. (Photo: R. Kaltschmidt, LBNL)

What is Cloud Computing?

In the report Above the Clouds: A Berkeley View of Cloud Computing (see Further Reading), a team of luminaries from the Electrical Engineering and Computer Sciences Department at the University of California, Berkeley noted that cloud computing refers to both the applications delivered as services over the Internet and the hardware and systems software in the datacenters that provide those services. The services themselves have long been referred to as software as a service (SaaS); the datacenter hardware and software is referred to as a cloud. When a cloud is made available in a pay-as-you-go manner to the general public, it is a public cloud, and the service being sold is utility computing. Current examples of public utility computing include Amazon Web Services (AWS), Google App Engine, and Microsoft Azure.

As a successful example, Elastic Compute Cloud (EC2) from AWS sells 1.0 GHz x86 ISA slices, or instances, for $0.10 per hour, and a new instance can be added in two to five minutes. An instance is the allocated memory and collection of processes running on the server. Meanwhile, Amazon's Simple Storage Service (S3) charges $0.12 to $0.15 per gigabyte-month, with additional bandwidth charges of $0.10 to $0.15 per gigabyte to move data into and out of AWS over the Internet.
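As a rough illustration of this pay-as-you-go arithmetic, the sketch below applies the per-hour and per-gigabyte rates quoted above to a hypothetical mid-range job; the instance count, runtime, and data volume are made-up numbers for illustration, not figures from the article.

```python
# Back-of-the-envelope cost estimate for a hypothetical mid-range job on
# EC2/S3, using the 2009/2010 rates quoted in the article (upper bounds).
instance_rate = 0.10       # $ per instance-hour
storage_rate = 0.15        # $ per GB-month on S3
transfer_rate = 0.15       # $ per GB moved into or out of AWS

instances = 64             # hypothetical cluster size
hours = 48                 # hypothetical runtime
dataset_gb = 500           # hypothetical input + output data
months_stored = 1

compute = instances * hours * instance_rate
storage = dataset_gb * months_stored * storage_rate
transfer = 2 * dataset_gb * transfer_rate   # move data in, then results out

print("compute  $%8.2f" % compute)    # 64 * 48 * 0.10 = $307.20
print("storage  $%8.2f" % storage)    # 500 * 0.15     = $75.00
print("transfer $%8.2f" % transfer)   # 2 * 500 * 0.15 = $150.00
print("total    $%8.2f" % (compute + storage + transfer))   # $532.20
```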

The advantages of SaaS to both end users and service providers are well understood. Service providers enjoy greatly simplified software installation and maintenance and centralized control over versioning; end users can access the service anytime and anywhere, share data and collaborate more easily, and keep their data stored safely in the infrastructure. Cloud computing does not change these arguments, but it does give more application providers the choice of deploying their product as SaaS without provisioning a datacenter: just as the emergence of semiconductor foundries gave chip companies the opportunity to design and sell chips without owning a semiconductor fabrication plant, cloud computing allows providers to deploy SaaS and scale on demand without building or provisioning a datacenter.

Sidebar: Magellan Hardware
This purpose-built test bed for running scientific applications will be built on the IBM iDataPlex chassis and based on InfiniBand technology. The system will offer high density with front-access cabling and will be liquid-cooled using rear-door heat exchangers (figure 2). Total computer performance across both sites will be on the order of 100 teraflop/s.
The NERSC portion of the system will include:
- 61.5 teraflop/s peak performance
- 720 compute nodes (5,760 cores) with Intel Nehalem quad-core processors
- 21.1 TB DDR3 memory
- QDR InfiniBand fabric
Meanwhile, the Argonne system will have:
- 43 teraflop/s peak performance
- 504 compute nodes (4,032 cores) with Intel Nehalem quad-core processors
- 12 TB memory
- QDR InfiniBand fabric

Mid-Range Users on a Cloud

Realizing that not all research applications require petascale computing power, the Magellan project will explore several areas:
- Understanding which science applications and user communities are best suited for cloud computing (sidebar "Metagenomics on a Cloud?")
- Understanding the deployment and support issues required to build large science clouds: is it cost-effective and practical to operate science clouds, and how could commercial clouds be leveraged?
- How does existing cloud software meet the needs of science, and could extending or enhancing current cloud software improve its utility?
- How well does cloud computing support data-intensive scientific applications?
- What are the challenges to addressing security in a virtualized cloud environment?

Figure 2. Staying cool. By building the Magellan test bed at NERSC on IBM's iDataPlex chassis, the facility can take advantage of the machine's innovative half-depth design and liquid-cooled door, reduce cooling costs by as much as half, and reduce floor-space requirements by 30%. The orange tubes in the picture carry coolant to chill the system. (Photo: R. Kaltschmidt, LBNL)

By installing the Magellan systems (sidebar "Magellan Hardware") at two of DOE's leading computing centers, the project will leverage staff experience and expertise as users put the cloud systems through their paces. The Magellan test bed will be comprised of cluster hardware built on IBM's iDataPlex chassis and based on Intel's Nehalem CPUs and a QDR InfiniBand interconnect (figure 3). Total computer performance across both sites will be on the order of 100 teraflop/s.

Researchers at the ALCF and NERSC will look into the Eucalyptus toolkit, an open-source package that is compatible with Amazon Web Services, as a potential tool for allocating Linux virtual machine images.
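Because Eucalyptus exposes an EC2-compatible API, existing AWS client libraries can usually talk to such a private cloud. The sketch below uses the boto library against a hypothetical Eucalyptus front end; the host name, credentials, image ID, and key pair are placeholders, and the port and service path are the commonly documented Eucalyptus defaults rather than details from the article.

```python
# A minimal sketch (not from the article) of launching a Linux VM image
# against an EC2-compatible Eucalyptus front end with boto.
import boto
from boto.ec2.regioninfo import RegionInfo

region = RegionInfo(name="eucalyptus",
                    endpoint="magellan-cloud.example.gov")   # hypothetical host
conn = boto.connect_ec2(
    aws_access_key_id="YOUR_ACCESS_KEY",       # placeholder credentials
    aws_secret_access_key="YOUR_SECRET_KEY",
    is_secure=False,
    region=region,
    port=8773,                                 # typical Eucalyptus port
    path="/services/Eucalyptus")               # typical Eucalyptus service path

# List the machine images registered with the cloud, then boot one instance.
for image in conn.get_all_images():
    print("%s\t%s" % (image.id, image.location))

reservation = conn.run_instances("emi-12345678",     # placeholder image ID
                                 min_count=1, max_count=1,
                                 instance_type="m1.small",
                                 key_name="my-keypair")
print("Started instance: %s" % reservation.instances[0].id)
```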

Figure 3. Magellan systems at both NERSC and the ALCF will be built using QDR InfiniBand fabric like the one pictured here. (Photo: R. Kaltschmidt, LBNL)
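The peak numbers in the Magellan Hardware sidebar can be reproduced from the core counts with a short sanity check, assuming 2.66 GHz Nehalem cores issuing four double-precision floating-point operations per cycle; the clock speed and flops-per-cycle figures are assumptions consistent with the sidebar, not values stated in the article.

```python
# Peak-performance check against the Magellan Hardware sidebar.
# Assumes 2.66 GHz quad-core Nehalem, 4 double-precision flops/cycle/core.
clock_hz = 2.66e9
flops_per_cycle = 4

def peak_tflops(cores):
    return cores * clock_hz * flops_per_cycle / 1e12

nersc = peak_tflops(5760)    # 720 nodes x 8 cores -> ~61.3 TF (sidebar: 61.5)
argonne = peak_tflops(4032)  # 504 nodes x 8 cores -> ~42.9 TF (sidebar: 43)
print("NERSC   %.1f TF" % nersc)
print("Argonne %.1f TF" % argonne)
print("Total   %.1f TF" % (nersc + argonne))  # ~104 TF: "on the order of 100 teraflop/s"
```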

In addition, the teams researching Magellan's suitability will also investigate the performance of Apache's Hadoop and Google's MapReduce, two related software frameworks that deal with large distributed datasets. Currently, one of the challenges in building a private cloud is the lack of software standards. Although these frameworks are not widely supported at traditional supercomputing facilities, large distributed datasets are a common feature of many scientific codes and are natural fits for cloud computing. The team will also be experimenting with other commercial cloud offerings, such as those from Amazon, Google, and Microsoft.

By making Magellan available to a wide range of DOE science users, the researchers will be able to analyze the suitability of a cloud model across the broad spectrum of the DOE science workload. They will also use performance-monitoring software to analyze what kinds of science applications are being run on the system and how well they perform on a cloud. The science users will play a key role in this evaluation, as they bring a very broad scientific workload into the equation and will help the researchers learn which features are important to the scientific community.

Sidebar: Metagenomics on a Cloud?
One goal of the Magellan project is to understand which science applications and user communities are best suited for cloud computing, but some DOE researchers have already given public clouds a whirl. For example, Jared Wilkening, a software developer at Argonne National Laboratory, recently tested the feasibility of employing Amazon EC2 to run a BLAST-based metagenomics application. Metagenomics is the study of metagenomes, genetic material recovered directly from environmental samples. By identifying and understanding bacterial species based on sequence similarity, some researchers hope to put microbial communities to work mitigating global warming and cleaning up toxic waste sites, among other tasks. BLAST is the community standard for sequence comparison: it enables researchers to compare a query sequence with a library or database of sequences and identify library sequences that resemble the query sequence above a certain threshold. Wilkening notes that BLAST-based codes, like the one he ran on Amazon EC2, are well suited to cloud computing because they involve little internal synchronization and therefore do not rely on high-performance interconnects. Nevertheless, the study concluded that Amazon is significantly more expensive than locally owned clusters, due mainly to EC2's inferior CPU hardware and the premium cost associated with on-demand access, although increased demand for compute-intensive workloads could change that. Wilkening's paper was published at Cluster 2009, and slides are available online.

Figure 4. Metagenomics is the study of genetic material recovered directly from environmental samples. (Image: J. Wilkening, ANL)
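Hadoop's native interface is Java, but its Streaming mode lets any executable act as mapper and reducer, which makes it easy to sketch the kind of embarrassingly parallel scan over a large sequence dataset described above. The pair of scripts below is illustrative only and not code from the Magellan project or Wilkening's study: the mapper emits partial base counts from FASTA-formatted input and the reducer aggregates them into an overall GC fraction.

```python
#!/usr/bin/env python
# mapper.py -- illustrative Hadoop Streaming mapper.
# Reads FASTA-formatted lines from stdin and emits partial base counts.
import sys

for line in sys.stdin:
    line = line.strip()
    if not line or line.startswith(">"):      # skip headers and blank lines
        continue
    seq = line.upper()
    gc = seq.count("G") + seq.count("C")
    # key <tab> value; a single key so one reducer aggregates everything
    print("gc\t%d\t%d" % (gc, len(seq)))
```

```python
#!/usr/bin/env python
# reducer.py -- sums the partial counts and prints the overall GC fraction.
import sys

gc_total = 0
base_total = 0
for line in sys.stdin:
    try:
        _, gc, n = line.strip().split("\t")
        gc_total += int(gc)
        base_total += int(n)
    except ValueError:
        continue                               # ignore malformed records
if base_total:
    print("GC fraction: %.4f over %d bases" %
          (gc_total / float(base_total), base_total))
```

Under Hadoop Streaming, scripts like these would be submitted with the hadoop-streaming jar shipped with the Hadoop distribution (the exact jar path varies by version), passing them via the -mapper and -reducer options along with HDFS input and output paths.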

Data Storage and Networking

To address the challenge of analyzing the massive amounts of data being produced by scientific instruments, ranging from powerful telescopes photographing the Universe to gene sequencers unraveling the genetic code of life, the Magellan test bed will also provide a storage cloud with a little over a petabyte of capacity. The NERSC Global Filesystem (NGF) will provide most storage needs for projects running on the NERSC portion of the Magellan system; approximately 1 PB of storage and 25 gigabits per second (Gbps) of bandwidth have been added to support use by the test bed. Archival storage needs will be satisfied by NERSC's High Performance Storage System (HPSS) archive, which is being increased by 15 PB in capacity. Meanwhile, the Magellan system at the ALCF will have 250 TB of local disk storage on the compute nodes and an additional 25 TB of global disk storage on the GPFS system.

NERSC will make the Magellan storage available to science communities using a set of servers and software called Science Gateways, and will also experiment with flash memory technology to provide fast random-access storage for some of the more data-intensive problems. Approximately 10 TB will be deployed in NGF for a high-bandwidth, low-latency storage class and metadata acceleration. Around 16 TB will be deployed as local SSD in one SU for data analytics, local read-only data, and local temporary storage. Approximately 2 TB will be deployed in HPSS. The ALCF will provide active storage, using Hadoop over PVFS, on approximately 100 compute/storage nodes. This active storage will increase the capacity of the ALCF Magellan system by approximately 30 TF of compute power, along with approximately 500 TB of local disk storage and 10 TB of local SSD.

The NERSC and ALCF facilities will be linked by a groundbreaking 100 Gbps network, developed by DOE's Energy Sciences Network (ESnet) with funding from the American Recovery and Reinvestment Act. Such high bandwidth will facilitate rapid transfer of data between geographically dispersed clouds and enable scientists to use available computing resources regardless of location.

Figure 5. Main system console for Magellan at NERSC. (Photo: R. Kaltschmidt, LBNL)

Figure 6. Networking. When completed, the Magellan system at NERSC will be interconnected using QDR InfiniBand, 10 Gbps Ethernet, multiple 1 Gbps Ethernet, and 8 Gbps Fibre Channel SAN. (Photo: R. Kaltschmidt, LBNL)

The Magellan program will run for two years, and the initial clusters will be installed in the next few months. At NERSC, installation was slated to begin in November 2009, with early users getting access in December; the NERSC system (figures 5 and 6) was slated to go into production use in mid-January. At the ALCF, installation was planned to begin in January 2010, with early users gaining access in February and the system opening up for full access in March.

Contributors
Horst Simon, Kathy Yelick, Jeff Broughton, Brent Draney, Jon Bashor, David Paul, and Linda Vu from NERSC at LBNL; Pete Beckman, Susan Coghlan, and Eleanor Taylor from the ALCF at Argonne National Laboratory.

Further Reading
Above the Clouds: A Berkeley View of Cloud Computing, University of California, Berkeley, EECS technical report (PDF).
