Eucalyptus-Based Event Correlation. GSAW 2010 Working Group Session 11D. Nehal Desai



Transcription:

GSAW 2010 Working Group Session 11D
Eucalyptus-Based Event Correlation
Nehal Desai, Member of the Tech. Staff, CSD/CSTS/CSRD, The Aerospace Corporation
Dr. Craig A. Lee, lee@aero.org, Senior Scientist, CSD/CSTS/CSRD, The Aerospace Corporation; President, Open Grid Forum
March 3, 2010
The Aerospace Corporation 2010

Overview
Goal: Prototype and evaluate a Cloud Computing Environment as a generic hosting environment for NSS applications
Approach:
  Augment an existing analyst tool, CORE, with a Correlation Evaluator that is dynamically provisioned and run in a prototype private cloud
  The Correlation Evaluator will automate and enhance the semantics and scope of correlation queries against a database aggregator to identify causal events of the highest importance
  Use the prototype environment to quantitatively evaluate benefits of cloud computing
Success Metrics:
  Demonstrate improved server utilization in the private cloud
  Demonstrate ability to dynamically scale out support for multiple Correlation Evaluators and CORE users
  Demonstrate improved ability to identify high impact causal events
  Demonstrate ability to be a generic hosting environment

Approach Strategy
Phased development
  Build critical functionality first
  Demonstrate simplest possible test cases as early as possible
  Add functionality incrementally to ultimately demonstrate complete system
    Ability to dynamically support multiple users, across multiple sites, hosting multiple applications
    Automatically enforce data policy across sites
Assess project results at every increment
  Be a "fast learning" project that can quickly follow successful efforts and discard disappointing ones
Leverage existing hardware platforms

Major Prototype Components
CORE
  Google Earth-based user interface with "tree" of available data
Database Aggregator (DA)
  Provides access to multiple databases
Eucalyptus
  Open source Amazon EC2 API clone to be used for private cloud
Correlation Evaluator Client (new development)
  Panel added to CORE user interface to specify correlation queries
Correlation Evaluator (new development)
  Cloud application that runs correlation queries from one or more Correlation Evaluator Clients
iRODS
  Service for managing distributed data archives w/ integrated Rule Engine
Workflow Manager (WfM)
  Tool to manage sets of queries against the DA
Performance Model
  A model to estimate the "cost" of doing queries before they are done

Eucalyptus: Elastic Utility Computing Architecture Linking Your Programs To Useful Systems
Open source API clone of Amazon EC2
Web services based implementation of elastic/utility/cloud computing infrastructure
University of California, Santa Barbara
$5.5M in venture capital secured
Intends to be the Red Hat of cloud computing

Illustration of CORE with Correlation Evaluation Client
[Figure: CORE user interface with the Correlation Parameters panel]

Correlation Evaluator
Correlation Evaluators take spatial/temporal queries from Correlation Evaluator Clients
  Check for partial/component correlations already managed by iRODS (see below)
  Run sets of workflows against the DA to derive correlations
    Dynamically provisioned Workflow Managers (WfMs) manage workflows
  Standing correlations could be requested that assess correlations as new data becomes available
    Automatic notifications possible
  Correlation Evaluators can federate to exchange information
  Results ultimately displayed on CORE Google Earth (GE)
Correlation Evaluators maintain a federated, distributed data service using iRODS
  iRODS (integrated Rule-Oriented Data Service) is open source from University of California, San Diego and the Renaissance Computing Institute, North Carolina
  iRODS agents can federate to enforce data policy, e.g.,
    Automatic data replication to reduce latency & improve overall performance
    Exchange of metadata to enhance discovery of correlations
Integrated performance model enables correlation engines to estimate the computational load
  Correlation Evaluators can federate to estimate total load on the DA and throttle accordingly
  Give feedback to analyst on potentially intractable queries
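As an illustration of what a spatial/temporal correlation query might compute, the following is a minimal sketch and not the project's implementation. It assumes events carry a latitude, longitude, and timestamp, and it treats two events as correlated when they fall within a distance and time window. The names Event, distance_km, and correlate are hypothetical, and the great-circle (haversine) distance is an assumed choice of spatial metric.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the real DA schema is not given in the slides."""
    name: str
    lat: float  # degrees
    lon: float  # degrees
    t: float    # seconds since an arbitrary epoch


def distance_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events (haversine formula)."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def correlate(events, max_km, max_seconds):
    """Return name pairs of events that are close in both space and time."""
    pairs = []
    for i, a in enumerate(events):
        for b in events[i + 1:]:
            if abs(a.t - b.t) <= max_seconds and distance_km(a, b) <= max_km:
                pairs.append((a.name, b.name))
    return pairs
```

The pairwise loop is quadratic in the number of events, which is one reason a query of this kind can become expensive enough to need the performance model and throttling described on the slide.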

Workflow Management (WfM) Engines
Workflow Managers "orchestrate" or "choreograph" multiple steps in a large, distributed application
  Data movement and process execution
  Keeping track of which operations:
    Need to be done
    Have completed
    May have failed and need to be retried
Many script-based or visual programming tools available to define workflows
  BPEL (Business Process Execution Language) most widely known in business community, but interoperability is problematic
  Pegasus, Taverna, Triana, DAGMan widely known in science community
A workflow manager may only be needed here if correlation queries get extremely complicated
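The bookkeeping described above (which operations need to be done, have completed, or have failed and need a retry) can be sketched in a few lines. This is a hypothetical illustration only and does not reflect how BPEL, Pegasus, Taverna, Triana, or DAGMan work internally. The names State and run_workflow, and the max_attempts parameter, are assumptions.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"  # needs to be done
    DONE = "done"        # has completed
    FAILED = "failed"    # failed on the latest attempt


def run_workflow(steps, max_attempts=3):
    """Run (name, callable) steps in order, retrying a failed step.

    Returns the final state and the attempt count of every step.
    A step that exhausts its attempts stops the workflow, so later
    steps remain PENDING.
    """
    status = {name: State.PENDING for name, _ in steps}
    attempts = {name: 0 for name, _ in steps}
    for name, action in steps:
        while status[name] is not State.DONE and attempts[name] < max_attempts:
            attempts[name] += 1
            try:
                action()
                status[name] = State.DONE
            except Exception:
                status[name] = State.FAILED  # retried on the next loop pass
        if status[name] is not State.DONE:
            break
    return status, attempts
```

A real engine would also handle steps that can run in parallel and persist this state so a workflow can resume after a crash; the sketch keeps only the sequential case.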

Functional View of Complete System
[Diagram: a Database Aggregator serves private clouds at different sites. Each site hosts a Correlation Evaluator (1 through n) containing a WfM, local storage, a Correlation Engine w/ perf model, and iRODS. Multiple CORE users at different sites (CORE 1, CORE i, CORE j, CORE k), each with a Correlation Evaluator Client, connect to the Correlation Evaluators.]

End-to-End system with Google Earth-based clients using cloud-hosted RulePoint Correlation Engines to identify related events in multiple data sources
[Diagram: Client 1 through Client n each send Corr Params; DB1 through DBn sit behind the Database Aggregator/Gateway; a Cloud Manager controls several HyperV hosts at Cloud Site 1.]

Agent Logic's RulePoint Basic Architecture
[Diagram labels: Data Source, Service, Topic, Rules, Event, Responses; arrows labeled poll, subscribe, results]

Eucalyptus Basic System Architecture
[Diagram: User (Euca2ools); Front end: Cloud Manager, Cloud Controller, Walrus Storage (VMI repository); Back end: Node 1, Node 2, ... Node n]

Status of Initial Critical Tasks
1) Small private Eucalyptus cloud stood up
  Eucalyptus is an open source API clone of Amazon EC2
  Project Penguin is a possible host platform
  Perform "microbenchmarks" to evaluate performance/behavior
2) Basic Correlation Evaluation package implemented
  "Packages": existing CORE concept for correlations against DA
  Built virtual machine image (VMI) for dynamic provisioning in cloud
  Initially running manually for development and testing
3) Currently modifying CORE to initiate VMIs in cloud
  Automating CORE "packages" for doing correlations
  Add "Correlation Evaluation Client" window in CORE
    Complements CORE "tree"
  Results from Correlation Evaluators in the cloud displayed on Google Earth
4) Demonstrations and evaluations to be done
  Perform benchmarks to evaluate performance/scalability/behavior
  Add multiple CORE users
  Add multiple client sites
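To make the step of initiating VMIs from CORE more concrete, here is a hedged sketch of how a client could assemble a euca2ools command line to start instances of a registered image. It only builds the argument list and does not contact a cloud. The helper name run_instances_argv is hypothetical, as are the image ID and key pair name in the test; the -n, -k, and -t options are those of the euca2ools euca-run-instances command as commonly documented, and should be checked against the installed version.

```python
def run_instances_argv(image_id, count=1, keypair=None, instance_type=None):
    """Build the argument list for an euca-run-instances call.

    image_id:      ID of a registered Eucalyptus machine image (emi-...)
    count:         number of instances to start (-n)
    keypair:       optional SSH key pair name (-k)
    instance_type: optional VM type such as m1.small (-t)
    """
    argv = ["euca-run-instances", "-n", str(count)]
    if keypair:
        argv += ["-k", keypair]
    if instance_type:
        argv += ["-t", instance_type]
    argv.append(image_id)
    return argv
```

The returned list could be passed to subprocess.run on a host where euca2ools is installed and configured with cloud credentials. Keeping command construction separate from execution makes it possible to test the provisioning logic without a running cloud.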

Further Enhancements
5) Integrated Performance Model & Control
  Design performance model that estimates how much processing time and how much correlation data may be involved
  Use perf model to prevent inadvertent initiation of "expensive" queries that result in excessive processing and data requirements
  Use perf model to "throttle" aggregate queries against DA so as to not adversely affect operational DA use
6) Integrated Workflow Management
  CORE package may require extensive "query sweeps" against the DA
  Use a Workflow Management engine to manage query sweeps
7) Integrated Data Management and Data Policy Enforcement
  Use iRODS to manage correlation data across sites
  Enforce data policy, e.g., caching, replication, security, transcoding
8) Development of Additional Cloud Applications
  Demonstrate generic hosting capabilities
  Multi-tenancy
  Increased utilization
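The performance model and control in item 5 might look something like the following sketch, which is an assumption-laden illustration and not the project's design. It uses a hypothetical linear cost model (cost proportional to rows scanned), rejects any single query above a per-query limit, and defers queries once an aggregate budget against the DA is used up. The names Query, estimate_cost, and admit, and the seconds_per_row constant, are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical description of a correlation query."""
    name: str
    n_sources: int        # databases reached through the DA
    rows_per_source: int  # rows expected to be scanned in each


def estimate_cost(q: Query, seconds_per_row: float = 0.001) -> float:
    """Assumed linear model: estimated processing seconds for a query."""
    return q.n_sources * q.rows_per_source * seconds_per_row


def admit(queries, per_query_limit, aggregate_budget):
    """Classify queries as admitted, rejected (too expensive), or deferred (throttled)."""
    admitted, rejected, deferred = [], [], []
    used = 0.0
    for q in queries:
        cost = estimate_cost(q)
        if cost > per_query_limit:
            rejected.append(q.name)   # feedback to analyst: likely intractable
        elif used + cost > aggregate_budget:
            deferred.append(q.name)   # throttle to protect operational DA use
        else:
            admitted.append(q.name)
            used += cost
    return admitted, rejected, deferred
```

A real model would be calibrated from the benchmarks listed among the critical tasks rather than from a fixed per-row constant.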

Potential Issues and Conclusions
There are no cloud standards
  Amazon EC2 is de facto standard for IaaS
  Work underway in OGF OCCI to standardize this interface
  There is a risk that standard cloud APIs may be different
  Critically important when federating clouds from different organizations
Cloud Architecture and Configuration
  Correlation Evaluator application should be suitable for generic Eucalyptus configuration
  Other apps may need specialized configurations for HPC, data streaming
  How to manage cloud performance across NSS job mix?
More likely to realize benefits of Cloud Computing (IaaS) when applications are run at scale
  Small demo application can be used to drive installation of cloud prototype but may not allow huge benefits to be demonstrated
  More users/apps may be needed to show control of server utilization, etc.
  May want to demonstrate workload management across sites
Green IT is another potential benefit of cloud computing
  Workflow mgmt across sites to enforce energy consumption policy

Legal Notice
All trademarks, service marks, and trade names are the property of their respective owners