REPRODUCIBILITY OF EXECUTION ENVIRONMENTS IN SCIENTIFIC WORKFLOWS USING SEMANTICS AND CONTAINERS

1 REPRODUCIBILITY OF EXECUTION ENVIRONMENTS IN SCIENTIFIC WORKFLOWS USING SEMANTICS AND CONTAINERS
Rafael Ferreira da Silva, Ewa Deelman
USC Information Sciences Institute
Container Strategies for Data & Software Preservation that Promote Open Science
May 19-20, University of Notre Dame, USA

2 OUTLINE
Introduction Semantic Modeling Semantic Annotations Motivation Scientific Workflows Conservation Annotations Virtualization WICUS Semantic Models Computational Reproducibility PRECIP Pegasus Reproducibility Process Infrastructure Specification Algorithm Use Cases Summary Conclusions Future Research Directions

3 INTRODUCTION
Scientific Workflows
  Large-scale computations
  Provenance
  DAG: directed acyclic graphs
    job: command-line programs
    dependency: usually data dependencies
Reproducibility
  Data, code, workflow description
  Underlying infrastructure?
Conservation
  Physical: real object
  Logical: object description (semantics)
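
The DAG view on this slide (jobs as command-line programs, edges as data dependencies) can be sketched in a few lines of Python. This is only an illustration: the job names and file names are hypothetical, and the dependencies are inferred purely from which job produces a file that another job consumes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical jobs: each stands for a command-line program with input and output files.
jobs = {
    "split":   {"inputs": ["raw.dat"], "outputs": ["part1.dat", "part2.dat"]},
    "filter1": {"inputs": ["part1.dat"], "outputs": ["clean1.dat"]},
    "filter2": {"inputs": ["part2.dat"], "outputs": ["clean2.dat"]},
    "merge":   {"inputs": ["clean1.dat", "clean2.dat"], "outputs": ["result.dat"]},
}


def data_dependencies(jobs):
    """Map each job to the set of jobs that produce one of its input files."""
    producer = {f: name for name, job in jobs.items() for f in job["outputs"]}
    return {
        name: {producer[f] for f in job["inputs"] if f in producer}
        for name, job in jobs.items()
    }


deps = data_dependencies(jobs)
# A valid execution order places every job after the jobs it depends on.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Files with no producer (here `raw.dat`) are treated as workflow inputs and create no edge.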

4 SEMANTIC MODELING
Reproducibility timeline: ANNOTATE, then REPRODUCE
Former equipment: Desktop, Cluster, Grid, Cloud
Equivalent execution environment: Virtual Images, Containers
Semantic annotations: Workflow, Software, Hardware, Computing Resources

5 SEMANTICS
WICUS: Workflow Infrastructure Conservation Using Semantics
Ontology Network

6 EXPERIMENT MANAGEMENT
PRECIP: Pegasus Repeatable Experiments for the Cloud in Python
Overview
  Experiment management control API
  Works with commercial and academic clouds
  Tag-based system
Features
  No pre-installed software needed on the VM image
  Create VMs, transfer files
  Run commands remotely (SSH)
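
To make the tag-based idea concrete, the toy model below shows how tags can select which instances a command is sent to. It is a hypothetical in-memory stand-in written for illustration only; the class and method names are assumptions and are not PRECIP's actual API, and nothing is provisioned or executed remotely.

```python
class ToyExperiment:
    """In-memory illustration of tag-based instance selection (not PRECIP's API)."""

    def __init__(self):
        # Each instance is a dict with an id, an image name, a tag set, and a command log.
        self.instances = []

    def provision(self, image, tags, count=1):
        """Register `count` pretend instances of `image`, all labeled with `tags`."""
        for _ in range(count):
            self.instances.append(
                {"id": len(self.instances), "image": image, "tags": set(tags), "log": []}
            )

    def _select(self, tags):
        """Return instances carrying every requested tag; an empty list selects all."""
        wanted = set(tags)
        return [inst for inst in self.instances if wanted <= inst["tags"]]

    def run(self, tags, command):
        """Record `command` on each selected instance (a real tool would use SSH)."""
        targets = self._select(tags)
        for inst in targets:
            inst["log"].append(command)
        return [inst["id"] for inst in targets]


exp = ToyExperiment()
exp.provision("centos6-image", tags=["master"])
exp.provision("centos6-image", tags=["worker"], count=2)
worker_ids = exp.run(["worker"], "hostname")
all_ids = exp.run([], "uptime")
```

Selecting by tag rather than by instance identifier keeps an experiment script independent of the identifiers a particular cloud assigns.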

7 WORKFLOW MANAGEMENT SYSTEM
Pegasus WMS
  Automates complex, multi-stage processing pipelines
  Enables parallel, distributed computations
  Automatically executes data transfers
  Handles failures to provide reliability
Reproducibility
  Reusable, aids reproducibility
  Records how data was produced (provenance)
  Keeps track of data and files
[Figure: DAG in XML, with jobs and dependencies arranged in split, merge, and pipeline patterns]
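
As a rough illustration of "DAG in XML", the sketch below serializes a small split/merge workflow with Python's standard `xml.etree.ElementTree`. The element names are loosely modeled on the job and child/parent layout of a DAX file, but this is a simplified assumption, not the actual Pegasus schema: namespaces, file declarations, and arguments are omitted, and the job identifiers and program names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow: one split job, two parallel jobs, one merge job.
jobs = {"ID1": "split", "ID2": "process", "ID3": "process", "ID4": "merge"}
# Each child job maps to the set of parent jobs it depends on.
parents = {"ID2": {"ID1"}, "ID3": {"ID1"}, "ID4": {"ID2", "ID3"}}


def to_xml(jobs, parents):
    """Serialize jobs and their dependencies into a simplified DAX-like document."""
    root = ET.Element("adag", name="split-merge")
    for job_id, program in jobs.items():
        ET.SubElement(root, "job", id=job_id, name=program)
    for child_id, parent_ids in parents.items():
        child = ET.SubElement(root, "child", ref=child_id)
        for parent_id in sorted(parent_ids):
            ET.SubElement(child, "parent", ref=parent_id)
    return ET.tostring(root, encoding="unicode")


document = to_xml(jobs, parents)
print(document)
```

Keeping the dependency edges separate from the job definitions means the same job list can be rewired into a split, merge, or pipeline shape without touching the jobs themselves.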

8 REPRODUCIBILITY PROCESS
[Figure: process diagram. Components shown: DAX (XML), DAX Annotator, Pegasus Transformation Catalog (TC), TC Annotator & Config, Annotations, SW Components Catalog, SVA Catalog, Infrastructure Specification Algorithm, PRECIP Script]

9 DEPENDENCY MANAGEMENT
Infrastructure Specification Algorithm (ISA)
Obtains a specification defining which VMs need to be created, which software components must be deployed, and how they are configured
[Figure: a workflow with requirements REQ1, REQ2, REQ3 and appliances SVA1, SVA2, SVA3]
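
The kind of output the ISA produces (which appliance to instantiate and which software still has to be deployed on it) can be illustrated with a small greedy matcher. This is a simplified sketch under stated assumptions, not the authors' algorithm: requirements and appliances are reduced to a software set and a disk size, all names and values are hypothetical, and each requirement independently picks the appliance that leaves the fewest components to install.

```python
# Hypothetical requirements: software needed and minimum disk size per workflow part.
requirements = {
    "REQ1": {"software": {"montage"}, "disk_gb": 5},
    "REQ2": {"software": {"java", "bwa"}, "disk_gb": 40},
}
# Hypothetical appliance catalog: software already installed and disk size offered.
appliances = {
    "SVA1": {"software": set(), "disk_gb": 5},
    "SVA2": {"software": {"java"}, "disk_gb": 40},
}


def specify_infrastructure(requirements, appliances):
    """For each requirement, pick a large-enough appliance with the fewest missing components."""
    spec = []
    for req_name, req in requirements.items():
        candidates = [
            (name, sva) for name, sva in appliances.items()
            if sva["disk_gb"] >= req["disk_gb"]
        ]
        if not candidates:
            raise ValueError(f"no appliance satisfies {req_name}")
        # Fewest missing components first; ties are broken by appliance name.
        name, sva = min(
            candidates,
            key=lambda c: (len(req["software"] - c[1]["software"]), c[0]),
        )
        spec.append({
            "requirement": req_name,
            "appliance": name,
            "deploy": sorted(req["software"] - sva["software"]),
        })
    return spec


spec = specify_infrastructure(requirements, appliances)
for entry in spec:
    print(entry)
```

A fuller treatment would also resolve dependencies between software components and their configuration, which this sketch leaves out.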

10 USE CASES
[Figure: semantic annotations for the Montage, Epigenomics, and SoyKB workflows and for the WMS, organized into Workflow, Requirements, Software, Hardware, and Computing Resources layers. Labels shown include mfitplane, mfit, mdifffit, mproject, mdiff, SolexFilter, FastqSplit, Dedup, Haplotype, Sol2Sanger, Maq Map, Maq Index, GLibC, Perl5, LibStd, Java, Wget, BWA, GATK, Pegasus WMS, and HTCondor. Hardware is described by CPU, RAM, and disk. Computing resources are CentOS 6 images on AWS, FG, and Vagrant with 5GB, 40GB, or 50GB disks.]

11 USE CASES
Montage: astronomy workflow; Montage software distribution, 59 binaries (executables)
Epigenomics: bioinformatics workflow; 8 binaries (executables)

12 SUMMARY
Contributions
  Semantic vocabularies to describe the execution environment of workflows
  Process for documenting the workflow application and the WMS
  Applicability to public, private, and local clouds (EC2, Chameleon, Vagrant, and Docker)
Future Work
  Multi-node infrastructure
  Automation of the semantic annotation process
Reference
  I. Santana-Perez, R. Ferreira da Silva, M. Rynge, E. Deelman, M. S. Pérez-Hernández, and O. Corcho, Reproducibility of Execution Environments in Computational Science Using Semantics and Clouds, Future Generation Computer Systems,

13 Thank You
Questions?
Rafael Ferreira da Silva, Ph.D.
Computer Scientist, USC Information Sciences Institute


More information

Provisioning and Resource Management at Large Scale (Kadeploy and OAR)

Provisioning and Resource Management at Large Scale (Kadeploy and OAR) Provisioning and Resource Management at Large Scale (Kadeploy and OAR) Olivier Richard Laboratoire d Informatique de Grenoble (LIG) Projet INRIA Mescal 31 octobre 2007 Olivier Richard ( Laboratoire d Informatique

More information

VMware vrealize Automation

VMware vrealize Automation VMware vrealize Automation Reference Architecture Version 6.0 or Later T E C H N I C A L W H I T E P A P E R J U N E 2 0 1 5 V E R S I O N 1. 5 Table of Contents Overview... 4 What s New... 4 Initial Deployment

More information

QA PRO; TEST, MONITOR AND VISUALISE MYSQL PERFORMANCE IN JENKINS. Ramesh Sivaraman ramesh.sivaraman@percona.com 14-04-2015

QA PRO; TEST, MONITOR AND VISUALISE MYSQL PERFORMANCE IN JENKINS. Ramesh Sivaraman ramesh.sivaraman@percona.com 14-04-2015 QA PRO; TEST, MONITOR AND VISUALISE MYSQL PERFORMANCE IN JENKINS Ramesh Sivaraman ramesh.sivaraman@percona.com 14-04-2015 Agenda Jenkins : a continuous integration framework Percona Server in Jenkins Performance

More information

UBUNTU DISK IO BENCHMARK TEST RESULTS

UBUNTU DISK IO BENCHMARK TEST RESULTS UBUNTU DISK IO BENCHMARK TEST RESULTS FOR JOYENT Revision 2 January 5 th, 2010 The IMS Company Scope: This report summarizes the Disk Input Output (IO) benchmark testing performed in December of 2010 for

More information

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center

Intro to Data Management. Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Intro to Data Management Chris Jordan Data Management and Collections Group Texas Advanced Computing Center Why Data Management? Digital research, above all, creates files Lots of files Without a plan,

More information

Brian Amedro CTO. Worldwide Customers

Brian Amedro CTO. Worldwide Customers Denis Caromel CEO Brian Amedro CTO Cloud Enterprise Applications (B2B) Reduce Costs (IT) + Reduce Pains (Time) Worldwide Customers 1 1 Software company born of INRIA in 2007 Software Editor, Open Source

More information

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

A Novel Cloud Based Elastic Framework for Big Data Preprocessing School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview

More information

Unified Batch & Stream Processing Platform

Unified Batch & Stream Processing Platform Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built

More information

Grid Computing vs Cloud

Grid Computing vs Cloud Chapter 3 Grid Computing vs Cloud Computing 3.1 Grid Computing Grid computing [8, 23, 25] is based on the philosophy of sharing information and power, which gives us access to another type of heterogeneous

More information

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014

CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014 CLOUD GAMING WITH NVIDIA GRID TECHNOLOGIES Franck DIARD, Ph.D., SW Chief Software Architect GDC 2014 Introduction Cloud ification < 2013 2014+ Music, Movies, Books Games GPU Flops GPUs vs. Consoles 10,000

More information

Dedicated Hosting. The best of all worlds. Build your server to deliver just what you want. For more information visit: imcloudservices.com.

Dedicated Hosting. The best of all worlds. Build your server to deliver just what you want. For more information visit: imcloudservices.com. Dedicated Hosting The best of all worlds. Build your server to deliver just what you want. Only pay for what you use with no long term contracts. High availability, your server is in the cloud. Dedicated

More information

Alternative Deployment Models for Cloud Computing in HPC Applications. Society of HPC Professionals November 9, 2011 Steve Hebert, Nimbix

Alternative Deployment Models for Cloud Computing in HPC Applications. Society of HPC Professionals November 9, 2011 Steve Hebert, Nimbix Alternative Deployment Models for Cloud Computing in HPC Applications Society of HPC Professionals November 9, 2011 Steve Hebert, Nimbix The case for Cloud in HPC Build it in house Assemble in the cloud?

More information

5 SCS Deployment Infrastructure in Use

5 SCS Deployment Infrastructure in Use 5 SCS Deployment Infrastructure in Use Currently, an increasing adoption of cloud computing resources as the base to build IT infrastructures is enabling users to build flexible, scalable, and low-cost

More information

In order to upload a VM you need to have a VM image in one of the following formats:

In order to upload a VM you need to have a VM image in one of the following formats: What is VM Upload? 1. VM Upload allows you to import your own VM and add it to your environment running on CloudShare. This provides a convenient way to upload VMs and appliances which were already built.

More information

Data Lab System Architecture

Data Lab System Architecture Data Lab System Architecture Data Lab Context Data Lab Architecture Astronomer s Desktop Web Page Cmdline Tools Legacy Apps User Code User Mgmt Data Lab Ops Monitoring Presentation Layer Authentication

More information

Introduction to Cloud computing. Viet Tran

Introduction to Cloud computing. Viet Tran Introduction to Cloud computing Viet Tran Type of Cloud computing Infrastructure as a Service IaaS: offer full virtual machines via hardware virtualization tech. Amazon EC2, AbiCloud, ElasticHosts, Platform

More information

Cluster, Grid, Cloud Concepts

Cluster, Grid, Cloud Concepts Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of

More information

Deploying Federal Geospatial Services

Deploying Federal Geospatial Services Deploying Federal Geospatial Services in the Cloud: Federal Geographic Data Committee (FGDC) and GSA GeoCloud Sandbox Initiative Doug Nebert USGS/FGDC December 2010 Draft For Official Use Only 1 Background

More information

A System Architecture for Running Big Data Workflows in the Cloud

A System Architecture for Running Big Data Workflows in the Cloud A System Architecture for Running Big Data Workflows in the Cloud Andrey Kashlev, Shiyong Lu Department of Compute Science Wayne State University Abstract Scientific workflows have become an important

More information

Running Oracle Databases in a z Systems Cloud environment

Running Oracle Databases in a z Systems Cloud environment Running Oracle Databases in a z Systems Cloud environment Sam Amsavelu samvelu@us.ibm.com ISV & Channels Technical Sales - Oracle IBM Advanced Technical Skills (ATS), America Technical University/Symposia

More information