Building Cloud Applications to Support Research: the Microsoft Azure Research Engagement Project

Cloud Computing Research Engagement Initiative
Alexander Voss, Christian Geuer-Pollmann, Dennis Gannon, Fabrizio Gagliardi, Goetz Brasche, Jared Jackson, Nelson Araujo, Roger Barga, Wei Lu
- International Engagements: Offer cloud services to academic and research communities worldwide, and back up this offering with a technical engagements team.
- Data: Provide select reference data sets on Azure to enable communities of researchers.
- Services for Research: Provide applications and core services for research. Ask the questions: what does it take to catalyze a community of researchers, and what core services and key products are needed to support research?

Microsoft's Goals for this Project
- Demonstrate that a client+cloud model can revolutionize research and learning
- Illustrate that cloud computing is a cost-effective and easy-to-use way to outsource select components of research infrastructure
- Provide feedback from the research community to our product groups
- Establish the Microsoft Cloud Computing platform as leader and trendsetter for basic research

Some examples based on our ongoing Cloud Research Engagements

AzureBLAST*
- Seamless experience: evaluate data and invoke computational models from Excel.
- Computationally heavy analysis done close to a large database of curated data.
- Scalable for large, surge computationally heavy analysis.
- Test local, run on the cloud.
[Architecture diagram: the user selects DBs and an input sequence through a Web Role; an Input Splitter Worker Role; BLAST Execution Worker Roles #1 to #n; a Combiner Worker Role; Azure Blob Storage holding Genome DB 1 to Genome DB K and the BLAST DB Configuration.]
*The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases.
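The split, execute, and combine stages named on the AzureBLAST slide can be sketched roughly as follows. This is a minimal local illustration, not the project's implementation: the function names (`split_input`, `search_stub`, `combine`, `run_job`) are hypothetical, a thread pool stands in for the Azure worker roles, an in-memory dictionary stands in for the genome databases in blob storage, and an exact substring match stands in for the real BLAST alignment.

```python
from concurrent.futures import ThreadPoolExecutor


def split_input(sequences, n_parts):
    """Input Splitter stage: deal the query sequences into at most n_parts chunks."""
    parts = [sequences[i::n_parts] for i in range(n_parts)]
    return [p for p in parts if p]  # drop empty chunks


def search_stub(chunk, database):
    """Stand-in for one BLAST Execution worker.

    Assumption: an exact substring search replaces the real BLAST alignment,
    purely so the sketch can run without external tools.
    """
    hits = []
    for query in chunk:
        for name, subject in database.items():
            pos = subject.find(query)
            if pos != -1:
                hits.append((query, name, pos))
    return hits


def combine(partial_results):
    """Combiner stage: merge the per-worker hit lists into one sorted list."""
    return sorted(hit for part in partial_results for hit in part)


def run_job(sequences, database, n_workers=4):
    """Split the input, search each chunk in parallel, then combine the hits."""
    chunks = split_input(sequences, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: search_stub(c, database), chunks))
    return combine(partials)


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_db = {"genome_a": "ACGTACGTTTGA", "genome_b": "TTGACCA"}
    for hit in run_job(["CGTA", "TTGA", "GGGG"], toy_db, n_workers=2):
        print(hit)
```

Because each chunk is searched independently, the number of workers can grow with the size of the job, which is the property the slide describes as scaling for surge analysis.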

Making Excel the user interface to the cloud

Supporting Smart Sensors and Data Fusion
- The NSF Ocean Observatories Initiative (OOI)
- Hundreds of cabled sensors and robots exploring the sea floor
- Data to be collected, curated, mined
- OOI architecture plan of record: store this data in the cloud
- Data collected from: ocean floor sensors, AUV tracks, ship-side cruises, computational models
- Data moves from the ocean to a shore-side data center, to the Azure cloud, to your computer.

AzureMODIS: Azure Service for Remote Sensing Geoscience
- 5 TB (~600K files) upload of 9 different imagery products from 15 different locations (~6 days of download)
- 4 TB reprojected harmonized imagery, ~35000 cpu hours
- 50 GB reduced science variable results, ~18000 cpu hours (~14 hour download)
- 50 GB additional reduced science analysis results, ~18000 cpu hours (~14 hour download)
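A short back-of-the-envelope calculation helps put the AzureMODIS figures in proportion. The sketch below uses only the approximate numbers quoted on the slide; it assumes decimal units (1 TB = 10^12 bytes), assumes the three cpu-hour figures describe separate stages that can be summed, and all variable names are illustrative.

```python
# Assumption: decimal units; the slide does not say whether TB means 10**12 or 2**40 bytes.
TB = 10**12
GB = 10**9
MB = 10**6

source_bytes = 5 * TB           # ~5 TB of source imagery
source_files = 600_000          # ~600K files
source_transfer_s = 6 * 86_400  # ~6 days of download

result_bytes = 50 * GB          # one 50 GB set of reduced results
result_transfer_s = 14 * 3_600  # ~14 hour download

# Average size of one source file, in MB.
avg_file_mb = source_bytes / source_files / MB

# Sustained transfer rates implied by the quoted durations, in MB/s.
source_rate_mb_s = source_bytes / source_transfer_s / MB
result_rate_mb_s = result_bytes / result_transfer_s / MB

# Assumption: reprojection and the two reduction stages are distinct, so their hours add.
total_cpu_hours = 35_000 + 18_000 + 18_000

# How much smaller one reduced result set is than the source imagery.
reduction_factor = source_bytes / result_bytes

print(f"average source file: {avg_file_mb:.2f} MB")
print(f"source transfer rate: {source_rate_mb_s:.2f} MB/s")
print(f"result transfer rate: {result_rate_mb_s:.2f} MB/s")
print(f"total compute: {total_cpu_hours} cpu hours")
print(f"reduction factor: {reduction_factor:.0f}x")
```

Under these assumptions the source files average roughly 8 MB each, the pipeline consumes about 71,000 cpu hours in total, and each reduced result set is about one hundredth the size of the source imagery.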

Potential Cloud Applications
The cloud as a host to data collections:
- Classical http data: Web n-grams, photos, blogs, etc.
- Geographic & semi-static: maps, satellite, geological records, weather radar, field reports/observations, geo-sensors (seismic, ocean)
- Geo-tagged social record: census, public government records, local newspapers, health and police records
- Scientific: genome, pathology data, fMRI, astronomy, chemical
- Dynamic shared state: multiplayer games, virtual worlds, social networks
- My context: my devices, my application sessions, my agents

Geographic & semi-static data
- Maps, satellite, geological records, weather radar, field reports/observations, geo-sensors.
- Mobile apps for emergency response.
- Tell me when I or my team/family are in danger.
- Integrate emergency data streams & field report tweets in the cloud and push to my client devices.
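One way to picture the emergency-response idea above is a cloud-side check that compares each person's last known position with the active incident areas and decides who should receive a push alert. The sketch below is only an illustration under stated assumptions: incidents are modelled as circles (centre plus radius), distances use the haversine formula with a mean Earth radius of 6371 km, and the names, coordinates, and data layout are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def members_in_danger(members, incidents):
    """Return (member, incident label) pairs for members inside an incident radius.

    members:   dict mapping a name to a (lat, lon) tuple
    incidents: list of dicts with 'label', 'lat', 'lon', 'radius_km' keys
    (both layouts are assumptions made for this sketch)
    """
    alerts = []
    for name, (lat, lon) in members.items():
        for incident in incidents:
            distance = haversine_km(lat, lon, incident["lat"], incident["lon"])
            if distance <= incident["radius_km"]:
                alerts.append((name, incident["label"]))
    return alerts


if __name__ == "__main__":
    # Hypothetical positions and incident, for illustration only.
    team = {"ana": (0.0, 0.0), "ben": (1.0, 0.0)}
    active = [{"label": "fire", "lat": 0.0, "lon": 0.05, "radius_km": 10.0}]
    print(members_in_danger(team, active))
```

In a deployed service the incident list would come from the integrated emergency data streams and the resulting pairs would be pushed to client devices; both steps are outside this sketch.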

Geo-tagged social record
- Census, public government records, local newspapers, health and police records
- Build cloud tools to understand society's problems:
  - Links between history of disease, poverty and pollution
  - Correlate crime rates and bad weather
- Discover the lost heritage of a people or a place

Dynamic shared state
- Multiplayer games, virtual worlds, social networks
- New modalities of collaboration

Reaching Out: Azure Research Engagement project
- In the U.S.
  - Memorandum of Understanding with the National Science Foundation
  - Provide a substantial Azure resource as a donation to NSF
  - NSF will provide funding to researchers to use this resource
- In Europe
  - We are interested in direct engagement with the thought leaders in the U.K., France and Germany
  - EC engagements where possible
- In both, we provide our engagement team
  - We provide workshops, tutorials, best practices and shared services; learn from this community; shape policy
- In Asia
  - We wish to explore possibilities.

VENUS-C: Virtual multidisciplinary EnviroNments USing Cloud infrastructures
- Funding Scheme: Combination of Collaborative Project and Coordination and Support Action: Integrated Infrastructure Initiative (I3)
- Program Topic: INFRA-2010-2 1.2.1. Distributed Computing Infrastructures

Goals
1. Create a platform that enables user applications to leverage cloud computing principles and benefits.
2. Leverage the state of the art to on-board early adopters quickly, incrementally enable interop with existing DCI, and push the state of the art where needed to satisfy on-boarding and interop.
3. Create a sustainable infrastructure that enables the cloud computing paradigms for the user communities inside the project, those from the call for applications, and others.

Supporting multiple basic research disciplines
- Biomedicine: Integrating widely used tools for Bioinformatics (UPV), Systems Biology (CoSBi) and Drug Discovery (NCL) into the VENUS-C infrastructure.
- Civil Protection and Emergency: Early fire risk detection (AEG), through an application that will run models on the VENUS-C infrastructure, based on multiple data sources.
- Civil Engineering: Support complex computing tasks on Building Information Management for green constructions (provided by COLB) and dynamic building structure analysis (provided by UPV).
- D4Science: Integrating computing through VENUS-C on data repositories (CNR). The particular focus will be on Marine Biodiversity through AquaMaps.

Open Call for 20 e-science Applications
20K funding each (in addition to Azure Compute, Storage and Network Resources) for:
- porting applications to the cloud
- education and training
- scalability tests

Software + Services to facilitate e-science Applications in the Cloud
[Architecture diagram. Labels: Existing Users, Applications, VPaaS, PaaS, MS Azure, Venus-C PaaS, Venus-C PaaS/DCI Interop, KTH, ENG, BSC, Current/New DCI. Legend: Existing design, Extensions.]

Strategy / Approach
- JRA1 Platform for Scientific Cloud
  - TJRA1.1 Architecture Definition and High-Level Design
  - TJRA1.2 Data Management
  - TJRA1.3 Programming Models
  - TJRA1.4 Application Security
  - TJRA1.5 Monitoring, Accounting and Billing
  - TJRA1.6 Networking and network security
- SA1 Infrastructure Deployment and Operation
  - TSA1.1 Requirements Gathering and Analysis
  - TSA1.2 VENUS-C Infrastructure Setup and Operation
  - TSA1.3 Technical Support for new Experiments
- SA2 Community Services Scenarios
  - TSA2.1 Structural Analysis for Civil Engineering
  - TSA2.2 Building Information Management
  - TSA2.3 Data for science - AquaMaps
  - TSA2.4 Civil Protection and Emergencies
  - TSA2.5 Bioinformatics
  - TSA2.6 System Biology
  - TNA2.7 Drug Discovery

Consortium (number, participant, short name, country)
1 (co). Engineering Ingegneria Informatica S.p.a., ENG, IT
2. European Microsoft Innovation Centre, EMIC, DE
3. European Charter of Open Grid Forum, OGF.eeig, UK
4. Barcelona Supercomputing Center - Centro Nacional de Supercomputación, BSC-CNS, ES
5. Universidad Politecnica de Valencia, UPV, ES
6. Kungliga Tekniska Hoegskolan, KTH, SE
7. University of the Aegean, AEG, GR
8. Technion, TECH, IL
9. Centre for Computational and Systems Biology, CoSBi, IT
10. University of Newcastle, NCL, UK
11. Consiglio Nazionale delle Ricerche, CNR, IT
12. Collaboratorio, COLB, IT

Questions?