The Ophidia framework: toward cloud- based big data analy;cs for escience Sandro Fiore, Giovanni Aloisio, Ian Foster, Dean Williams
|
|
- Bonnie Thompson
- 8 years ago
- Views:
Transcription
1 The Ophidia framework: toward cloud- based big data analy;cs for escience Sandro Fiore, Giovanni Aloisio, Ian Foster, Dean Williams Sandro Fiore, Ph.D. CMCC Scientific Computing and Operations Division
2 The big challenge is to model this complex system! Several complex processes to be simulated Several interacting processes Great range of time scales to be analyzed Great range of spatial scales to be considered Need interdisciplinar sciences (physics, chemistry, biology, geology, ) Inherently non-linear governing equations Need sophisticated numerics Need huge computational resources and large volumes of data can be produced Warren M. Washington NCAR Scien;fic Grand Challenges Workshop Series: Challenges in Climate Change Science and the Role of Compu;ng at the Extreme Scale DOE Workshop (ASCR- BER) November 6-7, 2008
3 ESGF & the CMIP5 data archive!
4 Climate change domain: the current scientific workflow and the ESGF use case! Workflow: search, locate, download, analyze, display results J. Chen, A. Choudhary, S. Feldman, B. Hendrickson, C.R. Johnson, R. Mount, V. Sarkar, V. White, D. Williams. Synergistic Challenges in Data- Intensive Science and Exascale Computing, DOE ASCAC Data Subcommittee Report, Department of Energy Office of Science, March, 2013.
5 The Ophidia Project Ophidia is a research effort carried out at the Euro Mediterranean Centre on Climate Change (CMCC) to address big data challenges, issues and requirements for climate change data analytics. S. Fiore, A. D Anca, C. Palazzo, I. Foster, D. N. Williams, G. Aloisio, Ophidia: toward bigdata analytics for escience, ICCS2013 Conference, Procedia Elsevier, Barcelona, June 5-7, 2013
6 Requirements and needs focus on: Time series analysis Data subsetting Model intercomparison Multimodel means Massive data reduction Data transformation (through array-based primitives) Param. Sweep experiments (same task applied on a set of data) Climate change signal Maps generation Ensemble analysis Data analytics worflow support But also Performance re-usability Data analytics requirements and use cases! extensibility
7 Ophidia Architecture Declarative language Front end Compute layer Standard interfaces Analytics Framework I/O layer Array-based primitives I/O server instance Storage layer System catalog New storage model Partitioning/hierarchical data mng
8 Array based primitives! Primitives provide array-based transformation A comprehensive set of primitives have been already implemented ( 100) By definition, a primitive is applied to a single fragment They come in the form of plugins (I/O server extensions) So far, Ophidia primitives perform data reduction, sub-setting, predicates evaluation, statistical analysis, compression, and so forth. Support is provided both for byte-oriented and bit-oriented arrays Plugins can be nested to get more complex functionalities Compression is provided as a primitive too Libraries like PetsC, GSL, C Math have been integrated
9 Array based primitives: OPH_BOXPLOT oph_boxplot(measure, "OPH_DOUBLE ) Single chunk or fragment (input) Single chunk or fragment (output)
10 Array based primitives: OPH_AGGREGATE oph_aggregate(measure,"oph_avg ) Single chunk or fragment (input) Ver%cal aggrega%on Single chunk or fragment (output)
11 Array based primitives: nesting feature oph_boxplot(oph_subarray(oph_uncompress(measure), 1,18), "OPH_DOUBLE ) Single chunk or fragment (input) Single chunk or fragment (output) subarray(measure, 1,18)
12 The analytics framework: datacube operators (about 50)! OPERATOR NAME OPERATOR DESCRIPTION Operators Data processing Domain-agnostic OPH_APPLY(datacube_in, datacube_out, array_based_primitive) OPH_DUPLICATE(datacube_ in, datacube_out) OPH_SUBSET(datacube_in, subset_string, datacube_out) OPH_MERGE(datacube_in, merge_param, datacube_out) OPH_SPLIT(datacube_in, split_param, datacube_out) OPH_INTERCOMPARISON (datacube_in1, datacube_in2, datacube_out) OPH_DELETE(datacube_in) Creates the datacube_out by applying the array-based primitive to the datacube_in Creates a copy of the datacube_in in the datacube_out Creates the datacube_out by doing a sub-setting of the datacube_in by applying the subset_string Creates the datacube_out by merging groups of merge_param fragments from datacube_in Creates the datacube_out by splitting into groups of split_param fragments each fragment of the datacube_in Creates the datacube_out which is the element-wise difference between datacube_in1 and datacube_in2 Removes the datacube_in Data Access (sequential and parallel operators) Metadata management (sequential and parallel operators) Data processing (parallel operators, MPI & OpenMP based) Import/Export (parallel operators) OPERATOR NAME OPERATOR DESCRIPTION Operators Data processing Domain-oriented OPH_EXPORT_NC Exports the datacube_in data into the (datacube_in, file_out) file_out NetCDF file. OPH_IMPORT_NC Imports the data stored into the file_in (file_in, datacube_out) NetCDF file into the new datacube_in datacube Operators Data access OPH_INSPECT_FRAG Inspects the data stored in the (datacube_in, fragment_in) fragment_in from the datacube_in OPH_PUBLISH(datacube_in) Publishes the datacube_in fragments into HTML pages Operators Metadata OPH_CUBE_ELEMENTS Provides the total number of the (datacube_in) elements in the datacube_in OPH_CUBE_SIZE Provides the disk space occupied by the (datacube_in) datacube_in OPH_LIST(void) Provides the list of available datacubes. OPH_CUBEIO(datacube_in) Provides the provenance information related to the datacube_in OPH_FIND(search_param) Provides the list of datacubes matching the search_param criteria
13 The analytics framework: datacube operators!
14 Metadata operator (provenance management)
15 EUBrazil Cloud Connect! The main objec%ve is the crea%on of a federated e- infrastructure for research using a user- centric approach. To achieve this, we need to pursue three objec%ves: Adapta;on of exis%ng applica%ons to tackle new scenarios emerging from coopera%on between Europe and Brazil relevant to both regions. Integra%on of frameworks and programming models for scien;fic gateways and complex workflows. Federa%on of resources, to build up a general- purpose infrastructure comprising exis;ng and heterogeneous resources Addi%onally, EUBrazilCC will: perform an ac%ve dissemina;on campaign, analyse innova;on, foster the involvement of Brazilian ins%tu%ons in cloud standards defini;on, and bring the EU Cloudscape series to broader interna%onal audience. EU Coordinator Ignacio Blanquer- Espert, iblanque@dsic.upv.es Universitat Politècnica de València, Spain BR Coordinator Francisco Vilar Brasileiro, fubica@dsc.ufcg.edu.br Universidade Federal de Campina Grande, Brazil!
16 Use Case on Biodiversity and Climate Change! Objec;ve: Understand the impact of climate change on terrestrial biodiversity through two workflows based on Earth observa%on and ground level data. Technical Challenge: Integrate parallel data analysis with other processing workflows in a geographically distributed environment. Interna;onal Added Value: Integra%on of biodiversity data and modelling with mul%spectral and remote sensing data for studying the cross- correla%on of biodiversity and climate change. Climate & Biodiversity Clearing- house Parallel Data Analysis Species- Link CMCC CIMP5 Imaging Data Federated Infrastructure & Pla\orm
17 Cloud-based deployment scenarios! Deployment A I/ O S C VM Instance Legend Applica%on Image Data Image Deployment B VM Instance S I/ O I/ O I/ O C VM Instance 1 VM Instance VM C Instance n C Data IO Component Compute Component Server Component Deployment C VM Instance 1 VM Instance 1 VM Instance S VM Instance VM Instance C n C C I/ O I/ O I/ O VM Instance VM C Instance m C
18 References! Papers [1] G. Aloisio, S. Fiore, I. Foster, D. N. Williams, Scientific big data analytics challenges at large scale, Big Data and Extreme-scale Computing (BDEC), April 30 to May 01, 2013, Charleston, USA (position paper). [2] S. Fiore, G. Aloisio, I. Foster, D. N. Williams, A software infrastructure for big data analytics, Big Data and Extreme-scale Computing (BDEC), February 26-28, 2014, Fukuoka, Japan (position paper). [3] S. Fiore, A. D'Anca, C. Palazzo, I. Foster, Dean N. Williams, Giovanni Aloisio, Ophidia: Toward Big Data Analytics for escience, ICCS 2013, June 5-7, 2013 Barcelona, Spain, Procedia Computer Science, Elsevier, pp [4] S. Fiore, C. Palazzo, A. D Anca, I. Foster, D. N. Williams, G. Aloisio, A big data analytics framework for scientific data management, Workshop on Big Data and Science: Infrastructure and Services, IEEE International Conference on BigData 2013, October 6-9, 2013, Santa Clara, USA, pp [5] S. Fiore, A. D'Anca, D. Elia, C. Palazzo, I. Foster, D. Williams, G. Aloisio, "Ophidia: A Full Software Stack for Scientific Data Analytics, proc. of the 2014 International Conference on High Performance Computing & Simulation (HPCS 2014), July 21 25, 2014, Bologna, Italy, pp , ISBN: For more info, please contact: - Dr. Sandro Fiore (sandro.fiore@cmcc.it) - Prof. Giovanni Aloisio (giovanni.aloisio@unisalento.it)
19 Questions?!
The Ophidia framework: toward big data analy7cs for climate change
The Ophidia framework: toward big data analy7cs for climate change Dr. Sandro Fiore Leader, Scientific data management research group Scientific Computing Division @ CMCC Prof. Giovanni Aloisio Director,
More informationData analy(cs workflows for climate
Data analy(cs workflows for climate Dr. Sandro Fiore Leader, Scientific data management research group Scientific Computing Division @ CMCC Prof. Giovanni Aloisio Director, Scientific Computing Division
More informationMajor Goals. Ignacio Blanquer Universitat Politècnica de València European Coordinator. Wagner Meira Jr. Universidade Federal de Minas Gerais
Major Goals Ignacio Blanquer Universitat Politècnica de València European Coordinator Directorate General for Communica1ons Networks, Content and Technology So=ware and Services, Cloud Presented by Dorgival
More informationA big data analytics framework for scientific data management
2013 IEEE International Conference on Big Data A big data analytics framework for scientific data management Sandro Fiore 1,2*, Cosimo Palazzo 1,2, Alessandro D Anca 1, Ian Foster 3, Dean N. Williams 4,
More informationThe ORIENTGATE data platform
Seminar on Proposed and Revised set of indicators June 4-5, 2014 - Belgrade (Serbia) The ORIENTGATE data platform WP2, Action 2.4 Alessandra Nuzzo, Sandro Fiore, Giovanni Aloisio Scientific Computing and
More informationFuture computing platforms for biodiversity science
www.bsc.es Future computing platforms for biodiversity science Daniele Lezzi Rome, 5 September 2013 Motivation Lack of service integration and interoperability of research e- Infrastructure e-irg 2013
More informationBig Data Research at DKRZ
Big Data Research at DKRZ Michael Lautenschlager and Colleagues from DKRZ and Scien:fic Compu:ng Research Group Symposium Big Data in Science Karlsruhe October 7th, 2014 Big Data in Climate Research Big
More informationExArch: Climate analytics on distributed exascale data archives Martin Juckes, V. Balaji, B.N. Lawrence, M. Lautenschlager, S. Denvil, G. Aloisio, P.
ExArch: Climate analytics on distributed exascale data archives Martin Juckes, V. Balaji, B.N. Lawrence, M. Lautenschlager, S. Denvil, G. Aloisio, P. Kushner, D. Waliser, S. Pascoe, A. Stephens, P. Kershaw,
More informationIS-ENES WP3. D3.8 - Report on Training Sessions
IS-ENES WP3 D3.8 - Report on Training Sessions Abstract: The deliverable D3.8 describes the organization and the outcomes of the tutorial meetings on the Grid Prototype designed within task4 of the NA2
More informationReturn on Experience on Cloud Compu2ng Issues a stairway to clouds. Experts Workshop Nov. 21st, 2013
Return on Experience on Cloud Compu2ng Issues a stairway to clouds Experts Workshop Agenda InGeoCloudS SoCware Stack InGeoCloudS Elas2city and Scalability Elas2c File Server Elas2c Database Server Elas2c
More informationNASA's Strategy and Activities in Server Side Analytics
NASA's Strategy and Activities in Server Side Analytics Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at the ESGF/UVCDAT Conference Lawrence Livermore National Laboratory
More informationThe ORIENTGATE data platform
Research Papers Issue RP0195 December 2013 The ORIENTGATE data platform SCO Scientific Computing and Operations Division By Alessandra Nuzzo University of Salento and Scientific Computing and Operations
More informationCMIP6 Data Management at DKRZ
CMIP6 Data Management at DKRZ icas2015 Annecy, France on 13 17 September 2015 Michael Lautenschlager Deutsches Klimarechenzentrum (DKRZ) With contributions from ESGF Executive Committee and WGCM Infrastructure
More informationEfficient and Scalable Climate Metadata Management with the GRelC DAIS
Efficient and Scalable Climate Metadata Management with the GRelC DAIS G. Aloisio, S. Fiore CMCC Scientific Computing and Operations Division University of Salento, Lecce Context : countdown of the Intergovernmental
More informationData Management in the Cloud: Limitations and Opportunities. Annies Ductan
Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduc)on Overview Vision of Cloud Compu8ng Managing Data in The Cloud Cloud Characteris8cs Data Management
More informationData Management and Analysis in Support of DOE Climate Science
Data Management and Analysis in Support of DOE Climate Science August 7 th, 2013 Dean Williams, Galen Shipman Presented to: Processing and Analysis of Very Large Data Sets Workshop The Climate Data Challenge
More informationNASA s Big Data Challenges in Climate Science
NASA s Big Data Challenges in Climate Science Tsengdar Lee, Ph.D. High-end Computing Program Manager NASA Headquarters Presented at IEEE Big Data 2014 Workshop October 29, 2014 1 2 7-km GEOS-5 Nature Run
More informationDenis Caromel, CEO Ac.veEon. Orchestrate and Accelerate Applica.ons. Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst Capacity
Cloud computing et Virtualisation : applications au domaine de la Finance Denis Caromel, CEO Ac.veEon Orchestrate and Accelerate Applica.ons Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst
More informationData Semantics Aware Cloud for High Performance Analytics
Data Semantics Aware Cloud for High Performance Analytics Microsoft Future Cloud Workshop 2011 June 2nd 2011, Prof. Jun Wang, Computer Architecture and Storage System Laboratory (CASS) Acknowledgement
More informationTowards a New Model for the Infrastructure Grid
INTERNATIONAL ADVANCED RESEARCH WORKSHOP ON HIGH PERFORMANCE COMPUTING AND GRIDS Cetraro (Italy), June 30 - July 4, 2008 Panel: From Grids to Cloud Services Towards a New Model for the Infrastructure Grid
More informationBig Data and Clouds: Challenges and Opportuni5es
Big Data and Clouds: Challenges and Opportuni5es NIST January 15 2013 Geoffrey Fox gcf@indiana.edu h"p://www.infomall.org h"p://www.futuregrid.org School of Informa;cs and Compu;ng Digital Science Center
More informationA Seman(c Approach to Cloud Infrastructure Management at Freudenberg IT
Michael Schmidt A Seman(c Approach to Cloud Infrastructure Management at Freudenberg IT Timo Weber San Francisco, June 03, 2013 Freudenberg Group Facts & Figures Freudenberg IT a company of Freudenberg
More informationMicrosoft Research Worldwide Presence
Microsoft Research Worldwide Presence MSR India MSR New England Redmond Redmond, Washington Sept, 1991 San Francisco, California Jun, 1995 Cambridge, United Kingdom July, 1997 Beijing, China Nov, 1998
More informationIntegration strategy
C3-INAD and ESGF: Integration strategy C3-INAD Middleware Team: Stephan Kindermann, Carsten Ehbrecht [DKRZ] Bernadette Fritzsch [AWI] Maik Jorra, Florian Schintke, Stefan Plantikov [ZUSE Institute] Markus
More informationNERSC Data Efforts Update Prabhat Data and Analytics Group Lead February 23, 2015
NERSC Data Efforts Update Prabhat Data and Analytics Group Lead February 23, 2015-1 - A little bit about myself Computer Scien.st Brown, IIT Delhi Real- 3me Graphics, Virtual Reality, HCI Computa3onal
More informationHigh Performance Computing in Horizon 2020. February 26-28, 2014 Fukuoka Japan
High Performance Computing in Horizon 2020 Big Data and Extreme Scale Computing Workshop 51214 February 26-28, 2014 Fukuoka Japan Excellence in Science DG CONNECT European Commission Jean-Yves Berthou
More informationExploration of adaptive network transfer for 100 Gbps networks Climate100: Scaling the Earth System Grid to 100Gbps Network
Exploration of adaptive network transfer for 100 Gbps networks Climate100: Scaling the Earth System Grid to 100Gbps Network February 1, 2012 Project period of April 1, 2011 through December 31, 2011 Principal
More informationThe data explosion is transforming science
Talk Outline The data tsunami and the 4 th paradigm of science The challenges for the long tail of science Where is the cloud being used now? The app marketplace SMEs Analytics as a service. What are the
More informationCMIP5 Data Management CAS2K13
CMIP5 Data Management CAS2K13 08. 12. September 2013, Annecy Michael Lautenschlager (DKRZ) With Contributions from ESGF CMIP5 Core Data Centres PCMDI, BADC and DKRZ Status DKRZ Data Archive HLRE-2 archive
More informationNetwork for Sustainable Ultrascale Computing (NESUS) www.nesus.eu
Network for Sustainable Ultrascale Computing (NESUS) www.nesus.eu Objectives of the Action Aim of the Action: To coordinate European efforts for proposing realistic solutions addressing major challenges
More informationPercipient StorAGe for Exascale Data Centric Computing
Percipient StorAGe for Exascale Data Centric Computing per cip i ent (pr-sp-nt) adj. Having the power of perceiving, especially perceiving keenly and readily. n. One that perceives. Introducing: Seagate
More informationData Centric Systems (DCS)
Data Centric Systems (DCS) Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems
More informationFederated Big Data for resource aggregation and load balancing with DIRAC
Procedia Computer Science Volume 51, 2015, Pages 2769 2773 ICCS 2015 International Conference On Computational Science Federated Big Data for resource aggregation and load balancing with DIRAC Víctor Fernández
More informationFrom classical web mapping publica2on to INSPIRE service architecture in the Cloud InGeoCloudS BRGM, Pierre LAGARDE
From classical web mapping publica2on to INSPIRE service architecture in the Cloud InGeoCloudS BRGM, Pierre LAGARDE The classical approach of the environmental data dissemina3on 2 The classical approach
More informationIntroduc)on of Pla/orm ISF. Weina Ma Weina.Ma@uoit.ca
Introduc)on of Pla/orm ISF Weina Ma Weina.Ma@uoit.ca Agenda Pla/orm ISF Product Overview Pla/orm ISF Concepts & Terminologies Self- Service Applica)on Management Applica)on Example Deployment Examples
More informationPART 1. Representations of atmospheric phenomena
PART 1 Representations of atmospheric phenomena Atmospheric data meet all of the criteria for big data : they are large (high volume), generated or captured frequently (high velocity), and represent a
More informationResearch Data Alliance: Current Activities and Expected Impact. SGBD Workshop, May 2014 Herman Stehouwer
Research Data Alliance: Current Activities and Expected Impact SGBD Workshop, May 2014 Herman Stehouwer The Vision 2 Researchers and innovators openly share data across technologies, disciplines, and countries
More informationBig Data Mining Services and Knowledge Discovery Applications on Clouds
Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades
More informationBig process for big data
Big process for big data Process automa9on for data- driven science Ian Foster Computa9on Ins9tute Argonne Na9onal Laboratory & The University of Chicago Talk at Astroinforma9cs 2012, Redmond, September
More informationGRNET Cloud Compu7ng Services An Overview
GRNET Cloud Compu7ng Services An Overview Panos Louridas louridas@grnet.gr GN3 Innova7on Workshop Copenhagen, October 10 11 2011 Greek Research and Technology Network 2 Outline u okeanos IaaS u pithos
More informationSelected Ac)vi)es Overseen by the WDAC
Selected Ac)vi)es Overseen by the WDAC Some background obs4mips ana4mips CREATE- IP (in planning stage) Contribu)ons from: M. Bosilovich, O. Brown, V. Eyring, R. Ferraro., P. Gleckler, R. Joseph, J. PoQer,
More informationThe Evolu*on of Service Management
The Evolu*on of Extending Disciplines Across the Enterprise Michael Jones Regional CTO - Architecture Michael.Jones@servicenow.com 2015 Now All Rights Reserved 1 How work gets done today! Emails Spreadsheets
More informationUsing the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova
Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel
More informationA Service for Data-Intensive Computations on Virtual Clusters
A Service for Data-Intensive Computations on Virtual Clusters Executing Preservation Strategies at Scale Rainer Schmidt, Christian Sadilek, and Ross King rainer.schmidt@arcs.ac.at Planets Project Permanent
More informationHPC technology and future architecture
HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange benoit.lange@inria.fr Toàn Nguyên toan.nguyen@inria.fr
More informationThe THREDDS Data Repository: for Long Term Data Storage and Access
8B.7 The THREDDS Data Repository: for Long Term Data Storage and Access Anne Wilson, Thomas Baltzer, John Caron Unidata Program Center, UCAR, Boulder, CO 1 INTRODUCTION In order to better manage ever increasing
More informationRecent and Future Activities in HPC and Scientific Data Management Siegfried Benkner
Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner Research Group Scientific Computing Faculty of Computer Science University of Vienna AUSTRIA http://www.par.univie.ac.at
More informationNERSC REQUIREMENTS FOR ENABLING CS RESEARCH IN EXASCALE DATA ANALYTICS
NERSC REQUIREMENTS FOR ENABLING CS RESEARCH IN EXASCALE DATA ANALYTICS Nagiza F. Samatova samatovan@ornl.gov Oak Ridge Na5onal Laboratory North Carolina State University January 5, 2011 ACKNOWLEDGEMENTS
More informationWebcasting vs. Web Conferencing. Webcasting vs. Web Conferencing
Webcasting vs. Web Conferencing 0 Introduction Webcasting vs. Web Conferencing Aside from simple conference calling, most companies face a choice between Web conferencing and webcasting. These two technologies
More informationNext Generation Networking for Science Program Update. NGNS Program Managers Richard Carlson Thomas Ndousse ASCAC meeting 11/21/2014
Next Generation Networking for Science Program Update NGNS Program Managers Richard Carlson Thomas Ndousse ASCAC meeting 11/21/2014 NGNS Program Mission The Goal of the program is to research, develop,
More informationA Technology for BigData Analysis Task Description using Domain-Specific Languages
A Technology for Big Analysis Task Description using Domain-Specific Languages Sergey V. Kovalchuk 1, Artem V. Zakharchuk 1, Jiaqi Liao 1, Sergey V. Ivanov 1, Alexander V. Boukhanovsky 1,2 1 ITMO University,
More informationNCDC Strategic Vision
NOAA s National Climatic Data Center World s Largest Archive of Climate and Weather Data Presented to: Coastal Environmental Disasters Data Management Workshop September 16, 2014 Stephen Del Greco Deputy
More informationBig Data and Scientific Discovery
Big Data and Scientific Discovery Bill Harrod Office of Science William.Harrod@science.doe.gov! February 26, 2014! Big Data and Scien*fic Discovery Next genera*on scien*fic breakthroughs require: Major
More informationCLOUD BASED N-DIMENSIONAL WEATHER FORECAST VISUALIZATION TOOL WITH IMAGE ANALYSIS CAPABILITIES
CLOUD BASED N-DIMENSIONAL WEATHER FORECAST VISUALIZATION TOOL WITH IMAGE ANALYSIS CAPABILITIES M. Laka-Iñurrategi a, I. Alberdi a, K. Alonso b, M. Quartulli a a Vicomteh-IK4, Mikeletegi pasealekua 57,
More informationData-Intensive Science and Scientific Data Infrastructure
Data-Intensive Science and Scientific Data Infrastructure Russ Rew, UCAR Unidata ICTP Advanced School on High Performance and Grid Computing 13 April 2011 Overview Data-intensive science Publishing scientific
More informationGeneralized Architecture for Dynamic Infrastructure Services
Generalized Architecture for Dynamic Infrastructure Services On behalf of GEYSERS Consor3um (presented by Pascale Vicat Blanc, GEYSERS Dissemina3on WPL) GLIF 2010, CERN, France 14 th october 2010 Grant
More informationSoftware Defined Networking for Extreme- Scale Science: Data, Compute, and Instrument Facilities
ASCR Intelligent Network Infrastructure Software Defined Networking for Extreme- Scale Science: Data, Compute, and Instrument Facilities Report of DOE ASCR Intelligent Network Infrastructure Workshop August
More informationHigh Performance Computing OpenStack Options. September 22, 2015
High Performance Computing OpenStack PRESENTATION TITLE GOES HERE Options September 22, 2015 Today s Presenters Glyn Bowden, SNIA Cloud Storage Initiative Board HP Helion Professional Services Alex McDonald,
More informationCEDA Storage. Dr Matt Pritchard. Centre for Environmental Data Archival (CEDA) www.ceda.ac.uk
CEDA Storage Dr Matt Pritchard Centre for Environmental Data Archival (CEDA) www.ceda.ac.uk How we store our data NAS Technology Backup JASMIN/CEMS CEDA Storage Data stored as files on disk. Data is migrated
More informationCloud Computing for Fundamental Spatial Operations on Polygonal GIS Data
1 Cloud Computing for Fundamental Spatial Operations on Polygonal GIS Data Dinesh Agarwal, Satish Puri, Xi He, and Sushil K. Prasad 1 Department of Computer Science Georgia State University Atlanta - 30303,
More informationCloud Computing @ JPL Science Data Systems
Cloud Computing @ JPL Science Data Systems Emily Law, GSAW 2011 Outline Science Data Systems (SDS) Space & Earth SDSs SDS Common Architecture Components Key Components using Cloud Computing Use Case 1:
More informationApplying Apache Hadoop to NASA s Big Climate Data!
National Aeronautics and Space Administration Applying Apache Hadoop to NASA s Big Climate Data! Use Cases and Lessons Learned! Glenn Tamkin (NASA/CSC)!! Team: John Schnase (NASA/PI), Dan Duffy (NASA/CO),!
More informationRevoScaleR Speed and Scalability
EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution
More informationThe Study of a Hierarchical Hadoop Architecture in Multiple Data Centers Environment
Send Orders for Reprints to reprints@benthamscience.ae The Open Cybernetics & Systemics Journal, 2015, 9, 131-137 131 Open Access The Study of a Hierarchical Hadoop Architecture in Multiple Data Centers
More informationHow To Manage Cloud Service Provisioning And Maintenance
Managing Cloud Service Provisioning and SLA Enforcement via Holistic Monitoring Techniques Vincent C. Emeakaroha Matrikelnr: 0027525 vincent@infosys.tuwien.ac.at Supervisor: Univ.-Prof. Dr. Schahram Dustdar
More informationHow To Build An Elastic Distributed System Over Big Data. http://esgf.org
How To Build An Elastic Distributed System Over Big Data Overview! The Earth System Grid Federation (ESGF.org) is indeed a spontaneous, unfunded, community-driven, open-source, collaborative effort to
More informationA General Approach to Real-time Workflow Monitoring Karan Vahi, Ewa Deelman, Gaurang Mehta, Fabio Silva
A General Approach to Real-time Workflow Monitoring Karan Vahi, Ewa Deelman, Gaurang Mehta, Fabio Silva USC Information Sciences Institute Ian Harvey, Ian Taylor, Kieran Evans, Dave Rogers, Andrew Jones,
More informationHortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved
Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment
More informationElastic Management of Cluster based Services in the Cloud
First Workshop on Automated Control for Datacenters and Clouds (ACDC09) June 19th, Barcelona, Spain Elastic Management of Cluster based Services in the Cloud Rafael Moreno Vozmediano, Ruben S. Montero,
More informationThe Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets
The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets!! Large data collections appear in many scientific domains like climate studies.!! Users and
More informationNext-Generation Networking for Science
Next-Generation Networking for Science ASCAC Presentation March 23, 2011 Program Managers Richard Carlson Thomas Ndousse Presentation
More informationBig Data Services at DKRZ
Big Data Services at DKRZ Michael Lautenschlager and Colleagues from DKRZ and Scientific Computing Research Group MPI-M Seminar Hamburg, March 31st, 2015 Big Data in Climate Research Big data is an all-encompassing
More informationOT- Med: Objec,va Terra - Mediterraneum. Joël Guiot
OT- Med: Objec,va Terra - Mediterraneum Joël Guiot Context The OT- Med Labex has been defined in this context q The Mediterranean Basin has been a key area of human- environment interac@ons for thousands
More informationData Requirements from NERSC Requirements Reviews
Data Requirements from NERSC Requirements Reviews Richard Gerber and Katherine Yelick Lawrence Berkeley National Laboratory Summary Department of Energy Scientists represented by the NERSC user community
More informationScientific Computing's Productivity Gridlock and How Software Engineering Can Help
Scientific Computing's Productivity Gridlock and How Software Engineering Can Help Stuart Faulk, Ph.D. Computer and Information Science University of Oregon Stuart Faulk CSESSP 2015 Outline Challenges
More informationReal-Time Analytics on Large Datasets: Predictive Models for Online Targeted Advertising
Real-Time Analytics on Large Datasets: Predictive Models for Online Targeted Advertising Open Data Partners and AdReady April 2012 1 Executive Summary AdReady is working to develop and deploy sophisticated
More informationOpenStack @ Cisco. Daneyon Hansen 3/28/2012. 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confiden+al 1
OpenStack @ Cisco Daneyon Hansen 3/28/2012 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confiden+al 1 Customer value is moving up into sogware and web services Virtualiza+on and internet
More informationTowards Analytical Data Management for Numerical Simulations
Towards Analytical Data Management for Numerical Simulations Ramon G. Costa, Fábio Porto, Bruno Schulze {ramongc, fporto, schulze}@lncc.br National Laboratory for Scientific Computing - RJ, Brazil Abstract.
More informationThe Business Case for Virtualization Management: A New Approach to Meeting IT Goals By Rich Corley Akorri
The BusinessCase forvirtualization Management: A New ApproachtoMeetingITGoals ByRichCorley Akorri July2009 The Business Case for Virtualization Management: A New Approach to Meeting IT Goals By Rich Corley
More informationDoing Big Data Projects: What s the Best Team Process Methology?
Doing Big Data Projects: What s the Best Team Process Methology? October 2015 1 Executive Summary What s the Best Team Process Methology? September 2015 2 Executive Summary What s the Best Team Process
More informationHPC Programming Framework Research Team
HPC Programming Framework Research Team 1. Team Members Naoya Maruyama (Team Leader) Motohiko Matsuda (Research Scientist) Soichiro Suzuki (Technical Staff) Mohamed Wahib (Postdoctoral Researcher) Shinichiro
More informationInteractive Data Visualization with Focus on Climate Research
Interactive Data Visualization with Focus on Climate Research Michael Böttinger German Climate Computing Center (DKRZ) 1 Agenda Visualization in HPC Environments Climate System, Climate Models and Climate
More informationIbis: Scaling Python Analy=cs on Hadoop and Impala
Ibis: Scaling Python Analy=cs on Hadoop and Impala Wes McKinney, Budapest BI Forum 2015-10- 14 @wesmckinn 1 Me R&D at Cloudera Serial creator of structured data tools / user interfaces Mathema=cian MIT
More informationReliable Systolic Computing through Redundancy
Reliable Systolic Computing through Redundancy Kunio Okuda 1, Siang Wun Song 1, and Marcos Tatsuo Yamamoto 1 Universidade de São Paulo, Brazil, {kunio,song,mty}@ime.usp.br, http://www.ime.usp.br/ song/
More informationGeoKettle: A powerful open source spatial ETL tool
GeoKettle: A powerful open source spatial ETL tool FOSS4G 2010 Dr. Thierry Badard, CTO Spatialytics inc. Quebec, Canada tbadard@spatialytics.com Barcelona, Spain Sept 9th, 2010 What is GeoKettle? It is
More informationSoftware services competence in research and development activities at PSNC. Cezary Mazurek PSNC, Poland
Software services competence in research and development activities at PSNC Cezary Mazurek PSNC, Poland Workshop on Actions for Better Participation of New Member States to FP7-ICT Timişoara, 18/19-03-2010
More informationMaximising the utility of OpeNDAP datasets through the NetCDF4 API
Maximising the utility of OpeNDAP datasets through the NetCDF4 API Stephen Pascoe (Stephen.Pascoe@stfc.ac.uk) Chris Mattmann (chris.a.mattmann@jpl.nasa.gov) Phil Kershaw (Philip.Kershaw@stfc.ac.uk) Ag
More informationCloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com
Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...
More informationBig Data and the Earth Observation and Climate Modelling Communities: JASMIN and CEMS
Big Data and the Earth Observation and Climate Modelling Communities: JASMIN and CEMS Workshop on the Future of Big Data Management 27-28 June 2013 Philip Kershaw Centre for Environmental Data Archival
More informationMicroStrategy Course Catalog
MicroStrategy Course Catalog 1 microstrategy.com/education 3 MicroStrategy course matrix 4 MicroStrategy 9 8 MicroStrategy 10 table of contents MicroStrategy course matrix MICROSTRATEGY 9 MICROSTRATEGY
More informationOverview on Modern Accelerators and Programming Paradigms Ivan Giro7o igiro7o@ictp.it
Overview on Modern Accelerators and Programming Paradigms Ivan Giro7o igiro7o@ictp.it Informa(on & Communica(on Technology Sec(on (ICTS) Interna(onal Centre for Theore(cal Physics (ICTP) Mul(ple Socket
More informationLustre * Filesystem for Cloud and Hadoop *
OpenFabrics Software User Group Workshop Lustre * Filesystem for Cloud and Hadoop * Robert Read, Intel Lustre * for Cloud and Hadoop * Brief Lustre History and Overview Using Lustre with Hadoop Intel Cloud
More informationKnowledgent White Paper Series. Developing an MDM Strategy WHITE PAPER. Key Components for Success
Developing an MDM Strategy Key Components for Success WHITE PAPER Table of Contents Introduction... 2 Process Considerations... 3 Architecture Considerations... 5 Conclusion... 9 About Knowledgent... 10
More informationBPO. Accerela*ng Revenue Enhancements Through Sales Support Services
BPO Accerela*ng Revenue Enhancements Through Sales Support Services What is BPO? Business Process Outsorcing (BPO) is the process of outsourcing specific business func6ons to a third- party service provider
More informationInteraction with other IT projects: EUDAT2020, VLDATA, ENVRI PLUS,
Interaction with other IT projects: EUDAT2020, VLDATA, ENVRI PLUS, A. Spinuso, L. Trani, A. Strollo and D. Bailo EPOS PP final meeting, Rome, 22-24 October 2014 OUTLINE WG1 and the EIDA use case A modular
More informationEuropean Data Infrastructure - EUDAT Data Services & Tools
European Data Infrastructure - EUDAT Data Services & Tools Dr. Ing. Morris Riedel Research Group Leader, Juelich Supercomputing Centre Adjunct Associated Professor, University of iceland BDEC2015, 2015-01-28
More informationEnterprise Content Management (ECM)
Enterprise Content Management (ECM) What it is, Creating a Domain Architecture for ECM, Why you would want to do this, and where to get help if you do ;) The other guy could not make it I m Glenn A bit
More informationMaking a Smooth Transition to a Hybrid Cloud with Microsoft Cloud OS
Making a Smooth Transition to a Hybrid Cloud with Microsoft Cloud OS Transitioning from today s highly virtualized data center environments to a true cloud environment requires solutions that let companies
More informationDevelopment of Earth Science Observational Data Infrastructure of Taiwan. Fang-Pang Lin National Center for High-Performance Computing, Taiwan
Development of Earth Science Observational Data Infrastructure of Taiwan Fang-Pang Lin National Center for High-Performance Computing, Taiwan GLIF 13, Singapore, 4 Oct 2013 The Path from Infrastructure
More informationLesley Wyborn and Ben Evans. nci.org.au. IEEE Big Data in the Geosciences Workshop, Santa Clara 2015. National Computational Infrastructure 2015
Integra(ng Big Geoscience Data into the Petascale Na(onal Environmental Research Interoperability Pla
More information