AVALON Center-Team Proposal (Proposition d'équipe-centre AVALON), C. Perez

1 AVALON Center-Team Proposal (Proposition d'équipe-centre AVALON)
C. Perez
Algorithms and Software Architectures for Service Oriented Platforms
Avalon, Grenoble Rhône-Alpes, January 6th, 2012

2 Where are we coming from
GRAAL EPI
- 2008: G. Fedak and C. Perez joined the team in the fall
- 2010: Creation of two LIP teams: Avalon and Roma
- 2012: Final evaluation in the fall
RESO EPI
- 2012: Final evaluation in the spring
CC IN2P3
- 2008: F. Suter hired in the fall

3 Members
Assistant
- Evelyne Blesle
Faculty Members (3 UCBL, 1 ENS, 4 INRIA, 1 CNRS)
- Yves Caniou, MCF UCBL
- Eddy Caron, MCF ENS Lyon, HDR (80%)
- Frédéric Desprez, DR1 INRIA, HDR (80%)
- Gilles Fedak, CR1 INRIA
- Jean-Patrick Gelas, MCF UCBL
- Olivier Glück, MCF UCBL
- Laurent Lefèvre, CR1 INRIA
- Christian Perez, DR2 INRIA, HDR, Project leader
- Frédéric Suter, CR1 CNRS, CC-IN2P3
Engineers
- Daniel Balouek, INRIA (AEH)
- Matthieu Imbert, INRIA SED (50%, until 3/12)
- Maxime Morel, INRIA
- Olivier Mornard, INRIA
- Jose Francisco Saray, INRIA
- Julien Carpentier, INRIA
PhD students
- Mohammed El Mehdi Diouri, MENRT
- Maurice Djibril Faye, Senegal and ENS grant
- Sylvain Gault, INRIA
- Cristian Klein, INRIA
- George Markomanolis, CORDIS INRIA
- Adrian Muresan, MENRT
- Vincent Pichon, CIFRE EDF
- Anthony Simonet, INRIA
- Ghislain Landry Tsafack, CORDIS AEH, INRIA
Postdocs
- Simon Delamare, INRIA, FP7 EDGI (until 2/12)
- Florent Chuffart, INRIA
- Zhengxiong Hou, INRIA (should start in 2/12)
- Jonathan Rouzaud-Cornabas, PRES Lyon
- Bing Tang, INRIA

4 Members
Same member list as on slide 3, with one added detail: Olivier Mornard, INRIA (until Spring 2012).

5 Members in January 2012
Faculty Members (1 UCBL, 1 ENS, 3 INRIA, 1 CNRS); the member list is the same as on slide 3.
Legend: Members of Avalon/GRAAL, Members of Reso that will join Avalon, Member of CC-IN2P3

6 Agenda
- Evolution of HPC and distributed computing
- Research activities
  - Four research axes
  - Two transverse research topics
- Software
- Positioning
- Conclusion
- Example: Google's MapReduce

7 Algorithms and Software Architectures for Service Oriented Platforms
Evolution of computing platforms
- Platforms based on the aggregation of large clusters (Grids), huge datacenters (Clouds), collections of PCs (Desktop grids), and/or supercomputers (HPC)
- Different characteristics: performance, energy, size, cost, reliability, quality of service, etc.
- Common challenges: large scale, heterogeneity, volatility, on-demand
Overall idea
- Consider the whole system, ranging from resources to applications, as a set of services to be composed
  - Transparent access to these platforms through services providing mandatory features such as resource discovery, deployment, load-balancing, etc.
Long term goals of Avalon
- Contribute to the design of programming models supporting many kinds of architectures
- Provide simple-to-use abstractions of resources
- Make efficient use of them by mastering the various algorithmic issues involved
- Contribute to the development of middleware frameworks at different levels
- By studying the impact on application-level algorithms

8 Avalon: Research Activities
[Figure: between Applications and resources, four axes (Programming Abstractions, Algorithmics, Resource Abstractions, Application & Resource Models) and two transverse topics (Elasticity, Energy)]
Applications
- CPU/data-intensive scientific applications
- From simple to code coupling
- Complexity
- Code and language heterogeneity
- New forms of interactions (MR)
Objectives
- Expressiveness, simplicity
- Application portability
- Resource specific optimizations
- Elastic resource management
- Energy consumption
Team strength
- Complementary kinds of expertise
Resources
- Supercomputers (Exascale), Grids (EGI), Desktop Grids, Clouds (IaaS, PaaS)
- Large scale, heterogeneity, volatility, on demand

9 Avalon: Four Research Axes
Programming abstractions
- Composition based programming models such as component/workflow models
  - Transform resource-aware applications into resource-aware executions
- Domain specific languages
Resource abstractions
- Hybrid platform management
- Service deployment, service composition and orchestration, service discovery
- Data service and execution of compute/data-intensive applications
Application and resource models
- Accurate/realistic models for application and execution infrastructures
  - Application profiling models and tools
- Energy consumption models
Algorithmics
- Narrow the gap between programming models and resource management systems (RMS)
- Understand and define the abstractions offered by RMS so that they can achieve their goals (security, fairness, energy, etc.) while enabling advanced programming models to participate in resource selection

10 Programming Abstractions (1/2)
Involved researchers: E. Caron, G. Fedak, C. Perez
Parallel programming is hard
- Low level/infrastructure specific models (MPI, threads, etc.)
- Poorly composable models (e.g. MPI/OpenMP)
- Static models
- Hard to extract information for an efficient execution
Lots of partial knowledge
- Message oriented models (MPI, etc.)
- Shared and partitioned global address space models (OpenMP, UPC, etc.)
- Workflow models
- Component models (CCA, Fractal, etc.)
- Domain specific languages (GridRPC, MapReduce, etc.)

11 Programming Abstractions (2/2)
Objectives
- Contribute to defining programming models that are
  - Simple to use
  - Efficient on different kinds of infrastructures
  - Extensible w.r.t. composition operators
Methodology
- Focus on some important DSLs (GridRPC, MapReduce)
  - Acquire know-how by establishing results on well-defined use cases
  - Select DSLs with original features
  - Collaborate with standardization bodies (Open Grid Forum)
- Contribute to a general (component) model (HLCM)
  - Shall support all forms of composition (spatial, temporal, skeleton) and interactions
  - New kinds of composition operators motivated by applications
- Models are validated on use cases through prototype implementations

12 Resource Abstractions (1/2)
Involved researchers: E. Caron, G. Fedak, L. Lefèvre, J.-P. Gelas, O. Glück, C. Perez, F. Suter
Distributed Computing Infrastructures (DCIs)
- Many different DCIs with many different characteristics
  - Performance, reliability, prices, energy consumption, etc.
- Variety of access usages
  - Batch scheduler, reservation, on-demand, best-effort, virtualized, etc.
- Question: how to provide relevant abstractions to allow efficient resource usage?
Challenges
- Adequate resource management services
  - Large scale, heterogeneous, volatile, elastic
- Combining several DCIs together
- Feedback on how applications make use of resources
  - Energy monitoring

13 Resource Abstractions (2/2)
Objectives
- Design adapted services
  - Job submission, resource discovery, data management, monitoring
- Study at which level some advanced features have to/can be provided
  - Data-centric, reliability, security, etc.
Methodology
- Make use of existing DCI services as much as possible
  - Develop new services otherwise
- Develop/enhance middleware and services w.r.t. particular use cases
  - Data management and data-intensive computing (BitDew, DIET)
  - Workflows (DIET)
  - Component model (HLCM)
- Collaborate with makers of large-scale infrastructures
  - Grid'5000, FutureGrid, International Desktop Grid Federation

14 Application and Resource Models (1/2)
Involved researchers: F. Desprez, O. Glück, L. Lefèvre, F. Suter
Analysis of large scale distributed system behavior
- a.k.a. computing grids, data grids, IaaS clouds, supercomputers, etc.
- Need for better understanding for better usage
- Many performance indicators to analyze
  - Throughput, availability, reliability, energy consumption
- Question: how to analyze without disruption?
Mapping applications on such various infrastructures raises many questions
- Do I have to deploy it everywhere to know if it works well enough?
- How can I ensure that my applications will always work well in a shared context?
- How will it cope with faults and failures?

15 Application and Resource Models (2/2)
Objectives
- Contribute to an efficient and well-known simulation toolkit
  - As simulation allows us to test any what-if scenario
- Focus on
  - Storage resources (data-intensive sciences)
  - Regular parallel applications
Methodology
- Propose comprehensive and realistic models
  - Using expertise and logs from CC IN2P3 on hierarchical storage systems
  - Continue and extend work on off-line simulation of MPI applications
- Implement these models in SimGrid
  - To benefit from the existing simulation kernel and ecosystem
- Run validation campaigns to ensure soundness of results

16 Algorithmics (1/2)
Involved researchers: E. Caron, F. Desprez, C. Perez, F. Suter
Targeting
- The deployment of complex applications made of several tasks over complex architectures
  - Large scale, heterogeneous, shared, and elastic
- Taking into account different metrics that can be combined
  - Performance, resource usage, energy, etc.
Challenges
- Dynamicity/elasticity of the resources and the applications
- Automatic deployment (as much as possible)
- Implement the algorithms and heuristics over actual platforms

17 Algorithmics (2/2)
Objectives
- Design heuristics and algorithms for specific problems raised by our target applications and environments
- Focus on
  - (Dynamic) resource management
  - Storage management
Methodology
- Based on designed models
  - Application, middleware, and platform models
- Design algorithms and heuristics
- Validate them using
  - Simulation (SimGrid)
  - Actual platforms (Grid'5000, FutureGrid, OpenCirrus)
- Transfer the results (production grids/clouds, companies?)

18 Transverse Topic: Elasticity
Involved researchers: E. Caron, F. Desprez, G. Fedak, L. Lefèvre, C. Perez, F. Suter
Elasticity: our definition
- Runtime variation of the number of execution artifacts
  - Motivated by the application needs or the system constraints
From static to dynamic
- Not so new in some domains: distributed systems, desktop grids, clouds
- Major step for some domains: HPC, scientific applications
Challenges
- Define programming models compliant with all kinds of elasticity
- Define resource abstractions supporting elasticity
- Define algorithms managing elasticity
- Understand how elasticity impacts application and resource models

19 Transverse Topic: Energy Efficiency
Involved researchers: E. Caron, G. Fedak, J.-P. Gelas, O. Glück, L. Lefèvre, F. Suter
Context
- Energy usage and efficiency are not taken into account from the design stage of systems
- Energy is becoming the major constraint
- Many different DCIs with many different characteristics
- How to express QoS w.r.t. performance and energy efficiency?
Challenges
- Profile and predict energy usage of applications
- Investigate energy usage of (yet to come) infrastructures
  - Clouds, exascale, next generation networks
- Promote and inject energy usage measurements into decision systems
- Apply adapted green levers at multiple layers (network, CPU)

20 Avalon Overview
The team plans to contribute at different levels, including
- Programming models
- Distributed algorithms
- Resource management
- Deployment of services and service discovery
- Large scale data management
- Application profiling
- Energy measurement
Theoretical results will be validated on software prototypes using applications from different scientific fields such as bioinformatics, physics, cosmology, climatology, distributed games, etc.
Experimental testbed
- Simulation: SimGrid
- HPC: Grid'5000, PRACE, BluePrint, Bluewaters
- Cloud/Grid: Grid'5000, Ciment, Bonfire, FutureGrid

21 Software (1/3)
DIET (EC, FD)
- Used in production in the Décrypthon project
- Co-development with SysFera (GRAAL's startup)
- Experimental framework for our theoretical developments
- Future
  - Evolution towards virtualized platforms (dynamicity, elasticity)
  - Connection with monitoring (network, processors, multicore platforms, etc.)
BitDew (GF)
- Open source middleware for data management on desktop grids
- Future
  - Support for data-intensive processing: data-driven Master/Worker, MapReduce, Active Data
  - Support for hybrid DCIs: Grids (JSAGA), Clouds (S3, direct download)
  - Provide more abstractions for complex data management: network aware data placement, data space partitioning, language for abstraction composition

22 Software (2/3)
HLCMi (CP)
- Feasibility of a generic connector-based hierarchical component model
- Subsumes most previous models/implementations
- Future
  - More planning algorithms, better interactions with resources
  - Dynamicity (workflow), just in time transformation/compilation, etc.
SimGrid (FS)
- Chosen by CERN to simulate their data management infrastructure
- Intermediary experimental framework for our theoretical developments
- Future
  - Platform/service dimensioning through simulation (models/traces)
  - Extension of the simulation kernel for HPC, HPN, Data Grid and Cloud infrastructures with new models and high level concepts

23 Software (3/3)
ShowWatts (LL, JPG)
- Energy usage measurements of large scale distributed systems
- Used in the CompatibleOne/Magellan project
- Future
  - Integration of a large set of energy sensors
  - Measurement of virtual resources

24 Who is doing what
- Yves Caniou: data management, scheduling, resource management
- Eddy Caron: Grid & Cloud computing, distributed algorithms, scheduling, middleware
- Frédéric Desprez: Grid & Cloud computing, scheduling, resource and data management
- Gilles Fedak: desktop grid computing, hybrid DCI, data-intensive computing
- Jean-Patrick Gelas: energy, HPC, cloud computing, operating systems
- Olivier Glück: HPCN, communication middleware, energy and networking
- Laurent Lefèvre: energy efficiency, HPC and Cloud, green networks
- Christian Perez: HPC, Clouds, component models, application transformation, resource management
- Frédéric Suter: resource and application modeling, simulation, performance prediction, scheduling

25 Positioning
Field: Networks, Systems, Services and Distributed Computing
Theme: Distributed and High Performance Computing

26 Thematic Positioning inside INRIA
- Component Model: Adam, Oasis, Sardes
- Fault Tolerance: Grand-Large, Roma
- Distributed Systems: Adam, Sardes, Ascola, Myriads, Oasis
- Desktop Grid Computing: Mescal, Regal
- Distributed Algorithms: Cépage, Regal
- Energy Efficiency: Ascola, Algorille, Sardes
- Scheduling: Cépage, Moais, Mescal, Roma
Avalon scientific coordination under the INRIA umbrella
- Scientific director of the ADT Aladdin (Grid'5000)
- Scientific director of the Héméra large-scale initiative (24 teams in France)
  - Cooperation with the GDR ASR
- Co-leader of the programming model pole of the HpcCse large-scale initiative

27 International Positioning
Component model
- Industry: IBM, Microsoft, EDF
- Academic: BSC, Univ. Pisa, ORNL, ANL, Utah Univ., UIUC
Clouds
- Industry: Amazon, Microsoft, Google, IBM, Yahoo
- Academic: Univ. Comp. Madrid, ANL, Univ. Chicago, UCSB
HPC
- Industry: Bull, CEA, IBM, Fujitsu, Cray
- Academic: BSC, Fraunhofer, ORNL, ANL, UIUC, Univ. Berkeley, UTK, Univ. Tokyo
Scheduling
- Academic: Univ. Delft, Univ. of Göttingen, Univ. Hawai`i
Desktop Grid Computing
- Academic: Univ. Berkeley, Univ. Wisconsin, SZTAKI, Univ. of Campina Grande
Energy Efficiency
- Industry: Bell Labs, Bull, EATON
- Academic: Virginia Tech, Univ. of Sydney, Otago University, AIST

28 Conclusion: Self Assessment
Strengths
- Gather complementary competences (programming model, different kinds of infrastructure, energy, simulation) around an algorithmic core
- Research from theory to software development and experimental validation
  - Access to production platforms through CC-IN2P3
- Important involvement in national and international projects
- Success with various project submissions
- Strong collaboration with EDF R&D (components, grid middleware)
Potential weaknesses
- Team size
- Integration of new members
- Grid'5000 future?

29 One example application: Google's MapReduce

30 One example application: Google's MapReduce
Developed by Google in 2003
Programming model: dataflow programming
- Input & Output: each a set of key/value pairs
- Programmer defines two functions
  - map(in_key, in_value) -> list(out_key, intermediate_value)
    Processes an input key/value pair, produces a set of intermediate pairs
  - reduce(out_key, list(intermediate_value)) -> list(out_value)
    Combines all intermediate values for a particular key, produces a set of merged output values (usually just one)
Use cases: distributed grep, web link-graph reversal, distributed sort, web access log stats, document clustering, etc.
Open-source version: Hadoop (Java implementation of MapReduce + GFS + Bigtable)
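The map/reduce contract on this slide can be sketched in a few lines of Python. This is a minimal, sequential, in-memory illustration and not Google's or Hadoop's implementation; the names `map_words`, `reduce_sum`, and `run_mapreduce` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(in_key: str, in_value: str) -> Iterator[Tuple[str, int]]:
    """map(in_key, in_value) -> list(out_key, intermediate_value)."""
    for word in in_value.split():
        yield word.lower(), 1


def reduce_sum(out_key: str, values: List[int]) -> List[int]:
    """reduce(out_key, list(intermediate_value)) -> list(out_value)."""
    return [sum(values)]


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], List[int]],
) -> Dict[str, List[int]]:
    # Map phase followed by a "shuffle": group intermediate values by key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for in_key, in_value in inputs:
        for out_key, value in mapper(in_key, in_value):
            groups[out_key].append(value)
    # Reduce phase: one reducer call per distinct key.
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


docs = [("doc1", "the grid and the cloud"), ("doc2", "The cloud")]
counts = run_mapreduce(docs, map_words, reduce_sum)
print(counts)
```

A real deployment distributes the map and reduce calls over many machines; only the two user-defined functions stay the same.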

31 Parallel MapReduce Execution
[Figure: execution phases: Distribute, Map, Shuffle, Reduce, Combine]

32 MapReduce and Programming Issues
Christian Perez, Gilles Fedak

33 Objectives
Enable code re-use
- E.g. mapper or reducer code
- Let experts develop a piece of code not tied to a particular framework
Enable adaptation when re-using code
- E.g. a sum reducer not specific to a particular type of data
- Allow code re-use with parameterization options
Enable any kind of composition operators
- E.g. a mapper or reducer may interact with a DB
- Do not impose any communication model (framework)
Enable efficient implementation of composition operators
- E.g. enable resource specific optimization

34 MapReduce Skeleton in High Level Component Model
Enable code re-use: software components
- Primitive component for re-using implementation code
- Composite component for re-using assemblies of components
Enable adaptation when re-using code: genericity
Enable any kind of composition operators: connectors
Enable efficient implementation of composition operators: open connections
HLCM
- Open connection based: enables any kind of composition
  - MxN, shared data, collective communications, algorithmic skeletons (master-worker, map-reduce, ...)
- Genericity (template meta-programming)
HLCMi
- An MDE-based implementation of HLCM
Component MapReduce<Component Map, Component Reduce> exposes { In, Out }
[Figure: MapReduce<Mapper, Reducer> composite: Input, In, Mapper (#mapper? Self-*?), Reducer (#reducer?), Out, Output; data layer: BlobSeer? BitDew? Both?]
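The genericity idea behind `MapReduce<Mapper, Reducer>` can be illustrated outside HLCM. The sketch below is plain Python, not HLCM or HLCMi syntax; the class `MapReduce`, the function `sum_reducer`, and the two assemblies are hypothetical names used only to show one reducer being re-used with different data types.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, Iterable, List, Tuple, TypeVar

I = TypeVar("I")  # input item type
K = TypeVar("K")  # key type
V = TypeVar("V")  # value type


@dataclass
class MapReduce(Generic[I, K, V]):
    """Composite parameterized by a mapper and a reducer component."""

    mapper: Callable[[I], Iterable[Tuple[K, V]]]
    reducer: Callable[[List[V]], V]

    def run(self, items: Iterable[I]) -> Dict[K, V]:
        # "In" port: consume items; "Out" port: return one value per key.
        groups: Dict[K, List[V]] = {}
        for item in items:
            for key, value in self.mapper(item):
                groups.setdefault(key, []).append(value)
        return {key: self.reducer(values) for key, values in groups.items()}


def sum_reducer(values):
    """Sum reducer not tied to a data type: works for anything supporting +."""
    total = values[0]
    for value in values[1:]:
        total = total + value
    return total


# Same reducer code re-used in two assemblies with different value types.
word_count = MapReduce(mapper=lambda line: [(w, 1) for w in line.split()],
                       reducer=sum_reducer)
traffic = MapReduce(mapper=lambda rec: [(rec[0], rec[1])], reducer=sum_reducer)

print(word_count.run(["map reduce map"]))
print(traffic.run([("a", 1.5), ("b", 2.0), ("a", 0.5)]))
```

In HLCM the interactions between mapper and reducer would be expressed with connectors rather than direct function calls; that part is not modeled here.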

35 MapReduce and Scheduling Issues
Eddy Caron, Frédéric Desprez, Frédéric Suter

36 Context: development of a toolbox to deploy application service providers with a hierarchical architecture for scalability
Main research issues: scheduling, heterogeneity, automatic deployment, interoperability, high performance data transfer and management, monitoring, fault tolerance, genericity of solutions for various applications, static and dynamic analysis of performance
[Figure: hierarchical architecture (Corba): Client, Master Agents (MA), Local Agents (LA), Server front ends]
Validation: large validation over Grid
Interoperability: DIET is compliant with the OGF standardization of GridRPC; prototype with Aegis (JAEA, Japan)
DIET use case: the Décrypthon project
- DIET was selected by IBM
Start-up: SysFera (founded in March 2010)
Contact: F. Desprez, E. Caron

37 MapReduce through DIET
MapReduce paradigm on a GridRPC middleware
- Benefit from the workflow engine of DIET
- Data management (replication, migration)
- Functional workflow
Three solutions
- DFS emulation
- Sort couples services
- Tree reduction
Benefits
- Distributed MapReduce on mixed architectures: workstation, cluster, cloud
- Control on data localization
- Scheduling of multiple MapReduce applications
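The slide names tree reduction as one of the three solutions without giving details. As a generic illustration of the technique (not DIET's implementation), partial results can be combined pairwise, level by level, so that the combines within one level are independent of each other. The names `tree_reduce` and `merge_counts` are hypothetical.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def tree_reduce(values: List[T], combine: Callable[[T, T], T]) -> T:
    """Combine values pairwise, level by level, until one value remains."""
    if not values:
        raise ValueError("tree_reduce needs at least one value")
    level = list(values)
    while len(level) > 1:
        # Each pair in a level could be combined by a different service.
        next_level = [combine(level[i], level[i + 1])
                      for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            next_level.append(level[-1])  # odd element moves up unchanged
        level = next_level
    return level[0]


def merge_counts(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:
    """Merge two partial word-count dictionaries."""
    merged = dict(a)
    for key, value in b.items():
        merged[key] = merged.get(key, 0) + value
    return merged


partials = [{"grid": 2}, {"cloud": 1, "grid": 1}, {"cloud": 3}]
total = tree_reduce(partials, merge_counts)
print(total)
```

With n partial results the tree has about log2(n) levels, which is why the shape suits a distributed setting better than a single sequential fold.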

38 Scheduling Issues
Models
- Adding virtual machine abstraction layers
- Multiple processes on each CPU, multicore architectures
New metrics (cost, energy consumption, fairness at several levels)
- Multiple objective functions
Playing with new virtual machine features
- Migration, checkpointing, stop & restart, energy management
Elastic management of resources (auto-scaling), pilot-jobs
Scheduling (multiple) DAGs/workflows over elastic platforms
Taking into account the dynamicity of the platform
- Faults, adding/removing resources (auto-scaling)
Coupling resource scheduling with data management (compute to data)
- Automatic partitioning of tasks/data, automatic (and smart) replication
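To make "multiple objective functions" concrete, here is a small greedy placement sketch that trades completion time against energy through a weight `alpha`. Everything in it is an assumption made for illustration: the `Machine` model, the weighted-sum cost, and the largest-task-first order are not taken from the slides, and a real scheduler would normalize the two metrics before mixing them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Machine:
    name: str
    speed: float           # work units processed per second
    power: float           # watts drawn while busy
    ready_at: float = 0.0  # time at which the machine becomes free


def schedule(tasks: Dict[str, float], machines: List[Machine],
             alpha: float) -> Dict[str, str]:
    """Greedy placement; alpha=1 favors finish time, alpha=0 favors energy."""
    placement: Dict[str, str] = {}
    # Largest tasks first, a common list-scheduling order.
    for task, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        def cost(m: Machine) -> float:
            duration = work / m.speed
            finish = m.ready_at + duration
            energy = m.power * duration
            return alpha * finish + (1 - alpha) * energy

        best = min(machines, key=cost)
        best.ready_at += work / best.speed
        placement[task] = best.name
    return placement


def make_machines() -> List[Machine]:
    return [Machine("fast", speed=2.0, power=200.0),
            Machine("slow", speed=1.0, power=50.0)]


tasks = {"a": 4.0, "b": 2.0}
time_first = schedule(tasks, make_machines(), alpha=1.0)
energy_first = schedule(tasks, make_machines(), alpha=0.0)
print(time_first, energy_first)
```

With `alpha=1.0` the tasks spread over both machines to finish early; with `alpha=0.0` both go to the low-power machine.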

39 MapReduce and Desktop Grids
Gilles Fedak

40 MapReduce on Desktop Grids
Motivation
- Desktop grid (DG) systems have focused on bag-of-tasks (BoT) applications: CPU only
- Vast amount of free disk space to exploit for data-intensive applications (bioinformatics, web crawling, etc.)
  - Petabyte-scale processing
- Need for a programming model: MapReduce!
Challenges
- No shared file system or direct communication
- Faults and host churn
- Data replication management
- Result certification of intermediate data
- Collective operations: scatter + shuffle + gather/reduction

41 Runtime Optimization
Latency hiding
- Multi-thread: to overlap communication and computation
- Data-prefetch FS
Barrier-free computation
- Allows reduction to start before the end of map tasks
2-level scheduler
- Data + task (detect laggers)
Worker failures: heartbeat + re-execution
Laggers: replication of the latest tasks
Distributed result certification
- Replication of mappers/reducers + majority voting on reducers/master
WordCount application on Grid'5000 (512 nodes)
- Size of the document doubles with the number of nodes (2 TB max)
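Result certification by replication and majority voting can be sketched as follows. This is a minimal illustration under stated assumptions: replicas are compared through a result digest, and a strict majority is required by default. The function name `certify` and the digest strings are hypothetical and do not come from the slides.

```python
from collections import Counter
from typing import Hashable, List, Optional


def certify(results: List[Hashable],
            quorum: Optional[int] = None) -> Optional[Hashable]:
    """Return the result reported by a majority of replicas, else None."""
    if not results:
        return None
    # Default quorum: strict majority of the replicas that answered.
    needed = quorum if quorum is not None else len(results) // 2 + 1
    value, votes = Counter(results).most_common(1)[0]
    return value if votes >= needed else None


# Three replicas of the same map task report a digest of their output.
accepted = certify(["a1f3", "a1f3", "ffff"])
rejected = certify(["x", "y", "z"])
print(accepted, rejected)
```

When no value reaches the quorum, the task would typically be replicated again; that retry policy is left out of the sketch.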

42 One Example Application: Google's MapReduce
[Figure: Map and Reduce tasks]
Programming models
- Components
- Services
Scheduling
- Mixed parallelism
- Multiple DAGs
- Multiple objective functions: makespan, fairness, energy consumption, $
Platforms: IaaS (Clouds), Grids (Grid'5000, EGI), Desktop Grids, Simulation
Distributed scheduling
Service management
- Service discovery
- Service deployment
- Service composition
Elastic resource allocation
Energy consumption models

43 Thank you Grenoble Rhône-Alpes

44 Who is working on what
[Table: topics (EE, EL, ALG, ARM, RA, PM) by team member (FS, CP, LL, OG, JPG, GF, FD, EC, YC)]

45 Current Funded Project Participation
INRIA
- ADT BitDew, ADT ALADDIN
- AE Héméra
France
- Regional Grant, FastExpand ( )
- ANR SPADES, 3 years, 08-ANR-SEGI-025
- ANR COOP, 3 years, ANR-09-COSI
- ANR, 4 years, ANR-09-JCJC
- ANR MapReduce, 4 years, ANR-09-JCJC
- FUI, CompatibleOne
- FSN, Magellan
Europe
- FP7, BonFire (IP)
- FP7, EDGI (CPCSA)
- FP7, Prace 2IP (I3)
- FP7, SEED4C
- COST, IC804
- IIE, PrimeEnergyIt
- ERCIM WG CoreGRID
International
- French-Japanese ANR-JST FP3C project
- INRIA-UIUC-NCSA Joint Laboratory for Petascale Computing
- GreenTouch

46 Collaborations
French collaborations
- Algorille: L. Nussbaum, M. Quinson (FD, FS)
- Ascola: J.-M. Menaud (LL), A. Lebre (CP)
- Kerdata: G. Antoniu (EC, FD, CP, GF, FS)
- Mescal: D. Kondo (GF), A. Legrand, O. Richard (YC, EC, FS, FD)
- Myriads: C. Morin (CP)
- Regal: F. Petit (EC)
- RESO: L. Lefèvre (EC)
- Runtime: A. Denis (CP)
- Sardes: N. De Palma, F. Boyer (EC, FD)
- ENSEEIHT/IRIT: R. Guivarch, T. Monteil, J.-M. Pierson (YC, EC, FD, CP, LL, JPG)
- LAL IN2P3: O. Lodygensky (GF)
- Université de Picardie Jules Verne: G. Le Mahec, V. Vilain (EC)
- Université Joseph-Fourier: L. Viry, H. Gallée (EC)
- Combining: G. Beslon (YC, FD)
- IGBMC, Strasbourg: O. Poch et al. (NB)
International collaborations
- University of Manoa, Honolulu, Hawaii: H. Casanova (FD, FS)
- University of Las Vegas, Nevada: A. Datta (EC)
- University of Urbana-Champaign, IL: L. Kale (CP), F. Cappello (EC, LL, OG)
- University of Tokyo, Japan: P. Codognet (YC)
- University HUST Wuhan, China: J. Hai,. Shi (GF)
- AIST, Japan: H. Nakada (YC, EC)
Industrial collaborations
- EDF R&D* (EC, FD, CP)
- Bull* (EC, CP, LL, JPG)
- Bell Labs (LL, JPG)
- IBM (FD)
- Fast Expand (EC, FD)
* INRIA framework agreement


More information

Big Data Processing with Google s MapReduce. Alexandru Costan

Big Data Processing with Google s MapReduce. Alexandru Costan 1 Big Data Processing with Google s MapReduce Alexandru Costan Outline Motivation MapReduce programming model Examples MapReduce system architecture Limitations Extensions 2 Motivation Big Data @Google:

More information

Flauncher and DVMS Deploying and Scheduling Thousands of Virtual Machines on Hundreds of Nodes Distributed Geographically

Flauncher and DVMS Deploying and Scheduling Thousands of Virtual Machines on Hundreds of Nodes Distributed Geographically Flauncher and Deploying and Scheduling Thousands of Virtual Machines on Hundreds of Nodes Distributed Geographically Daniel Balouek, Adrien Lèbre, Flavien Quesnel To cite this version: Daniel Balouek,

More information

Agenda: 1. Background 2. Solution: ProActive 3. Live Demonstration 4. IFP EN Use Case

Agenda: 1. Background 2. Solution: ProActive 3. Live Demonstration 4. IFP EN Use Case Advances in Cloud Computing with ProActive Parallel Suite D. Caromel Accelerate and Orchestrate Enterprise Applications Hybrid Cloud Solutions (Private with Public Burst) Agenda: 1. Background 2. Solution:

More information

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS

PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS PLATFORM AND SOFTWARE AS A SERVICE THE MAPREDUCE PROGRAMMING MODEL AND IMPLEMENTATIONS By HAI JIN, SHADI IBRAHIM, LI QI, HAIJUN CAO, SONG WU and XUANHUA SHI Prepared by: Dr. Faramarz Safi Islamic Azad

More information

BIGhybrid - A Toolkit for Simulating MapReduce on Hybrid Infrastructures

BIGhybrid - A Toolkit for Simulating MapReduce on Hybrid Infrastructures BIGhybrid - A Toolkit for Simulating MapReduce on Hybrid Infrastructures Julio Cesar Santos dos Anjos, Gilles Fedak, Claudio Geyer To cite this version: Julio Cesar Santos dos Anjos, Gilles Fedak, Claudio

More information

OpenNebula Cloud Innovation and Case Studies for Telecom

OpenNebula Cloud Innovation and Case Studies for Telecom Telecom Cloud Standards Information Day Hyatt Regency, Santa Clara, CA, USA 6-7 December, 2010 OpenNebula Cloud Innovation and Case Studies for Telecom Constantino Vázquez Blanco DSA-Research.org Distributed

More information

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel Parallel Databases Increase performance by performing operations in parallel Parallel Architectures Shared memory Shared disk Shared nothing closely coupled loosely coupled Parallelism Terminology Speedup:

More information

BlobSeer: Enabling Efficient Lock-Free, Versioning-Based Storage for Massive Data under Heavy Access Concurrency

BlobSeer: Enabling Efficient Lock-Free, Versioning-Based Storage for Massive Data under Heavy Access Concurrency BlobSeer: Enabling Efficient Lock-Free, Versioning-Based Storage for Massive Data under Heavy Access Concurrency Gabriel Antoniu 1, Luc Bougé 2, Bogdan Nicolae 3 KerData research team 1 INRIA Rennes -

More information

Computing in clouds: Where we come from, Where we are, What we can, Where we go

Computing in clouds: Where we come from, Where we are, What we can, Where we go Computing in clouds: Where we come from, Where we are, What we can, Where we go Luc Bougé ENS Cachan/Rennes, IRISA, INRIA Biogenouest With help from many colleagues: Gabriel Antoniu, Guillaume Pierre,

More information

BlobSeer: Towards efficient data storage management on large-scale, distributed systems

BlobSeer: Towards efficient data storage management on large-scale, distributed systems : Towards efficient data storage management on large-scale, distributed systems Bogdan Nicolae University of Rennes 1, France KerData Team, INRIA Rennes Bretagne-Atlantique PhD Advisors: Gabriel Antoniu

More information

Sriram Krishnan, Ph.D. sriram@sdsc.edu

Sriram Krishnan, Ph.D. sriram@sdsc.edu Sriram Krishnan, Ph.D. sriram@sdsc.edu (Re-)Introduction to cloud computing Introduction to the MapReduce and Hadoop Distributed File System Programming model Examples of MapReduce Where/how to run MapReduce

More information

Neptune. A Domain Specific Language for Deploying HPC Software on Cloud Platforms. Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams

Neptune. A Domain Specific Language for Deploying HPC Software on Cloud Platforms. Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams Neptune A Domain Specific Language for Deploying HPC Software on Cloud Platforms Chris Bunch Navraj Chohan Chandra Krintz Khawaja Shams ScienceCloud 2011 @ San Jose, CA June 8, 2011 Cloud Computing Three

More information

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging

More information

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets

More information

A SURVEY ON MAPREDUCE IN CLOUD COMPUTING

A SURVEY ON MAPREDUCE IN CLOUD COMPUTING A SURVEY ON MAPREDUCE IN CLOUD COMPUTING Dr.M.Newlin Rajkumar 1, S.Balachandar 2, Dr.V.Venkatesakumar 3, T.Mahadevan 4 1 Asst. Prof, Dept. of CSE,Anna University Regional Centre, Coimbatore, newlin_rajkumar@yahoo.co.in

More information

A Cost-Evaluation of MapReduce Applications in the Cloud

A Cost-Evaluation of MapReduce Applications in the Cloud 1/23 A Cost-Evaluation of MapReduce Applications in the Cloud Diana Moise, Alexandra Carpen-Amarie Gabriel Antoniu, Luc Bougé KerData team 2/23 1 MapReduce applications - case study 2 3 4 5 3/23 MapReduce

More information

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture

DATA MINING WITH HADOOP AND HIVE Introduction to Architecture DATA MINING WITH HADOOP AND HIVE Introduction to Architecture Dr. Wlodek Zadrozny (Most slides come from Prof. Akella s class in 2014) 2015-2025. Reproduction or usage prohibited without permission of

More information

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical

Write a technical report Present your results Write a workshop/conference paper (optional) Could be a real system, simulation and/or theoretical Identify a problem Review approaches to the problem Propose a novel approach to the problem Define, design, prototype an implementation to evaluate your approach Could be a real system, simulation and/or

More information

San Diego Supercomputer Center, UCSD. Institute for Digital Research and Education, UCLA

San Diego Supercomputer Center, UCSD. Institute for Digital Research and Education, UCLA Facilitate Parallel Computation Using Kepler Workflow System on Virtual Resources Jianwu Wang 1, Prakashan Korambath 2, Ilkay Altintas 1 1 San Diego Supercomputer Center, UCSD 2 Institute for Digital Research

More information

PARIS*: Programming parallel and distributed systems for large scale numerical simulation applications. Christine Morin IRISA/INRIA

PARIS*: Programming parallel and distributed systems for large scale numerical simulation applications. Christine Morin IRISA/INRIA PARIS*: Programming parallel and distributed systems for large scale numerical simulation applications Kerrighed, Vigne Christine Morin IRISA/INRIA * Common project with CNRS, ENS-Cachan, INRIA, INSA,

More information

Cloud Federations in Contrail

Cloud Federations in Contrail Cloud Federations in Contrail Emanuele Carlini 1,3, Massimo Coppola 1, Patrizio Dazzi 1, Laura Ricci 1,2, GiacomoRighetti 1,2 " 1 - CNR - ISTI, Pisa, Italy" 2 - University of Pisa, C.S. Dept" 3 - IMT Lucca,

More information

Grid Computing Vs. Cloud Computing

Grid Computing Vs. Cloud Computing International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 6 (2013), pp. 577-582 International Research Publications House http://www. irphouse.com /ijict.htm Grid

More information

Sistemi Operativi e Reti. Cloud Computing

Sistemi Operativi e Reti. Cloud Computing 1 Sistemi Operativi e Reti Cloud Computing Facoltà di Scienze Matematiche Fisiche e Naturali Corso di Laurea Magistrale in Informatica Osvaldo Gervasi ogervasi@computer.org 2 Introduction Technologies

More information

Challenges in Hybrid and Federated Cloud Computing

Challenges in Hybrid and Federated Cloud Computing Cloud Day 2011 KTH-SICS Cloud Innovation Center and EIT ICT Labs Kista, Sweden, September 14th, 2011 Challenges in Hybrid and Federated Cloud Computing Ignacio M. Llorente Project Director Acknowledgments

More information

Cloud Computing. Adam Barker

Cloud Computing. Adam Barker Cloud Computing Adam Barker 1 Overview Introduction to Cloud computing Enabling technologies Different types of cloud: IaaS, PaaS and SaaS Cloud terminology Interacting with a cloud: management consoles

More information

for my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste

for my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste Which infrastructure Which infrastructure for my computation? Stefano Cozzini Democrito and SISSA/eLAB - Trieste Agenda Introduction:! E-infrastructure and computing infrastructures! What is available

More information

marlabs driving digital agility WHITEPAPER Big Data and Hadoop

marlabs driving digital agility WHITEPAPER Big Data and Hadoop marlabs driving digital agility WHITEPAPER Big Data and Hadoop Abstract This paper explains the significance of Hadoop, an emerging yet rapidly growing technology. The prime goal of this paper is to unveil

More information

MapReduce and Hadoop Distributed File System

MapReduce and Hadoop Distributed File System MapReduce and Hadoop Distributed File System 1 B. RAMAMURTHY Contact: Dr. Bina Ramamurthy CSE Department University at Buffalo (SUNY) bina@buffalo.edu http://www.cse.buffalo.edu/faculty/bina Partially

More information

Key Research Challenges in Cloud Computing

Key Research Challenges in Cloud Computing 3rd EU-Japan Symposium on Future Internet and New Generation Networks Tampere, Finland October 20th, 2010 Key Research Challenges in Cloud Computing Ignacio M. Llorente Head of DSA Research Group Universidad

More information

An Implementation of Active Data Technology

An Implementation of Active Data Technology White Paper by: Mario Morfin, PhD Terri Chu, MEng Stephen Chen, PhD Robby Burko, PhD Riad Hartani, PhD An Implementation of Active Data Technology October 2015 In this paper, we build the rationale for

More information

Denis Caromel, CEO Ac.veEon. Orchestrate and Accelerate Applica.ons. Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst Capacity

Denis Caromel, CEO Ac.veEon. Orchestrate and Accelerate Applica.ons. Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst Capacity Cloud computing et Virtualisation : applications au domaine de la Finance Denis Caromel, CEO Ac.veEon Orchestrate and Accelerate Applica.ons Open Source Cloud Solu.ons Hybrid Cloud: Private with Burst

More information

Software Architecture & Composition. Guillaume Waignier, Anne-Françoise Le Meur, Laurence Duchien Project-Team U. Lille 1/CNRS-INRIA

Software Architecture & Composition. Guillaume Waignier, Anne-Françoise Le Meur, Laurence Duchien Project-Team U. Lille 1/CNRS-INRIA 1 Software Architecture & Composition Guillaume Waignier, Anne-Françoise Le Meur, Laurence Duchien Project-Team U. Lille 1/CNRS-INRIA http://adam.lille.inria.fr April 2009 2 Scientific context Future applications

More information

Accelerating Hadoop MapReduce Using an In-Memory Data Grid

Accelerating Hadoop MapReduce Using an In-Memory Data Grid Accelerating Hadoop MapReduce Using an In-Memory Data Grid By David L. Brinker and William L. Bain, ScaleOut Software, Inc. 2013 ScaleOut Software, Inc. 12/27/2012 H adoop has been widely embraced for

More information

Big Data With Hadoop

Big Data With Hadoop With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

More information

SimGrid Cloud Broker: Simulation of Public and Private Clouds

SimGrid Cloud Broker: Simulation of Public and Private Clouds SimGrid Cloud Broker: Simulation of Public and Private Clouds Jonathan Rouzaud-Cornabas CNRS CC-IN2P3 / LIP (UMR 5668) J. Rouzaud-Cornabas (CNRS) SimGrid Cloud Broker 1 / 2 SimGrid Cloud Broker SimGrid

More information

WORKFLOW ENGINE FOR CLOUDS

WORKFLOW ENGINE FOR CLOUDS WORKFLOW ENGINE FOR CLOUDS By SURAJ PANDEY, DILEBAN KARUNAMOORTHY, and RAJKUMAR BUYYA Prepared by: Dr. Faramarz Safi Islamic Azad University, Najafabad Branch, Esfahan, Iran. Workflow Engine for clouds

More information

Data-Intensive Computing with Map-Reduce and Hadoop

Data-Intensive Computing with Map-Reduce and Hadoop Data-Intensive Computing with Map-Reduce and Hadoop Shamil Humbetov Department of Computer Engineering Qafqaz University Baku, Azerbaijan humbetov@gmail.com Abstract Every day, we create 2.5 quintillion

More information

Distributed Computing and Big Data: Hadoop and MapReduce

Distributed Computing and Big Data: Hadoop and MapReduce Distributed Computing and Big Data: Hadoop and MapReduce Bill Keenan, Director Terry Heinze, Architect Thomson Reuters Research & Development Agenda R&D Overview Hadoop and MapReduce Overview Use Case:

More information

MapReduce and Hadoop Distributed File System V I J A Y R A O

MapReduce and Hadoop Distributed File System V I J A Y R A O MapReduce and Hadoop Distributed File System 1 V I J A Y R A O The Context: Big-data Man on the moon with 32KB (1969); my laptop had 2GB RAM (2009) Google collects 270PB data in a month (2007), 20000PB

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

How to Do/Evaluate Cloud Computing Research. Young Choon Lee

How to Do/Evaluate Cloud Computing Research. Young Choon Lee How to Do/Evaluate Cloud Computing Research Young Choon Lee Cloud Computing Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing

More information

Scalable Architecture on Amazon AWS Cloud

Scalable Architecture on Amazon AWS Cloud Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect

More information

16.1 MAPREDUCE. For personal use only, not for distribution. 333

16.1 MAPREDUCE. For personal use only, not for distribution. 333 For personal use only, not for distribution. 333 16.1 MAPREDUCE Initially designed by the Google labs and used internally by Google, the MAPREDUCE distributed programming model is now promoted by several

More information

Hadoop and Map-Reduce. Swati Gore

Hadoop and Map-Reduce. Swati Gore Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data

More information

Scientific and Technical Applications as a Service in the Cloud

Scientific and Technical Applications as a Service in the Cloud Scientific and Technical Applications as a Service in the Cloud University of Bern, 28.11.2011 adapted version Wibke Sudholt CloudBroker GmbH Technoparkstrasse 1, CH-8005 Zurich, Switzerland Phone: +41

More information

Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed

Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed Enabling Large-Scale Testing of IaaS Cloud Platforms on the Grid 5000 Testbed Sébastien Badia, Alexandra Carpen-Amarie, Adrien Lèbre, Lucas Nussbaum Grid 5000 S. Badia, A. Carpen-Amarie, A. Lèbre, L. Nussbaum

More information

Cloud computing - Architecting in the cloud

Cloud computing - Architecting in the cloud Cloud computing - Architecting in the cloud anna.ruokonen@tut.fi 1 Outline Cloud computing What is? Levels of cloud computing: IaaS, PaaS, SaaS Moving to the cloud? Architecting in the cloud Best practices

More information

IaaS Federation. Contrail project. IaaS Federation! Objectives and Challenges! & SLA management in Federations 5/23/11

IaaS Federation. Contrail project. IaaS Federation! Objectives and Challenges! & SLA management in Federations 5/23/11 Cloud Computing (IV) s and SPD Course 19-20/05/2011 Massimo Coppola IaaS! Objectives and Challenges! & management in s Adapted from two presentations! by Massimo Coppola (CNR) and Lorenzo Blasi (HP) Italy)!

More information

Key Challenges in Cloud Computing to Enable Future Internet of Things

Key Challenges in Cloud Computing to Enable Future Internet of Things The 4th EU-Japan Symposium on New Generation Networks and Future Internet Future Internet of Things over "Clouds Tokyo, Japan, January 19th, 2012 Key Challenges in Cloud Computing to Enable Future Internet

More information

SURVEY ON THE ALGORITHMS FOR WORKFLOW PLANNING AND EXECUTION

SURVEY ON THE ALGORITHMS FOR WORKFLOW PLANNING AND EXECUTION SURVEY ON THE ALGORITHMS FOR WORKFLOW PLANNING AND EXECUTION Kirandeep Kaur Khushdeep Kaur Research Scholar Assistant Professor, Department Of Cse, Bhai Maha Singh College Of Engineering, Bhai Maha Singh

More information

CompatibleOne Open Source Cloud Broker Architecture Overview

CompatibleOne Open Source Cloud Broker Architecture Overview CompatibleOne Open Source Cloud Broker Architecture Overview WHITE PAPER October 2012 Table of Contents Abstract 2 Background 2 Disclaimer 2 Introduction 2 Section A: CompatibleOne: Open Standards and

More information

A Very Brief Introduction To Cloud Computing. Jens Vöckler, Gideon Juve, Ewa Deelman, G. Bruce Berriman

A Very Brief Introduction To Cloud Computing. Jens Vöckler, Gideon Juve, Ewa Deelman, G. Bruce Berriman A Very Brief Introduction To Cloud Computing Jens Vöckler, Gideon Juve, Ewa Deelman, G. Bruce Berriman What is The Cloud Cloud computing refers to logical computational resources accessible via a computer

More information

ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS

ISSN: 2320-1363 CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS CONTEXTUAL ADVERTISEMENT MINING BASED ON BIG DATA ANALYTICS A.Divya *1, A.M.Saravanan *2, I. Anette Regina *3 MPhil, Research Scholar, Muthurangam Govt. Arts College, Vellore, Tamilnadu, India Assistant

More information

Work in Progress on Cloud Computing in Myriads Team and Contrail European Project Christine Morin, Inria

Work in Progress on Cloud Computing in Myriads Team and Contrail European Project Christine Morin, Inria Potential collaboration talk Work in Progress on Cloud Computing in Myriads Team and Contrail European Project Christine Morin, Inria Design and implementation of autonomous distributed systems Internet

More information

A Service for Data-Intensive Computations on Virtual Clusters

A Service for Data-Intensive Computations on Virtual Clusters A Service for Data-Intensive Computations on Virtual Clusters Executing Preservation Strategies at Scale Rainer Schmidt, Christian Sadilek, and Ross King rainer.schmidt@arcs.ac.at Planets Project Permanent

More information

A science-gateway workload archive application to the self-healing of workflow incidents

A science-gateway workload archive application to the self-healing of workflow incidents A science-gateway workload archive application to the self-healing of workflow incidents Rafael FERREIRA DA SILVA, Tristan GLATARD University of Lyon, CNRS, INSERM, CREATIS Villeurbanne, France Frédéric

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

Distribution transparency. Degree of transparency. Openness of distributed systems

Distribution transparency. Degree of transparency. Openness of distributed systems Distributed Systems Principles and Paradigms Maarten van Steen VU Amsterdam, Dept. Computer Science steen@cs.vu.nl Chapter 01: Version: August 27, 2012 1 / 28 Distributed System: Definition A distributed

More information

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage

Parallel Computing. Benson Muite. benson.muite@ut.ee http://math.ut.ee/ benson. https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage Parallel Computing Benson Muite benson.muite@ut.ee http://math.ut.ee/ benson https://courses.cs.ut.ee/2014/paralleel/fall/main/homepage 3 November 2014 Hadoop, Review Hadoop Hadoop History Hadoop Framework

More information

Business applications:

Business applications: Consorzio COMETA - Progetto PI2S2 UNIONE EUROPEA Business applications: the COMETA approach Prof. Antonio Puliafito University of Messina Open Grid Forum (OGF25) Catania, 2-6.03.2009 www.consorzio-cometa.it

More information

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval s two main stages: Indexing process

More information

Overview of HPC Resources at Vanderbilt

Overview of HPC Resources at Vanderbilt Overview of HPC Resources at Vanderbilt Will French Senior Application Developer and Research Computing Liaison Advanced Computing Center for Research and Education June 10, 2015 2 Computing Resources

More information

Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms

Analysis and Research of Cloud Computing System to Comparison of Several Cloud Computing Platforms Volume 1, Issue 1 ISSN: 2320-5288 International Journal of Engineering Technology & Management Research Journal homepage: www.ijetmr.org Analysis and Research of Cloud Computing System to Comparison of

More information

High Performance Applications over the Cloud: Gains and Losses

High Performance Applications over the Cloud: Gains and Losses High Performance Applications over the Cloud: Gains and Losses Dr. Leila Ismail Faculty of Information Technology United Arab Emirates University leila@uaeu.ac.ae http://citweb.uaeu.ac.ae/citweb/profile/leila

More information

Apache Hadoop. Alexandru Costan

Apache Hadoop. Alexandru Costan 1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open

More information

Scheduling in the Cloud

Scheduling in the Cloud Scheduling in the Cloud Jon Weissman Distributed Computing Systems Group Department of CS&E University of Minnesota Introduction Cloud Context fertile platform for scheduling research re-think old problems

More information

Provisioning and Resource Management at Large Scale (Kadeploy and OAR)

Provisioning and Resource Management at Large Scale (Kadeploy and OAR) Provisioning and Resource Management at Large Scale (Kadeploy and OAR) Olivier Richard Laboratoire d Informatique de Grenoble (LIG) Projet INRIA Mescal 31 octobre 2007 Olivier Richard ( Laboratoire d Informatique

More information

Cloud Computing. Summary

Cloud Computing. Summary Cloud Computing Lecture 1 2011-2012 https://fenix.ist.utl.pt/disciplinas/cn Summary Teaching Staff. Rooms and Schedule. Goals. Context. Syllabus. Reading Material. Assessment and Grading. Important Dates.

More information

Hadoop Parallel Data Processing

Hadoop Parallel Data Processing MapReduce and Implementation Hadoop Parallel Data Processing Kai Shen A programming interface (two stage Map and Reduce) and system support such that: the interface is easy to program, and suitable for

More information

Putchong Uthayopas, Kasetsart University

Putchong Uthayopas, Kasetsart University Putchong Uthayopas, Kasetsart University Introduction Cloud Computing Explained Cloud Application and Services Moving to the Cloud Trends and Technology Legend: Cluster computing, Grid computing, Cloud

More information

HPC technology and future architecture

HPC technology and future architecture HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange benoit.lange@inria.fr Toàn Nguyên toan.nguyen@inria.fr

More information

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad Cloud Computing: Computing as a Service Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad Abstract: Computing as a utility. is a dream that dates from the beginning from the computer

More information

Mining Large Datasets: Case of Mining Graph Data in the Cloud

Mining Large Datasets: Case of Mining Graph Data in the Cloud Mining Large Datasets: Case of Mining Graph Data in the Cloud Sabeur Aridhi PhD in Computer Science with Laurent d Orazio, Mondher Maddouri and Engelbert Mephu Nguifo 16/05/2014 Sabeur Aridhi Mining Large

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2

More information

OpenCloudware Towards a PaaS Management Stack over Multiple Clouds

OpenCloudware Towards a PaaS Management Stack over Multiple Clouds OpenCloudware Towards a PaaS Management Stack over Multiple Clouds WHITE PAPER October 2014 (cc by) OW2. 1 (CC) OW2 Disclaimer The information contained in this White Paper represents the current view(s)

More information

Volunteer Computing, Grid Computing and Cloud Computing: Opportunities for Synergy. Derrick Kondo INRIA, France

Volunteer Computing, Grid Computing and Cloud Computing: Opportunities for Synergy. Derrick Kondo INRIA, France Volunteer Computing, Grid Computing and Cloud Computing: Opportunities for Synergy Derrick Kondo INRIA, France Outline Cloud Grid Volunteer Computing Cloud Background Vision Hide complexity of hardware

More information