HPC technology and future architecture

Save this PDF as:
 WORD  PNG  TXT  JPG

Size: px
Start display at page:

Download "HPC technology and future architecture"

Transcription

1 HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange Toàn Nguyên

2 Outline The VELaSSCo project General information Members of the consortium Motivations of the project Objectives of the project Target data Develop a Big Data platform The VELaSSCo architecture Big Data, what does it mean? Data of Big Data What are the challenges of Big Data Grid vs Cloud Big Data needs a distributed system Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

3 The VELaSSCo project General information VELaSSCO is a EC funded Project which deals with end-user visualization of huge simulation data (Big Data). 3 years project ( ) By 2020, most crucial simulation results such as those from the aeronautic industry or automotive, will not be able to be stored in a single machine or server. How to store, access, simplify and manipulate billion of records to extract the relevant information? How to represent information in a feasible and flexible way? How to visualise and interactively inspect the huge quantity of information they produce taking into account end-user's needs? Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

4 The VELaSSCo project Members of the consortium Big Data Infrastructure Data Analytics Visualization Expertise End-users / Beneficiaries Big Data Issues HPC and Big Data, Handling, formatting,storage Data access, extraction, reduction Platforms FEM Models DEM Models LB Models End-user testing Usability verification Reactivity Spain ATOS CIMNE United Kingdom UNEDIN Norway SINTEF JOTNE France INRIA Germany FRAUNHOFER Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

5 Motivations of the project VELaSSCo Pre- processing Calculation Post- processing Geometry description Preparation of analysis data Visualizationof results Computer Analysis Pre and post-processor Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

6 Motivations of the project The simulation data are naturally linked to: High Performance Computing Simulation has already been introduced in Big Data area very traditional supercomputer manufacturers such as SGI companies oriented to massive number of customers such as Amazon, offering very attractive solutions for simulation software vendors (Elastic Compute Cloud, EC2, Simple Storage Service, S3) well-known simulation suites such as Matlab or OpenFOAM (precisely through Amazon services) How Big is the current Simulation Data? Some examples include: weather & climate (400 PB/year, now) nuclear & fusion energy (2PB/time step, now, and 200 PB/time step by 2020) high-energy physics, Materials, Chemistry, Biology, fluid dynamics Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

7 Objectives of the project Target data DEM FEM Total size 50 GB à 1 PB 30 GB à 50 TB Partitions 1 à 10,000 Particles / elements 10 million 8 million à 1 billion Time-steps 1 billion 40 à 25,000 Variables per node 10 variables 2-8 scalars, 1-2 vectors,?1 tensor? Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

8 Objectives of the project Nowadays the huge amount of data provided by the solver in HPC cannot be stored in one single machine, so it is mandatory: Distributed post-processing Distributed visualization Problems if a calculation node fails in HPC. Need a redundancy for the data Big Data The main objective of VELaSSCo project is to build the VELaSSCo Platform, a system that performs distributed post-processing operations and visualization of very large simulations. To address this objective, VELaSSCo brings together Simulation and Big Data. Develop a platform which targeted most of IT system Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

9 The VELaSSCo architecture Big Data, what does it mean? Big Data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze. (McKinsey Global Institute) Big Data is the term for a collection of data sets so large and complex that it becomes difficult to process using onhand database management tools or traditional data processing applications. (Wikipedia) Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

10 The VELaSSCo architecture Data of Big Data Usages: Data cleaning Data transformation Data analysis Data search Data computation Data visualization Heterogeneous data: From sensors Images Medias Textual Networks Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

11 The VELaSSCo architecture What are the challenges of Big Data Scale Data volume Distribution of computation and storage between different locations Size of network and storage system Complexity A wide variety of acquisitions A large set of dimensions Fuzzy data Heterogeneity Scientific collaboration between several domains Specific data format Complex workflow computation Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

12 The VELaSSCo architecture Grid vs Cloud Grids Owned by scientific community Batch computation Computation time Widely distributed Clouds Mainly owned by industry Simultaneous computations CPU time Can be distributed Heterogeneous system I. Foster, Y. Zhao, I. Raicu, and S. Lu. Cloud computing and grid computing 360-degree compared. In Grid Computing Environments Workshop, 2008, Nov Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

13 The VELaSSCo architecture Grid vs Cloud Grids Clouds Business Model Project-oriented Consomption basis Architecture Five layers Four layers Abstract resources Can be implement over a grid Resources Management Batch-scheduling Shared file system Shared by all users Specific FS Programming Model Workflow tools Map-Reduce Application Model Any Any Difficulties with HPC problems Security Model Strict security Strong security Foster, Y. Zhao, I. Raicu, and S. Lu. Cloud computing and grid computing 360-degree compared. In Grid Computing Environments Workshop, GCE 08, pages 1 10, Nov Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

14 The VELaSSCo architecture Big Data needs a distributed system The most suitable computational model for Big Data: MapReduce Designed for large distributed system A simple programming model Based on a specific FS Designed to scale up High availability Deal with nodes failure Batch computation But this model has evolved (Hadoop 2.0) More complex computation Management of Resources A Data-oriented Operating System Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

15 The VELaSSCo architecture Simplified version or or or Or.. Visualisation Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

16 The VELaSSCo architecture VELaSSCo.Platform.Access.lib Visualization client VELaSSCo.Engine.Layer (YARN) Query Manager Module Asynchronous Availability, resources, load, etc. Monitoring Graphics Compressi on / Streaming GPU struct Analytics LOD, D2C Iso, stream, stats VELaSSCo.Data.Layer RT Storage Module Batch Data Query VELaSSCo.Platform.DataIngestion.lib Simulation Ingestion & Processing (Flume) HBas hive FS Phoeni e x HadoopAbstractFileSystem Existing software To develop HDFS NFS EDM Plug-in EDM Results / data flow Consortium Open Queries flow Commercial Version Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

17 Conclusions A Big Data platform for engineering data (FEM and DEM simulation) with supports of visualisationtools: GID (CIMNE) ifx (Fraunhofer IGD) With support of real-time queries A big data architecture for any IT systems For ex: co-exists with a HPC cluster Extensible (support plug-ins) A database engine, based on widely used technologies such as Hadoop-HBase and ISO STEP, that can organise and store a diverse range of largescale simulation data sets for collaborative use. An innovative approach, adopting big data best practices, to handle large scale simulation data sets that have to be stored on multiple servers. A framework equipped with advanced in-situ processing tools to analyse the output of parallel distributed simulation solvers. An analysis platform to analyse and visualize large-scale data sets interactively. This builds on leading edge graphics hardware. Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

18 Thanks you for your attention. More information are available on You can contact me at: Benoit Lange - VELaSSCo - KGT2 - - Xi'An 7 May

A Hadoop use case for engineering data

A Hadoop use case for engineering data A Hadoop use case for engineering data Benoit Lange, Toan Nguyen To cite this version: Benoit Lange, Toan Nguyen. A Hadoop use case for engineering data. 2015. HAL Id: hal-01167510 https://hal.inria.fr/hal-01167510

More information

Visual Analysis for Extremely Large Scale Scientific Computing

Visual Analysis for Extremely Large Scale Scientific Computing Visual Analysis for Extremely Large Scale Scientific Computing D2.5 Big data interfaces for data acquisition Deliverable Information Grant Agreement no 619439 Web Site Related WP & Task: WP2, T2.4 http://www.velassco.eu/

More information

Big Data architecture for large-scale scientific computing

Big Data architecture for large-scale scientific computing Big Data architecture for large-scale scientific computing Benoit Lange, Toan Nguyen To cite this version: Benoit Lange, Toan Nguyen. Big Data architecture for large-scale scientific computing. world academy

More information

Visual Analysis for Extremely Large Scale Scientific Computing

Visual Analysis for Extremely Large Scale Scientific Computing Visual Analysis for Extremely Large Scale Scientific Computing Deliverable Information Grant Agreement no 619439 Web Site D2.2 Specification of Big Data Architecture Version 3.0 http://www.velassco.eu/

More information

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Is a Data Scientist the New Quant? Stuart Kozola MathWorks Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by

More information

Data-intensive HPC: opportunities and challenges. Patrick Valduriez

Data-intensive HPC: opportunities and challenges. Patrick Valduriez Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,

More information

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

More information

Visual Analysis for Extremely Large- Scale Scientific Computing

Visual Analysis for Extremely Large- Scale Scientific Computing Visual Analysis for Extremely Large- Scale Scientific Computing D2.1 State- of- the- art of Big Data Version 1 Deliverable Information Grant Agreement no 619439 Web Site http://www.velassco.eu/ Related

More information

A Brief Introduction to Apache Tez

A Brief Introduction to Apache Tez A Brief Introduction to Apache Tez Introduction It is a fact that data is basically the new currency of the modern business world. Companies that effectively maximize the value of their data (extract value

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

Bringing Big Data Modelling into the Hands of Domain Experts

Bringing Big Data Modelling into the Hands of Domain Experts Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop doc.hpccloud.surfsara.nl UvA workshop 2016-01-25 UvA HPC Course Jan 2016 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop

More information

San Diego Supercomputer Center, UCSD. Institute for Digital Research and Education, UCLA

San Diego Supercomputer Center, UCSD. Institute for Digital Research and Education, UCLA Facilitate Parallel Computation Using Kepler Workflow System on Virtual Resources Jianwu Wang 1, Prakashan Korambath 2, Ilkay Altintas 1 1 San Diego Supercomputer Center, UCSD 2 Institute for Digital Research

More information

Real Time Big Data Processing

Real Time Big Data Processing Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure

More information

Open source Google-style large scale data analysis with Hadoop

Open source Google-style large scale data analysis with Hadoop Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical

More information

Big Data and Market Surveillance. April 28, 2014

Big Data and Market Surveillance. April 28, 2014 Big Data and Market Surveillance April 28, 2014 Copyright 2014 Scila AB. All rights reserved. Scila AB reserves the right to make changes to the information contained herein without prior notice. No part

More information

The Power of Pentaho and Hadoop in Action. Demonstrating MapReduce Performance at Scale

The Power of Pentaho and Hadoop in Action. Demonstrating MapReduce Performance at Scale The Power of Pentaho and Hadoop in Action Demonstrating MapReduce Performance at Scale Introduction Over the last few years, Big Data has gone from a tech buzzword to a value generator for many organizations.

More information

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging

Outline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging

More information

Big Data and Cloud Computing for GHRSST

Big Data and Cloud Computing for GHRSST Big Data and Cloud Computing for GHRSST Jean-Francois Piollé (jfpiolle@ifremer.fr) Frédéric Paul, Olivier Archer CERSAT / Institut Français de Recherche pour l Exploitation de la Mer Facing data deluge

More information

CoolEmAll - Tools for realising an energy efficient data centre

CoolEmAll - Tools for realising an energy efficient data centre CoolEmAll - Tools for realising an energy efficient data centre Wolfgang Christmann christmann informationstechnik + medien GmbH & Co. KG www.christmann.info 1 Outline CoolEmAll project RECS system towards

More information

Data Centric Systems (DCS)

Data Centric Systems (DCS) Data Centric Systems (DCS) Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

A Professional Big Data Master s Program to train Computational Specialists

A Professional Big Data Master s Program to train Computational Specialists A Professional Big Data Master s Program to train Computational Specialists Anoop Sarkar, Fred Popowich, Alexandra Fedorova! School of Computing Science! Education for Employable Graduates: Critical Questions

More information

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & Innovation 04-08-2011 to the EC 8 th February, Luxembourg Your Atos business Research technologists. and Innovation

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

SURFsara HPC Cloud Workshop

SURFsara HPC Cloud Workshop SURFsara HPC Cloud Workshop www.cloud.sara.nl Tutorial 2014-06-11 UvA HPC and Big Data Course June 2014 Anatoli Danezi, Markus van Dijk cloud-support@surfsara.nl Agenda Introduction and Overview (current

More information

HPC and Big Data. EPCC The University of Edinburgh. Adrian Jackson Technical Architect a.jackson@epcc.ed.ac.uk

HPC and Big Data. EPCC The University of Edinburgh. Adrian Jackson Technical Architect a.jackson@epcc.ed.ac.uk HPC and Big Data EPCC The University of Edinburgh Adrian Jackson Technical Architect a.jackson@epcc.ed.ac.uk EPCC Facilities Technology Transfer European Projects HPC Research Visitor Programmes Training

More information

Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc.

Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc. Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc. 2015 The MathWorks, Inc. 1 Challenges of Big Data Any collection of data sets so large and complex that it becomes difficult

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

A Novel Cloud Based Elastic Framework for Big Data Preprocessing School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview

More information

On a Hadoop-based Analytics Service System

On a Hadoop-based Analytics Service System Int. J. Advance Soft Compu. Appl, Vol. 7, No. 1, March 2015 ISSN 2074-8523 On a Hadoop-based Analytics Service System Mikyoung Lee, Hanmin Jung, and Minhee Cho Korea Institute of Science and Technology

More information

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013 Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after

More information

Putchong Uthayopas, Kasetsart University

Putchong Uthayopas, Kasetsart University Putchong Uthayopas, Kasetsart University Introduction Cloud Computing Explained Cloud Application and Services Moving to the Cloud Trends and Technology Legend: Cluster computing, Grid computing, Cloud

More information

Energy efficiency in HPC :

Energy efficiency in HPC : Energy efficiency in HPC : A new trend? A software approach to save power but still increase the number or the size of scientific studies! 19 Novembre 2012 The EDF Group in brief A GLOBAL LEADER IN ELECTRICITY

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

Open source large scale distributed data management with Google s MapReduce and Bigtable

Open source large scale distributed data management with Google s MapReduce and Bigtable Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory

More information

Application Development. A Paradigm Shift

Application Development. A Paradigm Shift Application Development for the Cloud: A Paradigm Shift Ramesh Rangachar Intelsat t 2012 by Intelsat. t Published by The Aerospace Corporation with permission. New 2007 Template - 1 Motivation for the

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

More information

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze

More information

Manjrasoft Market Oriented Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform Manjrasoft Market Oriented Cloud Computing Platform Innovative Solutions for 3D Rendering Aneka is a market oriented Cloud development and management platform with rapid application development and workload

More information

Cloud-based Infrastructures. Serving INSPIRE needs

Cloud-based Infrastructures. Serving INSPIRE needs Cloud-based Infrastructures Serving INSPIRE needs INSPIRE Conference 2014 Workshop Sessions Benoit BAURENS, AKKA Technologies (F) Claudio LUCCHESE, CNR (I) June 16th, 2014 This content by the InGeoCloudS

More information

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Simplifying Big Data Analytics: Unifying Batch and Stream Processing John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Streaming Analy.cs S S S Scale- up Database Data And Compute Grid

More information

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities

More information

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014 White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page

More information

2015 The MathWorks, Inc. 1

2015 The MathWorks, Inc. 1 25 The MathWorks, Inc. 빅 데이터 및 다양한 데이터 처리 위한 MATLAB의 인터페이스 환경 및 새로운 기능 엄준상 대리 Application Engineer MathWorks 25 The MathWorks, Inc. 2 Challenges of Data Any collection of data sets so large and complex

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

Enterprise HPC & Cloud Computing for Engineering Simulation. Barbara Hutchings Director, Strategic Partnerships ANSYS, Inc.

Enterprise HPC & Cloud Computing for Engineering Simulation. Barbara Hutchings Director, Strategic Partnerships ANSYS, Inc. Enterprise HPC & Cloud Computing for Engineering Simulation Barbara Hutchings Director, Strategic Partnerships ANSYS, Inc. Historical Perspective Evolution of Computing for Simulation Pendulum swing: Centralized

More information

SEAIP 2009 Presentation

SEAIP 2009 Presentation SEAIP 2009 Presentation By David Tan Chair of Yahoo! Hadoop SIG, 2008-2009,Singapore EXCO Member of SGF SIG Imperial College (UK), Institute of Fluid Science (Japan) & Chicago BOOTH GSB (USA) Alumni Email:

More information

Big Data - Infrastructure Considerations

Big Data - Infrastructure Considerations April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright

More information

So What s the Big Deal?

So What s the Big Deal? So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

More information

Big Data on Google Cloud

Big Data on Google Cloud Big Data on Google Cloud Using Cloud Dataflow, BigQuery, and friends to process data the Cloud way William Vambenepe, Lead Product Manager for Big Data, Google Cloud Platform @vambenepe / vbp@google.com

More information

Building your Big Data Architecture on Amazon Web Services

Building your Big Data Architecture on Amazon Web Services Building your Big Data Architecture on Amazon Web Services Abhishek Sinha @abysinha sinhaar@amazon.com AWS Services Deployment & Administration Application Services Compute Storage Database Networking

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

Cluster, Grid, Cloud Concepts

Cluster, Grid, Cloud Concepts Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of

More information

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at spoozhikala@stratapps.com.

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS

BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS Megha Joshi Assistant Professor, ASM s Institute of Computer Studies, Pune, India Abstract: Industry is struggling to handle voluminous, complex, unstructured

More information

High Performance Computing with Hadoop WV HPC Summer Institute 2014

High Performance Computing with Hadoop WV HPC Summer Institute 2014 High Performance Computing with Hadoop WV HPC Summer Institute 2014 E. James Harner Director of Data Science Department of Statistics West Virginia University June 18, 2014 Outline Introduction Hadoop

More information

Introducing EEMBC Cloud and Big Data Server Benchmarks

Introducing EEMBC Cloud and Big Data Server Benchmarks Introducing EEMBC Cloud and Big Data Server Benchmarks Quick Background: Industry-Standard Benchmarks for the Embedded Industry EEMBC formed in 1997 as non-profit consortium Defining and developing application-specific

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

Industry 4.0 and Big Data

Industry 4.0 and Big Data Industry 4.0 and Big Data Marek Obitko, mobitko@ra.rockwell.com Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and

More information

COMPUTER MEASUREMENT GROUP - India Hyderabad Chapter. Strategies to Optimize Cloud Costs By Cloud Performance Monitoring

COMPUTER MEASUREMENT GROUP - India Hyderabad Chapter. Strategies to Optimize Cloud Costs By Cloud Performance Monitoring COMPUTER MEASUREMENT GROUP - India Hyderabad Chapter Strategies to Optimize Cloud Costs By Cloud Performance Monitoring October 2013 www.cmgindia.org Computer Measurement Group, India 1 About Me Credentials

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

Concept and Project Objectives

Concept and Project Objectives 3.1 Publishable summary Concept and Project Objectives Proactive and dynamic QoS management, network intrusion detection and early detection of network congestion problems among other applications in the

More information

Cloud Computing. Alex Crawford Ben Johnstone

Cloud Computing. Alex Crawford Ben Johnstone Cloud Computing Alex Crawford Ben Johnstone Overview What is cloud computing? Amazon EC2 Performance Conclusions What is the Cloud? A large cluster of machines o Economies of scale [1] Customers use a

More information

Cloud Computing @ JPL Science Data Systems

Cloud Computing @ JPL Science Data Systems Cloud Computing @ JPL Science Data Systems Emily Law, GSAW 2011 Outline Science Data Systems (SDS) Space & Earth SDSs SDS Common Architecture Components Key Components using Cloud Computing Use Case 1:

More information

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12 Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using

More information

Big Data Use Case: Business Analytics

Big Data Use Case: Business Analytics Big Data Use Case: Business Analytics Starting point A telecommunications company wants to allude to the topic of Big Data. The established Big Data working group has access to the data stock of the enterprise

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

Big Data and the Cloud Trends, Applications, and Training

Big Data and the Cloud Trends, Applications, and Training Big Data and the Cloud Trends, Applications, and Training Stavros Christodoulakis MUSIC/TUC Lab School of Electronic and Computer Engineering Technical University of Crete stavros@ced.tuc.gr Data Explosion

More information

Day with Development Master Class Big Data Management System DW & Big Data Global Leaders Program

Day with Development Master Class Big Data Management System DW & Big Data Global Leaders Program Day with Development Master Class Big Data Management System DW & Big Data Global Leaders Program Jean-Pierre Dijcks Big Data Product Management Server Technologies Part 1 Part 2 Foundation and Architecture

More information

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand?

BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? The Big Data Buzz big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database

More information

Upcoming Announcements

Upcoming Announcements Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within

More information

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Big Data and Analytics: A Conceptual Overview. Mike Park Erik Hoel

Big Data and Analytics: A Conceptual Overview. Mike Park Erik Hoel Big Data and Analytics: A Conceptual Overview Mike Park Erik Hoel In this technical workshop This presentation is for anyone that uses ArcGIS and is interested in analyzing large amounts of data We will

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Hue Streams. Seismic Compression Technology. Years of my life were wasted waiting for data loading and copying

Hue Streams. Seismic Compression Technology. Years of my life were wasted waiting for data loading and copying Hue Streams Seismic Compression Technology Hue Streams real-time seismic compression results in a massive reduction in storage utilization and significant time savings for all seismic-consuming workflows.

More information

Manjrasoft Market Oriented Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform Manjrasoft Market Oriented Cloud Computing Platform Aneka Aneka is a market oriented Cloud development and management platform with rapid application development and workload distribution capabilities.

More information

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India 1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto

More information

Research trends relevant to data warehousing and OLAP include [Cuzzocrea et al.]: Combining the benefits of RDBMS and NoSQL database systems

Research trends relevant to data warehousing and OLAP include [Cuzzocrea et al.]: Combining the benefits of RDBMS and NoSQL database systems DATA WAREHOUSING RESEARCH TRENDS Research trends relevant to data warehousing and OLAP include [Cuzzocrea et al.]: Data source heterogeneity and incongruence Filtering out uncorrelated data Strongly unstructured

More information

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof. CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Cloud Computing and Amazon Web Services Cloud Computing Amazon

More information

10- High Performance Compu5ng

10- High Performance Compu5ng 10- High Performance Compu5ng (Herramientas Computacionales Avanzadas para la Inves6gación Aplicada) Rafael Palacios, Fernando de Cuadra MRE Contents Implemen8ng computa8onal tools 1. High Performance

More information

Large-Scale Data Processing

Large-Scale Data Processing Large-Scale Data Processing Eiko Yoneki eiko.yoneki@cl.cam.ac.uk http://www.cl.cam.ac.uk/~ey204 Systems Research Group University of Cambridge Computer Laboratory 2010s: Big Data Why Big Data now? Increase

More information

Cloud Design and Implementation. Cheng Li MPI-SWS Nov 9 th, 2010

Cloud Design and Implementation. Cheng Li MPI-SWS Nov 9 th, 2010 Cloud Design and Implementation Cheng Li MPI-SWS Nov 9 th, 2010 1 Modern Computing CPU, Mem, Disk Academic computation Chemistry, Biology Large Data Set Analysis Online service Shopping Website Collaborative

More information

Hadoop and Map-Reduce. Swati Gore

Hadoop and Map-Reduce. Swati Gore Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data

More information

Convergence of Big Data and Cloud

Convergence of Big Data and Cloud American Journal of Engineering Research (AJER) e-issn : 2320-0847 p-issn : 2320-0936 Volume-03, Issue-05, pp-266-270 www.ajer.org Research Paper Open Access Convergence of Big Data and Cloud Sreevani.Y.V.

More information

Remote sensing information cloud service: research and practice

Remote sensing information cloud service: research and practice Remote sensing information cloud service: research and practice Yang Banghui Dr., Ren Fuhu Prof. and Wang jinnian Prof. yangbh@radi.ac.cn +8613810963452 Content 1 Background 2 Studying and Designing 3

More information

Modern Data Architecture for Predictive Analytics

Modern Data Architecture for Predictive Analytics Modern Data Architecture for Predictive Analytics David Smith VP Marketing and Community - Revolution Analytics John Kreisa VP Strategic Marketing- Hortonworks Hortonworks Inc. 2013 Page 1 Your Presenters

More information

Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo

Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo Sensor Network Messaging Service Hive/Hadoop Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo Contents 1 Introduction 2 What & Why Sensor Network

More information

Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing

Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing Deep Mann ME (Software Engineering) Computer Science and Engineering Department Thapar University Patiala-147004

More information