HPC technology and future architecture
1 HPC technology and future architecture. Visual Analysis for Extremely Large-Scale Scientific Computing. KGT2 Internal Meeting, INRIA, France. Benoit Lange, Toàn Nguyên.
2 Outline
- The VELaSSCo project: general information; members of the consortium; motivations of the project; objectives of the project; target data; develop a Big Data platform
- The VELaSSCo architecture: Big Data, what does it mean?; data of Big Data; the challenges of Big Data; Grid vs Cloud; Big Data needs a distributed system
3 The VELaSSCo project: general information
- VELaSSCo is an EC-funded project which deals with end-user visualization of huge simulation data (Big Data). It is a 3-year project.
- By 2020, the most crucial simulation results, such as those from the aeronautic or automotive industries, will no longer fit on a single machine or server.
- How to store, access, simplify and manipulate billions of records to extract the relevant information?
- How to represent information in a feasible and flexible way?
- How to visualise and interactively inspect the huge quantity of information simulations produce, taking into account end-users' needs?
4 The VELaSSCo project: members of the consortium
- Roles: Big Data infrastructure; data analytics; visualization expertise; end-users / beneficiaries
- Big Data issues addressed: HPC and Big Data; handling, formatting and storage; data access, extraction and reduction; platforms; FEM models; DEM models; LB models; end-user testing; usability verification; reactivity
- Partners: ATOS and CIMNE (Spain), UNEDIN (United Kingdom), SINTEF and JOTNE (Norway), INRIA (France), Fraunhofer (Germany)
5 Motivations of the project
- The simulation workflow: pre-processing (geometry description, preparation of analysis data), calculation (computer analysis), post-processing (visualization of results).
- VELaSSCo addresses the pre- and post-processor stages.
6 Motivations of the project
- Simulation data are naturally linked to High Performance Computing.
- Simulation has already been introduced into the Big Data area: very traditional supercomputer manufacturers such as SGI; companies oriented to massive numbers of customers, such as Amazon, offering very attractive solutions for simulation software vendors (Elastic Compute Cloud, EC2; Simple Storage Service, S3); well-known simulation suites such as Matlab or OpenFOAM (precisely through Amazon services).
- How big is current simulation data? Some examples: weather & climate (400 PB/year, now); nuclear & fusion energy (2 PB/time step now, 200 PB/time step by 2020); high-energy physics, materials, chemistry, biology, fluid dynamics.
7 Objectives of the project: target data
- Total size: DEM 50 GB to 1 PB; FEM 30 GB to 50 TB
- Partitions: 1 to 10,000
- Particles / elements: DEM 10 million; FEM 8 million to 1 billion
- Time-steps: DEM 1 billion; FEM 40 to 25,000
- Variables per node: DEM 10 variables; FEM 2-8 scalars, 1-2 vectors, 1 tensor (?)
8 Objectives of the project
- Nowadays the huge amount of data produced by HPC solvers cannot be stored on a single machine, so distributed post-processing and distributed visualization are mandatory.
- In HPC, problems arise if a calculation node fails; the data therefore need redundancy (Big Data).
- The main objective of the VELaSSCo project is to build the VELaSSCo Platform, a system that performs distributed post-processing operations and visualization of very large simulations. To address this objective, VELaSSCo brings together simulation and Big Data.
- Develop a platform that targets most IT systems.
9 The VELaSSCo architecture: Big Data, what does it mean?
- "Big Data refers to data sets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze." (McKinsey Global Institute)
- "Big Data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications." (Wikipedia)
10 The VELaSSCo architecture: data of Big Data
- Usages: data cleaning, data transformation, data analysis, data search, data computation, data visualization
- Heterogeneous data: from sensors, images, media, text, networks
11 The VELaSSCo architecture: the challenges of Big Data
- Scale: data volume; distribution of computation and storage across different locations; size of the network and storage system
- Complexity: a wide variety of acquisitions; a large set of dimensions; fuzzy data
- Heterogeneity: scientific collaboration between several domains; specific data formats; complex workflow computation
12 The VELaSSCo architecture: Grid vs Cloud
- Grids: owned by the scientific community; batch computation; computation time; widely distributed
- Clouds: mainly owned by industry; simultaneous computations; CPU time; can be distributed; heterogeneous systems
- I. Foster, Y. Zhao, I. Raicu, and S. Lu. Cloud Computing and Grid Computing 360-Degree Compared. In Grid Computing Environments Workshop (GCE '08), Nov. 2008.
13 The VELaSSCo architecture: Grid vs Cloud
- Business model: Grids are project-oriented; Clouds bill on a consumption basis
- Architecture: Grids use five layers; Clouds use four layers, abstract resources, and can be implemented over a grid
- Resource management: Grids use batch scheduling and a shared file system; Clouds are shared by all users and use a specific FS
- Programming model: Grids use workflow tools; Clouds use MapReduce
- Application model: any for both, but Clouds have difficulties with HPC problems
- Security model: Grids enforce strict security; Clouds enforce strong security
- I. Foster, Y. Zhao, I. Raicu, and S. Lu. Cloud Computing and Grid Computing 360-Degree Compared. In Grid Computing Environments Workshop (GCE '08), pages 1-10, Nov. 2008.
14 The VELaSSCo architecture: Big Data needs a distributed system
- The most suitable computational model for Big Data is MapReduce: designed for large distributed systems; a simple programming model; based on a specific FS; designed to scale up; high availability; deals with node failures; batch computation. (A minimal sketch of the model follows.)
- But this model has evolved (Hadoop 2.0): more complex computation, management of resources, a data-oriented operating system.
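The slide above lists MapReduce's properties without showing the model itself. The following is a minimal, self-contained sketch of the map, shuffle and reduce phases run in-process on a few hypothetical (time step, particle id, displacement) records; it illustrates the programming model only and is not Hadoop code or the VELaSSCo implementation.

```python
# Minimal in-process illustration of the MapReduce programming model.
# The records are hypothetical DEM-like samples: (time_step, particle_id, displacement).
from collections import defaultdict

records = [
    (0, "p1", 0.10), (0, "p2", 0.30),
    (1, "p1", 0.25), (1, "p2", 0.20),
]

def map_phase(record):
    """Emit (key, value) pairs; here: time step -> displacement."""
    time_step, _particle, displacement = record
    yield time_step, displacement

def shuffle(pairs):
    """Group all values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Aggregate the values of one key; here: maximum displacement."""
    return key, max(values)

pairs = (kv for rec in records for kv in map_phase(rec))
result = dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
print(result)  # {0: 0.3, 1: 0.25}
```

In a real Hadoop deployment the shuffle is performed by the framework across the cluster and the map and reduce functions run on the nodes that hold the data, which is what makes the model scale and tolerate node failures.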
15 The VELaSSCo architecture: simplified version (diagram showing alternative back-end options feeding the visualisation layer)
16 The VELaSSCo architecture (platform diagram)
- VELaSSCo.Platform.Access.lib: visualization client
- VELaSSCo.Engine.Layer (YARN): Query Manager Module (asynchronous; availability, resources, load, etc.); Monitoring; Graphics (compression / streaming, GPU structs); Analytics (LOD, D2C, iso, stream, stats)
- VELaSSCo.Data.Layer: RT Storage Module; Batch Data Query
- VELaSSCo.Platform.DataIngestion.lib: simulation ingestion & processing (Flume)
- Storage stack: HBase, Hive, Phoenix on top of a Hadoop abstract file system (HDFS, NFS, EDM plug-in, EDM)
- Legend: existing software vs. to develop; results / data flow vs. queries flow; consortium, open and commercial versions
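The data layer above names HBase (with Hive and Phoenix) over a Hadoop file system. As a hedged illustration of how such a layer could be fed and queried from a client, the sketch below uses the happybase Python client; the table name, column family and row-key layout are assumptions for illustration only, not the actual VELaSSCo schema, and running it requires a reachable HBase Thrift gateway.

```python
# Hedged sketch: feeding and querying an HBase-backed data layer from Python.
# Assumptions: table "simulation_results", column family "res", and a
# <simulation id>:<time step>:<particle id> row key are illustrative choices.
import happybase

# Connect to the Thrift gateway of the HBase cluster (host is an assumption).
connection = happybase.Connection("localhost")
table = connection.table("simulation_results")

# Ingest one result record for simulation 42, time step 100, particle p0001.
table.put(b"sim42:000100:p0001",
          {b"res:displacement": b"0.25", b"res:velocity": b"1.30"})

# Batch-style read: scan every particle of simulation 42 at time step 100.
for row_key, columns in table.scan(row_prefix=b"sim42:000100:"):
    print(row_key, columns[b"res:displacement"])
```

Encoding the simulation id and time step in the row key keeps all particles of one time step contiguous, so a batch query of the kind shown in the diagram maps onto a single prefix scan rather than a full-table read.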
17 Conclusions
- A Big Data platform for engineering data (FEM and DEM simulations), with support for the visualisation tools GiD (CIMNE) and ifx (Fraunhofer IGD), and with support for real-time queries.
- A Big Data architecture for any IT system; it can, for example, co-exist with an HPC cluster, and it is extensible (supports plug-ins).
- A database engine, based on widely used technologies such as Hadoop/HBase and ISO STEP, that can organise and store a diverse range of large-scale simulation data sets for collaborative use.
- An innovative approach, adopting Big Data best practices, to handle large-scale simulation data sets that have to be stored on multiple servers.
- A framework equipped with advanced in-situ processing tools to analyse the output of parallel distributed simulation solvers.
- An analysis platform to analyse and visualize large-scale data sets interactively, building on leading-edge graphics hardware.
18 Thank you for your attention. More information is available at http://www.velassco.eu/. You can contact me at [email protected].