From Distributed Computing to Distributed Artificial Intelligence

1 From Distributed Computing to Distributed Artificial Intelligence
Dr. Christos Filippidis, NCSR Demokritos
Dr. George Giannakopoulos, NCSR Demokritos

2 Big Data and the Fourth Paradigm
The two dominant paradigms for scientific discovery have been theory and experiment.
Large-scale computer simulations emerged as the third paradigm in the 20th century.
The fourth paradigm, which seeks to exploit information buried in massive datasets, has emerged as an essential complement to the three existing paradigms.
The complexity and challenge of the fourth paradigm arise from the increasing rate, heterogeneity, and volume of data generation:
The Large Hadron Collider (LHC) currently generates tens of petabytes of reduced data per year.
Observational and simulation data in the climate domain are expected to reach exabytes by 2021.
Light source experiments are expected to generate hundreds of terabytes per day.

3 LHC Data Challenge
Starting from this event (particle collision): Data Collection, Data Storage, Data Processing.
You are looking for this signature.
Selectivity: 1 in 10^13
Like looking for 1 person in a thousand world populations!
Or for a needle in 20 million haystacks!
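The "thousand world populations" comparison can be checked with a line of arithmetic. The sketch below reads the selectivity as 1 in 10^13 and assumes a world population of roughly 7 billion; that population figure is an assumption made here, not a number from the slides.

```python
# Selectivity from the slide: one interesting signature per 10^13 collisions.
SELECTIVITY = 10**13

# Assumption (not from the slides): roughly 7 billion people on Earth.
WORLD_POPULATION = 7 * 10**9


def world_populations_per_signal(selectivity: int = SELECTIVITY,
                                 population: int = WORLD_POPULATION) -> float:
    """How many 'world populations' you would search to find one person."""
    return selectivity / population


if __name__ == "__main__":
    ratio = world_populations_per_signal()
    print(f"1 in 10^13 is about 1 person in {ratio:.0f} world populations")
```

Under that assumption the ratio comes out on the order of a thousand, which matches the slide's comparison.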

4 Amount of data from the LHC detectors
Detectors shown: CMS, ATLAS, LHCb
~15 PetaBytes / year
~10^10 events / year
~10^3 batch and interactive users
~ CD / year
Height comparison: CD stack with 1 year of LHC data (~20 km), balloon (30 km), Concorde (15 km), Mt. Blanc (4.8 km)

5 Grid / Cloud Technologies

6 Definition of Grid systems
Collection of geographically distributed heterogeneous resources
Most generalized, globalized form of distributed computing
"An infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources" (Ian Foster and Carl Kesselman)

16 Information about sites:

19 Exascale Challenges
Current petascale systems are unlikely to scale to exascale environments, due to the disparity among computational power, machine memory and I/O bandwidth.
Exascale simulations will not be able to write enough data out to permanent storage to ensure a reliable analysis.
Current Grid infrastructures are not user friendly and are far from efficient for small groups and individuals.
Grid infrastructures, when implemented by HEP VOs, tend to be centralized from the data point of view.
Users demand mobility, efficient data sharing and, at the same time, autonomy.

20 IKAROS Platform
Data/Metadata-Collector
Ikaros-EG plugin job creation
Content provider + mobile devices
mobile-grid + WI-FI, 3G
android.apk

21 Elastic Transfer (et)
Create your Personal Storage Cloud
Transfer your files directly from your workstation to another PC
Third-party data transfer
Flexible data & storage sharing
You are on the road, behind fifteen firewalls, and want to share some web application you're developing locally, or just share a set of files with someone real quick (Reverse HTTP)

22 Nice! So, now can I...
Discover whether corruption in politics is a location-based issue?
Check what is the best route to a house by the sea, with low rent?
Find the ideal husband/wife?
Determine how to improve my economy, relying on agriculture?

23 Well, you kind of can...
If you
  can read through petabytes of information
  can determine what is useful and what is not
  contact 30 different organizations hosting the data
  have experts combining the data
  visualize them in a meaningful way
I hope you got the point by now...

24 So, did we fail?

25 Bits and pieces
If you had individual people producing simple statements, decipherable by machines:
People need food: <people, need, food>
Souvlaki is food: <souvlaki, is, food>
Souvlaki contains meat: <souvlaki, contains, meat>
Could computers combine knowledge to be intelligent?
<?,need,meat>: Who needs meat?
<souvlaki,contains,?>: What do I need to make a souvlaki?
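The two questions on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the three statements are held in an in-memory list; the names `match` and `who_needs` are hypothetical helpers, and the single inference rule is a deliberately naive one written here for illustration, not a method taken from the slides.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# The three statements from the slide, as (subject, predicate, object).
TRIPLES: List[Triple] = [
    ("people", "need", "food"),
    ("souvlaki", "is", "food"),
    ("souvlaki", "contains", "meat"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple agreeing with the pattern; None plays the role of '?'."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def who_needs(triples: List[Triple], thing: str) -> List[str]:
    """Naive illustrative rule: X needs T if X needs some category C,
    some item is a C, and that item contains T."""
    found = set()
    for subject, _, category in match(triples, p="need"):
        if category == thing:
            found.add(subject)  # stated directly
        for item, _, _ in match(triples, p="is", o=category):
            if match(triples, s=item, p="contains", o=thing):
                found.add(subject)  # obtained by combining three statements
    return sorted(found)


if __name__ == "__main__":
    # <souvlaki,contains,?> is answered by plain pattern matching.
    print(match(TRIPLES, s="souvlaki", p="contains"))
    # <?,need,meat> has no direct match; an answer only appears after combining statements.
    print(match(TRIPLES, p="need", o="meat"))
    print(who_needs(TRIPLES, "meat"))
```

The second question is the interesting one: no single statement answers it, so pattern matching alone returns nothing and some form of reasoning over several statements is required.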

26 Distributed Artificial Intelligence to the rescue! You start with something like this RDF graph:

27 You end up with something like...

28 How does it work?
You use MACHINES (agents will do fine...)!
You query LOTS of resources...
With BILLIONS of small statements
You REASON upon them
You provide answers in realistic time
You visualize the results

29 Challenges
Data providers speak different languages
Data providers can go offline
Even knowing who to ask is a problem
Responding in time can be challenging
The (data) world changes

30 SemaGrow: Distributed, Heterogeneous, Semantic Query Processing
Distributed queries over SPARQL endpoints
On-the-fly mapping across data provider languages
Adaptive to problematic data providers
Allows complex queries
Support for streaming data (sensors!)
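Two of the ideas listed here, mapping a shared term onto each provider's own vocabulary and carrying on when a provider is unavailable, can be sketched with a toy in-memory simulation. This is not SemaGrow's API and it issues no SPARQL; the `Provider` class, `federated_query` function, vocabularies and the salad/tomato data are all made up for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


class ProviderOffline(Exception):
    """Raised by the simulated provider when it cannot be reached."""


class Provider:
    """Hypothetical stand-in for a remote data provider."""

    def __init__(self, name: str, triples: List[Triple],
                 vocabulary: Dict[str, str], online: bool = True) -> None:
        self.name = name
        self.triples = triples
        self.vocabulary = vocabulary  # shared term -> this provider's local term
        self.online = online

    def query(self, local_predicate: str) -> List[Triple]:
        if not self.online:
            raise ProviderOffline(self.name)
        return [t for t in self.triples if t[1] == local_predicate]


def federated_query(providers: List[Provider],
                    predicate: str) -> Tuple[List[Triple], List[str]]:
    """Ask every provider for one shared predicate.

    Returns the merged results (rewritten back to the shared term)
    and the names of providers that had to be skipped.
    """
    results: List[Triple] = []
    skipped: List[str] = []
    for provider in providers:
        local = provider.vocabulary.get(predicate)
        if local is None:
            continue  # this provider has no term for the shared predicate
        try:
            rows = provider.query(local)
        except ProviderOffline:
            skipped.append(provider.name)  # tolerate the failure and move on
            continue
        results.extend((s, predicate, o) for s, _, o in rows)
    return results, skipped


if __name__ == "__main__":
    providers = [
        Provider("A", [("souvlaki", "contains", "meat")], {"contains": "contains"}),
        Provider("B", [("salad", "hasIngredient", "tomato")], {"contains": "hasIngredient"}),
        Provider("C", [], {"contains": "contains"}, online=False),
    ]
    print(federated_query(providers, "contains"))
```

A real deployment would send queries to SPARQL endpoints over the network and would also need to decide which providers are worth asking in the first place; the sketch only shows the shape of the mapping and fail-over steps.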

31 Summary
Distributed computing allows
  Generating amazing amounts of data
  Handling amazing amounts of data
  Computational availability and fail-over
  On-demand computation power
  Security
Distributed artificial intelligence allows
  Asking complex questions over data
  Combining data
  Generating knowledge
  Exploiting knowledge

32 From Distributed Computing to Distributed Artificial Intelligence
Dr. Christos Filippidis, NCSR Demokritos
Dr. George Giannakopoulos, NCSR Demokritos
Thank you!


On-demand Provisioning of Workflow Middleware and Services An Overview On-demand Provisioning of Workflow Middleware and s An Overview University of Stuttgart Universitätsstr. 8 70569 Stuttgart Germany Karolina Vukojevic-Haupt, Florian Haupt, and Frank Leymann Institute of

More information

NextGen Infrastructure for Big DATA Analytics.

NextGen Infrastructure for Big DATA Analytics. NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conven4onal database systems. The data is too big, moves too fast, or doesn t fit the structures

More information

Essential Characteristics of Cloud Computing: On-Demand Self-Service Rapid Elasticity Location Independence Resource Pooling Measured Service

Essential Characteristics of Cloud Computing: On-Demand Self-Service Rapid Elasticity Location Independence Resource Pooling Measured Service Cloud Computing Although cloud computing is quite a recent term, elements of the concept have been around for years. It is the maturation of Internet. Cloud Computing is the fine end result of a long chain;

More information

André Karpištšenko, Co-Founder & Chief Scientist, Marinexplore Strata, 2014.02.11

André Karpištšenko, Co-Founder & Chief Scientist, Marinexplore Strata, 2014.02.11 marineos André Karpištšenko, Co-Founder & Chief Scientist, Marinexplore Strata, 2014.02.11 The Ocean's Big Data Platform marineos: a platform for organizing, analyzing and distributing machine data marineos

More information

Analyzing Big Data with AWS

Analyzing Big Data with AWS Analyzing Big Data with AWS Peter Sirota, General Manager, Amazon Elastic MapReduce @petersirota What is Big Data? Computer generated data Application server logs (web sites, games) Sensor data (weather,

More information

Synergistic Challenges in Data-Intensive Science and Exascale Computing

Synergistic Challenges in Data-Intensive Science and Exascale Computing Synergistic Challenges in Data-Intensive Science and Exascale Computing DOE ASCAC Data Subcommittee Report March 2013 Synergistic Challenges in Data-Intensive Science and Exascale Computing Summary Report

More information

Data-intensive Computing on the Cloud: Concepts, Technologies and Applications B. Ramamurthy bina@buffalo.edu This talks is partially supported by

Data-intensive Computing on the Cloud: Concepts, Technologies and Applications B. Ramamurthy bina@buffalo.edu This talks is partially supported by Data-intensive Computing on the Cloud: Concepts, Technologies and Applications B. Ramamurthy bina@buffalo.edu This talks is partially supported by National Science Foundation grants DUE: #0920335, OCI:

More information

Customer Site Requirements for incontact Workforce Optimization 16.2. www.incontact.com

Customer Site Requirements for incontact Workforce Optimization 16.2. www.incontact.com Customer Site Requirements for incontact Workforce Optimization 16.2 www.incontact.com Customer Site Requirements for incontact Workforce Optimization Version 16.2 Last Revision June 2016 About incontact

More information

The Challenge of Handling Large Data Sets within your Measurement System

The Challenge of Handling Large Data Sets within your Measurement System The Challenge of Handling Large Data Sets within your Measurement System The Often Overlooked Big Data Aaron Edgcumbe Marketing Engineer Northern Europe, Automated Test National Instruments Introduction

More information

Big Data a threat or a chance?

Big Data a threat or a chance? Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but

More information

Definition of Computers. INTRODUCTION to COMPUTERS. Historical Development ENIAC

Definition of Computers. INTRODUCTION to COMPUTERS. Historical Development ENIAC Definition of Computers INTRODUCTION to COMPUTERS Bülent Ecevit University Department of Environmental Engineering A general-purpose machine that processes data according to a set of instructions that

More information

Cloud Computing and Software Agents: Towards Cloud Intelligent Services

Cloud Computing and Software Agents: Towards Cloud Intelligent Services Cloud Computing and Software Agents: Towards Cloud Intelligent Services Domenico Talia ICAR-CNR & University of Calabria Rende, Italy talia@deis.unical.it Abstract Cloud computing systems provide large-scale

More information

Steve Hamby Chief Technology Officer Orbis Technologies, Inc. shamby@orbistechnologies.com 678.346.6386

Steve Hamby Chief Technology Officer Orbis Technologies, Inc. shamby@orbistechnologies.com 678.346.6386 Semantic Technology and Cloud Computing Applied to Tactical Intelligence Domain Steve Hamby Chief Technology Officer Orbis Technologies, Inc. shamby@orbistechnologies.com 678.346.6386 1 Abstract The tactical

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

DATA. Big Data Operational Excellence Ahead in the Cloud. Detect Patterns with Mass Correlation. Limit Surprise with Smart Data

DATA. Big Data Operational Excellence Ahead in the Cloud. Detect Patterns with Mass Correlation. Limit Surprise with Smart Data FY11 FY12 FY13+ DATA Big Data Operational Excellence Ahead in the Cloud Detect Patterns with Mass Correlation Limit Surprise with Smart Data Accelerate Discovery with Visual Analytics Ira A. (Gus) Hunt

More information

Enterprise Energy Management with JouleX and Cisco EnergyWise

Enterprise Energy Management with JouleX and Cisco EnergyWise Enterprise Energy Management with JouleX and Cisco EnergyWise Introduction Corporate sustainability and enterprise energy management are pressing initiatives for organizations dealing with rising energy

More information

What happens when Big Data and Master Data come together?

What happens when Big Data and Master Data come together? What happens when Big Data and Master Data come together? Jeremy Pritchard Master Data Management fgdd 1 What is Master Data? Master data is data that is shared by multiple computer systems. The Information

More information

From Big Data to Smart Data Thomas Hahn

From Big Data to Smart Data Thomas Hahn Siemens Future Forum @ HANNOVER MESSE 2014 From Big to Smart Hannover Messe 2014 The Evolution of Big Digital data ~ 1960 warehousing ~1986 ~1993 Big data analytics Mining ~2015 Stream processing Digital

More information

Big Data Processing in Cloud Environments

Big Data Processing in Cloud Environments Big Data in Cloud Environments Satoshi Tsuchiya Yoshinori Sakamoto Yuichi Tsuchimoto Vivian Lee In recent years, accompanied by lower prices of information and communications technology (ICT) equipment

More information

E-mail: guido.negri@cern.ch, shank@bu.edu, dario.barberis@cern.ch, kors.bos@cern.ch, alexei.klimentov@cern.ch, massimo.lamanna@cern.

E-mail: guido.negri@cern.ch, shank@bu.edu, dario.barberis@cern.ch, kors.bos@cern.ch, alexei.klimentov@cern.ch, massimo.lamanna@cern. *a, J. Shank b, D. Barberis c, K. Bos d, A. Klimentov e and M. Lamanna a a CERN Switzerland b Boston University c Università & INFN Genova d NIKHEF Amsterdam e BNL Brookhaven National Laboratories E-mail:

More information

The Tonnabytes Big Data Challenge: Transforming Science and Education. Kirk Borne George Mason University

The Tonnabytes Big Data Challenge: Transforming Science and Education. Kirk Borne George Mason University The Tonnabytes Big Data Challenge: Transforming Science and Education Kirk Borne George Mason University Ever since we first began to explore our world humans have asked questions and have collected evidence

More information

Mobile Cloud Computing: Paradigms and Challenges 移 动 云 计 算 : 模 式 与 挑 战

Mobile Cloud Computing: Paradigms and Challenges 移 动 云 计 算 : 模 式 与 挑 战 Mobile Cloud Computing: Paradigms and Challenges 移 动 云 计 算 : 模 式 与 挑 战 Jiannong Cao Internet & Mobile Computing Lab Department of Computing Hong Kong Polytechnic University Email: csjcao@comp.polyu.edu.hk

More information

The Intersection of Big Data and Analytics. Philip Russom TDWI Research Director for Data Management May 5, 2011

The Intersection of Big Data and Analytics. Philip Russom TDWI Research Director for Data Management May 5, 2011 The Intersection of Big Data and Analytics Philip Russom TDWI Research Director for Data Management May 5, 2011 Sponsor 2 Speakers Philip Russom TDWI Research Director, Data Management Francois Ajenstat

More information