The Big Gift of Big Data
|
|
|
- Juliana Julia Ferguson
- 10 years ago
- Views:
Transcription
1 The Big Gift of Big Data Valerio Pascucci Director, Center for Extreme Data Management Analysis and Visualization Professor, SCI institute and School of Computing, University of Utah Laboratory Fellow, Pacific Northwest National Laboratory Pascucci-1
2 Center for Extreme Data Management, Analysis, and Visualization 10 Faculty + scientists, developers, students, Primary partners: UU, PNNL, LLNL Other partnerships: NSA, INL, ANL,. Involvement in national Initiatives $1.6B NSA data center (1.5 million-square-foot facility) Pascucci-2
3 What is Big Data? Big Data is like teenage sex: everyone Talks About It, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it. (Dan Ariely). eventually everyone will do it! Pascucci-3
4 Are Global Warming Trends Associated to Human Activities? Pascucci-4
5 Can Devastating Material Failures Such as Earthquakes Travel at Supersonic Speed? Understanding the strength of new materials is critical in creating structures as small as microprocessors, buildings, or airplanes that withstand real-world forces Pascucci-5
6 Change the World of Fuels and Engines to Increase Efficiency and Reduce Pollution Fuel streams are rapidly evolving: Heavy hydrocarbons Oil sands Oil shale Coal New renewable fuel sources Ethanol Biodiesel New engine technologies: Direct Injection (DI) Homogeneous Charge Compression Ignition (HCCI) Low-temperature combustion Mixed modes of combustion (dilute, high-pressure, low-temp.) Pascucci-6
7 Can We Explain the Origin and the Evolution of the Universe? Pascucci-7
8 Can We Develop a New Healthcare Process that is Fully Personalized Pascucci-8
9 Intermezzo to talk about a convicted felon Pascucci-10
10 Galileo Galilei Jailed in Apr 12, 1633 because "gravely suspect of heresy The case was quickly reviewed by the Catholic Church Received excuses in 1992 Pascucci-11
11 Galileo Galilei Led the Modern Revolution of the Scientific Method Make observations Propose a hypothesis of theoretical model Design and perform anrepetible experiment to test the hypothesis Analyze your data to determine whether to accept or reject the hypothesis If necessary, iterate (propose and test a new hypothesis) Pascucci-12
12 Galileo Galilei Led the Modern Revolution of the Scientific Method Make observations Propose a hypothesis of theoretical model Design and perform anrepetible experiment to test the hypothesis Analyze your data to determine whether to accept or reject the hypothesis If necessary, iterate (propose and test a new hypothesis) Pascucci-13
13 Galileo Galilei Led the Modern Revolution of the Scientific Method Make observations Propose a hypothesis of theoretical model Design and perform anrepetible experiment to test the hypothesis Analyze your data to determine whether to accept or reject the hypothesis If necessary, iterate (propose and test a new hypothesis) Pascucci-14
14 Galileo Galilei Led the Modern Revolution of the Scientific Method Make observations Propose a hypothesis of theoretical model Design and perform anrepetible experiment to test the hypothesis Analyze your data to determine whether to accept or reject the hypothesis Invention If necessary, of the iterate (propose and test a new hypothesis) Telescope Pascucci-15
15 Galileo Galilei Led the Modern Revolution of the Scientific Method Make observations Propose a hypothesis of theoretical model Design and perform anrepetible experiment to test the hypothesis Analyze your data to determine whether to accept or reject the hypothesis If necessary, iterate (propose and test a new hypothesis) Pascucci-16
16 Galileo Galilei Led the Modern Revolution of the Scientific Method Make observations Propose a hypothesis of theoretical model Design and perform anrepetible experiment to test the hypothesis Analyze your data to determine whether to accept or reject the hypothesis If necessary, iterate Development of formal (propose and test a new hypothesis) model description Pascucci-17
17 New Data Collection and Processing Resources Challenged This Model Large Synoptic Survey Telescope 15TB/day 100PB in 10-year Pascucci-18
18 New Data Collection and Processing Resources Challenged This Model PayPal s Data Volumes Pascucci-19
19 New Data Collection and Processing Resources Challenged This Model Earth System Grid Tens of Petabytes of climate data Pascucci-20
20 New Data Collection and Processing Resources Challenged This Model Year Annual Number of Google Searches Average Searches Per Day ,161,530,000,000 5,922,000, ,873,910,000,000 5,134,000, ,722,071,000,000 4,717,000, ,324,670,000,000 3,627,000, ,700,000,000 2,610,000, ,200,000,000 1,745,000, ,000,000,000 1,200,000, ,000,000,000 60,000, ,600,000 9,800 Pascucci-21
21 New Data Collection and Processing Resources Challenged This Model 1 st paradigm, empirical science 2 nd paradigm, model based theoretical science 3 rd paradigm, computational science (simulations) 4 th paradigm, data driven investigation (escience) Pascucci-23
22 New Data Collection and Processing Resources Challenged This Model 1 st paradigm, empirical science 2 nd paradigm, model based theoretical science "Essentially, all models are wrong, but some are useful George Edward Pelham Box (1987) 3 rd paradigm, computational science (simulations) 4 th paradigm, data driven investigation (escience) The End of Theory: The Data Deluge Makes the Scientific Method Obsolete Chris Anderson (2011) Pascucci-24
23 New Data Collection and Processing Resources Challenged This Model 1 st paradigm, empirical science 2 nd paradigm, model based theoretical science "Essentially, all models are wrong, but some are useful George Edward Pelham Box (1987) 3 rd paradigm, computational science (simulations) 4 th paradigm, data driven investigation (escience) The End of Theory: The Data Deluge Makes the Scientific Method Obsolete Chris Anderson (2011) Pascucci-25
24 New Data Collection and Processing Resources Challenged This Model 1 st paradigm, empirical science 2 nd paradigm, model based theoretical science "Essentially, all models are wrong, but some are useful George Edward Pelham Box (1987) 3 rd paradigm, computational science (simulations) 4 th paradigm, data driven investigation (escience) The End of Theory: The Data Deluge Makes the Scientific Method Obsolete Chris Anderson (2011) Pascucci-26
25 New Data Collection and Processing Resources Challenged This Model 1 st paradigm, empirical science 2 nd paradigm, model based theoretical science "Essentially, all models are wrong, but some are useful George Edward Pelham Box (1987) 3 rd paradigm, computational science (simulations) 4 th paradigm, data driven investigation (escience) The End of Theory: The Data Deluge Makes the Scientific Method Obsolete Chris Anderson (2011) Pascucci-27
26 New Data Collection and Processing Resources Challenged This Model 1 st paradigm, empirical science 2 nd paradigm, model based theoretical science 3 rd paradigm, computational science (simulations) 4 th paradigm, data driven investigation (escience) Pascucci-28
27 The True Revolution is in the New Challenges that We Can Try to Tackle Milky Way crashes into Andromeda system. Can we test Empirically multiple scenarios? In four billion years Pascucci-29
28 The True Revolution is in the New Challenges that We Can Try to Tackle Global warming expectation Can we test Empirically multiple scenarios? Predict the outcome in 2100 Pascucci-30
29 The True Revolution is in the New Challenges that We Can Try to Tackle Evolution of the stock market Can we test Empirically multiple scenarios? Value of investments in 10 years Pascucci-31
30 The True Revolution is in the New Challenges that We Can Try to Tackle Predicting 100 year aging effects on new elements Can we test Empirically multiple scenarios? Plutonium discovered in 1940 Pascucci-32
31 Would you go to space with a vehicle that has been developed only based on simulations? Ariane 5's first flight (Flight 501) on 4 June 1996 $370 million in 37 seconds Software bug Pascucci-33
32 How Would You Maintain an Arsenal of nuclear Bombs that are not tested? Comprehensive Test-Ban Treaty (CTBT) United Nations General Assembly on 10 Sept Is the nuclear arsenal aging properly or is it becoming dangerous and ineffective? Pascucci-34
33 How Would You Maintain an Arsenal of nuclear Bombs that are not tested? Comprehensive Test-Ban Treaty (CTBT) United Nations General Assembly on 10 Sept Is the nuclear arsenal aging properly or is it becoming dangerous and ineffective? Pascucci-35
34 Unprecedented Evolution of High Performance Computing Resources M-5: Los Alamos National Lab No 1. system in June 1993C Pascucci-36
35 Unprecedented Evolution of High Performance Computing Resources Num. Wind Tunnel: National Aerospace Laboratory of Japan No. 1 system in November 1993 and November 1994 Pascucci-37
36 Unprecedented Evolution of High Performance Computing Resources Intel XP/S 140 Paragon: Sandia National Labs No. 1 system in June 1994 Pascucci-38
37 Unprecedented Evolution of High Performance Computing Resources Hitachi SR2201: University of Tokyo No. 1 system in June 19 Pascucci-39
38 Unprecedented Evolution of High Performance Computing Resources CP-PACS: University of Tsukuba Pascucci-40
39 Unprecedented Evolution of High Performance Computing Resources ASCI Red: Sandia National Laboratory No. 1 system from June 1997 to June 2000 Pascucci-41
40 Unprecedented Evolution of High Performance Computing Resources ASCI White: LLNL No. 1 system from Nov to Nov Pascucci-42
41 Unprecedented Evolution of High Performance Computing Resources The Earth Simulator No. 1 from June 2002 to June 2004 Pascucci-43
42 Unprecedented Evolution of High Performance Computing Resources BlueGene/L: LLNL No. 1 from November 2004 to November 2007 Pascucci-44
43 Unprecedented Evolution of High Performance Computing Resources Roadrunner: Los Alamos National Laboratory No. 1 from June 2008 to June 2009 Pascucci-45
44 Unprecedented Evolution of High Performance Computing Resources Jaguar: Oak ridge National Laboratory No. 1 from November 2009 to June 2010 Pascucci-46
45 Unprecedented Evolution of High Performance Computing Resources Tianhe-1A: National SC in Tianjin No. 1 in November 2010 Pascucci-47
46 Unprecedented Evolution of High Performance Computing Resources K Computer: RIKEN Institute for No. 1 June 2011 to November 2011 Pascucci-48
47 Unprecedented Evolution of High Performance Computing Resources Sequoia: LLNL No. 1 in June 2012 Pascucci-49
48 Unprecedented Evolution of High Performance Computing Resources Titan: Oak Ridge National Laboratory No. 1 in November 2012 Pascucci-50
49 Unprecedented Evolution of High Performance Computing Resources Tianhe-2 (MilkyWay-2) No. 1 system since June 2013 Pascucci-51
50 Unprecedented Evolution of High Performance Computing Resources Tianhe-2 (MilkyWay-2) No. 1 system since June 2013 Pascucci-52
51 The Fundamental Paradigm Shift of escience The data is a major driver of the investigations Having the data does NOT guarantee to have the right questions and answers NOT having the data guarantees that you CANNOT develop the right questions and answers HPC, scientific computing, machine learning, data analysis, mining, exploration and visualization are critical activities to knowledge discovery. Pascucci-53
52 High Performance Data Movements for Real-Time Access to Large Scale Experimental Data Experiment run at Advance Photon Source at ANL Scientists located at PNNL Pascucci-54
53 High Performance Data Movements for Real-Time Access to Large Scale Simulation Data Data streams that allow merging multiple datasets in real time Time interpolation of and concurrent visualization of climate data ensembles defined on different time scales Server side and client side computation of statistical functions such as median, average, standard deviation,. Standard Deviation and Average of ten climate models Pascucci-55
54 Interactive Remote Analysis and Visualization of 6TB Imaging Data EM datasets of resolution: 130Kx130Kx TB of raw data Web Server Pascucci-56
55 SC12/13 Demonstration of data streaming analytics and visualization Live demonstration from Argonne National Laboratory to Supercomputing exhibit floor Infrastructure that scales gracefully with available hardware resources Cores available Pascucci-57
56 Massive Precomputations Can Avoid the Need for Real Time Processing Problem: Need accurate automated phone quotes in 100ms. They couldn t do these calculations nearly fast enough on the fly. Solution: Each weekend, use a new HPC cluster to precalculate quotes for every American adult and household (60 hour run time) Pascucci-58
57 The Fundamental Paradigm Shift of escience Development and curation of massive data collections from simulations and sensing will be crucial to any scientific progress Need to develop scientific knowledge and actionable information that cannot be tested empirically Uncertainty Quantification Verification and Validation Pascucci-59
58 Assessing the Uncertainty in Fuel Design For New Clean Burning Devices Exploration of high dimensional space of possible configurations. High Performance Combustion 10 dimensional landscape Poor Combustion to be avoided Pascucci-60
59 Topological Analysis of the Space of Composite Materials of a Given Class Features in experimental data show unexpected structures and are used to plan future experiments. Stakeholder: A. Karim, PNNL. Palladium Tin Molybdenum Tungsten Pascucci-61
60 How Can We Achieve True Verification and Validation? Verification: Are we building the software right? Pascucci-62
61 How Can We Achieve True Verification and Validation? Verification: Are we building the software right? Is my software bug free? Pascucci-63
62 How Can We Achieve True Verification and Validation? Verification: Are we building the software right? Is my software bug free? Chimera Pascucci-64
63 How Can We Achieve True Verification and Validation? Validation: Is the process providing accurate predictions? Pascucci-65
64 How Can We Achieve True Verification and Validation? Validation: Is the process providing accurate predictions? Pascucci-66
65 How Can We Achieve True Verification and Validation? Validation: Is the process providing accurate predictions? U.S. Securities and Exchange Commission (SEC) requires to tell: investors that a fund's: past performance does not necessarily predict future results Pascucci-67
66 Example: Validation: Is the process providing accurate predictions? Sidharth Kumar Aaditya Landge U.S. Securities and Exchange Commission (SEC) requires to tell: investors that a fund's: past performance does not necessarily predict future results Pascucci-68
67 Validation: Example: ATPESC summer school 2013 Is the process providing accurate predictions? Sidharth Kumar Aaditya Landge U.S. Securities and Exchange Commission (SEC) requires to tell: investors that a fund's: past performance does not necessarily predict future results Pascucci-69
68 Validation: Example: Is the process providing accurate predictions? Sidharth Kumar Aaditya Landge SC14 ATPESC summer school 2013 U.S. Securities and Exchange Efficient I/O and Storage of Adaptive Resolution Data Commission (SEC) requires to tell: investors that a fund's: SC14 In-Situ Feature Extraction of Large Scale Combustion Simulations Using Segmented Merge Trees past performance does not necessarily predict future results Pascucci-70
69 Validation: Example: Is the process providing accurate predictions? Sidharth Kumar Aaditya Landge SC14 ATPESC summer school 2013 U.S. Securities and Exchange Efficient I/O and Storage of Adaptive Resolution Data Commission (SEC) requires to tell: investors that a fund's: SC14 In-Situ Feature Extraction of Large Scale Combustion Simulations Using Segmented Merge Trees past performance does not necessarily predict future results Pascucci-71
70 Validation: Example: Is the process providing accurate predictions? Sidharth Kumar Aaditya Landge SC14 ATPESC summer school 2013 U.S. Securities and Exchange Efficient I/O and Storage of Adaptive Resolution Data Commission (SEC) requires to tell: investors that a fund's: SC14 In-Situ Feature Extraction of Large Scale Combustion Simulations Using Segmented Merge Trees past performance does not necessarily predict future results Pascucci-72
71 Rethinking Multi-Scale Representation of Massive Data Models Multi-resolution representations are insufficient to deal with big data: Data preprocessing is typically too long Wavelet-like averaging looses information Data analysis results often do not represent well important trends (e.g. multi-modal distributions) New data abstractions are needed for Big Data Multi-resolution Abstraction Pascucci-73
72 Rethinking Multi-Scale Representation of Massive Data Models Pascucci-74
73 Topological Analysis of Massive Combustion Simulations Non-premixed DNS combustion (J. Chen, SNL): Analysis of the time evolution of extinction and reignition regions for the design of better fuels time Pascucci-75
74 Topology Has Been Successful for Analysis and Visualization of Massive Scientific Data 76 Pascucci-76
75 Running Efficiently Big Data Computations is a Big Data Problem Massive logs Complex Memory Hierarchies Complex and Diverse Interconnects 3D, 4D, 5D tori Fat tree Complex I/O pathways Cost of Power Dominated by Data Movements Pascucci-77
76 Running Efficiently Big Data Computations is a Big Data Problem Growing Community involving Performance analysis and vis community Workshop at last IEEE VIS Dagstuhl Perspective Workshop Workshop at SC14 Conference Pascucci-78
77 Large Team Requiring Multi-disciplinary and Multi-institutional Collaboration Challenging collaborations among: Government laboratories Industry Academia Close collaborations with domain scientists requiring to cross language and cultural barriers: New education needed Communicate problems not tasks!!!!!! Pascucci-79
78 The Big Gift of Big Data A great opportunity to achieve new scientific discoveries and engineering innovations A great opportunity for the Computer Science community to become a central player in the development of modern science A great challenge for all communities to become strongly engaged in interdisciplinary collaboration A great opportunity for our community to become the data generation, processing and exploration telescope of modern science and engineering Pascucci-80
79 The Big Gift of Big Data A great opportunity to achieve new scientific discoveries and engineering innovations A great opportunity for the Computer Science community to build the infrastructure needed to drive modern science A great challenge for all communities to become strongly engaged in interdisciplinary collaboration A great opportunity for our community to become the data generation, processing and exploration telescope of modern science and engineering Pascucci-81
80 END Pascucci-82
Big Data Hope or Hype?
Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big
Big Data Challenges in Bioinformatics
Big Data Challenges in Bioinformatics BARCELONA SUPERCOMPUTING CENTER COMPUTER SCIENCE DEPARTMENT Autonomic Systems and ebusiness Pla?orms Jordi Torres [email protected] Talk outline! We talk about Petabyte?
Hank Childs, University of Oregon
Exascale Analysis & Visualization: Get Ready For a Whole New World Sept. 16, 2015 Hank Childs, University of Oregon Before I forget VisIt: visualization and analysis for very big data DOE Workshop for
Data Intensive Science and Computing
DEFENSE LABORATORIES ACADEMIA TRANSFORMATIVE SCIENCE Efficient, effective and agile research system INDUSTRY Data Intensive Science and Computing Advanced Computing & Computational Sciences Division University
SIRFN Capability Summary National Renewable Energy Laboratory, Golden, CO, USA
SIRFN Capability Summary National Renewable Energy Laboratory, Golden, CO, USA Introduction The National Renewable Energy Laboratory (NREL) is the U.S. Department of Energy s primary national laboratory
Achieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-Kernels Jiannan Ouyang, Brian Kocoloski, John Lange The Prognostic Lab @ University of Pittsburgh Kevin Pedretti Sandia National Laboratories HPDC 2015
BIG DATA POSSIBILITIES AND CHALLENGES
BIG DATA POSSIBILITIES AND CHALLENGES PROFESSOR AND CENTER DIRECTOR WHY BIG DATA? In God we trust - all others must bring data W. Edwards Deming (US engineer and statistician, 1900-1993) WHAT IS BIG DATA?
The Big Data Paradigm Shift. Insight Through Automation
The Big Data Paradigm Shift Insight Through Automation Agenda The Problem Emcien s Solution: Algorithms solve data related business problems How Does the Technology Work? Case Studies 2013 Emcien, Inc.
The future of Big Data A United Hitachi View
The future of Big Data A United Hitachi View Alex van Die Pre-Sales Consultant 1 Oktober 2014 1 Agenda Evolutie van Data en Analytics Internet of Things Hitachi Social Innovation Vision and Solutions 2
Dr. Raju Namburu Computational Sciences Campaign U.S. Army Research Laboratory. The Nation s Premier Laboratory for Land Forces UNCLASSIFIED
Dr. Raju Namburu Computational Sciences Campaign U.S. Army Research Laboratory 21 st Century Research Continuum Theory Theory embodied in computation Hypotheses tested through experiment SCIENTIFIC METHODS
Parallel Analysis and Visualization on Cray Compute Node Linux
Parallel Analysis and Visualization on Cray Compute Node Linux David Pugmire, Oak Ridge National Laboratory and Hank Childs, Lawrence Livermore National Laboratory and Sean Ahern, Oak Ridge National Laboratory
New Jersey Big Data Alliance
Rutgers Discovery Informatics Institute (RDI 2 ) New Jersey s Center for Advanced Computation New Jersey Big Data Alliance Manish Parashar Director, Rutgers Discovery Informatics Institute (RDI 2 ) Professor,
EL Program: Smart Manufacturing Systems Design and Analysis
EL Program: Smart Manufacturing Systems Design and Analysis Program Manager: Dr. Sudarsan Rachuri Associate Program Manager: K C Morris Strategic Goal: Smart Manufacturing, Construction, and Cyber-Physical
BigData & Analytics Research and Developement at Petrobras
III Fórum HBR BRASIL Big Data & Analytics BigData & Analytics Research and Developement at Petrobras Ismael H F Santos, DSc CENPES/PDEP/TOOL [email protected] Agosto 2014 Agenda BigData and Scientific
HPC & Visualization. Visualization and High-Performance Computing
HPC & Visualization Visualization and High-Performance Computing Visualization is a critical step in gaining in-depth insight into research problems, empowering understanding that is not possible with
An Analytical Framework for Particle and Volume Data of Large-Scale Combustion Simulations. Franz Sauer 1, Hongfeng Yu 2, Kwan-Liu Ma 1
An Analytical Framework for Particle and Volume Data of Large-Scale Combustion Simulations Franz Sauer 1, Hongfeng Yu 2, Kwan-Liu Ma 1 1 University of California, Davis 2 University of Nebraska, Lincoln
Big Data a threat or a chance?
Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but
Big Data Collection and Utilization for Operational Support of Smarter Social Infrastructure
Hitachi Review Vol. 63 (2014), No. 1 18 Big Data Collection and Utilization for Operational Support of Smarter Social Infrastructure Kazuaki Iwamura Hideki Tonooka Yoshihiro Mizuno Yuichi Mashita OVERVIEW:
Sanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 [email protected] 1. Introduction The field of data mining and knowledgee discovery is emerging as a
Outline. What is Big data and where they come from? How we deal with Big data?
What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,
Big-data Analytics: Challenges and Opportunities
Big-data Analytics: Challenges and Opportunities Chih-Jen Lin Department of Computer Science National Taiwan University Talk at 台 灣 資 料 科 學 愛 好 者 年 會, August 30, 2014 Chih-Jen Lin (National Taiwan Univ.)
Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee
Teaching Computational Thinking using Cloud Computing: By A/P Tan Tin Wee Technology in Pedagogy, No. 8, April 2012 Written by Kiruthika Ragupathi ([email protected]) Computational thinking is an emerging
Data Requirements from NERSC Requirements Reviews
Data Requirements from NERSC Requirements Reviews Richard Gerber and Katherine Yelick Lawrence Berkeley National Laboratory Summary Department of Energy Scientists represented by the NERSC user community
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
Information Visualization WS 2013/14 11 Visual Analytics
1 11.1 Definitions and Motivation Lot of research and papers in this emerging field: Visual Analytics: Scope and Challenges of Keim et al. Illuminating the path of Thomas and Cook 2 11.1 Definitions and
High Performance Computing
High Parallel Computing Hybrid Program Coding Heterogeneous Program Coding Heterogeneous Parallel Coding Hybrid Parallel Coding High Performance Computing Highly Proficient Coding Highly Parallelized Code
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
DEPARTMENT OF PETROLEUM ENGINEERING Graduate Program (Version 2002)
DEPARTMENT OF PETROLEUM ENGINEERING Graduate Program (Version 2002) COURSE DESCRIPTION PETE 512 Advanced Drilling Engineering I (3-0-3) This course provides the student with a thorough understanding of
NoSQL Performance Test In-Memory Performance Comparison of SequoiaDB, Cassandra, and MongoDB
bankmark UG (haftungsbeschränkt) Bahnhofstraße 1 9432 Passau Germany www.bankmark.de [email protected] T +49 851 25 49 49 F +49 851 25 49 499 NoSQL Performance Test In-Memory Performance Comparison of SequoiaDB,
HPC technology and future architecture
HPC technology and future architecture Visual Analysis for Extremely Large-Scale Scientific Computing KGT2 Internal Meeting INRIA France Benoit Lange [email protected] Toàn Nguyên [email protected]
a Data Science initiative @ Univ. Piraeus [GR]
a Data Science initiative @ Univ. Piraeus [GR] The Data Science Lab members June 2015 What is Data Science source: quora.com! Looking at data! Tools and methods used to analyze large amounts of data! Anything
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Modern (Computational) Approaches to Big Data Analytics. CSC 576 Computer Science, University of Rochester Instructor: Ji Liu
Modern (Computational) Approaches to Big Data Analytics CSC 576 Computer Science, University of Rochester Instructor: Ji Liu Big Data in Academy SIGKDD 2014 (program page, found 14 big data, 50+ large
Big Data R&D Initiative
Big Data R&D Initiative Howard Wactlar CISE Directorate National Science Foundation NIST Big Data Meeting June, 2012 Image Credit: Exploratorium. The Landscape: Smart Sensing, Reasoning and Decision Environment
Data Deduplication in a Hybrid Architecture for Improving Write Performance
Data Deduplication in a Hybrid Architecture for Improving Write Performance Data-intensive Salable Computing Laboratory Department of Computer Science Texas Tech University Lubbock, Texas June 10th, 2013
Using Predictive Maintenance to Approach Zero Downtime
SAP Thought Leadership Paper Predictive Maintenance Using Predictive Maintenance to Approach Zero Downtime How Predictive Analytics Makes This Possible Table of Contents 4 Optimizing Machine Maintenance
Data Centric Systems (DCS)
Data Centric Systems (DCS) Architecture and Solutions for High Performance Computing, Big Data and High Performance Analytics High Performance Computing with Data Centric Systems 1 Data Centric Systems
A Conceptual Approach to Data Visualization for User Interface Design of Smart Grid Operation Tools
A Conceptual Approach to Data Visualization for User Interface Design of Smart Grid Operation Tools Dong-Joo Kang and Sunju Park Yonsei University [email protected], [email protected] Abstract
High Performance Computing. Course Notes 2007-2008. HPC Fundamentals
High Performance Computing Course Notes 2007-2008 2008 HPC Fundamentals Introduction What is High Performance Computing (HPC)? Difficult to define - it s a moving target. Later 1980s, a supercomputer performs
Big Data Management in the Clouds and HPC Systems
Big Data Management in the Clouds and HPC Systems Hemera Final Evaluation Paris 17 th December 2014 Shadi Ibrahim [email protected] Era of Big Data! Source: CNRS Magazine 2013 2 Era of Big Data! Source:
How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Big Data Mining Services and Knowledge Discovery Applications on Clouds
Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy [email protected] Data Availability or Data Deluge? Some decades
Is a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
Data-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
High Performance Computing Infrastructure in JAPAN
High Performance Computing Infrastructure in JAPAN Yutaka Ishikawa Director of Information Technology Center Professor of Computer Science Department University of Tokyo Leader of System Software Research
Mission Need Statement for the Next Generation High Performance Production Computing System Project (NERSC-8)
Mission Need Statement for the Next Generation High Performance Production Computing System Project () (Non-major acquisition project) Office of Advanced Scientific Computing Research Office of Science
Make the Most of Big Data to Drive Innovation Through Reseach
White Paper Make the Most of Big Data to Drive Innovation Through Reseach Bob Burwell, NetApp November 2012 WP-7172 Abstract Monumental data growth is a fact of life in research universities. The ability
Concept and Project Objectives
3.1 Publishable summary Concept and Project Objectives Proactive and dynamic QoS management, network intrusion detection and early detection of network congestion problems among other applications in the
UniGR Workshop: Big Data «The challenge of visualizing big data»
Dept. ISC Informatics, Systems & Collaboration UniGR Workshop: Big Data «The challenge of visualizing big data» Dr Ir Benoît Otjacques Deputy Scientific Director ISC The Future is Data-based Can we help?
Manjula Ambur NASA Langley Research Center April 2014
Manjula Ambur NASA Langley Research Center April 2014 Outline What is Big Data Vision and Roadmap Key Capabilities Impetus for Watson Technologies Content Analytics Use Potential use cases What is Big
Data Analytics in Organisations and Business
Data Analytics in Organisations and Business Dr. Isabelle E-mail: [email protected] 1 Data Analytics in Organisations and Business Some organisational information: Tutorship: Gian Thanei:
LSST and the Cloud: Astro Collaboration in 2016 Tim Axelrod LSST Data Management Scientist
LSST and the Cloud: Astro Collaboration in 2016 Tim Axelrod LSST Data Management Scientist DERCAP Sydney, Australia, 2009 Overview of Presentation LSST - a large-scale Southern hemisphere optical survey
Data-Driven Decisions: Role of Operations Research in Business Analytics
Data-Driven Decisions: Role of Operations Research in Business Analytics Dr. Radhika Kulkarni Vice President, Advanced Analytics R&D SAS Institute April 11, 2011 Welcome to the World of Analytics! Lessons
Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales
Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes Anthony Kenisky, VP of North America Sales About Appro Over 20 Years of Experience 1991 2000 OEM Server Manufacturer 2001-2007
Types of Engineering Jobs
What Do Engineers Do? Engineers apply the theories and principles of science and mathematics to the economical solution of practical technical problems. I.e. To solve problems Often their work is the link
Masters in Information Technology
Computer - Information Technology MSc & MPhil - 2015/6 - July 2015 Masters in Information Technology Programme Requirements Taught Element, and PG Diploma in Information Technology: 120 credits: IS5101
International Journal of Innovative Research in Computer and Communication Engineering
FP Tree Algorithm and Approaches in Big Data T.Rathika 1, J.Senthil Murugan 2 Assistant Professor, Department of CSE, SRM University, Ramapuram Campus, Chennai, Tamil Nadu,India 1 Assistant Professor,
1.1 Difficulty in Fault Localization in Large-Scale Computing Systems
Chapter 1 Introduction System failures have been one of the biggest obstacles in operating today s largescale computing systems. Fault localization, i.e., identifying direct or indirect causes of failures,
Scalable Developments for Big Data Analytics in Remote Sensing
Scalable Developments for Big Data Analytics in Remote Sensing Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,
II Ill Ill1 II. /y3n-7 GAO. Testimony. Supercomputing in Industry
GAO a Testimony /y3n-7 For Release on Delivery Expected at 9:30 am, EST Thursday March 7, 1991 Supercomputing in Industry II Ill Ill1 II 143857 Statement for the record by Jack L Erock, Jr, Director Government
High Performance Storage System. Overview
Overview * Please review disclosure statement on last slide Please see disclaimer on last slide Copyright IBM 2009 Slide Number 1 Hierarchical storage software developed in collaboration with five US Department
Big Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
Survey of Canadian and International Data Management Initiatives. By Diego Argáez and Kathleen Shearer
Survey of Canadian and International Data Management Initiatives By Diego Argáez and Kathleen Shearer on behalf of the CARL Data Management Working Group (Working paper) April 28, 2008 Introduction Today,
Parallel Computing. Introduction
Parallel Computing Introduction Thorsten Grahs, 14. April 2014 Administration Lecturer Dr. Thorsten Grahs (that s me) [email protected] Institute of Scientific Computing Room RZ 120 Lecture Monday 11:30-13:00
Data Centric Computing Revisited
Piyush Chaudhary Technical Computing Solutions Data Centric Computing Revisited SPXXL/SCICOMP Summer 2013 Bottom line: It is a time of Powerful Information Data volume is on the rise Dimensions of data
SciDAC Visualization and Analytics Center for Enabling Technology
SciDAC Visualization and Analytics Center for Enabling Technology E. Wes Bethel 1, Chris Johnson 2, Ken Joy 3, Sean Ahern 4, Valerio Pascucci 5, Hank Childs 5, Jonathan Cohen 5, Mark Duchaineau 5, Bernd
Percipient StorAGe for Exascale Data Centric Computing
Percipient StorAGe for Exascale Data Centric Computing per cip i ent (pr-sp-nt) adj. Having the power of perceiving, especially perceiving keenly and readily. n. One that perceives. Introducing: Seagate
The Promise of Industrial Big Data
The Promise of Industrial Big Data Big Data Real Time Analytics Katherine Butler 1 st Annual Digital Economy Congress San Diego, CA Nov 14 th 15 th, 2013 Individual vs. Ecosystem What Happened When 1B
Masters in Human Computer Interaction
Masters in Human Computer Interaction Programme Requirements Taught Element, and PG Diploma in Human Computer Interaction: 120 credits: IS5101 CS5001 CS5040 CS5041 CS5042 or CS5044 up to 30 credits from
Situational Awareness Through Network Visualization
CYBER SECURITY DIVISION 2014 R&D SHOWCASE AND TECHNICAL WORKSHOP Situational Awareness Through Network Visualization Pacific Northwest National Laboratory Daniel M. Best Bryan Olsen 11/25/2014 Introduction
NITRD and Big Data. George O. Strawn NITRD
NITRD and Big Data George O. Strawn NITRD Caveat auditor The opinions expressed in this talk are those of the speaker, not the U.S. government Outline What is Big Data? Who is NITRD? NITRD's Big Data Research
INCITE Overview Lattice QCD Computational Science Workshop April 30, 2013. Julia C. White, INCITE Manager
INCITE Overview Lattice QCD Computational Science Workshop April 30, 2013 Julia C. White, INCITE Manager Origin of Leadership Computing Facility Department of Energy High-End Computing Revitalization Act
Visualization and Data Analysis
Working Group Outbrief Visualization and Data Analysis James Ahrens, David Rogers, Becky Springmeyer Eric Brugger, Cyrus Harrison, Laura Monroe, Dino Pavlakos Scott Klasky, Kwan-Liu Ma, Hank Childs LLNL-PRES-481881
