Big Data Hope or Hype?

Transcription

1 Big Data Hope or Hype?
David J. Hand
Imperial College, London and Winton Capital Management
Big data science, September

2 Google trends on big data
Google search, 1 Sept 2013: 1.6 billion hits on big data

3 What is big data?
Various definitions:
  data which are too extensive to permit iterative analysis: one-pass analysis is necessary;
  data sets which standard database tools cannot handle;
  data sets which are so large they require new forms of processing;
  a data set which exceeds 20% of the RAM of a given machine.
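
The first definition above, where only a single pass over the data is feasible, can be illustrated with a running mean and variance. This is a minimal sketch using Welford's online algorithm, which the slides do not name; the class name RunningStats and the sample values are hypothetical.

```python
class RunningStats:
    """Mean and variance from a single pass, without storing the data."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's update: adjust the mean, then the squared-deviation sum.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance; undefined for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # stand-in for a stream
    stats.update(value)
```

Each value is seen once and then discarded, so memory use does not grow with the size of the data set.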

4 Some big data stories
  The Large Hadron Collider: 1 petabyte (10^15 bytes) per second;
  Sequencing the human genome: 3.3 billion base pairs;
  Social network analysis: 2.5 quintillion (10^18) bytes per day;
  Climate modelling: Coupled Model Intercomparison Project, 5th phase: more than 2 petabytes;
  Google Translate: statistical machine translation; 200 billion words from UN documents.

5 Why now?
  automatic data capture (often secondary)
  simulations (e.g. meteorology, physics)
  exponential growth in computer memory

6 But it's not new! It's media rebranding
  1994: Wal-Mart, with over 7 billion transactions per year;
  1997: AT&T, with over 70 billion long-distance phone call records per year;
  1990s: Mobil Oil, over 100 terabytes of data;
  2000: in just a few months the Sloan Digital Sky Survey collected more data than had previously been collected in the entire history of astronomy.

7 Why is it exciting?
A new world, according to many!
McKinsey: "we are on the cusp of a tremendous wave of innovation, productivity, and growth, as well as new modes of competition and value capture, all driven by big data as consumers, companies, and economic sectors exploit its potential"

8 Some see big data as a paradigm shift in science:
"Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves."
Chris Anderson, Wired, in an article called "The end of theory: the data deluge makes scientific method obsolete".

9 But he was wrong: the numbers don't speak for themselves

10 But he was wrong: the numbers don't speak for themselves
There are two kinds of models:
  data-driven
  substantive

11 Data-driven models
Based purely on empirical relationships in the data
e.g. in credit scoring the model of choice is a logistic regression tree:
  the population is partitioned into segments on empirical grounds;
  different logistic regression models are built in each segment.
No underlying theory
No psychology, prospect theory, behavioural finance, etc.
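
The idea of fitting a separate logistic regression in each segment can be sketched as follows. This is an illustration under stated assumptions, not the method used in any real scorecard: the segments are fixed in advance rather than learned by a tree, the data are simulated, and the names fit_logistic, simulate, segment_A and segment_B are hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(xs, ys, steps=1000, lr=0.5):
    """Fit p(y=1 | x) = sigmoid(a + b*x) by plain gradient ascent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(a + b * x)
            grad_a += residual
            grad_b += residual * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def simulate(rng, slope, n=200):
    """Simulated segment: one predictor, binary outcome with the given slope."""
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(slope * x) else 0 for x in xs]
    return xs, ys


rng = random.Random(0)
true_slopes = {"segment_A": 2.0, "segment_B": -2.0}  # assumed, for illustration

# One model per segment: the same predictor acts in opposite directions.
models = {}
for name, slope in true_slopes.items():
    xs, ys = simulate(rng, slope)
    models[name] = fit_logistic(xs, ys)
```

A single pooled model would average the two opposite effects away. The segmented fit recovers them, but purely empirically: nothing in it explains why the segments differ.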

12 Data-driven models are not new
  e.g. segmented regression in credit scoring in the 1960s
Data-driven models are good for prediction and anomaly detection, which is why they are so heavily used in some domains
But data-driven models don't provide insight

13 Substantive models
Are essentially theories
  e.g. Newton's Laws of Motion
Necessary for understanding
  e.g. to detect dark matter from galaxy rotation
Lack of insight has its dangers
[Figure: axis labelled "Billions". Sources: BBA, CCRG (Access cards only added from 1974, Building Societies from 1996)]

14 So it's too much to say "Out with every theory of human behavior" (Anderson)
It depends what you are using the models for:
  prediction
  understanding

15 Big data needs
Computer science for manipulating data
  sorting, adding, selecting, aggregating, concatenating, etc.
Statistics for extracting information from data
Most of the problems we want to solve are inferential
We don't want to make a statement about the data we have, but about
  data we might get tomorrow (e.g. economic forecasting);
  the population from which our data were drawn (e.g. astronomical databases);
  a true value, which we have observed with measurement error (e.g. gene expression data);
  data we might have had if things had been different (e.g. social policy).

16 Big data risks
Big data often collected as a side effect of some other exercise:
  the definitions may not match
  definitions may change over time if administrative
Data quality (good for one purpose, not for another; computer is a necessary intermediary)
Selection bias
  different observational automatic data capture sources have different biases;
  problem of selecting on basis of response variable
  crime maps example: Direct Line Insurance survey: selective reporting of incidents for fear of impact on house prices
Multiple testing: everything significant
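
The multiple-testing risk can be made concrete with a small simulation. This is a sketch under assumptions that are not from the slides: 1,000 independent tests, each on 50 draws of pure noise, a 5% significance level, and a two-sided z-test with known unit variance. The function name null_p_value is hypothetical.

```python
import math
import random


def null_p_value(rng, n=50):
    """Two-sided z-test p-value for 'mean = 0' on pure standard-normal noise."""
    mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
    z = mean * math.sqrt(n)  # known unit variance, so z is standard normal
    return math.erfc(abs(z) / math.sqrt(2.0))


rng = random.Random(0)
num_tests = 1000
alpha = 0.05

# Every null hypothesis is true, so each "significant" result is a false alarm.
false_positives = sum(null_p_value(rng) < alpha for _ in range(num_tests))
expected = alpha * num_tests  # about 50 false alarms expected by chance alone
```

With enough variables to test, some will look significant even when there is nothing to find, which is why uncorrected significance tests on large data sets need care.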

17 New tools needed
Wikipedia says the challenges include "capture, curation, storage, search, sharing, transfer, analysis, and visualization"
While true, this is mostly talking about computational housekeeping tools rather than knowledge extraction tools:
it's talking about data juggling rather than inference
[Even "analysis" in the above quote refers to Hadoop, missing the point]

18 But there are implications for inference
Visualisation: but familiar tools may be inadequate

19 Iteration too slow
  use simple models (e.g. regression instead of logistic regression)
Splitting and screening (not really taking advantage of the big data)
  e.g. the LHC: 1 petabyte per second; an online filter reduces this by a factor of 10,000, with further selection by a factor of 100.
Anomaly detection
Streaming data
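
The LHC screening figures above work out as follows. This is a minimal arithmetic sketch, assuming a decimal petabyte of 10^15 bytes; the variable names are illustrative.

```python
raw_rate = 10**15  # bytes per second: 1 petabyte per second, as on the slide

after_online_filter = raw_rate // 10_000    # online filter: factor of 10,000
after_selection = after_online_filter // 100  # further selection: factor of 100

overall_reduction = raw_rate // after_selection
```

Under these assumptions the two stages together cut the stream by a factor of one million, from 10^15 to 10^9 bytes per second (about one gigabyte per second). Only one byte in a million is ever analysed.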

20 Big data does not mean end of small data
Power law for data set size:
  the probability of observing a data set of size n is inversely related to a power of n
There are vastly more small data sets than very large ones
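
The power-law claim can be sketched numerically. The slide gives no exponent or size range, so the exponent of 1.5 and the grid of sizes from 10^3 to 10^12 records below are assumptions chosen purely for illustration; power_law_pmf is a hypothetical helper.

```python
def power_law_pmf(sizes, exponent):
    """Probabilities proportional to n ** -exponent, normalised over `sizes`."""
    weights = [n ** -exponent for n in sizes]
    total = sum(weights)
    return [w / total for w in weights]


sizes = [10**k for k in range(3, 13)]  # assumed grid: 1e3 .. 1e12 records
exponent = 1.5                         # assumed; not given on the slide

probs = power_law_pmf(sizes, exponent)

# How much more likely the smallest size is than the largest one.
ratio_small_to_large = probs[0] / probs[-1]
```

Under these assumed values the smallest data sets are more than ten trillion times as likely as the largest ones, which is the sense in which small data sets vastly outnumber very large ones.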

21 The data mining experience
Most unusual structures in large data sets
  arise because of data errors
  turn out to be known about beforehand
  are uninteresting
    e.g. the discovery that in a time series of data, maxima and minima alternate
    e.g. the discovery that in the US about half the married people are male
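
The time series example is a "discovery" that must hold by construction, which a short check makes plain. This is a sketch on simulated data, assuming no two neighbouring values are equal (plateaus can break strict alternation); local_extrema is a hypothetical helper.

```python
import random


def local_extrema(series):
    """Return ('max' | 'min', index) for each strict interior extremum."""
    found = []
    for i in range(1, len(series) - 1):
        if series[i - 1] < series[i] > series[i + 1]:
            found.append(("max", i))
        elif series[i - 1] > series[i] < series[i + 1]:
            found.append(("min", i))
    return found


rng = random.Random(1)
series = [rng.random() for _ in range(1000)]  # arbitrary noise, no structure

kinds = [kind for kind, _ in local_extrema(series)]

# After a maximum the series falls until it turns, so the next extremum
# has to be a minimum, and vice versa.
alternates = all(a != b for a, b in zip(kinds, kinds[1:]))
```

The pattern appears in pure noise, so finding it in real data says nothing about the process that generated the data.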

22 Summary
Big data has great potential
Big data does not mean the end of small data
Big is not necessarily good, useful, valuable, or interesting
Data is not knowledge
It is possible to be data rich but information poor

23 The future is not big data, but what you learn from it

24 Thanks!


More information

Taming the Internet of Things: The Lord of the Things

Taming the Internet of Things: The Lord of the Things Taming the Internet of Things: The Lord of the Things Kirk Borne @KirkDBorne School of Physics, Astronomy, & Computational Sciences College of Science, George Mason University, Fairfax, VA Taming the Internet

More information

Master of Science in Marketing Analytics (MSMA)

Master of Science in Marketing Analytics (MSMA) Master of Science in Marketing Analytics (MSMA) COURSE DESCRIPTION The Master of Science in Marketing Analytics program teaches students how to become more engaged with consumers, how to design and deliver

More information

Why the Big Deal about Big Data?

Why the Big Deal about Big Data? Why the Big Deal about Big Data? Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering Founding Director, escience Institute University of Washington Technology Alliance Insight to Impact

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,

More information

LARGE-SCALE DATA ANALYTICS Applications and Technology

LARGE-SCALE DATA ANALYTICS Applications and Technology LARGE-SCALE DATA ANALYTICS Applications and Technology Peter Brezany Research Group Scientific Computing Faculty of Computer Science University of Vienna, Austria 2nd SPLab Workshop, October 2012 Today

More information

Big Data. George O. Strawn NITRD

Big Data. George O. Strawn NITRD Big Data George O. Strawn NITRD Caveat auditor The opinions expressed in this talk are those of the speaker, not the U.S. government Outline What is Big Data? NITRD's Big Data Research Initiative Big Data

More information

Indexed Terms: Big Data, benefits, characteristics, definition, problems, unstructured data

Indexed Terms: Big Data, benefits, characteristics, definition, problems, unstructured data Managing Data through Big Data: A Review Harsimran Singh Anand Assistant Professor, PG Dept of Computer Science & IT, DAV College, Amritsar Email id: harsimran_anand@yahoo.com A B S T R A C T Big Data

More information

INTRODUCTION TO DATA MINING SAS ENTERPRISE MINER

INTRODUCTION TO DATA MINING SAS ENTERPRISE MINER INTRODUCTION TO DATA MINING SAS ENTERPRISE MINER Mary-Elizabeth ( M-E ) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc. AGENDA Overview/Introduction to Data Mining

More information

Accelerate BI Initiatives With Self-Service Data Discovery And Integration

Accelerate BI Initiatives With Self-Service Data Discovery And Integration A Custom Technology Adoption Profile Commissioned By Attivio June 2015 Accelerate BI Initiatives With Self-Service Data Discovery And Integration Introduction The rapid advancement of technology has ushered

More information

Good morning. It is a pleasure to be with you here today to talk about the value and promise of Big Data.

Good morning. It is a pleasure to be with you here today to talk about the value and promise of Big Data. Good morning. It is a pleasure to be with you here today to talk about the value and promise of Big Data. 1 Advances in information technologies are transforming the fabric of our society and data represent

More information

Statistics, Big Data and Data Science!?

Statistics, Big Data and Data Science!? Statistics, Big Data and Data Science!? Prof. Dr. Göran Kauermann Ludwig-Maximilians-Universität Munich, Germany Statistics, Big Data and Data Science Statistics Founded around 1900 with the seminal work

More information

Journée Thématique Big Data 13/03/2015

Journée Thématique Big Data 13/03/2015 Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets

More information

IBM Big Data in Government

IBM Big Data in Government IBM Big in Government Turning big data into smarter decisions Deepak Mohapatra Sr. Consultant Government IBM Software Group dmohapatra@us.ibm.com The Big Paradigm Shift 2 Big Creates A Challenge And an

More information

NITRD and Big Data. George O. Strawn NITRD

NITRD and Big Data. George O. Strawn NITRD NITRD and Big Data George O. Strawn NITRD Caveat auditor The opinions expressed in this talk are those of the speaker, not the U.S. government Outline What is Big Data? Who is NITRD? NITRD's Big Data Research

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Hadoop and Map-reduce computing

Hadoop and Map-reduce computing Hadoop and Map-reduce computing 1 Introduction This activity contains a great deal of background information and detailed instructions so that you can refer to it later for further activities and homework.

More information

Using Internet Sources For Market Intelligence. Presented at NABE Big Data At Work Conference

Using Internet Sources For Market Intelligence. Presented at NABE Big Data At Work Conference Using Internet Sources For Market Intelligence Presented at NABE Big Data At Work Conference Northern Light 6-17-2015 The task Firms must make strategy decisions in order to survive and prosper What products

More information

Big Data: Image & Video Analytics

Big Data: Image & Video Analytics Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)

More information

What is the number one issue that Organizational Leaders are facing today?

What is the number one issue that Organizational Leaders are facing today? What is the number one issue that Organizational Leaders are facing today? Managing time and energy in the face of growing complexity...the sense that the world is moving faster -Chris Zook (Bain & Company

More information

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING

More information

BigMemory and Hadoop: Powering the Real-time Intelligent Enterprise

BigMemory and Hadoop: Powering the Real-time Intelligent Enterprise WHITE PAPER and Hadoop: Powering the Real-time Intelligent Enterprise BIGMEMORY: IN-MEMORY DATA MANAGEMENT FOR THE REAL-TIME ENTERPRISE Terracotta is the solution of choice for enterprises seeking the

More information

This Symposium brought to you by www.ttcus.com

This Symposium brought to you by www.ttcus.com This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data

More information

EUDAT. Towards a pan-european Collaborative Data Infrastructure

EUDAT. Towards a pan-european Collaborative Data Infrastructure EUDAT Towards a pan-european Collaborative Data Infrastructure Damien Lecarpentier CSC-IT Center for Science, Finland EISCAT User Meeting, Uppsala,6 May 2013 2 Exponential growth Data trends Zettabytes

More information

Doing Multidisciplinary Research in Data Science

Doing Multidisciplinary Research in Data Science Doing Multidisciplinary Research in Data Science Assoc.Prof. Abzetdin ADAMOV CeDAWI - Center for Data Analytics and Web Insights Qafqaz University aadamov@qu.edu.az http://ce.qu.edu.az/~aadamov 16 May

More information