What is Big Data used for? Intro to Big Data INRIA



What is Big Data used for?
Harnessing scientific discoveries
Initiating early warning of natural disasters (e.g., floods, volcanic eruptions, and earthquakes)
Reports
» Track business processes, transactions

What is Big Data used for?
Diagnosis
» Why is user engagement dropping?
» Why is the system slow?
» Prevent failures
» Detect spam, worms, viruses, DDoS attacks
Decisions
» Personalized medical treatment
» Decide what ads to show

Who is collecting what?
Credit card companies: what data are they getting?
» Airline ticket
» Restaurant check
» Grocery bill
» Hotel bill
Adapted from a presentation by Stu Miller, "How Big Data will change your life", at Osher Lifelong Learning Institute

Why are they collecting all this data?
Target Marketing
» To send you catalogs for exactly the merchandise you typically purchase.
» To suggest medications that precisely match your medical history.
» To push television channels to your set instead of your pulling them in.
» To send advertisements on those channels just for you!
Targeted Information
» To know what you need before you even know you need it, based on past purchasing habits!
» To notify you of your expiring driver's license or credit cards, or the last refill on an Rx, etc.
» To give you turn-by-turn directions to a shelter in case of emergency.
Adapted from a presentation by Stu Miller, "How Big Data will change your life", at Osher Lifelong Learning Institute

5 Ways Big Data Will Change the World

Medicine
Aetna is using reams of data to try to get early diagnosis, prevention and treatment of heart disease and diabetes. UCLA is using Big Data analysis to prevent complications from brain injuries. The American Society of Clinical Oncology is using Big Data to help it find the best treatments for cancer.
http://insights.wired.com/profiles/blogs/5-ways-big-data-will-change-the-world#axzz3nhefva1j

Security
There is a pedometer application that can actually identify people based on their gait, that is, how they walk. A new security firm called Pindrop is using Big Data analysis to help banks and other financial institutions identify callers, to ensure the person on the other end of the line is who they say they are. Pindrop is able to listen to more than 100 different background sounds on a phone call to tell where the call is coming from and whether it is a cell phone, land line, or VoIP. They can tell you if the person claiming to be in Nebraska is actually calling from Nigeria.
http://insights.wired.com/profiles/blogs/5-ways-big-data-will-change-the-world#axzz3nhefva1j

Urban Planning
Tracking the movements of people and how that could impact urban planning. Cities are using data discovery techniques to examine the myriad ways small changes can impact big urban centers. The Urban Center for Computational Data talks about computer models helping cities to figure out how things like a new bus line might impact crime, employment, and energy usage in parts of a city. There is little question that how our cities are built and function will be changed by data analytics.
http://insights.wired.com/profiles/blogs/5-ways-big-data-will-change-the-world#axzz3nhefva1j

Consumer Products
The tremendous rise in online shopping has created piles of data to better understand what consumers want and how they shop. It even allows companies to customize their pricing models based on who is shopping and when they want to buy.
http://insights.wired.com/profiles/blogs/5-ways-big-data-will-change-the-world#axzz3nhefva1j

Elections
In the 2012 presidential election, the Obama campaign made use of voter models on a scale never before seen. They were able to identify specific voters who would make a difference in the election and target messages to those voters. I am not talking about something general like "we need to appeal to soccer moms"; I am talking about true specifics like "the Johnson family on Maple Lane in Columbus, Ohio will vote for us if they know our stance on social security." It seems insane to think that presidential politics has gotten that local, but it has, and it worked. There is little question that the Obama campaign's sophisticated methods of get-out-the-vote and swing-voter identification swung a very close election their way.
http://insights.wired.com/profiles/blogs/5-ways-big-data-will-change-the-world#axzz3nhefva1j

Usage Examples of Big Data

What do self-driving cars have to do with Big Data?
Computers in cars know where you go, when you go, how fast you go, how many times you stop along the way, whether you stay in your lane, what your average MPG is, how you like your temperature, how close you get before stepping on the brake, and tens of thousands of other facts, instantly.
Analyzing all of this data rapidly allows a self-driving car to:
» Anticipate where you are going by looking at driving history
» Check road signs using sensors to know what the speed limit is or if a stop sign is approaching
» Alert and activate your braking and steering systems if pedestrians are in the street, you're too close to the curb, you drift into another lane, or you doze off.
Adapted from a presentation by Stu Miller, "How Big Data will change your life", at Osher Lifelong Learning Institute

Usage Example of Big Data
Moneyball: The Art of Winning an Unfair Game
- Oakland Athletics baseball team and its general manager Billy Beane
- The Oakland A's front office took advantage of more analytical gauges of player performance to field a team that could compete successfully against richer competitors in MLB
- Oakland had approximately $41 million in salary; the New York Yankees had $125 million in payroll that same season. Oakland was forced to find players undervalued by the market
- Moneyball had a huge impact on other teams in MLB
And there is a Moneyball movie!
Adapted from a presentation by Kayvan Tirdad, "The Age of Big Data", York University

Usage Example of Big Data
US 2012 Election
- predictive modeling
- mybarackobama.com
- drive traffic to other campaign sites: Facebook page (33 million "likes"), YouTube channel (240,000 subscribers and 246 million page views)
- a contest to dine with Sarah Jessica Parker
- every single night, the team ran 66,000 computer simulations
- Amazon Web Services
- data mining for individualized ad targeting
- Orca big-data app
- YouTube channel (23,700 subscribers and 26 million page views)
Adapted from a presentation by Kayvan Tirdad, "The Age of Big Data", York University

Big Data: Challenges

Volume (Scale)
Data Volume
» 44x increase from 2009 to 2020
» From 0.8 zettabytes (ZB) to 35 ZB
Data volume is increasing exponentially
Exponential increase in collected/generated data
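The arithmetic behind these figures can be checked with a short sketch. It uses only the two numbers quoted above (0.8 ZB in 2009, 35 ZB in 2020); the assumption of smooth compound growth between those two years is ours, made only to illustrate what "exponential" implies here.

```python
import math

# Figures quoted on the slide: 0.8 ZB in 2009, 35 ZB projected for 2020.
start_zb, end_zb = 0.8, 35.0
years = 2020 - 2009

# Overall multiple between the two endpoints (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Assumption: steady compound growth, so the yearly rate is the 11th root.
annual_rate = growth_factor ** (1 / years) - 1

# Time needed to double at that yearly rate.
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"overall growth: {growth_factor:.2f}x")
print(f"implied annual growth: {annual_rate:.1%}")
print(f"implied doubling time: {doubling_years:.1f} years")
```

Under that assumption the data volume grows by roughly 41% a year, which means it doubles about every two years.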

12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
76 million smart meters in 2009, 200M by 2014
100s of millions of GPS-enabled devices sold annually
2+ billion people on the Web by end of 2011

Variety (Complexity)
Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
» Social Network, Semantic Web (RDF)
Streaming Data
» You can only scan the data once
A single application can be generating/collecting many types of data
Big Public Data (online, weather, finance, etc.)
To extract knowledge → all these types of data need to be linked together
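The "you can only scan the data once" constraint on streaming data can be illustrated with a minimal sketch. It uses Welford's single-pass method for the running mean and variance; this technique is our choice for illustration and is not named in the slides, and the function name `running_stats` and the sample readings are hypothetical.

```python
from typing import Iterable, Tuple

def running_stats(stream: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance) after a single pass over stream."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count          # update the mean incrementally
        m2 += delta * (x - mean)       # accumulate squared deviations
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance

# A generator can be consumed only once, much like a real data stream.
readings = (float(v) for v in [2, 4, 4, 4, 5, 5, 7, 9])
print(running_stats(readings))
```

Nothing is stored beyond three numbers, so the memory used stays constant no matter how long the stream runs.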

A Single View to the Customer
[Figure: the Customer at the center, linked to Social Media, Banking, Finance, Gaming, Our Known History, Entertain, Purchase]

Velocity (Speed)
Data is being generated fast and needs to be processed fast
Online Data Analytics
Late decisions → missing opportunities
Examples
» E-Promotions: based on your current location, your purchase history, and what you like → send promotions right now for the store next to you
» Healthcare monitoring: sensors monitoring your activities and body → any abnormal measurement requires immediate reaction
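The healthcare-monitoring example can be sketched as an online check that reacts to each reading as it arrives. Everything specific here is an assumption made for illustration: the function name `flag_abnormal`, the window of 5 readings, the threshold of 3 standard deviations, the floor of 1.0 on the deviation, and the sample heart-rate values. A real monitoring system would use clinically validated rules.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, Iterator, Tuple

def flag_abnormal(readings: Iterable[float], window: int = 5,
                  threshold: float = 3.0) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings far from the recent window's mean."""
    recent = deque(maxlen=window)      # keeps only the last `window` readings
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            # Hypothetical floor of 1.0 so a flat window does not flag noise.
            if abs(value - mu) > threshold * max(sigma, 1.0):
                yield i, value         # react immediately, before later data
        recent.append(value)

# Hypothetical heart-rate samples with one abnormal spike.
heart_rate = [72, 74, 71, 73, 72, 75, 140, 74, 73]
print(list(flag_abnormal(heart_rate)))  # prints [(6, 140)]
```

Because the function is a generator, each alert is produced as soon as the abnormal reading is seen, rather than after the whole data set has been collected.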

Some make it 4 V's

and Privacy

Goodbye Anonymity

What are some impacts of Big Data?
Decisions like your credit score and your insurance rates may be based on the analysis of big data, for good or bad.
After Haiti's 2010 earthquake, Columbia University tracked the movements of 2 million refugees by the SIM cards in their cell phones and was able to determine where health risks would likely develop.
Adapted from a presentation by Stu Miller, "How Big Data will change your life", at Osher Lifelong Learning Institute

Is Big Data good or bad for consumers?
How would you feel about paying more for the same product than the person checking out in front of you?
The real challenge: are you willing to get better value and more innovation for some loss of privacy?
Since there is no way to stop the accumulation of Big Data, should its use be regulated by the federal government?
Adapted from a presentation by Stu Miller, "How Big Data will change your life", at Osher Lifelong Learning Institute

How Can You Avoid Big Data?
Pay cash for everything!
Never go online!
Don't use a telephone!
Don't fill any prescriptions!
Never leave your house!
Adapted from a presentation by Stu Miller, "How Big Data will change your life", at Osher Lifelong Learning Institute

Data Science: The 4th Paradigm for Scientific Discovery
$$\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$$
A thousand years ago: description of natural phenomena
Last few hundred years: Newton's laws, Maxwell's equations
Last few decades: simulation of complex phenomena
Today and the future: unify theory, experiment and simulation with large multidisciplinary data, using data exploration and data mining (from instruments, sensors, humans)
Credits: Dennis Gannon
Distributed Communities

Big Data Science: The art of understanding huge volumes of data
Data Science is not just data analysis. Four main topics:
» Data architecture: how the data would need to be routed and organized to support the analysis, visualization and presentation of the data
» Data acquisition: how the data are collected, and, importantly, how the data are represented prior to analysis and presentation
» Data analysis: involves many technical, mathematical, and statistical aspects; still, the results have to be effectively communicated to the data user
» Data archiving: preservation of collected data in a form that makes it highly reusable (data curation)

Data Scientist skills
Evolution from the data analyst role:
» Computer science, software engineering methodologies, modeling, statistics, analytics, visualization, databases, machine learning, data mining, big data and maths
» Business skills: influence in making decisions in a business environment
The data scientist guides a data science project
» Engineer: collect & scrub disparate data sources, manage a large computing cluster
» Mathematician: machine learning, statistics
» Artist: visualize data beautifully, tell a convincing story

Data Science Venn Diagram

Cross-Cutting Data Requirements
Data Analysis: machine learning, statistical methods, scalability, quality, multi-model
Data Management: schema, sharing, retention, search, I/O, storage tech.
Data Processing: acquisition, workflow, reduction, system arch., provenance

Data Scientist
"I keep saying the sexy job in the next ten years will be statisticians. The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it."
Hal Varian, Google's chief economist

Acknowledgments
Gabriel Antoniu (Inria)
Alexandru Costan (INSA)

Thank you!
Shadi Ibrahim
shadi.ibrahim@inria.fr