Mining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University


1 Mining Big Data Pang-Ning Tan Associate Professor Dept of Computer Science & Engineering Michigan State University Website:

2 Google Trends Big Data Smart Cities

3 Big Data and Smart Cities

4 Outline: Smart Cities; Big Data and Its Challenges; Mining Big Data

5 Smart Cities: Cities are growing steadily, and the process of urbanization is a common trend in the world. Although cities are getting bigger, they are not necessarily getting better. Smart cities, founded on the use of information and communication technologies, aim at tackling many local problems, from local economy and transportation to quality of life and e-governance. [Martínez-Ballesté et al. IEEE Communications 2013]

6 Examples of Smart Cities: Smart Buildings, E-Governance, Transportation, Healthcare, Education, Energy, Water, Waste management, Public safety. What are the key resources needed to realize this? DATA, $$$$

7 Types of Data from Smart Cities: sensor time series, surveillance video streams, GPS trajectories from mobile devices, smart card, social media, structured data

8 Why Mine/Analyze the Data? The data contains useful information that can be harnessed for various purposes: monitoring/surveillance, event detection, adaptation, decision making, planning, forecasting, etc.

9 Outline: Smart Cities; Big Data and Its Challenges; Mining Big Data

10 Big Data: How Much Data is Out There? Source:

11 How much is a Zettabyte? 1 ZettaByte = 1000 ExaBytes = 10^6 PetaBytes = 10^9 TeraBytes = 10^12 GigaBytes. A DVD stores about 5 GB of data and its case is ~1 cm thick. 1 ZettaByte ~ 10^12 / 5 = 200 billion DVDs to store it. Distance from Earth to moon = 384,000 km = 3.84 x 10^10 cm. ** If you stack all the DVDs that contain 1 ZB of data, the stack is about 3 times the distance to the moon and back.
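The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it uses the slide's rounded figures (5 GB per DVD, 1 cm per case, 384,000 km to the moon) and decimal SI units, and the function name `dvd_stack_summary` is made up for illustration.

```python
# Back-of-the-envelope check of the slide's figures (decimal SI units).
GB_PER_ZB = 10**12        # 1 ZB = 10^12 GB
DVD_CAPACITY_GB = 5       # slide's approximate DVD capacity
CASE_THICKNESS_CM = 1     # slide's approximate case thickness
EARTH_MOON_KM = 384_000   # slide's Earth-to-moon distance
CM_PER_KM = 100_000


def dvd_stack_summary():
    """Return (number of DVDs, stack height in km, moon round trips)."""
    dvds = GB_PER_ZB // DVD_CAPACITY_GB
    stack_km = dvds * CASE_THICKNESS_CM / CM_PER_KM
    round_trips = stack_km / (2 * EARTH_MOON_KM)
    return dvds, stack_km, round_trips


dvds, stack_km, round_trips = dvd_stack_summary()
print(f"{dvds:,} DVDs, stack of {stack_km:,.0f} km, {round_trips:.1f} round trips")
```

With these inputs the stack comes out at roughly 2.6 round trips, which the slide rounds to "about 3 times".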

12 Challenges of Big Data
Volume: large amount of data that is continuously growing
Velocity: rapid streams of data collected
Variety: structured and unstructured data obtained from (potentially) multiple data sources
Veracity: messiness or trustworthiness of the data
Value: usefulness of the data; needs a careful cost/benefit analysis before embarking on a big data project

13 Outline: Smart Cities; Big Data and Its Challenges; Mining Big Data

14 What is Data Mining? A collection of computer algorithms and techniques to automatically extract useful information from large data repositories. [Figure: Big Data Analytics Pipeline]

15 Garbage In, Garbage Out Quality of output information depends on quality of input data

16 Data Preprocessing: Helps to alleviate many of the data quality issues: noise, outliers, missing values (incomplete data), duplicate data, data with irrelevant attributes, data with redundant attributes, and data of varying format, scales, etc.
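Three of the issues listed here (duplicate data, missing values, varying scales) can be illustrated with a short standard-library sketch. The helper names and the choice of mean imputation and min-max scaling are assumptions made for illustration; the slides do not prescribe specific techniques.

```python
def drop_duplicates(rows):
    """Remove exact duplicate rows while preserving the original order."""
    seen, unique = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def impute_mean(rows):
    """Replace None with the column mean (assumes each column has a value)."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [
        tuple(means[j] if r[j] is None else r[j] for j in range(n_cols))
        for r in rows
    ]


def min_max_scale(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(r, lows, highs)
        )
        for r in rows
    ]


# Hypothetical sensor rows: (temperature, humidity), with one duplicate
# and one missing humidity reading.
raw = [(1.0, 10.0), (1.0, 10.0), (3.0, None), (5.0, 30.0)]
clean = min_max_scale(impute_mean(drop_duplicates(raw)))
print(clean)
```

Order matters in this sketch: duplicates are dropped first so they do not bias the column means, and scaling runs last so imputed values are rescaled too.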

17 Types of Data Analysis: Simple, descriptive statistics: mean/median/mode; standard deviation/mean absolute deviation; quartiles, percentiles, top-k. Example: heavy-hitter problem. Find the hot topics (e.g., trending hashtags) used over the past 24 hours.
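As a small illustration of the descriptive statistics named on this slide, the sketch below uses Python's standard `statistics` module on a made-up list of readings. The function name `describe` and the sample values are assumptions; the population standard deviation is used here, which is one of several reasonable choices.

```python
import statistics


def describe(values):
    """Compute a few simple descriptive statistics for a list of numbers."""
    mean = statistics.mean(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Population standard deviation (divide by n, not n - 1).
        "stdev": statistics.pstdev(values),
        "mean_abs_dev": sum(abs(v - mean) for v in values) / len(values),
        # Three cut points that split the data into four groups.
        "quartiles": statistics.quantiles(values, n=4),
    }


# Made-up readings, for illustration only.
summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(name, value)
```

All of these fit in memory for small inputs; the following slides look at what changes when the data arrives as an unbounded stream.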

18 TrendMap

19 Finding Hot Topics (Unbounded Storage). Data stream: 2013, discount, holiday, 2013, MSU. Associative array f (memory): discount 1, holiday 1, MSU 1. Naïve algorithm: assume storage space is unbounded.
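The naïve approach can be sketched as one counter per distinct item in an associative array (a Python `dict`), followed by a sort. The function names `count_all` and `hot_topics` are illustrative, and the stream is the five-item example from the slide.

```python
def count_all(stream):
    """Keep one counter per distinct item; memory grows with the vocabulary."""
    f = {}
    for item in stream:
        f[item] = f.get(item, 0) + 1
    return f


def hot_topics(stream, k):
    """Return the k most frequent items, ties broken alphabetically."""
    f = count_all(stream)
    return sorted(f.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


stream = ["2013", "discount", "holiday", "2013", "MSU"]
print(count_all(stream))
print(hot_topics(stream, 1))
```

This gives exact counts, but the dictionary needs one entry per distinct item, which is what the bounded-storage variant tries to avoid.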

20 Finding Hot Topics (Limited Storage). Data stream: 2013, discount, holiday. Associative array f (memory): discount 1? holiday 1. Which one to replace? Are there any theoretical guarantees that the solution will always be in the array?

21 Misra-Gries Algorithm Data Stream 2013 discount holiday 2013 MSU Associative array, f Memory discount MSU 1 0 holiday 1 The algorithm guarantees that all hot items that appear more than m/(k+1) times will be in the buffer (where m is the length of the data stream and k is the number of counters in the buffer)
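A minimal sketch of the Misra-Gries summary, assuming the common formulation with at most k counters: a new item takes a free counter if one exists, otherwise every counter is decremented and the ones that reach zero are evicted. The function name and the example stream are illustrative. Counts in the result are lower bounds on the true frequencies, not exact values.

```python
def misra_gries(stream, k):
    """Summarize a stream with at most k counters (Misra-Gries).

    Any item occurring more than m / (k + 1) times in a stream of
    length m is guaranteed to still hold a counter at the end.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Array is full: decrement every counter, evict those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


stream = ["2013", "discount", "holiday", "2013", "MSU"]
print(misra_gries(stream, 3))
```

For this stream, m = 5 and k = 3, so the threshold is 1.25; "2013" occurs twice and therefore survives, while the single-occurrence items are evicted when "MSU" arrives. A second pass over the data is needed if exact counts for the surviving candidates are required.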

22 Summary Even simple analysis becomes harder to compute when you have big data Need for fast and scalable algorithms that can produce good, approximate solutions

23 Advanced Data Mining Analysis
Data:
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
11   No      Married         60K             No
12   Yes     Divorced        220K            No
13   No      Single          85K             Yes
14   No      Married         75K             No
15   No      Single          90K             Yes
[Figure label: Ranking/Recommendation]

24 Predictive Modeling: Classification. To infer the value of a nominal attribute based on the values of other observed attributes. Examples: autonomous driving (traffic sign recognition, open lane detection); smart home/building (appliance identification based on electricity utilization).

25 Predictive Modeling: Regression. To infer the value of a continuous attribute based on the values of other observed attributes. Examples: mHealth (monitoring heart rate and body temperature using wearable devices); intelligent transportation system (traffic volume prediction); smart building (electricity/water demand prediction).
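As one concrete instance of regression, the sketch below fits a straight line by ordinary least squares using only the standard library. The slides do not name a specific regression method, so the choice of a single-feature linear model, the function names, and the toy demand data are all assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_value(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: hour index vs. electricity demand (arbitrary units).
hours = [0, 1, 2, 3]
demand = [1, 3, 5, 7]
model = fit_line(hours, demand)
print(model, predict_value(model, 4))
```

The target here is a continuous quantity, which is what separates this task from classification.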

26 Framework for Predictive Modeling [Figure labels: Labeled examples; Unlabeled examples; congestion; No congestion; Training Set; Test Set; Train Model; Model]
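The train-then-apply workflow suggested by this slide can be sketched with a tiny nearest-centroid classifier. The classifier choice, the two features (average speed in km/h, vehicles per minute), and all data values are hypothetical; only the "congestion" / "no congestion" labels and the training/test split come from the slide.

```python
def train(examples):
    """Learn one centroid per label from (features, label) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def accuracy(model, test_set):
    """Fraction of held-out examples that the model labels correctly."""
    correct = sum(predict(model, x) == y for x, y in test_set)
    return correct / len(test_set)


# Hypothetical labeled examples: (speed km/h, vehicles per minute).
training_set = [
    ((15, 40), "congestion"),
    ((20, 35), "congestion"),
    ((60, 10), "no congestion"),
    ((70, 12), "no congestion"),
]
test_set = [((18, 38), "congestion"), ((62, 9), "no congestion")]

model = train(training_set)
print(predict(model, (25, 30)), accuracy(model, test_set))
```

Keeping the test set separate from the training set is what makes the accuracy figure an estimate of performance on unseen data.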

27 Cluster Analysis: Find groups of observations such that the observations in the same group are more similar to each other than to those in other groups. Intra-cluster distances are minimized; inter-cluster distances are maximized.
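One widely used way to pursue this objective is k-means (Lloyd's algorithm), sketched below in plain Python. The slides do not name a clustering algorithm, so k-means is a swapped-in example; the starting centroids are passed in explicitly to keep the run deterministic, and the 2-D points are made up.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until nothing changes."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        updated = [
            mean_point(c) if c else centroids[i]  # keep centroid of empty cluster
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Made-up 2-D locations forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

k-means only reduces intra-cluster distances for a fixed number of clusters and can settle in a local optimum, so the result depends on the starting centroids.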

28 Applications of Cluster Analysis: crime hotspot detection, GPS trajectory segmentation

29 Association Analysis: Extract patterns of frequently co-occurring events
Time         Sensor ID  State
3/1/ :48:05  BR1        OFF
3/1/ :48:07  LR1        ON
3/1/ :48:10  LR6        ON
3/1/ :48:20  BT1        ON
3/1/ :48:40  LR6        OFF
3/1/ :49:30  BT3        ON
Example patterns:
Weekday, 7-8am, BR2 = OFF, BR1 = OFF, LR6 = ON -> LR1 = ON
Weekday, 10-11pm, BR1 = ON, BR2 = ON, LR6 = OFF -> LR1 = OFF
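Patterns of this kind are usually scored with support and confidence. The sketch below computes both for one rule over a handful of hypothetical sensor snapshots; the snapshot contents and the helper names are assumptions, and only the sensor IDs are borrowed from the slide.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent); antecedent must occur at least once."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical snapshots of sensor states observed together.
snapshots = [
    {"BR1=OFF", "LR6=ON", "LR1=ON"},
    {"BR1=OFF", "LR6=ON", "LR1=ON"},
    {"BR1=OFF", "LR6=ON", "LR1=OFF"},
    {"BR1=ON", "LR6=OFF", "LR1=OFF"},
]

rule_if = {"BR1=OFF", "LR6=ON"}
rule_then = {"LR1=ON"}
print(support(snapshots, rule_if | rule_then))
print(confidence(snapshots, rule_if, rule_then))
```

In this toy data the rule holds in 2 of the 3 snapshots where its left-hand side occurs, so its confidence is about 0.67 with a support of 0.5.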

30 Applications of Association Analysis: traffic accident analysis; smart health (adverse drug interactions)

31 Anomaly Detection Detect significant deviations from normal observations
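A very simple way to flag "significant deviations" is a z-score test: mark any reading that lies more than a chosen number of standard deviations from the mean. This is a sketch under that assumption only; the slides do not specify a detection method, and the readings and threshold below are made up.

```python
import statistics


def zscore_anomalies(values, threshold=3.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all readings identical, nothing deviates
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mean) / stdev > threshold
    ]


# Hypothetical water-flow readings with one suspicious spike.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(zscore_anomalies(readings, threshold=2.0))
```

A threshold of 2.0 is used in the example because a single large outlier inflates the standard deviation of a short series, so the spike at index 7 only reaches a z-score of about 2.6.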

32 Applications of Anomaly Detection: smart transportation (congestion detection, sensor fault detection); smart home/building (water theft detection, pipe burst detection)

33 Ranking (Recommendation): Given a query q, recommend items in specific rank order based on their relevance to q. Examples: location-aware services, smart home assistant.

34 Other Challenges: Privacy

35 Other Challenges: Security

36 Summary Mining big data is both a challenge and an opportunity

37 CSE Courses on Data Mining CSE 491/891: Computational Techniques for Large-Scale Data Analysis CSE 881: Data Mining


Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014 Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions

More information

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

More information

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web

More information

The Big Deal about Big Data. Mike Skinner, CPA CISA CITP HORNE LLP

The Big Deal about Big Data. Mike Skinner, CPA CISA CITP HORNE LLP The Big Deal about Big Data Mike Skinner, CPA CISA CITP HORNE LLP Mike Skinner, CPA CISA CITP Senior Manager, IT Assurance & Risk Services HORNE LLP Focus areas: IT security & risk assessment IT governance,

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining Introduction of Information Visualization and Visual Analytics Chapter 4 Data Mining Books! P. N. Tan, M. Steinbach, V. Kumar: Introduction to Data Mining. First Edition, ISBN-13: 978-0321321367, 2005.

More information

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: simmibagga12@gmail.com

More information

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University September 19, 2012

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University September 19, 2012 Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University September 19, 2012 E-Mart No. of items sold per day = 139x2000x20 = ~6 million

More information

Smarter Planet evolution

Smarter Planet evolution Smarter Planet evolution 13/03/2012 2012 IBM Corporation Ignacio Pérez González Enterprise Architect ignacio.perez@es.ibm.com @ignaciopr Mike May Technologies of the Change Capabilities Tendencies Vision

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What

More information

Big Data in Transportation Engineering

Big Data in Transportation Engineering Big Data in Transportation Engineering Nii Attoh-Okine Professor Department of Civil and Environmental Engineering University of Delaware, Newark, DE, USA Email: okine@udel.edu IEEE Workshop on Large Data

More information

DIGITAL UNIVERSE UNIVERSE

DIGITAL UNIVERSE UNIVERSE - - - - - - The - - DIGITAL - - - - - - - - - - - E M C D I G I T A L of OPPORTUNITIES RICH DATA & the Increasing Value of the INTERNET OF THINGS - - - - - - - - - - - - - - - - - - - - - - - - - GET STARTED

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Science: what is possible. Engineering: turn science into an everyday commodity (cheap, safe, reliable, resilient, )

Science: what is possible. Engineering: turn science into an everyday commodity (cheap, safe, reliable, resilient, ) : Big Data Analytics for Renewable Energy Mark J. Embrechts Dept. Industrial and Systems Engineering Rensselaer Polytechnic Institute, Troy, NY, USA What is Data Mining? Data Mining Big Data Analytics

More information

BIG Big Data Public Private Forum

BIG Big Data Public Private Forum DATA STORAGE Martin Strohbach, AGT International (R&D) THE DATA VALUE CHAIN Value Chain Data Acquisition Data Analysis Data Curation Data Storage Data Usage Structured data Unstructured data Event processing

More information

CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016

CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016 CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining Chengkai Li University of Texas at Arlington Spring 2016 Big Data http://dilbert.com/strip/2012-07-29 Big Data http://www.ibmbigdatahub.com/infographic/four-vs-big-data

More information

Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data

Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen 2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems

More information

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data

More information

Big Data Analytics. The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory

Big Data Analytics. The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory Big Data Analytics The Hype and the Hope* Dr. Ted Ralphs Industrial and Systems Engineering Director, COR@L Laboratory * Source: http://www.economistinsights.com/technology-innovation/analysis/hype-and-hope/methodology

More information

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs 1 Big Data Analytics Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs Montevideo, 22 nd November 4 th December, 2015 INFORMATIQUE

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

So Just What Is Big Data? James E. Tcheng, MD, FACC, FSCAI

So Just What Is Big Data? James E. Tcheng, MD, FACC, FSCAI So Just What Is Big Data? James E. Tcheng, MD, FACC, FSCAI Disclosures James E. Tcheng, MD, FACC, FSCAI Affiliations / Financial Relationships / Other RWI ACC Chair, Informatics and Health IT Task Force

More information

Turning Big Data into Big Decisions Delivering on the High Demand for Data

Turning Big Data into Big Decisions Delivering on the High Demand for Data Turning Big Data into Big Decisions Delivering on the High Demand for Data Michael Ho, Vice President of Professional Services Digital Government Institute s Government Big Data Conference, October 31,

More information

Smart Data THE driving force for industrial applications

Smart Data THE driving force for industrial applications Smart Data THE driving force for industrial applications European Data Forum Luxembourg, siemens.com The world is becoming digital User behavior is radically changing based on new business models Newspaper,

More information

Data Mining: Introduction

Data Mining: Introduction Data Mining: Introduction Introducing the course How the course is organized How students are evaluated Deadlines Data Mining [Chapt. 1 of course book] What is it about? The KDD process Relations to other

More information

Big Data: Image & Video Analytics

Big Data: Image & Video Analytics Big Data: Image & Video Analytics How it could support Archiving & Indexing & Searching Dieter Haas, IBM Deutschland GmbH The Big Data Wave 60% of internet traffic is multimedia content (images and videos)

More information

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin Network Big Data: Facing and Tackling the Complexities Xiaolong Jin CAS Key Laboratory of Network Data Science & Technology Institute of Computing Technology Chinese Academy of Sciences (CAS) 2015-08-10

More information

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,

More information

Information Security, PII and Big Data

Information Security, PII and Big Data ITU Workshop on ICT Security Standardization for Developing Countries (Geneva, Switzerland, 15-16 September 2014) Information Security, PII and Big Data Edward (Ted) Humphreys ISO/IEC JTC 1/SC 27 (WG1

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

See the wood for the trees

See the wood for the trees See the wood for the trees Dr. Harald Schöning Head of Research The world is becoming digital socienty government economy Digital Society Digital Government Digital Enterprise 2 Data is Getting Bigger

More information

Data Mining Introduction

Data Mining Introduction Data Mining Introduction Organization Lectures Mondays and Thursdays from 10:30 to 12:30 Lecturer: Mouna Kacimi Office hours: appointment by email Labs Thursdays from 14:00 to 16:00 Teaching Assistant:

More information

Big Data og Smart City. Knut H. H. Johansen CEO esmart System 7. mai 2015

Big Data og Smart City. Knut H. H. Johansen CEO esmart System 7. mai 2015 Big Data og Smart City Knut H. H. Johansen CEO esmart System 7. mai 2015 2 Smart Cities Big Data & Analytics Integrated Operations Smart City? No one definition for smart city > depends smartness comes

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

A New Era Of Analytic

A New Era Of Analytic Penang egovernment Seminar 2014 A New Era Of Analytic Megat Anuar Idris Head, Project Delivery, Business Analytics & Big Data Agenda Overview of Big Data Case Studies on Big Data Big Data Technology Readiness

More information

Big data and its transformational effects

Big data and its transformational effects Big data and its transformational effects Professor Fai Cheng Head of Research & Technology September 2015 Working together for a safer world Topics Lloyd s Register Big Data Data driven world Data driven

More information

Big Data Systems CS 5965/6965 FALL 2014

Big Data Systems CS 5965/6965 FALL 2014 Big Data Systems CS 5965/6965 FALL 2014 Today General course overview Q&A Introduction to Big Data Data Collection Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2014.html

More information

Statistical Challenges with Big Data in Management Science

Statistical Challenges with Big Data in Management Science Statistical Challenges with Big Data in Management Science Arnab Kumar Laha Indian Institute of Management Ahmedabad Analytics vs Reporting Competitive Advantage Reporting Prescriptive Analytics (Decision

More information

BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER

BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER 1 MAKING THE RIGHT DECISSION AT THE RIGHT PLACE AT THE RIGHT TIME 2 THE DATA MULTIPLIER EFFECT AT WORK BUSINESS DRIVEN HUMAN DRIVEN MACHINE DRIVEN

More information

CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science

CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science Dr. Daisy Zhe Wang CISE Department University of Florida August 25th 2014 20 Review Overview of Data Science Why Data

More information

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems

More information

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER Hur hanterar vi utmaningar inom området - Big Data Jan Östling Enterprise Technologies Intel Corporation, NER Legal Disclaimers All products, computer systems, dates, and figures specified are preliminary

More information

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof. CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Data Science Overview Why, What, How, Who Outline Why Data Science?

More information

Majed Al-Ghandour, PhD, PE, CPM Division of Planning and Programming NCDOT 2016 NCAMPO Conference- Greensboro, NC May 12, 2016

Majed Al-Ghandour, PhD, PE, CPM Division of Planning and Programming NCDOT 2016 NCAMPO Conference- Greensboro, NC May 12, 2016 Big Data! Majed Al-Ghandour, PhD, PE, CPM Division of Planning and Programming NCDOT 2016 NCAMPO Conference- Greensboro, NC May 12, 2016 Big Data: Data Analytical Tools for Decision Support 2 Outline Introduce

More information

CIS492 Special Topics: Cloud Computing د. منذر الطزاونة

CIS492 Special Topics: Cloud Computing د. منذر الطزاونة CIS492 Special Topics: Cloud Computing د. منذر الطزاونة Big Data Definition No single standard definition Big Data is data whose scale, diversity, and complexity require new architecture, techniques, algorithms,

More information

Demystifying Big Data Government Agencies & The Big Data Phenomenon

Demystifying Big Data Government Agencies & The Big Data Phenomenon Demystifying Big Data Government Agencies & The Big Data Phenomenon Today s Discussion If you only remember four things 1 Intensifying business challenges coupled with an explosion in data have pushed

More information

Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com

Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com Introduction to Predictive Analytics Dr. Ronen Meiri Outline From big data to predictive analytics Predictive Analytics vs. BI Intelligent platforms What can we do with it. The modeling process. Example

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

From Big Data to Smart Data Thomas Hahn

From Big Data to Smart Data Thomas Hahn Siemens Future Forum @ HANNOVER MESSE 2014 From Big to Smart Hannover Messe 2014 The Evolution of Big Digital data ~ 1960 warehousing ~1986 ~1993 Big data analytics Mining ~2015 Stream processing Digital

More information

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Applications for Business Intelligence, Predictive Analytics and Big Data

Applications for Business Intelligence, Predictive Analytics and Big Data Finance, Management, & Operations Applications for Business Intelligence, Predictive Analytics and Big Data Patrick Bogan, Chief Information Officer, Fuzion Analytics Kyle Korzenowski, Chief Information

More information

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel Big Data and Analytics: Getting Started with ArcGIS Mike Park Erik Hoel Agenda Overview of big data Distributed computation User experience Data management Big data What is it? Big Data is a loosely defined

More information

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012 MEDICAL DATA MINING Timothy Hays, PhD Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012 2 Healthcare in America Is a VERY Large Domain with Enormous Opportunities for Data

More information

Exploiting the power of Big Data

Exploiting the power of Big Data Exploiting the power of Big Data Timos Sellis School of Computer Science and Information Technology timos.sellis@rmit.edu.au ITECHLAW Asia-Pacific Conference, February 26-28, 2014 Melbourne Australia Timeline

More information

Big Data Analytics Process & Building Blocks

Big Data Analytics Process & Building Blocks Big Data Analytics Process & Building Blocks Duen Horng (Polo) Chau Georgia Tech CSE 6242 A / CS 4803 DVA Jan 10, 2013 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica So What s the market s definition of Big Data? Datasets whose volume, velocity, variety

More information

Big Data Driven Knowledge Discovery for Autonomic Future Internet

Big Data Driven Knowledge Discovery for Autonomic Future Internet Big Data Driven Knowledge Discovery for Autonomic Future Internet Professor Geyong Min Chair in High Performance Computing and Networking Department of Mathematics and Computer Science College of Engineering,

More information

BIG DATA: BIG BOOST TO BIG TECH

BIG DATA: BIG BOOST TO BIG TECH BIG DATA: BIG BOOST TO BIG TECH Ms. Tosha Joshi Department of Computer Applications, Christ College, Rajkot, Gujarat (India) ABSTRACT Data formation is occurring at a record rate. A staggering 2.9 billion

More information

Addressing government challenges with big data analytics

Addressing government challenges with big data analytics IBM Software White Paper Government Addressing government challenges with big data analytics 2 Addressing government challenges with big data analytics Contents 2 Introduction 4 How big data analytics

More information
