Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data



Similar documents
CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

BIG DATA FUNDAMENTALS

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: Vol. 1, Issue 6, October Big Data and Hadoop

Are You Ready for Big Data?

Are You Ready for Big Data?

Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

Big Data. Donald Kossmann & Nesime Tatbul Systems Group ETH Zurich

Industry Impact of Big Data in the Cloud: An IBM Perspective

What happens when Big Data and Master Data come together?

How Big Is Big Data Adoption? Survey Results. Survey Results Big Data Company Strategy... 6

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Data Refinery with Big Data Aspects

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

Deploying Big Data to the Cloud: Roadmap for Success

Problems to store, transfer and process the Big Data 6/2/2016 GIANG TRAN - TTTGIANG2510@GMAIL.COM 1

Analyzing Big Data: The Path to Competitive Advantage

Big Data: Image & Video Analytics

Exploiting Data at Rest and Data in Motion with a Big Data Platform

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg

Auto-Classification for Document Archiving and Records Declaration

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

Getting Started Practical Input For Your Roadmap

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Big Data in Transportation Engineering

Big Data a threat or a chance?

Sources: Summary Data is exploding in volume, variety and velocity timely

We are Big Data A Sonian Whitepaper

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop

Sunnie Chung. Cleveland State University

Big Data Analytics: 14 November 2013

The Next Wave of Data Management. Is Big Data The New Normal?

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica

The Internet of Things

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science

Statistical Challenges with Big Data in Management Science

Information Management course

NoSQL for SQL Professionals William McKnight

Transforming the Telecoms Business using Big Data and Analytics

Turning Big Data into Big Decisions Delivering on the High Demand for Data

Beyond Watson: The Business Implications of Big Data

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.

Big Data-Challenges and Opportunities

BIG DATA: BIG BOOST TO BIG TECH

Page 2 of 5. Big Data = Data Literacy: HP Vertica and IIS

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Big Analytics: A Next Generation Roadmap

The 4 Pillars of Technosoft s Big Data Practice

Big Data Introduction, Importance and Current Perspective of Challenges

Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rijmenam, Think Bigger, 2014

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method

A Brief Outline on Bigdata Hadoop

Demystifying Big Data Government Agencies & The Big Data Phenomenon

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

BIG DATA. - How big data transforms our world. Kim Escherich Executive Innovation Architect, IBM Global Business Services

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data & Analytics for Semiconductor Manufacturing

BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo

Modern (Computational) Approaches to Big Data Analytics. CSC 576 Computer Science, University of Rochester Instructor: Ji Liu

The Big Deal about Big Data. Mike Skinner, CPA CISA CITP HORNE LLP

Boarding to Big data

IJRCS - International Journal of Research in Computer Science ISSN:

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料

Big Data. White Paper. Big Data Executive Overview WP-BD Jafar Shunnar & Dan Raver. Page 1 Last Updated

Impact of Big Data in Oil & Gas Industry. Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India.

"BIG DATA A PROLIFIC USE OF INFORMATION"

Introduction to Predictive Analytics: SPSS Modeler

DATA MANAGEMENT FOR THE INTERNET OF THINGS

BIG DATA TRENDS AND TECHNOLOGIES

Customer Insight Appliance. Enabling retailers to understand and serve their customer

Introduction to the Mathematics of Big Data. Philippe B. Laval

SECURITY MEETS BIG DATA. Achieve Effectiveness And Efficiency. Copyright 2012 EMC Corporation. All rights reserved.

BIG DATA CHALLENGES AND PERSPECTIVES

Mining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University

Big Data: Key Concepts The three Vs

Government Technology Trends to Watch in 2014: Big Data

How To Use Big Data Effectively

IBM Business Analytics software for Insurance

A Hurwitz white paper. Inventing the Future. Judith Hurwitz President and CEO. Sponsored by Hitachi

BIG DATA STRATEGY. Rama Kattunga Chair at American institute of Big Data Professionals. Building Big Data Strategy For Your Organization

ANALYTICS BUILT FOR INTERNET OF THINGS

Data Science, Predictive Analytics & Big Data Analytics Solutions. Service Presentation

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

Transcription:

Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen

2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems IoT 物 联 网 Industry Internet

3 Lecture Outline What is Big Data? Characteristics Challenges Big Data Collection: Sources Mechanisms Big Data Applications: Road Traffic Enforcement Network Traffic Analysis MOOC and Big Data Application in Education Other Applications

4 What is Big Data? Big data is the term for a collection of data sets so big and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications [http://en.wikipedia.org/wiki/big_data]. Sources of big data are mainly from Human through social networking and Devices, particularly, IoT The challenges in big data processing lie not only in the volume, but also in the types of data and the velocity of new data are generated, etc.

5 Three Types of Data Three types of data stored in computer systems: Structured Data: Tables of data in traditional relational databases. SQL is the typical query language Semi-Structured Data: XML files stored in XML databases, as we discussed in Chapter 4 and this chapter. Unstructured Data: Data that are not structured or semi-structured, mainly streamed data like voice, photos, and video files.

6 Challenges of Big Data System Type: Poly-Structured Data: including structured, semi-structured, and unstructured data. unstructured data are the essential part. Volume: Huge and rapidly growing. Processing: Adequate capacity and algorithms to extract information in adequate time, sometimes, in real time. Compromise: in consistency, accuracy and response time, and in long-term and short-term storage (Volatile).

7 Aspects of Big Data Systems: Many Vs Volume Velocity Variety Value Volatile Variability Veracity

8 Aspects of Big Data Systems: Many Vs Value: It is the next big thing after Internet (Communication) and Cloud Computing (Computation). Big data is about Data. Volume: A moving target from petabyte (10 15 bytes), exabyte (10 18 ), zetabyte (10 21 ), to more Velocity: Real-time data require real-time responses. Variety: Data from different sources with different semantics are integrated into different applications. Variability in data structures: poly-structured data Veracity: Accuracy issue: Noise elimination and fault tolerance Volatile: Not all data will be stored, and some will be permanently deleted, big data processing systems are required to selectively store and organize the data to maximize its value.

9 From Big Data Concepts to Domains of Study Volume Velocity Data Processing Analytics Variety Value Volatile Data Collection Data-App Integration Variability Concepts Veracity Data Store Management Infrastructure and Facilities

10 Key Domains of Study Data collection: Human and Devices Infrastructure: Cloud computing and parallel computing Storage and database facilities Communication Management: Organizing data and facilities Processing and Analytic techniques: specifically developed for processing and analyzing big data in specific applications Big Data Applications

11 Lecture Outline What is Big Data? Characteristics Challenges Big Data Collection: Sources Mechanisms Big Data Applications: Road Traffic Enforcement Network Traffic Analysis MOOC and Big Data Application in Education Other Applications

12 Sources and Collection Mechanisms Human Social networks, Facebook, Twitter, Web sites from different organizations Databases and data warehouses Publications: books, journals, Data Collection Archives and legacy documents Computer generated data Photos and images Devices IoT Surveillance camera steam video/audio Telescope and astronomical data Sensor networks outputs 2013 talks on IoT

13 Lecture Outline What is Big Data? Characteristics Challenges Big Data Collection: Sources Mechanisms Big Data Applications: Road Traffic Enforcement Banking Applications Network Traffic Analysis MOOC and Big Data Application in Education Other Applications

14 Road Traffic Rule Enforcement Cameras are installed not any in every intersections, but also between intersections Video taping all road behaviors Illegal driving detection Red light running Speedy Illegal U-turn Double yellow lines crossing Ticket issuing Driver s license suspension

15 Finding a Person via Facial Recognition Facial recognition algorithms can extract relative positions, sizes, shapes of the eyes, nose, ears, cheekbones, and jaw and hash these numbers into a single number The video cameras on the road film and analyze everyone s facial characteristics. To find a person, enter the person s facial characteristics Compare them with data coming from every camera on the road.

16 Big Data Applications in Banking A bank typically service millions of customers; Big data capabilities provides banks the ability to understand their clients with more detail and can deliver targeted personalized offers faster. It enables higher offer and cross sell acceptance rates, which improve customer profitability, satisfaction and retention. While all banks currently perform analysis of customer data to obtain insight to improve customer offers, many are unable to properly process the big amount of data to optimize their offers.

17 IBM Big Data Platform Fiserv for Banking http://www-01.ibm.com/software/data/bigdata/industry-banking.html Fiserv enables banks to deliver improved actionable insight during customer interactions with the contact center. Used by more than 16,000 financial institutions. Benefits include: More accurately predict what happens next, understand customers, and increase customer adoption; Improve processing performance with increased capabilities; Higher revenues through improved cross sell Cost efficiency through higher productivity Higher rate of customer satisfaction/retention Lower campaign and infrastructure costs

18 Network Traffic Monitoring Example (Text Chapter 11 for more detail) Traffic Analysis for Intrusion Detection & Prevention Define features to characterize the traffic behaviors Obtained features from data fields in the IP packets. Save normal & abnormal patterns and compare with traffic flow.

19 Model Differentiating Normal and Abnormal Behaviors Definition: The relative deviation distance (D i ) of feature f i is defined D i f i E( f Rule: Feature selection rule between normal traffic (N) and abnormal (A) traffic behavior: ( Di ) ( Di ) ( D ( D A N 110 i) A i 1 i) N i i ) threshold

20 MOOC MOOC (Massively Open Online Course) is reshaping the education system. Video: http://mslgoee.asu.edu/mediasite/play/aae59213d64e4005aeb8d2e6d6290a561d?catalog=bac218b3-1903-4a98-9b83-93856a9b74f0 Course ontology: Store the concepts required by the course Learning ontology and big data analysis: Collect statistically significant numbers of user interactions, semantic analysis of learner postings, which can be used for creating feedback to learners: Discussion board is no longer adequate if there are thousands of posting in each thread. We need recommendation. We need an ontology system for knowledge organizing and processing

21 Online Lecture

22 Many Other Applications Healthcare: Link all patients data, doctors decisions, and outcomes, require an ontology. National Security: The U.S. Utah Data Center is a big data system for Comprehensive National Cybersecurity Initiative. The mission is classified. Tax: IRS collects all data from all organizations and individuals to detect any tax evasion. Credit Scores: The US companies collect all the finance-related activities of every person with an SSN. Retailers: Not only online retailers like Amazon and ebay, but also traditional retailers like Walmart and Target, have million transactions/hour to process. Real Estate: Collect GPS signals to help home buyers to determine their drive times to work throughout at different times.