Veracity in Big Data Reliability of Routes

Size: px
Start display at page:

Download "Veracity in Big Data Reliability of Routes"

Transcription

1 Veracity in Big Data Reliability of Routes Dr. Tobias Emrich Post-Doctoral Scholar Integrated Media Systems Center (IMSC) Viterbi School of Engineering University of Southern California Los Angeles, CA

2 OUTLINE Big (Uncertain) Data Reliability in Traffic Networks From Uncertainty to Reliability Outlook 2

3 BIG DATA Big data is like everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it... Dan Ariely Center for Advanced Hindsight at Duke University 3

4 BIG DATA Big data is like : everyone talks about it, Studies* on Microsoft and Yahoo production cluster: nobody really knows how to do it, everyone Median thinks Hadoop job everyone is else ~13 GB is doing it, 90% of the jobs are < 100 GB so everyone claims they are doing it... Dan Ariely Center for Advanced Hindsight at Duke University *A. Rowstron, D. Narayanan, A. Donnelly, G. O'Shea and A. Douglas. "Nobody ever got fired for using Hadoop on a cluster, Proceedings of HotCDP, April

5 BIG DATA Variety Volume Veracity Velocity Value 5

6 BIG DATA Variety Volume Veracity Velocity Value 6

7 Uncertainty in (not so big) Data J. Niedermayer, A. Züfle, T. Emrich, M. Renz, N. Mamoulis, L. Chen, H. P. Kriegel: Probabilistic Nearest Neighbor Queries on Uncertain Moving Object Trajectories In Proceedings of the 40th International Conference on Very Large Data Bases (VLDB), Hangzhou, China: , P. Zhang, R. Cheng, N. Mamoulis, M. Renz, A. Züfle, Y. Tang, T. Emrich: Voronoi based Nearest Neighbor Search for Multi Dimensional Uncertain Databases In Proceedings of the 29th International Conference on Data Engineering (ICDE), Brisbane, Australia: , J. Niedermayer, A. Züfle, T. Emrich, M. Renz, N. Mamoulis, L. Chen, H. P. Kriegel: Similarity Search on Uncertain Spatio temporal Data In Proceedings of the 6th Internation Conference on Similarity Search and Applications (SISAP), Coruna, Spain: 43 49, 2013 T. Emrich, H. P. Kriegel, J. Niedermayer, M. Renz, A. Suhartha, A. Züfle: Exploration of monte carlo based probabilistic query processing in uncertain graphs In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM), Maui, HI: , T. Emrich, H. P. Kriegel, N. Mamoulis, M. Renz, A. Züfle: Indexing uncertain spatio temporal data In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM), Maui, HI: , T. Bernecker, T. Emrich, H. P. Kriegel, M. Renz, A. Züfle: Probabilistic Ranking in Fuzzy Object Databases In Proceedings of the 21st ACM Conference on Information and Knowledge Management (CIKM), Maui, HI: , N. Hubig, A. Züfle, T. Emrich, M. A. Nascimento, M. Renz, H. P. Kriegel: Continuous Probabilistic Sum Queries in Wireless Sensor Networks with Ranges In Proceedings of the 24th International Conference on Scientific and Statistical Database Management (SSDBM), Chania, Crete, Greece: , T. Emrich, H. P. Kriegel, N. Mamoulis, M. Renz, A. Züfle: Querying Uncertain Spatio Temporal Data In Proceedings of the 28th International Conference on Data Engineering (ICDE), Washington, DC, T. Bernecker, T. Emrich, H. P. Kriegel, N. Mamoulis, M. Renz, A. Züfle: A Novel Probabilistic Pruning Approach to Speed Up Similarity Queries in Uncertain Databases In Proceedings of the 27th International Conference on Data Engineering (ICDE), Hannover, Germany: , T. Bernecker, L. Chen, T. Emrich, H. P. Kriegel, N. Mamoulis, A. Züfle: Managing Uncertain Spatio Temporal Data In Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Querying and Mining Uncertain Spatio Temporal Data (QUeST), Chicago, IL: 16 20, T. Bernecker, T. Emrich, H. P. Kriegel, M. Renz, S. Zankl, A. Züfle: Efficient Probabilistic Reverse Nearest Neighbor Query Processing on Uncertain Data In Proceedings of the 37th International Conference on Very Large Data Bases (VLDB), Seattle, WA: ,

8 Uncertainty Databases Result 1) Efficient 2) No Confidence Query/ Datamining Uncertainty is inherent in many datasets: Automated Extraction of Information from HTML Sensor Readings Human Observations Predictions Result 1) Efficient Alg. needed 2) Confidence attached 8

9 Reliability in Traffic Networks Route B: ~58 min Usually, I take route A Route A: ~53 min when I have a meeting in the morning, I take route B 9

10 Reliability in Traffic Networks Route B: ~58 min Travel time prediction, incorporating uncertainty 0,3 Route A: ~53 min Probability 0,25 0,2 0,15 0,1 Route A Route B 0, Travel Time 10

11 Reliability in Traffic Networks Predicted travel time of a route vary due to Imprecise Prediction of traffic flow Unpredictable accidents Changing weather conditions Routes differ in variance of travel time Starting at 9am Route A arrives before 10am with 89% Route B arrives before 10am with 99.2% To have 99.2% on route A I have to leave 10 mins earlier Probability 0,3 0,2 0,1 0 Route A: ~53 min Route B: ~58 min Route A Route B Travel Time 11

12 From Uncertainty to Reliability Current approaches Traditional D TT = 14 min 9.00pm S 12

13 From Uncertainty to Reliability Current approaches Traditional 9.00pm S D TT = 14 min TT = 12 min ClearPath 13

14 From Uncertainty to Reliability Considering uncertainty of predictions D 9.00pm S How to predict How to model 14

15 From Uncertainty to Reliability Considering uncertainty of predictions D 9.00pm S How to predict How to model 15

16 From Uncertainty to Reliability Considering uncertainty of predictions D 9.00pm S How to predict What time to use? How to model 16

17 From Uncertainty to Reliability Considering uncertainty of predictions D 9.00pm S How to predict What time to use? How to model 17

18 From Uncertainty to Reliability Considering uncertainty of predictions D TT =? min 9.00pm S How to predict What time to use? How to model How to add up uncertain travel times? 18

19 From Uncertainty to Reliability Considering uncertainty of predictions How to deal with correlations? D TT =? min 9.00pm S How to predict What time to use? How to model How to add up uncertain travel times? 19

20 From Uncertainty to Reliability Considering uncertainty of predictions How to deal with correlations? D TT =? min 9.00pm S How to predict What time to use? How to model How to add up uncertain travel times? 20

21 Outlook Evaluation of the quality of the result Efficient online prediction Extension to new query mechanisms: When do I have to start (and which route do I have to take) when I want to be at USC at 8.00am (or before) with a probability of 99%? 21

22 Questions? Dr. Tobias Emrich 22

23 Uncertainty in Databases Uncertainty is inherent in many datasets: Automated Extraction of Information from HTML (i.e. John works at Google vs. John works at Microsoft) Sensor Readings (i.e. RFID sensors tracking the position of customers) Human Observations (i.e. the seen Bird was either a Raven (75%) or a Crow (25%)) Predictions (i.e. tomorrow its going to rain (10%) or not(90%)) Two approaches to solve this Cleaning (e.g. get rid of uncertainty) Management (e.g. handle the uncertainty) 23

24 24

Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle

Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases Andreas Züfle Geo Spatial Data Huge flood of geo spatial data Modern technology New user mentality Great research potential

More information

9700 South Cass Avenue, Lemont, IL 60439 URL: www.mcs.anl.gov/ fulin

9700 South Cass Avenue, Lemont, IL 60439 URL: www.mcs.anl.gov/ fulin Fu Lin Contact information Education Work experience Research interests Mathematics and Computer Science Division Phone: (630) 252-0973 Argonne National Laboratory E-mail: fulin@mcs.anl.gov 9700 South

More information

Archiving and Sharing Big Data Digital Repositories, Libraries, Cloud Storage

Archiving and Sharing Big Data Digital Repositories, Libraries, Cloud Storage Archiving and Sharing Big Data Digital Repositories, Libraries, Cloud Storage Cyrus Shahabi, Ph.D. Professor of Computer Science & Electrical Engineering Director, Integrated Media Systems Center (IMSC)

More information

Veracity of data. New approaches are emerging to account for uncertainty in data at a giant scale. 2013 IBM Corporation

Veracity of data. New approaches are emerging to account for uncertainty in data at a giant scale. 2013 IBM Corporation Veracity of data 1. The degree to which data is accurate, reliable, certain 2. An emerging platform for organizing, understanding and deriving value from big data Introduction Financial decisions require

More information

BIG DATA POSSIBILITIES AND CHALLENGES

BIG DATA POSSIBILITIES AND CHALLENGES BIG DATA POSSIBILITIES AND CHALLENGES PROFESSOR AND CENTER DIRECTOR WHY BIG DATA? In God we trust - all others must bring data W. Edwards Deming (US engineer and statistician, 1900-1993) WHAT IS BIG DATA?

More information

Geo-Social Co-location Mining

Geo-Social Co-location Mining Geo-Social Co-location Mining Michael Weiler* Klaus Arthur Schmid* Nikos Mamoulis + Matthias Renz* Institute for Informatics, Ludwig-Maximilians-Universität München + Department of Computer Science, University

More information

BigData at UI CS. Hasan Jamil Department of Computer Science University of Idaho

BigData at UI CS. Hasan Jamil Department of Computer Science University of Idaho BigData at UI CS Hasan Jamil Department of Computer Science University of Idaho BigData Four Vs of BigData Volume: Unprecedented size 40 zecabytes by 2020; 2.5 quinjllion bytes each day; 100 terabytes

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

Advanced Methods for Pedestrian and Bicyclist Sensing

Advanced Methods for Pedestrian and Bicyclist Sensing Advanced Methods for Pedestrian and Bicyclist Sensing Yinhai Wang PacTrans STAR Lab University of Washington Email: yinhai@uw.edu Tel: 1-206-616-2696 For Exchange with University of Nevada Reno Sept. 25,

More information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

Load Balancing Using a Co-Simulation/Optimization/Control Approach. Petros Ioannou

Load Balancing Using a Co-Simulation/Optimization/Control Approach. Petros Ioannou Stopping Criteria Freight Transportation Network Network Data Network Simulation Models Network states Optimization: Minimum cost Route Load Balancing Using a Co-Simulation/Optimization/Control Approach

More information

Second International Workshop on Preservation of Evolving Big Data - Panel on Big Data Quality

Second International Workshop on Preservation of Evolving Big Data - Panel on Big Data Quality Second International Workshop on Preservation of Evolving Big Data - Panel on Big Data Quality Angela Bonifati University of Lyon 1 Liris CNRS, France March 15, 2016 Angela Bonifati 2nd Diachron Workshop

More information

Integration of GPS Traces with Road Map

Integration of GPS Traces with Road Map Integration of GPS Traces with Road Map Lijuan Zhang Institute of Cartography and Geoinformatics Leibniz University of Hannover Hannover, Germany +49 511.762-19437 Lijuan.Zhang@ikg.uni-hannover.de Frank

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Mobile Monetization Scenario Design & Big Data. Arther Wu Senior Director of Monetization and Business Operation

Mobile Monetization Scenario Design & Big Data. Arther Wu Senior Director of Monetization and Business Operation Mobile Monetization Scenario Design & Big Data Arther Wu Senior Director of Monetization and Business Operation Agenda Quick update of Cheetah Mobile Ad Scenario Design Big Data / Relation with Advertising

More information

How To Teach Economics

How To Teach Economics YONG CHAO College of Business Phone: (626) 230-5198 University of Louisville E-mail: yongchao.usa@gmail.com Louisville, Kentucky 40292 Web: http://yongchao.us ACADEMIA APPOINTMENT 2010 ~ Present Assistant

More information

Web Mining Seminar CSE 450. Spring 2008 MWF 11:10 12:00pm Maginnes 113

Web Mining Seminar CSE 450. Spring 2008 MWF 11:10 12:00pm Maginnes 113 CSE 450 Web Mining Seminar Spring 2008 MWF 11:10 12:00pm Maginnes 113 Instructor: Dr. Brian D. Davison Dept. of Computer Science & Engineering Lehigh University davison@cse.lehigh.edu http://www.cse.lehigh.edu/~brian/course/webmining/

More information

Modern (Computational) Approaches to Big Data Analytics. CSC 576 Computer Science, University of Rochester Instructor: Ji Liu

Modern (Computational) Approaches to Big Data Analytics. CSC 576 Computer Science, University of Rochester Instructor: Ji Liu Modern (Computational) Approaches to Big Data Analytics CSC 576 Computer Science, University of Rochester Instructor: Ji Liu Big Data in Academy SIGKDD 2014 (program page, found 14 big data, 50+ large

More information

Rackscale- the things that matter GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH

Rackscale- the things that matter GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH Rackscale- the things that matter GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH HTDC 2014 Systems Group = www.systems.ethz.ch Enterprise Computing Center = www.ecc.ethz.ch On the way

More information

Spatio-Temporal Networks:

Spatio-Temporal Networks: Spatio-Temporal Networks: Analyzing Change Across Time and Place WHITE PAPER By: Jeremy Peters, Principal Consultant, Digital Commerce Professional Services, Pitney Bowes ABSTRACT ORGANIZATIONS ARE GENERATING

More information

CAS CS 565, Data Mining

CAS CS 565, Data Mining CAS CS 565, Data Mining Course logistics Course webpage: http://www.cs.bu.edu/~evimaria/cs565-10.html Schedule: Mon Wed, 4-5:30 Instructor: Evimaria Terzi, evimaria@cs.bu.edu Office hours: Mon 2:30-4pm,

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Big Data in Pictures: Data Visualization

Big Data in Pictures: Data Visualization Big Data in Pictures: Data Visualization Huamin Qu Hong Kong University of Science and Technology What is data visualization? Data visualization is the creation and study of the visual representation of

More information

Big-Data Computing with Smart Clouds and IoT Sensing

Big-Data Computing with Smart Clouds and IoT Sensing A New Book from Wiley Publisher to appear in late 2016 or early 2017 Big-Data Computing with Smart Clouds and IoT Sensing Kai Hwang, University of Southern California, USA Min Chen, Huazhong University

More information

Statistical Models in Data Mining

Statistical Models in Data Mining Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

More information

Anomaly Detection and Predictive Maintenance

Anomaly Detection and Predictive Maintenance Anomaly Detection and Predictive Maintenance Rosaria Silipo Iris Adae Christian Dietz Phil Winters Rosaria.Silipo@knime.com Iris.Adae@uni-konstanz.de Christian.Dietz@uni-konstanz.de Phil.Winters@knime.com

More information

Industry 4.0 and Big Data

Industry 4.0 and Big Data Industry 4.0 and Big Data Marek Obitko, mobitko@ra.rockwell.com Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and

More information

Uncertain Data Management for Sensor Networks

Uncertain Data Management for Sensor Networks Uncertain Data Management for Sensor Networks Amol Deshpande, University of Maryland (joint work w/ Bhargav Kanagal, Prithviraj Sen, Lise Getoor, Sam Madden) Motivation: Sensor Networks Unprecedented,

More information

Proposed Advance Taxi Recommender System Based On a Spatiotemporal Factor Analysis Model

Proposed Advance Taxi Recommender System Based On a Spatiotemporal Factor Analysis Model Proposed Advance Taxi Recommender System Based On a Spatiotemporal Factor Analysis Model Santosh Thakkar, Supriya Bhosale, Namrata Gawade, Prof. Sonia Mehta Department of Computer Engineering, Alard College

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

More information

Architecture 3.0 Landscape Analytics

Architecture 3.0 Landscape Analytics Architecture 3.0 Landscape Analytics Jürgen Döllner Hasso- Plattner- Institut Landscape Analytics Big Data Big Data Analytics Visual Analytics Predictive Analytics Landscape Analytics Big Data Data is

More information

The value of data analytics

The value of data analytics Data is the new oil. This phrase expresses in just a few words how data is changing our world just like oil did in the 20th century. According to Forbes, 89% of the business leaders think that (Big) Data

More information

Global Internet Marketing. Presentation for the NTA Montage Meeting Seville, Spain April 17, 2010

Global Internet Marketing. Presentation for the NTA Montage Meeting Seville, Spain April 17, 2010 Global Internet Marketing Presentation for the NTA Montage Meeting Seville, Spain April 17, 2010 What I Will Cover Current Trends The 5 Most Essential Things you Should be Doing Online Case Study Information

More information

Research of Postal Data mining system based on big data

Research of Postal Data mining system based on big data 3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication

More information

Global Data Center Location Insights March 2013

Global Data Center Location Insights March 2013 Global Data Center Location March 2013 This report is solely for the use of Talent Neuron clients and Talent Neuron Subscribers. No part of it may be circulated, quoted, or reproduced for distribution

More information

Dimensionalizing Big Data. WA State vs. peers. Building on strengths CONTENTS. McKinsey & Company 1

Dimensionalizing Big Data. WA State vs. peers. Building on strengths CONTENTS. McKinsey & Company 1 CONTENTS Building on strengths 1 Printed 2/26/2015 12:55 PM Pacific Standard Time WA State vs. peers Last Modified 3/2/2015 10:17 AM Pacific Standard Time Dimensionalizing Big Data Big Data: big and getting

More information

The Big Data Paradigm Shift. Insight Through Automation

The Big Data Paradigm Shift. Insight Through Automation The Big Data Paradigm Shift Insight Through Automation Agenda The Problem Emcien s Solution: Algorithms solve data related business problems How Does the Technology Work? Case Studies 2013 Emcien, Inc.

More information

Keywords: Mobility Prediction, Location Prediction, Data Mining etc

Keywords: Mobility Prediction, Location Prediction, Data Mining etc Volume 4, Issue 4, April 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Data Mining Approach

More information

Survey On: Nearest Neighbour Search With Keywords In Spatial Databases

Survey On: Nearest Neighbour Search With Keywords In Spatial Databases Survey On: Nearest Neighbour Search With Keywords In Spatial Databases SayaliBorse 1, Prof. P. M. Chawan 2, Prof. VishwanathChikaraddi 3, Prof. Manish Jansari 4 P.G. Student, Dept. of Computer Engineering&

More information

MONIC and Followups on Modeling and Monitoring Cluster Transitions

MONIC and Followups on Modeling and Monitoring Cluster Transitions MONIC and Followups on Modeling and Monitoring Cluster Transitions Myra Spiliopoulou 1, Eirini Ntoutsi 2, Yannis Theodoridis 3, and Rene Schult 4 1 Otto-von-Guericke University of Magdeburg, Germany, myra@iti.cs.uni-magdeburg.de,

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

How To Understand The History Of Navigation In French Marine Science

How To Understand The History Of Navigation In French Marine Science E-navigation, from sensors to ship behaviour analysis Laurent ETIENNE, Loïc SALMON French Naval Academy Research Institute Geographic Information Systems Group laurent.etienne@ecole-navale.fr loic.salmon@ecole-navale.fr

More information

Collaborations between Official Statistics and Academia in the Era of Big Data

Collaborations between Official Statistics and Academia in the Era of Big Data Collaborations between Official Statistics and Academia in the Era of Big Data World Statistics Day October 20-21, 2015 Budapest Vijay Nair University of Michigan Past-President of ISI vnn@umich.edu What

More information

Big Data: A Closer Look

Big Data: A Closer Look Big Data: A Closer Look What is Big Data? Increasingly popular search query http://www.google.com/trends/explore#q=big%20data What is Big Data? the V s Characteristics of Big Data Volume http://www.businessweek.com/articles/2014-03-06/interactive-graphichow-big-is-big-data

More information

DIGITAL MARKETING STRATEGIES Leveraging The Back-End Tools

DIGITAL MARKETING STRATEGIES Leveraging The Back-End Tools DIGITAL MARKETING STRATEGIES Leveraging The Back-End Tools Professional Background RACING INDUSTRY EXPERIENCE: First Job Out of Undergrad: - Arlington Park, Assistant to the VP of Marketing - Sponsorship

More information

Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013

Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013 Big Data & Analytics: Your concise guide (note the irony) Wednesday 27th November 2013 Housekeeping 1. Any questions coming out of today s presentation can be discussed in the bar this evening 2. OCF is

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline

More information

A Benchmark to Evaluate Mobile Video Upload to Cloud Infrastructures

A Benchmark to Evaluate Mobile Video Upload to Cloud Infrastructures A Benchmark to Evaluate Mobile Video Upload to Cloud Infrastructures Afsin Akdogan, Hien To, Seon Ho Kim and Cyrus Shahabi Integrated Media Systems Center University of Southern California, Los Angeles,

More information

Online Content Optimization Using Hadoop. Jyoti Ahuja Dec 20 2011

Online Content Optimization Using Hadoop. Jyoti Ahuja Dec 20 2011 Online Content Optimization Using Hadoop Jyoti Ahuja Dec 20 2011 What do we do? Deliver right CONTENT to the right USER at the right TIME o Effectively and pro-actively learn from user interactions with

More information

"BIG DATA A PROLIFIC USE OF INFORMATION"

BIG DATA A PROLIFIC USE OF INFORMATION Ojulari Moshood Cameron University - IT4444 Capstone 2013 "BIG DATA A PROLIFIC USE OF INFORMATION" Abstract: The idea of big data is to better use the information generated by individual to remake and

More information

Conference and Journal Publications in Database Systems and Theory

Conference and Journal Publications in Database Systems and Theory Conference and Journal Publications in Database Systems and Theory This document rates the conferences and journal publications in the database area with respect to quality and impact. As in many other

More information

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group

MLg. Big Data and Its Implication to Research Methodologies and Funding. Cornelia Caragea TARDIS 2014. November 7, 2014. Machine Learning Group Big Data and Its Implication to Research Methodologies and Funding Cornelia Caragea TARDIS 2014 November 7, 2014 UNT Computer Science and Engineering Data Everywhere Lots of data is being collected and

More information

Query Selectivity Estimation for Uncertain Data

Query Selectivity Estimation for Uncertain Data Query Selectivity Estimation for Uncertain Data Sarvjeet Singh *, Chris Mayfield *, Rahul Shah #, Sunil Prabhakar * and Susanne Hambrusch * * Department of Computer Science, Purdue University # Department

More information

Post-Graduation Survey Results 2014 Dietrich College of Humanities & Social Sciences STATISTICS Bachelor of Science

Post-Graduation Survey Results 2014 Dietrich College of Humanities & Social Sciences STATISTICS Bachelor of Science Post-Graduation Survey Results 2014 Bachelor of Science EMPLOYERS AND JOB TITLES Employer Job Title City State/Country Major Alteryx Solutions Developer Bolder CO Econ-Stats Annalect Data Scientist New

More information

Decentralized Utility-based Sensor Network Design

Decentralized Utility-based Sensor Network Design Decentralized Utility-based Sensor Network Design Narayanan Sadagopan and Bhaskar Krishnamachari University of Southern California, Los Angeles, CA 90089-0781, USA narayans@cs.usc.edu, bkrishna@usc.edu

More information

Dataset Preparation and Indexing for Data Mining Analysis Using Horizontal Aggregations

Dataset Preparation and Indexing for Data Mining Analysis Using Horizontal Aggregations Dataset Preparation and Indexing for Data Mining Analysis Using Horizontal Aggregations Binomol George, Ambily Balaram Abstract To analyze data efficiently, data mining systems are widely using datasets

More information

Fast Data in the Era of Big Data: Twitter s Real-

Fast Data in the Era of Big Data: Twitter s Real- Fast Data in the Era of Big Data: Twitter s Real- Time Related Query Suggestion Architecture Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, Jimmy Lin Presented by: Rania Ibrahim 1 AGENDA Motivation

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Data Cleansing for Remote Battery System Monitoring

Data Cleansing for Remote Battery System Monitoring Data Cleansing for Remote Battery System Monitoring Gregory W. Ratcliff Randall Wald Taghi M. Khoshgoftaar Director, Life Cycle Management Senior Research Associate Director, Data Mining and Emerson Network

More information

Big Data, Statistics, and the Internet

Big Data, Statistics, and the Internet Big Data, Statistics, and the Internet Steven L. Scott April, 4 Steve Scott (Google) Big Data, Statistics, and the Internet April, 4 / 39 Summary Big data live on more than one machine. Computing takes

More information

class 1 welcome to CS265! BIG DATA SYSTEMS prof. Stratos Idreos

class 1 welcome to CS265! BIG DATA SYSTEMS prof. Stratos Idreos class 1 welcome to CS265 BIG DATA SYSTEMS prof. plan for today big data + systems class structure + logistics who is who project ideas 2 data vs knowledge 3 big data 4 daily data 2.5 exabytes years 2012

More information

Paolo Costa costa@imperial.ac.uk

Paolo Costa costa@imperial.ac.uk joint work with Ant Rowstron, Austin Donnelly, and Greg O Shea (MSR Cambridge) Hussam Abu-Libdeh, Simon Schubert (Interns) Paolo Costa costa@imperial.ac.uk Paolo Costa CamCube - Rethinking the Data Center

More information

CS 207 - Data Science and Visualization Spring 2016

CS 207 - Data Science and Visualization Spring 2016 CS 207 - Data Science and Visualization Spring 2016 Professor: Sorelle Friedler sorelle@cs.haverford.edu An introduction to techniques for the automated and human-assisted analysis of data sets. These

More information

Context-Aware Online Traffic Prediction

Context-Aware Online Traffic Prediction Context-Aware Online Traffic Prediction Jie Xu, Dingxiong Deng, Ugur Demiryurek, Cyrus Shahabi, Mihaela van der Schaar University of California, Los Angeles University of Southern California J. Xu, D.

More information

Real Time Bus Monitoring System by Sharing the Location Using Google Cloud Server Messaging

Real Time Bus Monitoring System by Sharing the Location Using Google Cloud Server Messaging Real Time Bus Monitoring System by Sharing the Location Using Google Cloud Server Messaging Aravind. P, Kalaiarasan.A 2, D. Rajini Girinath 3 PG Student, Dept. of CSE, Anand Institute of Higher Technology,

More information

Fundamentals of Visualizing Biological Data

Fundamentals of Visualizing Biological Data Fundamentals of Visualizing Biological Data Marc Streit marc.streit@jku.at Marc Streit, Johannes Kepler University Linz Marc Streit, Johannes Kepler University Linz Presentation 2 Interactive Marc Streit,

More information

Statistics, Big Data and Data Science!?

Statistics, Big Data and Data Science!? Statistics, Big Data and Data Science!? Prof. Dr. Göran Kauermann Ludwig-Maximilians-Universität Munich, Germany Statistics, Big Data and Data Science Statistics Founded around 1900 with the seminal work

More information

CALL for PARTICIPATION

CALL for PARTICIPATION CALL for PARTICIPATION In conjunction with PREMI 2013 IUPRAI Workshop on Big Data: A Soft Computing Perspective Date: 14.12.2013 Place: ISI, Kolkata The main challenges in handling Big Data lie not only

More information

What is Big Data? BCS Aberdeen Branch 6 November 2014

What is Big Data? BCS Aberdeen Branch 6 November 2014 What is Big Data? BCS Aberdeen Branch 6 November 2014 Keith Gordon Soldier Teacher Data Manager Engineer Information Systems Professional Standards Expert Big Data Sceptic What they say The overeager adoption

More information

Big Data a threat or a chance?

Big Data a threat or a chance? Big Data a threat or a chance? Helwig Hauser University of Bergen, Dept. of Informatics Big Data What is Big Data? well, lots of data, right? we come back to this in a moment. certainly, a buzz-word but

More information

DRIVEN BY TIME WINDOWS: PREDICTIVE STRATEGIES

DRIVEN BY TIME WINDOWS: PREDICTIVE STRATEGIES DRIVEN BY TIME WINDOWS: PREDICTIVE STRATEGIES FOR REAL-TIME PICK-UP AND DELIVERY OPERATIONS Hani Mahmassani Lan Jiang Industry Workshop: The Fight for the Last Mile Northwestern University Transportation

More information

Aggregation of Spatio-temporal and Event Log Databases for Stochastic Characterization of Process Activities

Aggregation of Spatio-temporal and Event Log Databases for Stochastic Characterization of Process Activities Aggregation of Spatio-temporal and Event Log Databases for Stochastic Characterization of Process Activities Rodrigo M. T. Gonçalves, Rui Jorge Almeida, João M. C. Sousa Planning and Scheduling In logistic

More information

Shareability and Locality Aware Scheduling Algorithm in Hadoop for Mobile Cloud Computing

Shareability and Locality Aware Scheduling Algorithm in Hadoop for Mobile Cloud Computing Shareability and Locality Aware Scheduling Algorithm in Hadoop for Mobile Cloud Computing Hsin-Wen Wei 1,2, Che-Wei Hsu 2, Tin-Yu Wu 3, Wei-Tsong Lee 1 1 Department of Electrical Engineering, Tamkang University

More information

Questions and Answers Regarding the Watering Index and ET (EvapoTranspiration)

Questions and Answers Regarding the Watering Index and ET (EvapoTranspiration) Questions and Answers Regarding the Watering Index and ET (EvapoTranspiration) What is the Watering Index? The Watering Index is a scientifically based guide that was created to help people adjust watering

More information

Big Data Management and Analytics

Big Data Management and Analytics Big Data Management and Analytics Lecture Notes Winter semester 2015 / 2016 Ludwig-Maximilians-University Munich Prof. Dr. Matthias Renz 2015 Based on lectures by Donald Kossmann (ETH Zürich), as well

More information

Evaluation of Unlimited Storage: Towards Better Data Access Model for Sensor Network

Evaluation of Unlimited Storage: Towards Better Data Access Model for Sensor Network Evaluation of Unlimited Storage: Towards Better Data Access Model for Sensor Network Sagar M Mane Walchand Institute of Technology Solapur. India. Solapur University, Solapur. S.S.Apte Walchand Institute

More information

BIG DATA FOR MODELLING 2.0

BIG DATA FOR MODELLING 2.0 BIG DATA FOR MODELLING 2.0 ENHANCING MODELS WITH MASSIVE REAL MOBILITY DATA DATA INTEGRATION www.ptvgroup.com Lorenzo Meschini - CEO, PTV SISTeMA COST TU1004 final Conference www.ptvgroup.com Paris, 11

More information

1 Results from Prior Support

1 Results from Prior Support 1 Results from Prior Support Dr. Shashi Shekhar s work has been supported by multiple NSF grants [21, 23, 18, 14, 15, 16, 17, 19, 24, 22]. His most recent grant relating to spatiotemporal network databases

More information

Data + Science Towards Organizational Excellence. eric Choo

Data + Science Towards Organizational Excellence. eric Choo Data + Science Towards Organizational Excellence eric Choo Business Intelligence & Big Data Analytics SoftSource Big Data Solutions Business Intelligence & Data EVA Services D a t a E x p l o r a t i o

More information

A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION

A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS Kyoungjin Park Alper Yilmaz Photogrammetric and Computer Vision Lab Ohio State University park.764@osu.edu yilmaz.15@osu.edu ABSTRACT Depending

More information

International comparisons of road safety using Singular Value Decomposition

International comparisons of road safety using Singular Value Decomposition International comparisons of road safety using Singular Value Decomposition Siem Oppe D-2001-9 International comparisons of road safety using Singular Value Decomposition D-2001-9 Siem Oppe Leidschendam,

More information

Big Data Analytics on Big Spatial Database

Big Data Analytics on Big Spatial Database Big Data Analytics on Big Spatial Database Prepared by Raymond Wong Presented by Raymond Wong raywong@cse HKUST 1 Do you know what is the hot topic in June and July in 2014 (last year)? World Cup HKUST

More information

Distributed Continuous Range Query Processing on Moving Objects

Distributed Continuous Range Query Processing on Moving Objects Distributed Continuous Range Query Processing on Moving Objects Haojun Wang, Roger Zimmermann, and Wei-Shinn Ku University of Southern California, Computer Science Department, Los Angeles, CA, USA {haojunwa,

More information

INTEROPERABILITY IN DATA WAREHOUSES

INTEROPERABILITY IN DATA WAREHOUSES INTEROPERABILITY IN DATA WAREHOUSES Riccardo Torlone Roma Tre University http://torlone.dia.uniroma3.it/ SYNONYMS Data warehouse integration DEFINITION The term refers to the ability of combining the content

More information

1.5.3 Project 3: Traffic Monitoring

1.5.3 Project 3: Traffic Monitoring 1.5.3 Project 3: Traffic Monitoring This project aims to provide helpful information about traffic in a given geographic area based on the history of traffic patterns, current weather, and time of the

More information

Nobody ever got fired for using Hadoop on a cluster

Nobody ever got fired for using Hadoop on a cluster Nobody ever got fired for using Hadoop on a cluster Antony Rowstron Dushyanth Narayanan Austin Donnelly Greg O Shea Andrew Douglas Microsoft Research, Cambridge Abstract The norm for data analytics is

More information

Comments on the Meaning of as appropriate. Bernard D. Goldstein, MD University of Pittsburgh Graduate School of Public Health bdgold@pitt.

Comments on the Meaning of as appropriate. Bernard D. Goldstein, MD University of Pittsburgh Graduate School of Public Health bdgold@pitt. Comments on the Meaning of as appropriate Bernard D. Goldstein, MD University of Pittsburgh Graduate School of Public Health bdgold@pitt.edu EPA shall incorporate, as appropriate, based on chemical- specific

More information

Ramandeep S. Randhawa

Ramandeep S. Randhawa Ramandeep S. Randhawa Contact BRI 401P Phone: (213) 740-1042 Marshall School of Business Fax: (213) 740-7313 University of Southern California Email: ramandeep.randhawa@marshall.usc.edu Los Angeles, CA

More information

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel Big Data and Analytics: Getting Started with ArcGIS Mike Park Erik Hoel Agenda Overview of big data Distributed computation User experience Data management Big data What is it? Big Data is a loosely defined

More information

Department of Political Science Phone: (805) 893-5902 University of California, Santa Barbara Fax: (805) 893-3309

Department of Political Science Phone: (805) 893-5902 University of California, Santa Barbara Fax: (805) 893-3309 NEIL NARANG Department of Political Science Phone: (805) 893-5902 University of California, Santa Barbara Fax: (805) 893-3309 Ellison Hall, Office 3710 Email: narangn@polsci.ucsb.edu Santa Barbara, CA

More information

A Multimodal Trip Planning System Incorporating the Park-and-Ride Mode and Real-time Traffic/Transit Information

A Multimodal Trip Planning System Incorporating the Park-and-Ride Mode and Real-time Traffic/Transit Information A Multimodal Trip Planning System Incorporating the Park-and-Ride Mode and Real-time Traffic/Transit Information Jing-Quan Li, Kun Zhou, Liping Zhang, and Wei-Bin Zhang Abstract California PATH, University

More information

Better Buildings Neighborhood Program Data and Evaluation Peer Exchange Call: Homeowner and Contractor Surveys

Better Buildings Neighborhood Program Data and Evaluation Peer Exchange Call: Homeowner and Contractor Surveys Better Buildings Neighborhood Program Data and Evaluation Peer Exchange Call: Homeowner and Contractor Surveys Call Slides and Discussion Summary January 19, 2012 1 Agenda Call Logistics and Attendance

More information

Approaches for parallel data loading and data querying

Approaches for parallel data loading and data querying 78 Approaches for parallel data loading and data querying Approaches for parallel data loading and data querying Vlad DIACONITA The Bucharest Academy of Economic Studies diaconita.vlad@ie.ase.ro This paper

More information

HOW TO START YOUR SUCCESSFUL CAREER IN SEARCH ENGINE OPTIMIZATION DIGITAL MARKETING

HOW TO START YOUR SUCCESSFUL CAREER IN SEARCH ENGINE OPTIMIZATION DIGITAL MARKETING HOW TO START YOUR SUCCESSFUL CAREER IN SEARCH ENGINE OPTIMIZATION DIGITAL MARKETING WELCOME If you re underemployed (maybe that barista job is wearing thin?), or you never quite found your post-college

More information

Complex, true real-time analytics on massive, changing datasets.

Complex, true real-time analytics on massive, changing datasets. Complex, true real-time analytics on massive, changing datasets. A NoSQL, all in-memory enabling platform technology from: Better Questions Come Before Better Answers FinchDB is a NoSQL, all in-memory

More information

Schedule Risk Analysis A large DoD Development Program

Schedule Risk Analysis A large DoD Development Program Schedule Risk Analysis A large DoD Development Program So, who s the dude? Jim Aksel PMP, PMI SP Microsoft MVP 35 Years experience aerospace/defense Co Author PMI Practice Standard for Scheduling Program

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Smarter Planet evolution

Smarter Planet evolution Smarter Planet evolution 13/03/2012 2012 IBM Corporation Ignacio Pérez González Enterprise Architect ignacio.perez@es.ibm.com @ignaciopr Mike May Technologies of the Change Capabilities Tendencies Vision

More information