Recommendations in Mobile Environments. Professor Hui Xiong Rutgers Business School Rutgers University. Rutgers, the State University of New Jersey

Size: px
Start display at page:

Download "Recommendations in Mobile Environments. Professor Hui Xiong Rutgers Business School Rutgers University. Rutgers, the State University of New Jersey"

Transcription

1 1 Recommendations in Mobile Environments Professor Hui Xiong Rutgers Business School Rutgers University ADMA-2014 Rutgers, the State University of New Jersey

2 Big Data

3 3 Big Data Application Requirements Timely observation Timely analysis Timely solution

4 Big Data Talents 4 Technical Knowledge Domain Knowledge Big Data Applications

5 Data Driven Solutions 5 Theoretical top-down solutions Data driven bottomup solutions

6 Data Miners in Big Data Analytics Big Data Analytics Understand goals of business Collaborate in interdisciplinary teams Integrate large volumes of structured and unstructured data Formulate problems, develop solutions Blend statistical modeling, data mining, forecasting, optimization Develop/run integrated software solutions Gain higher visibility Change business operation

7 10 Ways Mobile Tech Is Changing Our World 7 1. Elections Will Never Be The Same 1. Doing Good By Texting 2. Bye-Bye, Wallets 3. The Phone Knows All 4. Your Life Is Fully Mobile 5. The Grid Is Winning 6. A Camera Goes Anywhere 7. Toys Get Unplugged 8. Gadgets Go To Class 9. Disease Can't Hide

8 Human Mobility Human mobility is people s movement trajectories which can be Phone traces or trajectories of driving routes a sequences of posts (like geo-tweets, geo-tagged photos, or check-ins) Indoor Traces and Outdoor Traces.

9 Urban Geography Urban geography is a set of geographic characteristics of a city including road networks, public transportation places of interest (POIs), regional functions

10 Public transportation data

11 Point of Interests (POI)

12 Outdoor Location Traces Taxi GPS trajectories

13 Big Data Application Trends 13 Micro-scope Macro-scope Personalized Recommendation Urban Computing Micro-Scope Classic Macro-Scope Applications/Techniques

14 Big Data Experiences 14 Understanding Data Characteristics Data Distribution, Data Quality etc. Feature Engineering Feature engineering is one of the key strategy for the success of big data analytics. The goal is to explicitly reveal important information to the model by feature selection or feature generation Original features different encoding of the features combined features Instance Selection (particularly mobile environment) The goal is to select the right instances/objects for the underlying data analytics

15 Mobile Recommender Systems 15 Background Revolution in Mobile Devices GPS WiFi Mobile phone The Urgent Demand for Better Service Driving route suggestion Mobile tourist guides Definition Mobile pervasive recommendation is promised to provide mobile users access to personalized recommendations anytime, anywhere.

16 Recommender Systems 16 Definition Recommend the right content and products (items) to the right user at the right time so as to archive the most relevant experience. Dependence Most Popular What s most engaging overall? Personalized Recommendation What s most relevant to me based on my interests and attributes? Related items Behavioral Affinity: People who did X, did Y Similarity: Based on metadata

17 Traditional Recommender Systems 17

18 Traditional Recommender System 18 Problem Formalization With Rating Without Rating Item-1 Item-2 Item-3 Item-4 Item-5 User-1 4? 2 5? User ?? User-3? 2? 3? User-4? Item-1 Item-2 Item-3 Item-4 Item-5 User-1 1? 1 1? User ?? User-3? 1? 1? User-4?

19 Classification of Algorithms 19 Content-based Method Thinking In C++ C++ Hello world Java Hello world Search papers Sequence pattern Social mining network Data mining Text mining Web mining Auto abstract Machine learning Text classify NLP Windows Linux Unix Write Good papers NLU

20 Classification of Algorithms 20 Collaborative Filtering Method User-based CF new

21 Classification of Algorithms 21 Collaborative Filtering Method Item-based CF new

22 Mobile Recommender Systems 22 Route Recommendation Travel Recommendation

23 Challenges for Mobile 23 Recommendation (I) Complexity of the Mobile Data Heterogeneous Spatial and temporal auto-correlation Noisy The Validation Problem No Ratings The Generality Problem Different application domains with different recommendation techniques

24

25 25 Challenges for Mobile Recommendation (II) The Cost Constraints Time Price The Life Cycle Problem The Transplantation Problem Difficult to apply traditional Recommendation techniques for mobile recommendation

26 26 The Characteristics of Mobile Data Two Cases Case 1. Location trace by taxi drivers Case2. The tourism data Why? A good coverage of unique characteristics of mobile data Can be naturally exploited for developing mobile recommender systems They are the real-world data

27 27 The Characteristics of Mobile Data Case 1. Location trace by taxi drivers Data Description GPS traces Location information (Longitude, Latitude), timestamp The operation status (with or without passengers) Experienced drivers can usually have more driving hours and high occupancy rates Inexperienced drivers tend to have less driving hours and low occupancy rates

28 The Characteristics of Mobile Data 28 Case 1. Location trace by taxi drivers Driving pattern comparison The experienced drivers have a wider operation area. The experienced drivers know the roads as well the traffic patterns better.

29 29 The Characteristics of Mobile Data Case 1. Location trace by taxi drivers (KDD-2009) Develop a mobile recommender system Users ~ Taxi drivers Items ~ Potential pick-up points What did we learn? The difference between Mobile RS and traditional RS The items are application-dependent There is some cost to extract items The items are not i.i.d while spatial auto-correlation

30 The Characteristics of Mobile Data 30 Case2. The travel data (ICDM-2011) Data Description Expense records Tourists: ID, travel time Package: ID, name, landscapes, price, travel days Duration: Recommender System Users ~ Tourists Items ~ Packages

31 31 The Characteristics of Mobile Data Case2. The travel data Characteristics of Tourism Data (I) Spatial auto correlation of packages For example, the 1-day Niagara Falls Tour The Sparseness Much sparser than the Netflix data set.

32 The Characteristics of Mobile Data 32 Case2. The travel data Characteristics of Tourism Data(II) The time dependence Packages and tourists have seasonal tendency Packages have a life cycle Short-period packages are more popular

33 Problem Formulation 33 : all probabilities of all pick-up points contained in PTD: Potential travel distance

34 An Example 34 PoCab -> C1 -> C4 or PoCab -> C4 ->C3 -> C2?

35 Two Challenges 35 Mining reliable pick-up point with probability information Computation challenge to search the optimal route

36 MSR Problem with Constraints 36 Computational complexity:

37 37 Recommending Point Generation High-performance Drivers Sufficient Driving Hours High Occupancy Rate Clustering based on Driving Distance Clustering close pick-up points into one pick-up cluster Probability Calculation Ratio of Pick-up Events Happening when cab travels across pick-up clusters

38 38 Potential Travel Distance Function: Two Vectors: PoCab: the position of a cab C1: one pick-up point P(C1): the probability of pick-up event D1: the driving distance Distance Vector: Probability Vector: PTD Function:

39 Potential Travel Distance Function: 39 PoCab: the position of a cab C1: one pick-up point P(C1): the probability of pick-up event D1: the driving distance One Vector: Generally, for, PTD Function:

40 Important Property of PTD Function 40

41 41 Route Dominance By this definition, if a candidate route A is dominated by a candidate route B, A cannot be an optimal route

42 42 Constrained Sub-route Dominance Illustration: the Sub-route Dominance

43 43 If and dominates Then dominates

44 44 Enumerate all sub-routes with length of L Prune dominated constrained sub-routes with length of L Once effort to prune search space offline

45 Skyline Route 45 First find the skyline routes Search the optimal driving route from the set of skyline routes

46 Pruning for SkyRoute 46 Traditional Skyline computing algorithms time-consuming and large memory SkyRoute Prune some candidates including dominated subroutes, at a very early stage The search space will be significant reduced, since lots of candidates containing dominated sub-routes are discarded Gradual effort to continue prune the search space

47 47 First use LCP to prune some candidates Search Skyline Routes from the rest of candidates Gradually prune candidates containing dominated sub-routes Search the optimal route from the set of Skyline Routes

48 48 Travel Package Recommendation Observable cost of travel packages Unobserved user s cost preference Different financial and time affordability of different users Challenges How to incorporate explicit package cost information into traditional collaborative filtering models How to model and learn user s cost preference

49 49 Probabilistic Matrix Factorization (PMF) The log of the posterior distribution over user and item features is given by

50 50 Probabilistic Matrix Factorization (PMF) Models cpmf represents user cost with a vector GcPMF represents user cost with a 2-D Gaussian Distribution

51 Theoretical Abstraction 51 Given: a set of objects O={O 1, O 2,, O n } Find: An ordered subset S={S 1, S 2,, S k } O The order of S 1, S 2,, S k is optimized subject to certain constraints. For taxi driver recommendation, the set O is a set of potential pick-up points For travel package recommendation, the set O is a set of landscapes.

52 Data Driven Solutions 52 Theoretical top-down solutions Data driven bottomup solutions

53 53 Point-of-Interest (POI) Recommendations in Mobile Social Environments (KDD-2013)

54 POI Recommendation 54 Task: Recommending POIs based one users check-in history and other available information Existing Method: collaborative filtering Failed to consider the multiple factors for choosing a POI Lack of integrated analysis of the joint effect of the various factors Challenges Various factors can influence POI check-in: user preferences, geographical influences, and user mobility

55 What s Special about POI Recommendation? 55 User mobility Geographical Influence probability of a user visiting a POI is inversely proportional to the geographic distance Law of geography everything is related to everything else, but nearby things are more related than distant thing Implicit user feedback need to infer user preferences from implicit user feedback in terms of user check-in count data

56 Check-in Patterns 56 All POIs in different regions A user's check-ins in different regions User check-ins in San Francisco Need a model that jointly encodes the personalized preferences, spatial influence, user mobility into the user check-in decision process to learn geographical user preferences for effective POI recommendation

57 General Ideas 57 Users are most likely to check in a number of POIs and these POIs are usually limited to some geographical regions By Tobler's first law of geography, POIs with similar services are likely to be clustered into the same geographical area A user's propensity for a POI Personalized interest Satisfaction Distance Need region partition! Best personalization, maximum satisfaction, at lowest distance cost

58 Geographical Probabilistic Factor Model 58 red plate represents users, blue plate represents POIs, and purple plate represents latent regions

59 59 Model Components Personal Interest preferences: latent factor model User mobility model A user samples a region from all R regions following a multinomial distribution A POI is assigned a normal distribution Geographical influence Distance from region center to the POI the probability of a user choose a POI decays as the power-law of the distance between them

60 Poisson Geo-PFM 60 Why Poisson Poisson distribution is more suitable and effective for modeling the skewed user check-in count Rigorous probabilistic generative process for the model check-in counts dist. and Poisson approximation

61 Experimental Data 61 Foursquare, Gowalla, and Brightkite

62 Summary 62 A general framework to learn geographical preferences for POI recommendation Learned the geographical influence on a user's check-in behavior Effectively modeled the user mobility patterns Extended the latent factor in explicit rating recommendation to implicit feedback recommendation settings Proposed model is flexible and could be extended to incorporate different later factor models

63 Data Miners in Big Data Analytics Big Data Analytics Understand goals of business Collaborate in interdisciplinary teams Integrate large volumes of structured and unstructured data Formulate problems, develop solutions Blend statistical modeling, data mining, forecasting, optimization Develop/run integrated software solutions Gain higher visibility Change business operation

64 Thank You! 64 My WEB site:

Big Data Analytics in Mobile Environments

Big Data Analytics in Mobile Environments 1 Big Data Analytics in Mobile Environments 熊 辉 教 授 罗 格 斯 - 新 泽 西 州 立 大 学 2012-10-2 Rutgers, the State University of New Jersey Why big data: historical view? Productivity versus Complexity (interrelatedness,

More information

Data Analysis on Location-Based Social Networks

Data Analysis on Location-Based Social Networks Data Analysis on Location-Based Social Networks Huiji Gao and Huan Liu Abstract The rapid growth of location-based social networks (LBSNs) has greatly enriched people s urban experience through social

More information

Content-Based Recommendation

Content-Based Recommendation Content-Based Recommendation Content-based? Item descriptions to identify items that are of particular interest to the user Example Example Comparing with Noncontent based Items User-based CF Searches

More information

GeoLife: A Collaborative Social Networking Service among User, Location and Trajectory

GeoLife: A Collaborative Social Networking Service among User, Location and Trajectory GeoLife: A Collaborative Social Networking Service among User, Location and Trajectory Yu Zheng, Xing Xie and Wei-Ying Ma Microsoft Research Asia, 4F Sigma Building, NO. 49 Zhichun Road, Beijing 100190,

More information

The Need for Training in Big Data: Experiences and Case Studies

The Need for Training in Big Data: Experiences and Case Studies The Need for Training in Big Data: Experiences and Case Studies Guy Lebanon Amazon Background and Disclaimer All opinions are mine; other perspectives are legitimate. Based on my experience as a professor

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

MOBILITY DATA MODELING AND REPRESENTATION

MOBILITY DATA MODELING AND REPRESENTATION PART I MOBILITY DATA MODELING AND REPRESENTATION 1 Trajectories and Their Representations Stefano Spaccapietra, Christine Parent, and Laura Spinsanti 1.1 Introduction For a long time, applications have

More information

10/14/11. Big data in science Application to large scale physical systems

10/14/11. Big data in science Application to large scale physical systems Big data in science Application to large scale physical systems Large scale physical systems Large scale systems with spatio-temporal dynamics Propagation of pollutants in air, Water distribution networks,

More information

Applying Data Science to Sales Pipelines for Fun and Profit

Applying Data Science to Sales Pipelines for Fun and Profit Applying Data Science to Sales Pipelines for Fun and Profit Andy Twigg, CTO, C9 @lambdatwigg Abstract Machine learning is now routinely applied to many areas of industry. At C9, we apply machine learning

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Integration of GPS Traces with Road Map

Integration of GPS Traces with Road Map Integration of GPS Traces with Road Map Lijuan Zhang Institute of Cartography and Geoinformatics Leibniz University of Hannover Hannover, Germany +49 511.762-19437 Lijuan.Zhang@ikg.uni-hannover.de Frank

More information

How To Solve The Kd Cup 2010 Challenge

How To Solve The Kd Cup 2010 Challenge A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China catch0327@yahoo.com yanxing@gdut.edu.cn

More information

BUSINESS ANALYTICS. Overview. Lecture 0. Information Systems and Machine Learning Lab. University of Hildesheim. Germany

BUSINESS ANALYTICS. Overview. Lecture 0. Information Systems and Machine Learning Lab. University of Hildesheim. Germany Tomáš Horváth BUSINESS ANALYTICS Lecture 0 Overview Information Systems and Machine Learning Lab University of Hildesheim Germany BA and its relation to BI Business analytics is the continuous iterative

More information

Machine Learning using MapReduce

Machine Learning using MapReduce Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

More information

Collective Behavior Prediction in Social Media. Lei Tang Data Mining & Machine Learning Group Arizona State University

Collective Behavior Prediction in Social Media. Lei Tang Data Mining & Machine Learning Group Arizona State University Collective Behavior Prediction in Social Media Lei Tang Data Mining & Machine Learning Group Arizona State University Social Media Landscape Social Network Content Sharing Social Media Blogs Wiki Forum

More information

A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks

A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks Text Analytics World, Boston, 2013 Lars Hard, CTO Agenda Difficult text analytics tasks Feature extraction Bio-inspired

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Location-Based Social Networks: Users

Location-Based Social Networks: Users Chapter 8 Location-Based Social Networks: Users Yu Zheng Abstract In this chapter, we introduce and define the meaning of location-based social network (LBSN) and discuss the research philosophy behind

More information

ICT Perspectives on Big Data: Well Sorted Materials

ICT Perspectives on Big Data: Well Sorted Materials ICT Perspectives on Big Data: Well Sorted Materials 3 March 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations in

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Advanced Methods for Pedestrian and Bicyclist Sensing

Advanced Methods for Pedestrian and Bicyclist Sensing Advanced Methods for Pedestrian and Bicyclist Sensing Yinhai Wang PacTrans STAR Lab University of Washington Email: yinhai@uw.edu Tel: 1-206-616-2696 For Exchange with University of Nevada Reno Sept. 25,

More information

CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION

CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION N PROBLEM DEFINITION Opportunity New Booking - Time of Arrival Shortest Route (Distance/Time) Taxi-Passenger Demand Distribution Value Accurate

More information

IPTV Recommender Systems. Paolo Cremonesi

IPTV Recommender Systems. Paolo Cremonesi IPTV Recommender Systems Paolo Cremonesi Agenda 2 IPTV architecture Recommender algorithms Evaluation of different algorithms Multi-model systems Valentino Rossi 3 IPTV architecture 4 Live TV Set-top-box

More information

Collaborative Filtering. Radek Pelánek

Collaborative Filtering. Radek Pelánek Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains

More information

Sensors talk and humans sense Part II

Sensors talk and humans sense Part II Sensors talk and humans sense Part II Athena Vakali Palic, 6 th September 2013 OSWINDS group Department of Informatics Aristotle University of Thessaloniki http://oswinds.csd.auth.gr SEN2SOC Architecture

More information

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is

More information

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...

More information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

CABM System - The Best of CAB Software Solutions

CABM System - The Best of CAB Software Solutions CAB Management System CMS Desktop CMS Palm CMS Online Ready? to Computerize your Business? Boost Your Business Using CMS At Prices You Can Afford! User friendly Flexible Efficient Economical Why CMS? The

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

More information

Estimating Twitter User Location Using Social Interactions A Content Based Approach

Estimating Twitter User Location Using Social Interactions A Content Based Approach 2011 IEEE International Conference on Privacy, Security, Risk, and Trust, and IEEE International Conference on Social Computing Estimating Twitter User Location Using Social Interactions A Content Based

More information

Proposed Advance Taxi Recommender System Based On a Spatiotemporal Factor Analysis Model

Proposed Advance Taxi Recommender System Based On a Spatiotemporal Factor Analysis Model Proposed Advance Taxi Recommender System Based On a Spatiotemporal Factor Analysis Model Santosh Thakkar, Supriya Bhosale, Namrata Gawade, Prof. Sonia Mehta Department of Computer Engineering, Alard College

More information

GPS Enabled Taxi Probe s Big Data Processing for Traffic Evaluation of Bangkok Using Apache Hadoop Distributed System

GPS Enabled Taxi Probe s Big Data Processing for Traffic Evaluation of Bangkok Using Apache Hadoop Distributed System GPS Enabled Taxi Probe s Big Data Processing for Traffic Evaluation of Bangkok Using Apache Hadoop Distributed System Paper Identification number: AYRF14-060 Saurav RANJIT 1, Masahiko NAGAI 1, 2, Itti

More information

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala Bayesian networks - Time-series models - Apache Spark & Scala Dr John Sandiford, CTO Bayes Server Data Science London Meetup - November 2014 1 Contents Introduction Bayesian networks Latent variables Anomaly

More information

PhoCA: An extensible service-oriented tool for Photo Clustering Analysis

PhoCA: An extensible service-oriented tool for Photo Clustering Analysis paper:5 PhoCA: An extensible service-oriented tool for Photo Clustering Analysis Yuri A. Lacerda 1,2, Johny M. da Silva 2, Leandro B. Marinho 1, Cláudio de S. Baptista 1 1 Laboratório de Sistemas de Informação

More information

ANALYTICS IN BIG DATA ERA

ANALYTICS IN BIG DATA ERA ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut

More information

life science data mining

life science data mining life science data mining - '.)'-. < } ti» (>.:>,u» c ~'editors Stephen Wong Harvard Medical School, USA Chung-Sheng Li /BM Thomas J Watson Research Center World Scientific NEW JERSEY LONDON SINGAPORE.

More information

Big Data Text Mining and Visualization. Anton Heijs

Big Data Text Mining and Visualization. Anton Heijs Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Big Data Analytics for United Security

Big Data Analytics for United Security Big Data Analytics for United Security What Advantages Does an Agile Network Bring? (Issue 2) By Swift Liu, President Enterprise Networking Product Line Huawei Enterprise Business Group Agile means quick

More information

Clustering Data Streams

Clustering Data Streams Clustering Data Streams Mohamed Elasmar Prashant Thiruvengadachari Javier Salinas Martin gtg091e@mail.gatech.edu tprashant@gmail.com javisal1@gatech.edu Introduction: Data mining is the science of extracting

More information

Data Mining for Web Personalization

Data Mining for Web Personalization 3 Data Mining for Web Personalization Bamshad Mobasher Center for Web Intelligence School of Computer Science, Telecommunication, and Information Systems DePaul University, Chicago, Illinois, USA mobasher@cs.depaul.edu

More information

Business Intelligence, Predictive Analytics, and Geographic Information Systems

Business Intelligence, Predictive Analytics, and Geographic Information Systems Spatial Intelligence The Intersection of Data Warehousing, Business Intelligence, Predictive Analytics, and Geographic Information Systems 1 GeoAnalytics, Inc. 2008, all rights reserved An Evolving Sensibility

More information

Marketing Mix Modelling and Big Data P. M Cain

Marketing Mix Modelling and Big Data P. M Cain 1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored

More information

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

More information

The primary goal of this thesis was to understand how the spatial dependence of

The primary goal of this thesis was to understand how the spatial dependence of 5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial

More information

Load Balancing in Cellular Networks with User-in-the-loop: A Spatial Traffic Shaping Approach

Load Balancing in Cellular Networks with User-in-the-loop: A Spatial Traffic Shaping Approach WC25 User-in-the-loop: A Spatial Traffic Shaping Approach Ziyang Wang, Rainer Schoenen,, Marc St-Hilaire Department of Systems and Computer Engineering Carleton University, Ottawa, Ontario, Canada Sources

More information

Crowdsourcing mobile networks from experiment

Crowdsourcing mobile networks from experiment Crowdsourcing mobile networks from the experiment Katia Jaffrès-Runser University of Toulouse, INPT-ENSEEIHT, IRIT lab, IRT Team Ecole des sciences avancées de Luchon Networks and Data Mining, Session

More information

Healthcare Measurement Analysis Using Data mining Techniques

Healthcare Measurement Analysis Using Data mining Techniques www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik

More information

A Time Efficient Algorithm for Web Log Analysis

A Time Efficient Algorithm for Web Log Analysis A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,

More information

Cloud Analytics for Capacity Planning and Instant VM Provisioning

Cloud Analytics for Capacity Planning and Instant VM Provisioning Cloud Analytics for Capacity Planning and Instant VM Provisioning Yexi Jiang Florida International University Advisor: Dr. Tao Li Collaborator: Dr. Charles Perng, Dr. Rong Chang Presentation Outline Background

More information

Big Data Mining Services and Knowledge Discovery Applications on Clouds

Big Data Mining Services and Knowledge Discovery Applications on Clouds Big Data Mining Services and Knowledge Discovery Applications on Clouds Domenico Talia DIMES, Università della Calabria & DtoK Lab Italy talia@dimes.unical.it Data Availability or Data Deluge? Some decades

More information

Unsupervised Data Mining (Clustering)

Unsupervised Data Mining (Clustering) Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Image classification Image (scene) classification is a fundamental

More information

Data Analytics at NICTA. Stephen Hardy National ICT Australia (NICTA) shardy@nicta.com.au

Data Analytics at NICTA. Stephen Hardy National ICT Australia (NICTA) shardy@nicta.com.au Data Analytics at NICTA Stephen Hardy National ICT Australia (NICTA) shardy@nicta.com.au NICTA Copyright 2013 Outline Big data = science! Data analytics at NICTA Discrete Finite Infinite Machine Learning

More information

Deep Insights Smart Decisions Motionlogic

Deep Insights Smart Decisions Motionlogic Deep Insights Smart Decisions Motionlogic About Motionlogic Big Data business of Deutsche Telekom 100% subsidiary Analytics of people movement behavior and demographic indicators Using anonymized network

More information

Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle

Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases. Andreas Züfle Similarity Search and Mining in Uncertain Spatial and Spatio Temporal Databases Andreas Züfle Geo Spatial Data Huge flood of geo spatial data Modern technology New user mentality Great research potential

More information

dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING

dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING ABSTRACT In most CRM (Customer Relationship Management) systems, information on

More information

Inferential Statistics. Data Mining. ASC September Proving value in complex analytics. 2 Rivers. Information and Data Management

Inferential Statistics. Data Mining. ASC September Proving value in complex analytics. 2 Rivers. Information and Data Management ASC September Proving value in complex analytics 26 th September 2014 John McConnell Information and Data Management 1 1 2 Rivers Research Operational/Transactional Inferential Statistics Inferring parameter

More information

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

More information

Star rating driver traffic and safety behaviour through OBD and smartphone data collection

Star rating driver traffic and safety behaviour through OBD and smartphone data collection International Symposium on Road Safety Behaviour Measurements and Indicators Belgian Road Safety Institute 23 April 2015, Brussels Star rating driver traffic and safety behaviour through OBD and smartphone

More information

Overview. Introduction. Recommender Systems & Slope One Recommender. Distributed Slope One on Mahout and Hadoop. Experimental Setup and Analyses

Overview. Introduction. Recommender Systems & Slope One Recommender. Distributed Slope One on Mahout and Hadoop. Experimental Setup and Analyses Slope One Recommender on Hadoop YONG ZHENG Center for Web Intelligence DePaul University Nov 15, 2012 Overview Introduction Recommender Systems & Slope One Recommender Distributed Slope One on Mahout and

More information

Oracle Real Time Decisions

Oracle Real Time Decisions A Product Review James Taylor CEO CONTENTS Introducing Decision Management Systems Oracle Real Time Decisions Product Architecture Key Features Availability Conclusion Oracle Real Time Decisions (RTD)

More information

CHAPTER VII CONCLUSIONS

CHAPTER VII CONCLUSIONS CHAPTER VII CONCLUSIONS To do successful research, you don t need to know everything, you just need to know of one thing that isn t known. -Arthur Schawlow In this chapter, we provide the summery of the

More information

1 2013 Solera Networks, A Blue Coat Company SOLERA NETWORKS BIG DATA SECURITY ANALYTICS

1 2013 Solera Networks, A Blue Coat Company SOLERA NETWORKS BIG DATA SECURITY ANALYTICS 1 2013 Solera Networks, A Blue Coat Company SOLERA NETWORKS BIG DATA SECURITY ANALYTICS $32.8B 100,000 Cyber Criminals State-Sponsored Spies Hactivists We live in a POST-PREVENTION Amount enterprises are

More information

House Committee on the Judiciary. Subcommittee on the Constitution, Civil Rights, and Civil Liberties

House Committee on the Judiciary. Subcommittee on the Constitution, Civil Rights, and Civil Liberties House Committee on the Judiciary Subcommittee on the Constitution, Civil Rights, and Civil Liberties Hearing on ECPA Reform and the Revolution in Location Based Technologies and Services Testimony of Professor

More information

Invited Applications Paper

Invited Applications Paper Invited Applications Paper - - Thore Graepel Joaquin Quiñonero Candela Thomas Borchert Ralf Herbrich Microsoft Research Ltd., 7 J J Thomson Avenue, Cambridge CB3 0FB, UK THOREG@MICROSOFT.COM JOAQUINC@MICROSOFT.COM

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

A Statistical Text Mining Method for Patent Analysis

A Statistical Text Mining Method for Patent Analysis A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, shjun@cju.ac.kr Abstract Most text data from diverse document databases are unsuitable for analytical

More information

DATA WAREHOUSING AND ITS POTENTIAL USING IN WEATHER FORECAST

DATA WAREHOUSING AND ITS POTENTIAL USING IN WEATHER FORECAST 2.2 DATA WAREHOUSING AND ITS POTENTIAL USING IN WEATHER FORECAST Xiaoguang Tan* Institute of Urban Meteorology, CMA, Beijing, China 1. INTRODUCTION The main function of most current forecaster s workbench

More information

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1

Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1 Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints

More information

Synergistic Sensor Location for Cost-Effective Traffic Monitoring

Synergistic Sensor Location for Cost-Effective Traffic Monitoring Synergistic Sensor Location for Cost-Effective Traffic Monitoring ManWo Ng, Ph.D. Assistant Professor Department of Modeling, Simulation and Visualization Engineering & Department of Civil and Environmental

More information

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.

More information

USING INTELLIGENT TRANSPORTATION SYSTEMS DATA ARCHIVES FOR TRAFFIC SIMULATION APPLICATIONS

USING INTELLIGENT TRANSPORTATION SYSTEMS DATA ARCHIVES FOR TRAFFIC SIMULATION APPLICATIONS USING INTELLIGENT TRANSPORTATION SYSTEMS DATA ARCHIVES FOR TRAFFIC SIMULATION APPLICATIONS Patricio Alvarez, Universidad del Bío Bío, palvarez@ubiobio.cl Mohammed Hadi, Florida International University,

More information

Asian Institute of Technology, Pathum Thani, Thailand Email: saurav.ranjit@ait.ac.th, nagaim@ait.ac.th

Asian Institute of Technology, Pathum Thani, Thailand Email: saurav.ranjit@ait.ac.th, nagaim@ait.ac.th GPS Enabled Taxi Probe s Big Data Processing for Traffic Evaluation of Bangkok Using Apache Hadoop Distributed System Paper Identification number: AYRF14-060 Saurav RANJIT 1, Masahiko NAGAI 1, 2, Itti

More information

Optimization of patient transport dispatching in hospitals

Optimization of patient transport dispatching in hospitals Optimization of patient transport dispatching in hospitals Cyrille Lefèvre and Sophie Marquet Supervisors: Yves Deville and Gildas Avoine 1500 minutes Thesis motivation MedSoc NPO asked us to conduct a

More information

Aggregation of Spatio-temporal and Event Log Databases for Stochastic Characterization of Process Activities

Aggregation of Spatio-temporal and Event Log Databases for Stochastic Characterization of Process Activities Aggregation of Spatio-temporal and Event Log Databases for Stochastic Characterization of Process Activities Rodrigo M. T. Gonçalves, Rui Jorge Almeida, João M. C. Sousa Planning and Scheduling In logistic

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

MCOM VEHICLE TRACKING SYSTEM MANUAL

MCOM VEHICLE TRACKING SYSTEM MANUAL 2015 MCOM VEHICLE TRACKING SYSTEM MANUAL Vehicle Tracking System allows the Department to track, trace and monitor their vehicles in real time using GSM / GPRS technology. It sends the location address,

More information

Interactive information visualization in a conference location

Interactive information visualization in a conference location Interactive information visualization in a conference location Maria Chiara Caschera, Fernando Ferri, Patrizia Grifoni Istituto di Ricerche sulla Popolazione e Politiche Sociali, CNR, Via Nizza 128, 00198

More information

Discovery of User Groups within Mobile Data

Discovery of User Groups within Mobile Data Discovery of User Groups within Mobile Data Syed Agha Muhammad Technical University Darmstadt muhammad@ess.tu-darmstadt.de Kristof Van Laerhoven Darmstadt, Germany kristof@ess.tu-darmstadt.de ABSTRACT

More information

2015 Workshops for Professors

2015 Workshops for Professors SAS Education Grow with us Offered by the SAS Global Academic Program Supporting teaching, learning and research in higher education 2015 Workshops for Professors 1 Workshops for Professors As the market

More information

Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

More information

So, how do you pronounce. Jilles Vreeken. Okay, now we can talk. So, what kind of data? binary. * multi-relational

So, how do you pronounce. Jilles Vreeken. Okay, now we can talk. So, what kind of data? binary. * multi-relational Simply Mining Data Jilles Vreeken So, how do you pronounce Exploratory Data Analysis Jilles Vreeken Jilles Yill less Vreeken Fray can 17 August 2015 Okay, now we can talk. 17 August 2015 The goal So, what

More information

Asynchronous Data Mining Tools at the GES-DISC

Asynchronous Data Mining Tools at the GES-DISC Asynchronous Data Mining Tools at the GES-DISC Long B. Pham, Stephen W. Berrick, Christopher S. Lynnes and Eunice K. Eng NASA Goddard Space Flight Center Distributed Active Archive Center Introduction

More information

Data2Diamonds Turning Information into a Competitive Asset

Data2Diamonds Turning Information into a Competitive Asset WHITE PAPER Data2Diamonds Turning Information into a Competitive Asset In today s business world, information management (IM), business intelligence (BI) and have become critical to compete and thrive.

More information

TEXT ANALYTICS INTEGRATION

TEXT ANALYTICS INTEGRATION TEXT ANALYTICS INTEGRATION A TELECOMMUNICATIONS BEST PRACTICES CASE STUDY VISION COMMON ANALYTICAL ENVIRONMENT Structured Unstructured Analytical Mining Text Discovery Text Categorization Text Sentiment

More information

A Game Theoretical Framework for Adversarial Learning

A Game Theoretical Framework for Adversarial Learning A Game Theoretical Framework for Adversarial Learning Murat Kantarcioglu University of Texas at Dallas Richardson, TX 75083, USA muratk@utdallas Chris Clifton Purdue University West Lafayette, IN 47907,

More information

Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm

Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm Purpose: key concepts in mining frequent itemsets understand the Apriori algorithm run Apriori in Weka GUI and in programatic way 1 Theoretical

More information

The Attendance Management System Based on Micro-cellular and GIS technology

The Attendance Management System Based on Micro-cellular and GIS technology I.J. Education and Management Engineering 2012, 5, 54-59 Published Online May 2012 in MECS (http://www.mecs-press.net) DOI: 10.5815/ijeme.2012.05.09 Available online at http://www.mecs-press.net/ijeme

More information

Text Analytics. A business guide

Text Analytics. A business guide Text Analytics A business guide February 2014 Contents 3 The Business Value of Text Analytics 4 What is Text Analytics? 6 Text Analytics Methods 8 Unstructured Meets Structured Data 9 Business Application

More information

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of

More information

Recommendation Tool Using Collaborative Filtering

Recommendation Tool Using Collaborative Filtering Recommendation Tool Using Collaborative Filtering Aditya Mandhare 1, Soniya Nemade 2, M.Kiruthika 3 Student, Computer Engineering Department, FCRIT, Vashi, India 1 Student, Computer Engineering Department,

More information