Big Data Technology Recommendation Challenges in Web Media Sites


Big Data Technology
Recommendation Challenges in Web Media Sites
Course Summary
Edward Bortnikov & Ronny Lempel, Yahoo! Labs, Haifa

Recommender Systems: A Canonical Big Data Problem
- Pioneered by Amazon in the mid/late 90s
- Today applied everywhere: shopping sites, content sites, multimedia streaming sites, social networks
- The field has a dedicated conference and easily merits a dedicated academic course

Recommendation in Social Networks

Recommender Systems: Example of Effectiveness
- 1988: Random House releases "Touching the Void", a book by a mountain climber detailing a harrowing account of near death in the Andes
- It got good reviews but modest commercial success
- 1999: "Into Thin Air", another mountain-climbing tragedy book, becomes a best-seller
- By virtue of Amazon's recommender system, "Touching the Void" started to sell again, requiring a new edition
- A revised paperback edition spent 14 weeks on the New York Times bestseller list
From "The Long Tail", by Chris Anderson

The Netflix Challenge
Slides 4-6 courtesy of Yehuda Koren, member of Challenge winners "BellKor's Pragmatic Chaos"
"We're quite curious, really. To the tune of one million dollars." (Netflix Prize rules)
- Goal: improve on Netflix's existing movie recommender
- The open-to-the-public contest began October 2, 2006; winners announced September 2009
- Prize, based on reduction in root mean squared error (RMSE) on test data:
  - $1 million grand prize for 10% improvement on the Cinematch result
  - $50K 2007 progress prize for 8.43% improvement
  - $50K 2008 progress prize for 9.44% improvement
- Netflix gets full rights to use IP developed by the winners
- Example of crowdsourcing: Netflix basically got over 100 researcher years (and good publicity) for $1.1M
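As a small illustration of the prize criterion, the sketch below computes RMSE and the relative RMSE reduction of one predictor over a baseline. The function names and the toy ratings are assumptions made for this example; they are not taken from the Netflix data or from the contest's evaluation code.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate reduces the baseline's RMSE."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


# Toy ratings, invented for illustration only.
actual = [4, 3, 5, 1]
baseline = [3, 3, 4, 3]    # stands in for an existing recommender
candidate = [4, 3, 4, 2]   # stands in for a contest submission

improvement = relative_improvement(rmse(baseline, actual), rmse(candidate, actual))
print(f"RMSE reduced by {improvement:.1%}")
```

Under this definition, a 10% improvement means the candidate's RMSE is at most 0.9 times the baseline's RMSE on the same test ratings.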

Netflix Movie Ratings Data
- Training data: 100 million ratings, 480,000 users, 17,770 movies; 6 years of data
- Test data: last few ratings of each user (2.8 million)
- Dates of ratings are given
[Figure: training data shown as (user, movie, score) rows; test data shown as (user, movie) rows]

Types of Recommender Systems
At a high level, two main techniques:
- Content-based recommendation: characterizes the affinity of users to certain features (content, metadata) of their preferred items. Lots of classification technology under the hood
- Collaborative filtering: exploits similar consumption and preference patterns between users (see next slides)
- Many state-of-the-art systems combine both techniques

Collaborative Filtering: Mathematical Abstraction
- Consider a consumption matrix $R$ of users and items
- $r_{i,k} = 1$ whenever person $i$ consumed item $k$
- In other cases, $r_{i,k}$ might be person $i$'s rating of item $k$
- The matrix $R$ is typically very sparse and often very large
- Real-life task: top-k recommendation. From among the items that weren't yet consumed by each user, predict which ones the user would most enjoy
- Related task on ratings data: matrix completion. Predict users' ratings for items they have yet to rate, i.e. complete missing values
[Figure: matrix $R$ of size $|U| \times |I|$, rows are users, columns are items]

Collaborative Filtering: Neighborhood Models
- Compute the similarity of items [users] to each other
  - Items are considered similar when users tend to rate them similarly or to co-consume them
  - Users are considered similar when they tend to co-consume items or rate items similarly
- Recommend to a user:
  - Items similar to items he/she has already consumed [rated highly]
  - Items consumed [rated highly] by similar users
- Key questions:
  - How exactly to define pair-wise similarities?
  - How to combine them into quality recommendations?
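The neighborhood idea can be sketched on binary consumption data as follows. Cosine similarity over co-consuming users and the sum-of-similarities scoring rule are choices made for this example, one possible answer to the two "key questions" rather than the method the slides prescribe; the user and item names are invented.

```python
import math
from collections import defaultdict


def item_similarities(consumed):
    """Cosine similarity between items, given a {user: set_of_items} mapping."""
    users_of = defaultdict(set)
    for user, items in consumed.items():
        for item in items:
            users_of[item].add(user)
    sims = defaultdict(dict)
    for a in users_of:
        for b in users_of:
            if a == b:
                continue
            overlap = len(users_of[a] & users_of[b])
            if overlap:
                sims[a][b] = overlap / math.sqrt(len(users_of[a]) * len(users_of[b]))
    return sims


def recommend(user, consumed, sims, k=2):
    """Top-k unconsumed items, scored by summed similarity to consumed items."""
    scores = defaultdict(float)
    for seen in consumed[user]:
        for other, similarity in sims.get(seen, {}).items():
            if other not in consumed[user]:
                scores[other] += similarity
    # Sort by descending score; break ties by item name for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Toy consumption data, invented for illustration only.
consumed = {
    "ann": {"a", "b"},
    "bob": {"a", "b", "c"},
    "cat": {"b", "c"},
    "dan": {"a"},
}
sims = item_similarities(consumed)
print(recommend("dan", consumed, sims))
```

Each row of the sparse matrix $R$ is represented here as a set of consumed items, which avoids materializing the zeros.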

Collaborative Filtering: Matrix Factorization
- Latent factor models (LFM): map both users and items to some $f$-dimensional space $\mathbb{R}^f$, i.e. produce $f$-dimensional vectors $v_i$ for each user $i$ and $w_j$ for each item $j$
- Define rating estimates as inner products: $q_{ij} = \langle v_i, w_j \rangle$
- Main problem: finding a mapping of users and items to the latent factor space that produces good estimates
- Closely related to dimensionality reduction techniques of the ratings matrix $R$ (e.g. Singular Value Decomposition)
[Figure: $R$ ($|U| \times |I|$, users by items) approximated by the product of $V$ ($|U| \times f$) and $W$ ($f \times |I|$)]

This Class
- Real collaborative filtering applications run into many research challenges beyond those represented by analysis of the user-item matrix
- These challenges are often under-represented in the literature
- Some examples covered in the slides:
  - Perpetual cold start problems
  - Inferring implicit interactions and satisfaction
  - Personalization vs. contextualization
  - Repeated consumption and repeated recommendation; diversity
  - Set and sequence recommendation
  - Incremental collaborative filtering
  - Social networks and recommendation consumption
- Focus here is mostly on Web media sites
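Returning to the latent factor model from the matrix factorization slide: one common way to find the mapping is stochastic gradient descent on the squared error of the observed ratings with L2 regularization. The sketch below takes that approach; the training procedure, the hyperparameter values, and the toy ratings are all assumptions of this example, since the slides do not name a specific algorithm.

```python
import math
import random


def predict(V, W, user, item):
    """Rating estimate as the inner product of user and item vectors."""
    return sum(a * b for a, b in zip(V[user], W[item]))


def training_rmse(V, W, ratings):
    """RMSE of the model on the observed (user, item, rating) triplets."""
    total = sum((r - predict(V, W, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


def train_latent_factors(ratings, f=2, epochs=300, lr=0.05, reg=0.02, seed=0):
    """Fit f-dimensional user vectors V and item vectors W by SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    V = {u: [rng.uniform(0.1, 0.5) for _ in range(f)] for u in users}
    W = {i: [rng.uniform(0.1, 0.5) for _ in range(f)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(V, W, u, i)
            for d in range(f):
                v_ud, w_id = V[u][d], W[i][d]
                # Gradient step on squared error plus an L2 penalty.
                V[u][d] += lr * (err * w_id - reg * v_ud)
                W[i][d] += lr * (err * v_ud - reg * w_id)
    return V, W


# Toy ratings with two taste groups, invented for illustration only.
ratings = [
    ("u1", "i1", 5), ("u1", "i2", 4), ("u1", "i3", 1),
    ("u2", "i1", 4), ("u2", "i2", 5), ("u2", "i4", 1),
    ("u3", "i3", 5), ("u3", "i4", 4), ("u3", "i1", 1),
    ("u4", "i3", 4), ("u4", "i4", 5), ("u4", "i2", 1),
]
V, W = train_latent_factors(ratings)
print("training RMSE:", round(training_rmse(V, W, ratings), 3))
print("estimate for unseen (u1, i4):", round(predict(V, W, "u1", "i4"), 2))
```

Only observed entries of $R$ contribute to the loss, which is what distinguishes this from running a plain SVD on a matrix with missing values filled in.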

Web Media Sites

Definition: Cold Start Problems
- Good recommendations require observed data on the user being recommended to [items being recommended]
- User cold start: when a new user arrives to a system, can the system make a good first impression?
- Item cold start: how do we recommend newly arrived items with little historic consumption?

Challenge: Perpetual Cold Start Problems
Extreme cases exhibit perpetual cold-start scenarios:
- All users are cold and appear just once (e.g. certain online advertising scenarios)
- Every item is ephemeral with a short lifetime (e.g. news recommendations)

False-Positive Costs in Media Sites are Low
- False positive: recommending an irrelevant item
- Consequence, in media sites: (just) a bit of lost time, as opposed to lots of lost time or money in other settings
- Opportunity: better handling of cold-start problems
  - Item cold-start: show a new item to a select group of users whose feedback should help in modeling it to everyone. Several possible formulations of optimization problems
  - User cold-start: more aggressive exploration, vs. playing it safe and perpetuating popular items. But exploration should be optimized to effectively model the user

Challenge: Inferring Interactions and Satisfaction
- Dominant model in the literature: input consists of <user, item, rating> triplets, i.e. explicit ratings are available
- In many recommendation settings we only know which items users have consumed, not whether they liked them, i.e. there is no explicit ratings data
- Several publications talk about binary consumption data
- What about items the user did not consume?
  - Was the user even aware of the items he/she did not consume?
  - What items did the recommender system expose the user to?

Presentation Bias Effect on Media Consumption
- Pop culture: items' longevity creates familiarity
- Media sites: items are ephemeral, and users are mostly unaware of items the site did not expose them to
- Presentation bias obscures true taste: users essentially select the best of the little that was shown
- Must correctly account for presentation bias when modeling:
  - seen & not selected
  - not seen & not selected
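One way to keep the two cases apart when preparing training data is to label items from impression logs rather than from consumption alone. The sketch below is a minimal illustration; the three-valued labeling scheme and all names are assumptions of this example, not a method described in the slides.

```python
def label_interactions(catalog, shown, selected):
    """Label each item: 1 = selected, 0 = seen but not selected, None = never shown.

    Assumes `selected` is a subset of `shown`, i.e. users can only pick
    items the site exposed them to.
    """
    labels = {}
    for item in catalog:
        if item in selected:
            labels[item] = 1
        elif item in shown:
            labels[item] = 0      # weak negative: the user saw it and passed
        else:
            labels[item] = None   # unknown: no evidence about taste
    return labels


# Toy impression log for one user, invented for illustration only.
catalog = ["story1", "story2", "story3", "story4"]
shown = {"story1", "story2", "story3"}
selected = {"story2"}

labels = label_interactions(catalog, shown, selected)
negatives = [item for item, label in labels.items() if label == 0]
print(labels)
print("usable negatives:", negatives)
```

Treating the `None` items as negatives would bake the site's own exposure decisions into the model, which is the bias the slide warns about.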

Aside: Skips in Search
- Must correctly account for presentation bias when modeling:
  - seen and not selected
  - not seen and not selected
- Search: negative interpretation of skipped search results (Joachims, KDD 2002)

Layouts of Recommendation Modules
- Interpreting interactions in vertical layouts is easy using the skips paradigm
- What about 2D, tabbed, horizontal layouts?
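For a vertical list, the skips paradigm can be sketched as follows: a selected item is taken to be preferred over every unselected item ranked above it, on the assumption that the user scanned top-down and saw those items. This follows the general idea of the clickthrough work cited above; the function name and toy data are assumptions of this example.

```python
def skip_above_preferences(ranked_items, selected):
    """Return (preferred, skipped) pairs from one top-down list view.

    Items ranked below the last selection yield no pairs, because we
    cannot tell whether the user ever looked at them.
    """
    pairs = []
    for position, item in enumerate(ranked_items):
        if item not in selected:
            continue
        for above in ranked_items[:position]:
            if above not in selected:
                pairs.append((item, above))
    return pairs


# Toy result list, invented for illustration only.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6"]
clicked = {"d3", "d5"}
pairs = skip_above_preferences(ranked, clicked)
print(pairs)
```

The top-down scanning assumption is exactly what becomes questionable in 2D, tabbed, or horizontal layouts, where "above" has no clear meaning.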

Layouts of Recommendation Modules (cont.)
- What about multiple presentation formats?
- Are we more confident in a skip of a salient item?

Challenge: Inferring Interactions and Satisfaction
- Beyond consumption, do interactions imply satisfaction?
  - Web pages: what happens after the initial click?
  - Short online videos: what happens after pressing play?
  - TV programs: zapping patterns
- In some domains, can we even positively assess consumption? Is anyone watching?
[Figure: timeline, axis labeled Time]

[Figure labels: Personalized, Popular, Contextual]

Challenge: Contextualization vs. Personalization
- Web media sites often display links to additional stories on each article page: matching the article's context, matching the user, consumed by the user's friends, popular
- When creating a unified list for a given user reading a specific page, how should contextualization and personalization be mixed?
- Ignoring story context might create offending recommendations
- Related direction: Tensor Factorization, Karatzoglou et al., RecSys

Challenge: Repeated Recommendations
- One typically doesn't buy the same book twice, nor do people typically read the same news story twice
- But people listen to the songs they like over and over again, and watch movies they like multiple times as well
- When and how frequently is it OK to recommend an item that was already consumed?
- On the other hand, when should we stop showing a recommendation if the user doesn't act upon it?
- Implication: a recommendation system may need to track more than aggregated consumption to date
  - It may need to track consumption timelines
  - It may need to track recommendation history

3D: Three Aspects of Diversity
1. How diverse is the recommendation to user u at time t?
- Search: result set attributes (e.g. diversity) in search (Agrawal et al., WSDM 2009)
- Netflix tutorial at RecSys 2012: diversity is Relatively well understood
[Figure: axis labeled Time]

3D: Three Aspects of Diversity
2. How diverse are the recommendations, across all users, at time t?
- Indication of aggressiveness of personalization and deviation from popularity baselines
[Figure: axis labeled Time]

3D: Three Aspects of Diversity
3. How diverse are the recommendations to user u over time?
- Shouldn't recommend the same items day after day
[Figure: axis labeled Time]
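The three aspects can each be given a simple numeric proxy. The measures below (average pairwise Jaccard distance within a list, share of distinct items across users, share of new items versus the previous list) are assumptions chosen for this sketch; the slides pose the questions without fixing any metric, and the item features are invented.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 minus the Jaccard overlap of two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def intra_list_diversity(rec_list, features):
    """Aspect 1: average pairwise feature distance within one user's list."""
    pairs = list(combinations(rec_list, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(features[x], features[y]) for x, y in pairs) / len(pairs)


def cross_user_diversity(lists_by_user):
    """Aspect 2: distinct items as a share of all recommendation slots."""
    slots = sum(len(items) for items in lists_by_user.values())
    distinct = {item for items in lists_by_user.values() for item in items}
    return len(distinct) / slots if slots else 0.0


def temporal_diversity(previous_list, current_list):
    """Aspect 3: share of the current list that was not in the previous one."""
    if not current_list:
        return 0.0
    return len(set(current_list) - set(previous_list)) / len(current_list)


# Toy item features and lists, invented for illustration only.
features = {"a": {"sports"}, "b": {"sports", "news"}, "c": {"finance"}}
ild = intra_list_diversity(["a", "b", "c"], features)
across = cross_user_diversity({"u1": ["a", "b"], "u2": ["a", "c"]})
over_time = temporal_diversity(["a", "b"], ["b", "c"])
print(round(ild, 3), across, over_time)
```

A cross-user value near its minimum suggests everyone is being shown the same popular items, which matches the slide's reading of aspect 2 as an indicator of how aggressive the personalization is.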

Challenge: Recommending Sets and Sequences of Items
- In some domains, users consume multiple items in rapid succession (e.g. music playlists)
- Recent works: WWW 2012 (Aizenberg et al., sets) and KDD 2012 (Chen et al., sequences)
- From independent utility of recommendations to set or sequence utility: predicting items that go well together
- Sometimes need to respect constraints
  - An extension of diversity
  - Tiling recommendations: in TV watch-list generation, the broadcast schedule further complicates matters due to program overlaps
- Perhaps a new domain of constrained recommendations

Challenge: Incremental Collaborative Filtering
- A live system often cannot afford to recompute recommendations regularly over the entire history
- Problem: collaborative filtering models do not easily lend themselves to faithful incremental processing
- Let $M_i = \text{CF-ALG}(t_i)$ be the model computed from the user-item interactions of time window $t_i$ (windows $t_1, t_2, t_3, \ldots$). In general, for an aggregation function $f$: $f(M_1, M_2) \neq \text{CF-ALG}(t_1 \cup t_2)$
- Is there a model aggregation function $f(M_{prev}, M_{curr})$ that is good enough?
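The gap between aggregating per-window models and retraining on the union of windows shows up even in a very simple model. In the sketch below, `cf_alg` is a stand-in that counts item co-consumption per user within one batch, and `aggregate` is a candidate $f$ that adds (optionally decayed) counts; both are assumptions of this example rather than algorithms from the slides.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cf_alg(interactions):
    """Toy CF model: per-pair co-consumption counts within one batch."""
    items_by_user = defaultdict(set)
    for user, item in interactions:
        items_by_user[user].add(item)
    model = Counter()
    for items in items_by_user.values():
        for pair in combinations(sorted(items), 2):
            model[pair] += 1
    return model


def aggregate(m_prev, m_curr, decay=1.0):
    """Candidate f(M_prev, M_curr): decayed old counts plus new counts."""
    merged = Counter()
    for pair, count in m_prev.items():
        merged[pair] += decay * count
    for pair, count in m_curr.items():
        merged[pair] += count
    return merged


# Toy interaction windows, invented for illustration only.
t1 = [("ann", "a"), ("ann", "b"), ("bob", "a")]
t2 = [("bob", "b"), ("cat", "a"), ("cat", "b")]

incremental = aggregate(cf_alg(t1), cf_alg(t2))
full = cf_alg(t1 + t2)
print("incremental:", incremental[("a", "b")], "full recompute:", full[("a", "b")])
```

The two results differ because bob consumed item a in the first window and item b in the second, so neither per-window model ever sees that co-consumption. Whether such a loss is "good enough" depends on how much signal crosses window boundaries.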

Social Networks and Recommendation Computation
- Some are hailing social networks as a silver bullet for recommender systems: "Tell me who your friends are and we'll tell you what you like"
- Is it really the case that we like the same media as our friends?
- Affinity trumps friendship!
  - There are people out there who are more like us than our limited set of friends
  - Once affinity is considered, the marginal value of social connections is often negligible
- Not to be confused with non-friendship social networks, where connections are affinity-related (e.g. Epinions)

Social Networks and Recommendation Consumption
- Previous slide notwithstanding, social is a great motivator for consuming recommendations
  - "People like you rate Lincoln very highly"
  - "Your friends Alice and Bob saw Lincoln last night and loved it"
- Explaining recommendations for motivating and increasing consumption is an emerging practice
- Some commercial systems completely separate their explanation generation from their recommendation generation
  - So Alice and Bob may not be why the system recommended Lincoln to you, but they will be leveraged to get you to watch it
- Privacy in the face of joint consumption of a personalized experience vs

Course Summary
- Three main topics:
  - Batch processing of large amounts of data
  - Incremental processing
  - Stream and online processing
- Each of the main topics covered business needs, algorithms, and systems
- Complementing topics:
  - Infrastructure: data centers
  - Methodology: controlled experiments
  - Business case example: recommender systems

Revisiting the Data Science Virtuous Cycle (Web)
Requirements for systems come from each step!
- Capture: crawl, ingest feeds, record instrumented interactions
- Transfer: move the data to a system capable of storing and processing it
- Analyze/Model: here data mining & machine learning take place
- Deploy/Serve: tap output of previous step to improve user experience
- Visualization
- Experimentation & Metrics

Course Summary: Plenty that Wasn't Covered
- Interactive analytics platforms: Dremel, Impala
- In-memory distributed filesystems: Spark, Tachyon
- Graph processing: Pregel, Bagel, Giraph
- Standard Data Science toolkits: statistics, machine learning, data mining, information extraction
- Data visualization
- Dimensionality reduction techniques
- Engagement metrics
- Specific applications

Related Courses (Non-Exhaustive List)
- Systems: Parallel and Distributed Programming; 23631/ Distributed Systems; Functional Distributed computing; Database Systems
- Theory: Introduction to Statistics / Computational Learning
- Tools: Introduction to AI; Data Mining and Business Intelligence; 23676/04619 Machine Learning; Approximation Algorithms
- Applications: Search Engine Technology; Information Retrieval; Web Search and Data Mining

Final Logistics
- Exams: 30/7, 6/10; exact hours and location TBD
  - All offline material and non-communicating devices allowed
- Reception hours by appointment
- As you may be aware, this was the first rendition of this course. Help us improve! Feel free to send feedback to the course's account

Conclusions
- Big Data is an umbrella name for a vast area of multidisciplinary research (theoretical, applied and experimental) with plenty more to be done
- Academia & open source:
  - Many journals and conferences in the domain
  - Many open-source projects that build fascinating systems
  - Can sustain many graduate-level theses
- Industry: companies of all sizes and in many different areas are discovering the need for competency in Big Data and Data Science


More information

N-CAP Users Guide. Everything You Need to Know About Using the Internet! How Banner Ads Work

N-CAP Users Guide. Everything You Need to Know About Using the Internet! How Banner Ads Work N-CAP Users Guide Everything You Need to Know About Using the Internet! How Banner Ads Work How Banner Ads Work by Tom Harris If you've spent any time surfing the Internet, you've seen more than your fair

More information

Show Me the Euro. By Marian N. Jackson. Federal Reserve Bank of Atlanta. Lesson Plan of the Year Contest, 2009. Honorable Mention

Show Me the Euro. By Marian N. Jackson. Federal Reserve Bank of Atlanta. Lesson Plan of the Year Contest, 2009. Honorable Mention Show Me the Euro By Marian N. Jackson Federal Reserve Bank of Atlanta Lesson Plan of the Year Contest, 2009 Honorable Mention LESSON DESCRIPTION The purpose of this two-day lesson is to give foreign language

More information

VAK Learning Styles Self-Assessment Questionnaire

VAK Learning Styles Self-Assessment Questionnaire Student Services Study Skills Student Development and Counselling VAK Learning Styles Self-Assessment Questionnaire Circle or tick the answer that most represents how you generally behave. (It s best to

More information

Oracle Data Miner (Extension of SQL Developer 4.0)

Oracle Data Miner (Extension of SQL Developer 4.0) An Oracle White Paper September 2013 Oracle Data Miner (Extension of SQL Developer 4.0) Integrate Oracle R Enterprise Mining Algorithms into a workflow using the SQL Query node Denny Wong Oracle Data Mining

More information

It s easy to protect our files our school work, our music, our photos, our games everything that we save on our computers from loss by malware.

It s easy to protect our files our school work, our music, our photos, our games everything that we save on our computers from loss by malware. Activities for Protecting Your Identity and Computer for Elementary and Middle School Students Overview There are three posters about protecting your computer for this grade span. We recommend that these

More information

Scarcity and Choices Grade One

Scarcity and Choices Grade One Ohio Standards Connection: Economics Benchmark A Explain how the scarcity of resources requires people to make choices to satisfy their wants. Indicator 1 Explain that wants are unlimited and resources

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Going For Large Scale Application Scenario: Recommender

More information

Evaluating a CATW Writing Sample

Evaluating a CATW Writing Sample 1 Reading and Writing Center Kingsborough Community College Evaluating a CATW Writing Sample The CUNY Assessment Test in Writing (CATW)--Abridged Guide #2 Adapted from the Student Handbook/ Office of Assessment/

More information

Beating the MLB Moneyline

Beating the MLB Moneyline Beating the MLB Moneyline Leland Chen llxchen@stanford.edu Andrew He andu@stanford.edu 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series

More information

Six Signs. you are ready for BI WHITE PAPER

Six Signs. you are ready for BI WHITE PAPER Six Signs you are ready for BI WHITE PAPER LET S TAKE A LOOK AT THE WAY YOU MIGHT BE MONITORING AND MEASURING YOUR COMPANY About the auther You re managing information from a number of different data sources.

More information

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...

More information

Specification for Assessment #3 Organizing and Making Inferences/Predictions from Data

Specification for Assessment #3 Organizing and Making Inferences/Predictions from Data Organizing and Making Inferences/Predictions from Data Competencies Student can organize data by creating a table, chart, or other representation to facilitate interpretation. Student can make inferences

More information

Tomer Shiran VP Product Management MapR Technologies. November 12, 2013

Tomer Shiran VP Product Management MapR Technologies. November 12, 2013 Predictive Analytics with Hadoop Tomer Shiran VP Product Management MapR Technologies November 12, 2013 1 Me, Us Tomer Shiran VP Product Management, MapR Technologies tshiran@maprtech.com MapR Enterprise-grade

More information

Worldwide Casino Consulting Inc.

Worldwide Casino Consulting Inc. Card Count Exercises George Joseph The first step in the study of card counting is the recognition of those groups of cards known as Plus, Minus & Zero. It is important to understand that the House has

More information

Online Training Welcome Pack

Online Training Welcome Pack Online Training Welcome Pack INTRODUCTION Hello, and welcome to your brand new retra online training platform! This is a fantastic staff training and development resource provided exclusively for retra

More information

Brought to you by. Technology changes fast. From new apps to digital marketing, it can feel impossible to keep up.

Brought to you by. Technology changes fast. From new apps to digital marketing, it can feel impossible to keep up. 1 Brought to you by Technology changes fast. From new apps to digital marketing, it can feel impossible to keep up. At The Paperless Agent, our mission is to help real estate professionals from all experience

More information

Socialprise: Leveraging Social Data in the Enterprise Rev 0109

Socialprise: Leveraging Social Data in the Enterprise Rev 0109 Socialprise: Leveraging Social Data in the Enterprise Rev 0109 Contents I. Socialprise: Capturing Smart Insights into Agile Relationships II. Socialprise Applications: Getting the Who, What and When of

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

Big Data The Next Phase Lessons from a Decade+ Experiment in Big Data

Big Data The Next Phase Lessons from a Decade+ Experiment in Big Data Big Data The Next Phase Lessons from a Decade+ Experiment in Big Data David Belanger PhD Senior Research Fellow Stevens Institute of Technology dbelange@stevens.edu 1 Outline Big Data Overview Thinking

More information

Degeneracy in Linear Programming

Degeneracy in Linear Programming Degeneracy in Linear Programming I heard that today s tutorial is all about Ellen DeGeneres Sorry, Stan. But the topic is just as interesting. It s about degeneracy in Linear Programming. Degeneracy? Students

More information

Using Flash Media Live Encoder To Broadcast An Audio-Only Stream (on Mac)

Using Flash Media Live Encoder To Broadcast An Audio-Only Stream (on Mac) Using Flash Media Live Encoder To Broadcast An Audio-Only Stream (on Mac) A user guide for setting up Flash Media Live Encoder (FMLE) to broadcast video over our servers is available here: (https://community.ja.net/system/files/15551/fmle%20streaming%20wizard%20guide.pdf)

More information

Ratings, Audiences, & Failed Shows

Ratings, Audiences, & Failed Shows Ratings, Audiences, & Failed Shows TELEVISION RATINGS Before the Show Airs A New Season TV networks test programs Subjects view pilots or season finales for current shows in theaters. Meters record their

More information

Mathematics Task Arcs

Mathematics Task Arcs Overview of Mathematics Task Arcs: Mathematics Task Arcs A task arc is a set of related lessons which consists of eight tasks and their associated lesson guides. The lessons are focused on a small number

More information

Chapter 1 Basic Introduction to Computers. Discovering Computers 2012. Your Interactive Guide to the Digital World

Chapter 1 Basic Introduction to Computers. Discovering Computers 2012. Your Interactive Guide to the Digital World Chapter 1 Basic Introduction to Computers Discovering Computers 2012 Your Interactive Guide to the Digital World Objectives Overview Explain why computer literacy is vital to success in today s world Define

More information

Best Practice Search Engine Optimisation

Best Practice Search Engine Optimisation Best Practice Search Engine Optimisation October 2007 Lead Hitwise Analyst: Australia Heather Hopkins, Hitwise UK Search Marketing Services Contents 1 Introduction 1 2 Search Engines 101 2 2.1 2.2 2.3

More information

QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209

QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209 QMB 3302 - Business Analytics CRN 82361 - Fall 2015 W 6:30-9:15 PM -- Lutgert Hall 2209 Rajesh Srivastava, Ph.D. Professor and Chair, Department of Information Systems and Operations Management Lutgert

More information

Self-Improving Supply Chains

Self-Improving Supply Chains Self-Improving Supply Chains Cyrus Hadavi Ph.D. Adexa, Inc. All Rights Reserved January 4, 2016 Self-Improving Supply Chains Imagine a world where supply chain planning systems can mold themselves into

More information

Hybrid model rating prediction with Linked Open Data for Recommender Systems

Hybrid model rating prediction with Linked Open Data for Recommender Systems Hybrid model rating prediction with Linked Open Data for Recommender Systems Andrés Moreno 12 Christian Ariza-Porras 1, Paula Lago 1, Claudia Jiménez-Guarín 1, Harold Castro 1, and Michel Riveill 2 1 School

More information

see, say, feel, do Social Media Metrics that Matter

see, say, feel, do Social Media Metrics that Matter see, say, feel, do Social Media Metrics that Matter the three stages of social media adoption When social media first burst on to the scene, it was the new new thing. But today, social media has reached

More information

CoolaData Predictive Analytics

CoolaData Predictive Analytics CoolaData Predictive Analytics 9 3 6 About CoolaData CoolaData empowers online companies to become proactive and predictive without having to develop, store, manage or monitor data themselves. It is an

More information

COMPUTER APPLICATIONS TECHNOLOGY TEACHER GUIDE

COMPUTER APPLICATIONS TECHNOLOGY TEACHER GUIDE COMPUTER APPLICATIONS TECHNOLOGY TEACHER GUIDE Welcome to the Mindset Computer Applications Technology teaching and learning resources! In partnership with Coza Cares Foundation, Mindset Learn, a division

More information

A Data Generator for Multi-Stream Data

A Data Generator for Multi-Stream Data A Data Generator for Multi-Stream Data Zaigham Faraz Siddiqui, Myra Spiliopoulou, Panagiotis Symeonidis, and Eleftherios Tiakas University of Magdeburg ; University of Thessaloniki. [siddiqui,myra]@iti.cs.uni-magdeburg.de;

More information

OPM3. Project Management Institute. OPM3 in Action: Pinellas County IT Turns Around Performance and Customer Confidence

OPM3. Project Management Institute. OPM3 in Action: Pinellas County IT Turns Around Performance and Customer Confidence Project Management Institute OPM3 case study : OPM3 in Action: Pinellas County IT Turns Around Performance and Customer Confidence OPM3 Organizational Project Management Maturity Model Project Management

More information

Maximizing Precision of Hit Predictions in Baseball

Maximizing Precision of Hit Predictions in Baseball Maximizing Precision of Hit Predictions in Baseball Jason Clavelli clavelli@stanford.edu Joel Gottsegen joeligy@stanford.edu December 13, 2013 Introduction In recent years, there has been increasing interest

More information

CHAPTER VII CONCLUSIONS

CHAPTER VII CONCLUSIONS CHAPTER VII CONCLUSIONS To do successful research, you don t need to know everything, you just need to know of one thing that isn t known. -Arthur Schawlow In this chapter, we provide the summery of the

More information

analytics+insights for life science Descriptive to Prescriptive Accelerating Business Insights with Data Analytics a lifescale leadership brief

analytics+insights for life science Descriptive to Prescriptive Accelerating Business Insights with Data Analytics a lifescale leadership brief analytics+insights for life science Descriptive to Prescriptive Accelerating Business Insights with Data Analytics a lifescale leadership brief The potential of data analytics can be confusing for many

More information

Load Testing Basics: These are the basic ideas in setting up a load test By: Bob Wescott

Load Testing Basics: These are the basic ideas in setting up a load test By: Bob Wescott : These are the basic ideas in setting up a load test By: Bob Wescott Summary Load testing requires you to select transactions that are important to you and then synthetically generate them at a rate that

More information

Some Research Challenges for Big Data Analytics of Intelligent Security

Some Research Challenges for Big Data Analytics of Intelligent Security Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,

More information

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 美 國 13 歲 學 生 用 Big Data 找 出 霸 淩 熱 點 Puri 架 設 網 站 Bullyvention, 藉 由 分 析 Twitter 上 找 出 提 到 跟 霸 凌 相 關 的 詞, 搭 配 地 理 位 置

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

What is Data Science? Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014

What is Data Science? Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014 What is Data Science? { Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014 Let s start with: What is Data? http://upload.wikimedia.org/wikipedia/commons/f/f0/darpa

More information

VIDEO TRANSCRIPT: Content Marketing Analyzing Your Efforts 1. Content Marketing - Analyzing Your Efforts:

VIDEO TRANSCRIPT: Content Marketing Analyzing Your Efforts 1. Content Marketing - Analyzing Your Efforts: VIDEO TRANSCRIPT: Content Marketing Analyzing Your Efforts 1 Content Marketing - Analyzing Your Efforts: This is a transcript of a presentation originally given live at the Growth Powered by Risdall Fall

More information

Cross-Domain Collaborative Recommendation in a Cold-Start Context: The Impact of User Profile Size on the Quality of Recommendation

Cross-Domain Collaborative Recommendation in a Cold-Start Context: The Impact of User Profile Size on the Quality of Recommendation Cross-Domain Collaborative Recommendation in a Cold-Start Context: The Impact of User Profile Size on the Quality of Recommendation Shaghayegh Sahebi and Peter Brusilovsky Intelligent Systems Program University

More information

Machine Learning over Big Data

Machine Learning over Big Data Machine Learning over Big Presented by Fuhao Zou fuhao@hust.edu.cn Jue 16, 2014 Huazhong University of Science and Technology Contents 1 2 3 4 Role of Machine learning Challenge of Big Analysis Distributed

More information