Collaborative Filtering. Radek Pelánek


 Malcolm Simon
 2 years ago
 Views:
Transcription
1 Collaborative Filtering Radek Pelánek 2015
2 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains widely used in practice
3 Basic CF Approach input: matrix of useritem ratings (with missing values, often very sparse) output: predictions for missing values
4 Netflix Prize Netflix video rental company contest: 10% improvement of the quality of recommendations prize: 1 million dollars data: user ID, movie ID, time, rating
5 Main CF Techniques memory based nearest neighbors (user, item) model based latent factors matrix factorization
6 Neighborhood Methods: Illustration Matrix factorization techniques for recommender systems
7 Latent Factors: Illustration Matrix factorization techniques for recommender systems
8 Latent Factors: Netflix Data Matrix factorization techniques for recommender systems
9 Ratings explicit e.g., stars (1 to 5 Likert scale) to consider: granularity, multidimensionality issues: users may not be willing to rate data sparsity implicit proxy data for quality rating clicks, page views, time on page the following applies directly to explicit ratings, modifications may be needed for implicit (or their combination)
10 Nonpersonalized Predictions averages, issues: bias, normalization some users give systematically higher ratings (more details for a CF later) number of ratings, uncertainty average 5 from 3 ratings average 4.9 from 100 ratings
11 Exploitation vs Exploration pure exploitation always recommend top items what if some other item is actually better, rating is poorer just due to noise? exploration trying items to get more data how to balance exploration and exploitation?
12 Multiarmed Bandit standard model for exploitation vs exploration arm (unknown) probabilistic reward how to choose arms to maximize reward? wellstudied, many algorithms (e.g., upper confidence bounds) typical application: advertisements
13 Core Idea do not use just averages quantify uncertainty (e.g., standard deviation) combine average & uncertainty for decisions example: TrueSkill, ranking of players (leaderboard)
14 Note on Improving Performance simple predictors often provide reasonable performance further improvements often small but can have significant impact on behavior (not easy to evaluate) evaluation lecture Introduction to Recommender Systems, Xavier Amatriain
15 Userbased Nearest Neighbor CF user Alice: item i not rated by Alice: find similar users to Alice who have rated i compute average to predict rating by Alice recommend items with highest predicted rating
16 Userbased Nearest Neighbor CF Recommender Systems: An Introduction (slides)
17 User Similarity Pearson correlation coefficient (alternatives: e.g. spearman cor. coef., cosine similarity) Recommender Systems: An Introduction (slides)
18 Pearson Correlation Coefficient: Reminder r = n n i=1 (X i X )(Y i Ȳ ) i=1 (X i X ) 2 n i=1 (Y i Ȳ )2
19 Making Predictions b N pred(a, i) = r a + sim(a, b) (r bi r b ) sim(a, b) r ai rating of user a, item i r a, r b user averages b N
20 Improvements number of corated items agreement on more exotic items more important case amplification more weight to very similar neighbors neighbor selection
21 Itembased Collaborative Filtering compute similarity between items use this similarity to predict ratings more computationally efficient, often: number of items << number of users
22 Itembased Nearest Neighbor CF Recommender Systems: An Introduction (slides)
23 Cosine Similarity rating by Bob rating by Alice cos(α) = A B A B
24 Similarity, Predictions (adjusted) cosine similarity similar to Pearson s r, works slightly better pred(u, p) = i R sim(i, p)r ui sim(i, p) i R neighborhood size limited (20 to 50)
25 Preprocessing O(N 2 ) calculations still large Itemitem recommendations by Amazon (2003) calculate similarities in advance (periodical update) supposed to be stable, item relations not expected to change quickly reductions (min. number of coratings etc)
26 Matrix Factorization CF main idea: latent factors of users/items use these to predict ratings related to singular value decomposition
27 Notes singular value decomposition (SVD) theorem in linear algebra in CF context the name SVD usually used for an approach only slightly related to SVD theorem related to latent semantic analysis introduced during the Netflix prize, in a blog post (Simon Funk)
28 Singular Value Decomposition (Linear Algebra) X = USV T U, V orthogonal matrices S diagonal matrix, diagonal entries singular values lowrank matrix approximation (use only top k singular values)
29 SVD CF Interpretation X = USV T X matrix of ratings U userfactors strengths V itemfactors strengths S importance of factors
30 Latent Factors Matrix factorization techniques for recommender systems
31 Latent Factors Matrix factorization techniques for recommender systems
32 Missing Values matrix factorization techniques (SVD) work with full matrix ratings sparse matrix solutions: value imputation expensive, imprecise alternative algorithms (greedy, heuristic): gradient descent, alternating least squares
33 Notation u user, i item r ui rating ˆr ui predicted rating b, b u, b i bias q i, p u latent factor vectors (length k)
34 Simple Baseline Predictors [ note: always use baseline methods in your experiments ] naive: ˆr ui = µ, µ is global mean biases: ˆr ui = µ + b u + b i b u, b i biases, average deviations some users/items systematically larger/lower ratings
35 Latent Factors (for a while assume centered data without bias) ˆr ui = qi T p u vector multiplication useritem interaction via latent factors illustration (3 factors): user (p u ): (0.5, 0.8, 0.3) item (q i ): (0.4, 0.1, 0.8)
36 Latent Factors vector multiplication ˆr ui = q T i p u useritem interaction via latent factors we need to find q i, p u from the data (cf contentbased techniques) note: finding q i, p u at the same time
37 Learning Factor Vectors we want to minimize squared errors (related to RMSE, more details leater) min q,p (u,i) T (r ui q T i p u ) 2 regularization to avoid overfitting (standard machine learning approach) min q,p (u,i) T How to find the minimum? (r ui q T i p u ) 2 + λ( q i 2 + p u 2 )
38 Stochastic Gradient Descent standard technique in machine learning greedy, may find local minimum
39 Gradient Descent for CF prediction error e ui = r ui q T i p u update (in parallel): q i := q i + γ(e ui p u λq i ) p i := p u + γ(e ui q i λp u ) math behind equations gradient = partial derivatives γ, λ constants, set pragmatically learning rate γ (0.005 for Netflix) regularization λ (0.02 for Netflix)
40 Advice if you want to learn/understand gradient descent (and also many other machine learning notions) experiment with linear regression can be (simply) approached in many ways: analytic solution, gradient descent, brute force search easy to visualize good for intuitive understanding relatively easy to derive the equations (one of examples in IV122 Math & programming)
41 Advice II recommended sources: Koren, Yehuda, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer 42.8 (2009): Koren, Yehuda, and Robert Bell. Advances in collaborative filtering. Recommender Systems Handbook. Springer US,
42 Adding Bias predictions: ˆr ui = µ + b u + b i + q T i p u function to minimize: (r ui µ b u b i qi T p u ) 2 +λ( q i 2 + p u 2 +bu+b 2 i 2 )] min q,p (u,i) T
43 Improvements additional data sources (implicit ratings) varying confidence level temporal dynamics
44 Temporal Dynamics Netflix data Y. Koren, Collaborative Filtering with Temporal Dynamics
45 Temporal Dynamics Netflix data, jump early in 2004 Y. Koren, Collaborative Filtering with Temporal Dynamics
46 Temporal Dynamics baseline = behaviour influenced by exterior considerations interaction = behaviour explained by match between users and items Y. Koren, Collaborative Filtering with Temporal Dynamics
47 Results for Netflix Data Matrix factorization techniques for recommender systems
48 Slope One Slope One Predictors for Online RatingBased Collaborative Filtering average over such simple prediction
49 Slope One accurate within reason easy to implement updateable on the fly efficient at query time expect little from first visitors
50 Other CF Techniques clustering association rules classifiers
51 Clustering main idea: cluster similar users nonpersonalized predictions ( popularity ) for each cluster clustering: unsupervised machine learning many algorithms kmeans, EM algorithm,...
52 Clustering Introduction to Recommender Systems, Xavier Amatriain
53 Clustering: Kmeans
54 Association Rules relationships among items, e.g., common purchases famous example (google it for more details): beer and diapers Customers Who Bought This Item Also Bought... advantage: provides explanation, useful for building trust
55 Classifiers general machine learning techniques positive / negative classification train, test set logistic regression, support vector machines, decision trees, Bayesian techniques,...
56 Limitations of CF cold start problem popularity bias difficult to recommend items from the long tail impact of noise (e.g., one account used by different people) possibility of attacks
57 Cold Start Problem How to recommend new items? What to recommend to new users?
58 Cold Start Problem use another method (nonpersonalized, contentbased...) in the initial phase ask/force user to rate items use defaults (means) better algorithms e.g., recursive CF
59 Recursive Collaborative Filtering Recommender Systems: An Introduction (slides)
60 Collaborative Filtering: Summary requires only ratings, widely applicable neighborhood methods, latent factors use of machine learning techniques
Big Data Analytics. Lucas Rego Drumond
Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Going For Large Scale Application Scenario: Recommender
More informationRecommender Systems: Contentbased, Knowledgebased, Hybrid. Radek Pelánek
Recommender Systems: Contentbased, Knowledgebased, Hybrid Radek Pelánek 2015 Today lecture, basic principles: contentbased knowledgebased hybrid, choice of approach,... critiquing, explanations,...
More informationHybrid model rating prediction with Linked Open Data for Recommender Systems
Hybrid model rating prediction with Linked Open Data for Recommender Systems Andrés Moreno 12 Christian ArizaPorras 1, Paula Lago 1, Claudia JiménezGuarín 1, Harold Castro 1, and Michel Riveill 2 1 School
More informationThe Need for Training in Big Data: Experiences and Case Studies
The Need for Training in Big Data: Experiences and Case Studies Guy Lebanon Amazon Background and Disclaimer All opinions are mine; other perspectives are legitimate. Based on my experience as a professor
More informationMachine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
More informationDATA ANALYTICS USING R
DATA ANALYTICS USING R Duration: 90 Hours Intended audience and scope: The course is targeted at fresh engineers, practicing engineers and scientists who are interested in learning and understanding data
More informationIPTV Recommender Systems. Paolo Cremonesi
IPTV Recommender Systems Paolo Cremonesi Agenda 2 IPTV architecture Recommender algorithms Evaluation of different algorithms Multimodel systems Valentino Rossi 3 IPTV architecture 4 Live TV Settopbox
More informationLecture #2. Algorithms for Big Data
Additional Topics: Big Data Lecture #2 Algorithms for Big Data Joseph Bonneau jcb82@cam.ac.uk April 30, 2012 Today's topic: algorithms Do we need new algorithms? Quantity is a quality of its own Joseph
More informationContentBased Recommendation
ContentBased Recommendation Contentbased? Item descriptions to identify items that are of particular interest to the user Example Example Comparing with Noncontent based Items Userbased CF Searches
More informationAdvances in Collaborative Filtering
Advances in Collaborative Filtering Yehuda Koren and Robert Bell Abstract The collaborative filtering (CF) approach to recommenders has recently enjoyed much interest and progress. The fact that it played
More informationIntroduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones
Introduction to machine learning and pattern recognition Lecture 1 Coryn BailerJones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation
More informationScalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights
Seventh IEEE International Conference on Data Mining Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights Robert M. Bell and Yehuda Koren AT&T Labs Research 180 Park
More informationOverview. Introduction. Recommender Systems & Slope One Recommender. Distributed Slope One on Mahout and Hadoop. Experimental Setup and Analyses
Slope One Recommender on Hadoop YONG ZHENG Center for Web Intelligence DePaul University Nov 15, 2012 Overview Introduction Recommender Systems & Slope One Recommender Distributed Slope One on Mahout and
More informationCollaborative Filtering Recommender Systems. Contents
Foundations and Trends R in Human Computer Interaction Vol. 4, No. 2 (2010) 81 173 c 2011 M. D. Ekstrand, J. T. Riedl and J. A. Konstan DOI: 10.1561/1100000009 Collaborative Filtering Recommender Systems
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More information! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms  II
! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms  II ChingYung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and
More informationFactorization Machines
Factorization Machines Factorized Polynomial Regression Models Christoph Freudenthaler, Lars SchmidtThieme and Steffen Rendle 2 Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim,
More informationBINARY PRINCIPAL COMPONENT ANALYSIS IN THE NETFLIX COLLABORATIVE FILTERING TASK
BINARY PRINCIPAL COMPONENT ANALYSIS IN THE NETFLIX COLLABORATIVE FILTERING TASK László Kozma, Alexander Ilin, Tapani Raiko Helsinki University of Technology Adaptive Informatics Research Center P.O. Box
More informationPreliminaries. Chapter 2. 2.1 Basic recommendation techniques. 2.1.1 Notation and symbols
Chapter 2 Preliminaries This chapter is organized as follows: First we present the two basic filtering techniques for recommender systems: collaborative and contentbased filtering. Afterwards, in Section
More informationLink Prediction in Social Networks
Link Prediction in Social Networks 2/17/2014 Outline Link Prediction Problems Social Network Recommender system Algorithms of Link Prediction Supervised Methods Collaborative Filtering Recommender System
More informationCS Master Level Courses and Areas COURSE DESCRIPTIONS. CSCI 521 RealTime Systems. CSCI 522 High Performance Computing
CS Master Level Courses and Areas The graduate courses offered may change over time, in response to new developments in computer science and the interests of faculty and students; the list of graduate
More informationIntroduction to Machine Learning
Introduction to Machine Learning Prof. Alexander Ihler Prof. Max Welling icamp Tutorial July 22 What is machine learning? The ability of a machine to improve its performance based on previous results:
More informationCSE 494 CSE/CBS 598 (Fall 2007): Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye
CSE 494 CSE/CBS 598 Fall 2007: Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye 1 Introduction One important method for data compression and classification is to organize
More informationPrinciples of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
More informationEVALUATING THE IMPACT OF DEMOGRAPHIC DATA ON A HYBRID RECOMMENDER MODEL
Vol. 12, No. 2, pp. 149167 ISSN: 16457641 EVALUATING THE IMPACT OF DEMOGRAPHIC DATA ON A HYBRID RECOMMENDER MODEL Edson B. Santos Junior. Institute of Mathematics and Computer Science Sao Paulo University
More informationRecommendation Tool Using Collaborative Filtering
Recommendation Tool Using Collaborative Filtering Aditya Mandhare 1, Soniya Nemade 2, M.Kiruthika 3 Student, Computer Engineering Department, FCRIT, Vashi, India 1 Student, Computer Engineering Department,
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 The Singular Value Decomposition (SVD) Linear Algebra Methods for Data Mining, Spring 2007, University of
More informationRating Prediction with Informative Ensemble of MultiResolution Dynamic Models
JMLR: Workshop and Conference Proceedings 75 97 Rating Prediction with Informative Ensemble of MultiResolution Dynamic Models Zhao Zheng Hong Kong University of Science and Technology, Hong Kong Tianqi
More informationBig & Personal: data and models behind Netflix recommendations
Big & Personal: data and models behind Netflix recommendations Xavier Amatriain Netflix xavier@netflix.com ABSTRACT Since the Netflix $1 million Prize, announced in 2006, our company has been known to
More informationAPPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
More informationEnsemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008
Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationBig Data Technology Recommendation Challenges in Web Media Sites Course Summary
Big Data Technology Recommendation Challenges in Web Media Sites Course Summary Edward Bortnikov & Ronny Lempel Yahoo! Labs, Haifa Recommender Systems: A Canonical Big Data Problem Pioneered by Amazon
More informationHow I won the Chess Ratings: Elo vs the rest of the world Competition
How I won the Chess Ratings: Elo vs the rest of the world Competition Yannis Sismanis November 2010 Abstract This article discusses in detail the rating system that won the kaggle competition Chess Ratings:
More informationModern consumers are inundated with MATRIX FACTORIZATION TECHNIQUES FOR RECOMMENDER SYSTEMS COVER FEATURE. Recommender system strategies
COVER FEAURE MARIX FACORIZAION ECHNIQUES FOR RECOMMENDER SYSEMS Yehuda Koren, Yahoo Research Robert Bell and Chris Volinsky, A& Labs Research As the Netflix Prize competition has demonstrated, matrix factorization
More informationFactorization Machines
Factorization Machines Steffen Rendle Department of Reasoning for Intelligence The Institute of Scientific and Industrial Research Osaka University, Japan rendle@ar.sanken.osakau.ac.jp Abstract In this
More informationCollaborative Filtering for Implicit Feedback Datasets
Collaborative Filtering for Implicit Feedback Datasets Yifan Hu AT&T Labs Research Florham Park, NJ 07932 Yehuda Koren Yahoo! Research Haifa 31905, Israel Chris Volinsky AT&T Labs Research Florham Park,
More informationBUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE
BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE Alex Lin Senior Architect Intelligent Mining alin@intelligentmining.com Outline Predictive modeling methodology knearest Neighbor
More informationIntroduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011
Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning
More informationCOMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16. Lecture 2: Linear Regression Gradient Descent Nonlinear basis functions
COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16 Lecture 2: Linear Regression Gradient Descent Nonlinear basis functions LINEAR REGRESSION MOTIVATION Why Linear Regression? Regression
More informationOn Topk Recommendation using Social Networks
On Topk Recommendation using Social Networks Xiwang Yang, Harald Steck,Yang Guo and Yong Liu Polytechnic Institute of NYU, Brooklyn, NY, USA 1121 Bell Labs, AlcatelLucent, New Jersey Email: xyang1@students.poly.edu,
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationAnalyze It use cases in telecom & healthcare
Analyze It use cases in telecom & healthcare Chung Min Chen, VP of Data Science The views and opinions expressed in this presentation are those of the author and do not necessarily reflect the position
More informationA Social NetworkBased Recommender System (SNRS)
A Social NetworkBased Recommender System (SNRS) Jianming He and Wesley W. Chu Computer Science Department University of California, Los Angeles, CA 90095 jmhek@cs.ucla.edu, wwc@cs.ucla.edu Abstract. Social
More informationLearning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal
Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether
More informationSupervised Learning (Big Data Analytics)
Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Daybyday Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationChallenges and Opportunities in Data Mining: Personalization
Challenges and Opportunities in Data Mining: Big Data, Predictive User Modeling, and Personalization Bamshad Mobasher School of Computing DePaul University, April 20, 2012 Google Trends: Data Mining vs.
More informationFundamental Analysis Challenge
All Together Now: A Perspective on the NETFLIX PRIZE Robert M. Bell, Yehuda Koren, and Chris Volinsky When the Netflix Prize was announced in October of 6, we initially approached it as a fun diversion
More informationCI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.
CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes
More informationContextualBandit Approach to Recommendation Konstantin Knauf
ContextualBandit Approach to Recommendation Konstantin Knauf 22. Januar 2014 Prof. Ulf Brefeld Knowledge Mining & Assesment 1 Agenda Problem Scenario Scenario Multiarmed Bandit Model for Online Recommendation
More informationCLASSIFICATION AND CLUSTERING. Anveshi Charuvaka
CLASSIFICATION AND CLUSTERING Anveshi Charuvaka Learning from Data Classification Regression Clustering Anomaly Detection Contrast Set Mining Classification: Definition Given a collection of records (training
More information8/29/2015. Data Mining and Machine Learning. Erich Seamon University of Idaho
Data Mining and Machine Learning Erich Seamon University of Idaho www.webpages.uidaho.edu/erichs erichs@uidaho.edu 1 I am NOMAD 2 1 Data Mining and Machine Learning Outline Outlining the data mining and
More informationE N T I T Y R E C O M M E N D AT I O N B A S E D O N W I K I P E D I A
University of Saarland Faculty of Natural Sciences and Technology I Department of Computer Science Master s Thesis E N T I T Y R E C O M M E N D AT I O N B A S E D O N W I K I P E D I A submitted by dragan
More informationFast Analytics on Big Data with H20
Fast Analytics on Big Data with H20 0xdata.com, h2o.ai Tomas Nykodym, Petr Maj Team About H2O and 0xdata H2O is a platform for distributed in memory predictive analytics and machine learning Pure Java,
More informationA NOVEL RESEARCH PAPER RECOMMENDATION SYSTEM
International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 7, Issue 1, JanFeb 2016, pp. 0716, Article ID: IJARET_07_01_002 Available online at http://www.iaeme.com/ijaret/issues.asp?jtype=ijaret&vtype=7&itype=1
More informationAn Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationKnowledge Discovery and Data Mining 1 (VO) (707.003)
Knowledge Discovery and Data Mining 1 (VO) (707.003) Denis Helic KTI, TU Graz Oct 1, 2015 Denis Helic (KTI, TU Graz) KDDM1 Oct 1, 2015 1 / 55 Lecturer Name: Denis Helic Office: IWT, Inffeldgasse 13, 5th
More informationBayesian Factorization Machines
Bayesian Factorization Machines Christoph Freudenthaler, Lars SchmidtThieme Information Systems & Machine Learning Lab University of Hildesheim 31141 Hildesheim {freudenthaler, schmidtthieme}@ismll.de
More informationBig Data Analytics CSCI 4030
High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising
More informationPREA: Personalized Recommendation Algorithms Toolkit
Journal of Machine Learning Research 13 (2012) 26992703 Submitted 7/11; Revised 4/12; Published 9/12 PREA: Personalized Recommendation Algorithms Toolkit Joonseok Lee Mingxuan Sun Guy Lebanon College
More informationLecture 5: Singular Value Decomposition SVD (1)
EEM3L1: Numerical and Analytical Techniques Lecture 5: Singular Value Decomposition SVD (1) EE3L1, slide 1, Version 4: 25Sep02 Motivation for SVD (1) SVD = Singular Value Decomposition Consider the system
More informationPredictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
More informationRegression Using Support Vector Machines: Basic Foundations
Regression Using Support Vector Machines: Basic Foundations Technical Report December 2004 Aly Farag and Refaat M Mohamed Computer Vision and Image Processing Laboratory Electrical and Computer Engineering
More informationMasters Courses Recommendation: Exploring Collaborative Filtering and Singular Value Decomposition with Student Profiling
Masters Courses Recommendation: Exploring Collaborative Filtering and Singular Value Decomposition with Student Profiling Fábio Carballo fabio.carballo@tecnico.ulisboa.pt Instituto Superior Técnico, Universidade
More informationMachine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
More informationIs a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
More informationScalable Machine Learning  or what to do with all that Big Data infrastructure
 or what to do with all that Big Data infrastructure TU Berlin blog.mikiobraun.de Strata+Hadoop World London, 2015 1 Complex Data Analysis at Scale Clickthrough prediction Personalized Spam Detection
More information2. EXPLICIT AND IMPLICIT FEEDBACK
Comparison of Implicit and Explicit Feedback from an Online Music Recommendation Service Gawesh Jawaheer Gawesh.Jawaheer.1@city.ac.uk Martin Szomszor Martin.Szomszor.1@city.ac.uk Patty Kostkova Patty@soi.city.ac.uk
More informationDefending Networks with Incomplete Information: A Machine Learning Approach. Alexandre Pinto alexcp@mlsecproject.org @alexcpsec @MLSecProject
Defending Networks with Incomplete Information: A Machine Learning Approach Alexandre Pinto alexcp@mlsecproject.org @alexcpsec @MLSecProject Agenda Security Monitoring: We are doing it wrong Machine Learning
More information1 Orthogonal projections and the approximation
Math 1512 Fall 2010 Notes on least squares approximation Given n data points (x 1, y 1 ),..., (x n, y n ), we would like to find the line L, with an equation of the form y = mx + b, which is the best fit
More informationCity University of Hong Kong. Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015
City University of Hong Kong Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015 Part I Course Title: Fundamentals of Data Science Course Code:
More informationA Collaborative Filtering Recommendation Algorithm Based On User Clustering And Item Clustering
A Collaborative Filtering Recommendation Algorithm Based On User Clustering And Item Clustering GRADUATE PROJECT TECHNICAL REPORT Submitted to the Faculty of The School of Engineering & Computing Sciences
More informationInvited Applications Paper
Invited Applications Paper   Thore Graepel Joaquin Quiñonero Candela Thomas Borchert Ralf Herbrich Microsoft Research Ltd., 7 J J Thomson Avenue, Cambridge CB3 0FB, UK THOREG@MICROSOFT.COM JOAQUINC@MICROSOFT.COM
More informationClass Overview and General Introduction to Machine Learning
Class Overview and General Introduction to Machine Learning Piyush Rai www.cs.utah.edu/~piyush CS5350/6350: Machine Learning August 23, 2011 (CS5350/6350) Intro to ML August 23, 2011 1 / 25 Course Logistics
More informationMultiArmed Bandit Problems
Machine Learning Coms4771 MultiArmed Bandit Problems Lecture 20 Multiarmed Bandit Problems The Setting: K arms (or actions) Each time t, each arm i pays off a bounded realvalued reward x i (t), say
More informationSystem Identification for Acoustic Comms.:
System Identification for Acoustic Comms.: New Insights and Approaches for Tracking Sparse and Rapidly Fluctuating Channels Weichang Li and James Preisig Woods Hole Oceanographic Institution The demodulation
More informationJava Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Nonnormal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
More informationChapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining BecerraFernandez, et al.  Knowledge Management 1/e  2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
More informationNeural Networks. CAP5610 Machine Learning Instructor: GuoJun Qi
Neural Networks CAP5610 Machine Learning Instructor: GuoJun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector
More informationPrecise Recommendation System for the Long Tail Problem Using Adaptive Clustering Technique
Precise Recommendation System for the Long Tail Problem Using Adaptive Clustering Technique S.Jeyshirii 1, Dr.T.K.Thivakaran 2 P.G. Student, Department of Computer Science & Engineering,Sri Venkateswara
More informationMammoth Scale Machine Learning!
Mammoth Scale Machine Learning! Speaker: Robin Anil, Apache Mahout PMC Member! OSCON"10! Portland, OR! July 2010! Quick Show of Hands!# Are you fascinated about ML?!# Have you used ML?!# Do you have Gigabytes
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More informationSeveral Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
More informationMachine Learning Big Data using Map Reduce
Machine Learning Big Data using Map Reduce By Michael Bowles, PhD Where Does Big Data Come From? Web data (web logs, click histories) ecommerce applications (purchase histories) Retail purchase histories
More informationNovember 28 th, Carlos Guestrin. Lower dimensional projections
PCA Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University November 28 th, 2007 Lower dimensional projections Rather than picking a subset of the features, we can new features that are combinations
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationPractical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
More informationAdvice for applying Machine Learning
Advice for applying Machine Learning Andrew Ng Stanford University Today s Lecture Advice on how getting learning algorithms to different applications. Most of today s material is not very mathematical.
More informationMS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
More informationCS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
More informationAdvanced Ensemble Strategies for Polynomial Models
Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer
More informationImproved Recommendations via (More) Collaboration
Improved Recommendations via (More) Collaboration Rubi Boim Haim Kaplan Tova Milo Ronitt Rubinfeld School of Computer Science TelAviv University {boim,haimk,milo,ronitt}@post.tau.ac.il ABSTRACT We consider
More informationRecommender Systems. UserFacing Decision Support Systems. Michael Hahsler
Recommender Systems UserFacing Decision Support Systems Michael Hahsler Intelligent Data Analysis Lab (IDA@SMU) CSE, Lyle School of Engineering Southern Methodist University EMIS 5/7357: Decision Support
More informationMATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
More informationECS289 Data Mining Lecture Outline
ECS289 Data Mining Lecture Outline What is Data Mining? Absurdly Simple Student Grade Example Fundamental Tasks in Data Mining Course Structure Overview Emphasis on social networks and graphs Our aim
More informationClustering and Data Mining in R
Clustering and Data Mining in R Workshop Supplement Thomas Girke December 10, 2011 Introduction Data Preprocessing Data Transformations Distance Methods Cluster Linkage Hierarchical Clustering Approaches
More informationCourse: Model, Learning, and Inference: Lecture 5
Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.
More informationMachine Learning Capacity and Performance Analysis and R
Machine Learning and R May 3, 11 30 25 15 10 5 25 15 10 5 30 25 15 10 5 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 100 80 60 40 100 80 60 40 100 80 60 40 30 25 15 10 5 25 15 10
More information