Big Data Analytics. Lucas Rego Drumond


Slide 1: Big Data Analytics
Lucas Rego Drumond
Information Systems and Machine Learning Lab (ISMLL)
Institute of Computer Science, University of Hildesheim, Germany
Going For Large Scale Application Scenario: Recommender Systems (1 / 44)
Slide 2: Outline

1. ADMM Continued
2. Recommender Systems
3. Traditional Recommendation Approaches
   3.1. Nearest Neighbor Approaches
   3.2. Factorization Models
Slide 3: Outline, section 1. ADMM Continued
Slide 4 (1. ADMM Continued): Applying ADMM to predictive problems

\min_{\beta^{(i)},\alpha} \sum_{i=1}^{N} \sum_{(x,y)\in D_i^{train}} l(y,\hat{y}(x;\beta^{(i)})) + R(\beta^{(i)})
subject to \beta^{(i)} - \alpha = 0

The ADMM algorithm iteratively performs the following steps:

\beta^{(i),t+1} \leftarrow \arg\min_{\beta} \sum_{(x,y)\in D_i^{train}} l(y,\hat{y}(x;\beta)) + R(\beta) + \nu^{(i),t\top}\beta + \frac{s}{2}\|\beta-\alpha^{t}\|_2^2

\alpha^{t+1} \leftarrow \arg\min_{\alpha} \sum_{i=1}^{N} \nu^{(i),t\top}(\beta^{(i),t+1}-\alpha) + \frac{s}{2}\|\beta^{(i),t+1}-\alpha\|_2^2

\nu^{(i),t+1} \leftarrow \nu^{(i),t} + s(\beta^{(i),t+1}-\alpha^{t+1})
Slide 5 (1. ADMM Continued): Solving Linear Regression with ADMM

Loss function:

\|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2 = \sum_{i=1}^{N} \|y^{(i)} - X^{(i)}\beta^{(i)}\|_2^2 + \lambda\|\beta^{(i)}\|_2^2

The ADMM algorithm iteratively performs the following steps:

\beta^{(i),t+1} \leftarrow \arg\min_{\beta} \|y^{(i)} - X^{(i)}\beta\|_2^2 + \lambda\|\beta\|_2^2 + \nu^{(i),t\top}\beta + \frac{s}{2}\|\beta-\alpha^{t}\|_2^2

\alpha^{t+1} \leftarrow \arg\min_{\alpha} \sum_{i=1}^{N} \nu^{(i),t\top}(\beta^{(i),t+1}-\alpha) + \frac{s}{2}\|\beta^{(i),t+1}-\alpha\|_2^2

\nu^{(i),t+1} \leftarrow \nu^{(i),t} + s(\beta^{(i),t+1}-\alpha^{t+1})
Slide 6 (1. ADMM Continued): Solving Linear Regression with ADMM

1st step:

\beta^{t+1} \leftarrow \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2 + \nu^{t\top}\beta + \frac{s}{2}\|\beta-\alpha^{t}\|_2^2

Setting the gradient with respect to \beta to zero:

\nabla_{\beta}\left(\|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2 + \nu^{t\top}\beta + \frac{s}{2}\|\beta-\alpha^{t}\|_2^2\right) = 0
-2X^{\top}(y - X\hat{\beta}) + 2\lambda\hat{\beta} + \nu^{t} + s(\hat{\beta} - \alpha^{t}) = 0
2X^{\top}X\hat{\beta} - 2X^{\top}y + 2\lambda\hat{\beta} + \nu^{t} + s\hat{\beta} - s\alpha^{t} = 0
(2X^{\top}X + (2\lambda + s)I)\hat{\beta} = 2X^{\top}y - \nu^{t} + s\alpha^{t}
\hat{\beta} = (2X^{\top}X + (2\lambda + s)I)^{-1}(2X^{\top}y - \nu^{t} + s\alpha^{t})
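As a sanity check, the closed-form update above can be verified numerically. The snippet below (random data; the values of lam, s, nu, and alpha are arbitrary illustrations) confirms that the solved beta-hat zeroes the gradient of the local objective:

```python
import numpy as np

# Verify that the closed-form solution of the beta step zeroes the gradient
# of  ||y - X b||^2 + lam ||b||^2 + nu^T b + (s/2) ||b - alpha||^2.
rng = np.random.default_rng(0)
n, p = 20, 5
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
lam, s = 0.1, 1.0
nu = rng.normal(size=p)
alpha = rng.normal(size=p)

# (2 X^T X + (2 lam + s) I) beta = 2 X^T y - nu + s alpha
A = 2 * X.T @ X + (2 * lam + s) * np.eye(p)
beta = np.linalg.solve(A, 2 * X.T @ y - nu + s * alpha)

# Gradient of the local objective at beta should vanish
grad = -2 * X.T @ (y - X @ beta) + 2 * lam * beta + nu + s * (beta - alpha)
print(bool(np.allclose(grad, 0)))  # True
```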
Slide 7 (1. ADMM Continued): Solving Linear Regression with ADMM

2nd step:

\alpha^{t+1} \leftarrow \arg\min_{\alpha} \sum_{i=1}^{N} \nu^{(i),t\top}(\beta^{(i),t+1}-\alpha) + \frac{s}{2}\|\beta^{(i),t+1}-\alpha\|_2^2

Setting the gradient with respect to \alpha to zero:

\nabla_{\alpha}\left(\sum_{i=1}^{N} \nu^{(i),t\top}(\beta^{(i),t+1}-\alpha) + \frac{s}{2}\|\beta^{(i),t+1}-\alpha\|_2^2\right) = 0
-\sum_{i=1}^{N} \nu^{(i),t} - \sum_{i=1}^{N} s(\beta^{(i),t+1}-\alpha) = 0
Ns\alpha = \sum_{i=1}^{N} \nu^{(i),t} + s\beta^{(i),t+1}
\alpha = \frac{\sum_{i=1}^{N} \nu^{(i),t} + s\beta^{(i),t+1}}{Ns}
Slide 8 (1. ADMM Continued): Solving Linear Regression with ADMM

Loss function:

\|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2 = \sum_{i=1}^{N} \|y^{(i)} - X^{(i)}\beta^{(i)}\|_2^2 + \lambda\|\beta^{(i)}\|_2^2

The ADMM algorithm iteratively performs the following steps:

\beta^{(i),t+1} \leftarrow (2X^{(i)\top}X^{(i)} + (2\lambda + s)I)^{-1}(2X^{(i)\top}y^{(i)} - \nu^{(i),t} + s\alpha^{t})

\alpha^{t+1} \leftarrow \frac{\sum_{i=1}^{N} \nu^{(i),t} + s\beta^{(i),t+1}}{Ns}

\nu^{(i),t+1} \leftarrow \nu^{(i),t} + s(\beta^{(i),t+1} - \alpha^{t+1})
Slide 9 (1. ADMM Continued): Solving Linear Regression with ADMM

Now assume we initialize \nu^{(i),0} = 0. Our first \alpha update step will be:

\alpha^{1} \leftarrow \frac{\sum_{i=1}^{N} \nu^{(i),0} + s\beta^{(i),1}}{Ns} = \frac{\sum_{i=1}^{N} 0 + s\beta^{(i),1}}{Ns} = \frac{s\sum_{i=1}^{N} \beta^{(i),1}}{Ns} = \frac{\sum_{i=1}^{N} \beta^{(i),1}}{N}
Slide 10 (1. ADMM Continued): Solving Linear Regression with ADMM

If we have \alpha^{1} = \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),1}, then the \nu update will be:

\nu^{(i),1} \leftarrow \nu^{(i),0} + s(\beta^{(i),1} - \alpha^{1}) = s\left(\beta^{(i),1} - \frac{1}{N}\sum_{j=1}^{N}\beta^{(j),1}\right)
Slide 11 (1. ADMM Continued): Solving Linear Regression with ADMM

The next \alpha update step will be:

\alpha^{2} \leftarrow \frac{\sum_{i=1}^{N} \nu^{(i),1}}{Ns} + \frac{\sum_{i=1}^{N} s\beta^{(i),2}}{Ns}
= \frac{1}{Ns}\sum_{i=1}^{N} s\left(\beta^{(i),1} - \frac{1}{N}\sum_{j=1}^{N}\beta^{(j),1}\right) + \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),2}
= \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),1} - \frac{1}{N}\sum_{j=1}^{N}\beta^{(j),1} + \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),2}
= \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),2}
Slide 12 (1. ADMM Continued): Solving Linear Regression with ADMM

From this it follows that, given \nu^{(i),0} = 0, the algorithm can be further simplified to iteratively perform the following steps:

\beta^{(i),t+1} \leftarrow (2X^{(i)\top}X^{(i)} + (2\lambda + s)I)^{-1}(2X^{(i)\top}y^{(i)} - \nu^{(i),t} + s\alpha^{t})

\alpha^{t+1} \leftarrow \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),t+1}

\nu^{(i),t+1} \leftarrow \nu^{(i),t} + s(\beta^{(i),t+1} - \alpha^{t+1})
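The simplified updates can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's implementation: the data, the number of workers N, and the constants lam, s, T are all made up.

```python
import numpy as np

# Consensus ADMM for ridge regression split across N workers,
# using the simplified updates (nu^(i),0 = 0, alpha = mean of betas).
rng = np.random.default_rng(1)
N, n, p = 4, 30, 5                      # workers, rows per worker, features
Xs = [rng.normal(size=(n, p)) for _ in range(N)]
ys = [rng.normal(size=n) for _ in range(N)]
lam, s, T = 0.1, 1.0, 500

alpha = np.zeros(p)
nus = [np.zeros(p) for _ in range(N)]   # nu^(i),0 = 0
betas = [np.zeros(p) for _ in range(N)]

for t in range(T):
    for i in range(N):                  # local beta updates (parallelizable)
        A = 2 * Xs[i].T @ Xs[i] + (2 * lam + s) * np.eye(p)
        betas[i] = np.linalg.solve(A, 2 * Xs[i].T @ ys[i] - nus[i] + s * alpha)
    alpha = np.mean(betas, axis=0)      # consensus step: average the betas
    for i in range(N):                  # dual updates
        nus[i] = nus[i] + s * (betas[i] - alpha)

# The sum of the local objectives equals ||y - X b||^2 + N*lam*||b||^2 on the
# pooled data, so alpha should approach that global ridge solution.
X, y = np.vstack(Xs), np.concatenate(ys)
beta_global = np.linalg.solve(X.T @ X + N * lam * np.eye(p), X.T @ y)
print(bool(np.allclose(alpha, beta_global, atol=1e-3)))
```

Note the regularizer of the pooled problem picks up a factor of N, since each of the N local objectives carries its own \lambda\|\beta^{(i)}\|_2^2 term.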
Slide 13 (1. ADMM Continued): Solving Linear Regression with ADMM

Finally, notice that the \alpha and \nu update steps do not depend directly on the loss function. This means the ADMM algorithm can be generalized with the simplified updates:

\beta^{(i),t+1} \leftarrow \arg\min_{\beta} \sum_{(x,y)\in D_i^{train}} l(y,\hat{y}(x;\beta)) + R(\beta) + \nu^{(i),t\top}\beta + \frac{s}{2}\|\beta - \alpha^{t}\|_2^2

\alpha^{t+1} \leftarrow \frac{1}{N}\sum_{i=1}^{N}\beta^{(i),t+1}

\nu^{(i),t+1} \leftarrow \nu^{(i),t} + s(\beta^{(i),t+1} - \alpha^{t+1})
Slide 14 (1. ADMM Continued): Year Prediction Data Set

A least squares problem: prediction of the release year of a song from audio features.
90 features. Experiments were done on a subset of 1000 instances of the data.
Slide 15 (1. ADMM Continued): ADMM on the Year Prediction Dataset

[Figure: runtime in seconds vs. number of workers, comparing ADMM against the closed-form solution]
Slide 16: Outline, section 2. Recommender Systems
Slide 17: 2. Recommender Systems
Slide 18 (2. Recommender Systems): Why Recommender Systems?

A powerful method for enabling users to filter large amounts of information.
Personalized recommendations can boost the revenue of an e-commerce system:
- Amazon's recommender systems
- Netflix challenge: 1 million dollars for improving their system by 10%
Different applications: human-computer interaction, e-commerce, education, ...
Slide 19 (2. Recommender Systems): Why Personalization? The Long Tail

[Figure: the long-tail distribution of item popularity; source link not preserved in the transcription]
Slide 20 (2. Recommender Systems): Rating Prediction

Given the previously rated items, how will the user evaluate other items?
Slide 21 (2. Recommender Systems): Item Prediction

Which items will be consumed next by a user?
Slide 22 (2. Recommender Systems): Formalization

U: set of users
I: set of items
Ratings data D \subseteq U \times I \times \mathbb{R}
Rating data D are typically represented as a sparse users \times items matrix R \in \mathbb{R}^{|U| \times |I|}.
Slide 23 (2. Recommender Systems): Example

            Titanic (t)  Matrix (m)  The Godfather (g)  Once (o)
Alice (a)        4                          2               5
Bob (b)                      4              3
John (j)                     4                              3

Users U := {Alice, Bob, John}
Items I := {Titanic, Matrix, The Godfather, Once}
Ratings data D := {(Alice, Titanic, 4), (Bob, Matrix, 4), ...}
Slide 24 (2. Recommender Systems): Some Definitions

Some useful definitions:
- N(u) is the set of all items rated by user u, e.g. N(Alice) := {Titanic, The Godfather, Once}
- N(i) is the set of all users that rated item i, e.g. N(Once) := {Alice, John}
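The neighbor sets N(u) and N(i) defined above are easy to compute directly from the ratings triples. A small sketch on the toy example (the helper names `items_of` and `users_of` are illustrative):

```python
# Ratings from the toy example, as (user, item, rating) triples.
D = [("Alice", "Titanic", 4), ("Alice", "The Godfather", 2), ("Alice", "Once", 5),
     ("Bob", "Matrix", 4), ("Bob", "The Godfather", 3),
     ("John", "Matrix", 4), ("John", "Once", 3)]

def items_of(u):            # N(u): items rated by user u
    return {i for (v, i, r) in D if v == u}

def users_of(i):            # N(i): users who rated item i
    return {u for (u, j, r) in D if j == i}

print(sorted(items_of("Alice")))  # ['Once', 'The Godfather', 'Titanic']
print(sorted(users_of("Once")))   # ['Alice', 'John']
```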
Slide 25 (2. Recommender Systems): Task

Given a set of users U, items I, and training data D^{train} \subseteq U \times I \times \mathbb{R}, find a function

\hat{r}: U \times I \to \mathbb{R}

such that some error is minimal:

error(\hat{r}, D^{test}) := \sum_{(u,i,r_{ui}) \in D^{test}} l(r_{ui}, \hat{r}(u,i))
Slide 26: Outline, section 3. Traditional Recommendation Approaches
Slide 27 (3. Traditional Recommendation Approaches): Recommender Systems Approaches

Most recommender system approaches can be classified into:
- Content-based filtering: recommends items similar to the items liked by a user, using textual similarity in metadata
- Collaborative filtering: recommends items liked by users with similar behavior
We will focus on collaborative filtering!
Slide 28 (3.1. Nearest Neighbor Approaches): Nearest Neighbor Approaches

Nearest neighbor approaches build on the concept of similarity between users and/or items:
- The neighborhood N_u of a user u is the set of the k users most similar to u.
- Analogously, the neighborhood N_i of an item i is the set of the k items most similar to i.
There are two main neighborhood-based approaches:
- User-based: the rating of an item by a user is computed based on how similar users have rated the same item.
- Item-based: the rating of an item by a user is computed based on how similar items have been rated by the same user.
Slide 29 (3.1. Nearest Neighbor Approaches): User-Based Recommender

A user u \in U is represented as a vector u \in \mathbb{R}^{|I|} containing the user's ratings (0 for unrated items).

Examples, from the ratings table of the running example:
a := [4, 0, 2, 5]    b := [0, 4, 3, 0]    j := [0, 4, 0, 3]
Slide 30 (3.1. Nearest Neighbor Approaches): User-Based Recommender, Prediction Function

\hat{r}(u,i) := \bar{r}_u + \frac{\sum_{v \in N_u} sim(u,v)\,(r_{vi} - \bar{r}_v)}{\sum_{v \in N_u} sim(u,v)}

where:
- \bar{r}_u is the average rating of user u
- sim is a similarity function, also used to compute the neighborhood N_u
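The prediction function above can be sketched on the toy ratings. This is an illustration only: cosine similarity over 0-filled rating vectors and the neighborhood size k are assumptions, not prescribed by the slide.

```python
import math

# Toy ratings from the running example.
R = {"Alice": {"Titanic": 4, "The Godfather": 2, "Once": 5},
     "Bob": {"Matrix": 4, "The Godfather": 3},
     "John": {"Matrix": 4, "Once": 3}}
items = ["Titanic", "Matrix", "The Godfather", "Once"]

def mean_rating(u):
    return sum(R[u].values()) / len(R[u])

def cos_sim(u, v):
    vu = [R[u].get(i, 0) for i in items]
    vv = [R[v].get(i, 0) for i in items]
    num = sum(a * b for a, b in zip(vu, vv))
    den = math.sqrt(sum(a * a for a in vu)) * math.sqrt(sum(b * b for b in vv))
    return num / den

def predict(u, i, k=2):
    # N_u: the k users most similar to u among those who rated i
    cands = [v for v in R if v != u and i in R[v]]
    nbrs = sorted(cands, key=lambda v: cos_sim(u, v), reverse=True)[:k]
    num = sum(cos_sim(u, v) * (R[v][i] - mean_rating(v)) for v in nbrs)
    den = sum(cos_sim(u, v) for v in nbrs)
    if den == 0:
        return mean_rating(u)
    return mean_rating(u) + num / den

print(round(predict("Alice", "Matrix"), 2))  # 4.17
```

Both neighbors rated Matrix 0.5 above their own mean, so the prediction is Alice's mean (11/3) plus 0.5.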
Slide 31 (3.1. Nearest Neighbor Approaches): Item-Based Recommender

An item i \in I is represented as a vector i \in \mathbb{R}^{|U|} containing information on how the item is rated by users (0 for unrated).

Examples, from the ratings table of the running example:
t := [4, 0, 0]    m := [0, 4, 4]    g := [2, 3, 0]    o := [5, 0, 3]
Slide 32 (3.1. Nearest Neighbor Approaches): Item-Based Recommender, Prediction Function

\hat{r}(u,i) := \bar{r}_i + \frac{\sum_{j \in N_i} sim(i,j)\,(r_{uj} - \bar{r}_j)}{\sum_{j \in N_i} sim(i,j)}

where:
- \bar{r}_i is the average rating of item i
- sim is a similarity function, also used to compute the neighborhood N_i
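Analogously to the user-based case, the item-based prediction can be sketched on the toy ratings. Cosine similarity over the item column vectors and the 0-means-unrated convention are assumptions of this example:

```python
import math

# Item column vectors (rows: Alice, Bob, John; 0 = unrated) and ratings.
cols = {"Titanic": [4, 0, 0], "Matrix": [0, 4, 4],
        "The Godfather": [2, 3, 0], "Once": [5, 0, 3]}
R = {"Alice": {"Titanic": 4, "The Godfather": 2, "Once": 5},
     "Bob": {"Matrix": 4, "The Godfather": 3},
     "John": {"Matrix": 4, "Once": 3}}

def item_mean(i):
    vals = [r for r in cols[i] if r != 0]   # average over observed ratings only
    return sum(vals) / len(vals)

def cos_sim(i, j):
    num = sum(a * b for a, b in zip(cols[i], cols[j]))
    den = (math.sqrt(sum(a * a for a in cols[i]))
           * math.sqrt(sum(b * b for b in cols[j])))
    return num / den

def predict(u, i, k=2):
    # N_i: the k items most similar to i among those u has rated
    nbrs = sorted((j for j in cols if j != i and j in R[u]),
                  key=lambda j: cos_sim(i, j), reverse=True)[:k]
    num = sum(cos_sim(i, j) * (R[u][j] - item_mean(j)) for j in nbrs)
    den = sum(cos_sim(i, j) for j in nbrs)
    if den == 0:
        return item_mean(i)
    return item_mean(i) + num / den

print(round(predict("Bob", "Once"), 2))  # 4.28
```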
Slide 33 (3.1. Nearest Neighbor Approaches): Similarity Measures

In both user-based and item-based recommenders, the similarity measure plays an important role:
- It is used to compute the neighborhoods of users and items (neighbors are the most similar ones).
- It is used during the prediction of the ratings.
Which similarity measure should we use?
Slide 34 (3.1. Nearest Neighbor Approaches): Similarity Measures

Commonly used similarity measures:

Cosine:

sim(u,v) = \cos(u,v) = \frac{u^{\top}v}{\|u\|_2 \|v\|_2}

Pearson correlation:

sim(u,v) = \frac{\sum_{i \in N(u) \cap N(v)} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in N(u) \cap N(v)} (r_{ui} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in N(u) \cap N(v)} (r_{vi} - \bar{r}_v)^2}}
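The Pearson formula above, which sums only over the co-rated items N(u) ∩ N(v), can be sketched directly on the toy ratings (the fallback value 0.0 for empty overlaps or zero deviations is an assumption of this example):

```python
import math

R = {"Alice": {"Titanic": 4, "The Godfather": 2, "Once": 5},
     "Bob": {"Matrix": 4, "The Godfather": 3},
     "John": {"Matrix": 4, "Once": 3}}

def pearson(u, v):
    common = set(R[u]) & set(R[v])          # N(u) ∩ N(v)
    if not common:
        return 0.0
    mu = sum(R[u].values()) / len(R[u])     # user means over all their ratings
    mv = sum(R[v].values()) / len(R[v])
    num = sum((R[u][i] - mu) * (R[v][i] - mv) for i in common)
    du = math.sqrt(sum((R[u][i] - mu) ** 2 for i in common))
    dv = math.sqrt(sum((R[v][i] - mv) ** 2 for i in common))
    return num / (du * dv) if du and dv else 0.0

print(round(pearson("Bob", "John"), 3))  # 1.0
```

Note that with a single co-rated item (as for Bob and John, who share only Matrix) the Pearson correlation degenerates to plus or minus 1, one reason small overlaps make neighborhood methods fragile.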
Slide 35 (3.2. Factorization Models): Why Factorization Models?

Neighborhood-based approaches have been shown to be effective, but computing and maintaining the neighborhoods is expensive.
In recent years, a number of models have been shown to outperform them.
One of the results of the Netflix Challenge was the power of factorization models when applied to recommender systems.
Slide 36: 3.2. Factorization Models
Slide 37 (3.2. Factorization Models): Partially Observed Matrices

The ratings matrix R is usually only partially observed:
- No user is able to rate all items.
- Most items are not rated by all users.
Can we estimate the factorization of a matrix from some observed entries in order to predict its unobserved part?
Slide 38 (3.2. Factorization Models): Factorization Models

Each item i \in I is associated with a latent feature vector q_i \in \mathbb{R}^k.
Each user u \in U is associated with a latent feature vector p_u \in \mathbb{R}^k.
Each entry of the original matrix can be estimated by

\hat{r}(u,i) = p_u^{\top} q_i = \sum_{f=1}^{k} p_{u,f}\, q_{i,f}
Slide 39 (3.2. Factorization Models): Example

[Figure: the ratings matrix R of the running example factorized as R \approx P Q^{\top}, with user factor matrix P and item factor matrix Q]
Slide 40 (3.2. Factorization Models): Latent Factors

Source: Yehuda Koren, Robert Bell, Chris Volinsky: Matrix Factorization Techniques for Recommender Systems, Computer, vol. 42, no. 8, pp. 30-37, August 2009
Slide 41 (3.2. Factorization Models): Learning a Factorization Model, Objective Function

Task:

\arg\min_{P,Q} \sum_{(u,i,r_{ui}) \in D^{train}} (r_{ui} - \hat{r}(u,i))^2 + \lambda(\|P\|^2 + \|Q\|^2)

where:
- \hat{r}(u,i) := p_u^{\top} q_i
- D^{train} is the training data
- \lambda is a regularization constant
Slide 42 (3.2. Factorization Models): Optimization Method

L := \sum_{(u,i,r_{ui}) \in D^{train}} (r_{ui} - \hat{r}(u,i))^2 + \lambda(\|P\|^2 + \|Q\|^2)

Stochastic gradient descent.
Conditions:
- The loss function should be decomposable into a sum of components.
- The loss function should be differentiable.
Procedure:
- Randomly draw one component of the sum.
- Update the parameters in the opposite direction of the gradient.
Slide 43 (3.2. Factorization Models): SGD Gradients

For a single component (u, i, r_{ui}) of

L := \sum_{(u,i,r_{ui}) \in D^{train}} (r_{ui} - \hat{r}(u,i))^2 + \lambda(\|P\|^2 + \|Q\|^2)

the gradients are:

\frac{\partial L}{\partial p_u} = -2(r_{ui} - \hat{r}(u,i))\, q_i + 2\lambda p_u

\frac{\partial L}{\partial q_i} = -2(r_{ui} - \hat{r}(u,i))\, p_u + 2\lambda q_i
Slide 44 (3.2. Factorization Models): Stochastic Gradient Descent

procedure LearnLatentFactors(D^{train}, \lambda, \alpha):
  draw (p_u)_{u \in U} ~ N(0, \sigma I)
  draw (q_i)_{i \in I} ~ N(0, \sigma I)
  repeat
    for (u, i, r_{ui}) \in D^{train}, in a random order:
      p_u \leftarrow p_u - \alpha(-2(r_{ui} - \hat{r}(u,i))\, q_i + 2\lambda p_u)
      q_i \leftarrow q_i - \alpha(-2(r_{ui} - \hat{r}(u,i))\, p_u + 2\lambda q_i)
  until convergence
  return P, Q
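The procedure above can be sketched as a short, runnable script on the toy ratings. The hyperparameters (k, alpha, lam, epochs) and the N(0, 0.1) initialization are illustrative choices, not the lecture's settings:

```python
import math
import random

random.seed(0)
D_train = [("Alice", "Titanic", 4), ("Alice", "The Godfather", 2),
           ("Alice", "Once", 5), ("Bob", "Matrix", 4),
           ("Bob", "The Godfather", 3), ("John", "Matrix", 4),
           ("John", "Once", 3)]
k, alpha, lam, epochs = 2, 0.05, 0.01, 500

# Initialize latent factors with small Gaussian noise.
P = {u: [random.gauss(0, 0.1) for _ in range(k)] for u, _, _ in D_train}
Q = {i: [random.gauss(0, 0.1) for _ in range(k)] for _, i, _ in D_train}

def rhat(u, i):
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

for _ in range(epochs):
    random.shuffle(D_train)                # draw components in random order
    for u, i, r in D_train:
        e = r - rhat(u, i)                 # prediction error for this rating
        for f in range(k):
            pu, qi = P[u][f], Q[i][f]
            P[u][f] = pu - alpha * (-2 * e * qi + 2 * lam * pu)
            Q[i][f] = qi - alpha * (-2 * e * pu + 2 * lam * qi)

train_rmse = math.sqrt(sum((r - rhat(u, i)) ** 2
                           for u, i, r in D_train) / len(D_train))
print(round(train_rmse, 3))
```

With only 7 observed ratings and rank k = 2, the model has enough capacity to drive the training RMSE close to zero; the regularizer keeps the factors from growing without bound.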
Slide 45 (3.2. Factorization Models): Factorization Models in Practice

Dataset: MovieLens 1M (ML1M)
- Users: 6040
- Movies: 3703
- Ratings: from 1 (worst) to 5 (best); the observed ratings cover approx. 4.5% of all possible (user, movie) pairs
Slide 46 (3.2. Factorization Models): Evaluation

Evaluation protocol:
- 10-fold cross validation
- Leave-one-out
Measure: RMSE (root mean squared error)

RMSE = \sqrt{\frac{\sum_{(u,i,r_{ui}) \in D^{test}} (r_{ui} - \hat{r}(u,i))^2}{|D^{test}|}}
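The RMSE formula above is a one-liner in code. The test triples and the constant baseline predictor below are made-up illustrations:

```python
import math

def rmse(test_data, predict):
    """RMSE of a predictor over (user, item, rating) test triples."""
    se = sum((r - predict(u, i)) ** 2 for u, i, r in test_data)
    return math.sqrt(se / len(test_data))

test_data = [("Alice", "Matrix", 4), ("John", "Titanic", 3)]
baseline = lambda u, i: 3.5             # predict a constant rating
print(rmse(test_data, baseline))        # sqrt((0.5^2 + 0.5^2) / 2) = 0.5
```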
Slide 47 (3.2. Factorization Models): SGD for Factorization Models, Performance over Epochs

[Figure: RMSE on train and on test vs. epoch]
Slide 48 (3.2. Factorization Models): Impact of the Number of Latent Features

[Figure: RMSE on MovieLens 1M vs. number of latent features k]
Slide 49 (3.2. Factorization Models): Effect of Regularization

[Figure: RMSE vs. epoch for a regularized and an unregularized model]
More informationLinear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S
Linear smoother ŷ = S y where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S 2 Online Learning: LMS and Perceptrons Partially adapted from slides by Ryan Gabbard
More informationBig Data Analytics. Lucas Rego Drumond
Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 33 Outline
More informationMLlib: Scalable Machine Learning on Spark
MLlib: Scalable Machine Learning on Spark Xiangrui Meng Collaborators: Ameet Talwalkar, Evan Sparks, Virginia Smith, Xinghao Pan, Shivaram Venkataraman, Matei Zaharia, Rean Griffith, John Duchi, Joseph
More informationMammoth Scale Machine Learning!
Mammoth Scale Machine Learning! Speaker: Robin Anil, Apache Mahout PMC Member! OSCON"10! Portland, OR! July 2010! Quick Show of Hands!# Are you fascinated about ML?!# Have you used ML?!# Do you have Gigabytes
More informationMachine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
More informationBig Data Analytics. Lucas Rego Drumond
Big Data Analytics Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Apache Spark Apache Spark 1
More informationCS 207  Data Science and Visualization Spring 2016
CS 207  Data Science and Visualization Spring 2016 Professor: Sorelle Friedler sorelle@cs.haverford.edu An introduction to techniques for the automated and humanassisted analysis of data sets. These
More informationSimple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRALab,
More informationHow I won the Chess Ratings: Elo vs the rest of the world Competition
How I won the Chess Ratings: Elo vs the rest of the world Competition Yannis Sismanis November 2010 Abstract This article discusses in detail the rating system that won the kaggle competition Chess Ratings:
More informationDATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDDLAB ISTI CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
More informationANALYSIS, THEORY AND DESIGN OF LOGISTIC REGRESSION CLASSIFIERS USED FOR VERY LARGE SCALE DATA MINING
ANALYSIS, THEORY AND DESIGN OF LOGISTIC REGRESSION CLASSIFIERS USED FOR VERY LARGE SCALE DATA MINING BY OMID ROUHANIKALLEH THESIS Submitted as partial fulfillment of the requirements for the degree of
More informationScalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights
Seventh IEEE International Conference on Data Mining Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights Robert M. Bell and Yehuda Koren AT&T Labs Research 180 Park
More informationBig Data Analytics. Lucas Rego Drumond
Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 21 Outline
More informationFace Recognition using SIFT Features
Face Recognition using SIFT Features Mohamed Aly CNS186 Term Project Winter 2006 Abstract Face recognition has many important practical applications, like surveillance and access control.
More informationLecture 8 February 4
ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationIntroduction to Logistic Regression
OpenStaxCNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStaxCNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction
More informationTopic 3 Correlation and Regression
Topic 3 Correlation and Regression Linear Regression I 1 / 15 Outline Principle of Least Squares Regression Equations Residuals 2 / 15 Introduction Covariance and correlation are measures of linear association.
More informationJournée Thématique Big Data 13/03/2015
Journée Thématique Big Data 13/03/2015 1 Agenda About Flaminem What Do We Want To Predict? What Is The Machine Learning Theory Behind It? How Does It Work In Practice? What Is Happening When Data Gets
More informationAdvances in Collaborative Filtering
Advances in Collaborative Filtering Yehuda Koren and Robert Bell Abstract The collaborative filtering (CF) approach to recommenders has recently enjoyed much interest and progress. The fact that it played
More informationData Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
More informationBig Data & Scripting Part II Streaming Algorithms
Big Data & Scripting Part II Streaming Algorithms 1, 2, a note on sampling and filtering sampling: (randomly) choose a representative subset filtering: given some criterion (e.g. membership in a set),
More informationProgramming Exercise 3: Multiclass Classification and Neural Networks
Programming Exercise 3: Multiclass Classification and Neural Networks Machine Learning November 4, 2011 Introduction In this exercise, you will implement onevsall logistic regression and neural networks
More informationA Preliminary Study on a Recommender System for the Million Songs Dataset Challenge
A Preliminary Study on a Recommender System for the Million Songs Dataset Challenge Fabio Aiolli University of Padova, Italy, email: aiolli@math.unipd.it Abstract. In this paper, the preliminary study
More informationLogistic Regression for Spam Filtering
Logistic Regression for Spam Filtering Nikhila Arkalgud February 14, 28 Abstract The goal of the spam filtering problem is to identify an email as a spam or not spam. One of the classic techniques used
More informationCross Validation. Dr. Thomas Jensen Expedia.com
Cross Validation Dr. Thomas Jensen Expedia.com About Me PhD from ETH Used to be a statistician at Link, now Senior Business Analyst at Expedia Manage a database with 720,000 Hotels that are not on contract
More informationDistributed R for Big Data
Distributed R for Big Data Indrajit Roy, HP Labs November 2013 Team: Shivara m Erik Kyungyon g Alvin Rob Vanish A Big Data story Once upon a time, a customer in distress had. 2+ billion rows of financial
More informationGraphbased ModelSelection Framework for Large Ensembles
Graphbased ModelSelection Framework for Large Ensembles Krisztian Buza, Alexandros Nanopoulos and Lars SchmidtThieme Information Systems and Machine Learning Lab (ISMLL) Samelsonplatz 1, University
More informationClustering. Adrian Groza. Department of Computer Science Technical University of ClujNapoca
Clustering Adrian Groza Department of Computer Science Technical University of ClujNapoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 Kmeans 3 Hierarchical Clustering What is Datamining?
More informationBUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE
BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE Alex Lin Senior Architect Intelligent Mining alin@intelligentmining.com Outline Predictive modeling methodology knearest Neighbor
More informationNeural Networks. CAP5610 Machine Learning Instructor: GuoJun Qi
Neural Networks CAP5610 Machine Learning Instructor: GuoJun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector
More informationMaking Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research
More informationMachine Learning over Big Data
Machine Learning over Big Presented by Fuhao Zou fuhao@hust.edu.cn Jue 16, 2014 Huazhong University of Science and Technology Contents 1 2 3 4 Role of Machine learning Challenge of Big Analysis Distributed
More informationMasters Courses Recommendation: Exploring Collaborative Filtering and Singular Value Decomposition with Student Profiling
Masters Courses Recommendation: Exploring Collaborative Filtering and Singular Value Decomposition with Student Profiling Fábio Carballo fabio.carballo@tecnico.ulisboa.pt Instituto Superior Técnico, Universidade
More informationCrossvalidation for detecting and preventing overfitting
Crossvalidation for detecting and preventing overfitting Note to other teachers and users of these slides. Andrew would be delighted if ou found this source material useful in giving our own lectures.
More informationBig Data Analytics Using Neural networks
San José State University SJSU ScholarWorks Master's Projects Master's Theses and Graduate Research 412014 Big Data Analytics Using Neural networks Follow this and additional works at: http://scholarworks.sjsu.edu/etd_projects
More information