Mining Social Networks for Recommendation. Mohsen Jamali & Martin Ester, Simon Fraser University. Tutorial at ICDM 2011, December 12th, 2011


1 Mining Social Networks for Recommendation. Mohsen Jamali & Martin Ester, Simon Fraser University. Tutorial at ICDM 2011, December 12th, 2011

2 Introduction: Flood of information. Conventional (industrial / mass) media. Social media. Jamali & Ester: Mining Social Networks for Recommendation, Tutorial at ICDM 2011

3 Outline: Introduction; Recommender systems; Recommendation in social networks; Mining social networks; Memory based approaches; Model based approaches; Link prediction; Social networks with distrust; Summary.

4 Recommender Systems: Web search.

5 Recommender Systems: Search. Input: query keywords. Output: ranked list of results. The user needs to know what he is looking for, but content changes and keywords change. Every user gets the same result, but users have diverse interests (e.g., student, software developer, politician, ...).

6 Recommender Systems: Users want personalized results, but are not willing to spend a lot of time specifying their personal information needs. Recommender systems automatically identify information relevant for a given user, learning from available data. Data: user actions, user profiles.

7 Recommender Systems: Rating Matrix. [Figure: rating matrix of users (Joe, John, Susan, Pal, Jean, Ben, Nathan) by items (Departed, Star Wars, Matrix, Hurt Locker, Titanic, Terminator), marking the target user, a similar user, the target item, and the unknown target rating "??".]

8 Recommendation Tasks. Rating prediction: predict the rating of the target user for the target item, e.g. predict Joe's rating for Titanic. Top-N recommendation: predict the top-N highest-rated items among the items not yet rated by the target user. Link recommendation (only if there is a social network): predict the top-N users to whom the target user is most likely to connect.

9 Applications: Yahoo! news recommendations. Recommendations of new articles for the Today box on Yahoo's home page. 9,000 recommendations per minute. Sophisticated personalization algorithm, based on demographic user attributes, the places users have visited on Yahoo in the past, and the stories they have already seen during that particular visit.

10 Applications: Yahoo! news recommendations (cont.). A team of editors prepares news packages; the algorithm ranks the packages for each user. Has increased the click-through rate by 270% since [year not preserved in the transcription]. Has helped editors get a better understanding of the interests of different user segments.

11 Applications: Facebook friend recommendations ("People you may know"). Based on mutual friends, work and education information, networks you're part of, contacts and many other factors. Since our formula is automatic, you might occasionally see people you don't know or don't want to be friends with. To remove them from view, just click the X next to their names.

12 Applications: Pandora music recommendation. Internet radio service Pandora.com. Music Genome Project: trained music analysts score each song based on hundreds of distinct musical characteristics. Recommend songs with similar scores. Recommend a sequence of songs.

13 Applications: Item-based collaborative filtering.

14 Applications: Netflix (movie recommendation). $1M prize for a 10% accuracy improvement.

15 Privacy Issues: Recommender systems use a lot of personal data: movies watched, current location, ... The more personal data is shared, the better (more personalized) the recommendations. This is a serious threat to data privacy. Users need to be able to make an informed choice. E.g., Google users can shut off personalization features by deleting their Web history.

16 The Filter Bubble: Users get less exposure to conflicting viewpoints and are isolated intellectually in their own informational bubble. E.g., Google results for "BP". User 1: investment news about British Petroleum. User 2: information about the Deepwater Horizon oil spill.

17 The Filter Bubble [Pariser2011]: "... creates the impression that our narrow self-interest is all that exists. ..." Google and Facebook are offering users "too much candy, and not enough carrots." Book reviewer Paul Boutin ran a similar experiment among people with differing search histories and obtained nearly identical search results. Harvard law professor Jonathan Zittrain: "the effects of search personalization have been light."

18 Collaborative Filtering: Set of items I, set of users U. Users rate items. No need for information about the content of items or the attributes of users. Users with similar ratings on some items are likely to have similar ratings on further items. Items which are rated similarly by some users are likely to have similar ratings by further users.

19 Collaborative Filtering. [Figure: target user, aggregator, prediction.]

20 Collaborative Filtering: Nearest neighbor-based approach [Resnick et al., 1994]: find users with a history of agreement (similar rating profiles), aggregate their ratings to predict the unknown rating. Issues: How to define user similarity? How many similar users? How to aggregate the ratings?

21 Collaborative Filtering: Notation.
- $r_{u,i}$: (observed) rating of user u for item i
- $\bar{r}_u$: mean rating of user u
- $\hat{r}_{u,i}$: predicted rating of user u for item i
- $N(u)$: set of users similar to user u (who have rated item i)
- $sim(u,v)$: similarity of users u and v
- $\kappa$: normalization factor

22 Collaborative Filtering: Different users use the rating scale differently: normalize ratings by the mean rating. The more similar a user v, the higher the weight of his rating. Rating prediction:
$$\hat{r}_{u,i} = \bar{r}_u + \kappa \sum_{v \in N(u)} sim(u,v)\,(r_{v,i} - \bar{r}_v)$$

23 Collaborative Filtering: How to define similarity of users? $I_{uv}$: set of items rated by both users u and v.
Pearson correlation coefficient:
$$sim(u,v) = \frac{\sum_{i \in I_{uv}} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in I_{uv}} (r_{v,i} - \bar{r}_v)^2}}$$
Cosine similarity:
$$sim(u,v) = \frac{\sum_{i \in I_{uv}} r_{u,i}\, r_{v,i}}{\sqrt{\sum_{i \in I_{uv}} r_{u,i}^2}\,\sqrt{\sum_{i \in I_{uv}} r_{v,i}^2}}$$
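The user-based prediction rule with Pearson similarity can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's implementation: the choices that N(u) contains every rater of the item with positive similarity, that κ = 1 / Σ|sim(u,v)|, and that user means are taken over all of a user's ratings are assumptions, and the names `pearson` and `predict_user_based` are hypothetical.

```python
import math

def pearson(ru, rv):
    """Pearson correlation of two users over the items both have rated (I_uv)."""
    common = set(ru) & set(rv)
    if not common:
        return 0.0
    mean_u = sum(ru.values()) / len(ru)  # assumption: mean over all ratings of u
    mean_v = sum(rv.values()) / len(rv)
    num = sum((ru[i] - mean_u) * (rv[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ru[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((rv[i] - mean_v) ** 2 for i in common))
    return num / (den_u * den_v) if den_u > 0 and den_v > 0 else 0.0

def predict_user_based(ratings, u, item):
    """Predict r_hat(u, item) = mean(u) + kappa * sum_v sim(u, v) * (r(v, item) - mean(v))."""
    ru = ratings[u]
    mean_u = sum(ru.values()) / len(ru)
    weighted, norm = 0.0, 0.0
    for v, rv in ratings.items():
        if v == u or item not in rv:
            continue
        s = pearson(ru, rv)
        if s <= 0:  # assumption: only positively correlated users form N(u)
            continue
        mean_v = sum(rv.values()) / len(rv)
        weighted += s * (rv[item] - mean_v)
        norm += abs(s)
    # kappa = 1 / norm; fall back to the user's mean if no neighbor is found
    return mean_u if norm == 0 else mean_u + weighted / norm

ratings = {
    "joe": {"a": 4, "b": 2},
    "ann": {"a": 5, "b": 3, "c": 5},
    "bob": {"a": 1, "b": 5, "c": 2},
}
print(predict_user_based(ratings, "joe", "c"))
```

Here only "ann" agrees with "joe", so the prediction is Joe's mean shifted by Ann's deviation from her own mean on item "c".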

24 Collaborative Filtering: So far: user-based CF. Item-based CF is the dual approach [Sarwar et al., 2001].
- $\bar{r}_i$: mean rating of item i
- $N(i)$: set of items (rated by user u) similar to item i
- $sim(i,j)$: similarity of items i and j
Rating prediction:
$$\hat{r}_{u,i} = \bar{r}_i + \kappa \sum_{j \in N(i)} sim(i,j)\,(r_{u,j} - \bar{r}_j)$$

25 Collaborative Filtering: So far: memory-based CF. Model-based CF: learn a model in the training phase, apply the model in the test phase to predict ratings. $R(u)$: set of ratings of user u. Rating scale [1..n]. Probabilistic model:
$$\hat{r}_{u,i} = \sum_{r=1}^{n} r \cdot P(r_{u,i} = r \mid R(u))$$
How to compute $P(r_{u,i} = r \mid R(u))$?

26 Collaborative Filtering: Idea: ratings of items depend on their location in a latent factor space [Koren et al., 2009].

27 Collaborative Filtering: Probabilistic matrix factorization [Salakhutdinov et al., 2007]. Assumption: observed ratings are governed by latent variables (factors).
- N: number of users
- M: number of items
- K: number of factors, K << M, K << N
- $U_u$: latent factor (vector) of user u
- $V_i$: latent factor (vector) of item i

28 Collaborative Filtering: Probabilistic matrix factorization (MF) (cont.).
$$P(r_{u,i} = r \mid U_u, V_i) = \mathcal{N}(r \mid U_u^T V_i, \sigma_R^2)$$
$\sigma_U, \sigma_V, \sigma_R$: normal priors. Assumption: item ratings are independent from each other. Indicator of observed ratings:
$$I^R_{u,i} = \begin{cases} 1, & \text{if } r_{u,i} \text{ is observed} \\ 0, & \text{otherwise} \end{cases}$$

29 Collaborative Filtering: MF (cont.). Parameter learning through maximum likelihood estimation. Equivalent to minimizing the error function.
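The error function itself is not preserved in this transcription. A common form for matrix factorization, assumed here, is the regularized squared error $E = \frac{1}{2}\sum_{u,i} I^R_{u,i}\,(r_{u,i} - U_u^T V_i)^2 + \frac{\lambda}{2}\left(\sum_u \|U_u\|^2 + \sum_i \|V_i\|^2\right)$. The sketch below minimizes it by stochastic gradient descent; the function names, the single shared λ, and all hyperparameter values are illustrative assumptions.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_mf(ratings, n_users, n_items, k=2, lam=0.05, lr=0.02, epochs=200, seed=0):
    """Learn latent factors U (users) and V (items) from (user, item, rating) triples."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:  # only observed ratings contribute (I_ui = 1)
            err = r - dot(U[u], V[i])
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # gradient step on the squared error plus L2 regularization
                U[u][f] += lr * (err * vf - lam * uf)
                V[i][f] += lr * (err * uf - lam * vf)
    return U, V

def squared_error(ratings, U, V):
    """Sum of squared errors over the observed ratings."""
    return sum((r - dot(U[u], V[i])) ** 2 for u, i, r in ratings)

data = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 1, 1), (2, 0, 1), (2, 1, 5)]
U, V = train_mf(data, n_users=3, n_items=2)
print(squared_error(data, U, V))
```

A predicted rating is then simply `dot(U[u], V[i])` for any user-item pair, including unobserved ones.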

30 Collaborative Filtering: MF (cont.). MF-based CF typically outperforms NN-based CF [Koren et al., 2009]. MF can naturally incorporate biases and additional data sources [Koren et al., 2009]. But latent factors are hard to interpret.

31 Content-based Recommenders: Set of items I, set of users U. Given user profiles, describing the users' tastes, preferences and needs. Given item profiles, characterizing the content of the items. Top-N recommendation by ranking items w.r.t. the similarity of item profiles and user profile [Balabanovic 1997].

32 Content-based Recommenders: Item profile: typically frequencies of k selected keywords.
- $f_{i,j}$: frequency of keyword i in item j
- $n_i$: number of items containing keyword i
Term frequency / inverse document frequency:
$$TF_{i,j} = \frac{f_{i,j}}{\max_z f_{z,j}}, \qquad IDF_i = \log\frac{M}{n_i}, \qquad w_{i,j} = TF_{i,j} \cdot IDF_i$$
Profile for item i:
$$content(i) = (w_{1,i}, \ldots, w_{k,i})$$
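As a small worked sketch of the TF-IDF item profile, the function below computes the weights from raw keyword counts. It assumes M is the number of items and uses the natural logarithm; the name `tfidf_profiles` and the toy counts are hypothetical.

```python
import math

def tfidf_profiles(freq):
    """freq[item][keyword] = raw frequency; returns (keywords, item -> weight vector)."""
    M = len(freq)  # assumption: M is the number of items
    keywords = sorted({kw for counts in freq.values() for kw in counts})
    # n[kw]: number of items containing keyword kw
    n = {kw: sum(1 for counts in freq.values() if counts.get(kw, 0) > 0) for kw in keywords}
    profiles = {}
    for item, counts in freq.items():
        max_f = max(counts.values(), default=0)
        profiles[item] = [
            (counts.get(kw, 0) / max_f) * math.log(M / n[kw]) if max_f else 0.0
            for kw in keywords
        ]
    return keywords, profiles

freq = {"d1": {"trust": 2, "movie": 1}, "d2": {"movie": 3}}
keywords, profiles = tfidf_profiles(freq)
print(keywords, profiles)
```

A keyword that occurs in every item ("movie" here) gets IDF 0 and so contributes nothing to any profile.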

33 Content-based Recommenders: User profile: typically importance or frequencies of keywords, e.g. an aggregation of the profiles of items liked by the user. Similarity of item i and user u:
$$sim(u,i) = \frac{\sum_{l=1}^{k} w_{l,u}\, w_{l,i}}{\sqrt{\sum_{l=1}^{k} w_{l,u}^2}\,\sqrt{\sum_{l=1}^{k} w_{l,i}^2}}$$

34 Hybrid Recommender Systems: Combine collaborative and content-based methods. Approach 1: combine separate recommenders. Combine the results, e.g. using linear combination or voting [Pazzani 1999]. Approach 2: add aspects of the content-based method to CF [Pazzani 1999]. E.g., use profiles to compute similarities between users or items.

35 Hybrid Recommender Systems: Approach 3: add aspects of CF to the content-based method [Soboroff et al., 1999]. E.g., perform dimensionality reduction on a group of content-based profiles. Approach 4: unified recommendation model. E.g., combine a topic model, i.e. Latent Dirichlet Allocation, with MF.

36 Hybrid Recommender Systems: Latent Dirichlet Allocation (LDA) [Blei et al., 2003]. Assumption: documents have a latent topic distribution, topics have word distributions. Graphical model:
- θ: (latent) topic distribution
- Z: (latent) topic
- W: (observed) word
- α: prior for topic distribution
- β: word distributions
- N: number of words per document
- M: number of documents

37 Hybrid Recommender Systems: Collaborative topic regression [Wang et al., 2011]. Idea: latent item factors (V) depend on the topic distribution (θ) of the item.

38 Performance Evaluation: Cross-validation on offline dataset. Withhold a subset of ratings (test set): $Test \subseteq U \times I$, i.e. $Test = \{(u,i), (v,j), \ldots\}$. Use the remaining ratings to train the recommender (training set). Compare the withheld ratings against the predicted ratings and compute an error measure. Standard evaluation in research.

39 Performance Evaluation: Cross-validation on offline dataset (cont.). Measures for rating prediction.
Mean absolute error:
$$MAE = \frac{1}{|Test|} \sum_{(u,i) \in Test} |\hat{r}_{u,i} - r_{u,i}|$$
Root mean square error:
$$RMSE = \sqrt{\frac{1}{|Test|} \sum_{(u,i) \in Test} (\hat{r}_{u,i} - r_{u,i})^2}$$
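Both error measures are easy to compute once the withheld and predicted ratings are keyed by (user, item). The following is a minimal sketch; the dictionary layout and the function names `mae` and `rmse` are assumptions made for illustration.

```python
import math

def mae(predicted, actual):
    """Mean absolute error over the (user, item) pairs of the test set."""
    return sum(abs(predicted[k] - actual[k]) for k in actual) / len(actual)

def rmse(predicted, actual):
    """Root mean square error over the (user, item) pairs of the test set."""
    return math.sqrt(sum((predicted[k] - actual[k]) ** 2 for k in actual) / len(actual))

actual = {("joe", "titanic"): 4, ("ann", "matrix"): 2}
predicted = {("joe", "titanic"): 3, ("ann", "matrix"): 4}
print(mae(predicted, actual), rmse(predicted, actual))
```

RMSE penalizes large individual errors more heavily than MAE, so it is never smaller than MAE on the same test set.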

40 Performance Evaluation: Cross-validation on offline dataset (cont.). Measures for top-N recommendation: recall (or coverage). TopN: set of the top-N recommendations (by the algorithm). TestTop: set of all elements of the test set that are among the top-N items for the user.
$$Recall = \frac{|TopN \cap TestTop|}{|TestTop|}$$
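The recall measure reduces to a set intersection. A minimal sketch, with the hypothetical name `recall_at_n` and the assumption that an empty TestTop yields recall 0:

```python
def recall_at_n(top_n, test_top):
    """|TopN ∩ TestTop| / |TestTop| for one user."""
    test_top = set(test_top)
    if not test_top:
        return 0.0  # assumption: define recall as 0 when there is nothing to recover
    return len(set(top_n) & test_top) / len(test_top)

print(recall_at_n(["a", "b", "c"], {"b", "d"}))
```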

41 Performance Evaluation: Limitations. Measures only the accuracy of recommendations. Does not measure other aspects such as diversity. Does not measure how recommendations change user behavior, which is the ultimate goal of a recommender!

42 Performance Evaluation: In industry. Want to evaluate user satisfaction and business profit. A/B testing in the online system. Evaluation measures: click-through rate, usage, return rate of customers, profit.

43 Challenges: Privacy preservation: how to make good recommendations without violating privacy concerns? Diversity of recommendations: recommenders tend to suggest more of the same. Explanation of recommendations: necessary to build trust in the recommender.

44 Challenges: Recommendations for cold-start users, i.e. users with very few ratings. Typically, ~50% of users are cold-start. CF fails, because there are no similar users (user-based CF) and no item ratings to aggregate (item-based CF). Recommendations for cold-start items, i.e. items with very few ratings. Typically, ~50% of items are cold-start. CF fails, because there are not enough raters for the item. Content-based method works.

45 Social Networks: Social network [Wasserman et al., 1994]. Used widely in the social and behavioral sciences, in economics, marketing, ... Directed or undirected graph. Nodes: actors. Edges: social relationships or interactions.

46 Social Networks: Different types of social relationships. Different types of interactions.

47 Social Networks: Explicit social network: relationships provided by users. Implicit social network: relationships inferred from user actions. [Figure: example networks, including a co-worker network.]

48 Social Networks: The formation and evolution of social networks is affected by many effects, including self-interest, social and resource exchange, balance, homophily, and proximity [Monge & Contractor 2003].

49 Trust Networks: Related concept: trust network [Golbeck 2005]. A trust network allows users to systematically document their trust relationships and to see which users have declared trust in another user. Connected users do not necessarily have a social relationship. Trust in a user may be based, e.g., on articles or reviews authored by that user.

50 Online Social Networks: Emergence of online social networks. Among the top websites [Alexa2011]: Facebook, Twitter, LinkedIn. Availability of very large datasets.

51 Social Rating Networks: Social rating network (SRN): a social network where users are associated with item ratings. Item ratings can be numeric [1..5] or Boolean (bookmark photo, like article, ...). Examples: Epinions, Flixster, last.fm, flickr, Digg. Social action: create a social relationship. Rating action: rate an item.

52 Social Rating Networks.

53 Effects in Social Rating Networks: Social influence: ratings are influenced by ratings of friends, i.e. friends are more likely to have similar ratings than strangers. Correlational influence: ratings are influenced by ratings of actors with similar ratings, i.e. if some ratings are similar, further ratings are more likely also to be similar.

54 Effects in Social Rating Networks: Selection (homophily): actors relate to actors with similar ratings, i.e. actors with similar ratings are more likely to become friends. Transitivity: actors relate to friends of their friends, i.e. actors are more likely to relate to indirect friends.

55 Recommendation in Social Networks: Benefits of social network-based recommendation:
- Exploits social influence, correlational influence, transitivity, selection.
- Can deal with cold-start users, as long as they are connected to the social network.
- Is more robust to fraud, in particular to profile attacks.

56 Recommendation in Social Networks: Challenges. Low probability of finding a rater at small network distance. Noisy ratings at large network distances. Social network data is very sensitive. Edges in online social networks are of greatly varying reliability / strength.

57 Mining Social Networks: Lots of research in various directions, e.g. community identification, maximization of social influence, etc. Here only mining methods relevant for recommendation: analysis of social influence, models of social rating networks, inference of social networks.

58 Influence and Correlation [Anagnostopoulos et al., 2008]: Goal: does a SN exhibit social influence? Discrete time period [0..T]; consider only one action, e.g. using a certain tag. At every time step, each user flips a coin to decide whether he will get active. The probability of activation depends only on the number a of already active friends:
$$p(a) = \frac{e^{\alpha \ln(a+1) + \beta}}{1 + e^{\alpha \ln(a+1) + \beta}}$$

59 Influence and Correlation: α measures social correlation.
- $Y_{a,t}$: number of users with a active friends at time t-1 who get activated at time t
- $N_{a,t}$: number of users with a active friends at time t-1 who do not get activated at time t
- $Y_a = \sum_t Y_{a,t}$, $N_a = \sum_t N_{a,t}$
Compute α and β that maximize the data likelihood
$$\prod_a p(a)^{Y_a}\,(1 - p(a))^{N_a}$$
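The activation model and its likelihood can be written down directly. The sketch below evaluates the logistic activation probability and the log of the data likelihood for given α and β; the dictionaries `Y` and `N` (keyed by the number a of active friends) and the function names are assumptions made for illustration, and the maximization over α and β is left to any standard optimizer.

```python
import math

def p_active(a, alpha, beta):
    """Activation probability given a already-active friends (logistic in ln(a + 1))."""
    z = alpha * math.log(a + 1) + beta
    return 1.0 / (1.0 + math.exp(-z))  # same value as e^z / (1 + e^z)

def log_likelihood(Y, N, alpha, beta):
    """log of prod_a p(a)^Y_a * (1 - p(a))^N_a."""
    total = 0.0
    for a in set(Y) | set(N):
        p = p_active(a, alpha, beta)
        total += Y.get(a, 0) * math.log(p) + N.get(a, 0) * math.log(1.0 - p)
    return total

Y = {0: 1, 2: 3}  # hypothetical counts of users who became active
N = {0: 9, 2: 2}  # hypothetical counts of users who stayed inactive
print(log_likelihood(Y, N, alpha=1.0, beta=-2.0))
```

For α > 0 the probability grows with the number of active friends, which is what makes α a measure of social correlation.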

60 Influence and Correlation: If social influence does not play a role, then the timing of activation should be independent of the timing of activation of other users. $W = \{w_1, \ldots, w_l\}$: set of active users at time T. $t_i$: activation time of user i. Shuffle test: perform a random permutation π of {1, ..., l}. Set the activation time of user i to $t'_i := t_{\pi(i)}$.
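The shuffle step itself is a random permutation of the activation times among the active users. A minimal sketch, assuming activation times are held in a dictionary keyed by user (the name `shuffle_activation_times` is hypothetical):

```python
import random

def shuffle_activation_times(times, seed=None):
    """Reassign the observed activation times to the active users at random."""
    rng = random.Random(seed)
    users = list(times)
    shuffled = [times[u] for u in users]
    rng.shuffle(shuffled)  # random permutation of the activation times
    return dict(zip(users, shuffled))

times = {"w1": 1, "w2": 5, "w3": 9}
print(shuffle_activation_times(times, seed=1))
```

The set of active users and the multiset of activation times stay the same; only the assignment of times to users changes, which destroys any timing dependence between friends.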

61 Influence and Correlation: Compute α for the original activation times. Compute α' for the shuffled activation times. If α and α' are close to each other, then the model exhibits no social influence. [Figure: α for original activation times vs. α' for shuffled activation times on the Flickr dataset.] Social correlation, but no social influence!

62 Feedback Effects Between Similarity and Social Influence [Crandall et al., 2008]: Goal: characterize how social influence and selection work together to affect users' actions and interactions. Wikipedia dataset. Actions: editing an article. Interactions: editing the discussion page of another user.

63 Feedback Effects Between Similarity and Social Influence: How does the similarity of two users vary around the time of their first interaction? Sharp increase in similarity immediately before the first interaction. Continuing but slower increase after the first interaction.

64 Feedback Effects Between Similarity and Social Influence: Generative model for the social network. Users are associated with the history of their actions and the corresponding time stamps. Options for generating the next action for user u: sample from u's own history; sample from the history of a friend of u; sample from the history of any user; perform a new action.

65 Feedback Effects Between Similarity and Social Influence: Options for generating the next interaction for user u: sample from users with a similar history of actions (use the weighted Jaccard coefficient to measure similarity); sample a random user to interact with. Estimate the model parameters from data: some parameters can be observed, others are estimated by maximum likelihood estimation.

66 Modeling Social Rating Networks [Jamali et al., 2011]: Goal: a generative model considering all four effects between social actions and rating actions, i.e. social influence, selection, transitivity and correlational influence. What about other effects, e.g. demographic attributes, location? The corresponding data is not observed; it is modeled as a random background effect.

67 Modeling Social Rating Networks.

68 Modeling Social Rating Networks: Generation of actions similar to [Crandall et al., 2008], but also with transitivity and correlational influence. Temporal dynamics of effects, e.g. new user. [Figure: Epinions, Flickr.]

69 Modeling Social Rating Networks: $a_k$: k-th action, ordered by timestamps. $S_t$: state of the SRN at time t. $\Theta$: set of model parameters. Data likelihood. Likelihood of social action.

70 Modeling Social Rating Networks: Likelihood of rating action. Parameter learning: maximum likelihood estimation, EM algorithm.

71 Modeling Social Rating Networks: Experimental results. $\Phi_{t,1} = 0.91$ in Epinions and $\Phi_{t,1} = 0.9$ in Flickr: transitivity is the strongest effect for social actions. $\Phi_{r,1} = 0.59$ in Epinions and $\Phi_{r,1} = 0.54$ in Flickr: social influence is the strongest effect for rating actions. Comparison partners include CrossModel, similar to [Crandall et al., 2008], and SocialOnly, similar to [Leskovec et al., 2008].

72 Modeling Social Rating Networks: Growth of similarity after creation of a social relationship. [Figure: Flickr.]

73 Modeling Social Rating Networks: Average network distance of users before creation of a social relationship. [Figure: Epinions, Flickr.]

74 Inferring Social Relationships and Their Strength: So far: (Boolean) social network given. Sometimes there is no information about social relationships, only user actions. Inference of a social network from user actions [Gomez-Rodriguez et al., 2010]. Inference of a weighted social network [Myers et al., 2010].

75 Inferring Social Networks [Gomez-Rodriguez et al., 2010]: Goal: infer social relationships from user actions with time stamps. Assumption: there is a latent, static network over which influence propagates. $t_u$: activation time of user u, i.e. the time when user u gets activated ("infected") by a cascade. A cascade c is specified through the activation times of all users: $c = [t_1, \ldots, t_n]$, possibly $t_i = \infty$.

76 Inferring Social Networks: Independent Cascade model: an activated node activates each of his friends with a given probability. $P_c(u,v)$: probability of cascade c spreading from user u to user v. With $\Delta = t_v - t_u$, $P_c(u,v)$ decreases with increasing $\Delta$:
$$P_c(u,v) \propto \frac{1}{\Delta^{\alpha}} \quad \text{or} \quad P_c(u,v) \propto e^{-\Delta/\alpha}$$
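To illustrate how the spreading probability decays with the time difference between two activations, the sketch below evaluates unnormalized power-law and exponential weights. The functional forms $1/\Delta^{\alpha}$ and $e^{-\Delta/\alpha}$ are an assumption about the models used in [Gomez-Rodriguez et al., 2010], and the name `transmission_weight` and the model labels are hypothetical.

```python
import math

def transmission_weight(t_u, t_v, alpha, model="exp"):
    """Unnormalized weight of cascade c spreading from u (active at t_u) to v (active at t_v)."""
    delta = t_v - t_u
    if delta <= 0:
        return 0.0  # v cannot be activated by a node that became active later
    if model == "exp":
        return math.exp(-delta / alpha)  # assumed exponential form
    if model == "pl":
        return delta ** (-alpha)         # assumed power-law form
    raise ValueError("model must be 'exp' or 'pl'")

for delta in (1, 2, 4):
    print(delta, transmission_weight(0, delta, 2.0, "exp"), transmission_weight(0, delta, 2.0, "pl"))
```

In both models a longer gap between the two activation times makes it less plausible that u activated v.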

77 Inferring Social Networks: C: set of all given cascades. G: inferred directed graph over the node (user) set U = {1, ..., n}. T(G): set of all subtrees of G.
$$P(c \mid T) = \prod_{(u,v) \in T} P_c(u,v)$$
$$P(C \mid G) = \prod_{c \in C} \max_{T \in T(G)} P(c \mid T) = \prod_{c \in C} \max_{T \in T(G)} \prod_{(u,v) \in T} P_c(u,v)$$
Problem: compute G with at most k edges that maximizes the likelihood $P(C \mid G)$.

78 Inferring Social Networks: Improvement of the log-likelihood over the empty graph E:
$$F_C(G) = \sum_{c \in C} \left[ \max_{T \in T(G)} \log P(c \mid T) - \max_{T \in T(E)} \log P(c \mid T) \right]$$
Equivalent problem: compute G with at most k edges that maximizes $F_C(G)$. The problem is NP-hard. $F_C(G)$ is submodular, which means that a greedy algorithm gives a constant-factor approximation of the optimal solution. NetInf algorithm.

79 Inferring Social Networks: Precision and recall for synthetic datasets. [Figure; spreading probability: PL: power law, Exp: exponential.]

80 Inferring Weighted Social Networks [Myers et al., 2010]: NetInf is very accurate for homogeneous networks, i.e. networks where all connected nodes influence ("infect") each other with the same probability. For inhomogeneous networks, define $A_{ij} = P(\text{node } i \text{ infects node } j \mid \text{node } i \text{ is infected})$. Goal: learn the matrix $A = [A_{ij}]$ from the observed set of cascades $C = \{c_1, \ldots, c_n\}$.

81 Inferring Weighted Social Networks: If i becomes infected, then j will be infected with probability $A_{ij}$. $w(t)$: transmission time model, i.e. the probability distribution of the transmission time from one node to a friend. $\tau_i^c$: time of infection of node i by cascade c. Time of infection of i's friend j by cascade c: $\tau_j^c = \tau_i^c + t$, where $t \sim w(t)$.

82 Inferring Weighted Social Networks: Likelihood of the observed cascades C given a weight matrix A:
$$P(C \mid A) = \prod_{c \in C} \left[ \prod_{i:\, \tau_i^c < \infty} \left( 1 - \prod_{j:\, \tau_j^c < \tau_i^c} \left( 1 - w(\tau_i^c - \tau_j^c)\, A_{ji} \right) \right) \right] \cdot \left[ \prod_{i:\, \tau_i^c = \infty} \prod_{j:\, \tau_j^c < \infty} (1 - A_{ji}) \right]$$
First term: one factor for each infected node i, assuming that at least one of his friends j who was infected earlier infected him. Second term: one factor for each non-infected node i, assuming that none of the infected friends j infected him.

83 Inferring Weighted Social Networks: Parameter learning through maximum likelihood estimation. If i never infected j, then $A_{ij} := 0$ and it does not need to be learned. Translate into a convex optimization problem: finds the globally optimal solution and can use efficient convex optimization methods. ConNIe algorithm.

84 Inferring Weighted Social Networks: Precision vs. recall and mean square error vs. number of edges for synthetic datasets. [Figure; transmission time model: PL: power law, Exp: exponential, WB: Weibull.]

85 Mining Social Networks for Recommendation. Tutorial at ICDM 2011. Mohsen Jamali & Martin Ester. Part 2

86 Outline: Introduction; Recommender systems; Recommendation in social networks; Mining social networks; Methods for recommendation in social networks: Memory based approaches, Model based approaches; Link prediction; Social networks with distrust; Summary.

87 Recommendation in SNs: Problem Definition. Input: rating matrix (real valued or binary); social network (weighted or binary). Social rating network (SRN): a social network in which users can express ratings on items.

88 Recommendation in SNs: Problem Definition. Rating prediction problem: for a given user u and the target item i, predict the rating $r_{u,i}$. Top-N item recommendation: for a given user u, recommend the top N items desirable for him [Deshpande et al. 2004]. Mostly neglected in the literature: top-N recommendation has been investigated in traditional recommender systems, but in social networks there are very few works [Jamali et al b].

89 Data Sets for Recommendation in SNs: Epinions. Online product reviews. Explicit notion of trust. Users review and rate products in different categories. Users express trust in other reviewers. 50K users, 140K items, 650K ratings, 480K links. 70K users, 105K items, 575K ratings, 500K links. 50% of users are cold start (fewer than 5 ratings).

90 Data Sets for Recommendation in SNs: Flixster. Social networking service for rating movies. Friendship relations. 1M users, 50K items, 8M ratings, 26M links. 85% of users have no ratings. 50% of raters are cold start (fewer than 5 ratings).

91 Models for Recommendation in SNs: Memory based approaches. Explore the social network for raters. Aggregate the ratings to compute the prediction. Store the social rating network. No learning phase. Slow in prediction. Most pioneering works on recommendation in SNs are memory based approaches.

92 Models for Recommendation in SNs: Model based approaches. Learn a model. Store the model parameters only. Extra time for learning. Fast in prediction. Most models are based on matrix factorization.

93 Memory based Approaches for Recommendation in Social Networks: Explore the network to find raters in the neighborhood. Aggregate the ratings of the raters to compute the predicted rating. There are different methods to calculate the top trusted neighborhood of users.

94 Advogato [Levien et al., 2002] A trust metric to compute the top N trusted users Input n: the number of users to trust x: the source users who want to trust A maximum flow based approach Advogatocan be used to find the neighborhood in rating prediction. Jamali& Ester: Mining Social Networks for Recommendation, Tutorial at ICDM

95 Advogato (cont.) Node capacities: the source user has capacity n; a user at level l has capacity equal to the capacity at level l-1 divided by the average out-degree from level l-1.

96 Advogato (cont.) To apply the Ford-Fulkerson algorithm for maximum flow analysis, the graph needs a single source, a single sink, and capacities on edges. Graph transformation: add a super sink; split each node into two nodes (negative and positive); for a node with capacity c, add an edge with capacity c-1 from the negative to the positive node and an edge with capacity 1 from the negative node to the super sink; regular edges get infinite capacity.

97 Advogato (cont.) The maximum flow is computed from the negative node of the source to the super sink. Nodes with flow to the super sink are the top n trusted users. For recommendation: there is no distinction among the top trusted users.
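
The graph transformation and maximum flow step can be sketched as follows. This is a minimal illustration, not the Advogato implementation: it assumes node capacities are already given as integers, uses a plain breadth-first augmenting-path search, and the function names (`build_flow_graph`, `max_flow`, `advogato_trusted`) are hypothetical.

```python
from collections import deque

INF = float("inf")

def build_flow_graph(trust_edges, capacity):
    """Split each node x into (x, '-') and (x, '+') and attach a super sink."""
    cap = {}

    def add(a, b, c):
        cap.setdefault(a, {})[b] = c
        cap.setdefault(b, {}).setdefault(a, 0)  # residual (reverse) edge

    for x, c in capacity.items():
        add((x, "-"), (x, "+"), c - 1)  # flow that may be passed on to others
        add((x, "-"), "sink", 1)        # one unit marks x itself as trusted
    for a, b in trust_edges:
        add((a, "+"), (b, "-"), INF)    # regular trust edges are unbounded
    return cap

def max_flow(cap, s, t):
    """Augmenting-path maximum flow with breadth-first search."""
    flow = {a: {b: 0 for b in nbrs} for a, nbrs in cap.items()}
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b, c in cap[a].items():
                if b not in parent and c - flow[a][b] > 0:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            return total, flow
        # Find the bottleneck along the path, then push that much flow.
        push, b = INF, t
        while parent[b] is not None:
            a = parent[b]
            push = min(push, cap[a][b] - flow[a][b])
            b = a
        b = t
        while parent[b] is not None:
            a = parent[b]
            flow[a][b] += push
            flow[b][a] -= push
            b = a
        total += push

def advogato_trusted(trust_edges, capacity, source):
    """Return the users whose negative node sends flow to the super sink."""
    cap = build_flow_graph(trust_edges, capacity)
    _, flow = max_flow(cap, (source, "-"), "sink")
    return {x for x in capacity if flow[(x, "-")]["sink"] > 0}

# Toy example (made-up data): s trusts a and b, a trusts c.
edges = [("s", "a"), ("s", "b"), ("a", "c")]
capacities = {"s": 3, "a": 2, "b": 1, "c": 1}
trusted = advogato_trusted(edges, capacities, "s")
```

In this sketch the source itself counts as one of the n trusted users, and the result is a plain set, which mirrors the point that the top trusted users are not distinguished from each other.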

98 Trust Metric AppleSeed [Ziegler 2005]. Intuition: spreading activation model. The source node u is activated through an injection of energy e. Energy is fully propagated through edges, proportional to the edge weights.

99 AppleSeed (cont.) Nodes are ranked according to the energy they receive. Issue: trust is considered to be additive, so nodes with many weakly trusted paths are considered to be highly trusted.
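
A simplified spreading-activation sketch, under stated assumptions, shows how energy injected at the source ends up as a ranking. It is not Ziegler's exact algorithm: the `decay` parameter (the share of incoming energy a node passes on), the fixed number of rounds, and the function name are assumptions made for illustration.

```python
def spread_activation(weights, source, energy=1.0, decay=0.85, rounds=20):
    """Rank nodes by the energy they retain after propagation from source.

    weights[u][v] is the trust weight of edge u -> v. Each node keeps a share
    (1 - decay) of what it receives and forwards the rest in proportion to
    its outgoing edge weights.
    """
    rank = {}
    incoming = {source: energy}
    for _ in range(rounds):
        outgoing = {}
        for node, e in incoming.items():
            edges = weights.get(node, {})
            total = sum(edges.values())
            if total == 0:
                # Dead end: nothing to forward, the node keeps everything.
                rank[node] = rank.get(node, 0.0) + e
                continue
            rank[node] = rank.get(node, 0.0) + (1 - decay) * e
            for nbr, w in edges.items():
                outgoing[nbr] = outgoing.get(nbr, 0.0) + decay * e * w / total
        incoming = outgoing
    rank.pop(source, None)  # the source does not rank itself
    return sorted(rank.items(), key=lambda kv: -kv[1])

# Toy example (made-up weights): u trusts a strongly and b weakly.
ranking = spread_activation({"u": {"a": 0.9, "b": 0.1}}, "u")
```

Because energy arriving over different paths is summed in `outgoing`, a node reached over many weak paths can accumulate as much energy as a node reached over one strong path, which is the additivity issue noted on the slide.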

100 TidalTrust [Golbeck 2005]. Modified breadth-first search in the network. Considers all raters v at the shortest distance from the source user u and computes the trust between u and v.

101 TidalTrust (cont.) Predicted rating. Only considers raters at the shortest distance: loss of information, lower recall.
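
The formula for the predicted rating did not survive in this transcript. A common way to aggregate, sketched here under that assumption, is a trust-weighted average of the ratings of the selected raters; the function name and data are hypothetical.

```python
def predict_rating(trust, ratings):
    """Trust-weighted mean of ratings.

    trust[v]   : trust of the source user in rater v
    ratings[v] : rating of rater v on the target item
    Returns None when no trusted rater is available.
    """
    raters = [v for v in ratings if v in trust]
    weight = sum(trust[v] for v in raters)
    if weight <= 0:
        return None
    return sum(trust[v] * ratings[v] for v in raters) / weight

# Toy example (made-up values): the more trusted rater dominates.
predicted = predict_rating({"a": 1.0, "b": 0.5}, {"a": 4, "b": 1})
```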

102 MoleTrust [Massa et al., 2007]. Similar to the idea in TidalTrust. Considers raters up to a maximum depth d. Backward exploration in trust computation. Tuning d: trade-off between accuracy and recall.
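
The difference between the two neighborhoods (raters at the shortest distance versus raters up to depth d) can be illustrated with a breadth-first search. This sketch covers only the choice of raters, not the trust computation of either method, and all names are hypothetical.

```python
from collections import deque

def bfs_depths(graph, source, limit=None):
    """Breadth-first search returning the depth of every reached user."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if limit is not None and depth[u] == limit:
            continue  # do not expand beyond the depth limit
        for v in graph.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth

def shortest_distance_raters(graph, source, raters):
    """Raters at the minimum distance from the source (TidalTrust-style)."""
    depth = bfs_depths(graph, source)
    reached = {v: d for v, d in depth.items() if v in raters and v != source}
    if not reached:
        return set()
    best = min(reached.values())
    return {v for v, d in reached.items() if d == best}

def raters_within(graph, source, raters, max_depth):
    """All raters up to max_depth from the source (MoleTrust-style)."""
    depth = bfs_depths(graph, source, max_depth)
    return {v for v in depth if v in raters and v != source}

# Toy trust network (made-up): b rated the item at depth 2, e at depth 3.
graph = {"u": ["a", "c"], "a": ["b"], "c": ["d"], "d": ["e"]}
raters = {"b", "e"}
```

Raising the depth limit from 2 to 3 in this example adds rater e, which is the recall side of the trade-off; the ratings of more distant users are typically less reliable.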

103 Memory based Approaches for Recommendation in Social Networks. How far to go into the network? Trade-off between precision and recall: use the ratings of far neighbors on the exact target item, or the ratings of trusted friends on similar items.

104 TrustWalker [Jamali et al., 2009] Random walk based model Combines item-based recommendation and trust-based recommendation. Performs several random walks on the social network. Each random walk returns a rating on the exact target item or a similar item. Prediction = aggregate of all returned ratings

105 TrustWalker: Single Random Walk. Starts from source user u_0. At step k, at node u: if u has rated i, return r_{u,i}. Otherwise, with probability Φ_{u,i,k} the random walk stops: randomly select an item j rated by u and return r_{u,j}. With probability 1-Φ_{u,i,k}, continue the random walk to a direct neighbor of u.

106 TrustWalker (cont.) Φ_{u,i,k} depends on item similarities, namely the similarity of the items rated by u to the target item i, and on the step k of the random walk.
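
A single walk and the aggregation over several walks can be sketched as below. The exact form of Φ is not preserved in this transcript; the sketch assumes the maximum similarity between the target item and the items rated by u, damped by a sigmoid of the step k. The uniform choice of item j, the `max_steps` cutoff, the plain mean as aggregate, and all function names are also assumptions.

```python
import math
import random

def stop_probability(max_sim, k):
    """Assumed form of Phi: grows with item similarity and with the step k."""
    return max_sim / (1.0 + math.exp(-k / 2.0))

def single_walk(u0, item, ratings, friends, sim, rng, max_steps=6):
    """One random walk; returns a rating or None if the walk finds nothing."""
    u = u0
    for k in range(max_steps + 1):
        rated = ratings.get(u, {})
        if item in rated:
            return rated[item]            # exact target item found
        if rated:
            max_sim = max(sim(item, j) for j in rated)
            if rng.random() < stop_probability(max_sim, k):
                j = rng.choice(sorted(rated))
                return rated[j]           # rating on a substitute item
        neighbors = friends.get(u, [])
        if not neighbors:
            return None                   # dead end in the social network
        u = rng.choice(neighbors)
    return None

def predict(u0, item, ratings, friends, sim, walks=200, seed=0):
    """Aggregate (here: plain mean) of the ratings returned by many walks."""
    rng = random.Random(seed)
    results = [single_walk(u0, item, ratings, friends, sim, rng)
               for _ in range(walks)]
    results = [r for r in results if r is not None]
    return sum(results) / len(results) if results else None

# Toy example (made-up): u has no ratings, the only friend v rated item i.
friends = {"u": ["v"]}
ratings = {"v": {"i": 5}}
prediction = predict("u", "i", ratings, friends, sim=lambda a, b: 0.0)
```

With a similarity function that always returns 1 and a large k, the stop probability approaches 1, and with similarity 0 the walk only ever returns ratings on the exact target item.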

107 TrustWalker (cont.) Special cases of TrustWalker. Φ_{u,i,k} = 1: the random walk never starts; item-based recommendation. Φ_{u,i,k} = 0: pure trust-based recommendation; the walk continues until finding the exact target item and aggregates the ratings weighted by the probability of reaching them. Existing methods approximate this.

108 TrustWalker Extension. TrustWalker can be applied to recommend a list of top-n items [Jamali et al., 2009.b]. Every random walk stops at a user v. All items ranked highly by v are returned as the result of the random walk. Results of several random walks are merged into a list of top-n items.

109 Memory based Approaches: Experiments RMSE Results on Epinions

110 Memory based Approaches: Experiments Result for cold start users on Epinions

111 Memory based Approaches: Experiments Result for cold start users on Epinions

112 Memory based Approaches: Experiments Result for all users on Epinions

113 Memory based Approaches: Experiments Result for all users on Epinions

114 Memory based Approaches: Experiments RMSE Results on Flixster

115 Memory based Approaches: Summary. No model is learned, so there is no learning phase. The network is explored to find raters, so the social rating network (SRN) must be stored. Prediction is slow due to this exploration.

116 Model based Approaches for Recommendation in SNs. Have recently attracted attention. Most common approach: matrix factorization, with latent features for users and latent features for items.
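
As background for the models that follow, basic matrix factorization can be sketched as stochastic gradient descent on the squared rating error. The hyperparameters, the toy data, and the function names are illustrative assumptions and do not reproduce any specific method from this tutorial.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01,
             epochs=200, seed=0):
    """Learn latent features U (users) and V (items) from (u, i, r) triples."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(U[u], V[i])
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def rmse(ratings, U, V):
    se = sum((r - dot(U[u], V[i])) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Toy rating triples (made-up): (user index, item index, rating).
data = [(0, 0, 5), (0, 1, 1), (1, 0, 4)]
U0, V0 = train_mf(data, 2, 2, epochs=0)   # untrained factors
U1, V1 = train_mf(data, 2, 2)             # trained factors
```

Only the two factor matrices need to be stored after training, which is why prediction is a single dot product.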

117 SoRec [Ma et al., 2008]. Matrix factorization model. Factorizes the ratings and links together, with the social network as a binary matrix. Latent factors for items (as in MF). Two latent factors for users: one as the initiator and one as the receiver of a link.

118 FIP [Yang et al., 2011]. Factorizes both the rating matrix and the social network. Similar to the idea in SoRec, but assumes an undirected network. FIP vs. SoRec: SoRec models a directed graph; FIP uses features as priors. The choice of the user factor determining the observed rating is arbitrary.

119 Social Trust Ensemble [Ma et al., 2009]. Social Trust Ensemble (STE) is a linear combination of (1) basic matrix factorization, where the latent factors of the user and the item determine the observed rating, and (2) a social network based approach, where the latent factors of the neighbors and the latent factor of the item determine the observed rating.
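
The linear combination can be written as a small prediction function. This is a sketch under assumptions: the mixing weight is called `alpha` here, trust values are taken to be normalized per user, and any squashing of the prediction into the rating range is omitted.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ste_predict(u, i, U, V, trust, alpha):
    """Blend the own-taste prediction with the neighbors' predictions.

    U[u], V[i]  : latent factors of user u and item i
    trust[u][v] : normalized trust of u in neighbor v
    alpha       : weight of the user's own factor (assumed name)
    """
    own = dot(U[u], V[i])
    social = sum(t * dot(U[v], V[i]) for v, t in trust.get(u, {}).items())
    return alpha * own + (1 - alpha) * social

# Toy factors (made-up values).
U = {"u": [1.0, 0.0], "v": [0.0, 2.0]}
V = {"i": [3.0, 1.0]}
trust = {"u": {"v": 1.0}}
blended = ste_predict("u", "i", U, V, trust, alpha=0.5)
```

Setting `alpha` to 1 recovers basic matrix factorization, and setting it to 0 predicts purely from the neighbors.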

120 Social Trust Ensemble (cont.) The STE model

121 Social Trust Ensemble (cont.) The STE model

122 Social Trust Ensemble (cont.) Issues with STE: the latent factors of the neighbors should influence the latent factor of u, not his ratings; STE does not handle trust propagation; learning is based on observed ratings only.

123 SocialMF [Jamali et al., 2010]. Social influence: the behavior of a user u is affected by his direct neighbors N_u. The latent characteristics of a user depend on those of his neighbors. T_{u,v} is the normalized trust value.
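
The dependence of a user's latent factor on the neighbors can be sketched as a penalty that grows when the factor of u moves away from the trust-weighted average of the factors of N_u. The weight `beta` and the function names are assumptions; the sketch shows only this social term, not the full objective.

```python
def neighbor_average(u, U, trust):
    """Trust-weighted average of the latent factors of u's direct neighbors."""
    k = len(U[u])
    avg = [0.0] * k
    for v, t in trust.get(u, {}).items():
        for f in range(k):
            avg[f] += t * U[v][f]
    return avg

def social_penalty(U, trust, beta):
    """beta/2 times the squared distance of each user to the neighbor average."""
    total = 0.0
    for u in U:
        if not trust.get(u):
            continue  # users without neighbors contribute nothing
        avg = neighbor_average(u, U, trust)
        total += sum((a - b) ** 2 for a, b in zip(U[u], avg))
    return beta / 2.0 * total

# Toy factors (made-up): u trusts a and b equally (normalized trust).
U = {"u": [1.0, 1.0], "a": [2.0, 0.0], "b": [0.0, 2.0]}
trust = {"u": {"a": 0.5, "b": 0.5}}
```

Because the neighbors' factors depend on their own neighbors in the same way, minimizing this term propagates trust beyond direct neighbors, and it gives a user without ratings a factor close to the neighbor average.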

124 SocialMF (cont.)

125 SocialMF (cont.) Properties of SocialMF: trust propagation. Learning the user latent factor is possible with the existence of the social network only; there is no need for fully observed ratings for learning. Appropriate for cold start users and users with no ratings. Similar ideas in [Ma et al. 2011].

126 Results for Epinions. Gain over STE: 6.2% for K=5 and 5.7% for K=10. Mohsen Jamali, Social Matrix Factorization

127 Results for Flixster. The SocialMF gain over STE (5%) is 3 times the STE gain over BasicMF (1.5%).

128 Experiments on Cold Start Users

129 Analysis of Learning Runtime. The runtimes of SocialMF and STE, and the factor by which SocialMF is faster, are expressed in terms of: N, the number of users; K, the latent feature size; r, the average number of ratings per user; t, the average number of neighbors per user.

130 Generalized Stochastic Block Model for Recommendation in Social Networks [Jamali et al., 2011]. Social influence and selection lead to the formation of communities/groups. Users may belong to different groups in their actions, e.g., a teacher interacting with students or with his/her son. Items likewise take on a group when being rated, e.g., a digital camera. Clustering based methods for recommendation.

131 GSBM (cont.) Extends the mixed membership stochastic block model [Airoldi et al. 2008]. Users probabilistically act as a member of one of the groups in their actions. Every item is considered to belong to a latent group when it is being rated. The relation between users and items is governed by the relation between groups.

132 GSBM Graphical Model

133 GSBM (cont.) Sample the social relation. [Figure: K1 x K1 block matrix B T with entries P(g1-->g2) for groups g1 and g2.]

134 GSBM: Experiments on Rating Prediction (Epinions, Flixster)

135 GSBM: Experiments on Link Prediction (Epinions, Flixster)

136 Link Prediction. The emergence of online social networks and the need to get connected to other people led to link prediction. Problem definition: given a user pair (u,v), estimate the probability of creation of the link u → v; or, given a user u, recommend a list of top users for u to connect to.

137 Link Prediction (cont.) Link prediction vs. rating prediction: link prediction estimates the strength of a relation between a user and another user; rating prediction estimates the strength of a relation between a user and an item.

138 Link Prediction Methods. Pair-wise similarity based approach, with roots in social selection. Users with the highest similarity to u are recommended to u. Every user u is represented by his/her observed features, properties, and past activities such as ratings and clicks.

139 Link Prediction: Similarity based Methods. The similarity between A and B is defined as the ratio between the amount of information needed to state the commonality of A and B and the information needed to fully describe what A and B are [Lin 1998]. Special cases: cosine similarity, Pearson correlation, Jaccard's coefficient.
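
Two of the special cases can be sketched directly on user profiles. The representation is an assumption made for illustration: ratings as a dict from item to value for cosine similarity, and sets of items for Jaccard's coefficient.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse rating vectors (dicts)."""
    common = set(a) & set(b)
    num = sum(a[i] * b[i] for i in common)
    den = (math.sqrt(sum(x * x for x in a.values()))
           * math.sqrt(sum(x * x for x in b.values())))
    return num / den if den > 0 else 0.0

def jaccard(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Toy profiles (made-up).
alice = {"x": 5, "y": 3}
bob = {"x": 5, "y": 3}
carol = {"z": 4}
```

Users who share no items, such as alice and carol here, get similarity 0 under both measures, which is one reason profile similarity alone struggles for cold start users.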

140 Link Prediction Methods (cont.) Network topology based methods: common neighbors; Jaccard's coefficient; the measure of [Adamic and Adar 2003]; preferential attachment [Newman 2001], initially proposed for modeling network growth. These measure similarity based on direct neighbors.
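
The neighbor-based scores can be sketched on an undirected graph stored as a dict of neighbor sets. The formulas are the commonly used definitions of these measures; the function names and the toy graph are assumptions.

```python
import math

def common_neighbors(g, a, b):
    return len(g[a] & g[b])

def jaccard_coefficient(g, a, b):
    union = g[a] | g[b]
    return len(g[a] & g[b]) / len(union) if union else 0.0

def adamic_adar(g, a, b):
    """Common neighbors weighted by 1 / log(degree): rare neighbors count more.

    In an undirected graph a common neighbor has degree >= 2, so the
    logarithm is never zero.
    """
    return sum(1.0 / math.log(len(g[z])) for z in g[a] & g[b])

def preferential_attachment(g, a, b):
    return len(g[a]) * len(g[b])

# Toy undirected graph (made-up): A and B share the neighbors C and D.
g = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"A", "B", "E"},
    "D": {"A", "B"},
    "E": {"C"},
}
```

To recommend links for a user, any of these scores would be computed for every candidate that is not already a neighbor, and the highest-scoring candidates returned.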

141 Link Prediction: Path based Methods. Katz [Katz 1953]: based on path^l_{A,B}, the number of paths of length l from A to B. Hitting time [Liben-Nowell et al., 2003]: score(A,B) is the average number of steps for a random walk to reach B starting from A.
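
The Katz formula itself is not preserved in this transcript. In its usual form the score sums the path counts over all lengths l, each damped by β^l; the sketch below assumes that form, truncates the sum at `max_len`, and counts walks (nodes may repeat), as is common in practice.

```python
def katz_score(adj, a, b, beta=0.1, max_len=4):
    """Truncated Katz score: sum over l of beta**l * (number of walks of length l).

    adj[u] is the list of nodes that u links to.
    """
    counts = {a: 1}  # number of walks of the current length ending at each node
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for node, c in counts.items():
            for nbr in adj.get(node, ()):
                nxt[nbr] = nxt.get(nbr, 0) + c
        counts = nxt
        score += beta ** length * counts.get(b, 0)
    return score

# Toy directed graph (made-up): one path of length 1 and one of length 2.
adj = {"A": ["B", "C"], "C": ["B"]}
score_ab = katz_score(adj, "A", "B")
```

A small β makes short paths dominate, so the score behaves much like common neighbors; a larger β gives more weight to long paths.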

142 Link Prediction: Path based Methods. Random walk with restart [Pan et al., 2004]: a random walk starts from A; at each step, with probability α the random walk restarts. score(A,B): probability of being at B during the random walk. SimRank [Jeh et al., 2002]: two users are similar to the extent that they are joined to similar neighbours.
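
Random walk with restart can be approximated by iterating the walk's probability distribution. This sketch assumes every node has at least one outgoing link and that all link targets appear as keys of the graph; the function name and the iteration count are illustrative.

```python
def random_walk_with_restart(adj, start, alpha=0.15, iters=100):
    """Approximate the probability of being at each node.

    At every step the walker jumps back to `start` with probability alpha,
    otherwise it moves to a uniformly chosen neighbor.
    """
    p = {n: 0.0 for n in adj}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, mass in p.items():
            share = (1 - alpha) * mass / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        nxt[start] += alpha  # restart mass (total probability is 1)
        p = nxt
    return p

# Toy undirected path graph (made-up): A - B - C.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = random_walk_with_restart(adj, "A")
```

The restart keeps the walk close to A, so nearby nodes score higher than distant ones; here B scores higher than C.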

143 Link Prediction Methods (cont.) MF based models [Rennie et al., 2005]: social network as a binary matrix; similar to MF methods for rating prediction; factorize the network matrix into a product of lower rank matrices (representing user factors). Advanced version in [Yang et al., 2011]. Latest advances: supervised random walks [Backstrom et al., 2011], a random walk based approach considering properties of links and user attributes.

144 Analyzing Social Networks with Distrust. Relations between users on social media sites often reflect a mixture of positive and negative interactions [Leskovec et al., 2010]. Users can express distrust in other users, e.g., by blocking some users on eBay or Google+.

145 Analyzing Social Networks with Distrust (cont.) Few works have addressed negative relations: [Leskovec et al., 2010], [Kunegis et al., 2009], [Brzozowski et al., 2008], [Guha et al., 2004]. Prior work shifted the trust values to avoid negative values [Kamvar et al., 2003].

146 Analyzing Social Networks with Distrust (cont.) How does distrust propagate? Distrust propagates only one step [Guha et al., 2004]. [Figure: example network with nodes a to g.]

147 Analyzing Social Networks with Distrust (cont.) How does distrust propagate? Distrust propagates only one step [Guha et al., 2004]. [Figure: example network with trust and distrust edges among nodes a to g.]

148 Analyzing Social Networks with Distrust (cont.) Signed networks can be analyzed according to two different theories [Leskovec et al., 2010]. Structural balance theory: originated in social psychology. Triangles with three positive signs (three mutual friends, T3) and those with one positive sign (two friends with a common enemy, T1) are more plausible [Leskovec et al., 2010].
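
The balance condition can be checked with the product of the three edge signs: a triangle with three or one positive signs has a positive product. The encoding of signs as +1 and -1 and the function name are assumptions made for illustration.

```python
def is_balanced(sign_ab, sign_bc, sign_ca):
    """True for triangles with three or one positive edges (T3 and T1)."""
    return sign_ab * sign_bc * sign_ca > 0

# The four triangle types by number of positive edges.
triangle_types = {
    "T3": (+1, +1, +1),  # three mutual friends
    "T2": (+1, +1, -1),  # two friendships and one enmity
    "T1": (+1, -1, -1),  # two friends with a common enemy
    "T0": (-1, -1, -1),  # three mutual enemies
}
balanced = {name: is_balanced(*signs) for name, signs in triangle_types.items()}
```

For link sign prediction under this theory, the sign of a missing edge would be chosen so that the triangles it closes become balanced.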

149 Analyzing Social Networks with Distrust (cont.) Signed networks can be analyzed according to two different theories [Leskovec et al., 2010]. Theory of status: a positive (negative) directed link indicates that the creator of the link views the recipient as having higher (lower) status. These relative levels of status can be propagated along multi-step paths of signed links. This leads to different predictions than balance theory.

150 Analyzing Social Networks with Distrust (cont.) Which theory better explains the relations among users in a signed social network? [Leskovec et al., 2010] In undirected networks, structural balance theory; in directed signed networks, the theory of status.

151 Recommendation in Social Networks with Distrust. How can distrust be exploited in recommendation? Very few works have addressed this problem. [Ma et al., 2009.b]: matrix factorization model with a modified objective (error) function, maximizing the distance between the factor of u and that of his/her distrusted neighbor v.

152 Recommendation in Social Networks with Distrust (cont.) [Ma et al., 2009.b] D+: the set of users that u distrusts, each with a distrust score. Experiments on Epinions show promising results.
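
The objective function itself is not preserved in this transcript. One way to express "maximize the distance to distrusted users" inside a minimized objective, sketched here as an assumption, is a term that subtracts the score-weighted squared distance between the factors of u and of each distrusted user; `beta` and the function name are hypothetical.

```python
def distrust_term(U, distrust, beta):
    """Negative weighted squared distance to distrusted users.

    U[u]           : latent factor of user u
    distrust[u][d] : distrust score of u towards d, for d in u's distrust set
    Adding this term to a minimized objective pushes the factors apart.
    """
    total = 0.0
    for u, scores in distrust.items():
        for d, s in scores.items():
            dist2 = sum((a - b) ** 2 for a, b in zip(U[u], U[d]))
            total -= s * dist2
    return beta / 2.0 * total

# Toy factors (made-up): u distrusts v with score 1.0.
U = {"u": [0.0, 0.0], "v": [3.0, 4.0]}
term = distrust_term(U, {"u": {"v": 1.0}}, beta=2.0)
```

Since the term is unbounded below on its own, a sketch like this only makes sense together with the rating error and regularization terms of the full objective.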

153 Summary. State-of-the-art methods for recommendation in social networks. Memory based approaches: MoleTrust [Massa et al., 2007], modified BFS; TidalTrust [Golbeck 2005], modified BFS; TrustWalker [Jamali et al., 2009], random walk. Model based approaches: SoRec [Ma et al., 2008], matrix factorization; FIP [Yang et al., 2011], matrix factorization; STE [Ma et al., 2009], matrix factorization; SocialMF [Jamali et al., 2010], matrix factorization; GSBM [Jamali et al., 2011], stochastic block model.

154 Summary: Link Prediction. Pair-wise profile similarity approaches: information theoretic definition of similarity. Network topology based approaches: common neighbors. Path based approaches: Katz, hitting time, RWR, SimRank.

155 Summary: Social Networks with Distrust. Propagation of distrust. Theories behind distrust. Recommendation with distrust [Ma et al., 2009.b].

156 Future Research Directions. Exploring other machine learning models. Privacy of recommendation in social networks: how to preserve privacy while employing social networks? Improving the diversity of recommendations: how to evaluate the diversity? Recommendation of cold-start items: they are very important!

157 Future Research Directions. Recommendation in mobile social networks: distributed algorithms; how to exploit the user location? Recommendation in social networks with documents (posts), e.g., Twitter: integration with topic models.

158 Thank You!

On Top-k Recommendation using Social Networks

On Top-k Recommendation using Social Networks On Top-k Recommendation using Social Networks Xiwang Yang, Harald Steck,Yang Guo and Yong Liu Polytechnic Institute of NYU, Brooklyn, NY, USA 1121 Bell Labs, Alcatel-Lucent, New Jersey Email: xyang1@students.poly.edu,

More information

Collaborative Filtering. Radek Pelánek

Collaborative Filtering. Radek Pelánek Collaborative Filtering Radek Pelánek 2015 Collaborative Filtering assumption: users with similar taste in past will have similar taste in future requires only matrix of ratings applicable in many domains

More information

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...

More information

Machine Learning using MapReduce

Machine Learning using MapReduce Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

More information

The Need for Training in Big Data: Experiences and Case Studies

The Need for Training in Big Data: Experiences and Case Studies The Need for Training in Big Data: Experiences and Case Studies Guy Lebanon Amazon Background and Disclaimer All opinions are mine; other perspectives are legitimate. Based on my experience as a professor

More information

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014 Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

More information

1 o Semestre 2007/2008

1 o Semestre 2007/2008 Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Social Media Mining. Network Measures

Social Media Mining. Network Measures Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users

More information

IPTV Recommender Systems. Paolo Cremonesi

IPTV Recommender Systems. Paolo Cremonesi IPTV Recommender Systems Paolo Cremonesi Agenda 2 IPTV architecture Recommender algorithms Evaluation of different algorithms Multi-model systems Valentino Rossi 3 IPTV architecture 4 Live TV Set-top-box

More information

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network , pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and

More information

Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R. 5. Link Analysis Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

More information

Mining Social Network Graphs

Mining Social Network Graphs Mining Social Network Graphs Debapriyo Majumdar Data Mining Fall 2014 Indian Statistical Institute Kolkata November 13, 17, 2014 Social Network No introduc+on required Really? We s7ll need to understand

More information

Graph Processing and Social Networks

Graph Processing and Social Networks Graph Processing and Social Networks Presented by Shu Jiayu, Yang Ji Department of Computer Science and Engineering The Hong Kong University of Science and Technology 2015/4/20 1 Outline Background Graph

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Ensemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008

Ensemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008 Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles

More information

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05 Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

More information

Utility of Distrust in Online Recommender Systems

Utility of Distrust in Online Recommender Systems Utility of in Online Recommender Systems Capstone Project Report Uma Nalluri Computing & Software Systems Institute of Technology Univ. of Washington, Tacoma unalluri@u.washington.edu Committee: nkur Teredesai

More information

Recommender Systems Seminar Topic : Application Tung Do. 28. Januar 2014 TU Darmstadt Thanh Tung Do 1

Recommender Systems Seminar Topic : Application Tung Do. 28. Januar 2014 TU Darmstadt Thanh Tung Do 1 Recommender Systems Seminar Topic : Application Tung Do 28. Januar 2014 TU Darmstadt Thanh Tung Do 1 Agenda Google news personalization : Scalable Online Collaborative Filtering Algorithm, System Components

More information

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore. CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes

More information

Machine Learning for Data Science (CS4786) Lecture 1

Machine Learning for Data Science (CS4786) Lecture 1 Machine Learning for Data Science (CS4786) Lecture 1 Tu-Th 10:10 to 11:25 AM Hollister B14 Instructors : Lillian Lee and Karthik Sridharan ROUGH DETAILS ABOUT THE COURSE Diagnostic assignment 0 is out:

More information

INFORMATION from a single node (entity) can reach other

INFORMATION from a single node (entity) can reach other 1 Network Infusion to Infer Information Sources in Networks Soheil Feizi, Muriel Médard, Gerald Quon, Manolis Kellis, and Ken Duffy arxiv:166.7383v1 [cs.si] 23 Jun 216 Abstract Several significant models

More information

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

More information

A Social Network-Based Recommender System (SNRS)

A Social Network-Based Recommender System (SNRS) A Social Network-Based Recommender System (SNRS) Jianming He and Wesley W. Chu Computer Science Department University of California, Los Angeles, CA 90095 jmhek@cs.ucla.edu, wwc@cs.ucla.edu Abstract. Social

More information

Social Network Mining

Social Network Mining Social Network Mining Data Mining November 11, 2013 Frank Takes (ftakes@liacs.nl) LIACS, Universiteit Leiden Overview Social Network Analysis Graph Mining Online Social Networks Friendship Graph Semantics

More information

Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

More information

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based

More information

BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE

BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE BUILDING A PREDICTIVE MODEL AN EXAMPLE OF A PRODUCT RECOMMENDATION ENGINE Alex Lin Senior Architect Intelligent Mining alin@intelligentmining.com Outline Predictive modeling methodology k-nearest Neighbor

More information

Influence Propagation in Social Networks: a Data Mining Perspective

Influence Propagation in Social Networks: a Data Mining Perspective Influence Propagation in Social Networks: a Data Mining Perspective Francesco Bonchi Yahoo! Research Barcelona - Spain bonchi@yahoo-inc.com http://francescobonchi.com/ Acknowledgments Amit Goyal (University

More information

Hybrid model rating prediction with Linked Open Data for Recommender Systems

Hybrid model rating prediction with Linked Open Data for Recommender Systems Hybrid model rating prediction with Linked Open Data for Recommender Systems Andrés Moreno 12 Christian Ariza-Porras 1, Paula Lago 1, Claudia Jiménez-Guarín 1, Harold Castro 1, and Michel Riveill 2 1 School

More information

Performance Characterization of Game Recommendation Algorithms on Online Social Network Sites

Performance Characterization of Game Recommendation Algorithms on Online Social Network Sites Leroux P, Dhoedt B, Demeester P et al. Performance characterization of game recommendation algorithms on online social network sites. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 27(3): 611 623 May 2012.

More information

Distance Degree Sequences for Network Analysis

Distance Degree Sequences for Network Analysis Universität Konstanz Computer & Information Science Algorithmics Group 15 Mar 2005 based on Palmer, Gibbons, and Faloutsos: ANF A Fast and Scalable Tool for Data Mining in Massive Graphs, SIGKDD 02. Motivation

More information

! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II

! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II ! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and

More information

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With

More information

Challenges and Opportunities in Data Mining: Personalization

Challenges and Opportunities in Data Mining: Personalization Challenges and Opportunities in Data Mining: Big Data, Predictive User Modeling, and Personalization Bamshad Mobasher School of Computing DePaul University, April 20, 2012 Google Trends: Data Mining vs.

More information

Response prediction using collaborative filtering with hierarchies and side-information

Response prediction using collaborative filtering with hierarchies and side-information Response prediction using collaborative filtering with hierarchies and side-information Aditya Krishna Menon 1 Krishna-Prasad Chitrapura 2 Sachin Garg 2 Deepak Agarwal 3 Nagaraj Kota 2 1 UC San Diego 2

More information

Recommendation Tool Using Collaborative Filtering
