Exploiting Recommendation on Social Media Networks



Similar documents
The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

An Interest-Oriented Network Evolution Mechanism for Online Communities

Forecasting the Direction and Strength of Stock Market Movement

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Probabilistic Latent Semantic User Segmentation for Behavioral Targeted Advertising*

A DATA MINING APPLICATION IN A STUDENT DATABASE

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Web Object Indexing Using Domain Knowledge *

An Alternative Way to Measure Private Equity Performance

What is Candidate Sampling

Ring structure of splines on triangulations

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Semantic Link Analysis for Finding Answer Experts *

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)

Efficient Project Portfolio as a tool for Enterprise Risk Management

A Multi-mode Image Tracking System Based on Distributed Fusion

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

A Secure Password-Authenticated Key Agreement Using Smart Cards

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

A Simple Approach to Clustering in Excel

Cloud-based Social Application Deployment using Local Processing and Global Distribution

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Context-aware Mobile Recommendation System Based on Context History

LIFETIME INCOME OPTIONS

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Using Content-Based Filtering for Recommendation 1

Data Visualization by Pairwise Distortion Minimization

SIMPLE LINEAR CORRELATION

Bayesian Cluster Ensembles

How To Calculate The Accountng Perod Of Nequalty

Gender Classification for Real-Time Audience Analysis System

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending

A Fast Incremental Spectral Clustering for Large Data Sets

Performance Analysis and Coding Strategy of ECOC SVMs

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES

IMPACT ANALYSIS OF A CELLULAR PHONE

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

Assessing Student Learning Through Keyword Density Analysis of Online Class Messages

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm

Recurrence. 1 Definitions and main statements

Software project management with GAs

L10: Linear discriminants analysis

Statistical Approach for Offline Handwritten Signature Verification

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS

Inter-Ing INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, November 2007.

A Probabilistic Theory of Coherence

Conversion between the vector and raster data structures using Fuzzy Geographical Entities

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture

Enterprise Master Patient Index

Traffic-light a stress test for life insurance provisions

Sketching Sampled Data Streams

A Study on Secure Data Storage Strategy in Cloud Computing

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

A neuro-fuzzy collaborative filtering approach for Web recommendation. G. Castellano, A. M. Fanelli, and M. A. Torsello *

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Improved SVM in Cloud Computing Information Mining

Design and Development of a Security Evaluation Platform Based on International Standards

A Performance Analysis of View Maintenance Techniques for Data Warehouses

Traditional versus Online Courses, Efforts, and Learning Performance

Single and multiple stage classifiers implementing logistic discrimination

Towards a Global Online Reputation

Gaining Insights to the Tea Industry of Sri Lanka using Data Mining

Extending Probabilistic Dynamic Epistemic Logic

Watermark-based Provable Data Possession for Multimedia File in Cloud Storage

Vehicle Detection and Tracking in Video from Moving Airborne Platform

Logistic Regression. Steve Kroon

Project Networks With Mixed-Time Constraints

Optimal Provisioning of Resource in a Cloud Service

Customer Lifetime Value Modeling and Its Use for Customer Retention Planning

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S

ECE544NA Final Project: Robust Machine Learning Hardware via Classifier Ensemble

Title Language Model for Information Retrieval

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System

DEFINING %COMPLETE IN MICROSOFT PROJECT

Detecting Global Motion Patterns in Complex Videos

Mining Multiple Large Data Sources

RequIn, a tool for fast web traffic inference

Searching for Interacting Features for Spam Filtering

A Passive Network Measurement-based Traffic Control Algorithm in Gateway of. P2P Systems

How To Analyze News From A News Report

Transcription:

Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 Explotng Recommendaton on Socal Meda Networs Swat A. Adhav 1, Sheetal A. Taale 2 1 ME-II Computer Engneerng, Savtrba Phule Pune Unversty, Vdya Pratshthan s College of Engneerng, Baramat, 2 Professor, Department of Informaton Technology, Savtrba Phule Pune Unversty, Vdya Pratshthan s College of Engneerng, Baramat, Abstract: Socal meda where the user nteracts wth each other to form socal networs s becomng very popular these days. In socal meda networs, users are nfluenced by others for varous reasons. For example, the colleagues have a strong mpact on one s wor, whle the frends have a strong mpact on one s daly lfe. Socal nfluence mnng n socal networs s mportant n real world applcatons such as photo recommendaton. Snce the Socal networ s mult-modal and heterogeneous, t s nsgnfcant to effectvely explot the socal meda nformaton to learn the topc dstrbuton for users and mages accurately. To tacle ths Problem, Explotng Recommendaton on Socal Meda Networs s used to fnd topcal nfluental for users and mages wth the help of Hypergraph Learnng approach. The Hypergraph learnng method s used to model user, mages and socal ln relatonshp. It combnes the content of mages and socal lns to determne the topc-specfc nfluence for users and mages n the socal meda networ. Explotng Recommendaton on Socal Meda Networs conssts of three learnng stages Hypergraph Constructon, topc dstrbuton learnng, and topc senstve nfluence ranng. Keywords: Hypergraph Constructon, topc dstrbuton learnng, topc senstve nfluence ranng. 1. Introducton Socal meda has reform the way people share and access nformaton. In communtes there are many socal sharng webstes, whch allows user to share photos, web lns, songs, pctures etc. wth socal meda networs, users necessarly nteract wth each other n communtes. The socal communcatons nclude two-way lns le connect n LnedIn, add frend n Faceboo, and one-way lns le contact n Flcr networ. For example, the colleagues on LnedIn wll largely mpact one s choce n wor, whle the frends from Faceboo have strong nfluence on one s preferences n daly lfe. Wth the ncrease of socal sharng networs such as Faceboo, and Flcr, Socal Influence becomes more mportant. In large socal meda networs, users are nfluenced by others for varous reasons. Socal nfluence [2] s a most mportant topc n Socal Networs and loos at how ndvdual thoughts and feelngs are nfluenced by socal groups. Socal nfluence mnng n socal meda networs s mportant n real-world applcatons such as photo recommendaton. Fg. 1 shows the contact networ of user Sudhr n Flcr whch ncludes three nfluencers: Sunl, Geeta and Ram. On the rght there are expertse of each nfluencer regardng topcs on fashon, travel and technology. Assumng Sudhr s searchng photos of Goa for hs trp, obvously Sunls preference wll nfluence hm most. when Sudhr searches photos of DG fashon show, the advce from Geeta wll be most apprecated. Ths demonstrates that some nfluencers n certan topcs mght be more trustworthy than others and the nfluence value s topc-senstve. Fgure 1: Example of Topc Senstve Influencer Socal networ s mult-modal and heterogeneous, t s nsgnfcant to effectvely explot the socal meda nformaton to learn the topc dstrbuton for users and mages accurately. Real socal meda networs le Faceboo, Flcr are gettng bgger wth thousands or mllons of user s, to effcently fnd out the topc based nfluental strength for each socal networ s really a challengng problem. To tacle ths Problem, Explotng Recommendaton on Socal Meda Networs s used to fnd topcal nfluental for users and mages wth the help of Hypergraph Learnng approach. In ths paper, we nvestgate the problem of topc senstve nfluencer mnng n socal meda networs. 2. Lterature Survey A. Hypergraph Learnng Zhou.et.al. [6] proposed hypergraph s nothng but a generalzaton of the smple graph n whch the edges, called hyperedges. Hypergraph can be used to model both varous types of enttes and complex relatons, whch s extremely Paper ID: SUB155077 318

Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 approprate n socal meda modelng. Hypergraph has been used n many nformaton retreval and data mnng tass, such as mage retreval and object recognton. Zhou et al used a general hypergraph framewor and apply t to clusterng, classfcaton and embeddng tass. Lu et al [5] ntended a transductve learnng framewor for mage retreval. In ths method, each mage s represented as a vertex n the constructed hypergraph, and the vsual clusterng results are used to construct the hyperedge. Lu et.al [8] used a transductve learnng framewor for mage retreval. It constructs a hypergraph by generatng a hyperedge from each sample and ts neghbors, and hypergraph-based ranng s then performed. Based on the vsual smlarty matrx computed from varous feature descrptors, t tae each mage as a centrod vertex and create a hyperedge by a centrod and ts -nearest neghbors. To more explot the correlaton nformaton among mages, t propose a probablstc hypergraph, whch assgns each vertex to a hyperedge n a probablstc way. Ths wor demonstrates the effectveness of hypergraph model n capturng hgher-order relatonshp. Yue et.al [4] ntended an approach that smultaneously utlzes both vsual and textual nformaton for socal mage search. In the proposed method, both vsual content and tags are used to generate the hyperedges of a hypergraph, and a relevance learnng method s performed on the hypergraph structure where a set of pseudo-relevant samples are used. Dfferent from the conventonal hypergraph learnng algorthms, our method learns not only the relevance scores among mages but also the weghts of hyperedges. By usng the learnng of hyperedge weghts, the effects of unnformatve tags and vsual words can be mnmzed. B. Topc Dstrbuton Learnng Topc models, such as PLSA [5], LapPLSI, LDA, Corr-LDA, have shown mpressve success n many felds. Corr-LDA s used, to jontly model textual and vsual features for latent topc representaton. Latent Drchlet Allocaton LDA) s the smplest and most popular statstcal topc modelng Ble et el. 2003). Many other models came up as an extenson of LDA s the smplest topc model [8]. The ntuton behnd LDA s that documents exhbt multple topcs. In LDA [9], the topcs are dstrbutons over words and ths dscrete dstrbuton generates observatons words n documents). LDA cannot model the correlatons among topcs. For example the topc genetcs s more lely to be smlar to dsease than to astronaut. LDA fals to depct ths correlaton of topcs. Correlated topc model CTM) s ntroduced by Ble and Lafferty 2007) whch s an extenson of LDA that can model the correlatons among topcs. Hoffman et.al has proposed PLSA s a statstcal model for analyzng two modes and fndng the co-occurrence of data. PLSA s based on a generatve probablstc model. We can apply the Probablstc Latent Semantc Analyss PLSA) to model the generaton of each mage and ts word cooccurrences. Probablstc Latent Semantc Analyss s a novel statstcal technque for the analyss of two mode and co-occurrence nformaton, whch has applcatons n nformaton retreval, machne learnng from text and natural language processng. It can be used to extract topcs from a collecton of documents. The startng pont for Probablstc Latent Semantc Analyss s a statstcal model whch has been called aspect model. Ths model s a latent varable model for co-occurrence data whch assocates an unobserved class varable. 3. Implementaton Detals The proposed framewor s organzed nto three stages Hypergraph Constructon, Topc dstrbuton learnng, and Topc senstve nfluence ranng respectvely. The system archtecture s defned n Fg.2. A. Hypergraph Constructon For hypergraph constructon, there are two types of vertces correspondng to the users and mages, whch consttute the vertex set denoted as V= {u, o}. The constructon of the hyperedges s llustrated as follows: 1. Homogeneous hyperedges: It s used to represent the vsual-textual content relatons among mage vertces. There are two types of homogeneous hyperedges ncludng vsual content relaton hyperedge and textual content relaton hyperedge. In our experment, from each mage 512-dmensonal feature vector s extracted as the content representaton, ncludng 512-dmensonal GIST feature. Feature extracton usng GIST The vsual content smlarty hyperedge s weght s set based on the vsual smlarty matrx [8]. A s calculated accordng to x xj exp f j Nt 2 2 0 otherwse Paper ID: SUB155077 319 A j or Nt j Where denotes the ndex set for t nearest neghbors, and are feature vectors assocated wth mages respectvely, s a scalng parameter. To construct textual, we buld a tag vocabulary. Each tag s used to buld a hyperedge,.e., the mages contanng the same tag are connected by a hyperedge. The homogeneous hyperedges are used to learn the topc dstrbuton. Vsual Smlarty Matrx s Textual Smlarty Matrx s:

Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 A hypergraph G = {V, E, W} conssts of the vertex set V, the hyperedge set E, and the hyperedge weght vector w. each edge s assgned a weght w the hypergraph G can be denoted by ncdent matrx H 1f v e h v, 0 f v e h v, s We can apply the Probablstc Latent Semantc Analyss PLSA) [5] to model the generaton of each mage and ts word co-occurrences by the followng steps: 1. Select an mage o wth probablty P o ), 2. Select a potental topc of nterest wth probablty P z o ), 3. Create a word wth probablty P w j z ). Followng the maxmum lelhood prncple, we can determne the model parameters l= log, )P, ) By explorng the homogeneous relatons among mages n the hypergraph and utlzng hypergraph regularzer, represented by R= * 1) to maxmze the regularzed log-lelhood from equaton 1as follows: L= l- R For a hyperedge e E, hyperedge degree can be estmated by: d h v, Vertex degree vv of each vertex v V s: d v) w h v, ee B. Topc Dstrbuton Learnng Input: Image-Term matrx. Where, M: No of mages. {,,... } W: No of Words n Tag Vocabulary. {,,... } = log, )P, ) - R P w j z ) can be calculated by usng M step re-estmaton: P w j z ) = P z o ) P z o ) N 1 M N j' 1 1 Calculated by usng: N 1 n P n w' j) P w' j) M j1 n 2) n P Output: the Topc Dstrbuton Fgure 2: System Archtecture Paper ID: SUB155077 320

the topcs extracted are as follows: Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 C. Topc-senstve Influence Ranng- 1. Learnng Influence score for Users: To calculate nfluence score of user we requre user s contact networ and ther socal ln relatonshp le comments. Frst we calculate user s ndvdual nfluence on each topc based on mages that he/she has uploaded on partcular topc and after that we calculate commenter nfluence based on whch user has comment on whch topc of mage, after combnng user s ndvdual nfluence and commenter nfluence we get Fnal nfluence score of users. Top 3 representatve users and ther photos on Topc2 from collected Flcr dataset, raned by topc-specfc socal nfluence scores of users. 2. Learnng Influence score for Images: To calculate nfluence score of mages we requre each user s uploaded mages ncludng tags. Frst we consder the tags of each mage and calculatng the probablty of tags that comes under each topc. 4. Results and Dataset We collect data from Flcr API to evaluate our approach. We downloaded each user s contact nformaton and each user s uploaded mages. The socal lns ncludng contacts, vews and comments are recorded. In ths dataset, each user has dfferent metadata nformaton ncludng textual ttle and tags, comments. We also classfed that each photo has at least one type of socal lns, contanng comments. Top 3 representatve users and ther photos on Topc3 from collected Flcr dataset, raned by topc-specfc socal nfluence scores of users. Top 3 representatve users and ther photos on Topc1 from collected Flcr dataset, raned by topc-specfc socal nfluence scores of users. Representatve photos on three topcs on the collected Flcr dataset, raned by topc-specfc socal nfluence scores of mages. Paper ID: SUB155077 321

5. Concluson Internatonal Journal of Scence and Research IJSR) ISSN Onln: 2319-7064 Index Coperncus Value 2013): 6.14 Impact Factor 2013): 4.438 Explotng recommendaton on socal meda networ s proposed to deal wth the problem of topc-senstve nfluencer mnng n socal sharng webstes. It ams to fnd the nfluental user and mages n the socal meda networs. Socal lns between users and mages ndcate the socal nfluence of users and mages n the socal meda networs. We also demonstrate that ths system can mprove the performance sgnfcantly n the applcatons of photo recommendaton. 6. Acnowledgment I aval ths opportunty to express my deep sense of grattude and whole hearted thans to my gude Prof. S. A. Taale madam for gvng her valuable gudance, nspraton and encouragement to embar ths paper. Wthout ther Coordnaton, gudance and revewng, ths tas could not be completed alone. Prof. S. A. Taale madam gave me all the freedom I needed for ths project. I also tae great pleasure n thanng Prof. S. S. Nandgaonar for provdng approprate gudance. I would personally le to than Prof. G. J. Chhajed, Head of Computer Dept. at Vdya Vdya Pratshthan s Collage of Engneerng VPCOE), and our Honble prncpal Dr. S. B. Deosarar sr who creates a healthy envronment for all of us to learn n best possble way. References [1] J. Sang, Q. Fang, Topc-Senstve Influencer Mnng n Interest-Based Socal Meda Networs va Hypergraph Learnng", IEEE Trans. Multmeda, vol. 16, no. 3, Aprl. 2014. [2] J.Tang, J. Sun, C. Wang, and Z. Yang, Socal nfluence analyss n large-scale networs, n Proc. KDD, 2009, pp. 807 816.. [3] J. Sang, C. Xu, Rght Buddy Maes the Dfference: an Early Exploraton of Socal Relaton Analyss n Multmeda Applcatons, Proc. ACM Multmeda, 2012. [4] Y. Gao, M. Wang, Vsual-Textual Jont Relevance Learnng for Tag-Based Socal Image Search, n Proc. ACM Multmeda, 2011, pp. 15171520. [5] T. Hofmann, Probablstc latent semantc ndexng, n Proc. SIGIR, 1999, pp. 50 57. [6] D. Zhou, J. Huang, and B. Schoopf, Learnng wth hypergraphs: Clusterng, classfcaton, and embeddng, n Proc. Adv. Neural Inf. Process. Syst., 2007, pp. 1 8. [7] S. Agarwal, K. Branson, and S. Belonge, Hgher order learnng wth graphs, n Proc. ICML, 2006, pp. 17 24. [8] Y. Huang, Q. Lu, S. Zhang, and D. N. Metaxas, Image retreval va probablstc hypergraph ranng, n Proc. CVPR, 2010, pp. 3376 3383. [9] Davd M.Ble, Introducton to Probablstc Topc Models Paper ID: SUB155077 322