# Topical Authority Identification in Community Question Answering

## Transcription

G. Zhou, K. Liu, and J. Zhao

LDA is a Bayesian probabilistic graphical model, which models each document as a mixture of underlying topics and generates each word from one topic. The generation process of a document is described in Table 1. A document $d$ is associated with a multinomial distribution over $K$ topics, which is denoted as $\theta_d$. For each word $w_{di}$ in document $d$: (1) a topic $z_{di}$ is first sampled from the multinomial distribution $\theta_d$, which is generated from the Dirichlet prior parameterized by $\alpha$; (2) then each word $w_{di}$ is generated from the multinomial distribution $\phi_{z_{di}}$, which is generated from the Dirichlet prior parameterized by $\beta$. The two Dirichlet priors for the document-topic distributions $\theta_d$ and the topic-word distributions $\phi_z$ reduce the probability of overfitting training documents and enhance the ability of inferring topic distributions for new documents [11]. Here, we employ Gibbs sampling [12] for parameter estimation due to its faster convergence and better performance [13].

Table 1. The generation process of LDA

- For each topic $z_i \in \{1, \ldots, K\}$, sample a multinomial distribution over words, $\phi_{z_i} \sim \mathrm{Dir}(\beta)$.
- For each document $d$:
  1. Sample a multinomial distribution over topics, $\theta_d \sim \mathrm{Dir}(\alpha)$.
  2. For each word $w_{di}$ in document $d$:
     - sample a topic $z_{di} \sim \mathrm{Multinomial}(\theta_d)$;
     - sample a word $w_{di} \sim \mathrm{Multinomial}(\phi_{z_{di}})$.

To distill the topics that users are interested in using LDA, documents should naturally correspond to questions and answers. However, since the goal is to distill the topics that each user is interested in rather than the topics that each question and the corresponding answers are about, we aggregate the user profiles provided by each individual user into a big document. Thus, each document essentially corresponds to a user. The results of topic distillation are represented in two matrices:

- $DK = [\theta]_{D \times K}$, a $D \times K$ matrix, where $D$ is the number of users and $K$ is the number of topics. $DK_{ij} \in DK$ contains the number of times a word in $u_i$'s profiles (questions and the corresponding answers) has been assigned to topic $z_j$.
- $WK = [\phi]_{W \times K}$, a $W \times K$ matrix, where $W$ is the number of unique words used in the question-answer collection and $K$ is the number of topics. $WK_{ij} \in WK$ denotes the number of times unique word $w_i$ has been assigned to the specific topic $z_j$.

In these two matrices, matrix $DK$ contains the number of times a word in a user's (e.g., $u_i$'s) profiles has been assigned to a particular topic. We can row-normalize it as $DK'$ such that $\|DK'_{i\cdot}\|_1 = 1$ for each row $DK'_{i\cdot}$. Each row of matrix $DK'$ is the probability distribution of $u_i$'s interest over the $K$ topics, e.g., each element $DK'_{ij}$ denotes the probability that $u_i$ is interested in topic $z_j$ ($P(z_j \mid u_i) = DK'_{ij}$).
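The generative process in Table 1 and the row normalization of $DK$ can be illustrated with a small Python sketch. It only simulates the forward (generative) direction on synthetic data and tallies the sampled topic assignments into count matrices; it does not perform the Gibbs sampling inference used in the paper. The function names, the fixed number of words per user, and the hyper-parameter values in the usage note are assumptions made for illustration.

```python
import numpy as np


def sample_user_topic_counts(n_users, n_topics, vocab_size, words_per_user,
                             alpha, beta, seed=0):
    """Simulate Table 1 with one aggregated document per user.

    Returns (DK, WK): DK[i, j] counts words of user i assigned to topic j,
    WK[w, j] counts assignments of vocabulary word w to topic j.
    """
    rng = np.random.default_rng(seed)
    # For each topic z, sample a word distribution phi_z ~ Dir(beta).
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    DK = np.zeros((n_users, n_topics), dtype=int)
    WK = np.zeros((vocab_size, n_topics), dtype=int)
    for d in range(n_users):
        # Sample the user's topic distribution theta_d ~ Dir(alpha).
        theta_d = rng.dirichlet(np.full(n_topics, alpha))
        for _ in range(words_per_user):
            z = rng.choice(n_topics, p=theta_d)    # topic z_di
            w = rng.choice(vocab_size, p=phi[z])   # word w_di
            DK[d, z] += 1
            WK[w, z] += 1
    return DK, WK


def row_normalize(counts):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts),
                     where=totals > 0)
```

For example, `row_normalize(DK)[i, j]` plays the role of $P(z_j \mid u_i)$ after calling `sample_user_topic_counts(4, 3, 10, 25, alpha=1.0, beta=0.5)`. Under the same assumptions, a column-normalized matrix can be obtained as `row_normalize(DK.T).T`.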

#### 2.2 PageRank for Authority Identification

Based on the topics distilled in subsection 2.1, a directed graph $G = (V, E)$ is formed with the topic-specific question-answer relationships among users. $V$ is a set of nodes representing users (askers and answerers). A directed edge $e \in E$, where $e = (u_i, u_j)$, $u_i \in V$ and $u_j \in V$, indicates that user $u_j$ answers the questions of user $u_i$. Each edge $e_{ij} \in E$ is associated with an affinity weight $f(i \to j)$ between $u_i$ and $u_j$. The weight is computed as follows:

$$f(i \to j) = |Q(i) \cap A(j)| \tag{1}$$

where $Q(i)$ is the set of questions asked by $u_i$ and $A(j)$ is the set of questions answered by $u_j$. Two users are connected if their affinity weight is larger than 0, and we let $f(i \to i) = 0$ to avoid self transition (footnote 4: in CQA, users cannot answer their own questions). The transition probability from $u_i$ to $u_j$ is then defined by normalizing the corresponding affinity weight as follows:

$$p(i \to j) = \begin{cases} \dfrac{f(i \to j)}{\sum_{k=1}^{|V|} f(i \to k)} & \text{if } \sum_{k=1}^{|V|} f(i \to k) \neq 0 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

where $p(i \to j)$ is usually not equal to $p(j \to i)$. We use the row-normalized matrix $M = [M_{ij}]_{|V| \times |V|}$ to describe $G$, with each entry corresponding to the transition probability:

$$M_{ij} = p(i \to j) \tag{3}$$

In order to make the graph aperiodic and $M$ a stochastic matrix, the rows with all zero elements are replaced by a smoothing vector with all elements set to $1/|V|$. Based on the matrix $M$, the saliency score $R(u_i)$ for $u_i$ can be deduced from those of all other users linked with it, and it can be formulated in a recursive manner as in the PageRank algorithm:

$$R(u_i) = \lambda \sum_{j:\, u_j \to u_i} R(u_j)\, M_{ji} + (1 - \lambda)\, \frac{1}{|V|} \tag{4}$$

where $\lambda \in [0, 1]$ is a damping factor. The damping factor indicates that each vertex has a probability of $(1 - \lambda)$ of performing a random jump to another vertex within this graph. The saliency scores are obtained by running equation (4) iteratively until convergence.

#### 2.3 Topical PageRank for Authority Identification

In equation (4), the second term is set to the same value $1/|V|$ for all vertices within the graph, which indicates that there are equal probabilities of random jump to all vertices. However, Haveliwala [14] and Nie et al. [15] proposed a topical PageRank-like algorithm (TPR) and argued that the second term in equation (4) should be non-uniform. The assumption is that if we assign larger probabilities to some vertices, the final saliency scores will prefer these vertices.

The idea of TPR is to run PageRank for each topic separately. Each topic-specific PageRank prefers those users with high relevance to the corresponding topic. Formally, for a specific topic $z$, we assign a topic-specific preference value $p(u \mid z)$ to each user $u$ as its random jump probability, with $\sum_{u \in V} p(u \mid z) = 1$. The users who are interested in topic $z$ will be assigned larger probabilities when performing the PageRank. Given a topic $z$, the TPR-like saliency scores are defined as follows:

$$R(u_i \mid z) = \lambda \sum_{j:\, u_j \to u_i} R(u_j \mid z)\, M_{ji} + (1 - \lambda)\, p(u_i \mid z) \tag{5}$$

The setting of the preference value $p(u_i \mid z)$ in equation (5) has a great influence on TPR. In this paper, we set $p(\cdot \mid z) = DZ'_{\cdot z}$, where $DZ'_{\cdot z}$ is the $z$th column of matrix $DZ'$, which is the column-normalized form of the user-topic count matrix $DZ$ (written $DK$ in subsection 2.1) such that $\|DZ'_{\cdot z}\|_1 = 1$. A large $R(u \mid z)$ indicates that user $u$ is a good candidate authority in topic $z$.

For implementation, the initial scores of all users are set to 1 and the iteration in equation (5) is used to compute the new scores of the users. Usually the convergence of the iteration is achieved when the difference between the scores computed at two successive iterations for any user falls below a given threshold ([…] in this paper). After ranking the users by using TPR or other methods, we select the top $K$ users for each topic as topical candidate authorities.

### 3 Experiments

#### 3.1 Data Set

The Yahoo! Answers web service supplies an API that allows web users to crawl the existing question-answer archives and the corresponding user information from the website [17]. We crawl the data set from Yahoo! Answers. It consists of 237,083 resolved questions and 593,107 answers, posted by 286,053 users in total. Table 2 presents the statistics on the data set. In this paper, for all resolved questions, the information of each question includes:

1. Texts of the question and the associated answers, with stop words excluded and words stemmed.
2. User IDs of all questions and answers.
3. Users' rating information (e.g., thumbs up, thumbs down, the best answers and so on).
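Returning to subsections 2.2 and 2.3, the graph construction in equations (1) to (3) and the iteration in equation (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: users are indexed 0 to |V|-1, each user's asked and answered questions are given as Python sets of question ids, and the function names, the tolerance `tol` (the paper's threshold is not recoverable here) and the iteration cap `max_iter` are hypothetical.

```python
import numpy as np


def transition_matrix(asked, answered):
    """Build the row-stochastic matrix M of equations (1)-(3).

    asked[i]    : set of question ids asked by user i, i.e. Q(i)
    answered[j] : set of question ids answered by user j, i.e. A(j)
    """
    n = len(asked)
    F = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:  # f(i -> i) = 0, no self transition
                F[i, j] = len(asked[i] & answered[j])  # equation (1)
    row_sums = F.sum(axis=1, keepdims=True)
    # Equation (2): normalize each row that has at least one outgoing edge.
    M = np.divide(F, row_sums, out=np.zeros_like(F), where=row_sums > 0)
    # All-zero rows are replaced by the smoothing vector 1/|V|.
    M[row_sums.ravel() == 0] = 1.0 / n
    return M


def topical_pagerank(M, preference, damping=0.2, tol=1e-8, max_iter=1000):
    """Iterate equation (5); a uniform preference reduces it to equation (4).

    tol and max_iter are assumed values, not taken from the paper.
    """
    p = np.asarray(preference, dtype=float)
    p = p / p.sum()              # enforce sum_u p(u|z) = 1
    R = np.ones(M.shape[0])      # initial scores of all users set to 1
    for _ in range(max_iter):
        R_new = damping * (M.T @ R) + (1.0 - damping) * p
        if np.abs(R_new - R).max() < tol:
            return R_new
        R = R_new
    return R
```

Calling `topical_pagerank(M, np.ones(len(M)))` gives the plain PageRank scores of equation (4), while passing one column of the column-normalized user-topic matrix as `preference` gives the scores for that topic. The dense $|V| \times |V|$ matrix keeps the sketch short; a graph with hundreds of thousands of users would need a sparse representation.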

Table 2. Yahoo! Answers data set

| Statistic | Count |
| --- | --- |
| Number of questions | 237,083 |
| Number of answers | 593,107 |
| Number of best answers | 162,733 |
| Number of total users | 286,053 |
| Number of askers | 180,166 |
| Number of answerers | 135,441 |
| Number of both askers and answerers | 29,554 |

Since there is no available benchmark for authority identification for a given topic in CQA, we manually inspect the authority identification results. For each candidate authority $u$ for topic $z$, we ask two annotators to check whether $u$ is a real authority for the given topic. In this process, the annotators are given the top topic words and the user profile. Each identified authority is labeled by the two annotators as Yes (the user is a real authority for the given topic) or No (the user is not a real authority for the given topic). If a conflict happens, a third person makes the final judgement. The Cohen's Kappa coefficients of the $Z$ topics range from 0.51 to 0.77, showing fair to good agreement.

#### 3.2 Evaluation Metrics

To evaluate the performance of authority identification, we use three widely studied metrics in information retrieval:

- Mean Average Precision (MAP): the mean of the average precision scores for each topic.
- Mean Reciprocal Rank (MRR): the multiplicative inverse of the rank of the first retrieved authority, averaged over topics.
- Average precision in the top $n$ (Avg. […]): the average ratio of the relevant authorities among the top $n$ identified authorities for each topic.

#### 3.3 Parameter Setting

We have several parameters: the Dirichlet hyper-parameters $\alpha$ and $\beta$, the topic number $Z$, and the damping factor $\lambda$ used in PageRank. In this paper, we set the Dirichlet priors $\alpha = 50/Z$ and $\beta = 0.05$, following Griffiths and Steyvers [12]. We run LDA with 200 iterations of Gibbs sampling. After trying a few different numbers of topics, we empirically set $Z = 15$. We choose these parameter settings because they give coherent and meaningful topics for our data set. For parameter $\lambda$, we conduct an experiment on a small development set to determine the best value among $0.1, 0.2, \ldots, 0.9$ in terms of MAP. This set is also extracted from Yahoo! Answers, and it is not included in the evaluation set. We find that $\lambda = 0.2$ is the optimal value for both PR and TPR.
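The three metrics of subsection 3.2 can be computed from the annotators' Yes/No labels with a short Python sketch. It assumes that each topic's ranked candidate list is given as a list of 0/1 labels (1 = real authority) and that average precision is normalized by the number of authorities found in that list, since only the selected candidates are judged. The function names are hypothetical.

```python
def average_precision(labels):
    """Average precision of one ranked list of 0/1 labels."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def reciprocal_rank(labels):
    """Inverse of the rank of the first real authority (0 if none)."""
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0


def precision_at_n(labels, n):
    """Ratio of real authorities among the top n candidates."""
    return sum(labels[:n]) / n


def evaluate(per_topic_labels, n):
    """Average the three metrics over all topics."""
    t = len(per_topic_labels)
    return {
        "MAP": sum(average_precision(l) for l in per_topic_labels) / t,
        "MRR": sum(reciprocal_rank(l) for l in per_topic_labels) / t,
        "Avg": sum(precision_at_n(l, n) for l in per_topic_labels) / t,
    }
```

For two topics with labels `[1, 0, 1]` and `[0, 1, 0]` and `n = 2`, the average precisions are 5/6 and 1/2, so MAP is 2/3; the reciprocal ranks are 1 and 1/2, so MRR is 0.75; and the precision in the top 2 is 0.5 for both topics.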

Table 3. Comparison of authority identification for different methods

| # | Methods | MAP | MRR | Avg. |
| --- | --- | --- | --- | --- |
| 1 | PR | […] | […] | […] |
| 2 | HITS | […] | […] | […] |
| 3 | InD | […] | […] | […] |
| 4 | ER | […] | […] | […] |
| 5 | TPR | […] | […] | […] |

#### 3.4 Experimental Results

Comparison with different methods. To demonstrate the effectiveness of our proposed TPR method, comparisons against some previous work are also included:

- PageRank (PR): This method finds the authorities with only the link structure taken into account [8].
- HITS: Jurczyk and Agichtein [3] proposed to find authorities in CQA and estimated the ranking scores by using the HITS algorithm.
- InDegree (InD): This method identifies the authorities based on the number of best answers, as described in Bouguessa et al. [2].
- ExpertiseRank (ER): Zhang et al. [7] proposed a PageRank-like algorithm called ExpertiseRank to rank authorities in an expertise network, considering how many users are involved in asking and answering questions.

Table 3 presents the comparison of authority identification for different methods. From this table, we can see that our proposed method significantly outperforms all previous works (row 1, row 2, row 3, and row 4 vs. row 5). The results show the effectiveness of the proposed method, which takes the topic information of users into account.

Footnote 7: We perform a t-test for significance. The comparisons between our method and previous works are significant at $p < 0.05$.

### 4 Conclusion and Future Work

In this paper, we propose a topical rank method for authority identification in CQA. Compared to the traditional link analysis techniques, our proposed method is more effective because it finds the authorities by taking into account both the link structure and the topic information about users. We conduct experiments on a real-world data set from Yahoo! Answers. Experimental results show that our proposed method significantly outperforms the traditional link analysis techniques and achieves state-of-the-art performance.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (No. […]), the National Basic Research Program of China (No. 2012CB316300), the Tsinghua National Laboratory for Information Science and Technology (TNList) Cross-discipline Foundation, and the Opening Project of Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (No. […]). We thank the anonymous reviewers for their insightful comments.

References

1. Agichtein, E., Castillo, C., Donato, D.: Finding High-Quality Content in Social Media. In: Proceedings of WSDM
2. Bouguessa, M., Dumoulin, B., Wang, S.: Identifying authoritative actors in question-answering forums: the case of Yahoo! Answers. In: Proceedings of KDD
3. Jurczyk, P., Agichtein, E.: Discovering authorities in question answer communities by using link analysis. In: Proceedings of CIKM
4. Liu, J., Song, Y.-I., Lin, C.-Y.: Competition-based user expertise score estimation. In: Proceedings of SIGIR
5. Pal, A., Konstan, J.: Expert identification in community question answering: exploring question selection bias. In: Proceedings of CIKM
6. Kao, W., Liu, D., Wang, S.: Expert finding in question-answering websites: a novel hybrid approach. In: Proceedings of SAC
7. Zhang, J., Ackerman, M., Adamic, L.: Expertise networks in online communities: structure and algorithm. In: Proceedings of WWW
8. Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: bringing order to the web. Stanford Digital Library Technologies Project
9. Kleinberg, J.: Authoritative sources in a hyperlinked environment. Journal of the ACM 46(5)
10. Blei, D., Ng, A., Jordan, M.: Latent dirichlet allocation. Journal of Machine Learning Research 3
11. Guo, J., Xu, S., Bao, S., Yu, Y.: Tapping on the potential of Q&A community by recommending answer providers. In: Proceedings of CIKM
12. Griffiths, T., Steyvers, M.: Finding scientific topics. The National Academy of Sciences 101
13. Porteous, I., Newman, D., Ihler, A., Asuncion, A., Smyth, P., Welling, M.: Fast collapsed gibbs sampling for latent dirichlet allocation. In: Proceedings of KDD
14. Haveliwala, T.H.: Topic-sensitive pagerank. In: Proceedings of WWW
15. Nie, L., Davison, B.D., Qi, X.: Topic link analysis for web search. In: Proceedings of SIGIR
16. Li, B., King, I.: Routing questions to appropriate answerers in community question answering services. In: Proceedings of CIKM
17. Zhou, G., Cai, L., Zhao, J., Liu, K.: Phrase-based translation model for question retrieval in community question answer archives. In: Proceedings of ACL

### Subordinating to the Majority: Factoid Question Answering over CQA Sites

Journal of Computational Information Systems 9: 16 (2013) 6409 6416 Available at http://www.jofcis.com Subordinating to the Majority: Factoid Question Answering over CQA Sites Xin LIAN, Xiaojie YUAN, Haiwei

### Joint Relevance and Answer Quality Learning for Question Routing in Community QA

Joint Relevance and Answer Quality Learning for Question Routing in Community QA Guangyou Zhou, Kang Liu, and Jun Zhao National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy

### Incorporating Participant Reputation in Community-driven Question Answering Systems

Incorporating Participant Reputation in Community-driven Question Answering Systems Liangjie Hong, Zaihan Yang and Brian D. Davison Department of Computer Science and Engineering Lehigh University, Bethlehem,

### Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement

Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement Jiang Bian College of Computing Georgia Institute of Technology jbian3@mail.gatech.edu Eugene Agichtein

Routing Questions for Collaborative Answering in Community Question Answering Shuo Chang Dept. of Computer Science University of Minnesota Email: schang@cs.umn.edu Aditya Pal IBM Research Email: apal@us.ibm.com

### Integrated Expert Recommendation Model for Online Communities

Integrated Expert Recommendation Model for Online Communities Abeer El-korany 1 Computer Science Department, Faculty of Computers & Information, Cairo University ABSTRACT Online communities have become

### Question Routing by Modeling User Expertise and Activity in cqa services

Question Routing by Modeling User Expertise and Activity in cqa services Liang-Cheng Lai and Hung-Yu Kao Department of Computer Science and Information Engineering National Cheng Kung University, Tainan,

### A Tri-Role Topic Model for Domain-Specific Question Answering

A Tri-Role Topic Model for Domain-Specific Question Answering Zongyang Ma Aixin Sun Quan Yuan Gao Cong School of Computer Engineering, Nanyang Technological University, Singapore 639798 {zma4, qyuan1}@e.ntu.edu.sg

### Exploiting Bilingual Translation for Question Retrieval in Community-Based Question Answering

Exploiting Bilingual Translation for Question Retrieval in Community-Based Question Answering Guangyou Zhou, Kang Liu and Jun Zhao National Laboratory of Pattern Recognition Institute of Automation, Chinese

### Early Detection of Potential Experts in Question Answering Communities

Early Detection of Potential Experts in Question Answering Communities Aditya Pal 1, Rosta Farzan 2, Joseph A. Konstan 1, and Robert Kraut 2 1 Dept. of Computer Science and Engineering, University of Minnesota

### Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

### New Metrics for Reputation Management in P2P Networks

New for Reputation in P2P Networks D. Donato, M. Paniccia 2, M. Selis 2, C. Castillo, G. Cortesi 3, S. Leonardi 2. Yahoo!Research Barcelona Catalunya, Spain 2. Università di Roma La Sapienza Rome, Italy

### International Journal of Engineering Research-Online A Peer Reviewed International Journal Articles are freely available online:http://www.ijoer.

RESEARCH ARTICLE SURVEY ON PAGERANK ALGORITHMS USING WEB-LINK STRUCTURE SOWMYA.M 1, V.S.SREELAXMI 2, MUNESHWARA M.S 3, ANIL G.N 4 Department of CSE, BMS Institute of Technology, Avalahalli, Yelahanka,

### CQARank: Jointly Model Topics and Expertise in Community Question Answering

CQARank: Jointly Model Topics and Expertise in Community Question Answering Liu Yang,, Minghui Qiu, Swapna Gottipati, Feida Zhu, Jing Jiang, Huiping Sun, Zhong Chen School of Software and Microelectronics,

### Incorporate Credibility into Context for the Best Social Media Answers

PACLIC 24 Proceedings 535 Incorporate Credibility into Context for the Best Social Media Answers Qi Su a,b, Helen Kai-yun Chen a, and Chu-Ren Huang a a Department of Chinese & Bilingual Studies, The Hong

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### FINDING EXPERT USERS IN COMMUNITY QUESTION ANSWERING SERVICES USING TOPIC MODELS

FINDING EXPERT USERS IN COMMUNITY QUESTION ANSWERING SERVICES USING TOPIC MODELS by Fatemeh Riahi Submitted in partial fulfillment of the requirements for the degree of Master of Computer Science at Dalhousie

### Evolution of Experts in Question Answering Communities

Evolution of Experts in Question Answering Communities Aditya Pal, Shuo Chang and Joseph A. Konstan Department of Computer Science University of Minnesota Minneapolis, MN 55455, USA {apal,schang,konstan}@cs.umn.edu

### Data Mining Yelp Data - Predicting rating stars from review text

Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority

### Corporate Leaders Analytics and Network System (CLANS): Constructing and Mining Social Networks among Corporations and Business Elites in China

Corporate Leaders Analytics and Network System (CLANS): Constructing and Mining Social Networks among Corporations and Business Elites in China Yuanyuan Man, Shuai Wang, Yi Li, Yong Zhang, Long Cheng,

### Latent Dirichlet Markov Allocation for Sentiment Analysis

Latent Dirichlet Markov Allocation for Sentiment Analysis Ayoub Bagheri Isfahan University of Technology, Isfahan, Iran Intelligent Database, Data Mining and Bioinformatics Lab, Electrical and Computer

### MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of

### Network Big Data: Facing and Tackling the Complexities Xiaolong Jin

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin CAS Key Laboratory of Network Data Science & Technology Institute of Computing Technology Chinese Academy of Sciences (CAS) 2015-08-10

### Pharos: Social Map-Based Recommendation for Content-Centric Social Websites

Pharos: Social Map-Based Recommendation for Content-Centric Social Websites Wentao Zheng Michelle Zhou Shiwan Zhao Quan Yuan Xiatian Zhang Changyan Chi IBM Research China IBM Research Almaden ABSTRACT

### Personalizing Image Search from the Photo Sharing Websites

Personalizing Image Search from the Photo Sharing Websites Swetha.P.C, Department of CSE, Atria IT, Bangalore swethapc.reddy@gmail.com Aishwarya.P Professor, Dept.of CSE, Atria IT, Bangalore aishwarya_p27@yahoo.co.in

### Improving Question Retrieval in Community Question Answering Using World Knowledge

Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Improving Question Retrieval in Community Question Answering Using World Knowledge Guangyou Zhou, Yang Liu, Fang

### Finding Expert Users in Community Question Answering

Finding Expert Users in Community Question Answering Fatemeh Riahi Faculty of Computer Science Dalhousie University riahi@cs.dal.ca Zainab Zolaktaf Faculty of Computer Science Dalhousie University zolaktaf@cs.dal.ca

### Graph Processing and Social Networks

Graph Processing and Social Networks Presented by Shu Jiayu, Yang Ji Department of Computer Science and Engineering The Hong Kong University of Science and Technology 2015/4/20 1 Outline Background Graph

### Probabilistic topic models for sentiment analysis on the Web

University of Exeter Department of Computer Science Probabilistic topic models for sentiment analysis on the Web Chenghua Lin September 2011 Submitted by Chenghua Lin, to the the University of Exeter as

### Learning to Suggest Questions in Online Forums

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence Learning to Suggest Questions in Online Forums Tom Chao Zhou 1, Chin-Yew Lin 2,IrwinKing 3, Michael R. Lyu 1, Young-In Song 2

### Link-based Analysis on Large Graphs. Presented by Weiren Yu Mar 01, 2011

Link-based Analysis on Large Graphs Presented by Weiren Yu Mar 01, 2011 Overview 1 Introduction 2 Problem Definition 3 Optimization Techniques 4 Experimental Results 2 1. Introduction Many applications

### Information Quality on Yahoo! Answers

Information Quality on Yahoo! Answers Pnina Fichman Indiana University, Bloomington, United States ABSTRACT Along with the proliferation of the social web, question and answer (QA) sites attract millions

### Topic models for Sentiment analysis: A Literature Survey

Topic models for Sentiment analysis: A Literature Survey Nikhilkumar Jadhav 123050033 June 26, 2014 In this report, we present the work done so far in the field of sentiment analysis using topic models.

### The Second Eigenvalue of the Google Matrix

0 2 The Second Eigenvalue of the Google Matrix Taher H Haveliwala and Sepandar D Kamvar Stanford University taherh,sdkamvar @csstanfordedu Abstract We determine analytically the modulus of the second eigenvalue

### REVIEW ON QUERY CLUSTERING ALGORITHMS FOR SEARCH ENGINE OPTIMIZATION

Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING

### The PageRank Citation Ranking: Bring Order to the Web

The PageRank Citation Ranking: Bring Order to the Web presented by: Xiaoxi Pang 25.Nov 2010 1 / 20 Outline Introduction A ranking for every page on the Web Implementation Convergence Properties Personalized

### A survey on click modeling in web search

A survey on click modeling in web search Lianghao Li Hong Kong University of Science and Technology Outline 1 An overview of web search marketing 2 An overview of click modeling 3 A survey on click models

### Part 1: Link Analysis & Page Rank

Chapter 8: Graph Data Part 1: Link Analysis & Page Rank Based on Leskovec, Rajaraman, Ullman 214: Mining of Massive Datasets 1 Exam on the 5th of February, 216, 14. to 16. If you wish to attend, please

### Enhancing the Ranking of a Web Page in the Ocean of Data

Database Systems Journal vol. IV, no. 3/2013 3 Enhancing the Ranking of a Web Page in the Ocean of Data Hitesh KUMAR SHARMA University of Petroleum and Energy Studies, India hkshitesh@gmail.com In today

### Big Data Technology Motivating NoSQL Databases: Computing Page Importance Metrics at Crawl Time

Big Data Technology Motivating NoSQL Databases: Computing Page Importance Metrics at Crawl Time Edward Bortnikov & Ronny Lempel Yahoo! Labs, Haifa Class Outline Link-based page importance measures Why

### 1 o Semestre 2007/2008

Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction

### RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS

ISBN: 978-972-8924-93-5 2009 IADIS RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS Ben Choi & Sumit Tyagi Computer Science, Louisiana Tech University, USA ABSTRACT In this paper we propose new methods for

### Inference Methods for Analyzing the Hidden Semantics in Big Data. Phuong LE-HONG phuonglh@gmail.com

Inference Methods for Analyzing the Hidden Semantics in Big Data Phuong LE-HONG phuonglh@gmail.com Introduction Grant proposal for basic research project Nafosted, 2014 24 months Principal Investigator:

### Web Graph Analyzer Tool

Web Graph Analyzer Tool Konstantin Avrachenkov INRIA Sophia Antipolis 2004, route des Lucioles, B.P.93 06902, France Email: K.Avrachenkov@sophia.inria.fr Danil Nemirovsky St.Petersburg State University

### The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity

The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity David Cohn Burning Glass Technologies 201 South Craig St, Suite 2W Pittsburgh, PA 15213 david.cohn@burning-glass.com

### An Improved Page Rank Algorithm based on Optimized Normalization Technique

An Improved Page Rank Algorithm based on Optimized Normalization Technique Hema Dubey,Prof. B. N. Roy Department of Computer Science and Engineering Maulana Azad National Institute of technology Bhopal,

Ranking Community Answers by Modeling Question-Answer Relationships via Analogical Reasoning Xin-Jing Wang Microsoft Research Asia 4F Sigma, 49 Zhichun Road Beijing, P.R.China xjwang@microsoft.com Xudong

### PRODUCT REVIEW RANKING SUMMARIZATION

PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,

### Semantic Search in Portals using Ontologies

Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br

### USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

### Discovering Social Media Experts by Integrating Social Networks and Contents

Proceedings of the Twenty-Third Australasian Database Conference (ADC 2012), Melbourne, Australia Discovering Social Media Experts by Integrating Social Networks and Contents Zhao Zhang Bin Zhao Weining

### Characterization of Latent Social Networks Discovered through Computer Network Logs

Characterization of Latent Social Networks Discovered through Computer Network Logs Kevin M. Carter MIT Lincoln Laboratory 244 Wood St Lexington, MA 02420 kevin.carter@ll.mit.edu Rajmonda S. Caceres MIT

### 1. Systematic literature review

1. Systematic literature review Details about population, intervention, outcomes, databases searched, search strings, inclusion exclusion criteria are presented here. The aim of systematic literature review

IADIS International Journal on WWW/Internet Vol. 12, No. 1, pp. 52-64 ISSN: 1645-7641 USER INTENT PREDICTION FROM ACCESS LOG IN ONLINE SHOP Hidekazu Yanagimoto. Osaka Prefecture University. 1-1, Gakuen-cho,

### HITS vs. Non-negative Matrix Factorization

Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 HITS vs. Non-negative Matrix Factorization Yuanzhe Cai, Sharma Chakravarthy Technical Report CSE 2014

### Fraudulent Support Telephone Number Identification Based on Co-occurrence Information on the Web

Fraudulent Support Telephone Number Identification Based on Co-occurrence Information on the Web Xin Li, Yiqun Liu, Min Zhang, Shaoping Ma State Key Laboratory of Intelligent Technology and Systems Tsinghua

### Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks

The 4 th China-Australia Database Workshop Melbourne, Australia Oct. 19, 2015 Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks Jun Xu Institute of Computing Technology, Chinese Academy

### Question Utility: A Novel Static Ranking of Question Search

Question Utility: A Novel Static Ranking of Question Search Young-In Song Korea University Seoul, Korea song@nlp.korea.ac.kr Chin-Yew Lin, Yunbo Cao Microsoft Research Asia Beijing, China {cyl, yunbo.cao}@microsoft.com

### Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

### Question Quality in Community Question Answering Forums: A Survey

Question Quality in Community Question Answering Forums: A Survey ABSTRACT Antoaneta Baltadzhieva Tilburg University P.O. Box 90153 Tilburg, Netherlands a baltadzhieva@yahoo.de Community Question Answering

### Dynamical Clustering of Personalized Web Search Results

Dynamical Clustering of Personalized Web Search Results Xuehua Shen CS Dept, UIUC xshen@cs.uiuc.edu Hong Cheng CS Dept, UIUC hcheng3@uiuc.edu Abstract Most current search engines present the user a ranked

### Crowdsourcing Fraud Detection Algorithm Based on Psychological Behavior Analysis

, pp.138-142 http://dx.doi.org/10.14257/astl.2013.31.31 Crowdsourcing Fraud Detection Algorithm Based on Psychological Behavior Analysis Li Peng 1,2, Yu Xiao-yang 1, Liu Yang 2, Bi Ting-ting 2 1 Higher

### IT services for analyses of various data samples

IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical

### Personalized Reputation Management in P2P Networks

Personalized Reputation Management in P2P Networks Paul - Alexandru Chirita 1, Wolfgang Nejdl 1, Mario Schlosser 2, and Oana Scurtu 1 1 L3S Research Center / University of Hannover Deutscher Pavillon Expo

### THUTR: A Translation Retrieval System

THUTR: A Translation Retrieval System Chunyang Liu, Qi Liu, Yang Liu, and Maosong Sun Department of Computer Science and Technology State Key Lab on Intelligent Technology and Systems National Lab for

### Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media

Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media ABSTRACT Jiang Bian College of Computing Georgia Institute of Technology Atlanta, GA 30332 jbian@cc.gatech.edu Eugene

### Affinity Prediction in Online Social Networks

Affinity Prediction in Online Social Networks Matias Estrada and Marcelo Mendoza Skout Inc., Chile Universidad Técnica Federico Santa María, Chile Abstract Link prediction is the problem of inferring whether

### Quality-Aware Collaborative Question Answering: Methods and Evaluation

Quality-Aware Collaborative Question Answering: Methods and Evaluation ABSTRACT Maggy Anastasia Suryanto School of Computer Engineering Nanyang Technological University magg0002@ntu.edu.sg Aixin Sun School

### Ranking on Data Manifolds

Ranking on Data Manifolds Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany {firstname.secondname

### Spam Detection with a Content-based Random-walk Algorithm

Spam Detection with a Content-based Random-walk Algorithm ABSTRACT F. Javier Ortega Departamento de Lenguajes y Sistemas Informáticos Universidad de Sevilla Av. Reina Mercedes s/n 41012, Sevilla (Spain)

### An Introduction to Data Mining

An Introduction to Data Mining Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

### Data Mining in Web Search Engine Optimization and User Assisted Rank Results

Data Mining in Web Search Engine Optimization and User Assisted Rank Results Minky Jindal Institute of Technology and Management Gurgaon 122017, Haryana, India Nisha kharb Institute of Technology and Management

### Ranking User Influence in Healthcare Social Media

Ranking User Influence in Healthcare Social Media XUNING TANG College of Information Science and Technology, Drexel University, PA, U.S.A. and CHRISTOPHER C. YANG College of Information Science and Technology,

### Extracting Information from Social Networks

Extracting Information from Social Networks Aggregating site information to get trends 1 Not limited to social networks Examples Google search logs: flu outbreaks We Feel Fine Bullying 2 Bullying Xu, Jun,

### Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data

Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data Jun Wang Department of Mechanical and Automation Engineering The Chinese University of Hong Kong Shatin, New Territories,

### Online Courses Recommendation based on LDA

Online Courses Recommendation based on LDA Rel Guzman Apaza, Elizabeth Vera Cervantes, Laura Cruz Quispe, José Ochoa Luna National University of St. Agustin Arequipa - Perú {r.guzmanap,elizavvc,lvcruzq,eduardo.ol}@gmail.com

### On the Feasibility of Answer Suggestion for Advice-seeking Community Questions

21st International Congress on Modelling and Simulation, Gold Coast, Australia, 29 Nov to 4 Dec 2015 www.mssanz.org.au/modsim2015 On the Feasibility of Answer Suggestion for Advice-seeking Community Questions

### Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines

22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing

### Improving Web Page Retrieval using Search Context from Clicked Domain Names

Improving Web Page Retrieval using Search Context from Clicked Domain Names Rongmei Li School of Electrical, Mathematics, and Computer Science University of Twente P.O.Box 217, 7500 AE, Enschede, the Netherlands

### SUIT: A Supervised User-Item Based Topic Model for Sentiment Analysis

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence SUIT: A Supervised User-Item Based Topic Model for Sentiment Analysis Fangtao Li 1, Sheng Wang 2, Shenghua Liu 3 and Ming Zhang

### PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL

Journal homepage: www.mjret.in ISSN:2348-6953 PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL Utkarsha Vibhute, Prof. Soumitra

### Social Tagging Behaviour in Community-driven Question Answering

Social Tagging Behaviour in Community-driven Question Answering Eduarda Mendes Rodrigues Natasa Milic-Frayling Blaz Fortuna Microsoft Research Microsoft Research Dept. of Knowledge Technologies 7 JJ Thomson

### A PREDICTIVE MODEL FOR QUERY OPTIMIZATION TECHNIQUES IN PERSONALIZED WEB SEARCH

International Journal of Computer Science and System Analysis Vol. 5, No. 1, January-June 2011, pp. 37-43 Serials Publications ISSN 0973-7448 A PREDICTIVE MODEL FOR QUERY OPTIMIZATION TECHNIQUES IN PERSONALIZED

### Identifying Influential Scholars in Academic Social Media Platforms

Identifying Influential Scholars in Academic Social Media Platforms Na Li, Denis Gillet École Polytechnique Fédérale de Lausanne (EPFL) 1015 Lausanne, Switzerland {na.li, denis.gillet}@epfl.ch Abstract

### Web based English-Chinese OOV term translation using Adaptive rules and Recursive feature selection

Web based English-Chinese OOV term translation using Adaptive rules and Recursive feature selection Jian Qu, Nguyen Le Minh, Akira Shimazu School of Information Science, JAIST Ishikawa, Japan 923-1292

### Quality of Service Routing Network and Performance Evaluation*

Quality of Service Routing Network and Performance Evaluation* Shen Lin, Cui Yong, Xu Ming-wei, and Xu Ke Department of Computer Science, Tsinghua University, Beijing, P.R.China, 100084 {shenlin, cy, xmw,

### Identifying Focus, Techniques and Domain of Scientific Papers

Identifying Focus, Techniques and Domain of Scientific Papers Sonal Gupta Department of Computer Science Stanford University Stanford, CA 94305 sonal@cs.stanford.edu Christopher D. Manning Department of

### Understanding Web Hosting Utility of Chinese ISPs

Understanding Web Hosting Utility of Chinese ISPs Zhang Guanqun 1,2, Wang Hui 1,2, Yang Jiahai 1,2 1 The Network Research Center, Tsinghua University, 2 Tsinghua National Laboratory for Information Science

### Recommender Systems Seminar Topic: Application

Recommender Systems Seminar Topic: Application Tung Do 28 January 2014 TU Darmstadt Thanh Tung Do 1 Agenda Google news personalization: Scalable Online Collaborative Filtering Algorithm, System Components

### Effective and Efficient Approaches to Retrieving and Using Expertise in Social Media

Effective and Efficient Approaches to Retrieving and Using Expertise in Social Media Reyyan Yeniterzi CMU-LTI-15-008 Language Technologies Institute School of Computer Science Carnegie Mellon University

### FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT

### The multilayer sentiment analysis model based on Random forest

2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of

### Query term suggestion in academic search

Query term suggestion in academic search Suzan Verberne 1, Maya Sappelli 1,2, and Wessel Kraaij 2,1 1. Institute for Computing and Information Sciences, Radboud University Nijmegen 2. TNO, Delft Abstract.

### Pagerank-like algorithm for ranking news stories and news portals

Pagerank-like algorithm for ranking news stories and news portals Igor Trajkovski Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje, Rugjer Boshkovikj 16, P.O. Box

### Emoticon Smoothed Language Models for Twitter Sentiment Analysis

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of

### Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services

Achieve Better Ranking Accuracy Using CloudRank Framework for Cloud Services Ms. M. Subha #1, Mr. K. Saravanan *2 # Student, * Assistant Professor Department of Computer Science and Engineering Regional

### Ranked Keyword Search in Cloud Computing: An Innovative Approach

International Journal of Computational Engineering Research Vol. 03, Issue 6 Ranked Keyword Search in Cloud Computing: An Innovative Approach 1, Vimmi Makkar 2, Sandeep Dalal 1, (M.Tech) 2, (Assistant Professor)