Expert Finding for Question Answering via Graph Regularized Matrix Completion

Save this PDF as:

Size: px
Start display at page:

Transcription

6 6 Algorithm 1 The optimization using GRMC-EGM Input: Y, L, Q and tolerance ε Initialize: ρ 0, γ and X 0 1: repeat : STEP 1. Update ( X k as ) 3: X k = D λ X k 1 1 ρ ρ k 1 g(x k 1 ) k 1 4: STEP. Update ρ k as 5: ˆρ ρ k 1 6: while h(x k ) > Gˆρ (X k, X k 1 ) do 7: ˆρ = γ ˆρ 8: ρ k ˆρ 9: until X k X k 1 F < ε that matrix C = X k 1 1 ρ k 1 g(x k 1 ). Thus, we can obtain the closed form solution of the above Problem (14) by Theorem 1 as follows: X k = D λ ρ k 1 ( X k 1 1 ρ k 1 g(x k 1 ) ) (17) 1 Computing ρ k : We then compute the step size ρ k properly such that convergence rate of our method can be O( 1 k ). We first introduce the following useful theorem to learn the convergence rate: Theorem : ([6]) Let X = arg min X h(x) and ρ be the smoothing parameter. Then for any k 1, we have h(x k ) h(x ) γρ X 0 W F k (18) if we can find an appropriate value for ρ at each iteration such that the condition h(x k ) G ρ (X k, X k 1 ) (19) is satisfied. Fix X k 1, we compute the value of ρ k with multiplier γ to satisfy the condition in Theorem at each iteration, given by ˆρ ρ k 1, and ˆρ = γ ˆρ (0) and G ρ (X k, X k 1 ) is then update by Equation 15. The whole procedure of GRMC-EGM is summarized in Algorithm 1. The main computation cost of GRMC-EGM in each iteration is the computation of SVD in STEP 1. The additional cost in STEP is much smaller. For large scale problems, we can adopt some existing techniques [1] to accelerate the computation of SVD and make Algorithm 1 more efficient. Furthermore, the convergence of Algorithm 1 for Problem (14) is O( 1 k ), which is guaranteed by the EGM [17]. 5. The Optimization Using GRMC-AGM In this section, we introduce an accelerated gradient method for solving the regularized graph matrix completion Problem (8), given by GRMC-AGM. Algorithm The optimization using GRMC-AGM Input: Z, L, Q and tolerance ε Initialize: ρ 0, γ and X 0 1: repeat : STEP 1. Update ( X k as ) 3: X k = D λ Z k 1 1 ρ ρ k 1 g(z k 1 ) k 1 4: STEP. Update α k as 5: α k = α k 1 6: STEP 3. Update Z k as 7: Z k = X k 1 + α k 1 1 α k (X k X k 1 ) 8: STEP 4. Update ρ k as 9: ˆρ ρ k 1 10: while h(z k ) > Gˆρ (Z k, Z k 1 ) do 11: ˆρ = γ ˆρ 1: ρ k ˆρ 13: until X k X k 1 F < ε Firstly, we introduce a new variable Z. GRMC-AGM method constructs an approximation of h(x) at a given point Z as G ρ (X, Z) = g(z) + ρ k X Z F + tr ( (X Z) T g(z) ) + λ X (1) Then, GRMC-AGM method solves the optimization Problem (8) by iteratively updating X, Z, and ρ. Computing X k : In the kth iteration, we update X k as the unique minimizer of G ρk 1 (X, Z k 1 ): X k = arg min G ρ k 1 (X, Z k 1 ) X = ρ ( k 1 X Z k 1 1 ) g(z k 1 ) ρ k 1 F + λ X () Similarly, we consider that the matrix C = Z k 1 1 ρ k 1 g(x k 1 ). Then, we can obtain the closed form solution of the above Problem () by Theorem 1 as follows: X k = D λ ρ k 1 ( Z k 1 1 ) g(z k 1 ) ρ k 1 (3) Computing Z k : Then, Z k is updated in the same way as [17]: αk 1 α k =, (4) Z k = X k 1 + α k 1 1 (X k X k 1 ) (5) α k We summarize the main procedure for solving problem 8 by AGM in Algorithm. Compared with GRMC-EGM, Algorithm has a convergence rate of O( 1 k ), which is guaranteed by the convergence property of the AGM method [17].

9 9 TABLE 5 EXPERIMENTAL RESULTS ON ACCU (THE best SCORE IN bold) ACCU D 1 D D 3 D 4 D 5 D 6 D 7 D 8 AuthorityRank ExpertsRank TSPM DRM CRAR GRMC-EGN GRMC-AGM TABLE 6 EXPERIMENTAL RESULTS ON AVERAGE (THE best SCORE IN bold) D 1 D D 3 D 4 D 5 D 6 D 7 D 8 AuthorityRank ExpertsRank TSPM DRM CRAR GRMC-EGN GRMC-AGM MRR Value CRAR GRMC-EGM GRMC-AGM ndcg Value CRAR GRMC-EGM 0.7 GRMC-AGM λ 1 λ 1 (a) Varying λ 1 on MRR (b) Varying λ 1 on ndcg Accu Value CRAR 0.35 GRMC-EGM GRMC-AGM Value CRAR GRMC-EGM 0. GRMC-AGM λ 1 λ 1 (c) Varying λ 1 on Accu (d) Varying λ 1 on Fig. 3. The performance of GRMC-EGM and GRMC-AGM versus parameter λ 1.

10 x GRMC EGM x GRMC AGM Objective value Objective value Iteration (a) GRMC-EGM Iteration (b) GRMC-AGM Fig. 4. The convergence of GRMC-EGM and GRMC-AGM The AuthorityRank and ExpertsRank methods are based on link analysis of users while TSPM and DRM methods are based on probabilistic model. The CRAR method is based on both link analysis of users and probabilistic model. These experiments reveal a number of interesting points: The topic-oriented methods, both TSPM and DRM, outperform the AuthorityRank and ExpertsRank methods, which suggests the effectiveness of the latent user model for the problem of expert finding. The CRAR method achieves better performance of other baseline methods. This suggests that the link analysis of users can also improve the performance of expert finding. In all the cases, our GRMC-EGM and GRMC-AGM methods achieve the best performance. This shows that leveraging the power of both the social network of users and the viewpoint of missing value estimation can further improve the performance of expert finding. In our approaches, there is an essential parameter, that is, the graph regularization parameter λ 1. When the value of λ 1 becomes small, our problem can be considered as the original problem of matrix completion, which is only based on the past question-answering activities. We vary parameter λ 1 to investigate the benefits of our methods from the idea of graph regularized matrix completion for missing value estimation based on social network of users. We vary the value of parameter λ 1 from 10 4 to 10 4 and show the evaluation results in Figures 3(a), 3(b), 3(c) and 3(d). We notice that the CRAR method consistently outperforms other methods in most of the datasets. Thus, we mainly compare our methods with CRAR method on data D 1 by varying parameter λ 1. The performance trend of our methods by varying parameter λ 1 is similar on other data sets. As we can see, the performance of both GRMC-EGM nd GRMC-AGM methods is very stable with respect to the parameter λ 1. Both GRMC-EGM and GRMC-AGM methods almost achieve consistently good performance when λ 1 varies from 10 to 10 4 in Figures 3(a), 3(b), 3(c) and 3(d). As we have described, both GRMC-EGM and GRMC- AGM use the social network of users to capture the local geometric structure of the user model. The success of graph regularized matrix completion for expert finding relies on the assumption that two neighboring users share the similar user model. When the parameter λ 1 is small, the formulation of graph regularized matrix completion can be considered as matrix completion for expert finding without graph of users. We observe that the performance of both GRMC-EGM and GRMC-AGM is relatively low, which is similar to the performance of DRM (based on PLSA model). This is the reason why the performance of both GRMC-EGM and GRMC-AGM methods increases as λ 1 increases, as shown in Figures 3(a), 3(b), 3(c) and 3(d). 6.4 Convergence Study The updating rules for minimizing the objective function of both GRMC-EGM and GRMC-AGM methods are essentially iterative. Here we investigate how both GRMC-EGM and GRMC-AGM methods converge. Figures 4(a) and 4(b) show the convergence curves of GRMC-EGM and GRMC-AGM methods on data D 1, respectively. The y-axis is the value of the objective function and x-axis denotes the iteration number. We can observe that method GRMC-AGM converges much faster than method GRMC-EGM. 7 CONCLUSIONS We formulated the problem of expert finding from a new perspective of missing value estimation. We presented a novel method called graph regularized matrix completion

Subordinating to the Majority: Factoid Question Answering over CQA Sites

Journal of Computational Information Systems 9: 16 (2013) 6409 6416 Available at http://www.jofcis.com Subordinating to the Majority: Factoid Question Answering over CQA Sites Xin LIAN, Xiaojie YUAN, Haiwei

Topical Authority Identification in Community Question Answering

Topical Authority Identification in Community Question Answering Guangyou Zhou, Kang Liu, and Jun Zhao National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences 95

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of

Incorporating Participant Reputation in Community-driven Question Answering Systems

Incorporating Participant Reputation in Community-driven Question Answering Systems Liangjie Hong, Zaihan Yang and Brian D. Davison Department of Computer Science and Engineering Lehigh University, Bethlehem,

Joint Relevance and Answer Quality Learning for Question Routing in Community QA

Joint Relevance and Answer Quality Learning for Question Routing in Community QA Guangyou Zhou, Kang Liu, and Jun Zhao National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014

LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph

A Tri-Role Topic Model for Domain-Specific Question Answering

A Tri-Role Topic Model for Domain-Specific Question Answering Zongyang Ma Aixin Sun Quan Yuan Gao Cong School of Computer Engineering, Nanyang Technological University, Singapore 639798 {zma4, qyuan1}@e.ntu.edu.sg

CSE 494 CSE/CBS 598 (Fall 2007): Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye

CSE 494 CSE/CBS 598 Fall 2007: Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye 1 Introduction One important method for data compression and classification is to organize

Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement

Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement Jiang Bian College of Computing Georgia Institute of Technology jbian3@mail.gatech.edu Eugene Agichtein

Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

Community Mining from Multi-relational Networks

Community Mining from Multi-relational Networks Deng Cai 1, Zheng Shao 1, Xiaofei He 2, Xifeng Yan 1, and Jiawei Han 1 1 Computer Science Department, University of Illinois at Urbana Champaign (dengcai2,

Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

Predicting Information Popularity Degree in Microblogging Diffusion Networks

Vol.9, No.3 (2014), pp.21-30 http://dx.doi.org/10.14257/ijmue.2014.9.3.03 Predicting Information Popularity Degree in Microblogging Diffusion Networks Wang Jiang, Wang Li * and Wu Weili College of Computer

Predict Influencers in the Social Network

Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

Question Routing by Modeling User Expertise and Activity in cqa services

Question Routing by Modeling User Expertise and Activity in cqa services Liang-Cheng Lai and Hung-Yu Kao Department of Computer Science and Information Engineering National Cheng Kung University, Tainan,

Personalizing Image Search from the Photo Sharing Websites

Personalizing Image Search from the Photo Sharing Websites Swetha.P.C, Department of CSE, Atria IT, Bangalore swethapc.reddy@gmail.com Aishwarya.P Professor, Dept.of CSE, Atria IT, Bangalore aishwarya_p27@yahoo.co.in

Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650

Crowdclustering with Sparse Pairwise Labels: A Matrix Completion Approach

Outline Crowdclustering with Sparse Pairwise Labels: A Matrix Completion Approach Jinfeng Yi, Rong Jin, Anil K. Jain, Shaili Jain 2012 Presented By : KHALID ALKOBAYER Crowdsourcing and Crowdclustering

Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network

, pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and

Routing Questions for Collaborative Answering in Community Question Answering Shuo Chang Dept. of Computer Science University of Minnesota Email: schang@cs.umn.edu Aditya Pal IBM Research Email: apal@us.ibm.com

Chapter 6. The stacking ensemble approach

82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

Bilinear Prediction Using Low-Rank Models

Bilinear Prediction Using Low-Rank Models Inderjit S. Dhillon Dept of Computer Science UT Austin 26th International Conference on Algorithmic Learning Theory Banff, Canada Oct 6, 2015 Joint work with C-J.

Domain Classification of Technical Terms Using the Web

Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

GRAPH-BASED ranking models have been deeply

EMR: A Scalable Graph-based Ranking Model for Content-based Image Retrieval Bin Xu, Student Member, IEEE, Jiajun Bu, Member, IEEE, Chun Chen, Member, IEEE, Can Wang, Member, IEEE, Deng Cai, Member, IEEE,

Comparing IPL2 and Yahoo! Answers: A Case Study of Digital Reference and Community Based Question Answering

Comparing and : A Case Study of Digital Reference and Community Based Answering Dan Wu 1 and Daqing He 1 School of Information Management, Wuhan University School of Information Sciences, University of

Community-Aware Prediction of Virality Timing Using Big Data of Social Cascades

1 Community-Aware Prediction of Virality Timing Using Big Data of Social Cascades Alvin Junus, Ming Cheung, James She and Zhanming Jie HKUST-NIE Social Media Lab, Hong Kong University of Science and Technology

Regression Using Support Vector Machines: Basic Foundations

Regression Using Support Vector Machines: Basic Foundations Technical Report December 2004 Aly Farag and Refaat M Mohamed Computer Vision and Image Processing Laboratory Electrical and Computer Engineering

Predict the Popularity of YouTube Videos Using Early View Data

000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

Lecture Topic: Low-Rank Approximations

Lecture Topic: Low-Rank Approximations Low-Rank Approximations We have seen principal component analysis. The extraction of the first principle eigenvalue could be seen as an approximation of the original

Subspace Analysis and Optimization for AAM Based Face Alignment

Subspace Analysis and Optimization for AAM Based Face Alignment Ming Zhao Chun Chen College of Computer Science Zhejiang University Hangzhou, 310027, P.R.China zhaoming1999@zju.edu.cn Stan Z. Li Microsoft

JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 28, 83-97 (2012) Aggregate Two-Way Co-Clustering of Ads and User Data for Online Advertisements * Department of Computer Science and Information Engineering

Big Data - Lecture 1 Optimization reminders

Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Schedule Introduction Major issues Examples Mathematics

Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu

Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of

HITS vs. Non-negative Matrix Factorization

Department of Computer Science and Engineering University of Texas at Arlington Arlington, TX 76019 HITS vs. Non-negative Matrix Factorization Yuanzhe Cai, Sharma Chakravarthy Technical Report CSE 2014

Joint Feature Learning and Clustering Techniques for Clustering High Dimensional Data: A Review

International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-03 E-ISSN: 2347-2693 Joint Feature Learning and Clustering Techniques for Clustering High Dimensional

Volume 4, Issue 6, June 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Improving Web Image

REVIEW ON QUERY CLUSTERING ALGORITHMS FOR SEARCH ENGINE OPTIMIZATION

Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING

Statistical machine learning, high dimension and big data

Statistical machine learning, high dimension and big data S. Gaïffas 1 14 mars 2014 1 CMAP - Ecole Polytechnique Agenda for today Divide and Conquer principle for collaborative filtering Graphical modelling,

COMMUNITY QUESTION ANSWERING (CQA) services, Improving Question Retrieval in Community Question Answering with Label Ranking

Improving Question Retrieval in Community Question Answering with Label Ranking Wei Wang, Baichuan Li Department of Computer Science and Engineering The Chinese University of Hong Kong Shatin, N.T., Hong

Large Scale Learning to Rank

Large Scale Learning to Rank D. Sculley Google, Inc. dsculley@google.com Abstract Pairwise learning to rank methods such as RankSVM give good performance, but suffer from the computational burden of optimizing

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin

Network Big Data: Facing and Tackling the Complexities Xiaolong Jin CAS Key Laboratory of Network Data Science & Technology Institute of Computing Technology Chinese Academy of Sciences (CAS) 2015-08-10

Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL

Journal homepage: www.mjret.in ISSN:2348-6953 PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL Utkarsha Vibhute, Prof. Soumitra

A survey on click modeling in web search

A survey on click modeling in web search Lianghao Li Hong Kong University of Science and Technology Outline 1 An overview of web search marketing 2 An overview of click modeling 3 A survey on click models

Automatic Web Page Classification

Automatic Web Page Classification Yasser Ganjisaffar 84802416 yganjisa@uci.edu 1 Introduction To facilitate user browsing of Web, some websites such as Yahoo! (http://dir.yahoo.com) and Open Directory

The Enron Corpus: A New Dataset for Email Classification Research

The Enron Corpus: A New Dataset for Email Classification Research Bryan Klimt and Yiming Yang Language Technologies Institute Carnegie Mellon University Pittsburgh, PA 15213-8213, USA {bklimt,yiming}@cs.cmu.edu

Nonnegative Matrix Factorization: Algorithms, Complexity and Applications

Nonnegative Matrix Factorization: Algorithms, Complexity and Applications Ankur Moitra Massachusetts Institute of Technology July 6th, 2015 Ankur Moitra (MIT) NMF July 6th, 2015 m M n W m M = A inner dimension

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II

A Novel Method to Detect Fraud Ranking For Mobile Apps

A Novel Method to Detect Fraud Ranking For Mobile Apps Geedikapally Rajesh Kumar M.Tech Student Department of CSE St.Peter's Engineering College. Abstract: Ranking fraud in the mobile App market refers

The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R

Journal of Machine Learning Research 15 (2014) 489-493 Submitted 3/13; Revised 8/13; Published 2/14 The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R Haotian

CS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #18: Dimensionality Reduc7on

CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #18: Dimensionality Reduc7on Dimensionality Reduc=on Assump=on: Data lies on or near a low d- dimensional subspace Axes of this subspace

Boosting the Feature Space: Text Classification for Unstructured Data on the Web

Boosting the Feature Space: Text Classification for Unstructured Data on the Web Yang Song 1, Ding Zhou 1, Jian Huang 2, Isaac G. Councill 2, Hongyuan Zha 1,2, C. Lee Giles 1,2 1 Department of Computer

Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines

, 22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing

Blind Deconvolution of Corrupted Barcode Signals

Blind Deconvolution of Corrupted Barcode Signals Everardo Uribe and Yifan Zhang Advisors: Ernie Esser and Yifei Lou Interdisciplinary Computational and Applied Mathematics Program University of California,

Factorization Machines

Factorization Machines Steffen Rendle Department of Reasoning for Intelligence The Institute of Scientific and Industrial Research Osaka University, Japan rendle@ar.sanken.osaka-u.ac.jp Abstract In this

Algorithmic Crowdsourcing. Denny Zhou Microsoft Research Redmond Dec 9, NIPS13, Lake Tahoe

Algorithmic Crowdsourcing Denny Zhou icrosoft Research Redmond Dec 9, NIPS13, Lake Tahoe ain collaborators John Platt (icrosoft) Xi Chen (UC Berkeley) Chao Gao (Yale) Nihar Shah (UC Berkeley) Qiang Liu

Application of Machine Learning to Link Prediction

Application of Machine Learning to Link Prediction Kyle Julian (kjulian3), Wayne Lu (waynelu) December 6, 6 Introduction Real-world networks evolve over time as new nodes and links are added. Link prediction

Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks

The 4 th China-Australia Database Workshop Melbourne, Australia Oct. 19, 2015 Learning to Rank Revisited: Our Progresses in New Algorithms and Tasks Jun Xu Institute of Computing Technology, Chinese Academy

An Analysis of Verifications in Microblogging Social Networks - Sina Weibo

An Analysis of Verifications in Microblogging Social Networks - Sina Weibo Junting Chen and James She HKUST-NIE Social Media Lab Dept. of Electronic and Computer Engineering The Hong Kong University of

Ranking on Data Manifolds

Ranking on Data Manifolds Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, and Bernhard Schölkopf Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany {firstname.secondname

Big Data Analytics: Optimization and Randomization

Big Data Analytics: Optimization and Randomization Tianbao Yang, Qihang Lin, Rong Jin Tutorial@SIGKDD 2015 Sydney, Australia Department of Computer Science, The University of Iowa, IA, USA Department of

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

Standardization and Its Effects on K-Means Clustering Algorithm

Research Journal of Applied Sciences, Engineering and Technology 6(7): 399-3303, 03 ISSN: 040-7459; e-issn: 040-7467 Maxwell Scientific Organization, 03 Submitted: January 3, 03 Accepted: February 5, 03

RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS

ISBN: 978-972-8924-93-5 2009 IADIS RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS Ben Choi & Sumit Tyagi Computer Science, Louisiana Tech University, USA ABSTRACT In this paper we propose new methods for

An Exploratory Study on Software Microblogger Behaviors

2014 4th IEEE Workshop on Mining Unstructured Data An Exploratory Study on Software Microblogger Behaviors Abstract Microblogging services are growing rapidly in the recent years. Twitter, one of the most

Evolution of Experts in Question Answering Communities

Evolution of Experts in Question Answering Communities Aditya Pal, Shuo Chang and Joseph A. Konstan Department of Computer Science University of Minnesota Minneapolis, MN 55455, USA {apal,schang,konstan}@cs.umn.edu

Large Scale Spectral Clustering with Landmark-Based Representation

Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence Large Scale Spectral Clustering with Landmark-Based Representation Xinlei Chen Deng Cai State Key Lab of CAD&CG, College of Computer

Impact of Boolean factorization as preprocessing methods for classification of Boolean data

Impact of Boolean factorization as preprocessing methods for classification of Boolean data Radim Belohlavek, Jan Outrata, Martin Trnecka Data Analysis and Modeling Lab (DAMOL) Dept. Computer Science,

Social Prediction in Mobile Networks: Can we infer users emotions and social ties?

Social Prediction in Mobile Networks: Can we infer users emotions and social ties? Jie Tang Tsinghua University, China 1 Collaborate with John Hopcroft, Jon Kleinberg (Cornell) Jinghai Rao (Nokia), Jimeng

BALANCE LEARNING TO RANK IN BIG DATA. Guanqun Cao, Iftikhar Ahmad, Honglei Zhang, Weiyi Xie, Moncef Gabbouj. Tampere University of Technology, Finland

BALANCE LEARNING TO RANK IN BIG DATA Guanqun Cao, Iftikhar Ahmad, Honglei Zhang, Weiyi Xie, Moncef Gabbouj Tampere University of Technology, Finland {name.surname}@tut.fi ABSTRACT We propose a distributed

Discovering Social Media Experts by Integrating Social Networks and Contents

Proceedings of the Twenty-Third Australasian Database Conference (ADC 2012), Melbourne, Australia Discovering Social Media Experts by Integrating Social Networks and Contents Zhao Zhang Bin Zhao Weining

Kaiquan Xu, Associate Professor, Nanjing University. Kaiquan Xu

Kaiquan Xu Marketing & ebusiness Department, Business School, Nanjing University Email: xukaiquan@nju.edu.cn Tel: +86-25-83592129 Employment Associate Professor, Marketing & ebusiness Department, Nanjing

An Overview Of Software For Convex Optimization. Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.

An Overview Of Software For Convex Optimization Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.edu In fact, the great watershed in optimization isn t between linearity

Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

A Comparative Study on Sentiment Classification and Ranking on Product Reviews

A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan

Two Heads Better Than One: Metric+Active Learning and Its Applications for IT Service Classification

29 Ninth IEEE International Conference on Data Mining Two Heads Better Than One: Metric+Active Learning and Its Applications for IT Service Classification Fei Wang 1,JimengSun 2,TaoLi 1, Nikos Anerousis

Parallel Data Mining. Team 2 Flash Coders Team Research Investigation Presentation 2. Foundations of Parallel Computing Oct 2014

Parallel Data Mining Team 2 Flash Coders Team Research Investigation Presentation 2 Foundations of Parallel Computing Oct 2014 Agenda Overview of topic Analysis of research papers Software design Overview

Ranking Methods in Machine Learning

Ranking Methods in Machine Learning A Tutorial Introduction Shivani Agarwal Computer Science & Artificial Intelligence Laboratory Massachusetts Institute of Technology Example 1: Recommendation Systems

Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data

Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data Jun Wang Department of Mechanical and Automation Engineering The Chinese University of Hong Kong Shatin, New Territories,

Design call center management system of e-commerce based on BP neural network and multifractal

Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2014, 6(6):951-956 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 Design call center management system of e-commerce

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

Introduction to Support Vector Machines. Colin Campbell, Bristol University

Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.

Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach

Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach Alex Hai Wang College of Information Sciences and Technology, The Pennsylvania State University, Dunmore, PA 18512, USA

Science Navigation Map: An Interactive Data Mining Tool for Literature Analysis

Science Navigation Map: An Interactive Data Mining Tool for Literature Analysis Yu Liu School of Software yuliu@dlut.edu.cn Zhen Huang School of Software kobe_hz@163.com Yufeng Chen School of Computer

Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!

Estimating Twitter User Location Using Social Interactions A Content Based Approach

2011 IEEE International Conference on Privacy, Security, Risk, and Trust, and IEEE International Conference on Social Computing Estimating Twitter User Location Using Social Interactions A Content Based

4. GPCRs PREDICTION USING GREY INCIDENCE DEGREE MEASURE AND PRINCIPAL COMPONENT ANALYIS

4. GPCRs PREDICTION USING GREY INCIDENCE DEGREE MEASURE AND PRINCIPAL COMPONENT ANALYIS The GPCRs sequences are made up of amino acid polypeptide chains. We can also call them sub units. The number and

Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics

Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Part I: Factorizations and Statistical Modeling/Inference Amnon Shashua School of Computer Science & Eng. The Hebrew University

Proximal mapping via network optimization

L. Vandenberghe EE236C (Spring 23-4) Proximal mapping via network optimization minimum cut and maximum flow problems parametric minimum cut problem application to proximal mapping Introduction this lecture:

Big Text Data Clustering using Class Labels and Semantic Feature Based on Hadoop of Cloud Computing

Vol.8, No.4 (214), pp.1-1 http://dx.doi.org/1.14257/ijseia.214.8.4.1 Big Text Data Clustering using Class Labels and Semantic Feature Based on Hadoop of Cloud Computing Yong-Il Kim 1, Yoo-Kang Ji 2 and

MapReduce Approach to Collective Classification for Networks

MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty

Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network

Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network Qian Wu, Yahui Wang, Long Zhang and Li Shen Abstract Building electrical system fault diagnosis is the

LOW-DIMENSIONAL MODELS FOR MISSING DATA IMPUTATION IN ROAD NETWORKS

LOW-DIMENSIONAL MODELS FOR MISSING DATA IMPUTATION IN ROAD NETWORKS Muhammad Tayyab Asif 1, Nikola Mitrovic 1, Lalit Garg 1,2, Justin Dauwels 1, Patrick Jaillet 3,4 1 School of Electrical and Electronic

Clarify Some Issues on the Sparse Bayesian Learning for Sparse Signal Recovery

Clarify Some Issues on the Sparse Bayesian Learning for Sparse Signal Recovery Zhilin Zhang and Bhaskar D. Rao Technical Report University of California at San Diego September, Abstract Sparse Bayesian