Interactive Chinese Question Answering System in Medicine Diagnosis
Xipeng Qiu
School of Computer Science, Fudan University

Jiatuo Xu
Shanghai University of Traditional Chinese Medicine

Abstract

In this paper, we propose a general framework for an interactive question answering system in medical diagnosis, which interacts with the user to obtain a more refined question description and then returns answers. The system first collects FAQ pairs from community question answering (cqa) websites and builds a medical ontology incrementally. It then analyzes the question and asks the user for the missing information. After receiving the user's feedback, it performs question retrieval and extracts answers. Experiments show that our system performs better with user feedback.

1. Introduction

Automatic question answering (QA) is an important research topic in information retrieval and natural language processing [39, 40], and is an alternative to keyword-based information retrieval systems such as Google and Baidu. The input of a QA system is a question, and the output is the corresponding answers extracted from a large corpus or the web [20]. Fig. 1 shows the general framework of question answering in the open domain. However, these systems cannot deal with complicated questions that depend on domain knowledge, such as questions in the medical domain.

To alleviate this problem, we can resort to large-scale online FAQ archives for specific domains. In recent years, community-based question answering (cqa) services such as Baidu Zhidao have become very popular. Instead of looking for answers in forums or with search engines, users can post their questions on cqa websites and wait for other people to answer them. While forums focus on discussion and communication between users, cqa services focus on answering users' questions. Therefore, users can get a faster response on cqa websites.
These cqa websites also provide an interface for retrieving answered questions, but it is almost always based on a keyword search engine, which is still not enough to give the user exact information. The user also needs to think of appropriate keywords to express his needs. Besides, good answers are often mixed with a large number of bad or wrong answers. Therefore, the major issue is finding the exact answer when answers to many complicated questions already exist. Related work includes question suggestion, answer quality, and question-answer pair extraction [14, 18, 23, 26, 32, 9, 27].

In this paper, we propose a general framework for an interactive Chinese question answering system in medical diagnosis, which interacts with the user to obtain a refined question description and returns the extracted answers. The system first collects FAQ pairs from cqa websites and builds the medical ontology incrementally. It then analyzes the question and asks the user for the missing information. After receiving the user's feedback, it performs question retrieval and extracts answers.

In the rest of the paper, we first describe our system in section 2 and evaluate it experimentally in section 3. Finally, we give the conclusions in section 4.

[Figure 1. The flowchart of the open domain question answering system. Recoverable box labels: Query Generation, Classification, Semantic Web, Retrieval, Ranking, Extraction, WWW.]

2. System Framework

In this section, we introduce our interactive Chinese question answering system for medical diagnosis.

Topical Crawler

Topical crawlers play an important role in domain search engines. A topical crawler starts from seed keywords or URLs and gathers the web pages whose content is similar to the seeds [35, 28, 5]. Context is one of the most useful features for guiding the crawler to highly relevant target pages. In our system, we collect medical web pages by analyzing the anchor text attached to hyperlinks. We first collect anchor texts with their corresponding categories from two Chinese cqa websites that provide question categories. We then select the anchor texts whose categories are related to medical keywords, such as 医疗/疾病 (medical care / disease). Finally, we build a two-class classifier that labels anchor texts as medical or non-medical. The classifier is naive Bayes with a multinomial distribution [17].

Medical Ontology Construction

To take advantage of medical domain knowledge, we need to establish an ontology of medical terms, concepts, entities and their relations. Because collecting this knowledge manually is difficult, we collect it automatically. There is existing work on automatically extracting information from a collected corpus [7, 37]. The objective of information extraction (IE) is to extract from text the pieces of information that relate to a prescribed set of concepts. We first manually collect some initial information, which includes the names of drugs, symptoms and diseases and the relations between them. We then build the medical ontology with information extraction methods.

QA Pairs Extraction

There are many methods for extracting the best answer to a question on cqa websites [26, 15]. Answer quality matters when there are many duplicated or wrong questions. These questions have answers of varying quality, so measuring relevance alone is not enough; the quality of the answers must be considered as well. We use the features described in [15] to predict and extract the best answer. These features include: Answerer's Acceptance Ratio, Answer Length, Questioner's Self Evaluation, Answerer's Activity Level, Answerer's Category Specialty, Users' Recommendation, and Number of Answers.

Question Analysis

In a general question analysis system, the first step is question classification [29, 24, 10, 44].
The categories are consistent with the entity extraction in the later steps. However, a Chinese medical QA system faces some difficulties. First, English and Chinese question sentences are very different. Second, most questions are not fact-based and are hard to categorize. In our system, we build a question analysis model with the medical ontology [13]. It first analyzes the focus words in the question and finds the related concepts in the medical ontology. It then assigns the question to a category and decides what information is missing from the question.

Interactive Feedback

A user often enters a question that mentions only the main symptom, which is usually not enough to determine the cause of the disease. For example: 有什么方法能治疗头晕? ("What methods can treat dizziness?") Many things can cause dizziness, and the corresponding treatments vary greatly with the cause and the patient's state of health. To get exact answers, the user is asked to provide some extra information, such as his age and other symptoms. Given the symptoms provided by the user, the system first retrieves the related symptoms from the collected medical knowledge. The system then interacts with the user to confirm all the signs of his disease.
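The feedback step described above can be illustrated with a small sketch. It is not the authors' implementation: the toy ontology, the disease and symptom names, and the function names `candidate_diseases`, `follow_up_symptoms` and `refine` are all hypothetical, and a real system would draw this knowledge from the extracted medical ontology.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology (disease -> symptoms); illustrative entries only.
ONTOLOGY: Dict[str, Set[str]] = {
    "hypertension": {"dizziness", "headache", "palpitations"},
    "anemia": {"dizziness", "fatigue", "pale skin"},
    "cervical spondylosis": {"dizziness", "neck pain", "numb fingers"},
}


def candidate_diseases(symptoms: Set[str]) -> List[str]:
    """Return the diseases whose symptom set contains every reported symptom."""
    return sorted(d for d, known in ONTOLOGY.items() if symptoms <= known)


def follow_up_symptoms(symptoms: Set[str]) -> List[str]:
    """Return related symptoms the system could ask the user about."""
    related: Set[str] = set()
    for disease in candidate_diseases(symptoms):
        related |= ONTOLOGY[disease]
    return sorted(related - symptoms)


def refine(symptoms: Set[str], confirmed: Set[str]) -> List[str]:
    """Narrow the candidates after the user confirms additional symptoms."""
    return candidate_diseases(symptoms | confirmed)


if __name__ == "__main__":
    reported = {"dizziness"}
    print(candidate_diseases(reported))   # all three diseases still match
    print(follow_up_symptoms(reported))   # symptoms to ask the user about
    print(refine(reported, {"fatigue"}))  # feedback narrows the candidates
```

With only "dizziness" reported, every disease in the toy ontology remains a candidate, which mirrors the ambiguity noted in the text; one confirmed extra symptom is enough to narrow the set.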
[Figure 2. The flowchart of the interactive Chinese question answering system. Recoverable box labels: Ranking, Feedback, Lacking Information, Type, Focus, Filtering, Candidates, Extraction, Auxiliary Information, QA Pairs, Medical Ontology, WWW, cqa Websites, Topic Web Crawler.]

There is also some research on interactive question answering [11, 12, 25].

FAQ Retrieval

Given a FAQ corpus, retrieving useful information for the user's question is still a problem. Much work has been done to improve the performance of FAQ retrieval [41, 22, 2, 19, 4, 3, 14, 16, 43]. An important problem is how to calculate the similarity between the user's question and a FAQ pair, which requires some semantic analysis. However, measuring semantic similarity between questions is not trivial. Sometimes two questions with the same meaning use very different wording. For example, Q1: 糖尿病患者长期服用什么药比较有效,副作用比较小? ("Which medicine is effective, with few side effects, for long-term use by diabetes patients?") and Q2: 有什么能有效降低血糖并且对身体无害的药? ("Is there a medicine that effectively lowers blood sugar and is harmless to the body?") have almost identical meanings but are lexically very different. Similarity measures developed for document retrieval work poorly when there is little word overlap. Thus, if the FAQ corpus contains the QA pair for Q2 but the user asks Q1, he cannot get the answer, because a traditional information retrieval method treats Q1 and Q2 as almost unrelated. One solution to this issue is query expansion [31, 38, 42]. In our system, we expand the query with the domain ontology. For the name of a disease, we add keywords about its corresponding symptoms.

Answer Extraction

On cqa websites, repliers often provide background or related information for the questions, which helps the questioner work out the facts himself. But sometimes, especially for factoid and list questions, the user needs the exact answer instead of related pieces of information. For example: 请问糖尿病的症状有哪些? ("What are the symptoms of diabetes?") So we need to extract the answers from the related information [8, 33, 34, 45].
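Returning briefly to the query expansion step of FAQ retrieval, the idea can be sketched as follows. This is only a sketch under stated assumptions: the ontology fragment `DISEASE_KEYWORDS`, the hand-made word segmentation of Q1 and Q2, and the helper names `expand_query` and `coverage` are hypothetical, and the simple term-coverage score stands in for whatever similarity measure the system actually uses.

```python
from typing import Dict, List

# Hypothetical ontology fragment: disease name -> related keywords.
DISEASE_KEYWORDS: Dict[str, List[str]] = {
    "糖尿病": ["血糖", "多饮", "多尿"],  # diabetes: blood sugar, polydipsia, polyuria
}


def expand_query(terms: List[str],
                 ontology: Dict[str, List[str]] = DISEASE_KEYWORDS) -> List[str]:
    """Append ontology keywords for every disease name found in the query."""
    expanded = list(terms)
    for term in terms:
        for keyword in ontology.get(term, []):
            if keyword not in expanded:
                expanded.append(keyword)
    return expanded


def coverage(query_terms: List[str], faq_terms: List[str]) -> float:
    """Fraction of the FAQ question's terms that also occur in the query."""
    faq = set(faq_terms)
    if not faq:
        return 0.0
    return len(set(query_terms) & faq) / len(faq)


if __name__ == "__main__":
    # Hand-segmented content words of Q1 and Q2 (an assumption for illustration).
    q1 = ["糖尿病", "药", "有效", "副作用"]
    q2 = ["降低", "血糖", "有效", "药"]
    print(coverage(q1, q2))                # 0.5
    print(coverage(expand_query(q1), q2))  # 0.75
```

In this toy case the expansion adds 血糖 (blood sugar) to Q1, so the expanded query shares one more term with Q2 and the coverage rises from 0.5 to 0.75.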
We first extract the entities from this information and classify them into entity categories, such as Person, Location, Organization, Durations, Quantities and Dates [1]. We then score the entities and filter them with a threshold. An entity score has two components. The first component is whether or not the entity's category matches the query's category. The second component is based on the frequency and position of the occurrences of the entity within the retrieved passages [1]. In our system, we use conditional random fields [21, 30] to label the entities and their corresponding categories.

Re-ranking

Before returning answers to the user, the system needs to re-rank them to improve performance, for example by removing redundant answers [6]. We can use more features [36] to score each answer candidate.

3. Experiments

We implemented our system and collected about 84,000 QA pairs in the medical domain from two cqa websites: Baidu Zhidao and WenWen. We evaluate our results with mean precision at rank 1 (P@1), which is the percentage of questions whose correct answer is in the first position. We use keyword queries as the baseline system; the keywords are simply the terms in the question. We randomly selected 100 questions and evaluated the quality of the answers manually.

Table 1. Results of different systems with P@1

Systems        P@1
Baseline       79%
No feedback    82%
Feedback       87%

Table 1 shows the results of our system. User feedback raises P@1 from 82% to 87%, and both variants of our system outperform the keyword baseline (79%).

4. Conclusion

In this paper, we propose a framework for an interactive question answering system in the medical domain. It integrates question analysis, query expansion, ontology construction, answer extraction and answer ranking. We also address the difficulties in each part and give preliminary solutions. The proposed framework can also be applied to other domains, such as music and travel.

Acknowledgement

This work was supported by the National High Technology Research and Development Program of China (863 Program) (No. 2007AA02Z429) and the Natural Science Foundation of China (No. and ).

References

[1] S. Abney, M. Collins, and A. Singhal. Answer extraction. Proceedings of the sixth conference on Applied natural language processing.
[2] R. Baeza-Yates, B. Ribeiro-Neto, et al. Modern information retrieval.
Addison-Wesley Harlow, England.
[3] R. Burke, K. Hammond, and J. Kozlovsky. Knowledge-based information retrieval from semi-structured text. AAAI Fall Symposium on AI Applications in Knowledge Navigation and Retrieval, pages 19-24.
[4] R. Burke, K. Hammond, V. Kulyukin, S. Lytinen, N. Tomuro, and S. Schoenberg. Question answering from frequently asked question files: Experiences with the FAQ Finder system. AI Magazine, 18(2):57-66.
[5] S. Chakrabarti, K. Punera, and M. Subramanyam. Accelerated focused crawling through online relevance feedback. Proceedings of the 11th international conference on World Wide Web.
[6] C. Clarke, G. Cormack, and T. Lynam. Exploiting redundancy in question answering. Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval.
[7] J. Cowie and W. Lehnert. Information extraction. Communications of the ACM, 39(1):80-91.
[8] D. Demner-Fushman and J. Lin. Knowledge extraction for clinical question answering: Preliminary results. Proceedings of the AAAI-05 Workshop on Question Answering in Restricted Domains, pages 9-13.
[9] S. Ding, G. Cong, C.-Y. Lin, and X. Zhu. Using conditional random fields to extract contexts and answers of questions from online forums. In Proceedings of ACL-08: HLT, Columbus, Ohio, June. Association for Computational Linguistics.
[10] J. Ely, J. Osheroff, P. Gorman, M. Ebell, M. Chambliss, E. Pifer, and P. Stavri. A taxonomy of generic clinical questions: classification study.
[11] T. Hao, D. Hu, L. Wenyin, and Q. Zeng. Semantic patterns for user-interactive question answering. Concurrency and Computation, 20(7):783.
[12] S. Harabagiu, A. Hickl, J. Lehmann, and D. Moldovan. Experiments with interactive question-answering. Ann Arbor, 100.
[13] U. Hermjakob. Parsing and question classification for question answering. Proceedings of the Workshop on Question Answering at the Conference ACL-2001.
[14] J. Jeon, W. Croft, and J. Lee. Finding similar questions in large question and answer archives. Proceedings of the 14th ACM international conference on Information and knowledge management, pages 84-90, 2005.
[15] J. Jeon, W. Croft, J. Lee, and S. Park. A framework to predict the quality of answers with non-textual features. Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval.
[16] V. Jijkoun and M. de Rijke. Retrieving answers from frequently asked questions pages on the web. Proceedings of the 14th ACM international conference on Information and knowledge management, pages 76-83.
[17] M. Jordan. Learning in Graphical Models. Kluwer Academic Publishers.
[18] P. Jurczyk and E. Agichtein. Discovering authorities in question answer communities by using link analysis. Proceedings of the sixteenth ACM conference on Conference on information and knowledge management.
[19] H. Kim and J. Seo. High-performance FAQ retrieval using an automatic clustering method of query logs. Information Processing and Management, 42(3).
[20] C. Kwok, O. Etzioni, and D. Weld. Scaling question answering to the web. Proceedings of the 10th international conference on World Wide Web.
[21] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
[22] C. Lee. Intention Extraction and Semantic Matching for Internet FAQ Retrieval. PhD thesis, Master Thesis, Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, ROC.
[23] C. Lengeler, D. Savigny, H. Mshinda, C. Mayombana, S. Tayari, C. Hatz, A. Degrémont, and M. Tanner. Community-based questionnaires and health statistics as tools for the cost-efficient identification of communities at risk of urinary schistosomiasis. International Journal of Epidemiology, 20(3).
[24] X. Li and D. Roth. Learning question classifiers. Proceedings of the 19th International Conference on Computational Linguistics.
[25] J. Lin, D. Quan, V. Sinha, K. Bakshi, D. Huynh, B. Katz, and D. Karger. What makes a good answer? The role of context in question answering. Human-Computer Interaction.
[26] X. Liu, W. Croft, and M. Koll. Finding experts in community-based question-answering services. In Proceedings of the 14th ACM international conference on Information and knowledge management. ACM New York, NY, USA.
[27] Y. Liu and E. Agichtein. You've got answers: Towards personalized models for predicting success in community question answering. In Proceedings of ACL-08: HLT, Short Papers, Columbus, Ohio, June. Association for Computational Linguistics.
[28] F. Menczer, G. Pant, P. Srinivasan, and M. Ruiz. Evaluating topic-driven web crawlers. Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval.
[29] D. Metzler and W. Croft. Analysis of statistical question classification for fact-based questions. Information Retrieval, 8(3).
[30] F. Peng, F. Feng, and A. McCallum. Chinese segmentation and new word detection using conditional random fields. Proceedings of the 20th international conference on Computational Linguistics.
[31] Y. Qiu and H. Frei. Concept based query expansion. Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval.
[32] B. Smyth, E. Balfe, J. Freyne, P. Briggs, M. Coyle, and O. Boydell. Exploiting query repetition and regularity in an adaptive community-based web search engine. User Modeling and User-Adapted Interaction, 14(5).
[33] R. Srihari and W. Li. A question answering system supported by information extraction. Proceedings of the sixth conference on Applied natural language processing.
[34] R. Srihari, W. Li, and N. CYMFONY. Information extraction supported question answering. NIST Special Publication SP.
[35] P. Srinivasan, F. Menczer, and G. Pant. A General Evaluation Framework for Topical Crawlers. Information Retrieval, 8(3).
[36] M. Surdeanu, M. Ciaramita, and H. Zaragoza. Learning to rank answers on large online QA collections. In Proceedings of ACL-08: HLT, Columbus, Ohio, June. Association for Computational Linguistics.
[37] J. Turmo, A. Ageno, and N. Català. Adaptive information extraction. ACM Computing Surveys (CSUR), 38(2).
[38] E. Voorhees. Query expansion using lexical-semantic relations. Springer-Verlag New York, Inc. New York, NY, USA.
[39] E. Voorhees. The TREC-8 question answering track report. NIST Special Publication SP, pages 77-82.
[40] E. Voorhees. Overview of the TREC 2003 question answering track. Proceedings of the Twelfth Text REtrieval Conference (TREC 2003), 142.
[41] C. Wu, J. Yeh, and Y. Lai. Semantic segment extraction and matching for internet FAQ retrieval. IEEE Transactions on Knowledge and Data Engineering.
[42] J. Xu and W. Croft. Query expansion using local and global document analysis. Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, pages 4-11.
[43] S. Yang, F. Chuang, and C. Ho. Ontology-supported FAQ processing and ranking techniques. Journal of Intelligent Information Systems, 28(3).
[44] W. Zhang and T. Chen. Classification based on symmetric maximized minimal distance in subspace (SMMS). In Proc. of IEEE Conf. on Comput. Vision and Pattern Recogn. (CVPR).
[45] Z. Zheng. AnswerBus question answering system. Proceedings of the second international conference on Human Language Technology Research, 2002.
Searching Questions by Identifying Question Topic and Question Focus
Searching Questions by Identifying Question Topic and Question Focus Huizhong Duan 1, Yunbo Cao 1,2, Chin-Yew Lin 2 and Yong Yu 1 1 Shanghai Jiao Tong University, Shanghai, China, 200240 {summer, yyu}@apex.sjtu.edu.cn
Subordinating to the Majority: Factoid Question Answering over CQA Sites
Journal of Computational Information Systems 9: 16 (2013) 6409 6416 Available at http://www.jofcis.com Subordinating to the Majority: Factoid Question Answering over CQA Sites Xin LIAN, Xiaojie YUAN, Haiwei
TREC 2003 Question Answering Track at CAS-ICT
TREC 2003 Question Answering Track at CAS-ICT Yi Chang, Hongbo Xu, Shuo Bai Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China [email protected] http://www.ict.ac.cn/
Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2
Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2 Department of Computer Engineering, YMCA University of Science & Technology, Faridabad,
How To Cluster On A Search Engine
Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING
Comparing IPL2 and Yahoo! Answers: A Case Study of Digital Reference and Community Based Question Answering
Comparing and : A Case Study of Digital Reference and Community Based Answering Dan Wu 1 and Daqing He 1 School of Information Management, Wuhan University School of Information Sciences, University of
Term extraction for user profiling: evaluation by the user
Term extraction for user profiling: evaluation by the user Suzan Verberne 1, Maya Sappelli 1,2, Wessel Kraaij 1,2 1 Institute for Computing and Information Sciences, Radboud University Nijmegen 2 TNO,
Incorporating Participant Reputation in Community-driven Question Answering Systems
Incorporating Participant Reputation in Community-driven Question Answering Systems Liangjie Hong, Zaihan Yang and Brian D. Davison Department of Computer Science and Engineering Lehigh University, Bethlehem,
CAS-ICT at TREC 2005 SPAM Track: Using Non-Textual Information to Improve Spam Filtering Performance
CAS-ICT at TREC 2005 SPAM Track: Using Non-Textual Information to Improve Spam Filtering Performance Shen Wang, Bin Wang and Hao Lang, Xueqi Cheng Institute of Computing Technology, Chinese Academy of
On the Feasibility of Answer Suggestion for Advice-seeking Community Questions about Government Services
21st International Congress on Modelling and Simulation, Gold Coast, Australia, 29 Nov to 4 Dec 2015 www.mssanz.org.au/modsim2015 On the Feasibility of Answer Suggestion for Advice-seeking Community Questions
Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines
, 22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing
Data Mining in Web Search Engine Optimization and User Assisted Rank Results
Data Mining in Web Search Engine Optimization and User Assisted Rank Results Minky Jindal Institute of Technology and Management Gurgaon 122017, Haryana, India Nisha kharb Institute of Technology and Management
Semantic Concept Based Retrieval of Software Bug Report with Feedback
Semantic Concept Based Retrieval of Software Bug Report with Feedback Tao Zhang, Byungjeong Lee, Hanjoon Kim, Jaeho Lee, Sooyong Kang, and Ilhoon Shin Abstract Mining software bugs provides a way to develop
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Semantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering
Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media
Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media ABSTRACT Jiang Bian College of Computing Georgia Institute of Technology Atlanta, GA 30332 [email protected] Eugene
Question Routing by Modeling User Expertise and Activity in cqa services
Question Routing by Modeling User Expertise and Activity in cqa services Liang-Cheng Lai and Hung-Yu Kao Department of Computer Science and Information Engineering National Cheng Kung University, Tainan,
Quality-Aware Collaborative Question Answering: Methods and Evaluation
Quality-Aware Collaborative Question Answering: Methods and Evaluation ABSTRACT Maggy Anastasia Suryanto School of Computer Engineering Nanyang Technological University [email protected] Aixin Sun School
SEARCHING QUESTION AND ANSWER ARCHIVES
SEARCHING QUESTION AND ANSWER ARCHIVES A Dissertation Presented by JIWOON JEON Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for
Finding Expert Users in Community Question Answering
Finding Expert Users in Community Question Answering Fatemeh Riahi Faculty of Computer Science Dalhousie University [email protected] Zainab Zolaktaf Faculty of Computer Science Dalhousie University [email protected]
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,
Understanding and Summarizing Answers in Community-Based Question Answering Services
Understanding and Summarizing Answers in Community-Based Answering Services Yuanjie Liu 1, Shasha Li 2, Yunbo Cao 1,3, Chin-Yew Lin 3, Dingyi Han 1, Yong Yu 1 1 Shanghai Jiao Tong University, Shanghai,
Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior
Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior N.Jagatheshwaran 1 R.Menaka 2 1 Final B.Tech (IT), [email protected], Velalar College of Engineering and Technology,
Domain Classification of Technical Terms Using the Web
Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using
Research of Postal Data mining system based on big data
3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication
Joint Relevance and Answer Quality Learning for Question Routing in Community QA
Joint Relevance and Answer Quality Learning for Question Routing in Community QA Guangyou Zhou, Kang Liu, and Jun Zhao National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy
Teaching in School of Electronic, Information and Electrical Engineering
Introduction to Teaching in School of Electronic, Information and Electrical Engineering Shanghai Jiao Tong University Outline Organization of SEIEE Faculty Enrollments Undergraduate Programs Sample Curricula
How To Write A Summary Of A Review
PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,
SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY
SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY G.Evangelin Jenifer #1, Mrs.J.Jaya Sherin *2 # PG Scholar, Department of Electronics and Communication Engineering(Communication and Networking), CSI Institute
Wikipedia and Web document based Query Translation and Expansion for Cross-language IR
Wikipedia and Web document based Query Translation and Expansion for Cross-language IR Ling-Xiang Tang 1, Andrew Trotman 2, Shlomo Geva 1, Yue Xu 1 1Faculty of Science and Technology, Queensland University
Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL
Journal homepage: www.mjret.in ISSN:2348-6953 PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL Utkarsha Vibhute, Prof. Soumitra
Identifying Best Bet Web Search Results by Mining Past User Behavior
Identifying Best Bet Web Search Results by Mining Past User Behavior Eugene Agichtein Microsoft Research Redmond, WA, USA [email protected] Zijian Zheng Microsoft Corporation Redmond, WA, USA [email protected]
A Framework of User-Driven Data Analytics in the Cloud for Course Management
A Framework of User-Driven Data Analytics in the Cloud for Course Management Jie ZHANG 1, William Chandra TJHI 2, Bu Sung LEE 1, Kee Khoon LEE 2, Julita VASSILEVA 3 & Chee Kit LOOI 4 1 School of Computer
The Application Research of Ant Colony Algorithm in Search Engine Jian Lan Liu1, a, Li Zhu2,b
3rd International Conference on Materials Engineering, Manufacturing Technology and Control (ICMEMTC 2016) The Application Research of Ant Colony Algorithm in Search Engine Jian Lan Liu1, a, Li Zhu2,b
Removing Web Spam Links from Search Engine Results
Removing Web Spam Links from Search Engine Results Manuel EGELE [email protected], 1 Overview Search Engine Optimization and definition of web spam Motivation Approach Inferring importance of features
Analysis of Social Media Streams
Fakultätsname 24 Fachrichtung 24 Institutsname 24, Professur 24 Analysis of Social Media Streams Florian Weidner Dresden, 21.01.2014 Outline 1.Introduction 2.Social Media Streams Clustering Summarization
Bridging CAQDAS with text mining: Text analyst s toolbox for Big Data: Science in the Media Project
