Clinical Decision Support with the SPUD Language Model
|
|
|
- Clare Pope
- 10 years ago
- Views:
Transcription
1 Clinical Decision Support with the SPUD Language Model Ronan Cummins The Computer Laboratory, University of Cambridge, UK Abstract. In this paper we present the systems and techniques used by the University of Cambridge for the CDS (Clinical Decision Support) track of the 24 th Text Retrieval Conference (TREC). The system was among the best automatic approaches for both CDS tasks and is based on a new language modelling approach implemented using Lucene. 1 1 Introduction We outline the main models and techniques used to participate in the CDS track of TREC The CDS track consisted of retrieving relevant biomedical articles for answering generic clinical questions about medical records. As the documents are full scientific articles they tend to be much longer than the average general Web document. Furthermore, the types of the queries issued for this task are also much longer than those used for typical web search. Therefore, we adopted the use of our new document language model [1] that has been shown to retrieve longer documents more fairly. The recently developed SPUD language model [1] treats document generation using the Pólya process and aims to model wordburstiness directly. It has been shown to incorporate a number of theoretically interesting properties. For example, it models the scope and verbosity hypothesis [3] separately, and reintroduces a measure closely related to inverse document frequency [4]. Therefore, we hypothesised that the SPUD language model would be well-suited to the retrieval of scientific texts. We also studied the performance of a newly developed query modelling technique which re-weights salient terms in longer queries. Furthermore, we investigated query expansion using two different models. 2 Models We now outline the main models used in our approaches. 1
2 2 Ronan Cummins 2.1 Document Models We model each document as a mixture of multivariate Pólya distributions. The model captures word burstiness by modelling the dependencies between recurrences of the same word-type. Each document is modelled as follows: α d = (1 ω) α τ + ω α c (1) where α d, α τ, and α c are the document model, topic model, 2 and background model respectively. The hyper-parameter ω controls the smoothing and is stable at approximately ω = 0.8. Each of these models are multivariate Pólya distributions with parameters estimated as follows: ˆα τ = {m d c(t, d) d : t d} ˆα c = {m c df t t df t : t C} (2) where m d is the number of word-types (distinct terms) in d, c(t, d) is the count of term t in document d, d is the number of word tokens in d, df t is the document frequency of term t in the collection C, and m c is a background mass parameter that can be estimated via numerical methods (see [1] for details). The scale parameters m d and m c can be interpreted as beliefs in the parameters c(t, d)/ d and df t / t df t respectively. For the document collection in the CDS track the parameter m c is estimated to be 775. The KL-divergence approach to ranking documents can be used with these document models whereby one takes a point-estimate of the distribution (i.e. E[α d ] is a multinomial) and one can rank documents according to the negative KL-divergence between the query distribution and the expected document multinomial. 2.2 Query Models When a user formulates a short keyword query (e.g. hypotension, hypoxia), it is usually assumed that they have already distilled the topical aspect of the information need. Consequently, one may assume that the probability that a particular query token is topical is 1.0 and this can be normalised accordingly to estimate the maximum likelihood query model (i.e. { 1 2, 1 2 }). This is the standard method of estimating query models for use with KL-Divergence. However, when dealing with natural language queries (e.g. A 44-year-old man with coffee-ground emesis, tachycardia, hypoxia, hypotension and cool, clammy extremities.) it is likely that some terms are generated by a background query language model. Therefore, we assume that long natural language queries are generated by drawing terms from a query model α q which consists of both a topical language model α qτ and a background query language model α qc. The topical query model describes the topical information that the user requires, 2 For the purposes of this paper, we refer to the unsmoothed model as the topic model of the document as it explains words not explained by the general background model.
3 Clinical Decision Support with the SPUD Language Model 3 while the background query model describes the syntactic glue of the general query language. Examples of fragments that can be explained by the background query language model are tokens such as I, am, looking, for,, and A, relevant, document, may, include, (a stereotypical TREC construct). Therefore, our new query model is defined as follows: α q = (1 λ q ) α qτ + (λ q ) α qc (3) where λ q is the probability mass of the background query language model. Although the background query language model is likely to contain some structural clues regarding relevance, we simple regard this model as generating noise tokens, and therefore aim to extract the topical part of each query. This can be achieved by determining using Bayes theorem the probability that a particular query term t was generated by the topical query model as follows: p(α qτ t) = (1 λ q ) p(t α qτ ) (1 λ q ) p(t α qτ ) + (λ q ) p(t α qc ) The final step involves determining the distribution of terms in the topical query model α qτ by normalising over the tokens in q as follows: c(t, q) p(α qτ t) p(t α qτ ) = t q (c(t, q) p(α qτ t )) which we call the discriminative query model (DQM). For the specific instantiation of the model using multivariate Pòlya distributions (α qτ ), the probability that a particular term t came from the topical part of the query model, when assuming that the background query model is the general collection, is as follows: p(α qτ t) = c(t, q) c(t, q) + (1 ωq) ω q df t m c q t df t (4) (5) m d (6) which can be plugged into Eq. 5 to yield the DQM using the multivariate Pòlya. The one free parameter in this specific query model is ω q which determines the belief in the background query model. We set this value to be the same as that from the document model such that ω = ω q. This model essentially introduces a type of idf into longer queries, as common terms occurring in the query will be more likely to come from the background model. As a result, these common terms are down-weighted in the query to varying degrees. 2.3 Psuedo-Relevance Feedback Models We adopt two types of pseudo-relevance feedback models. Firstly, we use the standard RM3 model [2] with our SPUD framework, where we assume the top F documents are relevant. Secondly, we use an approach that extracts the topical terms from the feedback documents in a similar manner to that outlined in the previous section. Given a set of feedback documents F, we then rank terms as follows:
4 4 Ronan Cummins p(θ Q t) = d F p(α τ t) p(q α d ) d F p(q α d ) (7) where p(q α d ) is the document score. Given the SPUD language model (Eq. 1) and its parameters estimates (Eq. 2), the probability that the term t was generated from the topical model α τ of a feedback-document can be calculated via Bayes theorem as follows: (1 ω) α τt p(α τ t) = (8) (1 ω) α τt + ω α ct where α τt and α ct are the parameters of t for the document topic model and background model respectively. A relatively simple intuition for this formula is that topical terms are those that are more likely generated from the topical part of a document than those that are generated by the background model. For all approaches we interpolate 30 feedback terms with the original query using linear-interpolation of System and Topics Our models were all implemented in the Lucene retrieval framework. We stemmed all text using Porter s stemmer and removed a small number of stopwords (the 26 stopwords from the Lucene EnglishAnalyzer). The topics in the CDS track are of three different types (diagnosis, test, and treatment) depending on what the specific task the clinician is involved in. An example topic is as follows: <topic number="1" type="diagnosis"> <description>a 44 yo male is brought to the emergency room after multiple bouts of vomiting that has a coffee ground appearance. His heart rate is 135 bpm and blood pressure is 70/40 mmhg. Physical exam findings include decreased mental status and cool extremities. He receives a rapid infusion of crystalloid solution followed by packed red blood cell transfusion and is admitted to the ICU for further care. </description> <summary>a 44-year-old man with coffee-ground emesis, tachycardia, hypoxia, hypotension and cool, clammy extremities. </summary> </topic> 4 Results This section presents the results of the six runs submitted to the CDS track (three for task A and three for task B). Table 1 shows details of the six runs submitted.
5 Clinical Decision Support with the SPUD Language Model 5 Table 1. Details of settings used for each run System Task Doc Model Query Model Feedback Model Topic Fields CAMspud1 A SPUD ω=0.9 DQM ω=0.9 RM3 λ=0.5, F = 5 type + summary CAMspud3 A SPUD ω=0.9 DQM ω=0.9 QTM λ=0.5, F = 5 summary CAMspud5 A SPUD ω=0.9 DQM ω=0.9 RM3 λ=0.5, F = 5 description CAMspud6 B SPUD ω=0.85 DQM ω=0.85 No Feedback diagnosis + summary CAMspud7 B SPUD ω=0.85 DQM ω=0.85 RM3 λ=0.5, F = 10 diagnosis + summary CAMspud8 B SPUD ω=0.85 DQM ω=0.85 QTM λ=0.5, F = 10 diagnosis + summary 4.1 Task A Table 2 shows the MAP, InfAP, and InfNDCG of the three runs submitted for task A. The rows labelled median and best show the median and best performance per topic averaged over all 30 topics for all of the runs in the track. It is worth noting that best does not represent one system, rather it indicates an upper-bound or oracle approach. All of our approaches performed above the median which is encouraging, with CAMspud1 being our best run for task A. Our best run is very close in performance to the best single run for task A (labelled top system). This approach used the type and summary fields with RM3 relevance feedback of 30 terms. Overall, our system was the third best for task A (out of 36 automatic systems). Table 2. MAP, InfAP, and InfNDCG over 30 Topics System MAP InfAP InfNDCG CAMspud CAMspud CAMspud top system median best Task B Table 3 shows the MAP, InfAP, and InfNDCG of the three runs submitted for task B (where a diagnosis field is included in the topics of type treatment and test). For task B we used both the diagnosis and summary field in all of our runs. Again all of our runs outperform the median run. CAMspud6 does not use pseudo-relevance query expansion and is the worst of the three runs. The only
6 6 Ronan Cummins difference between CAMspud7 and CAMspud8 is that the former uses RM3 pseudo-relevance expansion, while the latter uses the new query topic modelling approach. As these results are higher than in task A, it suggests that including the diagnosis field is useful. Overall, our system was the fifth best for task B (out of 26 automatic systems). Table 3. MAP, InfAP, and InfNDCG over 30 Topics System MAP InfAP InfNDCG CAMspud CAMspud CAMspud top system median best Summary We experimented with using the new SPUD language modelling approach in the CDS track. In general the approach is quite effective and is among the top five systems on both of the CDS tasks. In summary, we found query expansion to be beneficial, the summary field to be more effective than the description field, and we found that using the diagnosis field (when available) also leads to an improvement in the task. References 1. Ronan Cummins, Jiaul H. Paik, and Yuanhua Lv. A Pólya urn document language model for improved information retrieval. ACM Transactions of Informations Systems, 33(4):21, Victor Lavrenko and W. Bruce Croft. Relevance based language models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 01, pages , New York, NY, USA, ACM. 3. S. E. Robertson and S. Walker. Some simple effective approximations to the 2- poisson model for probabilistic weighted retrieval. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 94, pages , New York, NY, USA, Springer-Verlag New York, Inc. 4. Karen Spärck-Jones. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28:11 21, 1972.
TEMPER : A Temporal Relevance Feedback Method
TEMPER : A Temporal Relevance Feedback Method Mostafa Keikha, Shima Gerani and Fabio Crestani {mostafa.keikha, shima.gerani, fabio.crestani}@usi.ch University of Lugano, Lugano, Switzerland Abstract. The
UMass at TREC 2008 Blog Distillation Task
UMass at TREC 2008 Blog Distillation Task Jangwon Seo and W. Bruce Croft Center for Intelligent Information Retrieval University of Massachusetts, Amherst Abstract This paper presents the work done for
An Information Retrieval using weighted Index Terms in Natural Language document collections
Internet and Information Technology in Modern Organizations: Challenges & Answers 635 An Information Retrieval using weighted Index Terms in Natural Language document collections Ahmed A. A. Radwan, Minia
Terrier: A High Performance and Scalable Information Retrieval Platform
Terrier: A High Performance and Scalable Information Retrieval Platform Iadh Ounis, Gianni Amati, Vassilis Plachouras, Ben He, Craig Macdonald, Christina Lioma Department of Computing Science University
Axiomatic Analysis and Optimization of Information Retrieval Models
Axiomatic Analysis and Optimization of Information Retrieval Models ChengXiang Zhai Dept. of Computer Science University of Illinois at Urbana Champaign USA http://www.cs.illinois.edu/homes/czhai Hui Fang
Towards a Visually Enhanced Medical Search Engine
Towards a Visually Enhanced Medical Search Engine Lavish Lalwani 1,2, Guido Zuccon 1, Mohamed Sharaf 2, Anthony Nguyen 1 1 The Australian e-health Research Centre, Brisbane, Queensland, Australia; 2 The
Incorporating Window-Based Passage-Level Evidence in Document Retrieval
Incorporating -Based Passage-Level Evidence in Document Retrieval Wensi Xi, Richard Xu-Rong, Christopher S.G. Khoo Center for Advanced Information Systems School of Applied Science Nanyang Technological
Simple Language Models for Spam Detection
Simple Language Models for Spam Detection Egidio Terra Faculty of Informatics PUC/RS - Brazil Abstract For this year s Spam track we used classifiers based on language models. These models are used to
The University of Lisbon at CLEF 2006 Ad-Hoc Task
The University of Lisbon at CLEF 2006 Ad-Hoc Task Nuno Cardoso, Mário J. Silva and Bruno Martins Faculty of Sciences, University of Lisbon {ncardoso,mjs,bmartins}@xldb.di.fc.ul.pt Abstract This paper reports
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Medical Information-Retrieval Systems. Dong Peng Medical Informatics Group
Medical Information-Retrieval Systems Dong Peng Medical Informatics Group Outline Evolution of medical Information-Retrieval (IR). The information retrieval process. The trend of medical information retrieval
Approaches to Exploring Category Information for Question Retrieval in Community Question-Answer Archives
Approaches to Exploring Category Information for Question Retrieval in Community Question-Answer Archives 7 XIN CAO and GAO CONG, Nanyang Technological University BIN CUI, Peking University CHRISTIAN S.
Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
Homework 2. Page 154: Exercise 8.10. Page 145: Exercise 8.3 Page 150: Exercise 8.9
Homework 2 Page 110: Exercise 6.10; Exercise 6.12 Page 116: Exercise 6.15; Exercise 6.17 Page 121: Exercise 6.19 Page 122: Exercise 6.20; Exercise 6.23; Exercise 6.24 Page 131: Exercise 7.3; Exercise 7.5;
Search Engines. Stephen Shaw <[email protected]> 18th of February, 2014. Netsoc
Search Engines Stephen Shaw Netsoc 18th of February, 2014 Me M.Sc. Artificial Intelligence, University of Edinburgh Would recommend B.A. (Mod.) Computer Science, Linguistics, French,
Comparison of Standard and Zipf-Based Document Retrieval Heuristics
Comparison of Standard and Zipf-Based Document Retrieval Heuristics Benjamin Hoffmann Universität Stuttgart, Institut für Formale Methoden der Informatik Universitätsstr. 38, D-70569 Stuttgart, Germany
Dynamical Clustering of Personalized Web Search Results
Dynamical Clustering of Personalized Web Search Results Xuehua Shen CS Dept, UIUC [email protected] Hong Cheng CS Dept, UIUC [email protected] Abstract Most current search engines present the user a ranked
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
Supporting Keyword Search in Product Database: A Probabilistic Approach
Supporting Keyword Search in Product Database: A Probabilistic Approach Huizhong Duan 1, ChengXiang Zhai 2, Jinxing Cheng 3, Abhishek Gattani 4 University of Illinois at Urbana-Champaign 1,2 Walmart Labs
Eng. Mohammed Abdualal
Islamic University of Gaza Faculty of Engineering Computer Engineering Department Information Storage and Retrieval (ECOM 5124) IR HW 5+6 Scoring, term weighting and the vector space model Exercise 6.2
Data Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University [email protected] Chetan Naik Stony Brook University [email protected] ABSTRACT The majority
Modern Information Retrieval: A Brief Overview
Modern Information Retrieval: A Brief Overview Amit Singhal Google, Inc. [email protected] Abstract For thousands of years people have realized the importance of archiving and finding information. With
Early Warning Scores (EWS) Clinical Sessions 2011 By Bhavin Doshi
Early Warning Scores (EWS) Clinical Sessions 2011 By Bhavin Doshi What is EWS? After qualifying, junior doctors are expected to distinguish between the moderately sick patients who can be managed in the
Bagged Ensemble Classifiers for Sentiment Classification of Movie Reviews
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3 Issue 2 February, 2014 Page No. 3951-3961 Bagged Ensemble Classifiers for Sentiment Classification of Movie
Mining Text Data: An Introduction
Bölüm 10. Metin ve WEB Madenciliği http://ceng.gazi.edu.tr/~ozdemir Mining Text Data: An Introduction Data Mining / Knowledge Discovery Structured Data Multimedia Free Text Hypertext HomeLoan ( Frank Rizzo
Emoticon Smoothed Language Models for Twitter Sentiment Analysis
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of
Topic models for Sentiment analysis: A Literature Survey
Topic models for Sentiment analysis: A Literature Survey Nikhilkumar Jadhav 123050033 June 26, 2014 In this report, we present the work done so far in the field of sentiment analysis using topic models.
A Generalized Framework of Exploring Category Information for Question Retrieval in Community Question Answer Archives
A Generalized Framework of Exploring Category Information for Question Retrieval in Community Question Answer Archives Xin Cao 1, Gao Cong 1, 2, Bin Cui 3, Christian S. Jensen 1 1 Department of Computer
PREDICTING MARKET VOLATILITY FEDERAL RESERVE BOARD MEETING MINUTES FROM
PREDICTING MARKET VOLATILITY FROM FEDERAL RESERVE BOARD MEETING MINUTES Reza Bosagh Zadeh and Andreas Zollmann Lab Advisers: Noah Smith and Bryan Routledge GOALS Make Money! Not really. Find interesting
Search and Data Mining: Techniques. Text Mining Anya Yarygina Boris Novikov
Search and Data Mining: Techniques Text Mining Anya Yarygina Boris Novikov Introduction Generally used to denote any system that analyzes large quantities of natural language text and detects lexical or
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Latent Semantic Indexing with Selective Query Expansion Abstract Introduction
Latent Semantic Indexing with Selective Query Expansion Andy Garron April Kontostathis Department of Mathematics and Computer Science Ursinus College Collegeville PA 19426 Abstract This article describes
Information Retrieval Elasticsearch
Information Retrieval Elasticsearch IR Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter
VCU-TSA at Semeval-2016 Task 4: Sentiment Analysis in Twitter Gerard Briones and Kasun Amarasinghe and Bridget T. McInnes, PhD. Department of Computer Science Virginia Commonwealth University Richmond,
Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
Improving Contextual Suggestions using Open Web Domain Knowledge
Improving Contextual Suggestions using Open Web Domain Knowledge Thaer Samar, 1 Alejandro Bellogín, 2 and Arjen de Vries 1 1 Centrum Wiskunde & Informatica, Amsterdam, The Netherlands 2 Universidad Autónoma
Optimization of Internet Search based on Noun Phrases and Clustering Techniques
Optimization of Internet Search based on Noun Phrases and Clustering Techniques R. Subhashini Research Scholar, Sathyabama University, Chennai-119, India V. Jawahar Senthil Kumar Assistant Professor, Anna
Query term suggestion in academic search
Query term suggestion in academic search Suzan Verberne 1, Maya Sappelli 1,2, and Wessel Kraaij 2,1 1. Institute for Computing and Information Sciences, Radboud University Nijmegen 2. TNO, Delft Abstract.
Combining Global and Personal Anti-Spam Filtering
Combining Global and Personal Anti-Spam Filtering Richard Segal IBM Research Hawthorne, NY 10532 Abstract Many of the first successful applications of statistical learning to anti-spam filtering were personalized
W. Heath Rushing Adsurgo LLC. Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare. Session H-1 JTCC: October 23, 2015
W. Heath Rushing Adsurgo LLC Harness the Power of Text Analytics: Unstructured Data Analysis for Healthcare Session H-1 JTCC: October 23, 2015 Outline Demonstration: Recent article on cnn.com Introduction
Prediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
1 o Semestre 2007/2008
Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction
Building a Question Classifier for a TREC-Style Question Answering System
Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given
CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation.
CINDOR Conceptual Interlingua Document Retrieval: TREC-8 Evaluation. Miguel Ruiz, Anne Diekema, Páraic Sheridan MNIS-TextWise Labs Dey Centennial Plaza 401 South Salina Street Syracuse, NY 13202 Abstract:
Christfried Webers. Canberra February June 2015
c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic
THUTR: A Translation Retrieval System
THUTR: A Translation Retrieval System Chunyang Liu, Qi Liu, Yang Liu, and Maosong Sun Department of Computer Science and Technology State Key Lab on Intelligent Technology and Systems National Lab for
A Statistical Text Mining Method for Patent Analysis
A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, [email protected] Abstract Most text data from diverse document databases are unsuitable for analytical
Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection
Dublin City University at CLEF 2004: Experiments with the ImageCLEF St Andrew s Collection Gareth J. F. Jones, Declan Groves, Anna Khasin, Adenike Lam-Adesina, Bart Mellebeek. Andy Way School of Computing,
Information Retrieval System Assigning Context to Documents by Relevance Feedback
Information Retrieval System Assigning Context to Documents by Relevance Feedback Narina Thakur Department of CSE Bharati Vidyapeeth College Of Engineering New Delhi, India Deepti Mehrotra ASCS Amity University,
FINANCIAL SERVICES BOARD INSURANCE DEPARTMENT
FINANCIAL SERVICES BOARD INSURANCE DEPARTMENT SPECIAL REPORT ON THE RESULTS OF THE LONG-TERM INSURANCE INDUSTRY FOR THE PERIOD ENDED MARCH 6 June 1 SPECIAL REPORT ON THE RESULTS OF THE LONG-TERM INSURANCE
A Study of Using an Out-Of-Box Commercial MT System for Query Translation in CLIR
A Study of Using an Out-Of-Box Commercial MT System for Query Translation in CLIR Dan Wu School of Information Management Wuhan University, Hubei, China [email protected] Daqing He School of Information
Direct Marketing Profit Model. Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware
Paper CI-04 Direct Marketing Profit Model Bruce Lund, Marketing Associates, Detroit, Michigan and Wilmington, Delaware ABSTRACT A net lift model gives the expected incrementality (the incremental rate)
HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
Procedia Computer Science 00 (2012) 1 21. Trieu Minh Nhut Le, Jinli Cao, and Zhen He. [email protected], [email protected], [email protected].
Procedia Computer Science 00 (2012) 1 21 Procedia Computer Science Top-k best probability queries and semantics ranking properties on probabilistic databases Trieu Minh Nhut Le, Jinli Cao, and Zhen He
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts
MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts Julio Villena-Román 1,3, Sara Lana-Serrano 2,3 1 Universidad Carlos III de Madrid 2 Universidad Politécnica de Madrid 3 DAEDALUS
Efficient Cluster Detection and Network Marketing in a Nautural Environment
A Probabilistic Model for Online Document Clustering with Application to Novelty Detection Jian Zhang School of Computer Science Cargenie Mellon University Pittsburgh, PA 15213 [email protected] Zoubin
Comparing Tag Clouds, Term Histograms, and Term Lists for Enhancing Personalized Web Search
Comparing Tag Clouds, Term Histograms, and Term Lists for Enhancing Personalized Web Search Orland Hoeber and Hanze Liu Department of Computer Science, Memorial University St. John s, NL, Canada A1B 3X5
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
Chapter 2 The Information Retrieval Process
Chapter 2 The Information Retrieval Process Abstract What does an information retrieval system look like from a bird s eye perspective? How can a set of documents be processed by a system to make sense
Blocking Blog Spam with Language Model Disagreement
Blocking Blog Spam with Language Model Disagreement Gilad Mishne Informatics Institute, University of Amsterdam Kruislaan 403, 1098SJ Amsterdam The Netherlands [email protected] David Carmel, Ronny
