Information Retrieval. Lecture 8 - Relevance feedback and query expansion. Introduction. Overview. About Relevance Feedback. Wintersemester 2007
Information Retrieval
Lecture 8 - Relevance feedback and query expansion
Seminar für Sprachwissenschaft, International Studies in Computational Linguistics
Wintersemester 2007

Introduction
- An information need may be expressed using different keywords (synonymy)
  - impact on recall
  - examples: ship vs. boat, aircraft vs. airplane
- Solutions: refining queries manually, or expanding queries (semi-)automatically
- Semi-automatic query expansion:
  - local methods: based on the retrieved documents and the query (e.g. Relevance Feedback)
  - global methods: independent of the query and results (e.g. thesaurus, spelling corrections)

Overview
- When to use Relevance Feedback

About Relevance Feedback
- Feedback given by the user about the relevance of the documents in the initial set of results
About Relevance Feedback (continued)
- Based on the idea that:
  (i) defining good queries is difficult when the collection is (partly) unknown
  (ii) judging particular documents is easy
- Makes it possible to handle situations where the user's information need evolves as the retrieved documents are checked
- Example: image search engine

Relevance feedback example 1
[Figures not reproduced.]
Relevance feedback example 2
Query: New space satellite applications
Initial results:
1. 08/13/91, NASA Hasn't Scrapped Imaging Spectrometer
2. 07/09/91, NASA Scratches Environment Gear From Satellite Plan
3. 04/04/90, Science Panel Backs NASA Satellite Plan, But Urges Launches of Smaller Probes
4. 09/09/91, A NASA Satellite Project Accomplishes Incredible Feat: Staying Within Budget
5. 07/24/90, Scientist Who Exposed Global Warming Proposes Satellites for Climate Research
6. 08/22/90, Report Provides Support for the Critics Of Using Big Satellites to Study Climate
7. 04/13/87, Arianespace Receives Satellite Launch Pact From Telesat Canada
8. 12/02/87, Telecommunications Tale of Two Companies

Relevance feedback example 2 (continued)
Expanded query terms after relevance feedback:
new, space, satellite, application, nasa, eos, launch, aster, instrument, arianespace, bundespost, ss, rocket, scientist, broadcast, earth, oil, measure

Relevance feedback example 2 (continued)
Results for the expanded query:
1. 07/09/91, NASA Scratches Environment Gear From Satellite Plan
2. 08/13/91, NASA Hasn't Scrapped Imaging Spectrometer
3. 08/07/89, When the Pentagon Launches a Secret Satellite, Space Sleuths Do Some Spy Work of Their Own
4. 07/31/89, NASA Uses Warm Superconductors For Fast Circuit
5. 12/02/87, Telecommunications Tale of Two Companies
6. 07/09/91, Soviets May Adapt Parts of SS-20 Missile For Commercial Use
7. 07/12/88, Gaping Gap: Pentagon Lags in Race To Match the Soviets In Rocket Launchers
8. 06/14/90, Rescue of Satellite By Space Agency To Cost \$90 Million

Standard algorithm for relevance feedback (SMART, 70s)
- Integrates a measure of relevance feedback into the Vector Space Model
- Idea: we want to find a query vector $\vec{q}_{opt}$ that maximizes the similarity with relevant documents while minimizing the similarity with non-relevant documents:

$$\vec{q}_{opt} = \arg\max_{\vec{q}} \left[ \mathrm{sim}(\vec{q}, C_r) - \mathrm{sim}(\vec{q}, C_{nr}) \right]$$

- With the cosine similarity, this gives:

$$\vec{q}_{opt} = \frac{1}{|C_r|} \sum_{\vec{d}_j \in C_r} \vec{d}_j - \frac{1}{|C_{nr}|} \sum_{\vec{d}_j \in C_{nr}} \vec{d}_j$$

  where $C_r$ is the set of relevant documents and $C_{nr}$ the set of non-relevant documents
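The step from the argmax to the difference of centroids is not spelled out on the slide. A short sketch of one way to see it, under two assumptions that are not stated in the slides: all vectors are length-normalised, and the similarity between a query and a document set is the average cosine over that set. The symbol $\vec{\mu}(C)$ for the centroid is introduced here for illustration only.

```latex
% Assumptions (not from the slides): |q| = |d_j| = 1, and
% sim(q, C) is the average cosine between q and the documents in C.
\begin{align*}
\mathrm{sim}(\vec{q}, C)
  &= \frac{1}{|C|} \sum_{\vec{d}_j \in C} \vec{q} \cdot \vec{d}_j
   = \vec{q} \cdot \vec{\mu}(C),
  \qquad
  \vec{\mu}(C) = \frac{1}{|C|} \sum_{\vec{d}_j \in C} \vec{d}_j \\
\mathrm{sim}(\vec{q}, C_r) - \mathrm{sim}(\vec{q}, C_{nr})
  &= \vec{q} \cdot \left( \vec{\mu}(C_r) - \vec{\mu}(C_{nr}) \right) \\
\vec{q}_{opt}
  &\propto \vec{\mu}(C_r) - \vec{\mu}(C_{nr})
\end{align*}
```

The last line follows because a dot product with a fixed vector is largest, among unit-length queries, when the query points in the same direction as that vector. The slide's formula gives this direction without rescaling it to unit length.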
(continued)
- Problem with the above formula: the full set of relevant documents is unknown
- Instead, we produce the modified query $\vec{q}_m$:

$$\vec{q}_m = \alpha \vec{q}_0 + \beta \frac{1}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \gamma \frac{1}{|D_{nr}|} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j$$

  where:
  - $\vec{q}_0$ is the original query vector
  - $D_r$ is the set of known relevant documents
  - $D_{nr}$ is the set of known non-relevant documents
  - $\alpha$, $\beta$, $\gamma$ are balancing weights (judge vs. system)

(continued)
Remarks:
- Negative term weights in $\vec{q}_m$ are usually ignored (set to 0)
- Rocchio-based relevance feedback improves both recall and precision
- For reaching high recall, many iterations are needed
- Empirically determined values for the balancing weights: $\alpha = 1$, $\beta = 0.75$, $\gamma = 0.15$
- Positive feedback is usually more valuable than negative feedback: $\beta > \gamma$

Rocchio algorithm: exercise
Consider the following collection (one doc per line):
good movie trailer shown trailer with good actor unseen movie
a dictionary made of the words movie, trailer and good, and an IR system using the standard tf-idf weighting (without normalisation).
Assume a user judges the first 2 documents relevant for the query "movie trailer". What would be the Rocchio-revised query?

Alternative to the Rocchio algorithm: use document classification instead of Vector Space Model-based retrieval

$$P(x_t = 1 \mid R = 1) = \frac{|VR_t|}{|VR|}$$

$$P(x_t = 1 \mid R = 0) = \frac{n_t - |VR_t|}{N - |VR|}$$

where:
- $N$ is the total number of documents
- $n_t$ is the number of documents containing $t$
- $VR$ is the set of known relevant documents
- $VR_t$ is the set of known relevant documents containing $t$
Problem: no memory of the original query
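Returning to the Rocchio update and the exercise above, the computation can be sketched in a few lines of Python. This is one possible reading of the exercise, not an official solution, and several choices are assumptions: the collection is taken to have three documents, whose dictionary terms are {good, movie, trailer}, {trailer, good} and {movie} (the position of "shown" does not matter because it is not in the dictionary); idf is computed as log10(N / df); the query is tf-idf weighted like the documents; no document is treated as known non-relevant; and the weights are the empirical values from the remarks. The names `rocchio`, `idf_weights` and `tf_idf` are illustrative.

```python
import math


def rocchio(q0, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return the Rocchio-modified query; negative weights are clipped to 0."""
    dims = len(q0)

    def centroid(docs):
        # An empty set contributes nothing to the modified query.
        if not docs:
            return [0.0] * dims
        return [sum(d[i] for d in docs) / len(docs) for i in range(dims)]

    c_r = centroid(relevant)
    c_nr = centroid(non_relevant)
    return [
        max(0.0, alpha * q0[i] + beta * c_r[i] - gamma * c_nr[i])
        for i in range(dims)
    ]


def idf_weights(doc_tf):
    """idf per dictionary term, assumed here to be log10(N / df)."""
    n_docs = len(doc_tf)
    weights = []
    for i in range(len(doc_tf[0])):
        df = sum(1 for d in doc_tf if d[i] > 0)
        weights.append(math.log10(n_docs / df) if df else 0.0)
    return weights


def tf_idf(tf, idf):
    """Unnormalised tf-idf vector."""
    return [t * w for t, w in zip(tf, idf)]


# Term frequencies over the dictionary (assumed reading of the exercise).
VOCAB = ["movie", "trailer", "good"]
DOC_TF = [
    [1, 1, 1],  # document 1
    [0, 1, 1],  # document 2
    [1, 0, 0],  # document 3
]
QUERY_TF = [1, 1, 0]  # "movie trailer"

IDF = idf_weights(DOC_TF)
docs = [tf_idf(d, IDF) for d in DOC_TF]
q0 = tf_idf(QUERY_TF, IDF)

# The first two documents are judged relevant; nothing is judged non-relevant.
q_m = rocchio(q0, relevant=docs[:2], non_relevant=[])
print(dict(zip(VOCAB, (round(w, 3) for w in q_m))))
```

Under these assumptions every dictionary term has the same idf, so the revised query is that idf multiplied by (1.375, 1.75, 0.75) for (movie, trailer, good): the term "good", absent from the original query, receives a positive weight.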
When to use Relevance Feedback
Relevance Feedback does not work when:
- the query is misspelled
- we want cross-language retrieval
- the vocabulary is ambiguous
- the users do not have sufficient initial knowledge
- the query concerns an instance of a general concept (e.g. felines)
- the documents are gathered into subsets, each using a different vocabulary
- the query has disjunctive answer sets (e.g. "the pop star that worked at KFC")
- there exist several prototypes of relevant documents
Practical problem: refining leads to longer queries that need more time to process

Few web IR systems use relevance feedback:
- hard to explain to users
- users are mainly interested in fast retrieval (i.e. no iterations)
- users usually are not interested in high recall
Nowadays: clickstream-based feedback (which links are clicked on by users)
- implicit feedback from the writer rather than feedback from the reader

Note that the improvements brought by relevance feedback decrease with the number of iterations; usually one round gives good results.
Several evaluation strategies:
(a) comparative evaluation
- query $q_0$: prec/recall graph
- query $q_m$: prec/recall graph
- usually +50% of mean average precision (partly comes from the fact that known relevant documents are higher ranked)

(continued)
Evaluation strategies:
(b) residual collection
- same technique as above, but looking at the set of retrieved documents minus the set of assessed relevant documents
- the performance measure drops
(c) using two similar collections
- collection #1 is used for querying and giving relevance feedback
- collection #2 is used for comparative evaluation
- $q_0$ and $q_m$ are compared on collection #2
(continued)
Evaluation strategies:
(d) user studies
- e.g. time-based comparison of retrieval, user satisfaction, etc.
- user utility is a fair evaluation as it corresponds to real system usage

Pseudo Relevance Feedback
- Also known as blind relevance feedback
- No need for an extended interaction between the user and the system
- Method:
  - normal retrieval to find an initial set of most relevant documents
  - assumption that the top k documents are relevant
  - relevance feedback defined accordingly
- Works with the TREC Ad Hoc task, lnc.ltc (precision at k = 50): no-RF 62.5 %, RF 72.7 %
- Problem: distribution of the documents may influence the results

Indirect Relevance Feedback
- Uses evidence rather than explicit feedback
- Example: number of clicks on a given retrieved document
- Not user-specific
- More suitable for web IR, since it does not need an extra action from the user
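The pseudo relevance feedback method described above (retrieve, assume the top k documents are relevant, apply relevance feedback) can be sketched as follows. This is a minimal illustration, not the system behind the TREC figures: the toy vocabulary and documents, the cosine ranking, the value k = 2 and the Rocchio-style update without a negative term are all assumptions, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Document indices sorted by decreasing cosine similarity to the query."""
    return sorted(range(len(docs)), key=lambda i: cosine(query, docs[i]), reverse=True)


def pseudo_relevance_feedback(query, docs, k=2, alpha=1.0, beta=0.75):
    """Expand the query with the centroid of the top-k documents, then re-rank."""
    top_k = [docs[i] for i in rank(query, docs)[:k]]  # assumed relevant
    centroid = [sum(d[j] for d in top_k) / len(top_k) for j in range(len(query))]
    expanded = [alpha * q + beta * c for q, c in zip(query, centroid)]
    return expanded, rank(expanded, docs)


# Toy term-frequency vectors over the vocabulary (ship, boat, sea, car).
DOCS = [
    [2, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 2, 1, 0],
    [0, 0, 0, 3],
]
QUERY = [1, 0, 0, 0]  # the single term "ship"

initial_ranking = rank(QUERY, DOCS)
expanded, new_ranking = pseudo_relevance_feedback(QUERY, DOCS, k=2)
print(initial_ranking, expanded, new_ranking)
```

In this toy collection the third document never mentions "ship" and scores zero for the original query. After the blind feedback round the expanded query carries weight on "boat" and "sea", so that document obtains a positive score, which is the recall effect motivated by the ship vs. boat example in the introduction.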
Vocabulary tools for query reformulation
Tools displaying:
- a list of close terms belonging to the dictionary
- information about the query words that were omitted (cf. stop-list)
- the results of stemming
- a debugging environment

Query logs and thesaurus
- Users select among query suggestions that are built either from query logs or from a thesaurus
- Replacement words are extracted from the thesaurus according to their proximity to the initial query word
- A thesaurus can be developed:
  - manually (e.g. biomedicine)
  - automatically (cf. below)
- NB: query expansion
  (i) increases recall
  (ii) may need users' relevance judgments on query terms (rather than on documents)

Automatic thesaurus generation
Analysis of the collection for building the thesaurus automatically:
1. Using word co-occurrences (co-occurring words are more likely to belong to the same query field)
   - may contain false positives (example: apple)
2. Using a shallow grammatical analysis to find relations between words
   - example: cooked, eaten, digested → food
Note that co-occurrence-based thesauri are more robust, but thesauri based on grammatical analysis are more accurate
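Method 1 (word co-occurrences) can be illustrated with a small sketch: build a term-document weight matrix A, length-normalise each term row, and compute the term-term matrix C = A·Aᵀ, whose entries serve as similarity scores between terms. The toy weights, the row normalisation and the helper names below are assumptions made for the example.

```python
import math


def normalize_rows(matrix):
    """Scale each term row to unit length (all-zero rows are left unchanged)."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(w * w for w in row))
        result.append([w / norm for w in row] if norm else list(row))
    return result


def cooccurrence(a):
    """Term-term matrix C = A * A^T for a term-document matrix A."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in a] for row_i in a]


def nearest_terms(term, terms, c):
    """Other terms sorted by decreasing similarity to the given term."""
    i = terms.index(term)
    scored = [(terms[j], c[i][j]) for j in range(len(terms)) if j != i]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy term-document weights: rows are terms, columns are documents.
TERMS = ["ship", "boat", "car"]
A = normalize_rows([
    [2, 1, 0],  # ship
    [1, 2, 0],  # boat
    [0, 0, 3],  # car
])
C = cooccurrence(A)
print(nearest_terms("ship", TERMS, C))
```

Here "ship" and "boat" occur in the same documents and get a high score, while "car" never co-occurs with them and scores zero. On real collections the same computation also produces the false positives mentioned above, since co-occurrence does not distinguish between senses of an ambiguous word.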
Building a co-occurrence-based thesaurus
- We build a term-document matrix $A$ where $A[t, d] = w_{t,d}$ (e.g. normalized tf-idf)
- We then calculate $C = A A^{T}$:

$$C = \begin{pmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & \ddots & \vdots \\ c_{m1} & \cdots & c_{mn} \end{pmatrix}$$

- $c_{ij}$ is the similarity score between terms $i$ and $j$

Automatically built thesaurus
[Figure not reproduced.]

Conclusion
- Query expansion using either local methods:
  - Rocchio algorithm for Relevance Feedback
  - Pseudo Relevance Feedback
  - Indirect Relevance Feedback
  or global ones:
  - Query logs
  - Thesaurus
- Thesaurus-based query expansion increases recall but may decrease precision (cf. ambiguous terms)
- High cost of thesaurus development and maintenance
- Thesaurus-based query expansion is less efficient than Rocchio Relevance Feedback but may be as good as Pseudo Relevance Feedback

References
- C. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval. chapter09-queryexpansion.pdf
- Chris Buckley, Gerard Salton and James Allan, The effect of adding relevance information in a relevance feedback environment (1994).
- Ian Ruthven and Mounia Lalmas, A survey on the use of relevance feedback for information access systems (2003). ir/papers/ker.pdf
