
Manual Query Modification and Data Fusion for Medical Image Retrieval

Jeffery R. Jensen and William R. Hersh
Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
{jensejef, hersh}@ohsu.edu

Abstract. Image retrieval has great potential for a variety of tasks in medicine but is currently underdeveloped. For the ImageCLEF 2005 medical task, we used a text retrieval system as the foundation of our experiments to assess retrieval of images from the test collection. We conducted experiments using automatic queries, manual queries, and manual queries augmented with results from visual queries. The best performance was obtained from manual modification of queries. Combining manual and visual retrieval results lowered performance as measured by mean average precision but raised precision within the top 30 results. Further research is needed not only to sort out the relative benefit of textual and visual methods in image retrieval but also to determine which performance measures are most relevant to the operational setting.

1 Introduction

The goal of the medical image retrieval task of ImageCLEF is to identify and develop methods that enhance the retrieval of images based on real-world topics that a user would bring to such an image retrieval system. A test collection of nearly 50,000 images, annotated in English, French, and/or German, and 25 topics provided the basis for the experiments. As described in the track overview paper [1], the test collection was assembled from four collections, each of which was organized into cases consisting of one or more images plus annotations at the case or image level (depending on the organization of the original collection). There are two general approaches to image retrieval: semantic (also called context-based) and visual (also called content-based) [2].
Semantic image retrieval uses textual information associated with an image, such as an annotation or more structured metadata, to determine its subject matter. Visual image retrieval, on the other hand, uses features of the image itself, such as color, texture, and shape, to determine its content. The latter has historically been a difficult task, especially in the medical domain [3]. The most success for visual retrieval has come from "more images like this one" types of queries. There has actually been little research into techniques that would achieve good performance for queries more akin to those a user might enter into a text retrieval system, such as "images showing different types of skin cancers." Some researchers have begun to investigate hybrid methods that combine both image context and content for indexing and retrieval [3].

C. Peters et al. (Eds.): CLEF 2005, LNCS 4022, pp. 673-679, 2006. © Springer-Verlag Berlin Heidelberg 2006

Oregon Health & Science University (OHSU) participated in the medical image retrieval task of ImageCLEF 2005. Our experiments were based on a semantic image retrieval system, although we also attempted to improve our performance by fusing our results with the output of a content-based search. Data fusion has been used for a variety of tasks in IR, e.g., [4]. Our experimental runs included an automatic query, a manually modified query, and a manual/visual query (the manual query refined with the results of a visual search).

2 System Overview

Our retrieval system was based on the open-source search engine Lucene. We have used Lucene in other retrieval evaluation forums, such as the Text Retrieval Conference (TREC) Genomics Track [5]. Documents in Lucene are indexed by parsing individual words and weighting those words with an algorithm that, for each query term in each document, sums the product of the term frequency (TF), the inverse document frequency (IDF), the boost factor of the term, the normalization of the document, the fraction of query terms in the document, and the normalization of the weight of the query terms. The score of document d for query q consisting of terms t is calculated as follows:

score(q, d) = sum over t in q of [ tf(t, d) * idf(t) * boost(t, d) * norm(d, t) * frac(t, d) * norm(q) ]

where:
- tf(t, d) = term frequency of term t in document d
- idf(t) = inverse document frequency of term t
- boost(t, d) = boost for term t in document d
- norm(d, t) = normalization of document d with respect to term t
- frac(t, d) = fraction of query terms contained in document d
- norm(q) = normalization of query q

Lucene is distributed with a variety of analyzers for textual indexing. We chose Lucene's standard analyzer, which supports acronyms, floating-point numbers, lowercasing, and stop-word removal.
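The scoring product above can be sketched in plain Python. This is a simplified illustration of the summed TF-IDF product, not Lucene's actual implementation; the corpus, tokenization, and boost values below are hypothetical:

```python
import math
from collections import Counter

def lucene_style_score(query_terms, doc_terms, corpus, boosts=None):
    """Simplified sketch of the score(q, d) product described above."""
    boosts = boosts or {}
    counts = Counter(doc_terms)
    matched = [t for t in query_terms if t in counts]
    if not matched:
        return 0.0
    score = 0.0
    for t in matched:
        tf = math.sqrt(counts[t])                      # tf(t, d)
        df = sum(1 for d in corpus if t in d)          # number of documents containing t
        idf = 1.0 + math.log(len(corpus) / (df + 1))   # idf(t)
        boost = boosts.get(t, 1.0)                     # boost(t, d), e.g. cardiomegaly^3
        norm_d = 1.0 / math.sqrt(len(doc_terms))       # norm(d, t): length normalization
        score += tf * idf * boost * norm_d
    frac = len(matched) / len(query_terms)             # frac: fraction of query terms matched
    norm_q = 1.0 / math.sqrt(len(query_terms))         # norm(q)
    return score * frac * norm_q
```

With a boost of 3 on a term such as "cardiomegaly" (as in Figure 1), a document containing that term scores proportionally higher than it would under the default boost of 1.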
The standard analyzer was chosen to bolster precision. Each annotation in the library was indexed with three data fields: a collection name, a file name, and the contents of the file to be indexed. Although the annotations were structured in XML, we indexed each annotation without the use of an XML parser; therefore, every XML element (including its tag) was indexed along with its corresponding value. As noted in the track overview paper, some images were annotated at the case level, i.e., the annotation applied to all images associated with the case. (This applied to the Casimage and MIR collections, but not to the PEIR or PathoPIC collections.) When the search engine matched a case annotation, each of the images associated with the case was added to the retrieval output. It was for this reason that we also did a run that filtered the output based on retrieval by a visual retrieval run, in an attempt to focus the output of images from whole cases.
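The case-to-image expansion just described can be sketched as follows. The annotation and image identifiers are hypothetical; in the real system the mapping would come from the collection metadata:

```python
def expand_to_images(ranked_annotations, annotation_images):
    """Expand a ranked list of matched annotations into a ranked list of images.

    ranked_annotations: list of (annotation_id, score) pairs, best first.
    annotation_images: maps annotation_id -> list of image ids it covers;
    one image for image-level annotations (PEIR, PathoPIC), possibly
    several for case-level annotations (Casimage, MIR).
    """
    images, seen = [], set()
    for ann_id, score in ranked_annotations:
        for image_id in annotation_images.get(ann_id, []):
            if image_id not in seen:      # keep each image at its first (highest) rank
                seen.add(image_id)
                images.append((image_id, score))
    return images
```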

3 Methods

OHSU submitted three official runs for the ImageCLEF 2005 medical image retrieval track: two that were purely semantic and one that employed a combination of semantic and visual searching methods.

Our first run (OHSUauto) was purely semantic. This was a baseline run, using just the text of the topics as provided, with the unmodified Lucene system. It therefore also used the French and German translations that were provided with the topics. For the ranked image output, we used all of the images associated with each retrieved annotation.

For our second run (OHSUman), we manually modified the query for each topic. For some topics, the keywords were expanded or mapped to more specific terms, making the search statements for this run more specific. For example, one topic focused on chest x-rays showing an enlarged heart, so we added a term such as "cardiomegaly." Since the manual modification meant we no longer had accurate translations, we expanded the manual queries with translations obtained from Babelfish (http://babelfish.altavista.com). The newly translated terms were added to the query, with the text of each language group (English, German, and French) connected via a union (logical OR). Figure 1 shows a sample query from this run. In addition to the minimal term mapping and/or expansion, we also increased the significance of a group of relevant terms using Lucene's term-boosting function. For example, for the topic focusing on chest x-rays showing an enlarged heart, we increased the significance of documents that contained the terms chest AND x-ray AND posteroanterior AND cardiomegaly, while the default significance was used for documents that contained chest OR x-ray OR posteroanterior OR cardiomegaly. This strategy was designed to give a higher rank to the more relevant documents within a given search.
Moreover, this approach attempted to improve the precision of the results of our first run. As in the OHSUauto run, we returned all images associated with each retrieved annotation.

(AP^2 PA^2 anteroposterior^2 posteroanterior^2 thoracic thorax cardiomegaly^3 heart coeur)

Fig. 1. Manual query for topic 12

Our third run (OHSUmanviz) used a combination of textual and visual retrieval methods. We took the image output from OHSUman and excluded all images that were not retrieved by the University of Geneva baseline visual run (GE_M_4g.txt). In other words, we performed an intersection (logical AND) between the OHSUman and GE_M_4g.txt runs as a combined visual and semantic run.

Consistent with the ImageCLEF medical protocol, we used mean average precision (MAP) as our primary outcome measure. However, we also analyzed other measures output by trec_eval, in particular the precision-at-N-images measures.
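The OHSUmanviz filtering step amounts to intersecting two ranked runs while keeping the textual ranking; a minimal sketch, with hypothetical run contents:

```python
def intersect_runs(text_run, visual_run):
    """Keep only images retrieved by both runs, preserving the text run's order and scores."""
    visual_ids = {image_id for image_id, _score in visual_run}
    return [(image_id, score) for image_id, score in text_run
            if image_id in visual_ids]
```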

4 Results

Our automatic query run (OHSUauto) returned the largest number of images for each topic. The MAP for this run was extremely low at 0.0403, which fell below the median (0.11) of the 9 submissions in the text-only automatic category.

The manually modified query run (OHSUman) for the most part also returned large numbers of images, although for some topics it returned fewer images than the OHSUauto run; two of these were the topics focused on Alzheimer's disease and hand-drawn images of a person. This run was in the text-only manual category and achieved a MAP of 0.2116. Despite being the only submission in this category, it scored above every run in the text-only automatic category and as such was the best text-only run.

When we incorporated visual retrieval data (OHSUmanviz), our queries returned the smallest number of images for each topic. The intent was to improve the precision of the results of the previous two techniques. This run was placed in the text-and-visual manual category and achieved a MAP of 0.1601, the highest score in that category. This technique's performance was nonetheless below that of our manual query technique. Recall that both our manual and manual/visual techniques used the same textual queries, so the difference in overall score was a result of the visual refinement.

Figure 2 illustrates the number of images returned by each technique, while Figure 3 shows MAP per topic for each run. Even though the fully automatic query technique consistently returned the largest number of images on a per-query basis, this approach rarely outperformed the others. Whereas the manual query technique did not consistently return large numbers of images for each query, it did return a good proportion of relevant images for each query.
The manual/visual query technique found a good proportion of relevant images but clearly eliminated some images that the text-only search found, resulting in decreased performance.

[Figure 2: y-axis "Number of Images", x-axis "Topic" (1-25); series Auto-Retr, Man-Retr, Manvis-Retr, Auto-Relret, Man-Relret, Manvis-Relret.]

Fig. 2. Number of relevant images and retrieved relevant images for each of the three runs for each topic

Perhaps the most interesting result from all of our runs came from comparing performance based on MAP with precision at the top of the output. Despite its overall lower MAP, OHSUmanviz had better precision starting at five images and continuing to 30 images. OHSUman's better MAP is explained by its higher precision across the remainder of the output (down to 1,000 images). This finding is significant by virtue of the fact that many real-world uses of image retrieval may have users who explore output solely in the top range. Figure 4 shows precision at various levels of output, while Figure 5 shows a recall-precision curve comparing the runs.

[Figure 3: y-axis "Mean Average Precision", x-axis "Topic" (1-25); series Auto, Man, Manvis.]

Fig. 3. Mean average precision for each of the three runs for each topic

[Figure 4: y-axis "Precision", x-axis "At N documents" (P5, P10, P15, P20, P30, P100, P200, P500, P1000); series Auto, Man, Manvis.]

Fig. 4. Average of precision at 5, 10, 15, 20, 30, 100, 200, 500, and 1000 images for each run. The manual plus visual query run has higher precision down to 30 images retrieved, despite its lower mean average precision.
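Precision at a cutoff of N images, as reported in Figure 4, can be computed directly from a ranked run and a set of relevance judgments; a sketch with hypothetical identifiers:

```python
def precision_at(n, ranked_image_ids, relevant_ids):
    """Fraction of the top-n retrieved images that are judged relevant."""
    top = ranked_image_ids[:n]
    return sum(1 for image_id in top if image_id in relevant_ids) / n
```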

[Figure 5: y-axis "Precision" (0.0-0.8), x-axis "Recall" (0.0-1.0); series Auto, Man, Manvis.]

Fig. 5. Recall-precision curves for each run. The manual plus visual query run has higher precision at low levels of recall (i.e., at the top of the image output).

5 Conclusions

Our ImageCLEF medical track experiments showed that manual query modification and the use of an automated translation tool provide benefit in retrieving relevant images. Filtering the output with findings from a baseline content-based approach diminished performance overall, but perhaps not in the part of the output most likely to be seen by real users, i.e., the top 30 images. The experiments of our group and others raise many questions about image retrieval:

- Which measures from automated retrieval evaluation experiments are most important for assessing systems in the hands of real users?
- How would text retrieval methods shown to be more effective in other domains (e.g., Okapi weighting) improve performance?
- How would better approaches to data fusion of semantic and visual queries affect performance?
- Are there methods of semantic and visual retrieval that improve performance in complementary ways?
- How much do these variations in output matter to real users?

Our future work also includes building a more robust image retrieval system proper, which will both simplify further experiments and give us the capability to employ real users in them. With such a system, users will be able to manually modify queries and/or provide translations. Additional work we are carrying out includes better elucidating the needs of those who use image retrieval systems, based on a pilot study we have performed [6].

References

1. Clough, P., Müller, H., Deselaers, T., Grubinger, M., Lehmann, T., Jensen, J., Hersh, W.: The CLEF 2005 cross-language image retrieval track. In: Springer Lecture Notes in Computer Science (LNCS), Vienna, Austria (2006, to appear)
2. Müller, H., Michoux, N., Bandon, D., Geissbuhler, A.: A review of content-based image retrieval systems in medical applications - clinical benefits and future directions. International Journal of Medical Informatics 73 (2004) 1-23
3. Antani, S., Long, L., Thoma, G.R.: A biomedical information system for combined content-based retrieval of spine x-ray images and associated text information. In: Proceedings of the 3rd Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2002), Ahmedabad, India (2002)
4. Belkin, N., Cool, C., Croft, W.B., Callan, J.P.: Effect of multiple query representations on information retrieval system performance. In: Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Pittsburgh, PA. ACM Press (1993) 339-346
5. Cohen, A.M., Bhupatiraju, R.T., Hersh, W.: Feature generation, feature selection, classifiers, and conceptual drift for biomedical document triage. In: The Thirteenth Text Retrieval Conference: TREC 2004 (2004) http://trec.nist.gov/pubs/trec13/papers/ohsu-hersh.geo.pdf
6. Hersh, W., Jensen, J., Müller, H., Gorman, P., Ruch, P.: A qualitative task analysis of biomedical image use and retrieval. In: MUSCLE/ImageCLEF Workshop on Image and Video Retrieval Evaluation, Vienna, Austria (2005) http://medir.ohsu.edu/~hersh/muscle-5-image.pdf