Analyzing and Evaluating Query Reformulation Strategies in Web Search Logs




Jeff Huang
University of Washington Information School
cikm09@jeffhuang.com

Efthimis N. Efthimiadis
University of Washington Information School
efthimis@u.washington.edu

ABSTRACT

Users frequently modify a previous search query in hope of retrieving better results. These modifications are called query reformulations or query refinements. Existing research has studied how web search engines can propose reformulations, but has given less attention to how people perform query reformulations. In this paper, we aim to better understand how web searchers refine queries and form a theoretical foundation for query reformulation. We study users' reformulation strategies in the context of the AOL query logs. We create a taxonomy of query refinement strategies and build a high precision rule-based classifier to detect each type of reformulation. Effectiveness of reformulations is measured using user click behavior. Most reformulation strategies result in some benefit to the user. Certain strategies like add/remove words, word substitution, acronym expansion, and spelling correction are more likely to cause clicks, especially on higher ranked results. In contrast, users often click the same result as their previous query or select no results when forming acronyms and reordering words. Perhaps the most surprising finding is that some reformulations are better suited to helping users when the current results are already fruitful, while other reformulations are more effective when the results are lacking. Our findings inform the design of applications that can assist searchers; examples are described in this paper.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Query formulation

General Terms
Algorithms, Measurement, Human Factors

Keywords
Query reformulation, search effectiveness, query log analysis.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CIKM'09, November 2-6, 2009, Hong Kong, China. Copyright 2009 ACM 978-1-60558-512-3/09/11...$10.00.

1. INTRODUCTION

Of the roughly billion daily web searches made by internet users [8], approximately 28% are modifications to the previous query [9], also known as query reformulations or query refinements. For example, a user may search for pizza Seattle, but alter their query to sausage pizza Seattle if they are unsatisfied with the results from the initial query.

Reformulations make up a large portion of web search activity. In a study of Dogpile.com logs, Jansen et al. [6] reported that 37% of search queries were reformulations when ignoring same queries. A study of Altavista logs [7] identified that 5% of users reformulated their queries.

Search engines and humans both try hard to come up with appropriate query reformulations. Many web search engines today offer query reformulation suggestions by, for example, mining query logs. Users manually reformulate their queries based on the search results from the initial query and on their knowledge and experience of how search engines work. The reformulation process is an iterative endeavor between users and search engines in getting a satisfactory set of results. While the search engine side of query reformulation has been studied extensively by the search companies and in prior information retrieval research, how users perform query reformulations has received less attention.

Among the benefits of understanding how people search is being able to automatically propose query reformulations. If many users searching for hummus reformulate their query to hummus recipe, the search engine can be proactive and suggest hummus recipe when the user searches for hummus.

Users can also benefit from an improved search experience when performing reformulations. Currently, search engines present the same interface regardless of whether the user gives it a new query, same query, or query reformulation. Being able to accurately detect when a user is making a query reformulation gives the search engine an opportunity to present an improved interface.

The goal of this work is to look at the types of query reformulation users perform and evaluate them using effectiveness metrics such as click data. In order to study these metrics, we first construct a taxonomy of query reformulation strategies adopted by users. Next, we build a classifier for these different types of reformulations.
While there are some existing classifiers that determine whether a query is a reformulation, ours is the first to separate them into reformulation types. Our work makes three specific contributions:

- A comprehensive taxonomy of query reformulation strategies defined by formal language, developed by combining the different types of reformulations reported in existing work and iterative experimentation over query logs (Section 3).
- An unsupervised rule-based classifier with high precision in detecting the different query reformulation strategies (Section 4).
- Analysis of correlations between query reformulation strategies and effectiveness metrics, giving us a better overall understanding of query reformulation strategy effectiveness (Section 5).

2. RELATED WORK

2.1 Computer-Generated Reformulations

Much of the work on query reformulation for web search has focused on offering automatically generated query suggestions to the user. The suggestions are typically shown on the same page as the search results. These query suggestions are built into every major search engine today. Prior research in this vein has explored computer-generated suggestions using query expansion [6], query substitution [], and other refinement techniques [6][3].

Implicit relevance feedback from users is a common data source for computer-generated reformulations. For example, work by Baeza-Yates et al. [6] uses query logs to discover new query reformulations, finding similar queries using a cosine function over a term-weighted vector built from the clicked documents. A study by Anick [3] showed that these automatically generated reformulations were as effective as human constructed reformulations, using metrics such as uptake and click behavior.

2.2 Query Session Boundary Detection

We process query logs containing raw search queries; therefore, to classify a query reformulation, we must first determine whether a query is indeed a reformulation instead of a new query. This is similar to the problem of detecting query sessions and their boundaries. Jansen et al. define a session as a series of interactions by the user toward addressing a single information need [6]; Silverstein et al. [33] note, "A session is meant to capture a user's attempt to fill a single information need." Therefore, a session can be considered as a single query, followed by any number of reformulated queries. From this, our definition of query reformulation is: a modification to a search query that addresses the same information need.

Further deriving from these definitions, we can conclude that if we were able to correctly identify the boundaries of all query sessions, we would know which queries are initial queries and which are reformulations. Conversely, by identifying which queries are reformulations, we would be able to accurately group query sessions together.
Therefore, the problem of identifying query reformulations is similar to the problem of detecting session boundaries.

Most existing work identifies sessions using a simple temporal strategy, where a specific time interval of inactivity represents a boundary. This method is simple to implement and the definition is unambiguous. He et al. [5] and Ozmutlu [8] used time and common words to determine session cutoffs. Comparing several session detection algorithms, He et al. attained 73% precision and 6% recall using time only, and 60% precision and 98% recall using time and common words together. Arlitt [4] found session boundaries using a calculated timeout threshold. Murray et al. [7] extended this work by using hierarchical clustering to find better timeout values to detect session boundaries. Their method had 97% precision and 76% recall on a human-classified dataset.

More recently, Jones and Klinkner [] presented evidence that any temporal cutoff is arbitrary and detects session boundaries no better than a random cutoff time. They evaluated the existing session boundary detection methods alongside their own. Their study reviewed these methods without considering same queries. Using the optimal cutoff time, 5 minutes, query reformulations were accurately identified 63% of the time. Combining the optimal features from prior work, that is, common word + prisma (see [3]) + time, they achieved 84% accuracy. Using only Levenshtein edit distance resulted in 85% accuracy. Lastly, their own combination of methods resulted in the best accuracy, 87%.

2.3 Click Data Analysis

Many researchers have studied click data as indicators of search relevance. An early inquiry by Joachims [9] reveals that click data can indeed be used to improve search relevance. Several later studies agree that click data are indicators of search result preferences and discuss best methods of analyzing click data [][]. Joachims et al. also find that analyzing clicks over query reformulations similarly provides useful information [0]. This data has also been shown to be helpful for improving search relevance [][9].

Our study applies lessons learned from these reports of click data analysis. While they study the effectiveness of analyzing different click patterns, we study the effectiveness of reformulation strategies using different click patterns.

2.4 Taxonomies of Reformulation Strategies

Taxonomies of query reformulation have been developed for different types of search.
A more comprehensive review of query reformulation in traditional information retrieval can be found in [0]. Here we focus only on the taxonomies developed by analyzing query logs. These are generally constructed by examining a small set of query logs. Some studies are out of date or incomplete. None have built an automatic classifier distinguishing reformulation strategies, as we have. Table 1 presents a mapping between our taxonomy of query reformulation strategies and the terminology for these strategies from prior work.

Anick [3] classified a random sample of 00 reformulations by hand into eleven categories. Lau and Horvitz [4], Jansen et al. [6], and He et al. [5] used the same reformulation categories in terms taken from linguistics [8]. As part of a study of re-finding behavior, Teevan et al. [34] constructed a taxonomy by looking through query logs, and implemented algorithms to detect a subset of the reformulation strategies. Whittle et al. [36] modeled some reformulation strategies using a graphical network. Bruza and Dennis [7] manually classified ,040 queries into their own taxonomy. Guo et al. [3] also constructed a small taxonomy and used a conditional random field model to predict query refinements. Rieh and Xie [3] constructed conceptual reformulation categories like content, format, resource; these are not included in the table because their abstract nature makes them difficult to map against concrete reformulation techniques.

Table 1: Mapping between taxonomies of query reformulation in search logs

Present Study | Anick [3] | Teevan [34] | Jansen [6], He [5], Lau [4] | Whittle [36] | Bruza [7] | Guo [3]
word reorder | syntactic variant | word order | - | - | - | -
whitespace and punctuation | - | non alphanumerics, word merge | - | - | SPL, PUN | word splitting, word merging
remove words | - | remove words / duplicates | generalization | D(k) | DEL | -
add words | head, modifier | add words, add stopwords | specialization | C(k) | ADD | -
url stripping | - | domain | - | - | - | -
stemming | morphological variant | stemming and pluralization | - | M(k) | DER | word stemming
acronym¹ | acronym | abbreviations | - | - | ABR | expansion
substring² | - | - | - | - | - | -
abbreviation | - | - | - | - | - | -
word substitution | alternative, hyponym, change | word swaps, synonyms | reformulation | W(k), w(k) | SUB | -
spelling correction | spelling | misspellings | - | M(k) | SPE | spelling correction
* not detected | elaboration, location | - | reformulation | S(k), s(k) | - | -
* not in data | - | capitalization, extra whitespace | - | J(k) | CAS | -

3. REFORMULATION STRATEGIES

We constructed our own taxonomy by combining the types of query reformulation identified in prior work (Table 1). We implemented a matching rule for each strategy, which was iteratively improved to find the best unsupervised algorithm. For instance, the add words rule was modified to detect added words even when the other words were reordered.

To determine if we were missing any other rules or needed to adjust existing rules, we ran our classifier over the AOL query logs and randomly checked the output. We optimized for reducing false positives while keeping false negatives low since we wanted a high precision classifier. From this, we tweaked several rules and added one that would detect a number of query reformulations that other rules did not, namely substring.

A few categories from prior work were either vague or difficult to detect, marked not detected in the table. For example, determining whether a query was a location reformulation as defined in [3] is subjective and would reduce the precision of our classifier. Categories marked not in data could not be classified because the queries were normalized (via lowercasing and punctuation removal) in our dataset.

The query reformulation strategies (ordered by rule precedence) from Table 1 are described below in formal language notation.

3.1 Definitions

Let underscore _ denote the space character. Punctuation, represented by P, comprises the three punctuation characters left in the query logs: the apostrophe, dash, and period; i.e. P = {', -, .}. The empty string is represented by λ. Let Σ be the alphabet of letters, digits, and punctuation, Σ = {[a-z],[0-9]} ∪ P. c is a character in that alphabet, c ∈ Σ; w is a word in that alphabet, w ∈ Σ⁺; and a is any string composed from that alphabet or the space character, a ∈ (Σ ∪ {_})*, including the empty string.
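¹ Includes form acronym and expand acronym
² Includes substring and superstring

REFORMULATION 1. WORD REORDER

In a word reorder, the words in the first (initial) query are reordered but unchanged otherwise, producing the second (reformulated) query. Writing a for the first query and b for the second, this transformation can be defined formally using a recursive definition,

\[
a \xrightarrow{WR} b \quad \text{if} \quad
\begin{cases}
a = w_1\_w_2,\ b = w_2\_w_1, & \text{or} \\
a' \xrightarrow{WR} b', & \text{where } a' \text{ and } b' \text{ are } a \text{ and } b \text{ with the same word } w \text{ removed}
\end{cases}
\]

Explicitly, either both queries contain the same two words but reversed, or removing the same word from both queries makes the second query a word reorder of the first query. The first condition is the base case and the second condition is the recursive step.

Example: seattle pizza place → pizza seattle place

REFORMULATION 2. WHITESPACE AND PUNCTUATION

The second query is a whitespace and punctuation reformulation of the first query if only whitespace and punctuation are altered in the reformulation. This can be defined recursively,

\[
a \xrightarrow{WP} b \quad \text{if} \quad a = a_1 v_1 a_2,\ b = b_1 v_2 b_2,\ v_1, v_2 \in P \cup \{\lambda, \_\},\ \text{and}\ \left( a_1 a_2 = b_1 b_2 \ \lor\ a_1 a_2 \xrightarrow{WP} b_1 b_2 \right)
\]

A whitespace and punctuation reformulation occurs when, after removing a whitespace or punctuation character, the remaining queries are the same or the remaining second query is a whitespace and punctuation reformulation of the remaining first query.

Example: wal mart, tomatoprices → walmart tomato prices

REFORMULATION 3. REMOVE WORDS

A remove words reformulation is when any number of words is removed from the first query, resulting in the same words in both queries. This reformulation neglects word order.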

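\[
a \xrightarrow{RW} b \quad \text{if} \quad
\begin{cases}
a = b, & \text{or} \\
a \xrightarrow{WR} b, & \text{or} \\
a' \xrightarrow{RW} b, & \text{where } a' \text{ is } a \text{ with one word } w \text{ and its surrounding spaces replaced by a single space}
\end{cases}
\]

Words are recursively removed from the first query until it is a word reorder of, or equal to, the second query. The first and second conditions are base cases where the two queries are either equal or a word reorder. The third condition removes words along with the surrounding spaces from the first query and replaces them with spaces. Spaces are temporarily added to the left and right of the query to account for the leftmost and rightmost words.

Example: yahoo stock price → price yahoo

REFORMULATION 4. ADD WORDS

An add words reformulation occurs when one or more words are added to the first query. This reformulation applies even if words are reordered in the second query. It is easily defined as the reverse transformation of remove words,

\[
a \xrightarrow{AW} b \quad \text{iff} \quad b \xrightarrow{RW} a
\]

Example: eastlake home → eastlake home price index

REFORMULATION 5. URL STRIPPING

Users often append components from a URL to the query, mistaking the search box for their browser's address bar. When they realize this, they will strip these strings from their query. This also happens in reverse, where the user copies the target URL into the search box after searching. A url stripping reformulation occurs when the first and second queries are the same after removing .com, www., and http from both sides. Let Ω = {_http, http_, www., .com},

\[
a \xrightarrow{US} b \quad \text{if} \quad a = a_1 v_1 a_2,\ b = b_1 v_2 b_2,\ v_1, v_2 \in \Omega \cup \{\lambda\},\ \text{and}\ \left( a_1 a_2 = b_1 b_2 \ \lor\ a_1 a_2 \xrightarrow{US} b_1 b_2 \right)
\]

This rule applies if there is some permutation of removing URL components from both queries that makes them the same.

Example: http www.yahoo.com → yahoo

REFORMULATION 6. STEMMING

A stemming reformulation involves changing the word stems in the first query. The rule stems every word in both queries using Porter's stemming algorithm [30] and compares them. Let P(w) be the stem of the word w,

\[
a \xrightarrow{Stem} b \quad \text{if} \quad a = w_1\_\dots\_w_n,\ b = w'_1\_\dots\_w'_n,\ \text{and}\ \forall i\ \left( P(w_i) = P(w'_i) \right)
\]

Example: running over bridges → run over bridge

REFORMULATION 7. FORM ACRONYM

A form acronym transformation occurs when the second query is an acronym formed from the first query's words.

\[
a \xrightarrow{FA} b \quad \text{if} \quad a = c_1 w_1\_c_2 w_2\_\dots\_c_n w_n \ \text{and}\ b = c_1 c_2 \dots c_n
\]

Example: personal computer → pc

REFORMULATION 8. EXPAND ACRONYM

An expand acronym transformation occurs when the first query is an acronym and the reformulation is a query consisting of the words that form the acronym.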
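\[
a \xrightarrow{EA} b \quad \text{if} \quad a = c_1 c_2 \dots c_n \ \text{and}\ b = c_1 w_1\_c_2 w_2\_\dots\_c_n w_n
\]

Example: pda → personal digital assistant

REFORMULATION 9. SUBSTRING

A substring is defined as an instance where the second query is a strict prefix or suffix of the first query. Unlike the traditional definition of substring, this does not include instances where only inside characters of the first query are extracted.

\[
a \xrightarrow{Sub} b \quad \text{if} \quad a = b\,x \ \lor\ a = x\,b, \quad x \neq \lambda
\]

Example: is there spyware on my computer → is there spyw

REFORMULATION 10. SUPERSTRING

A superstring is defined as an instance where the second query contains the first query as a prefix or suffix.

\[
a \xrightarrow{Super} b \quad \text{if} \quad b = a\,x \ \lor\ b = x\,a, \quad x \neq \lambda
\]

Example: nevada police rec → nevada police records 2008

REFORMULATION 11. ABBREVIATION

An abbreviation reformulation is when corresponding words from the first and second queries are prefixes of each other. This differs from substring, which also considers suffixes and only compares the entire queries.

\[
a \xrightarrow{Abbr} b \quad \text{if} \quad a = w_1\_\dots\_w_n,\ b = w'_1\_\dots\_w'_n,\ \text{and}\ \forall i\ \left( w_i = w'_i \ \lor\ w_i = w'_i x \ \lor\ w'_i = w_i x \right)
\]

Example: shortened dict → short dictionary

REFORMULATION 12. WORD SUBSTITUTION

A word substitution occurs when one or more words in the first query are substituted with semantically related words, determined from the Wordnet database [0]. Two words are related if one is a semantic relation (synonym, hyponym, hypernym, meronym, or holonym) of the other after both are converted to their base morphological form. This rule is implemented in two steps. First, if the queries in their entirety are related, they are considered a word substitution; this detects substitutions of the entire query. Second, if every corresponding pair of words is the same or related, this is also a word substitution. Let the operator ∼ represent a semantic relation between two words, including the case when the words are the same.

\[
a \xrightarrow{WS} b \quad \text{if} \quad a \sim b, \ \text{or}\ a = w_1\_\dots\_w_n,\ b = w'_1\_\dots\_w'_n,\ \text{and}\ \forall i\ \left( w_i \sim w'_i \right)
\]

Synonym: The two words have the exact same meaning.
Example: easter egg search → easter egg hunt

Hyponym: The first word is a specific instance of the second word. These are also referred to as broad terms.
Example: crimson scarf → red scarf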

Hypernym: The second word is a specific instance of the first word. These are also referred to as narrow terms.
Example: personal computer → laptop

Meronym: The first word is a constituent part of the second word.
Example: finger → hand

Holonym: The second word is a constituent part of the first word.
Example: automobile → wheel

REFORMULATION 13. SPELLING CORRECTION

A spelling correction is detected using a conservative Levenshtein edit distance function [5]. This function maps well to a spelling correction a user would typically make, because it tracks the number of character edits between two queries. The queries are classified as a spelling correction reformulation if the Levenshtein distance is 2 or less. A threshold of 2 matches character swaps and missing characters. In the expression below, L(a, b) is the Levenshtein edit distance between strings a and b,

\[
a \xrightarrow{SC} b \quad \text{if} \quad L(a, b) \le 2
\]

Example: reformualtion → reformulation

3.2 Undetected Reformulations

There are a few categories of reformulations which are not included in our taxonomy. They are difficult for our classifier to detect, and may even be difficult for a human to detect. We randomly sampled 200 of the 961 missed reformulations from our evaluation data to get a general sense of which reformulations our classifier missed. Three types of missed reformulations emerged, described in the next three subsections and quantified in Table 2.

3.2.1 Semantic Rephrasing

Humans can rephrase their queries in complex ways. Many rephrasings are difficult for even a smart algorithm to detect, requiring sophisticated semantic association at a minimum. Context or pop culture knowledge may be needed.

Example: easy raspberry mousse → cool whip mousse
Example: how to calculate nutritional values → weight watchers calculator

3.2.2 Multi-Reformulations

Users often perform more than a single reformulation strategy. For example, they may correct spelling and replace one word with a synonym. While a classifier can theoretically try combinations of reformulation strategies, this is difficult or even impossible because reformulation strategies do not have a commutative property. In other words, a different ordering of strategies gives different results. For example, trying to detect spelling corrections after stemming will yield different results than doing so before stemming. Additionally, many reformulations obviously cannot be combined, such as word reorder and acronym.
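Add words and remove words together were not considered a multi-reformulation since any query can be transformed into any other query. The most common combinations of reformulations in our sample were add words & spelling correction, remove words & spelling correction, and url stripping & whitespace and punctuation. Exploring the challenge of multi-reformulations is planned as future work.

The following example demonstrates a multi-reformulation involving two reformulations: add words and spelling correction.

Example: lane county garbge → lane county garbage disposal

3.2.3 Classifier Rule Limitations

Some instances of reformulation strategies were insufficiently matched by a classifier rule. However, fixing the rules to detect these reformulations would have introduced new complications.

Our rule for detecting spelling correction used a Levenshtein edit distance of 2. While this achieved high precision, the rule missed spelling corrections involving three or more character edits, for example, ametuer changed to amateur. This is an example of the classic trade-off between precision and recall. We chose a lower threshold to optimize for a high precision classifier.

Word substitutions are dependent on the Wordnet database. Substitutions absent from the database cannot be detected by our classifier. This limitation will likely be solved over time.

Our rule for url stripping currently only removes the .com top-level domain from the query. Some queries involve other top-level domains or second-level domains which are not stripped. The list of top-level domains is not constant and there are an infinite number of second-level domains, so capturing these reformulations requires a more sophisticated rule.

The abbreviation detection rule only checked for a substring prefix for each word. There are cases in the English language where an abbreviation is not a substring prefix, such as dept for department.

Table 2: Missed reformulations in sample evaluation data

Undetected Reformulation            Occurrences
1. Semantic Rephrasing              108
2. Multi-Reformulations             60
     2 reformulations               46
     3 reformulations               14
3. Classifier Rule Limitations      32
     spelling correction            5
     word substitution
     url stripping                  3
     acronym
     abbreviation
Total                               200

4. THE RULE-BASED CLASSIFIER

Classifiers commonly learn from a set of training data; we refer to these as machine learning classifiers. We developed a rule-based classifier instead of a machine learning classifier because our query reformulation strategies fit a procedural rule model better than a learning model. No prior work has developed a machine learning classifier that distinguishes different query reformulation strategies. Furthermore, using a rule-based classifier allowed us to make detailed adjustments to our classifier for special cases. An implementation of the classifier is freely available to the research community³.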
We developed rulesed clssfer sted of mche lerg clssfer ecuse our query reformulto strteges ft procedurl rule model etter th lerg model. No pror work hs developed mche lerg clssfer tht dstgushes dfferet query reformulto strteges. Furthermore, usg rule-sed clssfer llowed us to mke detled djustmets to our clssfer for specl cses. A mplemetto of the clssfer s freely vlle to the reserch commuty 3. 3 Source code: http://jeffhug.com/reformultoclssfer.py

The classifier reads the query log starting from the top and compares pairs of consecutive queries (a, b) from the same user. The first query in the pair is the initial query and the second query is potentially a reformulated query. The query pairs are matched against the ordered reformulation rules defined in Section 3.1. If there is a match, the second query is classified as a reformulation of the first query. Figure 1 shows the flow of queries into the classifier and their segmentation into query types. Using the notation that a_i is the ith query in the query log example from Figure 1, we can see that (a_1, a_2), (a_2, a_3), and (a_5, a_6) are classified but not (a_3, a_4) or (a_4, a_5) because a_4 was from a different user.

Figure 1: Diagram of the queries and classifier. Query log records (user, query string, timestamp, rank, url) flow into the classifier, which segments them into new queries, same queries, and reformulations (acronym, stemming, etc.).

4.1 Precision vs. Recall

Accuracy is the percentage of query pairs correctly detected as reformulation. Existing measures of accuracy in most query reformulation research do not differentiate between precision, the percentage of query reformulations identified that are actually reformulations, and recall, the percentage of query reformulations identified. Our goal is to create a rule-based classifier with high precision, but not necessarily high recall.

We deemphasize recall because we are studying the properties within each reformulation rather than between each reformulation. In other words, we are interested in intra-reformulation, rather than inter-reformulation, comparisons. For example, the proportion of URL clicks within each reformulation helps us understand the reformulations better than comparing the absolute counts of URL clicks between each reformulation. The magnitude of query logs provides sufficient events, so the analysis will still be generalizable and compelling even with lower recall.

We manually classified every query from 00 users in the AOL query logs for evaluation. Essentially, this was a session boundary detection task. In total, there were 9,091 query pairs where we determined whether the second query was a reformulation of the first.
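The pairing step described at the start of this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the log is assumed to be an in-order sequence of `(user, query)` tuples, and the helper names `consecutive_pairs` and `label_pair` are hypothetical. `label_pair` takes any rule matcher as a parameter, so the sketch stays independent of how the rules themselves are written.

```python
def consecutive_pairs(log):
    """Yield (initial, candidate) query pairs for consecutive rows of the same user.

    `log` is an iterable of (user, query) tuples in log order (an assumed format).
    """
    prev_user, prev_query = None, None
    for user, query in log:
        if user == prev_user:
            yield prev_query, query
        prev_user, prev_query = user, query

def label_pair(initial, candidate, matches_rule):
    """Label a pair as a same query, a reformulation, or a new query.

    `matches_rule` is any callable returning True when some reformulation
    rule matches the pair; it is a placeholder for the ordered rule set.
    """
    if initial == candidate:
        return "same"
    return "reformulation" if matches_rule(initial, candidate) else "new"

# The six-row example from Figure 1: users 1, 1, 1, 2, 3, 3.
example_log = [
    ("user1", "q1"), ("user1", "q2"), ("user1", "q3"),
    ("user2", "q4"),
    ("user3", "q5"), ("user3", "q6"),
]
pairs = list(consecutive_pairs(example_log))
```

On the Figure 1 example this yields the pairs (q1, q2), (q2, q3), and (q5, q6); the pairs that would straddle a change of user are never formed.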
Same queries were removed (40.8% of queries) to avoid inflating classifier performance, because they can be detected trivially. Of these pairs, we found 2,483 reformulations and 6,608 new queries, or 27.3% reformulations. This is very close to the 28% reformulations reported for this dataset [9]. Our classifier was evaluated on this test data, marking the second query of each query pair as a reformulation if the query pair matched a reformulation strategy. Table 3 presents the results, comparing our classifier with machine learning classifiers.

Table 3: Precision, recall, and accuracy measures for session boundary detection studies

                   Precision   Recall   Accuracy
Present Study      98.2%       61.3%    89.1%
He [5]             60%⁴        98%
Jones []                                87.3%
Murray [7]         97.3%⁴      76%
Radlinski [3]      96.5%⁴      9.3%

Our focus on precision rather than recall resulted in 98.2% precision, which is 38 percentage points higher than reported by He et al. and slightly higher than Radlinski and Joachims's 96.5%. Note that each study used a different set of query logs, so results cannot be directly compared. Certain query logs are easier to classify than others because of the nature of the search engine and its users.

Looking closer at the 1.8% (28 actual) queries that our classifier incorrectly determined to be a reformulation, we only found one case that was a true mistake. The other 27 were difficult to judge, and it is debatable whether these were reformulations or not (see Section 5.3.3 for discussion). Therefore, we propose that our precision is even better than the 98.2% reported.

5. RESULTS

Our results are extracted from the AOL query logs, which were released on August 3, 2006 [9]. The logs contain 36,389,567 queries from which our classifier identified 6,069,4 new queries, 4,86,36 same queries, and 3,4,706 reformulations. Each line in the logs contains five fields: the query string, timestamp, the rank of the item selected (if any), the domain portion of the selected item's URL path (if any), and a unique identifier for each user.

5.1 Reformulation Effectiveness Metrics

We use effectiveness metrics to infer the quality of search results. Past studies found that clickthrough data and time spent predicted users' satisfaction with the results []. Whether users clicked during the initial query and the reformulated query, which we call click pattern, can be a predictor of search relevance [0].
We apply metrics learned from previous research to study the effectiveness of different reformulation strategies. These metrics are mostly based on click behavior and help show the usage pattern and effectiveness of specific reformulations. In our analysis, we also included new and same queries for comparison. Some reformulations are misidentified as new queries due to our classifier's lower recall, but identifying new queries has no effect on our study of reformulated queries. Differences between reformulation strategies were all statistically significant, due to the large number of events in our dataset.

5.1.1 Click Pattern

A reformulation is composed of an initial query followed by a reformulated query. For each query, the user can decide to click or not click (skip) a result, creating 2×2 = 4 possible click patterns, presented in Table 4.

⁴ Same queries, which inflate precision, may have been included

Table 4: Click patterns for queries and their reformulation
Searcher Actions on Results
Initial Query:    Click            Skip
Reformulation:    Click | Skip     Click | Skip

A click pattern of Skip followed by Click (SkipClick) means the user did not click any result from their initial query, then reformulated their query and clicked a result. This is an indicator that the user found the query reformulation to be effective. A Click followed by Skip (ClickSkip) suggests that the reformulation did not help [20]. Similarly, two consecutive Clicks can be taken as successful searches, while two consecutive Skips can be taken as failed searches. Over all queries in the query logs, the ratio of clicks to skips was approximately 5:4.

Figure 2 shows the proportions of the click patterns for each type of reformulation. A chi-squared analysis verifies that the query reformulation type has a statistically significant effect on click pattern, χ²(4, N=34,34,453) = 6,7,864.37, p < .00.

The results show that different reformulation strategies have significantly different proportions of Clicks vs. Skips in the initial query. We can see this by looking at the ratios of SkipSkip + SkipClick to ClickClick + ClickSkip. Spelling correction, expand acronym, and superstring have high ratios, meaning people attempt these reformulations when they are unsatisfied with their initial query, perhaps due to a misspelled query or an ambiguous acronym. In contrast, form acronym, remove words, word reorder, and word substitution have lower ratios, indicating the initial results may be somewhat relevant and users are further refining their query. Same queries have the lowest ratio, as expected, since users are unlikely to repeat a search using the same query if the initial results were unsatisfying; in fact, same queries usually have ClickClick patterns, probably because they are re-finding queries [35]. These proportions are consistent with our current understanding of users.

Comparing the proportions of Clicks vs. Skips in the reformulated query gives insight into whether the reformulation was helpful. Looking at the SkipSkip + ClickSkip to SkipClick + ClickClick ratios, we can see that reformulation results were clicked about as often as new queries. This is a positive indicator for reformulations because it suggests users are as successful with reformulations as with new searches.
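The click-pattern bookkeeping described above can be sketched as follows. This is an illustrative sketch rather than the study's actual code: the function names are hypothetical, the input is assumed to be already-classified query pairs, and the chi-squared statistic is the textbook Pearson form computed by hand on the type-by-pattern table.

```python
from collections import Counter, defaultdict

PATTERNS = ("ClickClick", "ClickSkip", "SkipClick", "SkipSkip")


def click_pattern(initial_clicked: bool, reformulated_clicked: bool) -> str:
    """Label a query pair with one of the four click patterns."""
    first = "Click" if initial_clicked else "Skip"
    second = "Click" if reformulated_clicked else "Skip"
    return first + second


def pattern_table(pairs):
    """pairs: iterable of (reformulation_type, initial_clicked, reformulated_clicked)."""
    table = defaultdict(Counter)
    for rtype, first, second in pairs:
        table[rtype][click_pattern(first, second)] += 1
    return table


def initial_skip_ratio(counts) -> float:
    """Ratio of SkipSkip + SkipClick to ClickClick + ClickSkip for one type."""
    skipped = counts["SkipSkip"] + counts["SkipClick"]
    clicked = counts["ClickClick"] + counts["ClickSkip"]
    return skipped / clicked if clicked else float("inf")


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for the table."""
    rows = list(table)
    row_tot = {r: sum(table[r][p] for p in PATTERNS) for r in rows}
    col_tot = {p: sum(table[r][p] for r in rows) for p in PATTERNS}
    n = sum(row_tot.values())
    stat = 0.0
    for r in rows:
        for p in PATTERNS:
            expected = row_tot[r] * col_tot[p] / n
            if expected > 0:  # skip patterns that never occur
                stat += (table[r][p] - expected) ** 2 / expected
    used_cols = sum(1 for p in PATTERNS if col_tot[p] > 0)
    return stat, (len(rows) - 1) * (used_cols - 1)
```

With tens of millions of pairs, even small differences in proportions produce a very large statistic, which matches the remark that all differences were significant because of the number of events.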
Under the reformulated-query comparison (SkipSkip + ClickSkip versus SkipClick + ClickClick), the substring and superstring reformulations were least helpful, possibly because many of those reformulations were mistakes by users. Add words, word substitution, stemming, spelling correction, and expand acronym were most helpful under this comparison.

We can also compare the proportions of Clicks vs. Skips in the reformulated query given a specific action in the initial query. We control the action variable in the initial query and regard the action in the reformulated query as the dependent variable. For example, we compare the ratio of SkipSkip to SkipClick to see whether a user is more likely to click if the initial action is Skip. Same queries behave as expected: if the initial query was a Skip, the user is significantly more likely to skip the second query as well; if the initial query caused a click, the user is about 0 more likely to click than skip after searching with the same query.

When the initial query causes a Skip, the spelling correction, expand acronym, and add words reformulations have the highest likelihood that the user will click. Likely explanations are that spelling correction and expand acronym fix incorrect queries and disambiguate acronyms, while add words narrows the search to make the results more relevant. In contrast, superstring, url stripping, and substring are least likely to help when the initial query results in a Skip.

Different reformulations are effective when looking at initial queries that result in a Click. Word substitution, word reorder, and add words are the three most helpful reformulations in this condition. When a search provides relevant queries, users that substitute words for related words, reorder their words, and add new words get better follow-up results. On the other hand, substring, superstring, abbreviation, and spelling correction are not useful when the initial query results in a Click. This is interesting because spelling correction is one of the most helpful reformulations when the initial action is Skip, but one of the least helpful reformulations when the initial action is Click.
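The conditional comparison above, which holds the initial action fixed and looks at the follow-up action, can be sketched as below. The function names are hypothetical, and the counts in the usage example are invented purely to show the shape of the computation; they are not figures from the study.

```python
def click_rate_given_initial(counts, initial_action):
    """Share of reformulated queries that were clicked, given the initial action.

    counts maps click patterns ("SkipClick", "SkipSkip", ...) to frequencies.
    Returns None when the initial action never occurred.
    """
    if initial_action not in ("Click", "Skip"):
        raise ValueError("initial_action must be 'Click' or 'Skip'")
    clicked = counts.get(initial_action + "Click", 0)
    skipped = counts.get(initial_action + "Skip", 0)
    total = clicked + skipped
    return clicked / total if total else None


def rank_types(table, initial_action):
    """Order reformulation types from most to least helpful for one initial action."""
    rates = {
        rtype: click_rate_given_initial(counts, initial_action)
        for rtype, counts in table.items()
    }
    usable = [rtype for rtype, rate in rates.items() if rate is not None]
    return sorted(usable, key=lambda rtype: rates[rtype], reverse=True)


# Invented counts, for illustration only.
example = {
    "spelling correction": {"SkipClick": 8, "SkipSkip": 2, "ClickClick": 3, "ClickSkip": 7},
    "word substitution": {"SkipClick": 4, "SkipSkip": 6, "ClickClick": 9, "ClickSkip": 1},
}
print(rank_types(example, "Skip"))
print(rank_types(example, "Click"))
```

Ranking the same table twice, once per initial action, is what exposes a strategy that sits near the top of one list and near the bottom of the other.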
[Figure 2: Proportions of Click Patterns used for each Reformulation Type. Query reformulation types shown: new, form acronym, remove words, substring, word reorder, add words, word substitution, url stripping, abbreviation, superstring, stemming, whitespace / punctuation, same, spelling correction, expand acronym. Click patterns: SkipSkip, ClickClick, SkipClick, ClickSkip.]

The next two metrics, Click URL and Rank Change of Clicked Results, only apply in the case of a ClickClick pattern because the rank and URL from corresponding clicks are used in the analysis.

5.1.2 Click URL
Users may be re-finding rather than reformulating queries to retrieve better results. This can be observed by checking if the URL is the same between queries. We hypothesize that users click on the same URL in same queries (re-finding). There are some limitations to the analysis because the URLs in the AOL logs are truncated at the domain level for privacy. Figure 3 shows the proportions of clicked URLs which were the same for each reformulation type. A chi-squared analysis shows that reformulation type has a statistically significant effect on this metric, χ²(8, N=34,34,453) = 5,394,409.56, p < .00. The number of new queries which resulted in the same URL is small, as expected. The same URLs were often selected before and after url stripping from the query; this is also obviously expected. Users substituting related words in their query, i.e. word substitution, seemed to select different results. The marked difference between forming and expanding acronyms may be because users form acronyms to return to the same URL and are simply using a shortcut query, while users expand acronyms to look for new results, perhaps to disambiguate a common acronym. Also notable is that spelling correction caused few same URL clicks, suggesting that the correction helped fetch new, improved results.

[Figure 3: Proportions of URLs Clicked which were the Same vs. Different for each Reformulation Type. Query reformulation types shown: url stripping, form acronym, whitespace / punctuation, word reorder, abbreviation, stemming, same, expand acronym, substring, superstring, spelling correction, remove words, add words, word substitution, new. URL clicked: Same, Different.]

5.1.3 Rank Change of Clicked Results
A rank change is the difference between the rank of the result clicked in the initial query subtracted from the rank of the result clicked in the reformulated query. Successful reformulations should have a positive effect on rank change. Table 5 shows that all reformulations have positively affected the rank of the selected result. The rank change is positive if the user clicked a higher ranked result in the query reformulation. The most positive rank changes occurred with the reformulation types word substitution and add / remove words. Url stripping, changing whitespace and punctuation, and forming acronyms resulted in a small positive rank change. We suspect url stripping only had a small rank change effect because most clicks were for the same URL (see Section 5.1.2), which would likely have the same or a similar rank. Calculated rank changes were found to be significantly different (F(4, 3434438) = 6,670.58, p < 0.00).

5.1.4 Median Time between Queries
This metric measures how quickly users performed each type of reformulation. The average time was computed for each reformulation strategy. Our results in Table 5 show that complex reformulations such as word substitutions and forming acronyms took users longer than simple ones like spelling correction. Surprisingly, the median time for same query was 1 second; this suggests that some same queries may be made by computers rather than humans. As expected, new queries took the longest time since they are often part of different query sessions.
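The three per-pair metrics above (same URL, rank change, and time between queries) can be sketched for a list of ClickClick pairs as follows. This is a sketch under stated assumptions: the `ClickClickPair` type and its field names are hypothetical, URLs are compared at the domain level because that is all the logs retain, and the rank change is taken as initial rank minus reformulated rank so that a move to a higher ranked result is positive, as the text describes. For simplicity the time metric is computed on the same pairs, although in the study it is not restricted to ClickClick patterns.

```python
from statistics import mean, median
from typing import NamedTuple, Sequence


class ClickClickPair(NamedTuple):
    initial_rank: int          # rank clicked after the initial query
    reformulated_rank: int     # rank clicked after the reformulated query
    initial_domain: str
    reformulated_domain: str
    seconds_between: float     # time between the two queries


def rank_change(pair: ClickClickPair) -> int:
    # Assumed sign convention: positive when the second click is on a
    # higher ranked result (a smaller rank number).
    return pair.initial_rank - pair.reformulated_rank


def summarize(pairs: Sequence[ClickClickPair]) -> dict:
    """Summary metrics for one reformulation type; pairs must be non-empty."""
    same = sum(p.initial_domain == p.reformulated_domain for p in pairs)
    return {
        "mean_rank_change": mean(rank_change(p) for p in pairs),
        "same_url_share": same / len(pairs),
        "median_seconds": median(p.seconds_between for p in pairs),
    }
```

Grouping pairs by reformulation type and calling `summarize` on each group would yield one row per type, in the spirit of Table 5.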
Calculated times between queries were found to be significantly different according to ANOVA (F(4, 38344) = 48,35.05, p < 0.00).

Table 5: The median time (in seconds) between queries and mean rank change for each reformulation
Reformulation Type          Median Time (s) between Queries    Mean Rank Change
word substitution           73                                 +4.04
add words                   63                                 +3.9
substring                   33                                 +3.5
remove words                68                                 +3.0
word reorder                85                                 +.86
expand acronym              4                                  +.0
stemming                    33                                 +.00
new                         ,47                                +.9
abbreviation                35                                 +.39
superstring                 53                                 +.0
spelling correction                                            +.03
form acronym                03                                 +.64
whitespace & punctuation    7                                  +.54
url stripping               57                                 +.9
same                                                           .83

5.2 Discussion
Most findings were consistent with our expectations, evidence that the general approach of analyzing effectiveness metrics of reformulation strategies is useful. A surprising finding was that different reformulation strategies were effective depending on the action from the initial query. This emerged when comparing the ratios of actions in the reformulated query while controlling for the initial action. Word substitution reformulations were more likely to result in a Skip than a Click when the initial action was Skip, but resulted in a Click 3 times as often as a Skip when the initial action was Click. This is supported by metrics that show word substitution is correlated with different URL clicks as well as higher ranked clicks, suggesting that the user is interested in related but better results. In contrast, spelling correction is one of the least effective reformulations when the initial action is Click, but becomes one of the most effective reformulations when the initial action is Skip. This demonstrates that the prior action needs to be considered when determining the effectiveness of reformulation strategies.

5.3 Limitations
5.3.1 Lack of Context
Grimes et al. [14] note that while a vast amount of information can be discovered from aggregating data, query logs are the least rich source of data for individual events. Query logs only show the recorded actions and not the intent behind the queries. Identifying the user intent can be difficult or impossible without context, which is absent from logs. For example, query logs cannot tell whether a user did not click because the information they were looking for was found on the results page, or because the results were unsatisfying. Complementing this research with survey and user studies could address the lack of context.

5.3.2 Normalized Query Logs
The AOL query logs were released with normalized data, which may skew the results. Some queries were removed or modified for privacy reasons. The paths in the click URLs were stripped, leaving only the domains. Lastly, all queries were lowercased and most punctuation was removed, preventing us from detecting when the user performed a capitalization query reformulation.

5.3.3 Ambiguous Queries
Baeza-Yates et al. [5] note that even humans have difficulty manually classifying some queries, and the subjectivity involved can lead to errors. When manually separating query sessions, we encountered queries where it was ambiguous whether they were a reformulation. Queries can be related, but whether they fit the definition of reformulation, as part of the same information need, may still be unclear. If a human cannot accurately classify a query, a computer programmed by a human, subject to their limitations, will not be more successful. An example is a first query "american airlines" and a second query "delta airlines"; would they be considered part of the same information need? The intent behind the queries could be different (e.g. the user wants to find information about each airline), or part of the same information need (e.g. comparing prices between the airlines).

5.3.4 Search Engine Effects
The findings in this paper are influenced by the AOL search engine's implementation. Studying a different search engine's query logs may affect the reformulations used because of the different results displayed or the way it handles queries. A reformulation may work better for a different search engine. For example, users may learn over time not to reorder words in their query if they find it is ineffective due to the search engine's ignoring of word order. During the period when these logs were collected, AOL Search returned results from Google [Chowdhury, personal communication] and query suggestions were offered for some queries. Despite these effects, the results here are uncompromised because we study inter-reformulation rather than intra-reformulation effectiveness metrics.

6. APPLICATIONS
6.1 Interfaces Supporting Reformulation
Current search engines have integrated automatically generated query reformulation suggestions into their interface. However, they do not distinguish between new and reformulated queries. Users often perform query reformulation because they are dissatisfied with the results from their initial query. One possible interface change would be displaying differently the overlapping search results between the reformulated search and the initial search.
For example, when a user searches for "laptop" and then "widescreen laptop", the search engine can gray out the results in the second query that were also presented in the "laptop" query because it knows the user was not interested in those results. A related interface has already been built into web browsers since their inception: visited links turn purple while unvisited links are blue, which helps users avoid selecting results already visited. Email applications show the conversational history between recipients, which reminds them of past discussion. We suspect that query session history is not shown in search pages because existing reformulation detection methods are error prone. However, with our high precision classifier, many previous queries can be determined with confidence. While we may miss some previous queries, that is less crucial in this application. Showing query history fits the need of a high precision, low recall classifier. We can design and evaluate a search interface that shows a user's query history when the user is in a query session, i.e., performing query reformulation. It may help the user to see prior queries while they are reformulating.

6.2 Query Session Boundary Detection
Anick [3] notes that query reformulations exist in 56% of sessions, while Pass et al. [29] find the typical session contains .6 reformulations on average. A classifier like ours that identifies query reformulations solves the same problem as classifiers that identify query session boundaries (see Section . for discussion). Generally, when orthogonal classifiers are combined, the result is one that is better than either of its components. Since our classifier is rule-based, operating orthogonally to existing classifiers, it can theoretically be combined with an existing temporal or machine learning classifier. This will produce a reliable overall classifier for detecting session boundaries.

6.3 Intelligent Query Assistance
Understanding how users are reformulating queries and their effectiveness can help search engines provide better automatic query assistance. For example, a search engine should propose different reformulation strategies depending on the user's action after a query. Our findings have shown that expanding acronyms and spelling corrections are helpful reformulations when a user does not click on any result, but word substitutions and query expansion are more helpful when a user has clicked.
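Two of the ideas in this section are simple enough to sketch: choosing which kinds of suggestion to offer based on the user's prior action, and marking results in a reformulated search that were already shown for the initial query. Both functions and their names are hypothetical illustrations; the strategy lists simply restate the findings above, and a real system would need ranking and presentation logic that is not covered here.

```python
from typing import List, Sequence, Tuple


def suggest_strategies(clicked_initial_result: bool) -> List[str]:
    """Pick reformulation strategies to suggest, based on the prior action."""
    if clicked_initial_result:
        # Reported as more helpful after the user has clicked a result.
        return ["word substitution", "query expansion"]
    # Reported as helpful when the user did not click on any result.
    return ["expand acronym", "spelling correction"]


def mark_previously_shown(
    initial_results: Sequence[str], reformulated_results: Sequence[str]
) -> List[Tuple[str, bool]]:
    """Pair each reformulated result with a flag saying whether to gray it out."""
    seen = set(initial_results)
    return [(url, url in seen) for url in reformulated_results]
```

In the laptop example, results shared by the "laptop" and "widescreen laptop" result lists would come back flagged `True` and could be rendered in gray.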
6.4 Personalized Search
Reformulation strategies also greatly vary between users. Search engines can react differently depending on the user performing the search. A search engine that has a history of a user's queries will be able to offer query assistance suited for that user, or offer helpful suggestions about how the user can improve their searching and reformulating. For example, the search engine can suggest stemmed queries to a user who would benefit from a stemming reformulation, or it might display a message like "We noticed you have been using future tenses in your searches, we suggest changing to present tense for better results."

7. CONCLUSIONS
This paper describes the human side of query reformulation and contributes to our understanding of users' search interaction. We created a taxonomy of query reformulation strategies, built a high precision rule-based classifier to detect each type of reformulation, and analyzed query reformulations in the AOL query logs using metrics which are indicators of effectiveness. We found that different reformulation strategies have distinct characteristics when studied through the lens of click data. Certain reformulations like add/remove words, word substitution, acronym expansion, and spelling correction seem most effective. On the other hand, acronym formation and reordering words may be less beneficial to the user. We discovered that different reformulation strategies are useful depending on the user's behavior in response to the initial set of results. These findings benefit research in query session boundary detection, improve query assistance and personalized search, and propose design implications for user interfaces supporting reformulation.

8. ACKNOWLEDGMENTS
The authors would like to thank the anonymous reviewers, and Eytan Adar, Nicholas J. Belkin, Edie Rasmussen, and Jacob O. Wobbrock for helpful comments on earlier drafts. We also thank Xueming Huang for help with formal language notation.

9. REFERENCES
[1] Agichtein, E., Brill, E., and Dumais, S. (2006). Improving web search ranking by incorporating user behavior information. In SIGIR 06, 19-26.
[2] Agichtein, E., Brill, E., Dumais, S., and Ragno, R. (2006). Learning user interaction models for predicting web search result preferences. In SIGIR 06, 3-10.
[3] Anick, P. (2003). Using terminological feedback for web search refinement: a log-based study. In SIGIR 03, 88-95.
[4] Arlitt, M. (2000). Characterizing Web user sessions. ACM SIGMETRICS Performance Evaluation Review, 28(2), 50-63.
[5] Baeza-Yates, R., Calderón-Benavides, L., and González-Caro, C. (2006). The intention behind Web queries. In SPIRE 06, 98-109.
[6] Baeza-Yates, R., Hurtado, C., and Mendoza, M. (2004). Query recommendation using query logs in search engines. In EDBT 04, 588-596.
[7] Bruza, P.D. and Dennis, S. (1997). Query Reformulation on the Internet: Empirical Data and the Hyperindex Search Engine. In RIAO 97, 488-499.
[8] comScore. (2008). Baidu Ranked Third Largest Worldwide Search Property in Dec 2007. Retrieved Nov 30, 2008 from http://www.comscore.com/press/release.asp?press=08
[9] Dou, Z., Song, R., Yuan, X., and Wen, J. (2008). Are click-through data adequate for learning web search rankings? In CIKM 08, 73-82.
[10] Efthimiadis, E.N. (1996). Query Expansion. Annual Review of Information Science and Technology, 31, 121-187.
[11] Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. The MIT Press.
[12] Fox, S., Karnawat, K., Mydland, M., Dumais, S., and White, T. (2005). Evaluating implicit measures to improve web search. ACM Transactions on Information Systems, 23(2), 147-168.
[13] Guo, J., Xu, G., Li, H., and Cheng, X. (2008). A unified and discriminative model for query refinement. In SIGIR 08, 379-386.
[14] Grimes, C., Tang, D., and Russell, D.M. (2007). Query Logs Alone are not Enough. In WWW 07, Workshop on Logs Analysis.
[15] He, D., Göker, A., and Harper, D.J. (2002). Combining evidence for automatic web session identification. Information Processing & Management, 38(5), 727-742.
[16] Jansen, B.J., Spink, A., Blakely, C., and Koshman, S. (2007). Defining a session on Web search engines. Journal of the American Society for Information Science and Technology, 58(6), 862-871.
[17] Jansen, B.J., Spink, A., and Pedersen, J. (2005). A temporal comparison of AltaVista Web searching. Journal of the American Society for Information Science and Technology, 56(6), 559-570.
[18] Jansen, B.J., Zhang, M., and Spink, A. (2007). Patterns and transitions of query reformulation during web searching. International Journal of Web Information Systems, 3(4), 328-340.
[19] Joachims, T. (2002). Optimizing search engines using clickthrough data. In SIGKDD 02, 133-142.
[20] Joachims, T., Granka, L., Pan, B., Hembrooke, H., Radlinski, F., and Gay, G. (2007). Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search. ACM Transactions on Information Systems, 25(2).
[21] Jones, R. and Klinkner, K.L. (2008). Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs. In CIKM 08, 699-708.
[22] Jones, R., Rey, B., Madani, O., and Greiner, W. (2006). Generating query substitutions. In WWW 06, 387-396.
[23] Kraft, R. and Zien, J. (2004). Mining anchor text for query refinement. In WWW 04, 666-674.
[24] Lau, T. and Horvitz, E. (1999). Patterns of search: analyzing and modeling Web query refinement. In User Modeling 99, 119-128.
[25] Levenshtein, V.I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10, 707-710.
[26] Mitra, M., Singhal, A., and Buckley, C. (1998). Improving automatic query expansion. In SIGIR 98, 206-214.
[27] Murray, G.C., Lin, J., and Chowdhury, A. (2006). Identification of User Sessions with Hierarchical Agglomerative Clustering. In ASIS&T 06, 43(1), -5.
[28] Ozmutlu, S. (2006). Automatic new topic identification using multiple linear regression. Information Processing & Management, 42(4), 934-950.
[29] Pass, G., Chowdhury, A., and Torgeson, C. (2006). A picture of search. In InfoScale 06, 1.
[30] Porter, M.F. (1980). An algorithm for suffix stripping. Program, 14(3), 130-137.
[31] Radlinski, F. and Joachims, T. (2005). Query chains: learning to rank from implicit feedback. In SIGKDD 05, 239-248.
[32] Rieh, S.Y. and Xie, H. (2006). Analysis of multiple query reformulations on the web: the interactive information retrieval context. Information Processing & Management, 42(3), 751-768.
[33] Silverstein, C., Marais, H., Henzinger, M., and Moricz, M. (1999). Analysis of a very large web search engine query log. SIGIR Forum, 33(1), 6-12.
[34] Teevan, J., Adar, E., Jones, R., and Potts, M.A. (2007). Information re-retrieval: repeat queries in Yahoo's logs. In SIGIR 07, 151-158.
[35] Teevan, J. (2007). The re:search engine: simultaneous support for finding and re-finding. In UIST 07, 23-32.
[36] Whittle, M., Eaglestone, B., Ford, N., Gillet, V.J., and Madden, A. (2007). Data mining of search engine logs. Journal of the American Society for Information Science and Technology, 58(14), 2382-2400.