Predicting Current User Intent with Contextual Markov Models


Julia Kiseleva, Hoang Thanh Lam, Mykola Pechenizkiy
Department of Computer Science, Eindhoven University of Technology
P.O. Box 513, NL-5600MB, the Netherlands
{t.l.hoang, j.kiseleva, m.pechenizkiy}@tue.nl

Toon Calders
Computer & Decision Engineering Department, Université Libre de Bruxelles
Avenue F.D. Roosevelt 50, B-1050 Bruxelles, Belgium
Toon.Calders@ulb.ac.be

Abstract: In many web information systems like e-shops and information portals, predictive modeling is used to understand user intentions based on their browsing behavior. User behavior is inherently sensitive to various contexts. Identifying such relevant contexts can help to improve the prediction performance. In this work, we propose a formal approach in which the context discovery process is defined as an optimization problem. For simplicity we assume a concrete yet generic scenario in which context is considered to be a secondary label of an instance that is either known from the available contextual attribute (e.g. user location) or can be induced from the training data (e.g. novice vs. expert user). In an ideal case, the objective function of the optimization problem has an analytical form enabling us to design a context discovery algorithm solving the optimization problem directly. An example with Markov models, a typical approach for modeling user browsing behavior, shows that the derived analytical form of the optimization problem provides us with useful mathematical insights of the problem. Experiments with a real-world use-case show that we can discover useful contexts allowing us to significantly improve the prediction of user intentions with contextual Markov models.

Keywords: intent prediction; context-awareness;

I. INTRODUCTION

In many web applications, contextual information is available along with the data. This information is very useful for predictive analytics. For example, in a weblog of users' activities on a website, the contextual information such as the user's (current) location, used device or gender can split users into subgroups sharing similar backgrounds. Users in the same group usually behave in a similar way. Therefore, their intentions are easier to recognize when the predictive models cleverly leverage the available contextual information, e.g. training and employing a local model for each of the contexts recognized to be useful.
There are several important groups of research questions in context-aware predictive analytics related to context discovery, context management and context integration into predictive modeling. We focus on one of the most general questions from the first group: how to discover a set of useful contexts from data. In many cases, contextual information is provided explicitly in the form of additional features describing e.g. the user's current location. These explicit contexts are different from implicit contexts that can only be inferred from data. For instance, we may have no explicit information whether a user is very familiar with a particular website's functionality or falls into the category of novice users. However, a context discovery approach may be able to infer such information automatically based on the number of visits or access patterns of the user. For simplicity, but without losing the generality of our study, we assume that all the contexts are non-overlapping and that a user is at any moment associated with only one contextual category.

If we want to focus predictive modeling on a subset of the most promising contexts, we can perform data exploration and use domain expertise for choosing an appropriate subset of contexts. However, for complicated and large-scale datasets, deep understanding of data by exploration can be rather limited. In addition to that, in many cases, domain expertise may not always be available (yet). Therefore, an alternative straightforward context selection approach is direct evaluation of every subset of the set of target contexts through predictive accuracy testing. This solution is computationally demanding when the set of target contexts has high cardinality. Moreover, in several cases, evaluation must be done in an online setting with a real information system in operation, which is also very expensive and time demanding.

In this work, we formulate (useful) context discovery as an optimization problem. Even when domain experts are not available, data exploration gives only partial knowledge about the data, and direct context evaluation and testing are expensive, it is still possible to select a good set of contexts if we know the closed form of the objective function of the optimization problem. On one hand, an analytical form of the objective function provides us with useful mathematical insights of the problem. It may give us a good hint for context discovery even in the case that the optimization problem is hard.
On the other hand, it enables us to evaluate the contexts in an off-line setting before performing an online testing with a small set of selective contexts. We focus on Markov models that are commonly used for modeling web user behavior. Our analysis shows that the objective function calculated as the expected accuracy of prediction using the Markov model has a closed form. Analyzing the analytical form of the objective function helps us to find interesting properties of adopting contextual information for prediction with a Markov model. Namely, if the data are generated by a Markov model, any context preserving the Markovian property in each contextual category is useful in the sense that the accuracy of prediction using a Markov model built for each category of the context is at least as large as the expected accuracy of the Markov model built for the whole data. This property is a theoretical judgment of context-aware methods for prediction using Markov models.

We conduct experiments on a real dataset to illustrate that 1) if useful contexts are discovered, the local Markov models predict user intentions statistically significantly better than the global model, and 2) local Markov models perform well, i.e. not significantly worse than the global model, even when the contexts are absolutely not useful and have a substantially smaller number of instances to induce local models.

The rest of the paper is organised as follows. In Section II we introduce related work on context-awareness in supervised learning applications. In Section III we introduce definitions of context-aware predictive analytics. In Section IV we introduce a specific case of using contextual information for improving the prediction ability of Markov models. In Section V we propose the context discovery method based on navigation graph clustering. We present our experimental study using real Web portal data sets in Sections VI and VII. Section VIII concludes.

II. RELATED WORK

Many studies have demonstrated that integrating context-awareness into predictive modeling helps to better understand user information needs and improves the effectiveness of ranking [18], query classification [6] and recommendations [12]. The first approaches for context-aware modeling assumed that contexts were given explicitly and focused on the integration of this additional data source into the space of predictive features, or using it for learning local models and hybrid models, for correction of model outputs or performing contextual selection of instances to learn a model from [15]. Examples of contextual information include current date, season, weather [5], users' location [16] and emotional status. In machine learning research the term context usually characterizes the features that do not determine or influence the class of an object directly, but improve predictive performance when used together with other predictive features [15]. Recent approaches consider how to derive such features. E.g. in [19] feature transformations requiring the resulting contexts to be independent from the class labels have been explored. In this work we consider both cases, when contextual information is given and when we need to derive it based on some assumptions on what kind of useful hidden information may be present in the data.

User modelling for personalization is a fundamental and challenging problem.
For modeling user behavior (navigation) on the Web, the use of Markov models is a reasonable choice as they are compact, simple and based on well-established theory. Several Markov models were proposed for modelling user Web data: first-order Markov model, hybrid-order tree-like Markov model [10], prediction by partial match forest [7], kth-order Markov models [9], and variable order Markov model (VOMM) [4], which provide the means to capture both large and small order Markov dependencies. Recently, it was shown in [8] on a large data set that it is better to use the variable order Markov models for this purpose. Other, perhaps the most commonly used techniques, are based on Hidden Markov Models (HMM). However, work with HMMs typically requires understanding of the domain and very large training samples [2]. In [11] a hierarchical clustering approach was proposed for decomposing users' web sessions into non-overlapping temporal segments. In the experimental study it was shown that such temporal context can be identified and used for more accurate next user action prediction with Markov models. In this work we also study context-awareness with the class of Markov models.

III. CONTEXTUAL PREDICTION

This section discusses generalized definitions of contextual predictive analytics. Let $\mathcal{D}$ be the set of all possible data instances. As a running example, we consider a website containing five different activities with categorical labels a, b, c, d and e. Every user visiting the website produces a sequence of transition activities corresponding to the categories that the user has visited. In this example, $\mathcal{D}$ is the set of all possible sequences of activities from the categories a, b, c, d and e.

Let $\Theta = C_1 \times C_2 \times C_3 \times \cdots \times C_N$ be the space of all possible contextual features associated with every data instance, where each $C_i$ is a context. Denote $\theta_s \in \Theta$ as the contextual feature vector associated with a sequence $s$. Let $M : \Theta \times \mathcal{D} \to V$ be a predictive model that maps each test sequence $s \in \mathcal{D}$ associated with the contextual information $\theta_s$ to the decision space $V$. Let $F(s, M(\theta_s, s)) : \mathcal{D} \times V \to \mathbb{R}$ be the function that evaluates how good a model is. For example, in the case of predicting the next activity which the user will perform, the decision space $V$ is the same as the data instance space, i.e. $V \equiv \mathcal{D}$. An example of the evaluation function is the number of true predictions made by $M$ over the test instance $s$. For instance, if the predicted sequence $M(\theta_s, s)$ agrees with the actual sequence $s$ on three activities, then the model makes three true predictions, i.e. $F(s, M(\theta_s, s)) = 3$. Let $T \subseteq \mathcal{D}$ be a set of test instances and denote $Pr(s)$ as the probability that $s \in T$.
The expectation of the evaluation function $F(s, M(\theta_s, s))$ over the test set is defined as

$$E[T, M] = \sum_{s \in T} Pr(s) \cdot F(s, M(\theta_s, s)).$$

The value of the expectation $E[T, M]$ can be considered as an objective that we need to optimize. Assume that $M^*$ is the optimal model, i.e. $M^* = \arg\max_M E[T, M]$. Let $C$ be a context with $n$ categories, $C = \{c_1, c_2, \ldots, c_n\}$, associated with each data instance $s \in \mathcal{D}$. A context may have different categories, e.g. the region context can be divided into four categories such as Europe, Africa, America, or Asia. For simplifying the discussion, we consider contexts that have only two categories. The discussion of the general case with more than two categories is very similar.

Assume that we have a context $C$ with two categories $c_1$ and $c_2$ dividing the test set into two disjoint subsets $T_1$ and $T_2$ such that $T = T_1 \cup T_2$. Denote $M_1$ and $M_2$ as two predictive models built for the categories $c_1$ and $c_2$ respectively. Let $P(c_1)$ and $P(c_2)$ be the probabilities that a test instance belongs to the category $c_1$ and $c_2$ respectively.

Theorem 1 (Contextual Principle): Let $M^*$ be an optimal model on $T$; then it is a combination of $M_1^*$ and $M_2^*$, where $M_1^*$ is an optimal model for $T_1$ and $M_2^*$ is an optimal model for $T_2$.

Proof: Because $M_1^* = \arg\max_{M_1} E[T_1, M_1]$ and $M_2^* = \arg\max_{M_2} E[T_2, M_2]$ we must have $E[T_1, M_1^*] \ge E[T_1, M^*]$ and $E[T_2, M_2^*] \ge E[T_2, M^*]$. We further derive:

$$P(c_1) E[T_1, M_1^*] \ge P(c_1) E[T_1, M^*] \quad (1)$$
$$P(c_2) E[T_2, M_2^*] \ge P(c_2) E[T_2, M^*] \quad (2)$$
$$P(c_1) E[T_1, M_1^*] + P(c_2) E[T_2, M_2^*] \ge E[T, M^*] \quad (3)$$

On the other hand, since $M^* = \arg\max_M E[T, M]$, we have:

$$E[T, M^*] \ge P(c_1) E[T_1, M_1^*] + P(c_2) E[T_2, M_2^*] \quad (4)$$

From the two inequalities (3) and (4) we imply that $E[T, M^*] = P(c_1) E[T_1, M_1^*] + P(c_2) E[T_2, M_2^*]$. In other words, $M^*$ is a combination of $M_1^*$ and $M_2^*$.

Theorem 1 shows that the problem of finding the best model for every test instance can be solved by considering the subproblems of finding optimal models for test instances in each individual contextual category. This result provides us with a theoretical judgment for personalization and exploitation of contextual information in predictive analytics. Nevertheless, in practice finding an optimal model for each contextual category is usually as hard as finding an optimal model for the whole data. Indeed, it is usually the case that the type of model is chosen in advance, e.g. Markov models. The model's parameters are estimated from training data $D$. Under this circumstance, contextual predictive analytics seeks a context such that it divides the training data into two subsets $D_1$ and $D_2$ and the predictive models trained on $D_1$ and $D_2$ improve the predictive performance in comparison to the model trained on the whole training data. To this end, we define useful contexts as follows:

Definition 1 (Useful Context): Given a model $M$ built based upon the whole training data $D$, and two models $M_1$, $M_2$ built based upon $D_1$ and $D_2$ corresponding to each contextual category of a context $C$ respectively, the context $C$ is useful if and only if $E[T_1, M_1] \ge E[T_1, M]$ and $E[T_2, M_2] \ge E[T_2, M]$.

IV. CONTEXTUAL MARKOV MODELS

This section discusses a specific case of using contextual information for improving the prediction ability of Markov models. In particular, we are given a log of sequences of activities performed by users in a web application. The task is to predict the next activity in a sequence. A Markov model is chosen as the predictive model for this problem.
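As an illustration of the task just described, the following is a minimal sketch (not the authors' implementation) of how a first-order Markov chain could be estimated from a session log by counting transitions and then used to predict the next activity. The function names `fit_first_order` and `predict_next` and the toy sessions are assumptions introduced only for this example.

```python
from collections import Counter, defaultdict


def fit_first_order(sessions):
    """Estimate P(next | current) by counting transitions in the sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    model = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        model[current] = {nxt: c / total for nxt, c in followers.items()}
    return model


def predict_next(model, current):
    """Return the most probable next activity, or None for an unseen state."""
    followers = model.get(current)
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical toy log over the activity labels a..e of the running example.
sessions = [list("abcab"), list("abd"), list("cab"), list("eab")]
model = fit_first_order(sessions)
print(predict_next(model, "a"))
```

In this toy log every occurrence of `a` is followed by `b`, so the predictor returns `b` for the current state `a`; a state that never appears as a source of a transition yields `None`.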
We are interested in finding a useful context such that Markov models built for each category of the context improve the prediction performance compared to the Markov model built for the whole data. We call this problem the contextual Markov model. To simplify the discussion, we only consider the special case of the first-order Markov model, or Markov chain. Generalization of our discussion to Markov models of any order is similar to that special case.

Let us denote $\mathcal{A} = \{a_1, a_2, \cdots, a_n\}$ as the set of all possible activities. A Markov chain $M$ is associated with a transition probability matrix $[P(a_j \mid a_i)]$, where $P(a_j \mid a_i)$ is the probability of transition from the activity $a_i$ to the activity $a_j$. For any activity $a \in \mathcal{A}$, we denote $m(a)$ as the activity with the highest transition probability from the activity $a$, i.e. $m(a) = \arg\max_{b \in \mathcal{A}} Pr(b \mid a)$. Given that the current state is the activity $a$, if the data follow the Markovian property then $m(a)$ is always the best prediction of the next state. Therefore, we consider a predictor which always chooses the most probable transition for the next state. If the test sequences in $T$ are random samples from the Markov model $M$, the expected accuracy of the predictor, i.e. the expectation of the true prediction rate, can be calculated as follows:

$$E[T, M] = \sum_{a \in \mathcal{A}} P(a) P(m(a) \mid a) \quad (5)$$

Let $C = \{c_1, c_2\}$ be any context and let $M_1$, $M_2$ be two Markov chains built for the categories $c_1$ and $c_2$ respectively. Consider a new predictive model that uses $M_1$ to predict test sequences belonging to $T_1$, corresponding to the first category $c_1$, and uses $M_2$ to predict test sequences belonging to $T_2$, corresponding to the second category $c_2$. We also denote $[P_1(a_j \mid a_i)]$ as the transition matrix of the Markov model $M_1$ and $[P_2(a_j \mid a_i)]$ as the transition matrix of the Markov model $M_2$. If the two test sets $T_1$ and $T_2$ contain randomly sampled sequences from the two Markov models $M_1$ and $M_2$, then the expected accuracy of this prediction can be calculated as follows:

$$E[T, M_1, M_2] = P(c_1) E[T_1, M_1] + P(c_2) E[T_2, M_2] \quad (6)$$

where $P(c_1)$ and $P(c_2)$ stand for the probability of the test sequence belonging to the first and the second category respectively, and:

$$E[T_1, M_1] = \sum_{a \in \mathcal{A}} P_1(a) P_1(m_1(a) \mid a) \quad (7)$$
$$E[T_2, M_2] = \sum_{a \in \mathcal{A}} P_2(a) P_2(m_2(a) \mid a) \quad (8)$$

Theorem 2: Assume that the test data possess the Markovian property and this property holds for every category of a context $C$.
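Moreover, assume that the training data $D$ together with $D_1$ and $D_2$ are large enough such that we can learn accurate Markov models $M$, $M_1$ and $M_2$. Then that context is useful, i.e. $E[T_1, M_1] \ge E[T_1, M]$ and $E[T_2, M_2] \ge E[T_2, M]$.

Proof: Under the category $c_1$, let $P(m(a), a \mid c_1)$ be the probability of the event indicating that the current activity is $a$ and the next activity is $m(a)$. We have:

$$E[T_1, M] = \sum_{a \in \mathcal{A}} P(m(a), a \mid c_1) \quad (9)$$
$$= \sum_{a \in \mathcal{A}} P(m(a) \mid a, c_1) \cdot P(a \mid c_1) \quad (10)$$
$$= \sum_{a \in \mathcal{A}} P(m(a) \mid a, c_1) \cdot P_1(a) \quad (11)$$
$$\le \sum_{a \in \mathcal{A}} P_1(m_1(a) \mid a) \cdot P_1(a) \quad (12)$$
$$= E[T_1, M_1] \quad (13)$$

The inequality $E[T_2, M] \le E[T_2, M_2]$ can be derived in a similar way, from which the theorem is proved.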
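The inequality of Theorem 2 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not part of the original study: it evaluates Equations (5) and (6) for two hypothetical two-state chains, builds the predictor $m(a)$ of the pooled (global) chain, and compares its accuracy on each category with that of the local predictors. All probabilities, the equal category weights, and the helper names `expected_accuracy` and `pooled_predictor` are made up for illustration.

```python
def expected_accuracy(state_probs, transitions, predictor=None):
    """Sum over a of P(a) * P(guess(a) | a); the default guess is the argmax (Eq. 5)."""
    total = 0.0
    for a, p_a in state_probs.items():
        row = transitions[a]
        guess = predictor[a] if predictor else max(row, key=row.get)
        total += p_a * row.get(guess, 0.0)
    return total


def pooled_predictor(weights, state_probs, transitions):
    """m(a) of the global chain: argmax over b of sum_k P(c_k) P_k(a) P_k(b | a)."""
    joint = {}
    for w, pi, trans in zip(weights, state_probs, transitions):
        for a, p_a in pi.items():
            for b, p_ba in trans[a].items():
                joint.setdefault(a, {}).setdefault(b, 0.0)
                joint[a][b] += w * p_a * p_ba
    return {a: max(row, key=row.get) for a, row in joint.items()}


# Hypothetical chains for the two categories c1 and c2 (toy values).
P1 = {"a": {"a": 0.1, "b": 0.9}, "b": {"a": 0.8, "b": 0.2}}
P2 = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}
pi1 = {"a": 0.5, "b": 0.5}   # assumed state occurrence probabilities in c1
pi2 = {"a": 0.5, "b": 0.5}   # assumed state occurrence probabilities in c2
weights = [0.5, 0.5]         # assumed P(c1), P(c2)

m_global = pooled_predictor(weights, [pi1, pi2], [P1, P2])
local_1 = expected_accuracy(pi1, P1)             # E[T1, M1], Eq. (7)
local_2 = expected_accuracy(pi2, P2)             # E[T2, M2], Eq. (8)
global_1 = expected_accuracy(pi1, P1, m_global)  # E[T1, M]
global_2 = expected_accuracy(pi2, P2, m_global)  # E[T2, M]
contextual = weights[0] * local_1 + weights[1] * local_2  # Eq. (6)
pooled = weights[0] * global_1 + weights[1] * global_2
print(local_1, global_1, local_2, global_2, contextual, pooled)
```

With these toy numbers the global predictor coincides with the local one on the first category and is worse on the second, so the contextual accuracy of Equation (6) is about 0.75 against about 0.60 for the pooled chain.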

Fig. 1. An example of transition distributions from a to the other states. Two contexts $C = \{c_1, c_2\}$ and $C' = \{c'_1, c'_2\}$ have different transition distributions. The most probable transition paths are highlighted with red-purple color.

Theorem 2 shows that if the data possess the Markovian property then exploiting any context preserving the Markovian property is always beneficial. This theorem can be considered as a theoretical judgment for using contexts to improve a Markov model. In practice, Markov models are usually learnt from training data. The accuracy of the estimation of the model's parameters is highly dependent on the amount of available data. When we exploit a context, the training data is split into smaller portions by the context, which may cause a decline in the accuracy of parameter estimation. Finally, the contextual Markov problem is defined as an optimization problem as follows:

Definition 2 (Contextual Markov): Given training data $D$, find the context $C = \{c_1, c_2\}$ splitting $D$ into $D_1$ and $D_2$ such that the Markov models $M_1$ and $M_2$ learnt from $D_1$ and $D_2$ respectively maximize the evaluation function on the test set $T$: $E[T, M_1, M_2]$.

V. CONTEXT DISCOVERY TECHNIQUES

A. Clustering-Based Approach

In order to illustrate the key idea behind our proposed algorithm, consider an example in Figure 1 where transition probabilities $P(x \mid a)$ ($x \in \{a, b, c, d, e\}$) from the current state a to the other states are shown. Figures 1.b and 1.c show the transition probability $P(x \mid a, C)$ ($x \in \{a, b, c, d, e\}$) in two different contexts $C = \{c_1, c_2\}$ and $C' = \{c'_1, c'_2\}$. In Figure 1.a, all transitions are equally probable. Therefore, the transition probability distribution from a has very high entropy, making prediction ineffective. If we use the predictor always predicting the most probable transition, the expected true prediction rate is 0.2. The situation changes when we consider the two contexts $C$ and $C'$. In particular, in Figure 1.b, the distributions $P(x \mid a, c_1)$ and $P(x \mid a, c_2)$ both have lower entropy than the transition distribution $P(x \mid a)$. Under the context $C$, the true prediction rate is $P(m(a) \mid a, c_1) = 0.3$ in the category $c_1$ and $P(m(a) \mid a, c_2) = 0.3$ in the second category. Similarly, under the context $C'$ the true prediction rate is $P(m(a) \mid a, c'_1) = 0.4$ in the category $c'_1$ and $P(m(a) \mid a, c'_2) = 0.4$ in the second category. Therefore, by exploiting the context $C'$ we may increase the prediction accuracy from 0.2 to 0.4.
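The effect described in this example can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical per-category distributions chosen only so that their maxima match the rates quoted in the text (0.3 and 0.4) and so that an equal mixture of the two categories gives the uniform distribution; the actual values plotted in Figure 1 may differ.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def hit_rate(dist):
    """Accuracy of always predicting the most probable transition."""
    return max(dist)


# Transitions from the current state to a, b, c, d, e (assumed values).
pooled = [0.2, 0.2, 0.2, 0.2, 0.2]
context_c = [[0.3, 0.1, 0.3, 0.1, 0.2], [0.1, 0.3, 0.1, 0.3, 0.2]]
context_c_prime = [[0.4, 0.0, 0.4, 0.0, 0.2], [0.0, 0.4, 0.0, 0.4, 0.2]]

print("no context:", hit_rate(pooled), round(entropy(pooled), 3))
for name, categories in [("C", context_c), ("C'", context_c_prime)]:
    for dist in categories:
        print(name, hit_rate(dist), round(entropy(dist), 3))
```

The split with the lower per-category entropy (the one labelled C' here) also gives the higher hit rate, which is the intuition behind grouping similar sessions before fitting local models.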
Common sense tells us that the prediction is easier if the context splits the data into homogeneous groups. In doing so, users with similar behavior are grouped together, which may result in low-entropy transition distributions. A possible clustering algorithm to group users is an agglomerative hierarchical clustering algorithm which uses the objective function $E[T, M_1, M_2]$ as a principle for merging clusters. Overall, our approach consists of two important components: (1) a clustering algorithm which groups training sequences into groups with similar sequences and (2) an alignment procedure which assigns a new test sequence to clusters given the partial content of the test sequence seen so far.

B. Clustering by Community Detection

A general representation of users' historical behavior is given as a log of web sessions $\mathcal{D} = \{S_1, S_2, \ldots, S_n\}$ where each web session is a sequence of states $S_i = (a_1, a_2, \cdots, a_m)$ corresponding to the historical browsing activities of a user. In our case the users' actions are categorized by type: searches, clicks on ads or homepage visits. A complete set of the used categories is presented in Figure 2 as graph nodes. However, the set of all possible activity states depends on the needs of a particular service, e.g. a visit of the home page can be considered as an activity. Thus, activities and their possible orderings within user web sessions can be summarized as a user navigation graph.

Definition 3 (User navigation graph): A user navigation graph is a directed and weighted graph $G = (V, E)$, where $V$ is a set of vertices corresponding to all possible user actions $\mathcal{A}$ and $E$ is the set of edges $(a_i, a_j)$. Each edge $e$ of $G$ is associated with a weight $w(e)$ indicating the transition probability between the two incident vertices of the edge.

Depending on their experience, users may perform different activities by visiting different states in the navigation graph. Therefore, we propose a user action clustering method based on community detection in the navigation graph. We want to understand if there are any groups of nodes in the navigation graph and then use this knowledge to characterize the users' behaviour.

Fig. 2. A user navigation graph. The meaning of the nodes is described in detail in Section VI-A. A graph partitioning algorithm is used to detect two communities in the graph: the red states are associated with expert users and the green states are associated with novice users. Node labels in the figure: Click on Country Link; Program Impression in search results; Field View; Quick Search; University Spotlight Impression; Click on University Link; Empty search result; Program Impression in related programs; Banner Click; Refine Search; Program Impression in landing-page; University Impression on nearby universities; Program link click; Basic Search; Submit Inquiry; X node; Submit Question.

Intuitively there are two types of user behaviour on a site: (1) expert users, who are experienced with the website interface or search extensively to find the required information, and (2) novice users, who need more time to learn about the website or are not much interested in the content.

Assume that we have $n$ graph partitions obtained by using a community detection method. Discovered groups of states may be interpreted after analysis, e.g. $V_i$ corresponds to a novice user's behaviour. However, if $n$ is too big to analyze then we have clusters $\{V_i\}_{i=1}^{n}$. To simplify the discussion, we consider two clusters, $V_{exp}$ and $V_{nov}$. $V_{exp}$ corresponds to states which are visited by expert users (red states in Figure 2) and $V_{nov}$ consists of activities related to novice users (green sector in Figure 2). Having clusters of states, we can align each web session with the corresponding clusters. If a web session contains both red and green states, the alignment is performed by examining the sequence from left to right: if an expert state is encountered in the sequence then the method assigns the sequence to $V_{exp}$; otherwise, the session is assigned to $V_{nov}$ so far. Let us call this procedure sequential user alignment.

For example, consider a session $S_i$ of three states in which the first two states belong to $V_{nov}$ and the third belongs to $V_{exp}$. The alignment function $A$ aligns the sequence $S_i$ to clusters as follows. First, $A$ sees a novice state and $S_i$ is temporarily assigned to $V_{nov}$. Then, $A$ again sees another novice state and $S_i$ remains in $V_{nov}$. In the third step, $A$ encounters an expert state and $S_i$ is moved to $V_{exp}$. In this case, the type of the user's behaviour is the context. The states in the navigation graph which are visited by a user during a session $S_j$ are the contextual features $\theta_{S_j}$, and $\{V_i\}_{i=1}^{n}$ represent the contextual categories. The proposed alignment approach allows us to effectively align training sequences to clusters.
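The sequential user alignment procedure described above can be sketched as a single left-to-right scan. This is a minimal illustration, assuming that the set of expert states is already known from community detection; the state labels and the function name `align_prefixes` are hypothetical.

```python
def align_prefixes(session, expert_states):
    """Return the cluster assigned after each observed state of the session.

    A session stays 'novice' until the first expert state is seen and is
    'expert' from that point on.
    """
    label = "novice"
    labels = []
    for state in session:
        if state in expert_states:
            label = "expert"
        labels.append(label)
    return labels


# Hypothetical clusters over the activity labels of the running example.
expert_states = {"d", "e"}          # assumed V_exp
session = ["a", "b", "d", "a"]      # states a, b, c are assumed to be in V_nov

print(align_prefixes(session, expert_states))
```

Because the label is recomputed after every state, the same function can be applied to a test session as it unfolds: the local model used for the next prediction is the one matching the most recent label.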
More importantly, the alignment is very convenient for sequentially aligning test sequences to clusters. In the experiments, we show that this approach works well with a specific real-world use-case. Moreover, the proposed approach can be easily generalized to any sequence data. Blondel et al. [3] introduced an algorithm that finds high modularity partitions of networks in a short time and that unfolds a complete hierarchical community structure for the network, thereby giving access to different resolutions of community detection.

C. Clustering by Geographical Position

Our dataset contains the user location. In the literature, it was shown that the users' location is useful contextual information in many applications [16], [14], [1]. A context based on geographical location can have different levels of granularity like continent, country, city and so on. In our experiments we concentrate on the continent level due to limitations from the evaluation side.

Grouping sessions. We use user IP addresses as contextual features, so $\theta_s = IP$ is the contextual vector associated with session $s$. We define six contextual categories: $C_{geo} = \{C_1 = \text{Europe}, C_2 = \text{Africa}, C_3 = \text{North America}, C_4 = \text{South America}, C_5 = \text{Asia}, C_6 = \text{Oceania}\}$.

Geographical alignment function. $\mathcal{D}$ is divided into six disjoint training sets associated with the continents: $\mathcal{D}_{Europe}$, $\mathcal{D}_{Africa}$, $\mathcal{D}_{Asia}$, $\mathcal{D}_{NorthAmerica}$, $\mathcal{D}_{SouthAmerica}$, $\mathcal{D}_{Oceania}$.

VI. EXPERIMENTAL STUDY

A. Data

The anonymized dataset for our case study comes from StudyPortals.eu. The web portal provides information about various study programmes in Europe. We use data that was collected in May 2012; the dataset contains over 350,000 sessions (see footnote 1). Each user's session is recorded as follows: user IP address, the two timestamps indicating the start and the end of the session, and the sequence of the user's actions. StudyPortals.eu has a categorisation of actions a user can perform on their website to describe users' transitions. This taxonomy is used to describe users' paths on the website, which can be transferred into a navigation graph. The users' navigation for StudyPortals.eu is demonstrated in Figure 2, where the possible values for users' actions $a_i$ are presented: $\mathcal{A} = \{a_1, a_2, \ldots, a_{16}\}$. The general types of action on the site are view (e.g. view study field with additional information), click (click on banner, or country information link, or university link, or program link), submission (user feedback through question or inquiry submit), impression (refers to the recommendation actions), and search (quick search: simple search from homepage; basic search: when the user uses the special search page; refine search: when additional filters are used). X node is the action which is out of the categorisation scope.

Footnote 1: The dataset is publicly available as a benchmark. Please refer to the Code and Datasets section at http://www.win.tue.nl/~mpechen/projects/capa/.

B. Experiment Design

Our sequential dataset of users' actions is randomly split into two parts: the test set $T$ contains 20% of the session log and the training set ($Tr$) contains 80% of the session log.

Training phase. The whole $Tr$ is used to learn the global predictive model $M$ (Glob.). Using the sequential user alignment method we divide the whole training dataset into subsets, one for each of the contexts. Assume we have a context $C$ with $k$ categories $\{c_i\}_{i=1}^{k}$ dividing the training set into $k$ disjoint subsets $\{Tr_i\}_{i=1}^{k}$ such that $Tr = Tr_1 \cup Tr_2 \cup \ldots \cup Tr_k$. Let us denote $\{M_i\}_{i=1}^{k}$ as the $k$ predictive models built for the categories $\{c_i\}_{i=1}^{k}$ respectively.

Testing phase. During the testing stage we calculate the accuracy of the global model $M$, $A[T, M]$ (Equation 5). The $k$ categories $\{c_i\}_{i=1}^{k}$ divide the test set $T$ into $k$ disjoint subsets $\{T_i\}_{i=1}^{k}$ such that $T = T_1 \cup T_2 \cup \ldots \cup T_k$. Let $P(c_i)$ be the probability that a test instance belongs to the category $c_i$; then $A[T, \{M_i\}_{i=1}^{k}] = \sum_{i=1}^{k} P(c_i) A[T_i, M_i]$ (Equation 6).

As predictive models we use the following Markov models: FOMM, CTW [17] and PST [13]. To calculate the final metrics we run the described evaluation procedure 10 times and collect average metrics. We run the evaluation cycle for the two discussed contexts: user type (novice vs. expert) and geographical location.

VII. EXPERIMENTAL RESULTS

A. Context by Geographical Position

In this experiment we want to evaluate the impact of a context based on the geographical location of the user. The method to obtain the context based on geographical position is described in detail in Section V-C. We use a mapping from the contextual feature IP address to continent as the alignment method to cluster the sessions. Therefore, we have six contextual categories: EU (users from Europe), AS (users from Asia), AF (users from Africa), NA (users from North America), SA (users from South America), and OC (users from Oceania).
As the output of the training stage we derive six separate predictive models $\{M_i\}_{i=1}^{6}$, one trained for each continent, and one global prediction model $M$ that is trained on the whole dataset. The resulting accuracy is shown in Table I. Clearly, the user's geographical location is not a useful context according to Definition 2, because the way the context divides the data does not give us any benefit in terms of the accuracy $A[T, \{M_i\}_{i=1}^{6}]$. The relative improvements of the predictive accuracy are almost always negative. Only in the case of PST does the location context give a slight improvement. Therefore, the geographical location is not a useful context for our domain in this particular use-case.

TABLE I. AVERAGE ACCURACIES (± STANDARD DEVIATION) OF USER INTENT PREDICTION WITH THE GLOBAL MARKOV AND LOCAL (LOCATION CONTEXT) MARKOV MODELS. GLOB. - GLOBAL MODEL ACCURACY, W.SUM - WEIGHTED SUM OF LOCAL MODEL ACCURACIES (EQUATION 6), RI - RELATIVE IMPROVEMENT COMPARED TO THE GLOBAL MODELS.

Cat. i    Size i    FOMM (%)    CTW (%)     PST (%)
Glob.     1         40.6±0.3    49.2±4.3    45.3±
EU        0.45      45.0±0.4    48.3±4.4    47.3±3.9
AS        7         38.9±0.4    47.4±4.1    44.4±3.3
AF        0.08      34.4±0.7    48.5±3.2    48.4±3.2
NA        0.16      35.8±0.8    48.3±5.2    49.1±4.9
SA        0.02      41.7±1.7    48.1±1.6    5±4.1
OC        0.01      46.8±2.8    45.2±6.4    49.4±9.1
W.Sum     1         40.1±0.4    48.3±2.6    46.1±1.4
RI        -         -1.2        -1.8        +1.8

B. Context by Community Detection

In this experiment we want to evaluate the impact of the context based on discovered communities in the user navigation graph. The method to obtain the context based on the type of users' behaviour is described in detail in Section V-B. By applying this method we obtain two communities in our users' navigation graph with modularity equal to 0.174. $M$ is the global model that is built on the whole $Tr$. We use the alignment method and as a result we obtain two clusters: $V_{expert}$ and $V_{novice}$. These clusters are used to learn two contextual models: $M_{expert}$ and $M_{novice}$. The resulting accuracy is shown in Table II. The relative improvement compared to the performance of the global model is high, up to 18.9% for the PST predictor. Distinctly, the type of users' behaviour is a useful context according to Definition 2.
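The W.Sum and RI rows of the result tables follow from two small formulas: the weighted sum of Equation 6 and the relative improvement of a local result over the global one. The sketch below is an illustration only; the function names are assumptions, the RI check uses the Glob. and W.Sum accuracies of Table I, and the shares and accuracies in the weighted-sum example are made-up values.

```python
def weighted_accuracy(shares, accuracies):
    """Equation 6: sum over categories of P(c_i) * A[T_i, M_i]."""
    return sum(p * a for p, a in zip(shares, accuracies))


def relative_improvement(local_acc, global_acc):
    """Relative improvement of a local result over the global model, in percent."""
    return 100.0 * (local_acc - global_acc) / global_acc


# Glob. and W.Sum accuracies (%) from Table I.
glob = {"FOMM": 40.6, "CTW": 49.2, "PST": 45.3}
w_sum = {"FOMM": 40.1, "CTW": 48.3, "PST": 46.1}
ri = {name: round(relative_improvement(w_sum[name], glob[name]), 1) for name in glob}
print(ri)

# Hypothetical two-category example of the weighted sum.
print(weighted_accuracy([0.5, 0.5], [0.85, 0.65]))
```

Rounded to one decimal, the computed values agree with the RI row of Table I (-1.2, -1.8 and +1.8).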
This context gives an improvement in terms of $A[T, M_1, M_2]$ for all given prediction models: for FOMM the relative improvement is 6.9%, for CTW it is 10.6%, and for PST it is 18.9%. Regarding the predictive accuracy, it is important to notice that we have a much higher relative improvement for the advanced users, who are our target group from a business perspective. This group of users has longer sessions, which again indicates their interest in finding a suitable program. Therefore the type of user's behaviour is a useful context for our domain in the particular use-case of users' trail prediction, and the proposed technique can discover useful contexts that can be used to improve the prediction models for user navigation trails.

TABLE II. AVERAGE ACCURACIES (± STANDARD DEVIATION) OF USER INTENT PREDICTION WITH THE GLOBAL MARKOV AND LOCAL (USER TYPE CONTEXT) MARKOV MODELS. RELATIVE IMPROVEMENT COMPARED TO THE GLOBAL MODEL (GLOB.) IS GIVEN IN BOLD IN THE ROUND BRACKETS. W.SUM IS WEIGHTED SUM OF THE LOCAL MODEL ACCURACIES (EQUATION 6).

Cat. i    Size i    FOMM (%)            CTW (%)             PST (%)
Glob.     1         40.6±0.3            49.2±4.3            45.3±
expert    0.11      55.3±0.9 (+36.2)    59.3±3.1 (+20.5)    60.7±1.8 (+34.0)
novice    0.89      43.4±0.3 (+6.9)     53.2±1.9 (+8.3)     53.1±2.9 (+17.2)
W.Sum     1         43.4±8 (+6.9)       54.4±1.7 (+10.6)    53.9±2.7 (+18.9)

C. Random Context

We introduce a random context $R$ in order to provide supporting evidence for the presented theory about contextual Markov models. In particular, we aim to provide an experimental argument that local Markov models are not worse than global models. We randomly select training samples of different sizes. Assume that the random context has two categories, so $k = 2$. Therefore, we randomly divide $Tr$ into two samples $(Tr/2)_1$ and $(Tr/2)_2$ and build local models $M_{Tr/2_1}$ and $M_{Tr/2_2}$ respectively. An alignment function randomly selects a model $M_{Tr/n_i}$ for a test instance. Then the expected accuracy $A[T, M_{Tr/2_1}, M_{Tr/2_2}]$ is calculated (Equation 6). We continue the experiment by recursively splitting the training data until the size of $Tr/n$ becomes less than 100 sessions. We run the experiment 10 times and compute the averages and standard deviations of the generalization accuracies.

The results are presented in Figure 3. The blue plot "Weighed random context" shows the accuracy of local models of different sizes. Figure 3 (B) presents the results for the PST predictor. We can clearly see that when the training size becomes less than 4k, the standard error increases substantially and the accuracy declines. The same situation happens with the CTW predictor in Figure 3 (A): the accuracy drops when the training size is less than 4k instances. Both predictors show the same tendency: the accuracy decreases when the size of the sampled training subset is less than 10-20% of the whole set, and an increase of the standard error signals a future reduction of accuracy (or future unexpected behaviour). Figure 3 also depicts the accuracies and the corresponding standard errors of the global model and the considered contexts: geographical location and users' behaviour type. Based on these observations we can hypothesize that if the standard error is low then the discovered cluster is strong.

VIII. CONCLUSION

In practice, domain experts can have many ideas about possible contexts for the domain, based on their intuition; they can apply explorative analysis and data mining techniques to identify the contextual features. In this paper, we introduced a formal definition of useful context and the problem of learning contextual Markov models. We formulated context discovery as an optimization problem. We provided intuitive proofs showing that an optimal global model corresponds to optimal contextual models and that, for Markov models, the contextual models are expected to be at least as good as the global one. We performed an experiment with random contextual Markov models and it shows that under some constraints they are almost as good as the global model. Therefore, this fact gives us experimental evidence about the safety of testing local models.
Thus, at least for this class of models we have a sound justification and motivation for context-aware predictive analytics. We introduced a method for context discovery which consists of two important components: a clustering algorithm, which divides training sequences into k groups, and an alignment method, which assigns a new test sequence to one of the clusters. We presented experiments with a real-world dataset for two specific examples of our method: (1) the explicit context of users' location, which is widely used in many applications, and (2) an implicit context that is inferred from the user navigation graph with our approach. The experimental case study that we performed on the real dataset can be regarded as an illustration of learning contextual Markov models. This case study shows that if we can identify useful contexts, the local Markov models outperform the single global Markov model, and if the context is not useful, local models will still perform as well as the global model.

ACKNOWLEDGEMENTS

This research has been partly supported by the STW CAPA and NWO COMPASS projects. The experimentation was partly carried out on the National e-infrastructure with the support of SURF Foundation. We would like to thank Thijs Putman from StudyPortals.eu for providing the anonymized dataset.

[Figure 3: two panels, (A) Context Tree Weighting and (B) Probabilistic Suffix Trees. y-axis: accuracy (%); x-axis: training set size, with ticks 136, 273, 547, 1095, 2191, 4382, 8765, 17530, 28049, 35061, 70123, 140247, 252444, 280494. Series: weighted random context, weighted geo context, global VOMM, weighted user type context, expert user context, novice user context.]

Fig. 3. Mean of accuracy for 10 iterations with standard error (SE). Plot (A) represents results for the CTW algorithm. Plot (B) represents results for the PST algorithm.
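The recursive splitting used in the random-context experiment can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: it takes the largest value on the training-set-size axis of Fig. 3 (280494) as the full training-set size, uses integer halving, and stops before the size drops below 100 sessions. The helper names `halving_schedule` and `random_halves` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def halving_schedule(full_size: int, min_size: int = 100) -> List[int]:
    """Training-set sizes obtained by repeated halving, down to min_size."""
    sizes = []
    size = full_size
    while size >= min_size:
        sizes.append(size)
        size //= 2  # integer halving (assumption)
    return sizes


def random_halves(sessions: Sequence, rng: random.Random) -> Tuple[list, list]:
    """Randomly divide sessions into two samples (random context, k = 2)."""
    shuffled = list(sessions)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# 280494 is read off the x-axis of Fig. 3; treating it as |T_r| is an assumption.
sizes = halving_schedule(280494)
print(sizes)

# Toy usage: split 10 placeholder session ids into two random samples.
first, second = random_halves(range(10), random.Random(0))
print(len(first), len(second))
```

Under these assumptions the schedule passes through sizes such as 140247, 70123, 4382 and 136, which also appear on the axis of Fig. 3; each sample would then be used to train one local model.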