Optimizing Result Prefetching in Web Search Engines with Segmented Indices (Extended Abstract)
Ronny Lempel, Shlomo Moran
Department of Computer Science, The Technion, Haifa 32000, Israel

Abstract

We study the process in which search engines with segmented indices serve queries. In particular, we investigate the number of result pages which search engines should prepare during the query processing phase. Search engine users have been observed to browse through very few pages of results for queries which they submit. This behavior of users suggests that prefetching many results upon processing an initial query is not efficient, since most of the prefetched results will not be requested by the user who initiated the search. However, a policy which abandons result prefetching in favor of retrieving just the first page of search results might not make optimal use of system resources either. We argue that for a certain behavior of users, engines should prefetch a constant number of result pages per query. We define a concrete query processing model for search engines with segmented indices, and analyze the cost of such prefetching policies. Based on these costs, we show how to determine the constant which optimizes the prefetching policy. Our results are mostly applicable to local index partitions of the inverted files, but are also applicable to processing of short queries in global index architectures.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Proceedings of the 28th VLDB Conference, Hong Kong, China, 2002

1 Introduction

The sheer size of the WWW and the efforts of search engines to index significant portions of it [14] have caused many search engines to partition their inverted index of the Web into several disjoint segments (partial indices). The partitioning of the index impacts the manner in which the engines process queries. Most engines also use some form of query result caching, where results of queries that were served are cached for some time. In particular, query results may be prefetched in anticipation of user requests. Such a scenario occurs when the engine retrieves (for a certain query) more results than will initially be returned to the user. We examine efficient prefetching policies for search engines. These policies depend on the architecture of the search engine (which, in turn, affects its query processing scheme) and on the behavior patterns of search engine users.

1.1 Search Engine Users

Users submit queries to search engines. From a user's point of view, an engine answers each query with a linked set of ranked result pages, typically with 10 results per page. All users browse the first page of results (the results deemed by the engine's ranking scheme to be the most relevant to the query), and some scan additional result pages, usually in the natural order in which those pages are presented. Three studies have analyzed the manner in which users query search engines and view result pages: a study by Jansen et al. [7], based on 51,473 queries submitted to the search engine Excite; a study by Markatos [15], based on about a million queries submitted to Excite; and a study by Silverstein et al. [19], based on about a billion queries submitted to the search engine AltaVista. Three findings which these studies share are particularly relevant to this work:
- The queries submitted to WWW search engines are very short, averaging less than 2.4 terms per query, with over half of the queries containing just one or two terms. These results were reported by both [19] and [7]. While the two studies define query terms somewhat differently, the reported term counts may be loosely interpreted as the number of words per query.
- Users browse through very few result pages. The mentioned studies differ in the reported distribution of page views, but agree that at least 58% of the users view only the first page (the top-10 results), and that no more than 12% of users browse through more than 3 result pages.
- The number of distinct information needs of users is very large, as can be seen from the huge variety of queries submitted to search engines. However, popular queries are repeated many times, and the 25 most popular queries account for over 1% of all queries submitted to the engines.

1.2 Caching and Prefetching of Search Results

It is commonly believed that all major search engines perform some sort of search result caching and prefetching. Caching of results was noted in Brin and Page's description of the prototype of the search engine Google [3] as an important optimization technique of search engines. Markatos [15] demonstrated that caching search results can lead to hit ratios of close to 30%. In addition to storing results that were requested by users in the cache, search engines may also prefetch results that they predict to be requested shortly. An immediate example is prefetching the second page of results whenever a new query is submitted by a user. Since studies [19, 7] indicate that the second page of results is requested shortly after a new query is submitted in at least 15% of cases, search engines may prepare and cache two (or more) result pages per query.

1.3 Index Structure and Query Processing Models

Inverted indices, or inverted lists/files, are regarded as the most widely applied indexing technique [18, 8, 20, 17, 16], and are believed to be used by the major search engines.
As search engines index hundreds of millions of Web pages [14], the size of their inverted indices is measured in terabytes. Ribeiro-Neto and Barbosa [17] mention three hardware configurations that can handle large digital libraries: a powerful central machine, a parallel machine, or a high-speed network of machines (workstations and high-end desktops). However, when considering the size of the indices which search engines maintain, the growth rate of the Web and the large number of queries which search engines answer each day, using a network of machines is considered to be the most cost-effective and scalable architecture [6, 17]. Such networks operate in a shared-nothing memory organization [18] where each machine has its own processing power (one or several CPUs), its own memory and its own secondary storage. The machines communicate by passing messages via the high speed network that connects them. There are two well-studied schemes of partitioning an inverted index across several machines:

- Global index organization. In this scheme, the inverted index is partitioned by terms. Each machine holds posting lists for a distinct set of terms (the terms may be partitioned by lexicographic order, for example). The posting list for term t holds entries for all documents that include t.
- Local index organization. In this scheme, the inverted index is partitioned by documents. Each machine is responsible for indexing a distinct set of documents, and will hold posting lists for all terms that appeared in its set of documents.

Some works, such as those by Ribeiro-Neto and Barbosa [17] and Tomasic and Garcia-Molina [20], have compared the (run-time) efficiency of the above schemes. Parallel generation of a global index has been studied in [18], while a system which crawls the Web and builds a distributed local index was presented in [16]. Cahoon et al. [4] evaluated the computational performance of local indices under a variety of workloads, and Hawking [6] examined scalability issues of local index organizations. The prototype of Google was reported as using global index partitioning [3].
Many of the above mentioned works [4, 18, 6, 17, 20] describe essentially the same model for processing queries in systems with segmented indices:

User queries arrive at a certain designated machine, which we will call the Query Integrator, or QI. This machine was called home site in [20], central broker in [18] and [17], user interface (or UIF) in [6] and connection server in [4].

The QI issues each query to the separate index segments, in a manner which depends on the partitioning scheme of the index. With local index partitioning, the QI will send the query (as submitted by the user) to all segments. With global index partitioning, the QI sends each segment a partial query consisting only of the set of terms whose posting lists are stored in the segment.

The QI waits for the relevant segments to return their result sets, and merges these result sets with respect to the system's ranking scheme. Again,
the two index partitioning schemes imply different merge operations. With local index partitioning, it is usually assumed that each segment has the ability to calculate the global score of each document in its local index with respect to all queries. Since the result sets that are returned by different segments are disjoint, merging the various result sets is straightforward. With global index partitioning, each (relevant) segment returns a ranked document list that may overlap lists returned by other segments, and where each score reflects only the score of the document with respect to the partial query which that segment received. The QI may need to perform set operations on the partial result sets (for queries containing boolean operators), and might need to weigh the scores returned from each segment differently (for example, according to the different idf values of the terms in each partial query).

The QI returns the merged results to the users.

We consider a cache-augmented process, in which the QI maintains a query-result cache. Upon receiving a query from a user, the QI first checks if the cache contains results for that query. If so, the cached results are returned to the user, without forwarding the query to any of the segments. If the query cannot be answered from the cache, the QI processes the query as described above, and upon completion, caches the merged results.

1.4 This Work

When considering the query processing model described above in the context of Web search engines, we note that merged results are returned to users in small batches (typically 10 at a time), in decreasing order of relevance (as ranked by the search engine). The QI, however, may prepare more results than are needed to populate the first batch, and cache them for future use. This raises the issue of optimizing the number of prefetched results in systems where the cost of processing uncached queries increases with the number of results that are fetched: prefetching a large number of results per query will be costly at first, but may pay off should the user request additional batches of results (since these will already be cached).
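The cache-augmented process and the idea of preparing several batches per uncached query can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the engine design discussed in the paper: `fetch_ranked` is a hypothetical stand-in for querying the segments and merging their answers, the cache is an unbounded dictionary with no replacement policy, and `page_size` and `prefetch_pages` are assumed parameter names.

```python
from typing import Callable, Dict, List, Tuple


class QueryIntegrator:
    """Toy QI with a query-result cache and page prefetching (illustrative only)."""

    def __init__(self, fetch_ranked: Callable[[str, int, int], List[str]],
                 page_size: int = 10, prefetch_pages: int = 2) -> None:
        # fetch_ranked(topic, start_rank, count) is a hypothetical helper that
        # returns `count` ranked results starting at 0-based rank `start_rank`.
        self.fetch_ranked = fetch_ranked
        self.page_size = page_size
        self.prefetch_pages = prefetch_pages
        self.cache: Dict[Tuple[str, int], List[str]] = {}
        self.backend_calls = 0  # how often the segments had to be contacted

    def serve(self, topic: str, page: int) -> List[str]:
        """Return result page `page` (1-based) for `topic`."""
        key = (topic, page)
        if key in self.cache:  # cache hit: the segments are not contacted
            return self.cache[key]
        # Cache miss: prepare this page plus the next prefetch_pages - 1 pages.
        self.backend_calls += 1
        start = (page - 1) * self.page_size
        count = self.prefetch_pages * self.page_size
        results = self.fetch_ranked(topic, start, count)
        for i in range(self.prefetch_pages):
            chunk = results[i * self.page_size:(i + 1) * self.page_size]
            self.cache[(topic, page + i)] = chunk
        return self.cache[key]


def fake_backend(topic: str, start: int, count: int) -> List[str]:
    # Stand-in ranking: result i for a topic is just a labelled string.
    return [f"{topic}-{i}" for i in range(start, start + count)]
```

With `prefetch_pages=2`, a request for page 2 that follows a request for page 1 is answered from the cache, while a request for page 3 triggers a second round of processing.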
Note that with the cost of prefetching we also associate the cache space that is occupied by the prefetched results. Assuming a fixed-size cache, increasing the number of prefetched results per query may decrease the number of queries whose results can be simultaneously cached. This may lead to lower cache hit ratios, and to an increase in the load of the engine. (The maintenance of the cache is not considered in this work. In particular, we do not examine how cached entries are replaced or how the freshness of the results is maintained.)

Another issue arising from the query processing model is the relationship between the number of results which the QI decides to prefetch per query, and the number of results which it should ask of each segment. As an example, consider an engine which uses local partitioning into $m$ segments, and whose policy is to prefetch $n$ results per query. How many top results (denoted by $l$) should the QI collect from each segment for each query? It may happen that all of the top $n$ results reside on a particular segment. Therefore, in order to be certain that indeed all top $n$ results are obtained, it is necessary to collect the top-$n$ results of each segment (setting $l = n$). However, assuming that documents are partitioned randomly and independently into the segments, the QI may be able to collect considerably fewer results from each segment and still, with very high probability, obtain all of the top-$n$ results. Thus, when optimizing the number of prefetched results, the behavior of $l$ with respect to $n$ must also be considered.

The tradeoff between the amount (and cost) of result prefetching and the possibility of serving subsequent queries from the cache is the main topic of this paper. As popular search engines process millions of queries every day, efficient prefetching policies can help reduce both the hardware requirements and the response time of the engines.

The rest of this paper is organized as follows. Section 2 formally presents the problems studied and the notations used throughout this paper (the notations are summarized there in Table 1).
We model both the search engine's query service process and the users' behavior. We then define the cost of prefetching a given number of results in terms of a cost function which is analyzed and optimized in later sections. Section 3 presents an algorithm which optimizes the prefetch cost function for two special cases. The first case deals with inverted indices that fit on a single machine. This single machine scenario also models serving single-term queries (which are quite common on the Web) with a globally-partitioned index. The second special case deals with a scenario where the engine guarantees that the users receive absolutely optimal results, using worst-case assumptions on the distribution of relevant documents in local-index partitions. The main body of work is contained in Section 4, which presents algorithms that solve and approximately solve the optimization problem for locally partitioned indices with an arbitrary number of segments, among which the documents are randomly distributed. Section 5 tackles the combinatorial problem of setting the number of results which should be retrieved from each segment in order to provide quality merged results to the users. Section 6 discusses the practical impact that our results may have on search engine engineering. Conclusions and suggestions for future research are brought in Section 7.

2 Notations and Formal Model

2.1 The User

Our work requires a model for the way search engine users view result pages of their searches. Two studies [19, 7] have reported on several aspects of such user behavior by examining the query logs of search engines. For our purposes, the analysis of AltaVista's log [19] did not report in sufficient detail the exact distribution of result page views (citing percentages of users viewing 1, 2 and 3 pages only). In addition, the statistics reported in that paper only considered requests for additional results which arrived within 5 minutes of the previous request made by the same user. The study of Excite's users [7] brings a more elaborate distribution of result page views per query. 58% of the page views were of the first result page, 19% of the views were of the second result page, and the views of result pages 3-9 (21.3% of the views) conformed to a Geometric distribution with a parameter between 0.288 and 0.427.

We chose to model the number of result pages which users view per query as a Geometric random variable $u \sim G(1-p)$. According to this model, users view result pages in their natural order, and the probability of a user viewing exactly result pages $1, \ldots, k$ (not viewing result pages $k+1$ and beyond) equals $(1-p)p^{k-1}$. In other words, upon viewing a result page, the user requests the next page with probability $p$. An important property of the Geometric distribution is the fact that it is memoryless:

$$\Pr(u \ge s+t \mid u \ge s) = p^t \qquad \forall s, t \in \mathbb{N}$$

Assume that the complexity of retrieving ranked results is also "memoryless", meaning that the complexity of retrieving the results that rank in places $j, j+1, \ldots, j+(k-1)$ depends only on the number of results retrieved, $k$. As we will see, this assumption holds when the identity of the result that ranks in place $j-1$ is known.
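The geometric user model and its memoryless property can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the value of $p$ used in any check is an arbitrary choice, not a figure from the cited studies.

```python
def prob_views_exactly(k: int, p: float) -> float:
    """Pr(u = k) = (1 - p) * p**(k - 1): the user views pages 1..k and then stops."""
    return (1.0 - p) * p ** (k - 1)


def prob_views_at_least(k: int, p: float) -> float:
    """Pr(u >= k) = p**(k - 1): the user asked for a next page k - 1 times."""
    return p ** (k - 1)


def conditional_tail(s: int, t: int, p: float) -> float:
    """Pr(u >= s + t | u >= s); under the model this equals p**t for every s >= 1."""
    return prob_views_at_least(s + t, p) / prob_views_at_least(s, p)


def expected_pages(p: float) -> float:
    """E[u] = 1 / (1 - p), the mean number of result pages viewed per query."""
    return 1.0 / (1.0 - p)
```

For any starting page `s`, `conditional_tail(s, t, p)` depends only on `t`, which is the property the argument relies on.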
Then, the memoryless behavior of the users and the memoryless cost of retrieval imply that the optimal number of result pages $r_{opt}$ that should be prefetched for a query is independent of the number of result pages requested so far: any time a query cannot be served from the cache, the QI should prepare the next $r_{opt}$ result pages.

2.2 The Index Architecture and the Complexity of Processing Queries

The model to which we refer in most of this paper is that of a local index partitioning scheme in a shared-nothing network. The index is partitioned among $m$ segments. We assume that documents (URLs) are partitioned into segments by a random process which assigns each document to a segment according to the uniform distribution, and independently of all other documents. Such a partitioning can be achieved by hashing every URL into a fixed-size document ID, and mapping these IDs into segments. Such a scheme was mentioned in [1] in the context of building URL repositories, and the same technique can be applied when assigning pages to the segments of an inverted index. Since the number of documents considered is in the hundreds of millions while $m$ is considerably smaller (much less than the square root of the number of documents), the segments will contain roughly the same number of documents (with high probability).

The query processing model is as described in Section 1.3. Throughout the discussion we consider the processing of a "broad topic" query that matches $C$ documents in each segment, where $C$ is much larger than the number of results a user will actually browse. Let $A$ denote the number of results which the engine presents in each result page (a typical value is $A = 10$). Since results should be prefetched in page units, the number of prefetched results per query should be a multiple of $A$. In what follows we examine the cost of prefetching $n = rA$ results per query, so that in subsequent sections we will be able to optimize the value of $r$, the number of prefetched result pages. We will denote a user's query by a pair $(t, k)$, where $t$ is the search topic and $k \ge 1$ is the (ordinal) number of the result page requested.
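The hash-based assignment of documents to segments mentioned above might look as follows. This is a sketch under assumptions: SHA-1 truncated to 64 bits and a modulo mapping are illustrative choices made here, not the scheme of [1], and the function names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, Iterable


def doc_id(url: str) -> int:
    """Hash a URL into a fixed-size (64-bit) document ID (assumed scheme)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def segment_of(url: str, m: int) -> int:
    """Map the document ID to one of m segments, numbered 0 .. m-1."""
    return doc_id(url) % m


def segment_sizes(urls: Iterable[str], m: int) -> Dict[int, int]:
    """Count how many of the given URLs land in each segment."""
    counts = Counter(segment_of(u, m) for u in urls)
    return {s: counts.get(s, 0) for s in range(m)}
```

Because the hash output behaves like a uniform random value, the segment sizes returned by `segment_sizes` should be close to one another when the number of URLs is much larger than `m`.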
A query can either start a search of a new topic (and then $k = 1$), or ask for additional results in an existing search ($k > 1$). The following discussion addresses both query types.

Preliminaries. Upon receiving a query $(t, k)$ which cannot be answered from the cache, the QI needs to fetch $n$ results for $t$. The first task is to set the value of $l$, the number of results to retrieve from each of the $m$ segments. Let $B(n)$ denote the set of documents that the engine should ideally retrieve for the query: the $n$ documents that attain the best scores for $t$ (according to the engine's ranking function), out of all documents that have not been retrieved for queries $(t, k')$, $k' < k$. Let $R(l, m)$ denote the set of documents that will be retrieved for the query $t$ when each of the $m$ segments returns its $l$ most relevant (and previously unretrieved) matches for $t$. Ideally, $R(l, m)$ would contain $B(n)$, but ensuring that means setting $l$ to equal $n$ (this special case is discussed in Section 3). Instead, we assume that the engine employs the following quality policy, based on a probability $q$: the QI sets the value of $l$ with respect to $n$ such that

$$\Pr[B(n) \subseteq R(l, m)] \ge q$$
In other words, the QI should collect enough (previously unretrieved) quality results from each segment so that with probability $q$, the top-$n$ retrieved results will indeed be the best (previously unretrieved) results for $t$ in the entire index. The relationship between $n$ and $l$ will be studied in Section 5. For the time being, it suffices to note that by the assumption that documents are uniformly distributed among the segments, the above probability depends only on the values of $n$, $m$ and $l$, and is independent of the topic $t$. Let $\tilde{l}_q(n, m)$ denote the minimal number of documents which should be retrieved by each of the $m$ segments so that the quality criterion is satisfied:

$$\tilde{l}_q(n, m) \triangleq \min\{\, l \mid \Pr[B(n) \subseteq R(l, m)] \ge q \,\}$$

Collecting results. The QI sends each segment the topic $t$ and a request for its $\tilde{l}_q(n, m)$ top results for the query. Whenever $k > 1$ (this is not the first batch of results to be retrieved for $t$), segment $i$ also receives $s_i(t, k-1)$, the score of the lowest ranking document that it had contributed to the results of $(t, 1), \ldots, (t, k-1)$ (see footnote 6). We now estimate the cost of serving such requests. By our assumption, the query matches $C$ documents in each segment, where $C$ is much larger than the number of results users will actually browse through, and consequently is much larger than $\tilde{l}_q(n, m)$ (since $\tilde{l}_q(n, m)$ is bounded by $n$, and $n$ is bounded by the number of results that users browse through). We assume that identifying the $C$-sized set of candidate documents can be done in a time that is linear in $C$. This assumption holds for the inverted index structure when the number of query terms is very small, as is the case with broad topic queries on the Web (see Section 1.1). Recall that each segment receives the score of the lowest ranking document that it has retrieved so far for the query, and can thus discard previously retrieved results from the set of candidates. The top-scoring $\tilde{l}_q(n, m)$ documents of the remaining candidates are then found. Each segment will thus spend $\Theta(C + \tilde{l}_q(n, m) \log C)$ processing steps (per query) in order to return $\tilde{l}_q(n, m)$ sorted results to the QI.
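The per-segment step described above (discard already-retrieved candidates, then extract the best $l$ of the rest) can be sketched with a binary heap, which gives the stated $C + l \log C$ behavior: building the heap is linear in the number of candidates and each extraction is logarithmic. The sketch assumes that scores are distinct and that candidates are `(score, document id)` pairs; both are assumptions made here for illustration.

```python
import heapq
from typing import List, Optional, Tuple

Scored = Tuple[float, str]  # (score, document id); distinct scores are assumed


def segment_top_l(candidates: List[Scored], l: int,
                  threshold: Optional[float] = None) -> List[Scored]:
    """Return the l best not-yet-retrieved candidates, best first.

    `threshold` plays the role of s_i(t, k-1): candidates scoring at or above
    it are treated as already retrieved for earlier batches and are discarded.
    """
    if threshold is not None:
        candidates = [c for c in candidates if c[0] < threshold]  # linear scan
    # Max-heap via negated scores: heapify is linear, each pop is logarithmic.
    heap = [(-score, doc) for score, doc in candidates]
    heapq.heapify(heap)
    top: List[Scored] = []
    for _ in range(min(l, len(heap))):
        neg_score, doc = heapq.heappop(heap)
        top.append((-neg_score, doc))
    return top
```

Calling the function a second time with `threshold` set to the lowest score returned by the first call yields the next batch of results from the same segment.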
Merging results. The QI receives $m$ sorted result sets of length $\tilde{l}_q(n, m)$. Reading and buffering these sets takes $\Theta(m\,\tilde{l}_q(n, m))$ operations. It then partially merges the results until it identifies the top $n = rA$ retrieved results that will populate the $r$ result pages. By using Tree Selection Sorting [12] with the $m$ sorted result lists hanging from the leaves of the tree, the merge can be accomplished in time $\Theta(2m + n \log m)$. The overall complexity of this step is thus $\Theta(m\,\tilde{l}_q(n, m) + 2m + n \log m)$.

Footnote 6: We assume that the results of the query $(t, k-1)$ are still cached when the query $(t, k)$ arrives.

Caching results. The $r$ result pages are cached, and the first of those pages is returned to the user. The scores $s_1(t, k), \ldots, s_m(t, k)$ are also noted. The overall space complexity is thus $\Theta(rA + m)$.

The complexity of the query processing model. Our model requires two messages to be passed between the QI and each of the segments: the QI sends the query to each segment, and each segment returns $\tilde{l}_q(n, m)$ results to the QI. The total number of results received by the QI is $m\,\tilde{l}_q(n, m)$, and this amount of data impacts its time complexity. Had we allowed more rounds of communication, we could have managed by sending the QI only $n + (m-1)$ results, lowering the complexity of the merge step above to $\Theta(n + m + n \log m)$. We chose not to do so since minimizing communication rounds between machines (even at the expense of sending larger messages) is likely to improve performance in distributed computations [6]. Note that the complexity of the retrieval model described above is indeed "memoryless" (see discussion in Section 2.1). The model implies the following computational loads on the various resources of the engine, when following a policy of prefetching $r$ result pages per query:

- The QI performs $\Theta(m\,\tilde{l}_q(rA, m) + 2m + rA \log m)$ computation steps.
- Each index segment performs $\Theta(C + \tilde{l}_q(rA, m) \log C)$ computations.
- The cache space required is $\Theta(rA + m)$.

Additionally, we introduce two non-negative coefficients $\alpha$ and $\beta$ which will allow us to assign different weights to the three resources which are consumed during query processing.
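The partial merge performed by the QI can be sketched as follows. Note the substitution: the text uses Tree Selection Sorting [12], whereas this sketch uses the heap-based $m$-way merge from Python's standard library, which has the same $\log m$ cost per extracted result. The function names and the `(score, document id)` representation are assumptions made here.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

Scored = Tuple[float, str]  # (score, document id)


def merge_top_n(segment_results: Iterable[List[Scored]], n: int) -> List[Scored]:
    """Partially merge per-segment lists (each sorted best-first); stop after n results."""
    # heapq.merge is lazy, so only the first n merged items are ever produced.
    merged = heapq.merge(*segment_results, key=lambda item: -item[0])
    return list(islice(merged, n))


def paginate(results: List[Scored], page_size: int) -> List[List[Scored]]:
    """Split the merged results into result pages of page_size entries each."""
    return [results[i:i + page_size] for i in range(0, len(results), page_size)]
```

With local index partitioning the per-segment lists are disjoint, so no duplicate handling is needed in this sketch.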
Specifically, $\alpha$ will multiply the computations of the QI and $\beta$ will multiply the cache space required (see footnote 7). Tuning the values of $\alpha$ and $\beta$ can emphasize memory (cache) limitations, computational bottlenecks (the QI vs. the segments) and response time per query. More on this in Section 6.2.

We are now ready to formulate $W(r)$, the expected cost (or work) of a policy which prefetches $r$ pages for geometric users with parameter $p$. Result pages $ir+1, ir+2, \ldots, (i+1)r$ will be termed the $i$'th batch of result pages. For ease of notation, we introduce $l_q(r, m) \triangleq \tilde{l}_q(rA, m)$.

Footnote 7: The computational loads were expressed using the $\Theta(\cdot)$ notation. For concreteness and simplicity, we will consider the given expressions as the exact complexities. This allows us to avoid tedious notations, and does not affect the ensuing analysis (and results) of the paper.
$$W(r) = \text{Caching overhead} + \sum_{i=0}^{\infty} \Pr(\text{preparing batch } i)\cdot(\text{batch preparation complexity})$$

$$= \beta(Ar + m) + \sum_{i=0}^{\infty} p^{ir}\,\bigl[\,C + l_q(r,m)\log C + \alpha\,(rA\log m + m\,l_q(r,m) + 2m)\,\bigr]$$

$$= \beta(Ar + m) + \frac{C + l_q(r,m)\log C}{1 - p^r} + \frac{\alpha\,(rA\log m + m\,l_q(r,m) + 2m)}{1 - p^r}$$

Here batch $i$ (for $i = 0, 1, \ldots$) is prepared only if the user requests page $ir+1$, which happens with probability $p^{ir}$. Rearranging the terms, and ignoring the constant additive term $\beta m$ (which does not depend on $r$ and will not affect the optimization of $W(r)$), we get

$$W(r) = \beta A r + \frac{(C + 2\alpha m) + (\log C + \alpha m)\,l_q(r,m) + (\alpha A \log m)\,r}{1 - p^r}$$

To ease the notation, we define the following constants: $a = \beta A$, $b = C + 2\alpha m$, $c = \log C + \alpha m$ and $d = \alpha A \log m$. With this notation,

$$W(r) = ar + \frac{b + c\,l_q(r,m) + dr}{1 - p^r}$$

$C$, the number of documents per segment which match a query, is typically a large number, while $A$ and $m$ are typically much smaller. Thus, when the proportionality constants $\alpha$ and $\beta$ are both about 1, typical values of $b$ are large (tens of thousands and beyond), while $a, c, d$ are relatively small (typically less than 100).

Our mission: Given an $m$-way locally segmented index, geometric-$p$ users and some quality criterion $q$, determine $r_{opt}$, an integral value of $r$ which minimizes $W(r)$. In doing so, determine $l_q(r_{opt}, m)$. The QI will then prepare $r_{opt}$ result pages whenever a query cannot be answered from the cache, asking each of the $m$ segments to retrieve its top $l_q(r_{opt}, m)$ results (that score below a certain threshold) for the query being processed. We will strive to obtain exact or almost exact values of $r_{opt}$ and $l_q(r_{opt}, m)$.

3 Simple Special Cases

In this section we show that the problem for a single segment ($m = 1$) and the problem for multiple segments with $q = 1$ behave similarly, and in both cases $r_{opt}$ can be found in $\Theta(\log r_{opt})$ steps.
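The cost function can be written down directly from the constants $a$, $b$, $c$ and $d$. The sketch below makes two assumptions that are not fixed by the text at this point: logarithms are taken to base 2, and $l_q(r, m)$ is supplied by the caller as a function of $r$ (here named `l_q`). The second helper evaluates the unsimplified expression, so the rearrangement can be checked numerically.

```python
import math
from typing import Callable


def make_cost_function(A: int, C: int, m: int, p: float,
                       alpha: float, beta: float,
                       l_q: Callable[[int], int]) -> Callable[[int], float]:
    """Build W(r) = a*r + (b + c*l_q(r) + d*r) / (1 - p**r), base-2 logs assumed."""
    a = beta * A
    b = C + 2 * alpha * m
    c = math.log2(C) + alpha * m
    d = alpha * A * math.log2(m)

    def W(r: int) -> float:
        return a * r + (b + c * l_q(r) + d * r) / (1.0 - p ** r)

    return W


def expanded_cost(r: int, A: int, C: int, m: int, p: float,
                  alpha: float, beta: float, l: int) -> float:
    """The cost before rearrangement, including the constant term beta * m."""
    segment_work = C + l * math.log2(C)
    qi_work = alpha * (r * A * math.log2(m) + m * l + 2 * m)
    return beta * (A * r + m) + (segment_work + qi_work) / (1.0 - p ** r)
```

For every `r`, `W(r) + beta * m` should agree with `expanded_cost(...)` evaluated at `l = l_q(r)`, up to floating-point error.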
Table 1: Summary of notations

Symbol | Denotes
$a$ | shorthand for $\beta A$
$b$ | shorthand for $C + 2\alpha m$
$c$ | shorthand for $\log C + \alpha m$
$d$ | shorthand for $\alpha A \log m$
$m$ | number of segments in index
$p$ | probability of viewing result page $k$ when viewing page $k-1$
$q$ | quality criterion of QI
$r$ | number of result pages to fetch
$r_{opt}$ | optimal integral value of $r$
$A$ | number of results per result page
$C$ | number of relevant results per segment
$W(r)$ | work required for fetching $r$ result pages per query
$l_q(r, m)$ | number of results to fetch from each segment so that the best $rA$ results are collected with probability at least $q$; equals $\tilde{l}_q(rA, m)$
$\alpha$ | multiplies the computations of the QI in $W(r)$
$\beta$ | multiplies the required caching space in $W(r)$

When the index is stored in a single segment, we can ignore the terms in the complexity function $W(r)$ which deal with the merging of results from different segments (namely the terms involving $\alpha$). In addition, $l_q(r, 1) = rA$ regardless of $q$'s value. Thus, $W(r)$ becomes:

$$W(r) = ar + \frac{C + (A \log C)\,r}{1 - p^r}$$

Note that when an index is partitioned globally (each segment holds posting lists for a distinct set of terms), single-term queries are effectively queries to a single segment as described above. Studies [19, 7] indicate that the percentage of single-term queries on the Web is quite large (25%-30%).

For the case where $q = 1$ we again have $l_1(r, m) = rA$, and the complexity function $W(r)$ takes the following form:

$$W(r) = ar + \frac{b + (cA + d)\,r}{1 - p^r}$$

Both cases imply a complexity function of the form

$$W(r) = ar + \frac{b' + d'r}{1 - p^r}, \qquad b', d' > 0$$

The derivative of $W(r)$ is negative at zero and increases for all $r > 0$. Therefore, $W(r)$ (for positive values of $r$) decreases at first until reaching its (unique) minimal value, and then increases. Relying on this behavior, an optimal integral value of $r$ ($r_{opt}$) can be found by applying the following procedure:

1. Find the minimal natural number $\nu$ such that $W(2^{\nu}) < W(2^{\nu+1})$.
2. Find an optimal value of $r$, using binary search, in the range $2^{\nu-1}, \ldots, 2^{\nu+1}$.

Since $\nu$ will not exceed $1 + \lceil \log r_{opt} \rceil$, the complexity of finding $r_{opt}$ is $\Theta(\log r_{opt})$.
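The two-step procedure above can be sketched as a doubling phase followed by a binary search on the sign of $W(r+1) - W(r)$. The sketch assumes, as the text argues for the special cases, that $W$ first decreases and then increases over the positive integers; the function names and the example constants are illustrative choices made here.

```python
from typing import Callable


def find_r_opt(W: Callable[[int], float]) -> int:
    """Minimize a decrease-then-increase function W over r = 1, 2, 3, ..."""
    # Step 1: smallest nu with W(2**nu) < W(2**(nu + 1)).
    nu = 0
    while W(2 ** nu) >= W(2 ** (nu + 1)):
        nu += 1
    lo = 1 if nu == 0 else 2 ** (nu - 1)
    hi = 2 ** (nu + 1)
    # Step 2: binary search for the first r in [lo, hi] with W(r) <= W(r + 1).
    while lo < hi:
        mid = (lo + hi) // 2
        if W(mid) <= W(mid + 1):
            hi = mid          # W is already rising at mid: minimum is at or before mid
        else:
            lo = mid + 1      # W is still falling at mid: minimum is after mid
    return lo


def special_case_cost(a: float, b_prime: float, d_prime: float,
                      p: float) -> Callable[[int], float]:
    """W(r) = a*r + (b' + d'*r) / (1 - p**r), the common form of both special cases."""
    return lambda r: a * r + (b_prime + d_prime * r) / (1.0 - p ** r)
```

As a usage example, the single-segment case with $A = 10$, $C = 2^{13}$, $\beta = 1$ and base-2 logarithms (an assumption) corresponds to `special_case_cost(10, 8192, 130, p)`.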
4 Solution for an m-way Segmented Local Index

In this section we study the problem of setting the optimal value of $r$ given the quality criterion $q$ ($q < 1$), the engine's architecture parameters $A$, $C$ and $m$, and the complexity parameters $\alpha$ and $\beta$. Subsection 4.1 presents an algorithm for determining the optimal value of $r$, which minimizes the retrieval complexity function $W(r)$. Subsection 4.2 presents an approximation algorithm, which finds a value of $r$ for which $W(r)$ is approximately optimal.

4.1 Optimizing r in Indices With m Segments

First, recall the complexity function from Section 2.2:

$$W(r) = ar + \frac{b + c\,l_q(r,m) + dr}{1 - p^r}$$

Clearly, the behavior of $W(r)$ depends on the behavior of $l_q(r, m)$. While we will show how to precisely calculate $l_q(r, m)$ in Section 5, for the purpose of this subsection it suffices to note that if $r' \ge r$ then $l_q(r', m) \ge l_q(r, m)$. In order to facilitate the search for $r_{opt}$, we now set forth to find, for every value of $r$, an upper bound on the set $\{r' \mid W(r') \le W(r)\}$ (see footnote 8).

Definition 1. A function $g(r)$ will be called $W$-restrictive if for all $r' \ge g(r)$, $W(r') > W(r)$.

For example, $g_1(r) \triangleq W(r)/a$ is $W$-restrictive, since for all $r' \ge g_1(r)$, we have $W(r') > ar' \ge a\,g_1(r) = W(r)$. Consequently, $r_{opt}$ is not larger than $g_1(1)$. We will use $W$-restrictive functions to bound our search space for $r_{opt}$. For this we now seek a $W$-restrictive function that is better than $g_1$, providing tighter bounds on the size of the search space. The following Proposition is proved in the full version of this paper:

Proposition 1. The function

$$g(r) = r + \frac{p^r\,(b + c\,l_q(r,m) + dr)}{(1 - p^r)(a + d)}$$

is $W$-restrictive.

Note that the above function reflects all the architectural parameters of the search engine's index, and also the user's behavior (represented by $p$) and the desired quality criterion $q$. Figure 1 displays Algorithm OP for setting the optimal value of $r$. All of the steps except the calculation of $l_q(r, m)$ in 2(a) are trivial; that calculation is the topic of Section 5.
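Algorithm OP can be sketched in a few lines once $W(r)$ and the $W$-restrictive bound $g(r)$ of Proposition 1 are available. This is an illustration under assumptions: `l_q` is any nondecreasing function of `r` supplied by the caller (the exact calculation is the topic of Section 5), and the function and variable names are hypothetical.

```python
import math
from typing import Callable, Tuple


def algorithm_op(a: float, b: float, c: float, d: float, p: float,
                 l_q: Callable[[int], int]) -> Tuple[int, int]:
    """Return (r_opt, largest r evaluated) for W(r) = a*r + (b + c*l_q(r) + d*r)/(1 - p**r)."""

    def W(r: int, l: int) -> float:
        return a * r + (b + c * l + d * r) / (1.0 - p ** r)

    def g(r: int, l: int) -> float:
        # W-restrictive bound: every r' >= g(r) has W(r') > W(r).
        return r + (p ** r) * (b + c * l + d * r) / ((1.0 - p ** r) * (a + d))

    w_min, r_opt, limit, r = W(1, l_q(1)), 1, math.inf, 1
    while r < limit:
        l = l_q(r)
        w, bound = W(r, l), g(r, l)
        if limit > bound:      # step 2(b): tighten the search limit
            limit = bound
        if w < w_min:          # step 2(c): remember the best r seen so far
            w_min, r_opt = w, r
        r += 1                 # step 2(d)
    return r_opt, r - 1
```

In the accompanying check the stand-in `l_q(r) = r * A` is used, which is the value the text gives for the worst-case policy $q = 1$; with the exact $l_q$ of Section 5 the loop is unchanged.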
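The correctness of the algorithm follows from the $W$-restrictiveness of $g(r)$ (Proposition 1), since we do not need to iterate through values of $r$ for which $W(r)$ is known to be higher than values we have already seen.

Footnote 8: Since $\lim_{r \to \infty} W(r) = \infty$, the set $\{r' \mid W(r') \le W(r)\}$ is finite for all $r$.

1. Initializations: $W_{min} \leftarrow W(1)$; $r_{opt} \leftarrow 1$; $limit \leftarrow \infty$; $r \leftarrow 1$.
2. While $r < limit$:
   (a) Calculate $l = l_q(r, m)$, and use the value to set $W \leftarrow W(r)$; $g \leftarrow g(r)$.
   (b) If $limit > g$: $limit \leftarrow g$.
   (c) If $W < W_{min}$: $W_{min} \leftarrow W$; $r_{opt} \leftarrow r$.
   (d) $r \leftarrow r + 1$.
3. Print $r_{opt}$.

Figure 1: Algorithm OP for optimizing the prefetch policy

The complexity of the algorithm. Algorithm OP needs to be executed relatively few times, when configuring the prefetching policy of the search engine (see discussion in Section 6.2). Therefore, its own complexity does not impact the performance of the engine. Nevertheless, we now prove that its running time is polynomial. We do so by bounding $r_{max}$, the maximal number of iterations which OP may require throughout its course. For this, let

$$\gamma \triangleq \frac{a + d}{b + cA + d}$$

Note that by our assumptions on the relative values of $a, b, c$ and $d$ (see Section 2.2), $\gamma$ is a small constant. Since $l_q(1, m) \le A$, we have that $g(1) \le 1 + \frac{p}{(1-p)\gamma}$ is a bound on the number of iterations. Thus, $r_{max}$ is bounded by $1 + \frac{2p}{1-p}$ whenever $\gamma \ge 0.5$, and by $1 + \frac{1}{1-p}$ whenever $p \le \gamma$. Next, we bound $r_{max}$ when $p > \gamma$ and $\gamma < 0.5$.

Lemma 1. $r_{max} \le 3\left\lceil \frac{\log \gamma}{\log p} \right\rceil$ whenever $p > \gamma$ and $\gamma < 0.5$.

Proof: Let $r = \left\lceil \frac{\log \gamma}{\log p} \right\rceil$. Then, since $1 > p > \gamma$, $r > 1$ and $p^r = 2^{r \log p} \le 2^{\log \gamma} = \gamma$. Thus,

$$g(r) = r + \frac{p^r}{1 - p^r}\cdot\frac{b + c\,l_q(r,m) + dr}{a + d} \le r + \frac{\gamma}{1 - \gamma}\cdot\frac{b + c\,l_q(r,m) + dr}{a + d}$$

$$\le r + \frac{\gamma}{1 - \gamma}\cdot\frac{b + crA + dr}{a + d} \qquad (\text{since obviously } l_q(r,m) \le rA)$$

$$\le r + \frac{r}{1 - \gamma} = \frac{2 - \gamma}{1 - \gamma}\,r \le 3r = 3\left\lceil \frac{\log \gamma}{\log p} \right\rceil$$

To complete the analysis of the complexity of algorithm OP for finding $r_{opt}$, we show in Section 5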
that calculating the values of $l_q(r, m)$ for all $r \in \{1, \ldots, r_{max}\}$ requires $O(m^2 A^2 r_{max}^2)$ steps (regardless of the value of $q$). Since we have already bounded $r_{max}$ by simple functions of $\gamma$ and $p$, bounds on the complexity of the algorithm follow.

Table 2 brings sample results of the algorithm. For every combination of $m$ and $p$, $r_{opt}$ and $r_{max}^{act}$ (the highest value of $r$ for which $W(r)$ was actually calculated during execution) are shown. Figure 2 plots $W(r)$ as a function of $r$, as calculated during the algorithm for three values of $m$ with $p = 0.5$. For all displayed results, we used $q = 0.99$, $\alpha = \beta = 1$, $A = 10$ and $C = 2^{13}$.

         | p = 0.3 | p = 0.5 | p = 0.7
m = 5    | 4 (5)   | 7 (9)   | 11 (15)
m = 25   | 4 (5)   | 6 (8)   | 12 (14)
m = 50   | 4 (5)   | 6 (8)   | 10 (13)

Table 2: $r_{opt}$ ($r_{max}^{act}$) values as a function of $m$ and $p$

4.2 Approximating the Optimal Solution

In the previous subsection we have shown how to determine $r_{opt}$, the number of pages which minimizes the complexity function $W(r)$. However, if we are willing to settle for nearly optimal solutions, namely finding values of $r$ for which $\frac{W(r_{opt})}{W(r)} \ge 1 - \epsilon$ for small values of $\epsilon$, we can use the following algorithm:

1. Let $r_{max} = \left\lceil \frac{\log \epsilon}{\log p} \right\rceil$.
2. Find the value of $r$ in the range $\{1, \ldots, r_{max}\}$ which minimizes $W(r)$.

Note that $r_{max}$ depends on the user's behavior (as modeled by $p$) but is independent of the engine's architecture and quality policy (which are modeled by $a, b, c, d$ and $q$). Furthermore, the above algorithm is applicable to any work function $\tilde{W}(r)$ such that $(1 - p^r)\tilde{W}(r)$ is an increasing function of $r$. Note that $W(r)$ satisfies this condition, since

$$(1 - p^r)\,W(r) = (1 - p^r)\,ar + b + c\,l_q(r,m) + dr$$

where $a, b, c, d$ are positive constants, and the functions $(1 - p^r)$ and $l_q(r, m)$ are nondecreasing functions of $r$. The correctness of the approximation algorithm relies on the following Proposition.

Proposition 2. Let $W(r)$ be any positive function such that $(1 - p^r)W(r)$ is an increasing function of $r$. Let $r, t \in \mathbb{N}$ such that $\frac{W(t)}{W(r)} \ge \frac{1}{1 - p^t}$. Then, for all $r' \ge t$, $W(r') > W(r)$.

Proof: Since $(1 - p^t)W(t) \ge W(r)$, we have for $r' \ge t$

$$W(r') > (1 - p^{r'})\,W(r') \ge (1 - p^t)\,W(t) \ge W(r)$$

Corollary 1. Let $0 < s < t$.
Then $W(s) < \frac{W(t)}{1 - p^s}$.

Proof: Since $(1 - p^r)W(r)$ increases with $r$, we get

$$W(s) < W(t)\,\frac{1 - p^t}{1 - p^s} < \frac{W(t)}{1 - p^s}$$

Corollary 2. For all $s$,

$$\min\{W(1), \ldots, W(s)\} < \frac{W(r_{opt})}{1 - p^s}$$

Proof: If $1 \le r_{opt} \le s$, the claim holds. Otherwise, the result is implied by Corollary 1, with $t = r_{opt}$.

Substituting $s = r_{max} = \left\lceil \frac{\log \epsilon}{\log p} \right\rceil$ in the last Corollary yields the approximation algorithm:

$$\min\{W(1), \ldots, W(s)\} < \frac{W(r_{opt})}{1 - p^{\lceil \log \epsilon / \log p \rceil}} = \frac{W(r_{opt})}{1 - 2^{\log p\,\lceil \log \epsilon / \log p \rceil}} \le \frac{W(r_{opt})}{1 - \epsilon}$$

Table 3 shows the values of $r_{max}$ for $p = 0.1, 0.2, \ldots, 0.9$ and $\epsilon = 0.1, 0.01$ and $0.001$. As mentioned earlier, calculating the values of $l_q(r, m)$ for all $r \in \{1, \ldots, r_{max}\}$ requires $O(m^2 A^2 r_{max}^2)$ computational steps, and thus the time complexity of the approximation algorithm is $O\!\left(m^2 A^2 \left\lceil \frac{\log \epsilon}{\log p} \right\rceil^2\right)$. Finally, we note that the results of this subsection may be used in practice to improve the running time of Algorithm OP (Figure 1), by checking (between steps 2(b) and 2(c)) whether $\frac{W}{W_{min}} \ge \frac{1}{1 - p^r}$, and setting $limit \leftarrow r$ if so (thus terminating the algorithm). Proposition 2 asserts that all future iterations with larger values of $r$ will result in greater values of $W(r)$, and so OP can safely terminate and output the current value of $r_{opt}$.

5 Calculating $l_q(r, m)$

This section brings recursive formulae with which $l_q(r, m)$ can be calculated in a time which is polynomial in $m$, $r$ and $A$. We model the distribution of the top $n$ results in the $m$ segments by the following random process: $n \triangleq rA$ different balls (the top $n$ results for a query) are thrown randomly and independently into $m$ different cells (the segments), where $n_i$ balls are inserted to cell $i$ ($\sum_{i=1}^{m} n_i = n$). We model the querying process by taking $\min\{l, n_i\}$ balls from cell $i$ for $i = 1, \ldots, m$. Denote by $e_{n,m,l}$ the number of excess balls that remain in the cells after the querying process is completed. In Section 5.1 we calculate the probability that $e_{n,m,l} = 0$. This corresponds to the case where no cell
contains more than $l$ balls, so that the QI indeed managed to collect the top $n$ results from the segments. In the full version of this work we also calculate the probability that $e_{n,m,l} = k$. This corresponds to the case where the QI managed to collect just $n-k$ of the top $n$ results. We study this case since the QI may choose to employ a relaxed quality policy, requiring that with high probability, most (but not necessarily all) of the top $n$ results are returned to the user. Subsection 5.2 briefly reviews previous work on related issues.

Figure 2: $W(r)$ as a function of $r$, for $m = 5, 25$ and $50$ ($p = 0.5$)

Table 3: $r_{\max}$ as a function of $p$ and $\epsilon$ (columns: $p = 0.1, 0.2, \ldots, 0.9$)

We first present a rough bound on $l_q(r,m)$ which may suffice when precise calculations are not essential. Clearly, $\frac{n}{m}$ is a lower bound on $l_q(r,m)$ for all $q > 0$. We show that for $q = 1 - \frac{1}{2^{\lambda}}$, $l_q(r,m)$ need not be larger than $\max\left\{\lceil \lambda + \log m \rceil, \left\lceil 2e\frac{n}{m} \right\rceil\right\}$. (Sharper known asymptotic bounds on $l_q(r,m)$ are discussed in Section 5.2.)

The probability that exactly $i$ of the $n$ results are inserted into a given segment is

$\binom{n}{i}\left(\frac{1}{m}\right)^{i}\left(1-\frac{1}{m}\right)^{n-i}$.

Since $\binom{n}{i} \le \left(\frac{en}{i}\right)^{i}$,

$\binom{n}{i}\left(\frac{1}{m}\right)^{i}\left(1-\frac{1}{m}\right)^{n-i} < \left(\frac{en}{i}\right)^{i}\left(\frac{1}{m}\right)^{i} = \left(\frac{en}{im}\right)^{i}$.

Hence, the probability that more than $\ell$ results are inserted into a given segment is bounded by

$\sum_{i=\ell+1}^{n}\left(\frac{en}{im}\right)^{i} < \sum_{i=\ell+1}^{\infty}\left(\frac{en}{\ell m}\right)^{i} = \left(\frac{en}{\ell m}\right)^{\ell}\cdot\frac{\frac{en}{\ell m}}{1-\frac{en}{\ell m}}$

Whenever $\ell \ge \max\left\{\lambda + \log m, 2e\frac{n}{m}\right\}$, we have $\frac{en}{\ell m} \le \frac{1}{2}$, so this last expression is bounded from above by $2^{-\ell} \le 2^{-(\lambda + \log m)} = \frac{1}{m 2^{\lambda}}$. Thus, by the union bound, the probability that at least one of the $m$ segments contains more than $\ell$ results is smaller than $\frac{1}{2^{\lambda}}$. The result follows.

5.1 Precise Calculation of $l_q(r,m)$

We now turn to the precise calculation of $l_q(r,m)$. For this we will calculate the probability $P(n,m,l) = \Pr[e_{n,m,l} = 0]$, the probability of throwing $n$ different balls into $m$ different cells so that no cell contains more than $l$ balls. The size of the problem space is $m^n$. We will actually be counting $N(n,m,l)$, the number of ways to throw $n$ different balls into $m$ different cells so that no
cell contains more than $l$ balls, and then $P(n,m,l) = N(n,m,l)/m^n$. The following recursive formulae may be used to calculate the $N(n,m,l)$ values:

$N(n+1,m,l) = m\sum_{j=0}^{l-1}\binom{n}{j}N(n-j,m-1,l)$

$N(n,m+1,l) = \sum_{j=0}^{l}\binom{n}{j}N(n-j,m,l)$

However, the recursion that most naturally fits in Algorithm OP from Section 4.1 is:

$N(n,m,l) = \sum_{j=0}^{\min\{m,\lfloor n/l \rfloor\}}\binom{m}{j}\binom{n}{l,\ldots,l,\,n-jl}N(n-jl,m-j,l-1)$

First, we choose some $j$ cells to have exactly $l$ balls. We then choose the balls to populate those cells (the multinomial coefficient has $j$ $l$-terms). The remaining $n-jl$ balls are distributed to the remaining $m-j$ cells, with each such cell collecting no more than $l-1$ balls. As $r$ grows in subsequent iterations of OP, so will the value of $l_q(r,m)$. This recursion naturally uses results of $N(n,m,l)$ from previous iterations in later iterations.

As for the initial values:

1. For all $m, l$: $N(0,m,l) = 1$. Whenever $n > 0$, $N(n,0,l) = N(n,m,0) = 0$.
2. For all $n > 0$, $m > 0$: whenever $l < \lceil n/m \rceil$, $N(n,m,l) = 0$. Also, $N(n,m,\lceil n/m \rceil) = \binom{m}{k}\frac{n!}{\left(\lceil n/m \rceil !\right)^{k}\left(\lfloor n/m \rfloor !\right)^{m-k}}$, where $k \triangleq n \bmod m$.

Denoting by $n_{\max} \triangleq r_{\max}A$ the value of $n$ in the last iteration of OP (and by $l_{\max}$ the value of $l$ found in that iteration), the total time spent calculating values of $l_q(r,m)$ is $\Theta(n_{\max} m^2 l_{\max}) = O(m^2 n_{\max}^2)$. Table 4 shows sample values of $l_q(r,m)$.

Table 4: $l_q(r,m)$ for $q = 0.99$, $A = 10$ and various values of $r$ and $m$

5.2 Previous Work

The stochastic properties of the process which randomly throws $n$ balls into $m$ cells have been studied extensively. Two good references are [13] and [10]. Among the properties studied was the distribution of the maximum number of balls in a cell, which we will denote by $L(n,m)$. For example, for $n \ge m$ (more balls than cells), $L(n,m) = \Theta\left(\frac{\ln m}{\ln\left(1+\frac{m}{n}\ln m\right)} + \frac{n}{m}\right)$ with probability $1-o(1)$ [5]. When $n = m$, $L(n,m)$ behaves asymptotically as $(1+o(1))\frac{\ln n}{\ln\ln n}$ with probability $1-o(1)$ [2]. In [13], the distribution of $L(n,m)$ is examined with regard to the behavior of the ratio $\frac{n}{m \ln m}$ as $n, m \to \infty$. Separate results are obtained for the three cases in which this ratio tends to $0$, to a positive constant, and to $\infty$. In [9] it was shown that the distribution of $L(n,m)$ may be approximated by the distribution of $n\cdot\frac{\max_j s_j}{\sum_{j=1}^{m} s_j}$, where each $s_j$ is an independent $\chi^2$ variable with $\frac{2(n-1)}{m}$ degrees of freedom.

6 From Theory to Practice

This section attempts to bridge the gap between theory and practice by highlighting the possible practical implications of our model and results.

6.1 The Complexity Function $W(r)$

We first revisit two assumptions we have made while formalizing $W(r)$ (Section 2). These assumptions pertain to the manner by which users view result pages and to the memoryless query processing scheme.

1. "Users view search result pages according to a memoryless geometric process". While this assumption is extremely simplistic, the studies cited in Section 2.1 indicate that it might reasonably approximate the aggregate behavior of users.

2. "When a request for result page $k$ arrives, result page $k-1$ is still cached". We used this assumption to send each segment the score of the lowest result it had contributed to page $k-1$. This, in turn, allowed us to formulate a memoryless query processing scheme. While ignoring cache management issues in this work, the following consideration justifies the intuition behind this assumption: the aim of any policy that prefetches $r$ pages (numbered $k,\ldots,k+r-1$) when processing a request for result page $k$ of some query, is to rapidly answer (from the cache) subsequent requests for pages $k+1,\ldots,k+r-1$ of that query. Thus, the prefetching policy implicitly assumes that the
life expectancy of cached entries will allow page $k+r-1$ to be cached until it is requested. In other words, every policy that prefetches $r$ pages assumes that pages will be cached long enough for $r-1$ subsequent requests. We require pages to be cached for $r$ subsequent requests.

The above assumptions allowed us to formulate an exact complexity function for our concrete query processing model. At the end of Section 2.2, the complexity function was abbreviated to the form

$W(r) = ar + \frac{b + c\,l_q(r,m) + dr}{1-p^r}$.

We claim that this abbreviated form (and our results) can accommodate any retrieval model that incurs the following costs when prefetching $r$ pages:

Cache space that is linear in $r$, the number of prefetched result pages.

Retrieval complexity that is the sum of (1) a term that depends on the query's breadth (number of matching results), (2) a term that is linear in $l_q(r,m)$, and (3) a term that is linear in $r$.

Thus, our results may apply to index structures and query processing schemes that differ from our model. Furthermore, the results of Section 4.2 apply to any complexity function $\tilde{W}(r)$ where $(1-p^r)\tilde{W}(r)$ is an increasing function of $r$. Finally, the results of Section 5, where we determined the number of results that should be retrieved from each segment ($l_q(r,m)$), are applicable to any search engine that uses a locally segmented index in which documents are partitioned uniformly and independently.

6.2 Implementing a Prefetching Policy

Implementing a prefetching policy for engines with locally segmented indices in the framework of this research requires the following two preprocessing steps:

Setting the parameters: an approximate value of $p$ is derived from analyzing the engine's query logs, the parameter $q$ is set according to the quality policy, and the values of $\alpha, \beta$ are set according to the engine's resources. Systems with small caches should set $\alpha$ to a high value; when the QI is heavily loaded, $\beta$ should be set to a high value; etc.

For a range of query breadths (a range of values for the parameter $C$), an algorithm (either optimizing or approximating $r_{opt}$) is executed.
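As a rough illustration of this preprocessing step, the following Python sketch evaluates the abbreviated cost $W(r)$ and runs the approximation algorithm of Section 4.2. It rests on several assumptions that are not taken from the original text: $l_q(r,m)$ is read as the smallest $l$ for which $P(rA,m,l) \ge q$, $N(n,m,l)$ is counted with the recursion that adds one cell at a time, the coefficients a, b, c, d are arbitrary placeholders (in practice they would depend on the engine's parameters and the query breadth), and all function names are hypothetical.

```python
import math
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def count_placements(n, m, l):
    """N(n, m, l): ways to throw n distinct balls into m distinct cells
    so that no cell holds more than l balls."""
    if n == 0:
        return 1
    if m == 0 or l == 0:
        return 0
    # Pick the j balls that land in the last cell; the other n - j balls
    # go to the remaining m - 1 cells under the same per-cell limit.
    return sum(math.comb(n, j) * count_placements(n - j, m - 1, l)
               for j in range(min(l, n) + 1))


def results_per_segment(r, m, A, q):
    """Assumed reading of l_q(r, m): smallest l with P(rA, m, l) >= q."""
    n = r * A
    total = m ** n  # size of the problem space
    for l in range(-(-n // m), n + 1):  # start at ceil(n / m)
        if Fraction(count_placements(n, m, l), total) >= q:
            return l
    return n  # only reached if q > 1; l = n always gives probability 1


def cost(r, m, A, q, p, a, b, c, d):
    """Abbreviated W(r) = a*r + (b + c*l_q(r, m) + d*r) / (1 - p**r)."""
    l_q = results_per_segment(r, m, A, q)
    return a * r + (b + c * l_q + d * r) / (1 - p ** r)


def approx_optimal_r(eps, m, A, q, p, a, b, c, d):
    """Minimize W(r) over r in {1, ..., ceil(log eps / log p)}."""
    # Floating-point rounding can push the ratio just above an integer,
    # which would only enlarge the search range by one.
    r_max = math.ceil(math.log(eps) / math.log(p))
    return min(range(1, r_max + 1),
               key=lambda r: cost(r, m, A, q, p, a, b, c, d))
```

For example, `approx_optimal_r(0.01, m=5, A=10, q=0.99, p=0.5, a=1, b=1, c=1, d=1)` searches only $r \in \{1,\ldots,7\}$. Repeating the call for each query breadth of interest, with coefficients adjusted accordingly, would produce one table row per breadth.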
The QI and each segment are then loaded with tables containing the values of $r_{opt}(C)$ and $l_q(r_{opt}(C),m)$ for values of $C$ in the range.

Upon receiving a query $t$, each segment estimates that query's breadth (the value of $C_t$ that corresponds to $t$). This can be done in two ways:

Many local index implementations incorporate global term statistics in each segment in order to facilitate term-based scoring [1]. These statistics may help in estimating the breadth of certain types of queries.

By our assumption, each of the segments contains approximately the same number of results for broad topic queries (when $C \gg m$). Thus, a segment can process the query, and use the number of matches it finds as an estimate of $C/m$.

After estimating $C_t$, the segment forwards $l_q(r_{opt}(C_t),m)$ results to the QI, which merges the retrieved results to produce $r_{opt}(C_t)$ result pages.

7 Conclusions and Future Work

This work examined how search engines should prefetch search results for user queries. We started by presenting a concrete query processing model for search engines with locally segmented inverted indices. We argued that for a model which assumes that the number of result pages that users view is distributed geometrically, the optimal engine policy is to prefetch a constant number of result pages $r$. We expressed the computational cost of a policy that prefetches $r$ pages, and suggested an algorithm for finding the optimal value of $r$ (which minimizes the expected cost). We also suggested how to find values of $r$ which imply policies whose cost is approximately optimal. Several extensions of this work are the following:

The model presented in this paper ignores overlaps in the information needs of different users. We did not consider, for example, that popular queries may be submitted by multiple users during a short time span, increasing the probability of at least one user requesting additional results. By taking query popularity into account, we may find that popular queries warrant more result prefetching than rare queries do.
This work did not address cache replacement policies; in particular, we did not suggest which result pages should be removed from the cache upon prefetching results for a new query. As noted in [11] (in the context of buffering of posting lists), knowledge of the access patterns to the query cache should be considered when setting the replacement policy. For example, users usually browse result pages in their natural order. Thus, assuming that the first two result pages of some query are cached and that one of them must be evicted, it seems natural to remove the second page of results (and not the first, as an LRU policy might suggest).

Most of the results in this paper are applicable to locally segmented indices. Only single-term
queries to global indices are considered. Additional research is required in order to extend our results to multi-term queries to global indices.

Acknowledgments

We thank Andrei Broder and Farzin Maghoul from AltaVista for useful discussions and insights on the problems covered in this paper. (Andrei Broder is currently with IBM Research.)

References

[1] A. Arasu, J. Cho, H. Garcia-Molina, A. Paepcke, and S. Raghavan. Searching the web. ACM Transactions on Internet Technology, 1(1):2-43.
[2] Yossi Azar, Andrei Z. Broder, Anna R. Karlin, and Eli Upfal. Balanced allocations. SIAM Journal of Computing, 29(1):180-200.
[3] Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual web search engine. Proc. 7th International WWW Conference.
[4] Brendon Cahoon, Kathryn S. McKinley, and Zhihong Lu. Evaluating the performance of distributed architectures for information retrieval using a variety of workloads. ACM Transactions on Information Systems, 18(1):1-43.
[5] Artur Czumaj and Volker Stemann. Randomized allocation processes. Proc. 38th IEEE Symposium on Foundations of Computer Science, pages 194-203.
[6] David Hawking. Scalable text retrieval for large digital libraries. First European Conference on Digital Libraries.
[7] Bernard J. Jansen, Amanda Spink, and Tefko Saracevic. Real life, real users, and real needs: A study and analysis of user queries on the web. Information Processing and Management, 36(2):207-227.
[8] Byeong-Soo Jeong and Edward Omiecinski. Inverted file partitioning schemes in multiple disk systems. IEEE Transactions on Parallel and Distributed Systems, 6(2):142-153.
[9] N. L. Johnson and D. H. Young. Some applications of two approximations to the multinomial distribution. Biometrika, 47:463-469.
[10] Norman L. Johnson and Samuel I. Kotz. Urn Models and their Application. John Wiley & Sons, Inc.
[11] Bjorn Thor Jonsson, Michael J. Franklin, and Divesh Srivastava. Interaction of query evaluation and buffer management for information retrieval. In SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, Seattle, Washington, USA, pages 118-129, June.
[12] Donald E. Knuth. The Art of Computer Programming, Volume 3. Addison-Wesley Publishing Company Inc.
[13] Valentin F. Kolchin, Boris A. Sevast'yanov, and Vladimir P. Chistyakov. Random Allocations. V. H. Winston & Sons.
[14] Steve Lawrence and C. Lee Giles. Searching the world wide web. Science, 280, April.
[15] Evangelos P. Markatos. On caching search engine query results. Proceedings of the 5th International Web Caching and Content Delivery Workshop, May.
[16] Sergey Melnik, Sriram Raghavan, Beverly Yang, and Hector Garcia-Molina. Building a distributed full-text index for the web. Proc. 10th International WWW Conference.
[17] B. Ribeiro-Neto and R. Barbosa. Query performance for tightly coupled distributed digital libraries. In Proc. ACM Digital Libraries Conference, 1998, pages 182-190.
[18] B. A. Ribeiro-Neto, J. P. Kitajima, G. Navarro, C. R. G. Sant'Ana, and N. Ziviani. Parallel generation of inverted files for distributed text collections. Proc. 18th International Conference of the Chilean Computer Science Society.
[19] Craig Silverstein, Monika Henzinger, Hannes Marais, and Michael Moricz. Analysis of a very large AltaVista query log. Technical Report, Compaq Systems Research Center, October.
[20] A. Tomasic and H. Garcia-Molina. Performance of inverted indices in shared-nothing distributed text document information retrieval systems. In Proc. Second International Conference on Parallel and Distributed Information Systems, pages 8-17.
CHAPTER 4: NET PRESENT VALUE
EMBA 807 Corporate Fiace Dr. Rodey Boehe CHAPTER 4: NET PRESENT VALUE (Assiged probles are, 2, 7, 8,, 6, 23, 25, 28, 29, 3, 33, 36, 4, 42, 46, 50, ad 52) The title of this chapter ay be Net Preset Value,
The Binomial Multi- Section Transformer
4/15/21 The Bioial Multisectio Matchig Trasforer.doc 1/17 The Bioial Multi- Sectio Trasforer Recall that a ulti-sectio atchig etwork ca be described usig the theory of sall reflectios as: where: Γ ( ω
In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008
I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces
Ant Colony Algorithm Based Scheduling for Handling Software Project Delay
At Coloy Algorith Based Schedulig for Hadlig Software Project Delay Wei Zhag 1,2, Yu Yag 3, Juchao Xiao 4, Xiao Liu 5, Muhaad Ali Babar 6 1 School of Coputer Sciece ad Techology, Ahui Uiversity, Hefei,
An Electronic Tool for Measuring Learning and Teaching Performance of an Engineering Class
A Electroic Tool for Measurig Learig ad Teachig Perforace of a Egieerig Class T.H. Nguye, Ph.D., P.E. Abstract Creatig a egieerig course to eet the predefied learig objectives requires a appropriate ad
Distributed Storage Allocations for Optimal Delay
Distributed Storage Allocatios for Optial Delay Derek Leog Departet of Electrical Egieerig Califoria Istitute of echology Pasadea, Califoria 925, USA derekleog@caltechedu Alexadros G Diakis Departet of
GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling
: A Global -based Redistributio Approach to Accelerate RAID-5 Scalig Chetao Wu ad Xubi He Departet of Electrical & Coputer Egieerig Virgiia Coowealth Uiversity {wuc4,xhe2}@vcu.edu Abstract Uder the severe
arxiv:0903.5136v2 [math.pr] 13 Oct 2009
First passage percolatio o rado graphs with fiite ea degrees Shakar Bhaidi Reco va der Hofstad Gerard Hooghiestra October 3, 2009 arxiv:0903.536v2 [ath.pr 3 Oct 2009 Abstract We study first passage percolatio
I. Chi-squared Distributions
1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.
Department of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
Taking DCOP to the Real World: Efficient Complete Solutions for Distributed Multi-Event Scheduling
Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed Multi-Evet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria
CDAS: A Crowdsourcing Data Analytics System
CDAS: A Crowdsourcig Data Aalytics Syste Xua Liu,MeiyuLu, Beg Chi Ooi, Yaya She,SaiWu, Meihui Zhag School of Coputig, Natioal Uiversity of Sigapore, Sigapore College of Coputer Sciece, Zhejiag Uiversity,
Modified Line Search Method for Global Optimization
Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o
Soving Recurrence Relations
Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree
A Cyclical Nurse Schedule Using Goal Programming
ITB J. Sci., Vol. 43 A, No. 3, 2011, 151-164 151 A Cyclical Nurse Schedule Usig Goal Prograig Ruzzaiah Jeal 1,*, Wa Rosaira Isail 2, Liog Choog Yeu 3 & Ahed Oughalie 4 1 School of Iforatio Techology, Faculty
Controller Area Network (CAN) Schedulability Analysis with FIFO queues
Cotroller Area Network (CAN) Schedulability Aalysis with FIFO queues Robert I. Davis Real-Tie Systes Research Group, Departet of Coputer Sciece, Uiversity of York, YO10 5DD, York, UK [email protected]
A Supply Chain Game Theory Framework for Cybersecurity Investments Under Network Vulnerability
A Supply Chai Gae Theory Fraework for Cybersecurity Ivestets Uder Network Vulerability Aa Nagurey, Ladier S. Nagurey, ad Shivai Shukla I Coputatio, Cryptography, ad Network Security, N.J. Daras ad M.T.
MARTINGALES AND A BASIC APPLICATION
MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measure-theoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this
GOAL PROGRAMMING BASED MASTER PLAN FOR CYCLICAL NURSE SCHEDULING
Joural of Theoretical ad Applied Iforatio Techology 5 th Deceber 202. Vol. 46 No. 2005-202 JATIT & LLS. All rights reserved. ISSN: 992-8645 www.jatit.org E-ISSN: 87-395 GOAL PROGRAMMING BASED MASTER PLAN
Supply Chain Network Design with Preferential Tariff under Economic Partnership Agreement
roceedigs of the 2014 Iteratioal oferece o Idustrial Egieerig ad Oeratios Maageet Bali, Idoesia, Jauary 7 9, 2014 Suly hai Network Desig with referetial ariff uder Ecooic artershi greeet eichi Fuaki Yokohaa
Digital Interactive Kanban Advertisement System Using Face Recognition Methodology
Coputatioal Water, Eergy, ad Eviroetal Egieerig, 2013, 2, 26-30 doi:10.4236/cweee.2013.23b005 Published Olie July 2013 (http://www.scirp.org/joural/cweee) Digital Iteractive Kaba Advertiseet Syste Usig
Domain 1: Designing a SQL Server Instance and a Database Solution
Maual SQL Server 2008 Desig, Optimize ad Maitai (70-450) 1-800-418-6789 Domai 1: Desigig a SQL Server Istace ad a Database Solutio Desigig for CPU, Memory ad Storage Capacity Requiremets Whe desigig a
Throughput and Delay Analysis of Hybrid Wireless Networks with Multi-Hop Uplinks
This paper was preseted as part of the ai techical progra at IEEE INFOCOM 0 Throughput ad Delay Aalysis of Hybrid Wireless Networks with Multi-Hop Upliks Devu Maikata Shila, Yu Cheg ad Tricha Ajali Dept.
where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return
EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The
5 Boolean Decision Trees (February 11)
5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected
Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
the product of the hook-lengths is over all boxes of the diagram. We denote by d (n) the number of semi-standard tableaux:
O Represetatio Theory i Coputer Visio Probles Ao Shashua School of Coputer Sciece ad Egieerig Hebrew Uiversity of Jerusale Jerusale 91904, Israel eail: [email protected] Roy Meshula Departet of Matheatics
How To Calculate Stretch Factor Of Outig I Wireless Network
Stretch Factor of urveball outig i Wireless Network: ost of Load Balacig Fa Li Yu Wag The Uiversity of North arolia at harlotte, USA Eail: {fli, yu.wag}@ucc.edu Abstract outig i wireless etworks has bee
Article Writing & Marketing: The Best of Both Worlds!
2612 JOURNAL OF SOFTWARE, VOL 8, NO 1, OCTOBER 213 C-Cell: A Efficiet ad Scalable Network Structure for Data Ceters Hui Cai Logistical Egieerig Uiversity of PLA, Chogqig, Chia Eail: caihui_cool@126co ShegLi
A Faster Clause-Shortening Algorithm for SAT with No Restriction on Clause Length
Joural o Satisfiability, Boolea Modelig ad Computatio 1 2005) 49-60 A Faster Clause-Shorteig Algorithm for SAT with No Restrictio o Clause Legth Evgey Datsi Alexader Wolpert Departmet of Computer Sciece
Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments
Project Deliverables CS 361, Lecture 28 Jared Saia Uiversity of New Mexico Each Group should tur i oe group project cosistig of: About 6-12 pages of text (ca be loger with appedix) 6-12 figures (please
Infinite Sequences and Series
CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...
INTEGRATED TRANSFORMER FLEET MANAGEMENT (ITFM) SYSTEM
INTEGRATED TRANSFORMER FLEET MANAGEMENT (ITFM SYSTEM Audrius ILGEVICIUS Maschiefabrik Reihause GbH, [email protected] Alexei BABIZKI Maschiefabrik Reihause GbH [email protected] ABSTRACT The
Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).
BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly
Spot Market Competition in the UK Electricity Industry
Spot Market Copetitio i the UK Electricity Idustry Nils-Herik M. vo der Fehr Uiversity of Oslo David Harbord Market Aalysis Ltd 2 February 992 Abstract With particular referece to the structure of the
Sequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
The Computational Rise and Fall of Fairness
Proceedigs of the Twety-Eighth AAAI Coferece o Artificial Itelligece The Coputatioal Rise ad Fall of Fairess Joh P Dickerso Caregie Mello Uiversity dickerso@cscuedu Joatha Golda Caregie Mello Uiversity
Controller Area Network (CAN) Schedulability Analysis: Refuted, Revisited and Revised
Cotroller Area Networ (CAN) Schedulability Aalysis: Refuted, Revisited ad Revised Robert. Davis ad Ala Burs Real-ie Systes Research Group, Departet of Coputer Sciece, Uiversity of Yor, YO1 5DD, Yor (UK)
Lecture 2: Karger s Min Cut Algorithm
priceto uiv. F 3 cos 5: Advaced Algorithm Desig Lecture : Karger s Mi Cut Algorithm Lecturer: Sajeev Arora Scribe:Sajeev Today s topic is simple but gorgeous: Karger s mi cut algorithm ad its extesio.
Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions
Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig
Convexity, Inequalities, and Norms
Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for
CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations
CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad
Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis
Ruig Time ( 3.) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics
Designing Incentives for Online Question and Answer Forums
Desigig Icetives for Olie Questio ad Aswer Forums Shaili Jai School of Egieerig ad Applied Scieces Harvard Uiversity Cambridge, MA 0238 USA [email protected] Yilig Che School of Egieerig ad Applied
Incremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich [email protected] [email protected] Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
Section 11.3: The Integral Test
Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult
Chapter 6: Variance, the law of large numbers and the Monte-Carlo method
Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value
Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem
Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits
CHAPTER 3 DIGITAL CODING OF SIGNALS
CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity
A probabilistic proof of a binomial identity
A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two
Entropy of bi-capacities
Etropy of bi-capacities Iva Kojadiovic LINA CNRS FRE 2729 Site école polytechique de l uiv. de Nates Rue Christia Pauc 44306 Nates, Frace [email protected] Jea-Luc Marichal Applied Mathematics
Annuities Under Random Rates of Interest II By Abraham Zaks. Technion I.I.T. Haifa ISRAEL and Haifa University Haifa ISRAEL.
Auities Uder Radom Rates of Iterest II By Abraham Zas Techio I.I.T. Haifa ISRAEL ad Haifa Uiversity Haifa ISRAEL Departmet of Mathematics, Techio - Israel Istitute of Techology, 3000, Haifa, Israel I memory
Professional Networking
Professioal Networkig 1. Lear from people who ve bee where you are. Oe of your best resources for etworkig is alumi from your school. They ve take the classes you have take, they have bee o the job market
A Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design
A Combied Cotiuous/Biary Geetic Algorithm for Microstrip Atea Desig Rady L. Haupt The Pesylvaia State Uiversity Applied Research Laboratory P. O. Box 30 State College, PA 16804-0030 [email protected] Abstract:
The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection
The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity
ODBC. Getting Started With Sage Timberline Office ODBC
ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.
Output Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should
(VCP-310) 1-800-418-6789
Maual VMware Lesso 1: Uderstadig the VMware Product Lie I this lesso, you will first lear what virtualizatio is. Next, you ll explore the products offered by VMware that provide virtualizatio services.
CHAPTER 3 THE TIME VALUE OF MONEY
CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all
.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth
Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,
Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable
Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5
Research Article Sign Data Derivative Recovery
Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov
Impacts of the Collocation Window on the Accuracy of Altimeter/Buoy Wind Speed Comparison A Simulation Study. Ge Chen 1,2
Ge Che Ipacts of the Collocatio Widow o the ccuracy of ltieter/uoy Wid Speed Copariso Siulatio Study Ge Che, Ocea Reote Sesig Istitute, Ocea Uiversity of Qigdao 5 Yusha Road, Qigdao 66003, Chia E-ail:
Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows:
Subettig Subettig is used to subdivide a sigle class of etwork i to multiple smaller etworks. Example: Your orgaizatio has a Class B IP address of 166.144.0.0 Before you implemet subettig, the Network
Hypothesis testing. Null and alternative hypotheses
Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate
1. C. The formula for the confidence interval for a population mean is: x t, which was
s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value
Confidence Intervals for One Mean
Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a
Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.
Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).
0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5
Section 13 Kolmogorov-Smirnov test. Suppose that we have an i.i.d. sample X_1, ..., X_n with some unknown distribution P and we would like to test the hypothesis that P is equal to a particular distribution P_0, i.e.
Chapter 5 On A Conjecture Of Erdős Proceedings NCUR VIII (1994), Vol II, pp 794-798 Jeffrey F Gold Department of Mathematics, Department of Physics University of Utah Don H Tucker Department of Mathematics University
Investigation of Atwood's machines as Series and Parallel networks
Investigation of Atwood's machines as Series and Parallel networks Jafari Matehkolaee, Mehdi; Bavand, Amir Ahmad Islamic Azad university of Shahrood, Shahid Beheshti high school in Sari, Mazandaran, Iran [email protected]
Nr. 2. Interpolation of Discount Factors. Heinz Cremers Willi Schwarz. May 1996
Nr 2 Interpolation of Discount Factors Heinz Cremers Willi Schwarz May 1996 Authors: Publisher: Prof Dr Heinz Cremers Quantitative Methods and Special Banking Management Hochschule für Bankwirtschaft Dr Willi
MODELS AND METHODS OF RESOURCE MANAGEMENT FOR VPS HOSTING MODELE I METODY ZARZĄDZANIA ZASOBAMI DLA VPS HOSTING
SERGII TELENYK, OLEKSANDR ROLIK, MAKSYM BUKASOV, DMYTRO HALUSHKO MODELS AND METHODS OF RESOURCE MANAGEMENT FOR VPS HOSTING MODELE I METODY ZARZĄDZANIA ZASOBAMI DLA VPS HOSTING Abstract The paper summarizes
Chapter 7 Methods of Finding Estimators
Chapter 7 for BST 695: Special Topics in Statistical Theory. Kui Zhang, 011 Chapter 7 Methods of Finding Estimators Section 7.1 Introduction Definition 7.1.1 A point estimator is any function W(X) = W(X1, X2, ..., Xn) of
Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT
Keywords: project management, resource allocation, network planning Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocation between
Chapter 5: Inner Product Spaces
Chapter 5: Inner Product Spaces SECTION A Introduction to Inner Product Spaces By the end of this section you will be able to understand what is meant by an inner product space give examples
Analyzing Longitudinal Data from Complex Surveys Using SUDAAN
Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Darryl Creel Statistics and Epidemiology, RTI International, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical
Notes on exponential generating functions and structures.
Notes on exponential generating functions and structures. 1. The concept of a structure. Consider the following counting problems: (1) to find for each n the number of partitions of an n-element set, (2) to find for each n the
Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This document was written and copyrighted by Paul Dawkins. Use of this document and its online version is governed by the Terms and Conditions of Use located at http://tutorial.math.lamar.edu/terms.asp. The online version
Recovery time guaranteed heuristic routing for improving computation complexity in survivable WDM networks
Computer Communications 30 (2007) 1331-1336 www.elsevier.com/locate/comcom Recovery time guaranteed heuristic routing for improving computation complexity in survivable WDM networks Lei Guo * College of Information
SOLAR POWER PROFILE PREDICTION FOR LOW EARTH ORBIT SATELLITES
Jurnal Mekanikal June 2009, No. 28, 1-15 SOLAR POWER PROFILE PREDICTION FOR LOW EARTH ORBIT SATELLITES Chow Ki Paw, Renuganth Varatharajoo* Department of Aerospace Engineering Universiti Putra Malaysia 43400 Serdang,
Transient Vibration of the single degree of freedom systems.
Transient Vibration of the single degree of freedom systems. 1. -INTRODUCTION. Transient vibration is defined as a temporarily sustained vibration of a mechanical system. It may consist of forced or free vibrations, or both
Exploratory Data Analysis
1 Exploratory Data Analysis Exploratory data analysis is often the first step in a statistical analysis, for it helps understanding the main features of the particular sample that an analyst is using. Intelligent descriptions
Capacity of Wireless Networks with Heterogeneous Traffic
Capacity of Wireless Networks with Heterogeneous Traffic Mingyue Ji, Zheng Wang, Hamid R. Sadjadpour, J.J. Garcia-Luna-Aceves Department of Electrical Engineering and Computer Engineering University of California, Santa
THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for i = 1, 2, 3, ..., n
We will consider the linear regression model in matrix form. For simple linear regression, meaning one predictor, the model is $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for i = 1, 2, 3, ..., n. This model includes the assumption that the $\varepsilon_i$'s are a sample
How to read A Mutual Fund shareholder report
Investor Bulletin How to read A Mutual Fund shareholder report The SEC's Office of Investor Education and Advocacy is issuing this Investor Bulletin to educate individual investors about mutual fund shareholder reports.
SAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx
SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Find the following using the definition of the Riemann integral: a 0 x + dx 3 Consider the partition P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the interval
Basic Elements of Arithmetic Sequences and Series
MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elements of Arithmetic Sequences and Series Objective: To establish basic elements of arithmetic
The Forgotten Middle. research readiness results. Executive Summary
The Forgotten Middle Ensuring that All Students Are on Target for College and Career Readiness before High School Executive Summary Today, college readiness also means career readiness. While not every high school
The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles
The following example will help us understand The Sampling Distribution of the Mean Review: The population is the entire collection of all individuals or objects of interest The sample is the portion of the population
