Arecent solicitation from the National Science Foundation


 Todd Simmons
 2 years ago
 Views:
Transcription
1 ClietServer Computig WWW Traffic Reductio ad Load Balacig through ServerBased Cachig Azer Bestavros Bosto Uiversity This cachig protocol exploits the geographic ad temporal locality of referece exhibited i clietaccess patters. It automatically ad dyamically dissemiates popular data toward cosumers. Arecet solicitatio from the Natioal Sciece Foudatio deemed two research topics critical for projected applicatios of the Natioal Iformatio Ifrastructure: New techiques for orgaizig cache memories ad other bufferig schemes to alleviate memory ad etwork latecy ad to icrease badwidth. Partitioig ad distributio of system resources throughout a distributed system to reduce the amout of data that must be moved. Most cachig studies for large distributed iformatio systems cocetrate o clietbased cachig, which caches recetly ad frequetly accessed iformatio at the cliet (or at a proxy thereof) i aticipatio of future accesses to that iformatio (see the Clietbased cachig research sidebar). However, these techiques are myopic: they focus exclusively o cachig at a particular cliet or set of cliets, ad hece are likely to have limited impact. The Object Cachig Eviromets for Applicatios ad Network Services (Oceas) group, part of Bosto Uiversity s Computer Sciece Departmet, has developed a more global solutio that allows the replicatio of iformatio o a supplyaddemad basis. This process is cotrolled by servers, which uquestioably have a better view of dataaccess patters tha cliets have /97/$. 997 IEEE IEEE Cocurrecy
2 Clietbased cachig research Usually, clietbased cachig research has dealt with distributed file systems such as the Su NFS, 2 the Adrew File System, 3 ad the Coda system. 4 Recetly, some researchers have tried to exted these techiques to distributed iformatio systems such as FTP ad HTTP. Peter Dazig, Richard Hall, ad Michael Schwartz have studied cachig to reduce the badwidth requiremets for the FTP protocol o the NSFet. 5 Their study shows that a hierarchical cachig system that caches files at core odal switchig subsystems reduces the NSFet backboe traffic by 2%. Swarup Acharya ad Staley Zdoik have studied how data placemet ad replicatio affect etwork traffic, usig file access patters to suggest a distributed dyamic replicatio scheme. 6 Christos Papadimitrou, Sriivas Ramaatha, ad P. Vekat Raga have suggested a more static solutio based o fixed etwork ad storage costs for the delivery of multimedia home etertaimet. 7 Susa Eggers ad Rady Katz have simulated a twolevel cachig system that reduces etwork ad server loads. 8 Matthew Blaze has proposed a dyamic hierarchical file system that supports demaddrive replicatio, whereby cliets ca service requests issued by other cliets from the local disk cache. 9 Michael Dahli ad his colleagues have suggested a similar cooperative cachig idea. Steve Glassma preseted oe of the earliest attempts at cachig o the Web. His method orgaized satellite relays (proxy caches) ito a treestructured hierarchy, with cache misses i lower relays percolatig up through higher relays util the requested object is foud. The performace of this cachig system for a sigle relay with a rather small cache idicated that maitaiig a fairly stable 33% hit rate is possible. Usig a Zipfbased model, Glassma extrapolated the performace of such a system for large caches. (A Zipfbased model is a simulatio model that uses a Zipf distributio, whose parameters are estimated empirically from traces, to sythetically model the popularity of documets.) I particular, he estimated that with a ifiitesize cache, the maximum achievable hit rate is 4%. Mimi Recker ad James Pitkow made aother early attempt to characterize Web access patters to egieer Web cachig systems. 2 They based their model for Web iformatio access o two metrics borrowed from psychology research o huma memory: the frequecy ad rececy rates of past accesses. Stephe Williams ad his colleagues preset a taxoomy (ad compare the performace) of a umber of proxy cache replacemet policies. 3 Their work suggests that proxy cache maagemet should cosider documet sizes whe makig replacemet decisios. JeaChrysostome Bolot ad Philipp Hoschka have cofirmed this idea. 4 They showed the usefuless of usig iformatio about documet size ad etwork load i the replacemet algorithms of Web caches, based o timeseriesaalysis techiques of Web traffic. Marc Abrams ad his colleagues have preseted similar results about the iadequacy of classic LRU (least recetly used) cache replacemet. 5 Refereces. J.H. Howard et al., Scale ad Performace i a Distributed File System, ACM Tras. Computer Systems, Vol. 6, No., Feb. 988, pp R. Sadber et al., Desig ad Implemetatio of the Su Network File System, Proc. Useix Summer Cof., Useix Assoc., Berkeley, Calif., 985, pp J.H. Morris et al., Adrew: A Distributed Persoal Computig Eviromet, Comm. ACM, Vol. 29, No. 3, Mar. 986, pp M. Satyaarayaa et al., Coda: A Highly Available File System for a Distributed Workstatio Eviromet, IEEE Tras. Computers, Vol. 39, No. 4, Apr. 99, pp P. Dazig, R. Hall, ad M. Schwartz, A Case for Cachig File Objects iside Iteretworks, Tech. Report CUCS , Dept. of Computer Sciece, Uiv. of Colorado, Boulder, Colo., S. Acharya ad S.B. Zdoik, A Efficiet Scheme for Dyamic Data Replicatio, Tech. Report CS9343, Dept. of Computer Sciece, Brow Uiv., Providece, R.I., C.H. Papadimitriou, S. Ramaatha, ad P.V. Raga, Iformatio Cachig for Delivery of Persoalized Video Programs o Home Etertaimet Chaels, Proc. It l Cof. Multimedia Computig ad Systems, IEEE CS Press, 994, pp D. Mutz ad P. Hoeyma, MultiLevel Cachig i Distributed File Systems or Your Cache Ai t Nuthig but Trash, Proc. Witer 992 Useix, Useix Assoc., 992, pp M.A. Blaze, Cachig i Large Scale Distributed File Systems, PhD thesis, Computer Sciece Dept., Priceto Uiv., Priceto, N.J., M.D. Dahli et al., Cooperative Cachig: Usig Remote Cliet Memory to Improve File System Performace, First Symp. Operatig Systems Desig ad Implemetatio (OSDI), Useix Assoc., 994, pp S. Glassma, A Cachig Relay for the World Wide Web, Proc. First It l Cof. WWW, NorthHollad, Amsterdam, 994, pp M.M. Recker ad J.E. Pitkow, Predictig Documet Access i Large, Multimedia Repositories, Tech. Report VUGIT9435, Graphics, Visualizatio, ad Usability Ceter, Georgia Tech, Atlata, S. Williams et al., Removal Policies i Network Caches for WorldWide Web Documets, Virgia Polytechic Ist. ad State Uiv., Blacksburg, Va., J.C. Bolot ad Philipp Hoschka, Performace Egieerig of the World Wide Web: Applicatio to Dimesioig ad Cache Desig, Proc. Fifth It l Cof. WWW, Elsevier Press, Amsterdam, 996; Overview.html. 5. M. Abrams et al., Cachig Proxies: Limitatios ad Potetials, Proc. Fourth It l Cof. WWW, O Reilly & Associates, Sebastopol, Calif., 995, pp Jauary March
3 Cache expasio idex Slope of o geographic locality of referece Slope of perfect geographic locality of referece 2 Our protocol reduces the load o popular servers by duplicatig (o other servers) a small percetage of their data. The extet of this duplicatio (how much, where, ad o how may sites) depeds o two factors: the server s popularity ad the expected reductio i traffic if the dissemiatio is i a particular directio. I other words, our protocol provides a mechaism that automatically ad dyamically dissemiates popular data toward cosumers the more popular the data, the closer it gets to the cliets. Demadbased dissemiatio of iformatio from producers to cosumers is ot a ew idea: it is used i the retail ad ewspaper busiesses, amog other thigs. I this article, I propose the same philosophy for distributed iformatio systems. (See the Related work sidebar for research similar to ours.) We used the World Wide Web as the uderlyig distributed computig resource to be maaged. First, the WWW offers a umatched opportuity to ispect a wide rage of distributed object types, structures, ad sizes. Secod, thousads of istitutios worldwide have fully deployed the WWW, which gives us a uparalleled opportuity to apply our fidigs to a realworld applicatio. Usig extesive log data from HTTP servers, we ve developed a aalytical model of our protocol. This model demostrates how to implemet such dissemiatio, both efficietly ad with miimal chages to the Iteret s prevailig clietserver ifrastructure. Tracedrive simulatios based o the data have quatified the potetial performace gais from dissemiatio. 3 Number of cliets Byte hit rate 6% 8% % 2% Figure. The cache expasio idex (the rate at which the proxy cache should be iflated to maitai a costat byte hit rate). 4 5 Clietbased cachig ad locality of referece I a comprehesive study of clietbased cachig for the Web, the Oceas group established its limited effectiveess for very large distributed iformatio systems. 2 The study measured the effectiveess of sessio cachig, host cachig, ad LAN proxy cachig, usig a uique set of 5,7 cliet traces (almost 6, URL requests) that we obtaied by istrumetig Mosaic. 3 We cocluded that LAN proxy cachig is ultimately limited by the low level of sharig of remote documets amog cliets of the same site. This fidig agrees with Steve Glassma s predictios 4 ad was further cofirmed for geeral proxy cachig by Marc Abrams ad his colleagues. 5 To uderstad clietbased cachig s limited effectiveess, we eed to cosider how locality of referece cotributes to ehaced cache performace. Access patters i a distributed iformatio system (such as the WWW) exhibit three locality of referece properties: temporal, geographical, ad spatial. Temporal locality implies that recetly accessed objects will likely be accessed agai. Geographical locality implies that a object accessed by a cliet will likely be accessed agai by earby cliets. This property is similar to the processor locality of referece exhibited i parallel applicatios. 6 Spatial locality implies that a object ear a recetly accessed object will likely be accessed. If clietbased cachig occurs o a persessio basis (that is, the cache is cleared at the start of each cliet sessio), oly temporal locality of referece ca be exploited. The results of our clietbased cachig study suggest that for a sigle cliet, the temporal locality of referece is quite limited, especially for remote documets. I particular, we foud that eve with a ifiite cache size, the average byte hit rate is limited to 36% ad could be as low as 6% for some cliet traces. (The byte hit rate is a ormalized hit rate that takes ito accout the size of the objects that are accessed.) This poor performace could be attributed to the surfig behavior of cliets, which implies that recetly examied documets are rarely revisited. If clietbased cachig is o a persite basis (that is, all cliets share a commo proxy cache), you would expect improved cache performace as a result of the geographical locality of referece. However, the results of our study suggest that for remote documets, the amout of sharig betwee cliets is limited. I particular, we foud that eve with a ifiite proxy cache size, the average byte hit rate improves oly from 36% to 5%. 58 IEEE Cocurrecy
4 Table. Summary statistics for log data used i this article. CSWWW.BU.EDU DATA SET DATA SET 2 DATA SET 3 Days URL requests 72, ,739 4,68,432 Mbytes trasferred,447 6,544 2,5 Average daily trasfer i Mbytes 26 36,8 Files o system 2,8 2,679 N/A Files accessed (remotely) 974 (656),628 (,32) N/A (,5) Size of file system (amout accessed) i Mbytes 5 (37) 62 (42) N/A (42) Uique cliets (+ requests) 8,23 8,474 6,46 Figure, based o the data i our study, illustrates this poit. It plots the cache expasio idex: the rate at which the proxy cache (called LAN cache i our study) should be iflated to maitai a costat byte hit rate. The figure idicates that the CEI is proportioal to the umber of cliets (N) sharig the proxy cache. For a large umber of cliets cocurretly usig the cache, the CEI is liearly related to N ad does ot seem to level off. Figure does ot imply that the relatioship betwee the CEI ad N will cotiue to be liear for eve larger values of N; it oly implies that for the levels of cocurrecy likely to be aggregated at a sigle site through a proxy cache, the relatioship is liear. To get a higher level of cocurrecy, we eed to thik about a ecoomy of scale far beyod what a proxy cache at, say, a orgaizatio ca provide. Therefore, temporal ad geographical locality of referece for WWW access patters are ot strog eough to result i a effective cachig strategy at a sigle cliet or site. However, we ca exploit temporal ad geographical locality of referece o a much larger scale (for example, thousads of cliets), as I ll explai. The premise of dissemiatio To quatify the available locality of referece that could be exploited o the WWW, we collected extesive server traces from the HTTP server of Bosto Uiversity s Computer Sciece Departmet (http://cswww. bu.edu) ad from the HTTP server of the Rollig Stoes Web site (http://www.stoes.com/). Table summarizes these traces. Uless I state otherwise, this data drove our model validatio ad trace simulatios. Carlos Cuha, Mark Crovell, ad I have documeted the highly ueve popularity of various Web documets. 3 This study cofirmed the applicability of Zipf s law 7,8 to Web documets. Zipf s law origially applied to the relatioship betwee a word s popularity i terms of rak ad its frequecy of use (it has subsequetly bee applied to other examples of popularity i the social scieces.). It states that if you rak the popularity of words i a give text (ρ) by their frequecy of use (P), the P /ρ. This distributio is a parameterless hyperbolic distributio that is, ρ is raised to exactly, so that the thmostpopular documet is exactly twice as likely to be accessed as the 2thmostpopular documet. Our data shows that Zipf s law applies quite strogly to Web documets serviced by Web servers. Figure 2a demostrates this for all 2,8 documets accessed i data set (see Table ). The figure shows a loglog plot of the total umber of refereces (y axis) to each documet as a fuctio of the documet s rak i overall popularity (x axis). The tightess of the fit to a straight lie is strog (R 2 =.99), as is the lie s slope:.95 (show i the figure). So, the expoet relatig popularity to rak for Web documets is very early, as Zipf s law predicted. Out of some 2+ files available through the WWW server, the most popular 256 Kbyte block of documets (that is,.5% of all available Related work James Gwertzma ad Margo Seltzer s research is closest to ours. I particular, they propose geographical pushcachig. This method lets servers decide whe ad where to cache iformatio, based o geographical iformatio: the distace i miles betwee servers ad cliets. Their work assumes that the physical distace betwee two Iteret odes correlates with the umber of hops betwee these odes. Abdelsalam Heddaya ad Sulaima Mirdad attempt to achieve loadbalacig amog servers through automatic documet dissemiatio by addig a cachig compoet to routig protocols. 2 Such a approach requires the implemetatio of ew trasport protocols ad thus does ot apply to curret TCP/IPbased etworks. Refereces. J.S. Gwertzma ad M. Seltzer, The Case for Geographical PushCachig, Proc. Fifth Workshop o Hot Topics i Operatig Systems (HotOSV), IEEE Computer Society Press, Los Alamitos, Calif., 995, pp A. Heddaya ad S. Mirdad, Wave: WideArea Virtual Eviromet for Distributig Published Documets, Proc. SIGCOMM 95: Workshop o Middleware, Cambridge, Mass., 995; PapersNoTR/middleware95.html. Jauary March
5 5, Access frequecy (total umber of refereces) (a) Load reductio (%) (b) 2,, Documet rak 2 3 Mbytes dissemiated, Figure 2. The popularity of documets versus their rak (a), ad the potetial load reductio from dissemiatio (b). 4 5 documets) accouted for 69% of all requests. Oly % of all blocks accouted for 9% of all requests! This observatio leads to the followig questio: How much badwidth could be saved if requests for popular documets from outside the LAN were hadled at a earlier stage (for example, usig a proxy at the edge of the orgaizatio)? Figure 2b shows the percetage of the server load (measured i total bytes serviced) that would be saved if some other server serviced various block sizes, startig with the most popular blocks ad progressig to less popular blocks. We corroborated our observatios by aalyzig the HTTP logs of the Rollig Stoes server from November, 994, to February 9, 995. As Table shows, this server ulike the cswww.bu.edu HTTP server is iteded to serve remote cliets exclusively. It is a very popular server, servicig more tha Gbyte of multimedia iformatio per day to tes of thousads of (distict) cliets. (Durig the aalysis period, the server hadled,9,46,92 bytes per day, ad 6,46 cliets retrieved at least files.) Figure 3a shows the access frequecy for all documets that were serviced at least oce. Figure 3b shows the percetage of the remote badwidth that would be saved if some other server hadled various block sizes of decreasig popularity. Of the 4 MBytes of iformatio accessed at least oce (the total umber of bytes available from that server is much larger tha 4 MBytes), oly 2 MBytes (5.25%) were resposible for 85% of the traffic. A closer look at the logs of which is a typical server that caters primarily to local cliets, reveals three classes of documets. Figure 4 shows the ratio of remotetolocal (ad localtoremote) accesses for each of the 974 documets accessed at least oce. Of these documets, 99 had a access ratio larger tha 85% these are remotely popular documets; 5 documets had a access ratio smaller tha 5% these are locally popular documets; 365 documets had a access ratio betwee 85% ad 5% these are globally popular documets. Servers could easily classify documets ito these three classes (based o temporal ad geographical locality of referece), to decide which documets to dissemiate ad where to dissemiate them. A importat factor that could affect our dissemiatio protocol s performace is the rate at which popular documets are updated. The more frequetly these documets are updated, the more frequetly the home server must refresh the replicas at the proxies. I a related study, we moitored the date of last update of remotely, locally, ad globally popular documets for a sixmoth (26week) period. 9 The period started with the 56day period of Table, ad cotiued for aother 26 days. Both remotely popular ad globally popular documets were updated very ifrequetly (less tha.5% update probability per documet per day). Locally popular documets were updated more frequetly (about 2% update probability per documet per day). (We couted multiple updates to a documet i oe day 6 IEEE Cocurrecy
6 Access frequecy (total umber of refereces) (a) 22, 2, 8, 6, 4, 2,, 8, 6, 4, 2, File rak 8, Access ratio (%) Remotely popular documets 2 Globally popular documets 4 6 File rak Figure 4. Local versus remote popularity of documets. Local to remote Remote to local 8 Locally popular documets, 9 Server load reductio (%) (b) Documet size (Mbytes) Figure 3. Access frequecy (a), ad projected badwidth reductio (b), for the server. as oe update.) This result suggests that the documets most likely to be dissemiated are the oes least likely to chage. James Gwertzma ad Margo Seltzer have further ivestigated the sigificace of these fidigs, i the cotext of WWW cache cosistecy. System model ad aalysis 5 I our model, home servers (producers) dissemiate iformatio to service proxies (agets) closer to cliets (cosumers). We assume a maytomay mappig betwee home servers ad service proxies. A service proxy ad the set of home servers it represets form a cluster. We model the WWW as a hierarchy of such clusters. Our otio of a service proxy is similar to that of a cliet proxy, except that the service proxy acts o behalf of a cluster of servers rather tha a cluster of cliets. I practice, we evisio service proxies to be iformatio outlets that are available throughout the Iteret, ad whose badwidth could be, for example, reted. Alterately, service proxies could be public egies, part of a atioal computer iformatio ifrastructure, similar to the NSF backboe. For the remaider of this article, proxy meas a service proxy. Our model does ot limit the umber of proxies that could be used to frot ed a particular server. Each server i the system might belog to a umber of clusters, ad thus might have a umber of proxies actig o its behalf, thereby dissemiatig its documets alog multiple routes (or toward various subetworks). A server ca use (through biddig, for example) a subset of these proxies to dissemiate its data to cliets. For a give home server, we view the WWW clietele as a tree rooted at the server. The tree s leaves are the cliets, ad the iteral odes are the potetial proxies. I Performace evaluatio, I ll describe a efficiet techique for buildig such a tree. Furthermore, by aalyzig the access patters of cliets (as recorded i the server logs), we ca optimally locate the set of odes to use as proxies for that home server. (We did this for the home server, whose 26 week clietele tree cosisted of more tha 34, odes.) A RESOURCE ALLOCATION STRATEGY AT SERVICE PROXIES Let C = S, S, S 2,, S deote all the servers i a cluster, where server S is distiguished as the proxy of cluster C. Let R i deote the total umber of bytes per Jauary March 997 6
7 uit time (say, oe day) serviced by S i i C to cliets outside that cluster. Furthermore, let H i (b) deote the probability that a request for a documet o S i ca be serviced at proxy S as a result of dissemiatig the most popular b bytes from S i to S. Fially, let B i deote the umber of bytes that proxy S duplicates from server S i, ad let B deote the total storage space available at proxy S (that is, B = B + B B ). By iterceptig requests from outside the cluster, S should be able to service a fractio, α C, of these requests. α C = Ri () The objective of S is to allocate storage spaces B, B 2,, B to maximize the value of α C, subject to the costrait that B B + B B. This is a costraiedmaximum problem, which ca be solved usig the Lagrage multiplier theorem. Thus, the maximum for α C occurs whe for all j =, 2,,, δ α δb δ δb j j C = R H ( B ) i i i k, for a costat k Ri Hi( Bi) = k Ri Ri hj( Bj)= k R j (2) where h j (B j ) deotes the probability desity fuctio correspodig to H j (B j ). Equatio 2 uses the value of k (the Lagrage multiplier) to satisfy B = B + B B. Our desire to make our protocol useful restricts the type of assumptios we could make. Thus, i our protocol, we have avoided usig ay parameters that could ot be readily estimated from available logs of etwork protocols (for example, HTTP ad FTP). However, future work alog the same lies could use other iformatio to better tue the system. For example, if iformatio about the commuicatio cost betwee servers, proxies, ad cliets is available, our protocol could be easily adapted to weigh such kowledge ito our resourceallocatio methodology. ANALYSIS UNDER AN EXPONENTIAL POPULARITY MODEL We ll use a expoetial model to approximate the fuctio H i (b). We assume that for i =, 2,,, H b e i b i()= λ, where λ i is the distributio s costat. The probability desity fuctio to H i (b) is h i (b), where δ h b b H b e λi b i()= i()= λi δ (3) Give a server S j, where j, we substitute for h j (b) i Equatio 2 to get a value for B j. λ j R j j B j = log λ k Ri (4) Equatio 4 specifies a set of equatios to ratio the total bufferig space B available at S amog the servers S i, for i =, 2,,. To do so, we must fid the value of the costat k. We do this by observig the requiremet that B B + B B, which results i this expressio for k: ( λiri) λi λi k = B e Ri (5) Substitutig for k from Equatio 5 ito Equatio 4, we get the optimum storage capacity to allocate o S for a particular server S j, where j. These calculatios require that R i ad λ i be estimated, for i =, 2,,. This ca be easily ad efficietly computed from the server logs. Actually, Figure 2 was produced by a program that computed these parameters for cswww. bu.edu. Moreover, our measuremets suggest that these parameters are quite static, i that they chage oly slightly over time. Hece, the calculatio of R i ad λ i, as well as the allocatio of storage space o S for servers S i, for i =, 2,,, eed ot be doe frequetly. They could occur either off lie or periodically (for example, weekly). SPECIAL CASES To help explai our demadbased documetdissemiatio protocol, I ll preset three special cases. Equally effective duplicatio Let λ i = λ for i =, 2,,. That is, let s assume that the 62 IEEE Cocurrecy
8 reductio i badwidth that results from duplicatig some umber of bytes from a particular server S j is equal to the reductio i badwidth that results from duplicatig the same umber of bytes from ay other server S i for i =, 2,,. Substitutig i Equatio 5 ad the Equatio 4, we get B R j B j = + λ log Ri (6) Equatio 6 suggests that popular servers are allocated extra storage capacity o the proxy. This extra storage depeds o two factors: /λ, which is a measure of duplicatio effectiveess, ad log R j Ri which reflects a server s popularity relative to the geometric mea of all servers i the system. This dual depedecy o duplicatio effectiveess ad relative popularity gives us a hadle o how to exted our results for arbitrary distributios of H i (b). I particular, if the skewess of H i (b) ca be measured for a particular server (by aalyzig its logs, as suggested earlier), this measure ca be used istead of /λ. Equally popular servers Let R i = R for i =, 2,,. That is, let s assume that all servers i the system are equally popular. Substitutig i Equatio 5 ad the i Equatio 4, we get j λ Bj = B + log λ j λi λi λi (7) Equatio 7 suggests that servers whose data are accessed more uiformly (that is, servers with a smaller λ) should be allotted more storage capacity o the proxy as log as the proxy s total capacity is large eough (that is, B /λ i ). However, if the server s capacity is ot big eough, servers with itermediate values for λ should be favored. Figure 5 shows the optimal storage capacity to allocate to server S j for various values of λ j, assumig that all other servers have equal λ i ad that B = /λ i (tight) or B = /λ i (lax), for i ad i j. Storage capacity alloted (%) Symmetric clusters To appreciate the effectiveess of demadbased documet dissemiatio, let s cosider a symmetric cluster, where all servers have idetical values for R i ad λ i. From Equatio 5 ad Equatio 4, we get B As expected, Equatio 8 equally allocates storage o S for all the servers i the cluster. By substitutig the value of B j ito Equatio, we get α Figure 5. Storage allocatio for R i = R ad B = /λ i. λ j, the costat of the H j (b) distributio itroduced i Equatio 3, measures how skewed H j (b) is. j C 3 3 3, λ = log λ e = R H B λ B Lax resources B = λi N = 2 N = 4 N = 64 R R = e B = B λ R (9) Equatio 9 could be used to estimate the storage requiremets o the proxy as a fuctio α: B = log λ α C () Assume, for example, that the server is oe of servers whose most popular data are duplicated o a proxy. Equatio suggests that to reduce the remote badwidth by 9% o all servers, the proxy must secure 36 Mbytes to be divided equally amog all servers. This assumes a value of λ = , which we estimated from the HTTP demo logs o the cswww.bu.edu server. With a storage capacity of 5 λ j λ Tight resources B = λi N = 2 N = 4 N = 64 (8) Jauary March
9 Number of cliets,8,6,4,2, Hops from server to cliets 25 3 Bytes hops saved (%) 45 Most popular % of data 4 Most popular 4% of data Figure 6. How far away are cliets? (a) Number of proxies Mbytes, a proxy could shield servers from as much as 96% of their remote badwidth. These umbers raise a legitimate questio: If oe proxy serves 96% of all remote accesses to servers (or eve 9% of all accesses to servers), wo t that proxy be a performace bottleeck? Yes, uless the process of dissemiatig popular iformatio cotiues for aother level, ad so o. If that is ot possible, the aother solutio would be for the proxy to dyamically adjust the level of shieldig it provides for its costituet servers. I other words, whe the proxy becomes overloaded, B decreases, thus forcig more of the requests back to the servers. Performace evaluatio To evaluate our dissemiatio protocol, we had to devise a realistic clusterig of the Iteret to reflect our dissemiatio model. Several criteria could determie this clusterig. Oe criteria could be geographical iformatio (for example, distace i miles betwee cliets). Aother could be the istitutioal or etwork boudaries (for example, oe service proxy per istitutio or etwork). We based the clusterig o the structure of the routes betwee a server ad its cliets. Such a clusterig (based o the distace betwee the server ad cliet measured i actual route hops) lets us quatify the savigs achievable through dissemiatio, because it lets us measure the badwidth saved i bytes hops uits. Usig the record route optio of TCP/IP, it is possible to build a complete serverproxycliet tree. We built such a tree for all traceable cliets of cswww.bu.edu. A traceable cliet is a cliet for which traceroute() succeeded i idetifyig a route that remaied cosistet over several traceroute experimets. Almost 9% of all bytes trasferred from trasferred to traceable cliets. For Reductio i server load (%) (b) Number of proxies 4 5 Figure 7. The results of dissemiatio: reductios i (a) badwidth ad (b) home server load. data set 2 of Table, the costructed tree cosisted of more tha 8, odes. Figure 6 shows a histogram of the umber of cliets versus their distace i hops from the server. This histogram shows three distict cliet populatios. The first populatio is withi two to three hops ad represets ocampus cliets at Bosto Uiversity. The secod is withi four to ie hops ad represets cliets o the New Eglad Academic ad Research etwork (NEARet). The third is greater tha ie hops away ad represets WAN cliets. For servers where WAN cliets geerate most of the demad, Figure 6 suggests that popular documets could be dissemiated as much as eight to ie hops from the server. Such a dissemiatio would result i large savigs. Specifically, 64 IEEE Cocurrecy
10 6 5 5 weeks 3 weeks 2 weeks replicatig the most popular 25 Mbytes from stoes.com o proxies eight to ie hops closer to cliets would yield a whoppig daily etwork badwidth savigs of more tha 8 Gbytes hops. We simulated our dissemiatio strategy based o the structure of this servertocliet routig tree. (Aalysis of logs suggests that the tree s shape especially iteral odes ad the load distributio are quite static over time.) The 26week traces obtaied from (data set 2 i Table ) drove our simulatios. Our simulatios dissemiated replicas of the most popular files o every week, dow to proxies closer to the cliets. The locatio of such proxies depeded o the demad durig the previous moth from the various parts of the tree. (Our protocol proved to be quite robust with respect to the frequecy of dissemiatio ad the history legth used to establish file popularity profiles. Because of space limitatios, I have t icluded simulatio results to support this.) Our geeral dissemiatio protocol assumes a hierarchy of proxies. Our implemetatio of this idea was simpler; we flatteed the hierarchy by havig the home server use the access patters of all its cliets to compute the ultimate placemet for the replicas. I other words, our simulatios did ot cosider multilevel proxies, as would be possible uder the geeral dissemiatio model. We assumed that ay iteral ode is available as a service proxy. I a real system, this assumptio might ot be valid because iteral odes are routers, ulikely to be available as service proxies. We evisio that i practice, the set of proxies available for ret by a particular server could be matched up to the optimum places determied by our protocol. The simulatios required cliets to always request service from the home server. If the requested documets are available at proxies closer to the cliet, the home server forwards the request to the proxy that is closest to the cliet. This forwardig mechaism is supported by the HTTP protocol ad requires much less overhead tha the more geeral hierarchical ameresolutio strategy. 2 The simulatios also assumed that the cost of propagatig updates to proxies ca be igored. To validate this assumptio, cosider the ratio of a popular documet s umber of chages per uit time to its umber of accesses per uit time. As I oted earlier, popular documets are updated least. So, that ratio will be very small Bytes hops saved (%) Number of proxies 5 week Figure 8. The algorithm s sesitivity to the legth of the history that is used to compute the popularity profile. (we measured it at less tha. for popular documets). Thus, the iaccuracy of our simulatios (by ot takig ito accout the cost of propagatig updates) is isigificat. Figure 7a shows the reductio i badwidth (measured i bytes hops) that is achievable by dissemiatig differet amouts of the most popular data. The figure shows two curves. I the first, the system dissemiates the most popular % of the data; whereas i the secod, it dissemiates the most popular 4% of the data. A larger umber of proxies results i a deeper dissemiatio ad, thus, larger badwidth savigs. Figure 7b shows the reductio i home server load (measured i bytes) achievable by dissemiatig 4% of the most popular data. Figure 7a shows that the dissemiatio of a very small amout of data causes most of the badwidth savigs. For example, icreasig the level of dissemiatio by 5% (from 2 Mbytes per service proxy to 5 Mbytes per service proxy) results i a meager 6% additioal badwidth savigs. Also, most of the saved badwidth results from usig a small umber of service proxies. For example, triplig the umber of service proxies from 2 to 6 saves oly 2% additioal badwidth. Fially, Figure 7 shows that the payoff per replica decreases as the umber of replicas icreases. Figure 8 shows our algorithm s sesitivity to chages i the popularity profile for documets o the home server. I particular, it shows the performace results for four experimets, i which the dissemiatio occurred oce a week, usig the logs of the previous weeks to compute the popularity profile, for =, 2, 3, ad 5. For these experimets, the system dissemiated the % most popular data (computed over the week period). The figure shows that the loger the history Jauary March
11 that is used to compute the popularity profile, the better the performace, especially whe the umber of proxies is large. Our simulatio dissemiated the same data to all proxies. Better results are attaiable if the dissemiatio strategy takes greater advatage of the geographic locality of referece (by dissemiatig differet data to differet proxies based o the access patters of cliets served by each proxy). These experimets demostrate the potetial savigs from dissemiatio. More experimets are eeded to geeralize these fidigs by cosiderig traces from a larger set of servers. Also, more experimets are eeded to study the effects of dissemiatio from competig service proxies. Fially, the Web is evolvig i terms of types of iformatio ad how this iformatio is preseted (for example, the icreases i Commo Gate Iterface queries, Java applets, ad scripts). This evolutio might require the developmet of more elaborate dissemiatio protocols. clietbased protocols cocers dyamic server selectio, itroduced by Mark Crovella ad Robert Carter. 3 The primary purpose of a clietbased dyamic serverselectio protocol is to determie o the spot which oe of several servers to use to retrieve a documet replicated o all the servers. This selectio is based o dyamic measuremets that attempt to locate the server with the least cogested route. Such a techique could complemet dissemiatio by allowig servers to commuicate with cliets the locatio of all (or some) replicas that are deemed close to the cliet. The clietbased serverselectio protocol would have the fial decisio as to which replica to retrieve based o the dyamic coditios of the etwork. The protocol I ve preseted exemplifies the potetial of clietaccess patters i egieerig scaleable protocols ad services i largescale distributed iformatio systems (such as the WWW). Ocea is developig prototypes to test ad evaluate similar protocols. The basic premise of our work is that servers are i a uique positio to decide which documets are worth replicatig through dissemiatio, how may replicas should be dissemiated, ad where to place these replicas. This, however, does ot imply that cliets ad their proxies do ot (or should ot) play a role i improvig the performace of iformatio retrieval ad i alleviatig commuicatio bottleecks. For example, our work is ot a substitute for clietbased cachig; rather, it is a complemet. Clietbased cachig s primary purpose is to reduce the latecy of iformatio retrieval, whereas iformatio dissemiatio s primary purpose is to reduce traffic ad balace the load amog servers. Of course, these purposes are ot completely idepedet a protocol whose primary purpose is to reduce traffic ad balace load is likely to improve the latecy of iformatio retrieval. Nevertheless, the highest reductio i latecy will likely result from a protocol whose primary purpose is to do just that (clietbased cachig i this case). Aother example of how our work complemets ACKNOWLEDGMENTS I would like to thak all the members of the Oceas group (http:// cswww.bu.edu/groups/oceas) for their feedback ad may discussios o this work. I particularly thak Carlos Cuha for his tracecollectio efforts ad for helpig with the tracedrive simulatio of the dissemiatio protocol preseted i this article. This work has bee partially supported by NSF Grat CCR REFERENCES. M. Foster ad R. Jump, NSF Solicitatio 9475, STIS Database, Nat l Sciece Foudatio, Arligto, Va., A. Bestavros et al., Applicatio Level Documet Cachig i the Iteret, Proc. Secod It l Workshop o Services i Distributed ad Networked Eviromets (SDNE 95), IEEE Computer Society Press, Los Alamitos, Calif., 995, pp IEEE Cocurrecy
12 3. C. Cuha, A. Bestavros, ad M. Crovella, Characteristics of WWW ClietBased Traces, Tech. Report TR95, Computer Sciece Dept., Bosto Uiv., Bosto, S. Glassma, A Cachig Relay for the World Wide Web, Proc. First It l Cof. WWW, NorthHollad, Amsterdam, 994, pp M. Abrams et al., Cachig Proxies: Limitatios ad Potetials, Proc. Fourth It l Cof. WWW, O Reilly & Associates, Sebastopol, Calif., 995, pp S.J. Eggers ad R.H. Katz, A Characterisatio of Sharig i Parallel Programs ad Its Applicatio to Coherece Protocol Evaluatio, Proc. 5th A. It l Symp. Computer Architecture, IEEE CS Press, 988, pp J. Gwertzma ad M. Seltzer, World Wide Web Cache Cosistecy, Proc. 996 USENIX Tech. Cof., Useix Assoc., Berkeley, Calif., 996, pp J.S. Gwertzma ad M. Seltzer, The Case for Geographical PushCachig, Proc. Fifth Workshop o Hot Topics i Operatig Systems (HotOSV), IEEE CS Press, 995, pp P. Dazig, R. Hall, ad M. Schwartz, A Case for Cachig File Objects iside Iteretworks, Tech. Report CUCS , Computer Sciece Dept., Uiv. of Colorado, Boulder, Colo., M. Crovella ad R. Carter, Dyamic Server Selectio i the Iteret, Proc. Third IEEE Workshop o the Architecture ad Implemetatio of High Performace Commuicatio Subsystems (HPCS 95), IEEE Commuicatio Soc., New York, 995, pp B.B. Madelbrot, The Fractal Geometry of Nature, W.H. Freedma ad Co., New York, G.K. Zipf, Huma Behavior ad the Priciple of LeastEffort, AddisoWesley, Readig, Mass., A. Bestavros, Speculative Data Dissemiatio ad Service to Reduce Server Load, Network Traffic ad Service Time i Distributed Iformatio Systems, Proc. ICDE 96: 2th It l Cof. Data Eg., IEEE CS Press, 996, pp Azer Bestavros is o the faculty of Bosto Uiversity s Computer Sciece Departmet. His research maily cocers realtime systems ad distributed systems, ad is partially fuded by research grats from the Natioal Sciece Foudatio, the US Army Research Office, ad GTE. His work o largescale distributed iformatio systems is coducted with Bosto Uiversity s Oceas research group, which he cofouded i 994. He is the editor of the Newsletter of the IEEE Computer Society Techical Committee o RealTime Systems ad maitais its archives at Bosto Uiversity. He obtaied his SM ad PhD from Harvard Uiversity i 988 ad 992. He ca be reached at the Computer Sciece Dept., Bosto Uiv., MA 225; Jauary March
(VCP310) 18004186789
Maual VMware Lesso 1: Uderstadig the VMware Product Lie I this lesso, you will first lear what virtualizatio is. Next, you ll explore the products offered by VMware that provide virtualizatio services.
More informationCHAPTER 3 THE TIME VALUE OF MONEY
CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all
More information.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth
Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,
More informationIn nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008
I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces
More information1 Computing the Standard Deviation of Sample Means
Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.
More informationI. Chisquared Distributions
1 M 358K Supplemet to Chapter 23: CHISQUARED DISTRIBUTIONS, TDISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad tdistributios, we first eed to look at aother family of distributios, the chisquared distributios.
More informationModified Line Search Method for Global Optimization
Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationThroughput of Ideally Routed Wireless Ad Hoc Networks
Throughput of Ideally Routed Wireless Ad Hoc Networks Gábor Németh, Zoltá Richárd Turáyi, 2 ad Adrás Valkó 2 Commuicatio Networks Laboratory 2 Traffic Lab, Ericsso Research, Hugary I. INTRODUCTION At the
More informationNonlife insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring
Nolife isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy
More informationNPTEL STRUCTURAL RELIABILITY
NPTEL Course O STRUCTURAL RELIABILITY Module # 0 Lecture 1 Course Format: Web Istructor: Dr. Aruasis Chakraborty Departmet of Civil Egieerig Idia Istitute of Techology Guwahati 1. Lecture 01: Basic Statistics
More informationDomain 1: Designing a SQL Server Instance and a Database Solution
Maual SQL Server 2008 Desig, Optimize ad Maitai (70450) 18004186789 Domai 1: Desigig a SQL Server Istace ad a Database Solutio Desigig for CPU, Memory ad Storage Capacity Requiremets Whe desigig a
More informationEvaluating Model for B2C E commerce Enterprise Development Based on DEA
, pp.180184 http://dx.doi.org/10.14257/astl.2014.53.39 Evaluatig Model for B2C E commerce Eterprise Developmet Based o DEA Weli Geg, Jig Ta Computer ad iformatio egieerig Istitute, Harbi Uiversity of
More informationEkkehart Schlicht: Economic Surplus and Derived Demand
Ekkehart Schlicht: Ecoomic Surplus ad Derived Demad Muich Discussio Paper No. 200617 Departmet of Ecoomics Uiversity of Muich Volkswirtschaftliche Fakultät LudwigMaximiliasUiversität Müche Olie at http://epub.ub.uimueche.de/940/
More information*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.
Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.
More informationSoving Recurrence Relations
Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree
More informationTaking DCOP to the Real World: Efficient Complete Solutions for Distributed MultiEvent Scheduling
Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed MultiEvet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationAnalyzing Longitudinal Data from Complex Surveys Using SUDAAN
Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical
More informationBaan Service Master Data Management
Baa Service Master Data Maagemet Module Procedure UP069A US Documetiformatio Documet Documet code : UP069A US Documet group : User Documetatio Documet title : Master Data Maagemet Applicatio/Package :
More informationData Analysis and Statistical Behaviors of Stock Market Fluctuations
44 JOURNAL OF COMPUTERS, VOL. 3, NO. 0, OCTOBER 2008 Data Aalysis ad Statistical Behaviors of Stock Market Fluctuatios Ju Wag Departmet of Mathematics, Beijig Jiaotog Uiversity, Beijig 00044, Chia Email:
More informationOnline Banking. Internet of Things
Olie Bakig & The Iteret of Thigs Our icreasigly iteretcoected future will mea better bakig ad added security resposibilities for all of us. FROM DESKTOPS TO SMARTWATCHS Just a few years ago, Americas coducted
More informationConfiguring Additional Active Directory Server Roles
Maual Upgradig your MCSE o Server 2003 to Server 2008 (70649) 18004186789 Cofigurig Additioal Active Directory Server Roles Active Directory Lightweight Directory Services Backgroud ad Cofiguratio
More informationCase Study. Normal and t Distributions. Density Plot. Normal Distributions
Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca
More informationHow to read A Mutual Fund shareholder report
Ivestor BulletI How to read A Mutual Fud shareholder report The SEC s Office of Ivestor Educatio ad Advocacy is issuig this Ivestor Bulleti to educate idividual ivestors about mutual fud shareholder reports.
More informationAQA STATISTICS 1 REVISION NOTES
AQA STATISTICS 1 REVISION NOTES AVERAGES AND MEASURES OF SPREAD www.mathsbox.org.uk Mode : the most commo or most popular data value the oly average that ca be used for qualitative data ot suitable if
More informationSystems Design Project: Indoor Location of Wireless Devices
Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 6985295 Email: bcm1@cec.wustl.edu Supervised
More informationINVESTMENT PERFORMANCE COUNCIL (IPC)
INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks
More informationCapacity of Wireless Networks with Heterogeneous Traffic
Capacity of Wireless Networks with Heterogeeous Traffic Migyue Ji, Zheg Wag, Hamid R. Sadjadpour, J.J. GarciaLuaAceves Departmet of Electrical Egieerig ad Computer Egieerig Uiversity of Califoria, Sata
More informationGrade 7. Strand: Number Specific Learning Outcomes It is expected that students will:
Strad: Number Specific Learig Outcomes It is expected that studets will: 7.N.1. Determie ad explai why a umber is divisible by 2, 3, 4, 5, 6, 8, 9, or 10, ad why a umber caot be divided by 0. [C, R] [C]
More informationVladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT
Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee
More informationCHAPTER 7: Central Limit Theorem: CLT for Averages (Means)
CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:
More informationDomain 1  Describe Cisco VoIP Implementations
Maual ONT (6428) 18004186789 Domai 1  Describe Cisco VoIP Implemetatios Advatages of VoIP Over Traditioal Switches Voice over IP etworks have may advatages over traditioal circuit switched voice etworks.
More informationQuadrat Sampling in Population Ecology
Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may
More informationODBC. Getting Started With Sage Timberline Office ODBC
ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.
More informationwhere: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return
EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The
More informationThe Forgotten Middle. research readiness results. Executive Summary
The Forgotte Middle Esurig that All Studets Are o Target for College ad Career Readiess before High School Executive Summary Today, college readiess also meas career readiess. While ot every high school
More informationThe analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection
The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity
More informationCHAPTER 3 DIGITAL CODING OF SIGNALS
CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity
More informationINVESTMENT PERFORMANCE COUNCIL (IPC) Guidance Statement on Calculation Methodology
Adoptio Date: 4 March 2004 Effective Date: 1 Jue 2004 Retroactive Applicatio: No Public Commet Period: Aug Nov 2002 INVESTMENT PERFORMANCE COUNCIL (IPC) Preface Guidace Statemet o Calculatio Methodology
More informationFM4 CREDIT AND BORROWING
FM4 CREDIT AND BORROWING Whe you purchase big ticket items such as cars, boats, televisios ad the like, retailers ad fiacial istitutios have various terms ad coditios that are implemeted for the cosumer
More informationLesson 17 Pearson s Correlation Coefficient
Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) types of data scatter plots measure of directio measure of stregth Computatio covariatio of X ad Y uique variatio i X ad Y measurig
More informationEnhancing Oracle Business Intelligence with cubus EV How users of Oracle BI on Essbase cubes can benefit from cubus outperform EV Analytics (cubus EV)
Ehacig Oracle Busiess Itelligece with cubus EV How users of Oracle BI o Essbase cubes ca beefit from cubus outperform EV Aalytics (cubus EV) CONTENT 01 cubus EV as a ehacemet to Oracle BI o Essbase 02
More informationBasic Elements of Arithmetic Sequences and Series
MA40S PRECALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic
More informationLECTURE 13: Crossvalidation
LECTURE 3: Crossvalidatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Threeway data partitioi Itroductio to Patter Aalysis Ricardo GutierrezOsua Texas A&M
More informationAmendments to employer debt Regulations
March 2008 Pesios Legal Alert Amedmets to employer debt Regulatios The Govermet has at last issued Regulatios which will amed the law as to employer debts uder s75 Pesios Act 1995. The amedig Regulatios
More informationEstimating Probability Distributions by Observing Betting Practices
5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,
More informationSequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
More informationMultiserver Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu
Multiserver Optimal Badwidth Moitorig for QoS based Multimedia Delivery Aup Basu, Iree Cheg ad Yizhe Yu Departmet of Computig Sciece U. of Alberta Architecture Applicatio Layer Request receptio coectio
More informationSTUDENTS PARTICIPATION IN ONLINE LEARNING IN BUSINESS COURSES AT UNIVERSITAS TERBUKA, INDONESIA. Maya Maria, Universitas Terbuka, Indonesia
STUDENTS PARTICIPATION IN ONLINE LEARNING IN BUSINESS COURSES AT UNIVERSITAS TERBUKA, INDONESIA Maya Maria, Uiversitas Terbuka, Idoesia Coauthor: Amiuddi Zuhairi, Uiversitas Terbuka, Idoesia Kuria Edah
More informationPROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics
More informationRecursion and Recurrences
Chapter 5 Recursio ad Recurreces 5.1 Growth Rates of Solutios to Recurreces Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer. Cosider, for example,
More informationTIEE Teaching Issues and Experiments in Ecology  Volume 1, January 2004
TIEE Teachig Issues ad Experimets i Ecology  Volume 1, Jauary 2004 EXPERIMENTS Evirometal Correlates of Leaf Stomata Desity Bruce W. Grat ad Itzick Vatick Biology, Wideer Uiversity, Chester PA, 19013
More informationOptimize your Network. In the Courier, Express and Parcel market ADDING CREDIBILITY
Optimize your Network I the Courier, Express ad Parcel market ADDING CREDIBILITY Meetig today s challeges ad tomorrow s demads Aswers to your key etwork challeges ORTEC kows the highly competitive Courier,
More informationDescriptive statistics deals with the description or simple analysis of population or sample data.
Descriptive statistics Some basic cocepts A populatio is a fiite or ifiite collectio of idividuals or objects. Ofte it is impossible or impractical to get data o all the members of the populatio ad a small
More informationChapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions
Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig
More informationPredictive Modeling Data. in the ACT Electronic Student Record
Predictive Modelig Data i the ACT Electroic Studet Record overview Predictive Modelig Data Added to the ACT Electroic Studet Record With the release of studet records i September 2012, predictive modelig
More informationInstitute of Actuaries of India Subject CT1 Financial Mathematics
Istitute of Actuaries of Idia Subject CT1 Fiacial Mathematics For 2014 Examiatios Subject CT1 Fiacial Mathematics Core Techical Aim The aim of the Fiacial Mathematics subject is to provide a groudig i
More informationChapter 7 Methods of Finding Estimators
Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of
More informationhp calculators HP 12C Statistics  average and standard deviation Average and standard deviation concepts HP12C average and standard deviation
HP 1C Statistics  average ad stadard deviatio Average ad stadard deviatio cocepts HP1C average ad stadard deviatio Practice calculatig averages ad stadard deviatios with oe or two variables HP 1C Statistics
More informationDepartment of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS200609 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
More informationConvention Paper 6764
Audio Egieerig Society Covetio Paper 6764 Preseted at the 10th Covetio 006 May 0 3 Paris, Frace This covetio paper has bee reproduced from the author's advace mauscript, without editig, correctios, or
More information5 Boolean Decision Trees (February 11)
5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected
More informationTrading the randomness  Designing an optimal trading strategy under a drifted random walk price model
Tradig the radomess  Desigig a optimal tradig strategy uder a drifted radom walk price model Yuao Wu Math 20 Project Paper Professor Zachary Hamaker Abstract: I this paper the author iteds to explore
More informationStat 104 Lecture 2. Variables and their distributions. DJIA: monthly % change, 2000 to Finding the center of a distribution. Median.
Stat 04 Lecture Statistics 04 Lecture (IPS. &.) Outlie for today Variables ad their distributios Fidig the ceter Measurig the spread Effects of a liear trasformatio Variables ad their distributios Variable:
More informationOn the Capacity of Hybrid Wireless Networks
O the Capacity of Hybrid ireless Networks Beyua Liu,ZheLiu +,DoTowsley Departmet of Computer Sciece Uiversity of Massachusetts Amherst, MA 0002 + IBM T.J. atso Research Ceter P.O. Box 704 Yorktow Heights,
More informationModule 4: Mathematical Induction
Module 4: Mathematical Iductio Theme 1: Priciple of Mathematical Iductio Mathematical iductio is used to prove statemets about atural umbers. As studets may remember, we ca write such a statemet as a predicate
More informationSPC for Software Reliability: Imperfect Software Debugging Model
IJCSI Iteratioal Joural of Computer Sciece Issues, Vol. 8, Issue 3, o., May 0 ISS (Olie: 694084 www.ijcsi.org 9 SPC for Software Reliability: Imperfect Software Debuggig Model Dr. Satya Prasad Ravi,.Supriya
More informationSecurity Functions and Purposes of Network Devices and Technologies (SY0301) 18004186789. Firewalls. Audiobooks
Maual Security+ Domai 1 Network Security Every etwork is uique, ad architecturally defied physically by its equipmet ad coectios, ad logically through the applicatios, services, ad idustries it serves.
More informationDomain 1: Configuring Domain Name System (DNS) for Active Directory
Maual Widows Domai 1: Cofigurig Domai Name System (DNS) for Active Directory Cofigure zoes I Domai Name System (DNS), a DNS amespace ca be divided ito zoes. The zoes store ame iformatio about oe or more
More informationProfessional Networking
Professioal Networkig 1. Lear from people who ve bee where you are. Oe of your best resources for etworkig is alumi from your school. They ve take the classes you have take, they have bee o the job market
More informationleasing Solutions We make your Business our Business
if you d like to discover how Bp paribas leasig Solutios Ca help you to achieve your goals please get i touch leasig Solutios We make your Busiess our Busiess We look forward to hearig from you you ca
More informationExample 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).
BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook  Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly
More informationReview: Classification Outline
Data Miig CS 341, Sprig 2007 Decisio Trees Neural etworks Review: Lecture 6: Classificatio issues, regressio, bayesia classificatio Pretice Hall 2 Data Miig Core Techiques Classificatio Clusterig Associatio
More informationMeasures of Spread and Boxplots Discrete Math, Section 9.4
Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,
More informationExample Consider the following set of data, showing the number of times a sample of 5 students check their per day:
Sectio 82: Measures of cetral tedecy Whe thikig about questios such as: how may calories do I eat per day? or how much time do I sped talkig per day?, we quickly realize that the aswer will vary from day
More information1 Correlation and Regression Analysis
1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio
More information7. Sample Covariance and Correlation
1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y
More informationReliability Analysis in HPC clusters
Reliability Aalysis i HPC clusters Narasimha Raju, Gottumukkala, Yuda Liu, Chokchai Box Leagsuksu 1, Raja Nassar, Stephe Scott 2 College of Egieerig & Sciece, Louisiaa ech Uiversity Oak Ridge Natioal Lab
More informationGregory Carey, 1998 Linear Transformations & Composites  1. Linear Transformations and Linear Composites
Gregory Carey, 1998 Liear Trasformatios & Composites  1 Liear Trasformatios ad Liear Composites I Liear Trasformatios of Variables Meas ad Stadard Deviatios of Liear Trasformatios A liear trasformatio
More information9.8: THE POWER OF A TEST
9.8: The Power of a Test CD91 9.8: THE POWER OF A TEST I the iitial discussio of statistical hypothesis testig, the two types of risks that are take whe decisios are made about populatio parameters based
More informationChapter 6: CPU Scheduling. Previous Lectures. Basic Concepts. Histogram of CPUburst Times. CPU Scheduler. Alternating Sequence of CPU And I/O Bursts
Multithreadig Memory Layout Kerel vs User threads Represetatio i OS Previous Lectures Differece betwee thread ad process Thread schedulig Mappig betwee user ad kerel threads Multithreadig i Java Thread
More informationA Resource for Freestanding Mathematics Qualifications Working with %
Ca you aswer these questios? A savigs accout gives % iterest per aum.. If 000 is ivested i this accout, how much will be i the accout at the ed of years? A ew car costs 16 000 ad its value falls by 1%
More informationArithmetic Sequences and Partial Sums. Arithmetic Sequences. Definition of Arithmetic Sequence. Example 1. 7, 11, 15, 19,..., 4n 3,...
3330_090.qxd 1/5/05 11:9 AM Page 653 Sectio 9. Arithmetic Sequeces ad Partial Sums 653 9. Arithmetic Sequeces ad Partial Sums What you should lear Recogize,write, ad fid the th terms of arithmetic sequeces.
More informationPENSION ANNUITY. Policy Conditions Document reference: PPAS1(7) This is an important document. Please keep it in a safe place.
PENSION ANNUITY Policy Coditios Documet referece: PPAS1(7) This is a importat documet. Please keep it i a safe place. Pesio Auity Policy Coditios Welcome to LV=, ad thak you for choosig our Pesio Auity.
More informationHow to use what you OWN to reduce what you OWE
How to use what you OWN to reduce what you OWE Maulife Oe A Overview Most Caadias maage their fiaces by doig two thigs: 1. Depositig their icome ad other shortterm assets ito chequig ad savigs accouts.
More informationTo c o m p e t e in t o d a y s r e t a i l e n v i r o n m e n t, y o u n e e d a s i n g l e,
Busiess Itelligece Software for Retail To c o m p e t e i t o d a y s r e t a i l e v i r o m e t, y o u e e d a s i g l e, comprehesive view of your busiess. You have to tur the decisiomakig of your
More informationChapter 6: Variance, the law of large numbers and the MonteCarlo method
Chapter 6: Variace, the law of large umbers ad the MoteCarlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value
More informationGCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.
GCSE STATISTICS You should kow: 1) How to draw a frequecy diagram: e.g. NUMBER TALLY FREQUENCY 1 3 5 ) How to draw a bar chart, a pictogram, ad a pie chart. 3) How to use averages: a) Mea  add up all
More informationFor customers Key features of the Guaranteed Pension Annuity
For customers Key features of the Guarateed Pesio Auity The Fiacial Coduct Authority is a fiacial services regulator. It requires us, Aego, to give you this importat iformatio to help you to decide whether
More informationCREATIVE MARKETING PROJECT 2016
CREATIVE MARKETING PROJECT 2016 The Creative Marketig Project is a chapter project that develops i chapter members a aalytical ad creative approach to the marketig process, actively egages chapter members
More informationProject Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments
Project Deliverables CS 361, Lecture 28 Jared Saia Uiversity of New Mexico Each Group should tur i oe group project cosistig of: About 612 pages of text (ca be loger with appedix) 612 figures (please
More informationJune 3, 1999. Voice over IP
Jue 3, 1999 Voice over IP This applicatio ote discusses the Hypercom solutio for providig edtoed Iteret protocol (IP) coectivity i a ew or existig Hypercom Hybrid Trasport Mechaism (HTM) etwork, reducig
More informationInformation for Programs Seeking Initial Accreditation
Iformatio for Programs Seekig Iitial Accreditatio Aswers to Frequetly AskedQuestios (from www.abet.org/ewtoaccreditatio/) Assurig Quality l Stimulatig Iovatio This documet iteds to aswer may of the
More informationVolatility of rates of return on the example of wheat futures. Sławomir Juszczyk. Rafał Balina
Overcomig the Crisis: Ecoomic ad Fiacial Developmets i Asia ad Europe Edited by Štefa Bojec, Josef C. Brada, ad Masaaki Kuboiwa http://www.hippocampus.si/isbn/9789616832328/cotets.pdf Volatility of
More informationAutomatic Tuning for FOREX Trading System Using Fuzzy Time Series
utomatic Tuig for FOREX Tradig System Usig Fuzzy Time Series Kraimo Maeesilp ad Pitihate Soorasa bstract Efficiecy of the automatic currecy tradig system is time depedet due to usig fixed parameters which
More informationThe Fundamental CapacityDelay Tradeoff in Large Mobile Ad Hoc Networks
The Fudametal CapacityDelay Tradeoff i Large Mobile Ad Hoc Networks Xiaoju Li ad Ness B. Shroff School of Electrical ad Computer Egieerig, Purdue Uiversity West Lafayette, IN 47907, U.S.A. {lix, shroff}@ec.purdue.edu
More informationRecovery time guaranteed heuristic routing for improving computation complexity in survivable WDM networks
Computer Commuicatios 30 (2007) 1331 1336 wwwelseviercom/locate/comcom Recovery time guarateed heuristic routig for improvig computatio complexity i survivable WDM etworks Lei Guo * College of Iformatio
More informationClustering Algorithm Analysis of Web Users with Dissimilarity and SOM Neural Networks
JONAL OF SOFTWARE, VOL. 7, NO., NOVEMBER 533 Clusterig Algorithm Aalysis of Web Users with Dissimilarity ad SOM Neal Networks Xiao Qiag School of Ecoomics ad maagemet, Lazhou Jiaotog Uiversity, Lazhou;
More informationLecture 2: Karger s Min Cut Algorithm
priceto uiv. F 3 cos 5: Advaced Algorithm Desig Lecture : Karger s Mi Cut Algorithm Lecturer: Sajeev Arora Scribe:Sajeev Today s topic is simple but gorgeous: Karger s mi cut algorithm ad its extesio.
More information