Design and Analysis of Benchmarking Experiments for Distributed Internet Services

Size: px
Start display at page:

Download "Design and Analysis of Benchmarking Experiments for Distributed Internet Services"

Transcription

1 Desgn and Analyss of Benchmarkng Experments for Dstrbuted Internet Servces Eytan Bakshy Facebook Menlo Park, CA Etan Frachtenberg Facebook Menlo Park, CA ABSTRACT The successful development and deployment of large-scale Internet servces depend crtcally on performance. Even small regressons n processng tme can translate drectly nto sgnfcant energy and user experence costs. Despte the wdespread use of dstrbuted server nfrastructure (e.g., n cloud computng and Web servces), there s lttle research on how to benchmark such systems to obtan vald and precse nferences wth mnmal data collecton costs. Correctly A/B testng dstrbuted Internet servces can be surprsngly dffcult because nterdependences between user requests (e.g., for search results, socal meda streams, photos) and host servers volate assumptons requred by standard statstcal tests. We develop statstcal models of dstrbuted Internet servce performance based on data from Perflab, a producton system used at Facebook whch vets thousands of changes to the company s codebase each day. We show how these models can be used to understand the tradeoffs between dfferent benchmarkng routnes, and what factors must be taken nto account when performng statstcal tests. Usng smulatons and emprcal data from Perflab, we valdate our theoretcal results, and provde easy-to-mplement gudelnes for desgnng and analyzng such benchmarks. Keywords A/B testng; benchmarkng; cloud computng; crossed random effects; bootstrappng Categores and Subject Descrptors G.3 [Probablty and Statstcs]: Expermental Desgn 1. INTRODUCTION Benchmarkng experments are used extensvely at Facebook and other Internet companes to detect performance regressons and bugs, as well as to optmze the performance of exstng nfrastructure and backend servces. They often nvolve testng varous code paths wth one or more users on multple hosts. Ths ntroduces multple sources of varablty: each user request may nvolve dfferent amounts of computaton because results are dynamcally customzed to ther own data. The amount of data vares from user to user and often follows heavy-taled dstrbutons for quanttes such as frend count, feed stores, and search matches [2]. Furthermore, dfferent users and requests engage dfferent code paths based on ther data, and can affected dfferently by just-ntme (JIT) complaton and cachng. User-based benchmarks therefore need to take nto account a good mx of both servce endponts and users makng the requests, n order to capture the broad range of performance ssues that may arse n producton settngs. An addtonal source of varablty comes from utlzng multple hosts, as s common among cloud and Web-based servces. Testng n a dstrbuted envronment s often necessary due to engneerng or tme constrants, and can even mprove the representatveness of the overall performance data, because the producton envronment s also dstrbuted. But each host can have ts own performance characterstcs, and even these vary over tme. The performance of computer archtectures has grown ncreasngly nondetermnstc over the years, and performance nnovatons often come at a cost of lower predctablty. Examples nclude: dynamc frequency scalng and sleep states; adaptve cachng, branch predcton, and memory prefetchng; and modular, ndvdual hardware components comprsng the system wth ther own varance [5, 16]. Our research addresses challenges nherent to any benchmarkng system whch tests user traffc on a dstrbuted set of hosts. Motvated by problems encountered n the development Perflab [14], an automated system responsble for A/B testng hundreds of code changes each day at Facebook, we develop statstcal models of user-based, dstrbuted benchmarkng experments. These models help us address real-lfe challenges that arse n a producton benchmarkng system, ncludng hgh varablty n performance tests, and hgh rates of false postves n detectng performance regressons. Our paper s organzed as follows. Usng data from our producton benchmarkng system, we motvate the use of statstcal concepts, ncludng two-stage samplng, dependence, and uncertanty (Secton 3). Wth ths framework, we develop a statstcal model that allows us to explore the consequences of how known sources of varaton requests and hosts a hypothetcal benchmark. We then use these models to formally evaluate the precson of dfferent expermental desgns of dstrbuted benchmarks (Secton 4). These varance expressons for each desgn also nforms how statstcal tests should be computed. Fnally, we propose and evaluate a bootstrappng procedure n Secton 5 whch shows good performance both n our smulatons and producton tests. 2. PROBLEM DESCRIPTION We often want to make changes to software a new user nterface, a feature, a rankng model, a retreval system, or vrtual host and need to know how each change affects performance metrcs such as computaton tme or network load. Ths goal s typcally accomplshed by drectng a subset of user traffc to dfferent ma- 1

2 chnes (hosts) runnng dfferent versons of software and comparng ther performance. Ths scheme poses a number of challenges that affect our ablty to detect statstcally sgnfcant dfferences, both wth respect to Type I errors a false postve where two versons are deemed to have dfferent performance when n fact they don t, and Type II errors an nablty to detect sgnfcant dfferences where they do exst. These problems can result n fnancal loss, wasted engneerng tme, and msallocaton of resources. As mentoned earler, A/B testng s challengng both from the perspectve of reducng the varablty n results, and from the perspectve of statstcal nference. Performance can vary sgnfcantly due to many factors, ncludng characterstcs of user requests, machnes, and complaton. Experments not desgned to account for these sources of varaton can obscure performance reversons. Furthermore, methods for computng confdence ntervals for userbased benchmarkng experments whch do not take nto account these sources of varaton can result n hgh false-postve rates. 2.1 Desgn goals When desgnng an experment to detect performance reversons, we strve to accomplsh three man goals: (1) Representatveness: we wsh to produce performance estmates that are vald (reflectng performance for the producton use case), take samples from a representatve subset of test cases, and generalze well to the test cases that weren t sampled; (2) Precson: performance estmates should be fne-graned enough to dscern even small effects, per our tolerance lmts; (3) Economy: estmates should be obtaned wth mnmal cost (measured n wall tme), subject to resource constrants. In short, we wsh to desgn a representatve experment that mnmzes both the number of observatons and the varance of the results. To smplfy, we consder a sngle endpont or servce, and use the terms request, query, and user nterchangeably Testng envronment To motvate our model, as well as valdate our results emprcally, we dscuss Facebook s n-house benchmarkng system, Perflab [14]. Perflab can assess the performance mpact of new code wthout actually nstallng t on the servers used by real users. Ths enables developers to use perflab as part of ther testng sequence even before they commt code. Moreover, perflab s used to automatcally check all code commtted nto the code repostory, to uncover both logc and performance problems. Even small performance ssues need to be montored and corrected contnuously, because f they are left to accumulate they can quckly lead to capacty problems and unnecessary expendture on addtonal nfrastructure. Problems uncovered by perflab or other tests that cannot be resolved wthn a short tme may cause the code revson to be removed from the push and delayed to a subsequent push, after the problems are resolved. A Perflab experment s started by defnng two versons of the software/envronment (whch can be dentcal for an A/A test ). Perflab reserves a set of hosts to run ether verson, and prepares them for a batch of requests by restartng them and flushng the caches of the backend servces they rely on (whch are coped and solated from the producton envronment, to ensure Perflab has no sde effects). The workload s selected as a representatve subset of endponts and users from a prevous sample of lve traffc. It s replayed repeatedly and concurrently at the systems under tests, whch queue up the requests so that the system s under a relatvely constant, near-saturaton load, to represent a realstcally hgh producton load. Ths contnues untl we get the requred number 1 We dscuss how our approach can be generalzed to multple endponts n Secton 7. of observatons, but we dscard a number of ntal observatons ( warm-up perod) and fnal observatons ( cool-down perod). Ths range was determned emprcally by analyzng the autocorrelaton functon [13] for ndvdual requests as a functon of repetton number. Perflab then resets the hosts, swaps software versons so that hosts prevously runnng verson A now run verson B and vce-versa, and reruns the experment to collect more observatons. For each observaton, we collect all the metrcs of nterest, whch nclude: CPU tme and nstructons; database/key-value fetches and bandwdth; memory use; HTML sze; and varous others. These metrcs detect not only performance changes, as n CPU tme and nstructons, but can also detect bugs. For example, a sharp drop n the number of database fetches (for a gven user and endpont) between two software versons s more often than not an undesred sde effect and an ndcaton of a faulty code path, rather than a mraculous performance mprovement, snce the underlyng user data hasn t changed between versons. 3. STATISTICS OF USER-BASED BENCHMARKS In order to understand how best to desgn user-based benchmarks, we must frst consder how one capture (sample) metrcs over a populaton of requests. We begn by dscussng the relatonshp between samplng theory and data collecton for benchmarks. Ths leads us to confront sources of varablty frst from requests, and then from hosts. We conclude ths secton wth a very general model that brngs these aspects together, and serves as the bass for the formal analyss of expermental desgns for dstrbuted benchmarks nvolvng user traffc. 3.1 Samplng Two-stage samplng of user requests We wsh to estmate the average performance of a gven verson of software, and do so by sendng requests (e.g., loadng a partcular endpont for varous users) to a servce, and measurng the sample average ȳ, for some outcome y for each observaton. Our certanty about the true average value Ȳ of the complete dstrbuton Y ncreases wth the number of observatons. If each observatons of Y s ndependent and dentcally dstrbuted (d), then the standard error of for our estmate of Ȳ s SEd = s/ N, where N s the number of observatons, and s s the sample standard devaton of our observed y s. The standard error s the standard devaton of our estmate of Ȳ, and we can use t to obtan the confdence nterval for Ȳ at sgnfcance level α by computng ȳ ± Φ 1 (1 α/2)se, where Φ 1 s the nverse cumulatve dstrbuton of the standard normal. For example, the 95% confdence nterval for Ȳ can be computed as ȳ ± 1.96SE. Often tmes data collected from benchmarks nvolvng user traffc s not d. For example, f we repeat requests for users multple tmes, the observatons may be clustered along certan values. Fgure 1 shows how ths knd of clusterng occurs wth respect to CPU tme 2 for requests fulflled to a heavly traffcked endpont at Facebook. Here, we see 300 repettons for 4, 16, and 256 dfferent requests (recall that by dfferent requests, we are referrng to dfferent users for a fxed endpont). CPU tme s dstrbuted approxmately normally for any gven request, but the amount of varablty wthn 2 Because relatve performance s easer to reason about than absolute performance n ths context, all outcomes n ths paper are standardzed, so that for any partcular benchmark for an endpont, the grand mean s subtracted from the outcome and then dvded by the standard devaton. 2

3 count CPU tme Fgure 1: Clusterng n the dstrbuton of CPU tmes for requests. Panels correspond to dfferent numbers of requests. Requests are repeated 300 tmes. standard error repettons requests Fgure 2: Standard error for clustered data (e.g., requests) as a functon of the number of requests and repettons for dfferent ntraclass correlaton coeffcents, ρ. Multple repeated observatons of the same request have lttle effect on the SE when ρ s large. a sngle request s much smaller than the overall varablty between requests. When hundreds of dfferent requests are mxed n, n fact, t s dffcult to tell that ths aggregate dstrbuton s really the mxture of many, approxmately normally dstrbuted clusters. If observatons are clustered n some way (as n Fgure 1), the effectve sample sze could be much smaller than the total number of observatons. Measurements for the same request tend to be correlated, such that e.g., the executon tme for requests for the same user are more smlar to one another than requests related to dfferent users. Ths s captured by the ntra-class correlaton coeffcent (ICC) ρ, whch s the rato of between-cluster varaton to total varaton [28]: ρ = σα 2 + σε 2 Where σ α s the standard devaton of the request effect and σ ε s the standard devaton of the measurement nose. If ρ s close to 1, then repeated observatons for the same request are nearly dentcal, and when t s close to 0, there s lttle relaton between requests. It s easy to see that f most of the varablty occurs between clusters, rather than wthn clusters (hgh ρ), addtonal repeated measurements for the same cluster do not help wth obtanng a good σ2 α ρ requests 32 requests 256 requests estmate of the mean of Y, Ȳ. Ths dea s captured by the desgn effect [10], d eff = (1 + (T 1)ρ), whch s the rato of the varance of the clustered sample (e.g., multple samples from the same request) to the smple random sample (e.g., random samples of multple ndependent requests), where T s the number of tmes each request s repeated. Under samplng desgns where we may choose both the number of requests R, and repettons T, the standard error s nstead: 1 SE clust = σ/ (1 + (T 1)ρ) (1) RT Such that addtonal repettons only reduce the standard error of our estmate f ρ s small. Fgure 2 shows the relatonshp between the standard error and R for small and large vales of ρ. For most of the top endponts benchmarked at Facebook ncludng news feed loads and search queres nearly all endponts have values of ρ between 0.8 and In other words, from our experence wth user traffc, mere repetton of requests s not an effectve means of ncreasng precson when the ntraclass correlaton, ρ, s large Samplng on a budget Explorng the tradeoffs between collectng more requests vs. more repettons needs to take nto account not only the value of ρ but also the costs of swtchng requests. For example, t s often necessary to frst warm up a vrtual host or cache to reduce temporal effects (e.g., seral autocorrelaton) [16] to get accurate estmates for an effect. That s, the fxed cost requred to sample a request s often much greater than t s to gan addtonal repettons. In stuatons n whch ρ s smaller, t may be useful to consder larger number of repettons per request. There s also an addtonal fxed cost S f to set up an experment (or a batch ), whch ncludes resettng the hardware and software as necessary and watng for the runtme systems to warm up. So, f the fxed cost for benchmarkng a sngle request s C f, the margnal cost for an addtonal repetton of a request s C m, and an avalable budget, B, one could mnmze Eq. 1 subject to the constrant that S f + R(C f + T C m) B. 3.2 Models of dstrbuted benchmarks Whle benchmarkng on a sngle host s smple enough, we are often constraned to run benchmarks n a short perod of tme, whch can be dffcult to accomplsh on only one host. Facebook, for example, runs automated performance tests on every sngle code commt to ts man ste, thousands of tmes a day [14]. It s therefore mperatve to conduct benchmarks n parallel on multple hosts. Usng multple hosts also has the beneft of surfacng performance ssues that may affect one host and not another, for further nvestgaton. 3 But multple hosts also ntroduce new sources of varablty, whch we model statstcally n the followng sectons. To motvate and llustrate these models, let us look at an example of emprcal data from a Perflab A/A benchmark (Fgure 3). Two requests (correspondng to servce queres for two dstnct users under the same software verson) are repeated across two dfferent machnes n two batches, runnng thrty tmes on each. For each batch, the software on the hosts was restarted, and caches and JIT envronments were warmed up. The observatons correspond to CPU tme measured for an endpont runnng under a PHP vrtual machne. Note that each repetton of the same request on the same machne n the same batch s relatvely smlar and has few 3 Such nvestgaton sometmes leads to dentfyng faulty systems n the benchmarkng envronment, but can also reveal unntended consequences of the software nteractng wth dfferent subsystems. 3

4 CPU tme batch 0 batch repetton host.d request Fgure 3: CPU tme for two requests executed over 30 repettons, executed on two hosts over two dfferent batches. Each shape represents a dfferent request, and each color represents a dfferent host. Panels correspond to dfferent batches. Lnes represent a model ft based on Eq. 3. tme trends. Furthermore, the between-request varablty (crcle vs. square shapes) tends to be greater than the per-machne varablty (e.g., the green vs purple dots). Between-batch effects appear to be smlar n sze to the nose Random effects formulaton An observaton from a benchmark can be thought of as beng generated by several ndependent effects. In the smplest case, we could thnk of a sngle observaton (e.g., CPU tme) for some partcular endpont or servce as havng some average value (e.g., 500ms), plus some shft due to the user nvolved n the request (e.g., +500ms for a user wth many frends), plus a shft due to the host executng the request (e.g., -200ms for a faster host), plus some random nose. Ths formulaton s referred to as a crossed random effects model, and we refer to host and request-level shfts as a random effects [3, 26] or levels [28]. Formally, we can denote the request correspondng to an observaton as r[], and the host correspondng to hosts as h[]. In the equaton below, we denote the random effects (random varables) for requests, hosts, and nose for each observaton as α r[], β h[], and ε, respectvely. The mean value s gven by µ. f denotes a transformaton (a lnk functon ). For addtve models, f s smply the dentty functon, but for multplcatve models (e.g., each effect nduces a certan percent ncrease or decrease n performance), f may be the exponental functon 4, exp( ): Y = f(µ + α r[] + β h[] + ε ) Under the heterogeneous random effects model [26], each request and host can have ts own varance. Ths model can be complex to manpulate analytcally, and dffcult to estmate n practce, so one common approach s to nstead assume that random effects for requests or hosts are drawn from the same dstrbuton. Homogeneous random effects model for a sngle batch. Under ths model, random effects for hosts and requests are respec- 4 For the sake of smplcty we work wth addtve models, but t s often desrable to work wth the multplcatve model, or equvalently log(y ). The remanng results hold equally for the logtransformed outcomes. tvely drawn from a common dstrbuton. 5 Y = µ + α r[] + β h[] + ε α r N (0, σ 2 α), β h N (0, σ 2 β), ε N (0, σ 2 ε). (2) In ths homogenous random effects model each request has a constant effect, whch s sampled from some common normal dstrbuton, N (0, σα). 2 Any varablty from the same request on the same host s ndependent of the request. Homogeneous random effects model for multple batches. Modern runtme envronments are non-determnstc and for varous reasons, restartng a servce may cause the characterstc performance of hosts or requests to devate slghtly. For example, the dynamc request order and mx can vary and affect the code paths that the JIT comples and keeps n cache. Ths can be specfed by addtonal batch-level effects for hosts and requests. We model ths behavor as follows: each tme a servce s ntalzed, we draw a batch effect for each request, γ r[],b[] N (0, σ γ), and smlarly for each host, η h[],b[] N (0, σ η). That s, each tme a host s restarted n some way (a new batch ), there s some addtonal nose ntroduced that remans constant throughout the executon of the batch, ether pertanng to the host or the request. Note that per-request and per-host batch effects are ndependent from batch to batch 6, smlar to how ε s ndependent between observatons: Y = µ + α r[] + β h[] + γ r[],b[] + η h[],b[] + ε α r N (0, σ 2 α), β h N (0, σ 2 β), γ r,b N (0, σ 2 γ), η h,b N (0, σ 2 η), ε N (0, σ 2 ε) (3) Estmaton How large are each of these effects n practce? We begn wth some of observatons from a real benchmark to llustrate the sources of varablty, and then move on to estmatng model parameters for several endponts n producton. Models such as Eq. 2 and Eq. 3 can be ft effcently to benchmark data va restrcted maxmum lkelhood estmaton usng off-the-shelf statstcal packages, such as lmer n R. In Fgure 3, the horzontal lnes ndcate the predcted values for each endpont usng the model from Eq. 3 ft to an A/A test (e.g., an experment n whch both versons have the same software) usng lmer. In general, we fnd that the requestlevel random effects are much greater than the host level effects, whch are smlar n magntude to the batch-level effects. We summarze estmates for top endponts at Facebook n Table 1, and use them n subsequent sectons to llustrate our analytcal results. endpont µ σ α σ β σ γ σ η σ ε Table 1: Parameter estmates for the model n Eq. 3 for the fve most traffcked endponts benchmarked at Facebook. 5 Clearly, some requests vary more than others (cf., Fgure 1). Ths mght be caused by, e.g., the fact that some users requre addtonal data fetches or processng tme, whch can produce greater varance n response tme. Log-transformng the outcome varable may help reduce ths heteroskedastcty. 6 Ths formal assumpton corresponds to no carryover effects from one batch to another [7]. As we descrbe n Secton 2.2, the system we use n producton s desgned to elmnate carryover effects. 4

5 4. EXPERIMENTAL DESIGN Benchmarkng experments for Internet servces are most commonly run to compare two dfferent versons of software usng a mx of requests and hosts (servers). How precsely we are able to measure dfferences can depend greatly on whch requests are delvered to what machnes, and what versons of the software those machnes are runnng. In ths secton, we generalze the random effects model for multple batches to nclude expermental comparsons, and derve expressons for how the standard error of expermental comparsons (.e., the dfference n means) depends on aspects of the expermental desgn. We then present four smple expermental desgns that cover a range of benchmarkng setups, ncludng lve benchmarks and carefully controlled experments, and derve ther standard errors. Fnally, we wll show how basc parameters, ncludng the number of requests, hosts, and repettons, affect the standard error of dfferent desgns. 4.1 Formulaton We treat the problem formally usng the potental outcomes framework [24], n whch we consder outcomes (e.g., CPU tme) for an observaton (a request-host par runnng wthn a partcular batch), runnng under ether verson (the expermental condton), whose assgnment s denoted by D = 0 or D = 1. We use Y (1) to denote the potental outcome of under the treatment, and Y (0) for the control. Although we cannot smultaneously observe both potental outcomes for any partcular, we can compute the average treatment effect, δ = E[Y (1) Y (0) ] because by lnearty of expectaton, t s equal to the dfference n means E[Y (1) ] E[Y (0) ] across dfferent populatons when D s randomly assgned. In a benchmarkng experment, we dentfy δ by specfyng a schedule of delvery of requests to hosts, along wth hosts assgnments to condtons. The partculars of how ths assgnment procedure works s the expermental desgn, and t can substantally affect the precson wth whch we can estmate δ. We generalze the random effects model n Eq. 2 to nclude average treatment effects and treatment nteractons: Y (d) = µ (d) + α (d) r[] + β(d) h[] + γ r[],b[] + η h[],b[] + ε α r N (0, Σ α), βh N (0, Σ β ), γ r,b N (0, σ 2 γ), η h,b N (0, σ 2 η), ε N (0, σ 2 ε) (4) Our goal therefore s to dentfy the true dfference n means, δ = µ (1) µ (0). Unfortunately, we can never observe δ drectly, and nstead must estmate t from data, wth nose. Exactly how much nose there s depends on whch hosts and requests are nvolved, the software versons, and how many batches are needed to run the experment. More formally, we denote the number of observatons for a partcular request host batch tuple r, h, b runnng under the treatment condton d, by n (d) rhb. We express the total nose from requests, hosts, and resdual error under each condton as: φ (d) R [ n (d) r α r (d) + ] n (d) r b γ r,b r b φ (d) H φ (d) E h 2N =1 [ n (d) h β(d) h + b ε (d) 1[D = d], ] n (d) hb η h,b where, e.g., n (d) r b represents the total number of observatons nvolvng a request r executed n batch b under condton d. We take the total number of observatons per condton to be equal so that n (0) = n (1) = N. We can then wrte down our estmate, ˆδ, as ˆδ = δ + 1 [ (φ (1) R N φ(0) R ) + (φ(1) H φ(0) H ) + (φ(1) E φ(0) E ]. ) To obtan confdence ntervals for ˆδ, we also need to know ts varance, V[ˆδ]. Followng Bakshy & Eckles [4], we approach the problem by frst descrbng how observatons from each error component are repeated wthn each condton. We defne the duplcaton coeffcents [22, 23]: ν (d) R 1 N r ( n (d) ) 2 r ν (d) H 1 N h ( n (d) h ) 2, whch are the average number of observatons sharng the same request (ν R) or host (ν H). We then defne the between-condton duplcaton coeffcent [4], whch gves a measure of how balanced hosts or requests are across condtons: ω R 1 N r n (0) r n (1) r ω H 1 N h n (0) h n(1) h. Furthermore, we only consder expermental desgns n whch the request-level and host-level duplcaton are the same n both condtons, so that ν (0) R = ν(1) R and ν(0) H = ν(1) H, and omt the superscrpts n subsequent expressons. 7 Notng that because batch-level random effects are ndependent, the varance of ther sums over batches can be expressed n terms of duplcaton coeffcents, so that e.g., V[ 1 N b n (d) r b γ r,b] = 1 N 2 ( n (d) r ) 2 σ 2 γ = 1 N νrσ2 γ, and because all random effects are ndependent, t s straghtforward to show that the varance of ˆδ can be wrtten n terms of these duplcaton factors: V[ˆδ] = 1 ) [(ν R(σ 2α(1) N + σ2α(0) + 2σ2γ) 2ω Rσ α(0),α(1) ) + (ν H(σ 2β(1) + σ2β(0) + 2σ2η) 2ω Hσ (5) β(0),β(1) )] + (σ 2ε(0) + σ2ε(1). Ths expresson llustrates how repeated observatons of the same request or host affect the varance of our estmator for ˆδ. Host-level varaton s multpled by how often hosts are repeated n the data, and smlarly for requests. Varance s reduced when requests or hosts appear equally n both condtons, snce, for example, ω R s greatest when n (0) r = n (1) r for all r. We use these facts to explore how dfferent expermental desgns ways of delverng requests to hosts under dfferent condtons affect the precson wth whch we can estmate δ. 4.2 Four desgns for dstrbuted benchmarks In ths secton, we descrbe four smple expermental desgns that 7 Note that the ndvdual components, e.g., whch requests appear n the treatment and control need not be the same for the duplcaton coeffcents to be equal. 5

6 Host Req. Ver. Batch 1 h 0 r h 0 r h 1 r h 1 r Host Req. Ver. Batch 1 h 0 r h 0 r h 1 r h 1 r Host Req. Ver. Batch 1 h 0 r h 1 r h 0 r h 1 r Host Req. Ver. Batch 1 h 0 r h 1 r h 0 r h 1 r (a) Unbalanced (b) Request balanced (c) Host balanced (d) Fully balanced Table 2: Example schedules for the four expermental desgns for experments wth two hosts (H = 2) n whch two requests are executed per condton (R = 2). Each example shows four observatons, each wth a host ID, request ID (e.g., a request to an endpont for a partcular user), software verson, and batch number. reflect basc engneerng tradeoffs, and analyze ther standard errors under a common set of condtons. In partcular, we consder our certanty about expermental effects when: 1. The sharp null s true that s, the experment has no effects at all, so that δ s zero and all varance components are the same (e.g., σ 2 = σ 2 = σ α (0) α (1) α (0) α (1)) [4], or 2. There are no treatment nteractons, but there s a constant addtve effect δ. 8 Although these requrements are rather narrow, they correspond to a meanngful and common scenaro n whch benchmarks are used for dfference detecton; by mnmzng the standard error of the experment, we are better able to detect stuatons n whch there s a devaton from no change n performance. In the four desgns we dscuss n the followng sectons, we constran the desgn space to smplfy presentaton n a few ways. Frst, we assume symmetry wth respect to the pattern of delvery of requests; we repeat each request the same number of tmes n each condton, so that N = RT. Second, because executng the same request on multple hosts wthn the same condton would ncrease ν R (and therefore nflate V [ˆδ]), we only consder desgns n whch requests are executed on at most one machne per condton. And fnally, snce by (1) and (2), the varances of the error terms are equal, we drop the superscrpts for each σ 2. We consder two classes of desgns: sngle-batch experments, n whch half of all hosts are assgned to the treatment, and the other half to the control, or two-batch experments n whch hosts run both versons of the software. Note that R requests are splt among two batches n the latter case. 9 In the sngle-batch desgn, each of the R requests s executed on ether the frst block of hosts or the second block, where each block s ether assgned to the treatment or control. Therefore, when computng the host-level duplcaton coeffcent, ν H for a partcular condton, we only sum across H/2 hosts: ν H = 1 RT H/2 h=1 ( RT ) 2 = 2RT H/2 H, In the two-batch desgn, each of the R requests are executed n both the treatment and control across spread across all H hosts: ν H = 1 RT H h=1 ( RT ) 2 = RT H H, 8 When the outcome varable s log-transformed, ths δ corresponds to a multplcatve effect. 9 Alternatvely, one can thnk of the experment as nvolvng subjects and tems [3] (whch correspond to hosts and requests). The two-batch experments correspond to wthn-subjects desgns, and sngle-batch experments correspond to between-subjects desgns Defntons Unbalanced desgn ( lve benchmarkng ). In the unbalanced desgn, each host only executes one verson of the software, and each request s processed once. It can be carred out n one batch (see example layout n Table 2 (a)). Ths desgn s the smplest to mplement snce t does not requre the ablty to replay requests. It s often the desgn of choce when benchmarkng lve requests only, whether as a necessary feature or merely as a choce of convenence: t obvates the need for a possbly complex nfrastructure to record and replay requests. The varance of the dfference-nmeans estmator for the unbalanced desgn s: V UB(ˆδ) = 2T (σα 2 + σγ) RT H (σ2 β + ση) 2 + 2σɛ 2 1 The standard error s VUB(ˆδ). Expandng and smplfyng RT ths expresson yelds: SE UB (ˆδ) = 2( 1 R (σ2 α + σγ) H (σ2 β + σ2 η) + 1 RT σɛ2) The convenence of the unbalanced desgn comes at a prce: the standard error term ncludes error components both from requests and hosts. We wll show later that ths desgn s the least powerful of the four: t acheves sgnfcantly lower accuracy (or wder confdence ntervals) for the same resource budget. Request balanced desgn ( parallel benchmarkng ). The request balanced desgn executes the same request n parallel on dfferent hosts usng dfferent versons of the software. Ths requres that one has the ablty to splt or replay traffc, and can be done n one batch (see example layout n Table 2 (b)). Its standard error s defned as: SE RB (ˆδ) = 2( 1 R σ2 γ + 2 H (σ2 β + σ2 η) + 1 RT σɛ2) The request balanced desgn cancels out the effect of the request, but nose due to each host s amplfed by the average number of requests per host, R. Compared to lve benchmarkng, ths desgn offers hgher accuracy, wth smlar resources. Compared to sequen- H tal benchmarkng, ths desgn can be run n half the wall tme, but wth twce as many hosts. It s therefore sutable for benchmarkng when request replayng s avalable and tme budget s more mportant than host budget. Host balanced desgn ( sequental benchmarkng ). The host balanced desgn executes requests usng dfferent versons of the software on the same host, and thus requres two batches. Each request s agan only executed once, so a request replayng ablty s not requred (see Table 2 (c)). 6

7 fully balanced host balanced standard error request balanced unbalanced number of requests Fgure 4: Standard errors for each of the four expermental desgns as a functon of the number of requests and hosts. Lnes are theoretcal standard errors from Secton and ponts are emprcal standard errors from 10,000 smulatons. Panels ndcate the number of hosts used n the benchmark, and the number of requests are on a log scale. Compared to lve benchmarkng, sequental benchmarkng takes twce as long to run, but can use just half the number of hosts. It may therefore be useful n stuatons wth no replay ablty and a lmted number of hosts for benchmarkng. It s also more accurate, for a gven number of observatons, as shown by ts standard error: SE HB(ˆδ) = 2 ( 1 R (σ2 α + σγ) H σ2 η + 1 RT σɛ2) The host balanced desgn cancels out the effect of the host, but one s left wth nose due to the request. Note that the number of hosts does not affect the precson of the SEs n ths desgn. Fully balanced desgn ( controlled benchmarkng ). The fully balanced desgn acheves the most accuracy wth the most resources. It executes the same request on the same host usng dfferent versons of the software. Ths requres that one has the ablty to splt or replay traffc, and takes two batches to complete (see example layout n Table 2 (d)). Ths desgn has by far the least varance of the four desgns, and s the best choce for benchmarkng experments n terms of accuracy, as shown by the standard error: SE FB (ˆδ) = 2( 1 R σ2 γ + 1 H σ2 η + 1 RT σɛ2) Ths desgn requres replay ablty, as well as twce the machnes of sequental benchmarkng and twce the wall tme (batches) of lve and parallel benchmarkng. However, t s so much more accurate than the other desgns, that t requres far fewer observatons (requests) to reach a smlar level of accuracy. Dependng on the tradeoffs between request runnng tme costs and batch setup cost, as well as the desred accuracy, ths desgn may end up takng less computatonal resources than the other desgns Analyss We analyze each desgn va smulaton and vsualzaton. We obtaned realstc smulaton parameters from our model fts to the top endpont (Table 1). Our smulaton then smply draws random ŷ values from normal dstrbutons usng these parameters. For any gven desgn, the smulatons dffer n that host effects, request effects, and random nose are redrawn from a normal dstrbuton wth σ α, σ β, and σ ε, respectvely. densty fully balanced host balanced request balanced unbalanced δ^ Fgure 5: Dstrbuton of ˆδs for each of the four expermental desgns. The data was generated by smulatng 10,000 hypothetcal experments from the random effects model n on Eq. 3 usng parameter estmates from Endpont 1 n Table 1 wth 16 hosts and 256 requests, for each desgn. We frst consder smulatons for each expermental desgn wth a fxed number of hosts and requests and zero average effects. Fgure 5 shows the dstrbuton of ˆδs generated from 10,000 smulatons. We can see that the fully balanced desgn has by far the least varance n ˆδ, followed by the request balanced, host balanced, and unbalanced desgns. Next, we explore the parameter space of varyng hosts and requests n Fgure 4. In all cases, addng hosts or addng requests narrows the SE. For the unbalanced and host balanced desgns, the effect of the number of requests on the SE s much more pronounced than that of the number of hosts: n the former because varablty due to requests s much hgher than that due to hosts, as shown n Fgure 1; and n the latter because we control for the hosts. Smlarly, the request balanced desgn controls for requests, and therefore shows lttle effect from varyng the number of requests. And fnally, the fully balanced desgn exhbts both the smallest SE n absolute terms, as well as the least senstvty to the number of hosts. 7

8 5. BOOTSTRAPPING So far we have dscussed smple models that help us understand the man levers that can mprove statstcal precson n dstrbuted, user-based benchmarks. Our theoretcal results, however, assume that the model s correct, and that parameters are known or can easly estmated from data. Ths s generally not the case, and n fact, estmatng models from the data may requre many more observatons than s necessary to estmate an average treatment effect. 10 In ths secton we wll revew a smple non-parametrc method for performng statstcal nference the bootstrap. We then evaluate how well t does at reconstructng known standard errors based on smulatons from the random effects model. Fnally, we demonstrate the performance of the bootstrap on real producton benchmarks from Perflab. 5.1 Overvew of the bootstrap Often tmes we wsh to generate confdence ntervals for an average wthout makng strong assumptons about how the data was generated. The bootstrap [12] s one such technque for dong ths. The bootstrap dstrbuton of a sample statstc (e.g., the dfference n means between two expermental condtons) s the dstrbuton of that statstc when observatons are resampled [12] or reweghted [23, 25]. We descrbe the latter method because t s easest to mplement n a computatonally effcent manner. The most basc way of gettng a confdence nterval for an average treatment effect for d data s to reweght observatons ndependently, and repeat ths process R tmes. We assgn each observaton a weght w r, from a mean-one random varable, (e.g., Unform(0,2) or Pos(1)) [23, 25] and use them to average the data, usng each replcate number r and observaton number as a random seed: ˆδ r = 1 N r w r,y I(D = 1) 1 M r w r,y I(D = 0) Here, Nr and Mr denote the sum of the bootstrap weghts (e.g., wr,i(d = 1)) under the treatment and control, respectvely. Ths process produces a dstrbuton of our statstc, the sample dfference n means, ˆδ r=1...,r. One can then summarze ths dstrbuton to obtan confdence ntervals. For example, to compute the 95% confdence nterval for ˆδ by takng the 2.5th and 97.5th quantles of the bootstrap dstrbuton of ˆδ. Another method s to use the central lmt theorem (CLT). The dstrbuton of our statstc s expected to be asymptotcally normal, so that one can compute the 95% nterval usng the quantles of the normal dstrbuton wth a mean and standard devaton set to the sample mean and standard devaton of the ˆδ r s. The CLT ntervals are generally more stable than drectly computng the quantles of the bootstrap dstrbuton, so we use ths method throughout the remanng sectons. Smlar to how the d standard errors n Secton underestmate the varablty n Ȳ, we expect the d bootstrap to underestmate the varance of ˆδ when observatons are clustered, yeldng overly narrow ( ant-conservatve ) confdence ntervals and hgh false postve rates [21]. The soluton to ths problem s to use a clustered bootstrap. In the clustered bootstrap, weghts are assgned for each factor level, rather than observaton number. For example, f we wsh to use a clustered bootstrap based on the request ID, as to capture varablty due to the request, we can assgn 10 For example, dentfyng host-level effects requres executng the same request on multple machnes, and dentfyng request-level effects requres multple repettons of the same request. Both ncrease request-level duplcaton, whch reduces effcency when request-level effects are large relatve to the nose, σ ε (Sec. 4.2). all observatons for a partcular host to a weght. In the case of requests, we would nstead use host IDs and replcate numbers as our random number seed, and for each replcate, compute: ˆδ r = 1 N r w r,h[] y I(D = 1) 1 M r w r,h[] y I(D = 0) 5.2 Bootstrappng expermental dfferences Before examnng the behavor of the bootstrap on producton behavor, we valdate ts performance based on how well t approxmates known standard errors, as generated by our dealzed benchmark models from Secton 3. To llustrate how ths works, we can plot the true dstrbuton of ˆδs drawn from the 10,000 smulatons for the fully balanced desgn shown n Fgure 6, along wth the ˆδ s under the host and request-clustered bootstrap for a sngle experment. The host-clustered bootstrap, whch accounts for the varaton nduced by repeated observatons of the same host, tends to produce estmates of the standard errors that are smlar to the true standard error, whle the request-clustered bootstrap tends to be too narrow, n that t underestmates the varablty of ˆδ. densty true dstrbuton host bootstrap request bootstrap δ^ Fgure 6: Comparson of the dstrbuton of ˆδs generated by 10,000 smulatons (sold lne) wth the dstrbuton of bootstrapped ˆδ s from a sngle experment (dashed lnes). estmated treatment effect unbalanced host balanced request balanced fully balanced rank Fgure 7: Vsualzaton of bootstrap confdence ntervals from 500 smulated A/A tests, run wth dfferent expermental desgns and bootstrap methods. Experments are ranked by estmated effect sze. Shaded error bars ndcate false postves. The request-level and d bootstraps yeld overly narrow confdence ntervals. d bootstrap request bootstrap host bootstrap sgnfcant FALSE TRUE bootstrap.method d bootstrap request bootstrap host bootstrap 8

9 method.x host.bootstrap d.bootstrap request.bootstrap Endpont /profle_book.php /profle_book.php /profle_book.php /home.php:ltestand:top_news_secton_of /home.php:ltestand:top_news_secton_of /home.php:ltestand:top_news_secton_of /home.php /home.php /home.php /wap/story.php /wap/story.php /wap/story.php /wap/home.php:touch /wap/home.php:touch /wap/home.php:touch /ajax/typeahead/search.php:search /wap/profle_tmelne.php /ajax/typeahead/search.php:search /ajax/typeahead/search.php:search TypeaheadFacebarQueryController /wap/profle_tmelne.php /wap/profle_tmelne.php /wap/home.php:faceweb TypeaheadFacebarQueryController TypeaheadFacebarQueryController /wap/home.php:basc /wap/home.php:faceweb /wap/home.php:faceweb /wap/home.php:basc /wap/home.php:basc 3 Standard error relatve to host-clustered bootstrap 0% 2% 4% 6% 8% Type I error rate Endpont Endpont Standard error relatve to host-clustered bootstrap Fgure 8: Emprcal valdaton of bootstrap estmators for(a) the top 10 endponts usng producton data from Perflab. (b) Left: Type I error rates for host-clustered and IID bootstrap (request-clustered bootstrap has a Type I error rate of > 20% for all endponts and s not shown); Sold lne ndcates the desred 5% Type I error rate, and the dashed lnes ndcate the range of possble observed Type I error rates that would be consstent wth a true error rate of 5%. Rght: Standard error of host-clustered, request-clustered, and d bootstrap relatve to the host-clustered standard error. Next, we evaluate three bootstrappng procedures d, hostclustered, and request-clustered wth each of the four desgns for a sngle endpont. To do ths, we use the same model parameters from endpont 1, as n prevous plots. To get an ntutve pcture for how the confdence ntervals are dstrbuted, we run 500 smulated A/A tests, and for each confguraton, we rank-order experments by the pont estmates of ˆδ, and vsualze ther confdence ntervals (Fgure 7). Shaded regons represent false postves (.e., ther confdence ntervals do not cross 0). The d bootstrap consstently produces the wdest CIs for all desgns. Ths happens because the d bootstrap doesn t preserve the balance across hosts or requests across condtons when resamplng. There s also a clear relatonshp between the wdth of CIs and the Type I error rate, n that there are a hgher proporton of type I error rates when the CIs are too narrow. To more closely examne the precson of each bootstrap method wth each desgn, we run 10,000 smulated A/A tests for each confguraton and summarze ther results n Table 3. For the requestbalanced desgn, we also nclude an addtonal bootstrap strategy, whch we call the host-block bootstrap, where pars of hosts that execute the same requests are bootstrapped. Ths ensures that when whole hosts are bootstrapped, the balance of requests s not broken. Ths method turns out to produce standard errors wth good coverage for the request-balanced desgn, and so we wll henceforth refer to the host block bootstrap strategy as the host-clustered bootstrap n further analyses. We can see that the host-clustered bootstrap appears to estmate the true SE most accurately for all desgns. 5.3 Evaluaton wth producton data Fnally, havng verfed that the bootstrap method provdes a conservatve estmate of the standard error when the true standard error s known (because t was generated by our statstcal model), we turn toward testng the bootstrap on raw producton data from Perflab, whch uses the fully balanced desgn. To do ths, we conduct 256 A/A tests usng dentcal bnares of the Facebook WWW codebase. Fgure 8 summarzes the results from these tests for the top 10 most vsted endponts that are benchmarked by Perflab. Consstent wth the results of our smulatons, we fnd that the request-level bootstrap s massvely ant-conservatve, and produces confdence ntervals that are far too narrow, resultng n a hgh false postve rate (.e., > 20%) across all endponts. Smlarly, we fnd that our emprcal results echo that of the smulatons: the d bootstrap produces estmates of the standard error that are far wder than they should be. The host-clustered bootstrap, how- Desgn Bootstrap Type I err. ŜE SE unbalanced request 21.4% unbalanced d 17.8% unbalanced host 3.6% request balanced request 71.6% request balanced d 10.8% request balanced host 0.8% request balanced host-block 5.7% host balanced request 8.4% host balanced d 6.8% host balanced host 5.0% fully balanced request 46.8% fully balanced host 4.8% fully balanced d 0.0% Table 3: Comparson of Type I error rates and standard errors for each expermental desgn and bootstrap strategy. ŜE ndcates the average estmated standard error from the bootstrap, whle SE ndcates the true standard error from the analytcal formulae. Antconservatve bootstrap estmates of the standard errors produce hgh false postve rates (e.g., >5%). ever, produces Type I error rates that are not sgnfcantly dfferent from 5% for all but 3 of the endponts; n these cases, the confdence ntervals are slghtly conservatve, as s desred n our use case. For these reasons, all producton benchmarks conducted at Facebook wth Perflab use the host-clustered bootstrap. 6. RELATED WORK Many researchers recognzed the obstreperous nature of performance measurement tools, and addressed t pecemeal. Some focus on controllng archtectural performance varablty, whch s stll a very actve research feld. Statstcal nference tools can be appled to reduce the effort of repeated expermentaton [11, 19]. These studes focus prmarly on managng host-level varablty (and even ntra-host-level varance), and do not spend much attenton on the varablty of software. Varablty n software performance has been examned n many other studes that attempt to quantfy the effcacy of technques such as: allowng for a warm-up perod [8]; reducng random performance fluctuatons usng regresson benchmarkng [18]; randomzng mult-threaded smulatons [1]; and applcaton-specfc benchmarkng [27]. In addton, there s an ncreasng nterest specfcally n the performance of onlne systems and n the modelng of large-scale workloads and dynamc content [2, 6]. 9

10 Several studes addressed the holstc performance evaluaton of hardware, software, and users. Cheng et al. descrbed a system called Monkey that captures and replays TCP workloads, allowng the repeated measurng of the system under test wthout generatng synthetc workloads [9]. Gupta et al., n ther Decast system, addressed the challenge of scalng down massve-scale dstrbuted systems nto representatve benchmarks [15]. In addton, representatve characterstcs of user-generated load s crtcal for benchmarkng large-scale onlne systems, as dscussed by Manley et al. [20]. Ther system, hbench:web, tres to capture user-level varaton n terms of user sessons and user equvalence classes. Request-level varaton s modeled statstcally, as opposed to measurng t drectly. A few studes also proposed statstcal models for benchmarkng experments. Kalbera et al. showed that smply averaged multple repettons of a benchmark wthout accountng for sources of varaton can produce unrepresentatve results [17, 18]. They developed models focusng on mnmzng expermentaton tme under software/envronment random effects, whch can be generalzed to other software-nduced varablty. These models are smlar to the model developed here, but they do not take nto account host-level or user-level effects, whch are central to the expermental desgn and statstcal nference problem we wsh to address. To the best of our knowledge, ths s the frst work to rgorously address dstrbuted benchmarkng n the context of user data. 7. CONCLUSIONS AND FUTURE WORK Benchmarkng for performance dfferences n large Internet software systems poses many challenges. On top of the many wellstuded requrements for accurate performance benchmarkng, we have the addtonal dmenson of user requests. Because of the potentally large performance varaton from one user request to another, t s crucal to take the clusterng of user requests nto account. Yet another complcatng dmenson, whch s nevertheless crtcal to scale large testng systems, s dstrbuted benchmarkng across multple hosts. It too ntroduces non-trval complcatons wth respect to how hosts nteract wth request-level effects and expermental desgn to affect the standard error of the benchmark. In ths paper we developed a statstcal model to understand and quantfy these effects, and explored ther practcal mpact on benchmarkng. Ths model enables the analytcal development of expermental desgns wth dfferent engneerng and effcency tradeoffs. Our results from these models show that a fully balanced desgn accountng for both request varablty and host varablty s optmal for the goal of mnmzng the benchmark s standard error gven a fxed number of requests and machnes. Although ths desgn may requre more computatonal resources than the other three, t s deal for Facebook s rapd development and deployment mode because t mnmzes developer resources. Desgn effects due to repeated observatons from the same host also show how resdual error terms that cannot be canceled out va balancng, such as batch-level host effects due to JIT optmzaton and cachng, can also be an mportant lever for further ncreasng precson, especally when there are few hosts relatve to the number of requests. From a practcal pont of vew, estmatng the model parameters to compute the standard errors for these experments (especally n a lve system) can be costly and complex. We showed how a non-parametrc bootstrappng technques can be used to estmate the standard error of each desgn. Usng emprcal data from Facebook s largest dfferental benchmarkng system, Perflab, we confrm that ths technque can relably capture the true standard error wth good accuracy. Consequently, all producton Perflab experments use a fully-balanced desgn, and the host-level bootstrap to evaluate changes n key metrcs, ncludng CPU tme, nstructons, memory usage, etc. Our hope s that ths paper provdes a smple and actonable understandng of the procedures nvolved n benchmarkng for performance changes n other contexts as well. Wth a more quanttatvely nformed approach, practtoners can select the most sutable expermental desgn to mnmze the benchmark s run tme for any desred level of accuracy. Our results focus on measurng the performance of a sngle endpont or servce. Often tmes one wshes to benchmark multple such endponts or servces. Poolng across these endponts to obtan a composte standard error s trval f one could reasonably regard the performance of each endpont as ndependent. 11 If there s a strong correlaton between endponts, however, the models here must be extended to take nto account covarances between observatons from dfferent endponts. Fnally, another area of development for future work s a more extensve evaluaton of the costs assocated wth each of the four basc desgns. For example, one mght devse a cost model whch takes as nputs the desred accuracy and parameters such as batch setup tme, fxed and margnal costs per request, etc. These cost functons formulae would depend on the partculars of the benchmarkng platform. For example, n some systems the number of requests needed for stable measurements for may depend on some maxmum load per host. How much warm up dfferent subsystems need mght also depend on how servces are dstrbuted across machnes. 8. ACKNOWLEDGEMENTS Perflab s developed by many engneers on the Ste Effcency Team at Facebook. We are especally thankful for all of the help from Davd Harrngton and Eunchang Lee, who worked closely wth us on mplementng and deployng the testng routnes dscussed n ths paper. Ths work would not be possble wthout ther help and collaboraton. We also thank Dean Eckles for hs suggestons on formulatng and fttng the random effects model, and Alex Deng for hs excellent feedback on ths paper. 9. REFERENCES [1] A. R. Alameldeen, C. J. Mauer, M. Xu, P. J. Harper, M. M. K. Martn, D. J. Sorn, M. D. Hll, and D. A. Wood. Evaluatng non-determnstc mult-threaded commercal workloads. In In Proceedngs of the Ffth Workshop on Computer Archtecture Evaluaton Usng Commercal Workloads, pages 30 38, [2] B. Atkoglu, Y. Xu, E. Frachtenberg, S. Jang, and M. Paleczny. Workload analyss of a large-scale key-value store. In Proceedngs of the 12th Jont Conference On Measurement And Modelng of Computer Systems (SIGMETRICS/Performance 12), London, UK, June [3] R. H. Baayen, D. J. Davdson, and D. M. Bates. Mxed-effects modelng wth crossed random effects for subjects and tems. Journal of Memory and Language, 59(4): , [4] E. Bakshy and D. Eckles. Uncertanty n onlne experments wth dependent data: An evaluaton of bootstrap methods. In 11 If each endpont consttutes some fracton w of the traffc, and the outcomes for each endpont are uncorrelated, the varance of δ pooled across all endponts s smply Var[ˆδ pool] = w2 Var[ˆδ ]. Ths ndependence assumpton can be made more plausble by loadng endponts wth dsjont sets of users. 10

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

Calculation of Sampling Weights

Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

Credit Limit Optimization (CLO) for Credit Cards

Credit Limit Optimization (CLO) for Credit Cards Credt Lmt Optmzaton (CLO) for Credt Cards Vay S. Desa CSCC IX, Ednburgh September 8, 2005 Copyrght 2003, SAS Insttute Inc. All rghts reserved. SAS Propretary Agenda Background Tradtonal approaches to credt

More information

Analysis of Premium Liabilities for Australian Lines of Business

Analysis of Premium Liabilities for Australian Lines of Business Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton

More information

What is Candidate Sampling

What is Candidate Sampling What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis The Development of Web Log Mnng Based on Improve-K-Means Clusterng Analyss TngZhong Wang * College of Informaton Technology, Luoyang Normal Unversty, Luoyang, 471022, Chna wangtngzhong2@sna.cn Abstract.

More information

The OC Curve of Attribute Acceptance Plans

The OC Curve of Attribute Acceptance Plans The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

How To Calculate The Accountng Perod Of Nequalty

How To Calculate The Accountng Perod Of Nequalty Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE Yu-L Huang Industral Engneerng Department New Mexco State Unversty Las Cruces, New Mexco 88003, U.S.A. Abstract Patent

More information

Portfolio Loss Distribution

Portfolio Loss Distribution Portfolo Loss Dstrbuton Rsky assets n loan ortfolo hghly llqud assets hold-to-maturty n the bank s balance sheet Outstandngs The orton of the bank asset that has already been extended to borrowers. Commtment

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

An Empirical Study of Search Engine Advertising Effectiveness

An Empirical Study of Search Engine Advertising Effectiveness An Emprcal Study of Search Engne Advertsng Effectveness Sanjog Msra, Smon School of Busness Unversty of Rochester Edeal Pnker, Smon School of Busness Unversty of Rochester Alan Rmm-Kaufman, Rmm-Kaufman

More information

SIMPLE LINEAR CORRELATION

SIMPLE LINEAR CORRELATION SIMPLE LINEAR CORRELATION Smple lnear correlaton s a measure of the degree to whch two varables vary together, or a measure of the ntensty of the assocaton between two varables. Correlaton often s abused.

More information

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

On the Optimal Control of a Cascade of Hydro-Electric Power Stations On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES

CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES CHAPTER 5 RELATIONSHIPS BETWEEN QUANTITATIVE VARIABLES In ths chapter, we wll learn how to descrbe the relatonshp between two quanttatve varables. Remember (from Chapter 2) that the terms quanttatve varable

More information

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

How To Understand The Results Of The German Meris Cloud And Water Vapour Product Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University Characterzaton of Assembly Varaton Analyss Methods A Thess Presented to the Department of Mechancal Engneerng Brgham Young Unversty In Partal Fulfllment of the Requrements for the Degree Master of Scence

More information

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The

More information

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001 Proceedngs of the Annual Meetng of the Amercan Statstcal Assocaton, August 5-9, 2001 LIST-ASSISTED SAMPLING: THE EFFECT OF TELEPHONE SYSTEM CHANGES ON DESIGN 1 Clyde Tucker, Bureau of Labor Statstcs James

More information

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services An Evaluaton of the Extended Logstc, Smple Logstc, and Gompertz Models for Forecastng Short Lfecycle Products and Servces Charles V. Trappey a,1, Hsn-yng Wu b a Professor (Management Scence), Natonal Chao

More information

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 610-519-4390,

More information

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management

ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C White Emerson Process Management ECONOMICS OF PLANT ENERGY SAVINGS PROJECTS IN A CHANGING MARKET Douglas C Whte Emerson Process Management Abstract Energy prces have exhbted sgnfcant volatlty n recent years. For example, natural gas prces

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

Evaluating credit risk models: A critique and a new proposal

Evaluating credit risk models: A critique and a new proposal Evaluatng credt rsk models: A crtque and a new proposal Hergen Frerchs* Gunter Löffler Unversty of Frankfurt (Man) February 14, 2001 Abstract Evaluatng the qualty of credt portfolo rsk models s an mportant

More information

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy Fnancal Tme Seres Analyss Patrck McSharry patrck@mcsharry.net www.mcsharry.net Trnty Term 2014 Mathematcal Insttute Unversty of Oxford Course outlne 1. Data analyss, probablty, correlatons, vsualsaton

More information

Calculating the high frequency transmission line parameters of power cables

Calculating the high frequency transmission line parameters of power cables < ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,

More information

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6 PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

More information

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ Effcent Strpng Technques for Varable Bt Rate Contnuous Meda Fle Servers æ Prashant J. Shenoy Harrck M. Vn Department of Computer Scence, Department of Computer Scences, Unversty of Massachusetts at Amherst

More information

Sketching Sampled Data Streams

Sketching Sampled Data Streams Sketchng Sampled Data Streams Florn Rusu, Aln Dobra CISE Department Unversty of Florda Ganesvlle, FL, USA frusu@cse.ufl.edu adobra@cse.ufl.edu Abstract Samplng s used as a unversal method to reduce the

More information

BERNSTEIN POLYNOMIALS

BERNSTEIN POLYNOMIALS On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful

More information

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble

More information

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy 4.02 Quz Solutons Fall 2004 Multple-Choce Questons (30/00 ponts) Please, crcle the correct answer for each of the followng 0 multple-choce questons. For each queston, only one of the answers s correct.

More information

Section 5.4 Annuities, Present Value, and Amortization

Section 5.4 Annuities, Present Value, and Amortization Secton 5.4 Annutes, Present Value, and Amortzaton Present Value In Secton 5.2, we saw that the present value of A dollars at nterest rate per perod for n perods s the amount that must be deposted today

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall SP 2005-02 August 2005 Staff Paper Department of Appled Economcs and Management Cornell Unversty, Ithaca, New York 14853-7801 USA Farm Savngs Accounts: Examnng Income Varablty, Elgblty, and Benefts Brent

More information

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS 21 22 September 2007, BULGARIA 119 Proceedngs of the Internatonal Conference on Informaton Technologes (InfoTech-2007) 21 st 22 nd September 2007, Bulgara vol. 2 INVESTIGATION OF VEHICULAR USERS FAIRNESS

More information

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Poltecnco d Torno Porto Insttutonal Repostory [Artcle] A cost-effectve cloud computng framework for acceleratng multmeda communcaton smulatons Orgnal Ctaton: D. Angel, E. Masala (2012). A cost-effectve

More information

Risk Model of Long-Term Production Scheduling in Open Pit Gold Mining

Risk Model of Long-Term Production Scheduling in Open Pit Gold Mining Rsk Model of Long-Term Producton Schedulng n Open Pt Gold Mnng R Halatchev 1 and P Lever 2 ABSTRACT Open pt gold mnng s an mportant sector of the Australan mnng ndustry. It uses large amounts of nvestments,

More information

Traffic State Estimation in the Traffic Management Center of Berlin

Traffic State Estimation in the Traffic Management Center of Berlin Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D-763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,

More information

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits Lnear Crcuts Analyss. Superposton, Theenn /Norton Equalent crcuts So far we hae explored tmendependent (resste) elements that are also lnear. A tmendependent elements s one for whch we can plot an / cure.

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS

IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS IDENTIFICATION AND CORRECTION OF A COMMON ERROR IN GENERAL ANNUITY CALCULATIONS Chrs Deeley* Last revsed: September 22, 200 * Chrs Deeley s a Senor Lecturer n the School of Accountng, Charles Sturt Unversty,

More information

7 ANALYSIS OF VARIANCE (ANOVA)

7 ANALYSIS OF VARIANCE (ANOVA) 7 ANALYSIS OF VARIANCE (ANOVA) Chapter 7 Analyss of Varance (Anova) Objectves After studyng ths chapter you should apprecate the need for analysng data from more than two samples; understand the underlyng

More information

Lecture 3: Force of Interest, Real Interest Rate, Annuity

Lecture 3: Force of Interest, Real Interest Rate, Annuity Lecture 3: Force of Interest, Real Interest Rate, Annuty Goals: Study contnuous compoundng and force of nterest Dscuss real nterest rate Learn annuty-mmedate, and ts present value Study annuty-due, and

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,

More information

Fault tolerance in cloud technologies presented as a service

Fault tolerance in cloud technologies presented as a service Internatonal Scentfc Conference Computer Scence 2015 Pavel Dzhunev, PhD student Fault tolerance n cloud technologes presented as a servce INTRODUCTION Improvements n technques for vrtualzaton and performance

More information

Traffic-light a stress test for life insurance provisions

Traffic-light a stress test for life insurance provisions MEMORANDUM Date 006-09-7 Authors Bengt von Bahr, Göran Ronge Traffc-lght a stress test for lfe nsurance provsons Fnansnspetonen P.O. Box 6750 SE-113 85 Stocholm [Sveavägen 167] Tel +46 8 787 80 00 Fax

More information

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1 Send Orders for Reprnts to reprnts@benthamscence.ae The Open Cybernetcs & Systemcs Journal, 2014, 8, 115-121 115 Open Access A Load Balancng Strategy wth Bandwdth Constrant n Cloud Computng Jng Deng 1,*,

More information

1 Example 1: Axis-aligned rectangles

1 Example 1: Axis-aligned rectangles COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 6 Scrbe: Aaron Schld February 21, 2013 Last class, we dscussed an analogue for Occam s Razor for nfnte hypothess spaces that, n conjuncton

More information

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,

More information

A Probabilistic Theory of Coherence

A Probabilistic Theory of Coherence A Probablstc Theory of Coherence BRANDEN FITELSON. The Coherence Measure C Let E be a set of n propostons E,..., E n. We seek a probablstc measure C(E) of the degree of coherence of E. Intutvely, we want

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Understanding the Impact of Marketing Actions in Traditional Channels on the Internet: Evidence from a Large Scale Field Experiment

Understanding the Impact of Marketing Actions in Traditional Channels on the Internet: Evidence from a Large Scale Field Experiment A research and educaton ntatve at the MT Sloan School of Management Understandng the mpact of Marketng Actons n Tradtonal Channels on the nternet: Evdence from a Large Scale Feld Experment Paper 216 Erc

More information

The Current Employment Statistics (CES) survey,

The Current Employment Statistics (CES) survey, Busness Brths and Deaths Impact of busness brths and deaths n the payroll survey The CES probablty-based sample redesgn accounts for most busness brth employment through the mputaton of busness deaths,

More information

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST) yaoqi.feng@yahoo.

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST) yaoqi.feng@yahoo. ICSV4 Carns Australa 9- July, 007 RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL Yaoq FENG, Hanpng QIU Dynamc Test Laboratory, BISEE Chna Academy of Space Technology (CAST) yaoq.feng@yahoo.com Abstract

More information

The Application of Fractional Brownian Motion in Option Pricing

The Application of Fractional Brownian Motion in Option Pricing Vol. 0, No. (05), pp. 73-8 http://dx.do.org/0.457/jmue.05.0..6 The Applcaton of Fractonal Brownan Moton n Opton Prcng Qng-xn Zhou School of Basc Scence,arbn Unversty of Commerce,arbn zhouqngxn98@6.com

More information

Extending Probabilistic Dynamic Epistemic Logic

Extending Probabilistic Dynamic Epistemic Logic Extendng Probablstc Dynamc Epstemc Logc Joshua Sack May 29, 2008 Probablty Space Defnton A probablty space s a tuple (S, A, µ), where 1 S s a set called the sample space. 2 A P(S) s a σ-algebra: a set

More information

Joe Pimbley, unpublished, 2005. Yield Curve Calculations

Joe Pimbley, unpublished, 2005. Yield Curve Calculations Joe Pmbley, unpublshed, 005. Yeld Curve Calculatons Background: Everythng s dscount factors Yeld curve calculatons nclude valuaton of forward rate agreements (FRAs), swaps, nterest rate optons, and forward

More information

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure

More information

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM

GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM GRAVITY DATA VALIDATION AND OUTLIER DETECTION USING L 1 -NORM BARRIOT Jean-Perre, SARRAILH Mchel BGI/CNES 18.av.E.Beln 31401 TOULOUSE Cedex 4 (France) Emal: jean-perre.barrot@cnes.fr 1/Introducton The

More information

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by 6 CHAPTER 8 COMPLEX VECTOR SPACES 5. Fnd the kernel of the lnear transformaton gven n Exercse 5. In Exercses 55 and 56, fnd the mage of v, for the ndcated composton, where and are gven by the followng

More information

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center Dynamc Resource Allocaton and Power Management n Vrtualzed Data Centers Rahul Urgaonkar, Ulas C. Kozat, Ken Igarash, Mchael J. Neely urgaonka@usc.edu, {kozat, garash}@docomolabs-usa.com, mjneely@usc.edu

More information

Time Value of Money. Types of Interest. Compounding and Discounting Single Sums. Page 1. Ch. 6 - The Time Value of Money. The Time Value of Money

Time Value of Money. Types of Interest. Compounding and Discounting Single Sums. Page 1. Ch. 6 - The Time Value of Money. The Time Value of Money Ch. 6 - The Tme Value of Money Tme Value of Money The Interest Rate Smple Interest Compound Interest Amortzng a Loan FIN21- Ahmed Y, Dasht TIME VALUE OF MONEY OR DISCOUNTED CASH FLOW ANALYSIS Very Important

More information

PAS: A Packet Accounting System to Limit the Effects of DoS & DDoS. Debish Fesehaye & Klara Naherstedt University of Illinois-Urbana Champaign

PAS: A Packet Accounting System to Limit the Effects of DoS & DDoS. Debish Fesehaye & Klara Naherstedt University of Illinois-Urbana Champaign PAS: A Packet Accountng System to Lmt the Effects of DoS & DDoS Debsh Fesehaye & Klara Naherstedt Unversty of Illnos-Urbana Champagn DoS and DDoS DDoS attacks are ncreasng threats to our dgtal world. Exstng

More information

Traditional versus Online Courses, Efforts, and Learning Performance

Traditional versus Online Courses, Efforts, and Learning Performance Tradtonal versus Onlne Courses, Efforts, and Learnng Performance Kuang-Cheng Tseng, Department of Internatonal Trade, Chung-Yuan Chrstan Unversty, Tawan Shan-Yng Chu, Department of Internatonal Trade,

More information

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

Enabling P2P One-view Multi-party Video Conferencing

Enabling P2P One-view Multi-party Video Conferencing Enablng P2P One-vew Mult-party Vdeo Conferencng Yongxang Zhao, Yong Lu, Changja Chen, and JanYn Zhang Abstract Mult-Party Vdeo Conferencng (MPVC) facltates realtme group nteracton between users. Whle P2P

More information

) of the Cell class is created containing information about events associated with the cell. Events are added to the Cell instance

) of the Cell class is created containing information about events associated with the cell. Events are added to the Cell instance Calbraton Method Instances of the Cell class (one nstance for each FMS cell) contan ADC raw data and methods assocated wth each partcular FMS cell. The calbraton method ncludes event selecton (Class Cell

More information

Quantization Effects in Digital Filters

Quantization Effects in Digital Filters Quantzaton Effects n Dgtal Flters Dstrbuton of Truncaton Errors In two's complement representaton an exact number would have nfntely many bts (n general). When we lmt the number of bts to some fnte value

More information

MARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS

MARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS MARKET SHARE CONSTRAINTS AND THE LOSS FUNCTION IN CHOICE BASED CONJOINT ANALYSIS Tmothy J. Glbrde Assstant Professor of Marketng 315 Mendoza College of Busness Unversty of Notre Dame Notre Dame, IN 46556

More information

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT Toshhko Oda (1), Kochro Iwaoka (2) (1), (2) Infrastructure Systems Busness Unt, Panasonc System Networks Co., Ltd. Saedo-cho

More information

Vasicek s Model of Distribution of Losses in a Large, Homogeneous Portfolio

Vasicek s Model of Distribution of Losses in a Large, Homogeneous Portfolio Vascek s Model of Dstrbuton of Losses n a Large, Homogeneous Portfolo Stephen M Schaefer London Busness School Credt Rsk Electve Summer 2012 Vascek s Model Important method for calculatng dstrbuton of

More information

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications Methodology to Determne Relatonshps between Performance Factors n Hadoop Cloud Computng Applcatons Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng and

More information

Statistical algorithms in Review Manager 5

Statistical algorithms in Review Manager 5 Statstcal algorthms n Reve Manager 5 Jonathan J Deeks and Julan PT Hggns on behalf of the Statstcal Methods Group of The Cochrane Collaboraton August 00 Data structure Consder a meta-analyss of k studes

More information

Performance Analysis of Energy Consumption of Smartphone Running Mobile Hotspot Application

Performance Analysis of Energy Consumption of Smartphone Running Mobile Hotspot Application Internatonal Journal of mart Grd and lean Energy Performance Analyss of Energy onsumpton of martphone Runnng Moble Hotspot Applcaton Yun on hung a chool of Electronc Engneerng, oongsl Unversty, 511 angdo-dong,

More information

Abstract. 260 Business Intelligence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING

Abstract. 260 Business Intelligence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING 260 Busness Intellgence Journal July IDENTIFICATION OF DEMAND THROUGH STATISTICAL DISTRIBUTION MODELING FOR IMPROVED DEMAND FORECASTING Murphy Choy Mchelle L.F. Cheong School of Informaton Systems, Sngapore

More information

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA*

HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* HOUSEHOLDS DEBT BURDEN: AN ANALYSIS BASED ON MICROECONOMIC DATA* Luísa Farnha** 1. INTRODUCTION The rapd growth n Portuguese households ndebtedness n the past few years ncreased the concerns that debt

More information

SMART: Scalable, Bandwidth-Aware Monitoring of Continuous Aggregation Queries

SMART: Scalable, Bandwidth-Aware Monitoring of Continuous Aggregation Queries : Scalable, Bandwdth-Aware Montorng of Contnuous Aggregaton Queres Navendu Jan, Praveen Yalagandula, Mke Dahln, and Yn Zhang Unversty of Texas at Austn HP Labs ABSTRACT We present, a scalable, bandwdth-aware

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information