Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising


Journal of Machine Learning Research 14 (2013). Submitted 9/12; Revised 3/13; Published 11/13.

Léon Bottou
Microsoft, 1 Microsoft Way, Redmond, WA 98052, USA

Jonas Peters
Max Planck Institute, Spemannstraße, Tübingen, Germany
(Current address: ETH Zürich, Rämistraße 101, 8092 Zürich, Switzerland.)

Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson
Microsoft, 1 Microsoft Way, Redmond, WA 98052, USA
(Current address of Joaquin Quiñonero-Candela: Facebook, 1 Hacker Way, Menlo Park, CA 94025, USA.)

© 2013 Léon Bottou, Jonas Peters, Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard and Ed Snelson.

Abstract

This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.

Keywords: causation, counterfactual reasoning, computational advertising

1. Introduction

Statistical machine learning technologies in the real world are never without a purpose. Using their predictions, humans or machines make decisions whose circuitous consequences often violate the modeling assumptions that justified the system design in the first place.

Such contradictions appear very clearly in the case of the learning systems that power web-scale applications such as search engines, ad placement engines, or recommendation systems. For instance, the placement of advertisements on the result pages of Internet search engines depends on the bids of advertisers and on scores computed by statistical machine learning systems. Because the scores affect the contents of the result pages proposed to the users, they directly influence the occurrence of clicks and the corresponding advertiser payments. They also have important indirect effects. Ad placement decisions impact the satisfaction of the users and therefore their willingness to frequent this web site in the future. They also impact the return on investment observed by the advertisers and therefore their future bids. Finally they change the nature of the data collected for training the statistical models in the future.

These complicated interactions are clarified by important theoretical works. Under simplified assumptions, mechanism design (Myerson, 1981) leads to an insightful account of the advertiser feedback loop (Varian, 2007; Edelman et al., 2007). Under simplified assumptions, multiarmed bandits theory (Robbins, 1952; Auer et al., 2002; Langford and Zhang, 2008) and reinforcement learning (Sutton and Barto, 1998) describe the exploration/exploitation dilemma associated with the training feedback loop. However, none of these approaches gives a complete account of the complex interactions found in real-life systems.

This contribution proposes a novel approach: we view these complicated interactions as manifestations of the fundamental difference that separates correlation and causation. Using the ad placement example as a model of our problem class, we therefore argue that the language and the methods of causal inference provide flexible means to describe such complex machine learning systems and give sound answers to the practical questions facing the designer of such a system. Is it useful to pass a new input signal to the statistical model? Is it worthwhile to collect and label a new training set? What about changing the loss function or the learning algorithm? In order to answer such questions and improve the operational performance of the learning system, one needs to unravel how the information produced by the statistical models traverses the web of causes and effects and eventually produces measurable performance metrics.

Readers with an interest in causal inference will find in this paper (i) a real world example demonstrating the value of causal inference for large-scale machine learning applications, (ii) causal inference techniques applicable to continuously valued variables with meaningful confidence intervals, and (iii) quasi-static analysis techniques for estimating how small interventions affect certain causal equilibria.
Readers with an interest in real-life applications will find (iv) a selection of practical counterfactual analysis techniques applicable to many real-life machine learning systems. Readers with an interest in computational advertising will find a principled framework that (v) explains how to soundly use machine learning techniques for ad placement, and (vi) conceptually connects machine learning and auction theory in a compelling manner.

The paper is organized as follows. Section 2 gives an overview of the advertisement placement problem which serves as our main example. In particular, we stress some of the difficulties encountered when one approaches such a problem without a principled perspective. Section 3 provides a condensed review of the essential concepts of causal modeling and inference. Section 4 centers on formulating and answering counterfactual questions such as "how would the system have performed during the data collection period if certain interventions had been carried out on the system?" We describe importance sampling methods for counterfactual analysis, with clear conditions of validity and confidence intervals. Section 5 illustrates how the structure of the causal graph reveals opportunities to exploit prior information and vastly improve the confidence intervals. Section 6 describes how counterfactual analysis provides essential signals that can drive learning algorithms. Assume that we have identified interventions that would have caused the system to perform well during the data collection period. Which guarantees can we obtain on the performance of these same interventions in the future? Section 7 presents counterfactual differential techniques for the study of equilibria. Using data collected when the system is at equilibrium, we can estimate how a small intervention displaces the equilibrium. This provides an elegant and effective way to reason about long-term feedback effects. Various appendices complete the main text with information that we think more relevant to readers with specific backgrounds.

2. Causation Issues in Computational Advertising

After giving an overview of the advertisement placement problem, which serves as our main example, this section illustrates some of the difficulties that arise when one does not pay sufficient attention to the causal structure of the learning system.

2.1 Advertisement Placement

All Internet users are now familiar with the advertisement messages that adorn popular web pages. Advertisements are particularly effective on search engine result pages because users who are searching for something are good targets for advertisers who have something to offer. Several actors take part in this Internet advertisement game:

• Advertisers create advertisement messages, and place bids that describe how much they are willing to pay to see their ads displayed or clicked.

• Publishers provide attractive web services, such as, for instance, an Internet search engine. They display selected ads and expect to receive payments from the advertisers. The infrastructure to collect the advertiser bids and select ads is sometimes provided by an advertising network on behalf of its affiliated publishers. For the purposes of this work, we simply consider a publisher large enough to run its own infrastructure.

• Users reveal information about their current interests, for instance, by entering a query in a search engine. They are offered web pages that contain a selection of ads (Figure 1). Users sometimes click on an advertisement and are transported to a web site controlled by the advertiser where they can initiate some business.

A conventional bidding language is necessary to precisely define under which conditions an advertiser is willing to pay the bid amount.
In the case of Internet search advertisement, each bid specifies (a) the advertisement message, (b) a set of keywords, (c) one of several possible matching criteria between the keywords and the user query, and (d) the maximal price the advertiser is willing to pay when a user clicks on the ad after entering a query that matches the keywords according to the specified criterion.

Whenever a user visits a publisher web page, an advertisement placement engine runs an auction in real time in order to select winning ads, determine where to display them in the page, and compute the prices charged to advertisers, should the user click on their ad. Since the placement engine is operated by the publisher, it is designed to further the interests of the publisher. Fortunately for everyone else, the publisher must balance short term interests, namely the immediate revenue brought by the ads displayed on each web page, and long term interests, namely the future revenues resulting from the continued satisfaction of both users and advertisers.

Auction theory explains how to design a mechanism that optimizes the revenue of the seller of a single object (Myerson, 1981; Milgrom, 2004) under various assumptions about the information available to the buyers regarding the intentions of the other buyers. In the case of the ad placement problem, the publisher runs multiple auctions and sells opportunities to receive a click. When nearly identical auctions occur thousands of times per second, it is tempting to consider that the advertisers have perfect information about each other. This assumption gives support to the popular generalized second price rank-score auction (Varian, 2007; Edelman et al., 2007):

• Let x represent the auction context information, such as the user query, the user profile, the date, the time, etc. The ad placement engine first determines all eligible ads $a_1 \dots a_n$ and the corresponding bids $b_1 \dots b_n$ on the basis of the auction context x and of the matching criteria specified by the advertisers.

• For each selected ad $a_i$ and each potential position p on the web page, a statistical model outputs the estimate $q_{i,p}(x)$ of the probability that ad $a_i$ displayed in position p receives a user click. The rank-score $r_{i,p}(x) = b_i \, q_{i,p}(x)$ then represents the purported value associated with placing ad $a_i$ at position p.

• Let L represent a possible ad layout, that is, a set of positions that can simultaneously be populated with ads, and let $\mathcal{L}$ be the set of possible ad layouts, including of course the empty layout. The optimal layout and the corresponding ads are obtained by maximizing the total rank-score

\[
\max_{L \in \mathcal{L}} \; \max_{i_1, i_2, \dots} \; \sum_{p \in L} r_{i_p,p}(x), \tag{1}
\]

subject to reserve constraints

\[
\forall p \in L, \quad r_{i_p,p}(x) \ge R_p(x),
\]

and also subject to diverse policy constraints, such as, for instance, preventing the simultaneous display of multiple ads belonging to the same advertiser. Under mild assumptions, this discrete maximization problem is amenable to computationally efficient greedy algorithms (see Appendix A).

• The advertiser payment associated with a user click is computed using the generalized second price (GSP) rule: the advertiser pays the smallest bid that it could have entered without changing the solution of the discrete maximization problem, all other bids remaining equal. In other words, the advertiser could not have manipulated its bid and obtained the same treatment for a better price.

Figure 1: Mainline and sidebar ads on a search result page. Ads placed in the mainline are more likely to be noticed, increasing both the chances of a click if the ad is relevant and the risk of annoying the user if the ad is not relevant.
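As a rough illustration of the allocation and pricing rules above, the following sketch runs a rank-score auction with GSP pricing. It relies on strong simplifying assumptions that are not in the text: a single column of positions, click probability estimates that do not depend on the position, one reserve shared by all positions, and no policy constraints. The names `Ad` and `run_auction` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    name: str
    bid: float         # maximal price per click, b_i
    click_prob: float  # estimated click probability, q_i (assumed position-independent)


def rank_score(ad):
    """Rank-score r_i = b_i * q_i."""
    return ad.bid * ad.click_prob


def run_auction(ads, num_positions, reserve):
    """Greedy rank-score allocation with generalized second price charges.

    Returns a list of (ad, price_per_click) pairs, best position first.
    """
    # Ads whose rank-score does not clear the reserve are not eligible.
    eligible = [ad for ad in ads if rank_score(ad) >= reserve]
    ranked = sorted(eligible, key=rank_score, reverse=True)
    results = []
    for pos, ad in enumerate(ranked[:num_positions]):
        # Rank-score the ad must still match to keep its position:
        # the next ad in the ranking if there is one, otherwise the reserve.
        if pos + 1 < len(ranked):
            threshold = rank_score(ranked[pos + 1])
        else:
            threshold = reserve
        # Smallest bid that leaves the allocation unchanged.
        results.append((ad, threshold / ad.click_prob))
    return results


if __name__ == "__main__":
    example = [Ad("A", 2.0, 0.10), Ad("B", 1.0, 0.15), Ad("C", 3.0, 0.02)]
    for ad, price in run_auction(example, num_positions=2, reserve=0.05):
        print(ad.name, round(price, 3))
```

In this simplified setting each winner pays no more than its own bid, because the threshold it must match is never larger than its own rank-score.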

Under the perfect information assumption, the analysis suggests that the publisher simply needs to find which reserve prices $R_p(x)$ yield the best revenue per auction. However, the total revenue of the publisher also depends on the traffic experienced by its web site. Displaying an excessive number of irrelevant ads can train users to ignore the ads, and can also drive them to competing web sites. Advertisers can artificially raise the rank-scores of irrelevant ads by temporarily increasing the bids. Indelicate advertisers can create deceiving advertisements that elicit many clicks but direct users to spam web sites. Experience shows that the continued satisfaction of the users is more important to the publisher than it is to the advertisers.

Therefore the generalized second price rank-score auction has evolved. Rank-scores have been augmented with terms that quantify the user satisfaction or the ad relevance. Bids receive adaptive discounts in order to deal with situations where the perfect information assumption is unrealistic. These adjustments are driven by additional statistical models. The ad placement engine should therefore be viewed as a complex learning system interacting with both users and advertisers.

2.2 Controlled Experiments

The designer of such an ad placement engine faces the fundamental question of testing whether a proposed modification of the ad placement engine results in an improvement of the operational performance of the system.

The simplest way to answer such a question is to try the modification. The basic idea is to randomly split the users into treatment and control groups (Kohavi et al., 2008). Users from the control group see web pages generated using the unmodified system. Users of the treatment groups see web pages generated using alternate versions of the system. Monitoring various performance metrics for a couple of months usually gives sufficient information to reliably decide which variant of the system delivers the most satisfactory performance.
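One common way to implement such a random split, shown here only as a sketch and not as the procedure used by the authors, is to hash a user identifier together with an experiment name so that each user always lands in the same group. The function `assign_group`, its parameters, and the identifiers below are hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_fraction: float = 0.5) -> str:
    """Deterministically map a user to 'treatment' or 'control'.

    The hash of (experiment, user_id) acts as a pseudo-random number that is
    fixed for a given user, so repeated visits see the same system variant.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_fraction else "control"


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, assign_group(uid, "new-click-model"))
```

Including the experiment name in the hash keeps the assignments of different experiments from lining up with each other.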
Modifying an advertisement placement engine elicits reactions from both the users and the advertisers. Whereas it is easy to split users into treatment and control groups, splitting advertisers into treatment and control groups demands special attention because each auction involves multiple advertisers (Charles et al., 2012). Simultaneously controlling for both users and advertisers is probably impossible.

Controlled experiments also suffer from several drawbacks. They are expensive because they demand a complete implementation of the proposed modifications. They are slow because each experiment typically demands a couple of months. Finally, although there are elegant ways to efficiently run overlapping controlled experiments on the same traffic (Tang et al., 2010), they are limited by the volume of traffic available for experimentation.

It is therefore difficult to rely on controlled experiments during the conception phase of potential improvements to the ad placement engine. It is similarly difficult to use controlled experiments to drive the training algorithms associated with click probability estimation models. Cheaper and faster statistical methods are needed to drive these essential aspects of the development of an ad placement engine. Unfortunately, interpreting cheap and fast data can be very deceiving.

2.3 Confounding Data

Assessing the consequence of an intervention using statistical data is generally challenging because it is often difficult to determine whether the observed effect is a simple consequence of the intervention or has other uncontrolled causes.

                                             Overall          Patients with    Patients with
                                                              small stones     large stones
Treatment A: Open surgery                    78% (273/350)    93% (81/87)      73% (192/263)
Treatment B: Percutaneous nephrolithotomy    83% (289/350)    87% (234/270)    69% (55/80)

Table 1: A classic example of Simpson's paradox. The table reports the success rates of two treatments for kidney stones (Charig et al., 1986, Tables I and II). Although the overall success rate of treatment B seems better, treatment B performs worse than treatment A on both patients with small kidney stones and patients with large kidney stones. See Section 2.3.

For instance, the empirical comparison of certain kidney stone treatments illustrates this difficulty (Charig et al., 1986). Table 1 reports the success rates observed on two groups of 350 patients treated with respectively open surgery (treatment A, with 78% success) and percutaneous nephrolithotomy (treatment B, with 83% success). Although treatment B seems more successful, it was more frequently prescribed to patients suffering from small kidney stones, a less serious condition. Did treatment B achieve a high success rate because of its intrinsic qualities or because it was preferentially applied to less severe cases? Further splitting the data according to the size of the kidney stones reverses the conclusion: treatment A now achieves the best success rate for both patients suffering from large kidney stones and patients suffering from small kidney stones. Such an inversion of the conclusion is called Simpson's paradox (Simpson, 1951).

The stone size in this study is an example of a confounding variable, that is, an uncontrolled variable whose consequences pollute the effect of the intervention. Doctors knew the size of the kidney stones, chose to treat the healthier patients with the least invasive treatment B, and therefore caused treatment B to appear more effective than it actually was.
If we now decide to apply treatment B to all patients irrespective of the stone size, we break the causal path connecting the stone size to the outcome, we eliminate the illusion, and we will experience disappointing results.

When we suspect the existence of a confounding variable, we can split the contingency tables and reach improved conclusions. Unfortunately we cannot fully trust these conclusions unless we are certain to have taken into account all confounding variables. The real problem therefore comes from the confounding variables we do not know.

Randomized experiments arguably provide the only correct solution to this problem (see Stigler, 1992). The idea is to randomly choose whether the patient receives treatment A or treatment B. Because this random choice is independent of all the potential confounding variables, known and unknown, they cannot pollute the observed effect of the treatments (see also Section 4.2). This is why controlled experiments in ad placement (Section 2.2) randomly distribute users between treatment and control groups, and this is also why, in the case of an ad placement engine, we should be somewhat concerned by the practical impossibility of randomly distributing both users and advertisers.
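The kidney stone discussion above can be checked numerically. The sketch below recomputes the success rates from the counts of Table 1 and then estimates the success rate we would expect if every patient received the same treatment. That last step assumes, as an illustration only, that stone size is the sole confounding variable; the function names are hypothetical.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size, from Table 1.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def overall_rate(treatment):
    """Success rate over all patients who received `treatment`."""
    successes = sum(s for s, _ in counts[treatment].values())
    patients = sum(n for _, n in counts[treatment].values())
    return Fraction(successes, patients)


def stratum_rate(treatment, size):
    """Success rate of `treatment` among patients with the given stone size."""
    successes, patients = counts[treatment][size]
    return Fraction(successes, patients)


def adjusted_rate(treatment):
    """Expected success rate if all patients received `treatment`.

    Weights each stratum rate by the share of that stone size among all
    patients. Valid only if stone size is the only confounder (assumption).
    """
    total = sum(n for strata in counts.values() for _, n in strata.values())
    rate = Fraction(0)
    for size in ("small", "large"):
        stratum_patients = sum(counts[t][size][1] for t in counts)
        rate += Fraction(stratum_patients, total) * stratum_rate(treatment, size)
    return rate


if __name__ == "__main__":
    for t in ("A", "B"):
        print(t, float(overall_rate(t)), float(adjusted_rate(t)))
```

Under that assumption the adjusted rates come out near 83% for treatment A and 78% for treatment B, which reverses the ordering of the overall rates, in line with the disappointing results anticipated above for treating every patient with B.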

              Overall            q_2 low           q_2 high
q_1 low       6.2% (124/2000)    5.1% (92/1823)    18.1% (32/176)
q_1 high      7.5% (149/2000)    4.8% (71/1500)    15.6% (78/500)

Table 2: Confounding data in ad placement. The table reports the click-through rates and the click counts of the second mainline ad. The overall counts suggest that the click-through rate of the second mainline ad increases when the click probability estimate $q_1$ of the top ad is high. However, if we further split the pages according to the click probability estimate $q_2$ of the second mainline ad, we reach the opposite conclusion. See Section 2.4.

2.4 Confounding Data in Ad Placement

Let us return to the question of assessing the value of passing a new input signal to the ad placement engine click prediction model. Section 2.1 outlines a placement method where the click probability estimates $q_{i,p}(x)$ depend on the ad and the position we consider, but do not depend on other ads displayed on the page. We now consider replacing this model by a new model that additionally uses the estimated click probability of the top mainline ad to estimate the click probability of the second mainline ad (Figure 1). We would like to estimate the effect of such an intervention using existing statistical data.

We have collected ad placement data for Bing search result pages served during three consecutive hours on a certain slice of traffic. Let $q_1$ and $q_2$ denote the click probability estimates computed by the existing model for respectively the top mainline ad and the second mainline ad. After excluding pages displaying fewer than two mainline ads, we form two groups of 2000 pages randomly picked among those satisfying the conditions $q_1 < 0.15$ for the first group and $q_1 \ge 0.15$ for the second group. Table 2 reports the click counts and frequencies observed on the second mainline ad in each group.
Although the overall numbers show that users click more often on the second mainline ad when the top mainline ad has a high click probability estimate $q_1$, this conclusion is reversed when we further split the data according to the click probability estimate $q_2$ of the second mainline ad.

Despite superficial similarities, this example is considerably more difficult to interpret than the kidney stone example. The overall click counts show that the actual click-through rate of the second mainline ad is positively correlated with the click probability estimate on the top mainline ad. Does this mean that we can increase the total number of clicks by placing regular ads below frequently clicked ads?

Remember that the click probability estimates depend on the search query which itself depends on the user intention. The most likely explanation is that pages with a high $q_1$ are frequently associated with more commercial searches and therefore receive more ad clicks on all positions. The observed correlation occurs because the presence of a click and the magnitude of the click probability estimate $q_1$ have a common cause: the user intention. Meanwhile, the click probability estimate $q_2$ returned by the current model for the second mainline ad also depends on the query and therefore the user intention. Therefore, assuming that this dependence has comparable strength, and assuming that there are no other causal paths, splitting the counts according to the magnitude of $q_2$ factors out the effects of this common confounding cause. We then observe a negative correlation which now suggests that a frequently clicked top mainline ad has a negative impact on the click-through rate of the second mainline ad.

If this is correct, we would probably increase the accuracy of the click prediction model by switching to the new model. This would decrease the click probability estimates for ads placed in the second mainline position on commercial search pages. These ads are then less likely to clear the reserve and therefore more likely to be displayed in the less attractive sidebar. The net result is probably a loss of clicks and a loss of money despite the higher quality of the click probability model. Although we could tune the reserve prices to compensate for this unfortunate effect, nothing in this data tells us where the performance of the ad placement engine will land. Furthermore, unknown confounding variables might completely reverse our conclusions.

Making sense out of such data is just too complex!

2.5 A Better Way

It should now be obvious that we need a more principled way to reason about the effect of potential interventions. We provide one such more principled approach using the causal inference machinery (Section 3). The next step is then the identification of a class of questions that are sufficiently expressive to guide the designer of a complex learning system, and sufficiently simple to be answered using data collected in the past using adequate procedures (Section 4).

A machine learning algorithm can then be viewed as an automated way to generate questions about the parameters of a statistical model, obtain the corresponding answers, and update the parameters accordingly (Section 6). Learning algorithms derived in this manner are very flexible: human designers and machine learning algorithms can cooperate seamlessly because they rely on similar sources of information.

3. Modeling Causal Systems

When we point out a causal relationship between two events, we describe what we expect to happen to the event we call the effect, should an external operator manipulate the event we call the cause.
Manipulability theories of causation (von Wright, 1971; Woodward, 2005) raise this commonsense insight to the status of a definition of the causal relation. Difficult adjustments are then needed to interpret statements involving causes that we can only observe through their effects, "because they love me," or that are not easily manipulated, "because the earth is round."

Modern statistical thinking makes a clear distinction between the statistical model and the world. The actual mechanisms underlying the data are considered unknown. The statistical models do not need to reproduce these mechanisms to emulate the observable data (Breiman, 2001). Better models are sometimes obtained by deliberately not reproducing the true mechanisms (Vapnik, 1982, Section 8.6). We can approach the manipulability puzzle in the same spirit by viewing causation as a reasoning model (Bottou, 2011) rather than a property of the world. Causes and effects are simply the pieces of an abstract reasoning game. Causal statements that are not empirically testable acquire validity when they are used as intermediate steps when one reasons about manipulations or interventions amenable to experimental validation.

This section presents the rules of this reasoning game. We largely follow the framework proposed by Pearl (2009) because it gives a clear account of the connections between causal models and probabilistic models.

\[
\begin{aligned}
x &= f_1(u, \varepsilon_1) && \text{Query context } x \text{ from user intent } u. \\
a &= f_2(x, v, \varepsilon_2) && \text{Eligible ads } (a_i) \text{ from query } x \text{ and inventory } v. \\
b &= f_3(x, v, \varepsilon_3) && \text{Corresponding bids } (b_i). \\
q &= f_4(x, a, \varepsilon_4) && \text{Scores } (q_{i,p}, R_p) \text{ from query } x \text{ and ads } a. \\
s &= f_5(a, q, b, \varepsilon_5) && \text{Ad slate } s \text{ from eligible ads } a \text{, scores } q \text{ and bids } b. \\
c &= f_6(a, q, b, \varepsilon_6) && \text{Corresponding click prices } c. \\
y &= f_7(s, u, \varepsilon_7) && \text{User clicks } y \text{ from ad slate } s \text{ and user intent } u. \\
z &= f_8(y, c, \varepsilon_8) && \text{Revenue } z \text{ from clicks } y \text{ and prices } c.
\end{aligned}
\]

Figure 2: A structural equation model for ad placement. The sequence of equations describes the flow of information. The functions $f_k$ describe how effects depend on their direct causes. The additional noise variables $\varepsilon_k$ represent independent sources of randomness useful to model probabilistic dependencies.

3.1 The Flow of Information

Figure 2 gives a deterministic description of the operation of the ad placement engine. Variable u represents the user and his or her intention in an unspecified manner. The query and query context x is then expressed as an unknown function of u and of a noise variable $\varepsilon_1$. Noise variables in this framework are best viewed as independent sources of randomness useful for modeling a nondeterministic causal dependency. We shall only mention them when they play a specific role in the discussion. The set of eligible ads a and the corresponding bids b are then derived from the query x and the ad inventory v supplied by the advertisers. Statistical models then compute a collection of scores q such as the click probability estimates $q_{i,p}$ and the reserves $R_p$ introduced in Section 2.1. The placement logic uses these scores to generate the ad slate s, that is, the set of winning ads and their assigned positions. The corresponding click prices c are computed. The set of user clicks y is expressed as an unknown function of the ad slate s and the user intent u. Finally the revenue z is expressed as another function of the clicks y and the prices c.

Such a system of equations is called a structural equation model (Wright, 1921).
Each equation asserts a functional dependency between an effect, appearing on the left hand side of the equation, and its direct causes, appearing on the right hand side as arguments of the function. Some of these causal dependencies are unknown. Although we postulate that the effect can be expressed as some function of its direct causes, we do not know the form of this function. For instance, the designer of the ad placement engine knows functions $f_2$ to $f_6$ and $f_8$ because he has designed them. However, he does not know the functions $f_1$ and $f_7$ because whoever designed the user did not leave sufficient documentation.

Figure 3 represents the directed causal graph associated with the structural equation model. Each arrow connects a direct cause to its effect. The noise variables are omitted for simplicity. The structure of this graph reveals fundamental assumptions about our model. For instance, the user clicks y do not directly depend on the scores q or the prices c because users do not have access to this information.

We hold as a principle that causation obeys the arrow of time: causes always precede their effects. Therefore the causal graph must be acyclic. Structural equation models then support two fundamental operations, namely simulation and intervention.
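The two operations can be sketched on a toy model. The code below is only an illustration: it keeps a handful of the variables of Figure 2 (u, x, q, s, y), and every function in `toy_model` is an invented stand-in, since the text stresses that the true $f_1$ and $f_7$ are unknown. The helper names `simulate` and `intervene` are hypothetical.

```python
import random


def simulate(equations, exogenous, rng):
    """Compute all variables by applying the equations in causal order.

    `equations` is an ordered list of (name, function, parent_names). Each
    function receives the values of its parents followed by a fresh,
    independent noise draw.
    """
    values = dict(exogenous)
    for name, f, parents in equations:
        noise = rng.random()
        values[name] = f(*(values[p] for p in parents), noise)
    return values


def intervene(equations, name, constant):
    """Derive a new model in which variable `name` is clamped to `constant`."""
    return [
        (n, (lambda *args, c=constant: c), ()) if n == name else (n, f, parents)
        for n, f, parents in equations
    ]


# Invented toy equations (assumptions, not the real system).
toy_model = [
    ("x", lambda u, eps: "commercial" if u > 0.5 else "informational", ("u",)),
    ("q", lambda x, eps: 0.2 if x == "commercial" else 0.05, ("x",)),
    ("s", lambda q, eps: "mainline" if q >= 0.1 else "sidebar", ("q",)),
    ("y", lambda s, u, eps: int(eps < u * (0.5 if s == "mainline" else 0.1)), ("s", "u")),
]

if __name__ == "__main__":
    observed = simulate(toy_model, {"u": 0.9}, random.Random(0))
    forced = simulate(intervene(toy_model, "s", "sidebar"), {"u": 0.9}, random.Random(0))
    print(observed)
    print(forced)
```

Clamping s replaces only its own equation: the variables upstream of s keep their values, while the clicks y are recomputed from the forced slate.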

Figure 3: Causal graph associated with the structural equation model of Figure 2. The mutually independent noise variables $\varepsilon_1$ to $\varepsilon_8$ are implicit. The variables a, b, q, s, c, and z depend on their direct causes in known ways. In contrast, the variables u and v are exogenous and the variables x and y depend on their direct causes through unknown functions.

Simulation. Let us assume that we know both the exact form of all functional dependencies and the value of all exogenous variables, that is, the variables that never appear in the left hand side of an equation. We can compute the values of all the remaining variables by applying the equations in their natural time sequence.

Intervention. As long as the causal graph remains acyclic, we can construct derived structural equation models using arbitrary algebraic manipulations of the system of equations. For instance, we can clamp a variable to a constant value by rewriting the right-hand side of the corresponding equation as the specified constant value.

The algebraic manipulation of the structural equation models provides a powerful language to describe interventions on a causal system. This is not a coincidence. Many aspects of the mathematical notation were invented to support causal inference in classical mechanics. However, we no longer have to interpret the variable values as physical quantities: the equations simply describe the flow of information in the causal model (Wiener, 1948).

3.2 The Isolation Assumption

Let us now turn our attention to the exogenous variables, that is, variables that never appear in the left hand side of an equation of the structural model. Leibniz's principle of sufficient reason claims that there are no facts without causes. This suggests that the exogenous variables are the effects of a network of causes not expressed by the structural equation model. For instance, the user intent u and the ad inventory v in Figure 3 have temporal correlations because both users and advertisers worry about their budgets when the end of the month approaches.
Any structural equation model should then be understood in the context of a larger structural equation model potentially describing all things in existence.

Ads served on a particular page contribute to the continued satisfaction of both users and advertisers, and therefore have an effect on their willingness to use the services of the publisher in the future. The ad placement structural equation model shown in Figure 2 only describes the causal dependencies for a single page and therefore cannot account for such effects. Consider however a very large structural equation model containing a copy of the page-level model for every web page ever served by the publisher. Figure 4 shows how we can thread the page-level models corresponding to pages served to the same user. Similarly we could model how advertisers track the performance and the cost of their advertisements and model how their satisfaction affects their future bids. The resulting causal graphs can be very complex. Part of this complexity results from time-scale differences. Thousands of search pages are served in a second. Each page contributes a little to the continued satisfaction of one user and a few advertisers. The accumulation of these contributions produces measurable effects after a few weeks.

Figure 4: Conceptually unrolling the user feedback loop by threading instances of the single page causal graph (Figure 3). Both the ad slate $s_t$ and user clicks $y_t$ have an indirect effect on the user intent $u_{t+1}$ associated with the next query.

Many of the functional dependencies expressed by the structural equation model are left unspecified. Without direct knowledge of these functions, we must reason using statistical data. The most fundamental statistical data is collected from repeated trials that are assumed independent. When we consider the large structural equation model of everything, we can only have one large trial producing a single data point.[1] It is therefore desirable to identify repeated patterns of identical equations that can be viewed as repeated independent trials. Therefore, when we study a structural equation model representing such a pattern, we need to make an additional assumption to express the idea that the outcome of one trial does not affect the other trials. We call such an assumption an isolation assumption by analogy with thermodynamics.[2] This can be achieved by assuming that the exogenous variables are independently drawn from an unknown but fixed joint probability distribution. This assumption cuts the causation effects that could flow through the exogenous variables.
The noise variables are also exogenous variables acting as independent sources of randomness. The noise variables are useful to represent the conditional distribution $P(\text{effect} \mid \text{causes})$ using the equation $\text{effect} = f(\text{causes}, \varepsilon)$. Therefore, we also assume joint independence between all the noise variables and any of the named exogenous variables.³ For instance, in the case of the ad placement model shown in Figure 2, we assume that the joint distribution of the exogenous variables factorizes as

$$P(u,v,\varepsilon_1,\dots,\varepsilon_8) = P(u,v)\,P(\varepsilon_1)\cdots P(\varepsilon_8).$$

1. See also the discussion on reinforcement learning.

2. The concept of isolation is pervasive in physics. An isolated system in thermodynamics (Reichl, 1998, Section 2.D) or a closed system in mechanics (Landau and Lifshitz, 1969, §5) evolves without exchanging mass or energy with its surroundings. Experimental trials involving systems that are assumed isolated may differ in their initial setup and therefore have different outcomes. Assuming isolation implies that the outcome of each trial cannot affect the other trials.

3. Rather than letting two noise variables display measurable statistical dependencies because they share a common cause, we prefer to name the common cause and make the dependency explicit in the graph.

$$
\begin{aligned}
P(u,v,x,a,b,q,s,c,y,z) ={}\; & P(u,v) && \text{Exogenous vars.} \\
\times\; & P(x \mid u) && \text{Query.} \\
\times\; & P(a \mid x,v) && \text{Eligible ads.} \\
\times\; & P(b \mid x,v) && \text{Bids.} \\
\times\; & P(q \mid x,a) && \text{Scores.} \\
\times\; & P(s \mid a,q,b) && \text{Ad slate.} \\
\times\; & P(c \mid a,q,b) && \text{Prices.} \\
\times\; & P(y \mid s,u) && \text{Clicks.} \\
\times\; & P(z \mid y,c) && \text{Revenue.}
\end{aligned}
$$

Figure 5: Markov factorization of the structural equation model of Figure 2.

Figure 6: Bayesian network associated with the Markov factorization shown in Figure 5.

Since an isolation assumption is only true up to a point, it should be expressed clearly and remain under constant scrutiny. We must therefore measure additional performance metrics that reveal how the isolation assumption holds. For instance, the ad placement structural equation model and the corresponding causal graph (figures 2 and 3) do not take user feedback or advertiser feedback into account. Measuring the revenue is not enough because we could easily generate revenue at the expense of the satisfaction of the users and advertisers. When we evaluate interventions under such an isolation assumption, we also need to measure a battery of additional quantities that act as proxies for the user and advertiser satisfaction. Noteworthy examples include ad relevance estimated by human judges, and advertiser surplus estimated from the auctions (Varian, 2009).

3.3 Markov Factorization

Conceptually, we can draw a sample of the exogenous variables using the distribution specified by the isolation assumption, and we can then generate values for all the remaining variables by simulating the structural equation model.

This process defines a generative probabilistic model representing the joint distribution of all variables in the structural equation model. The distribution readily factorizes as the product of the joint probability of the named exogenous variables, and, for each equation in the structural equation model, the conditional probability of the effect given its direct causes (Spirtes et al., 1993; Pearl, 2000). As illustrated by figures 5 and 6, this Markov factorization connects the structural equation model that describes causation, and the Bayesian network that describes the joint probability distribution followed by the variables under the isolation assumption.⁴

Structural equation models and Bayesian networks appear so intimately connected that it could be easy to forget the differences. The structural equation model is an algebraic object. As long as the causal graph remains acyclic, algebraic manipulations are interpreted as interventions on the causal system. The Bayesian network is a generative statistical model representing a class of joint probability distributions, and, as such, does not support algebraic manipulations. However, the symbolic representation of its Markov factorization is an algebraic object, essentially equivalent to the structural equation model.

3.4 Identification, Transportation, and Transfer Learning

Consider a causal system represented by a structural equation model with some unknown functional dependencies. Subject to the isolation assumption, data collected during the operation of this system follows the distribution described by the corresponding Markov factorization. Let us first assume that this data is sufficient to identify the joint distribution of the subset of variables we can observe. We can intervene on the system by clamping the value of some variables. This amounts to replacing the right-hand side of the corresponding structural equations by constants.
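As an illustration only, the following Python sketch simulates a toy structural equation model and then clamps one of its variables. The three equations, their coefficients, and the names `simulate`, `mean_y`, and `clamp_q` are assumptions made up for this example; they are not the ad placement model of Figure 2. The sketch draws the exogenous and noise variables independently for each trial (the isolation assumption), generates the remaining variables from the equations, and compares conditioning on $q = 1$ with clamping $q = 1$.

```python
import random

def simulate(n, clamp_q=None, seed=0):
    """Draw n independent trials from a toy structural equation model.

    x = eps0                                      exogenous context, uniform on [0, 1]
    q = 1 if eps1 < x else 0                      binary decision caused by x
    y = 1 if eps2 < 0.1 + 0.3*q + 0.2*x else 0    binary outcome caused by q and x

    Passing clamp_q replaces the right-hand side of the equation for q
    by a constant, which is the clamping intervention.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n):
        x = rng.random()                          # exogenous variable
        eps1, eps2 = rng.random(), rng.random()   # independent noise variables
        q = (1 if eps1 < x else 0) if clamp_q is None else clamp_q
        y = 1 if eps2 < 0.1 + 0.3 * q + 0.2 * x else 0
        trials.append((x, q, y))
    return trials

def mean_y(trials):
    """Average outcome over a list of (x, q, y) trials."""
    return sum(y for _, _, y in trials) / len(trials)

observed = simulate(50_000)             # original system
clamped = simulate(50_000, clamp_q=1)   # manipulated system

# Conditioning on q = 1 in the observed data differs from clamping q = 1,
# because x is a common cause of both q and y.
conditioned = mean_y([t for t in observed if t[1] == 1])
intervened = mean_y(clamped)
print(f"E[y | q=1] ~ {conditioned:.3f}   E[y | q clamped to 1] ~ {intervened:.3f}")
```

For this toy model the two quantities can be computed by hand: conditioning gives $0.4 + 0.2 \cdot \tfrac{2}{3} \approx 0.533$ while clamping gives $0.4 + 0.2 \cdot \tfrac{1}{2} = 0.5$, and the simulated averages should land close to these values.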
The joint distribution of the variables is then described by a new Markov factorization that shares many factors with the original Markov factorization. Which conditional probabilities associated with this new distribution can we express using only conditional probabilities identified during the observation of the original system? This is called the identifiability problem. More generally, we can consider arbitrarily complex manipulations of the structural equation model, and we can perform multiple experiments involving different manipulations of the causal system. Which conditional probabilities pertaining to one experiment can be expressed using only conditional probabilities identified during the observation of other experiments? This is called the transportability problem.

4. Bayesian networks are directed graphs representing the Markov factorization of a joint probability distribution: the arrows no longer have a causal interpretation.

Pearl's do-calculus completely solves the identifiability problem and provides useful tools to address many instances of the transportability problem (see Pearl, 2012). Assuming that we know the conditional probability distributions involving observed variables in the original structural equation model, do-calculus allows us to derive conditional distributions pertaining to the manipulated structural equation model.

Unfortunately, we must further distinguish the conditional probabilities that we know (because we designed them) from those that we estimate from empirical data. This distinction is important because estimating the distribution of continuous or high cardinality variables is notoriously difficult. Furthermore, do-calculus often combines the estimated probabilities in ways that amplify estimation errors. This happens when the manipulated structural equation model exercises the variables in ways that were rarely observed in the data collected from the original structural equation model.

Therefore we prefer to use much simpler causal inference techniques (see sections 4.1 and 4.2). Although these techniques do not have the completeness properties of do-calculus, they combine estimation and transportation in a manner that facilitates the derivation of useful confidence intervals.

3.5 Special Cases

Three special cases of causal models are particularly relevant to this work.

In the multi-armed bandit (Robbins, 1952), a user-defined policy function $\pi$ determines the distribution of action $a \in \{1 \dots K\}$, and an unknown reward function $r$ determines the distribution of the outcome $y$ given the action $a$ (Figure 7). In order to maximize the accumulated rewards, the player must construct policies $\pi$ that balance the exploration of the action space with the exploitation of the best action identified so far (Auer et al., 2002; Audibert et al., 2007; Seldin et al., 2012).

The contextual bandit problem (Langford and Zhang, 2008) significantly increases the complexity of multi-armed bandits by adding one exogenous variable $x$ to the policy function $\pi$ and the reward functions $r$ (Figure 8).

Both multi-armed bandit and contextual bandit are special cases of reinforcement learning (Sutton and Barto, 1998). In essence, a Markov decision process is a sequence of contextual bandits where the context is no longer an exogenous variable but a state variable that depends on the previous states and actions (Figure 9). Note that the policy function $\pi$, the reward function $r$, and the transition function $s$ are independent of time. All the time dependencies are expressed using the states $s_t$.

These special cases have increasing generality. Many simple structural equation models can be reduced to a contextual bandit problem using appropriate definitions of the context $x$, the action $a$ and the outcome $y$. For instance, assuming that the prices $c$ are discrete, the ad placement structural equation model shown in Figure 2 reduces to a contextual bandit problem with context $(u, v)$, actions $(s, c)$ and reward $z$.
Similarly, given a sufficiently intricate definition of the state variables $s_t$, all structural equation models with discrete variables can be reduced to a reinforcement learning problem. Such reductions lose the fine structure of the causal graph. We show in Section 5 how this fine structure can in fact be leveraged to obtain more information from the same experiments.

Modern reinforcement learning algorithms (see Sutton and Barto, 1998) leverage the assumption that the policy function, the reward function, the transition function, and the distributions of the corresponding noise variables, are independent from time. This invariance property provides great benefits when the observed sequences of actions and rewards are long in comparison with the size of the state space. Only Section 7 in this contribution presents methods that take advantage of such an invariance. The general question of leveraging arbitrary functional invariances in causal graphs is left for future work.

$$
\begin{aligned}
a &= \pi(\varepsilon) && \text{Action } a \in \{1 \dots K\} \\
y &= r(a, \varepsilon') && \text{Reward } y \in \mathbb{R}
\end{aligned}
$$

Figure 7: Structural equation model for the multi-armed bandit problem. The policy $\pi$ selects a discrete action $a$, and the reward function $r$ determines the outcome $y$. The noise variables $\varepsilon$ and $\varepsilon'$ represent independent sources of randomness useful to model probabilistic dependencies.

$$
\begin{aligned}
a &= \pi(x, \varepsilon) && \text{Action } a \in \{1 \dots K\} \\
y &= r(x, a, \varepsilon') && \text{Reward } y \in \mathbb{R}
\end{aligned}
$$

Figure 8: Structural equation model for the contextual bandit problem. Both the action and the reward depend on an exogenous context variable $x$.

$$
\begin{aligned}
a_t &= \pi(s_{t-1}, \varepsilon_t) && \text{Action} \\
y_t &= r(s_{t-1}, a_t, \varepsilon'_t) && \text{Reward } y_t \in \mathbb{R} \\
s_t &= s(s_{t-1}, a_t, \varepsilon''_t) && \text{Next state}
\end{aligned}
$$

Figure 9: Structural equation model for reinforcement learning. The above equations are replicated for all $t \in \{0, \dots, T\}$. The context is now provided by a state variable $s_{t-1}$ that depends on the previous states and actions.

4. Counterfactual Analysis

We now return to the problem of formulating and answering questions about the value of proposed changes of a learning system. Assume for instance that we consider replacing the score computation model $M$ of an ad placement engine by an alternate model $M^*$. We seek an answer to the conditional question:

How will the system perform if we replace model $M$ by model $M^*$?

Given sufficient time and sufficient resources, we can obtain the answer using a controlled experiment (Section 2.2). However, instead of carrying out a new experiment, we would like to obtain an answer using data that we have already collected in the past.

How would the system have performed if, when the data was collected, we had replaced model $M$ by model $M^*$?

The answer to this counterfactual question is of course a counterfactual statement that describes the system performance subject to a condition that did not happen. Counterfactual statements challenge ordinary logic because they depend on a condition that is known to be false. Although assertion $A \Rightarrow B$ is always true when assertion $A$ is false, we certainly do not mean for all counterfactual statements to be true. Lewis (1973) navigates this paradox using a modal logic in which a counterfactual statement describes the state of affairs in an alternate world that resembles ours except for the specified differences.
Counterfactuals indeed offer many subtle ways to qualify such alternate worlds. For instance, we can easily describe isolation assumptions (Section 3.2) in a counterfactual question:

How would the system have performed if, when the data was collected, we had replaced model $M$ by model $M^*$ without incurring user or advertiser reactions?

The fact that we could not have changed the model without incurring the user and advertiser reactions does not matter any more than the fact that we did not replace model $M$ by model $M^*$ in the first place. This does not prevent us from using counterfactual statements to reason about causes and effects. Counterfactual questions and statements provide a natural framework to express and share our conclusions.

The remaining text in this section explains how we can answer certain counterfactual questions using data collected in the past. More precisely, we seek to estimate performance metrics that can be expressed as expectations with respect to the distribution that would have been observed if the counterfactual conditions had been in force.⁵

Figure 10: Causal graph for an image recognition system. We can estimate counterfactuals by replaying data collected in the past.

Figure 11: Causal graph for a randomized experiment. We can estimate certain counterfactuals by reweighting data collected in the past.

4.1 Replaying Empirical Data

Figure 10 shows the causal graph associated with a simple image recognition system. The classifier takes an image $x$ and produces a prospective class label $\hat{y}$. The loss measures the penalty associated with recognizing class $\hat{y}$ while the true class is $y$.

To estimate the expected error of such a classifier, we collect a representative data set composed of labelled images, run the classifier on each image, and average the resulting losses. In other words, we replay the data set to estimate what (counterfactual) performance would have been observed if we had used a different classifier. We can then select in retrospect the classifier that would have worked the best and hope that it will keep working well. This is the counterfactual viewpoint on empirical risk minimization (Vapnik, 1982).

Replaying the data set works because both the alternate classifier and the loss function are known.
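The replay procedure can be sketched in a few lines of Python. Everything concrete here is hypothetical and chosen only for illustration: the six labelled examples, the one-dimensional feature standing in for an image, the threshold classifiers, and the names `replay` and `zero_one_loss`. The sketch runs each candidate classifier on the same logged examples, averages a known loss, and selects the candidate with the lowest replayed loss.

```python
def zero_one_loss(predicted, actual):
    """Known loss function: 1 for a misclassification, 0 otherwise."""
    return 0.0 if predicted == actual else 1.0

def replay(classifier, dataset, loss=zero_one_loss):
    """Average loss the classifier would have incurred on the logged examples."""
    return sum(loss(classifier(x), y) for x, y in dataset) / len(dataset)

# Hypothetical labelled data set of (feature, true label) pairs.
dataset = [(0.1, 0), (0.4, 0), (0.35, 1), (0.8, 1), (0.9, 1), (0.6, 0)]

# Hypothetical alternate classifiers; each one is fully known, so it can be
# rerun on the logged inputs without collecting new data.
candidates = {
    "threshold_0.3": lambda x: int(x > 0.3),
    "threshold_0.5": lambda x: int(x > 0.5),
    "threshold_0.7": lambda x: int(x > 0.7),
}

risks = {name: replay(clf, dataset) for name, clf in candidates.items()}
best = min(risks, key=risks.get)  # classifier that would have worked best
print(risks, best)
```

The selection step is empirical risk minimization over the three candidates; nothing in the sketch guarantees that the selected classifier keeps working well on new data.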
More generally, to estimate a counterfactual by replaying a data set, we need to know all the functional dependencies associated with all causal paths connecting the intervention point to the measurement point. This is obviously not always the case.

5. Although counterfactual expectations can be viewed as expectations of unit-level counterfactuals (Pearl, 2009, Definition 4), they elude the semantic subtleties of unit-level counterfactuals and can be measured with randomized experiments (see Section 4.2).

4.2 Reweighting Randomized Trials

Figure 11 illustrates the randomized experiment suggested in Section 2.3. The patients are randomly split into two equally sized groups receiving respectively treatments A and B. The overall success rate for this experiment is therefore $Y = (Y_A + Y_B)/2$ where $Y_A$ and $Y_B$ are the success rates observed for each group. We would like to estimate which (counterfactual) overall success rate $Y^*$ would have been observed if we had selected treatment A with probability $p$ and treatment B with probability $1-p$.

Since we do not know how the outcome depends on the treatment and the patient condition, we cannot compute which outcome $y^*$ would have been obtained if we had treated patient $x$ with a different treatment $u^*$. Therefore we cannot answer this question by replaying the data as we did in Section 4.1.

However, observing different success rates $Y_A$ and $Y_B$ for the treatment groups reveals an empirical correlation between the treatment $u$ and the outcome $y$. Since the only cause of the treatment $u$ is an independent roll of the dice, this correlation cannot result from any known or unknown confounding common cause.⁶ Having eliminated this possibility, we can reweight the observed outcomes and compute the estimate $Y^* \approx p\,Y_A + (1-p)\,Y_B$.

6. See also the discussion of Reichenbach's common cause principle and of its limitations in Spirtes et al. (1993) and Spirtes and Scheines (2004).

4.3 Markov Factor Replacement

The reweighting approach can in fact be applied under much less stringent conditions. Let us return to the ad placement problem to illustrate this point.

The average number of ad clicks per page is often called click yield. Increasing the click yield usually benefits both the advertiser and the publisher, whereas increasing the revenue per page often benefits the publisher at the expense of the advertiser. Click yield is therefore a very useful metric when we reason with an isolation assumption that ignores the advertiser reactions to pricing changes.

Let $\omega$ be a shorthand for all variables appearing in the Markov factorization of the ad placement structural equation model,

$$P(\omega) = P(u,v)\,P(x \mid u)\,P(a \mid x,v)\,P(b \mid x,v)\,P(q \mid x,a)\,P(s \mid a,q,b)\,P(c \mid a,q,b)\,P(y \mid s,u)\,P(z \mid y,c). \tag{2}$$

Variable $y$ was defined in Section 3.1 as the set of user clicks. In the rest of the document, we slightly abuse this notation by using the same letter $y$ to represent the number of clicks. We also write the expectation $Y = E_{\omega \sim P(\omega)}[y]$ using the integral notation

$$Y = \int_\omega y\,P(\omega).$$

We would like to estimate what the expected click yield $Y^*$ would have been if we had used a different scoring function (Figure 12). This intervention amounts to replacing the actual factor $P(q \mid x,a)$ by a counterfactual factor $P^*(q \mid x,a)$ in the Markov factorization.

$$P^*(\omega) = P(u,v)\,P(x \mid u)\,P(a \mid x,v)\,P(b \mid x,v)\,P^*(q \mid x,a)\,P(s \mid a,q,b)\,P(c \mid a,q,b)\,P(y \mid s,u)\,P(z \mid y,c). \tag{3}$$

Figure 12: Estimating which average number of clicks per page would have been observed if we had used a different scoring model.

Let us assume, for simplicity, that the actual factor $P(q \mid x,a)$ is nonzero everywhere. We can then estimate the counterfactual expected click yield $Y^*$ using the transformation

$$Y^* = \int_\omega y\,P^*(\omega) = \int_\omega y\,\frac{P^*(q \mid x,a)}{P(q \mid x,a)}\,P(\omega) \approx \frac{1}{n}\sum_{i=1}^{n} y_i\,\frac{P^*(q_i \mid x_i,a_i)}{P(q_i \mid x_i,a_i)}, \tag{4}$$

where the data set of tuples $(a_i, x_i, q_i, y_i)$ is distributed according to the actual Markov factorization instead of the counterfactual Markov factorization. This data could therefore have been collected during the normal operation of the ad placement system. Each sample is reweighted to reflect its probability of occurrence under the counterfactual conditions.

In general, we can use importance sampling to estimate the counterfactual expectation of any quantity $\ell(\omega)$:

$$Y^* = \int_\omega \ell(\omega)\,P^*(\omega) = \int_\omega \ell(\omega)\,\frac{P^*(\omega)}{P(\omega)}\,P(\omega) \approx \frac{1}{n}\sum_{i=1}^{n} \ell(\omega_i)\,w_i \tag{5}$$

with weights

$$w_i = w(\omega_i) = \frac{P^*(\omega_i)}{P(\omega_i)} = \frac{\text{factors appearing in } P^*(\omega_i) \text{ but not in } P(\omega_i)}{\text{factors appearing in } P(\omega_i) \text{ but not in } P^*(\omega_i)}. \tag{6}$$

Equation (6) emphasizes the simplifications resulting from the algebraic similarities of the actual and counterfactual Markov factorizations. Because of these simplifications, the evaluation of the weights only requires the knowledge of the few factors that differ between $P(\omega)$ and $P^*(\omega)$. Each data sample needs to provide the value of $\ell(\omega_i)$ and the values of all variables needed to evaluate the factors that do not cancel in the ratio (6).

In contrast, the replaying approach (Section 4.1) demands the knowledge of all factors of $P^*(\omega)$ connecting the point of intervention to the point of measurement $\ell(\omega)$. On the other hand, it does not require the knowledge of factors appearing only in $P(\omega)$.

Importance sampling relies on the assumption that all the factors appearing in the denominator of the reweighting ratio (6) are nonzero whenever the factors appearing in the numerator are nonzero.
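The reweighting estimate (4) can be tried on a small simulated log. The sketch below is a toy under stated assumptions, not the ad placement system: the binary context `x`, the binary score `q`, the tables `P_ACTUAL` and `P_COUNTER`, and the function `click_rate` are all invented for the example. Only the score factor differs between the actual and counterfactual factorizations, so each weight reduces to the ratio of the two score factors, as in Equation (6). The estimator never calls `click_rate`; that function is used only to generate the logs.

```python
import random

# Hypothetical score factors, stored as the probability that q = 1 given x.
P_ACTUAL = {0: 0.5, 1: 0.5}    # randomized scoring used when logging
P_COUNTER = {0: 0.2, 1: 0.9}   # scoring we would like to evaluate

def factor(table, q, x):
    """Probability of score q given context x under the given table."""
    return table[x] if q == 1 else 1.0 - table[x]

def click_rate(x, q):
    """Click probability; unknown to the estimator, used only to simulate logs."""
    return 0.05 + 0.10 * x + 0.05 * q + 0.30 * q * x

def collect_logs(n, table, seed):
    """Simulate n logged pages (x, q, y) under the scoring factor `table`."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        x = rng.randint(0, 1)
        q = 1 if rng.random() < table[x] else 0
        y = 1 if rng.random() < click_rate(x, q) else 0
        logs.append((x, q, y))
    return logs

def counterfactual_estimate(logs):
    """Importance sampling estimate of the counterfactual click yield."""
    total = 0.0
    for x, q, y in logs:
        weight = factor(P_COUNTER, q, x) / factor(P_ACTUAL, q, x)
        total += y * weight
    return total / len(logs)

logs = collect_logs(100_000, P_ACTUAL, seed=1)
observed_yield = sum(y for _, _, y in logs) / len(logs)
estimate = counterfactual_estimate(logs)
print(f"observed ~ {observed_yield:.3f}, counterfactual estimate ~ {estimate:.3f}")
```

For these made-up numbers the click yield is $0.2$ under the actual factor and $0.2625$ under the counterfactual factor, so the reweighted average should land near $0.2625$ even though no page was ever served with the counterfactual scoring. The randomized trial of Section 4.2 is the special case where the replaced factor is the treatment assignment probability.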
Since these factors represent conditional probabilities resulting from the effect of an independent noise variable in the structural equation model, this assumption means that the data must be collected with an experiment involving active randomization. We must therefore design cost-effective randomized experiments that yield enough information to estimate many interesting counterfactual expectations with sufficient accuracy. This problem cannot be solved without answering the confidence interval question: given data collected with a certain level of randomization, with which accuracy can we estimate a given counterfactual expectation?

4.4 Confidence Intervals

At first sight, we can invoke the law of large numbers and write

$$Y^* = \int_\omega \ell(\omega)\,w(\omega)\,P(\omega) \approx \frac{1}{n}\sum_{i=1}^{n} \ell(\omega_i)\,w_i. \tag{7}$$

For sufficiently large $n$, the central limit theorem provides confidence intervals whose width grows with the standard deviation of the product $\ell(\omega)\,w(\omega)$.

Unfortunately, when $P(\omega)$ is small, the reweighting ratio $w(\omega)$ takes large values with low probability. This heavy tailed distribution has annoying consequences because the variance of the integrand could be very high or infinite. When the variance is infinite, the central limit theorem does not hold. When the variance is merely very large, the central limit convergence might occur too slowly to justify such confidence intervals. Importance sampling works best when the actual distribution and the counterfactual distribution overlap.

When the counterfactual distribution has significant mass in domains where the actual distribution is small, the few samples available in these domains receive very high weights. Their noisy contribution dominates the reweighted estimate (7). We can obtain better confidence intervals by eliminating these few samples drawn in poorly explored domains. The resulting bias can be bounded using prior knowledge, for instance with an assumption about the range of values taken by $\ell(\omega)$,

$$\forall \omega \quad \ell(\omega) \in [0, M]. \tag{8}$$

Let us choose the maximum weight value $R$ deemed acceptable for the weights. We have obtained very consistent results in practice with $R$ equal to the fifth largest reweighting ratio observed on the empirical data.⁷ We can then rely on clipped weights to eliminate the contribution of the poorly explored domains,

$$\bar{w}(\omega) = \begin{cases} w(\omega) & \text{if } P^*(\omega) < R\,P(\omega) \\ 0 & \text{otherwise.} \end{cases}$$

The condition $P^*(\omega) < R\,P(\omega)$ ensures that the ratio has a nonzero denominator $P(\omega)$ and is smaller than $R$. Let $\Omega_R$ be the set of all values of $\omega$ associated with acceptable ratios:

$$\Omega_R = \{\omega : P^*(\omega) < R\,P(\omega)\}.$$

We can decompose $Y^*$ in two terms:

$$Y^* = \int_{\omega \in \Omega_R} \ell(\omega)\,P^*(\omega) + \int_{\omega \in \Omega \setminus \Omega_R} \ell(\omega)\,P^*(\omega) = \bar{Y}^* + (Y^* - \bar{Y}^*). \tag{9}$$

7. This is in fact a slight abuse because the theory calls for choosing $R$ before seeing the data.

The first term of this decomposition is the clipped expectation $\bar{Y}^*$. Estimating the clipped expectation $\bar{Y}^*$ is much easier than estimating $Y^*$ from (7) because the clipped weights $\bar{w}(\omega)$ are bounded by $R$.

$$\bar{Y}^* = \int_{\omega \in \Omega_R} \ell(\omega)\,P^*(\omega) = \int_\omega \ell(\omega)\,\bar{w}(\omega)\,P(\omega) \approx \hat{Y}^* = \frac{1}{n}\sum_{i=1}^{n} \ell(\omega_i)\,\bar{w}(\omega_i). \tag{10}$$

The second term of Equation (9) can be bounded by leveraging assumption (8). The resulting bound can then be conveniently estimated using only the clipped weights.

$$Y^* - \bar{Y}^* = \int_{\omega \in \Omega \setminus \Omega_R} \ell(\omega)\,P^*(\omega) \in \bigl[\,0,\; M\,P^*(\Omega \setminus \Omega_R)\,\bigr] = \bigl[\,0,\; M(1 - \bar{W}^*)\,\bigr]$$

with

$$\bar{W}^* = P^*(\Omega_R) = \int_{\omega \in \Omega_R} P^*(\omega) = \int_\omega \bar{w}(\omega)\,P(\omega) \approx \hat{W}^* = \frac{1}{n}\sum_{i=1}^{n} \bar{w}(\omega_i). \tag{11}$$

Since the clipped weights are bounded, the estimation errors associated with (10) and (11) are well characterized using either the central limit theorem or using empirical Bernstein bounds (see appendix B for details). Therefore we can derive an outer confidence interval of the form

$$P\bigl\{\, \hat{Y}^* - \epsilon_R \le \bar{Y}^* \le \hat{Y}^* + \epsilon_R \,\bigr\} \ge 1 - \delta \tag{12}$$

and an inner confidence interval of the form

$$P\bigl\{\, \bar{Y}^* \le Y^* \le \bar{Y}^* + M(1 - \hat{W}^* + \xi_R) \,\bigr\} \ge 1 - \delta. \tag{13}$$

The names inner and outer are in fact related to our preferred way to visualize these intervals (e.g., Figure 13). Since the bounds on $Y^* - \bar{Y}^*$ can be written as

$$\bar{Y}^* \le Y^* \le \bar{Y}^* + M(1 - \bar{W}^*), \tag{14}$$

we can derive our final confidence interval,

$$P\bigl\{\, \hat{Y}^* - \epsilon_R \le Y^* \le \hat{Y}^* + M(1 - \hat{W}^* + \xi_R) + \epsilon_R \,\bigr\} \ge 1 - 2\delta. \tag{15}$$

In conclusion, replacing the unbiased importance sampling estimator (7) by the clipped importance sampling estimator (10) with a suitable choice of $R$ leads to improved confidence intervals. Furthermore, since the derivation of these confidence intervals does not rely on the assumption that $P(\omega)$ is nonzero everywhere, the clipped importance sampling estimator remains valid when the distribution $P(\omega)$ has a limited support. This relaxes the main restriction associated with importance sampling.

4.5 Interpreting the Confidence Intervals

The estimation of the counterfactual expectation $Y^*$ can be inaccurate because the sample size is insufficient or because the sampling distribution $P(\omega)$ does not sufficiently explore the counterfactual conditions of interest.

By construction, the clipped expectation $\bar{Y}^*$ ignores the domains poorly explored by the sampling distribution $P(\omega)$.
The difference $Y^* - \bar{Y}^*$ then reflects the inaccuracy resulting from a lack of exploration. Therefore, assuming that the bound $R$ has been chosen competently, the relative sizes of the outer and inner confidence intervals provide precious cues to determine whether we can continue collecting data using the same experimental setup or should adjust the data collection experiment in order to obtain a better coverage.
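To make the clipped estimator of Section 4.4 concrete, here is a minimal Python sketch. It computes only the point quantities $\hat{Y}^*$ from (10), $\hat{W}^*$ from (11), and the plug-in version of the bound (14); it deliberately leaves out the sampling error terms $\epsilon_R$ and $\xi_R$, whose derivation is in appendix B and is not reproduced here. The function names, the eight sample values, and the clipping threshold of 2.0 are hypothetical and chosen so the arithmetic can be checked by hand.

```python
def choose_clip(weights, k=5):
    """Return the k-th largest weight; the text reports good results with k = 5."""
    return sorted(weights, reverse=True)[k - 1]

def clipped_estimate(values, weights, clip, value_max):
    """Clipped importance sampling estimate.

    values    : observed l(omega_i), assumed to lie in [0, value_max]
    weights   : unclipped ratios w_i = P*(omega_i) / P(omega_i)
    clip      : maximum acceptable weight R
    value_max : the constant M of assumption (8)
    Returns (y_hat, w_hat, (lower, upper)); the interval ignores the
    sampling error terms epsilon_R and xi_R.
    """
    n = len(values)
    clipped = [w if w < clip else 0.0 for w in weights]   # clipped weights
    y_hat = sum(l * w for l, w in zip(values, clipped)) / n
    w_hat = sum(clipped) / n
    upper = y_hat + value_max * (1.0 - w_hat)
    return y_hat, w_hat, (y_hat, upper)

# Hypothetical sample: binary outcomes and their reweighting ratios.
values = [1, 0, 1, 1, 0, 1, 0, 1]
weights = [0.5, 1.0, 0.8, 3.0, 0.2, 1.5, 0.6, 0.4]

y_hat, w_hat, (lower, upper) = clipped_estimate(values, weights, clip=2.0, value_max=1.0)
print(y_hat, w_hat, lower, upper)
```

In this sample a single weight (3.0) exceeds the threshold and is zeroed, so $\hat{W}^* = 5/8$ and the interval runs from $0.4$ to $0.775$. A wide gap between the two ends, driven by a small $\hat{W}^*$, is the signal described in Section 4.5 that the logged data explores the counterfactual conditions poorly. With a realistic sample size, `choose_clip(weights)` would give the data-driven threshold mentioned in Section 4.4; it is not used above because clipping five of eight samples would leave almost nothing.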

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

Key Ideas Section 8-1: Overview hypothesis testing Hypothesis Hypothesis Test Section 8-2: Basics of Hypothesis Testing Null Hypothesis

Key Ideas Section 8-1: Overview hypothesis testing Hypothesis Hypothesis Test Section 8-2: Basics of Hypothesis Testing Null Hypothesis Chapter 8 Key Ideas Hypothesis (Null ad Alterative), Hypothesis Test, Test Statistic, P-value Type I Error, Type II Error, Sigificace Level, Power Sectio 8-1: Overview Cofidece Itervals (Chapter 7) are

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

5: Introduction to Estimation

5: Introduction to Estimation 5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of

More information

3. Covariance and Correlation

3. Covariance and Correlation Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

Subject CT5 Contingencies Core Technical Syllabus

Subject CT5 Contingencies Core Technical Syllabus Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value

More information
