Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising


Journal of Machine Learning Research 14 (2013) Submitted 9/12; Revised 3/13; Published 11/13

Léon Bottou (Microsoft, 1 Microsoft Way, Redmond, WA 98052, USA)
Jonas Peters* (Max Planck Institute, Spemannstraße, Tübingen, Germany)
Joaquin Quiñonero-Candela†, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson (Microsoft, 1 Microsoft Way, Redmond, WA 98052, USA)

*. Current address: Jonas Peters, ETH Zürich, Rämistraße 101, 8092 Zürich, Switzerland.
†. Current address: Joaquin Quiñonero-Candela, Facebook, 1 Hacker Way, Menlo Park, CA 94025, USA.

© 2013 Léon Bottou, Jonas Peters, Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard and Ed Snelson.

Abstract

This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.

Keywords: causation, counterfactual reasoning, computational advertising

1. Introduction

Statistical machine learning technologies in the real world are never without a purpose. Using their predictions, humans or machines make decisions whose circuitous consequences often violate the modeling assumptions that justified the system design in the first place.

Such contradictions appear very clearly in the case of the learning systems that power web scale applications such as search engines, ad placement engines, or recommendation systems. For instance, the placement of advertisements on the result pages of Internet search engines depends on the bids of advertisers and on scores computed by statistical machine learning systems. Because the scores affect the contents of the result pages proposed to the users, they directly influence the occurrence of clicks and the corresponding advertiser payments. They also have important indirect effects. Ad placement decisions impact the satisfaction of the users and therefore their willingness to frequent this web site in the future. They also impact the return on investment observed by the advertisers and therefore their future bids. Finally they change the nature of the data collected for training the statistical models in the future.

These complicated interactions are clarified by important theoretical works. Under simplified assumptions, mechanism design (Myerson, 1981) leads to an insightful account of the advertiser feedback loop (Varian, 2007; Edelman et al., 2007). Under simplified assumptions, multi-armed bandits theory (Robbins, 1952; Auer et al., 2002; Langford and Zhang, 2008) and reinforcement learning (Sutton and Barto, 1998) describe the exploration/exploitation dilemma associated with the training feedback loop. However, none of these approaches gives a complete account of the complex interactions found in real-life systems.

This contribution proposes a novel approach: we view these complicated interactions as manifestations of the fundamental difference that separates correlation and causation. Using the ad placement example as a model of our problem class, we therefore argue that the language and the methods of causal inference provide flexible means to describe such complex machine learning systems and give sound answers to the practical questions facing the designer of such a system. Is it useful to pass a new input signal to the statistical model? Is it worthwhile to collect and label a new training set? What about changing the loss function or the learning algorithm? In order to answer such questions and improve the operational performance of the learning system, one needs to unravel how the information produced by the statistical models traverses the web of causes and effects and eventually produces measurable performance metrics.

Readers with an interest in causal inference will find in this paper (i) a real world example demonstrating the value of causal inference for large-scale machine learning applications, (ii) causal inference techniques applicable to continuously valued variables with meaningful confidence intervals, and (iii) quasistatic analysis techniques for estimating how small interventions affect certain causal equilibria.
Readers with an interest in real-life applications will find (iv) a selection of practical counterfactual analysis techniques applicable to many real-life machine learning systems. Readers with an interest in computational advertising will find a principled framework that (v) explains how to soundly use machine learning techniques for ad placement, and (vi) conceptually connects machine learning and auction theory in a compelling manner.

The paper is organized as follows. Section 2 gives an overview of the advertisement placement problem which serves as our main example. In particular, we stress some of the difficulties encountered when one approaches such a problem without a principled perspective. Section 3 provides a condensed review of the essential concepts of causal modeling and inference. Section 4 centers on formulating and answering counterfactual questions such as "how would the system have performed during the data collection period if certain interventions had been carried out on the system?" We describe importance sampling methods for counterfactual analysis, with clear conditions of validity and confidence intervals. Section 5 illustrates how the structure of the causal graph reveals opportunities to exploit prior information and vastly improve the confidence intervals. Section 6 describes how counterfactual analysis provides essential signals that can drive learning algorithms. Assume that we have identified interventions that would have caused the system to perform well during the data collection period. What guarantees can we obtain on the performance of these same interventions in the future? Section 7 presents counterfactual differential techniques for the study of equilibria. Using data collected when the system is at equilibrium, we can estimate how a small intervention displaces the equilibrium. This provides an elegant and effective way to reason about long-term feedback effects. Various appendices complete the main text with information that we think more relevant to readers with specific backgrounds.
2. Causation Issues in Computational Advertising

After giving an overview of the advertisement placement problem, which serves as our main example, this section illustrates some of the difficulties that arise when one does not pay sufficient attention to the causal structure of the learning system.

2.1 Advertisement Placement

All Internet users are now familiar with the advertisement messages that adorn popular web pages. Advertisements are particularly effective on search engine result pages because users who are searching for something are good targets for advertisers who have something to offer. Several actors take part in this Internet advertisement game:

- Advertisers create advertisement messages, and place bids that describe how much they are willing to pay to see their ads displayed or clicked.
- Publishers provide attractive web services, such as, for instance, an Internet search engine. They display selected ads and expect to receive payments from the advertisers. The infrastructure to collect the advertiser bids and select ads is sometimes provided by an advertising network on behalf of its affiliated publishers. For the purposes of this work, we simply consider a publisher large enough to run its own infrastructure.
- Users reveal information about their current interests, for instance, by entering a query in a search engine. They are offered web pages that contain a selection of ads (Figure 1). Users sometimes click on an advertisement and are transported to a web site controlled by the advertiser where they can initiate some business.

A conventional bidding language is necessary to precisely define under which conditions an advertiser is willing to pay the bid amount.
In the case of Internet search advertisement, each bid specifies (a) the advertisement message, (b) a set of keywords, (c) one of several possible matching criteria between the keywords and the user query, and (d) the maximal price the advertiser is willing to pay when a user clicks on the ad after entering a query that matches the keywords according to the specified criterion.

Whenever a user visits a publisher web page, an advertisement placement engine runs an auction in real time in order to select winning ads, determine where to display them in the page, and compute the prices charged to advertisers, should the user click on their ad. Since the placement engine is operated by the publisher, it is designed to further the interests of the publisher. Fortunately for everyone else, the publisher must balance short term interests, namely the immediate revenue brought by the ads displayed on each web page, and long term interests, namely the future revenues resulting from the continued satisfaction of both users and advertisers.

Auction theory explains how to design a mechanism that optimizes the revenue of the seller of a single object (Myerson, 1981; Milgrom, 2004) under various assumptions about the information available to the buyers regarding the intentions of the other buyers. In the case of the ad placement problem, the publisher runs multiple auctions and sells opportunities to receive a click. When nearly identical auctions occur thousands of times per second, it is tempting to consider that the advertisers have perfect information about each other. This assumption gives support to the popular generalized second price rank-score auction (Varian, 2007; Edelman et al., 2007):
Figure 1: Mainline and sidebar ads on a search result page. Ads placed in the mainline are more likely to be noticed, increasing both the chances of a click if the ad is relevant and the risk of annoying the user if the ad is not relevant.

- Let x represent the auction context information, such as the user query, the user profile, the date, the time, etc. The ad placement engine first determines all eligible ads $a_1 \dots a_n$ and the corresponding bids $b_1 \dots b_n$ on the basis of the auction context x and of the matching criteria specified by the advertisers.
- For each selected ad $a_i$ and each potential position p on the web page, a statistical model outputs the estimate $q_{i,p}(x)$ of the probability that ad $a_i$ displayed in position p receives a user click. The rank-score $r_{i,p}(x) = b_i \, q_{i,p}(x)$ then represents the purported value associated with placing ad $a_i$ at position p.
- Let L represent a possible ad layout, that is, a set of positions that can simultaneously be populated with ads, and let $\mathcal{L}$ be the set of possible ad layouts, including of course the empty layout. The optimal layout and the corresponding ads are obtained by maximizing the total rank-score

  $$\max_{L \in \mathcal{L}} \; \max_{i_1, i_2, \dots} \; \sum_{p \in L} r_{i_p,p}(x) \qquad (1)$$

  subject to reserve constraints

  $$\forall p \in L, \quad r_{i_p,p}(x) \ge R_p(x),$$

  and also subject to diverse policy constraints, such as, for instance, preventing the simultaneous display of multiple ads belonging to the same advertiser. Under mild assumptions, this discrete maximization problem is amenable to computationally efficient greedy algorithms (see Appendix A).
- The advertiser payment associated with a user click is computed using the generalized second price (GSP) rule: the advertiser pays the smallest bid that it could have entered without changing the solution of the discrete maximization problem, all other bids remaining equal. In other words, the advertiser could not have manipulated its bid and obtained the same treatment for a better price.
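The allocation and pricing steps above can be illustrated with a deliberately simplified sketch. Unlike the general problem (1), it assumes that positions are filled from top to bottom, that the click probability estimate of an ad does not depend on its position (a single `q` per ad), that all positions share one strictly positive reserve, and that there are no policy constraints. The `Ad` class, the `run_auction` function, and the example numbers are hypothetical and are not taken from the paper. Under these assumptions, the GSP price of a winning ad is the smallest bid that keeps its rank-score at or above both the next competitor and the reserve.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    bid: float  # b_i: maximal price the advertiser accepts to pay per click
    q: float    # q_i: estimated click probability (assumed position-independent)

    @property
    def rank_score(self) -> float:
        return self.bid * self.q


def run_auction(ads, num_positions, reserve):
    """Return (ad name, position, price per click) for each winning ad.

    Assumes reserve > 0, so every eligible ad has q > 0.
    """
    # Keep only the ads that clear the reserve, best rank-score first.
    eligible = sorted(
        (ad for ad in ads if ad.rank_score >= reserve),
        key=lambda ad: ad.rank_score,
        reverse=True,
    )
    winners = eligible[:num_positions]
    results = []
    for position, ad in enumerate(winners):
        # Rank-score this ad must match: the next eligible competitor, if any.
        if position + 1 < len(eligible):
            competitor = eligible[position + 1].rank_score
        else:
            competitor = 0.0
        threshold = max(competitor, reserve)
        # Smallest bid b such that b * q >= threshold.
        results.append((ad.name, position, threshold / ad.q))
    return results


# Hypothetical example: three ads competing for two positions.
ads = [Ad("A", bid=2.0, q=0.10), Ad("B", bid=1.0, q=0.15), Ad("C", bid=3.0, q=0.02)]
outcome = run_auction(ads, num_positions=2, reserve=0.05)
for name, position, price in outcome:
    print(f"ad {name} at position {position}: pays {price:.2f} per click")
```

Note that ad C enters the highest bid but loses because its low click probability estimate gives it the lowest rank-score, which is exactly why the scores computed by the statistical models shape the outcome of the auction.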
Under the perfect information assumption, the analysis suggests that the publisher simply needs to find which reserve prices $R_p(x)$ yield the best revenue per auction. However, the total revenue of the publisher also depends on the traffic experienced by its web site. Displaying an excessive number of irrelevant ads can train users to ignore the ads, and can also drive them to competing web sites. Advertisers can artificially raise the rank-scores of irrelevant ads by temporarily increasing the bids. Indelicate advertisers can create deceiving advertisements that elicit many clicks but direct users to spam web sites. Experience shows that the continued satisfaction of the users is more important to the publisher than it is to the advertisers.

Therefore the generalized second price rank-score auction has evolved. Rank-scores have been augmented with terms that quantify the user satisfaction or the ad relevance. Bids receive adaptive discounts in order to deal with situations where the perfect information assumption is unrealistic. These adjustments are driven by additional statistical models. The ad placement engine should therefore be viewed as a complex learning system interacting with both users and advertisers.

2.2 Controlled Experiments

The designer of such an ad placement engine faces the fundamental question of testing whether a proposed modification of the ad placement engine results in an improvement of the operational performance of the system.

The simplest way to answer such a question is to try the modification. The basic idea is to randomly split the users into treatment and control groups (Kohavi et al., 2008). Users from the control group see web pages generated using the unmodified system. Users of the treatment groups see web pages generated using alternate versions of the system. Monitoring various performance metrics for a couple of months usually gives sufficient information to reliably decide which variant of the system delivers the most satisfactory performance.
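The text does not say how the random split is implemented. One common approach, shown here purely as an assumption, is to hash a stable user identifier together with an experiment name, so that a given user always lands in the same group while different experiments are split independently. The function name `assign_group`, its parameters, and the identifiers used in the example are hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_fraction: float = 0.5) -> str:
    """Deterministically assign a user to 'treatment' or 'control'."""
    # Hash the experiment name together with the user id so that the split
    # differs from one experiment to the next.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if u < treatment_fraction else "control"


# Hypothetical usage: the assignment is stable across page views.
print(assign_group("user-42", "new-click-model"))
print(assign_group("user-42", "new-click-model"))
```

Because the assignment depends only on the identifier and the experiment name, it is independent of anything the user does, which is the property that makes the comparison between groups meaningful.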
Modifying an advertisement placement engine elicits reactions from both the users and the advertisers. Whereas it is easy to split users into treatment and control groups, splitting advertisers into treatment and control groups demands special attention because each auction involves multiple advertisers (Charles et al., 2012). Simultaneously controlling for both users and advertisers is probably impossible.

Controlled experiments also suffer from several drawbacks. They are expensive because they demand a complete implementation of the proposed modifications. They are slow because each experiment typically demands a couple of months. Finally, although there are elegant ways to efficiently run overlapping controlled experiments on the same traffic (Tang et al., 2010), they are limited by the volume of traffic available for experimentation.

It is therefore difficult to rely on controlled experiments during the conception phase of potential improvements to the ad placement engine. It is similarly difficult to use controlled experiments to drive the training algorithms associated with click probability estimation models. Cheaper and faster statistical methods are needed to drive these essential aspects of the development of an ad placement engine. Unfortunately, interpreting cheap and fast data can be very deceiving.

2.3 Confounding Data

Assessing the consequence of an intervention using statistical data is generally challenging because it is often difficult to determine whether the observed effect is a simple consequence of the intervention or has other uncontrolled causes.
                                            Overall          Patients with small stones   Patients with large stones
Treatment A: Open surgery                   78% (273/350)    93% (81/87)                  73% (192/263)
Treatment B: Percutaneous nephrolithotomy   83% (289/350)    87% (234/270)                69% (55/80)

Table 1: A classic example of Simpson's paradox. The table reports the success rates of two treatments for kidney stones (Charig et al., 1986, Tables I and II). Although the overall success rate of treatment B seems better, treatment B performs worse than treatment A on both patients with small kidney stones and patients with large kidney stones. See Section 2.3.

For instance, the empirical comparison of certain kidney stone treatments illustrates this difficulty (Charig et al., 1986). Table 1 reports the success rates observed on two groups of 350 patients treated with respectively open surgery (treatment A, with 78% success) and percutaneous nephrolithotomy (treatment B, with 83% success). Although treatment B seems more successful, it was more frequently prescribed to patients suffering from small kidney stones, a less serious condition. Did treatment B achieve a high success rate because of its intrinsic qualities or because it was preferentially applied to less severe cases? Further splitting the data according to the size of the kidney stones reverses the conclusion: treatment A now achieves the best success rate for both patients suffering from large kidney stones and patients suffering from small kidney stones. Such an inversion of the conclusion is called Simpson's paradox (Simpson, 1951).

The stone size in this study is an example of a confounding variable, that is, an uncontrolled variable whose consequences pollute the effect of the intervention. Doctors knew the size of the kidney stones, chose to treat the healthier patients with the least invasive treatment B, and therefore caused treatment B to appear more effective than it actually was.
If we now decide to apply treatment B to all patients irrespective of the stone size, we break the causal path connecting the stone size to the outcome, we eliminate the illusion, and we will experience disappointing results.

When we suspect the existence of a confounding variable, we can split the contingency tables and reach improved conclusions. Unfortunately we cannot fully trust these conclusions unless we are certain to have taken into account all confounding variables. The real problem therefore comes from the confounding variables we do not know.

Randomized experiments arguably provide the only correct solution to this problem (see Stigler, 1992). The idea is to randomly choose whether the patient receives treatment A or treatment B. Because this random choice is independent from all the potential confounding variables, known and unknown, they cannot pollute the observed effect of the treatments (see also Section 4.2). This is why controlled experiments in ad placement (Section 2.2) randomly distribute users between treatment and control groups, and this is also why, in the case of an ad placement engine, we should be somewhat concerned by the practical impossibility of randomly distributing both users and advertisers.
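The reversal in the kidney stone data of Table 1 can be checked directly from the counts. The sketch below recomputes the overall and per-stratum success rates, then reweights each treatment's per-stratum rates by the stone-size distribution of the whole population. This last step is an illustration added here, not a computation from the paper, and it is only meaningful under the assumption that stone size is the sole confounding variable. The names `table`, `rate`, `overall_rate`, and `adjusted_rate` are hypothetical.

```python
# (successes, patients) per treatment and stone size, from Table 1.
table = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    return successes / patients


def overall_rate(treatment):
    """Success rate pooled over stone sizes."""
    successes = sum(s for s, _ in table[treatment].values())
    patients = sum(n for _, n in table[treatment].values())
    return rate(successes, patients)


def adjusted_rate(treatment):
    """Per-stratum rates reweighted by the stone-size distribution of all
    patients (valid only if stone size is the sole confounder)."""
    total = sum(n for strata in table.values() for _, n in strata.values())
    result = 0.0
    for size in ("small", "large"):
        stratum_size = sum(table[t][size][1] for t in table)
        result += rate(*table[treatment][size]) * stratum_size / total
    return result


for t in ("A", "B"):
    print(
        f"treatment {t}: overall {overall_rate(t):.3f}, "
        f"small {rate(*table[t]['small']):.3f}, "
        f"large {rate(*table[t]['large']):.3f}, "
        f"adjusted {adjusted_rate(t):.3f}"
    )
```

Treatment B wins on the pooled counts, loses within each stratum, and also loses once both treatments are evaluated on the same mix of small and large stones.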
              Overall            $q_2$ low          $q_2$ high
$q_1$ low     6.2% (124/2000)    5.1% (92/1823)     18.1% (32/176)
$q_1$ high    7.5% (149/2000)    4.8% (71/1500)     15.6% (78/500)

Table 2: Confounding data in ad placement. The table reports the click-through rates and the click counts of the second mainline ad. The overall counts suggest that the click-through rate of the second mainline ad increases when the click probability estimate $q_1$ of the top ad is high. However, if we further split the pages according to the click probability estimate $q_2$ of the second mainline ad, we reach the opposite conclusion. See Section 2.4.

2.4 Confounding Data in Ad Placement

Let us return to the question of assessing the value of passing a new input signal to the ad placement engine click prediction model. Section 2.1 outlines a placement method where the click probability estimates $q_{i,p}(x)$ depend on the ad and the position we consider, but do not depend on other ads displayed on the page. We now consider replacing this model by a new model that additionally uses the estimated click probability of the top mainline ad to estimate the click probability of the second mainline ad (Figure 1). We would like to estimate the effect of such an intervention using existing statistical data.

We have collected ad placement data for Bing search result pages served during three consecutive hours on a certain slice of traffic. Let $q_1$ and $q_2$ denote the click probability estimates computed by the existing model for respectively the top mainline ad and the second mainline ad. After excluding pages displaying fewer than two mainline ads, we form two groups of 2000 pages randomly picked among those satisfying the conditions $q_1 < 0.15$ for the first group and $q_1 \ge 0.15$ for the second group. Table 2 reports the click counts and frequencies observed on the second mainline ad in each group.
Although the overall numbers show that users click more often on the second mainline ad when the top mainline ad has a high click probability estimate $q_1$, this conclusion is reversed when we further split the data according to the click probability estimate $q_2$ of the second mainline ad.

Despite superficial similarities, this example is considerably more difficult to interpret than the kidney stone example. The overall click counts show that the actual click-through rate of the second mainline ad is positively correlated with the click probability estimate on the top mainline ad. Does this mean that we can increase the total number of clicks by placing regular ads below frequently clicked ads?

Remember that the click probability estimates depend on the search query which itself depends on the user intention. The most likely explanation is that pages with a high $q_1$ are frequently associated with more commercial searches and therefore receive more ad clicks on all positions. The observed correlation occurs because the presence of a click and the magnitude of the click probability estimate $q_1$ have a common cause: the user intention. Meanwhile, the click probability estimate $q_2$ returned by the current model for the second mainline ad also depends on the query and therefore the user intention. Therefore, assuming that this dependence has comparable strength, and assuming that there are no other causal paths, splitting the counts according to the magnitude of $q_2$ factors out the effects of this common confounding cause. We then observe a negative correlation which now suggests that a frequently clicked top mainline ad has a negative impact on the click-through rate of the second mainline ad.

If this is correct, we would probably increase the accuracy of the click prediction model by switching to the new model. This would decrease the click probability estimates for ads placed in the second mainline position on commercial search pages. These ads are then less likely to clear the reserve and therefore more likely to be displayed in the less attractive sidebar. The net result is probably a loss of clicks and a loss of money despite the higher quality of the click probability model. Although we could tune the reserve prices to compensate for this unfortunate effect, nothing in this data tells us where the performance of the ad placement engine will land. Furthermore, unknown confounding variables might completely reverse our conclusions.

Making sense out of such data is just too complex!

2.5 A Better Way

It should now be obvious that we need a more principled way to reason about the effect of potential interventions. We provide one such more principled approach using the causal inference machinery (Section 3). The next step is then the identification of a class of questions that are sufficiently expressive to guide the designer of a complex learning system, and sufficiently simple to be answered using data collected in the past using adequate procedures (Section 4).

A machine learning algorithm can then be viewed as an automated way to generate questions about the parameters of a statistical model, obtain the corresponding answers, and update the parameters accordingly (Section 6). Learning algorithms derived in this manner are very flexible: human designers and machine learning algorithms can cooperate seamlessly because they rely on similar sources of information.

3. Modeling Causal Systems

When we point out a causal relationship between two events, we describe what we expect to happen to the event we call the effect, should an external operator manipulate the event we call the cause.
Manipulability theories of causation (von Wright, 1971; Woodward, 2005) raise this commonsense insight to the status of a definition of the causal relation. Difficult adjustments are then needed to interpret statements involving causes that we can only observe through their effects, "because they love me," or that are not easily manipulated, "because the earth is round."

Modern statistical thinking makes a clear distinction between the statistical model and the world. The actual mechanisms underlying the data are considered unknown. The statistical models do not need to reproduce these mechanisms to emulate the observable data (Breiman, 2001). Better models are sometimes obtained by deliberately not reproducing the true mechanisms (Vapnik, 1982, Section 8.6). We can approach the manipulability puzzle in the same spirit by viewing causation as a reasoning model (Bottou, 2011) rather than a property of the world. Causes and effects are simply the pieces of an abstract reasoning game. Causal statements that are not empirically testable acquire validity when they are used as intermediate steps when one reasons about manipulations or interventions amenable to experimental validation.

This section presents the rules of this reasoning game. We largely follow the framework proposed by Pearl (2009) because it gives a clear account of the connections between causal models and probabilistic models.
$x = f_1(u, \varepsilon_1)$          Query context x from user intent u.
$a = f_2(x, v, \varepsilon_2)$       Eligible ads $(a_i)$ from query x and inventory v.
$b = f_3(x, v, \varepsilon_3)$       Corresponding bids $(b_i)$.
$q = f_4(x, a, \varepsilon_4)$       Scores $(q_{i,p}, R_p)$ from query x and ads a.
$s = f_5(a, q, b, \varepsilon_5)$    Ad slate s from eligible ads a, scores q and bids b.
$c = f_6(a, q, b, \varepsilon_6)$    Corresponding click prices c.
$y = f_7(s, u, \varepsilon_7)$       User clicks y from ad slate s and user intent u.
$z = f_8(y, c, \varepsilon_8)$       Revenue z from clicks y and prices c.

Figure 2: A structural equation model for ad placement. The sequence of equations describes the flow of information. The functions $f_k$ describe how effects depend on their direct causes. The additional noise variables $\varepsilon_k$ represent independent sources of randomness useful to model probabilistic dependencies.

3.1 The Flow of Information

Figure 2 gives a deterministic description of the operation of the ad placement engine. Variable u represents the user and his or her intention in an unspecified manner. The query and query context x is then expressed as an unknown function of u and of a noise variable $\varepsilon_1$. Noise variables in this framework are best viewed as independent sources of randomness useful for modeling a nondeterministic causal dependency. We shall only mention them when they play a specific role in the discussion. The set of eligible ads a and the corresponding bids b are then derived from the query x and the ad inventory v supplied by the advertisers. Statistical models then compute a collection of scores q such as the click probability estimates $q_{i,p}$ and the reserves $R_p$ introduced in Section 2.1. The placement logic uses these scores to generate the ad slate s, that is, the set of winning ads and their assigned positions. The corresponding click prices c are computed. The set of user clicks y is expressed as an unknown function of the ad slate s and the user intent u. Finally the revenue z is expressed as another function of the clicks y and the prices c.

Such a system of equations is called a structural equation model (Wright, 1921).
Each equation asserts a functional dependency between an effect, appearing on the left hand side of the equation, and its direct causes, appearing on the right hand side as arguments of the function. Some of these causal dependencies are unknown. Although we postulate that the effect can be expressed as some function of its direct causes, we do not know the form of this function. For instance, the designer of the ad placement engine knows functions $f_2$ to $f_6$ and $f_8$ because he has designed them. However, he does not know the functions $f_1$ and $f_7$ because whoever designed the user did not leave sufficient documentation.

Figure 3 represents the directed causal graph associated with the structural equation model. Each arrow connects a direct cause to its effect. The noise variables are omitted for simplicity. The structure of this graph reveals fundamental assumptions about our model. For instance, the user clicks y do not directly depend on the scores q or the prices c because users do not have access to this information.

We hold as a principle that causation obeys the arrow of time: causes always precede their effects. Therefore the causal graph must be acyclic. Structural equation models then support two fundamental operations, namely simulation and intervention.
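Both operations can be illustrated on a toy model: simulation applies the equations in causal order once the exogenous values are known, and an intervention replaces the right-hand side of one equation, for instance by a constant. The sketch below reuses some variable names of Figure 2 (u, b, x, q, s, y, z) but treats the bid b as exogenous, drops the ad inventory and the prices, and uses functional forms that are entirely invented for illustration. In the real system the functions producing x and y are unknown, so this is not a model of the actual engine. The names `EQUATIONS`, `simulate`, and `draw_noise` are hypothetical.

```python
import random


# Hypothetical equations. Each function receives the values computed so far
# and the noise value attached to its own equation.
def f_x(v, eps):  # query context from user intent
    return v["u"] + eps


def f_q(v, eps):  # click probability estimate from the query
    return min(1.0, max(0.0, 0.5 * v["x"]))


def f_s(v, eps):  # display the ad (1) if the rank-score b*q clears a reserve of 0.2
    return 1 if v["b"] * v["q"] >= 0.2 else 0


def f_y(v, eps):  # click requires a displayed ad; eps is uniform in [0, 1)
    return 1 if v["s"] == 1 and eps < v["u"] else 0


def f_z(v, eps):  # revenue: the advertiser pays its bid on a click
    return v["y"] * v["b"]


# The list order is a valid causal (topological) order of the acyclic graph.
EQUATIONS = [("x", f_x), ("q", f_q), ("s", f_s), ("y", f_y), ("z", f_z)]


def simulate(exogenous, noise, interventions=None):
    """Apply the equations in causal order, clamping intervened variables."""
    interventions = interventions or {}
    values = dict(exogenous)
    for name, f in EQUATIONS:
        if name in interventions:
            # Intervention: the equation is replaced by a constant.
            values[name] = interventions[name]
        else:
            values[name] = f(values, noise[name])
    return values


def draw_noise(rng):
    """Draw one independent noise value per equation."""
    noise = {name: 0.0 for name, _ in EQUATIONS}
    noise["x"] = rng.gauss(0.0, 0.05)
    noise["y"] = rng.random()
    return noise


rng = random.Random(0)
exogenous = {"u": 0.8, "b": 1.0}
noise = draw_noise(rng)
print(simulate(exogenous, noise))                          # simulation
print(simulate(exogenous, noise, interventions={"s": 0}))  # intervention on s
```

Clamping s leaves the upstream variables x and q untouched and only changes the variables downstream of s, which is the behavior the graph structure predicts.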
Figure 3: Causal graph associated with the structural equation model of Figure 2. The mutually independent noise variables $\varepsilon_1$ to $\varepsilon_8$ are implicit. The variables a, b, q, s, c, and z depend on their direct causes in known ways. In contrast, the variables u and v are exogenous and the variables x and y depend on their direct causes through unknown functions.

Simulation. Let us assume that we know both the exact form of all functional dependencies and the value of all exogenous variables, that is, the variables that never appear in the left hand side of an equation. We can compute the values of all the remaining variables by applying the equations in their natural time sequence.

Intervention. As long as the causal graph remains acyclic, we can construct derived structural equation models using arbitrary algebraic manipulations of the system of equations. For instance, we can clamp a variable to a constant value by rewriting the right-hand side of the corresponding equation as the specified constant value.

The algebraic manipulation of the structural equation models provides a powerful language to describe interventions on a causal system. This is not a coincidence. Many aspects of the mathematical notation were invented to support causal inference in classical mechanics. However, we no longer have to interpret the variable values as physical quantities: the equations simply describe the flow of information in the causal model (Wiener, 1948).

3.2 The Isolation Assumption

Let us now turn our attention to the exogenous variables, that is, variables that never appear in the left hand side of an equation of the structural model. Leibniz's principle of sufficient reason claims that there are no facts without causes. This suggests that the exogenous variables are the effects of a network of causes not expressed by the structural equation model. For instance, the user intent u and the ad inventory v in Figure 3 have temporal correlations because both users and advertisers worry about their budgets when the end of the month approaches.
Any structural equation model should then be understood in the context of a larger structural equation model potentially describing all things in existence.

Ads served on a particular page contribute to the continued satisfaction of both users and advertisers, and therefore have an effect on their willingness to use the services of the publisher in the future. The ad placement structural equation model shown in Figure 2 only describes the causal dependencies for a single page and therefore cannot account for such effects. Consider however a very large structural equation model containing a copy of the page-level model for every web page ever served by the publisher. Figure 4 shows how we can thread the page-level models corresponding to pages served to the same user. Similarly we could model how advertisers track the performance and the cost of their advertisements and model how their satisfaction affects their future bids. The resulting causal graphs can be very complex. Part of this complexity results from time-scale differences. Thousands of search pages are served in a second. Each page contributes a little to the continued satisfaction of one user and a few advertisers. The accumulation of these contributions produces measurable effects after a few weeks.

Figure 4: Conceptually unrolling the user feedback loop by threading instances of the single page causal graph (Figure 3). Both the ad slate $s_t$ and user clicks $y_t$ have an indirect effect on the user intent $u_{t+1}$ associated with the next query.

Many of the functional dependencies expressed by the structural equation model are left unspecified. Without direct knowledge of these functions, we must reason using statistical data. The most fundamental statistical data is collected from repeated trials that are assumed independent. When we consider the large structural equation model of everything, we can only have one large trial producing a single data point.¹ It is therefore desirable to identify repeated patterns of identical equations that can be viewed as repeated independent trials. Therefore, when we study a structural equation model representing such a pattern, we need to make an additional assumption to express the idea that the outcome of one trial does not affect the other trials. We call such an assumption an isolation assumption by analogy with thermodynamics.² This can be achieved by assuming that the exogenous variables are independently drawn from an unknown but fixed joint probability distribution. This assumption cuts the causation effects that could flow through the exogenous variables.
The noise variables are also exogenous variables acting as independent sources of randomness. The noise variables are useful to represent the conditional distribution $P(\text{effect} \mid \text{causes})$ using the equation $\text{effect} = f(\text{causes}, \varepsilon)$. Therefore, we also assume joint independence between all the noise variables and any of the named exogenous variables.³ For instance, in the case of the ad placement model shown in Figure 2, we assume that the joint distribution of the exogenous variables factorizes as

$$P(u,v,\varepsilon_1,\dots,\varepsilon_8) = P(u,v)\,P(\varepsilon_1)\cdots P(\varepsilon_8).$$

1. See also the discussion on reinforcement learning, Section 3.5.
2. The concept of isolation is pervasive in physics. An isolated system in thermodynamics (Reichl, 1998, Section 2.D) or a closed system in mechanics (Landau and Lifshitz, 1969, §5) evolves without exchanging mass or energy with its surroundings. Experimental trials involving systems that are assumed isolated may differ in their initial setup and therefore have different outcomes. Assuming isolation implies that the outcome of each trial cannot affect the other trials.
3. Rather than letting two noise variables display measurable statistical dependencies because they share a common cause, we prefer to name the common cause and make the dependency explicit in the graph.

Since an isolation assumption is only true up to a point, it should be expressed clearly and remain under constant scrutiny. We must therefore measure additional performance metrics that reveal how the isolation assumption holds. For instance, the ad placement structural equation model and the corresponding causal graph (Figures 2 and 3) do not take user feedback or advertiser feedback into account. Measuring the revenue is not enough because we could easily generate revenue at the expense of the satisfaction of the users and advertisers. When we evaluate interventions under such an isolation assumption, we also need to measure a battery of additional quantities that act as proxies for the user and advertiser satisfaction. Noteworthy examples include ad relevance estimated by human judges, and advertiser surplus estimated from the auctions (Varian, 2009).

$$
\begin{aligned}
P(u,v,x,a,b,q,s,c,y,z) = {} & P(u,v) && \text{Exogenous vars.} \\
\times\ & P(x \mid u) && \text{Query.} \\
\times\ & P(a \mid x,v) && \text{Eligible ads.} \\
\times\ & P(b \mid x,v) && \text{Bids.} \\
\times\ & P(q \mid x,a) && \text{Scores.} \\
\times\ & P(s \mid a,q,b) && \text{Ad slate.} \\
\times\ & P(c \mid a,q,b) && \text{Prices.} \\
\times\ & P(y \mid s,u) && \text{Clicks.} \\
\times\ & P(z \mid y,c) && \text{Revenue.}
\end{aligned}
$$

Figure 5: Markov factorization of the structural equation model of Figure 2.

Figure 6: Bayesian network associated with the Markov factorization shown in Figure 5.

3.3 Markov Factorization

Conceptually, we can draw a sample of the exogenous variables using the distribution specified by the isolation assumption, and we can then generate values for all the remaining variables by simulating the structural equation model.
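The simulation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names loosely echo the ad placement model, but every functional form, constant, and noise distribution below is hypothetical and chosen for readability, not taken from the paper.

```python
import random

def sample_page(rng):
    """Draw one page-level sample from a toy structural equation model.

    Hypothetical stand-ins for the paper's variables: u (user intent),
    x (query), q (score), s (ad shown or not), y (click).
    """
    # Named exogenous variable, drawn independently for every page
    # (this is where the isolation assumption enters).
    u = rng.random()
    # Each equation has the form effect = f(causes, noise), with its own
    # independent noise draw.
    x = 1 if u + rng.gauss(0.0, 0.1) > 0.5 else 0       # query depends on intent
    q = 0.3 + 0.4 * x + rng.gauss(0.0, 0.05)            # score depends on query
    s = 1 if q > 0.5 else 0                             # slate depends on score
    y = 1 if s == 1 and rng.random() < 0.2 * u else 0   # click depends on slate, intent
    return {"u": u, "x": x, "q": q, "s": s, "y": y}

def simulate(n, seed=0):
    """Generate n independent page-level samples."""
    rng = random.Random(seed)
    return [sample_page(rng) for _ in range(n)]

samples = simulate(10_000)
click_yield = sum(r["y"] for r in samples) / len(samples)
```

Because the equations are evaluated in causal order, each variable is drawn from its conditional distribution given its direct causes, which is exactly the structure of a Markov factorization.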
This simulation process defines a generative probabilistic model representing the joint distribution of all variables in the structural equation model. The distribution readily factorizes as the product of the joint probability of the named exogenous variables, and, for each equation in the structural equation model, the conditional probability of the effect given its direct causes (Spirtes et al., 1993; Pearl, 2000). As illustrated by Figures 5 and 6, this Markov factorization connects the structural equation model that describes causation, and the Bayesian network that describes the joint probability distribution followed by the variables under the isolation assumption.⁴

Structural equation models and Bayesian networks appear so intimately connected that it could be easy to forget the differences. The structural equation model is an algebraic object. As long as the causal graph remains acyclic, algebraic manipulations are interpreted as interventions on the causal system. The Bayesian network is a generative statistical model representing a class of joint probability distributions, and, as such, does not support algebraic manipulations. However, the symbolic representation of its Markov factorization is an algebraic object, essentially equivalent to the structural equation model.

3.4 Identification, Transportation, and Transfer Learning

Consider a causal system represented by a structural equation model with some unknown functional dependencies. Subject to the isolation assumption, data collected during the operation of this system follows the distribution described by the corresponding Markov factorization. Let us first assume that this data is sufficient to identify the joint distribution of the subset of variables we can observe. We can intervene on the system by clamping the value of some variables. This amounts to replacing the right-hand side of the corresponding structural equations by constants.
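Clamping can be illustrated with a minimal sketch. The three-equation model below is hypothetical (it is not the ad placement model), and the `clamp` argument is an illustrative device: any variable named in it has the right-hand side of its equation replaced by the given constant.

```python
import random

def run_model(rng, clamp=None):
    """Simulate a toy three-equation model; `clamp` maps names to constants."""
    clamp = clamp or {}
    values = {}
    # Structural equations listed in causal order (all hypothetical).
    equations = [
        ("x", lambda v: rng.gauss(0.0, 1.0)),
        ("q", lambda v: 2.0 * v["x"] + rng.gauss(0.0, 0.1)),
        ("y", lambda v: v["q"] + v["x"] + rng.gauss(0.0, 0.1)),
    ]
    for name, f in equations:
        # An intervention replaces the right-hand side by a constant;
        # the equations of the other variables are left untouched.
        values[name] = clamp[name] if name in clamp else f(values)
    return values

rng = random.Random(0)
observed = [run_model(rng) for _ in range(20_000)]
clamped = [run_model(rng, clamp={"q": 1.0}) for _ in range(20_000)]

mean_y_observed = sum(r["y"] for r in observed) / len(observed)
mean_y_clamped = sum(r["y"] for r in clamped) / len(clamped)
```

In this toy model the clamped run severs the dependency of q on x while leaving the direct effect of x on y in place, so only the factor attached to q changes.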
The joint distribution of the variables under such an intervention is then described by a new Markov factorization that shares many factors with the original Markov factorization. Which conditional probabilities associated with this new distribution can we express using only conditional probabilities identified during the observation of the original system? This is called the identifiability problem. More generally, we can consider arbitrarily complex manipulations of the structural equation model, and we can perform multiple experiments involving different manipulations of the causal system. Which conditional probabilities pertaining to one experiment can be expressed using only conditional probabilities identified during the observation of other experiments? This is called the transportability problem.

Pearl's do-calculus completely solves the identifiability problem and provides useful tools to address many instances of the transportability problem (see Pearl, 2012). Assuming that we know the conditional probability distributions involving observed variables in the original structural equation model, do-calculus allows us to derive conditional distributions pertaining to the manipulated structural equation model.

Unfortunately, we must further distinguish the conditional probabilities that we know (because we designed them) from those that we estimate from empirical data. This distinction is important because estimating the distribution of continuous or high cardinality variables is notoriously difficult. Furthermore, do-calculus often combines the estimated probabilities in ways that amplify estimation errors. This happens when the manipulated structural equation model exercises the variables in ways that were rarely observed in the data collected from the original structural equation model.

4. Bayesian networks are directed graphs representing the Markov factorization of a joint probability distribution: the arrows no longer have a causal interpretation.
Therefore we prefer to use much simpler causal inference techniques (see Sections 4.1 and 4.2). Although these techniques do not have the completeness properties of do-calculus, they combine estimation and transportation in a manner that facilitates the derivation of useful confidence intervals.

3.5 Special Cases

Three special cases of causal models are particularly relevant to this work.

In the multi-armed bandit (Robbins, 1952), a user-defined policy function $\pi$ determines the distribution of action $a \in \{1 \dots K\}$, and an unknown reward function $r$ determines the distribution of the outcome $y$ given the action $a$ (Figure 7). In order to maximize the accumulated rewards, the player must construct policies $\pi$ that balance the exploration of the action space with the exploitation of the best action identified so far (Auer et al., 2002; Audibert et al., 2007; Seldin et al., 2012).

The contextual bandit problem (Langford and Zhang, 2008) significantly increases the complexity of multi-armed bandits by adding one exogenous variable $x$ to the policy function $\pi$ and the reward functions $r$ (Figure 8).

Both multi-armed bandit and contextual bandit are special cases of reinforcement learning (Sutton and Barto, 1998). In essence, a Markov decision process is a sequence of contextual bandits where the context is no longer an exogenous variable but a state variable that depends on the previous states and actions (Figure 9). Note that the policy function $\pi$, the reward function $r$, and the transition function $s$ are independent of time. All the time dependencies are expressed using the states $s_t$.

These special cases have increasing generality. Many simple structural equation models can be reduced to a contextual bandit problem using appropriate definitions of the context $x$, the action $a$ and the outcome $y$. For instance, assuming that the prices $c$ are discrete, the ad placement structural equation model shown in Figure 2 reduces to a contextual bandit problem with context $(u, v)$, actions $(s, c)$ and reward $z$.
Similarly, given a sufficiently intricate definition of the state variables $s_t$, all structural equation models with discrete variables can be reduced to a reinforcement learning problem. Such reductions lose the fine structure of the causal graph. We show in Section 5 how this fine structure can in fact be leveraged to obtain more information from the same experiments.

Modern reinforcement learning algorithms (see Sutton and Barto, 1998) leverage the assumption that the policy function, the reward function, the transition function, and the distributions of the corresponding noise variables, are independent from time. This invariance property provides great benefits when the observed sequences of actions and rewards are long in comparison with the size of the state space. Only Section 7 in this contribution presents methods that take advantage of such an invariance. The general question of leveraging arbitrary functional invariances in causal graphs is left for future work.

$$
\begin{aligned}
a &= \pi(\varepsilon) && \text{Action } a \in \{1 \dots K\} \\
y &= r(a, \varepsilon') && \text{Reward } y \in \mathbb{R}
\end{aligned}
$$

Figure 7: Structural equation model for the multi-armed bandit problem. The policy $\pi$ selects a discrete action $a$, and the reward function $r$ determines the outcome $y$. The noise variables $\varepsilon$ and $\varepsilon'$ represent independent sources of randomness useful to model probabilistic dependencies.

$$
\begin{aligned}
a &= \pi(x, \varepsilon) && \text{Action } a \in \{1 \dots K\} \\
y &= r(x, a, \varepsilon') && \text{Reward } y \in \mathbb{R}
\end{aligned}
$$

Figure 8: Structural equation model for the contextual bandit problem. Both the action and the reward depend on an exogenous context variable $x$.

$$
\begin{aligned}
a_t &= \pi(s_{t-1}, \varepsilon_t) && \text{Action} \\
y_t &= r(s_{t-1}, a_t, \varepsilon'_t) && \text{Reward } y_t \in \mathbb{R} \\
s_t &= s(s_{t-1}, a_t, \varepsilon''_t) && \text{Next state}
\end{aligned}
$$

Figure 9: Structural equation model for reinforcement learning. The above equations are replicated for all $t \in \{0, \dots, T\}$. The context is now provided by a state variable $s_{t-1}$ that depends on the previous states and actions.

4. Counterfactual Analysis

We now return to the problem of formulating and answering questions about the value of proposed changes of a learning system. Assume for instance that we consider replacing the score computation model $M$ of an ad placement engine by an alternate model $M^*$. We seek an answer to the conditional question:

How will the system perform if we replace model $M$ by model $M^*$?

Given sufficient time and sufficient resources, we can obtain the answer using a controlled experiment (Section 2.2). However, instead of carrying out a new experiment, we would like to obtain an answer using data that we have already collected in the past.

How would the system have performed if, when the data was collected, we had replaced model $M$ by model $M^*$?

The answer to this counterfactual question is of course a counterfactual statement that describes the system performance subject to a condition that did not happen.

Counterfactual statements challenge ordinary logic because they depend on a condition that is known to be false. Although assertion $A \Rightarrow B$ is always true when assertion $A$ is false, we certainly do not mean for all counterfactual statements to be true. Lewis (1973) navigates this paradox using a modal logic in which a counterfactual statement describes the state of affairs in an alternate world that resembles ours except for the specified differences.
Counterfactuals indeed offer many subtle ways to qualify such alternate worlds. For instance, we can easily describe isolation assumptions (Section 3.2) in a counterfactual question:

How would the system have performed if, when the data was collected, we had replaced model $M$ by model $M^*$ without incurring user or advertiser reactions?
The fact that we could not have changed the model without incurring the user and advertiser reactions does not matter any more than the fact that we did not replace model $M$ by model $M^*$ in the first place. This does not prevent us from using counterfactual statements to reason about causes and effects. Counterfactual questions and statements provide a natural framework to express and share our conclusions.

The remaining text in this section explains how we can answer certain counterfactual questions using data collected in the past. More precisely, we seek to estimate performance metrics that can be expressed as expectations with respect to the distribution that would have been observed if the counterfactual conditions had been in force.⁵

Figure 10: Causal graph for an image recognition system. We can estimate counterfactuals by replaying data collected in the past.

Figure 11: Causal graph for a randomized experiment. We can estimate certain counterfactuals by reweighting data collected in the past.

4.1 Replaying Empirical Data

Figure 10 shows the causal graph associated with a simple image recognition system. The classifier takes an image $x$ and produces a prospective class label $\hat{y}$. The loss measures the penalty associated with recognizing class $\hat{y}$ while the true class is $y$.

To estimate the expected error of such a classifier, we collect a representative data set composed of labelled images, run the classifier on each image, and average the resulting losses. In other words, we replay the data set to estimate what (counterfactual) performance would have been observed if we had used a different classifier. We can then select in retrospect the classifier that would have worked the best and hope that it will keep working well. This is the counterfactual viewpoint on empirical risk minimization (Vapnik, 1982).

Replaying the data set works because both the alternate classifier and the loss function are known.
More generally, to estimate a counterfactual by replaying a data set, we need to know all the functional dependencies associated with all causal paths connecting the intervention point to the measurement point. This is obviously not always the case.

5. Although counterfactual expectations can be viewed as expectations of unit-level counterfactuals (Pearl, 2009, Definition 4), they elude the semantic subtleties of unit-level counterfactuals and can be measured with randomized experiments (see Section 4.2).
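The replay idea for the image recognition example can be sketched as follows. The data set, the one-number "images", and the threshold classifiers are all hypothetical stand-ins; the point is only that a known classifier and a known loss let us recompute the counterfactual performance from stored data.

```python
def replay_error(classifier, dataset):
    """Average 0/1 loss of `classifier` over stored (image, label) pairs."""
    losses = [0.0 if classifier(x) == y else 1.0 for x, y in dataset]
    return sum(losses) / len(losses)

# Hypothetical labelled data: each "image" is reduced to a single number.
dataset = [(0.1, 0), (0.4, 0), (0.45, 1), (0.6, 1), (0.8, 1), (0.9, 1)]

# Alternate classifiers we might have used instead (all hypothetical).
candidates = {
    "threshold_0.5": lambda x: int(x > 0.5),
    "threshold_0.3": lambda x: int(x > 0.3),
    "threshold_0.42": lambda x: int(x > 0.42),
}

# Replay the same data through every candidate and pick the best in retrospect.
errors = {name: replay_error(f, dataset) for name, f in candidates.items()}
best = min(errors, key=errors.get)
```

Selecting `best` this way is empirical risk minimization over the candidate set; nothing in the sketch guarantees that the retrospective winner keeps working well on new data.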
4.2 Reweighting Randomized Trials

Figure 11 illustrates the randomized experiment suggested in Section 2.3. The patients are randomly split into two equally sized groups receiving respectively treatments A and B. The overall success rate for this experiment is therefore $Y = (Y_A + Y_B)/2$ where $Y_A$ and $Y_B$ are the success rates observed for each group. We would like to estimate which (counterfactual) overall success rate $Y^*$ would have been observed if we had selected treatment A with probability $p^*$ and treatment B with probability $1 - p^*$.

Since we do not know how the outcome depends on the treatment and the patient condition, we cannot compute which outcome $y^*$ would have been obtained if we had treated patient $x$ with a different treatment $u^*$. Therefore we cannot answer this question by replaying the data as we did in Section 4.1.

However, observing different success rates $Y_A$ and $Y_B$ for the treatment groups reveals an empirical correlation between the treatment $u$ and the outcome $y$. Since the only cause of the treatment $u$ is an independent roll of the dice, this correlation cannot result from any known or unknown confounding common cause.⁶ Having eliminated this possibility, we can reweight the observed outcomes and compute the estimate $Y^* \approx p^* Y_A + (1 - p^*)\,Y_B$.

6. See also the discussion of Reichenbach's common cause principle and of its limitations in Spirtes et al. (1993) and Spirtes and Scheines (2004).

4.3 Markov Factor Replacement

The reweighting approach can in fact be applied under much less stringent conditions. Let us return to the ad placement problem to illustrate this point.

The average number of ad clicks per page is often called click yield. Increasing the click yield usually benefits both the advertiser and the publisher, whereas increasing the revenue per page often benefits the publisher at the expense of the advertiser. Click yield is therefore a very useful metric when we reason with an isolation assumption that ignores the advertiser reactions to pricing changes.

Let $\omega$ be a shorthand for all variables appearing in the Markov factorization of the ad placement structural equation model,

$$P(\omega) = P(u,v)\,P(x \mid u)\,P(a \mid x,v)\,P(b \mid x,v)\,P(q \mid x,a)\,P(s \mid a,q,b)\,P(c \mid a,q,b)\,P(y \mid s,u)\,P(z \mid y,c). \tag{2}$$

Variable $y$ was defined in Section 3.1 as the set of user clicks. In the rest of the document, we slightly abuse this notation by using the same letter $y$ to represent the number of clicks. We also write the expectation $Y = \mathbb{E}_{\omega \sim P(\omega)}[y]$ using the integral notation

$$Y = \int_\omega y\, P(\omega).$$

We would like to estimate what the expected click yield $Y^*$ would have been if we had used a different scoring function (Figure 12). This intervention amounts to replacing the actual factor $P(q \mid x,a)$ by a counterfactual factor $P^*(q \mid x,a)$ in the Markov factorization:

$$P^*(\omega) = P(u,v)\,P(x \mid u)\,P(a \mid x,v)\,P(b \mid x,v)\,P^*(q \mid x,a)\,P(s \mid a,q,b)\,P(c \mid a,q,b)\,P(y \mid s,u)\,P(z \mid y,c). \tag{3}$$
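It may help to write out the ratio of the counterfactual factorization (3) to the actual factorization (2). The step below assumes that the last factor of (3) is the revenue factor $P(z \mid y,c)$, exactly as in (2), and that the shared factors are nonzero. Every factor other than the scoring factor then appears in both products and cancels:

```latex
\frac{P^*(\omega)}{P(\omega)}
  = \frac{P(u,v)\,P(x \mid u)\,P(a \mid x,v)\,P(b \mid x,v)\,P^*(q \mid x,a)\,
          P(s \mid a,q,b)\,P(c \mid a,q,b)\,P(y \mid s,u)\,P(z \mid y,c)}
         {P(u,v)\,P(x \mid u)\,P(a \mid x,v)\,P(b \mid x,v)\,P(q \mid x,a)\,
          P(s \mid a,q,b)\,P(c \mid a,q,b)\,P(y \mid s,u)\,P(z \mid y,c)}
  = \frac{P^*(q \mid x,a)}{P(q \mid x,a)}
```

Only the scoring model, which the system designers control, is needed to evaluate this ratio; the unknown user and advertiser factors never have to be estimated.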
Figure 12: Estimating which average number of clicks per page would have been observed if we had used a different scoring model.

Let us assume, for simplicity, that the actual factor $P(q \mid x,a)$ is nonzero everywhere. We can then estimate the counterfactual expected click yield $Y^*$ using the transformation

$$Y^* = \int_\omega y\, P^*(\omega) = \int_\omega y\, \frac{P^*(q \mid x,a)}{P(q \mid x,a)}\, P(\omega) \approx \frac{1}{n} \sum_{i=1}^{n} y_i\, \frac{P^*(q_i \mid x_i,a_i)}{P(q_i \mid x_i,a_i)}, \tag{4}$$

where the data set of tuples $(a_i, x_i, q_i, y_i)$ is distributed according to the actual Markov factorization instead of the counterfactual Markov factorization. This data could therefore have been collected during the normal operation of the ad placement system. Each sample is reweighted to reflect its probability of occurrence under the counterfactual conditions.

In general, we can use importance sampling to estimate the counterfactual expectation of any quantity $\ell(\omega)$:

$$Y^* = \int_\omega \ell(\omega)\, P^*(\omega) = \int_\omega \ell(\omega)\, \frac{P^*(\omega)}{P(\omega)}\, P(\omega) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(\omega_i)\, w_i \tag{5}$$

with weights

$$w_i = w(\omega_i) = \frac{P^*(\omega_i)}{P(\omega_i)} = \frac{\prod \text{factors appearing in } P^*(\omega_i) \text{ but not in } P(\omega_i)}{\prod \text{factors appearing in } P(\omega_i) \text{ but not in } P^*(\omega_i)}. \tag{6}$$

Equation (6) emphasizes the simplifications resulting from the algebraic similarities of the actual and counterfactual Markov factorizations. Because of these simplifications, the evaluation of the weights only requires the knowledge of the few factors that differ between $P(\omega)$ and $P^*(\omega)$. Each data sample needs to provide the value of $\ell(\omega_i)$ and the values of all variables needed to evaluate the factors that do not cancel in the ratio (6).

In contrast, the replaying approach (Section 4.1) demands the knowledge of all factors of $P^*(\omega)$ connecting the point of intervention to the point of measurement $\ell(\omega)$. On the other hand, it does not require the knowledge of factors appearing only in $P(\omega)$.

Importance sampling relies on the assumption that all the factors appearing in the denominator of the reweighting ratio (6) are nonzero whenever the factors appearing in the numerator are nonzero.
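The estimator of equation (5) can be sketched on the randomized treatment setting of Section 4.2, where the only factor that differs between the actual and counterfactual distributions is the treatment probability. The success rates below are hypothetical and serve only to generate data; the estimator itself never looks at them.

```python
import random

def collect(n, seed=0):
    """Actual experiment: treatment A or B with probability 1/2 each."""
    rng = random.Random(seed)
    success_rate = {"A": 0.7, "B": 0.4}  # hypothetical, unknown to the analyst
    data = []
    for _ in range(n):
        u = "A" if rng.random() < 0.5 else "B"
        y = 1.0 if rng.random() < success_rate[u] else 0.0
        data.append((u, y))
    return data

def counterfactual_estimate(data, p_star, p_actual=0.5):
    """Equation (5) with l(omega) = y and weights P*(u) / P(u) as in (6)."""
    weight = {
        "A": p_star / p_actual,
        "B": (1.0 - p_star) / (1.0 - p_actual),
    }
    return sum(y * weight[u] for u, y in data) / len(data)

data = collect(100_000)
# Estimated success rate had treatment A been selected with probability 0.9.
estimate = counterfactual_estimate(data, p_star=0.9)
```

With these hypothetical rates the quantity being estimated is 0.9 × 0.7 + 0.1 × 0.4 = 0.67, and setting `p_star` equal to `p_actual` makes all weights equal to one, which recovers the plain average of the observed outcomes.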
Since the denominator factors represent conditional probabilities resulting from the effect of an independent noise variable in the structural equation model, this assumption means that the data must be collected with an experiment involving active randomization. We must therefore design cost-effective randomized experiments that yield enough information to estimate many interesting counterfactual expectations with sufficient accuracy. This problem cannot be solved without answering the confidence interval question: given data collected with a certain level of randomization, with which accuracy can we estimate a given counterfactual expectation?

4.4 Confidence Intervals

At first sight, we can invoke the law of large numbers and write

$$Y^* = \int_\omega \ell(\omega)\, w(\omega)\, P(\omega) \approx \frac{1}{n} \sum_{i=1}^{n} \ell(\omega_i)\, w_i. \tag{7}$$

For sufficiently large $n$, the central limit theorem provides confidence intervals whose width grows with the standard deviation of the product $\ell(\omega)\, w(\omega)$.

Unfortunately, when $P(\omega)$ is small, the reweighting ratio $w(\omega)$ takes large values with low probability. This heavy-tailed distribution has annoying consequences because the variance of the integrand could be very high or infinite. When the variance is infinite, the central limit theorem does not hold. When the variance is merely very large, the central limit convergence might occur too slowly to justify such confidence intervals. Importance sampling works best when the actual distribution and the counterfactual distribution overlap.

When the counterfactual distribution has significant mass in domains where the actual distribution is small, the few samples available in these domains receive very high weights. Their noisy contribution dominates the reweighted estimate (7). We can obtain better confidence intervals by eliminating these few samples drawn in poorly explored domains. The resulting bias can be bounded using prior knowledge, for instance with an assumption about the range of values taken by $\ell(\omega)$,

$$\forall \omega \quad \ell(\omega) \in [0, M]. \tag{8}$$

Let us choose the maximum weight value $R$ deemed acceptable for the weights. We have obtained very consistent results in practice with $R$ equal to the fifth largest reweighting ratio observed on the empirical data.⁷ We can then rely on clipped weights to eliminate the contribution of the poorly explored domains,

$$\bar{w}(\omega) = \begin{cases} w(\omega) & \text{if } P^*(\omega) < R\, P(\omega) \\ 0 & \text{otherwise.} \end{cases}$$

The condition $P^*(\omega) < R\, P(\omega)$ ensures that the ratio has a nonzero denominator $P(\omega)$ and is smaller than $R$. Let $\Omega_R$ be the set of all values of $\omega$ associated with acceptable ratios:

$$\Omega_R = \{\, \omega : P^*(\omega) < R\, P(\omega) \,\}.$$

We can decompose $Y^*$ in two terms:

$$Y^* = \int_{\omega \in \Omega_R} \ell(\omega)\, P^*(\omega) + \int_{\omega \in \Omega \setminus \Omega_R} \ell(\omega)\, P^*(\omega) = \bar{Y}^* + (Y^* - \bar{Y}^*). \tag{9}$$

7. This is in fact a slight abuse because the theory calls for choosing $R$ before seeing the data.
The first term of this decomposition is the clipped expectation $\bar{Y}^*$. Estimating the clipped expectation $\bar{Y}^*$ is much easier than estimating $Y^*$ from (7) because the clipped weights $\bar{w}(\omega)$ are bounded by $R$.

$$\bar{Y}^* = \int_{\omega \in \Omega_R} \ell(\omega)\, P^*(\omega) = \int_\omega \ell(\omega)\, \bar{w}(\omega)\, P(\omega) \approx \hat{Y}^* = \frac{1}{n} \sum_{i=1}^{n} \ell(\omega_i)\, \bar{w}(\omega_i). \tag{10}$$

The second term of Equation (9) can be bounded by leveraging assumption (8). The resulting bound can then be conveniently estimated using only the clipped weights.

$$Y^* - \bar{Y}^* = \int_{\omega \in \Omega \setminus \Omega_R} \ell(\omega)\, P^*(\omega) \in \big[\, 0,\; M\, P^*(\Omega \setminus \Omega_R) \,\big] = \big[\, 0,\; M (1 - \bar{W}^*) \,\big]$$

with

$$\bar{W}^* = P^*(\Omega_R) = \int_{\omega \in \Omega_R} P^*(\omega) = \int_\omega \bar{w}(\omega)\, P(\omega) \approx \hat{W}^* = \frac{1}{n} \sum_{i=1}^{n} \bar{w}(\omega_i). \tag{11}$$

Since the clipped weights are bounded, the estimation errors associated with (10) and (11) are well characterized using either the central limit theorem or using empirical Bernstein bounds (see Appendix B for details). Therefore we can derive an outer confidence interval of the form

$$\mathbb{P}\big\{\, \hat{Y}^* - \epsilon_R \le \bar{Y}^* \le \hat{Y}^* + \epsilon_R \,\big\} \ge 1 - \delta \tag{12}$$

and an inner confidence interval of the form

$$\mathbb{P}\big\{\, \bar{Y}^* \le Y^* \le \bar{Y}^* + M (1 - \hat{W}^* + \xi_R) \,\big\} \ge 1 - \delta. \tag{13}$$

The names inner and outer are in fact related to our preferred way to visualize these intervals (e.g., Figure 13). Since the bounds on $Y^* - \bar{Y}^*$ can be written as

$$\bar{Y}^* \le Y^* \le \bar{Y}^* + M (1 - \bar{W}^*), \tag{14}$$

we can derive our final confidence interval,

$$\mathbb{P}\big\{\, \hat{Y}^* - \epsilon_R \le Y^* \le \hat{Y}^* + M (1 - \hat{W}^* + \xi_R) + \epsilon_R \,\big\} \ge 1 - 2\delta. \tag{15}$$

In conclusion, replacing the unbiased importance sampling estimator (7) by the clipped importance sampling estimator (10) with a suitable choice of $R$ leads to improved confidence intervals. Furthermore, since the derivation of these confidence intervals does not rely on the assumption that $P(\omega)$ is nonzero everywhere, the clipped importance sampling estimator remains valid when the distribution $P(\omega)$ has a limited support. This relaxes the main restriction associated with importance sampling.

4.5 Interpreting the Confidence Intervals

The estimation of the counterfactual expectation $Y^*$ can be inaccurate because the sample size is insufficient or because the sampling distribution $P(\omega)$ does not sufficiently explore the counterfactual conditions of interest.

By construction, the clipped expectation $\bar{Y}^*$ ignores the domains poorly explored by the sampling distribution $P(\omega)$.
The difference $Y^* - \bar{Y}^*$ then reflects the inaccuracy resulting from a lack of exploration. Therefore, assuming that the bound $R$ has been chosen competently, the relative sizes of the outer and inner confidence intervals provide precious cues to determine whether we can continue collecting data using the same experimental setup or should adjust the data collection experiment in order to obtain a better coverage.
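The clipped estimator and the bias bound can be sketched as follows. This is a simplified illustration: the function name and its return convention are hypothetical, and the sampling error terms $\epsilon_R$ and $\xi_R$ are deliberately left out, so the returned interval only reflects the lack of exploration and would have to be widened to obtain a valid confidence interval.

```python
def clipped_estimate(losses, weights, R, M):
    """Clipped importance sampling with its exploration-related bias bound.

    losses[i]  = l(omega_i), assumed to lie in [0, M]
    weights[i] = P*(omega_i) / P(omega_i)
    Returns (y_hat, w_hat, lower, upper). The interval [lower, upper]
    omits the sampling errors eps_R and xi_R of the full derivation.
    """
    n = len(losses)
    # Clipped weights: keep the ratio only when it is smaller than R.
    clipped = [w if w < R else 0.0 for w in weights]
    y_hat = sum(l * w for l, w in zip(losses, clipped)) / n   # clipped estimate
    w_hat = sum(clipped) / n                                  # estimate of P*(Omega_R)
    # Mass of the poorly explored domain, floored at zero because the
    # empirical w_hat may exceed one through sampling noise.
    missing_mass = max(0.0, 1.0 - w_hat)
    return y_hat, w_hat, y_hat, y_hat + M * missing_mass

# Tiny hand-made example: the last sample has a very large weight.
y_hat, w_hat, lower, upper = clipped_estimate(
    losses=[1.0, 0.0, 1.0, 1.0],
    weights=[0.5, 0.5, 1.0, 6.0],
    R=5.0,
    M=1.0,
)
```

A wide gap between `lower` and `upper` signals that much of the counterfactual mass falls where the data collection policy rarely went, which suggests adjusting the randomization rather than simply collecting more data with the same setup.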
Geometric eries Before we defie what is meat by a series, we eed to itroduce a related topic, that of sequeces. Formally, a sequece is a fuctio that computes a ordered list. uppose that o day 1, you have
More informationCenter, Spread, and Shape in Inference: Claims, Caveats, and Insights
Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the
More information5 Boolean Decision Trees (February 11)
5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected
More informationStat 104 Lecture 16. Statistics 104 Lecture 16 (IPS 6.1) Confidence intervals  the general concept
Statistics 104 Lecture 16 (IPS 6.1) Outlie for today Cofidece itervals Cofidece itervals for a mea, µ (kow σ) Cofidece itervals for a proportio, p Margi of error ad sample size Review of mai topics for
More informationTrading the randomness  Designing an optimal trading strategy under a drifted random walk price model
Tradig the radomess  Desigig a optimal tradig strategy uder a drifted radom walk price model Yuao Wu Math 20 Project Paper Professor Zachary Hamaker Abstract: I this paper the author iteds to explore
More informationDivide and Conquer, Solving Recurrences, Integer Multiplication Scribe: Juliana Cook (2015), V. Williams Date: April 6, 2016
CS 6, Lecture 3 Divide ad Coquer, Solvig Recurreces, Iteger Multiplicatio Scribe: Juliaa Cook (05, V Williams Date: April 6, 06 Itroductio Today we will cotiue to talk about divide ad coquer, ad go ito
More informationConfidence Intervals for One Mean
Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a
More informationINVESTMENT PERFORMANCE COUNCIL (IPC) Guidance Statement on Calculation Methodology
Adoptio Date: 4 March 2004 Effective Date: 1 Jue 2004 Retroactive Applicatio: No Public Commet Period: Aug Nov 2002 INVESTMENT PERFORMANCE COUNCIL (IPC) Preface Guidace Statemet o Calculatio Methodology
More informationDetermining the sample size
Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors
More informationModule 4: Mathematical Induction
Module 4: Mathematical Iductio Theme 1: Priciple of Mathematical Iductio Mathematical iductio is used to prove statemets about atural umbers. As studets may remember, we ca write such a statemet as a predicate
More informationMARTINGALES AND A BASIC APPLICATION
MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measuretheoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this
More informationResearch Method (I) Knowledge on Sampling (Simple Random Sampling)
Research Method (I) Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact
More informationInstitute of Actuaries of India Subject CT1 Financial Mathematics
Istitute of Actuaries of Idia Subject CT1 Fiacial Mathematics For 2014 Examiatios Subject CT1 Fiacial Mathematics Core Techical Aim The aim of the Fiacial Mathematics subject is to provide a groudig i
More informationVladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT
Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee
More informationCHAPTER 3 DIGITAL CODING OF SIGNALS
CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity
More informationif A S, then X \ A S, and if (A n ) n is a sequence of sets in S, then n A n S,
Lecture 5: Borel Sets Topologically, the Borel sets i a topological space are the σalgebra geerated by the ope sets. Oe ca build up the Borel sets from the ope sets by iteratig the operatios of complemetatio
More informationHypergeometric Distributions
7.4 Hypergeometric Distributios Whe choosig the startig lieup for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you
More informationWe have seen that the physically observable properties of a quantum system are represented
Chapter 14 Probability, Expectatio Value ad Ucertaity We have see that the physically observable properties of a quatum system are represeted by Hermitea operators (also referred to as observables ) such
More informationTaking DCOP to the Real World: Efficient Complete Solutions for Distributed MultiEvent Scheduling
Taig DCOP to the Real World: Efficiet Complete Solutios for Distributed MultiEvet Schedulig Rajiv T. Maheswara, Milid Tambe, Emma Bowrig, Joatha P. Pearce, ad Pradeep araatham Uiversity of Souther Califoria
More information1 The Binomial Theorem: Another Approach
The Biomial Theorem: Aother Approach Pascal s Triagle I class (ad i our text we saw that, for iteger, the biomial theorem ca be stated (a + b = c a + c a b + c a b + + c ab + c b, where the coefficiets
More informationPreSuit Collection Strategies
PreSuit Collectio Strategies Writte by Charles PT Phoeix How to Decide Whether to Pursue Collectio Calculatig the Value of Collectio As with ay busiess litigatio, all factors associated with the process
More informationSection IV.5: Recurrence Relations from Algorithms
Sectio IV.5: Recurrece Relatios from Algorithms Give a recursive algorithm with iput size, we wish to fid a Θ (best big O) estimate for its ru time T() either by obtaiig a explicit formula for T() or by
More informationQuadrat Sampling in Population Ecology
Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may
More informationNonlife insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring
Nolife isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy
More informationPresent Values, Investment Returns and Discount Rates
Preset Values, Ivestmet Returs ad Discout Rates Dimitry Midli, ASA, MAAA, PhD Presidet CDI Advisors LLC dmidli@cdiadvisors.com May 2, 203 Copyright 20, CDI Advisors LLC The cocept of preset value lies
More information9.8: THE POWER OF A TEST
9.8: The Power of a Test CD91 9.8: THE POWER OF A TEST I the iitial discussio of statistical hypothesis testig, the two types of risks that are take whe decisios are made about populatio parameters based
More informationGeometric Sequences and Series. Geometric Sequences. Definition of Geometric Sequence. such that. a2 4
3330_0903qxd /5/05 :3 AM Page 663 Sectio 93 93 Geometric Sequeces ad Series 663 Geometric Sequeces ad Series What you should lear Recogize, write, ad fid the th terms of geometric sequeces Fid th partial
More informationInstitute for the Advancement of University Learning & Department of Statistics
Istitute for the Advacemet of Uiversity Learig & Departmet of Statistics Descriptive Statistics for Research (Hilary Term, 00) Lecture 5: Cofidece Itervals (I.) Itroductio Cofidece itervals (or regios)
More informationExample 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).
BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook  Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly
More informationARITHMETIC AND GEOMETRIC PROGRESSIONS
Arithmetic Ad Geometric Progressios Sequeces Ad ARITHMETIC AND GEOMETRIC PROGRESSIONS Successio of umbers of which oe umber is desigated as the first, other as the secod, aother as the third ad so o gives
More informationThe shaded region above represents the region in which z lies.
GCE A Level H Maths Solutio Paper SECTION A (PURE MATHEMATICS) (i) Im 3 Note: Uless required i the questio, it would be sufficiet to just idicate the cetre ad radius of the circle i such a locus drawig.
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationEconomics 140A Confidence Intervals and Hypothesis Testing
Ecoomics 140A Cofidece Itervals ad Hypothesis Testig Obtaiig a estimate of a parameter is ot the al purpose of statistical iferece because it is highly ulikely that the populatio value of a parameter is
More informationPage 1. Real Options for Engineering Systems. What are we up to? Today s agenda. J1: Real Options for Engineering Systems. Richard de Neufville
Real Optios for Egieerig Systems J: Real Optios for Egieerig Systems By (MIT) Stefa Scholtes (CU) Course website: http://msl.mit.edu/cmi/ardet_2002 Stefa Scholtes Judge Istitute of Maagemet, CU Slide What
More informationBASIC STATISTICS. Discrete. Mass Probability Function: P(X=x i ) Only one finite set of values is considered {x 1, x 2,...} Prob. t = 1.
BASIC STATISTICS 1.) Basic Cocepts: Statistics: is a sciece that aalyzes iformatio variables (for istace, populatio age, height of a basketball team, the temperatures of summer moths, etc.) ad attempts
More information1. C. The formula for the confidence interval for a population mean is: x t, which was
s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : pvalue
More informationODBC. Getting Started With Sage Timberline Office ODBC
ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.
More informationG r a d e. 2 M a t h e M a t i c s. statistics and Probability
G r a d e 2 M a t h e M a t i c s statistics ad Probability Grade 2: Statistics (Data Aalysis) (2.SP.1, 2.SP.2) edurig uderstadigs: data ca be collected ad orgaized i a variety of ways. data ca be used
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationHow to read A Mutual Fund shareholder report
Ivestor BulletI How to read A Mutual Fud shareholder report The SEC s Office of Ivestor Educatio ad Advocacy is issuig this Ivestor Bulleti to educate idividual ivestors about mutual fud shareholder reports.
More informationThe second difference is the sequence of differences of the first difference sequence, 2
Differece Equatios I differetial equatios, you look for a fuctio that satisfies ad equatio ivolvig derivatives. I differece equatios, istead of a fuctio of a cotiuous variable (such as time), we look for
More informationProject Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments
Project Deliverables CS 361, Lecture 28 Jared Saia Uiversity of New Mexico Each Group should tur i oe group project cosistig of: About 612 pages of text (ca be loger with appedix) 612 figures (please
More informationG r a d e. 5 M a t h e M a t i c s. Patterns and relations
G r a d e 5 M a t h e M a t i c s Patters ad relatios Grade 5: Patters ad Relatios (Patters) (5.PR.1) Edurig Uderstadigs: Number patters ad relatioships ca be represeted usig variables. Geeral Outcome:
More information23.3 Sampling Distributions
COMMON CORE Locker LESSON Commo Core Math Stadards The studet is expected to: COMMON CORE SIC.B.4 Use data from a sample survey to estimate a populatio mea or proportio; develop a margi of error through
More informationInformation about Bankruptcy
Iformatio about Bakruptcy Isolvecy Service of Irelad Seirbhís Dócmhaieachta a héirea Isolvecy Service of Irelad Seirbhís Dócmhaieachta a héirea What is the? The Isolvecy Service of Irelad () is a idepedet
More informationME 101 Measurement Demonstration (MD 1) DEFINITIONS Precision  A measure of agreement between repeated measurements (repeatability).
INTRODUCTION This laboratory ivestigatio ivolves makig both legth ad mass measuremets of a populatio, ad the assessig statistical parameters to describe that populatio. For example, oe may wat to determie
More informationHere are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More informationUnit 20 Hypotheses Testing
Uit 2 Hypotheses Testig Objectives: To uderstad how to formulate a ull hypothesis ad a alterative hypothesis about a populatio proportio, ad how to choose a sigificace level To uderstad how to collect
More information8.1 Arithmetic Sequences
MCR3U Uit 8: Sequeces & Series Page 1 of 1 8.1 Arithmetic Sequeces Defiitio: A sequece is a comma separated list of ordered terms that follow a patter. Examples: 1, 2, 3, 4, 5 : a sequece of the first
More information7. Sample Covariance and Correlation
1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y
More informationNormal Distribution.
Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued
More informationwhere: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return
EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The
More informationLesson 17 Pearson s Correlation Coefficient
Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) types of data scatter plots measure of directio measure of stregth Computatio covariatio of X ad Y uique variatio i X ad Y measurig
More informationStatistical Methods. Chapter 1: Overview and Descriptive Statistics
Geeral Itroductio Statistical Methods Chapter 1: Overview ad Descriptive Statistics Statistics studies data, populatio, ad samples. Descriptive Statistics vs Iferetial Statistics. Descriptive Statistics
More informationFOUNDATIONS OF MATHEMATICS AND PRECALCULUS GRADE 10
FOUNDATIONS OF MATHEMATICS AND PRECALCULUS GRADE 10 [C] Commuicatio Measuremet A1. Solve problems that ivolve liear measuremet, usig: SI ad imperial uits of measure estimatio strategies measuremet strategies.
More informationFACTSHEET 1: DEVELOPING A STRATEGIC MANAGEMENT PLAN
FACTSHEET 1: DEVELOPING A STRATEGIC MANAGEMENT PLAN THIS FACTSHEET RELATES TO QUESTION 2.3 OF THE MSPI (MUSEUM STANDARDS PROGRAMME FOR IRELAND) 1. Itroductio Writig a Strategic Maagemet Pla ca provide
More informationINDEPENDENT BUSINESS PLAN EVENT 2016
INDEPENDENT BUSINESS PLAN EVENT 2016 The Idepedet Busiess Pla Evet ivolves the developmet of a comprehesive proposal to start a ew busiess. Ay type of busiess may be used. The Idepedet Busiess Pla Evet
More informationConcept #1. Goals for Presentation. I m going to be a mathematics teacher: Where did this stuff come from? Why didn t I know this before?
I m goig to be a mathematics teacher: Why did t I kow this before? Steve Williams Associate Professor of Mathematics/ Coordiator of Secodary Mathematics Educatio Lock Have Uiversity of PA swillia@lhup.edu
More informationCHAPTER 7: Central Limit Theorem: CLT for Averages (Means)
CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:
More informationPSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics
More informationThe Limit of a Sequence
3 The Limit of a Sequece 3. Defiitio of limit. I Chapter we discussed the limit of sequeces that were mootoe; this restrictio allowed some shortcuts ad gave a quick itroductio to the cocept. But may importat
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationZTEST / ZSTATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown
ZTEST / ZSTATISTIC: used to test hypotheses about µ whe the populatio stadard deviatio is kow ad populatio distributio is ormal or sample size is large TTEST / TSTATISTIC: used to test hypotheses about
More information