Causal Inference on Total, Direct, and Indirect Effects


Rolf Steyer, Axel Mayer and Christiane Fiege

In: A. Michalos (ed.), Encyclopedia of Quality of Life Research. Version: May 27,

Abstract

The theory of causal effects (TCE) is a mathematical theory providing a methodological foundation for the design and analysis of experiments and quasi-experiments. TCE consists of two parts. In the first part, total, direct, and indirect effects are defined; the second part deals with causal inference, i. e., it shows how causal effects are identified by estimable quantities. Each part has two levels, a disaggregated and a re-aggregated one. In the definition part of TCE, the disaggregated level is called the atomic level. In this part we translate J. St. Mill's ceteris paribus clause into probabilistic concepts. For this purpose, we introduce a temporal order between events and/or random variables using the concept of a filtration. Defining an atomic total-effect variable, we isolate the effects of X on Y, controlling for all variables that are prior or simultaneous to X, while ignoring all intermediate variables between X and Y. In contrast, in the definition of an atomic t-direct-effect variable, we ignore all intermediate variables between t and Y, but control for all variables (potential confounders) that are prior or simultaneous to t. At the second level of the definition part of TCE, we aggregate these atomic effects, defining average effects as expectations and conditional effects as conditional expectations of the corresponding atomic effect variables. In the identification part of TCE we connect causal effects to estimable quantities, namely conditional expectations of Y given X, or of Y given X, covariates, and/or intermediate variables, via the unbiasedness assumption. At the disaggregated level of the identification part, we present a number of causality conditions, i. e., conditions that imply unbiasedness and identifiability of causal effects. At this level we condition on covariates such that one of these causality conditions holds.
Once identification of conditional causal effects is achieved by controlling for these covariates, we can again re-aggregate, taking expectations and/or conditional expectations of the conditional causal effects obtained at the disaggregated level. In this way we can coarsen the conditional effects obtained at the disaggregated level of the identification part. TCE has implications for the design and data analysis of empirical studies aimed at estimating and testing causal effects. The most important design techniques are randomization, conditional randomization, and covariate selection. All these design techniques aim at satisfying one of the causality conditions. Techniques of data analysis can also be selected and/or developed guided by TCE (for more details see the Conclusion).

1 Research Traditions

Studying the effect of a variable X on a variable Y, we distinguish between total, direct, and indirect effects (Wright, 1921, 1923). In a randomized experiment, the average total treatment effect is typically estimated, which is the average causal effect of a treatment variable X on a response variable Y, irrespective of mediation processes. As soon as we want to gain insight into transmitting pathways, intermediate variables have to be included in order to estimate direct effects of X on Y. Direct effects represent those parts of total effects that are not transmitted through the intermediate variables. In contrast, indirect effects are those components of total effects of X on Y that are not direct but are transmitted through mediators. This article aims at sharpening these intuitive ideas by presenting the stochastic theory of causal effects, which emerged from several research traditions.

The most important contribution of the Neyman-Rubin tradition (see, e. g., Splawa-Neyman, 1923/1990; Rubin, 2005) is its emphasis on defining causal effects such as

individual, conditional, and average treatment effects. Defining such effects is important for proving that certain methods of data analysis yield unbiased estimates of these effects if certain assumptions can be made. Are there conditions under which the analysis of change scores (between pre- and post-tests) and repeated-measures analysis of variance yield causal effects? Under which conditions do we test causal effects in the analysis of covariance? Which are the assumptions under which propensity score methods yield unbiased estimates of causal effects? Answers to all these questions presuppose that we have a clear-cut definition of causal effects.

The Campbellian tradition (see, e. g., Shadish, Cook, & Campbell, 2002), less formalized than the Neyman-Rubin tradition, addresses questions and problems beyond causality itself, which are also relevant in empirical causal research, such as: How to generalize beyond the study? What does the treatment variable mean? Which is the causal agent in a compound treatment variable comprising many aspects? What is the meaning of the response variable? Does it in fact measure the construct of interest? And, perhaps the most important question: Are there alternative explanations for the effects?

In the graphical modeling tradition (see, e. g., Pearl, 2009; Spirtes, Glymour, & Scheines, 2000), techniques have been developed for estimating causal effects, finding confounders, identifying causal effects, and searching for causal models if specific assumptions can be made. The fact that a randomized experiment does not guarantee the validity of causal inference on direct effects has been brought up by this research tradition. The structural equation modeling and psychometrics tradition showed how to use latent variables and structural equation models in testing causal hypotheses.
Although many scientists hope to find causal answers via structural equation modeling, it should be clearly stated that structural equation modeling (and this is also true for graphical modeling and other kinds of statistical modeling, including analysis of variance) neither necessarily means estimating and testing causal effects, nor provides a satisfactory theory of causal effects. Nevertheless, this research tradition contributes, just like graphical modeling and other areas of statistics, many techniques and tools that are useful in the analysis of causal effects.

Mediation analysis has roots in genetics, psychology, sociology, and epidemiology. Mediation is concerned with analyzing total, direct, and indirect effects. To date, much substantive research applying mediation analysis is based on influential papers by Baron and Kenny (1986), Bollen (1987), MacKinnon (2008), Preacher, Rucker, and Hayes (2007), and Sobel (1982), which are based on the original work of Sewall Wright cited above. Recently, ideas from the causal inference literature have entered the discourse on mediation analysis. Questions like "Is the effect of X on Y causally transmitted through a mediator?" or "What is the causal direct effect of X on Y?" have been raised in this field. Early concerns about the causal interpretability of mediation effects were already expressed by Judd and Kenny (1981) as well as Holland (1988).

All these research traditions (as well as others not mentioned) contributed to our knowledge about causal inference. In this article, we present a unified stochastic theory of causal effects, focusing on experimental and quasi-experimental designs in which the putative cause is a discrete random variable (see also Mayer, Thoemmes, Rose, Steyer, & West, 2012). It is presumed that the reader is familiar with some fundamental concepts of measure and probability theory as provided in textbooks such as Bauer (1996), Bauer (2001), Klenke (2008), or Steyer, Nagel,

Partchev, and Mayer (in press).

2 Preliminary Considerations

2.1 Basic Idea of Causal Effects

For the time being, consider a random variable X with two values, 0 and 1, assuming P(X=0), P(X=1) > 0. These values of X may represent two treatment (intervention, exposition) conditions, e. g., a treatment and a control. Therefore, X will be called a treatment variable. The random variable Y (assessing, e. g., quality of life) will be called the response variable and is assumed to have a finite expectation. This assumption implies that the regression E(Y | X) is defined, that the conditional expectations E(Y | X=1) and E(Y | X=0) exist and are finite, and that the difference E(Y | X=1) − E(Y | X=0) between the two conditional expectations of Y in the two treatment conditions is defined. Note that E(Y | X) can be written as a linear function α_0 + α_1·X with slope α_1 = E(Y | X=1) − E(Y | X=0).

However, the difference E(Y | X=1) − E(Y | X=0) is not necessarily identical to the (total) causal treatment effect comparing treatment (x=1) to control (x=0). The crucial problem is that X and Y may both depend on a covariate Z. In this case a regressive dependence of Y on X would be observed, i. e., α_1 ≠ 0 (see left-hand side of Fig. 1), even though Y is regressively independent of X given Z (see right-hand side of Fig. 1). In such a case, the slope of the regression, i. e., α_1 = E(Y | X=1) − E(Y | X=0), does not describe the total causal effect of X on Y. What would be the remedy? Clearly, if Z were the only variable biasing the dependence of Y on X, then keeping Z constant at one of its values z would eliminate this spurious dependence of Y on X. In this case, the differences E(Y | X=1, Z=z) − E(Y | X=0, Z=z) would describe the (Z=z)-conditional total treatment effects of X on Y. Furthermore, taking the expectation of these (Z=z)-conditional effects over the distribution of Z would yield the average total treatment effect.
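The difference between the unadjusted contrast E(Y | X=1) − E(Y | X=0) and the expectation of the (Z=z)-conditional effects can be illustrated numerically. The following Python sketch uses freely invented probabilities for a binary covariate Z; the numbers are not taken from this article's examples:

```python
# Hypothetical joint distribution over binary Z, X, Y; all numbers are
# invented for illustration and are NOT taken from this article's examples.
p_z = {0: 0.5, 1: 0.5}                 # P(Z=z)
p_x1 = {0: 0.2, 1: 0.8}                # P(X=1 | Z=z): Z drives treatment uptake
e_y = {(0, 0): 0.3, (1, 0): 0.5,       # E(Y | X=x, Z=z)
       (0, 1): 0.6, (1, 1): 0.8}

def p_x_given_z(x, z):
    return p_x1[z] if x == 1 else 1 - p_x1[z]

def e_y_given_x(x):
    """E(Y | X=x): mixes the Z-strata with weights P(Z=z | X=x)."""
    num = sum(p_z[z] * p_x_given_z(x, z) * e_y[(x, z)] for z in p_z)
    den = sum(p_z[z] * p_x_given_z(x, z) for z in p_z)
    return num / den

# Unadjusted contrast vs. expectation (over Z) of the conditional effects.
naive = e_y_given_x(1) - e_y_given_x(0)
ate = sum(p_z[z] * (e_y[(1, z)] - e_y[(0, z)]) for z in p_z)
print(round(naive, 2), round(ate, 2))   # .38 vs .2
```

With these numbers the unadjusted difference is .38, whereas the expectation of the (Z=z)-conditional effects is only .2, so ignoring Z overstates the treatment effect.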
Unfortunately, in experiments and quasi-experiments in psychology and other social sciences, there are many variables that may create bias. Conceptually, however, and for the purpose of defining atomic and various aggregated (average, conditional) causal effects, all these variables can be controlled, as will be shown in section 3. Of course, in empirical applications, controlling all these variables that may create bias is a challenge.

2.2 Conceptual Framework

Probability Space and Random Variables

In the sequel, the following kind of random experiment is considered:

(a) sample a person u out of a set of persons (the population of persons),
(b) observe the value z of a fallible (possibly multivariate) pre-treatment variable Z,
(c) assign the unit, or observe its assignment, to one of several experimental conditions (represented by the values x of the treatment variable X),

(d) observe the value m of a (possibly multivariate) intermediate variable M,
(e) observe the value y of the response variable Y.

Figure 1. Path diagrams representing the regression E(Y | X) (on the left) as well as E(X | Z) and E(Y | X, Z) (on the right).

This kind of random experiment is called a single-unit trial. It is the kind of empirical phenomenon typically considered in that part of the stochastic theory of causality that is devoted to causal effects in experiments and quasi-experiments. It does not represent a sampling process, which consists of repeating such a single-unit trial many times in some way or other and which would be considered in treating estimation and hypothesis testing. These issues are not considered in this article. Like all other random experiments, a single-unit trial as described above is represented by a probability space (Ω, A, P), which is the formal framework necessary to define, e. g., random variables, distributions, conditional expectations (see, e. g., Bauer, 2001, Klenke, 2008, or Steyer, Nagel, et al., in press), and causal effects.

Example 1: Joe and Ann With Self-Selection

The first column of Table 1 shows all eight possible outcomes ω ∈ Ω of a very simple random experiment: Sample a person from the set Ω_U := {Joe, Ann}, observe whether (yes) or not (no) the sampled person selects treatment, and whether (+) or not (−) the sampled person reaches a success criterion. This example suffices to illustrate many concepts used in this article. The eight triples displayed in the first column are the elements of the set Ω. The set of all subsets of Ω, the power set P(Ω), is chosen as the σ-algebra A. (For the concept of a σ-algebra see, e. g., chapter 1 of Steyer, Nagel, et al., in press.) It has 2^8 = 256 elements, representing all events that can be considered in this random experiment.
The second column displays the probabilities of all elementary events {ω}, ω ∈ Ω. These eight probabilities can be used to compute the probabilities of all 256 events using the additivity of the probability measure P: A → [0, 1] (see Steyer, Nagel, et al., in press, Def. 4.1). For example, the probability of the event that Joe is sampled and treated is P[{(Joe, yes, −), (Joe, yes, +)}] = P[{(Joe, yes, −)}] + P[{(Joe, yes, +)}] = .004 + .016 = .02. In fact, all other parameters displayed in Table 1 can be computed from the probabilities of the eight elementary events. Alternatively, all probabilities displayed in this table, including those for the elementary events, can be computed from the eight parameters of the first experiment displayed in Table 2.
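The additivity computation above can be reproduced mechanically. The following sketch encodes the eight elementary probabilities of Table 1 and sums them over an event:

```python
# The eight elementary outcomes of Example 1 and their probabilities (Table 1);
# an outcome is (person, selects treatment?, success?).
P = {
    ("Joe", "no", "-"): .144, ("Joe", "no", "+"): .336,
    ("Joe", "yes", "-"): .004, ("Joe", "yes", "+"): .016,
    ("Ann", "no", "-"): .096, ("Ann", "no", "+"): .024,
    ("Ann", "yes", "-"): .228, ("Ann", "yes", "+"): .152,
}
assert abs(sum(P.values()) - 1.0) < 1e-12   # the probabilities sum to 1

def prob(event):
    """Probability of an event by additivity: sum over its elementary events."""
    return sum(P[w] for w in event)

# The event that Joe is sampled and treated: .004 + .016 = .02.
joe_treated = {w for w in P if w[0] == "Joe" and w[1] == "yes"}
print(prob(joe_treated))
```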

Table 1. Joe and Ann with self-selection

Outcome ω       P({ω})  U    X  Y  E(Y|X,U)  E(Y|X)  E(X|U)  E^{X=0}(Y|U)  E^{X=1}(Y|U)  P^{X=0}({ω})  P^{X=1}({ω})
(Joe, no, −)    .144    Joe  0  0  .7        .60     .04     .7            .8            .24           0
(Joe, no, +)    .336    Joe  0  1  .7        .60     .04     .7            .8            .56           0
(Joe, yes, −)   .004    Joe  1  0  .8        .42     .04     .7            .8            0             .01
(Joe, yes, +)   .016    Joe  1  1  .8        .42     .04     .7            .8            0             .04
(Ann, no, −)    .096    Ann  0  0  .2        .60     .76     .2            .4            .16           0
(Ann, no, +)    .024    Ann  0  1  .2        .60     .76     .2            .4            .04           0
(Ann, yes, −)   .228    Ann  1  0  .4        .42     .76     .2            .4            0             .57
(Ann, yes, +)   .152    Ann  1  1  .4        .42     .76     .2            .4            0             .38

Note: The values in the regression columns are computed from the elementary probabilities in the second column.

Table 1 also displays several random variables, e. g., the observable random variables (the observables) U, X, and Y. The first, U, will be called the person variable. Its values are the persons sampled in the random experiment considered. The second, X, called the treatment variable, indicates whether or not the person sampled is treated. The third, Y, is called the response variable. In this example, it indicates whether or not the patient sampled gives a positive statement about his or her quality of life, six months after treatment. Other random variables are the regressions (synonymously, conditional expectations) E(Y | X, U), E(Y | X), and E(X | U). Because Y is a dichotomous regressand with values 0 and 1, these regressions are also denoted by P(Y=1 | X, U), P(Y=1 | X), and P(X=1 | U), respectively.

All the random variables mentioned above are mappings on Ω with values in a subset of the set ℝ of real numbers, except for U, which takes on its values in the set {Joe, Ann}. By definition, all random variables on (Ω, A, P) are measurable with respect to A and have a distribution denoted by P_U, P_X, P_Y, etc. (see chapter 5 of Steyer, Nagel, et al., in press). Let us use X to illustrate the concept of measurability with respect to A. First note that X: Ω → Ω_X is a mapping with domain Ω and range Ω_X = {0, 1}. Furthermore, the definition of a random variable does not only presuppose that there is a σ-algebra on Ω, but also a σ-algebra on Ω_X. In our example, this σ-algebra on Ω_X is A_X = {Ω_X, Ø, {0}, {1}}.
The definition of a random variable on (Ω, A, P) requires that

X^{-1}(A′) ∈ A,  for all A′ ∈ A_X,  (1)

where X^{-1}(A′) := {ω ∈ Ω: X(ω) ∈ A′} is the inverse image of A′ under X. In our example, (1) is trivially true, because A is the power set P(Ω). In examples in which A ≠ P(Ω) (and this is the case as soon as continuous random variables are involved), this requirement is not trivial. If it holds, then it follows that all events X^{-1}(A′), A′ ∈ A_X, have a probability, namely P[X^{-1}(A′)], because P is a mapping on

A, assigning a probability to all its elements. This fact is used to define the distribution of X by P_X: A_X → [0, 1] with

P_X(A′) = P[X^{-1}(A′)],  A′ ∈ A_X.  (2)

In our example, the distribution of X is specified by P_X({0}) = P[X^{-1}({0})] = P[{(Joe, no, −), (Joe, no, +), (Ann, no, −), (Ann, no, +)}] = .144 + .336 + .096 + .024 = .6. Analogously, P_X({1}) = .4, P_X(Ω_X) = 1, and P_X(Ø) = 0. Finally, the set

X^{-1}(A_X) := {X^{-1}(A′): A′ ∈ A_X}  (3)

is called the σ-algebra generated by X and is also denoted by σ(X). In our example, σ(X) = {Ω, Ø, X^{-1}({0}), X^{-1}({1})} = {Ω, Ø, {(Joe, no, −), (Joe, no, +), (Ann, no, −), (Ann, no, +)}, {(Joe, yes, −), (Joe, yes, +), (Ann, yes, −), (Ann, yes, +)}}.

Table 2. Four random experiments with Joe and Ann, compressed

Random experiment          u     P(U=u)  E^{X=0}(Y|U=u)  E^{X=1}(Y|U=u)  P(X=1|U=u)
1. With self-selection     Joe   .5      .7              .8              .04
                           Ann   .5      .2              .4              .76
2. No treatment for Joe    Joe                           99              0
                           Ann
3. With random assignment  Joe
                           Ann
4. Homogeneous             Joe
                           Ann

Filtration and Temporal Order

In contrast to many other random experiments, in a single-unit trial as described in section 2.2 there is additional structure: There are events that are prior to the treatment variable X, such as the event that Joe is sampled or the event that the person sampled is male. Random variables may also be prior to X, such as the fallible pre-test Z = quality of life before treatment, or the person variable U (taking on the values Joe, Ann, Jim, etc.) itself, which is prior to Z, because the person, his or her sex, race, etc. are determined before a fallible value z of Z is assessed. Furthermore, the response Y represents events, such as {Y=y} := {ω ∈ Ω: Y(ω) = y}, that may occur after treatment. Hence, Y is posterior to X.
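The values of P_X and the four elements of σ(X) derived in this section can be verified with a short script (elementary probabilities as in Table 1):

```python
# Elementary probabilities of Example 1 (Table 1); an outcome is
# (person, selects treatment?, success?).
P = {
    ("Joe", "no", "-"): .144, ("Joe", "no", "+"): .336,
    ("Joe", "yes", "-"): .004, ("Joe", "yes", "+"): .016,
    ("Ann", "no", "-"): .096, ("Ann", "no", "+"): .024,
    ("Ann", "yes", "-"): .228, ("Ann", "yes", "+"): .152,
}

def X(w):                      # treatment variable: 1 if treated, else 0
    return 1 if w[1] == "yes" else 0

def inv(A):                    # inverse image X^{-1}(A), A a subset of {0, 1}
    return frozenset(w for w in P if X(w) in A)

def P_X(A):                    # distribution of X: P_X(A) = P[X^{-1}(A)]
    return sum(P[w] for w in inv(A))

print(P_X({0}))                # .6, as computed in the text

# sigma(X): the inverse images of the four events in A_X
sigma_X = {inv(A) for A in (set(), {0}, {1}, {0, 1})}
print(len(sigma_X))            # 4 events: Omega, the empty set, and the two halves
```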

In more formal and general terms, this temporal order can be represented by a filtration (F_t)_{t∈T} in A, which is a fundamental concept in the theory of stochastic processes (see Table 3 and, e. g., Bauer, 1996; Klenke, 2008; Øksendal, 2007). In many applications it is sufficient to consider a filtration with an index set T = {1, ..., n_T}, where n_T is a natural number > 1. In other applications, T might be a subset of the set of real numbers.

Example 1 continued

In Example 1, we define F_1 := σ(U), F_2 := σ(U, X), and F_3 := σ(U, X, Y). Hence, in this example, the filtration (F_t)_{t∈T} consists of n_T = 3 σ-algebras: F_1 has only four elements, the event that Joe is sampled, the event that Ann is sampled, Ω (Joe or Ann is sampled), and Ø (neither Joe nor Ann is sampled). The σ-algebra F_2 has 2^4 = 16 elements: all elements in F_1, the event that the person sampled is treated, the event that the person sampled is not treated, as well as events such as Joe (Ann) is sampled and (not) treated. Finally, F_3 has 2^8 = 256 elements. It is identical to the power set of Ω and contains as elements all events that can be considered in this random experiment.

2.3 Preliminary Definitions

Order With Respect to a Filtration

Figure 2 depicts a filtration with n_T = 5, and it also shows in which σ-algebra F_t the events {U=u}, {Z=z}, {X=x}, etc. occur for the first time. For example, {Z=z} ∉ F_1 but {Z=z} ∈ F_2, and {X=x} ∉ F_2 but {X=x} ∈ F_3, etc. Using such a filtration, one can easily define terms such as U is prior to X, X is prior to Y, and X_1 is simultaneous to X_2, e. g., if X_2 is a second treatment variable and the second treatment is applied at the same time as the first one. The idea is to see in which σ-algebra F_t events such as {X=x}, {Z=z}, or {Y=y} occur for the first time. Using this criterion for the kind of single-unit trial described above, X is prior to M, which itself is prior to Y, whereas X is posterior to U and Z.
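On a finite Ω, the σ-algebra generated by a mapping is the collection of all unions of blocks of the partition the mapping induces, so the sizes 4, 16, and 256 of F_1, F_2, and F_3 in Example 1 can be computed directly:

```python
from itertools import combinations

# The eight outcomes of Example 1: (person, selects treatment?, success?).
Omega = [(u, x, y) for u in ("Joe", "Ann") for x in ("no", "yes") for y in ("-", "+")]

def generated_sigma_algebra(f):
    """Sigma-algebra generated by a mapping f on the finite space Omega:
    all unions of blocks of the partition {f^{-1}(v)}; its size is 2**(#blocks)."""
    blocks = {}
    for w in Omega:
        blocks.setdefault(f(w), set()).add(w)
    blocks = list(blocks.values())
    events = set()
    for r in range(len(blocks) + 1):
        for combo in combinations(blocks, r):
            events.add(frozenset().union(*combo))
    return events

F1 = generated_sigma_algebra(lambda w: w[0])          # sigma(U)
F2 = generated_sigma_algebra(lambda w: (w[0], w[1]))  # sigma(U, X)
F3 = generated_sigma_algebra(lambda w: w)             # sigma(U, X, Y)

print(len(F1), len(F2), len(F3))   # 4 16 256
assert F1 <= F2 <= F3              # a filtration: F_1 ⊆ F_2 ⊆ F_3
```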
Using the concept of the σ-algebra generated by a random variable V [denoted σ(V)] (see, e. g., Klenke, 2008), this idea is defined in more formal terms in Table 3. The σ-algebras generated by X, Y, etc. are subsets of the corresponding σ-algebras F_t. In contrast, the events {X=x}, {Y=y}, etc. are elements of the corresponding F_t (see again Fig. 2).

Global Covariates

The concept of a global t-covariate of X defined in Table 3 is crucial. It is denoted by C_{X,t} and will be used to define true-outcome variables and atomic causal effects, i. e., effects on the most fine-grained level (see section 3). Note that there are several time points t ∈ T with respect to which a global covariate of X can be considered. For example, defining atomic total effects, we control for C_{X,t_X}, which is defined such that it comprises all variables other than X that are prior or simultaneous to X. In contrast, defining atomic direct effects with respect to time t, we control for C_{X,t}, which is defined such that it comprises all variables other than X that are prior, simultaneous, or posterior to X, but not posterior to t (see Table 3). More precisely, C_{X,t} is a random variable on (Ω, A, P) and its most important property is σ(C_{X,t}, X) = F_t, i. e., C_{X,t} and X together generate F_t. The second assumption ensures that X is not comprised in C_{X,t} [i. e., σ(X) ⊄ σ(C_{X,t})], and the third

assumption implies that C_{X,t} is simultaneous or posterior to X and prior to Y. Intuitively speaking, C_{X,t} comprises all random variables on (Ω, A, P) that are prior or simultaneous to t, except for X itself, i. e., it comprises all potential confounders that could possibly bias t-direct effects (pertaining to pairs of values) of X on Y.

Figure 2. Venn diagram of a filtration with T = {1, ..., 5}: {U=u} ∈ F_1 ⊇ σ(U), {Z=z} ∈ F_2 ⊇ σ(Z), {X=x} ∈ F_3 = F_{t_X} ⊇ σ(X), {M=m} ∈ F_4 = F_{t_M} ⊇ σ(M), and {Y=y} ∈ F_5 ⊇ σ(Y).

Covariates and Intermediate Variables

In this framework, a t-covariate of X is defined as any random variable on (Ω, A, P), say Z_t, with σ(Z_t) ⊆ σ(C_{X,t}). This implies: all events A ∈ A that are represented by a t-covariate of X, such as {Z_t = z_t}, are elements of F_t. Similarly, using the filtration, (t_1, t_2)-intermediate variables can also be defined (see Table 3). Note again that T may also be a continuous (time) set.

Simplified Notation

For simplicity, the terms covariate of X and t_X-covariate will be used as synonyms. Similarly, C_X := C_{X,t_X} and Z := Z_{t_X} denote a global t_X-covariate of X (or simply, a global covariate of X) and a t_X-covariate of X (or simply, a covariate of X), respectively. In single-unit trials in which no fallible covariates of X are assessed, U can be a global covariate of X. Considering a random experiment in which a fallible covariate of X is assessed, (U, Z) can be a global covariate of X, where Z denotes the (possibly multivariate) random variable consisting of all fallible covariates of X.

Table 3. Framework and preliminary concepts

Let X, Y, and W be random variables on a probability space (Ω, A, P).

Filtration in A: A family (F_t)_{t∈T} of σ-algebras F_t ⊆ A such that F_s ⊆ F_t if s ≤ t.

X is prior to Y: X is called prior to Y (and Y posterior to X) in (F_t)_{t∈T}, if there is an s ∈ T such that σ(X) ⊆ F_s, σ(Y) ⊄ F_s, and there is a t ∈ T, s < t, such that σ(Y) ⊆ F_t.

X is simultaneous to Y: X is called simultaneous to Y in (F_t)_{t∈T}, if there is a t ∈ T such that σ(X) ⊆ F_t, σ(Y) ⊆ F_t, and no s ∈ T, s < t, such that σ(X) ⊆ F_s or σ(Y) ⊆ F_s.

Global t-covariate of X (t_X ≤ t < t_Y): A random variable denoted C_{X,t} satisfying: (a) σ(X, C_{X,t}) = F_t, (b) the product measure P_X ⊗ P_{C_{X,t}} exists, and (c) t_X ≤ t < t_Y, where t_X ∈ T is defined by σ(X) ⊆ F_{t_X} and σ(X) ⊄ F_t if t < t_X. (t_Y is defined in the same way, replacing X by Y.)

t-covariate of X: A random variable Z_t on (Ω, A, P) with σ(Z_t) ⊆ σ(C_{X,t}).

Global covariate of X: A random variable C_X on (Ω, A, P) such that C_X := C_{X,t_X}.

Covariate of X: A random variable Z on (Ω, A, P) such that σ(Z) ⊆ σ(C_X).

(t_1, t_2)-intermediate variable: A random variable M on (Ω, A, P) such that σ(M) ⊄ F_{t_1} and there exists a t ∈ T, t < t_2, such that σ(M) ⊆ F_t.

Causality space with discrete cause: A quadruple ((Ω, A, P), (F_t)_{t∈T}, X, Y) satisfying: (a) (F_t)_{t∈T} is a filtration in A, (b) X is discrete with values in Ω_X = {0, 1, ..., n}, and P(X=x) > 0 for all x ∈ Ω_X, (c) Y is numerical with finite expectation E(Y), and (d) X is prior to Y in (F_t)_{t∈T}.

Causality Space

Throughout the rest of this article we assume that there is a causality space with discrete cause (see Table 3). Such a causality space provides the formal framework in which causal effects can be defined.

Example 1 continued

In section 2.2, the filtration (F_t)_{t∈T} has been specified for Example 1. Using the definitions displayed in Table 3 yields: U is prior to X and to Y, and X is prior to Y.
Furthermore, 1_{Joe} is simultaneous to U, where 1_{Joe} denotes the indicator variable of the event that Joe is sampled. It takes on the value 1 if Joe is sampled, and 0 otherwise. In this example, U and 1_{Joe} are global covariates of X, and U, 1_{Joe}, and 1_{male} are covariates of X, where 1_{male} is the indicator variable of the event that the sampled person is male. (In this specific example with only one male and one female person, 1_{Joe} = 1_{male}.) If we also wanted to consider intermediate variables, Table 1 would have to be extended to include at least one intermediate variable, such

as quality of life three months after treatment. The filtration (F_t)_{t∈T} would have to be extended correspondingly. Hence, now all components of a causality space ((Ω, A, P), (F_t)_{t∈T}, X, Y) defined in Table 3 have been illustrated.

3 Causal Effects

In this section the definitions of the adjusted conditional expectations and causal effects displayed in Table 4 are explained and illustrated.

3.1 (X=x)-Conditional Probability Measure

A fundamental concept used in the definitions in Table 4 is the (X=x)-conditional probability measure P^{X=x}. Assume P(X=x) > 0 for x ∈ Ω_X. Then the (X=x)-conditional probability measure on (Ω, A) is defined by

P^{X=x}(A) := P(A | X=x),  A ∈ A,  (4)

where x ∈ Ω_X = {0, 1, ..., n}. Hence, for each value x of X there is such a probability measure. The last two columns of Table 1 display the values of P^{X=0} and P^{X=1} for all elementary events {ω} in Example 1. Because distributions, expectations, conditional expectations, etc. all refer to a probability measure, each of these measures defines distributions, expectations, conditional expectations, etc. with respect to it. Hence, P_Y^{X=x} will denote the distribution, E^{X=x}(Y) the expectation, and E^{X=x}(Y | Z) the Z-conditional expectation of Y with respect to the measure P^{X=x}. [Chapter 13 of Steyer, Nagel, et al., in press, provides an extensive presentation of E^{X=x}(Y | Z).]

3.2 True-Outcome Variable With Respect to t

As already mentioned, C_{X,t}, a global t-covariate of X, comprises all variables that are prior or simultaneous to t, except for X. Hence, conditioning on C_{X,t}, all potential confounders of t-direct effects are controlled. Now consider the C_{X,t}-conditional expectation of Y with respect to P^{X=x}. For t ∈ T, we define a version of the true-outcome variable τ_{x,t} with respect to t by

τ_{x,t} := E^{X=x}(Y | C_{X,t}),  x ∈ Ω_X.  (5)

Hence, intuitively speaking, considering such a true-outcome variable τ_{x,t}, conditioning is on X and all other variables that are prior or simultaneous to t. What still varies and may affect Y are measurement errors of Y, but also effects of variables that are in between t and t_Y. If t = t_X and U takes the role of C_{X,t}, then τ_{x,t} is analogous to Rubin's potential outcome (see, e. g., Rubin, 2005).

P^{X=x}-Uniqueness and P-Uniqueness

In general, conditional expectations are not uniquely defined. Hence, there is a set 𝓔^{X=x}(Y | C_{X,t}) of such conditional expectations. However, if τ_{x,t}, τ′_{x,t} ∈ 𝓔^{X=x}(Y | C_{X,t}) are two such versions, then they are P^{X=x}-equivalent, i. e.,

τ_{x,t} =_{P^{X=x}} τ′_{x,t},  (6)

Table 4. Adjusted conditional expectations and t-direct-effect functions

Let ((Ω, A, P), (F_t)_{t∈T}, X, Y) be a causality space with discrete cause, let C_{X,t} be a global t-covariate of X, let W be a random variable on (Ω, A, P), and let x, x′ ∈ Ω_X = {0, 1, ..., n} be two values of X.

𝓔^{X=x}(Y | C_{X,t}): The set of all versions of the C_{X,t}-conditional expectation of Y with respect to P^{X=x}.

E^{X=x}(Y | C_{X,t}): A version of the C_{X,t}-conditional expectation of Y with respect to P^{X=x}. A shortcut for E^{X=x}(Y | C_{X,t}) is τ_{x,t}.

δ_{xx′,t}: A version of the atomic t-direct-effect variable of x vs. x′. Assume: (a) there is a τ_{x,t} ∈ 𝓔^{X=x}(Y | C_{X,t}) with finite expectation E(τ_{x,t}) and a τ_{x′,t} ∈ 𝓔^{X=x′}(Y | C_{X,t}) with finite expectation E(τ_{x′,t}); (b) τ_{x,t} and τ_{x′,t} are P-unique. Assumption (a) implies that there is a finite τ_{x,t} ∈ 𝓔^{X=x}(Y | C_{X,t}) and a finite τ_{x′,t} ∈ 𝓔^{X=x′}(Y | C_{X,t}). Choosing two such finite versions, we define δ_{xx′,t} := τ_{x,t} − τ_{x′,t}. Assumption (b) implies that δ_{xx′,t} is P-unique.

E_{C_{X,t}}(Y | X=x): The C_{X,t}-adjusted (X=x)-conditional expectation of Y. If (a) and (b) hold, we define E_{C_{X,t}}(Y | X=x) := E(τ_{x,t}) and say that it exists.

ADE_{xx′,t}: The average t-direct effect of x vs. x′. Assuming that E_{C_{X,t}}(Y | X=x) and E_{C_{X,t}}(Y | X=x′) exist, we define ADE_{xx′,t} := E_{C_{X,t}}(Y | X=x) − E_{C_{X,t}}(Y | X=x′).

E_{C_{X,t}}(Y | X=x; W): A version of the C_{X,t}-adjusted (X=x, W)-conditional expectation of Y. If (a) and (b) hold, we define E_{C_{X,t}}(Y | X=x; W) := E(τ_{x,t} | W) and say that it exists.

CDE_{xx′,t}(W): A version of the W-conditional t-direct-effect function of x vs. x′. If (a) and (b) hold, then E_{C_{X,t}}(Y | X=x; W) and E_{C_{X,t}}(Y | X=x′; W) are P-unique; furthermore, under (a) and (b), there is a finite version E_{C_{X,t}}(Y | X=x; W) and a finite version E_{C_{X,t}}(Y | X=x′; W). Choosing two such finite versions, we define CDE_{xx′,t}(W) := E_{C_{X,t}}(Y | X=x; W) − E_{C_{X,t}}(Y | X=x′; W).

Note: Proofs of the propositions in this table are found in chapters 13 to 15 of Steyer, Nagel, et al. (in press).

which is a shortcut for

P^{X=x}({ω ∈ Ω: τ_{x,t}(ω) = τ′_{x,t}(ω)}) = 1.

Hence, Equation (6) means that τ_{x,t} and τ′_{x,t} take on identical values with (X=x)-conditional probability 1. In this case, E^{X=x}(Y | C_{X,t}) is said to be P^{X=x}-unique. Hence, P^{X=x}-uniqueness of E^{X=x}(Y | C_{X,t}) means that all versions τ_{x,t} ∈ 𝓔^{X=x}(Y | C_{X,t}) are pairwise P^{X=x}-equivalent. Note that P^{X=x}-uniqueness does not imply P-uniqueness of E^{X=x}(Y | C_{X,t}), i. e., it does not imply P-equivalence of τ_{x,t}, τ′_{x,t} ∈ 𝓔^{X=x}(Y | C_{X,t}), which is defined by

τ_{x,t} =_P τ′_{x,t}.  (7)

Again, (7) is a shortcut for

P({ω ∈ Ω: τ_{x,t}(ω) = τ′_{x,t}(ω)}) = 1.

The assumption that τ_{x,t} is P-unique plays a crucial role not only in the definition but also in the identification of causal effects. It implies that all versions τ_{x,t} ∈ 𝓔^{X=x}(Y | C_{X,t}) have identical distributions, and therefore also identical expectations, variances, and covariances with other random variables. P-uniqueness of τ_{x,t} is equivalent to

P(X=x | C_{X,t}) >_P 0,  (8)

which is defined by

P({ω ∈ Ω: P(X=x | C_{X,t})(ω) > 0}) = 1.

In our examples with Joe and Ann, in which U takes the role of C_{X,t_X}, requiring P(X=x | U) >_P 0 means that all persons must have a nonzero treatment probability, unless the person has a zero probability of being sampled. [See chapter 13 of Steyer, Nagel, et al., in press, for other conditions that are equivalent to P-uniqueness.]

Example 1 continued

In Example 1, U is a global t_X-covariate of X. Using the simplified notation C_{X,t} = C_X for the case t = t_X, we can also say that U is a global covariate of X. Table 1 displays the U-conditional expectations E^{X=0}(Y | U) and E^{X=1}(Y | U), which are identical to the total-effect true-outcome variables τ_0 and τ_1. In this example, these true-outcome variables are uniquely defined, and therefore they are also P-unique. They are random variables on (Ω, A, P), just like U, X, Y, and the other regressions such as E(Y | X), E(Y | X, U), and E(X | U).
Note that, by definition, τ_0 = E^{X=0}(Y | U) and τ_1 = E^{X=1}(Y | U) are measurable with respect to U, i. e., σ(τ_0) ⊆ σ(U) and σ(τ_1) ⊆ σ(U). This implies that there are functions g_0, g_1: {Joe, Ann} → ℝ such that τ_0 = g_0(U) and τ_1 = g_1(U) (see Steyer, Nagel, et al., in press). From a substantive point of view, this means that the values of τ_0 and τ_1 represent properties of the person u sampled in the random experiment considered, namely the conditional expectations E^{X=0}(Y | U=u) and E^{X=1}(Y | U=u).
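These true-outcome variables can be recomputed from the elementary probabilities of Table 1: τ_x evaluated at u is E^{X=x}(Y | U=u), obtained by conditioning within the (X=x, U=u) cell. A sketch:

```python
# Elementary probabilities of Example 1 (Table 1); an outcome is
# (person, selects treatment?, success?).
P = {
    ("Joe", "no", "-"): .144, ("Joe", "no", "+"): .336,
    ("Joe", "yes", "-"): .004, ("Joe", "yes", "+"): .016,
    ("Ann", "no", "-"): .096, ("Ann", "no", "+"): .024,
    ("Ann", "yes", "-"): .228, ("Ann", "yes", "+"): .152,
}
U = lambda w: w[0]
X = lambda w: 1 if w[1] == "yes" else 0
Y = lambda w: 1 if w[2] == "+" else 0

def tau(x, u):
    """tau_x evaluated at u: E^{X=x}(Y | U=u), a function g_x of the person."""
    cell = [w for w in P if X(w) == x and U(w) == u]
    return sum(P[w] * Y(w) for w in cell) / sum(P[w] for w in cell)

# tau_0 and tau_1 as functions of U: .7/.8 for Joe and .2/.4 for Ann.
for u in ("Joe", "Ann"):
    print(u, round(tau(0, u), 2), round(tau(1, u), 2))

# Average total effect E(tau_1) - E(tau_0) versus the unadjusted difference
# E(Y | X=1) - E(Y | X=0):
p_u = {u: sum(P[w] for w in P if U(w) == u) for u in ("Joe", "Ann")}
ate = sum(p_u[u] * (tau(1, u) - tau(0, u)) for u in p_u)

def e_y_given_x(x):
    cell = [w for w in P if X(w) == x]
    return sum(P[w] * Y(w) for w in cell) / sum(P[w] for w in cell)

naive = e_y_given_x(1) - e_y_given_x(0)
print(round(ate, 2), round(naive, 2))   # .15 vs -.18
```

In this example the unadjusted difference is even negative, although the conditional treatment effects are positive for both persons, which is exactly the kind of bias discussed in section 2.1.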

Example 2: No Treatment for Joe The second part of Table 2 displays a random experiment in which the causality space is identical to the one described in Example 1 except for the probability measure P. In this second example, τ_1 = E^{X=1}(Y | U) is not P-unique. The reason is that P(X=1 | U=Joe) = 0, whereas P(U=Joe) > 0. In this case, the value of E^{X=1}(Y | U) is not uniquely defined for ω ∈ {U=Joe}. Hence, E^{X=1}(Y | U=Joe) is an arbitrary real number. [In Table 2, the number 99 has arbitrarily been chosen. Although this number is not a conditional probability, it is fully in line with the general definition of a conditional expected value as the value of a factorization of a regression (see Steyer, Nagel, et al., in press, chapter 9).] The fact that E^{X=1}(Y | U=Joe) is arbitrary is not a problem by itself, because P(X=1 | U=Joe) = 0. However, together with P(U=Joe) > 0, it is a problem: It implies that E^{X=1}(Y | U) is not P-unique, which in turn implies, e. g., that different versions τ_1, τ'_1 ∈ E^{X=1}(Y | U) have different expectations, i. e., E(τ_1) ≠ E(τ'_1). In the same example, τ_0 = E^{X=0}(Y | U) is P-unique; it is even uniquely defined. This implies, e. g., that E(τ_0) is a uniquely defined number. Expectations such as E(τ_0) and E(τ_1) play a crucial role in the definition of average direct and total effects. However, P-uniqueness of the true-outcome variables is also required in the definition of atomic total- and direct-effect variables.

3.3 Atomic t-Direct-Effect Variable

Assumption (a) in Table 4 implies that there is a finite version τ_{x,t} ∈ E^{X=x}(Y | C_{X,t}) and a finite version τ_{x',t} ∈ E^{X=x'}(Y | C_{X,t}). Assuming P-uniqueness of τ_{x,t} and τ_{x',t} [see assumption (b) in that table] is a second prerequisite for the difference τ_{x,t} − τ_{x',t} to be meaningful. This assumption is equivalent to

P(X=x | C_{X,t}) >_P 0 and P(X=x' | C_{X,t}) >_P 0. (9)

It implies that τ_{x,t} − τ_{x',t} is P-unique.
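Example 2's non-uniqueness can be made concrete with a small numerical sketch. The marginal probabilities and the value for Ann below are borrowed from Example 1 for illustration only; they are assumptions, not the exact entries of Table 2.

```python
# Illustrative sketch of Example 2's non-uniqueness (assumed numbers:
# P(U=Joe) = P(U=Ann) = .5 and E^{X=1}(Y | U=Ann) = .40).
p_u = {"Joe": 0.5, "Ann": 0.5}

def expectation_tau1(value_for_joe):
    """E(tau_1) for one version of E^{X=1}(Y | U); the value on {U=Joe}
    is arbitrary, because P(X=1 | U=Joe) = 0."""
    tau1 = {"Joe": value_for_joe, "Ann": 0.40}
    return sum(tau1[u] * p_u[u] for u in p_u)

# Two versions that agree P^{X=1}-almost surely but not P-almost surely:
e1 = expectation_tau1(99.0)   # the arbitrary value 99 from Table 2
e2 = expectation_tau1(0.80)   # another admissible choice
print(e1, e2)                 # different expectations: tau_1 is not P-unique
```

Because P(U=Joe) > 0, the arbitrary value enters E(τ_1) with positive weight, so different admissible versions yield different expectations.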
Assuming (a) and (b) in Table 4, we choose finite versions of τ_{x,t} and τ_{x',t} and define a version of the atomic t-direct-effect variable by

δ_{xx',t} := τ_{x,t} − τ_{x',t}. (10)

This definition implies that δ_{xx',t} is P-unique and finite. If Z_t is a t-covariate of X, then, by definition, σ(Z_t) ⊂ σ(C_{X,t}) ⊂ F_t. Therefore,

E^{X=x}(Y | C_{X,t}) =_{P^{X=x}} E^{X=x}(Y | C_{X,t}, Z_t), for all x ∈ Ω_X.

This means that with C_{X,t} we control for all t-covariates of X. In intuitive terms: With C_{X,t}, all potential confounders of t-direct effects are controlled. In other words, an atomic t-direct-effect variable is defined such that it cannot be biased (cf. sections 2.1 and 4.1).

3.4 Atomic Total-Effect Variable

If t = t_X, we omit the index t, using

τ_x := E^{X=x}(Y | C_X), x ∈ Ω_X, (11)

and

δ_{xx'} := τ_x − τ_{x'}. (12)

The random variable τ_x is called a version of the total-effect true-outcome variable pertaining to x, whereas δ_{xx'} is called a version of the atomic total-effect variable of x vs. x'. Hence, an atomic total-effect variable is an atomic t_X-direct-effect variable. In the example presented in Table 1, the atomic total-effect variable δ_10 is identical to the difference E^{X=1}(Y | U) − E^{X=0}(Y | U), taking the value δ_10(ω) = .10 if ω ∈ {U=Joe} and the value δ_10(ω) = .20 if ω ∈ {U=Ann}. It is a random variable on the probability space (Ω, A, P), it is P-unique, and it is measurable with respect to U, i. e., σ(δ_10) ⊂ σ(U). In Example 2, the atomic total-effect variable δ_10 is not defined, because τ_1 = E^{X=1}(Y | U) is not P-unique.

3.5 Adjusted (X=x)-Conditional Expectation

As explained above, the true-outcome variables and the atomic-effect variables are defined such that they cannot be biased, because, with C_{X,t}, all variables that could induce bias are controlled. In general, in applications, neither the true-outcome variables nor the atomic-effect variables can be observed or estimated. However, expectations and conditional expectations of the true-outcome variables and atomic-effect variables can be estimated, provided that appropriate assumptions can be made (see section 4). Note that, although re-aggregated, these expectations and conditional expectations remain free of bias. In general, the expectations and conditional expectations of the atomic-effect variables just coarsen the effects; they do not introduce bias. The concept of a C_{X,t}-adjusted (X=x)-conditional expectation, denoted E^{C_{X,t}}(Y | X=x), is a good starting point. Under assumptions (a) and (b) in Table 4, it exists and is defined as the expectation E(τ_{x,t}) (see Table 4). Assumptions (a) and (b) in Table 4 imply that E(τ_{x,t}) is uniquely defined and finite, which also means that E(τ_{x,t}) does not depend on the choice of the version τ_{x,t} ∈ E^{X=x}(Y | C_{X,t}).
3.6 Average t-Direct Effect

If E^{C_{X,t}}(Y | X=x) and E^{C_{X,t}}(Y | X=x') exist, then the average t-direct effect of x vs. x' is defined by

ADE_{xx',t} := E^{C_{X,t}}(Y | X=x) − E^{C_{X,t}}(Y | X=x'). (13)

Note that

ADE_{xx',t} = E(δ_{xx',t}) = E(τ_{x,t}) − E(τ_{x',t}). (14)

3.7 Adjusted (X=x, W)-Conditional Expectation

So far, two extremes have been considered: the true-outcome variables and their differences, the atomic t-direct effects, on one side, and their expectations, the adjusted (X=x)-conditional expectations and their differences, the average t-direct effects, on the other side. Conditional t-direct effects are somewhere in between these two extremes. The basic idea is to consider a random variable W and the conditional expectations of the atomic t-direct effects given W. Because W can be multivariate, consisting of several univariate random variables W_1, ..., W_m, the degree of aggregation of the atomic t-direct effects depends on the choice of W. Note that W might also be continuous.

Table 5. Adjusted conditional expectations and total effects

Let ((Ω, A, P), (F_t)_{t∈T}, X, Y) be a causality space with discrete cause, let C_X be a global covariate of X, and let W be a random variable on (Ω, A, P).

E^{X=x}(Y | C_X): The set of all versions of the C_X-conditional expectation of Y with respect to P^{X=x}, where x ∈ Ω_X.
τ_x: A version of the total-effect true-outcome variable. τ_x := E^{X=x}(Y | C_X).
δ_{xx'}: A version of the atomic total-effect variable of x vs. x'. δ_{xx'} := δ_{xx',t_X}.
E^{C_X}(Y | X=x): A version of the C_X-adjusted (X=x)-conditional expectation of Y. E^{C_X}(Y | X=x) := E^{C_{X,t_X}}(Y | X=x).
ATE_{xx'}: The average total effect of x vs. x'. ATE_{xx'} := ADE_{xx',t_X}.
E^{C_X}(Y | X=x; W): A version of the C_X-adjusted (X=x, W)-conditional expectation of Y. E^{C_X}(Y | X=x; W) := E^{C_{X,t_X}}(Y | X=x; W).
CTE_{xx'}(W): A version of the W-conditional total-effect function of x vs. x'. CTE_{xx'}(W) := CDE_{xx',t_X}(W).

Again, begin with a version of the C_{X,t}-adjusted (X=x, W)-conditional expectation of Y. Under assumptions (a) and (b) in Table 4, we define

E^{C_{X,t}}(Y | X=x; W) := E(τ_{x,t} | W), (15)

call it a version of the C_{X,t}-adjusted (X=x, W)-conditional expectation of Y, and say that it exists. Assumptions (a) and (b) in Table 4 imply that there is a finite version E(τ_{x,t} | W), and P-uniqueness of τ_{x,t} implies that E(τ_{x,t} | W) =_P E(τ'_{x,t} | W) if τ_{x,t}, τ'_{x,t} ∈ E^{X=x}(Y | C_{X,t}). Hence, there exists a finite version E^{C_{X,t}}(Y | X=x; W), and it is P-unique.

3.8 W-Conditional t-Direct-Effect Function

Assumptions (a) and (b) in Table 4 imply that there is a finite version E^{C_{X,t}}(Y | X=x; W) and a finite version E^{C_{X,t}}(Y | X=x'; W). Choosing two such finite versions, we define

CDE_{xx',t}(W) := E^{C_{X,t}}(Y | X=x; W) − E^{C_{X,t}}(Y | X=x'; W), (16)

call it a version of the W-conditional t-direct-effect function of x vs. x', and say that it exists. Note that CDE_{xx',t}(W) is P-unique and finite, and that

CDE_{xx',t}(W) =_P E(δ_{xx',t} | W) =_P E(τ_{x,t} | W) − E(τ_{x',t} | W). (17)

3.9 Average and Conditional Total Effects

Remember, C_X := C_{X,t_X}, and the atomic total effect has been defined as a special t-direct effect for t = t_X. Correspondingly, all average and conditional total effects are defined as t_X-direct effects. Table 5 summarizes the various total effects.

Example 1 continued In the example displayed in Table 1, the expectations of the true-outcome variables τ_0 = E^{X=0}(Y | U) and τ_1 = E^{X=1}(Y | U) are

E(τ_0) = .70 · P(U=Joe) + .20 · P(U=Ann) = .70 · .5 + .20 · .5 = .45

and

E(τ_1) = .80 · P(U=Joe) + .40 · P(U=Ann) = .80 · .5 + .40 · .5 = .60.

Hence, the expectation of δ_10 = τ_1 − τ_0 is

E(δ_10) = E(τ_1) − E(τ_0) = .60 − .45 = .15.

In this example, the U-conditional total-effect function

CTE_10(U) =_P E(δ_10 | U) (18)

can also be considered. Because δ_10 is measurable with respect to U, it follows that E(δ_10 | U) = δ_10 [see Rule (vii) of Box 9.2 in Steyer, Nagel, et al., in press]. Later on, other examples are presented in which a Z-conditional total-effect function CTE_10(Z) is considered, where Z denotes the random variable sex (see Table 10). In these examples, CTE_10(Z) ≠ CTE_10(U).

3.10 Indirect Effects

Indirect effects are simply differences between total and direct effects. Suppose that assumptions (a) and (b) in Table 4 hold for t (with global covariate C_{X,t}) and for t_X (with global covariate C_X), where t_X < t < t_Y. Then the difference

δ_{xx'} − δ_{xx',t} (19)

is called a version of the atomic t-indirect-effect variable of x vs. x'. Under the same assumptions, we define

AIE_{xx',t} := ATE_{xx'} − ADE_{xx',t} (20)

and call it the average t-indirect effect. Finally, and again under the same assumptions, we define

CIE_{xx',t}(W) := CTE_{xx'}(W) − CDE_{xx',t}(W) (21)

and call it the W-conditional t-indirect-effect function.
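The Example 1 computations above can be replicated in a few lines. The true-outcome values are those stated in the text; P(U=Joe) = P(U=Ann) = .5 is the weighting implied by the averages .45 and .60.

```python
# Table 1 computations redone numerically (values from the text; the
# marginal probabilities .5/.5 are implied by the stated averages).
p_u = {"Joe": 0.5, "Ann": 0.5}
tau0 = {"Joe": 0.70, "Ann": 0.20}   # E^{X=0}(Y | U=u)
tau1 = {"Joe": 0.80, "Ann": 0.40}   # E^{X=1}(Y | U=u)

E_tau0 = sum(tau0[u] * p_u[u] for u in p_u)      # .45
E_tau1 = sum(tau1[u] * p_u[u] for u in p_u)      # .60
ATE_10 = E_tau1 - E_tau0                         # .15

# CTE_10(U) = E(delta_10 | U) = delta_10, since delta_10 is U-measurable:
delta_10 = {u: tau1[u] - tau0[u] for u in p_u}   # Joe: .10, Ann: .20
print(E_tau0, E_tau1, ATE_10, delta_10)
```

Note how the average total effect .15 coarsens the atomic total effects .10 and .20 without introducing bias, exactly as section 3.5 describes.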

Figure 3. A path diagram representing a causal process with a single mediator M (variables Z, X, M, Y, with residuals ε_M and ε_Y).

Example 3: A Simple Path Model Total, direct, and indirect effects are most easily illustrated by a computer simulation, such as the following one:

(a) Sample a value of a normally distributed random variable Z with expectation 100 and standard deviation 10.
(b) Sample a value of a Bernoulli-distributed random variable X with expectation .5. Ensure that X and Z are independent. (This independence would also be created in a randomized experiment.)
(c) Compute a value of M by M = 60 + 20 X + .3 Z + ε_M, where ε_M is normally distributed with expectation 0 and standard deviation 3. Ensure that ε_M and (X, Z) are independent.
(d) Compute a value of Y by Y = 80 + 10 X + .7 Z + .5 M + ε_Y, where ε_Y is normally distributed with expectation 0 and standard deviation 3. Ensure that ε_Y and (X, Z, M) are independent.

Repeating steps (a) to (d) n times would yield a concrete sample of size n and a data matrix of type n × 4. The dependencies between the four random variables are perfectly described by the two regression equations

E(M | X, Z) =_P 60 + 20 X + .3 Z (22)

and

E(Y | X, M, Z) =_P 80 + 10 X + .7 Z + .5 M, (23)

which, except for the intercepts, can also be represented by the path diagram displayed in Figure 3. For didactic purposes, this example is confined to linear parameterizations of the regressions without interactions. However, the general theory of causal effects, outlined in this article, can accommodate much more complex models. Now let us construct the causality space, in particular the probability space (Ω, A, P) and the filtration (F_t)_{t∈T}. The set of possible outcomes is Ω = ℝ × Ω_X × ℝ × ℝ, where Ω_X = {0, 1}; the σ-algebra on Ω is the product σ-algebra A = B ⊗ P(Ω_X) ⊗ B ⊗ B

(see Steyer, Nagel, et al., in press, chapter 1), and the probability measure P on (Ω, A) is specified by the distributional assumptions described in points (a) to (d) above. Now, X, Y, Z, and M are random variables on (Ω, A, P), and a filtration (F_t)_{t∈T} in A can be specified by: F_1 = σ(Z), F_2 = σ(Z, X), F_3 = σ(Z, X, M), and F_4 = A = σ(Z, X, M, Y). Now total, direct, and indirect effects are specified in this example, starting with the atomic total-effect variable δ_10. In this example, Z is a global covariate of X, because σ(Z, X) = F_2 = F_{t_X}. Therefore, τ_0 := E^{X=0}(Y | C_X) = E^{X=0}(Y | Z), τ_1 := E^{X=1}(Y | C_X) = E^{X=1}(Y | Z), and δ_10 := τ_1 − τ_0 = E^{X=1}(Y | Z) − E^{X=0}(Y | Z). Hence, in order to specify δ_10, the conditional expectations E^{X=0}(Y | Z) and E^{X=1}(Y | Z) have to be computed. As a first step, using the rules of computation for regressions (see Steyer, Nagel, et al., in press, Box 9.2), compute

E(Y | X, Z) =_P E[E(Y | X, M, Z) | X, Z]
=_P E(80 + 10 X + .7 Z + .5 M | X, Z) [(23)]
=_P 80 + 10 X + .7 Z + .5 E(M | X, Z)
=_P 80 + 10 X + .7 Z + .5 (60 + 20 X + .3 Z) [(22)]
=_P 110 + 20 X + .85 Z.

Now, independence of X and Z implies that the regressions E^{X=x}(Y | Z) are P-unique. Hence,

E^{X=0}(Y | Z) =_P 110 + .85 Z,
E^{X=1}(Y | Z) =_P 130 + .85 Z

(see section 13.4 of Steyer, Nagel, et al., in press) and

δ_10 = τ_1 − τ_0 =_P E^{X=1}(Y | Z) − E^{X=0}(Y | Z) =_P 20.

Hence, in this example, the atomic total-effect variable δ_10 is constant, and therefore its expectation, the average total effect, is ATE_10 = E(δ_10) = E(20) = 20. The same applies to the Z-conditional total-effect function CTE_10(Z) =_P E(δ_10 | Z) =_P E(20 | Z) =_P 20. Now consider the atomic t_3 = t_M-direct-effect variable

δ_{10,t_M} = τ_{1,t_M} − τ_{0,t_M} = E^{X=1}(Y | C_{X,t_M}) − E^{X=0}(Y | C_{X,t_M}).

In this example, the bivariate random variable (Z, M) is a global t_M-covariate of X, because σ(Z, M, X) = F_{t_M} = F_3. Furthermore, because P(X=x | Z, M) >_P 0, the regressions E^{X=x}(Y | Z, M) are P-unique (see Steyer, Nagel, et al., in press, chapter 13).
This implies

δ_{10,t_M} = τ_{1,t_M} − τ_{0,t_M} =_P E^{X=1}(Y | Z, M) − E^{X=0}(Y | Z, M).

Equation (23) implies

E^{X=0}(Y | Z, M) =_P 80 + .7 Z + .5 M

and

E^{X=1}(Y | Z, M) =_P 90 + .7 Z + .5 M.

Therefore,

δ_{10,t_M} =_P 90 + .7 Z + .5 M − (80 + .7 Z + .5 M) = 10,

which, in this example, is a constant, too. Hence, the average t_M-direct effect is ADE_{10,t_M} = E(δ_{10,t_M}) = E(10) = 10, and the Z-conditional t_M-direct-effect function is CDE_{10,t_M}(Z) =_P E(δ_{10,t_M} | Z) =_P E(10 | Z) =_P 10. Finally, in this example, the atomic t_M-indirect-effect variable is

δ_10 − δ_{10,t_M} =_P 20 − 10 = 10,

again a constant. Hence, AIE_{10,t_M} = E(δ_10 − δ_{10,t_M}) and CIE_{10,t_M}(Z) = E(δ_10 − δ_{10,t_M} | Z) are equal to 10 as well. Obviously, our results are in line with the well-known rules of computing total, direct, and indirect effects in linear path models (see, e. g., Bollen, 1987). However, while those rules are restricted to linear path models and exclude interactions, our theory applies irrespective of how the regressions involved are parameterized. In this example, two observations are worth mentioning. First, independence of X and Z implies that the total effect of X on Y is Z-unbiased. However, even though X and Z are independent, omitting Z yields a seriously biased direct effect (see Mayer et al., 2012, for a detailed presentation). Second, note that, in this particular example, A = σ(Z, X, M, Y), and the joint distribution of these four random variables determines the probability measure P on (Ω, A). In this sense, our example is a closed system: In this particular example, there are no random variables that are not measurable with respect to σ(Z, X, M, Y). Such a closed system is realistic in computer science and in engineering. In many other empirical sciences, the situation is different: there, σ(Z, X, M, Y) ⊂ A, but not σ(Z, X, M, Y) = A.
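The simulation described in steps (a) to (d) can be run directly. A minimal sketch, using ordinary least squares to estimate the regressions (the concrete sample size and seed are arbitrary choices), recovers the three effects up to sampling error:

```python
import numpy as np

# Monte Carlo sketch of Example 3, steps (a)-(d); the regression
# coefficients of X recover ATE = 20, ADE = 10, and AIE = 20 - 10 = 10.
rng = np.random.default_rng(1)
n = 200_000
Z = rng.normal(100, 10, n)                                  # (a)
X = rng.binomial(1, 0.5, n).astype(float)                   # (b), independent of Z
M = 60 + 20 * X + 0.3 * Z + rng.normal(0, 3, n)             # (c)
Y = 80 + 10 * X + 0.7 * Z + 0.5 * M + rng.normal(0, 3, n)   # (d)

def ols(y, *regressors):
    """Least-squares coefficients of y on an intercept plus the regressors."""
    A = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(A, y, rcond=None)[0]

ate_hat = ols(Y, X, Z)[1]      # coefficient of X given Z: ~20 (total effect)
ade_hat = ols(Y, X, Z, M)[1]   # coefficient of X given Z, M: ~10 (t_M-direct)
aie_hat = ate_hat - ade_hat    # ~10 (t_M-indirect)
print(round(ate_hat, 1), round(ade_hat, 1), round(aie_hat, 1))
```

Because δ_10 and δ_{10,t_M} are constants in this linear example, the two X-coefficients estimate the atomic effects themselves, not just their averages.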
In the theory of causal effects, not only random variables such as X, Y, Z, and M are needed, but also a probability space (Ω, A, P) and a filtration (F_t)_{t∈T}, which are constructed such that all pre-treatment variables, and not only Z, are measurable with respect to A. Similarly, when considering direct effects, F_{t_M} has to be constructed in such a way that all variables that are simultaneous or prior to M are measurable with respect to F_{t_M}. Only with reference to them can the relationship between the included variables, such as X, Y, Z, and M, and omitted variables that may create bias be specified. In other words, in serious empirical applications, (Ω, A, P) and (F_t)_{t∈T} have to be constructed such that they represent the real world. Only then is it possible to investigate whether it is sufficient to consider the variables, such as X, Y, Z, and M, that occur in our regression models. It is exactly the relationship between the included and the omitted variables that is at issue in the definition of unbiasedness and other causality conditions.

4 Causality Conditions and Identification of Causal Effects

So far, the concepts of atomic, average, and conditional total, direct, and indirect effects have been defined and illustrated, confining the presentation to experiments

or quasi-experiments. Now causal inference is treated: How can these causal effects be inferred from empirically estimable quantities? How can the various causal effects and effect functions be identified from empirically estimable quantities? The key is to link the causal effects to estimable quantities by an unbiasedness assumption. Although such an unbiasedness assumption is not itself empirically testable, it is implied by a number of causality conditions, some of which are empirically testable.

4.1 Unbiasedness

Unbiasedness of the Conditional Expectations E(Y | X=x) and E(Y | X) Let τ_{x,t} be a version of the true-outcome variable with respect to t and E^{C_{X,t}}(Y | X=x) a version of the C_{X,t}-adjusted (X=x)-conditional expectation of Y (see Table 4). Then the conditional expectation E(Y | X=x) is called C_{X,t}-unbiased if

E(Y | X=x) = E^{C_{X,t}}(Y | X=x). (24)

Because E^{C_{X,t}}(Y | X=x) = E(τ_{x,t}), it follows: If E^{C_{X,t}}(Y | X=x) exists, then the conditional expectation E(Y | X=x) is C_{X,t}-unbiased if and only if

E(Y | X=x) = E(τ_{x,t}). (25)

Finally, because it is presumed that X is discrete with P(X=x) > 0 for all its values, we can define C_{X,t}-unbiasedness of the conditional expectation E(Y | X) by

E(Y | X=x) = E^{C_{X,t}}(Y | X=x), for all x ∈ Ω_X. (26)

Unbiasedness of the Conditional Expectations E^{X=x}(Y | W) and E(Y | X, W) In Table 4 we defined E^{C_{X,t}}(Y | X=x; W) := E(τ_{x,t} | W), a version of the C_{X,t}-adjusted (X=x, W)-conditional expectation of Y. Referring to this term, E^{X=x}(Y | W) is called (C_{X,t}; W)-unbiased if

E^{X=x}(Y | W) =_P E^{C_{X,t}}(Y | X=x; W). (27)

Again, if E^{C_{X,t}}(Y | X=x; W) exists, we can conclude that E^{X=x}(Y | W) is (C_{X,t}; W)-unbiased if and only if

E^{X=x}(Y | W) =_P E(τ_{x,t} | W). (28)

Finally, because we confine ourselves to the case in which X is discrete with P(X=x) > 0 for all its values, (C_{X,t}; W)-unbiasedness of the conditional expectation E(Y | X, W) can be defined by

E^{X=x}(Y | W) =_P E^{C_{X,t}}(Y | X=x; W), for all x ∈ Ω_X. (29)

Usually, unbiasedness cannot be tested empirically, at least not for all values of X, because it involves the true-outcome variables, which cannot be estimated unless overly strong assumptions are introduced. However, there are a number of conditions implying unbiasedness and identifiability of causal effects. Conditions that imply unbiasedness are called causality conditions, and some of them can be tested empirically. We present two kinds of such testable conditions and a third kind that cannot be tested empirically. In the first kind of these conditions, we consider the relationship between X and C_{X,t}, and in the second, the relationship between Y and C_{X,t}. The third, which is analogous to Rosenbaum and Rubin's strong ignorability (see Rosenbaum & Rubin, 1983), is implied by both kinds of causality conditions.
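A toy illustration of when E(Y | X=x) fails to be C_X-unbiased: consider a hypothetical variant of the Joe/Ann example in which treatment assignment depends on U. The true-outcome values are those of Table 1, but the treatment probabilities .8 and .2 are invented here purely for illustration.

```python
# Hypothetical confounded variant of the Joe/Ann example (treatment
# probabilities .8/.2 are invented; true-outcome values are from Table 1).
p_u = {"Joe": 0.5, "Ann": 0.5}
tau = {0: {"Joe": 0.70, "Ann": 0.20},    # E^{X=0}(Y | U=u)
       1: {"Joe": 0.80, "Ann": 0.40}}    # E^{X=1}(Y | U=u)

def prima_facie(p_x1_given_u):
    """E(Y | X=1) - E(Y | X=0) under treatment probabilities P(X=1 | U=u)."""
    means = []
    for x in (0, 1):
        # joint weights P(U=u, X=x), then condition on X=x
        w = {u: p_u[u] * (p_x1_given_u[u] if x == 1 else 1 - p_x1_given_u[u])
             for u in p_u}
        px = sum(w.values())
        means.append(sum(tau[x][u] * w[u] for u in p_u) / px)
    return means[1] - means[0]

print(prima_facie({"Joe": 0.5, "Ann": 0.5}))  # randomized: .15 = ATE, unbiased
print(prima_facie({"Joe": 0.8, "Ann": 0.2}))  # confounded: .42, biased for ATE
```

Under random assignment the prima facie effect E(Y | X=1) − E(Y | X=0) equals E(τ_1) − E(τ_0) = .15, so Equation (25) holds; once assignment depends on U, it does not.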


More information

The primary goal of this thesis was to understand how the spatial dependence of

The primary goal of this thesis was to understand how the spatial dependence of 5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial

More information

Equations, Inequalities & Partial Fractions

Equations, Inequalities & Partial Fractions Contents Equations, Inequalities & Partial Fractions.1 Solving Linear Equations 2.2 Solving Quadratic Equations 1. Solving Polynomial Equations 1.4 Solving Simultaneous Linear Equations 42.5 Solving Inequalities

More information

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION Chin-Diew Lai, Department of Statistics, Massey University, New Zealand John C W Rayner, School of Mathematics and Applied Statistics,

More information

ST 371 (IV): Discrete Random Variables

ST 371 (IV): Discrete Random Variables ST 371 (IV): Discrete Random Variables 1 Random Variables A random variable (rv) is a function that is defined on the sample space of the experiment and that assigns a numerical variable to each possible

More information

Mathematical Induction

Mathematical Induction Mathematical Induction In logic, we often want to prove that every member of an infinite set has some feature. E.g., we would like to show: N 1 : is a number 1 : has the feature Φ ( x)(n 1 x! 1 x) How

More information

Inequality, Mobility and Income Distribution Comparisons

Inequality, Mobility and Income Distribution Comparisons Fiscal Studies (1997) vol. 18, no. 3, pp. 93 30 Inequality, Mobility and Income Distribution Comparisons JOHN CREEDY * Abstract his paper examines the relationship between the cross-sectional and lifetime

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

CURVE FITTING LEAST SQUARES APPROXIMATION

CURVE FITTING LEAST SQUARES APPROXIMATION CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

4.5 Linear Dependence and Linear Independence

4.5 Linear Dependence and Linear Independence 4.5 Linear Dependence and Linear Independence 267 32. {v 1, v 2 }, where v 1, v 2 are collinear vectors in R 3. 33. Prove that if S and S are subsets of a vector space V such that S is a subset of S, then

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Introduction to Fixed Effects Methods

Introduction to Fixed Effects Methods Introduction to Fixed Effects Methods 1 1.1 The Promise of Fixed Effects for Nonexperimental Research... 1 1.2 The Paired-Comparisons t-test as a Fixed Effects Method... 2 1.3 Costs and Benefits of Fixed

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

Maximum likelihood estimation of mean reverting processes

Maximum likelihood estimation of mean reverting processes Maximum likelihood estimation of mean reverting processes José Carlos García Franco Onward, Inc. jcpollo@onwardinc.com Abstract Mean reverting processes are frequently used models in real options. For

More information

Factor analysis. Angela Montanari

Factor analysis. Angela Montanari Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information

For example, estimate the population of the United States as 3 times 10⁸ and the

For example, estimate the population of the United States as 3 times 10⁸ and the CCSS: Mathematics The Number System CCSS: Grade 8 8.NS.A. Know that there are numbers that are not rational, and approximate them by rational numbers. 8.NS.A.1. Understand informally that every number

More information

Chapter 1 Introduction. 1.1 Introduction

Chapter 1 Introduction. 1.1 Introduction Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations

More information

MULTIPLE REGRESSION WITH CATEGORICAL DATA

MULTIPLE REGRESSION WITH CATEGORICAL DATA DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

More information

8 Divisibility and prime numbers

8 Divisibility and prime numbers 8 Divisibility and prime numbers 8.1 Divisibility In this short section we extend the concept of a multiple from the natural numbers to the integers. We also summarize several other terms that express

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

More information

5 Directed acyclic graphs

5 Directed acyclic graphs 5 Directed acyclic graphs (5.1) Introduction In many statistical studies we have prior knowledge about a temporal or causal ordering of the variables. In this chapter we will use directed graphs to incorporate

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

You know from calculus that functions play a fundamental role in mathematics.

You know from calculus that functions play a fundamental role in mathematics. CHPTER 12 Functions You know from calculus that functions play a fundamental role in mathematics. You likely view a function as a kind of formula that describes a relationship between two (or more) quantities.

More information

Qualitative vs Quantitative research & Multilevel methods

Qualitative vs Quantitative research & Multilevel methods Qualitative vs Quantitative research & Multilevel methods How to include context in your research April 2005 Marjolein Deunk Content What is qualitative analysis and how does it differ from quantitative

More information

How To Prove The Dirichlet Unit Theorem

How To Prove The Dirichlet Unit Theorem Chapter 6 The Dirichlet Unit Theorem As usual, we will be working in the ring B of algebraic integers of a number field L. Two factorizations of an element of B are regarded as essentially the same if

More information

Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance

Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance Randomization Based Confidence Intervals For Cross Over and Replicate Designs and for the Analysis of Covariance Winston Richards Schering-Plough Research Institute JSM, Aug, 2002 Abstract Randomization

More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

Analysis of a Production/Inventory System with Multiple Retailers

Analysis of a Production/Inventory System with Multiple Retailers Analysis of a Production/Inventory System with Multiple Retailers Ann M. Noblesse 1, Robert N. Boute 1,2, Marc R. Lambrecht 1, Benny Van Houdt 3 1 Research Center for Operations Management, University

More information

Notes on Determinant

Notes on Determinant ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

More information

In this commentary, I am going to first review my history with mediation. In the second

In this commentary, I am going to first review my history with mediation. In the second Reflections on Mediation David A. Kenny University of Connecticut Organizational Research Methods Volume XX Number X Month XXXX XX-XX Ó XXXX Sage Publications 10.1177/1094428107308978 http://orm.sagepub.com

More information

Empirical Methods in Applied Economics

Empirical Methods in Applied Economics Empirical Methods in Applied Economics Jörn-Ste en Pischke LSE October 2005 1 Observational Studies and Regression 1.1 Conditional Randomization Again When we discussed experiments, we discussed already

More information

1.2 Solving a System of Linear Equations

1.2 Solving a System of Linear Equations 1.. SOLVING A SYSTEM OF LINEAR EQUATIONS 1. Solving a System of Linear Equations 1..1 Simple Systems - Basic De nitions As noticed above, the general form of a linear system of m equations in n variables

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete

More information

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but Test Bias As we have seen, psychological tests can be well-conceived and well-constructed, but none are perfect. The reliability of test scores can be compromised by random measurement error (unsystematic

More information

Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS

Sensitivity Analysis 3.1 AN EXAMPLE FOR ANALYSIS Sensitivity Analysis 3 We have already been introduced to sensitivity analysis in Chapter via the geometry of a simple example. We saw that the values of the decision variables and those of the slack and

More information

The Method of Least Squares

The Method of Least Squares The Method of Least Squares Steven J. Miller Mathematics Department Brown University Providence, RI 0292 Abstract The Method of Least Squares is a procedure to determine the best fit line to data; the

More information

Algebra I Notes Relations and Functions Unit 03a

Algebra I Notes Relations and Functions Unit 03a OBJECTIVES: F.IF.A.1 Understand the concept of a function and use function notation. Understand that a function from one set (called the domain) to another set (called the range) assigns to each element

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

Introduction to Principal Components and FactorAnalysis

Introduction to Principal Components and FactorAnalysis Introduction to Principal Components and FactorAnalysis Multivariate Analysis often starts out with data involving a substantial number of correlated variables. Principal Component Analysis (PCA) is a

More information

Multiple regression - Matrices

Multiple regression - Matrices Multiple regression - Matrices This handout will present various matrices which are substantively interesting and/or provide useful means of summarizing the data for analytical purposes. As we will see,

More information

Principle of Data Reduction

Principle of Data Reduction Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

More information

Gröbner Bases and their Applications

Gröbner Bases and their Applications Gröbner Bases and their Applications Kaitlyn Moran July 30, 2008 1 Introduction We know from the Hilbert Basis Theorem that any ideal in a polynomial ring over a field is finitely generated [3]. However,

More information

Physics Lab Report Guidelines

Physics Lab Report Guidelines Physics Lab Report Guidelines Summary The following is an outline of the requirements for a physics lab report. A. Experimental Description 1. Provide a statement of the physical theory or principle observed

More information

Reflections on Probability vs Nonprobability Sampling

Reflections on Probability vs Nonprobability Sampling Official Statistics in Honour of Daniel Thorburn, pp. 29 35 Reflections on Probability vs Nonprobability Sampling Jan Wretman 1 A few fundamental things are briefly discussed. First: What is called probability

More information

LOGNORMAL MODEL FOR STOCK PRICES

LOGNORMAL MODEL FOR STOCK PRICES LOGNORMAL MODEL FOR STOCK PRICES MICHAEL J. SHARPE MATHEMATICS DEPARTMENT, UCSD 1. INTRODUCTION What follows is a simple but important model that will be the basis for a later study of stock prices as

More information

Introduction to Algebraic Geometry. Bézout s Theorem and Inflection Points

Introduction to Algebraic Geometry. Bézout s Theorem and Inflection Points Introduction to Algebraic Geometry Bézout s Theorem and Inflection Points 1. The resultant. Let K be a field. Then the polynomial ring K[x] is a unique factorisation domain (UFD). Another example of a

More information

Mathematics Georgia Performance Standards

Mathematics Georgia Performance Standards Mathematics Georgia Performance Standards K-12 Mathematics Introduction The Georgia Mathematics Curriculum focuses on actively engaging the students in the development of mathematical understanding by

More information

Time series Forecasting using Holt-Winters Exponential Smoothing

Time series Forecasting using Holt-Winters Exponential Smoothing Time series Forecasting using Holt-Winters Exponential Smoothing Prajakta S. Kalekar(04329008) Kanwal Rekhi School of Information Technology Under the guidance of Prof. Bernard December 6, 2004 Abstract

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document

More information