Chapter 37 The MIXED Procedure. Chapter Table of Contents


OVERVIEW
   Basic Features
   Notation for the Mixed Model
   PROC MIXED Contrasted with Other SAS Procedures
GETTING STARTED
   Clustered Data Example
SYNTAX
   PROC MIXED Statement
   BY Statement
   CLASS Statement
   CONTRAST Statement
   ESTIMATE Statement
   ID Statement
   LSMEANS Statement
   MAKE Statement
   MODEL Statement
   PARMS Statement
   PRIOR Statement
   RANDOM Statement
   REPEATED Statement
   WEIGHT Statement
DETAILS
   Mixed Models Theory
   Parameterization of Mixed Models
   Default Output
   Output Changes in Version 7
   Computational Issues
EXAMPLES
   Example 37.1 Split-Plot Design
   Example 37.2 Repeated Measures
   Example 37.3 Plotting the Likelihood
   Example 37.4 Known G and R
   Example 37.5 Random Coefficients
   Example 37.6 Line-Source Sprinkler Irrigation
REFERENCES

Overview

The MIXED procedure fits a variety of mixed linear models to data and enables you to use these fitted models to make statistical inferences about the data. A mixed linear model is a generalization of the standard linear model used in the GLM procedure, the generalization being that the data are permitted to exhibit correlation and nonconstant variability. The mixed linear model, therefore, provides you with the flexibility of modeling not only the means of your data (as in the standard linear model) but their variances and covariances as well.

The primary assumptions underlying the analyses performed by PROC MIXED are as follows:

- The data are normally distributed (Gaussian).
- The means (expected values) of the data are linear in terms of a certain set of parameters.
- The variances and covariances of the data are in terms of a different set of parameters, and they exhibit a structure matching one of those available in PROC MIXED.

Since Gaussian data can be modeled entirely in terms of their means and variances/covariances, the two sets of parameters in a mixed linear model actually specify the complete probability distribution of the data. The parameters of the mean model are referred to as fixed-effects parameters, and the parameters of the variance-covariance model are referred to as covariance parameters.

The fixed-effects parameters are associated with known explanatory variables, as in the standard linear model. These variables can be either qualitative (as in the traditional analysis of variance) or quantitative (as in standard linear regression). However, the covariance parameters are what distinguishes the mixed linear model from the standard linear model.

The need for covariance parameters arises quite frequently in applications, the following being the two most typical scenarios:

- The experimental units on which the data are measured can be grouped into clusters, and the data from a common cluster are correlated.
- Repeated measurements are taken on the same experimental unit, and these repeated measurements are correlated or exhibit variability that changes.

The first scenario can be generalized to include one set of clusters nested within another. For example, if students are the experimental unit, they can be clustered into classes, which in turn can be clustered into schools. Each level of this hierarchy can introduce an additional source of variability and correlation. The second scenario occurs in longitudinal studies, where repeated measurements are taken over time. Alternatively, the repeated measures could be spatial or multivariate in nature.

PROC MIXED provides a variety of covariance structures to handle the previous two scenarios. The most common of these structures arises from the use of random-effects parameters, which are additional unknown random variables assumed to impact the variability of the data. The variances of the random-effects parameters, commonly known as variance components, become the covariance parameters for this particular structure. Traditional mixed linear models contain both fixed- and random-effects parameters, and, in fact, it is the combination of these two types of effects that led to the name mixed model. PROC MIXED fits not only these traditional variance component models but numerous other covariance structures as well.

PROC MIXED fits the structure you select to the data using the method of restricted maximum likelihood (REML), also known as residual maximum likelihood. It is here that the Gaussian assumption for the data is exploited. Other estimation methods are also available, including maximum likelihood and MIVQUE0. The details behind these estimation methods are discussed in subsequent sections.

Once a model has been fit to your data, you can use it to draw statistical inferences via both the fixed-effects and covariance parameters. PROC MIXED computes several different statistics suitable for generating hypothesis tests and confidence intervals. The validity of these statistics depends upon the mean and variance-covariance model you select, so it is important to choose the model carefully. Some of the output from PROC MIXED helps you assess your model and compare it with others.

Basic Features

PROC MIXED provides easy accessibility to numerous mixed linear models that are useful in many common statistical analyses. In the style of the GLM procedure, PROC MIXED fits the specified mixed linear model and produces appropriate statistics.

Some basic features of PROC MIXED are

- covariance structures, including variance components, compound symmetry, unstructured, AR(1), Toeplitz, spatial, general linear, and factor analytic
- GLM-type grammar, using MODEL, RANDOM, and REPEATED statements for model specification and CONTRAST, ESTIMATE, and LSMEANS statements for inferences
- appropriate standard errors for all specified estimable linear combinations of fixed and random effects, and corresponding t- and F-tests
- subject and group effects that enable blocking and heterogeneity, respectively

- REML and ML estimation methods implemented with a Newton-Raphson algorithm
- capacity to handle unbalanced data
- ability to create a SAS data set corresponding to any table

PROC MIXED uses the Output Delivery System (ODS), a SAS subsystem that provides capabilities for displaying and controlling the output from SAS procedures. ODS enables you to convert any of the output from PROC MIXED into a SAS data set. See the Output Changes in Version 7 section.

Notation for the Mixed Model

This section introduces the mathematical notation used throughout this chapter to describe the mixed linear model. You should be familiar with basic matrix algebra (refer to Searle 1982). A more detailed description of the mixed model is contained in the Mixed Models Theory section.

A statistical model is a mathematical description of how data are generated. The standard linear model, as used by the GLM procedure, is one of the most common statistical models:

   $$y = X\beta + \epsilon$$

In this expression, $y$ represents a vector of observed data, $\beta$ is an unknown vector of fixed-effects parameters with known design matrix $X$, and $\epsilon$ is an unknown random error vector modeling the statistical noise around $X\beta$. The focus of the standard linear model is to model the mean of $y$ by using the fixed-effects parameters $\beta$. The residual errors $\epsilon$ are assumed to be independent and identically distributed Gaussian random variables with mean 0 and variance $\sigma^2$.

The mixed model generalizes the standard linear model as follows:

   $$y = X\beta + Z\gamma + \epsilon$$

Here, $\gamma$ is an unknown vector of random-effects parameters with known design matrix $Z$, and $\epsilon$ is an unknown random error vector whose elements are no longer required to be independent and homogeneous.

To further develop this notion of variance modeling, assume that $\gamma$ and $\epsilon$ are Gaussian random variables that are uncorrelated and have expectations 0 and variances $G$ and $R$, respectively. The variance of $y$ is thus

   $$V = ZGZ' + R$$
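The variance formula can be checked numerically on a small case. The following Python sketch is only an illustration and is not part of PROC MIXED: the layout (four observations in two clusters), the variance component of 2.0, and the residual variance of 1.0 are all assumed values, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def marginal_variance(Z, G, R):
    """Return V = Z G Z' + R."""
    zgzt = matmul(matmul(Z, G), transpose(Z))
    return [[zgzt[i][j] + R[i][j] for j in range(len(R))]
            for i in range(len(R))]


# Assumed toy layout: observations 1-2 share cluster A, 3-4 share cluster B.
Z = [[1, 0], [1, 0], [0, 1], [0, 1]]
G = [[2.0, 0.0], [0.0, 2.0]]      # assumed variance component on the diagonal
sigma2 = 1.0                      # assumed residual variance
R = [[sigma2 if i == j else 0.0 for j in range(4)] for i in range(4)]

V = marginal_variance(Z, G, R)
for row in V:
    print(row)
```

In this toy case, observations in the same cluster share covariance 2.0, observations in different clusters are uncorrelated, and every variance is 3.0. Setting every entry of Z to zero leaves V equal to R.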

Note that, when $R = \sigma^2 I$ and $Z = 0$, the mixed model reduces to the standard linear model.

You can model the variance of the data, $y$, by specifying the structure (or form) of $Z$, $G$, and $R$. The model matrix $Z$ is set up in the same fashion as $X$, the model matrix for the fixed-effects parameters. For $G$ and $R$, you must select some covariance structure. Possible covariance structures include

- variance components
- compound symmetry (common covariance plus diagonal)
- unstructured (general covariance)
- autoregressive
- spatial
- general linear
- factor analytic

By appropriately defining the model matrices $X$ and $Z$, as well as the covariance structure matrices $G$ and $R$, you can perform numerous mixed model analyses.

PROC MIXED Contrasted with Other SAS Procedures

PROC MIXED is a generalization of the GLM procedure in the sense that PROC GLM fits standard linear models, and PROC MIXED fits the wider class of mixed linear models. Both procedures have similar CLASS, MODEL, CONTRAST, ESTIMATE, and LSMEANS statements, but their RANDOM and REPEATED statements differ (see the following paragraphs). Both procedures use the nonfull-rank model parameterization, although the sorting of classification levels can differ between the two. PROC MIXED computes only Type I through Type III tests of fixed effects, while PROC GLM offers Types I through IV.

The RANDOM statement in PROC MIXED incorporates random effects constituting the $\gamma$ vector in the mixed model. However, in PROC GLM, effects specified in the RANDOM statement are still treated as fixed as far as the model fit is concerned, and they serve only to produce corresponding expected mean squares. These expected mean squares lead to the traditional ANOVA estimates of variance components. PROC MIXED computes REML and ML estimates of variance parameters, which are generally preferred to the ANOVA estimates (Searle 1988; Harville 1988; Searle, Casella, and McCulloch 1992).
Optionally, PROC MIXED also computes MIVQUE0 estimates, which are similar to ANOVA estimates.

The REPEATED statement in PROC MIXED is used to specify covariance structures for repeated measurements on subjects, while the REPEATED statement in PROC GLM is used to specify various transformations with which to conduct the traditional univariate or multivariate tests. In repeated measures situations, the mixed model approach used in PROC MIXED is more flexible and more widely applicable than either the univariate or multivariate approaches. In particular, the mixed model approach provides a larger class of covariance structures and a better mechanism for handling missing values.

PROC MIXED subsumes the VARCOMP procedure. PROC MIXED provides a wide variety of covariance structures, while PROC VARCOMP estimates only simple random effects. PROC MIXED carries out several analyses that are absent in PROC VARCOMP, including the estimation and testing of linear combinations of fixed and random effects.

The ARIMA and AUTOREG procedures provide more time series structures than PROC MIXED, although they do not fit variance component models. The CALIS procedure fits general covariance matrices, but it does not allow fixed effects as does PROC MIXED. The LATTICE and NESTED procedures fit special types of mixed linear models that can also be handled in PROC MIXED, although PROC MIXED may run slower because of its more general algorithm. The TSCSREG procedure analyzes time-series cross-sectional data, and it fits some structures not available in PROC MIXED.

Getting Started

Clustered Data Example

Consider the following SAS data set as an introductory example:

   data heights;
      input Family Gender$ Height @@;
      datalines;
   1 F 67   1 F 66   1 F 64   1 M 71   1 M 72   2 F 63
   2 F 63   2 F 67   2 M 69   2 M 68   2 M 70   3 F 63
   3 M 64   4 F 67   4 F 66   4 M 67   4 M 67   4 M 69
   run;

The response variable Height measures the heights (in inches) of 18 individuals. The individuals are classified according to Family and Gender. You can perform a traditional two-way analysis of variance of these data with the following PROC MIXED code:

   proc mixed;
      class Family Gender;
      model Height = Gender Family Family*Gender;
   run;

The PROC MIXED statement invokes the procedure. The CLASS statement instructs PROC MIXED to consider both Family and Gender as classification variables. Dummy (indicator) variables are, as a result, created corresponding to all of the distinct levels of Family and Gender. For these data, Family has four levels and Gender has two levels.

The MODEL statement first specifies the response (dependent) variable Height. The explanatory (independent) variables are then listed after the equal (=) sign. Here, the two explanatory variables are Gender and Family, and they comprise the main effects of the design. The third explanatory term, Family*Gender, models an interaction between the two main effects.

PROC MIXED uses the dummy variables associated with Gender, Family, and Family*Gender to construct the $X$ matrix for the linear model. A column of 1s is also included as the first column of $X$ to model a global intercept. There are no $Z$ or $G$ matrices for this model, and $R$ is assumed to equal $\sigma^2 I$, where $I$ is an $18 \times 18$ identity matrix.

The RUN statement completes the specification. The coding is precisely the same as with the GLM procedure. However, much of the output from PROC MIXED is different from that produced by PROC GLM.

The following is the output from PROC MIXED.

   Model Information

   Data Set                     WORK.HEIGHTS
   Dependent Variable           Height
   Covariance Structure         Diagonal
   Estimation Method            REML
   Residual Variance Method     Profile
   Fixed Effects SE Method      Model-Based
   Degrees of Freedom Method    Residual

   Figure: Model Information

The Model Information table describes the model, some of the variables that it involves, and the method used in fitting it. This table also lists the method (profile, factor, or fit) for handling the residual variance.

   Class Level Information

   Class     Levels    Values
   Family         4    1 2 3 4
   Gender         2    F M

   Figure: Class Level Information

The Class Level Information table lists the levels of all variables specified in the CLASS statement. You can check this table to make sure that the data are correct.

   Dimensions

   Covariance Parameters      1
   Columns in X              15
   Columns in Z               0
   Subjects                   1
   Max Obs Per Subject       18
   Observations Used         18
   Observations Not Used      0
   Total Observations        18

   Figure: Dimensions

The Dimensions table lists the sizes of relevant matrices. This table can be useful in determining CPU time and memory requirements.

   Covariance Parameter Estimates

   Cov Parm    Estimate
   Residual

   Figure: Covariance Parameter Estimates

The Covariance Parameter Estimates table displays the estimate of $\sigma^2$ for the model.

   Fitting Information

   Res Log Likelihood
   Akaike's Information Criterion
   Schwarz's Bayesian Criterion
   -2 Res Log Likelihood             41.6

   Figure: Model Fitting Information

The Fitting Information table lists several pieces of information about the fitted mixed model, including values derived from the computed value of the restricted/residual likelihood.

   Type 3 Tests of Fixed Effects

   Effect           Num DF    Den DF    F Value    Pr > F
   Gender
   Family
   Family*Gender

   Figure: Tests of Fixed Effects

The Type 3 Tests of Fixed Effects table displays significance tests for the three effects listed in the MODEL statement. The Type III F-statistics and p-values are the same as those produced by the GLM procedure. However, because PROC MIXED uses a likelihood-based estimation scheme, it does not directly compute or display sums of squares for this analysis.

The Type 3 test for the Family*Gender effect is not significant at the 5% level, but the tests for both main effects are significant.

The important assumptions behind this analysis are that the data are normally distributed and that they are independent with constant variance. For these data, the normality assumption is probably realistic since the data are observed heights. However, since the data occur in clusters (families), it is very likely that observations from the same family are statistically correlated, that is, not independent.

The methods implemented in PROC MIXED are still based on the assumption of normally distributed data, but you can drop the assumption of independence by modeling statistical correlation in a variety of ways. You can also model variances that are heterogeneous, that is, nonconstant.

For the height data, one of the simplest ways of modeling correlation is through the use of random effects. Here the family effect is assumed to be normally distributed with zero mean and some unknown variance. This is in contrast to the previous model in which the family effects are just constants, or fixed effects. Declaring Family as a random effect sets up a common correlation among all observations having the same level of Family.

Declaring Family*Gender as a random effect models an additional correlation between all observations that have the same level of both Family and Gender. One interpretation of this effect is that a female in a certain family exhibits more correlation with the other females in that family than with the other males, and likewise for a male. With the height data, this model seems reasonable.

The code to fit this correlation model in PROC MIXED is as follows:

   proc mixed;
      class Family Gender;
      model Height = Gender;
      random Family Family*Gender;
   run;

Note that Family and Family*Gender are now listed in the RANDOM statement. The dummy variables associated with them are used to construct the $Z$ matrix in the mixed model. The $X$ matrix now consists of a column of 1s and the dummy variables for Gender.

The $G$ matrix for this model is diagonal, and it contains the variance components for both Family and Family*Gender. The $R$ matrix is still assumed to equal $\sigma^2 I$, where $I$ is an identity matrix.

The output from this analysis is as follows.

   Model Information

   Data Set                     WORK.HEIGHTS
   Dependent Variable           Height
   Covariance Structure         Variance Components
   Estimation Method            REML
   Residual Variance Method     Profile
   Fixed Effects SE Method      Model-Based
   Degrees of Freedom Method    Containment

   Figure: Model Information

The Model Information table shows that the containment method is used to compute the degrees of freedom for this analysis. This is the default method when a RANDOM statement is used; see the description of the DDFM= option on page 1979 for more information.

   Class Level Information

   Class     Levels    Values
   Family         4    1 2 3 4
   Gender         2    F M

   Figure: Class Level Information

The Class Level Information table is the same as before.

   Dimensions

   Covariance Parameters      3
   Columns in X               3
   Columns in Z              12
   Subjects                   1
   Max Obs Per Subject       18
   Observations Used         18
   Observations Not Used      0
   Total Observations        18

   Figure: Dimensions

The Dimensions table displays the new sizes of the $X$ and $Z$ matrices.
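The column counts in the two Dimensions tables follow from the nonfull-rank parameterization, in which every level of a classification effect gets its own indicator column. The Python sketch below is an illustrative recount from the heights data, not the procedure's own logic; it assumes that an interaction contributes one column per Family and Gender combination observed in the data.

```python
# Heights data from the clustered data example: (Family, Gender, Height).
heights = [
    (1, "F", 67), (1, "F", 66), (1, "F", 64), (1, "M", 71), (1, "M", 72),
    (2, "F", 63), (2, "F", 63), (2, "F", 67), (2, "M", 69), (2, "M", 68),
    (2, "M", 70), (3, "F", 63), (3, "M", 64), (4, "F", 67), (4, "F", 66),
    (4, "M", 67), (4, "M", 67), (4, "M", 69),
]

families = {f for f, _, _ in heights}      # distinct levels of Family
genders = {g for _, g, _ in heights}       # distinct levels of Gender
cells = {(f, g) for f, g, _ in heights}    # observed Family*Gender cells

# Fixed-effects model: intercept + Gender + Family + Family*Gender in X.
fixed_model_x = 1 + len(genders) + len(families) + len(cells)

# Random-effects model: intercept + Gender in X; Family and
# Family*Gender move to Z.
random_model_x = 1 + len(genders)
random_model_z = len(families) + len(cells)

print(fixed_model_x, random_model_x, random_model_z)  # prints: 15 3 12
```

The three counts agree with the Columns in X and Columns in Z entries reported for the two analyses.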

   Iteration History

   Iteration    Evaluations    -2 Res Log Like    Criterion

   Convergence criteria met.

   Figure: REML Estimation Iteration History

The Iteration History table displays the results of the numerical optimization of the restricted/residual likelihood. Six iterations are required to achieve the default convergence criterion of 1E-8.

   Covariance Parameter Estimates

   Cov Parm         Estimate
   Family
   Family*Gender
   Residual

   Figure: Covariance Parameter Estimates (REML)

The Covariance Parameter Estimates table displays the results of the REML fit. The Estimate column contains the estimates of the variance components for Family and Family*Gender, as well as the estimate of $\sigma^2$.

   Fitting Information

   Res Log Likelihood
   Akaike's Information Criterion
   Schwarz's Bayesian Criterion
   -2 Res Log Likelihood             71.0

   Figure: Fitting Information

The Fitting Information table contains basic information about the REML fit.

   Type 3 Tests of Fixed Effects

   Effect    Num DF    Den DF    F Value    Pr > F
   Gender

   Figure: Type 3 Tests of Fixed Effects

The Type 3 Tests of Fixed Effects table contains a significance test for the lone fixed effect, Gender. Note that the associated p-value is not nearly as significant as in the previous analysis. This illustrates the importance of correctly modeling correlation in your data.

An additional benefit of the random effects analysis is that it enables you to make inferences about gender that apply to an entire population of families, whereas the inferences about gender from the analysis where Family and Family*Gender are fixed effects apply only to the particular families in the data set.

PROC MIXED thus offers you the ability to model correlation directly and to make inferences about fixed effects that apply to entire populations of random effects.
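The correlation pattern implied by declaring Family and Family*Gender as random effects can be written out directly. The Python sketch below is a hedged illustration: the function name is hypothetical, and the three variance values are assumed numbers chosen for readability, not the estimates from the REML fit.

```python
def implied_covariance(obs_a, obs_b, var_family, var_fam_gender, var_resid):
    """Covariance of two observations under the variance component model.

    Each observation is a tuple (index, family, gender). The Family
    component is shared within a family, the Family*Gender component is
    shared within a family-by-gender cell, and the residual variance
    applies only to an observation paired with itself.
    """
    idx_a, fam_a, gen_a = obs_a
    idx_b, fam_b, gen_b = obs_b
    cov = 0.0
    if fam_a == fam_b:
        cov += var_family
        if gen_a == gen_b:
            cov += var_fam_gender
    if idx_a == idx_b:
        cov += var_resid
    return cov


# Assumed variance components, for illustration only.
components = dict(var_family=2.0, var_fam_gender=1.5, var_resid=2.5)

sister_1 = (0, 1, "F")
sister_2 = (1, 1, "F")
brother = (3, 1, "M")
stranger = (5, 2, "F")

print(implied_covariance(sister_1, sister_1, **components))  # variance
print(implied_covariance(sister_1, sister_2, **components))  # same cell
print(implied_covariance(sister_1, brother, **components))   # same family
print(implied_covariance(sister_1, stranger, **components))  # unrelated
```

With these assumed values, two females in the same family covary more strongly than a female and a male in that family, and members of different families are uncorrelated, which matches the interpretation given for the Family*Gender effect.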

Syntax

The following statements are available in PROC MIXED.

   PROC MIXED < options > ;
      BY variables ;
      CLASS variables ;
      ID variables ;
      MODEL dependent = < fixed-effects > < / options > ;
      RANDOM random-effects < / options > ;
      REPEATED < repeated-effect > < / options > ;
      PARMS (value-list) ... < / options > ;
      PRIOR < distribution > < / options > ;
      CONTRAST 'label' < fixed-effect values ... >
               < | random-effect values ... >, ... < / options > ;
      ESTIMATE 'label' < fixed-effect values ... >
               < | random-effect values ... > < / options > ;
      LSMEANS fixed-effects < / options > ;
      MAKE 'table' OUT=SAS-data-set ;
      WEIGHT variable ;

Items within angle brackets ( < > ) are optional. The CONTRAST, ESTIMATE, LSMEANS, MAKE, and RANDOM statements can appear multiple times; all other statements can appear only once.

The PROC MIXED and MODEL statements are required, and the MODEL statement must appear after the CLASS statement if a CLASS statement is included. The CONTRAST, ESTIMATE, LSMEANS, RANDOM, and REPEATED statements must follow the MODEL statement. The CONTRAST and ESTIMATE statements must also follow any RANDOM statements.

Table 37.1 summarizes the basic functions and important options of each PROC MIXED statement. The syntax of each statement in Table 37.1 is described in the following sections in alphabetical order after the description of the PROC MIXED statement.

Table 37.1 Summary of PROC MIXED Statements

PROC MIXED
   Description: invokes the procedure
   Important options: DATA= specifies input data set, METHOD= specifies estimation method

BY
   Description: performs multiple PROC MIXED analyses in one invocation
   Important options: none

CLASS
   Description: declares qualitative variables that create indicator variables in design matrices
   Important options: none

ID
   Description: lists additional variables to be included in predicted values tables
   Important options: none

MODEL
   Description: specifies dependent variable and fixed effects, setting up X
   Important options: S requests solution for fixed-effects parameters, DDFM= specifies denominator degrees of freedom method, OUTP= outputs predicted values to a data set

RANDOM
   Description: specifies random effects, setting up Z and G
   Important options: SUBJECT= creates block-diagonality, TYPE= specifies covariance structure, S requests solution for random-effects parameters, G displays estimated G

REPEATED
   Description: sets up R
   Important options: SUBJECT= creates block-diagonality, TYPE= specifies covariance structure, R displays estimated blocks of R, GROUP= enables between-subject heterogeneity, LOCAL adds a diagonal matrix to R

PARMS
   Description: specifies a grid of initial values for the covariance parameters
   Important options: HOLD= and NOITER hold the covariance parameters or their ratios constant, PDATA= reads the initial values from a SAS data set

PRIOR
   Description: performs a sampling-based Bayesian analysis for variance component models
   Important options: NSAMPLE= specifies the sample size, SEED= specifies the starting seed

CONTRAST
   Description: constructs custom hypothesis tests
   Important options: E displays the L matrix coefficients

ESTIMATE
   Description: constructs custom scalar estimates
   Important options: CL produces confidence limits

LSMEANS
   Description: computes least squares means for classification fixed effects
   Important options: DIFF computes differences of the least squares means, ADJUST= performs multiple comparisons adjustments, AT changes covariates, OM changes weighting, CL produces confidence limits, SLICE= tests simple effects

MAKE
   Description: converts any displayed table into a SAS data set
   Important options: none. Has been superseded by the Output Delivery System (ODS)

WEIGHT
   Description: specifies a variable by which to weight R
   Important options: none

PROC MIXED Statement

   PROC MIXED < options > ;

The PROC MIXED statement invokes the procedure. You can specify the following options.

ABSOLUTE
   makes the convergence criterion absolute. By default, it is relative (divided by the current objective function value). See the CONVF, CONVG, and CONVH options in this section for a description of various convergence criteria.

ALPHA=number
   requests that confidence limits be constructed for the covariance parameter estimates with confidence level 1 - number. The value of number must be between 0 and 1; the default is 0.05.

ASYCORR
   produces the asymptotic correlation matrix of the covariance parameter estimates. It is computed from the corresponding asymptotic covariance matrix (see the description of the ASYCOV option, which follows). For ODS purposes, the label of the Asymptotic Correlation table is AsyCorr.

ASYCOV
   requests that the asymptotic covariance matrix of the covariance parameters be displayed. By default, this matrix is the observed inverse Fisher information matrix, which equals $2H^{-1}$, where $H$ is the Hessian (second derivative) matrix of the objective function. See the Covariance Parameter Estimates section on page 2025 for more information about this matrix.

   When you use the SCORING= option and PROC MIXED converges without stopping the scoring algorithm, PROC MIXED uses the expected Hessian matrix to compute the covariance matrix instead of the observed Hessian. For ODS purposes, the label of the Asymptotic Covariance table is AsyCov.

CL<=WALD>
   requests confidence limits for the covariance parameter estimates. A Satterthwaite approximation is used to construct limits for all parameters that have a default lower boundary constraint of zero. These limits take the form

   $$\frac{\nu \hat{\sigma}^2}{\chi^2_{\nu,\,1-\alpha/2}} \le \sigma^2 \le \frac{\nu \hat{\sigma}^2}{\chi^2_{\nu,\,\alpha/2}}$$

   where $\nu = 2Z^2$, $Z$ is the Wald statistic $\hat{\sigma}^2 / \mathrm{se}(\hat{\sigma}^2)$, and the denominators are quantiles of the $\chi^2$-distribution with $\nu$ degrees of freedom. Refer to Milliken and Johnson (1992) and Burdick and Graybill (1992) for similar techniques.

   For all other parameters, Wald Z-scores and normal quantiles are used to construct the limits. The optional =WALD specification requests Wald limits for all parameters.
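The two kinds of limits described for the CL option can be compared with a short Python sketch. It is only an approximation of the idea under stated assumptions: the chi-square quantile is obtained here by a series expansion plus bisection (the method PROC MIXED uses internally is not described in this section), the function names are hypothetical, and the estimate and standard error in the example are made-up values.

```python
import math
from statistics import NormalDist


def chi2_cdf(x, nu):
    """Chi-square CDF via the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, h = nu / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total) and n < 100000:
        term *= h / (a + n)
        total += term
        n += 1
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))


def chi2_quantile(p, nu):
    """Invert chi2_cdf by bisection; p is the lower-tail probability."""
    lo, hi = 0.0, max(1.0, nu)
    while chi2_cdf(hi, nu) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def satterthwaite_limits(estimate, std_err, alpha=0.05):
    """Limits nu*est/chi2(nu, 1-alpha/2) and nu*est/chi2(nu, alpha/2)."""
    z = estimate / std_err          # Wald statistic
    nu = 2.0 * z * z                # approximate degrees of freedom
    lower = nu * estimate / chi2_quantile(1.0 - alpha / 2.0, nu)
    upper = nu * estimate / chi2_quantile(alpha / 2.0, nu)
    return lower, upper


def wald_limits(estimate, std_err, alpha=0.05):
    """Symmetric limits from normal quantiles."""
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - q * std_err, estimate + q * std_err


# Made-up variance component estimate and standard error.
print(satterthwaite_limits(2.0, 2.0))
print(wald_limits(2.0, 2.0))
```

For a poorly determined variance component, the Wald interval can extend below zero, whereas the Satterthwaite interval stays positive and is asymmetric about the estimate.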

   The confidence limits are displayed as extra columns in the Covariance Parameter Estimates table. The confidence level is $1 - \alpha = 0.95$ by default; this can be changed with the ALPHA= option.

CONVF<=number>
   requests the relative function convergence criterion with tolerance number. The relative function convergence criterion is

   $$\frac{|f_k - f_{k-1}|}{|f_k|} \le \text{number}$$

   where $f_k$ is the value of the objective function at iteration $k$. To prevent the division by $|f_k|$, use the ABSOLUTE option. The default convergence criterion is CONVH, and the default tolerance is 1E-8.

CONVG<=number>
   requests the relative gradient convergence criterion with tolerance number. The relative gradient convergence criterion is

   $$\frac{\max_j |g_{jk}|}{|f_k|} \le \text{number}$$

   where $f_k$ is the value of the objective function, and $g_{jk}$ is the $j$th element of the gradient (first derivative) of the objective function, both at iteration $k$. To prevent division by $|f_k|$, use the ABSOLUTE option. The default convergence criterion is CONVH, and the default tolerance is 1E-8.

CONVH<=number>
   requests the relative Hessian convergence criterion with tolerance number. The relative Hessian convergence criterion is

   $$\frac{g_k' H_k^{-1} g_k}{|f_k|} \le \text{number}$$

   where $f_k$ is the value of the objective function, $g_k$ is the gradient (first derivative) of the objective function, and $H_k$ is the Hessian (second derivative) of the objective function, all at iteration $k$. If $H_k$ is singular, then PROC MIXED uses the following relative criterion:

   $$\frac{g_k' g_k}{|f_k|} \le \text{number}$$

   To prevent the division by $|f_k|$, use the ABSOLUTE option. The default convergence criterion is CONVH, and the default tolerance is 1E-8.

COVTEST
   produces asymptotic standard errors and Wald Z-tests for the covariance parameter estimates.
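The three convergence criteria defined for the CONVF, CONVG, and CONVH options can be expressed as small functions. This Python sketch is illustrative only: the function names are hypothetical, the Hessian case is restricted to two parameters so that the inverse can be written in closed form, and the numbers in the example are arbitrary rather than taken from any PROC MIXED run.

```python
def convf(f_k, f_prev, absolute=False):
    """Function criterion: |f_k - f_{k-1}|, divided by |f_k| unless absolute."""
    value = abs(f_k - f_prev)
    return value if absolute else value / abs(f_k)


def convg(f_k, grad, absolute=False):
    """Gradient criterion: max_j |g_jk|, divided by |f_k| unless absolute."""
    value = max(abs(g) for g in grad)
    return value if absolute else value / abs(f_k)


def convh(f_k, grad, hess, absolute=False):
    """Hessian criterion g'H^{-1}g for two parameters.

    Falls back to g'g when the 2x2 Hessian is singular, as described
    for the CONVH option.
    """
    (a, b), (c, d) = hess
    g1, g2 = grad
    det = a * d - b * c
    if det == 0:
        value = g1 * g1 + g2 * g2
    else:
        # Solve H x = g with the closed-form 2x2 inverse, then form g'x.
        x1 = (d * g1 - b * g2) / det
        x2 = (-c * g1 + a * g2) / det
        value = g1 * x1 + g2 * x2
    return value if absolute else value / abs(f_k)


# Arbitrary illustration values.
f_k, f_prev = 2.0, 2.5
grad = [2.0, 4.0]
hess = [[2.0, 0.0], [0.0, 4.0]]
tolerance = 1e-8   # default tolerance stated for all three options

print(convf(f_k, f_prev), convg(f_k, grad), convh(f_k, grad, hess))
print(convh(f_k, grad, hess) <= tolerance)   # not yet converged
```

Passing `absolute=True` mirrors the effect described for the ABSOLUTE option, which removes the division by the objective function value.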

DATA=SAS-data-set
   names the SAS data set to be used by PROC MIXED. The default is the most recently created data set.

DFBW
   has the same effect as the DDFM=BW option in the MODEL statement.

EMPIRICAL
   computes the estimated variance-covariance matrix of the fixed-effects parameters by using the asymptotically consistent estimator described in Huber (1967), White (1980), Liang and Zeger (1986), and Diggle, Liang, and Zeger (1994). This estimator is commonly referred to as the sandwich estimator, and it is computed as follows:

   $$(X'\hat{V}^{-1}X)^{-} \left( \sum_{i=1}^{S} X_i'\hat{V}_i^{-1}\hat{\epsilon}_i\hat{\epsilon}_i'\hat{V}_i^{-1}X_i \right) (X'\hat{V}^{-1}X)^{-}$$

   Here, $\hat{\epsilon}_i = y_i - X_i\hat{\beta}$, $S$ is the number of subjects, and matrices with an $i$ subscript are those for the $i$th subject. You must include the SUBJECT= option in either a RANDOM or REPEATED statement for this option to take effect.

   When you specify the EMPIRICAL option, PROC MIXED adjusts all standard errors and test statistics involving the fixed-effects parameters. This changes output in the following tables (listed in Table 37.7 on page 2028): Contrast, CorrB, CovB, Diffs, Estimates, InvCovB, LSMeans, MMEq, MMEqSol, Slices, SolutionF, Tests, Tests1-Tests3. The OUTP= and OUTPM= data sets are also affected.

IC
   displays a table of various information criteria. Four different criteria are computed in four different ways, producing 16 values in all. Table 37.2 displays the four criteria in both larger-is-better and smaller-is-better forms.

   Table 37.2 Information Criteria

   Criteria   Larger-is-better                 Smaller-is-better                Reference
   AIC        $\ell - d$                       $-2\ell + 2d$                    Akaike (1974)
   HQIC       $\ell - d \log\log n$            $-2\ell + 2d \log\log n$         Hannan and Quinn (1979)
   BIC        $\ell - (d/2) \log n$            $-2\ell + d \log n$              Schwarz (1978)
   CAIC       $\ell - d(\log n + 1)/2$         $-2\ell + d(\log n + 1)$         Bozdogan (1987)

   Here $\ell$ denotes the maximum value of the (possibly restricted) log likelihood, $d$ the dimension of the model, and $n$ the number of effective observations.
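The entries of Table 37.2 translate directly into code. The Python sketch below evaluates both forms of each criterion; the function name is hypothetical, and the log likelihood, the parameter counts q and p, and n are assumed values for illustration rather than results from the examples in this chapter.

```python
import math


def information_criteria(loglik, d, n):
    """Return {name: (larger_is_better, smaller_is_better)} per Table 37.2."""
    log_n = math.log(n)
    loglog_n = math.log(log_n)
    return {
        "AIC": (loglik - d, -2 * loglik + 2 * d),
        "HQIC": (loglik - d * loglog_n, -2 * loglik + 2 * d * loglog_n),
        "BIC": (loglik - d / 2 * log_n, -2 * loglik + d * log_n),
        "CAIC": (loglik - d * (log_n + 1) / 2, -2 * loglik + d * (log_n + 1)),
    }


# Assumed inputs: log likelihood, q covariance parameters, rank p of X, n.
loglik, q, p, n = -35.5, 3, 2, 18

by_q = information_criteria(loglik, q, n)            # d = q
by_q_plus_p = information_criteria(loglik, q + p, n)  # d = q + p

for name, (larger, smaller) in by_q.items():
    print(f"{name:5s} larger-is-better={larger:.4f} smaller-is-better={smaller:.4f}")
```

Each smaller-is-better value is -2 times the corresponding larger-is-better value, so the two forms always rank models identically. Two forms for each of two choices of d give the four ways of computing each criterion that the IC option description mentions.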
   In Version 6 of SAS/STAT software, $n$ equals the number of valid observations for maximum likelihood estimation and $n - p$ for restricted maximum likelihood estimation, where $p$ equals the rank of $X$. In Version 7, $n$ equals the number of effective subjects as displayed in the Dimensions table, unless this value equals 1, in which case $n$ reverts to the Version 6 values.

   PROC MIXED evaluates the criteria for both forms using $d$ equal to both $q$ and $q + p$, where $q$ is the effective number of estimated covariance parameters. The value of $d$ has changed in Version 7 in certain instances. In Version 6, when a parameter estimate lies on a boundary constraint, then it is still included in the calculation of $d$, but in Version 7 it is not. The most common example of this behavior is when a variance component is estimated to equal zero. For ODS purposes, the name of the Information Criteria table is InfoCrit.

INFO
   is a default option in Version 7. The creation of the Model Information and Dimensions tables can be suppressed using the NOINFO option. In Version 6, this option displays the Model Information and Dimensions tables.

ITDETAILS
   displays the parameter values at each iteration and enables the writing of notes to the SAS log pertaining to infinite likelihood and singularities during Newton-Raphson iterations.

LOGNOTE
   writes periodic notes to the log describing the current status of computations. It is designed for use with analyses requiring extensive CPU resources.

MAXFUNC=number
   specifies the maximum number of likelihood evaluations in the optimization process. The default is 150.

MAXITER=number
   specifies the maximum number of iterations. The default is 50.

METHOD=REML
METHOD=ML
METHOD=MIVQUE0
METHOD=TYPE1
METHOD=TYPE2
METHOD=TYPE3
   specifies the estimation method for the covariance parameters. The REML specification performs residual (restricted) maximum likelihood, and it is the default method. The ML specification performs maximum likelihood, and the MIVQUE0 specification performs minimum variance quadratic unbiased estimation of the covariance parameters.

   The METHOD=TYPEn specifications apply only to variance component models with no SUBJECT= effects and no REPEATED statement. An analysis of variance table is included in the output, and the expected mean squares are used to estimate the variance components (refer to Chapter 28, The GLM Procedure, for further explanation). The resulting method-of-moment variance component estimates are used in subsequent calculations, including standard errors computed from ESTIMATE and LSMEANS statements. For ODS purposes, the new table names are Type1, Type2, and Type3, respectively.

MMEQ
requests that coefficients of the mixed model equations be displayed. These are

$$
\begin{bmatrix}
X'\hat{R}^{-1}X & X'\hat{R}^{-1}Z \\
Z'\hat{R}^{-1}X & Z'\hat{R}^{-1}Z + \hat{G}^{-1}
\end{bmatrix},
\qquad
\begin{bmatrix}
X'\hat{R}^{-1}y \\
Z'\hat{R}^{-1}y
\end{bmatrix}
$$

assuming that $\hat{G}$ is nonsingular. If $\hat{G}$ is singular, PROC MIXED produces the following coefficients:

$$
\begin{bmatrix}
X'\hat{R}^{-1}X & X'\hat{R}^{-1}Z\hat{G} \\
\hat{G}Z'\hat{R}^{-1}X & \hat{G}Z'\hat{R}^{-1}Z\hat{G} + \hat{G}
\end{bmatrix},
\qquad
\begin{bmatrix}
X'\hat{R}^{-1}y \\
\hat{G}Z'\hat{R}^{-1}y
\end{bmatrix}
$$

See the Estimating $\beta$ and $\gamma$ in the Mixed Model section on page 2015 for further information on these equations.

MMEQSOL
requests that a solution to the mixed model equations be produced, as well as the inverted coefficients matrix. Formulas for these equations are provided in the preceding description of the MMEQ option. When $\hat{G}$ is singular, $\hat{\tau}$ and a generalized inverse of the left-hand-side coefficient matrix are transformed using $\hat{G}$ to produce $\hat{\gamma}$ and $\hat{C}$, respectively, where $\hat{C}$ is a generalized inverse of the left-hand-side coefficient matrix of the original equations.

NAMELEN<=number>
specifies the length to which long effect names are shortened. The default and minimum value is 20.

NOBOUND
has the same effect as the NOBOUND option in the PARMS statement (see page 1985).

NOCLPRINT<=number>
suppresses the display of the Class Level Information table if you do not specify number. If you do specify number, only levels with totals that are less than number are listed in the table.

NOINFO
suppresses the display of the Model Information and Dimensions tables.

NOITPRINT
suppresses the display of the Iteration History table.

NOPROFILE
includes the residual variance as part of the Newton-Raphson iterations. By default, the residual variance is profiled out of the likelihood. This option may be useful in conjunction with the HOLD= or NOITER option in the PARMS statement.

ORD
displays ordinates of the relevant distribution in addition to p-values. The ordinate can be viewed as an approximate odds ratio of hypothesis probabilities.

ORDER=DATA
ORDER=FORMATTED
ORDER=FREQ
ORDER=INTERNAL
specifies the sorting order for the levels of the classification variables (specified in the CLASS statement). This ordering determines which parameters in the model correspond to each level in the data, so the ORDER= option may be useful when you use a CONTRAST or an ESTIMATE statement. The following table shows how PROC MIXED interprets values of the ORDER= option.

Value of ORDER=    Levels Sorted by
DATA               order of appearance in the input data set
FORMATTED          external formatted value
FREQ               descending frequency count; levels with the most observations come first in the order
INTERNAL           internal machine representation

By default, ORDER=FORMATTED. For FORMATTED and INTERNAL, the sort order is machine dependent. For FORMATTED, the option applies to all classification variables, not just the ones for which you have explicitly defined formats.

RATIO
produces the ratio of the covariance parameter estimates to the estimate of the residual variance when the latter exists in the model.

RIDGE=number
specifies the starting value for the minimum ridge value used in the Newton-Raphson algorithm. The default is 0.3125.

SCORING<=number>
requests that Fisher scoring be used in association with the estimation method up to iteration number, which is 0 by default. When you use the SCORING= option and PROC MIXED converges without stopping the scoring algorithm, PROC MIXED uses the expected Hessian matrix to compute approximate standard errors for the covariance parameters instead of the observed Hessian. The output from the ASYCOV and ASYCORR options is similarly adjusted.

SIGITER
is an alias for the NOPROFILE option.

UPDATE
is an alias for the LOGNOTE option.
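As an illustration of how several of the PROC MIXED statement options described above combine, the following sketch fits a random-intercept model by maximum likelihood. The data set clus_demo and its variables (clinic, trt, y) are hypothetical and are simulated here only so that the step is self-contained; they do not come from this chapter.

```sas
/* Hypothetical clustered data: 8 clinics, 2 treatments, 3 replicates each */
data clus_demo;
   call streaminit(1963);
   do clinic = 1 to 8;
      u = rand('normal', 0, 2);            /* clinic-level random shift */
      do trt = 'B', 'A';                   /* 'B' appears first in the data */
         do rep = 1 to 3;
            y = 10 + 1.5*(trt = 'A') + u + rand('normal', 0, 1);
            output;
         end;
      end;
   end;
   drop u rep;
run;

/* METHOD=ML replaces the default REML; MAXITER= raises the iteration limit; */
/* ORDER=DATA orders CLASS levels by first appearance (B before A);          */
/* RATIO adds covariance-parameter ratios; NOCLPRINT hides the class table.  */
proc mixed data=clus_demo method=ml maxiter=100 order=data ratio noclprint;
   class clinic trt;
   model y = trt / solution;
   random clinic;
run;
```

Because ORDER=DATA is in effect, the level ordering of trt follows the input data rather than the formatted values, which matters when coefficients are later written in CONTRAST or ESTIMATE statements.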

BY Statement

BY variables ;

You can specify a BY statement with PROC MIXED to obtain separate analyses on observations in groups defined by the BY variables. When a BY statement appears, the procedure expects the input data set to be sorted in order of the BY variables. The variables are one or more variables in the input data set.

If your input data set is not sorted in ascending order, use one of the following alternatives:

- Sort the data using the SORT procedure with a similar BY statement.
- Specify the BY statement options NOTSORTED or DESCENDING in the BY statement for the MIXED procedure. The NOTSORTED option does not mean that the data are unsorted but rather means that the data are arranged in groups (according to values of the BY variables) and that these groups are not necessarily in alphabetical or increasing numeric order.
- Create an index on the BY variables using the DATASETS procedure (in base SAS software).

Since sorting the data changes the order in which PROC MIXED reads observations, the sorting order for the levels of the CLASS variable may be affected if you have specified ORDER=DATA in the PROC MIXED statement. This, in turn, affects specifications in the CONTRAST statements.

For more information on the BY statement, refer to the discussion in SAS Language Reference: Concepts. For more information on the DATASETS procedure, refer to the discussion in the SAS Procedures Guide.

CLASS Statement

CLASS variables ;

The CLASS statement names the classification variables to be used in the analysis. If the CLASS statement is used, it must appear before the MODEL statement.

Classification variables can be either character or numeric. The procedure uses only the first 16 characters of a character variable. Class levels are determined from the formatted values of the CLASS variables. Thus, you can use formats to group values into levels. Refer to the discussion of the FORMAT procedure in the SAS Procedures Guide and to the discussions of the FORMAT statement and SAS formats in SAS Language Reference: Dictionary. You can adjust the display order of CLASS variable levels with the ORDER= option in the PROC MIXED statement.
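The BY-group sorting requirement and the use of formats to define CLASS levels can be sketched together. The example below assumes the SASHELP.CLASS sample data set that ships with SAS; the format name agegrp and the output data set class_sorted are hypothetical names chosen for illustration.

```sas
/* Hypothetical format that groups numeric ages into two class levels */
proc format;
   value agegrp 11-12 = 'younger'
                13-16 = 'older';
run;

/* Sort by the BY variable so that PROC MIXED can process groups in order */
proc sort data=sashelp.class out=class_sorted;
   by Sex;
run;

/* One analysis per level of Sex; the levels of Age are taken from the */
/* formatted values, so Age contributes two levels rather than six.    */
proc mixed data=class_sorted;
   by Sex;
   class Age;
   format Age agegrp.;
   model Weight = Height Age;
run;
```

No RANDOM or REPEATED statement is used here, so each BY group is fit as an ordinary fixed-effects linear model; the point of the sketch is only the interaction of BY, CLASS, and FORMAT.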

CONTRAST Statement

CONTRAST 'label' < fixed-effect values ... > < | random-effect values ... >, ... < / options > ;

The CONTRAST statement provides a mechanism for obtaining custom hypothesis tests. It is patterned after the CONTRAST statement in PROC GLM, although it has been extended to include random effects. This enables you to select an appropriate inference space (McLean, Sanders, and Stroup 1991).

You can test the hypothesis $L'\phi = 0$, where $L' = (K' \;\; M')$ and $\phi' = (\beta' \;\; \gamma')$, in several inference spaces. The inference space corresponds to the choice of M. When M = 0, your inferences apply to the entire population from which the random effects are sampled; this is known as the broad inference space. When all elements of M are nonzero, your inferences apply only to the observed levels of the random effects. This is known as the narrow inference space, and you can also choose it by specifying all of the random effects as fixed. The GLM procedure uses the narrow inference space. Finally, by zeroing portions of M corresponding to selected main effects and interactions, you can choose intermediate inference spaces. The broad inference space is usually the most appropriate, and it is used when you do not specify any random effects in the CONTRAST statement.

In the CONTRAST statement,

label
identifies the contrast in the table. A label is required for every contrast specified. Labels can be up to 20 characters and must be enclosed in single quotes.

fixed-effect
identifies an effect that appears in the MODEL statement. The keyword INTERCEPT can be used as an effect when an intercept is fitted in the model. You do not need to include all effects that are in the MODEL statement.

random-effect
identifies an effect that appears in the RANDOM statement. The first random effect must follow a vertical bar (|); however, random effects do not have to be specified.

values
are constants that are elements of the L matrix associated with the fixed and random effects.

The rows of L' are specified in order and are separated by commas. The rows of the K' component of L' are specified on the left side of the vertical bars (|). These rows test the fixed effects and are, therefore, checked for estimability. The rows of the M' component of L' are specified on the right side of the vertical bars. They test the random effects, and no estimability checking is necessary.

If PROC MIXED finds the fixed-effects portion of the specified contrast to be nonestimable (see the SINGULAR= option on page 1969), then it displays Non-est for the contrast entries.

The following CONTRAST statement reproduces the F-test for the effect A in the split-plot example (see Example 37.1 on page 2037):

   contrast 'A broad'
            A 1 -1 0   A*B .5 .5 -.5 -.5 0 0 ,
            A 1 0 -1   A*B .5 .5 0 0 -.5 -.5 / df=6;

Note that no random effects are specified in the preceding contrast; thus, the inference space is broad. The resulting F-test has two numerator degrees of freedom because L' has two rows. The denominator degrees of freedom is, by default, the residual degrees of freedom (9), but the DF= option changes the denominator degrees of freedom to 6.

The following CONTRAST statement reproduces the F-test for A when Block and A*Block are considered fixed effects (the narrow inference space):

   contrast 'A narrow'
            A 1 -1 0
            A*B .5 .5 -.5 -.5 0 0 |
            A*Block .25 .25 .25 .25
                    -.25 -.25 -.25 -.25
                    0 0 0 0 ,
            A 1 0 -1
            A*B .5 .5 0 0 -.5 -.5 |
            A*Block .25 .25 .25 .25
                    0 0 0 0
                    -.25 -.25 -.25 -.25 ;

The preceding contrast does not contain coefficients for B and Block because they cancel out in estimated differences between levels of A. Coefficients for B and Block are necessary when estimating the mean of one of the levels of A in the narrow inference space (see Example 37.1 on page 2037).

If the elements of L are not specified for an effect that contains a specified effect, then the elements of the specified effect are automatically filled in over the levels of the higher-order effect. This feature is designed to preserve estimability for cases when there are complex higher-order effects. The coefficients for the higher-order effect are determined by equitably distributing the coefficients of the lower-level effect as in the construction of least squares means. In addition, if the intercept is specified, it is distributed over all classification effects that are not contained by any other specified effect. If an effect is not specified and does not contain any specified effects, then all of its coefficients in L are set to 0. You can override this behavior by specifying coefficients for the higher-order effect.

If too many values are specified for an effect, the extra ones are ignored; if too few are specified, the remaining ones are set to 0. If no random effects are specified, the vertical bar can be omitted; otherwise, it must be present. If a SUBJECT effect is used in the RANDOM statement, then the coefficients specified for the effects in the RANDOM statement are equitably distributed across the levels of the SUBJECT effect. You can use the E option to see exactly what L matrix is used.
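The automatic fill-in rule can be checked with the E option. The following sketch assumes a split-plot layout with 4 blocks, 3 levels of A, and 2 levels of B; the data set sp_demo is simulated and hypothetical, and it is not the data of Example 37.1.

```sas
/* Hypothetical split-plot data: A is the whole-plot factor within Block */
data sp_demo;
   call streaminit(37);
   do Block = 1 to 4;
      block_eff = rand('normal', 0, 2);
      do A = 1 to 3;
         wp_eff = rand('normal', 0, 1);       /* whole-plot (A*Block) error */
         do B = 1 to 2;
            Y = 50 + 3*A + 2*B + block_eff + wp_eff + rand('normal', 0, 1);
            output;
         end;
      end;
   end;
   drop block_eff wp_eff;
run;

proc mixed data=sp_demo;
   class A B Block;
   model Y = A B A*B;
   random Block A*Block;
   /* Only A is given coefficients; the A*B coefficients are filled in  */
   /* automatically. E displays the L matrix that is actually used, and */
   /* CHISQ adds the chi-square test to the F-test.                     */
   contrast 'A1 vs A2' A 1 -1 0 / e chisq;
run;
```

Comparing the displayed L matrix coefficients with a contrast that spells out the A*B values explicitly is one way to confirm how the coefficients were distributed.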

The SUBJECT and GROUP options in the CONTRAST statement are useful for the case when a SUBJECT= or GROUP= variable appears in the RANDOM statement, and you want to contrast different subjects or groups. By default, CONTRAST statement coefficients on random effects are distributed equally across subjects and groups.

PROC MIXED handles missing level combinations of classification variables similarly to the way PROC GLM does. Both procedures delete fixed-effects parameters corresponding to missing levels in order to preserve estimability. However, PROC MIXED does not delete missing level combinations for random-effects parameters because linear combinations of the random-effects parameters are always estimable. These conventions can affect the way you specify your CONTRAST coefficients.

The CONTRAST statement computes the statistic

$$
F = \frac{
\begin{bmatrix} \hat{\beta} \\ \hat{\gamma} \end{bmatrix}'
L \, (L'\hat{C}L)^{-1} L'
\begin{bmatrix} \hat{\beta} \\ \hat{\gamma} \end{bmatrix}
}{\operatorname{rank}(L)}
$$

and approximates its distribution with an F-distribution. In this expression, $\hat{C}$ is an estimate of the generalized inverse of the coefficient matrix in the mixed model equations. See the Inference and Test Statistics section on page 2017 for more information on this F-statistic.

The numerator degrees of freedom in the F-approximation is rank(L), and the denominator degrees of freedom is taken from the Tests of Fixed Effects table and corresponds to the final effect you list in the CONTRAST statement. You can change the denominator degrees of freedom by using the DF= option.

You can specify the following options in the CONTRAST statement after a slash (/).

CHISQ
requests that the $\chi^2$-test be performed in addition to the F-test.

DF=number
specifies the denominator degrees of freedom for the F-test. The default is the denominator degrees of freedom taken from the Tests of Fixed Effects table and corresponds to the final effect you list in the CONTRAST statement.

E
requests that the L matrix coefficients for the contrast be displayed. For ODS purposes, the label of this L Matrix Coefficients table is Coefficients.

GROUP coeffs
GRP coeffs
sets up random-effect contrasts between different groups when a GROUP= variable appears in the RANDOM statement. By default, CONTRAST statement coefficients on random effects are distributed equally across groups.

SINGULAR=number
tunes the estimability checking. If v is a vector, define ABS(v) to be the absolute value of the element of v with the largest absolute value. If $\mathrm{ABS}(K' - K'T)$ is greater than C*number for any row of K' in the contrast, then K is declared nonestimable. Here T is the Hermite form matrix $(X'X)^{-}X'X$, and C is ABS(K') except when it equals 0, and then C is 1. The value for number must be between 0 and 1; the default is 1E−4.

SUBJECT coeffs
SUB coeffs
sets up random-effect contrasts between different subjects when a SUBJECT= variable appears in the RANDOM statement. By default, CONTRAST statement coefficients on random effects are distributed equally across subjects.

ESTIMATE Statement

ESTIMATE 'label' < fixed-effect values ... > < | random-effect values ... >, ... < / options > ;

The ESTIMATE statement is exactly like a CONTRAST statement, except that only one-row L matrices are permitted. The actual estimate, $L'\hat{p}$, is displayed along with its approximate standard error. An approximate t-test that $L'p = 0$ is also produced.

PROC MIXED selects the degrees of freedom to match those displayed in the Tests of Fixed Effects table for the final effect you list in the ESTIMATE statement. You can modify the degrees of freedom using the DF= option.

If PROC MIXED finds the fixed-effects portion of the specified estimate to be nonestimable, then it displays Non-est for the estimate entries.

The following examples of ESTIMATE statements compute the mean of the first level of A in the split-plot example (see Example 37.1 on page 2037) for various inference spaces:

   estimate 'A1 mean narrow'
            intercept 1 A 1 B .5 .5 A*B .5 .5 |
            block .25 .25 .25 .25
            A*Block .25 .25 .25 .25
                    0 0 0 0
                    0 0 0 0 ;
   estimate 'A1 mean intermed'
            intercept 1 A 1 B .5 .5 A*B .5 .5 |
            Block .25 .25 .25 .25 ;
   estimate 'A1 mean broad'
            intercept 1 A 1 B .5 .5 A*B .5 .5 ;

The construction of the L vector for an ESTIMATE statement follows the same rules as listed under the CONTRAST statement.

You can specify the following options in the ESTIMATE statement after a slash (/).

ALPHA=number
requests that a t-type confidence interval be constructed with confidence level 1 − number. The value of number must be between 0 and 1; the default is 0.05.

CL
requests that t-type confidence limits be constructed. The confidence level is 0.95 by default; this can be changed with the ALPHA= option.

DF=number
specifies the degrees of freedom for the t-test and confidence limits. The default is the denominator degrees of freedom taken from the Tests of Fixed Effects table and corresponds to the final effect you list in the ESTIMATE statement.

DIVISOR=number
specifies a value by which to divide all coefficients so that fractional coefficients can be entered as integer numerators.

E
requests that the L matrix coefficients be displayed. For ODS purposes, the label of this L Matrix Coefficients table is Coefficients.

GROUP coeffs
GRP coeffs
sets up random-effect contrasts between different groups when a GROUP= variable appears in the RANDOM statement. By default, ESTIMATE statement coefficients on random effects are distributed equally across groups.

LOWER
LOWERTAILED
requests that the p-value for the t-test be based only on values less than the t-statistic. A two-tailed test is the default. A lower-tailed confidence limit is also produced if you specify the CL option.

SINGULAR=number
tunes the estimability checking as documented for the CONTRAST statement.

SUBJECT coeffs
SUB coeffs
sets up random-effect contrasts between different subjects when a SUBJECT= variable appears in the RANDOM statement. By default, ESTIMATE statement coefficients on random effects are distributed equally across subjects. For example, the ESTIMATE statement in the following code from Example 37.5 constructs the difference between the random slopes of the first two batches.

   proc mixed data=rc;
      class batch;
      model y = month / s;
      random int month / type=un sub=batch s;
      estimate 'slope b1 - slope b2' | month 1 / subject 1 -1;
   run;
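The remaining ESTIMATE options can be combined in the same kind of random coefficients model. The sketch below is illustrative only: the data set rc_demo is simulated and hypothetical (it is not the rc data of Example 37.5), and the labels are arbitrary.

```sas
/* Hypothetical stability data: 6 batches measured at months 0, 3, ..., 12 */
data rc_demo;
   call streaminit(375);
   do batch = 1 to 6;
      b0 = rand('normal', 0, 1);            /* random intercept deviation */
      b1 = rand('normal', 0, 0.2);          /* random slope deviation     */
      do month = 0 to 12 by 3;
         y = 100 + b0 + (-0.5 + b1)*month + rand('normal', 0, 0.5);
         output;
      end;
   end;
   drop b0 b1;
run;

proc mixed data=rc_demo;
   class batch;
   model y = month / s;
   random int month / type=un sub=batch s;
   /* Fixed (population) slope with 90% confidence limits */
   estimate 'mean slope' month 1 / cl alpha=0.10;
   /* Slope for the first batch: fixed slope plus that batch's random slope */
   estimate 'slope b1' month 1 | month 1 / subject 1 cl;
   /* DIVISOR= divides every coefficient, so this estimates slope/2 */
   estimate 'half-month change' month 1 / divisor=2;
   /* One-sided test that the fixed slope is negative */
   estimate 'slope lower-tailed' month 1 / lower;
run;
```

With simulated data of this size the unstructured G matrix may be estimated on a boundary, so the printed estimates should be read as a demonstration of syntax rather than as meaningful results.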


Introducing the Multilevel Model for Change Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.

More information

Logit Models for Binary Data

Logit Models for Binary Data Chapter 3 Logit Models for Binary Data We now turn our attention to regression models for dichotomous data, including logistic regression and probit analysis. These models are appropriate when the response

More information

Notes on Applied Linear Regression

Notes on Applied Linear Regression Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

More information

ADVANCED FORECASTING MODELS USING SAS SOFTWARE

ADVANCED FORECASTING MODELS USING SAS SOFTWARE ADVANCED FORECASTING MODELS USING SAS SOFTWARE Girish Kumar Jha IARI, Pusa, New Delhi 110 012 gjha_eco@iari.res.in 1. Transfer Function Model Univariate ARIMA models are useful for analysis and forecasting

More information

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2 University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

More information

Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

More information

Introduction to Longitudinal Data Analysis

Introduction to Longitudinal Data Analysis Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction

More information

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc.

Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Paper 264-26 Model Fitting in PROC GENMOD Jean G. Orelien, Analytical Sciences, Inc. Abstract: There are several procedures in the SAS System for statistical modeling. Most statisticians who use the SAS

More information

data visualization and regression

data visualization and regression data visualization and regression Sepal.Length 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 I. setosa I. versicolor I. virginica I. setosa I. versicolor I. virginica Species Species

More information

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection Chapter 311 Introduction Often, theory and experience give only general direction as to which of a pool of candidate variables (including transformed variables) should be included in the regression model.

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

More information

Linear Models for Continuous Data

Linear Models for Continuous Data Chapter 2 Linear Models for Continuous Data The starting point in our exploration of statistical models in social research will be the classical linear model. Stops along the way include multiple linear

More information

A Brief Introduction to SPSS Factor Analysis

A Brief Introduction to SPSS Factor Analysis A Brief Introduction to SPSS Factor Analysis SPSS has a procedure that conducts exploratory factor analysis. Before launching into a step by step example of how to use this procedure, it is recommended

More information

Analysis of Variance. MINITAB User s Guide 2 3-1

Analysis of Variance. MINITAB User s Guide 2 3-1 3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

More information

Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

More information

An Introduction to Modeling Longitudinal Data

An Introduction to Modeling Longitudinal Data An Introduction to Modeling Longitudinal Data Session I: Basic Concepts and Looking at Data Robert Weiss Department of Biostatistics UCLA School of Public Health robweiss@ucla.edu August 2010 Robert Weiss

More information

HLM software has been one of the leading statistical packages for hierarchical

HLM software has been one of the leading statistical packages for hierarchical Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Predict the Popularity of YouTube Videos Using Early View Data

Predict the Popularity of YouTube Videos Using Early View Data 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

IBM SPSS Missing Values 22

IBM SPSS Missing Values 22 IBM SPSS Missing Values 22 Note Before using this information and the product it supports, read the information in Notices on page 23. Product Information This edition applies to version 22, release 0,

More information

Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

More information

Electronic Thesis and Dissertations UCLA

Electronic Thesis and Dissertations UCLA Electronic Thesis and Dissertations UCLA Peer Reviewed Title: A Multilevel Longitudinal Analysis of Teaching Effectiveness Across Five Years Author: Wang, Kairong Acceptance Date: 2013 Series: UCLA Electronic

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions. Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

More information

Overview of Methods for Analyzing Cluster-Correlated Data. Garrett M. Fitzmaurice

Overview of Methods for Analyzing Cluster-Correlated Data. Garrett M. Fitzmaurice Overview of Methods for Analyzing Cluster-Correlated Data Garrett M. Fitzmaurice Laboratory for Psychiatric Biostatistics, McLean Hospital Department of Biostatistics, Harvard School of Public Health Outline

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

7 Time series analysis

7 Time series analysis 7 Time series analysis In Chapters 16, 17, 33 36 in Zuur, Ieno and Smith (2007), various time series techniques are discussed. Applying these methods in Brodgar is straightforward, and most choices are

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

More information

Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values

More information

Multivariate Logistic Regression

Multivariate Logistic Regression 1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

More information

Factor Analysis. Chapter 420. Introduction

Factor Analysis. Chapter 420. Introduction Chapter 420 Introduction (FA) is an exploratory technique applied to a set of observed variables that seeks to find underlying factors (subsets of variables) from which the observed variables were generated.

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node

ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node Enterprise Miner - Regression 1 ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node 1. Some background: Linear attempts to predict the value of a continuous

More information

SAS Certificate Applied Statistics and SAS Programming

SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and Advanced SAS Programming Brigham Young University Department of Statistics offers an Applied Statistics and

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Regression III: Advanced Methods

Regression III: Advanced Methods Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

More information

Getting Correct Results from PROC REG

Getting Correct Results from PROC REG Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

Part II. Multiple Linear Regression

Part II. Multiple Linear Regression Part II Multiple Linear Regression 86 Chapter 7 Multiple Regression A multiple linear regression model is a linear model that describes how a y-variable relates to two or more xvariables (or transformations

More information

Simple Tricks for Using SPSS for Windows

Simple Tricks for Using SPSS for Windows Simple Tricks for Using SPSS for Windows Chapter 14. Follow-up Tests for the Two-Way Factorial ANOVA The Interaction is Not Significant If you have performed a two-way ANOVA using the General Linear Model,

More information

Chapter 1 Introduction. 1.1 Introduction

Chapter 1 Introduction. 1.1 Introduction Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations

More information

Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group

Introduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group Introduction to Multilevel Modeling Using HLM 6 By ATS Statistical Consulting Group Multilevel data structure Students nested within schools Children nested within families Respondents nested within interviewers

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

More information

GLM, insurance pricing & big data: paying attention to convergence issues.

GLM, insurance pricing & big data: paying attention to convergence issues. GLM, insurance pricing & big data: paying attention to convergence issues. Michaël NOACK - michael.noack@addactis.com Senior consultant & Manager of ADDACTIS Pricing Copyright 2014 ADDACTIS Worldwide.

More information

Time Series Analysis

Time Series Analysis Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:

More information

LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

More information

Directions for using SPSS

Directions for using SPSS Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

More information

Multiple regression - Matrices

Multiple regression - Matrices Multiple regression - Matrices This handout will present various matrices which are substantively interesting and/or provide useful means of summarizing the data for analytical purposes. As we will see,

More information