Class Notes: Week 3
Ronald Heck, Class Notes: Week 3

This week we will look a bit more into relationships between two variables using crosstabulation tables. Let's go back to the analysis of home language background and third grade reading proficiency in a subset of data from Week 2. You may try completing some of the analyses.

English * proficient Crosstabulation

                                      proficient
                                      0         1         Total
English 0       Count                 35        5         40
                Expected Count        25.6      14.4      40.0
English 1       Count                 93        67        160
                Expected Count        102.4     57.6      160.0
Total           Count                 128       72        200
                Expected Count        128.0     72.0      200.0

We can examine a number of relationships in a contingency table that we will make use of further in developing predictive models with categorical outcomes. The proportion of students who are proficient can be determined from the number proficient divided by the sample total (72/200 = 0.36).

First, we can calculate the odds of an event occurring. The odds of an event occurring is defined as follows:

Odds = π / (1 - π)

where the Greek letter pi (π) is used as the probability of the event of interest occurring in the population. In this case, the odds of being proficient will be the following:

0.36 / (1 - 0.36) = 0.36 / 0.64 = 0.5625

We can use the odds to obtain the probability of being proficient as follows:

π = odds / (1 + odds) = 0.5625 / 1.5625 = 0.36

This returns us to the proportion that is proficient in the sample (72/200 = 0.36).

We can also obtain the odds ratio and risk estimates in the Statistics dialog box. The odds ratio is defined as the ratio of two odds. More specifically, we can compare the odds of being proficient, broken down by whether students are English speaking (coded 1) [67/93 = 0.720] or non-English speaking (coded 0) [5/35 = 0.143]. This can be understood as the odds ratio of being proficient for English speaking versus non-English speaking students:

Odds ratio = (67/93) / (5/35) = 0.720 / 0.143 ≈ 5.04

This matches the coefficient for the odds ratio in the following table (the SPSS output also reports the lower and upper bounds of a 95% confidence interval for each estimate).

Risk Estimate

                                      Value
Odds Ratio for English (0 / 1)        5.043
For cohort Prof = 0                   1.505
For cohort Prof = 1                   0.299
N of Valid Cases                      200

We can also obtain the relative risk (or risk ratio) for each of the groups. The relative risk compares the probability of the event occurring (rather than the odds of it occurring) between the two groups. The probability of being non-proficient is 35/40 = 0.875 for the first group (non-English speaking) and 93/160 = 0.581 for the second group (English speaking). The relative risk for the non-proficient outcome (0) will then be 0.875/0.581 ≈ 1.505. We can also calculate the probability of being proficient (1) for the first group (non-English speaking) as 5/40 = 0.125, and for the second group (English speaking) as 67/160 = 0.419. The relative risk coefficient is then 0.125/0.419 ≈ 0.299. The ratio of the relative risk of being not proficient to the relative risk of being proficient returns the odds ratio (1.505/.299), approximately 5.04.

Specifying Models in Regression and GENLIN

Next, we will begin specifying models using Regression and GENLIN in SPSS. We will look more specifically at some of the assumptions underlying various categorical models and how to find and use these programs in SPSS to examine categorical outcomes.
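The odds, odds ratio, and relative risk calculations above can be checked with a short Python sketch. It uses only the cell counts quoted in the text; the dictionary keys and function names are illustrative and are not SPSS names.

```python
# Cell counts from the English * proficient crosstabulation (N = 200).
# The keys are illustrative labels, not SPSS variable names.
counts = {
    "non_english": {"not_proficient": 35, "proficient": 5},
    "english": {"not_proficient": 93, "proficient": 67},
}


def odds_from_probability(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1 - p)


def probability_from_odds(odds):
    """Convert odds back to a probability: odds / (1 + odds)."""
    return odds / (1 + odds)


def risk(group, outcome):
    """Proportion of a group that falls in the given outcome category."""
    row = counts[group]
    return row[outcome] / sum(row.values())


n_total = sum(sum(row.values()) for row in counts.values())
n_proficient = sum(row["proficient"] for row in counts.values())

# Overall proportion proficient and the corresponding odds.
p_proficient = n_proficient / n_total
overall_odds = odds_from_probability(p_proficient)

# Odds of proficiency within each language group, and their ratio.
odds_english = counts["english"]["proficient"] / counts["english"]["not_proficient"]
odds_non_english = (
    counts["non_english"]["proficient"] / counts["non_english"]["not_proficient"]
)
odds_ratio = odds_english / odds_non_english

# Relative risks compare probabilities (not odds) between the two groups.
rr_not_proficient = risk("non_english", "not_proficient") / risk("english", "not_proficient")
rr_proficient = risk("non_english", "proficient") / risk("english", "proficient")

print(f"proportion proficient = {p_proficient:.2f}, odds = {overall_odds:.4f}")
print(f"odds ratio (English vs non-English) = {odds_ratio:.3f}")
print(f"relative risk, not proficient = {rr_not_proficient:.3f}")
print(f"relative risk, proficient = {rr_proficient:.3f}")
print(f"ratio of relative risks = {rr_not_proficient / rr_proficient:.3f}")
```

In a 2 x 2 table the ratio of the two relative risks reproduces the odds ratio exactly; the small difference seen when dividing 1.505 by .299 by hand comes only from rounding.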
We start with the notion that statistical modeling depends on a family of probability distributions for outcome variables (Agresti, 2007). The term random variable describes the possible values that an outcome may have. A generalized linear model has the following (McCullagh & Nelder, 1989):

1) A probability distribution with an underlying random component or mathematical function that links a particular observed outcome obtained in a sample to the probability of its occurrence in a specific population, E(Y) = μ;
2) A link function, g(·), which transforms the expected value of the outcome so that a linear model can be used to examine the relationship between the predictors and the transformed outcome (η); and
3) A structural model which defines the combination of covariates (continuous) and factors (categorical) that predict values of the transformed outcome.

Let's suppose we wish to investigate a relationship between gender and probability of being proficient. There are 20 students in the class. When we arrange the data in a cross-tabulation table, we find the following:

proficient * female Cross-tabulation (Count)

                    female
                    0 (male)    1 (female)    Total
proficient 0        5           1             6
proficient 1        4           10            14
Total               9           11            20

We can obtain the chi-square coefficient and other supporting tests. They suggest there is a relationship between gender and probability of being proficient.

Chi-Square Tests (SPSS output reporting Value, df, Asymp. Sig. (2-sided), Exact Sig. (2-sided), and Exact Sig. (1-sided) for the Pearson Chi-Square, Continuity Correction, Likelihood Ratio, Fisher's Exact Test, and Linear-by-Linear Association; N of Valid Cases = 20)
a. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 2.70.

Let's obtain the odds ratio of being proficient for females versus males from the table. For females, the odds of being proficient are 10/1 (10). For males, the odds of being proficient are 4/5 (0.80). The odds ratio is then 10/0.80 or 12.5 (i.e., 12.5/1).
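As a rough check on the 20-student table, the sketch below recomputes the expected counts, the (uncorrected) Pearson chi-square, and the odds ratio from the four cell counts given in the text. It is a plain-Python illustration, not SPSS output; the names are hypothetical, and the p-value uses the fact that for 1 degree of freedom the upper-tail chi-square probability equals erfc(sqrt(x/2)).

```python
import math

# Observed counts for the 20 students described in the text.
observed = {
    ("male", "not_proficient"): 5,
    ("male", "proficient"): 4,
    ("female", "not_proficient"): 1,
    ("female", "proficient"): 10,
}
groups = ["male", "female"]
outcomes = ["not_proficient", "proficient"]

n = sum(observed.values())
row_totals = {g: sum(observed[(g, o)] for o in outcomes) for g in groups}
col_totals = {o: sum(observed[(g, o)] for g in groups) for o in outcomes}

# Expected count under independence: (row total * column total) / N.
expected = {
    (g, o): row_totals[g] * col_totals[o] / n for g in groups for o in outcomes
}
min_expected = min(expected.values())
cells_below_5 = sum(1 for e in expected.values() if e < 5)

# Pearson chi-square (no continuity correction), df = 1 for a 2 x 2 table.
chi_square = sum(
    (observed[cell] - expected[cell]) ** 2 / expected[cell] for cell in observed
)
p_value = math.erfc(math.sqrt(chi_square / 2))  # valid for 1 df only

# Odds of proficiency by gender and the female-versus-male odds ratio.
odds_female = observed[("female", "proficient")] / observed[("female", "not_proficient")]
odds_male = observed[("male", "proficient")] / observed[("male", "not_proficient")]
odds_ratio = odds_female / odds_male

print(f"minimum expected count = {min_expected:.2f} ({cells_below_5} cells below 5)")
print(f"Pearson chi-square = {chi_square:.3f}, p = {p_value:.3f}")
print(f"odds: female = {odds_female:.2f}, male = {odds_male:.2f}, ratio = {odds_ratio:.1f}")
```

With two expected counts below 5, the exact test reported by SPSS is usually the safer reference for a table this small; the Pearson statistic is shown here only to make the arithmetic visible.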
We can use ANALYZE: REGRESSION (Binary Logistic) to obtain results in a different format. Note that if we specify female as categorical, we will probably want to specify the first category as the reference group (males = 0). We can formulate a model as follows (note there is no separate error term, as in a typical regression model):

η = log[π/(1 - π)] = β0 + β1(female)

Variables in the Equation (the SPSS output also reports S.E., Wald, df, and Sig. for each term)

                         B         Exp(B)
Step 1 (a)  female(1)    2.526     12.500
            Constant     -0.223    0.800
a. Variable(s) entered on step 1: female.

For males, we can take the natural log of the odds, ln(4/5) = -0.223. This is the log odds (or logit) of being proficient if one is a male, which is the same as the intercept (or constant) log odds in the table, that is, the log odds when the other variables in the model are 0. [Note: Generally, ln or log is used to refer to the natural log, whose base e is approximately 2.718.]

For females, we can take the natural log of the odds, ln(10/1) = 2.303. This can also be obtained from the equation in the table above:

2.303 = -0.223 + 2.526(female)

We would say the predicted logit favoring proficiency increases by 2.526 units (from -0.223 to 2.303) if one is female as opposed to male. Note: log odds coefficients are added.

If we want the odds that females are proficient versus the odds males are proficient, we can take the natural log of the ratio of the odds of proficiency for females compared with males: ln(10/.8) = ln(12.5) = 2.526. The odds ratio for females versus males can then be estimated as exp(b) = exp(2.526) = 12.5. This can be interpreted as the odds of being proficient for females versus males being increased by a factor of 12.5.

We can obtain the odds of females being proficient by multiplying the intercept odds (odds of proficiency for males) by the odds ratio for females versus males (.800 * 12.5 = 10). This matches the odds that females are proficient (10/1 = 10). Note: Odds ratios are multiplied.

The probability that a female is proficient can be estimated from the predicted log odds (2.303) in the equation above.
If we exponentiate the predicted log odds, we obtain the odds: exp(2.303) = 10. We can then use the formula odds/(1 + odds) to obtain the probability that females are proficient (10/11 = .909).

Other statistics include the -2 log likelihood (or deviance), which is the value of the log likelihood function multiplied by -2 (so it will generally be positive). The Cox and Snell R-square and Nagelkerke R-square represent types of pseudo R-squares. These are not typical R-squares since they are not based on variance accounted for in Y.

Model Summary (SPSS output reporting, for Step 1, the -2 Log likelihood, Cox & Snell R Square, and Nagelkerke R Square)
a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.

We can also use GENLIN to obtain the results. We can open ANALYZE: GENERALIZED LINEAR MODELS (Generalized Linear Models). We will obtain a screen with several different types of response variables (default = linear). We can specify binary logistic. We then select the Response variable (proficient) and select the reference category (first). We then open Predictors and select female (note we can place dichotomous variables either as factors or covariates). If we select female as categorical, we will want to open Options and use the first category (male = 0) as the reference group. Then open Model, select female, and place it in the Model box (note we could also build interactions here if there were several predictors).

Next, we can look at Estimation. The default is model-based, which is appropriate for small data sets. If we open the likelihood function and select kernel, we can see how the likelihood ratio test is used from model to model. Regarding estimates, where there is more data available we generally prefer robust estimates.

In the Statistics tab, we can check Include exponential parameter estimates. This tab also provides a Likelihood ratio test (versus the Wald chi-square test). We often prefer the Likelihood ratio tests for small samples (but you can leave the default on). Then we can run the model.
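To make the logit arithmetic for the gender example concrete, the sketch below fits the same two-parameter model (intercept plus female) by Newton-Raphson in plain Python. This is an illustrative stand-in under stated assumptions, not the routine SPSS uses internally: the function name `fit_logit`, the starting values of 0, and the convergence tolerance are all choices made for this example. The 20 records are rebuilt from the cell counts in the text.

```python
import math

# Rebuild the 20 student records as (female, proficient) pairs from the
# cell counts in the text: males 4 proficient / 5 not, females 10 / 1.
records = [(0, 1)] * 4 + [(0, 0)] * 5 + [(1, 1)] * 10 + [(1, 0)] * 1


def fit_logit(rows, max_iterations=25, tolerance=1e-10):
    """Maximum likelihood fit of logit(p) = b0 + b1 * x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iterations):
        g0 = g1 = 0.0          # gradient of the log likelihood
        h00 = h01 = h11 = 0.0  # information matrix (negative Hessian)
        for x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2 x 2 system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if max(abs(step0), abs(step1)) < tolerance:
            break
    return b0, b1


b0, b1 = fit_logit(records)

odds_male = math.exp(b0)            # intercept odds (males)
odds_ratio = math.exp(b1)           # females versus males
odds_female = math.exp(b0 + b1)     # log odds are added, odds are multiplied
p_female = odds_female / (1.0 + odds_female)

print(f"intercept (log odds, males) = {b0:.3f}")
print(f"female coefficient = {b1:.3f}, Exp(B) = {odds_ratio:.1f}")
print(f"predicted log odds, females = {b0 + b1:.3f}")
print(f"predicted probability, females = {p_female:.3f}")
```

Because the model has one parameter per group, the fitted values reproduce the table exactly: the intercept equals ln(4/5), the female coefficient equals ln(12.5), and the predicted probability for females equals 10/11.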
We first receive some information to check whether the correct probability distribution and link function have been used.

Model Information
Dependent Variable: proficient (a)
Probability Distribution: Binomial
Link Function: Logit
a. The procedure models 1 as the response, treating 0 as the reference category.

We also receive a variety of model fit information, some of which is more useful when comparing a series of models. In ML estimation, the model is fit by finding the parameter estimates that make the observed data most likely. The likelihood function summarizes how well a set of estimates reproduces the observed data (its value lies between 0 and 1), and we typically take the log of it. So, for example, taking the natural log of the (very small) value of the likelihood function for this model gives approximately the tabled log likelihood below.

Goodness of Fit (SPSS output reporting Value, df, and Value/df for the Deviance, Scaled Deviance, Pearson Chi-Square, and Scaled Pearson Chi-Square, followed by the Log Likelihood (b), Akaike's Information Criterion (AIC), Finite Sample Corrected AIC (AICC), Bayesian Information Criterion (BIC), and Consistent AIC (CAIC))
Model: (Intercept), female
a. Information criteria are in small-is-better form.
b. The kernel of the log likelihood function is displayed and used in computing information criteria.

Finally, the estimates are presented. The scale parameter is fixed at 1 because the binomial distribution has no separate variance parameter (the variance is tied to the expected value, or mean). You can see some subtle differences in the output presented through the REGRESSION and GENLIN routines in SPSS.

Parameter Estimates (the SPSS output also reports Std. Error, the Wald Chi-Square hypothesis test with df and Sig., and Exp(B))

Parameter        B
(Intercept)      -0.223
[female=1]       2.526
[female=0]       0 (a)
(Scale)          1 (b)
Model: (Intercept), female
a. Set to zero because this parameter is redundant.
b. Fixed at the displayed value.

Below, the likelihood ratio test suggests that the model with gender is better than a model with just the intercept. This compares the change in the model from the intercept only (i.e., no predictors are included in the model) to a model with one predictor.

Omnibus Test (a) (SPSS output reporting the Likelihood Ratio Chi-Square, df, and Sig.)
Model: (Intercept), female
a. Compares the fitted model against the intercept-only model.
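The log likelihood and the omnibus likelihood ratio test described above can be reproduced by hand for this small example. The sketch below is a minimal illustration that assumes ungrouped binary data (so the binomial kernel is the whole log likelihood) and uses the fitted probabilities implied by the table: 14/20 for everyone under the intercept-only model, and 4/9 for males and 10/11 for females under the model with gender. The function and variable names are hypothetical.

```python
import math


def log_likelihood(groups):
    """Sum of y*ln(p) + (n - y)*ln(1 - p) over (successes, n, p) groups."""
    total = 0.0
    for successes, n, p in groups:
        total += successes * math.log(p) + (n - successes) * math.log(1 - p)
    return total


# (number proficient, group size) for males and females, from the text.
males = (4, 9)
females = (10, 11)

# Intercept-only model: one common probability, 14 proficient out of 20.
p_common = (males[0] + females[0]) / (males[1] + females[1])
ll_intercept = log_likelihood([(*males, p_common), (*females, p_common)])

# Model with gender: each group gets its own fitted probability.
ll_gender = log_likelihood(
    [(*males, males[0] / males[1]), (*females, females[0] / females[1])]
)

# The likelihood itself is a very small number between 0 and 1.
likelihood_gender = math.exp(ll_gender)

# Likelihood ratio chi-square: -2 times the difference in log likelihoods.
lr_chi_square = -2 * (ll_intercept - ll_gender)
critical_value = 3.84                                 # chi-square, 1 df, p = .05
p_value = math.erfc(math.sqrt(lr_chi_square / 2))     # valid for 1 df only

print(f"log likelihood, intercept only = {ll_intercept:.3f}")
print(f"log likelihood, with gender = {ll_gender:.3f}")
print(f"likelihood, with gender = {likelihood_gender:.6f}")
print(f"LR chi-square = {lr_chi_square:.3f}, p = {p_value:.3f}")
print(f"exceeds {critical_value}: {lr_chi_square > critical_value}")
```

Working from log likelihoods rounded to three decimals gives 5.366 rather than 5.367, which is the kind of small rounding discrepancy to expect when checking output by hand.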
Notice if we go back and estimate an intercept-only model, we obtain the following fit information.

Goodness of Fit (SPSS output reporting Value, df, and Value/df for the Deviance, Scaled Deviance, Pearson Chi-Square, and Scaled Pearson Chi-Square, followed by the Log Likelihood (b), Akaike's Information Criterion (AIC), Finite Sample Corrected AIC (AICC), Bayesian Information Criterion (BIC), and Consistent AIC (CAIC))
Model: (Intercept)
a. Information criteria are in small-is-better form.
b. The kernel of the log likelihood function is displayed and used in computing information criteria.

The log likelihood of the intercept model is -12.217. For the model with gender, it is -9.534. The difference is -2.683, which when multiplied by -2 gives 5.366, the chi-square estimate for the Likelihood Ratio Test (with slight discrepancy due to rounding). Because this value is larger than the critical chi-square value of 3.84 (for 1 df, p < .05), we can conclude that the model with gender fits the data better than a model with just the intercept (as we might expect). As this comparison suggests, the GLM approach offers several advantages for examining relationships between variables over the more simplified cross-tabulation approach.

References

Agresti, A. (2007). An introduction to categorical data analysis. Hoboken, NJ: John Wiley & Sons, Inc.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). New York: Chapman & Hall.
More informationModeling Lifetime Value in the Insurance Industry
Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationAnalysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk
Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:
More information7 Generalized Estimating Equations
Chapter 7 The procedure extends the generalized linear model to allow for analysis of repeated measurements or other correlated observations, such as clustered data. Example. Public health of cials can
More informationMultinomial and ordinal logistic regression using PROC LOGISTIC Peter L. Flom National Development and Research Institutes, Inc
ABSTRACT Multinomial and ordinal logistic regression using PROC LOGISTIC Peter L. Flom National Development and Research Institutes, Inc Logistic regression may be useful when we are trying to model a
More information5. Survey Samples, Sample Populations and Response Rates
An Analysis of Mode Effects in Three Mixed-Mode Surveys of Veteran and Military Populations Boris Rachev ICF International, 9300 Lee Highway, Fairfax, VA 22031 Abstract: Studies on mixed-mode survey designs
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More information171:290 Model Selection Lecture II: The Akaike Information Criterion
171:290 Model Selection Lecture II: The Akaike Information Criterion Department of Biostatistics Department of Statistics and Actuarial Science August 28, 2012 Introduction AIC, the Akaike Information
More informationUNDERSTANDING THE DEPENDENT-SAMPLES t TEST
UNDERSTANDING THE DEPENDENT-SAMPLES t TEST A dependent-samples t test (a.k.a. matched or paired-samples, matched-pairs, samples, or subjects, simple repeated-measures or within-groups, or correlated groups)
More informationIs it statistically significant? The chi-square test
UAS Conference Series 2013/14 Is it statistically significant? The chi-square test Dr Gosia Turner Student Data Management and Analysis 14 September 2010 Page 1 Why chi-square? Tests whether two categorical
More informationLogistic Regression (a type of Generalized Linear Model)
Logistic Regression (a type of Generalized Linear Model) 1/36 Today Review of GLMs Logistic Regression 2/36 How do we find patterns in data? We begin with a model of how the world works We use our knowledge
More informationHLM software has been one of the leading statistical packages for hierarchical
Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush
More informationIntroduction to Predictive Modeling Using GLMs
Introduction to Predictive Modeling Using GLMs Dan Tevet, FCAS, MAAA, Liberty Mutual Insurance Group Anand Khare, FCAS, MAAA, CPCU, Milliman 1 Antitrust Notice The Casualty Actuarial Society is committed
More informationThe first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com
The first three steps in a logistic regression analysis with examples in IBM SPSS. Steve Simon P.Mean Consulting www.pmean.com 2. Why do I offer this webinar for free? I offer free statistics webinars
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationProbability Calculator
Chapter 95 Introduction Most statisticians have a set of probability tables that they refer to in doing their statistical wor. This procedure provides you with a set of electronic statistical tables that
More informationMultiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.
Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.
More informationEstimating the Influence of Accident Related Factors on Motorcycle Fatal Accidents using Logistic Regression (Case Study: Denpasar-Bali)
Civil Engineering Dimension, Vol. 12, No. 2, September 2010, 106-112 ISSN 1410-9530 print / ISSN 1979-570X online Estimating the Influence of Accident Related Factors on Motorcycle Fatal Accidents using
More informationCHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES
Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical
More informationSAMPLE SIZE TABLES FOR LOGISTIC REGRESSION
STATISTICS IN MEDICINE, VOL. 8, 795-802 (1989) SAMPLE SIZE TABLES FOR LOGISTIC REGRESSION F. Y. HSIEH* Department of Epidemiology and Social Medicine, Albert Einstein College of Medicine, Bronx, N Y 10461,
More information