Stata: Bivariate Statistics
Topics: Chi-square test, t-test, Pearson's R correlation coefficient
- Silvester Franklin
There are three situations during survey data analysis in which bivariate statistics are commonly used.

1. Compare two groups

First, bivariate statistics are used to compare two study groups to see whether they are similar. For example, we might compare two groups at baseline before an intervention is implemented, or compare participants who were lost to follow-up with those who remained in the study. When comparing groups, we want to provide strong evidence of any group differences, so we use a conservative threshold of p<0.05 to determine statistical significance.

In this course, we are learning to analyze research questions with binary outcomes. Bivariate statistics can be used to summarize and compare characteristics across groups. For example, were there differences in the sociodemographic characteristics of women who did and did not experience intimate partner violence in the last 12 months?

2. Identify covariates for general explanatory model

When a characteristic like age is different in people who did and did not experience the outcome, we say that the characteristic is associated with the outcome. This is because the characteristic helps to explain variance in the outcome.

In cross-sectional data analysis, we cannot draw causal conclusions. We are not talking about causal mechanisms that predict the outcome. Although a woman's age group might be associated with whether or not she experienced intimate partner violence in the last 12 months, the biological process of aging does not cause her partner to act violently toward her. Rather, we are saying that a characteristic (like older age) tends to be present when the outcome is present.

When we are developing a general explanatory model, that is, when the research question is "Which factors are associated with [the outcome]?", we use bivariate statistics to identify potential covariates that are worth testing in a multivariable model. If a variable is independently associated with the outcome, it might continue to explain the outcome once other factors are taken into account. In this case, when bivariate statistics are used to filter potential covariates for multivariate analysis, we use a generous threshold of p<0.1 to determine statistical significance, to ensure that we do not drop any potentially useful variables from the analysis.

Note that the statistical test used to compare two groups (usually the chi-square test when the planned model is a logistic regression) is the same test, with the same output, that we use here to filter variables. The only difference is the purpose of the test, and therefore our interpretation of its results.
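The two thresholds described above can be applied to the same p-value, as in the following minimal Python sketch. The variable names and p-values are hypothetical and serve only to illustrate the rule. In practice the p-values would come from the Stata output.

```python
# Thresholds quoted in the text. The covariate names and p-values are hypothetical.
REPORT_THRESHOLD = 0.05  # conservative: describing differences between groups
SCREEN_THRESHOLD = 0.10  # generous: filtering covariates for the multivariable model


def classify(p_value):
    """Return (report_as_different, advance_to_model) for one bivariate p-value."""
    return p_value < REPORT_THRESHOLD, p_value < SCREEN_THRESHOLD


example_p_values = {"age_group": 0.003, "education": 0.07, "residence": 0.42}

for name, p in example_p_values.items():
    report, advance = classify(p)
    print(f"{name}: p={p:.3f} report_difference={report} advance={advance}")
```

A variable with a p-value between 0.05 and 0.1 is not reported as a group difference but is still carried forward as a candidate covariate.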
3. Chi-square test

The chi-square test is a common bivariate statistic used to test whether the distribution of a categorical variable is statistically different in two or more groups. The chi-square test gives a yes/no answer: a p-value less than the threshold means, yes, there are differences between the groups. In a manuscript, if you see a p-value next to a categorical variable (with data summarized as percentages), this is usually a chi-square test statistic.

Source: Manzi, A., et al. (2014) BMC Pregnancy and Childbirth

The chi-square test statistic p-value is easy to interpret after you have set a threshold for statistical significance: either the distributions are, or are not, the same. The chi-square test is a global statistic; it tells you if there are any differences across cells, though it does not tell you which cell(s) are different. You can often tell which cells are different qualitatively based on the percentages, though additional or different testing might be performed to isolate whether certain cells are statistically different from the rest.

You should not use the chi-square test statistic if one or more cells in the cross-tabulation has fewer than five observations, though this is rare in survey data analysis when tens of thousands of respondents are interviewed. If we have a response category with fewer than five observations, then we should combine it with another category.

The chi-square test statistic is simple to implement in Stata. In fact, we have been doing it all along! Each time we use the tabulate command with survey data (by starting with svy:) to cross-tabulate two variables, we are producing a Pearson's chi-square statistic (reported by Stata as a design-based F-statistic) and a p-value.
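To make the statistic concrete, here is a minimal Python sketch of the unweighted Pearson chi-square computation for a table of counts, together with the "fewer than five observations" check described above. It is an illustration only: it ignores weights, clusters, and strata, so it will not reproduce the design-based result that Stata reports for survey data. The counts are hypothetical.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def p_value_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom (2x2 table)."""
    return math.erfc(math.sqrt(statistic / 2.0))


def sparse_cells(table, minimum=5):
    """Positions (row, column) of cells with fewer than `minimum` observations."""
    return [(i, j) for i, row in enumerate(table)
            for j, count in enumerate(row) if count < minimum]


# Hypothetical 2x2 table: rows are groups, columns are outcome no/yes.
counts = [[10, 20], [20, 10]]
stat, df = chi_square(counts)
print(stat, df, p_value_df1(stat), sparse_cells(counts))
```

The p-value helper covers only the 2x2 case, because the general chi-square distribution function is not in the Python standard library.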
4. T-test

A t-test is used to test whether the mean of a continuous variable is statistically different across groups: a p-value less than the threshold means, yes, there are differences.

Do NOT use a t-test when the distribution of outcomes within groups is not normal, or when the variance is not the same across groups. In these situations, consider transforming the variable (we do not discuss this further in this course), or categorize the continuous values and test the result as a categorical variable.

You can produce t-test statistics for a continuous variable across two or more groups with survey data by specifying a linear regression and testing for differences in the outcome across group categories.
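The link between a t-test and a linear regression can be checked numerically. The Python sketch below, using hypothetical and unweighted data, computes the classic equal-variance two-group t statistic and the t statistic for the slope of a regression on a 0/1 group indicator; the two agree. A survey-weighted regression in Stata uses design-based standard errors, so its values would differ from this simple case.

```python
import math


def pooled_t(group0, group1):
    """Equal-variance two-sample t statistic for mean(group1) - mean(group0)."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((y - m0) ** 2 for y in group0)
    ss1 = sum((y - m1) ** 2 for y in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))


def regression_t(group0, group1):
    """t statistic for the slope when the outcome is regressed on a 0/1 group indicator."""
    x = [0.0] * len(group0) + [1.0] * len(group1)
    y = list(group0) + list(group1)
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(sse / (n - 2) / sxx)
    return slope / slope_se


# Hypothetical continuous outcome measured in two groups.
a, b = [1, 2, 3], [2, 4, 6]
print(pooled_t(a, b), regression_t(a, b))
```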
5. Test for collinearity among two covariates

Before fitting any kind of multivariate model, whether a general explanatory model or a hypothesis test model, you should test for collinearity. Collinearity occurs when two covariates in a multivariable model are highly related; usually this is because the two variables represent the same thing (the same concept, or they happen simultaneously). For example, in a society where husbands and wives tend to have the same level of education, women's education status and men's education status represent the same construct within households. Wife's education might do a good job explaining variance in the outcome, leaving little leftover variance to be explained by husband's education. As a result, the model becomes unstable.

To produce parsimonious (efficient) multivariable models, and to prevent strange, unstable results, we test for strong associations among covariates and remove any collinear covariates from the analysis. The Pearson's R correlation coefficient is used to identify binary, ordinal, and continuous covariates that are correlated. Correlations of r>0.5 are often considered collinear in the social sciences. When two or more covariates are found to be collinear, we keep the one variable that is most strongly associated with the outcome, unless there is a conceptual reason to keep one over the other.

For nominal variables (variables with non-ordered categories), say marriage type, you cannot use the Pearson's R correlation coefficient. If you want to be rigorous, you might test one or more binary definitions of the variable, for example, married (yes/no) or separated (yes/no), rather than a four-category definition of marital status. In practice, you might only do this step if you were concerned about collinearity for conceptual reasons.
6. Pearson's R correlation coefficient

The reason we only use the Pearson's R correlation coefficient for binary, ordinal, and continuous data is that it is a measure of the strength of linear association between two variables. The Pearson's R correlation answers the question: How much are two variables associated on a scale of zero to absolute one?

The Pearson's R correlation statistic is related to linear regression; it tries to draw a line of best fit through the data of two variables. The strength of association is measured on a scale of -1 to +1, where 0 indicates no linear association (as the value of one variable increases, the other shows no consistent pattern). As r approaches +1, it denotes a positive association (as the value of one variable increases, the other increases). As r approaches -1, it denotes a negative association (as the value of one variable increases, the other decreases).
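The scale from -1 to +1 can be seen in a small unweighted Python sketch of the Pearson correlation formula. The data are hypothetical and chosen so that the three cases (perfect positive, perfect negative, and no linear association) are easy to verify by hand.

```python
import math


def pearson_r(x, y):
    """Unweighted Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))    # y rises with x
print(pearson_r(x, [8, 6, 4, 2]))    # y falls as x rises
print(pearson_r(x, [1, -1, -1, 1]))  # no linear pattern
```

The same function works for binary variables coded 0/1 and for ordinal variables coded with ordered integers, which is why those variable types can be screened this way.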
The command used to perform correlation analysis with survey data does not come installed with Stata, so we have to use the findit command to find and install it on our computer. We only need to install the .ado command files once, after which the command will be integrated into Stata. The command is corr_svy.

Since the command is not part of the normal Stata package, we have to manually specify all aspects of the sample design, including pweight(), psu(), and strata(). We can also include a subpop() statement, if applicable. If we include two variables in the corr_svy statement, Stata will produce the Pearson's R correlation statistic for that one pair. If we list multiple variables, Stata will produce the Pearson's R correlation statistics for all pair combinations.

Note that the output shows a number of correlations equal to 1. We can ignore these. The correlation equals 1 when the same variable listed on the x axis appears on the y axis; they are the same variable and therefore perfectly correlated.
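The shape of that output, with every pair of variables and a diagonal of 1s, can be sketched in Python using a generic weighted Pearson formula. This is an assumption-laden illustration, not a reproduction of how corr_svy computes its estimates: it uses sampling weights only and ignores PSUs and strata. The variable names, values, and weights are hypothetical, and flagging on the absolute value of r is an assumption added here so that strong negative correlations are also caught.

```python
import math


def weighted_r(x, y, w):
    """Pearson correlation with observation weights w (generic formula)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns, w):
    """Correlations for all pairs of variables, including each variable with itself."""
    names = list(columns)
    return {a: {b: weighted_r(columns[a], columns[b], w) for b in names} for a in names}


def flag_collinear(matrix, threshold=0.5):
    """Distinct pairs whose correlation exceeds the threshold in absolute value."""
    names = list(matrix)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if abs(matrix[a][b]) > threshold]


# Hypothetical covariates and sampling weights for four respondents.
data = {
    "wife_educ": [0, 1, 2, 3],
    "husband_educ": [0, 1, 2, 2],
    "age": [3, 1, 4, 2],
}
weights = [1.5, 0.5, 2.0, 1.0]

matrix = correlation_matrix(data, weights)
print(matrix["wife_educ"]["wife_educ"])  # a variable paired with itself
print(flag_collinear(matrix))
```

Because flag_collinear only looks at distinct pairs, the diagonal entries never appear in its result, which mirrors the advice to ignore the correlations equal to 1.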
7. Bivariate statistics in an analysis workflow

Table 2. Bivariate statistics

Let us briefly review how to use bivariate statistics in an analysis workflow. Let us say that our study population is women who answered questions about domestic violence in the Rwanda 2010 Demographic and Health Survey. The outcome of our analysis is binary: either a woman experienced intimate partner violence in the last 12 months, or she did not.

Based on our conceptual framework, which draws on a literature review, common sense, and our own experiences, we generated 20 variables that might be associated with intimate partner violence. We categorize all variables, and then use chi-square statistics to test whether each covariate is associated with the outcome. We summarize the findings for all variables, including those that are not statistically significant, in Table 2.

In any presentation of these results, we can talk about differences between women who did and did not experience intimate partner violence in the last 12 months based on statistical significance of the chi-square statistic at p<0.05 [black]. Using the same output, we decide to advance all variables that are associated at p<0.1 to the next stage in the analysis [black and red]. In most analyses, we find several variables that are not independently associated with the outcome, so we do not advance them in the analysis workflow.

Pearson's R Correlation Coefficients

With the covariates that remain, we use the Pearson's R test for collinearity to ensure that each variable in the analysis represents a unique concept, and that our multivariate model will be stable. We use the corr_svy statement to test for collinearity among all covariate pairs, and remove any collinear covariates from the analysis. We are now ready to move forward with multivariate modeling.
LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values
CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V
CHAPTER 15 NOMINAL MEASURES OF CORRELATION: PHI, THE CONTINGENCY COEFFICIENT, AND CRAMER'S V Chapters 13 and 14 introduced and explained the use of a set of statistical tools that researchers use to measure
Sun Li Centre for Academic Computing [email protected]
Sun Li Centre for Academic Computing [email protected] Elementary Data Analysis Group Comparison & One-way ANOVA Non-parametric Tests Correlations General Linear Regression Logistic Models Binary Logistic
MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD
Tips for surviving the analysis of survival data Philip Twumasi-Ankrah, PhD Big picture In medical research and many other areas of research, we often confront continuous, ordinal or dichotomous outcomes
So, using the new notation, P X,Y (0,1) =.08 This is the value which the joint probability function for X and Y takes when X=0 and Y=1.
Joint probabilit is the probabilit that the RVs & Y take values &. like the PDF of the two events, and. We will denote a joint probabilit function as P,Y (,) = P(= Y=) Marginal probabilit of is the probabilit
End User Satisfaction With a Food Manufacturing ERP
Applied Mathematical Sciences, Vol. 8, 2014, no. 24, 1187-1192 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.4284 End-User Satisfaction in ERP System: Application of Logit Modeling Hashem
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
Child Marriage and Education: A Major Challenge Minh Cong Nguyen and Quentin Wodon i
Child Marriage and Education: A Major Challenge Minh Cong Nguyen and Quentin Wodon i Why Does Child Marriage Matter? The issue of child marriage is getting renewed attention among policy makers. This is
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
