Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY


ABSTRACT: This project attempted to determine the relationship between race and the existence and amount of credit card debt, along with the associations between debt and secondary variables such as income, age, education, and kids. The analysis was split into three stages: first, a logistic regression on data obtained from a survey conducted by the Federal Reserve; then a log-transformed multiple linear regression on the same data; and finally the incorporation of interaction terms into the multiple linear regression. According to the logistic regression, income and age were negatively associated with taking on credit card debt, while kids and education were positively associated with taking on debt; the racial categories African-American and Hispanic were not statistically significant, and the racial category Other was less likely to take on credit card debt than whites with all else held constant. The log-transformed multiple linear regression demonstrated that kids, education, and age are positively associated with the amount of credit card debt. The racial category for blacks was negatively associated with the amount of credit card debt compared to whites, while Hispanic and Other were not statistically significant. Considering interaction terms, an increase in income had a greater effect (positive coefficient) for blacks than for whites, and a similar relationship was observed for Hispanics. However, Other has a negative coefficient, demonstrating that the effect of income on credit card balance for Other is less than the effect for the racial category white. This study may ultimately offer insights into how race can be correlated with patterns of credit card spending and lead to future research into the nuances of immigrant versus domestic categories of the same race.
INTRODUCTION: This study investigated how race and other factors may affect credit card debt and, if applicable, the extent of the accumulated debt. The purpose of the initial logistic regression run in our two-part study was to determine the relationship between race as the explanatory variable and the existence of credit card debt. To control for potential confounding variables that may distort this association, income, number of kids, level of education, and age were also included as explanatory variables, and their relationship with the existence of credit card debt was evaluated. The hypothesis for the first part of this report predicted a negative association between income and the existence of credit card debt; that is, a higher level of income would be correlated with a lower probability of having credit card debt. Number of kids was predicted to be positively associated with the existence of credit card debt, while education and age were predicted to have a negative association with the response variable. The second stage of the study was conducted to determine the association between the same set of explanatory variables and the amount of credit card debt accumulated, given that an unpaid balance existed. As in the first part of the study, the primary relationship explored was the potential effect of race on the quantity of credit card debt. The hypotheses for this stage predicted a negative association between income and credit card balance; that is, a higher level of income would be correlated with a lower credit card balance. Number of kids was predicted to be positively associated with the amount of debt, while education and age were predicted to have a negative association with the amount of debt. The possible interactions between explanatory variables were also explored within this report. The analysis would likely be of great interest to credit card companies in evaluating the likelihood that their customers might accrue unpaid credit card balances. Although race was the principal factor considered, the other explanatory variables included may also provide insight into how credit card debt is determined. Given the recent financial crisis and its lingering impact on consumers, the hope is that this report may shed light on the various factors associated with the existence and accumulation of credit card debt.

METHODS: The data used in this report were obtained from the Federal Reserve's 2007 Survey of Consumer Finances, in which 4,418 families were surveyed. The response variable ccbal represents credit card balance in dollars, and five explanatory variables were examined. The primary explanatory variable of interest, race, was divided into the subcategories white, African-American/black, Hispanic, and Other (the racial category Asian was incorporated into Other in the original data set). Income was defined as the total income of the household in dollars, while the variable kids represented the total number of children in the household. The variable EDUC indicated the total number of years of education completed by the head of the household. Finally, the explanatory variable age represented the age of the head of household. Although the survey data included many other explanatory variables, these predictors were chosen based on their relevance to daily spending and the interests of the authors of this report. Statistical analysis was conducted using the statistical package StataSE.
A logistic regression was run to investigate which of the variables accounted for an individual's likelihood of taking on credit card debt, a binary response variable (the subcategories of race were incorporated into the model as dummy variables). Next, a multiple linear regression was run to determine how the predictor variables affected the amount of credit card debt incurred, given that an individual had credit card debt; this was done by restricting the sample to ccbal>0. Since the survey data were right-skewed, a log transformation was performed on ccbal to reduce the skew in the multiple linear regression. The criterion for statistical significance of variables in both models was p<0.05. Possible interactions between predictor variables were also addressed. Since the effect of race was the primary concern of the report, twelve regressions were run, each addressing an interaction between African-American, Hispanic, or Other (white being the baseline for all comparisons) and education, age, kids, or income. Several diagnostic tests were conducted to evaluate the regression models. Heteroskedasticity was tested for using the Breusch-Pagan/Cook-Weisberg test given ccbal>0. The Shapiro-Francia and Shapiro-Wilk normality tests were then run to check normality.

RESULTS:
Part 1: Logistic Regression
The first stage of this study used a logistic regression to determine who might take on credit card debt (Y=1) and who would not (Y=0).
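As a rough illustration of how a logistic model turns explanatory variables into a probability of holding debt, the sketch below evaluates P(Y=1) = exp(eta) / (1 + exp(eta)) for a linear predictor eta. The coefficient values, the zero intercept, and the example household are hypothetical; they were chosen only to mirror the signs described in the abstract and are not the estimates from this study.

```python
import math


def logistic_probability(intercept, coefs, x):
    """Return P(Y=1) = exp(eta) / (1 + exp(eta)) for the linear predictor eta."""
    eta = intercept + sum(coefs[name] * x[name] for name in coefs)
    # 1 / (1 + exp(-eta)) is algebraically the same as exp(eta) / (1 + exp(eta)).
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (NOT the study's estimates); only the signs follow the text.
coefs = {"income": -5e-07, "kids": 0.1, "educ": 0.05, "age": -0.02, "other": -0.4}

# Hypothetical baseline household.
base = {"income": 50000, "kids": 1, "educ": 12, "age": 40, "other": 0}

p_base = logistic_probability(0.0, coefs, base)
p_more_kids = logistic_probability(0.0, coefs, {**base, "kids": 2})
p_older = logistic_probability(0.0, coefs, {**base, "age": 50})
p_other = logistic_probability(0.0, coefs, {**base, "other": 1})
```

With these made-up values, a positive coefficient (kids) raises the predicted probability and a negative coefficient (age, other) lowers it, which is how the signs in a logistic model are read when everything else is held constant.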

From Table 1, the logistic regression model is as follows:

$$P(Y=1) = \frac{\exp(\eta)}{1+\exp(\eta)}, \qquad \eta = \beta_0 - 4.52\times10^{-7}\,\text{income} + \beta_1\,\text{kids} + \beta_2\,\text{education} + \beta_3\,\text{age} + \beta_4\,\text{black} + \beta_5\,\text{Hispanic} + \beta_6\,\text{other}$$

where $\beta_0, \dots, \beta_6$ denote the intercept and the remaining coefficient estimates reported in Table 1.

Race        Significance
Black       P-value = 0.394, not significant
Hispanic    P-value = 0.223, not significant
Other       P-value = 0.004, significant

The variables black and Hispanic were not statistically significant in the model, as the p-values for their coefficients are above 0.05. This indicates that blacks and Hispanics are not any more likely to take on credit card debt than whites, holding all other x variables constant. For the racial category Other (Asians, etc.), the variable was found to be significant in the model and the coefficient was negative, indicating that this racial category would be less likely to take on debt than whites (and blacks and Hispanics), holding all else constant. As expected, the coefficient for income was negative, indicating that an increase in income is associated with being less likely to take on debt, with all else constant. Also as expected, the coefficient for kids was positive and the coefficient for age was negative: as the number of kids increases, the probability of taking on credit card debt increases (holding all else constant), and as age increases, the probability of taking on credit card debt decreases (controlling for all other variables). The positive coefficient for education did not correspond to the original hypothesis that an increase in education would be related to a lower likelihood of debt; instead, an increase in years of education is associated with a greater likelihood of debt.

Part 2: Linear Regression
Due to the right-skewed nature of the response variable, credit card balance (ccbal), the y variable was transformed using a logarithmic (base 10) transformation.
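The variable construction described in the Methods, together with the base-10 log transformation just mentioned, can be sketched as follows. This is a minimal illustration on made-up households, not the authors' Stata code; the field names `has_debt` and `race` and all data values are assumptions introduced for the example, while `log10ccbal` and `income_black` follow the variable names used in the report.

```python
import math

# Hypothetical toy households; the values are illustrative, not survey data.
households = [
    {"ccbal": 0.0, "income": 52000.0, "race": "white"},
    {"ccbal": 1200.0, "income": 38000.0, "race": "black"},
    {"ccbal": 4500.0, "income": 61000.0, "race": "hispanic"},
    {"ccbal": 300.0, "income": 90000.0, "race": "other"},
]


def prepare(row):
    """Derive the modelling variables described in the Methods section."""
    out = dict(row)
    # Binary response for the logistic stage: 1 if any unpaid balance exists.
    out["has_debt"] = 1 if row["ccbal"] > 0 else 0
    # Race dummies, with white as the omitted baseline category.
    for group in ("black", "hispanic", "other"):
        out[group] = 1 if row["race"] == group else 0
    # Log-transformed response, defined only when ccbal > 0.
    out["log10ccbal"] = math.log10(row["ccbal"]) if row["ccbal"] > 0 else None
    # Example interaction term (income x black), analogous to income_black.
    out["income_black"] = row["income"] * out["black"]
    return out


prepared = [prepare(r) for r in households]

# Rows eligible for the linear stage, mirroring the "if ccbal>0" restriction.
linear_sample = [r for r in prepared if r["has_debt"] == 1]
```

Households with a zero balance contribute to the logistic stage only; the log10 response is left undefined for them, which is why the linear stage is restricted to positive balances.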
A multiple regression was run with log10ccbal and the five x variables: race, income, number of kids, years of education, and age. The results are presented in Table 2. The estimated regression model from Table 2 is as follows:

$$\log_{10}(\text{ccbal}) = \beta_0 - 8.70\times10^{-9}\,\text{income} + \beta_1\,\text{kids} + \beta_2\,\text{education} + \beta_3\,\text{age} + \beta_4\,\text{black} + \beta_5\,\text{Hispanic} + \beta_6\,\text{other}$$

where $\beta_0, \dots, \beta_6$ denote the intercept and the remaining coefficient estimates reported in Table 2.

In this linear regression model, the variables kids, education, age, and the racial subcategory black were significant factors with p-value < .01, whereas income (p=0.136), Hispanic (p=0.518), and Other (p=0.977) were not significant. Among the significant predictors, kids, education, and age are positively associated with credit card balance, whereas black is negatively associated. For every additional child, credit card balance is multiplied by 10^0.05 = 1.12, holding all other factors constant. For every 1 year increase in years of education, credit card balance increases by a multiplicative factor of 10^(.151). For every 1 year increase in age, credit card balance is multiplied by 10^(.00386) = 1.01, all else being the same, and blacks take on 10^(0.196) = 1.57 times less debt than whites, given that all other factors are the same.

Part 3: Interaction Terms

A. Interaction between Income and Race

Race        Significance
Black       P < 0.001, significant
Hispanic    P = 0.019, significant
Other       P < 0.001, significant

In the formulas below, $\beta_0$ is the intercept and $\beta_1$, $\beta_2$, $\beta_3$, $\beta_5$, and $\beta_6$ denote the estimates for kids, edu, age, Hispanic, and other reported in the corresponding table.

Regression formula from Table 3.1 [See Appendix]:

$$\log_{10}(\text{ccbal}) = \beta_0 - 9.12\times10^{-9}\,\text{income} + \beta_1\,\text{kids} + \beta_2\,\text{edu} + \beta_3\,\text{age} - .307\,\text{black} + \beta_5\,\text{Hispanic} + \beta_6\,\text{other} + 1.80\times10^{-6}\,\text{income\_black}$$

Regression formula from Table 3.2 [See Appendix]:

$$\log_{10}(\text{ccbal}) = \beta_0 - 9.14\times10^{-9}\,\text{income} + \beta_1\,\text{kids} + \beta_2\,\text{edu} + \beta_3\,\text{age} - .189\,\text{black} + \beta_5\,\text{Hispanic} + \beta_6\,\text{other} + 3.48\times10^{-7}\,\text{income\_hispanic}$$

The interaction term income_black has a positive coefficient and is significant with p-value less than 0.001, showing that an increase in income for blacks increases credit card balance by a slight but statistically significant multiplicative value compared to whites, all else constant. The result for income_hispanic is similar to that for income_black.

Regression formula from Table 3.3 [See Appendix]:

$$\log_{10}(\text{ccbal}) = \beta_0 - 8.24\times10^{-9}\,\text{income} + \beta_1\,\text{kids} + \beta_2\,\text{edu} + \beta_3\,\text{age} - .187\,\text{black} + \beta_5\,\text{Hispanic} + .144\,\text{other} - 1.06\times10^{-6}\,\text{income\_other}$$

The interaction term income_other has a negative coefficient and is significant with p-value less than 0.001, showing that an increase in income for the racial category Other decreases credit card balance by a slight but statistically significant multiplicative value compared to whites, with all else constant.

B. Interaction between Kids and Race

Race        Significance
Black       P < 0.001, significant
Hispanic    P = 0.4, not significant
Other       P = 0.25, not significant
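The multiplicative readings quoted in Part 2 follow from the base-10 log transformation: a coefficient b on a predictor means that a one-unit increase multiplies the predicted balance by 10^b. A short sketch that reproduces the factors stated in the text (the function name `multiplier` is introduced here only for illustration):

```python
def multiplier(log10_coef):
    """Multiplicative change in ccbal for a one-unit increase in a predictor
    whose coefficient in the log10(ccbal) model is log10_coef."""
    return 10 ** log10_coef


# Coefficient magnitudes as quoted in the Part 2 text.
kids_factor = multiplier(0.05)       # one additional child
age_factor = multiplier(0.00386)     # one additional year of age
black_factor = multiplier(-0.196)    # black relative to white

# "Times less debt" is the reciprocal of a factor below one.
black_times_less = 1 / black_factor

print(round(kids_factor, 2), round(age_factor, 2), round(black_times_less, 2))
```

The same conversion applies to the interaction coefficients: the factor for a group is obtained from the sum of the main-effect coefficient and the group's interaction coefficient.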

The interaction term kids_black has a negative coefficient and is significant with p-value less than 0.001, showing that an additional child for blacks decreases credit card balance by a multiple of 10^-.137 = 0.729 compared to whites, all else constant. The effects for Hispanic and Other are not significant, indicating that the effect of kids on credit card balance for these groups is about the same as for whites.

C. Interaction between Education and Race

Race        Significance
Black       P = 0.005, significant
Hispanic    P < 0.001, significant
Other       P = 0.731, not significant

The interaction term education_black has a negative coefficient and is significant, showing that an additional year of education for blacks decreases credit card balance by a multiple of .935 compared to whites, with all else constant. The effect for Hispanic is about the same as the effect just discussed for blacks. The effect for Other is not significant, indicating that the effect of education on credit card balance for Other is about the same as for whites.

D. Interaction between Age and Race

Race        Significance
Black       P < 0.001, significant
Hispanic    P = 0.011, significant
Other       P = 0.971, not significant

The interaction term age_black has a positive coefficient and is significant, showing that an additional year of age for blacks increases credit card balance by a multiple of 10^.00871 = 1.02 compared to whites, with all else constant. The effect for Hispanic is about the same as the effect just discussed for blacks. The effect for Other is not significant, indicating that the effect of age on credit card balance for Other is about the same as for whites.

CONCLUSION/DISCUSSION
In this study, a logarithmic transformation was performed on the credit card balance data obtained from the Federal Reserve because the data were very right-skewed.
Furthermore, since the validity of the model may be affected by the normality of values, two normality tests were performed to see if the residuals were normally distributed (see Table 4 in Appendix). The Shapiro-Francia and Shapiro-Wilk normality tests yielded two different results (see Appendix), and the histogram (Graph 1 in Appendix) shows that the residuals are not normally distributed. This may be due to the huge sample size and the exclusion of other factors not included in this model, such as personal preference. When considering the predictive power of the current models, these diagnostic results should be taken into account. The residuals have constant variance according to the Breusch-Pagan/Cook-Weisberg test (hettest) for heteroskedasticity, which yielded a p-value greater than 0.05 for the multiple linear regression. Since the p-value exceeds 0.05, the test's null hypothesis of constant variance cannot be rejected, so there is no evidence that the multiple linear regression is heteroskedastic. The scatter plot of residuals against fitted values (Graph 2 in Appendix) looks normal except for an abnormal straight line. Independence among the samples is assumed in the model. Multicollinearity was tested for by correlating the explanatory variables against each other (Table 5 in Appendix). The largest correlation among the explanatory variables is between age and kids, and it falls below the 0.5 cutoff for very significant multicollinearity. Hence, although there are some collinear relationships among the explanatory variables, they are not strong enough to severely affect our model. In the linear regression model, income is surprisingly not significant. This may be due to collinearity among the explanatory variables. For example, if income can be partially explained by education and age, then it may lose its explanatory power for credit card balance. According to the model, kids, education, and age are positively associated with the response variable. One explanation may be that as families have more children, consumption increases and they take on more credit card debt. Contradicting the original hypotheses, more years of education and greater age are each associated with more outstanding credit card debt (although age had a negative association with the existence of debt, it had a positive association with the accumulation of credit card debt).
This may be because educated individuals can rely on higher, more consistent income sources, so they have a greater ability to pay back their debt in the future and consume more as a result. Furthermore, financial concerns may increase with age as people begin to pay for cars, materials for their children, and house mortgages. These factors may all contribute to credit card debt. One interesting observation in this model is that, while Hispanic and Other (Asians) are not significant, black is significantly negatively associated with credit card debt. This might indicate that the consumption patterns of Others and Hispanics are roughly the same as those of whites if they take on debt, whereas blacks take on less debt. Given the negative coefficients for the kids_black and education_black interaction terms, this would seem to indicate that blacks spend less on kids and education, so they have less outstanding credit card debt. A logistic regression was conducted to investigate how each explanatory variable contributes to explaining the chance of taking on credit card debt. In this model, Other is a significant factor along with income, kids, education, and age, whereas black and Hispanic are not significant. One interesting observation is that an Asian family is less likely to take on debt than a white family when all other factors are held constant. Other races have roughly the same chance of taking on debt as white families. The addition of an interaction term between income and race in the linear regression model indicates positive associations between income_hispanic and debt and between income_black and debt, and a negative association between income_other and debt. One likely explanation for these observations may be cultural differences.
For instance, the regressions performed would lend themselves to the broad interpretation that Asians tend to save money (relative to other races) and use cash or debit cards instead of credit cards, whereas other racial groups may be more comfortable with using credit cards or spending ahead of time. As their income increases, Asians are more likely to save and spend less compared to whites, while Hispanics and blacks tend to spend more comparatively. Therefore, this model mainly shows the differences in consumption patterns and credit card debt holding between the racial category Other and the racial categories Hispanic, black, and white. One possible option for future studies could be comparing the credit card debt-holding behaviors of Asian immigrants and American-born Asians. The current study would indicate that Asians are less likely to take on debt, but if they do, they take on roughly the same debt as whites. Further research could examine whether Asian immigrants tend not to take on any credit card debt, while American-born Asians share similar debt-holding behaviors with other racial groups in the U.S.
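The multicollinearity screen described in the discussion (pairwise correlations among explanatory variables, compared against a 0.5 cutoff) can be sketched as below. The helper names `pearson` and `flag_collinear` and the toy columns are assumptions made for illustration; the values are not the survey data and the output is not the correlation table from the Appendix.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def flag_collinear(columns, cutoff=0.5):
    """Return (name_a, name_b, r) for every pair with |r| at or above the cutoff."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= cutoff:
                flagged.append((a, b, r))
    return flagged


# Hypothetical toy columns, chosen only to exercise the screen.
toy = {
    "age": [25, 35, 45, 55, 65],
    "kids": [1, 2, 2, 1, 0],
    "educ": [12, 16, 12, 14, 12],
}

flagged = flag_collinear(toy)
```

A pairwise screen of this kind only detects collinearity between two variables at a time; a variable that is well explained by a combination of several others would not be flagged by it.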

Appendix:

Table 1: Logistic Regression
. logit logisticccbal income kids educ age black hispanic other
Output: iteration log (Iteration 0 through Iteration 7), LR chi2(7), Prob > chi2, Log likelihood, Pseudo R2, and a coefficient table (Coef., Std. Err., z, P>|z|, [95% Conf. Interval]) for income, kids, educ, age, black, hispanic, other, _cons.
Note: 113 failures and 0 successes completely determined.

Table 2: Linear Regression
. regress log10ccbal income kids educ age black hispanic other if ccbal>0
Number of obs = 8484; F(7, 8476)
Coefficient table (Coef., Std. Err., t, P>|t|, [95% Conf. Interval]) for income, kids, educ, age, black, hispanic, other, _cons.

Table 3: Interaction Terms

Table 3.1
. regress logccbal1 income kids educ age black hispanic other income_black if ccbal>0
Number of obs = 8484; F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, income_black, _cons.

Table 3.2
. gen income_hispanic = income*hispanic
. regress logccbal1 income kids educ age black hispanic other income_hispanic if ccbal>0
Number of obs = 8484; F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, income_hispanic, _cons.

Table 3.3
. regress logccbal1 income kids educ age black hispanic other income_other if ccbal>0
Number of obs = 8484; F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, income_other, _cons.

Table 3.4 Interaction Term: kids_black
. regress log10ccbal income kids educ age black hispanic other kids_black if ccbal>0
F(8, 8475); Root MSE = .71862
Coefficient table for income, kids, educ, age, black, hispanic, other, kids_black, _cons.

Table 3.5 Interaction Term: kids_hispanic
. regress log10ccbal income kids educ age black hispanic other kids_hispanic if ccbal>0
F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, kids_hispanic, _cons.

Table 3.6 Interaction Term: kids_other
. regress log10ccbal income kids educ age black hispanic other kids_other if ccbal>0
F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, kids_other, _cons.

Table 3.7

Table 3.8

Table 3.9

Table 3.10
. gen age_black = age*black
. regress logccbal1 income kids educ age black hispanic other age_black if ccbal>0
Number of obs = 8484; F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, age_black, _cons.

Table 3.11
. gen age_hispanic = age*hispanic
. regress logccbal1 income kids educ age black hispanic other age_hispanic if ccbal>0
Number of obs = 8484; F(8, 8475)
Coefficient table for income, kids, educ, age, black, hispanic, other, age_hispanic, _cons.

Table 3.12
. gen age_other = age*other
. regress logccbal1 income kids educ age black hispanic other age_other if ccbal>0
Number of obs = 8484; F(8, 8475); Root MSE = .7203
Coefficient table for income, kids, educ, age, black, hispanic, other, age_other, _cons.

Diagnostic Graph 1: Histogram of Residuals (x-axis: Residuals; y-axis: Density)

Graph 2: Scatter Plot of Residuals vs. Fitted Values (x-axis: Fitted values; y-axis: Residuals)

Table 4: Normality test Stata output
. swilk logccbal1
Shapiro-Wilk W test for normal data (columns: Variable, Obs, W, V, z, Prob>z)
. sfrancia logccbal1
Shapiro-Francia W' test for normal data (columns: Variable, Obs, W', V', z, Prob>z)

Table 5: Test for multi-collinearity among explanatory variables
. corr income kids educ age
(obs=22090)
Correlation matrix of income, kids, educ, age.

References:
Federal Reserve. (2009). Survey of Consumer Finances [Data file].
StataCorp. (2010). StataSE (Version 11) [Computer software]. College Station, TX: StataCorp LP.

The Numbers Behind the MLB Anonymous Students: AD, CD, BM; (TF: Kevin Rader)

The Numbers Behind the MLB Anonymous Students: AD, CD, BM; (TF: Kevin Rader) The Numbers Behind the MLB Anonymous Students: AD, CD, BM; (TF: Kevin Rader) Abstract This project measures the effects of various baseball statistics on the win percentage of all the teams in MLB. Data

More information

MULTIPLE REGRESSION EXAMPLE

MULTIPLE REGRESSION EXAMPLE MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if

More information

College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US

College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US College Education Matters for Happier Marriages and Higher Salaries ----Evidence from State Level Data in the US Anonymous Authors: SH, AL, YM Contact TF: Kevin Rader Abstract It is a general consensus

More information

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015

Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015 Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015 References: Long 1997, Long and Freese 2003 & 2006 & 2014,

More information

Regression in Stata. Alicia Doyle Lynch Harvard-MIT Data Center (HMDC)

Regression in Stata. Alicia Doyle Lynch Harvard-MIT Data Center (HMDC) Regression in Stata Alicia Doyle Lynch Harvard-MIT Data Center (HMDC) Documents for Today Find class materials at: http://libraries.mit.edu/guides/subjects/data/ training/workshops.html Several formats

More information

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

More information

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS

DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS DETERMINANTS OF CAPITAL ADEQUACY RATIO IN SELECTED BOSNIAN BANKS Nađa DRECA International University of Sarajevo nadja.dreca@students.ius.edu.ba Abstract The analysis of a data set of observation for 10

More information

Discussion Section 4 ECON 139/239 2010 Summer Term II

Discussion Section 4 ECON 139/239 2010 Summer Term II Discussion Section 4 ECON 139/239 2010 Summer Term II 1. Let s use the CollegeDistance.csv data again. (a) An education advocacy group argues that, on average, a person s educational attainment would increase

More information

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

More information

REGRESSION LINES IN STATA

REGRESSION LINES IN STATA REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

More information

Regression Analysis. Data Calculations Output

Regression Analysis. Data Calculations Output Regression Analysis In an attempt to find answers to questions such as those posed above, empirical labour economists use a useful tool called regression analysis. Regression analysis is essentially a

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

Correlation and Regression

Correlation and Regression Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look

More information

August 2012 EXAMINATIONS Solution Part I

August 2012 EXAMINATIONS Solution Part I August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

More information

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING

MODEL I: DRINK REGRESSED ON GPA & MALE, WITHOUT CENTERING Interpreting Interaction Effects; Interaction Effects and Centering Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Models with interaction effects

More information

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2 University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Multinomial and Ordinal Logistic Regression

Multinomial and Ordinal Logistic Regression Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

More information

Interaction effects between continuous variables (Optional)

Interaction effects between continuous variables (Optional) Interaction effects between continuous variables (Optional) Richard Williams, University of Notre Dame, http://www.nd.edu/~rwilliam/ Last revised February 0, 05 This is a very brief overview of this somewhat

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors.

Failure to take the sampling scheme into account can lead to inaccurate point estimates and/or flawed estimates of the standard errors. Analyzing Complex Survey Data: Some key issues to be aware of Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 24, 2015 Rather than repeat material that is

More information

Lectures 8, 9 & 10. Multiple Regression Analysis

Lectures 8, 9 & 10. Multiple Regression Analysis Lectures 8, 9 & 0. Multiple Regression Analysis In which you learn how to apply the principles and tests outlined in earlier lectures to more realistic models involving more than explanatory variable and

MODELING AUTO INSURANCE PREMIUMS MODELING AUTO INSURANCE PREMIUMS Brittany Parahus, Siena College INTRODUCTION The findings in this paper will provide the reader with a basic knowledge and understanding of how Auto Insurance Companies

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame,  Last revised March 28, 2015 Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 28, 2015 NOTE: The routines spost13, lrdrop1, and extremes are

Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Multicollinearity Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Stata Example (See appendices for full example).. use http://www.nd.edu/~rwilliam/stats2/statafiles/multicoll.dta,

Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Interaction effects and group comparisons Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 20, 2015 Note: This handout assumes you understand factor variables,

Chapter 7: Simple linear regression Learning Objectives Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

Department of Economics Session 2012/2013. EC352 Econometric Methods. Solutions to Exercises from Week 10 + 0.0077 (0.052) Department of Economics Session 2012/2013 University of Essex Spring Term Dr Gordon Kemp EC352 Econometric Methods Solutions to Exercises from Week 10 1 Problem 13.7 This exercise refers back to Equation

Nonlinear Regression Functions. SW Ch 8 1/54/ Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General

Lecture 15. Endogeneity & Instrumental Variable Estimation Lecture 15. Endogeneity & Instrumental Variable Estimation Saw that measurement error (on right hand side) means that OLS will be biased (biased toward zero) Potential solution to endogeneity instrumental

Lecture 10: Logistical Regression II Multinomial Data. Prof. Sharyn O'Halloran Sustainable Development U9611 Econometrics II Lecture 10: Logistical Regression II Multinomial Data Prof. Sharyn O'Halloran Sustainable Development U9611 Econometrics II Logit vs. Probit Review Use with a dichotomous dependent variable Need a link

Handling missing data in Stata a whirlwind tour Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: IV AND CONTROL FUNCTIONS, II Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Quantile Treatment Effects 2. Control Functions

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

Introduction to Stata Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

Generalized Linear Models Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

Stata Walkthrough 4: Regression, Prediction, and Forecasting Stata Walkthrough 4: Regression, Prediction, and Forecasting Over drinks the other evening, my neighbor told me about his 25-year-old nephew, who is dating a 35-year-old woman. God, I can't see them getting

Linear Regression Models with Logarithmic Transformations Linear Regression Models with Logarithmic Transformations Kenneth Benoit Methodology Institute London School of Economics kbenoit@lse.ac.uk March 17, 2011 1 Logarithmic transformations of variables Considering

Data Analysis Methodology 1 Data Analysis Methodology 1 Suppose you inherited the database in Table 1.1 and needed to find out what could be learned from it fast. Say your boss entered your office and said, Here's some software project

is paramount in advancing any economy. For developed countries such as Introduction The provision of appropriate incentives to attract workers to the health industry is paramount in advancing any economy. For developed countries such as Australia, the increasing demand for

Regression step-by-step using Microsoft Excel Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.

The average hotel manager recognizes the criticality of forecasting. However, most Introduction The average hotel manager recognizes the criticality of forecasting. However, most managers are either frustrated by complex models researchers constructed or appalled by the amount of time

Quantitative Methods for Economics Tutorial 9. Katherine Eyal Quantitative Methods for Economics Tutorial 9 Katherine Eyal TUTORIAL 9 4 October 2010 ECO3021S Part A: Problems 1. In Problem 2 of Tutorial 7, we estimated the equation ŝleep = 3,638.25 − 0.148 totwrk

Econ 371 Problem Set #3 Answer Sheet Econ 371 Problem Set #3 Answer Sheet 4.3 In this question, you are told that an OLS regression analysis of average weekly earnings yields the following estimated model. AWE = 696.7 + 9.6 Age, R² = 0.023,

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal

A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing Sector Journal of Modern Accounting and Auditing, ISSN 1548-6583 November 2013, Vol. 9, No. 11, 1519-1525 D DAVID PUBLISHING A Panel Data Analysis of Corporate Attributes and Stock Prices for Indian Manufacturing

GLM I An Introduction to Generalized Linear Models GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

© 2015, Jeffrey S. Simonoff 1 Modeling Lowe's sales Forecasting sales is obviously of crucial importance to businesses. Revenue streams are random, of course, but in some industries general economic factors would be expected to have

Title. Syntax. stata.com. fp Fractional polynomial regression. Estimation Title stata.com fp Fractional polynomial regression Syntax Menu Description Options for fp Options for fp generate Remarks and examples Stored results Methods and formulas Acknowledgment References Also

Using Minitab for Regression Analysis: An extended example Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

GETTING STARTED: STATA & R BASIC COMMANDS ECONOMETRICS II. Stata Output Regression of wages on education GETTING STARTED: STATA & R BASIC COMMANDS ECONOMETRICS II Stata Output Regression of wages on education. sum wage educ Variable Obs Mean Std. Dev. Min Max -------------+--------------------------------------------------------

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question. Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

An Analysis of the Undergraduate Tuition Increases at the University of Minnesota Duluth Proceedings of the National Conference On Undergraduate Research (NCUR) 2012 Weber State University March 29-31, 2012 An Analysis of the Undergraduate Tuition Increases at the University of Minnesota Duluth

Quick Stata Guide by Liz Foster by Liz Foster Table of Contents Part 1: 1 describe 1 generate 1 regress 3 scatter 4 sort 5 summarize 5 table 6 tabulate 8 test 10 ttest 11 Part 2: Prefixes and Notes 14 by var: 14 capture 14 use of the

Testing for serial correlation in linear panel-data models The Stata Journal (2003) 3, Number 2, pp. 168–177 Testing for serial correlation in linear panel-data models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear panel-data

Determining Factors of a Quick Sale in Arlington's Condo Market. Team 2: Darik Gossa Roger Moncarz Jeff Robinson Chris Frohlich James Haas Determining Factors of a Quick Sale in Arlington's Condo Market Team 2: Darik Gossa Roger Moncarz Jeff Robinson Chris Frohlich James Haas Executive Summary The real estate market for condominiums in Northern

VOL. 4, NO. 4, September 2015 ISSN 2307-2466 International Journal of Economics, Finance and Management 2011-2015. All rights reserved. Credit Information Sharing and its Impact on Access to Bank Credit across Income Bracket Groupings Baah Aye Kusi, Kwadjo Ansah-Adu University of Ghana Business School, Department of Finance, Ghana Valley

Addressing Alternative. Multiple Regression. 17.871 Spring 2012 Addressing Alternative Explanations: Multiple Regression 17.871 Spring 2012 1 Did Clinton hurt Gore example Did Clinton hurt Gore in the 2000 election? Treatment is not liking Bill Clinton 2 Bivariate

Lab 5 Linear Regression with Within-subject Correlation. Goals: Data: Use the pig data which is in wide format: Lab 5 Linear Regression with Within-subject Correlation Goals: Data: Fit linear regression models that account for within-subject correlation using Stata. Compare weighted least square, GEE, and random

Quantitative Methods for Economics Tutorial 12. Katherine Eyal Quantitative Methods for Economics Tutorial 12 Katherine Eyal TUTORIAL 12 25 October 2010 ECO3021S Part A: Problems 1. State with brief reason whether the following statements are true, false or uncertain:

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 13 Student Lecture Notes Chapter 13 Introduction to Linear Regression and Correlation Analysis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaying

Yiming Peng, Department of Statistics. February 12, 2013 Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

STATE PLANNING ORGANIZATION DG FOR REGIONAL DEVELOPMENT AND STRUCTURAL ADJUSTMENT WORKING PAPER AN ECONOMETRIC ANALYSIS OF SURVEY STUDY ON BILKENT CYBERPARK AND BATI AKDENIZ

CHAPTER 5. Exercise Solutions CHAPTER 5 Exercise Solutions Chapter 5, Exercise Solutions, Principles of Econometrics EXERCISE 5.1

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS An Addendum to Negative Binomial Regression Cambridge University Press (2007) Joseph M. Hilbe 2008, All Rights Reserved This short monograph is intended

Basic Statistical and Modeling Procedures Using SAS Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

How to set the main menu of STATA to default factory settings standards University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

From this it is not clear what sort of variable that insure is so list the first 10 observations. MNL in Stata We have data on the type of health insurance available to 616 psychologically depressed subjects in the United States (Tarlov et al. 1989, JAMA; Wells et al. 1989, JAMA). The insurance is

SUMAN DUVVURU STAT 567 PROJECT REPORT SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

From the help desk: hurdle models The Stata Journal (2003) 3, Number 2, pp. 178–184 From the help desk: hurdle models Allen McDowell Stata Corporation Abstract. This article demonstrates that, although there is no command in Stata for

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

FIRM SPECIFIC FACTORS THAT DETERMINE INSURANCE COMPANIES PERFORMANCE IN ETHIOPIA FIRM SPECIFIC FACTORS THAT DETERMINE INSURANCE COMPANIES PERFORMANCE IN ETHIOPIA Daniel Mehari, MSc Arba Minch University, Arba Minch, Ethiopia Tilahun Aemiro, Msc Bahir Dar University, Bahir Dar, Ethiopia

Module 5: Multiple Regression Analysis Using Statistical Data to Make Decisions: Multiple Regression Analysis Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p. Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

Chapter 18. Effect modification and interactions. 18.1 Modeling effect modification Chapter 18 Effect modification and interactions 18.1 Modeling effect modification [Figure 18.1: weight plotted against dose for males and females]

MEASURING THE INVENTORY TURNOVER IN DISTRIBUTIVE TRADE MEASURING THE INVENTORY TURNOVER IN DISTRIBUTIVE TRADE Marijan Karić, Ph.D. Josip Juraj Strossmayer University of Osijek Faculty of Economics in Osijek Gajev trg 7, 31000 Osijek, Croatia Phone: +385 31

Developing Risk Adjustment Techniques Using the SAS® System for Assessing Health Care Quality in the IMSystem® Developing Risk Adjustment Techniques Using the SAS® System for Assessing Health Care Quality in the IMSystem® Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

xtmixed & denominator degrees of freedom: myth or magic xtmixed & denominator degrees of freedom: myth or magic 2011 Chicago Stata Conference Phil Ender UCLA Statistical Consulting Group July 2011 Phil Ender xtmixed & denominator degrees of freedom: myth or

A Predictive Model for NFL Rookie Quarterback Fantasy Football Points A Predictive Model for NFL Rookie Quarterback Fantasy Football Points Steve Bronder and Alex Polinsky Duquesne University Economics Department Abstract This analysis designs a model that predicts NFL rookie

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

Implied Volatility Skews in the Foreign Exchange Market. Empirical Evidence from JPY and GBP: 1997-2002 Implied Volatility Skews in the Foreign Exchange Market Empirical Evidence from JPY and GBP: 1997-2002 The Leonard N. Stern School of Business Glucksman Institute for Research in Securities Markets Faculty

EARLY VS. LATE ENROLLERS: DOES ENROLLMENT PROCRASTINATION AFFECT ACADEMIC SUCCESS? 2007-08 EARLY VS. LATE ENROLLERS: DOES ENROLLMENT PROCRASTINATION AFFECT ACADEMIC SUCCESS? 2007-08 PURPOSE Matthew Wetstein, Alyssa Nguyen & Brianna Hays The purpose of the present study was to identify specific

Getting Correct Results from PROC REG Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS's implementation of linear regression, is often used to fit a line without checking

DEMOGRAPHICS OF PAYDAY LENDING IN OKLAHOMA DEMOGRAPHICS OF PAYDAY LENDING IN OKLAHOMA Haydar Kurban, PhD Adji Fatou Diagne HOWARD UNIVERSITY CENTER ON RACE AND WEALTH 1840 7th street NW Washington DC, 20001 TABLE OF CONTENTS 1. Executive Summary

Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 Descriptive Statistics 2.1 Frequency Tables 2.2 Measures of Central Tendencies

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

Module 14: Missing Data Stata Practical Module 14: Missing Data Stata Practical Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine www.missingdata.org.uk Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

Applied Regression Analysis Using STATA Applied Regression Analysis Using STATA Josef Brüderl Regression analysis is the statistical method most often used in social research. The reason is that most social researchers are interested in identifying

Econometrics II. Lecture 9: Sample Selection Bias Econometrics II Lecture 9: Sample Selection Bias Måns Söderbom 5 May 2011 Department of Economics, University of Gothenburg. Email: mans.soderbom@economics.gu.se. Web: www.economics.gu.se/soderbom, www.soderbom.net.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation
