CRJ Doctoral Comprehensive Exam Statistics Friday August 23, :00pm 5:30pm


Instructions: Answer all questions below.

Question I: Data Collection and Bivariate Hypothesis Testing

Answer the following questions as they pertain to bivariate statistical approaches to testing for group differences and variable association.

a) The t-test, ANOVA, and chi-square test are all ways of detecting variable associations by examining group differences and associations. In what instance would you expect each of the three tests to be used?
b) Pertaining to the first two tests listed above, how are the formal null hypotheses stated? What do these formal statements mean?
c) What is sampling theory? How is sampling theory linked to probability, and how does this underlie our ability to produce reliable statistics within reasonable levels of confidence?
d) Suppose you must choose the one- or two-tailed version of certain tests mentioned above. In what cases would a one-tailed test be appropriate? In what cases would a two-tailed test be appropriate? Why?
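As a rough illustration of how the first two tests in part a) relate, the sketch below computes a pooled-variance t statistic and a one-way ANOVA F statistic by hand. The two groups and their values are invented for illustration only and are not taken from the exam data; the function names are likewise hypothetical. With exactly two groups the two tests agree, since F equals t squared.

```python
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t statistic for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical crime rates for two groups of counties (made-up numbers).
south = [52.0, 61.0, 48.0, 70.0, 66.0]
non_south = [41.0, 39.0, 55.0, 47.0, 44.0]

t = pooled_t(south, non_south)
f = one_way_f(south, non_south)
print(f"t = {t:.3f}, t^2 = {t * t:.3f}, F = {f:.3f}")
```

ANOVA becomes necessary once a third group is added, because a single t statistic can only compare two means at a time.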

Question II: Multivariate Regression Analysis

OLS (see attached output)

Familial disruption has been linked to higher levels of social disorganization and crime rates in research in the area of ecological criminology. However, levels of familial disruption have also been shown to be significantly related to regional differences in crime rates. Using county-level data, the attached output has been compiled to test for the potential effects of being a Southern county (south) and the county-level percent divorced (pctdiv) on the index crime rate of the county (indexrt). Interpret the output by detailing the results of the analysis and referring to the appropriate tables in your attempt to answer this question. Be sure to properly, and formally, interpret all appropriate statistics from the output. In doing so, focus on three basic research questions:

1) What are the basic assumptions of the OLS regression approach? How is each tested in this case? Do these data violate any of these assumptions?
2) Interpret all useful statistical output.
3) If we wanted to test whether the relationship between familial disruption and crime rates at the county level is related to the region of the country in which the county is located, how would we do that in both mediation and moderation form?

Logistic (see attached output)

Using survey data associated with conditions, fear, and demographics, suppose an analysis of one's fear of their [...] was conducted. In the dataset there are a series of variables, including a binary indicator of fear (1 = ever feeling unsafe in one's [...], in reference to 0 = never feeling unsafe). For this question, then, we are predicting ever feeling unsafe in one's [...] by race (being white), gender (being male), and age. Interpret the output by detailing the results of the analysis and referring to the appropriate tables in your attempt to answer this question. Be sure to properly, and formally, interpret all appropriate statistics from the output.

In doing so, focus on three basic research questions/directives:

1) What are the basic assumptions of the logistic regression approach? How does this differ from the OLS approach, and what inherent violations of the OLS approach make using logistic regression necessary (hint: refer to violations of OLS assumptions)?
2) What is the nature of the Block 0 and Block 1 portions of the output? What does each section represent?
3) Interpret all useful statistical output.
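For the third OLS sub-question, one common way to write the two forms is sketched below, using the variable names from the question (indexrt, pctdiv, south). The causal ordering in the mediation version (region affecting crime through divorce) is an assumption made for illustration, not something the exam states, and the coefficient symbols are arbitrary.

```latex
% Moderation: does the pctdiv slope differ by region?
\begin{align*}
\text{indexrt}_i &= \beta_0 + \beta_1\,\text{pctdiv}_i + \beta_2\,\text{south}_i
                  + \beta_3\,(\text{pctdiv}_i \times \text{south}_i) + \varepsilon_i,
\qquad H_0\colon \beta_3 = 0
\end{align*}

% Mediation (assumed ordering: south -> pctdiv -> indexrt)
\begin{align*}
\text{indexrt}_i &= c_0 + c\,\text{south}_i + e_{1i} \\
\text{pctdiv}_i  &= a_0 + a\,\text{south}_i + e_{2i} \\
\text{indexrt}_i &= c_0' + c'\,\text{south}_i + b\,\text{pctdiv}_i + e_{3i}
\end{align*}
% With OLS on the same cases, the total effect decomposes as c = c' + ab,
% so the indirect (mediated) effect is the product ab.
```

In the moderation form the interest is in the interaction coefficient; in the mediation form it is in how far the region coefficient shrinks from c to c' once pctdiv enters the model.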

Regression (Question 2, Part 1)

Variables Entered/Removed (b)
Variables Entered: % of the population divorced, Southern County Indicator
Variables Removed: .
Method: Enter
b. Dependent Variable: County Crime Rate per,

Model Summary (b)
R: .36 (a)
R Square: .93
Adjusted R Square: .92
Std. Error of the Estimate: 29.59338
Durbin-Watson: .79
a. Predictors: (Constant), % of the population divorced, Southern County Indicator
b. Dependent Variable: County Crime Rate per,

ANOVA (b)
              Sum of Squares   df    Mean Square   F        Sig.
Regression    2235.52          2     67.726        69.673   (a)
Residual      89.26            353   875.768
Total         3699.698         355
a. Predictors: (Constant), % of the population divorced, Southern County Indicator
b. Dependent Variable: County Crime Rate per,

Coefficients (a)
Rows: (Constant); Southern County Indicator; % of the population divorced
Unstandardized Coefficients (B, Std. Error): 32.689.22 2..953.39.68
Standardized Coefficients (Beta): .297 .23
t: 26.769 .85 .826
Sig.: .9
Collinearity Statistics, Tolerance: .886 .886
Collinearity Statistics, VIF: .29 .29
a. Dependent Variable: County Crime Rate per,

Collinearity Diagnostics (a)
Dimension: 2 3
Eigenvalue: 2.223 .529 .28
Condition Index: 2.5 2.997
Variance Proportions, (Constant): .7 .8 .75
Variance Proportions, Southern County Indicator: .8 .88 .
Variance Proportions, % of the population divorced: .7 .6 .88
a. Dependent Variable: County Crime Rate per,

Casewise Diagnostics (a)
Columns: Case Number; Std. Residual; County Crime Rate per, ; Predicted Value; Residual
63 3.5 23.89 3.9662 88.92382
238 3.65 2.55 3.6885 7.8655
292 3.3 8.3 55.75 92.6595
359 5.756 227.27 56.9285 7.38
563.2 332.27 32.7583 299.57
62 6.2 27.2 57.229 89.9977
9 3.62 .5 32.9 7.23899
3.27 22.96 33.3693 89.597
3 3.396 5.88 5.373 .5655
3.952 52.3 35.3967 6.9335
2 3.832 6.33 32.9388 3.392
5 3.3 33.2 3.2 98.9559
28 3.99 59.66 56.23 3.53688
a. Dependent Variable: County Crime Rate per,
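When reading the collinearity statistics, it may help to recall that tolerance and VIF are reciprocals of each other, and that tolerance is one minus the R-squared from regressing a predictor on the other predictors. The sketch below applies that relationship to the tolerance value of .886 printed above; the function name is hypothetical and the snippet is only a reading aid, not part of the SPSS output.

```python
def vif_from_tolerance(tolerance: float) -> float:
    """Variance inflation factor as the reciprocal of tolerance."""
    if not 0.0 < tolerance <= 1.0:
        raise ValueError("tolerance must lie in (0, 1]")
    return 1.0 / tolerance


tolerance = 0.886              # value printed for both predictors
shared_r2 = 1.0 - tolerance    # R^2 of one predictor regressed on the other
vif = vif_from_tolerance(tolerance)
print(f"shared R^2 = {shared_r2:.3f}, VIF = {vif:.3f}")
```

A VIF this close to 1 indicates that the two predictors share little variance; commonly cited rules of thumb only flag much larger values.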

Residuals Statistics (a)
Columns: Minimum; Maximum; Mean; Std. Deviation; N
Predicted Value: 32.6888 58.55 38.9 9.96 356
Residual: -57.3899 299.575 29.5753 356
Std. Predicted Value: -.656 2.27 356
Std. Residual: -.937 .2 .999 356
a. Dependent Variable: County Crime Rate per,

Charts

[Figure: Histogram. Dependent Variable: County Crime Rate per, . x-axis: Regression Standardized Residual; y-axis: Frequency. Mean = 2.79E-5, Std. Dev. = .999, N = ,356]

[Figure: Normal P-P Plot of Regression Standardized Residual. Dependent Variable: County Crime Rate per, . x-axis: Observed Cum Prob; y-axis: Expected Cum Prob]

[Figure: Scatterplot. Dependent Variable: County Crime Rate per, . x-axis: Regression Standardized Predicted Value; y-axis: Regression Standardized Residual]

Logistic Regression (Question 2, Part 2)

Case Processing Summary
Unweighted Cases (a): Selected Cases (Included in Analysis; Missing Cases; Total); Unselected Cases; Total
N: 52 52 52
Percent: .....
a. If weight is in effect, see classification table for the total number of cases.

Dependent Variable Encoding
Original Value; Internal Value: have felt unsafe

Block 0: Beginning Block

Classification Table (a, b)
Observed: have felt unsafe
Predicted, have felt unsafe: 939 63
Percentage Correct: . .
Overall Percentage: 6.9
a. Constant is included in the model.
b. The cut value is .5

Variables in the Equation
Columns: B; S.E.; Wald; df; Sig.; Exp(B)
Constant: -.3 .52 72.29 .62

Variables not in the Equation
Columns: Score; df; Sig.
black: .29 .39
gender: 26.5
age: 7.97
emp_ft: .25 .263
Overall Statistics: 5.865

Block 1: Method = Enter

Omnibus Tests of Model Coefficients
Columns: Chi-square; df; Sig.
Step: 5.583
Block: 5.583
Model: 5.583

Model Summary
-2 Log likelihood: 22.278 (a)
Cox & Snell R Square: .33
Nagelkerke R Square: .5
a. Estimation terminated at iteration number 3 because parameter estimates changed by less than ..

Classification Table (a)
Observed: have felt unsafe
Predicted, have felt unsafe: 85 96 85 7
Percentage Correct: 9.9 7.7
Overall Percentage: 62.3
a. The cut value is .5

Variables in the Equation (a)
Columns: B; S.E.; Wald; df; Sig.; Exp(B)
black: .2 . .533 .26 .52
gender: -.558 . 25.53 .573
age: -.5 .3 2.666 .985
emp_ft: -.8 . 2.62 .5 .835
Constant: .52 .96 7.39 .8 .682
a. Variable(s) entered on step : black, gender, age, emp_ft.
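To interpret the Exp(B) column, recall that it is the exponentiated logit coefficient, that is, an odds ratio. The sketch below applies this to the gender row as printed above (B = -.558, Exp(B) = .573); the helper names are hypothetical, and the small difference from the printed value is assumed to come from rounding of B in the output.

```python
import math


def odds_ratio(b: float) -> float:
    """Odds ratio implied by a logit coefficient B."""
    return math.exp(b)


def percent_change_in_odds(b: float) -> float:
    """Percent change in the odds for a one-unit increase in the predictor."""
    return (math.exp(b) - 1.0) * 100.0


b_gender = -0.558  # coefficient printed for gender
print(f"Exp(B) = {odds_ratio(b_gender):.3f}")
print(f"change in odds = {percent_change_in_odds(b_gender):.1f}%")
```

Read this way, and holding the other predictors constant, the odds of ever having felt unsafe for the group coded 1 on gender are a little under 0.6 times the odds for the group coded 0, which is a decrease in the odds of roughly 43 percent.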