Statistics 9055 Chapter 2
Example: Children and Malaria
A random sample of 100 children aged 3 to 15 years was taken from a village in Ghana. The children were followed for a period of eight months. At the beginning of the study, each child's level of a particular antibody was measured. Based on observations during the study period, the children were categorized into two groups: those with and those without symptoms of malaria.
Variables in the Dataset
subject   subject code
age       age in years
ab        antibody level
mal       1 if the subject has malaria, 0 if not
Note: the response variable mal is Bernoulli.
Reading the Data into R
> library(ISwR)
> data(malaria)
> attach(malaria)
> head(malaria)
  subject age  ab mal
1       1  15 546   0
2       2  14 268   0
3       3  12 284   0
4       4  15  38   0
5       5  14 827   0
6       6  12 252   0
Treat age as a Factor
> malglm_full<-glm(mal~factor(age)+ab,family=binomial)
> summary(malglm_full)
Output
Call:
glm(formula = mal ~ factor(age) + ab, family = binomial)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.3984  -0.8654  -0.4969   0.9825   2.9660

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)   -3.725e-01  7.346e-01  -0.507   0.6121
factor(age)4   5.269e-01  1.013e+00   0.520   0.6029
factor(age)5   9.354e-01  1.063e+00   0.880   0.3788
factor(age)6  -1.746e+01  2.557e+03  -0.007   0.9946
factor(age)7  -3.462e-01  1.109e+00  -0.312   0.7549
factor(age)8  -2.571e-01  1.119e+00  -0.230   0.8184
factor(age)9   3.042e-01  9.845e-01   0.309   0.7574
factor(age)10 -1.938e-01  1.126e+00  -0.172   0.8633
factor(age)11  6.152e-02  1.155e+00   0.053   0.9575
factor(age)12 -4.302e-01  1.367e+00  -0.315   0.7530
factor(age)13 -1.732e+01  2.276e+03  -0.008   0.9939
factor(age)14 -6.800e-01  1.349e+00  -0.504   0.6141
factor(age)15 -1.132e-01  1.131e+00  -0.100   0.9203
ab            -2.369e-03  1.222e-03  -1.940   0.0524 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 116.652 on 99 degrees of freedom
Residual deviance: 96.904 on 86 degrees of freedom
AIC: 124.9

Number of Fisher Scoring iterations: 17
Run the Analysis without age
> malglm_ab<-glm(mal~ab,family=binomial)
> summary(malglm_ab)

Call:
glm(formula = mal ~ ab, family = binomial)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.9960  -0.8893  -0.6472   1.3766   2.8993

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.437616   0.292493  -1.496   0.1346
ab          -0.002665   0.001214  -2.196   0.0281 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 116.65 on 99 degrees of freedom
Residual deviance: 107.28 on 98 degrees of freedom
AIC: 111.28

Number of Fisher Scoring iterations: 6
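Likelihood Ratio Test for age
Recall
Full model (mal ~ factor(age) + ab): Residual deviance: 96.904 on 86 degrees of freedom
Reduced model (mal ~ ab): Residual deviance: 107.28 on 98 degrees of freedom
Likelihood ratio test calculation
> as.numeric(-2*logLik(malglm_full))
[1] 96.90391
> as.numeric(-2*logLik(malglm_ab))
[1] 107.2765
> lrt<-as.numeric(-2*(logLik(malglm_ab)-logLik(malglm_full)))
> lrt
[1] 10.37261
> pchisq(lrt,12,lower.tail=FALSE)
[1] 0.5833076
The test has 98 - 86 = 12 degrees of freedom, one for each age coefficient dropped. With p = 0.58, there is no evidence that age is needed once ab is in the model.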
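The same test can be reproduced from the printed summaries alone, since for Bernoulli data the residual deviance equals -2 times the log-likelihood. The following is a minimal sketch; the variable names are illustrative, and the numbers are copied from the output above rather than recomputed from the data.

```r
# Residual deviances and degrees of freedom copied from the two fits above.
dev_ab   <- 107.2765; df_ab   <- 98   # reduced model: mal ~ ab
dev_full <- 96.90391; df_full <- 86   # full model: mal ~ factor(age) + ab

lrt     <- dev_ab - dev_full          # likelihood ratio statistic
df_lrt  <- df_ab - df_full            # number of age coefficients dropped
p_value <- pchisq(lrt, df_lrt, lower.tail = FALSE)

print(c(lrt = lrt, df = df_lrt, p = p_value))
```

With the fitted model objects in the workspace, anova(malglm_ab, malglm_full, test = "Chisq") should report the same comparison in one call.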
Example: Animal Testing
Data File
dead alive dose spleen
   0     5    3   0
   1     4    4   0
   0     5    5   0
   0     5    6   0
   4     2    7   0
   5     1    8   0
   0     5    3   0.25
   0     5    4   0.25
   2     3    5   0.25
   4     2    6   0.25
   5     1    7   0.25
   5     0    8   0.25
   0     5    3   0.5
   1     4    4   0.5
   5     1    5   0.5
   6     0    6   0.5
   4     1    7   0.5
   5     0    8   0.5
   0     6    3   0.75
   2     4    4   0.75
   5     0    5   0.75
   5     0    6   0.75
   5     0    7   0.75
   5     0    8   0.75
   4     2    3   1
   5     1    4   1
   4     1    5   1
   5     0    6   1
   5     0    7   1
   5     0    8   1
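If the data file is not at hand, the 30 rows can be entered directly. This is a sketch that assumes the row order shown in the data file (six doses within each spleen level); the object name animals is chosen here for illustration.

```r
# Enter the 30 rows of the data file by hand.
# Rows are ordered as dose 3..8 within each spleen level 0, 0.25, ..., 1.
animals <- data.frame(
  dead   = c(0,1,0,0,4,5, 0,0,2,4,5,5, 0,1,5,6,4,5, 0,2,5,5,5,5, 4,5,4,5,5,5),
  alive  = c(5,4,5,5,2,1, 5,5,3,2,1,0, 5,4,1,0,1,0, 6,4,0,0,0,0, 2,1,1,0,0,0),
  dose   = rep(3:8, times = 5),
  spleen = rep(c(0, 0.25, 0.5, 0.75, 1), each = 6)
)

# Observed proportion dead in each dose-by-spleen group.
animals$p <- with(animals, dead / (dead + alive))
head(animals)
```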
Initial Manipulations
> animals<-read.table("animaltesting.txt",header=TRUE)
> attach(animals)
> head(animals)
  dead alive dose spleen
1    0     5    3      0
2    1     4    4      0
3    0     5    5      0
4    0     5    6      0
5    4     2    7      0
6    5     1    8      0
> y<-cbind(dead,alive)
> p<-dead/(dead+alive)
> stripchart(p~dose)
> stripchart(p~spleen)
[Figure: strip charts of p (horizontal axis, 0.0 to 1.0) grouped by dose (3 to 8) and by spleen (0, 0.25, 0.5, 0.75, 1)]
Analysis I
> glm_animal1<-glm(y~dose+spleen,family=binomial(link=logit))
> summary(glm_animal1)

Call:
glm(formula = y ~ dose + spleen, family = binomial(link = logit))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.73921  -0.66524   0.09684   0.49472   1.86060

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.3840     1.7507  -5.931 3.00e-09 ***
dose          1.5572     0.2588   6.018 1.77e-09 ***
spleen        5.7412     1.0935   5.251 1.52e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 135.601 on 29 degrees of freedom
Residual deviance: 24.057 on 27 degrees of freedom
AIC: 55.504

Number of Fisher Scoring iterations: 6
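To read the Analysis I coefficients on the probability scale, the linear predictor can be passed through the inverse logit. The sketch below uses the estimates printed above; p_hat is a hypothetical helper introduced here for illustration, not part of the original analysis.

```r
# Coefficient estimates copied from the Analysis I summary.
b0 <- -10.3840; b_dose <- 1.5572; b_spleen <- 5.7412

# Hypothetical helper: fitted death probability under the logit link,
# p = plogis(b0 + b_dose * dose + b_spleen * spleen).
p_hat <- function(dose, spleen) plogis(b0 + b_dose * dose + b_spleen * spleen)

# Fitted probabilities across doses 3..8 with half of the spleen removed.
round(p_hat(dose = 3:8, spleen = 0.5), 3)
```

Because both slopes are positive, the fitted probability of death increases with dose and with the fraction of spleen removed.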
Analysis II
> glm_animal2<-glm(y~factor(dose)+factor(spleen),family=binomial(link=logit))
> summary(glm_animal2)

Call:
glm(formula = y ~ factor(dose) + factor(spleen), family = binomial(link = logit))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.77358  -0.33470   0.09222   0.49415   2.02356

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)         -6.2057     1.1985  -5.178 2.25e-07 ***
factor(dose)4        1.7119     0.9175   1.866 0.062056 .
factor(dose)5        3.9596     1.0510   3.768 0.000165 ***
factor(dose)6        5.0161     1.1258   4.455 8.37e-06 ***
factor(dose)7        6.2780     1.2375   5.073 3.92e-07 ***
factor(dose)8        8.0508     1.5458   5.208 1.91e-07 ***
factor(spleen)0.25   1.7227     0.7978   2.159 0.030817 *
factor(spleen)0.5    3.2909     0.9255   3.556 0.000377 ***
factor(spleen)0.75   4.0851     1.0191   4.008 6.11e-05 ***
factor(spleen)1      6.2286     1.2068   5.161 2.46e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 135.601 on 29 degrees of freedom
Residual deviance: 21.347 on 20 degrees of freedom
AIC: 66.794

Number of Fisher Scoring iterations: 6
Significance Tests: Analysis II
> glm_animal2_spleen<-glm(y~factor(spleen),family=binomial)
> glm_animal2_dose<-glm(y~factor(dose),family=binomial)
> glm_animal2$deviance
[1] 21.34725
> glm_animal2_dose$deviance
[1] 74.777
> glm_animal2_spleen$deviance
[1] 110.2314
> devfull<-glm_animal2$deviance
> devdose<-glm_animal2_dose$deviance
> devspleen<-glm_animal2_spleen$deviance
Testing for the significance of the dosages (dose dropped from the full model; 6 levels, so 5 degrees of freedom)
> pchisq(devspleen-devfull,5,lower.tail=FALSE)
[1] 1.152612e-17
Testing for the significance of the amount of spleen that is removed (spleen dropped from the full model; 5 levels, so 4 degrees of freedom)
> pchisq(devdose-devfull,4,lower.tail=FALSE)
[1] 6.927712e-11
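The two deviance-difference tests above can also be obtained in a single call with drop1(), which refits the model with each term removed in turn. This is a sketch under the assumption that the data are entered by hand as in the data file; the object names are illustrative.

```r
# Data entered by hand, in the row order of the data file
# (dose 3..8 within each spleen level).
animals <- data.frame(
  dead   = c(0,1,0,0,4,5, 0,0,2,4,5,5, 0,1,5,6,4,5, 0,2,5,5,5,5, 4,5,4,5,5,5),
  alive  = c(5,4,5,5,2,1, 5,5,3,2,1,0, 5,4,1,0,1,0, 6,4,0,0,0,0, 2,1,1,0,0,0),
  dose   = rep(3:8, times = 5),
  spleen = rep(c(0, 0.25, 0.5, 0.75, 1), each = 6)
)

# Full model from Analysis II, with both predictors treated as factors.
full <- glm(cbind(dead, alive) ~ factor(dose) + factor(spleen),
            family = binomial, data = animals)

# Likelihood ratio (deviance difference) test for dropping each term.
lr <- drop1(full, test = "Chisq")
print(lr)
```

The rows labelled factor(dose) and factor(spleen) should correspond to the 5 and 4 degree-of-freedom tests computed by hand above.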