# Package hier.part. February 20, 2015


## Transcription

Package hier.part, February 20, 2015

- Title: Hierarchical Partitioning
- Author: Chris Walsh and Ralph Mac Nally
- Depends: gtools
- Description: Variance partition of a multivariate data set
- Maintainer: Chris Walsh
- License: GPL
- Repository: CRAN
- NeedsCompilation: yes

R topics documented:

- all.regs
- combos
- hier.part
- hier.part.data
- partition
- rand.hp

### all.regs: Goodness of Fit Measures for a Regression Hierarchy

#### Description

Calculates goodness of fit measures for regressions of a single dependent variable to all combinations of N independent variables.

#### Usage

    all.regs(y, xcan, family = "gaussian", gof = "RMSPE", print.vars = FALSE)

#### Arguments

- `y`: a vector containing the dependent variable
- `xcan`: a data frame containing the n independent variables
- `family`: family argument of `glm`
- `gof`: goodness-of-fit measure. Currently "RMSPE" (root-mean-square prediction error), "loglik" (log-likelihood) or "Rsqu" (R-squared)
- `print.vars`: if FALSE, the function returns a vector of goodness-of-fit measures. If TRUE, a data frame is returned with the first column listing variable combinations and the second column listing goodness-of-fit measures.

#### Details

This function calculates goodness of fit measures for the entire hierarchy of models using all combinations of N independent variables, and returns them as an ordered list ready for input into the function `partition`. This function requires the gtools package in the gregmisc bundle.

#### Value

- `gfs`: if `print.vars` is FALSE, a vector of goodness of fit measures for all combinations of independent variables in the hierarchy or, if `print.vars` is TRUE, a data frame listing all combinations of independent variables in the first column in ascending order, and the corresponding goodness of fit measure for the model using those variables

#### Author

Chris Walsh

#### References

- Hatt, B. E., Fletcher, T. D., Walsh, C. J. and Taylor, S. L. (2004). The influence of urban density and drainage infrastructure on the concentrations and loads of pollutants in small streams. Environmental Management 34.
- Walsh, C. J., Papas, P. J., Crowther, D., Sim, P. T., and Yoo, J. (2004). Stormwater drainage pipes as a threat to a stream-dwelling amphipod of conservation significance, Austrogammarus australis, in southeastern Australia. Biodiversity and Conservation 13.

#### See Also

hier.part, partition, rand.hp

#### Examples

    # linear regression of log(electrical conductivity) in streams
    # against seven independent variables describing catchment
    # characteristics (from Hatt et al. 2004).
    data(urbanwq)
    env <- urbanwq[, 2:8]
    all.regs(urbanwq$lec, env, family = "gaussian", gof = "Rsqu", print.vars = TRUE)

    # logistic regression of an amphipod species occurrence in
    # streams against four independent variables describing
    # catchment characteristics (from Walsh et al. 2004).
    data(amphipod)
    env1 <- amphipod[, 2:5]
    all.regs(amphipod$australis, env1, family = "binomial", gof = "loglik", print.vars = TRUE)

### combos: All Combinations of a Hierarchy of Models of n Variables

#### Description

Lists a matrix of combinations of 1 to n variables in ascending order.

#### Usage

    combos(n)

#### Arguments

- `n`: an integer: the number of variables

#### Details

Lists the hierarchy of all possible combinations of n variables in ascending order, starting with 1 variable, then all combinations of 2 variables, and so on until the one combination with all n variables. This function is used by `all.regs` to structure the models required for hierarchical partitioning. This function requires the gtools package in the gregmisc bundle.

#### Value

A list containing:

- `ragged`: a matrix with zeroes in empty elements and 1 in column 1, 2 in column 2, ..., n in column n for full elements
- `binary`: a matrix as for `ragged`, but with 1 in all full elements
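The layout described for `combos` can be illustrated in base R without the package installed. The following is a minimal sketch, not the package's implementation: `combos_sketch` is a hypothetical name, it uses `utils::combn` rather than gtools, and the exact row order within each size class is an assumption (lexicographic here).

```r
# Illustrative sketch only: combos_sketch() is a hypothetical helper,
# not the hier.part function combos().
combos_sketch <- function(n) {
  rows <- list()
  for (k in seq_len(n)) {
    # combn(n, k) gives a k x choose(n, k) matrix, one subset per column
    cmb <- utils::combn(n, k)
    for (j in seq_len(ncol(cmb))) {
      # 1 where the variable is in the subset, 0 otherwise
      rows[[length(rows) + 1L]] <- as.integer(seq_len(n) %in% cmb[, j])
    }
  }
  binary <- do.call(rbind, rows)
  # "ragged" layout: column j holds j for full elements, 0 for empty ones
  ragged <- sweep(binary, 2, seq_len(n), `*`)
  list(ragged = ragged, binary = binary)
}

res <- combos_sketch(3)
print(res$ragged)
print(res$binary)
```

For n variables the matrices have 2^n - 1 rows, which is why the size of the hierarchy grows quickly with the number of independent variables.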

#### Author

Chris Walsh

#### See Also

all.regs

### hier.part: Goodness of Fit Calculation and Hierarchical Partitioning

#### Description

Partitions variance in a multivariate dataset.

#### Usage

    hier.part(y, xcan, family = "gaussian", gof = "RMSPE", barplot = TRUE)

#### Arguments

- `y`: a vector containing the dependent variable
- `xcan`: a data frame containing the n independent variables
- `family`: family argument of `glm`
- `gof`: goodness-of-fit measure. Currently "RMSPE" (root-mean-square prediction error), "loglik" (log-likelihood) or "Rsqu" (R-squared)
- `barplot`: if TRUE, a barplot of I and J for each variable is plotted, expressed as a percentage of total explained variance

#### Details

This function calculates goodness of fit measures for the entire hierarchy of models using all combinations of N independent variables using the function `all.regs`. It takes the list of goodness of fit measures and, using the `partition` function, applies the hierarchical partitioning algorithm of Chevan and Sutherland (1991) to return a simple table listing each variable, its independent contribution (I) and its conjoint contribution with all other variables (J).

Note that earlier versions of hier.part (<1.0) produced a matrix and barplot of percentage distribution of effects as a percentage of the sum of all Is and Js, as portrayed in Hatt et al. (2004) and Walsh et al. (2004). This version plots the percentage distribution of independent effects only. The sum of Is equals the goodness of fit measure of the full model minus the goodness of fit measure of the null model.

The distribution of joint effects shows the relative contribution of each variable to shared variability in the full model. Negative joint effects are possible for variables that act as suppressors of other variables (Chevan and Sutherland 1991).

At this stage, the partition routine will not run for more than 12 independent variables. This function requires the gtools package in the gregmisc bundle.
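To make the I and J quantities concrete, here is a minimal base-R sketch of hierarchical partitioning for R-squared with a linear model. It is not the package's C/Fortran routine: `hp_sketch` is a hypothetical name, the built-in `mtcars` data stand in for the package's example data, and the averaging scheme (mean increase in fit within each hierarchy level, then the mean across levels) is my reading of Chevan and Sutherland (1991). It also assumes J is the single-variable fit minus I.

```r
# Illustrative sketch only: hp_sketch() is a hypothetical helper,
# not the hier.part implementation.
hp_sketch <- function(y, x) {
  p <- ncol(x)

  # R-squared of the linear model on the columns in idx (0 for the null model)
  rsq <- function(idx) {
    if (length(idx) == 0) return(0)
    d <- data.frame(y = y, x[, idx, drop = FALSE])
    summary(lm(y ~ ., data = d))$r.squared
  }

  I <- numeric(p)
  for (j in seq_len(p)) {
    others <- setdiff(seq_len(p), j)
    for (k in 0:(p - 1)) {
      # every subset of size k of the other variables; indexing into
      # `others` avoids combn()'s special handling of a single number
      sets <- if (k == 0) {
        list(integer(0))
      } else {
        utils::combn(length(others), k, FUN = function(i) others[i],
                     simplify = FALSE)
      }
      # increase in fit when variable j joins each of those subsets
      gain <- vapply(sets, function(s) rsq(c(s, j)) - rsq(s), numeric(1))
      # average within the level, then across the p levels
      I[j] <- I[j] + mean(gain) / p
    }
  }

  total <- vapply(seq_len(p), function(j) rsq(j), numeric(1))
  data.frame(I = I, J = total - I, Total = total, row.names = names(x))
}

x <- mtcars[, c("wt", "hp", "disp")]
res <- hp_sketch(mtcars$mpg, x)
print(res)

# The independent contributions should sum to the full-model R-squared
# (the null-model R-squared is 0 here)
full <- summary(lm(mpg ~ wt + hp + disp, data = mtcars))$r.squared
print(all.equal(sum(res$I), full))
```

The number of models fitted grows as 2^p, so a brute-force version like this is only practical for a handful of variables.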

#### Value

A list containing:

- `gfs`: a data frame or vector listing all combinations of independent variables in the first column in ascending order, and the corresponding goodness of fit measure for the model using those variables
- `IJ`: a data frame of I, the independent contribution, and J, the joint contribution, for each independent variable
- `I.perc`: a data frame of I as a percentage of total explained variance

#### Note

The function produces a minor rounding error for analyses with more than 9 independent variables. To check if this error affects the inference from an analysis, run the analysis several times with the variables entered in a different order. There are no known problems for analyses with 9 or fewer variables.

#### Author

Chris Walsh, using C and Fortran code written by Ralph Mac Nally

#### References

- Chevan, A. and Sutherland, M. (1991). Hierarchical Partitioning. The American Statistician 45.
- Hatt, B. E., Fletcher, T. D., Walsh, C. J. and Taylor, S. L. (2004). The influence of urban density and drainage infrastructure on the concentrations and loads of pollutants in small streams. Environmental Management 34.
- Mac Nally, R. Regression and model building in conservation biology, biogeography and ecology: the distinction between and reconciliation of predictive and explanatory models. Biodiversity and Conservation 9.
- Walsh, C. J., Papas, P. J., Crowther, D., Sim, P. T., and Yoo, J. (2004). Stormwater drainage pipes as a threat to a stream-dwelling amphipod of conservation significance, Austrogammarus australis, in southeastern Australia. Biodiversity and Conservation 13.

#### See Also

all.regs, partition, rand.hp

#### Examples

    # linear regression of log(electrical conductivity) in
    # streams against seven independent variables
    # describing catchment characteristics (from
    # Hatt et al. 2004)
    data(urbanwq)

    env <- urbanwq[, 2:8]
    hier.part(urbanwq$lec, env, family = "gaussian", gof = "Rsqu")

    # logistic regression of an amphipod species occurrence in
    # streams against four independent variables describing
    # catchment characteristics (from Walsh et al. 2004).
    data(amphipod)
    env1 <- amphipod[, 2:5]
    hier.part(amphipod$australis, env1, family = "binomial", gof = "loglik")

### hier.part.data: Example Data for hier.part

#### Description

Example data sets for the hier.part package.

#### Usage

    data(urbanwq)
    data(amphipod)
    data(chevan)

#### Format

- `urbanwq.txt`: Seven catchment variables (`fimp`, catchment imperviousness^0.25; `sconn`, drainage connection^0.5; `sdensep`, septic tank density^0.5; `unsealden`, unsealed road density; `fcarea`, catchment area^0.25; `selev`, elevation^0.5; `amgeast`, longitude) and median baseflow concentrations of three water quality variables in sixteen streams draining 16 independent subcatchments (`ldoc`, log(dissolved organic carbon); `lec`, log(electrical conductivity); `lnox`, log(nitrate/nitrite)). Data from Hatt et al. (2004).
- `amphipod.txt`: Presence-absence data for a stream-dwelling amphipod (`australis`) in 58 sites, with four catchment variables (`fimp`, imperviousness^0.25; `fconn`, drainage connection^0.25; `densep`, density of septic tanks; `unseal`, unsealed road density). Data from Walsh et al. (2004).
- `chevan.txt`: Chi-squared (`chisq`) and R-squared (`rsqu`) for the logistic and linear regression examples from Chevan and Sutherland (1991).

#### Author

Chris Walsh

#### References

- Chevan, A. and Sutherland, M. (1991). Hierarchical Partitioning. The American Statistician 45.
- Hatt, B. E., Fletcher, T. D., Walsh, C. J. and Taylor, S. L. (2004). The influence of urban density and drainage infrastructure on the concentrations and loads of pollutants in small streams. Environmental Management 34.
- Walsh, C. J., Papas, P. J., Crowther, D., Sim, P. T., and Yoo, J. (2004). Stormwater drainage pipes as a threat to a stream-dwelling amphipod of conservation significance, Austrogammarus australis, in southeastern Australia. Biodiversity and Conservation 13.

### partition: Hierarchical Partitioning from a List of Goodness of Fit Measures

#### Description

Partitions variance in a multivariate dataset from a list of goodness of fit measures.

#### Usage

    partition(gfs, pcan, var.names = NULL)

#### Arguments

- `gfs`: an array as output by the function `all.regs`, or a vector of goodness of fit measures from a hierarchy of regressions based on `pcan` variables in ascending order (as produced by the function `combos`, but also including the null model as the first element)
- `pcan`: the number of variables from which the hierarchy was constructed (maximum = 12)
- `var.names`: an array of `pcan` variable names, if required

#### Details

This function applies the hierarchical partitioning algorithm of Chevan and Sutherland (1991) to return a simple table listing each variable, its independent contribution (I) and its conjoint contribution with all other variables (J). The output is identical to that of the function `hier.part`, which takes the dependent and independent variable data as its input.

Note that earlier versions of partition (hier.part<1.0) produced a matrix and barplot of percentage distribution of effects as a percentage of the sum of all Is and Js, as portrayed in Hatt et al. (2004) and Walsh et al. (2004). This version plots the percentage distribution of independent effects only. The sum of Is equals the goodness of fit measure of the full model minus the goodness of fit measure of the null model.

The distribution of joint effects shows the relative contribution of each variable to shared variability in the full model. Negative joint effects are possible for variables that act as suppressors of other variables (Chevan and Sutherland 1991).

At this stage, the partition routine will not run for more than 12 independent variables. This function requires the gtools package in the gregmisc bundle.

#### Value

A list containing:

- `gfs`: a data frame listing all combinations of independent variables in the first column in ascending order, and the corresponding goodness of fit measure for the model using those variables
- `IJ`: a data frame of I, the independent contribution, and J, the joint contribution, for each independent variable
- `I.perc`: a data frame of I as a percentage of total explained variance
- `J.perc`: a data frame of J as a percentage of the sum of all Js

#### Note

The function produces a minor rounding error for hierarchies constructed from more than 9 variables. To check if this error affects the inference from an analysis, run the analysis several times with the variables entered in a different order. There are no known problems for hierarchies with 9 or fewer variables.

#### Author

Chris Walsh, using C and Fortran code written by Ralph Mac Nally

#### References

- Chevan, A. and Sutherland, M. (1991). Hierarchical Partitioning. The American Statistician 45.
- Hatt, B. E., Fletcher, T. D., Walsh, C. J. and Taylor, S. L. (2004). The influence of urban density and drainage infrastructure on the concentrations and loads of pollutants in small streams. Environmental Management 34.

#### See Also

all.regs, partition, rand.hp

#### Examples

    # linear regression of log(electrical conductivity) in streams
    # against seven independent variables describing catchment
    # characteristics (from Hatt et al. 2004).
    data(urbanwq)
    env <- urbanwq[, 2:8]
    gofs <- all.regs(urbanwq$lec, env, family = "gaussian", gof = "Rsqu", print.vars = TRUE)
    # column names of the seven predictors (columns 2 to 8)
    partition(gofs, pcan = 7, var.names = names(urbanwq[, 2:8]))

    # hierarchical partitioning of logistic and linear regression
    # goodness of fit measures from Chevan and Sutherland (1991).
    data(chevan)
    partition(chevan$chisq, pcan = 4)
    partition(chevan$rsqu, pcan = 4)

### rand.hp: Randomization Test for Hierarchical Partitioning

#### Description

Randomizes elements in each column in `xcan` and recalculates `hier.part` `num.reps` times.

#### Usage

    rand.hp(y, xcan, family = "gaussian", gof = "RMSPE", num.reps = 100)

#### Arguments

- `y`: a vector containing the dependent variable
- `xcan`: a data frame containing the n independent variables
- `family`: family argument of `glm`
- `gof`: goodness-of-fit measure. Currently "RMSPE" (root-mean-square prediction error), "NLL" (negative log-likelihood) or "Rsqu" (R-squared)
- `num.reps`: number of repeated randomizations

#### Details

This function is a randomization routine for the `hier.part` function which returns a matrix of I values (the independent contribution towards explained variance in a multivariate dataset) for all values from `num.reps` randomizations. For each randomization, the values in each variable (i.e. each column of `xcan`) are randomized independently, and `hier.part` is run on the randomized `xcan`. As well as the randomized I matrix, the function returns a summary table listing the observed I values and the 95th and 99th percentile values of I for the randomized dataset.

#### Value

A list containing:

- `Irands`: matrix of `num.reps` + 1 rows of I values for each independent variable. The first row contains the observed I values and the remaining `num.reps` rows contain the I values returned for each randomization.

- `Iprobs`: data frame of observed I values for each variable, Z-scores for the generated distribution of randomized Is and an indication of statistical significance. Z-scores are calculated as (observed - mean(randomizations))/sd(randomizations), and statistical significance (*) is based on the upper 0.95 confidence limit (Z >= 1.65).

#### Author

Chris Walsh

#### References

- Hatt, B. E., Fletcher, T. D., Walsh, C. J. and Taylor, S. L. (2004). The influence of urban density and drainage infrastructure on the concentrations and loads of pollutants in small streams. Environmental Management 34.
- Mac Nally, R. Regression and model building in conservation biology, biogeography and ecology: the distinction between and reconciliation of predictive and explanatory models. Biodiversity and Conservation 9.
- Mac Nally, R. Multiple regression and inference in conservation biology and ecology: further comments on identifying important predictor variables. Biodiversity and Conservation 11.
- Walsh, C. J., Papas, P. J., Crowther, D., Sim, P. T., and Yoo, J. (2004). Stormwater drainage pipes as a threat to a stream-dwelling amphipod of conservation significance, Austrogammarus australis, in southeastern Australia. Biodiversity and Conservation 13.

#### See Also

hier.part, partition

#### Examples

    # linear regression of log(electrical conductivity) in streams
    # against four independent variables describing catchment
    # characteristics (from Hatt et al. 2004).
    data(urbanwq)
    env <- urbanwq[, 2:5]
    rand.hp(urbanwq$lec, env, family = "gaussian", gof = "Rsqu")$Iprobs

    # logistic regression of an amphipod species occurrence in
    # streams against four independent variables describing
    # catchment characteristics (from Walsh et al. 2004).
    data(amphipod)
    env1 <- amphipod[, 2:5]
    # list element names are case-sensitive: the element is Iprobs
    rand.hp(amphipod$australis, env1, family = "binomial", gof = "loglik")$Iprobs
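The randomization and Z-score step described for `rand.hp` can be sketched in base R. This is a rough illustration under stated assumptions, not the package's code: `rand_z_sketch` and `univ_rsq` are hypothetical names, the statistic is passed in as a function, and the univariate R-squared used here is only a cheap stand-in for the independent contribution I that `hier.part` would supply. The built-in `mtcars` data replace the package's example data.

```r
# Illustrative sketch only: both functions are hypothetical helpers,
# not part of hier.part.

# Stand-in statistic: univariate R-squared of y on each column of x
univ_rsq <- function(y, x) {
  vapply(x, function(v) summary(lm(y ~ v))$r.squared, numeric(1))
}

rand_z_sketch <- function(y, x, stat_fun, num.reps = 100) {
  p <- ncol(x)
  observed <- stat_fun(y, x)
  # One row per randomization; each column of x is shuffled independently
  rands <- matrix(
    replicate(num.reps, stat_fun(y, as.data.frame(lapply(x, sample)))),
    ncol = p, byrow = TRUE
  )
  # Z = (observed - mean(randomizations)) / sd(randomizations)
  z <- (observed - colMeans(rands)) / apply(rands, 2, sd)
  data.frame(Obs = observed, Z.score = z,
             sig = ifelse(z >= 1.65, "*", ""),
             row.names = names(x))
}

set.seed(42)  # randomization results vary from run to run without a seed
res <- rand_z_sketch(mtcars$mpg, mtcars[, c("wt", "hp")], univ_rsq,
                     num.reps = 50)
print(res)
```

Passing the statistic as a function keeps the shuffling and Z-score logic separate from the (much more expensive) partitioning step, so the same wrapper could in principle be pointed at a function returning I values.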

### Index

- Topic datasets: hier.part.data
- Topic regression: all.regs, hier.part
- Entries: all.regs, amphipod (hier.part.data), chevan (hier.part.data), combos, hier.part, hier.part.data, partition, rand.hp, urbanwq (hier.part.data)

### Package empiricalfdr.deseq2

Type Package Package empiricalfdr.deseq2 May 27, 2015 Title Simulation-Based False Discovery Rate in RNA-Seq Version 1.0.3 Date 2015-05-26 Author Mikhail V. Matz Maintainer Mikhail V. Matz

### Poisson Models for Count Data

Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

### Package EstCRM. July 13, 2015

Version 1.4 Date 2015-7-11 Package EstCRM July 13, 2015 Title Calibrating Parameters for the Samejima's Continuous IRT Model Author Cengiz Zopluoglu Maintainer Cengiz Zopluoglu

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Package changepoint. R topics documented: November 9, 2015. Type Package Title Methods for Changepoint Detection Version 2.

Type Package Title Methods for Changepoint Detection Version 2.2 Date 2015-10-23 Package changepoint November 9, 2015 Maintainer Rebecca Killick Implements various mainstream and

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Package MDM. February 19, 2015

Type Package Title Multinomial Diversity Model Version 1.3 Date 2013-06-28 Package MDM February 19, 2015 Author Glenn De'ath ; Code for mdm was adapted from multinom in the nnet package

### Package acrm. R topics documented: February 19, 2015

Package acrm February 19, 2015 Type Package Title Convenience functions for analytical Customer Relationship Management Version 0.1.1 Date 2014-03-28 Imports dummies, randomforest, kernelfactory, ada Author

### Package bigrf. February 19, 2015

Version 0.1-11 Date 2014-05-16 Package bigrf February 19, 2015 Title Big Random Forests: Classification and Regression Forests for Large Data Sets Maintainer Aloysius Lim OS_type

### Package neuralnet. February 20, 2015

Type Package Title Training of neural networks Version 1.32 Date 2012-09-19 Package neuralnet February 20, 2015 Author Stefan Fritsch, Frauke Guenther , following earlier work

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### Package trimtrees. February 20, 2015

Package trimtrees February 20, 2015 Type Package Title Trimmed opinion pools of trees in a random forest Version 1.2 Date 2014-08-1 Depends R (>= 2.5.0),stats,randomForest,mlbench Author Yael Grushka-Cockayne,

### Cross Validation techniques in R: A brief overview of some methods, packages, and functions for assessing prediction models.

Cross Validation techniques in R: A brief overview of some methods, packages, and functions for assessing prediction models. Dr. Jon Starkweather, Research and Statistical Support consultant This month

### LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values

### Chapter 2 Exploratory Data Analysis

Chapter 2 Exploratory Data Analysis 2.1 Objectives Nowadays, most ecological research is done with hypothesis testing and modelling in mind. However, Exploratory Data Analysis (EDA), which uses visualization

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Package MixGHD. June 26, 2015

Type Package Package MixGHD June 26, 2015 Title Model Based Clustering, Classification and Discriminant Analysis Using the Mixture of Generalized Hyperbolic Distributions Version 1.7 Date 2015-6-15 Author

### Package CoImp. February 19, 2015

Title Copula based imputation method Date 2014-03-01 Version 0.2-3 Package CoImp February 19, 2015 Author Francesca Marta Lilja Di Lascio, Simone Giannerini Depends R (>= 2.15.2), methods, copula Imports

### E205 Final: Version B

Name: Class: Date: E205 Final: Version B Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The owner of a local nightclub has recently surveyed a random

### Package dunn.test. January 6, 2016

Version 1.3.2 Date 2016-01-06 Package dunn.test January 6, 2016 Title Dunn's Test of Multiple Comparisons Using Rank Sums Author Alexis Dinno Maintainer Alexis Dinno

### Package retrosheet. April 13, 2015

Type Package Package retrosheet April 13, 2015 Title Import Professional Baseball Data from 'Retrosheet' Version 1.0.2 Date 2015-03-17 Maintainer Richard Scriven A collection of tools

### Package cpm. July 28, 2015

Package cpm July 28, 2015 Title Sequential and Batch Change Detection Using Parametric and Nonparametric Methods Version 2.2 Date 2015-07-09 Depends R (>= 2.15.0), methods Author Gordon J. Ross Maintainer

### Package tagcloud. R topics documented: July 3, 2015

Package tagcloud July 3, 2015 Type Package Title Tag Clouds Version 0.6 Date 2015-07-02 Author January Weiner Maintainer January Weiner Description Generating Tag and Word Clouds.

### Package dsmodellingclient

Package dsmodellingclient Maintainer Author Version 4.1.0 License GPL-3 August 20, 2015 Title DataSHIELD client site functions for statistical modelling DataSHIELD

### GLM I An Introduction to Generalized Linear Models

GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### Gamma Distribution Fitting

Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Package pesticides. February 20, 2015

Type Package Package pesticides February 20, 2015 Title Analysis of single serving and composite pesticide residue measurements Version 0.1 Date 2010-11-17 Author David M Diez Maintainer David M Diez

### Automated Biosurveillance Data from England and Wales, 1991 2011

Article DOI: http://dx.doi.org/10.3201/eid1901.120493 Automated Biosurveillance Data from England and Wales, 1991 2011 Technical Appendix This online appendix provides technical details of statistical

### Step 5: Conduct Analysis. The CCA Algorithm

Model Parameterization: Step 5: Conduct Analysis P Dropped species with fewer than 5 occurrences P Log-transformed species abundances P Row-normalized species log abundances (chord distance) P Selected

### Package MBA. February 19, 2015. Index 7. Canopy LIDAR data

Version 0.0-8 Date 2014-4-28 Title Multilevel B-spline Approximation Package MBA February 19, 2015 Author Andrew O. Finley , Sudipto Banerjee Maintainer Andrew

### Package benford.analysis

Type Package Package benford.analysis November 17, 2015 Title Benford Analysis for Data Validation and Forensic Analytics Version 0.1.3 Author Carlos Cinelli Maintainer Carlos Cinelli

### Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

### Package metafuse. November 7, 2015

Type Package Package metafuse November 7, 2015 Title Fused Lasso Approach in Regression Coefficient Clustering Version 1.0-1 Date 2015-11-06 Author Lu Tang, Peter X.K. Song Maintainer Lu Tang

### R2MLwiN Using the multilevel modelling software package MLwiN from R

Using the multilevel modelling software package MLwiN from R Richard Parker Zhengzheng Zhang Chris Charlton George Leckie Bill Browne Centre for Multilevel Modelling (CMM) University of Bristol Using the

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### Package survpresmooth

Package survpresmooth February 20, 2015 Type Package Title Presmoothed Estimation in Survival Analysis Version 1.1-8 Date 2013-08-30 Author Ignacio Lopez de Ullibarri and Maria Amalia Jacome Maintainer

### Logistic Regression (1/24/13)

STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

### Package smoothhr. November 9, 2015

Encoding UTF-8 Type Package Depends R (>= 2.12.0),survival,splines Package smoothhr November 9, 2015 Title Smooth Hazard Ratio Curves Taking a Reference Value Version 1.0.2 Date 2015-10-29 Author Artur

### Multivariate Analysis of Ecological Data

Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology

### Package COSINE. February 19, 2015

Type Package Title COndition SpecIfic sub-network Version 2.1 Date 2014-07-09 Author Package COSINE February 19, 2015 Maintainer Depends R (>= 3.1.0), MASS,genalg To identify

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

Type Package Package ChannelAttribution October 10, 2015 Title Markov Model for the Online Multi-Channel Attribution Problem Version 1.2 Date 2015-10-09 Author Davide Altomare and David Loris Maintainer

### Pattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University

Pattern Analysis Logistic Regression 12. Mai 2009 Joachim Hornegger Chair of Pattern Recognition Erlangen University Pattern Analysis 2 / 43 1 Logistic Regression Posteriors and the Logistic Function Decision

### Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Analysis of Variance. MINITAB User s Guide 2 3-1

3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

### Package support.ces. R topics documented: April 27, 2015. Type Package

Type Package Package support.ces April 27, 2015 Title Basic Functions for Supporting an Implementation of Choice Experiments Version 0.3-3 Date 2015-04-27 Author Hideo Aizaki Maintainer Hideo Aizaki

### Package missforest. February 20, 2015

Type Package Package missforest February 20, 2015 Title Nonparametric Missing Value Imputation using Random Forest Version 1.4 Date 2013-12-31 Author Daniel J. Stekhoven Maintainer

### Package erp.easy. September 26, 2015

Type Package Package erp.easy September 26, 2015 Title Event-Related Potential (ERP) Data Exploration Made Easy Version 0.6.3 A set of user-friendly functions to aid in organizing, plotting and analyzing

### Package support.ces. R topics documented: August 27, 2015. Type Package

Type Package Package support.ces August 27, 2015 Title Basic Functions for Supporting an Implementation of Choice Experiments Version 0.4-1 Date 2015-08-27 Author Hideo Aizaki Maintainer Hideo Aizaki

### The Probit Link Function in Generalized Linear Models for Data Mining Applications

Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/\$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications

### Ordinal Regression. Chapter

Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

### Package uptimerobot. October 22, 2015

Type Package Version 1.0.0 Title Access the UptimeRobot Ping API Package uptimerobot October 22, 2015 Provide a set of wrappers to call all the endpoints of UptimeRobot API which includes various kind

### Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@

Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,

### Journal of Statistical Software

JSS Journal of Statistical Software August 26, Volume 16, Issue 8. http://www.jstatsoft.org/ Some Algorithms for the Conditional Mean Vector and Covariance Matrix John F. Monahan North Carolina State University

### data visualization and regression

data visualization and regression Sepal.Length 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 I. setosa I. versicolor I. virginica I. setosa I. versicolor I. virginica Species Species

### Logistic regression diagnostics

Logistic regression diagnostics Biometry 755 Spring 2009 Logistic regression diagnostics p. 1/28 Assessing model fit A good model is one that fits the data well, in the sense that the values predicted

### Joint models for classification and comparison of mortality in different countries.

Joint models for classification and comparison of mortality in different countries. Viani D. Biatat 1 and Iain D. Currie 1 1 Department of Actuarial Mathematics and Statistics, and the Maxwell Institute

### Some Essential Statistics The Lure of Statistics

Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived

### Package GSA. R topics documented: February 19, 2015

Package GSA February 19, 2015 Title Gene set analysis Version 1.03 Author Brad Efron and R. Tibshirani Description Gene set analysis Maintainer Rob Tibshirani Dependencies impute

### Introduction to Multivariate Models: Modeling Multivariate Outcomes with Mixed Model Repeated Measures Analyses

Introduction to Multivariate Models: Modeling Multivariate Outcomes with Mixed Model Repeated Measures Analyses Applied Multilevel Models for Cross Sectional Data Lecture 11 ICPSR Summer Workshop University

### Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

### How to set the main menu of STATA to default factory settings standards

University of Pretoria Data analysis for evaluation studies Examples in STATA version 11 List of data sets b1.dta (To be created by students in class) fp1.xls (To be provided to students) fp1.txt (To be

### Package TRADER. February 10, 2016

Type Package Package TRADER February 10, 2016 Title Tree Ring Analysis of Disturbance Events in R Version 1.2-1 Date 2016-02-10 Author Pavel Fibich, Jan Altman,

### Package fastghquad. R topics documented: February 19, 2015

Package fastghquad February 19, 2015 Type Package Title Fast Rcpp implementation of Gauss-Hermite quadrature Version 0.2 Date 2014-08-13 Author Alexander W Blocker Maintainer Fast, numerically-stable Gauss-Hermite

### extreme Datamining mit Oracle R Enterprise

extreme Datamining mit Oracle R Enterprise Oliver Bracht Managing Director eoda Matthias Fuchs Senior Consultant ISE Information Systems Engineering GmbH extreme Datamining with Oracle R Enterprise About

### Generalized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)

Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through

### Package hoarder. June 30, 2015

Type Package Title Information Retrieval for Genetic Datasets Version 0.1 Date 2015-06-29 Author [aut, cre], Anu Sironen [aut] Package hoarder June 30, 2015 Maintainer Depends

### Package CRM. R topics documented: February 19, 2015

Package CRM February 19, 2015 Title Continual Reassessment Method (CRM) for Phase I Clinical Trials Version 1.1.1 Date 2012-2-29 Depends R (>= 2.10.0) Author Qianxing Mo Maintainer Qianxing Mo

### 5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits

Lecture 23 1. Logistic regression with binary response 2. Proc Logistic and its surprises 3. quadratic model 4. Hosmer-Lemeshow test for lack of fit 5. Ordinal regression: cumulative categories proportional

### Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### Package HHG. July 14, 2015

Type Package Package HHG July 14, 2015 Title Heller-Heller-Gorfine Tests of Independence and Equality of Distributions Version 1.5.1 Date 2015-07-13 Author Barak Brill & Shachar Kaufman, based in part

### Package SHELF. February 5, 2016

Type Package Package SHELF February 5, 2016 Title Tools to Support the Sheffield Elicitation Framework (SHELF) Version 1.1.0 Date 2016-01-29 Author Jeremy Oakley Maintainer Jeremy Oakley

### Package MFDA. R topics documented: July 2, 2014. Version 1.1-4 Date 2007-10-30 Title Model Based Functional Data Analysis

Version 1.1-4 Date 2007-10-30 Title Model Based Functional Data Analysis Package MFDA July 2, 2014 Author Wenxuan Zhong, Ping Ma Maintainer Wenxuan Zhong

### BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING

BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING Xavier Conort xavier.conort@gear-analytics.com Session Number: TBR14 Insurance has always been a data business The industry has successfully

### Package depend.truncation

Type Package Package depend.truncation May 28, 2015 Title Statistical Inference for Parametric and Semiparametric Models Based on Dependently Truncated Data Version 2.4 Date 2015-05-28 Author Takeshi Emura

### The Data Mining Process

Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round \$200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

### Yew May Martin Maureen Maclachlan Tom Karmel Higher Education Division, Department of Education, Training and Youth Affairs.

How is Australia's Higher Education Performing? An analysis of completion rates of a cohort of Australian Post Graduate Research Students in the 1990s. Yew May Martin Maureen Maclachlan Tom Karmel Higher

### SUGI 29 Statistics and Data Analysis

Paper 194-29 Head of the CLASS: Impress your colleagues with a superior understanding of the CLASS statement in PROC LOGISTIC Michelle L. Pritchard and David J. Pasta Ovation Research Group, San Francisco,

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### Profile Likelihood Confidence Intervals for GLMs

Profile Likelihood Confidence Intervals for GLMs The standard procedure for computing a confidence interval (CI) for a parameter in a generalized linear model is by the formula: estimate ± percentile

### HLM software has been one of the leading statistical packages for hierarchical

Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

### Statistical methods for the comparison of dietary intake

Appendix Y Statistical methods for the comparison of dietary intake Jianhua Wu, Petros Gousias, Nida Ziauddeen, Sonja Nicholson and Ivonne Solis- Trapala Y.1 Introduction This appendix provides an outline

### SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

### ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node

Enterprise Miner - Regression 1 ECLT5810 E-Commerce Data Mining Technique SAS Enterprise Miner -- Regression Model I. Regression Node 1. Some background: Linear attempts to predict the value of a continuous

### Examining a Fitted Logistic Model

STAT 536 Lecture 16 1 Examining a Fitted Logistic Model Deviance Test for Lack of Fit The data below describes the male birth fraction male births/total births over the years 1931 to 1990. A simple logistic

### Penalized Logistic Regression and Classification of Microarray Data

Penalized Logistic Regression and Classification of Microarray Data Milan, May 2003 Anestis Antoniadis Laboratoire IMAG-LMC University Joseph Fourier Grenoble, France Penalized Logistic Regression and Classification

### Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

### Package lss. February 20, 2015

Type Package Package lss February 20, 2015 Title the accelerated failure time model to right censored data based on least-squares principle Version 0.52 Date 2006-12-01 Author Lin Huang,

### Software for Distributions in R

David Scott 1 Diethelm Würtz 2 Christine Dong 1 1 Department of Statistics The University of Auckland 2 Institut für Theoretische Physik ETH Zürich July 10, 2009 Outline 1 Introduction 2 Distributions

### Package gazepath. April 1, 2015

Type Package Package gazepath April 1, 2015 Title Gazepath Transforms Eye-Tracking Data into Fixations and Saccades Version 1.0 Date 2015-03-03 Author Daan van Renswoude Maintainer Daan van Renswoude