Geographically Weighted Regression

1 Geographically Weighted Regression
CSDE Statistics Workshop
Christopher S. Fowler, PhD
February 1st, 2011
Significant portions of this workshop were culled from presentations prepared by Fotheringham, Charlton and Brunsdon and presented at the 2010 Advanced Workshop on Spatial Analysis at the University of California, Santa Barbara.
University of Washington Center for Studies in Demography and Ecology

2 Outline for the Session
The motivation for GWR: examples from YOUR discipline
Mapping OLS residuals: a good baseline for why we need GWR
GWR: definitions, basic concepts
Running GWR: a straightforward implementation in ArcGIS
GWR and some extensions

3 Basics of OLS
$y = X\beta + \varepsilon$
Assumes a stationary process: the same stimulus provokes the same response anywhere in the study area.

4 Why might relationships vary spatially?
Sampling variation
Relationships intrinsically different across space (attitudes, preferences, contextual effects)
Model misspecification

5 Applications: Ecology
GWR works on trees.
Could have been differentiated sampling: the pattern creates predictable and changing levels of interaction among observations.

6 Applications: Public Health
Relationships vary systematically: the relationships between mortality and occupational segregation, and between mortality and unemployment, vary across Tokyo.

7 Applications: Sociology/Public Policy
Missing variables (and they may very well be unknowable): the link between multifamily housing and residential burglaries varies widely even when controlling for numerous socioeconomic and neighborhood factors.

8 Back up
How do we know if we have nonstationarity in our model?
Map residuals and test them for spatial autocorrelation. If our model errs systematically with a spatial pattern, then we may be on to something.
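The residual check described on this slide can be sketched in a few lines of Python. This is a minimal illustration of global Moran's I, not the ArcGIS implementation; the function name `morans_i`, the toy residuals, and the binary adjacency weights are all assumptions made for the example, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij.

    `weights` is a list of lists; weights[i][j] > 0 means i and j are neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))  # spatial cross-products
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Hypothetical residuals for four areas arranged along a line;
# each area neighbors the areas directly beside it.
residuals = [1.0, 1.0, -1.0, -1.0]
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
moran = morans_i(residuals, w)
print(moran)
```

Positive values indicate that similar residuals sit next to each other, which is the systematic spatial pattern the slide warns about; values near zero suggest no such pattern.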

9 Poverty in the Southern U.S.

10 Our example
Model: Poverty as a function of FemaleHeadedHousehold, Unemployed, Black, 65andOlder, Metro, AtLeastHighSchoolEducation
Based on the work of Paul Voss and Katherine Curtis
These are all understood to be good predictors of poverty.
What kinds of spatial structures influence this data set?

11 Lab Part 1
Run our OLS model in ArcGIS
Examine model output
Map residuals
Calculate Moran's I and Local Moran's I

12

13 Our best aspatial model

14 So what now?
Add more missing variables and try again: repeat the steps from the lab
Accept that there is something about certain places that makes them different (spatial heterogeneity): try GWR
Test variables meant to explore interactions taking place at short distances (spatial dependence): try spatial regression (likely a spatial lag model)
Assume that the correlation is a nuisance and control for it in the error term: try spatial regression (likely a spatial error model)

15 Outline for Part II
What is GWR?
Weighting in GWR

16 Geographically Weighted Regression
Local statistical technique to analyze spatial variations in relationships
We are not content with global averages of spatial data (climate, for example).
Why should we be satisfied with global averages in a statistical analysis?

17 Put another way: Simpson's Paradox
If we think of these points as our data, grouped into colors by region, we can see that the global and local models differ significantly.
Source: Rücker and Schumacher, BMC Medical Research Methodology :34 doi: /

18 Basic definitions
Spatial nonstationarity exists when the same stimulus provokes a different response in different parts of the study region.
Global models are statements about processes that are assumed to be stationary and, as such, are location independent.
Local models are spatial disaggregations of global models, the results of which are location specific.
Spatial heterogeneity refers to spatial patterns resulting from broad similarities, usually over time.
Spatial dependence refers to spatial patterns that result from interactions among observations.

19 Spatial Heterogeneity and Spatial Dependence

20 GWR and Spatial Processes
GWR is excellent at picking up broad-scale regional differences (spatial heterogeneity).
Not as effective at dealing with small-scale interaction processes: too much bias in each local model.
That doesn't mean it won't try (and give you misleading results).

21 GWR in a nutshell
Global model $y = X\beta + \varepsilon$ becomes $y_i = X_i\beta_i + \varepsilon_i$
where i indicates that there is a set of coefficients estimated for every observation in our data set.

22 The Key Difference
We estimate a set of regression coefficients for each observation.
To do so, we weight near observations more heavily than more distant ones.
We may also estimate coefficients based on some local subset of observations.
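To make the idea of one regression per location concrete, here is a minimal sketch of weighted least squares with a single predictor, written in plain Python. It is an illustration under stated assumptions, not the ArcGIS routine: the function name `local_fit` and the toy data are hypothetical, and the 0/1 weights stand in for the smooth kernel weights that GWR actually uses. In matrix form the same estimate is the usual weighted least squares solution $(X^T W_i X)^{-1} X^T W_i y$.

```python
def local_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x with observation weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data: the slope is 1 in the "west" (first three points)
# and 3 in the "east" (last three points).
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

# Extreme weights for illustration only: 1 for the local subset, 0 elsewhere.
west = local_fit(x, y, [1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
east = local_fit(x, y, [0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
global_fit = local_fit(x, y, [1.0] * 6)
print(west, east, global_fit)
```

The global fit reports a single averaged slope, while the two local fits recover the regional slopes separately.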

23 Some advantages of GWR
Excellent tool for testing model specification: where does model fit look good, and where are you missing something?
Residuals generally lower and not spatially autocorrelated

24 Real values for β

25 Estimated Values of β in global model

26 Residuals from global model

27 Reasons to use GWR
Identify model misspecification
Identify nonstationarity in relationships
Improved model fit (R², AIC, etc.)
Reduced spatial autocorrelation
Represent context
Address spatial heterogeneity when precise variables may not exist

28 You've convinced me, what next?
Run your aspatial model (as we did in the 1st lab): we will want the results and diagnostics to compare with what comes next.
Decide how you are going to weight your nearby locations:
  Fixed bandwidth
  Variable bandwidth
  User-defined bandwidth

29 It all comes down to how you weight the observations
We can use a fixed bandwidth h:
$W_{ij} = \exp\left[-\frac{1}{2}\left(\frac{d_{ij}}{h}\right)^2\right]$
Number of observations will vary, but the area they represent will remain constant.
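As a small illustration of the fixed-bandwidth formula on this slide, the Gaussian weight can be computed directly. The function name `gaussian_weight`, the bandwidth of 50, and the sample distances are hypothetical values chosen for the sketch.

```python
import math

def gaussian_weight(d, h):
    """Fixed-bandwidth Gaussian kernel: W_ij = exp(-((d_ij / h) ** 2) / 2)."""
    return math.exp(-((d / h) ** 2) / 2)

h = 50.0  # hypothetical bandwidth, in the same units as the distances
# Weights at distances of 0, 1, 2 and 3 bandwidths from the focal observation.
weights = [gaussian_weight(d, h) for d in (0.0, 50.0, 100.0, 150.0)]
print(weights)
```

The weight is 1 at the focal location and decays smoothly with distance; it never reaches exactly zero, so every observation contributes something to every local model.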

30 Weighting option 2
Or we can employ an adaptive bandwidth:
$W_{ij} = \left[1 - d_{ij}^2 / h^2\right]^2$ if j is one of i's N nearest neighbors, and $W_{ij} = 0$ otherwise
Number of observations will remain fixed, but the area will not be the same.
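The adaptive kernel on this slide can be sketched as follows. Two details are assumptions not stated on the slide: h is taken to be the distance from i to its N-th nearest neighbor, and the focal observation counts as its own nearest neighbor (distance 0). The function name `adaptive_bisquare_weights` and the sample distances are hypothetical.

```python
def adaptive_bisquare_weights(dists, n_neighbors):
    """Bisquare weights for one focal observation i.

    `dists` holds the distance from i to every observation (including i
    itself at distance 0). The bandwidth h is set to the distance of the
    n_neighbors-th nearest observation (an assumption; it must be > 0).
    """
    h = sorted(dists)[n_neighbors - 1]
    return [(1 - (d / h) ** 2) ** 2 if d <= h else 0.0 for d in dists]

# Hypothetical distances from the focal observation to five observations.
weights = adaptive_bisquare_weights([0.0, 1.0, 2.0, 4.0, 8.0], 3)
print(weights)
```

With this choice of h the N-th neighbor itself lands exactly on the edge of the kernel and receives a weight of zero; in a sparse area h grows so that the same number of observations still falls inside the kernel.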

31 Kernels and Weights
Bandwidth specifies the shape of the weights curve.
Kernel type tells us whether we will define our bandwidth based on distance (fixed) or number of neighbors (adaptive).
So how do we know what bandwidth to use?

32 Judging the appropriate bandwidth
A tradeoff between
  Bias: we include observations that are not part of the same spatial group, and
  Variance: we don't have enough points in our model to say anything with conviction
AICc or CV measure model fit; optimize fit to obtain the best bandwidth.
[Figure: AIC against bandwidth; labels: Variance, Optimum, Bias]
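One way to picture the optimization step is a leave-one-out cross-validation (CV) search over candidate bandwidths. The sketch below is an assumption-laden illustration, not the procedure ArcGIS runs: it uses one predictor, locations on a line, a Gaussian kernel, and a hypothetical grid of three bandwidths. The toy data are noise-free with two well-separated regions, so the smallest bandwidth wins; with noisy real data the score typically also rises at very small bandwidths, which is the variance side of the tradeoff.

```python
import math

def loo_prediction(i, coords, x, y, h):
    """Predict y[i] from a Gaussian-weighted fit that leaves observation i out."""
    w = [0.0 if j == i else math.exp(-0.5 * ((coords[j] - coords[i]) / h) ** 2)
         for j in range(len(y))]
    sw = sum(w)
    xbar = sum(wj * xj for wj, xj in zip(w, x)) / sw
    ybar = sum(wj * yj for wj, yj in zip(w, y)) / sw
    sxx = sum(wj * (xj - xbar) ** 2 for wj, xj in zip(w, x))
    sxy = sum(wj * (xj - xbar) * (yj - ybar) for wj, xj, yj in zip(w, x, y))
    return ybar + (sxy / sxx) * (x[i] - xbar)

def cv_score(coords, x, y, h):
    """Sum of squared leave-one-out prediction errors for bandwidth h."""
    return sum((y[i] - loo_prediction(i, coords, x, y, h)) ** 2
               for i in range(len(y)))

# Hypothetical data: ten locations near 0 with slope 1, ten near 100 with slope 3.
coords = [float(i) for i in range(10)] + [100.0 + i for i in range(10)]
x = [float(i % 5 + 1) for i in range(20)]
y = [xi if c < 50 else 3.0 * xi for xi, c in zip(x, coords)]

candidates = [2.0, 20.0, 1000.0]  # hypothetical bandwidth grid
scores = {h: cv_score(coords, x, y, h) for h in candidates}
best = min(scores, key=scores.get)
print(scores, best)
```

The very large bandwidth behaves like the global model and mixes the two regions (bias), so its CV score is far worse than that of a bandwidth that keeps each local fit inside one region.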

33 To sum
Weighting assumptions are very important to outcomes in GWR.
A fixed-distance kernel is more appropriate when the distribution of your observations is relatively stable across space (e.g. size, number of neighbors).
An adaptive kernel is appropriate when the distribution varies across space (e.g. events are clustered or polygons are heterogeneous).
Once a kernel type is selected, optimization takes some of the guesswork out of it, but robustness checks are still needed.

34 Residuals from the OLS model from last lesson
Looks reasonably good
Moran's I is still .22 and highly significant

35 Lab
Run GWR model
Check residuals
Check variation in coefficients

36

37 Further topics/issues in GWR
Where to go for next steps
General troubleshooting
Significance testing
Outlier problems
Poisson and Logistic model implementations
Mixed-form models

38 Other software implementations of GWR
GWR 3.x (4.0 should be out soon)
R (spgwr package)
Stata
Matlab
Perhaps others I haven't heard of

39 General Troubleshooting
Regional dummies: BAD. Eliminate them from the model; we are trying to show regional variation, not control for it.
Binary and low-probability count variables: use caution; lack of variation may cause the model to crash or have trouble finding a workable bandwidth.

40 Significance Testing
How do I know if the variation I see in my coefficients is meaningful?
Could do a t-test, but you will run into problems with multiple (1,387) tests, which results in lots of false positives.
The standard correction (Bonferroni) will make any significance finding nearly impossible.

41 Best Method: Monte Carlo simulation
Randomly reassign all observation values (dependent and independent variables travel together) to different observation locations: each county's data gets assigned randomly to a different county.
Re-run GWR and record coefficients.
Repeat lots of times (at least 100).
Define a distribution for coefficient values and compare your coefficients to this distribution.
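The simulation steps on this slide can be sketched in plain Python. Several choices here are assumptions made for illustration and are not specified on the slide: a one-predictor GWR with a Gaussian kernel, locations on a line, the variance of the local slopes as the statistic that is compared with the simulated distribution, and the function names `local_slopes` and `monte_carlo_p`.

```python
import math
import random

def local_slopes(coords, data, h):
    """One-predictor GWR: a Gaussian-weighted slope at every location."""
    slopes = []
    for ci in coords:
        w = [math.exp(-0.5 * ((cj - ci) / h) ** 2) for cj in coords]
        sw = sum(w)
        xbar = sum(wj * x for wj, (x, _) in zip(w, data)) / sw
        ybar = sum(wj * y for wj, (_, y) in zip(w, data)) / sw
        sxx = sum(wj * (x - xbar) ** 2 for wj, (x, _) in zip(w, data))
        sxy = sum(wj * (x - xbar) * (y - ybar) for wj, (x, y) in zip(w, data))
        slopes.append(sxy / sxx)
    return slopes

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def monte_carlo_p(coords, data, h, n_sims=99, seed=0):
    """Share of shuffled data sets whose slope variability matches or exceeds ours."""
    rng = random.Random(seed)
    observed = variance(local_slopes(coords, data, h))
    exceed = 0
    for _ in range(n_sims):
        shuffled = data[:]        # (x, y) pairs travel together
        rng.shuffle(shuffled)     # but land on random locations
        if variance(local_slopes(coords, shuffled, h)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sims + 1)

# Hypothetical data: slope 1 at the first ten locations, slope 3 at the rest.
coords = [float(i) for i in range(20)]
data = [(float(i % 5 + 1), (1.0 if i < 10 else 3.0) * (i % 5 + 1))
        for i in range(20)]
p_value = monte_carlo_p(coords, data, h=3.0, n_sims=99, seed=1)
print(p_value)
```

A small value suggests that the observed spatial variation in the coefficient is larger than what random placement of the same observations tends to produce.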

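42 Other method: Fotheringham Significance Test
$\alpha_{Fotheringham} = \dfrac{\xi}{1 + p_e - \dfrac{p_e}{np}}$
where $\xi$ is the desired overall significance level, $p_e$ is the effective number of parameters, $p$ is the number of parameters, and $n$ is the number of observations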

43 Fotheringham Significance Test
$\alpha_{Fotheringham} = \dfrac{\xi}{1 + p_e - \dfrac{p_e}{np}}$ (37.97)
In Excel we can find the significant t-statistic using: TINV( ,1379)
In R we use: qt(1-( /2),1379)
Either way we get a value of ~
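The adjustment itself is simple arithmetic, sketched below under the assumption that the numerator is the desired overall significance level. The function name `adjusted_alpha` and all input values are hypothetical and are not the workshop's numbers.

```python
def adjusted_alpha(xi, p_e, n, p):
    """Adjusted per-test significance level: xi / (1 + p_e - p_e / (n * p)).

    xi  : desired overall significance level (assumed meaning of the numerator)
    p_e : effective number of parameters in the GWR model
    n   : number of observations
    p   : number of parameters in each local regression
    """
    return xi / (1 + p_e - p_e / (n * p))

# Hypothetical inputs chosen only to show the arithmetic.
alpha = adjusted_alpha(xi=0.05, p_e=5.0, n=100, p=5)
bonferroni = 0.05 / 100  # one test per observation, for comparison
print(alpha, bonferroni)
```

In this example the adjusted level is stricter than the unadjusted 0.05 but considerably less severe than a Bonferroni correction over all observations; the adjusted level is then used to look up the critical t value.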

44 Results: Significant Nonstationarity for Percent Hispanic

45 Outlier problems
Outliers cause problems for everybody, but their impact is greater for local regressions, particularly when the bandwidth keeps the number of observations low.
In standard OLS:
  Run the model and identify observations with high or low residuals (~ +/- 4)
  Weight these observations less than 1
  Re-run until none of the observations have extreme residuals
Now do your GWR with the weights assigned.

46 Poisson and Logistic model forms
Implementations exist in both R and GWR 3.x software.
Both require much greater care with respect to collinearity and lack of variation.

47 Mixed-form models
What if some of your variables are stationary and others have variation?
Mixed-form models allow you to hold some coefficients constant while allowing others to vary.
Not yet implemented in any statistical package, but not that difficult from a technical standpoint.

48 Concluding comments
What comes next?
  Spatial regression
  Multilevel models


More information

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

More information

Appendix 1: Time series analysis of peak-rate years and synchrony testing.

Appendix 1: Time series analysis of peak-rate years and synchrony testing. Appendix 1: Time series analysis of peak-rate years and synchrony testing. Overview The raw data are accessible at Figshare ( Time series of global resources, DOI 10.6084/m9.figshare.929619), sources are

More information

Introduction to spatial data analysis

Introduction to spatial data analysis Introduction to spatial data analysis 3 Scuola di Dottorato in Economia, La Sapienza, 2015/2016 Instructors: Filippo Celata, Federico Martellozzo and Luca Salvati http://www.memotef.uniroma1.it/node/6524

More information

IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD

IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD REPUBLIC OF SOUTH AFRICA GOVERNMENT-WIDE MONITORING & IMPACT EVALUATION SEMINAR IMPACT EVALUATION: INSTRUMENTAL VARIABLE METHOD SHAHID KHANDKER World Bank June 2006 ORGANIZED BY THE WORLD BANK AFRICA IMPACT

More information

The Effects of Unemployment on Crime Rates in the U.S.

The Effects of Unemployment on Crime Rates in the U.S. The Effects of Unemployment on Crime Rates in the U.S. Sandra Ajimotokin, Alexandra Haskins, Zach Wade April 14 th, 2015 Abstract This paper aims to analyze the relationship between unemployment and crime

More information

Introducing the Multilevel Model for Change

Introducing the Multilevel Model for Change Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.

More information

Rethinking the Cultural Context of Schooling Decisions in Disadvantaged Neighborhoods: From Deviant Subculture to Cultural Heterogeneity

Rethinking the Cultural Context of Schooling Decisions in Disadvantaged Neighborhoods: From Deviant Subculture to Cultural Heterogeneity Rethinking the Cultural Context of Schooling Decisions in Disadvantaged Neighborhoods: From Deviant Subculture to Cultural Heterogeneity Sociology of Education David J. Harding, University of Michigan

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

Mgmt 469. Fixed Effects Models. Suppose you want to learn the effect of price on the demand for back massages. You

Mgmt 469. Fixed Effects Models. Suppose you want to learn the effect of price on the demand for back massages. You Mgmt 469 Fixed Effects Models Suppose you want to learn the effect of price on the demand for back massages. You have the following data from four Midwest locations: Table 1: A Single Cross-section of

More information

OUTLIER ANALYSIS. Data Mining 1

OUTLIER ANALYSIS. Data Mining 1 OUTLIER ANALYSIS Data Mining 1 What Are Outliers? Outlier: A data object that deviates significantly from the normal objects as if it were generated by a different mechanism Ex.: Unusual credit card purchase,

More information

Presentation Overview

Presentation Overview Treatment and Self-help Availability in Disadvantaged and Minority Neighborhoods Katherine J. Karriker-Jaffe, PhD Deidre Patterson, MPH Lee Ann Kaskutas, DrPH R01AA020328 to K.J. Karriker-Jaffe Presentation

More information

SYSTEMS OF REGRESSION EQUATIONS

SYSTEMS OF REGRESSION EQUATIONS SYSTEMS OF REGRESSION EQUATIONS 1. MULTIPLE EQUATIONS y nt = x nt n + u nt, n = 1,...,N, t = 1,...,T, x nt is 1 k, and n is k 1. This is a version of the standard regression model where the observations

More information

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS. SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

More information

The Wondrous World of fmri statistics

The Wondrous World of fmri statistics Outline The Wondrous World of fmri statistics FMRI data and Statistics course, Leiden, 11-3-2008 The General Linear Model Overview of fmri data analysis steps fmri timeseries Modeling effects of interest

More information

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents

Linda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

More information

A spreadsheet Approach to Business Quantitative Methods

A spreadsheet Approach to Business Quantitative Methods A spreadsheet Approach to Business Quantitative Methods by John Flaherty Ric Lombardo Paul Morgan Basil desilva David Wilson with contributions by: William McCluskey Richard Borst Lloyd Williams Hugh Williams

More information

Getting Correct Results from PROC REG

Getting Correct Results from PROC REG Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking

More information

Power Calculation Using the Online Variance Almanac (Web VA): A User s Guide

Power Calculation Using the Online Variance Almanac (Web VA): A User s Guide Power Calculation Using the Online Variance Almanac (Web VA): A User s Guide Larry V. Hedges & E.C. Hedberg This research was supported by the National Science Foundation under Award Nos. 0129365 and 0815295.

More information

Specifications for this HLM2 run

Specifications for this HLM2 run One way ANOVA model 1. How much do U.S. high schools vary in their mean mathematics achievement? 2. What is the reliability of each school s sample mean as an estimate of its true population mean? 3. Do

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal

MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

Econometric Modelling for Revenue Projections

Econometric Modelling for Revenue Projections Econometric Modelling for Revenue Projections Annex E 1. An econometric modelling exercise has been undertaken to calibrate the quantitative relationship between the five major items of government revenue

More information

Using GIS to Identify Pedestrian- Vehicle Crash Hot Spots and Unsafe Bus Stops

Using GIS to Identify Pedestrian- Vehicle Crash Hot Spots and Unsafe Bus Stops Using GIS to Identify Pedestrian-Vehicle Crash Hot Spots and Unsafe Bus Stops Using GIS to Identify Pedestrian- Vehicle Crash Hot Spots and Unsafe Bus Stops Long Tien Truong and Sekhar V. C. Somenahalli

More information

Northumberland Knowledge

Northumberland Knowledge Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

More information

USING PREDICTIVE ANALYTICS TO UNDERSTAND HOUSING ENROLLMENTS

USING PREDICTIVE ANALYTICS TO UNDERSTAND HOUSING ENROLLMENTS USING PREDICTIVE ANALYTICS TO UNDERSTAND HOUSING ENROLLMENTS Heather Kelly, Ed.D., University of Delaware Karen DeMonte, M.Ed., University of Delaware Darlena Jones, Ph.D., EBI MAP-Works Predictive Analytics:

More information

VOLATILITY AND DEVIATION OF DISTRIBUTED SOLAR

VOLATILITY AND DEVIATION OF DISTRIBUTED SOLAR VOLATILITY AND DEVIATION OF DISTRIBUTED SOLAR Andrew Goldstein Yale University 68 High Street New Haven, CT 06511 andrew.goldstein@yale.edu Alexander Thornton Shawn Kerrigan Locus Energy 657 Mission St.

More information

Module 4 - Multiple Logistic Regression

Module 4 - Multiple Logistic Regression Module 4 - Multiple Logistic Regression Objectives Understand the principles and theory underlying logistic regression Understand proportions, probabilities, odds, odds ratios, logits and exponents Be

More information

Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance. Chapter 2: Statistical models of default Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

More information

Chapter 1 Introduction. 1.1 Introduction

Chapter 1 Introduction. 1.1 Introduction Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations

More information

OFFICIAL FILING BEFORE THE PUBLIC SERVICE COMMISSION OF WISCONSIN DIRECT TESTIMONY OF JANNELL E. MARKS

OFFICIAL FILING BEFORE THE PUBLIC SERVICE COMMISSION OF WISCONSIN DIRECT TESTIMONY OF JANNELL E. MARKS OFFICIAL FILING BEFORE THE PUBLIC SERVICE COMMISSION OF WISCONSIN Application of Northern States Power Company, a Wisconsin Corporation, for Authority to Adjust Electric and Natural Gas Rates Docket No.

More information

Applying Statistics Recommended by Regulatory Documents

Applying Statistics Recommended by Regulatory Documents Applying Statistics Recommended by Regulatory Documents Steven Walfish President, Statistical Outsourcing Services steven@statisticaloutsourcingservices.com 301-325 325-31293129 About the Speaker Mr. Steven

More information