Regression Analysis Using ArcMap. By Jennie Murack

Regression Basics

How is Regression Different from Other Spatial Statistical Analyses?
With other tools you ask WHERE something is happening. Are there places in the United States where people are persistently dying young? Where are the hot spots for crime, 911 emergency calls, or fires? Where do we find a higher than expected proportion of traffic accidents in a city?

With Regression Analyses, you ask WHY something is happening. Why are there places in the United States where people persistently die young? What might be causing this? Can we model the characteristics of places that experience a lot of crime, 911 calls, or fire events to help reduce these incidents? What are the factors contributing to higher than expected traffic accidents? Are there policy implications or mitigating actions that might reduce traffic accidents across the city and/or in particular high accident areas?

Regression analysis allows you to model, examine, and explore spatial relationships, and to predict.
[Figure: Coefficients for percent rural and low-weight births. T-scores show where this relationship is significant.]

Reasons to Use Regression Analysis
To model a phenomenon to better understand it and possibly make decisions.
To model a phenomenon to predict values at other places or times.
To explore hypotheses.

Types of Regression

Spatial Regression
Spatial data often do not fit traditional, non-spatial regression requirements because they are:
spatially autocorrelated (features near each other are more similar than those further away)
nonstationary (features behave differently based on their location/regional variation)
No spatial regression method is effective for both characteristics.

Linear Regression
Used to analyze linear relationships among variables. A linear relationship is either positive or negative. Regression analyses attempt to demonstrate the degree to which one or more variables potentially promote positive or negative change in another variable.

Linear Regression Techniques
Ordinary Least Squares (OLS) is the best known technique and a good starting point for all spatial regression analyses. It is a global model: it provides one equation to represent the entire dataset.
Geographically Weighted Regression (GWR) is a local model: it fits a regression equation to every feature in the dataset, so regional variation is incorporated into the regression model.
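To make the global/local distinction concrete, here is a minimal sketch in plain Python. It is not the ArcMap implementation: the Gaussian kernel, the fixed bandwidth, the single explanatory variable, and all data values are assumptions chosen for illustration. The global fit returns one slope for the whole dataset, while the local fit returns one slope per feature, weighted toward nearby features.

```python
import math

def weighted_fit(x, y, w):
    """Weighted least squares for y = b0 + b1*x; returns (b0, b1)."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def global_ols(x, y):
    """Global model: every feature gets the same weight, one equation results."""
    return weighted_fit(x, y, [1.0] * len(x))

def local_gwr(coords, x, y, bandwidth):
    """Local model: one weighted fit per feature (Gaussian kernel, assumed)."""
    fits = []
    for cx, cy in coords:
        w = [math.exp(-0.5 * (math.hypot(cx - px, cy - py) / bandwidth) ** 2)
             for px, py in coords]
        fits.append(weighted_fit(x, y, w))
    return fits

# Made-up data: ten features on a west-east line. The true slope is 1 in the
# west (first five features) and 3 in the east (last five features).
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [xi * (1.0 if i < 5 else 3.0) for i, xi in enumerate(x)]

global_b0, global_b1 = global_ols(x, y)
local_fits = local_gwr(coords, x, y, bandwidth=1.0)
west_slope = local_fits[0][1]
east_slope = local_fits[-1][1]
print(f"global slope: {global_b1:.2f}")
print(f"local slope at west end: {west_slope:.2f}, at east end: {east_slope:.2f}")
```

The single global slope lands between the two regional slopes and describes neither region well, which is the situation in which a local model is worth considering.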

The Equation

Regression Equation
y = the dependent variable: the process you are trying to predict or understand
X = the explanatory variables, used to model or predict the dependent variable
B = the coefficients computed by the regression tool; they represent the strength and type of relationship each X has to y
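The equation itself did not survive in this transcript. The standard linear regression form consistent with the terms defined above is sketched below, writing the coefficients B as \beta and the unexplained portion (the residual) as \varepsilon. The subscripts and the number of explanatory variables n are notation assumed here, not taken from the slides.

```latex
y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon
```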

Regression Equation
p-values = the result of a statistical test. Low p-values suggest that the coefficient is important to your model.
R² = a statistic derived from the regression equation to quantify the performance of the model. The closer R² is to 1, the more of the variation in the dependent variable the model explains.
Residuals = the unexplained portion of the dependent variable. Large residuals indicate a poor model fit.

Residuals
The difference between the observed and predicted values.
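As a small worked example of these quantities, the sketch below fits a one-variable OLS line in plain Python and computes the coefficients, the residuals (observed minus predicted), and R². The function names and the data values are made up for illustration; ArcMap computes these for you.

```python
def fit_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def compute_residuals(x, y, b0, b1):
    """Residual = observed value minus predicted value."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def r_squared(y, resid):
    """Share of the variation in y that the model explains."""
    mean_y = sum(y) / len(y)
    ss_res = sum(e ** 2 for e in resid)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Made-up observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_ols(x, y)
resid = compute_residuals(x, y, b0, b1)
r2 = r_squared(y, resid)
print(f"intercept={b0:.3f} slope={b1:.3f} R2={r2:.4f}")
```

With an intercept in the model, the OLS residuals always sum to zero, so a mean of 0 on its own does not confirm a good fit. The pattern and size of the residuals carry the useful information.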

Potential Regression Problems

Omitted explanatory variables (misspecification)
Solutions: Map and examine the OLS residuals and GWR coefficients. Run Hot Spot Analysis on the OLS residuals.

Nonlinear Relationships
Solutions: Create a scatter plot matrix graph and transform variables. Use a non-linear regression model.

Data Outliers
Solutions: Run the regression with and without outliers to see their effect on the analysis. Create a scatter plot to examine extreme values, and correct or remove outliers if possible.
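The "with and without" comparison can be sketched in a few lines of plain Python. Treating the observation with the largest absolute residual as the candidate outlier is an assumption made here for illustration, and the data are invented. In practice you would inspect the scatter plot before removing anything.

```python
def fit_ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def largest_residual_index(x, y, b0, b1):
    """Index of the observation the fitted line misses by the most."""
    abs_resid = [abs(yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]
    return abs_resid.index(max(abs_resid))

# Made-up data: y = 2x exactly, plus one extreme value at the end.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 40.0]

b0_all, slope_with = fit_ols(x, y)
suspect = largest_residual_index(x, y, b0_all, slope_with)

# Refit without the suspect observation and compare the slopes.
x_trim = [xi for i, xi in enumerate(x) if i != suspect]
y_trim = [yi for i, yi in enumerate(y) if i != suspect]
_, slope_without = fit_ols(x_trim, y_trim)

print(f"slope with outlier: {slope_with:.2f}, without: {slope_without:.2f}")
```

A large change in the coefficient between the two runs shows that a single observation is driving the result.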

Nonstationarity
Definition: The relationship among the data changes based on location.
Solutions: OLS automatically tests for problems with nonstationarity. GWR may be a more appropriate analysis.

Multicollinearity
Definition: One or a combination of explanatory variables is redundant.
Solutions: The OLS tool automatically checks for this. Remove or modify the variable(s).
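One common way to quantify redundancy is the variance inflation factor (VIF). The sketch below covers only the two-variable case, where VIF = 1 / (1 - r²) and r is the correlation between the two explanatory variables. The cutoff of 7.5 is a rule of thumb assumed here, and the variable names and values are invented.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_variables(x1, x2):
    """VIF when the model has exactly two explanatory variables."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

VIF_CUTOFF = 7.5  # assumed rule-of-thumb threshold, not a hard limit

# Made-up explanatory variables
income = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
spending = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]   # nearly 2 * income: redundant
rainfall = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]     # only loosely related to income

vif_redundant = vif_two_variables(income, spending)
vif_ok = vif_two_variables(income, rainfall)
print(f"income vs spending VIF: {vif_redundant:.1f}")
print(f"income vs rainfall VIF: {vif_ok:.2f}")
```

With more than two explanatory variables, each VIF comes from regressing one variable on all the others, which this sketch does not cover.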

Inconsistent variance in residuals
Definition: The model may predict well for small values of the dependent variable, but becomes unreliable for large values.
Solutions: OLS tests for inconsistent residuals. Consult the robust probabilities from the output.
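To show what a test for inconsistent residual variance looks like, the sketch below computes the Koenker (studentized Breusch-Pagan) statistic for a one-variable model: regress the squared residuals on the explanatory variable and multiply that auxiliary R² by n. Choosing this particular test, the one-variable simplification, and the data are all assumptions made for illustration. With one explanatory variable the statistic is compared against a chi-square distribution with 1 degree of freedom.

```python
import math

def fit_line(x, y):
    """Fit y = b0 + b1*x by ordinary least squares; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def koenker_bp(x, y):
    """Return (statistic, p_value) for a model with one explanatory variable."""
    b0, b1 = fit_line(x, y)
    u = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]  # squared residuals
    a0, a1 = fit_line(x, u)                                   # auxiliary regression
    mean_u = sum(u) / len(u)
    ss_tot = sum((ui - mean_u) ** 2 for ui in u)
    ss_res = sum((ui - (a0 + a1 * xi)) ** 2 for xi, ui in zip(x, u))
    stat = max(len(x) * (1.0 - ss_res / ss_tot), 0.0)
    # Chi-square survival function for 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up data: same trend, but the error spread either grows with x or stays fixed.
x = [float(i) for i in range(1, 13)]
signs = [-1.0, 1.0] * 6
y_growing = [2.0 * xi + s * 0.3 * xi for xi, s in zip(x, signs)]
y_constant = [2.0 * xi + s * 1.0 for xi, s in zip(x, signs)]

stat_growing, p_growing = koenker_bp(x, y_growing)
stat_constant, p_constant = koenker_bp(x, y_constant)
print(f"growing spread:  stat={stat_growing:.2f} p={p_growing:.4f}")
print(f"constant spread: stat={stat_constant:.2f} p={p_constant:.4f}")
```

A small p-value for the growing-spread data is the signal that the standard probabilities are unreliable and the robust probabilities should be consulted instead.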

Spatially autocorrelated residuals
Solutions: Run the spatial autocorrelation tool on the residuals. If there is significant clustering, there could be misspecification (a variable is missing from the model).
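Spatial autocorrelation is commonly measured with Moran's I. The sketch below is a minimal plain-Python version, assuming features arranged along a line with binary weights between adjacent neighbors. The layout, the weights, and the residual values are all invented for illustration. Values well above the expected value of -1/(n-1) indicate clustering, and values well below it indicate dispersion.

```python
def line_neighbors(n):
    """Binary spatial weights for n features on a line: adjacent features are neighbors."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w

def morans_i(values, weights):
    """Global Moran's I for the given values and spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)

weights = line_neighbors(6)

# Made-up residuals: overestimates grouped together vs. alternating over/under.
clustered = [2.0, 2.0, 2.0, -2.0, -2.0, -2.0]
alternating = [2.0, -2.0, 2.0, -2.0, 2.0, -2.0]

i_clustered = morans_i(clustered, weights)
i_alternating = morans_i(alternating, weights)
expected_i = -1.0 / (len(clustered) - 1)
print(f"clustered: {i_clustered:.2f}, alternating: {i_alternating:.2f}, "
      f"expected under no autocorrelation: {expected_i:.2f}")
```

This sketch reports only the index. Judging whether clustering is significant also needs a z-score or permutation test, which the spatial autocorrelation tool provides.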

Normal Distribution Bias
Solutions: OLS tests whether residuals are normally distributed. If they are not, the model may be misspecified or nonlinear.
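One widely used normality check for residuals is the Jarque-Bera statistic, which combines skewness and kurtosis. The sketch below is a plain-Python version under stated assumptions: the choice of this test and the residual values are for illustration, and the chi-square approximation with 2 degrees of freedom is rough for samples this small.

```python
import math

def jarque_bera(resid):
    """Return (statistic, p_value); a small p-value suggests non-normal residuals."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    stat = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Chi-square survival function for 2 degrees of freedom
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Made-up residuals: roughly bell-shaped vs. dominated by one large underestimate.
symmetric = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
skewed = [-1.0] * 9 + [9.0]

stat_sym, p_sym = jarque_bera(symmetric)
stat_skew, p_skew = jarque_bera(skewed)
print(f"symmetric: stat={stat_sym:.3f} p={p_sym:.3f}")
print(f"skewed:    stat={stat_skew:.2f} p={p_skew:.6f}")
```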

Steps of Regression
1. Determine what you are trying to predict or examine (the dependent variable).
2. Identify key explanatory variables.
3. Examine the distribution to determine the type of regression to conduct.
4. Run the regression.
5. Examine the coefficients.
6. Examine the residuals. The mean should equal 0. Overestimates and underestimates should create a random pattern. The residuals should be normally distributed. Problems could indicate missing variables.
7. Remove or add variables and repeat the regression.