Geographically Weighted Regression
1 Geographically Weighted Regression. CSDE Statistics Workshop. Christopher S. Fowler, PhD. February 1st, 2011. Significant portions of this workshop were culled from presentations prepared by Fotheringham, Charlton and Brunsdon and presented at the 2010 Advanced Workshop on Spatial Analysis at the University of California, Santa Barbara. University of Washington, Center for Studies in Demography and Ecology.
2 Outline for the Session. The motivation for GWR: examples from YOUR discipline. Mapping OLS residuals: a good baseline for why we need GWR. GWR: definitions, basic concepts. Running GWR: a straightforward implementation in ArcGIS. GWR and some extensions.
3 Basics of OLS. $y = X\beta + \varepsilon$. Assumes a stationary process: the same stimulus provokes the same response anywhere in the study area.
4 Why might relationships vary spatially? Sampling variation. Relationships intrinsically different across space (attitudes, preferences, contextual effects). Model misspecification.
5 Applications: Ecology. GWR works on trees. Could have been differentiated: the sampling pattern creates predictable and changing levels of interaction among observations.
6 Applications: Public Health. Relationships vary systematically: the relationship between mortality and occupational segregation, and between mortality and unemployment, varies across Tokyo.
7 Applications: Sociology/Public Policy. Missing variables (and they may very well be unknowable): the link between multifamily housing and residential burglaries varies widely even when controlling for numerous socioeconomic and neighborhood factors.
8 Back up. How do we know if we have nonstationarity in our model? Map residuals and test them for spatial autocorrelation: if our model errs systematically with a spatial pattern, then we may be on to something.
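The residual check on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the workshop's ArcGIS procedure: the function name `morans_i`, the four residual values, and the binary contiguity matrix are all hypothetical, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    # Cross-products of deviations for every pair of neighbouring areas
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Hypothetical OLS residuals for four areas arranged in a row.
residuals = [2.0, 1.0, -1.0, -2.0]
# Binary contiguity: each area neighbours the areas next to it.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

i_stat = morans_i(residuals, w)
print(round(i_stat, 3))  # positive: similar residuals sit next to each other
```

A value well above zero means over- and under-predictions cluster in space, which is the systematic spatial error the slide describes.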
9 Poverty in the Southern U.S.
10 Our example model. $\text{Poverty} = \beta_0 + \beta_1\,\text{FemaleHeadedHousehold} + \beta_2\,\text{Unemployed} + \beta_3\,\text{Black} + \beta_4\,\text{65andOlder} + \beta_5\,\text{Metro} + \beta_6\,\text{AtLeastHighSchoolEducation} + \varepsilon$. Based on the work of Paul Voss and Katherine Curtis. These are all understood to be good predictors of poverty. What kinds of spatial structures influence this data set?
11 Lab Part 1. Run our OLS model in ArcGIS. Examine model output. Map residuals. Calculate Moran's I and Local Moran's I.
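The lab itself is done in ArcGIS; for readers who want to see what the Local Moran's I step computes, here is a rough Python sketch. The function name, the residuals, and the contiguity matrix are hypothetical, and the scaling used (dividing by the second moment of the deviations) is one common convention, assumed here.

```python
def local_morans_i(values, weights):
    """Local Moran's I_i = (z_i / m2) * sum_j w_ij * z_j, with m2 = sum(z^2) / n."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    return [z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
            for i in range(n)]

# Hypothetical residuals for four areas in a row, with binary contiguity weights.
residuals = [2.0, 1.0, -1.0, -2.0]
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

local_i = local_morans_i(residuals, w)
# Under this scaling, the local values sum to S0 times the global Moran's I.
global_i = sum(local_i) / sum(sum(row) for row in w)
```

Positive local values flag areas whose residuals resemble those of their neighbours; mapping them shows where the clustering detected by the global statistic is located.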
12
13 Our best aspatial model
14 So what now? Add more missing variables and try again: repeat the steps from the lab. Accept that there is something about certain places that makes them different (spatial heterogeneity): try GWR. Test variables meant to explore interactions taking place at short distances (spatial dependence): try spatial regression (likely a spatial lag model). Assume that the correlation is a nuisance and control for it in the error term: try spatial regression (likely a spatial error model).
15 Outline for Part II. What is GWR? Weighting in GWR.
16 Geographically Weighted Regression. A local statistical technique to analyze spatial variations in relationships. We are not content with global averages of spatial data (climate, for example). Why should we be satisfied with global averages in a statistical analysis?
17 Put another way: Simpson's Paradox. If we think of these points as our data, grouped into colors by region, we can see that the global and local models differ significantly. Source: Rücker and Schumacher, BMC Medical Research Methodology :34 doi: /
18 Basic definitions. Spatial nonstationarity exists when the same stimulus provokes a different response in different parts of the study region. Global models are statements about processes that are assumed to be stationary and, as such, are location independent. Local models are spatial disaggregations of global models, the results of which are location specific. Spatial heterogeneity refers to spatial patterns resulting from broad similarities, usually over time. Spatial dependence refers to spatial patterns that result from interactions among observations.
19 Spatial Heterogeneity and Spatial Dependence
20 GWR and Spatial Processes. GWR is excellent at picking up broad-scale regional differences (spatial heterogeneity). Not as effective at dealing with small-scale interaction processes: too much bias in each local model. That doesn't mean it won't try (and give you misleading results).
21 GWR in a nutshell. The global model $y = X\beta + \varepsilon$ becomes $y_i = X_i\beta_i + \varepsilon_i$, where $i$ indicates that there is a set of coefficients estimated for every observation in our data set.
22 The Key Difference. We estimate a set of regression coefficients for each observation. To do so, we weight near observations more heavily than more distant ones. We may also estimate coefficients based on some local subset of observations.
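To make the idea of one weighted regression per observation concrete, here is a small Python sketch with a single predictor. It is an illustration under stated assumptions, not the ArcGIS or GWR 3.x implementation: the names `weighted_fit` and `gwr`, the Gaussian distance-decay weights, the bandwidth `h = 1.0`, and the six made-up observations (two regions with opposite slopes) are all hypothetical.

```python
import math

def weighted_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def gwr(coords, xs, ys, h):
    """One (intercept, slope) pair per observation, using Gaussian weights."""
    results = []
    for ci in coords:
        # Nearby observations get weights close to 1, distant ones close to 0
        ws = [math.exp(-0.5 * (math.dist(ci, cj) / h) ** 2) for cj in coords]
        results.append(weighted_fit(xs, ys, ws))
    return results

# Hypothetical data: a western cluster where y = 2x and an eastern one where y = 6 - x.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
xs = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0, 5.0, 4.0, 3.0]

# Global OLS is the special case where every observation has weight 1.
global_intercept, global_slope = weighted_fit(xs, ys, [1.0] * len(xs))
local_slopes = [b for _, b in gwr(coords, xs, ys, h=1.0)]
```

In this toy data the global slope averages the two regimes into a single coefficient, while the local slopes recover a positive relationship in the west and a negative one in the east.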
23 Some advantages of GWR. Excellent tool for testing model specification: where does model fit look good, and where are you missing something? Residuals are generally lower and not spatially autocorrelated.
24 Real values for β
25 Estimated Values of β in global model
26 Residuals from global model
27 Reasons to use GWR. Identify model misspecification. Identify nonstationarity in relationships. Improved model fit ($R^2$, AIC, etc.). Reduced spatial autocorrelation. Represent context. Address spatial heterogeneity when precise variables may not exist.
28 You've convinced me, what next? Run your aspatial model (as we did in the 1st lab): we will want the results and diagnostics to compare with what comes next. Decide how you are going to weight your nearby locations: fixed bandwidth, variable bandwidth, or user-defined bandwidth.
29 It all comes down to how you weight the observations. We can use a fixed bandwidth $h$: $W_{ij} = \exp\left[-\tfrac{1}{2}\left(d_{ij}/h\right)^2\right]$. The number of observations will vary, but the area they represent will remain constant.
30 Weighting option 2. Or we can employ an adaptive bandwidth: $W_{ij} = \left[1 - d_{ij}^2/h^2\right]^2$ if $j$ is one of $i$'s $N$ nearest neighbors (and $W_{ij} = 0$ otherwise). The number of observations will remain fixed, but the area will not be the same.
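The two weighting schemes can be compared side by side in a short Python sketch. The function names and the example distances are hypothetical. For the adaptive case the sketch assumes that the local bandwidth $h$ is the distance to the $N$-th nearest neighbor, which is one common choice and is not stated on the slides.

```python
import math

def gaussian_weights(dists, h):
    """Fixed kernel: W_ij = exp(-0.5 * (d_ij / h)^2) with one bandwidth h everywhere."""
    return [math.exp(-0.5 * (d / h) ** 2) for d in dists]

def adaptive_bisquare_weights(dists, n_neighbors):
    """Adaptive kernel: W_ij = (1 - d_ij^2 / h^2)^2 inside the local bandwidth, else 0."""
    # Assumption: h is the distance from i to its N-th nearest neighbour
    h = sorted(dists)[n_neighbors - 1]
    return [(1 - (d / h) ** 2) ** 2 if d < h else 0.0 for d in dists]

# Hypothetical distances from one observation i to four other observations.
dists = [1.0, 2.0, 4.0, 8.0]

fixed = gaussian_weights(dists, h=2.0)
adaptive = adaptive_bisquare_weights(dists, n_neighbors=3)
```

With the fixed kernel every observation keeps some weight, however small; with the adaptive bisquare kernel the weights fall to exactly zero at the local bandwidth, so the bandwidth stretches in sparse areas and shrinks in dense ones.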
31 Kernels and Weights. Bandwidth specifies the shape of the weights curve. Kernel type tells us whether we will define our bandwidth based on distance (fixed) or number of neighbors (adaptive). So how do we know what bandwidth to use?
32 Judging the appropriate bandwidth. A tradeoff between bias (we include observations that are not part of the same spatial group) and variance (we don't have enough points in our model to say anything with conviction). [Figure: AIC plotted against bandwidth, with variance dominating at small bandwidths, bias dominating at large bandwidths, and the optimum in between.] AICc or CV measure model fit; optimize fit to obtain the best bandwidth.
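As a rough sketch of bandwidth selection by cross-validation (CV), the Python below scores a few candidate fixed bandwidths by leave-one-out prediction error and keeps the lowest. Everything here is an illustrative assumption: the one-predictor model, the Gaussian kernel, the candidate list, and the synthetic data in which the slope drifts with position. Real implementations search the bandwidth continuously and often use AICc instead.

```python
import math

def weighted_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cv_score(pos, xs, ys, h):
    """Sum of squared leave-one-out prediction errors for bandwidth h."""
    n = len(pos)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]  # drop observation i itself
        ws = [math.exp(-0.5 * ((pos[i] - pos[j]) / h) ** 2) for j in others]
        a, b = weighted_fit([xs[j] for j in others], [ys[j] for j in others], ws)
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total

# Hypothetical 1-D study area in which the slope grows from west to east.
pos = list(range(10))
xs = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 10.0]
ys = [(1 + 0.2 * p) * x for p, x in zip(pos, xs)]

candidates = [1.0, 2.0, 4.0, 100.0]  # 100.0 behaves almost like a global model
scores = {h: cv_score(pos, xs, ys, h) for h in candidates}
best_h = min(scores, key=scores.get)
```

Leaving observation i out of its own local fit is what stops the score from always favoring the smallest bandwidth.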
33 To sum up. Weighting assumptions are very important to outcomes in GWR. A fixed distance kernel is more appropriate when the distribution of your observations is relatively stable across space (e.g. size, number of neighbors). An adaptive kernel is appropriate when the distribution varies across space (e.g. events are clustered or polygons are heterogeneous). Once a kernel type is selected, optimization takes some of the guesswork out of it, but robustness checks are still needed.
34 Residuals from the OLS model from last lesson. Looks reasonably good. Moran's I is still 0.22 and highly significant.
35 Lab. Run GWR model. Check residuals. Check variation in coefficients.
36
37 Further topics/issues in GWR. Where to go for next steps: general troubleshooting, significance testing, outlier problems, Poisson and logistic model implementations, mixed-form models.
38 Other software implementations of GWR. GWR 3.x (4.0 should be out soon). R (spgwr package). Stata. Matlab. Perhaps others I haven't heard of.
39 General Troubleshooting. Regional dummies: BAD. Eliminate them from the model; we are trying to show regional variation, not control for it. Binary and low-probability count variables: use caution, as lack of variation may cause the model to crash or have trouble finding a workable bandwidth.
40 Significance Testing. How do I know if the variation I see in my coefficients is meaningful? Could do a t-test, but you will run into problems with multiple (1,387) tests, which results in lots of false positives. The standard correction (Bonferroni) will make any significance finding nearly impossible.
41 Best Method: Monte Carlo simulation. Randomly reassign all observation values (dependent and independent variables travel together) to different observation locations: each county's data gets assigned randomly to a different county. Re-run GWR and record coefficients. Repeat lots of times (at least 100). Define a distribution for coefficient values and compare your coefficients to this distribution.
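The permutation procedure can be sketched in Python as follows. This is a toy version under several assumptions that go beyond the slide: a single predictor, a fixed Gaussian kernel with a hypothetical bandwidth, six made-up observations, and the standard deviation of the local slopes as the summary statistic compared against the simulated distribution. The slide does not say which summary of the coefficients to compare.

```python
import math
import random
import statistics

def local_slopes(coords, xs, ys, h):
    """Slope of a Gaussian-weighted regression of y on x at every location."""
    slopes = []
    for ci in coords:
        ws = [math.exp(-0.5 * (math.dist(ci, cj) / h) ** 2) for cj in coords]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slopes.append(sxy / sxx)
    return slopes

def monte_carlo_test(coords, xs, ys, h, n_sim=199, seed=0):
    """Return the observed spread of local slopes and a pseudo p-value."""
    rng = random.Random(seed)
    observed = statistics.pstdev(local_slopes(coords, xs, ys, h))
    rows = list(zip(xs, ys))  # x and y travel together
    at_least_as_large = 0
    for _ in range(n_sim):
        rng.shuffle(rows)  # reassign each observation's data to a random location
        sim_x, sim_y = zip(*rows)
        if statistics.pstdev(local_slopes(coords, sim_x, sim_y, h)) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_sim + 1)

# Hypothetical data: slope of 2 in the western cluster, slope of -1 in the eastern one.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
xs = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5]
ys = [2.0, 4.0, 6.0, 4.5, 3.5, 2.5]

observed_sd, p_value = monte_carlo_test(coords, xs, ys, h=2.0)
```

A small pseudo p-value says that shuffled data rarely produce as much spatial variation in the coefficient as the real arrangement does.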
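42 Other method: Fotheringham Significance Test. $\alpha_{\text{Fotheringham}} = \dfrac{\alpha}{1 + p_e - \dfrac{p_e}{np}}$, where $p_e$ is the effective number of parameters, $p$ is the number of parameters, and $n$ is the number of observations.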
43 Fotheringham Significance Test. $\alpha_{\text{Fotheringham}} = \dfrac{\alpha}{1 + p_e - \dfrac{p_e}{np}}$ (37.97). In Excel we can find the significant t-statistic using: TINV( ,1379). In R we use: qt(1-( /2),1379). Either way we get a value of ~
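A hedged Python sketch of the adjustment follows. The inputs are assumptions made for illustration: the 37.97 on the slide is read as $p_e$, $n = 1387$ is taken from the number of tests mentioned earlier, $p = 8$ is inferred from the 1379 degrees of freedom, and the starting $\alpha$ of 0.05 is a conventional choice. None of these are confirmed by the slides. The critical value uses a normal approximation from the standard library in place of the t distribution, which is close at this many degrees of freedom but not identical to the Excel or R result.

```python
from statistics import NormalDist

def adjusted_alpha(alpha, p_e, n, p):
    """Adjusted significance level: alpha / (1 + p_e - p_e / (n * p))."""
    return alpha / (1 + p_e - p_e / (n * p))

# Assumed inputs (see the paragraph above); not confirmed by the slides.
alpha_adj = adjusted_alpha(alpha=0.05, p_e=37.97, n=1387, p=8)

# Two-sided critical value under a normal approximation to the t distribution.
z_crit = NormalDist().inv_cdf(1 - alpha_adj / 2)
```

Because the divisor is roughly $1 + p_e$ rather than the full number of tests, the adjusted threshold is far less severe than a Bonferroni correction over 1,387 tests.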
44 Results: Significant Nonstationarity for Percent Hispanic
45 Outlier problems. Outliers cause problems for everybody, but their impact is greater for local regressions, particularly when the bandwidth keeps the number of observations low. In standard OLS: run the model and identify observations with high or low residuals (~ +/- 4); weight these observations less than 1; re-run until none of the observations have extreme residuals. Now do your GWR with the weights assigned.
46 Poisson and Logistic model forms. Implementations exist in both R and GWR 3.x software. Both require much greater care with respect to collinearity and lack of variation.
47 Mixed-form models. What if some of your variables are stationary and others have variation? Mixed-form models allow you to hold some coefficients constant while allowing others to vary. Not yet implemented in any statistical package, but not that difficult from a technical standpoint.
48 Concluding comments. What comes next? Spatial regression. Multilevel models.
