Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting




Forecasting with Regression Analysis
Richard S. Barr

Causal, Explanatory Forecasting
- Assumes a cause-and-effect relationship between system inputs and its output.
- [Diagram: Inputs -> System (cause-and-effect relationship) -> Output. The job of forecasting: the output.]

Regression Analysis
- Determines and measures the relationship between two or more variables.
- Simple linear regression: 2 variables.
- Multiple linear regression: 3+ variables.

Simple Linear Regression
- Evaluates the relationship ("going-together") of two variables: the dependent variable (Y) and the independent variable (X).
- The relationship is depicted by a straight-line model: $Y = a + bX$

Forecasting
- Build the model using historical data.
- Then use knowledge of the independent variable (X) to forecast the value of the dependent variable (Y).
- Assumptions: the relationship between X and Y is strong, and the future follows the past.

Which Is Independent?
- Variables to classify: sales, age, wear, demand, price, advertising, equipment, time, units sold.

Regression Forecasting Steps
1. Plot the scatter diagram.
2. Compute the regression equation.
3. Forecast using the regression model and estimates of X.

Scatter Diagram
- The first step for simple regression modeling.
- Used to display historical raw data and to spot patterns of relationships.
- Will help you determine if regression is appropriate.

Types of Relationships
- Direct linear (positive relationship): as X increases, Y tends to increase by a constant amount.
- Inverse linear (negative relationship): as X increases, Y tends to decrease by a constant amount.
- No correlation: a change in X tells nothing about Y.
- Nonlinear relationship: as X increases, Y changes by a varying amount.

Regression Model: Regression Line
- Expresses the relationship between X and Y as a straight line: $Y_c = a + bX$ (the regression line), where
  $Y_c$ = estimated average Y for a given X
  $X$ = actual value of the independent variable
  $a$ = estimated Y-intercept (the value of $Y_c$ if X = 0)
  $b$ = estimated slope of the regression line
- Slope: $b = \dfrac{\text{change in } Y}{\text{change in } X}$

Purposes for the Regression
- Provides a mathematical definition of the relationship. It is precise; its accuracy depends on the data fit.
- Is a standard of perfect correlation. The line can be compared with actual data values; if all values are on the line, the correlation is perfect.
- Is a model for forecasting Y using X: plug an X-value into $Y_c = a + bX$.

Which Line Is Best?
- There are many possibilities for a and b. Each defines a different line and model.
- To evaluate mathematically, let:
  $Y$ = historical value of Y for a given X
  $Y_c$ = Y calculated using X in the regression line
  $(Y - Y_c)$ = deviation, the error between actual and model forecast

Measuring Goodness of Fit
- Sum of the deviations: $\sum_{i=1}^{n} (Y_i - Y_{c,i})$. This is 0 for any line going through $(\bar{X}, \bar{Y})$, due to +/- cancellations.
- Sum of the squared deviations: $\sum_{i=1}^{n} (Y_i - Y_{c,i})^2$. This eliminates the sign problem and is the generally accepted "least squares" criterion.
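The two goodness-of-fit measures can be compared on a small example. The sketch below uses hypothetical data (not the slides' data) and three hypothetical slopes; each line is forced through the point of means, so the plain sum of deviations cancels to zero while the sum of squared deviations still tells the lines apart.

```python
# Hypothetical data for illustration only (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

def deviations(a, b):
    """Return Y - Yc for every point, where Yc = a + b*X."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

results = {}
for slope in (0.0, 0.6, 2.0):
    # Choose the intercept so the line passes through (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    dev = deviations(intercept, slope)
    results[slope] = (sum(dev), sum(d * d for d in dev))
    print(f"b={slope}: sum of deviations={results[slope][0]:.4f}, "
          f"sum of squares={results[slope][1]:.4f}")
```

All three lines have a sum of deviations of zero, so that measure cannot rank them; the sum of squares can.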

Least-Squares Regression Line
- To minimize the squared deviations use:
  $b = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$
  $a = \bar{Y} - b\bar{X}$
- where:
  $n$ = number of data points
  $\bar{X}, \bar{Y}$ = mean of the X's, Y's
  $\sum XY$ = sum of {X times Y}
  $\sum X^2$ = sum of {X's squared}

Mail Order Sales vs. Advertising
[Table: date of advertising, $ spent on advertising, $ sales in next week; six weekly observations from September to October.]

Scatter Plot
[Figure: scatter plot of Y, Sales against X, Advertising ($000s).]

Computing the Regression Line
- Step 1: Sum column (1) for $\sum X$.
- Step 2: Sum column (2) for $\sum Y$.
- Worksheet columns: (1) Advert. X, (2) Sales Y.

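- Step 3: (1) × (2) = (3); sum for $\sum XY$.
- Step 4: (1)² = (4); sum for $\sum X^2$.
- Worksheet columns: (1) Advert. X, (2) Sales Y, (3) (1) × (2), (4) (1)².
- Step 5: Compute the mean of X: $\bar{X} = \dfrac{\sum X}{n}$
- Step 6: Compute the mean of Y: $\bar{Y} = \dfrac{\sum Y}{n}$
- Compute b: $b = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$
- Compute a: $a = \bar{Y} - b\bar{X}$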
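The six worksheet steps can be sketched in a few lines of Python. The data below are hypothetical stand-ins for columns (1) and (2), not the mail-order figures, and the variable names are illustrative.

```python
# Hypothetical worksheet columns (not the slides' mail-order data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # column (1): X
ys = [2.0, 4.0, 5.0, 4.0, 5.0]   # column (2): Y
n = len(xs)

sum_x = sum(xs)                               # Step 1: sum of X
sum_y = sum(ys)                               # Step 2: sum of Y
sum_xy = sum(x * y for x, y in zip(xs, ys))   # Step 3: column (3) = (1)*(2), summed
sum_x2 = sum(x * x for x in xs)               # Step 4: column (4) = (1) squared, summed
x_bar = sum_x / n                             # Step 5: mean of X
y_bar = sum_y / n                             # Step 6: mean of Y

# Least-squares slope and intercept from the column totals.
b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
a = y_bar - b * x_bar

print(f"Yc = {a:.2f} + {b:.2f} X")
```

For this hypothetical data the column totals are 15, 20, 66 and 55, which give a slope of 0.6 and an intercept of 2.2.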

The Regression Equation
- The resultant equation: $Y_c = 7.4 + 34.49X$
- Interpretation and reasonableness check: what do a = 7.4 and b = 34.49 mean?
- Forecast sales with $1800 advertising.

Evaluating the Model: How Well Did We Do?

Compare Actuals with Estimates
[Table: for each observation, the model estimate $Y_c$, the error $(Y - Y_c)$, and the squared error $(Y - Y_c)^2$.]

Correlation Analysis
- Measures the degree of association between two variables.

Measuring Correlation
- We compare two approaches to estimating or forecasting Y for a given X:
  using the mean of Y, $\bar{Y}$
  using our least-squares regression line
- We could use $\bar{Y}$ to estimate Y (for any X) and, on average, be OK.
- Can regression do better? Variation analysis answers this.
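A forecast is a single substitution into the regression equation. The sketch below takes the coefficients exactly as they appear in this transcript (7.4 and 34.49), which may not be the full-precision values from the original slides, and assumes X is measured in thousands of dollars, as the scatter plot axis label "Advertising ($000s)" suggests. The function name is hypothetical.

```python
def forecast_sales(advertising_dollars, a=7.4, b=34.49):
    """Return Yc = a + b*X, with X assumed to be advertising in $000s.

    The default a and b are copied from the transcript as printed;
    treat them as illustrative rather than exact.
    """
    x_thousands = advertising_dollars / 1000.0
    return a + b * x_thousands

estimate = forecast_sales(1800)
print(f"Forecast for $1800 advertising: {estimate:.2f}")
```

With zero advertising the function returns the intercept a, which is one way to run the reasonableness check the slide asks for.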

Variation Analysis
- Let's look at variations around the regression line to see how much better it explains the Y's than the mean $\bar{Y}$.
- [Figure: a data point $(x_1, y_1)$, the regression line $Y_c$, and the horizontal mean line $\bar{Y}$.]

Explained Deviation
- Explained deviation from the mean: $(Y_c - \bar{Y})$
- This is the deviation explained by the regression line.

Unexplained Deviation
- Deviation from the mean not explained by the regression line: $(y_1 - Y_c)$

Total Deviation
- The total deviation from the mean = explained + unexplained.

Variation
- Variation is the square of deviations from the mean of Y.
- Total variation = Explained + Unexplained variation:
  $\sum (Y - \bar{Y})^2 = \sum (Y_c - \bar{Y})^2 + \sum (Y - Y_c)^2$
- Sample coefficient of determination: $r^2 = \dfrac{\text{Explained variation}}{\text{Total variation}}$

Portion Explained, $r^2$
- The fraction of variation from the mean explained by the regression line:
  $r^2 = \dfrac{\sum (Y_c - \bar{Y})^2}{\sum (Y - \bar{Y})^2}$
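The decomposition of total variation can be checked numerically. This is a minimal sketch on hypothetical data (not from the slides); it fits the least-squares line, then computes the total, explained, and unexplained variation and the resulting coefficient of determination.

```python
# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
b = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (
    sum(x * x for x in xs) - n * x_bar ** 2)
a = y_bar - b * x_bar
y_c = [a + b * x for x in xs]

total = sum((y - y_bar) ** 2 for y in ys)                # sum of (Y - Ybar)^2
explained = sum((yc - y_bar) ** 2 for yc in y_c)         # sum of (Yc - Ybar)^2
unexplained = sum((y - yc) ** 2 for y, yc in zip(ys, y_c))  # sum of (Y - Yc)^2

r_squared = explained / total
print(f"total={total:.2f} explained={explained:.2f} "
      f"unexplained={unexplained:.2f} r^2={r_squared:.2f}")
```

Here the line explains 3.6 of the 6.0 units of total variation, so the regression accounts for 60% of the variation around the mean.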

Extreme Values of $r^2$
- $r^2 = 1$: perfect linear correlation. All points are explained by the line; all points are on the line.
- $r^2 = 0$: no correlation. The regression does not explain the data any better than the mean of Y. X provides no useful information about Y in this context.

Correlation
- The correlation coefficient, r: $r = \pm\sqrt{r^2}$
- Unitless.
- Sign: + if b > 0, - if b < 0.
- Simply a different way of expressing the relationship (correlation) between two variables.

Correlation Coefficient
- $r = \pm 1$ only if a perfect linear relationship $Y = a + bX$ exists (all points on the line).
- Some think that it looks better than $r^2$: $r^2 = 0.36$ corresponds to $r = 0.6$.

[Figure: Example Scatterplot A, y against x.]

[Figure: Example Scatterplot B, y against x.]

Correlation Coefficient
- Shows the direction of the relationship and the strength of association.
- Cautions:
  It only measures linear association.
  It is unstable with a small sample size.
  It is distorted by extreme values or by including different data sets in the analysis.
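The rule that r is the square root of the coefficient of determination, signed like the slope, can be sketched as follows. The helper names and both data sets are hypothetical; the second data set simply reverses the Y values of the first to produce a negative slope.

```python
import math

def fit(xs, ys):
    """Least-squares intercept a and slope b for Yc = a + b*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (
        sum(x * x for x in xs) - n * x_bar ** 2)
    return y_bar - b * x_bar, b

def correlation(xs, ys):
    """r = +/- sqrt(r^2), taking the sign of the slope b."""
    a, b = fit(xs, ys)
    y_bar = sum(ys) / len(ys)
    explained = sum((a + b * x - y_bar) ** 2 for x in xs)
    total = sum((y - y_bar) ** 2 for y in ys)
    return math.copysign(math.sqrt(explained / total), b)

xs = [1, 2, 3, 4, 5]
r_direct = correlation(xs, [2, 4, 5, 4, 5])    # upward-sloping data
r_inverse = correlation(xs, [5, 4, 5, 4, 2])   # same values, reversed
print(f"direct r = {r_direct:.3f}, inverse r = {r_inverse:.3f}")
```

Both data sets have the same strength of association; only the sign of r changes with the direction of the relationship.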

[Figure: Nonlinear Relationship.]

Monkey Data
[Table: weight (Wt) and height (Ht) for 20 monkeys.]

Monkey & King Kong Data
[Table: the same 20 monkeys plus one extreme observation, King Kong (KK).]

Multiple Regression
- Same concept, more variables.

Multiple Regression Models
- An extension of the simple case.
- Permits use of more variables to try to explain more variation.
- Example model: $Y = a + b_1 X_1 + b_2 X_2$

Real Estate Example
- Monthly sales (Y) are related to:
  mortgage rates ($X_1$)
  number of salespersons ($X_2$)
- With simple regression models:
  $Y = a + b_1 X_1$, $r^2 = 0.36$
  $Y = a + b_2 X_2$, $r^2 = 0.25$
- Multiple regression model:
  $Y = a + b_1 X_1 + b_2 X_2$, $r^2 = 0.49$, not 0.61!
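The effect of one extreme observation on the correlation coefficient can be illustrated with a small sketch. The numbers below are hypothetical and are not the monkey measurements; the function uses the standard deviation-form formula for r, which is equivalent to taking the signed square root of the coefficient of determination.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical group with no linear association at all.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 2, 1, 3]
r_group = pearson_r(xs, ys)

# The same group plus one extreme, hypothetical observation.
r_with_outlier = pearson_r(xs + [20], ys + [20])

print(f"r without outlier = {r_group:.3f}")
print(f"r with outlier    = {r_with_outlier:.3f}")
```

A single point far from the rest moves r from zero to nearly one, which is why the scatter diagram should be inspected before trusting the coefficient.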

Real Estate Example
- Why is not more variation explained?
- Multicollinearity exists: $X_1$ is correlated with $X_2$.
- We want independence of the X's (uncorrelated).
- [Diagram: total variation, with overlapping portions explained by $X_1$ and explained by $X_2$.]

MLR Software

MLR Input
- Title line
- Variables and observations
- Labels for variables, dependent last
- For each observation j: the $X_j$ values, followed by $Y_j$, with the $X_j$'s in label order
- Blanks separate all values and labels

MLR Reports
- Descriptive statistics
- Correlation matrix and determinant
- Regression equation; for each variable:
  label
  coefficient
  beta value
  standard error of the coefficient
  t-statistic and probability that b = 0
- Analysis of variance: P(insignificant regression model)
- Summary statistics: $r^2$, $s_{y,x}$
- Residual summary (optional): residuals (errors), graph

Standard Error of the Estimate
- The standard deviation of the observed values of Y from the regression line:
  $s_{y,x} = \sqrt{\dfrac{\sum (Y - Y_c)^2}{n - 2}} = \sqrt{\dfrac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}$
- On average, how the data varies around the regression line.
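The standard error of the estimate can be computed in two equivalent ways: from the residuals directly, or from the column totals. The sketch below assumes the usual n - 2 denominator for simple regression and uses hypothetical data, not the slides' figures.

```python
import math

# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Least-squares slope and intercept.
b = (sum_xy - n * x_bar * y_bar) / (sum(x * x for x in xs) - n * x_bar ** 2)
a = y_bar - b * x_bar

# Form 1: square root of the summed squared residuals over (n - 2).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
s_residual = math.sqrt(sse / (n - 2))

# Form 2: shortcut using the totals of Y^2, Y and XY.
sum_y2 = sum(y * y for y in ys)
s_shortcut = math.sqrt((sum_y2 - a * sum(ys) - b * sum_xy) / (n - 2))

print(f"s_yx (residual form) = {s_residual:.4f}")
print(f"s_yx (shortcut form) = {s_shortcut:.4f}")
```

The shortcut form avoids computing each residual, which is convenient when only the worksheet totals are at hand.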

Confidence Intervals
- Using the 68-95-99.7 rule of normality:
  $\mu \pm 1\sigma$ includes 68% of all values
  $\mu \pm 2\sigma$ includes 95%
  $\mu \pm 3\sigma$ includes 99.7%
- $b \pm Z\, s_{y,x}$ gives a confidence interval for a given probability and associated Z-value.
- If Z = 1, there is 68% confidence that the interval contains the true regression coefficient.
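The interval rule above is simple arithmetic once b and the standard error are known. The sketch below follows the slide's expression literally; the function name and the input values are hypothetical, and the coverage levels come from the 68-95-99.7 rule.

```python
def confidence_interval(b, s_yx, z):
    """Return (lower, upper) for b +/- z * s_yx, as written on the slide."""
    half_width = z * s_yx
    return b - half_width, b + half_width

# Hypothetical slope and standard error, for illustration only.
b = 0.6
s_yx = 0.5

# Z-values and approximate coverage from the 68-95-99.7 rule.
coverage = {1: "68%", 2: "95%", 3: "99.7%"}
intervals = {z: confidence_interval(b, s_yx, z) for z in coverage}

for z, (low, high) in intervals.items():
    print(f"Z={z} ({coverage[z]}): {low:.2f} to {high:.2f}")
```

A larger Z buys more confidence at the cost of a wider, less informative interval.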