Time Series Analysis of Aviation Data




Time Series Analysis of Aviation Data
Dr. Richard Xie
February 2012

What Is a Time Series?
A time series is a sequence of observations in chronological order, such as:
- Daily closing price of the stock MSFT over the past ten years
- Weekly unemployment claims over the past two years
- Monthly airline revenue passenger miles over the past ten years
Time series analysis is useful when:
- No other data are available
- The system is too complicated to model in detail

Where to Get the Data?

What Information Are You Interested In?
- How do the data change from month to month and from year to year?
- Is there a trend?
- How much does the curve fluctuate?
- Are there seasonal effects?
- Are there unusual years or months with significantly small or large values?
- Can we forecast future values based on the time series?

Let's Work on the Data
But first, what tool will you use?
- Pencil and quadrille pad (or back of an envelope)
- Excel
- Matlab, Mathematica, Maple
- SAS, SPSS, STATA, R
- ROOT, PAW, KNIME, Data Applied, etc.
- Others

Use R!
- R is free
- R is a language, not just a statistical tool
- R produces high-quality graphics and visualizations
- A flexible statistical analysis toolkit
- Access to powerful, cutting-edge analytics
- A robust, vibrant community
- Unlimited possibilities

Where to Download R
To download R:
- Go to http://www.r-project.org/
- Choose a CRAN (Comprehensive R Archive Network) mirror, such as http://cran.cnr.berkeley.edu/
- Click the link to download R for your operating system (Linux, MacOS X, or Windows)
Or use an enhanced version of R distributed by a third party:
- Revolution R Community (http://www.revolutionanalytics.com/products/revolution-r.php)
- Free academic version of Revolution R Enterprise (http://www.revolutionanalytics.com/products/revolutionenterprise.php)

R References
- An Introduction to R, by W. N. Venables, D. M. Smith and the R Development Core Team
- For more documents and tutorials, go to http://cran.cnr.berkeley.edu/other-docs.html

Start an R Project
- RStudio is recommended as the console (http://rstudio.org/download/)
- Create a project folder for storing R scripts, data, etc., e.g. C:/Users/xie/Documents/SYST460/R projects/airline_timeseries
- Open RStudio and navigate to the project folder:
1. Use getwd() to find out the current working directory
2. Click and select the project folder
3. Click the control marked on the slide's screenshot
4. The current working directory is changed
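The same change of working directory can also be made from the console. The following is a minimal sketch, assuming a throwaway folder under tempdir() as a stand-in for the real project folder; the variable names are hypothetical.

```r
# Hypothetical project folder; tempdir() stands in for the real path.
project_dir <- file.path(tempdir(), "airline_timeseries")
dir.create(project_dir, showWarnings = FALSE)

old_dir <- getwd()     # remember the starting directory
setwd(project_dir)     # same effect as selecting the folder in RStudio
new_dir <- getwd()     # now reports the project folder
setwd(old_dir)         # restore the original working directory
```

Restoring the old directory at the end keeps the sketch from changing the state of the session it is tried in.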

Work on Data, Finally!
- Use Ctrl+Shift+N to create a new script
- Clear any existing variables in the workspace:
> rm(list = ls(all = TRUE))
- Read the data:
> rpm = read.csv("System Passenger - Revenue Passenger Miles (Jan 1996 - Oct 2011).csv")
- Use Ctrl+Enter to run the current line or selection
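Since the BTS file itself is not part of this transcript, the read step can be tried on a small stand-in file. The sketch below assumes a CSV with the two columns yyyymm and total (the names used in the slides); the values are made-up placeholders, not BTS figures, and rpm_demo is a hypothetical name.

```r
# Made-up placeholder values, NOT real BTS figures; column names follow
# the rpm$yyyymm and rpm$total usage in the slides.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("yyyymm,total",
             "199601,100",
             "199602,95",
             "199603,120"), csv_file)

rpm_demo <- read.csv(csv_file)   # header = TRUE by default
str(rpm_demo)                    # two integer columns: yyyymm and total
nrow(rpm_demo)                   # number of monthly records read
```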

What Does the Data Look Like?
> ls(rpm)
> plot(rpm)
which is equivalent to
> plot(rpm$total ~ rpm$yyyymm)

Plot It As A Time Series
> rpm.ts = ts(as.numeric(rpm$total), start = c(1996, 1), freq = 12)
> plot(rpm.ts, ylab = 'Revenue Passenger Miles')
Commands to check properties of rpm.ts
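The property-checking commands from the slide are not preserved here. The sketch below uses standard base R accessors that are commonly used for this purpose, which are not necessarily the ones on the slide, and applies them to a random stand-in series (demo.ts, a hypothetical name) covering the same months as rpm.ts.

```r
# Stand-in series: 190 random values, one per month from Jan 1996 to Oct 2011.
set.seed(1)
demo.ts <- ts(rnorm(190), start = c(1996, 1), frequency = 12)

class(demo.ts)        # "ts"
start(demo.ts)        # first period: year 1996, month 1
end(demo.ts)          # last period: year 2011, month 10
frequency(demo.ts)    # 12 observations per year
head(time(demo.ts))   # decimal times: 1996 + (month - 1) / 12
```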

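Trends and Seasonal Variation
> layout(1:2)
> plot(aggregate(rpm.ts))
> boxplot(rpm.ts ~ cycle(rpm.ts))
The annual totals show an overall increasing trend over the years. Each box in the box plot summarizes one calendar month by its maximum value, upper quartile, median, lower quartile and minimum value.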

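Window Function
Extracts the part of a time series between specified start and end points. With freq = TRUE (which R treats as frequency = 1), window() keeps one observation per year beginning at the given start, so the calls below pick out every February and every August:
> rpm.feb <- window(rpm.ts, start = c(1996, 2), freq = TRUE)
> rpm.aug <- window(rpm.ts, start = c(1996, 8), freq = TRUE)
> mean(rpm.feb) / mean(rpm.aug)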
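To see what this kind of window() call returns, here is a small sketch on R's built-in AirPassengers series (monthly, January 1949 to December 1960), used as a stand-in for rpm.ts. It writes frequency = 1 explicitly and cross-checks the result against a selection made with cycle(); the variable names are hypothetical.

```r
# AirPassengers (monthly, Jan 1949 - Dec 1960) is a stand-in for rpm.ts.
ap <- AirPassengers

# frequency = 1 keeps one value per year, starting at the given month.
ap.feb <- window(ap, start = c(1949, 2), frequency = 1)
ap.aug <- window(ap, start = c(1949, 8), frequency = 1)

# Same month selected directly through cycle(), as a cross-check.
feb.check <- ap[cycle(ap) == 2]

length(ap.feb)                # one February per year
mean(ap.feb) / mean(ap.aug)   # ratio of February to August averages
```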

Modeling Time Series - Notations

Modeling Time Series - Decomposition Models

Time Series Decomposition in R
> rpm.decom = decompose(rpm.ts)
> plot(rpm.decom)
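By default decompose() fits an additive model, in which the series is the sum of a trend, a seasonal effect and a random remainder. The sketch below checks this on the built-in AirPassengers series, used here as a stand-in for rpm.ts; the variable names are hypothetical.

```r
ap <- AirPassengers                # stand-in for rpm.ts
ap.decom <- decompose(ap)          # additive model by default

# Rebuild the series from its parts. The trend (and hence the random part)
# is NA for the first and last six months because of the centred moving average.
rebuilt <- ap.decom$trend + ap.decom$seasonal + ap.decom$random
ok <- !is.na(rebuilt)

max(abs(rebuilt[ok] - ap[ok]))     # differences are at rounding-error level
round(ap.decom$figure, 1)          # the 12 estimated seasonal effects
```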

Auto-Correlation

Autocorrelation and Correlogram
> rpm.acf = acf(as.numeric(rpm.ts), lag.max = 40)
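The values plotted in a correlogram can be reproduced by hand. The sketch below assumes the usual sample definition, in which the lag-k autocovariance uses divisor n and the autocorrelation is the lag-k autocovariance divided by the lag-0 autocovariance, and compares it with acf() on the built-in AirPassengers series (a stand-in for rpm.ts). The helper acov is hypothetical.

```r
x <- as.numeric(AirPassengers)   # stand-in for rpm.ts
n <- length(x)
xbar <- mean(x)

# Sample autocovariance at lag k (divisor n); autocorrelation r_k = c_k / c_0.
acov <- function(k) sum((x[1:(n - k)] - xbar) * (x[(1 + k):n] - xbar)) / n
r.manual <- sapply(0:5, function(k) acov(k) / acov(0))

# acf() stores lags 0, 1, 2, ... in its $acf component.
r.acf <- acf(x, lag.max = 5, plot = FALSE)$acf[, 1, 1]

round(cbind(lag = 0:5, manual = r.manual, acf = r.acf), 4)
```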

Correlogram after Decomposition
> acf(as.numeric(rpm.decom$random), na.action = na.omit, lag.max = 40)

Regression
- Trends: stochastic trends, deterministic trends
- Deterministic trends and seasonal variation can be modeled using regression
- Deterministic trends are often used for prediction
- Time series regression differs from standard regression because time series tend to be serially correlated

Linear Models

Fit A Linear Regression Model
> fit = lm(rpm.ts ~ time(rpm.ts))
> plot(rpm.ts, type = "o", ylab = "rpm")
> abline(fit)

Diagnostic Plots
> hist(resid(fit))
> acf(resid(fit))
> AIC(fit)

Linear Model with Seasonal Variables

Linear Model with Seasonal Variables - R
> Seas = cycle(rpm.ts)
Gives the position in the cycle of each observation.
> Time = time(rpm.ts)
Creates the vector of times at which rpm.ts was sampled.
> rpm.lm = lm(rpm.ts ~ 0 + Time + factor(Seas))
Fits rpm.ts to the linear model with seasonal variables. R is case-sensitive, so the formula must use Time and Seas exactly as they were defined.
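The structure of this model is easier to see on a small, readily available series. The sketch below fits the same formula to the built-in AirPassengers data, used as a stand-in for rpm.ts; the object names ap and ap.lm are hypothetical.

```r
ap <- AirPassengers            # stand-in for rpm.ts
Seas <- cycle(ap)              # month position (1-12) of each observation
Time <- time(ap)               # decimal time of each observation

# "0 +" drops the intercept so that each month gets its own level.
ap.lm <- lm(ap ~ 0 + Time + factor(Seas))

length(coef(ap.lm))            # one slope plus 12 monthly levels
coef(ap.lm)["Time"]            # estimated increase per year
```

Dropping the intercept is a design choice: with it, one month would become the baseline and the other eleven coefficients would be differences from that month.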

Take a Look at rpm.lm
> summary(rpm.lm)
The output shows the model, the residuals from fitting, and the coefficients.

Correlogram of rpm.lm Residual
> acf(resid(rpm.lm), lag.max = 40)
The correlogram indicates strong positive autocorrelation. The residuals are not purely random, so they should be modeled further.

How Random is Random? - White Noise
A time series $\{w_t : t = 1, 2, \ldots, n\}$ is discrete white noise (DWN) if the variables $w_1, w_2, \ldots, w_n$ are independent and identically distributed with a mean of zero. This implies that the variables all have the same variance $\sigma^2$ and $\mathrm{Cor}(w_i, w_j) = 0$ for all $i \neq j$. If, in addition, the variables also follow a normal distribution (i.e., $w_t \sim N(0, \sigma^2)$), the series is called Gaussian white noise.

Simulate White Noise in R
> set.seed(1)
> w = rnorm(100)
> plot(w, type = "l")
> acf(w)

Random Walk
Let $\{x_t\}$ be a time series. Then $\{x_t\}$ is a random walk if
$x_t = x_{t-1} + w_t$
where $\{w_t\}$ is a white noise series. Substituting $x_{t-1} = x_{t-2} + w_{t-1}$ and then substituting for $x_{t-2}$, followed by $x_{t-3}$ and so on gives:
$x_t = w_t + w_{t-1} + w_{t-2} + \ldots$
In practice, the series will start at some time $t = 1$. Hence,
$x_t = w_1 + w_2 + \ldots + w_t$

Simulate A Random Walk in R
> x <- w <- rnorm(1000)
> for (t in 2:1000) x[t] <- x[t - 1] + w[t]
> plot(x, type = "l")
> acf(x)
No seed is set here, so your plot will differ from the one on the slide.
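The two forms of the random walk, the recursion and the running sum of white noise terms, can be checked against each other numerically. The sketch below fixes a seed for repeatability, which the slide does not do, and also shows that first differences of the walk recover the white noise terms; x.sum is a hypothetical name.

```r
set.seed(1)                 # fixed seed so the run is repeatable
w <- rnorm(1000)

# Random walk by the recursive definition x_t = x_{t-1} + w_t.
x <- w
for (t in 2:1000) x[t] <- x[t - 1] + w[t]

# The same series as the running sum x_t = w_1 + ... + w_t.
x.sum <- cumsum(w)
all.equal(x, x.sum)

# First differences x_t - x_{t-1} give back the white noise terms w_2, ..., w_n.
all.equal(diff(x), w[-1])
```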

Stationarity

Auto-Regressive (AR) Models
The series $\{x_t\}$ is an autoregressive process of order $p$, abbreviated to AR($p$), if
$x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \ldots + \alpha_p x_{t-p} + w_t$
where $\{w_t\}$ is white noise and the $\alpha_i$ are the model parameters with $\alpha_p \neq 0$ for an order $p$ process. The model is a regression of $x_t$ on past terms from the same series; hence the use of the term autoregressive.

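Fit an AR Model
> res.ar = ar(resid(rpm.lm), method = "ols")
> res.ar
The fitted model is
$x_t = 0.7153 x_{t-1} + 0.1010 x_{t-2} + 0.0579 x_{t-3} + w_t$
> acf(res.ar$resid[-(1:3)])
OLS: ordinary least squares. The first three residuals are dropped because they are NA for an order-3 fit.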
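A quick way to build confidence in ar() is to fit it to data whose true coefficient is known. The sketch below simulates an AR(1) process with an arbitrarily chosen coefficient of 0.7 and fits it with ar() using its default Yule-Walker method rather than the OLS method used on the slide; the names x and x.ar are hypothetical.

```r
set.seed(1)
# Simulate an AR(1) process x_t = 0.7 x_{t-1} + w_t (0.7 is an arbitrary choice).
x <- arima.sim(model = list(ar = 0.7), n = 2000)

# Fit an AR(1); aic = FALSE forces the order to be exactly order.max.
x.ar <- ar(x, order.max = 1, aic = FALSE)

x.ar$order    # fitted order
x.ar$ar       # estimate of alpha_1, which should be close to 0.7
```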

Moving Average Model

Simulate an MA Process
> set.seed(1)
> b <- c(0.8, 0.6, 0.4)
> x <- w <- rnorm(1000)
> for (t in 4:1000) { for (j in 1:3) x[t] <- x[t] + b[j] * w[t - j] }
> plot(x, type = "l")
> acf(x)

Plot Results
[Figure: the simulated MA process and the correlogram of the simulated MA process. The first 3 lags show significant correlation.]
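The cut-off after lag 3 is what theory predicts for an MA(3) process. The sketch below uses ARMAacf() from the stats package to compute the theoretical autocorrelations for the coefficients used in the simulation, and works the lag-1 value by hand for comparison; rho and rho1 are hypothetical names.

```r
b <- c(0.8, 0.6, 0.4)               # MA coefficients from the simulation

# Theoretical autocorrelations at lags 0 to 6 for x_t = w_t + sum_j b_j w_{t-j}.
rho <- ARMAacf(ma = b, lag.max = 6)
round(rho, 4)                       # zero beyond lag 3

# Lag-1 value worked by hand: (b1 + b1*b2 + b2*b3) / (1 + b1^2 + b2^2 + b3^2).
rho1 <- (b[1] + b[1] * b[2] + b[2] * b[3]) / (1 + sum(b^2))
rho1
```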

ARMA Model

Fit An ARMA Model
> rpm.arma = arima(rpm.ts, order = c(1, 0, 1))
> acf(as.numeric(rpm.arma$residuals))
Strong seasonal information is left in the residuals.

ARIMA and SARIMA Model
- A time series $\{x_t\}$ follows an ARIMA($p$, $d$, $q$) process if the $d$th differences of the $\{x_t\}$ series are an ARMA($p$, $q$) process
- SARIMA is the seasonal ARIMA model, which extends the ARIMA model with seasonal terms

Fit A SARIMA Model
> rpm.arima = arima(rpm.ts, order = c(1, 1, 1), seasonal = list(order = c(1, 0, 0), period = 12))
First c(1, 1, 1): AR(1), first-order difference, MA(1)
Second c(1, 0, 0): seasonal AR(1) term with period 12
> acf(as.numeric(rpm.arima$residuals))

Forecast
Forecasting means predicting future values of a time series, $x_{n+m}$, using the set of present and past values of the time series, $x = \{x_n, x_{n-1}, \ldots, x_1\}$.
The minimum mean square error predictor of $x_{n+m}$ is
$x^n_{n+m} = E(x_{n+m} \mid x)$
In R, use the predict(model, newdata) method:
- model: a model object used for prediction
- newdata: values of the explanatory variables

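Forecast in R
> new.t <- seq(2011 + 10/12, len = 2 * 12, by = 1/12)
> new.dat <- data.frame(Time = new.t, Seas = rep(c(11:12, 1:10), 2))
> rpm.pred = ts(predict(rpm.lm, new.dat)[1:24], start = c(2011, 11), freq = 12)
> ts.plot(rpm.ts, rpm.pred, lty = 1:2)
The data end in October 2011 (time 2011.75), so the forecast starts in November 2011: new.t begins at 2011 + 10/12 and Seas begins at month 11. The column names in new.dat must match the variable names used to fit rpm.lm.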
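The alignment between forecast times and seasonal positions is the part most easily gotten wrong, so here is a sketch of the same steps on the built-in AirPassengers series (a stand-in for rpm.ts). That series ends in December 1960, so in this case the forecast starts in January and the seasonal cycle restarts at 1; the object names are hypothetical.

```r
ap <- AirPassengers                       # stand-in; ends in December 1960
Seas <- cycle(ap)
Time <- time(ap)
ap.lm <- lm(ap ~ 0 + Time + factor(Seas))

# Next 24 months: January 1961 onwards, so the cycle restarts at month 1.
new.t <- seq(1961, length.out = 24, by = 1/12)
new.dat <- data.frame(Time = new.t, Seas = rep(1:12, 2))

ap.pred <- ts(predict(ap.lm, new.dat), start = c(1961, 1), frequency = 12)
start(ap.pred)                            # first forecast period
end(ap.pred)                              # last forecast period
```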

Plot of Forecast

Forecast with SARIMA Model
> ts.plot(cbind(window(rpm.ts, start = c(1996, 1)), predict(rpm.arima, n.ahead = 48)$pred), lty = 1:2)
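predict() on an arima fit returns standard errors as well as point forecasts, which allows a rough uncertainty band. The sketch below works on the log of the built-in AirPassengers series and assumes the classic "airline" specification, order (0,1,1) with seasonal order (0,1,1), which differs from the order used on the slides; the 1.96 multiplier assumes approximately normal forecast errors. All object names are hypothetical.

```r
ap.log <- log(AirPassengers)    # stand-in series; log stabilises the variance

# Classic "airline" specification, an assumption here, not the slide's order.
ap.sarima <- arima(ap.log, order = c(0, 1, 1),
                   seasonal = list(order = c(0, 1, 1), period = 12))

fc <- predict(ap.sarima, n.ahead = 24)   # list with $pred and $se
upper <- exp(fc$pred + 1.96 * fc$se)     # approximate 95% band, original scale
lower <- exp(fc$pred - 1.96 * fc$se)

start(fc$pred)                           # first forecast month
round(cbind(lower, forecast = exp(fc$pred), upper)[1:3, ], 1)
```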

Homework
- Download and install R with RStudio
- Read An Introduction to R
- Download the System Passenger - Revenue Aircraft Miles Flown (000) (Jan 1996 - Oct 2011) data from BTS
- Read the data into R using RStudio
- Create a time series plot of the data, and plot its autocorrelation correlogram
- Decompose the time series and save the plot

Homework (Ctd.)
- Construct a linear regression model without seasonal factors, and plot the correlogram of the model's residuals
- Construct a linear regression model with seasonal factors, and identify the characteristics of the model residuals
- Fit an AR model to the residuals of the above model
- Forecast the time series for the next 24 months using the seasonal model

Exam Questions
- What are the data elements in a time series?
- What does autocorrelation mean?
- What are white noise and random walk?
- What are stationary models and non-stationary models?