Introduction to time series analysis




Margherita Gerolimetto
November 3, 2010

1 What is a time series?

A time series is a collection of observations ordered by a parameter, which for us is time. Examples occur in a variety of fields (economics, engineering, meteorology, and so on). For instance, economic time series include share prices on successive days, export totals in successive months, average incomes in successive months, and company profits in successive years (see the graph below). Marketing time series are sales figures in successive weeks or months.

2 Terminology

A series is said to be continuous when observations are made continuously in time (the term continuous is used even when the measured variable takes only a discrete set of values). A series is said to be discrete when observations are taken only at specific times, generally equally spaced (the term discrete is used for this kind of series even when the measured variable is continuous). We will generally work with discrete time series, which can arise in several ways:

1. Given a continuous time series, we can read off (or digitize) the values at equal intervals of time to produce a sampled time series (for example, temperature measured every 15 minutes, or the price of an asset every day). Series of this kind (especially economic ones) are also called stocks, since each value refers to a single point in time.

2. Suppose a variable does not have instantaneous values, but its values can be aggregated over equal intervals of time (for example, exports measured monthly, or rainfall measured monthly). Series of this kind (especially economic ones) are called flows.

One more introductory comment should be made. Time series analysis does not assume that the sample consists of iid data, as is typical in statistical theory. On the contrary, dependence is perhaps the most important feature of time series data, and the analysis must take the time order into account. Indeed, when successive observations are dependent, future values can be predicted from past observations (of course, when the time series is stochastic, as it usually is, exact predictions are not possible; they are replaced by the idea that future values have a probability distribution, conditioned on the knowledge of past values).

3 Objectives of time series analysis

The main objectives of time series analysis are:

Description: the first step in the analysis is to plot the data and obtain simple measures that give a first look at the main properties of the series. Although much more sophisticated techniques are often used, the analysis of the graph must not be neglected, since it can reveal the existence of trend, seasonality, outliers, turning points, and so on.

Explanation: this means identifying the random mechanism that generates the phenomenon of which a sequence of observations is available. When observations are taken on two or more variables, it is possible to explain the variation in one time series on the basis of the variation in another (linear systems).

Prediction: given an observed time series, a typical aim is to predict future values. This rests on the principle that the behaviour of the phenomenon in the past will be maintained in the future.

4 Approaches to time series analysis

To conduct time series analysis in practice, beyond simple description, it is often necessary to introduce a statistical model for the time series, whose

specification is somewhat arbitrary, since it is guided by convenience. Two stylized forms are:

The Error Model:

x_t = f(t) + ε_t,  t = 1, 2, ..., n

Here the series x_t is defined as the result of an explicit mathematical function of time plus a random error term, which must satisfy the typical assumptions of the regression model:

E(ε_t) = 0;  E(ε_t²) = σ² < ∞;  E(ε_t ε_s) = 0 for t ≠ s

This model is particularly appropriate for regular phenomena: the stochastic component is only conceptually added to the dynamics (consider, for example, cyclical phenomena for which the function f is a combination of harmonic waves).

The Stochastic Model:

X_t = g(ε_t, ε_{t-1}, ...)

Here the phenomenon is described as a function g of stochastic variables. The model is stochastic because the mechanism that generates the ε's (and hence the x's) is entirely stochastic.

According to how the model is conceived, two broad approaches to the analysis of time series can be considered.

4.1 Classical approach to time series analysis

The classical (or traditional) approach is mirrored in the Error Model presented above:

x_t = f(t) + ε_t,  t = 1, 2, ..., n

Every time series x_t can be considered the result of a combination of different unobserved factors (stochastic and non-stochastic), called the components of the time series:

Trend T_t: a long-term change in the mean level. The meaning of "long term" is a crucial point. Granger (1966) writes that a trend in mean could be defined as comprising all cyclic components whose wavelength exceeds the length of the observed time series.

Cycle C_t: variations exhibited by the time series at a fixed period that cannot be considered seasonality. For example, economic data are often affected by the so-called business cycle, with a period varying between 5 and 7 years.

Seasonality S_t: variations that are annual in period, for example the consumption of some products.

Irregular fluctuations a_t: what is left in the residuals after trend, cycle, and seasonal variation have been removed, and which is not predictable from past history. Verifying that the a_t component is truly random is a good guarantee that the decomposition of the series into components is correct.

The classical (or traditional) methods of time series analysis are concerned with decomposing the series into trend, cycle, seasonality, and irregular components. In other words, the objective is to understand how the function f(t) can be approximated. It is worth noting that the decomposition is, in general, not unique unless certain assumptions are made. The most common choice is the additive decomposition:

x_t = T_t + C_t + S_t + a_t

but the multiplicative decomposition is also used²:

x_t = T_t · C_t · S_t · a_t

The classical approach has a number of advantages:

1. It is based on rather intuitive concepts.
2. It can be used even when the series is rather short.
3. It is very useful when the series plot reveals that the variability is strongly dominated by one component, for example trend or seasonality.

It also has a number of disadvantages:

1. The decomposition is not unique.
2. The stochastic part lies only in the error term.

² The passage from the additive to the multiplicative form is easily obtained by applying the logarithm function.
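The footnote's point that the logarithm converts a multiplicative decomposition into an additive one can be checked numerically. The sketch below uses entirely simulated, illustrative components (the trend, seasonal, and noise choices are mine, not from the notes):

```python
import numpy as np

# Sketch: the log transform turns a multiplicative decomposition
# x_t = T_t * S_t * a_t into an additive one. All series below are
# simulated, illustrative choices.
rng = np.random.default_rng(0)
t = np.arange(1, 49)                                # 4 years of monthly data
trend = 100 * 1.01 ** t                             # T_t: exponential trend
seasonal = 1 + 0.2 * np.sin(2 * np.pi * t / 12)     # S_t: period-12 wave
noise = np.exp(rng.normal(0, 0.01, t.size))         # a_t: multiplicative error
x = trend * seasonal * noise

# After taking logs, the components enter additively
log_x = np.log(x)
additive_sum = np.log(trend) + np.log(seasonal) + np.log(noise)
```

Here log_x and additive_sum coincide term by term, which is why classical additive methods can be applied to the log of a multiplicative series.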

4.2 The modern approach to time series

In the modern approach, the observed time series is conceived as a finite realization of a stochastic process. The stochastic part coincides with the systematic part of the data generating process (DGP), not simply with the error term. This is formally described by the Stochastic Model presented above:

X_t = g(ε_t, ε_{t-1}, ...)

Compared to the Error Model, here the f(t) component is assumed to be absent or preliminarily removed, and the focus is on g(ε_t, ε_{t-1}, ...), a correlated process that governs the data generating mechanism. An inferential scheme is derived in which the observed data guide the researcher toward the model that looks most appropriate. The disadvantages typical of the classical approach are overcome here. The cornerstone of this approach is the Box-Jenkins procedure (Box and Jenkins, 1970).

5 Classical approach: estimation with mathematical functions

A first method for estimating the components of an observed time series x_t is based on mathematical functions. The main assumption is that the hypothesized function governs the dynamics over the entire period.

5.1 Trend

The trend component can usually be well represented by a regular function f(t) (for example, a polynomial). Starting from the model x_t = f(t) + ε_t, let us distinguish three operative scenarios:

1. If f(t) is known, the issue is only the estimation of the coefficients. This can be extremely easy when f(t) is linear, and more complicated when it is not. Still, apart from the computational difficulties of nonlinear estimation, it is only a matter of estimating parameters from the data, since the functional form is already specified.

2. If f(t) is unknown, before estimating the coefficients the issue is to identify a functional form that is a good approximation for the unknown

f(t). For example, if the trend appears to fluctuate with wide movements, it can be approximated by a polynomial; if it has a more periodic appearance, a combination of trigonometric terms can be an appealing option.

3. If f(t) is unknown and it is not even possible to derive an acceptable approximation, a different approach from mathematical functions is needed, based on smoothing techniques (we will see them later on), which do not rest on the (very strong!) assumption that a certain functional form holds over the entire sample period.

Suppose the trend can be classified as polynomial, so that the trend dynamics follow the form

f(t) = α₀ + α₁ t + ... + α_q t^q

hence

x_t = α₀ + α₁ t + ... + α_q t^q + ε_t

Using least squares estimation, it is possible to obtain estimates of the coefficients (by solving the system of normal equations). The resulting polynomial can be a good guide for interpreting the phenomenon. Much more caution is needed, instead, when the aim is prediction, since there is no guarantee that the functional form, assumed to hold over the sample period, also holds outside the time span of the observed series.

Another interesting case is when f(t) is supposed to follow an exponential trend:

f(t) = α₀ e^{α₁ t}

The logarithm allows linearization:

log f(t) = log α₀ + α₁ t

Often trends are not representable by linearizable functions. This happens when the phenomenon exhibits phases of extremely fast growth followed by phases of slow dynamics. Such behaviours can be represented by growth curves that are nonlinear in the parameters; a possible example in this category is the logistic function³.

³ A possible formalization is:

f(t) = e^{α₀ + α₁ t} / (1 + e^{α₀ + α₁ t})
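The least-squares fit of a polynomial trend described above can be sketched in a few lines of numpy. The "true" coefficients and the noise level below are illustrative values chosen for the demonstration:

```python
import numpy as np

# Sketch: estimating a polynomial trend f(t) = a0 + a1*t + a2*t^2
# by least squares (the normal equations are solved by numpy's
# least-squares routine). Coefficients and noise are illustrative.
rng = np.random.default_rng(1)
t = np.arange(1, 101, dtype=float)
true_coeffs = np.array([5.0, 0.3, 0.02])
x = true_coeffs[0] + true_coeffs[1] * t + true_coeffs[2] * t**2 \
    + rng.normal(0.0, 1.0, t.size)                  # x_t = f(t) + eps_t

X = np.column_stack([np.ones_like(t), t, t**2])     # design matrix [1, t, t^2]
coeffs, *_ = np.linalg.lstsq(X, x, rcond=None)
fitted_trend = X @ coeffs                           # estimated f(t) over the sample
```

Within the sample the fit recovers the coefficients well; as the text warns, extrapolating fitted_trend beyond the observed time span is a different and riskier matter.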

5.2 Seasonality

The regression scheme presented for the trend component can also be adopted for the seasonal component, which is representable by a periodic function g(t). Recall that periodic functions are those whose value at time t repeats at regular intervals of period s:

g(t) = g(t + s) = g(t + 2s) = ...

with s = 4 for quarterly data and s = 12 for monthly data. So the idea is that the data generating process is

x_t = g(t) + ε_t

A first approach to modelling g is to assume that it is representable as a sum of dummy variables:

g(t) = Σ_{j=1}^{s} γ_j d_{jt}

where d_{jt} is 1 in the j-th period and 0 otherwise. Then γ_j indicates the level of the phenomenon in the j-th period of the year. Using least squares it is possible to estimate the γ_j coefficients; the estimates, denoted γ̂_j, are called raw seasonal coefficients. Generally, their mean is subtracted from these coefficients to obtain the centered coefficients γ̂_j − γ̄̂, since the latter have the property of summing to 0 over the year, which is compatible with the concept of seasonality (the effect should not last more than one year).

A second possibility is to assume that the seasonal component is given by a sum of harmonic functions. Generally a sum of 2 or 3 harmonic waves is a reasonable option to represent reasonably complex seasonal dynamics.

6 Classical approach: moving averages

When an analytic description of the phenomenon is not possible, for example because the dynamics exhibited by the phenomenon are extremely irregular,

moving averages are a good answer. Intuitively, moving averages try to capture the dynamics of the phenomenon without necessarily defining a functional form that holds as a law of variation over the entire time span.

6.1 Main characteristics of moving averages

The logic of moving averages can be summarized as follows: to estimate a component using moving averages, one applies to the observed series x_t a transformation M such that the desired component is preserved and the other fluctuations are smoothed out. In this view, a moving average is a weighted sum of the observed values of x_t around the generic instant t:

M x_t = Σ_{i=-m₁}^{m₂} θ_i x_{t+i}

where m₁ + m₂ + 1 is the order of the moving average. If θ_i = θ_{-i} and all θ_i = 1/(m₁ + m₂ + 1), the moving average is called simple; if m₁ = m₂, it is called centered.

A time series x_t is invariant to the moving average M if M x_t = x_t; in this case the moving average is said to preserve x_t. In general, the formal conditions guaranteeing that a moving average M preserves a polynomial of degree p are:

Σ_{i=-m₁}^{m₂} θ_i = 1;  Σ_{i=-m₁}^{m₂} i^r θ_i = 0 for r = 1, 2, ..., p

6.2 Using moving averages

Moving averages can be used to:

1. estimate the trend;
2. remove seasonality from a series;
3. reduce the random component.

With these three objectives in mind, the following are the basics of using moving averages:

1. There are moving averages that preserve a polynomial trend up to a certain order; hence, when estimating the trend component, be careful to use moving averages that preserve it, i.e. choose the coefficients θ appropriately.

2. If a moving average of order s is applied to an observed series characterized by a wave of period s, this wave is eliminated; hence, when removing the seasonal component, choose the order of the moving average accurately.

3. The variability due to the error component ε can be reduced if a moving average with coefficients such that Σ_{i=-m₁}^{m₂} θ_i² < 1 is used.

To estimate the trend component of an observed time series x_t using moving averages (in moving-average language, we could say that we want to emphasize the trend component), one must take into account the issues presented above. So, if x_t is a monthly time series, one needs an average over 12 subsequent values to produce a series that is not influenced by the seasonal component (hence leaving only the trend component!). Thus, a centered moving average of 13 terms is a good option for estimating the trend (or trend-cycle, if there is one), and a moving average of 5 terms is good for quarterly data.

Moving averages are also used to estimate the seasonal component, generally with the further aim of removing seasonality from the series in order to produce what is called a deseasonalized time series. The latter is extremely informative, especially when one is interested in comparing the level of the phenomenon in subsequent months (for example, to evaluate the real dynamics) without this judgment being spuriously affected by seasonality. Estimating and removing seasonality is a rather interesting issue that deserves a deeper discussion; for brevity we do not focus on it here, but further details can be found in time series manuals (e.g. Chatfield, 1989).
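The 13-term moving average for monthly data mentioned above has weights 1/24 at the two extremes and 1/12 elsewhere. A minimal sketch, on a simulated series built from a linear trend plus a period-12 wave, shows both properties at once: the linear trend is preserved and the seasonal wave is eliminated.

```python
import numpy as np

# Sketch: a centered 13-term moving average for monthly data, with
# weights 1/24 at the two extremes and 1/12 elsewhere. By the
# preservation conditions it keeps a linear trend intact, and it
# eliminates any wave of period 12. The input series is simulated.
weights = np.r_[0.5, np.ones(11), 0.5] / 12         # symmetric, sums to 1

t = np.arange(120, dtype=float)                     # 10 years of monthly data
trend = 2.0 + 0.5 * t                               # linear trend component
seasonal = 3.0 * np.sin(2 * np.pi * t / 12)         # period-12 seasonal wave
x = trend + seasonal

smoothed = np.convolve(x, weights, mode="valid")    # loses 6 points at each end
# smoothed now tracks the trend alone: the seasonal wave is filtered out
```

The 6 lost points at each end are the end-effects problem discussed in the next section.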
6.3 Moving averages as bandpass filters Moving averages can be seen as filters, more precisely they are bandpass filters. The idea is to fit a polynomial curve not on the whole series, but only to different parts. Put it differently, when a trend is estimated and other 9

fluctuations are smoothed, this is like filtering out the desired component and exclude the non-desired components. Whenever a symmetric filter is applied, there is likely to be an end-effects problem. In some situations this is not necessarily a problem, in some other it is and it is then important to get smoothed values up to t = N. To do this an option (only a possible one...) for the analyst is to use asymmetric filters which only involve present and past values of x t. For example a popular technique is the exponential smoothing that can be seen as a particular asymmetric filter: x t = α(1 α) j x t j j=0 where α is a constant such that 0 < α < 1. The weights applied to observations are α(1 α) j and decrease geometrically with j. How do we choose an appropriate filter? The answer to this question requires a considerable knowledge about the phenomenon and the properties of filters. For example, to get smoothed values we want to remove the high frequency variation and preserve the low frequency variation, in this case we want what is called lowpass filter. On the contrary, if we are interested in removing long term variations and focus on high frequency variations, we will use what is called highpass filter. 10