On the Periodicity of Time-series Network and Service Metrics




Joseph T. Lizier and Terry J. Dawson
Telstra Research Laboratories, Sydney, NSW, Australia
{joseph.lizier, terry.j.dawson}@team.telstra.com

Abstract

The presence of an underlying periodicity in time-series network and service metrics has been used as a basis for some recent anomaly detection techniques. These techniques however assume the presence of a periodicity, and would benefit from the concept of a quantitative figure of merit for the strength of a given periodicity in the metric. We survey a number of potential techniques for this purpose, and find none suitable. As such, we construct such a figure of merit to suit our application. Use of the figure of merit allows selection of the most appropriate period for the metric, and we present an efficient automated method for this selection. Furthermore, this figure of merit is a useful indicator of whether periodic analysis for anomaly detection is in fact suitable for the given metric. Finally, we suggest a number of other areas where use of the figure of merit could enhance anomaly detection using periodic analysis.

I. INTRODUCTION

The presence of an underlying periodicity in various time-series has been cited in several fields, for example information and communications technology (ICT) [] and climatology []. Such periodicity has been seen as an important property primarily from a modeling and predictive perspective; in ICT for example it has been cited as useful for resource scheduling [3]. Additionally, knowledge of an underlying periodicity has found another application in ICT: anomaly detection in network and service metrics, e.g. [].

The term network and service metrics refers to variables used to characterize the performance of a network or service as a function of time. Typical metrics are bytes per minute at a network level, or transactions per minute at a service level. We consider only numeric metrics which are recorded at fixed discrete time intervals. The term anomaly detection describes efforts to detect deviations of a system from normal behaviour.
Typical deviations from normal behaviour are network failures and performance degradations. Here we consider anomaly detection focused on a single given network or service metric (i.e. univariate analysis). Any multivariate analysis (either within a system or across systems) is considered to be the task of a higher layer analysis system.

We are interested in the two dimensional time-series approach to anomaly detection described in [], [4] and [5]. In this type of technique statistics are computed for each time point along a characteristic period, using a set of historical metric values from previous corresponding points in the period. Anomaly detection is then performed on each incoming metric value using the statistics for that corresponding time point in the period.

This anomaly detection technique can benefit from a figure of merit to quantify the strength of a specific period in a given time-series metric. In this paper we seek to define such a figure of merit, and to use it to automatically identify the appropriate characteristic period and evaluate the applicability of the technique to the given metric. This motivation strongly guides our investigations, and as such we explain it in further detail in Section II.

We survey a range of potential methods for computing the figure of merit and identifying the characteristic period. However, as outlined in Section III, this survey does not identify a suitable method. Therefore, we construct a figure of merit known as the correlation of periods, which is described in Section IV. A technique to efficiently search for and identify the characteristic period for a given metric, using the figure of merit, is outlined in Section V. We test the figure of merit and search technique against a number of real sample metrics, described in Section VI. This testing confirms that the methods are suitable for the purposes intended. Finally, we present several other applications of the figure of merit in Section VII.
These applications include the important step of determining whether the given metric is suitable for anomaly detection using the aforementioned type of periodic analysis. For this purpose, we present a guide on the suitability of metrics with various ranges of the figure of merit for the strength of the characteristic period.

II. MOTIVATION: ANOMALY DETECTION USING PERIODIC ANALYSIS

Anomaly detection in ICT systems has attracted significant interest, with recent motivations including security-related intrusion detection [4]. Anomaly detection encapsulates attempts to detect both hard faults (system failures) and soft faults (performance degradations) [6]. The detection of soft faults is considered very important, as it allows for the possibility of fault containment or correction, whereas the detection of a hard fault means that a failure is already present.

The most common method of anomaly detection for ICT metrics is threshold analysis, where an alarm is raised when the metric exceeds (or falls below) a static threshold value. Most network elements have agents to perform such tests [6]. Statistical process control (SPC) is a useful method, less commonly used for anomaly detection in ICT systems. SPC improves on static threshold analysis by incorporating statistical analysis of the metric over time to calculate a region for normal operation, which can be updated. An example application of SPC techniques to ICT metrics is given in [7]. A wide variety of other mechanisms for anomaly detection in ICT metrics have been reported, ranging from monitoring regularities in individual user behaviour [8] to microscopic self-similarity [9].

By definition, anomaly detection requires an (at least implicit) understanding of the normal behaviour of the metric under analysis. Many network and service metrics exhibit an underlying periodicity. For example, Fig. 1 shows transactions per minute on a highly used system over a number of weeks. There are clear similarities between corresponding days of the week, as well as between each day. Where such periodicity exists, it is sensible for an anomaly detection technique to incorporate it in a definition of normal behaviour of the metric.

Figure 1. Transactions per minute versus time in week for Transaction system. 8 consecutive weeks are superimposed. 1 week = 10080 min.

To this end, several recently published anomaly detection techniques use the underlying periodicity of a given metric to determine normal behaviour. Data mining approaches to this are out of scope for our purposes (e.g. [10] defines temporal association rules between properties of transactions). We are concerned with techniques which compute a range for the normal behaviour of a numeric metric based on the underlying periodicity observed in historical values of the metric. We refer primarily to the work of Burgess et al. [], [4], [5]. Related work includes that of Ho et al. [6], [] and to a lesser extent Brutlag [].

In [4], for a given metric a weighted mean and standard deviation are computed for each time point along a weekly interval, using historical values from the corresponding time in previous weeks. These weighted means and standard deviations are then used to construct a time-dependent range of expected normal behaviour; incoming values outside this range are labeled as anomalies. We shall refer to this as the anomaly detection technique of interest.

The above techniques all utilize a single characteristic period. Ho et al. [6] and Brutlag [] assume their characteristic periods to be one week and one day respectively. Burgess et al. [] describe that the daily and weekly periods are the only significant patterns in Fourier analysis and autocorrelation. They conclude that the weekly period was stronger than the daily period because of the larger variation between days of the week, and because weekends introduce larger uncertainty.

However, these methods did not seek to quantify the strength of the characteristic period in the metrics under analysis. As such, they were unable to make a quantitative decision on whether a daily, weekly or other period was most appropriate for a given metric. Furthermore, [5] gives an example of a metric with no obvious periodicity, and questions whether this type of periodic analysis is appropriate for that data set.

We conclude that the anomaly detection technique of interest would benefit from the concept of a figure of merit for the underlying periodicity in a given network or service metric. In the following sections, we seek to define the figure of merit and explore how it can be used to address the above issues.

III. SURVEY OF CANDIDATES FOR FIGURE OF MERIT COMPUTATION

In this section, we survey candidates for computing a figure of merit for the strength of a given periodicity in a metric. Some candidates are perhaps more strongly aligned to a search for the characteristic period than a definition of its strength, but we include these for completeness. We restrict our scope to discrete time signals, where the characteristic period is an integer multiple of the basic unit of time.

An obvious starting point is Fourier analysis. The Fast Fourier Transform (FFT) generally shows strong daily and weekly components for our sample metrics, as expected. Typically, the daily component greatly outweighed the weekly component, even where the weekly period was stronger from a self-similarity perspective (e.g. the metric in Fig. 1). This is a sensible result since the FFT computes the strength of sinusoidal components, yet given our focus on periodic self-similarity it is highly undesirable. Fourier analysis did however show promise in suggesting candidate periods of interest.
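The Fourier observation above can be illustrated with a small sketch. The synthetic series, its weekday and weekend levels and the helper name `dft_bin_magnitude` are all assumptions made for illustration (they are not taken from the paper's data), and a single DFT bin is evaluated naively rather than with an FFT. The series repeats exactly every week but not every day, yet its daily sinusoidal component is the larger of the two.

```python
import cmath
import math

def dft_bin_magnitude(values, k):
    """Magnitude of the k-th DFT coefficient of `values` (one bin, evaluated naively)."""
    n = len(values)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(values)))

# Hypothetical metric sampled hourly for 4 weeks: a daily sinusoid whose
# level drops on the last two days of every week (assumed shape, not real data).
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 7 * HOURS_PER_DAY
series = []
for t in range(4 * HOURS_PER_WEEK):
    day_of_week = (t // HOURS_PER_DAY) % 7
    level = 1.0 if day_of_week < 5 else 0.3
    series.append(level * (1.0 + math.sin(2 * math.pi * t / HOURS_PER_DAY)))

# Bin index = number of cycles of the given period inside the window.
daily_strength = dft_bin_magnitude(series, len(series) // HOURS_PER_DAY)
weekly_strength = dft_bin_magnitude(series, len(series) // HOURS_PER_WEEK)
print(daily_strength, weekly_strength)
```

A spectral magnitude therefore ranks the daily period first here, even though only the weekly period reproduces the series exactly.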
Another suggestion from a traditional perspective is the autocorrelation function, as by definition it evaluates a given data set's self-similarity after various shifts. Fig. 2 shows the autocorrelation of the metric plotted in Fig. 1. The autocorrelation plot of this strongly periodic metric (which is similar to Fig. in []) oscillates as the metric is shifted along itself, reaching maxima after each shift of another day. On top of the daily oscillations, a weekly oscillation can be seen, with the peak after 7 days being the largest of the daily peaks, and the 3 and 4 day shifts being the smallest (of the shifts of less than a week). This elevation of the weekly period over the daily period fits with our intuitive understanding of the periodicity of this metric. Also, the property of the autocorrelation function in returning a value in [-1, 1] is quite appealing for the figure of merit.

Figure 2. Autocorrelation of transactions per minute versus time shift for Transaction system. The autocorrelation function was computed over 6 weeks of data. 1 week = 10080 min.

However, for a given shift, the autocorrelation function evaluates the similarity between data points separated by one shift. For application to anomaly detection, comparison to data points several shifts away is desirable. Also, skewing of the function is experienced where the length of input data is not an integer multiple of the characteristic period. Similarly, windowing effects are seen when using the FFT as a short-cut for the autocorrelation computation in an efficient time (via the Wiener-Khinchin Theorem [3]). Furthermore, methods would be required for differentiating meaningful periods from the high autocorrelation values at low shifts, and differentiating the characteristic periods from integer multiples of themselves (which have similar autocorrelation values). Again, this function does not appear directly useful, though it shows promise for indirect use.

Several authors have reported data mining techniques for detecting periodicity, e.g. [3], [4], [5]. Typically this involves the discretization of the given data set, replacing numeric values of the data with the symbol for the corresponding interval, and performing a mining analysis for periodicity in the symbol series. From our perspective such approaches are too coarse, as the smoothing function of the discretization dilutes any numeric measure on the strength of the underlying periodicity.

We also note the reporting of a square coherence statistic in []. However this statistic is computed from a frequency domain perspective so does not give a true measure of periodicity from a time domain similarity perspective.

An interesting technique in [6] describes computing the strength of a periodicity via the singular value decomposition (SVD) operation applied to the data series arranged as a matrix (m being the number of periods of length n). The strength is given as the ratio of the first two singular values, s1/s2. The computation for a given periodicity is said in [6] to require O(mn²) time; since n > m in general for our periods of interest, this is quite significant. Also, multiplicative differences in the shape of each period do not have an adverse effect on the strength, which is undesirable. The range of the strength value, [0, ∞), is not as desirable as [-1, 1], but is still useful.

IV. CORRELATION OF PERIODS FIGURE OF MERIT

We have constructed our own figure of merit for the strength of a period in a given metric. This technique, the correlation of periods method, has been tailored for our anomaly detection purpose, and involves computing a correlation score between periods in the history of the metric. The steps of the method are as follows:

1. For a given length of historical metric values, take the largest integer number m of the given period n, and form into an n × m matrix (i.e. periods are arranged in columns).

2. Compute the linear correlation coefficient r_ij between each distinct pair of periods (i.e. between each pair of columns) i and j. Then scale the coefficient to account for the different amplitudes (encapsulated by α) and means (encapsulated by ρ) of the periods; i.e. the formula for r_ij [3] is scaled to become:

\[
r_{ij} = \alpha \, \rho \,
\frac{\sum_{t=1}^{n} \bigl(x_i(t) - \bar{x}_i\bigr)\bigl(x_j(t) - \bar{x}_j\bigr)}
     {\sqrt{\sum_{t=1}^{n} \bigl(x_i(t) - \bar{x}_i\bigr)^2 \; \sum_{t=1}^{n} \bigl(x_j(t) - \bar{x}_j\bigr)^2}},
\qquad (1)
\]
\[
\alpha = \frac{\min\Bigl(\sum_{t=1}^{n} \bigl(x_i(t) - \bar{x}_i\bigr)^2,\; \sum_{t=1}^{n} \bigl(x_j(t) - \bar{x}_j\bigr)^2\Bigr)}
              {\max\Bigl(\sum_{t=1}^{n} \bigl(x_i(t) - \bar{x}_i\bigr)^2,\; \sum_{t=1}^{n} \bigl(x_j(t) - \bar{x}_j\bigr)^2\Bigr)},
\qquad
\rho = \frac{\min(\bar{x}_i, \bar{x}_j)}{\max(\bar{x}_i, \bar{x}_j)},
\]

where x_i(t) is the value at time point t of period i and \bar{x}_i is the mean of period i.

3. Average these correlation scores to obtain the figure of merit FOM(n, m) for the strength of the given period n over m periods:

\[
\mathrm{FOM}(n, m) = \frac{2}{m(m-1)} \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} r_{ij}. \qquad (2)
\]

This technique addresses many of the shortcomings of the previously described methods, with respect to our application to anomaly detection. It takes into account how each instance of the period relates to each other instance, not only their neighbors in time, which is very well aligned with our purpose. The use of an integer number of the given period only (as also done in [6] and []) removes the problem of skewing due to an incomplete period. Also, the figures of merit for short periods are not unreasonably large (unlike those of the autocorrelation function). Furthermore, this figure of merit is defined on the range [-1, 1], which was previously noted as desirable.

Finally, the computation of this figure of merit requires O(m²n) time (there are O(m²) correlation computations, each performed in O(n) time) and generally n > m for periods of interest. This is less efficient than an autocorrelation computation for a given shift value at O(mn), however it provides a more thorough overall measure of periodic self-similarity. On the virtues of these properties, we adopt the correlation of periods method as our figure of merit for the strength of an underlying periodicity in a network or service metric.
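The three steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Octave implementation. It assumes that α is the ratio of the smaller to the larger sum of squared deviations, that ρ is the ratio of the smaller to the larger (absolute) period mean, that the most recent m whole periods are used, and that a completely flat period contributes a score of zero. The function names `scaled_correlation` and `figure_of_merit` are hypothetical.

```python
import math
from typing import Sequence

def scaled_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two periods, scaled by alpha (amplitude) and rho (mean)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    dev_a = [v - mean_a for v in a]
    dev_b = [v - mean_b for v in b]
    ss_a = sum(d * d for d in dev_a)
    ss_b = sum(d * d for d in dev_b)
    if ss_a == 0.0 or ss_b == 0.0:
        return 0.0  # assumption: a flat period is treated as uncorrelated
    r = sum(x * y for x, y in zip(dev_a, dev_b)) / math.sqrt(ss_a * ss_b)
    alpha = min(ss_a, ss_b) / max(ss_a, ss_b)          # assumed amplitude scaling
    big = max(abs(mean_a), abs(mean_b))
    rho = min(abs(mean_a), abs(mean_b)) / big if big > 0.0 else 1.0  # assumed mean scaling
    return alpha * rho * r

def figure_of_merit(values: Sequence[float], period: int) -> float:
    """Average scaled correlation over all distinct pairs of whole periods."""
    m = len(values) // period
    if period < 2 or m < 2:
        raise ValueError("need a period of at least 2 points and at least 2 whole periods")
    recent = list(values)[len(values) - m * period:]   # assumption: keep the newest m periods
    periods = [recent[k * period:(k + 1) * period] for k in range(m)]
    total = 0.0
    for i in range(m - 1):
        for j in range(i + 1, m):
            total += scaled_correlation(periods[i], periods[j])
    return 2.0 * total / (m * (m - 1))

# A pattern of length 4 repeated three times: period 4 fits, period 3 does not.
history = [1.0, 3.0, 2.0, 5.0] * 3
fom_true = figure_of_merit(history, 4)
fom_wrong = figure_of_merit(history, 3)
print(fom_true, fom_wrong)
```

The nested loop over pairs is what gives the O(m²n) cost noted above, since each of the m(m-1)/2 correlations touches n points.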
The technique is also somewhat extensible to cater for potential variations in the relevant anomaly detection techniques. References [4] and [5] describe computing the anomaly detection statistics from the historical values of the metric using exponential decay of the contribution of each week. This could be reflected in a similar decay of the contribution of each correlation score to the figure of merit, based on the relative times of the two periods (to each other and to the present time).

Also, the anomaly detection technique of interest does not incorporate a linear trend into the range of expected values. If this were to be done, then the computation of the figure of merit should involve the removal of the linear trend from the historical window. The linear trend would then be incorporated into the range of expected values in the future.
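To make the exponentially decayed statistics concrete, the sketch below computes a weighted mean and standard deviation for each time point in the period and flags values outside a band around the mean. It is only an illustration under stated assumptions: the decay factor of 0.8 per period, the band of three standard deviations and the function names `slot_statistics` and `is_anomaly` are hypothetical choices, not the formulation used in [4] or [5].

```python
import math

def slot_statistics(history, period, decay=0.8):
    """Weighted (mean, std) for each time point in the period.

    The newest period has weight 1; each older period is down-weighted by
    a further factor of `decay` (a hypothetical parameter).
    """
    m = len(history) // period
    recent = history[len(history) - m * period:]
    weights = [decay ** (m - 1 - k) for k in range(m)]
    weight_sum = sum(weights)
    stats = []
    for slot in range(period):
        samples = [recent[k * period + slot] for k in range(m)]
        mean = sum(w * s for w, s in zip(weights, samples)) / weight_sum
        var = sum(w * (s - mean) ** 2 for w, s in zip(weights, samples)) / weight_sum
        stats.append((mean, math.sqrt(var)))
    return stats

def is_anomaly(value, slot, stats, width=3.0):
    """True if `value` lies outside mean +/- width * std for its time point."""
    mean, std = stats[slot]
    return abs(value - mean) > width * std

# Three periods of three time points each.
history = [10.0, 20.0, 30.0, 11.0, 21.0, 29.0, 9.0, 19.0, 31.0]
stats = slot_statistics(history, period=3)
print(is_anomaly(10.5, 0, stats), is_anomaly(25.0, 0, stats))
```

The same weights could in principle be applied to the pairwise correlation scores when averaging them into the figure of merit, as the paragraph above suggests.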

V. EFFICIENT SEARCH FOR THE CHARACTERISTIC PERIOD

Intuitively, the characteristic period of the metric should be the period with the highest figure of merit. For our application, the realistic major candidates for the characteristic period are one day or one week, so perhaps only these candidates need have their figures of merit evaluated. Nonetheless, it is useful to understand how the characteristic period would be identified if this assumption were invalid or for other applications.

Appendix I shows that an exhaustive search computing the figure of merit for all possible periods n will have asymptotic complexity O(N² log N), where N is the length of the historical window of metric values (note: N ≥ nm for any given period n and number of periods m). It is desirable to find a search method more efficient than this. Furthermore, periods at integer multiples of the characteristic period are likely to display peaks in the figure of merit at values similar to that of the characteristic period. This is also observed in [6] with respect to the SVD based method. It is desirable for the search method to distinguish between these and identify only the characteristic period.

To address these issues, we have constructed a more efficient method to search for the characteristic period using the figure of merit. This method harnesses advantageous properties of the Fourier transform and autocorrelation function, and is composed of the following steps:

1. Take the autocorrelation function of the given window of historical metric values. This is done harnessing the Wiener-Khinchin Theorem [3] to compute the autocorrelation function via the FFT in O(N log N) time.

2. Take the FFT of the autocorrelation function. As Fig. 2 shows, the important candidate periods produce an oscillation in the autocorrelation function, so will produce peaks in its FFT (at corresponding frequencies). (Note: step 2 can be computationally eliminated, because step 1 is computed first via the Wiener-Khinchin Theorem shortcut.)

3. Select a number of local maxima from this FFT as frequencies of interest. This can be done by selecting a fixed number of the most significant local maxima, or all local maxima with a minimum spectral content.

4. For each frequency of interest:
   a. Convert the frequency of interest to the corresponding range of periods of interest.
   b. For each period of interest, compute the figure of merit for the strength of the period.
   c. Store the period of interest with the maximum figure of merit for this frequency of interest. This period becomes a candidate period. If this period is at the edge of the range of periods of interest, continue evaluating figures of merit beyond the range in order to find a local maximum (this accounts for skewing in the FFT).

5. From the candidate periods, select that with the largest figure of merit as the characteristic period. As discussed in Section VII, if the selected characteristic period has a figure of merit below a certain threshold, it should be concluded that no characteristic period exists.

Appendix I shows that this algorithm scales as O(N log N) for an increase in the time length of historical data analyzed. This is a better scaling than the exhaustive search. However, the algorithm is also shown to scale as O(N² log N) for an increase in the density of points per unit time. While this scaling is no better than the exhaustive search, the performance of this algorithm is better by a significant constant factor, since by definition it evaluates the figure of merit for a finite proportion of possible periods. We conclude that the algorithm is a definite improvement in efficiency over an exhaustive search.

This method also effectively eliminates the candidacy of integer multiples of the characteristic period, since they will not correspond to a major frequency component in the FFT. Similarly, harmonics of the characteristic period's frequency are likely to produce only candidate periods with insignificant figures of merit. This is a significant improvement as this issue was highlighted in related work, e.g. [6].
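The search steps above can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' implementation: the power spectrum is computed with a naive O(N²) DFT rather than an FFT, the figure of merit uses the assumed min/max forms of α and ρ, each spectral bin k is mapped to the periods between N/(k+0.5) and N/(k-0.5), and the extension beyond the edge of the range in step 4c is omitted. All function names and the synthetic test series are hypothetical.

```python
import cmath
import math

def scaled_correlation(a, b):
    """Pearson correlation scaled by assumed amplitude (alpha) and mean (rho) ratios."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    da, db = [v - ma for v in a], [v - mb for v in b]
    sa, sb = sum(d * d for d in da), sum(d * d for d in db)
    if sa == 0.0 or sb == 0.0:
        return 0.0
    r = sum(x * y for x, y in zip(da, db)) / math.sqrt(sa * sb)
    big = max(abs(ma), abs(mb))
    rho = min(abs(ma), abs(mb)) / big if big > 0.0 else 1.0
    return (min(sa, sb) / max(sa, sb)) * rho * r

def figure_of_merit(values, period):
    """Average scaled correlation over all pairs of the newest m whole periods."""
    m = len(values) // period
    recent = values[len(values) - m * period:]
    cols = [recent[k * period:(k + 1) * period] for k in range(m)]
    total = sum(scaled_correlation(cols[i], cols[j])
                for i in range(m - 1) for j in range(i + 1, m))
    return 2.0 * total / (m * (m - 1))

def power_spectrum(values):
    """|DFT|^2 of the mean-removed series, i.e. the transform of its circular
    autocorrelation (Wiener-Khinchin). Naive DFT used only for brevity."""
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    return [abs(sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centred))) ** 2
            for k in range(n // 2 + 1)]

def find_characteristic_period(values, num_peaks=3):
    """Return (period, figure of merit) for the best candidate period."""
    n = len(values)
    spectrum = power_spectrum(values)
    peaks = [k for k in range(2, len(spectrum) - 1)
             if spectrum[k] > spectrum[k - 1] and spectrum[k] >= spectrum[k + 1]]
    peaks.sort(key=lambda k: spectrum[k], reverse=True)
    best_period, best_fom = None, float("-inf")
    for k in peaks[:num_peaks]:
        low = max(2, math.ceil(n / (k + 0.5)))       # shortest period mapped to bin k
        high = min(n // 2, math.floor(n / (k - 0.5)))  # longest period mapped to bin k
        for period in range(low, high + 1):
            fom = figure_of_merit(values, period)
            if fom > best_fom:
                best_period, best_fom = period, fom
    return best_period, best_fom

# Hypothetical hourly metric over 4 weeks with a weekday/weekend level change.
series = [(1.0 if (t // 24) % 7 < 5 else 0.3) * (1.0 + math.sin(2 * math.pi * t / 24))
          for t in range(4 * 168)]
best_period, best_fom = find_characteristic_period(series)
print(best_period, best_fom)
```

On this synthetic series the largest spectral peak is the daily one, but the figure of merit evaluated around each peak selects the weekly period, which is the behaviour the steps above are designed to produce.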
Also, if the daily and weekly periods are the only candidates evaluated, this method provides some insight into whether the weekly period is strong in its own right, or only as an integer multiple of the daily period. This amounts to whether the weekly period produces a major frequency component of the FFT of the autocorrelation.

Finally, it is important to note that this efficient search method is not specific to our correlation of periods figure of merit. That is to say, it could be used with any related figure of merit method, e.g. the SVD based method [6].

VI. RESULTS OF APPLICATION TO SAMPLE DATA SETS

In order to assess the suitability of the correlation of periods figure of merit, we applied it to several real sample metrics. The results are shown in Table I. For each metric, the computation was made over various lengths of historical data (from 1 month up to 1 year in some cases) and from various starting points in time; this results in a range of figures of merit for each metric.

For each data set here (aside from the Single internet customer) the weekly period was computationally stronger than the daily period, though the difference was more pronounced in some metrics than others. This was expected from a visual inspection of the data sets. This is by no means a universal result; there are data sets where the daily period will be the characteristic period (e.g. that shown in []).

A reasonable spread of figures of merit is displayed by our metrics. There is no correct answer to compare the results with; however a visual inspection of the metrics confirms that the relative ranking of the metrics' periodicity by the figure of merit seems qualitatively correct. For example, the figures of merit for the weekly period of Transaction System indicate a strongly periodic metric; this is confirmed by the visualization of this metric in Fig. 1. Also, the negligible figures of merit for the weekly and daily periods in the Single internet customer metric were confirmed in that no periodicity was evident in a visual inspection of the metric.

These qualitative results provide evidence that the correlation of periods figure of merit is an appropriate measure of the strength of a given periodicity in a metric, from the perspective of the anomaly detection technique of interest.

TABLE I. FIGURES OF MERIT FOR SAMPLE NETWORK AND SERVICE METRICS

Metric name                             Sampling interval   Figure of merit range
                                                            Weekly period    Daily period
Transaction system trans./min           5 min               0.86 to 0.96     0.70 to 0.77
Transaction system trans./min           min                 0.64 to 0.7      0.30 to 0.36
Telephony platform calls/min            1 min               0.86 to 0.97     0.59 to 0.67
IP router bytes/min in                  5 min               0.5 to 0.57      0.4 to 0.7
IP router bytes/min in                  5 min               0.6 to 0.67      0.5 to 0.30
Single internet customer bytes/min in   5 min               0.000            0.008

In addition to the suitability of the figure of merit computations, the efficient search for the characteristic period was observed to function well. For all sample metrics except for Single internet customer, the technique successfully identified one week as the characteristic period of the metric. An example search over 6 weeks of data of the Telephony platform calls/min metric (with a 1 minute sampling interval, giving 10080 time points per week) took on the order of minutes. It must be noted that the search technique is currently implemented in scripts for the Octave mathematical environment [7], which are run in a single threaded mode. While the time for the code to run is not prohibitive, it can certainly be improved with parallel figure of merit calculations in compiled rather than scripted code.

It was noted that a shift in to or out of daylight saving during the historical data affected the figure of merit computations where the timestamp used was a Universal Time Co-ordinate (UTC) or offset equivalent. The use of local time co-ordinates (that shift with daylight saving) was found to uphold the identification of the expected period.

Finally, notice that the sample data sets here include network elements in both telephony and IP as well as transaction systems. The results demonstrate the broad applicability of the figure of merit across various types of metrics, and importantly the applicability of the anomaly detection technique of interest.

VII. APPLICATIONS FOR THE FIGURE OF MERIT

The primary application for the figure of merit is identification of the characteristic period of the metric under analysis. However, there are a number of areas to which the utility of the figure of merit can be extended. Some have been explored here, others are left to future work.

An important use of the figure of merit is in determining whether the anomaly detection technique of interest is valid for the given metric. If the metric does not display a strong periodicity, then other anomaly detection techniques such as traditional threshold analysis or SPC may be more appropriate. In attempting to apply an anomaly detection technique based on [4] and [5] to our sample data sets, we have made observations for the following approximate ranges of the figure of merit for the characteristic period:

1. Greater than 0.75: a figure of merit in this range indicates strong periodicity. This type of technique appears highly appropriate for such metrics.

2. Between 0.5 and 0.75: a figure of merit in this range indicates moderate periodicity. This type of technique appears to have some applicability here. Its application could be improved by using a longer window of historical data to compute the statistics, and with a large degree of smoothing or local averaging (as labeled in [4]) of the statistics across neighboring points in the period.

3. Less than 0.5: a figure of merit in this range indicates mild periodicity down to no periodicity (around zero) and anti-periodicity (represented by negative scores). The technique does not appear applicable for such metrics.

Prior to the anomaly detection technique of interest being applied to any given metric, the figure of merit for the characteristic period should be evaluated (as per the above criteria) to establish whether the technique is appropriate.

The figure of merit could also be used to determine the length of historical metric values used for computing the statistics for each time point in the period.
Since the figure of merit calculation indicates the strength of the period during a given window of historical data, calculations over various window lengths could be used to detect process changes and guide selection of the window length.

Furthermore, the figure of merit could add to conjecture over the appropriate sampling rate or counting interval to use in monitoring the metric. Reference [] suggests that the sampling rate could be inferred from the autocorrelation half-life, and that no significant changes are expected over intervals of 5 minutes or so. Further insight could be gained by computing the figure of merit over various counting intervals for a given metric, and using its variation to guide the selection of the counting interval.

Of course, the figure of merit could be used to quantitatively investigate the prevalence of periodicity in various types of network and service metrics. There is some evidence that periodicity is more prevalent in metrics which represent aggregate data rather than data from individual users or from systems with low volume or sporadic use. It could be said that periodicity is a population property: while individual users are guided by similar periodic influences, the periodicity is only strongly evident when examining the group behaviour. Reference [] notes that stronger trends were observed in metrics from hosts that experience more usage, and that the strength of periodicity depends on how strongly the metric is coupled to the periodic force of the system users. These observations are echoed in our own quantitative results here: the metrics exhibiting the strongest periodicity were macroscopic metrics (e.g. total transactions per minute) from systems handling the largest number of users. Also, the only metric shown here without any significant periodicity was from a single user system.

Finally, it is worth noting that the anomaly detection technique of interest only harnesses a single characteristic period. Related methods, however, could incorporate multiple characteristic periods (be they harmonically related, as one day and one week are, or otherwise). The figure of merit could be useful in identifying the multiple periods, perhaps being computed for each candidate period after the influence of the preceding period is removed from the metric.

VIII. CONCLUSION

We have identified the need for a better understanding of periodicity in network and service metrics for an anomaly detection technique of interest. Subsequently, we constructed the correlation of periods figure of merit for the strength of the periodicity in such metrics. This figure of merit can be used to identify the characteristic period of the metric, and an efficient search technique was presented for this purpose. Another important application of the figure of merit is in quantitatively evaluating whether the anomaly detection technique of interest is appropriate for a given metric. Other uses for the figure of merit were discussed, and are potential areas for future work. Future work will also include probing the noise immunity of our figure of merit and search technique, using generated sample data sets.

APPENDIX I. ASYMPTOTIC COMPLEXITY OF SEARCHES FOR THE CHARACTERISTIC PERIOD

An exhaustive search for the characteristic period involves computations of the figure of merit over the potential periods $n = 1$ to $N/2$. Each figure of merit evaluation has complexity $O(m^2 n) = O(N^2/n)$. So, the total number of operations in the exhaustive search is:

$$
\begin{aligned}
\sum_{n=1}^{N/2} \frac{N^2}{n}
&= N^2\left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \dots + \frac{1}{N/2}\right) \\
&\le N^2\left(1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \dots + \frac{1}{2^q} + \dots + \frac{1}{2^q}\right) \\
&= N^2\left[\,1 + (1) + (1) + (1) + \dots + (1)\,\right] = N^2(1 + q), \quad \text{where } q = \log_2 N - 1.
\end{aligned}
$$

That is, the asymptotic complexity of the exhaustive search for the characteristic period is $O(N^2 \log N)$. In a practical situation, the figure of merit may not be evaluated for periods larger than, say, $N/3$ or smaller than 3; however, this does not alter the asymptotic complexity.

The run time of the efficient search is dominated by the loop over the frequencies of interest in steps 3 and 4.
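As a numerical aside on the exhaustive search, the sketch below sums an assumed per-period cost of N^2/n over candidate periods n = 1 to N/2 and compares the total with the grouped bound N^2(1 + q), where q = log2 N - 1. The function names are hypothetical and the cost model is an assumption read from the derivation above; the code only checks the arithmetic of the bound and does not compute any figure of merit.

```python
import math


def exhaustive_search_cost(n_points):
    """Total assumed cost: sum of N^2 / n over candidate periods n = 1 .. N/2."""
    return sum(n_points * n_points / period
               for period in range(1, n_points // 2 + 1))


def grouped_upper_bound(n_points):
    """Bound N^2 * (1 + q) with q = log2(N) - 1, from grouping terms by powers of two."""
    q = math.log2(n_points) - 1
    return n_points * n_points * (1 + q)


if __name__ == "__main__":
    for n_points in (64, 1024, 16384):
        cost = exhaustive_search_cost(n_points)
        bound = grouped_upper_bound(n_points)
        # Ratio of the summed cost to N^2 * log2(N); it should not exceed 1.
        ratio = cost / (n_points * n_points * math.log2(n_points))
        print(n_points, cost <= bound, round(ratio, 3))
```

Because the bound equals N^2 log2 N, the summed cost divided by N^2 log2 N cannot exceed 1, which is in line with the O(N^2 log N) estimate for the exhaustive search.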
We assume the algorithm selects a fixed number of maxima from the frequency spectrum. The number of periods of interest corresponding to this fixed number of frequencies is directly proportional to the density $d$ of samples in the time series metric, i.e. $O(d)$. Now, the figure of merit evaluation for each period of interest costs $O(m^2 n)$; however, since we do not know which periods will be periods of interest, we estimate the average cost as that of the exhaustive search averaged over each period, i.e. $O(N \log N)$. So, the run time of the efficient search scales as $O(dN \log N)$: for an increase in the time length of historical metric data this scales as $O(N \log N)$; for an increase in the density of samples it scales as $O(N^2 \log N)$.

REFERENCES

[1] M. Burgess, H. Haugerud, S. Straumsnes, T. Reitan, "Measuring system normality", ACM Trans. Computer Systems, vol. 20, no. 2, pp. 125-160, May 2002.
[2] R. Lund, H. Hurd, P. Bloomfield, R. Smith, "Climatological time series with periodic correlation", J. Climate, vol. 8, no. 11, pp. 2787-2809, Dec. 1995.
[3] A. Andrzejak, M. Ceyran, "Characterizing and predicting resource demand by periodicity mining", J. Network and Systems Management, vol. 13, no. 2, Jun. 2005.
[4] M. Burgess, "Two dimensional time-series for anomaly detection and regulation in adaptive systems", in Proc. 13th IFIP/IEEE Int. Workshop Distributed Systems: Operations and Management (DSOM) (Lecture Notes in Computer Science vol. 2506), Montreal, Oct. 2002, pp. 169-180. Springer Verlag.
[5] M. Burgess, "Probabilistic anomaly detection in distributed computer networks", Science of Computer Programming, in press.
[6] L. L. Ho, D. J. Cavuto, S. Papavassiliou, A. G. Zawadzki, "Adaptive and automated detection of service anomalies in transaction-oriented WAN's: network analysis, algorithms, implementation, and deployment", IEEE J. Selected Areas in Communications, vol. 18, no. 5, pp. 744-757, May 2000.
[7] N. Ye, S. Vilbert, Q. Chen, "Computer intrusion detection through EWMA for autocorrelated and uncorrelated data", IEEE Trans. Reliability, vol. 52, no. 1, pp. 75-82, Mar. 2003.
[8] A. Seleznyov, "A methodology to detect temporal regularities in user behavior for anomaly detection", in Dupuy, M., Paradinas, P. (eds) Trusted information: The new decade challenge: Proc. 16th Int. Conf. Information Security (IFIP/Sec 01), Paris, Jun. 2001, pp. 339-352. Kluwer Academic Publishers.
[9] W. H. Allen, G. A. Marin, "On the self-similarity of synthetic traffic for the evaluation of intrusion detection systems", in Proc. 2003 Symp. Applications and the Internet (SAINT 2003), Orlando, Jan. 2003, pp. 242-248. IEEE Comput. Soc.
[10] Y. Li, N. Wu, S. Wang, S. Jajodia, "Enhancing profiles for anomaly detection using time granularities", J. Computer Security, vol. 10, no. 1-2, pp. 137-157, 2002.
[11] L. L. Ho, D. J. Cavuto, S. Papavassiliou, A. G. Zawadzki, "Adaptive Anomaly Detection in Transaction-Oriented Networks", J. Network and Systems Management, vol. 9, no. 2, pp. 139-160, Jun. 2001.
[12] J. D. Brutlag, "Aberrant Behavior Detection in Time Series for Network Monitoring", in Proc. 14th Systems Administration Conf. (LISA), New Orleans, Dec. 2000, pp. 139-146. USENIX Assoc.
[13] W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, Numerical Recipes in C, 2nd Edition. Cambridge: Press Syndicate of The University of Cambridge, 1992, pp. 496-500, p. 636.
[14] S. Ma, J. L. Hellerstein, "Mining partially periodic event patterns with unknown periods", in Proc. 17th Int. Conf. on Data Engineering (ICDE), Heidelberg, April 2001, pp. 205-214. IEEE Comput. Soc.
[15] C. Berberidis, I. Vlahavas, W. G. Aref, M. Atallah, A. K. Elmagarmid, "On the discovery of weak periodicities in large time series", in Proc. 6th European Conf. Principles of Data Mining and Knowledge Discovery (PKDD) (Lecture Notes in Computer Science vol. 2431), Helsinki, August 2002, pp. 51-61. Springer Verlag.
[16] P. P. Kanjilal, J. Bhattacharya, G. Saha, "Robust method for periodicity detection and characterization of irregular cyclical series in terms of embedded periodic components", Phys. Rev. E, vol. 59, no. 4, pp. 4013-4025, Apr. 1999.
[17] J. W. Eaton, GNU Octave, 1998. Available: http://www.octave.org/