# Java Modules for Time Series Analysis

## Transcription


### Agenda

1. Clustering
2. Non-normal distributions
3. Multifactor modeling
4. Implied ratings
5. Time series prediction

### 1. Clustering

[Figure: clustering splits the time series into Cluster 1, Cluster 2 and Cluster 3, each with its own synthetic series.]

### Clustering

- **Goal:** group time series so that series with similar historical behavior fall into the same group.
- **Input:**
  - A set of single time series (bond, share or fund prices) or of time series groups (for example, interest rate market curves)
  - The number of clusters
- **Output:**
  - Clusters of time series
  - Clustering quality statistics
  - Every cluster is represented by a prototype series (synthetic curve) with the same dimensionality as all the other series
- **Use:**
  - Clustering can reduce a huge number of series and thus make time-consuming operations, such as the calculation of huge correlation matrices, feasible. The number of time series is reduced by:
    - Identifying the cluster to which a series belongs
    - Using the prototype of the cluster instead of the real series
  - Determining similar behavior of market factors or issuers (cartels)

### Clustering

Clustering can be performed for:

- Time series (for example, share or bond prices with a historical development)
- Curve time series (for example, interest rate market curves with a historical development)

In addition to the clusters with their series and prototypes, clustering quality statistics are generated: inter- and intra-cluster statistics, adjusted R squared, average linkage, etc. Some of these statistics can be used to determine the optimal number of clusters, i.e. the number of groups with minimal internal distance and maximal distance to each other.
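The grouping and the prototype series described above can be sketched as follows. The slides do not name the clustering algorithm, so plain k-means is used here as an assumption; the class name `SeriesClusteringSketch`, the seeding with the first k series and the sample values are hypothetical. The prototype is taken as the point-wise mean of the member series.

```java
import java.util.Arrays;

/** Minimal k-means sketch (assumed algorithm): groups equal-length series, one prototype per cluster. */
final class SeriesClusteringSketch {

    /** Squared Euclidean distance between two series of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the prototype closest to the given series (ties go to the lower index). */
    static int nearest(double[] series, double[][] prototypes) {
        int best = 0;
        for (int c = 1; c < prototypes.length; c++) {
            if (squaredDistance(series, prototypes[c]) < squaredDistance(series, prototypes[best])) {
                best = c;
            }
        }
        return best;
    }

    /** Point-wise mean of the series assigned to cluster c; returns a copy of fallback if the cluster is empty. */
    static double[] prototype(double[][] series, int[] assignment, int c, double[] fallback) {
        double[] mean = new double[fallback.length];
        int count = 0;
        for (int s = 0; s < series.length; s++) {
            if (assignment[s] != c) continue;
            for (int i = 0; i < mean.length; i++) mean[i] += series[s][i];
            count++;
        }
        if (count == 0) return fallback.clone();
        for (int i = 0; i < mean.length; i++) mean[i] /= count;
        return mean;
    }

    /** Runs k-means seeded with the first k series; returns the cluster index of each series. */
    static int[] cluster(double[][] series, int k, int maxIterations) {
        double[][] prototypes = new double[k][];
        for (int c = 0; c < k; c++) prototypes[c] = series[c].clone();
        int[] assignment = new int[series.length];
        for (int it = 0; it < maxIterations; it++) {
            int[] next = new int[series.length];
            for (int s = 0; s < series.length; s++) next[s] = nearest(series[s], prototypes);
            if (it > 0 && Arrays.equals(next, assignment)) break; // converged
            assignment = next;
            for (int c = 0; c < k; c++) prototypes[c] = prototype(series, assignment, c, prototypes[c]);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical three-point curves, in percent.
        double[][] series = {
            {0.14, 0.23, 0.30}, {1.29, 1.25, 1.79}, {0.10, 0.12, 0.19}, {1.76, 1.92, 2.78}
        };
        System.out.println(Arrays.toString(cluster(series, 2, 50)));
    }
}
```

A production module would need a better seeding strategy and the quality statistics listed above; the sketch only shows the assign-and-average loop.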

### Finding optimal number of clusters using clustering error

[Figure and table: error and adjusted R squared plotted against the number of clusters; the numeric values did not survive extraction.]

Optimal number of clusters = 18

### Example: clustering of 20 spread curves

Actual spread curves. Each rate column is a maturity point of type MM or CM; the maturities in years were lost in extraction.

| Num | Curve | MM | CM | CM | CM | CM | CM |
|---|---|---|---|---|---|---|---|
| 1 | FORTUM-AEUR-MM | 0,1439% | 0,2293% | 0,3047% | 0,3686% | 0,4647% | 0,5828% |
| 2 | SRBIA-AEUR-CR | 1,2863% | 1,2500% | 1,7880% | 2,1133% | 2,6375% | 3,0123% |
| 3 | UKRAIN-AUSD-CR | 1,7606% | 1,9218% | 2,7834% | 3,2418% | 3,8589% | 4,5407% |
| 4 | ITALY-AEUR-CR | 0,1022% | 0,1167% | 0,1896% | 0,2727% | 0,3965% | 0,4980% |
| 5 | SLOVEN-AEUR-CR | 0,0376% | 0,1044% | 0,1351% | 0,1584% | 0,2338% | 0,3622% |
| 6 | CZECH-AEUR-CR | 0,1295% | 0,1471% | 0,2220% | 0,2583% | 0,3627% | 0,4925% |
| 7 | TURKEY-AUSD-CR | 0,8137% | 0,9853% | 1,6326% | 2,1247% | 2,7918% | 3,4798% |
| 8 | ROMANI-AUSD-CR | 0,5478% | 0,5594% | 1,1125% | 1,3618% | 1,7735% | 2,0718% |
| 9 | POLAND-AUSD-CR | 0,0883% | 0,1608% | 0,2432% | 0,3576% | 0,5238% | 0,6876% |
| 10 | PEUGOT-AEUR-MM | 0,6865% | 0,8433% | 1,0406% | 1,2925% | 1,6265% | 1,7360% |
| 11 | JPM-CUSD-MR | 1,0987% | 1,0272% | 1,1421% | 1,2384% | 1,3570% | 1,3909% |
| 12 | DANBNK-AEUR-MM | 0,1830% | 0,2635% | 0,3686% | 0,4481% | 0,5610% | 0,6098% |
| 13 | GS-AUSD-MR | 1,1586% | 1,1238% | 1,1868% | 1,2137% | 1,2638% | 1,2969% |
| 14 | CROATI-AEUR-CR | 0,1554% | 0,2741% | 0,4423% | 0,5602% | 0,8274% | 1,0806% |
| 15 | ESPSAN-CEUR-MM | 0,7725% | 0,9577% | 1,2941% | 1,5505% | 1,9120% | 1,9371% |
| 16 | SUEDZU-AEUR-MM | 0,6148% | 0,7388% | 0,9829% | 1,2075% | 1,5032% | 1,6203% |
| 17 | LEH-AUSD-MR | 6,7011% | 6,8482% | 5,4481% | 4,5185% | 3,2897% | 2,6982% |
| 18 | HELLAS-CEUR-MM | 4,6793% | 4,9875% | 8,3839% | 9,9521% | 10,6632% | 10,6198% |
| 19 | REPHUN-AUSD-CR | 0,2793% | 0,3050% | 0,5168% | 0,7809% | 1,1361% | 1,3992% |
| 20 | BGARIA-AUSD-CR | 0,3123% | 0,4825% | 0,8722% | 1,1222% | 1,5268% | 1,8284% |

### All 20 spread curves

[Figure: actual spread curves of all 20 issuers (FORTUM-AEUR-MM through BGARIA-AUSD-CR) on a rate axis from 0,00% to 12,00%.]

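### Clusters of spread curves

Spread curve clusters, actual rates. The "Cluster Spread" row is the prototype of each cluster; the maturities in years were lost in extraction.

**Cluster 1** (standard deviation 0,1725%)

| Num | Curve | MM | CM | CM | CM | CM | CM |
|---|---|---|---|---|---|---|---|
| 1 | FORTUM-AEUR-MM | 0,1439% | 0,2293% | 0,3047% | 0,3686% | 0,4647% | 0,5828% |
| 4 | ITALY-AEUR-CR | 0,1022% | 0,1167% | 0,1896% | 0,2727% | 0,3965% | 0,4980% |
| 5 | SLOVEN-AEUR-CR | 0,0376% | 0,1044% | 0,1351% | 0,1584% | 0,2338% | 0,3622% |
| 6 | CZECH-AEUR-CR | 0,1295% | 0,1471% | 0,2220% | 0,2583% | 0,3627% | 0,4925% |
| 9 | POLAND-AUSD-CR | 0,0883% | 0,1608% | 0,2432% | 0,3576% | 0,5238% | 0,6876% |
| 12 | DANBNK-AEUR-MM | 0,1830% | 0,2635% | 0,3686% | 0,4481% | 0,5610% | 0,6098% |
| 14 | CROATI-AEUR-CR | 0,1554% | 0,2741% | 0,4423% | 0,5602% | 0,8274% | 1,0806% |
| | Cluster Spread | 0,1200% | 0,1851% | 0,2722% | 0,3463% | 0,4814% | 0,6162% |

**Cluster 2** (standard deviation 0,3019%)

| Num | Curve | MM | CM | CM | CM | CM | CM |
|---|---|---|---|---|---|---|---|
| 8 | ROMANI-AUSD-CR | 0,5478% | 0,5594% | 1,1125% | 1,3618% | 1,7735% | 2,0718% |
| 13 | GS-AUSD-MR | 1,1586% | 1,1238% | 1,1868% | 1,2137% | 1,2638% | 1,2969% |
| 10 | PEUGOT-AEUR-MM | 0,6865% | 0,8433% | 1,0406% | 1,2925% | 1,6265% | 1,7360% |
| 11 | JPM-CUSD-MR | 1,0987% | 1,0272% | 1,1421% | 1,2384% | 1,3570% | 1,3909% |
| 15 | ESPSAN-CEUR-MM | 0,7725% | 0,9577% | 1,2941% | 1,5505% | 1,9120% | 1,9371% |
| 16 | SUEDZU-AEUR-MM | 0,6148% | 0,7388% | 0,9829% | 1,2075% | 1,5032% | 1,6203% |
| 19 | REPHUN-AUSD-CR | 0,2793% | 0,3050% | 0,5168% | 0,7809% | 1,1361% | 1,3992% |
| 20 | BGARIA-AUSD-CR | 0,3123% | 0,4825% | 0,8722% | 1,1222% | 1,5268% | 1,8284% |
| | Cluster Spread | 0,6838% | 0,7547% | 1,0185% | 1,2209% | 1,5123% | 1,6601% |

**Cluster 3** (standard deviation 0,8026%)

| Num | Curve | MM | CM | CM | CM | CM | CM |
|---|---|---|---|---|---|---|---|
| 2 | SRBIA-AEUR-CR | 1,2863% | 1,2500% | 1,7880% | 2,1133% | 2,6375% | 3,0123% |
| 3 | UKRAIN-AUSD-CR | 1,7606% | 1,9218% | 2,7834% | 3,2418% | 3,8589% | 4,5407% |
| 7 | TURKEY-AUSD-CR | 0,8137% | 0,9853% | 1,6326% | 2,1247% | 2,7918% | 3,4798% |
| 17 | LEH-AUSD-MR | 6,7011% | 6,8482% | 5,4481% | 4,5185% | 3,2897% | 2,6982% |
| | Cluster Spread | 2,6402% | 2,7512% | 2,9129% | 2,9995% | 3,1445% | 3,4328% |

**Cluster 4** (standard deviation 0,0000%)

| Num | Curve | MM | CM | CM | CM | CM | CM |
|---|---|---|---|---|---|---|---|
| 18 | HELLAS-CEUR-MM | 4,6793% | 4,9875% | 8,3839% | 9,9521% | 10,6632% | 10,6198% |
| | Cluster Spread | 4,6793% | 4,9875% | 8,3839% | 9,9521% | 10,6632% | 10,6198% |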

### Clusters of market curves

[Figure, two panels. First cluster: FORTUM-AEUR-MM, ITALY-AEUR-CR, SLOVEN-AEUR-CR, CZECH-AEUR-CR, POLAND-AUSD-CR, DANBNK-AEUR-MM and CROATI-AEUR-CR with the cluster spread (synthetic curve). Second cluster: ROMANI-AUSD-CR, GS-AUSD-MR, PEUGOT-AEUR-MM, JPM-CUSD-MR, ESPSAN-CEUR-MM, SUEDZU-AEUR-MM, REPHUN-AUSD-CR and BGARIA-AUSD-CR with the cluster spread (synthetic curve).]

### Historical development

[Figure, four panels: Cluster 2 at 6 Months, 1 Year, 5 Years and 10 Years. Each panel shows the historical rates of the eight member curves (ROMANI-AUSD-CR, GS-AUSD-MR, PEUGOT-AEUR-MM, JPM-CUSD-MR, ESPSAN-CEUR-MM, SUEDZU-AEUR-MM, REPHUN-AUSD-CR, BGARIA-AUSD-CR) together with the cluster synthetic curve.]

### 2. Non-Normal Distributions

[Figure: an empirical distribution is mapped to a theoretical distribution type and its parameters; the Cauchy and Normal distributions are shown as examples.]

### Non-normal distributions

- **Goal:** automatically identify the distribution type and its parameters from market time series, and use the copula approach to simulate market factors in Monte Carlo VaR with the mapped distributions.
- **Input:**
  - The time series of the market factors
  - The chosen standard distribution types (Beta, Cauchy, Student, Weibull, etc.)
- **Output:**
  - The identified distribution type
  - The parameters of the identified distribution type
  - A numerical estimate of the distance between the empirical distribution and every other distribution type, which allows the types to be ordered and another well-fitting type to be chosen
- **Use:** improving the Monte Carlo VaR simulation by using correlated non-normal samples instead of correlated normal samples.

### Non-normal distributions

**Calculation of Value at Risk**

[Figure: distribution of the market value with the expected value, the confidence level α, the α-quantile Q and the market VaR(α).]

The distribution of the time series of market factors is assumed to be normal in most cases. This does not correspond to reality: the time series often exhibit skewed and fat-tailed distributions, so the normal assumption underestimates the market risk of improbable large losses (fat tail losses).
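As a minimal sketch of the quantities named on this slide, the block below computes an empirical quantile and a VaR figure from a sample of profit-and-loss values without assuming normality. Reading VaR(α) as the distance between the expected value and the quantile is an assumption taken from the figure labels; the class name `HistoricalVarSketch` and the interpolation rule are hypothetical choices.

```java
import java.util.Arrays;

/** Sketch: empirical quantile and VaR from a P&L sample, with no normality assumption. */
final class HistoricalVarSketch {

    /** Empirical p-quantile (0 <= p <= 1) with linear interpolation between order statistics. */
    static double quantile(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** VaR at the given confidence level: expected value minus the (1 - confidence) quantile (assumed reading). */
    static double valueAtRisk(double[] pnl, double confidence) {
        double expected = Arrays.stream(pnl).average().orElse(0.0);
        return expected - quantile(pnl, 1.0 - confidence);
    }

    public static void main(String[] args) {
        // Hypothetical P&L sample with one large loss in the tail.
        double[] pnl = {-12.0, -3.0, -2.0, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0};
        System.out.println(valueAtRisk(pnl, 0.95));
    }
}
```

Because the quantile is read from the sample itself, a heavy left tail raises the VaR directly, which is the effect the normal assumption misses.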

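### Non-normal distributions

**Mapping risk factors to the best-fit distribution**

[Figure: empirical distribution of a risk factor with fitted Cauchy (green), Normal and Beta distributions.]

- The best fit is given by the Cauchy distribution (green).
- The Beta distribution will produce a larger risk at the confidence level because of its fat tail.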

### Distribution parameters estimation

- The main goal is to model the shape of the empirical distribution as closely as possible with a reproducible theoretical distribution shape.
- Together with the identification of the distribution type, the distribution parameters are estimated from market data using the method of moments, least squares regression or maximum likelihood.
- The additional parameters shift and scale are used to keep the distribution parameter values out of undefined regions.
- Data with a given distribution can be generated from:
  - Distribution type
  - Distribution parameters
  - Additional parameters (shift, scale)
  - Values count
- Cumulative distributions are used for the subsequent copula Monte Carlo simulation.
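To illustrate the method of moments together with the shift and scale parameters, here is a sketch for the Beta distribution. The moment formulas for the two shape parameters are the standard ones; how the original modules apply shift and scale is not stated, so mapping the data to the unit interval with x' = (x - shift) / scale is an assumption, as is the class name `BetaMomentsSketch`.

```java
/** Sketch: method-of-moments estimate of the Beta shape parameters after a shift/scale transform. */
final class BetaMomentsSketch {

    /** Shape parameters {alpha, beta} from a mean and variance on (0, 1); requires variance < mean * (1 - mean). */
    static double[] fromMoments(double mean, double variance) {
        double common = mean * (1.0 - mean) / variance - 1.0;
        if (common <= 0.0) {
            throw new IllegalArgumentException("variance too large for a Beta distribution");
        }
        return new double[] {mean * common, (1.0 - mean) * common};
    }

    /** Maps the data to (0, 1) with x' = (x - shift) / scale (assumed convention), then matches moments. */
    static double[] fit(double[] data, double shift, double scale) {
        double mean = 0.0;
        for (double x : data) mean += (x - shift) / scale;
        mean /= data.length;
        double variance = 0.0;
        for (double x : data) {
            double d = (x - shift) / scale - mean;
            variance += d * d;
        }
        variance /= data.length; // population variance
        return fromMoments(mean, variance);
    }

    public static void main(String[] args) {
        // Hypothetical values between 8 and 16, so shift = 8 and scale = 8 map them into (0, 1).
        double[] shape = fit(new double[] {10.0, 12.0, 14.0}, 8.0, 8.0);
        System.out.println(shape[0] + " " + shape[1]);
    }
}
```

Without the shift and scale, data outside (0, 1) would make the Beta parameters undefined, which matches the role the slide gives to the additional parameters.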

### Standard distribution parameters estimation

10 distribution types:

| Distribution | Distribution parameter 1 | Distribution parameter 2 | Additional parameter 1 | Additional parameter 2 |
|---|---|---|---|---|
| Beta | Shape | Shape | Shift | Scale |
| Cauchy | Location | Scale | | |
| Exponential | Rate | --- | Shift | --- |
| Inverse Normal | Mu | Lambda | Shift | --- |
| Log Normal | Log Scale | Shape | Shift | --- |
| Normal | Mean | Variance | Shift | --- |
| Pareto | Scale | Shape | | |
| Rayleigh | Sigma | --- | Shift | --- |
| Student | Nu | --- | Shift | Scale |
| Weibull | Scale | Shape | Shift | --- |

### Distribution mapping

Two metrics are used to compare distributions:

- **Histogram metric:** the bin frequencies of the empirical histogram are compared against the bin probabilities of the theoretical histogram.
- **Cumulative distances metric:** ignoring the X-axis values, the cumulative distances between the data points of the market series are calculated. The same function is calculated for theoretically generated values of the distribution under consideration, and the two cumulative values are compared.

Both the histogram and the cumulative distances are compared using the average squared error.

### Histogram metric

[Figure: theoretical histogram and empirical histogram between min and max, with the distances between corresponding bins.]

The best (mapped) distribution is identified by the minimum sum of squared distances between the theoretical histogram of the distribution and the empirical histogram.
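The histogram metric can be sketched as below: empirical bin frequencies are compared with the probability mass that a candidate cumulative distribution function assigns to the same bins, and the candidate with the smallest error wins. The equal-width binning between the sample minimum and maximum, the class name `HistogramMetricSketch` and the exponential candidates in `main` are assumptions for illustration.

```java
import java.util.function.DoubleUnaryOperator;

/** Sketch: average squared error between empirical bin frequencies and theoretical bin probabilities. */
final class HistogramMetricSketch {

    /** Relative frequency of the data in each of `bins` equal-width bins over [min, max]. */
    static double[] empiricalFrequencies(double[] data, double min, double max, int bins) {
        double[] freq = new double[bins];
        double width = (max - min) / bins;
        for (double x : data) {
            int b = Math.min((int) ((x - min) / width), bins - 1); // the maximum falls into the last bin
            freq[b] += 1.0 / data.length;
        }
        return freq;
    }

    /** Probability mass that a candidate CDF assigns to each bin. */
    static double[] theoreticalProbabilities(DoubleUnaryOperator cdf, double min, double max, int bins) {
        double[] prob = new double[bins];
        double width = (max - min) / bins;
        for (int b = 0; b < bins; b++) {
            prob[b] = cdf.applyAsDouble(min + (b + 1) * width) - cdf.applyAsDouble(min + b * width);
        }
        return prob;
    }

    /** Histogram distance of a candidate distribution to the data (smaller is a better fit). */
    static double distance(double[] data, DoubleUnaryOperator cdf, int bins) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double x : data) {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        double[] freq = empiricalFrequencies(data, min, max, bins);
        double[] prob = theoreticalProbabilities(cdf, min, max, bins);
        double sum = 0.0;
        for (int b = 0; b < bins; b++) sum += (freq[b] - prob[b]) * (freq[b] - prob[b]);
        return sum / bins;
    }

    public static void main(String[] args) {
        // Hypothetical data: evenly spaced quantiles of an exponential distribution with rate 1.
        double[] data = new double[200];
        for (int i = 0; i < data.length; i++) data[i] = -Math.log(1.0 - (i + 0.5) / data.length);
        System.out.println("rate 1: " + distance(data, x -> 1.0 - Math.exp(-x), 10));
        System.out.println("rate 5: " + distance(data, x -> 1.0 - Math.exp(-5.0 * x), 10));
    }
}
```

Running the same `distance` call over every candidate type and sorting by the result gives the ordering of distribution types mentioned in the output list earlier.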

### Cumulative distances metric

[Figure: data values $y$ over index $i$ with the distances $d_1, d_2, d_3, \dots$ between values, and the graph of the cumulative distances $p_i$.]

$$p_1 = d_1, \qquad p_2 = d_1 + d_2, \qquad p_3 = d_1 + d_2 + d_3$$

The best (mapped) distribution is identified by the minimum sum of squared distances between the empirical cumulative values and the corresponding theoretical cumulative values.

### Copula Monte Carlo VaR

Example for 2 market factors (lognormal and beta distributed):

1. Normally distributed correlated random samples are generated using the market risk correlation matrix.
2. The cumulative distribution maps them to equally (uniformly) distributed, correlated random samples in (0...1).
3. The inverse cumulative distribution of each target distribution (lognormal, beta), $x = F^{-1}(y)$, maps the uniform samples to correlated non-normal samples.
4. The correlated non-normally distributed samples are put into the Monte Carlo simulation instead of the correlated normally distributed samples, which yields a skewed VaR distribution.
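The sampling chain above can be sketched for two factors as follows. This is a Gaussian copula under stated assumptions: the slide example uses lognormal and beta marginals, while the sketch swaps in Weibull and Cauchy because their inverse cumulative distributions have closed forms; the normal CDF uses the Abramowitz-Stegun erf approximation because the Java standard library has none. The class name `GaussianCopulaSketch` and all parameter values are hypothetical.

```java
import java.util.Random;

/** Sketch: two correlated non-normal factors via a Gaussian copula (Weibull and Cauchy marginals assumed). */
final class GaussianCopulaSketch {

    /** Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation (absolute error about 1.5e-7). */
    static double normalCdf(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }

    /** Keeps a probability strictly inside (0, 1) so the inverse CDFs stay finite. */
    static double clamp(double u) {
        return Math.min(Math.max(u, 1e-12), 1.0 - 1e-12);
    }

    /** Correlated uniform pairs: correlate two normals (2x2 Cholesky factor), then apply the normal CDF. */
    static double[][] correlatedUniforms(int n, double rho, long seed) {
        Random rng = new Random(seed);
        double[][] u = new double[n][2];
        for (int i = 0; i < n; i++) {
            double z1 = rng.nextGaussian();
            double z2 = rho * z1 + Math.sqrt(1.0 - rho * rho) * rng.nextGaussian();
            u[i][0] = clamp(normalCdf(z1));
            u[i][1] = clamp(normalCdf(z2));
        }
        return u;
    }

    /** Inverse CDF of the Weibull distribution. */
    static double weibullInverseCdf(double u, double scale, double shape) {
        return scale * Math.pow(-Math.log(1.0 - u), 1.0 / shape);
    }

    /** Inverse CDF of the Cauchy distribution. */
    static double cauchyInverseCdf(double u, double location, double scale) {
        return location + scale * Math.tan(Math.PI * (u - 0.5));
    }

    public static void main(String[] args) {
        double[][] u = correlatedUniforms(5, 0.8, 42L);
        for (double[] pair : u) {
            double factor1 = weibullInverseCdf(pair[0], 1.0, 1.5);
            double factor2 = cauchyInverseCdf(pair[1], 0.0, 0.5);
            System.out.println(factor1 + " " + factor2);
        }
    }
}
```

For more than two factors the line computing `z2` would be replaced by a full Cholesky factor of the correlation matrix; the mapping through the marginal inverse CDFs stays the same.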

### Copula Monte Carlo VaR

Skewed and fat tail VaR distributions

[Figure, two panels: skewed VaR distribution and fat tail VaR distribution.]

### Prototype system

[Screenshot: theoretical histograms, empirical histogram, cumulative values, parameter estimations, distributions generator, and the distances between the theoretical and empirical distributions. The example shows the best fit for the Weibull distribution.]

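### 3. Multifactor models

[Figure: the explanatory factor time series are combined by a formula into the target factor.]

Target factor = sum of the rows below, each a coefficient times a function of an explanatory factor. Most coefficient digits were lost in extraction and are left truncated.

| Coefficient | Explanatory factor | Function |
|---|---|---|
| -0, | Instruments_Fund-FR | |
| , | Instruments_Fund-LU | sqrt |
| -11, | StockIndexCurve_DJIA | ln |
| 0, | StockIndexCurve_GEX | |
| 0, | StockIndexCurve_Nasdaq-Composite | ^2.0 |
| 0, | StockIndexCurve_Nikkei225 | sqrt |
| -0, | StockIndexCurve_SDAXPI | sqrt |
| -10, | StockIndexCurve_TECDAXPI | ln |
| -6739,26524 | | |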

### Multifactor models

- **Goal:** build formulas, based on time series, that describe unknown market instruments in terms of instruments with known pricing models.
- **Input:**
  - The historical time series of the target factor (the instrument with an unknown pricing approach or unknown market factor dependency)
  - Other available historical time series to be used as explanatory factors (indices, spread curves, interest rates, interbank rates, foreign exchange rates, etc.)
- **Output:** a polynomial-like formula describing the dependency of the target factor on the explanatory factors.
- **Use:**
  - The generated formula can be used to develop a new type of instrument whose pricing approach is based on a set of known factors
  - Obtaining the contribution of each factor to the price development and risk of the instrument

### Multifactor models object

[Figure: from the available market factors, the target instrument time series and the explanatory factor time series feed formula building and formula calibration over time; the output is the target instrument by formula. Other factors are not used.]

- The target instrument is calculated by formula.
- The formula is built and periodically calibrated using the target instrument and explanatory instrument time series.

### Stages of modeling

1. Start
2. Target factor selection
3. Explanatory factors suggestion/selection
4. Basis functions combination determination
5. Regression coefficients determination
6. Final formula determination and error calculation
7. End

Legend of the flowchart:

- all given factors in the system
- determined by system and/or human
- determined by system

### Explanatory factors selection

When a target factor is selected, the explanatory factors are selected by automatic suggestion and/or by hand. Automatic suggestion can be done by:

- **Clustering:** the explanatory factors are taken from the cluster into which the target factor is classified. If the number of explanatory factors determined in this way is insufficient, the number of clusters can be decreased in order to increase the number of elements in the cluster.
- **Minimal covariances between candidate factors:** the covariances between all factors are calculated, and the n smallest covariances determine the factors.
- **Maximal covariances between candidate factors and the target factor**

### Formula builder

After the target and explanatory factors are selected, the formula building process starts, in which the system:

- Finds a combination of basis functions applied to the explanatory factors. The basis functions are used to:
  - improve the accuracy
  - avoid linear dependencies between factors, which cause problems in the matrix equations
- Determines the regression coefficients $\beta_i$ (beta factors)

$$y = \beta_1 f_1(x_1) + \beta_2 f_2(x_2) + \dots + \beta_n f_m(x_n) + \beta_{n+1} + \varepsilon$$

- $y$: target factor
- $x_1, x_2, \dots, x_n$: explanatory factors
- $\beta_1, \beta_2, \dots, \beta_n, \beta_{n+1}$: regression coefficients
- $f_1, f_2, \dots, f_m$: basis functions
- $\varepsilon$: error
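For a fixed combination of basis functions, the regression coefficients in the formula above can be estimated by ordinary least squares. The sketch below solves the normal equations with Gaussian elimination; the slides only mention matrix equations, so the solver, the class name `BasisRegressionSketch` and the choice of one basis function per factor are assumptions.

```java
import java.util.function.DoubleUnaryOperator;

/** Sketch: least-squares fit of y = beta_1 f_1(x_1) + ... + beta_n f_n(x_n) + beta_{n+1}. */
final class BasisRegressionSketch {

    /** Solves A x = b by Gaussian elimination with partial pivoting (A and b are overwritten). */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double factor = a[r][col] / a[col][col];
                b[r] -= factor * b[col];
                for (int c = col; c < n; c++) a[r][c] -= factor * a[col][c];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double sum = b[r];
            for (int c = r + 1; c < n; c++) sum -= a[r][c] * x[c];
            x[r] = sum / a[r][r];
        }
        return x;
    }

    /**
     * factors[t][j] is explanatory factor j at date t; basis[j] is the function applied to factor j.
     * Returns {beta_1, ..., beta_n, beta_{n+1}}, where the last entry is the constant term.
     */
    static double[] fit(double[][] factors, DoubleUnaryOperator[] basis, double[] y) {
        int n = basis.length + 1; // one coefficient per factor plus the constant term
        double[][] xtx = new double[n][n];
        double[] xty = new double[n];
        for (int t = 0; t < y.length; t++) {
            double[] row = new double[n];
            for (int j = 0; j < basis.length; j++) row[j] = basis[j].applyAsDouble(factors[t][j]);
            row[n - 1] = 1.0;
            for (int i = 0; i < n; i++) {
                xty[i] += row[i] * y[t];
                for (int j = 0; j < n; j++) xtx[i][j] += row[i] * row[j];
            }
        }
        return solve(xtx, xty);
    }

    public static void main(String[] args) {
        // Hypothetical data generated from y = 2 sqrt(x1) + 3 ln(x2) + 1.
        double[][] factors = {{1, 1}, {4, 2}, {9, 5}, {16, 3}, {25, 8}, {36, 4}};
        double[] y = new double[factors.length];
        for (int t = 0; t < y.length; t++) y[t] = 2 * Math.sqrt(factors[t][0]) + 3 * Math.log(factors[t][1]) + 1;
        DoubleUnaryOperator[] basis = {Math::sqrt, Math::log};
        System.out.println(java.util.Arrays.toString(fit(factors, basis, y)));
    }
}
```

Searching over combinations of basis functions then amounts to calling `fit` for each candidate combination and keeping the one with the smallest error.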

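### Basis functions combination

[Figure: a table of explanatory factors over the dates Date 1, Date 2, ..., Date t, ..., Date K, with a combination of basis functions applied to them and the resulting estimate compared with the target factor.]

| Basis function | Function |
|---|---|
| $f_1$ | exponent |
| $f_2$ | logarithm |
| $f_3$ | sine |
| $f_4$ | cosine |
| ... | |
| $f_m$ | hyperbolic tangent |

| Explanatory factor | Name |
|---|---|
| $x_1$ | GOV Bel |
| $x_2$ | FX USD |
| $x_3$ | Oil price |
| ... | |
| $x_n$ | Gold price |

- Target factor $y$: GOV Aut
- Estimate $\tilde{y}$: GOV Aut estimation
- Distance: $\varepsilon = (y - \tilde{y})^2$

$$\tilde{y} = \beta_1 f_3(x_1) + \beta_2 f_1(x_2) + \dots + \beta_n f_2(x_n) + \beta_{n+1} + \varepsilon$$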

### Prototype system

[Screenshot: generated formula and graphic results (target and multi-factor).]

### Prototype system settings

[Screenshot: prototype system settings.]

### 4. Implied Rating

[Figure: time series are used for scale building; the implied ratings classification of a series yields, for example, rating BB with tendency BBB.]

### Implied Rating (Basel III)

- **Goal:** build a rating scale based on explicit CDS time series, and use it to identify both the implied rating and the tendency of a new CDS input series.
- **Input:**
  - A set of CDS time series that relate to assets or issuers (CDS spread curves or indices, bond prices, share prices, etc.)
  - The rating system: the number and symbols of the ratings of the rating scale
- **Output:** a scale with boundaries between the ratings.
- **Use:** when the built scale is supplied with a new time series representing an issuer, the system identifies:
  - The current rating, based on the historical development with more importance given to the latest values
  - The tendency, i.e. the next probable rating

### Steps to obtain implied rating

1. Establishment of the rating scale. The available time series are used to build the given number of rating degrees and to determine their boundaries:
   - The time series are distributed into the given rating degrees according to their historical behavior.
   - The center of every degree is determined using all the time series that belong to the degree.
   - The boundaries are derived from adjacent centers as equally distanced series.
2. A new time series is classified into a rating class by comparing it with the centers of the scale classes (which are also time series) and finding the closest one.
3. The tendency is determined by:
   - Finding the second closest center of the rating degrees
   - Finding the closest boundary of the classified rating level
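The classification and tendency steps can be sketched as a nearest-center search. The distance measure is not specified in the slides, so Euclidean distance is an assumption, as are the class name `ImpliedRatingSketch` and the sample centers. The boundary between two adjacent degrees is taken as the point-wise average of their centers, following the slides.

```java
/** Sketch: rating = closest degree center, tendency = second closest center (Euclidean distance assumed). */
final class ImpliedRatingSketch {

    /** Euclidean distance between two series of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    /** Returns {index of the closest center, index of the second closest center}. */
    static int[] classify(double[] series, double[][] centers) {
        int best = -1;
        int second = -1;
        double bestD = Double.POSITIVE_INFINITY;
        double secondD = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double d = distance(series, centers[c]);
            if (d < bestD) {
                second = best;
                secondD = bestD;
                best = c;
                bestD = d;
            } else if (d < secondD) {
                second = c;
                secondD = d;
            }
        }
        return new int[] {best, second};
    }

    /** Boundary between two adjacent degrees: point-wise average of their centers. */
    static double[] boundary(double[] centerA, double[] centerB) {
        double[] result = new double[centerA.length];
        for (int i = 0; i < result.length; i++) result[i] = 0.5 * (centerA[i] + centerB[i]);
        return result;
    }

    public static void main(String[] args) {
        String[] ratings = {"AA", "A", "BBB"};                          // hypothetical rating system
        double[][] centers = {{0.2, 0.2}, {0.6, 0.6}, {1.0, 1.0}};      // hypothetical degree centers, in percent
        int[] result = classify(new double[] {0.30, 0.35}, centers);
        System.out.println("Rating " + ratings[result[0]] + ", tendency " + ratings[result[1]]);
    }
}
```

With the sample centers, the new series lies closest to the first center and next closest to the second, so the sketch reports rating AA with a tendency to A.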

### Rating degrees boundaries

[Figure: time series of the rating degrees, labeled AA, AA - A and A.]

- The time series of the rating degrees may overlap.
- The points of the rating boundaries are calculated as the average values of the corresponding points of the centers of the series in every rating degree.
- The center of the series in a given degree does not lie in the middle of the degree boundaries, because in most cases the time series are nonuniformly distributed.

### Rating degrees boundaries

[Figure: boundary and degree centers for the degrees A and AA, on a rate axis from 0,20% to 1,80%.]

- The boundary lies midway between the series centers.
- The center of the series does not lie midway between the boundaries.

### Weighting the historical values

Weighting of the series values (EWMA with a decay factor) is applied so that more recent values become more important.

[Figure: a series within the AA degree on a rate axis from 0,00% to 1,60%; the latest series values lie within the degree boundaries.]

### Determine the rating of a new series

- In the classification phase, histograms are built for the distributions of the data within the best and second-best degrees (corresponding to the rating and the tendency).
- The histograms are shown with the centers of the classes and the mean of the newly classified series.
- The mean and standard deviation used to build the histograms are calculated with the same decay factor that is used to build the rating scale.
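A mean and standard deviation with a decay factor can be sketched as below. The exact weighting formula of the modules is not given in the slides; the sketch assumes weights proportional to decay^(n-1-i), normalized to sum to one, so that the latest value receives the largest weight. The class name `DecayWeightedStatsSketch` is hypothetical.

```java
/** Sketch: exponentially weighted mean and standard deviation (weights decay^(n-1-i), assumed form). */
final class DecayWeightedStatsSketch {

    /** Normalized weights; index n-1 (the latest value) gets the largest weight when decay < 1. */
    static double[] weights(int n, double decay) {
        double[] w = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            w[i] = Math.pow(decay, n - 1 - i);
            sum += w[i];
        }
        for (int i = 0; i < n; i++) w[i] /= sum;
        return w;
    }

    /** Weighted mean of the series. */
    static double mean(double[] x, double decay) {
        double[] w = weights(x.length, decay);
        double m = 0.0;
        for (int i = 0; i < x.length; i++) m += w[i] * x[i];
        return m;
    }

    /** Weighted standard deviation around the weighted mean. */
    static double standardDeviation(double[] x, double decay) {
        double[] w = weights(x.length, decay);
        double m = mean(x, decay);
        double v = 0.0;
        for (int i = 0; i < x.length; i++) v += w[i] * (x[i] - m) * (x[i] - m);
        return Math.sqrt(v);
    }

    public static void main(String[] args) {
        double[] spreads = {0.40, 0.45, 0.55, 0.70}; // hypothetical spreads in percent, oldest first
        System.out.println(mean(spreads, 0.94) + " " + standardDeviation(spreads, 0.94));
    }
}
```

A decay factor of 1 reduces both statistics to their unweighted versions, which makes the effect of the weighting easy to check.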

### Determine the rating of a new series

[Figure: classification of a new series, Barclays Bank PLC. Upper panel: the new time series (yellow) to be classified, plotted against the A and AA degrees on a rate axis from 0,20% to 1,60%; the date axis runs in two-week steps from 02,08 to 29,07 (years lost in extraction). Lower panel: histograms for the rating AA and the tendency to A over rates from 0,08% to 1,30%, with the mean of the new series; mean and standard deviation are calculated with the decay factor.]

### Prototype system

[Screenshot: time series used to build the scale, the built ratings scale, the rating system, the series within the selected degree, the new input with its rating and tendency, histograms for rating and tendency, and the new input mean.]

### 5. Time Series Prediction

[Figure: prediction extends a time series into a predicted time series covering the predicted future.]

### Time series prediction

- **Goal:** predict a given time series over a given time horizon by analyzing its historical development.
- **Input:**
  - A time series
  - Settings that depend on the approach used (for example, learning iterations, time window size, etc.)
- **Output:**
  - The given time series with additional predicted values
  - Confidence bounds
  - Prediction quality statistics
- **Use:** the predicted values can be used as the most probable future values, for instance in algorithmic trading.

### Time series prediction

The most commonly used prediction methods are:

- Averages (MA, WMA, EWMA, etc.)
- Autoregressive methods (AR, ARMA, ARIMA, SARIMA, ARMAX, SETAR, etc.) with the Box-Jenkins methodology
- Trend extrapolation (based on LSE, trend polynomial finding, etc.)
- Neural networks (MLP, RBF, SOM, ART, recurrent Elman/Jordan networks, etc.); neural networks are used in the current approach
- Other regression-based (e.g. observers) and econometric models
- Kalman, Wiener and other filters
- Wavelet-based methods
- Holt-Winters decomposition
- Hybrid approaches

Further notes:

- The prediction could be used for technical analysis.
- Confidence bounds are used.
- Predictability indicators can be suggested (Hurst exponent, etc.).

### Prediction by neural network

[Figure, two panels. Model identification: a sliding window over the historical values supplies the input vector and the output vector, and the neural network is optimized against a target function. Prediction: the neural network performs recursive prediction over the horizon.]
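The sliding window and the recursive prediction from the figure can be sketched independently of the network itself. In the block below the trained neural network is replaced by a hypothetical one-step model passed in as a `ToDoubleFunction<double[]>`; the class name `SlidingWindowSketch` and the linear extrapolation used in `main` are stand-ins, not the method of the original modules.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

/** Sketch: sliding-window training pairs and recursive multi-step prediction with a pluggable model. */
final class SlidingWindowSketch {

    /** Input vectors: inputs[i] = series[i .. i + window - 1]. */
    static double[][] inputs(double[] series, int window) {
        int n = series.length - window;
        double[][] in = new double[n][];
        for (int i = 0; i < n; i++) in[i] = Arrays.copyOfRange(series, i, i + window);
        return in;
    }

    /** Target values: targets[i] = series[i + window], the value following each window. */
    static double[] targets(double[] series, int window) {
        return Arrays.copyOfRange(series, window, series.length);
    }

    /** Recursive prediction: each one-step forecast is appended and fed back as input for the next step. */
    static double[] predict(double[] history, int window, int horizon, ToDoubleFunction<double[]> model) {
        double[] extended = Arrays.copyOf(history, history.length + horizon);
        for (int h = 0; h < horizon; h++) {
            int end = history.length + h;
            double[] in = Arrays.copyOfRange(extended, end - window, end);
            extended[end] = model.applyAsDouble(in);
        }
        return Arrays.copyOfRange(extended, history.length, extended.length);
    }

    public static void main(String[] args) {
        double[] history = {1.0, 2.0, 3.0, 4.0};
        // Stand-in for a trained network: linear extrapolation from the last two window values.
        ToDoubleFunction<double[]> model = w -> 2.0 * w[w.length - 1] - w[w.length - 2];
        System.out.println(Arrays.toString(predict(history, 2, 3, model)));
    }
}
```

Because every forecast is reused as input, errors accumulate over the horizon, which is one reason the slides pair the prediction with confidence bounds.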

### Prediction by neural network

Modeling process (manual, by a trial-and-error approach):

1. Data pre-processing
2. Modeling of the NN architecture
3. Training
4. Application of the NN model
5. Evaluation

[Figure: historical values pass through preprocessing, the neural network and post-processing; the result is a prediction with confidence bounds over the horizon.]

The prediction generally includes data pre-processing, solving of matrix equations (batch or iterative) and data post-processing.

### Prototype system

[Screenshot: preprocessing, prediction methods, values, time horizon, test, error graphic and confidence bounds.]

### Modules dependencies

[Figure: dependency diagram of the modules built on time series, series processing, calculations and neural networks.]

- 1. Clustering
- 2. Non-normal distributions: distributions and parameters estimation
- 3. Multifactor models: factors selection, formula building
- 4. Implied ratings: rating scale building, histograms
- 5. Prediction: learning and prediction


### Univariate and Multivariate Methods PEARSON. Addison Wesley

Time Series Analysis Univariate and Multivariate Methods SECOND EDITION William W. S. Wei Department of Statistics The Fox School of Business and Management Temple University PEARSON Addison Wesley Boston

### Report on application of Probability in Risk Analysis in Oil and Gas Industry

Report on application of Probability in Risk Analysis in Oil and Gas Industry Abstract Risk Analysis in Oil and Gas Industry Global demand for energy is rising around the world. Meanwhile, managing oil

### Data Preparation Part 1: Exploratory Data Analysis & Data Cleaning, Missing Data

Data Preparation Part 1: Exploratory Data Analysis & Data Cleaning, Missing Data CAS Predictive Modeling Seminar Louise Francis Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com Louise.francis@data-mines.cm

### Frequency distributions, central tendency & variability. Displaying data

Frequency distributions, central tendency & variability Displaying data Software SPSS Excel/Numbers/Google sheets Social Science Statistics website (socscistatistics.com) Creating and SPSS file Open the

### Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies

Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies Drazen Pesjak Supervised by A.A. Tsvetkov 1, D. Posthuma 2 and S.A. Borovkova 3 MSc. Thesis Finance HONOURS TRACK Quantitative

### Statistical Analysis of Life Insurance Policy Termination and Survivorship

Statistical Analysis of Life Insurance Policy Termination and Survivorship Emiliano A. Valdez, PhD, FSA Michigan State University joint work with J. Vadiveloo and U. Dias Session ES82 (Statistics in Actuarial

### Precalculus REVERSE CORRELATION. Content Expectations for. Precalculus. Michigan CONTENT EXPECTATIONS FOR PRECALCULUS CHAPTER/LESSON TITLES

Content Expectations for Precalculus Michigan Precalculus 2011 REVERSE CORRELATION CHAPTER/LESSON TITLES Chapter 0 Preparing for Precalculus 0-1 Sets There are no state-mandated Precalculus 0-2 Operations

### IBM SPSS Neural Networks 22

IBM SPSS Neural Networks 22 Note Before using this information and the product it supports, read the information in Notices on page 21. Product Information This edition applies to version 22, release 0,

### Module 4: Data Exploration

Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

### VISUALIZATION OF DENSITY FUNCTIONS WITH GEOGEBRA

VISUALIZATION OF DENSITY FUNCTIONS WITH GEOGEBRA Csilla Csendes University of Miskolc, Hungary Department of Applied Mathematics ICAM 2010 Probability density functions A random variable X has density

### Cost of Capital and Corporate Refinancing Strategy: Optimization of Costs and Risks *

Cost of Capital and Corporate Refinancing Strategy: Optimization of Costs and Risks * Garritt Conover Abstract This paper investigates the effects of a firm s refinancing policies on its cost of capital.

### Dongfeng Li. Autumn 2010

Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

### A comparison between different volatility models. Daniel Amsköld

A comparison between different volatility models Daniel Amsköld 211 6 14 I II Abstract The main purpose of this master thesis is to evaluate and compare different volatility models. The evaluation is based

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### A model to predict client s phone calls to Iberdrola Call Centre

A model to predict client s phone calls to Iberdrola Call Centre Participants: Cazallas Piqueras, Rosa Gil Franco, Dolores M Gouveia de Miranda, Vinicius Herrera de la Cruz, Jorge Inoñan Valdera, Danny

### The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

### seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER

seven Statistical Analysis with Excel CHAPTER chapter OVERVIEW 7.1 Introduction 7.2 Understanding Data 7.3 Relationships in Data 7.4 Distributions 7.5 Summary 7.6 Exercises 147 148 CHAPTER 7 Statistical

### Big Ideas in Mathematics

Big Ideas in Mathematics which are important to all mathematics learning. (Adapted from the NCTM Curriculum Focal Points, 2006) The Mathematics Big Ideas are organized using the PA Mathematics Standards

### Statistical Functions in Excel

Statistical Functions in Excel There are many statistical functions in Excel. Moreover, there are other functions that are not specified as statistical functions that are helpful in some statistical analyses.

### This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

### Master of Mathematical Finance: Course Descriptions

Master of Mathematical Finance: Course Descriptions CS 522 Data Mining Computer Science This course provides continued exploration of data mining algorithms. More sophisticated algorithms such as support

### Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation

### Dr Christine Brown University of Melbourne

Enhancing Risk Management and Governance in the Region s Banking System to Implement Basel II and to Meet Contemporary Risks and Challenges Arising from the Global Banking System Training Program ~ 8 12

### SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation

SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu September 19, 2015 Outline

### San Jose State University Engineering 10 1

KY San Jose State University Engineering 10 1 Select Insert from the main menu Plotting in Excel Select All Chart Types San Jose State University Engineering 10 2 Definition: A chart that consists of multiple

### Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

### Algebra 1 Course Information

Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through

### 2. Filling Data Gaps, Data validation & Descriptive Statistics

2. Filling Data Gaps, Data validation & Descriptive Statistics Dr. Prasad Modak Background Data collected from field may suffer from these problems Data may contain gaps ( = no readings during this period)

### Thinkwell s Homeschool Algebra 2 Course Lesson Plan: 34 weeks

Thinkwell s Homeschool Algebra 2 Course Lesson Plan: 34 weeks Welcome to Thinkwell s Homeschool Algebra 2! We re thrilled that you ve decided to make us part of your homeschool curriculum. This lesson

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

### PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUMBER OF REFERENCE SYMBOLS

PHASE ESTIMATION ALGORITHM FOR FREQUENCY HOPPED BINARY PSK AND DPSK WAVEFORMS WITH SMALL NUM OF REFERENCE SYMBOLS Benjamin R. Wiederholt The MITRE Corporation Bedford, MA and Mario A. Blanco The MITRE

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA

REVSTAT Statistical Journal Volume 4, Number 2, June 2006, 131 142 A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA Authors: Daiane Aparecida Zuanetti Departamento de Estatística, Universidade Federal de São

### Executive Program in Managing Business Decisions: A Quantitative Approach ( EPMBD) Batch 03

Executive Program in Managing Business Decisions: A Quantitative Approach ( EPMBD) Batch 03 Calcutta Ver 1.0 Contents Broad Contours Who Should Attend Unique Features of Program Program Modules Detailed

### COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

### Overview of Math Standards

Algebra 2 Welcome to math curriculum design maps for Manhattan- Ogden USD 383, striving to produce learners who are: Effective Communicators who clearly express ideas and effectively communicate with diverse

### Time Series Analysis

Time Series Analysis Identifying possible ARIMA models Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos

### Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

### Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes Simcha Pollack, Ph.D. St. John s University Tobin College of Business Queens, NY, 11439 pollacks@stjohns.edu

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Expression. Variable Equation Polynomial Monomial Add. Area. Volume Surface Space Length Width. Probability. Chance Random Likely Possibility Odds

Isosceles Triangle Congruent Leg Side Expression Equation Polynomial Monomial Radical Square Root Check Times Itself Function Relation One Domain Range Area Volume Surface Space Length Width Quantitative

### Data Analysis: Describing Data - Descriptive Statistics

WHAT IT IS Return to Table of ontents Descriptive statistics include the numbers, tables, charts, and graphs used to describe, organize, summarize, and present raw data. Descriptive statistics are most

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard

Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express

### Data Mining Part 2. Data Understanding and Preparation 2.1 Data Understanding Spring 2010

Data Mining Part 2. and Preparation 2.1 Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Outline Introduction Measuring the Central Tendency Measuring the Dispersion of Data Graphic Displays References

### Lecture 20: Clustering

Lecture 20: Clustering Wrap-up of neural nets (from last lecture Introduction to unsupervised learning K-means clustering COMP-424, Lecture 20 - April 3, 2013 1 Unsupervised learning In supervised learning,

### Introduction to Support Vector Machines. Colin Campbell, Bristol University

Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.