Efficient Curve Fitting Techniques




Life Conference and Exhibition 2011, 15 November 2011
Stuart Carroll, Christopher Hursey
Efficient Curve Fitting Techniques
The Actuarial Profession, www.actuaries.org.uk

Agenda
- Background
- Outline of the problem and issues to consider
- The solution
- Theoretical justification
- Further considerations
- Practical issues
- Outcome
- Questions or comments

Background
- We wished to use an internal model running in excess of one million scenarios, covering many different products and many different risk factors
- It was not feasible to evaluate liabilities accurately in every scenario
- Therefore the decision was taken to formula-fit liability values: polynomial approximation formulae, fitted to a limited number of evaluation points
- We need to capture all material risk dependencies:
  - Each risk factor leads to a marginal risk function in one variable
  - Non-linearity between risk factors leads to non-linearity functions in two or more variables
  - Marginal risk functions and non-linearity functions are summed to give the final approximation formula

Outline of the problem and issues to consider: Requirements
- A repeatable process that is objective, with measurable performance: a risk management exercise, not just compliance
- We wish to model the full distribution
- Accuracy implies more sample points; reasonable run times imply fewer sample points
- We need to resolve the tension between the previous two points
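The formula structure just described (marginal risk functions plus non-linearity cross terms, summed to a final approximation) can be sketched in Python. All function forms, coefficients and shock values below are invented for illustration and are not the presenters' actual formulae:

```python
# Sketch of the approximation formula structure described above.
# Every coefficient and shock value here is made up for illustration.

def marginal_equity(s):
    """Marginal risk function in one variable (hypothetical quadratic)."""
    return 120.0 * s + 40.0 * s ** 2

def marginal_lapse(s):
    """Marginal risk function for the lapse risk factor (hypothetical)."""
    return -80.0 * s + 15.0 * s ** 2

def nonlin_equity_lapse(s1, s2):
    """Non-linearity function in two variables: a cross term that is zero
    whenever either shock is zero (hypothetical bilinear form)."""
    return 25.0 * s1 * s2

def liability_approx(base, equity_shock, lapse_shock):
    """Final approximation formula: base value plus summed marginal
    risk functions plus summed non-linearity functions."""
    return (base
            + marginal_equity(equity_shock)
            + marginal_lapse(lapse_shock)
            + nonlin_equity_lapse(equity_shock, lapse_shock))

value = liability_approx(1000.0, 0.10, -0.05)
```

In a real model there would be one marginal function per risk factor and one non-linearity function per materially interacting pair, each fitted separately.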

Outline of the problem and issues to consider: Testing and Validation
- Measures of goodness of fit: least squares, maximum absolute error, error dependency, error bias
- Out-of-sample testing: how many points are needed to be statistically significant?
- Analysis of change: understanding the results

Outline of the problem and issues to consider: Other considerations
- Determining the limits of the model: the range for each risk factor, allowing for movements in roll-forward. Four standard deviations?
- What about plan B?
- We must distinguish between the analysis phase and the production phase: what if goodness of fit fails once in production?
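The goodness-of-fit measures listed above can be computed together on a set of out-of-sample test points. This helper is a generic sketch, not the presenters' implementation:

```python
import numpy as np

def fit_diagnostics(actual, approx):
    """Goodness-of-fit measures for an approximation formula, evaluated
    against accurately valued out-of-sample test points."""
    err = np.asarray(approx) - np.asarray(actual)
    return {
        "sum_squared_error": float(np.sum(err ** 2)),  # least squares measure
        "max_abs_error": float(np.max(np.abs(err))),   # worst single point
        "bias": float(np.mean(err)),                   # systematic over/under-statement
    }

# Hypothetical example: three test-point valuations vs the fitted formula
stats = fit_diagnostics(actual=[100.0, 105.0, 98.0],
                        approx=[101.0, 104.0, 98.0])
```

A non-zero bias, or errors that trend with the risk factor (error dependency), would point to a mis-specified polynomial order rather than random noise.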

A simple example
- Consider the variation in liability value with respect to a single risk factor (lapses)
- The least squares quadratic is fitted using all evaluation points
- The initial fit is reasonable, but uses too many calculations to be feasible for the production phase
- How do we reduce the number of fitting points?

[Chart: PV of liabilities against lapse shock (percent), actual results versus the approximation.]

The Error Curve
- The error curve is cubic, with three roots at the three points of intersection
- Those three points define a unique quadratic function
- Fitting to the three points of intersection only gives the same approximation function and the same error curve
- No loss in performance for greatly reduced effort

[Charts: error against shock for the approximation fitted to all points and fitted to three points only; the two error curves are identical.]

Implications
- In general, we wish to approximate our unknown function by an (n-1)-order polynomial
- The least squares (n-1)-order polynomial approximation will intersect the unknown function n times
- If those n points of intersection can be determined, they provide the optimal fitting points, or nodes, for our approximation function
- We can achieve as good a fit using n nodes as using an infinite number of nodes
- Similarly, we can determine the points of maximum error, which allows us to estimate the maximum error directly: more powerful than estimating the maximum sample error

More points is not necessarily better
- What happens when we add points to try and improve the fit?
- The error curve changes: the maximum error has increased, and the sum squared error is worse
- Adding more points has led to a deterioration in the fit. Why?

[Chart: error curves for the approximation fitted to three points and to five points.]
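The claim that n well-chosen nodes match an infinite number of fitting points can be checked numerically. In this sketch (using the Legendre roots that later slides identify as the optimal nodes), a quadratic fitted at just three points reproduces the least squares quadratic fitted over ten thousand evaluations of a cubic:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def f(x):
    return x ** 3  # stand-in for the "unknown" cubic liability function

# Brute force: least squares quadratic over many evaluation points
xs = np.linspace(-1.0, 1.0, 10001)
dense = np.polynomial.Polynomial.fit(xs, f(xs), deg=2).convert()

# Three nodes only: roots of the degree-3 Legendre polynomial on [-1, 1]
nodes = Legendre.basis(3).roots()  # -sqrt(3/5), 0, +sqrt(3/5)
sparse = np.polynomial.Polynomial.fit(nodes, f(nodes), deg=2).convert()

# Both recover (approximately) the same quadratic, 0.6 * x,
# so the error curves are also the same.
```

Three function evaluations in production instead of ten thousand, with no loss of fit quality, is exactly the efficiency gain the slides describe.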

Rationale
- This is not a regression problem but an approximation problem
- For a given order of polynomial approximation function there exists a unique optimal error curve, i.e. one that deviates least from zero in the least squares sense
- Identifying solutions to the optimal error curve allows us to use the absolute minimum number of nodes required to uniquely define the optimal approximation function
- Adding further nodes shifts the error curve and, by definition, results in a sub-optimal approximation

Determining the fitting points
- We introduce a single assumption: our unknown function can be accurately represented by an n-order polynomial, i.e. powers of (n+1) and above are vanishingly small
- Under this assumption the n points of intersection depend only on the range, or domain, over which the least squares best fit is performed
- Similarly, the turning points, or points of maximum error, also depend only on the domain
- Usefully, all our points of interest are now independent of the unknown function and can be determined analytically

An example
- We wish to approximate a liability function with a quadratic (n = 3)
- Assuming powers of four and above are immaterial, we are attempting to approximate an unknown cubic
- The least squares quadratic best fit to any cubic intersects that cubic at three points:

  P2 = (L1 + L2)/2,  P1,3 = (L1 + L2)/2 ∓ ((L2 − L1)/2)·√(3/5)

  and has turning points

  M1,2 = (L1 + L2)/2 ∓ ((L2 − L1)/2)·(1/√5)

  where L1 and L2 represent the lower and upper limits of the domain.

[Chart: quadratic least squares fit to a cubic over [-8, 8], and the resulting error curve.]

More examples
- Quadratic least squares fits to a cubic using different numbers of fitting points, and the resulting error curves
- The points of intersection do not depend on the function

[Charts: three quadratic fits to different cubics; the error curves all cross zero at the same three points of the domain.]
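The node and turning-point formulae above can be verified numerically. The domain [-8, 8] and the cubic below are arbitrary illustrative choices; the least squares fit is approximated by a very dense discrete fit:

```python
import numpy as np

# Domain limits (illustrative, matching the charts' range)
L1, L2 = -8.0, 8.0
mid, half = (L1 + L2) / 2, (L2 - L1) / 2

# Intersection points (optimal nodes) and turning points (maximum error)
P = mid + half * np.sqrt(3 / 5) * np.array([-1.0, 0.0, 1.0])
M = mid + half / np.sqrt(5) * np.array([-1.0, 1.0])

def g(x):
    return 2 * x ** 3 - x ** 2 + 4 * x - 7  # arbitrary "unknown" cubic

# Dense discrete least squares quadratic, standing in for the continuous fit
xs = np.linspace(L1, L2, 200001)
fit = np.polynomial.Polynomial.fit(xs, g(xs), deg=2)
err = lambda x: g(x) - fit(x)

# err(P) is (numerically) zero at all three nodes, and the error curve is
# stationary at M1 and M2, independent of the particular cubic chosen.
```

Changing the cubic's coefficients rescales the error curve but leaves P and M unchanged, which is the key point of the slide.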

Implications
- We can make an assumption about the order of polynomial that accurately represents the unknown function
- Given this assumption, the nodes that give the optimum least squares fit can be determined without any prior analysis or knowledge of the unknown function
- Goodness of fit tests will fail for one of two reasons: a good enough fit is not possible, or the initial assumption is invalid
- In either case, simply revise the assumption and try again
- As long as the initial assumption holds, once identified, the fitting nodes and points of maximum error remain fixed

Theoretical Justification
- Weierstrass Approximation Theorem: any continuous function on a closed and bounded interval can be approximated on that interval by polynomials to any degree of accuracy
- The optimum nodes can be shown to correspond to the roots of the Legendre polynomials, for all n
- Properties of the Legendre polynomials: orthogonality, series convergence on [a, b], minimum deviation from zero
- We effectively use a truncated Legendre series:

  f(x) = Σ (k = 0 to n−1) a_k·L_k(x) + a_n·L_n(x)

  keeping the first sum as the approximation
- This forces the error curve to be (proportional to) the n-order Legendre polynomial
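Since the optimum nodes are Legendre roots mapped onto the fitting domain, they can be generated for any order in a few lines. This helper is a sketch, not part of the presented material:

```python
import numpy as np

def legendre_nodes(n, a, b):
    """Roots of the degree-n Legendre polynomial, mapped from [-1, 1]
    onto the fitting domain [a, b]. Under the slides' assumption, these
    are the optimal nodes for an (n-1)-order least squares fit."""
    t = np.polynomial.legendre.Legendre.basis(n).roots()
    return (a + b) / 2 + (b - a) / 2 * t

# e.g. the three nodes for a quadratic fit on a shock range of +/- 8%
nodes = legendre_nodes(3, -0.08, 0.08)
```

For n = 3 on [-1, 1] this returns -√(3/5), 0 and +√(3/5), matching the P1, P2, P3 formulae on the earlier slide.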

Further Considerations: Non-Linearity
- The same theory can be extended and applied to non-linearity functions in two or more variables
- Construct a combined risk surface by adding the marginal risk functions
- Compare with the actual combined risk surface to evaluate non-linearity
- If there is no non-linearity between two risk factors then there should be no cross terms in the approximation formula

[Charts: marginal risk functions for lapses and interest rates, the combined risk surface built from the marginals, and the actual combined risk surface.]

Further Considerations: Non-Linearity
- The non-linearity surface: non-linearity is the difference between the combined impact of two or more risk factors and the sum of those same risk factors
- Using a combination of terms in xy, x²y, xy² and x²y², we can construct a two-factor 2nd-order polynomial approximation
- The least squares fit shown was found from 14,641 nodes

[Charts: the non-linearity surface and its two-factor polynomial approximation.]

Further Considerations: Non-Linearity
- The optimal fitting points are given by the intersection of the non-linearity surface and the least squares non-linearity approximation function
- There is no unique solution
- It can be shown that the intersection of the one-factor solutions provides one solution to the two-factor problem
- The same approximation function results from fitting to four nodes as from fitting to 14,641 nodes

[Chart: the error surface.]

Further Considerations: Non-Linearity
- It can be shown that the roots of the Legendre polynomials provide the optimum nodes for single-factor polynomials of any order
- It can also be shown that the intersection of the single-factor solutions provides the optimum nodes for: two-factor 2nd-order polynomials, three-factor 2nd-order polynomials, and two-factor 3rd-order polynomials
- Attempts at a general proof for multifactor polynomials have so far been unsuccessful

Further Considerations: Constrained solutions
- Consider a quadratic least squares fit over a non-symmetrical domain
- The resulting error at zero may be undesirable
- We can solve directly with the constraint that the error at zero is nil
- The resulting fit is optimal given the constraint

[Charts: quadratic fits to a cubic over a non-symmetrical domain, unconstrained and with the error constrained to be nil at zero.]

Further Considerations: What about Plan B?
- Minimise the maximum error using Chebyshev nodes
- Adjust the domain to use nested and coincident nodes
- Use a dampening function: the error curve is the next term in the Legendre series, and its coefficient is the error at the extreme

[Charts: quadratic fit to a cubic; 3rd and 4th order Legendre polynomials on the same domain (no coincident nodes) and on different domains (nodes coincident).]
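The Chebyshev-node alternative mentioned above can be illustrated numerically: interpolating at the Chebyshev roots minimises the maximum error, while the Legendre nodes minimise the squared error. Both fits below are sketches on the cubic x³ over [-1, 1], not the presenters' liability functions:

```python
import numpy as np

def chebyshev_nodes(n, a=-1.0, b=1.0):
    """Roots of the degree-n Chebyshev polynomial, mapped to [a, b]."""
    k = np.arange(1, n + 1)
    t = np.cos((2 * k - 1) * np.pi / (2 * n))
    return (a + b) / 2 + (b - a) / 2 * t

def f(x):
    return x ** 3

xs = np.linspace(-1.0, 1.0, 100001)

# Quadratic through the three Chebyshev nodes (minimax-style fit)
cn = chebyshev_nodes(3)
cheb = np.polynomial.Polynomial.fit(cn, f(cn), deg=2)

# Quadratic through the three Legendre nodes (least squares optimal fit)
ln = np.polynomial.legendre.Legendre.basis(3).roots()
leg = np.polynomial.Polynomial.fit(ln, f(ln), deg=2)

max_err_cheb = float(np.max(np.abs(f(xs) - cheb(xs))))  # equioscillating error
max_err_leg = float(np.max(np.abs(f(xs) - leg(xs))))    # larger at the endpoints
```

For x³ the Chebyshev fit is 0.75x with maximum error 1/4, against 0.6x with maximum error 2/5 for the Legendre fit; the ordering reverses for the sum of squared errors, which is the trade-off the slide is pointing at.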

Practical Issues
- Stochastic values: approximation error must be minimised
  - Take the average of two results close to, and equidistant from, each node; similarly for each test point
- Error accumulation: the errors in the marginal risk functions combine to form an error surface, and these errors may accumulate to exceed materiality limits
  - Lower the materiality limits in the marginal risk functions
  - Fit to the true non-linearity surface
  - Adjust for marginal errors in non-linearity before fitting

Outcome
- Efficiency maximised: we fit exactly n points for n formula coefficients
- The process is repeatable and objective, based on theories widely accepted and used in engineering, physics and animation
- Less reliance on samples for performance measurement: maximum error is targeted and measured
- Model limitations can be determined with some confidence
- Better understanding of results: fitting errors are predictable and explainable
- Greater confidence in the internal model results
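The averaging idea for stochastic values can be sketched as follows; the noise model, step size and liability curve are illustrative assumptions, not the presenters' model:

```python
import numpy as np

rng = np.random.default_rng(42)

def stochastic_value(x, noise_sd=0.05):
    """Stand-in for a stochastic valuation at shock x: a made-up true
    liability curve (a cubic) plus independent simulation noise."""
    return x ** 3 + rng.normal(0.0, noise_sd)

def node_value(x, h=0.01, noise_sd=0.05):
    """Average two valuations close to, and equidistant from, the node x.
    The noise variance halves, while the deterministic error introduced
    is only O(h^2): (f(x-h) + f(x+h))/2 = f(x) + f''(x) h^2/2 + ..."""
    return 0.5 * (stochastic_value(x - h, noise_sd)
                  + stochastic_value(x + h, noise_sd))
```

With a small step h the bias from averaging is negligible next to the simulation noise it removes, which is why it is safe to use at both fitting nodes and test points.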

Questions or comments?

Expressions of individual views by members of The Actuarial Profession and its staff are encouraged. The views expressed in this presentation are those of the presenter.