
Advanced Lecture on Mathematical Science and Information Science I
Optimization in Finance

Reha H. Tütüncü
Visiting Associate Professor
Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology
Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213 USA
e-mail: reha@cmu.edu

Spring 2003


Contents

Preface

1 Introduction
  1.1 Continuous Optimization: A Brief Classification
    1.1.1 Linear Optimization
    1.1.2 Quadratic Optimization
    1.1.3 Conic Optimization
  1.2 Optimization with Data Uncertainty
    1.2.1 Stochastic Optimization
    1.2.2 Robust Optimization
  1.3 Financial Mathematics
    1.3.1 Portfolio Selection and Asset Allocation
    1.3.2 Pricing and Hedging of Options
    1.3.3 Risk Management
    1.3.4 Asset Liability Management

2 Linear Programming: Theory and Algorithms
  2.1 The Linear Programming Problem
  2.2 Duality
  2.3 Optimality Conditions
  2.4 The Simplex Method
    2.4.1 Basic Solutions
    2.4.2 Simplex Iterations
    2.4.3 The Tableau Form of the Simplex Method
  2.5 Exercises

3 LP Models and Tools in Finance
  3.1 Derivative Securities and The Fundamental Theorem of Asset Pricing
    3.1.1 Replication
    3.1.2 Risk-Neutral Probabilities
  3.2 Arbitrage Detection Using Linear Programming
  3.3 Risk Measures: Conditional Value-at-Risk
  3.4 Exercises

4 Quadratic Programming: Theory and Algorithms
  4.1 The Quadratic Programming Problem
  4.2 Optimality Conditions
  4.3 Interior-Point Methods
  4.4 The Central Path
  4.5 Interior-Point Methods
    4.5.1 Path-Following Algorithms
    4.5.2 Centered Newton Directions
    4.5.3 Neighborhoods of the Central Path
    4.5.4 A Long-Step Path-Following Algorithm
    4.5.5 Starting from an Infeasible Point
  4.6 QP Software
  4.7 Exercises

5 QP Models and Tools in Finance
  5.1 Mean-Variance Optimization
  5.2 Maximizing the Sharpe Ratio
  5.3 Returns-Based Style Analysis
  5.4 Recovering Risk-Neutral Probabilities from Options Prices
  5.5 Exercises

6 Stochastic Programming Models
  6.1 Introduction to Stochastic Programming
  6.2 Two-Stage Problems with Recourse
  6.3 Multi-Stage Problems
  6.4 Stochastic Programming Models and Tools in Finance
    6.4.1 Asset/Liability Management
    6.4.2 Corporate Debt Management

7 Robust Optimization Models and Tools in Finance
  7.1 Introduction to Robust Optimization
  7.2 Model Robustness
    7.2.1 Robust Multi-Period Portfolio Selection
  7.3 Solution Robustness
    7.3.1 Robust Portfolio Selection
    7.3.2 Robust Asset Allocation: A Case Study
  7.4 Exercises

8 Conic Optimization
  8.1 Conic Optimization Models and Tools in Finance
    8.1.1 Minimum Risk Arbitrage
    8.1.2 Approximating Covariance Matrices
  8.2 Exercises

A Convexity
B Cones
C A Probability Primer
D Newton's Method
E Karush-Kuhn-Tucker Conditions


List of Figures

4.1 The Central Path
4.2 Pure and centered Newton directions
4.3 Narrow and wide neighborhoods of the central path
7.1 The efficient frontier and the composition of the efficient portfolios found using the classical MVO approach without any consideration of input uncertainty
7.2 The efficient frontier and the composition of the efficient portfolios found using the robust asset allocation approach; 2.5 and 97.5 percentiles of means and covariances of bootstrapped samples were used to describe the uncertainty intervals for these inputs
7.3 (σ, µ)-profiles of classical and robust efficient portfolios when actual moments are (i) equal to their point estimates, (ii) equal to their worst possible values within given bounds
D.1 First step of Newton's method


List of Tables

7.1 2.5, 50, and 97.5 percentiles of mean monthly log-returns as well as the entries of the covariance matrix obtained from bootstrapped samples; only the lower diagonal entries in the covariance matrix are listed for brevity
D.1 Newton's method for Example D.1


Preface

Optimization models and methods play an increasingly important role in financial decisions. Many computational finance problems, ranging from asset allocation to risk management, from option pricing to model calibration, can be solved efficiently using modern optimization techniques. This manuscript discusses several classes of optimization problems (including linear, quadratic, conic, robust, and stochastic programming problems) encountered in financial models. For each problem class, after introducing the relevant theory (optimality conditions, duality, etc.) and efficient solution methods, we discuss several problems of mathematical finance that can be modeled within this problem class. In addition to classical and well-known models such as the Markowitz mean-variance optimization formulation, we present some newer optimization models for a variety of financial problems.

This manuscript is derived from a set of course notes I prepared for the course Advanced Lecture on Mathematical Science and Information Science I: Optimization in Finance, which I taught at the Department of Mathematical and Computing Sciences at Tokyo Institute of Technology between April 18, 2003 and July 18, 2003, during my sabbatical visit to Tokyo Tech. Parts of these notes are based on the lectures I presented at the University of Coimbra, Portugal, in the summer of 2002 as part of a short course I taught there. I gratefully acknowledge the financial support of these two institutions during my stays. I also thank the attendants of these courses and, in particular, my hosts Luís N. Vicente in Coimbra and Masakazu Kojima in Tokyo, for their feedback and for many stimulating discussions.

Reha H. Tütüncü
August 2003, Tokyo


Chapter 1

Introduction

Optimization is a branch of applied mathematics that derives its importance both from the wide variety of its applications and from the availability of advanced algorithms for the efficient and robust solution of many of its problem classes. Mathematically, it refers to the minimization (or maximization) of a given objective function of several decision variables that have to satisfy some functional constraints. A typical optimization model addresses the allocation of scarce resources among a set of alternative activities in order to maximize an objective function, a measure of the modeler's satisfaction with the solution, for example, the total profit.

Decision variables, the objective function, and constraints are three essential elements of any optimization problem. Some problems may lack constraints, so that any set of decision variables (of appropriate dimension) is acceptable as an alternative solution. Such problems are called unconstrained optimization problems, while the others are referred to as constrained optimization problems. There are problem instances with no objective functions, the so-called feasibility problems, and others with multiple objective functions. Such problems are often addressed by reduction to a single or a sequence of single-objective optimization problems.

If the decision variables in an optimization problem are restricted to integers, or to a discrete set of possibilities, we have an integer or discrete optimization problem. If there are no such restrictions on the variables, the problem is a continuous optimization problem. Of course, some problems may have a mixture of discrete and continuous variables. Our focus in these lectures will be on continuous optimization problems. We continue with a short classification of the problem classes we will encounter during our lectures.

1.1 Continuous Optimization: A Brief Classification

We start with a generic description of an optimization problem. Given a function f : R^n → R and a set S ⊆ R^n, the problem of finding an x ∈ R^n that solves

    (OP_0)  min_x f(x)
            s.t.  x ∈ S                                          (1.1)

is called an optimization problem (OP). We refer to f as the objective function and to S as the feasible region. If S is empty, the problem is called infeasible. If it is possible to find a sequence x_k such that x_k ∈ S for all k and f(x_k) diverges to −∞, then the problem is unbounded. If the problem is neither infeasible nor unbounded, then it is often possible to find a solution x* that satisfies

    f(x*) ≤ f(x)  for all x ∈ S.

Such an x* is called a global minimizer of the problem (OP). If f(x*) < f(x) for all x ∈ S with x ≠ x*, then x* is a strict global minimizer. In other instances, we may only find an x* that satisfies

    f(x*) ≤ f(x)  for all x ∈ S ∩ B_{x*}(ε)

for some ε > 0, where B_{x*}(ε) is the open ball with radius ε centered at x*, i.e.,

    B_{x*}(ε) = {x : ‖x − x*‖ < ε}.

Such an x* is called a local minimizer of the problem (OP). A strict local minimizer is defined similarly.

In most cases, the feasible set S is described explicitly using functional constraints (equalities and inequalities). For example, S may be given as

    S := {x : g_i(x) = 0 for i ∈ E, g_i(x) ≥ 0 for i ∈ I},

where E and I are the index sets for equality and inequality constraints. Then, our generic optimization problem takes the following form:

    (OP)  min_x f(x)
          g_i(x) = 0,  i ∈ E                                     (1.2)
          g_i(x) ≥ 0,  i ∈ I.

There are many factors that affect the efficient solvability of optimization problems. For example, n, the number of decision variables in the problem, and |E| + |I|, the total

number of constraints, are generally good predictors of how easy or difficult it would be to solve a given optimization problem. Other factors are related to the properties of the functions f and g_i that define the problem. Problems with a linear objective function and linear constraints are easier, and so are problems with convex objective functions and convex feasible sets. Therefore, instead of relying on general purpose optimization algorithms, researchers have developed different algorithms for problems with special characteristics. This approach requires a proper classification of optimization problems. We list a few of the optimization problem classes that we will encounter in this manuscript. A more complete classification can be found, for example, on the Optimization Tree available from http://www-fp.mcs.anl.gov/otc/guide/optweb/.

1.1.1 Linear Optimization

One of the most common and easiest optimization problems is the linear optimization (LO) or linear programming (LP) problem: the problem of optimizing a linear objective function subject to linear equality and inequality constraints. This corresponds to the case where f and all the g_i are linear in (OP). If either f or one of the g_i is not linear, then the resulting problem is a nonlinear programming (NLP) problem. The standard form of the LP is given below:

    (LP)  min_x c^T x
          Ax = b                                                 (1.3)
          x ≥ 0,

where A ∈ R^{m×n}, b ∈ R^m, and c ∈ R^n are given, and x ∈ R^n is the variable vector to be determined as the solution to the problem. As with (OP), the problem LP is said to be feasible if its constraints are consistent, and it is called unbounded if there exists a sequence of feasible vectors {x_k} such that c^T x_k → −∞. When LP is feasible but not unbounded, it has an optimal solution, i.e., a vector x that satisfies the constraints and minimizes the objective value among all feasible vectors. The best known (and most successful) methods for solving LPs are interior-point methods and the simplex method.
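As a concrete illustration (my addition, not part of the original notes), a small standard-form LP can be handed directly to an off-the-shelf solver. The use of SciPy's linprog here is my choice; any LP solver would do. This particular instance reappears as problem (2.4) in Chapter 2, where its optimal value of −7 is derived by hand via duality.

```python
# A minimal sketch (not from the original notes): solving a tiny
# standard-form LP  min c^T x  s.t.  Ax = b, x >= 0  with SciPy.
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -1.0, 0.0, 0.0])        # objective coefficients
A = np.array([[2.0, 1.0, 1.0, 0.0],         # equality constraint matrix
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([12.0, 9.0])

# bounds=(0, None) encodes the nonnegativity constraint x >= 0.
res = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))
print(res.x, res.fun)   # optimal solution (5, 2, 0, 0) and objective value -7
```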

1.1.2 Quadratic Optimization

A more general optimization problem is the quadratic optimization (QO) or quadratic programming (QP) problem, where the objective function is now a quadratic function of the variables. The standard form QP is defined as follows:

    (QP)  min_x (1/2) x^T Qx + c^T x
          Ax = b                                                 (1.4)
          x ≥ 0,

where A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n, and Q ∈ R^{n×n} are given, and x ∈ R^n. Since x^T Qx = (1/2) x^T (Q + Q^T) x, Q can be assumed to be symmetric without loss of generality.

The objective function of the problem QP is a convex function of x when Q is a positive semidefinite matrix, i.e., when y^T Qy ≥ 0 for all y (see the Appendix for a discussion on convex functions). This condition is equivalent to Q having all nonnegative eigenvalues. When this condition is satisfied, the QP problem is a convex optimization problem and can be solved in polynomial time using interior-point methods.

1.1.3 Conic Optimization

Another generalization of the linear optimization problem is obtained by replacing the nonnegativity constraints with general conic inclusion constraints, resulting in a so-called conic optimization problem. For this purpose, we consider a closed convex cone C (see the Appendix for a brief discussion on cones) in a finite-dimensional vector space X and the following conic optimization problem:

    (CO)  min_x c^T x
          Ax = b                                                 (1.5)
          x ∈ C.

When X = R^n and C = R^n_+, this problem is the standard form LP. However, much more general nonlinear optimization problems can also be formulated in this way. Furthermore, some of the most efficient and robust algorithmic machinery developed for linear optimization problems can be modified to solve these general optimization problems. Two important subclasses of conic optimization problems we will address are (i) second-order cone optimization and (ii) semidefinite optimization. These correspond to the cases when C is the second-order cone,

    C_q := {x = (x_0, x_1, ..., x_n) ∈ R^{n+1} : x_0 ≥ ‖(x_1, ..., x_n)‖},

and the cone of symmetric positive semidefinite matrices,

    C_s := {X ∈ R^{n×n} : X = X^T, X is positive semidefinite}.

When we work with the cone of positive semidefinite matrices, the standard inner products used in c^T x and Ax in (1.5) are replaced by an appropriate inner product for the space of n-dimensional square matrices.
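As an aside not in the original notes: convex QPs of the form (1.4) can be stated almost verbatim in a modeling tool. The sketch below uses the CVXPY package (my choice, with made-up data) to solve a tiny instance of (1.4); CVXPY also accepts second-order cone and semidefinite constraints of the kind just described.

```python
# A minimal sketch (assumes the CVXPY package; data are illustrative,
# not from the notes): the standard-form convex QP (1.4).
import numpy as np
import cvxpy as cp

Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # symmetric positive definite, hence convex
c = np.array([-1.0, -0.5])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x = cp.Variable(2)
objective = cp.Minimize(0.5 * cp.quad_form(x, Q) + c @ x)
constraints = [A @ x == b, x >= 0]
prob = cp.Problem(objective, constraints)
prob.solve()                        # CVXPY dispatches to a suitable convex solver
print(x.value, prob.value)
```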

1.2 Optimization with Data Uncertainty

In all the problem classes we discussed so far, we made the implicit assumption that the data of the problem, namely the parameters that describe the problem, such as Q, A, b, and c in (QP), are all known. This is not always the case. Often, the problem parameters correspond to quantities that will only be realized in the future, or cannot be known exactly at the time the problem has to be formulated and solved. Such situations are especially common in models that involve financial quantities such as returns on investments, risks, etc.

We will discuss two fundamentally different approaches that address optimization with data uncertainty. Stochastic programming is an approach used when the data uncertainty is random and can be explained by some probability distribution. Robust optimization is used when the uncertainty structure is not random and/or a solution that behaves well in all possible realizations of the uncertain data is desired. These two alternative approaches are not problem classes (as LP and QP are) but rather modeling techniques for addressing data uncertainty.

1.2.1 Stochastic Optimization

The term stochastic optimization or stochastic programming refers to an optimization problem in which some problem data are random. The underlying optimization problem might be a linear program, an integer program, or a nonlinear program. An important case is that of stochastic linear programs.

A stochastic program with recourse arises when some of the decisions (recourse actions) can be taken after the outcomes of some (or all) random events have become known. For example, a two-stage stochastic linear program with recourse can be written as follows:

    max  (c^1)^T x_1 + E[max c^2(ω)^T x_2(ω)]
         A^1 x_1 = b^1
         B^2(ω) x_1 + A^2(ω) x_2(ω) = b^2(ω)                     (1.6)
         x_1 ≥ 0, x_2(ω) ≥ 0,

where the first-stage decisions are represented by the vector x_1 and the second-stage decisions by the vector x_2(ω), which depend on the realization ω of a random event. A^1 and b^1 define deterministic constraints on the first-stage decisions x_1, whereas A^2(ω), B^2(ω), and b^2(ω) define stochastic linear constraints linking the recourse decisions x_2(ω) to the first-stage decisions. The objective function contains a deterministic term (c^1)^T x_1 and the expectation of the second-stage objective c^2(ω)^T x_2(ω) taken over all realizations of the random event ω.

Note that, once the first-stage decisions x_1 have been made and the random event ω has been realized, one can compute the optimal second-stage decisions by solving the following linear program:

    f(x_1, ω) = max  c^2(ω)^T x_2(ω)
                     A^2(ω) x_2(ω) = b^2(ω) − B^2(ω) x_1         (1.7)
                     x_2(ω) ≥ 0.

Let f(x_1) = E[f(x_1, ω)] denote the expected value of the optimal value of this problem.

Then, the two-stage stochastic linear program becomes

    max  (c^1)^T x_1 + f(x_1)
         A^1 x_1 = b^1                                           (1.8)
         x_1 ≥ 0.

So, if the (possibly nonlinear) function f(x_1) is known, the problem reduces to a nonlinear programming problem. When the data c^2(ω), A^2(ω), B^2(ω), and b^2(ω) are described by finite distributions, one can show that f is piecewise linear and concave. When the data are described by probability densities that are absolutely continuous and have finite second moments, one can show that f is differentiable and concave. In both cases, we have a convex optimization problem with linear constraints for which specialized algorithms are available.
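When ω takes only finitely many values, (1.6) can be written out scenario by scenario as one large "deterministic equivalent" LP. The sketch below (my addition; the newsvendor-style setting and the numbers are invented for illustration) does this for a toy problem with two equally likely scenarios: order x_1 units at unit cost 1 today, then sell y_s ≤ min(x_1, d_s) units at price 1.5 once demand d_s is revealed.

```python
# A minimal sketch (my illustration, not from the notes): the
# deterministic-equivalent LP of a tiny two-stage recourse problem.
import numpy as np
from scipy.optimize import linprog

p = [0.5, 0.5]                    # scenario probabilities
d = [4.0, 10.0]                   # demand scenarios
price, cost = 1.5, 1.0

# Stacked variables z = [x1, y_1, y_2]; maximize -cost*x1 + sum_s p_s*price*y_s.
c = np.array([cost, -p[0] * price, -p[1] * price])   # minimize negative profit
A_ub = np.array([[-1.0, 1.0, 0.0],   # y_1 <= x1  (cannot sell more than ordered)
                 [-1.0, 0.0, 1.0],   # y_2 <= x1
                 [ 0.0, 1.0, 0.0],   # y_1 <= d_1 (cannot sell more than demanded)
                 [ 0.0, 0.0, 1.0]])  # y_2 <= d_2
b_ub = np.array([0.0, 0.0, d[0], d[1]])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
print(res.x, -res.fun)            # optimal order x1 = 4, expected profit 2
```

The second-stage sales decisions y_s are exactly the recourse actions x_2(ω) of (1.6), one copy per scenario.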

1.2.2 Robust Optimization

Robust optimization refers to the modeling of optimization problems with data uncertainty so as to obtain a solution that is guaranteed to be good for all possible realizations of the uncertain parameters. In this sense, this approach departs from the randomness assumption used in stochastic optimization for uncertain parameters and gives the same importance to all possible realizations. Uncertainty in the parameters is described through uncertainty sets that contain all (or most) possible values that may be realized for the uncertain parameters.

There are different definitions and interpretations of robustness, and the resulting models differ accordingly. One important concept is model robustness; this refers to solutions that remain feasible for all possible values of the uncertain inputs; we prefer to call such solutions constraint robust solutions. This type of solution is required in many engineering applications. Here is an example adapted from Ben-Tal and Nemirovski: Consider a multi-phase engineering process (a chemical distillation process, for example) and a related process optimization problem that includes balance constraints (materials entering a phase of the process cannot be more than what is produced/left over from the previous phase). Often, the quantities of the end products of a particular phase depend on external, uncontrollable factors and are therefore uncertain. However, no matter what the values of these uncontrollable factors are, the balance constraints must be satisfied. Therefore, our solution must be model robust with respect to the uncertainties of the problem.

Here is a mathematical model for finding constraint robust solutions. Consider an optimization problem of the form:

    (OP_uc)  min_x f(x)
             G(x, p) ∈ K.                                        (1.9)

Here, x are the decision variables, f is the (certain) objective function, G and K are the structural elements of the constraints that are assumed to be certain, and p are the possibly uncertain parameters of the problem. Consider an uncertainty set U that contains all possible values of the uncertain parameters p. Then, a constraint robust optimal solution can be found by solving the following problem:

    (CROP)  min_x f(x)
            G(x, p) ∈ K  for all p ∈ U.                          (1.10)

Another important robustness concept is solution robustness. This refers to solutions that will remain close to optimal for all possible realizations of the uncertain problem parameters; for this reason we prefer the alternative term objective robust for such solutions. Since such solutions may be difficult to obtain, especially when uncertainty sets are relatively large, an alternative goal for objective robustness is to find solutions whose worst-case behavior is optimized. The worst-case behavior of a solution corresponds to the value of the objective function for the worst possible realization of the uncertain data for that particular solution. Here is a mathematical model that addresses objective robustness. Consider an optimization problem of the form:

    (OP_uo)  min_x f(x, p)
             x ∈ S.                                              (1.11)

Here, S is the (certain) feasible set and f is the objective function that depends on uncertain parameters p. Assume as above that U is the uncertainty set that contains all possible values of the uncertain parameters p. Then, an objective robust solution can be obtained by solving:

    (OROP)  min_{x ∈ S} max_{p ∈ U} f(x, p).                     (1.12)

Note that solution robustness is a special case of model robustness: it is easy to see that by introducing a new variable t (to be minimized) into (OP_uo) and imposing the constraint f(x, p) ≤ t, we obtain a problem equivalent to (OP_uo), and the constraint robust formulation of the resulting problem is equivalent to (OROP). Model robustness and solution robustness are concepts that arise in conservative decision making and are not always appropriate for optimization problems with data uncertainty.
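For a finite uncertainty set U = {p_1, ..., p_k} and an objective f(x, p) = p^T x that is linear in x, (OROP) is just an LP: minimize t subject to p_j^T x ≤ t for each j, which is exactly the epigraph reformulation mentioned above. The sketch below (my addition; the data are invented) makes this concrete with CVXPY.

```python
# A minimal sketch (my illustration, not from the notes): the objective
# robust problem (OROP) with a finite uncertainty set of cost vectors
# and the simplex S = {x : sum(x) = 1, x >= 0} as the feasible set.
import numpy as np
import cvxpy as cp

P = np.array([[1.0, 3.0],          # three possible cost vectors: the set U
              [2.0, 1.5],
              [3.0, 1.0]])

x = cp.Variable(2)
t = cp.Variable()                  # epigraph variable for the inner max
constraints = [P @ x <= t, cp.sum(x) == 1, x >= 0]
prob = cp.Problem(cp.Minimize(t), constraints)
prob.solve()
print(x.value, t.value)            # x = (0.5, 0.5) hedges against all three costs
```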

1.3 Financial Mathematics

Modern finance has become increasingly technical, requiring the use of sophisticated mathematical tools in both research and practice. Many find the roots of this trend in the portfolio selection models and methods described by Markowitz in the 1950s and the option pricing formulas developed by Black, Scholes, and Merton in the early 1970s. For the enormous effect these works produced on modern financial practice, Markowitz was awarded the Nobel prize in Economics in 1990, while Scholes and Merton won the Nobel prize in Economics in 1997.

Below, we list a number of topics in finance that are especially suited for mathematical analysis and involve sophisticated tools from the mathematical sciences.

1.3.1 Portfolio Selection and Asset Allocation

The theory of optimal selection of portfolios was developed by Harry Markowitz in the 1950s. His work formalized the diversification principle in portfolio selection and, as mentioned above, earned him the 1990 Nobel prize for economics. We will discuss his model in more detail later. Here we give a brief description of the model and relate it to QPs.

Consider an investor who has a certain amount of money to be invested in a number of different securities (stocks, bonds, etc.) with random returns. For each security i, i = 1, ..., n, estimates of its expected return µ_i and variance σ_i² are given. Furthermore, for any two securities i and j, their correlation coefficient ρ_ij is also assumed to be known. If we represent the proportion of the total funds invested in security i by x_i, one can compute the expected return and the variance of the resulting portfolio x = (x_1, ..., x_n) as follows:

    E[x] = x_1 µ_1 + ... + x_n µ_n = µ^T x,

and

    Var[x] = Σ_{i,j} ρ_ij σ_i σ_j x_i x_j = x^T Qx,

where ρ_ii ≡ 1, Q_ij = ρ_ij σ_i σ_j for i ≠ j, Q_ii = σ_i², and µ = (µ_1, ..., µ_n).

The portfolio vector x must satisfy Σ_i x_i = 1, and there may or may not be additional feasibility constraints. A feasible portfolio x is called efficient if it has the maximal expected return among all portfolios with the same variance, or alternatively, if it has the minimum variance among all portfolios that have at least a certain expected return. The collection of efficient portfolios forms the efficient frontier of the portfolio universe.

The Markowitz portfolio selection problem, also called the mean-variance optimization (MVO) problem, can be formulated in three different but equivalent ways. One formulation results in the problem of finding a minimum variance portfolio of the securities 1 to n that yields at least a target value of expected return (say b). Mathematically, this formulation produces a convex quadratic programming problem:

    min_x  x^T Qx
           e^T x = 1                                             (1.13)
           µ^T x ≥ b
           x ≥ 0,

where e is the n-dimensional vector of all ones. The first constraint indicates that the proportions x_i should sum to 1. The second constraint indicates that the expected return is no less than the target value, and, as we discussed above, the objective function corresponds to the total variance of the portfolio. Nonnegativity constraints on the x_i are introduced to disallow short sales (selling a security that you do not have). Note that the matrix Q is positive semidefinite since x^T Qx, the variance of the portfolio, must be nonnegative for every portfolio (feasible or not) x.
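The following sketch (my addition; the return and correlation figures are invented) builds Q from σ and ρ exactly as above and solves (1.13) with CVXPY.

```python
# A minimal sketch (my illustration; the numbers are made up): the
# Markowitz MVO problem (1.13) for three securities.
import numpy as np
import cvxpy as cp

mu = np.array([0.08, 0.10, 0.13])          # expected returns
sigma = np.array([0.12, 0.18, 0.24])       # standard deviations
rho = np.array([[1.0, 0.3, 0.2],           # correlation matrix
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
Q = np.outer(sigma, sigma) * rho           # Q_ij = rho_ij * sigma_i * sigma_j
b = 0.10                                   # target expected return

x = cp.Variable(3)
prob = cp.Problem(cp.Minimize(cp.quad_form(x, Q)),
                  [cp.sum(x) == 1, mu @ x >= b, x >= 0])
prob.solve()
print(x.value, np.sqrt(prob.value))        # weights and portfolio std. dev.
```

Sweeping the target b over a range of values and re-solving traces out the efficient frontier described above.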

The model (1.13) is rather versatile. For example, if short sales are permitted on some or all of the securities, then this can be incorporated into the model simply by removing the nonnegativity constraints on the corresponding variables. If regulations or the investor's considerations limit the amount of investment in a subset of the securities, the model can be augmented with a linear constraint to reflect such a limit. In principle, any linear constraint can be added to the model without making it much harder to solve.

Asset allocation problems have an identical mathematical structure to portfolio selection problems. In these problems, the objective is not to choose a portfolio of stocks (or other securities), but to determine the optimal investment among a set of asset classes. Examples of these asset classes are large capitalization stocks, small capitalization stocks, foreign stocks, government bonds, corporate bonds, etc. Since there are many mutual funds focusing on each one of these different asset classes, one can conveniently invest in these asset classes by purchasing the corresponding mutual funds. After estimating the expected returns, variances, and covariances for the different asset classes, one can formulate a QP identical to (1.13) and obtain efficient portfolios of these asset classes.

The formulation (1.13) we presented above makes several simplifying assumptions, and much of the literature on asset allocation/portfolio selection focuses on solving this problem without some of these assumptions. We will address some of these variations and some other problems related to portfolio selection throughout the manuscript.

1.3.2 Pricing and Hedging of Options

We first start with a description of some of the well-known financial options. A European call option is a contract with the following conditions: At a prescribed time in the future, known as the expiration date, the holder of the option has the right, but not the obligation, to purchase a prescribed asset, known as the underlying, for a prescribed amount, known as the strike price or exercise price. A European put option is like the call option, except that it gives the right to sell an asset. An American call/put option is like a European option, except that it may be exercised on or before the expiration date.

Since the payoff from an option depends on the value of the underlying security, its price is also related to the current value and expected behavior of this underlying security. To find the fair value of a given option, we need to solve a pricing problem, and

this problem can often be solved using sophisticated mathematical techniques, provided that there is a good model for the stochastic behavior of the underlying security.

Option pricing problems are often solved using the following strategy: We try to determine a portfolio of assets with known prices which, if updated properly through time, will produce the same payoff as the option. Since the portfolio and the option will have the same eventual payoffs, we conclude that they must have the same value today (otherwise, there is arbitrage), and we can easily obtain the price of the option. A portfolio of other assets that produces the same payoff as a given financial instrument is called a replicating portfolio (or a hedge) for that instrument. Finding the right portfolio, of course, is not always easy and leads to a replication (or hedging) problem.

Let us consider a simple example to illustrate these ideas. Let us assume that one share of stock XYZ is currently valued at S_0 = $40. The price of XYZ a month from today is random: Assume that its value will either double or halve with equal probabilities:

    S_1(u) = 80  (up)
    S_0 = $40
    S_1(d) = 20  (down).

Today, we purchase a European call option to buy one share of XYZ stock for $50 a month from today. What is the fair price of this call option?

Let us assume that we can borrow or lend money with no interest between today and next month, and that we can buy or sell any amount of the XYZ stock without any commissions, etc. These are part of the frictionless market assumptions we will address later in the manuscript. Further assume that XYZ will not pay any dividends within the next month.

To solve the pricing problem, we consider the following hedging problem: Can we form a portfolio of the underlying stock (bought or sold) and cash (borrowed or lent) today, such that the payoff of the portfolio at the expiration date of the option will match the payoff of the option? Note that the option payoff will be $30 if the price of the stock goes up and $0 if it goes down. Say this portfolio has ∆ shares of XYZ and $B in cash. This portfolio would be worth 40∆ + B today. Next month, the payoffs for this portfolio will be:

    P_1(u) = 80∆ + B  (up)
    P_0 = 40∆ + B
    P_1(d) = 20∆ + B  (down).

Let us choose ∆ and B such that

    80∆ + B = 30
    20∆ + B = 0,

so that the portfolio replicates the payoff of the option at the expiration date. This gives ∆ = 1/2 and B = −10, which is the hedge we were looking for. This portfolio is worth P_0 = 40∆ + B = $10 today; therefore, the fair price of the option must also be $10.
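The replication argument is just a 2×2 linear system. The sketch below (my addition, not from the notes) solves it numerically and prices the option, so other strikes or price moves can be tried by changing the inputs.

```python
# A minimal sketch (my illustration): one-period binomial replication.
# Solve  Su*delta + B = payoff_u  and  Sd*delta + B = payoff_d.
import numpy as np

S0, Su, Sd, K = 40.0, 80.0, 20.0, 50.0
payoff = np.array([max(Su - K, 0.0), max(Sd - K, 0.0)])   # call payoffs (30, 0)

A = np.array([[Su, 1.0],
              [Sd, 1.0]])
delta, B = np.linalg.solve(A, payoff)   # delta = 0.5, B = -10 (i.e., borrowing $10)
price = S0 * delta + B                  # cost of the replicating portfolio today
print(delta, B, price)                  # fair option price: $10
```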

1.3.3 Risk Management

Risk is an inevitable consequence of productive activity. This is especially true for the financial activities of companies and individuals, where the results of most decisions will be observed or realized in the future, in unpredictable circumstances. Since companies cannot ignore such risks and cannot insure themselves completely against them, they have to manage them. This is a hard task even with the support of advanced mathematical techniques: poor risk management practices led to several spectacular failures in the financial industry during the last decade (e.g., Barings Bank, Long Term Capital Management, Orange County).

The development of a coherent risk management practice requires quantitative risk measures that adequately reflect the vulnerabilities of a company. Examples of these risk measures include the portfolio variance as in the Markowitz MVO model, the Value-at-Risk (VaR), and the expected shortfall (also known as conditional VaR, or CVaR). Furthermore, risk control techniques need to be developed and implemented to adapt to the rapid changes in the values of these risk measures. Government regulators already mandate that financial institutions control their holdings in certain ways and place margin requirements for risky positions.

Optimization problems encountered in financial risk management often take the following form: optimize a performance measure (such as expected investment return) subject to the usual operating constraints and the constraint that a particular risk measure for the company's financial holdings does not exceed a prescribed amount. Mathematically, we may have the following problem:

    max_x  µ^T x
           RM[x] ≤ γ                                             (1.14)
           e^T x = 1
           x ≥ 0.

As in the Markowitz MVO model, x_i represents the proportion of the total funds invested in security i. The objective is the expected portfolio return, and µ is the expected return vector for the different securities. RM[x] denotes the value of a particular risk measure for portfolio x, and γ is the prescribed upper limit on this measure. Since RM[x] is generally a nonlinear function of x, (1.14) is a nonlinear programming problem. Alternatively, we may optimize the risk measure while requiring that the expected return of the portfolio is at least a certain amount. This would produce a problem very similar to (1.13).
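Problem (1.14) becomes a tractable convex program once a convex risk measure is chosen. The sketch below (my addition; it reuses the invented data of the earlier MVO sketch) takes RM[x] = x^T Qx, the portfolio variance.

```python
# A minimal sketch (my illustration; numbers are made up): problem (1.14)
# with the risk measure chosen as the portfolio variance,
# RM[x] = x^T Q x <= gamma, which makes the problem convex.
import numpy as np
import cvxpy as cp

mu = np.array([0.08, 0.10, 0.13])
sigma = np.array([0.12, 0.18, 0.24])
rho = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])
Q = np.outer(sigma, sigma) * rho
gamma = 0.02                          # variance cap (about 14% std. dev.)

x = cp.Variable(3)
prob = cp.Problem(cp.Maximize(mu @ x),
                  [cp.quad_form(x, Q) <= gamma, cp.sum(x) == 1, x >= 0])
prob.solve()
print(x.value, prob.value)            # max expected return within the risk budget
```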

1.3.4 Asset Liability Management

How should a financial institution manage its assets and liabilities? A static mean-variance optimizing model, such as the one we discussed for asset allocation, fails to incorporate the multivariate nature of the liabilities faced by financial institutions. Furthermore, it penalizes returns above the expected returns as much as it penalizes shortfalls. A multi-period model that emphasizes the need to meet liabilities in each period for a finite (or possibly infinite) horizon is often required. Since liabilities and asset returns usually have random components, their optimal management requires tools of Optimization under Uncertainty and, most notably, stochastic programming approaches.

Let L_t be the liability of the company in period t for t = 1, ..., T. Here, we assume that the L_t are random with known distributions. A typical problem to solve in asset/liability management is to determine which assets (and in what quantities) the company should hold in each period to maximize its expected wealth at the end of period T. We can further assume that the asset classes the company can choose from have random returns (again, with known distributions) denoted by R_{i,t} for asset class i in period t. Since the company can make the holding decisions for each period after observing the asset returns and liabilities in the previous periods, the resulting problem can be cast as a stochastic program with recourse:

    max  E[Σ_i x_{i,T}]
         Σ_i (1 + R_{i,t}) x_{i,t−1} − Σ_i x_{i,t} = L_t,  t = 1, ..., T    (1.15)
         x_{i,t} ≥ 0  for all i, t.

The objective function represents the expected total wealth at the end of the last period. The constraints indicate that the surplus left after the liability L_t is covered will be invested as follows: x_{i,t} invested in asset class i. In this formulation, the x_{i,0} are the fixed, and possibly nonzero, initial positions in the different asset classes.

Chapter 2

Linear Programming: Theory and Algorithms

2.1 The Linear Programming Problem

One of the most common and fundamental optimization problems is the linear programming problem (LP), the problem of optimizing a linear objective function subject to linear equality and inequality constraints. A generic linear optimization problem has the following form:

    (LOP)  min_x c^T x
           a_i^T x = b_i,  i ∈ E                                 (2.1)
           a_i^T x ≥ b_i,  i ∈ I,

where E and I are the index sets for linear equality and inequality constraints, respectively. For algorithmic purposes, it is often desirable to have the problems structured in a particular way. Since the development of the simplex method for LPs, the following form has been a popular standard and is called the standard form LP:

    (LP)  min_x c^T x
          Ax = b                                                 (2.2)
          x ≥ 0.

Here A ∈ R^{m×n}, b ∈ R^m, and c ∈ R^n are given, and x ∈ R^n is the variable vector to be determined as the solution of the problem. The matrix A is assumed to have full row rank. This is done without loss of generality: if A does not have full row rank, the augmented matrix [A b] can be row reduced, which either reveals that the problem is infeasible or that one can continue with a reduced full-rank matrix.

This form is not restrictive: Inequalities (other than nonnegativity) can be rewritten as equalities after the introduction of a so-called slack or surplus variable that is restricted to be nonnegative. For example,

the problem

    min  −x_1 − x_2
         2x_1 + x_2 ≤ 12                                         (2.3)
         x_1 + 2x_2 ≤ 9
         x_1, x_2 ≥ 0

can be rewritten as

    min  −x_1 − x_2
         2x_1 + x_2 + x_3 = 12                                   (2.4)
         x_1 + 2x_2 + x_4 = 9
         x_1, x_2, x_3, x_4 ≥ 0.

Variables that are not required to be nonnegative can be expressed as the difference of two new nonnegative variables. Simple transformations are available to rewrite any given LP in the standard form above. Therefore, in the rest of our theoretical and algorithmic discussion we assume that the LP is in the standard form.

Recall the following definitions from the introductory chapter: LP is said to be feasible if its constraints are consistent, and it is called unbounded if there exists a sequence of feasible vectors {x_k} such that c^T x_k → −∞. When we talk about a solution (without any qualifiers) to LP, we mean any candidate vector x ∈ R^n. A feasible solution is one that satisfies the constraints, and an optimal solution is a vector x that satisfies the constraints and minimizes the objective value among all feasible vectors. When LP is feasible but not unbounded, it has an optimal solution.

2.2 Duality

The most important questions we will address in this chapter are the following: How do we recognize an optimal solution, and how do we find such solutions? Consider the standard form LP in (2.4) above. Here are a few alternative feasible solutions:

    (x_1, x_2, x_3, x_4) = (0, 9/2, 15/2, 0)   Objective value = −9/2
    (x_1, x_2, x_3, x_4) = (6, 0, 0, 3)        Objective value = −6
    (x_1, x_2, x_3, x_4) = (5, 2, 0, 0)        Objective value = −7

Since we are minimizing, the last solution is the best among the three feasible solutions we found, but is it the optimal solution? We can make such a claim if we can, somehow, show that there is no feasible solution with a smaller objective value. Note that the constraints provide us with some bounds on the value of the objective function. For example, for any feasible solution, we must have

    −x_1 − x_2 ≥ −2x_1 − x_2 − x_3 = −12

using the first constraint of the problem. The inequality above must hold for all feasible solutions, since the x_i are all nonnegative and the coefficient of each variable on the left-hand side is at least as large as the coefficient of the corresponding variable on the right-hand side. We can do better using the second constraint:

    −x_1 − x_2 ≥ −x_1 − 2x_2 − x_4 = −9,

and even better by adding a negative one-third of each constraint:

    −x_1 − x_2 ≥ −x_1 − x_2 − (1/3)x_3 − (1/3)x_4
               = −(1/3)(2x_1 + x_2 + x_3) − (1/3)(x_1 + 2x_2 + x_4)
               = −(1/3)(12 + 9) = −7.

This last inequality indicates that for any feasible solution, the objective function value cannot be smaller than −7. Since we already found a feasible solution achieving this bound, we conclude that this solution, namely (x_1, x_2, x_3, x_4) = (5, 2, 0, 0), is an optimal solution of the problem.

This process illustrates the following strategy: If we find a feasible solution to the LP problem, and a bound on the optimal value of the problem such that the bound and the objective value of the feasible solution coincide, then we can confidently recognize our feasible solution as an optimal solution. We will comment on this strategy shortly. Before that, though, we formalize our approach for finding a bound on the optimal objective value.

Our strategy was to find a linear combination of the constraints, say with multipliers y_1 and y_2 for the first and second constraints respectively, such that the combined coefficient of each variable forms a lower bound on the objective coefficient of that variable. In other words, we tried to choose y_1 and y_2 such that

    y_1(2x_1 + x_2 + x_3) + y_2(x_1 + 2x_2 + x_4)
      = (2y_1 + y_2)x_1 + (y_1 + 2y_2)x_2 + y_1 x_3 + y_2 x_4 ≤ −x_1 − x_2

for all nonnegative x, or equivalently,

    2y_1 + y_2 ≤ −1
    y_1 + 2y_2 ≤ −1
    y_1 ≤ 0
    y_2 ≤ 0.

Naturally, to obtain the best possible bound, we would like to find y_1 and y_2 that achieve the maximum combination of the right-hand-side values:

    max 12y_1 + 9y_2.

This process results in a linear programming problem that is strongly related to the LP we are solving. We want to solve

    max  12y_1 + 9y_2
         2y_1 + y_2 ≤ −1
         y_1 + 2y_2 ≤ −1                                         (2.5)
         y_1 ≤ 0
         y_2 ≤ 0.

This problem is called the dual of the original problem we considered. The original LP in (2.2) is often called the primal problem. For a generic primal LP problem in standard form (2.2), the corresponding dual problem can be written as follows:

    (LD)  max_y  b^T y
          A^T y ≤ c,                                             (2.6)

where y ∈ R^m. Rewriting this problem with explicit dual slacks, we obtain the standard form dual linear programming problem:

    (LD)  max_{y,s}  b^T y
          A^T y + s = c                                          (2.7)
          s ≥ 0,

where s ∈ R^n. Next, we make some observations about the relationship between solutions of the primal and dual LPs. The objective value of any primal feasible solution is at least as large as the objective value of any dual feasible solution; this fact is known as the weak duality theorem:

Theorem 2.1 (Weak Duality Theorem) Let x be any feasible solution to the primal LP (2.2) and y be any feasible solution to the dual LP (2.7). Then, c^T x ≥ b^T y.

Proof: Since x ≥ 0 and c − A^T y = s ≥ 0, the inner product of these two vectors must be nonnegative:

    x^T s = s^T x = (c − A^T y)^T x = c^T x − y^T Ax = c^T x − y^T b ≥ 0.

The quantity x^T s = c^T x − y^T b is often called the duality gap. The following three results are immediate consequences of the weak duality theorem:

Corollary 2.1 If the primal LP is unbounded, then the dual LP must be infeasible.

Corollary 2.2 If the dual LP is unbounded, then the primal LP must be infeasible.

Corollary 2.3 If x is feasible for the primal LP, y is feasible for the dual LP, and c^T x = b^T y, then x must be optimal for the primal LP and y must be optimal for the dual LP.
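These relationships are easy to check numerically. The sketch below (my addition, not from the notes) solves the primal (2.4) and the dual (2.5) with SciPy and confirms that the optimal objective values coincide, as Corollary 2.3 and the strong duality theorem of the next section predict; it also checks the complementary slackness condition discussed there.

```python
# A minimal sketch (my illustration): solve the primal (2.4) and its
# dual (2.5) and check that the optimal values agree (no duality gap).
import numpy as np
from scipy.optimize import linprog

# Primal: min -x1 - x2  s.t.  2x1 + x2 + x3 = 12, x1 + 2x2 + x4 = 9, x >= 0.
c = np.array([-1.0, -1.0, 0.0, 0.0])
A = np.array([[2.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([12.0, 9.0])
primal = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))

# Dual: max 12y1 + 9y2  s.t.  A^T y <= c, y free; linprog minimizes, so negate.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=(None, None))

print(primal.x)                        # (5, 2, 0, 0)
print(primal.fun, -dual.fun)           # both equal -7: no duality gap
print(primal.x * (c - A.T @ dual.x))   # complementary slackness: all zeros
```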

2.3 Optimality Conditions

The last corollary of the previous section identified a sufficient condition for optimality of a primal-dual pair of feasible solutions, namely that their objective values coincide. One natural question to ask is whether this is a necessary condition. The answer is yes, as we illustrate next.

Theorem 2.2 (Strong Duality Theorem) If both the primal LP problem and the dual LP have feasible solutions, then they both have optimal solutions, and for any primal optimal solution x and dual optimal solution y we have that c^T x = b^T y.

We will omit the (elementary) proof of this theorem since it requires some additional tools. The reader can find a proof of this result in most standard linear programming textbooks. The strong duality theorem provides us with conditions to identify optimal solutions (called optimality conditions): x ∈ R^n is an optimal solution of (2.2) if and only if

1. x is primal feasible: Ax = b, x ≥ 0, and there exist y ∈ R^m and s ∈ R^n such that
2. (y, s) is dual feasible: A^T y + s = c, s ≥ 0, and
3. there is no duality gap: c^T x = b^T y.

Further analyzing the last condition above, we can obtain an alternative set of optimality conditions. Recall from the proof of the weak duality theorem that

    c^T x − b^T y = (c − A^T y)^T x ≥ 0

for any feasible primal-dual pair of solutions, since it is given as an inner product of two nonnegative vectors. This inner product is 0 (that is, c^T x = b^T y) if and only if the following statement holds: for each i = 1, ..., n, either x_i or (c − A^T y)_i = s_i is zero. This equivalence is easy to see. All the terms in the summation

    0 = (c − A^T y)^T x = Σ_{i=1}^n (c − A^T y)_i x_i

are nonnegative; since the sum is zero, each term must be zero. Thus we have found an alternative set of optimality conditions: x ∈ R^n is an optimal solution of (2.2) if and only if

1. x is primal feasible: Ax = b, x ≥ 0, and there exist y ∈ R^m and s ∈ R^n such that
2. (y, s) is dual feasible: A^T y + s = c, s ≥ 0, and
3. complementary slackness: for each i = 1, ..., n we have x_i s_i = 0.

The best known (and most successful) methods for solving LPs are interior-point methods and the simplex method. We discuss the latter here and postpone our discussion of interior-point methods until we study quadratic programming problems.

2.4 The Simplex Method

To motivate our discussion of the simplex method, we consider the following example from the book Introduction to Mathematical Programming by R. Walker:

Example 2.1 Farmer Jones has 100 acres of land to devote to wheat and corn and wishes to plan his planting to maximize the expected revenue. Jones has only $800 in capital to apply to planting the crops, and it costs $5 to plant an acre of wheat and $10 for an acre of corn. Their busy social schedule leaves the Jones family only 150 days of labor to devote to the crops. Two days will be required for each acre of wheat and one day for an acre of corn. If past experience indicates a return of $80 from each acre of wheat and $60 for each acre of corn, how many acres of each should be planted to maximize his revenue?

Letting the variables x_1 and x_2 denote the number of acres used for wheat and corn respectively, we obtain the following formulation of Farmer Jones's problem:

    Max  Z = 80x_1 + 60x_2
    subject to:
         x_1 + x_2 ≤ 100
         2x_1 + x_2 ≤ 150
         5x_1 + 10x_2 ≤ 800
         x_1, x_2 ≥ 0.

After we add slack variables to each of the functional constraints, we obtain a representation of the problem in the standard form, suitable for the simplex method¹:

    Max  Z = 80x_1 + 60x_2
    subject to:
         x_1 + x_2 + x_3 = 100
         2x_1 + x_2 + x_4 = 150
         5x_1 + 10x_2 + x_5 = 800
         x_1, x_2, x_3, x_4, x_5 ≥ 0.

¹ This representation is not exactly in the standard form since the objective is maximization rather than minimization. However, any maximization problem can be transformed into a minimization problem by multiplying the objective function by −1. Here, we avoid such a transformation to leave the objective function in its natural form; it should be straightforward to adapt the steps of the algorithm in the following discussion to address minimization problems.
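As a quick check on the example (my addition, not in the notes), the Farmer Jones LP can be handed to a solver directly; linprog minimizes, so we negate the revenue coefficients.

```python
# A minimal sketch (my illustration): Farmer Jones's problem solved
# numerically. SciPy's linprog minimizes, so we maximize revenue by
# minimizing its negative.
import numpy as np
from scipy.optimize import linprog

c = np.array([-80.0, -60.0])        # negated revenue per acre (wheat, corn)
A_ub = np.array([[1.0, 1.0],        # land:     x1 +   x2 <= 100
                 [2.0, 1.0],        # labor:   2x1 +   x2 <= 150
                 [5.0, 10.0]])      # capital: 5x1 + 10x2 <= 800
b_ub = np.array([100.0, 150.0, 800.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
print(res.x, -res.fun)              # 50 acres of each crop, revenue $7000
```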

2.4.1 Basic Solutions

Let us consider a general LP problem in the following form:

    max  cx
         Ax ≤ b
         x ≥ 0,

where A is an m × n matrix with full row rank, b is an m-dimensional column vector, and c is an n-dimensional row vector. The n-dimensional column vector x represents the variables of the problem. (In the Farmer Jones example we have m = 3 and n = 2.) Here is how we can represent these vectors and matrices:

    A = [a_ij],  i = 1, ..., m, j = 1, ..., n,
    b = (b_1, b_2, ..., b_m)^T,  c = (c_1, c_2, ..., c_n),
    x = (x_1, x_2, ..., x_n)^T,  0 = (0, ..., 0)^T.

Next, we add slack variables to each of the functional constraints to get the augmented form of the problem. Let x_s denote the vector of slack variables,

    x_s = (x_{n+1}, x_{n+2}, ..., x_{n+m})^T,

and let I denote the m × m identity matrix. Now, the constraints in the augmented form can be written as

    [A, I] [x; x_s] = b,   [x; x_s] ≥ 0.                         (2.8)

To find basic solutions we consider partitions of the augmented matrix [A, I]:

    [A, I] = [B, N],

where B is an m × m square matrix that consists of linearly independent columns of [A, I]. If we partition the variable vector [x; x_s] in the same way,

    [x; x_s] = [x_B; x_N],

we can rewrite the equality constraints in (2.8) as

    [B, N] [x_B; x_N] = B x_B + N x_N = b,

or, multiplying both sides by B^{-1} from the left,

    x_B + B^{-1} N x_N = B^{-1} b.

So the following systems of equations are equivalent; any solution to the first will be a solution for the next two, and vice versa:

    [A, I] [x; x_s] = b,
    B x_B + N x_N = b,
    x_B + B^{-1} N x_N = B^{-1} b.

Indeed, the linear systems in the second and third forms are just re-representations of the first one with respect to a fixed matrix B. An obvious solution to the last system (and therefore, to the other two) is x_N = 0, x_B = B^{-1} b. In fact, for any fixed values of the components of x_N we can obtain a solution by simply setting

    x_B = B^{-1} b − B^{-1} N x_N.

The reader may want to think of x_N as the independent variables that we can choose freely; once they are chosen, the dependent variables x_B are determined uniquely. We call a solution of the systems above a basic solution if it is of the form x_N = 0, x_B = B^{-1} b for some basis matrix B. If, in addition, x_B = B^{-1} b ≥ 0, the solution x_B = B^{-1} b, x_N = 0 is a basic feasible solution of the LP problem above. The variables x_B are called the basic variables, while x_N are the nonbasic variables.

The objective function Z = cx can be represented similarly using the basis partition. Let c = [c_B, c_N] be the corresponding partition of the objective vector. Now, we have the following sequence of equivalent representations of the objective function equation:

    Z = cx
    Z − cx = 0
    Z − [c_B, c_N] [x_B; x_N] = 0
    Z − c_B x_B − c_N x_N = 0                                    (2.9)
    Z − c_B (B^{-1} b − B^{-1} N x_N) − c_N x_N = 0
    Z − (c_N − c_B B^{-1} N) x_N = c_B B^{-1} b.

Note that the last equation in the list above does not contain the basic variables, which is exactly what we want in order to be able to figure out the net effect of changing a nonbasic variable on the objective function.

A key observation is that when a linear programming problem has an optimal solution, it must have an optimal basic feasible solution. The significance of this result lies in the fact that when we are looking for a solution of a linear programming problem, what we really need to check is the objective value of each basic solution. There are only finitely many of them, so this reduces our search space from an infinite space to a finite one.
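The sketch below (my addition, not from the notes) computes one basic solution of the augmented Farmer Jones system: choosing the columns of x_1, x_2, x_5 as the basis B gives x_B = B^{-1} b, which happens to be nonnegative and is therefore a basic feasible solution; it is in fact the optimal vertex found earlier.

```python
# A minimal sketch (my illustration): computing a basic solution
# x_B = B^{-1} b of the augmented Farmer Jones system [A, I][x; x_s] = b.
import numpy as np

# Columns of [A, I] correspond to x1, x2, x3, x4, x5.
AI = np.array([[1.0,  1.0, 1.0, 0.0, 0.0],
               [2.0,  1.0, 0.0, 1.0, 0.0],
               [5.0, 10.0, 0.0, 0.0, 1.0]])
b = np.array([100.0, 150.0, 800.0])

basic = [0, 1, 4]                   # choose x1, x2, x5 as the basic variables
B = AI[:, basic]
xB = np.linalg.solve(B, b)          # x_B = B^{-1} b = (50, 50, 50)
print(xB)                           # nonnegative, hence a basic feasible solution

x = np.zeros(5)
x[basic] = xB
print(80 * x[0] + 60 * x[1])        # objective Z = 7000 at this vertex
```

Enumerating all such bases and comparing objective values is exactly the finite search described in the last paragraph; the simplex method, developed next, organizes this search efficiently.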