Finitely Additive Dynamic Programming and Stochastic Games. Bill Sudderth University of Minnesota
|
|
|
- Adam Ryan
- 10 years ago
- Views:
Transcription
1 Finitely Additive Dynamic Programming and Stochastic Games Bill Sudderth University of Minnesota 1
2 Discounted Dynamic Programming Five ingredients: S, A, r, q, β. S - state space A - set of actions q( s, a) - law of motion r(s, a) - daily reward function (bounded, real-valued) β [0, 1) - discount factor 2
3 Play of the game You begin at some state s 1 S, select an action a 1 A, and receive a reward r(s 1, a 1 ). You then move to a new state s 2 with distribution q( s 1, a 1 ), select a 2 A, and receive β r(s 2, a 2 ). Then you move to s 3 with distribution q( s 2, a 2 ), select a 3 A, receive β 2 r(s 3, a 3 ). And so on. Your total reward is the expected value of n=1 β n 1 r(s n, a n ). 3
4 Strategies and Rewards A strategy σ selects each action a n, possibly at random, as a function of the history (s 1, a 1,..., a n 1, s n ). The reward from σ at the initial state s 1 = s is V (σ)(s) = E σ,s [ n=1 β n 1 r(s n, a n )]. Given s 1 = s and a 1 = a, the conditional strategy σ[s, a] is just the continuation of σ and V (σ)(s) = [r(s, a) + β V (σ[s, a])(t) q(dt s, a)]σ(s)(da). 4
5 The Optimal Reward and the Bellman Equation The optimal reward at s is V (s) = sup σ V (σ)(s). The Bellman Equation for V is V (s) = sup[r(s, a) + β a V (t) q(dt s, a)]. I will sketch the proof for S and A countable. 5
6 Proof of : For every strategy σ and s S, V (σ)(s) = [r(s, a) + β sup[r(s, a ) + β a sup[r(s, a ) + β a V (σ[s, a])(t) q(dt s, a)]σ(s)(da) V (σ[s, a ])(t) q(dt s, a )] V (t) q(dt s, a )]. Now take the sup over σ. 6
7 Proof of : Fix ɛ > 0. For every state t S, select a strategy σ t such that V (σ t )(t) V (t) ɛ/2. Fix a state s and choose an action a such that r(s, a)+β V (t) q(dt s, a) sup[r(s, a ) + β V (t) q(dt s, a )] ɛ/2. a Define the strategy σ at s 1 = s to have first action a and conditional strategies σ[s, a](t) = σ t. Then V (s) V (σ)(s) =r(s, a) + β sup[r(s, a ) + β a V (σ t )(t) q(dt s, a) V (t) q(dt s, a )] ɛ. 7
8 Measurable Dynamic Programming The first formulation of dynamic programming in a general measure theoretic setting was given by Blackwell (1965). He assumed: 1. S and A are Borel subsets of a Polish space (say, a Euclidean space). 2. The reward function r(s, a) is Borel measurable. 3. The law of motion q( s, a) is a regular conditional distribution. Strategies are required to select actions in a Borel measurable way. 8
9 Measurability Problems In his 1965 paper, Blackwell showed by example that for a Borel measurable dynamic programming problem: The optimal reward function V ( ) need not be Borel measurable and good Borel measurable strategies need not exist. This led to nontrivial work by a number of mathematicians including R. Strauch, D. Freedman, M. Orkin, D. Bertsekas, S. Shreve, and Blackwell himself. It follows from their work that for a Borel problem: The optimal reward function V ( ) is universally measurable and that there do exist good universally measurable strategies. 9
10 The Bellman Equation Again The equation still holds, but a proof requires a lot of measure theory. See, for example, chapter 7 of Bertsekas and Shreve (1978) - about 85 pages. Some additional results are needed to measurably select the σ t in the proof of. See Feinberg (1996). The proof works exactly as given in a finitely additive setting, and it works for general sets S and A. 10
11 Finitely Additive Probability Let γ be a finitely additive probability defined on a sigma-field of subsets of some set F. The integral φ dγ of a simple function is defined in the usual way. The integral ψ dγ of a bounded, measurable function ψ is defined by squeezing with simple functions. If γ is defined on the sigma-field F of all subsets of F, it is called a gamble and ψ dγ is defined for all bounded, real-valued functions ψ. 11
12 Finitely Additive Stochastic Processes Let G(F ) be the set of all gambles on F. A Dubins-Savage strategy (DS-strategy for brevity) γ is a sequence γ 1, γ 2,... such that γ 1 G(F ) and for n 2, γ n is a mapping from F n 1 to G(F ). Every DS-strategy γ naturally determines a finitely additive probability P γ on the product sigma-field F N. P γ is regarded as the distribution of a random sequence f 1, f 2,..., f n,.... Here f 1 has distribution γ 1 and, given f 1, f 2,..., f n 1, the conditional distribution of f n is γ n (f 1, f 2,..., f n 1 ). 12
13 Finitely Additive Dynamic Programming For each (s, a), q( s, a) is a gamble on S. A strategy σ chooses actions using gambles on A. Each σ together with q and an initial state s 1 = s determines a DS-strategy γ = γ(s, σ) on (A S) N. For D A S, γ 1 (D) = and γ n 1 (a 1, s 2,...,a n 1, s n )(D) = Let q(d a s, a) σ 1 (da) q(d a s n, a) σ(s 1, a 1, s 2,..., a n 1, s n )(da). P σ,s = P γ. 13
14 Rewards and the Bellman Equation For any bounded, real-valued reward function r, the reward for a strategy σ is well-defined by the same formula as before: V (σ)(s) = E σ,s [ n=1 β n 1 r(s n, a n )]. Also as before, the optimal reward function is V (s) = sup σ The Bellman equation V (σ)(s). V (s) = sup[r(s, a) + β V (t) q(dt s, a)]. a can be proved exactly as in the discrete case. 14
15 Blackwell Operators Let B be the Banach space of bounded functions x : S R equipped with the supremum norm. For each function f : S A, define the operator T f for elements x B by (T f x)(s) = r(s, f(s)) + β Also define the operator T by (T x)(s) = sup[r(s, a) + β a x(s ) q(ds s, f(s)). x(s ) q(ds s, a)]. This definition of T makes sense in the finitely additive case, and in the countably additive case when S is countable. There is trouble in the general measurable case. 15
16 Fixed Points The operators T f and T are β-contractions. Banach, they have unique fixed points. By a theorem of The fixed point of T is the optimal reward function V. equality The V (s) = (T V )(s) is just the Bellman equation V (s) = sup[r(s, a) + β a V (t) q(dt s, a)]. 16
17 Stationary Strategies A strategy σ is stationary if there is a function f : S A such that σ(s 1, a 1,..., a n 1, s n ) = f(s n ) for all (s 1, a 1,..., a n 1, s n ). Notation: σ = f. The fixed point of T f is the reward function V (σ)( ) for the stationary strategy σ = f. V (σ)(s) = r(s, f(s)) + β V (σ)(t) q(dt s, f(s)) = (T f V (σ))(s) Fundamental Question: Do optimal or nearly optimal stationary strategies exist? 17
18 Existence of Good Stationary Strategies Fix ɛ > 0. For each s, choose f(s) such that (T f V )(s) V (s) ɛ(1 β). Let σ = f. An easy induction shows that (Tf n V )(s) V (s) ɛ, for all s and n. But, by Banach s Theorem, (Tf n V )(s) V (σ)(s). So the stationary plan σ is ɛ - optimal. 18
19 The Measurable Case: Trouble for T T does not preserve Borel measurability. T does not preserve universal measurability. T does preserve upper semianalytic functions, but these do not form a Banach space. Good universally measurable stationary strategies do exist, but the proof is more complicated. 19
20 Finitely Additive Extensions of Measurable Problems Every probability measure on an algebra of subsets of a set F can be extended to a gamble on F, that is, a finitely additive probability defined on all subsets of F. (The extension is typically not unique.) Thus a measurable, discounted problem S, A, r, q, β can be extended to a finitely additive problem S, A, r, ˆq, β where ˆq( s, a) is a gamble on S that extends q( s, a) for every s, a. Questions: Is the optimal reward the same for both problems? Can a player do better by using non-measurable strategies? 20
21 Reward Functions for Measurable and for Finitely Additive Plans For a measurable strategy σ, the reward V M (σ)(s) = E σ,s [ n=1 β n 1 r(s n, a n )] is the expectation under the countably additive probability P σ,s. Each measurable σ can be extended to a finitely additive strategy ˆσ with reward V (ˆσ)(s) = Eˆσ,s [ n=1 β n 1 r(s n, a n )] calculated under the finitely additive probability Pˆσ,s. Fact: V M (σ)(s) = V (ˆσ)(s). 21
22 Optimal Rewards For a measurable problem, let V M (s) = sup V M(σ)(s), where the sup is over all measurable strategies σ, and let V (s) = sup V (σ)(s), where the sup is over all plans σ in some finitely additive extension. 22
23 Theorem: V M (s) = V (s). Proof: The Bellman equation is known to hold in the measurable theory: V M In other terms (s) = sup a [r(s, a) + β V M (s) = (T V M )(s). But V is the unique fixed point of T. VM (t) q(dt s, a)]. 23
24 Two-Person Zero-Sum Discounted Stochastic Game (Finitely Additive) Six ingredients: S, A, B, r, q, β. S - state space A, B - action sets for players 1 and 2 q( s, a, b) - law of motion (finitely additive) r(s, a, b) - daily payoff from player 2 to player 1(bounded realvalued) β [0, 1) - discount factor 24
25 Play of the game Play begins at some state s 1 S, player 1 chooses a 1 A using a gamble on A, player 2 chooses b 1 B using a gamble on B, and player 2 pays player 1 the amount r(s 1, a 1, b 1 ). The next state s 2 has distribution q( s 1, a 1, b 1 ) and play continues. The total payoff from player 2 to player 1 is the expected value of n=1 β n 1 r(s n, a n, b n ). 25
26 Strategies and Payoffs A strategy σ for player 1 (τ for player 2) selects each action a n (respectively b n ) with a gamble that is a function of (s 1, a 1, b 1,..., a n 1, b n 1, s n ). The pair σ, τ together with an initial state s 1 = s and the law of motion q determine a DS- strategy γ = γ(s, σ, τ). The expected payoff from player 2 to player 1 at the initial state s 1 = s is defined as V (σ, τ)(s) =E γ [ =E σ,τ,s [ n=1 β n 1 r(s n, a n, b n )] n=1 β n 1 r(s n, a n, b n )]. This expectation is well-defined in the Dubins-Savage framework. 26
27 Theorem: The discounted stochastic game has a value function V ; that is, for all s, V (s) = inf τ sup σ V (σ, τ)(s) = sup σ inf τ V (σ, τ)(s). The value function V satisfies the Shapley equation: V (s) = inf ν sup µ [r(s, a, b) + β V (t) q(dt s, a, b)]µ(da)ν(db). The infimum (respectively supremum) is over gambles on B (respectively A). Also, both players have optimal stationary strategies. The proof is essentially the same as that of Shapley (1953). 27
28 Finitely Additive One-Shot Games. Let φ : A B R be a bounded function. For gambles µ and ν on A and B respectively, let E µ,ν φ = φ(a, b) µ(da)ν(db). Warning: The order of integration matters. Lemma: The one-shot game with action sets A and B and payoff φ has a value in the sense that inf ν sup µ E µ,νφ = sup inf µ ν E µ,νφ. Also both players have optimal strategies. This follows from Ky Fan (1953). 28
29 Shapley Operator Recall that B is the Banach space of bounded x : S R. x B, s S, let G(s, x) be the one-shot game with payoff φ s,x (a, b) = r(s, a, b) + β x(t) q(dt s, a, b) For and let T x(s) be the value of G(s, x). So T x(s) = inf ν sup µ [r(s, a, b) + β x(t) q(dt s, a, b)]µ(da)ν(db) The operator T is a β-contraction with a unique fixed point x. For each s, let f(s) and g(s) be gambles on A and B that are optimal for players 1 and 2 in G(s, x ). Define σ and τ to be the stationary strategies f and g. 29
30 Proof of Theorem. An induction shows that for all s S, strategies τ for player 2, and positive integers n: n E σ,τ,s[ k=1 Let n to get that for all s and τ β k 1 r(s k, a k, b k ) + β n x (s n )] x (s). V (σ, τ)(s) x (s). Similarly, for all s and all strategies σ for player 1 V (σ, τ )(s) x (s). Therefore x is the value function for the stochastic game; σ and τ are optimal stationary strategies. 30
31 Measurable Discounted Stochastic Game S - a Borel subset of a Polish space A, B - compact metric spaces r(s, a, b) - Borel measurable and separately continuous in a and b for every fixed s q - a regular conditional distribution with q(d s, a, b) separately continuous in a and b for every fixed s and every Borel subset D of S β [0, 1) - discount factor Players are required to use measurable strategies. 31
32 Theorem: A measurable discounted stochastic game has a value function VM that is Borel measurable. The players have optimal, Borel measurable stationary strategies. See Mertens, Sorin, and Zamir (2014). Assume that q(d s, a, ), a A is an equicontinuous family for each fixed s, a and Borel subset D of B. For each triple s, a, b let q( s, a, b) be a finitely additive extension of q( s, a, b) to all subsets of S. Let V be the value function for the corresponding finitely additive game. Theorem: V M = V. This follows from Maitra and S (1993). 32
33 References D. Blackwell (1965).Discounted dynamic programming. Ann. Math. Statist D. Bertsekas and S. Shreve (1978). Stochastic Optimal Control: The Discrete Time Case. Academic Press. L. Dubins (1974). On Lebesgue-like extensions of finitely additive measures. Ann. Prob L. E. Dubins and L. J. Savage (1965). How to Gamble If You Must: Inequalities for Stochastic Processes. McGraw-Hill. E. Feinberg (1996). On measurability and representation of strategic measures in Markov decision theory. Statistics, Probability, and Game Theory: Papers in Honor of David Blackwell, editors T. S. Ferguson, L.S. Shapley, J. B. MacQueen. IMS Lecture Notes-Monograph Series
34 Ky Fan (1953). Minimax theorems. Proc. Natl. Acad. Sci A. Maitra and W. Sudderth (1993). Finitely additive and measurable stochastic games. Int. J. Game Theory A. Maitra and W. Sudderth (1998). Finitely additive stochastic games with Borel measurable payoffs. Int. J. Game Theory J-F Mertens, S. Sorin, and S. Zamir (2014). Repeated Games. Cambridge Press. R. Purves and W. Sudderth (1976). Some finitely additive probability theory. Ann. Prob L. S. Shapley (1953). Stochastic games. Proc. Natl. Acad. Sci
Goal Problems in Gambling and Game Theory. Bill Sudderth. School of Statistics University of Minnesota
Goal Problems in Gambling and Game Theory Bill Sudderth School of Statistics University of Minnesota 1 Three problems Maximizing the probability of reaching a goal. Maximizing the probability of reaching
1 Norms and Vector Spaces
008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)
No: 10 04. Bilkent University. Monotonic Extension. Farhad Husseinov. Discussion Papers. Department of Economics
No: 10 04 Bilkent University Monotonic Extension Farhad Husseinov Discussion Papers Department of Economics The Discussion Papers of the Department of Economics are intended to make the initial results
INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS
INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem
Gambling Systems and Multiplication-Invariant Measures
Gambling Systems and Multiplication-Invariant Measures by Jeffrey S. Rosenthal* and Peter O. Schwartz** (May 28, 997.. Introduction. This short paper describes a surprising connection between two previously
6.2 Permutations continued
6.2 Permutations continued Theorem A permutation on a finite set A is either a cycle or can be expressed as a product (composition of disjoint cycles. Proof is by (strong induction on the number, r, of
ALMOST COMMON PRIORS 1. INTRODUCTION
ALMOST COMMON PRIORS ZIV HELLMAN ABSTRACT. What happens when priors are not common? We introduce a measure for how far a type space is from having a common prior, which we term prior distance. If a type
LECTURE 15: AMERICAN OPTIONS
LECTURE 15: AMERICAN OPTIONS 1. Introduction All of the options that we have considered thus far have been of the European variety: exercise is permitted only at the termination of the contract. These
MEASURE AND INTEGRATION. Dietmar A. Salamon ETH Zürich
MEASURE AND INTEGRATION Dietmar A. Salamon ETH Zürich 12 May 2016 ii Preface This book is based on notes for the lecture course Measure and Integration held at ETH Zürich in the spring semester 2014. Prerequisites
Markov random fields and Gibbs measures
Chapter Markov random fields and Gibbs measures 1. Conditional independence Suppose X i is a random element of (X i, B i ), for i = 1, 2, 3, with all X i defined on the same probability space (.F, P).
CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e.
CHAPTER II THE LIMIT OF A SEQUENCE OF NUMBERS DEFINITION OF THE NUMBER e. This chapter contains the beginnings of the most important, and probably the most subtle, notion in mathematical analysis, i.e.,
CHAPTER IV - BROWNIAN MOTION
CHAPTER IV - BROWNIAN MOTION JOSEPH G. CONLON 1. Construction of Brownian Motion There are two ways in which the idea of a Markov chain on a discrete state space can be generalized: (1) The discrete time
Lecture 6 Black-Scholes PDE
Lecture 6 Black-Scholes PDE Lecture Notes by Andrzej Palczewski Computational Finance p. 1 Pricing function Let the dynamics of underlining S t be given in the risk-neutral measure Q by If the contingent
F. ABTAHI and M. ZARRIN. (Communicated by J. Goldstein)
Journal of Algerian Mathematical Society Vol. 1, pp. 1 6 1 CONCERNING THE l p -CONJECTURE FOR DISCRETE SEMIGROUPS F. ABTAHI and M. ZARRIN (Communicated by J. Goldstein) Abstract. For 2 < p
BANACH AND HILBERT SPACE REVIEW
BANACH AND HILBET SPACE EVIEW CHISTOPHE HEIL These notes will briefly review some basic concepts related to the theory of Banach and Hilbert spaces. We are not trying to give a complete development, but
Metric Spaces. Chapter 7. 7.1. Metrics
Chapter 7 Metric Spaces A metric space is a set X that has a notion of the distance d(x, y) between every pair of points x, y X. The purpose of this chapter is to introduce metric spaces and give some
Mathematical Methods of Engineering Analysis
Mathematical Methods of Engineering Analysis Erhan Çinlar Robert J. Vanderbei February 2, 2000 Contents Sets and Functions 1 1 Sets................................... 1 Subsets.............................
CONTINUOUS REINHARDT DOMAINS PROBLEMS ON PARTIAL JB*-TRIPLES
CONTINUOUS REINHARDT DOMAINS PROBLEMS ON PARTIAL JB*-TRIPLES László STACHÓ Bolyai Institute Szeged, Hungary [email protected] www.math.u-szeged.hu/ Stacho 13/11/2008, Granada László STACHÓ () CONTINUOUS
A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails
12th International Congress on Insurance: Mathematics and Economics July 16-18, 2008 A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails XUEMIAO HAO (Based on a joint
TD(0) Leads to Better Policies than Approximate Value Iteration
TD(0) Leads to Better Policies than Approximate Value Iteration Benjamin Van Roy Management Science and Engineering and Electrical Engineering Stanford University Stanford, CA 94305 [email protected] Abstract
THOMAS S. FERGUSON, University of California, Los Angeles C. MELOLIDAKIS, Technical University of Crete
LAST ROUND BETTING THOMAS S. FERGUSON, University of California, Los Angeles C. MELOLIDAKIS, Technical University of Crete Abstract Two players with differing amounts of money simultaneously choose an
and s n (x) f(x) for all x and s.t. s n is measurable if f is. REAL ANALYSIS Measures. A (positive) measure on a measurable space
RAL ANALYSIS A survey of MA 641-643, UAB 1999-2000 M. Griesemer Throughout these notes m denotes Lebesgue measure. 1. Abstract Integration σ-algebras. A σ-algebra in X is a non-empty collection of subsets
Trimming a Tree and the Two-Sided Skorohod Reflection
ALEA, Lat. Am. J. Probab. Math. Stat. 12 (2), 811 834 (2015) Trimming a Tree and the Two-Sided Skorohod Reflection Emmanuel Schertzer UPMC Univ. Paris 06, Laboratoire de Probabilités et Modèles Aléatoires,
9 More on differentiation
Tel Aviv University, 2013 Measure and category 75 9 More on differentiation 9a Finite Taylor expansion............... 75 9b Continuous and nowhere differentiable..... 78 9c Differentiable and nowhere monotone......
About the inverse football pool problem for 9 games 1
Seventh International Workshop on Optimal Codes and Related Topics September 6-1, 013, Albena, Bulgaria pp. 15-133 About the inverse football pool problem for 9 games 1 Emil Kolev Tsonka Baicheva Institute
Reinforcement Learning
Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Martin Lauer AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg [email protected]
Metric Spaces. Lecture Notes and Exercises, Fall 2015. M.van den Berg
Metric Spaces Lecture Notes and Exercises, Fall 2015 M.van den Berg School of Mathematics University of Bristol BS8 1TW Bristol, UK [email protected] 1 Definition of a metric space. Let X be a set,
Universal Algorithm for Trading in Stock Market Based on the Method of Calibration
Universal Algorithm for Trading in Stock Market Based on the Method of Calibration Vladimir V yugin Institute for Information Transmission Problems, Russian Academy of Sciences, Bol shoi Karetnyi per.
Numerical methods for American options
Lecture 9 Numerical methods for American options Lecture Notes by Andrzej Palczewski Computational Finance p. 1 American options The holder of an American option has the right to exercise it at any moment
GAME THEORY. Thomas S. Ferguson University of California at Los Angeles INTRODUCTION.
GAME THEORY Thomas S. Ferguson University of California at Los Angeles INTRODUCTION. Game theory is a fascinating subject. We all know many entertaining games, such as chess, poker, tic-tac-toe, bridge,
Let H and J be as in the above lemma. The result of the lemma shows that the integral
Let and be as in the above lemma. The result of the lemma shows that the integral ( f(x, y)dy) dx is well defined; we denote it by f(x, y)dydx. By symmetry, also the integral ( f(x, y)dx) dy is well defined;
Undergraduate Notes in Mathematics. Arkansas Tech University Department of Mathematics
Undergraduate Notes in Mathematics Arkansas Tech University Department of Mathematics An Introductory Single Variable Real Analysis: A Learning Approach through Problem Solving Marcel B. Finan c All Rights
The Ergodic Theorem and randomness
The Ergodic Theorem and randomness Peter Gács Department of Computer Science Boston University March 19, 2008 Peter Gács (Boston University) Ergodic theorem March 19, 2008 1 / 27 Introduction Introduction
i/io as compared to 4/10 for players 2 and 3. A "correct" way to play Certainly many interesting games are not solvable by this definition.
496 MATHEMATICS: D. GALE PROC. N. A. S. A THEORY OF N-PERSON GAMES WITH PERFECT INFORMA TION* By DAVID GALE BROWN UNIVBRSITY Communicated by J. von Neumann, April 17, 1953 1. Introduction.-The theory of
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES Contents 1. Random variables and measurable functions 2. Cumulative distribution functions 3. Discrete
Notes from Week 1: Algorithms for sequential prediction
CS 683 Learning, Games, and Electronic Markets Spring 2007 Notes from Week 1: Algorithms for sequential prediction Instructor: Robert Kleinberg 22-26 Jan 2007 1 Introduction In this course we will be looking
Reinforcement Learning
Reinforcement Learning LU 2 - Markov Decision Problems and Dynamic Programming Dr. Joschka Bödecker AG Maschinelles Lernen und Natürlichsprachliche Systeme Albert-Ludwigs-Universität Freiburg [email protected]
Introduction to Topology
Introduction to Topology Tomoo Matsumura November 30, 2010 Contents 1 Topological spaces 3 1.1 Basis of a Topology......................................... 3 1.2 Comparing Topologies.......................................
ON GENERALIZED RELATIVE COMMUTATIVITY DEGREE OF A FINITE GROUP. A. K. Das and R. K. Nath
International Electronic Journal of Algebra Volume 7 (2010) 140-151 ON GENERALIZED RELATIVE COMMUTATIVITY DEGREE OF A FINITE GROUP A. K. Das and R. K. Nath Received: 12 October 2009; Revised: 15 December
Separation Properties for Locally Convex Cones
Journal of Convex Analysis Volume 9 (2002), No. 1, 301 307 Separation Properties for Locally Convex Cones Walter Roth Department of Mathematics, Universiti Brunei Darussalam, Gadong BE1410, Brunei Darussalam
arxiv:1112.0829v1 [math.pr] 5 Dec 2011
How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly
ON COMPLETELY CONTINUOUS INTEGRATION OPERATORS OF A VECTOR MEASURE. 1. Introduction
ON COMPLETELY CONTINUOUS INTEGRATION OPERATORS OF A VECTOR MEASURE J.M. CALABUIG, J. RODRÍGUEZ, AND E.A. SÁNCHEZ-PÉREZ Abstract. Let m be a vector measure taking values in a Banach space X. We prove that
FUNCTIONAL ANALYSIS LECTURE NOTES: QUOTIENT SPACES
FUNCTIONAL ANALYSIS LECTURE NOTES: QUOTIENT SPACES CHRISTOPHER HEIL 1. Cosets and the Quotient Space Any vector space is an abelian group under the operation of vector addition. So, if you are have studied
1. Prove that the empty set is a subset of every set.
1. Prove that the empty set is a subset of every set. Basic Topology Written by Men-Gen Tsai email: [email protected] Proof: For any element x of the empty set, x is also an element of every set since
Lecture Notes on Measure Theory and Functional Analysis
Lecture Notes on Measure Theory and Functional Analysis P. Cannarsa & T. D Aprile Dipartimento di Matematica Università di Roma Tor Vergata [email protected] [email protected] aa 2006/07 Contents
Notes on metric spaces
Notes on metric spaces 1 Introduction The purpose of these notes is to quickly review some of the basic concepts from Real Analysis, Metric Spaces and some related results that will be used in this course.
6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation
6.207/14.15: Networks Lecture 15: Repeated Games and Cooperation Daron Acemoglu and Asu Ozdaglar MIT November 2, 2009 1 Introduction Outline The problem of cooperation Finitely-repeated prisoner s dilemma
Fuzzy Differential Systems and the New Concept of Stability
Nonlinear Dynamics and Systems Theory, 1(2) (2001) 111 119 Fuzzy Differential Systems and the New Concept of Stability V. Lakshmikantham 1 and S. Leela 2 1 Department of Mathematical Sciences, Florida
Mathematical Finance
Mathematical Finance Option Pricing under the Risk-Neutral Measure Cory Barnes Department of Mathematics University of Washington June 11, 2013 Outline 1 Probability Background 2 Black Scholes for European
Stationary random graphs on Z with prescribed iid degrees and finite mean connections
Stationary random graphs on Z with prescribed iid degrees and finite mean connections Maria Deijfen Johan Jonasson February 2006 Abstract Let F be a probability distribution with support on the non-negative
Chapter 4 Lecture Notes
Chapter 4 Lecture Notes Random Variables October 27, 2015 1 Section 4.1 Random Variables A random variable is typically a real-valued function defined on the sample space of some experiment. For instance,
SYMMETRIC FORM OF THE VON NEUMANN POKER MODEL. Guido David 1, Pearl Anne Po 2
International Journal of Pure and Applied Mathematics Volume 99 No. 2 2015, 145-151 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu doi: http://dx.doi.org/10.12732/ijpam.v99i2.2
Power-law Price-impact Models and Stock Pinning near Option Expiration Dates. Marco Avellaneda Gennady Kasyan Michael D. Lipkin
Power-law Price-impact Models and Stock Pinning near Option Expiration Dates Marco Avellaneda Gennady Kasyan Michael D. Lipkin PDE in Finance, Stockholm, August 7 Summary Empirical evidence of stock pinning
n k=1 k=0 1/k! = e. Example 6.4. The series 1/k 2 converges in R. Indeed, if s n = n then k=1 1/k, then s 2n s n = 1 n + 1 +...
6 Series We call a normed space (X, ) a Banach space provided that every Cauchy sequence (x n ) in X converges. For example, R with the norm = is an example of Banach space. Now let (x n ) be a sequence
Columbia University in the City of New York New York, N.Y. 10027
Columbia University in the City of New York New York, N.Y. 10027 DEPARTMENT OF MATHEMATICS 508 Mathematics Building 2990 Broadway Fall Semester 2005 Professor Ioannis Karatzas W4061: MODERN ANALYSIS Description
INTEGRAL OPERATORS ON THE PRODUCT OF C(K) SPACES
INTEGRAL OPERATORS ON THE PRODUCT OF C(K) SPACES FERNANDO BOMBAL AND IGNACIO VILLANUEVA Abstract. We study and characterize the integral multilinear operators on a product of C(K) spaces in terms of the
Binomial lattice model for stock prices
Copyright c 2007 by Karl Sigman Binomial lattice model for stock prices Here we model the price of a stock in discrete time by a Markov chain of the recursive form S n+ S n Y n+, n 0, where the {Y i }
Metric Spaces Joseph Muscat 2003 (Last revised May 2009)
1 Distance J Muscat 1 Metric Spaces Joseph Muscat 2003 (Last revised May 2009) (A revised and expanded version of these notes are now published by Springer.) 1 Distance A metric space can be thought of
The Henstock-Kurzweil-Stieltjes type integral for real functions on a fractal subset of the real line
The Henstock-Kurzweil-Stieltjes type integral for real functions on a fractal subset of the real line D. Bongiorno, G. Corrao Dipartimento di Ingegneria lettrica, lettronica e delle Telecomunicazioni,
THE BANACH CONTRACTION PRINCIPLE. Contents
THE BANACH CONTRACTION PRINCIPLE ALEX PONIECKI Abstract. This paper will study contractions of metric spaces. To do this, we will mainly use tools from topology. We will give some examples of contractions,
Math 115 Spring 2011 Written Homework 5 Solutions
. Evaluate each series. a) 4 7 0... 55 Math 5 Spring 0 Written Homework 5 Solutions Solution: We note that the associated sequence, 4, 7, 0,..., 55 appears to be an arithmetic sequence. If the sequence
4: SINGLE-PERIOD MARKET MODELS
4: SINGLE-PERIOD MARKET MODELS Ben Goldys and Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2015 B. Goldys and M. Rutkowski (USydney) Slides 4: Single-Period Market
How to Gamble If You Must
How to Gamble If You Must Kyle Siegrist Department of Mathematical Sciences University of Alabama in Huntsville Abstract In red and black, a player bets, at even stakes, on a sequence of independent games
A domain of spacetime intervals in general relativity
A domain of spacetime intervals in general relativity Keye Martin Department of Mathematics Tulane University New Orleans, LA 70118 United States of America [email protected] Prakash Panangaden School
Adaptive Online Gradient Descent
Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650
ALGORITHMIC TRADING WITH MARKOV CHAINS
June 16, 2010 ALGORITHMIC TRADING WITH MARKOV CHAINS HENRIK HULT AND JONAS KIESSLING Abstract. An order book consists of a list of all buy and sell offers, represented by price and quantity, available
Lecture 7: Finding Lyapunov Functions 1
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.243j (Fall 2003): DYNAMICS OF NONLINEAR SYSTEMS by A. Megretski Lecture 7: Finding Lyapunov Functions 1
Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011
Basic Concepts of Point Set Topology Notes for OU course Math 4853 Spring 2011 A. Miller 1. Introduction. The definitions of metric space and topological space were developed in the early 1900 s, largely
The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1].
Probability Theory Probability Spaces and Events Consider a random experiment with several possible outcomes. For example, we might roll a pair of dice, flip a coin three times, or choose a random real
Two-Sided Matching Theory
Two-Sided Matching Theory M. Utku Ünver Boston College Introduction to the Theory of Two-Sided Matching To see which results are robust, we will look at some increasingly general models. Even before we
Probability Theory. Florian Herzog. A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T..
Probability Theory A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T.. Florian Herzog 2013 Probability space Probability space A probability space W is a unique triple W = {Ω, F,
s-convexity, model sets and their relation
s-convexity, model sets and their relation Zuzana Masáková Jiří Patera Edita Pelantová CRM-2639 November 1999 Department of Mathematics, Faculty of Nuclear Science and Physical Engineering, Czech Technical
Lecture notes on Ordinary Differential Equations. Plan of lectures
1 Lecture notes on Ordinary Differential Equations Annual Foundation School, IIT Kanpur, Dec.3-28, 2007. by Dept. of Mathematics, IIT Bombay, Mumbai-76. e-mail: [email protected] References Plan
Pricing of an Exotic Forward Contract
Pricing of an Exotic Forward Contract Jirô Akahori, Yuji Hishida and Maho Nishida Dept. of Mathematical Sciences, Ritsumeikan University 1-1-1 Nojihigashi, Kusatsu, Shiga 525-8577, Japan E-mail: {akahori,
SEQUENCES OF MAXIMAL DEGREE VERTICES IN GRAPHS. Nickolay Khadzhiivanov, Nedyalko Nenov
Serdica Math. J. 30 (2004), 95 102 SEQUENCES OF MAXIMAL DEGREE VERTICES IN GRAPHS Nickolay Khadzhiivanov, Nedyalko Nenov Communicated by V. Drensky Abstract. Let Γ(M) where M V (G) be the set of all vertices
Option Pricing. 1 Introduction. Mrinal K. Ghosh
Option Pricing Mrinal K. Ghosh 1 Introduction We first introduce the basic terminology in option pricing. Option: An option is the right, but not the obligation to buy (or sell) an asset under specified
Augusto Teixeira (IMPA)
Augusto Teixeira (IMPA) partially joint with Elisabetta Candellero (Warwick) Augusto Teixeira (IMPA) Percolation and isoperimetry SPA - Oxfort, 2015 1 / 19 Percolation Physical phenomenon: Fluid through
Metric Spaces. Chapter 1
Chapter 1 Metric Spaces Many of the arguments you have seen in several variable calculus are almost identical to the corresponding arguments in one variable calculus, especially arguments concerning convergence
SOLUTIONS TO EXERCISES FOR. MATHEMATICS 205A Part 3. Spaces with special properties
SOLUTIONS TO EXERCISES FOR MATHEMATICS 205A Part 3 Fall 2008 III. Spaces with special properties III.1 : Compact spaces I Problems from Munkres, 26, pp. 170 172 3. Show that a finite union of compact subspaces
(Basic definitions and properties; Separation theorems; Characterizations) 1.1 Definition, examples, inner description, algebraic properties
Lecture 1 Convex Sets (Basic definitions and properties; Separation theorems; Characterizations) 1.1 Definition, examples, inner description, algebraic properties 1.1.1 A convex set In the school geometry
1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let
Copyright c 2009 by Karl Sigman 1 Stopping Times 1.1 Stopping Times: Definition Given a stochastic process X = {X n : n 0}, a random time τ is a discrete random variable on the same probability space as
Fuzzy Probability Distributions in Bayesian Analysis
Fuzzy Probability Distributions in Bayesian Analysis Reinhard Viertl and Owat Sunanta Department of Statistics and Probability Theory Vienna University of Technology, Vienna, Austria Corresponding author:
Invariant Option Pricing & Minimax Duality of American and Bermudan Options
Invariant Option Pricing & Minimax Duality of American and Bermudan Options Farshid Jamshidian NIB Capital Bank N.V. FELAB, Applied Math Dept., Univ. of Twente April 2005, version 1.0 Invariant Option
ON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING. 0. Introduction
ON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING Abstract. In [1] there was proved a theorem concerning the continuity of the composition mapping, and there was announced a theorem on sequential continuity
Sums of Independent Random Variables
Chapter 7 Sums of Independent Random Variables 7.1 Sums of Discrete Random Variables In this chapter we turn to the important question of determining the distribution of a sum of independent random variables
On Integer Additive Set-Indexers of Graphs
On Integer Additive Set-Indexers of Graphs arxiv:1312.7672v4 [math.co] 2 Mar 2014 N K Sudev and K A Germina Abstract A set-indexer of a graph G is an injective set-valued function f : V (G) 2 X such that
