MINIMIZATION OF ENTROPY FUNCTIONALS UNDER MOMENT CONSTRAINTS


I. Csiszár (Budapest)

Given a σ-finite measure space $(X, \mathcal{X}, \mu)$ and a $d$-tuple $\phi = (\phi_1, \dots, \phi_d)$ of measurable functions on $X$, for $a = (a_1, \dots, a_d) \in \mathbb{R}^d$ let $L_a$ denote the family of probability density functions $g$ on $X$ satisfying $\int \phi g \, d\mu = a$, that is, $\int \phi_i g \, d\mu = a_i$, $i = 1, \dots, d$.

Extensively studied problem: minimize
$$J(g) = \int g \log g \, d\mu \quad \text{(negative Shannon entropy)}$$
or
$$K(g, h) = \int g \log \frac{g}{h} \, d\mu \quad \text{(Kullback-Leibler distance, I-divergence, relative entropy)}$$
subject to $g \in L_a$.

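First this problem, then its extension to other entropies and distances will be considered.

For $\vartheta = (\vartheta_1, \dots, \vartheta_d) \in \mathbb{R}^d$ denote
$$\Lambda(\vartheta) = \log \int e^{\langle \vartheta, \phi \rangle} \, d\mu, \qquad \langle \vartheta, \phi \rangle = \sum_{i=1}^{d} \vartheta_i \phi_i .$$
Assume: $\mathrm{dom}(\Lambda) = \{\vartheta : \Lambda(\vartheta) < +\infty\}$ is nonempty.

Not hard to show: $\Lambda(\vartheta)$ is the convex conjugate of the function $H(a) = \inf_{g \in L_a} J(g)$:
$$\Lambda(\vartheta) = H^*(\vartheta) = \sup_{a \in \mathbb{R}^d} \big[ \langle \vartheta, a \rangle - H(a) \big].$$

Dual problem associated with the primal problem of minimizing $J(g)$ subject to $g \in L_a$: maximize
$$\ell_a(\vartheta) = \langle \vartheta, a \rangle - \Lambda(\vartheta) \quad \text{for } \vartheta \in \mathbb{R}^d .$$
The supremum of $\ell_a(\vartheta)$ is the convex conjugate of $\Lambda(\vartheta)$ evaluated at $a$, thus the second conjugate $H^{**}(a)$ of $H(a)$. Always $H(a) \ge H^{**}(a)$; the difference is called the duality gap.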
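As a minimal sketch of the primal and dual problems above, consider a toy setting that is not taken from the talk: $X = \{0, 1, 2\}$ with counting measure, $\phi(x) = x$, and $a = 0.5$. The log-partition function is written `log_partition` (the symbol $\Lambda$ is assumed here), and the function names and the bisection solver are illustrative choices. The code maximizes $\ell_a(\vartheta) = \vartheta a - \Lambda(\vartheta)$ and compares the dual value with $J$ at the resulting exponential-family density.

```python
import math

def log_partition(theta, phi_values):
    # Lambda(theta) = log sum_x exp(theta * phi(x)), computed stably.
    m = max(theta * p for p in phi_values)
    return m + math.log(sum(math.exp(theta * p - m) for p in phi_values))

def exp_family_density(theta, phi_values):
    # f_theta(x) = exp(theta * phi(x) - Lambda(theta)) w.r.t. counting measure.
    lam = log_partition(theta, phi_values)
    return [math.exp(theta * p - lam) for p in phi_values]

def mean(density, phi_values):
    return sum(f * p for f, p in zip(density, phi_values))

def solve_dual(a, phi_values, lo=-50.0, hi=50.0, iters=200):
    # The derivative of l_a is a - mean(f_theta), which decreases in theta,
    # so bisection on its sign locates the maximizer.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(exp_family_density(mid, phi_values), phi_values) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def neg_entropy(g):
    # J(g) = sum_x g(x) log g(x), with 0 log 0 = 0.
    return sum(x * math.log(x) for x in g if x > 0)

phi_values = [0.0, 1.0, 2.0]   # toy canonical statistic (assumption)
a = 0.5                        # toy moment constraint (assumption)
theta = solve_dual(a, phi_values)
g_a = exp_family_density(theta, phi_values)
dual_value = theta * a - log_partition(theta, phi_values)
primal_value = neg_entropy(g_a)
```

In this example the maximizing density satisfies the moment constraint, so `primal_value` and `dual_value` agree up to rounding, which illustrates a zero duality gap.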

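Exponential family with canonical statistic $\phi$:
$$\mathcal{E} = \big\{ f_\vartheta = e^{\langle \vartheta, \phi \rangle - \Lambda(\vartheta)} : \vartheta \in \mathrm{dom}(\Lambda) \big\}.$$

When the empirical mean $\frac{1}{n} \sum_{j=1}^{n} \phi(x_j)$ of $\phi$ in a sample $x_1, \dots, x_n$ drawn from a density in $\mathcal{E}$ is equal to $a$, the normalized log-likelihood function is $\ell_a(\vartheta)$; for this $a$ the dual problem means ML estimation.

Moreover, if $g \in L_a$ then
$$\text{(lik.id)} \qquad K(g, f_\vartheta) = J(g) - \ell_a(\vartheta), \quad \vartheta \in \mathrm{dom}(\Lambda),$$
hence, provided $J(g)$ is finite, the dual problem is equivalent to minimizing $K(g, f_\vartheta)$ for $f_\vartheta \in \mathcal{E}$.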
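The claim that the normalized log-likelihood coincides with the dual objective can be checked numerically. The sketch below assumes a toy setting that is not from the talk: $X = \{0, 1, 2\}$ with counting measure, $\phi(x) = x$, a made-up sample, and an arbitrary parameter value; all names are illustrative.

```python
import math

def log_partition(theta, phi_values):
    # Lambda(theta) = log sum_x exp(theta * phi(x)) for counting measure.
    return math.log(sum(math.exp(theta * p) for p in phi_values))

def log_density(theta, x, phi_values):
    # log f_theta(x) with phi(x) = x.
    return theta * x - log_partition(theta, phi_values)

phi_values = [0.0, 1.0, 2.0]          # values of phi on X (assumption)
sample = [0, 1, 1, 2, 0, 1, 0, 0]     # hypothetical sample
a = sum(sample) / len(sample)          # empirical mean of phi
theta = 0.3                            # arbitrary parameter value

normalized_loglik = sum(
    log_density(theta, x, phi_values) for x in sample
) / len(sample)
dual_objective = theta * a - log_partition(theta, phi_values)
```

Both quantities are equal for every `theta`, because the log-density is affine in $\phi(x)$; maximizing one therefore maximizes the other.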

Note: this interpretation of the dual problem does not apply if $a \notin \mathrm{dom}(H)$, in which case $J(g) = +\infty$ for all $g \in L_a$, even though $H^{**}(a) < H(a) = +\infty$ is possible.

Elementary proposition: If $L_a \cap \mathcal{E} \ne \emptyset$, it contains a single $g_a$, and for this
$$J(g) = J(g_a) + K(g, g_a), \quad g \in L_a ;$$
equivalently, the Pythagorean identity holds:
$$K(g, f) = K(g_a, f) + K(g, g_a), \quad g \in L_a, \ f \in \mathcal{E}.$$
In this case, the duality gap is 0: $H(a) = H^{**}(a) = J(g_a)$, and the common member $g_a$ of $L_a$ and $\mathcal{E}$ is simultaneously the I-projection to $L_a$ of each $f \in \mathcal{E}$ and the reverse I-projection to $\mathcal{E}$ of each $g \in L_a$ with $J(g) < +\infty$.
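The Pythagorean identity can be verified on a small example. The sketch below uses an assumed toy setting, not one from the talk: $X = \{0, 1, 2\}$ with counting measure, $\phi(x) = x$, and $a = 1$, for which the uniform density is the member of the exponential family with mean $a$ (parameter $0$). The densities `g` and `f` are arbitrary illustrative choices.

```python
import math

def kl(g, f):
    # K(g, f) = sum_x g(x) log(g(x) / f(x)), with 0 log 0 = 0.
    return sum(x * math.log(x / y) for x, y in zip(g, f) if x > 0)

def exp_family_density(theta, phi_values):
    # f_theta(x) proportional to exp(theta * phi(x)) w.r.t. counting measure.
    weights = [math.exp(theta * p) for p in phi_values]
    total = sum(weights)
    return [w / total for w in weights]

def mean(density, phi_values):
    return sum(d * p for d, p in zip(density, phi_values))

phi_values = [0.0, 1.0, 2.0]
g_a = exp_family_density(0.0, phi_values)   # uniform density, mean 1
g = [0.25, 0.5, 0.25]                       # another density with mean 1
f = exp_family_density(0.7, phi_values)     # arbitrary member of the family

lhs = kl(g, f)
rhs = kl(g_a, f) + kl(g, g_a)
```

Here `lhs` and `rhs` agree up to rounding, as the identity predicts for any `g` with the same mean as `g_a`.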

HISTORY HINTS

Boltzmann, Gibbs: in the 19th century
Jaynes, Kullback: in the fifties
Čencov 1972: information projections, differential-geometric approach
Barndorff-Nielsen 1977: convex analysis approach to MLE for exponential families
Csiszár 1975, Topsøe 1979: generalized minimizer when the minimum is not attained (Shannon case)
Csiszár 1991: axiomatic approach
Borwein and Lewis 1991: convex analysis approach for general entropies
Csiszár 1995: generalized minimizer, general case
Several recent works employ advanced Orlicz space techniques (Léonard) or differential geometry (Amari and Nagaoka 2000, etc.)

This talk is based on works of Csiszár and Matúš and hopefully will show that classical tools suffice for treating the problem efficiently.

Convex core of a finite measure $Q$ on $\mathbb{R}^d$ (Csiszár-Matúš 2001):
$\mathrm{cc}(Q)$ = intersection of all convex Borel sets with full $Q$-measure
= set of means of all probability measures $P \ll Q$ that have a mean.

For the measure $\mu$ on $X$, define
$$\mathrm{cc}_\phi(\mu) = \Big\{ \int \phi g \, d\mu : g \text{ probability density}, \ \phi g \text{ integrable} \Big\} = \{ a \in \mathbb{R}^d : L_a \ne \emptyset \}.$$
If $\mu$ is finite then $\mathrm{cc}_\phi(\mu) = \mathrm{cc}(\mu_\phi)$, where $\mu_\phi$ is the image of $\mu$ under $\phi$ on $\mathbb{R}^d$.

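Lemma: If $a \in \mathrm{cc}_\phi(\mu)$, there exists a bounded $g \in L_a$ with $\mu(\{x : g(x) > 0\}) < +\infty$.

Corollary: $\mathrm{dom}(H) = \mathrm{cc}_\phi(\mu)$, that is, the necessary condition $L_a \ne \emptyset$ for $H(a) = \inf_{g \in L_a} J(g) < +\infty$ is sufficient as well.

Face of a convex set $C \subseteq \mathbb{R}^d$: a nonempty convex subset $F \subseteq C$ such that a convex combination $tx + (1 - t)y$ of $x \in C$ and $y \in C$ (with $0 < t < 1$) belongs to $F$ only if $x, y \in F$.

For a face $F$ of $\mathrm{cc}_\phi(\mu)$, consider the set $\phi^{-1}(\mathrm{cl}(F)) = \{x : \phi(x) \in \mathrm{cl}(F)\}$.

Lemma: For $a$ in a face $F$ of $\mathrm{cc}_\phi(\mu)$, each $g \in L_a$ vanishes outside $\phi^{-1}(\mathrm{cl}(F))$ ($\mu$-a.e.).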

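Extended exponential family $\mathrm{ext}(\mathcal{E})$: the union of the families $\mathcal{E}_F$ for all faces $F$ of $\mathrm{cc}_\phi(\mu)$, where
$$\mathcal{E}_F = \big\{ f_{F,\vartheta} = e^{\langle \vartheta, \phi \rangle - \Lambda_F(\vartheta)} \, 1_{\phi^{-1}(\mathrm{cl}(F))} : \vartheta \in \mathrm{dom}(\Lambda_F) \big\}, \qquad \Lambda_F(\vartheta) = \log \int_{\phi^{-1}(\mathrm{cl}(F))} e^{\langle \vartheta, \phi \rangle} \, d\mu .$$

Theorem 1 (Csiszár-Matúš 2003): Whenever $L_a \ne \emptyset$, thus $a \in \mathrm{cc}_\phi(\mu)$, there exists a unique $g_a$, perhaps not in $L_a$, such that
$$J(g) = H(a) + K(g, g_a), \quad g \in L_a .$$
Moreover, $g_a \in \mathcal{E}_F$ for the face $F$ of $\mathrm{cc}_\phi(\mu)$ whose relative interior contains $a$.

Clearly, if $g_a \in L_a$ then it minimizes $J(g)$ subject to $g \in L_a$. Otherwise, it is a generalized minimizer: every sequence $g_n$ in $L_a$ with $J(g_n) \to H(a)$ satisfies $K(g_n, g_a) \to 0$; in particular, $g_n \to g_a$ in $L_1(\mu)$.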

Generalized Pythagorean identity:
$$K(g, f) = K(L_a, f) + K(g, g_a), \quad g \in L_a, \ f \in \mathcal{E},$$
where
$$K(L_a, f) = \inf_{g \in L_a} K(g, f) \ge K(g_a, f).$$
Thus, $g_a$ is the generalized I-projection to $L_a$ of each $f \in \mathcal{E}$.

If $a \in \mathrm{ri}(\mathrm{cc}_\phi(\mu))$, thus $g_a \in \mathcal{E}$, then $g_a$ is also the reverse I-projection to $\mathcal{E}$ of each $g \in L_a$ with $J(g) < +\infty$, and the duality gap is zero.

$g_a \notin L_a$ can happen if $g_a = f_\vartheta$ with $\vartheta$ on the boundary of $\mathrm{dom}(\Lambda)$; $g_a$ may be the same for several vectors $a$.

Existence of minimizer (I-projection): $g_a \in L_a$ holds for all $a \in \mathrm{ri}(\mathrm{cc}_\phi(\mu))$ if and only if $\Lambda$ is steep, and for all $a \in \mathrm{ri}(F)$ if and only if $\Lambda_F$ is steep.

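Theorem 2 (Csiszár-Matúš): If $H^{**}(a) = \sup_{\vartheta \in \mathbb{R}^d} \ell_a(\vartheta)$ is finite, there exists a unique density $h_a$ such that
$$H^{**}(a) - \ell_a(\vartheta) \ge K(h_a, f_\vartheta), \quad \vartheta \in \mathrm{dom}(\Lambda).$$
Moreover, $h_a \in \mathcal{E}_F$, where $F$ is the largest face of $\mathrm{cc}_\phi(\mu)$ with $a \in \mathrm{ri}(F) + \mathrm{barr}(\mathrm{dom}(\Lambda))$.

Here barr denotes the barrier cone: for any convex set $C \subseteq \mathbb{R}^d$,
$$\mathrm{barr}(C) = \Big\{ b : \sup_{c \in C} \langle b, c \rangle < +\infty \Big\}.$$

Supplement: $\mathrm{dom}(H^{**}) = \mathrm{cc}_\phi(\mu) + \mathrm{barr}(\mathrm{dom}(\Lambda))$.

The maximum of $\ell_a(\vartheta)$ is attained (the MLE exists) if and only if $h_a \in \mathcal{E}$. Otherwise, $h_a$ is a generalized MLE: every sequence $\vartheta_n$ in $\mathrm{dom}(\Lambda)$ with $\ell_a(\vartheta_n) \to H^{**}(a)$ satisfies $K(h_a, f_{\vartheta_n}) \to 0$; in particular, $f_{\vartheta_n} \to h_a$ in $L_1(\mu)$.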
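A simple case where the MLE fails to exist may help to picture the generalized MLE. The sketch below assumes a toy setting that is not from the talk: $X = \{0, 1\}$ with counting measure, $\phi(x) = x$, and $a = 1$, a boundary point of the interval $[0, 1]$. Then $\ell_a(\vartheta) = \vartheta - \log(1 + e^{\vartheta})$ increases to $0$ without attaining it, and in this case the limit of $f_\vartheta$ can be read off directly as the point mass at $x = 1$, taken here as `h_a`.

```python
import math

def dual_objective(theta):
    # l_a(theta) = theta * 1 - log(1 + exp(theta)) = -log(1 + exp(-theta)).
    return -math.log1p(math.exp(-theta))

def density(theta):
    # f_theta on X = {0, 1}: (1 - p1, p1) with p1 = e^theta / (1 + e^theta).
    p1 = 1.0 / (1.0 + math.exp(-theta))
    return [1.0 - p1, p1]

h_a = [0.0, 1.0]   # point mass at x = 1 (limit identified by inspection)
thetas = [0.0, 2.0, 5.0, 10.0, 20.0]

values = [dual_objective(t) for t in thetas]
l1_distances = [
    sum(abs(f - h) for f, h in zip(density(t), h_a)) for t in thetas
]
# K(h_a, f_theta) = -log f_theta(1), since h_a puts all mass on x = 1.
kl_to_family = [-math.log(density(t)[1]) for t in thetas]
```

Along the sequence, `values` increase toward the supremum $0$ while both `l1_distances` and `kl_to_family` decrease toward $0$; in this example $K(h_a, f_\vartheta)$ equals $-\ell_a(\vartheta)$, so the inequality of the theorem holds with equality.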

GENERAL ENTROPY FUNCTIONALS

In the sequel, $\gamma$ is a given strictly convex, differentiable function on $(0, +\infty)$; $\gamma(0)$ is defined as $\lim_{t \downarrow 0} \gamma(t)$; later, $\gamma'(0)$ and $\gamma'(+\infty)$ are also defined as limits.

$\gamma$-entropy of a nonnegative function $g$ on $X$:
$$J_\gamma(g) = \int \gamma(g) \, d\mu .$$

Familiar choices of $\gamma$, in addition to $t \log t$:
$\gamma(t) = -\log t$: Burg entropy
$\gamma(t) = \mathrm{sign}(\alpha - 1) \, t^\alpha$: Rényi (Tsallis) entropy

Problem: minimize $J_\gamma(g)$ subject to $g \in L_a$, where $L_a$ is defined slightly differently than before: attention is not restricted to probability densities. Accordingly, we set $\phi = (\phi_0, \phi_1, \dots, \phi_d)$ with $\phi_0$ identically 1, and for $a = (a_0, \dots, a_d) \in \mathbb{R}^{1+d}$,
$$L_a = \Big\{ g \ge 0 : \int \phi g \, d\mu = a \Big\}.$$

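BASIC TOOLS

The convex conjugate of $\gamma$,
$$\gamma^*(r) = \sup_{t > 0} \, [rt - \gamma(t)],$$
is a nondecreasing convex function, finite and differentiable in $(-\infty, \gamma'(+\infty))$, and its derivative goes to $+\infty$ as $r \uparrow \gamma'(+\infty)$. $\gamma^*(\gamma'(+\infty))$ may or may not be finite.

Denote by $u$ the function on $\mathbb{R}$ equal to $(\gamma^*)'$ in $(-\infty, \gamma'(+\infty))$ and $+\infty$ outside. Then $u(r) = 0$ if $r \le \gamma'(0)$, and $u$ is strictly increasing from $0$ to $+\infty$ in the interval $(\gamma'(0), \gamma'(+\infty))$.

Lemma 1. For $r < \gamma'(+\infty)$,
$$\gamma'(u(r)) = \max[\gamma'(0), r] = r + |\gamma'(0) - r|_+ ,$$
$$\gamma(u(r)) + \gamma^*(r) = r \, u(r),$$
where $|x|_+ = \max(x, 0)$ denotes the positive part.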
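As a worked illustration of these definitions (computed here for three standard choices of $\gamma$, not taken from the talk), the conjugate $\gamma^*$ and the function $u$ can be written in closed form, and the second relation of Lemma 1 can be checked by substitution. The case $\alpha = 2$ is chosen for the power family purely for convenience.

```latex
% gamma(t) = t log t:  gamma'(t) = log t + 1,  gamma'(0) = -infty,  gamma'(+infty) = +infty
\gamma^*(r) = e^{r-1}, \qquad u(r) = e^{r-1}, \qquad
\gamma(u(r)) + \gamma^*(r) = (r-1)e^{r-1} + e^{r-1} = r\,u(r).

% gamma(t) = -log t:  gamma'(t) = -1/t,  gamma'(0) = -infty,  gamma'(+infty) = 0
\gamma^*(r) = -1 - \log(-r) \ \ (r < 0), \qquad u(r) = -\frac{1}{r} \ \ (r < 0), \qquad
\gamma(u(r)) + \gamma^*(r) = \log(-r) - 1 - \log(-r) = -1 = r\,u(r).

% gamma(t) = t^2 (alpha = 2):  gamma'(t) = 2t,  gamma'(0) = 0,  gamma'(+infty) = +infty
\gamma^*(r) = \tfrac{1}{4}\,|r|_+^2, \qquad u(r) = \tfrac{1}{2}\,|r|_+, \qquad
\gamma'(u(r)) = |r|_+ = \max[0, r], \qquad
\gamma(u(r)) + \gamma^*(r) = \tfrac{1}{2}\,|r|_+^2 = r\,u(r).
```

Only the last case has finite $\gamma'(0)$, so it is the only one of the three in which $u$ vanishes on a half-line and the positive-part term in Lemma 1 can be nonzero.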

For nonnegative numbers $t, s$ define
$$\Delta_\gamma(t, s) = \gamma(t) - [\gamma(s) + \gamma'(s)(t - s)]$$
(not meaningful for $s = 0$ if $\gamma'(0) = -\infty$; then we set $\Delta_\gamma(0, 0) = 0$ and $\Delta_\gamma(t, 0) = +\infty$ if $t > 0$).

Bregman distance of nonnegative functions $g, h$ on $X$:
$$B_\gamma(g, h) = \int \Delta_\gamma(g, h) \, d\mu .$$
Clearly, $B_\gamma(g, h) \ge 0$, with equality iff $g = h$ $[\mu]$.
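The definition of the Bregman distance can be sketched for a finite space as follows. The setting is an assumption for illustration: three points with unit weights standing in for $\mu$, strictly positive functions only (so the conventions for $s = 0$ are not needed), and hypothetical function names. For $\gamma(t) = t \log t$ and probability densities the result should reduce to the Kullback-Leibler distance, and for $\gamma(t) = -\log t$ to the Itakura-Saito form $t/s - \log(t/s) - 1$.

```python
import math

def bregman_integrand(gamma, dgamma, t, s):
    # Delta_gamma(t, s) = gamma(t) - [gamma(s) + gamma'(s) (t - s)], for s > 0.
    return gamma(t) - (gamma(s) + dgamma(s) * (t - s))

def bregman_distance(gamma, dgamma, g, h, weights):
    # B_gamma(g, h) as a weighted sum; weights play the role of mu.
    return sum(
        w * bregman_integrand(gamma, dgamma, t, s)
        for t, s, w in zip(g, h, weights)
    )

# gamma and its derivative for two of the entropies named in the talk.
shannon = (lambda t: t * math.log(t), lambda t: math.log(t) + 1.0)
burg = (lambda t: -math.log(t), lambda t: -1.0 / t)

g = [0.2, 0.5, 0.3]          # illustrative positive functions
h = [0.4, 0.4, 0.2]
weights = [1.0, 1.0, 1.0]    # counting measure (assumption)

b_shannon = bregman_distance(shannon[0], shannon[1], g, h, weights)
kl = sum(t * math.log(t / s) for t, s in zip(g, h))

b_burg = bregman_distance(burg[0], burg[1], g, h, weights)
itakura_saito = sum(t / s - math.log(t / s) - 1.0 for t, s in zip(g, h))
```

Both distances are nonnegative and vanish when `g` and `h` coincide, in line with the remark above.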

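KEY IDENTITY

Denote:
$L_a$: the family of nonnegative (measurable) functions $g$ on $X$ satisfying the constraints $\int g \phi \, d\mu = a$ ($a = (a_0, \dots, a_d) \in \mathbb{R}^{1+d}$)
$F_\gamma$: the family of functions $f_\vartheta = u(\langle \vartheta, \phi \rangle)$ with $\vartheta \in \mathbb{R}^{1+d}$ such that $\int \gamma^*(\langle \vartheta, \phi \rangle) \, d\mu$ is finite and $\langle \vartheta, \phi \rangle < \gamma'(+\infty)$ $[\mu]$

Key identity: For $g \in L_a$ and $f_\vartheta \in F_\gamma$,
$$J_\gamma(g) - \Big[ \langle \vartheta, a \rangle - \int \gamma^*(\langle \vartheta, \phi \rangle) \, d\mu \Big] = B_\gamma(g, f_\vartheta) + \int g \, |\gamma'(0) - \langle \vartheta, \phi \rangle|_+ \, d\mu .$$

Proof: Immediate, using Lemma 1.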
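The key identity can be checked numerically in a case where the positive-part term does not vanish. The sketch below assumes a toy setting that is not from the talk: $X = \{0, 1, 2, 3\}$ with counting measure, $\phi(x) = (1, x)$, and $\gamma(t) = t^2$, for which $\gamma^*(r) = |r|_+^2 / 4$, $u(r) = |r|_+ / 2$ and $\gamma'(0) = 0$. The parameter `theta` and the function `g` are arbitrary illustrative choices.

```python
def gamma(t):
    return t * t

def gamma_conj(r):
    # Conjugate of t^2 over t > 0: r^2 / 4 for r > 0, and 0 otherwise.
    return r * r / 4.0 if r > 0 else 0.0

def u(r):
    # Derivative of gamma_conj: positive part of r, divided by 2.
    return max(r, 0.0) / 2.0

GAMMA_PRIME_AT_ZERO = 0.0   # gamma'(t) = 2t, so gamma'(0) = 0

def inner(theta, p):
    return sum(t * c for t, c in zip(theta, p))

points = [0.0, 1.0, 2.0, 3.0]
phi = [(1.0, x) for x in points]     # phi_0 = 1, phi_1(x) = x
theta = (1.0, -0.5)                  # arbitrary dual parameter
g = [0.4, 0.3, 0.2, 0.1]             # arbitrary nonnegative function

a = tuple(sum(gx * p[i] for gx, p in zip(g, phi)) for i in range(2))
r = [inner(theta, p) for p in phi]   # <theta, phi(x)> at each point
f = [u(ri) for ri in r]              # f_theta = u(<theta, phi>)

lhs = (
    sum(gamma(gx) for gx in g)
    - inner(theta, a)
    + sum(gamma_conj(ri) for ri in r)
)
bregman = sum(
    gamma(gx) - gamma(fx) - 2.0 * fx * (gx - fx) for gx, fx in zip(g, f)
)
correction = sum(
    gx * max(GAMMA_PRIME_AT_ZERO - ri, 0.0) for gx, ri in zip(g, r)
)
rhs = bregman + correction
```

In this example $\langle \vartheta, \phi \rangle$ is negative at $x = 3$, so `correction` is strictly positive there, and `lhs` matches `rhs` only when that term is included.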

Proposition: If $L_a \cap F_\gamma \ne \emptyset$, it consists of a single function $g = f_\vartheta$; this $g$ minimizes $J_\gamma(g)$ subject to $g \in L_a$, and $\vartheta$ maximizes $\langle \vartheta, a \rangle - \int \gamma^*(\langle \vartheta, \phi \rangle) \, d\mu$; these minimum and maximum are equal. [But $\vartheta$ need not be unique, only $f_\vartheta$ is.]

Proof: Immediate from the key identity.

The family $F_\gamma$ is the $\gamma$-analogue of an exponential family in the theory of Shannon entropy maximization. While the functions in $F_\gamma$ need not be probability densities, in the case $\gamma(t) = t \log t$ they are exactly the constant multiples of the probability densities in $F_\gamma$, which form an exponential family in the familiar statistical sense. For other $\gamma$, however, no simple way is apparent to identify the probability densities in $F_\gamma$.

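Convex conjugate of $H_\gamma(a) = \inf_{g \in L_a} J_\gamma(g)$:
$$H_\gamma^*(\vartheta) = \sup_{a \in \mathbb{R}^{1+d}} \big[ \langle \vartheta, a \rangle - H_\gamma(a) \big], \quad \vartheta \in \mathbb{R}^{1+d}.$$

Lemma 2: If $\mathrm{dom}(H_\gamma) \ne \emptyset$, thus there exists some $g$ with $\int \gamma(g) \, d\mu < +\infty$ and $g \phi_i$ integrable for $i = 0, \dots, d$, then
$$H_\gamma^*(\vartheta) = \int \gamma^*(\langle \vartheta, \phi \rangle) \, d\mu .$$

Dual problem: find the dual value
$$H_\gamma^{**}(a) = \sup_{\vartheta \in \mathbb{R}^{1+d}} \big[ \langle \vartheta, a \rangle - H_\gamma^*(\vartheta) \big],$$
and if it is finite, find $\vartheta \in \mathbb{R}^{1+d}$ that attains the maximum, if such $\vartheta$ exists (dual attainment). In the latter case, the function $f_\vartheta = u(\langle \vartheta, \phi \rangle)$ will be called the dual solution, rather than $\vartheta$ itself.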

Lemma 3: For $\vartheta \in \mathrm{dom}(H_\gamma^*)$, the directional derivative of $H_\gamma^*$ at $\vartheta$ in a direction $\tau$ exists and equals
$$\int f_\vartheta \, \langle \tau, \phi \rangle \, d\mu < +\infty$$
whenever $\vartheta + t\tau \in \mathrm{dom}(H_\gamma^*)$ for some $t > 0$. In particular, $H_\gamma^*$ is differentiable in the interior of its essential domain, with the gradient equal to $\int f_\vartheta \, \phi \, d\mu$.

Corollary: For $a \in \mathbb{R}^{1+d}$ with $H_\gamma^{**}(a)$ finite, a dual solution satisfies
$$\int f_\vartheta \, d\mu \le a_0 .$$

Proof: Straightforward calculus. Differentiation within the integral is justified by monotone convergence.

Proposition 2: If $H_\gamma(a)$ is finite and $a$ is in the relative interior of $\mathrm{dom}(H_\gamma)$, then the primal and dual values are equal, and dual attainment holds. Moreover, the dual solution $f_\vartheta$ satisfies, for each $g \in L_a$,
$$J_\gamma(g) = H_\gamma(a) + B_\gamma(g, f_\vartheta) + \int g \, |\gamma'(0) - \langle \vartheta, \phi \rangle|_+ \, d\mu .$$
If, in addition, $H_\gamma^*(\vartheta) = \int \gamma^*(\langle \vartheta, \phi \rangle) \, d\mu$ is essentially smooth, then the dual solution $f_\vartheta$ belongs to $L_a$, hence it is a primal solution, too.

Proof: The first assertion is a general convex analysis result. The second assertion follows from it, by the key identity. The last assertion follows by Lemma 3, since if $H_\gamma^*$ is essentially smooth, the maximizing $\vartheta$ has to be in the interior of $\mathrm{dom}(H_\gamma^*)$.

Theorem 1: If $\gamma(0) = 0$ then
$$\mathrm{dom}(H_\gamma) = \{ ta : t \ge 0, \ a \in \mathrm{cc}_\phi(\mu) \}.$$
If $\gamma(0) = +\infty$ (and $\mu(X) < +\infty$) then $\mathrm{dom}(H_\gamma)$ is either empty, or
$$\mathrm{dom}(H_\gamma) = \{ ta : t > 0, \ a \in \mathrm{ri}(\mathrm{cc}_\phi(\mu)) \} = \mathrm{dom}(H_\gamma).$$

In both cases, $\mathrm{dom}(H_\gamma)$ is a cone that has a simple description in terms of $\mathrm{cc}_\phi(\mu)$.

In the case $\gamma(0) = +\infty$, no general criterion appears available to determine whether $\mathrm{dom}(H_\gamma)$ is empty or not, but if it is nonempty, then Proposition 2 tells the story, as $\mathrm{dom}(H_\gamma)$ is relatively open.

In the sequel, we concentrate on the case $\gamma(0) = 0$.

Given nonzero $a \in \mathrm{dom}(H_\gamma)$, equivalently $a_0 > 0$ and $a_0^{-1} a \in \mathrm{cc}_\phi(\mu)$, let $F(a)$ denote the face of $\mathrm{cc}_\phi(\mu)$ whose relative interior contains $a_0^{-1} a$.

Let $\nu$ denote the restriction of $\mu$ to the set $\phi^{-1}(\mathrm{cl}(F(a)))$.

Each $g \in L_a$ vanishes outside $\phi^{-1}(\mathrm{cl}(F(a)))$, which implies
$$H_\gamma(a) = \inf_{g \in L_a} \int \gamma(g) \, d\nu .$$

Proposition 2 applies to this $a$ and the measure $\nu$ in the role of $\mu$, because $a_0^{-1} a$ is in the relative interior of the face $F(a)$, which equals $\mathrm{cc}_\phi(\nu)$.

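It follows, provided $H_\gamma(a) > -\infty$, that the maximum of
$$\langle \vartheta, a \rangle - \int \gamma^*(\langle \vartheta, \phi \rangle) \, d\nu$$
is attained, and with a maximizing $\vartheta$, the function
$$f_{F,\vartheta} = u(\langle \vartheta, \phi \rangle) \, 1_{\phi^{-1}(\mathrm{cl}(F(a)))}$$
satisfies, for all $g \in L_a$,
$$J_\gamma(g) = H_\gamma(a) + B_\gamma(g, f_{F,\vartheta}) + \int g \, |\gamma'(0) - \langle \vartheta, \phi \rangle|_+ \, d\mu .$$

Theorem 2: To every $a \ne 0$ with $H_\gamma(a)$ finite, there exists a (unique) generalized primal solution, of the form $f_{F,\vartheta} = u(\langle \vartheta, \phi \rangle) \, 1_{\phi^{-1}(\mathrm{cl}(F(a)))}$, and it satisfies the above identity.

Essential smoothness of $\vartheta \mapsto \int_{\phi^{-1}(\mathrm{cl}(F(a)))} \gamma^*(\langle \vartheta, \phi \rangle) \, d\mu$ is a sufficient condition for primal attainment.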


More information

LECTURE 15: AMERICAN OPTIONS

LECTURE 15: AMERICAN OPTIONS LECTURE 15: AMERICAN OPTIONS 1. Introduction All of the options that we have considered thus far have been of the European variety: exercise is permitted only at the termination of the contract. These

More information

Several Views of Support Vector Machines

Several Views of Support Vector Machines Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min

More information

What is Linear Programming?

What is Linear Programming? Chapter 1 What is Linear Programming? An optimization problem usually has three essential ingredients: a variable vector x consisting of a set of unknowns to be determined, an objective function of x to

More information

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method Robert M. Freund February, 004 004 Massachusetts Institute of Technology. 1 1 The Algorithm The problem

More information

Nonlinear Optimization: Algorithms 3: Interior-point methods

Nonlinear Optimization: Algorithms 3: Interior-point methods Nonlinear Optimization: Algorithms 3: Interior-point methods INSEAD, Spring 2006 Jean-Philippe Vert Ecole des Mines de Paris Jean-Philippe.Vert@mines.org Nonlinear optimization c 2006 Jean-Philippe Vert,

More information

DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS

DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS DEGREES OF ORDERS ON TORSION-FREE ABELIAN GROUPS ASHER M. KACH, KAREN LANGE, AND REED SOLOMON Abstract. We construct two computable presentations of computable torsion-free abelian groups, one of isomorphism

More information

Lecture 13 - Basic Number Theory.

Lecture 13 - Basic Number Theory. Lecture 13 - Basic Number Theory. Boaz Barak March 22, 2010 Divisibility and primes Unless mentioned otherwise throughout this lecture all numbers are non-negative integers. We say that A divides B, denoted

More information

Mathematical finance and linear programming (optimization)

Mathematical finance and linear programming (optimization) Mathematical finance and linear programming (optimization) Geir Dahl September 15, 2009 1 Introduction The purpose of this short note is to explain how linear programming (LP) (=linear optimization) may

More information

10. Proximal point method

10. Proximal point method L. Vandenberghe EE236C Spring 2013-14) 10. Proximal point method proximal point method augmented Lagrangian method Moreau-Yosida smoothing 10-1 Proximal point method a conceptual algorithm for minimizing

More information

t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d).

t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d). 1. Line Search Methods Let f : R n R be given and suppose that x c is our current best estimate of a solution to P min x R nf(x). A standard method for improving the estimate x c is to choose a direction

More information

Geometrical Characterization of RN-operators between Locally Convex Vector Spaces

Geometrical Characterization of RN-operators between Locally Convex Vector Spaces Geometrical Characterization of RN-operators between Locally Convex Vector Spaces OLEG REINOV St. Petersburg State University Dept. of Mathematics and Mechanics Universitetskii pr. 28, 198504 St, Petersburg

More information

Let H and J be as in the above lemma. The result of the lemma shows that the integral

Let H and J be as in the above lemma. The result of the lemma shows that the integral Let and be as in the above lemma. The result of the lemma shows that the integral ( f(x, y)dy) dx is well defined; we denote it by f(x, y)dydx. By symmetry, also the integral ( f(x, y)dx) dy is well defined;

More information

Schooling, Political Participation, and the Economy. (Online Supplementary Appendix: Not for Publication)

Schooling, Political Participation, and the Economy. (Online Supplementary Appendix: Not for Publication) Schooling, Political Participation, and the Economy Online Supplementary Appendix: Not for Publication) Filipe R. Campante Davin Chor July 200 Abstract In this online appendix, we present the proofs for

More information

CONTINUED FRACTIONS AND PELL S EQUATION. Contents 1. Continued Fractions 1 2. Solution to Pell s Equation 9 References 12

CONTINUED FRACTIONS AND PELL S EQUATION. Contents 1. Continued Fractions 1 2. Solution to Pell s Equation 9 References 12 CONTINUED FRACTIONS AND PELL S EQUATION SEUNG HYUN YANG Abstract. In this REU paper, I will use some important characteristics of continued fractions to give the complete set of solutions to Pell s equation.

More information

1 Error in Euler s Method

1 Error in Euler s Method 1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of

More information

Some Research Problems in Uncertainty Theory

Some Research Problems in Uncertainty Theory Journal of Uncertain Systems Vol.3, No.1, pp.3-10, 2009 Online at: www.jus.org.uk Some Research Problems in Uncertainty Theory aoding Liu Uncertainty Theory Laboratory, Department of Mathematical Sciences

More information

Lecture 7: Continuous Random Variables

Lecture 7: Continuous Random Variables Lecture 7: Continuous Random Variables 21 September 2005 1 Our First Continuous Random Variable The back of the lecture hall is roughly 10 meters across. Suppose it were exactly 10 meters, and consider

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

ON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING. 0. Introduction

ON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING. 0. Introduction ON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING Abstract. In [1] there was proved a theorem concerning the continuity of the composition mapping, and there was announced a theorem on sequential continuity

More information

Optimal File Sharing in Distributed Networks

Optimal File Sharing in Distributed Networks Optimal File Sharing in Distributed Networks Moni Naor Ron M. Roth Abstract The following file distribution problem is considered: Given a network of processors represented by an undirected graph G = (V,

More information

Section 1.3 P 1 = 1 2. = 1 4 2 8. P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., = 1 2 4.

Section 1.3 P 1 = 1 2. = 1 4 2 8. P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., = 1 2 4. Difference Equations to Differential Equations Section. The Sum of a Sequence This section considers the problem of adding together the terms of a sequence. Of course, this is a problem only if more than

More information

Numerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen

Numerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen (für Informatiker) M. Grepl J. Berger & J.T. Frings Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2010/11 Problem Statement Unconstrained Optimality Conditions Constrained

More information

8 Divisibility and prime numbers

8 Divisibility and prime numbers 8 Divisibility and prime numbers 8.1 Divisibility In this short section we extend the concept of a multiple from the natural numbers to the integers. We also summarize several other terms that express

More information

Matrix Representations of Linear Transformations and Changes of Coordinates

Matrix Representations of Linear Transformations and Changes of Coordinates Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under

More information

Probability Theory. Florian Herzog. A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T..

Probability Theory. Florian Herzog. A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T.. Probability Theory A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T.. Florian Herzog 2013 Probability space Probability space A probability space W is a unique triple W = {Ω, F,

More information

WHEN DOES A RANDOMLY WEIGHTED SELF NORMALIZED SUM CONVERGE IN DISTRIBUTION?

WHEN DOES A RANDOMLY WEIGHTED SELF NORMALIZED SUM CONVERGE IN DISTRIBUTION? WHEN DOES A RANDOMLY WEIGHTED SELF NORMALIZED SUM CONVERGE IN DISTRIBUTION? DAVID M MASON 1 Statistics Program, University of Delaware Newark, DE 19717 email: davidm@udeledu JOEL ZINN 2 Department of Mathematics,

More information

Statistical Machine Translation: IBM Models 1 and 2

Statistical Machine Translation: IBM Models 1 and 2 Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation

More information

Max-Min Representation of Piecewise Linear Functions

Max-Min Representation of Piecewise Linear Functions Beiträge zur Algebra und Geometrie Contributions to Algebra and Geometry Volume 43 (2002), No. 1, 297-302. Max-Min Representation of Piecewise Linear Functions Sergei Ovchinnikov Mathematics Department,

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

An Introduction to Linear Programming

An Introduction to Linear Programming An Introduction to Linear Programming Steven J. Miller March 31, 2007 Mathematics Department Brown University 151 Thayer Street Providence, RI 02912 Abstract We describe Linear Programming, an important

More information

1 Solving LPs: The Simplex Algorithm of George Dantzig

1 Solving LPs: The Simplex Algorithm of George Dantzig Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.

More information

1 Norms and Vector Spaces

1 Norms and Vector Spaces 008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)

More information

This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.

This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements. 3. Axioms of Set theory Before presenting the axioms of set theory, we first make a few basic comments about the relevant first order logic. We will give a somewhat more detailed discussion later, but

More information