MINIMIZATION OF ENTROPY FUNCTIONALS UNDER MOMENT CONSTRAINTS
I. Csiszár (Budapest)

Given a σ-finite measure space $(X, \mathcal{X}, \mu)$ and a d-tuple $\varphi = (\varphi_1, \dots, \varphi_d)$ of measurable functions on $X$, for $a = (a_1, \dots, a_d) \in \mathbb{R}^d$ let $L_a$ denote the family of probability density functions $g$ on $X$ satisfying
\[ \int \varphi g \, d\mu = a, \quad \text{that is,} \quad \int \varphi_i g \, d\mu = a_i, \quad i = 1, \dots, d. \]
Extensively studied problem: minimize
\[ J(g) = \int g \log g \, d\mu \quad \text{(negative Shannon entropy)} \]
or
\[ K(g, h) = \int g \log \frac{g}{h} \, d\mu \quad \text{(Kullback-Leibler distance, I-divergence, relative entropy)} \]
subject to $g \in L_a$.
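First this problem, then its extension to other entropies and distances will be considered.

For $\vartheta = (\vartheta_1, \dots, \vartheta_d) \in \mathbb{R}^d$ denote
\[ \Lambda(\vartheta) = \log \int e^{\langle \vartheta, \varphi \rangle} \, d\mu, \qquad \langle \vartheta, \varphi \rangle = \sum_{i=1}^{d} \vartheta_i \varphi_i. \]
Assume: $\mathrm{dom}(\Lambda) = \{\vartheta : \Lambda(\vartheta) < +\infty\}$ is nonempty.

Not hard to show: $\Lambda(\vartheta)$ is the convex conjugate of the function $H(a) = \inf_{g \in L_a} J(g)$:
\[ \Lambda(\vartheta) = H^*(\vartheta) = \sup_{a \in \mathbb{R}^d} \left[ \langle \vartheta, a \rangle - H(a) \right]. \]
Dual problem associated with the primal problem of minimizing $J(g)$ subject to $g \in L_a$: maximize
\[ \ell_a(\vartheta) = \langle \vartheta, a \rangle - \Lambda(\vartheta) \quad \text{for } \vartheta \in \mathbb{R}^d. \]
The supremum of $\ell_a(\vartheta)$ is the convex conjugate of $\Lambda$ evaluated at $a$, thus the second conjugate $H^{**}(a)$ of $H(a)$. Always $H(a) \ge H^{**}(a)$; the difference is called the duality gap.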
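Exponential family with canonical statistic $\varphi$:
\[ E = \{ f_\vartheta = e^{\langle \vartheta, \varphi \rangle - \Lambda(\vartheta)} : \vartheta \in \mathrm{dom}(\Lambda) \}. \]
When the empirical mean $\frac{1}{n} \sum_{j=1}^{n} \varphi(x_j)$ of $\varphi$ in a sample $x_1, \dots, x_n$ drawn from a density in $E$ is equal to $a$, the normalized log-likelihood function is $\ell_a(\vartheta)$; for this $a$ the dual problem means ML estimation.

Moreover, if $g \in L_a$ then
\[ \text{(lik.id)} \qquad K(g, f_\vartheta) = J(g) - \ell_a(\vartheta), \quad \vartheta \in \mathrm{dom}(\Lambda), \]
hence, provided $J(g)$ is finite, the dual problem is equivalent to minimizing $K(g, f_\vartheta)$ for $f_\vartheta \in E$.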
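The identity (lik.id) and the dual problem can be checked numerically on a small example. The sketch below is only an illustration under assumptions that are not part of the talk: $X = \{0, 1, 2, 3\}$ with counting measure, a single moment function $\varphi(x) = x$, and a hypothetical helper `solve_dual` that maximizes $\ell_a$ by bisection on the increasing map $\vartheta \mapsto \int \varphi f_\vartheta \, d\mu$.

```python
import math

X = [0, 1, 2, 3]                 # assumed finite sample space, mu = counting measure
phi = [float(x) for x in X]      # assumed single moment function phi(x) = x (d = 1)

def Lambda(theta):
    """Log of the sum of exp(theta * phi(x)) over X."""
    return math.log(sum(math.exp(theta * p) for p in phi))

def f(theta):
    """Exponential-family density f_theta = exp(theta * phi - Lambda(theta))."""
    lam = Lambda(theta)
    return [math.exp(theta * p - lam) for p in phi]

def mean(g):
    """Moment of g: sum of phi(x) * g(x)."""
    return sum(gi * p for gi, p in zip(g, phi))

def J(g):
    """Negative Shannon entropy, with the convention 0 log 0 = 0."""
    return sum(gi * math.log(gi) for gi in g if gi > 0)

def K(g, h):
    """Kullback-Leibler distance of g from h (h assumed strictly positive)."""
    return sum(gi * math.log(gi / hi) for gi, hi in zip(g, h) if gi > 0)

def ell(a, theta):
    """Dual objective l_a(theta) = theta * a - Lambda(theta)."""
    return theta * a - Lambda(theta)

def solve_dual(a, lo=-20.0, hi=20.0, iters=200):
    """Hypothetical helper: bisection on theta -> mean(f_theta), which is increasing."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(f(mid)) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g = [0.4, 0.3, 0.2, 0.1]         # a density on X; it lies in L_a for a = mean(g)
a = mean(g)
theta = 0.7                      # arbitrary parameter value
gap = K(g, f(theta)) - (J(g) - ell(a, theta))   # should vanish by (lik.id)
theta_hat = solve_dual(a)                       # maximizer of l_a
print(abs(gap) < 1e-12, abs(mean(f(theta_hat)) - a) < 1e-9)
```

Bisection is used here only because it is robust in one dimension; for $d > 1$ a general concave maximization routine would be needed.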
Note: this interpretation of the dual problem does not apply if $a \notin \mathrm{dom}(H)$, in which case $J(g) = +\infty$ for all $g \in L_a$, even though $H^{**}(a) < H(a) = +\infty$ is possible.

Elementary proposition: If $L_a \cap E \ne \emptyset$, it contains a single $g_a$, and for this
\[ J(g) = J(g_a) + K(g, g_a), \quad g \in L_a; \]
equivalently, the Pythagorean identity holds:
\[ K(g, f) = K(g_a, f) + K(g, g_a), \quad g \in L_a, \ f \in E. \]
In this case, the duality gap is 0: $H(a) = H^{**}(a) = J(g_a)$, and the common member $g_a$ of $L_a$ and $E$ is simultaneously the I-projection to $L_a$ of each $f \in E$ and the reverse I-projection to $E$ of each $g \in L_a$ with $J(g) < +\infty$.
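A numerical illustration of the elementary proposition, again under assumptions that are not from the talk: counting measure on $X = \{0, 1, 2, 3\}$, $\varphi(x) = x$, and a hypothetical helper `member_with_mean` that finds the member of $E$ with prescribed mean by bisection. Both residuals should vanish up to rounding.

```python
import math

phi = [0.0, 1.0, 2.0, 3.0]       # assumed: phi(x) = x on X = {0, 1, 2, 3}, counting measure

def f(theta):
    """Exponential-family density proportional to exp(theta * phi)."""
    w = [math.exp(theta * p) for p in phi]
    z = sum(w)
    return [wi / z for wi in w]

def mean(g):
    return sum(gi * p for gi, p in zip(g, phi))

def J(g):
    """Negative Shannon entropy, with 0 log 0 = 0."""
    return sum(gi * math.log(gi) for gi in g if gi > 0)

def K(g, h):
    """Kullback-Leibler distance of g from h (h assumed strictly positive)."""
    return sum(gi * math.log(gi / hi) for gi, hi in zip(g, h) if gi > 0)

def member_with_mean(a, lo=-20.0, hi=20.0, iters=200):
    """Hypothetical helper: the f_theta whose mean equals a, found by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(f(mid)) < a:
            lo = mid
        else:
            hi = mid
    return f(0.5 * (lo + hi))

g = [0.4, 0.3, 0.2, 0.1]                 # some density in L_a, a = mean(g)
g_a = member_with_mean(mean(g))          # common member of L_a and E
f_other = f(-0.3)                        # an arbitrary member of E

pyth_gap = K(g, f_other) - (K(g_a, f_other) + K(g, g_a))
entropy_gap = J(g) - (J(g_a) + K(g, g_a))
print(abs(pyth_gap) < 1e-9, abs(entropy_gap) < 1e-9)
```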
HISTORY HINTS
Boltzmann, Gibbs: in the 19th century
Jaynes, Kullback: in the fifties
Čencov 1972: information projections, differential geometric approach
Barndorff-Nielsen 1977: convex analysis approach to MLE for exponential families
Csiszár 1975, Topsøe 1979: generalized minimizer when the minimum is not attained (Shannon case)
Csiszár 1991: axiomatic approach
Borwein and Lewis 1991: convex analysis approach for general entropies
Csiszár 1995: generalized minimizer, general case
Several recent works employ advanced Orlicz space techniques (Léonard) or differential geometry (Amari and Nagaoka 2000, etc.)
This talk is based on works of Csiszár and Matúš and hopefully will show that classical tools suffice for treating the problem efficiently.

Convex core of a finite measure $Q$ on $\mathbb{R}^d$ (Csiszár-Matúš 2001):
$\mathrm{cc}(Q)$ = intersection of all convex Borel sets with full $Q$-measure
= set of means of all probability measures $P \ll Q$ that have a mean.

For the measure $\mu$ on $X$, define
\[ \mathrm{cc}_\varphi(\mu) = \left\{ \int \varphi g \, d\mu : g \text{ prob. density}, \ \varphi g \text{ integrable} \right\} = \{ a \in \mathbb{R}^d : L_a \ne \emptyset \}. \]
If $\mu$ is finite then $\mathrm{cc}_\varphi(\mu) = \mathrm{cc}(\mu_\varphi)$, where $\mu_\varphi$ is the image of $\mu$ on $\mathbb{R}^d$ under $\varphi$.
Lemma: If $a \in \mathrm{cc}_\varphi(\mu)$, there exists $g \in L_a$ with $\mu(\{x : g(x) > 0\}) < +\infty$, $g$ bounded.

Corollary: $\mathrm{dom}(H) = \mathrm{cc}_\varphi(\mu)$, that is, the necessary condition $L_a \ne \emptyset$ for $H(a) = \inf_{g \in L_a} J(g) < +\infty$ is sufficient as well.

Face of a convex set $C \subset \mathbb{R}^d$: a nonempty convex subset $F \subset C$ such that a convex combination $tx + (1 - t)y$ of $x \in C$ and $y \in C$ (with $0 < t < 1$) belongs to $F$ only if $x, y \in F$.

For a face $F$ of $\mathrm{cc}_\varphi(\mu)$, denote $X_F = \{x : \varphi(x) \in \mathrm{cl}(F)\}$.

Lemma: For $a$ in a face $F$ of $\mathrm{cc}_\varphi(\mu)$, each $g \in L_a$ vanishes outside $X_F$ ($\mu$-a.e.).
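Extended exponential family $\mathrm{ext}(E)$: the union of the families $E_F$ for all faces $F$ of $\mathrm{cc}_\varphi(\mu)$, where, with $X_F = \{x : \varphi(x) \in \mathrm{cl}(F)\}$,
\[ E_F = \{ f_{F,\vartheta} = e^{\langle \vartheta, \varphi \rangle - \Lambda_F(\vartheta)} 1_{X_F} : \vartheta \in \mathrm{dom}(\Lambda_F) \}, \qquad \Lambda_F(\vartheta) = \log \int_{X_F} e^{\langle \vartheta, \varphi \rangle} \, d\mu. \]
Theorem 1 (Csiszár-Matúš 2003): Whenever $L_a \ne \emptyset$, thus $a \in \mathrm{cc}_\varphi(\mu)$, there exists a unique $g_a$, perhaps not in $L_a$, such that
\[ J(g) = H(a) + K(g, g_a), \quad g \in L_a. \]
Moreover, $g_a \in E_F$ for the face $F$ of $\mathrm{cc}_\varphi(\mu)$ whose relative interior contains $a$.

Clearly, if $g_a \in L_a$ then it minimizes $J(g)$ subject to $g \in L_a$. Otherwise, it is a generalized minimizer: every sequence $g_n$ in $L_a$ with $J(g_n) \to H(a)$ satisfies $K(g_n, g_a) \to 0$; in particular, $g_n \to g_a$ in $L^1(\mu)$.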
Generalized Pythagorean identity:
\[ K(g, f) = K(L_a, f) + K(g, g_a), \quad g \in L_a, \ f \in E, \]
where
\[ K(L_a, f) = \inf_{g \in L_a} K(g, f) \ge K(g_a, f). \]
Thus, $g_a$ is the generalized I-projection to $L_a$ of each $f \in E$.

If $a \in \mathrm{ri}(\mathrm{cc}_\varphi(\mu))$, thus $g_a \in E$, then $g_a$ is also the reverse I-projection to $E$ of each $g \in L_a$ with $J(g) < +\infty$, and the duality gap is zero.

$g_a \notin L_a$ can happen if $g_a = f_\vartheta$ with $\vartheta$ on the boundary of $\mathrm{dom}(\Lambda)$; $g_a$ may be the same for several vectors $a$.

Existence of minimizer (I-projection): $g_a \in L_a$ holds for all $a \in \mathrm{ri}(\mathrm{cc}_\varphi(\mu))$ if and only if $\Lambda$ is steep, and for all $a \in \mathrm{ri}(F)$ if and only if $\Lambda_F$ is steep.
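Theorem 2 (Csiszár-Matúš): If $H^{**}(a) = \sup_{\vartheta \in \mathbb{R}^d} \ell_a(\vartheta)$ is finite, there exists a unique density $h_a$ such that
\[ H^{**}(a) - \ell_a(\vartheta) \ge K(h_a, f_\vartheta), \quad \vartheta \in \mathrm{dom}(\Lambda). \]
Moreover, $h_a \in E_F$ where $F$ is the largest face of $\mathrm{cc}_\varphi(\mu)$ with $a \in \mathrm{ri}(F) + \mathrm{barr}(\mathrm{dom}(\Lambda))$.

Here barr denotes the barrier cone: for any convex set $C \subset \mathbb{R}^d$,
\[ \mathrm{barr}(C) = \{ b : \sup_{c \in C} \langle b, c \rangle < +\infty \}. \]
Supplement: $\mathrm{dom}(H^{**}) = \mathrm{cc}_\varphi(\mu) + \mathrm{barr}(\mathrm{dom}(\Lambda))$.

The maximum of $\ell_a(\vartheta)$ is attained (MLE exists) if and only if $h_a \in E$. Otherwise, $h_a$ is a generalized MLE: every sequence $\vartheta_n$ in $\mathrm{dom}(\Lambda)$ with $\ell_a(\vartheta_n) \to H^{**}(a)$ satisfies $K(h_a, f_{\vartheta_n}) \to 0$; in particular, $f_{\vartheta_n} \to h_a$ in $L^1(\mu)$.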
GENERAL ENTROPY FUNCTIONALS

In the sequel, $\gamma$ is a given strictly convex, differentiable function on $(0, +\infty)$; $\gamma(0)$ is defined as $\lim_{t \downarrow 0} \gamma(t)$; later, $\gamma'(0)$ and $\gamma'(+\infty)$ are also defined as limits.

$\gamma$-entropy of a nonnegative function $g$ on $X$:
\[ J_\gamma(g) = \int \gamma(g) \, d\mu. \]
Familiar choices of $\gamma$, in addition to $t \log t$:
$\gamma(t) = -\log t$: Burg entropy
$\gamma(t) = \mathrm{sign}(\alpha - 1) \, t^\alpha$: Rényi (Tsallis) entropy

Problem: minimize $J_\gamma(g)$ subject to $g \in L_a$, where $L_a$ is defined slightly differently than before: attention is not restricted to probability densities. Accordingly, we set $\varphi = (\varphi_0, \varphi_1, \dots, \varphi_d)$ with $\varphi_0$ identically 1, and for $a = (a_0, \dots, a_d) \in \mathbb{R}^{1+d}$,
\[ L_a = \left\{ g \ge 0 : \int \varphi g \, d\mu = a \right\}. \]
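BASIC TOOLS

The convex conjugate of $\gamma$,
\[ \gamma^*(r) = \sup_{t > 0} \left[ rt - \gamma(t) \right], \]
is a nondecreasing convex function, finite and differentiable in $(-\infty, \gamma'(+\infty))$, and its derivative goes to $+\infty$ as $r \uparrow \gamma'(+\infty)$. $\gamma^*(\gamma'(+\infty))$ may or may not be finite.

Denote by $u$ the function on $\mathbb{R}$ equal to $(\gamma^*)'$ in $(-\infty, \gamma'(+\infty))$ and $+\infty$ outside. Then $u(r) = 0$ if $r \le \gamma'(0)$, and $u$ is strictly increasing from 0 to $+\infty$ in the interval $(\gamma'(0), \gamma'(+\infty))$.

Lemma 1. For $r < \gamma'(+\infty)$, with $|x|_+ = \max(x, 0)$,
\[ \gamma'(u(r)) = \max\left[ \gamma'(0), r \right] = r + |\gamma'(0) - r|_+, \]
\[ \gamma(u(r)) + \gamma^*(r) = r \, u(r). \]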
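As a sanity check of Lemma 1, the sketch below evaluates both identities for two choices of $\gamma$ whose conjugates have simple closed forms. These closed forms are worked out here for illustration and are not stated in the talk: for $\gamma(t) = t^2$ one gets $\gamma^*(r) = |r|_+^2 / 4$ and $u(r) = |r|_+ / 2$ with $\gamma'(0) = 0$; for $\gamma(t) = t \log t$ one gets $\gamma^*(r) = u(r) = e^{r - 1}$ with $\gamma'(0) = -\infty$. The dictionary layout and the name `lemma1_residuals` are hypothetical.

```python
import math

def pos(x):
    """Positive part |x|_+ = max(x, 0)."""
    return max(x, 0.0)

# gamma(t) = t^2: gamma'(0) = 0, gamma'(+inf) = +inf
quad = {
    "gamma": lambda t: t * t,
    "dgamma": lambda t: 2.0 * t,
    "conj": lambda r: pos(r) ** 2 / 4.0,
    "u": lambda r: pos(r) / 2.0,
    "dgamma0": 0.0,
}

# gamma(t) = t log t: gamma'(0) = -inf, gamma'(+inf) = +inf
shannon = {
    "gamma": lambda t: t * math.log(t) if t > 0 else 0.0,
    "dgamma": lambda t: math.log(t) + 1.0,
    "conj": lambda r: math.exp(r - 1.0),
    "u": lambda r: math.exp(r - 1.0),
    "dgamma0": -math.inf,
}

def lemma1_residuals(c, r):
    """Residuals of the two identities of Lemma 1 at the point r."""
    t = c["u"](r)
    # gamma'(u(r)) is read as the limit gamma'(0) when u(r) = 0
    d = c["dgamma"](t) if t > 0 else c["dgamma0"]
    first = d - (r + pos(c["dgamma0"] - r))          # gamma'(u(r)) - (r + |gamma'(0) - r|_+)
    second = c["gamma"](t) + c["conj"](r) - r * t    # gamma(u(r)) + gamma*(r) - r u(r)
    return first, second

grid = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
worst = max(abs(x) for c in (quad, shannon) for r in grid
            for x in lemma1_residuals(c, r))
print(worst < 1e-12)
```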
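For nonnegative numbers $t, s$ define
\[ \Delta_\gamma(t, s) = \gamma(t) - \left[ \gamma(s) + \gamma'(s)(t - s) \right] \]
(not meaningful for $s = 0$ if $\gamma'(0) = -\infty$; then we set $\Delta_\gamma(0, 0) = 0$, $\Delta_\gamma(t, 0) = +\infty$ if $t > 0$).

Bregman distance of nonnegative functions $g, h$ on $X$:
\[ B_\gamma(g, h) = \int \Delta_\gamma(g, h) \, d\mu. \]
Clearly, $B_\gamma(g, h) \ge 0$, with equality iff $g = h$ $[\mu]$.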
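For $\gamma(t) = t \log t$ the integrand is $t \log(t/s) - t + s$, so for probability densities the Bregman distance reduces to the Kullback-Leibler distance. The sketch below checks this on a finite space with counting measure, including the conventions for $s = 0$; the sample vectors and function names are assumptions chosen for illustration.

```python
import math

def gamma(t):
    """gamma(t) = t log t, with gamma(0) = 0."""
    return t * math.log(t) if t > 0 else 0.0

def dgamma(t):
    """gamma'(t) = log t + 1 for t > 0; gamma'(0) = -inf is handled in delta."""
    return math.log(t) + 1.0

def delta(t, s):
    """gamma(t) - [gamma(s) + gamma'(s)(t - s)], with the conventions for s = 0."""
    if s == 0:
        return 0.0 if t == 0 else math.inf
    return gamma(t) - (gamma(s) + dgamma(s) * (t - s))

def bregman(g, h):
    """Bregman distance with respect to counting measure."""
    return sum(delta(t, s) for t, s in zip(g, h))

def kl(g, h):
    """Kullback-Leibler distance (h assumed strictly positive where g > 0)."""
    return sum(t * math.log(t / s) for t, s in zip(g, h) if t > 0)

g = [0.5, 0.25, 0.25, 0.0]       # two probability vectors (assumed sample data)
h = [0.1, 0.2, 0.3, 0.4]
diff = bregman(g, h) - kl(g, h)  # should vanish since both vectors sum to 1
print(abs(diff) < 1e-12)
```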
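KEY IDENTITY

Denote:
$L_a$: the family of nonnegative (measurable) functions $g$ on $X$ satisfying the constraints $\int g \varphi \, d\mu = a$ ($a = (a_0, \dots, a_d) \in \mathbb{R}^{1+d}$)
$F_\gamma$: the family of functions $f_\vartheta = u(\langle \vartheta, \varphi \rangle)$ with $\vartheta \in \mathbb{R}^{1+d}$ such that $\int \gamma^*(\langle \vartheta, \varphi \rangle) \, d\mu$ is finite and $\langle \vartheta, \varphi \rangle < \gamma'(+\infty)$ $[\mu]$

Key identity: For $g \in L_a$ and $f_\vartheta \in F_\gamma$, with $|x|_+ = \max(x, 0)$,
\[ J_\gamma(g) - \left[ \langle \vartheta, a \rangle - \int \gamma^*(\langle \vartheta, \varphi \rangle) \, d\mu \right] = B_\gamma(g, f_\vartheta) + \int g \, |\gamma'(0) - \langle \vartheta, \varphi \rangle|_+ \, d\mu. \]
Proof: Immediate, using Lemma 1.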
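The key identity can be verified numerically in a case where the extra term does not vanish. The sketch below assumes $\gamma(t) = t^2$ (so $\gamma'(0) = 0$, $\gamma^*(r) = |r|_+^2/4$, $u(r) = |r|_+/2$, closed forms worked out here rather than taken from the talk), counting measure on a five-point space, and $\varphi = (1, x)$. The sample $g$ and $\vartheta$ are arbitrary, with $\vartheta$ chosen so that $\langle \vartheta, \varphi \rangle$ is negative at some points.

```python
def pos(x):
    """Positive part |x|_+ = max(x, 0)."""
    return max(x, 0.0)

X = [-2.0, -1.0, 0.0, 1.0, 2.0]      # assumed finite space, mu = counting measure
phi = [(1.0, x) for x in X]          # phi = (phi_0, phi_1) with phi_0 identically 1

def gamma(t):
    return t * t

def dgamma(t):
    return 2.0 * t                   # in particular gamma'(0) = 0

def conj(r):
    return pos(r) ** 2 / 4.0         # convex conjugate of t^2 over t > 0

def u(r):
    return pos(r) / 2.0              # derivative of the conjugate

DGAMMA0 = 0.0

def inner(theta, p):
    return sum(ti * pi for ti, pi in zip(theta, p))

def key_identity_sides(g, theta):
    """Return (left side, right side) of the key identity for g and f_theta."""
    r = [inner(theta, p) for p in phi]
    a = [sum(gi * p[k] for gi, p in zip(g, phi)) for k in range(2)]  # moments of g
    f = [u(ri) for ri in r]
    J = sum(gamma(gi) for gi in g)
    dual = inner(theta, a) - sum(conj(ri) for ri in r)
    B = sum(gamma(gi) - gamma(fi) - dgamma(fi) * (gi - fi) for gi, fi in zip(g, f))
    extra = sum(gi * pos(DGAMMA0 - ri) for gi, ri in zip(g, r))
    return J - dual, B + extra

g = [0.1, 0.0, 0.3, 0.7, 1.2]        # arbitrary nonnegative function on X
lhs, rhs = key_identity_sides(g, (0.5, 1.0))
print(abs(lhs - rhs) < 1e-12)
```

Since the right side is nonnegative, the same computation illustrates weak duality: $J_\gamma(g)$ is never below the dual objective.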
Proposition: If $L_a \cap F_\gamma \ne \emptyset$, it consists of a single function $g = f_\vartheta$; this $g$ minimizes $J_\gamma(g)$ subject to $g \in L_a$, and $\vartheta$ maximizes
\[ \langle \vartheta, a \rangle - \int \gamma^*(\langle \vartheta, \varphi \rangle) \, d\mu; \]
these minimum and maximum are equal. [But $\vartheta$ need not be unique, only $f_\vartheta$ is.]

Proof: Immediate from the key identity.

The family $F_\gamma$ is the $\gamma$-analogue of an exponential family in the theory of Shannon entropy maximization. While the functions in $F_\gamma$ need not be probability densities, in the case $\gamma(t) = t \log t$ they are exactly the constant multiples of the probability densities in $F_\gamma$, which form an exponential family in the familiar statistical sense. For other $\gamma$, however, no simple way is apparent to identify the probability densities in $F_\gamma$.
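Convex conjugate of $H_\gamma(a) = \inf_{g \in L_a} J_\gamma(g)$:
\[ H_\gamma^*(\vartheta) = \sup_{a \in \mathbb{R}^{1+d}} \left[ \langle \vartheta, a \rangle - H_\gamma(a) \right], \quad \vartheta \in \mathbb{R}^{1+d}. \]
Lemma 2: If $\mathrm{dom}(H_\gamma) \ne \emptyset$, thus there exists some $g$ with $\int \gamma(g) \, d\mu < +\infty$ and $g \varphi_i$ integrable for $i = 0, \dots, d$, then
\[ H_\gamma^*(\vartheta) = \int \gamma^*(\langle \vartheta, \varphi \rangle) \, d\mu. \]
Dual problem: find the dual value
\[ H_\gamma^{**}(a) = \sup_{\vartheta \in \mathbb{R}^{1+d}} \left[ \langle \vartheta, a \rangle - H_\gamma^*(\vartheta) \right], \]
and if it is finite, find $\vartheta \in \mathbb{R}^{1+d}$ that attains the maximum, if such $\vartheta$ exists (dual attainment). In the latter case, the function $f_\vartheta = u(\langle \vartheta, \varphi \rangle)$ will be called the dual solution, rather than $\vartheta$ itself.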
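Lemma 3: For $\vartheta \in \mathrm{dom}(H_\gamma^*)$, the directional derivative of $H_\gamma^*$ at $\vartheta$ in a direction $\tau$ exists and equals
\[ \int f_\vartheta \langle \tau, \varphi \rangle \, d\mu < +\infty \]
whenever $\vartheta + t\tau \in \mathrm{dom}(H_\gamma^*)$ for some $t > 0$. In particular, $H_\gamma^*$ is differentiable in the interior of its essential domain, with the gradient equal to $\int f_\vartheta \varphi \, d\mu$.

Corollary: For $a \in \mathbb{R}^{1+d}$ with $H_\gamma^{**}(a)$ finite, a dual solution satisfies
\[ \int f_\vartheta \, d\mu \le a_0. \]
Proof: Straightforward calculus. Differentiation within the integral is justified by monotone convergence.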
Proposition 2: If $H_\gamma(a)$ is finite and $a$ is in the relative interior of $\mathrm{dom}(H_\gamma)$, then the primal and dual values are equal, and dual attainment holds. Moreover, the dual solution $f_\vartheta$ satisfies, for each $g \in L_a$,
\[ J_\gamma(g) = H_\gamma(a) + B_\gamma(g, f_\vartheta) + \int g \, |\gamma'(0) - \langle \vartheta, \varphi \rangle|_+ \, d\mu. \]
If, in addition, $H_\gamma^*(\vartheta) = \int \gamma^*(\langle \vartheta, \varphi \rangle) \, d\mu$ is essentially smooth, then the dual solution $f_\vartheta$ belongs to $L_a$, hence it is a primal solution, too.

Proof: The first assertion is a general convex analysis result. The second assertion follows from it, by the key identity. The last assertion follows by Lemma 3, since if $H_\gamma^*$ is essentially smooth, the maximizing $\vartheta$ has to be in the interior of $\mathrm{dom}(H_\gamma^*)$.
Theorem 1: If $\gamma(0) = 0$ then
\[ \mathrm{dom}(H_\gamma) = \{ ta : t \ge 0, \ a \in \mathrm{cc}_\varphi(\mu) \}. \]
If $\gamma(0) = +\infty$ (and $\mu(X) < +\infty$) then $\mathrm{dom}(H_\gamma)$ is either empty, or
\[ \mathrm{dom}(H_\gamma) = \{ ta : t > 0, \ a \in \mathrm{ri}(\mathrm{cc}_\varphi(\mu)) \} = \mathrm{dom}(H_\gamma). \]
In both cases, $\mathrm{dom}(H_\gamma)$ is a cone that has a simple description in terms of $\mathrm{cc}_\varphi(\mu)$.

In the case $\gamma(0) = +\infty$, no general criterion appears available to determine whether $\mathrm{dom}(H_\gamma)$ is empty or not, but if it is nonempty, then Proposition 2 tells the story, as $\mathrm{dom}(H_\gamma)$ is relatively open.

In the sequel, we concentrate on the case $\gamma(0) = 0$.
Given nonzero $a \in \mathrm{dom}(H_\gamma)$, equivalently $a_0 > 0$ and $a_0^{-1} a \in \mathrm{cc}_\varphi(\mu)$, let $F(a)$ denote the face of $\mathrm{cc}_\varphi(\mu)$ whose relative interior contains $a_0^{-1} a$. Let $\nu$ denote the restriction of $\mu$ to $X_{F(a)} = \varphi^{-1}(\mathrm{cl}(F(a)))$.

Each $g \in L_a$ vanishes outside $X_{F(a)}$, hence
\[ H_\gamma(a) = \inf_{g \in L_a} \int \gamma(g) \, d\nu. \]
Proposition 2 applies to this $a$ and the measure $\nu$ in the role of $\mu$, because $a_0^{-1} a$ is in the relative interior of the face $F(a)$ equal to $\mathrm{cc}_\varphi(\nu)$.
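It follows, provided $H_\gamma(a) > -\infty$, that the maximum of
\[ \langle \vartheta, a \rangle - \int \gamma^*(\langle \vartheta, \varphi \rangle) \, d\nu \]
is attained, and with a maximizing $\vartheta$, the function
\[ f_{F,\vartheta} = u(\langle \vartheta, \varphi \rangle) \, 1_{X_{F(a)}}, \qquad X_{F(a)} = \varphi^{-1}(\mathrm{cl}(F(a))), \]
satisfies for all $g \in L_a$
\[ J_\gamma(g) = H_\gamma(a) + B_\gamma(g, f_{F,\vartheta}) + \int g \, |\gamma'(0) - \langle \vartheta, \varphi \rangle|_+ \, d\mu. \]
Theorem 2: To every $a \ne 0$ with $H_\gamma(a)$ finite, there exists a (unique) generalized primal solution, of the form $f_{F,\vartheta} = u(\langle \vartheta, \varphi \rangle) \, 1_{X_{F(a)}}$, and it satisfies the above identity. Essential smoothness of
\[ \vartheta \mapsto \int_{X_{F(a)}} \gamma^*(\langle \vartheta, \varphi \rangle) \, d\mu \]
is a sufficient condition for primal attainment.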
More information1 Error in Euler s Method
1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of
More informationSome Research Problems in Uncertainty Theory
Journal of Uncertain Systems Vol.3, No.1, pp.3-10, 2009 Online at: www.jus.org.uk Some Research Problems in Uncertainty Theory aoding Liu Uncertainty Theory Laboratory, Department of Mathematical Sciences
More informationLecture 7: Continuous Random Variables
Lecture 7: Continuous Random Variables 21 September 2005 1 Our First Continuous Random Variable The back of the lecture hall is roughly 10 meters across. Suppose it were exactly 10 meters, and consider
More informationMATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.
MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column
More informationON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING. 0. Introduction
ON SEQUENTIAL CONTINUITY OF COMPOSITION MAPPING Abstract. In [1] there was proved a theorem concerning the continuity of the composition mapping, and there was announced a theorem on sequential continuity
More informationOptimal File Sharing in Distributed Networks
Optimal File Sharing in Distributed Networks Moni Naor Ron M. Roth Abstract The following file distribution problem is considered: Given a network of processors represented by an undirected graph G = (V,
More informationSection 1.3 P 1 = 1 2. = 1 4 2 8. P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., = 1 2 4.
Difference Equations to Differential Equations Section. The Sum of a Sequence This section considers the problem of adding together the terms of a sequence. Of course, this is a problem only if more than
More informationNumerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen
(für Informatiker) M. Grepl J. Berger & J.T. Frings Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2010/11 Problem Statement Unconstrained Optimality Conditions Constrained
More information8 Divisibility and prime numbers
8 Divisibility and prime numbers 8.1 Divisibility In this short section we extend the concept of a multiple from the natural numbers to the integers. We also summarize several other terms that express
More informationMatrix Representations of Linear Transformations and Changes of Coordinates
Matrix Representations of Linear Transformations and Changes of Coordinates 01 Subspaces and Bases 011 Definitions A subspace V of R n is a subset of R n that contains the zero element and is closed under
More informationProbability Theory. Florian Herzog. A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T..
Probability Theory A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T.. Florian Herzog 2013 Probability space Probability space A probability space W is a unique triple W = {Ω, F,
More informationWHEN DOES A RANDOMLY WEIGHTED SELF NORMALIZED SUM CONVERGE IN DISTRIBUTION?
WHEN DOES A RANDOMLY WEIGHTED SELF NORMALIZED SUM CONVERGE IN DISTRIBUTION? DAVID M MASON 1 Statistics Program, University of Delaware Newark, DE 19717 email: davidm@udeledu JOEL ZINN 2 Department of Mathematics,
More informationStatistical Machine Translation: IBM Models 1 and 2
Statistical Machine Translation: IBM Models 1 and 2 Michael Collins 1 Introduction The next few lectures of the course will be focused on machine translation, and in particular on statistical machine translation
More informationMax-Min Representation of Piecewise Linear Functions
Beiträge zur Algebra und Geometrie Contributions to Algebra and Geometry Volume 43 (2002), No. 1, 297-302. Max-Min Representation of Piecewise Linear Functions Sergei Ovchinnikov Mathematics Department,
More informationFEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL
FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint
More informationAn Introduction to Linear Programming
An Introduction to Linear Programming Steven J. Miller March 31, 2007 Mathematics Department Brown University 151 Thayer Street Providence, RI 02912 Abstract We describe Linear Programming, an important
More information1 Solving LPs: The Simplex Algorithm of George Dantzig
Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.
More information1 Norms and Vector Spaces
008.10.07.01 1 Norms and Vector Spaces Suppose we have a complex vector space V. A norm is a function f : V R which satisfies (i) f(x) 0 for all x V (ii) f(x + y) f(x) + f(y) for all x,y V (iii) f(λx)
More informationThis asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.
3. Axioms of Set theory Before presenting the axioms of set theory, we first make a few basic comments about the relevant first order logic. We will give a somewhat more detailed discussion later, but
More information