Maximum Entropy. Information Theory 2013 Lecture 9 Chapter 12. Tohid Ardeshiri. May 22, 2013
1 Maximum Entropy Information Theory 2013 Lecture 9 Chapter 12 Tohid Ardeshiri May 22, 2013
2 Why Maximum Entropy distribution?
$$\max_{f(x)} h(f) \quad \text{subject to} \quad E\, r(X) = \alpha$$
The temperature of a gas corresponds to the expected squared velocity of its molecules. What about the distribution of the velocity? How are the velocities of the molecules distributed in the presence of gravity, subject to a total energy constraint?
1. Maxwell-Boltzmann distribution,
2. Exponential distribution of the air density in the atmosphere in the vertical direction,
3. 3/5 of the energy is kinetic and 2/5 is potential,
4. The distribution of the velocities is independent of the height of the molecule.
3 Outline
This lecture will cover:
Maximum Entropy Distributions.
Anomalous Maximum Entropy Problem.
Spectrum Estimation.
Entropy of a Gaussian Process.
Burg's Entropy Theorem.
All illustrations are borrowed from the book, Wikipedia and the lecture given by Thomas M. Cover at Stanford: users/web/pg/view_subject.php?subject=ee376b_SPRING_2010_2011
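4 Maximum Entropy Distributions
Maximize the entropy $h(f)$ over all probability densities $f$ satisfying
1. $f(x) \ge 0$, with equality outside the support set $S$,
2. $\int_S f(x)\,dx = 1$,
3. $\int_S f(x)\, r_i(x)\,dx = \alpha_i$, for $1 \le i \le m$.
Example 1: $S = (-\infty, \infty)$, $EX = 0$, $EX^2 = \sigma^2$ $\Rightarrow$ $f(x) = \mathcal{N}(x; 0, \sigma^2)$.
Example 2: $S = [0, +\infty)$, $EX = \lambda$ $\Rightarrow$ $f(x) = \mathrm{Exp}(x; \lambda^{-1})$, the exponential density with rate $\lambda^{-1}$.
Example 3: $S = [a, b]$, no constraint $\Rightarrow$ $f(x) = \mathcal{U}(x; a, b)$.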
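The three examples can be sanity-checked with closed-form differential entropies (in nats). The sketch below is only an illustration: the competing densities (Laplace, uniform, and a uniform on $[0, 2\mu]$) and all function names are choices made here, not part of the lecture.

```python
import math

def gaussian_entropy(var):
    # h(N(0, var)) = 1/2 ln(2*pi*e*var)
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def laplace_entropy(var):
    # Laplace with scale b has variance 2*b^2 and entropy 1 + ln(2b)
    b = math.sqrt(var / 2.0)
    return 1.0 + math.log(2.0 * b)

def uniform_entropy_from_variance(var):
    # Uniform of width w has variance w^2/12 and entropy ln(w)
    return math.log(math.sqrt(12.0 * var))

def exponential_entropy(mean):
    # Exponential with mean mu has entropy 1 + ln(mu)
    return 1.0 + math.log(mean)

def uniform_entropy_from_mean(mean):
    # Uniform on [0, 2*mu] has mean mu and entropy ln(2*mu)
    return math.log(2.0 * mean)

# Example 1: same second moment, the Gaussian has the largest entropy.
h_gauss = gaussian_entropy(1.0)
h_laplace = laplace_entropy(1.0)
h_uniform = uniform_entropy_from_variance(1.0)

# Example 2: same mean on [0, inf), the exponential beats the uniform.
h_exp = exponential_entropy(1.0)
h_unif_pos = uniform_entropy_from_mean(1.0)

print(h_gauss, h_laplace, h_uniform)
print(h_exp, h_unif_pos)
```

Comparing against two hand-picked alternatives does not prove optimality; it only illustrates the ordering that the theorem guarantees.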
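5 Finding the solution using Calculus
Maximize the entropy $h(f)$ over all probability densities $f$ satisfying
1. $f(x) \ge 0$, with equality outside the support set $S$,
2. $\int_S f(x)\,dx = 1$,
3. $\int_S f(x)\, r_i(x)\,dx = \alpha_i$, for $1 \le i \le m$.
Entropy is a concave function defined over a convex set. Form the functional
$$J(f) = -\int f \ln f + \lambda_0 \int f + \sum_{i=1}^{m} \lambda_i \int r_i f$$
Differentiating with respect to $f(x)$ gives
$$\frac{\partial J}{\partial f(x)} = -\ln f(x) - 1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i(x)$$
and setting the derivative to zero yields
$$f(x) = e^{-1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i(x)}, \quad x \in S$$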
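As a worked instance of this recipe, consider the exponential example: $S = [0, \infty)$ with the single constraint $EX = \mu$ (the symbol $\mu$ is used here for the mean to avoid a clash with the multipliers $\lambda_i$). The following derivation is a sketch under that assumption.

```latex
\begin{align*}
f(x) &= e^{-1+\lambda_0+\lambda_1 x}, \quad x \ge 0, \quad \lambda_1 < 0 \\
\int_0^\infty f(x)\,dx &= \frac{e^{-1+\lambda_0}}{-\lambda_1} = 1
  \;\Rightarrow\; e^{-1+\lambda_0} = -\lambda_1 \\
\int_0^\infty x f(x)\,dx &= \frac{e^{-1+\lambda_0}}{\lambda_1^2} = -\frac{1}{\lambda_1} = \mu
  \;\Rightarrow\; \lambda_1 = -\frac{1}{\mu} \\
f(x) &= \frac{1}{\mu}\, e^{-x/\mu}, \qquad
h(f) = -\int_0^\infty f \ln f = \ln \mu + \frac{EX}{\mu} = 1 + \ln \mu
\end{align*}
```

The multipliers are fixed entirely by the constraints, which is the pattern the general theorem relies on.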
6 Theorem: Maximum Entropy Distribution
Theorem: Let $f(x) = e^{-1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i(x)}$, $x \in S$, where $\lambda_0, \lambda_1, \ldots, \lambda_m$ are chosen so that $f$ satisfies
1. $f(x) \ge 0$, with equality outside the support set $S$,
2. $\int_S f(x)\,dx = 1$,
3. $\int_S f(x)\, r_i(x)\,dx = \alpha_i$, for $1 \le i \le m$.
Then $f$ uniquely maximizes $h(g)$ over all probability densities $g$ satisfying the constraints.
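7 Proof using Information Inequality
Theorem: Let $f(x) = e^{-1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i(x)}$, $x \in S$, where $\lambda_0, \lambda_1, \ldots, \lambda_m$ are chosen so that $f$ satisfies
1. $f(x) \ge 0$, with equality outside the support set $S$,
2. $\int_S f(x)\,dx = 1$,
3. $\int_S f(x)\, r_i(x)\,dx = \alpha_i$, for $1 \le i \le m$.
Then $f$ uniquely maximizes $h(g)$ over all probability densities $g$ satisfying the constraints.
Proof: Let $g$ be any density satisfying the constraints. Then
$$h(g) = -\int g \ln g = -\int g \ln\!\left(\frac{g}{f}\, f\right) = -D(g \| f) - \int g \ln f$$
$$\le -\int g \ln f = -\int g \left(-1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i\right) = -\int f \left(-1 + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i\right) = -\int f \ln f = h(f)$$
The middle equality holds because $g$ and $f$ satisfy the same constraints.
Note: Equality holds iff $D(g \| f) = 0$, i.e. iff $g = f$ for all $x$ except on a set of measure 0.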
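The decomposition $h(g) = -D(g\|f) - \int g \ln f$ used in the proof can be checked numerically. The sketch below assumes, purely for illustration, that $f$ is the standard Gaussian (the maximizer under $EX = 0$, $EX^2 = 1$) and $g$ is a Laplace density with the same two moments; the grid and the simple Riemann sum are choices made here.

```python
import numpy as np

# Integration grid; both densities are negligible beyond |x| = 40.
x = np.linspace(-40.0, 40.0, 80001)
dx = x[1] - x[0]

# g: Laplace density with zero mean and unit variance (scale b = 1/sqrt(2)).
b = 1.0 / np.sqrt(2.0)
log_g = -np.log(2.0 * b) - np.abs(x) / b
g = np.exp(log_g)

# f: standard Gaussian, the maximum entropy density for these constraints.
log_f = -0.5 * np.log(2.0 * np.pi) - 0.5 * x**2
f = np.exp(log_f)

def integrate(values):
    # Plain Riemann sum; accurate here because the tails vanish.
    return float(np.sum(values) * dx)

h_g = -integrate(g * log_g)            # entropy of g
h_f = -integrate(f * log_f)            # entropy of f
kl_gf = integrate(g * (log_g - log_f)) # D(g || f)
cross = -integrate(g * log_f)          # -∫ g ln f

print(h_g, h_f, kl_gf, cross)
```

Because $g$ matches the moments of $f$, the cross term $-\int g \ln f$ equals $h(f)$, so the gap $h(f) - h(g)$ is exactly $D(g\|f)$.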
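8 Anomalous Maximum Entropy Problem
Maximize the entropy $h(f)$ over all probability densities $f$ satisfying
$$\int f(x)\,dx = 1, \quad \int x f(x)\,dx = \alpha_1, \quad \int x^2 f(x)\,dx = \alpha_2$$
The solution has the form $f(x) = e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2}$, i.e. $\mathcal{N}(\alpha_1, \alpha_2 - \alpha_1^2)$.
Now add the third-moment constraint $\int x^3 f(x)\,dx = \alpha_3$. The candidate form is
$$f(x) = e^{\lambda_0 + \lambda_1 x + \lambda_2 x^2 + \lambda_3 x^3}$$
which cannot be normalized over the real line when $\lambda_3 \ne 0$, so the maximum is not attained. The supremum is still
$$\sup h(f) = h\big(\mathcal{N}(\alpha_1, \alpha_2 - \alpha_1^2)\big) = \frac{1}{2} \ln 2\pi e (\alpha_2 - \alpha_1^2)$$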
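9 Entropy rates of a Gaussian Process
The differential entropy rate of a stochastic process $\{X_i\}$, $X_i \in \mathbb{R}$, is
$$h(\mathcal{X}) = \lim_{n \to \infty} \frac{1}{n} h(X_1, X_2, \ldots, X_n) = \lim_{n \to \infty} h(X_n \mid X_{n-1}, \ldots, X_1)$$
Since the stochastic process is Gaussian, the conditional distribution is also Gaussian, and hence
$$h(X_n \mid X_{n-1}, \ldots, X_1) = \frac{1}{2} \log 2\pi e \sigma_n^2$$
where $\sigma_n^2$ is the variance of the error in the best estimate of $X_n$ given $X_{n-1}, \ldots, X_1$. Therefore
$$\lim_{n \to \infty} h(X_n \mid X_{n-1}, \ldots, X_1) = \frac{1}{2} \log 2\pi e \sigma^2$$
where $\sigma^2$ is the variance of the error in the best estimate of $X_n$ given the infinite past. Thus
$$h(\mathcal{X}) = \frac{1}{2} \log 2\pi e \sigma^2$$
The entropy rate corresponds to the minimum mean-squared error of the best estimator of a sample of the process given the infinite past:
$$\sigma^2 = \frac{1}{2\pi e}\, 2^{2 h(\mathcal{X})}$$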
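The last two formulas are inverses of each other when the entropy rate is measured in bits. A minimal sketch of that round trip (the function names are hypothetical, introduced only for this illustration):

```python
import math

def entropy_rate_bits(sigma2):
    # h = 1/2 log2(2*pi*e*sigma^2) for a Gaussian process whose
    # one-step prediction error variance is sigma2.
    return 0.5 * math.log2(2.0 * math.pi * math.e * sigma2)

def prediction_error_variance(h_bits):
    # Inverse relation: sigma^2 = 2^(2h) / (2*pi*e).
    return 2.0 ** (2.0 * h_bits) / (2.0 * math.pi * math.e)

h = entropy_rate_bits(0.3)
recovered = prediction_error_variance(h)
print(h, recovered)
```

A lower entropy rate therefore means a more predictable process in the mean-squared sense.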
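10 Entropy rates of a Gaussian Process II
For a stationary Gaussian stochastic process we have
$$h(X_1, X_2, \ldots, X_n) = \frac{1}{2} \log (2\pi e)^n \left| K^{(n)} \right|$$
where
$$K^{(n)}_{ij} = R(i - j) = E (X_i - E X_i)(X_j - E X_j)$$
Kolmogorov has shown that
$$h(\mathcal{X}) = \frac{1}{2} \log (2\pi e) + \frac{1}{4\pi} \int_{-\pi}^{\pi} \log S(\lambda)\, d\lambda$$
where $S(\lambda)$ is the power spectral density of the process.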
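Both expressions can be compared numerically on a process whose answer is known. The sketch below assumes a first-order autoregressive Gaussian process $X_i = \phi X_{i-1} + Z_i$ with $Z_i \sim \mathcal{N}(0, \sigma^2)$; the values of `phi` and `sigma2` and all names are illustrative assumptions. Entropies are in nats.

```python
import numpy as np

phi, sigma2 = 0.6, 1.0  # hypothetical AR(1) parameters

def ar1_autocovariance(lag):
    # R(k) = sigma^2 * phi^|k| / (1 - phi^2) for a stationary AR(1) process.
    return sigma2 * phi ** np.abs(lag) / (1.0 - phi**2)

def block_entropy_per_sample(n):
    # (1/n) * 1/2 * ln((2*pi*e)^n |K^(n)|), with K^(n)_ij = R(i - j).
    idx = np.arange(n)
    K = ar1_autocovariance(np.subtract.outer(idx, idx))
    sign, logdet = np.linalg.slogdet(K)
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) + logdet) / n

# Kolmogorov's formula with S(lambda) = sigma^2 / |1 - phi e^{-i lambda}|^2.
lam = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
S = sigma2 / np.abs(1.0 - phi * np.exp(-1j * lam)) ** 2
# (1/4pi) * integral over one period equals half the grid average of log S.
h_kolmogorov = 0.5 * np.log(2.0 * np.pi * np.e) + 0.5 * np.mean(np.log(S))

# Entropy rate from the innovation variance.
h_limit = 0.5 * np.log(2.0 * np.pi * np.e * sigma2)

print(block_entropy_per_sample(10), block_entropy_per_sample(200))
print(h_kolmogorov, h_limit)
```

The per-sample block entropy decreases toward the limit as $n$ grows, while the spectral formula gives the limit directly.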
11 Spectrum estimation
Autocorrelation function for a stationary zero-mean stochastic process $\{X_i\}$:
$$R(k) = E X_i X_{i+k}$$
Power Spectral Density:
$$S(\lambda) = \sum_{m=-\infty}^{\infty} R(m)\, e^{-im\lambda}, \quad -\pi < \lambda \le \pi$$
which is indicative of the structure of the process.
Periodogram, truncating and windowing: the autocorrelations are estimated from $n$ samples by
$$\hat{R}(k) = \frac{1}{n-k} \sum_{i=1}^{n-k} X_i X_{i+k}$$
Burg suggested that, instead of setting the autocorrelations at high lags to zero, we set them to values that make the fewest assumptions about the data, i.e. values that maximize the entropy rate of the process. Burg assumed the process to be stationary and Gaussian and found that the process which maximizes the entropy subject to the correlation constraints is an autoregressive Gaussian process of appropriate order.
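The sample autocorrelation and the "truncate at high lags" spectrum that Burg's method replaces can be sketched as follows. The function names are hypothetical, and the spectrum here uses a plain rectangular truncation with no window, which is only one of the options the slide alludes to.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    # R_hat(k) = 1/(n-k) * sum_{i=1}^{n-k} x_i x_{i+k}, for k = 0..max_lag.
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / (n - k)
                     for k in range(max_lag + 1)])

def truncated_spectrum(r_hat, lam):
    # S(lambda) = R(0) + 2 * sum_{k>=1} R(k) cos(k*lambda), with all
    # autocorrelations beyond the last estimated lag set to zero.
    lam = np.asarray(lam, dtype=float)
    k = np.arange(1, len(r_hat))
    return r_hat[0] + 2.0 * np.sum(
        r_hat[1:, None] * np.cos(np.outer(k, lam)), axis=0)

r_hat = sample_autocorrelation([1.0, 2.0, 3.0, 4.0], max_lag=2)
spectrum_at_zero = truncated_spectrum(r_hat, np.array([0.0]))
print(r_hat, spectrum_at_zero)
```

Estimates at large lags average over few products and are therefore noisy, which is the practical motivation for truncating or windowing them.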
12 Burg's Maximum Entropy Theorem
Theorem: The maximum entropy rate stochastic process $\{X_i\}$ satisfying the constraint
$$E X_i X_{i+k} = \alpha_k, \quad k = 0, 1, \ldots, p, \text{ for all } i, \quad (1)$$
is the $p$-th order Gauss-Markov process of the form
$$X_i = -\sum_{k=1}^{p} a_k X_{i-k} + Z_i$$
where the $Z_i$ are i.i.d. $\mathcal{N}(0, \sigma^2)$ and $a_1, a_2, \ldots, a_p, \sigma^2$ are chosen to satisfy (1).
Remark: We do not assume that $\{X_i\}$ is
1. zero mean,
2. Gaussian, or
3. wide-sense stationary.
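Choosing $a_1, \ldots, a_p, \sigma^2$ to satisfy (1) amounts to solving the Yule-Walker equations. The sketch below assumes the sign convention $X_i = -\sum_{k=1}^{p} a_k X_{i-k} + Z_i$, $p \ge 1$, and a positive definite set of constraints; the function names are hypothetical.

```python
import numpy as np

def burg_max_entropy_model(alpha):
    # alpha = [alpha_0, ..., alpha_p]. Under the assumed convention the
    # Yule-Walker equations read
    #   sum_k alpha_{|l-k|} a_k = -alpha_l,          l = 1..p
    #   sigma^2 = alpha_0 + sum_k a_k alpha_k
    alpha = np.asarray(alpha, dtype=float)
    p = len(alpha) - 1
    idx = np.arange(p)
    toeplitz = alpha[np.abs(np.subtract.outer(idx, idx))]
    a = np.linalg.solve(toeplitz, -alpha[1:])
    sigma2 = alpha[0] + np.dot(a, alpha[1:])
    return a, sigma2

def max_entropy_spectrum(a, sigma2, lam):
    # S(lambda) = sigma^2 / |1 + sum_k a_k e^{-ik lambda}|^2
    lam = np.asarray(lam, dtype=float)
    k = np.arange(1, len(a) + 1)
    denom = 1.0 + np.exp(-1j * np.outer(lam, k)) @ a
    return sigma2 / np.abs(denom) ** 2

# Constraints generated by X_i = 0.5 X_{i-1} + Z_i with unit-variance noise:
# alpha_0 = 1/(1 - 0.25) = 4/3, alpha_1 = 0.5 * alpha_0 = 2/3.
a, sigma2 = burg_max_entropy_model([4.0 / 3.0, 2.0 / 3.0])
spectrum_at_zero = max_entropy_spectrum(a, sigma2, np.array([0.0]))
print(a, sigma2, spectrum_at_zero)
```

With the minus sign in the model, the recovered coefficient is $a_1 = -0.5$, which corresponds to the generating recursion $X_i = 0.5 X_{i-1} + Z_i$.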
13 Proof of Burg's Theorem I
Let $X_1, X_2, \ldots, X_n$ be any stochastic process that satisfies the constraints.
Let $Z_1, Z_2, \ldots, Z_n$ be a Gaussian process with the same covariance matrix as $X_1, X_2, \ldots, X_n$.
Let $Y_1, Y_2, \ldots, Y_n$ be a $p$-th order Gauss-Markov process with the same distribution as $Z_1, Z_2, \ldots, Z_n$ for all orders up to $p$.
Recall that the multivariate normal distribution maximizes the entropy over all vector-valued random variables under a covariance constraint.
Recall that conditioning reduces the entropy.
Since the conditional entropy depends only on the $p$-th order distribution,
$$h(Z_i \mid Z_{i-1}, Z_{i-2}, \ldots, Z_{i-p}) = h(Y_i \mid Y_{i-1}, Y_{i-2}, \ldots, Y_{i-p}), \quad h(Z_1, \ldots, Z_p) = h(Y_1, \ldots, Y_p)$$
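14 Proof of Burg's Theorem II
$$h(X_1, X_2, \ldots, X_n) \le h(Z_1, Z_2, \ldots, Z_n)$$
$$= h(Z_1, \ldots, Z_p) + \sum_{i=p+1}^{n} h(Z_i \mid Z_{i-1}, Z_{i-2}, \ldots, Z_1)$$
$$\le h(Z_1, \ldots, Z_p) + \sum_{i=p+1}^{n} h(Z_i \mid Z_{i-1}, Z_{i-2}, \ldots, Z_{i-p})$$
$$= h(Y_1, \ldots, Y_p) + \sum_{i=p+1}^{n} h(Y_i \mid Y_{i-1}, Y_{i-2}, \ldots, Y_{i-p})$$
$$= h(Y_1, \ldots, Y_n)$$
where the last step follows by the Markovity of $Y_i$.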
15 Proof of Burg's Theorem III
Dividing by $n$ and taking the limit, we obtain
$$\lim_{n \to \infty} \frac{1}{n} h(X_1, X_2, \ldots, X_n) \le \lim_{n \to \infty} \frac{1}{n} h(Y_1, \ldots, Y_n) = \frac{1}{2} \log 2\pi e \sigma^2$$
which is the entropy rate of the Gauss-Markov process. Hence, the maximum entropy rate stochastic process satisfying the constraints is the $p$-th order Gauss-Markov process satisfying the constraints.
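A small finite-dimensional illustration of the result: fix $\alpha_0$ and $\alpha_1$ (so $p = 1$), leave the lag-2 covariance free, and look for the value that maximizes the entropy of a Gaussian triple. The numbers and the grid search below are assumptions made for this sketch; the first-order Gauss-Markov extension predicts $R(2) = \alpha_1^2 / \alpha_0$.

```python
import numpy as np

alpha0, alpha1 = 1.0, 0.5  # hypothetical constraints for p = 1

def gaussian_entropy_of_triple(r2):
    # Entropy (nats) of a zero-mean Gaussian (X1, X2, X3) whose Toeplitz
    # covariance has lags alpha0, alpha1 and a free lag-2 value r2.
    K = np.array([[alpha0, alpha1, r2],
                  [alpha1, alpha0, alpha1],
                  [r2, alpha1, alpha0]])
    sign, logdet = np.linalg.slogdet(K)
    return 0.5 * (3.0 * np.log(2.0 * np.pi * np.e) + logdet)

# Candidate lag-2 values; all keep the covariance positive definite.
candidates = np.linspace(-0.4, 0.9, 131)
entropies = np.array([gaussian_entropy_of_triple(r) for r in candidates])
best_r2 = float(candidates[np.argmax(entropies)])

# Lag-2 covariance of the first-order Gauss-Markov process.
markov_r2 = alpha1**2 / alpha0
print(best_r2, markov_r2)
```

Setting the unknown lag to zero, as a truncated estimate would, gives a strictly smaller entropy than the Markov extension.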
16 A bare-bones summary of the proof
The entropy of a finite segment of a stochastic process is bounded above by the entropy of a segment of a Gaussian random process with the same covariance structure. This entropy is in turn bounded above by the entropy of the minimal order Gauss-Markov process satisfying the given covariance constraints. Such a process exists and has a convenient characterization by means of the Yule-Walker equations.