Neural Networks and Learning Systems


Exercise Collection, Class 9, March 2010

[Cover figure: a two-layer network with inputs x_1, x_2, ..., x_N, first-layer weights w_11, ..., w_NN, hidden units h_1, ..., h_N, second-layer weights W_11, ..., W_NN, and output y.]

(c) Medical Informatics, IMT, LiTH

Contents

Exercises
  1 Genetic Algorithms
Solutions
Formulary
  1 Activation functions
  2 Cost functions
  3 Learning rules
  4 Probability functions
  5 Miscellaneous

Exercises

1. Genetic Algorithms

1.1. (Crossover and Mutation) We start off with a number of definitions in order to make the further calculations easier. Define the order of a schema S as the number of fixed positions, o(S). The distance between the first and the last fixed position in the schema is denoted δ(S).

a) Assume a crossover between two strings of length l takes place by means of a two-step process. First a random position k is drawn from a rectangular distribution on the interval {1, ..., l - 1}. Then the strings swap the parts between and including positions k + 1 and l with each other. Derive a lower bound for the probability p_s that a schema survives a crossover in strings of length l, given the probability p_k of the crossover itself. (A code sketch of these two operators follows the exercise list.)

b) Let us also consider the possibility of mutation. The probability that a given position is affected is assumed to be p_m. What is the lower bound for the survivability of a schema now?

1.2. (The Schema Theorem) Show the Schema Theorem, i.e. that the number of copies of a schema S in a population will increase or decrease exponentially with respect to the relative fitness of the schema. Disregard crossover and mutation effects.

1.3. (The survival of the fittest) A population contains strings with the following corresponding fitness values:

No.   String   Fitness

The probability of mutation is p_m = 0.01 and the probability of crossover is p_k = 1.0. Calculate the expected number of schemata matching S_1 = 1··· and S_2 = 0···1, respectively, in the next generation. Comments?

1.4. (Live and let die) Let us in this exercise ignore the possibility of a schema being destroyed by crossover or mutation.

a) A schema S_1 with one representative in the first generation has a 25% larger fitness value than the average in the population of 100 individuals. After how many generations will this schema appear in every individual of the population?

b) A schema S_2 appearing in 60 of the 100 individuals in the first generation has a 10% lower fitness value than the average. After how many generations will this schema be extinct?

1.5. (*) (Two- and k-armed bandits) In the case of the two-armed bandit, where one arm pays out m_1 on average with variance s_1^2 while the second pays out m_2 on average with variance s_2^2, one can use the following tactic. We have N pulls at our disposal. Of these, we use 2n < N to pull each arm n times, and the remaining N - 2n to pull the arm estimated to be the best. The expected loss if we use this tactic is given by:

L(N, n) = |m_1 - m_2| \, (n + p(n)(N - 2n)),

where p(n) denotes the probability that, after the initial 2n pulls, we choose the wrong arm to pull for the remaining N - 2n pulls. Now, p(n) can be approximated by the tail of a normal distribution:

p(n) \approx \frac{e^{-x^2/2}}{\sqrt{2\pi}\, x}, \qquad x = \frac{m_1 - m_2}{\sqrt{s_1^2 + s_2^2}} \sqrt{n}.

a) If we follow a policy minimizing the loss L, how much more frequently should we pull the arm estimated to be the best compared to the arm we estimate to be the worst?

b) Assume that the optimal relation between the best arm and the other arms derived in part a) still applies in the case of the k-armed bandit. What parallel can you then draw to the behavior of genetic algorithms?
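Before turning to the solutions, the sketch promised in exercise 1.1: a minimal Python implementation of the two operators on bit strings. The function names, defaults and the use of Python are our own choices, not part of the original collection.

```python
import random

def crossover(p1, p2, p_k=1.0):
    """Two-step crossover from exercise 1.1a: with probability p_k, draw the
    cut point k uniformly from {1, ..., l-1}, then swap the parts between and
    including positions k+1 and l (positions are 1-indexed as in the text)."""
    l = len(p1)
    if random.random() >= p_k:
        return p1, p2                        # no crossover this time
    k = random.randint(1, l - 1)
    return p1[:k] + p2[k:], p2[:k] + p1[k:]

def mutate(s, p_m=0.01):
    """Flip each position independently with probability p_m (exercise 1.1b)."""
    return ''.join(('1' if b == '0' else '0') if random.random() < p_m else b
                   for b in s)

print(crossover('11111', '00000'))   # e.g. ('11000', '00111') for k = 2
```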

Solutions

Answer 1.1

a) The probability of such a crossover destroying a schema is given by the probability of the randomized position ending up somewhere between the fixed positions of the schema. The number of such positions is given by the length of the schema, δ(S). Since the crossover position is drawn from a rectangular distribution over the possible sites in the interval {1, ..., l - 1}, the probability of survival is

p_s = 1 - \frac{\delta(S)}{l - 1}.

If p_k is the probability of applying the crossover operator, the lower bound for a schema to survive a crossover becomes

p_s \ge 1 - p_k \frac{\delta(S)}{l - 1}.

This is a lower bound because the schema might live on in another individual of the population, which we have not taken into consideration.

b) In order for the schema to avoid damage from mutation, all fixed positions in the schema must come through. The number of fixed positions is given by the order of the schema, o(S). The probability of surviving mutation is then

(1 - p_m)^{o(S)} \approx 1 - o(S)\, p_m,

where the approximation applies when p_m \ll 1. Neglecting the second-order terms, the total lower bound for surviving both crossover and mutation becomes

p_s \ge 1 - p_k \frac{\delta(S)}{l - 1} - o(S)\, p_m.

Answer 1.2

At reproduction an individual is chosen with its relative fitness as probability, f_i / \sum_j f_j. A schema is therefore chosen with probability f(S) / \sum_i f_i, where f(S) is the mean fitness of all individuals in the population carrying the schema. Looking at the expected number of representatives of a schema S in the next generation, given the number m(S, t) in the current generation, we get

m(S, t+1) = m(S, t) \, \frac{f(S)}{\sum_i f_i} \, n,

because the population has size n and we consequently draw n random samples. We can rewrite this expression with the help of f_{ave}, the mean fitness of the entire population:

m(S, t+1) = m(S, t) \, \frac{f(S)}{f_{ave}}.

Now, if a schema on average has a fitness c f_{ave} greater than the population average, it will grow according to the recursive expression

m(S, t+1) = m(S, t) \, \frac{f_{ave} + c f_{ave}}{f_{ave}} = m(S, t)(1 + c) \quad \Rightarrow \quad m(S, t) = m(S, 0)(1 + c)^t,

i.e. the genetic algorithm leads to an exponential growth of such a schema. With the same line of reasoning we see that schemata with less fitness than average will die off from the population according to the same exponential function.
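As a quick sanity check on Answer 1.1, the lower bound is easy to tabulate; here is a minimal sketch, where `matches` tests schema membership and the example values (δ(S) = 2, o(S) = 2, l = 5) are chosen by us purely for illustration.

```python
def matches(s, schema):
    """A string carries a schema if it agrees on every fixed (non-'*') position."""
    return all(c == '*' or c == b for c, b in zip(schema, s))

def survival_bound(delta, o, l, p_k, p_m):
    """Lower bound from Answer 1.1: p_s >= 1 - p_k*delta/(l-1) - o*p_m."""
    return 1.0 - p_k * delta / (l - 1) - o * p_m

print(matches('01101', '0**01'))                              # True
print(survival_bound(delta=2, o=2, l=5, p_k=1.0, p_m=0.01))   # 0.48
```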

Answer 1.3

Combining the Schema Theorem with the survivability calculations from exercise 1.1, the expected number of representatives of a schema S in the next generation is

m(S, t+1) \ge m(S, t) \, \frac{f(S)}{f_{ave}} \left( 1 - \frac{\delta(S)}{l - 1} p_k - o(S)\, p_m \right).

Some of the parameters are the same for all schemata: p_k = 1.0, p_m = 0.01, l = 5 and f_{ave} = 12.5. For the remaining parameters we can set up a table:

Schema   f(S)   δ(S)   o(S)   m(S, t)

Inserting into the recursion expression gives the expected numbers m(S_1, t+1) and m(S_2, t+1) of the two schemata. We see that schema number two will be reduced drastically due to its length, its low fitness and its many fixed positions. The opposite applies to schema number one: it is not affected by crossover, it has a low probability of being affected by mutation, and it has a fitness value larger than the average.

Answer 1.4

a) Again we use the Schema Theorem, and we say that schema S_1 has taken over the population when more than 99.5% of its individuals carry the schema. According to the exercise we start with one individual carrying S_1, i.e. m(S_1, 0) = 1. In addition we know that this schema is 25% better than average, which gives us c = 0.25. Inserting these numbers gives

m(S_1, t) = m(S_1, 0)(1 + c)^t: \quad 99.5 < 1 \cdot (1.25)^t \;\Rightarrow\; t > \frac{\ln 99.5}{\ln 1.25} \approx 20.6,

i.e. the expected number of generations before all individuals in the population carry this schema is 21.

b) Since a bad schema decreases exponentially, we hold that S_2 is extinct when less than 0.5% of the individuals of the population carry it. According to the exercise we start with 60 individuals carrying this schema, i.e. m(S_2, 0) = 60. In addition we know that this schema is 10% worse than average, which gives us c = -0.1. Inserting these numbers gives

0.5 > 60 \cdot (0.9)^t \;\Rightarrow\; t > \frac{\ln(0.5/60)}{\ln 0.9} \approx 45.4,

i.e. the expected number of generations before no individual in the population carries this schema is 46.
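The generation counts in Answer 1.4 follow directly from the growth law m(S, t) = m(S, 0)(1 + c)^t, so they are easy to reproduce in a few lines of Python; the function below is our own wrapper around that formula.

```python
import math

def generations(m0, c, threshold):
    """Smallest integer t for which m0*(1+c)**t crosses the threshold,
    using the exponential law m(S,t) = m(S,0)*(1+c)**t from Answer 1.2."""
    return math.ceil(math.log(threshold / m0) / math.log(1.0 + c))

print(generations(1, 0.25, 99.5))    # takeover, Answer 1.4a: 21
print(generations(60, -0.10, 0.5))   # extinction, Answer 1.4b: 46
```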

Answer 1.5

a) Differentiate the expression and set the derivative to zero; then examine how the number of pulls of the arm estimated to be the best, N - n, depends on the number of pulls, n, of the arm estimated to be the worst. Since we can disregard the constant factor |m_1 - m_2|, we can instead differentiate the function

L(N, n) = n + p(n)(N - 2n) = (N - n)\, p(n) + n\, (1 - p(n)),

\frac{dL}{dn} = \frac{dp}{dn}(N - n) - p(n) + 1 - p(n) - n \frac{dp}{dn} = 0 \;\Rightarrow\; N - n = \Bigl( 2p(n) + n \frac{dp}{dn} - 1 \Bigr) \Big/ \frac{dp}{dn}.

We now let x^2 = a n, with a = (m_1 - m_2)^2 / (s_1^2 + s_2^2), which results in the following expressions for the density function and its derivative:

p(n) = \frac{1}{\sqrt{2\pi a n}} e^{-an/2} \qquad \text{and} \qquad \frac{dp}{dn} = -\frac{1 + an}{2n \sqrt{2\pi a n}} e^{-an/2}.

If we insert this into the expression for N - n we get

N - n = -\frac{4n}{1 + an} + n + \frac{2n \sqrt{2\pi a n}}{1 + an} e^{an/2} \approx -\frac{4}{a} + n + \sqrt{\frac{8\pi n}{a}} e^{an/2} \approx \sqrt{\frac{8\pi n}{a}} e^{an/2},

where the second-to-last step follows from the assumption an \gg 1 and the last step from the fact that the exponential function dominates both the constant and the linear term. The conclusion is that the number of times we should pull the arm we believe is the best is an exponential function of the number of times we pulled the arm we believe is the worst.

b) In the case of the k-armed bandit you should therefore pull exponentially more often on the arm you believe is best than on any of the other arms. This is exactly what a genetic algorithm achieves applied to k competing schemata: the best schema will grow exponentially compared with its competitors.
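The asymptotic result in part a) can also be checked numerically: minimize the rescaled loss n + p(n)(N - 2n) by brute force and compare N - n* with the approximation above. The concrete values of N and a below are arbitrary illustrations, not from the exercise.

```python
import math

def loss(N, n, a):
    """Rescaled loss n + p(n)*(N - 2n) with the normal-tail approximation
    p(n) = exp(-a*n/2) / sqrt(2*pi*a*n), where a = (m1-m2)^2/(s1^2+s2^2)."""
    p = math.exp(-a * n / 2.0) / math.sqrt(2.0 * math.pi * a * n)
    return n + p * (N - 2 * n)

N, a = 100_000, 0.5
n_star = min(range(1, N // 2), key=lambda n: loss(N, n, a))
approx = math.sqrt(8.0 * math.pi * n_star / a) * math.exp(a * n_star / 2.0)
print(n_star, N - n_star, approx)   # N - n_star and approx agree in magnitude
```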

Formulary

1. Activation functions

The Signum function

y = \mathrm{sign}(h) = \begin{cases} 1 & h \ge 0 \\ -1 & h < 0 \end{cases}    (1)

The Fermi function

y = \frac{1}{1 + e^{-h}}, \qquad y' = y(1 - y)    (2)

The Hyperbolic Tangent function

y = \tanh(h), \qquad y' = 1 - y^2    (3)

Stochastic activation function

y = \begin{cases} 1 & \text{with probability } P(h) \\ -1 & \text{with probability } 1 - P(h) \end{cases}, \qquad P(h) = \frac{1}{1 + e^{-2\beta h}}    (4)

2. Cost functions

Mean square error

E = \frac{1}{2} E\{ \| d(x) - y(x) \|^2 \}    (5)

Square error sum, p examples

E = \frac{1}{2} \sum_{\mu=1}^{p} \| d^\mu - y^\mu \|^2    (6)

Relative entropy for the probability functions P_\alpha and Q_\alpha over states \alpha

E = \sum_\alpha P_\alpha \ln \frac{P_\alpha}{Q_\alpha}    (7)

Relative entropy, p examples, N classes

E = \sum_{\mu=1}^{p} \sum_{i=1}^{N} \left[ d_i^\mu \ln \frac{d_i^\mu}{y_i^\mu} + (1 - d_i^\mu) \ln \frac{1 - d_i^\mu}{1 - y_i^\mu} \right]    (8)

Regularization (complexity-reducing punishment functions)

E_c = \sum_i w_i^2    (9)

E_c = \sum_i \frac{(w_i / w_0)^2}{1 + (w_i / w_0)^2}    (10)

Clustering

E = \frac{1}{2} \sum_\mu \| x^\mu - w^\mu \|^2    (11)

Yuille's cost function

E = -\frac{1}{2} w^T C w + \frac{1}{4} \| w \|^4    (12)

Entropy for the probability distribution P_\alpha

E = H(\alpha) = E\{ -\ln P_\alpha \} = -\sum_\alpha P_\alpha \ln P_\alpha    (13)

Differential entropy for a continuous distribution

E = h(y) = E\{ -\ln p(y) \} = -\int p(y) \ln p(y) \, dy    (14)

Value function for the MDP

V^f(x(t)) = \sum_{i=0}^{\infty} \gamma^i \, r\bigl( x(t+i), f(x(t+i)) \bigr)    (15)

Q-function for the MDP

Q^f(x, y) = r(x, y) + \gamma V^f(x(y))    (16)
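Before moving on to the learning rules, here is how the activation functions (1)-(4) look in NumPy; a minimal sketch where the beta default and the RNG handling are our own additions (eq. (3) is simply np.tanh).

```python
import numpy as np

def signum(h):
    """Eq. (1): +1 for h >= 0, -1 otherwise."""
    return np.where(h >= 0, 1.0, -1.0)

def fermi(h):
    """Eq. (2): y = 1/(1 + exp(-h)); the derivative is y*(1 - y)."""
    return 1.0 / (1.0 + np.exp(-h))

def stochastic(h, beta=1.0, rng=None):
    """Eq. (4): +1 with probability P(h) = 1/(1 + exp(-2*beta*h)), else -1."""
    rng = rng or np.random.default_rng()
    P = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
    return np.where(rng.random(np.shape(h)) < P, 1.0, -1.0)
```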

3. Learning rules

The Outer Product rule

w_{ij} = \sum_{\mu=1}^{p} d_i^\mu d_j^\mu, \quad \text{i.e.} \quad W = \sum_{\mu=1}^{p} d^\mu d^{\mu T}    (17)

The Perceptron rule

\Delta w(t) = \eta \, [d(t) - y(t)] \, x(t)    (18)

The LMS rule (online)

\Delta w(t) = \eta \, [d(t) - y(t)] \, x(t) = \eta \, \delta(t) \, x(t)    (19)

Backpropagation (batch) on the cost function (6)

\Delta w_{\alpha\beta} = \eta \sum_{\mu=1}^{p} \delta_\alpha^\mu V_\beta^\mu = \eta \sum_{\mu=1}^{p} \Bigl[ \sigma' \sum_\gamma W_{\gamma\alpha} \delta_\gamma^\mu \Bigr] V_\beta^\mu    (20)

The Boltzmann machine, auto-association

\Delta w_{ij} = \eta \, [ \langle S_i S_j \rangle_{\mathrm{locked}} - \langle S_i S_j \rangle_{\mathrm{free}} ]    (21)

The Boltzmann machine, hetero-association

\Delta w_{ij} = \eta \, [ \langle S_i S_j \rangle_{I,O\ \mathrm{locked}} - \langle S_i S_j \rangle_{I\ \mathrm{locked}} ]    (22)

Clustering (online)

\Delta w_\ast = \eta \, (x - w_\ast)    (23)

Kohonen's rule

\Delta w_i = \eta \, h(i, i_\ast)(x - w_i)    (24)

Oja's rule

\Delta w = \eta \, y \, [x - y \, w]    (25)

Sanger's rule

\Delta w_k = \eta \, y_k \Bigl[ x - \sum_{i=1}^{k} y_i w_i \Bigr]    (26)

Yuille's rule

\Delta w = \eta \, (y \, x - \| w \|^2 w)    (27)

Bell-Sejnowski's entropy maximization rule with the activation function (2)

\Delta W = \eta \, \bigl( [W^T]^{-1} + (1 - 2y) \, x^T \bigr), \qquad \Delta w_0 = \eta \, (1 - 2y)    (28)

Q-learning

\Delta Q = \alpha \, [ r(x, y) + \gamma \, Q^f(x(y), f(x(y))) - Q^f(x, y) ]    (29)

TD(\lambda) rule

\Delta w = \alpha \, [ r(x, f(x)) + \gamma V^f(x(f(x))) - V^f(x) ] \sum_{i=1}^{k} \lambda^{k-i} \, \nabla_w V^f(x(i))    (30)
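Two of the rules above are compact enough to demonstrate directly. The sketch below runs the online LMS rule (19) on a noiseless linear teacher, and Oja's rule (25) on zero-mean data, where the weight vector converges toward the leading eigenvector of the data covariance; learning rates, epoch counts and the synthetic data are illustrative assumptions of ours.

```python
import numpy as np

def lms_pass(w, X, d, eta=0.01):
    """One online pass of the LMS rule, eq. (19): w += eta*(d - y)*x, y = w.x."""
    for x, target in zip(X, d):
        w = w + eta * (target - w @ x) * x
    return w

def oja(X, eta=0.01, epochs=100, seed=0):
    """Oja's rule, eq. (25): dw = eta*y*(x - y*w) with y = w.x."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w = w + eta * y * (x - y * w)
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
d = X @ np.array([1.0, -2.0, 0.5])            # linear teacher weights
w = np.zeros(3)
for _ in range(20):
    w = lms_pass(w, X, d)
print(w)                                      # close to [1, -2, 0.5]

Y = rng.standard_normal((500, 2)) * np.array([3.0, 1.0])   # variances 9 and 1
print(oja(Y))                                 # roughly +/- [1, 0]
```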

4. Probability functions

N-dimensional normal distribution

p(x) = \frac{1}{(2\pi)^{N/2} \sqrt{\det C}} \exp\Bigl[ -\frac{1}{2} (x - m)^T C^{-1} (x - m) \Bigr]    (31)

Boltzmann-Gibbs distribution of states \alpha with energies E_\alpha

P_\alpha = \frac{1}{Z} \exp\Bigl[ -\frac{E_\alpha}{T} \Bigr], \qquad Z = \sum_\beta \exp\Bigl[ -\frac{E_\beta}{T} \Bigr]    (32)

Markov process of first order

a_{ij} = P( x_j(t+1) \mid x_i(t) )    (33)
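The density (31) and the distribution (32) translate directly into code; a small sketch follows (using np.linalg.solve instead of forming C^{-1} explicitly is a numerical design choice of ours, not of the formulary).

```python
import numpy as np

def normal_pdf(x, m, C):
    """N-dimensional normal density, eq. (31)."""
    N = len(m)
    diff = x - m
    norm = (2.0 * np.pi) ** (N / 2.0) * np.sqrt(np.linalg.det(C))
    return np.exp(-0.5 * diff @ np.linalg.solve(C, diff)) / norm

def boltzmann_gibbs(E, T):
    """Boltzmann-Gibbs distribution, eq. (32): P_a = exp(-E_a/T) / Z."""
    w = np.exp(-np.asarray(E, dtype=float) / T)
    return w / w.sum()

print(normal_pdf(np.zeros(2), np.zeros(2), np.eye(2)))   # 1/(2*pi) ~ 0.159
print(boltzmann_gibbs([0.0, 1.0, 2.0], T=1.0))           # decreasing weights
```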

5. Miscellaneous

Bellman's Equation of Optimality

V^f = \max_y \{ r(x, y) + \gamma V^f(x(y)) \}    (34)

The Schema Theorem

m(S, t+1) \ge m(S, t) \, \frac{f(S)}{\bar{f}} \left( 1 - p_k \frac{\delta(S)}{l - 1} - o(S)\, p_m \right)    (35)
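Bellman's equation (34) is what the Q-learning rule (29) solves by sampling. A tabular sketch: note that where eq. (29) writes Q^f(x(y), f(x(y))) we substitute the usual greedy max over actions, and the dictionary representation is our own choice.

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step in the spirit of eq. (29):
    Q(s,a) <- Q(s,a) + alpha*(r + gamma*max_b Q(s',b) - Q(s,a))."""
    q_sa = Q.get((s, a), 0.0)
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    Q[(s, a)] = q_sa + alpha * (r + gamma * best_next - q_sa)
    return Q

Q = {}
q_update(Q, s=0, a='right', r=1.0, s_next=1, actions=['left', 'right'])
print(Q)   # {(0, 'right'): 0.1}
```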
