Neural Networks and Learning Systems
- Alan Newman
- 7 years ago
Exercise Collection, Class 9
March 2010

[Cover figure: a two-layer feed-forward network with inputs x_1, ..., x_N, hidden units h_1, ..., h_N, weight layers w and W, and output y]
Neural Networks and Learning Systems Exercise Collection
© Medical Informatics, IMT, LiTH

Contents

Exercises
    1 Genetic Algorithms
Solutions
Formulary
    1 Activation functions
    2 Cost functions
    3 Learning rules
    4 Probability functions
    5 Miscellaneous
Exercises

1. Genetic Algorithms

1.1. (Crossover and Mutation) We start off with a number of definitions in order to make further calculations easier. Define the order of a schema S as the number of fixed positions, o(S). The distance between the first and the last fixed position in the schema is denoted δ(S).

a) Assume a crossover between two strings of length l takes place by means of a two-step process. First a random position k is drawn from a rectangular distribution on the interval {1, ..., l-1}. Then the strings swap the parts between and including positions k+1 and l with each other. Derive a lower bound for the probability p_s that a schema survives a crossover, given the probability p_k of the crossover itself.

b) Let us also consider the possibility of mutation. The probability that a given position is affected is assumed to be p_m. What is the lower bound for the survivability of a schema now?

1.2. (The Schema Theorem) Show the Schema Theorem, i.e. that the number of copies of a schema S in a population will increase or decrease exponentially with respect to the relative fitness of the schema. Disregard crossover and mutation effects.

1.3. (The survival of the fittest) A population contains strings with the following corresponding fitness:

No.  String  Fitness

The probability of mutation is p_m = 0.01 and the probability of crossover is p_k = 1.0. Calculate the expected number of schemata with the strings S_1 = 1 and S_2 = 0 1, respectively, in the next generation. Comments?

1.4. (Live and let die) Let us in this exercise ignore the possibility of the destruction of schemata due to crossovers and mutations.

a) A schema S_1 with one representative in the first generation has a 25% larger fitness value than the average in the population of 100 individuals. After how many generations will this schema appear in every individual in the population?
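The survival bound asked for in exercise 1.1 can be checked numerically. The following Python sketch (Python, the schema choice, and all helper names are our own illustration, not part of the original text) estimates the survival probability of a schema under one-point crossover against a random mate, followed by bitwise mutation, and compares it with the bound 1 - p_k δ(S)/(l-1) - o(S) p_m:

```python
import random

def survival_rate(fixed, l, p_k, p_m, trials=100_000):
    """Monte Carlo estimate of the probability that a schema survives one
    generation: one-point crossover with probability p_k against a mate with
    uniformly random bits, then independent bitwise mutation with prob. p_m.
    `fixed` maps 0-indexed positions to the bits the schema requires."""
    survived = 0
    for _ in range(trials):
        child = dict(fixed)
        if random.random() < p_k:
            k = random.randrange(1, l)        # cut site, uniform on {1, ..., l-1}
            for pos in child:
                if pos >= k:                  # the tail beyond the cut comes from the mate
                    child[pos] = random.randint(0, 1)
        for pos in child:
            if random.random() < p_m:
                child[pos] = 1 - child[pos]
        survived += all(child[p] == b for p, b in fixed.items())
    return survived / trials

fixed = {0: 1, 1: 0}                          # schema 10***: o(S) = 2, delta(S) = 1
l, p_k, p_m = 5, 1.0, 0.01
bound = 1 - p_k * (max(fixed) - min(fixed)) / (l - 1) - len(fixed) * p_m
print(survival_rate(fixed, l, p_k, p_m), ">=", bound)
```

The estimate exceeds the bound, as expected: the bound ignores the chance that the mate happens to carry matching bits.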
b) A schema S_2 appearing in 60 of the 100 individuals in the first generation has a 10% lower fitness value than the average. After how many generations will this schema be extinct?

1.5. (*) (Two- and k-armed bandits) In the case of the two-armed bandit, where one arm gives profit with average m_1 and variance s_1^2 while the second gives m_2 on average with variance s_2^2, one can use the following tactic. We have N pulls at our disposal. Of these, we use 2n < N to pull n times on each arm, and the remaining N - 2n to then pull the arm estimated to be the best. The expected loss if we use this tactic is given by:
L(N, n) = (m_1 - m_2) \left( n + p(n)(N - 2n) \right),

where p(n) denotes the probability that, after the initial 2n pulls, we choose the wrong arm to pull for the remaining N - 2n pulls. Now, p(n) can be approximated with the tail of a normal distribution according to:

p(n) \approx \frac{e^{-x^2/2}}{\sqrt{2\pi}\, x},  where  x = \frac{m_1 - m_2}{\sqrt{s_1^2 + s_2^2}} \sqrt{n}.

a) If we follow a policy minimizing the loss L, how much more frequently should we pull the arm estimated to be the best compared to the arm we estimate to be the worst?

b) Assume that the optimal relation between the best arm and the other arms derived in part a) still applies in the case of the k-armed bandit. What parallel can you then make to the behavior of genetic algorithms?
Solutions

Answers 1.1

a) The probability of such a crossover destroying a schema is given by the probability of the randomized position ending up somewhere between the fixed positions in the schema. The number of such positions is given by the length of the schema, δ(S). The position for the crossover is drawn from a rectangular distribution over the possible sites in the interval {1, ..., l-1}, and we get the probability of survival as:

p_s = 1 - \frac{\delta(S)}{l-1}.

If p_k is the probability of using the crossover operator, the lower bound for a schema to survive a crossover becomes:

p_s \geq 1 - p_k \frac{\delta(S)}{l-1}.

This is a lower bound because the schema might live on in another individual in the population, which we have not taken into consideration.

b) In order for the schema to avoid damage from mutation, all fixed positions in the schema must come through. The number of fixed positions is given by the order of the schema, o(S). The probability of surviving mutation is then:

(1 - p_m)^{o(S)} \approx 1 - o(S)\, p_m,

where the approximation applies when p_m \ll 1. The total lower bound for surviving both crossover and mutation becomes

p_s \geq 1 - p_k \frac{\delta(S)}{l-1} - o(S)\, p_m,

if we neglect the influence of second-order terms.

Answers 1.2

At reproduction an individual is chosen with its relative fitness as probability, f_i / \sum_j f_j. A schema is therefore chosen with probability f(S) / \sum_i f_i, where f(S) is the mean of the fitness values of all individuals in the population carrying the schema. If we look at the expected number of representatives of a schema S in the next generation, given the number of representatives m(S, t) in the current generation, we get:

m(S, t+1) = m(S, t)\, \frac{f(S)}{\sum_i f_i}\, n,

because the size of the population is n and we consequently draw n random samples. We can rewrite this expression with the help of f_{ave}, the mean fitness of the entire population:

m(S, t+1) = m(S, t)\, \frac{f(S)}{f_{ave}}.

Now, if a schema on average has a fitness c f_{ave} greater than the average in the population, this schema will grow according to the recursive expression:

m(S, t+1) = m(S, t)\, \frac{f_{ave} + c f_{ave}}{f_{ave}} = m(S, t)(1 + c)
\quad \Rightarrow \quad m(S, t) = m(S, 0)(1 + c)^t.
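The recursion and its closed form can be checked mechanically. The short Python snippet below (our own illustration; the variable names are assumptions, not from the original text) iterates m(S, t+1) = m(S, t)(1 + c) and verifies it against m(S, 0)(1 + c)^t:

```python
def schema_count(m0, c, t):
    """Closed form m(S, t) = m(S, 0) * (1 + c)**t of the recursion above."""
    return m0 * (1 + c) ** t

# iterate the recursion directly and compare with the closed form
m, c = 1.0, 0.25          # one representative, 25% above-average fitness
for t in range(15):
    assert abs(m - schema_count(1.0, c, t)) < 1e-9
    m *= 1 + c            # m(S, t+1) = m(S, t) * (1 + c)
print(schema_count(1.0, 0.25, 10))
```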
The genetic algorithm thus leads to an exponential growth of such a schema. With the same line of reasoning we see that schemata with less fitness than average will die off from the population according to the same exponential function.

Answers 1.3

By combining the Schema Theorem with the survivability calculations from exercise 1.1, we see that the expected number of representatives of a schema S in the next generation is given by:

m(S, t+1) \geq m(S, t)\, \frac{f(S)}{f_{ave}} \left( 1 - \frac{\delta(S)}{l-1}\, p_k - o(S)\, p_m \right).

Some of the parameters are the same for all schemata: p_k = 1.0, p_m = 0.01, l = 5 and f_{ave} = 12.5. For the remaining parameters we can set up a table:

Schema  f(S)  δ(S)  o(S)  m(S, t)

Inserting these values in the recursion expression gives the expected numbers m(S_1, t+1) and m(S_2, t+1) for the two schemata. We see that schema number two will be reduced drastically due to its length, low fitness and many fixed positions. The opposite applies to schema number one: it is not affected by crossover, it has a low probability of being affected by mutation, and it has a fitness value larger than the average.

Answers 1.4

a) Again we use the Schema Theorem, and we assume that schema S_1 has taken over the population when more than 99.5% of the individuals carry it. According to the exercise we start with one individual having schema S_1, i.e. m(S_1, 0) = 1. In addition we know that this schema is 25% better than average, which gives c = 0.25. Inserting these numbers then gives:

m(S_1, t) = m(S_1, 0)(1 + c)^t
99.5 < 1 \cdot (1 + 0.25)^t \quad \Rightarrow \quad t > \frac{\ln 99.5}{\ln 1.25} \approx 20.6,

i.e. the expected number of generations before all individuals in the population carry this schema is 21.

b) Since the decrease of a bad schema is exponential, we hold that S_2 is extinct when less than 0.5% of the individuals of the population carry this schema. According to the exercise we start with 60 individuals carrying this schema, i.e. m(S_2, 0) = 60.
In addition we know that this schema is 10% worse than average, which gives c = -0.1. Inserting these numbers then gives:

m(S_2, t) = m(S_2, 0)(1 + c)^t
0.5 > 60 \cdot (1 - 0.1)^t \quad \Rightarrow \quad t > \frac{\ln(0.5/60)}{\ln 0.9} \approx 45.4,

i.e. the expected number of generations before no individual in the population carries this schema is 46.
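Both generation counts follow from the same closed form. A small Python helper (an illustration of ours, not part of the original solution) reproduces them:

```python
import math

def generations_until(m0, c, target):
    """Smallest integer t such that m0 * (1 + c)**t has passed `target`
    (grown above it when c > 0, decayed below it when c < 0)."""
    return math.ceil(math.log(target / m0) / math.log(1 + c))

print(generations_until(1, 0.25, 99.5))    # exercise 1.4 a): 21
print(generations_until(60, -0.10, 0.5))   # exercise 1.4 b): 46
```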
Answers 1.5

a) Differentiate the expression and set it to zero, then examine how the number of times you should pull the best arm, N - n, depends on the number of times you pulled the worst arm, n. Since we can disregard the constant difference in mean profit, we can instead differentiate the function:

L(N, n) = n + p(n)(N - 2n) = (N - n)\, p(n) + n\, (1 - p(n))

\frac{dL}{dn} = \frac{dp}{dn}(N - n) - p(n) + 1 - p(n) - n\frac{dp}{dn} = 0
\quad \Rightarrow \quad
N - n = \left( 2 p(n) + n \frac{dp}{dn} - 1 \right) \Big/ \frac{dp}{dn}.

We now let x = \sqrt{a n}, which results in the following expressions for the density function and its derivative:

p(n) = \frac{1}{\sqrt{2\pi a n}}\, e^{-a n/2}
\quad \text{and} \quad
\frac{dp}{dn} = -\frac{1 + a n}{2n\sqrt{2\pi a n}}\, e^{-a n/2}.

If we insert this into the expression for N - n we get:

N - n = -\frac{4n}{1 + an} + n + \frac{2n\sqrt{2\pi a n}}{1 + an}\, e^{an/2}
\approx -\frac{4}{a} + n + \sqrt{\frac{8\pi n}{a}}\, e^{an/2}
\approx \sqrt{\frac{8\pi n}{a}}\, e^{an/2},

where the second-to-last step follows from the assumption a n \gg 1, and the last step from the exponential function dominating both the constant and the linear term. The conclusion is that the number of times we should pull the arm we think is the best is an exponential function of the number of times we pulled the arm we think is the worst.

b) In the case of a k-armed bandit you should therefore pull the arm you think is best exponentially more times than any of the other arms. This is exactly what a genetic algorithm achieves for k schemata: the best schema will grow exponentially compared with its competitors.
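The exponential relation in part a) is easy to verify numerically. The Python snippet below (our own illustration; the value of a and the sample values of n are arbitrary assumptions) evaluates the exact expression for N - n and shows that, divided by the approximation sqrt(8*pi*n/a) * e^(a*n/2), the ratio approaches 1 as a*n grows:

```python
import math

def best_arm_pulls(n, a):
    """Exact expression for N - n derived above;
    a = (m1 - m2)**2 / (s1**2 + s2**2)."""
    x = a * n
    return (-4 * n / (1 + x) + n
            + 2 * n * math.sqrt(2 * math.pi * x) / (1 + x) * math.exp(x / 2))

a = 1.0
for n in (10, 20, 40):
    approx = math.sqrt(8 * math.pi * n / a) * math.exp(a * n / 2)
    print(n, best_arm_pulls(n, a) / approx)   # ratio tends to 1 as a*n grows
```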
Formulary

1. Activation functions

The Signum function

y = \mathrm{sign}(h) = \begin{cases} +1 & h \geq 0 \\ -1 & h < 0 \end{cases}    (1)

The Fermi function

y = \frac{1}{1 + e^{-h}}, \qquad y' = y\,(1 - y)    (2)

The Hyperbolic Tangent function

y = \tanh(h), \qquad y' = 1 - y^2    (3)

Stochastic Activation function

y = \begin{cases} +1 & \text{with probability } P(h) \\ -1 & \text{with probability } 1 - P(h) \end{cases}
\quad \text{where} \quad P(h) = \frac{1}{1 + e^{-2\beta h}}    (4)

2. Cost functions

Mean Square Error

E = \frac{1}{2}\, E\{ \| d(x) - y(x) \|^2 \}    (5)

Square Error Sum, p examples

E = \frac{1}{2} \sum_{\mu=1}^{p} \| d^\mu - y^\mu \|^2    (6)

Relative entropy for the probability functions P_\alpha and Q_\alpha over states \alpha

E = \sum_\alpha P_\alpha \ln \frac{P_\alpha}{Q_\alpha}    (7)

Relative entropy, p examples, N classes

E = \sum_{\mu=1}^{p} \sum_{i=1}^{N} \left[ d_i^\mu \ln \frac{d_i^\mu}{y_i^\mu} + (1 - d_i^\mu) \ln \frac{1 - d_i^\mu}{1 - y_i^\mu} \right]    (8)

Regularization (complexity-reducing punishment functions)

E_c = \sum_i w_i^2    (9)

E_c = \sum_i \frac{(w_i/w_0)^2}{1 + (w_i/w_0)^2}    (10)
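The derivative identities quoted for the Fermi function (2) and the hyperbolic tangent (3) can be confirmed with a quick numerical check. The Python snippet below is our own illustration, not part of the formulary:

```python
import math

def fermi(h):
    """Fermi (logistic) activation, eq. (2)."""
    return 1.0 / (1.0 + math.exp(-h))

def num_deriv(f, h, eps=1e-6):
    """Central-difference approximation of f'(h)."""
    return (f(h + eps) - f(h - eps)) / (2 * eps)

for h in (-2.0, 0.0, 1.5):
    y = fermi(h)
    assert abs(num_deriv(fermi, h) - y * (1 - y)) < 1e-6      # y' = y(1 - y)
    t = math.tanh(h)
    assert abs(num_deriv(math.tanh, h) - (1 - t * t)) < 1e-6  # y' = 1 - y^2
print("derivative identities hold")
```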
Clustering

E = \frac{1}{2} \sum_\mu \| x^\mu - w^\mu \|^2    (11)

Yuille's Cost function

E = -\frac{1}{2} w^T C w + \frac{1}{4} \|w\|^4    (12)

Entropy for the probability distribution P_\alpha

E = H(\alpha) = E\{ -\ln P_\alpha \} = -\sum_\alpha P_\alpha \ln P_\alpha    (13)

Differential entropy for a continuous distribution

E = h(y) = E\{ -\ln p(y) \} = -\int p(y) \ln p(y)\, dy    (14)

Value function for the MDP

V^f(x(t)) = \sum_{i=0}^{\infty} \gamma^i\, r\big(x(t+i), f(x(t+i))\big)    (15)

Q-function for the MDP

Q^f(x, y) = r(x, y) + \gamma V^f(x(y))    (16)

3. Learning rules

The Outer Product rule

w_{ij} = \sum_{\mu=1}^{p} d_i^\mu d_j^\mu, \quad \text{i.e.} \quad w = \sum_\mu d^\mu d^{\mu T}    (17)

The Perceptron Rule

\Delta w(t) = \eta\, [d(t) - y(t)]\, x(t)    (18)

The LMS Rule (online)

\Delta w(t) = \eta\, [d(t) - y(t)]\, x(t) = \eta\, \delta(t)\, x(t)    (19)

Backpropagation (batch) on the cost function (6)

\Delta w_{\alpha\beta} = \eta \sum_{\mu=1}^{p} \delta_\alpha^\mu V_\beta^\mu = \eta \sum_{\mu=1}^{p} \left[ \sigma' \sum_\gamma W_{\gamma\alpha} \delta_\gamma^\mu \right] V_\beta^\mu    (20)

The Boltzmann machine, auto-association

\Delta w_{ij} = \eta \left[ \langle S_i S_j \rangle_{\text{locked}} - \langle S_i S_j \rangle_{\text{free}} \right]    (21)
The Boltzmann machine, hetero-association

\Delta w_{ij} = \eta \left[ \langle S_i S_j \rangle_{I,O\ \text{locked}} - \langle S_i S_j \rangle_{I\ \text{locked}} \right]    (22)

Clustering (online)

\Delta w_* = \eta\, (x - w_*)    (23)

Kohonen's rule

\Delta w_i = \eta\, h(i, *)\, (x - w_i)    (24)

Oja's rule

\Delta w = \eta\, y\, [x - y\, w]    (25)

Sanger's rule

\Delta w_k = \eta\, y_k \left[ x - \sum_{i=1}^{k} y_i w_i \right]    (26)

Yuille's rule

\Delta w = \eta\, (y\, x - \|w\|^2 w)    (27)

Bell-Sejnowski's Entropy Maximization Rule with the activation function (2)

\Delta W = \eta \left( [W^T]^{-1} + (1 - 2y)\, x^T \right), \qquad \Delta w_0 = \eta\, (1 - 2y)    (28)

Q-learning

\Delta Q = \alpha \left[ r(x, y) + \gamma\, Q^f\big(x(y), f(x(y))\big) - Q^f(x, y) \right]    (29)

TD(λ)-rule

\Delta w = \alpha \left[ r(x, f(x)) + \gamma V^f\big(x(f(x))\big) - V^f(x) \right] \sum_{i=1}^{k} \lambda^{k-i}\, \nabla_w V^f(x(i))    (30)

4. Probability functions

N-dim. normal distribution

p(x) = \frac{1}{(2\pi)^{N/2} \sqrt{\det C}} \exp\left[ -\frac{1}{2} (x - m)^T C^{-1} (x - m) \right]    (31)

Boltzmann-Gibbs distribution of states \alpha with energies E_\alpha

P_\alpha = \frac{1}{Z} \exp\left[ -\frac{E_\alpha}{T} \right], \quad \text{where} \quad Z = \sum_\beta \exp\left[ -\frac{E_\beta}{T} \right]    (32)

Markov process of 1st order

a_{ij} = P\big(x_j(t+1) \mid x_i(t)\big)    (33)
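As a small demonstration of one of the rules above, the Python sketch below (entirely our own illustration; the data distribution, learning rate and step count are arbitrary assumptions) runs Oja's rule (25) on 2-D data whose variance is largest along the first axis. The weight vector should approach unit length and align with that axis:

```python
import random

random.seed(0)
w = [0.3, 0.7]                  # arbitrary initial weights
eta = 0.005
for _ in range(20000):
    # data with principal direction along the first axis (std 1.0 vs 0.3)
    x = [random.gauss(0, 1.0), random.gauss(0, 0.3)]
    y = w[0] * x[0] + w[1] * x[1]                  # linear unit: y = w . x
    for i in range(2):
        w[i] += eta * y * (x[i] - y * w[i])        # Oja's rule (25)

norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
print(norm, w)                  # norm near 1, second component near 0
```

Oja's rule combines the Hebbian term η y x with the decay term -η y² w, which keeps ‖w‖ bounded and drives w toward the leading eigenvector of the data covariance.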
5. Miscellaneous

Bellman's Equation of Optimality

V^f(x) = \max_y \left\{ r(x, y) + \gamma V^f(x(y)) \right\}    (34)

The Schemata Theorem

m(S, t+1) \geq m(S, t)\, \frac{f(S)}{\bar f} \left( 1 - p_k \frac{\delta(S)}{l-1} - o(S)\, p_m \right)    (35)
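The Q-learning update (29) can be illustrated on a deliberately tiny problem. In the Python toy below (entirely our own construction: a single state, two actions, deterministic rewards), the Q-value of the rewarding action should converge to r/(1 - γ) = 1/(1 - 0.9) = 10:

```python
import random

random.seed(1)
gamma, alpha = 0.9, 0.1
Q = {(0, a): 0.0 for a in (0, 1)}      # one state (0), two actions
for _ in range(5000):
    a = random.choice((0, 1))          # explore both actions uniformly
    r = 1.0 if a == 1 else 0.0         # action 1 pays reward 1, action 0 pays 0
    best = max(Q[(0, 0)], Q[(0, 1)])   # greedy bootstrap; the state never changes
    Q[(0, a)] += alpha * (r + gamma * best - Q[(0, a)])   # Q-learning update, eq. (29)

print(Q[(0, 1)], Q[(0, 0)])            # approach 10 and 9 respectively
```

The fixed point follows directly from (29): Q(0,1) = 1 + 0.9·max Q = 10 and Q(0,0) = 0 + 0.9·10 = 9, consistent with Bellman's equation (34).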
LARGE CLASSES OF EXPERTS Csaba Szepesvári University of Alberta CMPUT 654 E-mail: szepesva@ualberta.ca UofA, October 31, 2006 OUTLINE 1 TRACKING THE BEST EXPERT 2 FIXED SHARE FORECASTER 3 VARIABLE-SHARE
More informationProbability Theory. Florian Herzog. A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T..
Probability Theory A random variable is neither random nor variable. Gian-Carlo Rota, M.I.T.. Florian Herzog 2013 Probability space Probability space A probability space W is a unique triple W = {Ω, F,
More informationTemporal Difference Learning in the Tetris Game
Temporal Difference Learning in the Tetris Game Hans Pirnay, Slava Arabagi February 6, 2009 1 Introduction Learning to play the game Tetris has been a common challenge on a few past machine learning competitions.
More informationFactoring. Factoring 1
Factoring Factoring 1 Factoring Security of RSA algorithm depends on (presumed) difficulty of factoring o Given N = pq, find p or q and RSA is broken o Rabin cipher also based on factoring Factoring like
More informationSensitivity analysis of utility based prices and risk-tolerance wealth processes
Sensitivity analysis of utility based prices and risk-tolerance wealth processes Dmitry Kramkov, Carnegie Mellon University Based on a paper with Mihai Sirbu from Columbia University Math Finance Seminar,
More informationSome stability results of parameter identification in a jump diffusion model
Some stability results of parameter identification in a jump diffusion model D. Düvelmeyer Technische Universität Chemnitz, Fakultät für Mathematik, 09107 Chemnitz, Germany Abstract In this paper we discuss
More informationSecond Order Linear Partial Differential Equations. Part I
Second Order Linear Partial Differential Equations Part I Second linear partial differential equations; Separation of Variables; - point boundary value problems; Eigenvalues and Eigenfunctions Introduction
More informationJava Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
More informationA Non-Linear Schema Theorem for Genetic Algorithms
A Non-Linear Schema Theorem for Genetic Algorithms William A Greene Computer Science Department University of New Orleans New Orleans, LA 70148 bill@csunoedu 504-280-6755 Abstract We generalize Holland
More informationLab 4: 26 th March 2012. Exercise 1: Evolutionary algorithms
Lab 4: 26 th March 2012 Exercise 1: Evolutionary algorithms 1. Found a problem where EAs would certainly perform very poorly compared to alternative approaches. Explain why. Suppose that we want to find
More informationMonte Carlo-based statistical methods (MASM11/FMS091)
Monte Carlo-based statistical methods (MASM11/FMS091) Jimmy Olsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February 5, 2013 J. Olsson Monte Carlo-based
More informationInductive QoS Packet Scheduling for Adaptive Dynamic Networks
Inductive QoS Packet Scheduling for Adaptive Dynamic Networks Malika BOURENANE Dept of Computer Science University of Es-Senia Algeria mb_regina@yahoo.fr Abdelhamid MELLOUK LISSI Laboratory University
More informationRANDOM INTERVAL HOMEOMORPHISMS. MICHA L MISIUREWICZ Indiana University Purdue University Indianapolis
RANDOM INTERVAL HOMEOMORPHISMS MICHA L MISIUREWICZ Indiana University Purdue University Indianapolis This is a joint work with Lluís Alsedà Motivation: A talk by Yulij Ilyashenko. Two interval maps, applied
More informatione.g. arrival of a customer to a service station or breakdown of a component in some system.
Poisson process Events occur at random instants of time at an average rate of λ events per second. e.g. arrival of a customer to a service station or breakdown of a component in some system. Let N(t) be
More informationHomework #2 Solutions
MAT Spring Problems Section.:, 8,, 4, 8 Section.5:,,, 4,, 6 Extra Problem # Homework # Solutions... Sketch likely solution curves through the given slope field for dy dx = x + y...8. Sketch likely solution
More informationStirling s formula, n-spheres and the Gamma Function
Stirling s formula, n-spheres and the Gamma Function We start by noticing that and hence x n e x dx lim a 1 ( 1 n n a n n! e ax dx lim a 1 ( 1 n n a n a 1 x n e x dx (1 Let us make a remark in passing.
More information2WB05 Simulation Lecture 8: Generating random variables
2WB05 Simulation Lecture 8: Generating random variables Marko Boon http://www.win.tue.nl/courses/2wb05 January 7, 2013 Outline 2/36 1. How do we generate random variables? 2. Fitting distributions Generating
More informationClass #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Gaussian Mixture Models Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
More informationEfficiency and the Cramér-Rao Inequality
Chapter Efficiency and the Cramér-Rao Inequality Clearly we would like an unbiased estimator ˆφ (X of φ (θ to produce, in the long run, estimates which are fairly concentrated i.e. have high precision.
More informationGambling and Data Compression
Gambling and Data Compression Gambling. Horse Race Definition The wealth relative S(X) = b(x)o(x) is the factor by which the gambler s wealth grows if horse X wins the race, where b(x) is the fraction
More information