Parameter estimation for nonlinear models: Numerical approaches to solving the inverse problem. Lecture 12 04/08/2008. Sven Zenker


1 Parameter estimation for nonlinear models: Numerical approaches to solving the inverse problem Lecture 12 04/08/2008 Sven Zenker

2 Assignment no. 8

Correct setup of the likelihood function:
- One fixed set of observation data; the likelihood is a function of the parameters!
- Under the assumption of independent observations, the likelihood becomes a product of the individual observation densities
- Absolute values will usually decrease as the number of observations increases: likelihoods for data sets of different sizes cannot be compared directly
- Also: in production mode, ALWAYS use logs when possible, in particular in MCMC settings

Observations:
- With enough observations and reasonable noise, the maximum of the log-likelihood (LLH) is close to the true value
- As the number of observations decreases and the noise increases, the LLH functions move away from being meaningfully described by a maximum likelihood estimate
- Observing the system only partially worsens this situation considerably
- This is a small, simple model! Real-life problems in biology tend to be a lot worse (with a few exceptions)
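The "ALWAYS use logs" point deserves a concrete form. A minimal sketch in Python/NumPy (the model function f, observation times, and data are hypothetical placeholders; i.i.d. Gaussian noise with known σ is assumed, as in the assignment):

```python
import numpy as np

def log_likelihood(params, t_obs, y_obs, f, sigma):
    """Gaussian i.i.d. log-likelihood, computed as a sum of log-densities.

    f(params, t_obs) -> model predictions at the observation times
    (a hypothetical stand-in for the ODE solution of the model).
    """
    resid = y_obs - f(params, t_obs)
    # Summing logs is numerically safe; the equivalent product of
    # densities, prod(exp(-0.5*(r/sigma)**2)/...), underflows already
    # for modest numbers of observations.
    return np.sum(-0.5 * (resid / sigma) ** 2
                  - 0.5 * np.log(2.0 * np.pi * sigma ** 2))
```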

3 20 observations of both states, noise σ = 0.5

4 20 observations of both states, noise σ = 2

5 5 observations of both states

6 20 observations of one state

7 5 observations of one state

8 Assignment no. 8

- If experiments can be repeated often enough, the favorable properties of the maximum likelihood estimator come into play; this is reflected in the shape of the likelihood functions
- In situations where the data is sparse and the model large, a maximum likelihood estimate may be relatively useless
- (Marginals of) posterior distributions or likelihoods may still convey useful information about
  - the ability to nail down individual parameter values (or lack thereof)
  - the relationship of parameters on submanifolds compatible with the observations
  and may therefore help to
  - guide model reduction & development
  - decide how to interpret parameter estimates and how much confidence to place in them

9 Review: Density estimation

The histogram is an instance of the more general problem of density estimation: given a finite set of samples from a probability distribution, how can I approximately find the PDF of the underlying distribution? The histogram is one (relatively crude) way.

10 Review: Histogram, bin width selection

Various formulas based on asymptotic arguments exist. All of them (of course) let the bin width decrease as the number of samples increases, and they also take into account some measure of the spread of the data. E.g.:

$$w_{\mathrm{bin}} = \frac{7s}{2n^{1/3}} \quad \text{(Scott's rule)},$$

where n is the number of samples and s the standard deviation of the sample, or

$$w_{\mathrm{bin}} = \frac{2\,\mathrm{IQR}}{n^{1/3}} \quad \text{(Freedman-Diaconis rule)},$$

where IQR is the interquartile range of the data. These formulas give one bin width for the entire dataset. A variety of methods exist to further refine the bin widths by allowing them to change locally.
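A sketch of both rules in NumPy (the function names are mine, not from the lecture):

```python
import numpy as np

def scott_bin_width(x):
    # Scott's rule: w = 7*s / (2*n^(1/3)) = 3.5 * s * n^(-1/3)
    x = np.asarray(x)
    return 3.5 * x.std(ddof=1) / len(x) ** (1 / 3)

def freedman_diaconis_bin_width(x):
    # Freedman-Diaconis rule: w = 2 * IQR * n^(-1/3)
    x = np.asarray(x)
    q75, q25 = np.percentile(x, [75, 25])
    return 2.0 * (q75 - q25) / len(x) ** (1 / 3)

# Example: bin width and resulting number of bins for a normal sample
x = np.random.default_rng(0).normal(size=1000)
w = freedman_diaconis_bin_width(x)
n_bins = int(np.ceil((x.max() - x.min()) / w))
```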

11 Review: Kernel density estimation (KDE)

For an i.i.d. sample $\{x_1, \ldots, x_N\}$ of the underlying distribution, the (fixed bandwidth) kernel density estimate for the PDF is given by

$$\hat f(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\!\left(\frac{x - x_i}{h}\right),$$

where $K(x)$ is a kernel function satisfying $\int K(x)\,dx = 1$. Usually, one will also require $K(x) \geq 0$ and $K(x) = K(-x)$ for all $x$. $h$ is termed the bandwidth. Its selection is critical for the performance of the estimator and much more important than the specific shape of the kernel used.
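A minimal sketch of this estimator with a Gaussian kernel, which satisfies all three requirements (the vectorized grid evaluation is an implementation choice of mine):

```python
import numpy as np

def gaussian_kde_fixed(x_grid, samples, h):
    """Fixed-bandwidth KDE with the Gaussian kernel K(x) = N(x; 0, 1).

    x_grid  : points at which to evaluate the estimate
    samples : i.i.d. sample {x_1, ..., x_N}
    h       : bandwidth
    """
    x_grid = np.asarray(x_grid)[:, None]     # shape (M, 1)
    samples = np.asarray(samples)[None, :]   # shape (1, N)
    u = (x_grid - samples) / h
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return K.sum(axis=1) / (samples.size * h)
```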

12 Review: Adaptive bandwidth selection

- Bandwidth can also be chosen for each individual sample separately, leading to much improved estimates (lower density -> wider kernels, higher density -> narrower kernels)
- The optimal bandwidth selection problem in higher dimensions is still an active area of research
- Important idea: penalized MLE approaches (Why? Dirac catastrophe)
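One classical way to make per-sample bandwidths concrete (my choice of method, not named on the slides) is Abramson's square-root law: a pilot estimate at the sample points sets local factors λ_i ∝ f_pilot(x_i)^(-1/2), widening kernels where density is low. A sketch building on the fixed-bandwidth function above:

```python
import numpy as np

def gaussian_kde_adaptive(x_grid, samples, h):
    """Adaptive-bandwidth KDE via Abramson's square-root law (sketch)."""
    samples = np.asarray(samples)
    # Pilot estimate evaluated at the sample points themselves
    pilot = gaussian_kde_fixed(samples, samples, h)
    # Local factors lambda_i = (pilot_i / g)^(-1/2), g = geometric mean
    g = np.exp(np.mean(np.log(pilot)))
    lam = (pilot / g) ** -0.5
    x_grid = np.asarray(x_grid)[:, None]
    u = (x_grid - samples[None, :]) / (h * lam[None, :])
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return (K / (h * lam[None, :])).sum(axis=1) / samples.size
```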

13 KDE and MCMC: Caveat

Optimal bandwidth selection algorithms are typically designed assuming i.i.d. samples. This is NOT true for MCMC output.
- Transition kernels that admit a density: not very problematic
- General Metropolis-Hastings (point mass at current point!): more problematic; adaptations of standard procedures have been devised, and some specialized literature exists

14 Review: Metropolis-Hastings MCMC

Suppose the transition kernel takes the form

$$P(x, dy) = p(x, y)\,dy + r(x)\,\delta_x(dy),$$

where $p(x, x) = 0$, $\delta_x(dy) = 1$ if $x \in dy$ and $0$ otherwise, and

$$r(x) = 1 - \int_{\mathbb{R}^n} p(x, y)\,dy$$

is the probability that the chain remains at $x$.

15 Review: Metropolis-Hastings

Now, if $p(x, y)$ satisfies the so-called "detailed balance" or "reversibility" condition

$$\pi(x)\,p(x, y) = \pi(y)\,p(y, x),$$

then $\pi(\cdot)$ is the invariant density of $P(x, \cdot)$.

16 Review: Metropolis-Hastings - How to achieve detailed balance

The specific form for $p(x, y)$ in the Metropolis-Hastings algorithm is

$$p_{MH}(x, y) = q(x, y)\,\alpha(x, y), \quad x \neq y.$$

Suppose $\pi(x)\,q(x, y) > \pi(y)\,q(y, x)$, i.e., moves from $x$ to $y$ are proposed "too often". For this case we may wish to set $\alpha(y, x) = 1$, the largest possible value for a probability, and can then compute $\alpha(x, y)$ from the detailed balance condition

$$\pi(x)\,q(x, y)\,\alpha(x, y) = \pi(y)\,q(y, x)\,\alpha(y, x)$$

to obtain

$$\alpha(x, y) = \frac{\pi(y)\,q(y, x)}{\pi(x)\,q(x, y)},$$

and similarly for the inequality in the other direction by setting $\alpha(x, y) = 1$.

17 Review: Metropolis-Hastings - How to achieve detailed balance

So to obtain detailed balance/reversibility, we choose

$$\alpha(x, y) = \begin{cases} \min\!\left(\dfrac{\pi(y)\,q(y, x)}{\pi(x)\,q(x, y)},\, 1\right) & \text{if } \pi(x)\,q(x, y) > 0, \\[1ex] 1 & \text{otherwise,} \end{cases}$$

which, together with the probability of staying at the current position,

$$r(x) = 1 - \int_{\mathbb{R}^n} q(x, y)\,\alpha(x, y)\,dy,$$

yields an overall transition kernel that is a special case of the version from the previous slide:

$$P_{MH}(x, dy) = q(x, y)\,\alpha(x, y)\,dy + \left[1 - \int_{\mathbb{R}^n} q(x, y)\,\alpha(x, y)\,dy\right]\delta_x(dy),$$

for which we saw that it does have the desired invariant distribution, since we have detailed balance/reversibility by construction.

18 Review: Observations

- For a symmetric proposal distribution, that is $q(x, y) = q(y, x)$, the acceptance probability reduces to $\alpha(x, y) = \min\!\left(\frac{\pi(y)}{\pi(x)},\, 1\right)$, so that "uphill" moves will always be accepted (simulated annealing!)
- For $q(x, y) = \pi(y)$, $\alpha(x, y) = 1$: if the proposal distribution is the true distribution we wish to sample from, we will always accept the move
- The PDF of the distribution of interest $\pi$ need only be known up to a constant scalar factor, since it appears in both numerator and denominator

19 Review: Algorithm

Initialization:
- Specify the family of proposal distributions q(x, y)
- Desired number of samples N
- Initial value x_0

Main loop: repeat for j = 0, 1, ..., N-1:
- Generate (sample) y from q(x_j, .) and u from the uniform distribution on [0, 1]
- If u <= α(x_j, y), set x_{j+1} = y
- Else set x_{j+1} = x_j

Termination:
- Return the set of samples {x_1, ..., x_N}
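A minimal sketch of this loop (in Python/NumPy rather than the course's MATLAB; a Gaussian random walk is assumed as a concrete symmetric proposal, so the q-ratio cancels in α, and `log_target` is any function returning the log of the possibly unnormalized target density):

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_samples, prop_sigma, rng=None):
    """Random-walk Metropolis sampler (symmetric proposal).

    Returns the chain and the empirical acceptance rate."""
    rng = np.random.default_rng(rng)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    log_px = log_target(x)
    samples, accepted = [], 0
    for _ in range(n_samples):
        y = x + prop_sigma * rng.standard_normal(x.shape)  # propose
        log_py = log_target(y)
        # accept with probability min(pi(y)/pi(x), 1), compared in log domain
        if np.log(rng.uniform()) <= log_py - log_px:
            x, log_px = y, log_py
            accepted += 1
        samples.append(x.copy())
    return np.array(samples), accepted / n_samples
```

For example, `metropolis_hastings(lambda x: -0.5 * np.sum(x**2), [0.0], 10000, 1.0)` draws (dependent) samples from a standard normal.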

20 MCMC: Theoretical convergence

We've seen that detailed balance/reversibility ensures that the desired distribution is a stationary distribution of the Markov chain. This, in and of itself, does not ensure that the chain will actually converge to that distribution, that is, eventually sample from it when started at an arbitrary point.

22 MCMC: Theoretical convergence

Convergence will occur under mild regularity conditions, namely (very roughly):
- Irreducibility: for any state of the Markov chain, there is a positive probability of reaching any small subset dy in the support of the distribution in finite time
- Aperiodicity: we will not get trapped in cycles

This can be made more detailed and precise. For practical purposes, it is usually not an issue if reasonable proposal distributions are chosen.

23 MCMC: Rate of convergence Much more important from a practical perspective: how fast does it converge, i.e., when can we start to believe that the sample we have obtained represents the target distribution reasonably well? This will critically depend on the choice of the proposal distribution

24 MCMC: burn-in

To eliminate the initial influence of the choice of starting point for the Markov chain, one usually discards a number of initial samples, the so-called burn-in period. Although some argue that this is theoretically not required, it will avoid undue influence of unlikely starting points for finite sample sizes.

25 Thinning

The practice of storing only every k-th sample produced by the MCMC algorithm. It cannot improve the description of the target distribution, but may save memory without actually worsening the description, since immediately subsequent samples may carry little independent information if autocorrelation is high.
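In code, burn-in removal and thinning together are a single slicing step (the values of `b` and `k` are illustrative; `samples` stands in for a raw chain such as the output of the sampler sketched earlier):

```python
import numpy as np

samples = np.arange(10000.0)  # stand-in for a raw MCMC chain
b, k = 2000, 10               # burn-in length and thinning interval (illustrative)
chain_used = samples[b::k]    # drop the first b samples, keep every k-th thereafter
```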

26 MCMC: tuning

- In standard algorithms, the proposal bandwidth needs to be tuned
- For special cases, theoretical optima can be determined
- A theoretically justifiable tuning target for practical problems is the acceptance rate, that is, the ratio of the number of accepted steps to the total number of steps
- Recommendations vary somewhat; an acceptance rate between 0.2 and 0.6 covers most of them
- There is a strong relationship between proposal bandwidth, autocorrelation of the sample, and sampling efficiency, that is (roughly), how many dependent samples we expect to need to obtain one independent sample
- The relationship between proposal bandwidth and autocorrelation is NOT monotonic!
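A crude manual tuning loop along these lines, as a sketch reusing the `metropolis_hastings` function from the earlier slide (the target window and the scaling factors are illustrative choices, not recommendations from the lecture):

```python
import numpy as np

log_target = lambda x: -0.5 * np.sum(x ** 2)  # stand-in target (standard normal)
prop_sigma = 1.0
for _ in range(10):
    _, acc = metropolis_hastings(log_target, [0.0], 1000, prop_sigma)
    if acc < 0.2:
        prop_sigma *= 0.7   # too many rejections: take smaller steps
    elif acc > 0.6:
        prop_sigma *= 1.4   # accepting too often: take larger steps
    else:
        break               # acceptance rate in the recommended window
```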

27 MCMC: importance of diagnosing convergence

Key risks when using (deceptively simple) MCMC methods are:
- Inappropriate modeling: the model may be unable to fit the data => perform sanity checks in regions of high likelihood/posterior density
- Programming errors: may be impossible to detect in a realistic problem => always write generically applicable code and test it on problems with known answers
- Slow convergence: the simulation may remain in a region heavily influenced by the starting condition for many iterations ("mixing" problem). This is a fundamental issue: a finite run will never explore the distribution in all detail

28 MCMC: diagnosing convergence

- No fully satisfactory answer exists (and perhaps none can)
- Key problem: we are trying to infer something from the sample itself, about which it may carry no information (sketch)
- Nevertheless, many methods, more or less heuristic in nature, have been proposed
- We will look at a few (by no means exhaustively)

29 Graphical diagnosis

Plot positions as a function of time, that is, iteration number, and attempt to determine visually when the process has settled down (reached the stationary distribution?)
- Can detect continued drift from a far-out starting value
- Gives a feeling for acceptance behavior and jump sizes
- Cannot detect complete failure to mix
- Rather subjective

30 Comparison of multiple chains (Gelman)

- Run several independent chains from randomly selected, overdispersed starting points
- Compare the resulting distributions; stop only when they are indistinguishable by some meaningful criterion
- This can be formalized, for example, by requiring for convergence that the variance between chains be no larger than the variance within individual chains (Gelman, 1995)
- In practice, it may be hard to ascertain overdispersion of the starting points
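The between- vs. within-chain variance comparison can be made concrete as the potential scale reduction factor (Gelman-Rubin diagnostic). A minimal sketch for a single scalar parameter (the published version adds a degrees-of-freedom correction omitted here):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for m chains of length n.

    chains : array of shape (m, n), one scalar parameter per chain."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)
```

Values close to 1 suggest (but do not prove) that the chains have converged to the same distribution.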

31 Raftery and Lewis

- Based on 2-state Markov chain theory
- Tries to bound errors on estimated quantiles of the true distribution
- Will give a recommended number of burn-in iterations to be discarded and a recommended run length
- Available as an R package and in a FORTRAN implementation from Statlib (lib.stat.cmu.edu/general/gibbsit)
- Rather computationally expensive, so the FORTRAN version is recommended for large sample sizes

32 And many others

Bottom line:
- Graphical convergence diagnostics can serve as monitoring aids
- Formal convergence diagnostics can help to assess output
- Repeated (or parallel) runs can help to gain confidence in the accurate representation of the distribution, possibly aided by formal comparison of statistical properties of the chains
- To my knowledge, no practical method can rigorously ensure convergence from MCMC output

33 Convergence diagnostics: Resources

- Cowles & Carlin, "Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review", Journal of the American Statistical Association, Vol. 91, No. 434, provides a critical review of approx. 10 different convergence diagnostic approaches (including the famous classics) and gives recommendations for application
- The CODA package for R implements a few established diagnostics
- A fast FORTRAN implementation of the Raftery & Lewis convergence diagnostic is available from Statlib (see previous slide). You may have to adjust the preallocated array size in the source code for larger samples.

34 Assignment no. 10

- Modify your Metropolis-Hastings algorithm from assignment no. 9 to accept a function that provides the logarithm of the target density as input and to perform its acceptance computation using these values directly (that is, the exponentiation is never actually performed; for this to work, you will have to transform the random sample from the uniform distribution on [0, 1] before comparison, see the sketch after this list)
- Combine the modified algorithm with a modification of the likelihood function for the van der Pol system from assignment no. 8 that computes the logarithm of the likelihood up to an additive constant (corresponding to the familiar constant factor in the non-logarithmic likelihood) as a function of μ and the initial condition for state 1
- For the sampling to work reliably, you will have to catch integration errors using the lastwarn function. We will assume that failure to integrate corresponds to 0 likelihood (-Inf in the log domain)
- Generate an artificial observation set of 20 measurements of both states with additive noise with standard deviation 0.5 in the usual fashion, and store it so that you can reuse it for each of the following steps to achieve reproducibility. Alternatively, you can reset the random seeds appropriately.
- Using runs of 1000 samples each, tune the proposal bandwidth such that you obtain an acceptance rate of approx. … from the starting point μ = 1.5, y0_1 = 1.5
- Plot histograms of the marginal distributions obtained by running the sampler for … iterations with the above starting point, as well as when using [3, 3] as a starting point. Also compare results when discarding burn-in periods of 0, 2000, and 5000 samples. What do you observe? How do you interpret your findings?
- Note: please provide plots & documentation of acceptance rates etc. so I don't have to rerun your simulations.
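As a hint on the log-domain acceptance test (a Python sketch, not the required MATLAB implementation): since u <= α is equivalent to log(u) <= log(α), the comparison can be done entirely in the log domain, and a -Inf log-likelihood from a failed integration is then rejected automatically:

```python
import numpy as np

def accept_log_domain(log_post_proposed, log_post_current, rng):
    """Acceptance test done entirely in the log domain (symmetric proposal).

    log(alpha) = log(min(ratio, 1)) = min(log-ratio, 0); the
    exponentiation is never performed, and a -inf proposed
    log-posterior (e.g., failed integration) is always rejected."""
    log_alpha = min(0.0, log_post_proposed - log_post_current)
    return np.log(rng.uniform()) <= log_alpha

# e.g.: accept_log_domain(-1250.3, -1248.7, np.random.default_rng())
```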
