Statistical Methods for Data Analysis. Probability and PDF s



Similar documents
Introduction to Probability

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

Part 4 fitting with energy loss and multiple scattering non gaussian uncertainties outliers

E3: PROBABILITY AND STATISTICS lecture notes

LOGNORMAL MODEL FOR STOCK PRICES

Probability and statistics; Rehearsal for pattern recognition

Factoring Patterns in the Gaussian Plane

An Introduction to Basic Statistics and Probability

Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab

9.2 Summation Notation

Chapter ML:IV. IV. Statistical Learning. Probability Basics Bayes Classification Maximum a-posteriori Hypotheses

Monte Carlo Methods in Finance

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

STOCHASTIC MODELING. Math3425 Spring 2012, HKUST. Kani Chen (Instructor)

Sums of Independent Random Variables

Algebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the school year.

Theory versus Experiment. Prof. Jorgen D Hondt Vrije Universiteit Brussel jodhondt@vub.ac.be

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Master s Theory Exam Spring 2006

Call Price as a Function of the Stock Price

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

1 Portfolio mean and variance

Assignment 2: Option Pricing and the Black-Scholes formula The University of British Columbia Science One CS Instructor: Michael Gelbart

The Black-Scholes Formula

MAS108 Probability I

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

STA 4273H: Statistical Machine Learning

BROWNIAN MOTION DEVELOPMENT FOR MONTE CARLO METHOD APPLIED ON EUROPEAN STYLE OPTION PRICE FORECASTING

THE CENTRAL LIMIT THEOREM TORONTO

Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Basic Probability. Probability: The part of Mathematics devoted to quantify uncertainty

Algebra Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard

1 Interest rates, and risk-free investments

Probability and Expected Value

MATH 095, College Prep Mathematics: Unit Coverage Pre-algebra topics (arithmetic skills) offered through BSE (Basic Skills Education)

MATH. ALGEBRA I HONORS 9 th Grade ALGEBRA I HONORS

Pricing of a worst of option using a Copula method M AXIME MALGRAT

Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Prentice Hall: Middle School Math, Course Correlated to: New York Mathematics Learning Standards (Intermediate)

3. INNER PRODUCT SPACES

Statistical Methods for Data Analysis. Random numbers with ROOT and RooFit

ECE302 Spring 2006 HW4 Solutions February 6,

LOOKING FOR A GOOD TIME TO BET

The Math. P (x) = 5! = = 120.

The Normal Approximation to Probability Histograms. Dice: Throw a single die twice. The Probability Histogram: Area = Probability. Where are we going?

APPLIED MATHEMATICS ADVANCED LEVEL

Introduction to time series analysis

The Monte Carlo Framework, Examples from Finance and Generating Correlated Random Variables

Pre-Algebra Academic Content Standards Grade Eight Ohio. Number, Number Sense and Operations Standard. Number and Number Systems

Higher Education Math Placement

North Carolina Math 2

Appendix 3 IB Diploma Programme Course Outlines

Sample Space and Probability

Moreover, under the risk neutral measure, it must be the case that (5) r t = µ t.

Practical Applications of Stochastic Modeling for Disability Insurance

Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

ECE302 Spring 2006 HW3 Solutions February 2,

Lecture 6: Option Pricing Using a One-step Binomial Tree. Friday, September 14, 12

Tutorial on Markov Chain Monte Carlo

04 Mathematics CO-SG-FLD Program for Licensing Assessments for Colorado Educators

Part III. Lecture 3: Probability and Stochastic Processes. Stephen Kinsella (UL) EC4024 February 8, / 149

CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION

Mathematics (MAT) MAT 061 Basic Euclidean Geometry 3 Hours. MAT 051 Pre-Algebra 4 Hours

Metrics on SO(3) and Inverse Kinematics

LECTURE 9: A MODEL FOR FOREIGN EXCHANGE

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 5 9/17/2008 RANDOM VARIABLES

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Solving Simultaneous Equations and Matrices

9. Sampling Distributions

Common Core Unit Summary Grades 6 to 8

Number Sense and Operations

Lecture 4 DISCRETE SUBGROUPS OF THE ISOMETRY GROUP OF THE PLANE AND TILINGS

Algebra 1 Course Title

The Behavior of Bonds and Interest Rates. An Impossible Bond Pricing Model. 780 w Interest Rate Models

The Intuition Behind Option Valuation: A Teaching Note

Big Ideas in Mathematics

ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE

Pricing Barrier Options under Local Volatility

To begin, I want to introduce to you the primary rule upon which our success is based. It s a rule that you can never forget.

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions

Pricing Discrete Barrier Options

Chapter 4. Probability Distributions

Prentice Hall Algebra Correlated to: Colorado P-12 Academic Standards for High School Mathematics, Adopted 12/2009

1 The Brownian bridge construction

Valuing Stock Options: The Black-Scholes-Merton Model. Chapter 13

Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution

Chapter 4 Lecture Notes

CS 522 Computational Tools and Methods in Finance Robert Jarrow Lecture 1: Equity Options

An introduction to Value-at-Risk Learning Curve September 2003

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

Option Valuation. Chapter 21

PYKC Jan Lecture 1 Slide 1

In mathematics, there are four attainment targets: using and applying mathematics; number and algebra; shape, space and measures, and handling data.

INTERNATIONAL COMPARISON OF INTEREST RATE GUARANTEES IN LIFE INSURANCE

Hypothesis Testing for Beginners

What Is Probability?

Review of Random Variables

Review of Basic Options Concepts and Terminology

Transcription:

Statistical Methods for Data Analysis Probability and PDF s Luca Lista INFN Napoli

Definition of probability There are two main different definitions of the concept of probability Frequentist Probability is the ratio of the number of occurrences of an event to the total number of experiments, in the limit of very large number of repeatable experiments. Can only be applied to a specific classes of events (repeatable experiments) Meaningless to state: probability that the lightest SuSy particle s mass is less tha 1 TeV Bayesian Probability measures someone s the degree of belief that something is or will be true: would you bet? Can be applied to most of unknown events (past, present, future): Probability that Velociraptors hunted in groups Probability that S.S.C Naples will win next championship Luca Lista Statistical Methods for Data Analysis 2

Classical probability The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought. The ratio of this number to that of all the cases possible is the measure of this probability, which is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible. Pierre Simon Laplace (1749-1827) Pierre-Simon Laplace, A Philosophical Essay on Probabilities Luca Lista Statistical Methods for Data Analysis 3

Classical Probability Probability = Number of favorable cases Number of total cases Assumes all accessible cases are equally probable This analysis is rigorously valid on discrete cases only Problems in continuous cases ( Bertrand s paradox) P = 1/2 P = 1/6 (each dice) P = 1/10 P = 1/4 Luca Lista Statistical Methods for Data Analysis 4

What about something like this? We should move a bit further Luca Lista Statistical Methods for Data Analysis 5

Probability and combinatorial Complex cases are managed via combinatorial analysis Reduce the event of interest into elementary equiprobable events Sample space Set algebra and/or/not intersection/union/complement E.g: 2 = {(1,1)} 3 = {(1,2), (2,1)} 4 = {(1,3), (2,2), (3,1)} 5 = {(1,4), (2,3), (3,2), (4,1)} etc. Luca Lista Statistical Methods for Data Analysis 6

Random extractions Success is the extraction of a red ball in a container of mixed white and red ball Red: p = 3/10 White: 1 p = 7/10 Success could also be: track reconstructed by a detector, or event selected by a set of cuts Classic probability applies only to integer cases, so strictly speaking, p should be a rational number p = 3/10 Luca Lista Statistical Methods for Data Analysis 7

Multiple random extractions p 1 p Extraction path leads to Pascal s / Tartaglia s triangle like (a + b) n 1 1 p 1 p p 1 p 1 2 1 p 1 p p 1 p p 1 p 1 3 3 1 p 1 p p 1 p p 1 p p 1 4 6 4 1 p 1 Luca Lista Statistical Methods for Data Analysis 8

Binomial distribution Distribution of number of successes on N trials, each trial with success probability p Average: n = Np Variance: n 2 n 2 = Np(1 p) Frequently used for efficiency estimate Efficiency error σ = Var(n) ½ : Note: σ ε = 0 for ε = 0, 1 Luca Lista Statistical Methods for Data Analysis 9

Tartaglia or Pascal? India: 10 th century commentaries of Chandas Shastra Pingala, dating 5 th -2 nd century BC Persia: Al-Karaji (953 1029), Omar Khayyám (, 1048-1131): Khayyam triangle China: Yang Hui (, 1238-1298): Yang Hui triangle Germany: Petrus Apianus (1495-1552) Italy: Niccolò Fontana Tartaglia (Ars Magna, by Gerolamo Cardano, 1545): Triangolo di Tartaglia France: Blaise Pascal (Traité du triangle arithmétique, 1655) Luca Lista Statistical Methods for Data Analysis 10

Bertrand s paradox Given a randomly chosen chord on a circle, what is the probability that the chord s length is larger than the side of the inscribed triangle? P = 1/3 P = 1/2 P = 1/4 Randomly chosen is not a well defined concept in this case Some classical probability concepts become arbitrary until we move to PDF s (uniform in which metrics?) Luca Lista Statistical Methods for Data Analysis 11

Probability definition (freqentist) A bit more formal definition of probability: Law of large numbers: if i.e.: isn t it a circular definition? Luca Lista Statistical Methods for Data Analysis 12

In a picture Luca Lista Statistical Methods for Data Analysis 13

Problems with probability definitions Frequentist probability is, to some extent, circularly defined A phenomenon can be proven to be random (i.e.: obeying laws of statistics) only if we observe infinite cases F.James et al.: this definition is not very appealing to a mathematician, since it is based on experimentation, and, in fact, implies unrealizable experiments (N ). But a physicist can take this with some pragmatism A frequentist model can be justified by details of poorly predictable underlying physical phenomena Deterministic dynamic with instability (chaos theory, ) Quantum Mechanics is intrinsically probabilistic! A school of statisticians state that Bayesian statistics is a more natural and fundamental concept, and frequentist statistic is just a special sub-case On the other hand, Bayesian statistics is subjectivity by definition, which is unpleasant for scientific applications. Bayesian reply that it is actually inter-subjective, i.e.: the real essence of learning and knowing physical laws Frequentist approach is preferred by the large fraction of physicists (probably the majority, but Bayesian statistics is getting more and more popular in many application, also thanks to its easier application in many cases Luca Lista Statistical Methods for Data Analysis 14

Axiomatic definition (A. Kolmogorov) Axiomatic probability definition applies to both frequentist and Bayesian probability Let (Ω, F 2 Ω, P) be a measure space that satisfy: 1 2 3 Andrej Nikolaevič Kolmogorov (1903-1987) Terminology: Ω = sample space, F = event space, P = probability measure So we have a formalism to deal with different types of probability Luca Lista Statistical Methods for Data Analysis 15

Conditional probability Probability of A, given B: P(A B) i.e.: probability that an event known to belong to set B, is also a member of set A: P(A B) = P(A B) / P(B) Event A is said to be independent on B if the conditional probability of A given B is equal to the probability of A: P(A B) = P(A) Hence, if A is independent on B: P(A B) = P(A) P(B) A B If A is independent on B, B is independent on A Luca Lista Statistical Methods for Data Analysis 16

Prob. Density Functions (PDF) Sample space = { } Experiment = one point on the sample space Event = a subset A of the sample space P.D.F. = Probability of an event A = Differential probability: For continuous cases, the probability of an event made of a single experiment is zero: P({x 0 }) = 0 Discrete variables may be treated as Dirac s δ uniform treatment of continuous and discrete cases Luca Lista Statistical Methods for Data Analysis 17

Variables transformation (discrete) 1D case: y = Y(x) {x 1,, x n } {y 1,, y m } = {Y(x 1 ),, Y(x n )} Note that different Y(x n ) may coincide (n m)! So: Generalization to more variables is straightforward: Sum on all cases which give the right combination (z) Will see how to generalize to the continuous case and get the error propagation! Luca Lista Statistical Methods for Data Analysis 18

Coordinate transformation From 2D to 1D: Generic change of coordinate (2D): Generalization of discrete case: sum only on elementary events (experiments) where: xʹ i = Xʹ i(x 1, x 2 ) (e.g.: result of sum of two dices) Easy to implement with Monte Carlo If the relation is invertible, the Jacobian determinant has to be multiplied by the transformed PDF Replace (x, y) in f with inverse transformation Transform of the n-d volume with jacobian: Luca Lista Statistical Methods for Data Analysis 19

PDF Examples

Gaussian distribution Carl Friedrich Gauss (1777-1855) Average = x = µ Variance = (x µ) 2 = σ 2 Widely used mainly because of the central limit theorem next slide Luca Lista Statistical Methods for Data Analysis 21

Central limit theorem The average of N random variables R n converges to a Gaussian, irrespective to the original distributions Adding n flat distributions Basic regularity conditions must hold (incl. finite variance) Luca Lista Statistical Methods for Data Analysis 22

Uniform ( flat ) distribution RMS = Model for position of rain drops, time of cosmic ray passage, etc. Basic distribution for pseudo-random number generators Luca Lista Statistical Methods for Data Analysis 23

Cumulative distribution Given a PDF f(x), the cumulative is defined as: The PDF for F is uniformly distributed in [0, 1]: Luca Lista Statistical Methods for Data Analysis 24

Distance from a time t 0 to first count t 0 is an arbitrary time Also: t = time between t 0 and the first count Notation: P(n,[t 1, t 2 ]) = probability of n entries in [t 1, t 2 ] given a rate r t 0 = 0 t δt t 1 Neglect the probability that two or more occur, O(δt 2 ) Independent events Equivalently: Luca Lista Statistical Methods for Data Analysis 25

Exponential distribution Cumulative r = 0.5 r = 1.0 r = 1.5 Luca Lista Statistical Methods for Data Analysis 26

Poisson distribution x Probability to have n entries in x X >> x Expect on average ν = N x / X = r x Binomial, where p = x / X = ν / N << 1 r = N / X, N, X Siméon-Denis Poisson (1781-1840) Luca Lista Statistical Methods for Data Analysis 27

Poisson limit with large ν For large ν a Gaussian approximation is sufficiently accurate ν = 2 ν = 5 ν = 10 ν = 20 ν = 30 Luca Lista Statistical Methods for Data Analysis 28

Summing Poissonian variables Probability distribution of the sum of two Poissonian variables with expected values ν 1 and ν 2 : P(n) = Σ m=0 n Poiss(m; ν 1 ) Poiss(n m; ν 2 ) The result is still a Poissonian: P(n) = Poiss(n; ν 1 + ν 2 ) Useful when combining Poissonian signal plus background: P(n; s, b) = Poiss(n; s + b) The same holds for convolution of binomial and Poissonian Take a fraction of Poissonian events with binomial efficiency No surprise, given how we constructed Poissonian probability! Luca Lista Statistical Methods for Data Analysis 29

Demonstration: Poisson Binomial Luca Lista Statistical Methods for Data Analysis 30

Other frequently used PDFs Argus function Crystal ball distribution Landau distribution Luca Lista Statistical Methods for Data Analysis 31

Argus function Mainly used to model background in mass peak distributions that exhibit a kinematic boundary BABAR The primitive can be computed in terms of error functions, so the numerical normalization within a given range is feasible Luca Lista Statistical Methods for Data Analysis 32

Argus primitive For the records: For ξ<0: For ξ 0: But please, verify with a symbolic integrator before using my formulae! Luca Lista Statistical Methods for Data Analysis 33

Crystal Ball function Adds an asymmetric power-law tail to a Gaussian PDF with proper normalization and continuity of PDF and its derivative Used first by the Crystal Ball collaboration at SLAC α = 0.1 α = 1 α = 10 Luca Lista Statistical Methods for Data Analysis 34

Landau distribution Used to model the fluctuations in the energy loss of particles in thin layers More frequently, scaled and shifted: Implementation provided by GNU Scientific Library (GSL) and ROOT (TMath::Landau) µ = 2 σ = 1 Luca Lista Statistical Methods for Data Analysis 35

PDFs in more dimensions

Multi-dimensional PDF 1D projections (marginal distributions): x and y are independent if: y We saw that A and B are independent events if: Luca Lista Statistical Methods for Data Analysis 37 x

Conditional distributions PDF w.r.t. y, given x = x 0 PDF should be projected and normalized with the given condition Remind: P(A B) = P(A B) / P(B) y x 0 x Luca Lista Statistical Methods for Data Analysis 38

Covariance and cov. matrix Definitions: Covariance: Correlation: Correlated n-dimensional Gaussian: where: Luca Lista Statistical Methods for Data Analysis 39

Two-dimensional Gaussian Product of two independent Gaussians with different σ Rotation in the (x, y) plane Luca Lista Statistical Methods for Data Analysis 40

Two-dimensional Gaussian (cont.) Rotation preserves the metrics: Covariance in rotated coordinates: Luca Lista Statistical Methods for Data Analysis 41

Two-dimensional Gaussian (cont.) A pictorial view of an iso-probability contour y σ y σ xʹ ϕ σ yʹ σ x x Luca Lista Statistical Methods for Data Analysis 42

1D projections PDF projections are (1D) Gaussians: Areas of 1σ and 2σ contours differ in 1D and 2D! y P 1D P 2D 1σ 0.6827 0.3934 2σ 0.9545 0.8647 2σ 1σ x 3σ 0.9973 0.9889 1.515σ 0.6827 2.486σ 0.9545 3.439σ 0.9973 1σ 2σ Luca Lista Statistical Methods for Data Analysis 43

Correlation and independence Independent variables are uncorrelated But not necessarily vice-versa Uncorrelated, but not independent! Luca Lista Statistical Methods for Data Analysis 44

PDF convolution Concrete example: add experimental resolution to a known PDF The intrinsic PDF of the variable x 0 is f(x 0 ) Given a true value x 0, the probability to measure x is: r(x, x 0 ) May depend on other parameters (e.g.: σ = experimental resolution, if r is a Gaussian) The probability to measure x considering both the intrinsic fluctuation and experimental resolution is the convolution of f with r: Often referred to as: g = f r Luca Lista Statistical Methods for Data Analysis 45

Convolution and Fourier Transform Reminder of Fourier transform definition: It can be demonstrated that: In particular, the FT of a Gaussian is still a Gaussian: Note: σ goes to the numerator! Numerically, FFT can be convenient for computation of convolution PDF s ( RooFit) Luca Lista Statistical Methods for Data Analysis 46

A small digression Application to economics

Familiar example: multiple scattering Assume the limit of small scattering angles Can add single random scattering angles θ = Σ i θ i For many steps, the distribution of θ can be approximated with a Gaussian θ = 0 σ θ 2 = Σ i δ 2 = N δ 2 = δ 2 x / Δx Hence: σ θ x This is similar to a Brownian motion, where in general (as a function of time): σ(t) t = σ t x More precisely (from PDG): θ Luca Lista Statistical Methods for Data Analysis 48

Stock prices vs bond prices Stock prices are represented as geometric Brownian motion: s(t) = s 0 e y(t) Where y(t) = µ t + σ n(t) Stock volatility: σ Bond growth (risk-free and deterministic): b(t) = b 0 e αt Discount rate: α Deterministic term Stochastic Brownian term price t s(t) b(t) Luca Lista Statistical Methods for Data Analysis 49

Average stock option at time t Average computed as usual: s(t) = s 0 e µ t e σ n(t) = s 0 e µ t e σ n(t) It s easy to demonstrate that, if w is Gaussian: e w = e w + Var w / 2 The Brownian variance is σ 2 t, hence: s(t) = s 0 e (µ + σ 2 / 2) t The risk-neutral price for the stock must be such that it produces no gain w.r.t. bonds: s(t) = b(t) µ + σ 2 /2 = α Luca Lista Statistical Methods for Data Analysis 50

Stock options A call-option is the right, buy not obligation, to buy one share at price K at time t in the future Gain at time t: g = s(t) K if K < s(t) g = 0 if K s(t) What is the risk-neutral price of the option? price s(t) K Luca Lista Statistical Methods for Data Analysis 51 t

Black-Scholes model The average gain minus cost c must be equal to bond gain: c e αt = g Equivalently: gain Gaussian PDF Which gives: Where: φ is the cumulative distribution of a normal Gaussian Luca Lista Statistical Methods for Data Analysis 52

Limit for t 0 For current time, the price is what you would expect, since there is no fluctuation At fixed (larger) t, the price curve gets smoothed by the Gaussian fluctiations Luca Lista Statistical Methods for Data Analysis 53

Sensitivity to stock price and time Chris Murray Luca Lista Statistical Methods for Data Analysis 54

Black and Scholes Black had a PhD in applied Mathematics. Died in 1995 Scholes won the Nobel price in Economics on 1997 He was co-funder of the hedgefund Long-Term Capital Management After gaining around 40% for the first years, it lost in 1998 $4.6 billion in less than four months and failed after the East Asian financial crisis Fischer Sheffey Black 1938 1995 Myron Samuel Scholes 1941 - Luca Lista Statistical Methods for Data Analysis 55

The End Nobody s perfect! Luca Lista Statistical Methods for Data Analysis 56