Lecture 1 Probability and Statistics


Introduction:
- Understanding of many physical phenomena depends on statistical and probabilistic concepts:
  - Statistical Mechanics (physics of systems composed of many parts: gases, liquids, solids)
    - 1 mole of anything contains 6x10^23 particles (Avogadro's number)
    - impossible to keep track of all 6x10^23 particles even with the fastest computer imaginable
      → resort to learning about the group properties of all the particles
      → partition function: calculate energy, entropy, pressure... of a system
  - Quantum Mechanics (physics at the atomic or smaller scale)
    - wavefunction = probability amplitude
      → probability of an electron being located at (x, y, z) at a certain time
- Understanding/interpretation of experimental data depends on statistical and probabilistic concepts:
  - how do we extract the best value of a quantity from a set of measurements?
  - how do we decide if our experiment is consistent/inconsistent with a given theory?
  - how do we decide if our experiment is internally consistent?
  - how do we decide if our experiment is consistent with other experiments?
  → In this course we will concentrate on the above experimental issues!

K.K. Gan, L1: Probability and Statistics

Definition of probability:
- Suppose we have N trials and a specified event occurs r times.
  - example: rolling a die, where the event could be rolling a 6.
  - define the probability (P) of an event (E) occurring as:
    P(E) = r/N  as N → ∞
  - examples:
    six-sided die: P(6) = 1/6
    coin toss: P(heads) = 0.5
    → P(heads) should approach 0.5 the more times you toss the coin.
    → for a single coin toss we can never get P(heads) = 0.5!
  - by definition, probability is a non-negative real number bounded by 0 ≤ P ≤ 1
    - if P = 0, the event never occurs
    - if P = 1, the event always occurs
    - the sum (or integral) of all probabilities, if they are mutually exclusive, must equal 1.
  - events are independent if: P(A∩B) = P(A)P(B)   (∩ intersection, ∪ union)
  - events are mutually exclusive (disjoint) if: P(A∩B) = 0, in which case P(A∪B) = P(A) + P(B)
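The frequency definition P(E) = r/N as N → ∞ can be demonstrated with a short simulation (a minimal sketch using Python's standard library; the function name `estimate_p_heads` is mine, not from the lecture):

```python
import random

def estimate_p_heads(n_tosses, seed=0):
    """Estimate P(heads) as r/N: the fraction of heads in n_tosses fair coin tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The frequency estimate r/N approaches 0.5 as N grows,
# but for small N it can be far from 0.5.
for n in (10, 1000, 100000):
    print(n, estimate_p_heads(n))
```

Note that for a single toss the estimate is necessarily 0 or 1, never 0.5, exactly as the slide points out.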

- Probability can be a discrete or a continuous variable.
  - Discrete probability: P can take certain values only.
    - examples:
      tossing a six-sided die: P(x_i) = P_i, where x_i = 1, 2, 3, 4, 5, 6 and P_i = 1/6 for all x_i.
      tossing a coin: only 2 choices, heads or tails.
    - for both of the above discrete examples (and in general), when we sum over all mutually exclusive possibilities:
      Σ_i P(x_i) = 1
  - Continuous probability: P can be any number between 0 and 1.
    - define a probability density function, pdf, f(x):
      f(x) dx = dP(x ≤ a ≤ x + dx),  with a a continuous variable
    - the probability for x to be in the range a ≤ x ≤ b is:
      P(a ≤ x ≤ b) = ∫_a^b f(x) dx
    - just like the discrete case, the sum of all probabilities must equal 1:
      ∫_{−∞}^{+∞} f(x) dx = 1
      → f(x) is normalized to one.
    - the probability for x to be exactly some number is zero, since:
      ∫_{x=a}^{x=a} f(x) dx = 0
Notation: x_i is called a random variable.
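The normalization condition ∫ f(x) dx = 1 and the interval probability ∫_a^b f(x) dx can be checked numerically. A sketch using the Gaussian as an example f(x), with a simple hand-rolled trapezoidal integrator (helper names are illustrative, not from the lecture):

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian pdf with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, n=100000):
    """Trapezoidal approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

total = integrate(gaussian_pdf, -10, 10)   # ≈ 1: the pdf is normalized
p_1sigma = integrate(gaussian_pdf, -1, 1)  # ≈ 0.683: P(−σ ≤ x ≤ σ)
print(total, p_1sigma)
```

The second number is the familiar 68% one-standard-deviation probability quoted later in these notes.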

- Examples of some common P(x)'s and f(x)'s:
  Discrete = P(x): binomial, Poisson
  Continuous = f(x): uniform (i.e. constant), Gaussian, exponential, chi square
- How do we describe a probability distribution?
  - mean, mode, median, and variance
  - for a continuous distribution, these quantities are defined by:
    Mean (average):              μ = ∫_{−∞}^{+∞} x f(x) dx
    Mode (most probable):        ∂f(x)/∂x |_{x=a} = 0
    Median (50% point):          0.5 = ∫_{−∞}^{a} f(x) dx
    Variance (width of distribution):  σ² = ∫_{−∞}^{+∞} f(x) (x − μ)² dx
  - for a discrete distribution, the mean and variance are defined by:
    μ = (1/n) Σ x_i
    σ² = (1/n) Σ (x_i − μ)²
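As an illustration of the discrete case (here written with the probabilities P(x_i) as weights rather than equal-weight data points), the mean and variance of a fair six-sided die can be computed directly:

```python
# Mean and variance of a fair six-sided die from the discrete definitions.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6  # P_i = 1/6 for all faces

mu = sum(p * x for x, p in zip(faces, probs))                # μ = Σ x_i P(x_i)
var = sum(p * (x - mu) ** 2 for x, p in zip(faces, probs))   # σ² = Σ (x_i − μ)² P(x_i)
print(mu, var)   # μ = 3.5, σ² = 35/12 ≈ 2.92
```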

- Some continuous pdfs:
  [Figure: a symmetric distribution (Gaussian), with mode, median, and mean marked]
  For a Gaussian pdf, the mean, mode, and median are all at the same x.
  [Figure: an asymmetric distribution showing the mean, median, and mode]
  For most pdfs, the mean, mode, and median are at different locations.

- Calculation of mean and variance:
  - example: a discrete data set consisting of three numbers: {1, 2, 3}
    - the average (μ) is just:
      μ = Σ x_i / n = (1 + 2 + 3)/3 = 2
    - complication: suppose some measurements are more precise than others.
      → if each measurement x_i has a weight w_i associated with it:
      μ = Σ w_i x_i / Σ w_i    (weighted average)
    - the variance (σ²), or average squared deviation from the mean, is just:
      σ² = (1/n) Σ (x_i − μ)²
      σ is called the standard deviation
      the variance describes the width of the pdf!
      → rewrite the above expression by expanding the summation:
      σ² = (1/n) [Σ x_i² − 2μ Σ x_i + Σ μ²]
         = (1/n) Σ x_i² + μ² − 2μ²
         = (1/n) Σ x_i² − μ²
         = <x²> − <x>²
      note: the n in the denominator would be n − 1 if we determined the average (μ) from the data itself.
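The equivalence σ² = (1/n) Σ (x_i − μ)² = <x²> − <x>² can be verified on the {1, 2, 3} example (a minimal sketch; variable names are mine):

```python
data = [1, 2, 3]
n = len(data)

mu = sum(data) / n                                    # μ = (1/n) Σ x_i = 2
var_direct = sum((x - mu) ** 2 for x in data) / n     # σ² = (1/n) Σ (x_i − μ)²
var_expanded = sum(x * x for x in data) / n - mu**2   # σ² = <x²> − <x>²
print(mu, var_direct, var_expanded)   # both variance forms give 2/3 ≈ 0.67
```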

  - using the definition of μ from above, we have for our example of {1, 2, 3}:
    σ² = (1/n) Σ x_i² − μ² = 4.67 − 2² = 0.67
  - the case where the measurements have different weights is more complicated:
    σ² = Σ w_i (x_i − μ)² / Σ w_i = Σ w_i x_i² / Σ w_i − μ²
    where μ is the weighted mean
    if we calculated μ from the data, σ² gets multiplied by a factor n/(n − 1).
- example: a continuous probability distribution, f(x) = sin²x for 0 ≤ x ≤ 2π
  - has two modes!
  - has the same mean and median, but they differ from the mode(s).
  - f(x) is not properly normalized:
    ∫_0^{2π} sin²x dx = π ≠ 1
    → normalized pdf: f(x) = sin²x / ∫_0^{2π} sin²x dx = (1/π) sin²x
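The weighted mean and weighted variance formulas can be sketched the same way; the weights below are made up purely for illustration:

```python
# Weighted mean and variance; the weights w are hypothetical example values.
x = [1, 2, 3]
w = [1.0, 2.0, 1.0]   # pretend the middle measurement is more precise

mu_w = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)                 # weighted mean
var_w = sum(wi * xi * xi for wi, xi in zip(w, x)) / sum(w) - mu_w**2  # Σw x²/Σw − μ²
print(mu_w, var_w)
```

Weighting the middle point more heavily leaves the mean at 2 (the data are symmetric) but shrinks the variance relative to the unweighted 0.67, since the central value counts for more.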

  - for continuous probability distributions, the mean, mode, and median are calculated using either integrals or derivatives:
    mean:   μ = (1/π) ∫_0^{2π} x sin²x dx = π
    mode:   ∂/∂x (sin²x) = 0  →  x = π/2, 3π/2
    median: (1/π) ∫_0^{α} sin²x dx = 1/2  →  α = π
- example: the Gaussian distribution function, a continuous probability distribution
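The normalization and the median α = π of f(x) = (1/π) sin²x can be checked numerically (trapezoidal integration; helper names are mine):

```python
import math

def f(x):
    """Normalized pdf (1/π) sin²x on [0, 2π]."""
    return math.sin(x) ** 2 / math.pi

def cdf(a, n=200000):
    """Trapezoidal approximation of the integral of f from 0 to a."""
    h = a / n
    total = 0.5 * (f(0) + f(a))
    for i in range(1, n):
        total += f(i * h)
    return total * h

print(cdf(2 * math.pi))   # ≈ 1: properly normalized
print(cdf(math.pi))       # ≈ 0.5: the median is at x = π
```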

Accuracy and Precision:
- Accuracy: the accuracy of an experiment refers to how close the experimental measurement is to the true value of the quantity being measured.
- Precision: this refers to how well the experimental result has been determined, without regard to the true value of the quantity being measured.
  - just because an experiment is precise does not mean it is accurate!!
  - measurements of the neutron lifetime over the years:
    [Figure: neutron lifetime measurements vs. year; the size of each error bar reflects the precision of that experiment]
    - steady increase in precision, but are any of these measurements accurate?

Measurement Errors (Uncertainties)
- Use results from probability and statistics as a way of indicating how good a measurement is.
  - most common quality indicator:
    relative precision = [uncertainty of measurement] / measurement
    - example: we measure a table to be 10 inches with an uncertainty of 1 inch.
      relative precision = 1/10 = 0.1, or 10% (% relative precision)
  - uncertainty in a measurement is usually the square root of the variance:
    σ = standard deviation
    - usually calculated using the technique of propagation of errors.

Statistical and Systematic Errors
- Results from experiments are often presented as: N ± XX ± YY
  - N: value of the quantity measured (or determined) by the experiment.
  - XX: statistical error, usually assumed to be from a Gaussian distribution.
    - with the assumption of Gaussian statistics we can say (calculate) something about how well our experiment agrees with other experiments and/or theories.
    - expect a 68% chance that the true value is between N − XX and N + XX.
  - YY: systematic error. Hard to estimate; the distribution of errors is usually not known.
  - examples:
    mass of proton = 0.9382769 ± 0.0000027 GeV
    mass of W boson = 80.8 ± 1.5 ± 2.4 GeV

- What's the difference between statistical and systematic errors?
  - statistical errors are random in the sense that if we repeat the measurement enough times: XX → 0
  - systematic errors do not → 0 with repetition.
    - examples of sources of systematic errors:
      a voltmeter not calibrated properly
      a ruler not the length we think it is (a meter stick might really be < 1 meter!)
  - because of systematic errors, an experimental result can be precise, but not accurate!
- How do we combine systematic and statistical errors to get one estimate of precision?
  → big problem!
  - two choices:
    σ_tot = XX + YY              (add them linearly)
    σ_tot = (XX² + YY²)^{1/2}    (add them in quadrature)
- Some other ways of quoting experimental results:
  - lower limit: the mass of particle X is > 100 GeV
  - upper limit: the mass of particle X is < 100 GeV
  - asymmetric errors: mass of particle X = 100 +4/−3 GeV
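The two combination choices can be sketched as follows, using the W boson numbers from the text as inputs (the `combine_*` function names are mine):

```python
import math

def combine_linear(stat, syst):
    """Combine statistical and systematic errors linearly: σ_tot = XX + YY."""
    return stat + syst

def combine_quadrature(stat, syst):
    """Combine in quadrature: σ_tot = (XX² + YY²)^(1/2)."""
    return math.hypot(stat, syst)

# W boson mass example from the text: 80.8 ± 1.5 (stat) ± 2.4 (syst) GeV
print(combine_linear(1.5, 2.4))       # ≈ 3.9 GeV
print(combine_quadrature(1.5, 2.4))   # ≈ 2.83 GeV
```

Quadrature gives the smaller (less conservative) total error; it is the appropriate choice when the two error sources are independent.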