Fundamental Probability and Statistics


"There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know." (Donald Rumsfeld)

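Probability Theory
Probability Space: $(\Omega, \mathcal{F}, P)$
Reference: G.R. Grimmett and D.R. Stirzaker, Probability and Random Processes, Oxford Science Publications, 1997
Example: Toss a possibly biased coin once. Take $\Omega = \{H, T\}$, $\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \Omega\}$, $P(\{H\}) = p$, and $P(\{T\}) = 1 - p$.
Note: The coin is fair if p = 1/2.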

Probability Theory
Example: Two coins are tossed, possibly multiple times, and the outcome is an ordered pair. Let Then
Definition: Events A and B are independent if $P(A \cap B) = P(A)P(B)$.
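The independence definition can be checked by brute-force enumeration. The sketch below tosses two coins once each and treats the outcome as an ordered pair. The biases `p1` and `p2` and the events `A`, `B`, and `C` are hypothetical choices made for illustration and do not come from the slides. The probability of each pair is built as a product, so A and B are independent by construction, while A and C are not.

```python
from fractions import Fraction
from itertools import product

# Hypothetical biases for the two coins (not taken from the slides).
p1 = Fraction(1, 3)   # P(first coin shows heads)
p2 = Fraction(3, 4)   # P(second coin shows heads)

# Sample space: ordered pairs such as ("H", "T").
omega = list(product("HT", repeat=2))


def prob_outcome(outcome):
    """Probability of one ordered pair, assuming the coins do not influence each other."""
    first, second = outcome
    pf = p1 if first == "H" else 1 - p1
    ps = p2 if second == "H" else 1 - p2
    return pf * ps


def prob(event):
    """Probability of an event, given as a set of outcomes."""
    return sum((prob_outcome(w) for w in event), Fraction(0))


A = {w for w in omega if w[0] == "H"}   # first coin shows heads
B = {w for w in omega if w[1] == "H"}   # second coin shows heads
C = {w for w in omega if "H" in w}      # at least one head

# Independence means P(A and B) equals P(A) * P(B).
print("A, B independent:", prob(A & B) == prob(A) * prob(B))
print("A, C independent:", prob(A & C) == prob(A) * prob(C))
```

Exact fractions are used so the equality test is not affected by floating-point rounding.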

Random Variables and Distributions Definition: Definition: Definition: Example:

Distributions and Densities Definition: Definition: Definition: PDF Properties:

Density Properties Example: Example:

Density Properties Additional Properties:

Multivariate Distributions Note: Important for longitudinal data. Joint CDF: Joint Density (if it exists): Example:

Multivariate Distributions Example: Note: Note:

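Multivariate Distributions
Definition:
Definition: Marginal density function of X: $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy$
Definition: X and Y are independent if and only if $F(x, y) = F_X(x) F_Y(y)$ or $f(x, y) = f_X(x) f_Y(y)$
Note: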

Estimators and Estimates Definition: An estimator is a function or procedure for deriving an estimate from observed data. An estimator is a random variable whereas an estimate is a real number. Example:
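The distinction between an estimator and an estimate can be illustrated with the sample mean. In the sketch below the function `sample_mean` plays the role of the estimator, and each number it returns for a particular sample is an estimate. The true mean, standard deviation, sample size, and seed are hypothetical values chosen only for this illustration.

```python
import random
import statistics


def sample_mean(data):
    """Estimator: a rule that maps any observed sample to a number."""
    return sum(data) / len(data)


rng = random.Random(0)          # fixed seed so the run is repeatable
mu, sigma, n = 10.0, 2.0, 25    # hypothetical true mean, std dev, sample size

# Each new sample yields a different estimate, so the estimator itself
# behaves like a random variable with its own mean and spread.
estimates = []
for _ in range(2000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    estimates.append(sample_mean(sample))

print("one estimate:", estimates[0])
print("mean of the estimates:", statistics.mean(estimates))
print("std dev of the estimates:", statistics.stdev(estimates))
print("theoretical std dev sigma/sqrt(n):", sigma / n ** 0.5)
```

The spread of the 2000 estimates should be close to sigma divided by the square root of n, which is the standard deviation of the sample-mean estimator under these assumptions.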

Other Estimators
Commonly Employed Estimators:
-- Maximum likelihood
-- Bayes estimators
-- Particle filter (Sequential Monte Carlo (SMC))
-- Markov chain Monte Carlo (MCMC)
-- Kalman filter
-- Wiener filter

Consider Linear Regression Example:

Linear Regression Statistical Model: Assumptions: Goals:

Minimize Least Squares Problem Note: General result for quadratic forms Thus where Least Squares Estimate: Least Squares Estimator: Note:

Parameter Estimator Properties Estimator Mean: Estimator Covariance:

Variance Estimator Properties Goal: Residual: Variance Estimator: Note:

Variance Estimator Properties Note:

Variance Estimator Properties Note: Unbiased Estimator: Unbiased Estimate:

Parameter Estimator Properties Properties of B: Central Limit Theorem:

Example: Consider the height-weight data from the 1975 World Almanac and Book of Facts.
Height (in)    58   59   60   61   62   63   64   65   66   67   68   69   70   71   72
Weight (lbs)  115  117  120  123  126  129  132  135  139  142  146  150  154  159  164
Consider the model

Example Here Least Squares Estimate: Note: Note:
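A least squares estimate for the height-weight data can be computed in a few lines. The model used on the slide did not survive extraction, so the sketch below assumes a straight-line model, weight = intercept + slope * height + error. That choice, and the helper name `fit_line`, are assumptions made for illustration and may differ from the model in the original lecture.

```python
# Height-weight data as listed in the example.
heights = list(range(58, 73))
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]


def fit_line(x, y):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(heights, weights)
print(f"intercept = {intercept:.4f} lbs")
print(f"slope     = {slope:.4f} lbs per inch")
```

These closed-form expressions are what the normal equations reduce to when the design matrix has a column of ones and a column of heights.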

Example Variance Estimate: Parameter Covariance Estimate: Note: This yields variances and standard deviations for parameter estimates Goal: Can we additionally compute confidence intervals? Yes, but we need a little more statistics.
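The variance estimate and the parameter covariance estimate can be sketched for the same data. As before, a straight-line model is assumed here because the slide's model was lost in extraction. The variance estimate divides the residual sum of squares by n minus the number of parameters, and the covariance estimate multiplies it by the inverse of X^T X, where X has a column of ones and a column of heights.

```python
import math

heights = list(range(58, 73))
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]
n, n_params = len(heights), 2

# Least squares fit of the assumed model: weight = intercept + slope * height.
x_bar = sum(heights) / n
y_bar = sum(weights) / n
sxx = sum((x - x_bar) ** 2 for x in heights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Unbiased variance estimate: residual sum of squares over (n - p).
residuals = [y - (intercept + slope * x) for x, y in zip(heights, weights)]
sigma2 = sum(r * r for r in residuals) / (n - n_params)

# X^T X for the design matrix [1, height], inverted by the 2x2 formula.
s1 = sum(heights)
s2 = sum(x * x for x in heights)
det = n * s2 - s1 * s1
xtx_inv = [[s2 / det, -s1 / det],
           [-s1 / det, n / det]]

# Parameter covariance estimate: sigma2 * (X^T X)^{-1}.
cov = [[sigma2 * v for v in row] for row in xtx_inv]
se_intercept = math.sqrt(cov[0][0])
se_slope = math.sqrt(cov[1][1])

print(f"variance estimate      = {sigma2:.4f}")
print(f"std error of intercept = {se_intercept:.4f}")
print(f"std error of slope     = {se_slope:.4f}")
```

The diagonal of `cov` holds the variances of the parameter estimates, and their square roots are the standard deviations mentioned in the note.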

Example Hypothesis: One way to check the hypothesis of iid errors is to plot the residuals.
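A residual check can be sketched without a plotting library by printing the residuals as a rough text plot. The straight-line model below is an assumption, since the slide's own model was lost in extraction, and the bar scaling is arbitrary.

```python
heights = list(range(58, 73))
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]
n = len(heights)

# Least squares fit of the assumed model: weight = intercept + slope * height.
x_bar = sum(heights) / n
y_bar = sum(weights) / n
sxx = sum((x - x_bar) ** 2 for x in heights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed weight minus fitted weight.
residuals = [y - (intercept + slope * x) for x, y in zip(heights, weights)]

# Text plot: one row per observation, bar length proportional to |residual|.
for x, r in zip(heights, residuals):
    sign = "+" if r >= 0 else "-"
    bar = "#" * round(abs(r) * 4)
    print(f"{x:3d} in  {r:+6.2f}  {sign}{bar}")
```

If the errors were iid, the signs would be expected to alternate without an obvious pattern. For this data and the assumed straight line, the residuals are positive at both ends of the height range and negative in the middle, which suggests that a straight line may be missing some curvature.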

Random Variables Related to the Normal Chi-Square Random Variables: T Random Variables:

Variance Estimator Properties Assumption:

Variance Estimator Properties

Variance Estimator Properties Confidence Interval:

Example Previous Example: Note:
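A confidence interval for a regression parameter can be sketched for the height-weight data. The sketch assumes a straight-line model with iid normal errors, so that the standardized slope estimate follows a t distribution with n - 2 degrees of freedom. The 95% level, the model, and the helper names `t_pdf`, `t_cdf`, and `t_quantile` are assumptions for illustration. The t quantile is found numerically with Simpson's rule and bisection so that only the standard library is needed.

```python
import math

heights = list(range(58, 73))
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]
n = len(heights)
dof = n - 2  # two fitted parameters

# Least squares fit and variance estimate for the assumed straight-line model.
x_bar = sum(heights) / n
y_bar = sum(weights) / n
sxx = sum((x - x_bar) ** 2 for x in heights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(heights, weights))
sigma2 = rss / dof
se_slope = math.sqrt(sigma2 / sxx)


def t_pdf(t, nu):
    """Density of the t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)


def t_cdf(t, nu, steps=2000):
    """CDF for t >= 0 using composite Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def t_quantile(prob, nu):
    """Quantile for prob in (0.5, 1), found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_crit = t_quantile(0.975, dof)   # two-sided 95% interval
ci_low = slope - t_crit * se_slope
ci_high = slope + t_crit * se_slope
print(f"t critical value ({dof} dof) = {t_crit:.4f}")
print(f"95% CI for slope = ({ci_low:.4f}, {ci_high:.4f})")
```

A statistics library would normally supply the t quantile directly. The numerical version is used here only to keep the example self-contained.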

Summary of Linear Theory Statistical Model: Assumptions: Least Squares Estimator and Estimate: Variance Estimator and Estimate: Covariance Estimator and Estimate:

Summary of Linear Theory Statistical Properties:

Hypothesis Testing
Statistical Testing: An objective of statistics is to make inferences about unknown population parameters and models based on information in sample data. Inferences may be estimates of parameters or tests of hypotheses regarding their values.
Hypothesis Testing: Largely originated with Ronald Fisher, Jerzy Neyman, Karl Pearson, and Egon Pearson.
Fisher: Agricultural statistician who emphasized rigorous experiments and designs.
Neyman: Emphasized mathematical rigor.
Early Paper: R. Fisher, "Mathematics of a Lady Tasting Tea," 1956
-- Question: Could the lady determine the means of tea preparation based on taste?
-- Null Hypothesis: The lady had no such ability.
-- Fisher asserted that no alternative hypothesis was required.

Hypothesis Testing Elements of Test: Strategy:

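Hypothesis Testing
Elements of Test:
Definitions:
Test Statistic: Function of the sample measurements upon which the decision is made.
Rejection Region: Values of the test statistic for which the null hypothesis is rejected.
Definitions:
Type I Error: Rejecting the null hypothesis when it is true.
Type II Error: Failing to reject the null hypothesis when the alternative is true.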

Hypothesis Testing Example: Adam is running for student body president and thinks he will gain more than 50% of the votes and hence win. His committee is very pragmatic and wants to test the hypothesis that he will receive less than 50% of the vote. Here we take

Hypothesis Testing Example: Does this test equally protect us from erroneously concluding that Adam is the winner when, in fact, he will lose? Suppose that he will really win 30% of the vote, so that p = 0.3. What is the probability of a Type II error? Note: The test using this rejection region protects Adam from Type I errors but not from Type II errors.
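The two error probabilities can be computed directly from the binomial distribution. The sample size and rejection region used on the slides did not survive extraction, so the values `n = 15` and "reject when Y <= 2" below are hypothetical, as is the setup H0: p = 0.5 against Ha: p < 0.5 with Y counting the sampled voters who favor Adam. Only the alternative value p = 0.3 comes from the example text.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(k + 1))


n = 15   # hypothetical number of voters sampled
k = 2    # hypothetical rejection region: reject H0 when Y <= k

# Type I error: H0 (p = 0.5) is true but Y falls in the rejection region.
alpha = binom_cdf(k, n, 0.5)

# Type II error: p = 0.3 is true but Y falls outside the rejection region,
# so we fail to conclude that Adam will lose.
beta = 1 - binom_cdf(k, n, 0.3)

print(f"alpha = P(Y <= {k} | p = 0.5) = {alpha:.4f}")
print(f"beta  = P(Y >  {k} | p = 0.3) = {beta:.4f}")
```

Under these assumed values alpha is very small while beta is large, which is the imbalance the note describes: strong protection against a Type I error and little protection against a Type II error.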

Hypothesis Testing One Solution: Use a larger critical or rejection region. Conclusion: This provides a better balance between Type I and Type II errors. Question: How can we reduce both errors?
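The effect of enlarging the rejection region can be tabulated. As a sketch under the same hypothetical setup (15 sampled voters, H0: p = 0.5, Ha: p < 0.5, reject when Y <= k, with the true value p = 0.3 for the Type II calculation), the loop below shows how alpha and beta move as k grows. The sample size and the range of k are assumptions for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(k + 1))


n = 15            # hypothetical number of voters sampled
p_null = 0.5      # value of p under H0
p_true = 0.3      # value of p used for the Type II error calculation

# For each rejection region {Y <= k}, store (alpha, beta).
errors = {}
for k in range(2, 8):
    alpha = binom_cdf(k, n, p_null)       # reject H0 although p = 0.5
    beta = 1 - binom_cdf(k, n, p_true)    # fail to reject although p = 0.3
    errors[k] = (alpha, beta)

print(" k    alpha     beta")
for k, (alpha, beta) in errors.items():
    print(f"{k:2d}   {alpha:.4f}   {beta:.4f}")
```

With the sample size held fixed, a larger rejection region raises alpha and lowers beta, so the choice of k trades one error against the other.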