Mean = (sum of the values) / (number of values), if probabilities are equal




Population Mean
Mean = (sum of the values) / (number of values), if probabilities are equal.

To compute the population or sample mean:
1. Collect the data.
2. Sum all the values in the population/sample.
3. Divide the sum by the number of elements in the population/sample.

Median
The median is the center value that divides a sorted list of data into two halves.

Data Array
Data that have been arranged in numerical order.
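The mean and median steps above can be sketched in a few lines of Python. The data values are hypothetical and chosen only for illustration; `statistics.mean` and `statistics.median` are Python standard-library functions, not part of the original sheet.

```python
import statistics

# Hypothetical data values, chosen only for illustration.
values = [7, 3, 9, 5, 1, 5]

# Mean: sum of the values divided by the number of values.
mean_by_hand = sum(values) / len(values)

# Data array: the same values arranged in numerical order.
data_array = sorted(values)

# Median: the center of the data array. With an even count,
# statistics.median averages the two middle values.
median = statistics.median(values)

print("mean:", mean_by_hand)
print("data array:", data_array)
print("median:", median)
```

Computing the mean by hand and comparing it with `statistics.mean(values)` is a quick way to check the arithmetic.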

Mode
The value in a data set that occurs most frequently.

Percentile
Location of the pth percentile: i = (p/100)(n + 1)
p = desired percentile
n = number of values in the data set
The pth percentile in a data array is a value that divides the data set into two parts. The lower segment contains at least p%, and the upper segment contains at least (100 - p)%, of the data. The 50th percentile is the median.

Box and Whisker Plots
1. Sort the data values from low to high.
2. Find the 25th percentile (first quartile, Q1), the 50th percentile (median), and the 75th percentile (third quartile, Q3).
3. Draw a box with its ends at Q1 and Q3. This box will contain the middle 50% of the data values in the population or sample.
4. Draw a vertical line through the box at the median. Half the data values in the box will be on either side of the median.
5. Calculate the interquartile range (IQR = Q3 - Q1). Compute the lower limit for the box and whisker plot as Q1 - 1.5(Q3 - Q1) and the upper limit as Q3 + 1.5(Q3 - Q1). Any data values outside these limits are referred to as outliers.
6. Extend dashed lines (called the whiskers) from each end of the box to the lowest (on the left) and highest (on the right) values within the limits.
7. Mark any value outside the limits (an outlier found in step 5) with an asterisk (*).

Range
R = maximum value - minimum value
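The percentile location formula and the box and whisker steps can be sketched as follows. The data set is hypothetical, and the helper name `percentile` is introduced here for illustration. The sheet does not say what to do when the location i is not a whole number, so linear interpolation between the neighbouring values is an assumption of this sketch.

```python
import statistics

def percentile(data_array, p):
    """Value at location i = (p/100)(n + 1) in a sorted data array.

    Assumption: when i is not a whole number, interpolate linearly
    between the two neighbouring values.
    """
    n = len(data_array)
    i = (p / 100) * (n + 1)      # 1-based location in the data array
    k = int(i)                   # whole part of the location
    if k < 1:
        return data_array[0]
    if k >= n:
        return data_array[-1]
    below, above = data_array[k - 1], data_array[k]
    return below + (i - k) * (above - below)

# Step 1: sort the (hypothetical) data values from low to high.
data = sorted([9, 2, 14, 5, 40, 7, 9, 4, 12, 8, 13])

# Step 2: quartiles and median.
q1 = percentile(data, 25)
med = percentile(data, 50)
q3 = percentile(data, 75)

# Step 5: interquartile range and outlier limits.
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_limit or x > upper_limit]

# Step 6: whiskers end at the lowest and highest values within the limits.
inside = [x for x in data if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)

# Mode and range of the same data.
mode = statistics.mode(data)
data_range = max(data) - min(data)

print(q1, med, q3, iqr, lower_limit, upper_limit, outliers)
```

With these 11 values the locations for the 25th, 50th, and 75th percentiles are the whole numbers 3, 6, and 9, so no interpolation is needed.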

Interquartile Range
IQR = Q3 - Q1

Variance
The population variance is the average of the squared distances of the data values from the mean (the residuals). The sample variance is computed the same way, except that the sum of squared distances is divided by n - 1 instead of n.

Standard Deviation
The positive square root of the variance.

Coefficient of Variation
CV = (SD / mean) × 100
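A short Python sketch of the variance, standard deviation, and coefficient of variation definitions follows. The data are hypothetical; `statistics.pvariance`, `statistics.variance`, and `statistics.pstdev` are standard-library functions used here only to cross-check the hand computation.

```python
import statistics
from math import sqrt

# Hypothetical data values, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Squared distances of the data values from the mean.
squared_distances = [(x - mean) ** 2 for x in data]

# Population variance divides by n; sample variance divides by n - 1.
population_variance = sum(squared_distances) / n
sample_variance = sum(squared_distances) / (n - 1)

# Standard deviation: positive square root of the variance.
population_sd = sqrt(population_variance)

# Coefficient of variation, expressed as a percentage.
cv = (population_sd / mean) * 100

print(population_variance, sample_variance, population_sd, cv)
```

The hand-computed values should agree with `statistics.pvariance(data)`, `statistics.variance(data)`, and `statistics.pstdev(data)`.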

Standardized Data Values (Z scores)
1. Compute the population mean and SD, or the sample mean and SD.
2. Use these formulas:
   Z = (x - mean) / SD
   For samples: Z = (x - sample mean) / sample SD

Using Tree Diagram

Independent Events
Two events are independent if the occurrence of one event in no way influences the probability of the occurrence of the other event.

Probability Rule
P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2)
For two mutually exclusive events: P(E1 or E2) = P(E1) + P(E2)

Conditional Probability
P(E1 | E2) = P(E1 and E2) / P(E2)
This reads "the probability of event E1 given that event E2 has occurred." The sample space is E2, and you find the elements in E1 that are also in E2.
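The addition rule and the conditional probability formula can be checked by counting outcomes in a small sample space. The sketch below assumes one roll of a fair die. The events E1 (even number) and E2 (greater than 3) and the helper `prob` are hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
e1 = {2, 4, 6}      # E1: the roll is even
e2 = {4, 5, 6}      # E2: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

# Addition rule: P(E1 or E2) = P(E1) + P(E2) - P(E1 and E2)
p_or = prob(e1) + prob(e2) - prob(e1 & e2)

# Conditional probability: P(E1 | E2) = P(E1 and E2) / P(E2)
p_e1_given_e2 = prob(e1 & e2) / prob(e2)

# Same answer by treating E2 as the sample space and counting
# the elements of E1 that are also in E2.
p_by_counting = Fraction(len(e1 & e2), len(e2))

# Independent events satisfy P(E1 and E2) = P(E1) * P(E2).
independent = prob(e1 & e2) == prob(e1) * prob(e2)

print(p_or, p_e1_given_e2, p_by_counting, independent)
```

Here P(E1 | E2) differs from P(E1), so these two events are not independent.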

Conditional Probability for Independent Events
P(E1 | E2) = P(E1) and P(E2 | E1) = P(E2)

Binomial
Use R's pbinom to find the probability of a value less than or equal to q, where size = number of trials and p = probability of a success at each trial:
pbinom(q, number_of_trials, probability_of_success)
For the probability of exactly x successes, use:
dbinom(x, number_of_trials, probability_of_success)
Expected value for the binomial = number_of_trials × probability_of_success

Poisson
The number of successes when number_of_trials is very large and the probability of a success is very small.
λ = number_of_trials × prob_of_success = expected number of successes
Use R's dpois(x, lambda) = probability of x if the expected value = lambda

Normal Distribution
Use R's pnorm.
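For readers without R at hand, the quantities behind dbinom, pbinom, dpois, and pnorm can be sketched in Python. The functions below are illustrative re-implementations that borrow the R names; they are not R's own code, and the trial counts and probabilities are hypothetical. `pbinom` here assumes q is a whole number.

```python
from math import comb, exp, factorial
from statistics import NormalDist

def dbinom(x, size, prob):
    """P(X = x) for a binomial with `size` trials and success probability `prob`."""
    return comb(size, x) * prob**x * (1 - prob) ** (size - x)

def pbinom(q, size, prob):
    """P(X <= q) for the same binomial (q assumed to be a whole number)."""
    return sum(dbinom(x, size, prob) for x in range(0, q + 1))

def dpois(x, lam):
    """P(X = x) for a Poisson with expected value `lam`."""
    return exp(-lam) * lam**x / factorial(x)

def pnorm(q, mean=0.0, sd=1.0):
    """P(X <= q) for a normal distribution with the given mean and SD."""
    return NormalDist(mean, sd).cdf(q)

# Hypothetical binomial: 20 trials, success probability 0.3.
n_trials, p_success = 20, 0.3
expected_value = n_trials * p_success
print("P(X = 6):", dbinom(6, n_trials, p_success))
print("P(X <= 6):", pbinom(6, n_trials, p_success))

# Poisson as an approximation: many trials, small success probability.
lam = 1000 * 0.002
print("binomial:", dbinom(3, 1000, 0.002), "poisson:", dpois(3, lam))

print("P(Z <= 1.96):", pnorm(1.96))
```

The binomial and Poisson probabilities printed for the large-n case should be close to each other, which is the sense in which the Poisson describes rare successes over many trials.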

Sample Error
Sample error = sample mean - population mean
Standard error = SD of the sample error = population SD / square root of n
This is the SD of the sampling distribution.

To find probabilities associated with a sampling distribution of xbar for samples of size n from a population with a given mean and SD (if the population is normal or if n is large):
1. Compute the sample mean.
2. Define the sampling distribution:
   Mean of the sample mean = population mean
   SD of the sample mean = population SD / square root of n
3. Define the event of interest.
4. Express it in terms of a Z value, Z = (sample mean - population mean) / (SD of the sample mean), and use pnorm to get the probability.

Sample Proportion
1. Find p (the true probability).
2. Find pbar.
3. Find the SD of pbar:
   If we have p: sqrt(p(1 - p) / n) [Hypothesis testing] (6.10)
   If only pbar: sqrt(pbar(1 - pbar) / n) [Confidence intervals]
4. Define the event of interest.
5. Find the Z value.
6. Use pnorm.
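The sampling-distribution steps for a mean and for a proportion can be sketched as below. All numbers are hypothetical, and `NormalDist().cdf` from the Python standard library stands in for R's pnorm.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, SD 1; cdf() plays the role of pnorm

# --- Sample mean (hypothetical population: mean 100, SD 15; n = 36) ---
pop_mean, pop_sd, n = 100, 15, 36
sample_mean = 105                          # step 1: the observed sample mean
se_mean = pop_sd / sqrt(n)                 # step 2: SD of the sample mean
z_mean = (sample_mean - pop_mean) / se_mean    # step 4: Z value
# Step 3 event of interest: the sample mean is at most 105.
prob_mean = standard_normal.cdf(z_mean)

# --- Sample proportion (hypothetical: true p = 0.5, n = 100) ---
p, n_prop = 0.5, 100
pbar = 0.6
sd_pbar = sqrt(p * (1 - p) / n_prop)       # p is known, so use p
z_prop = (pbar - p) / sd_pbar
# Event of interest: the sample proportion is at most 0.6.
prob_prop = standard_normal.cdf(z_prop)

print(se_mean, z_mean, prob_mean)
print(sd_pbar, z_prop, prob_prop)
```

For an event such as "the sample mean exceeds 105", the probability is 1 minus the value returned by `cdf`.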

Confidence Interval Calculation
Point estimate +/- (critical value (Z or T)) × (standard error of the estimate)

Developing a confidence interval estimate for a population proportion:
1. Define the population of interest and the variable from which to estimate the population proportion.
2. Determine the sample size and select a simple random sample.
3. Specify the level of confidence and obtain the critical value from qnorm or qt (in R).
4. Calculate pbar, the sample proportion.
5. Construct the interval estimate.
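A minimal sketch of a confidence interval for a population proportion follows. The sample counts are hypothetical, and `NormalDist().inv_cdf` from the Python standard library stands in for R's qnorm. Because only pbar is available, the standard error uses sqrt(pbar(1 - pbar) / n).

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample: 80 successes in a simple random sample of 200.
successes, n = 80, 200
confidence = 0.95

# Critical value: the Z value leaving (1 - confidence)/2 in the upper tail.
alpha = 1 - confidence
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Point estimate and its standard error.
pbar = successes / n
standard_error = sqrt(pbar * (1 - pbar) / n)

# Interval: point estimate +/- critical value x standard error.
margin = z_critical * standard_error
lower, upper = pbar - margin, pbar + margin

print(f"pbar = {pbar:.3f}, z = {z_critical:.3f}")
print(f"{confidence:.0%} interval: ({lower:.4f}, {upper:.4f})")
```

For a mean estimated from a small sample with unknown SD, the critical value would come from the t distribution instead, which this sketch does not cover.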

One-tailed test for a hypothesis about a population mean, SD known, large samples
1. Specify the population value of interest.
2. Formulate the null hypothesis and the alternative hypothesis in terms of the population mean.
3. Specify the desired significance level.
4. Construct the rejection region.
5. Compute the test statistic.
6. Draw the conclusion.

T. Lau 2007
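The six steps of the one-tailed test can be sketched in Python. This is an upper-tail example with hypothetical numbers (null hypothesis: population mean <= 100; alternative: population mean > 100), and `NormalDist().inv_cdf` stands in for R's qnorm.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: hypothetical hypotheses about the population mean.
#   H0: mean <= 100      Ha: mean > 100   (upper-tail test)
hypothesized_mean = 100
pop_sd = 15            # population SD is assumed known
n = 36                 # sample size
sample_mean = 104      # observed sample mean

# Step 3: significance level.
alpha = 0.05

# Step 4: rejection region, reject H0 when Z exceeds the critical value.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Step 5: test statistic.
z_stat = (sample_mean - hypothesized_mean) / (pop_sd / sqrt(n))

# Step 6: conclusion.
reject_null = z_stat > z_critical

print(f"z = {z_stat:.3f}, critical value = {z_critical:.3f}")
print("reject H0" if reject_null else "do not reject H0")
```

For a lower-tail test the rejection region flips: the critical value becomes `NormalDist().inv_cdf(alpha)` and H0 is rejected when Z falls below it.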