GEOS 36501/EVOL January 2010 Page 1 of 23
III. Sampling

1 Overview of Sampling, Error, Bias
1.1 Biased vs. random sampling
1.2 Biased vs. unbiased statistic (or estimator)
1.3 Precision vs. accuracy

2 Error Estimates With Assumed Sampling Distribution
2.1 Standard Error: Standard deviation of distribution of sample statistics that would result from infinite number of trials of drawing sample from underlying probability distribution and calculating the sample statistic.
2.2 In practice we generally do not estimate error by repeated sampling from the underlying distribution (expensive and time-consuming), although there are exceptions.
2.3 Approximations based on sample distribution (from Sokal and Rohlf):
2.4 Limitations:
  - Many approximation formulae make assumptions about shape of distribution and sample size.
  - We may be interested in a novel statistic or one whose sampling distribution is not well characterized.

3 Bootstrap Error Estimates
3.1 Estimate standard error by resampling from the single sample we have.
3.2 This approach uses sampling with replacement from observed sample to simulate sampling without replacement from the underlying distribution.
3.3 Procedure
  - Start with observed sample of size n and observed sample statistic, call it Z.
  - Randomly pick a sample of size n, with replacement, from the observed sample.
  - Calculate the sample statistic of interest on this random sample; call it $Z_{boot}$.
  - Repeat many times (generally hundreds to thousands, ideally until estimate of SE stabilizes).
  - Calculate standard deviation of the $Z_{boot}$ values. This is an estimate of the standard error of the observed sample statistic Z: $\mathrm{SD}(Z_{boot}) \approx \mathrm{SE}(Z)$.
3.4 Simple (but not necessarily most useful) example: trimmed mean
  - Define p-% trimmed mean as mean of sample with p% lowest and p% highest observations discarded. (Idea is to try to reduce effect of outliers.)
  - Suppose data consist of 10 (ordered) observations: 1, 2, 3, 4, 8, 10, 12, 15, 20, 30. Let the 20% trimmed mean (2 lowest and 2 highest observations discarded) be denoted Z. Then Z = (3 + 4 + 8 + 10 + 12 + 15)/6 = 8.67.
R code to estimate SE(Z):

    # define function: mean after discarding the ntrim lowest and ntrim highest values
    trim.mean <- function(x, ntrim) {
      xtmp <- sort(x)
      n <- length(x)                               # use length of x rather than a global n
      return(mean(xtmp[(ntrim + 1):(n - ntrim)]))
    }
    data <- c(1, 2, 3, 4, 8, 10, 12, 15, 20, 30)   # specify data
    n <- length(data)
    ntrim <- 2                                     # specify number to trim from each side
    Zobs <- trim.mean(data, ntrim)                 # get observed value
    nrep <- 1000                                   # specify number of bootstrap replicates
                                                   # (original value lost in transcription; 1000 is an assumed placeholder)
    Zboot <- rep(NA, nrep)                         # assign memory
    for (i in 1:nrep)                              # get bootstrap replicates
      Zboot[i] <- trim.mean(sample(data, n, replace = TRUE), ntrim)
    SE <- sd(Zboot)                                # calculate bootstrap std. error
    hist(Zboot, breaks = 50)                       # plot histogram of results

This yields Zobs = 8.67 and SE(Z) ≈ 3.1.

[Figure: histogram of Zboot (x-axis: Zboot; y-axis: Frequency)]
3.5 Useful R function: sample(x, n, replace=TRUE [or FALSE]) returns a random sample of size n from the vector x with or without replacement.
3.6 To sample from array X so that the variables (columns) stay together:

    nr <- dim(X)[1]                          # get number of rows
    i <- sample(1:nr, n, replace = TRUE)     # or replace = FALSE; returns vector of n integers sampled on [1, nr]
    XSAMP <- X[i, ]                          # sampled rows, with all columns kept together

4 Parametric bootstrap
4.1 Take observed sample and estimate relevant parameter from it.
4.2 Resample from parametric distribution with parameter equal to sample estimate (rather than resampling from observed distribution).
4.3 This approach can also be applied to more complicated situations: for example, simulating a process with parameters estimated from data. We'll do lots of this later...
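As a hedged illustration of 4.1 and 4.2, the sketch below assumes the data are modeled as exponential. That model is an assumption made here purely for illustration; the notes do not specify a distribution. It estimates the rate from the sample, then resamples from the fitted distribution to estimate the standard error of the sample median. The names `x_obs`, `rate_hat`, and `param_boot_se` are hypothetical.

```r
# Parametric bootstrap sketch (the exponential model is an illustrative assumption)
set.seed(1)                                    # for reproducibility
x_obs <- c(1, 2, 3, 4, 8, 10, 12, 15, 20, 30)  # data from the trimmed-mean example
n <- length(x_obs)

# 4.1: estimate the relevant parameter from the observed sample
rate_hat <- 1 / mean(x_obs)                    # maximum-likelihood rate of an exponential

# 4.2: resample from the fitted parametric distribution, not from the observed sample
param_boot_se <- function(nrep, n, rate, stat = median) {
  z_boot <- replicate(nrep, stat(rexp(n, rate = rate)))
  sd(z_boot)                                   # bootstrap standard error of the statistic
}

se_median <- param_boot_se(nrep = 2000, n = n, rate = rate_hat)
```

The only change from the ordinary bootstrap is the source of the replicate samples: `rexp()` with the estimated rate takes the place of `sample(data, n, replace=TRUE)`.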
5 Examples of Finite-sample Bias (sample-size bias)
5.1 Sample variance
  - $\sum (x - \bar{x})^2 / n$ is biased. This is systematically too low, which makes sense since it is based on squared deviations from sample mean.
  - $\sum (x - \bar{x})^2 / (n - 1)$ is unbiased.
5.2 Number of taxa
  Rarefaction method (from Raup 1975)
  - Abundance of species i is $N_i$; $N = \sum N_i$.
  - Consider a particular species, i.
  - $\binom{N - N_i}{n}$ is the number of ways of drawing a sample of n consisting only of non-i individuals.
  - $\binom{N}{n}$ is the number of ways of drawing a sample of n from all individuals.
  - Therefore, the ratio of these two is the probability of not drawing any individuals of species i.
  - Therefore 1 minus this ratio is the probability of drawing at least one individual of species i.
  - So the expected number of species is just the sum of this probability, calculated for each species in turn: $E(S) = \sum_i \left[ 1 - \binom{N - N_i}{n} / \binom{N}{n} \right]$.
  Caveats
  - Rarefaction for interpolation rather than extrapolation
  - Collecting curves vs. rarefaction curves
  - Apparent leveling off of curves does not imply that nearly everything has been found (only that you're unlikely to find it with modest effort).
  - Curves affected by factors other than sample size (sampling method, taxonomic treatment, size of geographic area etc.).
  - Crossing of rarefaction curves can make interpretation difficult.
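The rarefaction expectation described under 5.2 can be sketched in R as follows. This is a minimal illustration, not Raup's original implementation: the function name `rarefy_expected` and the abundance vector `abund` are hypothetical, and the binomial coefficients are computed on the log scale with `lchoose()` to avoid overflow for large N.

```r
# Expected number of species in a random subsample of n individuals.
# abund: vector of abundances N_i (hypothetical data below); n: subsample size.
rarefy_expected <- function(abund, n) {
  N <- sum(abund)                        # total number of individuals
  stopifnot(n >= 0, n <= N)
  # log of choose(N - N_i, n) / choose(N, n): probability of missing species i
  log_p_miss <- lchoose(N - abund, n) - lchoose(N, n)
  sum(1 - exp(log_p_miss))               # sum of probabilities of drawing >= 1 individual
}

abund <- c(50, 30, 10, 5, 3, 1, 1)       # hypothetical abundances, N = 100
curve_n <- c(1, 10, 50, 100)
curve_S <- sapply(curve_n, function(n) rarefy_expected(abund, n))
```

Two sanity checks follow from the formula itself: a subsample of one individual always contains exactly one species, and a subsample of all N individuals contains every species.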
  Examples of application of taxonomic rarefaction (Raup 1975; Raup and Schopf 1978)
  - This example suggests that the increase in observed family diversity in post-Paleozoic echinoids cannot be accounted for by an increase in the number of species sampled.
  - This example suggests that much of the variation in the number of observed echinoid orders is consistent with differences in number of sampled species.
  - Interpretation of taxonomic rarefaction curves not entirely straightforward. Sampling standardization to be treated in more detail later.
5.3 Range
  - Example: Range of samples from normal distribution
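The sample-size dependence of the range can be illustrated with a short simulation. This is a sketch under the assumption of a standard normal distribution; the function name `mean_range`, the sample sizes, and the number of replicates are arbitrary choices made here.

```r
# Mean range (max - min) of n draws from a standard normal, estimated by simulation
set.seed(1)                                  # for reproducibility
mean_range <- function(n, nrep = 2000) {
  mean(replicate(nrep, diff(range(rnorm(n)))))
}

sizes <- c(2, 5, 10, 50, 100)                # sample sizes to compare
ranges <- sapply(sizes, mean_range)          # average observed range at each size
```

The average range keeps increasing with n even though the underlying distribution is the same, which is why ranges from samples of different sizes are not directly comparable.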
  - Example: Test for nonrandomness of sampling with respect to morphology (Foote 1997, Paleobiology 23:181)
  - Correction in general case via rarefaction (random subsampling at controlled sample-size) (Foote 1992, Paleobiology 18:1)
  - Caveat: Range at standardized sample size may not convey any information that isn't conveyed by sample variance.
6 Extreme value statistics
6.1 Introduction to problem
  - Previous look at standard errors considered sampling distribution of quantities such as mean.
  - We may also be interested in distribution of extremes: For example, how is the largest of n observations distributed, or the second smallest, etc.?
  - Applications: earthquakes, floods, etc.; evolutionary constraints
6.2 Probability of number of observations exceeding some value, if distribution known
  - $\Pr(X > x) = 1 - F(x)$, where $F(x)$ is the cumulative distribution.
  - If there are N observations, then the probability that exactly k of them exceed some value x is given by a simple binomial: $\binom{N}{k} [1 - F(x)]^k F(x)^{N-k}$
  - Example: normal with N = 10, x = 0.67, and k = 3: F(0.67) = 0.75, so the probability = $\binom{10}{3} (0.25)^3 (0.75)^7 \approx 0.250$.
  Future observations
  - Suppose we have $n_1$ past observations ranked from m = 1 (largest) to m = $n_1$ (smallest), and we take $n_2$ future observations. What is the probability that exactly k of $n_2$ observations will exceed the m-th value from the first set of $n_1$ observations?
  - Simply find F(x) corresponding to the m-th value and plug into previous binomial equation.
  - Clearly this works only if we know the distribution.
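The binomial example in 6.2 (standard normal, N = 10, x = 0.67, k = 3) can be checked numerically. This is a minimal sketch using base R; the variable names are chosen here for illustration. Because `pnorm(0.67)` is only approximately 0.75, the result differs slightly from the value obtained with F(x) = 0.75 exactly.

```r
# Probability that exactly k of N observations exceed x, for a standard normal
N <- 10; x <- 0.67; k <- 3
Fx <- pnorm(x)                                    # cumulative probability F(x), about 0.75

# Binomial expression written out term by term
p_exact <- choose(N, k) * (1 - Fx)^k * Fx^(N - k)

# Same quantity from the built-in binomial density
p_dbinom <- dbinom(k, size = N, prob = 1 - Fx)

# With F(x) rounded to exactly 0.75, as in the notes
p_rounded <- dbinom(k, size = N, prob = 0.25)
```

For the future-observations case, the same call applies with `size` set to the number of future observations and `prob` set to 1 - F(x) evaluated at the m-th ranked past value.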
6.3 Probability of number of observations exceeding some value, even if distribution is not known
  General expressions:
  - Derivation: See Gumbel pp.
  - Intuitive explanation for insensitivity to distribution: A given number of points should cover a given proportion of the cumulative distribution, regardless of the shape of the distribution (provided that it is continuous).
  - Example (table from Gumbel): Note symmetry in table. Probability of x exceedances above largest is the same as probability of x exceedances below lowest, etc.
  Application to crinoid evolution (Foote 1994)
  Relationship to theory of records
  - Let there be $n_1$ past trials and $n_2$ future trials. What is the probability that the record set (m = 1) by first set of trials will stand by the second set (i.e. x = 0)? This is w(0).
  - Now, suppose we let $n_1 = n_2$; then we have:
    $$w(x) = \frac{\binom{n_1}{m} \, m \, \binom{n_2}{x}}{(n_1 + n_2) \binom{n_1 + n_2 - 1}{x + m - 1}},$$
    which, for $n_1 = n_2$, m = 1, and x = 0, gives
    $$w(0) = \frac{\binom{n_1}{1} \binom{n_1}{0}}{(2 n_1) \binom{2 n_1 - 1}{0}},$$
    which is equal to 1/2.
  - What is the expected number of exceedances above the past record? $E(x) = \frac{m n_2}{n_1 + 1} = \frac{n_1}{n_1 + 1}$ ($\approx 1$ for large $n_1$).
  - Thus, for athletic contests, if all trials reflect the same underlying pool of talent, equipment, etc., the waiting time between successive records should progressively double.
  - Likewise for discoveries of largest dinosaur, oldest primate etc. Deviations suggest change in rules or nonrandom searching.
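The distribution-free exceedance probabilities can be evaluated directly. The sketch below assumes the expression $w(x) = \binom{n_1}{m} m \binom{n_2}{x} / [(n_1 + n_2) \binom{n_1 + n_2 - 1}{x + m - 1}]$ and computes it on the log scale. The function name `exceed_prob` and the choice $n_1 = n_2 = 20$ are hypothetical, for illustration only.

```r
# Probability that exactly x of n2 future observations exceed the m-th largest
# of n1 past observations, with no assumption about the underlying distribution
# beyond continuity.
exceed_prob <- function(x, n1, n2, m = 1) {
  log_num <- lchoose(n1, m) + log(m) + lchoose(n2, x)
  log_den <- log(n1 + n2) + lchoose(n1 + n2 - 1, x + m - 1)
  exp(log_num - log_den)
}

n1 <- 20; n2 <- 20
w <- exceed_prob(0:n2, n1, n2, m = 1)   # w(0), w(1), ..., w(n2)

p_record_stands <- w[1]                 # w(0): past record survives all future trials
total_prob <- sum(w)                    # probabilities over x = 0..n2
mean_exceed <- sum((0:n2) * w)          # expected number of exceedances
```

With equal numbers of past and future trials the record stands with probability 1/2, and the expected number of exceedances can be compared against $m n_2 / (n_1 + 1)$.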
More informationDescriptive Statistics
Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web
More information2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationSection 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
More informationInference for two Population Means
Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example
More informationAdvanced Topics in Statistical Process Control
Advanced Topics in Statistical Process Control The Power of Shewhart s Charts Second Edition Donald J. Wheeler SPC Press Knoxville, Tennessee Contents Preface to the Second Edition Preface The Shewhart
More informationSAMPLING & INFERENTIAL STATISTICS. Sampling is necessary to make inferences about a population.
SAMPLING & INFERENTIAL STATISTICS Sampling is necessary to make inferences about a population. SAMPLING The group that you observe or collect data from is the sample. The group that you make generalizations
More informationAn introduction to Value-at-Risk Learning Curve September 2003
An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
More informationProbability Distributions
CHAPTER 6 Probability Distributions Calculator Note 6A: Computing Expected Value, Variance, and Standard Deviation from a Probability Distribution Table Using Lists to Compute Expected Value, Variance,
More informationStatistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013
Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives
More information