GEOS 33000/EVOL, January 2006 (modified January 12, 2006)
III. Sampling

0 Some R commands for functions we've covered so far
0.1 rbinom(m,n,p) returns m integers drawn from the binomial distribution with n trials and probability of success p. Each one of the integers returned would be k in our terminology.
0.2 dbinom(k,n,p) returns the probability of exactly k successes in n trials each with probability p.
0.3 pbinom(j,n,p) returns the cumulative probability of j or fewer successes in n trials each with probability p.
0.4 rpois(m,a) returns m integers drawn from the Poisson distribution with parameter a.
0.5 dpois(k,a) returns the probability of exactly k events in a Poisson distribution with parameter a.
0.6 ppois(j,a) returns the cumulative probability of j or fewer events resulting from a Poisson distribution with parameter a.
0.7 rmultinom(m,n,p) returns m vectors of integers drawn from a multinomial with n trials and vector of probabilities p.
0.8 dmultinom(k,n,p) returns the probability of sampling exactly k (where k is a vector of integers) in n trials with vector of probabilities p.
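As a quick check of how these functions relate to one another, the following sketch may help. The parameter values (n, p, a, j, probs) are arbitrary choices made for illustration and do not come from the notes.

```r
n <- 10; p <- 0.3; a <- 2.5   # arbitrary illustrative parameters
j <- 4

# pbinom(j,n,p) is the sum of dbinom(k,n,p) over k = 0..j
cum_binom <- sum(dbinom(0:j, n, p))
all.equal(cum_binom, pbinom(j, n, p))

# ppois(j,a) is the sum of dpois(k,a) over k = 0..j
cum_pois <- sum(dpois(0:j, a))
all.equal(cum_pois, ppois(j, a))

# the mean of many binomial draws should be close to n*p
set.seed(1)
k <- rbinom(10000, n, p)
mean(k)

# multinomial: each column of the result is one draw; columns sum to n
probs <- c(0.5, 0.3, 0.2)
draws <- rmultinom(5, n, probs)
colSums(draws)
dmultinom(c(5, 3, 2), n, probs)   # probability of one particular outcome
```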
0.9 rexp(n,a) returns n numbers drawn from the exponential distribution with parameter (rate) a.
0.10 dexp(x,a) returns the density of the exponential distribution with parameter a at X = x.
0.11 pexp(x,a) returns the cumulative probability of the exponential distribution with parameter a at X = x.
0.12 rnorm(n) returns n numbers drawn from the standard normal distribution (with zero mean and unit variance).
0.13 dnorm(x) returns the normal density at X = x.
0.14 pnorm(x) returns the normal distribution function (cumulative probability) at X = x.
0.15 runif(), dunif(), punif(): These are like rnorm(), dnorm(), pnorm(), but for the uniform distribution on (0,1).
0.16 choose(n,k) returns $\binom{n}{k}$, i.e. $n!/[k!(n-k)!]$.
0.17 factorial(j) returns $j!$.
0.18 lfactorial(j) returns $\ln(j!)$.
0.19 gamma(x) returns $\Gamma(x)$.
0.20 mean(x), median(x), var(x), sd(x) return mean, median, variance, and standard deviation of the vector or array x.
0.21 cov(x,y) returns the covariance between vectors x and y.

1 Overview of Sampling, Error, Bias

2 Error Estimates With Assumed Sampling Distribution
2.1 Standard Error: Standard deviation of the distribution of sample statistics that would result from an infinite number of trials of drawing a sample from the underlying probability distribution and calculating the sample statistic.
2.2 In practice we generally do not estimate error by repeated sampling from the underlying distribution (expensive and time-consuming), although there are exceptions.
2.3 Approximations based on sample distribution (from Sokal and Rohlf): [table not reproduced]
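The definition of standard error in 2.1 can be illustrated by brute force when the underlying distribution is known. The sketch below assumes a standard normal distribution, an arbitrary sample size of 25, and 5000 repeated samples standing in for the "infinite number of trials". It compares the result with the familiar approximation $\sigma/\sqrt{n}$ for the standard error of the mean, a standard result that is assumed here rather than taken from the notes.

```r
set.seed(42)
n <- 25          # sample size (arbitrary)
ntrials <- 5000  # number of repeated samples; stands in for "infinite"

# draw many samples from a standard normal and record each sample mean
means <- replicate(ntrials, mean(rnorm(n)))

se_sim <- sd(means)        # standard error estimated by repeated sampling
se_theory <- 1 / sqrt(n)   # sigma/sqrt(n), with sigma = 1
c(simulated = se_sim, theoretical = se_theory)
```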
Limitations:
- Many approximation formulae make assumptions about the shape of the distribution and the sample size.
- We may be interested in a novel statistic or one whose sampling distribution is not well characterized.

3 Bootstrap Error Estimates
3.1 Estimate standard error by resampling from the single sample we have.
3.2 This approach uses sampling with replacement from the observed sample to simulate sampling without replacement from the underlying distribution.
3.3 Procedure
- Start with an observed sample of size n and an observed sample statistic, call it $Z$.
- Randomly pick a sample of size n, with replacement, from the observed sample.
- Calculate the sample statistic of interest on this random sample; call it $Z_{boot}$.
- Repeat many times (generally hundreds to thousands).
- Calculate the standard deviation of the $Z_{boot}$. This is an estimate of the standard error of the observed sample statistic Z: $SD(Z_{boot}) \approx SE(Z)$.
3.4 Simple (but not necessarily most useful) example: trimmed mean
- Define the p-% trimmed mean as the mean of the sample with the p% lowest and p% highest observations discarded. (The idea is to try to reduce the effect of outliers.)
- Suppose the data consist of 10 (ordered) observations: 1, 2, 3, 4, 8, 10, 12, 15, 20, 30.
- Let the trimmed mean, with the two lowest and two highest observations discarded, be denoted Z. Then $Z = (3 + 4 + 8 + 10 + 12 + 15)/6 = 8.67$.
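The hand calculation of the trimmed mean can be checked against base R's mean(), whose trim argument gives the fraction of observations dropped from each end. This is offered as a cross-check only; the variable names z_hand and z_builtin are introduced here for illustration.

```r
x <- c(1, 2, 3, 4, 8, 10, 12, 15, 20, 30)

# by hand: drop the 2 lowest and 2 highest of the 10 sorted observations
z_hand <- mean(sort(x)[3:8])

# base R: trim = 0.2 drops floor(10 * 0.2) = 2 observations from each end
z_builtin <- mean(x, trim = 0.2)

round(c(z_hand, z_builtin), 2)
```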
R code to estimate SE(Z):

    #define function
    trim.mean<-function(x,ntrim){
      n<-length(x)   #sample size, taken from x rather than from a global variable
      ii<-order(x)   #indices that sort x
      xtmp<-x[ii]    #sorted observations
      return(mean(xtmp[(ntrim+1):(n-ntrim)]))
    }
    data<-c(1,2,3,4,8,10,12,15,20,30)  #specify data
    n<-length(data)
    ntrim<-2                           #specify number to trim from each side
    Zobs<-trim.mean(data,ntrim)        #get observed value
    nrep<-1000                         #specify number of bootstrap replicates (value missing in the transcription; 1000 assumed)
    Zboot<-rep(NA,nrep)                #assign memory
    for (i in 1:nrep)                  #get bootstrap replicates
      Zboot[i]<-trim.mean(sample(data,n,replace=TRUE),ntrim)
    SE<-sd(Zboot)                      #calculate bootstrap std. error
    hist(Zboot,breaks=50)              #plot histogram of results

This yields $Z_{obs} = 8.67$ and $SE(Z) \approx 3.1$.

[Figure: histogram of Zboot (x-axis: Zboot; y-axis: Frequency)]
3.5 Useful R function: sample(x,n,replace=TRUE [or FALSE]) returns a random sample of size n from the vector x with or without replacement.
3.6 To sample from array X so that the variables (columns) stay together:

    nr<-dim(X)[1]                    #get number of rows
    i<-sample(1:nr,n,replace=TRUE)   #or replace=FALSE; returns vector of integers sampled on [1,nr]
    XSAMP<-X[i,]                     #keep whole rows, so columns stay together

4 Parametric bootstrap
4.1 Take the observed sample and estimate the relevant parameter from it.
4.2 Resample from a parametric distribution with parameter equal to the sample estimate (rather than resampling from the observed distribution).
4.3 This approach can also be applied to more complicated situations: for example, simulating a process with parameters estimated from data. We'll do lots of this later...
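A minimal sketch of steps 4.1 and 4.2, assuming for illustration that the data are exponentially distributed waiting times. The data here are simulated (hypothetical), and the choice of the exponential model, the sample size of 30, and 2000 replicates are all assumptions made for this example.

```r
set.seed(7)
# hypothetical data: 30 draws from an exponential with rate 2
obs <- rexp(30, 2)
n <- length(obs)

rate_hat <- 1 / mean(obs)   # 4.1: estimate the rate parameter from the sample

nrep <- 2000
rate_boot <- rep(NA, nrep)
for (i in 1:nrep) {
  sim <- rexp(n, rate_hat)        # 4.2: resample from the fitted distribution,
  rate_boot[i] <- 1 / mean(sim)   # not from the observed values
}
SE_param <- sd(rate_boot)   # parametric bootstrap standard error
c(estimate = rate_hat, SE = SE_param)
```

The only difference from the nonparametric bootstrap is the line that generates sim: it calls rexp() with the estimated rate instead of calling sample() on the observed data.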
5 Examples of Finite-sample Bias (sample-size bias)
5.1 Sample variance
- $\sum (x - \bar{x})^2 / n$ is biased. This is systematically too low, which makes sense since it is based on squared deviations from the sample mean.
- $\sum (x - \bar{x})^2 / (n - 1)$ is unbiased.
5.2 Number of taxa
Rarefaction method (from Raup 1975)
- Abundance of species i is $N_i$; $N = \sum N_i$.
- Consider a particular species, i.
- $\binom{N - N_i}{n}$ is the number of ways of drawing the non-i individuals in a sample of n.
- $\binom{N}{n}$ is the number of ways of drawing all individuals.
- Therefore, the ratio of these two is the probability of not drawing any individuals of species i.
- Therefore 1 minus this ratio is the probability of drawing at least one individual of species i.
- So the expected number of species is just the sum of this probability, calculated for each species in turn.
Caveats
- Rarefaction is for interpolation rather than extrapolation.
- Collecting curves vs. rarefaction curves.
- Apparent leveling off of curves does not imply that nearly everything has been found (only that you're unlikely to find it with modest effort).
- Curves are affected by factors other than sample size (sampling method, taxonomic treatment, size of geographic area, etc.).
- Crossing of rarefaction curves can make interpretation difficult.
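The rarefaction calculation described above can be sketched in R as follows. The function name rarefy and the abundance vector Ni are hypothetical, introduced only for illustration; the ratio of binomial coefficients is computed on the log scale with lchoose() to avoid overflow when N is large.

```r
# expected number of species in a random subsample of n individuals
# Ni: vector of species abundances; n: subsample size (n <= sum(Ni))
rarefy <- function(Ni, n) {
  N <- sum(Ni)
  # probability that species i is absent: choose(N-Ni, n)/choose(N, n)
  p_absent <- exp(lchoose(N - Ni, n) - lchoose(N, n))
  sum(1 - p_absent)   # sum of probabilities of drawing at least one individual
}

Ni <- c(50, 30, 10, 5, 3, 1, 1)   # hypothetical abundances, N = 100
sizes <- c(1, 5, 10, 25, 50, 100)
curve_pts <- sapply(sizes, function(n) rarefy(Ni, n))
cbind(n = sizes, expected_species = round(curve_pts, 2))
```

Two sanity checks follow from the formula: a subsample of one individual contains exactly one species, and a subsample of all N individuals contains every species.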
Examples of application of taxonomic rarefaction (Raup 1975; Raup and Schopf 1978)
This example suggests that the increase in observed family diversity in post-Paleozoic echinoids cannot be accounted for by an increase in the number of species sampled.
This example suggests that much of the variation in the number of observed echinoid orders is consistent with differences in number of sampled species.
Interpretation of taxonomic rarefaction curves is not entirely straightforward. Sampling standardization to be treated in more detail later.
Range
Example: Range of samples from normal distribution
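A small simulation can illustrate how the range of a sample from a normal distribution depends on sample size. This is a sketch under assumed settings (standard normal, the listed sample sizes, 2000 replicates per size), not a reproduction of the figures in the notes.

```r
set.seed(3)
sizes <- c(5, 10, 20, 50, 100, 500)   # sample sizes to compare (arbitrary)
nrep <- 2000                          # replicate samples per size

# for each sample size, average the observed range over many samples
mean_range <- sapply(sizes, function(n)
  mean(replicate(nrep, diff(range(rnorm(n))))))

cbind(n = sizes, mean_range = round(mean_range, 2))
```

The mean range keeps growing with sample size even though the underlying distribution is the same throughout, which is the sample-size bias at issue.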
Example: Test for nonrandomness of sampling with respect to morphology
Correction in general case via rarefaction (random subsampling at controlled sample size).
Caveat: Range at standardized sample size may not convey any information that isn't conveyed by sample variance.
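Random subsampling to a controlled sample size might be sketched as follows. The two samples are hypothetical, drawn from the same standard normal so that any difference in raw range reflects sample size alone; the names small, large, and rar are introduced for illustration.

```r
set.seed(11)
# hypothetical samples from the same distribution, differing only in size
small <- rnorm(20)
large <- rnorm(200)

raw_large <- diff(range(large))   # raw range of the large sample

# rarefy: repeatedly subsample the large sample down to the small sample size
n_std <- length(small)
nrep <- 1000
rar <- replicate(nrep, diff(range(sample(large, n_std, replace = FALSE))))

c(small = diff(range(small)), large_raw = raw_large, large_rarefied = mean(rar))
```

Subsampling is done without replacement here, so each subsample's range can never exceed the raw range of the large sample.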
6 Extreme value statistics
6.1 Introduction to problem
- Previous look at standard errors considered the sampling distribution of quantities such as the mean.
- We may also be interested in the distribution of extremes: for example, how is the largest of n observations distributed, or the second smallest, etc.?
6.2 Probability of number of observations exceeding some value, if distribution known
- $\Pr(X > x) = 1 - F(x)$, where $F(x)$ is the cumulative distribution.
- If there are N observations, then the probability that exactly k of them exceed some value x is given by a simple binomial: $\binom{N}{k} [1 - F(x)]^k F(x)^{N-k}$.
- Example: standard normal with N = 10, x = 0.67, and k = 3: F(0.67) = 0.75, so the probability = $\binom{10}{3} (0.25)^3 (0.75)^7 \approx 0.25$.
- Future observations: Suppose we have $n_1$ past observations ranked from m = 1 (largest) to m = $n_1$ (smallest), and we take $n_2$ future observations. What is the probability that exactly k of the $n_2$ observations will exceed the m-th value from the first set of $n_1$ observations? Simply find F(x) corresponding to the m-th value and plug into the previous binomial equation. Clearly this works only if we know the distribution.
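The binomial exceedance probability for the worked example can be computed directly in R. The variable names below are introduced for illustration; pnorm() supplies F(x) for the standard normal.

```r
N <- 10; x <- 0.67; k <- 3
Fx <- pnorm(x)        # cumulative probability F(x), about 0.75
p_exceed <- 1 - Fx    # probability that a single observation exceeds x

# probability that exactly k of N observations exceed x, written out
p_manual <- choose(N, k) * p_exceed^k * Fx^(N - k)

# the same quantity via dbinom, treating an exceedance as a "success"
p_dbinom <- dbinom(k, N, p_exceed)

c(manual = p_manual, dbinom = p_dbinom)
```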
Probability of number of observations exceeding some value, even if distribution is not known
General expressions:
Intuitive explanation for insensitivity to distribution: A given number of points should cover a given proportion of the cumulative distribution, regardless of the shape of the distribution (provided that it is continuous).
Example (table from Gumbel): Note symmetry in table. Probability of x exceedances above the largest is the same as the probability of x exceedances below the lowest, etc.
Application to crinoid evolution (Foote 1994)
Relationship to theory of records
- Let there be $n_1$ past trials and $n_2$ future trials. What is the probability that the record set (m = 1) by the first set of trials will stand through the second set (i.e. x = 0)? This is $w(0)$.
- Now, suppose we let $n_1 = n_2$. We have

  $$w(x) = \frac{m \binom{n_1}{m} \binom{n_2}{x}}{(n_1 + n_2) \binom{n_1 + n_2 - 1}{x + m - 1}},$$

  which, for $n_1 = n_2$, m = 1, and x = 0, gives

  $$w(0) = \frac{\binom{n_1}{1} \binom{n_1}{0}}{(2 n_1) \binom{2 n_1 - 1}{0}},$$

  which is equal to $\frac{1}{2}$.
- What is the expected number of exceedances above the past record? $E(x) = \frac{m n_2}{n_1 + 1} = \frac{n_1}{n_1 + 1}$ ($\approx 1$ for large $n_1$).
- Thus, for athletic contests, if all trials reflect the same underlying pool of talent, equipment, etc., the waiting time between successive records should progressively double.
- Likewise for discoveries of the largest dinosaur, oldest primate, etc.
- Deviations suggest a change in rules or nonrandom searching.
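The distribution-free exceedance probability can be evaluated numerically as a check on the results above. This is a sketch: the function name w follows the notation of the notes, but the argument order and the choice $n_1 = n_2 = 20$ are assumptions made for illustration.

```r
# probability that exactly x of n2 future observations exceed the
# m-th largest of n1 past observations
w <- function(x, m, n1, n2) {
  m * choose(n1, m) * choose(n2, x) /
    ((n1 + n2) * choose(n1 + n2 - 1, x + m - 1))
}

n1 <- 20; n2 <- 20; m <- 1
w0 <- w(0, m, n1, n2)            # probability that the past record stands

x <- 0:n2
probs <- w(x, m, n1, n2)
total <- sum(probs)              # probabilities over all x should sum to 1
Ex <- sum(x * probs)             # expected number of exceedances
Ex_formula <- m * n2 / (n1 + 1)  # closed-form expectation

c(w0 = w0, total = total, Ex = Ex, Ex_formula = Ex_formula)
```

With $n_1 = n_2$ the record stands with probability one half whatever the common value of n, and the expected number of exceedances approaches 1 as n grows.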
Summary of R software commands used to generate bootstrap and permutation test output and figures in Chapter 16 Since R is command line driven and the primary software of Chapter 16, this document details
More informationAP STATISTICS 2010 SCORING GUIDELINES
2010 SCORING GUIDELINES Question 4 Intent of Question The primary goals of this question were to (1) assess students ability to calculate an expected value and a standard deviation; (2) recognize the applicability
More informationProbability density function : An arbitrary continuous random variable X is similarly described by its probability density function f x = f X
Week 6 notes : Continuous random variables and their probability densities WEEK 6 page 1 uniform, normal, gamma, exponential,chi-squared distributions, normal approx'n to the binomial Uniform [,1] random
More informationWeek 3&4: Z tables and the Sampling Distribution of X
Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationNormality Testing in Excel
Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com
More informationReview of Random Variables
Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random
More informationMultiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
More informationseven Statistical Analysis with Excel chapter OVERVIEW CHAPTER
seven Statistical Analysis with Excel CHAPTER chapter OVERVIEW 7.1 Introduction 7.2 Understanding Data 7.3 Relationships in Data 7.4 Distributions 7.5 Summary 7.6 Exercises 147 148 CHAPTER 7 Statistical
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationCharacteristics of Binomial Distributions
Lesson2 Characteristics of Binomial Distributions In the last lesson, you constructed several binomial distributions, observed their shapes, and estimated their means and standard deviations. In Investigation
More information2WB05 Simulation Lecture 8: Generating random variables
2WB05 Simulation Lecture 8: Generating random variables Marko Boon http://www.win.tue.nl/courses/2wb05 January 7, 2013 Outline 2/36 1. How do we generate random variables? 2. Fitting distributions Generating
More informationDescriptive Statistics
Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web
More informationMeasurement with Ratios
Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical
More information5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.
The Normal Distribution C H 6A P T E R The Normal Distribution Outline 6 1 6 2 Applications of the Normal Distribution 6 3 The Central Limit Theorem 6 4 The Normal Approximation to the Binomial Distribution
More informationChapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS. Part 3: Discrete Uniform Distribution Binomial Distribution
Chapter 3: DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Part 3: Discrete Uniform Distribution Binomial Distribution Sections 3-5, 3-6 Special discrete random variable distributions we will cover
More informationindividualdifferences
1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,
More informationTutorial 5: Hypothesis Testing
Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................
More information