Sample Size. When is a coin biased? What does this mean? Why do we sample? Validity and Precision. Populations

Transcription

Sample Size
Dr. Randall Singer, Professor of Epidemiology
Executive Veterinary Program, University of Illinois
December 11-12, 2014

Why do we sample?
- We rarely have the time or the resources to gather data on all individuals (or whatever the sampling unit is)
- We must select a subset of the population to study (i.e. a sample)
- The data will consist of estimates instead of population parameters
- The validity of these estimates will depend on the sampling method
- The precision of the estimate will depend on the sample size

Validity and Precision
[Figure: validity (high/low) versus precision (high/low)]

Populations
- Target population: the entire set of individuals to which the findings of the survey will be extrapolated
- Study population: the collection of individuals (sampling units) that are actually studied and from which the sample is drawn
- Sampling frame: a list of all sampling units in the study population

What does this mean?
Odds Ratio = 2.3 (95% CI ) p = 0.02

When is a coin biased?
- You want to test whether a coin is biased
- Null hypothesis: the coin is fair (unbiased)
- You toss it 5 times: 5 consecutive heads
- What is your conclusion?

How many heads do you need to throw?
[Figure: frequency distribution of n heads in 10 tosses of an unbiased coin]

How biased is the coin?
- A biased coin will still throw tails
- Bias can be small but real
- Most (all?) coins may be biased
- How far from 0.5 is the observed proportion?
- What is the probability of 5 heads with a fair coin?
- How confident do you want to be that any bias is of negligible importance?

When is a coin biased?
- Statistical tests report a P-value: the probability of obtaining a result as extreme as, or more extreme than, that observed, under the assumption that the null hypothesis is true
- Declare groups statistically different if the probability of an observed difference occurring by chance is small
- Conventionally where P < 0.05 (arbitrary)
- Now (finally!) being questioned seriously

When is a coin biased?
Coin example (10 tosses):
- Null hypothesis: P of head = 0.5
- 8 heads: deem the coin unbiased (chance, as P = 0.054)
- 9 or 10 heads: statistically significant (P = 0.01)

Observed vs. Expected
- Statistical inference is based on comparison of observed data with expected data, assuming no difference between treatments
- We (almost) always have some observed difference
- How different (magnitude of effect)?
- How likely (P value) are the data if the null hypothesis is true?
- Biological vs. statistical significance

Sample size and significance testing
- Statistically significant: difference unlikely to be due to chance alone
- Effect of sample size:
  - Small sample size → large sampling error → non-significant result even when a real effect exists
  - Large sample size → small differences of no biological importance may be statistically significant
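The coin figures above can be checked against the exact binomial distribution. The following is a minimal sketch, assuming one-sided (upper-tail) probabilities for a fair coin; the function names are illustrative and do not come from the lecture.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n tosses when P(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p=0.5):
    """One-sided P-value: probability of k or more heads in n tosses."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


if __name__ == "__main__":
    # 5 consecutive heads in 5 tosses of a fair coin
    print(f"5 heads in 5 tosses: {upper_tail(5, 5):.4f}")
    # Tail probabilities for the 10-toss example
    for k in (8, 9, 10):
        print(f"{k} or more heads in 10 tosses: {upper_tail(k, 10):.4f}")
```

Under these assumptions, 8 or more heads in 10 tosses has a probability of about 0.055 and 9 or more heads about 0.011, which is in line with the P = .054 and P = 0.01 quoted in the coin example.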

Key points on P-values
- Indicate the probability of the observed data under the assumption that the null hypothesis is true
- Do not indicate the probability that the alternative hypothesis is true (e.g. that the coin is biased)
- Do not indicate the magnitude of the effect of interest
- Assume no errors or biases in the data
- Cut-off for statistical significance (P = 0.05) is arbitrary!
- Use of exact P values (e.g. P = 0.06) is preferred to use of cut-off values (e.g. P > 0.05)
- Statistical significance ≠ biological significance

Statistical inference
- Conclusions about the population value are drawn from a (random) sample: estimates vs. true (unknown) values
- Inference assumes no biases in data collection or analysis
- Point estimate: value representing the effect under investigation (e.g. mean)
- Confidence interval: indicates the precision of the point estimate

Statistical significance
Hypothesis testing
- Is the risk of disease different between breed A and breed B?
- Evaluation of an observed difference between breeds

  Breed   Disease positive   Disease negative   Prevalence
  A
  B

- By chance?
- Null hypothesis (H0): breed A = breed B
  - No difference in population parameters
  - Observed difference is simply the result of random variation in the data

Relationship between results of a statistical test and the true biological state

  Study finding     Biological truth: association     Biological truth: no association
  Significant       Correct (power)                   Incorrect: Type I or alpha error
  Not significant   Incorrect: Type II or β error     Correct

- Power: probability a study will find a statistical difference if it exists = (1 - β)
- α: probability of an observed difference due to chance alone

Key points: Error rates in statistical tests
- Type I (α) error = false rejection of the null hypothesis. Analogous to the false positive rate for a test
  - α level: predetermined significance threshold (0.05)
  - Biologists accept an error rate of 5% for falsely concluding a relationship exists
- Type II (β) error = false acceptance of the null hypothesis. Analogous to the false negative rate for a test
  - Conventional level of 0.2 (20%) used in study design
- Trade-off between α and β errors, like sensitivity and specificity in diagnostic testing
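To make the link between sample size, α and power concrete, the coin example can be reused. The sketch below is an illustration only: it assumes a one-sided exact binomial test of a fair coin and a hypothetical true P(head) of 0.6, neither of which is specified in the slides.

```python
from math import comb


def upper_tail(k, n, p):
    """Probability of k or more heads in n tosses when P(head) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n, alpha=0.05, p0=0.5):
    """Smallest number of heads k such that P(X >= k | p0) <= alpha."""
    for k in range(n + 2):  # k = n + 1 gives an empty tail (probability 0)
        if upper_tail(k, n, p0) <= alpha:
            return k


def power(n, p_true, alpha=0.05, p0=0.5):
    """Probability of rejecting the null hypothesis when P(head) = p_true."""
    return upper_tail(critical_value(n, alpha, p0), n, p_true)


if __name__ == "__main__":
    # Assumed true bias of 0.6: a real but modest effect
    for n in (10, 50, 200):
        print(f"n = {n:3d}: reject at {critical_value(n)} or more heads, "
              f"power = {power(n, 0.6):.2f}")
```

With only 10 tosses the test rarely rejects the null hypothesis even though the coin is biased (a Type II error), while the same bias is detected far more often with 200 tosses.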

Sample Size Calculations for Disease
- Sample size to estimate disease occurrence
  - prevalence, incidence
  - usually we measure a proportion (%)
  - simplest approach is to use the binomial distribution
  - we may be interested in the mean herd prevalence or incidence; we assume this is a continuous variable that is normally distributed
- Sample size to detect disease presence
  - prevalence near zero
  - assume a perfect diagnostic test

Exercise 1
You plan to conduct a survey to estimate the prevalence of antibodies against BVD in cattle in a specific county. There are 165 herds in the county. You have no idea about the herd-level prevalence of BVD antibodies. How many herds would you sample (assuming you have a perfect diagnostic test)?

Exercise 2
Before conducting the survey to estimate the prevalence of antibodies against BVD in cattle in the county, a colleague suggests that the prevalence might be about 20%. How many herds would you now sample? Why is your answer different to Exercise 1?

Exercise 3
Your colleague also says that everyone knows that the prevalence of antibodies against BVD in cattle in the county is somewhere in the range of 10 to 30%. How many herds would you now sample?

Exercise 4
You are interested in the average milk production of cows in your BVD survey. You think the average milk production is approximately 10 liters per day. In a pilot study of one herd, you recorded the following milk production: 7, 11, 12, 9, and 11. If you wanted to estimate milk production in a herd with 70 cows, how many would you sample?

Exercise 5
In the same village, you want to know if the amount of milk produced by BVD positive cows, versus BVD negative cows, is the same. You think that BVD may suppress milk production by as much as 40%. How many cows do you need to sample, if you want to be at least 80% certain of finding a real difference, if it exists?
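For the proportion-based exercises, one common starting point is the normal approximation to the binomial, n = z² p (1 - p) / L², followed by a finite population correction. The sketch below is not taken from the slides: the formula choice, the function name and the allowable error L of ±10 percentage points are all assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion(p, precision, conf=0.95, population=None):
    """Sample size to estimate a proportion p to within +/- precision.

    Uses the normal approximation to the binomial. If a population size
    is given, a finite population correction is applied.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    n = z**2 * p * (1 - p) / precision**2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return ceil(n)


if __name__ == "__main__":
    herds = 165          # from the exercises
    assumed_error = 0.10  # assumption: not stated in the exercises
    for p in (0.5, 0.2):
        print(f"expected prevalence {p:.0%}: "
              f"{n_for_proportion(p, assumed_error, population=herds)} herds")
```

When nothing is known about the prevalence, p = 0.5 is the conservative choice because p (1 - p) is largest there; any prior estimate further from 50% reduces the required sample.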

Exercise 6
If you think that BVD may suppress milk production by only about 10%, how does this affect your sample size estimate? What are some ways of minimizing the sample size needed?

Sample Size Calculations for Disease
- Sample size to detect disease presence
  - prevalence near zero
  - assume a perfect diagnostic test

Suppose we are interested in whether a herd is free of disease X
- If we assume that, if the disease is present, 50% of animals will be infected, then if we sample and test just one animal from the herd, the probability of detecting disease is 50%
- If we sample 2 animals, the probability of detecting disease is 75%
- If we sample 3 animals, the probability of detecting disease is 87.5%

probability of detecting disease
  = probability that sample 1 (or 2 or 3 or ...) is positive
  = 1 - probability of NOT detecting disease
  = 1 - (1 - probability animal 1 is positive) x (1 - probability animal 2 is positive) x (1 - probability animal 3 is positive)
  = 1 - (1 - probability each animal is positive)^(number sampled)

In LaTeX notation, with p the prevalence and n the number sampled:
  P(\text{detect}) = 1 - (1 - p)^{n}

e.g. 50% probability of animals being infected (= prevalence), 3 animals are sampled:
  probability of detecting disease = 1 - (1 - 0.5)^3 = 87.5%

e.g. 10% probability of animals being infected (= prevalence), 5 animals are sampled:
  probability of detecting disease = 1 - (1 - 0.1)^5 = 41%

Note: by convention, we usually want 95, 98 or 99% confidence that a herd, region or country is free of disease, i.e. we attempt to minimize the chances we are wrong to <5, <2 or <1%
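The detection probability above is easy to compute directly, and the same relationship can be inverted to ask how many animals are needed for a chosen confidence. This is a minimal sketch with illustrative function names; it assumes a perfect test and a population large enough that each sampled animal is positive independently with probability equal to the prevalence.

```python
from math import ceil, log


def prob_detect(prevalence, n):
    """Probability that at least one of n sampled animals is positive."""
    return 1 - (1 - prevalence) ** n


def n_to_detect(prevalence, conf=0.95):
    """Smallest n for which prob_detect(prevalence, n) reaches conf.

    Obtained by solving 1 - (1 - prevalence)**n >= conf for n.
    """
    return ceil(log(1 - conf) / log(1 - prevalence))


if __name__ == "__main__":
    print(f"{prob_detect(0.5, 3):.1%}")  # slide example: 50% prevalence, 3 animals
    print(f"{prob_detect(0.1, 5):.1%}")  # slide example: 10% prevalence, 5 animals
    print(n_to_detect(0.1, conf=0.95))   # animals needed for 95% confidence
```

Because the population is treated as effectively infinite here, the result is slightly conservative for small herds, where sampling without replacement makes detection a little easier.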

Probability of Detecting Disease
Probability of detecting disease depends on:
1. probability that any given animal is positive (prevalence)
2. number of animals sampled

Probability of detecting disease is low if:
1. prevalence is low
2. sample size is low

[Figure panels: sample size = 20; prevalence = 5%]

Probability of Detecting Disease
Sample size needed to find at least one diseased animal depends on:
1. number of detectable cases in the population
2. number of animals sampled
3. confidence level
4. population size

e.g. suppose you want to detect whether or not a flock of N = 1000 animals is positive for pathogen X
- if X is present, you suspect that 50% of the flock is infected: d = 500
- you desire 95% confidence: P = 0.95
- n = 4.48 (need to sample ~5 animals)

e.g. suppose you want to detect whether or not a flock of N = 1000 animals is positive for pathogen X
- if X is present, you suspect that 5% of the flock is infected: d = 50
- you desire 95% confidence: P = 0.95
- n = 56.7 (need to sample ~57 animals)
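The transcript does not show the formula behind n = 4.48 and n = 56.7. Both values are reproduced by a widely used finite-population approximation, n = (1 - (1 - P)^(1/d)) x (N - (d - 1)/2), often attributed to Cannon and Roe. The sketch below assumes that this is the formula intended; the function name is illustrative.

```python
def n_to_detect_finite(N, d, conf=0.95):
    """Approximate sample size to find at least one positive animal.

    N    : population size
    d    : number of detectable (positive) animals assumed present
    conf : desired confidence of detecting at least one positive

    Assumption: finite-population approximation
    n = (1 - (1 - conf)**(1/d)) * (N - (d - 1)/2), with a perfect test.
    """
    return (1 - (1 - conf) ** (1 / d)) * (N - (d - 1) / 2)


if __name__ == "__main__":
    # Slide examples: flock of 1000 with 50% or 5% of animals infected
    for d in (500, 50):
        print(f"d = {d}: n = {n_to_detect_finite(1000, d):.2f}")
```

The result is normally rounded up to the next whole animal, as on the slide.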

Probability of Detecting Disease
Sample size needs to be increased if:
1. number of detectable cases in the population is small
2. desired confidence level is high
3. population size is large

Maximum Number of Positives
What is the maximum number of positives in a population, given that all animals sampled are negative?

e.g. suppose that 1,000 slaughter cows tested negative for E. coli O157:H7 (n = 1000) and the total number of cows slaughtered was 1 million (N = 1,000,000). What is the maximal prevalence in the population of slaughtered cows, if you desire 95% confidence (P = 0.95)?
- Result: maximal prevalence = d / N = 0.30%

Exercise 7
You plan to conduct a survey to estimate the prevalence of antibodies against BVD in cattle in the county. There are 165 herds in the county. How many samples per herd should you take to determine if a herd is BVD positive?

Exercise 8
You plan to conduct a survey to estimate the prevalence of antibodies against BVD in cattle in the county. You obtain some more information: in most herds there are 50 cows; there are some herds with 80 cows, and a few with 100 cows. For a representative sample, how many cows should be sampled within these classes?

Exercise 9
In the previous exercise, what is the effect of dropping your confidence to 90%? Increasing it to 97.5%?
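The maximal prevalence of 0.30% in the slaughter-cow example can be reproduced by swapping the roles of n and d in the usual finite-population detection approximation, giving d = (1 - (1 - P)^(1/n)) x (N - (n - 1)/2). The transcript does not show the formula, so the sketch below assumes this is the one used; the function name is illustrative and a perfect test is assumed.

```python
def max_positives(N, n, conf=0.95):
    """Approximate upper limit on the number of positives in a population
    of size N when all n sampled animals tested negative.

    Assumption: d = (1 - (1 - conf)**(1/n)) * (N - (n - 1)/2).
    """
    return (1 - (1 - conf) ** (1 / n)) * (N - (n - 1) / 2)


if __name__ == "__main__":
    N, n = 1_000_000, 1000  # slide example: 1,000 negative cows out of 1 million
    d = max_positives(N, n)
    print(f"maximum positives: about {d:.0f}")
    print(f"maximal prevalence: {d / N:.2%}")
```

Raising the confidence level or testing fewer animals both increase the upper limit, which is the behaviour the later exercises explore.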

Exercise 10
Out of the 165 herds in the county, how many need to be sampled to detect BVD antibodies, if you think that 10% of the herds are positive?

Exercise 11
What would be the total sample size for a survey to detect BVD in county herds if there are 165 herds and you think that:
- 10% of herds are positive;
- 140 herds have 50 cows;
- 15 herds have 80 cows;
- 10 herds have 100 cows; and
- the prevalence of antibodies within herds is 5%

Disease Freedom
- Many countries and regions are free of important trade-limiting diseases such as foot-and-mouth disease, classical swine fever and avian influenza
- There may be periodic incursions of such diseases
- In order to demonstrate freedom, need to identify incursions quickly and control disease spread, then eradication
- At the beginning and end of such epidemics, prevalence of disease is very low, close to zero
- Failure to detect initial incursion → larger, longer epidemics
- Failure to demonstrate freedom following incursion → unnecessary trade restrictions

Exercise 12
You want to determine if the village poultry population in the county is free of avian influenza antibodies. In the county, there are 141 villages, all of which have poultry. For freedom, OIE suggests that seroprevalence should be <0.1%, with 95% confidence. How many villages need to be sampled?

Exercise 13
Suppose in your survey of the village poultry population for avian influenza antibodies, you sample 20 villages and all samples are seronegative. What is the maximum number (%) of villages that could be seropositive with a 95% confidence level? 99% confidence level?