4. Continuous Random Variables, the Pareto and Normal Distributions

A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random variable is given by a probability density curve. The total area under this curve is 1.

Probability density curves, histograms and probability
Note: If we recorded a very large number of observations and drew a histogram using a large number of intervals, then the histogram would look very similar to the probability density curve. Hence, a histogram is an estimator of the probability density curve. The probability density curve may be thought of as the histogram for the whole population.

[Figure: A histogram as an estimator of the probability density curve]

Probability density curves and probability
The probability that a random variable takes a value between a and b, $P(a < X < b)$, is the area under the density curve between x = a and x = b. Note that for a < b, $P(a < X < b) = P(X < b) - P(X < a) = P(X > a) - P(X > b)$.

The probability that a random variable takes a value less than a, $P(X < a)$, is the area under the density curve up to x = a.

The probability that a random variable takes a value greater than b, $P(X > b)$, is the area under the density curve starting at x = b.

4.1 The Pareto distribution
Pareto distributions are used to model the distribution of full-time salaries, especially when there is a minimum wage. Pareto distributions are right skewed. A Pareto distribution is defined by two parameters: $x_m$, the minimum value the random variable can take (i.e. the minimum wage), and $\alpha$, where $x_m > 0$ and $\alpha > 0$. The parameter $\alpha$ describes the degree of concentration of the distribution. The smaller $\alpha$ (i.e. the closer $\alpha$ is to 0), the heavier the tail of the distribution (i.e. the larger the proportion of individuals earning very high salaries).

If X has a Pareto distribution, then we write $X \sim \text{Pareto}(x_m, \alpha)$. The density function is given by $f(x) = \dfrac{\alpha x_m^{\alpha}}{x^{\alpha+1}}$ when $x > x_m$; otherwise $f(x) = 0$. For $\alpha > 1$, the expected value of a Pareto random variable X is given by $E(X) = \dfrac{\alpha x_m}{\alpha - 1}$.

[Figure: Probability density function of a Pareto distribution]

In order to calculate probabilities for the Pareto distribution, we use
$$P(X > x) = \begin{cases} 1, & x < x_m \\ (x_m/x)^{\alpha}, & x \ge x_m \end{cases}$$
together with the interval rule, which states that if $a < b$ then $P(a < X < b) = P(X > a) - P(X > b)$, and the rule of complementarity, which states that $P(X < x) = 1 - P(X > x)$. The last two rules hold for any continuous distribution.

Example 4.1
Suppose that monthly salaries, denoted X, have a Pareto(1000, 2) distribution. Calculate
1. The expected (mean) salary.
2. The probability that an individual earns above the mean salary.
3. The probability that an individual earns below 1500.
4. The probability that an individual earns between 3000 and 6000.
5. The median salary (50% of individuals earn above the median salary).

Solution to Example 4.1
1. The mean salary is given by $E(X) = \dfrac{\alpha x_m}{\alpha - 1} = \dfrac{2 \times 1000}{2 - 1} = 2000$.
2. We have $P(X > 2000) = \left(\dfrac{x_m}{2000}\right)^{\alpha} = \left(\dfrac{1000}{2000}\right)^{2} = 0.25$. Hence, only 25% of individuals earn more than the mean salary.

3. We need to calculate $P(X < 1500)$. Using the rule of complementarity, $P(X < 1500) = 1 - P(X > 1500) = 1 - \left(\dfrac{x_m}{1500}\right)^{\alpha} = 1 - (2/3)^2 = 5/9$.

4. We need to calculate $P(3000 < X < 6000)$. Using the interval rule, $P(3000 < X < 6000) = P(X > 3000) - P(X > 6000) = (1000/3000)^2 - (1000/6000)^2 = 1/9 - 1/36 = 1/12$.

5. We need to find the value k for which $P(X > k) = 0.5$, i.e. we need to solve $(1000/k)^2 = 0.5$. Taking square roots: $1000/k = \sqrt{0.5} \approx 0.7071$. It follows that $k \approx 1000/0.7071 \approx 1414$. Hence, the median wage is approximately 1414 (just over 70% of the mean wage).
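The calculations in Example 4.1 can be checked numerically. The following is a minimal Python sketch; the function names are illustrative choices and do not come from the slides.

```python
def pareto_survival(x, x_m, alpha):
    """P(X > x) for X ~ Pareto(x_m, alpha)."""
    if x < x_m:
        return 1.0
    return (x_m / x) ** alpha


def pareto_mean(x_m, alpha):
    """E(X) = alpha * x_m / (alpha - 1), which is finite only for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean is finite only when alpha > 1")
    return alpha * x_m / (alpha - 1)


def pareto_upper_quantile(p, x_m, alpha):
    """Value k with P(X > k) = p, found by solving (x_m / k)**alpha = p."""
    return x_m / p ** (1 / alpha)


x_m, alpha = 1000, 2
mean = pareto_mean(x_m, alpha)
p_above_mean = pareto_survival(mean, x_m, alpha)
# Rule of complementarity
p_below_1500 = 1 - pareto_survival(1500, x_m, alpha)
# Interval rule
p_between = pareto_survival(3000, x_m, alpha) - pareto_survival(6000, x_m, alpha)
# Median: the value exceeded with probability 0.5
median = pareto_upper_quantile(0.5, x_m, alpha)

print(mean, p_above_mean, p_below_1500, p_between, median)
```

The printed values should agree with the hand calculations: 2000, 0.25, 5/9, 1/12 and roughly 1414.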

4.2 The normal distribution
The normal distribution has a symmetric bell-shaped density curve. This curve is defined by two quantities:
a) The theoretical (population) mean $\mu$ (this gives the centre of the distribution). The distribution is symmetric about $x = \mu$.
b) The variance $\sigma^2$ (this describes the dispersion of the distribution).
We write $X \sim N(\mu, \sigma^2)$. The (theoretical) variance of the random variable is $\sigma^2$. Note that some textbooks give the standard deviation $\sigma$ as a parameter of the distribution, rather than the variance.

[Figure: Probability density function of a normal distribution]

4.3 The standard normal distribution
A random variable with a standard normal distribution will be denoted by Z. Such a random variable has mean 0 and variance 1, i.e. the standard deviation is also 1. Values for $P(Z > k)$ are given in Table 3 of the Murdoch and Barnes book of mathematical formulae (which will be available during the exam), for various $k \ge 0$. If k is greater than 4, then we may assume $P(Z > k) = 0$.

[Figure: Probabilities read from the table for the standard normal distribution. The graph illustrates the probabilities given in the table.]

Calculating probabilities using the table for the standard normal distribution
Using the symmetry rule (when we change the sign of k, we change the direction of the inequality):
1. $P(Z < -k) = P(Z > k)$

In order to calculate $P(Z < k)$, we use the rule of complementarity:
2. $P(Z < k) = 1 - P(Z > k)$

Also, to calculate probabilities of the form $P(a < Z < b)$, we use the interval rule:
3. $P(a < Z < b) = P(Z > a) - P(Z > b)$

Using these three rules and the table for the standard normal distribution, we can calculate the probability that Z lies in any interval. The probability $P(Z > k)$ is found in the row corresponding to the digits in k directly before and after the decimal point and the column corresponding to the second digit after the decimal point.

Example 4.2
Calculate
i) $P(Z > 2.13)$
ii) $P(Z < -1.05)$
iii) $P(Z < 0.87)$
iv) $P(Z > -2.06)$
v) $P(0.23 < Z < 1.49)$

Solution to Example 4.2
i) $P(Z > 2.13)$. We can read this directly from the table. This probability is given in the row corresponding to 2.1 and the column corresponding to 0.03. Thus, $P(Z > 2.13) = 0.0166$.

ii) $P(Z < -1.05)$. When we have a negative constant, we use the symmetry rule, i.e. $P(Z < -k) = P(Z > k)$:
$P(Z < -1.05) = P(Z > 1.05) = 0.1469$.
Note: Whenever the number on the right-hand side is negative, we will have to use the symmetry rule, but it might not lead immediately to the appropriate form (see part iv).

iii) $P(Z < 0.87)$. In this case (a positive constant, but the inequality is the wrong way round for us to read the probability directly), we use the rule of complementarity:
$P(Z < 0.87) = 1 - P(Z > 0.87) = 1 - 0.1922 = 0.8078$.

iv) $P(Z > -2.06)$. First we use the rule of symmetry: $P(Z > -2.06) = P(Z < 2.06)$. Using the rule of complementarity, $P(Z < 2.06) = 1 - P(Z > 2.06) = 1 - 0.0197 = 0.9803$.

v) $P(0.23 < Z < 1.49)$. In this case we first use the interval rule $P(a < Z < b) = P(Z > a) - P(Z > b)$. Thus, $P(0.23 < Z < 1.49) = P(Z > 0.23) - P(Z > 1.49) = 0.4090 - 0.0681 = 0.3409$.
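As a cross-check on the table look-ups, the same probabilities can be computed in software. This is a minimal sketch assuming Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `upper_tail` is an illustrative choice.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def upper_tail(k):
    """P(Z > k), the quantity tabulated for k >= 0."""
    return 1 - Z.cdf(k)


answers = {
    "i": upper_tail(2.13),                       # read directly
    "ii": Z.cdf(-1.05),                          # equals upper_tail(1.05) by symmetry
    "iii": Z.cdf(0.87),                          # complementarity
    "iv": upper_tail(-2.06),                     # symmetry, then complementarity
    "v": upper_tail(0.23) - upper_tail(1.49),    # interval rule
}

for label, value in answers.items():
    print(f"{label}) {value:.4f}")
```

The results should match the table-based answers up to the four-decimal rounding of the table.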

Calculating values from probabilities using the table for the standard normal distribution
Sometimes we may wish to find the value of k for which $P(Z > k) = p$ or $P(Z < k) = p$ for some given p. Using the symmetry and complementarity rules, we can transform such a problem into the problem of finding c such that $P(Z > c) = q$, where $q < 0.5$. The value of c can be found by finding the value closest to q in the heart of the table. The value of c corresponds to the row (digits around the decimal point) and the column (second decimal place).

Example 4.3
Find k such that
i) $P(Z > k) = 0.4$
ii) $P(Z < k) = 0.8$
iii) $P(Z > k) = 0.7$
iv) $P(Z < k) = 0.1$

Solution to Example 4.3
i) $P(Z > k) = 0.4$. This is in the appropriate form to read directly from the table. We find the value closest to 0.4 in the heart of the table. This value is 0.4013 and is in the row corresponding to 0.2 and the column corresponding to 0.05. Hence, $P(Z > 0.25) \approx 0.4$. Thus, $k \approx 0.25$.

ii) $P(Z < k) = 0.8$. In this case we cannot read k directly. In order to obtain a value of less than 0.5 on the right-hand side, we use the rule of complementarity: $P(Z > k) = 1 - P(Z < k) = 0.2$. The value in the heart of the table closest to 0.2 is 0.2005, which is in the row corresponding to 0.8 and the column corresponding to 0.04. It follows that $P(Z > 0.84) \approx 0.2$. Hence, $k \approx 0.84$.

iii) $P(Z > k) = 0.7$. Again we cannot read k directly. Since this probability is greater than 0.5, it is clear that $k < 0$.
[Figure: sketch of the standard normal density showing k < 0]

Using complementarity to obtain a number on the right-hand side less than 0.5: $P(Z < k) = 1 - P(Z > k) = 0.3$. Once we have an appropriate number on the right-hand side, we can use the law of symmetry to obtain the appropriate type of inequality: $P(Z < k) = P(Z > -k) = 0.3$. The equation $P(Z > -k) = 0.3$ is in the appropriate form, so we can now read $-k$ from the table. As before, $-k \approx 0.52$. Thus, $k \approx -0.52$.

iv) $P(Z < k) = 0.1$. Again $k < 0$. Using symmetry, $P(Z < k) = P(Z > -k) = 0.1$. From the tables, $P(Z > 1.28) \approx 0.1$. It follows that $-k \approx 1.28$, i.e. $k \approx -1.28$.
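The reverse look-up in Example 4.3 corresponds to evaluating the inverse of the cumulative distribution function. A minimal sketch, again assuming Python 3.8+ and `statistics.NormalDist`; the two helper names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()


def k_for_upper_tail(p):
    """Return k such that P(Z > k) = p."""
    return Z.inv_cdf(1 - p)


def k_for_lower_tail(p):
    """Return k such that P(Z < k) = p."""
    return Z.inv_cdf(p)


k_i = k_for_upper_tail(0.4)     # table answer: about 0.25
k_ii = k_for_lower_tail(0.8)    # table answer: about 0.84
k_iii = k_for_upper_tail(0.7)   # table answer: about -0.52
k_iv = k_for_lower_tail(0.1)    # table answer: about -1.28

print(round(k_i, 2), round(k_ii, 2), round(k_iii, 2), round(k_iv, 2))
```

Because the table only lists k to two decimal places, the software values differ from the table answers in the third decimal place.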

4.4 Standardisation of a normal random variable
By adding or subtracting a constant from a normally distributed random variable, we add or subtract (as appropriate) that constant from the mean of the random variable. The shape of the density curve remains the same (i.e. the dispersion is unchanged). Hence if $X \sim N(\mu, \sigma^2)$, then $Y = X - \mu$ has a normal distribution with mean 0 (and the same variance $\sigma^2$).

Dividing a random variable by a constant $c > 0$ divides its standard deviation by the same factor. The density curve remains bell shaped. It follows that if $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma}$ has a normal distribution with mean 0 and variance (and standard deviation) 1 (i.e. Z is a standard normal random variable).

Hence, in order to calculate the probability that any normal random variable takes a value in a given interval, we:
i. First standardise.
ii. Use the three rules given and the tables to calculate the appropriate probability.

Example 4.4
The height of an Irish adult has a normal distribution with mean 170 cm and variance 225 cm². Calculate the probability that the height of an Irish adult is
a) more than 191 cm.
b) less than 164 cm.
c) between 158 and 179 cm.

Solution to Example 4.4
We have $X \sim N(170, 225)$. Hence, $Z = \dfrac{X - \mu}{\sigma} = \dfrac{X - 170}{15}$.
a) We wish to calculate $P(X > 191)$. First we standardise both sides of the inequality by subtracting 170 (the mean $\mu$) and dividing by 15 (the standard deviation $\sigma$):
$P(X > 191) = P\left(\dfrac{X - \mu}{\sigma} > \dfrac{191 - 170}{15}\right)$.

On the left-hand side of this inequality we now have a standard normal random variable: $P\left(Z > \dfrac{21}{15}\right) = P(Z > 1.4) = 0.0808$.

b) $P(X < 164)$. First we standardise as before: $P(X < 164) = P\left(\dfrac{X - \mu}{\sigma} < \dfrac{164 - 170}{15}\right) = P(Z < -0.4)$. Using the rule of symmetry, $P(Z < -0.4) = P(Z > 0.4) = 0.3446$.

c) $P(158 < X < 179)$. First, we standardise on all three sides: $P(158 < X < 179) = P\left(\dfrac{158 - 170}{15} < Z < \dfrac{179 - 170}{15}\right) = P(-0.8 < Z < 0.6)$. Using the interval rule, $P(-0.8 < Z < 0.6) = P(Z > -0.8) - P(Z > 0.6)$.

Using the law of symmetry and then the law of complementarity for the first probability, $P(Z > -0.8) - P(Z > 0.6) = P(Z < 0.8) - P(Z > 0.6) = 1 - P(Z > 0.8) - P(Z > 0.6) = 1 - 0.2119 - 0.2743 = 0.5138$.
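The standardise-then-look-up procedure of Example 4.4 can be mirrored in code. This is a minimal sketch assuming Python 3.8+; `standardise` is an illustrative helper name, not something defined in the slides.

```python
from math import sqrt
from statistics import NormalDist

mu = 170
sigma = sqrt(225)  # the slides give the variance, so take the square root
Z = NormalDist()   # standard normal


def standardise(x):
    """Convert a height x into the corresponding standard normal value."""
    return (x - mu) / sigma


p_a = 1 - Z.cdf(standardise(191))                        # P(X > 191) = P(Z > 1.4)
p_b = Z.cdf(standardise(164))                            # P(X < 164) = P(Z < -0.4)
p_c = Z.cdf(standardise(179)) - Z.cdf(standardise(158))  # P(-0.8 < Z < 0.6)

# The same probability without standardising by hand
p_a_direct = 1 - NormalDist(mu, sigma).cdf(191)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Note that `NormalDist` takes the standard deviation, not the variance, as its second parameter, which is the convention the slides warn some textbooks use.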

The normal distribution and probabilities of extreme values
Note: If a variable has a normal distribution, then
1. The probability of an observation being within 1 standard deviation of the mean is approx. 0.68 (about 2/3). The probability of being in one of the two tails is thus approx. 0.32.
2. The probability of an observation being within 2 (to be exact 1.96) standard deviations of the mean is approx. 0.95. The probability of being in one of the two tails is thus approx. 0.05.
3. The probability of being within 2.576 standard deviations of the mean is approx. 0.99.

[Figure: The normal distribution and probabilities of extreme values]

Example 4.5
The height of humans is normally distributed with mean 170 cm and standard deviation 10 cm. Approx. 2/3 of people are between 160 cm and 180 cm (i.e. 170 ± 10 cm). Approx. 95% of people are between 150 cm and 190 cm (i.e. 170 ± 20 cm). Note: These approximations are only valid when the distribution is normal.

4.5 Importance of the normal distribution, the central limit theorem
The central limit theorem states that if a variable X is the sum of a large number of independent random variables (of comparable means and standard deviations), then X is approximately normally distributed. Note: "large" is usually interpreted as at least 30. If the random variables in the sum have a symmetric distribution, then this approximation will be very good.
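A small simulation can illustrate the theorem. The sketch below is an illustration under assumed settings (sums of 30 independent Uniform(0, 1) variables, 20,000 repetitions, a fixed seed); none of these choices come from the slides. If the sums are approximately normal, about 95% of them should fall within 1.96 standard deviations of their mean.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so that the run is repeatable

n_terms = 30      # number of independent variables in each sum
n_sums = 20_000   # number of simulated sums

# A Uniform(0, 1) variable has mean 1/2 and variance 1/12,
# so a sum of n_terms of them has mean n_terms/2 and variance n_terms/12.
mu = n_terms * 0.5
sigma = sqrt(n_terms / 12)

sums = [sum(random.random() for _ in range(n_terms)) for _ in range(n_sums)]

# Share of simulated sums lying within 1.96 standard deviations of the mean
within_1_96 = sum(abs(s - mu) < 1.96 * sigma for s in sums) / n_sums
print(f"share within 1.96 sd of the mean: {within_1_96:.3f}")
```

The uniform distribution is symmetric, so this is the favourable case mentioned above; a skewed summand would typically need more terms for a similarly good fit.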

4.5.1 Practical consequences of the central limit theorem
1. Many variables have a normal distribution. For example, height is the sum of many factors: genetic, environmental and dietary (no individual factor is very important), and fits the normal distribution well.
2. The mean of a sample is simply the sum of the observations divided by a constant (the number of observations). Hence, if there is a large number of observations of a single variable, the distribution of the sample mean always fits the normal distribution well (i.e. the sample mean will be normally distributed around the population mean).

If there is a small number of observations, then the sample mean will only fit the normal distribution well if the observations are from a normal distribution (e.g. height, intelligence quotient). This fact is very important in statistical testing, since it is usually assumed that the mean of a sample is normally distributed. This assumption may not be appropriate when we are dealing with small samples.

3. Suppose X has a binomial distribution with parameters n and p, where n is large (and preferably p is not close to either 0 or 1). For example: the number of heads when I throw a coin n times has a Bin(n, 1/2) distribution. If a proportion p of voters support party Y and n people are asked whom they support, then (assuming they do not lie) the number of supporters of party Y has a Bin(n, p) distribution.

In the first example, the total number of heads can be expressed as the sum of the number of heads from each individual throw (the number of heads from one throw is 1 with probability p, otherwise it is zero). It follows that for large n, the number of heads will be approximately normally distributed. Using a similar argument, for large n the number of supporters of party Y will be approximately normally distributed. This approximation works well if the sample size n is at least 30 and p is between 0.1 and 0.9.

4. For a large sample, the proportion of observations in a given class is approximately normally distributed. This results from the fact that this proportion is simply the total number of such observations (from 3, this is approximately normally distributed) divided by a constant (which does not change the form of the distribution).

4.5.2 The normal approximation to the binomial distribution
Suppose n is large and $X \sim \text{Bin}(n, p)$. Then approximately $X \sim N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$. This approximation is used when $n \ge 30$ and $0.1 \le p \le 0.9$.

The continuity correction for the normal approximation to the binomial distribution
It should be noted that X has a discrete distribution, but we are using a continuous distribution in the approximation. For example, suppose we wanted to estimate the probability of obtaining exactly k heads when we throw a coin n times. This probability will in general be positive. However, if we use the normal approximation without an appropriate correction, we cannot sensibly estimate $P(X = k)$ [for continuous distributions $P(X = k) = 0$].

Suppose the random variable X takes only integer values and has an approximately normal distribution. In order to estimate $P(X = k)$, we use the continuity correction. This uses the fact that when k is an integer, $P(X = k) = P(k - 0.5 < X < k + 0.5)$.

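Example 4.6
Suppose a coin is tossed 36 times. Using the central limit theorem, estimate the probability that exactly 20 heads are thrown.

Solution to Example 4.6
Let X be the number of heads. We have $X \sim \text{Bin}(36, 0.5)$. Hence, $E(X) = np = 36 \times 0.5 = 18$ and $\text{Var}(X) = np(1 - p) = 36 \times 0.5 \times 0.5 = 9$. It follows that approximately $X \sim N(18, 9)$. We wish to estimate $P(X = 20)$. Using the continuity correction,
$P(X = 20) = P(19.5 < X < 20.5) = P\left(\dfrac{19.5 - 18}{\sqrt{9}} < \dfrac{X - \mu}{\sigma} < \dfrac{20.5 - 18}{\sqrt{9}}\right) \approx P(0.5 < Z < 0.83) = P(Z > 0.5) - P(Z > 0.83) = 0.3085 - 0.2033 = 0.1052$.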
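To see how good the approximation is in Example 4.6, the continuity-corrected estimate can be compared with the exact binomial probability. A minimal sketch assuming Python 3.8+ (for `math.comb` and `statistics.NormalDist`); variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 36, 0.5, 20

mu = n * p                      # 18
sigma = sqrt(n * p * (1 - p))   # sqrt(9) = 3
approx = NormalDist(mu, sigma)

# Continuity correction: P(X = k) is estimated by P(k - 0.5 < X < k + 0.5)
p_normal = approx.cdf(k + 0.5) - approx.cdf(k - 0.5)

# Exact binomial probability for comparison
p_exact = comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```

The software value of the approximation (about 0.106) is slightly larger than the table-based 0.1052 because the code does not round the upper limit 0.8333 down to 0.83. Both are close to the exact binomial probability.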

This continuity correction can be adapted to problems in which we have to estimate the probability that the number of successes is in a given interval, e.g.
$P(15 \le X < 21) = P(X = 15) + P(X = 16) + \ldots + P(X = 20) = P(14.5 < X < 15.5) + \ldots + P(19.5 < X < 20.5) = P(14.5 < X < 20.5)$.
In general, if the boundary of an interval is given by a non-strict inequality, we widen that end of the interval by 0.5. If the boundary of an interval is given by a strict inequality, we narrow that end of the interval by 0.5.

Example 4.7
A die is thrown 180 times. Estimate the probability that
i) at least 35 sixes are thrown
ii) between 27 and 33 sixes are thrown (inclusively).

Solution to Example 4.7
Let X be the number of sixes. We have $X \sim \text{Bin}(180, 1/6)$, so $E(X) = np = 180 \times \tfrac{1}{6} = 30$ and $\text{Var}(X) = np(1 - p) = 180 \times \tfrac{1}{6} \times \tfrac{5}{6} = 25$.

i) Using the continuity correction, $P(X \ge 35) = P(X = 35) + P(X = 36) + \ldots = P(34.5 < X < 35.5) + P(35.5 < X < 36.5) + \ldots = P(X > 34.5)$. Standardising, $P(X > 34.5) = P\left(\dfrac{X - \mu}{\sigma} > \dfrac{34.5 - 30}{\sqrt{25}}\right) \approx P(Z > 0.9) = 0.1841$. Note: Strictly speaking, we should calculate $P(35 \le X \le 180) = P(X \ge 35) - P(X > 180)$. However, when the normal approximation is appropriate, the second probability (that the number of successes is greater than the number of experiments) is estimated to be close to 0.

ii) Using the continuity correction, $P(27 \le X \le 33) = P(X = 27) + P(X = 28) + \ldots + P(X = 33) = P(26.5 < X < 27.5) + \ldots + P(32.5 < X < 33.5) = P(26.5 < X < 33.5)$. Standardising, $P(26.5 < X < 33.5) = P\left(\dfrac{26.5 - 30}{\sqrt{25}} < \dfrac{X - \mu}{\sigma} < \dfrac{33.5 - 30}{\sqrt{25}}\right) \approx P(-0.7 < Z < 0.7) = P(Z > -0.7) - P(Z > 0.7) = P(Z < 0.7) - P(Z > 0.7) = 1 - 2P(Z > 0.7) = 1 - 2 \times 0.2420 = 0.5160$.
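The widening rule for non-strict inequalities can be wrapped in a small helper. This is a minimal sketch assuming Python 3.8+; the function `binom_interval_normal` and its `lo`/`hi` parameters are hypothetical names introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist


def binom_interval_normal(n, p, lo=None, hi=None):
    """Normal approximation to P(lo <= X <= hi) for X ~ Bin(n, p).

    Both ends are treated as inclusive, so each is widened by 0.5
    (the continuity correction). Passing None leaves that side open.
    """
    dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
    lower = 0.0 if lo is None else dist.cdf(lo - 0.5)
    upper = 1.0 if hi is None else dist.cdf(hi + 0.5)
    return upper - lower


# Example 4.7: 180 throws of a die, X = number of sixes
p_at_least_35 = binom_interval_normal(180, 1 / 6, lo=35)        # P(X > 34.5)
p_27_to_33 = binom_interval_normal(180, 1 / 6, lo=27, hi=33)    # P(26.5 < X < 33.5)

print(round(p_at_least_35, 4), round(p_27_to_33, 4))
```

A strict inequality such as $X < 21$ can be handled by rewriting it as $X \le 20$ before calling the helper.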

The normal approximation to the binomial
It should be noted that the normal approximation to the binomial is most accurate when n is large and p is close to 0.5. This is due to the fact that $X = X_1 + X_2 + \ldots + X_n$, where each $X_i$ has a 0-1 distribution with parameter p (it equals 1 with probability p and 0 otherwise). The distribution of $X_i$ is symmetric when $p = 0.5$.

4.6 Confidence intervals for population means
Suppose we take a large number of samples of size n. The sample means will be distributed around the population mean. If the size of the samples is increased, you would expect that the average error obtained by estimating the population mean using the sample mean will decrease (the distribution of the sample mean will be more concentrated around the population mean). If the population standard deviation of the observations is $\sigma$, then the standard deviation of the sample mean from a sample of n observations is $\sigma / \sqrt{n}$ (otherwise known as the standard error, $\text{S.E.}(\bar{x})$).

4.6.1 Confidence intervals for the mean with large samples
All the calculations in Section 4.6 assume that the sample mean has a normal distribution. This is always reasonable when there is a large number of observations. When there is a small number of observations, this is only reasonable if the observations themselves have a normal distribution. When the sample size is large (n > 30), we may assume that the sample standard deviation s is a good estimate of the population standard deviation $\sigma$. Hence, we can use $s / \sqrt{n}$ as an approximation of the standard error. The sample mean $\bar{X}$ is the best estimator of the population mean $\mu$ (this is a point estimate).

Confidence intervals for the population mean
A point estimate does not indicate the expected error of that estimate, so an interval estimate should be used (e.g. the average population height is 175 ± 4 cm). The default confidence level is 95%, i.e. if we calculate one hundred 95% confidence intervals for the mean based on 100 samples, then on average 95 of them will contain the real population mean. The population mean is not guaranteed to lie within a confidence interval.

Confidence intervals for the population mean with large samples
Since 95% of the observations of the sample mean will lie within 1.96 standard errors of the population mean, the following is a 95% confidence interval for the population mean (an interval estimate).
95% confidence interval for the population mean (large sample): $\bar{x} \pm \dfrac{1.96\, s}{\sqrt{n}} = \bar{x} \pm 1.96\, \text{S.E.}(\bar{x})$

Similarly, since 99% of the observations of the sample mean will lie within 2.576 standard errors of the population mean, the following is a 99% confidence interval for the population mean.
99% confidence interval for the population mean (large sample): $\bar{x} \pm \dfrac{2.576\, s}{\sqrt{n}} = \bar{x} \pm 2.576\, \text{S.E.}(\bar{x})$
A general equation for calculating confidence intervals at a given confidence level will be given in the following subsection.

Example 4.8
Suppose the mean weekly wage of 100 randomly chosen Irish workers is 420 Euros and the sample standard deviation is 300. Calculate a 95% confidence interval for the mean weekly wage of all Irish workers.

Solution to Example 4.8
The standard error of the sample mean is approximately $\text{S.E.}(\bar{x}) = \dfrac{s}{\sqrt{n}} = \dfrac{300}{\sqrt{100}} = 30$. The 95% confidence interval for the mean weekly wage of all Irish workers is $\bar{x} \pm 1.96\, \text{S.E.}(\bar{x}) = 420 \pm 1.96 \times 30 = 420 \pm 58.8 = [361.2, 478.8]$. The narrower a confidence interval, the more accurately we are estimating a population mean.
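The large-sample interval can be computed for any confidence level by taking the critical value from the standard normal distribution. This is a minimal sketch assuming Python 3.8+; `mean_ci_large_sample` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_large_sample(xbar, s, n, confidence=0.95):
    """Large-sample confidence interval for a population mean.

    Uses the sample standard deviation s in place of sigma,
    which the slides treat as reasonable when n > 30.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%, 2.576 for 99%
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Example 4.8: n = 100 workers, mean 420, sample standard deviation 300
low, high = mean_ci_large_sample(420, 300, 100)
print(f"[{low:.1f}, {high:.1f}]")
```

The result agrees with the hand calculation to one decimal place; tiny differences arise because the code uses an unrounded critical value rather than 1.96.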

4.6.2 Confidence intervals for the population mean with small samples
When the sample size is small, we cannot assume that the sample standard deviation is a good estimate of the population standard deviation. In this case, a confidence interval must reflect the increased degree of uncertainty resulting from not knowing the population standard deviation (i.e. the confidence interval must be wider).

The Student distribution
Suppose the observations $X_1, \ldots, X_n$ are normally distributed. Then $Z = \dfrac{\sqrt{n}(\bar{X} - \mu)}{\sigma}$ has a normal distribution with mean 0 and standard deviation 1 (where $\mu$ and $\sigma$ are the population mean and population standard deviation, respectively). Let $T_{n-1} = \dfrac{\sqrt{n}(\bar{X} - \mu)}{s}$, where s is the sample standard deviation. $T_{n-1}$ has a Student distribution with $n - 1$ degrees of freedom.

Relation between the Student distribution and the normal distribution
Since s is an estimate of $\sigma$, the distribution of $T_{n-1}$ will be similar to the distribution of Z. However, this estimation introduces a larger degree of uncertainty, and the dispersion of the $T_{n-1}$ distribution will be larger than the dispersion of the Z distribution. As n increases, s becomes a very good estimate of $\sigma$. Thus, for large n, the distribution of $T_{n-1}$ will converge to the standard normal distribution.

Critical values for the Student distribution
The p-critical value of the Student distribution with $n - 1$ degrees of freedom is denoted $t_{n-1,p}$. It satisfies $P(T_{n-1} > t_{n-1,p}) = p$. By symmetry, a proportion $1 - 2p$ of sample means will be within $t_{n-1,p}$ standard errors of the mean. For n < 30 these critical values can be read from Table 7. For n > 30 we use the fact that for large n the Student distribution converges to the standard normal distribution. The appropriate critical values for the normal distribution can also be read from Table 7. They are given as $t_{\infty,p}$.

[Figure: The critical value $t_{n-1,0.005}$ ($\approx 2.6$ in this case)]

Confidence intervals for the population mean with small samples
For a small sample, the following is a $100(1 - \alpha)\%$ confidence interval for the population mean.
$100(1 - \alpha)\%$ confidence interval for the population mean (small sample): $\bar{x} \pm t_{n-1,\alpha/2} \dfrac{s}{\sqrt{n}} = \bar{x} \pm t_{n-1,\alpha/2}\, \text{S.E.}(\bar{x})$

For a 95% confidence interval, $100(1 - \alpha) = 95$. Hence, $\alpha = 0.05$.
95% confidence interval for the population mean (small sample): $\bar{x} \pm t_{n-1,0.025} \dfrac{s}{\sqrt{n}} = \bar{x} \pm t_{n-1,0.025}\, \text{S.E.}(\bar{x})$

For a 99% confidence interval, $100(1 - \alpha) = 99$. Hence, $\alpha = 0.01$.
99% confidence interval for the population mean (small sample): $\bar{x} \pm t_{n-1,0.005} \dfrac{s}{\sqrt{n}} = \bar{x} \pm t_{n-1,0.005}\, \text{S.E.}(\bar{x})$

Example 4.9
9 students were weighed. Their average weight was 68 kg and the standard deviation 12 kg. Calculate a 99% confidence interval for the mean weight of all students.

Solution to Example 4.9
This confidence interval is given by $\bar{x} \pm t_{n-1,0.005} \dfrac{s}{\sqrt{n}} = 68 \pm t_{8,0.005} \times \dfrac{12}{3} = 68 \pm 3.355 \times 4 = 68 \pm 13.42 = [54.58, 81.42]$. Note: Since the sample size is small and weight has a skewed distribution (i.e. is not normally distributed), this is only approximately a 99% confidence interval.
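The arithmetic of Example 4.9 can be sketched as follows. The Python standard library has no Student distribution, so the critical value is entered by hand from the table (an assumption of this sketch); `mean_ci_small_sample` is an illustrative name.

```python
from math import sqrt

# Critical value t_{8, 0.005} as read from a Student t table (entered by hand)
T_8_0005 = 3.355


def mean_ci_small_sample(xbar, s, n, t_crit):
    """Small-sample confidence interval: xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Example 4.9: 9 students, mean weight 68 kg, standard deviation 12 kg
low, high = mean_ci_small_sample(68, 12, 9, T_8_0005)
print(f"[{low:.2f}, {high:.2f}]")
```

For comparison, the large-sample critical value 2.576 would give a half-width of only 10.30, which shows how the Student distribution widens the interval for small n.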

4.7 Confidence intervals for proportions
We want to estimate the proportion p of people in the population who have trait A. It will be assumed that the sample size n is large (n > 30). In this case the distribution of the sample proportion will be approximately normal (unless p is very close to 0 or 1). Suppose x people in a sample of n have trait A. Then $\hat{p} = x/n$ (the sample proportion) is an estimator of p (the population proportion).

Standard error of the sample proportion
The standard error of this estimator is $\text{S.E.}(\hat{p}) = \sqrt{\dfrac{p(1 - p)}{n}}$, which can be approximated using $\text{S.E.}(\hat{p}) \approx \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$. It should be noted that the maximum standard error is attained when $p = 0.5$. It follows that $\text{S.E.}(\hat{p}) \le \text{S.E.}_{\max}(\hat{p}) = \dfrac{1}{2\sqrt{n}}$.

Formula for a confidence interval for the population proportion
For large n and p not close to 0 or 1, $\hat{p}$ will be approximately normally distributed. It follows that a $100(1 - \alpha)\%$ confidence interval for p is given by $\hat{p} \pm t_{\infty,\alpha/2}\, \text{S.E.}(\hat{p})$.

Particular cases of confidence intervals for the population proportion
A 95% confidence interval for p is given by $\hat{p} \pm t_{\infty,0.025}\, \text{S.E.}(\hat{p})$. A 99% confidence interval for p is given by $\hat{p} \pm t_{\infty,0.005}\, \text{S.E.}(\hat{p})$.

Estimating the population proportion to a given accuracy
The upper bound on the standard error of the sample proportion is useful when determining the sample size necessary to estimate the population proportion to a given accuracy. To calculate the sample size required to estimate a population proportion to within $\delta$ with a confidence of $100(1 - \alpha)\%$, we require that the error term $t_{\infty,\alpha/2}\, \text{S.E.}(\hat{p})$ is no larger than the required accuracy $\delta$, i.e. $t_{\infty,\alpha/2}\, \text{S.E.}(\hat{p}) \le \delta$. This will always be satisfied if $t_{\infty,\alpha/2}\, \text{S.E.}_{\max}(\hat{p}) \le \delta$, i.e. $\dfrac{t_{\infty,\alpha/2}}{2\sqrt{n}} \le \delta$. To find the appropriate sample size, we have to solve this inequality.

Example 4.10
In a survey of 300 people, 75 stated that they would vote for the Labour party.
i) Calculate a 95% confidence interval for the proportion of people wishing to vote for the Labour party.
ii) What sample size is needed in order to estimate this proportion to within 3% with a confidence level of 99%?

Solution to Example 4.10
i) The formula for this confidence interval is $\hat{p} \pm t_{\infty,0.025}\, \text{S.E.}(\hat{p})$. We have $\hat{p} = \dfrac{75}{300} = \dfrac{1}{4} = 0.25$. Also, $\text{S.E.}(\hat{p}) \approx \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\dfrac{(1/4)(3/4)}{300}} = \sqrt{\dfrac{1}{1600}} = 0.025$ and $t_{\infty,0.025} = 1.96$.

Hence, the 95% confidence interval is given by $0.25 \pm 1.96 \times 0.025 = 0.25 \pm 0.049 = [0.201, 0.299]$.

ii) For a 99% confidence level, we require $t_{\infty,\alpha/2}\, \text{S.E.}_{\max}(\hat{p}) \le \delta$, i.e. $\dfrac{2.576}{2\sqrt{n}} \le 0.03$. Hence $\sqrt{n} \ge \dfrac{2.576}{0.06} \approx 42.93$, so $n \ge 1843.3$. Since the sample size is an integer, we require at least 1844 observations.
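Both parts of Example 4.10 can be reproduced with a short script. This is a minimal sketch assuming Python 3.8+; the function names are illustrative, and the sample-size function uses the worst-case standard error $1/(2\sqrt{n})$ exactly as in the derivation above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def normal_critical_value(confidence):
    """t_{inf, alpha/2}: the two-sided standard normal critical value."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def proportion_ci(x, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    z = normal_critical_value(confidence)
    return p_hat - z * se, p_hat + z * se


def sample_size_for_proportion(delta, confidence=0.95):
    """Smallest n with z / (2 * sqrt(n)) <= delta (worst case p = 0.5)."""
    z = normal_critical_value(confidence)
    return ceil((z / (2 * delta)) ** 2)


low, high = proportion_ci(75, 300)                            # part i)
n_needed = sample_size_for_proportion(0.03, confidence=0.99)  # part ii)

print(f"[{low:.3f}, {high:.3f}]", n_needed)
```

Rounding up with `ceil` reflects the requirement that the sample size is an integer satisfying the inequality.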

4.8 Estimating a population mean to a given accuracy
In a similar way, we can estimate the sample size required to estimate a population mean to some required accuracy. However, the standard error depends on the population standard deviation, which in general is unknown. Unlike the standard error for a population proportion, we cannot give an upper bound on the standard error for a population mean (it simply increases as the population standard deviation increases). Normally a two-stage procedure is used. Firstly, a relatively small sample is used to estimate the standard deviation, and then the estimation of the required sample size is based on the standard deviation for this sample.

We assume that the required sample size is relatively large (i.e. n > 30). As previously, we require the error term in the corresponding confidence interval to be less than or equal to the permitted error $\delta$, i.e. $t_{\infty,\alpha/2}\, \text{S.E.}(\bar{X}) \le \delta$, which gives $t_{\infty,\alpha/2} \dfrac{s}{\sqrt{n}} \le \delta$.

Example 4.11
The monthly salaries of 30 randomly chosen Irish adults were observed and the sample standard deviation was 1000 Euro. Find the sample size required to estimate the mean monthly salary of Irish adults to within 200 Euro with a confidence level of 95%.

Solution to Example 4.11
We have $s = 1000$, $\alpha = 0.05$, $\delta = 200$. Hence, we need to solve the following inequality for n: $t_{\infty,0.025} \dfrac{s}{\sqrt{n}} \le 200$, i.e. $\dfrac{1.96 \times 1000}{\sqrt{n}} \le 200$. Hence, $\sqrt{n} \ge \dfrac{1960}{200} = 9.8$, so $n \ge 9.8^2 = 96.04$. It follows that we need at least 97 observations (i.e. at least another 67 observations in addition to the initial sample).
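The second stage of the two-stage procedure in Example 4.11 amounts to solving the inequality for n and rounding up. A minimal sketch assuming Python 3.8+; `sample_size_for_mean` and its `already_observed` parameter are hypothetical names used for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(s, delta, confidence=0.95, already_observed=0):
    """Solve z * s / sqrt(n) <= delta for the smallest integer n.

    s is the standard deviation estimated from the pilot sample.
    Returns the total sample size and the number of extra observations
    needed beyond those already collected.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_total = ceil((z * s / delta) ** 2)
    n_extra = max(0, n_total - already_observed)
    return n_total, n_extra


# Example 4.11: pilot sample of 30 with s = 1000, target accuracy 200 Euro
n_total, n_extra = sample_size_for_mean(1000, 200, already_observed=30)
print(n_total, n_extra)
```

The calculation assumes the resulting n is large enough (n > 30) for the normal critical value to be appropriate, as stated in Section 4.8.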


More information

5.1 Identifying the Target Parameter

5.1 Identifying the Target Parameter University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

More information

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives. The Normal Distribution C H 6A P T E R The Normal Distribution Outline 6 1 6 2 Applications of the Normal Distribution 6 3 The Central Limit Theorem 6 4 The Normal Approximation to the Binomial Distribution

More information

Week 3&4: Z tables and the Sampling Distribution of X

Week 3&4: Z tables and the Sampling Distribution of X Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal

More information

Descriptive Statistics

Descriptive Statistics Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

More information

The normal approximation to the binomial

The normal approximation to the binomial The normal approximation to the binomial The binomial probability function is not useful for calculating probabilities when the number of trials n is large, as it involves multiplying a potentially very

More information

Math 151. Rumbos Spring 2014 1. Solutions to Assignment #22

Math 151. Rumbos Spring 2014 1. Solutions to Assignment #22 Math 151. Rumbos Spring 2014 1 Solutions to Assignment #22 1. An experiment consists of rolling a die 81 times and computing the average of the numbers on the top face of the die. Estimate the probability

More information

Characteristics of Binomial Distributions

Characteristics of Binomial Distributions Lesson2 Characteristics of Binomial Distributions In the last lesson, you constructed several binomial distributions, observed their shapes, and estimated their means and standard deviations. In Investigation

More information

Probability Distributions

Probability Distributions Learning Objectives Probability Distributions Section 1: How Can We Summarize Possible Outcomes and Their Probabilities? 1. Random variable 2. Probability distributions for discrete random variables 3.

More information

You flip a fair coin four times, what is the probability that you obtain three heads.

You flip a fair coin four times, what is the probability that you obtain three heads. Handout 4: Binomial Distribution Reading Assignment: Chapter 5 In the previous handout, we looked at continuous random variables and calculating probabilities and percentiles for those type of variables.

More information

Notes on Continuous Random Variables

Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

More information

4. Introduction to Statistics

4. Introduction to Statistics Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

More information

AMS 5 CHANCE VARIABILITY

AMS 5 CHANCE VARIABILITY AMS 5 CHANCE VARIABILITY The Law of Averages When tossing a fair coin the chances of tails and heads are the same: 50% and 50%. So if the coin is tossed a large number of times, the number of heads and

More information

Chapter 7. Estimates and Sample Size

Chapter 7. Estimates and Sample Size Chapter 7. Estimates and Sample Size Chapter Problem: How do we interpret a poll about global warming? Pew Research Center Poll: From what you ve read and heard, is there a solid evidence that the average

More information

WHERE DOES THE 10% CONDITION COME FROM?

WHERE DOES THE 10% CONDITION COME FROM? 1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

More information

5. Continuous Random Variables

5. Continuous Random Variables 5. Continuous Random Variables Continuous random variables can take any value in an interval. They are used to model physical characteristics such as time, length, position, etc. Examples (i) Let X be

More information

, for x = 0, 1, 2, 3,... (4.1) (1 + 1/n) n = 2.71828... b x /x! = e b, x=0

, for x = 0, 1, 2, 3,... (4.1) (1 + 1/n) n = 2.71828... b x /x! = e b, x=0 Chapter 4 The Poisson Distribution 4.1 The Fish Distribution? The Poisson distribution is named after Simeon-Denis Poisson (1781 1840). In addition, poisson is French for fish. In this chapter we will

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test... Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................

More information

3.4 The Normal Distribution

3.4 The Normal Distribution 3.4 The Normal Distribution All of the probability distributions we have found so far have been for finite random variables. (We could use rectangles in a histogram.) A probability distribution for a continuous

More information

Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

More information

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

More information

Social Studies 201 Notes for November 19, 2003

Social Studies 201 Notes for November 19, 2003 1 Social Studies 201 Notes for November 19, 2003 Determining sample size for estimation of a population proportion Section 8.6.2, p. 541. As indicated in the notes for November 17, when sample size is

More information

Lesson 20. Probability and Cumulative Distribution Functions

Lesson 20. Probability and Cumulative Distribution Functions Lesson 20 Probability and Cumulative Distribution Functions Recall If p(x) is a density function for some characteristic of a population, then Recall If p(x) is a density function for some characteristic

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve

More information

An Introduction to Basic Statistics and Probability

An Introduction to Basic Statistics and Probability An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

The Normal Curve. The Normal Curve and The Sampling Distribution

The Normal Curve. The Normal Curve and The Sampling Distribution Discrete vs Continuous Data The Normal Curve and The Sampling Distribution We have seen examples of probability distributions for discrete variables X, such as the binomial distribution. We could use it

More information

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures. Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

More information

The basics of probability theory. Distribution of variables, some important distributions

The basics of probability theory. Distribution of variables, some important distributions The basics of probability theory. Distribution of variables, some important distributions 1 Random experiment The outcome is not determined uniquely by the considered conditions. For example, tossing a

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

The normal approximation to the binomial

The normal approximation to the binomial The normal approximation to the binomial In order for a continuous distribution (like the normal) to be used to approximate a discrete one (like the binomial), a continuity correction should be used. There

More information

Key Concept. Density Curve

Key Concept. Density Curve MAT 155 Statistical Analysis Dr. Claude Moore Cape Fear Community College Chapter 6 Normal Probability Distributions 6 1 Review and Preview 6 2 The Standard Normal Distribution 6 3 Applications of Normal

More information

Lecture Notes Module 1

Lecture Notes Module 1 Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

More information

MATHEMATICS FOR ENGINEERS STATISTICS TUTORIAL 4 PROBABILITY DISTRIBUTIONS

MATHEMATICS FOR ENGINEERS STATISTICS TUTORIAL 4 PROBABILITY DISTRIBUTIONS MATHEMATICS FOR ENGINEERS STATISTICS TUTORIAL 4 PROBABILITY DISTRIBUTIONS CONTENTS Sample Space Accumulative Probability Probability Distributions Binomial Distribution Normal Distribution Poisson Distribution

More information

2 ESTIMATION. Objectives. 2.0 Introduction

2 ESTIMATION. Objectives. 2.0 Introduction 2 ESTIMATION Chapter 2 Estimation Objectives After studying this chapter you should be able to calculate confidence intervals for the mean of a normal distribution with unknown variance; be able to calculate

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

The Normal Distribution. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University

The Normal Distribution. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University The Normal Distribution Alan T. Arnholt Department of Mathematical Sciences Appalachian State University arnholt@math.appstate.edu Spring 2006 R Notes 1 Copyright c 2006 Alan T. Arnholt 2 Continuous Random

More information

Statistics GCSE Higher Revision Sheet

Statistics GCSE Higher Revision Sheet Statistics GCSE Higher Revision Sheet This document attempts to sum up the contents of the Higher Tier Statistics GCSE. There is one exam, two hours long. A calculator is allowed. It is worth 75% of the

More information

The Binomial Probability Distribution

The Binomial Probability Distribution The Binomial Probability Distribution MATH 130, Elements of Statistics I J. Robert Buchanan Department of Mathematics Fall 2015 Objectives After this lesson we will be able to: determine whether a probability

More information

Statistics 104: Section 6!

Statistics 104: Section 6! Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC

More information

THE BINOMIAL DISTRIBUTION & PROBABILITY

THE BINOMIAL DISTRIBUTION & PROBABILITY REVISION SHEET STATISTICS 1 (MEI) THE BINOMIAL DISTRIBUTION & PROBABILITY The main ideas in this chapter are Probabilities based on selecting or arranging objects Probabilities based on the binomial distribution

More information

Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG820). December 15, 2012.

Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG820). December 15, 2012. Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG8). December 15, 12. 1. (3p) The joint distribution of the discrete random variables X and

More information

MCQ S OF MEASURES OF CENTRAL TENDENCY

MCQ S OF MEASURES OF CENTRAL TENDENCY MCQ S OF MEASURES OF CENTRAL TENDENCY MCQ No 3.1 Any measure indicating the centre of a set of data, arranged in an increasing or decreasing order of magnitude, is called a measure of: (a) Skewness (b)

More information

Probability Models for Continuous Random Variables

Probability Models for Continuous Random Variables Density Probability Models for Continuous Random Variables At right you see a histogram of female length of life. (Births and deaths are recorded to the nearest minute. The data are essentially continuous.)

More information

Practice Problems for Homework #6. Normal distribution and Central Limit Theorem.

Practice Problems for Homework #6. Normal distribution and Central Limit Theorem. Practice Problems for Homework #6. Normal distribution and Central Limit Theorem. 1. Read Section 3.4.6 about the Normal distribution and Section 4.7 about the Central Limit Theorem. 2. Solve the practice

More information

Random variables, probability distributions, binomial random variable

Random variables, probability distributions, binomial random variable Week 4 lecture notes. WEEK 4 page 1 Random variables, probability distributions, binomial random variable Eample 1 : Consider the eperiment of flipping a fair coin three times. The number of tails that

More information

NPTEL STRUCTURAL RELIABILITY

NPTEL STRUCTURAL RELIABILITY NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

MAS108 Probability I

MAS108 Probability I 1 QUEEN MARY UNIVERSITY OF LONDON 2:30 pm, Thursday 3 May, 2007 Duration: 2 hours MAS108 Probability I Do not start reading the question paper until you are instructed to by the invigilators. The paper

More information

PROBABILITY AND SAMPLING DISTRIBUTIONS

PROBABILITY AND SAMPLING DISTRIBUTIONS PROBABILITY AND SAMPLING DISTRIBUTIONS SEEMA JAGGI AND P.K. BATRA Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 0 0 seema@iasri.res.in. Introduction The concept of probability

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

PROBLEM SET 1. For the first three answer true or false and explain your answer. A picture is often helpful.

PROBLEM SET 1. For the first three answer true or false and explain your answer. A picture is often helpful. PROBLEM SET 1 For the first three answer true or false and explain your answer. A picture is often helpful. 1. Suppose the significance level of a hypothesis test is α=0.05. If the p-value of the test

More information

MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo. 3 MT426 Notebook 3 3. 3.1 Definitions... 3. 3.2 Joint Discrete Distributions...

MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo. 3 MT426 Notebook 3 3. 3.1 Definitions... 3. 3.2 Joint Discrete Distributions... MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo c Copyright 2004-2012 by Jenny A. Baglivo. All Rights Reserved. Contents 3 MT426 Notebook 3 3 3.1 Definitions............................................

More information

PROBABILITIES AND PROBABILITY DISTRIBUTIONS

PROBABILITIES AND PROBABILITY DISTRIBUTIONS Published in "Random Walks in Biology", 1983, Princeton University Press PROBABILITIES AND PROBABILITY DISTRIBUTIONS Howard C. Berg Table of Contents PROBABILITIES PROBABILITY DISTRIBUTIONS THE BINOMIAL

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Let m denote the margin of error. Then

Let m denote the margin of error. Then S:105 Statistical Methods and Computing Sample size for confidence intervals with σ known t Intervals Lecture 13 Mar. 6, 009 Kate Cowles 374 SH, 335-077 kcowles@stat.uiowa.edu 1 The margin of error The

More information

Chapter 6 Random Variables

Chapter 6 Random Variables Chapter 6 Random Variables Day 1: 6.1 Discrete Random Variables Read 340-344 What is a random variable? Give some examples. A numerical variable that describes the outcomes of a chance process. Examples:

More information

Lecture 5 : The Poisson Distribution

Lecture 5 : The Poisson Distribution Lecture 5 : The Poisson Distribution Jonathan Marchini November 10, 2008 1 Introduction Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume,

More information

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1. Lecture 6: Chapter 6: Normal Probability Distributions A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve.

More information

Statistical Inference

Statistical Inference Statistical Inference Idea: Estimate parameters of the population distribution using data. How: Use the sampling distribution of sample statistics and methods based on what would happen if we used this

More information

EXAM #1 (Example) Instructor: Ela Jackiewicz. Relax and good luck!

EXAM #1 (Example) Instructor: Ela Jackiewicz. Relax and good luck! STP 231 EXAM #1 (Example) Instructor: Ela Jackiewicz Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned.

More information

AP Statistics 1998 Scoring Guidelines

AP Statistics 1998 Scoring Guidelines AP Statistics 1998 Scoring Guidelines These materials are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought from the Advanced Placement

More information

The Normal distribution

The Normal distribution The Normal distribution The normal probability distribution is the most common model for relative frequencies of a quantitative variable. Bell-shaped and described by the function f(y) = 1 2σ π e{ 1 2σ

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

Statistics 100 Binomial and Normal Random Variables

Statistics 100 Binomial and Normal Random Variables Statistics 100 Binomial and Normal Random Variables Three different random variables with common characteristics: 1. Flip a fair coin 10 times. Let X = number of heads out of 10 flips. 2. Poll a random

More information

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

More information

Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

More information

Chapter 5. Random variables

Chapter 5. Random variables Random variables random variable numerical variable whose value is the outcome of some probabilistic experiment; we use uppercase letters, like X, to denote such a variable and lowercase letters, like

More information

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

More information

Sampling Distribution of a Normal Variable

Sampling Distribution of a Normal Variable Ismor Fischer, 5/9/01 5.-1 5. Formal Statement and Examples Comments: Sampling Distribution of a Normal Variable Given a random variable. Suppose that the population distribution of is known to be normal,

More information

WEEK #22: PDFs and CDFs, Measures of Center and Spread

WEEK #22: PDFs and CDFs, Measures of Center and Spread WEEK #22: PDFs and CDFs, Measures of Center and Spread Goals: Explore the effect of independent events in probability calculations. Present a number of ways to represent probability distributions. Textbook

More information

PHP 2510 Central limit theorem, confidence intervals. PHP 2510 October 20,

PHP 2510 Central limit theorem, confidence intervals. PHP 2510 October 20, PHP 2510 Central limit theorem, confidence intervals PHP 2510 October 20, 2008 1 Distribution of the sample mean Case 1: Population distribution is normal For an individual in the population, X i N(µ,

More information

13.2 Measures of Central Tendency

13.2 Measures of Central Tendency 13.2 Measures of Central Tendency Measures of Central Tendency For a given set of numbers, it may be desirable to have a single number to serve as a kind of representative value around which all the numbers

More information

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm Error Analysis and the Gaussian Distribution In experimental science theory lives or dies based on the results of experimental evidence and thus the analysis of this evidence is a critical part of the

More information

CONFIDENCE INTERVALS FOR MEANS AND PROPORTIONS

CONFIDENCE INTERVALS FOR MEANS AND PROPORTIONS LESSON SEVEN CONFIDENCE INTERVALS FOR MEANS AND PROPORTIONS An interval estimate for μ of the form a margin of error would provide the user with a measure of the uncertainty associated with the point estimate.

More information

MEI Statistics 1. Exploring data. Section 1: Introduction. Looking at data

MEI Statistics 1. Exploring data. Section 1: Introduction. Looking at data MEI Statistics Exploring data Section : Introduction Notes and Examples These notes have sub-sections on: Looking at data Stem-and-leaf diagrams Types of data Measures of central tendency Comparison of

More information

Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

More information

Probability Distributions

Probability Distributions CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

More information

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit? ECS20 Discrete Mathematics Quarter: Spring 2007 Instructor: John Steinberger Assistant: Sophie Engle (prepared by Sophie Engle) Homework 8 Hints Due Wednesday June 6 th 2007 Section 6.1 #16 What is the

More information

6 3 The Standard Normal Distribution

6 3 The Standard Normal Distribution 290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since

More information

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8. Random variables Remark on Notations 1. When X is a number chosen uniformly from a data set, What I call P(X = k) is called Freq[k, X] in the courseware. 2. When X is a random variable, what I call F ()

More information