4. Continuous Random Variables, the Pareto and Normal Distributions

A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random variable is given by a probability density curve. The total area under this curve is 1.

Probability density curves, histograms and probability
Note: If we made a very large number of observations and drew a histogram using a large number of intervals, then the histogram would look very similar to the probability density curve. Hence, a histogram is an estimator of the probability density curve. The probability density curve may be thought of as the histogram for the whole population.

[Figure: A histogram as an estimator of the probability density curve]

Probability density curves and probability
The probability that a random variable takes a value between a and b, $P(a < X < b)$, is the area under the density curve between x = a and x = b. Note that for a < b,
$P(a < X < b) = P(X < b) - P(X < a) = P(X > a) - P(X > b)$.

The probability that a random variable takes a value less than a, $P(X < a)$, is the area under the density curve up to x = a.

The probability that a random variable takes a value greater than b, $P(X > b)$, is the area under the density curve starting at x = b.

4.1 The Pareto distribution
Pareto distributions are used to model the distribution of full-time salaries, especially when there is a minimum wage. Pareto distributions are clearly right skewed. A Pareto distribution is defined by two parameters: $x_m$, the minimum value the random variable can take (i.e. the minimum wage), and $\alpha$, where $x_m > 0$ and $\alpha > 0$. The parameter $\alpha$ describes the degree of concentration of the distribution. The smaller $\alpha$ (i.e. the closer $\alpha$ is to 0), the heavier the tail of the distribution (i.e. the proportion of individuals earning very high salaries increases).

If X has a Pareto distribution, then we write $X \sim \text{Pareto}(x_m, \alpha)$. The density function is given by
$f(x) = \frac{\alpha x_m^{\alpha}}{x^{\alpha+1}}$ when $x > x_m$; otherwise $f(x) = 0$.
For $\alpha > 1$, the expected value of a Pareto random variable X is given by
$E(X) = \frac{\alpha x_m}{\alpha - 1}$.

[Figure: Probability density function of a Pareto distribution]

In order to calculate probabilities for the Pareto distribution, we use
$P(X > x) = \begin{cases} 1, & x < x_m \\ (x_m/x)^{\alpha}, & x \ge x_m \end{cases}$
together with the interval rule, which states that if $a \le b$, then $P(a < X < b) = P(X > a) - P(X > b)$, and the rule of complementarity, which states that $P(X < x) = 1 - P(X > x)$. The last two rules hold for any continuous distribution.
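The three rules above translate directly into code. The following is a minimal sketch; the function names `pareto_sf`, `pareto_cdf` and `pareto_interval` and the Pareto(2, 3) parameters used in the printed checks are illustrative choices, not part of the slides.

```python
def pareto_sf(x, x_m, alpha):
    """Return P(X > x) for X ~ Pareto(x_m, alpha)."""
    if x_m <= 0 or alpha <= 0:
        raise ValueError("x_m and alpha must be positive")
    # Below the minimum value x_m, every outcome exceeds x.
    if x < x_m:
        return 1.0
    return (x_m / x) ** alpha


def pareto_cdf(x, x_m, alpha):
    """Rule of complementarity: P(X < x) = 1 - P(X > x)."""
    return 1.0 - pareto_sf(x, x_m, alpha)


def pareto_interval(a, b, x_m, alpha):
    """Interval rule: P(a < X < b) = P(X > a) - P(X > b) for a <= b."""
    if a > b:
        raise ValueError("require a <= b")
    return pareto_sf(a, x_m, alpha) - pareto_sf(b, x_m, alpha)


# Hypothetical parameters, chosen only to exercise the three rules.
print(pareto_sf(4, 2, 3))           # (2/4)^3
print(pareto_cdf(4, 2, 3))          # 1 - (2/4)^3
print(pareto_interval(4, 8, 2, 3))  # (2/4)^3 - (2/8)^3
```

Only the upper-tail probability needs a formula specific to the Pareto distribution; the other two functions use rules that hold for any continuous distribution.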

Example 4.1
Suppose that monthly salaries, denoted X, have a Pareto(1000, 2) distribution. Calculate
1. The expected (mean) salary.
2. The probability that an individual earns above the mean salary.
3. The probability that an individual earns below 1500.
4. The probability that an individual earns between 3000 and 6000.
5. The median salary (50% of individuals earn above the median salary).

Solution to Example 4.1
1. The mean salary is given by
$E(X) = \frac{\alpha x_m}{\alpha - 1} = \frac{2 \times 1000}{2 - 1} = 2000$.
2. We have
$P(X > 2000) = \left(\frac{x_m}{2000}\right)^{\alpha} = \left(\frac{1000}{2000}\right)^{2} = 0.25$.
Hence, only 25% of individuals earn more than the mean salary.

3. We need to calculate P(X < 1500). Using the rule of complementarity,
$P(X < 1500) = 1 - P(X > 1500) = 1 - \left(\frac{x_m}{1500}\right)^{\alpha} = 1 - (2/3)^2 = 5/9$.

4. We need to calculate P(3000 < X < 6000). Using the interval rule,
$P(3000 < X < 6000) = P(X > 3000) - P(X > 6000) = (1000/3000)^2 - (1000/6000)^2 = 1/9 - 1/36 = 1/12$.

5. We need to find the value k for which P(X > k) = 0.5, i.e. we need to solve $(1000/k)^2 = 0.5$. Taking square roots, $1000/k = \sqrt{0.5} \approx 0.7071$. It follows that $k \approx 1000/0.7071 \approx 1414$. Hence, the median wage is 1414 (just over 70% of the mean wage).
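The mean and median of Example 4.1 can be checked numerically. This is a small sketch: `pareto_mean` and `pareto_upper_quantile` are illustrative names, and the quantile formula comes from solving $(x_m/k)^{\alpha} = p$ for k, which generalises the median calculation (p = 0.5).

```python
def pareto_mean(x_m, alpha):
    """E(X) = alpha * x_m / (alpha - 1); the formula is stated only for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean formula requires alpha > 1")
    return alpha * x_m / (alpha - 1)


def pareto_upper_quantile(p, x_m, alpha):
    """Return k such that P(X > k) = p, for 0 < p <= 1."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    # Solve (x_m / k) ** alpha = p for k.
    return x_m * p ** (-1.0 / alpha)


mean_salary = pareto_mean(1000, 2)
median_salary = pareto_upper_quantile(0.5, 1000, 2)

print(mean_salary)                    # mean of Pareto(1000, 2)
print(round(median_salary))           # median, to the nearest unit
print(median_salary / mean_salary)    # median as a fraction of the mean
```

The last line gives the ratio behind the remark that the median is just over 70% of the mean, which reflects the right skew of the distribution.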

4.2 The normal distribution
The normal distribution has a symmetric bell-shaped density curve. This curve is defined by two quantities:
a) The theoretical (population) mean $\mu$ (this gives the centre of the distribution). The distribution is symmetric about $x = \mu$.
b) The variance $\sigma^2$ (this describes the dispersion of the distribution).
We write $X \sim N(\mu, \sigma^2)$. The (theoretical) variance of the random variable is $\sigma^2$. Note that some textbooks give the standard deviation $\sigma$ as a parameter of the distribution, rather than the variance.

[Figure: Probability density function of a normal distribution]

4.3 The standard normal distribution
A random variable with a standard normal distribution will be denoted by Z. Such a random variable has mean 0 and variance 1, i.e. the standard deviation is also 1. Values for P(Z > k) are given in Table 3 of the Murdoch and Barnes book of mathematical formulae (which will be available during the exam), for various $k \ge 0$. If k is greater than 4, then we may assume P(Z > k) = 0.

[Figure: Probabilities read from the table for the standard normal distribution. The graph illustrates the probabilities P(Z > k) given in the table.]

Calculating probabilities using the table for the standard normal distribution
1. The symmetry rule (when we change the sign of k, we change the direction of the inequality): $P(Z < -k) = P(Z > k)$.
2. In order to calculate P(Z < k), we use the rule of complementarity: $P(Z < k) = 1 - P(Z > k)$.
3. To calculate probabilities of the form P(a < Z < b), we use the interval rule: $P(a < Z < b) = P(Z > a) - P(Z > b)$.

Using these three rules and the table for the standard normal distribution, we can calculate the probability that Z falls in any interval. The probability P(Z > k) is found in the row corresponding to the digits in k directly before and after the decimal point and the column corresponding to the second digit after the decimal point.

Example 4.2
Calculate
i) $P(Z > 2.13)$
ii) $P(Z < -1.05)$
iii) $P(Z < 0.87)$
iv) $P(Z > -2.06)$
v) $P(0.23 < Z < 1.49)$

Solution to Example 4.2
i) $P(Z > 2.13)$. We can read this directly from the table. This probability is given in the row corresponding to 2.1 and the column corresponding to 0.03. Thus, $P(Z > 2.13) = 0.0166$.

ii) $P(Z < -1.05)$. When we have a negative constant, we use the symmetry rule, i.e. $P(Z < -k) = P(Z > k)$:
$P(Z < -1.05) = P(Z > 1.05) = 0.1469$.
Note: Whenever the number on the right-hand side is negative, we will have to use the symmetry rule, but it might not lead immediately to the appropriate form (see part iv).

iii) $P(Z < 0.87)$. In this case (a positive constant, but the inequality is the wrong way round for us to read the probability directly), we use the rule of complementarity:
$P(Z < 0.87) = 1 - P(Z > 0.87) = 1 - 0.1922 = 0.8078$.

iv) $P(Z > -2.06)$. First we use the rule of symmetry: $P(Z > -2.06) = P(Z < 2.06)$. Using the rule of complementarity,
$P(Z < 2.06) = 1 - P(Z > 2.06) = 1 - 0.0197 = 0.9803$.

v) $P(0.23 < Z < 1.49)$. In this case we first use the interval rule $P(a < Z < b) = P(Z > a) - P(Z > b)$. Thus,
$P(0.23 < Z < 1.49) = P(Z > 0.23) - P(Z > 1.49) = 0.4090 - 0.0681 = 0.3409$.
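The table look-ups in Example 4.2 can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the printed table; the helper names `upper_tail`, `lower_tail` and `between` are illustrative. The signs of the constants in parts ii) and iv) are taken to be negative, as the solution text indicates.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution


def upper_tail(k):
    """P(Z > k): the quantity tabulated for k >= 0."""
    return 1.0 - Z.cdf(k)


def lower_tail(k):
    """Rule of complementarity: P(Z < k) = 1 - P(Z > k)."""
    return 1.0 - upper_tail(k)


def between(a, b):
    """Interval rule: P(a < Z < b) = P(Z > a) - P(Z > b)."""
    return upper_tail(a) - upper_tail(b)


answers = {
    "i": upper_tail(2.13),
    "ii": lower_tail(-1.05),
    "iii": lower_tail(0.87),
    "iv": upper_tail(-2.06),
    "v": between(0.23, 1.49),
}
for label, value in answers.items():
    print(f"{label}) {value:.4f}")
```

Because the software evaluates the distribution function for any sign of k, the symmetry rule is not needed here; it matters only when working from a table that lists non-negative k.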

Calculating values from probabilities using the table for the standard normal distribution
Sometimes we may wish to find the value of k for which P(Z > k) = p or P(Z < k) = p for some given p. Using the symmetry and complementarity rules, we can transform such a problem into one of finding c such that P(Z > c) = q, where q < 0.5. The value of c is found by locating the value closest to q in the body of the table. The value of c corresponds to the row (digits around the decimal point) and the column (second decimal place).

Example 4.3
Find k such that
i) P(Z > k) = 0.4
ii) P(Z < k) = 0.8
iii) P(Z > k) = 0.7
iv) P(Z < k) = 0.1

Solution to Example 4.3
i) P(Z > k) = 0.4. This is in the appropriate form to read directly from the table. We find the value closest to 0.4 in the body of the table. This value is 0.4013 and is in the row corresponding to 0.2 and the column corresponding to 0.05. Hence, $P(Z > 0.25) \approx 0.4$. Thus, $k \approx 0.25$.

ii) P(Z < k) = 0.8. In this case we cannot read k directly. In order to obtain a value of less than 0.5 on the right-hand side, we use the rule of complementarity: $P(Z > k) = 1 - P(Z < k) = 0.2$. The value in the body of the table closest to 0.2 is 0.2005, which is in the row corresponding to 0.8 and the column corresponding to 0.04. It follows that $P(Z > 0.84) \approx 0.2$. Hence, $k \approx 0.84$.

iii) P(Z > k) = 0.7. Again we cannot directly read k. From a sketch of the density curve, it is clear that k < 0.

Using complementarity to obtain a number less than 0.5 on the right-hand side, $P(Z < k) = 1 - P(Z > k) = 0.3$. Once we have an appropriate number on the right-hand side, we can use the law of symmetry to obtain the appropriate type of inequality: $P(Z < k) = P(Z > -k) = 0.3$. The equation $P(Z > -k) = 0.3$ is in the appropriate form, so we can now read $-k$ from the table. As before, $-k \approx 0.52$. Thus, $k \approx -0.52$.

iv) P(Z < k) = 0.1. Again k < 0. Using symmetry, $P(Z < k) = P(Z > -k) = 0.1$. From the tables, $P(Z > 1.28) \approx 0.1$. It follows that $-k \approx 1.28$, i.e. $k \approx -1.28$.
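The reverse look-up in Example 4.3 corresponds to evaluating the inverse of the distribution function. A minimal sketch, assuming the standard library's `NormalDist.inv_cdf` is an acceptable stand-in for searching the body of the table (the helper names are illustrative):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def k_for_lower_tail(p):
    """Return k such that P(Z < k) = p."""
    return Z.inv_cdf(p)


def k_for_upper_tail(p):
    """Return k such that P(Z > k) = p, i.e. P(Z < k) = 1 - p."""
    return Z.inv_cdf(1.0 - p)


print(round(k_for_upper_tail(0.4), 2))  # part i)
print(round(k_for_lower_tail(0.8), 2))  # part ii)
print(round(k_for_upper_tail(0.7), 2))  # part iii), negative
print(round(k_for_lower_tail(0.1), 2))  # part iv), negative
```

The results agree with the table-based answers to within the two-decimal resolution of the table.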

4.4 Standardisation of a normal random variable
By adding or subtracting a constant from a normally distributed random variable, we add or subtract (as appropriate) that constant from the mean of the random variable. The shape of the density curve remains the same (i.e. the dispersion is unchanged). Hence, if $X \sim N(\mu, \sigma^2)$, then $Y = X - \mu$ has a normal distribution with mean 0 (and unchanged variance $\sigma^2$).

By dividing a random variable by a positive constant c, we divide its standard deviation by the same factor. The density curve remains bell shaped. It follows that if $X \sim N(\mu, \sigma^2)$, then
$Z = \frac{X - \mu}{\sigma}$
has a normal distribution with mean 0 and variance (and standard deviation) 1 (i.e. Z is a standard normal random variable).

Hence, in order to calculate the probability that any normal random variable takes a value in a given interval, we:
i. First standardise.
ii. Use the three rules given above and the table to calculate the appropriate probability.

Example 4.4
The height of an Irish adult has a normal distribution with mean 170cm and variance 225cm$^2$. Calculate the probability that the height of an Irish adult is
a) more than 191cm.
b) less than 164cm.
c) between 158 and 179cm.

Solution to Example 4.4
We have $X \sim N(170, 225)$. Hence, $Z = \frac{X - \mu}{\sigma} = \frac{X - 170}{15}$.
a) We wish to calculate P(X > 191). First we standardise both sides of the inequality by subtracting 170 (the mean $\mu$) and dividing by 15 (the standard deviation $\sigma$):
$P(X > 191) = P\left(\frac{X - \mu}{\sigma} > \frac{191 - 170}{15}\right)$.

On the left-hand side of this inequality we now have a standard normal random variable:
$P\left(Z > \frac{191 - 170}{15}\right) = P(Z > 1.4) = 0.0808$.

b) P(X < 164). First we standardise as before:
$P(X < 164) = P\left(\frac{X - \mu}{\sigma} < \frac{164 - 170}{15}\right) = P(Z < -0.4)$.
Using the rule of symmetry, $P(Z < -0.4) = P(Z > 0.4) = 0.3446$.

c) P(158 < X < 179). First, we standardise all three parts of the inequality:
$P(158 < X < 179) = P\left(\frac{158 - 170}{15} < Z < \frac{179 - 170}{15}\right) = P(-0.8 < Z < 0.6)$.
Using the interval rule, $P(-0.8 < Z < 0.6) = P(Z > -0.8) - P(Z > 0.6)$.

Using the law of symmetry and then the law of complementarity for the first probability,
$P(Z > -0.8) - P(Z > 0.6) = P(Z < 0.8) - P(Z > 0.6) = 1 - P(Z > 0.8) - P(Z > 0.6) = 1 - 0.2119 - 0.2743 = 0.5138$.
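The standardisation steps of Example 4.4 can be reproduced as follows. This is a sketch under the example's assumptions (mean 170cm, variance 225cm², so standard deviation 15cm); `standardise` is an illustrative helper name.

```python
import math
from statistics import NormalDist

mu = 170.0
variance = 225.0
sigma = math.sqrt(variance)  # the standard deviation, 15 cm

Z = NormalDist()  # standard normal distribution


def standardise(x):
    """Convert a height x into the corresponding value of Z = (X - mu) / sigma."""
    return (x - mu) / sigma


p_a = 1.0 - Z.cdf(standardise(191))                   # P(X > 191) = P(Z > 1.4)
p_b = Z.cdf(standardise(164))                         # P(X < 164) = P(Z < -0.4)
p_c = Z.cdf(standardise(179)) - Z.cdf(standardise(158))  # P(-0.8 < Z < 0.6)

print(f"a) {p_a:.4f}  b) {p_b:.4f}  c) {p_c:.4f}")

# The same answer is obtained without standardising by hand:
X = NormalDist(mu, sigma)
print(f"c) directly: {X.cdf(179) - X.cdf(158):.4f}")
```

Note that `NormalDist` takes the standard deviation, not the variance, as its second parameter, which is the textbook convention the slides warn about.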

The normal distribution and probabilities of extreme values
Note: If a variable has a normal distribution, then
1. The probability of an observation being within 1 standard deviation of the mean is approx. 2/3. The probability of being in one of the tails (i.e. outside this range) is thus approx. 1/3.
2. The probability of an observation being within 2 (to be exact 1.96) standard deviations of the mean is approx. 0.95. The probability of being in one of the tails is thus approx. 0.05.
3. The probability of being within ... standard deviations of the mean is ...

[Figure: The normal distribution and probabilities of extreme values]

Example 4.5
The height of humans is normally distributed with mean 170cm and standard deviation 10cm. Approx. 2/3 of people are between 160cm and 180cm (i.e. 170 ± 10cm). Approx. 95% of people are between 150cm and 190cm (i.e. 170 ± 20cm). Note: These approximations are only valid when the distribution is normal.
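These rules of thumb can be checked numerically. A minimal sketch, in which `prob_within` is an illustrative helper giving the probability that a normal variable lies within k standard deviations of its mean:

```python
from statistics import NormalDist


def prob_within(k):
    """P(|X - mu| < k * sigma) for a normally distributed X.

    After standardising, this is P(-k < Z < k), so it does not
    depend on mu or sigma.
    """
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)


for k in (1, 1.96, 2):
    print(f"within {k} s.d.: {prob_within(k):.4f}")

# Example 4.5: heights with mean 170 cm and standard deviation 10 cm.
heights = NormalDist(170, 10)
print(f"160-180 cm: {heights.cdf(180) - heights.cdf(160):.4f}")
print(f"150-190 cm: {heights.cdf(190) - heights.cdf(150):.4f}")
```

The figure for one standard deviation is slightly above 2/3, and the figure for exactly two standard deviations is slightly above 95%, which is why 1.96 is quoted as the exact multiplier.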

4.5 Importance of the normal distribution, the central limit theorem
The central limit theorem states that if a variable X is the sum of a large number of independent random variables (of comparable means and standard deviations), then X is approximately normally distributed. Note: "large" is usually interpreted as at least 30. If the random variables in the sum have a symmetric distribution, then this approximation will be very good.
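A small simulation can illustrate the theorem. The sketch below is an assumption-laden illustration rather than a proof: it sums 30 independent Uniform(0, 1) variables (each with mean 1/2 and variance 1/12), repeats this 5000 times with a fixed seed, and compares the proportion of sums falling within 1 and 1.96 standard deviations of the mean with the values expected for a normal distribution. The choice of the uniform distribution, the number of repetitions and the seed are arbitrary.

```python
import random
from statistics import mean

rng = random.Random(0)   # fixed seed so that the run is reproducible
n_terms = 30             # number of variables in each sum
n_sums = 5000            # number of simulated sums

sums = [sum(rng.random() for _ in range(n_terms)) for _ in range(n_sums)]

# Theoretical mean and standard deviation of a sum of n_terms Uniform(0, 1) variables.
mu = n_terms * 0.5
sigma = (n_terms / 12) ** 0.5


def frac_within(k):
    """Proportion of simulated sums within k standard deviations of mu."""
    return sum(abs(s - mu) < k * sigma for s in sums) / n_sums


print(f"sample mean of the sums: {mean(sums):.3f} (theory: {mu})")
print(f"within 1 s.d.:    {frac_within(1):.3f} (normal: approx. 0.683)")
print(f"within 1.96 s.d.: {frac_within(1.96):.3f} (normal: approx. 0.950)")
```

Since the uniform distribution is symmetric, the agreement with the normal values is expected to be close even for 30 terms.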

4.5.1 Practical consequences of the central limit theorem
1. Many variables have a normal distribution. For example, height is the sum of many factors: genetic, environmental and dietary (no individual factor is very important), and fits the normal distribution well.
2. The mean of a sample is simply the sum of the observations divided by a constant (the number of observations). Hence, if there is a large number of observations of a single variable, the distribution of the sample mean always fits the normal distribution well (i.e. the sample mean will be approximately normally distributed around the population mean).

If there is a small number of observations, then the sample mean will only fit the normal distribution well if the observations are from a normal distribution (e.g. height, intelligence quotient). This fact is very important in statistical testing, since it is usually assumed that the mean of a sample is normally distributed. This assumption may not be appropriate when we are dealing with small samples.

3. Suppose X has a binomial distribution with parameters n and p, where n is large (and preferably p is not close to either 0 or 1). For example, the number of heads when I throw a coin n times has a Bin(n, 1/2) distribution. Similarly, if a proportion p of voters support party Y and n people are asked who they support, then (assuming they do not lie) the number of supporters of party Y in the sample has a Bin(n, p) distribution.

In the first example, the total number of heads can be expressed as the sum of the number of heads from each individual throw (the number of heads from one throw is 1 with probability p; otherwise it is 0). It follows that for large n, the number of heads will be approximately normally distributed. Using a similar argument, for large n the number of supporters of party Y will be approximately normally distributed. This approximation works well if the sample size n is at least 30 and p is between 0.1 and 0.9.

4. For a large sample, the proportion of observations in a given class is approximately normally distributed. This results from the fact that this proportion is simply the total number of such observations (from 3, this is approximately normally distributed) divided by a constant (which does not change the form of the distribution).

4.5.2 The normal approximation to the binomial distribution
Suppose n is large and $X \sim \text{Bin}(n, p)$. Then $X \overset{\text{approx}}{\sim} N(\mu, \sigma^2)$, where $\mu = np$ and $\sigma^2 = np(1 - p)$. This approximation is used when $n \ge 30$ and $0.1 \le p \le 0.9$.

The continuity correction for the normal approximation to the binomial distribution
It should be noted that X has a discrete distribution, but we are using a continuous distribution in the approximation. For example, suppose we wanted to estimate the probability of obtaining exactly k heads when we throw a coin n times. This probability will in general be positive. However, if we use the normal approximation without an appropriate correction, we cannot sensibly estimate P(X = k) [for continuous distributions P(X = k) = 0].

Suppose the random variable X takes only integer values and has an approximately normal distribution. In order to estimate P(X = k), we use the continuity correction. This uses the fact that when k is an integer,
$P(X = k) = P(k - 0.5 < X < k + 0.5)$.

Example 4.6
Suppose a coin is tossed 36 times. Using the CLT, estimate the probability that exactly 20 heads are thrown.

Solution to Example 4.6
Let X be the number of heads. We have $X \sim \text{Bin}(36, 0.5)$. Hence,
$E(X) = np = 36 \times 0.5 = 18$,
$\text{Var}(X) = np(1 - p) = 36 \times 0.5 \times 0.5 = 9$.
It follows that $X \overset{\text{approx}}{\sim} N(18, 9)$. We wish to estimate P(X = 20). Using the continuity correction,
$P(X = 20) = P(19.5 < X < 20.5) = P\left(\frac{19.5 - 18}{\sqrt{9}} < \frac{X - \mu}{\sigma} < \frac{20.5 - 18}{\sqrt{9}}\right) \approx P(0.5 < Z < 0.83) = P(Z > 0.5) - P(Z > 0.83) = 0.3085 - 0.2033 = 0.1052$.
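To see how well the continuity correction works in Example 4.6, the estimate can be compared with the exact binomial probability. A minimal sketch (the function names are illustrative):

```python
import math
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_approx_pmf(k, n, p):
    """Estimate P(X = k) as P(k - 0.5 < Y < k + 0.5), Y ~ N(np, np(1 - p))."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    approx = NormalDist(mu, sigma)
    return approx.cdf(k + 0.5) - approx.cdf(k - 0.5)


exact = binom_pmf(20, 36, 0.5)
estimate = normal_approx_pmf(20, 36, 0.5)
print(f"exact:    {exact:.4f}")
print(f"estimate: {estimate:.4f}")
```

The estimate computed here differs slightly from a table-based calculation because the table requires rounding (20.5 - 18)/3 to 0.83, whereas the code uses the unrounded value. Both are close to the exact probability.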

This continuity correction can be adapted to problems in which we have to estimate the probability that the number of successes is in a given interval, e.g.
$P(15 \le X < 21) = P(X = 15) + P(X = 16) + \ldots + P(X = 20) = P(14.5 < X < 15.5) + \ldots + P(19.5 < X < 20.5) = P(14.5 < X < 20.5)$.
In general, if the boundary of an interval is given by a non-strict inequality, we widen that end of the interval by 0.5. If the boundary of an interval is given by a strict inequality, we narrow that end of the interval by 0.5.

Example 4.7
A die is thrown 180 times. Estimate the probability that
i) at least 35 sixes are thrown
ii) between 27 and 33 sixes are thrown (inclusive).

Solution to Example 4.7
Let X be the number of sixes. We have $X \sim \text{Bin}(180, 1/6)$, so
$E(X) = np = 180 \times \frac{1}{6} = 30$,
$\text{Var}(X) = np(1 - p) = 180 \times \frac{1}{6} \times \frac{5}{6} = 25$.

i) Using the continuity correction,
$P(X \ge 35) = P(X = 35) + P(X = 36) + \ldots = P(34.5 < X < 35.5) + P(35.5 < X < 36.5) + \ldots = P(X > 34.5)$.
Standardising,
$P(X > 34.5) = P\left(\frac{X - \mu}{\sigma} > \frac{34.5 - 30}{\sqrt{25}}\right) \approx P(Z > 0.9) = 0.1841$.
Note: Strictly speaking, we should calculate $P(35 \le X \le 180) = P(X \ge 35) - P(X > 180)$. However, when the normal approximation is appropriate, the second probability (that the number of successes is greater than the number of experiments) is estimated to be close to 0.

ii) Using the continuity correction,
$P(27 \le X \le 33) = P(X = 27) + P(X = 28) + \ldots + P(X = 33) = P(26.5 < X < 27.5) + \ldots + P(32.5 < X < 33.5) = P(26.5 < X < 33.5)$.
Standardising,
$P(26.5 < X < 33.5) = P\left(\frac{26.5 - 30}{\sqrt{25}} < \frac{X - \mu}{\sigma} < \frac{33.5 - 30}{\sqrt{25}}\right) \approx P(-0.7 < Z < 0.7) = P(Z > -0.7) - P(Z > 0.7) = P(Z < 0.7) - P(Z > 0.7) = 1 - 2P(Z > 0.7) = 1 - 2 \times 0.2420 = 0.5160$.
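The interval version of the continuity correction used in Example 4.7 can be sketched as follows. Both functions take inclusive integer bounds, so a strict inequality must first be converted by the caller (e.g. X < 21 becomes X ≤ 20); the function names are illustrative.

```python
import math
from statistics import NormalDist


def binom_prob_between(lo, hi, n, p):
    """Exact P(lo <= X <= hi) for X ~ Bin(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(lo, hi + 1))


def normal_approx_between(lo, hi, n, p):
    """Estimate P(lo <= X <= hi) by widening each inclusive end by 0.5."""
    approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)


n, p = 180, 1 / 6

# i) at least 35 sixes: the upper end is the number of throws, n.
at_least_35 = normal_approx_between(35, n, n, p)
# ii) between 27 and 33 sixes inclusive.
between_27_33 = normal_approx_between(27, 33, n, p)

print(f"i)  estimate {at_least_35:.4f}, exact {binom_prob_between(35, n, n, p):.4f}")
print(f"ii) estimate {between_27_33:.4f}, exact {binom_prob_between(27, 33, n, p):.4f}")
```

Using n as the upper bound in part i) implements the "strictly speaking" version from the note in the solution; the extra term is negligible here.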

The normal approximation to the binomial
It should be noted that the normal approximation to the binomial is most accurate when n is large and p is close to 0.5. This is due to the fact that $X = X_1 + X_2 + \ldots + X_n$, where each $X_i$ has a 0-1 distribution with parameter p. The distribution of $X_i$ is symmetric when p = 0.5.

4.6 Confidence intervals for population means
Suppose we take a large number of samples of size n. The distribution of the sample means will be distributed around the population mean. If the size of the samples is increased, you would expect that the average error obtained by estimating the population mean using the sample mean will decrease (the distribution of the sample mean will be more concentrated around the population mean). If the population standard deviation of the observations is $\sigma$, then the standard deviation of the sample mean from a sample of n observations is $\sigma/\sqrt{n}$ (otherwise known as the standard error, $S.E.(\bar{x})$).

4.6.1 Confidence intervals for the mean with large samples
All the calculations in Section 4.6 assume that the sample mean has a normal distribution. This is always reasonable when there is a large number of observations. When there is a small number of observations, this is only reasonable if the observations themselves have a normal distribution. When the sample size is large (n > 30), we may assume that the sample standard deviation s is a good estimate of the population standard deviation $\sigma$. Hence, we can use $s/\sqrt{n}$ as an approximation of the standard error. The sample mean $\bar{X}$ is the best estimator of the population mean $\mu$ (this is a point estimate).

Confidence intervals for the population mean
A point estimate does not indicate the expected error of that estimate, so an interval estimate should be used (e.g. the average population height is 175 ± 4cm). The default confidence level is 95%, i.e. if we calculate one hundred 95% confidence intervals for the mean based on 100 samples, then on average 95 of them will contain the real population mean. The population mean is not guaranteed to lie within a confidence interval.

Confidence intervals for the population mean with large samples
Since 95% of the observations of the sample mean will lie within 1.96 standard errors of the population mean, the following is a 95% confidence interval for the population mean (an interval estimate).
95% confidence interval for the population mean (large sample):
$\bar{x} \pm \frac{1.96 s}{\sqrt{n}} = \bar{x} \pm 1.96\, S.E.(\bar{x})$

Similarly, since 99% of the observations of the sample mean will lie within 2.576 standard errors of the population mean, the following is a 99% confidence interval for the population mean.
99% confidence interval for the population mean (large sample):
$\bar{x} \pm \frac{2.576 s}{\sqrt{n}} = \bar{x} \pm 2.576\, S.E.(\bar{x})$
A general equation for calculating confidence intervals at a given confidence level will be given in the following subsection.

Example 4.8
Suppose the mean weekly wage of 100 randomly chosen Irish workers is 420 Euros and the sample standard deviation is 300. Calculate a 95% confidence interval for the mean weekly wage of all Irish workers.

Solution to Example 4.8
The standard error of the sample mean is approximately
$S.E.(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{300}{\sqrt{100}} = 30$.
The 95% confidence interval for the mean weekly wage of all Irish workers is
$\bar{x} \pm 1.96\, S.E.(\bar{x}) = 420 \pm 1.96 \times 30 = 420 \pm 58.8 = [361.2, 478.8]$.
The narrower a confidence interval, the more accurately we are estimating a population mean.
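The large-sample interval of Example 4.8 can be computed as in the sketch below. The function name `large_sample_ci` is illustrative; the multiplier is obtained from the standard normal distribution rather than typed in, so that other confidence levels can be used.

```python
import math
from statistics import NormalDist


def large_sample_ci(sample_mean, s, n, confidence=0.95):
    """Confidence interval for a population mean, assuming a large sample (n > 30).

    The sample standard deviation s is used in place of the unknown
    population standard deviation.
    """
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2)  # approx. 1.96 for 95%, 2.576 for 99%
    standard_error = s / math.sqrt(n)
    half_width = z * standard_error
    return sample_mean - half_width, sample_mean + half_width


low, high = large_sample_ci(420, 300, 100, confidence=0.95)
print(f"95% CI: [{low:.1f}, {high:.1f}]")

low99, high99 = large_sample_ci(420, 300, 100, confidence=0.99)
print(f"99% CI: [{low99:.1f}, {high99:.1f}]")
```

The 99% interval is wider than the 95% interval: a higher confidence level is paid for with a less precise statement.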

4.6.2 Confidence intervals for the population mean with small samples
When the sample size is small, we cannot assume that the sample standard deviation is a good estimate of the population standard deviation. In this case, a confidence interval must reflect the increased degree of uncertainty resulting from not knowing the population standard deviation (i.e. the confidence interval must be wider).

The Student distribution
Suppose the observations $X_1, \ldots, X_n$ are normally distributed. Then
$Z = \frac{\sqrt{n}(\bar{X} - \mu)}{\sigma}$
has a normal distribution with mean 0 and standard deviation 1 (where $\mu$ and $\sigma$ are the population mean and population standard deviation, respectively). Let
$T_{n-1} = \frac{\sqrt{n}(\bar{X} - \mu)}{s}$,
where s is the sample standard deviation. $T_{n-1}$ has a Student distribution with n - 1 degrees of freedom.

Relation between the Student distribution and the normal distribution
Since s is an estimate of $\sigma$, the distribution of $T_{n-1}$ will be similar to the distribution of Z. However, this estimation introduces a larger degree of uncertainty and the dispersion of the $T_{n-1}$ distribution will be larger than the dispersion of the Z distribution. As n increases, s becomes a very good estimate of $\sigma$. Thus, for large n, the distribution of $T_{n-1}$ will converge to the standard normal distribution.

Critical values for the Student distribution
The p-critical value of the Student distribution with n - 1 degrees of freedom is denoted $t_{n-1,p}$. It satisfies $P(T_{n-1} > t_{n-1,p}) = p$. By symmetry, a proportion 1 - 2p of sample means will be within $t_{n-1,p}$ standard errors of the population mean. For n < 30 these critical values can be read from Table 7. For n > 30 we use the fact that for large n the Student distribution converges to the standard normal distribution. The appropriate critical values for the normal distribution can also be read from Table 7. They are given as $t_{\infty,p}$.

[Figure: Critical values for the Student distribution. The graph illustrates the critical value $t_{n-1,0.005}$ (approx. 2.6 in this case).]

Confidence intervals for the population mean with small samples
For a small sample, the following is a 100(1 - α)% confidence interval for the population mean.
100(1 - α)% confidence interval for the population mean (small sample):
$\bar{x} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}} = \bar{x} \pm t_{n-1,\alpha/2}\, S.E.(\bar{x})$

95% confidence interval for the population mean with small samples
For a 95% confidence interval, 100(1 - α) = 95. Hence, α = 0.05.
95% confidence interval for the population mean (small sample):
$\bar{x} \pm t_{n-1,0.025} \frac{s}{\sqrt{n}} = \bar{x} \pm t_{n-1,0.025}\, S.E.(\bar{x})$

99% confidence interval for the population mean with small samples
For a 99% confidence interval, 100(1 - α) = 99. Hence, α = 0.01.
99% confidence interval for the population mean (small sample):
$\bar{x} \pm t_{n-1,0.005} \frac{s}{\sqrt{n}} = \bar{x} \pm t_{n-1,0.005}\, S.E.(\bar{x})$

Example 4.9
Nine students were weighed. Their average weight was 68kg and the standard deviation 12kg. Calculate a 99% confidence interval for the mean weight of all students.

Solution to Example 4.9
This confidence interval is given by
$\bar{x} \pm t_{n-1,0.005} \frac{s}{\sqrt{n}} = 68 \pm t_{8,0.005} \times \frac{12}{3} = 68 \pm 3.355 \times 4 = 68 \pm 13.42 = [54.58, 81.42]$.
Note: Since the sample size is small and weight has a skewed distribution (i.e. is not normally distributed), this is only approximately a 99% confidence interval.
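The calculation in Example 4.9 can be sketched as below. The Python standard library has no Student distribution, so the critical value is passed in as a parameter; the constant 3.355 is the table value for 8 degrees of freedom and p = 0.005, entered by hand as an assumption. The sample size of 9 is inferred from the 8 degrees of freedom used in the solution.

```python
import math
from statistics import NormalDist


def small_sample_ci(sample_mean, s, n, t_crit):
    """Confidence interval: sample_mean +/- t_crit * s / sqrt(n).

    t_crit must be the critical value t_{n-1, alpha/2}, read from a table.
    """
    standard_error = s / math.sqrt(n)
    half_width = t_crit * standard_error
    return sample_mean - half_width, sample_mean + half_width


T_8_0005 = 3.355  # table value t_{8, 0.005} (assumed, not computed here)

low, high = small_sample_ci(68, 12, 9, T_8_0005)
print(f"99% CI using the Student distribution: [{low:.2f}, {high:.2f}]")

# For comparison only: the interval that would result from wrongly
# using the normal critical value with such a small sample.
z_crit = NormalDist().inv_cdf(0.995)
low_z, high_z = small_sample_ci(68, 12, 9, z_crit)
print(f"Using the normal critical value:       [{low_z:.2f}, {high_z:.2f}]")
```

The second interval is narrower, which shows how using the normal critical value with a small sample would understate the uncertainty.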

4.7 Confidence intervals for proportions
We want to estimate the proportion p of people in the population who have trait A. It will be assumed that the sample size n is large (n > 30). In this case the distribution of the sample proportion will be approximately normal (unless p is very close to 0 or 1). Suppose x people in a sample of n have trait A. Then $\hat{p} = x/n$ (the sample proportion) is an estimator of p (the population proportion).

Standard error of the sample proportion
The standard error of this estimator is
$S.E.(\hat{p}) = \sqrt{\frac{p(1 - p)}{n}}$,
which can be approximated using
$S.E.(\hat{p}) \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.
It should be noted that the maximum standard error is attained when p = 0.5. It follows that
$S.E.(\hat{p}) \le S.E._{\max}(\hat{p}) = \frac{1}{2\sqrt{n}}$.

Formula for a confidence interval for the population proportion
For large n and p not close to 0 or 1, $\hat{p}$ will be approximately normally distributed. It follows that a 100(1 - α)% confidence interval for p is given by
$\hat{p} \pm t_{\infty,\alpha/2}\, S.E.(\hat{p})$.

Particular cases of confidence intervals for the population proportion
A 95% confidence interval for p is given by $\hat{p} \pm t_{\infty,0.025}\, S.E.(\hat{p})$.
A 99% confidence interval for p is given by $\hat{p} \pm t_{\infty,0.005}\, S.E.(\hat{p})$.

Estimating the population proportion to a given accuracy
The upper bound on the standard error of the sample proportion is useful when defining the sample size necessary to estimate the population proportion to a given accuracy. To calculate the sample size required to estimate a population proportion to within δ with a confidence of 100(1 - α)%, we require that the error term $t_{\infty,\alpha/2}\, S.E.(\hat{p})$ is no larger than the required accuracy δ, i.e.
$t_{\infty,\alpha/2}\, S.E.(\hat{p}) \le \delta$.
This will always be satisfied if
$t_{\infty,\alpha/2}\, S.E._{\max}(\hat{p}) \le \delta$, i.e. $\frac{t_{\infty,\alpha/2}}{2\sqrt{n}} \le \delta$.
To find the appropriate sample size, we have to solve this inequality.

Example 4.10
In a survey of 300 people, 75 stated that they would vote for the Labour party.
i) Calculate a 95% confidence interval for the proportion of people wishing to vote for the Labour party.
ii) What sample size is needed in order to estimate this proportion to within 3% with a confidence level of 99%?

Solution to Example 4.10
i) The formula for this confidence interval is given by $\hat{p} \pm t_{\infty,0.025}\, S.E.(\hat{p})$. We have
$\hat{p} = \frac{75}{300} = \frac{1}{4} = 0.25$,
$S.E.(\hat{p}) \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{(1/4)(3/4)}{300}} = 0.025$.
Also, $t_{\infty,0.025} = 1.96$.

Hence, the 95% confidence interval is given by
$0.25 \pm 1.96 \times 0.025 = 0.25 \pm 0.049 = [0.201, 0.299]$.

ii) For a 99% confidence level, $t_{\infty,0.005} = 2.576$ and we require
$t_{\infty,\alpha/2}\, S.E._{\max}(\hat{p}) \le \delta \Rightarrow \frac{2.576}{2\sqrt{n}} \le 0.03 \Rightarrow \sqrt{n} \ge \frac{2.576}{0.06} \approx 42.93 \Rightarrow n \ge 1843.3$.
Hence, since the sample size is an integer, we require at least 1844 observations.
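Both parts of Example 4.10 can be reproduced with the sketch below. The function names are illustrative, and the critical values 1.96 and 2.576 are the table values $t_{\infty,0.025}$ and $t_{\infty,0.005}$, passed in explicitly.

```python
import math


def proportion_ci(x, n, z):
    """Confidence interval for a population proportion from x successes in n trials.

    Assumes n is large and the proportion is not close to 0 or 1.
    z is the critical value t_{inf, alpha/2}.
    """
    p_hat = x / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


def sample_size_for_proportion(delta, z):
    """Smallest n guaranteeing z * S.E.max <= delta, where S.E.max = 1 / (2 sqrt(n)).

    Rearranging z / (2 sqrt(n)) <= delta gives n >= (z / (2 delta)) ** 2.
    """
    return math.ceil((z / (2 * delta)) ** 2)


low, high = proportion_ci(75, 300, z=1.96)
print(f"95% CI: [{low:.3f}, {high:.3f}]")

n_required = sample_size_for_proportion(delta=0.03, z=2.576)
print(f"sample size for +/- 3% at 99% confidence: {n_required}")
```

The sample-size function uses the worst case p = 0.5, so it does not depend on the survey result; this makes the answer conservative when the true proportion is far from 0.5.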

4.8 Estimating a population mean to a given accuracy
In a similar way, we can estimate the sample size required to estimate a population mean to some required accuracy. However, the standard error depends on the population standard deviation, which in general is unknown. Unlike the standard error for a population proportion, we cannot give an upper bound on the standard error for a population mean (it simply increases as the population standard deviation increases). Normally a two-stage procedure is used. Firstly, a relatively small sample is used to estimate the standard deviation, and then the estimation of the required sample size is based on the standard deviation for this sample.

We assume that the required sample size is relatively large (i.e. n > 30). As previously, we require the error term in the corresponding confidence interval to be less than or equal to the permitted error δ, i.e.
$t_{\infty,\alpha/2}\, S.E.(\bar{X}) \le \delta$, i.e. $t_{\infty,\alpha/2} \frac{s}{\sqrt{n}} \le \delta$.

Example 4.11
The monthly salaries of 30 randomly chosen Irish adults were observed and the sample standard deviation was 1000 Euro. Find the sample size required to estimate the mean monthly salary of Irish adults to within 200 Euro with a confidence level of 95%.

Solution to Example 4.11
We have s = 1000, α = 0.05, δ = 200. Hence, we need to solve the following inequality for n:
$t_{\infty,0.025} \frac{s}{\sqrt{n}} \le 200 \Rightarrow \frac{1.96 \times 1000}{\sqrt{n}} \le 200$.
Hence, $\sqrt{n} \ge 9.8 \Rightarrow n \ge 96.04$. It follows that we need at least 97 observations (i.e. at least another 67 observations in addition to the initial sample).
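The second stage of the two-stage procedure in Example 4.11 amounts to solving the inequality for n and rounding up. A minimal sketch (the function name is illustrative, and 1.96 is the table value $t_{\infty,0.025}$):

```python
import math


def sample_size_for_mean(s, delta, z):
    """Smallest n with z * s / sqrt(n) <= delta, i.e. n >= (z * s / delta) ** 2.

    s is the standard deviation estimated from the pilot sample and
    z is the critical value t_{inf, alpha/2}.
    """
    return math.ceil((z * s / delta) ** 2)


pilot_size = 30
n_required = sample_size_for_mean(s=1000, delta=200, z=1.96)
extra_needed = max(0, n_required - pilot_size)

print(f"required sample size: {n_required}")
print(f"additional observations beyond the pilot sample: {extra_needed}")
```

Because the required n grows with the square of s/δ, halving the permitted error δ roughly quadruples the sample size.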


More information

You flip a fair coin four times, what is the probability that you obtain three heads.

You flip a fair coin four times, what is the probability that you obtain three heads. Handout 4: Binomial Distribution Reading Assignment: Chapter 5 In the previous handout, we looked at continuous random variables and calculating probabilities and percentiles for those type of variables.

More information

Math 151. Rumbos Spring 2014 1. Solutions to Assignment #22

Math 151. Rumbos Spring 2014 1. Solutions to Assignment #22 Math 151. Rumbos Spring 2014 1 Solutions to Assignment #22 1. An experiment consists of rolling a die 81 times and computing the average of the numbers on the top face of the die. Estimate the probability

More information

The normal approximation to the binomial

The normal approximation to the binomial The normal approximation to the binomial The binomial probability function is not useful for calculating probabilities when the number of trials n is large, as it involves multiplying a potentially very

More information

Lecture 10: Depicting Sampling Distributions of a Sample Proportion

Lecture 10: Depicting Sampling Distributions of a Sample Proportion Lecture 10: Depicting Sampling Distributions of a Sample Proportion Chapter 5: Probability and Sampling Distributions 2/10/12 Lecture 10 1 Sample Proportion 1 is assigned to population members having a

More information

Characteristics of Binomial Distributions

Characteristics of Binomial Distributions Lesson2 Characteristics of Binomial Distributions In the last lesson, you constructed several binomial distributions, observed their shapes, and estimated their means and standard deviations. In Investigation

More information

Week 4: Standard Error and Confidence Intervals

Week 4: Standard Error and Confidence Intervals Health Sciences M.Sc. Programme Applied Biostatistics Week 4: Standard Error and Confidence Intervals Sampling Most research data come from subjects we think of as samples drawn from a larger population.

More information

Notes on Continuous Random Variables

Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

5. Continuous Random Variables

5. Continuous Random Variables 5. Continuous Random Variables Continuous random variables can take any value in an interval. They are used to model physical characteristics such as time, length, position, etc. Examples (i) Let X be

More information

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve

More information

AMS 5 CHANCE VARIABILITY

AMS 5 CHANCE VARIABILITY AMS 5 CHANCE VARIABILITY The Law of Averages When tossing a fair coin the chances of tails and heads are the same: 50% and 50%. So if the coin is tossed a large number of times, the number of heads and

More information

WHERE DOES THE 10% CONDITION COME FROM?

WHERE DOES THE 10% CONDITION COME FROM? 1 WHERE DOES THE 10% CONDITION COME FROM? The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

Social Studies 201 Notes for November 19, 2003

Social Studies 201 Notes for November 19, 2003 1 Social Studies 201 Notes for November 19, 2003 Determining sample size for estimation of a population proportion Section 8.6.2, p. 541. As indicated in the notes for November 17, when sample size is

More information

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm

99.37, 99.38, 99.38, 99.39, 99.39, 99.39, 99.39, 99.40, 99.41, 99.42 cm Error Analysis and the Gaussian Distribution In experimental science theory lives or dies based on the results of experimental evidence and thus the analysis of this evidence is a critical part of the

More information

Probability Distributions

Probability Distributions CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Chapter 5. Random variables

Chapter 5. Random variables Random variables random variable numerical variable whose value is the outcome of some probabilistic experiment; we use uppercase letters, like X, to denote such a variable and lowercase letters, like

More information

Lesson 20. Probability and Cumulative Distribution Functions

Lesson 20. Probability and Cumulative Distribution Functions Lesson 20 Probability and Cumulative Distribution Functions Recall If p(x) is a density function for some characteristic of a population, then Recall If p(x) is a density function for some characteristic

More information

Lecture Notes Module 1

Lecture Notes Module 1 Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

More information

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.

Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1. Lecture 6: Chapter 6: Normal Probability Distributions A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve.

More information

Key Concept. Density Curve

Key Concept. Density Curve MAT 155 Statistical Analysis Dr. Claude Moore Cape Fear Community College Chapter 6 Normal Probability Distributions 6 1 Review and Preview 6 2 The Standard Normal Distribution 6 3 Applications of Normal

More information

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

More information

Random variables, probability distributions, binomial random variable

Random variables, probability distributions, binomial random variable Week 4 lecture notes. WEEK 4 page 1 Random variables, probability distributions, binomial random variable Eample 1 : Consider the eperiment of flipping a fair coin three times. The number of tails that

More information

AP Statistics Solutions to Packet 2

AP Statistics Solutions to Packet 2 AP Statistics Solutions to Packet 2 The Normal Distributions Density Curves and the Normal Distribution Standard Normal Calculations HW #9 1, 2, 4, 6-8 2.1 DENSITY CURVES (a) Sketch a density curve that

More information

An Introduction to Basic Statistics and Probability

An Introduction to Basic Statistics and Probability An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random

More information

The normal approximation to the binomial

The normal approximation to the binomial The normal approximation to the binomial In order for a continuous distribution (like the normal) to be used to approximate a discrete one (like the binomial), a continuity correction should be used. There

More information

THE BINOMIAL DISTRIBUTION & PROBABILITY

THE BINOMIAL DISTRIBUTION & PROBABILITY REVISION SHEET STATISTICS 1 (MEI) THE BINOMIAL DISTRIBUTION & PROBABILITY The main ideas in this chapter are Probabilities based on selecting or arranging objects Probabilities based on the binomial distribution

More information

Lecture 5 : The Poisson Distribution

Lecture 5 : The Poisson Distribution Lecture 5 : The Poisson Distribution Jonathan Marchini November 10, 2008 1 Introduction Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume,

More information

EXAM #1 (Example) Instructor: Ela Jackiewicz. Relax and good luck!

EXAM #1 (Example) Instructor: Ela Jackiewicz. Relax and good luck! STP 231 EXAM #1 (Example) Instructor: Ela Jackiewicz Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned.

More information

PROBABILITY AND SAMPLING DISTRIBUTIONS

PROBABILITY AND SAMPLING DISTRIBUTIONS PROBABILITY AND SAMPLING DISTRIBUTIONS SEEMA JAGGI AND P.K. BATRA Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 0 0 seema@iasri.res.in. Introduction The concept of probability

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

Standard Deviation Estimator

Standard Deviation Estimator CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

6 3 The Standard Normal Distribution

6 3 The Standard Normal Distribution 290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since

More information

The Normal distribution

The Normal distribution The Normal distribution The normal probability distribution is the most common model for relative frequencies of a quantitative variable. Bell-shaped and described by the function f(y) = 1 2σ π e{ 1 2σ

More information

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

More information

2 ESTIMATION. Objectives. 2.0 Introduction

2 ESTIMATION. Objectives. 2.0 Introduction 2 ESTIMATION Chapter 2 Estimation Objectives After studying this chapter you should be able to calculate confidence intervals for the mean of a normal distribution with unknown variance; be able to calculate

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8. Random variables Remark on Notations 1. When X is a number chosen uniformly from a data set, What I call P(X = k) is called Freq[k, X] in the courseware. 2. When X is a random variable, what I call F ()

More information

Statistics 104: Section 6!

Statistics 104: Section 6! Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC

More information

The Normal Distribution. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University

The Normal Distribution. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University The Normal Distribution Alan T. Arnholt Department of Mathematical Sciences Appalachian State University arnholt@math.appstate.edu Spring 2006 R Notes 1 Copyright c 2006 Alan T. Arnholt 2 Continuous Random

More information

TEACHER NOTES MATH NSPIRED

TEACHER NOTES MATH NSPIRED Math Objectives Students will understand that normal distributions can be used to approximate binomial distributions whenever both np and n(1 p) are sufficiently large. Students will understand that when

More information

3.4 The Normal Distribution

3.4 The Normal Distribution 3.4 The Normal Distribution All of the probability distributions we have found so far have been for finite random variables. (We could use rectangles in a histogram.) A probability distribution for a continuous

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Practice Problems for Homework #6. Normal distribution and Central Limit Theorem.

Practice Problems for Homework #6. Normal distribution and Central Limit Theorem. Practice Problems for Homework #6. Normal distribution and Central Limit Theorem. 1. Read Section 3.4.6 about the Normal distribution and Section 4.7 about the Central Limit Theorem. 2. Solve the practice

More information

MAS108 Probability I

MAS108 Probability I 1 QUEEN MARY UNIVERSITY OF LONDON 2:30 pm, Thursday 3 May, 2007 Duration: 2 hours MAS108 Probability I Do not start reading the question paper until you are instructed to by the invigilators. The paper

More information

Point and Interval Estimates

Point and Interval Estimates Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

The Binomial Probability Distribution

The Binomial Probability Distribution The Binomial Probability Distribution MATH 130, Elements of Statistics I J. Robert Buchanan Department of Mathematics Fall 2015 Objectives After this lesson we will be able to: determine whether a probability

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information

Stats on the TI 83 and TI 84 Calculator

Stats on the TI 83 and TI 84 Calculator Stats on the TI 83 and TI 84 Calculator Entering the sample values STAT button Left bracket { Right bracket } Store (STO) List L1 Comma Enter Example: Sample data are {5, 10, 15, 20} 1. Press 2 ND and

More information

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

More information

Chapter 4. iclicker Question 4.4 Pre-lecture. Part 2. Binomial Distribution. J.C. Wang. iclicker Question 4.4 Pre-lecture

Chapter 4. iclicker Question 4.4 Pre-lecture. Part 2. Binomial Distribution. J.C. Wang. iclicker Question 4.4 Pre-lecture Chapter 4 Part 2. Binomial Distribution J.C. Wang iclicker Question 4.4 Pre-lecture iclicker Question 4.4 Pre-lecture Outline Computing Binomial Probabilities Properties of a Binomial Distribution Computing

More information

Means, standard deviations and. and standard errors

Means, standard deviations and. and standard errors CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

More information

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit? ECS20 Discrete Mathematics Quarter: Spring 2007 Instructor: John Steinberger Assistant: Sophie Engle (prepared by Sophie Engle) Homework 8 Hints Due Wednesday June 6 th 2007 Section 6.1 #16 What is the

More information

Lecture 2: Discrete Distributions, Normal Distributions. Chapter 1

Lecture 2: Discrete Distributions, Normal Distributions. Chapter 1 Lecture 2: Discrete Distributions, Normal Distributions Chapter 1 Reminders Course website: www. stat.purdue.edu/~xuanyaoh/stat350 Office Hour: Mon 3:30-4:30, Wed 4-5 Bring a calculator, and copy Tables

More information

CURVE FITTING LEAST SQUARES APPROXIMATION

CURVE FITTING LEAST SQUARES APPROXIMATION CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship

More information

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

More information

7.1 Graphs of Quadratic Functions in Vertex Form

7.1 Graphs of Quadratic Functions in Vertex Form 7.1 Graphs of Quadratic Functions in Vertex Form Quadratic Function in Vertex Form A quadratic function in vertex form is a function that can be written in the form f (x) = a(x! h) 2 + k where a is called

More information

SOLUTIONS: 4.1 Probability Distributions and 4.2 Binomial Distributions

SOLUTIONS: 4.1 Probability Distributions and 4.2 Binomial Distributions SOLUTIONS: 4.1 Probability Distributions and 4.2 Binomial Distributions 1. The following table contains a probability distribution for a random variable X. a. Find the expected value (mean) of X. x 1 2

More information

SAMPLING DISTRIBUTIONS

SAMPLING DISTRIBUTIONS 0009T_c07_308-352.qd 06/03/03 20:44 Page 308 7Chapter SAMPLING DISTRIBUTIONS 7.1 Population and Sampling Distributions 7.2 Sampling and Nonsampling Errors 7.3 Mean and Standard Deviation of 7.4 Shape of

More information

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

More information

Ch5: Discrete Probability Distributions Section 5-1: Probability Distribution

Ch5: Discrete Probability Distributions Section 5-1: Probability Distribution Recall: Ch5: Discrete Probability Distributions Section 5-1: Probability Distribution A variable is a characteristic or attribute that can assume different values. o Various letters of the alphabet (e.g.

More information

The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1].

The sample space for a pair of die rolls is the set. The sample space for a random number between 0 and 1 is the interval [0, 1]. Probability Theory Probability Spaces and Events Consider a random experiment with several possible outcomes. For example, we might roll a pair of dice, flip a coin three times, or choose a random real

More information

Stat 5102 Notes: Nonparametric Tests and. confidence interval

Stat 5102 Notes: Nonparametric Tests and. confidence interval Stat 510 Notes: Nonparametric Tests and Confidence Intervals Charles J. Geyer April 13, 003 This handout gives a brief introduction to nonparametrics, which is what you do when you don t believe the assumptions

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015. Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x

More information

Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

More information

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

More information

4.1 4.2 Probability Distribution for Discrete Random Variables

4.1 4.2 Probability Distribution for Discrete Random Variables 4.1 4.2 Probability Distribution for Discrete Random Variables Key concepts: discrete random variable, probability distribution, expected value, variance, and standard deviation of a discrete random variable.

More information

Normal Distribution as an Approximation to the Binomial Distribution

Normal Distribution as an Approximation to the Binomial Distribution Chapter 1 Student Lecture Notes 1-1 Normal Distribution as an Approximation to the Binomial Distribution : Goals ONE TWO THREE 2 Review Binomial Probability Distribution applies to a discrete random variable

More information

16. THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

16. THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION 6. THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION It is sometimes difficult to directly compute probabilities for a binomial (n, p) random variable, X. We need a different table for each value of

More information

The Normal Distribution

The Normal Distribution The Normal Distribution Continuous Distributions A continuous random variable is a variable whose possible values form some interval of numbers. Typically, a continuous variable involves a measurement

More information

Statistical Confidence Calculations

Statistical Confidence Calculations Statistical Confidence Calculations Statistical Methodology Omniture Test&Target utilizes standard statistics to calculate confidence, confidence intervals, and lift for each campaign. The student s T

More information

3.4. The Binomial Probability Distribution. Copyright Cengage Learning. All rights reserved.

3.4. The Binomial Probability Distribution. Copyright Cengage Learning. All rights reserved. 3.4 The Binomial Probability Distribution Copyright Cengage Learning. All rights reserved. The Binomial Probability Distribution There are many experiments that conform either exactly or approximately

More information

WEEK #22: PDFs and CDFs, Measures of Center and Spread

WEEK #22: PDFs and CDFs, Measures of Center and Spread WEEK #22: PDFs and CDFs, Measures of Center and Spread Goals: Explore the effect of independent events in probability calculations. Present a number of ways to represent probability distributions. Textbook

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

CHAPTER 6: Continuous Uniform Distribution: 6.1. Definition: The density function of the continuous random variable X on the interval [A, B] is.

CHAPTER 6: Continuous Uniform Distribution: 6.1. Definition: The density function of the continuous random variable X on the interval [A, B] is. Some Continuous Probability Distributions CHAPTER 6: Continuous Uniform Distribution: 6. Definition: The density function of the continuous random variable X on the interval [A, B] is B A A x B f(x; A,

More information