Section 2.4 Numerical Measures of Central Tendency


Definitions

Mean: The Mean of a quantitative dataset is the sum of the observations in the dataset divided by the number of observations in the dataset.

Median: The Median (m) of a quantitative dataset is the middle number when the observations are arranged in ascending order.

Mode: The Mode of a dataset is the observation that occurs most frequently in the dataset.

How to calculate these

Mean: There are two means, the Population Mean $\mu$ and the Sample Mean $\bar{x}$. Both are calculated in the same way, except that $\mu$ is calculated for the entire population and $\bar{x}$ is calculated for a sample taken from that population. From now on we will refer to $\bar{x}$, because in practice we never calculate $\mu$; estimating $\mu$ rather than calculating it is the whole point of inferential statistics.

Dataset: $x_1, x_2, x_3, x_4, \ldots, x_n$, so there are n observations in this dataset.

Sample Mean:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Median: Arrange the n observations in order from smallest to largest, then:
if n is odd, the median (m) is the middle number;
if n is even, the median is the mean of the middle two numbers.

Given a histogram, the median is the point on the X-axis such that half the area under the histogram lies to the left of the median and half lies to the right. An example of finding the median from a histogram with Class Intervals is shown in dataset D of the Example below.

[Figure: histogram with the median marked, 50% of the area on each side]

Mode: Given a dataset, the mode is simply the value with the highest relative frequency. Given a relative frequency distribution with class intervals, the mode is taken to be the midpoint of the class interval with the highest relative frequency. This class interval is called the Modal Class. The mode measures data concentration and so can be used to locate the region in a large dataset where much of the data is concentrated.

NOTE: Unlike the mean and median, the mode of a raw dataset must be an element of the original dataset.

Example

Calculate the Mean, Median and Mode for the following datasets:

A: Dataset: 5, 3, 8, 5, 6

$$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{27}{5} = 5.4$$

Mode = 5
Median: 3, 5, 5, 6, 8, so m = 5

Note: 5.4 is not one of the original values in the dataset.

B: Dataset: 11, 140, 98, 23, 45, 14, 56, 78, 93, 200, 123, 165

n = 12

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1046}{12} = 87.17$$

Median: 11, 14, 23, 45, 56, 78, 93, 98, 123, 140, 165, 200, so m = (78 + 93)/2 = 85.5

C: Generate a dataset containing 9 numbers using the Day, Month and Year of your birth and those of the people sitting to your left and right, i.e. DD/MM/YY.

D:

| Class Interval | Frequency |
| --- | --- |
| 2 -< 4 | 3 |
| 4 -< 6 | 18 |
| 6 -< 8 | 9 |
| 8 -< 10 | 7 |

Modal Class is 4 -< 6, as its frequency of 18 is the highest. The mode is the midpoint of this interval, so mode = 5.

Using the class midpoints 3, 5, 7 and 9:

Mean = (3*3 + 5*18 + 7*9 + 9*7)/(3 + 18 + 9 + 7) = 225/37 = 6.08

Median: There are 37 observations in this dataset, so the median is the 19th observation. There are 3 observations in the first Class Interval 2 -< 4, and as 19 - 3 = 16 we need to find the 16th observation in the Class Interval 4 -< 6. Assuming the observations are distributed uniformly within each Class Interval, the 16th observation in the second interval should lie 16/18 = 0.89 of the way between 4 and 6. The distance between 4 and 6 is 2 units and 2*0.89 = 1.78, so we find:

median (m) = 4 + 1.78 = 5.78
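The calculations for datasets A, B and D above can be checked with a short Python sketch. It uses the standard `statistics` module for the raw datasets; the helper names `grouped_mean`, `grouped_mode` and `grouped_median` are made up for this illustration, and the grouped median assumes, as the notes do, that observations are spread uniformly within each class and that the median sits at position (n + 1)/2.

```python
from statistics import mean, median, multimode

# Raw datasets A and B from the example above.
dataset_a = [5, 3, 8, 5, 6]
dataset_b = [11, 140, 98, 23, 45, 14, 56, 78, 93, 200, 123, 165]

# Dataset D as (lower, upper, frequency) class intervals.
classes_d = [(2, 4, 3), (4, 6, 18), (6, 8, 9), (8, 10, 7)]


def grouped_mean(classes):
    """Mean of grouped data, using each class midpoint as its value."""
    n = sum(freq for _, _, freq in classes)
    return sum((lo + hi) / 2 * freq for lo, hi, freq in classes) / n


def grouped_mode(classes):
    """Midpoint of the modal class (the class with the highest frequency)."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo + hi) / 2


def grouped_median(classes):
    """Median by linear interpolation inside the class that contains it.

    Assumption: observations are uniform within each class and the
    median position is (n + 1) / 2, following the worked example.
    """
    n = sum(freq for _, _, freq in classes)
    position = (n + 1) / 2
    seen = 0  # observations in the classes already passed
    for lo, hi, freq in classes:
        if seen + freq >= position:
            return lo + (position - seen) / freq * (hi - lo)
        seen += freq
    raise ValueError("classes contain no observations")


print(mean(dataset_a), median(dataset_a), multimode(dataset_a))
print(round(mean(dataset_b), 2), median(dataset_b))
print(round(grouped_mean(classes_d), 2), grouped_mode(classes_d),
      round(grouped_median(classes_d), 2))
```

Note that `multimode` returns every value when no value repeats, which is why it is not printed for dataset B: that dataset has no useful mode.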

2.4.4 Mean vs Median vs Mode - which measures the centre best?

Choosing which of these three measures to use in practice can sometimes seem like a difficult task. However, if we understand a little about the relative merits of each, we should at least be able to make an informed decision.

If the distribution is symmetric then Mean = Median
If the distribution is Positively Skewed (to the right) then Mean > Median
If the distribution is Negatively Skewed (to the left) then Median > Mean

So the difference between the mean and median can be used to measure the skewness of a dataset.

[Slide to be inserted]

Note: The presence of outliers can change the mean substantially but has little or no effect on the median. This can be seen from the diagrams and from the following example:

Example 2.4.5

Ten statistics graduates who are now working as statisticians are surveyed for their annual salary. The survey produced the following dataset:

60,000 20,000 19,000 22,000 21,500 21,000 18,000 16,000 17,500 20,000

Calculate the Mode, Median and Mean:

Mode = 20,000
Median = 20,000
Mean = 23,500

Notice that the distribution is positively skewed: the presence of the one high earner has affected the Mean, causing it to be 1,500 higher than the highest of all the salaries excluding 60,000. For this dataset the Mean is therefore not a good measure of the centre of the dataset. Notice also that the median would be unaffected if the 60,000 were changed to a value like 23,000, which is more in line with the rest of the data.

Because the mean is sensitive to outliers while the median is largely unaffected by them, a revised version of the mean, called the trimmed mean, is sometimes used.
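The effect of the single high salary in the example above can be seen by recomputing the mean and median after replacing 60,000 with 23,000. The following is only a sketch using Python's standard `statistics` module; the variable name `adjusted` is introduced here for illustration.

```python
from statistics import mean, median

salaries = [60000, 20000, 19000, 22000, 21500,
            21000, 18000, 16000, 17500, 20000]

# Same data with the outlier replaced by a more typical value.
adjusted = [23000 if s == 60000 else s for s in salaries]

print("original:", mean(salaries), median(salaries))
print("adjusted:", mean(adjusted), median(adjusted))
```

The mean drops from 23,500 to 19,800 while the median stays at 20,000 in both cases.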

2.4.6 Definition: Trimmed Mean

NOTE: This definition is NOT in the textbook.

A trimmed mean is computed by first ordering the data values from smallest to largest, then deleting a selected number of values from each end of the ordered list, and finally averaging the remaining undeleted values. The trimming percentage is the percentage of values deleted from EACH end of the ordered list.

So if a dataset contained 10 observations and we wanted to find a 20% trimmed mean, we would delete 2 observations from the top of the ordered dataset and 2 from the bottom, leaving 6 remaining values. The mean is then calculated for these 6 remaining values and this is the 20% Trimmed Mean.

Example: Compute a 10% trimmed mean for the dataset in Example 2.4.5 and compare it with the previous measures.

There are 10 observations in the dataset and 10% of 10 is 1, so we delete the largest and smallest observations, i.e. the values 60,000 and 16,000 are deleted. The mean of the remaining values is then calculated:

10% Trimmed Mean = (17,500 + 18,000 + 19,000 + 20,000 + 20,000 + 21,000 + 21,500 + 22,000)/8 = 19,875

This is very similar to the median and mode for this data.
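The trimmed mean procedure described above can be sketched in a few lines of Python. The function name `trimmed_mean` is made up for this illustration, and truncating the number of deleted values to a whole number is an assumption for cases where the trimming percentage does not divide the dataset evenly.

```python
def trimmed_mean(data, percent):
    """Mean after deleting `percent`% of the values from EACH end."""
    ordered = sorted(data)
    # Number of values to delete per end (truncated: an assumption).
    k = int(len(ordered) * percent / 100)
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("trimming percentage removes every value")
    return sum(kept) / len(kept)


salaries = [60000, 20000, 19000, 22000, 21500,
            21000, 18000, 16000, 17500, 20000]

print(trimmed_mean(salaries, 10))
```

For the salary data this reproduces the 10% trimmed mean of 19,875 found above, and a trimming percentage of 0 gives back the ordinary mean of 23,500.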

2.4.7 Some more Examples

Sometimes we are not presented with a dataset but with a Histogram or a Stem and Leaf Diagram. It is still possible to measure the centre of the dataset from these graphs.

[Figure: MPG Histogram and Stem and Leaf diagram, to be inserted]

Example

Measurements were taken of the pulses of a certain number of UCD students; the observations are listed below. Find the median and mode of this dataset. What is the best way to present this data so that the median and mode can be calculated more easily?

Examples

Would you expect the datasets described below to possess relative frequency distributions which are symmetric, skewed to the right or skewed to the left?

A. The salaries of people employed by UCD
B. The grades on an easy exam
C. The grades on a difficult exam
D. The amount of time spent by students in a difficult 3 hour exam
E. The amount of time students in this class studied last week
F. The age of cars on a used car lot

Example: The median age of the population in Ireland is now 32 years. The median age of the Irish population in 1986 was 27. Interpret these values and explain the trend. What implications does this data have for Irish society? What are the consequences for the entertainment industry in Ireland?

Section 2.5 Numerical Measures of Variability

When we want to describe a dataset, providing a measure of the centre of that dataset is only part of the story. Consider the following two distributions:

[Figure: two symmetric distributions, A and B]

Both of these distributions are symmetric, with $\text{mean}_A = \text{mean}_B$, $\text{mode}_A = \text{mode}_B$ and $\text{median}_A = \text{median}_B$. However, these two distributions are obviously different: the data in A is quite spread out compared to the data in B. This spread is technically called variability, and in this section we will examine how best to measure it.

2.5.1 Definitions

Range: The Range of a quantitative dataset is equal to the largest value minus the smallest value.

Sample Variance: The Sample Variance is equal to the sum of the squared distances from the mean divided by n-1.

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

An easier formula to use when calculating the variance is:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$
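The two variance formulas above are algebraically equivalent, which can be checked numerically. The sketch below compares both against `statistics.variance` from the Python standard library (which also divides by n-1); the function names and the small sample are illustrative choices made for this sketch.

```python
import math
from statistics import variance


def variance_definition(data):
    """Sum of squared distances from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)


def variance_shortcut(data):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)


sample = [1, 2, 5, 8, 9]  # illustrative values
print(variance_definition(sample), variance_shortcut(sample), variance(sample))
print(max(sample) - min(sample))            # the Range
print(math.sqrt(variance_definition(sample)))  # positive square root
```

The shortcut form avoids computing each deviation from the mean, which is convenient by hand, although in floating-point arithmetic it can lose precision when the values are large relative to their spread.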

Sample Standard Deviation: The Sample Standard Deviation, s, is defined as the positive square root of the Sample Variance, $s^2$.

Which is best?

The meaning of the Range is easily seen from its definition. It is a very crude measure of the variability contained in a dataset, as it depends only on the largest and smallest values and does not measure the variability of the rest of the dataset.

ExampleA: These two datasets have the same range, but do they have the same variability?

Dataset1: 1, 5, 5, 5, 9
Dataset2: 1, 2, 5, 8, 9

NO, Dataset2 is obviously more spread out than Dataset1, which has three values clustered at 5.

The Sample Variance is a much better measure of the variability in the whole dataset. This is because the term $(x_i - \bar{x})$ in $s^2$ calculates the distance of each observation in the dataset from the centre of the dataset (as measured by the Sample Mean).

As some of the $x_i$ are smaller than $\bar{x}$ and some are larger, the distances $(x_i - \bar{x})$ cancel each other out when added (their sum is always exactly zero). For this reason we square each $(x_i - \bar{x})$ term before adding them together and dividing by n-1 to get an average measure of the squared distance of each observation from the mean. The Sample Variance will therefore be small if all observations are close to the Sample Mean but will be large if the observations are far away from the mean. This is best illustrated by comparing the calculation of $s^2$ for the two datasets in ExampleA above.

Dataset1: 1, 5, 5, 5, 9, so $\bar{x} = 5$

$$s^2 = \frac{(1-5)^2 + (5-5)^2 + (5-5)^2 + (5-5)^2 + (9-5)^2}{4} = \frac{(-4)^2 + 0^2 + 0^2 + 0^2 + 4^2}{4} = \frac{32}{4} = 8$$

Dataset2: 1, 2, 5, 8, 9, so $\bar{x} = 5$

$$s^2 = \frac{(1-5)^2 + (2-5)^2 + (5-5)^2 + (8-5)^2 + (9-5)^2}{4} = \frac{(-4)^2 + (-3)^2 + 0^2 + 3^2 + 4^2}{4} = \frac{50}{4} = 12.5$$

So the increased spread contained in Dataset2 is indeed measured by $s^2$.

Samples and Populations

You will have noticed that although we described $s^2$ as an average of the squared distances from the sample mean, in fact we divided the sum of the squares not by n but by n-1. There were n observations in the dataset, so surely the correct thing would be to divide by n and not n-1? The reason we divided by n-1 is that we are, as always, interested in Inferential Statistics and we want to use $s^2$ (the Sample Variance) to estimate the Population Variance, which we will denote by $\sigma^2$ (sigma squared). We will find later that $s^2$ with the n-1 provides a more accurate estimator of $\sigma^2$.

So again we have a Sample and a Population, and two Population Characteristics estimated by two Sample Statistics.

| Population Characteristic | Sample Statistic |
| --- | --- |
| Population Variance $\sigma^2$ | Sample Variance $s^2$ |
| Population Standard Deviation $\sigma$ | Sample Standard Deviation $s$ |

Example

Two samples are chosen from a population:

Sample1: 10, 0, 1, 9, 10, 0, 8, 1, 1, 9
Sample2: 0, 5, 10, 5, 5, 5, 6, 5, 6, 5

Answer the following questions based on these two samples:

A. Examine both samples and identify which has the greater variability.
B. Calculate the Range for each sample. Does your result agree with the answer in A?
C. Calculate the Standard Deviation for each sample. Does this result agree with your answer to part A?
D. Which of the two, Range or Standard Deviation, provides the best measure of variability?

Answers: $\text{Range}_1 = 10$, $\text{Range}_2 = 10$, $s_1 = 4.5814$, $s_2 = 2.3944$

Example

Once upon a time there were two lecturers, A and B, each of whom delivered the same course to a different class. When exam time came both classes had the same average mark of 70%. The marks for Lecturer A's class, however, had a standard deviation of 25%, whereas the Standard Deviation for Lecturer B's class was 5%. Whose class would you rather be in?
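The answers to the two-sample example above (Sample1 and Sample2) can be reproduced with Python's standard `statistics.stdev`, which uses the n-1 divisor. This is a sketch only; `data_range` is a helper name introduced here.

```python
from statistics import stdev

sample1 = [10, 0, 1, 9, 10, 0, 8, 1, 1, 9]
sample2 = [0, 5, 10, 5, 5, 5, 6, 5, 6, 5]


def data_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)


for name, sample in (("Sample1", sample1), ("Sample2", sample2)):
    print(name, data_range(sample), round(stdev(sample), 4))
```

Both samples have a Range of 10, but the standard deviations differ markedly, which is the point of part D of the exercise.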

Section 2.6 Interpreting the Standard Deviation - Chebyshev's Rule and the Empirical Rule

We have seen that the Variance, and hence the Standard Deviation, of a dataset provides us with a relative measure of the variability contained in a dataset. So if we are given two datasets, the one with the larger Standard Deviation will be the dataset which exhibits the greater variability. Is it possible for the Standard Deviation to give more than a relative measure of variability? Can we actually say how spread out the data is? The answer is yes, and we will see later how to give detailed answers for particular distributions. In the meantime there are two rules which will provide us with a good deal of information about some general datasets.

Chebyshev's Rule

This rule applies to any dataset (population or sample) regardless of the shape or frequency distribution of the data.

For k > 1, the proportion of observations which are within k Standard Deviations of the mean is at least $1 - 1/k^2$.

Computing this for several values of k gives:

| k: Number of Standard Deviations | Proportion of the observations within k Standard Deviations of the Mean |
| --- | --- |
| 2 | At least 1 - 1/4 = 0.75 |
| 3 | At least 1 - 1/9 = 0.89 |
| 4 | At least 1 - 1/16 = 0.94 |
| 4.472 | At least 1 - 1/20 = 0.95 |
| 5 | At least 1 - 1/25 = 0.96 |
| 10 | At least 1 - 1/100 = 0.99 |

Note: Chebyshev's Rule provides us with an idea of the spread of distributions. Because it is meant to work for all distributions regardless of their shape, it doesn't give definite specific results. Instead it tells us that at least a certain proportion of observations lie in a specified interval. The proportions in Chebyshev's Rule are therefore very conservative, and for certain distributions we may find a much higher proportion of observations within these intervals.

The Empirical Rule provides us with some definite statements about the proportion of observations in a specified interval. It only works for Symmetric Bell-Shaped (mound-shaped) distributions. Also, this rule is an approximation, and more or less data than is indicated by the rule may lie in each interval.
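The Chebyshev proportions tabulated above follow directly from the formula $1 - 1/k^2$ and can be reproduced with a small Python sketch. The function name `chebyshev_bound` is made up for this illustration.

```python
def chebyshev_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4, 5, 10):
    print(k, round(chebyshev_bound(k), 2))
```

The guard for k <= 1 reflects the fact that the bound would be zero or negative there, so the rule says nothing useful.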

2.6.2 The Empirical Rule

For a Symmetric Bell-Shaped distribution:

Approximately 68% of the observations are within 1 Standard Deviation of the Mean
Approximately 95% of the observations are within 2 Standard Deviations of the Mean
Approximately 99.7% of the observations are within 3 Standard Deviations of the Mean

2.6.3 Some Examples

ExampleA

The following is a list of the times it takes 12 UCD students to get to college in the morning:

12, 23, 56, 14, 17, 21, 33, 42, 45, 38, 51, 29

Calculate $\bar{x}$ and s, and calculate the percentage of data between $\bar{x} - 2s$ and $\bar{x} + 2s$ and also between $\bar{x} - 3s$ and $\bar{x} + 3s$. Compare these results with the predictions of Chebyshev's Rule. Assuming that the data is distributed in an approximate Bell shape, use the Empirical Rule to calculate the percentage of the data within 2 standard deviations of the mean and within 3 standard deviations of the mean. Comment on your results.

$\bar{x} = 31.75$
$s = 14.78$
$2s = 2 \times 14.78 = 29.56$
$3s = 3 \times 14.78 = 44.34$
$\bar{x} - 2s = 31.75 - 29.56 = 2.19$
$\bar{x} + 2s = 31.75 + 29.56 = 61.31$
$\bar{x} - 3s = 31.75 - 44.34 = -12.59$, which we take as 0 because a travel time cannot be negative
$\bar{x} + 3s = 31.75 + 44.34 = 76.09$
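The summary values for ExampleA, and the share of observations that actually fall within k standard deviations of the mean, can be computed with the following sketch. It relies on the standard `statistics` module; `fraction_within` is a helper name introduced for this illustration.

```python
from statistics import mean, stdev

times = [12, 23, 56, 14, 17, 21, 33, 42, 45, 38, 51, 29]

xbar = mean(times)
s = stdev(times)  # sample standard deviation (n - 1 divisor)


def fraction_within(data, k):
    """Fraction of observations within k standard deviations of the mean."""
    centre, spread = mean(data), stdev(data)
    inside = [x for x in data
              if centre - k * spread <= x <= centre + k * spread]
    return len(inside) / len(data)


print(xbar, round(s, 2))
for k in (2, 3):
    print(k, fraction_within(times, k))
```

Every travel time lies inside both the 2s and the 3s interval, so the actual proportion is 100% in each case.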

| Interval | Actual | Chebyshev's | Empirical |
| --- | --- | --- | --- |
| $\bar{x} - 2s$ ~ $\bar{x} + 2s$ (2.19 ~ 61.31) | 100% | at least 75% | approx. 95% |
| $\bar{x} - 3s$ ~ $\bar{x} + 3s$ (0 ~ 76.09) | 100% | at least 89% | approx. 99.7% |

This table illustrates very clearly how Chebyshev's Rule generally underestimates the amount of data in each interval. The Empirical Rule provides, in this case, more accurate results.

ExampleB: A lecturer in UCD has assigned some problems to be done by the 120 students in her class. When it comes time to collect the problems, 9 students inform her that "The dog ate my homework". From many years of teaching classes this size she has observed that the mean for homeworks actually eaten by pets of all kinds is 3 homeworks and the standard deviation is 0.8 homeworks. Should the lecturer believe that the homeworks of all 9 students were eaten by their dogs or not?

By Chebyshev's Rule at least $1 - 1/k^2$ of the observations should lie in the interval ($\bar{x} - ks$, $\bar{x} + ks$). This gives the following table:

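| k: # of Standard Deviations | Interval | At least Percentage of observations in interval |
| --- | --- | --- |
| 2 | 1.4, 4.6 | 75% |
| 3 | 0.6, 5.4 | 89% |
| 4 | 0, 6.2 | 94% |
| 5 | 0, 7 | 96% |
| 6 | 0, 7.8 | 97% |
| 7 | 0, 8.6 | 98% |
| 8 | 0, 9.4 | 98% |

From this table we can see that 9 homeworks lies outside the interval for k = 7, which contains at least 98% of observations, so there is AT MOST a 2% chance that dogs ate 9 homeworks in this class. Remembering that Chebyshev's Rule is extremely conservative, we could conclude that the chances are very high that some of the students just didn't do their homeworks.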
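The intervals and percentages for ExampleB can be generated from the mean of 3 and standard deviation of 0.8 given in the example. The sketch below is illustrative; `chebyshev_row` is a made-up helper, and clipping the lower limit at 0 is an assumption reflecting that a count of eaten homeworks cannot be negative.

```python
def chebyshev_row(k, centre, spread, floor=0.0):
    """Return (lower, upper, minimum proportion) for k standard deviations."""
    lower = max(floor, centre - k * spread)  # clip at 0: counts are non-negative
    upper = centre + k * spread
    return lower, upper, 1 - 1 / k ** 2


claimed = 9  # homeworks reported as eaten
for k in range(2, 9):
    lower, upper, bound = chebyshev_row(k, 3, 0.8)
    outside = claimed > upper
    print(k, round(lower, 1), round(upper, 1), f"{bound:.0%}", outside)
```

The last column shows that 9 is still outside the interval at k = 7 and only falls inside it at k = 8.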

Example C: In Tombstone, Arizona Territory, people used Colt .45 revolvers. However, people used different ammunition. Wyatt Earp knew that his brothers and Doc Holliday were the only ones in the territory who used Colt .45s with Winchester ammunition. The Earp brothers conducted tests on many different combinations of weapons and ammunition. They found that the dataset of observations produced by the combination of Colt .45 with Winchester shells showed a Mean velocity of 936 feet/second and a Standard Deviation of 10 feet/second. The measurements were taken at a distance of 15 feet from the gun. When Wyatt examined the body of a cowboy shot in the back in cold blood, he concluded that he was shot at a distance of 15 feet and that the velocity of the bullet at impact was 1,000 feet/second. The dastardly Ike Clanton claimed that this cowboy was shot by the Earp brothers or Doc Holliday. Was Wyatt able to clear his good name using the Empirical Rule?

The distribution of this bullet velocity data should be approximately bell-shaped. This implies that the Empirical Rule should give a good estimate of the percentages of the data within each interval.

| k: # of Standard Deviations | Interval | Chebyshev's: At least Percentage | Empirical: approximate Percentage |
| --- | --- | --- | --- |
| 2 | 916, 956 | 75% | 95% |
| 3 | 906, 966 | 89% | 99.7% |
| 4 | 896, 976 | 94% | ~100% |
| 5 | 886, 986 | 96% | ~100% |
| 6 | 876, 996 | 97% | ~100% |
| 7 | 866, 1006 | 98% | ~100% |

This table quite clearly demonstrates that since the bullet velocity in the shooting was 1000 ft/sec, and since this lies more than 6 Standard Deviations away from the mean, the probability is extremely high that the Earps were not responsible for this shooting. This is especially evident from looking at the column showing percentages from the Empirical Rule. Practically 100% of bullet velocities should be between 896 and 976 ft/sec.
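The statement that 1000 ft/sec lies more than 6 Standard Deviations from the mean is a one-line calculation, sketched below. The function name `sd_distance` is introduced for illustration; the Chebyshev figure is the most that any distribution could place this far from its mean.

```python
def sd_distance(x, centre, spread):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - centre) / spread


k = sd_distance(1000, 936, 10)
chebyshev_max_outside = 1 / k ** 2  # at most this proportion lies k or more SDs away

print(k, round(chebyshev_max_outside, 4))
```

Even the conservative Chebyshev bound allows at most about 2.4% of velocities to be this far from 936 ft/sec, and for a bell-shaped distribution the Empirical Rule puts the figure at practically zero.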

Example C2: During The Troubles in Northern Ireland both Republicans and Loyalists used 9mm handguns; however, they used different brands of handgun and ammunition. The security forces in NI knew that the Republicans used Heckler and Koch 9mm handguns with Winchester ammunition. The security forces conducted tests on many different combinations of weapons and ammunition. They found that the dataset of observations produced by the combination of a H&K 9mm with Winchester shells showed a Mean velocity of 936 feet/second and a Standard Deviation of 10 feet/second. The measurements were taken at a distance of 15 feet from the gun. Forensic scientists examining the body of a shooting victim concluded that he was shot at a distance of 15 feet and that the velocity of the bullet at impact was 1,000 feet/second. Describe the distribution of the bullet velocities. Did they conclude that the shooter was a member of a Republican terrorist organisation or a Loyalist organisation?

The distribution of this bullet velocity data should be approximately bell-shaped. This implies that the Empirical Rule should give a good estimate of the percentages of the data within each interval.

| k: # of Standard Deviations | Interval | Chebyshev's: At least Percentage | Empirical: approximate Percentage |
| --- | --- | --- | --- |
| 2 | 916, 956 | 75% | 95% |
| 3 | 906, 966 | 89% | 99.7% |
| 4 | 896, 976 | 94% | ~100% |
| 5 | 886, 986 | 96% | ~100% |
| 6 | 876, 996 | 97% | ~100% |
| 7 | 866, 1006 | 98% | ~100% |

This table quite clearly demonstrates that since the bullet velocity in the shooting was 1000 ft/sec, and since this lies more than 6 Standard Deviations away from the mean, the probability is extremely high that Republicans were not responsible for this shooting. This is especially evident from looking at the column showing percentages from the Empirical Rule. Practically 100% of bullet velocities should be between 896 and 976 ft/sec.

2.6.4 Example to illustrate the difference between Chebyshev's Rule, the Empirical Rule and some actual data

A survey was conducted to measure the height of 14 year olds. A sample of 1052 children was measured and it was found that:

$\bar{x}$ = … inches
s = … inches

A bell-shaped symmetric distribution provided a good fit to the data. Applying Chebyshev's Rule and the Empirical Rule we get:

| k: number of SDevs | Interval: ($\bar{x} - ks$, $\bar{x} + ks$) | Actual % of Obs. in Interval | Empirical Rule: % of Obs. | Chebyshev's Rule: % of Obs. |
| --- | --- | --- | --- | --- |
| 1 | … | 72.1% | 68% | >= 0% |
| 2 | … | 96.2% | 95% | >= 75% |
| 3 | … | 99.2% | 99.7% | >= 89% |

Clearly in this instance Chebyshev's Rule underestimates the proportions very severely.

2.6.5 Estimating the Standard Deviation from the Range

According to the Empirical Rule for Bell-Shaped distributions, almost all of the data should be in the interval ($\bar{x} - 3s$, $\bar{x} + 3s$). So the Range should be approximately 6s, i.e. $(\bar{x} + 3s) - (\bar{x} - 3s)$. This gives us a crude but useful measure of the Standard Deviation:

Standard Deviation ≈ Range/6

Section 2.7 Numerical Measures of Relative Standing

While it is useful to know how to measure the centre of a dataset and the variability of a dataset, many times we want to be able to compare one observation with the rest of the observations in the dataset. Is one observation larger than many others? For example, suppose you get 35% on the exam for this course. You will probably feel quite bad about your performance, but what if 90% of the class actually did worse than you? Then you might feel a bit better about your 35%. So in some cases knowing how one observation compares with others can be more useful than just knowing the value of that observation. This section will introduce some different ways of measuring Relative Standing.

2.7.1 Definitions

Percentile: For any dataset, the pth percentile is the observation which is greater in value than p% of all the numbers. Consequently this observation will be smaller than (100-p)% of the data.

Z-Score: The Z-Score of an observation is the distance between that observation and the mean, expressed in units of standard deviations. So:

Sample Z-Score for an observation x is:

$$z = \frac{x - \bar{x}}{s}$$

Population Z-Score of an observation x is:

$$z = \frac{x - \mu}{\sigma}$$

The numerical value of the Z-score reflects the relative standing of the observation. A large positive Z-score implies that the observation is larger than most of the other observations. A large negative Z-score indicates that the observation is smaller than almost all the other observations. A Z-score of zero or close to 0 means that the observation is located close to the mean of the dataset.

ExampleA: The 50th percentile of a dataset is the median (the median, remember, is the value which is larger than half of the data).

ExampleB: Dataset 15, 3, 1, 7, 5, 17, 19, 11, 9, 13

In this dataset the 80th percentile is the value 15, as 15 is greater than or equal to 80% of the data. This is easily seen if we arrange the data in ascending order:

1, 3, 5, 7, 9, 11, 13, 15, 17, 19

Exercise 2.79 in textbook

The distribution of scores on a nationally administered college achievement test has a median of 520 and a mean of 540.

a. Explain how it is possible for the mean to exceed the median for this distribution.
b. Suppose that you are told that the 90th percentile is 660. What does this mean?
c. Suppose you are told that you scored at the 94th percentile. What does this mean?

Answers:

a. The distribution is positively skewed (to the right).
b. 90% of the test scores are below 660 and 10% are above.
c. 94% of the test scores were below yours and only 6% were above.
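The percentile in ExampleB above can be found mechanically by sorting the data and counting. The sketch below uses a "greater than or equal to p% of the data" reading of the definition, which is an assumption: several percentile conventions are in use and they can give slightly different answers on small datasets. The function name `percentile` is made up for this illustration.

```python
import math


def percentile(data, p):
    """Smallest observation that is >= at least p% of the data (assumed convention)."""
    ordered = sorted(data)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based position
    return ordered[rank - 1]


dataset = [15, 3, 1, 7, 5, 17, 19, 11, 9, 13]
print(percentile(dataset, 80))
```

Under this convention the 50th percentile of the same dataset is 9, whereas the median rule from Section 2.4 (the mean of the middle two values) gives 10, which shows how the conventions can differ when n is even.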

Example D. A sample of 120 statistics students was chosen and their exam results summarised. The mean and standard deviation were shown to be:

$\bar{x}$ = 53% and s = 7%

Eric and Kenny are two students in this class. Eric's exam result was 47%; what was his Z-score? If Kenny's Z-Score is 2, what was his percentage on the exam?

Z-scores and the Empirical Rule

For a bell-shaped distribution the Empirical Rule tells us the following about Z-scores:

1. Approximately 68% of the observations have a Z-Score between -1 and 1.
2. Approximately 95% of the observations have a Z-Score between -2 and 2.
3. Approximately 99.7% of the observations have a Z-Score between -3 and 3.

Example 2.14 in the textbook: Suppose a female bank employee believes that her salary is low as a result of sex discrimination. To substantiate her belief, she collects information on the salaries of her male counterparts. She finds that their salaries have a mean of $34,000 and a standard deviation of $2,000. Her salary is $27,000. Does this information support her claim of sex discrimination?

Answer:

Calculate her Z-score with respect to her male counterparts:

$$z = \frac{x - \bar{x}}{s} = \frac{\$27{,}000 - \$34{,}000}{\$2{,}000} = -3.5$$

So the woman's salary is 3.5 Standard Deviations below the mean of the male salary distribution. If the male salaries are distributed in a bell shape, then the Empirical Rule tells us that very few salaries in this distribution should have a Z-score below -3. Therefore a Z-score of -3.5 represents either a highly unusual observation from the male salary distribution or an observation from a different distribution.

Do you think her claim of sex discrimination is justified?

Answer: We need more data: on the collection technique the woman used, the length of time she has been in her job, her competence at her job, etc. If she truly chose a representative sample, if she had been employed there as long as the others and if she was good at her job, then one might conclude that she was discriminated against.
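The Z-score calculations in this section, including the questions about Eric and Kenny in Example D, can be sketched as two small helper functions. The names `z_score` and `value_from_z` are introduced here for illustration only.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean in units of standard deviations."""
    return (x - mean) / sd


def value_from_z(z, mean, sd):
    """Invert the Z-score formula to recover the original observation."""
    return mean + z * sd


salary_z = z_score(27000, 34000, 2000)   # bank employee vs male colleagues
eric_z = z_score(47, 53, 7)              # Eric's exam result of 47%
kenny_mark = value_from_z(2, 53, 7)      # Kenny's mark given a Z-score of 2

print(salary_z, round(eric_z, 2), kenny_mark)
```

Eric's result is a little under one standard deviation below the class mean, and a Z-score of 2 corresponds to a mark of 67% for Kenny.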


More information

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve

More information

AP * Statistics Review. Descriptive Statistics

AP * Statistics Review. Descriptive Statistics AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

More information

Week 3&4: Z tables and the Sampling Distribution of X

Week 3&4: Z tables and the Sampling Distribution of X Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal

More information

The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)

The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175) Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,

More information

Mind on Statistics. Chapter 2

Mind on Statistics. Chapter 2 Mind on Statistics Chapter 2 Sections 2.1 2.3 1. Tallies and cross-tabulations are used to summarize which of these variable types? A. Quantitative B. Mathematical C. Continuous D. Categorical 2. The table

More information

Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

More information
