CALCULATIONS & STATISTICS



CALCULATION OF SCORES

Conversion of 1-5 scale to 0-100 scores

When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents rate your services on the survey on a 1-5 scale. We convert respondents' ratings because most people find it easier to interpret scores from 0 to 100.

Rating:   VERY POOR   POOR   FAIR   GOOD   VERY GOOD
Scale =       1         2      3      4        5
Score =       0        25     50     75      100

Calculation of Mean Scores

Each patient has a score for every question that was answered. For each patient, a section score is calculated as the mean of all the question scores in that particular section. Similarly, for each patient, an overall score is calculated as the mean of that patient's section scores.

[Table: Sample Data for 5 Patients. Columns: Patient, A1, A2, A3, B1, B2, B3, B4, C1, C2, SECT A, SECT B, SECT C, OVERALL. Bottom rows: Mean; Mean rounded to one decimal place.]

For the data listed in the above table, Patient #1's score for section A (SECT A) is the mean of Patient #1's scores for the questions in section A (i.e., A1, A2, A3):

(A1 + A2 + A3) ÷ 3 = SECT A
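The conversion and the section-score calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names and the three ratings are hypothetical, and the linear mapping (1 becomes 0, 5 becomes 100) is the one implied by the scale shown above.

```python
def rating_to_score(rating):
    """Convert a 1-5 survey rating to a 0-100 score (1 -> 0, 5 -> 100)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return (rating - 1) * 25


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical ratings one patient gave to questions A1, A2, A3.
section_a_ratings = [4, 5, 4]

# Step 1: convert each rating to a 0-100 question score.
section_a_scores = [rating_to_score(r) for r in section_a_ratings]

# Step 2: the patient's section score is the mean of the question scores.
sect_a = mean(section_a_scores)

print(section_a_scores)
print(round(sect_a, 2))
```

The same `mean` helper would then be applied to the patient's section scores to obtain that patient's overall score.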

Similarly, Patient #1's OVERALL score is calculated:

(SECT A + SECT B + SECT C) ÷ 3 = OVERALL

The section scores displayed in a report are the means of all the patients' scores for that section, rounded to one decimal place. (See bold face type at the bottom of SECT A, SECT B, and SECT C columns.) Similarly, the overall facility rating score displayed in a report is the mean of all patients' overall scores, rounded to one decimal place. (See bold face type at the bottom of OVERALL column.)

(sum of the five patients' OVERALL scores) ÷ 5 = overall facility score

In a perfect world where every patient fills in every question, you could also calculate the overall hospital rating by taking the mean of the section averages displayed in your report, and you would get the same result:

(mean SECT A + mean SECT B + mean SECT C) ÷ 3 = overall facility score

HOWEVER, there is always some missing data (i.e., patients do not fill in every question on the survey). Patients may even skip an entire section if the questions do not apply to them. Thus, the overall score in your report might represent 250 patients, but a particular section score might represent only 245 patients. When this happens, it is impossible to calculate your overall mean score from the section mean scores displayed in your report. (See example below.)

[Table: Sample Data for 5 Patients with Missing Data. Columns: Patient, A1, A2, A3, B1, B2, B3, B4, C1, C2, SECT A, SECT B, SECT C, OVERALL. Bottom rows: Mean; Mean rounded to one decimal place.]

In the above table you can see that the overall hospital rating score that would be displayed in a report is the mean of all patients' overall scores. (See bold face type at the bottom of OVERALL column.)
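The effect of missing data can be illustrated with a small sketch. The five patients and their section scores below are hypothetical (they are not the values from the sample table), and `None` is used here to mark a skipped section.

```python
def mean(values):
    """Mean of the values that are present; None entries are skipped."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical 0-100 section scores for five patients; None = section skipped.
patients = [
    {"A": 100.0, "B": 75.0, "C": 50.0},
    {"A": 75.0, "B": 75.0, "C": 75.0},
    {"A": 50.0, "B": None, "C": 100.0},
    {"A": 100.0, "B": 100.0, "C": None},
    {"A": 25.0, "B": 50.0, "C": 75.0},
]

# Reported overall score: mean of each patient's own overall score.
overall_per_patient = [mean(list(p.values())) for p in patients]
facility_overall = mean(overall_per_patient)

# Shortcut that only works with complete data: mean of the section means.
section_means = {s: mean([p[s] for p in patients]) for s in ("A", "B", "C")}
mean_of_section_means = mean(list(section_means.values()))

print(round(facility_overall, 1))
print(round(mean_of_section_means, 1))
```

With these made-up numbers the two results differ, because the sections with a skipped answer are averaged over four patients while the overall score is averaged over all five.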

(sum of the five patients' OVERALL scores) ÷ 5 = overall facility score

Because of missing data, you cannot obtain the overall hospital rating score by taking the mean of the section averages displayed in the report; that calculation gives a different result:

(mean SECT A + mean SECT B + mean SECT C) ÷ 3 ≠ overall facility score

Note: The difference appears small in this example because there are so few patients in the sample. In practice the difference can be much greater.

THE MEAN

The Mean (denoted X̄, and referred to as "x-bar") is a measure of central tendency representing the arithmetic center of a group of scores. In plainer language, it is the average. In terms of your Press, Ganey report, the mean gives you information about the average score for: an individual question, a section on the survey, the overall satisfaction score for your facility, or the satisfaction scores for all facilities in our database.

The mean is calculated by summing all scores and dividing by the number of scores. For example, the mean of 88.0, 87.5, and 86.5 would be calculated:

(88.0 + 87.5 + 86.5) ÷ 3 = 87.33

A more general formula can be written:

$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$

The mean can also be interpreted as the balancing point. Each score can be thought of as a one pound weight, and the range of values of scores could be plotted along a supporting rod. The balancing point for the rod would be the mean of the values of the scores. Interestingly, physicists determine the precise balancing point, or center of gravity, using the same formula that statisticians use to find the mean.
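As a quick check of the formula and of the balancing-point interpretation, the short sketch below computes the mean of the three example scores and shows that the distances above and below the mean cancel out. The helper name `mean` is just for illustration.

```python
def mean(scores):
    """Sum all scores and divide by the number of scores."""
    return sum(scores) / len(scores)


scores = [88.0, 87.5, 86.5]
x_bar = mean(scores)

# At the balancing point, the deviations from the mean sum to zero
# (up to floating-point rounding).
deviations = [s - x_bar for s in scores]

print(round(x_bar, 2))
print(round(sum(deviations), 10))
```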

The mean, or balancing point, is easily influenced by outliers (i.e., scores that are much higher or much lower than the rest of the scores). If you put a weight on the balancing rod in about the same place as all the other weights (e.g., 87.0), the rod wouldn't tip very much and you wouldn't have to move the balancing point very far. If, however, you placed a weight way out on the end of the rod (e.g., 90.0), the rod would tip considerably. You would need to move the balancing point closer to the extreme value in order to make the rod balance. The same holds true with calculating the mean. If you have a group of scores that are all within a close range, the mean value will likely be towards the center of that range. However, having just one score that is much higher or lower than the rest can pull the mean up or down, respectively.

Note: Remember that the mean score is an average, not a percent. Say you have a sample of five patients who rate skill of nurses as follows: 5 (very good), 4 (good), 4 (good), 4 (good), 5 (very good). Those ratings can be converted into the following scores: 100, 75, 75, 75, 100. The average of these scores would be 85 ((100 + 75 + 75 + 75 + 100) ÷ 5). You wouldn't want to say that your patients were 85% satisfied with the skill of nurses, because in fact all (100%) of your patients rated the nurses good or very good, with 60% giving a rating of good and 40% giving a rating of very good.

THE MEDIAN & PERCENTILE RANKING

The Median

The median is a measure of central tendency representing the mid-point in a distribution of scores, or the point at the 50th percentile. In other words, the median is the middle; it is the score that splits the distribution in half. Fifty percent of scores will fall above the median and fifty percent of scores will fall below the median.

The median is determined by ordering the scores from highest to lowest and finding the middle value. If there is an odd number of scores, the median is the score in the middle. If there is an even number of scores, then there is no score in the middle, so the median is the average of the two scores closest to the middle.

Comparison of the Mean and the Median

The mean and the median are each measures of central tendency, but they describe the center in different ways. The mean is the average of the scores, whereas the median is the middle of the scores. The average is not always in the middle of a distribution. Another difference between the two is that the mean is influenced by outliers (i.e., scores that are much higher or lower than the rest), but the median is not. To illustrate this point, let's look at an imaginary sample of ten facilities' overall satisfaction scores:

[Table: overall satisfaction scores for the ten hospitals (Hos #), ordered from highest to lowest.]

Finding the Mean: The mean of these scores is obtained by summing the scores and dividing by the number of scores (ten):

Mean = (sum of the ten hospital scores) ÷ 10 = 85.9

Finding the Median: To find the median, the hospital scores are ordered from highest to lowest. In this case there is an even number of scores, so we must take the average of the two closest to the middle (i.e., 87.1 and 86.2):

Median = (87.1 + 86.2) ÷ 2 = 86.65

Notice the relationship between the mean and the median. In this example, the mean is lower than the median because an outlier (the low score of 69.5) pulled the mean down a bit. The low outlier did not influence the median.

If you were the facility with a score of 86.2, you would notice that your facility score is above the mean (85.9) for the database but below the median (86.65). The median for the database is equivalent to the 50th percentile in the percentile ranking, so that facility would be below the 50th percentile. So if your facility's score is above the mean but below the 50th percentile, it indicates that there are low outliers in the database (facilities whose scores are considerably lower than all the other scores). Conversely, if your score is below the mean but above the 50th percentile, it indicates that there are high outliers in the database (facilities whose scores are considerably higher than all the other scores).

Percentile Ranking

Percentile ranking is a strategy for assigning a series of values to divide a distribution into equal parts. More specifically, the percentile ranking tells you the proportion of scores in the database which fall below your individual facility's score. The median of the database is the score associated with the 50th percentile. Percentile rankings therefore give you information about where your hospital stands in relation to the median of the database.

Keep in mind that these numbers are percentiles, not actual scores. For example, a 68 in the percentile rank column of your report means that the hospital is in the 68th percentile for this item. Translation: this particular hospital scores higher than 68% of the hospitals in the database and lower than 32% of the hospitals in the database. For more information on percentile rankings, please see the section on the Standard Deviation and the discussion of how percentile rankings relate to the standard deviation.

THE STANDARD DEVIATION & STANDARD ERROR (SIGMA)

What is the Standard Deviation?

The standard deviation is a measure of variability of scores around the mean. Large numbers for the standard deviation indicate that the data are very spread out (i.e., there is a lot of variability). Conversely, a very small standard deviation would indicate that most of the data are very similar to the mean (i.e., less variability).

How do you determine the Standard Deviation?

The standard deviation is calculated using the formula:

$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$

In plainer terms, this formula tells you to (see example below):

1. Find the mean of the sample you are interested in (bottom cell of second column).
2. Find the distance from the mean for each individual score. You do this by subtracting the mean from each hospital score (see third column).
3. Square all the distances from the mean (fourth column).
4. Add all of the squared distances from the mean (bottom cell of fourth column).
5. Divide the sum of the squared distances by the number of scores you are working with (in this example, there are five scores). The number you get is called the variance.
6. Take the square root of your answer from step 5. This is the Standard Deviation!
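The six steps above can be sketched directly in Python. The five scores are hypothetical and chosen only so that the arithmetic is easy to follow; the function divides by n, as the steps describe.

```python
import math


def standard_deviation(scores):
    """Standard deviation computed by the six steps, dividing by n."""
    n = len(scores)
    mean = sum(scores) / n                     # step 1: find the mean
    distances = [x - mean for x in scores]     # step 2: distance from the mean
    squared = [d ** 2 for d in distances]      # step 3: square the distances
    total = sum(squared)                       # step 4: add the squared distances
    variance = total / n                       # step 5: divide by the number of scores
    return math.sqrt(variance)                 # step 6: take the square root


# Hypothetical scores for five facilities (mean 83.5).
scores = [82.0, 83.0, 83.5, 84.0, 85.0]
sd = standard_deviation(scores)
print(sd)
```

For these scores the squared distances sum to 5.0, so the variance is 1.0 and the standard deviation is 1.0.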

Example 1

[Table: Example 1. Columns: FACILITY; SCORE (X); SCORE - MEAN (X - X̄); (SCORE - MEAN)² ((X - X̄)²), for five facilities. Bottom cells: X̄ = 83.5 and the sum of (X - X̄)².]

Variance = sum of (X - X̄)² ÷ 5
Standard deviation = square root of the variance = 9.64

For comparison, a second example (Example 2) is provided that shows the computation of the standard deviation for a different set of five scores. Notice that the mean is the same as in the first example (83.5). However, the standard deviation is considerably smaller (1.32) because the scores are much closer to the mean than were the scores in the first example. In other words, two sets of data with the same mean won't necessarily be identical to one another.

Example 2

[Table: Example 2. Columns: FACILITY; SCORE (X); SCORE - MEAN (X - X̄); (SCORE - MEAN)² ((X - X̄)²), for five facilities. Bottom cells: X̄ = 83.5 and the sum of (X - X̄)² = 8.75.]

Variance = 8.75 ÷ 5 = 1.75
Standard deviation = square root of 1.75 = 1.32

Importance of the Standard Deviation

Knowing the standard deviation of the database is important because it allows you to compare your facility's score to the larger group. We can do this based on the normal distribution. The normal distribution (below) is a bell shaped curve representing a theoretical distribution of data in which the mean, median, and mode (the score that occurs most frequently) have the same value. In the normal distribution, 68% of all data falls within one standard deviation of the mean of the distribution. Further, 95% of the data falls within two standard deviations of the mean.

[Figure: The Normal Distribution. Horizontal axis marked at -2 S.D., -1 S.D., X̄, +1 S.D., +2 S.D.; 68.26% of the area lies within one standard deviation of the mean and 95.44% within two.]
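The 68% and 95% figures can be checked numerically. The sketch below is one way to do it, using the error function from Python's standard library; the function name `proportion_within` is only illustrative.

```python
import math


def proportion_within(k):
    """Proportion of a normal distribution lying within k standard
    deviations of the mean (on both sides)."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    print(k, round(100 * proportion_within(k), 2))
```

The results agree with the percentages shown on the figure to within rounding in the second decimal place.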

Standard Deviation and the Percentile Ranking

Based upon information about the standard deviation, you can figure out what proportion of scores in the database are below your facility's score (see below).

[Figure: normal curve divided at -2 S.D., -1 S.D., X̄, +1 S.D., +2 S.D. Areas from left to right: 2.3%, 13.6%, 34.1%, 34.1%, 13.6%, 2.3%, with 50% of the area on each side of X̄. The regions are marked <<, <, >, >>.]

For example, if your score is one standard deviation above the database mean (>), you can deduce (based on the normal distribution) that 84.1% of the scores in the database are below the score that your facility achieved. Similarly, if your score is two standard deviations above the database mean (>>), you can deduce that 97.7% of the scores in the database are below the score your facility achieved.

However, you may look at your percentile ranking and find that the proportions based on the standard deviation do not exactly match up with the proportions reported in the percentile rank columns. This is attributable to the fact that the standard deviation is based on the mean, whereas the percentile ranking is based on the median. Outliers (scores that are much higher or lower than most) will influence the mean but will have no effect on the median. If you had a distribution that was perfectly normal, where the mean and the median were equal, then your proportions based on the standard deviation would equal the proportions based on the percentile ranking.

What is the Standard Error or Sigma?

So far we've been talking about the standard deviation, which we've defined as the average distance that any individual score is from the distribution mean. In the examples above we looked at the distance between individual hospitals and the comparative database mean, but you can also calculate the standard deviation for your own sample of data. In this case you would be looking at the average distance that your patients' scores are from your own facility sample mean.

Remember that the facility's score is derived from the sample of patients who were sent surveys and chose to return them in the last report cycle. The population of patients would be all of the patients who were served during the last report cycle. Imagine an ideal world with unlimited funds for research, patients who could always be reached, and patients who always responded to the surveys. In this ideal world you could randomly select many samples from your total patient population. So if your hospital saw 50 patients in the last quarter, you could randomly select 10 to be sample 1, then randomly select a different set of 10 to be sample 2, and so on until you had a variety of samples from the same population. Each sample would have a mean score, the average of the scores for all the patients in that sample. You could calculate how spread out the sample means were around the mean for the total population using the same formula that we used to find the standard deviation. Instead of calling this the standard deviation of the sample means (or the average distance that multiple sample means are from the population mean), this is called the Standard Error (SE) or Sigma (σ).

Now the problem with this method of calculating the Standard Error is that we don't live in that ideal world with unlimited funds, patients who can always be reached, and patients who always fill out surveys. So instead we estimate the Standard Error of the sampling distribution using the standard deviation of the one sample that we do have. This is done using the following formula:

$SE = \frac{S}{\sqrt{n}}$

Where
S = standard deviation of the sample
n = number of observations (patients) in the sample

One reason the standard error is important is that it is used in the calculation of confidence intervals. Please see the section on confidence intervals for a continued discussion of this statistic.

Note: Often, clients state that they are interested in knowing the standard error when in fact what they are looking for is the standard deviation or the confidence interval. If you think you want to know about the standard error, ask yourself the following questions:

1. Do I want to know how much variability there is in scores or how spread out the scores are around the mean? If yes, then I really want to know about the Standard Deviation.
2. Do I want to know how close my score, based on a sample, is to what the score would be if we really had data from every single patient? If yes, then I really want to know about the Confidence Interval.
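The relationship between the two ways of arriving at the standard error can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the population of 100 patient scores is invented, samples are drawn with replacement for simplicity, and the seed is fixed only so the run is repeatable.

```python
import math
import random
import statistics


def standard_error(sample_sd, n):
    """Estimate of the standard error: S divided by the square root of n."""
    return sample_sd / math.sqrt(n)


# Hypothetical population of 100 patient scores on the 0-100 scale.
population = [100] * 40 + [75] * 35 + [50] * 15 + [25] * 7 + [0] * 3

random.seed(0)
n = 25  # patients per sample

# "Ideal world" method: draw many samples and look at how spread out
# the sample means are.
sample_means = []
for _ in range(2000):
    sample = random.choices(population, k=n)  # drawn with replacement
    sample_means.append(sum(sample) / n)
observed = statistics.pstdev(sample_means)

# Formula method: standard deviation divided by the square root of n.
predicted = standard_error(statistics.pstdev(population), n)

print(round(observed, 2), round(predicted, 2))
```

The two numbers should be close to each other, which is why the formula is a practical substitute when only one sample is available.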

CONFIDENCE INTERVAL

What is a Confidence Interval?

The confidence interval is the region in which a population score is likely to be found. In other words, it is the range around your sample mean in which you would expect to find your true population mean. When looking at the mean for a sample of a population, you get a very good estimate of the mean for the whole population, but you don't come up with the exact number. The only way you could truly know the population mean is if you actually had data from everyone in the population (i.e., every single patient at your facility received and completed a survey). The mean score that you get for the sample of patients who returned the surveys is an accurate reflection of the opinions of those in the sample and is considered to be an estimate of the score that would characterize the whole population.

The confidence interval tells you the range of values, around your sample mean, that would be expected to contain the mean for the actual population. For example, if your sample mean is 80 and your confidence interval is 2, then you would say that the estimate for the mean of your entire patient population is 80 ± 2, which would be 78-82. So the confidence interval answers the question: What is the range of values around our sample mean that we are 95% sure contains the actual population mean?

How do you calculate a confidence interval?

A 95% confidence interval is calculated using the following formula:

$C.I. = \bar{X} \pm 1.96 \times SE(\bar{X})$

Which is read as "the sample mean plus or minus 1.96 times the standard error."

Remember that if you wanted to know where 95% of your patients' scores were, you could take your sample mean plus or minus two standard deviations. Calculating a confidence interval uses the same logic but applies it differently. For the confidence interval we are not trying to find a range within a sample that contains 95% of scores, but the range within a population that contains 95% of sample mean scores.

The standard error (SE) tells you the average distance that a sample mean would be from the population mean (see discussion of the SE in the section on standard deviations); it's like saying the standard deviation of the sample means around the population mean. We could just find the interval that is plus or minus two standard errors (like we do with the standard deviation). However, if you remember from the picture of the normal distribution, two standard deviations actually gives you slightly more than 95%. It gives you 95.44%. If we want to know exactly 95%, then we would want slightly less than two standard errors, which is where the 1.96 in the formula comes from.

One last issue. We don't know the exact standard error of the population because we don't have multiple samples drawn from the same population at the same time. We have just one sample from the population. However, we can estimate the standard error of the population based on the standard deviation of our one sample using the following formula:

$SE = \frac{S}{\sqrt{n}}$

Where
S = standard deviation of the sample
n = number of observations (patients) in the sample

So if a facility had an overall rating of 84.24 with a standard deviation of 9.5 and an n of 225, we could calculate the standard error as:

SE = S ÷ √n = 9.5 ÷ √225 = 9.5 ÷ 15 = 0.63

And the confidence interval would be calculated:

C.I. = X̄ ± 1.96 × SE = 84.24 ± 1.96(0.63) = 84.24 ± 1.24

Thus, with a sample mean of 84.24, we are 95% confident that the actual population mean is between 83.00 and 85.48.

Note: The Confidence Intervals that are listed in your report under the heading Facility Statistical Analysis can be used to estimate the amount of change in mean score you would need in order to show a statistically significant improvement. In order to create this estimate you must multiply the number in the 95% Confidence Interval column by 1.41 and add the result to your current mean. So, if you had a mean of 84.24 and a confidence interval of 1.24, you would multiply the 1.24 by 1.41. The result, 1.75, is an estimate of how much you would have to increase your mean score (i.e., from 84.24 to 85.99). This estimate is just a guide and assumes that your sample for the next report has exactly the same n and standard deviation as the data for the current report.
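The worked example above can be reproduced with a few lines of Python. This is a sketch only; the function name is hypothetical, and the inputs are the mean, standard deviation, and n from the example.

```python
import math


def confidence_interval_95(mean, sd, n):
    """Return (low, high, half_width) of the 95% confidence interval."""
    se = sd / math.sqrt(n)      # standard error estimated from one sample
    half_width = 1.96 * se      # the "plus or minus" part
    return mean - half_width, mean + half_width, half_width


low, high, half_width = confidence_interval_95(84.24, 9.5, 225)
print(round(low, 2), round(high, 2))

# Rough guide from the note: increase needed for a significant improvement,
# assuming the next sample has the same n and standard deviation.
needed_increase = 1.41 * half_width
print(round(needed_increase, 2))
```

The multiplier 1.41 is approximately the square root of 2, which is what the t-test formula reduces to when both samples have the same n and standard deviation.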

t-tests

The statistical procedure used to determine if scores on a current report are significantly different from the scores on the last report is the t-test. The calculation of a t-test measures the difference between sample means, taking into account the size and variability of each of the samples. It is important to note that the result of the t-test does not simply tell you what the difference is; it tells you how confident you can be that the difference is real and not due to random error. Conventionally, if you are at least 95% confident that the difference is real and thus not due to random error, then the difference is said to be statistically significant.

How do you calculate a t-test?

t-tests are calculated using the following formula:

$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$

X̄1 = the mean of sample 1
X̄2 = the mean of sample 2
S1² = the variance (Stand. Dev.²) of sample 1
n1 = the number of observations in sample 1
S2² = the variance (Stand. Dev.²) of sample 2
n2 = the number of observations in sample 2

In your report...

When a score on a Press, Ganey report is found to be significantly different from the score on the last report, the score is highlighted with a symbol indicating one of the following:

+ 95% certainty of significant increase (t > 1.96)
99% certainty of significant increase (t > 2.576)
- 95% certainty of significant decrease (t < -1.96)
99% certainty of significant decrease (t < -2.576)

Two factors can influence whether or not a difference is found to be significant: the size of the samples (n) and the variability of the samples.

Size: It is less likely that you will be confident enough to call a difference significant (i.e., not due to error) if there is less data available (lower n's).

Variability: It is less likely that you will be confident enough to call a difference significant (i.e., not due to error) when there is greater variability (i.e., the data are more spread out around the mean of the sample; larger standard deviations).

So if you have what appear to be large differences in scores that are not marked with an indicator of statistically significant difference, it is likely that either the n's were too low or the standard deviations were too large for you to be confident that the difference was real and not due to random error. Conversely, if you have what appear to be small differences in scores that are marked with an indicator of statistically significant difference, it is likely that either the n's were so high or the standard deviations were so small that it was possible to be very confident that the difference was real and not due to random error.

How can we be sure that differences in scores are significant at the .05 level, especially when sample sizes are small?

Remember that a t-test takes into account both the size of the samples and the variability of the samples, so if the results of a t-test indicate that a difference is indeed significant, then you can be at least 95% confident that the difference is real and not due to random error, no matter how small your samples are.
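The t formula and the reporting thresholds can be sketched as follows. The function names and the example numbers are hypothetical, and the sketch assumes that sample 1 is the current report and sample 2 is the last report, so that a positive t means an increase.

```python
import math


def t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two sample means divided by the combined
    standard error, following the formula above."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def describe(t):
    """Map a t value onto the thresholds listed for the report."""
    if t > 2.576:
        return "99% certainty of significant increase"
    if t > 1.96:
        return "95% certainty of significant increase"
    if t < -2.576:
        return "99% certainty of significant decrease"
    if t < -1.96:
        return "95% certainty of significant decrease"
    return "no significant difference"


# Hypothetical current report (sample 1) versus last report (sample 2).
t = t_statistic(86.5, 9.5, 225, 84.24, 9.5, 225)
print(round(t, 2), describe(t))

# The same difference with far fewer patients is no longer significant.
t_small = t_statistic(86.5, 9.5, 20, 84.24, 9.5, 20)
print(round(t_small, 2), describe(t_small))
```

The second call illustrates the point about sample size: the difference in means is identical, but with only 20 patients per sample the t value falls well below 1.96.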

CORRELATIONS

What is a correlation?

A correlation tells you the strength of the relation between variables. In other words, a correlation tells us how much a change in one variable (e.g., a question score) is associated with a concurrent, systematic change in another variable (e.g., overall satisfaction). A correlation represents the strength of the relationship between two variables numerically, with a correlation coefficient (called r) which can range from -1.0 to +1.0.

A positive correlation coefficient indicates that as the value of one variable increases, the value of the other variable also increases. For example, the more you exercise, the greater your endurance. A negative correlation coefficient indicates that as the value of one variable increases, the value of the other variable decreases. For example, the more you smoke cigarettes, the less lung capacity you have.

It is important to recognize that when two variables are correlated it means that they are related to each other, but it does not necessarily mean that one variable influences or predicts the other. This is the basis of the statement, "Correlation does not imply causation!"

How can you tell how strong the relationship is between two variables?

The closer a correlation is to 1 (either positive or negative), the stronger the relationship. The closer a correlation is to 0 (either positive or negative), the weaker the relationship.

How is a correlation calculated?

Correlation coefficients are calculated using the formula below. This statistic basically assesses the degree to which variables X and Y vary together (the numerator), taking into account the degree to which each variable varies on its own (the denominator). Here x and y are each patient's deviations from the mean of X and the mean of Y.

$r = \frac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}}$

Example of calculating the correlation between X (Overall rating of care provided at this facility) and Y (Likelihood of recommending this hospital). Please see the table on the following page.

Note: Each row represents a patient's data. For example, the data in the first row indicates that a patient at this hospital circled a 3 (fair) for the question "Overall rating of care given at this hospital," and circled a 4 (good) for the question "Likelihood of recommending this hospital to others."

1. Determine the mean of X and the mean of Y (bottom of columns 1 and 2).
2. For each value of X (each patient), determine how much it deviates (differs) from the mean of X. This deviation will be called x (in lower case).
3. For each value of Y (each patient), determine how much it deviates (differs) from the mean of Y. This deviation will be called y.
4. For each row, multiply x and y.
5. For each row, square x, that is, multiply x by itself.
6. For each row, square y, that is, multiply y by itself.
7. Sum all the xy scores (bottom of column 5).
8. Sum all the x² scores (bottom of column 6).
9. Sum all the y² scores (bottom of column 7).

[Table 1. Columns: X; Y; x = (X - X̄); y = (Y - Ȳ); xy; x²; y². Bottom row: X̄ = 4.4; Ȳ = 4.8; Σxy = 1.4; Σx² = 3.2; Σy² = 0.8.]

Fill in the formula below with the numbers:

$r = \frac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}} = \frac{1.4}{\sqrt{3.2}\,\sqrt{0.8}} = \frac{1.4}{1.79 \times 0.89}$

which equals .88.

A correlation of .88 indicates a strong relationship between two variables. Additionally, the correlation is positive, indicating that as scores on "Overall rating of care given at this hospital" increase, scores on "Likelihood of recommending this hospital" also increase.

Graphed representations of correlations

The first graph depicts the positive correlation between Friendliness of Nurses and Likelihood of Recommending Hospital. The positive correlation coefficient (r = .79) means that high scores on Friendliness are associated with high scores on Likelihood to Recommend, whereas low scores on Friendliness are associated with low scores on Likelihood to Recommend. You can tell at a glance that the correlation is positive because the line slants upward across the graph.

[Figure: Friendliness and Recommendations. Vertical axis: Likelihood of Recommending; horizontal axis: Friendliness of Nurses. 464 hospitals, 3rd quarter 1996.]

In contrast, the second graph depicts the negative correlation between hospital bed size and hospital overall satisfaction score. The negative correlation coefficient (r = -.46) means that hospitals with more beds tend to have lower scores, whereas hospitals with fewer beds have higher scores. You can tell at a glance that the second correlation is negative because the line slants downward across the graph.

[Figure: Overall Mean by Bed Size. Vertical axis: Average Common Question Mean; horizontal axis: Total Facility Beds.]

Regression vs. Correlation

Press, Ganey reports list correlation coefficients between each question and the overall satisfaction score (an average of all the other items in a questionnaire). This gives a measure of the relative importance of each individual question for overall satisfaction. Clients sometimes ask why we do not use multiple regression to demonstrate these relationships.

When the items of a survey are highly correlated, it is impossible to separate the effect of one item from that of others with any degree of precision. Because this problem (termed "multicollinearity") violates one of the assumptions of multiple regression, Press, Ganey does not report multiple regression analyses performed to relate specific questions to overall satisfaction.

Like regression analyses, correlations are statistical measures of association that describe the strength of relationship, or association, between factors, such as a doctor's courtesy and patients' overall satisfaction with care from that physician. Correlations have the advantage of using all the information from each respondent to a survey, whereas regression analyses eliminate all information from any respondents who do not answer every question.
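The deviation-score calculation of r walked through earlier with Table 1 can be sketched in Python. The five pairs of ratings below are an assumption: only the first row (3 and 4) is stated in the text, and the remaining rows were chosen so that the sums match those reported for the table (Σxy = 1.4, Σx² = 3.2, Σy² = 0.8).

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    x = [v - mean_x for v in xs]              # deviations from the mean of X
    y = [v - mean_y for v in ys]              # deviations from the mean of Y
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    return sum_xy / (math.sqrt(sum_x2) * math.sqrt(sum_y2))


# Hypothetical 1-5 ratings from five patients (first row as in the text).
overall_rating = [3, 4, 5, 5, 5]          # X
likelihood_recommend = [4, 5, 5, 5, 5]    # Y

r = correlation(overall_rating, likelihood_recommend)
print(r)
```

With unrounded square roots the result is about 0.875, which agrees with the .88 obtained in the worked example from the rounded values 1.79 and 0.89.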

PRIORITY INDEX

What is the priority index?

The priority index is a way of combining two very important pieces of information: (1) the actual score achieved on a particular question, and (2) the degree to which that particular question is associated with overall satisfaction. Combining these two pieces of information helps a facility to know where efforts should be placed for quality improvement.

For example, one question might be very low in score (e.g., quality of food) but not particularly associated with overall satisfaction. Because it is not highly associated with satisfaction, the facility might choose to place quality improvement resources elsewhere. Conversely, a question might be very highly associated with overall satisfaction (e.g., overall cheerfulness of hospital) but not low in score. Attempting to raise the score would probably be difficult and may perhaps be unnecessary if most of your patients are very satisfied already.

How is the priority index calculated?

The priority index is derived through a three-step process. For the purpose of this example, let's assume that the survey has 50 questions.

1. SCORE RANK

Questions are ordered from the highest score down to the lowest score. Each question is then given a score rank; the highest mean score gets a rank of 1, the second highest score gets a rank of 2, the third highest score gets a rank of 3, and so on down the line until the lowest score is given a rank of 50.

To remember the meaning of the score rank, it may help to think of a high score as a small issue and a low score as a big issue, something the facility should be concerned about. The high score, a small issue, has a small score rank (e.g., 1, 2, 3...). Conversely, a low score, a big issue of concern, has a big score rank (e.g., 48, 49, 50). The score rank for each question appears in parentheses to the right of the mean score column on the priority index page.

2. CORRELATION RANK

Next, questions are ordered from the least correlated with overall satisfaction to the most correlated with overall satisfaction. Each question is then given a correlation rank. The question that is the least correlated with satisfaction gets a rank of 1, the question that is the second least correlated with satisfaction gets a rank of 2, and so on down the line until the question that is the most correlated with satisfaction gets a rank of 50.

Again, it helps to keep in mind what would be a small issue and what would be a big issue. A question that is not very correlated with satisfaction would be a small issue, so it would have a small rank (e.g., 1, 2, 3...), whereas a question that was highly correlated with satisfaction would be a big issue, something to pay attention to, and would have a big rank (e.g., 48, 49, 50). The correlation rank for each question appears in parentheses to the right of the correlation coefficient column on the priority index page.

3. COMPUTING THE PRIORITY INDEX

The priority index is derived by adding the score rank (from step 1) to the correlation rank (from step 2). The questions are then ordered on the priority index page from the largest priority index score at the top to the lowest priority index score at the bottom.

To be first in the priority index list, a question would have to have two big issues: a big score rank (that means a low score) and a big correlation rank (that means a high association with satisfaction). Questions that appear at the bottom of the priority index list would have two small issues: a high overall score (which gets a small score rank) and a low association with satisfaction (which gets a small correlation rank). Questions that appear in the middle of the priority index list would likely have just one big issue (either a low score or a high association with satisfaction).

PRIORITY INDEX = SCORE RANK + CORRELATION RANK

High Priority (top of priority index page)
  = A big issue! Big score rank (low actual score)
  + A big issue! Big correlation rank (high correlation with satisfaction)

Medium Priority (middle of priority index page)
  = A big issue! Big score rank (low actual score)
  + A small issue. Small correlation rank (low correlation with satisfaction)
  or
  = A small issue. Small score rank (high actual score)
  + A big issue! Big correlation rank (high correlation with satisfaction)

Low Priority (bottom of priority index page)
  = A small issue. Small score rank (high actual score)
  + A small issue. Small correlation rank (low correlation with satisfaction)

External Priority Indices

Priority indices are also provided with an external focus. The internal priority index, as described above, uses your question mean scores as an internal measure of performance. The external priority indices use your question percentile ranks as an external measure of performance. The same basic steps are used to create the external priority indices. However, in the first step, the questions are ranked according to their percentile ranks (highest to lowest) instead of according to their mean scores.
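The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's actual implementation: the function names, the four sample questions, and their scores and correlations are all hypothetical, and ties are broken by the original question order because the text does not say how ties are handled.

```python
def rank_positions(values, descending):
    """Return a 1-based rank for each value.

    Ties keep their original order (an assumption; the source text
    does not describe tie handling).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def priority_index(questions, performance, correlations):
    """Return (question, priority index, score rank, correlation rank) rows,
    ordered from the largest priority index to the smallest."""
    # Step 1: highest performance value gets rank 1, lowest gets the biggest rank.
    score_ranks = rank_positions(performance, descending=True)
    # Step 2: least correlated gets rank 1, most correlated gets the biggest rank.
    corr_ranks = rank_positions(correlations, descending=False)
    # Step 3: add the two ranks and list the largest sums first.
    rows = [
        (q, s + c, s, c)
        for q, s, c in zip(questions, score_ranks, corr_ranks)
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical sample data (not taken from any real report).
questions = ["Quality of food", "Cheerfulness of hospital", "Wait time", "Nurse courtesy"]
mean_scores = [62.0, 90.5, 70.3, 85.1]
correlations = [0.20, 0.65, 0.55, 0.40]

for question, index, score_rank, corr_rank in priority_index(
    questions, mean_scores, correlations
):
    print(f"{question}: priority index {index} = {score_rank} + {corr_rank}")
```

With this made-up data, "Wait time" comes out on top because it has both a fairly low score and a fairly high correlation, while "Quality of food" (low score, low correlation) and "Cheerfulness of hospital" (high score, high correlation) each have only one big issue and land in the middle. For an external priority index, the same sketch would be called with question percentile ranks in place of the mean scores.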


More information

CHAPTER 3 CENTRAL TENDENCY ANALYSES

CHAPTER 3 CENTRAL TENDENCY ANALYSES CHAPTER 3 CENTRAL TENDENCY ANALYSES The next concept in the sequential statistical steps approach is calculating measures of central tendency. Measures of central tendency represent some of the most simple

More information

Statistical inference using bootstrap confidence intervals Michael Wood Bootstrap confidence intervals

Statistical inference using bootstrap confidence intervals Michael Wood Bootstrap confidence intervals Statistical inference using bootstrap confidence intervals Michael Wood, Portsmouth University Business School, UK Michael.wood@port.ac.uk November 2004 (This is a preprint of an article accepted for publication

More information

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives. The Normal Distribution C H 6A P T E R The Normal Distribution Outline 6 1 6 2 Applications of the Normal Distribution 6 3 The Central Limit Theorem 6 4 The Normal Approximation to the Binomial Distribution

More information

Econometric Analysis Dr. Sobel

Econometric Analysis Dr. Sobel Econometric Analysis Dr. Sobel Econometrics Session 2: 4. Perform Ordinary Least Squares (OLS) Regression Analysis in gretl What it is, what it does, and why we do it: Regression analysis is basically

More information

Descriptive Statistics

Descriptive Statistics Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

More information

1/1/2014. Scales of Measurement GETTING TO THE STANDARD NORMAL DISTRIBUTION. Scales of Measurement. Scales of Measurement

1/1/2014. Scales of Measurement GETTING TO THE STANDARD NORMAL DISTRIBUTION. Scales of Measurement. Scales of Measurement MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION Measurement is a process of assigning numbers to characteristics according to a defined rule. Not all measurement

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Kernel density function for protein abundance

Kernel density function for protein abundance Kernel density function for protein abundance March, 2012 Kernel density functions clearly show the differential abundance of a protein between one set of samples, that is one sample category, and another

More information

Chapter 2 Descriptive Statistics o 2.1 Frequency Distributions and Their Graphs o Frequency Distributions o Graphs of Frequency Distributions

Chapter 2 Descriptive Statistics o 2.1 Frequency Distributions and Their Graphs o Frequency Distributions o Graphs of Frequency Distributions Chapter 2 Descriptive Statistics o 2.1 Frequency Distributions and Their Graphs o Frequency Distributions o Graphs of Frequency Distributions o 2.2 More Graphs and Displays o Graphing Quantitative Data

More information