# CALCULATIONS & STATISTICS


## CALCULATION OF SCORES

### Conversion of 1-5 scale to 0-100 scores

When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents rate your services on the survey on a 1-5 scale. We convert respondents' ratings because most people find it easier to interpret scores from 0 to 100.

| | VERY POOR | POOR | FAIR | GOOD | VERY GOOD |
|---|---|---|---|---|---|
| Scale | 1 | 2 | 3 | 4 | 5 |
| Score | 0 | 25 | 50 | 75 | 100 |

### Calculation of Mean Scores

Each patient has a score for every question that was answered. For each patient, a section score is calculated as the mean of all the question scores in that section. Similarly, for each patient, an overall score is calculated as the mean of that patient's section scores.

Sample Data for 5 Patients: [Table: one row per patient with columns Patient, A1, A2, A3, B1, B2, B3, B4, C1, C2, SECT A, SECT B, SECT C, OVERALL, followed by a "Mean" row and a "Mean rounded to one decimal place" row; the cell values are not preserved.]

For the data listed in the above table, Patient #1's score for section A (SECT A) is the mean of Patient #1's scores for the questions in section A (i.e., A1, A2, A3):

$$\text{SECT A} = \frac{A1 + A2 + A3}{3}$$
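The conversion and per-patient averaging can be sketched in a few lines of Python. This is only an illustration: it assumes the linear mapping 1 → 0, 2 → 25, 3 → 50, 4 → 75, 5 → 100 (consistent with the guide's later example, where a 4 becomes 75 and a 5 becomes 100), and the function names and the example patient are hypothetical.

```python
from statistics import mean


def to_score(rating):
    """Convert a 1-5 survey rating to a 0-100 score (assumed linear mapping)."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    return (rating - 1) * 25.0


def patient_scores(ratings_by_section):
    """Return (section_scores, overall_score) for one patient.

    Each section score is the mean of that section's question scores;
    the overall score is the mean of the patient's section scores.
    """
    section_scores = {
        section: mean(to_score(r) for r in ratings)
        for section, ratings in ratings_by_section.items()
    }
    overall = mean(section_scores.values())
    return section_scores, overall


# Hypothetical patient: ratings for questions A1-A3, B1-B4, C1-C2.
example = {"A": [5, 4, 4], "B": [4, 4, 5, 5], "C": [3, 4]}
sections, overall = patient_scores(example)
print(sections)
print(round(overall, 2))
```

Note that the overall score is the mean of the section scores, not of all nine question scores, so a short section carries as much weight as a long one.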

Similarly, Patient #1's OVERALL score is calculated:

$$\text{OVERALL} = \frac{\text{SECT A} + \text{SECT B} + \text{SECT C}}{3} = 7.47$$

The section scores displayed in a report are the means of all the patients' scores for that section, rounded to one decimal place (see the boldface type at the bottom of the SECT A, SECT B, and SECT C columns). Similarly, the overall facility rating score displayed in a report is the mean of all patients' overall scores, rounded to one decimal place (see the boldface type at the bottom of the OVERALL column):

$$\frac{\text{sum of the five patients' OVERALL scores}}{5} = 66.5$$

In a perfect world where every patient fills in every question, you could also calculate the overall hospital rating by taking the mean of the section averages displayed in your report:

$$\frac{\text{SECT A mean} + \text{SECT B mean} + \text{SECT C mean}}{3} = 66.5$$

HOWEVER, there is always some missing data (i.e., patients do not fill in every question on the survey). Patients may even skip an entire section if the questions do not apply to them. Thus, the overall score in your report might represent 250 patients, but a particular section score might represent only 245 patients. When this happens, it is impossible to calculate your overall mean score from the section mean scores displayed in your report. (See example below.)

Sample Data for 5 Patients with Missing Data: [Table: same columns as the previous table (Patient, A1 through C2, SECT A, SECT B, SECT C, OVERALL, with "Mean" and "Mean rounded to one decimal place" rows); the cell values are not preserved.]

In the above table you can see that the overall hospital rating score that would be displayed in a report is the mean of all patients' overall scores (see the boldface type at the bottom of the OVERALL column).
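The effect of missing data can be illustrated with a small sketch. The three patients and their section scores below are hypothetical (they are not the values from the guide's table); `None` marks a section the patient skipped.

```python
from statistics import mean

# Hypothetical per-patient section scores on the 0-100 scale.
patients = [
    {"A": 100.0, "B": 100.0, "C": 50.0},
    {"A": 75.0, "B": 50.0, "C": None},
    {"A": 25.0, "B": None, "C": None},
]


def overall(patient):
    """Mean of the section scores this patient actually has."""
    return mean(s for s in patient.values() if s is not None)


# What the report shows as the overall score: mean of patients' overall scores.
mean_of_overalls = mean(overall(p) for p in patients)

# What you would get from the displayed section means instead.
section_means = {
    sec: mean(p[sec] for p in patients if p[sec] is not None)
    for sec in ("A", "B", "C")
}
mean_of_section_means = mean(section_means.values())

print(round(mean_of_overalls, 1))       # 56.9
print(round(mean_of_section_means, 1))  # 63.9
```

The two results differ because each section mean is based on a different set of patients, while each patient's overall score is based only on the sections that patient answered.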

$$\frac{\text{sum of the five patients' OVERALL scores}}{5} = 66.5$$

Because of missing data, you cannot obtain the overall hospital rating score by taking the mean of the section averages displayed in the report:

$$\frac{\text{SECT A mean} + \text{SECT B mean} + \text{SECT C mean}}{3} = 65.6$$

Note: The difference appears small in this example because there are so few patients in the sample. In practice the difference can be much greater.

## THE MEAN

The Mean (denoted $\bar{X}$, and referred to as x-bar) is a measure of central tendency representing the arithmetic center of a group of scores. In plainer language, it is the average. In terms of your Press, Ganey report, the mean gives you information about the average score for:

- an individual question,
- a section on the survey,
- the overall satisfaction score for your facility, or
- the satisfaction scores for all facilities in our database.

The mean is calculated by summing all scores and dividing by the number of scores. For example, the mean of 88.0, 87.5, and 86.5 would be calculated:

$$\frac{88.0 + 87.5 + 86.5}{3} = 87.33$$

A more general formula can be written:

$$\text{Mean} = \bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$$

The mean can also be interpreted as the balancing point. Each score can be thought of as a one-pound weight, and the range of values of scores could be plotted along a supporting rod. The balancing point for the rod would be the mean of the values of the scores. Interestingly, physicists determine the precise balancing point, or center of gravity, using the same formula that statisticians use to find the mean.
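As a quick check of the balancing-point idea, the sketch below computes the mean of three illustrative scores and confirms that the distances of the scores from the mean cancel out, which is what it means for the rod to balance there. The variable names are hypothetical.

```python
# Three illustrative scores on the 0-100 scale.
scores = [88.0, 87.5, 86.5]

# Mean: sum of all scores divided by the number of scores.
x_bar = sum(scores) / len(scores)

# Signed distance of each score from the mean.
deviations = [x - x_bar for x in scores]

print(round(x_bar, 2))                # 87.33
print(abs(sum(deviations)) < 1e-9)    # True: the deviations sum to zero
```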

The mean, or balancing point, is easily influenced by outliers (i.e., scores that are much higher or much lower than the rest of the scores). If you put a weight on the balancing rod in about the same place as all the other weights (e.g., 7.0), the rod wouldn't tip very much and you wouldn't have to move the balancing point very far. If, however, you placed a weight way out on the end of the rod (e.g., 90.0), the rod would tip considerably. You would need to move the balancing point closer to the extreme value in order to make the rod balance. The same holds true when calculating the mean. If you have a group of scores that are all within a close range, the mean will likely be towards the center of that range. However, just one score that is much higher or lower than the rest can pull the mean up or down, respectively.

Note: Remember that the mean score is an average, not a percent. Say you have a sample of five patients who rate skill of nurses as follows: 5 (very good), 4 (good), 4 (good), 4 (good), 5 (very good). Those ratings are converted into the following scores: 100, 75, 75, 75, 100. The average of these scores would be 85 ((100 + 75 + 75 + 75 + 100) ÷ 5). You wouldn't want to say that your patients were 85% satisfied with the skill of nurses, because in fact all (100%) of your patients rated the nurses good or very good, with 60% giving a rating of good and 40% giving a rating of very good.

## THE MEDIAN & PERCENTILE RANKING

### The Median

The median is a measure of central tendency representing the mid-point in a distribution of scores, or the point at the 50th percentile. In other words, the median is the middle; it is the score that splits the distribution in half. Fifty percent of scores will fall above the median and fifty percent of scores will fall below the median.

The median is determined by ordering the scores from highest to lowest and finding the middle value. If there is an odd number of scores, the median is the score in the middle. If there is an even number of scores, then there is no score in the middle, so the median is the average of the two scores closest to the middle.

### Comparison of the Mean and the Median

The mean and the median are each measures of central tendency, but they describe the center in different ways. The mean is the average of the scores, whereas the median is the middle of the scores. The average is not always in the middle of a distribution. Another difference between the two is that the mean is influenced by outliers (i.e., scores that are much higher or lower than the rest) but the median is not. To illustrate this point, let's look at an imaginary sample of ten facilities' overall satisfaction scores:

[Table: overall satisfaction scores for ten hospitals (Hos #1 through Hos #10), ordered from highest to lowest; the individual scores are not preserved apart from the values quoted below.]

Finding the Mean: The mean of these scores is obtained by summing the scores and dividing by the number of scores (ten):

$$\text{Mean} = \frac{\text{sum of the ten scores}}{10} = 85.9$$

Finding the Median: To find the median, the hospital scores are ordered from highest to lowest. In this case there is an even number of scores, so we must take the average of the two closest to the middle (i.e., 87.1 and 86.2):

$$\text{Median} = \frac{87.1 + 86.2}{2} = 86.65$$

Notice the relationship between the mean and the median. In this example, the mean is lower than the median because an outlier (the low score of 69.5) pulled the mean down a bit. The low outlier did not influence the median.

If you were the facility with a score of 86.2, you would notice that your facility score is above the mean (85.9) for the database but below the median (86.65). The median for the database is equivalent to the 50th percentile in the percentile ranking, so that facility would be below the 50th percentile.

So if your facility's score is above the mean but below the 50th percentile, it indicates that there are low outliers in the database (facilities whose scores are considerably lower than all the other scores). Conversely, if your score is below the mean but above the 50th percentile, it indicates that there are high outliers in the database (facilities whose scores are considerably higher than all the other scores).

### Percentile Ranking

Percentile ranking is a strategy for assigning a series of values to divide a distribution into equal parts. More specifically, the percentile ranking tells you the proportion of scores in the database which fall below your individual facility's score. The median of the database is the score associated with the 50th percentile. Percentile rankings therefore give you information about where your hospital stands in relation to the median of the database.

Keep in mind that these numbers are percentiles, not actual scores. For example, a 68 in the percentile rank column of your report means that the hospital is in the 68th percentile for this item. Translation: this particular hospital scores higher than 68% of the hospitals in the database and lower than 32% of the hospitals in the database. For more information on percentile rankings, please see the section on the Standard Deviation and the discussion of how percentile rankings relate to the standard deviation.

## THE STANDARD DEVIATION & STANDARD ERROR (SIGMA)

### What is the Standard Deviation?

The standard deviation is a measure of variability of scores around the mean. Large numbers for the standard deviation indicate that the data are very spread out (i.e., there is a lot of variability). Conversely, a very small standard deviation indicates that most of the data are very similar to the mean (i.e., less variability).

### How do you determine the Standard Deviation?

The standard deviation is calculated using the formula:

$$\text{Standard Deviation} = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$

In plainer terms, this formula tells you to (see example below):

1. Find the mean of the sample you are interested in (bottom cell of second column).
2. Find the distance from the mean for each individual score. You do this by subtracting the mean from each hospital score (see third column).
3. Square all the distances from the mean (fourth column).
4. Add all of the squared distances from the mean (bottom cell of fourth column).
5. Divide the sum of the squared distances by the number of scores you are working with (in this example, there are five scores). The number you get is called the variance.
6. Take the square root of your answer from step 5. This is the Standard Deviation!
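The steps above can be sketched as a short Python function. The five scores are hypothetical (they are not the values from the guide's examples), chosen so that the arithmetic is easy to follow. As in the steps, the sum of squared distances is divided by n, which corresponds to `statistics.pstdev` in the standard library; `statistics.stdev` would divide by n - 1 instead.

```python
import math
from statistics import pstdev


def standard_deviation(scores):
    """Standard deviation following the six steps (divides by n)."""
    n = len(scores)
    mean = sum(scores) / n                     # step 1: mean of the sample
    distances = [x - mean for x in scores]     # step 2: distance from the mean
    squared = [d ** 2 for d in distances]      # step 3: square each distance
    total = sum(squared)                       # step 4: add the squared distances
    variance = total / n                       # step 5: divide by the number of scores
    return math.sqrt(variance)                 # step 6: square root of the variance


# Hypothetical facility scores with a mean of 83.5.
scores = [82.0, 83.0, 83.5, 84.0, 85.0]
sd = standard_deviation(scores)
print(sd)              # 1.0
print(pstdev(scores))  # same result from the standard library
```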

### Example 1

[Table: five facility scores with columns FACILITY, SCORE ($X$), SCORE - MEAN ($X - \bar{X}$), and (SCORE - MEAN)$^2$ ($(X - \bar{X})^2$); the individual values are not preserved. The bottom row gives the mean, $\bar{X} = 83.5$, and the sum of $(X - \bar{X})^2$. Dividing that sum by 5 gives the variance, and the square root of the variance is 9.64.]

For comparison, a second example (Example 2) is provided that shows the computation of the standard deviation for a different set of five scores. Notice that the mean is the same as in the first example (83.5). However, the standard deviation is considerably smaller (1.32) because the scores are much closer to the mean than were the scores in the first example. In other words, two sets of data with the same mean won't necessarily be identical to one another.

### Example 2

[Table: five facility scores with columns FACILITY, SCORE ($X$), SCORE - MEAN ($X - \bar{X}$), and (SCORE - MEAN)$^2$ ($(X - \bar{X})^2$); the individual values are not preserved. The bottom row gives $\bar{X} = 83.5$ and the sum of $(X - \bar{X})^2 = 8.75$.]

$$\frac{8.75}{5} = 1.75 \qquad \sqrt{1.75} = 1.32$$

### Importance of the Standard Deviation

Knowing the standard deviation of the database is important because it allows you to compare your facility's score to the larger group. We can do this based on the normal distribution. The normal distribution (below) is a bell-shaped curve representing a theoretical distribution of data in which the mean, median, and mode (the score that occurs most frequently) have the same value. In the normal distribution, 68% of all data falls within one standard deviation of the mean of the distribution. Further, 95% of the data falls within two standard deviations of the mean.

[Figure: The Normal Distribution. Bell curve centered on $\bar{X}$ with the horizontal axis marked at -2 S.D., -1 S.D., $\bar{X}$, +1 S.D., and +2 S.D.; 68.26% of the area lies within one standard deviation of the mean and 95.44% within two.]
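The 68% and 95% figures can be checked numerically. For a normal distribution, the share of data within k standard deviations of the mean is erf(k / √2); the sketch below evaluates that with Python's standard library. The function name is hypothetical.

```python
import math


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


print(round(share_within(1) * 100, 2))  # 68.27
print(round(share_within(2) * 100, 2))  # 95.45
```

The values 68.26% and 95.44% shown on the figure are the same quantities, truncated rather than rounded.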

One last issue. We don't know the exact standard error of the population because we don't have multiple samples drawn from the same population at the same time. We have just one sample from the population. However, we can estimate the standard error of the population based on the standard deviation of our one sample using the following formula:

$$SE = \frac{S}{\sqrt{n}}$$

where

- $S$ = standard deviation of the sample
- $n$ = number of observations (patients) in the sample

So if a facility had an overall rating of 84.24 with a standard deviation of 9.5 and an n of 225, we could calculate the standard error as:

$$SE = \frac{S}{\sqrt{n}} = \frac{9.5}{\sqrt{225}} = 0.63$$

And the confidence interval would be calculated:

$$C.I. = \bar{X} \pm 1.96 \times SE = 84.24 \pm 1.96(0.63) = 84.24 \pm 1.24$$

Thus, with a sample mean of 84.24, we are 95% confident that the actual population mean is between 83.00 and 85.48.

Note: The Confidence Intervals that are listed in your report under the heading Facility Statistical Analysis can be used to estimate the amount of change in mean score you would need in order to show a statistically significant improvement. To create this estimate, multiply the number in the 95% Confidence Interval column by 1.41 and add the result to your current mean. So, if you had a mean of 84.24 and a confidence interval of 1.24, you would multiply the 1.24 by 1.41. The result, 1.75, is an estimate of how much you would have to increase your mean score (i.e., from 84.24 to 85.99). This estimate is just a guide and assumes that your sample for the next report has exactly the same n and standard deviation as the data for the current report.
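The standard error, confidence interval, and required-improvement estimate can be sketched as follows. The inputs (mean 84.24, standard deviation 9.5, n = 225) are assumed here for illustration, and the function name is hypothetical.

```python
import math


def confidence_interval(mean, sd, n, z=1.96):
    """Return (standard error, margin, (low, high)) for a 95% interval by default."""
    se = sd / math.sqrt(n)      # estimated standard error: S / sqrt(n)
    margin = z * se             # the "+/-" part of the interval
    return se, margin, (mean - margin, mean + margin)


se, margin, (low, high) = confidence_interval(84.24, 9.5, 225)

# Rule of thumb from the note: multiply the interval half-width by 1.41.
needed_increase = 1.41 * margin

print(round(se, 2))                      # 0.63
print(round(margin, 2))                  # 1.24
print(round(low, 2), round(high, 2))     # 83.0 85.48
print(round(needed_increase, 2))         # 1.75
```

The factor 1.41 is approximately √2, which is the amount by which the standard error of a difference between two means exceeds the standard error of one mean when both samples have the same n and standard deviation.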

## t-tests

The statistical procedure used to determine if scores on a current report are significantly different from the scores on the last report is the t-test. A t-test measures the difference between sample means, taking into account the size and variability of each of the samples. It is important to note that the result of the t-test does not simply tell you what the difference is; it tells you how confident you can be that the difference is real and not due to random error. Conventionally, if you are at least 95% confident that the difference is real and thus not due to random error, then the difference is said to be statistically significant.

### How do you calculate a t-test?

t-tests are calculated using the following formula:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

- $\bar{X}_1$ = the mean of sample 1
- $\bar{X}_2$ = the mean of sample 2
- $S_1^2$ = the variance (standard deviation squared) of sample 1
- $n_1$ = the number of observations in sample 1
- $S_2^2$ = the variance (standard deviation squared) of sample 2
- $n_2$ = the number of observations in sample 2

### In your report...

When a score on a Press, Ganey report is found to be significantly different from the score on the last report, the score is highlighted with a symbol indicating one of the following:

- 95% certainty of significant increase (t > 1.96), marked with +
- 99% certainty of significant increase (t > 2.576)
- 95% certainty of significant decrease (t < -1.96), marked with -
- 99% certainty of significant decrease (t < -2.576)

Two factors can influence whether or not a difference is found to be significant: the size of the samples (n) and the variability of the samples.

Size: It is less likely that you will be confident enough to call a difference significant (i.e., not due to error) if there is less data available (lower n's).

Variability: It is less likely that you will be confident enough to call a difference significant (i.e., not due to error) when there is greater variability (i.e., the data are more spread out around the mean of the sample; larger standard deviations).

So if you have what appear to be large differences in scores that are not marked with an indicator of statistically significant difference, it is likely that either the n's were too low or the standard deviations were too large for you to be confident that the difference was real and not due to random error. Conversely, if you have what appear to be small differences in scores that are marked with an indicator of statistically significant difference, it is likely that either the n's were so high or the standard deviations were so small that it was possible to be very confident that the difference was real and not due to random error.

### How can we be sure that differences in scores are significant at the .05 level, especially when sample sizes are small?

Remember that a t-test takes into account both the size of the samples and the variability of the samples, so if the results of a t-test indicate that a difference is indeed significant, then you can be at least 95% confident that the difference is real and not due to random error, no matter how small your samples are.
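A minimal sketch of the t formula and the report thresholds is shown below. The means, standard deviations, and sample sizes are hypothetical, and the label strings are illustrative rather than the symbols used in the report.

```python
import math


def t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """t = (mean1 - mean2) / sqrt(sd1^2 / n1 + sd2^2 / n2)."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def significance(t):
    """Classify t using the thresholds 1.96 (95%) and 2.576 (99%)."""
    if t > 2.576:
        return "99% certainty of significant increase"
    if t > 1.96:
        return "95% certainty of significant increase"
    if t < -2.576:
        return "99% certainty of significant decrease"
    if t < -1.96:
        return "95% certainty of significant decrease"
    return "not significant"


# Same 2.26-point increase, same standard deviation, different sample sizes.
t_large = t_statistic(86.5, 9.5, 225, 84.24, 9.5, 225)
t_small = t_statistic(86.5, 9.5, 20, 84.24, 9.5, 20)

print(round(t_large, 2), significance(t_large))
print(round(t_small, 2), significance(t_small))
```

With 225 patients per report the difference is flagged at the 95% level; with only 20 patients per report the same difference is not flagged, which is the sample-size effect described above.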

## CORRELATIONS

### What is a correlation?

A correlation tells you the strength of the relation between variables. In other words, a correlation tells us how much a change in one variable (e.g., a question score) is associated with a concurrent, systematic change in another variable (e.g., overall satisfaction). A correlation represents the strength of the relationship between two variables numerically, with a correlation coefficient (called r) which can range from -1.0 to +1.0.

A positive correlation coefficient indicates that as the value of one variable increases, the value of the other variable also increases. For example, the more you exercise, the greater your endurance. A negative correlation coefficient indicates that as the value of one variable increases, the value of the other variable decreases. For example, the more you smoke cigarettes, the less lung capacity you have.

It is important to recognize that when two variables are correlated it means that they are related to each other, but it does not necessarily mean that one variable influences or predicts the other. This is the basis of the statement, "Correlation does not imply causation!"

### How can you tell how strong the relationship is between two variables?

The closer a correlation is to 1 (either positive or negative), the stronger the relationship. The closer a correlation is to 0 (either positive or negative), the weaker the relationship.

### How is a correlation calculated?

Correlation coefficients are calculated using the formula below. This statistic basically assesses the degree to which variables X and Y vary together (the numerator), taking into account the degree to which each variable varies on its own (the denominator).

$$r = \frac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}}$$

Example of calculating the correlation between X (Overall rating of care provided at this facility) and Y (Likelihood of recommending this hospital): please see Table 1 below.
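A minimal sketch of this calculation in Python follows, using deviations from the mean (written x and y in lower case). The five rating pairs are an assumption for illustration: only the first pair (3, 4) is stated in the guide, and the remaining four were chosen so that the sums come out to Σxy = 1.4, Σx² = 3.2, and Σy² = 0.8.

```python
import math


def correlation(xs, ys):
    """Pearson r computed from deviation scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    x_dev = [x - mean_x for x in xs]     # x = X - mean of X
    y_dev = [y - mean_y for y in ys]     # y = Y - mean of Y
    sum_xy = sum(a * b for a, b in zip(x_dev, y_dev))
    sum_x2 = sum(a * a for a in x_dev)
    sum_y2 = sum(b * b for b in y_dev)
    return sum_xy / (math.sqrt(sum_x2) * math.sqrt(sum_y2))


# Hypothetical 1-5 ratings for five patients.
X = [3, 5, 5, 4, 5]   # overall rating of care
Y = [4, 5, 5, 5, 5]   # likelihood of recommending

r = correlation(X, Y)
print(round(r, 3))
```

Computed without intermediate rounding, r is 0.875; a hand calculation that first rounds the two square roots to 1.79 and 0.89 gives .88.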

Note: Each row represents a patient's data. For example, the data in the first row indicate that a patient at this hospital circled a 3 (fair) for the question "Overall rating of care given at this hospital," and circled a 4 (good) for the question "Likelihood of recommending this hospital to others."

1. Determine the mean of X and the mean of Y (bottom of columns 1 and 2).
2. For each value of X (each patient), determine how much it deviates (differs) from the mean of X. This deviation will be called x (in lower case).
3. For each value of Y (each patient), determine how much it deviates (differs) from the mean of Y. This deviation will be called y.
4. For each row, multiply x and y.
5. For each row, square x, that is, multiply x by itself.
6. For each row, square y, that is, multiply y by itself.
7. Sum all the xy scores (bottom of column 5).
8. Sum all the $x^2$ scores (bottom of column 6).
9. Sum all the $y^2$ scores (bottom of column 7).

Table 1: [Columns: $X$, $Y$, $x = (X - \bar{X})$, $y = (Y - \bar{Y})$, $xy$, $x^2$, $y^2$; the individual patient rows are not preserved.]

| $\bar{X}$ | $\bar{Y}$ | $\sum xy$ | $\sum x^2$ | $\sum y^2$ |
|---|---|---|---|---|
| 4.4 | 4.8 | 1.4 | 3.2 | 0.8 |

Fill in the formula below with the numbers:

$$r = \frac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}} = \frac{1.4}{\sqrt{3.2}\,\sqrt{0.8}}$$

$$r = \frac{1.4}{1.79 \times 0.89}$$

which equals .88. A correlation of .88 indicates a strong relationship between two variables. Additionally, the correlation is positive, indicating that as scores on "Overall rating of care given at this hospital" increase, scores on "Likelihood of recommending this hospital" also increase.

### Graphed representations of correlations

The first graph depicts the positive correlation between Friendliness of Nurses and Likelihood of Recommending Hospital. The positive correlation coefficient (r = .79) means that high scores on Friendliness are associated with high scores on Likelihood to Recommend, whereas low scores on Friendliness are associated with low scores on Likelihood to Recommend. You can tell at a glance that the correlation is positive because the line slants upward across the graph.

[Figure: "Friendliness and Recommendations". Likelihood of Recommending (vertical axis) plotted against Friendliness of Nurses (horizontal axis); r = .79; 464 hospitals, 3rd quarter 1996.]

In contrast, the second graph depicts the negative correlation between hospital bed size and hospital overall satisfaction score. The negative correlation coefficient (r = -.46) means that hospitals with more beds tend to have lower scores, whereas hospitals with fewer beds tend to have higher scores. You can tell at a glance that the second correlation is negative because the line slants downward across the graph.

[Figure: "Overall Mean by Bed Size". Average Common Question Mean (vertical axis) plotted against Total Facility Beds (horizontal axis); r = -.46.]

### Regression vs. Correlation

Press, Ganey reports list correlation coefficients between each question and the overall satisfaction score (an average of all the other items in a questionnaire). This gives a measure of the relative importance of each individual question for overall satisfaction. Clients sometimes ask why we do not use multiple regression to demonstrate these relationships. When the items of a survey are highly correlated, it is impossible to separate the effect of one item from that of others with any degree of precision. Because this problem (termed "multicollinearity") violates one of the assumptions of multiple regression, Press, Ganey does not report multiple regression analyses performed to relate specific questions to overall satisfaction.

Like regression analyses, correlations are statistical measures of association that describe the strength of relationship, or association, between factors, such as a doctor's courtesy and patients' overall satisfaction with care from that physician. Correlations have the advantage of using all the information from each respondent to a survey, whereas regression analyses eliminate all information from any respondents who do not answer every question.

## PRIORITY INDEX

### What is the priority index?

The priority index is a way of combining two very important pieces of information: (1) the actual score achieved on a particular question, and (2) the degree to which that particular question is associated with overall satisfaction. Combining these two pieces of information helps a facility to know where efforts should be placed for quality improvement. For example, one question might be very low in score (e.g., quality of food) but not particularly associated with overall satisfaction. Because it is not highly associated with satisfaction, the facility might choose to place quality improvement resources elsewhere. Conversely, a question might be very highly associated with overall satisfaction (e.g., overall cheerfulness of hospital) but not low in score. Attempting to raise the score would probably be difficult and may be unnecessary if most of your patients are very satisfied already.

### How is the priority index calculated?

The priority index is derived through a three-step process. For the purpose of this example, let's assume that the survey has 50 questions.

1. SCORE RANK. Questions are ordered from the highest score down to the lowest score. Each question is then given a score rank; the highest mean score gets a rank of 1, the second highest score gets a rank of 2, the third highest score gets a rank of 3, and so on down the line until the lowest score is given a rank of 50. To remember the meaning of the score rank, it may help to think of a high score as a small issue and a low score as a big issue (something the facility should be concerned about). The high score, a small issue, has a small score rank (e.g., 1, 2, 3...). Conversely, a low score, a big issue of concern, has a big score rank (e.g., 48, 49, 50). The score rank for each question appears in parentheses to the right of the mean score column on the priority index page.

2. CORRELATION RANK. Next, questions are ordered from the least correlated with overall satisfaction to the most correlated with overall satisfaction. Each question is then given a correlation rank. The question that is the least correlated with satisfaction gets a rank of 1, the question that is the second least correlated with satisfaction gets a rank of 2, and so on down the line until the question that is the most correlated with satisfaction gets a rank of 50. Again, it helps to keep in mind what would be a small issue and what would be a big issue. A question that is not very correlated with satisfaction would be a small issue, so it would have a small rank (e.g., 1, 2, 3...), whereas a question that was highly correlated with satisfaction would be a big issue (something to pay attention to) and would have a big rank (e.g., 48, 49, 50). The correlation rank for each question appears in parentheses to the right of the correlation coefficient column on the priority index page.

3. COMPUTING THE PRIORITY INDEX. The priority index is derived by adding the score rank (from step 1) to the correlation rank (from step 2). The questions are then ordered on the priority index page from the largest priority index score, which comes first, down to the lowest priority index score, which comes last. In order to be first in the priority index list, a question would have to have two big issues: a big score rank (that means a low score) and a big correlation rank (that means a high association with satisfaction). Questions that appear at the bottom of the priority index list would have two small issues: a high overall score (which gets a small score rank) and a low association with satisfaction (which gets a small correlation rank). Questions that appear in the middle of the priority index list would likely have just one big issue (either a low score or a high association with satisfaction).

| PRIORITY INDEX = | SCORE RANK + | CORRELATION RANK |
|---|---|---|
| High Priority (top of priority index page) | A big issue! Big score rank (low actual score) | A big issue! Big correlation rank (high correlation with satisfaction) |
| Medium Priority (middle of priority index page) | A big issue! Big score rank (low actual score) | A small issue. Small correlation rank (low correlation with satisfaction) |
| or Medium Priority (middle of priority index page) | A small issue. Small score rank (high actual score) | A big issue! Big correlation rank (high correlation with satisfaction) |
| Low Priority (bottom of priority index page) | A small issue. Small score rank (high actual score) | A small issue. Small correlation rank (low correlation with satisfaction) |

### External Priority Indices

Priority indices are also provided with an external focus. The internal priority index, as described above, uses your question mean scores as an internal measure of performance. The external priority indices use your question percentile ranks as an external measure of performance. The same basic steps are used to create the external priority indices. However, in the first step, the questions are ranked according to their percentile ranks (highest to lowest) instead of according to their mean scores.
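The three-step process can be sketched as follows. The question names, mean scores, and correlations are hypothetical, and so are the function names. The guide does not say how ties are handled, so this sketch simply assumes there are none.

```python
def ranks(values, highest_first):
    """Assign ranks 1..n to the keys of `values` in sorted order."""
    ordered = sorted(values, key=values.get, reverse=highest_first)
    return {name: position for position, name in enumerate(ordered, start=1)}


def priority_index(mean_scores, correlations):
    """Return (question, priority index) pairs, highest priority first."""
    score_rank = ranks(mean_scores, highest_first=True)    # step 1: highest score gets 1
    corr_rank = ranks(correlations, highest_first=False)   # step 2: least correlated gets 1
    index = {q: score_rank[q] + corr_rank[q] for q in mean_scores}  # step 3: add the ranks
    return sorted(index.items(), key=lambda item: item[1], reverse=True)


# Hypothetical four-question survey.
mean_scores = {
    "Wait time": 74.9,
    "Quality of food": 70.1,
    "Friendliness of nurses": 88.4,
    "Room decor": 90.2,
}
correlations = {
    "Wait time": 0.81,
    "Quality of food": 0.40,
    "Friendliness of nurses": 0.62,
    "Room decor": 0.20,
}

for question, index in priority_index(mean_scores, correlations):
    print(index, question)
```

For an external priority index, the same sketch would apply with percentile ranks passed in place of the mean scores.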


### Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

### Mathematics. Probability and Statistics Curriculum Guide. Revised 2010

Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

### Introduction to Hypothesis Testing

I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Statistical Data analysis With Excel For HSMG.632 students

1 Statistical Data analysis With Excel For HSMG.632 students Dialog Boxes Descriptive Statistics with Excel To find a single descriptive value of a data set such as mean, median, mode or the standard deviation,

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random

### Module 4: Data Exploration

Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

### 4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses

### Key Concept. Density Curve

MAT 155 Statistical Analysis Dr. Claude Moore Cape Fear Community College Chapter 6 Normal Probability Distributions 6 1 Review and Preview 6 2 The Standard Normal Distribution 6 3 Applications of Normal

### Simple Random Sampling

Source: Frerichs, R.R. Rapid Surveys (unpublished), 2008. NOT FOR COMMERCIAL DISTRIBUTION 3 Simple Random Sampling 3.1 INTRODUCTION Everyone mentions simple random sampling, but few use this method for

### AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

### Excel Formatting: Best Practices in Financial Models

Excel Formatting: Best Practices in Financial Models Properly formatting your Excel models is important because it makes it easier for others to read and understand your analysis and for you to read and

### Chapter 9 Descriptive Statistics for Bivariate Data

9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to eplore the distribution of the values of a single variable)

### Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

### PAYCHEX, INC. BASIC BUSINESS MATH TRAINING MODULE

PAYCHEX, INC. BASIC BUSINESS MATH TRAINING MODULE 1 Property of Paychex, Inc. Basic Business Math Table of Contents Overview...3 Objectives...3 Calculator...4 Basic Calculations...6 Order of Operation...9

### Homework 8 Solutions

Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (a-d), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.

### Answer: Quantity A is greater. Quantity A: 0.717 0.717717... Quantity B: 0.71 0.717171...

Test : First QR Section Question 1 Test, First QR Section In a decimal number, a bar over one or more consecutive digits... QA: 0.717 QB: 0.71 Arithmetic: Decimals 1. Consider the two quantities: Answer:

### Variables. Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### BIBA Report on the Importance of Advice in the Small to Medium Enterprise Market

BIBA Report on the Importance of Advice in the Small to Medium Enterprise Market The best insurance is a BIBA broker www.biba.org.uk Member helpline: 0845 77 00 266 The FSA define advice as an opinion

### Stigmatisation of people with mental illness

Stigmatisation of people with mental illness Report of the research carried out in July 1998 and July 2003 by the Office for National Statistics (ONS) on behalf of the Royal College of Psychiatrists Changing

### Back to the Basics! Dashboards, Quartiles, and Setting Priorities

Back to the Basics! Dashboards, Quartiles, and Setting Priorities Presented on Behalf of the CALNOC TEAM by Diane Storer Brown PhD, RN, CPHQ, FNAHQ, FAAN Co-Principal Investigator, Collaborative Alliance

### STAT 350 Practice Final Exam Solution (Spring 2015)

PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects

### Statistics E100 Fall 2013 Practice Midterm I - A Solutions

STATISTICS E100 FALL 2013 PRACTICE MIDTERM I - A SOLUTIONS PAGE 1 OF 5 Statistics E100 Fall 2013 Practice Midterm I - A Solutions 1. (16 points total) Below is the histogram for the number of medals won

### An Introduction to Statistics using Microsoft Excel. Dan Remenyi George Onofrei Joe English

An Introduction to Statistics using Microsoft Excel BY Dan Remenyi George Onofrei Joe English Published by Academic Publishing Limited Copyright 2009 Academic Publishing Limited All rights reserved. No

### The importance of graphing the data: Anscombe s regression examples

The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 30-31, 2008 B. Weaver, NHRC 2008 1 The Objective

### ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question

### DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability

DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability RIT Score Range: Below 171 Below 171 Data Analysis and Statistics Solves simple problems based on data from tables* Compares

### seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER

seven Statistical Analysis with Excel CHAPTER chapter OVERVIEW 7.1 Introduction 7.2 Understanding Data 7.3 Relationships in Data 7.4 Distributions 7.5 Summary 7.6 Exercises 147 148 CHAPTER 7 Statistical

### 2. Filling Data Gaps, Data validation & Descriptive Statistics

2. Filling Data Gaps, Data validation & Descriptive Statistics Dr. Prasad Modak Background Data collected from field may suffer from these problems Data may contain gaps ( = no readings during this period)

### Gamma Distribution Fitting

Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

### Data Analysis, Statistics, and Probability

Chapter 6 Data Analysis, Statistics, and Probability Content Strand Description Questions in this content strand assessed students skills in collecting, organizing, reading, representing, and interpreting

### Covered California CAHPS Ratings Fall 2014 Scoring Health Plan s Historical CAHPS Results

Covered California CAHPS Ratings Fall 2014 Scoring Health Plan s Historical CAHPS Results Summary of Key Elements Covered California CAHPS Quality Rating System (QRS) is comprised of the following elements:

### Grade 4 - Module 5: Fraction Equivalence, Ordering, and Operations

Grade 4 - Module 5: Fraction Equivalence, Ordering, and Operations Benchmark (standard or reference point by which something is measured) Common denominator (when two or more fractions have the same denominator)

### 3.2 Measures of Spread

3.2 Measures of Spread In some data sets the observations are close together, while in others they are more spread out. In addition to measures of the center, it's often important to measure the spread

### Using SPSS, Chapter 2: Descriptive Statistics

1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

### Center: Finding the Median. Median. Spread: Home on the Range. Center: Finding the Median (cont.)

Center: Finding the Median When we think of a typical value, we usually look for the center of the distribution. For a unimodal, symmetric distribution, it s easy to find the center it s just the center

### Lab 11. Simulations. The Concept

Lab 11 Simulations In this lab you ll learn how to create simulations to provide approximate answers to probability questions. We ll make use of a particular kind of structure, called a box model, that

### PLANNING PROBLEMS OF A GAMBLING-HOUSE WITH APPLICATION TO INSURANCE BUSINESS. Stockholm

PLANNING PROBLEMS OF A GAMBLING-HOUSE WITH APPLICATION TO INSURANCE BUSINESS HARALD BOHMAN Stockholm In the classical risk theory the interdependence between the security loading and the initial risk reserve

### Math Journal HMH Mega Math. itools Number

Lesson 1.1 Algebra Number Patterns CC.3.OA.9 Identify arithmetic patterns (including patterns in the addition table or multiplication table), and explain them using properties of operations. Identify and

### Geostatistics Exploratory Analysis

Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

### Exploratory Data Analysis. Psychology 3256

Exploratory Data Analysis Psychology 3256 1 Introduction If you are going to find out anything about a data set you must first understand the data Basically getting a feel for you numbers Easier to find

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

STATISTICS/GRACEY PRACTICE TEST/EXAM 2 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Identify the given random variable as being discrete or continuous.

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### 1. How different is the t distribution from the normal?

Statistics 101 106 Lecture 7 (20 October 98) c David Pollard Page 1 Read M&M 7.1 and 7.2, ignoring starred parts. Reread M&M 3.2. The effects of estimated variances on normal approximations. t-distributions.

### 1. a. (iv) b. (ii) [6.75/(1.34) = 10.2] c. (i) Writing a call entails unlimited potential losses as the stock price rises.

1. Solutions to PS 1: 1. a. (iv) b. (ii) [6.75/(1.34) = 10.2] c. (i) Writing a call entails unlimited potential losses as the stock price rises. 7. The bill has a maturity of one-half year, and an annualized

### Strategies for Identifying Students at Risk for USMLE Step 1 Failure

Vol. 42, No. 2 105 Medical Student Education Strategies for Identifying Students at Risk for USMLE Step 1 Failure Jira Coumarbatch, MD; Leah Robinson, EdS; Ronald Thomas, PhD; Patrick D. Bridge, PhD Background

### Simple Inventory Management

Jon Bennett Consulting http://www.jondbennett.com Simple Inventory Management Free Up Cash While Satisfying Your Customers Part of the Business Philosophy White Papers Series Author: Jon Bennett September

### Tutorial Customer Lifetime Value

MARKETING ENGINEERING FOR EXCEL TUTORIAL VERSION 150211 Tutorial Customer Lifetime Value Marketing Engineering for Excel is a Microsoft Excel add-in. The software runs from within Microsoft Excel and only

### The Concept of Present Value

The Concept of Present Value If you could have \$100 today or \$100 next week which would you choose? Of course you would choose the \$100 today. Why? Hopefully you said because you could invest it and make

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### 8. THE NORMAL DISTRIBUTION

8. THE NORMAL DISTRIBUTION The normal distribution with mean μ and variance σ 2 has the following density function: The normal distribution is sometimes called a Gaussian Distribution, after its inventor,

### Measurement & Data Analysis. On the importance of math & measurement. Steps Involved in Doing Scientific Research. Measurement

Measurement & Data Analysis Overview of Measurement. Variability & Measurement Error.. Descriptive vs. Inferential Statistics. Descriptive Statistics. Distributions. Standardized Scores. Graphing Data.

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Chapter 4 and 5 solutions

Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,

### 1/27/2013. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 Introduce moderated multiple regression Continuous predictor continuous predictor Continuous predictor categorical predictor Understand

### A guide to level 3 value added in 2015 school and college performance tables

A guide to level 3 value added in 2015 school and college performance tables January 2015 Contents Summary interpreting level 3 value added 3 What is level 3 value added? 4 Which students are included

### Using Excel for descriptive statistics

FACT SHEET Using Excel for descriptive statistics Introduction Biologists no longer routinely plot graphs by hand or rely on calculators to carry out difficult and tedious statistical calculations. These

### Digging Deeper into Safety and Injury Prevention Data

Digging Deeper into Safety and Injury Prevention Data Amanda Schwartz: Have you ever wondered how you could make your center safer using information you already collect? I'm Amanda Schwartz from the Head

### Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test

Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important

### Market Research. Market Research: Part II: How To Get Started With Market Research For Your Organization. What is Market Research?

Market Research: Part II: How To Get Started With Market Research For Your Organization Written by: Kristina McMillan, Sr. Project Manager, SalesRamp Scope: This white paper discusses market research on

### Equity Risk Premium Article Michael Annin, CFA and Dominic Falaschetti, CFA

Equity Risk Premium Article Michael Annin, CFA and Dominic Falaschetti, CFA This article appears in the January/February 1998 issue of Valuation Strategies. Executive Summary This article explores one

### TI-Inspire manual 1. Instructions. Ti-Inspire for statistics. General Introduction

TI-Inspire manual 1 General Introduction Instructions Ti-Inspire for statistics TI-Inspire manual 2 TI-Inspire manual 3 Press the On, Off button to go to Home page TI-Inspire manual 4 Use the to navigate

### What really drives customer satisfaction during the insurance claims process?

Research report: What really drives customer satisfaction during the insurance claims process? TeleTech research quantifies the importance of acting in customers best interests when customers file a property

### APPENDIX. Interest Concepts of Future and Present Value. Concept of Interest TIME VALUE OF MONEY BASIC INTEREST CONCEPTS

CHAPTER 8 Current Monetary Balances 395 APPENDIX Interest Concepts of Future and Present Value TIME VALUE OF MONEY In general business terms, interest is defined as the cost of using money over time. Economists

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### 9. Sampling Distributions

9. Sampling Distributions Prerequisites none A. Introduction B. Sampling Distribution of the Mean C. Sampling Distribution of Difference Between Means D. Sampling Distribution of Pearson's r E. Sampling

### Descriptive Analysis

Research Methods William G. Zikmund Basic Data Analysis: Descriptive Statistics Descriptive Analysis The transformation of raw data into a form that will make them easy to understand and interpret; rearranging,

### Pre-course Materials

Pre-course Materials BKM Quantitative Appendix Document outline 1. Cumulative Normal Distribution Table Note: This table is included as a reference for the Quantitative Appendix (below) 2. BKM Quantitative