Data handling and descriptive statistics in Proficiency Testing Microbiology


1 Data handling and descriptive statistics in Proficiency Testing Microbiology
In relation to the standards ISO/IEC 17043 and ISO 13528
by PhD, Microbiology division, Science department

2 Descriptive statistics for PT participants' results
Location (mode, median, mean)
- Assigned value for an analysis ($x_{pt}$; "mean")
- Measurement uncertainty for the assigned value
Scale (standard deviation, range, MAD)
- Standard deviation for proficiency assessment ($s_{pt}$, sigma-pt)
Description of performance of individual laboratories (z scores, plots etc.)

3 [Figure: two histograms of participant results for coliform bacteria 35/36/37 °C (MF). x-axis: No. of colonies per 100 ml; y-axis: No. of results; legend: Without remark, False negative, Outliers; the median is marked in each panel.]

4 [Figure: Distribution Plot of a normal distribution. x-axis: X; y-axis: Density.]

5 Methods for calculation of $x_{pt}$ & $s_{pt}$
Assigned value ($x_{pt}$)
- Value from (certified) reference material
- Consensus value from expert laboratories
- Consensus value from participant results ($x_i$)
Standard deviation ($s_{pt}$)
- Fitness-for-purpose value determined in advance
- Horwitz curve (mainly in chemistry)
- From participant results ($x_i$)

6 Statistics for assessment of performance
Difference: $D = x_i - x_{pt}$ ($x_i$ is the participant result)
Per cent difference: $D\% = 100\,(x_i - x_{pt})/x_{pt}$
z score: $z_i = (x_i - x_{pt})/s_{pt}$, where $s_{pt}$ is the standard deviation for proficiency assessment
z' scores, zeta ($\zeta$) scores and $E_n$ scores: used when measurement uncertainties are considered
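
The three statistics on this slide can be illustrated with a minimal Python sketch. The function name `performance_stats` and the example values (a result of 2.9 against an assigned value of 2.5 and an s_pt of 0.25, all in log10 cfu) are hypothetical and chosen only for illustration.

```python
def performance_stats(x_i, x_pt, s_pt):
    """Return (D, D%, z) for one participant result.

    x_i  : participant result
    x_pt : assigned value
    s_pt : standard deviation for proficiency assessment
    """
    d = x_i - x_pt                 # difference D
    d_percent = 100.0 * d / x_pt   # per cent difference D%
    z = d / s_pt                   # z score
    return d, d_percent, z


# Hypothetical example values (log10 cfu)
d, d_percent, z = performance_stats(x_i=2.9, x_pt=2.5, s_pt=0.25)
print(f"D = {d:.2f}, D% = {d_percent:.1f}, z = {z:.2f}")
```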

7 Questions
1. How to calculate appropriate assigned values, standard deviations and z scores?
2. Traditional methods or robust statistical methods?

8 Basic considerations
False positive results are removed without any calculation.
False negative results are removed
- without any calculation when the cfu concentration is high, or
- after calculation (e.g. as outliers) when the cfu concentration is low.
A plausible mean ($x_{pt}$) and a legitimate standard deviation ($s_{pt}$) should be determined with low/limited impact from false and/or extreme results.

9 [Figure: Histogram & estimated Normal distribution. Panels: Yeast (all) and Yeast (trimmed), each with Mean, StDev and N; y-axis: Frequency; median and mode are marked.]

10 How to get relevant statistical measures for location ($x_{pt}$) and scale ($s_{pt}$)
Traditional method (TM): remove outliers before calculation of mean and SD (after prior removal of obviously false results).
Robust statistics (RSM), in the strict sense: calculation without identifying and removing deviating results, using an iterative method to reduce the effect of moderately or highly deviating results on the mean and SD (false results are removed first).

11 Traditional outlier removal
Assumes an approximately normal distribution, at least after an appropriate transformation [$\log_{10}(\mathrm{cfu})$ or $\sqrt{\mathrm{cfu}}$].
Extreme results are usually present: blunders (false results, mixed-up samples/dilutions etc.) or results that deviate for unclear reasons.
Outlier tests designed for normal distributions (e.g. Grubbs' test) are usually applied even when the distribution is not perfectly normal.
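
As a rough illustration of the traditional approach, the sketch below log10-transforms colony counts and computes the Grubbs' test statistic G for the single most extreme result. The function names and the counts are hypothetical. The critical value that G must be compared with depends on the number of results and the chosen significance level, and is deliberately left to a published table rather than computed here.

```python
import math
import statistics


def log10_cfu(counts):
    """Log10-transform colony counts (all counts must be > 0)."""
    return [math.log10(c) for c in counts]


def grubbs_statistic(values):
    """Return (G, suspect) for the value furthest from the mean.

    G = |suspect - mean| / SD, with the ordinary sample SD.
    The result is an outlier candidate if G exceeds the tabulated
    critical value for len(values) at the chosen significance level.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    suspect = max(values, key=lambda v: abs(v - mean))
    return abs(suspect - mean) / sd, suspect


# Hypothetical colony counts with one extreme result
counts = [120, 150, 135, 160, 140, 145, 4000]
g, suspect = grubbs_statistic(log10_cfu(counts))
print(f"G = {g:.3f} for log10 result {suspect:.3f}")
```

Because one extreme value inflates both the mean and the SD, G is bounded by (n - 1)/sqrt(n); this is one reason why such tests struggle when two or more outliers lie in the same direction.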

12 Robust statistical methods (RSM)
RSM works well even when the results are only roughly normally distributed (e.g. "long tails").
There are many different estimators of robustly calculated location ("mean") and scale ("standard deviation").
With RSM there is no need to look for and identify results as outliers.
With TM, ordinary outlier tests often have problems when there are two or more outliers in one direction.

13 The principles for calculation of robust mean and standard deviation by an iterative (= repeated) process called Huber's method
Including the use of MAD, Median Absolute Deviation (sometimes the word Difference or Distance is used instead)

14 Huber's method, first steps (acc. to ISO FDIS 13528:2014)
Robust estimation of mean = assigned value ($x^*$)
1. Find the median of the results after sorting them. The median is insensitive to how far from it the deviating results are. The median is the initial $x^*$, a robust estimation of the mean.
Robust estimation of standard deviation ($s^*$)
2. Calculate the absolute differences between the participants' results $x_i$ and the median: $|x_i - x^*|$
3. Sort the absolute differences in ascending order and find the median of these differences = MAD. It is insensitive to how far from the median the deviating results are.
4. Initial $s^* = 1.5 \times \mathrm{MAD}$ (or more exactly: $1.483 \times \mathrm{MAD}$)
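
The first steps can be sketched with Python's standard `statistics` module. The function name `initial_robust_estimates` and the example results are hypothetical; the scaling factor 1.483 is the "more exact" value quoted on the slide.

```python
import statistics


def initial_robust_estimates(results):
    """Return (x_star, mad, s_star) from steps 1 to 4.

    x_star : median of the results (initial robust mean)
    mad    : median of the absolute differences |x_i - x_star|
    s_star : 1.483 * MAD (initial robust standard deviation)
    """
    x_star = statistics.median(results)                   # step 1
    abs_diffs = sorted(abs(x - x_star) for x in results)  # steps 2 and 3
    mad = statistics.median(abs_diffs)                    # step 3
    s_star = 1.483 * mad                                  # step 4
    return x_star, mad, s_star


# Hypothetical log10 cfu results with one strongly deviating value
results = [2.1, 2.3, 2.4, 2.4, 2.5, 2.6, 2.7, 4.9]
x_star, mad, s_star = initial_robust_estimates(results)
print(f"x* = {x_star:.3f}, MAD = {mad:.3f}, s* = {s_star:.3f}")
```

Neither the median nor the MAD changes if the value 4.9 is replaced by an even more extreme one, which is the insensitivity the slide refers to.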

15 Last steps: iterative process
5. Calculate: $\delta = 1.5\, s^*$ ($\delta$ = delta; a difference)
6. For each $x_i$ ($i = 1, 2, \ldots, p$), calculate:
$$x_i^* = \begin{cases} x^* - \delta, & \text{when } x_i < x^* - \delta \\ x^* + \delta, & \text{when } x_i > x^* + \delta \\ x_i, & \text{in other cases} \end{cases}$$
7. Calculate new values for $x^*$ and $s^*$:
$x^* = \sum x_i^* / p$ ($p$ = number of results)
$s^* = \mathrm{SD}(x_i^*)$
8. Repeat steps 5 to 7 until convergence
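
A minimal sketch of the whole iteration is given below, under stated assumptions. The function name, the convergence tolerance and the iteration cap are hypothetical choices. The slide writes the updated scale as the SD of the modified results; Algorithm A in ISO 13528, as far as I know, additionally multiplies that SD by 1.134 to compensate for the clipping, so the factor is exposed as a parameter (`factor=1.0` reproduces the slide's wording).

```python
import statistics


def huber_robust_mean_sd(results, factor=1.134, tol=1e-6, max_iter=100):
    """Iterative robust mean x* and standard deviation s*.

    factor   : multiplier applied to the SD of the modified results
               (assumption: 1.134 as in ISO 13528 Algorithm A)
    tol      : hypothetical convergence tolerance on x* and s*
    max_iter : hypothetical safety cap on the number of iterations
    """
    # Initial values: median and scaled MAD
    x_star = statistics.median(results)
    s_star = 1.483 * statistics.median(abs(x - x_star) for x in results)

    for _ in range(max_iter):
        delta = 1.5 * s_star                              # step 5
        low, high = x_star - delta, x_star + delta
        # Step 6: pull results outside x* +/- delta back to the limits
        modified = [min(max(x, low), high) for x in results]
        # Step 7: new x* and s* from the modified results
        new_x = statistics.mean(modified)
        new_s = factor * statistics.stdev(modified)
        converged = abs(new_x - x_star) < tol and abs(new_s - s_star) < tol
        x_star, s_star = new_x, new_s
        if converged:                                     # step 8
            break
    return x_star, s_star


# Hypothetical log10 cfu results with one strongly deviating value
results = [2.1, 2.3, 2.4, 2.4, 2.5, 2.6, 2.7, 4.9]
x_star, s_star = huber_robust_mean_sd(results)
print(f"robust mean = {x_star:.3f}, robust SD = {s_star:.3f}")
print(f"plain mean  = {statistics.mean(results):.3f}, "
      f"plain SD  = {statistics.stdev(results):.3f}")
```

The deviating result is not removed; it is pulled in to the limit x* + δ on every pass, so it still contributes to the estimates but with limited weight.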

16 Implications for performance
More z scores fall beyond the limits when deviating results are down-weighted during (RSM) or removed before (TM) the calculation of mean and SD, because the SD becomes smaller. As a result, more participant results are judged unsatisfactory.
Usual performance criteria:
$|z| \le 2.0$: satisfactory
$2.0 < |z| < 3.0$: questionable
$|z| \ge 3.0$: unsatisfactory
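
The usual performance criteria can be written as a small classification function. This is a sketch that assumes the conventional limits of 2.0 and 3.0 on the absolute z score; the function name `classify_z` is hypothetical.

```python
def classify_z(z):
    """Classify a z score using the conventional limits 2.0 and 3.0."""
    abs_z = abs(z)
    if abs_z <= 2.0:
        return "satisfactory"
    if abs_z < 3.0:
        return "questionable"
    return "unsatisfactory"


# Hypothetical z scores for four participants
for z in (0.4, -2.5, 3.0, -6.6):
    print(f"z = {z:+.1f}: {classify_z(z)}")
```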

17 Limitations for Huber's method
The underlying distribution should be roughly normal (= unimodal & symmetrical).
The number of deviating results must not be > 2% of all results.
Outliers are not directly removed as such.
Outliers can be characterized as those $x_i$ where $|x_i - x^*| \ge 3\,s^*$ (or 2. $s^*$) when $s^*$ is used as $s_{pt}$.
This corresponds to z scores where $|z| \ge 3$ (or $|z| \ge$ 2.).
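
Although the robust method does not remove outliers, results can still be flagged afterwards from the robust estimates. The sketch below assumes the multiplier 3 from the slide as the default; the function name and the example values of x* and s* are hypothetical.

```python
def flag_outliers(results, x_star, s_star, k=3.0):
    """Return the results with |x_i - x*| >= k * s*.

    k = 3.0 follows the slide; a smaller multiplier can be passed
    when a stricter rule is wanted.
    """
    return [x for x in results if abs(x - x_star) >= k * s_star]


# Hypothetical results and robust estimates (log10 cfu)
results = [2.1, 2.3, 2.4, 2.4, 2.5, 2.6, 2.7, 4.9]
print(flag_outliers(results, x_star=2.49, s_star=0.29))
```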

18 Examples from EURL Campylobacter trial PT 13,

19 [Figure: Histogram & estimated Normal distribution, one panel per sample, each with Mean, StDev and N; y-axis: Frequency. Panels: No 1. C. coli; No 2. E. coli; No 3. C. lari; No 4. C. jejuni; No 5. C. jejuni + E. coli; No 6. C. coli; No 7. C. jejuni + E. coli; No 8. C. jejuni; No 9. Blank; No 10. C. lari.]

20 Z scores when all results are used (deviating results included)
[Table: z scores per participant for samples No 1. C. coli, No 3. C. lari, No 4. C. jejuni, No 5. C. jejuni + E. coli, No 6. C. coli, No 7. C. jejuni + E. coli, No 8. C. jejuni and No 10. C. lari; the columns for No 2 and No 9 are marked "#".]

21 Z scores by robust method (Huber's)
[Table: z scores per participant for samples No 1. C. coli, No 3. C. lari, No 4. C. jejuni, No 5. C. jejuni + E. coli, No 6. C. coli, No 7. C. jejuni + E. coli, No 8. C. jejuni and No 10. C. lari.]

22 [Table: participant results per sample. Columns: No 1. C. coli; No 2. E. coli; No 3. C. lari; No 4. C. jejuni; No 5. C. jejuni + E. coli; No 6. C. coli; No 7. C. jejuni + E. coli; No 8. C. jejuni; No 9. Blank; No 10. C. lari.]

23 Other robust estimators
Location (central tendency)
- Median (= initial $x^*$)
Scale (dispersion of results, e.g. SD)
- MAD (Median Absolute Difference)
- Scaled MAD: $\mathrm{MADe} = 1.483 \times \mathrm{MAD}$ (= initial $s^*$)
- IQR (interquartile range = the 50% in the middle): 75th percentile of $x_i$ minus 25th percentile of $x_i$ ($i = 1, 2, \ldots, p$)
- Normalized (scaled) IQR: $\mathrm{IQRn} = 0.7413 \times \mathrm{IQR}$
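
Both scale estimators can be sketched with the standard library. The function names are hypothetical, and the constants 1.483 and 0.7413 are the usual normal-consistency factors. Note that quartile definitions differ between software packages; the `"inclusive"` method of `statistics.quantiles` is used here as one possible choice, not as the definition prescribed by any standard.

```python
import statistics


def scaled_mad(results):
    """MADe = 1.483 * median(|x_i - median|)."""
    med = statistics.median(results)
    return 1.483 * statistics.median(abs(x - med) for x in results)


def normalized_iqr(results):
    """IQRn = 0.7413 * (75th percentile - 25th percentile)."""
    q1, _, q3 = statistics.quantiles(results, n=4, method="inclusive")
    return 0.7413 * (q3 - q1)


# Hypothetical, evenly spread results
results = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(f"MADe = {scaled_mad(results):.3f}")
print(f"IQRn = {normalized_iqr(results):.3f}")
print(f"SD   = {statistics.stdev(results):.3f}")
```

For data without deviating results, the two robust estimates land close to each other and in the same range as the ordinary SD, which is what the scaling constants are meant to achieve for normally distributed data.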

24 References to Huber's method
ISO 5725-5:2003
ISO 13528:2005 (will be replaced by ISO (FDIS) 13528:2014)
AMC Technical Brief No. 6, April 2001 (Analytical Methods Committee, Royal Society of Chemistry 2001)


More information

BEIPH Final Report. QCMD 2010 Hepatitis B Virus DNA (HBVDNA10A) EQA Programme. William G MacKay on behalf of QCMD and its Scientific Council July 2010

BEIPH Final Report. QCMD 2010 Hepatitis B Virus DNA (HBVDNA10A) EQA Programme. William G MacKay on behalf of QCMD and its Scientific Council July 2010 QUALITY CONTROL for MOLECULAR DIAGNOSTICS The Altum Building, Todd Campus, West of Scotland Science Park, Glasgow, G20 0XA Scotland Tel: +44 (0) 141 945 6474 Fax: +44 (0) 141 945 5795 www.qcmd.org info@qcmd.org

More information

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

More information

FOOD FOR THOUGHT Topical Insights from our Subject Matter Experts UNDERSTANDING WHAT IS NEEDED TO PRODUCE QUALITY DATA

FOOD FOR THOUGHT Topical Insights from our Subject Matter Experts UNDERSTANDING WHAT IS NEEDED TO PRODUCE QUALITY DATA FOOD FOR THOUGHT Topical Insights from our Subject Matter Experts UNDERSTANDING WHAT IS NEEDED TO PRODUCE QUALITY DATA The NFL White Paper Series Volume 7, January 2013 Overview and a Scenario With so

More information

Means, standard deviations and. and standard errors

Means, standard deviations and. and standard errors CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

More information

AP * Statistics Review. Descriptive Statistics

AP * Statistics Review. Descriptive Statistics AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

More information

Interlaboratory studies

Interlaboratory studies Interlaboratory studies Department of Food Chemistry and Analysis, ICT Prague Vladimír Kocourek Prague, 2012 Interlaboratory studies Various titles: Interlaboratory proficiency test or studies Interlaboratory

More information

Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

More information

List of Examples. Examples 319

List of Examples. Examples 319 Examples 319 List of Examples DiMaggio and Mantle. 6 Weed seeds. 6, 23, 37, 38 Vole reproduction. 7, 24, 37 Wooly bear caterpillar cocoons. 7 Homophone confusion and Alzheimer s disease. 8 Gear tooth strength.

More information

Bernd Klaus, some input from Wolfgang Huber, EMBL

Bernd Klaus, some input from Wolfgang Huber, EMBL Exploratory Data Analysis and Graphics Bernd Klaus, some input from Wolfgang Huber, EMBL Graphics in R base graphics and ggplot2 (grammar of graphics) are commonly used to produce plots in R; in a nutshell:

More information

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles. Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible

More information

Chapter 2 Statistical Foundations: Descriptive Statistics

Chapter 2 Statistical Foundations: Descriptive Statistics Chapter 2 Statistical Foundations: Descriptive Statistics 20 Chapter 2 Statistical Foundations: Descriptive Statistics Presented in this chapter is a discussion of the types of data and the use of frequency

More information

American Association for Laboratory Accreditation

American Association for Laboratory Accreditation Page 1 of 12 The examples provided are intended to demonstrate ways to implement the A2LA policies for the estimation of measurement uncertainty for methods that use counting for determining the number

More information

What is Data Analysis. Kerala School of MathematicsCourse in Statistics for Scientis. Introduction to Data Analysis. Steps in a Statistical Study

What is Data Analysis. Kerala School of MathematicsCourse in Statistics for Scientis. Introduction to Data Analysis. Steps in a Statistical Study Kerala School of Mathematics Course in Statistics for Scientists Introduction to Data Analysis T.Krishnan Strand Life Sciences, Bangalore What is Data Analysis Statistics is a body of methods how to use

More information

Section 1.3 Exercises (Solutions)

Section 1.3 Exercises (Solutions) Section 1.3 Exercises (s) 1.109, 1.110, 1.111, 1.114*, 1.115, 1.119*, 1.122, 1.125, 1.127*, 1.128*, 1.131*, 1.133*, 1.135*, 1.137*, 1.139*, 1.145*, 1.146-148. 1.109 Sketch some normal curves. (a) Sketch

More information

Frequency Distributions

Frequency Distributions Descriptive Statistics Dr. Tom Pierce Department of Psychology Radford University Descriptive statistics comprise a collection of techniques for better understanding what the people in a group look like

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Variables. Exploratory Data Analysis

Variables. Exploratory Data Analysis Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

More information

Results of Proficiency Test Bisphenol A in Plastic May 2014

Results of Proficiency Test Bisphenol A in Plastic May 2014 Results of Proficiency Test Bisphenol A in Plastic May 214 Organised by: Author: Corrector: Report: Spijkenisse, the Netherlands ing. R.J. Starink dr. R.G. Visser & ing. N. Boelhouwer iis14p4 July 214

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data A Few Sources for Data Examples Used Introduction to Environmental Statistics Professor Jessica Utts University of California, Irvine jutts@uci.edu 1. Statistical Methods in Water Resources by D.R. Helsel

More information

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

More information

PTA proficiency testing for metal testing laboratories

PTA proficiency testing for metal testing laboratories PTA proficiency testing for metal testing laboratories BRIGGS Philip (Proficiency Testing Australia PO Box 7507, Silverwater, Rhodes NSW 2128 Australia ) Abstract: Proficiency testing is used internationally

More information

Classify the data as either discrete or continuous. 2) An athlete runs 100 meters in 10.5 seconds. 2) A) Discrete B) Continuous

Classify the data as either discrete or continuous. 2) An athlete runs 100 meters in 10.5 seconds. 2) A) Discrete B) Continuous Chapter 2 Overview Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Classify as categorical or qualitative data. 1) A survey of autos parked in

More information

Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

More information

Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

More information

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple. Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

More information

Northumberland Knowledge

Northumberland Knowledge Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

More information

What Does the Normal Distribution Sound Like?

What Does the Normal Distribution Sound Like? What Does the Normal Distribution Sound Like? Ananda Jayawardhana Pittsburg State University ananda@pittstate.edu Published: June 2013 Overview of Lesson In this activity, students conduct an investigation

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

Validation and Calibration. Definitions and Terminology

Validation and Calibration. Definitions and Terminology Validation and Calibration Definitions and Terminology ACCEPTANCE CRITERIA: The specifications and acceptance/rejection criteria, such as acceptable quality level and unacceptable quality level, with an

More information

Descriptive Analysis

Descriptive Analysis Research Methods William G. Zikmund Basic Data Analysis: Descriptive Statistics Descriptive Analysis The transformation of raw data into a form that will make them easy to understand and interpret; rearranging,

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information