General and statistical principles for certification of RM ISO Guide 35 and Guide 34 / REDELAC International Seminar on RM / PT 17 November 2010 Dan Tholen, M.S.
Topics Role of reference materials in traceability Definitions Determining the Assigned Property Value Uncertainty of the CRM Property Value Characterization Homogeneity Stability Combined Uncertainty
Role of reference materials Provide Metrological Traceability Identity Presence or absence Quantification Quality Control Proficiency Testing Transfer property values to other materials Validate measurement procedures
Definitions (Guide 34) reference material (RM) material, sufficiently homogeneous and stable with respect to one or more specified properties, which has been established to be fit for its intended use in a measurement process NOTE 1 RM is a generic term. NOTE 2 Properties can be quantitative or qualitative (e.g. identity of substances or species). NOTE 3 Uses can include the calibration of a measurement system, assessment of a measurement procedure, assigning values to other materials, and quality control.
Types of reference materials Calibration Material Positive / negative controls Purity compounds Quality Control material Proficiency Test item Material for method validation
Definitions (Guide 34) certified reference material (CRM) reference material characterized by a metrologically valid procedure for one or more specified properties, accompanied by a certificate that provides the value of the specified property, its associated uncertainty, and a statement of metrological traceability NOTE 1 The concept of value includes qualitative attributes such as identity or sequence. Uncertainties for such attributes may be expressed as probabilities. NOTE 2 Metrologically valid procedures for the production and certification of reference materials are given in, among others, ISO Guides 34 and 35.
Types of Certified Reference Materials Primary Standard Reference Materials Positive and Negative control materials Calibration materials Proficiency Testing Items using reference values Quality Control material if the value is known
Types of CRMs A CRM might be a national Standard Reference Material (SRM) A CRM does not need to be an SRM Certificate and property value Traceability to a stated reference Uncertainty
Definitions (VIM3) 5.1 measurement standard realization of the definition of a given quantity, with stated quantity value and associated measurement uncertainty, used as a reference
Definitions (VIM3) 5.4 primary measurement standard measurement standard established using a primary reference measurement procedure, or created as an artifact, chosen by convention (Guide 30) 2.3 primary standard: Standard that is designated or widely acknowledged as having the highest metrological qualities and whose value is accepted without reference to other Standards of the same quantity, within a specified context.
Definitions (VIM3) 5.5 secondary measurement standard measurement standard established through calibration with respect to a primary measurement standard for a quantity of the same kind (Guide 30) 2.4 secondary standard: Standard whose value is assigned by comparison with a primary Standard of the same quantity.
Summary: Types of Reference Materials Non-certified Reference Material Sufficiently homogeneous and stable May have information values Certified Reference Material RM with certificate and property value Statement of traceability and uncertainty Accreditable CRM Production meets Guide 34 (with 30 & 35) Certificate meets Guide 31
ISO REMCO Documents ISO Guide 34 and 35 developed out of step, but current versions are in harmony Guide 34 1998, 2000, 2009 Guide 35 1985, 1989, 2006 sort of
Guide 35 references in Guide 34 Summary: Guide 35 is referenced for: 5.12 Metrological traceability 5.13 Assessment of homogeneity 5.14 Assessment of stability 5.15 Characterization 5.16 Assignment of property values and their uncertainties
Guide 34 and 35 disharmony Guide 34 extends application of Guide 35 to include non-certified RM Degree of homogeneity Long term stability Guide 34 raises an issue not covered in Guide 35 determining equivalence of replacement batches Guide 35 does not allow significant long term instability, but Guide 34 says it can be addressed in the uncertainty
Determine Assigned Values Called Characterization of Property Values Can be accomplished in several ways By definition By formulation By testing
Definitions of metrological traceability (VIM3) metrological traceability - property of a measurement result whereby the result can be related to a reference through a documented unbroken chain of calibrations, each contributing to the measurement uncertainty (+8 Notes)
Definitions (VIM3) NOTE 7 The ILAC considers the elements for confirming metrological traceability to be an unbroken metrological traceability chain to an international measurement standard or a national measurement standard, a documented measurement uncertainty, a documented measurement procedure, accredited technical competence, metrological traceability to the SI, and calibration intervals (see ILAC-P10:2002[9]).
More on Traceability (ILAC P10) P10: section 2a): Where such traceability is not technically possible or reasonable, the laboratory and the client and other interested parties may agree to using certified reference materials provided by a competent supplier or using specified methods and/or consensus standards that are clearly described and agreed by all parties concerned; Note 1: It is recognised by ILAC that, due to the nature of some tests, it is not possible, realistic or relevant to expect traceability of measurement results to be demonstrated. ILAC Member Bodies have agreed to investigate this issue and develop guidelines on such exceptions and areas where requirements for traceability are difficult to apply.
9.3 Approaches for characterization Four approaches given in Guide 34: a) measurement by a single (primary) method in a single laboratory (includes formulation); b) measurement by two or more independent reference methods in one laboratory; c) measurement by a network of laboratories using one or more methods of demonstrable accuracy; d) a method-specific approach giving only method-specific assessed property values, using a network of laboratories.
Property Characteristics - Identity Can be accomplished by definition or expert judgment What is this material? Should be confirmed by testing Presence (positive) or absence (negative, or blank) All measurements above Limit of Detection Should be confirmed by independent party Property characteristic
9.4.1 Single Laboratory Prefer primary methods (see definition in G35) Prefer more than one method, or confirm Primary methods not always available, others are commonly used Gravimetry for gas mixtures and solutions Freezing point depression (for purity) IDMS where applicable Commercial CRMs usually use one laboratory, may use one method (or use one method to confirm manufacture)
9.4.2 More than one Laboratory Necessary basis of approach: a) there exists a set of laboratories that are equally capable in determining the characteristics of the RM to provide results with acceptable accuracy; b) the differences between individual results, both within and between laboratories, are statistical in nature regardless of the causes.
9.4.2 More than one Laboratory Use caution in the assumption of equivalence of results Verify measurement uncertainty Verify equivalence of methods Results must be metrologically traceable to the same reference Ideally, laboratories should be accredited
9.4.2 More than one Laboratory Recommended number of labs varies Primary or well-established methods: 2-3 labs Less well established methods, but expect that all results will be valid: 6-8 labs Many methods or chance of invalid results: 10-15 labs Consider whether a balanced representation of methods is needed
9.4.2 More than one Laboratory Recommended number of replicates: 2 units of RM At least 6 replicates over 2 days Separate calibrations on every replicate If homogeneity will be determined by the experiment, need 3-4 units per laboratory Report each result, not means Report uncertainty and method of uncertainty determination (if appropriate)
9.5 RM Property considerations 9.5.2.1 Chemical properties certified for purity Nothing is 100% pure - all impurities and mass fractions should be listed Listed purity = 1 - sum of mass fractions of impurities The combined standard uncertainty of the amount of substance is the quadratic sum of uncertainties of the impurities
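The purity rule on this slide can be sketched in a few lines of Python. This is only an illustration: it reads "quadratic sum" as the root sum of squares of the impurity uncertainties, which assumes the impurity determinations are uncorrelated. The function name and the impurity values are hypothetical.

```python
import math

def purity_with_uncertainty(impurities):
    """Return (purity, standard uncertainty) from impurity data.

    impurities: list of (mass_fraction, standard_uncertainty) pairs in g/g.
    Assumes the impurity determinations are uncorrelated.
    """
    purity = 1.0 - sum(w for w, _ in impurities)
    u_purity = math.sqrt(sum(u ** 2 for _, u in impurities))
    return purity, u_purity

# Hypothetical impurity data, for illustration only
impurities = [(0.0020, 0.0003), (0.0010, 0.0004)]
purity, u_purity = purity_with_uncertainty(impurities)
print(f"purity = {purity:.4f} g/g, u = {u_purity:.4f} g/g")
```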
9.5 RM Property considerations 9.5.2.2 Chemical properties of synthetic RMs - solutions and gas mixtures Often manufactured by gravimetry: Gravimetric value is basis of certification Verify value with suitable method Homogeneity study for bottle variability Long term stability study Uncertainty due to verification and to bottle homogeneity assumed to be small, but should be included
9.5 RM Property considerations 9.5.3 Conventional properties materials defined by a method, test procedure, or particular piece of equipment Often subject to large variability Need very detailed method description Need careful control of particular equipment Can run a known to verify equipment is OK at the time of the measurement, and provide traceability Consider using more than one piece of equipment, laboratory, or operator Assure independent calibrations
10.5.5 Treatment of Outliers Outliers are different from stragglers (results on the limits of the distribution). Outliers should be eliminated, stragglers should be retained Concern is underestimating the uncertainty Outliers can occur at any level: single values, means, variances, methods, laboratories Choice of statistician whether to remove, based on tests and confirmation of assumptions Outliers are rarely replaced, and only if conditions can be replicated
10.8 Specific Issues ANOVA based evaluation is recommended where possible, to assure assessment of key components in same manner If collaborative study used to determine homogeneity, use 2-way nested ANOVA with balanced design Other procedures are possible Robust means Weighted means useful for combining uncertainties Other
Uncertainty! The assigned property value is meaningless without two things: Statement of metrological traceability Estimate of uncertainty Applies to quantitative and qualitative property values
6.1-6.2 Measurement uncertainty Basic principle: The uncertainty of the Certified Reference Material must describe what is in a single test portion taken by a CRM user, for a CRM that has been stored and shipped according to specification, and is used within the claimed shelf life It is NOT the uncertainty of the mean of the batch of material made by the RM producer It is NOT the uncertainty of the mean of results from a characterization study (u char)
Measurement Uncertainty Follow principles in GUM, prefer Type A Must estimate uncertainty due to Characterization Homogeneity Between bottle always Within bottle where appropriate Transport (in excess of long term instability) Long term stability in storage by producer May need to add u for different methods / labs
Definitions (Guide 34) 3.8 measurement uncertainty non-negative parameter characterizing the dispersion of the quantity values being attributed to a measurand, based on the information used Applies only when measurements are used Does not apply to characteristics (nominal properties) Does apply to presence / absence
Definitions (Guide 35) 3.5 between-bottle homogeneity bottle-to-bottle variation of a property of a reference material [S bb] NOTE It is understood that the term between-bottle homogeneity applies to other types of packages (e.g. vials) and other physical shapes and test pieces. 3.6 within-bottle homogeneity variation within one bottle of a property of a reference material [S wb]
Homogeneity for Qualitative RM Can be determined by all tested samples having the defined property (for example, presence or absence) Can be determined by calculating a confidence interval for the property value For Positive, lower limit is greater than the Limit of Detection For Negative, upper limit is less than the Limit of Detection
5.8 Project Design - Homogeneity Always required When homogeneity can be reasonably expected e.g. solutions and purified material Inherent inhomogeneity e.g. soils Need representative sample Number of samples produced Number to be tested - generally 10-30 Number of replicates - generally 2-5 Selection process (random, systematic, etc.)
How many Bottles and How many Replicates? No easy formula Between Bottle variance σ bb Within Bottle variance σ wb Repeatability σ r Number of samples produced Need for uncertainty of estimates s bb s wb
7.1-7.6 Homogeneity Study Assure sufficient subsample for testing Measure every element of RM? If not measured, element must be demonstrated to be correlated with tested element Can come from literature or other source Randomized measurements Check for trend in measurement order Check for trend in manufacture order
7.8 Homogeneity Study Estimating between-bottle variance Two models discussed: Subsampling is possible Subsampling is not possible Statistical analysis the same for both (fully nested one factor design) (reference Figures 1 and 2, clause 7.8)
7.8-7.9 Homogeneity Study $s_{bb}^2 = s_A^2 = (MS_{among} - MS_{within})/n_0$ and $s_{bb} = u_{bb} = \sqrt{s_A^2}$ NOTE: If $MS_{within} > MS_{among}$ then $s_{bb}^2 = 0$
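As an illustration of the one-factor ANOVA estimate on this slide, the following Python sketch computes MS among, MS within and s bb for a balanced design. The function name and the data are hypothetical, and setting a negative variance estimate to zero follows the NOTE on the slide.

```python
import math

def between_bottle_sd(data):
    """One-way ANOVA estimate of the between-bottle standard deviation.

    data: list of bottles, each a list of n0 replicate results (balanced).
    Returns (ms_among, ms_within, s_bb).
    """
    a = len(data)          # number of bottles
    n0 = len(data[0])      # replicates per bottle
    if any(len(bottle) != n0 for bottle in data):
        raise ValueError("this sketch assumes a balanced design")
    means = [sum(bottle) / n0 for bottle in data]
    grand_mean = sum(means) / a
    ms_among = n0 * sum((m - grand_mean) ** 2 for m in means) / (a - 1)
    ms_within = sum(
        (x - m) ** 2 for bottle, m in zip(data, means) for x in bottle
    ) / (a * (n0 - 1))
    # If MS within > MS among the variance estimate is set to zero
    s2_bb = max((ms_among - ms_within) / n0, 0.0)
    return ms_among, ms_within, math.sqrt(s2_bb)

# Hypothetical results: 3 bottles, 2 replicates each
bottles = [[10.0, 10.2], [10.4, 10.6], [9.8, 10.0]]
ms_among, ms_within, s_bb = between_bottle_sd(bottles)
print(f"MS among = {ms_among:.4f}, MS within = {ms_within:.4f}, s_bb = {s_bb:.4f}")
```

A real homogeneity study would use more bottles (the deck suggests 10-30); three are used here only to keep the arithmetic easy to follow.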
7.9 $u_{bb}$ when $s_r$ is large When measurement has poor repeatability or for other reasons, if $MS_{within} > MS_{among}$ then we cannot say $u_{bb} = 0$ This is actually fairly common It is necessary to use an upper limit for $u_{bb}$: $u_{bb} = \sqrt{MS_{within}/n_0} \cdot \sqrt[4]{2/\nu_{MS_{within}}}$ where $\nu_{MS_{within}}$ = degrees of freedom for $MS_{within}$ (usually the number of bottles × ($n_0$ - 1)) and $n_0$ = number of replicates
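A minimal Python sketch of this upper limit follows, assuming a balanced design so that the degrees of freedom are the number of bottles times (n0 - 1). The function name and the input values are hypothetical.

```python
import math

def u_bb_upper_limit(ms_within, n0, n_bottles):
    """Upper limit for u_bb when repeatability masks between-bottle variation.

    ms_within: within-bottle mean square from the ANOVA
    n0:        number of replicates per bottle
    n_bottles: number of bottles (balanced design assumed)
    """
    nu = n_bottles * (n0 - 1)  # degrees of freedom of MS within
    return math.sqrt(ms_within / n0) * (2.0 / nu) ** 0.25

# Hypothetical values: MS within = 0.04, 2 replicates, 8 bottles
u_bb_limit = u_bb_upper_limit(ms_within=0.04, n0=2, n_bottles=8)
print(f"upper limit for u_bb = {u_bb_limit:.3f}")
```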
$u_{bb}$ when $s_r$ is large This is a fairly common situation, and alternative limits may be used Upper limit of the confidence interval for $(MS_{among} - MS_{within})/n_0$ could be used $u_{bb}^2 = s_{bb}^2 + s_r^2/n_0$ Could use $u_{bb} = \sqrt{s_{bb}^2 + MS_{within}/n_0}$ with $s_{bb} \ge 0$, as calculated
Homogeneity question Should it be possible that u bb = 0? Should there always be a component due to between bottle differences? If a RM Producer knows s r and the experiment shows s r is in control at the time, can we allow s 2 bb = 0?
7.10 Within Bottle Homogeneity Not relevant in some cases Not always necessary even when possible Within bottle always mixed with repeatability at some level (s r, s wb) Estimated as MS within in most cases
Definitions (Guide 35) 3.10 short-term stability stability of a property of a reference material during transport under specified transport conditions [S sts] 3.11 long-term stability stability of a property of a reference material under specified storage conditions at the CRM producer [S lts]
5.9 Project Design - Stability Need to describe conditions that affect stability of material Choose storage temperature that is best for long term stability Study short term stability (transport): Under stressed conditions Under specified conditions Plan to retain sufficient samples for long term monitoring under storage conditions
Project Design - Stability Stability testing may require many samples: at least 2 bottles at every temperature and every time Short term, 3-5 points in 2 weeks Long term, 3-4 points for regression
8.1 Stability Studies Two types of stability are of interest Long term (shelf life) in storage at RM Producer Short term when shipped to user in addition to stability in storage If it is not possible to maintain stability during shipment, add a component to uncertainty
Stability Studies Guide 35 states (8.1) It is often equally important to know what might happen to the sample if proper transport conditions are not maintained That is, the RM Producer should test the effects of shipment conditions that are possible but not likely if reasonable precautions are taken Recommendation is to assist in developing guidelines for shipment Should unexpected conditions be included in the uncertainty claim?
8.2 Long term Stability Studies Classical: Store under defined conditions Test periodically for change, within-lab conditions Retain records, to develop experience and detect deterioration Isochronous: Controlled experiment with some samples stressed by heat (or other suspected conditions) Mathematical model to predict long term change due to conditions Guide 35 prefers isochronous estimates
8.3 Evaluation of Stability Data Prefer to have a kinetic model for instability, based on understanding Can evaluate change with a linear model (simple linear regression) Y ij = β 0 + β 1 X i + ε ij Y ij = stability result at time i for sample j (if > 1) X i = time i β 0 is the regression coefficient for intercept β 1 is the regression coefficient for slope ε ij is random error for time i and sample j
8.3 Evaluation of Stability Data β 1 is expected to be zero for a stable RM (no change over time) β 0 is the intercept; this has little real interpretation, but should be close to the assigned value (the mean at time 0) ε ij is expected to be random, normally distributed Need to test statistical significance of regression (F test), same as significance of β 1 (t test) Reference Guide 35 8.3, page 23
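The slope test described on these slides can be sketched with ordinary least squares in plain Python. The data, the function name and the variable names are hypothetical. The critical value 3.182 is the two-sided 95% Student t value for 3 degrees of freedom, which matches the five time points used here; other designs need the value for n - 2 degrees of freedom.

```python
import math

def slope_test(times, values):
    """Fit Y = b0 + b1*X by least squares and return (b0, b1, s_b1, t_stat)."""
    n = len(times)
    x_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((x - x_bar) ** 2 for x in times)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(times, values))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(times, values))
    s_resid = math.sqrt(sse / (n - 2))   # residual standard deviation
    s_b1 = s_resid / math.sqrt(sxx)      # standard error of the slope
    return b0, b1, s_b1, b1 / s_b1

# Hypothetical stability results at 0, 6, 12, 18 and 24 months
times = [0, 6, 12, 18, 24]
values = [100.1, 99.9, 100.2, 99.8, 100.0]
b0, b1, s_b1, t_stat = slope_test(times, values)

T_CRIT = 3.182  # two-sided 95% t value for n - 2 = 3 degrees of freedom
trend_significant = abs(t_stat) > T_CRIT
print(f"b1 = {b1:.4f}, s(b1) = {s_b1:.4f}, t = {t_stat:.2f}, "
      f"significant trend: {trend_significant}")
```

With more than one sample per time point, the same time value is simply repeated in `times`.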
8.3 Evaluation of Stability Data If the regression is significant there is a trend and this usually means the RM cannot be certified. Options: Do not certify the material Shorten the shelf life Make the uncertainty larger (in new Guide 34)
8.4.1 Monitoring Stability - evaluation Possible for instability to occur suddenly, not as gradual trend Monitoring assures uncertainty estimates are still valid Can be done with isochronous design (see References)
8.5 Determine Shelf Life in relation to Long Term Stability Even if there is no significant trend of instability, there is a need to make allowance for long term degradation To do so, describe a model where it is assumed that the property value Y decreases linearly from the initial value $Y_0$ with a constant relative degradation rate b as a function of time X: $Y(Y_0, b, X) = Y_0 (1 + bX)$ See reference 25
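One common way to turn this model into an uncertainty component is to multiply the standard error of the slope by the intended shelf life. The sketch below assumes that form, u lts = s(b) · X, which is not spelled out on the slide. The function name and the numbers are hypothetical.

```python
def u_lts_from_slope(s_b, shelf_life, y0=1.0):
    """Long-term stability uncertainty, assuming u_lts = y0 * s_b * shelf_life.

    s_b:        standard error of the relative degradation rate b (per month)
    shelf_life: intended shelf life X, in the same time unit (months)
    y0:         initial property value; leave at 1.0 for a relative result
    """
    return y0 * s_b * shelf_life

# Hypothetical values: s(b) = 0.0002 per month, 36-month shelf life
u_lts_rel = u_lts_from_slope(0.0002, 36)           # relative uncertainty
u_lts_abs = u_lts_from_slope(0.0002, 36, y0=50.0)  # in units of the property
print(f"u_lts = {u_lts_rel:.4f} (relative), {u_lts_abs:.2f} (absolute)")
```

Because u lts grows linearly with X under this assumption, a shorter shelf life gives a smaller uncertainty component.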
Combined Uncertainty After determining uncertainty components, combine as Root Sum of Squares $u_{CRM} = \sqrt{u_{char}^2 + u_{bb}^2 + u_{lts}^2 + u_{sts}^2}$ May need to add other components for a specific Certified Reference Material
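The root-sum-of-squares combination can be written as a short Python helper. The function name and the component values are hypothetical, and the extra positional arguments stand in for any further components a specific CRM may need.

```python
import math

def combined_uncertainty(u_char, u_bb, u_lts, u_sts=0.0, *other_components):
    """Combine standard uncertainty components as a root sum of squares."""
    components = (u_char, u_bb, u_lts, u_sts, *other_components)
    return math.sqrt(sum(u ** 2 for u in components))

# Hypothetical components, all in the units of the property value
u_crm = combined_uncertainty(u_char=0.12, u_bb=0.09, u_lts=0.08, u_sts=0.0)
print(f"u_CRM = {u_crm:.3f}")
```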
6.2 Measurement uncertainty $x_{CRM} = x_{char} + \delta x_{bb} + \delta x_{lts} + \delta x_{sts}$ With care, $\delta x_{bb}$, $\delta x_{lts}$, $\delta x_{sts}$ are zero, so $x_{CRM} = x_{char}$ Even if $\delta x = 0$, uncertainties are > 0
Notes on uncertainty S bb and S lts must always be included in the uncertainty estimate as components > 0 S wb and S sts need to be considered, but these components can be = 0
Expanded uncertainty Convention is for 95% coverage k=2 is acceptable Can use t statistic Use confidence interval if distribution is not symmetric Check: all homogeneity and stability test results should be within the expanded uncertainty interval (can expand for laboratory's measurement uncertainty)
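The check on this slide can be sketched as follows. The slide does not say how the interval is widened for the laboratory's measurement uncertainty; combining u CRM and the laboratory's standard uncertainty in quadrature before applying k is an assumption of this sketch. The function names and the values are hypothetical.

```python
import math

def expanded_uncertainty(u_crm, k=2.0):
    """Expanded uncertainty U = k * u_CRM (k = 2 for about 95% coverage)."""
    return k * u_crm

def all_within_interval(results, assigned_value, u_crm, u_lab=0.0, k=2.0):
    """Check that every result lies within the (optionally widened) interval.

    Assumption: the laboratory's standard uncertainty u_lab is combined
    with u_crm in quadrature before multiplying by k.
    """
    half_width = k * math.sqrt(u_crm ** 2 + u_lab ** 2)
    return all(abs(r - assigned_value) <= half_width for r in results)

# Hypothetical assigned value and test results
U = expanded_uncertainty(0.17)
ok = all_within_interval([9.8, 10.1, 10.3], assigned_value=10.0, u_crm=0.17)
print(f"U = {U:.2f}, all results within interval: {ok}")
```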
Data exercises Conduct the analyses in Annex B.5, B.6, B.7 and match the Annex B results B.5 Simple Linear Regression, u lts per 8.5 and u lts per 8.3.2 (next-to-last paragraph) B.6 Characterization using ANOVA; grand mean, uncertainty associated with grand mean, and s r B.7 Weighting; weights, final weights, mean, and uncertainty of mean
Thank you