Radon, the lognormal distribution and deviation from it

Similar documents
Data Preparation and Statistical Displays

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

RADON AND HEALTH. INFORMATION SHEET October What is radon and where does it come from?

Radon in Nordic countries

Simple linear regression

SIMULATION STUDIES IN STATISTICS WHAT IS A SIMULATION STUDY, AND WHY DO ONE? What is a (Monte Carlo) simulation study, and why do one?

Radiological Protection Principles concerning the Natural Radioactivity of Building Materials

Modeling the Distribution of Environmental Radon Levels in Iowa: Combining Multiple Sources of Spatially Misaligned Data

Jitter Measurements in Serial Data Signals

The correlation coefficient

Radon in Northern Ireland Homes: Report of a Targeted Survey

Algebra Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard

I. Basic concepts: Buoyancy and Elasticity II. Estimating Tax Elasticity III. From Mechanical Projection to Forecast

Determination of g using a spring

Content Sheet 7-1: Overview of Quality Control for Quantitative Tests

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

Is log ratio a good value for measuring return in stock investments

Radon from building materials

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

GRADES 7, 8, AND 9 BIG IDEAS

Fairfield Public Schools

What s Wrong with Multiplying by the Square Root of Twelve

Calibration of Uncertainty (P10/P90) in Exploration Prospects*

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

What Does the Normal Distribution Sound Like?

All aspects on radon. Case 3. Background. Christoffer Baltgren,

Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

The indoor Rn-concentration: schools, kindergartens and official buildings versus homes.

Short title: Measurement error in binary regression. T. Fearn 1, D.C. Hill 2 and S.C. Darby 2. of Oxford, Oxford, U.K.

4. Simple regression. QBUS6840 Predictive Analytics.

RESULTS OF THE NATIONAL SURVEY ON RADON INDOORS IN ALL THE 21 ITALIAN REGIONS

Chapter 3 RANDOM VARIATE GENERATION

Premaster Statistics Tutorial 4 Full solutions

THE RESULTS OF THE LITHUANIAN RADON SURVEY. Gendrutis Morkunas 1, Gustav Akerblom 2

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but

MEASUREMENT PROTOCOLS FOR RADON IN DWELLINGS IN SWEDEN; THIRTEEN YEARS OF EXPERIENCE

Correlation key concepts:

5/31/ Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

RANDOM VIBRATION AN OVERVIEW by Barry Controls, Hopkinton, MA

Radiation Epidemiology. Radon in homes

TRADING SYSTEM EVALUATION By John Ehlers and Mike Barna 1

Dose conversion factors for radon

Gamma Distribution Fitting

NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS

8 INTERPRETATION OF SURVEY RESULTS

Probabilistic Risk Assessment Methodology for Criteria Calculation Versus the Standard (Deterministic) Methodology

CRIIRAD report N Analyses of atmospheric radon 222 / canisters exposed by Greenpeace in Niger (Arlit/Akokan sector)

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

WM2012 Conference, February 26 March 1, 2012, Phoenix, Arizona, USA

Radon in Homes and Lung Cancer Risk. Sarah C Darby CTSU, University of Oxford

Setting the scene. by Stephen McCabe, Commonwealth Bank of Australia

How Far is too Far? Statistical Outlier Detection

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

Module 5: Multiple Regression Analysis

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Uncertainty evaluations in EMC measurements

analyzing small data sets

THANK YOU. Realtors Protect Families From the Risk of Lung Cancer When Sellers Install Quality Radon Reduction Systems

or, put slightly differently, the profit maximizing condition is for marginal revenue to equal marginal cost:

What does the number m in y = mx + b measure? To find out, suppose (x 1, y 1 ) and (x 2, y 2 ) are two points on the graph of y = mx + b.

Online Appendices to the Corporate Propensity to Save

Algebra 1 Course Information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Click Here to Register for One or More of These Sessions On-Line

Spectrophotometry and the Beer-Lambert Law: An Important Analytical Technique in Chemistry

Pricing complex options using a simple Monte Carlo Simulation

ANNEX E. Sources-to-effects assessment for radon in homes and workplaces. Contents. Page INTRODUCTION

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Simple Regression Theory II 2010 Samuel L. Baker

Module 3: Correlation and Covariance

4. Continuous Random Variables, the Pareto and Normal Distributions

AP Physics 1 and 2 Lab Investigations

Measuring Line Edge Roughness: Fluctuations in Uncertainty

Means, standard deviations and. and standard errors

Radiation Epidemiology. Radon in homes

Model-based Synthesis. Tony O Hagan

Point and Interval Estimates

2DI36 Statistics. 2DI36 Part II (Chapter 7 of MR)

Physics Lab Report Guidelines

Experiment 10. Radioactive Decay of 220 Rn and 232 Th Physics 2150 Experiment No. 10 University of Colorado

AN INVESTIGATION INTO THE USEFULNESS OF THE ISOCS MATHEMATICAL EFFICIENCY CALIBRATION FOR LARGE RECTANGULAR 3 x5 x16 NAI DETECTORS

Minnesota Academic Standards

Time series analysis as a framework for the characterization of waterborne disease outbreaks

Manual for simulation of EB processing. Software ModeRTL

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Common Core State Standards for Mathematics Accelerated 7th Grade

Glencoe. correlated to SOUTH CAROLINA MATH CURRICULUM STANDARDS GRADE 6 3-3, , , 4-9

Protein Prospector and Ways of Calculating Expectation Values

Factors affecting online sales

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

EXPERIMENT: MOMENT OF INERTIA

Decision Analysis. Here is the statement of the problem:

9 Multiplication of Vectors: The Scalar or Dot Product

STATEMENT ON ESTIMATING THE MORTALITY BURDEN OF PARTICULATE AIR POLLUTION AT THE LOCAL LEVEL

The Point-Slope Form

Radon: The Second Leading Cause of Lung Cancer? Deh, deh, deh, dehhhhh..

WHAT IS RISK? Definition: probability of harm or loss Risk = Hazard x Exposure Risk can be voluntary or involuntary Interpretation of risk differs for

Introduction to Regression and Data Analysis

Prelab Exercises: Hooke's Law and the Behavior of Springs

Transcription:

Radon, the lognormal distribution and deviation from it Zornitza Daraktchieva and Jon Miles Health Protection Agency, UK 1th International Workshop on the Geological Aspects of Radon Risk Mapping 22 25 September, 21 Prague, Czech Republic Centre for Radiation, Chemical and Environmental Hazards

Why lognormal? Many surveys of radon in homes in different countries have found the results to follow a lognormal distribution, or close to lognormal. The reason for this can be understood in terms of the normal distribution: Values calculated by summing 12 or more random numbers follow a normal distribution Values calculated by multiplying 12 or more random numbers follow a lognormal distribution

Normal distribution from summing random numbers 14 12 1 8 1 random number 6 3 random numbers 4 12 random numbers 2.2.4.6.8 1

Lognormal distribution from multiplying random numbers 1 8 6 4 1 random number 3 random numbers 6 random numbers 12 random numbers 2.2.4.6.8 1 1.2

Normal probability plot We use a graphical technique called the normal probability plot (or normal Quantile-Quantile, Q-Q plot) to assess whether the data set is normally distributed. The data are plotted against a theoretical normal distribution in such a way that the points should form an approximate straight line. Departures from this straight line indicate departures from normality. The line fitted to the data points gives the theoretical normal distribution with parameters corresponding to the mean (intercept) and standard deviation (slope).

Q-Q plot of normal distribution.9.8.7.6.5.4.3.2.1-4 -3-2 -1 1 2 3 4 5 Normal order statistics medians This is the Q-Q plot of 32, points, each generated by summing 12 random numbers.

Variable magnitudes of random numbers 12 1 8 6 4 2-4 -3-2 -1 1 2 3 4 5 Normal order statistics medians The random numbers do not have to be of the same magnitude this is a plot of values where the random numbers varied over more than an order of magnitude

Radon from the ground into homes Radon source: radium-226 in rocks and soils Factors controlling how much radon enters a building: Ra-226 source concentration Grain size of source Moisture level around grains Permeability Entry routes into homes Underpressure in homes Ventilation rates Etc, etc............. All are multiplying factors on the process of radon travel from origin to air in homes hence expect lognormal distribution

But why should national distributions be lognormal? The multiplying factors controlling radon entry can explain why radon concentrations in homes in one uniform area should be lognormally distributed. But regional or national distributions must be sums of many local distributions. So how do the local lognormal distributions sum to a different lognormal nationally? The sum of many lognormal distributions may also be lognormal (but is not always so)

How many samples are there in the radon distribution? We don t know how many individual distributions might coexist in the measured radon distribution. The measured radon can be expressed by multiplicative chain of various factors (J. Miles, Radiation Protection Dosimetry, 1994): Ln(R net ) =ln(r soil ) + ln (A) + ln(b) + ln(c) + ln(d) +...... We can show that the radon distribution can be presented as a linear combination of many different lognormal distributions.

ln radon Monte Carlo simulations of radon distribution 6 samples model We have simulated 6 different weight samples with standard deviations of.85 and means of 1., 1.5, 2., 2.5, 3. and 3.5. The combined sample is a normal distribution with a mean of 2.2 and a standard deviation of 1.9 (red)..45.4.35.3.25 7 6 5 y = 1.891x + 2.246.2 4.15 3.1 2.5 1-4 -2 2 4 6 8 ln Radon -4-2 2 4-1 -2 Normal order statistics medians

ln radon Monte Carlo simulations of radon distribution 8 samples model 8 different weight samples with standard deviations of.85 and means of.8, 1.2, 1.6, 2., 2.4, 2.8, 3.2 and 3.6. The result is a normal distribution with a mean of 2.2 and a standard deviation of 1.1 (red)..4.35.3.25 7 6 y = 1.14x + 2.261.2.15.1.5 5 4 3 2 1-4 -3-2 -1 1 2 3 4-4 -2 2 4 6 8-1 ln Radonmean Normal order statistics medians -2-3

Comparison of lognormal model with UK radon distribution National survey of the radon concentrations in UK Data collected in a national survey on natural radiation exposure in UK dwellings (Wrixon et al. 1988). Sets of etched track detectors were placed in the main living area and in the bedroom. Detectors were placed for 2 consecutive periods of 6 months. A random sampling procedure was employed to select the representative sample of houses from the UK. The radon concentration was measured in 2,94 homes.

Frequency Measured radon concentrations appear to follow a lognormal distribution 3 25 2 15 1 5 2 4 6 8 1 Radon (Bq.m -3 )

ln radon Lognormal distribution? 8 7 6 5 4 3 2 1-4 -3-2 -1 1 2 3 4 Normal order statistics medians The UK distribution is not a straight line, so is not a simple lognormal distribution

Possible cause of deviation from straight line We know that the mean outdoor radon concentration in the UK is 4 Bq m -3. Outdoor radon is an addition to indoor radon which is not part of the multiplicative model for indoor radon. Therefore we subtract 4 Bq m -3 from each measurement result, and replot.

ln radon The radon distribution after subtraction of outdoor radon from each result Normal probability plot of national survey data 8 7 y = 1.949x + 2.2342 6 5 4 3 2 1-4 -3-2 -1 1 2 3 4-1 Normal order statistics medians -2 The result is a straight line over most of the range of the distribution. But there is a deviation from normality (the straight line) in the lower and upper part of the spectrum -3

ln radon Monte Carlo simulations of the national survey Here we modify the 6-sample model to simulate the deviation from lognormal at high radon concentrations. The 5 samples have means of.8, 1.2, 2., 2.4, 2.8 and standard deviations of.85. The 6 th sample with mean of 3.2 has a standard deviation of 1.5. The samples have different weights and are not symmetrical with respect to the mean of the combined sample (red)..4.35.3 8 7 y = 1.788x + 2.265.25 6.2 5.15 4.1 3 2.5 1-4 -2 2 4 6 8 1-4 -2 2 4-1 Normal order statistics medians We therefore conclude that the deviation observed in the upper part of the national survey is probably due to the contribution from one or more local distributions with high concentrations. -2

Effects of random uncertainties in radon measurement results We know that there are random uncertainties in radon measurement results, and that they are proportionately larger for low radon concentrations. This could affect the lower end of the lognormal plot. We will therefore add the effects of typical uncertainties in radon measurements on the simulated combined distribution on the slide 18.

-.2.2.6 1 1.4 1.8 2.2 2.6 3 3.4 3.8 4.2 4.6 5 5.4 5.8 6.2 6.6 7 7.4 7.8 Frequency ln radon Monte Carlo simulations of the national survey with measurement error A measurement error is added to the simulations. In addition the results below zero were set to 1. 4 MC 8 35 data 7 3 6 25 5 2 15 4 1 3 5 2 1 ln Radon -4-3 -2-1 1 2 3 4 Normal order statistics medians This reproduces the deviation from a straight line at low radon concentrations

Deviation from lognormal at upper end of distribution The distribution in the national survey, if extrapolated as a straight line to higher radon concentrations, predicts that the total number of homes in the UK above a radon concentration of 5, Bq m -3 is.7 homes. That was based on a systematic, unbiased survey. We have many more results from later, biased surveys. In later surveys, we have already found 38 homes above 5, Bq m -3, and we have certainly not yet found all of them. Clearly the actual distribution at very high concentrations does not follow the lognormal distribution found below 2 Bq m -3.

Can we use data from other surveys of UK homes to understand this deviation? Other UK surveys Surveys for mapping purposes Surveys to identify high radon homes Surveys paid by homeowners Surveys for local authorities Total: > 5, results BUT All are biased, cannot be used directly to estimate changes in UK radon exposure

Use of radon house data from biased surveys How can we use the data from the biased surveys, concentrated in high-radon areas, to supplement the data from the unbiased national survey? Overall, based on the national survey, we estimate that there are 1, ± 3, homes in the UK above 2 Bq m -3. So far, we have found ~5, homes above 2 Bq m -3. ASSUMPTION: Since we know that we have found 5% of the homes above 2 Bq m -3, we will assume that we have also found 5% of those above any higher radon threshold.

Using results from biased surveys to estimate national totals Threshold, Bq m -3 Number of homes exceeding threshold Estimated from national survey Found so far Estimated total (number found so far x2) 5 7791 9846 2 1, 75 2331 45 5,.7 38 75 1,.2 7 15

UK distribution, with extra data Normal probability plot, with higher estimates taken from other surveys measured Current UK Action Level predicted Here it is clear that the results at higher concentrations deviate strongly from those below 2 Bq m -3, and in fact produce a second straight line, implying that the higher radon results follow a different lognormal distribution, with different mean and geometric standard deviation from the national distribution.

Implications of deviation from single lognormal distribution There are many more homes with very high radon concentrations in the UK than expected from the national survey Calculations of numbers of homes above 1 Bq m -3 or 2 Bq m -3 are not significantly changed Calculations for mapping are not significantly affected, since they are based on the local distribution in high radon areas Use of maps in surveys to find high homes is not affected

Conclusions Analysis of results from the national survey in the UK has confirmed that the results follow a lognormal distribution over most of the range. It is necessary to allow for the contribution from outdoor radon, which is not part of the lognormal model. Monte Carlo simulations have shown that a lognormal distribution can be constructed from the sum of many lognormal distributions some with different weights. The observed distribution of national survey results can be obtained by combining several lognormal distributions, with measurement errors. The results above 4 Bq.m -3 deviate significantly from those below 2 Bq m -3 implying that at these concentrations the lognormal distribution has different mean and standard deviation. There are many more homes with very high radon concentrations in the UK than expected from the national survey, showing the importance of surveys to find and fix high homes