The crude and adjusted rates in epidemiology: Standardization and Adjustment

Dr. Khaled Kasim, Ph.D. Epidemiology
Assistant Professor, Al-Azhar Faculty of Medicine, Cairo, Egypt

Abstract

Statistical adjustment in epidemiology is used to eliminate or reduce the confounding effect of an extraneous factor, such as age, when comparing disease or death rates in different populations. The direct method of age adjustment applies the age-specific rates of the study populations to the age distribution of a reference population. When age-specific rates are unavailable for the study populations, the indirect method is used instead: age-specific rates from an external reference population are applied to the study populations to derive the expected number of cases or deaths. The expected count is used to calculate a standardized mortality ratio (SMR), which is then used to adjust the rate in the study population. This review also presents a strategy for assessing confounding and interaction, which is essential to obtain the true measure of association between a disease and an exposure. As a rule, interaction must be ruled out before methods of confounder adjustment are used. If adjustment alters the interpretation of the crude risk, the adjusted estimate must be used; if it does not, the crude estimate can be used.

Introduction

In epidemiology, most rates, such as incidence, prevalence, and mortality rates, are strongly age-dependent and are greatly influenced by differences in age structure. Comparing such crude rates over time or between populations may be very misleading if the underlying age composition differs in the populations being compared. To compensate for this difference in age, we have two general options. First, we can restrict the comparison to subgroups of similar age (i.e., age-specific comparison). However, this can be confusing, especially when there are many age strata and comparisons are made between many different populations. Second, we can combine the age-specific rates into a single age-independent index (a single adjusted rate), which may be more appropriate and compensates for age differences between populations.

Age adjustment (standardization) can be achieved in several ways. In practice, the two most common approaches are the direct and indirect weighting of stratum-specific rates (i.e., direct and indirect standardization). This review briefly describes these two methods of standardization, with simplified hypothetical examples, and notes their strengths and limitations. At the end of this paper, we also give some hints about the adjustment of measures of association, with a simple clarifying example.

Description of the Standardization methods:

1. Direct method:

In this method, we must:
i. Know the age-specific death rates in the two populations under comparison. If they are not known, we must use the indirect method.
ii. Borrow a reference population. This reference population may be a hypothetical population, either one of the two populations under comparison, or the total of the compared populations.

Example: Suppose we have two populations A and B whose crude death rates are more or less similar: 18/1000 for population A and 17/1000 for population B (see Table 1).

Table 1: Vital data of the two hypothetical populations A and B.

| Age group category | A: No. of population | A: No. of deaths | A: Age-specific rate | B: No. of population | B: No. of deaths | B: Age-specific rate |
|---|---|---|---|---|---|---|
| 1 | 400 | 2 | 5/1000 | 600 | 6 | 10/1000 |
| 2 | 600 | 3 | 5/1000 | 1000 | 10 | 10/1000 |
| 3 | 1000 | 31 | 31/1000 | 400 | 18 | 45/1000 |
| Total | 2000 | 36 | 18/1000 | 2000 | 34 | 17/1000 |

From this table, the crude death rates are 18/1000 and 17/1000 for populations A and B respectively. However, on looking carefully at the table, we observe that the number of people in each age group category is not the same in the two populations, and this might confound the calculated crude death rates. To overcome this problem, we must adjust (standardize) the death rate by age, which is the confounding factor in this example. Because the age-specific death rates are known for both populations, we use the direct method of standardization. First, we take population A as the reference population. We then apply the number of people in each age group category of population A to the age-specific death rate of the corresponding age group category in population B. The adjusted death rate in population B is calculated using the following formula, where $r_{B,i}$ is the age-specific death rate in age category $i$ of population B, $n_{A,i}$ is the number of people in that age category in population A, and $\sum$ denotes summation:

$$\text{adjusted rate in B} = \frac{\sum_i r_{B,i}\, n_{A,i}}{\sum_i n_{A,i}}$$

$$= \frac{\frac{10}{1000} \times 400 + \frac{10}{1000} \times 600 + \frac{45}{1000} \times 1000}{2000} = \frac{55}{2000} = 27.5/1000$$
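The direct adjustment above can be sketched in a few lines of Python. This is a minimal illustration, assuming the age-specific rates and population counts as read from Table 1; the function name `direct_adjusted_rate` and the variable names are hypothetical and not part of the original paper.

```python
# Age-specific death rates (per 1000) and population sizes by age group,
# as read from Table 1 (hypothetical populations A and B).
rates_a = [5, 5, 31]
rates_b = [10, 10, 45]
pop_a = [400, 600, 1000]
pop_b = [600, 1000, 400]


def direct_adjusted_rate(rates_per_1000, standard_pop):
    """Weighted average of age-specific rates (per 1000),
    using the standard population's age distribution as weights."""
    weighted_sum = sum(r * n for r, n in zip(rates_per_1000, standard_pop))
    return weighted_sum / sum(standard_pop)


# Population B's rates applied to population A's age structure.
adjusted_b_on_a = direct_adjusted_rate(rates_b, pop_a)

# The reverse choice of standard: population A's rates on B's age structure.
adjusted_a_on_b = direct_adjusted_rate(rates_a, pop_b)

print(f"B adjusted to A's age structure: {adjusted_b_on_a:.1f} per 1000")
print(f"A adjusted to B's age structure: {adjusted_a_on_b:.1f} per 1000")
```

Applying a population's own rates to its own age structure simply returns its crude rate, which is why the adjusted rate of B can be compared directly with the crude rate of A when A is the standard.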

From this calculation, we conclude that the age-adjusted death rate in population B (27.5/1000) is higher than its crude rate (17/1000). It is also higher than the crude rate in population A (18/1000). Note that we can also use population B as the reference population and apply the number of people in each of its age group categories to the age-specific death rates of population A (the reader may try this as an exercise).

2. Indirect method

The indirect method of standardization is used when the age-specific rates are not available for one or both of the populations being compared, but the number of people in each age group is known. In this situation, we borrow the age-specific death rates from a reference (standard) population and apply them to the number of people in each corresponding age group of the compared populations to obtain the expected number of deaths. Finally, we divide the observed number of deaths by this expected number and multiply by 100 to obtain the standardized mortality ratio (SMR) as a percentage. The adjusted rate using this indirect method is obtained by multiplying the crude rate in the study population by the SMR (expressed as a ratio). The formulas summarising this method are:

$$\text{ar}_{\text{indirect}} = \text{cr} \times \text{SMR}, \qquad \text{SMR} = \frac{O}{E}, \qquad E = \sum_i R_i\, n_i$$

Where
ar: adjusted rate.
cr: crude rate.
SMR: standardized mortality ratio.
O: observed number of events in the study population.
E: expected number of events in the study population.
$\sum$: summation.
$R_i$: the rate in the ith stratum of the standard population.
$n_i$: the number of people in the ith stratum of the study population.

Example: We use the previous table (Table 1), but without the number of deaths in each age group category, so that we have no data on the age-specific death rates (see Table 2).

Table 2: Vital data of the two hypothetical populations A and B, but without age-specific rates.

| Age group category | A: No. of population | A: No. of deaths | B: No. of population | B: No. of deaths |
|---|---|---|---|---|
| 1 | 400 | not known | 600 | not known |
| 2 | 600 | not known | 1000 | not known |
| 3 | 1000 | not known | 400 | not known |
| Total | 2000 | 36 | 2000 | 34 |
| Crude death rate | | 18/1000 | | 17/1000 |

In this example, we must borrow a third standard (reference) population with known age-specific death rates to calculate the SMR (see Table 3).

Table 3: Age-specific death rates of a hypothetical reference population.

| Age group category | Age-specific death rate |
|---|---|
| 1 | 3/1000 |
| 2 | 18/1000 |
| 3 | 50/1000 |

Using the data from this hypothetical table, we can calculate the SMR in populations A and B.

The expected number of deaths in population A = 3/1000 X 400 + 18/1000 X 600 + 50/1000 X 1000 = 1.2 + 10.8 + 50 = 62.
The SMR in population A = 36 / 62 X 100 = 58%.
The expected number of deaths in population B = 3/1000 X 600 + 18/1000 X 1000 + 50/1000 X 400 = 1.8 + 18 + 20 = 39.8.
The SMR in population B = 34 / 39.8 X 100 = 85%.

Since the SMR in population B (85%) is higher than that in population A (58%), we can conclude that the risk of death is higher in population B. Similarly, when we multiply the crude rate of each population by its SMR, the adjusted rate of population B (17/1000 X 0.85, about 15/1000) is higher than that of population A (18/1000 X 0.58, about 10/1000).

Note that we can also take the age-specific death rates of one of the two compared populations (if they are known) and apply them to the other population to calculate the expected number of deaths and, accordingly, the SMR. We then read the SMR as follows:
- If the SMR is more than 100, more events (deaths) are occurring in the population than expected.
- If the SMR is less than 100, fewer events (deaths) are occurring in the population than expected.

When we apply the age-specific death rates of population A to population B, we find the SMR for population B to be about 170%. This means that more deaths are occurring in population B than would be expected, or that the death rate in population B is about 1.7-fold that of population A. The adjusted rate in population B, using the above-mentioned formula, is estimated to be about 29/1000.
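The indirect calculation can be sketched as follows. This is a minimal illustration that follows the formulas as given in the text (E as the sum of reference rates times study-population counts, SMR = O/E, and adjusted rate = crude rate X SMR); the function and variable names are hypothetical.

```python
# Reference (standard) age-specific death rates per 1000, as in Table 3.
reference_rates = [3, 18, 50]

# Study populations: people per age group and total observed deaths (Table 2).
pop_a, observed_a = [400, 600, 1000], 36
pop_b, observed_b = [600, 1000, 400], 34


def expected_deaths(rates_per_1000, study_pop):
    """E = sum over strata of (reference rate x study-population size)."""
    return sum(r * n for r, n in zip(rates_per_1000, study_pop)) / 1000


def indirect_adjustment(rates_per_1000, study_pop, observed):
    """Return (expected deaths, SMR as a ratio, adjusted rate per 1000),
    using the text's formula: adjusted rate = crude rate x SMR."""
    expected = expected_deaths(rates_per_1000, study_pop)
    smr = observed / expected
    crude_rate = 1000 * observed / sum(study_pop)
    return expected, smr, crude_rate * smr


expected_a, smr_a, adjusted_a = indirect_adjustment(reference_rates, pop_a, observed_a)
expected_b, smr_b, adjusted_b = indirect_adjustment(reference_rates, pop_b, observed_b)

for name, e, smr, adj in [("A", expected_a, smr_a, adjusted_a),
                          ("B", expected_b, smr_b, adjusted_b)]:
    print(f"Population {name}: expected={e:.1f}, SMR={100 * smr:.0f}%, "
          f"adjusted rate={adj:.1f} per 1000")
```

Working with rates per 1000 as integers and dividing once at the end keeps the expected counts free of avoidable floating-point noise.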
Although these methods of standardization have been in common practical use since the middle of the 19th century and help to give summary statements for unbiased comparisons, the literature notes a number of disadvantages, including the dependence on the choice of standard (reference) population. Direct standardization may yield unstable estimates, particularly when the component rates are based on small numbers of deaths. The indirect (SMR) method, however, produces greater numerical stability in such conditions. Furthermore, the calculation of these standardized measures rests on the assumption of constant rate ratios across strata. This assumption is not always satisfied, particularly in the presence of missing data.

In conclusion, despite these drawbacks, adjustment techniques are in common use to eliminate the confounding effect of an extraneous factor (such as age, sex, or race) when comparing epidemiologic or demographic rates over time or between populations. First, the data are stratified by the extraneous factor to derive stratum-specific rates. After stratification, the compared populations are adjusted with respect to the extraneous factor, so that the comparison can be made without confounding by this factor. However, when there is more than one confounding factor, control by stratification becomes tedious. In addition, the number of persons in each stratum becomes small and the standardized rates become subject to random variation. Multivariate techniques have been developed to control simultaneously for several confounding factors, and these techniques provide a useful adjunct to the simple stratification procedures of basic epidemiology.

Adjustment of measures of associations:

The measures of disease association in epidemiology include the prevalence ratio (the ratio of two prevalences), the cumulative incidence ratio, the incidence rate ratio (relative risk), and the disease odds ratio. The disease odds ratio provides an alternative to the prevalence ratio and the cumulative incidence ratio as a measure of association when the data represent proportions. The crude measure of association between exposure (E) and disease (D) may be confounded by one or more extraneous factors, called confounding factors (F).

Confounding in epidemiology means a distortion of the association between E and D brought about by an extraneous factor F (or extraneous factors F1, F2, F3, etc.). Together with selection bias and information bias, confounding bias is one of the three main types of systematic error (bias) that may damage the results of any epidemiologic research. For confounding bias to occur, the confounding factor must fulfil the following preconditions:
i. F and E are associated.
ii. F is an independent risk factor for D in both the exposed and the unexposed population.
iii. F is not an intermediate step in the causal chain (mechanism) between E and D.

To clarify confounding bias, the following example is of value. Suppose we found a positive association between lung cancer (D) and alcohol consumption (E) in some cancer epidemiology studies. This association might be confounded by cigarette smoking (F). F and E are associated (alcohol consumers are more likely to smoke than non-consumers), and F and D are associated (smoking is an independent risk factor for lung cancer). In addition, smoking is not an intermediate step in the causal chain between alcohol consumption and lung cancer. Therefore, if we have no data on the smoking status of the studied subjects, the observed association may not represent the true association between lung cancer and alcohol consumption because of confounding bias.

To judge whether factor F confounds the estimated measure of association, we use stratification. By stratification, we mean dividing the studied subjects into groups according to the confounding factor(s). Three scenarios may occur when we stratify:
i. The estimated measure of association is the same across the strata of F, and the same as the crude measure. In this scenario, neither confounding nor interaction is suspected, and we can directly present the crude measure of association (see Table of Scenario A).
ii. The estimated measure of association differs across the strata of F, and differs from the crude measure. In this scenario, an interaction between F and E (effect modification) is suspected, and an interaction test (a chi-square test for interaction, also called a test of heterogeneity) should be used to confirm it (see Table of Scenario B).
iii. The estimated measure of association is the same across the strata of F, but not the same as the crude measure. In this scenario, F is a confounding factor and must be controlled (adjusted for) during statistical analysis to obtain the adjusted measure of association (see Table of Scenario C).

It is pertinent to say here that interaction between the two studied factors (E and F) must be ruled out before methods of confounder adjustment are used. As a rule, if the adjusted risk alters the interpretation of the crude measure of association, adjustment is mandatory. If, on the other hand, adjustment does not alter the interpretation, the crude measure of association can be used.

Table of Scenario A:

| Subjects | Exposure | D+ | D- | Total | Relative risk (RR*) |
|---|---|---|---|---|---|
| All subjects | E+ | 400 | 1600 | 2000 | Crude RR = 4.0 |
| | E- | 100 | 1900 | 2000 | |
| Smokers (F+) | E+ | 320 | 480 | 800 | RR (F+) = 4.0 |
| | E- | 80 | 720 | 800 | |
| Non-smokers (F-) | E+ | 80 | 1120 | 1200 | RR (F-) = 4.0 |
| | E- | 20 | 1180 | 1200 | |

RR* is the incidence rate among the exposed (E+) divided by the incidence rate among the non-exposed (E-).

Table of Scenario B:

| Subjects | Exposure | D+ | D- | Total | Relative risk (RR*) |
|---|---|---|---|---|---|
| All subjects | E+ | 400 | 1600 | 2000 | Crude RR = 4.0 |
| | E- | 100 | 1900 | 2000 | |
| Smokers (F+) | E+ | 24 | 376 | 400 | RR (F+) = 1.9 |
| | E- | 50 | 1550 | 1600 | |
| Non-smokers (F-) | E+ | 376 | 1224 | 1600 | RR (F-) = 9.4 |
| | E- | 10 | 390 | 400 | |

RR* is the incidence rate among the exposed (E+) divided by the incidence rate among the non-exposed (E-).

Table of Scenario C:

| Subjects | Exposure | D+ | D- | Total | Relative risk (RR*) |
|---|---|---|---|---|---|
| All subjects | E+ | 400 | 1600 | 2000 | Crude RR = 4.0 |
| | E- | 100 | 1900 | 2000 | |
| Smokers (F+) | E+ | 388 | 1212 | 1600 | RR (F+) = 1.0 |
| | E- | 48 | 152 | 200 | |
| Non-smokers (F-) | E+ | 12 | 388 | 400 | RR (F-) = 1.0 |
| | E- | 52 | 1748 | 1800 | |

RR* is the incidence rate among the exposed (E+) divided by the incidence rate among the non-exposed (E-).

In Scenario C, the crude (unadjusted) measure of association, the relative risk (RR) of lung cancer associated with alcohol consumption, is 4.0. However, when we stratify the studied subjects by smoking status (i.e., F+ and F-), we find the risk to be the same among smokers (RR = 1.0) and non-smokers (RR = 1.0), but different from the crude risk. This means that smoking status is a confounding factor for the association between lung cancer and alcohol consumption, and that the crude estimate (4.0) does not represent the true association. Therefore, we must control (adjust) for this confounding effect and present the resulting adjusted risk as the true measure of association. We can obtain the adjusted risk either by regression models (multivariate regression analysis) or by a weighted average of the stratum-specific measures, with weights based on the intra-stratum variance estimates (Mantel-Haenszel method). A full explanation of these methods of adjustment is beyond the scope of this review.

In Scenario B, the crude RR of lung cancer associated with alcohol consumption is 4.0. By stratification, however, we find the risk among smokers to be 1.9, which differs from that among non-smokers (RR = 9.4) as well as from the crude risk. The observed heterogeneity of the estimated measures of association (crude and stratum-specific) suggests that smoking is not a confounding factor in this scenario, but that it modifies the effect of the exposure (alcohol consumption) on the disease (lung cancer). This biological phenomenon is called effect modification and is related to a statistical phenomenon called interaction. Interaction refers to a difference in the effect of one factor according to the level of another factor, and it carries direct biological and public health relevance. On an epidemiologic basis, effect modification is suspected when the observed joint effect of the two studied factors is greater or less than the joint effect predicted by the additive model, i.e. when

$$RR_{11} \neq RR_{10} + RR_{01} - 1$$

where $RR_{11}$ is the relative risk when both factors are present and $RR_{10}$ and $RR_{01}$ are the relative risks when only one of the two factors is present, each relative to subjects with neither factor. Such a departure from the additive model indicates the presence of biological interaction. Confirmation of interaction between study factors by a chi-square test of interaction, and estimation of the risk that represents this interaction, are beyond the scope of this review. Generally, regression analysis is of great value in this respect.
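The stratification check for Scenario C can be sketched in Python. This is a minimal illustration, assuming the cell counts as read from the Table of Scenario C; the function names are hypothetical. The pooled estimate uses the standard Mantel-Haenszel risk-ratio estimator for cumulative-incidence data, which is one common form of the weighted average mentioned in the text and is not necessarily the exact computation the author had in mind.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk among the exposed divided by risk among the unexposed."""
    return (cases_exp / total_exp) / (cases_unexp / total_unexp)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio.

    Each stratum is (cases_exp, total_exp, cases_unexp, total_unexp).
    RR_MH = sum(a_i * n0_i / N_i) / sum(c_i * n1_i / N_i),
    where N_i is the stratum total.
    """
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Scenario C counts, as read from the table (assumed values).
smokers = (388, 1600, 48, 200)
non_smokers = (12, 400, 52, 1800)

crude_rr = risk_ratio(400, 2000, 100, 2000)
rr_smokers = risk_ratio(*smokers)
rr_non_smokers = risk_ratio(*non_smokers)
adjusted_rr = mantel_haenszel_rr([smokers, non_smokers])

print(f"Crude RR:           {crude_rr:.2f}")
print(f"RR in smokers:      {rr_smokers:.2f}")
print(f"RR in non-smokers:  {rr_non_smokers:.2f}")
print(f"Mantel-Haenszel RR: {adjusted_rr:.2f}")
```

Stratum-specific estimates that agree with each other but differ from the crude estimate point to confounding, in which case the pooled (adjusted) estimate is the one to report.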

BIBLIOGRAPHY

Ahlbom A, Norell S: Measures of comparison of disease occurrence. In Introduction to modern epidemiology. 2nd edition. Epidemiology Resources Inc, 1990; p30-35.

Armitage P, Berry G: Statistical Methods in Medical Research, 3rd Ed. Blackwell, Oxford, 1994.

Breslow NE, Day NE: Statistical methods in cancer research. Volume II. The design and analysis of cohort studies. IARC, Lyon, 1987.

Gerstman BB: Stratification and adjustment. In Epidemiology kept simple: An introduction to modern epidemiology. John Wiley & Sons, Inc., New York, 1998; p108-120.

Greenland S, Robins JM: Confounding and misclassification. American Journal of Epidemiology 1985; 122:495-506.

Mantel N, Haenszel W: Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 1959; 22:719-748.

Miettinen OS: Confounding and effect modification. American Journal of Epidemiology 1974; 100:350-353.

Rothman K, Greenland S: Precision and validity in epidemiologic studies. In Epidemiology kept simple: An introduction to classic and modern epidemiology. John Wiley and Sons, Inc., 1998; p115-135.