The Role of Cluster Randomization Trials in Health Research



Similar documents
Biostat Methods STAT 5820/6910 Handout #6: Intro. to Clinical Trials (Matthews text)

Missing data in randomized controlled trials (RCTs) can

1.0 Abstract. Title: Real Life Evaluation of Rheumatoid Arthritis in Canadians taking HUMIRA. Keywords. Rationale and Background:

Clinical Study Design and Methods Terminology

Study Designs. Simon Day, PhD Johns Hopkins University

Fairfield Public Schools

Paper PO06. Randomization in Clinical Trial Studies

Study Design and Statistical Analysis

Inclusion and Exclusion Criteria

Guide to Biostatistics

Guided Reading 9 th Edition. informed consent, protection from harm, deception, confidentiality, and anonymity.

GUIDELINES FOR REVIEWING QUANTITATIVE DESCRIPTIVE STUDIES

Current reporting in published research

Descriptive Methods Ch. 6 and 7

The Cross-Sectional Study:

Service delivery interventions

Competency 1 Describe the role of epidemiology in public health

Sample Size and Power in Clinical Trials

Critical appraisal. Gary Collins. EQUATOR Network, Centre for Statistics in Medicine NDORMS, University of Oxford

Statistical Rules of Thumb

NON-PROBABILITY SAMPLING TECHNIQUES

What are confidence intervals and p-values?

This series of articles is designed to

CLUSTER SAMPLE SIZE CALCULATOR USER MANUAL

Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

Mode and Patient-mix Adjustment of the CAHPS Hospital Survey (HCAHPS)

Introduction to study design

11. Analysis of Case-control Studies Logistic Regression

WHAT IS A JOURNAL CLUB?

PEER REVIEW HISTORY ARTICLE DETAILS TITLE (PROVISIONAL)

Power and sample size in multilevel modeling

2. Simple Linear Regression

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

II. DISTRIBUTIONS distribution normal distribution. standard scores

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

GUIDE TO RANDOMISED CONTROLLED TRIAL PROTOCOL CONTENT AND FORMAT (FOR NON CTIMP S)

This clinical study synopsis is provided in line with Boehringer Ingelheim s Policy on Transparency and Publication of Clinical Study Data.

Week 4: Standard Error and Confidence Intervals

Additional sources Compilation of sources:

The PCORI Methodology Report. Appendix A: Methodology Standards

Sampling. COUN 695 Experimental Design

Principles of Hypothesis Testing for Public Health

IPDET Module 6: Descriptive, Normative, and Impact Evaluation Designs

Using Repeated Measures Techniques To Analyze Cluster-correlated Survey Responses

Statistics Review PSY379

Introduction to Fixed Effects Methods

Chapter Eight: Quantitative Methods

How to choose an analysis to handle missing data in longitudinal observational studies

Research Methods & Experimental Design

TUTORIAL on ICH E9 and Other Statistical Regulatory Guidance. Session 1: ICH E9 and E10. PSI Conference, May 2011

Intervention and clinical epidemiological studies

Descriptive Statistics

STATISTICAL ANALYSIS OF SAFETY DATA IN LONG-TERM CLINICAL TRIALS

Introduction to Statistics and Quantitative Research Methods

"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1

Master of Public Health (MPH) SC 542

Assessing study quality: critical appraisal. MRC/CSO Social and Public Health Sciences Unit

INTERNATIONAL CONFERENCE ON HARMONISATION OF TECHNICAL REQUIREMENTS FOR REGISTRATION OF PHARMACEUTICALS FOR HUMAN USE

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

2015 Michigan Department of Health and Human Services Adult Medicaid Health Plan CAHPS Report

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

Chapter 1 Introduction. 1.1 Introduction

The correlation coefficient

With Big Data Comes Big Responsibility

Types of Studies. Systematic Reviews and Meta-Analyses

Sample Size Planning, Calculation, and Justification

Online Reading Support London School of Economics Sandra McNally

Critical Appraisal of Article on Therapy

Measure Information Form (MIF) #275, adapted for quality measurement in Medicare Accountable Care Organizations

Written Example for Research Question: How is caffeine consumption associated with memory?

What are observational studies and how do they differ from clinical trials?

Understanding Diseases and Treatments with Canadian Real-world Evidence

SAMPLING & INFERENTIAL STATISTICS. Sampling is necessary to make inferences about a population.

Section 13, Part 1 ANOVA. Analysis Of Variance

Chapter 1. Longitudinal Data Analysis. 1.1 Introduction

Organizing Your Approach to a Data Analysis

REGULATIONS FOR THE POSTGRADUATE DIPLOMA IN CLINICAL RESEARCH METHODOLOGY (PDipClinResMethodology)

Data Analysis, Research Study Design and the IRB

The SURVEYFREQ Procedure in SAS 9.2: Avoiding FREQuent Mistakes When Analyzing Survey Data ABSTRACT INTRODUCTION SURVEY DESIGN 101 WHY STRATIFY?

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Approaches for Analyzing Survey Data: a Discussion

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

The Ohio Department of Medicaid s Methods for High Risk Care Management Program Performance Measures. SFY 2015 Final

Chapter 4. Probability and Probability Distributions

Why Sample? Why not study everyone? Debate about Census vs. sampling

Basic research methods. Basic research methods. Question: BRM.2. Question: BRM.1

Chapter 3 Local Marketing in Practice

Biostatistics: Types of Data Analysis

Transcription:

The Role of Cluster Randomization Trials in Health Research Allan Donner, Ph.D.,FRSC Professor Department of Epidemiology and Biostatistics Director, Biometrics Robarts Clinical Trials The University of Western Ontario London, Canada

What Are Cluster Randomization Trials Cluster randomization trials are experiments in which intact social units or clusters of individuals rather than independent individuals are randomly allocated to intervention groups.

Examples: Medical practices selected as the randomization unit in trials evaluating the efficacy of disease screening programs Communities selected as the randomization unit in trials evaluating the effectiveness of new vaccines in developing countries Hospitals selected as the randomization unit in trials evaluating educational guidelines directed at physicians and/or administrators

Reasons for Adopting Cluster Randomization Administrative convenience To obtain cooperation of investigators Ethical considerations To enhance subject compliance To avoid treatment group contamination Intervention naturally applied at the cluster level

Unit of Randomization vs. Unit of Analysis A key property of cluster randomization trials is that inferences are frequently intended to apply at the individual level while randomization is at the cluster or group level. Thus the unit of randomization may be different from the unit of analysis. In this case, the lack of independence among individuals in the same cluster, i.e., intracluster correlation, creates special methodologic challenges in both design and analysis.

Implications of Intracluster Correlation Reduction in effective sample size (ESS): Extent depends on size of the intracluster correlation coefficient and on average cluster size.

Application of standard sample size approaches leads to an underpowered study (Type II error) Application of standard statistical methods generally tends to bias p-values downwards, i.e., could lead to spurious statistical significance (Type I error)

Examples: COMPLETELY RANDOMIZED DESIGN Study Purpose: To evaluate the effectiveness of vitamin A supplements on childhood mortality. 450 villages in Indonesia were randomly assigned to ether participate in a vitamin A supplementation scheme, or serve as a control. One year mortality rates were compared in the two groups. Sommer et al. (1986)

Study Purpose: Matched Pair Design The COMMIT community intervention trial(1995) was designed to promote smoking cessation using a variety of community resources. The primary outcome measure was the 5-year smoking cessation rate. Unit of Randomization: Community Number of Matched Pairs: 11 Matching factors: size, pop. density

Possible Sources of Intracluster Correlation 1. Subjects frequently select the clusters to which they belong e.g., patient characteristics could be related to age or sex differences among physicians 2. Important covariates at the cluster level affect all individuals within the cluster in the same manner e.g., differences in temperature between nurseries may be related to infection rates

3. Individuals within clusters frequently interact and, as a result, may respond similarly e.g., education strategies or therapies provided in a group setting 4. Tendency of infectious diseases to spread more rapidly within than among families or hospital wards.

Quantifying the Effect of Clustering Consider a trial in which k clusters of size m are randomly assigned to each of an experimental and control group, denoted by i 1 and 2 respectively. Let i denote the sample mean of the response variable in group i, i 1, 2. Then assuming common variance 2 is normally distributed with where Var 2 1 m 1, i 1, 2 i km is the coefficient of intracluster correlation

The term F 1 m 1 is the variance inflation factor or design effect associated with cluster randomization. Equivalently, the effective sample size in each group is given by SS km F 1 km m 1 At At 0, 1, SS km SS k

Challenges in Design And Analysis: Choice of Unit of Inference The choice of unit of inference has profound Implications for both the design and analysis of a cluster randomized trial. Key question: Are the primary inferences of interest targeted at i) the individual level or ii) at the cluster level?

Example 1: The intervention consists of a blood pressure screening program intended to lower the risk of cardiovascular mortality Bass et al (1986) Example 2: The intervention consists of educational guidelines for the management of hyperlipidaemia intended to increase the proportion of eligible patients prescribed lipid-lowering drugs Diwan et al (1995)

Consider a trial randomizing hospitals designed to assess the effect of obtaining a second clinical opinion on the decision to proceed with a caesarian section operation (Altaye et al 2008). The target of the intervention was the hospital rate of caesarian section. In this case, the hospital was the natural unit of inference and standard methods of sample size estimation and analysis applied at the cluster level.

Other advantages of cluster-level analyses: Simplified data collection and lower cost Exact statistical inferences can always be constructed Can be adapted to adjust for baseline imbalances

A Common Misconception Investigators have occasionally claimed that cluster level analyses will only provide valid statistical inferences when the intracluster correlation coefficient However this is too stringent a claim. For balanced designs, cluster level analyses provide inferences which are equivalent to those obtained using a mixed linear regression model for any value of

Avoiding Recruitment Bias Consider a trial randomizing general practices in which physicians are asked to identify as well as to treat selected patients. Can cluster randomization introduce bias through the way patients are differentially recruited across treatment groups?

If the physician s practice has already been randomized, recruitment for patient participation cannot be done blindly with respect to intervention group. If physicians in the experimental group are more diligent in seeking out patients than in the control group, or tend to identify patients who are less ill, bias may result.

Unbiased estimates of the effect of intervention can be assured only if analyses are based on data from all cluster members or, alternatively, from a random sub-sample of cluster members. Possible Solutions: Identify eligible patients in each community prior to randomization. If eligible subjects are identified after randomization, recruitment should be done by an individual independent of the trial. Torgerson (2001) Farrin et al (2005)

Assuring the Power of a CRT Generally the size of effects has been meager in relation to the effort expended We often do not have the resources to detect medium effects Susser (1995)

Factors Influencing Loss of Precision Cluster Randomization Trials 1. Interventions often applied on a group basis with little or no attention given to individual study participants. 2. Some studies permit the immigration of new subjects after baseline. 3. Entire clusters, rather than just individuals, may be lost to follow-up.

4. Presence of experimental contamination. 5. Over-optimistic expectations regarding effect size. 6. Difficulties associated with prevention trials (low event rates, compliance problems, incomplete exposure to intervention).

Strategies for Increasing Precision Restrict inclusion criteria e.g., recruit practices of same size, with physicians having similar years of experience in a controlled geographic area. Conduct a baseline survey of the prevalence of the trial outcome - serves as powerful risk factor for outcome - provides estimate of -helps identify potential stratification factors - allows trial personnel to gain experience

Greater power is obtained by increasing the number of clusters vs. increasing cluster size Recognize likely size of as based on published values in particular fields Take into account variability in cluster size at the design stage (Eldridge et al [2006]) Consider sample size re-estimation as part of an interim analysis. Lake et al (2001)

Impact on Power of Increasing the Number of Clusters vs. Increasing Cluster Size: Let As As

Dubious Approaches Testing in attempt to rule out the existence of clustering. Assuming the value of is known from external sources in order to increase the available number of degrees of freedom.

Guidelines for Obtaining Informed Consent Two distinct levels of informed consent must be distinguished in cluster randomization trials: (i) informed consent for randomization (usually provided by a decision-maker ) (ii) informed consent for participants given that randomization has occurred.

By analogy to current ethical requirements for clinical trials, it would be unethical not to obtain informed consent from every cluster member prior to random assignment. Is such a strict analogy required for cluster randomization trials?

Ethical advice indicated that, since we were only providing information to clinicians, there was no reason to seek patient consent Wyatt et al (1998)

To what extent, if any, should distinctions in the unit of randomization inform ethical considerations for the design of a randomized trial? How should the risk/benefit ratio experienced by individual study subjects factor into such a discussion? How should decision makers who agree to random assignment of clusters be identified? Weijer et al(2011)

Is Pair-Matching Worthwhile? 1. If an effective matching factor can be found, the matched design gains power relative to the completely randomized design. 2. Failure to match may lead to a poor randomization, particularly in smaller samples, thus undermining the trial s credibility. In general, matching tends to be a more important design issue in trials randomizing intact clusters than in trials randomizing individuals, since the number of clusters is often relatively small.

Evaluation of the Effectiveness of Matching There has been considerable research comparing the power of completely randomized and matched pair designs, (e.g., Martin et al. [1993], Klar and Donner [1997]). Some findings from this research are as follows: If the total study size is 20 or fewer clusters (10 treated and 10 control) then matching should be used only if the investigators are confident that the correlation due to matching is at least 0.20. It is unlikely that effective matching would be possible for small studies. Matching may be overused as a design tool.

COMMIT TRIAL - estimated matching correlation Proportion of Heavy Smokers Who Quit Community Pair Experimental Control 1 2 3 4 5 6 7 8 9 10 11 0.139 0.163 0.164 0.204 0.183 0.164 0.262 0.193 0.215 0.136 0.155 0.205 0.202 0.163 0.249 0.16 0.186 0.23 0.169 0.127 0.172 0.189

Source Unit of Randomization Number of Pairs Outcome Variable Matching Correlation Stanton & Clemens (1987) Cluster of Families 25 Childhood Diarrhea Rate 0.49 Kidane & Morrow (2002) Cluster of villages 12 Childhood Mortality Thompson et al. (1997) Physician Practice 13 Levels of Coronary Risk Factors Ray et al. (1997) Nursing Home 7 Rate of Recurrent Falling Peterson et al (2002) School district 20 Prevalence of Smoking Haggerty et al. (1994) Community 9 Childhood Diarrhea Rate -0.39 0.13 0.63 0.34-0.32 Grosskurth et al. (1995) Community 6 HIV Rate 0.94 The COMMIT Research Group (1995) Community 11 Smoking Quit Rate 0.21

The overall conclusion from this research is that matching should be used cautiously in cluster randomization trials, particularly if the total number of matched pairs is small. Other disadvantages arise because the effect of intervention is totally confounded with the natural variation between the clusters within each matched pair. Thus, the intracluster correlation cannot be directly estimated.

Some consequences of the inability to estimate from the matched pairs design are as follows: Information on necessary for the planning of future trials in the same application area will not be routinely available. Regression analyses whose aim is to explore the effect of individual-level covariates on outcome cannot be conducted using data from the matched pairs design. A test of heterogeneity in the effect of intervention across the matched pairs cannot be constructed.

These consequences do not occur for data arising from either the completely randomized design or the stratified design. This is because these designs allow some replication of clusters within a given combination of intervention and stratum. For the matched pair design, there is no such replication. As a result, the inherent variation in response between clusters in a matched pair is totally confounded with the effect of intervention. Thus it is impossible to obtain a valid estimate of for this design except under the null hypothesis of no intervention effect or without making special assumptions.

Features of the Stratified Design Provides some control in the design on factors known to be related to outcome. Imbalances on remaining variables can be adjusted for. Permits an estimate of to be computed without the need to make special assumptions, thus facilitating secondary regression analyses using individual-level covariates.

Decreasing the number of strata while increasing the number of clusters per stratum tends to reduce the closeness of the matching but increases the degrees of freedom available for the estimation of error. When the number of pairs is fairly large (>20), it may be difficult to create matches that represent distinct rather than similar levels of risk across all strata. In this case, stratification with two or more clusters assigned to each treatment within strata may be preferable.

Conclusions 1. It is known that in small samples it is unlikely that effective matching is possible. This is largely because the required size of the correlation due to matching is difficult to achieve in practice. 2. In large samples, the pair-matched design may still have some practical and theoretical drawbacks as compared to the stratified design. This will be particularly true if prior information on potentially important matching variables is limited or it is difficult to obtain close matches on these variables.

The Benefits of Balance An important special case arises in trials having a quantitative outcome variable when each cluster has a fixed number of subjects (i.e., balanced designs). Unweighted cluster level analyses are then fully informative. Consider a completely randomized trial with clusters in each of two intervention groups with subjects per cluster.

A test for the effect of intervention can be constructed using either (i) A two sample t-test with 2(k-1) degrees of freedom calculated using the mean as a summary score for each cluster. (ii) A mixed effects linear regression model. The resulting test statistics are identical (e.g., Koepsell et al. [1991]).

Conducting an Individual Level Analysis Using Mixed Effects Regression Let the study outcome be denoted by the subject, in the cluster, in the intervention group, for Consider the mixed effects linear regression model given by where

The random effects and are assumed to be independent. Inferences may be obtained using procedure PROC MIXED in SAS.

Mixed Effects Linear Regression and Multilevel Modelling The mixed effects linear regression model can also be expressed as a multilevel model: (Level 1) (Level 2) and

This relationship also holds for unbalanced designs where cluster sizes are variable. However, for balanced designs a cluster-level analysis calculated using a two sample t-test will provide identical inferences. Other terms used for multilevel models are random-coefficient models and hierarchial models

A Common Misconception Investigators have occasionally claimed that cluster level analyses will only provide valid statistical inferences when the intracluster correlation coefficient However this is too stringent a claim. For balanced designs, cluster level analyses provide inferences which are equivalent to those obtained using a mixed linear regression model for any value of

Generalized Estimating Equations Approach The assumptions of the mixed effects linear regression model may be avoided by using the robust variance estimation procedure popularized by Liang and Zeger (1986). For a balanced design a robust test of the effect of intervention differs from the cluster level, two sample t-test statistic by a factor of

This shows that cluster-level analyses of the effect of intervention are robust to model misspecification for balanced designs. Similar simplification is obtained for two group comparisons of (i) overdispersed count data (Diggle et al. [1994]), and (ii) correlated binary outcome data (Boos [1992]).

Selection of Weights When appropriately weighted, cluster-level analyses provide inferences which are equivalent to individual-level analyses. Calculation of the optimal weights requires estimation of When there are few cluster per intervention group estimates of will be imprecise Consequently the use of weights may actually result in a loss of power and could even compromise validity of statistical inferences (e.g., Gumpertz and Pantula, 1989).

Sample Size Requirements Exact statistical inferences are not routinely available for mixed effects linear regression models (Littell et al. [1996], Schluchter and Elashoff [1990]). Rather statistical inferences must be constructed based on large sample theory (i.e., large numbers of clusters). There have yet to be widely accepted guidelines for the numbers of cluster required to ensure validity of statistical inferences.

Results obtained by model fitting using data from studies enrolling fewer than 20 clusters per intervention group should be interpreted with caution (Duncan et al., 1998). Sensitivity analyses based on comparisons of results obtained from cluster-level and individuallevel analyses may prove helpful.

Multiple Levels of Inference and Cluster Randomization Trials Decisions regarding the appropriate unit of inference and the appropriate unit of analysis can be difficult in some trials. For example, where schools are the unit of randomization, analyses could be carried out at the individual, classroom, or school level.

However, analyses conducted at the individual, or classroom level need to account for dependencies of responses for children from the same school. Decisions regarding the appropriate unit of analysis must be distinguished from the need to account for the unit of randomization.

Multilevel Models When applicable, multilevel models offer several advantages over cluster-level analyses. More naturally allows for modelling of random effects to obtain estimates of intracluster correlation used to design future studies. Allows adjustment of the effect of intervention for imbalance on individual-level and cluster-level baseline predictors of outcome. Simplifies examination of factors influencing the effect of intervention (i.e., interaction effects).

Example: Consider CATCH, a school based trial for cardiovascular health (Luepker et al. [1996]). One of the trial outcomes was body mass index (weight/height 2 ). The effect of intervention on body mass index might vary depending on both a child s obesity at the start of the trial relative to other students from the same school and the average degree of obesity at the school level. While these analyses enrich our understanding of the trial results they are almost always exploratory in nature.

CONCLUSIONS: Advantages of Cluster Level Analysis Easy to conduct and explain. Can be applied to any outcome variable. Exact statistical inferences can always be constructed. Can be adapted to adjust for baseline imbalances or variability in cluster size.

Advantages of Individual Level Analyses Simplify to standard statistical procedures in the absence of clustering. Allow more direct examination of the joint effects of cluster-level and individual-level predictors. Can be extended to analyses of multi-level data. More naturally result in estimation of intracluster correlation coefficients helpful in designing future studies.

Cluster Level Analyses and the Ecological Fallacy The ecological fallacy refers to the bias which may occur when an association observed between variables at the cluster level differs from the association at the individual level. Concern for the ecological fallacy has been raised as an argument against using cluster-level analyses (e.g., Kreft, 1998). The primary analysis of the effect of intervention in most randomized trials includes all eligible subjects regardless of their compliance with the protocol (i.e., analysis by intention to treat).

The ecological fallacy cannot arise in this case since the treatment assignment is shared by all cluster members. As with any intent to treat analysis, however, inferences are limited to examining the effect of instituting a treatment policy.

Cluster Level Analyses and the Ecological Fallacy The ecological fallacy refers to the bias which may occur when an association observed between variables at the cluster level differs from the association at the individual level. Concern for the ecological fallacy has been raised as an argument against using cluster-level analyses (e.g., Kreft, 1998). The primary analysis of the effect of intervention in most randomized trials includes all eligible subjects regardless of their compliance with the protocol (i.e., analysis by intention to treat).

The ecological fallacy cannot arise in this case since the treatment assignment is shared by all cluster members. As with any intent to treat analysis, however, inferences are limited to examining the effect of instituting a treatment policy.

Reporting of Cluster Randomization Trials Reporting of Study Design State clearly the justification for employing cluster randomization. Provide clear rationale for the selection of any matching or stratification factors. Clarify the choice of inferential unit. Describe in detail the content of the interventions.

Describe the clusters that meet the eligibility criteria for the trial, but declined to participate. State the methods used to obtain informed consent. Describe clearly how the trial sample size was determined. Describe the method of randomization in context of the selected design. Where relevant, discuss the steps taken to minimize the risk of contamination and loss of follow-up.

Reporting of Study Results Include a table showing baseline characteristics by intervention group, separately for individuallevel and cluster-level characteristics. Avoid the use of significance-tests in comparing baseline characteristics. Report the observed distribution of cluster sizes (median,range) Report on loss to follow-up at both the individual level and the cluster level

Use caution in reporting standard deviations of individual baseline characteristics. Provide a clear and explicit description of the method used to account for between-cluster variation and for the influence of baseline prognostic factors. Report empirical estimates of Note: Eldridge et al(2004) review the reporting quality of primary care trials published 1997-2000.

EXTENSION OF CONSORT STATEMENT TO CLUSTER RANDOMIZATION TRIALS (CAMPBELL ET AL, BMJ, 2004) - Includes a checklist of items that should be included in a trial report - Extends CONSORT statement for reporting of individually randomized trials (Begg et al, JAMA, 1996) - Key Points Rationale for adopting cluster design Incorporation of clustering into sample size estimation and analysis Chart showing flow of clusters through the trial, from assignment to analysis

Reporting of Cluster Randomization Trials Reporting of Study Design State clearly the justification for employing cluster randomization. Provide clear rationale for the selection of any matching or stratification factors. Clarify the choice of inferential unit. Describe in detail the content of the interventions.

Describe the clusters that meet the eligibility criteria for the trial, but declined to participate. State the methods used to obtain informed consent. Describe clearly how the trial sample size was determined. Describe the method of randomization in context of the selected design. Where relevant, discuss the steps taken to minimize the risk of contamination and loss of follow-up.

Reporting of Study Results Include a table showing baseline characteristics by intervention group, separately for individuallevel and cluster-level characteristics. Avoid the use of significance-tests in comparing baseline characteristics. Report the observed distribution of cluster sizes (median,range) Report on loss to follow-up at both the individual level and the cluster level

Use caution in reporting standard deviations of individual baseline characteristics. Provide a clear and explicit description of the method used to account for between-cluster variation and for the influence of baseline prognostic factors. Report empirical estimates of Note: Eldridge et al(2004) review the reporting quality of primary care trials published 1997-2000.

EXTENSION OF CONSORT STATEMENT TO CLUSTER RANDOMIZATION TRIALS (CAMPBELL ET AL, BMJ, 2004) - Includes a checklist of items that should be included in a trial report - Extends CONSORT statement for reporting of individually randomized trials (Begg et al, JAMA, 1996) - Key Points Rationale for adopting cluster design Incorporation of clustering into sample size estimation and analysis Chart showing flow of clusters through the trial, from assignment to analysis

REFERENCES Bass, M.J., McWhinney, I.R. and Donner A. (1986), Do family physicians need medical assistants to detect and manage hypertension? Canadian Medical Association Journal, 134, 1247-1255. Begg, C., Cho, M., Eastwood, S. et al. (1996), Improving the quality of reporting of randomized controlled trials, the CONSORT statement, Journal of the American Medical Association, 276, 637-639. Bell, R.M. and McCaffrey, D.F. (2002), Bias reduction in standard errors for linear regression with multi-stage samples, Survey Methodology, 28, 169-181. Bland, J.M. (2004), Cluster randomised trials in the medical literature: two bibliometric surveys, BMC Medical Research Methodology, 4, 1-6.

REFERENCES Campbell, M.K., Elbourne, D.R. Altman, D.C. (2004), CONSORT statement: extension to cluster randomised trials, British Medical Journal 328: 702-708. Campbell, M.K., Mollison, J. and Grimshaw, J.M. (2001), Cluster trials in implementation research: estimation of intracluster correlation coefficients and sample size, Statistics in Medicine, 20, 391-399. Campbell,M.J.,Donner,A. and Klar,N.Developments in cluster randomization trials and Statistics in Medicine.26:2-19,2007. COMMIT Research Group. (1995), Community Intervention Trial for Smoking Cessation (COMMIT): I. Cohort results from a four-year community intervention, American Journal of Public Health, 85,183-192. COMMIT Research Group.(1995), Community Intervention Trial for Smoking Cessation (COMMIT): II. Changes in adult cigarette smoking prevalence, American Journal of Public Health, 85, 193-200.

REFERENCES Cornfield,J.Randomiation by group:a formal analysis.american Journal of Epidemiology,108,100-102. Diwan, V.K., Wahlstrom, R., Tomson, G., Beerman, B., Sterky, G. and Eriksson, B. (1995), Effects of group detailing on the prescribing of lipid-lowering drugs: a randomized controlled trial in Swedish primary care, Journal of Clinical Epidemiology, 48, 705-711. Donner, A. and Klar, N. (2004), Pitfalls of and controversies in cluster randomized trials, American Journal of Public Health, 94(3), 416-422. Donner, A. and Klar, N. (2000), Design and Analysis of Cluster Randomization Trials in Health Research, Arnold Publishers Limited, London, U.K. Donner,A. and Klar, N. (1994) Methods of comparing event rates in intervention studies where the unit of allocation is a cluster, American Journal of Epidemiology,140,279-289.

REFERENCES Donner, A., Piaggio, G. and Villar, J. (2001), Statistical methods for the metaanalysis of cluster randomization trials, Statistical Methods in Medical Research, 10, 325-338. Edwards, S.J.L., Braunholtz, D.A., Lilford, R.J. and Stevens, A.J. (1999), Ethical issues in the design and conduct of cluster randomised controlled trials, BMJ, 318, 1407-1409. Eldridge, S.M, Ashby,D.Feder,G.S. Rudnicka,A.R.and Ukomunne,,O.C. (2004)Lessons for cluster randomized trials in the twenty-first century: a systematice review of trials in primary care Clinical Trials 1:80-90 Farrin, A., Russel, I., Torgerson, D. and Underwood, M. (2005), Differential recruitment in a cluster randomized trial in primary care: the experience of the UK Back pain, Exercise, Active management and Manipulation (UK BEAM) feasibility study, Clinical Trials, 2(2), 119-124. Farr, B.M., Hendley, J.O., Kaiser, D.L. and Gwaltney, J.M. (1988), Two randomized controlled trials of virucidal nasal tissues in the prevention of natural upper respiratory infection, American Journal of Epidemiology, 128, 1162-1172.