Modeling Drivers of Cost and Benefit for Policy Development in Cancer

Harms? Benefits? Costs?
Ruth Etzioni, Fred Hutchinson Cancer Research Center, Seattle, Washington

The USPSTF recommends against routine screening mammography in women aged 40 to 49 years. The decision to start regular, biennial screening mammography before the age of 50 years should be an individual one and take into account patient context, including the patient's values regarding specific benefits and harms.

For biennial screening mammography in women aged 40 to 49 years, there is moderate certainty that the net benefit is small. The USPSTF emphasizes the adverse consequences for most women who will not develop breast cancer.

USPSTF Breast Cancer Screening Guidelines
Net benefit is small
  NNS to save one life: 1904 for women aged 40-49 years, based on updated review of the evidence and meta-analysis
  Mortality reduction: reduction in the likelihood of breast cancer death, based on 6 models
    Screening women aged 50-69: 17% (range 15-23%) reduction
    Screening women aged 40-69: 20% reduction
  Life years gained also considered, but reported as a secondary result
Adverse consequences
  False positive tests: can cause anxiety and lead to additional imaging studies; more common in younger women
  Overdiagnosis
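As a rough reading of the NNS figure: the number needed to screen is conventionally the reciprocal of the absolute risk reduction (ARR). The sketch below simply inverts the reported value of 1904. It assumes the reported NNS follows that standard definition, and the function name is illustrative, not taken from the USPSTF review.

```python
def absolute_risk_reduction(nns: float) -> float:
    """Invert NNS = 1 / ARR (assumes the reported NNS uses this standard definition)."""
    if nns <= 0:
        raise ValueError("NNS must be positive")
    return 1.0 / nns


arr = absolute_risk_reduction(1904)        # women aged 40-49, value quoted on the slide
deaths_averted_per_10000 = arr * 10_000    # same quantity on a per-10,000 scale
print(f"ARR = {arr:.6f}")
print(f"Deaths averted per 10,000 women screened: {deaths_averted_per_10000:.1f}")
```

On this reading, the benefit is on the order of five deaths averted per 10,000 women screened, which is the quantity being weighed against false positives and overdiagnosis.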

Results From 6 Breast Cancer Screening Models: Mortality Reduction Mandelblatt et al, 2009

Results From 6 Breast Cancer Screening Models: LY Saved

Current evidence is insufficient to assess the balance of benefits and harms of screening for prostate cancer in men younger than age 75 years (I statement). Do not screen for prostate cancer in men age 75 years or older. (Grade D recommendation).

Basis for USPSTF Decision to Stop PSA Screening at Age 75
The USPSTF was able to establish an upper bound for the potential magnitude of the benefit of treating screen-detected prostate cancer in this age group by extrapolating from evidence of treatment for clinically detected prostate cancer in this age group. For a population of men with an average life expectancy of 10 years or fewer, the USPSTF determined that the benefits of prostate cancer screening and treatment would range from small to none. Weighing this against the moderate-to-substantial psychological and physical harms associated with prostate cancer screening and treatment
Bill-Axelson A et al. JNCI J Natl Cancer Inst 2008;100:1144-1154

USPSTF Prostate Cancer Screening Guidelines
Moderate to substantial psychological and physical harms
  Associated with treatment: sexual dysfunction, urinary incontinence, bowel dysfunction
  Associated with screening: overdiagnosis, pain associated with prostate biopsy, anxiety over false positive results
Overdiagnosis: a major harm of PSA screening
  We estimated that 36% of men have onset in their lifetimes
  Before PSA, only 9% were diagnosed before other-cause death

Summary
Screening policy in the US appears to rest on key drivers of cost and benefit
Cost drivers are particularly important in the presence of modest benefit
  Mammography: false positives in younger women
  PSA screening: overdiagnosis in older men
Different policy panels may emphasize different drivers of cost and benefit
Two prostate cancer panels that I have worked with differ considerably in their key drivers
  American Cancer Society: overdiagnosis and overtreatment
  National Comprehensive Cancer Network: false negatives

Prostate Cancer Screening Panels
American Cancer Society
  Ad hoc group re-formed every 5 years to review the most current evidence and develop recommendations by a consensus process.
  Mostly MDs, including internists, oncologists, some urologists.
  Guidelines: 2008, 2010
National Comprehensive Cancer Network
  Standing committee of approximately 15 members that teleconferences annually to propose changes to an existing, elaborate guideline.
  Almost exclusively MD urologists and urologic oncologists/surgeons.
US Preventive Services Task Force
  Standing panel that reviews evidence and develops policies for a variety of preventive and early detection interventions for a wide range of conditions.
  Mostly primary care clinicians.
  Guidelines: 2002, 2009

Guidelines over the years
US Preventive Services Task Force (2002)
  There is insufficient evidence to recommend for or against screening.
US Preventive Services Task Force (2008)
  There is insufficient evidence to recommend for or against screening before age 75. Do not screen after age 75.
American Cancer Society (2008)
  Screening should be offered, following discussion of risks and limitations, to men age 50+ at average risk of PC and with at least 10-year LE.
American Cancer Society (2010)
  Informed decision making with provider after discussion of risks and limitations.
National Comprehensive Cancer Network (2009)
  Average-risk men should be screened annually by age 50, with consideration of biopsy for PSA > 2.5 ng/ml or high PSA velocity.

NCCN: 2009 Guideline

Issues With The NCCN Guideline
PSA threshold
  Lowering the PSA threshold from 4.0 ng/ml to 2.5 ng/ml will double the number of men referred to biopsy. However, 15% of men with PSA less than 4.0 ng/ml have disease. The ERSPC cutoff was 3.0 in most centers.
Starting age
  Lowering the age at which to start screening to 40 in average-risk men could greatly increase the number of tests, with low yield of cancers detected and uncertain incremental benefit. Median age at diagnosis before PSA was 76.
PSA velocity
  PSA velocity is not helpful in detecting cancers or in detecting clinically significant cancers when PSA is low.
Screening interval
  In the ERSPC, most centers had 4 years between screens; this study showed significant benefit (mortality RR for screened vs control groups = 0.8).

Bringing Modeling To The Table
Issue:
  NCCN guideline is aggressive and is likely to generate significant costs in terms of numbers of tests and overdiagnoses
  Panel members place an extremely high cost on delayed diagnosis, and this dominates their policy decisions
  The frequency of aggressive disease transitioning to an incurable state at low PSA is not known
  There is no quantitative sense of the tradeoff between the harms and benefits of more versus less aggressive strategies
    How many more lives will be saved by a strategy of 2.5 vs 4.0 ng/ml?
    How many more men will be overdiagnosed?
Idea:
  Provide panel members with the technology (MODEL) to evaluate a range of outcomes and focus on those that are most meaningful

Some Words About Disease Modeling
Step 1: Model concept
  A sequence of states in disease progression
  Sojourn time
Step 2: Model calibration
  Estimate the transition rates between the states
Step 3: Model deployment
  Project the outcomes of the interaction between the intervention and the model
healthy -> latent -> symptomatic (clinical) -> death
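A minimal sketch of the model concept, assuming exponential sojourn times in each state. The transition rates are hypothetical values chosen for illustration (in practice they would come from the calibration step), and death from other causes is left out for brevity.

```python
import random

# Hypothetical annual transition rates (assumptions for illustration only).
RATES = {
    "healthy": 0.02,      # healthy -> latent (disease onset)
    "latent": 0.10,       # latent -> symptomatic (clinical diagnosis)
    "symptomatic": 0.05,  # symptomatic -> death from disease
}
ORDER = ["healthy", "latent", "symptomatic", "death"]


def simulate_history(rng: random.Random) -> dict:
    """Return the time (years from start) at which each state is entered."""
    t = 0.0
    entry = {"healthy": 0.0}
    for state, nxt in zip(ORDER, ORDER[1:]):
        t += rng.expovariate(RATES[state])  # exponential sojourn time in `state`
        entry[nxt] = t
    return entry


rng = random.Random(2010)
histories = [simulate_history(rng) for _ in range(10_000)]

# Sojourn time in the latent state = time from onset to clinical diagnosis.
mean_sojourn = sum(h["symptomatic"] - h["latent"] for h in histories) / len(histories)
print(f"Mean latent sojourn time: {mean_sojourn:.1f} years (theoretical mean 1/0.10 = 10)")
```

Deployment then amounts to overlaying an intervention, such as a screening schedule, on these simulated histories and tallying the outcomes.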

Why Do We Need A Model?

Key Points
The model should not be too detailed
  Must be able to estimate transition rates from available data
  Typically: need data from a screened cohort
The model should be detailed enough
  Must be able to address your questions
Example: healthy -> latent -> symptomatic -> death
  What is the likely impact of screening with a specific PSA cutoff on disease-specific deaths?

Key Points (continued)
Example: healthy -> latent (screen detected!) -> symptomatic -> death
  Lead time: screening will advance the date of diagnosis
  How does this impact survival? Through a change in disease stage?
  If so, need to include stage progression in the model!
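The lead-time idea can be sketched as follows: a screen that falls between onset and clinical diagnosis advances the date of diagnosis, and the lead time is the gap between the two dates. This is a simplified illustration that assumes a perfectly sensitive test; the function names and the example ages are hypothetical.

```python
from typing import Optional, Sequence


def screen_detection_age(onset: float, clinical: float,
                         screen_ages: Sequence[float]) -> Optional[float]:
    """Age at the first screen that falls in the latent window [onset, clinical).

    Assumes a perfectly sensitive test, which real PSA screening is not.
    """
    for age in sorted(screen_ages):
        if onset <= age < clinical:
            return age
    return None


def lead_time(onset: float, clinical: float, screen_ages: Sequence[float]) -> float:
    """Years by which screening advances diagnosis (0.0 if no screen detects the case)."""
    detected = screen_detection_age(onset, clinical, screen_ages)
    return 0.0 if detected is None else clinical - detected


# Hypothetical man: onset at 58, clinical diagnosis at 67 without screening,
# screened every 4 years from age 50 (50, 54, 58, 62, ...).
print(lead_time(58, 67, screen_ages=range(50, 75, 4)))
```

Whether that earlier diagnosis changes survival depends on what happens to the disease during the lead time, which is why stage progression has to be part of the model.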

A Stage-Based Model
Healthy: normal
Latent and symptomatic disease, each divided into:
  Early stage, low grade
  Early stage, high grade
  Late stage, low grade
  Late stage, high grade
Parameters: stage transition rates, screen sensitivity in each stage
Draisma et al, Int J Cancer 2007

Criteria For The NCCN Model
We want to compare competing PSA-based criteria for biopsy in terms of their impact on early detection
  The model should link PSA growth with disease progression and/or disease-specific survival
Model should project a full range of outcomes that drive cost and benefit
Model should be accessible to panel members, with a user-friendly interface that will allow them to select strategies for comparison and view different outcomes
Model should reflect population biopsy and treatment practices given test results and cancer diagnosis

FHCRC Prostate Model
PSA growth: joint model of PSA growth and disease progression
  Changepoint model with individual-specific changepoints reflecting disease onset
  Risk of onset is proportional to age
  Risk of metastasis and clinical detection increase with PSA
  Grade of disease determined at onset; PSA growth is grade-dependent
Model parameters:
  Onset risk
  Risk of progression to metastasis
  Risk of progression to clinical detection
[Figure: PSA (0 to 20) versus age (55 to 75), with onset, metastasis, clinical detection, and death marked]
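A minimal sketch of a changepoint PSA trajectory, assuming log-PSA is piecewise linear in age with a slope change at the individual's onset age. The log-linear form and every numeric value here are assumptions for illustration, not the fitted FHCRC estimates; in a grade-dependent version, `slope_after` would differ by grade.

```python
import math


def log_psa(age: float, onset_age: float, intercept: float,
            slope_before: float, slope_after: float) -> float:
    """Piecewise-linear log-PSA with a slope change at the individual's onset age."""
    years_since_onset = max(0.0, age - onset_age)
    return intercept + slope_before * age + slope_after * years_since_onset


def psa(age: float, onset_age: float, intercept: float = -1.6,
        slope_before: float = 0.02, slope_after: float = 0.15) -> float:
    # Default parameter values are hypothetical, chosen only to give plausible curves.
    return math.exp(log_psa(age, onset_age, intercept, slope_before, slope_after))


def age_crossing(threshold: float, onset_age: float, ages=range(40, 91)):
    """First age (whole years) at which PSA exceeds the threshold, or None."""
    return next((a for a in ages if psa(a, onset_age) > threshold), None)


for cutoff in (2.5, 4.0):
    print(f"PSA cutoff {cutoff}: first exceeded at age {age_crossing(cutoff, onset_age=60)}")
```

With these made-up parameters a man with onset at 60 crosses 2.5 ng/ml a few years before he crosses 4.0 ng/ml, which is the mechanism by which a lower cutoff buys earlier detection.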

FHCRC Prostate Model Parameter Estimation
FHCRC disease model has two components:
  PSA growth: estimated using serial PSA data from the Prostate Cancer Prevention Trial
  Progression: estimated using data on prostate cancer incidence in the population
    Given PSA growth, what disease onset and progression rates yield age- and stage-specific incidence that best matches that observed in the US?
    Stages: local-regional, distant/advanced
Algorithm: simulated maximum likelihood
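The calibration idea can be sketched with a toy example: simulate incidence under each candidate parameter value, score the simulated counts against observed counts with a Poisson likelihood, and keep the best-scoring value. This is a one-parameter stand-in, not the FHCRC procedure: the cohort model, the grid, and the "observed" counts (themselves simulated here) are all assumptions.

```python
import math
import random


def simulate_incidence(onset_rate: float, n_men: int = 20_000, seed: int = 1) -> list:
    """Simulated cases per age band (50-59, 60-69, 70-79) in a cohort followed from age 50.

    Toy model: exponential time to diagnosis at a constant rate, no competing death.
    """
    rng = random.Random(seed)  # same seed for every candidate: common random numbers
    counts = [0, 0, 0]
    for _ in range(n_men):
        age = 50 + rng.expovariate(onset_rate)
        if age < 80:
            counts[int((age - 50) // 10)] += 1
    return counts


def poisson_loglik(observed: list, expected: list) -> float:
    """Poisson log-likelihood, dropping the term that does not depend on the parameter."""
    return sum(o * math.log(max(e, 0.5)) - max(e, 0.5) for o, e in zip(observed, expected))


# Stand-in for registry data: counts simulated at a known rate with a different seed.
observed = simulate_incidence(0.010, seed=99)

grid = [0.004, 0.006, 0.008, 0.010, 0.012, 0.014]
best = max(grid, key=lambda r: poisson_loglik(observed, simulate_incidence(r)))
print("Observed counts by age band:", observed)
print("Best-fitting onset rate on the grid:", best)
```

Reusing the same seed for every candidate keeps simulation noise from swamping the comparison between parameter values; a real calibration would search over several parameters and stage-specific counts.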

FHCRC Prostate Model Calibration
Fitted incidence trends corresponding to 20 seeds
[Figure panels: local-regional stage incidence; distant-stage incidence]
Inoue et al, JASA 2007; Gulati et al, Biostatistics 2010

FHCRC Prostate Model Overdiagnosis
The calibrated model produces a simulated population of disease histories that are consistent with observed data
Can use this population to empirically estimate quantities of interest that would not otherwise be observable
  Lifetime probabilities of disease onset and metastasis
  Lead time, overdiagnosis
FHCRC model after calibration:
  Lifetime probability of onset: 33%
  Prob(clinical onset): 38%
  Mean lead time: 6 years
  Overdiagnosed: 28% of screen-detected cases
Draisma et al, JNCI 2009
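A sketch of how such quantities can be read off a simulated population: generate life histories, overlay a screening schedule, and count the screen-detected men who would have died of other causes before clinical diagnosis. All distributions and rates below are hypothetical, the test is assumed perfectly sensitive, and lead time is averaged over non-overdiagnosed cases only (one of several possible conventions), so the printed numbers are not meant to reproduce the calibrated results.

```python
import random


def simulate_man(rng: random.Random):
    """One hypothetical life history; every rate here is an illustrative assumption."""
    onset = 40 + rng.expovariate(0.012)        # age at disease onset
    clinical = onset + rng.expovariate(0.08)   # age at clinical diagnosis absent screening
    other_death = min(100.0, 40 + rng.weibullvariate(42, 4.5))  # age at other-cause death
    return onset, clinical, other_death


def summarize(n: int = 50_000, screen_ages=range(50, 75, 2), seed: int = 7):
    """Return (number screen-detected, fraction overdiagnosed, mean lead time in years)."""
    rng = random.Random(seed)
    lead_times, overdiagnosed, detected = [], 0, 0
    for _ in range(n):
        onset, clinical, other_death = simulate_man(rng)
        # First screen in the latent window while the man is still alive.
        hit = next((a for a in screen_ages
                    if onset <= a < min(clinical, other_death)), None)
        if hit is None:
            continue
        detected += 1
        if clinical >= other_death:
            overdiagnosed += 1                 # would never have been diagnosed clinically
        else:
            lead_times.append(clinical - hit)
    return detected, overdiagnosed / detected, sum(lead_times) / len(lead_times)


detected, frac_overdx, mean_lead = summarize()
print(f"Screen-detected: {detected}")
print(f"Overdiagnosed: {frac_overdx:.0%} of screen-detected")
print(f"Mean lead time (non-overdiagnosed): {mean_lead:.1f} years")
```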

Aside: Cross-Country Comparisons, Europe vs. US
MISCAN model fit to data from the Netherlands, with 1991 incidence (absence of screening) and data from the ERSPC Rotterdam (presence of screening):
  Overdiagnosis frequency: 66%
  Lower rate of clinical progression
  Higher test sensitivity
Fit to US incidence data:
  Overdiagnosis frequency: 42%
  Higher rate of clinical progression
  Lower test sensitivity
Conclusion: drivers of cost may be very context-dependent
Draisma et al, JNCI 2009

FHCRC Prostate Model Survival Modeling
In the absence of screening: stage- and grade-based disease-specific survival using SEER data from men diagnosed before the PSA era
In the presence of screening: a fraction of men who would have died in the absence of screening are cured
  Probability of cure: selected so that we replicate the Goteborg trial result under biennial screening and good compliance (RR=0.56)
The lower the PSA cutoff:
  The greater the fraction cured
  The greater the fraction diagnosed
  The more men referred to biopsy

FHCRC Model Interface

FHCRC Model Deployment, July 2010
One week before the 2010 NCCN conference call:
  Sent a link to the calculator, with a brief description of goals and function, to all panel members
  Attached a more detailed document that ran the calculator for a range of policies and tabulated the results
During the call, reviewed goals of the calculator and noted that a request for (anonymous) feedback would be forthcoming
Within one week after the conference call:
  Sent a short questionnaire about perceptions of usefulness for this call and for the future
  Also solicited suggestions for improvements

Comparisons of Policies Provided to Panel
Outcomes projected:
  Tests performed
  False positives
  Overdiagnoses
  Prostate cancer deaths
  Years of life saved
  Mean lead time

Comparisons of Policies Cohort of 100,000 men aged 40 in the year 2000

Some more comparisons

Our Conclusions
More aggressive strategies produce substantial increases in overdiagnoses and false positives and modest increases in years of life saved
For almost every strategy that uses a threshold of 2.5 ng/ml, there is a dominating strategy that uses 4.0 ng/ml, whether in terms of false positives or overdiagnoses
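One way to make "dominating strategy" concrete: strategy A dominates strategy B on a given harm if A saves at least as many life-years with no more of that harm, and is strictly better on at least one of the two. The sketch below applies that definition to made-up numbers. The strategy labels and all values are hypothetical, not the calculator's projections.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    life_years_saved: float   # benefit: higher is better
    overdiagnoses: float      # harm: lower is better
    false_positives: float    # harm: lower is better


def dominates(a: Strategy, b: Strategy, harm: str) -> bool:
    """True if `a` is no worse than `b` on benefit and on `harm`, and strictly better on one."""
    ha, hb = getattr(a, harm), getattr(b, harm)
    no_worse = a.life_years_saved >= b.life_years_saved and ha <= hb
    strictly_better = a.life_years_saved > b.life_years_saved or ha < hb
    return no_worse and strictly_better


# Made-up numbers for illustration; NOT the calculator's projections.
strategies = [
    Strategy("annual, ages 50-74, PSA > 2.5", 100.0, 40.0, 900.0),
    Strategy("annual, ages 45-74, PSA > 4.0", 102.0, 35.0, 600.0),
    Strategy("biennial, ages 50-74, PSA > 4.0", 90.0, 28.0, 400.0),
]

for harm in ("overdiagnoses", "false_positives"):
    for s in strategies:
        beaten_by = [t.name for t in strategies if dominates(t, s, harm)]
        print(f"[{harm}] {s.name} is dominated by: {beaten_by}")
```

Strategies that are not dominated on any harm are the ones that involve a real tradeoff, and those are the ones a panel needs to weigh against its own priorities.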

FHCRC Model Deployment: Panelists' Responses
[Figure: panelists' responses to the questionnaire]

After Communicating This Information

Summary
Policy panels in the US tend to focus on drivers of cost
  Different cancers will have different cost drivers of primary importance
  Different panels will focus on different drivers of cost
Modeling is being brought into the policy development process
  Not all panelists appreciate the value of models or trust their results
    "I always have problems with invalid assumptions used in these models; they never comport with the realities of clinical practice, where clinical judgment affects treatment outcomes"
  Some are supportive
    "I think that [her] modeling is absolutely the way of the future and we have to get into this 100%. We have to be on board with this."
Information on the different drivers of cost should be made available to panels so that they can determine cost-benefit tradeoffs that are most relevant to them.
American urologists do not appear to prioritize harms or cost in their decision making.

Italian cardiologists appear to take cost-effectiveness information into account when deciding whether to use new treatments.

Acknowledgements
CISNET: Eric Feuer, Angela Mariotto
FHCRC: Roman Gulati, Lurdes Inoue, Jeff Katcher
Breast Cancer and USPSTF: Jeanne Mandelblatt
Model deployment and feedback: Cornerstone Systems NW, Lauren Clarke