Epidemics in heterogeneous communities: estimation of $R_0$ and secure vaccination coverage




J. R. Statist. Soc. B (2001) 63, Part 4, pp. 705–715

Tom Britton, Uppsala University, Sweden

[Received January 2000. Final revision May 2001]

Summary. A stochastic multitype model for the spread of an infectious disease in a community of heterogeneous individuals is analysed. In particular, estimates of $R_0$ (the basic reproduction number) and the critical vaccination coverage are derived, where estimation is based on final size data of an outbreak in the community. It is shown that these key parameters cannot be estimated consistently from data; only upper and lower bounds can be estimated. Confidence regions for the upper bounds are derived, thus giving conservative estimates of $R_0$ and the fractions necessary to vaccinate.

Keywords: Basic reproduction number; Consistency; Final size data; Multitype epidemic; Vaccination coverage; Vaccine efficacy

1. Introduction

The main practical motivation for the study of epidemic models lies in the insights that they provide about the control of infectious diseases. These insights attain practical relevance only when the model on which they are based captures the essential characteristics of disease transmission in a real community and the available data enable estimation of the model parameters. One feature that is known to play an important role in the propagation of infectious diseases is that of heterogeneities between individuals. For example, transmission rates for measles and rubella are found to depend substantially on the age of individuals (Grenfell and Anderson, 1985) and the rate of transmission for influenza type A is much higher within households than between (Addy et al., 1991), as would be expected for all transmittable diseases.

In the present paper we treat estimation procedures for the basic reproduction number $R_0$, where inference is based on final size data from one outbreak in the community. The estimates are derived from stochastic models, thus allowing confidence bounds.
The results are then interpreted in terms of vaccination policies: what are the necessary criteria for a vaccination policy to prevent future outbreaks, i.e. to be above the critical vaccination coverage? Such a community state is known as herd immunity since then everyone in the community is protected from future outbreaks, even those who are not vaccinated. The problems stated above are analysed for a so-called multitype epidemic model in which individuals are separated into different types with arbitrary transmission rates between each pair of types, i.e. with no restrictions on the 'who acquires infection from whom' matrix (Anderson and May, 1991). The types may for example reflect age groups, gender or the previous history of the disease and/or vaccination.

Address for correspondence: Tom Britton, Department of Mathematics, Uppsala University, PO Box 480, SE-751 06 Uppsala, Sweden. E-mail: tom.britton@math.uu.se
© 2001 Royal Statistical Society 1369–7412/01/63705

A special feature of the statistical inference is that the basic reproduction number $R_0$ and the critical vaccination coverage $v_c$ cannot, in general, be estimated consistently. Instead estimates of the lower and upper bound for $R_0$ are given, bounds which also induce lower and upper bounds on $v_c$, and so only vaccination strategies with higher coverage than the upper bound will surely prevent future epidemics. The reason for this ambiguity is that the model contains more parameters (transmission rates) than the dimension of the observed data vector, thus not enabling an estimation of all the parameters (Anderson and May, 1984).

Greenhalgh and Dietz (1994) treated similar estimation problems for a deterministic model of an open population, i.e. with births and deaths, in which heterogeneity is caused by age. Sufficient data for estimation come from a cross-sectional survey from a population at 'equilibrium'. They derived expressions for upper and lower bounds of $R_0$ similar to those of the present paper, both under general transmission rates as well as for several submodels. They also considered different vaccination strategies and their effect on the equilibrium, and in particular whether the disease will become extinct.

The present paper differs from Greenhalgh and Dietz (1994) in several ways. Its main merit is that the model is stochastic, thus giving confidence intervals for the estimates. Further we allow heterogeneities of other sorts than age, e.g. caused by previous history of vaccination or disease or gender. In Greenhalgh and Dietz (1994) such heterogeneities are not treated, with the effect that the problem of the optimal vaccination strategy is trivial: vaccinate only in the youngest age group(s).
A drawback with the present analysis, compared with Greenhalgh and Dietz (1994), is the assumption of a closed population, with the effect that individuals cannot change type over time as is natural with age cohorts observed over longer periods of time. Of course no population is really closed. However, when considering a short epidemic outbreak, perhaps lasting a few months, the community may be approximated as being closed. The methods of the present paper are not suitable for long-term outbreaks or a simultaneous analysis of several different outbreaks. The reason for not treating a stochastic epidemic model for an open population is the complicated quasi-stationary behaviour of such models; see Nåsell (1999). The estimators for open populations are usually the same as in a closed population but the standard errors are different.

Farrington et al. (2001) also treat a deterministic model for an open population allowing for various heterogeneities. By using available contact parameters from other related disease outbreaks, assuming some relationship between the contact rates for the diseases, they could estimate $R_0$ consistently. See Section 2 in Greenhalgh and Dietz (1994) for an excellent survey of related work in the analysis of epidemics. A short note treating problems similar to those of the present paper, but in a deterministic framework, has appeared recently (Britton, 1998a).

In Section 2 we define the multitype epidemic model and present asymptotic results for it. In Section 3 we derive estimates, including confidence bounds, of the fundamental parameter $R_0$. In Section 4 we use these results to construct vaccination programmes that prevent future outbreaks. Section 5 illustrates the results with an example.

2. The model

2.1. Definition

The model that we now define is a stochastic susceptible–infected–removed epidemic model for a closed multitype population (e.g. Ball and Clancy (1993)). Consider a closed population of size $n$ consisting of $k$ different types of individuals, labelled $1,\dots,k$, and let $n_i$ denote the number of $i$-individuals and $\pi_i = n_i/n$ the corresponding proportion. If an $i$-individual becomes infected he or she becomes infectious, possibly after a latency period with arbitrary distribution. During the infectious period an $i$-individual has 'close contact' with any given $j$-individual at rate $\gamma_{ij}/n$, where a close contact is defined as a contact which results in infection if the other individual is susceptible; otherwise the contact has no effect. The matrix $\{\gamma_{ij}\}$ of contact intensities is assumed to be irreducible, thus omitting the possibility of a major outbreak for some but not all types of individual. The infectious period $I_i$ has distribution $F_i$ with mean $\iota_i$ and standard deviation $\sigma_i$. For future use we define $\lambda_{ij} = \gamma_{ij}\iota_i$, implying that $\lambda_{ij}\pi_j$ denotes the expected number of close contacts which an $i$-individual has with $j$-individuals during the infectious period. When the infectious period is over, the individual recovers and becomes immune, and we say that the individual is removed. The epidemic evolves until there are no infectious individuals in the population. Then no-one can become infected and the epidemic has entered its final state. All contact processes and infectious periods are defined to be mutually independent.

As pointed out by Ball and Clancy (1993) the model can be generalized without affecting the distribution of the final state. The final state depends only on the distribution of the 'total infection forces' $\{n^{-1}\gamma_{ij}I_i\}$ for the different type combinations. Instead of assuming constant contact rates over the infectious period we may allow for a time-varying infectivity, including an initial latency period. This is modelled by a stochastic process $\{I_i(t);\ t \ge 0\}$, where $I_i(t)$ is the infectivity $t$ time units after infection of an $i$-individual, and it falls under the model defined above simply by letting $F_i$ denote the distribution of $\int_0^\infty I_i(t)\,dt$.

2.2. Asymptotic properties of the model

The asymptotic properties of the model above, for a large population, have been analysed extensively by Ball and Clancy (1993).
Starting with few initially infectious individuals in a large, otherwise susceptible, population, the epidemic can either take off and give a large outbreak or it may die out and infect very few, a general phenomenon for epidemic models. During the initial stages the epidemic can be approximated by a multitype branching process because infectious individuals infect new individuals virtually independently of each other since the probability that they will contact the same individual is negligible. For the model of the present paper the fundamental parameter $R_0$, the basic reproduction number, is defined as the largest positive eigenvalue of the matrix $(\lambda_{ij}\pi_j)$. Note that $\lambda_{ij}\pi_j = (\lambda_{ij}/n)\,n_j$ is the expected number of close contacts which an infectious $i$-individual has with $j$-individuals during the infectious period. In the branching process $(\lambda_{ij}\pi_j)$ thus corresponds to the matrix of mean offspring distribution. The approximating branching process is subcritical, critical or supercritical depending on whether $R_0$ is smaller than, equal to or larger than 1. It hence follows that, asymptotically, the probability of a large outbreak in a completely susceptible population is positive if and only if $R_0 > 1$ (e.g. Ball and Clancy (1993)).

If a proportion $1 - s_j$ of all $j$-individuals are initially immune, so the proportion $s_j$ are susceptible, then the effective reproduction number $R_e$ is the largest positive eigenvalue of the matrix $(\lambda_{ij}\pi_j s_j)$ and a major outbreak is possible if and only if $R_e > 1$. A simple argument for the last result is that we may neglect the immune individuals by introducing new notation: $n_i' = n_i s_i$, the number of susceptible $i$-individuals, $n' = \sum_i n_i'$, the total number of susceptible individuals, and $\pi_i' = n_i'/n' = \pi_i s_i/s$, the proportion of susceptible individuals that are of type $i$ (where $s = \sum_i \pi_i s_i$ is the overall proportion susceptible). The contact parameter is unchanged and equals $\gamma_{ij}/n = s\gamma_{ij}/n'$, so by introducing $\gamma_{ij}' = s\gamma_{ij}$ and similarly $\lambda_{ij}' = s\lambda_{ij}$ it follows from the result above that the effective reproduction number $R_e$ is the largest positive eigenvalue of the matrix $(\lambda_{ij}'\pi_j') = (\lambda_{ij}\pi_j s_j)$.

When making inferences we shall always assume that a major outbreak has occurred and that the initial number of infective individuals is small. The results are thus conditional on a major outbreak (otherwise there is not enough information for consistent estimation), which implicitly assumes that $R_e > 1$ from the properties stated above.

Consider the model defined above in a population with type distribution $\{\pi_i\}$ and initial proportions susceptible given by $\{s_i\}$. Let $\tilde p_i$ denote the random proportion among the initially susceptible $i$-individuals who become infected during the course of the epidemic. Applying the results in Ball and Clancy (1993) then shows, assuming few initial infective individuals and a major outbreak, that the vector $\{\tilde p_i\}$ converges in probability to $\{p_i\}$ as $n \to \infty$, where $\{p_i\}$ is the unique positive solution to the system of equations

$$1 - p_j = \exp\Big(-\sum_i \pi_i s_i p_i \lambda_{ij}\Big), \qquad j = 1,\dots,k. \tag{1}$$

Equations (1) have a natural interpretation: the proportion that escape infection equals the probability of escaping infection from the aggregated total infection forces. A central limit theorem in Ball and Clancy (1993) shows that the vector $\{\sqrt{n\pi_j s_j}\,(\tilde p_j - p_j)\}$ is asymptotically Gaussian with mean vector 0 and variance matrix $\Sigma = (S^{T})^{-1}\,\Xi\,S^{-1}$, where the matrices $S$ and $\Xi$ have elements

$$S_{ij} = \delta_{ij} - \sqrt{\pi_i s_i \pi_j s_j}\;\lambda_{ij}(1 - p_j),$$

$$\Xi_{ij} = p_i(1 - p_j)\,\delta_{ij} + \sqrt{\pi_i s_i \pi_j s_j}\;(1 - p_i)(1 - p_j)\sum_k \pi_k s_k p_k \lambda_{ki}\lambda_{kj}\,(\sigma_k/\iota_k)^2,$$

where $\delta_{ij}$ denotes the Kronecker delta function ($\delta_{ii} = 1$ and $\delta_{ij} = 0$, $i \ne j$).

2.3. Modelling vaccination

Suppose that a vaccine is available having efficacy $r_j$ among $j$-individuals, $j = 1,\dots,k$. It could for example be that all infection rates $\lambda_{ij}$, $i = 1,\dots,k$, are reduced by a factor $r_j$ (the so-called leaky effect), or that a proportion $r_j$ become completely immune and the rest are unaffected by the vaccine (the all-or-nothing effect). The case $r_j = 1$ for all $j$ corresponds to a perfect vaccine and $r_j = 0$ for all $j$ to a useless vaccine. See for example Halloran et al. (1992) for more about vaccine efficacy.
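As a numerical companion to Section 2.2, the sketch below solves the final-size system (1) by fixed-point iteration and computes $R_e$ as the dominant eigenvalue of $(\lambda_{ij}\pi_j s_j)$ by power iteration. The function names and the two-type contact matrix are hypothetical choices made for illustration only; they are not taken from the paper. It assumes all $\lambda_{ij} > 0$ and $R_e > 1$, so that the iteration started at $p = 1$ decreases to the positive root.

```python
import math

def final_size(lam, pi, s, tol=1e-12, max_iter=100_000):
    """Largest solution p of 1 - p_j = exp(-sum_i pi_i s_i p_i lam[i][j])."""
    k = len(pi)
    p = [1.0] * k  # starting from 1, the iterates decrease to the largest root
    for _ in range(max_iter):
        new = [1.0 - math.exp(-sum(pi[i] * s[i] * p[i] * lam[i][j] for i in range(k)))
               for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    raise RuntimeError("fixed-point iteration did not converge")

def largest_eigenvalue(m, iters=2000):
    """Power iteration for the dominant eigenvalue of a matrix with positive entries."""
    x = [1.0] * len(m)
    rho = 0.0
    for _ in range(iters):
        y = [sum(row[j] * x[j] for j in range(len(x))) for row in m]
        rho = max(y)              # max-norm of the iterate converges to the eigenvalue
        x = [v / rho for v in y]
    return rho

# Hypothetical two-type community (illustrative numbers, not from the paper)
lam = [[4.0, 1.0], [1.0, 2.0]]    # lam[i][j] plays the role of lambda_ij
pi = [0.3, 0.7]                   # type proportions pi_i
s = [0.9, 0.6]                    # proportions initially susceptible s_i

r_e = largest_eigenvalue([[lam[i][j] * pi[j] * s[j] for j in range(2)] for i in range(2)])
p = final_size(lam, pi, s)
print(f"R_e = {r_e:.3f}; limiting proportions infected = {[round(x, 3) for x in p]}")
```

Starting the iteration at 1 rather than near 0 avoids converging to the trivial root $p = 0$, which also satisfies (1).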
The propagation of disease transmission in a partly vaccinated community also having initially immune individuals can be described using the present model. Consider the same population as before having proportions initially susceptible given by $\{s_i\}$ and suppose that a proportion $v_j$ of all initially susceptible $j$-individuals are vaccinated with such a vaccine before the epidemic season. The expected number of close contacts that an infectious $i$-individual has with initially susceptible $j$-individuals is then reduced from $\lambda_{ij}\pi_j s_j$ to

$$\lambda_{ij}\pi_j s_j\{(1 - v_j) + v_j(1 - r_j)\} = \lambda_{ij}\pi_j s_j(1 - v_j r_j).$$

This is true because a proportion $v_j$ among the $j$-individuals have reduced their susceptibility by a factor $r_j$. The effective reproduction number after vaccination, $R_{ev}$, is then the largest eigenvalue of the matrix $(\lambda_{ij}\pi_j s_j(1 - v_j r_j))$ and the vaccinations performed will surely prevent an outbreak in the population if $R_{ev} \le 1$. Vaccinating with the effect of surely preventing an outbreak when the entire population is initially susceptible is of main interest, because this vaccination programme will be preventive whatever proportion is immune and will remain so if the disease-acquired immunity wanes with time. In what follows we shall thus focus on studying vaccination programmes for which $R_v \le 1$, where $R_v$ is the largest eigenvalue of the matrix $(\lambda_{ij}\pi_j(1 - v_j r_j))$. Vaccination programmes aiming at reducing $R_{ev}$ below 1, for some other specified susceptibility levels $\{s_i\}$, can be derived by using identical arguments.

3. Estimation of $R_0$

In this section we derive estimates of $R_0$ for the multitype epidemic defined in the previous section. Remember that $R_0$ is the largest eigenvalue of the matrix $(\lambda_{ij}\pi_j)$, i.e. for a completely susceptible population. The parameters are assumed to be unknown and are estimated with data from one epidemic outbreak $(\tilde p_1,\dots,\tilde p_k)$ which may have occurred in a community containing initially immune individuals. The proportions immune before the outbreak $\{1 - s_i\}$, the community structure $\{\pi_i\}$ and the community size $n$ are assumed to be known. It is important to take into account the presence of initially immune individuals when making inference on the reproduction number from an epidemic outbreak. If this is neglected the resulting estimates will underestimate the true parameters, with the effect that the suggested vaccination coverage may not be preventive. This is very different from assuming that the community to be vaccinated is completely susceptible, as assumed in the previous section. In the latter case the suggested proportions to vaccinate are preventive even when the assumption fails.

As mentioned in the previous section the vector $(\tilde p_1,\dots,\tilde p_k)$ converges in probability to $(p_1,\dots,p_k)$ defined in equations (1) as $n \to \infty$. We hence start by treating the deterministic limit before dealing with uncertainty.

3.1. Deterministic limit

With given vectors $\{p_j\}$ and $\{s_j\}$ satisfying equations (1) but $\{\lambda_{ij}\}$ otherwise arbitrary, $R_0$ is not determined uniquely, as has been observed previously (e.g. Greenhalgh and Dietz (1994) and Britton (1998b)). In fact $R_0$ can attain any value in an interval, as the following lemma shows.

Lemma 1. Let $\{p_j\}$, $\{s_j\}$ and $\{\pi_j\}$ be defined as above and let $\{\kappa_j\}$ be any given vector with positive elements. Then the largest positive eigenvalue of the matrix $(\lambda_{ij}\pi_j\kappa_j)$, where $\{\lambda_{ij}\}$ satisfies equations (1), lies in the closed interval $[\rho_{\min}, \rho_{\max}]$, where

$$\rho_{\min} = \min_i\{-\kappa_i\log(1 - p_i)/(s_i p_i)\}, \qquad \rho_{\max} = \max_i\{-\kappa_i\log(1 - p_i)/(s_i p_i)\}.$$

All values in the interval can be attained.

Remark 1. If lemma 1 is applied with $\kappa_i = 1$, $i = 1,\dots,k$, the largest eigenvalue specifies $R_0$. It hence follows that $R_0$ lies between

$$R_0^{\min} = \min_i\{-\log(1 - p_i)/(s_i p_i)\}$$

and

$$R_0^{\max} = \max_i\{-\log(1 - p_i)/(s_i p_i)\}.$$

Proof. Denote the largest eigenvalue of the matrix $(\lambda_{ij}\pi_j\kappa_j)$ by $\rho$. By the Perron–Frobenius theorem it follows that there is a vector $\{x_i\}$ with positive components, unique up to normalization, such that

$$L_j := \rho x_j = \sum_i x_i\lambda_{ij}\pi_j\kappa_j =: R_j, \qquad j = 1,\dots,k \tag{2}$$

(e.g. Jagers (1975), pages 92–93). Define $M = \max_{1\le j\le k} x_j/(\pi_j s_j p_j)$, and suppose that the maximum is attained for $j = j_0$, i.e. $x_{j_0}/(\pi_{j_0}s_{j_0}p_{j_0}) = M$. Then, by the definition of the left-hand side in equation (2) we have $L_{j_0} = \rho M\pi_{j_0}s_{j_0}p_{j_0}$. The right-hand side can be dominated as follows:

$$R_{j_0} = \sum_i \frac{x_i}{\pi_i s_i p_i}\,\pi_i s_i p_i\lambda_{ij_0}\pi_{j_0}\kappa_{j_0} \le M\sum_i \pi_i s_i p_i\lambda_{ij_0}\pi_{j_0}\kappa_{j_0} = M\pi_{j_0}\kappa_{j_0}\{-\log(1 - p_{j_0})\},$$

where the last equality is equation (1). Since $\rho M\pi_{j_0}s_{j_0}p_{j_0} = L_{j_0} = R_{j_0}$ this gives the upper bound

$$\rho \le -\kappa_{j_0}\log(1 - p_{j_0})/(s_{j_0}p_{j_0}) \le \max_i\{-\kappa_i\log(1 - p_i)/(s_i p_i)\} = \rho_{\max}.$$

An identical argument shows that $\rho \ge \rho_{\min}$. Below are some observations showing that the end points of the interval can be obtained (when $\kappa_i = 1$). Finally, any point in the interval can for example be obtained by a linear combination of the two extremes. □

In the restricted parameter space known as separable mixing (e.g. Hethcote and Van Ark (1987)), i.e. $\lambda_{ij} = \phi_i\psi_j$, the parameters may be interpreted as infectivity and susceptibility respectively. (Sometimes this is called proportional mixing, but most often proportional mixing is used for the stronger assumption that $\lambda_{ij} = \phi_i\phi_j$.) Then the basic reproduction number has the explicit expression

$$R_0 = \sum_i \pi_i\phi_i\psi_i \tag{3}$$

(e.g. Becker and Marschner (1990)). This can be used to verify the following observations.

(a) $R_0^{\max}$, the 'worst scenario', is attained in the separable mixing case if $\psi_i = -\log(1 - p_i)$, $i = 1,\dots,k$, and $\phi_i = 0$ for all $i$ except for the type $i_0$ maximizing $-\log(1 - p_i)/(s_i p_i)$, for which $\phi_{i_0} = 1/(\pi_{i_0}s_{i_0}p_{i_0})$. This choice gives $R_0^{\max}$ when inserted in equation (3) and condition (1) is also satisfied.

(b) $R_0^{\min}$, the 'best scenario', is attained in the separable mixing case if $\psi_i = -\log(1 - p_i)$, $i = 1,\dots,k$, $\phi_{i_1} = 1/(\pi_{i_1}s_{i_1}p_{i_1})$ for the type $i_1$ minimizing $-\log(1 - p_i)/(s_i p_i)$, and $\phi_i = 0$ for all other $i$.
(c) Under the assumption of equal infectivity ($\lambda_{ij} = \psi_j$) equations (1) determine the parameters uniquely, $\psi_j = -\log(1 - p_j)/\sum_i \pi_i s_i p_i$, so $R_0$ is completely specified and equals $\sum_j \pi_j\{-\log(1 - p_j)\}/\sum_i \pi_i s_i p_i$.

3.2. Stochastic model

In the previous subsection bounds on the basic reproduction number were derived for the deterministic limit of the multitype epidemic. In a finite community the observed proportions infected $\{\tilde p_i\}$ are random, so the corresponding quantities $\tilde R_0^{\max}$ and $\tilde R_0^{\min}$, where $p_i$ is replaced by $\tilde p_i$, are random estimates of these quantities. We now derive a one-sided confidence interval for $R_0^{\max}$ which will be used in the next section when deriving how many need to be vaccinated to obtain herd immunity with some given certainty.
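The bounds of Remark 1 and the equal-infectivity value of observation (c) need only the proportions infected and the proportions initially susceptible, so they are easy to evaluate. The following is a minimal sketch; the function names are ours, and the inputs are the two-type figures used in the example of Section 5 (proportions infected 0.8 and 0.2, proportions susceptible 0.9 and 0.6, type proportions 0.3 and 0.7).

```python
import math

def r0_bounds(p, s):
    """(R0_min, R0_max): min and max over types of -log(1 - p_i) / (s_i p_i)."""
    ratios = [-math.log(1.0 - p_i) / (s_i * p_i) for p_i, s_i in zip(p, s)]
    return min(ratios), max(ratios)

def r0_equal_infectivity(p, s, pi):
    """R0 when all types are equally infectious, observation (c)."""
    total_infected = sum(w * s_i * p_i for w, s_i, p_i in zip(pi, s, p))
    return sum(-w * math.log(1.0 - p_i) for w, p_i in zip(pi, p)) / total_infected

p = [0.8, 0.2]    # proportions infected among initially susceptible
s = [0.9, 0.6]    # proportions initially susceptible
pi = [0.3, 0.7]   # type proportions (300 children, 700 adults)

r0_min, r0_max = r0_bounds(p, s)
r0_eq = r0_equal_infectivity(p, s, pi)
print(f"R0 in [{r0_min:.3f}, {r0_max:.3f}]; equal infectivity gives {r0_eq:.3f}")
```

The equal-infectivity value falls strictly inside the interval, which illustrates how much of the uncertainty about $R_0$ comes from not knowing who infects whom rather than from sampling variation.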

As noted in the previous subsection $R_0$ was maximized when one type caused all infections, the type $i_0$ which maximizes $-\log(1 - p_i)/(s_i p_i)$. (When all types have equal proportions initially susceptible, e.g. $s_i = 1$ for all $i$, this is the type with highest proportion infected $p_i$.) We therefore assume this to hold when constructing confidence bounds and thus overcome the fact that all parameters are not identifiable. The variance of an estimated $R_0$ may of course be larger for other parameter configurations, but these configurations all have a limiting $R_0$ smaller than $R_0^{\max}$, so then our confidence bound for $R_0^{\max}$ will still be conservative.

In Section 2.2 the asymptotic variance matrix of $\{\sqrt{n\pi_i s_i}\,(\tilde p_i - p_i)\}$, denoted $\Sigma$, was given. For the case when $\lambda_{ij} = 0$, $i \ne i_0$, $\Sigma_{i_0 i_0}$ is explicit and equals

$$\Sigma_{i_0 i_0} = \frac{p_{i_0}(1 - p_{i_0})\{1 + (\pi_{i_0}s_{i_0}\lambda_{i_0 i_0})^2(1 - p_{i_0})(\sigma_{i_0}/\iota_{i_0})^2\}}{\{1 - \pi_{i_0}s_{i_0}\lambda_{i_0 i_0}(1 - p_{i_0})\}^2} = \frac{p_{i_0}(1 - p_{i_0})[\,p_{i_0}^2 + \{\log(1 - p_{i_0})\}^2(1 - p_{i_0})(\sigma_{i_0}/\iota_{i_0})^2\,]}{[\,p_{i_0} + (1 - p_{i_0})\log(1 - p_{i_0})\,]^2}. \tag{4}$$

The second equality follows from the assumption that $\lambda_{ij} = 0$, $i \ne i_0$, implying that $\pi_{i_0}s_{i_0}\lambda_{i_0 i_0} = -\log(1 - p_{i_0})/p_{i_0}$ for condition (1) to hold. The asymptotic variance of $\tilde p_{i_0}$ is $\Sigma_{i_0 i_0}/(n\pi_{i_0}s_{i_0})$. If we perform this substitution and replace $p_{i_0}$ by $\tilde p_{i_0}$ we obtain an explicit standard error

$$\operatorname{se}(\tilde p_{i_0}) = \frac{\sqrt{\tilde p_{i_0}(1 - \tilde p_{i_0})[\,\tilde p_{i_0}^2 + \{\log(1 - \tilde p_{i_0})\}^2(1 - \tilde p_{i_0})(\sigma_{i_0}/\iota_{i_0})^2\,]}}{\sqrt{n\pi_{i_0}s_{i_0}}\,\{\tilde p_{i_0} + (1 - \tilde p_{i_0})\log(1 - \tilde p_{i_0})\}}. \tag{5}$$

The quantity $\sigma_{i_0}/\iota_{i_0}$ appearing in equation (5) denotes the coefficient of variation of the length of the infectious period for type $i_0$. This quantity must be known or else estimated using prior information; final size data carry no information about any temporal quantities. Our estimate $\hat R_0^{\max} = -\log(1 - \tilde p_{i_0})/(s_{i_0}\tilde p_{i_0})$ is increasing in $\tilde p_{i_0}$. Replacing $\tilde p_{i_0}$ by an upper confidence limit will thus produce an upper confidence limit for $R_0^{\max}$. We summarize our results in the following theorem.

Theorem 1. Let $i_0$ be defined as the index maximizing $-\log(1 - \tilde p_i)/(s_i\tilde p_i)$, assumed to be asymptotically unique. Then, for the multitype epidemic model,

$$\hat R_0^{\max} = -\log(1 - \tilde p_{i_0})/(s_{i_0}\tilde p_{i_0}) = \max_i\{-\log(1 - \tilde p_i)/(s_i\tilde p_i)\} \tag{6}$$

is a consistent and asymptotically Gaussian estimator for $R_0^{\max}$ (defined in lemma 1).
The asymptotic variance of the estimator is

$$\operatorname{var}(\hat R_0^{\max}) = \frac{p_{i_0}^2 + \{\log(1 - p_{i_0})\}^2(1 - p_{i_0})(\sigma_{i_0}/\iota_{i_0})^2}{n\pi_{i_0}s_{i_0}^3 p_{i_0}^3(1 - p_{i_0})}. \tag{7}$$

A $1 - \alpha$ upper confidence bound for $R_0^{\max}$ is given by

$$\max_i\{-\log(1 - \tilde p_i^{(\alpha)})/(s_i\tilde p_i^{(\alpha)})\}, \tag{8}$$

where $\tilde p_i^{(\alpha)} = \tilde p_i + z_\alpha\operatorname{se}(\tilde p_i)$. Here $\operatorname{se}(\tilde p_i)$ is defined as in equation (5) only replacing $i_0$ by $i$, and $z_\alpha$ is the $(1 - \alpha)$-quantile of the standard normal distribution.

Remark 2. If uniqueness is not assumed the estimator is still consistent. However, then the variance is not correct as, in the limit, the index $i_0$ varies. The confidence bound given by expression (8) may be used as a confidence bound for $R_0$ for any set of underlying parameters and is then conservative.

Proof. Consistency and asymptotic normality are a direct consequence of the asymptotic results stated in Section 2.2, together with the assumption of asymptotic uniqueness of $i_0$. The variance formula is obtained by the delta method on $\hat R_0^{\max} = f(\tilde p_{i_0})$ viewed as a function of $\tilde p_{i_0}$. It follows that $\operatorname{var}(\hat R_0^{\max}) = \{f'(p_{i_0})\}^2\operatorname{var}(\tilde p_{i_0})$ plus terms of smaller order. It follows from simple algebra that this equals equation (7). The standard error for $\tilde p_i$ is a consistent estimate for the standard deviation since the observed quantities converge to the limits defined by equations (1). The asymptotic normality then implies that the upper confidence bound defined by expression (8) is correct.

4. Control

We now return to controlling the spread of disease by means of vaccination as discussed in Section 2.3, only now the model parameters are assumed unknown and are estimated by using methods presented in the previous section.

4.1. Deterministic limit

Suppose that a vaccination programme, for which the vaccine has known efficacy $\{r_j\}$, is to be carried out. The contact matrix $\{\lambda_{ij}\}$ is known to satisfy condition (1). As mentioned previously the necessary vaccination levels will be derived assuming a completely susceptible community, this being a conservative assumption. If a proportion $v_i$ of the $i$-individuals are vaccinated, $i = 1,\dots,k$, then the resulting reproduction number $R_v$ is given by the largest eigenvalue of the matrix $(\lambda_{ij}\pi_j(1 - v_j r_j))$; see Section 2.3. Applying lemma 1 with $\kappa_j = 1 - v_j r_j$ then shows that $R_v$ is contained in the interval

$$\Big[\min_{1\le i\le k}\Big\{-(1 - r_i v_i)\frac{\log(1 - p_i)}{s_i p_i}\Big\},\ \max_{1\le i\le k}\Big\{-(1 - r_i v_i)\frac{\log(1 - p_i)}{s_i p_i}\Big\}\Big].$$

Herd immunity is obtained if $R_v \le 1$. This is surely the case only when the upper end of the interval does not exceed 1, or equivalently $v_i \ge r_i^{-1}\{1 + s_i p_i/\log(1 - p_i)\}$, for each $i$.
The optimal vaccination strategy for the multitype epidemic, meaning the vaccination programme vaccinating the smallest number of individuals among all vaccination strategies that surely prevent future outbreaks, is thus given by

$$v_i = \frac{1}{r_i}\Big(1 + \frac{s_i p_i}{\log(1 - p_i)}\Big), \qquad i = 1,\dots,k. \tag{9}$$

These proportions will surely prevent future outbreaks. Each estimate is conservative in that the proportion is derived under the assumption that all infectivity comes from that specific type. Unless the vaccine is perfect, i.e. $r_i = 1$ for all $i$, it may happen that $v_j > 1$ for some $j$. This implies that the community is not surely protected from future outbreaks even when every such individual is vaccinated, i.e. the vaccine is not sufficiently effective to obtain herd immunity.
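Equation (9) can be evaluated directly from the proportions infected. The sketch below does so for the two-type figures of Section 5 with a 90% effective vaccine, and then for a hypothetical 50% effective vaccine to show a coverage above 1. The function name and the 50% scenario are our own illustrative choices.

```python
import math

def critical_coverage(p, s, r):
    """Equation (9): v_i = (1 / r_i) * (1 + s_i p_i / log(1 - p_i)) for each type."""
    return [(1.0 + s_i * p_i / math.log(1.0 - p_i)) / r_i
            for p_i, s_i, r_i in zip(p, s, r)]

p = [0.8, 0.2]   # limiting (or observed) proportions infected
s = [0.9, 0.6]   # proportions initially susceptible during the outbreak

v_good = critical_coverage(p, s, r=[0.9, 0.9])   # 90% efficacy for both types
v_weak = critical_coverage(p, s, r=[0.5, 0.5])   # hypothetical weaker vaccine

print([round(v, 3) for v in v_good])
print([round(v, 3) for v in v_weak])
# A value above 1 means that even full coverage of that type is not
# guaranteed to give herd immunity with this vaccine.
unattainable = [i + 1 for i, v in enumerate(v_weak) if v > 1.0]
print("types needing more than 100% coverage:", unattainable)
```

Because each $v_i$ is computed as if type $i$ alone caused all infections, the types can be handled one at a time; no eigenvalue computation is needed.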

4.2. Uncertainty in estimates

The vaccination coverages for different types are estimated from equation (9) simply by replacing the limits $\{p_i\}$ by the observed proportions infected $\{\tilde p_i\}$. As mentioned above, $v_i$, and hence also its estimate $\hat v_i$, was obtained assuming that $i$-individuals were responsible for all infections. The uncertainty of the estimate should thus be obtained under this assumption. Using arguments that are identical with those for the variance of $\tilde p_{i_0}$ presented in Section 3.2 it follows that the asymptotic variance of $\tilde p_i$ assuming $\lambda_{kj} = 0$, $k \ne i$, equals $\Sigma_{ii}/(n\pi_i s_i)$ where $\Sigma_{ii}$ is defined as in equation (4) only replacing $i_0$ by $i$. Since $\hat v_i$ is increasing in $\tilde p_i$, upper confidence bounds for $v_i$ can be obtained by replacing the estimate $\tilde p_i$ by the upper confidence bound $\tilde p_i^{(\alpha)}$ defined in theorem 1. We summarize the results in the following theorem.

Theorem 2. The estimates defined by

$$\hat v_i = \frac{1}{r_i}\Big(1 + \frac{s_i\tilde p_i}{\log(1 - \tilde p_i)}\Big), \qquad i = 1,\dots,k, \tag{10}$$

are consistent and asymptotically Gaussian estimates of the critical vaccination coverage of the multitype epidemic defined by equation (9). The asymptotic variance for $\hat v_i$ is

$$\operatorname{var}(\hat v_i) = \frac{s_i^2 p_i^2\,[\,p_i^2 + \{\log(1 - p_i)\}^2(1 - p_i)(\sigma_i/\iota_i)^2\,]}{r_i^2\,n\pi_i s_i p_i(1 - p_i)\{\log(1 - p_i)\}^4}, \tag{11}$$

and a $1 - \alpha$ upper confidence bound for $v_i$ is given by

$$\hat v_i^{(\alpha)} = \frac{1}{r_i}\Big(1 + \frac{s_i\tilde p_i^{(\alpha)}}{\log(1 - \tilde p_i^{(\alpha)})}\Big), \tag{12}$$

where $\tilde p_i^{(\alpha)}$ is defined in theorem 1.

Proof. Consistency and asymptotic normality follow by using arguments that are similar to those in the proof of theorem 1. The variance expression (11) is obtained by using the delta method and the upper confidence bound (12) is derived simply by inserting an upper confidence estimate $\tilde p_i^{(\alpha)}$ for the unknown quantity $p_i$. □

The vaccination coverages defined above are preventive when the community is completely susceptible. Note that the epidemic outbreak on which inference is based could still have contained initially immune individuals. If the community is not completely susceptible the vaccination levels may be lower. In case the susceptible proportions are known and equal to $\{s_i'\}$ say, then the vaccination programme should vaccinate enough individuals such that $R_{ev}$ does not exceed 1.
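The standard error (5) and the upper confidence bounds (8) and (12) can be sketched as follows. Function and variable names are ours, and the inputs are the two-type figures of Section 5 (coefficient of variation 3/7, 95% one-sided level). The sketch assumes that the shifted proportion $\tilde p_i + z_\alpha\operatorname{se}(\tilde p_i)$ stays below 1; it does not handle the case where it does not.

```python
import math
from statistics import NormalDist

def se_p(p, n_susceptible, cv):
    """Equation (5); n_susceptible = n * pi_i * s_i, cv = sigma_i / iota_i."""
    log_q = math.log(1.0 - p)
    numerator = p * (1.0 - p) * (p ** 2 + log_q ** 2 * (1.0 - p) * cv ** 2)
    denominator = math.sqrt(n_susceptible) * (p + (1.0 - p) * log_q)
    return math.sqrt(numerator) / denominator

def upper_p(p, n_susceptible, cv, level=0.95):
    """Upper confidence limit for p_i: observed proportion plus z * se."""
    z = NormalDist().inv_cdf(level)       # one-sided normal quantile
    return p + z * se_p(p, n_susceptible, cv)

p_obs = [0.8, 0.2]                        # observed proportions infected
s = [0.9, 0.6]                            # proportions initially susceptible
n_susc = [300 * 0.9, 700 * 0.6]           # initially susceptible individuals per type
r = [0.9, 0.9]                            # vaccine efficacy
cv = 3.0 / 7.0

se = [se_p(p, m, cv) for p, m in zip(p_obs, n_susc)]
p_up = [upper_p(p, m, cv) for p, m in zip(p_obs, n_susc)]
r0_max_upper = max(-math.log(1.0 - q) / (s_i * q) for q, s_i in zip(p_up, s))    # (8)
v_upper = [(1.0 + s_i * q / math.log(1.0 - q)) / r_i
           for q, s_i, r_i in zip(p_up, s, r)]                                   # (12)

print([round(x, 3) for x in se], round(r0_max_upper, 3), [round(v, 3) for v in v_upper])
```

Shifting $\tilde p_i$ and then applying the monotone maps in (8) and (12) avoids a separate variance formula for each derived quantity.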
Recall that $R_{ev}$, the effective reproduction number after vaccination, was defined as the largest eigenvalue of the matrix $(\iota_i \lambda_{ij} \pi_j s_j' (1 - v_j r_j))$. Lemma 1 can be applied, with a different choice of its parameters, to solve this problem using the same methods as above. In fact, the community structure may also be altered from that of the epidemic outbreak, to $\{\pi_j'\}$ say. We could thus estimate vaccination programmes for a community that is different from the community on which the outbreak inference is based. However, these estimates will only be valid if the contact parameters $\{\lambda_{ij}\}$ are the same for the two communities.

5. An example

A simple example with two types of individual illustrates the methods in the paper. Suppose that a community consists of 1000 individuals, 300 children and 700 adults. Before the epidemic individuals are tested for antibodies and it is found that 90% of the children and 60% of the adults were susceptible. An epidemic outbreak then occurs, resulting in 80% of the susceptible children and 20% of the susceptible adults becoming infected. Further we assume that the coefficient of variation of the length of the infectious period is the same for both types and equals 3/7, e.g. 7 days on average and 3 days standard deviation. With the terminology of the paper we have (children being type 1) $n\pi_1 = 300$, $s_1 = 0.9$, $\tilde p_1 = 0.8$, $n\pi_2 = 700$, $s_2 = 0.6$, $\tilde p_2 = 0.2$ and $\sigma_1/\iota_1 = \sigma_2/\iota_2 = 3/7$. This implies that $-\log(1 - \tilde p_1)/(s_1 \tilde p_1) = 2.235$ and $-\log(1 - \tilde p_2)/(s_2 \tilde p_2) = 1.859$. From equation (6) it hence follows that $\hat R_0^{\max} = 2.235$, which is estimated for the worst case where children cause all infections. Using equation (5) the standard errors for $\tilde p_1$ and $\tilde p_2$ are 0.044 and 0.198 respectively. A 95% upper confidence bound for $R_0^{\max}$ equals 2.618, computed using expression (8).

Suppose now that a vaccine having 90% efficacy, the same for both types so $r_1 = r_2 = 0.9$, is available. The necessary proportion to vaccinate to avoid future epidemics depends on the proportion of individuals who are immune to the disease. For example, directly after the epidemic the community is protected without any vaccination. However, as time passes the disease-acquired immunity usually wanes and an increasing proportion must be vaccinated to obtain herd immunity. The necessary proportions to vaccinate when the entire community is susceptible, estimated from equation (10), are $\hat v_1 = 0.614$ and $\hat v_2 = 0.514$. Taking uncertainty into account by giving 95% upper confidence bounds, using equation (12), changes these to 0.687 for $v_1$ and 0.641 for $v_2$. We conclude that at least 69% of the children and 64% of the adults should be vaccinated to prevent future outbreaks. This level of vaccination will keep the community protected even when the entire community is initially susceptible, but also if some individuals are immune.
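The figures quoted in this example can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, the variance formula in `se_p` is the standard single-type final-size variance used here as a stand-in for equation (5), and the upper bound for each proportion infected is taken to be the estimate plus a one-sided normal quantile times its standard error.

```python
import math
from statistics import NormalDist

# Data from the example: type -> (number of individuals, s, p_tilde).
DATA = {"children": (300, 0.9, 0.8), "adults": (700, 0.6, 0.2)}
CV = 3 / 7          # coefficient of variation of the infectious period
EFFICACY = 0.9      # vaccine efficacy r, the same for both types
Z = NormalDist().inv_cdf(0.95)  # one-sided 95% normal quantile


def r0_worst(s, p):
    """R0 if this type alone caused all infections."""
    return -math.log(1.0 - p) / (s * p)


def se_p(n_type, s, p, cv=CV):
    """ASSUMED standard error of p_tilde (single-type final-size variance)."""
    log_q = math.log(1.0 - p)
    numer = p * (1.0 - p) * (p ** 2 + cv ** 2 * (1.0 - p) * log_q ** 2)
    return math.sqrt(numer / (n_type * s * (p + (1.0 - p) * log_q) ** 2))


def coverage(s, p, r=EFFICACY):
    """Critical coverage of equation (10), equal to (1 - 1/R0) / r."""
    return (1.0 - 1.0 / r0_worst(s, p)) / r


results = {}
for name, (n_type, s, p) in DATA.items():
    p_up = p + Z * se_p(n_type, s, p)  # assumed form of the upper bound
    results[name] = {
        "R0": r0_worst(s, p),
        "se": se_p(n_type, s, p),
        "R0_up": r0_worst(s, p_up),
        "v": coverage(s, p),
        "v_up": coverage(s, p_up),
    }
    print(name, {k: round(val, 3) for k, val in results[name].items()})

print("R0_max estimate:", round(max(x["R0"] for x in results.values()), 3))
print("R0_max upper bound:", round(max(x["R0_up"] for x in results.values()), 3))
```

Under these assumptions the printed values should agree with the figures quoted in the text to within about 0.001, the remaining differences being rounding in the third decimal.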
The same levels also apply to any other community having the same community and contact structure.

6. Discussion

In the present paper it is shown that fundamental parameters, such as $R_0$ and the critical vaccination coverage, cannot be estimated consistently from final size data in a simple multitype epidemic model. It is shown that a range of parameter configurations (having different $R_0$!) are consistent with data. The largest range of possible values of $R_0$ appears when the proportion infected among different types varies greatly. In particular it cannot be determined who causes further infections. However, by estimating 'the worst case' it is still possible to derive vaccination programmes which surely prevent future outbreaks. The suggested vaccination coverage consists of vaccinating the proportion of a type such that the whole community is 'safe', herd immune, even if this type causes all further infections.

The paper is meant to serve as an example showing that consistent estimation is often not possible even in simple epidemic models. If temporal data from an outbreak are available, a topic which is not treated in the present paper, all the parameters are often, but not always, identifiable (Britton, 1998b).

To derive expressions for the critical vaccination coverage is not only of academic interest. Of course, health practitioners aim for complete vaccination coverage but this is hardly ever possible to achieve. For this reason expressions for the critical vaccination coverage may serve as the lowest acceptable coverage. The results of the present paper also show that it is not enough to vaccinate only in the most susceptible subgroups unless prior information about infectivity among subgroups is available: all groups must be partly vaccinated for the community to have herd immunity surely.

The model is still some way from being realistic in that it does not allow mixing at different levels (see Ball et al. (1997)), which is natural when social structures are present. If both individual heterogeneities and social structures such as households are acknowledged, then estimation quickly becomes cumbersome (e.g. Addy et al. (1991) and Britton and Becker (2000)). A thorough study for such models remains to be performed.

In real life the proportions of various types, the proportions immune and the proportions infected during an epidemic are not fully known, only estimates thereof. This means that only estimates of $\{\pi_i, s_i\}$ and the random quantities $\{\tilde p_i\}$ are available. This will result in more uncertainty when estimating $R_0$ and the vaccination coverage. If the uncertainty in measurement error is quantified it is possible to derive how this affects the uncertainty in the estimates by using the delta method. In the present paper measurement error has been neglected and such an analysis remains to be performed.

A different approach that is worthy of exploration is to use Markov chain Monte Carlo methods (see O'Neill et al. (2000) for an application of Markov chain Monte Carlo methods to epidemic models). The fact that $R_0$ is unidentifiable will also have consequences for such an approach. However, if the relative infectivities of the different types are equipped with informative prior distributions, this will induce a posterior distribution for $R_0$ indicating which part of the range of possible values is most likely.

Acknowledgements

This work was initiated by stimulating discussions with Sergei Utev and Niels Becker while visiting La Trobe University. Financial support from the Swedish Natural Science Research Council is gratefully acknowledged. Constructive suggestions from the Associate Editor and three referees have improved the paper considerably.

References

Addy, C. L., Longini, I. M. and Haber, M. (1991) A generalized stochastic model for the analysis of infectious disease final size data. Biometrics, 47, 961–974.
Anderson, R. M. and May, R. M. (1984) Spatial, temporal and genetic heterogeneity in host populations and the design of immunisation programs. IMA J.
Math. Appl. Med. Biol., 1, 233–266.
Anderson, R. M. and May, R. M. (1991) Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford University Press.
Ball, F. and Clancy, D. (1993) The final size and severity of a generalised stochastic multitype epidemic model. Adv. Appl. Probab., 25, 721–736.
Ball, F., Mollison, D. and Scalia-Tomba, G. (1997) Epidemics with two levels of mixing. Ann. Appl. Probab., 7, 46–89.
Becker, N. G. and Marschner, I. C. (1990) The effect of heterogeneity on the spread of disease. Lect. Notes Biomath., 86, 90–103.
Britton, T. (1998a) Preventing epidemics in heterogeneous communities. In Proc. 19th Int. Biometric Conf., invited papers, pp. 109–115. Cape Town: International Biometric Society.
Britton, T. (1998b) Estimation in multitype epidemics. J. R. Statist. Soc. B, 60, 663–679.
Britton, T. and Becker, N. G. (2000) Estimating the immunity coverage required to prevent epidemics in a community of households. Biostatistics, 1, 389–402.
Farrington, C. P., Kanaan, M. N. and Gay, N. J. (2001) Estimation of the basic reproduction number for infectious diseases from age-stratified serological survey data. Appl. Statist., 50, 251–292.
Greenhalgh, D. and Dietz, K. (1994) Some bounds on estimates for reproduction ratio derived from the age-specific force of infection. Math. Biosci., 124, 9–57.
Grenfell, B. T. and Anderson, R. M. (1985) The estimation of age-related rates of infection from case notifications and serological data. J. Hyg. Camb., 94, 419–436.
Halloran, M. E., Haber, M. and Longini, I. M. (1992) Interpretation and estimation of vaccine efficacy under heterogeneity. Am. J. Epidem., 136, 328–343.
Hethcote, H. W. and Van Ark, J. W. (1987) Epidemiological models for heterogeneous populations: proportionate mixing, parameter estimation and immunization programs. Math. Biosci., 84, 85–118.
Jagers, P. (1975) Branching Processes with Biological Applications. London: Wiley.
Nåsell, I. (1999) On the time to extinction in recurrent epidemics. J. R. Statist. Soc. B, 61, 309–330.
O'Neill, P. D., Balding, D. J., Becker, N. G., Eerola, M. and Mollison, D. (2000) Analyses of infectious disease data from household outbreaks by Markov chain Monte Carlo methods. Appl. Statist., 49, 517–542.