Submitted to the Annals of Applied Statistics

USING EMPIRICAL LIKELIHOOD TO COMBINE DATA: APPLICATION TO FOOD RISK ASSESSMENT

By Amélie Crépet, Hugo Harari-Kermadec and Jessica Tressou

INRA Mét@risk and INRA CORELA

This paper introduces an original methodology based on empirical likelihood which aims at combining different contamination and consumption surveys in order to provide risk managers with a risk measure taking account of all the available information. This risk index is defined as the probability that exposure to a contaminant exceeds a safe dose. It is expressed as a nonlinear functional of the different consumption and contamination distributions, more precisely as a generalized U-statistic. This nonlinearity and the huge size of the data sets make direct computation of the problem unfeasible. Using linearization techniques and incomplete versions of the U-statistic, a tractable approximated empirical likelihood program is solved, yielding asymptotic confidence intervals for the risk index. An alternative Euclidean likelihood program is also considered, replacing the Kullback-Leibler distance involved in the empirical likelihood by the Euclidean distance. Both methodologies are tested on simulated data and applied to assess the risk due to the presence of methyl mercury in fish and other seafoods.

Introduction. Certain foods may contain varying amounts of chemicals such as methyl mercury (present in sea food), dioxins (in poultry, meat) or mycotoxins (in cereals, dried fruits, etc.), which may cause major health problems when accumulating inside the body in excessive doses. A commonly used measure of such chronic risks related to the presence of chemical contaminants in food is the probability that the contaminant intake/exposure exceeds a safe dose determined by international expert committees based on experimental and/or epidemiological studies. A fundamental problem when estimating this food risk index is the diversity of data sources and the scarcity of good databases.
The third author is visiting Hong Kong University of Science and Technology; her research is supported in part by Hong Kong RGC Grant #60906.
AMS 2000 subject classifications: Primary 62P12; secondary 62D05, 62E20.
Keywords and phrases: Incomplete U-statistics, Euclidean likelihood, Exposure to methyl mercury, Sea food consumption, Risk index.

First, the assessment is most of the time conducted from consumption and contamination data independently available since measuring the exposure directly over long periods of
time is not feasible. Moreover, information on the consumption behavior of a given population is obtained through different types of survey (household budget panels, food dietary records, 24 hours recall and food frequency questionnaires) using different methodologies (stratified sampling, random sampling or quota methods), and analytical contamination data also come from different laboratories. Yet, an accurate estimation of the food risk index is crucial since the resulting confidence intervals may serve as arguments for nutritional recommendations or establishment of new standards on the contamination of the food. It is therefore necessary to develop a methodology to build such a confidence interval by combining all the available data and side information, such as the main differences between the surveys, known biases or censorship, etc.

Data combination is useful in many domains and has been considered from an econometric/economist point of view in [3]. It can also be linked to meta-analysis techniques mostly used in medical statistics [??? ]. Other methods can be applied to incorporate side information, see [, 5, 9]. The methodology chosen in this paper is based on empirical likelihood techniques introduced by Art B. Owen in [25, 26] as a powerful semiparametric inference method based on a data driven likelihood ratio function. Refer to Owen's book [27] and the references therein for a complete bibliography on the topic. Empirical likelihood is very well adapted to our estimation problem. Indeed, as explained in [? ], due to the correlations among the different quantities and the presence of numerous null consumptions, fitting a parametric model to multidimensional consumption data is difficult and nonparametric methods are mostly recommended. Moreover, the estimation of the food risk index should include all the available sources of information about consumption and contamination.
This kind of estimation problem has already been studied from a theoretical point of view (combination of independent samples for the estimation of their common mean, see [29], or [27] pages 5, 30 and 223-225), but its application to a concrete applied problem raises intractable difficulties in terms of computation, especially for food risk assessment. Indeed, data set lengths do not add but multiply: the combination of, say, 3 data sets of length 1000 already yields a billion triplets. We propose a solution based on U-statistics to handle this difficulty.

The outline of the paper is as follows. Section 1 introduces the framework and notation used in food risk assessment problems and defines the Empirical Likelihood Problem (ELP), which is at first glance difficult to solve due to the high nonlinearity of the parameter of interest. Section 2 states the first main result to approximate the ELP solution using linearization techniques, noticing that the food risk index is a generalized U-statistic
simplified through its Hoeffding decomposition [see 3]. The practical computation of this solution in the multidimensional case is treated in section 3 via incomplete U-statistics. An alternative Euclidean likelihood program is considered in section 4, replacing the Kullback-Leibler distance involved in the ELP by the Euclidean distance. Finally, section 5 gives an illustration of these methodologies on true datasets concerning methyl mercury exposure of the French population as well as a validation of these methodologies using simulated datasets. The possible generalizations of these methodologies and the specific extensions in the case of food risk assessment are addressed in section 6. Technical proofs are postponed to an appendix.

1. Framework and notation. Our goal is to estimate $\theta_d$, the probability that exposure to a contaminant exceeds a tolerable dose $d$, when $P$ products (or groups of products) are assumed to be contaminated. For this purpose, $P + 2$ data sets are available: two $P$-dimensional data sets coming from two complementary consumption surveys and the $P$ sets of contamination values. We assume that the 2 consumption surveys concern the same population. Therefore the probabilities that exposure to a contaminant exceeds a dose $d$ estimated with each consumption sample are equal, and their common value is $\theta_d$. Our aim is to give a confidence interval for $\theta_d$ using empirical likelihood techniques.

Notation. For $k = 1, \dots, P$, $Q^{[k]}$ denotes the random variable for the contamination of product $k$, with distribution $\mathcal{Q}^{[k]}$. $(q_l^{[k]})_{l = 1, \dots, L_k}$ is an i.i.d. sample of length $L_k$ from $\mathcal{Q}^{[k]}$. Its empirical distribution is
$$\mathcal{Q}^{[k]}_{L_k} = \frac{1}{L_k} \sum_{l=1}^{L_k} \delta_{q_l^{[k]}},$$
where $\delta_{q_l^{[k]}}(q) = 1$ if $q = q_l^{[k]}$ and 0 otherwise.

In the following, $r$ is the consumption survey number and takes the value 1 or 2. $(C_1^{(r)}, \dots, C_P^{(r)}) = C^{(r)}$ denotes the $P$-dimensional random variable for the relative consumption vector, with distribution $\mathcal{C}^{(r)}$. $(c_{i,1}^{(r)}, \dots, c_{i,P}^{(r)}) = c_i^{(r)}$, $i = 1, \dots, n_r$, is an i.i.d. sample of length $n_r$ from $\mathcal{C}^{(r)}$. Its empirical distribution is
$$\mathcal{C}^{(r)}_{n_r} = \frac{1}{n_r} \sum_{i=1}^{n_r} \delta_{c_i^{(r)}}.$$
Consumptions are relative consumptions in the sense that they are expressed in terms of individual body weight.
This way, the individual exposure can be compared to the safe dose called PTWI, see section 5 for details.
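The probability that the exposure of one individual exceeds a dose $d$ is $\theta_d^{(r)} = \Pr(D^{(r)} > d)$, with $D^{(r)} = \sum_{k=1}^{P} Q^{[k]} C_k^{(r)}$, when using the survey $r$.

Empirical likelihood program. We define the sets of weights $\{p_i^{(1)}\}_{i=1}^{n_1}$, $\{p_j^{(2)}\}_{j=1}^{n_2}$ and $\{w_l^{[k]}\}_{l = 1, \dots, L_k;\; k = 1, \dots, P}$, associated to the 2 samples of consumption and the $P$ samples of contamination. The empirical likelihood is given by
$$\prod_{i=1}^{n_1} p_i^{(1)} \prod_{j=1}^{n_2} p_j^{(2)} \prod_{k=1}^{P} \prod_{l=1}^{L_k} w_l^{[k]},$$
with 2 constraints on consumption weights: for $r = 1, 2$, $\sum_{i=1}^{n_r} p_i^{(r)} = 1$; and $P$ constraints on contamination weights: for $k = 1, \dots, P$, $\sum_{l=1}^{L_k} w_l^{[k]} = 1$.

Model constraints. Let $\tilde{\mathcal{Q}}^{[k]}_{L_k}$ denote a discrete probability measure dominated by $\mathcal{Q}^{[k]}_{L_k}$, that is
$$\tilde{\mathcal{Q}}^{[k]}_{L_k} = \sum_{l=1}^{L_k} w_l^{[k]} \delta_{q_l^{[k]}}, \quad \text{with } w_l^{[k]} > 0 \text{ and } \sum_{l=1}^{L_k} w_l^{[k]} = 1, \quad k = 1, \dots, P.$$
In the same way, $\tilde{\mathcal{C}}^{(1)}_{n_1}$ and $\tilde{\mathcal{C}}^{(2)}_{n_2}$ are discrete probability measures dominated by $\mathcal{C}^{(1)}_{n_1}$ and $\mathcal{C}^{(2)}_{n_2}$, i.e.
$$\tilde{\mathcal{C}}^{(r)}_{n_r} = \sum_{i=1}^{n_r} p_i^{(r)} \delta_{c_i^{(r)}}, \quad \text{with } p_i^{(r)} > 0 \text{ and } \sum_{i=1}^{n_r} p_i^{(r)} = 1, \quad r = 1, 2.$$
$E_{\tilde D_r}$ denotes the expectation under the joint discrete probability distribution $\tilde D_r = \prod_{k=1}^{P} \tilde{\mathcal{Q}}^{[k]}_{L_k} \times \tilde{\mathcal{C}}^{(r)}_{n_r}$, which is the reweighted joint discrete probability distribution of the $P$ contamination samples and the $r$-th consumption survey sample. The model constraints can now be written, for $r = 1, 2$ and for $\theta \in\, ]0, 1[$,
$$E_{\tilde D_r}\Big[ \mathbf{1}\Big\{ \sum_{k=1}^{P} Q^{[k]} C_k^{(r)} > d \Big\} - \theta \Big] = 0. \tag{1}$$
These model constraints on $\theta$ have an explicit but unpleasant expression: for $r = 1, 2$, $\theta = \theta^{(r)}$, with
$$\theta^{(r)} = \sum_{i=1}^{n_r} \sum_{l_1=1}^{L_1} \cdots \sum_{l_P=1}^{L_P} p_i^{(r)} \Big( \prod_{j=1}^{P} w_{l_j}^{[j]} \Big) \mathbf{1}\Big\{ \sum_{k=1}^{P} q_{l_k}^{[k]} c_{i,k}^{(r)} > d \Big\}.$$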
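As an illustration of the explicit expression of $\theta^{(r)}$, the following minimal Python sketch evaluates the reweighted risk index by brute force. It is not taken from the original study: the function name `weighted_risk_index`, the toy consumption and contamination values and the uniform weights are assumptions chosen for readability. The loops enumerate every combination of one individual and one contamination value per product, which makes visible why the number of terms is $n_r \prod_k L_k$.

```python
import itertools


def weighted_risk_index(consumptions, p, contaminations, w, d):
    """Brute-force evaluation of the reweighted risk index theta^(r).

    consumptions   : list of n_r tuples (c_{i,1}, ..., c_{i,P})
    p              : list of n_r consumption weights summing to 1
    contaminations : list of P lists, the k-th holding the L_k values q_l^[k]
    w              : list of P lists of contamination weights, each summing to 1
    d              : tolerable dose
    """
    n_products = len(contaminations)
    index_ranges = [range(len(q_k)) for q_k in contaminations]
    theta = 0.0
    for c_i, p_i in zip(consumptions, p):
        # One contamination index per product: prod_k L_k tuples per individual.
        for idx in itertools.product(*index_ranges):
            exposure = sum(contaminations[k][idx[k]] * c_i[k]
                           for k in range(n_products))
            if exposure > d:
                weight = p_i
                for k in range(n_products):
                    weight *= w[k][idx[k]]
                theta += weight
    return theta


# Hypothetical toy data: P = 2 products, n_r = 3 individuals, L_1 = L_2 = 2.
toy_consumptions = [(1.0, 0.5), (2.0, 0.0), (0.0, 4.0)]
toy_contaminations = [[0.5, 1.0], [0.2, 0.6]]
uniform_p = [1.0 / 3.0] * 3
uniform_w = [[0.5, 0.5], [0.5, 0.5]]

theta_toy = weighted_risk_index(toy_consumptions, uniform_p,
                                toy_contaminations, uniform_w, d=1.6)
print(theta_toy)
```

With uniform weights the function reduces to the plug-in estimator, i.e. the proportion of combinations whose exposure exceeds $d$ (4 of the 12 combinations in the toy data).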
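2. Linearization and approximated empirical likelihood. The preceding empirical likelihood program is difficult to solve, both from theoretical and practical points of view, because of the highly nonlinear form of the model constraints. The same problem already appears when studying the asymptotic behavior of the plug-in estimator of $\theta_d$ with only one consumption survey, see [3]. One solution is to see $\hat\theta$ as a generalized U-statistic and to linearize it using Hoeffding decomposition, see Lee's book, [2]. More generally, a method is to linearize the constraints to solve the optimization problem. This linearization is asymptotically valid as soon as the parameter of interest is Hadamard differentiable, see [] for details.

Linearization is made easier by considering the influence function of
$$\Psi(D) = E_D\Big[ \mathbf{1}\Big\{ \sum_{k=1}^{P} Q^{[k]} C_k^{(r)} > d \Big\} - \theta \Big],$$
where $D$ is the joint distribution of contaminations and consumptions. The influence function of $\Psi(D)$ at point $(q_1, \dots, q_P, c)$ is, for $r = 1, 2$:
$$\Psi^{(1)}_D(q_1, \dots, q_P, c) = E\Big[ \mathbf{1}\Big\{ \sum_{k=1}^{P} Q^{[k]} C_k^{(r)} > d \Big\} - \theta \,\Big|\, C^{(r)} = c \Big] + \sum_{m=1}^{P} E\Big[ \mathbf{1}\Big\{ \sum_{k=1}^{P} Q^{[k]} C_k^{(r)} > d \Big\} - \theta \,\Big|\, Q^{[m]} = q_m \Big].$$
This functional of $D$ can be estimated by its empirical counterpart $\Psi^{(1)}_{\hat D}$, where $\hat D$ denotes the empirical version of $D$. $\Psi^{(1)}_{\hat D}$ can be written explicitly:
$$\Psi^{(1)}_{\hat D}[q_1, \dots, q_P, c] = U_0(c) + U_1^{(r)}(q_1) + \dots + U_m^{(r)}(q_m) + \dots + U_P^{(r)}(q_P), \tag{2}$$
where
$$U_0(c) = \frac{1}{\prod_{k=1}^{P} L_k} \sum_{l_1=1}^{L_1} \cdots \sum_{l_P=1}^{L_P} \Big( \mathbf{1}\Big\{ \sum_{k=1}^{P} q_{l_k}^{[k]} c_k > d \Big\} - \theta \Big) \tag{3}$$
and, for $m = 1, \dots, P$ and $r = 1, 2$,
$$U_m^{(r)}(q_m) = \frac{1}{n_r \prod_{k \neq m} L_k} \sum_{\Lambda_r^{[-m]}} \Big( \mathbf{1}\Big\{ q_m c_{i,m}^{(r)} + \sum_{k \neq m} q_{l_k}^{[k]} c_{i,k}^{(r)} > d \Big\} - \theta \Big), \tag{4}$$
where the sum is taken over the set $\Lambda_r^{[-m]}$ of all indexes $(i, l_1, \dots, l_{m-1}, l_{m+1}, \dots, l_P)$, i.e. fixing the contamination of food $m$.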
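For a single product ($P = 1$), the two components of the estimated influence function in (2)-(4) reduce to simple averages of indicators. The sketch below is an illustration under that assumption, not the authors' implementation; the helper names `u0` and `u1` and the toy samples are hypothetical. It also checks numerically that, at the plug-in estimate, both components average to zero over their respective samples.

```python
def u0(c, q_sample, d, theta):
    """U_0(c): share of contamination values q_l with q_l * c > d, minus theta."""
    return sum(1.0 for q in q_sample if q * c > d) / len(q_sample) - theta


def u1(q, c_sample, d, theta):
    """U_1^(r)(q): share of consumptions c_i with q * c_i > d, minus theta."""
    return sum(1.0 for c in c_sample if q * c > d) / len(c_sample) - theta


# Hypothetical toy samples (P = 1).
q_sample = [0.1, 0.4, 0.9, 1.5]        # contamination values
c_sample = [0.5, 1.0, 2.0, 3.0, 5.0]   # relative consumptions
d = 1.6

# Plug-in estimate: proportion of all (q_l, c_i) pairs with exposure above d.
n_pairs = len(q_sample) * len(c_sample)
theta_hat = sum(1.0 for q in q_sample for c in c_sample if q * c > d) / n_pairs

# Averages of the two linearized components at theta = theta_hat.
mean_u0 = sum(u0(c, q_sample, d, theta_hat) for c in c_sample) / len(c_sample)
mean_u1 = sum(u1(q, c_sample, d, theta_hat) for q in q_sample) / len(q_sample)
print(theta_hat, mean_u0, mean_u1)
```

Both averages vanish at the plug-in estimate because each of them equals the double average of the indicator minus $\theta$; this is the sense in which the linear constraints are centered.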
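$U_0(c^{(r)})$ and the $\big(U_m^{(r)}(q^{[m]})\big)_{m=1}^{P}$ are generalized U-statistics with kernel $\mathbf{1}\big\{\sum_{k=1}^{P} q^{[k]} c_k > d\big\}$ and degree $(1, \dots, 1) \in \mathbb{R}^P$, see [2]. For simplicity, the dependence in $n_r, L_1, \dots, L_P$ is implicit in the notation.

An approximate version of the model constraints can now be written: for $r = 1, 2$,
$$E_{\tilde D_r}\Big[ \Psi^{(1)}_{\hat D}\big(Q^{[1]}, \dots, Q^{[P]}, C^{(r)}\big) \Big] = 0,$$
that is
$$\sum_{i=1}^{n_1} p_i^{(1)} U_0(c_i^{(1)}) + \sum_{k=1}^{P} \sum_{l=1}^{L_k} w_l^{[k]} U_k^{(1)}(q_l^{[k]}) = 0,$$
$$\sum_{j=1}^{n_2} p_j^{(2)} U_0(c_j^{(2)}) + \sum_{k=1}^{P} \sum_{l=1}^{L_k} w_l^{[k]} U_k^{(2)}(q_l^{[k]}) = 0.$$

The following theorem establishes the asymptotic convergence of the approximate version of the empirical likelihood when only one product is considered ($P = 1$, $L_1 = L$). The result remains true in the general case, $P > 1$, but needs some refinements to be tractable in practice, as detailed in the next section.

Theorem 1. Assume that we have a contamination data set $(q_l)_{l \le L}$ i.i.d. and 2 independent consumption samples $(c_i^{(1)})_{i \le n_1}$ i.i.d. and $(c_j^{(2)})_{j \le n_2}$ i.i.d. with common risk index $\theta_d^{(1)} = \theta_d^{(2)} = \theta_d \in \mathbb{R}$. Assume that for $r = 1, 2$, $U_0(c^{(r)})$ have finite variances and that $\big(U_1^{(1)}(q), U_1^{(2)}(q)\big)$ has a finite invertible variance-covariance matrix. Assume also that $n_1$, $n_2$ and $L$ go to infinity and that their ratios are bounded. Then the empirical likelihood program involves solving the dual program with log-likelihood function $l_{n_1, n_2, L}(\theta_d)$ given by
$$\sup_{\substack{\lambda_1, \lambda_2, \gamma_1, \gamma_2, \gamma_3 \in \mathbb{R} \\ n_1 + n_2 + L - \gamma_1 - \gamma_2 - \gamma_3 = 0}} \Big\{ \sum_{i=1}^{n_1} \ln\big\{ \gamma_1 + \lambda_1 U_0(c_i^{(1)}) \big\} + \sum_{j=1}^{n_2} \ln\big\{ \gamma_2 + \lambda_2 U_0(c_j^{(2)}) \big\} + \sum_{l=1}^{L} \ln\big\{ \gamma_3 + \lambda_1 U_1^{(1)}(q_l) + \lambda_2 U_1^{(2)}(q_l) \big\} \Big\}. \tag{5}$$
Define the maximum likelihood estimator associated to this quantity,
$$\hat\theta = \arg\sup_{\theta}\, l_{n_1, n_2, L}(\theta).$$
Then, the log-likelihood ratio satisfies
$$r_{n_1, n_2, L}(\theta_d) = 2\big[ l_{n_1, n_2, L}(\hat\theta) - l_{n_1, n_2, L}(\theta_d) \big] \longrightarrow \chi^2(1).$$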
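To make the $\chi^2(1)$ calibration of Theorem 1 concrete, here is a minimal sketch of the classical one-sample empirical likelihood ratio for a mean, in the spirit of Owen's book [27]. It is a deliberately simplified analogue and not the three-sample program (5): the function name `el_ratio_statistic`, the bisection solver, the grid scan and the toy data are all assumptions made for illustration.

```python
import math


def el_ratio_statistic(x, mu, n_iter=200):
    """-2 log empirical likelihood ratio for H0: E[X] = mu (one-sample case).

    The optimal weights are p_i = 1 / (n * (1 + lam * z_i)) with z_i = x_i - mu,
    where lam solves sum_i z_i / (1 + lam * z_i) = 0.
    """
    z = [xi - mu for xi in x]
    if min(z) >= 0.0 or max(z) <= 0.0:
        return math.inf  # mu is not strictly inside the range of the data
    # Positivity of the weights restricts lam to (-1/max(z), -1/min(z)).
    lo, hi = -1.0 / max(z), -1.0 / min(z)
    for _ in range(n_iter):
        lam = 0.5 * (lo + hi)
        score = sum(zi / (1.0 + lam * zi) for zi in z)
        if score > 0.0:   # the score is decreasing in lam: root is to the right
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)


CHI2_1_95 = 3.841  # approximate 95% quantile of the chi-square(1) distribution

toy_sample = [1.2, 0.7, 2.5, 1.9, 0.4, 1.1, 3.0, 0.9]  # hypothetical data
sample_mean = sum(toy_sample) / len(toy_sample)

# Scan a grid of candidate means and keep those not rejected at the 95% level.
grid = [0.5 + 0.01 * j for j in range(251)]
accepted = [mu for mu in grid
            if el_ratio_statistic(toy_sample, mu) <= CHI2_1_95]
print(sample_mean, min(accepted), max(accepted))
```

The accepted grid points approximate the confidence set obtained by comparing the likelihood ratio with a chi-square quantile. A Newton-Raphson solver would replace the bisection in a serious implementation; bisection is used here only because it is short and safe.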
The proof of Theorem 1 is given in Appendix A.1. This theorem yields an $\alpha$-level confidence interval for $\theta_d$, namely
$$\big\{ \theta : r_{n_1, n_2, L}(\theta) \le \chi^2_{\alpha}(1) \big\},$$
where $\chi^2_{\alpha}(1)$ denotes the level-$\alpha$ quantile of the $\chi^2(1)$ distribution.

Remark 1. From a practical point of view, the linearization of the constraints allows for a good convergence of the optimization algorithm, for instance by using a gradient-based method such as Newton-Raphson. The algorithmic aspects of empirical likelihood are discussed in chapter 2 in [27].

Remark 2. These model constraints can be augmented by some estimating equations that would allow one to incorporate some knowledge arising from other data or from the model under consideration. For example, the national census provides the marginal distribution of the population according to different criteria (age, sex, region, profession), which could be integrated via estimating equations of the form
$$\sum_{i=1}^{n_1} p_i^{(1)} Z_i^{(1)} = z_0, \qquad \sum_{j=1}^{n_2} p_j^{(2)} Z_j^{(2)} = z_0, \tag{6}$$
where $Z_i^{(1)}$ and $Z_j^{(2)}$ are vectors describing the belonging to specified sociodemographic categories in surveys 1 and 2 and $z_0$ is the vector of the corresponding percentages of these categories based on the national census. The convergence results will not be affected by the introduction of such sociodemographic criteria, see [30] and [27], chapter 3, page 5.

3. Extension to the case of several products by incomplete U-statistics. For $P > 1$, the computation of the different U-statistics defined in (3) and (4) becomes too heavy when the data sets are large (if $L_k$ and/or $n_r$ are large). Indeed, one needs to compute at least $n_r \prod_{k=1}^{P} L_k$ terms. To solve this problem, we proceed to an approximation by replacing the complete U-statistics by incomplete U-statistics. The properties of incomplete U-statistics are well described in [4] or [2]. Let us define the incomplete U-statistics associated to equations (3) and (4). For simplicity, the sizes of the incomplete U-statistics are fixed to the same constant $B$, which should be chosen greater than the size of the different data sets involved. $B$ is chosen such that $n_1 + n_2 + \sum_{k=1}^{P} L_k = o(B)$ in order that the difference between the complete and the incomplete versions is of order $o(B^{-1/2})$, [2]. For $r = 1$ or 2, the incomplete version of equation (3)
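is given by
$$U_{B,0}^{(r)}(c^{(r)}) = \frac{1}{B} \sum_{(l_1, \dots, l_P) \in \mathcal{B}_0^{(r)}} \Big( \mathbf{1}\Big\{ \sum_{k=1}^{P} q_{l_k}^{[k]} c_k^{(r)} > d \Big\} - \theta \Big), \tag{7}$$
where the sum is taken over the set $\mathcal{B}_0^{(r)}$ of indexes $(l_1, \dots, l_P)$, randomly chosen with replacement from $\prod_{k=1}^{P} \{1, \dots, L_k\}$, with size $B$.

For $m = 1, \dots, P$, the incomplete version of (4) is given by
$$U_{B,m}^{(r)}(q_m) = \frac{1}{B} \sum_{\mathcal{B}_m^{(r)}} \Big( \mathbf{1}\Big\{ \sum_{k=1}^{m-1} q_{l_k}^{[k]} c_{i,k}^{(r)} + q_m c_{i,m}^{(r)} + \sum_{k=m+1}^{P} q_{l_k}^{[k]} c_{i,k}^{(r)} > d \Big\} - \theta \Big), \tag{8}$$
where the sum is taken over the set $\mathcal{B}_m^{(r)}$ of indexes $(i, l_1, \dots, l_{m-1}, l_{m+1}, \dots, l_P)$, randomly chosen with replacement from $\prod_{k \neq m} \{1, \dots, L_k\} \times \{1, \dots, n_r\}$, with size $B$.

The approximate influence function is now given by
$$\Psi_B\big(q_1, \dots, q_P, c^{(r)}\big) = U_{B,0}^{(r)}(c^{(r)}) + U_{B,1}^{(r)}(q_1) + \dots + U_{B,m}^{(r)}(q_m) + \dots + U_{B,P}^{(r)}(q_P).$$
The model constraints can then be written as follows:
$$\sum_{i=1}^{n_1} p_i^{(1)} U_{B,0}^{(1)}(c_i^{(1)}) + \sum_{k=1}^{P} \sum_{l=1}^{L_k} w_l^{[k]} U_{B,k}^{(1)}(q_l^{[k]}) = 0, \qquad \sum_{j=1}^{n_2} p_j^{(2)} U_{B,0}^{(2)}(c_j^{(2)}) + \sum_{k=1}^{P} \sum_{l=1}^{L_k} w_l^{[k]} U_{B,k}^{(2)}(q_l^{[k]}) = 0. \tag{9}$$

Corollary 1. Assume that $n_1$, $n_2$ and $(L_k)_{1 \le k \le P}$ go to infinity and that their ratios are bounded. Take $B$ such that $n_1 + n_2 + \sum_{k=1}^{P} L_k = o(B)$. Then, under the assumptions of Theorem 1, the likelihood ratio for $P$ products, $r_{n_1, n_2, L_1, \dots, L_P}(\theta_d)$, is asymptotically $\chi^2(1)$:
$$r_{n_1, n_2, L_1, \dots, L_P}(\theta_d) \longrightarrow \chi^2(1).$$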
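The replacement of a complete U-statistic by an incomplete one can be sketched as follows for the term $U_0$ with $P = 2$ products. This is only an illustration: the function names, the Gamma parameters used to generate hypothetical contamination samples, the consumption vector and the number of draws are assumptions, not values from the study. The incomplete version averages the kernel over index tuples drawn with replacement instead of enumerating all $\prod_k L_k$ tuples.

```python
import itertools
import random


def exceeds(q_values, c, d):
    """Kernel 1{sum_k q_k * c_k > d} for one tuple of contamination values."""
    return 1.0 if sum(q * c_k for q, c_k in zip(q_values, c)) > d else 0.0


def u0_complete(c, contaminations, d, theta):
    """Complete version of U_0(c): average over all prod_k L_k index tuples."""
    total, count = 0.0, 0
    for q_values in itertools.product(*contaminations):
        total += exceeds(q_values, c, d)
        count += 1
    return total / count - theta


def u0_incomplete(c, contaminations, d, theta, n_draws, rng):
    """Incomplete version: average over n_draws tuples drawn with replacement."""
    total = 0.0
    for _ in range(n_draws):
        # Drawing each index uniformly and independently is the same as drawing
        # a tuple uniformly from the product set {1..L_1} x ... x {1..L_P}.
        q_values = [rng.choice(q_k) for q_k in contaminations]
        total += exceeds(q_values, c, d)
    return total / n_draws - theta


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
# Hypothetical contamination samples for P = 2 products (Gamma draws).
contaminations = [[rng.gammavariate(2.0, 0.1) for _ in range(200)],
                  [rng.gammavariate(1.5, 0.05) for _ in range(150)]]
c = (6.0, 3.0)          # hypothetical relative consumptions of one individual
d, theta = 1.6, 0.0

complete = u0_complete(c, contaminations, d, theta)
incomplete = u0_incomplete(c, contaminations, d, theta, n_draws=20000, rng=rng)
print(complete, incomplete)
```

In this toy setting the complete enumeration is still affordable, which allows a direct comparison; the incomplete version becomes useful when the product of the sample sizes is too large to enumerate.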
See Appendix A.2 for the proof of Corollary 1. Note in particular that $B$, the size of the incomplete U-statistics, must go to infinity quicker than $\max\{n_1, n_2, L_1, \dots, L_P\}$. As before, this yields an $\alpha$-level confidence interval for $\theta_d$, namely
$$\big\{ \theta : r_{n_1, n_2, L_1, \dots, L_P}(\theta) \le \chi^2_{\alpha}(1) \big\}.$$

4. A faster alternative: Euclidean likelihood. The empirical likelihood program as written in this paper consists in minimizing the Kullback-Leibler distance between a multinomial distribution on the sample ($\tilde D_1$, $\tilde D_2$) and the observed data ($\hat D_1$, $\hat D_2$). Following the ideas of [2], we replace the Kullback-Leibler distance by the Euclidean distance (also called the $\chi^2$ distance). When using the Euclidean distance, the objective function $l_{n_1, n_2, L_1, \dots, L_P}(\theta)$ becomes
$$\min_{\{p_i^{(1)},\, p_j^{(2)},\, w_l^{[k]},\ k = 1, \dots, P\}} \Big\{ \frac{1}{2} \sum_{i=1}^{n_1} \big( n_1 p_i^{(1)} - 1 \big)^2 + \frac{1}{2} \sum_{j=1}^{n_2} \big( n_2 p_j^{(2)} - 1 \big)^2 + \frac{1}{2} \sum_{k=1}^{P} \sum_{l=1}^{L_k} \big( L_k w_l^{[k]} - 1 \big)^2 \Big\}, \tag{10}$$
under the approximated model constraints (9) and the constraint that each set of weights sums to 1. We get a result equivalent to Corollary 1:

Corollary 2. Under the assumptions of Corollary 1, the statistic
$$r_{n_1, n_2, L_1, \dots, L_P}(\theta_d) = l_{n_1, n_2, L_1, \dots, L_P}(\theta_d) - \inf_{\theta}\, l_{n_1, n_2, L_1, \dots, L_P}(\theta)$$
is asymptotically P 2 2 χ 2.

The proof of this result is given in Appendix A.3. The choice of this distance is closely related to the Generalized Method of Moments (GMM), see [5, 24] for details on the links between empirical likelihood and GMM. Instead of logarithms, the optimization program (10) only involves quadratic terms and is then much easier to solve, as shown in Appendix A.3. This considerably decreases the computation time, making exploration easier and allowing different constraints and models to be tested. A specificity of the Euclidean distance is that the weights $p_i^{(1)}$, $p_j^{(2)}$ and $w_l^{[k]}$ are not forced to be positive. However, these weights are asymptotically nonnegative with probability one, see [5]. The gain in computation time is counterbalanced by a loss in adaptability to the data and to the constraints. Numerical results will be given in
the applications for both the Kullback-Leibler and the Euclidean distances. Practical use of these methods shows that the Euclidean distance can be used for initial exploration (looking for the most useful constraints, for example) and to give first-step estimators. Empirical likelihood can then be used at the final stage, to get precise confidence regions and estimators. The first-step estimators given by Euclidean likelihood can be used as starting values for the empirical likelihood optimization. Section 5 illustrates the interest of this strategy with large data sets and a complicated model.

5. Application: Methyl mercury Risk Assessment. In this section, the proposed methodologies based on empirical and Euclidean likelihood are applied to methyl mercury risk assessment in the French population. Indeed, at high concentrations, methyl mercury, a well-known environmental toxicant found in the aquatic environment, can cause lesions of the nervous system and serious mental deficiencies in infants whose mothers were exposed during pregnancy [35]. There are also some concerns that methyl mercury may give rise to retarded development or other neurological effects at lower levels of exposure, which are consistent with standard patterns of fish consumption [0, 4, 23]. The latest epidemiological results compiled by the Joint Expert Committee on Food Additives and Contaminants [2] yield a safe dose called Provisional Tolerable Weekly Intake (PTWI) for methyl mercury of 1.6 µg per week per kg of body weight. Methyl mercury is mainly found in fish and other sea foods. Other foods are therefore excluded when estimating human exposure in this paper.

In France, two main data sets are available. The SECODIP panel, collecting long-term household purchases from 1989 to nowadays, allows the estimation of the chronic probability to be over the PTWI. Unfortunately, the data only record household purchases. The INCA survey records detailed individual food consumption but only on a seven-day basis. We present these data sets together with the contamination data in section 5.1. Then, a validation on simulated data is proposed in section 5.2 following the main features of the actual data sets.
Results are shown in sections 5.3 and 5.4, considering one single food group ($P = 1$) or two food groups ($P = 2$) respectively.

5.1. Data description and specific features.

5.1.1. Contamination data. Food contamination data concerning fish and other seafoods available on the French market were generated by accredited laboratories from official national surveys performed between 1994 and 2003 by the French Ministry of Agriculture and Fisheries [22] and the French Research Institute for Exploitation of the Sea [7]. These $L = 2832$ analytical
data are expressed in terms of total mercury in mg/kg of fresh weight. Considering two food groups (Fish on one hand and Mollusks and shellfish on the other hand), the data set sizes are $L_1 = 1541$ and $L_2 = 1291$. To extrapolate methyl mercury levels (the dangerous form to human health) from the mercury content, conversion factors have been applied to the analytical data: 0.84 for fish, 0.43 for mollusks and 0.36 for shellfish [7, 8]. Adhering to international recommendations [3], the 7% of left censored values, i.e. contamination levels below some detection or quantification limit, were replaced with half the detection or quantification limit. Refer to [3, 33] for further discussions.

5.1.2. Food consumption data.

The INCA survey. The French INCA survey ($r = 1$), carried out by [9], records $n_1 = 3003$ individual consumptions during one week. The survey is composed of 2 samples: 1985 adults aged 15 years or over and 1018 children aged between 3 and 14 years. The data were obtained during an 11-month period from consumption logs completed by the participants for a period of 7 consecutive days. National representativeness of each subsample (adults, children) was ensured by stratified sampling (region of residence, town size) and by the application of quotas (age, sex, individual professional/cultural category, household size). From this survey, 92 food items were selected with respect to fish or seafood. This includes fish, fish farming, shellfish, mollusks, mixed dishes, soups and miscellaneous fishery products. Since the body weight of all individuals is available, relative consumptions are computed by dividing the amount consumed during the week by the body weight. The proportion of children (34%) in this survey is high compared to the national census (15%), [8]: it is usually recommended to work on adults and children samples separately. In order to use the two subsamples, we correct this selection bias by adding a margin constraint on the proportion of children aged between 3 and 14 years, as proposed in Section 2 (Remark 2). The additional constraint is
$$E_{\tilde{\mathcal{C}}^{(1)}_{n_1}}\big[ \mathbf{1}\{3 \le Z_i \le 14\} \big] = 0.15,$$
where $Z_i$ is the age of individual $i$ in the survey $r = 1$ (INCA).
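This modifies the form of the dual log-likelihood (5) in the part concerning the first survey. It becomes
$$\sum_{i=1}^{n_1} \ln\big\{ \gamma_1 + \lambda_1 U_0(c_i^{(1)}) + \lambda_{age} \big( \mathbf{1}\{3 \le Z_i \le 14\} - 0.15 \big) \big\},$$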
where $\lambda_{age}$ is the Kuhn and Tucker coefficient associated to the age constraint.

SECODIP. The SECODIP panel for fish, from TNS SECODIP (http://www.secodip.fr), is composed of 32 households surveyed over one year (the 1999 year). In this panel, 24 food groups containing fish or seafoods are retained. Individual consumption is created by imputing to each individual the household's purchase divided by the number of persons in the household, which is a current practice in food risk assessment based on household acquisition data. We also divide this result by 52 (number of weeks in a year) and 60 (mean body weight, in kg). This results in $n_2 = 9588$ individual relative weekly consumptions.

Table 1. Basic percentile 95% confidence intervals for MeHg risk (expressed in %)

                      INCA                  SECODIP
One single product    3.47 [3.06 ; 3.86]    2.24 [.9 ; 2.57]
Two products          5.68 [4.85 ; 6.40]    2.0 [.66 ; 2.55]

Differences between the two surveys. Some unpublished preliminary studies and the basic confidence interval computations of Table 1 show that the use of the INCA or SECODIP survey for the exposure estimation to methyl mercury gives different results. Those results are consistent with the literature showing that survey durations influence the percentage of consumers (due to infrequency of purchase) and the level of food intakes among consumers only [20]. Numerous methods have been proposed to extrapolate from short-term to long-term intake based on repeated short-term measures in the field of nutrition, see [6, 28]. These works are based on INCA type data and do not use the available information from SECODIP type data. However, the differences between the two surveys have many explanations: the SECODIP panel is a Household Budget Survey, although [32] found that, in general, results from Household Budget Surveys in Canada and Europe agree well with individual dietary data; the SECODIP panel does not account for outside consumptions: members of the panel do not record purchases for outdoor consumptions; the INCA survey is realized in a public health perspective.
People could modify their consumption behavior during the survey week in favor of foods they assume to be healthy, such as fish. All these arguments explain the higher fish consumption in the INCA survey. We choose to introduce a coefficient $\alpha$ to scale the SECODIP consumption
to account for all these facts, introducing an additional model constraint $E(C^{(1)}) = \alpha_0 E(C^{(2)})$. The coefficient $\alpha_0$ is estimated together with the risk index $\theta_d$, leading to confidence regions for $(\theta_d, \alpha_0)$ calibrated by a $\chi^2(2)$ distribution, i.e. $r_{n_1, n_2, L}(\theta_d, \alpha_0) \longrightarrow \chi^2(2)$. We then optimize on $\alpha$ for each $\theta$ to get a profiled likelihood on $\theta$.

5.2. Validation on simulated data. In order to validate the proposed methodology, coverage probabilities of the confidence intervals resulting from Corollaries 1 and 2 are assessed by simulation of known contamination and consumption distributions as in [3] and [33]. The algorithm is as follows:

[Step 1] Define some true distributions of consumption and contaminations and approximate by a Monte Carlo simulation the parameter of interest $\theta_d$.
[Step 2] Reproduce the observed sampling scheme from the true distributions defined in Step 1 and obtain the CI from Corollaries 1 and 2.
Repeat Step 2 $S$ times and check whether the true value of $\theta_d$ from Step 1 belongs or not to the CIs of Step 2.

For [Step 1], we choose a multivariate log normal distribution for consumption and Gamma distributions for the $P$ contamination distributions (see footnote 2). A Monte Carlo simulation of size 1,000,000 yields a true value of $\theta_{d=1.6}$ = ???%. The absolute error is of order ????%.

In [Step 2], two samples of consumption data are randomly selected from the multivariate log normal distribution determined in [Step 1], one with size $n_1 = 3003$, the other with size $n_2 = 9588$. Then the censorship mechanism is reproduced: the data are first diminished by a random factor with mean 20% to account for consumption outside the home (see footnotes 3 and 4). Then [Step 2] is repeated $S = 500$ times.

Results. For Corollary 1, we obtain a coverage probability of [hopefully not far from 95%] whereas the methodology from Corollary 2 yields a coverage probability of [same!!!] %.

5.3. Results when considering one global seafood group. We first merge all the seafoods into a single group. Any contamination data is attributed to the total individual consumption of seafoods. Calculations can therefore be performed using the complete U-statistics of degree $(1, 1)$.
Footnote 2: Their parameters were chosen to fit as much as possible the INCA dataset and the available contamination data.
Footnote 3: The proportion of the food eaten at home is distributed according to a Beta distribution with mean 0.8 and variance 0.8 0.8
Footnote 4: The only features that are not reproduced are the high proportion of children in sample 1 and the aggregation/disaggregation of consumptions within households.
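[Step 1] of the validation algorithm of Section 5.2 can be sketched as follows for a single product. This is a simplified illustration only: a univariate log-normal consumption replaces the multivariate one, and the function name `monte_carlo_risk`, the distribution parameters, the seed and the number of draws are placeholders, not the values fitted to the INCA and contamination data. The binomial standard error returned alongside the estimate gives an idea of the Monte Carlo error.

```python
import random


def monte_carlo_risk(d, n_draws, rng, log_mean=0.0, log_sd=1.0,
                     gamma_shape=2.0, gamma_scale=0.1):
    """Monte Carlo approximation of theta_d = P(Q * C > d) for one product.

    C is log-normal and Q is Gamma; all four distribution parameters are
    placeholders, not the values fitted by the authors.
    """
    exceed = 0
    for _ in range(n_draws):
        consumption = rng.lognormvariate(log_mean, log_sd)
        contamination = rng.gammavariate(gamma_shape, gamma_scale)
        if contamination * consumption > d:
            exceed += 1
    theta = exceed / n_draws
    # Binomial standard error of the Monte Carlo proportion.
    std_error = (theta * (1.0 - theta) / n_draws) ** 0.5
    return theta, std_error


theta_true, mc_error = monte_carlo_risk(d=1.6, n_draws=100000,
                                        rng=random.Random(2024))
print(theta_true, mc_error)
```

Increasing `n_draws` reduces the standard error at the rate $1/\sqrt{n}$, which is why a large simulation size is used to fix the reference value of $\theta_d$ before the coverage experiment.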
Fig 1. Empirical likelihood for one single food group (solid, with age constraint; dotted, without age constraint). (a) Empirical likelihood confidence region: horizontal axis is $\theta_{1.6}$, vertical axis is $\alpha$. (b) Empirical likelihood ratio profile: horizontal axis is $\theta_{1.6}$, vertical axis is $r_{n_1, n_2, L}(\theta_{1.6})$.

Figure 1(a) shows the two 95% confidence regions for the couple of parameters $(\theta_{1.6}, \alpha)$. We compare the results obtained with and without the constraint on the proportion of children. The unconstrained confidence region for $(\theta_{1.6}, \alpha)$ is marked by a dotted line, the solid line corresponding to the constrained confidence region. We can see that the constraint makes the 2 surveys closer ($\alpha$ is smaller, the confidence region is translated to the bottom) and decreases the risk ($\theta_{1.6}$ is smaller, the confidence region is translated to the left). Children are known to be a more sensitive group to food exposure because of their higher relative consumptions: they eat more compared to their body weight than adults. When adding the age constraint, the discrete probability measure related to the INCA survey, the $(p_i^{(1)})_{i \le n_1}$, is modified so that children become less influential, which explains the risk reduction and the decrease of $\alpha$.

Figure 1(b) shows the profiles of the empirical likelihood ratios $r_{n_1, n_2, L}(\theta_{1.6})$. We get 2 profiles; the dotted line corresponds to the unconstrained case. The horizontal line gives the 95% level of the chi-square distribution, $\chi^2_{95\%}(1)$, limiting the confidence interval for the risk index. The 95% confidence interval for $\theta_{1.6}$ constraining the INCA children proportion is [3.08% ; 3.47%] and the risk index estimator is $\hat\theta_{1.6}$ = 3.27%. The optimal scaling parameter is $\hat\alpha$ = .3. This is an estimation of the factor to convert individual food purchases of seafoods into individual consumptions of seafoods.

When the constraint on age is ignored, the estimator of $\theta_{1.6}$ is the arithmetic mean of the INCA survey and the $\alpha$ scaled SECODIP data (marked by the
vertical dotted black line). Indeed, the best correction $\alpha$ is when both means are equal and then the maximum of the likelihood for $\theta_{1.6}$ is this common value. The SECODIP data then have no effect on the value of the estimator but have an effect on the confidence interval: uncertainty is reduced thanks to the large sample of consumption values provided by the SECODIP data.

Euclidean likelihood: The Euclidean distance is not as sharp as the Kullback discrepancy, which is used in the empirical likelihood case. Moreover, the constraint on age being linear and only on the smaller consumption sample (INCA), the associated term in the Euclidean likelihood is small compared with the risk index term, which is nonlinear and concerns both consumption samples (INCA and SECODIP). The effect of the constraint is thus highly reduced: confidence regions (as shown in Figure 2(a)) as well as profiles (as shown in Figure 2(b)) are almost identical. They give results quite close to what is obtained with the constrained empirical likelihood.

Fig 2. Euclidean likelihood for one product (solid, with age constraint; dotted, without age constraint). (a) Euclidean likelihood confidence region: horizontal axis is $\theta_{1.6}$, vertical axis is $\alpha$. (b) Euclidean likelihood ratio profile: horizontal axis is $\theta_{1.6}$, vertical axis is $r_{n_1, n_2, L}(\theta_{1.6})$.

5.4. Results when considering two products. Seafoods are now clustered into two groups: the first one is Fish and the second one is Mollusks and shellfish. Recall that $L_1 = 1541$ and $L_2 = 1291$. Calculations are done using the incomplete U-statistics defined in equations (7) and (8) with a size $B = 10000$. $\alpha$ is here 2-dimensional. The confidence interval for the risk index is [5.20% ; 5.64%] and the estimator is $\hat\theta_{1.6}$ = 5.43%. The correction factors on SECODIP data are $\hat\alpha_1$ = .8 and $\hat\alpha_2$ = .65. Figure 3 shows the profile of the empirical likelihood ratio. The probability calculated when seafoods are considered as a single group is
smaller than when seafoods are gathered into two groups, see also [34]. Consequently, in order to improve this risk assessment, it would be interesting to go deeper in the food nomenclature of both surveys to create more groups, but this is not possible with the available SECODIP food nomenclature.

Fig 3. Empirical likelihood ratio profile for two products with age constraint (horizontal axis is $\theta_{1.6}$ and vertical axis is $r_{n_1, n_2, L_1, L_2}(\theta_{1.6})$).

6. Discussion. This paper shows how the empirical likelihood method can be generalized to combine different sources of data, with particular focus on food risk assessment. Yet the methodology is general: if a parameter of interest can be written as a Hadamard differentiable functional of the distributions of random variables for which observations are available, then the Approximate Empirical Likelihood Problem has a solution and asymptotic convergence of the likelihood ratio to a chi-square distribution was shown. Moreover, when the parameter of interest can be written as a U-statistic, incomplete U-statistics can further be used to compute the associated confidence interval. We demonstrated on simulated data the efficiency of our methodology as far as a food risk index is concerned.

Natural extensions could consider more consumption surveys or several contamination data sets, multiplying the number of model constraints and eventually the number of estimating equations referring to side information. The more complicated the Empirical Likelihood Problem gets, the more useful the Euclidean likelihood is, at least to find first-step estimators.

A technical improvement of the present model would consist in using a statistical method to disaggregate household purchases into individual at-home consumptions and correct for the difference between at-home and total food consumption. [6] proposes a regression based method for the decomposition of household nutritional intakes into individual intakes accounting for outside consumptions, see also
[? ]. In an empirical likelihood program, this kind of method would require the estimation of a great number of parameters, which may cause optimization problems. This kind of methodology could however avoid the use of an ad-hoc scaling parameter $\alpha$ between the SECODIP and INCA panels. We plan to explore this issue in future works.

From an applied point of view, we obtain with different methods combining the available information that the probability to exceed the PTWI is of the order of 5%. This can be considered as an important risk at a population scale. It also motivates some further works to characterize the at-risk population.

Acknowledgments. We thank Christine Boizot (INRA-CORELA) for the support she has provided in handling the SECODIP data as well as Jean-Charles Leblanc (AFSSA) for the contamination data. Many thanks also to Patrice Bertail (CREST-LS) for his careful reading of the manuscript. All errors remain ours.

APPENDIX A: PROOFS

A.1. Proof of Theorem 1. First, we consider the empirical likelihood optimization program for two consumption surveys and one food product. Recall that $U_0(c)$ and $U_1^{(r)}(q)$ depend on $\theta$:
$$U_0(c) = \frac{1}{L} \sum_{l=1}^{L} \mathbf{1}\{q_l c > d\} - \theta \quad \text{and} \quad U_1^{(r)}(q) = \frac{1}{n_r} \sum_{i=1}^{n_r} \mathbf{1}\{q c_i^{(r)} > d\} - \theta, \quad \text{for } r = 1, 2.$$
The program ELP is to maximize $\prod_{i=1}^{n_1} p_i^{(1)} \prod_{j=1}^{n_2} p_j^{(2)} \prod_{l=1}^{L} w_l$ under the constraints: $\sum_{l=1}^{L} w_l = 1$, and for $r = 1, 2$, $\sum_{i=1}^{n_r} p_i^{(r)} = 1$ and
$$\sum_{i=1}^{n_r} p_i^{(r)} U_0(c_i^{(r)}) + \sum_{l=1}^{L} w_l U_1^{(r)}(q_l) = 0.$$
To carry out this optimization, we take the logarithm (ln) of the ELP objective function. This forces the weights to be positive. The difference between these constraints and the nonlinear ones defined in equation (1) is $o(N_r^{-1/2})$, where $N_r = n_r + L$.

First approximation of the weights. We need an approximation of the weights to control the order of the Lagrange multipliers. In order to obtain such an approximation, we consider an easier program. As the expectations of $U_0(c_i^{(1)})$, $U_0(c_j^{(2)})$ and $U_1^{(r)}(q_l)$ are zero, we consider the likelihood $\prod_{i=1}^{n_1} \bar p_i^{(1)} \prod_{j=1}^{n_2} \bar p_j^{(2)} \prod_{l=1}^{L} \bar w_l$ under the additional constraints: for $r = 1, 2$,
$$\sum_{i=1}^{n_r} \bar p_i^{(r)} U_0(c_i^{(r)}) = 0 \quad \text{and} \quad \sum_{l=1}^{L} \bar w_l U_1^{(r)}(q_l) = 0.$$
The constraints are thus split in two, each constraint concerning only one set of weights. The optimization program is therefore divided in 3 independent
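sub-programs, the first two (on the $\tilde p_i^{(r)}$'s) being the classical empirical likelihood for the mean and the last one (on the $\tilde w_l$'s) having 2 constraints. As done in [30], Theorem 1, we have a control on the order of the optimal weights of each sub-program:

$$\tilde p_i^{(r)} = \frac{1}{n_r\left(1 + t_r U_0(c_i^{(r)})\right)} \quad\text{with } t_r = O(n_r^{-1/2}),$$

$$\tilde w_l = \frac{1}{L\left(1 + (\tau_1, \tau_2)\,(U_1(q_l), U_2(q_l))'\right)} \quad\text{with } \tau_r = O(L^{-1/2}).$$

The optimum of this new program, which is given by the optimum on each of the 3 sub-programs, is smaller than the ELP one, because we added constraints:

$$\prod_{i=1}^{n_1} \tilde p_i^{(1)} \prod_{j=1}^{n_2} \tilde p_j^{(2)} \prod_{l=1}^{L} \tilde w_l \;\le\; \prod_{i=1}^{n_1} p_i^{(1)} \prod_{j=1}^{n_2} p_j^{(2)} \prod_{l=1}^{L} w_l.$$

This means that the weights in ELP (the $p_i^{(1)}$'s, $p_j^{(2)}$'s and $w_l$'s) are closer to $1/n_1$, $1/n_2$ and $1/L$ than the $\tilde p_i^{(1)}$'s, $\tilde p_j^{(2)}$'s and $\tilde w_l$'s. Notice that

$$\left|\sum_{i=1}^{n_r}\Big(\tilde p_i^{(r)} - \frac{1}{n_r}\Big) U_0(c_i^{(r)})\right| = \left|\frac{1}{n_r}\sum_{i=1}^{n_r}\Big(\frac{1}{1 + t_r U_0(c_i^{(r)})} - 1\Big) U_0(c_i^{(r)})\right| = \left|\, t_r\,\frac{1}{n_r}\sum_{i=1}^{n_r} U_0(c_i^{(r)})^2 + o(t_r)\right| = O(n_r^{-1/2}).$$

Then, coming back to the original ELP program, we have

$$\left|\sum_{i=1}^{n_r}\Big(p_i^{(r)} - \frac{1}{n_r}\Big) U_0(c_i^{(r)})\right| = O(n_r^{-1/2}),$$

and, by standard CLT arguments on the means $\frac{1}{n_r}\sum_{i=1}^{n_r} U_0(c_i^{(r)})$, we have, for $r = 1, 2$:

$$\sum_{i=1}^{n_r} p_i^{(r)} U_0(c_i^{(r)}) = O(n_r^{-1/2}). \tag{3}$$

By similar arguments on $w$, $\sum_{l=1}^{L} w_l U_r(q_l) = O(L^{-1/2})$.

Lagrangian. The ELP program can be rewritten as the following maximum:

$$\max_{w_l,\,\gamma_a,\,p_i^{(r)},\,\gamma_r,\,\lambda_r} H\big(w_l, \gamma_a, p_i^{(r)}, \gamma_r, \lambda_r\big),$$

where: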
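$$H\big(w_l, \gamma_a, p_i^{(r)}, \gamma_r, \lambda_r\big) = \sum_{i=1}^{n_1}\ln p_i^{(1)} + \sum_{j=1}^{n_2}\ln p_j^{(2)} + \sum_{l=1}^{L}\ln w_l - \gamma_a\Big[\sum_{l=1}^{L} w_l - 1\Big] - \sum_{r=1}^{2}\Big\{\gamma_r\Big[\sum_{i=1}^{n_r} p_i^{(r)} - 1\Big] + \lambda_r\Big[\sum_{i=1}^{n_r} p_i^{(r)} U_0(c_i^{(r)}) + \sum_{l=1}^{L} w_l U_r(q_l)\Big]\Big\}.$$

Using $\partial H/\partial p_i^{(r)} = 1/p_i^{(r)} - \gamma_r - \lambda_r U_0(c_i^{(r)}) = 0$ and the similar expression for $\partial H/\partial w_l$ gives that

$$p_i^{(r)} = \frac{1}{\gamma_r + \lambda_r U_0(c_i^{(r)})} \quad\text{and}\quad w_l = \frac{1}{\gamma_a + \lambda_1 U_1(q_l) + \lambda_2 U_2(q_l)}. \tag{4}$$

Note that we also have

$$\sum_{i=1}^{n_r} p_i^{(r)}\frac{\partial H}{\partial p_i^{(r)}} = n_r - \gamma_r - \lambda_r\sum_{i=1}^{n_r} p_i^{(r)} U_0(c_i^{(r)}) = 0 \tag{5}$$

and, using the constraints, we get that

$$0 = \sum_{i=1}^{n_1} p_i^{(1)}\frac{\partial H}{\partial p_i^{(1)}} + \sum_{j=1}^{n_2} p_j^{(2)}\frac{\partial H}{\partial p_j^{(2)}} + \sum_{l=1}^{L} w_l\frac{\partial H}{\partial w_l} = n_1 + n_2 + L - \gamma_1 - \gamma_2 - \gamma_a. \tag{6}$$

The ELP problem can be rewritten using (4) and (6) in the dual form

$$\ell_{n_1,n_2,L}(\theta) = \sup_{\substack{\lambda_1,\lambda_2,\gamma_1,\gamma_2,\gamma_a \in \mathbb{R} \\ n_1+n_2+L-\gamma_1-\gamma_2-\gamma_a = 0}} \Big\{\sum_{i=1}^{n_1}\ln\big(\gamma_1 + \lambda_1 U_0(c_i^{(1)})\big) + \sum_{j=1}^{n_2}\ln\big(\gamma_2 + \lambda_2 U_0(c_j^{(2)})\big) + \sum_{l=1}^{L}\ln\big(\gamma_a + \lambda_1 U_1(q_l) + \lambda_2 U_2(q_l)\big)\Big\}.$$

Furthermore, combining (5) with $\sum_{i=1}^{n_r} p_i^{(r)} U_0(c_i^{(r)}) = O(n_r^{-1/2})$ gives that $\gamma_r = n_r + v_r$ with $v_r = \lambda_r\, O(n_r^{-1/2})$, and then

$$p_i^{(r)} = \frac{1}{n_r + v_r + \lambda_r U_0(c_i^{(r)})} \quad\text{and}\quad w_l = \frac{1}{L - v_1 - v_2 + \lambda_1 U_1(q_l) + \lambda_2 U_2(q_l)}.$$

Let us consider the case of the $w_l$. Adapting Owen's proof, equation (3) for $r = 1$ combined with (4) yields, for the $w$ constraint,

$$O(L^{-1/2}) = \sum_{l=1}^{L} w_l U_1(q_l) = \sum_{l=1}^{L}\frac{U_1(q_l)}{L - v_1 - v_2 + \lambda_1 U_1(q_l) + \lambda_2 U_2(q_l)}$$

$$= \frac{1}{L}\sum_{l=1}^{L} U_1(q_l)\Big[1 - \frac{-v_1 - v_2 + \lambda_1 U_1(q_l) + \lambda_2 U_2(q_l)}{L - v_1 - v_2 + \lambda_1 U_1(q_l) + \lambda_2 U_2(q_l)}\Big]$$

$$= \bar U_1 - \frac{\lambda_1}{L}\sum_{l=1}^{L} w_l\,[U_1(q_l)]^2 - \frac{\lambda_2}{L}\sum_{l=1}^{L} w_l\,U_1(q_l)U_2(q_l) + \frac{v_1 + v_2}{L}\sum_{l=1}^{L} w_l U_1(q_l),$$

where $\bar U_1 = \frac{1}{L}\sum_{l=1}^{L} U_1(q_l)$. The last term is equivalent to $(v_1 + v_2)\,O(L^{-3/2})$ and can therefore be included in $O(L^{-1/2})$, so that

$$\bar U_1 = \frac{\lambda_1}{L}\sum_{l=1}^{L} w_l\,[U_1(q_l)]^2 + \frac{\lambda_2}{L}\sum_{l=1}^{L} w_l\,U_1(q_l)U_2(q_l) + O(L^{-1/2}).$$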
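Using Owen's arguments, we obtain

$$\bar U_1 + O(L^{-1/2}) = \lambda_1\,\overline{[U_1]^2} + \lambda_2\,\overline{U_1U_2} \quad\text{and}\quad \bar U_2 + O(L^{-1/2}) = \lambda_1\,\overline{U_1U_2} + \lambda_2\,\overline{[U_2]^2},$$

where $\overline{[U_r]^2} = \frac{1}{L^2}\sum_{l=1}^{L}[U_r(q_l)]^2$ and $\overline{U_1U_2} = \frac{1}{L^2}\sum_{l=1}^{L} U_1(q_l)U_2(q_l)$. This can be rewritten:

$$\begin{pmatrix}\lambda_1 \\ \lambda_2\end{pmatrix} = \begin{pmatrix}\overline{[U_1]^2} & \overline{U_1U_2} \\ \overline{U_1U_2} & \overline{[U_2]^2}\end{pmatrix}^{-1}\begin{pmatrix}\bar U_1 + O(L^{-1/2}) \\ \bar U_2 + O(L^{-1/2})\end{pmatrix}. \tag{7}$$

As the empirical variance-covariance matrix converges to the nondegenerate variance-covariance matrix $E_P[(U_1, U_2)'(U_1, U_2)]$ and as $\bar U_1$ and $\bar U_2$ are of order $O(L^{-1/2})$, it follows that $\lambda_1$ and $\lambda_2$ are of order $O(L^{1/2})$. When considering $p_i^{(r)}$ instead of $w_l$, the calculations are easier and we get in a similar fashion

$$\lambda_r = n_r\,\frac{\bar U_0^{(r)}}{\overline{[U_0^{(r)}]^2}} + O(n_r^{1/2}), \tag{8}$$

where $\bar U_0^{(r)} = \frac{1}{n_r}\sum_{i=1}^{n_r} U_0(c_i^{(r)})$ and $\overline{[U_0^{(r)}]^2} = \frac{1}{n_r}\sum_{i=1}^{n_r}[U_0(c_i^{(r)})]^2$.

Now that we control the size of $\lambda_r$ at the optimum for both $n_r$ and $L$ with (7) and (8), the arguments of [27], chapter 11.4, and the proof of [30] give the expected convergence of $r_{n_1,n_2,L}(\theta_d) = 2\big[\ell_{n_1,n_2,L}(\theta_d) - \ell_{n_1,n_2,L}(\hat\theta)\big]$ to a $\chi^2(1)$.

A.2. Proof of Corollary 1, case P > 1. The preceding arguments may be generalized to the case of $P$ products. We give here a proof for $P = 2$. The incomplete U-statistics related to the contamination of the 2 products are denoted $U_{a,B}^{(r)}$ and $U_{b,B}^{(r)}$. The difference between the incomplete and the complete statistics is of order $O(B^{-1/2})$, and therefore does not affect the asymptotic results. The program consists in maximizing

$$\prod_{i=1}^{n_1} p_i^{(1)} \prod_{j=1}^{n_2} p_j^{(2)} \prod_{l=1}^{L_a} w_l^{[a]} \prod_{l=1}^{L_b} w_l^{[b]},$$

under the constraints: $\sum_{l=1}^{L_a} w_l^{[a]} = 1$, $\sum_{l=1}^{L_b} w_l^{[b]} = 1$, and for $r = 1, 2$: $\sum_{i=1}^{n_r} p_i^{(r)} = 1$ and

$$\sum_{i=1}^{n_r} p_i^{(r)} U_{0,B}^{(r)}(c_i^{(r)}) + \sum_{l=1}^{L_a} w_l^{[a]} U_{a,B}^{(r)}(q_l^{[a]}) + \sum_{l=1}^{L_b} w_l^{[b]} U_{b,B}^{(r)}(q_l^{[b]}) = 0.$$

For $r = 1, 2$ and $k = a, b$, we can check with similar arguments that

$$\sum_{i=1}^{n_r} p_i^{(r)} U_{0,B}^{(r)}(c_i^{(r)}) = O(n_r^{-1/2}), \qquad \sum_{l=1}^{L_k} w_l^{[k]} U_{k,B}^{(r)}(q_l^{[k]}) = O(L_k^{-1/2}).$$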
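We get as before, for $r = 1, 2$ and $k = a, b$:

$$p_i^{(r)} = \frac{1}{n_r + v_r + \lambda_r U_{0,B}^{(r)}(c_i^{(r)})} \quad\text{and}\quad w_l^{[k]} = \frac{1}{L_k + v_k + \lambda_1 U_{k,B}^{(1)}(q_l^{[k]}) + \lambda_2 U_{k,B}^{(2)}(q_l^{[k]})},$$

with $v_1 + v_2 + v_a + v_b = 0$, and the proof follows the same lines as for 1 product.

A.3. Euclidean likelihood: Proof of Corollary 2. The objective function of the program is now

$$\frac{1}{2}\min_{\{p_i^{(1)},\,p_j^{(2)},\,w_l^{[k]}\}}\Big\{\sum_{r=1}^{2}\sum_{i=1}^{n_r}\big(n_r p_i^{(r)} - 1\big)^2 + \sum_{k=1}^{P}\sum_{l=1}^{L_k}\big(L_k w_l^{[k]} - 1\big)^2\Big\}.$$

We then get simpler expressions, which allow us to reach explicit solutions for the weights. For the sake of simplicity, we present the results for two consumption surveys and one food product ($P = 1$). The optimization program can be rewritten

$$\frac{1}{2}\min_{\{p_i^{(1)},\,p_j^{(2)},\,w_l\}}\Big\{\sum_{i=1}^{n_1}\big(n_1 p_i^{(1)} - 1\big)^2 + \sum_{j=1}^{n_2}\big(n_2 p_j^{(2)} - 1\big)^2 + \sum_{l=1}^{L}\big(L w_l - 1\big)^2\Big\},$$

under the constraints: $\sum_{l=1}^{L} w_l = 1$, and for $r = 1, 2$: $\sum_{i=1}^{n_r} p_i^{(r)} = 1$ and $\sum_{i=1}^{n_r} p_i^{(r)} U_0(c_i^{(r)}) + \sum_{l=1}^{L} w_l U_r(q_l) = 0$. Define

$$H = \frac{1}{2}\sum_{i=1}^{n_1}\big(n_1 p_i^{(1)} - 1\big)^2 + \frac{1}{2}\sum_{j=1}^{n_2}\big(n_2 p_j^{(2)} - 1\big)^2 + \frac{1}{2}\sum_{l=1}^{L}\big(L w_l - 1\big)^2 - \lambda_1\Big[\sum_{i=1}^{n_1} p_i^{(1)} U_0(c_i^{(1)}) + \sum_{l=1}^{L} w_l U_1(q_l)\Big] - \lambda_2\Big[\sum_{j=1}^{n_2} p_j^{(2)} U_0(c_j^{(2)}) + \sum_{l=1}^{L} w_l U_2(q_l)\Big] - \gamma_1\Big[\sum_{i=1}^{n_1} p_i^{(1)} - 1\Big] - \gamma_2\Big[\sum_{j=1}^{n_2} p_j^{(2)} - 1\Big] - \gamma_a\Big[\sum_{l=1}^{L} w_l - 1\Big].$$

Then the first order condition of the optimization program leads to

$$\frac{\partial H}{\partial p_i^{(r)}} = n_r\big(n_r p_i^{(r)} - 1\big) - \gamma_r - \lambda_r U_0(c_i^{(r)}) = 0,$$

so that we get $p_i^{(r)} = 1/n_r + \big[\gamma_r + \lambda_r U_0(c_i^{(r)})\big]/n_r^2$. As the weights sum to 1, we have $1 = \sum_{i=1}^{n_r} p_i^{(r)} = 1 + \big(\gamma_r + \lambda_r\bar U_0^{(r)}\big)/n_r$, so that $\gamma_r = -\lambda_r\bar U_0^{(r)}$, and finally

$$p_i^{(r)} = \frac{1}{n_r} + \frac{\lambda_r}{n_r^2}\Big[U_0(c_i^{(r)}) - \bar U_0^{(r)}\Big] \quad\text{and}\quad w_l = \frac{1}{L} + \frac{1}{L^2}\Big(\lambda_1\big[U_1(q_l) - \bar U_1\big] + \lambda_2\big[U_2(q_l) - \bar U_2\big]\Big).$$

The constraints can be rewritten, for $r = 1, 2$:

$$\bar U_0^{(1)} + \bar U_1 + \lambda_1\Big[\frac{V(U_0^{(1)})}{n_1} + \frac{V(U_1)}{L}\Big] + \lambda_2\,\frac{\mathrm{Cov}(U_1, U_2)}{L} = 0,$$

$$\bar U_0^{(2)} + \bar U_2 + \lambda_2\Big[\frac{V(U_0^{(2)})}{n_2} + \frac{V(U_2)}{L}\Big] + \lambda_1\,\frac{\mathrm{Cov}(U_1, U_2)}{L} = 0,$$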
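where $V$ and $\mathrm{Cov}$ denote the empirical variance operator, $V(X) = \overline{X^2} - \bar X^2$, and the empirical covariance operator, $\mathrm{Cov}(X, Y) = \overline{XY} - \bar X\,\bar Y$. These terms do not depend on $\theta$. Note that $\bar U_0^{(r)} = \bar U_r$ by definition of these U-statistics, and write it $\bar U^{(r)}$. The optimum is then reached at

$$\begin{pmatrix}\lambda_1 \\ \lambda_2\end{pmatrix} = -2\begin{pmatrix}\dfrac{V(U_0^{(1)})}{n_1} + \dfrac{V(U_1)}{L} & \dfrac{\mathrm{Cov}(U_1, U_2)}{L} \\[2ex] \dfrac{\mathrm{Cov}(U_1, U_2)}{L} & \dfrac{V(U_0^{(2)})}{n_2} + \dfrac{V(U_2)}{L}\end{pmatrix}^{-1}\begin{pmatrix}\bar U^{(1)} \\ \bar U^{(2)}\end{pmatrix}.$$

Thus the optimal value can be computed explicitly. Finally, replacing the values of the weights and the $\lambda$'s in the optimization program, we get:

$$\ell_{n_1,n_2,L}(\theta) = \frac{4}{2}\,Y_\theta'\,M_N^{-1}\,Y_\theta, \quad\text{where } Y_\theta = \begin{pmatrix}\sqrt{N_1}\,\bar U^{(1)} \\ \sqrt{N_2}\,\bar U^{(2)}\end{pmatrix}$$

and

$$M_N = \begin{pmatrix}\dfrac{N_1}{n_1}V(U_0^{(1)}) + \dfrac{N_1}{L}V(U_1) & \dfrac{\sqrt{N_1N_2}}{L}\,\mathrm{Cov}(U_1, U_2) \\[2ex] \dfrac{\sqrt{N_1N_2}}{L}\,\mathrm{Cov}(U_1, U_2) & \dfrac{N_2}{n_2}V(U_0^{(2)}) + \dfrac{N_2}{L}V(U_2)\end{pmatrix}.$$

The statistic

$$\bar U^{(r)} = \frac{1}{n_r L}\sum_{i=1}^{n_r}\sum_{l=1}^{L}\mathbf{1}_{\{q_l c_i^{(r)} > d\}} - \theta$$

is a generalized U-statistic with kernel $\mathbf{1}_{\{q\,c^{(r)} > d\}} - \theta$ and of degree $(1, 1)$. The CLT for U-statistics ensures that, with $N_r = n_r + L$, $n_r/N_r \to \eta_r$ and $L/N_r \to \beta_r$,

$$\sqrt{N_r}\,\bar U^{(r)} \xrightarrow[n_r, L \to \infty]{} \mathcal{N}\big(\theta_d - \theta,\; S_r^2\big),$$

where $S_r^2 = \eta_r^{-1}V[\psi_C] + \beta_r^{-1}V[\psi_Q]$ and where $\psi_C$ and $\psi_Q$ are the gradients of order 1 of the U-statistic.

We now consider the asymptotic covariance $C_{12}$ of these two statistics, i.e. the limit of $\frac{\sqrt{N_1N_2}}{L}\,\mathrm{Cov}(U_1, U_2)$. To calculate $C_{12}$, we set $X_{il}^{(r)} = \mathbf{1}_{\{q_l c_i^{(r)} > d\}} - \theta$, and we have

$$Y_\theta = \begin{pmatrix}\sqrt{N_1}\,\bar U^{(1)} \\ \sqrt{N_2}\,\bar U^{(2)}\end{pmatrix} = \begin{pmatrix}\dfrac{\sqrt{N_1}}{n_1 L}\sum_{i,l} X_{il}^{(1)} \\[2ex] \dfrac{\sqrt{N_2}}{n_2 L}\sum_{j,l} X_{jl}^{(2)}\end{pmatrix}.$$

We have

$$\frac{\sqrt{N_1N_2}}{n_1 n_2 L^2}\,\mathrm{Cov}\Big(\sum_{i,l} X_{il}^{(1)},\; \sum_{j,l'} X_{jl'}^{(2)}\Big) = \frac{\sqrt{N_1N_2}}{n_1 n_2 L^2}\,E\Big[\sum_{i,l}\sum_{j,l'} X_{il}^{(1)} X_{jl'}^{(2)}\Big] = \frac{\sqrt{N_1N_2}}{n_1 n_2 L^2}\,E\Big[\sum_{i,j,l} X_{il}^{(1)} X_{jl}^{(2)}\Big] = \frac{\sqrt{N_1N_2}}{L}\,v_{12},$$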
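where $v_{12} = E\big[X_{il}^{(1)} X_{jl}^{(2)}\big]$. Therefore, $C_{12} = (\beta_1\beta_2)^{-1/2}\,v_{12}$. Finally, $Y_\theta$ is asymptotically a Gaussian vector with mean $(\theta_d - \theta)\,\mathbf{1}_2$, where $\mathbf{1}_2 = (1, 1)'$, and variance

$$M = \begin{pmatrix} S_1^2 & C_{12} \\ C_{12} & S_2^2 \end{pmatrix}.$$

We must now show that $M_N$ is a convergent estimator of $M$. By classical results on U-statistics, we have

$$\hat S_r^2 = \frac{N_r}{n_r}V(U_0^{(r)}) + \frac{N_r}{L}V(U_r) \to S_r^2.$$

Let us show that $[M_N]_{12} = [M_N]_{21} = \frac{\sqrt{N_1N_2}}{L}\,\mathrm{Cov}(U_1, U_2) \to C_{12}$. We have

$$\frac{\sqrt{N_1N_2}}{L}\,\mathrm{Cov}(U_1, U_2) = \frac{\sqrt{N_1N_2}}{L}\Big[\frac{1}{n_1 n_2 L}\sum_{i,j,l} X_{il}^{(1)} X_{jl}^{(2)} - \Big(\frac{1}{n_1 L}\sum_{i,l} X_{il}^{(1)}\Big)\Big(\frac{1}{n_2 L}\sum_{j,l} X_{jl}^{(2)}\Big)\Big].$$

Let $i \le n_1$ and $j \le n_2$. By the LLN, $\frac{1}{L}\sum_{l} X_{il}^{(1)} X_{jl}^{(2)} \to v_{12}$ and $\frac{1}{L}\sum_{l} X_{il}^{(r)} \to 0$, and then $M_N \to M$.

To establish the convergence of $r_{n_1,n_2,L}(\theta_d) = \ell_{n_1,n_2,L}(\theta_d) - \inf_\theta \ell_{n_1,n_2,L}(\theta)$, we consider $\hat\theta$, the minimiser of $\ell_{n_1,n_2,L}(\theta)$, i.e. of $Y_\theta' M_N^{-1} Y_\theta$. Write $Y_\theta = Z - \theta\,\mathbf{1}_2$. The first order condition gives $-\mathbf{1}_2' M_N^{-1} Z + \hat\theta\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2 = 0$, and then

$$\hat\theta = \frac{\mathbf{1}_2' M_N^{-1} Z}{\mathbf{1}_2' M_N^{-1}\mathbf{1}_2}.$$

Thus

$$r_{n_1,n_2,L}(\theta_d) = 2\,Y_{\theta_d}' M_N^{-1} Y_{\theta_d} - 2\,Y_{\hat\theta}' M_N^{-1} Y_{\hat\theta}$$

$$= 2\big[Z' M_N^{-1} Z - 2\theta_d\,\mathbf{1}_2' M_N^{-1} Z + \theta_d^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2\big] - 2\big[Z' M_N^{-1} Z - 2\hat\theta\,\mathbf{1}_2' M_N^{-1} Z + \hat\theta^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2\big]$$

$$= 2\big[-2\theta_d\,\mathbf{1}_2' M_N^{-1} Z + \theta_d^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2 + 2\hat\theta\,\mathbf{1}_2' M_N^{-1} Z - \hat\theta^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2\big].$$

By the first order condition, $\mathbf{1}_2' M_N^{-1} Z = \hat\theta\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2$, and then

$$r_{n_1,n_2,L}(\theta_d) = 2\big[-2\theta_d\hat\theta\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2 + \theta_d^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2 + \hat\theta^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2\big] = 2\,(\hat\theta - \theta_d)^2\,\mathbf{1}_2' M_N^{-1}\mathbf{1}_2 = 2\,\frac{\big[\mathbf{1}_2' M_N^{-1}(Z - \theta_d\,\mathbf{1}_2)\big]^2}{\mathbf{1}_2' M_N^{-1}\mathbf{1}_2}.$$

$M_N^{-1/2}(Z - \theta_d\,\mathbf{1}_2)$ is asymptotically a standard Gaussian vector. $r_{n_1,n_2,L}(\theta_d)$ is then twice the square of a weighted mean of two independent standard Gaussians, and then $r_{n_1,n_2,L}(\theta_d) \to \frac{4}{2}\,\chi^2(1)$.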
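The closed-form weights derived above can be illustrated on a deliberately simplified one-sample analogue. The Python sketch below is not the two-survey program of this appendix: it assumes a single sample $x_1, \dots, x_n$ and a single constraint $\sum_i p_i (x_i - \mu) = 0$, so the constants differ from those obtained above. The function name `euclidean_likelihood_mean` and its arguments are hypothetical, introduced only for illustration. Under these assumptions the same Lagrangian argument gives $\lambda = -n\bar U / V(U)$ and $p_i = 1/n + \lambda (U_i - \bar U)/n^2$ with $U_i = x_i - \mu$.

```python
from typing import List, Tuple


def euclidean_likelihood_mean(x: List[float], mu: float) -> Tuple[List[float], float, float]:
    """One-sample Euclidean likelihood for the constraint sum_i p_i (x_i - mu) = 0.

    Minimises (1/2) * sum_i (n p_i - 1)^2 subject to sum_i p_i = 1 and
    sum_i p_i u_i = 0, with u_i = x_i - mu, using the closed-form solution.
    This is a simplified, hypothetical analogue of the two-survey program.
    Returns (weights, Lagrange multiplier, sum_i (n p_i - 1)^2).
    """
    n = len(x)
    u = [xi - mu for xi in x]
    u_bar = sum(u) / n
    # Empirical variance with 1/n normalisation: V(U) = mean(U^2) - mean(U)^2.
    v = sum(ui * ui for ui in u) / n - u_bar ** 2
    if v <= 0.0:
        raise ValueError("the sample must not be constant")
    # Multiplier obtained by plugging the weights into the moment constraint.
    lam = -n * u_bar / v
    # Explicit weights; unlike empirical likelihood weights they may be negative.
    weights = [1.0 / n + lam * (ui - u_bar) / n ** 2 for ui in u]
    # Twice the minimised objective; algebraically equal to n * u_bar^2 / v.
    stat = sum((n * p - 1.0) ** 2 for p in weights)
    return weights, lam, stat


if __name__ == "__main__":
    w, lam, stat = euclidean_likelihood_mean([1.0, 2.0, 3.0, 4.0], mu=2.0)
    print(w, lam, stat)
```

For the toy sample in the usage lines, the weights are approximately (0.4, 0.3, 0.2, 0.1); they sum to one and satisfy the moment constraint. Because the solution is explicit, no iterative optimisation is needed, which is the practical appeal of the Euclidean program noted in the Discussion.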
Case P > 1. We also use this framework for the 2 surveys, 2 products context. The form of the Euclidean likelihood is almost the same, with $\bar U^{(r)} := \bar U_0^{(r)} = \bar U_1^{(r)} = \bar U_2^{(r)}$, and we easily get by straightforward calculus

$$\ell_{n_1,n_2,L_1,L_2}(\theta) = \big(\bar U^{(1)}, \bar U^{(2)}\big)\,A^{-1}\,\big(\bar U^{(1)}, \bar U^{(2)}\big)',$$

where

$$A = \begin{pmatrix}\dfrac{V(U_0^{(1)})}{n_1} + \dfrac{V(U_1^{(1)})}{L_1} + \dfrac{V(U_2^{(1)})}{L_2} & \dfrac{\mathrm{Cov}(U_1^{(1)}, U_1^{(2)})}{L_1} + \dfrac{\mathrm{Cov}(U_2^{(1)}, U_2^{(2)})}{L_2} \\[2ex] \dfrac{\mathrm{Cov}(U_1^{(1)}, U_1^{(2)})}{L_1} + \dfrac{\mathrm{Cov}(U_2^{(1)}, U_2^{(2)})}{L_2} & \dfrac{V(U_0^{(2)})}{n_2} + \dfrac{V(U_1^{(2)})}{L_1} + \dfrac{V(U_2^{(2)})}{L_2}\end{pmatrix},$$

and the result follows.

REFERENCES

[1] Bertail, P. 2006. Empirical likelihood in some semi-parametric models. Bernoulli 12, 2, 299–331.
[2] Bertail, P., Harari-Kermadec, H., and Ravaille, D. 2004. ϕ-divergence empirique et vraisemblance empirique généralisée. To appear in Annales d'économie et de statistique.
[3] Bertail, P. and Tressou, J. 2006. Incomplete generalized U-statistics for food risk assessment. Biometrics 62, 1, 66–74.
[4] Blom, G. 1976. Some properties of incomplete U-statistics. Biometrika 63, 573–580.
[5] Bonnal, H. and Renault, E. 2004. On the efficient use of the informational content of estimating equations: Implied probabilities and Euclidean empirical likelihood. Cahiers scientifiques CIRANO, 2004s-18.
[6] Chesher, A. 1997. Diet revealed?: Semiparametric estimation of nutrient intake-age relationships. J. R. Statist. Soc. A 160, 3, 389–428.
[7] Claisse, D., Cossa, D., Bretaudeau-Sanjuan, G., Touchard, G., and Bombled, B. 2001. Methylmercury in molluscs along the French coast. Marine Pollution Bulletin 42, 329–332.
[8] Cossa, D., Auger, D., Averty, B., Lucon, M., Masselin, P., Noel, J., and San-Juan, J. 1989. Atlas des niveaux de concentration en métaux métalloïdes et composés organochlorés dans les produits de la pêche côtière française. Tech. rep., IFREMER, Nantes.
[9] CREDOC-AFSSA-DGAL. 1999. Enquête INCA (individuelle et nationale sur les consommations alimentaires), TEC&DOC ed. Lavoisier, Paris. (Coordinateur : J. L. Volatier).
[10] Davidson, P., Myers, G., Cox, C., Shamlaye, C. F., Clarkson, T., Marsh, D., Tanner, M., Berlin, M., Sloane-Reeves, J., Cernichiari, E., Choisy, O., Choi, A., and Clarkson, T. W. 1995. Longitudinal neurodevelopmental study of Seychellois children following in utero exposure to MeHg from maternal fish ingestion: Outcomes at 19-29 months.
Neurotoxicology 16, 677–688.
[11] Deville, J. C. and Sarndal, C. E. 1992. Calibration estimators in survey sampling. J. Am. Statist. Ass. 87, 376–382.
[12] FAO/WHO. 2003. Evaluation of certain food additives and contaminants for methylmercury. Sixty first report of the Joint FAO/WHO Expert Committee on Food Additives, Technical Report Series, WHO, Geneva, Switzerland.
[13] GEMs/Food-WHO. 1995. Reliable evaluation of low-level contamination of food, workshop in the frame of GEMS/Food-EURO. Tech. rep., Kulmbach, Germany, 26-27 May 1995.
[14] Grandjean, P., Weihe, P., White, R., Debes, F., Araki, S., Yokoyama, K., Murata, K., Sorensen, N., Dahl, R., and Jorgensen, P. 1997. Cognitive deficit in 7-year-old children with prenatal exposure to methylmercury. Neurotoxicology Teratology 19, 417–428.
[15] Hellerstein, J. K. and Imbens, G. 1999. Imposing moment restrictions from auxiliary data by weighting. The Review of Economics and Statistics 81, 1, 1–14.
[16] Hoffmann, K., Boeing, H., Dufour, A., Volatier, J. L., Telman, J., Virtanen, M., Becker, W., and Henauw, S. D. 2002. Estimating the distribution of usual dietary intake by short-term measurements. European Journal of Clinical Nutrition 56, 53–62.
[17] IFREMER. 1994-1998. Résultat du réseau national d'observation de la qualité du milieu marin pour les mollusques (RNO).
[18] INSEE, Institut National de la Statistique et des Etudes Economiques. 1999. La situation démographique en 1999 - Mouvements de la population et enquête emploi de janvier 1999. Tech. rep.
[19] Ireland, C. T. and Kullback, S. 1968. Contingency tables with given marginals. Biometrika 55, 1, 179–188.
[20] Lambe, J., Kearney, J., Leclercq, C., Zunft, H., Henauw, S. D., Lamberg-Allardt, C., Dunne, A., and Gibney, M. 2000. The influence of survey duration on estimates of food intakes and its relevance for public health nutrition and food safety issues. European Journal of Clinical Nutrition 53, 166–173.
[21] Lee, A. J. 1990. U-Statistics: Theory and Practice. Statistics: textbooks and monographs, Vol. 110. Marcel Dekker, Inc, New York, USA.
[22] MAAPAR. 1998-2002. Résultats des plans de surveillance pour les produits de la mer. Ministère de l'Agriculture, de l'Alimentation, de la Pêche et des Affaires Rurales.
[23] National Research Council (NRC) of the National Academy of Sciences Price. 2000. Toxicological effects of methyl mercury. Tech. rep., National Academy Press, Washington, DC.
[24] Newey, W. K. and Smith, R. J. 2004. Higher Order Properties of GMM and Generalized Empirical Likelihood Estimators. Econometrica 72, 1, 219–255.
[25] Owen, A. 1988. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249.
[26] Owen, A. 1990.
Empirical likelihood ratio confidence regions. Ann. Statist. 18, 90–120.
[27] Owen, A. 2001. Empirical Likelihood. Chapman & Hall/CRC.
[28] Price, P., Curry, C., Goodrum, P. E., Gray, M. N., McCrodden, J., Harrington, N. W., Carlson-Lynch, H., and Keenan, R. 1996. Monte Carlo modeling of time-dependent exposures using a microexposure event approach. Risk Analysis 16, 3, 339–348.
[29] Qin, J. 1993. Empirical likelihood in biased sample problems. Ann. Statist. 21, 3, 1182–1196.
[30] Qin, J. and Lawless, J. 1994. Empirical likelihood and general estimating equations. Ann. Statist. 22, 300–325.
[31] Ridder, G. and Moffitt, R. 2006. Handbook of econometrics, Heckman and Leamer ed. Elsevier, North-Holland, Amsterdam, Chapter The econometrics of data combination. See http://www-rcf.usc.edu/ rdder/wpapers/comsamp7nov03.pdf.
[32] Serra-Majem, L., MacLean, D., Ribas, L., Brule, D., Sekula, W., Prattala, R., Garcia-Closas, R., Yngve, A., and Petrasovits, M. L. A. 2003. Comparative analysis of nutrition data from national, household, and individual levels: results from a WHO-CINDI collaborative project in Canada, Finland, Poland, and Spain. Journal
of Epidemiology and Community Health 57, 74–80.
[33] Tressou, J. 2006. Non parametric modeling of the left censorship of analytical data in food risk exposure assessment. J. Am. Statist. Ass. (Application & Case Study), in press.
[34] Tressou, J., Crépet, A., Bertail, P., Feinberg, M. H., and Leblanc, J. C. 2004. Probabilistic exposure assessment to food chemicals based on extreme value theory. Application to heavy metals from fish and sea products. Food and Chemical Toxicology 42, 8, 1349–1358.
[35] WHO. 1990. Methylmercury, Environmental Health Criteria 101. Tech. rep., Geneva, Switzerland.

Address of the First and Third authors
INRA Mét@risk
16 rue Claude Bernard, 75231 Paris Cedex 5, France
E-mail: crepet@inapg.fr, tressou@inapg.fr

Address of the Second author
INRA CORELA
65 bd de Brandebourg, 94205 Ivry-sur-Seine, France
E-mail: harari@dptmaths.ens-cachan.fr