Chris J. Skinner The probability of identification: applying ideas from forensic statistics to disclosure risk assessment


Chris J. Skinner
The probability of identification: applying ideas from forensic statistics to disclosure risk assessment
Article (Accepted version) (Refereed)

Original citation: Skinner, Chris J. (2007) The probability of identification: applying ideas from forensic statistics to disclosure risk assessment. Journal of the Royal Statistical Society: series A (statistics in society), 170 (1). pp ISSN DOI: /j X Wiley-Blackwell

This version available at: Available in LSE Research Online: November 2011

LSE has developed LSE Research Online so that users may access research output of the School. Copyright and Moral Rights for the papers on this site are retained by the individual authors and/or other copyright owners. Users may download and/or print one copy of any article(s) in LSE Research Online to facilitate their private study or for non-commercial research. You may not engage in further distribution of the material or use it for any profit-making activities or any commercial gain. You may freely distribute the URL of the LSE Research Online website.

This document is the author's final manuscript accepted version of the journal article, incorporating any revisions agreed during the peer review process. Some differences between this version and the published version may remain. You are advised to consult the publisher's version if you wish to cite from it.
The probability of identification: applying ideas from forensic statistics to disclosure risk assessment

C. J. Skinner
University of Southampton, U.K.

Summary. This paper establishes a correspondence between statistical disclosure control and forensic statistics regarding their common use of the concept of probability of identification. The paper then seeks to investigate what lessons for disclosure control can be learnt from the forensic identification literature. The main lesson considered here is that disclosure risk assessment cannot, in general, ignore the search method employed by an intruder seeking to achieve disclosure. The effects of using several search methods are considered. Through consideration of the plausibility of assumptions and worst case approaches, the paper suggests how the impact of search method can be handled. The paper focuses on foundations of disclosure risk assessment, providing some justification for some modelling assumptions underlying some existing record level measures of disclosure risk. The paper illustrates the effects of using different search methods in a numerical example based upon microdata from a sample from the 2001 Census.

Key words: confidentiality; microdata; record linkage; disclosure control; uniqueness.

To appear in Journal of the Royal Statistical Society, Series A
1. Introduction

Statistical agencies conducting surveys or censuses need to protect the confidentiality of respondents when releasing outputs (Doyle et al., 2001). A major aim in confidentiality protection is to avoid identification. For example, the key confidentiality guarantee in the National Statistics Code of Practice (National Statistics, 2004, p.7) is that no statistics will be produced that are likely to identify an individual. Bethlehem et al. (1990) refer to similar principles elsewhere, such as in the International Statistical Institute Declaration on Professional Ethics. Concern about identification is particularly pronounced for releases of microdata, where the identification of a record in a microdata file might lead to the disclosure of the values of sensitive variables (Paass, 1988; Duncan and Lambert, 1989; Reiter, 2005).

Principles of confidentiality protection, such as that embodied in the National Statistics Code of Practice, are often expressed broadly and require refinement if they are to be implemented in practice. The concept of identification itself seems fairly clear: it involves linking an element of the output, such as a microdata record, with a known individual or other specified unit (Bethlehem et al., 1990). More challenging is the concept of the probability of identification, to which confidentiality protection principles often refer. For example, the phrase "likely to" in the National Statistics confidentiality guarantee is a probabilistic notion. The probability of identification is often referred to as identification risk or the risk of identity disclosure in the statistical disclosure control (SDC hereafter) literature (e.g. Paass, 1988; Duncan and Lambert, 1989; Reiter, 2005). The assessment of this probability is not straightforward, in particular since the underlying uncertainty might arise from a variety of sources, such as: whether an attempt
at identification by an intruder might take place, what auxiliary information an intruder might be able to use to attempt identification or which elements of the output or known individuals might be selected for an attempt at identification. Some of these sources of uncertainty may be handled by appropriate definition and alternative assumptions, such as via the components of risk approach of Marsh et al. (1991). Nevertheless, there remain challenges in assessing the uncertainty, as will become apparent in this paper.

One field of statistical application where there has been rigorous discussion and development of methods for assessing the probability of identification is forensic science (e.g. Dawid, 1994; Balding and Donnelly, 1995). The aim of this paper is, first, to establish a correspondence between the forensic identification literature and that on SDC and then to consider the relevance of some ideas from the former literature to the assessment of identification risk in an SDC context.

One particular implication of the forensic identification literature, upon which we shall focus, is that the probability of identification may depend upon the search method used by an intruder to select an element of the output and a known individual in the population for linking. While the SDC literature has acknowledged that intruders might employ different search methods to improve their chances of disclosure (e.g. Duncan and Lambert, 1989; Lambert, 1993), expressions for identification risk appearing in the SDC literature (e.g. Paass, 1988) are generally not dependent on the search method, for given auxiliary information. Following the forensic identification literature, we shall show how such dependence can arise. This finding makes the task of disclosure risk assessment harder, since the search method employed by a hypothetical intruder is necessarily unknown. We shall discuss how this problem might be addressed.
We shall argue that the assessment of identification risk in SDC may be viewed as a generalization of a forensic identification problem. As a consequence, we shall consider how forensic identification approaches may be extended to identification risk assessment in SDC. Our focus will be on the foundations of risk assessment methodology. We shall, however, outline an application in Section 6 and provide some numerical illustrations using data from the 2001 Census.

Our focus in an SDC context will be on microdata, although much of this paper will also be relevant to any form of output where identification is relevant, i.e. where there is concern about the linking of elements of the output to known individuals (or other specified units). Our discussion will apply to cases where SDC methods, such as perturbation (Willenborg and de Waal, 2001), have been applied, provided that each record of the resulting microdata (or element of the output) can still be interpreted as having originated from a given individual. Otherwise, it is not clear that there is reason to be concerned about identification.

We are not the first to observe the connection between forensic science and SDC. The reference to fingerprinting in Willenborg and de Waal (2001) provides a simple example. A deeper but more indirect connection may be traced via discussions of connections between forensic science and record linkage, e.g. Copas and Hilton (1990), and connections between record linkage and SDC, e.g. Paass (1988).

We shall begin in Section 2 by introducing a basic mapping between the two problems of forensic identification and disclosure risk assessment. A formal framework will then be set out in Section 3 to encompass both problems, and it will be indicated how the latter may be treated as a generalization of the former. In Section 4, we restrict
attention to situations where an intruder seeks to achieve identification by a matching approach. The nature of identification risk for this approach and, in particular, the impact of different kinds of search methods are discussed in Section 5, with an illustration in Section 6. Finally, in Section 7 we discuss the broad conclusions and their SDC context.

2. The basic correspondence between forensic identification and SDC

To introduce the correspondence, we first set out the two problems in prototypical form.

In forensic identification (e.g. Balding and Donnelly, 1995), a crime has been committed by an unknown culprit, who belongs to a specified population. The prosecuting authority identifies a member of this population as a suspect and brings the suspect to court. Identification occurs if the suspect and the culprit are identical, i.e. the suspect committed the crime or, in other words, the suspect is guilty. Data relevant to identification consist of values of variables observed both on the suspect and at the scene of the crime, e.g. from fingerprints, DNA profiles or eye witness testimony.

In identification risk assessment for microdata (e.g. Paass, 1988), a microdata file is to be released, based upon data provided by a sample of responding units from a population in a survey or census. The file consists of records for these sample units, each with the values of several variables. An intruder, i.e. a third party, has information about one or more known units in the population and seeks to link one of these with one of the records. Identification occurs if the selected known unit is identical to the responding unit which provided data for the record. Data relevant to identification consist of values of variables which are both recorded in the microdata and available to the intruder for the known units. These are often called key variables.
The correspondence between the two problems is summarised in Table 1. The crime corresponds to cooperation by a responding unit in a survey or other form of data collection, normally undertaken under some pledge of confidentiality. (Given most agencies' desire to avoid nonresponse, the correspondence is ironic!) The culprit corresponds to the responding unit. For simplicity, we shall generally suppose that both the culprit and the respondent are individuals. They each belong to some specified population. The prosecuting authority corresponds to the intruder. The suspect, identified by the prosecuting authority, corresponds to the individual chosen by the intruder for linking to a given record in the microdata. To assess the probability that the suspect is guilty, the court will use evidence which links the suspect to the scene of the crime via some shared characteristics, which correspond to the key variables. Some of the other forms of correspondence in Table 1 will be returned to in Section 3.

In the forensic identification problem there is just one crime, one culprit and one suspect. (Note that if the crime is committed by several individuals jointly then we use the term culprit to denote this cluster of individuals. Likewise, the suspect may consist of a cluster of individuals who are suspected to have committed the crime jointly.) The forensic identification problem therefore corresponds to a special case of the disclosure risk assessment problem, where there is just a single record in the microdata and where the intruder links just one known individual to this record. We thus view the SDC problem as generalizing the forensic identification problem to the case where multiple crimes are committed and there are multiple potential suspects that might be linked to these crimes.
3. Formalisation of the correspondence

We now seek to expand upon and formalise the correspondence introduced in the previous section. We begin in Section 3.1. by setting out our general framework for assessing identification risk in the context of SDC. Then, in Section 3.2., we discuss how the forensic identification problem may be considered in this framework.

3.1. SDC problem

We consider a rectangular microdata file in which each record contains values on a common set of variables for a unit in a finite population $U$ of size $N$. The units might in principle take different forms, for example households or businesses, but here we shall assume that they are individuals for simplicity. The microdata file might have been subject to perturbation by SDC methods, provided that it remains meaningful to associate each record with a unique individual.

We follow Paass (1988) and assume, hypothetically, that an intruder seeks to link one or more microdata records to one or more known individuals in the population using the values of certain key variables observed in both the microdata and on the known individuals. The known individuals might be drawn from a different source available to third parties, for example a database consisting of multiple records containing values of the key variables.

We define identification risk as the probability that a link between a particular record and a particular known individual is correct, conditional on an intruder having selected this record and this individual for linkage using a specified search method and specified auxiliary information. If the intruder attempts multiple links between several records and several known individuals then there is an identification risk for each
attempted link. Our definition implies a risk for each (record, known individual) pair which might have resulted from an intruder attack and, in particular, for the case when the known individual is in fact the individual to which the record belongs. We shall take the latter case to define the identification risk for a given record. The possible combination of such record level measures of risk to form a file level measure will be discussed in Section 7.

Suppose then that the intruder arrives at a potential link between a microdata record $r$ and a known individual in the population, denoted $B$, as a result of using a particular search method. The intruder might, for example, begin with a given target individual, $B$, in the population for which additional information is sought, and then search for the record in the microdata which appears to provide the best match to $B$. Let $A(r)$ denote the individual to which microdata record, $r$, belongs and write $A(r)$ as $A$ when this is unambiguous. Identification occurs if $A = B$. Note that, in our notation, $A$ and $B$ represent unique identifiers of units in the population, e.g. names and addresses, whereas $r$ belongs to the set $s$ of microdata records which are labelled arbitrarily, $s = \{1, \ldots, n\}$.

The values of the key variables for $r$ and $B$ are denoted by $\tilde{X}_{A(r)}$ and $X_B$ respectively, where ~ is used to signify that the key variables may be recorded in different ways in the two sources, for example because of measurement error, different definitions or because some SDC method has been applied to the microdata. The identification risk may then be expressed as:

identification risk = $P(A(r) = B \mid \tilde{X}_{\text{microdata}}, X_{\text{population}}, \text{search method})$, (1)

where $\tilde{X}_{\text{microdata}}$ and $X_{\text{population}}$ consist of the values assumed available to the intruder on $\tilde{X}$ for records in the microdata and on $X$ for individuals in the population, respectively.
We suppose that the probability in (1) refers to two possible kinds of stochastic mechanism: first, a superpopulation model for the generation of the values $\tilde{X}$ and $X$, which may include a stochastic SDC mechanism used to perturb $\tilde{X}$ or measurement error mechanisms affecting both $\tilde{X}$ and $X$; and second, the selection of $r$ and $B$, i.e. the combination of the search method and any probability sampling scheme (and nonresponse mechanism) which led to the selection of the respondents, underlying the microdata, from the population.

We may compare the identification risk in (1) with the probability $P(A(r) = B \mid \tilde{X}_{\text{microdata}}, X_{\text{population}})$, representing the uncertainty faced by the intruder when assessing whether an arbitrary record $r$ belongs to an arbitrary known individual $B$, prior to any search, assuming the same information on $\tilde{X}$ and $X$ is available. Such probabilities are considered by Paass (1988) and Reiter (2005). If this probability and the probability in (1) are the same then the search method is said to be ignorable. If this condition holds then disclosure risk assessment should be easier, since the search method of a hypothetical intruder is necessarily unknown. However, we shall show in Section 5.2. that search methods need not necessarily be ignorable and we shall discuss in Section 5.4. how we might deal with this possibility.

The probability in (1) is to be interpreted from the perspective of the releasing agency or disclosure auditor, based upon a set of stated assumptions about what auxiliary information might be available and the various stochastic mechanisms above. These assumptions are taken to be ones that could be publicly defended as realistic or correspond to confidentiality protection guidelines.
3.2. Forensic Identification

We now outline the corresponding setup in forensic identification, following the analogy set out in Section 2. The microdata sample is reduced to a single record $r$ corresponding to the culprit $A(r)$ committing the crime and $B$ becomes the suspect, observed to have a particular combination of traits, i.e. key variables, known to be shared by the criminal. For simplicity, we conceive of the criminal and the suspect as individuals, although these might be groups of individuals working together. The population consists of the set of individuals who could have committed the crime and the search method refers to the selection of $B$ from this population. There is only one culprit and hence the search method does not refer to the selection of $A$. In the SDC setup, $A(r)$ might therefore be interpreted as having committed the crime of acting as a respondent in a survey, providing data upon which the given microdata record has been based.

The evidence recovered from the crime scene about the culprit is denoted $\tilde{X}_A$. The corresponding characteristics of the suspect are denoted $X_B$. Again the key variables may be recorded in different ways, for example if $\tilde{X}_A$ includes variables obtained from eyewitness accounts then these may be subject to measurement error.

The identification risk corresponds to the probability that the suspect is guilty, that is that $B$ is the same person as $A$. Explicit expressions for this probability of guilt may be obtained under distributional assumptions. For example, for the case where $\tilde{X}_A$ and $X_B$ are normally distributed, Lindley (1977) provides expressions for the likelihood ratio (for $A = B$ vs. $A \neq B$) corresponding to this posterior probability of guilt given the observed values
of $\tilde{X}_A$ and $X_B$. Fuller (1993) provides expressions which may be interpreted as extensions of Lindley's results to the SDC case. Expressions for a further special case will be considered in the next section.

4. Linkage by matching

The discussion in Sections 2 and 3 applies to a very wide class of possible search methods. In practice, an important class of methods, relevant to both SDC and forensic identification, may be defined in terms of matching. In this case, there is a decision rule with a binary outcome, match or nonmatch, for any pair $(\tilde{X}_A, X_B)$. Thus, for a given record, $r$, in the microdata with key variable values $\tilde{X}_{A(r)}$ (or analogously a given crime with evidence $\tilde{X}_{A(r)}$ about the culprit), the decision rule defines a set $S_r$ of possible individuals in the population with values of $X_B$ which match $\tilde{X}_{A(r)}$ (and all remaining individuals will not match). Some examples of how such a matching rule might arise are:

(i) if the key variables are categorical, misclassification is ignored and $\tilde{X}_A$ is said to match $X_B$ if all of the key variables take the same value;

(ii) if the key variables are continuous or categorical and $\tilde{X}_A$ is said to match $X_B$ if measurement error is judged to make $\tilde{X}_A$ and $X_B$ indistinguishable (Balding and Donnelly, 1995, p.36);

(iii) if the key variables are continuous or categorical and a record linkage decision rule of the Fellegi and Sunter (1969) type is used, generating three possible outcomes: link, nonlink or a possible link for each
pair $(\tilde{X}_A, X_B)$. We suppose the possible link category is pooled with one of the other two categories.

Such matching approaches have been widely considered in the forensic identification literature. For example, Kingston (1965) defines identification in terms of the same kind of set $S_r$ as above. We shall return to example (i) later in the paper.

5. Identification Risk for Linkage by Matching

In this section we consider the nature of the probability of identification in (1) for the kinds of linkage methods described in Section 4. Sections 5.1. and 5.2. will focus on the case of a single record, as in forensic identification. The more general case will be considered in Sections 5.3 and 5.4.

5.1. Basic Formulation for a Single Record

We begin by considering an arbitrary record $r$ in the microdata, ignoring the remaining microdata records, as in the forensic identification case. We define $S_r$ as in Section 4 and let $F_r$ denote the size of this set. We assume that any discrepancies of measurement between $\tilde{X}$ and $X$ are allowed for in the matching rule sufficiently so that $\tilde{X}_{A(r)}$ matches $X_{A(r)}$, i.e. $A(r) \in S_r$, and thus $F_r \geq 1$.

Suppose that, using the linkage approach, an intruder finds an individual $B$ in $S_r$. We initially assume that $F_r$ is known. By assumption about the linkage rule and the fact that the remaining records are being ignored, the key variable values $\tilde{X}_{A(r)}$, $X_B$ carry no information about identification, i.e. whether $A(r) = B$, beyond the following
information: $A(r) \in S_r$, $B \in S_r$ and $F_r$. Thus, the identification risk in (1) may be expressed as:

identification risk = $P(A(r) = B \mid \tilde{X}_{\text{microdata}}, X_{\text{population}}, \text{search method})$
= $P(A(r) = B \mid A(r) \in S_r, B \in S_r, F_r, \text{search method})$. (2)

Under fairly weak conditions on the mechanism leading to the selection of $r$ and $B$, the expression in (2) reduces to

identification risk = $1/F_r$. (3)

For example, (3) holds if the intruder is equally likely to select $B$ as any member of $S_r$, conditional on $r$ and the event $B \in S_r$. Assumptions for (3) to hold are also made and justified by Dawid (1994, assumption A1) and Balding and Donnelly (1995, Assumption 1 and equation 7) in the forensic identification context.

One circumstance where (3) might be questionable in an SDC context is where the intruder begins with an arbitrary target individual in the population, unequal probability sampling is employed in the selection of the microdata sample and a match is observed which is unique in the microdata. In this case, the $F_r$ possible samples that could lead to this observed outcome are not necessarily equally likely if the probability function in (2) is defined in terms of the sampling scheme. Hence (3) may not hold. Nevertheless, in this case it appears difficult to arrive at an alternative to $1/F_r$ for the right hand side of (3), which is a function of information which an intruder might realistically have in practice, and we shall not pursue such concerns here. For the remainder of the paper we shall suppose that expression (2) does reduce to expression (3).
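As a small illustration of matching rule (i) of Section 4 and of expression (3), the following Python sketch builds the set of population members whose categorical key variables agree exactly with those of a record, and returns the reciprocal of the size of that set. The toy population, the three key variables and the function names are hypothetical and are not taken from the paper; the sketch also assumes that every member of the matching set is equally likely to be selected, which is the condition under which (3) holds.

```python
from fractions import Fraction

def matching_set(record_key, population):
    """Identifiers of population members whose key values equal record_key.

    Exact agreement on every categorical key variable, i.e. matching rule (i)
    with misclassification ignored.
    """
    return [ident for ident, key in population.items() if key == record_key]

def identification_risk(record_key, population):
    """Risk 1/F for a record, where F is the size of its matching set."""
    size = len(matching_set(record_key, population))
    if size == 0:
        # The text assumes the record's own respondent always matches (F >= 1).
        raise ValueError("no population member matches this record")
    return Fraction(1, size)

# Hypothetical population: identifier -> (sex, age band, occupation).
population = {
    "anna": ("F", "30-39", "teacher"),
    "ben": ("M", "30-39", "teacher"),
    "carl": ("M", "30-39", "teacher"),
    "dora": ("F", "60-69", "farmer"),
}

if __name__ == "__main__":
    for key in sorted(set(population.values())):
        print(key, identification_risk(key, population))
```

In practice the size of the matching set is not known to the assessor, which is why the text goes on to treat it as a random quantity.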
The simple expression $1/F_r$ in (3) has been noted by several authors, both in the forensic identification literature, e.g. Kingston (1965), and in the SDC literature, e.g. Duncan and Lambert (1989). The difficulty with (3) in practice is that $F_r$ will generally be unknown. Indeed, in the SDC context a key consideration is to ensure that the form of release should not permit key variables to be available where $F_r$ might be known to a potential intruder and be small, say one or two. When $F_r$ is unknown, we remove it from the conditioning set in (2) to give:

identification risk = $P(A(r) = B \mid A(r) \in S_r, B \in S_r, \text{search method})$
= $\sum_{F=1}^{N} P(A(r) = B \mid A(r) \in S_r, B \in S_r, F_r = F, \text{search method}) \, P(F_r = F \mid A(r) \in S_r, B \in S_r, \text{search method})$
= $\sum_{F=1}^{N} (1/F) \, P(F_r = F \mid A(r) \in S_r, B \in S_r, \text{search method})$,

under our assumption that (2) reduces to (3), and hence

identification risk = $E(1/F_r \mid A(r) \in S_r, B \in S_r, \text{search method})$, (4)

where the expectation is with respect to the conditional distribution $P(F_r \mid A(r) \in S_r, B \in S_r, \text{search method})$ of $F_r$ given the observed events. The problem of determining the identification risk then reduces to one of determining this distribution.

We now consider how to obtain an expression for this distribution, following the approach used in the forensic identification literature. This involves specifying both a superpopulation model, governing the probability process underlying the event $B \in S_r$, and a search method. Treating the record $r$ as fixed, we may specify the superpopulation model by specifying the distribution of the binary indicator variables $Z_i$ for whether $X_i$ matches $\tilde{X}_{A(r)}$ (for individuals $i$ in the population). The event $B \in S_r$ then corresponds to
the event $Z_B = 1$ and the assumption that $\tilde{X}_{A(r)}$ matches $X_{A(r)}$ corresponds to the event that $Z_A = 1$. A standard superpopulation model (e.g. Kingston, 1965; Dawid, 1994) which treats the size, $N$, of the population as fixed, is that the $Z_i$, $i \in U$, are independent and identically distributed Bernoulli trials with $p$ denoting the probability of a match. This implies that $F_r$ is Binomially distributed with parameters $N$ and $p$ and we refer to this as the Binomial model. The relation between these models and some models used in SDC will be considered in Section 6.

We shall treat $p$ as known, for simplicity, in the remainder of this section. In forensic identification applications, $p$ will often be estimated from a population database, possibly one from which a suspect has been selected. In SDC applications, $p$ might similarly be estimated from a database available to an intruder, but also from multiple records in the microdata, as will be discussed in Section 6. The latter option has no analogue in forensic identification.

In the following section, we set out a number of possible search methods considered in the forensic identification literature and discuss the nature of the conditional distribution for $F_r$ and the expression for the risk in (4) given these search methods and the Binomial model.

5.2 Search Methods from forensic identification literature

In this section, we describe a series of search methods, labelled r1, r2, ..., to signify that the search begins with a specified record $r$.

Search Method r1: suspect is selected by searching the population randomly until a match is found.

This method may be illustrated in the SDC context by the journalist scenario of Paass (1988), where a journalist selects a record from the microdata with an unusual
combination of values of the key variables and tries to find a known matching individual in the population by searching through sources accessible to the journalist until a match is found. The implicit assumption here is that the systematic element of the journalist's method of searching the population is fully captured by the matching rule and that, otherwise, the search is equally likely to lead to any one of the $F_r$ members of $S_r$. Under this search method, we may write

$P(F_r \mid A(r) \in S_r, B \in S_r, \text{search method}) = P(F_r \mid A(r) \in S_r, \text{search method})$

since, conditional on $A(r) \in S_r$, the event $B \in S_r$ is not informative about $F_r$ because some match must be found if we search long enough. The event $A(r) \in S_r$ tells us that $Z_A = 1$ but, under the Binomial model, is not informative about $Z_i$ for $i \neq A$ and so the conditional distribution of $F_r$ is obtained by writing $F_r = 1 + (F_r - 1)$ and noting that the conditional distribution of $F_r - 1$ given $A(r) \in S_r$ under this search method is Binomial with parameters $N - 1$ and $p$ (Lenth, 1986; Dawid, 1994, p.167). Straightforward calculation using the Binomial density shows that the expectation in (4) has the closed form expression:

identification risk = $[1 - (1 - p)^N] / [Np]$. (5)

An implicit assumption here is that $N$ and $p$ are known. A further assumption is that $y$, the number of nonmatches arising before the intruder finds a match, is unrecorded and hence not conditioned upon. The effect of recording $y$ will be considered in method r3.

Search Method r2: suspect is drawn at random from the population and found to match.

This method appears less plausible in the SDC context, since the expected payoff to a potential intruder seems likely to be too low if no search is undertaken. The nearest parallel appears to be the case of spontaneous recognition (Willenborg and de Waal,
2001, p.62) where an intruder happens, by chance, to observe a match between a microdata record and a known individual. For this search method, the event $B \in S_r$ is informative about $F_r$, making larger values of $F_r$ more likely. We may write:

$P(F_r \mid A(r) \in S_r, B \in S_r) \propto P(F_r \mid A(r) \in S_r) \, P(B \in S_r \mid F_r, A(r) \in S_r)$, (6)

where implicitly each term also conditions on the search method. The first term on the right hand side of (6) is the density function of $F_r \sim 1 + \mathrm{Bin}(N - 1, p)$, as for method r1. The second term equals $F_r / N$ since we assume the suspect is drawn randomly. We may interpret the implied distribution $P(F_r \mid A(r) \in S_r, B \in S_r)$ as a size-biased Binomial distribution (Dawid, 1994; Balding and Donnelly, 1995). It is straightforward to show that the constant of proportionality in (6) is $N / [1 + (N - 1)p]$ and hence that the conditional expectation in (4) takes the form:

identification risk = $1 / \{1 + (N - 1)p\}$. (7)

Search Method r3: as search method r1 but where the length of the search is recorded.

If $y$ is recorded then the event $B \in S_r$ does become informative about $F_r$, as for method r2. Indeed, if $y = 0$, methods r2 and r3 are identical. To obtain the conditional distribution of $F_r$ of interest, all components of expression (6) may be modified by including the event of $y$ previous nonmatches alongside the conditioning event $A(r) \in S_r$. This simply has the effect of replacing $N$ by $N - y$ in each of the terms on the right hand side of (6) and hence (c.f. Dawid, 1994; Balding and Donnelly, 1995) expression (7) is modified to:

identification risk = $1 / \{1 + (N - 1 - y)p\}$. (8)
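The closed forms (5), (7) and (8) are easy to evaluate numerically. The following Python sketch is an illustration only: the function names and the values of N, p and y are hypothetical choices, not taken from the paper. It also checks (5) against a direct summation of the expectation of 1/F when F is distributed as 1 + Bin(N - 1, p), which is the conditional distribution stated in the text for a random search that stops at the first match.

```python
from math import comb

def expected_inverse_direct(N, p):
    """E(1/F) for F = 1 + Bin(N - 1, p), summed term by term."""
    return sum(
        comb(N - 1, k) * p**k * (1 - p) ** (N - 1 - k) / (k + 1)
        for k in range(N)
    )

def risk_random_search(N, p):
    """Expression (5): random search until a match, search length unrecorded."""
    return (1 - (1 - p) ** N) / (N * p)

def risk_random_draw(N, p):
    """Expression (7): suspect drawn at random and found to match."""
    return 1 / (1 + (N - 1) * p)

def risk_recorded_search(N, p, y):
    """Expression (8): as (7), after y recorded nonmatches."""
    return 1 / (1 + (N - 1 - y) * p)

if __name__ == "__main__":
    N, p = 50, 0.1  # hypothetical population size and match probability
    print("direct sum   :", expected_inverse_direct(N, p))
    print("expression 5 :", risk_random_search(N, p))
    print("expression 7 :", risk_random_draw(N, p))
    for y in (0, 10, 40):
        print("expression 8, y =", y, ":", risk_recorded_search(N, p, y))
```

Under these assumptions the risk from (8) grows as y grows, and (5) is never smaller than (7), since the expectation of 1/F is at least the reciprocal of the expectation of F.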
Search Method r4: suspect is found to be unique match in a database.

If a search is made among $y + 1$ potential suspects in a database, the same probability calculations may be made as for method r3 with $y$ known (Balding and Donnelly, 1995) and so the identification risk is the same for these two methods. In the SDC context, this method corresponds again to the journalist scenario where the database represents a particular source available to the journalist.

These results for methods r3 and r4 have been subject to some debate in the forensic identification literature. Expression (8) implies that the greater the value of $y$, i.e. the longer the search, the greater the risk of identification, although this increase will tend to be minor if the fraction of the population searched, $y / N$, is small. This contrasts with an alternative argument, advanced for example by Stockmarr (1999), that the risk may be severely reduced by such a database search. See Dawid (2001) and Balding (2002) for some of the ensuing debate.

To illustrate this debate in an SDC context, suppose that a journalist claims to have found a unique match between a named individual and a record in a public use microdata file released by a statistical agency. On discussion, the journalist admits to having found the match by searching through a large database of 100,000 individuals. The agency might claim, following the alternative argument, that it is not surprising that a match has been found as a result of such an extensive search and argue that, as a result, little weight should be given to the observed match, i.e. the probability that the match is correct should be treated as small. This paper's position, following e.g. Balding (2002), is to suggest that such an argument would be misleading. It is true that the probability of finding a match does increase the longer the search and thus that the journalist's discovery may be unremarkable overall. Nevertheless, for the
particular record for which a match is found, the fact that a further proportion of the population has been searched for a match without success increases rather than decreases the probability that the match is correct. The issue is then whether the value of this increased probability for this record (i.e. expression (8) under the Binomial model, assuming $p$ is known) is of concern.

Search Method r5: method r1 is extended by continued searching.

If the search is continued without a further match being found then this method may be treated as equivalent to either methods r3 or r4, with $y$ equal to the number of nonmatches. If the continued search leads to another individual being found which matches, then Dawid (1994) provides an expression for the resulting risk, assuming $y$ is not recorded. In the extreme, if a complete search of the population revealed $F_r$, the number of individuals in the population matching $A$, the risk would again become $1/F_r$, as in (3).

5.3. Generalization: Search Methods for SDC

Attention was restricted to the case of a single record in the previous two sections. In the general SDC setting, however, there will be multiple records in the microdata. Possible extensions of the previous search methods to this case will be considered in this section and are summarised in Table 2. These extensions are of two types, termed fishing and directed searches by Paass (1988).

In a fishing method, the intruder first selects a record (or records) in the microdata, possibly a record that he/she expects to be easier to identify as a result of having unusual values or combinations of values of key variables. For example, Paass (1988) considers an expenditure survey, where an intruder might select an individual purchasing two or more boats. The intruder then seeks to find a match for this record using one of the
methods 1, 2, ... above. The judgement about the record being unusual might be based upon the microdata, in the extreme if the record is unique in the sample with respect to X (i.e. does not match any other record). We let 1u, 2u, ... denote the use of search methods 1, 2, ... for a record which is unique in this sense. We treat the case where the intruder selects multiple records for linkage as repetition of methods 1, 1u etc.

In what Paass (1988) refers to as a directed search, the intruder begins with a known target individual (or individuals) in the population and then searches for a match in the microdata. Out of six scenarios considered by Paass (1988), only one (the journalist scenario above) involves fishing. The remaining five are directed searches. In three of these, it is assumed that the intruder begins with a particular individual in the population and then searches the microdata file for a match. In the remaining two scenarios the intruder begins with a set of known individuals in the population and then seeks matches for each of these in the microdata file. Duncan and Lambert (1989), Lambert (1993) and Reiter (2005) also focus on the case of a directed search.

By interchanging the role of the known population individuals and the microdata records, the search methods in Section 5.2 may be transposed to the case of a directed search. We assume that any intruder who has managed to gain access to the microdata would search the whole file and would not stop at an intermediate stage, for example, at the first match to the target individual, B. We thus reject the counterparts of methods 1, 2 and 3 as unrealistic, since they involve either stopping (1 and 3) or no search at all (2). The counterpart of method 4, treating the microdata file as the counterpart of the database, is:
Search method B1: for a given target individual B, a unique matching microdata record is found.

The use of B in the notation B1 is intended to signify that the search begins with a specified known individual B. The intruder cannot search for matches among individuals falling outside the microdata sample and thus the counterpart of method 5 is rejected as impossible. We also reject methods which generate more than one match in a search of the microdata, on the grounds of restricting attention to worst cases.

It would be possible to qualify method B1 by some method for selecting the target individual. For example, a method which selected the individual as unique within a database might be denoted B1u. It would also be possible for the intruder to select more than one known individual for linkage, for example the set of individuals within a database, resulting in an effective repetition of method B1. We shall, however, only explore such extensions implicitly through consideration of B1.

5.4. Generalization: Risk assessment for SDC

In this section, we consider the generalization of the results on identification risk in Sections 5.1 and 5.2 to the case of SDC for the search methods discussed in Section 5.3. We also seek to compare these methods with respect to risk in order to narrow the class of search methods which it is reasonable for a disclosure risk assessor to consider. This is desirable in practice since dependence of the risk upon the search method complicates the task of the assessor, given that the intruder will generally be hypothetical and hence the search method unknown. We shall argue in this section that it is reasonable for the assessor to restrict attention among the search methods to 1u and B1 and their extensions
to repeated records or known individuals. We consider two types of search methods in turn, under the headings discussed in Section 5.3.

5.4.1. Fishing methods

We suppose first that the intruder begins by selecting a microdata record and then seeks a match in the population. The expressions for identification risk in Section 5.2 were derived for arbitrary records and hence will still apply provided the selection does not depend on some event which is informative about F and any information provided by other records is ignored.

Consider, following an example of Paass (1988), the case of an expenditure survey where there is a separate code in the microdata for individuals who purchase two or more boats. In one form of attack, an intruder might decide in advance to select any individual who falls into this category for a matching attempt on grounds of prior judgement that this is an unusual category. In this case, this selection is not dependent on any observed event and the expressions for identification risk in Section 5.2 will continue to apply, under the assumptions made there, provided we ignore observed data from other records. (This argument might be formalised under a given superpopulation model using the irrelevance of stopping rules, following Berger and Wolpert, 1984, p.74.) In a second form of attack, the intruder might seek an unusual category on the basis of observing the microdata, for example it might be observed that there is only one individual in the microdata who purchases two or more boats. Here, conditioning the risk on the search method (see (1)) corresponds to conditioning on this observed sample uniqueness. These two forms of attack correspond to the distinction between methods 1, 2, ... of Section 5.2 and methods 1u, 2u, ... of Section 5.3.
It follows from our definition in (1), however, that even in the case of method 1 we should condition the risk on $X_{\mathrm{microdata}}$, i.e. the information provided by other microdata records and, in particular, sample uniqueness if it occurs. Thus the risk for an individual in the microdata who was selected by method 1 and subsequently found to be sample unique should be the same as if the same individual was selected by the intruder using method 1u after observing sample uniqueness. The risk for method 1 will tend to be lower if it is observed that the individual is not sample unique and hence, if concern is with the worst cases, we may argue that it is sufficient to restrict attention to 1u.

In fact, if the sampling fraction is small, as is common in many SDC applications, sample uniqueness will not carry much information about F where p is given, since under the Binomial model F will be primarily determined by the behaviour of non-sample individuals. See section 6 for more detailed discussion of this point. We may therefore expect the risk for methods 1u, 2u, ... to be very similar to that for methods 1, 2, ... in these circumstances. For simplicity, we shall now compare risk for the latter methods and then infer that similar comparative properties will apply to the former methods. We suppose that the event of sample uniqueness represents the worst case, in terms of what microdata information the intruder might use to select a record for matching, and thus suppose that it is unnecessary to consider conditioning (1) on other features of $X_{\mathrm{microdata}}$.

Suppose then that one of the methods 1, 2, ..., 5 is employed and that the selection of the record is not informative so that the expressions for identification risk in Section 5.2 still apply. Note that these results also depend upon assumptions about the sampling scheme, discussed in Section 5.1. We now consider each of methods 2, ..., 5
in turn, comparing them with method 1, and argue that it is reasonable to restrict attention amongst the search methods 1 to 5 just to method 1.

Consider first search method 2. We have already suggested in Section 5.2 that this method is less plausible than the other methods. Moreover, search method 1 will, in general, lead to higher risk than method 2 since the size biasing in the latter method makes large values of F more likely and these are associated with lower risk. Balding and Donnelly (1995) give an example where N=101, p=0.004 and the expressions in (5) and (7) are [...] and [...] respectively. Thus, disregarding method 2 but considering method 1 will be a conservative approach to risk assessment.

Methods 3 and 4 may lead to slightly higher risk than method 2, but the risk will only be higher than method 1 if a substantial proportion of the population is searched. For example, if N=101 and p=0.004 then $1/\{1 + (N - 1 - y)p\} > [...]$ requires $y \geq 48$, i.e. almost half the population must be searched. Indeed, using the approximation that N is large, p is small and $Np \ll 1$ considered in Balding and Donnelly (1995), it will in general be necessary for at least half the population to be searched (i.e. $y/N > 0.5$) for methods 3 or 4 to lead to a higher risk than method 1. Principles governing SDC often enable such disproportionate amounts of intruder information to be ruled out. For example, the National Statistics Code of Practice (National Statistics, 2004, pp.7, 8) states, in relation to the use of SDC methods, that assumptions about the information likely to be available to third parties should be made against the following standard: it would take a disproportionate amount of time, effort and expertise for an intruder to identify a statistical unit to others, or to reveal information about that unit not already in the public domain.
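The way the risk for methods 3 and 4 grows with the size of the search can be checked numerically. The following is a minimal Python sketch, assuming the form $1/\{1 + (N - 1 - y)p\}$ for the risk after y non-matches, as it appears in the inequality above. The function names and the comparison threshold of 0.8 are illustrative assumptions and are not taken from the paper.

```python
def risk_after_search(N, p, y):
    """Risk 1 / {1 + (N - 1 - y) p} after y searched individuals fail to match.

    N is the population size, p the match probability under the Binomial
    model, and y the number of non-matching individuals already searched.
    """
    if not 0 <= y <= N - 1:
        raise ValueError("y must lie between 0 and N - 1")
    return 1.0 / (1.0 + (N - 1 - y) * p)


def min_search_size(N, p, threshold):
    """Smallest y for which the risk strictly exceeds `threshold`.

    Returns None if no search size up to N - 1 exceeds the threshold.
    """
    for y in range(N):
        if risk_after_search(N, p, y) > threshold:
            return y
    return None


# Example with the Balding and Donnelly values of N and p.
# The threshold 0.8 is a hypothetical comparison value, not a figure from the paper.
N, p = 101, 0.004
risk_no_search = risk_after_search(N, p, 0)
risk_full_search = risk_after_search(N, p, N - 1)
y_needed = min_search_size(N, p, 0.8)
```

With these inputs the risk rises monotonically from its value with no search to 1 when every other individual has been searched and excluded, so a comparison risk close to 1 can only be exceeded after a large fraction of the population has been searched.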
Method 5 may also be disregarded on the grounds that the only relevant cases under this method reduce to those under methods 3 and 4 since it seems reasonable to discount the possibility of the intruder reporting that they have found a second match, because this would be expected to substantially reduce the risk by ruling out the possibility of population uniqueness. The resulting risk would be at most 0.5. For example, for the case N=101, p=0.004, expression (4.8) in Dawid (1994) implies the risk is [...].

Finally, let us turn to methods 1u-5u. As discussed earlier, we may expect the risk for these methods to be similar to that for methods 1-5 for a given selected individual and thus we suggest the above argument for restricting attention to 1 may be extended to justify restricting attention to 1u out of the former methods. As noted earlier, it is appropriate to condition the risk for 1 on the observed occurrence or otherwise of sample uniqueness and, taking the worst case, since the risk of 1 given sample uniqueness is the same as the risk for 1u, we argue it is sufficient to restrict attention to the latter method.

5.4.2. Directed Searches

Turning to method B1, we note first that it is isomorphic to method 4 if we interchange the role of the microdata and the database. Under this isomorphism, the indicator variables $Z_i$ are translated into variables $\tilde{Z}_{Bi}$ for individuals $i \in U$, indicating whether $X_i$ matches $\tilde{X}_B$. For individuals i outside the microdata sample, $X_i$ is defined to contain the values of the key variables which would be recorded in the microdata if i were selected into the sample. It is assumed that $\tilde{Z}_{BB} = 1$. The corresponding Binomial model is that the $\tilde{Z}_{Bi}$ are independent and identically distributed Bernoulli trials with $\tilde{p}$ denoting the probability of a match.
It then follows, as above, that under this new Binomial model, the identification risk is given by

$$\text{identification risk} = 1/\{1 + (N - n)\tilde{p}\}, \qquad (9)$$

where n is the size of the microdata sample. We expect that, for the same individual, $p$ and $\tilde{p}$ will be of similar magnitude in many practical applications. As discussed in the previous section, expression (9) (with $\tilde{p} = p$) will only be greater than the risk for method 1 if the sampling fraction, $n/N$, is high, roughly greater than 0.5. Since we expect the risks for 1u and 1 to be similar, we expect that in cases with small sampling fractions, it will usually be reasonable for the disclosure risk assessor to disregard B1 in favour of 1u.

6. An Application with Categorical Key Variables and No Misclassification

We now illustrate the assessment of risk in one kind of SDC application which arises with sample microdata from population censuses or social surveys. It is assumed that the key variables are categorical and identically measured in the two sources, with linkage based upon exact matching, i.e. example (i) of section 4. In this case, we label the combinations of categories of the key variables by $x$ so that the earlier expression X for the key variables may now take the integer values $x = 1, \ldots, K$. These combinations may be interpreted as cells in a multiway contingency table. The Binomial model considered earlier implies a multinomial model for this contingency table. Since we assume that $\tilde{X}$ is identical to $X$ and that linkage is based upon exact matching, the Binomial model in section 5.1 for a given record with $X = x$ implies that the events $X_i = x$ for different population units $i \in U$ are independent and identically distributed with

$$P(X_i = x) = p_x,$$
where the subscript $x$ is added to the probability p to indicate that this model relates to the event $X_i = x$. Assuming that the Binomial model holds for all records with all possible values $x = 1, \ldots, K$, it follows that the $X_i$ are independent and identically distributed random variables with

$$P(X_i = x) = p_x, \quad x = 1, \ldots, K, \qquad \sum_{x=1}^{K} p_x = 1.$$

(Since $\tilde{X} = X$ and hence $\tilde{p} = p$, this model is also a consequence of the Binomial model in section 5.4.2.) The population counts $F_x$ in the cells thus follow a multinomial distribution with parameters $p_x$ and $N$, $x = 1, \ldots, K$. A related model, more common in the SDC literature, is the Poisson model where the $F_x$ are independently distributed as $F_x \sim \text{Poisson}(\lambda_x)$. The multinomial model can, in fact, be derived from the Poisson model by conditioning on $N = \sum_x F_x$ and setting $p_x = \lambda_x / \sum_x \lambda_x$ (McCullagh and Nelder, 1989, p.165). Even unconditionally, it may be argued that the two models have very similar SDC consequences when the $p_x$ are small and N is large (Chen and Keller-McNulty, 1998).

In practice, the $p_x$ are unknown, but inference about them may be made using the multiple records of the microdata. As discussed in section 5.1, we may suppose that an intruder could not know the values of the $F_x$ but he/she may be expected to be able to compute the corresponding sample counts $f_x$ from the multiple microdata records. In typical SDC applications, interest will focus on the riskiest cells where $f_x$ is small, say one or two (the values of $p_x$ for empty cells with $f_x = 0$ will not be of interest since these cells contain no microdata records susceptible to identification). The data within a cell with such a small value of $f_x$ will, however, carry little information, on its own,
about $p_x$. For the model to be useful for risk assessment, it is therefore natural to consider borrowing information between cells by modelling the relation between the $p_x$ in different cells. One approach is to consider a compound model, such as a Poisson-gamma model (Bethlehem et al., 1990) where the $\lambda_x$, $x = 1, \ldots, K$, are independent and identically gamma distributed, or a Dirichlet-multinomial model where the $p_x$ follow a Dirichlet distribution. Such models imply that the identification risk is the same for each microdata record, since they treat all cells as exchangeable and make no use of the key variable characteristics used to construct the cells. Such characteristics may be conditioned upon in a log-linear model, relating $p_x$ or $\lambda_x$ to main effects and interactions between the key variables (Skinner and Holmes, 1998; Elamir and Skinner, 2006), in order to obtain more realistic probabilities of identification, which may vary across cells.

We now illustrate this with a numerical example, drawing on Skinner and Shlomo (2005). The data come from the 2001 United Kingdom Census for two large areas with a combined size of $N \approx 950{,}000$ individuals. A simple random sample of size $n \approx 0.005N \approx 4{,}750$ is drawn from this population to mimic a sample survey. The advantage of using census data is that the population characteristics can be used to validate sample-based procedures. The following six key variables (with numbers of categories in parentheses) are used: area (2), sex (2), age band (18), marital status (6), ethnicity (17) and economic activity (10). The categories are the same as those used for the Samples of Anonymised Records from the census. See Dale and Elliot (2001) for a discussion of the choice of key variables in similar settings. The number of key variable combinations is thus K = 2 × 2 × 18 × 6 × 17 × 10 = 73,440.
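The arithmetic behind this example can be reproduced in a few lines. The sketch below is illustrative only: it takes the category counts, the approximate population size and the sampling fraction from the text, and all variable and function names are hypothetical. It also reports the average number of population and sample individuals per cell, which gives a rough sense of why individual cells carry little information on their own.

```python
from math import prod

# Numbers of categories for the six key variables, as listed in the text.
KEY_VARIABLE_CATEGORIES = {
    "area": 2,
    "sex": 2,
    "age band": 18,
    "marital status": 6,
    "ethnicity": 17,
    "economic activity": 10,
}

N = 950_000                # approximate combined population size from the text
SAMPLING_FRACTION = 0.005  # sampling fraction n / N from the text


def number_of_cells(categories):
    """Number of cells K in the full cross-classification of the key variables."""
    return prod(categories.values())


K = number_of_cells(KEY_VARIABLE_CATEGORIES)
n = round(SAMPLING_FRACTION * N)

# Average counts per cell if individuals were spread evenly over all K cells.
# Real tables are far from even, so these are only order-of-magnitude guides.
mean_population_count = N / K
mean_sample_count = n / K

print(f"K = {K}, n = {n}")
print(f"mean population count per cell = {mean_population_count:.2f}")
print(f"mean sample count per cell = {mean_sample_count:.3f}")
```

Because the sample contains far fewer individuals than there are cells, most cells are necessarily empty in the sample, which is consistent with the need to borrow information across cells through a model.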
We assume the multinomial model above, that is that the population counts $F_x$ in the cells of the six-way contingency table formed by cross-classifying the key variables are generated by a multinomial distribution with parameters N and $p_x$, $x = 1, \ldots, K$. As above, we suppose the $F_x$ are unknown but the corresponding sample counts $f_x$ are known ($\sum_x F_x = N$, $\sum_x f_x = n$) and may be used to make inference about the parameters $p_x$. We suppose that such inference is conducted using a log-linear model for $p_x$ including all main effects and two-way interactions (e.g. Agresti, 2002). Using the population data for validation, this model has been found to generate reasonable disclosure risk assessments both for these data (Skinner and Shlomo, 2005) and similar data sources (Skinner and Holmes, 1998; Elamir and Skinner, 2006).

Let $\hat{p}_x$ denote the maximum likelihood estimate of $p_x$ under this multinomial log-linear model. In Table 3 we present values of $\hat{p}_x$ for three individuals selected from the sample. We consider only the 739 sample unique cells, i.e. cells where $f_x = 1$, to continue our worst case analysis, and select those sample unique individuals with the minimum, median and maximum values of $\hat{p}_x$ across these 739 cells. A comparison of the second and third columns in the table shows how the values $\hat{p}_x$ could help the intruder infer which of the sample uniques are likely to have smaller (or larger) values of $F_x$. For example, individuals in ethnic minority groups tend to fall into cells with smaller values of $F_x$ and this is picked up by the model through the main effect term for the ethnic group. Unusual combinations of pairs of key variables, e.g. widowed [...] year olds, are picked up through the two-way interaction terms in the model. Impossible two-way combinations,
e.g. married 0-4 year olds, can also be handled in the model and will, of course, not appear in the sample.

Table 3 also includes estimates of identification risk for these three individuals under different assumptions about the intruder's search method. Considering first search method 1 and replacing p by $\hat{p}_x$ in expression (5) gives risk estimates of [...], [...] and [...] for the sample unique individuals with minimum, median and maximum values of $\hat{p}_x$ respectively. We might conclude that the release of the sample microdata is not likely to identify the second and third individuals, in the language of the Code of Practice. However, the risk for the first individual appears high. In fact, the first individual is not population unique. There are five women in the second area in the population who are recorded as being aged 20-24, of separated marital status, in the Bangladeshi ethnic group and with looking after home as their economic activity. Out of the ten sample unique individuals with the lowest values of $\hat{p}_x$ just three turn out to be population unique so the risk estimate of [...] might be judged somewhat high. This raises questions about the choice of the log-linear model and the estimation method which we shall not pursue here. Our focus is on the comparison of risk estimates for different search methods, treating these values of $\hat{p}_x$ as realistic and given.

The above risk estimates for method 1 only use the microdata to estimate $p_x$ and ignore the information that the individuals are sample unique. As discussed in Section 5.4.1, conditioning on sample uniqueness is equivalent to considering method 1u. The microdata sample is obtained by simple random sampling of size n (without replacement) so, under the multinomial model, the conditional distribution of $F_x$ given $f_x = 1$ may be
More informationDefine What Type of Trader Are you?
Define What Type of Tade Ae you? Boke Nightmae srs Tend Ride By Vladimi Ribakov Ceato of Pips Caie 20 of June 2010 1 Disclaime and Risk Wanings Tading any financial maket involves isk. The content of this
More informationThe Predictive Power of Dividend Yields for Stock Returns: Risk Pricing or Mispricing?
The Pedictive Powe of Dividend Yields fo Stock Retuns: Risk Picing o Mispicing? Glenn Boyle Depatment of Economics and Finance Univesity of Cantebuy Yanhui Li Depatment of Economics and Finance Univesity
More informationValuation of Floating Rate Bonds 1
Valuation of Floating Rate onds 1 Joge uz Lopez us 316: Deivative Secuities his note explains how to value plain vanilla floating ate bonds. he pupose of this note is to link the concepts that you leaned
More informationEffect of Contention Window on the Performance of IEEE 802.11 WLANs
Effect of Contention Window on the Pefomance of IEEE 82.11 WLANs Yunli Chen and Dhama P. Agawal Cente fo Distibuted and Mobile Computing, Depatment of ECECS Univesity of Cincinnati, OH 452213 {ychen,
More informationPAN STABILITY TESTING OF DC CIRCUITS USING VARIATIONAL METHODS XVIII  SPETO  1995. pod patronatem. Summary
PCE SEMINIUM Z PODSTW ELEKTOTECHNIKI I TEOII OBWODÓW 8  TH SEMIN ON FUNDMENTLS OF ELECTOTECHNICS ND CICUIT THEOY ZDENĚK BIOLEK SPŠE OŽNO P.., CZECH EPUBLIC DLIBO BIOLEK MILITY CDEMY, BNO, CZECH EPUBLIC
More informationHow to create RAID 1 mirroring with a hard disk that already has data or an operating system on it
AnswesThatWok TM How to set up a RAID1 mio with a dive which aleady has Windows installed How to ceate RAID 1 mioing with a had disk that aleady has data o an opeating system on it Date Company PC / Seve
More informationFaithful Comptroller s Handbook
Faithful Comptolle s Handbook Faithful Comptolle s Handbook Selection of Faithful Comptolle The Laws govening the Fouth Degee povide that the faithful comptolle be elected, along with the othe offices
More informationTHE CARLO ALBERTO NOTEBOOKS
THE CARLO ALBERTO NOTEBOOKS Meanvaiance inefficiency of CRRA and CARA utility functions fo potfolio selection in defined contibution pension schemes Woking Pape No. 108 Mach 2009 Revised, Septembe 2009)
More informationNBER WORKING PAPER SERIES FISCAL ZONING AND SALES TAXES: DO HIGHER SALES TAXES LEAD TO MORE RETAILING AND LESS MANUFACTURING?
NBER WORKING PAPER SERIES FISCAL ZONING AND SALES TAXES: DO HIGHER SALES TAXES LEAD TO MORE RETAILING AND LESS MANUFACTURING? Daia Bunes David Neumak Michelle J. White Woking Pape 16932 http://www.nbe.og/papes/w16932
More informationHow to recover your Exchange 2003/2007 mailboxes and emails if all you have available are your PRIV1.EDB and PRIV1.STM Information Store database
AnswesThatWok TM Recoveing Emails and Mailboxes fom a PRIV1.EDB Exchange 2003 IS database How to ecove you Exchange 2003/2007 mailboxes and emails if all you have available ae you PRIV1.EDB and PRIV1.STM
More informationLiquidity and Insurance for the Unemployed*
Fedeal Reseve Bank of Minneapolis Reseach Depatment Staff Repot 366 Decembe 2005 Liquidity and Insuance fo the Unemployed* Robet Shime Univesity of Chicago and National Bueau of Economic Reseach Iván Wening
More informationMULTIPLE SOLUTIONS OF THE PRESCRIBED MEAN CURVATURE EQUATION
MULTIPLE SOLUTIONS OF THE PRESCRIBED MEAN CURVATURE EQUATION K.C. CHANG AND TAN ZHANG In memoy of Pofesso S.S. Chen Abstact. We combine heat flow method with Mose theoy, supe and subsolution method with
More informationCloud Service Reliability: Modeling and Analysis
Cloud Sevice eliability: Modeling and Analysis YuanShun Dai * a c, Bo Yang b, Jack Dongaa a, Gewei Zhang c a Innovative Computing Laboatoy, Depatment of Electical Engineeing & Compute Science, Univesity
More informationImproving Network Security Via CyberInsurance A Market Analysis
1 Impoving Netwok Secuity Via CybeInsuance A Maket Analysis RANJAN PAL, LEANA GOLUBCHIK, KONSTANTINOS PSOUNIS Univesity of Southen Califonia PAN HUI Hong Kong Univesity of Science and Technology Recent
More informationStrategic Asset Allocation and the Role of Alternative Investments
Stategic Asset Allocation and the Role of Altenative Investments DOUGLAS CUMMING *, LARS HELGE HAß, DENIS SCHWEIZER Abstact We intoduce a famewok fo stategic asset allocation with altenative investments.
More informationOffice of Family Assistance. Evaluation Resource Guide for Responsible Fatherhood Programs
Office of Family Assistance Evaluation Resouce Guide fo Responsible Fathehood Pogams Contents Intoduction........................................................ 4 Backgound..........................................................
More informationA Capacitated Commodity Trading Model with Market Power
A Capacitated Commodity Tading Model with Maket Powe Victo MatínezdeAlbéniz Josep Maia Vendell Simón IESE Business School, Univesity of Navaa, Av. Peason 1, 08034 Bacelona, Spain VAlbeniz@iese.edu JMVendell@iese.edu
More informationA framework for the selection of enterprise resource planning (ERP) system based on fuzzy decision making methods
A famewok fo the selection of entepise esouce planning (ERP) system based on fuzzy decision making methods Omid Golshan Tafti M.s student in Industial Management, Univesity of Yazd Omidgolshan87@yahoo.com
More informationResearch and the Approval Process
Reseach and the Appoval Pocess Emeic Heny y Maco Ottaviani z Febuay 2014 Abstact An agent sequentially collects infomation to obtain a pincipal s appoval, such as a phamaceutical company seeking FDA appoval
More informationExperimentation under Uninsurable Idiosyncratic Risk: An Application to Entrepreneurial Survival
Expeimentation unde Uninsuable Idiosyncatic Risk: An Application to Entepeneuial Suvival Jianjun Miao and Neng Wang May 28, 2007 Abstact We popose an analytically tactable continuoustime model of expeimentation
More informationHEALTHCARE INTEGRATION BASED ON CLOUD COMPUTING
U.P.B. Sci. Bull., Seies C, Vol. 77, Iss. 2, 2015 ISSN 22863540 HEALTHCARE INTEGRATION BASED ON CLOUD COMPUTING Roxana MARCU 1, Dan POPESCU 2, Iulian DANILĂ 3 A high numbe of infomation systems ae available
More informationReview Graph based Online Store Review Spammer Detection
Review Gaph based Online Stoe Review Spamme Detection Guan Wang, Sihong Xie, Bing Liu, Philip S. Yu Univesity of Illinois at Chicago Chicago, USA gwang26@uic.edu sxie6@uic.edu liub@uic.edu psyu@uic.edu
More informationMoney Market Funds Intermediation and Bank Instability
Fedeal Reseve Bank of New Yok Staff Repots Money Maket Funds Intemediation and Bank Instability Maco Cipiani Antoine Matin Buno M. Paigi Staff Repot No. 599 Febuay 013 Revised May 013 This pape pesents
More informationInstructions to help you complete your enrollment form for HPHC's Medicare Supplemental Plan
Instuctions to help you complete you enollment fom fo HPHC's Medicae Supplemental Plan Thank you fo applying fo membeship to HPHC s Medicae Supplement plan. Pio to submitting you enollment fom fo pocessing,
More informationSemipartial (Part) and Partial Correlation
Semipatial (Pat) and Patial Coelation his discussion boows heavily fom Applied Multiple egession/coelation Analysis fo the Behavioal Sciences, by Jacob and Paticia Cohen (975 edition; thee is also an updated
More informationarxiv:1110.2612v1 [qfin.st] 12 Oct 2011
Maket inefficiency identified by both single and multiple cuency tends T.Toká 1, and D. Hováth 1, 1 Sos Reseach a.s., Stojáenská 3, 040 01 Košice, Slovak Republic Abstact axiv:1110.2612v1 [qfin.st] 12
More informationNontrivial lower bounds for the least common multiple of some finite sequences of integers
J. Numbe Theoy, 15 (007), p. 393411. Nontivial lowe bounds fo the least common multiple of some finite sequences of integes Bai FARHI bai.fahi@gmail.com Abstact We pesent hee a method which allows to
More informationCoordinate Systems L. M. Kalnins, March 2009
Coodinate Sstems L. M. Kalnins, Mach 2009 Pupose of a Coodinate Sstem The pupose of a coodinate sstem is to uniquel detemine the position of an object o data point in space. B space we ma liteall mean
More informationApproximation Algorithms for Data Management in Networks
Appoximation Algoithms fo Data Management in Netwoks Chistof Kick Heinz Nixdof Institute and Depatment of Mathematics & Compute Science adebon Univesity Gemany kueke@upb.de Haald Räcke Heinz Nixdof Institute
More informationDatabase Management Systems
Contents Database Management Systems (COP 5725) D. Makus Schneide Depatment of Compute & Infomation Science & Engineeing (CISE) Database Systems Reseach & Development Cente Couse Syllabus 1 Sping 2012
More informationResearch on Risk Assessment of the Transformer Based on Life Cycle Cost
ntenational Jounal of Smat Gid and lean Enegy eseach on isk Assessment of the Tansfome Based on Life ycle ost Hui Zhou a, Guowei Wu a, Weiwei Pan a, Yunhe Hou b, hong Wang b * a Zhejiang Electic Powe opoation,
More informationExam #1 Review Answers
xam #1 Review Answes 1. Given the following pobability distibution, calculate the expected etun, vaiance and standad deviation fo Secuity J. State Pob (R) 1 0.2 10% 2 0.6 15 3 0.2 20 xpected etun = 0.2*10%
More informationDeflection of Electrons by Electric and Magnetic Fields
Physics 233 Expeiment 42 Deflection of Electons by Electic and Magnetic Fields Refeences Loain, P. and D.R. Coson, Electomagnetism, Pinciples and Applications, 2nd ed., W.H. Feeman, 199. Intoduction An
More informationTrading Volume and Serial Correlation in Stock Returns in Pakistan. Abstract
Tading Volume and Seial Coelation in Stock Retuns in Pakistan Khalid Mustafa Assistant Pofesso Depatment of Economics, Univesity of Kaachi email: khalidku@yahoo.com and Mohammed Nishat Pofesso and Chaiman,
More informationUncertainty Associated with Microbiological Analysis
Appendix J STWG Pat 3 Uncetainty 7806 Page 1 of 31 Uncetainty Associated with Micobiological Analysis 1. Intoduction 1.1. Thee ae only two absolute cetainties in life: death and taxes! Whateve task we
More informationTowards Realizing a Low Cost and Highly Available Datacenter Power Infrastructure
Towads Realizing a Low Cost and Highly Available Datacente Powe Infastuctue Siam Govindan, Di Wang, Lydia Chen, Anand Sivasubamaniam, and Bhuvan Ugaonka The Pennsylvania State Univesity. IBM Reseach Zuich
More informationDebt Shifting in Europe
Debt Shifting in Euope Fancesca Baion Paolo Panteghini Univesità di Bescia Ra aele Miniaci Univesità di Bescia Maia Laua Paisi Univesità di Bescia Mach 1, 011 Abstact This aticle aims at analyzing the
More informationDefinitions and terminology
I love the Case & Fai textbook but it is out of date with how monetay policy woks today. Please use this handout to supplement the chapte on monetay policy. The textbook assumes that the Fedeal Reseve
More informationModeling and Verifying a Price Model for Congestion Control in Computer Networks Using PROMELA/SPIN
Modeling and Veifying a Pice Model fo Congestion Contol in Compute Netwoks Using PROMELA/SPIN Clement Yuen and Wei Tjioe Depatment of Compute Science Univesity of Toonto 1 King s College Road, Toonto,
More informationYARN PROPERTIES MEASUREMENT: AN OPTICAL APPROACH
nd INTERNATIONAL TEXTILE, CLOTHING & ESIGN CONFERENCE Magic Wold of Textiles Octobe 03 d to 06 th 004, UBROVNIK, CROATIA YARN PROPERTIES MEASUREMENT: AN OPTICAL APPROACH Jana VOBOROVA; Ashish GARG; Bohuslav
More informationTiming Synchronization in High Mobility OFDM Systems
Timing Synchonization in High Mobility OFDM Systems Yasamin Mostofi Depatment of Electical Engineeing Stanfod Univesity Stanfod, CA 94305, USA Email: yasi@wieless.stanfod.edu Donald C. Cox Depatment of
More informationHow do investments in heat pumps affect household energy consumption?
Discussion Papes Statistics Noway Reseach depatment No. 737 Apil 203 Bente Halvosen and Bodil Meethe Lasen How do investments in heat pumps affect household enegy consumption? Discussion Papes No. 737,
More informationDYNAMICS AND STRUCTURAL LOADING IN WIND TURBINES
DYNAMIS AND STRUTURAL LOADING IN WIND TURBINES M. Ragheb 12/30/2008 INTRODUTION The loading egimes to which wind tubines ae subject to ae extemely complex equiing special attention in thei design, opeation
More informationTHE DISTRIBUTED LOCATION RESOLUTION PROBLEM AND ITS EFFICIENT SOLUTION
IADIS Intenational Confeence Applied Computing 2006 THE DISTRIBUTED LOCATION RESOLUTION PROBLEM AND ITS EFFICIENT SOLUTION Jög Roth Univesity of Hagen 58084 Hagen, Gemany Joeg.Roth@Fenunihagen.de ABSTRACT
More information2 r2 θ = r2 t. (3.59) The equal area law is the statement that the term in parentheses,
3.4. KEPLER S LAWS 145 3.4 Keple s laws You ae familia with the idea that one can solve some mechanics poblems using only consevation of enegy and (linea) momentum. Thus, some of what we see as objects
More informationChoosing the best hedonic product represents a challenging
Thosten HennigThuau, Andé Machand, & Paul Max Can Automated Systems Help Consumes Make Bette Choices? Because hedonic poducts consist pedominantly of expeience attibutes, often with many available altenatives,
More information