Similarity Estimation Techniques from Rounding Algorithms


Moses S. Charikar
Dept. of Computer Science
Princeton University
35 Olden Street
Princeton, NJ

ABSTRACT

A locality sensitive hashing scheme is a distribution on a family $\mathcal{F}$ of hash functions operating on a collection of objects, such that for two objects $x, y$,
$$\Pr_{h \in \mathcal{F}}[h(x) = h(y)] = \mathrm{sim}(x, y),$$
where $\mathrm{sim}(x, y) \in [0, 1]$ is some similarity function defined on the collection of objects. Such a scheme leads to a compact representation of objects so that similarity of objects can be estimated from their compact sketches, and also leads to efficient algorithms for approximate nearest neighbor search and clustering. Min-wise independent permutations provide an elegant construction of such a locality sensitive hashing scheme for a collection of subsets with the set similarity measure $\mathrm{sim}(A, B) = \frac{|A \cap B|}{|A \cup B|}$.

We show that rounding algorithms for LPs and SDPs used in the context of approximation algorithms can be viewed as locality sensitive hashing schemes for several interesting collections of objects. Based on this insight, we construct new locality sensitive hashing schemes for:

1. A collection of vectors with the distance between $\vec{u}$ and $\vec{v}$ measured by $\theta(\vec{u}, \vec{v})/\pi$, where $\theta(\vec{u}, \vec{v})$ is the angle between $\vec{u}$ and $\vec{v}$. This yields a sketching scheme for estimating the cosine similarity measure between two vectors, as well as a simple alternative to minwise independent permutations for estimating set similarity.

2. A collection of distributions on $n$ points in a metric space, with distance between distributions measured by the Earth Mover Distance (EMD), a popular distance measure in graphics and vision. Our hash functions map distributions to points in the metric space such that, for distributions $P$ and $Q$,
$$\mathrm{EMD}(P, Q) \le \mathbf{E}_{h \in \mathcal{F}}[d(h(P), h(Q))] \le O(\log n \log\log n) \cdot \mathrm{EMD}(P, Q).$$

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. STOC '02, May 19-21, 2002, Montreal, Quebec, Canada. Copyright 2002 ACM.

1. INTRODUCTION

The current information explosion has resulted in an increasing number of applications that need to deal with large volumes of data. While traditional algorithm analysis assumes that the data fits in main memory, it is unreasonable to make such assumptions when dealing with massive data sets such as data from phone calls collected by phone companies, multimedia data, web page repositories and so on. This new setting has resulted in an increased interest in algorithms that process the input data in restricted ways, including sampling a few data points, making only a few passes over the data, and constructing a succinct sketch of the input which can then be efficiently processed.

There has been a lot of recent work on streaming algorithms, i.e. algorithms that produce an output by making one pass (or a few passes) over the data while using a limited amount of storage space and time. To cite a few examples, Alon et al [2] considered the problem of estimating frequency moments and Guha et al [25] considered the problem of clustering points in a streaming fashion. Many of these streaming algorithms need to represent important aspects of the data they have seen so far in a small amount of space; in other words they maintain a compact sketch of the data that encapsulates the relevant properties of the data set. Indeed, some of these techniques lead to sketching algorithms: algorithms that produce a compact sketch of a data set so that various measurements on the original data set can be estimated by efficient computations on the compact sketches. Building on the ideas of [2], Alon et al [1] give algorithms for estimating join sizes. Gibbons and Matias [18] give sketching algorithms producing so called synopsis data structures for various problems including maintaining approximate histograms, hot lists and so on.
Gilbert et al [19] give algorithms to compute sketches for data streams so as to estimate any linear projection of the data and use this to get individual point and range estimates. Recently, Gilbert et al [21] gave efficient algorithms for the dynamic maintenance of histograms. Their algorithm processes a stream of updates and maintains a small sketch of the data from which the optimal histogram representation can be approximated very quickly.

In this work, we focus on sketching algorithms for estimating similarity, i.e. the construction of functions that produce succinct sketches of objects in a collection, such that the similarity of objects can be estimated efficiently from their sketches. Here, similarity $\mathrm{sim}(x, y)$ is a function that maps

pairs of objects $x, y$ to a number in $[0, 1]$, measuring the degree of similarity between $x$ and $y$. $\mathrm{sim}(x, y) = 1$ corresponds to objects $x, y$ that are identical while $\mathrm{sim}(x, y) = 0$ corresponds to objects that are very different.

Broder et al [8, 5, 7, 6] introduced the notion of min-wise independent permutations, a technique for constructing such sketching functions for a collection of sets. The similarity measure considered there was
$$\mathrm{sim}(A, B) = \frac{|A \cap B|}{|A \cup B|}.$$
We note that this is exactly the Jaccard coefficient of similarity used in information retrieval.

The min-wise independent permutation scheme allows the construction of a distribution on hash functions $h : 2^U \to U$ such that
$$\Pr_{h \in \mathcal{F}}[h(A) = h(B)] = \mathrm{sim}(A, B).$$
Here $\mathcal{F}$ denotes the family of hash functions (with an associated probability distribution) operating on subsets of the universe $U$. By choosing say $t$ hash functions $h_1, \ldots, h_t$ from this family, a set $S$ could be represented by the hash vector $(h_1(S), \ldots, h_t(S))$. Now, the similarity between two sets can be estimated by counting the number of matching coordinates in their corresponding hash vectors.¹

The work of Broder et al was originally motivated by the application of eliminating near-duplicate documents in the Altavista index. Representing documents as sets of features with similarity between sets determined as above, the hashing technique provided a simple method for estimating similarity of documents, thus allowing the original documents to be discarded and reducing the input size significantly.

In fact, the minwise independent permutations hashing scheme is a particular instance of a locality sensitive hashing scheme introduced by Indyk and Motwani [31] in their work on nearest neighbor search in high dimensions.

Definition 1. A locality sensitive hashing scheme is a distribution on a family $\mathcal{F}$ of hash functions operating on a collection of objects, such that for two objects $x, y$,
$$\Pr_{h \in \mathcal{F}}[h(x) = h(y)] = \mathrm{sim}(x, y) \qquad (1)$$
Here $\mathrm{sim}(x, y)$ is some similarity function defined on the collection of objects.
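The hash-vector estimate described above can be sketched in a few lines. This is a minimal illustration, not the construction of Broder et al: it assumes a small explicit universe and uses fully random permutations (via `random.shuffle`) in place of a compact min-wise independent family. The names `make_minhash_family`, `sketch` and `estimate_similarity` are hypothetical.

```python
import random

def make_minhash_family(universe, t, seed=0):
    """Draw t random permutations of the universe, each stored as element -> rank."""
    rng = random.Random(seed)
    items = list(universe)
    family = []
    for _ in range(t):
        order = items[:]
        rng.shuffle(order)
        family.append({x: rank for rank, x in enumerate(order)})
    return family

def sketch(s, family):
    """Hash vector (h_1(S), ..., h_t(S)) with h(S) = min of the permuted values of S."""
    return [min(rank[x] for x in s) for rank in family]

def estimate_similarity(sk_a, sk_b):
    """Fraction of coordinates on which the two hash vectors agree."""
    return sum(a == b for a, b in zip(sk_a, sk_b)) / len(sk_a)

def jaccard(a, b):
    return len(a & b) / len(a | b)

family = make_minhash_family(range(100), t=2000, seed=1)
A = set(range(0, 60))
B = set(range(30, 90))
est = estimate_similarity(sketch(A, family), sketch(B, family))
print("exact:", jaccard(A, B), "estimated:", est)
```

With $t$ hash functions the estimate is an average of $t$ independent indicator variables, so its standard deviation is at most $1/(2\sqrt{t})$.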
Given a hash function family $\mathcal{F}$ that satisfies (1), we will say that $\mathcal{F}$ is a locality sensitive hash function family corresponding to similarity function $\mathrm{sim}(x, y)$. Indyk and Motwani showed that such a hashing scheme facilitates the construction of efficient data structures for answering approximate nearest-neighbor queries on the collection of objects.

In particular, using the hashing scheme given by minwise independent permutations results in efficient data structures for set similarity queries and leads to efficient clustering algorithms. This was exploited later in several experimental papers: Cohen et al [14] for association-rule mining, Haveliwala et al [27] for clustering web documents, Chen et al [13] for selectivity estimation of boolean queries, Chen et al [12] for twig queries, and Gionis et al [22] for indexing set value attributes. All of this work used the hashing technique for set similarity together with ideas from [31].

We note that the definition of locality sensitive hashing used by [31] is slightly different, although in the same spirit as our definition. Their definition involves parameters $r_1 > r_2$ and $p_1 > p_2$. A family $\mathcal{F}$ is said to be $(r_1, r_2, p_1, p_2)$-sensitive for a similarity measure $\mathrm{sim}(x, y)$ if $\Pr_{h \in \mathcal{F}}[h(x) = h(y)] \ge p_1$ when $\mathrm{sim}(x, y) \ge r_1$ and $\Pr_{h \in \mathcal{F}}[h(x) = h(y)] \le p_2$ when $\mathrm{sim}(x, y) \le r_2$. Despite the difference in the precise definition, we chose to retain the name locality sensitive hashing in this work since the two notions are essentially the same. Hash functions with closely related properties were investigated earlier by Linial and Sasson [34] and Indyk et al [32].

¹ One question left open in [7] was the issue of compact representation of hash functions in this family; this was settled by Indyk [28], who gave a construction of a small family of minwise independent permutations.

1.1 Our Results

In this paper, we explore constructions of locality sensitive hash functions for various other interesting similarity functions.
The utility of such hash function schemes (for nearest neighbor queries and clustering) crucially depends on the fact that the similarity estimation is based on a test of equality of the hash function values. We make an interesting connection between constructions of similarity preserving hash functions and rounding procedures used in the design of approximation algorithms. We show that procedures used for rounding fractional solutions from linear programs and vector solutions to semidefinite programs can be used to derive similarity preserving hash functions for interesting classes of similarity functions.

In Section 2, we prove some necessary conditions on similarity measures $\mathrm{sim}(x, y)$ for the existence of locality sensitive hash functions satisfying (1). Using this, we show that such locality sensitive hash functions do not exist for certain commonly used similarity measures in information retrieval, the Dice coefficient and the Overlap coefficient.

In seminal work, Goemans and Williamson [24] introduced semidefinite programming relaxations as a tool for approximation algorithms. They used the random hyperplane rounding technique to round vector solutions for the MAX-CUT problem. We will see in Section 3 that the random hyperplane technique naturally gives a family of hash functions $\mathcal{F}$ for vectors such that
$$\Pr_{h \in \mathcal{F}}[h(\vec{u}) = h(\vec{v})] = 1 - \frac{\theta(\vec{u}, \vec{v})}{\pi}.$$
Here $\theta(\vec{u}, \vec{v})$ refers to the angle between vectors $\vec{u}$ and $\vec{v}$. Note that the function $1 - \frac{\theta}{\pi}$ is closely related to the function $\cos(\theta)$. (In fact it is always within a factor […] of it. Moreover, $\cos(\theta)$ can be estimated from an estimate of $\theta$.) Thus this similarity function is very closely related to the cosine similarity measure, commonly used in information retrieval. (In fact, Indyk and Motwani [31] describe how the set similarity measure can be adapted to measure dot product between binary vectors in $d$-dimensional Hamming space. Their approach breaks up the data set into $O(\log d)$ groups, each consisting of approximately the same weight. Our approach, based on estimating the angle between vectors, is more direct and is also more general since it applies to general vectors.)
We also note that the cosine between vectors can be estimated from known techniques based on random projections [2, 1, 20]. However, the advantage of a locality sensitive hashing based scheme is that this directly yields techniques for nearest neighbor search for the cosine similarity measure.
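The random hyperplane scheme mentioned above can be illustrated with a small sketch. It assumes dense vectors given as Python lists and draws each hyperplane normal with independent standard Gaussian coordinates via `random.gauss`; the helper names (`random_hyperplanes`, `sketch_bits`, `estimate_angle`) are hypothetical. Since two vectors receive different bits with probability $\theta/\pi$, the fraction of differing bits times $\pi$ estimates the angle.

```python
import math
import random

def random_hyperplanes(d, t, seed=0):
    """t random normal vectors in R^d with i.i.d. standard Gaussian coordinates."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(t)]

def sketch_bits(u, planes):
    """One bit per hyperplane: 1 if r . u >= 0, else 0."""
    return [1 if sum(ri * ui for ri, ui in zip(r, u)) >= 0 else 0 for r in planes]

def estimate_angle(bits_u, bits_v):
    """Pr[bits differ] = theta / pi, so theta is about pi * (fraction of differing bits)."""
    hamming = sum(a != b for a, b in zip(bits_u, bits_v))
    return math.pi * hamming / len(bits_u)

planes = random_hyperplanes(d=3, t=4000, seed=7)
u = [1.0, 0.0, 0.0]
v = [1.0, 1.0, 0.0]          # the true angle between u and v is pi / 4
theta_hat = estimate_angle(sketch_bits(u, planes), sketch_bits(v, planes))
print("estimated angle:", theta_hat, "cosine estimate:", math.cos(theta_hat))
```

The cosine is recovered by applying `math.cos` to the angle estimate, which is the route the text suggests for estimating $\cos(\theta)$ from an estimate of $\theta$.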

An attractive feature of the hash functions obtained from the random hyperplane method is that the output is a single bit; thus the output of $t$ hash functions can be concatenated very easily to produce a $t$-bit vector.² Estimating similarity between vectors amounts to measuring the Hamming distance between the corresponding $t$-bit hash vectors. We can represent sets by their characteristic vectors and use this locality sensitive hashing scheme for measuring similarity between sets. This yields a slightly different similarity measure for sets, one that is linearly proportional to the angle between their characteristic vectors.

In Section 4, we present a locality sensitive hashing scheme for a certain metric on distributions on points, called the Earth Mover Distance. We are given a set of points $L = \{l_1, \ldots, l_n\}$, with a distance function $d(i, j)$ defined on them. A probability distribution $P(X)$ (or distribution for short) is a set of weights $p_1, \ldots, p_n$ on the points such that $p_i \ge 0$ and $\sum_i p_i = 1$. (We will often refer to distribution $P(X)$ as simply $P$, implicitly referring to an underlying set $X$ of points.) The Earth Mover Distance $\mathrm{EMD}(P, Q)$ between two distributions $P$ and $Q$ is defined to be the cost of the min cost matching that transforms one distribution to another. (Imagine each distribution as placing a certain amount of earth on each point. $\mathrm{EMD}(P, Q)$ measures the minimum amount of work that must be done in transforming one distribution to the other.) This is a popular metric for images and is used for image similarity, navigating image databases and so on [37, 38, 39, 40, 36, 15, 16, 41, 42]. The idea is to represent an image as a distribution on features with an underlying distance metric on features (e.g. colors in a color spectrum). Since the earth mover distance is expensive to compute (requiring a solution to a minimum transportation problem), applications typically use an approximation of the earth mover distance (e.g. representing distributions by their centroids).

We construct a hash function family for estimating the earth mover distance.
Our family is based on rounding algorithms for LP relaxations for the problem of classification with pairwise relationships studied by Kleinberg and Tardos [33], and further studied by Calinescu et al [10] and Chekuri et al [11]. Combining a new LP formulation described by Chekuri et al together with a rounding technique of Kleinberg and Tardos, we show a construction of a hash function family which approximates the earth mover distance to a factor of $O(\log n \log\log n)$. Each hash function in this family maps a distribution on points $L = \{l_1, \ldots, l_n\}$ to some point $l_i$ in the set. For two distributions $P(X)$ and $Q(X)$ on the set of points, our family of hash functions $\mathcal{F}$ satisfies the property that:
$$\mathrm{EMD}(P, Q) \le \mathbf{E}_{h \in \mathcal{F}}[d(h(P), h(Q))] \le O(\log n \log\log n) \cdot \mathrm{EMD}(P, Q).$$

We also show an interesting fact about a rounding algorithm in Kleinberg and Tardos [33] applying to the case where the underlying metric on points is a uniform metric. In this case, we show that their rounding algorithm can be viewed as a generalization of min-wise independent permutations extended to a continuous setting. Their rounding procedure yields a locality sensitive hash function for vectors whose coordinates are all non-negative. Given two vectors $\vec{a} = (a_1, \ldots, a_n)$ and $\vec{b} = (b_1, \ldots, b_n)$, the similarity function is
$$\mathrm{sim}(\vec{a}, \vec{b}) = \frac{\sum_i \min(a_i, b_i)}{\sum_i \max(a_i, b_i)}.$$
(Note that when $\vec{a}$ and $\vec{b}$ are the characteristic vectors for sets $A$ and $B$, this expression reduces to the set similarity measure for min-wise independent permutations.)

² In Section 2, we will show that we can convert any locality sensitive hashing scheme to one that maps objects to $\{0, 1\}$ with a slight change in similarity measure. However, the modified hash functions convey less information, e.g. the collision probability for the modified hash function family is at least $1/2$ even for a pair of objects with original similarity 0.

Applications of locality sensitive hash functions to solving nearest neighbor queries typically reduce the problem to the Hamming space. Indyk and Motwani [31] give a data structure that solves the approximate nearest neighbor problem on the Hamming space.
Their construction is a reduction to the so called PLEB (Point Location in Equal Balls) problem, followed by a hashing technique concatenating the values of several locality sensitive hash functions.

We give a simple technique that achieves the same performance as the Indyk-Motwani result in Section 5. The basic idea is as follows: Given bit vectors consisting of $d$ bits each, we choose a number of random permutations of the bits. For each random permutation $\sigma$, we maintain a sorted order of the bit vectors, in lexicographic order of the bits permuted by $\sigma$. To find a nearest neighbor for a query bit vector $q$ we do the following: For each permutation $\sigma$, we perform a binary search on the sorted order corresponding to $\sigma$ to locate the bit vectors closest to $q$ (in the lexicographic order obtained by bits permuted by $\sigma$). Further, we search in each of the sorted orders proceeding upwards and downwards from the location of $q$, according to a certain rule. Of all the bit vectors examined, we return the one that has the smallest Hamming distance to the query vector. The performance bounds we can prove for this simple scheme are identical to those proved by Indyk and Motwani for their scheme.

2. EXISTENCE OF LOCALITY SENSITIVE HASH FUNCTIONS

In this section, we discuss certain necessary properties for the existence of locality sensitive hash function families for given similarity measures.

Lemma 1. For any similarity function $\mathrm{sim}(x, y)$ that admits a locality sensitive hash function family as defined in (1), the distance function $1 - \mathrm{sim}(x, y)$ satisfies the triangle inequality.

Proof. Suppose there exists a locality sensitive hash function family such that
$$\Pr_{h \in \mathcal{F}}[h(x) = h(y)] = \mathrm{sim}(x, y).$$
Then,
$$1 - \mathrm{sim}(x, y) = \Pr_{h \in \mathcal{F}}[h(x) \ne h(y)].$$
Let $\Delta_h(x, y)$ be an indicator variable for the event $h(x) \ne h(y)$. We claim that $\Delta_h(x, y)$ satisfies the triangle inequality, i.e.
$$\Delta_h(x, y) + \Delta_h(y, z) \ge \Delta_h(x, z).$$

Since $\Delta_h(\cdot)$ takes values in the set $\{0, 1\}$, the only case when the above inequality could be violated would be when $\Delta_h(x, y) = \Delta_h(y, z) = 0$. But in this case $h(x) = h(y)$ and $h(y) = h(z)$. Thus, $h(x) = h(z)$, implying that $\Delta_h(x, z) = 0$ and the inequality is satisfied. This proves the claim. Now,
$$1 - \mathrm{sim}(x, y) = \mathbf{E}_{h \in \mathcal{F}}[\Delta_h(x, y)].$$
Since $\Delta_h(x, y)$ satisfies the triangle inequality, $\mathbf{E}_{h \in \mathcal{F}}[\Delta_h(x, y)]$ must also satisfy the triangle inequality. This proves the lemma.

This gives a very simple proof of the fact that for the set similarity measure $\mathrm{sim}(A, B) = \frac{|A \cap B|}{|A \cup B|}$, $1 - \mathrm{sim}(A, B)$ satisfies the triangle inequality. This follows from Lemma 1 and the fact that a set similarity measure admits a locality sensitive hash function family, namely that given by minwise independent permutations.

One could ask the question whether locality sensitive hash functions satisfying the definition (1) exist for other commonly used set similarity measures in information retrieval. For example, Dice's coefficient is defined as
$$\mathrm{sim}_{Dice}(A, B) = \frac{|A \cap B|}{\frac{1}{2}(|A| + |B|)}.$$
The Overlap coefficient is defined as
$$\mathrm{sim}_{Ovl}(A, B) = \frac{|A \cap B|}{\min(|A|, |B|)}.$$
We can use Lemma 1 to show that there is no such locality sensitive hash function family for Dice's coefficient and the Overlap measure by showing that the corresponding distance function does not satisfy the triangle inequality.

Consider the sets $A = \{a\}$, $B = \{b\}$, $C = \{a, b\}$. Then,
$$\mathrm{sim}_{Dice}(A, C) = \tfrac{2}{3}, \quad \mathrm{sim}_{Dice}(C, B) = \tfrac{2}{3}, \quad \mathrm{sim}_{Dice}(A, B) = 0$$
$$1 - \mathrm{sim}_{Dice}(A, C) + 1 - \mathrm{sim}_{Dice}(C, B) < 1 - \mathrm{sim}_{Dice}(A, B)$$
Similarly, the values for the Overlap measure are as follows:
$$\mathrm{sim}_{Ovl}(A, C) = 1, \quad \mathrm{sim}_{Ovl}(C, B) = 1, \quad \mathrm{sim}_{Ovl}(A, B) = 0$$
$$1 - \mathrm{sim}_{Ovl}(A, C) + 1 - \mathrm{sim}_{Ovl}(C, B) < 1 - \mathrm{sim}_{Ovl}(A, B)$$
This shows that there is no locality sensitive hash function family corresponding to Dice's coefficient and the Overlap measure.

It is often convenient to have a hash function family that maps objects to $\{0, 1\}$. In that case, the output of $t$ different hash functions can simply be concatenated to obtain a $t$-bit hash value for an object. In fact, we can always obtain such a binary hash function family with a slight change in the similarity measure.
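The Dice and Overlap counterexample above can be checked mechanically. The following sketch uses exact rational arithmetic (`fractions.Fraction`) on the sets $A = \{a\}$, $B = \{b\}$, $C = \{a, b\}$ from the text; the function names are illustrative only. The Jaccard measure is included for contrast, since Lemma 1 guarantees that its distance satisfies the triangle inequality.

```python
from fractions import Fraction

def dice(a, b):
    return Fraction(2 * len(a & b), len(a) + len(b))

def overlap(a, b):
    return Fraction(len(a & b), min(len(a), len(b)))

def jaccard(a, b):
    return Fraction(len(a & b), len(a | b))

def violates_triangle(sim, x, y, z):
    """True if the distance 1 - sim fails d(x, y) + d(y, z) >= d(x, z)."""
    def dist(s, t):
        return 1 - sim(s, t)
    return dist(x, y) + dist(y, z) < dist(x, z)

A, B, C = {"a"}, {"b"}, {"a", "b"}
print(violates_triangle(dice, A, C, B))     # True:  1/3 + 1/3 < 1
print(violates_triangle(overlap, A, C, B))  # True:  0 + 0 < 1
print(violates_triangle(jaccard, A, C, B))  # False: 1/2 + 1/2 = 1
```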
A similar result was used and proved by Gionis et al [22]. We include a proof for completeness.

Lemma 2. Given a locality sensitive hash function family $\mathcal{F}$ corresponding to a similarity function $\mathrm{sim}(x, y)$, we can obtain a locality sensitive hash function family $\mathcal{F}'$ that maps objects to $\{0, 1\}$ and corresponds to the similarity function $\frac{1 + \mathrm{sim}(x, y)}{2}$.

Proof. Suppose we have a hash function family such that
$$\Pr_{h \in \mathcal{F}}[h(x) = h(y)] = \mathrm{sim}(x, y).$$
Let $\mathcal{B}$ be a pairwise independent family of hash functions that operate on the domain of the functions in $\mathcal{F}$ and map elements in the domain to $\{0, 1\}$. Then $\Pr_{b \in \mathcal{B}}[b(u) = b(v)] = 1/2$ if $u \ne v$ and $\Pr_{b \in \mathcal{B}}[b(u) = b(v)] = 1$ if $u = v$. Consider the hash function family obtained by composing a hash function from $\mathcal{F}$ with one from $\mathcal{B}$. This maps objects to $\{0, 1\}$ and we claim that it has the required properties:
$$\Pr_{h \in \mathcal{F}, b \in \mathcal{B}}[b(h(x)) = b(h(y))] = \frac{1 + \mathrm{sim}(x, y)}{2}.$$
With probability $\mathrm{sim}(x, y)$, $h(x) = h(y)$ and hence $b(h(x)) = b(h(y))$. With probability $1 - \mathrm{sim}(x, y)$, $h(x) \ne h(y)$ and in this case, $\Pr_{b \in \mathcal{B}}[b(h(x)) = b(h(y))] = \frac{1}{2}$. Thus,
$$\Pr_{h \in \mathcal{F}, b \in \mathcal{B}}[b(h(x)) = b(h(y))] = \mathrm{sim}(x, y) + \frac{1 - \mathrm{sim}(x, y)}{2} = \frac{1 + \mathrm{sim}(x, y)}{2}.$$

This can be used to show a stronger condition for the existence of a locality sensitive hash function family.

Lemma 3. For any similarity function $\mathrm{sim}(x, y)$ that admits a locality sensitive hash function family as defined in (1), the distance function $1 - \mathrm{sim}(x, y)$ is isometrically embeddable in the Hamming cube.

Proof. Firstly, we apply Lemma 2 to construct a binary locality sensitive hash function family corresponding to similarity function $\mathrm{sim}'(x, y) = (1 + \mathrm{sim}(x, y))/2$. Note that such a binary hash function family gives an embedding of objects into the Hamming cube (obtained by concatenating the values of all the hash functions in the family). For object $x$, let $v(x)$ be the element in the Hamming cube $x$ is mapped to. $1 - \mathrm{sim}'(x, y)$ is simply the fraction of bits that do not agree in $v(x)$ and $v(y)$, which is proportional to the Hamming distance between $v(x)$ and $v(y)$. Thus this embedding is an isometric embedding of the distance function $1 - \mathrm{sim}'(x, y)$ in the Hamming cube.
But
$$1 - \mathrm{sim}'(x, y) = 1 - \frac{1 + \mathrm{sim}(x, y)}{2} = \frac{1 - \mathrm{sim}(x, y)}{2}.$$
This implies that $1 - \mathrm{sim}(x, y)$ can be isometrically embedded in the Hamming cube.

We note that Lemma 3 has a weak converse, i.e. for a similarity measure $\mathrm{sim}(x, y)$ any isometric embedding of the distance function $1 - \mathrm{sim}(x, y)$ in the Hamming cube yields a locality sensitive hash function family corresponding to the similarity measure $(\alpha + \mathrm{sim}(x, y))/(\alpha + 1)$ for some $\alpha > 0$.

3. RANDOM HYPERPLANE BASED HASH FUNCTIONS FOR VECTORS

Given a collection of vectors in $\mathbb{R}^d$, we consider the family of hash functions defined as follows: We choose a random vector $\vec{r}$ from the $d$-dimensional Gaussian distribution (i.e. each coordinate is drawn from the 1-dimensional Gaussian distribution). Corresponding to this vector $\vec{r}$, we define a hash

function $h_{\vec{r}}$ as follows:
$$h_{\vec{r}}(\vec{u}) = \begin{cases} 1 & \text{if } \vec{r} \cdot \vec{u} \ge 0 \\ 0 & \text{if } \vec{r} \cdot \vec{u} < 0 \end{cases}$$
Then for vectors $\vec{u}$ and $\vec{v}$,
$$\Pr[h_{\vec{r}}(\vec{u}) = h_{\vec{r}}(\vec{v})] = 1 - \frac{\theta(\vec{u}, \vec{v})}{\pi}.$$
This was used by Goemans and Williamson [24] in their rounding scheme for the semidefinite programming relaxation of MAX-CUT.

Picking a random hyperplane amounts to choosing a normally distributed random variable for each dimension. Thus even representing a hash function in this family could require a large number of random bits. However, for $n$ vectors, the hash functions can be chosen by picking $O(\log^2 n)$ random bits, i.e. we can restrict the random hyperplanes to be in a family of size $2^{O(\log^2 n)}$. This follows from the techniques in Indyk [30] and Engebretsen et al [17], which in turn use Nisan's pseudorandom number generator for space bounded computations [35]. We omit the details since they are similar to those in [30, 17].

Using this random hyperplane based hash function, we obtain a hash function family for set similarity, for a slightly different measure of similarity of sets. Suppose sets are represented by their characteristic vectors. Then, applying the above scheme gives a locality sensitive hashing scheme where
$$\Pr[h(A) = h(B)] = 1 - \frac{\theta}{\pi}, \quad \text{where } \theta = \cos^{-1}\left(\frac{|A \cap B|}{\sqrt{|A| \cdot |B|}}\right).$$
Also, this hash function family facilitates easy incorporation of element weights in the similarity calculation, since the values of the coordinates of the characteristic vectors could be real valued element weights. Later, in Section 4.1 we will present another technique to define and estimate similarity of weighted sets.

4. THE EARTH MOVER DISTANCE

Consider a set of points $L = \{l_1, \ldots, l_n\}$ with a distance function $d(i, j)$ (assumed to be a metric). A distribution $P(L)$ on $L$ is a collection of non-negative weights $(p_1, \ldots, p_n)$ for points in $L$ such that $\sum_i p_i = 1$. The distance between two distributions $P(L)$ and $Q(L)$ is defined to be the optimal cost of the following minimum transportation problem:
$$\min \sum_{i,j} f_{i,j} \cdot d(i, j) \qquad (2)$$
$$\forall i \quad \sum_j f_{i,j} = p_i \qquad (3)$$
$$\forall j \quad \sum_i f_{i,j} = q_j \qquad (4)$$
$$\forall i, j \quad f_{i,j} \ge 0 \qquad (5)$$
Note that we define a somewhat restricted form of the Earth Mover Distance.
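To make the transportation problem (2) to (5) concrete, the sketch below checks the constraints for a candidate flow and evaluates its cost. It does not solve the LP. As a cross-check it uses a standard closed form that is not part of the text: when the points lie on a line with $d(i, j) = |i - j|$, the optimal cost equals the sum of absolute differences of the prefix sums of $P$ and $Q$. All names and the example distributions are hypothetical.

```python
def flow_cost(flow, d):
    """Objective (2): sum over i, j of f[i][j] * d(i, j)."""
    n = len(flow)
    return sum(flow[i][j] * d(i, j) for i in range(n) for j in range(n))

def is_feasible(flow, p, q, tol=1e-9):
    """Check constraints (3), (4) and (5) for a candidate flow."""
    n = len(p)
    rows_ok = all(abs(sum(flow[i]) - p[i]) <= tol for i in range(n))
    cols_ok = all(abs(sum(flow[i][j] for i in range(n)) - q[j]) <= tol
                  for j in range(n))
    nonneg = all(f >= -tol for row in flow for f in row)
    return rows_ok and cols_ok and nonneg

def emd_on_line(p, q):
    """Assumed special case d(i, j) = |i - j|: sum of |prefix(P) - prefix(Q)|."""
    total, cp, cq = 0.0, 0.0, 0.0
    for pi, qi in zip(p, q):
        cp += pi
        cq += qi
        total += abs(cp - cq)
    return total

def line_metric(i, j):
    return abs(i - j)

p = [0.5, 0.0, 0.5]
q = [0.25, 0.5, 0.25]
product_flow = [[pi * qj for qj in q] for pi in p]   # feasible, but not optimal
best_flow = [[0.25, 0.25, 0.0],
             [0.0, 0.0, 0.0],
             [0.0, 0.25, 0.25]]                      # moves mass only to a neighbor
print(flow_cost(product_flow, line_metric))          # 1.0
print(flow_cost(best_flow, line_metric))             # 0.5
print(emd_on_line(p, q))                             # 0.5
```

Every feasible flow gives an upper bound on $\mathrm{EMD}(P, Q)$; here the product flow costs twice the optimum.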
The general definition does not assume that the sum of the weights is identical for distributions $P(L)$ and $Q(L)$. This is useful for example in matching a small image to a portion of a larger image.

We will construct a hash function family for estimating the Earth Mover Distance based on rounding algorithms for the problem of classification with pairwise relationships, introduced by Kleinberg and Tardos [33]. (A closely related problem was also studied by Broder et al [9].) In designing hash functions to estimate the Earth Mover Distance, we will relax the definition of locality sensitive hashing (1) in three ways.

1. Firstly, the quantity we are trying to estimate is a distance measure, not a similarity measure in $[0, 1]$.

2. Secondly, we will allow the hash functions to map objects to points in a metric space and measure $\mathbf{E}[d(h(x), h(y))]$. (A locality sensitive hash function for a similarity measure $\mathrm{sim}(x, y)$ can be viewed as a scheme to estimate the distance $1 - \mathrm{sim}(x, y)$ by $\Pr_{h \in \mathcal{F}}[h(x) \ne h(y)]$. This is equivalent to having a uniform metric on the hash values.)

3. Thirdly, our estimator for the Earth Mover Distance will not be an unbiased estimator, i.e. our estimate will approximate the Earth Mover Distance to within a small factor.

We now describe the problem of classification with pairwise relationships. Given a collection of objects $V$ and labels $L = \{l_1, \ldots, l_n\}$, the goal is to assign labels to objects. The cost of assigning label $l$ to object $u \in V$ is $c(u, l)$. Certain pairs of objects $(u, v)$ are related; such pairs form the edges of a graph over $V$. Each edge $e = (u, v)$ is associated with a non-negative weight $w_e$. For edge $e = (u, v)$, if $u$ is assigned label $h(u)$ and $v$ is assigned label $h(v)$, then the cost paid is $w_e \, d(h(u), h(v))$.

The problem is to come up with an assignment of labels $h : V \to L$, so as to minimize the cost of the labeling $h$ given by
$$\sum_{v \in V} c(v, h(v)) + \sum_{e = (u, v) \in E} w_e \, d(h(u), h(v)).$$
The approximation algorithms for this problem use an LP to assign, for every $u \in V$, a probability distribution over labels in $L$ (i.e. a set of non-negative weights that sum up to 1).
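As a small illustration of the labeling objective defined above, the following sketch evaluates the cost of a labeling and finds the cheapest one by brute force on a toy instance. The instance (three objects, two labels, a uniform metric on labels) and all names are made up for the example; brute force stands in for the LP-based algorithms discussed in the text and is only feasible for tiny inputs.

```python
from itertools import product

def labeling_cost(h, assign_cost, edges, d):
    """Sum of assignment costs c(v, h(v)) plus sum of w_e * d(h(u), h(v)) over edges."""
    assignment = sum(assign_cost[(v, h[v])] for v in h)
    separation = sum(w * d(h[u], h[v]) for (u, v), w in edges.items())
    return assignment + separation

def uniform_metric(a, b):
    return 0 if a == b else 1

objects = ["u", "v", "w"]
labels = [0, 1]
assign_cost = {("u", 0): 0.0, ("u", 1): 2.0,
               ("v", 0): 1.0, ("v", 1): 1.0,
               ("w", 0): 3.0, ("w", 1): 0.0}
edges = {("u", "v"): 1.5, ("v", "w"): 0.5}   # edge -> weight w_e

# Enumerate all 2^3 labelings and keep the cheapest one.
best = min((dict(zip(objects, choice)) for choice in product(labels, repeat=3)),
           key=lambda h: labeling_cost(h, assign_cost, edges, uniform_metric))
best_cost = labeling_cost(best, assign_cost, edges, uniform_metric)
print(best, best_cost)   # {'u': 0, 'v': 0, 'w': 1} 1.5
```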
Given a distribution $P$ over labels in $L$, the rounding algorithm of Kleinberg and Tardos gave a randomized procedure for assigning label $h(P)$ to $P$ with the following properties:

1. Given distribution $P(L) = (p_1, \ldots, p_n)$,
$$\Pr[h(P) = l_i] = p_i. \qquad (6)$$

2. Suppose $P$ and $Q$ are probability distributions over $L$.
$$\mathbf{E}[d(h(P), h(Q))] \le O(\log n \log\log n) \, \mathrm{EMD}(P, Q) \qquad (7)$$

We note that the second property (7) is not immediately obvious from [33], since they do not describe LP relaxations for general metrics. Their LP relaxations are defined for Hierarchically well Separated Trees (HSTs). They convert a general metric to such an HST using Bartal's results [3, 4] on probabilistic approximation of metric spaces via tree metrics. However, it follows from combining ideas in [33] with those in Chekuri et al [11]. Chekuri et al do in fact give an LP relaxation for general metrics. The LP relaxation

does indeed produce distributions over labels for every object $u \in V$. The fractional distance between two labelings is expressed as the min cost transshipment between $P$ and $Q$, which is identical to the Earth Mover Distance $\mathrm{EMD}(P, Q)$. Now, this fractional solution can be used in the rounding algorithm developed by Kleinberg and Tardos to obtain the second property (7) claimed above. In fact, Chekuri et al use this fact to claim that the gap of their LP relaxation is at most $O(\log n \log\log n)$ (Theorem 5.1 in [11]).

We elaborate some more on why the property (7) holds. Kleinberg and Tardos first (probabilistically) approximate the metric on $L$ by an HST using [3, 4]. This is a tree with all vertices in the original metric at the leaves. The pairwise distance between any two vertices does not decrease and all pairwise distances are increased by a factor of at most $O(\log n \log\log n)$ (in expectation). For this tree metric, they use an LP formulation which can be described as follows. Suppose we have a rooted tree. For subtree $T$, let $\ell_T$ denote the length of the edge that $T$ hangs off of, i.e. the first edge on the path from $T$ to the root. Further, for distribution $P$ on the vertices of the original metric, let $P(T)$ denote the total probability mass that $P$ assigns to leaves in $T$; $Q(T)$ is similarly defined. The distance between distributions $P$ and $Q$ is measured by $\sum_T \ell_T |P(T) - Q(T)|$, where the summation is computed over all subtrees $T$. The Kleinberg-Tardos rounding scheme ensures that $\mathbf{E}[d(h(P), h(Q))]$ is within a constant factor of $\sum_T \ell_T |P(T) - Q(T)|$.

Suppose instead, we measured the distance between distributions by $\mathrm{EMD}(P, Q)$, defined on the original metric. By probabilistically approximating the original metric by a tree metric $T'$, the expected value of the distance $\mathrm{EMD}_{T'}(P, Q)$ (on the tree metric $T'$) is at most a factor of $O(\log n \log\log n)$ times $\mathrm{EMD}(P, Q)$. This follows since all distances increase by $O(\log n \log\log n)$ in expectation. Now note that the tree distance measure used by Kleinberg and Tardos, $\sum_T \ell_T |P(T) - Q(T)|$, is a lower bound on (and in fact exactly equal to) $\mathrm{EMD}_{T'}(P, Q)$.
To see that this is a lower bound, note that in the min cost transportation between $P$ and $Q$ on $T'$, the flow on the edge leading upwards from subtree $T$ must be at least $|P(T) - Q(T)|$. Since the rounding scheme ensures that $\mathbf{E}[d(h(P), h(Q))]$ is within a constant factor of $\sum_T \ell_T |P(T) - Q(T)|$, we have that
$$\mathbf{E}[d(h(P), h(Q))] \le O(1) \, \mathrm{EMD}_{T'}(P, Q) \le O(\log n \log\log n) \, \mathrm{EMD}(P, Q)$$
where the expectation is over the random choice of the HST and the random choices made by the rounding procedure.

Theorem 1. The Kleinberg-Tardos rounding scheme yields a locality sensitive hashing scheme such that
$$\mathrm{EMD}(P, Q) \le \mathbf{E}[d(h(P), h(Q))] \le O(\log n \log\log n) \, \mathrm{EMD}(P, Q).$$

Proof. The upper bound on $\mathbf{E}[d(h(P), h(Q))]$ follows directly from the second property (7) of the rounding scheme stated above.

We show that the lower bound follows from the first property (6). Let $y_{i,j}$ be the joint probability that $h(P) = l_i$ and $h(Q) = l_j$. Note that $\sum_j y_{i,j} = p_i$, since this is simply the probability that $h(P) = l_i$. Similarly $\sum_i y_{i,j} = q_j$, since this is simply the probability that $h(Q) = l_j$. Now, if $h(P) = l_i$ and $h(Q) = l_j$, then $d(h(P), h(Q)) = d(i, j)$. Hence $\mathbf{E}[d(h(P), h(Q))] = \sum_{i,j} y_{i,j} \cdot d(i, j)$. Let us write down the expected cost and the constraints on $y_{i,j}$.
$$\mathbf{E}[d(h(P), h(Q))] = \sum_{i,j} y_{i,j} \cdot d(i, j)$$
$$\forall i \quad \sum_j y_{i,j} = p_i$$
$$\forall j \quad \sum_i y_{i,j} = q_j$$
$$\forall i, j \quad y_{i,j} \ge 0$$
Comparing this with the LP for $\mathrm{EMD}(P, Q)$, we see that the values $f_{i,j} = y_{i,j}$ form a feasible solution to the LP (2) to (5) and $\mathbf{E}[d(h(P), h(Q))]$ is exactly the value of this solution. Since $\mathrm{EMD}(P, Q)$ is the minimum value of a feasible solution, it follows that $\mathrm{EMD}(P, Q) \le \mathbf{E}[d(h(P), h(Q))]$.

Calinescu et al [10] study a variant of the classification problem with pairwise relationships called the 0-extension problem. This is the version without assignment costs where some objects are assigned labels a priori and this labeling must be extended to the other objects (a generalization of multiway cut). For this problem, they design a rounding scheme to get an $O(\log n)$ approximation. Again, their technique does not explicitly use an LP that gives probability distributions on labels.
However in hindsight, their rounding scheme can be interpreted as a randomized procedure for assigning labels to distributions such that
$$\mathbf{E}[d(h(P), h(Q))] \le O(\log n) \, \mathrm{EMD}(P, Q).$$
Thus their rounding scheme gives a tighter guarantee than (7). However, they do not ensure (6). Thus the previous proof showing that $\mathrm{EMD}(P, Q) \le \mathbf{E}[d(h(P), h(Q))]$ does not apply. In fact one can construct examples such that $\mathrm{EMD}(P, Q) > 0$, yet $\mathbf{E}[d(h(P), h(Q))] = 0$. Hence, the resulting hash function family provides an upper bound on $\mathrm{EMD}(P, Q)$ within a factor $O(\log n)$ but does not provide a good lower bound.

We mention that the hashing scheme described provides an approximation to the Earth Mover Distance where the quality of the approximation is exactly the factor by which the underlying metric can be probabilistically approximated by HSTs. In particular, if the underlying metric itself is an HST, this yields an estimate within a constant factor. This could have applications in compactly representing distributions over hierarchical classes. For example, documents can be assigned a probability distribution over classes in the Open Directory Project (ODP) hierarchy. This hierarchy could be thought of as an HST and documents can be mapped to distributions over this HST. The distance between two distributions can be measured by the Earth Mover Distance. In this case, the hashing scheme described gives a way to estimate this distance measure to within a constant factor.

4.1 Weighted Sets

We show that the Kleinberg-Tardos [33] rounding scheme for the case of the uniform metric actually is an extension of min-wise independent permutations to the weighted case.

First we recall the hashing scheme given by min-wise independent permutations. Given a universe $U$, consider a random permutation $\pi$ of $U$. Assume that the elements of $U$ are totally ordered. Given a subset $A \subseteq U$, the hash

function $h_\pi$ is defined as follows:
$$h_\pi(A) = \min\{\pi(A)\}$$
Then the property satisfied by this hash function family is that
$$\Pr_\pi[h_\pi(A) = h_\pi(B)] = \frac{|A \cap B|}{|A \cup B|}.$$

We now review the Kleinberg-Tardos rounding scheme for the uniform metric: Firstly, imagine that we pick an infinite sequence $\{(i_t, \alpha_t)\}_{t=1}^{\infty}$ where for each $t$, $i_t$ is picked uniformly and at random in $\{1, \ldots, n\}$ and $\alpha_t$ is picked uniformly and at random in $[0, 1]$. Given a distribution $P = (p_1, \ldots, p_n)$, the assignment of labels is done in phases. In the $t$-th phase, we check whether $\alpha_t \le p_{i_t}$. If this is the case and $P$ has not been assigned a label yet, it is assigned label $i_t$.

Now, we can think of these distributions as sets in $\mathbb{R}^2$ (see Figure 1).

Figure 1: Viewing a distribution as a continuous set.

The set $S(P)$ corresponding to distribution $P$ consists of the union of the rectangles $[i - 1, i] \times [0, p_i]$. The elements of the universe are $[i - 1, i] \times \alpha$. $[i - 1, i] \times \alpha$ belongs to $S(P)$ iff $\alpha \le p_i$. The notion of cardinality of union and intersection of sets is replaced by the area of the intersection and union of two such sets in $\mathbb{R}^2$.

Note that the Kleinberg-Tardos rounding scheme can be interpreted as constructing a permutation of the universe and assigning to a distribution $P$ the value $i$ such that $(i, \alpha)$ is the minimum in the permutation amongst all elements contained in $S(P)$. Suppose instead, we assign to $P$ the element $(i, \alpha)$ which is the minimum in the permutation of $S(P)$. Let $h$ be a hash function derived from this scheme (a slight modification of the one in [33]). Then,
$$\Pr[h(P) = h(Q)] = \frac{|S(P) \cap S(Q)|}{|S(P) \cup S(Q)|} = \frac{\sum_i \min(p_i, q_i)}{\sum_i \max(p_i, q_i)} \qquad (8)$$

For the Kleinberg-Tardos rounding scheme, the probability of collision is at least the probability of collision for the modified scheme (since two objects hashed to $(i, \alpha_1)$ and $(i, \alpha_2)$ respectively in the modified scheme would be both mapped to $i$ in the original scheme). Hence
$$\Pr_{KT}[h(P) = h(Q)] \ge \frac{\sum_i \min(p_i, q_i)}{\sum_i \max(p_i, q_i)}$$
$$\Pr_{KT}[h(P) \ne h(Q)] \le 1 - \frac{\sum_i \min(p_i, q_i)}{\sum_i \max(p_i, q_i)} = \frac{\sum_i |p_i - q_i|}{\sum_i \max(p_i, q_i)} \le \sum_i |p_i - q_i|$$
The last inequality follows from the fact that $\sum_i p_i = \sum_i q_i = 1$ in the Kleinberg-Tardos setting. This was exactly the property used in [33] to obtain a 2-approximation for the uniform metric case.
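The rounding scheme and its modified variant can be simulated directly. In the sketch below, the shared infinite sequence $(i_t, \alpha_t)$ is modeled by a pseudorandom generator seeded identically for every distribution, which is an implementation assumption rather than anything prescribed by [33]. Labels are 0-based, and the test $\alpha < p_i$ is strict so that zero-weight labels are never chosen (this differs from $\alpha \le p_i$ only on an event of probability zero). Function names are hypothetical.

```python
import random

def modified_kt_hash(p, seed):
    """Return the first pair (i, alpha) of the shared sequence that lies in S(P)."""
    rng = random.Random(seed)   # same seed => same sequence (i_t, alpha_t)
    n = len(p)
    while True:
        i = rng.randrange(n)    # i_t uniform over the n labels (0-based here)
        alpha = rng.random()    # alpha_t uniform in [0, 1)
        if alpha < p[i]:
            return (i, alpha)

def kt_label(p, seed):
    """Original rounding: keep only the label i, discarding alpha."""
    return modified_kt_hash(p, seed)[0]

def weighted_similarity(p, q):
    """Right-hand side of (8): sum of coordinate minima over sum of maxima."""
    return sum(map(min, p, q)) / sum(map(max, p, q))

def collision_rate(hash_fn, p, q, trials=5000):
    """Empirical Pr[h(P) = h(Q)] over independently seeded hash functions."""
    return sum(hash_fn(p, s) == hash_fn(q, s) for s in range(trials)) / trials

P = [0.5, 0.5, 0.0]
Q = [0.25, 0.25, 0.5]
rate_modified = collision_rate(modified_kt_hash, P, Q)
rate_kt = collision_rate(kt_label, P, Q)
print(weighted_similarity(P, Q), rate_modified, rate_kt)
```

For these two distributions the right-hand side of (8) is $0.5 / 1.5 = 1/3$, so `rate_modified` should be close to that value, and `rate_kt` can never be smaller because every collision of the modified hash is also a collision of the labels.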
Note that the hashing scheme given by (8) is a generalization of min-wise independent permutations to the weighted setting where elements in sets are associated with weights in $[0, 1]$. Min-wise independent permutations are a special case of this scheme when the weights are in $\{0, 1\}$. This scheme could be useful in a setting where a weighted set similarity notion is desired. We note that the original min-wise independent permutations can be used in the setting of integer weights by simply duplicating elements according to their weight. The present scheme would work for any nonnegative real weights.

5. APPROXIMATE NEAREST NEIGHBOR SEARCH IN HAMMING SPACE

Applications of locality sensitive hash functions to solving nearest neighbor queries typically reduce the problem to the Hamming space. Indyk and Motwani [31] give a data structure that solves the approximate nearest neighbor problem on the Hamming space $H^d$. Their construction is a reduction to the so-called PLEB (Point Location in Equal Balls) problem, followed by a hashing technique concatenating the values of several locality sensitive hash functions.

Theorem 2 ([31]). For any $\epsilon > 0$, there exists an algorithm for $\epsilon$-PLEB in $H^d$ using $O(dn + n^{1+1/(1+\epsilon)})$ space and $O(n^{1/(1+\epsilon)})$ hash function evaluations for each query.

We give a simple technique that achieves the same performance as the Indyk-Motwani result. Given bit vectors consisting of $d$ bits each, we choose $N = O(n^{1/(1+\epsilon)})$ random permutations of the bits. For each random permutation $\sigma$, we maintain a sorted order $O_\sigma$ of the bit vectors, in lexicographic order of the bits permuted by $\sigma$. Given a query bit vector $q$, we find the approximate nearest neighbor by doing the following: for each permutation $\sigma$, we perform a binary search on $O_\sigma$ to locate the two bit vectors closest to $q$ (in the lexicographic order obtained by bits permuted by $\sigma$). We now search in each of the sorted orders $O_\sigma$, examining elements above and below the position returned by the binary search in order of the length of the longest prefix that matches $q$.
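A minimal Python sketch of this data structure and query procedure is given here as an illustration, not as the authors' implementation. The names `build_index`, `query`, `prefix_len`, `hamming` and the `budget` parameter are hypothetical. The sketch keeps one upward and one downward pointer per sorted order and uses a heap (an implementation choice assumed here) to advance whichever pointer currently points at the element sharing the longest prefix with the permuted query.

```python
import bisect
import heapq
import random


def build_index(vectors, num_perms, seed=0):
    """Build one sorted order O_sigma per random bit permutation sigma.

    vectors: list of equal-length tuples of 0/1 bits.
    Returns a list of (sigma, order) pairs, where order is a sorted list
    of (permuted bits, position of the vector in `vectors`).
    """
    rng = random.Random(seed)
    d = len(vectors[0])
    index = []
    for _ in range(num_perms):
        sigma = list(range(d))
        rng.shuffle(sigma)
        order = sorted((tuple(v[j] for j in sigma), idx)
                       for idx, v in enumerate(vectors))
        index.append((sigma, order))
    return index


def prefix_len(a, b):
    """Length of the longest common prefix of two bit tuples."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def hamming(a, b):
    """Hamming distance between two bit tuples of equal length."""
    return sum(x != y for x, y in zip(a, b))


def query(vectors, index, q, budget=None):
    """Examine `budget` entries (default 2N) and return (distance, position)
    of the examined vector closest to q in Hamming distance."""
    if budget is None:
        budget = 2 * len(index)
    keys = [tuple(q[j] for j in sigma) for sigma, _ in index]
    heap = []

    def push(t, pos, step):
        # Queue the entry at `pos` of order t, keyed by matching prefix length.
        order = index[t][1]
        if 0 <= pos < len(order):
            bits, idx = order[pos]
            heapq.heappush(heap, (-prefix_len(bits, keys[t]), t, pos, step, idx))

    for t, (_, order) in enumerate(index):
        # Binary search for the query's position in O_sigma.
        pos = bisect.bisect_left(order, (keys[t],))
        push(t, pos - 1, -1)   # pointer moving down the order
        push(t, pos, +1)       # pointer moving up the order

    best = None
    examined = 0
    while heap and examined < budget:
        _, t, pos, step, idx = heapq.heappop(heap)
        examined += 1
        dist = hamming(vectors[idx], q)
        if best is None or dist < best[0]:
            best = (dist, idx)
        push(t, pos + step, step)  # advance the pointer that was just used
    return best
```

With the default budget the result is only an approximate nearest neighbor; passing a budget equal to the number of permutations times the number of vectors examines every entry and therefore returns an exact nearest neighbor, which is useful for testing.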
This can be done by maintaining two pointers for each sorted order $O_\sigma$ (one moves up and the other down). At each step we move one of the pointers up or down, corresponding to the element with the longest matching prefix. (Here the length of the longest matching prefix in $O_\sigma$ is computed relative to $q$ with its bits permuted by $\sigma$.) We examine $2N = O(n^{1/(1+\epsilon)})$ bit vectors in this way. Of all the bit vectors examined, we return the one that has the smallest Hamming distance to $q$. The performance bounds we can prove for this simple scheme are identical to those proved by Indyk and Motwani for their scheme. An advantage of this scheme is that we do not need a reduction to many instances of PLEB for different values of radius $r$, i.e. we solve the nearest neighbor problem simultaneously for all values of radius $r$ using a single data structure.

We outline the main ideas of the analysis. In fact, the proof follows along similar lines to the proofs of Theorem 5 and Corollary 3 in [31]. Suppose the nearest neighbor of $q$ is at a Hamming distance of $r$ from $q$. Set

\[ p_1 = 1 - \frac{r}{d}, \qquad p_2 = 1 - \frac{r(1+\epsilon)}{d}, \qquad k = \log_{1/p_2} n, \qquad \rho = \frac{\ln 1/p_1}{\ln 1/p_2}. \]

Then $n^\rho = O(n^{1/(1+\epsilon)})$. We can show that with constant probability, from amongst $N = O(n^{1/(1+\epsilon)})$ permutations, there exists a permutation $\sigma$ such that the nearest neighbor agrees with $q$ on the first $k$ coordinates in $\sigma$. Further, over all $N$ permutations, the number of bit vectors that are at Hamming distance of more than $r(1+\epsilon)$ from $q$ and agree on the first $k$ coordinates is at most $2N$ with constant probability. This implies that for this permutation $\sigma$, one of the $2N$ bit vectors near $q$ in the ordering $O_\sigma$ and examined by the algorithm will be a $(1+\epsilon)$-approximate nearest neighbor.

The probability calculations are similar to those in [31], and we only sketch the main ideas. For any point $q'$ at distance at least $r(1+\epsilon)$ from $q$, the probability that a random coordinate agrees with $q$ is at most $p_2$. Thus the probability that the first $k$ coordinates agree is at most $p_2^k = 1/n$. For the $N$ permutations, the expected number of such points that agree in the first $k$ coordinates is at most $N$. By Markov's inequality, the probability that this number is at most $2N$ is at least $1/2$. Further, for a random permutation $\sigma$, the probability that the nearest neighbor agrees in $k$ coordinates is $p_1^k = n^{-\rho}$. Hence the probability that there exists one permutation amongst the $N = n^\rho$ permutations where the nearest neighbor agrees in $k$ coordinates is at least $1 - (1 - n^{-\rho})^{n^\rho} > 1/2$. This establishes the correctness of the procedure. As we stated earlier, a nice property of this data structure is that it automatically adjusts to the correct distance $r$ to the nearest neighbor, i.e. we do not need to maintain separate data structures for different values of $r$.

6. CONCLUSIONS

We have demonstrated an interesting relationship between rounding algorithms used for rounding fractional solutions of LPs and vector solutions of SDPs on the one hand, and the construction of locality sensitive hash functions for interesting classes of objects, on the other. Rounding algorithms yield new constructions of locality sensitive hash functions that were not known previously. Conversely (at least in hindsight), locality sensitive hash functions lead to rounding algorithms (as in the case of min-wise independent permutations and the uniform metric case in Kleinberg and Tardos [33]).
An interesting direction to pursue would be to investigate the construction of sketching functions that allow one to estimate information theoretic measures of distance between distributions such as the KL-divergence, commonly used in statistical learning theory. Since the KL-divergence is neither symmetric nor satisfies the triangle inequality, new ideas would be required in order to design a sketch function to approximate it. Such a sketch function, if one exists, would be a very valuable tool in compactly representing complex distributions.

7. REFERENCES

[1] N. Alon, P. B. Gibbons, Y. Matias, and M. Szegedy. Tracking Join and Self-Join Sizes in Limited Storage. Proc. 18th PODS.
[2] N. Alon, Y. Matias, and M. Szegedy. The Space Complexity of Approximating the Frequency Moments. JCSS 58(1), 1999.
[3] Y. Bartal. Probabilistic approximation of metric spaces and its algorithmic application. Proc. 37th FOCS.
[4] Y. Bartal. On approximating arbitrary metrics by tree metrics. Proc. 30th STOC.
[5] A. Z. Broder. On the resemblance and containment of documents. Proc. Compression and Complexity of SEQUENCES. IEEE Computer Society.
[6] A. Z. Broder. Filtering near-duplicate documents. Proc. FUN 98.
[7] A. Z. Broder, M. Charikar, A. Frieze, and M. Mitzenmacher. Min-wise independent permutations. Proc. 30th STOC.
[8] A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig. Syntactic clustering of the Web. Proc. 6th Int'l World Wide Web Conference.
[9] A. Z. Broder, R. Krauthgamer, and M. Mitzenmacher. Improved classification via connectivity information. Proc. 11th SODA.
[10] G. Calinescu, H. J. Karloff, and Y. Rabani. Approximation algorithms for the 0-extension problem. Proc. 11th SODA, pp. 8-16.
[11] C. Chekuri, S. Khanna, J. Naor, and L. Zosin. Approximation algorithms for the metric labeling problem via a new linear programming formulation. Proc. 12th SODA.
[12] Z. Chen, H. V. Jagadish, F. Korn, N. Koudas, S. Muthukrishnan, R. T. Ng, and D. Srivastava. Counting Twig Matches in a Tree. Proc. 17th ICDE.
[13] Z. Chen, F. Korn, N. Koudas, and S. Muthukrishnan. Selectivity Estimation for Boolean Queries. Proc. 19th PODS.
[14] E. Cohen, M. Datar, S. Fujiwara, A. Gionis, P. Indyk, R. Motwani, J. D. Ullman, and C. Yang. Finding Interesting Associations without Support Pruning. Proc. 16th ICDE.
[15] S. Cohen and L. Guibas. The Earth Mover's Distance under Transformation Sets. Proc. 7th IEEE Intnl. Conf. Computer Vision.
[16] S. Cohen and L. Guibas. The Earth Mover's Distance: Lower Bounds and Invariance under Translation. Tech. report STAN-CS-TR, Dept. of Computer Science, Stanford University.
[17] L. Engebretsen, P. Indyk, and R. O'Donnell. Derandomized dimensionality reduction with applications. To appear in Proc. 13th SODA.
[18] P. B. Gibbons and Y. Matias. Synopsis Data Structures for Massive Data Sets. Proc. 10th SODA.
[19] A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss. Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries. Proc. 27th VLDB.
[20] A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss. QuickSAND: Quick Summary and Analysis of Network Data. DIMACS Technical Report, November.
[21] A. C. Gilbert, S. Guha, P. Indyk, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss. Fast, Small-Space Algorithms for Approximate Histogram Maintenance. These proceedings.
[22] A. Gionis, D. Gunopulos, and N. Koudas. Efficient and Tunable Similar Set Retrieval. Proc. SIGMOD Conference.
[23] A. Gionis, P. Indyk, and R. Motwani. Similarity Search in High Dimensions via Hashing. Proc. 25th VLDB.
[24] M. X. Goemans and D. P. Williamson. Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming. JACM 42(6).
[25] S. Guha, N. Mishra, R. Motwani, and L. O'Callaghan. Clustering data streams. Proc. 41st FOCS.
[26] A. Gupta and Éva Tardos. A constant factor approximation algorithm for a class of classification problems. Proc. 32nd STOC.
[27] T. H. Haveliwala, A. Gionis, and P. Indyk. Scalable Techniques for Clustering the Web. Proc. 3rd WebDB.
[28] P. Indyk. A small approximately min-wise independent family of hash functions. Proc. 10th SODA.
[29] P. Indyk. On approximate nearest neighbors in non-Euclidean spaces. Proc. 40th FOCS.
[30] P. Indyk. Stable Distributions, Pseudorandom Generators, Embeddings and Data Stream Computation. Proc. 41st FOCS.
[31] P. Indyk and R. Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. Proc. 30th STOC.
[32] P. Indyk, R. Motwani, P. Raghavan, and S. Vempala. Locality-Preserving Hashing in Multidimensional Spaces. Proc. 29th STOC.
[33] J. M. Kleinberg and Éva Tardos. Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields. Proc. 40th FOCS.
[34] N. Linial and O. Sasson. Non-Expansive Hashing. Combinatorica 18(1).
[35] N. Nisan. Pseudorandom sequences for space bounded computations. Combinatorica, 12.
[36] Y. Rubner. Perceptual Metrics for Image Database Navigation. PhD Thesis, Stanford University, May 1999.
[37] Y. Rubner, L. J. Guibas, and C. Tomasi. The Earth Mover's Distance, Multi-Dimensional Scaling, and Color-Based Image Retrieval. Proc. of the ARPA Image Understanding Workshop.
[38] Y. Rubner and C. Tomasi. Texture Metrics. Proc. IEEE International Conference on Systems, Man, and Cybernetics, 1998.
[39] Y. Rubner, C. Tomasi, and L. J. Guibas. A Metric for Distributions with Applications to Image Databases. Proc. IEEE Int. Conf. on Computer Vision.
[40] Y. Rubner, C. Tomasi, and L. J. Guibas. The Earth Mover's Distance as a Metric for Image Retrieval. Tech. Report STAN-CS-TN-98-86, Dept. of Computer Science, Stanford University.
[41] M. Ruzon and C. Tomasi. Color edge detection with the compass operator. Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2.
[42] M. Ruzon and C. Tomasi. Corner detection in textured color images. Proc. IEEE Int. Conf. Computer Vision, 2, 1999.


More information

A Lyapunov Optimization Approach to Repeated Stochastic Games

A Lyapunov Optimization Approach to Repeated Stochastic Games PROC. ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING, OCT. 2013 1 A Lyapunov Optmzaton Approach to Repeated Stochastc Games Mchael J. Neely Unversty of Southern Calforna http://www-bcf.usc.edu/

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

Efficient Project Portfolio as a tool for Enterprise Risk Management

Efficient Project Portfolio as a tool for Enterprise Risk Management Effcent Proect Portfolo as a tool for Enterprse Rsk Management Valentn O. Nkonov Ural State Techncal Unversty Growth Traectory Consultng Company January 5, 27 Effcent Proect Portfolo as a tool for Enterprse

More information

An interactive system for structure-based ASCII art creation

An interactive system for structure-based ASCII art creation An nteractve system for structure-based ASCII art creaton Katsunor Myake Henry Johan Tomoyuk Nshta The Unversty of Tokyo Nanyang Technologcal Unversty Abstract Non-Photorealstc Renderng (NPR), whose am

More information

Detecting Global Motion Patterns in Complex Videos

Detecting Global Motion Patterns in Complex Videos Detectng Global Moton Patterns n Complex Vdeos Mn Hu, Saad Al, Mubarak Shah Computer Vson Lab, Unversty of Central Florda {mhu,sal,shah}@eecs.ucf.edu Abstract Learnng domnant moton patterns or actvtes

More information

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part

More information

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence

More information

POLYSA: A Polynomial Algorithm for Non-binary Constraint Satisfaction Problems with and

POLYSA: A Polynomial Algorithm for Non-binary Constraint Satisfaction Problems with and POLYSA: A Polynomal Algorthm for Non-bnary Constrant Satsfacton Problems wth and Mguel A. Saldo, Federco Barber Dpto. Sstemas Informátcos y Computacón Unversdad Poltécnca de Valenca, Camno de Vera s/n

More information

Social Nfluence and Its Models

Social Nfluence and Its Models Influence and Correlaton n Socal Networks Ars Anagnostopoulos Rav Kumar Mohammad Mahdan Yahoo! Research 701 Frst Ave. Sunnyvale, CA 94089. {ars,ravkumar,mahdan}@yahoo-nc.com ABSTRACT In many onlne socal

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

Fast Fuzzy Clustering of Web Page Collections

Fast Fuzzy Clustering of Web Page Collections Fast Fuzzy Clusterng of Web Page Collectons Chrstan Borgelt and Andreas Nürnberger Dept. of Knowledge Processng and Language Engneerng Otto-von-Guercke-Unversty of Magdeburg Unverstätsplatz, D-396 Magdeburg,

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

Chapter 7: Answers to Questions and Problems

Chapter 7: Answers to Questions and Problems 19. Based on the nformaton contaned n Table 7-3 of the text, the food and apparel ndustres are most compettve and therefore probably represent the best match for the expertse of these managers. Chapter

More information

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

How To Understand The Results Of The German Meris Cloud And Water Vapour Product Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University

Characterization of Assembly. Variation Analysis Methods. A Thesis. Presented to the. Department of Mechanical Engineering. Brigham Young University Characterzaton of Assembly Varaton Analyss Methods A Thess Presented to the Department of Mechancal Engneerng Brgham Young Unversty In Partal Fulfllment of the Requrements for the Degree Master of Scence

More information

On Lockett pairs and Lockett conjecture for π-soluble Fitting classes

On Lockett pairs and Lockett conjecture for π-soluble Fitting classes On Lockett pars and Lockett conjecture for π-soluble Fttng classes Lujn Zhu Department of Mathematcs, Yangzhou Unversty, Yangzhou 225002, P.R. Chna E-mal: ljzhu@yzu.edu.cn Nanyng Yang School of Mathematcs

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

To Fill or not to Fill: The Gas Station Problem

To Fill or not to Fill: The Gas Station Problem To Fll or not to Fll: The Gas Staton Problem Samr Khuller Azarakhsh Malekan Julán Mestre Abstract In ths paper we study several routng problems that generalze shortest paths and the Travelng Salesman Problem.

More information

A Performance Analysis of View Maintenance Techniques for Data Warehouses

A Performance Analysis of View Maintenance Techniques for Data Warehouses A Performance Analyss of Vew Mantenance Technques for Data Warehouses Xng Wang Dell Computer Corporaton Round Roc, Texas Le Gruenwald The nversty of Olahoma School of Computer Scence orman, OK 739 Guangtao

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center Dynamc Resource Allocaton and Power Management n Vrtualzed Data Centers Rahul Urgaonkar, Ulas C. Kozat, Ken Igarash, Mchael J. Neely urgaonka@usc.edu, {kozat, garash}@docomolabs-usa.com, mjneely@usc.edu

More information

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

320 The Internatonal Arab Journal of Informaton Technology, Vol. 5, No. 3, July 2008 Comparsons Between Data Clusterng Algorthms Osama Abu Abbas Computer Scence Department, Yarmouk Unversty, Jordan Abstract:

More information

+ + + - - This circuit than can be reduced to a planar circuit

+ + + - - This circuit than can be reduced to a planar circuit MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to

More information

A Simple Approach to Clustering in Excel

A Simple Approach to Clustering in Excel A Smple Approach to Clusterng n Excel Aravnd H Center for Computatonal Engneerng and Networng Amrta Vshwa Vdyapeetham, Combatore, Inda C Rajgopal Center for Computatonal Engneerng and Networng Amrta Vshwa

More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

On the Optimal Control of a Cascade of Hydro-Electric Power Stations On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;

More information

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts Power-of-wo Polces for Sngle- Warehouse Mult-Retaler Inventory Systems wth Order Frequency Dscounts José A. Ventura Pennsylvana State Unversty (USA) Yale. Herer echnon Israel Insttute of echnology (Israel)

More information

The descriptive complexity of the family of Banach spaces with the π-property

The descriptive complexity of the family of Banach spaces with the π-property Arab. J. Math. (2015) 4:35 39 DOI 10.1007/s40065-014-0116-3 Araban Journal of Mathematcs Ghadeer Ghawadrah The descrptve complexty of the famly of Banach spaces wth the π-property Receved: 25 March 2014

More information