Binary Embedding: Fundamental Limits and Fast Algorithm


Xinyang Yi, The University of Texas at Austin
Eric Price, The University of Texas at Austin
Constantine Caramanis, The University of Texas at Austin

Abstract

Binary embedding is a nonlinear dimension reduction methodology where high dimensional data are embedded into the Hamming cube while preserving the structure of the original space. Specifically, for an arbitrary set of $N$ distinct points in $S^{p-1}$, our goal is to encode each point using $m$-dimensional binary strings such that we can reconstruct their geodesic distance up to $\delta$ uniform distortion. Existing binary embedding algorithms either lack theoretical guarantees or suffer from running time $O(mp)$. We make three contributions: (1) we establish a lower bound that shows any binary embedding oblivious to the set of points requires $m = \Omega(\frac{1}{\delta^2}\log N)$ bits, and a similar lower bound for non-oblivious embeddings into Hamming distance; (2) we propose a novel fast binary embedding algorithm with provably optimal bit complexity $m = O(\frac{1}{\delta^2}\log N)$ and near linear running time $O(p \log p)$ whenever $\log N \ll \delta\sqrt{p}$, with a slightly worse running time for larger $\log N$; (3) we also provide an analytic result about embedding a general set of points $K \subseteq S^{p-1}$, possibly of infinite size. Our theoretical findings are supported through experiments on both synthetic and real data sets.

1 Introduction

Low distortion embeddings that transform high-dimensional points to low-dimensional space have played an important role in dealing with storage, information retrieval and machine learning problems for modern datasets. Perhaps one of the most famous results along these lines is the Johnson-Lindenstrauss (JL) lemma Johnson and Lindenstrauss (1984), which shows that $N$ points can be embedded into an $O(\delta^{-2}\log N)$-dimensional space while preserving pairwise Euclidean distance up to $\delta$-Lipschitz distortion. This $\delta^{-2}$ dependence has been shown to be information-theoretically optimal Alon (2003).
Significant work has focused on fast algorithms for computing the embeddings, e.g., (Ailon and Chazelle, 2006; Krahmer and Ward, 2011; Ailon and Liberty, 2013; Cheraghchi et al., 2013; Nelson et al., 2014).

More recently, there has been a growing interest in designing binary codes for high dimensional points with low distortion, i.e., embeddings into the binary cube (Weiss et al., 2009; Raginsky and Lazebnik, 2009; Salakhutdinov and Hinton, 2009; Liu et al., 2011; Gong and Lazebnik, 2011; Yu et al., 2014). Compared to JL embedding, embedding into the binary cube (also called binary embedding) has two advantages in practice: (i) As each data point is represented by a binary code, the disk size for storing the entire dataset is reduced considerably. (ii) Distance in the binary cube is some function of the Hamming distance, which can be computed quickly using computationally efficient bit-wise operators. As a consequence, binary embedding can be applied to a large number of domains such as biology, finance and computer vision where the data are usually high dimensional.

While most JL embeddings are linear maps, any binary embedding is fundamentally a nonlinear transformation. As we detail below, this nonlinearity poses significant new technical challenges for both upper and lower bounds. In particular, our understanding of the landscape is significantly less complete. To the best of our knowledge, lower bounds are not known; embedding algorithms for infinite sets have a dependence on the distortion $\delta$ that significantly exceeds that of their finite-set counterparts; and perhaps most significantly, there are no fast (near linear-time) embedding algorithms with strong performance guarantees. As we explain below, this paper contributes to each of these three areas. First, we detail some recent work and state of the art results.

Recent Work. A common approach, pursued by several existing works, considers the natural extension of JL embedding techniques via one bit quantization of the projections:

$$b(x) = \mathrm{sign}(Ax), \tag{1.1}$$

where $x \in \mathbb{R}^p$ is the input data point, $A \in \mathbb{R}^{m \times p}$ is a projection matrix and $b(x)$ is the embedded binary code. In particular, Jacques et al. (2011) show that when each entry of $A$ is generated independently from $N(0, 1)$ and $m > \frac{1}{\delta^2}\log N$, the embedding achieves, with high probability, at most $\delta$ (additive) distortion for $N$ points. Plan and Vershynin (2014) extend these results to arbitrary sets $K \subseteq S^{p-1}$ where $K$ can be infinite. They prove that an embedding with $\delta$-distortion can be obtained when $m \gtrsim w(K)^2/\delta^6$, where $w(K)$ is the Gaussian mean width of $K$. It is unknown whether the unusual $\delta^{-6}$ dependence is optimal or not.

Despite provable sample complexity guarantees, one bit quantization of random projection as in (1.1) suffers from $O(mp)$ running time for a single point. This quadratic dependence can result in a prohibitive computational cost for high-dimensional data. Analogously to the developments in fast JL embeddings, several algorithms have been proposed to overcome this computational issue. Gong et al. (2013) propose a bilinear projection method. By setting $m = O(p)$, their method reduces the running time from $O(p^2)$ to $O(p^{1.5})$. More recently, Yu et al. (2014) introduce a circulant random projection algorithm that requires running time $O(p \log p)$. While these algorithms have reduced running time, as of yet they come without performance guarantees: to the best of our knowledge, the measurement complexities of the two algorithms are still unknown.

Another line of work considers learning binary codes from data by solving certain optimization problems (Weiss et al., 2009; Salakhutdinov and Hinton, 2009; Norouzi et al., 2012; Yu et al., 2014). Unfortunately, there is no known provable bit complexity result for these algorithms. It is also worth noting that Raginsky and Lazebnik (2009) provide a binary code design for preserving shift-invariant kernels. Their method suffers from the same quadratic computational cost as the fully random Gaussian projection method.
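As a rough illustration of the one-bit quantization scheme (1.1), and of why Hamming distance is cheap to evaluate with bit-wise operators, the following Python sketch packs each code into an integer and compares codes with XOR and a population count. The function names `one_bit_code` and `normalized_hamming`, the bit-packing convention, and the small hand-picked matrix are assumptions made for illustration; they are not taken from the paper.

```python
def one_bit_code(A, x):
    """Pack sign(Ax) into an int: bit i is 1 when <A_i, x> >= 0 (illustrative convention)."""
    code = 0
    for i, row in enumerate(A):
        inner = sum(a * xi for a, xi in zip(row, x))
        if inner >= 0:
            code |= 1 << i
    return code


def normalized_hamming(code_a, code_b, m):
    """Fraction of the m bit positions in which two packed codes differ."""
    return bin(code_a ^ code_b).count("1") / m


# Hypothetical 3-by-2 projection matrix, chosen by hand so the result is easy to check.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = one_bit_code(A, [1.0, 2.0])   # all three inner products are positive
c = one_bit_code(A, [1.0, -2.0])  # the last two inner products are negative
print(normalized_hamming(b, c, m=3))  # two of three bits differ
```

In practice $A$ would be drawn at random as described above; a fixed matrix is used here only so that the two differing bits can be verified by hand.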

Another related dimension reduction technique is locality sensitive hashing (LSH), where the goal is to compute a discrete data structure such that similar points are mapped into the same bucket with high probability (see, e.g., Andoni and Indyk (2006)). The key difference is that LSH preserves short distances, whereas binary embedding preserves both short and far distances. For points that are far apart, LSH only cares that the hashes are different, while binary embedding cares how different they are.

Contributions of this paper. In this paper, we address several unanswered problems about binary embedding. We provide lower bounds for both data-oblivious and data-aware embeddings; we provide a fast algorithm for binary embedding; and finally we consider the setting of infinite sets, and prove that in some of the most common cases we can improve the state-of-the-art sample complexity guarantees by a factor of $\delta^2$:

1. We provide two lower bounds for binary embeddings. The first shows that any method for embedding, and for recovering a distance estimate from the embedded points, that is independent of the data being embedded must use $\Omega(\frac{1}{\delta^2}\log N)$ bits. This is based on a bound on the communication complexity of Hamming distance used by Jayram and Woodruff (2013) for a lower bound on the distributional JL embedding. Separately, we give a lower bound for arbitrarily data-dependent methods that embed into (any function of) the Hamming distance, showing such algorithms require $m = \Omega(\frac{1}{\delta^2 \log(1/\delta)}\log N)$. This bound is similar to Alon (2003), which gets the same result for JL, but the binary embedding requires a different construction.

2. We provide the first provable fast algorithm with optimal measurement complexity $O(\frac{1}{\delta^2}\log N)$. The proposed algorithm has running time $O(\frac{1}{\delta^2}\log\frac{1}{\delta}\,\log^2 N \log p \log^3\log N + p \log p)$ and thus has almost linear time complexity when $\log N \lesssim \delta\sqrt{p}$. Our algorithm is based on two key novel ideas.
First, our similarity is based on the median Hamming distance of sub-blocks of the binary code; second, our new embedding takes advantage of a pair-wise independence argument for Gaussian Toeplitz projection that could be of independent interest.

3. For an arbitrary set $K \subseteq S^{p-1}$ and the fully random Gaussian projection algorithm, we prove that $m = O(w(K^+)^2/\delta^4)$ is sufficient to achieve $\delta$ uniform distortion. Here $K^+$ is an expanded set of $K$. Although in general $K \subseteq K^+$ and hence $w(K) \le w(K^+)$, for interesting $K$ such as sparse or low rank sets, one can show $w(K^+) = \Theta(w(K)) \ll \sqrt{p}$. Therefore applying our theory to these sets results in an improved dependence on $\delta$ compared to a recent result in Plan and Vershynin (2014). See Section 3.3 for a detailed discussion.

Discussion. For fast binary embedding, one simple solution, to the best of our knowledge not previously stated, is to combine a Gaussian projection with the well known results about fast JL. In detail, consider the strategy $b(x) = \mathrm{sign}(AFx)$, where $A$ is a Gaussian matrix and $F$ is any fast JL construction such as a subsampled Walsh-Hadamard matrix Rudelson and Vershynin (2008) or a partial circulant matrix Krahmer et al. (2014) with column flips. A simple analysis shows that this approach achieves measurement complexity $O(\frac{1}{\delta^2}\log N)$ and running time $O(\frac{1}{\delta^4}\log^2 N \log p \log^3\log N + p \log p)$ by following the best known fast JL results. Our fast binary embedding algorithm builds on this simple but effective idea. Instead of using a Gaussian matrix after the fast JL transform, we use a series of Gaussian Toeplitz matrices that have fast matrix-vector multiplication. This novel construction improves the running time by a factor of $\delta^2$ while keeping the measurement complexity the same. In order for this to work, we need to change the estimator from the plain Hamming distance to one based on the median of several Hamming distances.

An interesting point of comparison is Ailon and Rauhut (2014), which considers RIP-optimal distributions that give JL embeddings with optimal measurement complexity $O(\frac{1}{\delta^2}\log N)$ and running time $O(p \log p)$. They show the existence of such embeddings whenever $\log N < \delta^2 p^{1/2-\gamma}$ for any constant $\gamma > 0$, which is essentially no better than the bound given by the folklore method of composing a Gaussian projection with a subsampled Fourier matrix. In our binary setting, we show how to improve the region of optimality by a factor of $\delta$. It would be interesting to try to translate this result back to the JL setting.

Notation. We use $[n]$ to denote the set of natural numbers $\{1, 2, \ldots, n\}$. For natural numbers $a < b$, let $[a, b]$ denote the consecutive set $\{a, a+1, \ldots, b\}$. A vector in $\mathbb{R}^n$ is denoted as $x$ or equivalently $(x_1, x_2, \ldots, x_n)$. We use $x_I$ to denote the sub-vector of $x$ with index set $I \subseteq [n]$. We denote entry-wise vector multiplication as $x \odot y = (x_1 y_1, x_2 y_2, \ldots, x_n y_n)$. A matrix is typically denoted as $M$. Term $(i, j)$ of $M$ is denoted as $M_{i,j}$. Row $i$ of $M$ is denoted as $M_i$. An $n$-by-$n$ identity matrix is denoted as $I_n$. For two random variables $X, Y$, we denote the statement that $X$ and $Y$ are independent as $X \perp Y$. For two binary strings $a, b \in \{0, 1\}^m$, we use $d_H(a, b)$ to denote the normalized Hamming distance, i.e., $d_H(a, b) := \frac{1}{m}\sum_{i=1}^{m} \mathbb{1}(a_i \ne b_i)$.

2 Organization, Problem Setup and Preliminaries

In this section, we state our problem formally, give some key definitions and present a simple (known) algorithm that sets the stage for the main results of this paper. The algorithm (Algorithm 1), discussed in detail below, is simply the one-bit quantization of a standard JL embedding. Its performance on finite sets is easy to analyze, and we state it in Proposition 2.2 below.
Three important questions remain unanswered: (i) Lower Bounds: is the performance guaranteed by Proposition 2.2 optimal? We answer this affirmatively in Section 3.1. (ii) Fast Embedding: whereas Algorithm 1 is quadratic (depending on the product $mp$), fast JL algorithms are nearly linear in $p$; does something similar exist for binary embedding? We develop a new algorithm in Section 3.2 that addresses the complexity issue, while at the same time guaranteeing $\delta$-embedding with dimension scaling that matches our lower bound. Interestingly, a key aspect of our contribution is that we use a slightly modified similarity function, using the median of the normalized Hamming distance on sub-blocks. (iii) Infinite Sets: recent work analyzing the setting of infinite sets $K \subseteq S^{p-1}$ shows a dependence of $\delta^{-6}$ on the distortion. Is this optimal? We show in Section 3.3 that in many settings this can be improved by a factor of $\delta^2$. In Section 4, we provide numerical results. We give most proofs in Section 5.

2.1 Problem Setup

Given a set of $p$-dimensional points, our goal is to find a transformation $f : \mathbb{R}^p \to \{0, 1\}^m$ such that the Hamming distance (or another related, easily computable metric) between two binary codes is close to their similarity in the original space. We consider points on the unit sphere $S^{p-1}$ and use the normalized geodesic distance (occasionally, and somewhat misleadingly, called cosine similarity) as the input space similarity metric. For two points $x, y \in \mathbb{R}^p$, we use $d(x, y)$ to denote the geodesic distance, defined as

$$d(x, y) := \frac{1}{\pi}\,\angle\!\left(\frac{x}{\|x\|_2}, \frac{y}{\|y\|_2}\right),$$

where $\angle(\cdot, \cdot)$ denotes the angle between two vectors. For $x, y \in S^{p-1}$, the metric $d(x, y)$ is proportional to the length of the shortest path connecting $x, y$ on the sphere.

Given the success of JL embedding, a natural approach is to consider the one bit quantization of a random projection:

$$b = \mathrm{sign}(Ax), \tag{2.1}$$

where $A$ is some random projection matrix. Given two points $x, y$ with embedding vectors $b$ and $c$, we have $b_i \ne c_i$ if and only if $\langle A_i, x\rangle \langle A_i, y\rangle < 0$. The traditional metric in the embedded space has been the so-called normalized Hamming distance, which we denote by $d_A(x, y)$ and define as follows:

$$d_A(x, y) := \frac{1}{m}\sum_{i=1}^{m} \mathbb{1}\left\{\mathrm{sign}(\langle A_i, x\rangle) \ne \mathrm{sign}(\langle A_i, y\rangle)\right\}. \tag{2.2}$$

Definition 2.1. ($\delta$-uniform Embedding) Given a set $K \subseteq S^{p-1}$ and projection matrix $A \in \mathbb{R}^{m \times p}$, we say the embedding $b = \mathrm{sign}(Ax)$ provides a $\delta$-uniform embedding for points in $K$ if

$$\left| d_A(x, y) - d(x, y) \right| \le \delta, \quad \forall\, x, y \in K. \tag{2.3}$$

Note that unlike for JL, we aim to control additive error instead of relative error. Due to the inherently limited resolution of binary embedding, controlling relative error would force the embedding dimension $m$ to scale inversely with the minimum distance of the original points, and in particular would be impossible for any infinite set.

2.2 Uniform Random Projection

Algorithm 1 Uniform Random Projection
input: Finite number of points $K = \{x_i\}_{i=1}^{|K|}$ where $K \subseteq S^{p-1}$, embedding target dimension $m$.
1: Construct matrix $A \in \mathbb{R}^{m \times p}$ where each entry $A_{i,j}$ is drawn independently from $N(0, 1)$.
2: for $i = 1, 2, \ldots, |K|$ do
3:   $b_i \leftarrow \mathrm{sign}(A x_i)$.
4: end for
output: $\{b_i\}_{i=1}^{|K|}$

Algorithm 1 presents (2.1) formally, when $A$ is an i.i.d. Gaussian random matrix, i.e., $A_i \sim N(0, I_p)$ for any $i \in [m]$. It is easy to observe that for two fixed points $x, y \in S^{p-1}$ we have

$$\mathbb{E}\left(\mathbb{1}\left\{\mathrm{sign}(\langle A_i, x\rangle) \ne \mathrm{sign}(\langle A_i, y\rangle)\right\}\right) = d(x, y), \quad \forall\, i \in [m]. \tag{2.4}$$
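A minimal sketch of Algorithm 1, together with an empirical check of (2.4), is given below. It assumes plain Python lists instead of a linear algebra library; the helper names, the seed, and the choice of $m = 2000$ are illustrative assumptions rather than values from the paper.

```python
import math
import random


def geodesic_distance(x, y):
    """Normalized geodesic distance d(x, y) = angle(x, y) / pi."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    cos = max(-1.0, min(1.0, dot / (nx * ny)))  # clamp against rounding error
    return math.acos(cos) / math.pi


def uniform_random_projection(points, m, rng):
    """Algorithm 1: draw A with i.i.d. N(0, 1) entries and return sign(A x) as 0/1 lists."""
    p = len(points[0])
    A = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(m)]
    return [
        [1 if sum(a * xi for a, xi in zip(row, x)) >= 0 else 0 for row in A]
        for x in points
    ]


def normalized_hamming(b, c):
    return sum(bi != ci for bi, ci in zip(b, c)) / len(b)


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
x = [1.0, 0.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0, 0.0]  # angle pi/4 with x; sign() and d() are both scale invariant
bx, by = uniform_random_projection([x, y], m=2000, rng=rng)
d_true = geodesic_distance(x, y)   # equals 0.25 up to rounding
d_emb = normalized_hamming(bx, by)  # empirical d_A(x, y)
print(d_true, d_emb)
```

With $m = 2000$ independent rows, the standard deviation of $d_A(x, y)$ around $d(x, y) = 0.25$ is roughly $\sqrt{0.25 \cdot 0.75 / 2000} \approx 0.01$, so the two printed values should be close.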

Equality (2.4) has a geometric explanation: each $A_i$ represents a uniformly distributed random hyperplane in $\mathbb{R}^p$. Then $\mathrm{sign}(\langle A_i, x\rangle) \ne \mathrm{sign}(\langle A_i, y\rangle)$ holds if and only if hyperplane $A_i$ intersects the arc between $x$ and $y$. In fact, $d_A(x, y)$ is equal to the fraction of such hyperplanes. Under such a uniform tessellation, the probability with which the aforementioned event occurs is $d(x, y)$. Applying Hoeffding's inequality and a union bound over the $N^2$ pairs of points (with $N = |K|$), we have the following straightforward guarantee.

Proposition 2.2. Given a set $K \subseteq S^{p-1}$ with finite size $|K|$, consider Algorithm 1 with $m \ge c(1/\delta^2)\log|K|$. Then with probability at least $1 - 2\exp(-\delta^2 m)$, we have

$$\left| d_A(x, y) - d(x, y) \right| \le \delta, \quad \forall\, x, y \in K.$$

Here $c$ is some absolute constant.

Proof. The proof idea is standard and follows from the above; we omit the details.

3 Main Results

We now present our main results on lower bounds, on fast binary embedding, and finally, on a general result for infinite sets.

3.1 Lower Bounds

We offer two different lower bounds. The first shows that any embedding technique that is oblivious to the input points must use $\Omega(\frac{1}{\delta^2}\log N)$ bits, regardless of what method is used to estimate geodesic distance from the embeddings. This shows that uniform random projection and our fast binary embedding achieve optimal bit complexity (up to constants). The bound follows from results by Jayram and Woodruff (2013) on the communication complexity of Hamming distance.

Theorem 3.1. Consider any distribution on embedding functions $f : S^{p-1} \to \{0, 1\}^m$ and reconstruction algorithms $g : \{0, 1\}^m \times \{0, 1\}^m \to \mathbb{R}$ such that for any $x_1, \ldots, x_N \in S^{p-1}$ we have

$$\left| g(f(x_i), f(x_j)) - d(x_i, x_j) \right| \le \delta$$

for all $i, j \in [N]$ with probability $1 - \epsilon$. Then $m = \Omega(\frac{1}{\delta^2}\log(N/\epsilon))$.

Proof. See Section 5.1 for a detailed proof.

One could imagine, however, that an embedding could use knowledge of the input point set to embed any specific set of points into a lower-dimensional space than is possible with an oblivious algorithm.
In the Johnson-Lindenstrauss setting, Alon (2003) showed that this is not possible beyond (possibly) a $\log(1/\delta)$ factor. We show the analogous result for binary embeddings. Relative to Theorem 3.1, our second lower bound works for data-dependent embedding functions but loses a $\log(1/\delta)$ factor and requires the reconstruction function to depend only on the Hamming distance between the two strings. This restriction is natural because an unrestricted data-dependent reconstruction function could simply encode the answers and avoid any dependence on $\delta$.

With the scheme given in (2.1), choosing $A$ as a fully random Gaussian matrix yields $d_A(x, y) \approx d(x, y)$. However, an arbitrary binary embedding algorithm may not yield a linear functional relationship between Hamming distance and geodesic distance. Thus for this lower bound, we allow the design of an algorithm with an arbitrary link function $L$.

Definition 3.2. (Data-dependent binary embedding problem) Let $L : [0, 1] \to [0, 1]$ be a monotonic and continuous function. Given a set of points $x_1, x_2, \ldots, x_N \in S^{p-1}$, we say a binary embedding mapping $f$ solves the binary embedding problem in terms of link function $L$ if

$$\left| d_H(f(x_i), f(x_j)) - L\big(d(x_i, x_j)\big) \right| \le \delta, \quad \forall\, i, j \in [N]. \tag{3.1}$$

Although the choice of $L$ is flexible, note that for the same point, we always have $d_H(f(x_i), f(x_i)) = d(x_i, x_i) = 0$, thus (3.1) implies $L(0) < \delta$. We can just let $L(0) = 0$. In particular, we let $L_{\max} = L(1)$. We have the following lower bound:

Theorem 3.3. There exist $2N$ points $x_1, x_2, \ldots, x_{2N} \in S^{N-1}$ such that for any binary embedding algorithm $f$ on $\{x_i\}_{i=1}^{2N}$, if it solves the data-dependent binary embedding problem defined in Definition 3.2 in terms of link function $L$ and any $\delta \in (0, \frac{1}{16\sqrt{e}} L_{\max})$, it must satisfy

$$m \ge \frac{1}{128e}\left(\frac{L_{\max}}{\delta}\right)^2 \frac{\log N}{\log\frac{L_{\max}}{2\delta}}. \tag{3.2}$$

Proof. See Section 5.2 for a detailed proof.

Remark 3.4. We make two remarks on the above result. (1) When $L_{\max}$ is some constant, our result implies that for general $N$ points, any binary embedding algorithm (even a data-dependent one) must use $\Omega(\frac{1}{\delta^2 \log(1/\delta)}\log N)$ measurements. This is analogous to Alon's lower bound in the JL setting. It is worth highlighting two differences: (i) The JL setting considers the same metric (Euclidean distance) for both the input and the embedded spaces. In binary embedding, however, we are interested in showing the relationship between Hamming distance and geodesic distance. (ii) Our lower bound is applicable to a broader class of binary embedding algorithms as it involves an arbitrary, even data-dependent, link function $L$. Such an extension is not considered in the lower bound of JL.
(2) The stated lower bound only depends on $L_{\max}$ and does not depend on any curvature information of $L$. The constraint $L_{\max} > 16\sqrt{e}\,\delta$ is critical for our lower bound to hold, but some such restriction is necessary because for $L_{\max} < \delta$, we are able to embed all points into just one bit. In this case $d_H(f(x_i), f(x_j)) = 0$ for all pairs and condition (3.1) would hold trivially.

3.2 Fast Binary Embedding

In this section, we present a novel fast binary embedding algorithm. We then establish its theoretical guarantees. There are two key ideas that we leverage: (i) instead of the normalized Hamming distance, we use a related metric, the median of the normalized Hamming distance applied to sub-blocks; and (ii) we show a key pair-wise independence lemma for partial Gaussian Toeplitz projection, which allows us to use a concentration bound that then implies nearness in the median metric we use.

3.2.1 Method

Our algorithm builds on a sub-sampled Walsh-Hadamard matrix and partial Gaussian Toeplitz matrices with random column flips. In particular, an $m$-by-$p$ partial Walsh-Hadamard matrix has the form

$$\Phi := P \cdot H \cdot D. \tag{3.3}$$

The above construction has three components. We characterize each term as follows:

Term $D$ is a $p$-by-$p$ diagonal matrix with diagonal terms $\{\zeta_i\}_{i=1}^{p}$ that are drawn from an i.i.d. Rademacher sequence, i.e., for any $i \in [p]$, $\Pr(\zeta_i = 1) = \Pr(\zeta_i = -1) = 1/2$.

Term $H$ is a $p$-by-$p$ scaled Walsh-Hadamard matrix such that $H^\top H = I_p$.

Term $P$ is an $m$-by-$p$ sparse matrix where one entry of each row is set to be 1 while the rest are 0. The nonzero coordinate of each row is drawn independently from the uniform distribution. In fact, the role of $P$ is to randomly select $m$ of the $p$ rows of $H \cdot D$.

An $m$-by-$n$ partial Gaussian Toeplitz matrix has the form

$$\Psi := P \cdot T \cdot D. \tag{3.4}$$

We introduce each term as follows:

Term $D$ is an $n$-by-$n$ diagonal matrix with diagonal terms $\{\zeta_i\}_{i=1}^{n}$ that are drawn from an i.i.d. Rademacher sequence.

Term $T$ is an $n$-by-$n$ Toeplitz matrix constructed from a $(2n-1)$-dimensional vector $g$ such that $T_{i,j} = g_{i-j+n}$ for any $i, j \in [n]$. In particular, $g$ is drawn from $N(0, I_{2n-1})$.

Term $P$ is an $m$-by-$n$ sparse matrix where $P_i = e_i$ for any $i \in [m]$. Equivalently, we use $P$ to select the first $m$ rows of $TD$. It is worth noting that we only need to select any $m$ distinct rows.

With the above constructions in hand, we present our fast algorithm in Algorithm 2. At a high level, Algorithm 2 consists of two parts: First, we apply a column flipped partial Hadamard transform to convert a $p$-dimensional point into an $n$-dimensional intermediate point. Second, we use $B$ independent $(m/B)$-by-$n$ partial Gaussian Toeplitz matrices and the sign operator to map an intermediate point into $B$ blocks of binary codes. In terms of similarity computation for the embedded codes, we use the median of each block's normalized Hamming distance. In detail, for $b, c \in \{0, 1\}^m$, the $B$-wise normalized Hamming distance is defined as

$$d_H(b, c; B) := \mathrm{median}\left(\left\{ d_H\big(b_{T_i}, c_{T_i}\big) \right\}_{i=0}^{B-1}\right), \tag{3.5}$$

where $T_i = [\,i\,m/B + 1,\ (i+1)\,m/B\,]$.
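The block-wise median distance (3.5) can be sketched in a few lines of Python. The function names and the toy 12-bit codes below are illustrative assumptions; codes are represented as lists of 0/1 values and $m$ is assumed to be divisible by $B$.

```python
from statistics import median


def normalized_hamming(b, c):
    return sum(bi != ci for bi, ci in zip(b, c)) / len(b)


def blockwise_median_hamming(b, c, B):
    """d_H(b, c; B): median over B consecutive blocks of the per-block normalized Hamming distance."""
    m = len(b)
    if len(c) != m or m % B != 0:
        raise ValueError("codes must have equal length divisible by B")
    size = m // B
    per_block = [
        normalized_hamming(b[i * size:(i + 1) * size], c[i * size:(i + 1) * size])
        for i in range(B)
    ]
    return median(per_block)


# Toy codes with m = 12 and B = 3; the per-block distances are 0.25, 1.0 and 0.5.
b = [0, 0, 0, 0,  1, 1, 1, 1,  0, 1, 0, 1]
c = [0, 0, 0, 1,  0, 0, 0, 0,  0, 1, 1, 0]
print(blockwise_median_hamming(b, c, B=3))  # prints 0.5
```

On this toy input the plain normalized Hamming distance is $7/12$, whereas the median over blocks is $0.5$: a single block that disagrees everywhere shifts the average but not the median.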
It is worth noting that our first step is one construction of a fast JL transform. In fact any fast JL transform would work for our construction, but we choose a standard one with real values: based on Rudelson and Vershynin (2008); Cheraghchi et al. (2013); Krahmer and Ward (2011), it is known that with $m = O(\epsilon^{-2}\log N \log p \log^3(\log N))$ measurements, a subsampled Hadamard matrix with column flips becomes an $\epsilon$-JL matrix for $N$ points.

The second part of our algorithm follows framework (2.1). By choosing a Gaussian random vector in each row of $\Psi$, from our previous discussion in Section 2.2, the probability that such a hyperplane intersects the arc between two points is equal to their geodesic distance. Compared to a fully random Gaussian matrix, as used in Algorithm 1, the key difference is that the hyperplanes represented by rows of $\Psi$ are not independent of each other; this poses the main analytical challenge.

Algorithm 2 Fast Binary Embedding
input: Finite number of points $\{x_i\}_{i=1}^{N}$ where each point $x_i \in S^{p-1}$, embedded dimension $m$, intermediate dimension $n$, number of blocks $B$.
1: Draw an $n$-by-$p$ sub-sampled Walsh-Hadamard matrix $\Phi$ according to (3.3). Draw $B$ independent partial Gaussian Toeplitz matrices $\{\Psi^{(j)}\}_{j=1}^{B}$ with size $(m/B)$-by-$n$ according to (3.4).
2: {Part I: Fast JL}
3: for $i = 1, 2, \ldots, N$ do
4:   $y_i \leftarrow \Phi \cdot x_i$.
5: end for
6: {Part II: Partial Gaussian Toeplitz Projection}
7: for $i = 1, 2, \ldots, N$ do
8:   for $j = 1, 2, \ldots, B$ do
9:     $c_j \leftarrow \mathrm{sign}\big(\Psi^{(j)} \cdot y_i\big)$.
10:   end for
11:   $b_i \leftarrow [c_1; c_2; \ldots; c_B]$
12: end for
output: $\{b_i\}_{i=1}^{N}$

3.2.2 Analysis

We give the analysis for Algorithm 2. We first review a well known result about the fast JL transform.

Lemma 3.5. Consider the column flipped partial Hadamard matrix defined in (3.3) with size $m$-by-$p$. For $N$ points $x_1, x_2, \ldots, x_N \in S^{p-1}$, let $y_i = \sqrt{p/m}\,\Phi(\zeta)\,x_i$, $\forall\, i \in [N]$. For some absolute constant $c$, suppose $m \ge c\,\delta^{-2}\log N \log p \log^3(\log N)$; then with probability at least 0.99, we have that for any $i, j \in [N]$

$$\left| \|y_i - y_j\|_2 - \|x_i - x_j\|_2 \right| \le \delta\,\|x_i - x_j\|_2, \tag{3.6}$$

and for any $i \in [N]$

$$\left| \|y_i\|_2 - 1 \right| \le \delta. \tag{3.7}$$

Proof. It can be proved by combining Theorem 4 in Cheraghchi et al. (2013) and Theorem 3.1 in Krahmer and Ward (2011).
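The two parts of Algorithm 2 can be sketched in plain Python as follows. This is a hedged illustration, not the authors' implementation: it assumes $p$ is a power of two, uses an unnormalized Walsh-Hadamard transform (the sign operator is scale invariant), and multiplies by each Toeplitz block naively in $O(n \cdot m/B)$ time instead of using a fast Toeplitz multiplication. The function names, the seed, and the toy dimensions are hypothetical.

```python
import math
import random


def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for k in range(start, start + h):
                a, b = v[k], v[k + h]
                v[k], v[k + h] = a + b, a - b
        h *= 2
    return v


def draw_matrices(p, n, B, rng):
    """Draw the random ingredients of Phi in (3.3) and of B Toeplitz blocks as in (3.4)."""
    flips_p = [rng.choice((-1.0, 1.0)) for _ in range(p)]  # diagonal of D in (3.3)
    rows = [rng.randrange(p) for _ in range(n)]            # P in (3.3): n rows drawn uniformly
    blocks = []
    for _ in range(B):
        g = [rng.gauss(0.0, 1.0) for _ in range(2 * n - 1)]    # generator of T
        flips_n = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # diagonal of D in (3.4)
        blocks.append((g, flips_n))
    return flips_p, rows, blocks


def fast_binary_embedding(x, matrices, m, B):
    """Return the m-bit code of x as a list of 0/1 values (needs m % B == 0 and m // B <= n)."""
    flips_p, rows, blocks = matrices
    n = len(rows)
    # Part I: y = Phi x. The scaling of H is dropped because sign() ignores it.
    transformed = fwht([f * xi for f, xi in zip(flips_p, x)])
    y = [transformed[r] for r in rows]
    # Part II: each block applies the first m/B rows of T D, then takes signs.
    bits = []
    for g, flips_n in blocks:
        z = [f * yi for f, yi in zip(flips_n, y)]
        for i in range(m // B):
            # 0-based version of T[i][j] = g[i - j + n] from the text.
            inner = sum(g[i - j + n - 1] * z[j] for j in range(n))
            bits.append(1 if inner >= 0 else 0)
    return bits


rng = random.Random(0)  # arbitrary seed
p, n, m, B = 16, 8, 12, 3  # toy sizes, far below what the theory requires
matrices = draw_matrices(p, n, B, rng)
raw = [rng.gauss(0.0, 1.0) for _ in range(p)]
norm = math.sqrt(sum(t * t for t in raw))
x = [t / norm for t in raw]
code = fast_binary_embedding(x, matrices, m, B)
print(code)
```

The same `matrices` must be reused for every point so that codes are comparable; distances between codes would then be evaluated with the block-wise median metric (3.5) rather than the plain Hamming distance.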

Lemma 3.5 suggests that the first part of our algorithm reduces the dimension while preserving well the Euclidean distance of each pair. Under this condition, all the pairwise geodesic distances are also well preserved, as confirmed by the following result.

Lemma 3.6. Consider the set of embedded points $\{y_i\}_{i=1}^{N}$ defined in Lemma 3.5. Suppose conditions (3.6)-(3.7) hold with $\delta > 0$. Then for any $i, j \in [N]$,

$$\left| d(y_i, y_j) - d(x_i, x_j) \right| \le C\delta \tag{3.8}$$

holds with some absolute constant $C$.

Proof. We postpone the proof to Appendix A.

The next result is our independence lemma, and is one of the key technical ideas that make our result possible. The result shows that for any fixed $x$, Gaussian Toeplitz projection (with column flips) followed by $\mathrm{sign}(\cdot)$ generates pair-wise independent binary codes.

Lemma 3.7. Let $g \sim N(0, I_{2n-1})$ and let $\zeta = \{\zeta_i\}_{i=1}^{n}$ be an i.i.d. Rademacher sequence. Let $T$ be a random Toeplitz matrix constructed from $g$ such that $T_{i,j} = g_{i-j+n}$. Consider any two distinct rows of $T$, say $\xi, \xi'$. For any two fixed vectors $x, y \in \mathbb{R}^n$, we define the following random variables:

$$X = \mathrm{sign}\langle \xi \odot \zeta, x\rangle, \quad X' = \mathrm{sign}\langle \xi' \odot \zeta, x\rangle; \qquad Y = \mathrm{sign}\langle \xi \odot \zeta, y\rangle, \quad Y' = \mathrm{sign}\langle \xi' \odot \zeta, y\rangle.$$

We have $X \perp X'$, $X \perp Y'$, $Y \perp X'$, $Y \perp Y'$.

Proof. See Section 5.3 for a detailed proof.

We are ready to prove the following result about Algorithm 2.

Theorem 3.8. Consider Algorithm 2 with random matrices $\Phi$, $\Psi$ defined in (3.3) and (3.4) respectively. For a finite number of points $\{x_i\}_{i=1}^{N}$, let $b_i$ be the binary code of $x_i$ generated by Algorithm 2. Suppose we set

$$B \ge c \log N, \qquad n \ge c'(1/\delta^2)\log N \log p \log^3(\log N), \qquad m/B \ge c''(1/\delta^2),$$

with some absolute constants $c, c', c''$; then with probability at least 0.98, we have that for any $i, j \in [N]$

$$\left| d_H(b_i, b_j; B) - d(x_i, x_j) \right| \le \delta.$$

The similarity metric $d_H(\cdot, \cdot; B)$ is the median of normalized Hamming distances defined in (3.5).

Proof. See Section 5 for a detailed proof.
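The pair-wise independence claimed in Lemma 3.7 can be probed numerically. For variables taking values in $\{-1, +1\}$, independence of $X$ and $X'$ is equivalent to $\mathbb{E}[XX'] = \mathbb{E}[X]\,\mathbb{E}[X']$, so the sketch below simply estimates these three means by Monte Carlo. The fixed vector `x`, the choice of the first two rows, the seed, and the number of trials are arbitrary assumptions; this is a sanity check under those assumptions, not a proof.

```python
import random


def sign(t):
    return 1 if t >= 0 else -1


def sample_signs(x, row_a, row_b, rng):
    """Draw (g, zeta) once and return the signs produced by two rows of the flipped Toeplitz matrix."""
    n = len(x)
    g = [rng.gauss(0.0, 1.0) for _ in range(2 * n - 1)]
    zeta = [rng.choice((-1, 1)) for _ in range(n)]

    def project(i):
        # 0-based row i of T, flipped entrywise by zeta, applied to x.
        return sum(g[i - j + n - 1] * zeta[j] * x[j] for j in range(n))

    return sign(project(row_a)), sign(project(row_b))


rng = random.Random(0)  # arbitrary seed
x = [1.0, -2.0, 3.0, 0.5, 1.5, -1.0, 2.5, 0.25]  # arbitrary fixed vector
trials = 4000
pairs = [sample_signs(x, 0, 1, rng) for _ in range(trials)]
mean_x = sum(a for a, _ in pairs) / trials
mean_xp = sum(b for _, b in pairs) / trials
mean_prod = sum(a * b for a, b in pairs) / trials
print(mean_x, mean_xp, mean_prod)
```

All three estimates should be within a few multiples of $1/\sqrt{4000} \approx 0.016$ of zero. Dropping the column flips `zeta` from `project` generally breaks this, which illustrates why the flips appear in the construction (3.4).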

Theorem 3.8 suggests that the measurement complexity of our fast algorithm is $O(\frac{1}{\delta^2}\log N)$, which matches the performance of Algorithm 1 based on a fully random matrix. Note that this measurement complexity cannot be improved significantly by any data-oblivious binary embedding with any similarity metric, as suggested by Theorem 3.1.

Running time: The first part of our algorithm takes time $O(p \log p)$. Generating a single block of binary codes from a partial Toeplitz matrix takes time $O(n \log(1/\delta))$. Thus the total running time is

$$O\big(Bn\log\tfrac{1}{\delta} + p \log p\big) = O\big(\tfrac{1}{\delta^2}\log\tfrac{1}{\delta}\,\log^2 N \log p \log^3(\log N) + p \log p\big).$$

By ignoring the polynomial $\log\log$ factor, the second term $O(p \log p)$ dominates when $\log N \lesssim \delta\sqrt{p/\log(1/\delta)}$.

Comparison to an alternative algorithm: Instead of utilizing the partial Gaussian Toeplitz projection, an alternative method, to the best of our knowledge not previously stated, is to use a fully random Gaussian projection in the second part of our algorithm. We present the details in Algorithm 3. By combining Proposition 2.2 and Lemma 3.5, it is straightforward to show this algorithm still achieves the same measurement complexity $O(\frac{1}{\delta^2}\log N)$. The corresponding running time is $O(\frac{1}{\delta^4}\log^2 N \log p \log^3(\log N) + p \log p)$, so it is fast when $\log N \lesssim \delta^2\sqrt{p}$. Therefore our algorithm has an improved dependence on $\delta$. This improvement comes from the fast multiplication of the partial Toeplitz matrix and the pair-wise independence argument shown in Lemma 3.7.

Algorithm 3 Alternative Fast Binary Embedding
input: Finite number of points $\{x_i\}_{i=1}^{N}$ where each point $x_i \in S^{p-1}$, embedded dimension $m$, intermediate dimension $n$.
1: Draw an $n$-by-$p$ sub-sampled Walsh-Hadamard matrix $\Phi$ according to (3.3). Construct an $m$-by-$n$ matrix $A$ where each entry is drawn independently from $N(0, 1)$.
2: for $i = 1, 2, \ldots, N$ do
3:   $b_i \leftarrow \mathrm{sign}(A \Phi x_i)$
4: end for
output: $\{b_i\}_{i=1}^{N}$

3.3 $\delta$-uniform Embedding for General $K$

In this section, we turn back to the fully random projection binary embedding (Algorithm 1).
Recall that in Proposition 2.2, we show that for finite size $K$, $m = O(\frac{\log|K|}{\delta^2})$ measurements are sufficient to achieve a $\delta$-uniform embedding. For general $K$, the challenge is that there might be an infinite number of distinct points in $K$, so Proposition 2.2 cannot be applied. In proving the JL lemma for an infinite set $K$, the standard technique is either constructing an $\epsilon$-net of $K$ or reducing the distortion to the deviation bound of a Gaussian process. However, due to the non-linearity essential for binary embedding, these techniques cannot be directly extended to our setting. Therefore strengthening Proposition 2.2 to infinite size $K$ poses significant technical challenges. Before stating our result, we first give some definitions.

Footnote (to the running time analysis in Section 3.2.2): Matrix-vector multiplication for an $m$-by-$n$ partial Toeplitz matrix can be implemented in running time $O(n \log m)$.

Definition 3.9. (Gaussian mean width) Let $g \sim N(0, I_p)$. For any set $K \subseteq S^{p-1}$, the Gaussian mean width of $K$ is defined as

$$w(K) := \mathbb{E}_g \sup_{x \in K} \langle g, x\rangle.$$

Here, $w(K)^2$ measures the effective dimension of the set $K$. In the trivial case $K = S^{p-1}$, we have $w(K)^2 \approx p$. However, when $K$ has some special structure, we may have $w(K)^2 \ll p$. For instance, when $K = \{x \in S^{p-1} : |\mathrm{supp}(x)| \le s\}$, it has been shown that $w(K) = \Theta(\sqrt{s \log(p/s)})$ (see Lemma 2.3 in Plan and Vershynin (2013)).

For a given $\delta$, we define $K_\delta^+$, the expanded version of $K \subseteq S^{p-1}$, as:

$$K_\delta^+ := K \cup \left\{ z \in S^{p-1} : z = \frac{x - y}{\|x - y\|_2},\ \forall\, x, y \in K \text{ if } \delta^2 \le \|x - y\|_2 \le \delta \right\}. \tag{3.9}$$

In other words, $K_\delta^+$ is constructed from $K$ by adding the normalized differences between pairs of points in $K$ that are within $\delta$ of each other but not closer than $\delta^2$. Now we state the main result as follows.

Theorem 3.10. Consider any $K \subseteq S^{p-1}$. Let $A \in \mathbb{R}^{m \times p}$ be an i.i.d. Gaussian matrix where each row $A_i \sim N(0, I_p)$. For any two points $x, y \in K$, $d_A(x, y)$ is defined in (2.2). The expanded set $K_\delta^+$ is defined in (3.9). When $m \ge c\,\frac{w(K_\delta^+)^2}{\delta^4}$ with some absolute constant $c$, we have that

$$\sup_{x, y \in K} \left| d_A(x, y) - d(x, y) \right| \le \delta$$

holds with probability at least $1 - c_1\exp(-c_2\delta^2 m)$, where $c_1, c_2$ are absolute constants.

Proof. See Section 5.4 for a detailed proof.

Remark 3.11. We compare the above result to Theorem 1.5 from the recent paper Plan and Vershynin (2014), where it is proved that for $m \gtrsim w(K)^2/\delta^6$, Algorithm 1 is guaranteed to achieve a $\delta$-uniform embedding for general $K$. Based on definition (3.9), we have

$$w(K) \le w(K_\delta^+) \le \frac{1}{\delta^2}\,w(K - K) \lesssim \frac{1}{\delta^2}\,w(K).$$

Thus in the worst case, Theorem 3.10 recovers the previous result up to a factor $1/\delta^2$. More importantly, for many interesting sets one can show $w(K_\delta^+) \lesssim w(K)$; in such cases, our result leads to an improved dependence on $\delta$. We give several such examples as follows:

Low rank set. For some $U \in \mathbb{R}^{p \times r}$ such that $U^\top U = I_r$, let $K = \{x \in S^{p-1} : x = Uc,\ c \in S^{r-1}\}$. We simply have $K = K_\delta^+$ and $w(K) \lesssim \sqrt{r}$. Our result implies $m = O(r/\delta^4)$.

Sparse set. $K = \{x \in S^{p-1} : |\mathrm{supp}(x)| \le s\}$. In this case we have $K_\delta^+ \subseteq \{x \in S^{p-1} : |\mathrm{supp}(x)| \le 2s\}$. Therefore $w(K_\delta^+) = \Theta(\sqrt{s \log(p/s)})$. Our result implies $m = O(s \log(p/s)/\delta^4)$.

Set with finite size. $|K| < \infty$. As $w(K) \lesssim \sqrt{\log|K|}$ and $|K_\delta^+| \le |K|^2$, our result implies $m = O(\log|K|/\delta^4)$. We thus recover Proposition 2.2 up to a factor $1/\delta^2$.

Applying the result from Plan and Vershynin (2014) to the above sets implies similar results, but the dependence on $\delta$ becomes $1/\delta^6$.
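To make the Gaussian mean width of Definition 3.9 concrete for the sparse set above, the following sketch estimates $w(K)$ by Monte Carlo. It relies on the observation that, for a fixed $g$, the supremum of $\langle g, x\rangle$ over unit vectors with at most $s$ nonzeros equals the Euclidean norm of the $s$ largest-magnitude entries of $g$. The function name, the seed, and the values of $p$, $s$ and the number of trials are illustrative assumptions.

```python
import math
import random


def mean_width_sparse(p, s, trials, rng):
    """Monte Carlo estimates of w(K) for the s-sparse set and for the whole sphere."""
    total_sparse, total_sphere = 0.0, 0.0
    for _ in range(trials):
        squares = sorted((rng.gauss(0.0, 1.0) ** 2 for _ in range(p)), reverse=True)
        total_sparse += math.sqrt(sum(squares[:s]))  # norm of the s largest entries
        total_sphere += math.sqrt(sum(squares))      # sup over S^{p-1} is ||g||_2
    return total_sparse / trials, total_sphere / trials


rng = random.Random(0)  # arbitrary seed
p, s = 256, 4
w_sparse, w_sphere = mean_width_sparse(p, s, trials=200, rng=rng)
# Reference scales: sqrt(s log(p/s)) for the sparse set and sqrt(p) for the sphere.
print(w_sparse, math.sqrt(s * math.log(p / s)), w_sphere, math.sqrt(p))
```

The estimate for the sparse set should be of the same order as $\sqrt{s\log(p/s)}$ (the $\Theta(\cdot)$ hides constants) and well below the estimate for the full sphere, which is close to $\sqrt{p}$.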

4 Numerical Results

In this section, we present the results of experiments we conduct to validate our theory and compare the performance of the following three algorithms we discussed: uniform random projection (URP) (Algorithm 1), fast binary embedding (FBE) (Algorithm 2) and the alternative fast binary embedding (FBE-2) (Algorithm 3). We first apply these algorithms to synthetic datasets. In detail, given parameters $(N, p)$, a synthetic dataset is constructed by sampling $N$ points from $S^{p-1}$ uniformly at random. Recall that $\delta$ is the maximum embedding distortion among all pairs of points. We use $m$ to denote the number of binary measurements. Algorithm FBE needs parameters $n$ and $B$, which are the intermediate dimension and the number of blocks respectively. Based on Theorem 3.8, $n$ is required to be proportional to $m$ (up to some logarithmic factors) and $B$ is required to be proportional to $\log N$. We thus set $n = 1.3m$, $B = 1.8\log N$. We also set $n = 1.3m$ for FBE-2. In addition, we fix $p = 512$.

We report our first result, showing the functional relationship between $(m, N, \delta)$, in Figure 1. In particular, panel (a) shows the change of distortion $\delta$ over the number of measurements $m$ for fixed $N$. We observe that, for all three algorithms, $\delta$ decays with $m$ at the rate predicted by Proposition 2.2 and Theorem 3.8. Panel (b) shows the empirical relationship between $m$ and $\log N$ for fixed $\delta$. As predicted by our theory (lower bound and upper bound), $m$ has a linear dependence on $\log N$.

[Figure 1, two panels comparing FBE, FBE-2 and URP: (a) $N = 300$, distortion $\delta$ versus number of measurements $m$; (b) $\delta = 0.3$, $m$ versus $\log N$.]

Figure 1: Results on synthetic datasets. (a) Each point, along with the standard deviation represented by the error bar, is an average of 50 trials, each of which is based on a fresh synthetic dataset with size $N = 300$ and a newly constructed embedding mapping. (b) Each point is computed by slicing at $\delta = 0.3$ in plots similar to (a) but with the corresponding $N$.

A popular application of binary embedding is image retrieval, as considered in (Gong and Lazebnik, 2011; Gong et al., 2013; Yu et al., 2014).
We thus conduct an experiment on the Flickr dataset that consists of 10k images from the Internet. Each image is represented by a normalized Fisher vector. We take 500 randomly sampled images as query points and leave the rest as the base for retrieval. The relevant images of each query are defined as its 10 nearest neighbors based on geodesic distance. Given $m$, we apply FBE, FBE-2 and URP to convert all images into $m$-dimensional binary codes. In particular, we set $B = 10$ for FBE and $n \approx 1.3m$ for FBE and FBE-2. Then we leverage the corresponding similarity metrics, (3.5) for FBE and Hamming distance for FBE-2 and URP, to retrieve the nearest images for each query. The performance of each algorithm is characterized by recall, i.e., the number of retrieved relevant images divided by the total number of relevant images.

[Figure 2: recall versus number of retrieved images for FBE, FBE-2, FBE-2 (same time), URP and URP (same time); panels (a) $m = 5000$, (b) $m = 10000$, (c) $m = 15000$.]

Figure 2: Image retrieval results on Flickr. Each panel presents the recall for the specified number of measurements $m$. Black and blue dotted lines are respectively the recall of FBE-2 and URP with fewer measurements but the same running time as FBE.

We report our second result in Figure 2. Each panel shows the average recall of all queries for a specified $m$. We note that FBE-2, as a fast algorithm, performs as well as URP with the same number of measurements. In order to show the running time advantage of our fast algorithm FBE, we also present the performance of FBE-2 and URP with fewer measurements such that they can be computed in the same time as FBE. As we observe, with a large number of measurements, FBE-2 and URP perform marginally better than FBE, while FBE has a significant improvement over the two algorithms under an identical time constraint.

5 Proofs

5.1 Proof of Data-Oblivious Lower Bound (Theorem 3.1)

The proof of the data-oblivious lower bound is based on a lower bound for one-way communication of Hamming distance due to Jayram and Woodruff (2013).

Definition 5.1 (One-way communication of Hamming distance). In the one-way communication model, Alice is given $a \in \{0,1\}^n$ and Bob is given $b \in \{0,1\}^n$. Alice sends Bob a message $c \in \{0,1\}^m$, and Bob uses $b$ and $c$ to output a value $x \in \mathbb{R}$.
Alice and Bob have shared randomness. Alice and Bob solve the $(\delta, \epsilon)$ additive Hamming distance estimation problem if $|x - d_H(a,b)| \le \delta$ with probability $1 - \epsilon$.

The result proven in Jayram and Woodruff (2013) is a lower bound for the multiplicative Hamming distance estimation problem, but their techniques readily yield a bound for the additive case as well:

Lemma 5.2. Any algorithm that solves the $(\delta, \epsilon)$ additive Hamming distance estimation problem must have $m = \Omega\big(\frac{1}{\delta^2}\log\frac{1}{\epsilon}\big)$ as long as this is less than $n$.

Proof. We apply Lemma 3.1 of Jayram and Woodruff (2013) with parameters $\alpha = 2$, $p = 1$, $b = 1$, $\varepsilon = \delta$, and $\delta = \epsilon$. This encodes inputs from a problem they prove is hard (augmented indexing on large domains) to inputs appropriate for Hamming estimation. In particular, for $n' = O\big(\frac{1}{\delta^2}\log\frac{1}{\epsilon}\big)$ it gives a distribution on $(a,b) \in \{0,1\}^{n'} \times \{0,1\}^{n'}$ that is divided into NO and YES instances, such that:

From the reduction, distinguishing NO instances from YES instances with probability $1 - \epsilon$ requires Alice to send $m = \Omega\big(\frac{1}{\delta^2}\log\frac{1}{\epsilon}\big)$ bits of communication to Bob.

In NO instances, $d_H(a,b) \ge \frac{1}{2}(1 - \delta/3)$.

In YES instances, $d_H(a,b) \le \frac{1}{2}(1 - 2\delta/3)$.

First, suppose $n = n'$. Then since solving the additive Hamming distance estimation problem with $\delta/12$ accuracy would distinguish NO instances from YES instances, it must involve $m = \Omega\big(\frac{1}{\delta^2}\log\frac{1}{\epsilon}\big)$ bits of communication.

For $n > n'$, simply duplicate the coordinates of $a$ and $b$ $\lfloor n/n' \rfloor$ times, and zero-pad the remainder. Less than half the coordinates are then part of the zero-padding, so the gap between YES and NO instances remains at least $\delta/12$ and a protocol for the $(\delta/24, \epsilon)$ additive Hamming distance estimation problem requires $m = \Omega\big(\frac{1}{\delta^2}\log\frac{1}{\epsilon}\big)$ as desired.

With this in hand, we can prove Theorem 3.1:

Proof of Theorem 3.1. We reduce one-way communication of the $(\delta, \epsilon)$ additive Hamming distance estimation problem to the embedding problem. Let $a, b \in \{0,1\}^p$ be drawn from the hard instance for the communication problem defined in Lemma 5.2. Linearly transform them to $u, v \in S^{p-1}$ via $u = (2a - 1)/\sqrt{p}$, $v = (2b - 1)/\sqrt{p}$. We have that $\langle u, v \rangle = 1 - 2 d_H(a,b)$, so

$$d(u,v) = \frac{\arccos(\langle u, v\rangle)}{\pi} = \frac{\arccos(1 - 2 d_H(a,b))}{\pi}$$

or

$$d_H(a,b) = \frac{1 - \cos(\pi d(u,v))}{2}.$$

Given an estimate of $d(u,v)$, we can therefore get an estimate of $d_H(a,b)$. In particular, since $|\cos'(x)| \le 1$, if we learn $d(u,v)$ to $\pm\delta$ then we learn $d_H(a,b)$ to $\pm\delta\frac{\pi}{2}$. For now, consider the case of $N = 2$.
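The change of variables used in this reduction is easy to check numerically. The sketch below maps bit strings to the sphere and verifies both $\langle u, v \rangle = 1 - 2 d_H(a,b)$ and the inversion from geodesic distance back to Hamming distance; the helper names and the choice $p = 64$ are illustrative only.

```python
import numpy as np


def to_sphere(bits):
    """Map a 0/1 vector a to u = (2a - 1) / sqrt(p), a point on S^{p-1}."""
    return (2.0 * bits - 1.0) / np.sqrt(bits.size)


def normalized_hamming(a, b):
    """Fraction of coordinates where a and b differ."""
    return float(np.mean(a != b))


def geodesic(u, v):
    """Normalized geodesic distance arccos(<u, v>) / pi."""
    return float(np.arccos(np.clip(u @ v, -1.0, 1.0)) / np.pi)


def hamming_from_geodesic(d):
    """Invert the relation: d_H = (1 - cos(pi d)) / 2."""
    return float((1.0 - np.cos(np.pi * d)) / 2.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.integers(0, 2, size=64)
    b = rng.integers(0, 2, size=64)
    u, v = to_sphere(a), to_sphere(b)
    print("inner product :", u @ v)
    print("1 - 2 d_H     :", 1.0 - 2.0 * normalized_hamming(a, b))
    print("recovered d_H :", hamming_from_geodesic(geodesic(u, v)))
```

The first two printed values agree up to floating-point error, and the third reproduces the normalized Hamming distance of the original bit strings.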
Consider an oblivious embedding function $f : S^{p-1} \to \{0,1\}^m$ and reconstruction algorithm $g : \{0,1\}^m \times \{0,1\}^m \to \mathbb{R}$ that has $|g(f(u), f(v)) - d(u,v)| \le \delta\frac{2}{\pi}$ with probability $1 - \epsilon$ on the distribution of inputs $(u,v)$. We can solve the one-way communication problem for Hamming distance estimation by Alice sending $f(u)$ to Bob, Bob learning $d(u,v) \approx g(f(u), f(v))$, and then computing $d_H(a,b)$ to $\pm\delta$. By the lower bound for this problem, any such $f$ and $g$ must have $m = \Omega\big(\frac{1}{\delta^2}\log\frac{1}{\epsilon}\big)$, proving the result for $N = 2$ (after rescaling $\delta$).

For general $N$, we draw instances $(u_1, v_1), (u_2, v_2), \ldots, (u_{N/2}, v_{N/2})$ independently from the hard instance for binary embedding of $N = 2$ and $\epsilon' = 4\epsilon/N$. Consider an oblivious embedding function $f : S^{p-1} \to \{0,1\}^m$ and reconstruction algorithm $g : \{0,1\}^m \times \{0,1\}^m \to \mathbb{R}$ that has, for all $i \in [N/2]$, $|g(f(u_i), f(v_i)) - d(u_i, v_i)| \le \delta$ with probability $1 - \epsilon$ on this distribution. Define $\alpha$ to be the probability that $|g(f(u_i), f(v_i)) - d(u_i, v_i)| \le \delta$ for any particular $i$. Because $f$ and $g$ are oblivious and the different instances are independent, the probability that all instances succeed is $\alpha^{N/2} \ge 1 - \epsilon$, so $\alpha \ge (1 - \epsilon)^{2/N} > 1 - 4\epsilon/N$. In particular, this means $f$ and $g$ solve the hard instance of binary embedding with $N = 2$, $\epsilon' = 4\epsilon/N$. By the above lower bound for $N = 2$, this means

$$m = \Omega\Big(\frac{1}{\delta^2}\log(N/\epsilon)\Big)$$

as desired.

5.2 Proof of Data-Dependent Lower Bound (Theorem 3.3)

We need a few ingredients to show the lower bound. First, we define a matrix that is close to the identity matrix.

Definition 5.3 ($(\delta_1, \delta_2)$-near identity matrix). A symmetric matrix $M \in \mathbb{R}^{p \times p}$ is called a $(\delta_1, \delta_2)$-near identity matrix if it satisfies both of the following conditions:

$$1 - \delta_1 \le M_{i,i} \le 1, \ \forall i \in [p], \qquad |M_{i,j}| \le \delta_2, \ \forall i \ne j \in [p].$$

Next we give a lower bound on the rank of a $(\delta_1, \delta_2)$-near identity matrix.

Lemma 5.4. Suppose a positive semidefinite matrix $M \in \mathbb{R}^{p \times p}$ is a $(\delta_1, \delta_2)$-near identity matrix with rank $d$, and $0 < \delta_1, \delta_2 < 1$. Then we have

$$d \ge \frac{p(1 - \delta_1)^2}{1 + (p - 1)\delta_2^2}.$$

Proof. We postpone the proof to Appendix B.

The above result is weak when it is applied to show our desired lower bound. We still need to make use of the following combinatorial result.

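Lemma 5.5. Suppose matrix $M \in \mathbb{R}^{p \times p}$ has rank $d$. Let $P(x)$ be any degree-$k$ polynomial function. Consider the matrix $N \in \mathbb{R}^{p \times p}$ defined as $N := P(M)$, where $N_{i,j} = P(M_{i,j})$. We have

$$\mathrm{rank}(N) \le \binom{k + d}{k}.$$

Proof. See Lemma 9.2 of Alon (2003) for a detailed proof.

Now we are ready to prove Theorem 3.3.

Proof of Theorem 3.3. Let $e_i$ denote the $i$-th natural basis vector of $\mathbb{R}^N$, i.e., the $i$-th coordinate is 1 while the rest are all zeros. Consider the $N$ points $\{e_1, e_2, \ldots, e_N\}$ and their opposite vectors $\{-e_1, -e_2, \ldots, -e_N\}$. For any binary embedding algorithm $f$, we let

$$b_i := f(e_i), \quad c_i := f(-e_i), \quad i \in [N].$$

Under the condition that $f$ solves the general binary embedding problem with link function $L$, we have

$$\big| d_H(b_i, c_i) - L\big(d(e_i, -e_i)\big) \big| \le \delta, \ \forall i \in [N]. \quad (5.1)$$

As $d(e_i, -e_i) = 1$, we have

$$L(1) + \delta \ge d_H(b_i, c_i) \ge L(1) - \delta. \quad (5.2)$$

Similarly, note that $d(e_i, e_j) = d(e_i, -e_j) = d(-e_i, -e_j) = \frac{1}{2}$ for all $i \ne j$. We therefore have, for all $i \ne j$,

$$L(1/2) - \delta \le d_H(b_i, b_j) \le L(1/2) + \delta, \quad (5.3)$$
$$L(1/2) - \delta \le d_H(c_i, c_j) \le L(1/2) + \delta, \quad (5.4)$$
$$L(1/2) - \delta \le d_H(b_i, c_j) \le L(1/2) + \delta. \quad (5.5)$$

From now on, we treat the binary strings $b_i, c_i$ as vectors in $\mathbb{R}^m$ with entries in $\{-1, 1\}$. Let $B$ denote the matrix with rows $b_i$ and $C$ denote the matrix with rows $c_i$. Consider the outer product of the difference between $B$ and $C$, namely

$$M = \frac{1}{m}(B - C)(B - C)^\top.$$

Note that for all $i \in [N]$,

$$M_{i,i} = \frac{1}{m}\|b_i - c_i\|_2^2 = 4 d_H(b_i, c_i) \ge 4\big(L(1) - \delta\big).$$

The last inequality follows from (5.2). For $i \ne j$, we have

$$M_{i,j} = \frac{1}{m}\langle b_i - c_i, b_j - c_j \rangle = \frac{1}{m}\big(\langle b_i, b_j\rangle + \langle c_i, c_j\rangle - \langle b_i, c_j\rangle - \langle b_j, c_i\rangle\big) = 2\big(d_H(b_i, c_j) + d_H(b_j, c_i) - d_H(b_i, b_j) - d_H(c_i, c_j)\big),$$

where the third equality follows from

$$d_H(b, c) = \frac{1}{2m}\big(\|b\|_2 \|c\|_2 - \langle b, c\rangle\big), \ \forall b, c \in \{-1, 1\}^m.$$

By using (5.3) to (5.5), we have

$$|M_{i,j}| \le 8\delta.$$

Therefore, $\frac{1}{4(L(1) + \delta)} M$ is actually a $\big(\frac{2\delta}{L(1)}, \frac{2\delta}{L(1)}\big)$-near identity matrix. Consider the degree-$k$ polynomial $P(z) = z^k$. Let

$$N = P\Big(\frac{1}{4(L(1) + \delta)} M\Big).$$

It is easy to observe that $N$ is a $(\gamma_1, \gamma_2)$-near identity matrix where

$$\gamma_1 = 1 - \Big(1 - \frac{2\delta}{L(1)}\Big)^k, \quad \gamma_2 = \Big(\frac{2\delta}{L(1)}\Big)^k.$$

Under the condition $\frac{\delta}{L(1)} \le \frac{1}{4}$, we have

$$1 - \gamma_1 = \Big(1 - \frac{2\delta}{L(1)}\Big)^k \ge \Big(\frac{1}{2}\Big)^k.$$

By setting $k = \frac{\log N}{2 \log\frac{L(1)}{2\delta}}$, we have

$$\gamma_2 \le \frac{1}{\sqrt{N}}.$$

We apply Lemma 5.4 by setting $\delta_1, \delta_2, p$ in the statement to be $\gamma_1, \gamma_2, N$ respectively. We get

$$\mathrm{rank}(N) \ge \frac{N (\frac{1}{4})^k}{1 + (N - 1)/N} \ge \frac{1}{2}\Big(\frac{1}{4}\Big)^k N \ge \Big(\frac{1}{8}\Big)^k N. \quad (5.6)$$

On the other hand, $\frac{1}{4(L(1) + \delta)} M$ has rank at most $m$. By applying Lemma 5.5 we get

$$\mathrm{rank}(N) \le \binom{m + k}{k} \le \Big(\frac{e(m + k)}{k}\Big)^k.$$

Applying the above result and (5.6) directly yields that

$$N^{1/k} \le 8e\,\frac{m + k}{k}.$$

When $k = \frac{\log N}{2 \log\frac{L(1)}{2\delta}}$ as we set, $N^{1/k} \ge \big(\frac{L(1)}{2\delta}\big)^2$. Therefore we have

$$m \ge \frac{1}{32e}\Big(\frac{L(1)}{\delta}\Big)^2 k - k \ge \frac{1}{64e}\Big(\frac{L(1)}{\delta}\Big)^2 k = \frac{1}{128e}\Big(\frac{L(1)}{\delta}\Big)^2 \frac{\log N}{\log\frac{L(1)}{2\delta}},$$

where the second inequality holds when $\big(\frac{L(1)}{\delta}\big)^2 \ge 64e$.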
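The two rank bounds that drive this proof (Lemma 5.4 and Lemma 5.5) can be sanity-checked on a small random example. The following is a rough numerical sketch, not part of the argument: the Gram-matrix construction, the sizes `p`, `d`, `k` and all function names are assumptions chosen for illustration, and ranks are computed numerically with NumPy.

```python
import math

import numpy as np


def near_identity_rank_bound(p, delta1, delta2):
    """Right-hand side of Lemma 5.4: p (1 - delta1)^2 / (1 + (p - 1) delta2^2)."""
    return p * (1.0 - delta1) ** 2 / (1.0 + (p - 1) * delta2 ** 2)


def near_identity_parameters(matrix):
    """Smallest (delta1, delta2) for which `matrix` is near identity."""
    diag = np.diag(matrix)
    off_diag = matrix - np.diag(diag)
    return max(0.0, 1.0 - float(diag.min())), float(np.abs(off_diag).max())


def random_gram(p, d, rng):
    """PSD Gram matrix of p random unit vectors in R^d (rank at most d)."""
    vectors = rng.standard_normal((p, d))
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors @ vectors.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, d, k = 60, 4, 2
    gram = random_gram(p, d, rng)
    delta1, delta2 = near_identity_parameters(gram)
    # Lemma 5.4: the rank is at least the near-identity bound.
    print(np.linalg.matrix_rank(gram), ">=", near_identity_rank_bound(p, delta1, delta2))
    # Lemma 5.5: an entrywise degree-k polynomial raises the rank to at most C(k + d, k).
    powered = gram ** k
    print(np.linalg.matrix_rank(powered), "<=", math.comb(k + d, k))
```

The entrywise power is what lets the proof shrink the off-diagonal entries (to $\gamma_2$) while the rank grows only polynomially in $m$.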

5.3 Proofs about Fast Binary Embedding Algorithm

5.3.1 Proof of Lemma 3.7

Proof. It suffices to prove $X \perp Y$. One can check similarly that the proof holds for the remaining three results. Note that $X, Y$ are binary random variables with values in $\{-1, 1\}$. It is easy to observe that both of them are balanced, namely $\Pr(X = 1) = \Pr(Y = 1) = 1/2$. If $X \perp Y$, then we have $\Pr(X = Y) = 1/2$. In the reverse direction, suppose $\Pr(X = Y) = 1/2$. First we have

$$\Pr(X = 1) = \Pr(X = 1, Y = 1) + \Pr(X = 1, Y = -1) = 1/2, \quad (5.7)$$
$$\Pr(Y = 1) = \Pr(X = 1, Y = 1) + \Pr(X = -1, Y = 1) = 1/2. \quad (5.8)$$

Combining the above two results, we have $\Pr(X = 1, Y = -1) = \Pr(X = -1, Y = 1)$. Using

$$\Pr(X = 1, Y = -1) + \Pr(X = -1, Y = 1) = \Pr(X \ne Y) = 1 - \Pr(X = Y) = \frac{1}{2},$$

we thus have $\Pr(X = 1, Y = -1) = \Pr(X = -1, Y = 1) = 1/4$. Plugging the above result into (5.7) and (5.8), we have $\Pr(X = 1, Y = 1) = \Pr(X = -1, Y = -1) = 1/4$. Thus we have shown

$$\Pr(X = v \mid Y = u) = \frac{\Pr(X = v, Y = u)}{\Pr(Y = u)} = \Pr(X = v), \ \forall u, v \in \{-1, 1\},$$

which leads to $X \perp Y$.

Using the above arguments, we show that $X \perp Y$ if and only if $\Pr(X = Y) = 1/2$. Recalling the definition of $X, Y$, the above condition holds if and only if

$$\Pr\Big\{ \underbrace{\langle \xi \odot \zeta, x \rangle \cdot \langle \xi' \odot \zeta, y \rangle}_{Z} \ge 0 \Big\} = \frac{1}{2}.$$

Next we prove that $Z$ has a symmetric distribution around 0. Let $I = [1, n]$, $I' = [1, n - m]$, $I_0 = [2n - m, 2n - 1]$ for some natural number $m < n$. Without loss of generality, we assume $\xi = g_I$ and $\xi' = [g_{I_0}; g_{I'}]$. We split $I$ into $T = \lceil n/m \rceil$ consecutive disjoint subsets $I_1, I_2, \ldots, I_T$, each of which has size $m$ except $|I_T| = n - (T - 1)m$. Also, let $I'_{T-1}$ contain the first $n - (T - 1)m$ entries of $I_{T-1}$. Then we have

$$Z = \Big(\sum_{i=1}^{T} \langle g_{I_i} \odot \zeta_{I_i}, x_{I_i} \rangle\Big) \Big(\sum_{i=1}^{T-2} \langle g_{I_i} \odot \zeta_{I_{i+1}}, y_{I_{i+1}} \rangle + \langle g_{I'_{T-1}} \odot \zeta_{I_T}, y_{I_T} \rangle + \langle g_{I_0} \odot \zeta_{I_1}, y_{I_1} \rangle\Big). \quad (5.9)$$

We now let $\hat{g}$ be the random vector that is identical to $g$ except that, for any $i \in \{0\} \cup [T]$,

$$\hat{g}_{I_i} = -g_{I_i}, \quad \text{if } i \bmod 2 = 0.$$

Let $\tilde{\zeta}$ be the random vector that is identical to $\zeta$ except that, for any $i \in \{0\} \cup [T]$,

$$\tilde{\zeta}_{I_i} = -\zeta_{I_i}, \quad \text{if } i \bmod 2 = 1.$$

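Replacing $g, \zeta$ in (5.9) with $\hat{g}, \tilde{\zeta}$ yields

$$\hat{Z} = \Big(\sum_{i=1}^{T} \langle \hat{g}_{I_i} \odot \tilde{\zeta}_{I_i}, x_{I_i} \rangle\Big) \Big(\sum_{i=1}^{T-2} \langle \hat{g}_{I_i} \odot \tilde{\zeta}_{I_{i+1}}, y_{I_{i+1}} \rangle + \langle \hat{g}_{I'_{T-1}} \odot \tilde{\zeta}_{I_T}, y_{I_T} \rangle + \langle \hat{g}_{I_0} \odot \tilde{\zeta}_{I_1}, y_{I_1} \rangle\Big)$$
$$= -\Big(\sum_{i=1}^{T} \langle g_{I_i} \odot \zeta_{I_i}, x_{I_i} \rangle\Big) \Big(\sum_{i=1}^{T-2} \langle g_{I_i} \odot \zeta_{I_{i+1}}, y_{I_{i+1}} \rangle + \langle g_{I'_{T-1}} \odot \zeta_{I_T}, y_{I_T} \rangle + \langle g_{I_0} \odot \zeta_{I_1}, y_{I_1} \rangle\Big) = -Z.$$

As each entry of $g$ is a symmetric random variable around 0, $\hat{g}$ and $g$ have the same probability distribution. The same fact also holds for $\tilde{\zeta}$ and $\zeta$. So we conclude that $Z$ has a symmetric distribution around 0, which implies $\Pr(Z > 0) = \frac{1}{2}$ and $X \perp Y$.

5.3.2 Proof of Theorem 3.8

Proof. Unspecified notations in this section are consistent with Algorithm 2. Using Lemma 3.6, we have

$$\Pr\Big\{ \sup_{i,j \in [N]} \big| d(y_i, y_j) - d(x_i, x_j) \big| \ge C\delta \Big\} \le 0.01. \quad (5.10)$$

Now consider the first-block binary codes generated from Gaussian Toeplitz projection. We focus on two intermediate points $y_1$ and $y_2$. Consider the first block of binary codes generated from the second part of Algorithm 2. We let $u = \mathrm{sign}(\Psi^{(1)} y_1)$, $v = \mathrm{sign}(\Psi^{(1)} y_2)$. Suppose $\Psi^{(1)}$ contains the Gaussian Toeplitz matrix $T$. For any $i \in [m/B]$, we have

$$u_i = \mathrm{sign}\big(\langle T_i \odot \zeta, y_1 \rangle\big) = \mathrm{sign}\big(\langle T_i, y_1 \odot \zeta \rangle\big),$$
$$v_i = \mathrm{sign}\big(\langle T_i \odot \zeta, y_2 \rangle\big) = \mathrm{sign}\big(\langle T_i, y_2 \odot \zeta \rangle\big).$$

Since $T_i$ is a Gaussian random vector, we have

$$\Pr(u_i \ne v_i) = d(y_1 \odot \zeta, y_2 \odot \zeta) = d(y_1, y_2).$$

Let $Z_i = \mathbb{1}(u_i \ne v_i)$, $i \in [m/B]$. Following Lemma 3.7, we know that for any $i \ne j$,

$$u_i \perp u_j, \quad u_i \perp v_j, \quad v_i \perp v_j, \quad v_i \perp u_j.$$

Therefore $\{Z_i\}_{i=1}^{m/B}$ is a pairwise independent sequence. By Markov's inequality, we have

$$\Pr\Big( \Big| \frac{B}{m} \sum_{i=1}^{m/B} Z_i - E(Z_1) \Big| \ge \delta \Big) \le \frac{B\,\mathrm{Var}(Z_1)}{m \delta^2} \le \frac{B}{4 m \delta^2} \le \frac{1}{4}. \quad (5.11)$$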

The last inequality holds by setting $\frac{m}{B} \ge \frac{1}{\delta^2}$. Therefore, we have

$$\Pr\big( |d_H(u, v) - d(y_1, y_2)| \ge \delta \big) \le \frac{1}{4}.$$

Now consider the total $B$ blocks of binary codes $\{u_i\}_{i=1}^{B}$, $\{v_i\}_{i=1}^{B}$ from $y_1$ and $y_2$ respectively. Let

$$E_i = \mathbb{1}\big( |d_H(u_i, v_i) - d(y_1, y_2)| \ge \delta \big), \quad i \in [B].$$

From (5.11), we have $\Pr(E_i = 1) \le \frac{1}{4}$. If more than half of the $E_i$ are 0, then the median of $\{d_H(u_i, v_i)\}_{i=1}^{B}$ is within $\delta$ of $d(y_1, y_2)$. Then we have

$$\Pr\Big( \big| \mathrm{median}\big(\{d_H(u_i, v_i)\}_{i=1}^{B}\big) - d(y_1, y_2) \big| \ge \delta \Big) \le \Pr\Big( \frac{1}{B}\sum_{i=1}^{B} E_i \ge \frac{1}{2} \Big) \le \Pr\Big( \frac{1}{B}\sum_{i=1}^{B} E_i - E(E_1) > \frac{1}{4} \Big) \le \exp\Big(-\frac{1}{4}B\Big).$$

In the second inequality, we use (5.11). The last step follows from Hoeffding's inequality. Now we use a union bound over the $N^2$ pairs:

$$\Pr\Big( \sup_{i,j \in [N]} \big| d_H(b_i, b_j) - d(y_i, y_j) \big| \ge \delta \Big) \le N^2 \exp\Big(-\frac{1}{4}B\Big) \le \exp\Big(-\frac{1}{8}B\Big).$$

The last inequality holds by setting $B \ge 16 \log N$. Combining the above result and (5.10) using the triangle inequality, we complete the proof.

5.4 Proof of Theorem 3.10

For any set $K \subseteq S^{p-1}$, we use $N_\delta(K)$ to denote a constructed $\delta$-net of $K$, which is a $\delta$-covering set with minimum size. In particular, by Sudakov's theorem (e.g., Theorem 3.18 in Ledoux and Talagrand (1991)),

$$\log |N_\delta(K)| \lesssim \frac{w(K)^2}{\delta^2}.$$

We first prove that for a fixed two-dimensional space, $m = O\big(\frac{1}{\delta^2}\big)$ independent Gaussian measurements are sufficient to achieve $\delta$-uniform binary embedding.

Lemma 5.6. Suppose $K$ is any fixed two-dimensional subspace in $S^{p-1}$. Let $A \in \mathbb{R}^{m \times p}$ be a matrix with independent rows $A_i \sim \mathcal{N}(0, I_p)$, $i \in [m]$. Suppose $m \gtrsim \frac{1}{\delta^2}\log\frac{1}{\delta}$. Then with probability at least $1 - 3\exp(-\delta^2 m)$,

$$\sup_{x, y \in K} |d_A(x, y) - d(x, y)| \le C\delta. \quad (5.12)$$

Here $C$ is some absolute constant.

Proof. We postpone the proof to Appendix C.

The next lemma shows that the normalized $\ell_1$ norm of $Ax$ provides a decent approximation of $\|x\|_2$.

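Lemma 5.7. Consider any set $K \subseteq \mathbb{R}^p$. Let $A$ be an $m$-by-$p$ matrix with independent rows $A_i \sim \mathcal{N}(0, I_p)$ for any $i \in [m]$. Consider

$$Z = \sup_{x \in K} \Big| \frac{1}{m}\sum_{i=1}^{m} |\langle A_i, x \rangle| - \sqrt{\frac{2}{\pi}}\,\|x\|_2 \Big|.$$

We have

$$\Pr\Big\{ Z \ge \frac{4 w(K)}{\sqrt{m}} + t \Big\} \le 2\exp\Big(-\frac{m t^2}{2 d(K)^2}\Big), \ \forall t > 0,$$

where $d(K) = \max_{x \in K} \|x\|_2$.

Proof. See the proof of Lemma 2.1 in Plan and Vershynin (2014).

In order to connect the $\ell_1$ norm to Hamming distance, we need the following result.

Lemma 5.8. Consider a finite number of points $K \subseteq S^{p-1}$. Let $A$ be an $m$-by-$p$ matrix with independent rows $A_i \sim \mathcal{N}(0, I_p)$ for any $i \in [m]$. Suppose $m \gtrsim \frac{1}{\delta^2}\log |K|$. Then we have

$$\sup_{x \in K} \frac{1}{m}\sum_{i=1}^{m} \mathbb{1}\big\{ |\langle A_i, x \rangle| \le \delta \big\} \le 2\delta$$

with probability at least $1 - \exp(-\delta^2 m)$.

Proof. Let $X \sim \mathcal{N}(0, 1)$. For any fixed point $x \in K$ and any $i \in [m]$, we have

$$\Pr(|\langle A_i, x \rangle| \le \delta) = \Pr(|X| \le \delta) \le \delta.$$

Let $Z_i = \mathbb{1}(|\langle A_i, x \rangle| \le \delta)$, $i \in [m]$. Then by using Hoeffding's inequality,

$$\Pr\Big( \frac{1}{m}\sum_{i=1}^{m} Z_i - E(Z_1) > \delta \Big) \le \exp(-2\delta^2 m).$$

As $E(Z_1) = \Pr(|\langle A_i, x \rangle| \le \delta) \le \delta$, we conclude that with probability at least $1 - \exp(-2\delta^2 m)$,

$$\frac{1}{m}\sum_{i=1}^{m} Z_i \le 2\delta.$$

By applying a union bound over the $|K|$ points and setting $m \gtrsim \frac{1}{\delta^2}\log |K|$, we complete the proof.

Now we are ready to prove Theorem 3.10.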

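Proof of Theorem 3.10. We construct a $\delta$-net of $K$ that is denoted as $N_\delta$. We assume $m \gtrsim \frac{1}{\delta^2}\log |N_\delta|$. Applying Proposition 2.2 and setting $K = N_\delta$, we have that

$$\sup_{x, y \in N_\delta} |d_A(x, y) - d(x, y)| \le \delta \quad (5.13)$$

with probability at least $1 - 2\exp(-\delta^2 m)$. For any two fixed points $x, y \in K$, let $x', y'$ be their nearest points in $N_\delta$. Then we have

$$\begin{aligned}
|d(x, y) - d_A(x, y)| &\le |d(x, y) - d(x', y')| + |d(x', y') - d_A(x, y)| \\
&\overset{(a)}{\le} |d(x', y') - d_A(x, y)| + 2\delta \\
&\le |d_A(x', y') - d_A(x, y)| + |d(x', y') - d_A(x', y')| + 2\delta \\
&\overset{(b)}{\le} |d_A(x', y') - d_A(x, y)| + 3\delta \\
&\le |d_A(x', y') - d_A(x, y')| + |d_A(x, y') - d_A(x, y)| + 3\delta \\
&\overset{(c)}{\le} d_A(y, y') + d_A(x, x') + 3\delta, \quad (5.14)
\end{aligned}$$

where (a) follows from

$$|d(x, y) - d(x', y')| \le |d(x, y) - d(x', y)| + |d(x', y) - d(x', y')| \le d(x, x') + d(y, y') \le 2\delta,$$

step (b) follows from (5.13), and step (c) follows from the triangle inequality of Hamming distance. Therefore we have

$$\sup_{x, y \in K} |d_A(x, y) - d(x, y)| \le 2 \sup_{x' \in N_\delta}\ \sup_{\substack{x \in K \\ \|x - x'\|_2 \le \delta}} d_A(x, x') + 3\delta. \quad (5.15)$$

Next we bound the tail term

$$T := \sup_{x' \in N_\delta}\ \sup_{\substack{x \in K \\ \|x - x'\|_2 \le \delta}} d_A(x, x').$$

Recall that

$$K_\delta^+ := K \cup \Big\{ z \in S^{p-1} : z = \frac{x - y}{\|x - y\|_2},\ \forall x, y \in K \text{ if } \delta^2 \le \|x - y\|_2 \le \delta \Big\}.$$

Now we construct a $\delta$-net for $K_\delta^+ \setminus K$, denoted as $N'_\delta$. For two distinct points $x, y \in N_\delta \cup N'_\delta$, let $C(x, y)$ denote the unit circle spanned by $x, y$. We construct a $\delta^2$-net $C_{\delta^2}(x, y)$ for each circle $C(x, y)$. For simplicity, we just let $C_{\delta^2}(x, y)$ be the set of points that uniformly split $C(x, y)$ with interval $\delta^2$. We thus have $|C_{\delta^2}(x, y)| \lesssim \frac{1}{\delta^2}$. Let $G_\delta$ denote the union of all circle nets $C_{\delta^2}(x, y)$ spanned by points in $N_\delta \cup N'_\delta$, namely

$$G_\delta := \bigcup_{x, y \in N_\delta \cup N'_\delta} C_{\delta^2}(x, y) \cup \{x, y\}.$$

For any point $x \in K$, we can always find a point in $G_\delta$ that is $O(\delta^2)$ away from $x$. To see why the argument is true, we first let $x'$ be the nearest point to $x$ in $N_\delta$. If $\|x - x'\|_2 \le \delta^2$, then $x'$ is the point we want. Otherwise, we have $\delta^2 \le \|x - x'\|_2 \le \delta$. In this case, we have $(x - x')/\|x - x'\|_2 \in K_\delta^+$. Following the definition of $K_\delta^+$, we can always find a point $x'' \in N_\delta \cup N'_\delta$ such that

$$\Big\| \frac{x - x'}{\|x - x'\|_2} - x'' \Big\|_2 \le \delta, \quad (5.16)$$

thereby

$$\big\| x - \underbrace{\big(\|x - x'\|_2\, x'' + x'\big)}_{z} \big\|_2 \le \delta \|x - x'\|_2 \le \delta^2.$$

Note that $\|z\|_2$ is very close to 1 because

$$\delta^4 \ge \|x - z\|_2^2 = \|z\|_2^2 - 2\langle z, x \rangle + 1 \ge \|z\|_2^2 - 2\|z\|_2 + 1 = (\|z\|_2 - 1)^2.$$

We thus have

$$\Big\| x - \frac{z}{\|z\|_2} \Big\|_2 \le \|x - z\|_2 + \Big\| z - \frac{z}{\|z\|_2} \Big\|_2 = \|x - z\|_2 + \big| \|z\|_2 - 1 \big| \le 2\delta^2.$$

Note that $z/\|z\|_2$ is in the unit circle $C(x', x'')$ spanned by $x'$ and $x''$, thereby there exists $u \in C_{\delta^2}(x', x'')$ such that $\|u - z/\|z\|_2\|_2 \le \delta^2$. The point $u$ thus satisfies

$$\|x - u\|_2 \le \Big\| x - \frac{z}{\|z\|_2} \Big\|_2 + \Big\| \frac{z}{\|z\|_2} - u \Big\|_2 \le 3\delta^2. \quad (5.17)$$

So for any $x \in K$ and its nearest point $x' \in N_\delta$, we define $u$ as

$$u := \begin{cases} x', & \|x - x'\|_2 \le \delta^2; \\ \arg\min_{v \in C_{\delta^2}(x', x'')} \|x - v\|_2, & \text{otherwise,} \end{cases}$$

where $x'' \in N_\delta \cup N'_\delta$ satisfies (5.16). Based on (5.17), we always have $\|u - x\|_2 \le 3\delta^2$ and $\|u - x'\|_2 \le \|u - x\|_2 + \|x - x'\|_2 \le 2\delta$. By the triangle inequality of Hamming distance,

$$d_A(x, x') \le d_A(x, u) + d_A(u, x').$$

We thus have

$$T \le \sup_{x' \in N_\delta}\ \sup_{\substack{x \in K \\ \|x - x'\|_2 \le \delta}} \big( d_A(x, u) + d_A(u, x') \big) \le \underbrace{\sup_{u \in G_\delta}\ \sup_{\substack{x \in K \\ \|x - u\|_2 \le 3\delta^2}} d_A(x, u)}_{T_1} + \underbrace{\sup_{x, y \in N_\delta \cup N'_\delta}\ \sup_{\substack{u, v \in C(x, y) \\ \|u - v\|_2 \le 2\delta}} d_A(u, v)}_{T_2}.$$

Next we bound the terms $T_1$ and $T_2$ respectively.

Term $T_1$. For a fixed point $u \in G_\delta$, using Lemma 5.7 by setting $(K, t)$ in the statement to be $K' = (K - \{u\}) \cap \{v \in \mathbb{R}^p : \|v\|_2 \le 3\delta^2\}$ and $\delta^2$ respectively yields that

$$\Pr\Big\{ \sup_{\substack{x \in K \\ \|x - u\|_2 \le 3\delta^2}} \Big| \frac{1}{m}\sum_{i=1}^{m} |\langle A_i, x - u \rangle| - \sqrt{\frac{2}{\pi}}\,\|x - u\|_2 \Big| \ge \frac{4 w(K')}{\sqrt{m}} + \delta^2 \Big\} \le 2\exp\Big(-\frac{m \delta^4}{2 d(K')^2}\Big) \le 2\exp(-m/18).$$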


More information

Considerations on Distributed Load Balancing for Fully Heterogeneous Machines: Two Particular Cases

Considerations on Distributed Load Balancing for Fully Heterogeneous Machines: Two Particular Cases Considerations on Distributed Load Balancing for Fully Heterogeneous Machines: Two Particular Cases Nathanaël Cheriere Departent of Coputer Science ENS Rennes Rennes, France nathanael.cheriere@ens-rennes.fr

More information

Support Vector Machine Soft Margin Classifiers: Error Analysis

Support Vector Machine Soft Margin Classifiers: Error Analysis Journal of Machine Learning Research? (2004)?-?? Subitted 9/03; Published??/04 Support Vector Machine Soft Margin Classifiers: Error Analysis Di-Rong Chen Departent of Applied Matheatics Beijing University

More information

Stochastic Online Scheduling on Parallel Machines

Stochastic Online Scheduling on Parallel Machines Stochastic Online Scheduling on Parallel Machines Nicole Megow 1, Marc Uetz 2, and Tark Vredeveld 3 1 Technische Universit at Berlin, Institut f ur Matheatik, Strasse des 17. Juni 136, 10623 Berlin, Gerany

More information

Factor Model. Arbitrage Pricing Theory. Systematic Versus Non-Systematic Risk. Intuitive Argument

Factor Model. Arbitrage Pricing Theory. Systematic Versus Non-Systematic Risk. Intuitive Argument Ross [1],[]) presents the aritrage pricing theory. The idea is that the structure of asset returns leads naturally to a odel of risk preia, for otherwise there would exist an opportunity for aritrage profit.

More information

INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE SYSTEMS

INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE SYSTEMS Artificial Intelligence Methods and Techniques for Business and Engineering Applications 210 INTEGRATED ENVIRONMENT FOR STORING AND HANDLING INFORMATION IN TASKS OF INDUCTIVE MODELLING FOR BUSINESS INTELLIGENCE

More information
