Memory and Computation Efficient PCA via Very Sparse Random Projections


Farhad Pourkamali-Anaraki, Shannon M. Hughes
Department of Electrical, Computer, and Energy Engineering, University of Colorado at Boulder, CO, 80309, USA

Abstract

Algorithms that can efficiently recover principal components in very high-dimensional, streaming, and/or distributed data settings have become an important topic in the literature. In this paper, we propose an approach to principal component estimation that utilizes projections onto very sparse random vectors with Bernoulli-generated nonzero entries. Indeed, our approach is simultaneously efficient in memory/storage space, efficient in computation, and produces accurate PC estimates, while also allowing for rigorous theoretical performance analysis. Moreover, one can tune the sparsity of the random vectors deliberately to achieve a desired point on the tradeoffs between memory, computation, and accuracy. We rigorously characterize these tradeoffs and provide statistical performance guarantees. In addition to these very sparse random vectors, our analysis also applies to more general random projections. We present experimental results demonstrating that this approach allows for simultaneously achieving a substantial reduction of the computational complexity and memory/storage space, with little loss in accuracy, particularly for very high-dimensional data.

1. Introduction

Principal component analysis (PCA) is a fundamental tool in unsupervised learning and data analysis that finds the low-dimensional linear subspace that minimizes the mean-squared error between the original data and the data projected onto the subspace. The principal components (PCs) can be obtained by a singular value decomposition (SVD) of the data matrix or an eigendecomposition of the data's covariance matrix. PCA is frequently used for dimensionality reduction, feature extraction, and as a pre-processing step for learning and recognition tasks such as classification.

Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014.
JMLR: W&CP volume 32. Copyright 2014 by the author(s).

There is a wealth of existing literature that develops computationally efficient approaches to computing these PCs. However, the overwhelming majority of this literature assumes ready access to the stored full data samples, and this full data access is not always possible in modern data settings. Modern data acquisition capabilities have increased massively in recent years, which can lead to a wealth of rapidly changing high-dimensional data. Hence, in very large database environments, it may not be feasible or practical to access all the data in storage (Muthukrishnan, 2005). Moreover, in applications such as sensor networks, distributed databases, and surveillance, data is typically distributed over many sensors. Accessing all the data at once requires tremendous communication costs between the sensors and a central processing unit. Algorithms that don't require access to all the data can help reduce this communication cost (Balcan et al., 2013). A third case is streaming data, where one must acquire and store the data in real time to have full access, which may not be feasible.

One promising strategy to address these issues in a computationally efficient way, which also allows for rigorous theoretical analysis, is to use very sparse random projections. Random projections provide informative lower-dimensional representations of high-dimensional data, thereby saving memory and computation. They are widely used in many applications, including databases and data stream processing (Li et al., 2006; Indyk, 2006) and compressive sensing (Donoho, 2006).

Initial attempts have been made to perform PCA using only the information embedded in random projections. Unfortunately, however, theoretical guarantees have generally only been given for random vectors with i.i.d. entries drawn from the Gaussian distribution. This common choice is convenient in terms of theoretical analysis, but undesirable in practice.
Such dense random vectors require relatively high storage space, and high computation because of the large amount of floating point arithmetic needed to compute each projection. In this paper, we instead aim to recover PCs from very

sparse random projections with Bernoulli entries. These sparse random projections can be implemented using simple database operations. For example, this type of random projection can be obtained by simply adding two small subsets of the entries of a data sample and then subtracting the results. They thus require little computation or data access. For distributed data, this type of sparse Bernoulli projection could be obtained via localized aggregation in the network requiring minimal communication (assuming all sensors can communicate with one another). (If a network topology must be respected, the sparse random projections could presumably be adjusted accordingly, but we have not yet analyzed this case.) In short, very sparse random projections are, or could potentially be, extremely practical for a variety of situations.

Our theoretical analysis begins by assuming a probabilistic generative model for the data, related to the spiked covariance model. Under this model, we show that PCs computed from very sparse random projections are close estimators of the true underlying PCs. Moreover, one can adjust the sparsity of the random projections as desired to greatly reduce memory and computation (at the cost of some accuracy). We give rigorous theoretical analysis of the resulting tradeoffs between memory, computation, and accuracy as we vary sparsity, showing that efficiency in memory and computation may be gained with little sacrifice in accuracy. In fact, our analysis will also apply more generally to any random projections with i.i.d. zero-mean entries and bounded second-, fourth-, sixth- and eighth-order moments, although we focus on the sparse-Bernoulli case.

In Section 2, we present a brief review of related work. The model assumptions and notation are in Section 3. We present an overview of the main contributions in Section 4. In Section 5, the main results are stated with some discussion of their consequences. Proofs are reserved to the supplementary material.
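The remark that a sparse Bernoulli projection amounts to adding two small subsets of entries and subtracting the results can be illustrated with a minimal NumPy sketch. The function names and the parameter values are hypothetical, and the entry distribution (values +1, 0, -1 with probabilities 1/(2s), 1 - 1/s, 1/(2s)) is assumed to be the sparse-Bernoulli distribution with sparsity parameter s introduced in Section 4.

```python
import numpy as np

def sparse_bernoulli_vector(p, s, rng):
    """Draw a length-p vector with entries +1, 0, -1 (assumed probabilities
    1/(2s), 1 - 1/s, 1/(2s)), so about p/s entries are nonzero."""
    return rng.choice([1.0, 0.0, -1.0], size=p,
                      p=[0.5 / s, 1.0 - 1.0 / s, 0.5 / s])

def project_by_subsets(x, r):
    """Compute <r, x> as (sum over the +1 positions) - (sum over the -1 positions).
    Only the entries of x at the nonzero positions of r are ever read."""
    plus = np.flatnonzero(r > 0)
    minus = np.flatnonzero(r < 0)
    return x[plus].sum() - x[minus].sum()

rng = np.random.default_rng(0)
p, s = 1000, 50                      # illustrative sizes, not from the paper
x = rng.standard_normal(p)           # one data sample
r = sparse_bernoulli_vector(p, s, rng)
y = project_by_subsets(x, r)         # one scalar random projection of x
```

No multiplications are needed, and the number of entries of the sample that must be accessed is roughly p/s rather than p, which is the source of the savings discussed in the text.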
Finally, we present experimental results demonstrating the performance and efficiency of our approach compared with prior work in Section 6.

2. Related Work

Algorithms that can efficiently recover PCs from a collection of full data samples have been an important topic in the literature for decades. A comprehensive survey of these algorithms can be found in (Halko et al., 2011b; Gilbert et al., 2012) and the references therein. This includes several lines of work.

The first involves techniques that are based on dimensionality reduction, sketching, and sub-sampling for low-rank matrix approximation, such as (Halko et al., 2011a). In these methods, the computational complexity is typically reduced by performing SVD on the smaller matrix obtained by sketching or sub-sampling. However, these methods require accessible storage of all the data samples. This may not be practical for modern data processing applications where data samples are too vast or generated too quickly to be stored accessibly.

The second line of work involves online algorithms specifically tailored to have extremely low memory complexity, such as (Arora et al., 2012) and the references therein. Typically, these algorithms assume that the data is streaming by and that real-time PC estimates are needed, and they obtain these by solving a stochastic optimization problem in which each arriving data sample is used to update the PCs in an iterative procedure. As two recent examples of this line of work, (Mitliagkas et al., 2013) show that a blockwise stochastic variant of the power method can recover PCs in this low-memory setting from O(p log p) samples, although the computational cost is not examined. Meanwhile, (Arora et al., 2013) bound the generalization error of PCs learned with their algorithm to new data samples and also analyze its computational cost.

Our problem lies somewhere between the above two lines of work. We don't assume that memory/data access is not a concern, but at the same time, we also don't assume the extremely restrictive setting where one-sample-at-a-time real-time PC updates are required.
Instead, we aim to reduce both memory and computation simultaneously for PCA across a broad class of big data settings, e.g. for enormous databases where loading into local memory may be difficult or costly, for streaming data when PC estimates do not have to be real-time, or for distributed data. We also aim to provide tunable tradeoffs for the amount of accuracy that will be sacrificed for each given reduction in memory/computation, in order to aid in choosing a desired balance point between these.

To do this, we recover PCs from random projections. There have been some related prior attempts to extract PCs from random projections of data (Fowler, 2009; Qi and Hughes, 2012). In both, the problem of recovering PCs from random projections has been considered only for dense Gaussian random projections. However, dense vectors are undesirable for practical applications since they require relatively high storage space and computation (including lots of floating point arithmetic), as noted in the introduction. Our work will make use of sparse random vectors with Bernoulli entries, which are more efficiently implementable in a large database environment.

Chen et al. (2013) have estimated the covariance matrix of data from general sub-Gaussian random projections to reduce memory use. However, convergence guarantees are given only for the case of infinite data samples, making it hard to realistically use these results in memory/computation vs. accuracy tradeoffs, and computational cost is not examined. We will address both these issues.

As a final note, we observe that our work also can be

viewed as an example of emerging ideas in computational statistics (see (Chandrasekaran and Jordan, 2013)) in which tradeoffs between computational complexity, dataset size, and estimation accuracy are explicitly characterized, so that a user may choose to reduce computation in very high-dimensional data settings with knowledge of the risk to the accuracy of the result.

3. Problem Formulation and Notation

In this paper, we focus on a statistical model for the data that is applicable to various scenarios. Assume that our original data in $\mathbb{R}^p$ are centered at $\bar{x} \in \mathbb{R}^p$ and $\{v_i\}_{i=1}^d \subset \mathbb{R}^p$ are the $d$ orthonormal PCs. We consider the following probabilistic generative model for the data samples,

$$x_i = \bar{x} + \sum_{j=1}^{d} w_{ij}\,\sigma_j v_j + z_i, \quad i = 1, \ldots, n,$$

where $\{w_i\}_{i=1}^n$ and $\{z_i\}_{i=1}^n$ are drawn i.i.d. from $\mathcal{N}(0, I_{d \times d})$ and $\mathcal{N}(0, \frac{\epsilon^2}{p} I_{p \times p})$, respectively. Also, $\{\sigma_i\}_{i=1}^d$ are scalar constants reflecting the energy of the data in each principal direction, such that $\sigma_1 > \sigma_2 > \ldots > \sigma_d > 0$. The additive noise term $z_i$ allows for some error in our assumptions. Note that the underlying covariance matrix of the data is $C_{true} \triangleq \sum_{j=1}^{d} \sigma_j^2 v_j v_j^T$, and the signal-to-noise ratio is $\mathrm{SNR} = h/\epsilon^2$, where $h \triangleq \sum_{j=1}^{d} \sigma_j^2$. In fact, this model is related to the spiked covariance model (Johnstone, 2001), in which the data's covariance matrix is assumed to be a low-rank perturbation of the identity matrix.

We then introduce a very general class of random projections. Assume that matrices $\{R_i\}_{i=1}^n \subset \mathbb{R}^{p \times m}$, $m < p$, are formed by drawing each of their i.i.d. entries from a distribution whose mean $\mu_1$ is assumed to be zero and whose $k$-th order moments $\mu_k$ are assumed finite for $k = 2, 4, 6, 8$. In particular, we will be interested in a popular class of sparse random projections, but our analysis will apply to any distribution satisfying these assumptions. Each random projection $y_i \in \mathbb{R}^m$ is then obtained by taking inner products of the data sample $x_i \in \mathbb{R}^p$ with the random vectors comprising the columns of $R_i$, i.e. $y_i = R_i^T x_i$.
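As a concrete illustration of the generative model and the projection step $y_i = R_i^T x_i$, here is a minimal NumPy sketch. The function names and all numeric values are hypothetical; the noise covariance $(\epsilon^2/p) I$ is read from the garbled source, and Gaussian entries are used for $R_i$ only as one example of a zero-mean distribution with finite moments.

```python
import numpy as np

def sample_spiked_data(n, p, sigmas, eps, center, rng):
    """Draw n samples x_i = center + sum_j w_ij * sigma_j * v_j + z_i.
    Returns the (n, p) sample matrix and the (p, d) orthonormal PCs."""
    d = len(sigmas)
    # Orthonormal PCs: Q factor of a random p x d Gaussian matrix.
    V, _ = np.linalg.qr(rng.standard_normal((p, d)))
    W = rng.standard_normal((n, d))                     # w_i ~ N(0, I_d)
    Z = (eps / np.sqrt(p)) * rng.standard_normal((n, p))  # z_i ~ N(0, eps^2/p I_p)
    X = center + (W * sigmas) @ V.T + Z
    return X, V

def random_projections(X, m, rng):
    """Draw an independent p x m matrix R_i per sample and return y_i = R_i^T x_i."""
    n, p = X.shape
    R = rng.standard_normal((n, p, m))       # zero-mean i.i.d. entries
    Y = np.einsum('ipm,ip->im', R, X)        # row i holds R_i^T x_i
    return R, Y

rng = np.random.default_rng(0)
n, p, m = 5, 20, 4                           # illustrative sizes only
sigmas = np.array([3.0, 2.0])
X, V = sample_spiked_data(n, p, sigmas, eps=0.5, center=np.ones(p), rng=rng)
R, Y = random_projections(X, m, rng)
```

Storing a separate dense $R_i$ per sample, as done here for clarity, would defeat the memory savings in practice; a real implementation would regenerate each $R_i$ from a seed or use the sparse distribution discussed in the paper.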
The main goal of this paper is to provide theoretical guarantees for estimating the center and PCs of $\{x_i\}_{i=1}^n$ from these random projections.

4. Our Contributions

In this paper, we introduce two estimators for the center and underlying covariance matrix of data $\{x_i\}_{i=1}^n$ from sparse random projections $\{y_i = R_i^T x_i\}_{i=1}^n$. In typical PCA, the center is estimated using the empirical center $\bar{x}_{emp} = \frac{1}{n}\sum_{i=1}^n x_i$. PCs are then obtained by eigendecomposition of the empirical covariance matrix $C_{emp} = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x}_{emp})(x_i - \bar{x}_{emp})^T$, which typically comes close to the true covariance matrix (Vershynin, ). Similar to typical PCA, we show that the empirical center and empirical covariance matrix of the new data samples $\{R_i y_i\}_{i=1}^n$ (scaled by a known factor) result in accurate estimates of the original center $\bar{x}$ and the true underlying covariance matrix $C_{true}$. (Note that $R_i y_i$ approximately represents a projection in $\mathbb{R}^p$ of $x_i$ onto the column space of $R_i$, but we have eliminated a computationally expensive matrix inverse here.) We will provide rigorous theoretical analysis for the performance of these estimators in terms of parameters such as the measurement ratio $m/p$, number of samples $n$, SNR, and moments $\mu_k$.

Our approach is quite general and we believe it can eventually be applicable to various data processing applications in which the data is very high-dimensional, streaming, or distributed. Particularly for the case of distributed data, we may need to adjust the set-up to ensure the random projections respect network topology, but we believe it could be done following the strategies in (Wang et al., a;b).

We will be particularly interested in applying our general distribution results to the case of very sparse measurement matrices. Achlioptas (2001) first showed that, in the classic Johnson-Lindenstrauss result on pairwise distance preservation, the dense Gaussian projection matrices can be replaced with sparse projection matrices, where each entry is distributed on $\{-1, 0, 1\}$ with probabilities $\{\frac{1}{6}, \frac{2}{3}, \frac{1}{6}\}$, achieving a threefold speedup in processing time.
Li et al. (2006) then drew each entry from $\{-1, 0, 1\}$ with probabilities $\{\frac{1}{2s}, 1 - \frac{1}{s}, \frac{1}{2s}\}$, achieving a more significant $s$-fold speedup in processing time. In this paper, we refer to this second distribution as a sparse-Bernoulli distribution with sparsity parameter $s$. Sparse random projections have been applied in many other applications to substantially reduce computational complexity and memory requirements (Omidiran and Wainwright, 2010; Zhang et al., ).

Motivated by the success of these methods, we propose to recover PCs from sparse random projections of the data, in which each entry of $\{R_i\}_{i=1}^n$ is drawn i.i.d. from the sparse-Bernoulli distribution. In this case, each column of $\{R_i\}_{i=1}^n$ has $p/s$ nonzero entries, on average. This choice has the following properties simultaneously.

The computation cost for obtaining each projection is $O(\frac{mp}{s})$ and thus the cost to acquire/access/hold in memory the data needed for the algorithm is $O(\frac{mpn}{s})$. Specifically, we are interested in choosing $m$ and $s$ so that the compression factor $\gamma \triangleq \frac{m}{s} < 1$. In this case, our framework requires significantly less computation cost and storage space. First, the computation cost to acquire/access each data sample is $O(\gamma p)$, $\gamma < 1$, in contrast to the cost for acquiring each original data sample, $O(p)$. This results in a substantial cost reduction for the sensing process, e.g. for streaming data. Second, once acquired, observe that the projected data samples $\{R_i y_i\}_{i=1}^n \subset \mathbb{R}^p$ will be sparse, having at most $O(\gamma p)$ nonzero entries each. This results in a significant reduction, $O(\gamma p n)$ as opposed to

$O(pn)$, in memory/storage requirements and/or communication cost, e.g. transferring distributed data to a central processing unit.

Given the sparse data matrix formed by $\{R_i y_i\}_{i=1}^n$, one can make use of efficient algorithms for performing (partial) SVD on very large sparse matrices, such as the Lanczos algorithm (Golub and Van Loan, ) and svds in MATLAB. In general, for a $p \times n$ matrix, the computational cost of SVD is $O(p^2 n)$. However, for large sparse matrices such as ours, the cost can be reduced to $O(\gamma p^2 n)$ (Lin and Gunopulos, 2003).

In the remainder of this paper, we will characterize the accuracy of the estimated center and PCs in terms of $m$, $p$, $n$, SNR, moments of the distribution (which for sparse-Bernoulli will scale with $s$), etc. As we will see, under certain conditions on the PCs, we may choose $\gamma$ as low as $\gamma \propto 1/p$ for constant accuracy. Hence, assuming $n = O(p)$ samples, the memory/storage requirements for our approach can scale with $p$ in contrast to $p^2$ for standard algorithms that store the full data, and a similar factor of $p$ savings in computation can be achieved compared with regular SVD. Less aggressive savings will also be available for other PC types.

5. Main Results

We present the main results of our work in this section, with all proofs delayed to the supplemental material. Interestingly, we will see that the shape of the distribution for each entry of $\{R_i\}_{i=1}^n$ plays an important role in our results. The kurtosis, defined as $\kappa \triangleq \frac{\mu_4}{\mu_2^2} - 3$, is a measure of peakedness and heaviness of tail for a distribution. It can also be thought of as a measure of non-Gaussianity, since the kurtosis of the Gaussian distribution is zero. It turns out that the distribution's kurtosis is a key factor in determining PC estimation accuracy. For sparse-Bernoulli, the kurtosis increases with increasing sparsity parameter $s$.

5.1. Mean and Variance of Center Estimator

Theorem 1. Assume that $\{R_i\}_{i=1}^n$, $\{x_i\}_{i=1}^n$, $\{y_i\}_{i=1}^n$, $m$, $n$, and $\mu_2$ are as defined in Section 3, and define the $n$-sample center estimator $\hat{x}_n = \frac{1}{m\mu_2}\,\frac{1}{n}\sum_{i=1}^n R_i y_i$.
Then, the mean of the estimator $\hat{x}_n$ is the true center of the original data $\bar{x}$, i.e. $E[\hat{x}_n] = \bar{x}$, for all $n$, including the base case $n = 1$. Furthermore, as $n \to \infty$, the estimator $\hat{x}_n$ converges to the true center: $\lim_{n \to \infty} \hat{x}_n = \bar{x}$.

We see that the empirical center of $\{R_i y_i\}_{i=1}^n$ is a (scaled) unbiased estimator for the true center $\bar{x}$. Note that this theorem does not depend on the number of projections $m$ or sparsity parameter $s$, and thus does not depend on $\gamma$, as a sufficiently high number of samples will compensate for unfavorable values of these parameters. We further note that, when $n \to \infty$, there is no difference between the Gaussian, very sparse, or other choices of random projections. This is consistent with the observation that random projection matrices consisting of i.i.d. entries must only be zero mean to preserve pairwise distances in the Johnson-Lindenstrauss theorem (Li et al., 2006).

Theorem 2. Assume that $\{R_i\}_{i=1}^n$, $\{x_i\}_{i=1}^n$, $\{y_i\}_{i=1}^n$, $m$, $n$, $p$, $\mu_2$, $h$, and SNR are as defined in Section 3, and kurtosis $\kappa$ is as defined above. Then, the variance of the unbiased center estimator $\hat{x}_n = \frac{1}{m\mu_2}\,\frac{1}{n}\sum_{i=1}^n R_i y_i$ is

$$\mathrm{Var}(\hat{x}_n) = \frac{1}{n}\left[\frac{\kappa + p + 1}{m}\,\|\bar{x}\|_2^2 + \left(\frac{\kappa + p + 1}{m} + 1\right)\left(1 + \frac{1}{\mathrm{SNR}}\right) h\right]. \quad (5.1)$$

We see that as the number of samples $n$ and measurement ratio $m/p$ increase, the variance of this estimator decreases at rate $\frac{1}{n}$ and close to $\frac{1}{m/p}$. Interestingly, the power of the signal, i.e. $h = \sum_{j=1}^d \sigma_j^2$, works against the accuracy of the estimator. The intuition for this is that, for the center estimation problem, it is desirable to have all the data samples close to the center, which happens for small $h$.

For sparse random projections, we observe that the kurtosis is $\kappa = s - 3$ and thus $\frac{\kappa + p}{m} \approx \frac{s + p}{m}$. Hence, variance scales with increasing sparsity, although sufficient data samples $n$ are enough to combat this effect. Indeed, when $s > p$, the variance increases heavily since many of the random vectors are zero, and thus the corresponding projections cannot capture any information about the original data.
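The center estimator of Theorem 1 can be tried numerically with a short sketch. It assumes the normalization $\frac{1}{m \mu_2 n}\sum_i R_i y_i$ (the scaling is partly garbled in the source) and the sparse-Bernoulli entries of Section 4, for which $\mu_2 = 1/s$. Function names, sizes, and the seed are hypothetical.

```python
import numpy as np

def sparse_bernoulli(shape, s, rng):
    """Entries +1, 0, -1 with probabilities 1/(2s), 1 - 1/s, 1/(2s)."""
    return rng.choice([1.0, 0.0, -1.0], size=shape,
                      p=[0.5 / s, 1.0 - 1.0 / s, 0.5 / s])

def estimate_center(X, m, s, rng):
    """Average of R_i y_i with y_i = R_i^T x_i, rescaled by 1/(m * mu_2)."""
    n, p = X.shape
    mu2 = 1.0 / s                      # second moment of a sparse-Bernoulli entry
    total = np.zeros(p)
    for x in X:
        R = sparse_bernoulli((p, m), s, rng)   # fresh p x m matrix per sample
        total += R @ (R.T @ x)                 # R_i y_i, no matrix inverse needed
    return total / (n * m * mu2)

rng = np.random.default_rng(1)
n, p, m, s = 5000, 40, 8, 3            # illustrative values, not from the paper
center = np.linspace(1.0, 2.0, p)
X = center + 0.1 * rng.standard_normal((n, p))
x_hat = estimate_center(X, m, s, rng)
rel_err = np.linalg.norm(x_hat - center) / np.linalg.norm(center)
```

With these settings the relative error is expected to be on the order of a few percent; increasing `s` or decreasing `m` at fixed `n` should make the estimate noisier, in line with the tradeoff described above.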
Overall, this result shows an explicit tradeoff between reducing $n$ or increasing $s$ to reduce memory/computation and the variance of the resulting estimator. Finally, given this mean and variance, probabilistic error bounds can be immediately obtained via Chebyshev, Bernstein, etc. inequalities.

5.2. Mean and Variance of Covariance Estimator

Theorem 3. Assume that $\{R_i\}_{i=1}^n$, $\{x_i\}_{i=1}^n$, $\{y_i\}_{i=1}^n$, $m$, $n$, $p$, $\mu_2$, $h$, $\epsilon$, and $C_{true}$ are as defined in Section 3, and $\kappa$ is the kurtosis. Moreover, assume that $\{x_i\}_{i=1}^n$ are centered at $\bar{x} = 0$. Define the $n$-sample covariance estimator $\hat{C}_n = \frac{1}{(m^2 + m)\mu_2^2}\,\frac{1}{n}\sum_{i=1}^n R_i y_i y_i^T R_i^T$. Then, for all $n$, the mean of this estimator is

$$E[\hat{C}_n] = \hat{C}_{true} + E,$$

where $\hat{C}_{true} \triangleq C_{true} + \alpha I_{p \times p}$, $\alpha \triangleq \frac{h}{m+1} + \frac{\kappa + m + p + 1}{(m+1)\,p}\,\epsilon^2$, and $E \triangleq \frac{\kappa}{m+1}\sum_{j=1}^d \sigma_j^2\, \mathrm{diag}(v_j v_j^T)$, where $\mathrm{diag}(A)$ denotes the matrix formed by zeroing all but the diagonal entries of $A$. Furthermore, let $C_\infty \triangleq \hat{C}_{true} + E$. Then, as $n \to \infty$, the estimator $\hat{C}_n$ converges to $C_\infty$: $\lim_{n \to \infty} \hat{C}_n = C_\infty$.

We observe that the limit of the estimator $\hat{C}_n$ has two components. The first, $\hat{C}_{true}$, has the same eigenvectors as $C_{true}$ with slightly perturbed eigenvalues ($\alpha$ tends to be very small in high dimensions), and the other, $E$, is an error perturbation term. Both $\alpha$ and $E$ scale with the kurtosis, reflecting the necessary tradeoff between increasing sparsity (decreasing memory/computation) and maintaining accuracy.

We first consider a simple example to gain some intuition for this theorem. A set of data samples $\{x_i\}_{i=1}^{n} \subset \mathbb{R}^{p}$

are generated from one PC. We also generate the measurement matrices $\{R_i\}_{i=1}^{n} \subset \mathbb{R}^{p \times m}$, at a fixed measurement ratio $m/p$, with i.i.d. entries both for the Gaussian distribution and the sparse-Bernoulli distribution for various values of the sparsity parameter $s$. In Fig. 5.1, we view two dimensions (the original PC's and one other) of the data $\{x_i\}_{i=1}^{n}$ and the scaled projected data $\frac{1}{\sqrt{(m^2+m)\mu_2^2}}\{R_i y_i\}_{i=1}^{n}$, represented by blue dots and red circles respectively. We see that the projected data samples are scattered somewhat into other directions for all four cases. However, the amount of scattered energy for the Gaussian and sparse-Bernoulli for $s = 3$ is quite small. This can be easily verified from the fact that the amount of perturbation depends on the kurtosis, and for both cases the kurtosis is $\kappa = 0$. As we increase the parameter $s$, the kurtosis $\kappa = s - 3$ gets larger, and this is consistent with the observation that the projected data samples get more scattered into other directions. We also note the similarity of our findings to the result of (Li et al., 2006) that the variance of the pairwise distances in Johnson-Lindenstrauss depends on the kurtosis of the distribution being used for random projections. Despite the perturbation, in all cases, the PC can be recovered accurately. Note also that scaling the projected data points by $1/\sqrt{(m^2+m)\mu_2^2}$ preserves the energy in the direction of the PC (i.e. the eigenvalue).

In Theorem 3, we see that $C_{true}$ and $\hat{C}_{true}$ have the same set of eigenvectors, with the eigenvalues of $C_{true}$ increased by $\alpha = h\left\{\frac{1}{m+1} + \frac{\kappa + m + p + 1}{(m+1)\,p}\,\frac{1}{\mathrm{SNR}}\right\}$. Thus, $\alpha/h$ is a decreasing function of $p$, $m/p$ and SNR, and in particular goes to $0$ as $p \to \infty$ for constant projection ratio $m/p$. This is illustrated in Fig. 5.2. Thus, surprisingly, in the high-dimensional regime, the amount of perturbation of eigenvalues becomes increasingly negligible even for small measurement ratios.

Now, let's examine the error matrix $E$.
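The single-PC example above can be reproduced in spirit with a minimal sketch of the covariance estimator. It assumes the normalization $\frac{1}{(m^2+m)\mu_2^2 n}\sum_i R_i y_i y_i^T R_i^T$ (reconstructed from the garbled source), zero-centered data, and sparse-Bernoulli entries with $\mu_2 = 1/s$. The sizes, the smooth PC $v = \mathbf{1}/\sqrt{p}$, and the noise level are hypothetical choices, not the paper's experimental settings.

```python
import numpy as np

def sparse_bernoulli(shape, s, rng):
    """Entries +1, 0, -1 with probabilities 1/(2s), 1 - 1/s, 1/(2s)."""
    return rng.choice([1.0, 0.0, -1.0], size=shape,
                      p=[0.5 / s, 1.0 - 1.0 / s, 0.5 / s])

def estimate_covariance(X, m, s, rng):
    """Average of (R_i y_i)(R_i y_i)^T, rescaled by 1/((m^2 + m) * mu_2^2)."""
    n, p = X.shape
    mu2 = 1.0 / s
    C = np.zeros((p, p))
    for x in X:
        R = sparse_bernoulli((p, m), s, rng)
        u = R @ (R.T @ x)              # projected sample R_i y_i
        C += np.outer(u, u)
    return C / (n * (m**2 + m) * mu2**2)

rng = np.random.default_rng(2)
n, p, m, s = 3000, 40, 8, 3            # s = 3 gives kurtosis 0, so E = 0
v = np.ones(p) / np.sqrt(p)            # a smooth unit-norm PC
sigma = 5.0
X = sigma * rng.standard_normal((n, 1)) * v + 0.1 * rng.standard_normal((n, p))

C_hat = estimate_covariance(X, m, s, rng)
eigvals, eigvecs = np.linalg.eigh(C_hat)   # eigenvalues in ascending order
v_hat = eigvecs[:, -1]                     # estimated PC
alignment = abs(v_hat @ v)                 # inner product magnitude with true PC
```

The leading eigenvalue of `C_hat` should sit somewhat above `sigma**2`, reflecting the additive shift $\alpha$ discussed above, while the leading eigenvector stays close to `v`.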
We observe that $E$ can be viewed as representing a bias of the estimated PCs towards the nearest canonical basis vectors; it stems from anisotropy in the distribution for $R_i$ when this is non-Gaussian (note $\kappa = 0$, and thus $E = 0$, for the Gaussian case). In later sections, we will use the 2-norm of $E$, $\|E\|_2$, to bound the angle between the estimated and true PCs. Indeed, we find that, for constant $\|E\|_2/h$, the same angular PC estimation error is achieved. We now study $\|E\|_2/h$, leading to useful observations, for several types of PCs. (An expanded discussion with full derivations is included in the supplementary materials.)

(1) Smooth PCs: It has frequently been observed that sparse-Bernoulli random projections are most effective on vectors that are smooth (Ailon and Chazelle, 2009), meaning that their maximum entry is of size $O(\frac{1}{\sqrt{p}})$. Large images, videos, and other natural signals with distributed energy are obvious examples of this type. (Other signals are often preconditioned to be smooth via multiplication with a Hadamard conditioning matrix.) We may easily observe then that $\|E\|_2 \leq \frac{\kappa}{m+1}\, h\, \mu_{max}^2$, or $\frac{\|E\|_2}{h} \leq \frac{\kappa}{m+1}\, \mu_{max}^2$, where $\mu_{max}$ is the mutual coherence (Elad, 2007) between the PCs and the canonical basis, and we note $\frac{\kappa}{m+1} \approx \frac{1}{\gamma}$. As we will see in Section 5.3, we will want to keep $\|E\|_2/h$ small enough to guarantee a certain fixed angular error $\theta$. In fact, this can be satisfied by requiring $\gamma \geq C(\theta)\,\mu_{max}^2$, where $C(\theta)$ is a constant depending on the error $\theta$. Hence, for smooth PCs, we need only have $\gamma \propto 1/p$, reducing memory and computation by a rather remarkable factor of $p$.

Figure 5.2. Variation of the parameter $\alpha/h$ for two values of the kurtosis $\kappa$, panels (a) and (b), varying $p$ and the measurement ratio $m/p$, with fixed SNR = 5. (Axes: measurement ratio $m/p$ versus $\alpha/h$, one curve per value of $p$.)

(2) All Sparse PCs: In the case of all sparse PCs, we may write $E$ as $E = \frac{\kappa}{m+1}\, C_{true} + E'$, where $\|E'\|_2$ is bounded by a quantity depending on $\kappa$ and $\mu_{min}$, and $\mu_{min} \triangleq \min_{1 \leq i \leq d} \max_{1 \leq j \leq p} |\langle v_i, e_j \rangle|$ represents the closeness of the PCs to the canonical basis $\{e_j\}_{j=1}^p$.
Thus, unlike for other sparse-Bernoulli applications, we find that sparse PCs can still be recovered very well here, although the eigenvalues may be heavily scaled by a known factor that depends on $\kappa$. Correcting for this scaling, and taking $E'$ as the resulting error term, we can choose $\gamma$ as a function of $\mu_{min}$ so as to maintain a constant error ratio.

(3) Neither Sparse nor Smooth PCs: In this case, we can still apply the analysis for case (1), just with a larger $\mu_{max}$ and less aggressive memory/computation savings.

(4) Mixture of PC Types: In this case, we may split $E$ into two error matrices, associated with each of the sparse and non-sparse PCs. Recovery of the $d$-dimensional PC subspace still performs well here. However, if the eigenvalues $\{\sigma_j^2\}_{j=1}^d$ do not decay sufficiently fast, scaling of the eigenvalues for the sparse PCs may reorder the individual components. Please see the supplementary material for further discussions and simulations.

Theorem 4. Assume that $\{R_i\}_{i=1}^n$, $\{x_i\}_{i=1}^n$, $\{y_i\}_{i=1}^n$, $m$, $n$, $p$, $\mu_k$, $h$, and SNR are as defined in Section 3. Consider the covariance matrix estimator $\hat{C}_n = \frac{1}{(m^2+m)\mu_2^2}\,\frac{1}{n}\sum_{i=1}^n R_i y_i y_i^T R_i^T$. Then, the deviation of our $n$-sample estimator from its mean value is upper bounded:

$$E\left[\|\hat{C}_n - C_\infty\|_F^2\right] \leq \frac{1}{n}\,(\cdots), \quad (5.2)$$

where the terms in parentheses depend on $h$, $p$, $\kappa$, SNR, and a parameter $\tau$,

Figure 5.1. Accurate recovery of the PC under random projections using both Gaussian and sparse random projection matrices for various values of $s$. Panels: (a) Gaussian, (b) $s = 3$, (c) and (d) two larger values of $s$. In each panel, the data samples are uniformly distributed on a line, and $\{R_i\}_{i=1}^n \subset \mathbb{R}^{p \times m}$ are generated with i.i.d. entries drawn from (a) $\mathcal{N}(0, 1)$ and (b,c,d) the sparse-Bernoulli distribution. In each panel, we view two dimensions (the original PC's and one other) of the data $\{x_i\}_{i=1}^n$ (blue dots) and the scaled projected data $\frac{1}{\sqrt{(m^2+m)\mu_2^2}}\{R_i R_i^T x_i\}_{i=1}^n$ (red circles). We observe that, in all cases, the projected data samples are symmetrically distributed around the PC, and the inner product magnitude between the PC estimated from the projected data and the true PC is at least 0.998.

and $\tau = \max(\tau_1, \tau_2)$, where $\tau_1$ and $\tau_2$ are functions of $p$, $m/p$, and the moment ratios $\mu_8/\mu_2^4$, $\mu_6/\mu_2^3$, and $\mu_4/\mu_2^2$. Note that $\tau$ has various terms that scale with $p$, $m/p$, and the higher order moments $\mu_8/\mu_2^4$, $\mu_6/\mu_2^3$, and $\mu_4/\mu_2^2$.

We see that as the number of data samples $n$ increases, the variance decreases at rate $\frac{1}{n}$, converging quickly to the limit. Moreover, the variance of our estimator is a decreasing function of the measurement ratio $m/p$ and SNR. We further note that the parameter $\tau$ gives us important information about the effect of the tails of the distribution on the convergence rate of the covariance estimator. More precisely, for sparse random projections, we see that $\frac{\mu_8/\mu_2^4}{m^3} = \left(\frac{s}{m}\right)^3 = \frac{1}{\gamma^3}$, $\frac{\mu_6/\mu_2^3}{m^2} = \frac{1}{\gamma^2}$, and $\frac{\mu_4/\mu_2^2}{m} = \frac{1}{\gamma}$. Hence, for a fixed number of data samples, decreasing the compression factor $\gamma$ leads to an increase of the variance and a loss in accuracy, as we will see in Section 6. This is as we would expect, since there is an inherent tradeoff between saving computation and memory and the accuracy.
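The moment ratios of the sparse-Bernoulli distribution that drive these bounds can be checked exactly with a few lines of Python. This sketch uses the entry distribution from Section 4 (values +1, 0, -1 with probabilities 1/(2s), 1 - 1/s, 1/(2s)); the function names are hypothetical, and `s` is taken to be a positive integer so that exact rational arithmetic applies.

```python
from fractions import Fraction

def sparse_bernoulli_moment(s, k):
    """Exact k-th moment E[r^k] of one sparse-Bernoulli entry r (k >= 1)."""
    probs = {1: Fraction(1, 2 * s), 0: 1 - Fraction(1, s), -1: Fraction(1, 2 * s)}
    return sum(prob * value**k for value, prob in probs.items())

def kurtosis(s):
    """kappa = mu_4 / mu_2^2 - 3."""
    mu2 = sparse_bernoulli_moment(s, 2)
    mu4 = sparse_bernoulli_moment(s, 4)
    return mu4 / mu2**2 - 3

def moment_ratio(s, k):
    """mu_k / mu_2^(k/2) for even k; equals s^(k/2 - 1) for this distribution."""
    mu2 = sparse_bernoulli_moment(s, 2)
    return sparse_bernoulli_moment(s, k) / mu2**(k // 2)
```

Every even moment equals 1/s, so `kurtosis(s)` returns s - 3 and `moment_ratio(s, 8)`, `moment_ratio(s, 6)`, `moment_ratio(s, 4)` return s^3, s^2, and s. This is why heavier sparsification (larger s) inflates the variance terms.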
However, characterizing this tradeoff allows $\gamma$ to be chosen in an informed way for large datasets.

5.3. Memory, Computation and PC Accuracy Tradeoffs

We now use the covariance matrix estimator results to bound the error of its eigenvalues and eigenvectors, using related results from matrix perturbation theory. First, note that using the variance of our estimator (Eq. 5.2) in the Chebyshev inequality yields $\|\hat{C}_n - C_\infty\|_F \leq \varepsilon$ with probability at least $1 - \frac{1}{n\varepsilon^2}(\cdots)$, where $(\cdots)$ is the parenthesized term of Eq. 5.2. Hence,

$$\|\hat{C}_n - \hat{C}_{true}\|_2 \leq \|\hat{C}_n - C_\infty\|_F + \|E\|_2 \leq \|E\|_2 + \varepsilon \quad (5.3)$$

with the same probability. In fact, Eq. 5.3 can be used to characterize tradeoffs between memory, computation, and PC estimation accuracy (as an angle between estimated subspaces) in terms of our parameters $n$, $m/p$, etc. For simplicity in what follows, and to help keep the intuition clear, we focus on the case where the number of samples $n \to \infty$ and $\varepsilon \to 0$ in Eq. 5.3 above. However, it is trivial to adjust these results to the case of finite $n$ by including a nonzero $\varepsilon$ in the derivations that follow.

For illustrative purposes, we start by analyzing the case of a single PC and use the following Lemma. In the following, $\lambda(A)$ and $\lambda_i(A)$ denote the set of all eigenvalues and the $i$-th eigenvalue of $A$, respectively.

Lemma 5. (Hogben, 2006; Davis and Kahan, 1970) Suppose $A$ is a real symmetric matrix and $\tilde{A} = A + E$ is the perturbed matrix. Assume that $(\tilde{\lambda}, \tilde{v})$ is an exact eigenpair of $\tilde{A}$ where $\|\tilde{v}\|_2 = 1$. Then
(a) $|\tilde{\lambda} - \lambda| \leq \|E\|_2$ for some eigenvalue $\lambda$ of $A$.
(b) Let $\lambda$ be the closest eigenvalue of $A$ to $\tilde{\lambda}$ and $v$ be its associated eigenvector with $\|v\|_2 = 1$, and let $\delta = \min_{\beta \in \lambda(A),\, \beta \neq \lambda} |\tilde{\lambda} - \beta|$. If $\delta > 0$, then

$$\sin \angle(\tilde{v}, v) \leq \frac{\|E\|_2}{\delta}, \quad (5.4)$$

where $\angle(\tilde{v}, v)$ denotes the canonical angle between the two eigenvectors.

We will use this Lemma to bound the angle between the PC estimate from $\hat{C}_n$ and the true PC in the single PC

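case. Since $C_{true}$ has only one eigenpair $(h, v)$ with nonzero eigenvalue, $\hat{C}_{true}$ has an eigenpair $(h + \alpha, v)$ and $\lambda_i(\hat{C}_{true}) = \alpha$, $i = 2, \ldots, p$. From Lemma 5, we see that the largest eigenvalue of $\hat{C}_n$ satisfies $|\lambda_1(\hat{C}_n) - (h + \alpha)| \leq \|E\|_2$. We find the parameter $\delta$:

$$\delta = \min_{i = 2, \ldots, p} \left|\lambda_1(\hat{C}_n) - \lambda_i(\hat{C}_{true})\right| = \left|\lambda_1(\hat{C}_n) - \alpha\right| \geq h - \|E\|_2 = h\left(1 - \frac{\|E\|_2}{h}\right). \quad (5.5)$$

We then get the following tradeoff between the accuracy of the estimated eigenvector and the parameters of our model:

$$\sin \angle(\tilde{v}, v) \leq \frac{\|E\|_2/h}{1 - \|E\|_2/h}. \quad (5.6)$$

This equation allows us to characterize the statistical tradeoff between the sparsity parameter $s$ and the accuracy of the estimated PC. Observe that this is the same ratio $\|E\|_2/h$ that we discussed in Section 5.2. To ensure a fixed maximum angular error $\theta$ for PC estimation, i.e. $\sin \angle(\tilde{v}, v) \leq \sin\theta$, we should choose the parameters such that $\frac{\|E\|_2}{h} \leq \frac{\sin\theta}{1 + \sin\theta}$. For smooth PCs, we may satisfy this by choosing $\gamma \geq C(\theta)\,\mu_{max}^2$ for $C(\theta) \triangleq \frac{1 + \sin\theta}{\sin\theta}$, which gives $\gamma \propto O(\frac{1}{p})$. Hence, the memory/storage requirements of our method can scale with $p$, in contrast to standard algorithms that scale with $p^2$, while the computational complexity of SVD can scale with $p^2$ as opposed to $p^3$. Although the smooth case is of special interest, less aggressive, but still substantial, savings are also available for other PC types.

For the general case of $d$ PCs, we consider the eigendecomposition of the perturbed matrix $\hat{C}_n$ and $\hat{C}_{true}$:

$$\hat{C}_{true} = \begin{bmatrix} V & V_\perp \end{bmatrix} \begin{bmatrix} S & 0 \\ 0 & S_\perp \end{bmatrix} \begin{bmatrix} V^T \\ V_\perp^T \end{bmatrix}, \qquad \hat{C}_n = \begin{bmatrix} \tilde{V} & \tilde{V}_\perp \end{bmatrix} \begin{bmatrix} \tilde{S} & 0 \\ 0 & \tilde{S}_\perp \end{bmatrix} \begin{bmatrix} \tilde{V}^T \\ \tilde{V}_\perp^T \end{bmatrix}.$$

The distance between each perturbed eigenvalue and the corresponding original eigenvalue depends on the amount of perturbation. We now have that $|\lambda_j(\hat{C}_n) - \lambda_j(\hat{C}_{true})| \leq \|E\|_2$ for all $j = 1, \ldots, d$. Moreover, it is possible to quantify the rotation of eigenvectors using the notion of the canonical angle matrix defined in (Davis and Kahan, 1970). Note that $V, \tilde{V} \in \mathbb{R}^{p \times d}$ are the first $d$ (true and estimated) PCs. The canonical angles between them are defined as $\theta_i = \arccos \varsigma_i$, where $\{\varsigma_i\}_{i=1}^d$ are the singular values of $(\tilde{V}^T \tilde{V})^{-1/2}\, \tilde{V}^T V\, (V^T V)^{-1/2}$, in our case just $\tilde{V}^T V$. The canonical angle matrix is then defined as $\Theta(\tilde{V}, V) = \mathrm{diag}(\theta_1, \ldots, \theta_d)$.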
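The canonical angles defined above can be computed directly from the singular values of $\tilde{V}^T V$. The following NumPy sketch assumes both inputs already have orthonormal columns (so the normalizing factors in the definition reduce to the identity); the function name is hypothetical.

```python
import numpy as np

def canonical_angles(V_est, V_true):
    """Canonical angles (radians) between the column spaces of two
    p x d matrices with orthonormal columns."""
    singular_values = np.linalg.svd(V_est.T @ V_true, compute_uv=False)
    # Clip to guard against values slightly above 1 from rounding error.
    return np.arccos(np.clip(singular_values, -1.0, 1.0))

# Example: a unit vector rotated by a known angle t within a coordinate plane.
t = 0.3
e1 = np.array([[1.0], [0.0], [0.0]])
rotated = np.array([[np.cos(t)], [np.sin(t)], [0.0]])
angle = canonical_angles(rotated, e1)[0]
```

The sine of the largest returned angle is the quantity bounded in the text; note that `arccos` loses precision for angles very close to zero, so tiny angles are better compared with a loose tolerance.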
Based on the results given in (Davis and Kahan, 1970; Gilbert et al., 2012):

$$\|\sin \Theta(\tilde{V}, V)\|_2 \leq \frac{\|E\|_2}{\delta},$$

where $\delta \triangleq \min_{1 \leq i \leq d,\, 1 \leq j \leq p-d} |(\tilde{S})_{ii} - (S_\perp)_{jj}| > 0$. Using the same logic as in Eq. 5.5, we find $\delta \geq \sigma_d^2 - \|E\|_2$. Hence, choosing $s$, $m$, etc. such that $\|E\|_2 < \sigma_d^2$, the maximum canonical angle between $\tilde{V}$ and $V$ satisfies

$$\sin \theta_i \leq \frac{\|E\|_2}{\sigma_d^2 - \|E\|_2}, \quad i = 1, \ldots, d. \quad (5.7)$$

This is the same form we saw in Eq. 5.6. Hence, for smooth PCs, we may again choose $\gamma \propto 1/p$.

Figure 6.1. Results for synthetic data: (a) normalized estimation error for the center for varying $n$ and $\gamma$, (b) magnitude of the inner product between the estimated and true PC for varying $\gamma$, (c) normalized estimation error for the singular value for varying $\gamma$, and (d) computation time to perform the SVD for the original vs. randomly projected data for varying $\gamma$. (All panels are plotted against the dimension $p$.)

6. Experimental Results

In this section, we examine the tradeoffs between memory, computation, and accuracy for the sparse random projections approach on both synthetic and real-world datasets.

First, we synthetically generate samples $\{x_i\}_{i=1}^n \subset \mathbb{R}^p$ distributed along one PC. Each entry of the center and of the PC is drawn from a uniform distribution, and the PC is then normalized to have unit $\ell_2$-norm. We consider a relatively noisy situation. We then estimate the center of the original data from the sparse random projections, at a fixed measurement ratio $m/p$, for varying $n$ and compression factors $\gamma$. Our results are averaged over independent trials. Fig. 6.1(a) shows the accuracy for the estimated center, where the error is the distance between the estimated and the true center normalized by the true center's norm. As expected, when $n$ or the dimension $p$ increase, the compression factor $\gamma$ can be tuned to achieve a substantial reduction of storage space while obtaining accurate estimates. This is desirable for high-dimensional data stream processing.
We then fix n=p, and plot the inner product magnitude between the estimated and true PC in Fig. 6.1(b) and the computation time in Fig. 6.1(d) for varying $\gamma$. We observe that, despite saving nearly two orders of magnitude in computation time and also in memory (note […] = 5, […], […]) compared to PCA on the full data, the PC is well-estimated. Moreover, the approach remains increasingly effective for higher dimensions, which is of crucial importance for modern data processing applications. We further note that, as the dimension increases, we can decrease the compression factor $\gamma$ while still achieving a desired performance. For example, $\gamma =$ […] for $p =$ […] and $\gamma =$ […] for $p =$ […] have almost the same accuracy. This is consistent with the […] / […] $p$ observation from before.

We also plot the estimation error for the singular value in Fig. 6.1(c). The error is the distance between the singular value obtained by performing SVD on $\{R_i y_i\}_{i=1}^{n}$ and on the original data $\{x_i\}_{i=1}^{n}$, normalized by the latter value.

Figure 6.2. Results for the MNIST dataset. Our proposed approach is compared with two methods: (1) performing MATLAB's svds on the full original data, (2) BSOI (Mitliagkas et al., 2013). Plot of (a) performance accuracy based on the explained variance and (b) computation time for performing SVD. We see that our approach performs as well as SVD on the original data and outperforms BSOI with significantly less computation time. [Vertical axes: (a) Explained Variance, (b) Time in sec. (log scale); horizontal axis in both panels: Number of PCs; curves: SVD on Original Data, Our Method for two values of $\gamma$, BSOI.]

Finally, we consider the MNIST dataset to see a real-world application outside the spiked covariance model. This dataset contains 70,000 samples of handwritten digits, which we have resized to […] pixels. Hence, we have 70,000 samples in $\mathbb{R}^{[…]}$. To evaluate the performance of our method, we use the explained variance described in (Mitliagkas et al., 2013). Given estimates of $d$ PCs $\tilde{V} \in \mathbb{R}^{p \times d}$ and the data matrix $X$, the fraction of explained variance is defined as $\mathrm{tr}(\tilde{V}^T X X^T \tilde{V}) / \mathrm{tr}(X X^T)$.
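The explained-variance ratio defined above can be evaluated without ever forming the $p \times p$ matrix $X X^T$, since $\mathrm{tr}(\tilde{V}^T X X^T \tilde{V}) = \|\tilde{V}^T X\|_F^2$ and $\mathrm{tr}(X X^T) = \|X\|_F^2$. The sketch below assumes that $X$ stores one sample per column ($p \times n$) and that the estimated PCs have orthonormal columns; the function name and the synthetic data are illustrative, not taken from the paper.

```python
import numpy as np

def explained_variance(V_est, X):
    """Fraction tr(V^T X X^T V) / tr(X X^T) for a p x n data matrix X and
    a p x d matrix V_est assumed to have orthonormal columns."""
    # tr(V^T X X^T V) equals the squared Frobenius norm of V^T X,
    # so the p x p matrix X X^T is never built.
    projected = V_est.T @ X
    return np.sum(projected ** 2) / np.sum(X ** 2)

rng = np.random.default_rng(0)
p, n, d = 30, 500, 3                       # hypothetical sizes
# Hypothetical data: rank-d signal plus a small amount of noise.
X = rng.standard_normal((p, d)) @ rng.standard_normal((d, n))
X += 0.1 * rng.standard_normal((p, n))
U, svals, _ = np.linalg.svd(X, full_matrices=False)
V_est = U[:, :d]                           # top-d left singular vectors

print("explained variance:", explained_variance(V_est, X))
```

For the exact top-$d$ left singular vectors the ratio equals the sum of the $d$ largest squared singular values divided by the sum of all of them, which gives a convenient reference value when comparing an approximate estimate against the full SVD.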
We compare the performance of our approach with (1) performing SVD (using MATLAB svds) on the original data that are fully acquired and stored, and as a useful point of comparison, with (2) the online algorithm Block-Stochastic Orthogonal Iteration (BSOI) (Mitliagkas et al., 2013), where the data samples are fully acquired but not stored. We show the results in Fig. 6.2 for the measurement ratio $m/p =$ […]. In terms of accuracy, our approach performs about as well as SVD on the original data, and has slightly better performance compared to BSOI. The sparse random projections result in a significant reduction of computational complexity, with one order and two orders of magnitude speedup compared to the original SVD and BSOI, respectively. In terms of memory requirements, 3 MB is needed to store the original data. However, the required memory for our framework is […] MB for $\gamma =$ […] and […] MB for $\gamma =$ […]. The projected data thus can easily reside in the main memory.

Moreover, we have compared our method with the fast randomized SVD algorithm in (Halko et al., 2011a). The estimation accuracy of this method is very close to SVD on the original data, and the computation time is about […] seconds, which is slightly less than the computation time of our method. This is as we would expect, since fast randomized SVD is designed specifically for low computational complexity. However, (Halko et al., 2011a) is a full data method, meaning that it is assumed that the full data is available for computation and does not require time or cost to access. Our approach performs approximately as well in similar computation time while also allowing a reduction in memory (or data access or data communication costs) by a factor of […], in this case […] and […]. This can be a significant advantage in the case where data is stored in a large database system or distributed network.

This example indicates that our approach results in a significant simultaneous reduction of memory and/or computational cost with little loss in accuracy.

7. Conclusions

We have presented a memory- and computation-efficient approach for estimation of PCs via very sparse random projections.
This approach simultaneously and substantially reduces the memory and computation required for PC estimation, while still providing high accuracy. More importantly, it allows us to rigorously analyze each of memory, computation, and accuracy in terms of the sparsity of the projection, for various PC models. Thus, we have been able to give provable tradeoffs between memory, computation, and accuracy. Furthermore, a user of this approach could even use the sparsity of the projections to tune to any desired point on this three-way tradeoff. We believe that this approach could be valuable for various important modern data processing applications such as massive databases, distributed networks, and high-dimensional data stream processing, although we have not focused on the specific details of these in favor of more theoretical analysis. Indeed, we observe that our approach performs well in initial practical simulations, e.g. for the MNIST dataset, with large reduction of both memory and computation without sacrificing accuracy.

Acknowledgements: This material is based upon work supported by the National Science Foundation under Grant CCF-7775.

References

D. Achlioptas. Database-friendly random projections. In Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 7 8,.

N. Ailon and B. Chazelle. The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM Journal on Computing, 39:3 3,

R. Arora, A. Cotter, K. Livescu, and N. Srebro. Stochastic optimization for PCA and PLS. In 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages ,.

R. Arora, A. Cotter, and N. Srebro. Stochastic optimization of PCA with capped MSG. In NIPS, pages 85 83, 3.

M. Balcan, S. Ehrlich, and Y. Liang. Distributed k-means and k-median clustering on general topologies. In NIPS, pages 995 3, 3.

V. Chandrasekaran and M. Jordan. Computational and statistical tradeoffs via convex relaxation. Proc. of the National Academy of Sciences, :E8 E9, 3.

Y. Chen, Y. Chi, and A. Goldsmith. Exact and stable covariance estimation from quadratic sampling via convex programming. arXiv preprint arXiv:3.87, 3.

C. Davis and W. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM J. on Numerical Analysis, 7: 6, 97. 5, 5.3

D. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 5:89 36, 6.

M. Elad. Optimized projections for compressed sensing. IEEE Trans. SP, 55: ,

J. Fowler. Compressive-projection principal component analysis. IEEE Trans. on Image Process., pages 3, 9.

A. Gilbert, J. Park, and M. Wakin. Sketched SVD: Recovering spectral features from compressive measurements. arXiv preprint arXiv:.36,., 5.3

G. H. Golub and C. F. Van Loan. Matrix computations, volume 3. JHU Press,.

N. Halko, P. Martinsson, Y. Shkolnisky, and M. Tygert. An algorithm for the principal component analysis of large data sets. SIAM Journal on Scientific Computing, 33(5): 58 59, a., 6

N. Halko, P. Martinsson, and J. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53():7 88, b.

L. Hogben. Handbook of linear algebra. CRC Press, 6. 5

P. Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. Journal of the ACM (JACM), 53(3):37 33, 6.

I. Johnstone. On the distribution of the largest eigenvalue in principal components analysis. The Annals of Statistics, 9():95 37,. 3

P. Li, T. Hastie, and K. Church. Very sparse random projections. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 87 96, 6.,, 5., 5.

J. Lin and D. Gunopulos. Dimensionality reduction by random projection and latent semantic indexing. In Proceedings of the Text Mining Workshop, at the 3rd SIAM International Conference on Data Mining, 3.

I. Mitliagkas, C. Caramanis, and P. Jain. Memory limited, streaming PCA. In NIPS, 3., 6., 6

S. Muthukrishnan. Data streams: Algorithms and applications. Now Publishers Inc, 5.

D. Omidiran and M. Wainwright. High-dimensional variable selection with sparse random projections: measurement sparsity and statistical efficiency. The Journal of Machine Learning Research, 99:36 386,.

H. Qi and S. Hughes. Invariance of principal components under low-dimensional random projection of the data. In ICIP, pages 937 9,.

R. Vershynin. How close is the sample covariance matrix to the actual covariance matrix? Journal of Theoretical Probability, 5(3): ,.

M. Wang, W. Xu, E. Mallada, and A. Tang. Sparse recovery with graph constraints: Fundamental limits and measurement construction. In IEEE Proceedings INFOCOM, pages , a.

M. Wang, W. Xu, E. Mallada, and A. Tang. Sparse recovery with graph constraints. CoRR, abs/7.89, b.

K. Zhang, L. Zhang, and M. Yang. Real-time compressive tracking. In Computer Vision ECCV, pages Springer,.


More information

Research Article Performance Evaluation of Human Resource Outsourcing in Food Processing Enterprises

Research Article Performance Evaluation of Human Resource Outsourcing in Food Processing Enterprises Advance Journal of Food Science and Technology 9(2): 964-969, 205 ISSN: 2042-4868; e-issn: 2042-4876 205 Maxwell Scientific Publication Corp. Subitted: August 0, 205 Accepted: Septeber 3, 205 Published:

More information

Calculation Method for evaluating Solar Assisted Heat Pump Systems in SAP 2009. 15 July 2013

Calculation Method for evaluating Solar Assisted Heat Pump Systems in SAP 2009. 15 July 2013 Calculation Method for evaluating Solar Assisted Heat Pup Systes in SAP 2009 15 July 2013 Page 1 of 17 1 Introduction This docuent describes how Solar Assisted Heat Pup Systes are recognised in the National

More information

Quality evaluation of the model-based forecasts of implied volatility index

Quality evaluation of the model-based forecasts of implied volatility index Quality evaluation of the odel-based forecasts of iplied volatility index Katarzyna Łęczycka 1 Abstract Influence of volatility on financial arket forecasts is very high. It appears as a specific factor

More information

The EOQ Inventory Formula

The EOQ Inventory Formula Te EOQ Inventory Formula James M. Cargal Matematics Department Troy University Montgomery Campus A basic problem for businesses and manufacturers is, wen ordering supplies, to determine wat quantity of

More information

Efficient Key Management for Secure Group Communications with Bursty Behavior

Efficient Key Management for Secure Group Communications with Bursty Behavior Efficient Key Manageent for Secure Group Counications with Bursty Behavior Xukai Zou, Byrav Raaurthy Departent of Coputer Science and Engineering University of Nebraska-Lincoln Lincoln, NE68588, USA Eail:

More information

Analyzing Spatiotemporal Characteristics of Education Network Traffic with Flexible Multiscale Entropy

Analyzing Spatiotemporal Characteristics of Education Network Traffic with Flexible Multiscale Entropy Vol. 9, No. 5 (2016), pp.303-312 http://dx.doi.org/10.14257/ijgdc.2016.9.5.26 Analyzing Spatioteporal Characteristics of Education Network Traffic with Flexible Multiscale Entropy Chen Yang, Renjie Zhou

More information

Cross-Domain Metric Learning Based on Information Theory

Cross-Domain Metric Learning Based on Information Theory Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Cross-Doain Metric Learning Based on Inforation Theory Hao Wang,2, Wei Wang 2,3, Chen Zhang 2, Fanjiang Xu 2. State Key Laboratory

More information

Airline Yield Management with Overbooking, Cancellations, and No-Shows JANAKIRAM SUBRAMANIAN

Airline Yield Management with Overbooking, Cancellations, and No-Shows JANAKIRAM SUBRAMANIAN Airline Yield Manageent with Overbooking, Cancellations, and No-Shows JANAKIRAM SUBRAMANIAN Integral Developent Corporation, 301 University Avenue, Suite 200, Palo Alto, California 94301 SHALER STIDHAM

More information

Mathematical Model for Glucose-Insulin Regulatory System of Diabetes Mellitus

Mathematical Model for Glucose-Insulin Regulatory System of Diabetes Mellitus Advances in Applied Matheatical Biosciences. ISSN 8-998 Volue, Nuber (0), pp. 9- International Research Publication House http://www.irphouse.co Matheatical Model for Glucose-Insulin Regulatory Syste of

More information

ASIC Design Project Management Supported by Multi Agent Simulation

ASIC Design Project Management Supported by Multi Agent Simulation ASIC Design Project Manageent Supported by Multi Agent Siulation Jana Blaschke, Christian Sebeke, Wolfgang Rosenstiel Abstract The coplexity of Application Specific Integrated Circuits (ASICs) is continuously

More information

Fuzzy Sets in HR Management

Fuzzy Sets in HR Management Acta Polytechnica Hungarica Vol. 8, No. 3, 2011 Fuzzy Sets in HR Manageent Blanka Zeková AXIOM SW, s.r.o., 760 01 Zlín, Czech Republic [email protected] Jana Talašová Faculty of Science, Palacký Univerzity,

More information

Energy Proportionality for Disk Storage Using Replication

Energy Proportionality for Disk Storage Using Replication Energy Proportionality for Disk Storage Using Replication Jinoh Ki and Doron Rote Lawrence Berkeley National Laboratory University of California, Berkeley, CA 94720 {jinohki,d rote}@lbl.gov Abstract Energy

More information

How To Get A Loan From A Bank For Free

How To Get A Loan From A Bank For Free Finance 111 Finance We have to work with oney every day. While balancing your checkbook or calculating your onthly expenditures on espresso requires only arithetic, when we start saving, planning for retireent,

More information

( C) CLASS 10. TEMPERATURE AND ATOMS

( C) CLASS 10. TEMPERATURE AND ATOMS CLASS 10. EMPERAURE AND AOMS 10.1. INRODUCION Boyle s understanding of the pressure-volue relationship for gases occurred in the late 1600 s. he relationships between volue and teperature, and between

More information

AUC Optimization vs. Error Rate Minimization

AUC Optimization vs. Error Rate Minimization AUC Optiization vs. Error Rate Miniization Corinna Cortes and Mehryar Mohri AT&T Labs Research 180 Park Avenue, Florha Park, NJ 0793, USA {corinna, ohri}@research.att.co Abstract The area under an ROC

More information

Experiment 2 Index of refraction of an unknown liquid --- Abbe Refractometer

Experiment 2 Index of refraction of an unknown liquid --- Abbe Refractometer Experient Index of refraction of an unknown liquid --- Abbe Refractoeter Principle: The value n ay be written in the for sin ( δ +θ ) n =. θ sin This relation provides us with one or the standard ethods

More information

On Computing Nearest Neighbors with Applications to Decoding of Binary Linear Codes

On Computing Nearest Neighbors with Applications to Decoding of Binary Linear Codes On Coputing Nearest Neighbors with Applications to Decoding of Binary Linear Codes Alexander May and Ilya Ozerov Horst Görtz Institute for IT-Security Ruhr-University Bochu, Gerany Faculty of Matheatics

More information

Pricing Asian Options using Monte Carlo Methods

Pricing Asian Options using Monte Carlo Methods U.U.D.M. Project Report 9:7 Pricing Asian Options using Monte Carlo Methods Hongbin Zhang Exaensarbete i ateatik, 3 hp Handledare och exainator: Johan Tysk Juni 9 Departent of Matheatics Uppsala University

More information

Calculating the Return on Investment (ROI) for DMSMS Management. The Problem with Cost Avoidance

Calculating the Return on Investment (ROI) for DMSMS Management. The Problem with Cost Avoidance Calculating the Return on nvestent () for DMSMS Manageent Peter Sandborn CALCE, Departent of Mechanical Engineering (31) 45-3167 [email protected] www.ene.ud.edu/escml/obsolescence.ht October 28, 21

More information