Modeling and Optimization for Big Data Analytics


[Konstantinos Slavakis, Georgios B. Giannakis, and Gonzalo Mateos]
[(Statistical) learning tools for our era of data deluge]

Digital Object Identifier 0.09/MSP. Date of publication: 9 August 04
IEEE SIGNAL PROCESSING MAGAZINE [8] SEPTEMBER /4 04IEEE

With pervasive sensors continuously collecting and storing massive amounts of information, there is no doubt this is an era of data deluge. Learning from these large volumes of data is expected to bring significant science and engineering advances along with improvements in quality of life. However, with such a big blessing come big challenges. Running analytics on voluminous data sets by central processors and storage units seems infeasible, and with the advent of streaming data sources, learning must often be performed in real time, typically without a chance to revisit past entries. "Workhorse" signal processing (SP) and statistical learning tools have to be reexamined in today's high-dimensional data regimes. This article contributes to the ongoing cross-disciplinary efforts in data science by putting forth encompassing models capturing a wide range of SP-relevant data analytic tasks, such as principal component analysis (PCA), dictionary learning (DL), compressive sampling (CS), and subspace clustering. It offers scalable architectures and optimization algorithms for decentralized and online learning problems, while revealing fundamental insights into the various analytic and implementation tradeoffs involved. Extensions of the encompassing models to timely data-sketching, tensor- and kernel-based learning tasks are also provided. Finally,
the close connections of the presented framework with several big data tasks, such as network visualization, decentralized and dynamic estimation, prediction, and imputation of network link load traffic, as well as imputation in tensor-based medical imaging are highlighted.

Introduction

The information explosion propelled by the advent of online social media, Internet, and global-scale communications has rendered data-driven statistical learning increasingly important. At any time around the globe, large volumes of data are generated by today's ubiquitous communication, imaging, and mobile devices such as cell phones, surveillance cameras and drones, medical and e-commerce platforms, as well as social networking sites. The term "big data" is coined to describe this information deluge and, quoting a recent press article, "their effect is being felt everywhere, from business to science, and from government to the arts" [8]. Large economic growth and improvement in the quality of life hinge upon harnessing the potential benefits of analyzing massive data [8], [55]. Mining unprecedented volumes of data promises to limit the spread of epidemics and maximize the odds that online marketing campaigns go viral [35]; to identify trends in financial markets, visualize networks, understand the dynamics of emergent social-computational systems, as well as protect critical infrastructure including the Internet's backbone network [48], and the power grid [6].

Big data challenges and SP opportunities

While big data come with "big blessings," there are formidable challenges in dealing with large-scale data sets. First, the sheer volume and dimensionality of data often make it impossible to run analytics and traditional inferential methods using stand-alone processors, e.g., [8] and [3]. Decentralized learning with parallelized multicores is preferred [9], [], while the data themselves are stored in the cloud or distributed file systems as in MapReduce/Hadoop [9]. Thus, there is an urgent need to explicitly account for the storage, query, and communication burden.
In some cases, privacy concerns prevent disclosing the full data set, allowing only preprocessed data to be communicated through carefully designed interfaces. Due to their possibly disparate origins, big data sets are often incomplete and a sizable portion of them is missing. Large-scale data inevitably contain corrupted measurements, communication errors, and even suffer from cyberattacks as the acquisition and transportation cost per entry is driven to the minimum. Furthermore, as many of the data sources continuously generate data in real time, analytics must often be performed online subject to time constraints so that a high-quality answer obtained slowly can be less useful than a medium-quality answer that is obtained quickly [46], [48], [75].

Although past research on databases and information retrieval is viewed as having focused on storage, lookup, and search, the opportunity now is to comb through massive data sets, to discover new phenomena, and to "learn" [3]. Big data challenges offer ample opportunities for SP research [55], where data-driven statistical learning algorithms are envisioned to facilitate distributed and real-time analytics (cf. Figure 1). Both classical and modern SP techniques have already placed significant emphasis on time/data adaptivity, e.g., [69], robustness [3], as well as compression and dimensionality reduction [43]. Testament to this fact is the recent "rediscovery" of stochastic approximation and stochastic-gradient algorithms for scalable online convex optimization and learning [65], oftentimes neglecting Robbins-Monro and Widrow's seminal works that go back half a century [60], [69], [79]. While the principal role of computer science in big data research is undeniable, the nature and scope of the emerging data science field is certainly multidisciplinary and welcomes SP expertise and its recent advances. For example, Web-collected data are often replete with missing entries, which motivates innovative SP imputation techniques that leverage timely (low-rank) matrix decompositions [39], [5], or suitable kernel-based interpolators [6].
Data matrices gathering traffic values observed in the backbone of large-scale networks can be modeled as the superposition of unknown "clean" traffic, which is usually low-rank due to temporal periodicities as well as network topology-induced correlations, and traffic volume anomalies that occur sporadically in time and space, rendering the associated matrix component sparse across rows and columns [38]. Both quantity and richness of high-dimensional data sets offer the potential to improve statistical learning performance, requiring however innovative models that exploit latent low-dimensional structure to effectively separate the data "wheat from the chaff." To learn these models however, there is a consequent need to advance online, scalable optimization algorithms for information processing over graphs (an abstraction of both networked sources of decentralized data, and multiprocessor, high-performance computing architectures); see, e.g., GraphLab [4] and the alternating direction method of multipliers (ADMM) [9], [0], [5] that enjoy growing popularity for distributed machine learning tasks.

Encompassing models for succinct big data representations

This section introduces a versatile model to fit data matrices as a superposition of a low-rank matrix capturing correlations and periodic trends, plus a linearly compressed sparse matrix explaining data innovations parsimoniously through a set of (possibly latent) factors. The model is rich enough to subsume various statistical learning paradigms with well-documented merits for high-dimensional data analysis, including PCA [8], DL [56], compressive sampling CS [], and principal components pursuit (PCP) [], [4], [5], to name a few.

The "background" plus "patterns" and "innovations" model for matrix data

Let $L \in \mathbb{R}^{N \times T}$ denote a low-rank matrix ($\mathrm{rank}(L) \ll \min\{N, T\}$), and $S \in \mathbb{R}^{M \times T}$ a sparse matrix with support size considerably smaller than $MT$. Consider also the large-scale data set $Y \in \mathbb{R}^{N \times T}$ generically modeled as a superposition of 1) the low-rank matrix $L$; the data background or trend, e.g., nominal
load curves across the power grid or the background scene captured by a surveillance camera, plus, 2) the data patterns, (co-)clusters, innovations, or outliers expressed by the product of a (possibly unknown) dictionary $D \in \mathbb{R}^{N \times M}$ times the sparse matrix $S$, and 3) a matrix $V \in \mathbb{R}^{N \times T}$, which accounts for modeling and measurement errors; in short, $Y = L + DS + V$. Matrix $D$ could be an overcomplete set of bases or a linear compression operator with $N \le M$. The aforementioned model offers a parsimonious description of $Y$ that is welcomed in big data analytics where data sets involve numerous features. Such parsimony facilitates interpretability and model identifiability, and it enhances the model's predictive performance by discarding "noisy" features that bear little relevance to the phenomenon of interest [49].

[FIG1] SP-relevant big data themes. (Figure labels: Challenges; Signal Processing and Learning for Big Data; Models and Optimization; Tasks; massive scale; parallel, decentralized; outliers, missing values; time/data adaptive; real-time constraints; robust; cloud storage; prediction, forecasting; dimensionality reduction; succinct, sparse; cleansing, imputation; regression, classification, clustering.)

To explicitly account for missing data in $Y$, introduce 1) the set $\Omega \subseteq \{1, \ldots, N\} \times \{1, \ldots, T\}$ of index pairs $(n, t)$, and 2) the sampling operator $\mathcal{P}_\Omega(\cdot)$, which nulls entries of its matrix argument not in $\Omega$, leaving the rest unchanged. This way, one can express incomplete and (possibly noise-)corrupted data as

$$\mathcal{P}_\Omega(Y) = \mathcal{P}_\Omega(L + DS + V). \tag{1}$$

Given $\mathcal{P}_\Omega(Y)$, the challenging goal is to estimate the matrix components $L$ and $S$ (and $D$ if not given), which further entails denoising the observed entries and imputing the missing ones. An estimator leveraging the low-rank property of $L$ and the sparsity of $S$ will be sought to fit the data $\mathcal{P}_\Omega(Y)$ in the least-squares (LS) error sense, as well as minimize the rank of $L$, and the number of nonzero entries of $S := [s_{m,t}]$ measured by its $\ell_0$-(pseudo) norm. Unfortunately, albeit natural, both rank and $\ell_0$-norm criteria are in general NP-hard to optimize [53].
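As an illustration of the model $Y = L + DS + V$ and of the sampling operator $\mathcal{P}_\Omega$ in (1), the following minimal NumPy sketch draws a synthetic instance. The function names (`sample_operator`, `synthesize`), the Gaussian distributions, and all default sizes are assumptions made only for this example; they are not taken from the article.

```python
import numpy as np

def sample_operator(X, mask):
    """P_Omega: keep the entries of X flagged in the boolean mask, null the rest."""
    return np.where(mask, X, 0.0)

def synthesize(N=30, T=40, M=50, rank=3, nnz=20, noise=0.01, obs=0.8, seed=0):
    """Draw a hypothetical instance of Y = L + D S + V with a random index set Omega."""
    rng = np.random.default_rng(seed)
    # Low-rank background L (rank << min{N, T}).
    L = rng.standard_normal((N, rank)) @ rng.standard_normal((rank, T))
    # Overcomplete dictionary D (N <= M) and sparse innovations S.
    D = rng.standard_normal((N, M))
    S = np.zeros((M, T))
    support = rng.choice(M * T, size=nnz, replace=False)
    S.flat[support] = rng.standard_normal(nnz)
    # Modeling and measurement errors V.
    V = noise * rng.standard_normal((N, T))
    Y = L + D @ S + V
    # Omega as a boolean mask: True marks an observed entry (n, t).
    mask = rng.random((N, T)) < obs
    return Y, L, D, S, V, mask

Y, L, D, S, V, mask = synthesize()
observed = sample_operator(Y, mask)   # the incomplete data P_Omega(Y)
```

Representing $\Omega$ as a boolean mask keeps $\mathcal{P}_\Omega$ a single elementwise operation, which is convenient when the same operator has to be applied to $Y$ and to the model fit.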
With $\sigma_k(L)$ denoting the $k$th singular value of $L$, the nuclear norm $\|L\|_* := \sum_k \sigma_k(L)$, and the $\ell_1$-norm $\|S\|_1 := \sum_{m,t} |s_{m,t}|$ are adopted as surrogates, as they are the closest convex approximants to $\mathrm{rank}(L)$ and $\|S\|_0$, respectively, e.g., [4] and [48]. Accordingly, assuming known $D$ for now, one solves

$$\min_{\{L, S\}} \tfrac{1}{2}\|\mathcal{P}_\Omega(Y - L - DS)\|_F^2 + \lambda_* \|L\|_* + \lambda_1 \|S\|_1, \tag{P1}$$

where $\lambda_*, \lambda_1 \ge 0$ are rank- and sparsity-controlling parameters. Being convex, (P1) is computationally appealing as elaborated in the section "Algorithms," in addition to being widely applicable as it encompasses a gamut of known paradigms. Notice however that when $D$ is unknown, one obtains a bilinear model that gives rise to nonconvex estimation criteria. The approaches highlighted next can in fact accommodate more general models than (P1), where data-fitting terms other than the Frobenius-norm one and different regularizers can be utilized to account for various types of a priori knowledge, e.g., structured sparsity or smoothness.

Application domains and subsumed paradigms

Model (1) emerges in various applications, such as 1) network anomaly detection outlined in the section "Inference and Imputation," where $Y \in \mathbb{R}^{N \times T}$ represents traffic volume over $N$ links and $T$ time slots; $L$ captures the nominal link-level traffic (which
is low-rank due to temporal periodicities and topology-induced correlations on the underlying flows); $D$ represents a link $\times$ flow binary routing matrix; and $S$ sparse anomalous flows [47], [48]; 2) medical imaging, where dynamic magnetic resonance imaging separates the background $L$ from the motion component (e.g., a heart beating) modeled via sparse dictionary representation $DS$ [5] (see also the section "Inference and Imputation"); 3) face recognition in the presence of shadows and specularities []; and 4) acoustic SP for singing voice separation from its music accompaniment [7], to name a few.

In the absence of $L$ and missing data ($L = 0$, $\Omega = \{1, \ldots, N\} \times \{1, \ldots, T\}$), model (1) describes an underdetermined sparse signal recovery problem typically encountered with CS []. If in addition $D$ is unknown, (P1) boils down to DL [], [46], [56], [67], or to nonnegative matrix factorization (NNMF) if the entries of $D$ and $S$ are nonnegative [39]. For $L = 0$, $\Omega = \{1, \ldots, N\} \times \{1, \ldots, T\}$, and if the columns of $Y$ lie close to a union of a small number of unknown low-dimensional linear subspaces, then looking for a sparse $S$ in (1) with $M \ll T$ amounts to subspace clustering [78]; see also [70] for outlier-robust variants with strong performance guarantees. Without $D$ and with $V = 0$, decomposing $Y$ into $L + S$ corresponds to PCP, also referred to as robust PCA (RPCA) [], [4]. Even when $L$ is nonzero, one could envision a variant where the measurements are corrupted with correlated (low-rank) noise [5]. Last but not least, when $S = 0$ and $V \ne 0$, recovery of $L$ subject to a rank constraint is nothing else than PCA, arguably the workhorse of high-dimensional big data analytics [8]. This same formulation is adopted for low-rank matrix completion, the basic task carried out by recommender systems, to impute the missing entries of a low-rank matrix observed in noise, i.e., $\mathcal{P}_\Omega(Y) = \mathcal{P}_\Omega(L + V)$ [3]. Based on the maximum likelihood principle, an alternative approach for missing value imputation by expectation-maximization can be found in [73].

Algorithms

As (P1) is jointly convex with respect to (w.r.t.)
both $L$ and $S$, various iterative solvers are available, including interior point methods and centralized online schemes based on (sub)gradient-based recursions [65]. For big data however, off-the-shelf interior point methods are computationally prohibitive, and are not amenable to decentralized or parallel implementations. Subgradient-based methods are structurally simple but are often hindered by slow convergence due to restrictive step size selection rules. The desiderata for large-scale problems are low-complexity, real-time algorithms capable of processing massive data sets in a parallelizable and/or fully decentralized fashion. The few such algorithms available can be classified as decentralized or parallel schemes, splitting, sequential, and online or streaming.

Decentralized and parallel algorithms

In these divide-and-conquer schemes, multiple agents operate in parallel on disjoint or randomly subsampled subsets of the massive-scale data, and combine their outputs as iterations proceed to accomplish the original learning or inference task [34], [44]. Unfortunately, the nuclear norm $\|L\|_*$ in (P1) cannot be easily distributed across multiple learners, since the full singular value decomposition (SVD) of $L$ has to be computed centrally, prior to distributing its set of singular values to each node. In search of a nuclear-norm surrogate amenable to decentralized processing, it is useful to recall that minimizing $\|L\|_*$ is tantamount to minimizing $(\|P\|_F^2 + \|Q\|_F^2)/2$, where $L = PQ^\top$, with $P \in \mathbb{R}^{N \times \rho}$ and $Q \in \mathbb{R}^{T \times \rho}$, for some $\rho \ll \min\{N, T\}$, is a bilinear decomposition of the low-rank component $L$ [47], [7]. In other words, each column vector of $L$ is assumed to lie in a low $\rho$-dimensional range space spanned by the columns of $P$. This gives rise to the following problem:

$$\min_{\{P, Q, S\}} \tfrac{1}{2}\|\mathcal{P}_\Omega(Y - PQ^\top - DS)\|_F^2 + \frac{\lambda_*}{2}\left(\|P\|_F^2 + \|Q\|_F^2\right) + \lambda_1 \|S\|_1. \tag{P2}$$

Unlike (P1), the bilinear term $PQ^\top$ renders (P2) nonconvex, even if $D$ is known. Interestingly, [47, Prop. ] offers a certificate for stationary points of (P2), qualifying them as global optima of (P1).
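The link between the nuclear norm and the Frobenius-norm surrogate can be checked numerically. The sketch below, with hypothetical helper names, builds the "balanced" factors $P = U\Sigma^{1/2}$ and $Q = V\Sigma^{1/2}$ from the SVD $L = U\Sigma V^\top$; this is one standard choice of factors that attains the minimum, assumed here purely for illustration.

```python
import numpy as np

def nuclear_norm(L):
    """Sum of the singular values of L."""
    return np.linalg.svd(L, compute_uv=False).sum()

def balanced_factors(L, rho):
    """Return P (N x rho) and Q (T x rho) with P @ Q.T equal to the best rank-rho fit of L."""
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    root = np.sqrt(s[:rho])
    P = U[:, :rho] * root      # scale each left singular vector by sqrt(sigma_k)
    Q = Vt[:rho].T * root      # scale each right singular vector by sqrt(sigma_k)
    return P, Q

def frobenius_surrogate(P, Q):
    """(||P||_F^2 + ||Q||_F^2) / 2, the separable regularizer appearing in (P2)."""
    return 0.5 * (np.linalg.norm(P, "fro") ** 2 + np.linalg.norm(Q, "fro") ** 2)

rng = np.random.default_rng(1)
L = rng.standard_normal((20, 4)) @ rng.standard_normal((4, 15))   # rank-4 matrix
P, Q = balanced_factors(L, rho=4)
gap = frobenius_surrogate(P, Q) - nuclear_norm(L)   # numerically zero for balanced factors
```

Any other factorization of the same $L$, for instance $(2P, Q/2)$, yields a larger surrogate value, which is why the surrogate is stated as a minimum over bilinear decompositions. Unlike the nuclear norm, the surrogate splits across the rows of $P$ and $Q$.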
Thanks to the decomposability of $\|\cdot\|_F^2$ and $\|\cdot\|_1$ across rows, and ignoring for a moment the operator $\mathcal{P}_\Omega$, (P2) can be distributed over a number $V$ of nodes or processing cores $\mathcal{V}$ with cardinality $|\mathcal{V}| = V$, where each node $\nu \in \mathcal{V}$ learns from a subset of rows $\mathcal{R}_\nu \subseteq \{1, \ldots, N\}$. In other words, the $N$ rows of $Y$ are distributed over a partition of rows $\{\mathcal{R}_\nu\}_{\nu=1}^{V}$, where by definition $\bigcup_{\nu=1}^{V} \mathcal{R}_\nu = \{1, \ldots, N\}$, and $\mathcal{R}_{\nu_i} \cap \mathcal{R}_{\nu_j} = \emptyset$ if $i \ne j$. Naturally, (P2) is equivalent to this (modulo $\mathcal{P}_\Omega$) task:

$$\min_{\{\{P_\nu\}_{\nu=1}^{V}, Q, S\}} \sum_{\nu=1}^{V} \tfrac{1}{2}\|Y_\nu - P_\nu Q^\top - D_\nu S\|_F^2 + \frac{\lambda_*}{2}\sum_{\nu=1}^{V} \|P_\nu\|_F^2 + \frac{\lambda_*}{2}\|Q\|_F^2 + \lambda_1 \|S\|_1, \tag{2}$$

where $Y_\nu$, $P_\nu$, and $D_\nu$ are submatrices formed by keeping only the $\mathcal{R}_\nu$ rows of $Y$, $P$, and $D$, respectively. An obstacle in (2) is the coupling of the data-fitting term with the regularization terms via $\{P_\nu, Q, S\}$. Because the loss function is nonsmooth, iterative subgradient-type methods can be applied directly and are able to identify local minimizers of (2), but at the cost of slow convergence and a meticulous choice of step sizes. In the convex analysis setting, successful optimization approaches to surmount this obstacle include the ADMM [0] and the more general Douglas-Rachford (DR) algorithm [5] that split or decouple variables in the nuclear, $\ell_1$-, and Frobenius norms. The crux of splitting methods, such as ADMM and DR, lies in efficiently computing the proximal mapping of regularizing functions, which for a (non)differentiable lower-semicontinuous convex function $g$ and $\gamma > 0$, is defined as $\mathrm{Prox}_{\gamma g}(A) := \arg\min_{A'} (1/2)\|A - A'\|_F^2 + \gamma g(A')$, $\forall A$ [5]. The computational cost incurred by $\mathrm{Prox}_{\gamma g}$ depends on $g$. For example, if $g$ is the nuclear norm, then $\mathrm{Prox}_{\gamma\|\cdot\|_*}(A) = U\,\mathrm{Soft}_\gamma(\Sigma)\,V^\top$, where $A = U\Sigma V^\top$ is the computationally demanding SVD of $A$, and $\mathrm{Soft}_\gamma(\Sigma)$ is the soft-thresholding operator whose $(i, j)$th
entry is $[\mathrm{Soft}_\gamma(\Sigma)]_{i,j} = \mathrm{sgn}([\Sigma]_{i,j}) \max\{0, |[\Sigma]_{i,j}| - \gamma\}$. On the contrary, if $g = \|\cdot\|_1$, then $\mathrm{Prox}_{\gamma\|\cdot\|_1}(A) = \mathrm{Soft}_\gamma(A)$, which is a computationally affordable, parallelizable operation. Even if (2) is a nonconvex task, a splitting strategy mimicking ADMM and DR is promising also in the current context.

If the network nodes or cores can also exchange messages, then (2) can be decentralized. This is possible if, e.g., $\nu \in \mathcal{V}$ has a neighborhood $\mathcal{N}_\nu \subseteq \mathcal{V}$, where $\nu \in \mathcal{N}_\nu$ and all members of $\mathcal{N}_\nu$ exchange information. The decentralized rendition of (P2) becomes

$$\begin{aligned} \min_{\{P_\nu, Q_\nu, S_\nu\} \cup \{P'_\nu, Q'_\nu, S'_\nu\}} \ & \sum_{\nu \in \mathcal{V}} \Big[ \tfrac{1}{2}\|\mathcal{P}_{\Omega_\nu}(Y_\nu - P_\nu Q_\nu^\top - D_\nu S_\nu)\|_F^2 + \frac{\lambda_*}{2}\left(\|P'_\nu\|_F^2 + \|Q'_\nu\|_F^2\right) + \lambda_1 \|S'_\nu\|_1 \Big] \\ \text{s.to} \ \forall \nu \in \mathcal{V}: \ & Q_\nu = Q_{\nu'}, \ S_\nu = S_{\nu'}, \ \forall \nu' \in \mathcal{N}_\nu, \\ & Q'_\nu = Q_\nu, \ S'_\nu = S_\nu, \ P_\nu = P'_\nu, \end{aligned} \tag{P3}$$

where consensus constraints are enforced per neighborhood $\mathcal{N}_\nu$, and $\{P'_\nu, Q'_\nu, S'_\nu\}$ are utilized to split the LS cost from the Frobenius and $\ell_1$ norms. Typically, (P3) is expressed in unconstrained form using the (augmented) Lagrangian framework. Decentralized inference algorithms over networks, implementing the previous splitting methodology, can be found in [], [47], [5], and [6]. ADMM and DR are convergent for convex costs, but they offer no convergence guarantees for the nonconvex (P3). There is, however, ample experimental evidence in the literature that supports empirical convergence of ADMM, especially when the nonconvex problem at hand exhibits "favorable" structure [0], [47].

Methods offering convergence guarantees for (P3), after encapsulating consensus constraints into the loss function, are sequential schemes, such as the block coordinate descent methods (BCDMs) [59], [77]. BCDMs minimize the underlying objective sequentially over one block of variables per iteration, while keeping all other blocks fixed to their most up-to-date values. For example, a BCDM for solving the DL subtask of (2), that is when $\{P_\nu, Q\}$ are absent from the optimization problem, is the K-SVD algorithm []. Per iteration, K-SVD alternates between sparse coding of the columns of $Y$ based on the current dictionary and updating the dictionary atoms to better fit the data.
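Returning to the proximal mappings that splitting methods such as ADMM and DR rely on, the two operators defined in this section can be sketched in a few lines of NumPy. The function names are illustrative assumptions; the formulas follow the definitions of $\mathrm{Soft}_\gamma$ and of the nuclear-norm proximal mapping given in the text.

```python
import numpy as np

def soft_threshold(A, gamma):
    """Entrywise Soft_gamma: sgn(a) * max{0, |a| - gamma}; the prox of gamma * ||.||_1."""
    return np.sign(A) * np.maximum(0.0, np.abs(A) - gamma)

def prox_nuclear(A, gamma):
    """Prox of gamma * ||.||_*: soft-threshold the singular values of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # the costly step
    return (U * soft_threshold(s, gamma)) @ Vt

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
shrunk = soft_threshold(x, 1.0)   # entries with |x| <= 1 are set to zero

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 5))
A_low = prox_nuclear(A, 0.7)      # singular values of A reduced by 0.7, floored at 0
```

The contrast in cost is visible in the code: `soft_threshold` is a purely elementwise map, hence trivially parallelizable, whereas `prox_nuclear` needs a full SVD of its argument.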
For a consensus-based decentralized implementation of K-SVD in the cloud, see [58].

It is worth stressing that (P3) is convex w.r.t. each block among $\{P_\nu, Q_\nu, S_\nu, P'_\nu, Q'_\nu, S'_\nu\}$, whenever the rest are held constant. Recent parallel schemes with convergence guarantees take advantage of this underlying structure to speed up decentralized and parallel optimization algorithms [33], [64]. Additional BCDM examples will be given next in the context of online learning.

Online algorithms for streaming analytics

So far, $Y$ has been decomposed across its rows corresponding to network agents or processors; in what follows, $Y$ will be split across its columns. Aiming at online solvers of (P2), with $t$ indexing the columns of $Y_t := [y_1, \ldots, y_t]$, and $\{\Omega_\tau\}_{\tau=1}^{t}$ indicating the locations of known data values across time, consider the analytics engine acquiring a stream of vectors $\mathcal{P}_{\Omega_t}(y_t)$, $\forall t$. An online counterpart of (P2) is the following exponentially weighted LS estimate [48]

$$\min_{P, \{q_\tau, s_\tau\}_{\tau=1}^{t}} \sum_{\tau=1}^{t} \delta^{t-\tau} \Big[ \tfrac{1}{2}\|\mathcal{P}_{\Omega_\tau}(y_\tau - P q_\tau - D_\tau s_\tau)\|^2 + \frac{\lambda_*}{2\sum_{\tau'=1}^{t} \delta^{t-\tau'}}\|P\|_F^2 + \frac{\lambda_*}{2}\|q_\tau\|^2 + \lambda_1 \|s_\tau\|_1 \Big], \tag{P4}$$

where $P \in \mathbb{R}^{N \times \rho}$, $\{q_\tau\}_{\tau=1}^{t} \subset \mathbb{R}^{\rho}$, $\{s_\tau\}_{\tau=1}^{t} \subset \mathbb{R}^{M}$, and $\delta \in (0, 1]$ denotes the so-termed forgetting factor. With $\delta < 1$, past data are exponentially discarded to track nonstationary features. Clearly, $\mathcal{P}_{\Omega_t}$ can be represented by a matrix $\Omega_t$, whose rows are a subset of the rows of the $N$-dimensional identity matrix.

A provably convergent BCDM approach to efficiently solve a simplified version of (P4) was put forth in [48]. Each time $t$ a new datum is acquired, only $q_t$ and $s_t$ are jointly updated via Lasso for fixed $P = P_{t-1}$, and then (P4) is solved w.r.t. $P$ to update $P_t$ using recursive LS (RLS). The latter step can be efficiently split across rows

$$p_{n,t} = \arg\min_{p} \tfrac{1}{2}\sum_{\tau=1}^{t} \delta^{t-\tau} \omega_{n,\tau} \left(y_{n,\tau} - p^\top q_\tau - d_{n,\tau}^\top s_\tau\right)^2 + \frac{\lambda_*}{2}\|p\|^2,$$

an attractive feature facilitating parallel processing, which nevertheless entails a matrix inversion when $\delta < 1$. Since first introduced in [48], the idea of performing online rank-minimization leveraging the separable nuclear-norm regularization in (P4) has gained popularity in real-time NNMF for audio SP [7], and online robust PCA [], to name a few examples.
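The row-wise update of $P$ can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [48]: the class name `RowTracker` is hypothetical, the Lasso step that produces $q_t$ and $s_t$ is omitted, and the scalar `r` passed to `update` is assumed to already hold the residual $y_{n,t} - d_{n,t}^\top s_t$. The per-row minimizer solves $(G_t + \lambda_* I)\,p = b_t$, where $G_t$ and $b_t$ are exponentially weighted sums that can be updated recursively.

```python
import numpy as np

class RowTracker:
    """Tracks one row p_n of P via exponentially weighted, ridge-regularized LS."""

    def __init__(self, rho, lam, delta):
        self.G = np.zeros((rho, rho))   # sum_tau delta^(t-tau) * w * q q^T
        self.b = np.zeros(rho)          # sum_tau delta^(t-tau) * w * r * q
        self.lam = lam                  # lambda_* in the row-wise cost
        self.delta = delta              # forgetting factor in (0, 1]

    def update(self, q, r, observed=True):
        """Fold in datum (q_t, r_t); `observed` plays the role of omega_{n,t}."""
        w = 1.0 if observed else 0.0
        self.G = self.delta * self.G + w * np.outer(q, q)
        self.b = self.delta * self.b + w * r * q
        # The ridge term is not discounted, so a linear solve is done per update.
        return np.linalg.solve(self.G + self.lam * np.eye(len(q)), self.b)

rng = np.random.default_rng(3)
tracker = RowTracker(rho=3, lam=0.5, delta=0.9)
for _ in range(40):
    p_n = tracker.update(rng.standard_normal(3), rng.standard_normal())
```

Since each row keeps its own pair $(G_t, b_t)$, the $N$ trackers can run in parallel; the linear solve in `update` is the matrix inversion that the text refers to for $\delta < 1$.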
In the case where $P$, $\{q_\tau\}_{\tau=1}^{t}$ are absent from (P4), an online DL method of the same spirit as in [48] can be found in [46], [67].

Algorithms in [48] are closely related to timely robust subspace trackers, which aim at estimating a low-rank subspace $P$ from grossly corrupted and possibly incomplete data, namely $\mathcal{P}_{\Omega_t}(y_t) = \mathcal{P}_{\Omega_t}(P q_t + s_t + v_t)$, $t = 1, 2, \ldots$. In the absence of sparse outliers $\{s_t\}_{t=1}^{\infty}$, an online algorithm based on incremental gradient descent on the Grassmannian manifold of subspaces was put forth in [4]. The second-order RLS-type algorithm in [6] extends the seminal projection approximation subspace tracking (PAST) algorithm to handle missing data; see also [50]. When outliers are present, robust counterparts can be found in [5] and [9]. Relative to all aforementioned works, the estimation problem (P4) is more challenging due to the presence of the (compression) dictionary $D$.

Reflecting on (P2)-(P4), all objective functions share a common structure: they are convex w.r.t. each of their variable blocks, provided the rest are held fixed. Naturally, this calls for BCDMs for minimization, as in the previous discussion. However, matrix inversions and solving a batch Lasso per slot $t$ may prove prohibitive for large-scale optimization tasks. Projected or proximal stochastic (sub)gradient methods are attractive low-complexity online alternatives to BCDMs mainly for optimizing convex objectives [65]. Unfortunately, due to their diminishing step sizes, such first-order solutions exhibit slow convergence even for convex problems. On the other hand, accelerated variants for convex problems offer quadratic convergence of the
objective function values, meaning they are optimally fast among first-order methods [54], [80]. Although quadratic convergence issues for nonconvex and time-varying costs as in (P4) are largely unexplored, the online, accelerated, first-order method outlined in Figure 2 offers a promising alternative for generally nonsmooth and nonconvex minimization tasks [68].

Let $x^{(i)}$ be a block of variables, which in (P4) can be $P$, or $\{q_\tau\}_{\tau=1}^{t}$, or $\{s_\tau\}_{\tau=1}^{t}$; that is, $i \in \{1, 2, 3\}$; and let $x^{(-i)}$ denote all blocks in $x := (x^{(1)}, \ldots, x^{(I)})$ except for $x^{(i)}$. Consider the sequence of loss functions $F_t(x) := f_t(x) + \sum_{i=1}^{I} g_i(x^{(i)})$, where $f_t$ is nonconvex, and Lipschitz continuously differentiable but convex w.r.t. each $x^{(i)}$, whenever $\{x^{(j)}\}_{j \ne i}$ are held fixed; $\{g_i\}_{i=1}^{I}$ are convex and possibly nondifferentiable; hence, $F_t$ is nonsmooth. Clearly, the data fit term in (P4) corresponds to $f_t$, $g_1(x^{(1)}) := (\lambda_*/2)\|P\|_F^2$, while $g_2$ and $g_3$ describe the other two regularization terms.

The acceleration module Accel of [80], developed originally for offline convex analytic tasks, is applied to $F_t$ in a sequential, per-block (Gauss-Seidel) fashion. Having $x^{(-i)}$ fixed, unless $\min_{x^{(i)} \in \mathcal{H}_i} f_t(x^{(i)}; x^{(-i)}) + g_i(x^{(i)})$ is easily solvable, Accel is employed for $R_i \ge 1$ times to update $x^{(i)}$. The same procedure is carried over to the next block $x^{(i+1)}$, until all blocks are updated, and subsequently to the next time instant $t + 1$ (Figure 2). Unlike ADMM, this first-order algorithm requires no matrix inversions, and can afford inexact solutions of minimization subtasks. Under several conditions, including (statistical) stationarity of $\{F_t\}_{t=1}^{\infty}$, it also guarantees quadratic-rate convergence to a stationary point of $E\{F_t\}$, where $E\{\cdot\}$ denotes expectation over noise and input data distributions [68]. An application of this method to the dictionary-learning context can be found in the "Inference and Imputation" section.
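To make the idea of an accelerated proximal update on a single block concrete, the sketch below applies a FISTA-style accelerated proximal gradient iteration to the $\ell_1$-regularized LS subproblem in $s$ with everything else fixed. It is a stand-in chosen for illustration and is not the Accel module of [80] nor the online scheme of [68]; the function names, the constant step size $1/\|D\|_2^2$, and the iteration count are assumptions.

```python
import numpy as np

def soft_threshold(x, gamma):
    """Prox of gamma * ||.||_1 (entrywise shrinkage)."""
    return np.sign(x) * np.maximum(0.0, np.abs(x) - gamma)

def lasso_objective(D, y, s, lam):
    """0.5 * ||y - D s||^2 + lam * ||s||_1."""
    return 0.5 * np.sum((y - D @ s) ** 2) + lam * np.sum(np.abs(s))

def accelerated_prox_grad(D, y, lam, iters=500):
    """FISTA-style iteration: gradient step on the smooth part, prox on the l1 part."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    s = np.zeros(D.shape[1])
    z = s.copy()                              # extrapolated point
    t = 1.0
    for _ in range(iters):
        grad = D.T @ (D @ z - y)
        s_next = soft_threshold(z - step * grad, step * lam)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = s_next + ((t - 1.0) / t_next) * (s_next - s)   # momentum step
        s, t = s_next, t_next
    return s

rng = np.random.default_rng(4)
D = rng.standard_normal((40, 80)) / np.sqrt(40)
s_true = np.zeros(80)
s_true[[3, 17, 42, 60, 71]] = [1.5, -2.0, 1.0, 2.5, -1.2]
s_hat = accelerated_prox_grad(D, D @ s_true, lam=0.1)
```

Each iteration costs two matrix-vector products and one shrinkage, with no matrix inversion, which is the property that makes such first-order updates attractive inside a per-block loop.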
[FIG2] The online, accelerated, sequential (Gauss-Seidel) optimization scheme for asymptotically minimizing the sequence $(F_t)_{t \in \mathbb{N}}$ of nonconvex functions. (Flowchart: for each block $i$, check whether $\min_{x^{(i)}} f_t(x^{(i)}; x^{(-i)}) + g_i(x^{(i)})$ is easy to solve; if yes, solve it; if no, run the acceleration module on $f_t(\cdot\,; x^{(-i)})$, $g_i$ for $R_i$ times; update block $x^{(i)}$ and set $i \leftarrow i + 1$; once $i > I$, set $t \leftarrow t + 1$.)

Data Sketching, Tensors, and Kernels

The scope of the "Algorithms" section can be broadened to include random subsampling schemes on $Y$ (also known as data sketching), as well as multiway data arrays (tensors) and nonlinear modeling via kernel functions.

Data sketching

Catering to decentralized or parallel solvers, all variables in (P3) should be updated in parallel across learners of individual network nodes. However, there are cases where solving all learning subtasks simultaneously may be prohibitive or inefficient for two main reasons. First, the data size might be so large that computing function values or first-order information over all variables is impossible. Second, the nature and structure of data may prevent a fully parallel operation; e.g., when data are not available in their entirety, but are acquired either in batches over time or where not all of the network nodes are equally responsive or functional. A recent line of research aiming at obtaining informative subsets of measurements for asynchronous and reduced-dimensionality processing of big data sets is based on (random) subsampling or data
sketching (via $\mathcal{P}_\Omega$) of the massive $Y$ [45]. The basic principles of data sketching will be demonstrated here for the overdetermined ($N \gg \rho$) LS $q_* := P^\dagger y \in \arg\min_{q \in \mathbb{R}^{\rho}} \|y - Pq\|^2$ [a task subsumed by (P2) as well], where $\dagger$ denotes pseudoinverse, and $P^\dagger = (P^\top P)^{-1} P^\top$, for $P$ full column rank. Popular strategies to obtain $q_*$ include the expensive SVD; the Cholesky decomposition if $P$ is full column rank and well conditioned; and the slower but more stable QR decomposition [45].

The basic premise of the subsampling or data sketching techniques is to largely reduce the number of rows of $Y$ prior to solving the LS task [45]. A data-driven methodology of keeping only the "most" informative rows relies on the so-termed (statistical) leverage scores and is outlined next as a three-step procedure. Given the (thin) SVD $P = U\Sigma V^\top$:

(S1) Find the normalized leverage scores $\{\ell_n\}_{n=1}^{N}$, where $\ell_n := \rho^{-1} e_n^\top U U^\top e_n = \rho^{-1} e_n^\top P P^\dagger e_n$, with $e_n \in \mathbb{R}^{N}$ being the $n$th canonical vector. Clearly, $\ell_n$ equals the (normalized) $n$th diagonal element of $PP^\dagger$, and since $PP^\dagger = UU^\top$ is the orthogonal projector onto the linear subspace spanned by the columns of $P$, it follows that $PP^\dagger y$ offers the best approximation to $y$ within this subspace.

(S2) For an arbitrarily small $\epsilon > 0$, and by using $\{\ell_n\}_{n=1}^{N}$ as an importance sampling distribution, randomly sample and rescale by $(r\ell_n)^{-1/2}$ a number of $r = O(\epsilon^{-2} \rho \log \rho)$ rows of $P$, together with the corresponding entries of $y$. Such a sampling and rescaling operation can be expressed by a matrix $W \in \mathbb{R}^{r \times N}$.

(S3) Solve the reduced-size LS problem $\tilde{q}_* \in \arg\min_{q \in \mathbb{R}^{\rho}} \|W(y - Pq)\|^2$.

With $\kappa(\cdot)$ denoting condition number and $\gamma := \|UU^\top y\| / \|y\|$, it holds that [45]

$$\|y - P\tilde{q}_*\| \le (1 + \epsilon)\,\|y - Pq_*\| \tag{3a}$$

$$\|q_* - \tilde{q}_*\| \le \sqrt{\epsilon}\,\kappa(P)\sqrt{\gamma^{-2} - 1}\,\|q_*\| \tag{3b}$$

so that performance degrades gracefully after reducing the number of equations.

Similar to the nuclear norm, a major difficulty is that leverage scores are not amenable to decentralized computation [cf. discussion prior to (P2)], since the SVD of $P$ is necessary prior to decentralizing the original learning task.
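The three-step procedure (S1) to (S3) can be sketched as below. The function names, the problem sizes, and the choice of sampling with replacement are assumptions made for this illustration; the number of sampled rows `r` is simply picked by hand rather than from the $O(\epsilon^{-2}\rho\log\rho)$ rule.

```python
import numpy as np

def leverage_scores(P):
    """(S1) Normalized leverage scores: squared row norms of U, divided by rho."""
    U, _, _ = np.linalg.svd(P, full_matrices=False)   # thin SVD, P = U Sigma V^T
    scores = np.sum(U ** 2, axis=1) / P.shape[1]
    return scores / scores.sum()                       # guard against round-off

def leverage_sketch_ls(P, y, r, rng):
    """(S2)-(S3) Sample r rows by leverage, rescale by (r * l_n)^(-1/2), solve LS."""
    scores = leverage_scores(P)
    rows = rng.choice(P.shape[0], size=r, replace=True, p=scores)
    scale = 1.0 / np.sqrt(r * scores[rows])
    q_tilde, *_ = np.linalg.lstsq(scale[:, None] * P[rows], scale * y[rows], rcond=None)
    return q_tilde

rng = np.random.default_rng(5)
P = rng.standard_normal((2000, 5))                                    # N >> rho
y = P @ rng.standard_normal(5) + 0.5 * rng.standard_normal(2000)
q_star, *_ = np.linalg.lstsq(P, y, rcond=None)                        # full LS solution
q_tilde = leverage_sketch_ls(P, y, r=400, rng=rng)                    # sketched solution
ratio = np.linalg.norm(y - P @ q_tilde) / np.linalg.norm(y - P @ q_star)
```

The quantity `ratio` is the empirical counterpart of the factor $(1 + \epsilon)$ in (3a); it can never fall below one because $q_*$ minimizes the full residual. Note that `leverage_scores` itself needs the SVD of the full $P$, which is the centralization bottleneck mentioned in the text.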
To avoid computing the statistical leverage scores, the following data-agnostic strategy has been advocated [45]: 1) Premultiply $P$ and $y$ with the $N \times N$ random Hadamard transform $H_N D$, where $H_N := N^{-1/2} \tilde{H}_N$ is the normalized Hadamard matrix, with $\tilde{H}_N$ defined inductively as

$$\tilde{H}_N = \begin{bmatrix} \tilde{H}_{N/2} & \tilde{H}_{N/2} \\ \tilde{H}_{N/2} & -\tilde{H}_{N/2} \end{bmatrix}, \qquad \tilde{H}_2 := \begin{bmatrix} +1 & +1 \\ +1 & -1 \end{bmatrix},$$

and $D$ is a diagonal matrix whose nonzero entries are drawn independently and uniformly from $\{-1, +1\}$, 2) uniformly sample and rescale a number of $r = O(\rho \log \rho \cdot \log N + \epsilon^{-1} \rho \log N)$ rows from $H_N D P$ together with the corresponding components from $H_N D y$, and 3) find $\tilde{q}_* \in \arg\min_{q \in \mathbb{R}^{\rho}} \|W H_N D (y - Pq)\|^2$, where $W$ stands again for the sampling and rescaling operation. Error bounds similar to those in (3) can be also derived for this preconditioning strategy [45]. Key to deriving such performance bounds is the Johnson-Lindenstrauss lemma, which loosely asserts that for any $\epsilon \in (0, 1)$, any set of $\rho$ points in $N$ dimensions can be (linearly) embedded into $r \ge 4(\epsilon^2/2 - \epsilon^3/3)^{-1} \ln \rho$ dimensions, while preserving the pairwise Euclidean distances of the original points up to a multiplicative factor of $(1 \pm \epsilon)$.

Besides the previous overdetermined LS task, data sketching has been employed to ease the computational burden of several large-scale tasks ranging from generic matrix multiplication, SVD computation, to k-means clustering and tensor approximation [0], [45]. In the spirit of $H_N D$, methods utilizing sparse embedding matrices have been also developed for overconstrained LS and $\ell_p$-norm regression, low-rank and leverage scores approximation [7]; in particular, for solving the LS task satisfying (3a) they exhibit complexity $O(|\mathrm{supp}(P)|)$ plus a term of order $\rho^3 \epsilon^{-2}$ up to a polylogarithmic factor $\log^{l}(\cdot)$, where $|\mathrm{supp}(P)|$ stands for the cardinality of the support of $P$, and $l \in \mathbb{N}_*$. Viewing the sampling and rescaling operator $W$ as a special case of a (weighted) $\mathcal{P}_\Omega$ allows carrying over the algorithms outlined in the "Encompassing Models for Succinct Big Data Representations" and "Algorithms" sections to the data sketching setup as well.
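The data-agnostic strategy can be sketched along the same lines. The example below forms the normalized Hadamard matrix explicitly, which is only sensible for small $N$ (fast transforms avoid the dense matrix); the function names, the sampling without replacement, and the rescaling by $\sqrt{N/r}$ are assumptions made for this illustration, and $N$ is taken to be a power of two.

```python
import numpy as np

def hadamard_normalized(N):
    """Orthonormal N x N Hadamard matrix built by the doubling recursion."""
    H = np.ones((1, 1))
    while H.shape[0] < N:
        H = np.block([[H, H], [H, -H]])
    if H.shape[0] != N:
        raise ValueError("N must be a power of two")
    return H / np.sqrt(N)

def hadamard_sketch_ls(P, y, r, rng):
    """Premultiply by H_N D, sample r rows uniformly, rescale, and solve the small LS."""
    N = P.shape[0]
    signs = rng.choice([-1.0, 1.0], size=N)     # diagonal of D
    HD = hadamard_normalized(N) * signs         # equals H_N @ diag(signs)
    rows = rng.choice(N, size=r, replace=False)
    scale = np.sqrt(N / r)
    q_tilde, *_ = np.linalg.lstsq(scale * (HD[rows] @ P), scale * (HD[rows] @ y), rcond=None)
    return q_tilde

rng = np.random.default_rng(6)
P = rng.standard_normal((1024, 5))
y = P @ rng.standard_normal(5) + 0.5 * rng.standard_normal(1024)
q_tilde = hadamard_sketch_ls(P, y, r=300, rng=rng)
```

Because $H_N D$ is orthogonal, premultiplying by it leaves the full LS problem unchanged; its purpose is to spread the leverage of the rows so that uniform sampling becomes a reasonable substitute for leverage-score sampling.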
Big data tensors

Although the matrix model in (1) is quite versatile and can subsume a variety of important frameworks as special cases, the particular planar arrangement of data poses limitations in capturing available structures that can be crucial for effective interpolation. In the example of movie recommender systems, matrix models can readily handle two-dimensional structures of people $\times$ movie ratings. However, movies are classified in various genres and one could explicitly account for this information by arranging ratings in a sparse person $\times$ genre $\times$ title three-way array or tensor. In general, various tensor data analytic tasks for network traffic, social networking, or medical data analysis aim at capturing an underlying latent structure, which calls for high-order factorizations even in the presence of missing data [], [50].

A rank-one three-way array $\underline{Y} = [y_{i_a i_b i_c}] \in \mathbb{R}^{I_a \times I_b \times I_c}$, where the underline denotes tensors, is the outer product $a \circ b \circ c$ of three vectors $a \in \mathbb{R}^{I_a}$, $b \in \mathbb{R}^{I_b}$, $c \in \mathbb{R}^{I_c}$: $y_{i_a i_b i_c} = a_{i_a} b_{i_b} c_{i_c}$. One can interpret $a_{i_a}$, $b_{i_b}$, and $c_{i_c}$ as corresponding to the people, genre, and title components, respectively, in the previous example. The rank of a tensor is the smallest number of rank-one tensors that sum up to generate the given tensor. These notions readily generalize to higher-way tensors, depending on the application. Notwithstanding, this is not an incremental extension from low-rank matrices to low-rank tensors, since even computing the tensor rank is an NP-hard problem in itself [36]. Defining a convex surrogate for the rank penalty such as the nuclear norm for matrices is not obvious either, since singular values when applicable, e.g., in the Tucker model, are not related to the rank [74]. Although a three-way array can be unfolded to obtain a matrix exhibiting latent Kronecker product structure, such an unfolding typically destroys the structure that one looks for.

These considerations motivate forming a low-rank approximation of tensor $\underline{Y}$ as
$$\underline{Y} \approx \sum_{r=1}^{\rho} a_r \circ b_r \circ c_r. \tag{4}$$

Low-rank tensor approximation is a relatively mature topic in multilinear algebra and factor analysis, and when exact, the decomposition (4) is called parallel factor analysis (PARAFAC) or canonical decomposition (CANDECOMP) [36]. PARAFAC is the model of choice when one is primarily interested in revealing latent structure. Unlike the matrix case, low-rank tensor decomposition can be unique. There is deep theory behind this result, and algorithms recovering the rank-one factors [37]. However, various computational and big data-related challenges remain. Missing data have been handled in rather ad hoc ways [76]. Parallel and decentralized implementations have not been thoroughly addressed; see, e.g., ParCube and GigaTensor algorithms for recent scalable approaches [57].

With reference to (4), introduce the factor matrix $A := [a_1, \ldots, a_\rho] \in \mathbb{R}^{I_a \times \rho}$, and likewise for $B \in \mathbb{R}^{I_b \times \rho}$ and $C \in \mathbb{R}^{I_c \times \rho}$. Let $Y_{i_c}$, $i_c = 1, \ldots, I_c$ denote the $i_c$th slice of $\underline{Y}$ along its third (tube) dimension, such that $Y_{i_c}(i_a, i_b) = y_{i_a i_b i_c}$. It follows that (4) can be compactly represented in matrix form, in terms of slice factorizations $Y_{i_c} = A\,\mathrm{diag}(e_{i_c}^\top C)\,B^\top$, $\forall i_c$. Capitalizing on the Frobenius-norm regularization in (P2), decentralized algorithms for low-rank tensor completion under the PARAFAC model can be based on the optimization task:

$$\min_{\{A, B, C\}} \tfrac{1}{2}\sum_{i_c=1}^{I_c} \|\mathcal{P}_{\Omega_{i_c}}(Y_{i_c} - A\,\mathrm{diag}(e_{i_c}^\top C)\,B^\top)\|_F^2 + \frac{\lambda_*}{2}\left(\|A\|_F^2 + \|B\|_F^2 + \|C\|_F^2\right). \tag{5}$$

Different from the matrix case, it is unclear whether the regularization in (5) bears any relation with the tensor rank. Interestingly, [7] asserts that (5) provably yields a low-rank $\underline{Y}$ for sufficiently large $\lambda_*$, while the potential for scalable BCDM-based interpolation algorithms is apparent. For an online algorithm, see also (9) in the section "Big Data Tasks" and [50] for further details.

Kernel-based learning

In imputing random missing entries, prediction of multiway data can be viewed as a tensor completion problem, where an entire slice (say, the one orthogonal to the tube direction representing time) is missing.
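A small numerical sketch of the slice representation behind (4) and of the cost in (5), including the case where a whole slice is unobserved, may help here. The function names and sizes are assumptions for illustration. When slice $i_c$ is fully masked, the $i_c$th row of $C$ enters the cost only through the regularizer, so setting it to zero lowers the cost by exactly $(\lambda_*/2)$ times its squared norm.

```python
import numpy as np

def parafac_tensor(A, B, C):
    """Sum of rho rank-one terms a_r o b_r o c_r, as in (4)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def slice_from_factors(A, B, C, ic):
    """Slice along the tube dimension: A diag(e_ic^T C) B^T."""
    return A @ np.diag(C[ic]) @ B.T

def completion_cost(Y, mask, A, B, C, lam):
    """Cost in (5): masked LS fit per slice plus Frobenius-norm regularization."""
    fit = 0.0
    for ic in range(Y.shape[2]):
        resid = Y[:, :, ic] - slice_from_factors(A, B, C, ic)
        fit += 0.5 * np.sum(np.where(mask[:, :, ic], resid, 0.0) ** 2)
    reg = 0.5 * lam * (np.sum(A ** 2) + np.sum(B ** 2) + np.sum(C ** 2))
    return fit + reg

rng = np.random.default_rng(7)
A, B, C = rng.standard_normal((4, 3)), rng.standard_normal((5, 3)), rng.standard_normal((6, 3))
Y = parafac_tensor(A, B, C)
mask = np.ones(Y.shape, dtype=bool)
mask[:, :, 2] = False                 # slice i_c = 2 is entirely missing
C_zero = C.copy()
C_zero[2] = 0.0                       # zero out the row tied to the missing slice
gap = completion_cost(Y, mask, A, B, C, 0.3) - completion_cost(Y, mask, A, B, C_zero, 0.3)
```

The positive `gap` shows that the minimizer of (5) drives the row of $C$ associated with a missing slice to zero, so the corresponding predicted slice is identically zero.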
Notice that since (5) does not specify a correlation structure, it cannot perform this extrapolation task. Kernel functions provide the nonlinear means to infuse correlations or side information (e.g., user age range and educational background for movie recommendation systems) in various big data tasks spanning disciplines such as 1) statistics, for inference and prediction [28], 2) machine learning, for classification, regression, clustering, and dimensionality reduction [63], and 3) SP, as well as (non)linear system identification, sampling, interpolation, noise removal, and imputation; see, e.g., [6] and [75]. In kernel-based learning, processing is performed in a high-, possibly infinite-dimensional reproducing kernel Hilbert space (RKHS) $\mathcal{H}$, where the function $f \in \mathcal{H}$ to be learned is expressed as a superposition of kernels; i.e., $f(x) = \sum_{i=1}^{\infty} \phi_i\, \kappa(x, x_i)$, where $\kappa: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is the kernel associated with $\mathcal{H}$, $\{\phi_i\}_{i=1}^{\infty}$ denote the expansion coefficients, and $x, x_i \in \mathcal{X}$, $\forall i$ [63].

Broadening the scope of (5), a kernel-based tensor completion problem is posed as follows. With index sets $\mathcal{X}_a := \{1, \ldots, I_a\}$, $\mathcal{X}_b := \{1, \ldots, I_b\}$, and $\mathcal{X}_c := \{1, \ldots, I_c\}$, and associated kernels $\kappa_{\mathcal{X}_a}(i_a, i_a')$, $\kappa_{\mathcal{X}_b}(i_b, i_b')$, and $\kappa_{\mathcal{X}_c}(i_c, i_c')$, tensor entry $y_{i_a i_b i_c}$ is approximated using functions from the set

$$\mathcal{F} := \Big\{ f(i_a, i_b, i_c) = \sum_{r=1}^{\rho} a_r(i_a)\, b_r(i_b)\, c_r(i_c) \ \Big|\ a_r \in \mathcal{H}_{\mathcal{X}_a},\ b_r \in \mathcal{H}_{\mathcal{X}_b},\ c_r \in \mathcal{H}_{\mathcal{X}_c} \Big\},$$

where $\rho$ is an upper bound on the rank. Specifically, with binary weights $\{\omega_{i_a i_b i_c}\}$ taking value 0 if $y_{i_a i_b i_c}$ is missing (and 1 otherwise), fitting low-rank tensors is possible using

$$\hat{f} = \arg\min_{f \in \mathcal{F}} \ \frac{1}{2} \sum_{i_a, i_b, i_c} \omega_{i_a i_b i_c} \left[ y_{i_a i_b i_c} - f(i_a, i_b, i_c) \right]^2 + \frac{\lambda_*}{2} \sum_{r=1}^{\rho} \left[ \|a_r\|_{\mathcal{H}_{\mathcal{X}_a}}^2 + \|b_r\|_{\mathcal{H}_{\mathcal{X}_b}}^2 + \|c_r\|_{\mathcal{H}_{\mathcal{X}_c}}^2 \right]. \tag{6}$$

If all kernels are selected as Kronecker deltas, (6) reverts back to (5).
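The slice form of the PARAFAC model and the regularized fitting cost in (5) can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names (`parafac_tensor`, `parafac_slice`, `masked_cost`) and the toy dimensions are illustrative choices and do not come from the article. It builds a tensor from factor matrices as in (4), verifies the slice factorization, and evaluates a cost with the structure of (5) over a random sampling mask.

```python
import numpy as np

def parafac_tensor(A, B, C):
    """Sum of rho rank-one tensors a_r o b_r o c_r, as in (4)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def parafac_slice(A, B, C, ic):
    """ic-th slice along the tube dimension: A diag(e_ic^T C) B^T."""
    return (A * C[ic, :]) @ B.T

def masked_cost(Y, mask, A, B, C, lam):
    """Least-squares fit on observed entries plus Frobenius-norm penalties."""
    fit = 0.0
    for ic in range(Y.shape[2]):
        R = mask[:, :, ic] * (Y[:, :, ic] - parafac_slice(A, B, C, ic))
        fit += 0.5 * np.sum(R ** 2)
    reg = 0.5 * lam * (np.sum(A ** 2) + np.sum(B ** 2) + np.sum(C ** 2))
    return fit + reg

rng = np.random.default_rng(0)
Ia, Ib, Ic, rho = 4, 5, 6, 3          # toy sizes (assumed for illustration)
A = rng.standard_normal((Ia, rho))
B = rng.standard_normal((Ib, rho))
C = rng.standard_normal((Ic, rho))

Y = parafac_tensor(A, B, C)
mask = rng.random(Y.shape) < 0.5      # True where an entry is observed

slices_match = all(
    np.allclose(Y[:, :, ic], parafac_slice(A, B, C, ic)) for ic in range(Ic)
)
cost_at_truth = masked_cost(Y, mask, A, B, C, lam=0.0)
```

With the penalty switched off, the cost evaluated at the generating factors is zero up to rounding, which is a quick sanity check on the slice representation.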
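The separable structure of the regularization in (6) allows application of the Representer theorem [63], which implies that $a_r$, $b_r$, and $c_r$ admit finite-dimensional representations given by

$$a_r(i_a) = \sum_{i_a'=1}^{I_a} \alpha_{r i_a'}\, \kappa_{\mathcal{X}_a}(i_a, i_a'), \quad b_r(i_b) = \sum_{i_b'=1}^{I_b} \beta_{r i_b'}\, \kappa_{\mathcal{X}_b}(i_b, i_b'), \quad c_r(i_c) = \sum_{i_c'=1}^{I_c} \gamma_{r i_c'}\, \kappa_{\mathcal{X}_c}(i_c, i_c'),$$

respectively. Coefficients $\mathbf{A} := [\alpha_{r i_a'}]$, $\mathbf{B} := [\beta_{r i_b'}]$, and $\mathbf{C} := [\gamma_{r i_c'}]$ turn out to be solutions of [cf. (5)]

$$(\hat{\mathbf{A}}, \hat{\mathbf{B}}, \hat{\mathbf{C}}) := \arg\min_{\{\mathbf{A},\mathbf{B},\mathbf{C}\}} \ \frac{1}{2} \sum_{i_c=1}^{I_c} \left\| \mathcal{P}_{\Omega_{i_c}}\!\left( \mathbf{Y}_{i_c} - \mathbf{K}_{\mathcal{X}_a} \mathbf{A}\, \mathrm{diag}(\mathbf{e}_{i_c}^\top \mathbf{K}_{\mathcal{X}_c} \mathbf{C})\, \mathbf{B}^\top \mathbf{K}_{\mathcal{X}_b} \right) \right\|_F^2 + \frac{\lambda_*}{2}\, \mathrm{trace}\!\left[ \mathbf{A}^\top \mathbf{K}_{\mathcal{X}_a} \mathbf{A} + \mathbf{B}^\top \mathbf{K}_{\mathcal{X}_b} \mathbf{B} + \mathbf{C}^\top \mathbf{K}_{\mathcal{X}_c} \mathbf{C} \right], \tag{P5}$$

where $\mathbf{K}_{\mathcal{X}_a} := [\kappa_{\mathcal{X}_a}(i_a, i_a')]$, and likewise for $\mathbf{K}_{\mathcal{X}_b}$ and $\mathbf{K}_{\mathcal{X}_c}$, stand for kernel matrices formed using (cross-)correlations estimated from historical data as detailed in, e.g., [7]. Remarkably, the cost in (P5) is convex w.r.t. any of $\{\mathbf{A}, \mathbf{B}, \mathbf{C}\}$, whenever the rest of them are held fixed. As such, the low-complexity online accelerated algorithms of the "Algorithms" section carry over to tensors too. Having $\hat{\mathbf{A}}$ available, the estimate $\hat{\alpha}_{r i_a'}$ is obtained, and likewise for $\hat{\beta}_{r i_b'}$ and $\hat{\gamma}_{r i_c'}$. The latter yield the desired predicted values as $\hat{y}_{i_a i_b i_c} := \sum_{r=1}^{\rho} \hat{a}_r(i_a)\, \hat{b}_r(i_b)\, \hat{c}_r(i_c)$.

Big data tasks

The tools and themes outlined so far will be applied in this section to a sample of big data SP-relevant tasks.

Dimensionality reduction

Network visualization

The rising complexity and volume of networked (graph-valued) data present new opportunities and challenges for visualization tools that capture global patterns and structural information such as hierarchy, similarity, and communities [3], [27]. Most visualization algorithms trade off the clarity of structural characteristics of the underlying data for aesthetic requirements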
such as minimal edge crossing and fixed internode distance. Although efficient for relatively small networks or graphs (hundreds of nodes), embeddings for larger graphs using these techniques are seldom structurally informative. The growing interest in analysis of big data networks has prioritized the need for effectively capturing structure over aesthetics in visualization. For instance, layouts of metro-transit networks that show hierarchically the bulk of traffic convey a lucid picture about the most critical nodes in the event of a terrorist attack. To this end, [3] captures hierarchy in networks or graphs through well-defined measures of node importance, collectively known as centrality in the network science community. Examples are the betweenness centrality, which describes the extent to which information is routed through a specific node by measuring the fraction of all shortest paths traversing it, as well as closeness, eigenvalue, and Markov centrality [3].

Consider an undirected graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ denotes the set of vertices (nodes, agents, or processing cores) with cardinality $V = |\mathcal{V}|$, and $\mathcal{E}$ stands for edges (links) that represent pairs of nodes that can communicate. Following (P3), node $\nu \in \mathcal{V}$ communicates with its single- or multihop neighboring peers in $\mathcal{N}_\nu \subset \mathcal{V}$. Given a set of observed feature vectors $\{\mathbf{y}_\nu\}_{\nu \in \mathcal{V}} \subset \mathbb{R}^P$, and a prescribed embedding dimension $p \ll P$ (typically $p \in \{2, 3\}$ for visualization), the graph embedding amounts to finding a set of vectors $\{\mathbf{z}_\nu\}_{\nu \in \mathcal{V}} \subset \mathbb{R}^p$ that preserve in the very low-dimensional $\mathbb{R}^p$ the network structure observed via $\{\mathbf{y}_\nu\}_{\nu \in \mathcal{V}}$. The dimensionality reduction module of [3] is based on local linear embedding (LLE) principles [61], which assume that the observed $\{\mathbf{y}_\nu\}_{\nu \in \mathcal{V}}$ live on a low-dimensional,
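smooth, but unknown manifold, with the objective of seeking an embedding that preserves the local structure of the manifold in the lower dimensional $\mathbb{R}^p$. In particular, LLE accomplishes this by approximating each data point via an affine combination (real weights summing up to 1) of its neighbors, followed by construction of a lower-dimensional embedding that best preserves the weights. If $\mathbf{Y}_\nu := [\mathbf{y}_{\nu'_1}, \ldots, \mathbf{y}_{\nu'_{|\mathcal{N}_\nu|}}] \in \mathbb{R}^{P \times |\mathcal{N}_\nu|}$ gathers all the observed data within the neighborhood of node $\nu$, and along the lines of LLE, the centrality-constrained (CC) LLE method comprises the following two steps:

$$
\begin{aligned}
\text{S1:}\quad & \forall \nu \in \mathcal{V},\ \ \mathbf{s}_\nu \in \arg\min_{\mathbf{s}} \ \|\mathbf{y}_\nu - \mathbf{Y}_\nu \mathbf{s}\|^2 \quad \text{s.to}\ \ \|\mathbf{Y}_\nu \mathbf{s}\|^2 = h^2(c_\nu),\ \ \mathbf{1}^\top \mathbf{s} = 1 \\
\text{S2:}\quad & \min_{\{\mathbf{z}_\nu\}_{\nu \in \mathcal{V}}} \ \sum_{\nu \in \mathcal{V}} \Big\| \mathbf{z}_\nu - \sum_{\nu' \in \mathcal{V}} s_{\nu\nu'}\, \mathbf{z}_{\nu'} \Big\|^2 \quad \text{s.to}\ \ \|\mathbf{z}_\nu\|^2 = h^2(c_\nu),\ \ \forall \nu \in \mathcal{V},
\end{aligned}
\tag{7}
$$

where $\{c_\nu\}_{\nu \in \mathcal{V}} \subset \mathbb{R}$ are centrality metrics, $h(\cdot)$ is a monotone decreasing function that quantifies the centrality hierarchy, e.g., $h(c_\nu) = \exp(-c_\nu)$, and $\mathbf{1}^\top \mathbf{s} = 1$ enforces the local affine approximation of $\mathbf{y}_\nu$ by $\{\mathbf{y}_{\nu'}\}_{\nu' \in \mathcal{N}_\nu}$. In other words, and in the spirit of (P3), $\mathbf{y}_\nu$ is affinely approximated by the local dictionary $\mathbf{D}_\nu := \mathbf{Y}_\nu$. It is worth stressing that both objective and constraints in step 1 of (7) can be computed solely by means of the inner products or correlations $\{\mathbf{Y}_\nu^\top \mathbf{y}_\nu, \mathbf{Y}_\nu^\top \mathbf{Y}_\nu\}_{\nu \in \mathcal{V}}$. Hence, knowledge of $\{\mathbf{y}_\nu\}_{\nu \in \mathcal{V}}$ is not needed in CC-LLE, and only a given set of dissimilarity measures $\{\delta_{\nu\nu'}\}_{(\nu,\nu') \in \mathcal{V}^2}$ suffices to formulate (7), where $\delta_{\nu\nu'} \in \mathbb{R}_{\geq 0}$, $\delta_{\nu\nu'} = \delta_{\nu'\nu}$, and $\delta_{\nu\nu} = 0$, $\forall (\nu, \nu') \in \mathcal{V}^2$; e.g., $\delta_{\nu\nu'} := 1 - \mathbf{y}_\nu^\top \mathbf{y}_{\nu'} / (\|\mathbf{y}_\nu\| \|\mathbf{y}_{\nu'}\|)$ in (7). After relaxing the nonconvex constraint $\|\mathbf{Y}_\nu \mathbf{s}\|^2 = h^2(c_\nu)$ to the convex one $\|\mathbf{Y}_\nu \mathbf{s}\|^2 \leq h^2(c_\nu)$, a BCDM approach is followed to solve (7) efficiently, with computational complexity that scales linearly with the network size [3].

[Fig3] The visualization of two snapshots of the large-scale network Gnutella [40] by means of the CC-LLE method. The centrality metric is defined by the node degree. Hence, nodes with low degree are placed far from the center of the embedding. (a) Gnutella-04 (08/04/2002). (b) Gnutella-24 (08/24/2002).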
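The remark that step 1 of (7) needs only inner products can be illustrated on the classical LLE weight computation. The sketch below is a simplification and an assumption of this example: it keeps the affine constraint $\mathbf{1}^\top \mathbf{s} = 1$ but omits the centrality constraint of CC-LLE, so it is plain LLE rather than the method of [3]. The function name and the small ridge term `reg` are illustrative.

```python
import numpy as np

def affine_weights_from_inner_products(YtY, Yty, yty, reg=1e-9):
    """Minimize ||y - Y s||^2 subject to 1^T s = 1, using only
    Y^T Y, Y^T y and y^T y (plain LLE, no centrality constraint)."""
    K = YtY.shape[0]
    ones = np.ones(K)
    # Gram matrix of the shifted neighbors (y_j - y), from inner products:
    # G[j, k] = y_j^T y_k - y_j^T y - y^T y_k + y^T y
    G = YtY - np.outer(Yty, ones) - np.outer(ones, Yty) + yty
    # Small ridge term so that the linear system is well conditioned
    G = G + reg * np.trace(G) / K * np.eye(K)
    w = np.linalg.solve(G, ones)
    return w / w.sum()

rng = np.random.default_rng(1)
P, K = 5, 3                                   # toy sizes (assumed)
Y_nbrs = rng.standard_normal((P, K))          # neighbors as columns
y = rng.standard_normal(P)                    # point to approximate

s = affine_weights_from_inner_products(Y_nbrs.T @ Y_nbrs, Y_nbrs.T @ y, y @ y)
residual = np.linalg.norm(y - Y_nbrs @ s)
uniform_residual = np.linalg.norm(y - Y_nbrs @ np.full(K, 1.0 / K))
```

Because the weights sum to one, $\|\mathbf{y} - \mathbf{Y}\mathbf{s}\|^2 = \mathbf{s}^\top \mathbf{G}\, \mathbf{s}$, so the minimizer is proportional to $\mathbf{G}^{-1}\mathbf{1}$; the feature vectors themselves never enter the function.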
Figure 3 depicts the validation of CC-LLE on large-scale degree visualizations of snapshots of the Gnutella peer-to-peer file-sharing network ($|\mathcal{V}| = 26{,}518$, $|\mathcal{E}| = 65{,}369$) [40]. Snapshots of this directed network were captured on 4 and 24 August 2002, respectively, with nodes representing hosts. For convenience, undirected renditions of the two networks were obtained by symmetrization of their adjacency matrices. Notice here that the method can generalize to the directed case too, at the price of increased computational complexity. The centrality metric of interest was the node degree, and dissimilarities were computed based on the number of shared neighbors between any pair of hosts. It is clear from Figure 3 that despite the dramatic growth of the network over a span of 20 days, most new nodes had low degree, located thus far from the center of the embedding. The CC-LLE efficiency is manifested by the low running times for obtaining embeddings in Figure 3; ,684 s for Gnutella-04, and 5,639 s for Gnutella-24 [3].

Inference and imputation

Decentralized estimation of anomalous network traffic

In the backbone of large-scale networks, origin-to-destination (OD) traffic flows experience abrupt changes that can result in congestion and limit the quality of service provisioning of the end users. These traffic anomalies could be due to external sources such as network failures, denial of service attacks, or intruders [38]. Unveiling them is a crucial task in engineering network traffic. This is challenging however, since the available data are high-dimensional noisy link-load measurements, which comprise the superposition of clean and anomalous traffic.

Consider as in the section "Dimensionality Reduction" an undirected, connected graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. The traffic $\mathbf{Y} \in \mathbb{R}^{N \times T}$, carried over the edges or links $\mathcal{E}$ ($|\mathcal{E}| = N$) and measured at time instants $t \in \{1, \ldots, T\}$, is modeled as the superposition of unknown clean traffic flows $\mathbf{L}^*$, over the time horizon of interest, and the traffic volume anomalies $\mathbf{S}^*$ plus noise $\mathbf{V}$; $\mathbf{Y} = \mathbf{L}^* + \mathbf{S}^* + \mathbf{V}$. Common temporal patterns among the traffic flows in addition to their periodic behavior render most rows (respectively columns) of $\mathbf{L}^*$ linearly dependent, and thus $\mathbf{L}^*$ typically has low rank [38].
Anomalies are expected to occur sporadically over time, and only last for short periods relative to the (possibly long) measurement interval. In addition, only a small fraction of the flows is anomalous at any time slot. This renders matrix $\mathbf{S}^*$ sparse across rows and columns [48].

In the present context, real data including OD flow traffic levels and end-to-end latencies are collected from the operation of the Internet2 network (Internet backbone network across the United States) [30]. OD flow traffic levels were recorded for a three-week operation (sampled per 5 min) of Internet2-v1 during 8 to 28 December 2003 [38]. To better assess performance, large spikes of amplitude equal to the largest recorded traffic across all flows and time instants were injected into % randomly selected entries of the ground-truth matrix $\mathbf{L}^*$. Along the lines of (P3), where the number of links $N = $ , and $T = 504$, the rows of the data matrix $\mathbf{Y}$ were distributed uniformly over a number of $V = $ nodes. (P3) is solved using ADMM, and a small portion ($50 \times 50$) of the estimated anomaly matrix $\hat{\mathbf{S}}$ is depicted in Figure 4(a).

[Fig4] Decentralized estimation of network traffic anomalies measured in byte units over 5 min time intervals: (a) only a small portion ($50 \times 50$) of the sparse matrices $\mathbf{S}^*$ and $\hat{\mathbf{S}}$ entries are shown (real data, Internet2-v1; anomaly amplitude versus flow index $n$ and time index $t$); (b) relative estimation error versus ADMM iteration index and central processing unit (CPU) time over networks with $V$ number of nodes (synthetic data). The curve obtained by the centralized RPCA method [12] is also depicted.
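To make the low-rank plus sparse model $\mathbf{Y} = \mathbf{L}^* + \mathbf{S}^* + \mathbf{V}$ concrete, the following sketch runs a centralized block coordinate descent on a nuclear-norm and $\ell_1$-norm regularized least-squares cost. This is an assumption made for illustration only: it is not the decentralized ADMM solver of (P3) used in the article, and the names `lam_star`, `lam_1`, and the synthetic data sizes are hypothetical.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage: proximal operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink singular values: proximal operator of tau * nuclear norm."""
    U, sig, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(sig - tau, 0.0)) @ Vt

def objective(Y, L, S, lam_star, lam_1):
    fit = 0.5 * np.linalg.norm(Y - L - S, "fro") ** 2
    return fit + lam_star * np.linalg.norm(L, "nuc") + lam_1 * np.abs(S).sum()

def low_rank_plus_sparse(Y, lam_star, lam_1, n_iter=50):
    """Alternate exact minimizations over L (low rank) and S (sparse)."""
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    history = []
    for _ in range(n_iter):
        L = singular_value_threshold(Y - S, lam_star)
        S = soft_threshold(Y - L, lam_1)
        history.append(objective(Y, L, S, lam_star, lam_1))
    return L, S, history

# Synthetic traffic-like data (sizes and noise levels are assumed)
rng = np.random.default_rng(2)
N, T, rank = 20, 30, 2
L_true = rng.standard_normal((N, rank)) @ rng.standard_normal((rank, T))
S_true = np.zeros((N, T))
spikes = rng.random((N, T)) < 0.05
S_true[spikes] = 10.0 * rng.choice([-1.0, 1.0], size=spikes.sum())
Y = L_true + S_true + 0.01 * rng.standard_normal((N, T))

L_hat, S_hat, history = low_rank_plus_sparse(Y, lam_star=1.0, lam_1=0.5)
```

Each block update solves its subproblem exactly, so the recorded cost values cannot increase from one pass to the next; how well `S_hat` matches the injected spikes depends on the chosen regularization weights.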
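As a means of offering additional design insights, further validation is provided here to reveal the tradeoffs that become relevant as the network size increases. Specifically, comparisons in terms of running time are carried out w.r.t. its centralized counterpart. Throughout, a network modeled as a square grid (uniform lattice) with $\sqrt{V}$ agents per row/column is adopted. To gauge running times as the network grows, consider a fixed size data matrix $\mathbf{Y} \in \mathbb{R}^{2{,}500 \times 2{,}500}$. The data are synthesized according to the previous model of $\mathbf{Y} = \mathbf{L}^* + \mathbf{S}^* + \mathbf{V}$, details for which can be found in [47, Sec. V]. Rows of $\mathbf{Y}$ are uniformly split among the network nodes. Figure 4(b) illustrates the relative estimation error $\|\hat{\mathbf{S}} - \mathbf{S}^*\|_F / \|\mathbf{S}^*\|_F$ ($\hat{\mathbf{S}}$ stands for the estimate of $\mathbf{S}^*$) versus both iteration index of the ADMM and CPU time over various network sizes.

Dynamic link load traffic prediction and imputation

Consider again the previous undirected graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. Connectivity and edge strengths of $\mathcal{G}$ are described by the adjacency matrix $\mathbf{W} \in \mathbb{R}^{V \times V}$, where $[\mathbf{W}]_{\nu\nu'} > 0$ if nodes $\nu$ and $\nu'$ are connected, while $[\mathbf{W}]_{\nu\nu'} = 0$ otherwise. At every $t \in \mathbb{N}_{>0}$, a variable $\chi_{t,\nu} \in \mathbb{R}$, which describes a networkwide dynamical process of interest, corresponds to a node $\nu \in \mathcal{V}$. All node variables are collected in $\boldsymbol{\chi}_t := [\chi_{t,1}, \ldots, \chi_{t,V}]^\top \in \mathbb{R}^V$. A sparse representation of the process over $\mathcal{G}$ models $\boldsymbol{\chi}_t$ as a linear combination of few atoms in an $N \times M$ dictionary $\mathbf{D}$, with $M \geq N$; and $\boldsymbol{\chi}_t = \mathbf{D}\mathbf{s}_t$, where $\mathbf{s}_t \in \mathbb{R}^M$ is sparse. Further, only a portion of $\boldsymbol{\chi}_t$ is observed per time slot $t$. Let now $\boldsymbol{\Omega}_t \in \mathbb{R}^{N_\ell \times N}$, $N_\ell \leq N$, denote a binary measurement matrix, with each row of $\boldsymbol{\Omega}_t$ corresponding to the canonical basis vector for $\mathbb{R}^N$, selecting the measured components of $\mathbf{y}_t \in \mathbb{R}^{N_\ell}$. In other words, the observed data per slot $t$ are $\mathbf{y}_t = \boldsymbol{\Omega}_t \boldsymbol{\chi}_t + \mathbf{v}_t$, where $\mathbf{v}_t$ denotes noise. To impute missing entries of $\boldsymbol{\chi}_t$ in $\mathbf{y}_t$, the topology of $\mathcal{G}$ will be utilized. The spatial correlation of the process is captured by the (unnormalized) graph Laplacian matrix $\boldsymbol{\Lambda} := \mathrm{diag}(\mathbf{W}\mathbf{1}_N) - \mathbf{W}$, where $\mathbf{1}_N \in \mathbb{R}^N$ is the all-ones vector. Following Figure and given a forgetting factor $\delta \in (0, 1]$, to gradually diminish the effect of past data (and thus account for nonstationarity), define

$$F_t(\mathbf{s}, \mathbf{D}) := \underbrace{\sum_{\tau=1}^{t} \frac{\delta^{t-\tau}}{\Delta_t} \left[ \frac{1}{2} \|\mathbf{y}_\tau - \boldsymbol{\Omega}_\tau \mathbf{D}\mathbf{s}\|^2 + \frac{\lambda_\Lambda}{2}\, \mathbf{s}^\top \mathbf{D}^\top \boldsymbol{\Lambda} \mathbf{D}\mathbf{s} \right]}_{f_t(\mathbf{s}, \mathbf{D})} + \underbrace{\lambda_1 \|\mathbf{s}\|_1}_{g_1(\mathbf{s})} + \underbrace{\iota_{\mathcal{D}}(\mathbf{D})}_{g_2(\mathbf{D})}, \tag{8}$$

where $\Delta_t := \sum_{\tau=1}^{t} \delta^{t-\tau}$, and $\iota_{\mathcal{D}}$ stands for the indicator function of $\mathcal{D} := \{\mathbf{D} = [\mathbf{d}_1, \ldots, \mathbf{d}_M] \in \mathbb{R}^{N \times M};\ \|\mathbf{d}_m\| \leq 1,\ m \in \{1, \ldots, M\}\}$, i.e., $\iota_{\mathcal{D}}(\mathbf{D}) = 0$ if $\mathbf{D} \in \mathcal{D}$, and $\iota_{\mathcal{D}}(\mathbf{D}) = +\infty$ if $\mathbf{D} \notin \mathcal{D}$ (note that $\forall c > 0$, $\mathrm{Prox}_{c\iota_{\mathcal{D}}}$ is the metric projection onto the closed convex $\mathcal{D}$ [5]). The term including the known $\boldsymbol{\Lambda}$ quantifies the a priori information on the topology of $\mathcal{G}$, and promotes smooth solutions over strongly connected nodes of $\mathcal{G}$ [23]. This term is also instrumental for accommodating missing entries in $(\boldsymbol{\chi}_t)_{t \in \mathbb{N}_{>0}}$.

The algorithm of Figure was validated on estimating and tracking networkwide link loads taken from the Internet2 measurement archive [30]. The network consists of $N = 54$ links and nine nodes. Using the network topology and routing information, networkwide link loads $(\boldsymbol{\chi}_t)_{t \in \mathbb{N}_{>0}} \subset \mathbb{R}^N$ become available (in gigabits per second). Per time slot $t$, only $N_\ell = 30$ of the $\boldsymbol{\chi}_t$ components, chosen randomly via $\boldsymbol{\Omega}_t$, are observed in $\mathbf{y}_t \in \mathbb{R}^{N_\ell}$. Cardinality of the time-varying dictionaries is set to $M = 80$, $\forall t$. To cope with pronounced temporal variations of the Internet2 link loads, the forgetting factor $\delta$ in (8) was set equal to 0.5. Figure 5 depicts estimated values of both observed (dots) and missing (crosses) link loads, for a randomly chosen link of the network. The normalized squared estimation error between the true $\boldsymbol{\chi}_t$ and the inferred $\hat{\boldsymbol{\chi}}_t$, specifically $\|\boldsymbol{\chi}_t - \hat{\boldsymbol{\chi}}_t\|^2 / \|\boldsymbol{\chi}_t\|^2$, is also plotted in Figure 5 versus time $t$.

[Fig5] Link load tracking (dots and triangles) and imputation (crosses and circles) on Internet2 [30]. The proposed method is validated versus the ADMM-based approach of [23]. Panels (a) and (b) show true and estimated link loads (observed and missing) versus time for two links; panel (c) shows the estimation error versus time for the ADMM-based and proposed methods.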
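Three ingredients of the cost in (8) are easy to verify in isolation: the Laplacian quadratic form that promotes smoothness, the projection onto the dictionary constraint set, and the normalized forgetting weights. The sketch below assumes NumPy; the function names and the four-node path graph are illustrative and not taken from the article.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian diag(W 1) - W of a weighted adjacency matrix."""
    return np.diag(W @ np.ones(W.shape[0])) - W

def project_columns_unit_ball(D):
    """Metric projection onto {D : ||d_m|| <= 1 for every column d_m}."""
    norms = np.linalg.norm(D, axis=0)
    return D / np.maximum(norms, 1.0)

def forgetting_weights(t, delta):
    """Weights delta^(t - tau) / Delta_t for tau = 1, ..., t."""
    w = delta ** (t - np.arange(1, t + 1))
    return w / w.sum()

# Path graph on four nodes with unit edge weights (assumed toy topology)
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
Lap = graph_laplacian(W)
x = np.array([1.0, 2.0, 4.0, 7.0])

# x^T Lap x equals half the weighted sum of squared differences over node pairs
quad_form = x @ Lap @ x
pairwise = 0.5 * np.sum(W * (x[:, None] - x[None, :]) ** 2)

D = np.array([[3.0, 0.3],
              [4.0, 0.4]])             # column norms 5.0 and 0.5
D_proj = project_columns_unit_ball(D)

weights = forgetting_weights(t=4, delta=0.5)
```

The identity between `quad_form` and `pairwise` is why the Laplacian term favors signals that vary little across connected nodes. Columns already inside the unit ball are left untouched by the projection, and the most recent slot always receives the largest weight when `delta < 1`.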
The accelerated algorithm was compared with the state-of-the-art scheme in [23] that relies on ADMM, to minimize a cost closely related to (8) w.r.t. $\mathbf{s}$, and uses BCD iterations requiring matrix inversion to optimize (8) w.r.t. $\mathbf{D}$. On the other hand, R = and R = 0 in the algorithm of Figure. It is worth noticing here that ADMM in [23] requires multiple iterations to achieve a prescribed estimation accuracy, and that no matrix inversion was incorporated in the realization of the proposed scheme. Even if the accelerated first-order method operates under lower computational complexity than the ADMM approach, estimation error performance both on observed and missing values is almost identical.

[Fig6] The imputation of missing functional MRI cardiac images by using the PARAFAC tensor model and the online framework of (9). The images were artificially colored to highlight the differences between the obtained recovery results. (a) The original image. (b) The degraded image (75% missing values). (c) The recovered image ($\rho = 10$) with relative estimation error 0.4. (d) The recovered image ($\rho = 50$) with relative estimation error .

Cardiac MRI

Cardiac magnetic resonance imaging (MRI) is a major imaging tool for noninvasive diagnosis of heart diseases in clinical practice. However, time limitations posed by the patient's breath-holding time, and thus the need for fast data acquisition, degrade the quality of MRI images, resulting often in missing pixel values. In the present context, imputation of the missing pixels utilizes the fact that cardiac MRI images intrinsically contain low-dimensional components. The FOURDIX data set is considered, which contains 263 cardiac scans with ten steps of the entire cardiac cycle [24]. Each scan is an image of size $512 \times 512$ pixels, which is divided into 64 ($32 \times 32$)-dimensional patches. Placing one after the other, patches form a sequence of slices of a tensor $\underline{\mathbf{Y}} \in \mathbb{R}^{32 \times 32 \times 67{,}328}$. Randomly chosen 75% of the $\underline{\mathbf{Y}}$ entries are dropped to simulate missing data. Operating on such a tensor via batch algorithms is computationally demanding, due to the tensor's size and the computer's memory limitations. Motivated by the batch formulation in (5), a weighted LS online counterpart is [50]

$$\min_{\{\mathbf{A},\mathbf{B},\mathbf{C}\}} \ \frac{1}{2} \sum_{\tau=1}^{t} \delta^{t-\tau} \left[ \left\| \mathcal{P}_{\Omega_\tau}\!\left( \mathbf{Y}_\tau - \mathbf{A}\,\mathrm{diag}(\mathbf{e}_\tau^\top \mathbf{C})\,\mathbf{B}^\top \right) \right\|_F^2 + \frac{\lambda_*}{\sum_{\tau'=1}^{t} \delta^{t-\tau'}} \left( \|\mathbf{A}\|_F^2 + \|\mathbf{B}\|_F^2 \right) + \lambda_* \|\mathbf{e}_\tau^\top \mathbf{C}\|^2 \right], \tag{9}$$

where $\delta > 0$ is a forgetting factor, and $\mathbf{e}_\tau$ is the $\tau$th $t$-dimensional canonical vector. The third dimension $t$ of $\underline{\mathbf{Y}}$ in (9) indicates the slice number.
To solve (9), the variables $\{\mathbf{A}, \mathbf{B}, \mathbf{C}\}$ are sequentially processed; fixing $\{\mathbf{A}, \mathbf{B}\}$, (9) is minimized w.r.t. $\mathbf{C}$, while gradient (steepest descent) steps are taken w.r.t. each one of $\mathbf{A}$ and $\mathbf{B}$, having the other variables held constant. The resultant online learning algorithm is computationally light, with 56 operations (on average) per $t$. The results of its application to a randomly chosen scan image, for different choices of the rank $\rho$, are depicted in Figure 6 with relative estimation errors, $\|\mathbf{Y}_\tau - \hat{\mathbf{Y}}_\tau\|_F / \|\mathbf{Y}_\tau\|_F$, equal to 0.4 and for $\rho = 10$ and 50, respectively.
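The sequential processing just described can be sketched for a single incoming slice. The code below is a simplified illustration under stated assumptions, not the algorithm of [50]: it uses one regularization weight `lam` for all three factors, drops the forgetting factor, and picks the gradient step sizes from a Lipschitz bound. All function names are hypothetical.

```python
import numpy as np

def slice_cost(Y, M, A, B, gamma, lam):
    """Masked LS fit of one slice plus Frobenius-type penalties."""
    R = M * (Y - (A * gamma) @ B.T)
    reg = np.sum(A ** 2) + np.sum(B ** 2) + np.sum(gamma ** 2)
    return 0.5 * np.sum(R ** 2) + 0.5 * lam * reg

def update_gamma(Y, M, A, B, lam):
    """Ridge LS for the new row of C, using observed entries only."""
    rows, cols = np.nonzero(M)
    H = A[rows, :] * B[cols, :]        # one regressor row per observed entry
    y = Y[rows, cols]
    rho = A.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(rho), H.T @ y)

def gradient_steps(Y, M, A, B, gamma, lam):
    """One steepest-descent step in A, then one in B (step = 1 / Lipschitz bound)."""
    G = B * gamma                      # B diag(gamma)
    R = M * (Y - A @ G.T)
    A = A - (-(R @ G) + lam * A) / (np.linalg.norm(G, 2) ** 2 + lam)
    G = A * gamma                      # A diag(gamma), with the updated A
    R = M * (Y - G @ B.T)
    B = B - (-(R.T @ G) + lam * B) / (np.linalg.norm(G, 2) ** 2 + lam)
    return A, B

rng = np.random.default_rng(3)
Ia, Ib, rho, lam = 6, 5, 2, 0.1        # toy sizes and weight (assumed)
Y_slice = rng.standard_normal((Ia, rho)) @ rng.standard_normal((rho, Ib))
mask = rng.random((Ia, Ib)) > 0.75     # roughly 75% of entries missing

A0 = rng.standard_normal((Ia, rho))
B0 = rng.standard_normal((Ib, rho))
gamma = update_gamma(Y_slice, mask, A0, B0, lam)
cost_before = slice_cost(Y_slice, mask, A0, B0, gamma, lam)
A1, B1 = gradient_steps(Y_slice, mask, A0, B0, gamma, lam)
cost_after = slice_cost(Y_slice, mask, A1, B1, gamma, lam)
```

With these step sizes each gradient step cannot increase the per-slice cost, which makes the sketch easy to sanity-check; a streaming implementation would repeat the three updates for every new slice.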
Additional approaches for batch tensor completion of both visual and spectral data can be found in [41] and [66], whereas the algorithms in [1] and [7] carry out low-rank tensor decompositions from incomplete data and perform imputation as a byproduct.

Acknowledgments

Work in this article was supported by the National Science Foundation grants ECCS and Eager. Moreover, it has been cofinanced by the European Union (European Social Fund) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework, Research Funding Program: Thalis, UoA, Secure Wireless Nonlinear Communications at the Physical Layer. We wish to thank Morteza Mardani and Brian Baingana, from the University of Minnesota, for the fruitful discussions and the numerical tests they provided.

Authors

Konstantinos Slavakis received his Ph.D. degree from the Tokyo Institute of Technology (TokyoTech), Japan, in 2002. He was a postdoctoral fellow with TokyoTech ( ) and the Department of Informatics and Telecommunications, University of Athens, Greece ( ). He was an assistant professor in the Department of Telecommunications and Informatics, University of Peloponnese, Tripolis, Greece (2007 to 2012). He is currently a research associate professor with the Department of Electrical and Computer Engineering and Digital Technology Center, University of Minnesota, United States. His current research interests include signal processing, machine learning, and big data analytics problems.

Georgios B. Giannakis received his Ph.D. degree from the University of Southern California in 1986. Since 1999, he has been with the University of Minnesota, where he holds the ADC chair in wireless telecommunications in the Department of Electrical and Computer Engineering and serves as director of the Digital Technology Center. His interests are in the areas of communications, networking, and statistical signal processing, subjects on which he has published more than 360 journal and 60 conference papers, book chapters, two edited books, and two research monographs (h-index 08).
His current research focuses on sparsity and big data analytics, cognitive networks, renewables, power grid, and social networks. He is the (co-)inventor of patents and the (co-)recipient of eight best paper awards from the IEEE Communications and Signal Processing Societies. He is a Fellow of the IEEE and EURASIP and has also received technical achievement awards from the IEEE Signal Processing Society and EURASIP.

Gonzalo Mateos received his B.Sc. degree in electrical engineering from Universidad de la Republica, Uruguay, in 2005 and the M.Sc. and Ph.D. degrees in electrical engineering from the University of Minnesota, in 2009 and 2011, respectively. Since 2014, he has been an assistant professor with the Department of Electrical and Computer Engineering, University of Rochester. During 2013, he was a visiting scholar with the Computer Science Department, Carnegie Mellon University. From 2003 to 2006, he worked as a systems engineer at ABB, Uruguay. His research interests lie in the areas of statistical learning from big data, network science, wireless communications, and signal processing. His current research focuses on algorithms, analysis, and application of statistical signal processing tools to dynamic network health monitoring, social, power grid, and big data analytics.

References

[1] E. Acar, D. M. Dunlavy, T. G. Kolda, and M. Mørup, "Scalable tensor factorizations for incomplete data," Chemom. Intell. Lab. Syst., vol. 06, no., pp. 4 56, 0.
[2] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Trans. Signal Processing, vol. 54, no., pp , Nov.
[3] B. Baingana and G. B. Giannakis, "Embedding graphs under centrality constraints for network visualization," submitted for publication. arXiv:
[4] L. Balzano, R. Nowak, and B. Recht, "Online identification and tracking of subspaces from highly incomplete information," in Proc. Allerton Conf. Communication, Control, and Computing, Monticello, IL, 00, pp
[5] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces.
New York: Springer, 0.
[6] J. A. Bazerque and G. B. Giannakis, "Nonparametric basis pursuit via sparse kernel-based learning," IEEE Signal Process. Mag., vol. 30, no. 4, pp. 5, July 2013.
[7] J. A. Bazerque, G. Mateos, and G. B. Giannakis, "Rank regularization in Bayesian inference for tensor completion and extrapolation," IEEE Trans. Signal Processing, vol. 6, no., pp , Nov. 2013.
[8] T. Bengtsson, P. Bickel, and B. Li, "Curse-of-dimensionality revisited: Collapse of the particle filter in very large scale systems," in Probability and Statistics: Essays in Honor of David A. Freedman. Beachwood, OH: IMS, 2008, vol., pp
[9] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Belmont, MA: Athena Scientific, 1999.
[10] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Found. Trends Machine Learn., vol. 3, no., pp., 0.
[11] E. Candès and M. B. Wakin, "An introduction to compressive sampling," IEEE Signal Process. Mag., vol. 5, no., pp. 30, 2008.
[12] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?," J. ACM, vol. 58, no., pp. 37, 0.
[13] E. J. Candès and Y. Plan, "Matrix completion with noise," Proc. IEEE, vol. 98, no. 6, pp , June 2009.
[14] V. Chandrasekaran, S. Sanghavi, P. R. Parrilo, and A. S. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM J. Optim., vol., no., pp , 0.
[15] Q. Chenlu and N. Vaswani, "Recursive sparse recovery in large but correlated noise," in Proc. Allerton Conf. Communication, Control, and Computing, Sept. 0, pp
[16] Y. Chi, Y. C. Eldar, and R. Calderbank, "PETRELS: Parallel subspace estimation and tracking using recursive least squares from partial observations," IEEE Trans. Signal Processing, vol. 6, no. 3, pp , 2013.
[17] K. L. Clarkson and D. P. Woodruff, "Low rank approximation and regression in input sparsity time," in Proc. Symp. Theory Computing, June 4, 2013, pp arXiv: v4.
[18] K. Cukier. (00). Data, data everywhere. The Economist. [Online].
Available: http://www.economist.com/node/
[19] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," in Proc. Symp. Operating System Design and Implementation, San Francisco, CA, 2004, vol. 6, p. 0.
[20] P. Drineas and M. W. Mahoney, "A randomized algorithm for a tensor-based generalization of the SVD," Linear Algeb. Appl., vol. 40, no. 3, pp , 2007.
[21] J. Feng, H. Xu, and S. Yan, "Online robust PCA via stochastic optimization," in Proc. Advances in Neural Information Processing Systems, Lake Tahoe, NV, Dec. 2013, pp
[22] P. Forero, A. Cano, and G. B. Giannakis, "Consensus-based distributed support vector machines," J. Mach. Learn. Res., vol., pp , May 00.
[23] P. Forero, K. Rajawat, and G. B. Giannakis, "Prediction of partially observed dynamical processes over networks via dictionary learning," IEEE Trans. Signal Processing, to be published.
[24] [Online]. Available: http://www.osirix-viewer.com/datasets/
[25] H. Gao, J. Cai, Z. Shen, and H. Zhao, "Robust principal component analysis-based four-dimensional computed tomography," Phys. Med. Biol., vol. 56, no., pp , 0.
[26] G. B. Giannakis, V. Kekatos, N. Gatsis, S.-J. Kim, H. Zhu, and B. Wollenberg, "Monitoring and optimization for power grids: A signal processing perspective," IEEE Signal Process. Mag., vol. 30, no. 5, pp. 07 8, Sept. 2013.
[27] L. Harrison and A. Lu, "The future of security visualization: Lessons from network visualization," IEEE Netw., vol. 6, pp. 6, Dec. 0.
[28] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. New York: Springer, 2009.
[29] J. He, L. Balzano, and A. Szlam, "Incremental gradient on the Grassmannian for online foreground and background separation in subsampled video," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Providence, RI, June 0, pp
[30] [Online]. Available: http://www.internet2.edu/observatory/
[31] M. I. Jordan, "On statistics, computation and scalability," Bernoulli, vol. 9, no. 4, pp , 2013.
[32] S. A. Kassam and H. V. Poor, "Robust techniques for signal processing: A survey," Proc. IEEE, vol. 73, no. 3, pp , Mar.
[33] S.-J. Kim and G. B. Giannakis, "Optimal resource allocation for MIMO ad hoc cognitive radio networks," IEEE Trans. Info. Theory, vol. 57, no. 5, pp , May 0.
[34] A. Kleiner, A. Talwalkar, P. Sarkar, and M. I. Jordan, "A scalable bootstrap for massive data," J. Royal Statist. Soc.: Ser. B, to be published. [Online]. Available: http://dx.doi.org/10.1111/rssb.12050
[35] E. D. Kolaczyk, Statistical Analysis of Network Data: Methods and Models. New York: Springer, 2009.
[36] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Rev., vol. 5, no. 3, pp , 2009.
[37] J. B. Kruskal, "Three-way arrays: Rank and uniqueness of trilinear decompositions, with applications to arithmetic complexity and statistics," Linear Algeb. Appl., vol. 8, no., pp , 1977.
[38] A. Lakhina, M. Crovella, and C. Diot, "Diagnosing network-wide traffic anomalies," in Proc. SIGCOMM, Aug. 2004, pp
[39] D.
Lee and H. S. Seung, "Learning the parts of objects by nonnegative matrix factorization," Nature, vol. 40, pp , Oct.
[40] J. Leskovec, J. Kleinberg, and C. Faloutsos, "Graph evolution: Densification and shrinking diameters," ACM Trans. Knowl. Discov. Data, vol., no., Mar.
[41] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, pp. 08 0, Jan. 2013.
[42] Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. Hellerstein, "GraphLab: A new framework for parallel machine learning," in Proc. 26th Conf. Uncertainty in Artificial Intelligence, Catalina Island, CA, 2010.
[43] Y. Ma, P. Niyogi, G. Sapiro, and R. Vidal, "Dimensionality reduction via subspace and submanifold learning [From the Guest Editors]," IEEE Signal Process. Mag., vol. 8, no., pp. 4 6, Mar. 0.
[44] L. Mackey, A. Talwalkar, and M. I. Jordan, "Distributed matrix completion and robust factorization," submitted for publication. arXiv: v7.
[45] M. W. Mahoney, "Randomized algorithms for matrices and data," Found. Trends Machine Learn., vol. 3, no., pp. 3 4, 0.
[46] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, "Online learning for matrix factorization and sparse coding," J. Machine Learn. Res., vol., pp. 9 60, Mar. 00.
[47] M. Mardani, G. Mateos, and G. B. Giannakis, "Decentralized sparsity-regularized rank minimization: Algorithms and applications," IEEE Trans. Signal Processing, vol. 6, no., pp , Nov. 2013.
[48] M. Mardani, G. Mateos, and G. B. Giannakis, "Dynamic anomalography: Tracking network anomalies via sparsity and low rank," IEEE J. Sel. Topics Signal Process., vol. 8, pp , Feb. 2013.
[49] M. Mardani, G. Mateos, and G. B. Giannakis, "Recovery of low-rank plus compressed sparse matrices with application to unveiling traffic anomalies," IEEE Trans. Info. Theory, vol. 59, no. 8, pp , Aug. 2013.
[50] M. Mardani, G. Mateos, and G. B. Giannakis, "Subspace learning and imputation for streaming big data matrices and tensors," IEEE Trans. Signal Processing, submitted for publication.
[51] G. Mateos, J. A.
Bazerque, and G. B. Giannakis, "Distributed sparse linear regression," IEEE Trans. Signal Processing, vol. 58, no. 0, pp , Oct. 00.
[52] G. Mateos and G. B. Giannakis, "Robust PCA as bilinear decomposition with outlier-sparsity regularization," IEEE Trans. Signal Processing, vol. 60, no. 0, pp , Oct. 0.
[53] B. K. Natarajan, "Sparse approximate solutions to linear systems," SIAM J. Comput., vol. 4, no., pp. 7 34, Apr.
[54] Y. Nesterov, "A method for solving the convex programming problem with convergence rate O(1/k^2)," Dokl. Akad. Nauk SSSR, vol. 69, no. 3, pp , 1983.
[55] Office of Science and Technology Policy. (0). Big data research and development initiative. Executive Office of the President. [Online]. Available: http:// final_.pdf
[56] B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: A strategy employed by V1?" Vision Res., vol. 37, no. 3, pp , 1997.
[57] E. E. Papalexakis, U. Kang, C. Faloutsos, N. D. Sidiropoulos, and A. Harpale, "Large scale tensor decompositions: Algorithmic developments and applications," IEEE Data Eng. Bull., vol. 36, no. 3, pp , Sept. 2013.
[58] H. Raja and W. U. Bajwa, "Cloud K-SVD: Computing data-adaptive representations in the cloud," in Proc. Allerton Conf. Communication, Control, and Computing, Oct. 2013, pp
[59] M. Razaviyayn, M. Hong, and Z.-Q. Luo, "A unified convergence analysis of block successive minimization methods for nonsmooth optimization," SIAM J. Optim., vol. 3, no., pp. 6 53, 2013.
[60] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Statist., vol., pp , Sept. 95.
[61] L. K. Saul and S. T. Roweis, "Think globally, fit locally: Unsupervised learning of low dimensional manifolds," J. Mach. Learn. Res., vol. 4, pp. 9 55, Dec.
[62] I. D. Schizas, A. Ribeiro, and G. B. Giannakis, "Consensus in ad hoc WSNs with noisy links, Part I: Distributed estimation of deterministic signals," IEEE Trans. Signal Processing, vol. 56, no., pp , Jan.
[63] B. Schölkopf and A. J. Smola, Learning with Kernels. Cambridge, MA: MIT Press, 00.
[64] G. Scutari, F. Facchinei, P. Song, D. P. Palomar, and J.-S.
Pang, "Decomposition by partial linearization: Parallel optimization of multi-agent systems," IEEE Trans. Signal Processing, vol. 6, no. 3, pp
[65] S. Shalev-Shwartz, "Online learning and online convex optimization," Found. Trends Mach. Learn., vol. 4, no., pp , 0.
[66] M. Signoretto, R. V. Plas, B. D. Moor, and J. A. K. Suykens, "Tensor versus matrix completion: A comparison with application to spectral data," IEEE Signal Process. Lett., vol. 8, pp , July 0.
[67] K. Skretting and K. Engan, "Recursive least squares dictionary learning algorithm," IEEE Trans. Signal Processing, vol. 58, no. 4, pp. 30, Apr. 00.
[68] K. Slavakis and G. B. Giannakis, "Online dictionary learning from big data using accelerated stochastic approximation algorithms," in Proc. ICASSP, Florence, Italy, 2014, pp
[69] V. Solo and X. Kong, Adaptive Signal Processing Algorithms: Stability and Performance. Englewood Cliffs, NJ: Prentice Hall, 1995.
[70] M. Soltanolkotabi and E. J. Candès, "A geometric analysis of subspace clustering with outliers," Ann. Statist., vol. 40, no. 4, pp , Dec. 0.
[71] P. Sprechmann, A. M. Bronstein, and G. Sapiro, "Real-time online singing voice separation from monaural recordings using robust low-rank modeling," in Proc. Conf. Int. Society for Music Information Retrieval, Oct. 0, pp
[72] N. Srebro and A. Shraibman, "Rank, trace-norm and max-norm," in Learning Theory. Berlin/Heidelberg, Germany: Springer, 2005, pp
[73] N. Städler, D. J. Stekhoven, and P. Bühlmann, "Pattern alternating maximization algorithm for missing data in large p small n problems," J. Mach. Learn. Res., to be published. arXiv: v3.
[74] J. M. F. ten Berge and N. D. Sidiropoulos, "On uniqueness in CANDECOMP/PARAFAC," Psychometrika, vol. 67, no. 3, pp , 00.
[75] S. Theodoridis, K. Slavakis, and I. Yamada, "Adaptive learning in a world of projections: A unifying framework for linear and nonlinear classification and regression tasks," IEEE Signal Process. Mag., vol. 8, no., pp. 97 3, Jan. 0.
[76] G. Tomasi and R. Bro, "PARAFAC and missing values," Chemom. Intell. Lab. Syst., vol. 75, no., pp , 2005.
[77] P.
Tseng, Convergence of block coordinae decen mehod for nondiffereniable minimizaion, J. Opim. Theory Appl., vol. 09, pp , June 00. [78] R. Vidal, Subspace clusering, IEEE Signal Process. Mag., vol. 8, no., pp. 5 68, Mar. 0. [79] B. Widrow and J. M. E. Hoff, Adapive swiching circuis, IRE WESCON Conv. Rec., vol. 4, pp , Aug [80] M. Yamagishi and I. Yamada, Overrelaxaion of he fas ieraive shrinkagehresholding algorihm wih variable sepsize, Inverse Probl., vol. 7, no. 0, p , 0. [SP] IEEE SIGNAL PROCESSING MAGAZINE [3] SEPTEMBER 04