Big Data Deep Learning: Challenges and Perspectives


Received April 20, 2014, accepted May 13, 2014, date of publication May 16, 2014, date of current version May 28, 2014. Digital Object Identifier /ACCESS

XUE-WEN CHEN 1, (Senior Member, IEEE), AND XIAOTONG LIN 2
1 Department of Computer Science, Wayne State University, Detroit, MI 48404, USA
2 Department of Computer Science and Engineering, Oakland University, Rochester, MI 48309, USA
Corresponding author: X.-W. Chen

ABSTRACT Deep learning is currently an extremely active research area in the machine learning and pattern recognition community. It has gained huge success in a broad range of applications such as speech recognition, computer vision, and natural language processing. With the sheer size of data available today, big data brings big opportunities and transformative potential for various sectors; on the other hand, it also presents unprecedented challenges to harnessing data and information. As the data keeps getting bigger, deep learning is coming to play a key role in providing big data predictive analytics solutions. In this paper, we provide a brief overview of deep learning, and highlight current research efforts and the challenges posed by big data, as well as future trends.

INDEX TERMS Classifier design and evaluation, feature representation, machine learning, neural net models, parallel processing.

I. INTRODUCTION
Deep learning and Big Data are two of the hottest trends in the rapidly growing digital world. While Big Data has been defined in different ways, herein it refers to the exponential growth and wide availability of digital data that are difficult or even impossible to manage and analyze using conventional software tools and technologies. Digital data, in all shapes and sizes, is growing at astonishing rates. For example, according to the National Security Agency, the Internet processes 1,826 petabytes of data per day [1]. In 2011, digital information had grown nine times in volume in just five years [2], and by 2020, its amount in the world will reach 35 trillion gigabytes [3]. This explosion of digital data brings big opportunities and transformative potential for various sectors such as enterprises, the healthcare industry, manufacturing, and educational services [4]. It also leads to a dramatic paradigm shift in scientific research towards data-driven discovery. While Big Data offers great potential for revolutionizing all aspects of our society, harvesting valuable knowledge from Big Data is no ordinary task. The large and rapidly growing body of information hidden in unprecedented volumes of non-traditional data requires both the development of advanced technologies and interdisciplinary teams working in close collaboration. Today, machine learning techniques, together with advances in available computational power, have come to play a vital role in Big Data analytics and knowledge discovery (see [5]-[8]). They are employed widely to leverage the predictive power of Big Data in fields like search engines, medicine, and astronomy. As an extremely active subfield of machine learning, deep learning is considered, together with Big Data, as one of the big deals and the basis for an American innovation and economic revolution [9]. In contrast to most conventional learning methods, which use shallow-structured learning architectures, deep learning refers to machine learning techniques that use supervised and/or unsupervised strategies to automatically learn hierarchical representations in deep architectures for classification [10], [11].
Inspired by biological observations of human brain mechanisms for processing natural signals, deep learning has attracted much attention from the academic community in recent years due to its state-of-the-art performance in many research domains such as speech recognition [12], [13], collaborative filtering [14], and computer vision [15], [16]. Deep learning has also been successfully applied in industry products that take advantage of the large volume of digital data. Companies like Google, Apple, and Facebook, which collect and analyze massive amounts of data on a daily basis, have been aggressively pushing forward deep-learning-related projects. For example, Apple's Siri, the virtual personal assistant in iPhones, offers a wide variety of services, including weather reports, sports news, answers to users' questions, and reminders, by utilizing deep learning and the ever-growing data collected by Apple services [17]. Google applies deep learning algorithms to massive chunks of messy data obtained from the Internet for Google's translator,

Android's voice recognition, Google's Street View, and image search [18]. Other industry giants are not far behind either. For example, Microsoft's real-time language translation in Bing voice search [19] and IBM's brain-like computer [18], [20] use techniques like deep learning to leverage Big Data for competitive advantage. As the data keeps getting bigger, deep learning is coming to play a key role in providing big data predictive analytics solutions, particularly with the increased processing power and the advances in graphics processors. In this paper, our goal is not to present a comprehensive survey of all the related work in deep learning, but mainly to discuss the most important issues related to learning from massive amounts of data, and to highlight current research efforts, the challenges posed by big data, and future trends. The rest of the paper is organized as follows. Section 2 presents a brief review of two commonly used deep learning architectures. Section 3 discusses strategies for deep learning from massive amounts of data. Finally, we discuss the challenges and perspectives of deep learning for Big Data in Section 4.

II. OVERVIEW OF DEEP LEARNING
Deep learning refers to a set of machine learning techniques that learn multiple levels of representations in deep architectures. In this section, we present a brief overview of two well-established deep architectures: deep belief networks (DBNs) [21]-[23] and convolutional neural networks (CNNs) [24]-[26].

A. DEEP BELIEF NETWORKS
Conventional neural networks are prone to getting trapped in local optima of a non-convex objective function, which often leads to poor performance [27]. Furthermore, they cannot take advantage of unlabeled data, which are often abundant and cheap to collect in Big Data. To alleviate these problems, a deep belief network (DBN) uses a deep architecture that is capable of learning feature representations from both the labeled and unlabeled data presented to it [21]. It incorporates both unsupervised pre-training and supervised fine-tuning strategies to construct the model: the unsupervised stage intends to learn data distributions without using label information, while the supervised stage performs a local search for fine-tuning. Fig. 1 shows a typical DBN architecture, which is composed of a stack of Restricted Boltzmann Machines (RBMs) and/or one or more additional layers for discrimination tasks. RBMs are probabilistic generative models that learn a joint probability distribution of observed (training) data without using data labels [28]. They can effectively utilize large amounts of unlabeled data for exploiting complex data structures. Once the structure of a DBN is determined, the goal of training is to learn the weights (and biases) between layers. This is conducted first by unsupervised learning of the RBMs. A typical RBM consists of two layers: nodes in one layer are fully connected to nodes in the other layer, and there are no connections between nodes within the same layer (see Fig. 1; for example, the input layer and the first hidden layer H_1 form an RBM) [28]. Consequently, each node is independent of the other nodes in the same layer given all the nodes in the other layer. This characteristic allows us to train the generative weights W of each RBM using Gibbs sampling [29], [30].

FIGURE 1. Illustration of a deep belief network architecture. This particular DBN consists of three hidden layers, each with three neurons; one input layer with five neurons; and one output layer, also with five neurons. Any two adjacent layers can form an RBM trained with unlabeled data. The outputs of the current RBM (e.g., h^(1) in the first RBM, marked in red) are the inputs of the next RBM (e.g., h^(2) in the second RBM, marked in green).
The weights W can then be fine-tuned with labeled data after pre-training. Before fine-tuning, a layer-by-layer pre-training of the RBMs is performed: the outputs of one RBM are fed as inputs to the next RBM, and the process repeats until all the RBMs are pre-trained. This layer-by-layer unsupervised learning is critical in DBN training, as in practice it helps avoid local optima and alleviates the over-fitting problem observed when millions of parameters are used. Furthermore, the algorithm is very efficient, with a time complexity linear in the number and size of the RBMs [21]. Features at different layers contain different information about data structures, with higher-level features constructed from lower-level features. Note that the number of stacked RBMs is a parameter predetermined by the user, and that pre-training requires only unlabeled data (for good generalization). For a simple RBM with Bernoulli distributions for both the visible and hidden layers, the sampling probabilities are as follows [21]:

p(h_j = 1 \mid \mathbf{v}; W) = \sigma\left( \sum_{i=1}^{I} w_{ij} v_i + a_j \right)    (1)

and

p(v_i = 1 \mid \mathbf{h}; W) = \sigma\left( \sum_{j=1}^{J} w_{ij} h_j + b_i \right)    (2)

where \mathbf{v} and \mathbf{h} represent an I x 1 visible unit vector and a J x 1 hidden unit vector, respectively; W is the matrix of weights (w_{ij}) connecting the visible and hidden layers; a_j and b_i are bias terms; and \sigma(\cdot) is the sigmoid function.
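To make Eqs. (1) and (2) concrete, here is a minimal NumPy sketch of the two conditional sampling steps for a Bernoulli-Bernoulli RBM. All names (sample_hidden, sample_visible, the toy sizes) are illustrative, not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

def sample_hidden(v, W, a):
    # Eq. (1): p(h_j = 1 | v) = sigmoid(sum_i w_ij v_i + a_j)
    p_h = sigmoid(v @ W + a)
    return p_h, (rng.random(p_h.shape) < p_h).astype(float)

def sample_visible(h, W, b):
    # Eq. (2): p(v_i = 1 | h) = sigmoid(sum_j w_ij h_j + b_i)
    p_v = sigmoid(h @ W.T + b)
    return p_v, (rng.random(p_v.shape) < p_v).astype(float)

# Toy usage: I = 6 visible units, J = 4 hidden units.
I_units, J_units = 6, 4
W = 0.01 * rng.standard_normal((I_units, J_units))  # weights w_ij
a = np.zeros(J_units)                               # hidden biases a_j
b = np.zeros(I_units)                               # visible biases b_i
v0 = (rng.random(I_units) < 0.5).astype(float)      # a binary visible vector
p_h0, h0 = sample_hidden(v0, W, a)
```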

For the case of real-valued visible units, the conditional probability distributions are slightly different: typically, a Gaussian-Bernoulli distribution is assumed, and p(v | h; W) is Gaussian [30]. The weights w_{ij} are updated with an approximate method called the contrastive divergence (CD) approximation [31]. For example, the (t+1)-th update of w_{ij} can be computed as follows:

\Delta w_{ij}(t+1) = c \, \Delta w_{ij}(t) + \alpha \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}} \right)    (3)

where \alpha is the learning rate and c is the momentum factor; \langle \cdot \rangle_{\text{data}} and \langle \cdot \rangle_{\text{model}} denote expectations under the distributions defined by the data and the model, respectively. While these expectations could be calculated by running Gibbs sampling infinitely many times, in practice one-step CD is often used because it performs well [31]. Other model parameters (e.g., the biases) can be updated similarly. As a generative model, RBM training includes a Gibbs sampler that samples the hidden units given the visible units and vice versa (Eqs. (1) and (2)); the weights between the two layers are then updated using the CD rule (Eq. (3)), and this process repeats until convergence. An RBM models the data distribution using hidden units without employing label information. This is a very useful feature in Big Data analysis, as a DBN can potentially leverage much more data (without knowing its labels) for improved performance.

After pre-training, information about the input data is stored in the weights between adjacent layers. The DBN then adds a final layer representing the desired outputs, and the overall network is fine-tuned using labeled data and back-propagation strategies for better discrimination (in some implementations, on top of the stacked RBMs, there is another layer, called the associative memory, determined by supervised learning methods). There are other variations for pre-training: instead of using RBMs, for example, stacked denoising auto-encoders [32], [33] and stacked predictive sparse coding [34] have also been proposed for unsupervised feature learning. Furthermore, recent results show that when a large amount of training data is available, fully supervised training using random initial weights instead of pre-trained weights (i.e., without using RBMs or auto-encoders) works well in practice [13], [35]. For example, a discriminative model starts with a network with a single hidden layer (i.e., a shallow neural network), which is trained by back-propagation. Upon convergence, a new hidden layer is inserted into this shallow NN (between the first hidden layer and the desired output layer), and the full network is discriminatively trained again. This process continues until a predetermined criterion is met (e.g., a target number of hidden neurons).

In summary, DBNs use a greedy and efficient layer-by-layer approach to learn the latent variables (weights) in each hidden layer and a back-propagation method for fine-tuning. This hybrid training strategy thus improves both the generative performance and the discriminative power of the network.
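Before moving on to CNNs, a minimal sketch of the one-step contrastive divergence (CD-1) update of Eq. (3), reusing the sample_hidden/sample_visible helpers from the sketch above. Applying the momentum term to the weight increment is the standard reading of Eq. (3); the hyperparameter values are illustrative.

```python
def cd1_update(v0, W, a, b, dW_prev, alpha=0.1, c=0.5):
    """One-step contrastive divergence (CD-1) implementing Eq. (3)."""
    p_h0, h0 = sample_hidden(v0, W, a)      # positive phase: <v h>_data
    p_v1, v1 = sample_visible(h0, W, b)     # reconstruct the visible units
    p_h1, _  = sample_hidden(v1, W, a)      # negative phase: <v h>_model
    # Single-example estimate of <v_i h_j>_data - <v_i h_j>_model
    grad = np.outer(v0, p_h0) - np.outer(v1, p_h1)
    dW = c * dW_prev + alpha * grad         # momentum on the weight increment
    return W + dW, dW

# Usage: run repeatedly over training vectors until convergence.
dW = np.zeros_like(W)
W, dW = cd1_update(v0, W, a, b, dW)
```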
B. CONVOLUTIONAL NEURAL NETWORKS
A typical CNN is composed of many layers in a hierarchy: some layers serve as feature representations (feature maps), while others act as a conventional neural network for classification [24]. It often starts with two alternating types of layers, called convolutional and subsampling layers: convolutional layers perform convolution operations with several filter maps of equal size, while subsampling layers reduce the sizes of the preceding layers by averaging pixels within a small neighborhood (or by max-pooling [36], [37]). Fig. 2 shows a typical CNN architecture. The input is first convolved with a set of filters (the C layers in Fig. 2); the resulting 2D filtered data are called feature maps. After a nonlinear transformation, subsampling is performed to reduce the dimensionality (the S layers in Fig. 2). The sequence of convolution/subsampling can be repeated many times (predetermined by the user).

FIGURE 2. Illustration of a typical convolutional neural network architecture. The input is a 2D image, which is convolved with four different filters (i.e., h_i^(1), i = 1 to 4), followed by a nonlinear activation, to form the four feature maps in the second layer (C_1). These feature maps are down-sampled by a factor of 2 to create the feature maps in layer S_1. The sequence of convolution/nonlinear activation/subsampling can be repeated many times. In this example, to form the feature maps in layer C_2, we use eight different filters (i.e., h_i^(2), i = 1 to 8): the first, third, fourth, and sixth feature maps in layer C_2 are each defined by one corresponding feature map in layer S_1 convolved with a different filter, while the second and fifth maps in layer C_2 are each formed by two maps in S_1 convolved with two different filters. The last layer is an output layer forming a fully connected 1D neural network, i.e., the 2D outputs from the last subsampling layer (S_2) are concatenated into one long input vector, with each neuron fully connected to all the neurons in the next layer (a hidden layer in this figure).

As illustrated in Fig. 2, the lowest level of this architecture is the input layer, with 2D N x N images as inputs. Through local receptive fields, upper-layer neurons extract both elementary and complex visual features. Each convolutional layer (labeled Cx in Fig. 2) is composed of multiple feature maps, which are constructed by convolving inputs with different filters (weight vectors). In other words, the value of each unit in a feature map depends on a local receptive field in the previous layer and the filter.

This is followed by a nonlinear activation:

y_j^{(l)} = f\left( K_j^{(l)} \ast x^{(l-1)} + b_j \right)    (4)

where y_j^{(l)} is the j-th output (feature map) of the l-th convolution layer C_l; f(\cdot) is a nonlinear function (most recent implementations use a scaled hyperbolic tangent as the nonlinear activation function [38]: f(x) = tanh(2x/3)); and K_j^{(l)} is a trainable filter (or kernel) in the filter bank that convolves with the feature map x^{(l-1)} from the previous layer to produce a new feature map in the current layer. The symbol \ast denotes the discrete convolution operator, and b_j is a bias. Note that each filter K_j can connect to all or a portion of the feature maps in the previous layer (in Fig. 2, we show partially connected feature maps between S_1 and C_2). The subsampling layer (labeled Sx in Fig. 2) reduces the spatial resolution of the feature map (thus providing some level of distortion invariance). In general, each unit in the subsampling layer is constructed by averaging a 2 x 2 area in the feature map or by max-pooling over a small region. The key parameters to be decided are the weights between layers, which are normally trained by standard back-propagation and a gradient descent algorithm with mean squared error as the loss function.

Alternatively, training deep CNN architectures can be unsupervised. Herein we review a particular method for unsupervised training of CNNs: predictive sparse decomposition (PSD) [39]. The idea is to approximate the inputs X with a linear combination of basic, sparse functions:

Z^* = \arg\min_Z \| X - WZ \|_2^2 + \lambda \| Z \|_1 + \alpha \| Z - D \tanh(KX) \|_2^2    (5)

where W is a matrix with a linear basis set, Z is a sparse coefficient matrix, D is a diagonal gain matrix, and K is the filter bank with predictor parameters. The goal is to find the optimal basis function set W and the filter bank K that simultaneously minimize the reconstruction error (the first term in Eq. (5)) with a sparse representation (the second term) and the code prediction error (the third term in Eq. (5), which measures the difference between the predicted code and the actual code and preserves invariance to certain distortions). PSD can be trained with a feed-forward encoder to learn the filter bank and the pooling together [39].

In summary, inspired by biological processes [40], CNN algorithms learn a hierarchical feature representation by utilizing strategies like local receptive fields (the size of each filter is normally small), shared weights (using the same weights to construct all the feature maps at the same level significantly reduces the number of parameters), and subsampling (to further reduce the dimensionality). Each filter bank can be trained with either supervised or unsupervised methods. A CNN is capable of learning good feature hierarchies automatically and providing some degree of translational and distortional invariance.
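A minimal NumPy sketch of the per-map forward computation of Eq. (4), using the scaled tanh activation named above, followed by 2 x 2 average subsampling (an S layer). The helper names and toy sizes are illustrative, not from the paper.

```python
import numpy as np

def conv2d_valid(x, k):
    """Discrete 2D 'valid' convolution of input map x with kernel k."""
    kh, kw = k.shape
    H, W_ = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    kf = k[::-1, ::-1]                      # flip kernel: true convolution
    out = np.empty((H, W_))
    for r in range(H):
        for c in range(W_):
            out[r, c] = np.sum(x[r:r+kh, c:c+kw] * kf)
    return out

def conv_layer(x, kernels, biases):
    """Eq. (4): y_j = f(K_j * x + b_j) with f(u) = tanh(2u/3)."""
    return [np.tanh((2.0 / 3.0) * (conv2d_valid(x, K) + b))
            for K, b in zip(kernels, biases)]

def subsample2x2(m):
    """Average each non-overlapping 2x2 block (a subsampling S layer)."""
    H, W_ = (m.shape[0] // 2) * 2, (m.shape[1] // 2) * 2
    m = m[:H, :W_]
    return 0.25 * (m[0::2, 0::2] + m[0::2, 1::2] + m[1::2, 0::2] + m[1::2, 1::2])

# Toy usage: one 8x8 input, two 3x3 filters -> two C-maps, then two S-maps.
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8))
kernels = [0.1 * rng.standard_normal((3, 3)) for _ in range(2)]
c_maps = conv_layer(x, kernels, biases=[0.0, 0.0])
s_maps = [subsample2x2(m) for m in c_maps]
```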
III. DEEP LEARNING FOR MASSIVE AMOUNTS OF DATA
While deep learning has shown impressive results in many applications, its training is not a trivial task for Big Data learning, because the iterative computations inherent in most deep learning algorithms are often extremely difficult to parallelize. Thus, with the unprecedented growth of commercial and academic data sets in recent years, there has been a surge of interest in effective and scalable parallel algorithms for training deep models [12], [13], [15], [41]-[44]. In contrast to shallow architectures, where few parameters are preferable to avoid overfitting, deep learning algorithms enjoy their success with a large number of hidden neurons, often resulting in millions of free parameters. Thus, large-scale deep learning often involves both large volumes of data and large models.

Some algorithmic approaches have been explored for large-scale learning: for example, locally connected networks [24], [39], improved optimizers [42], and new structures that can be implemented in parallel [44]. Recently, Deng et al. [44] proposed a modified deep architecture called the Deep Stacking Network (DSN), which can be effectively parallelized. A DSN consists of several specialized neural networks (called modules), each with a single hidden layer. Stacking modules whose inputs are composed of the raw data vector and the outputs from the previous module forms a DSN. Most recently, a new deep architecture called the Tensor Deep Stacking Network (T-DSN), which is based on the DSN, has been implemented using CPU clusters for scalable parallel computing [45].

The use of great computing power to speed up the training process has shown significant potential in Big Data deep learning. For example, one way to scale up DBNs is to use multiple CPU cores, with each core dealing with a subset of the training data (data-parallel schemes). Vanhoucke et al. [46] discussed several technical details, including carefully designing the data layout, batching the computation, using SSE2 instructions, and leveraging SSE3 and SSE4 instructions for fixed-point implementation, all of which better exploit modern CPUs for deep learning. Another recent work aims to parallelize the Gibbs sampling of the hidden and visible units by splitting them across n machines, each responsible for 1/n of the units [47] (see the sketch at the end of this paragraph). To make this work, data transfer between machines is required (i.e., when sampling the hidden units, each machine must have the data for all the visible units, and vice versa). This method is efficient if both the hidden and visible units are binary and the sample size is modest; the communication cost, however, can rise quickly if large-scale data sets are used. Other methods for large-scale deep learning explore FPGA-based implementations [48] with a custom architecture: a control unit implemented in a CPU, a grid of multiple full-custom processing tiles, and fast memory.
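A minimal sketch of the unit-splitting idea from [47], simulated in a single process: each of n hypothetical machines holds a column slice of W and samples its own share of the hidden units, which only works if every machine first receives the full visible vector (the communication step noted above). The names and the partitioning scheme are illustrative.

```python
import numpy as np

def split_sample_hidden(v, W, a, n_machines, rng):
    """Sample the hidden layer with its units split across n 'machines':
    machine m evaluates Eq. (1) for its own slice of columns of W, but
    needs a full copy of the visible vector v (the communication cost)."""
    J = W.shape[1]
    h = np.empty(J)
    for idx in np.array_split(np.arange(J), n_machines):
        p = 1.0 / (1.0 + np.exp(-(v @ W[:, idx] + a[idx])))
        h[idx] = (rng.random(idx.size) < p).astype(float)
    return h

# Toy usage: 8 visible units, 12 hidden units split over 3 machines.
rng = np.random.default_rng(0)
W = 0.01 * rng.standard_normal((8, 12))
h = split_sample_hidden((rng.random(8) < 0.5).astype(float),
                        W, np.zeros(12), n_machines=3, rng=rng)
```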

In this survey, we focus on some recently developed deep learning frameworks that take advantage of the great computing power available today. Take Graphics Processing Units (GPUs) as an example: as of August 2013, NVIDIA single-precision GPUs exceeded 4.5 TeraFLOP/s with a memory bandwidth of nearly 300 GB/s [49]. They are particularly suited for massively parallel computing, with more transistors devoted to data processing. These newly developed deep learning frameworks have shown significant advances in making large-scale deep learning practical. Fig. 3 shows a schematic of a typical CUDA-capable GPU with four multiprocessors. Each multiprocessor (MP) consists of several streaming multiprocessors (SMs) that form a building block (Fig. 3 shows two SMs per block). Each SM has multiple stream processors (SPs) that share control logic and low-latency memory. Furthermore, each GPU has a global memory with very high bandwidth and high latency when accessed by the CPU (host). This architecture allows for two levels of parallelism: the instruction (memory) level (i.e., MPs) and the thread level (SPs). This SIMT (Single Instruction, Multiple Threads) architecture allows thousands or tens of thousands of threads to run concurrently, which is best suited for operations with a large number of arithmetic operations and small memory access times. Such levels of parallelism can be effectively utilized with special attention to the data flow when developing GPU parallel computing applications. One consideration, for example, is to reduce the data transfer between RAM and the GPU's global memory [50] by transferring data in large chunks. This is achieved by uploading as large a set of unlabeled data as possible and by storing free parameters, as well as intermediate computations, all in global memory. In addition, data parallelism and learning updates can be implemented by leveraging the two levels of parallelism: input examples can be assigned across MPs, while individual nodes can be treated in each thread (i.e., SPs).

A. LARGE-SCALE DEEP BELIEF NETWORKS
Raina et al. [41] proposed a GPU-based framework for massively parallelizing unsupervised learning models, including DBNs (in their paper, they refer to the algorithms as stacked RBMs) and sparse coding [21]. While previous models tended to use one to four million free parameters (e.g., Hinton and Salakhutdinov [21] used 3.8 million parameters for face images, and Ranzato and Szummer used three million parameters for text processing [51]), the proposed approach can train more than 100 million free parameters with millions of unlabeled training examples [41]. Because transferring data between the host and GPU global memory is time-consuming, one needs to minimize host-device transfers and take advantage of shared memory. To achieve this, one strategy is to store all parameters and a large chunk of training examples in global memory during training [41]. This reduces the number of data transfers between host and global memory and also allows parameter updates to be carried out fully inside the GPU. In addition, to utilize the MP/SP levels of parallelism, a few of the unlabeled training examples in global memory are selected each time to compute the updates concurrently across blocks (data parallelism) (Fig. 3). Meanwhile, each component of an input example is handled by the SPs.

FIGURE 3. An illustrative architecture of a CUDA-capable GPU with highly threaded streaming processors (SPs). In this example, the GPU has 64 stream processors (SPs) organized into four multiprocessors (MPs), each with two stream multiprocessors (SMs). Each SM has eight SPs that share a control unit and instruction cache. The four MPs (building blocks) also share a global memory (e.g., graphics double data rate DRAM) that often functions as very-high-bandwidth, off-chip memory (memory bandwidth is the data exchange rate). Global memory typically has high latency and is accessible to the CPU (host). A typical processing flow is as follows: input data are first copied from host memory to GPU memory, followed by loading and executing the GPU program; results are then sent back from GPU memory to host memory. In practice, one needs to pay careful attention to the data transfer between host and GPU memory, which may take a considerable amount of time.
When implementing DBN learning on the GPU, Gibbs sampling [52], [53] is repeated using Eqs. (1)-(2). This can be implemented by first generating two sampling matrices, P(h|x) and P(x|h), whose (i, j)-th elements are P(h_j | x_i) (i.e., the probability of the j-th hidden node given the i-th input example) and P(x_j | h_i), respectively [41]. The sampling matrices can then be computed in parallel on the GPU, where each block takes an example and each thread works on one element of that example. Similarly, the weight update operations (Eq. (3)) can be performed in parallel using linear algebra packages for the GPU after new examples are generated. Experimental results show that, with 45 million parameters in an RBM and one million examples, the GPU-based implementation increases the speed of DBN learning by a factor of up to 70 compared to a dual-core CPU implementation (around 29 minutes for the GPU-based implementation versus more than one day for the CPU-based implementation) [41].
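A minimal NumPy sketch of the batched (matrix) form of the Gibbs step that [41] maps onto GPU blocks and threads: stacking a mini-batch of examples as rows turns Eqs. (1)-(2) into single matrix operations, which is exactly the shape of computation GPUs execute well. Here it is simulated on the CPU; on a GPU the same expressions would run as kernels. Variable names are illustrative.

```python
import numpy as np

def batch_gibbs_step(X, W, a, b, rng):
    """One Gibbs step for a whole mini-batch: rows of X are examples, so
    Eqs. (1)-(2) become the sampling matrices P(h|x) and P(x|h)."""
    P_h = 1.0 / (1.0 + np.exp(-(X @ W + a)))       # sampling matrix P(h|x)
    H = (rng.random(P_h.shape) < P_h).astype(float)
    P_v = 1.0 / (1.0 + np.exp(-(H @ W.T + b)))     # sampling matrix P(x|h)
    V = (rng.random(P_v.shape) < P_v).astype(float)
    return P_h, H, P_v, V
```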

B. LARGE-SCALE CONVOLUTIONAL NEURAL NETWORKS
A CNN is a type of locally connected deep learning method. Large-scale CNN learning is often implemented on GPUs with several hundred parallel processing cores. CNN training involves both forward and backward propagation. For parallelizing forward propagation, one or more blocks are assigned to each feature map, depending on the size of the maps [36]. Each thread in a block is devoted to a single neuron in a map. Consequently, the computation of each neuron, which includes the convolution of shared weights (kernels) with neurons from the previous layer, activation, and summation, is performed in an SP, and the outputs are then stored in global memory. Weights are updated by back-propagation of errors δ_k. The error signal δ_k^(l-1) of a neuron k in the previous layer (l-1) depends on the error signals δ_j^(l) of some neurons in a local field of the current layer l. Parallelizing backward propagation can be implemented either by pulling or by pushing [36]. Pulling error signals refers to computing the delta signals for each neuron in the previous layer by pulling the error signals from the current layer. This is not straightforward because of the subsampling and convolution operations: for example, neurons in the previous layer may connect to different numbers of neurons in the current layer due to border effects [54]. For illustration, we plot a one-dimensional convolution and subsampling in Fig. 4. As can be seen, the first six units have different numbers of connections, so we first need to identify the list of neurons in the current layer that contribute to the error signals of each neuron in the previous layer. In contrast, all the units in the current layer have exactly the same number of incoming connections. Consequently, pushing the error signals from the current layer to the previous layer is more efficient: for each unit in the current layer, we update the related units in the previous layer (see the sketch at the end of this subsection).

FIGURE 4. An illustration of the operations involved in 1D convolution and subsampling. The convolution filter's size is six; consequently, each unit in the convolution layer is defined by six input units. Subsampling involves averaging two adjacent units in the convolution layer.

For implementing data parallelism, one needs to consider the size of global memory and the feature map size. Typically, at any given stage, only a limited number of training examples can be processed in parallel. Furthermore, within each block where the convolution operation is performed, only a portion of a feature map can be maintained at any given time due to the extremely limited amount of shared memory. For convolution operations, Scherer et al. suggested using the limited shared memory as a circular buffer [37], which holds only a small portion of each feature map loaded from global memory at a time. Convolution is performed by threads in parallel, and the results are written back to global memory. To further overcome the GPU memory limitation, the authors implemented a modified architecture in which the convolution and subsampling operations are combined into one step [37]. This modification allows both the activities and the error values to be stored with reduced memory usage while running back-propagation. For further speedup, Krizhevsky et al. proposed the use of two GPUs for training CNNs with five convolutional layers and three fully connected classification layers. The CNN uses Rectified Linear Units (ReLUs) as the nonlinear function (f(x) = max(0, x)), which has been shown to run several times faster than other commonly used functions [55]. For some layers, about half of the network is computed on one GPU and the other portion on the other GPU, with the two GPUs communicating at certain layers. This architecture takes full advantage of cross-GPU parallelization, which allows two GPUs to communicate and transfer data without using host memory.
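A minimal 1D sketch of the pushing scheme described above: every unit in the current convolution layer has exactly filter-size incoming connections, so we can loop uniformly over current-layer units and scatter-add their contributions into the previous layer, instead of handling the irregular border cases that pulling would require. This assumes the cross-correlation convention for the forward pass; all names are illustrative.

```python
import numpy as np

def push_deltas_1d(delta_cur, weights, prev_len):
    """Back-propagate error signals from a 1D 'valid' convolution layer by
    pushing: current unit j received input from previous units
    j..j+len(weights)-1, so it pushes its delta back to exactly those."""
    k = len(weights)
    delta_prev = np.zeros(prev_len)
    for j, d in enumerate(delta_cur):        # uniform loop: no border cases
        delta_prev[j:j + k] += d * weights   # scatter-add the contribution
    return delta_prev

# Toy usage matching Fig. 4: filter size 6, previous layer of 12 units,
# hence 12 - 6 + 1 = 7 units (and deltas) in the current layer.
w = np.ones(6) / 6.0
delta_prev = push_deltas_1d(np.array([0.1, -0.2, 0.3, 0.0, 0.1, 0.2, -0.1]),
                            w, prev_len=12)
```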
C. COMBINATION OF DATA- AND MODEL-PARALLEL SCHEMES
DistBelief is a software framework recently designed for distributed training and learning in deep networks with very large models (e.g., a few billion parameters) and large-scale data sets. It leverages large-scale clusters of machines to manage both data and model parallelism via multithreading, message passing, synchronization, and communication between machines [56]. For large-scale data with high dimensionality, deep learning often involves many densely connected layers with a large number of free parameters (i.e., large models). To deal with large-model learning, DistBelief first implements model parallelism by allowing users to partition large network architectures into several smaller structures (called blocks), whose nodes are assigned to and computed on several machines (collectively called a partitioned model). Each block is assigned to one machine (see Fig. 5). Boundary nodes (nodes whose edges belong to more than one partition) require data transfer between machines. Naturally, fully connected networks have more boundary nodes and often demand higher communication costs than locally connected structures, and thus see smaller performance benefits. Nevertheless, as many as 144 partitions have been reported for large models in DistBelief [56], leading to a significant improvement in training speed.

DistBelief also implements data parallelism and employs two separate distributed optimization procedures: Downpour stochastic gradient descent (SGD) and Sandblaster [56], which perform online and batch optimization, respectively. Herein we discuss Downpour in detail; more information about Sandblaster can be found in [56]. First, multiple replicas of the partitioned model are created for training and inference. Like the models, the large data sets are partitioned into many subsets. DistBelief then runs the multiple replicas of the partitioned model to compute gradients via Downpour SGD on different subsets of the training data. Specifically, DistBelief employs a centralized parameter server that stores and applies updates for all parameters of the models.

FIGURE 5. DistBelief: models are partitioned into four blocks, which are assigned to four machines [56]. Information for nodes that belong to two or more partitions is transferred between machines (e.g., the lines marked in yellow). This model is more effective for less densely connected networks.

Parameters are grouped into server shards. At any given time, each machine in a partitioned model needs only to communicate with the parameter server shards that hold the relevant parameters. This communication is asynchronous: each machine in a partitioned model runs independently, and each parameter server shard acts independently as well. One advantage of asynchronous communication over standard synchronous SGD is its fault tolerance: in the event of the failure of one machine in a model copy, the other model replicas continue communicating with the central parameter server to process the data and update the shared weights. In practice, the Adagrad adaptive learning rate procedure [57] is integrated into Downpour SGD for better performance (a sketch follows below). DistBelief has been applied to two deep learning models: a fully connected network with 42 million model parameters and 1.1 billion examples, and a locally connected convolutional neural network with 16 million images of 100 by 100 pixels and 21,000 categories (as many as 1.7 billion parameters). The experimental results show that locally connected learning models benefit more from DistBelief: indeed, with 81 machines and 1.7 billion parameters, the method is 12x faster than using a single machine. As demonstrated in [56], a significant advantage of DistBelief is its ability to scale from a single machine to thousands of machines, which is the key to Big Data analysis.

Most recently, the DistBelief framework was used to train a deep architecture with a sparse deep autoencoder, local receptive fields, pooling, and local contrast normalization [50]. The architecture consists of three stacked layers, each with sublayers for local filtering, local pooling, and local contrast normalization. The filtering sublayers are not convolutional: each filter has its own weights. The optimization of this architecture involves an overall objective function that is the sum of the objective functions of the three layers, each aiming to minimize a reconstruction error while maintaining the sparsity of connections between sublayers. The DistBelief framework is able to scale up the dataset, the model, and the resources all together. The model was partitioned across 169 machines, each with 16 CPU cores. Multiple cores allow another level of parallelism, where each subset of cores can perform different tasks. Asynchronous SGD was implemented with several replicas of the core model and mini-batches of training examples. The framework was able to train on as many as 14 million images of 200 by 200 pixels, with more than 20 thousand categories, in three days over a cluster of 1,000 machines with 16,000 cores. The model is capable of learning high-level features to detect objects without using labeled data.
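A minimal, single-process sketch of the Downpour-style interaction between a model replica and a parameter-server shard, with the Adagrad per-parameter learning rate mentioned above. Real DistBelief runs many replicas asynchronously across machines; here the "server" is just an object, a linear least-squares model stands in for a deep network, and all names (ParamShard, fetch, push_gradient) are illustrative.

```python
import numpy as np

class ParamShard:
    """A toy parameter-server shard: holds parameters and applies
    Adagrad-scaled updates pushed by (asynchronous) model replicas."""
    def __init__(self, w, eta=0.01, eps=1e-8):
        self.w = w
        self.g2 = np.zeros_like(w)           # running sum of squared gradients
        self.eta, self.eps = eta, eps

    def fetch(self):
        return self.w.copy()                 # replica downloads parameters

    def push_gradient(self, g):
        self.g2 += g * g                     # Adagrad accumulator
        self.w -= self.eta * g / (np.sqrt(self.g2) + self.eps)

def replica_step(shard, X, y):
    """One Downpour step: fetch parameters, compute a mini-batch gradient
    on this replica's data shard, push the gradient back to the server."""
    w = shard.fetch()
    g = 2.0 * X.T @ (X @ w - y) / len(y)     # gradient on this mini-batch
    shard.push_gradient(g)

# Toy usage: replicas repeatedly pushing gradients to one shard.
rng = np.random.default_rng(0)
shard = ParamShard(np.zeros(5))
for _ in range(100):
    X, y = rng.standard_normal((8, 5)), rng.standard_normal(8)
    replica_step(shard, X, y)
```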
D. THE COTS HPC SYSTEMS
While DistBelief can learn very large models (more than one billion parameters), its training requires 16,000 CPU cores, which are not commonly available to most researchers. Most recently, Coates et al. presented an alternative approach that trains comparable deep network models with more than 11 billion free parameters using just three machines [58]. The Commodity Off-The-Shelf High Performance Computing (COTS HPC) system is comprised of a cluster of 16 GPU servers with InfiniBand adapters for interconnects and MPI for data exchange within the cluster. Each server is equipped with four NVIDIA GTX680 GPUs, each with 4 GB of memory. With a well-balanced number of GPUs and CPUs, COTS HPC is capable of running very large-scale deep learning. The implementation includes carefully designed CUDA kernels for effective memory usage and efficient computation. For example, to efficiently compute a matrix multiplication Y = WX (e.g., W is the filter matrix and X is the input matrix), Coates et al. [58] take full advantage of matrix sparseness and the local receptive fields by extracting the nonzero columns of W for neurons that share identical receptive fields, which are then multiplied by the corresponding rows of X (see the sketch below). This strategy avoids situations where the requested memory is larger than the GPU's shared memory. In addition, matrix operations are performed using a highly optimized tool, the MAGMA BLAS matrix-matrix multiply kernels [59]. Furthermore, the GPUs are utilized to implement a model-parallel scheme: each GPU is used for a different part of the model optimization with the same input examples, and collectively they communicate through MVAPICH2 MPI. This very large-scale deep learning system is capable of training models with more than 11 billion parameters, the largest reported so far, with far fewer machines.
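A minimal NumPy sketch of the sparse-multiply idea described above: when a group of neurons shares one receptive field, their rows of W are nonzero only on that field's columns, so Y = WX can be computed group by group using small dense sub-blocks (the blocks that [58] stage in fast GPU memory). The grouping variables here are illustrative.

```python
import numpy as np

def grouped_sparse_multiply(W, X, groups):
    """Y = W @ X where each group of W's rows (neurons sharing one receptive
    field) is nonzero only on that field's columns. groups is a list of
    (row_idx, col_idx) index arrays."""
    Y = np.zeros((W.shape[0], X.shape[1]))
    for rows, cols in groups:
        # Dense multiply on the small nonzero block and the matching rows
        # of X only; each block is small enough for fast (shared) memory.
        Y[rows] = W[np.ix_(rows, cols)] @ X[cols]
    return Y

# Toy usage: 4 neurons, 2 per receptive field of 3 inputs; batch of 5.
rng = np.random.default_rng(0)
W = np.zeros((4, 6))
groups = [(np.arange(0, 2), np.arange(0, 3)), (np.arange(2, 4), np.arange(3, 6))]
for rows, cols in groups:
    W[np.ix_(rows, cols)] = rng.standard_normal((rows.size, cols.size))
X = rng.standard_normal((6, 5))
assert np.allclose(grouped_sparse_multiply(W, X, groups), W @ X)
```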

Table 1 summarizes the current progress in large-scale deep learning. It has been observed by several groups (see [41]) that a single CPU is impractical for deep learning with a large model. With multiple machines, the running time may no longer be a big concern (see [56]); however, significant computational resources are needed to achieve that goal. Consequently, major research efforts are directed towards experiments with GPUs.

TABLE 1. Summary of recent research progress in large-scale deep learning.

IV. REMAINING CHALLENGES AND PERSPECTIVES: DEEP LEARNING FOR BIG DATA
In recent years, Big Data has taken center stage in government and society at large. In 2012, the Obama Administration announced a Big Data Research and Development Initiative to help solve some of the Nation's most pressing challenges [60]. Consequently, six Federal departments and agencies (NSF, HHS/NIH, DOD, DOE, DARPA, and USGS) committed more than $200 million to support projects that can transform our ability to harness knowledge in novel ways from huge volumes of digital data. In May of the same year, the state of Massachusetts announced the Massachusetts Big Data Initiative, which funds a variety of research institutions [61]. In April 2013, U.S. President Barack Obama announced another federal project, a new brain mapping initiative called BRAIN (Brain Research Through Advancing Innovative Neurotechnologies) [62], aiming to develop new tools to help map human brain functions, understand the complex links between function and behavior, and treat and cure brain disorders. This initiative may test and extend the current limits of technologies for Big Data collection and analysis; as NIH director Francis Collins stated, the collection, storage, and processing of yottabytes (a billion petabytes) of data would eventually be required for this initiative.

While the potential of Big Data is undoubtedly significant, fully achieving it requires new ways of thinking and novel algorithms to address many technical challenges. For example, most traditional machine learning algorithms were designed for data that could be completely loaded into memory. In the age of Big Data, however, this assumption no longer holds; therefore, algorithms that can learn from massive amounts of data are needed. In spite of all the recent achievements in large-scale deep learning discussed in Section 3, this field is still in its infancy. Much more needs to be done to address the many significant challenges posed by Big Data, often characterized by the three V's model: volume, variety, and velocity [63], which refer to the large scale of data, the different types of data, and the speed of streaming data, respectively.

A. DEEP LEARNING FROM HIGH VOLUMES OF DATA
First and foremost, high volumes of data present a great challenge for deep learning. Big data often possesses a large number of examples (inputs), large varieties of class types (outputs), and very high dimensionality (attributes). These properties directly lead to high running-time complexity and model complexity. The sheer volume of data often makes it impossible to train a deep learning algorithm with a central processor and storage; instead, distributed frameworks with parallelized machines are preferred. Recently, impressive progress has been made to mitigate the challenges related to high volumes: novel models utilize clusters of CPUs or GPUs to increase training speed without sacrificing the accuracy of deep learning algorithms, and strategies for data parallelism, model parallelism, or both have been developed. For example, data and models are divided into blocks that fit in memory, and the forward and backward propagations are implemented effectively in parallel [56], [58], although deep learning algorithms are not trivially parallelizable. The most recent deep learning frameworks can handle a significantly large number of samples and parameters, and it is also possible to scale up further as more GPUs are used.
It is less clear, however, how deep learning systems can continue to scale significantly beyond the current frameworks. While we can expect continued growth in computer memory and computational power (mainly through parallel or distributed computing environments), further research and effort on the issues associated with computation and communication management (e.g., copying data, parameters, or gradient values to different machines) are needed for scaling up to very large data sets. Ultimately, to build future deep learning systems scalable to Big Data, one needs to develop high-performance computing infrastructure-based systems together with theoretically sound parallel learning algorithms or novel architectures.

Another challenge associated with high volumes is data incompleteness and noisy labels. Unlike most conventional datasets used for machine learning, which are highly curated and noise-free, Big Data is often incomplete as a result of its disparate origins. To make things even more complicated, the majority of the data may be unlabeled, or, if labeled, the labels may be noisy.

Take the 80 million tiny image database as an example, which has 80 million low-resolution color images covering 79,000 search terms [64]. This image database was created by searching the Web for every non-abstract English noun in WordNet. Several search engines, such as Google and Flickr, were used to collect the data over a span of six months. Some manual curation was conducted to remove duplicates and low-quality images; still, the image labels are extremely unreliable because of the limits of search technologies. One of the unique characteristics deep learning algorithms possess is their ability to utilize unlabeled data during training: learning the data distribution without using label information. Thus, the availability of large amounts of unlabeled data presents ample opportunities for deep learning methods. While data incompleteness and noisy labels are part of the Big Data package, we believe that using vastly more data is preferable to using a smaller amount of exact, clean, and carefully curated data. Advanced deep learning methods are required to deal with noisy data and to tolerate some messiness: for example, a more efficient cost function and novel training strategies may be needed to alleviate the effect of noisy labels. Strategies used in semi-supervised learning [65]-[68] may also help alleviate problems related to noisy labels.

B. DEEP LEARNING FOR HIGH VARIETY OF DATA
The second dimension of Big Data is its variety: data today comes in all types of formats, from a variety of sources, and probably with different distributions. For example, the rapidly growing multimedia data coming from the Web and mobile devices include a huge collection of still images, video and audio streams, graphics and animations, and unstructured text, each with different characteristics. A key to dealing with high variety is data integration. Clearly, one unique advantage of deep learning is its ability for representation learning: with supervised or unsupervised methods, or a combination of both, deep learning can be used to learn good feature representations for classification. It is able to discover intermediate or abstract representations, which is carried out using unsupervised learning in a hierarchical fashion: one level at a time, with higher-level features defined by lower-level features. Thus, a natural solution to the data integration problem is to learn data representations from each individual data source using deep learning methods, and then to integrate the learned features at different levels. Deep learning has been shown to be very effective in integrating data from different sources. For example, Ngiam et al. [69] developed a novel application of deep learning algorithms to learn representations by integrating audio and video data. They demonstrated that deep learning is generally effective in (1) learning single-modality representations through multiple modalities with unlabeled data and (2) learning shared representations capable of capturing correlations across multiple modalities. Most recently, Srivastava and Salakhutdinov [70] developed a multimodal Deep Boltzmann Machine (DBM) that fuses two very different data modalities, real-valued dense image data and text data with sparse word frequencies, to learn a unified representation. The DBM is a generative model without fine-tuning: it first builds a stack of RBMs for each modality; to form a multimodal DBM, an additional layer of binary hidden units is added on top of these RBMs for the joint representation. It learns a joint distribution in the multimodal input space, which allows for learning even with missing modalities. While current experiments have demonstrated that deep learning is able to utilize heterogeneous sources for significant gains in system performance, numerous questions remain open.
For example, given that different sources may offer conflicting information, how can we resolve such conflicts and fuse the data from different sources effectively and efficiently? While current deep learning methods have mainly been tested on bi-modalities (i.e., data from two sources), will system performance benefit from significantly enlarged modalities? Furthermore, what levels in a deep learning architecture are appropriate for feature fusion with heterogeneous data? Deep learning seems well suited to the integration of heterogeneous data with multiple modalities due to its capability of learning abstract representations and the underlying factors of data variation.

C. DEEP LEARNING FOR HIGH VELOCITY OF DATA
Emerging challenges for Big Data learning also arise from high velocity: data are generated at extremely high speed and need to be processed in a timely manner. One solution for learning from such high-velocity data is online learning. Online learning learns one instance at a time, and the true label of each instance soon becomes available, which can be used to refine the model [71]-[76]. This sequential learning strategy particularly suits Big Data, as current machines cannot hold the entire dataset in memory. While conventional neural networks have been explored for online learning [77]-[87], only limited progress on online deep learning has been made in recent years. Interestingly, deep learning is often trained with a stochastic gradient descent approach [88], [89], where one training example with its known label is used at a time to update the model parameters; this strategy may be adapted for online learning as well. To speed up learning, instead of proceeding sequentially one example at a time, the updates can be performed on a mini-batch basis [37], as sketched below. In practice, the examples in each mini-batch should be as independent as possible. Mini-batches provide a good balance between computer memory and running time.
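A minimal sketch of the mini-batch strategy just described: an online model that consumes a stream chunk by chunk, updating once per small batch so that memory stays bounded. Logistic regression and all names here are illustrative stand-ins for a deep model.

```python
import numpy as np

def minibatch_sgd_stream(stream, dim, batch_size=32, eta=0.1):
    """Online mini-batch SGD over a data stream; `stream` yields (x, y)
    pairs with y in {0, 1}. Only one mini-batch is held in memory."""
    w = np.zeros(dim)
    batch = []
    for x, y in stream:
        batch.append((x, y))
        if len(batch) == batch_size:         # update once per mini-batch
            X = np.array([b[0] for b in batch])
            Y = np.array([b[1] for b in batch])
            p = 1.0 / (1.0 + np.exp(-(X @ w)))
            w -= eta * X.T @ (p - Y) / batch_size   # averaged gradient
            batch.clear()                    # memory stays bounded
    return w

# Toy usage: a synthetic stream of 10,000 examples.
rng = np.random.default_rng(0)
true_w = rng.standard_normal(5)
def make_stream(n):
    for _ in range(n):
        x = rng.standard_normal(5)
        yield x, float(x @ true_w > 0)
w = minibatch_sgd_stream(make_stream(10_000), dim=5)
```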

Another challenging problem associated with high velocity is that data are often non-stationary, i.e., the data distribution changes over time. In practice, non-stationary data are normally separated into chunks, each containing data from a small time interval. The assumption is that data close in time are piecewise stationary and may be characterized by a significant degree of correlation, and therefore follow the same distribution [90]-[97]. Thus, an important feature of a deep learning algorithm for Big Data is the ability to learn the data as a stream. One area that needs to be explored is deep online learning: online learning often scales naturally, is memory bounded, readily parallelizable, and theoretically guaranteed [98]. Algorithms capable of learning from non-i.i.d. data are crucial for Big Data learning.

Deep learning can also leverage both the high variety and velocity of Big Data through transfer learning or domain adaptation, where training and test data may be sampled from different distributions [99]-[107]. Recently, Glorot et al. implemented a stacked denoising auto-encoder based deep architecture for domain adaptation, where one trains an unsupervised representation on a large amount of unlabeled data from a set of domains and then uses it to train a classifier with few labeled examples from a single domain [100]. Their empirical results demonstrated that deep learning is able to extract a meaningful, high-level representation that is shared across different domains. The intermediate high-level abstraction is general enough to uncover the underlying factors of domain variation and is transferable across domains. Most recently, Bengio also applied deep learning of multiple levels of representation to transfer learning, where the training examples may not represent the test data well [99]. They showed that the more abstract features discovered by deep learning approaches are most likely generic between training and test data. Thus, deep learning is a top candidate for transfer learning because of its ability to identify shared factors present in the input. Although preliminary experiments have shown much potential for deep learning in transfer learning, applying deep learning to this field is relatively new, and much more needs to be done to improve performance. Of course, the big question is whether we can benefit from Big Data with deep architectures for transfer learning.

In conclusion, Big Data presents significant challenges to deep learning, including large scale, heterogeneity, noisy labels, and non-stationary distributions, among many others. In order to realize the full potential of Big Data, we need to address these technical challenges with new ways of thinking and transformative solutions. We believe that these research challenges posed by Big Data are not only timely, but will also bring ample opportunities for deep learning. Together, they will provide major advances in science, medicine, and business.

REFERENCES
[1] National Security Agency. The National Security Agency: Missions, Authorities, Oversight and Partnerships [Online]. Available: _the_nsa_story.pdf
[2] J. Gantz and D. Reinsel, Extracting Value from Chaos. Hopkinton, MA, USA: EMC, Jun.
[3] J. Gantz and D. Reinsel, The Digital Universe Decade: Are You Ready? Hopkinton, MA, USA: EMC, May.
[4] (2011, May). Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute [Online]. Available: next_frontier_for_innovation
[5] J. Lin and A. Kolcz, "Large-scale machine learning at Twitter," in Proc. ACM SIGMOD, Scottsdale, AZ, USA, 2012.
[6] A. Smola and S. Narayanamurthy, "An architecture for parallel topic models," Proc. VLDB Endowment, vol. 3, no. 1.
[7] A. Ng et al., "Map-reduce for machine learning on multicore," in Proc. Adv. Neural Inf. Process. Syst.
[8] B. Panda, J. Herbach, S. Basu, and R. Bayardo, "MapReduce and its application to massively parallel learning of decision tree ensembles," in Scaling Up Machine Learning: Parallel and Distributed Approaches. Cambridge, U.K.: Cambridge Univ. Press.
[9] E. Crego, G. Munoz, and F. Islam. (2013, Dec. 8). Big data and deep learning: Big deals or big delusions? Business [Online]. Available:
[10] Y. Bengio and S. Bengio, "Modeling high-dimensional discrete data with multi-layer neural networks," in Proc. Adv. Neural Inf. Process. Syst.
[11] M. Ranzato, Y.-L. Boureau, and Y. LeCun, "Sparse feature learning for deep belief networks," in Proc. Adv. Neural Inf. Process. Syst.
[12] G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, Jan.
[13] G. Hinton et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," IEEE Signal Process. Mag., vol. 29, no. 6, Nov.
[14] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proc. 24th Int. Conf. Mach. Learn., 2007.
[15] D. Cireşan, U. Meier, L. Gambardella, and J. Schmidhuber, "Deep, big, simple neural nets for handwritten digit recognition," Neural Comput., vol. 22, no. 12.
[16] M. Zeiler, G. Taylor, and R. Fergus, "Adaptive deconvolutional networks for mid and high level feature learning," in Proc. IEEE Int. Conf. Comput. Vis., Nov. 2011.
[17] A. Efrati. (2013, Dec. 11). How deep learning works at Apple, beyond. Information [Online]. Available: com/how-deep-learning-works-at-apple-beyond
[18] N. Jones, "Computer science: The learning machines," Nature, vol. 505, no. 7482.
[19] Y. Wang, D. Yu, Y. Ju, and A. Acero, "Voice search," in Language Understanding: Systems for Extracting Semantic Information From Speech, G. Tur and R. De Mori, Eds. New York, NY, USA: Wiley, 2011, ch. 5.
[20] J. Kirk. (2013, Oct. 1). Universities, IBM join forces to build a brain-like computer. PCWorld [Online]. Available: article/ /universities-join-ibm-in-cognitive-computing-research-project.html
[21] G. Hinton and R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786.
[22] Y. Bengio, "Learning deep architectures for AI," Found. Trends Mach. Learn., vol. 2, no. 1.
[23] V. Nair and G. Hinton, "3D object recognition with deep belief nets," in Proc. Adv. NIPS.
[24] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, Nov.
[25] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, "Natural language processing (almost) from scratch," J. Mach. Learn. Res., vol. 12, Nov.
[26] P. Le Callet, C. Viard-Gaudin, and D. Barba, "A convolutional neural network approach for objective video quality assessment," IEEE Trans. Neural Netw., vol. 17, no. 5, Sep.
[27] D. Rumelhart, G. Hinton, and R. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, Oct.
[28] G. Hinton, "A practical guide to training restricted Boltzmann machines," Dept. Comput. Sci., Univ. Toronto, Toronto, ON, Canada, Tech. Rep. UTML TR.
[29] G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, no. 7.
[30] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle, "Greedy layer-wise training of deep networks," in Proc. Neural Inf. Process. Syst., 2006.
[31] G. Hinton, "Training products of experts by minimizing contrastive divergence," Neural Comput., vol. 14, no. 8.
[32] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proc. 25th Int. Conf. Mach. Learn., 2008.


More information

Calculation of Sampling Weights

Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT Toshhko Oda (1), Kochro Iwaoka (2) (1), (2) Infrastructure Systems Busness Unt, Panasonc System Networks Co., Ltd. Saedo-cho

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

Frequency Selective IQ Phase and IQ Amplitude Imbalance Adjustments for OFDM Direct Conversion Transmitters

Frequency Selective IQ Phase and IQ Amplitude Imbalance Adjustments for OFDM Direct Conversion Transmitters Frequency Selectve IQ Phase and IQ Ampltude Imbalance Adjustments for OFDM Drect Converson ransmtters Edmund Coersmeer, Ernst Zelnsk Noka, Meesmannstrasse 103, 44807 Bochum, Germany edmund.coersmeer@noka.com,

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

A Secure Password-Authenticated Key Agreement Using Smart Cards

A Secure Password-Authenticated Key Agreement Using Smart Cards A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,

More information

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson

More information

Communication Networks II Contents

Communication Networks II Contents 8 / 1 -- Communcaton Networs II (Görg) -- www.comnets.un-bremen.de Communcaton Networs II Contents 1 Fundamentals of probablty theory 2 Traffc n communcaton networs 3 Stochastc & Marovan Processes (SP

More information

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School Robust Desgn of Publc Storage Warehouses Yemng (Yale) Gong EMLYON Busness School Rene de Koster Rotterdam school of management, Erasmus Unversty Abstract We apply robust optmzaton and revenue management

More information

Rank Based Clustering For Document Retrieval From Biomedical Databases

Rank Based Clustering For Document Retrieval From Biomedical Databases Jayanth Mancassamy et al /Internatonal Journal on Computer Scence and Engneerng Vol.1(2), 2009, 111-115 Rank Based Clusterng For Document Retreval From Bomedcal Databases Jayanth Mancassamy Department

More information

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

CS 2750 Machine Learning. Lecture 17a. Clustering. CS 2750 Machine Learning. Clustering

CS 2750 Machine Learning. Lecture 17a. Clustering. CS 2750 Machine Learning. Clustering Lecture 7a Clusterng Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square Clusterng Groups together smlar nstances n the data sample Basc clusterng problem: dstrbute data nto k dfferent groups such that

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh

More information

Development of an intelligent system for tool wear monitoring applying neural networks

Development of an intelligent system for tool wear monitoring applying neural networks of Achevements n Materals and Manufacturng Engneerng VOLUME 14 ISSUE 1-2 January-February 2006 Development of an ntellgent system for tool wear montorng applyng neural networks A. Antć a, J. Hodolč a,

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Gender Classification for Real-Time Audience Analysis System

Gender Classification for Real-Time Audience Analysis System Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,

More information

Calculating the high frequency transmission line parameters of power cables

Calculating the high frequency transmission line parameters of power cables < ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

iavenue iavenue i i i iavenue iavenue iavenue

iavenue iavenue i i i iavenue iavenue iavenue Saratoga Systems' enterprse-wde Avenue CRM system s a comprehensve web-enabled software soluton. Ths next generaton system enables you to effectvely manage and enhance your customer relatonshps n both

More information

Distributed Multi-Target Tracking In A Self-Configuring Camera Network

Distributed Multi-Target Tracking In A Self-Configuring Camera Network Dstrbuted Mult-Target Trackng In A Self-Confgurng Camera Network Crstan Soto, B Song, Amt K. Roy-Chowdhury Department of Electrcal Engneerng Unversty of Calforna, Rversde {cwlder,bsong,amtrc}@ee.ucr.edu

More information

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns A study on the ablty of Support Vector Regresson and Neural Networks to Forecast Basc Tme Seres Patterns Sven F. Crone, Jose Guajardo 2, and Rchard Weber 2 Lancaster Unversty, Department of Management

More information

Sketching Sampled Data Streams

Sketching Sampled Data Streams Sketchng Sampled Data Streams Florn Rusu, Aln Dobra CISE Department Unversty of Florda Ganesvlle, FL, USA frusu@cse.ufl.edu adobra@cse.ufl.edu Abstract Samplng s used as a unversal method to reduce the

More information

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts Power-of-wo Polces for Sngle- Warehouse Mult-Retaler Inventory Systems wth Order Frequency Dscounts José A. Ventura Pennsylvana State Unversty (USA) Yale. Herer echnon Israel Insttute of echnology (Israel)

More information

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING Matthew J. Lberatore, Department of Management and Operatons, Vllanova Unversty, Vllanova, PA 19085, 610-519-4390,

More information

A DATA MINING APPLICATION IN A STUDENT DATABASE

A DATA MINING APPLICATION IN A STUDENT DATABASE JOURNAL OF AERONAUTICS AND SPACE TECHNOLOGIES JULY 005 VOLUME NUMBER (53-57) A DATA MINING APPLICATION IN A STUDENT DATABASE Şenol Zafer ERDOĞAN Maltepe Ünversty Faculty of Engneerng Büyükbakkalköy-Istanbul

More information

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Poltecnco d Torno Porto Insttutonal Repostory [Artcle] A cost-effectve cloud computng framework for acceleratng multmeda communcaton smulatons Orgnal Ctaton: D. Angel, E. Masala (2012). A cost-effectve

More information

A powerful tool designed to enhance innovation and business performance

A powerful tool designed to enhance innovation and business performance A powerful tool desgned to enhance nnovaton and busness performance The LEGO Foundaton has taken over the responsblty for the LEGO SERIOUS PLAY method. Ths change wll help create the platform for the contnued

More information

BERNSTEIN POLYNOMIALS

BERNSTEIN POLYNOMIALS On-Lne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful

More information

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary

More information

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE Yu-L Huang Industral Engneerng Department New Mexco State Unversty Las Cruces, New Mexco 88003, U.S.A. Abstract Patent

More information

Survey on Virtual Machine Placement Techniques in Cloud Computing Environment

Survey on Virtual Machine Placement Techniques in Cloud Computing Environment Survey on Vrtual Machne Placement Technques n Cloud Computng Envronment Rajeev Kumar Gupta and R. K. Paterya Department of Computer Scence & Engneerng, MANIT, Bhopal, Inda ABSTRACT In tradtonal data center

More information

Parallel Numerical Simulation of Visual Neurons for Analysis of Optical Illusion

Parallel Numerical Simulation of Visual Neurons for Analysis of Optical Illusion 212 Thrd Internatonal Conference on Networkng and Computng Parallel Numercal Smulaton of Vsual Neurons for Analyss of Optcal Illuson Akra Egashra, Shunj Satoh, Hdetsugu Ire and Tsutomu Yoshnaga Graduate

More information

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,

More information

8 Algorithm for Binary Searching in Trees

8 Algorithm for Binary Searching in Trees 8 Algorthm for Bnary Searchng n Trees In ths secton we present our algorthm for bnary searchng n trees. A crucal observaton employed by the algorthm s that ths problem can be effcently solved when the

More information

Cloud-based Social Application Deployment using Local Processing and Global Distribution

Cloud-based Social Application Deployment using Local Processing and Global Distribution Cloud-based Socal Applcaton Deployment usng Local Processng and Global Dstrbuton Zh Wang *, Baochun L, Lfeng Sun *, and Shqang Yang * * Bejng Key Laboratory of Networked Multmeda Department of Computer

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture A Desgn Method of Hgh-avalablty and Low-optcal-loss Optcal Aggregaton Network Archtecture Takehro Sato, Kuntaka Ashzawa, Kazumasa Tokuhash, Dasuke Ish, Satoru Okamoto and Naoak Yamanaka Dept. of Informaton

More information

Adaptive Fractal Image Coding in the Frequency Domain

Adaptive Fractal Image Coding in the Frequency Domain PROCEEDINGS OF INTERNATIONAL WORKSHOP ON IMAGE PROCESSING: THEORY, METHODOLOGY, SYSTEMS AND APPLICATIONS 2-22 JUNE,1994 BUDAPEST,HUNGARY Adaptve Fractal Image Codng n the Frequency Doman K AI UWE BARTHEL

More information

Traffic State Estimation in the Traffic Management Center of Berlin

Traffic State Estimation in the Traffic Management Center of Berlin Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D-763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,

More information

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Study on Model of Risks Assessment of Standard Operation in Rural Power Network Study on Model of Rsks Assessment of Standard Operaton n Rural Power Network Qngj L 1, Tao Yang 2 1 Qngj L, College of Informaton and Electrcal Engneerng, Shenyang Agrculture Unversty, Shenyang 110866,

More information

1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP)

1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP) 6.3 / -- Communcaton Networks II (Görg) SS20 -- www.comnets.un-bremen.de Communcaton Networks II Contents. Fundamentals of probablty theory 2. Emergence of communcaton traffc 3. Stochastc & Markovan Processes

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

320 The Internatonal Arab Journal of Informaton Technology, Vol. 5, No. 3, July 2008 Comparsons Between Data Clusterng Algorthms Osama Abu Abbas Computer Scence Department, Yarmouk Unversty, Jordan Abstract:

More information

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm Document Clusterng Analyss Based on Hybrd PSO+K-means Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,

More information

PAS: A Packet Accounting System to Limit the Effects of DoS & DDoS. Debish Fesehaye & Klara Naherstedt University of Illinois-Urbana Champaign

PAS: A Packet Accounting System to Limit the Effects of DoS & DDoS. Debish Fesehaye & Klara Naherstedt University of Illinois-Urbana Champaign PAS: A Packet Accountng System to Lmt the Effects of DoS & DDoS Debsh Fesehaye & Klara Naherstedt Unversty of Illnos-Urbana Champagn DoS and DDoS DDoS attacks are ncreasng threats to our dgtal world. Exstng

More information

Credit Limit Optimization (CLO) for Credit Cards

Credit Limit Optimization (CLO) for Credit Cards Credt Lmt Optmzaton (CLO) for Credt Cards Vay S. Desa CSCC IX, Ednburgh September 8, 2005 Copyrght 2003, SAS Insttute Inc. All rghts reserved. SAS Propretary Agenda Background Tradtonal approaches to credt

More information

A Multi-Camera System on PC-Cluster for Real-time 3-D Tracking

A Multi-Camera System on PC-Cluster for Real-time 3-D Tracking The 23 rd Conference of the Mechancal Engneerng Network of Thaland November 4 7, 2009, Chang Ma A Mult-Camera System on PC-Cluster for Real-tme 3-D Trackng Vboon Sangveraphunsr*, Krtsana Uttamang, and

More information

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION NEURO-FUZZY INFERENE SYSTEM FOR E-OMMERE WEBSITE EVALUATION Huan Lu, School of Software, Harbn Unversty of Scence and Technology, Harbn, hna Faculty of Appled Mathematcs and omputer Scence, Belarusan State

More information

A Performance Analysis of View Maintenance Techniques for Data Warehouses

A Performance Analysis of View Maintenance Techniques for Data Warehouses A Performance Analyss of Vew Mantenance Technques for Data Warehouses Xng Wang Dell Computer Corporaton Round Roc, Texas Le Gruenwald The nversty of Olahoma School of Computer Scence orman, OK 739 Guangtao

More information

Performance Management and Evaluation Research to University Students

Performance Management and Evaluation Research to University Students 631 A publcaton of CHEMICAL ENGINEERING TRANSACTIONS VOL. 46, 2015 Guest Edtors: Peyu Ren, Yancang L, Hupng Song Copyrght 2015, AIDIC Servz S.r.l., ISBN 978-88-95608-37-2; ISSN 2283-9216 The Italan Assocaton

More information

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1 Send Orders for Reprnts to reprnts@benthamscence.ae The Open Cybernetcs & Systemcs Journal, 2014, 8, 115-121 115 Open Access A Load Balancng Strategy wth Bandwdth Constrant n Cloud Computng Jng Deng 1,*,

More information

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS Yunhong Xu, Faculty of Management and Economcs, Kunmng Unversty of Scence and Technology,

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

Using Mixture Covariance Matrices to Improve Face and Facial Expression Recognitions

Using Mixture Covariance Matrices to Improve Face and Facial Expression Recognitions Usng Mxture Covarance Matrces to Improve Face and Facal Expresson Recogntons Carlos E. homaz, Duncan F. Glles and Raul Q. Fetosa 2 Imperal College of Scence echnology and Medcne, Department of Computng,

More information

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University Tme Seres Analyss n Studes of AGN Varablty Bradley M. Peterson The Oho State Unversty 1 Lnear Correlaton Degree to whch two parameters are lnearly correlated can be expressed n terms of the lnear correlaton

More information

Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System

Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System Mnng Feature Importance: Applyng Evolutonary Algorthms wthn a Web-based Educatonal System Behrouz MINAEI-BIDGOLI 1, and Gerd KORTEMEYER 2, and Wllam F. PUNCH 1 1 Genetc Algorthms Research and Applcatons

More information

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement An Enhanced Super-Resoluton System wth Improved Image Regstraton, Automatc Image Selecton, and Image Enhancement Yu-Chuan Kuo ( ), Chen-Yu Chen ( ), and Chou-Shann Fuh ( ) Department of Computer Scence

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy Fnancal Tme Seres Analyss Patrck McSharry patrck@mcsharry.net www.mcsharry.net Trnty Term 2014 Mathematcal Insttute Unversty of Oxford Course outlne 1. Data analyss, probablty, correlatons, vsualsaton

More information

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) 2127472, Fax: (370-5) 276 1380, Email: info@teltonika.

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) 2127472, Fax: (370-5) 276 1380, Email: info@teltonika. VRT012 User s gude V0.1 Thank you for purchasng our product. We hope ths user-frendly devce wll be helpful n realsng your deas and brngng comfort to your lfe. Please take few mnutes to read ths manual

More information

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP Kun-chan Lan and Tsung-hsun Wu Natonal Cheng Kung Unversty klan@cse.ncku.edu.tw, ryan@cse.ncku.edu.tw ABSTRACT Voce over IP (VoIP) s one of

More information

Properties of Indoor Received Signal Strength for WLAN Location Fingerprinting

Properties of Indoor Received Signal Strength for WLAN Location Fingerprinting Propertes of Indoor Receved Sgnal Strength for WLAN Locaton Fngerprntng Kamol Kaemarungs and Prashant Krshnamurthy Telecommuncatons Program, School of Informaton Scences, Unversty of Pttsburgh E-mal: kakst2,prashk@ptt.edu

More information

Review of Hierarchical Models for Data Clustering and Visualization

Review of Hierarchical Models for Data Clustering and Visualization Revew of Herarchcal Models for Data Clusterng and Vsualzaton Lola Vcente & Alfredo Velldo Grup de Soft Computng Seccó d Intel lgènca Artfcal Departament de Llenguatges Sstemes Informàtcs Unverstat Poltècnca

More information

IWFMS: An Internal Workflow Management System/Optimizer for Hadoop

IWFMS: An Internal Workflow Management System/Optimizer for Hadoop IWFMS: An Internal Workflow Management System/Optmzer for Hadoop Lan Lu, Yao Shen Department of Computer Scence and Engneerng Shangha JaoTong Unversty Shangha, Chna lustrve@gmal.com, yshen@cs.sjtu.edu.cn

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS 21 22 September 2007, BULGARIA 119 Proceedngs of the Internatonal Conference on Informaton Technologes (InfoTech-2007) 21 st 22 nd September 2007, Bulgara vol. 2 INVESTIGATION OF VEHICULAR USERS FAIRNESS

More information

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part

More information

Conversion between the vector and raster data structures using Fuzzy Geographical Entities

Conversion between the vector and raster data structures using Fuzzy Geographical Entities Converson between the vector and raster data structures usng Fuzzy Geographcal Enttes Cdála Fonte Department of Mathematcs Faculty of Scences and Technology Unversty of Combra, Apartado 38, 3 454 Combra,

More information

Dynamic Resource Allocation and Power Management in Virtualized Data Centers

Dynamic Resource Allocation and Power Management in Virtualized Data Centers Dynamc Resource Allocaton and Power Management n Vrtualzed Data Centers Rahul Urgaonkar, Ulas C. Kozat, Ken Igarash, Mchael J. Neely urgaonka@usc.edu, {kozat, garash}@docomolabs-usa.com, mjneely@usc.edu

More information

Disagreement-Based Multi-System Tracking

Disagreement-Based Multi-System Tracking Dsagreement-Based Mult-System Trackng Quannan L 1, Xnggang Wang 2, We Wang 3, Yuan Jang 3, Zh-Hua Zhou 3, Zhuowen Tu 1 1 Lab of Neuro Imagng, Unversty of Calforna, Los Angeles 2 Huazhong Unversty of Scence

More information

Effective Network Defense Strategies against Malicious Attacks with Various Defense Mechanisms under Quality of Service Constraints

Effective Network Defense Strategies against Malicious Attacks with Various Defense Mechanisms under Quality of Service Constraints Effectve Network Defense Strateges aganst Malcous Attacks wth Varous Defense Mechansms under Qualty of Servce Constrants Frank Yeong-Sung Ln Department of Informaton Natonal Tawan Unversty Tape, Tawan,

More information