Understanding Convolutional Neural Networks

Fakultät für Mathematik, Informatik und Naturwissenschaften
Lehr- und Forschungsgebiet Informatik VIII
Computer Vision
Prof. Dr. Bastian Leibe

Seminar Report

Understanding Convolutional Neural Networks

David Stutz
Matriculation Number: ######

August 30, 2014

Advisor: Lucas Beyer

Abstract

This seminar paper focusses on convolutional neural networks and a visualization technique allowing further insights into their internal operation. After giving a brief introduction to neural networks and the multilayer perceptron, we review both supervised and unsupervised training of neural networks in detail. In addition, we discuss several approaches to regularization. The second section introduces the different types of layers present in recent convolutional neural networks. Based on these basic building blocks, we discuss the architecture of the traditional convolutional neural network as proposed by LeCun et al. [LBD+89] as well as the architecture of recent implementations. The third section focusses on a technique to visualize feature activations of higher layers by backprojecting them to the image plane. This makes it possible to gain deeper insights into the internal workings of convolutional neural networks such that recent architectures can be evaluated and improved even further.

Contents

1 Motivation 4
  1.1 Bibliographical Notes 4
2 Neural Networks and Deep Learning 5
  2.1 Multilayer Perceptrons 5
  2.2 Activation Functions 6
  2.3 Supervised Training 7
    2.3.1 Error Measures 8
    2.3.2 Training Protocols 8
    2.3.3 Parameter Optimization 8
    2.3.4 Weight Initialization 9
    2.3.5 Error Backpropagation 10
  2.4 Unsupervised Training 10
    2.4.1 Auto-Encoders 10
    2.4.2 Layer-Wise Training 11
  2.5 Regularization 11
    2.5.1 L_p-Regularization 11
    2.5.2 Early Stopping 12
    2.5.3 Dropout 12
    2.5.4 Weight Sharing 12
    2.5.5 Unsupervised Pre-Training 12
3 Convolutional Neural Networks 13
  3.1 Convolution 13
  3.2 Layers 13
    3.2.1 Convolutional Layer 13
    3.2.2 Non-Linearity Layer 14
    3.2.3 Rectification 15
    3.2.4 Local Contrast Normalization Layer 15
    3.2.5 Feature Pooling and Subsampling Layer 15
    3.2.6 Fully Connected Layer 16
  3.3 Architectures 16
    3.3.1 Traditional Convolutional Neural Network 16
    3.3.2 Modern Convolutional Neural Networks 17
4 Understanding Convolutional Neural Networks 18
  4.1 Deconvolutional Neural Networks 18
    4.1.1 Deconvolutional Layer 18
    4.1.2 Unsupervised Training 19
  4.2 Visualizing Convolutional Neural Networks 19
    4.2.1 Pooling Layers 19
    4.2.2 Rectification Layers 20
  4.3 Convolutional Neural Network Visualization 20
    4.3.1 Filters and Features 20
    4.3.2 Architecture Evaluation 20
5 Conclusion 23

1 Motivation

Artificial neural networks are motivated by the learning capabilities of the human brain which consists of neurons interconnected by synapses. In fact, at least theoretically, they are able to learn any given mapping up to arbitrary accuracy [HSW89]. In addition, they allow us to easily incorporate prior knowledge about the task into the network architecture. As a result, in 1989, LeCun et al. introduced convolutional neural networks for application in computer vision [LBD+89]. Convolutional neural networks use images directly as input. Instead of handcrafted features, convolutional neural networks are used to automatically learn a hierarchy of features which can then be used for classification purposes. This is accomplished by successively convolving the input image with learned filters to build up a hierarchy of feature maps. The hierarchical approach allows the network to learn more complex, as well as translation and distortion invariant, features in higher layers. In contrast to traditional multilayer perceptrons, where deep learning is considered difficult [Ben09], deep convolutional neural networks can be trained more easily using traditional methods[1]. This property is due to the constrained architecture of convolutional neural networks which is specific to input for which discrete convolution is defined, such as images. Nevertheless, deep learning of convolutional neural networks is an active area of research, as well. As with multilayer perceptrons, convolutional neural networks still have some disadvantages when compared to other popular machine learning techniques, as for example Support Vector Machines, as their internal operation is not well understood [ZF13]. Using the deconvolutional neural networks proposed in [ZKTF10], this problem is addressed in [ZF13]. The approach described in [ZF13] allows the visualization of feature activations in higher layers of the network and can be used to give further insights into the internal operation of convolutional neural networks.

1.1 Bibliographical Notes

Although this paper briefly introduces the basic notions of neural networks as well as network training, this topic is far too extensive to be covered in detail. For a detailed discussion of neural networks and their training several textbooks are available [Bis95, Bis06, Hay05]. The convolutional neural network was originally proposed in [LBD+89] for the task of ZIP code recognition. Both convolutional neural networks as well as traditional multilayer perceptrons were extensively applied to character recognition and handwritten digit recognition [LBBH98]. Training was initially based on error backpropagation [RHW86] and gradient descent. The original convolutional neural network is based on weight sharing[2] which was proposed in [RHW86]. An extension of weight sharing called soft weight sharing is discussed in [NH92]. Recent implementations make use of other regularization techniques, as for example dropout [HSK+12]. Although the work by Hinton et al. in 2006 [HO06] can be considered a breakthrough in deep learning, as it allows unsupervised training of neural networks, deep learning is still considered difficult [Ben09]. A thorough discussion of deep learning including recent research is given in [Ben09] as well as [LBLL09, GB10, BL07]. Additional research on this topic includes discussion of activation functions as well as the effect of unsupervised pre-training [EMB+09, EBC+10, GBB11]. Recent architectural changes of convolutional neural networks are discussed in detail in [JKRL09] and [LKF10]. Recent success of convolutional neural networks is reported in [KSH12] and [CMS12]. This paper is mainly motivated by the experiments in [ZF13]. Based on deconvolutional neural networks [ZKTF10], the authors of [ZF13] propose a visualization technique allowing to visualize feature activations of higher layers.

[1] Here, traditional methods refers to gradient descent for parameter optimization combined with error backpropagation as discussed in section 2.3.
[2] Using weight sharing as discussed in section 2.5.4, the actual model complexity is reduced.

Figure 1: A processing unit consists of a propagation rule mapping all inputs w_0, x_1, ..., x_D to the actual input z, and an activation function f which is applied on the actual input to form the output y = f(z). Here, w_0 represents an external input called bias and x_1, ..., x_D are inputs from other units of the network. In a network graph, each unit is labeled according to its output. Therefore, to include the bias w_0 as well, a dummy unit (see section 2.1) with value 1 is included.

2 Neural Networks and Deep Learning

An (artificial) neural network comprises a set of interconnected processing units [Bis95, p. 80-81]. Given input values w_0, x_1, ..., x_D, where w_0 represents an external input and x_1, ..., x_D are inputs originating from other processing units within the network, a processing unit computes its output as y = f(z). Here, f is called activation function and z is obtained by applying a propagation rule which maps all the inputs to the actual input z. This model of a single processing unit includes the definition of a neuron in [Hay05], where instead of a propagation rule an adder is used to compute z as the weighted sum of all inputs. Neural networks can be visualized by means of a directed graph[3] called network graph [Bis95, p. 117-120]. Each unit is represented by a node labeled according to its output and the units are interconnected by directed edges. For a single processing unit this is illustrated in figure 1 where the external input w_0 is only added for illustration purposes and is usually omitted [Bis95, p. 116-120]. For convenience, we distinguish input units and output units. An input unit computes the output y := x where x is the single input value of the unit. Output units may accept an arbitrary number of input values. Altogether, the network represents a function y(x) whose dimensions are fixed by the number of input units and output units; this means the input of the network is accepted by the input units and the output units form the output of the network.

2.1 Multilayer Perceptrons

A (L+1)-layer perceptron, illustrated in figure 2, consists of D input units, C output units, and several so-called hidden units. The units are arranged in layers, that is a multilayer perceptron comprises an input layer, an output layer and L hidden layers[4] [Bis95, p. 117-120]. The i-th unit within layer l computes the output

    y_i^{(l)} = f(z_i^{(l)})  with  z_i^{(l)} = \sum_{k=1}^{m^{(l-1)}} w_{i,k}^{(l)} y_k^{(l-1)} + w_{i,0}^{(l)}    (1)

where w_{i,k}^{(l)} denotes the weighted connection from the k-th unit in layer (l-1) to the i-th unit in layer l, and w_{i,0}^{(l)} can be regarded as external input to the unit and is referred to as bias. Here, m^{(l)} denotes the number of units in layer l, such that D = m^{(0)} and C = m^{(L+1)}. For simplicity, the bias can be regarded as weight when introducing a dummy unit y_0^{(l)} := 1 in each layer:

    z_i^{(l)} = \sum_{k=0}^{m^{(l-1)}} w_{i,k}^{(l)} y_k^{(l-1)}  or  z^{(l)} = w^{(l)} y^{(l-1)}    (2)

where z^{(l)}, w^{(l)} and y^{(l-1)} denote the corresponding vector and matrix representations of the actual inputs z_i^{(l)}, the weights w_{i,k}^{(l)} and the outputs y_k^{(l-1)}, respectively.

[3] In its most general form, a directed graph is an ordered pair G = (V, E) where V is a set of nodes and E a set of edges connecting the nodes: (u, v) ∈ E means that a directed edge from node u to v exists within the graph. In a network graph, given two units u and v, a directed edge from u to v means that the output of unit u is used by unit v as input.
[4] Actually, a (L+1)-layer perceptron consists of (L+2) layers including the input layer. However, as stated in [Bis06], the input layer is not counted as there is no real processing taking place (input units compute the identity).
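As an illustration of equations (1) and (2), the forward computation of a single layer can be sketched in NumPy. This is a minimal sketch under our own conventions, not part of the paper: the function name is ours, and we store the bias w_{i,0} in column 0 of the weight matrix, matching the dummy-unit convention y_0^{(l-1)} := 1.

```python
import numpy as np

def layer_forward(W, y_prev, f):
    """Compute the outputs of layer l from the outputs of layer l-1.

    W has shape (m_l, m_{l-1} + 1); column 0 holds the biases w_{i,0},
    so prepending a dummy unit with value 1 realizes equation (2)."""
    y_aug = np.concatenate(([1.0], y_prev))  # dummy unit y_0 := 1
    z = W @ y_aug                            # actual inputs z_i^{(l)}
    return f(z)                              # activation applied point-wise

# Example: one layer with 2 units and 3 inputs, tanh activations.
W = np.array([[0.1, 0.5, -0.3, 0.2],
              [0.0, -0.2, 0.4, 0.1]])
y = layer_forward(W, np.array([1.0, 2.0, -1.0]), np.tanh)
```

Stacking such calls, one per layer, yields the full multilayer perceptron of figure 2.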

Figure 2: Network graph of a (L+1)-layer perceptron with D input units and C output units. The l-th hidden layer contains m^{(l)} hidden units.

Overall, a multilayer perceptron represents a function

    y(·, w) : R^D → R^C,  x ↦ y(x, w)    (3)

where the output vector y(x, w) comprises the output values y_i(x, w) := y_i^{(L+1)} and w is the vector of all weights within the network. We speak of deep neural networks when there are more than three hidden layers present [Ben09]. The training of deep neural networks, referred to as deep learning, is considered especially challenging [Ben09].

2.2 Activation Functions

In [Hay05, p. 34-37], three types of activation functions are discussed: threshold functions, piecewise-linear functions and sigmoid functions. A common threshold function is given by the Heaviside function:

    h(z) = 1 if z ≥ 0,  h(z) = 0 if z < 0    (4)

However, both threshold functions as well as piecewise-linear functions have some drawbacks. First, for network training we may need the activation function to be differentiable. Second, nonlinear activation functions are preferable due to the additional computational power they induce [DHS01, HSW89]. The most commonly used type of activation functions are sigmoid functions. As an example, the logistic sigmoid is given by

    σ(z) = 1 / (1 + exp(−z))    (5)

Its graph is s-shaped and it is differentiable as well as monotonic. The hyperbolic tangent tanh(z) can be regarded as a linear transformation of the logistic sigmoid onto the interval [−1, 1]. Note that both activation functions are saturating [DHS01]. When using neural networks for classification[5], the softmax activation function for output units is used to interpret the output values as posterior probabilities[6]. Then the output of the i-th unit in the output layer is

[5] The classification task can be stated as follows: Given an input vector x of D dimensions, the goal is to assign x to one of C discrete classes [Bis06].
[6] The outputs y_i^{(L+1)}, 1 ≤ i ≤ C, can be interpreted as probabilities as they lie in the interval [0, 1] and sum to 1 [Bis06].

Figure 3: Commonly used activation functions include the logistic sigmoid σ(z) defined in equation (5) and the hyperbolic tangent tanh(z). More recently used activation functions are the softsign of equation (7) and the rectified hyperbolic tangent. (a) Logistic sigmoid activation function. (b) Hyperbolic tangent activation function. (c) Softsign activation function. (d) Rectified hyperbolic tangent activation function.

given by

    y_i^{(L+1)} = σ(z^{(L+1)}, i) = exp(z_i^{(L+1)}) / \sum_{k=1}^{C} exp(z_k^{(L+1)})    (6)

Experiments in [GB10] show that the logistic sigmoid as well as the hyperbolic tangent perform rather poorly in deep learning. Better performance is reported using the softsign activation function:

    s(z) = z / (1 + |z|)    (7)

In [KSH12] a non-saturating activation function is used:

    r(z) = max(0, z)    (8)

Hidden units using the activation function in equation (8) are called rectified linear units[7]. Furthermore, in [JKRL09], rectification in addition to the hyperbolic tangent activation function is reported to give good results. Some of the above activation functions are shown in figure 3.

2.3 Supervised Training

Supervised training is the problem of determining the network weights to approximate a specific target mapping g. In practice, g may be unknown such that the mapping is given by a set of training data. The training set

    T_S := {(x_n, t_n) : 1 ≤ n ≤ N}    (9)

comprises both input values x_n and corresponding desired, possibly noisy, output values t_n ≈ g(x_n) [Hay05].

[7] Also abbreviated as ReLUs.
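The activation functions of equations (5) to (8) are one-liners in NumPy; the following sketch (function names are ours) also shifts the softmax of equation (6) by the maximum input, a standard trick for numerical stability that does not change the result.

```python
import numpy as np

def sigma(z):
    """Logistic sigmoid, equation (5)."""
    return 1.0 / (1.0 + np.exp(-z))

def softsign(z):
    """Softsign, equation (7)."""
    return z / (1.0 + np.abs(z))

def relu(z):
    """Rectified linear unit, equation (8)."""
    return np.maximum(0.0, z)

def softmax(z):
    """Softmax over the output units, equation (6)."""
    e = np.exp(z - np.max(z))  # shifting does not change the ratio
    return e / e.sum()

p = softmax(np.array([1.0, 2.0, 3.0]))  # posterior probabilities, sum to 1
```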

2.3.1 Error Measures

Training is accomplished by adjusting the weights w of the neural network to minimize a chosen objective function which can be interpreted as error measure between network output y(x_n) and desired target output t_n. Popular choices for classification include the sum-of-squared error measure given by

    E(w) = \sum_{n=1}^{N} E_n(w) = \sum_{n=1}^{N} \sum_{k=1}^{C} (y_k(x_n, w) − t_{n,k})^2    (10)

and the cross-entropy error measure given by

    E(w) = \sum_{n=1}^{N} E_n(w) = − \sum_{n=1}^{N} \sum_{k=1}^{C} t_{n,k} log(y_k(x_n, w))    (11)

where t_{n,k} is the k-th entry of the target value t_n. Details on the choice of error measure and their properties can be found in [Bis95].

2.3.2 Training Protocols

[DHS01] considers three training protocols:

Stochastic training  An input value is chosen at random and the network weights are updated based on the error E_n(w).
Batch training  All input values are processed and the weights are updated based on the overall error E(w) = \sum_{n=1}^{N} E_n(w).
Online training  Every input value is processed only once and the weights are updated using the error E_n(w).

Further discussion of these protocols can be found in [Bis06] and [DHS01]. A common practice (e.g. used for experiments in [GBB11], [GB10]) combines stochastic training and batch training:

Mini-batch training  A random subset M ⊆ {1, ..., N} (mini-batch) of the training set is processed and the weights are updated based on the cumulative error E_M(w) := \sum_{n ∈ M} E_n(w).

2.3.3 Parameter Optimization

Considering stochastic training we seek to minimize E_n with respect to the network weights w. The necessary criterion can be written as

    ∂E_n/∂w = ∇E_n(w) \overset{!}{=} 0    (12)

where ∇E_n is the gradient of the error E_n. Due to the complexity of the error E_n, a closed-form solution is usually not possible and we use an iterative approach. Let w[t] denote the weight vector in the t-th iteration. In each iteration we compute a weight update Δw[t] and update the weights accordingly [Bis06, p. 236-237]:

    w[t + 1] = w[t] + Δw[t]    (13)

From unconstrained optimization we have several optimization techniques available. Gradient descent is a first-order method, this means it uses only information of the first derivative of E_n and can, thus, be used in combination with error backpropagation as described in section 2.3.5, whereas Newton's method is a second-order method and needs to evaluate the Hessian matrix H_n of E_n[8] (or an appropriate approximation of the Hessian matrix) in each iteration step.

[8] The Hessian matrix H_n of the error E_n is the matrix of second-order partial derivatives: (H_n)_{r,s} = ∂²E_n / (∂w_r ∂w_s).
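The two error measures of equations (10) and (11) can be sketched directly in NumPy (a minimal illustration with names of our choosing; Y and T hold one row per training sample, as in the sums over n and k above):

```python
import numpy as np

def sum_of_squares(Y, T):
    """Sum-of-squared error over all samples and outputs, equation (10)."""
    return np.sum((Y - T) ** 2)

def cross_entropy(Y, T):
    """Cross-entropy error, equation (11); Y holds posterior probabilities."""
    return -np.sum(T * np.log(Y))

# One sample, two classes; the target is class 1 in one-hot encoding.
Y = np.array([[0.8, 0.2]])
T = np.array([[1.0, 0.0]])
e_sq = sum_of_squares(Y, T)
e_ce = cross_entropy(Y, T)
```

Note that the cross-entropy only penalizes the probability assigned to the correct class: here it reduces to −log(0.8).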

Figure 4: Illustrated using a quadratic function to minimize, the idea of gradient descent is to follow the negative gradient at the current position as it describes the direction of the steepest descent. The learning rate γ describes the step size taken in each iteration step. Therefore, gradient descent describes a first-order optimization technique.

Gradient descent  Gradient descent is motivated by the idea to take a step in the direction of the steepest descent, that is the direction of the negative gradient, to reach a minimum [Bis95, p. 263-267]. This principle is illustrated by figure 4. Therefore, the weight update is given by

    Δw[t] = −γ ∂E_n/∂w[t] = −γ ∇E_n(w[t])    (14)

where γ is the learning rate. As discussed in [Bis06], this approach has several difficulties, for example how to choose the learning rate to get fast learning but at the same time avoid oscillation[9].

Newton's method  Although there are some extensions of gradient descent available, second-order methods promise faster convergence because of the use of second-order information [BL89]. When using Newton's method, the weight update Δw[t] is given by

    Δw[t] = −γ (∂²E_n/∂w[t]²)^{−1} ∂E_n/∂w[t] = −γ H_n(w[t])^{−1} ∇E_n(w[t])    (15)

where H_n(w[t]) is the Hessian matrix of E_n and γ describes the learning rate. The drawback of this method is the evaluation and inversion of the Hessian matrix[10], which is computationally expensive [BL89].

2.3.4 Weight Initialization

As we use an iterative optimization technique, the initialization of the weights w is crucial. [DHS01] suggest choosing the weights randomly in the range

    −1/\sqrt{m^{(l−1)}} < w_{i,j}^{(l)} < 1/\sqrt{m^{(l−1)}}    (16)

This result is based on the assumption that the inputs of each unit are distributed according to a Gaussian distribution and ensures that the actual input is approximately of unity order. Given logistic sigmoid activation functions, this is meant to result in optimal learning [DHS01]. In [GB10] an alternative initialization scheme called normalized initialization is introduced. We choose the weights randomly in the range

    −\sqrt{6}/\sqrt{m^{(l−1)} + m^{(l)}} < w_{i,j}^{(l)} < \sqrt{6}/\sqrt{m^{(l−1)} + m^{(l)}}    (17)

The derivation of this initialization scheme can be found in [GB10]. Experimental results in [GB10] demonstrate improved learning when using normalized initialization. An alternative to these weight initialization schemes is given by layer-wise unsupervised pre-training as discussed in [EBC+10]. We discuss unsupervised training in section 2.4.

[9] Oscillation occurs if the learning rate is chosen too large such that the algorithm successively oversteps the minimum.
[10] An algorithm to evaluate the Hessian matrix based on error backpropagation as introduced in section 2.3.5 can be found in [Bis92]. The inversion of an n × n matrix has complexity O(n³) when using the LU decomposition or similar techniques.
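The gradient descent update of equation (14) and the normalized initialization of equation (17) fit in a few lines of NumPy. This is an illustrative sketch (function names are ours); the toy objective E(w) = w² with gradient 2w stands in for E_n, whose gradient would in practice come from error backpropagation.

```python
import numpy as np

def gradient_descent_step(w, grad, gamma):
    """One update w[t+1] = w[t] - gamma * grad E_n(w[t]), equations (13)-(14)."""
    return w - gamma * grad

def normalized_init(m_prev, m_cur, rng):
    """Normalized initialization of [GB10], equation (17):
    uniform weights in (-sqrt(6)/sqrt(m_prev + m_cur), +sqrt(6)/sqrt(...))."""
    bound = np.sqrt(6.0) / np.sqrt(m_prev + m_cur)
    return rng.uniform(-bound, bound, size=(m_cur, m_prev))

rng = np.random.default_rng(0)
W = normalized_init(100, 50, rng)   # weights of a 100 -> 50 layer

# Minimizing E(w) = w^2 (gradient 2w) from w = 5 drives w towards 0.
w = 5.0
for _ in range(100):
    w = gradient_descent_step(w, 2.0 * w, gamma=0.1)
```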

2.3.5 Error Backpropagation

Algorithm 1, proposed in [RHW86], is used to evaluate the gradient ∇E_n(w[t]) of the error function E_n in each iteration step. More details as well as a thorough derivation of the algorithm can be found in [Bis95] or [RHW86].

Algorithm 1 (Error Backpropagation)
1. Propagate the input value x_n through the network to get the actual input and output of each unit.
2. Calculate the so-called errors δ_i^{(L+1)} for the output units [Bis06, p. 241-245]:

    δ_i^{(L+1)} := ∂E_n/∂y_i^{(L+1)} f'(z_i^{(L+1)})    (18)

3. Determine δ_i^{(l)} for all hidden layers l by using error backpropagation:

    δ_i^{(l)} := f'(z_i^{(l)}) \sum_{k=1}^{m^{(l+1)}} w_{k,i}^{(l+1)} δ_k^{(l+1)}    (19)

4. Calculate the required derivatives:

    ∂E_n/∂w_{i,j}^{(l)} = δ_i^{(l)} y_j^{(l−1)}    (20)

2.4 Unsupervised Training

In unsupervised training, given a training set

    T_U := {x_n : 1 ≤ n ≤ N}    (21)

without desired target values, the network has to find similarities and regularities within the data by itself. Among others, unsupervised training of deep architectures can be accomplished based on Restricted Boltzmann Machines or auto-encoders[11] [Ben09]. We focus on auto-encoders.

2.4.1 Auto-Encoders

Auto-encoders, also called auto-associators [Ben09], are two-layer perceptrons with the goal to compute a representation of the input in the first layer from which the input can accurately be reconstructed in the output layer. Therefore, no desired target values are needed; auto-encoders are self-supervised [Ben09]. In the hidden layer, consisting of m := m^{(1)} units, an auto-encoder computes a representation c(x) from the input x [Ben09]:

    c_i(x) = \sum_{k=0}^{D} w_{i,k}^{(1)} x_k    (22)

The output layer tries to reconstruct the input from the representation given by c(x):

    x̂_i = d_i(c(x)) = \sum_{k=0}^{m} w_{i,k}^{(2)} c_k(x)    (23)

As the output of an auto-encoder should resemble its input, it can be trained as discussed in section 2.3 by replacing the desired target values t_n used in the error measure by the input x_n. In the case where m < D, the auto-encoder is expected to compute a useful, dimensionality-reducing representation of the input. If m ≥ D, the auto-encoder could just learn the identity such that x̂ would be a perfect reconstruction of x. However, as discussed in [Ben09], in practice this is not a problem.

[11] A brief introduction to Restricted Boltzmann Machines can be found in [Ben09].
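Equations (18) to (20) can be sketched for a small two-layer network and checked against a numerical gradient. This is our own minimal illustration (biases omitted, linear output units so that f'(z) = 1 in equation (18), and E_n = ½‖y − t‖² so that ∂E_n/∂y = y − t); it is not the paper's implementation.

```python
import numpy as np

def forward(W1, W2, x):
    """Step 1: propagate x through the network (tanh hidden, linear output)."""
    z1 = W1 @ x
    y1 = np.tanh(z1)
    y = W2 @ y1
    return z1, y1, y

def backprop(W1, W2, x, t):
    """Steps 2-4: errors via equations (18)-(19), derivatives via (20)."""
    z1, y1, y = forward(W1, W2, x)
    delta2 = y - t                               # equation (18), f'(z) = 1
    delta1 = (1.0 - y1 ** 2) * (W2.T @ delta2)   # equation (19), tanh' = 1 - tanh^2
    return np.outer(delta1, x), np.outer(delta2, y1)  # equation (20)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 2))
W2 = rng.normal(size=(2, 3))
x = rng.normal(size=2)
t = rng.normal(size=2)
g1, g2 = backprop(W1, W2, x, t)
```

The gradients agree with central finite differences of E_n, which is a standard way to verify a backpropagation implementation.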

Figure 5: An auto-encoder is mainly a two-layer perceptron with m := m^{(1)} hidden units and the goal to compute a representation c(x) in the first layer from which the input can accurately be reconstructed in the output layer.

2.4.2 Layer-Wise Training

As discussed in [LBLL09], the layers of a neural network can be trained in an unsupervised fashion using the following scheme:

For each layer l = 1, ..., L + 1: train layer l using the approach discussed above, taking the output of layer (l − 1) as input, associating the output of layer l with the representation c(y^{(l−1)}) and adding an additional layer to compute ŷ^{(l)}.

2.5 Regularization

It has been shown that multilayer perceptrons with at least one hidden layer can approximate any target mapping up to arbitrary accuracy [HSW89]. Thus, the training data may be overfitted, that is the training error may be very low on the training set but high on unseen data [Ben09]. Regularization describes the task to avoid overfitting to give better generalization performance, meaning that the trained network should also perform well on unseen data [Hay05]. Therefore, the training set is usually split up into an actual training set and a validation set. The neural network is then trained using the new training set and its generalization performance is evaluated on the validation set [DHS01]. There are different methods to perform regularization. Often, the training set is augmented to introduce certain invariances the network is expected to learn [KSH12]. Other methods add a regularization term to the error measure aiming to control the complexity and form of the solution [Bis95]:

    Ê_n(w) = E_n(w) + η P(w)    (24)

where P(w) influences the form of the solution and η is a balancing parameter.

2.5.1 L_p-Regularization

A popular example of L_p-regularization is the L_2-regularization[12]:

    P(w) = ‖w‖_2^2 = w^T w    (25)

The idea is to penalize large weights as they tend to result in overfitting [Bis95]. In general, arbitrary p can be used to perform L_p-regularization. Another example sets p = 1 to enforce sparsity of the weights[13], that is many of the weights should vanish:

    P(w) = ‖w‖_1    (26)

[12] The L_2-regularization is often referred to as weight decay, see [Bis95] for details.
[13] For p = 1, the norm is defined by ‖w‖_1 = \sum_{k=1}^{W} |w_k| where W is the dimension of the weight vector w.
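The penalties of equations (25) and (26) and the augmented error of equation (24) can be sketched as follows (a minimal NumPy illustration; function names are ours):

```python
import numpy as np

def l2_penalty(w):
    """P(w) = ||w||_2^2 = w^T w, equation (25)."""
    return w @ w

def l1_penalty(w):
    """P(w) = ||w||_1, equation (26); encourages sparse weights."""
    return np.sum(np.abs(w))

def regularized_error(E_n, w, eta, penalty):
    """Augmented error of equation (24): E_n(w) + eta * P(w)."""
    return E_n + eta * penalty(w)

w = np.array([3.0, -4.0])
e_hat = regularized_error(1.0, w, eta=0.1, penalty=l2_penalty)
```

During gradient descent the L_2 term contributes 2ηw to the gradient, which shrinks every weight towards zero in each step; hence the name weight decay.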

2.5.2 Early Stopping

While the error on the training set tends to decrease with the number of iterations, the error on the validation set usually starts to rise again once the network starts to overfit the training set. To avoid overfitting, training can be stopped as soon as the error on the validation set reaches a minimum, that is before the error on the validation set rises again [Bis95]. This method is called early stopping.

2.5.3 Dropout

In [HSK+12] another regularization technique, based on observation of the human brain, is proposed. Whenever the neural network is given a training sample, each hidden unit is skipped with probability 1/2. This method can be interpreted in different ways [HSK+12]. First, units cannot rely on the presence of other units. Second, this method leads to the training of multiple different networks simultaneously. Thus, dropout can be interpreted as model averaging[14].

2.5.4 Weight Sharing

The idea of weight sharing was introduced in [RHW86] in the context of the T-C problem[15]. Weight sharing describes the idea of different units within the same layer using identical weights. This can be interpreted as a regularization method as the complexity of the network is reduced and prior knowledge may be incorporated into the network architecture. The equality constraint is replaced when using soft weight sharing, introduced in [NH92]. Here, a set of weights is encouraged not to have the same weight value but similar weight values. Details can be found in [NH92] and [Bis95]. When using weight sharing, error backpropagation can be applied as usual, however, equation (20) changes to

    ∂E_n/∂w_{i,j}^{(l)} = \sum_{k=1}^{m^{(l)}} δ_k^{(l)} y_j^{(l−1)}    (27)

when assuming that all units in layer l share the same set of weights, that is w_{i,j}^{(l)} = w_{k,j}^{(l)} for i, k = 1, ..., m^{(l)}. Nevertheless, equation (20) still needs to be applied in the case that the errors need to be propagated to preceding layers [Bis06].

2.5.5 Unsupervised Pre-Training

Results in [EBC+10] suggest that layer-wise unsupervised pre-training of deep neural networks can be interpreted as regularization technique[16]. Layer-wise unsupervised pre-training can be accomplished using a similar scheme as discussed in section 2.4.2:

1. For each l = 1, ..., L + 1: train layer l using the approach discussed in section 2.4.1.
2. Fine-tune the weights using supervised training as discussed in section 2.3.

A formulation of the effect of unsupervised pre-training as regularization method is proposed in [EMB+09]: The regularization term punishes weights outside a specific region in weight space with an infinite penalty such that

    P(w) = − log(p(w))    (28)

where p(w) is the prior for the weights, which is zero for weights outside this specific region [EBC+10].

[14] Model averaging tries to reduce the error by averaging the prediction of different models [HSK+12].
[15] The T-C problem describes the task of classifying images into those containing a T and those containing a C independent of position and rotation [RHW86].
[16] Another interpretation of unsupervised pre-training is that it initializes the weights in the basin of a good local minimum and can therefore be interpreted as optimization aid [Ben09].
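The dropout scheme of section 2.5.3 amounts to multiplying the hidden activations by a random binary mask. A minimal sketch (our own illustration, not the implementation of [HSK+12]):

```python
import numpy as np

def dropout(y, rng, p=0.5):
    """Skip each hidden unit with probability p: zero its output for this
    training sample and return the surviving activations plus the mask."""
    mask = rng.random(y.shape) >= p
    return y * mask, mask

rng = np.random.default_rng(0)
y = np.ones(10000)                 # hidden activations of one large layer
y_dropped, mask = dropout(y, rng)  # roughly half the units survive
```

A fresh mask is drawn for every training sample, so each sample effectively trains a different thinned network, which is the model-averaging view mentioned above.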

3 Convolutional Neural Networks

Although neural networks can be applied to computer vision tasks, to get good generalization performance, it is beneficial to incorporate prior knowledge into the network architecture [LeC89]. Convolutional neural networks aim to use spatial information between the pixels of an image. Therefore, they are based on discrete convolution. After introducing discrete convolution, we discuss the basic components of convolutional neural networks as described in [JKRL09] and [LKF10].

3.1 Convolution

For simplicity we assume a grayscale image to be defined by a function

    I : {1, ..., n_1} × {1, ..., n_2} → W ⊆ R,  (i, j) ↦ I_{i,j}    (29)

such that the image I can be represented by an array of size n_1 × n_2[17]. Given the filter K ∈ R^{(2h_1+1) × (2h_2+1)}, the discrete convolution of the image I with filter K is given by

    (I ∗ K)_{r,s} := \sum_{u=−h_1}^{h_1} \sum_{v=−h_2}^{h_2} K_{u,v} I_{r+u,s+v}    (30)

where the filter K is given by

    K = \begin{pmatrix} K_{−h_1,−h_2} & \cdots & K_{−h_1,h_2} \\ \vdots & K_{0,0} & \vdots \\ K_{h_1,−h_2} & \cdots & K_{h_1,h_2} \end{pmatrix}    (31)

Note that the behavior of this operation towards the borders of the image needs to be defined properly[18]. A commonly used filter for smoothing is the discrete Gaussian filter K_{G(σ)} [FP02] which is defined by

    (K_{G(σ)})_{r,s} = \frac{1}{2πσ^2} \exp\left(−\frac{r^2 + s^2}{2σ^2}\right)    (32)

where σ is the standard deviation of the Gaussian distribution [FP02].

3.2 Layers

We follow [JKRL09] and introduce the different types of layers used in convolutional neural networks. Based on these layers, complex architectures as used for classification in [CMS12] and [KSH12] can be built by stacking multiple layers.

3.2.1 Convolutional Layer

Let layer l be a convolutional layer. Then, the input of layer l comprises m_1^{(l−1)} feature maps from the previous layer, each of size m_2^{(l−1)} × m_3^{(l−1)}. In the case where l = 1, the input is a single image I consisting of one or more channels. This way, a convolutional neural network directly accepts raw images as input. The output of layer l consists of m_1^{(l)} feature maps of size m_2^{(l)} × m_3^{(l)}. The i-th feature map in layer l, denoted Y_i^{(l)}, is computed as

    Y_i^{(l)} = B_i^{(l)} + \sum_{j=1}^{m_1^{(l−1)}} K_{i,j}^{(l)} ∗ Y_j^{(l−1)}    (33)

[17] Often, W will be the set {0, ..., 255} representing an 8-bit channel. Then, a color image can be represented by an array of size n_1 × n_2 × 3 assuming three color channels, for example RGB.
[18] As an example, consider a grayscale image of size n_1 × n_2. When applying an arbitrary filter of size (2h_1+1) × (2h_2+1) to the pixel at location (1, 1), the sum of equation (30) includes pixel locations with negative indices. To solve this problem, several approaches can be considered, as for example padding the image in some way or applying the filter only for locations where the operation is defined properly, resulting in the output array being smaller than the image.

Figure 6: Illustration of a single convolutional layer. If layer l is a convolutional layer, the input image (if l = 1) or a feature map of the previous layer is convolved by different filters to yield the output feature maps of layer l.

where B_i^{(l)} is a bias matrix and K_{i,j}^{(l)} is the filter of size (2h_1^{(l)}+1) × (2h_2^{(l)}+1) connecting the j-th feature map in layer (l−1) with the i-th feature map in layer l [LKF10][19]. As mentioned above, m_2^{(l)} and m_3^{(l)} are influenced by border effects. When applying the discrete convolution only in the so-called valid region of the input feature maps, that is only for pixels where the sum of equation (30) is defined properly, the output feature maps have size

    m_2^{(l)} = m_2^{(l−1)} − 2h_1^{(l)}  and  m_3^{(l)} = m_3^{(l−1)} − 2h_2^{(l)}    (34)

Often the filters used for computing a fixed feature map Y_i^{(l)} are the same, that is K_{i,j}^{(l)} = K_{i,k}^{(l)} for j ≠ k. In addition, the sum in equation (33) may also run over a subset of the input feature maps. To relate the convolutional layer and its operation as defined by equation (33) to the multilayer perceptron, we rewrite the above equation. Each feature map Y_i^{(l)} in layer l consists of m_2^{(l)} · m_3^{(l)} units arranged in a two-dimensional array. The unit at position (r, s) computes the output

    (Y_i^{(l)})_{r,s} = (B_i^{(l)})_{r,s} + \sum_{j=1}^{m_1^{(l−1)}} (K_{i,j}^{(l)} ∗ Y_j^{(l−1)})_{r,s}    (35)
                    = (B_i^{(l)})_{r,s} + \sum_{j=1}^{m_1^{(l−1)}} \sum_{u=−h_1^{(l)}}^{h_1^{(l)}} \sum_{v=−h_2^{(l)}}^{h_2^{(l)}} (K_{i,j}^{(l)})_{u,v} (Y_j^{(l−1)})_{r+u,s+v}    (36)

The trainable weights of the network can be found in the filters K_{i,j}^{(l)} and the bias matrices B_i^{(l)}. As we will see in section 3.2.5, subsampling is used to decrease the effect of noise and distortions. As noted in [CMM+11], subsampling can be done using so-called skipping factors s_1^{(l)} and s_2^{(l)}. The basic idea is to skip a fixed number of pixels, both in horizontal and in vertical direction, before applying the filter again. With skipping factors as above, the size of the output feature maps is given by

    m_2^{(l)} = \frac{m_2^{(l−1)} − 2h_1^{(l)}}{s_1^{(l)} + 1}  and  m_3^{(l)} = \frac{m_3^{(l−1)} − 2h_2^{(l)}}{s_2^{(l)} + 1}    (37)

3.2.2 Non-Linearity Layer

If layer l is a non-linearity layer, its input is given by m_1^{(l)} feature maps and its output comprises again m_1^{(l)} = m_1^{(l−1)} feature maps, each of size m_2^{(l−1)} × m_3^{(l−1)} such that m_2^{(l)} = m_2^{(l−1)} and m_3^{(l)} = m_3^{(l−1)}, given by

    Y_i^{(l)} = f(Y_i^{(l−1)})    (38)

where f is the activation function used in layer l and operates point-wise.

[19] Note the difference between a feature map Y_i^{(l)} comprising m_2^{(l)} · m_3^{(l)} units arranged in a two-dimensional array and a single unit y_i^{(l)} as used in the multilayer perceptron.
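The valid-region convolution of equations (30) and (34) and the convolutional layer of equation (33) can be sketched in NumPy. This is our own illustration: the filter is applied in the index order of equation (30), i.e. without flipping, and we simplify the bias matrix B_i^{(l)} to one scalar bias per output map.

```python
import numpy as np

def convolve_valid(I, K):
    """Discrete convolution of equation (30), evaluated on the valid region
    only: an n1 x n2 input and a (2h1+1) x (2h2+1) filter yield an output
    of size (n1 - 2h1) x (n2 - 2h2), as in equation (34)."""
    k1, k2 = K.shape
    out = np.zeros((I.shape[0] - k1 + 1, I.shape[1] - k2 + 1))
    for r in range(out.shape[0]):
        for s in range(out.shape[1]):
            out[r, s] = np.sum(K * I[r:r + k1, s:s + k2])
    return out

def conv_layer(Y_prev, K, B):
    """Feature maps of equation (33): Y_i = B_i + sum_j K_ij * Y_j.
    Y_prev: (m1_prev, H, W) input maps; K: (m1, m1_prev, k, k) filters;
    B: (m1,) scalar biases (a simplification of the bias matrices)."""
    m1, m1_prev, k, _ = K.shape
    Y = np.zeros((m1, Y_prev.shape[1] - k + 1, Y_prev.shape[2] - k + 1))
    for i in range(m1):
        Y[i] = B[i]
        for j in range(m1_prev):
            Y[i] += convolve_valid(Y_prev[j], K[i, j])
    return Y

# Two 3x3 filters of ones over a single constant 5x5 input map.
Y_prev = np.ones((1, 5, 5))
K = np.ones((2, 1, 3, 3))
B = np.array([0.0, 1.0])
Y = conv_layer(Y_prev, K, B)
```

As equation (34) predicts, the 5 × 5 input with h_1 = h_2 = 1 yields 3 × 3 output maps.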

In [JKRL09] additional gain coefficients are added:

    Y_i^{(l)} = g_i f(Y_i^{(l−1)})    (39)

A convolutional layer including a non-linearity, with hyperbolic tangent activation functions and gain coefficients, is denoted by F_CSG[20]. Note that in [JKRL09] this constitutes a single layer whereas we separate the convolutional layer and the non-linearity layer.

3.2.3 Rectification

Let layer l be a rectification layer. Then its input comprises m_1^{(l−1)} feature maps of size m_2^{(l−1)} × m_3^{(l−1)} and the absolute value for each component of the feature maps is computed:

    Y_i^{(l)} = |Y_i^{(l−1)}|    (40)

where the absolute value is computed point-wise such that the output consists of m_1^{(l)} = m_1^{(l−1)} feature maps unchanged in size. Experiments in [JKRL09] show that rectification plays a central role in achieving good performance. Although rectification could be included in the non-linearity layer [LKF10], we follow [JKRL09] and add this operation as an independent layer. The rectification layer is denoted by R_abs[21].

3.2.4 Local Contrast Normalization Layer

Let layer l be a contrast normalization layer. The task of a local contrast normalization layer is to enforce local competitiveness between adjacent units within a feature map and units at the same spatial location in different feature maps. We discuss subtractive normalization as well as brightness normalization. An alternative, called divisive normalization, can be found in [JKRL09] or [LKF10]. Given m_1^{(l−1)} feature maps of size m_2^{(l−1)} × m_3^{(l−1)}, the output of layer l comprises m_1^{(l)} = m_1^{(l−1)} feature maps unchanged in size. The subtractive normalization operation computes

    Y_i^{(l)} = Y_i^{(l−1)} − \sum_{j=1}^{m_1^{(l−1)}} K_{G(σ)} ∗ Y_j^{(l−1)}    (41)

where K_{G(σ)} is the Gaussian filter from equation (32). In [KSH12] an alternative local normalization scheme called brightness normalization is proposed to be used in combination with rectified linear units. Then the output of layer l is given by

    (Y_i^{(l)})_{r,s} = \frac{(Y_i^{(l−1)})_{r,s}}{\left(κ + λ \sum_{j=1}^{m_1^{(l−1)}} ((Y_j^{(l−1)})_{r,s})^2\right)^{µ}}    (42)

where κ, λ, µ are hyperparameters which can be set using a validation set [KSH12]. The sum in equation (42) may also run over a subset of the feature maps in layer (l−1). Local contrast normalization layers are denoted N_S and N_B, respectively.

3.2.5 Feature Pooling and Subsampling Layer

The motivation of subsampling the feature maps obtained by previous layers is robustness to noise and distortions [JKRL09]. Reducing the resolution can be accomplished in different ways.

[20] C for convolutional layer, S for sigmoid/hyperbolic tangent activation functions and G for gain coefficients. In [JKRL09] the filter size is added as subscript such that F_CSG^{7×7} denotes the usage of 7 × 7 filters. Additionally, the number of used filters is added as follows: 32 F_CSG^{7×7}. We omit the number of filters as we assume full connectivity such that the number of filters is given by m_1^{(l)} · m_1^{(l−1)}.
[21] Note that equation (40) can easily be applied to fully-connected layers as introduced in section 3.2.6, as well.
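The brightness normalization of equation (42) and, anticipating the details of section 3.2.5, pooling with p × p windows placed q units apart can be sketched as follows. This is our own NumPy illustration; the function names, argument conventions and the hyperparameter values κ = λ = µ = 1 are chosen for the example, not taken from [KSH12].

```python
import numpy as np

def brightness_normalization(Y, kappa, lam, mu):
    """Equation (42): divide each activation by
    (kappa + lam * sum_j Y_j(r,s)^2)^mu, summing over the feature maps."""
    return Y / (kappa + lam * np.sum(Y ** 2, axis=0)) ** mu

def pool(Y, p, q, mode="max"):
    """Pool one feature map with p x p windows placed q units apart
    (the windows overlap if q < p); 'max' gives P_M, 'avg' gives P_A."""
    rows = (Y.shape[0] - p) // q + 1
    cols = (Y.shape[1] - p) // q + 1
    out = np.zeros((rows, cols))
    for r in range(rows):
        for s in range(cols):
            window = Y[r * q:r * q + p, s * q:s * q + p]
            out[r, s] = window.max() if mode == "max" else window.mean()
    return out

# Three constant maps: with kappa = lam = mu = 1 the denominator is 1 + 3 = 4.
normalized = brightness_normalization(np.ones((3, 2, 2)), 1.0, 1.0, 1.0)

# Non-overlapping 2x2 pooling (p = q = 2) of a 4x4 feature map.
Y = np.arange(16, dtype=float).reshape(4, 4)
pooled_max = pool(Y, p=2, q=2)
pooled_avg = pool(Y, p=2, q=2, mode="avg")
```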

Figure 7: Illustration of a pooling and subsampling layer. If layer l is a pooling and subsampling layer, and given m_1^(l-1) = 4 feature maps of the previous layer, all feature maps are pooled and subsampled individually. Each unit in one of the m_1^(l) = 4 output feature maps represents the average or the maximum within a fixed window of the corresponding feature map in layer (l - 1).

In [JKRL09] and [LKF10], this is combined with pooling and done in a separate layer, while in traditional convolutional neural networks, subsampling is done by applying skipping factors. Let l be a pooling layer. Its output comprises m_1^(l) = m_1^(l-1) feature maps of reduced size. In general, pooling operates by placing windows at non-overlapping positions in each feature map and keeping one value per window, such that the feature maps are subsampled. We distinguish two types of pooling:

Average pooling: When using a boxcar filter, the operation is called average pooling and the layer is denoted P_A.

Max pooling: For max pooling, the maximum value within each window is taken. The layer is denoted P_M. As discussed in [SMB10], max pooling is used to obtain faster convergence during training.

Both average and max pooling can also be applied using overlapping windows of size p x p which are placed q units apart. Then the windows overlap if q < p. This is found to reduce the chance of overfitting the training set [KSH12].

3.2.6 Fully Connected Layer

Let layer l be a fully connected layer. If layer (l - 1) is a fully connected layer as well, we may apply the propagation rule of the multilayer perceptron from section 2. Otherwise, layer l expects m_1^(l-1) feature maps of size m_2^(l-1) x m_3^(l-1) as input, and the i-th unit in layer l computes

    y_i^(l) = f(z_i^(l)) with z_i^(l) = sum_{j=1}^{m_1^(l-1)} sum_{r=1}^{m_2^(l-1)} sum_{s=1}^{m_3^(l-1)} w_{i,j,r,s}^(l) (Y_j^(l-1))_{r,s}, (43)

where w_{i,j,r,s}^(l) denotes the weight connecting the unit at position (r, s) in the j-th feature map of layer (l - 1) to the i-th unit in layer l. In practice, convolutional layers are used to learn a feature hierarchy, and one or more fully connected layers are used for classification based on the computed features [LBD+89, LKF10]. Note that a fully-connected layer already includes the non-linearity, while for a convolutional layer the non-linearity is separated into its own layer.

3.3 Architectures

We discuss both the traditional convolutional neural network as proposed in [LBD+89] and a modern variant as used in [KSH12].

3.3.1 Traditional Convolutional Neural Network

In [JKRL09], the basic building block of traditional convolutional neural networks is F_CSG-P_A, while in [LBD+89] the subsampling is accomplished within the convolutional layers and no gain coefficients are used. (Using the notation of section 3.1, the boxcar filter K_B of size (2h_1 + 1) x (2h_2 + 1) is given by (K_B)_{r,s} = 1 / ((2h_1 + 1)(2h_2 + 1)).)
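As a concrete illustration of the pooling operation P_A / P_M used in these building blocks, the following NumPy sketch (our own naming, not from the paper) pools a single feature map with p x p windows placed q units apart; q = p gives the usual non-overlapping case, while q < p gives the overlapping variant of [KSH12].

```python
import numpy as np

def pool(fmap, p=2, q=2, mode="max"):
    """Pool a single feature map with p x p windows placed q units apart.
    The windows overlap if q < p; q = p is the non-overlapping case."""
    h, w = fmap.shape
    out_h = (h - p) // q + 1
    out_w = (w - p) // q + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = fmap[i * q:i * q + p, j * q:j * q + p]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

# A 4x4 feature map pooled with non-overlapping 2x2 windows.
Y = np.arange(16, dtype=float).reshape(4, 4)
print(pool(Y, p=2, q=2, mode="max").tolist())  # [[5.0, 7.0], [13.0, 15.0]]
print(pool(Y, p=2, q=2, mode="avg").tolist())  # [[2.5, 4.5], [10.5, 12.5]]
```

Each output unit summarizes one window of the input map, so the map is reduced in size while its identity (one output map per input map) is preserved.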

Figure 8: The architecture of the original convolutional neural network, as introduced in [LBD+89], alternates between convolutional layers including hyperbolic tangent non-linearities and subsampling layers: input image (layer l = 0), convolutional layer with non-linearities (l = 1), subsampling layer (l = 3), convolutional layer with non-linearities (l = 4), subsampling layer (l = 6), fully connected layer (l = 7) and fully connected output layer (l = 8). In this illustration, the convolutional layers already include non-linearities and, thus, a convolutional layer actually represents two layers. The feature maps of the final subsampling layer are then fed into the actual classifier consisting of an arbitrary number of fully connected layers. The output layer usually uses softmax activation functions.

In general, the unique characteristics of traditional convolutional neural networks lie in the hyperbolic tangent non-linearities and the weight sharing [LBD+89]. This is illustrated in figure 8, where the non-linearities are included within the convolutional layers.

3.3.2 Modern Convolutional Neural Networks

As an example of a modern convolutional neural network, we explore the architecture used in [KSH12], which gives excellent performance on the ImageNet dataset [ZF13]. The architecture comprises five convolutional layers, each followed by a rectified linear unit non-linearity layer, brightness normalization and overlapping pooling. Classification is done using three additional fully-connected layers. To avoid overfitting, [KSH12] uses dropout as regularization technique. Such a network can be specified as F_CR-N_B-P, where F_CR denotes a convolutional layer followed by a non-linearity layer with rectified linear units. Details can be found in [KSH12].

In [CMS12], the authors combine several deep convolutional neural networks which have an architecture similar to the one described above, and average their classification/prediction results. This architecture is referred to as a multi-column deep convolutional neural network.

4 Understanding Convolutional Neural Networks

Although convolutional neural networks have been used with success for a variety of computer vision tasks, their internal operation is not well understood. While backprojecting the feature activations of the first convolutional layer onto the image plane is possible, subsequent pooling and rectification layers hinder us from understanding higher layers as well. As stated in [ZF13], this is highly unsatisfactory when aiming to improve convolutional neural networks. Thus, in [ZF13], a visualization technique is proposed which allows us to visualize the activations of higher layers. This technique is based on an additional model for unsupervised learning of feature hierarchies: the deconvolutional neural network, as introduced in [ZKTF10].

4.1 Deconvolutional Neural Networks

Similar to convolutional neural networks, deconvolutional neural networks are based on the idea of generating feature hierarchies by convolving the input image with a set of filters at each layer [ZKTF10]. However, deconvolutional neural networks are unsupervised by definition. In addition, deconvolutional neural networks follow a top-down approach: the goal is to reconstruct the network input from its activations and filters [ZKTF10].

4.1.1 Deconvolutional Layer

Let layer l be a deconvolutional layer. The input is composed of m_1^(l-1) feature maps of size m_2^(l-1) x m_3^(l-1). Each such feature map Y_i^(l-1) is represented as a sum over m_1^(l) feature maps of layer l convolved with filters K_{j,i}^(l):

    sum_{j=1}^{m_1^(l)} K_{j,i}^(l) * Y_j^(l) = Y_i^(l-1). (44)

As with an auto-encoder, it is easy for the layer to learn the identity if there are enough degrees of freedom. Therefore, [ZKTF10] introduces a sparsity constraint on the feature maps Y_j^(l), and the error measure for training layer l is given by

    E^(l)(w) = sum_{i=1}^{m_1^(l-1)} || sum_{j=1}^{m_1^(l)} K_{j,i}^(l) * Y_j^(l) - Y_i^(l-1) ||_2^2 + sum_{i=1}^{m_1^(l)} || Y_i^(l) ||_p, (45)

where ||.||_p is the vectorized p-norm, which can be interpreted as L_p-regularization as discussed in section 2.5. The difference between a convolutional layer and a deconvolutional layer is illustrated in figure 9. Note that the error measure E^(l) is specific to layer l. This implies that a deconvolutional neural network with multiple deconvolutional layers is trained layer-wise.

Figure 9: An illustration of the difference between the bottom-up approach of convolutional layers and the top-down approach of deconvolutional layers.
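Equation (45) can be made concrete with a few lines of NumPy. The sketch below evaluates the layer-wise error for given filters and feature maps, using p = 1 for the sparsity term; the naming and the naive valid-mode convolution are our own assumptions to keep the example self-contained, not part of [ZKTF10].

```python
import numpy as np

def conv2d_valid(K, Y):
    """Naive 'valid'-mode 2D convolution of filter K over map Y."""
    kh, kw = K.shape
    oh, ow = Y.shape[0] - kh + 1, Y.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for s in range(ow):
            out[r, s] = np.sum(K * Y[r:r + kh, s:s + kw])
    return out

def deconv_error(K, Y_l, Y_prev, p=1):
    """Error measure of equation (45) for one deconvolutional layer.
    K[j][i]  : filter mapping feature map j of layer l to map i of layer l-1
    Y_l[j]   : feature maps of layer l (the codes being learned)
    Y_prev[i]: feature maps of layer l-1 to be reconstructed
    The second term is the vectorized p-norm acting as sparsity penalty."""
    recon_err = 0.0
    for i in range(len(Y_prev)):
        recon = sum(conv2d_valid(K[j][i], Y_l[j]) for j in range(len(Y_l)))
        recon_err += np.sum((recon - Y_prev[i]) ** 2)
    sparsity = sum(np.sum(np.abs(Y_l[j]) ** p) ** (1.0 / p)
                   for j in range(len(Y_l)))
    return recon_err + sparsity

# Perfect reconstruction: only the sparsity term remains.
K = [[np.ones((3, 3))]]           # one filter, layer l -> layer l-1
Y_l = [np.ones((6, 6))]           # one 6x6 code map
Y_prev = [9.0 * np.ones((4, 4))]  # exactly K * Y_l
print(deconv_error(K, Y_l, Y_prev, p=1))  # 36.0
```

Training alternately minimizes this quantity with respect to Y_l and with respect to K, as described next.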

Figure 10: After each convolutional layer, the feature activations of the previous layer are reconstructed using an attached deconvolutional layer: input image (l = 0), convolutional layers l = 1, ..., L, each with an attached deconvolutional layer, and the output layer (l = L + 1). For l > 1, the process of reconstruction is iterated until the feature activations are backprojected onto the image plane.

4.1.2 Unsupervised Training

Similar to the unsupervised training discussed in section 2.4, training is performed layer-wise. Equation (45) is optimized by alternating between optimizing with respect to the feature maps Y_j^(l), given the filters K_{j,i}^(l) and the feature maps Y_i^(l-1) of the previous layer, and optimizing with respect to the filters K_{j,i}^(l) [ZKTF10]. Here, the optimization with respect to the feature maps Y_j^(l) causes some problems: for example, when using p = 1, the optimization problem is poorly conditioned [ZKTF10], and therefore usual gradient descent optimization fails. An alternative optimization scheme is discussed in detail in [ZKTF10]; however, as we do not need to train deconvolutional neural networks, this is left to the reader.

4.2 Visualizing Convolutional Neural Networks

To visualize and understand the internal operations of a convolutional neural network, a single deconvolutional layer is attached to each convolutional layer. Given input feature maps for layer l, the output feature maps Y_i^(l) are fed back into the corresponding deconvolutional layer at level l. The deconvolutional layer reconstructs the feature maps Y_i^(l-1) that gave rise to the activations in layer l [ZF13]. This process is iterated until layer l = 0 is reached, resulting in the activations of layer l being backprojected onto the image plane. The general idea is illustrated in figure 10. Note that the deconvolutional layers do not need to be trained, as the filters are already given by the trained convolutional layers and merely have to be transposed. (Given a feature map Y_i^(l) = K_{i,j}^(l) * Y_j^(l-1) - here we omit the sum of equation (33) for simplicity - using the transposed filter (K_{i,j}^(l))^T gives us Y_j^(l-1) = (K_{i,j}^(l))^T * Y_i^(l).) More complex convolutional neural networks may include non-linearity layers, rectification layers as well as pooling layers. While we assume the non-linearities used to be invertible, the use of rectification layers and pooling layers causes some problems.

4.2.1 Pooling Layers

Let layer l be a max pooling layer; then the operation of layer l is not invertible. We need to remember which positions within the input feature maps gave rise to the maximum values in order to obtain an approximate inverse [ZF13]. Therefore, as discussed in [ZF13], so-called switch variables are introduced.
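The switch-variable mechanism can be sketched directly: max pooling records the argmax position of every window, and the approximate inverse places each pooled value back at its remembered position, filling all other positions with zeros. A minimal NumPy sketch with our own naming, assuming non-overlapping windows:

```python
import numpy as np

def max_pool_with_switches(fmap, p=2):
    """Non-overlapping p x p max pooling that also records switch
    variables: the (row, col) position of each window's maximum [ZF13]."""
    h, w = fmap.shape
    out = np.empty((h // p, w // p))
    switches = {}
    for i in range(h // p):
        for j in range(w // p):
            window = fmap[i * p:(i + 1) * p, j * p:(j + 1) * p]
            r, s = np.unravel_index(np.argmax(window), window.shape)
            out[i, j] = window[r, s]
            switches[(i, j)] = (i * p + r, j * p + s)
    return out, switches

def unpool(pooled, switches, shape):
    """Approximate inverse: each pooled value returns to its recorded
    position; all other positions are set to zero."""
    fmap = np.zeros(shape)
    for (i, j), (r, s) in switches.items():
        fmap[r, s] = pooled[i, j]
    return fmap

Y = np.array([[1., 3., 0., 2.],
              [4., 2., 1., 0.],
              [0., 1., 5., 1.],
              [2., 0., 1., 2.]])
P, S = max_pool_with_switches(Y)
R = unpool(P, S, Y.shape)  # maxima restored in place, zeros elsewhere
```

The reconstruction R is only approximate: the non-maximal entries of each window are lost, which is exactly why the inverse requires the switch variables at all.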

Figure 11: While the approach described in section 4.2 can easily be applied to convolutional neural networks including non-linearity layers, the usage of pooling and rectification layers imposes some problems. The max pooling operation is not invertible; therefore, for each unit in the pooling layer, we remember the position in the corresponding feature map which gave rise to the unit's output value. To accomplish this, so-called switch variables are introduced [ZF13]. Rectification layers can simply be inverted by prepending a rectification layer to the deconvolutional layer.

4.2.2 Rectification Layers

The convolutional neural network may use rectification layers to obtain positive feature maps after each non-linearity layer. To cope with this, a rectification layer is added to each deconvolutional layer to obtain positive reconstructions of the feature maps as well [ZF13]. The incorporation of both pooling layers and rectification layers is illustrated in figure 11.

4.3 Convolutional Neural Network Visualization

The above visualization technique can be used to discuss several aspects of convolutional neural networks. We follow the discussion in [ZF13], which refers to the architecture described in section 3.3.2.

4.3.1 Filters and Features

Backprojecting the feature activations allows a close analysis of the hierarchical nature of the features within the convolutional neural network. Figure 12, taken from [ZF13], shows the activations of three layers with corresponding input images. While the first and second layers comprise filters for edge and corner detection, the filters tend to get more complex and abstract in higher layers. For example, when considering layer 3, the feature activations reflect specific structures within the images: repeated patterns, or the human contours in layer 3, row 3, column 3. Higher layers show strong invariance to translation and rotation [ZF13]; such transformations usually have a high impact on low-level features. In addition, as stated in [ZF13], it is important to train the convolutional neural network until convergence, as the higher layers usually need more time to converge.

4.3.2 Architecture Evaluation

The visualization of the feature activations across the convolutional layers allows us to evaluate the effect of filter size as well as filter placement. For example, by analyzing the feature activations of the first and second layers, the authors of [ZF13] observed that the first layer captures only high-frequency and low-frequency information, and the feature activations of the second layer show aliasing artifacts. By adapting the filter size of the first layer and the skipping factor used within the second layer, performance could be improved. In addition, the visualization shows the advantage of deep architectures, as higher layers are able to learn more complex features invariant to low-level distortions and translations [ZF13].

Figure 12: Taken from [ZF13], this figure shows a selection of features across several layers of a fully trained convolutional network, using the visualization technique discussed in section 4.2.

5 Conclusion

In the course of this paper, we discussed the basic notions of both neural networks in general and the multilayer perceptron in particular. With deep learning in mind, we introduced supervised training using gradient descent and error backpropagation, as well as unsupervised training using auto-encoders. We concluded the section with a brief discussion of regularization methods, including dropout [HSK+12] and unsupervised pre-training.

We introduced convolutional neural networks by discussing the different types of layers used in recent implementations: the convolutional layer, the non-linearity layer, the rectification layer, the local contrast normalization layer, and the pooling and subsampling layer. Based on these basic building blocks, we discussed the traditional convolutional neural network [LBD+89] as well as a modern variant as used in [KSH12].

Despite their excellent performance [KSH12, CMS12], the internal operation of convolutional neural networks is not well understood [ZF13]. To get deeper insight into their internal workings, we followed [ZF13] and discussed a visualization technique which allows the feature activations of higher layers to be backprojected onto the image plane. This makes it possible to further evaluate and improve recent architectures, for example the architecture used in [KSH12].

Nevertheless, convolutional neural networks and deep learning in general remain an active area of research. Although the difficulty of deep learning seems to be understood [Ben09, GB10, EMB+09], learning feature hierarchies is considered very hard [Ben09]. Here, the possibility of unsupervised pre-training has had a huge impact and allows deep architectures to be trained in reasonable time [Ben09, EBC+10]. Nonetheless, the question of why deep neural networks perform so well is still not fully answered.

References

[Ben09] Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1-127, 2009.

[Bis92] C. Bishop. Exact calculation of the Hessian matrix for the multilayer perceptron. Neural Computation, 4(4):494-501, 1992.

[Bis95] C. Bishop. Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1995.

[Bis06] C. Bishop. Pattern Recognition and Machine Learning. Springer Verlag, New York, 2006.

[BL89] S. Becker and Y. LeCun. Improving the convergence of back-propagation learning with second-order methods. In Connectionist Models Summer School, pages 29-37, 1989.

[BL07] Y. Bengio and Y. LeCun. Scaling learning algorithms towards AI. In Large-Scale Kernel Machines. MIT Press, 2007.

[CMM+11] D. C. Ciresan, U. Meier, J. Masci, L. M. Gambardella, and J. Schmidhuber. Flexible, high performance convolutional neural networks for image classification. In Artificial Intelligence, International Joint Conference, pages 1237-1242, 2011.

[CMS12] D. C. Ciresan, U. Meier, and J. Schmidhuber. Multi-column deep neural networks for image classification. Computing Research Repository, abs/1202.2745, 2012.

[DHS01] R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley-Interscience Publication, New York, 2001.

[EBC+10] D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 11:625-660, 2010.

[EMB+09] D. Erhan, P.-A. Manzagol, Y. Bengio, S. Bengio, and P. Vincent. The difficulty of training deep architectures and the effect of unsupervised pre-training. In Artificial Intelligence and Statistics, International Conference on, pages 153-160, 2009.

[FP02] D. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice Hall Professional Technical Reference, New Jersey, 2002.

[GB10] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Artificial Intelligence and Statistics, International Conference on, pages 249-256, 2010.

[GBB11] X. Glorot, A. Bordes, and Y. Bengio. Deep sparse rectifier neural networks. In Artificial Intelligence and Statistics, International Conference on, pages 315-323, 2011.

[GMW81] P. Gill, W. Murray, and M. Wright. Practical Optimization. Academic Press, London, 1981.

[Hay05] S. Haykin. Neural Networks: A Comprehensive Foundation. Pearson Education, New Delhi, 2005.

[HO06] G. E. Hinton and S. Osindero. A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527-1554, 2006.

[HSK+12] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. Computing Research Repository, abs/1207.0580, 2012.

[HSW89] K. Hornik, M. Stinchcombe, and H. White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359-366, 1989.

[JKRL09] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. What is the best multi-stage architecture for object recognition? In Computer Vision, International Conference on, pages 2146-2153, 2009.

[KRL10] K. Kavukcuoglu, M. A. Ranzato, and Y. LeCun. Fast inference in sparse coding algorithms with applications to object recognition. Computing Research Repository, abs/1010.3467, 2010.

[KSH12] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 2012.

[LBBH98] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.

[LBD+89] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4):541-551, 1989.

[LBLL09] H. Larochelle, Y. Bengio, J. Louradour, and P. Lamblin. Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 10:1-40, 2009.

[LeC89] Y. LeCun. Generalization and network design strategies. In Connectionism in Perspective, 1989.

[LKF10] Y. LeCun, K. Kavukcuoglu, and C. Farabet. Convolutional networks and applications in vision. In Circuits and Systems, International Symposium on, pages 253-256, 2010.

[NH92] S. J. Nowlan and G. E. Hinton. Simplifying neural networks by soft weight-sharing. Neural Computation, 4(4):473-493, 1992.

[RHW86] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, chapter Learning Representations by Back-Propagating Errors. MIT Press, Cambridge, 1986.

[Ros58] F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386-408, 1958.

[SMB10] D. Scherer, A. Mueller, and S. Behnke. Evaluation of pooling operations in convolutional architectures for object recognition. In Artificial Neural Networks, International Conference on, pages 92-101, 2010.

[SSP03] P. Y. Simard, D. Steinkraus, and J. C. Platt. Best practices for convolutional neural networks applied to visual document analysis. In Document Analysis and Recognition, International Conference on, 2003.

[ZF13] M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. Computing Research Repository, abs/1311.2901, 2013.

[ZKTF10] M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus. Deconvolutional networks. In Computer Vision and Pattern Recognition, Conference on, pages 2528-2535, 2010.


A Multi-mode Image Tracking System Based on Distributed Fusion A Mult-mode Image Tracng System Based on Dstrbuted Fuson Ln zheng Chongzhao Han Dongguang Zuo Hongsen Yan School of Electroncs & nformaton engneerng, X an Jaotong Unversty X an, Shaanx, Chna Lnzheng@malst.xjtu.edu.cn

More information

An artificial Neural Network approach to monitor and diagnose multi-attribute quality control processes. S. T. A. Niaki*

An artificial Neural Network approach to monitor and diagnose multi-attribute quality control processes. S. T. A. Niaki* Journal of Industral Engneerng Internatonal July 008, Vol. 4, No. 7, 04 Islamc Azad Unversty, South Tehran Branch An artfcal Neural Network approach to montor and dagnose multattrbute qualty control processes

More information

Implementation of Deutsch's Algorithm Using Mathcad

Implementation of Deutsch's Algorithm Using Mathcad Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"

More information

Loop Parallelization

Loop Parallelization - - Loop Parallelzaton C-52 Complaton steps: nested loops operatng on arrays, sequentell executon of teraton space DECLARE B[..,..+] FOR I :=.. FOR J :=.. I B[I,J] := B[I-,J]+B[I-,J-] ED FOR ED FOR analyze

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

+ + + - - This circuit than can be reduced to a planar circuit

+ + + - - This circuit than can be reduced to a planar circuit MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to

More information

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University Tme Seres Analyss n Studes of AGN Varablty Bradley M. Peterson The Oho State Unversty 1 Lnear Correlaton Degree to whch two parameters are lnearly correlated can be expressed n terms of the lnear correlaton

More information

ErrorPropagation.nb 1. Error Propagation

ErrorPropagation.nb 1. Error Propagation ErrorPropagaton.nb Error Propagaton Suppose that we make observatons of a quantty x that s subject to random fluctuatons or measurement errors. Our best estmate of the true value for ths quantty s then

More information

Study on CET4 Marks in China s Graded English Teaching

Study on CET4 Marks in China s Graded English Teaching Study on CET4 Marks n Chna s Graded Englsh Teachng CHE We College of Foregn Studes, Shandong Insttute of Busness and Technology, P.R.Chna, 264005 Abstract: Ths paper deploys Logt model, and decomposes

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

Learning from Multiple Outlooks

Learning from Multiple Outlooks Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

Traffic State Estimation in the Traffic Management Center of Berlin

Traffic State Estimation in the Traffic Management Center of Berlin Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D-763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary

More information

Development of an intelligent system for tool wear monitoring applying neural networks

Development of an intelligent system for tool wear monitoring applying neural networks of Achevements n Materals and Manufacturng Engneerng VOLUME 14 ISSUE 1-2 January-February 2006 Development of an ntellgent system for tool wear montorng applyng neural networks A. Antć a, J. Hodolč a,

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

Fast Fuzzy Clustering of Web Page Collections

Fast Fuzzy Clustering of Web Page Collections Fast Fuzzy Clusterng of Web Page Collectons Chrstan Borgelt and Andreas Nürnberger Dept. of Knowledge Processng and Language Engneerng Otto-von-Guercke-Unversty of Magdeburg Unverstätsplatz, D-396 Magdeburg,

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Supervsed vs. Unsupervsed Learnng Up to now we consdered supervsed learnng scenaro, where we are gven 1. samples 1,, n 2. class labels for all samples 1,, n Ths s also

More information

Hallucinating Multiple Occluded CCTV Face Images of Different Resolutions

Hallucinating Multiple Occluded CCTV Face Images of Different Resolutions In Proc. IEEE Internatonal Conference on Advanced Vdeo and Sgnal based Survellance (AVSS 05), September 2005 Hallucnatng Multple Occluded CCTV Face Images of Dfferent Resolutons Ku Ja Shaogang Gong Computer

More information

1 Example 1: Axis-aligned rectangles

1 Example 1: Axis-aligned rectangles COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 6 Scrbe: Aaron Schld February 21, 2013 Last class, we dscussed an analogue for Occam s Razor for nfnte hypothess spaces that, n conjuncton

More information

Inequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001.

Inequality and The Accounting Period. Quentin Wodon and Shlomo Yitzhaki. World Bank and Hebrew University. September 2001. Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Sequential Optimizing Investing Strategy with Neural Networks

MATHEMATICAL ENGINEERING TECHNICAL REPORTS. Sequential Optimizing Investing Strategy with Neural Networks MATHEMATICAL ENGINEERING TECHNICAL REPORTS Sequental Optmzng Investng Strategy wth Neural Networks Ryo ADACHI and Akmch TAKEMURA METR 2010 03 February 2010 DEPARTMENT OF MATHEMATICAL INFORMATICS GRADUATE

More information

Intra-day Trading of the FTSE-100 Futures Contract Using Neural Networks With Wavelet Encodings

Intra-day Trading of the FTSE-100 Futures Contract Using Neural Networks With Wavelet Encodings Submtted to European Journal of Fnance Intra-day Tradng of the FTSE-00 Futures Contract Usng eural etworks Wth Wavelet Encodngs D L Toulson S P Toulson Intellgent Fnancal Systems Lmted Sute 4 Greener House

More information

An interactive system for structure-based ASCII art creation

An interactive system for structure-based ASCII art creation An nteractve system for structure-based ASCII art creaton Katsunor Myake Henry Johan Tomoyuk Nshta The Unversty of Tokyo Nanyang Technologcal Unversty Abstract Non-Photorealstc Renderng (NPR), whose am

More information

2.4 Bivariate distributions

2.4 Bivariate distributions page 28 2.4 Bvarate dstrbutons 2.4.1 Defntons Let X and Y be dscrete r.v.s defned on the same probablty space (S, F, P). Instead of treatng them separately, t s often necessary to thnk of them actng together

More information

Review of Hierarchical Models for Data Clustering and Visualization

Review of Hierarchical Models for Data Clustering and Visualization Revew of Herarchcal Models for Data Clusterng and Vsualzaton Lola Vcente & Alfredo Velldo Grup de Soft Computng Seccó d Intel lgènca Artfcal Departament de Llenguatges Sstemes Informàtcs Unverstat Poltècnca

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Clustering Gene Expression Data. (Slides thanks to Dr. Mark Craven)

Clustering Gene Expression Data. (Slides thanks to Dr. Mark Craven) Clusterng Gene Epresson Data Sldes thanks to Dr. Mark Craven Gene Epresson Proles we ll assume we have a D matr o gene epresson measurements rows represent genes columns represent derent eperments tme

More information

where the coordinates are related to those in the old frame as follows.

where the coordinates are related to those in the old frame as follows. Chapter 2 - Cartesan Vectors and Tensors: Ther Algebra Defnton of a vector Examples of vectors Scalar multplcaton Addton of vectors coplanar vectors Unt vectors A bass of non-coplanar vectors Scalar product

More information

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,

More information

Interleaved Power Factor Correction (IPFC)

Interleaved Power Factor Correction (IPFC) Interleaved Power Factor Correcton (IPFC) 2009 Mcrochp Technology Incorporated. All Rghts Reserved. Interleaved Power Factor Correcton Slde 1 Welcome to the Interleaved Power Factor Correcton Reference

More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns A study on the ablty of Support Vector Regresson and Neural Networks to Forecast Basc Tme Seres Patterns Sven F. Crone, Jose Guajardo 2, and Rchard Weber 2 Lancaster Unversty, Department of Management

More information

Recurrent networks and types of associative memories

Recurrent networks and types of associative memories 12 Assocatve Networks 12.1 Assocatve pattern recognton The prevous chapters were devoted to the analyss of neural networks wthout feedback, capable of mappng an nput space nto an output space usng only

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

Lecture 18: Clustering & classification

Lecture 18: Clustering & classification O CPS260/BGT204. Algorthms n Computatonal Bology October 30, 2003 Lecturer: Pana K. Agarwal Lecture 8: Clusterng & classfcaton Scrbe: Daun Hou Open Problem In HomeWor 2, problem 5 has an open problem whch

More information

The OC Curve of Attribute Acceptance Plans

The OC Curve of Attribute Acceptance Plans The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4

More information

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST) yaoqi.feng@yahoo.

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST) yaoqi.feng@yahoo. ICSV4 Carns Australa 9- July, 007 RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL Yaoq FENG, Hanpng QIU Dynamc Test Laboratory, BISEE Chna Academy of Space Technology (CAST) yaoq.feng@yahoo.com Abstract

More information

Distributed Multi-Target Tracking In A Self-Configuring Camera Network

Distributed Multi-Target Tracking In A Self-Configuring Camera Network Dstrbuted Mult-Target Trackng In A Self-Confgurng Camera Network Crstan Soto, B Song, Amt K. Roy-Chowdhury Department of Electrcal Engneerng Unversty of Calforna, Rversde {cwlder,bsong,amtrc}@ee.ucr.edu

More information

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems

An Analysis of Central Processor Scheduling in Multiprogrammed Computer Systems STAN-CS-73-355 I SU-SE-73-013 An Analyss of Central Processor Schedulng n Multprogrammed Computer Systems (Dgest Edton) by Thomas G. Prce October 1972 Techncal Report No. 57 Reproducton n whole or n part

More information

Proactive Secret Sharing Or: How to Cope With Perpetual Leakage

Proactive Secret Sharing Or: How to Cope With Perpetual Leakage Proactve Secret Sharng Or: How to Cope Wth Perpetual Leakage Paper by Amr Herzberg Stanslaw Jareck Hugo Krawczyk Mot Yung Presentaton by Davd Zage What s Secret Sharng Basc Idea ((2, 2)-threshold scheme):

More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

The Greedy Method. Introduction. 0/1 Knapsack Problem

The Greedy Method. Introduction. 0/1 Knapsack Problem The Greedy Method Introducton We have completed data structures. We now are gong to look at algorthm desgn methods. Often we are lookng at optmzaton problems whose performance s exponental. For an optmzaton

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

Calculating the high frequency transmission line parameters of power cables

Calculating the high frequency transmission line parameters of power cables < ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

A GENERAL APPROACH FOR SECURITY MONITORING AND PREVENTIVE CONTROL OF NETWORKS WITH LARGE WIND POWER PRODUCTION

A GENERAL APPROACH FOR SECURITY MONITORING AND PREVENTIVE CONTROL OF NETWORKS WITH LARGE WIND POWER PRODUCTION A GENERAL APPROACH FOR SECURITY MONITORING AND PREVENTIVE CONTROL OF NETWORKS WITH LARGE WIND POWER PRODUCTION Helena Vasconcelos INESC Porto hvasconcelos@nescportopt J N Fdalgo INESC Porto and FEUP jfdalgo@nescportopt

More information

Statistical Approach for Offline Handwritten Signature Verification

Statistical Approach for Offline Handwritten Signature Verification Journal of Computer Scence 4 (3): 181-185, 2008 ISSN 1549-3636 2008 Scence Publcatons Statstcal Approach for Offlne Handwrtten Sgnature Verfcaton 2 Debnath Bhattacharyya, 1 Samr Kumar Bandyopadhyay, 2

More information

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits Lnear Crcuts Analyss. Superposton, Theenn /Norton Equalent crcuts So far we hae explored tmendependent (resste) elements that are also lnear. A tmendependent elements s one for whch we can plot an / cure.

More information

A machine vision approach for detecting and inspecting circular parts

A machine vision approach for detecting and inspecting circular parts A machne vson approach for detectng and nspectng crcular parts Du-Mng Tsa Machne Vson Lab. Department of Industral Engneerng and Management Yuan-Ze Unversty, Chung-L, Tawan, R.O.C. E-mal: edmtsa@saturn.yzu.edu.tw

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

Activity Scheduling for Cost-Time Investment Optimization in Project Management

Activity Scheduling for Cost-Time Investment Optimization in Project Management PROJECT MANAGEMENT 4 th Internatonal Conference on Industral Engneerng and Industral Management XIV Congreso de Ingenería de Organzacón Donosta- San Sebastán, September 8 th -10 th 010 Actvty Schedulng

More information

Inter-Ing 2007. INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, 15-16 November 2007.

Inter-Ing 2007. INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, 15-16 November 2007. Inter-Ing 2007 INTERDISCIPLINARITY IN ENGINEERING SCIENTIFIC INTERNATIONAL CONFERENCE, TG. MUREŞ ROMÂNIA, 15-16 November 2007. UNCERTAINTY REGION SIMULATION FOR A SERIAL ROBOT STRUCTURE MARIUS SEBASTIAN

More information