High Order Reverse Mode of AD: Theory and Implementation

Mu Wang and Alex Pothen
Department of Computer Science, Purdue University
September 30, 2016

Research Overview

- Second order reverse mode:
  - More efficient at evaluating Hessians, in both complexity and memory usage, in many applications.
  - Proved to be equivalent to a variant of vertex elimination on the computational graph of the gradient.
- High order reverse mode (this talk):
  - Evaluates the derivative tensors $\nabla^d f$ up to any order in reverse mode.
  - Implementation: ReverseAD.
- Applications:
  - Uncertainty quantification.
  - Chemistry: exchange-correlation (XC) energy functionals.

Background

For a scalar objective function $f : \mathbb{R}^n \to \mathbb{R}$:

- First order AD:
  - Forward: $[F^1 f](x, \dot{x}) = \frac{\partial f}{\partial x}\,\dot{x} = \nabla f^T \dot{x}$
  - Reverse: $[R^1 f](x) = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right) = \nabla f$
- Second order AD:
  - (Pure) forward: $[F^2 f](x, \dot{x}) = \frac{1}{2}\,\dot{x}^T \nabla^2 f\,\dot{x}$
  - Mixed: $[R^1 F^1 f](x, \dot{x}) = \nabla^2 f\,\dot{x}$
  - (Pure) reverse: $[R^2 f](x) = \nabla^2 f$
- High order AD:
  - (Pure) forward: high order Taylor coefficients.
  - (Pure) reverse: high order reverse mode.
  - Mixed modes can then be generated.
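
As an illustration of the first order forward operator $[F^1 f](x, \dot{x}) = \nabla f^T \dot{x}$, here is a minimal dual-number sketch in Python. The class `Dual`, the helper `forward_first_order`, and the test function `f` are hypothetical names chosen for this example; they are not part of ReverseAD.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value together with its directional derivative."""
    val: float
    dot: float

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (ab)' = a'b + ab'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def forward_first_order(f, x, xdot):
    """Return grad(f)(x)^T xdot by propagating dual numbers through f."""
    args = [Dual(xi, di) for xi, di in zip(x, xdot)]
    return f(*args).dot


def f(x0, x1):
    return x0 * x0 * x1 + 3.0 * x1


# At x = (2, 5) the gradient is (2*x0*x1, x0^2 + 3) = (20, 7),
# so the directional derivative along (1, -1) is 20 - 7 = 13.
directional = forward_first_order(f, [2.0, 5.0], [1.0, -1.0])
```

One forward sweep yields a single directional derivative; recovering the whole gradient this way takes n sweeps, which is the motivation for the reverse operator $[R^1 f]$.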

High Order Forward Mode

Accumulate high order Taylor coefficients [1]:

- $\nabla^d f$: the d-th order derivative tensor (symmetric).
- $\nabla^d f \cdot \dot{x}$: a tensor-vector product, a (d-1)-th order symmetric tensor.
- $[\cdots[[\nabla^d f \cdot \dot{x}] \cdot \dot{x}] \cdots] \cdot \dot{x}$: a scalar, the d-th order Taylor coefficient (up to the factor $1/d!$).

Case d = 2:

$$[F^2 f](x, \dot{x}) = \frac{1}{2}\,\dot{x}^T \nabla^2 f\,\dot{x}$$

$$[\nabla^2 f]_{ij} = [F^2 f](x, e_i + e_j) - [F^2 f](x, e_i) - [F^2 f](x, e_j)$$

General case:

$$[F^d f](x, \dot{x}) = \frac{1}{d!}\,[\cdots[[\nabla^d f \cdot \dot{x}] \cdot \dot{x}] \cdots] \cdot \dot{x}$$

$[\nabla^d f]_{i_1 \cdots i_d}$ is a linear combination of $\{[F^d f](x, \dot{e}) : \dot{e} \in \mathrm{Span}\{e_{i_1}, \dots, e_{i_d}\}\}$.

Complexity: $O\!\left(\binom{n+d-1}{d}\, l\right)$

[1] Griewank, Andreas, Jean Utke, and Andrea Walther. Evaluating higher derivative tensors by forward propagation of univariate Taylor series. Mathematics of Computation of the American Mathematical Society 69.231 (2000): 1117-1130.
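
The d = 2 interpolation formula can be checked with a small truncated Taylor arithmetic. The sketch below is an assumption-laden illustration: `Taylor2`, `F2`, `hessian_entry`, and `num_directions` are hypothetical names, only `+` and `*` are overloaded, and the test function is made up for the example.

```python
from math import comb


class Taylor2:
    """Truncated univariate Taylor series c0 + c1*t + c2*t^2."""

    def __init__(self, c0, c1=0.0, c2=0.0):
        self.c = (float(c0), float(c1), float(c2))

    def __add__(self, other):
        o = _lift(other)
        return Taylor2(*(a + b for a, b in zip(self.c, o.c)))

    __radd__ = __add__

    def __mul__(self, other):
        a, b = self.c, _lift(other).c
        # Cauchy product truncated after the t^2 term.
        return Taylor2(a[0] * b[0],
                       a[0] * b[1] + a[1] * b[0],
                       a[0] * b[2] + a[1] * b[1] + a[2] * b[0])

    __rmul__ = __mul__


def _lift(x):
    return x if isinstance(x, Taylor2) else Taylor2(x)


def F2(f, x, xdot):
    """Second order Taylor coefficient of t -> f(x + t*xdot),
    which equals 0.5 * xdot^T H xdot."""
    args = [Taylor2(xi, di) for xi, di in zip(x, xdot)]
    return f(*args).c[2]


def hessian_entry(f, x, i, j):
    """Recover H[i][j] from three univariate Taylor propagations."""
    n = len(x)
    e_i = [1.0 if k == i else 0.0 for k in range(n)]
    e_j = [1.0 if k == j else 0.0 for k in range(n)]
    e_ij = [a + b for a, b in zip(e_i, e_j)]
    return F2(f, x, e_ij) - F2(f, x, e_i) - F2(f, x, e_j)


def num_directions(n, d):
    """Number of Taylor directions in the complexity bound above."""
    return comb(n + d - 1, d)


def f(x0, x1):
    return x0 * x0 * x1 + x1 * x1


# Exact Hessian of f at (2, 5) is [[2*x1, 2*x0], [2*x0, 2]] = [[10, 4], [4, 2]].
H = [[hessian_entry(f, [2.0, 5.0], i, j) for j in range(2)] for i in range(2)]
```

The same formula also covers the diagonal, because $[F^2 f](x, 2e_i) - 2\,[F^2 f](x, e_i) = [\nabla^2 f]_{ii}$.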

Reverse Mode Revisited

Definition: After the single assignment code (SAC) $v_i = \varphi_i(v_j)_{\{v_j : v_j \prec v_i\}}$ has been processed in reverse mode, the processed SACs define an equivalent function $f_i(S_i)$. The objective function is the composition of $f_i$ and the remaining SACs, and $S_i$ is the current live variable set.

Observation:

- Reverse mode computes the derivatives of $f_i(S_i)$ in each step by following the chain rule.
- Second order reverse mode computes the first and second order derivatives of $f_i(S_i)$ in each step by following the first and second order chain rules.
- High order reverse mode computes the derivatives up to order d of $f_i(S_i)$ in each step by following the high order chain rule.

High Order Chain Rule

Observation: High order reverse mode computes the derivatives up to order d of $f_i$ in each step by following the high order chain rule.

When processing $v_i = \varphi_i(v_j)_{\{v_j : v_j \prec v_i\}}$:

$$S_i = \left(S_{i+1} \setminus \{v_i\}\right) \cup \{v_j : v_j \prec v_i\}$$

$$f_i(S_i) = f_{i+1}\!\left(S_{i+1} \setminus \{v_i\},\; v_i = \varphi_i(v_j)_{\{v_j : v_j \prec v_i\}}\right)$$

High order chain rule: derivatives of $f_{i+1}(S_{i+1})$ $\Rightarrow$ derivatives of $f_i(S_i)$.

- A general case of the Faà di Bruno formula.
- A special case of the formula in Ma, 2009 [2].

[2] Ma, Tsoy-Wo. Higher chain formula proved by combinatorics. The Electronic Journal of Combinatorics 16.1 (2009): N21.
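
The live variable set update $S_i = (S_{i+1} \setminus \{v_i\}) \cup \{v_j : v_j \prec v_i\}$ can be traced on a toy example. This is only a sketch: the trace format (a list of `(variable, predecessors)` pairs in evaluation order) and the function name `live_sets` are assumptions made for illustration.

```python
def live_sets(trace, output):
    """Walk the trace backwards and record the live set after each SAC.

    trace:  list of (v_i, predecessors) in evaluation order.
    output: name of the dependent variable.
    """
    live = {output}
    history = [set(live)]
    for v, preds in reversed(trace):
        # Remove the variable being processed, add its predecessors.
        live = (live - {v}) | set(preds)
        history.append(set(live))
    return history


# Toy trace for y = sin(x0 * x1) + x1.
trace = [
    ("v1", ("x0", "x1")),   # v1 = x0 * x1
    ("v2", ("v1",)),        # v2 = sin(v1)
    ("y", ("v2", "x1")),    # y  = v2 + x1
]
history = live_sets(trace, "y")
```

The sweep starts from the live set {y} and ends with the set of independent variables; the largest set seen along the way is the quantity s used later in the complexity bound.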

High Order Chain Rule

Multiset: A multiset D is a generalization of the notion of a set in which members are allowed to appear more than once. We use $\mathcal{D}_S$ to represent the family of all multisets over S. That is:

$$\mathcal{D}_S = \{D : D = \{e_1, e_2, \dots, e_d\},\; e_i \in S,\; 1 \le i \le d\}$$

Derivative mapping: For a function $f(S)$, its order d derivative tensor can be represented as a mapping from $D \in \mathcal{D}_S$, $|D| = d$, to $\mathbb{R}$:

$$T_f(D) = \frac{\partial^{|D|} f}{\partial D} = \frac{\partial^{|D|} f}{\partial v_1\, \partial v_2 \cdots \partial v_{|D|}}$$
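
One simple way to realize the derivative mapping $T_f$ in code is a dictionary keyed by sorted tuples, so that each multiset has exactly one key. The sketch below assumes that representation; `multisets` and `key` are hypothetical helper names, and the stored tensor is worked out by hand for $f(x, y) = x^2 y$.

```python
from itertools import combinations_with_replacement
from math import comb


def multisets(S, d):
    """All multisets of size d over S, each as a sorted tuple."""
    return list(combinations_with_replacement(sorted(S), d))


def key(*vars_):
    """Canonical dictionary key for a multiset of variables."""
    return tuple(sorted(vars_))


# Third order derivative mapping of f(x, y) = x^2 * y:
# the only nonzero entry is d^3 f / (dx dx dy) = 2.
T3 = {key("x", "x", "y"): 2.0}

# Symmetry comes for free: any ordering maps to the same key.
value = T3.get(key("y", "x", "x"), 0.0)

# A symmetric order-d tensor over n variables has C(n+d-1, d) distinct entries.
entries = multisets({"x", "y"}, 3)
```

Storing only one entry per multiset is what keeps the symmetric tensor at $\binom{n+d-1}{d}$ entries instead of $n^d$.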

High Order Chain Rule

$$T_{f_i}(D) = T_{f_{i+1}}(D) + \sum_{D_L \subset D} \left[ \sum_{D_1 \cup \cdots \cup D_r = D \setminus D_L} \; \prod_{j=1}^{r} \frac{\partial^{|D_j|} \varphi_i}{\partial D_j} \right] T_{f_{i+1}}\!\left(D_L \cup \{v_i^{\,r}\}\right)$$

Here $D_L, D_1, \dots, D_r$ is a partition of D, and $\{v_i^{\,r}\}$ denotes r copies of $v_i$.

First order: $D = \{v\}$

$$a_i(v) = a_{i+1}(v) + \frac{\partial \varphi_i}{\partial v}\, a_{i+1}(v_i)$$

The term $\frac{\partial \varphi_i}{\partial v}\, a_{i+1}(v_i)$ corresponds to $D_L = \emptyset$, $D_1 = \{v\}$.
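
The first order case is the familiar adjoint update. The following sketch applies it to a hand-written trace whose local partial derivatives are already evaluated; the trace layout and the name `first_order_reverse` are assumptions for illustration, not the data structures used by ReverseAD.

```python
def first_order_reverse(trace, output):
    """Apply a_i(v) = a_{i+1}(v) + dphi_i/dv * a_{i+1}(v_i) over the trace.

    trace: list of (v_i, {v_j: dphi_i/dv_j}) in evaluation order,
           with the partials already evaluated at the current point.
    """
    a = {output: 1.0}
    for v_i, partials in reversed(trace):
        w = a.pop(v_i, 0.0)  # v_i leaves the live set
        for v_j, p in partials.items():
            a[v_j] = a.get(v_j, 0.0) + p * w
    return a


# f(x0, x1) = (x0 * x1) * x1 at x0 = 2, x1 = 5.
trace = [
    ("v1", {"x0": 5.0, "x1": 2.0}),   # v1 = x0 * x1, so v1 = 10
    ("y", {"v1": 5.0, "x1": 10.0}),   # y  = v1 * x1
]
grad = first_order_reverse(trace, "y")
# Analytic gradient of x0 * x1^2 is (x1^2, 2*x0*x1) = (25, 20).
```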

High Order Chain Rule

$$T_{f_i}(D) = T_{f_{i+1}}(D) + \sum_{D_L \subset D} \left[ \sum_{D_1 \cup \cdots \cup D_r = D \setminus D_L} \; \prod_{j=1}^{r} \frac{\partial^{|D_j|} \varphi_i}{\partial D_j} \right] T_{f_{i+1}}\!\left(D_L \cup \{v_i^{\,r}\}\right)$$

Here $D_L, D_1, \dots, D_r$ is a partition of D, and $\{v_i^{\,r}\}$ denotes r copies of $v_i$.

Second order: $D = \{v, u\}$

$$h_i(v, u) = h_{i+1}(v, u) + \frac{\partial \varphi_i}{\partial v}\, h_{i+1}(v_i, u) + \frac{\partial \varphi_i}{\partial u}\, h_{i+1}(v, v_i) + \frac{\partial \varphi_i}{\partial v} \frac{\partial \varphi_i}{\partial u}\, h_{i+1}(v_i, v_i) + \frac{\partial^2 \varphi_i}{\partial v\, \partial u}\, a_{i+1}(v_i)$$

Each term corresponds to one partition:

- $\frac{\partial \varphi_i}{\partial v}\, h_{i+1}(v_i, u)$: $D_L = \{u\}$, $D_1 = \{v\}$
- $\frac{\partial \varphi_i}{\partial u}\, h_{i+1}(v, v_i)$: $D_L = \{v\}$, $D_1 = \{u\}$
- $\frac{\partial \varphi_i}{\partial v} \frac{\partial \varphi_i}{\partial u}\, h_{i+1}(v_i, v_i)$: $D_L = \emptyset$, $D_1 = \{v\}$, $D_2 = \{u\}$
- $\frac{\partial^2 \varphi_i}{\partial v\, \partial u}\, a_{i+1}(v_i)$: $D_L = \emptyset$, $D_1 = \{v, u\}$
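
The second order update can be exercised on the same kind of hand-written trace. The sketch below evaluates the five-term formula directly for every pair of live variables; it is deliberately not incremental and makes no claim about how ReverseAD organizes the computation. The names `key` and `second_order_step` and the trace layout are assumptions.

```python
def key(u, v):
    """Canonical key for an unordered pair (symmetric storage)."""
    return (u, v) if u <= v else (v, u)


def second_order_step(a, h, v_i, p, q):
    """Eliminate v_i = phi_i(...) from adjoints a and Hessian h.

    p[v_j]           = d phi_i / d v_j
    q[key(v_j, v_k)] = d^2 phi_i / (d v_j d v_k)
    """
    w = a.get(v_i, 0.0)

    def get(u, v):
        return h.get(key(u, v), 0.0)

    live = ({x for pair in h for x in pair} | set(a) | set(p)) - {v_i}
    new_a = {v: a.get(v, 0.0) + p.get(v, 0.0) * w for v in live}
    new_h = {}
    for v in live:
        for u in live:
            if u < v:
                continue  # store each unordered pair once
            val = (get(v, u)
                   + p.get(v, 0.0) * get(v_i, u)
                   + p.get(u, 0.0) * get(v, v_i)
                   + p.get(v, 0.0) * p.get(u, 0.0) * get(v_i, v_i)
                   + q.get(key(v, u), 0.0) * w)
            if val != 0.0:
                new_h[key(v, u)] = val
    return new_a, new_h


# f(x0, x1) = (x0 * x1) * x1 at x0 = 2, x1 = 5 (so v1 = 10).
a, h = {"y": 1.0}, {}
# y = v1 * x1
a, h = second_order_step(a, h, "y",
                         {"v1": 5.0, "x1": 10.0}, {key("v1", "x1"): 1.0})
# v1 = x0 * x1
a, h = second_order_step(a, h, "v1",
                         {"x0": 5.0, "x1": 2.0}, {key("x0", "x1"): 1.0})
# Analytic Hessian of x0 * x1^2: f_00 = 0, f_01 = 2*x1 = 10, f_11 = 2*x0 = 4.
```

For a diagonal entry (v = u) the two middle-order terms coincide, which is where the factor 2 in $2\,\frac{\partial \varphi_i}{\partial v}\, h_{i+1}(v_i, v)$ comes from.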

High Order Reverse Mode: Complexity

$$T_{f_i}(D) = T_{f_{i+1}}(D) + \sum_{D_L \subset D} \left[ \sum_{D_1 \cup \cdots \cup D_r = D \setminus D_L} \; \prod_{j=1}^{r} \frac{\partial^{|D_j|} \varphi_i}{\partial D_j} \right] T_{f_{i+1}}\!\left(D_L \cup \{v_i^{\,r}\}\right)$$

Here $D_L, D_1, \dots, D_r$ is a partition of D.

- $B_{d+1}$ summations, where $B_{d+1}$ is the (d+1)-th Bell number.
- $O(B_{d+1}\, s_i^{\,d-1})$ updates for each SAC, with $s_i = |S_i|$.
- Overall complexity: $O(B_{d+1}\, s^{\,d-1}\, l)$, where $s = \max_i \{s_i\}$.

Special cases:

- d = 1: $O(l)$, the Baur-Strassen theorem.
- d = 2: $O(l\, s)$, second order reverse mode.
- d = 3: $O(l\, s^2)$, third order reverse mode.
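
To get a feel for the constant $B_{d+1}$ in the bound, the Bell numbers can be generated with the Bell triangle. This is a small standalone sketch; the function name `bell_numbers` is chosen for the example.

```python
def bell_numbers(count):
    """Return [B_0, B_1, ..., B_{count-1}] using the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(count - 1):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left.
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells


B = bell_numbers(8)
# B_{d+1} for d = 1..6
constants = {d: B[d + 1] for d in range(1, 7)}
```

The values match the chain rule slides: the first order update has $B_2 = 2$ terms and the second order update has $B_3 = 5$ terms, counting the leading $T_{f_{i+1}}(D)$ term.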

High Order Reverse Mode: Implementation

$$T_{f_i}(D) = T_{f_{i+1}}(D) + \sum_{D_L \subset D} \left[ \sum_{D_1 \cup \cdots \cup D_r = D \setminus D_L} \; \prod_{j=1}^{r} \frac{\partial^{|D_j|} \varphi_i}{\partial D_j} \right] T_{f_{i+1}}\!\left(D_L \cup \{v_i^{\,r}\}\right)$$

Here $D_L, D_1, \dots, D_r$ is a partition of D.

- Generate all $D_L$ such that $T_{f_{i+1}}(D_L \cup \{v_i^{\,r}\}) \neq 0$, and all $D_1, \dots, D_r$ such that $\frac{\partial^{|D_j|} \varphi_i}{\partial D_j} \neq 0$ for $1 \le j \le r$.
- Then perform incremental updates on $D = D_L \cup D_1 \cup \cdots \cup D_r$.
- There is more than one way to partition D into $D_L, D_1, \dots, D_r$.
- SymCoeff($D_L, D_1, \dots, D_r$): the multiplicity with which D is partitioned into $D_L, D_1, \dots, D_r$.
- Flat code for pre-computed symmetric coefficients: 5k lines of generated code for up to sixth order.

ReverseAD: an operator overloading implementation of the high order reverse mode in C++11. Available at https://github.com/wangmu0701/reversead.

Monotonic indexing for variables on the trace:

$$v_j \prec v_i \;\Rightarrow\; \mathrm{index}(v_j) < \mathrm{index}(v_i)$$
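
The multiplicity idea behind SymCoeff can be illustrated by brute force: partition the positions of D in every possible way and count how often each multiset partition arises. This is one reading of the slide, not the generated code in ReverseAD; `set_partitions` and `partition_multiplicities` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]


def partition_multiplicities(D):
    """Count how many position-level partitions give each (D_L, {D_1..D_r}).

    Only proper subsets D_L are enumerated, so the leading T(D) term
    of the chain rule is not included.
    """
    D = list(D)
    positions = range(len(D))
    counts = Counter()
    for size in range(len(D)):
        for left in combinations(positions, size):
            rest = [p for p in positions if p not in left]
            d_left = tuple(sorted(D[p] for p in left))
            for part in set_partitions(rest):
                blocks = tuple(sorted(
                    tuple(sorted(D[p] for p in blk)) for blk in part))
                counts[(d_left, blocks)] += 1
    return counts


# Diagonal second order entry D = {v, v}.
diag = partition_multiplicities(("v", "v"))
# Off-diagonal entry D = {u, v}: every partition is distinct.
offdiag = partition_multiplicities(("u", "v"))
```

For D = {v, v} the partition with $D_L = \{v\}$, $D_1 = \{v\}$ is counted twice, matching the factor 2 in the diagonal Hessian update. Precomputing such counts once per order is presumably why flat generated code is attractive.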

Performance: Synthetic Function

A synthetic function designed with parameters:

- n: number of independent variables
- s: size of the live variable set during the function evaluation
- l: the complexity of the function
- Dense derivatives

$$y = \sum_{i=1}^{s} t_i, \qquad t_i = \mathrm{ID}_k \circ \cdots \circ \mathrm{ID}_1(z_i), \qquad z_i = t, \qquad t = \sum_{i=1}^{n} x_i$$

Each ID is an identity map written as a nontrivial expression, for example $\log(\exp(z))$, $1.0/(1.0/z)$, $\sin(\mathrm{asin}(z))$, or arithmetic with the constants 2.0 and 0.5.
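
A plain evaluation of such a synthetic function might look like the sketch below. Only the three identity expressions that are legible on the slide are used, and the cyclic choice of identity per step is an assumption made here; the original benchmark may select them differently. The name `synthetic` and the parameter `k` (chain length, which scales the operation count l) are likewise illustrative.

```python
import math

# Identity maps in disguise; each returns z up to rounding.
IDENTITIES = [
    lambda z: math.log(math.exp(z)),
    lambda z: 1.0 / (1.0 / z),
    lambda z: math.sin(math.asin(z)),   # requires -1 <= z <= 1
]


def synthetic(x, s, k):
    """y = sum_{i=1}^{s} ID_k(...ID_1(t)...), with t = sum(x)."""
    t = sum(x)
    y = 0.0
    for i in range(s):
        z = t
        for j in range(k):
            # Assumed selection rule: cycle through the identities.
            z = IDENTITIES[(i + j) % len(IDENTITIES)](z)
        y += z
    return y


x = [0.1, 0.2, 0.3]          # n = 3, and t = 0.6 stays inside asin's domain
y = synthetic(x, s=4, k=5)   # mathematically y = s * t = 2.4
```

Because every ID is mathematically the identity, y equals s times the sum of the inputs, yet an AD tool sees s chains of k nonlinear operations that all depend on every input, which is what makes the derivative tensors dense from the tool's point of view.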

Performance: Synthetic Function

- Figure: fixed l; n and s change simultaneously.
- Figure: fixed l and n; s changes.
- Figure: fixed l and s; n changes.

Application: XCFUN (ongoing)

- Arbitrary order exchange-correlation functional library: https://github.com/dftlibs/xcfun
- Uses libtaylor to evaluate derivatives of functionals.
- Up to third order in the current implementation.
- Small number of independents: 20 at most.
- Not-so-complex functionals.

On a collection of functionals:

- Third order libtaylor: 81 ms
- Third order ReverseAD: 20 ms
- Fourth order ReverseAD: 83 ms

Conclusion and Future Work

- High order derivative tensors could (and probably should) be directly evaluated via reverse mode.
- A series of algorithms to evaluate derivatives $T_f$ up to order d: $F^d$, $F^1 R^{d-1}$, $R^d$.
  - $R^d$: the symmetric derivative tensor $\nabla^d f$
  - $F^1 R^{d-1}$: the tensor-vector product $\nabla^d f \cdot \dot{x}$
  - $F^d$: the scalar $[\cdots[[\nabla^d f \cdot \dot{x}] \cdot \dot{x}] \cdots] \cdot \dot{x}$
- The structural (and sparsity) properties of $T_f$ determine the optimal method.
- General compression and recovery using $F^1 R^{d-1}$: perfectly parallelizable.