JOURNAL OF SOFTWARE, VOL. 6, NO. 6, JUNE 2011 961

An Ensemble Data Mining and FLANN Combining Short-term Load Forecasting System for Abnormal Days

Ming Li
College of Automation, Guangdong University of Technology, Guangzhou, P.R. China
Email: mngl4@mal.usc.edu.cn

Junli Gao
College of Automation, Guangdong University of Technology, Guangzhou, P.R. China
Email: jomnygao@63.com

Abstract: Modeling the relationships between the power loads and the variables that influence them, especially on abnormal days, is the key to improving the performance of short-term load forecasting systems. To integrate the advantages of several forecasting models and improve the forecasting accuracy, an ensemble decision tree and FLANN combining short-term load forecasting system, based on data mining and artificial neural network techniques, is proposed, mainly to settle the influence of weather-sensitive factors on the power load. In the proposed strategy, an ensemble decision tree with an abnormal pattern modification algorithm and a FLANN algorithm are first used separately to obtain initial predictions of the power load; a BP-based combination of the two results is then used to get a better prediction. A corresponding forecasting system has been developed for practical use. The statistical analysis shows that the accuracy of the proposed short-term load forecasting on abnormal days has increased greatly. Meanwhile, the actual forecast results for the electric power load of Anhui Province have validated the effectiveness and the superiority of the system.

Index Terms: short-term load forecasting, combining forecasting, abnormal days, ensemble data mining, FLANN

I. INTRODUCTION

Power load forecasting is of great importance in power system design in the sense that the prediction accuracy directly affects the operation and planning of the whole power load system. In the past few decades, a variety of power load forecasting algorithms have been proposed and revised, such as neural networks [1], expert systems [2], fuzzy systems approaches [3], SVM [4], data mining [5], etc.
However, these methods did not consider the accumulation effect of meteorological characteristics, especially under unusual weather conditions and varied holiday activities, which here we call Abnormal Days. Because of this complexity and uncertainty, it is hard to model the relationships between the loads and the related variables. The difficulties exist in the following aspects: first, the modeling and the choice of parameters are troublesome for lack of adequate knowledge of the mechanism that influences the load; second, the load on a given day depends on too many factors, e.g. it may be influenced by the load on the previous day or on the same day of the previous week [6]; furthermore, unexpected events cause fluctuations in the power load, etc.

To address the major factors that complicate the modeling process, a combining forecasting strategy based on similarity is proposed in this paper. First, an ensemble data mining algorithm with abnormal pattern modification and a FLANN technique are used separately to obtain the initial results. Then, a BP-based combination of the two results is used to get a better prediction. The method has the advantage of dealing not only with the nonlinear part of the load, but also with abnormal days with rapid climate change.

The paper is organized as follows: Part II introduces the system design, including the architecture and the two main modules; Part III focuses on the core algorithms of the ensemble data mining, the FLANN and the final combining, and discusses the detailed implementation to clarify the key points; Part IV illustrates the application and results that validate the proposed system; Part V draws a conclusion with some suggestions for future research.

Manuscript received Oct. s, 2010; revised Oct. 0h, 2010; accepted Nov. 5h, 2010. This work is supported by the Funding in Guangdong Province, Guangdong Development and Reform [43] and the Dr. Start Fund in Guangdong University of Technology. Corresponding author: Ming Li, mngl4@mal.usc.edu.cn

II. SYSTEM DESIGN

A. Overall Architecture

The overall architecture of the combining short-term load forecasting system for Anhui province is shown in Fig. 1. The system uses a Server-Client architecture and an MS SQL database.
doi:10.4304/jsw.6.6.961-968

It consists of two modules: the Data Processing Module and the Load Forecasting Module. The Data Processing Module is to convert the load data and
meteorological data into the specific form of training data required by the data mining algorithms; the task of the Load Forecasting Module is to call the data mining, the FLANN and the combining algorithms for the load prediction, whose results can be easily browsed by the client.

The system first loads the power load data and the data of all the relevant factors that influence the power load. After the loaded data has been preprocessed, cleaned and analyzed, the selected algorithm is run to obtain a model of the quantified impact of the relevant factors on the load, especially the hidden patterns, which are revealed and modeled in Part III. Then, using the historical load curve data, the meteorological data and tomorrow's weather forecast data as the model input, tomorrow's load curve can be predicted.

Figure 1. System Architecture

B. Data Processing Module

The data we used include the power load data and the meteorological data. Both come in two forms: historical data and real-time data. For the power load, the historical data and the real-time data are in the same format, with one record per 15 minutes. For the meteorological data, the historical data is validated data with high accuracy and good integrity, checked by the Meteorological Department, but its density is low (one meteorological record per 6 hours). The real-time data is collected from real-time measurement; in contrast to the historical data, it always contains more erroneous and missing data, but its density is high, with one record per hour.

Data Pretreatment

In order to process these data into the desired form, the necessary pretreatment includes [7]:

(a) Error data. Remove error data deviating from the valid range, e.g. we treated a power load p > 0 as valid, and the valid temperature range is T ∈ [-0 °C, 40 °C].

(b) Missing data. There are missing data in the database, and the removed error data also create new missing data. Thus a linear interpolation method is used to fill the missing positions when the interval is relatively short, e.g. when the data at times n and n+i are known, the data at time n+j (0 < j < i) is as follows:

$$T_{n+j} = T_n + (T_{n+i} - T_n)\, j / i$$
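The range check of rule (a) and the interpolation of rule (b) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the system's actual code: the function names, the `max_gap` limit that separates "short" from "long" intervals, and the numeric bounds in the usage note are hypothetical.

```python
def mark_errors(series, low, high):
    """Rule (a): replace readings outside [low, high] by None (missing)."""
    return [v if v is not None and low <= v <= high else None for v in series]


def fill_short_gaps(series, max_gap):
    """Rule (b): fill runs of None no longer than max_gap using
    T[n+j] = T[n] + (T[n+i] - T[n]) * j / i, where n and n+i are the
    nearest known neighbours. Longer gaps are left untouched."""
    filled = list(series)
    pos = 0
    while pos < len(filled):
        if filled[pos] is not None:
            pos += 1
            continue
        start = pos
        while pos < len(filled) and filled[pos] is None:
            pos += 1
        left, right = start - 1, pos  # indices n and n+i
        if left < 0 or right >= len(filled) or pos - start > max_gap:
            continue  # gap at the edge of the series, or too long to interpolate
        i = right - left
        for j in range(1, i):
            filled[left + j] = filled[left] + (filled[right] - filled[left]) * j / i
    return filled
```

For instance, with illustrative bounds of -30 and 45, `fill_short_gaps(mark_errors([10.0, 99.0, None, 16.0], -30.0, 45.0), 4)` first discards the out-of-range reading and then fills both missing positions on the straight line between 10.0 and 16.0.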
Meanwhile, when the interval is relatively long, the missing data can be replaced by the data of recent days or similar days.

(c) Data density conversion. This rule is mainly for the meteorological data. As stated above, the power load data has one record per 15 minutes while the historical weather data has one record per six hours. After conversion, both have one record per fifteen minutes.

Attributes Selection

Regardless of the learning mechanism, the condition attributes and the target attributes should be determined first. The gray relevance analysis results have shown that the meteorological factor is the most important of the factors influencing the load. From the province's 3 years large regional meteorological load database, and according to the suggestions of the power experts and the meteorologists, the temperature, relative humidity, total cloud cover, rainfall, vapor pressure and maximum gust speed are adopted as the condition attributes, which basically cover the main meteorological factors that affect the load.

In order to reflect the impact of attribute changes on the power load, the value of the meteorological change is used here. The change does not mean the difference between two adjacent days, but the difference compared to the day before yesterday. Meanwhile, the load depends not only on the meteorological changes but also on their baseline values, e.g. an increase of 3 °C has a significantly different impact on the load in summer than in winter, so the meteorological baseline values are also used. In summary, a total of 12 meteorological attributes (six changes and six baseline values) are set to be the input properties.

The original meteorological data of the 16 cities in the province would reflect the province's meteorological conditions more fully, but the resulting attributes would be too many (16×12=192), which would complicate the prediction. So, according to the geographical distribution of Anhui Province, the 16 cities have been divided into three regions: Huaibei, Jiangnan and Jianghuai. Thus, a total of 36 meteorological properties are treated in this way. The next step is to determine the target attribute.
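The construction of the 36 condition attributes can be sketched in Python as follows. This is a minimal sketch under assumptions: the nested-dictionary layout of `daily`, the attribute naming scheme and the choice of the value on the indexed day as the baseline are illustrative and are not specified in the text.

```python
QUANTITIES = ["temperature", "relative_humidity", "total_cloud_cover",
              "rainfall", "vapor_pressure", "max_gust_speed"]
REGIONS = ["Huaibei", "Jiangnan", "Jianghuai"]


def build_attributes(daily, day):
    """Return the condition attributes for index `day` (day >= 2).

    daily[region][quantity] is assumed to be a list of regional daily values.
    For each region and quantity two attributes are produced: the baseline
    value and the change relative to two days earlier.
    """
    attrs = {}
    for region in REGIONS:
        for quantity in QUANTITIES:
            values = daily[region][quantity]
            attrs[f"{region}.{quantity}.base"] = values[day]
            attrs[f"{region}.{quantity}.change"] = values[day] - values[day - 2]
    return attrs
```

With three regions and six quantities, the returned dictionary has 3 × 6 × 2 = 36 entries, matching the attribute count given above.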
As the change of the power load shows a regular trend, we define a load changing rate as the target attribute to reflect this change:

$$\Delta p_n = \frac{p_{n+1} - p_n}{p_n} \times 100\% \qquad (1)$$

where $p_{n+1}$ and $p_n$ are the power loads of the (n+1)th day and the nth day respectively.

C. Power Load Forecasting Module

A combining forecasting with adaptive coefficients is used to produce a more accurate load prediction by sharing the strengths of different predictions. The challenges exist in two aspects: the first is how to reflect the impact of the influencing factors on the power load, especially the unusual changes on abnormal days; the second is how to find the best nonlinear combination of the two methods so as to outperform the individual forecasts. By comparing the various predicting algorithms and taking the actual situation in power load forecasting into account, an ensemble decision tree and FLANN combining algorithm is proposed, as Fig. 2 shows. We can use each algorithm to get two independent predictions. Next, in
modification for some patterns is essential. As mentioned above, on days with unusual weather conditions or special events the power load shows unique characteristics, which are highly similar under the same conditions, such as continuous high temperature. So the hidden law of the pattern is the key to carrying out the modification, and it is obtained in a second training of the algorithms. Hence, in the forecasting phase, if a forecasted day falls into a similar pattern, the modification is applied to the result. In this way, the individual results are used to construct a nonlinear combining model that generates an improved prediction.

Suppose a set S, with $f_s(C_j, S)$ standing for the number of cases in S belonging to the class $C_j$; the information entropy is:

$$I(S) = -\sum_{j=1}^{k} \frac{f_s(C_j, S)}{|S|} \log_2 \frac{f_s(C_j, S)}{|S|} \qquad (2)$$

After S has been partitioned into subsets $S_1, \ldots, S_n$ in accordance with the n outcomes of a test X, the expected information requirement can be found as the weighted sum over the subsets:

$$I_X(S) = \sum_{i=1}^{n} \frac{|S_i|}{|S|}\, I(S_i) \qquad (3)$$

The quantity $G(X) = I(S) - I_X(S)$ measures the information that is gained by partitioning S in accordance with the test X. The gain criterion selects a test to maximize this information gain. But this gain criterion has a strong bias in favor of tests with many outcomes. It can be rectified by a kind of normalization which represents the potential information generated by dividing S into n subsets; this split information is given in (4).

Figure 2: Procedure of predicting strategy

III. IMPLEMENTATION

A. Ensemble Data Mining

In the proposed system, a boosting-integrated ensemble decision tree is used to abstract rules that describe the relationship between input and output. The terminal nodes of the resulting decision tree reflect the classification information. The discussion focuses on the basic formulation, the pattern modification and the boosting.

Basic Formulation

An improved C4.5 algorithm [8] is used as the basic predicting method, which builds the decision tree from a set of training data using the concept of information entropy. The training data is a set $S = s_1, s_2, \ldots$ of already classified samples. Each sample $s_i = x_1, x_2, \ldots$ is a vector where $x_1, x_2, \ldots$ represent attributes of the sample.
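The entropy and gain quantities used by the C4.5 split criterion can be made concrete with a small Python sketch. It is an illustration only: it handles a single categorical test, the function names are hypothetical, and it is not the improved C4.5 implementation used in the system.

```python
import math
from collections import Counter


def entropy(labels):
    """Information entropy I(S) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_and_ratio(labels, outcomes):
    """Return (information gain, gain ratio) of a test X.

    outcomes[i] is the outcome of the test for sample i; samples with the
    same outcome form one subset S_i of the partition.
    """
    total = len(labels)
    subsets = {}
    for label, outcome in zip(labels, outcomes):
        subsets.setdefault(outcome, []).append(label)
    # Expected information after the split: weighted sum of subset entropies.
    expected = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - expected
    # Split information: entropy of the partition sizes themselves.
    split_info = -sum(len(s) / total * math.log2(len(s) / total)
                      for s in subsets.values())
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio
```

A test that gives every sample its own outcome reaches the maximal gain but also a large split information, so its gain ratio is reduced; this is the bias correction that the normalization is meant to provide.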
The training data is augmented with a vector $C = c_1, c_2, \ldots$ where $c_1, c_2, \ldots$ represent the class to which each sample belongs. At each node of the tree, the algorithm chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. Its criterion is the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data. The attribute with the highest normalized information gain is chosen to make the decision. The C4.5 algorithm then recurs on the smaller sublists.

The split information of a test X is:

$$Split\_I(X) = -\sum_{i=1}^{n} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|} \qquad (4)$$

Now a new gain criterion, expressing the proportion of information generated by the split that appears helpful for classification, is as follows:

$$Gain\_R(X) = G(X) / Split\_I(X) \qquad (5)$$

Since the originally constructed decision tree may suffer from overfitting, or may be large and unreadable, it should be simplified or pruned. The simplified or pruned tree is obtained by discarding one or more sub-trees and replacing them with a leaf node, according to the respective prediction errors calculated at a given confidence level.

Consider the (n+1)th day to be the predicted date and the nth day to be the base date. After the basic formulation, an initial power load changing rate $\Delta p_n$ is the output of the decision tree, so the power load of the (n+1)th day is given by (6), which can be derived from (1):

$$p_{n+1} = (1 + \Delta p_n)\, p_n \qquad (6)$$

where $p_n$ is the historical power load of the nth day extracted from the historical database.

Pattern Modification

For the exceptional changes on abnormal days, several categories of special patterns are recognized and analyzed to ensure that each pattern is composed of daily load data sequences with highly similar features; learned modification rules are then applied to the data in these patterns. The specific modification is as follows.

(a) Temperature reconstruction within a day
Although the dry bulb temperature and its change are effective parameters for describing the temperature, the sensitivity of the predicted value varies greatly with the time of day; meanwhile, the impact of the temperature parameters on the load forecasting also changes a lot under different conditions. In view of this situation, the weighted daily maximum temperature is used to reconstruct the historical data, which is treated as part of the input to the mining model as follows:

$$T_w(t) = T(t)\,(1 - \omega) + T_{max}\,\omega \qquad (7)$$

In (7), $T_w(t)$ is the weighted temperature, $T(t)$ is the dry bulb temperature at time t, $T_{max}$ is the highest daily temperature, and $\omega$ is the weighting coefficient. The same method is applied to the weather forecast data. During the system design, only the historical temperature data from June to September is reconstructed using the weighted daily maximum temperature rule, while the dry bulb temperature is still used for other times.

It can be seen from (7) that the same form of weighting is applied regardless of day and night. Moreover, within the valid range of the weighting coefficient $\omega$, the night time is affected more by $\omega$, because the night temperature lies further below $T_{max}$; this is also indicated by the experiment. So, if reasonably selected, the weighting coefficient can not only deal effectively with the particularity of the summer temperature, but also weaken the dependence of the load forecasting on the accuracy of the weather forecast data; through this conversion, the subjective impact on people is taken into account to improve the system's performance. It should be mentioned that too small a weighting coefficient will not achieve the desired effect, while too big a one will weaken the impact of the actual temperature at different times of the day on the accuracy, thus leading to poor results. Therefore, the appropriate weighting coefficient needs to be selected cautiously based on experience and experiments.

(b) Strategies dealing with the temperature mutation

The relationship between the temperature changes and the power load in summer differs greatly from that in other seasons. The influence of continuous high temperature on the load comes not only from a single high temperature, but also from the accumulation effect of the temperature over the preceding days.
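As a concrete illustration of the reconstruction rule (7) from item (a), the following Python sketch applies the weighting to one day of readings. The function name and the sample values are illustrative assumptions; the weighting coefficient actually used by the system is not stated in the text.

```python
def weighted_temperature(day_temps, omega):
    """Apply T_w(t) = T(t) * (1 - omega) + T_max * omega to one day of
    dry bulb temperatures, where T_max is that day's highest reading."""
    t_max = max(day_temps)
    return [t * (1 - omega) + t_max * omega for t in day_temps]
```

For the readings 24.0, 30.0 and 36.0 with `omega = 0.5`, the reconstructed values are 30.0, 33.0 and 36.0: the coolest (night) reading is shifted by 6 degrees, the midday reading by 3 and the maximum not at all, which illustrates why the night time is affected more by the coefficient.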
The various hidden patterns in summer are analyzed in this section and corresponding improvement strategies are given accordingly.

Mutation point

Years of weather-load historical data show that there is a remarkable characteristic in the relationship between the power load and the weather change in summer. In more detail, a load-temperature mutation point always exists, with enormous changes on both sides of the point. According to the behavior and the magnitude of the change when the actual temperature passes through the point, the corresponding conditions can be classified into four cases: minor warming through the mutation point, rapid warming through the mutation point, slight cooling through the mutation point and rapid cooling through the mutation point.

High temperature

When summer temperatures rise to a certain degree, even minor temperature changes result in a large load change. At the same time there is also a load-temperature saturation point; above this temperature, the ordinary power consumption is at full load (into saturation). Meanwhile, as the temperature continues to rise in summer, the load shows a kind of regular change which differs from that of other seasons. Because of the tremendous difference between summer and the other seasons, some modification rules are put forward to handle the hot weather, mainly dealing with the condition in which the temperature of the base date and the forecast temperature are both above the mutation point. More specifically, there are five kinds of situations: sustained high temperature, minor heating under high temperature, rapid heating under high temperature, minor cooling under high temperature and rapid cooling under high temperature.

Relatively low temperature

When the temperatures of the base date and of the day to be forecasted are both below the mutation point, the modification is not significant. So the necessary corrections are also much smaller than for the previous two patterns.

Continuous cooling

This pattern applies when the temperature difference between day and night is large, which usually occurs at the change of season or is accompanied by strong climate change. In order to describe this situation accurately, the strategy for treating the cooling in the day should be different from that for the night.
Five cooling patterns can be obtained in accordance with the following factors: the cooling rate in the daytime, the cooling rate in the night time and the average daily temperature change.

In conclusion, four main patterns are summarized above, and each pattern contains several types. Different special rules, which are discovered by the data mining technique, should be applied to each one of the types. So a strategy combining data mining and special rules is used to modify the original prediction as follows:

$$L_{predict}(t) = \Phi\big(L_{base}(t)\,(1 + f(\Delta p, \Delta p_k))\big) \qquad (8)$$

$$f(\Delta p, \Delta p_k) = \beta\, \Delta p + (1 - \beta)\, \Delta p_k \qquad (9)$$

In (8) and (9), t = 1, 2, ..., 96 refers to the 96 sampling times of a day, k = 1, 2, ..., 15 indexes the 15 kinds of mutations in the 4 groups, $L_{base}(t)$ is the base load and $L_{predict}(t)$ is the forecasted load.

Ensemble Boosting

Some ensemble methods have emerged as meta-techniques for improving the generalization performance of existing learning algorithms. In particular, AdaBoost [9] is reported as the most successful boosting algorithm, with the promise of improving the classification accuracy of a weak learning algorithm. Boosting is a composite classifier technique; it works by generating a sequence of decision trees. The first classifier is built as the previous section describes. Then, the second one is generated in such a way that it focuses
on the samples that were misclassified by the first one. Then the third model is built to focus on the second model's errors, and so on [10].

We assume a given set S of N instances, each belonging to one of K classes, and a learning system that constructs a classifier from a training set of instances. Boosting constructs multiple classifiers from the instances; the number T of repetitions or trials is treated as fixed. The classifier learned on trial t is denoted as $C^t$, while $C^*$ is the composite classifier. For any instance i, $C^t(i)$ and $C^*(i)$ are the classes predicted by $C^t$ and $C^*$ respectively.

The version of boosting investigated in this paper is an improved edition of AdaBoost [11]. The boosting maintains a weight for each instance: the higher the weight, the more the instance influences the classifier learned. At each trial the vector of weights is adjusted to reflect the performance of the corresponding classifier, with the result that the weight of misclassified instances is increased. The final classifier aggregates the learned classifiers by voting, but each classifier's vote is a function of its accuracy.

First a 0-1 function is defined as follows:

$$\theta^t(i) = \begin{cases} 1 & \text{if instance } i \text{ is misclassified by the classifier } C^t \\ 0 & \text{otherwise} \end{cases}$$

Let $\omega_i^t$ denote the weight of instance i at trial t, and let $p_i^t$ be the renormalization of $\omega_i^t$, that is to say:

$$p_i^t = \frac{\omega_i^t}{\sum_{i=1}^{N} \omega_i^t}, \qquad \sum_{i=1}^{N} p_i^t = 1 \qquad (10)$$

At each trial t = 1, 2, ..., T, a classifier $C^t$ is constructed from the given instances under the distribution $p^t$. The error $\varepsilon^t$ of this classifier is also measured with respect to the weights and consists of the sum of the weights of the instances that it misclassifies:

$$\varepsilon^t = \sum_{i=1}^{N} p_i^t\, \theta^t(i) \qquad (11)$$

If $\varepsilon^t > 0.5$, the trials are terminated and T = t-1. Conversely, if $C^t$ correctly classifies all the instances so that $\varepsilon^t = 0$, the trials terminate and T = t. Otherwise, the weight vector $\omega^{t+1}$ for the next trial is generated by multiplying the weights of the instances that $C^t$ classifies correctly by the factor $\beta^t$:

$$\omega_i^{t+1} = \begin{cases} \omega_i^t\, \beta^t & \text{if instance } i \text{ is correctly classified} \\ \omega_i^t & \text{if instance } i \text{ is misclassified} \end{cases} \qquad (12)$$

where $\beta^t = \varepsilon^t / (1 - \varepsilon^t)$.

After the whole training process above, the boosted classifier $C^*$ is obtained by summing the votes of the classifiers $C^1, C^2, \ldots, C^T$, where the vote for classifier $C^t$ is worth $\log(1/\beta^t)$ units. The pseudo code for the boosting algorithm is given in Table I.
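The weight update and voting scheme described above can be sketched in Python. This is a minimal sketch under loudly stated assumptions: a one-feature threshold rule (`train_stump`, a hypothetical helper) stands in for the C4.5 tree as the weak learner, only two classes are handled, and a classifier with zero error is given an infinite vote, which is one possible reading of log(1/β) as β tends to 0. It is not the improved boosting variant used in the system.

```python
import math


def train_stump(X, y, p):
    """Hypothetical weak learner: the single-threshold rule on one feature
    with the smallest error under the instance distribution p."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for lo, hi in ((0, 1), (1, 0)):
                pred = [hi if row[f] > thr else lo for row in X]
                err = sum(pi for pi, a, b in zip(p, pred, y) if a != b)
                if best is None or err < best[0]:
                    best = (err, f, thr, lo, hi)
    _, f, thr, lo, hi = best
    return lambda row: hi if row[f] > thr else lo


def boost(X, y, T):
    """Run at most T boosting trials; return a list of (vote, classifier)."""
    n = len(X)
    w = [1.0 / n] * n                      # initial instance weights
    ensemble = []
    for _ in range(T):
        total = sum(w)
        p = [wi / total for wi in w]       # renormalized weights
        clf = train_stump(X, y, p)
        miss = [clf(row) != label for row, label in zip(X, y)]
        eps = sum(pi for pi, m in zip(p, miss) if m)   # weighted error
        if eps > 0.5:
            break                          # discard this trial and stop
        if eps == 0:
            ensemble.append((math.inf, clf))  # assumption: perfect classifier dominates
            break
        beta = eps / (1 - eps)
        ensemble.append((math.log(1 / beta), clf))
        # Shrink the weights of correctly classified instances.
        w = [wi if m else wi * beta for wi, m in zip(w, miss)]
    return ensemble


def predict(ensemble, row, classes=(0, 1)):
    """Return the class with the largest total vote."""
    votes = {c: 0.0 for c in classes}
    for vote, clf in ensemble:
        votes[clf(row)] += vote
    return max(votes, key=votes.get)
```

On a one-dimensional toy set where class 1 occupies an interval (`X = [[1], [2], [3], [4], [5], [6]]`, `y = [0, 0, 1, 1, 0, 0]`), no single threshold rule is correct everywhere, but three boosted rules vote their way to the right label on every training point.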
TABLE I. THE BOOSTING ALGORITHM
Input: A given set S of N instances
Training:
1. Initialize T; let t = 1; for every i, $\omega_i^1 = 1/N$
2. Construct $C^t$ from the given instances under the distribution $p^t$
3. Calculate $\varepsilon^t$. If $\varepsilon^t > 0.5$, the trials are terminated and T = t-1; if $\varepsilon^t = 0$, the trials terminate and T = t. Otherwise calculate $\beta^t$ and $\omega^{t+1}$
4. If t = T, the trials are terminated; else let t = t+1 and go to step 2
Output: $C^* = \sum_{t=1}^{T} \log(1/\beta^t)\, C^t$

In the following section, the predicted result of the ensemble data mining method is denoted by $f_1(t)$.

B. FLANN

The functional link ANN (FLANN) was originally proposed by Pao [12]. He has shown that this network may be conveniently used for function approximation and pattern classification, with a faster convergence rate and a lower computational load than a multilayer perceptron structure. In this paper, prior knowledge of the electrical power load is imported to structure the FLANN forecasting network; meanwhile, pruning and affixation momentum algorithms are used to improve the standard FLANN. Next, the FLANN structure and learning algorithm are introduced in detail.

FLANN Structure

Consider a set of basis functions $H = \{\varphi_i \in L(A)\}_{i \in I}$ with the following properties:
1) $\varphi_1 = 1$;
2) the subset $H_j = \{\varphi_i \in H\}_{i=1}^{j}$ is linearly independent;
3) $\sup_j \big[\sum_{i=1}^{j} \|\varphi_i\|_A^2\big]^{1/2} < \infty$.

Let $H_N = \{\varphi_i\}_{i=1}^{N}$ be a set of basis functions as shown in Fig. 3. Thus, the FLANN consists of N basis functions $\{\varphi_1, \varphi_2, \ldots, \varphi_N\} \in H_N$ with the following input-output relationship for the ith output:

$$y_i(X) = \sum_{j=1}^{N} w_{ij}\, \varphi_j(X) \qquad (13)$$

Figure 3: BP-based combined forecasting (diagram elements: input $X = [x_1\ x_2\ \ldots\ x_n]$, functional expansion (F.E.) $\varphi_1(X), \varphi_2(X), \ldots, \varphi_N(X)$, weight matrix W, outputs $y_1(X), y_2(X), \ldots, y_p(X)$)

First, the set of efficient basis functions should be determined to reflect the mechanism of the power load system and its prior knowledge, which is a characteristic of the FLANN [13]. As analyzed in Part II, a total of
12 meteorological attributes, which have a significant influence on the power load, are set to be the input properties of the decision tree. So in the FLANN these attributes are also incorporated into the basis functions in the form of their polynomials, such as

$$L_\omega(t) = \beta_1 T(t) + \beta_2 T^2(t) + \beta_3 T^3(t) + \ldots$$

where $L_\omega(t)$ is the weather-sensitive part of the power load, $T(t)$ is the reconstructed temperature as defined in (7), $\beta_i$ (i = 1, 2, 3) are the nonlinear temperature coefficients, and the omitted part is the sum of the powers of the other attributes, analogous to $T(t)$.

However, the weather-independent power load always shows cyclical behavior, for example the morning peak, the evening peak and the shoulder load, so we can model this part of the power load in the form of a Fourier series:

$$L(t) = a_0 + \sum_{k=1}^{q} \big(a_k \cos(k\omega t) + b_k \sin(k\omega t)\big)$$

Hence, we can form the function basis $H = [H_1(t), H_2(t)]$, where $H_1(t) = [x\ \ g(a_1 x + b_1)\ \ldots\ g(a_m x + b_m)]^T$ and $H_2(t) = [1\ \cos\omega t\ \sin\omega t\ \ldots\ \cos q\omega t\ \sin q\omega t]^T$, with $x = [T(t)\ T^2(t)\ T^3(t)\ \ldots]^T$ the polynomial of the selected attributes. Taking into account the complexity of the weather factors, a tanh(·) function is used as the activation function g(·) in $H_1(t)$.

Classifier Learning

Based on the algorithms in [14] and [15], an improved Widrow-Hoff algorithm with pruning and additional momentum is proposed.

First, many experimental results have demonstrated that a considerable portion of the initially chosen function basis is not valid; accordingly, some zero elements appear in the weight matrix. At this point, the corresponding basis functions should be cut off to accelerate the learning process.
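The functional expansion and the pruning step can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the system's implementation: only the temperature attribute is expanded, the input is assumed to be already scaled to a moderate range, and the names `expand` and `prune` as well as the tolerance `tol` are hypothetical.

```python
import math


def expand(temp, t, a, b, q, omega):
    """Build the basis vector [H1, H2] for one sampling time.

    temp  : (scaled) reconstructed temperature T(t)
    a, b  : coefficients of the m tanh units; a[i] has three entries
    q     : number of harmonics in the Fourier part
    omega : base angular frequency, e.g. 2*pi/96 for a daily cycle of
            96 samples (an assumed value)
    """
    x = [temp, temp ** 2, temp ** 3]
    # H1: polynomial terms followed by tanh units applied to them.
    h1 = x + [math.tanh(sum(ai * xi for ai, xi in zip(row, x)) + bi)
              for row, bi in zip(a, b)]
    # H2: constant term followed by cosine/sine pairs.
    h2 = [1.0]
    for k in range(1, q + 1):
        h2 += [math.cos(k * omega * t), math.sin(k * omega * t)]
    return h1 + h2


def prune(weights, names, tol=1e-6):
    """Drop basis functions whose trained weight is numerically zero."""
    keep = [i for i, w in enumerate(weights) if abs(w) > tol]
    return [weights[i] for i in keep], [names[i] for i in keep]
```

With m tanh units and q harmonics the expanded vector has 3 + m + 1 + 2q entries, so pruning the entries whose weights stay at zero directly shortens every later weight update.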
The revised weight updating method [14] using the affixation momentum is as follows:

$$W(k+1) = \delta(k)\, W(k) + (1 - \delta(k))\, \frac{\alpha\, e_k}{\lambda + \theta^T(k)\, \theta(k)} \times \theta(k) \qquad (14)$$

where k is the iteration number, $\lambda$ is the forgetting factor, $e_k$ is the output error at the kth step, $\alpha$ is the adaptive learning rate, which is positive and bounded above, and $\delta(k)$ is the momentum factor at the kth step, which is defined as:

$$\delta(k) = \begin{cases} \delta_1 & SSE_k > \beta\, SSE_{k-1} \\ \delta_2 & SSE_k < SSE_{k-1} \\ \delta(k-1) & \text{otherwise} \end{cases} \qquad (15)$$

where $SSE_k$ is the sum of squared errors of the network output at the kth step, $\delta_1$, $\delta_2$, $\beta$ are empirical constant parameters and

$$\theta(k) = [\,\mathrm{sgn}(1\ \cos\omega t\ \sin\omega t\ \ldots\ \cos q\omega t\ \sin q\omega t)\ \ \tanh(a_{11} T(t) + a_{12} T^2(t) + a_{13} T^3(t) + b_1)\ \ldots\ \tanh(a_{s1} T(t) + a_{s2} T^2(t) + a_{s3} T^3(t) + b_s)\ \ldots\,]$$

The algorithm is equivalent to a low-pass filter, which allows the network to ignore small changes. It can also decrease the possibility of getting stuck in a local minimum. Therefore, the convergence rate is faster than that of the original Widrow-Hoff delta rule algorithm. The convergence of the algorithm is guaranteed by [14] and the Lyapunov stability theory.

In the learning process, the rules for setting the parameters are summarized as follows:

(1) In the initialization stage the frequency characteristics of the load data curve are still unknown, and it is obvious that the curves to be predicted are superimposed from a variety of different frequency bands, so relatively large values are assigned to q and s in order to traverse the various frequency bands and improve the accuracy of the prediction.

(2) The function of T and the other attributes should obey the characteristics of the power load system, e.g. the function of the temperature can be $T(t) = \sin(0.5t) + x^{0.5}$ to reflect the exponential relationship and the periodicity. The parameters of the function can be fine-tuned during the learning of the historical data.

(3) The initial $a_{ij}$ and $b_i$ can be chosen randomly; $\delta_1$, $\delta_2$, $\beta$ should be adjusted according to the simulation results of the prediction.

In the following section, the predicted result of the FLANN method is denoted by $f_2(t)$.

C. Combining Forecasting

In view of the limitations of single forecasting methods, combining forecasting methods have been applied under the premise that the final prediction is a nonlinear weighted combination of the single approaches.
Suppose that there are m kinds of forecasting methods for the event F. If we express the ith method as $\varphi_i$, the nonlinear combination of the different forecasting methods can be described as follows:

$$y = \Phi(x) = \Phi(\varphi_1, \varphi_2, \ldots, \varphi_m) \qquad (16)$$

Under a certain measurement, $\Phi(x)$ is superior to $\varphi_i(x)$. As explained in the previous section, the improved ensemble decision tree strategy and the FLANN prediction are chosen as the individual predicting models, so the remaining key problem is the nonlinear mapping. Considering the nonlinear mapping ability of the BP neural network, a three-layer BP neural network is chosen to determine the optimal combination forecasting weights, as shown in Fig. 4.

Figure 4: BP-based combined forecasting

The implementation of the combining forecasting is divided into the following steps:

(a) The training phase: the data in the history database is extracted to train the BP network offline, so that the corresponding weights are obtained and the relationship between the predicted values of the two individual methods and the actual value can be modeled. In the offline training, the input is the individual
predicted values $f_1(t)$, $f_2(t)$ and the output is the actual power load value recorded in the historical database.

(b) The forecasting phase: the input of the BP network is the predicted values of the power load for the day to be forecasted, $f_1(t)$, $f_2(t)$, based on the weather forecast data, and the output is the final desired predicted value.

IV. APPLICATION AND RESULTS

The performance of the proposed combining strategy has been tested using one year of load and meteorological data, for the seasons with many abnormal days, at the Anhui Power Dispatching and Communication Center, which currently uses the ELPSDM [7] method for short-term load forecasting. The accuracy formula used to evaluate the performance of the forecasting is defined by (17):

$$R_j = \left(1 - \sqrt{\frac{1}{n} \sum_{i=1}^{n} E_i^2}\right) \times 100\% \qquad (17)$$

where $E_i$ is the relative error of the forecasting points given in [16]. As the 96-point method is adopted to get the predicted curve, n equals 96.

In the modeling phase, the historical power and meteorological data from May 2005 to May 2008 are used as the training data. In the predicting phase, the obtained model is used to predict the power load from June 2008 to September 2008.

A. Analysis of the modified ensemble data mining

In order to show the advantage of the ensemble data mining, this section compares the forecasting accuracy of the basic C4.5 decision tree with that of the ensemble data mining with pattern modification (shortened as ELM). The result can be seen in Table II.

Table II shows that the forecasting accuracy of the ELM is clearly higher than that of the basic C4.5. In particular, it can be calculated from Table II that the overall average value of the $R_j$ defined in (17) is 96.4%, compared to 94.49% for the basic C4.5. It is worth mentioning that on the abnormal days, when the prediction accuracy of the basic C4.5 algorithm is relatively low, the accuracy increases greatly with the proposed ensemble data mining with modification, e.g. on July 5th, July 6th, July 9th, etc.

TABLE II. COMPARISON OF FORECASTING ACCURACY IN JULY, 2008
Date    Basic C4.5    ELM
7-01    95.77    96.0
7-02    97.5    97.46
7-03    96.5    97.6
7-04    95.8    95.99
7-05    90.4    94.93
7-06    89.96    9.
7-07    96.36    97.
7-08    9.85    95.
7-09    90.7    94.30
7-10    96.3    97.6
7-11    90.34    93.7
7-12    9.96    94.49
7-13    97.08    97.80
7-14    95.9    96.83
7-15    96.89    97.03
7-16    97.39    98.40
7-17    93.08    95.86
7-18    9.34    95.88
7-19    94.49    95.5
7-20    95.5    95.70
7-21    96.66    97.55
7-22    94.64    96.0
7-23    9.35    94.88
7-24    96.5    97.89
7-25    95.8    95.90
7-26    96.00    96.87
7-27    93.70    94.74
7-28    96.36    97.88
7-29    96.    97.3
7-30    94.43    96.86
7-31    9.50    95.3

B. Analysis of the FLANN

The average prediction accuracy from June 2008 to September 2008 is given in Table III for the comparison between the traditional FLANN and the affixation momentum FLANN with pruning (shortened as AMFLANN).

TABLE III. COMPARISON OF FORECASTING ACCURACY FROM JUN. TO SEP. 2008
Month    FLANN    AMFLANN
2008-06    94.9    95.63
2008-07    93.46    95.90
2008-08    94.34    95.9
2008-09    98.75    98.99

Table III shows that the improved AMFLANN algorithm gives a substantial increase in forecasting accuracy. Moreover, a large number of experimental results have confirmed the algorithm's inherent ability to reject pathological data and to reduce their impact to the greatest extent, since the FLN uses the expanded basis functions. In addition, the mechanism of the power load is similar even at different times, so the choice of the basis functions is relatively fixed, while the coefficients can be trained adaptively based on the historical data.

C. Analysis of the overall system

To verify the performance of the proposed method, two comparisons are carried out: the first is the comparison between the forecast and the real load of the Anhui power load network, as shown in Fig. 5; the second is the comparison of the performance of the improved system with that of the currently used one, as shown in Fig. 6.

It can be seen from Figure 5 and Figure 6 that the improved system is not only able to maintain a high accuracy of the load prediction throughout the summer, but also greatly improves the accuracy of the prediction when there is rapid climate change. It is worth mentioning that, because the algorithm depends on the weather forecast to some extent, a serious weather forecasting error has a considerable negative influence on the accuracy of the predicted results.
So in Figure 6, the seriously erroneous forecast of the cold spell on 3 August causes the prediction accuracy to drop slightly below 90%. However, the statistics of the forecasting accuracy over the entire summer show that the improved system can keep a highly accurate prediction
to achieve an average prediction accuracy of 96.4% even when there are many anomalies in the weather conditions. Comparing the currently used system with the proposed system, on the abnormal days when the currently used system has difficulty achieving an accurate prediction, the average prediction accuracy has been improved by .4% compared to the currently used system, while the monthly average accuracy of the proposed system throughout the year has reached 97.9%.

Figure 5: Comparison between forecasting and real-load (legend: Real Load, Predicting Load; horizontal axis: Date)

Figure 6: Performance of improved system vs. the original one

V. CONCLUSIONS

In this paper, an ensemble data mining and FLANN combining forecasting system has been proposed to achieve high prediction accuracy, especially on abnormal days. A variety of abnormal patterns have been recognized and corresponding modifications are given to improve the prediction accuracy. The actual prediction results have shown that the strategy greatly improves the prediction accuracy on abnormal days while ensuring the overall prediction accuracy, and enhances the system's ability to adapt to abnormal conditions.

Future work will focus on the following aspects: the first is how to make the system adaptive to other common abnormal situations such as political events, holidays, contingencies, etc. The second is how to redesign the system to improve its feedback performance, and how to make the system robust to the weather forecasting.

REFERENCES

[1] Z.H. Osman, M.L. Awad and T.K. Mahmoud, "Neural network based approach for short-term load forecasting," Power Systems Conference and Exposition, IEEE/PES, 2009, pp. 1-8.
[2] S. Rahman and R. Bhatnagar, "An expert system based algorithm for short term load forecast," IEEE Trans. on Power Systems, Vol. 3, No. 2, pp. 392-399, 1988.
[3] S. Sachdeva, C.M. Verma, "Load forecasting using fuzzy methods," Power System Technology and IEEE Power India Conference, 2008, pp. 1-4.
[4] A.M. Escobar, L.P.
Perez, "Application of support vector machines and ANFIS to the short-term load forecasting," Transmission and Distribution Conference and Exposition: Latin America, IEEE/PES, 2008, pp. 1-5.
[5] Y. Lu, Y.N. Huang, "Research on analytical methods of electric load based on data mining," Intelligent Computation Technology and Automation (ICICTA), Vol. , pp. 085-088, 2010.
[6] H.S. Hippert, C.E. Pedreira and R.C. Souza, "Neural networks for short-term load forecasting: a review and evaluation," IEEE Trans. Power Syst., Vol. 16, No. 1, pp. 44-55, 2001.
[7] L. Hong Lu, H.Q. Zhang et al., "An Electric Load Prediction System Based on Data Mining," Mini-Micro Systems, Vol. 25, No. 3, pp. 434-437, 2004.
[8] http://en.wikipedia.org/wiki/C4.5_algorithm
[9] Y. Freund, R.E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," J. Comput. Syst. Sci., Vol. 55, No. 1, pp. 119-139, 1997.
[10] S. Dudoit, J. Fridlyand and T.P. Speed, "Comparison of discrimination methods for the classification of tumors using gene expression data," Technical Report 576, Department of Statistics, University of California at Berkeley, Berkeley, CA, 2000.
[11] J.R. Quinlan, "Bagging, boosting, and C4.5," Proc. of 13th National Conference on Artificial Intelligence, Portland, Oregon, pp. 725-730, 1996.
[12] Y.H. Pao, S.M. Phillips and D.J. Sobajic, "Neural-net computing and intelligent control systems," Int. J. Contr., Vol. 56, No. 2, pp. 263-289, 1992.
[13] A. Sierra et al., "Evolution of functional link networks," IEEE Transactions on Evolutionary Computation, Vol. 5, No. 1, pp. 54-65, February 2001.
[14] P.K. Dash et al., "A real-time short-term load forecasting system using functional link network," IEEE Transactions on Power Systems, Vol. 12, No. 2, pp. 675-680, May 1997.
[15] H.T. Zhang et al., "Forecasting algorithm of short-term electric power load based on improved FLN," Transactions of China Electrotechnical Society, Vol. 19, No. 5, pp. 9-96, 2004.
[16] D.X. Niu et al., The methods and application of power system load forecasting. Beijing: China Electric Power Press, 1998.

Ming Li was born in Jiujiang, Jiangxi, P.R. China on Oct. nd, 1980.
He received a doctor's degree in engineering in July 2008, specializing in pattern recognition and intelligent systems, from the University of Science and Technology of China, Hefei, P.R. China. He is now an instructor at Guangdong University of Technology, Guangzhou, Guangdong, P.R. China. His current research interests include modeling, simulation and control of complex systems.

Junli Gao is a doctor of engineering and a master tutor at Guangdong University of Technology. His current research interests include power electronics and motion control technology, computer numerical control systems, the development and application of embedded systems, etc.