Experiencing SAX: a Novel Symbolic Representation of Time Series


JESSICA LIN
Information and Software Engineering Department, George Mason University, Fairfax, VA 22030

EAMONN KEOGH, LI WEI, STEFANO LONARDI
Computer Science & Engineering Department, University of California Riverside, Riverside, CA 92521

Abstract. Many high level representations of time series have been proposed for data mining, including Fourier transforms, wavelets, eigenwaves, piecewise polynomial models etc. Many researchers have also considered symbolic representations of time series, noting that such representations would potentially allow researchers to avail themselves of the wealth of data structures and algorithms from the text processing and bioinformatics communities. While many symbolic representations of time series have been introduced over the past decades, they all suffer from two fatal flaws. Firstly, the dimensionality of the symbolic representation is the same as the original data, and virtually all data mining algorithms scale poorly with dimensionality. Secondly, although distance measures can be defined on the symbolic approaches, these distance measures have little correlation with distance measures defined on the original time series. In this work we formulate a new symbolic representation of time series. Our representation is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measures defined on the original series. As we shall demonstrate, this latter feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. In particular, we will demonstrate the utility of our representation on various data mining tasks of clustering, classification, query by content, anomaly detection, motif discovery, and visualization.

Keywords: Time Series, Data Mining, Symbolic Representation, Discretize

1. Introduction

Many high level representations of time series have been proposed for data mining. Figure 1 illustrates a hierarchy of all the various time series representations in the literature [3,, 0, 4, 7, 30, 36, 53, 54, 63].
One representation that the data mining community has not considered in detail is the discretization of the original data into symbolic strings. At first glance this seems a surprising oversight. There is an enormous wealth of existing algorithms and data structures that allow the efficient manipulation of strings. Such algorithms have received decades of attention in the text retrieval community, and more recent attention from the bioinformatics community [5, 9, 5, 5, 57, 60]. Some simple examples of tools that are not defined for real-valued sequences but are defined for symbolic approaches include hashing, Markov models, suffix trees, decision trees etc.

There is, however, a simple explanation for the data mining community's lack of interest in string manipulation as a supporting technique for mining time series. If the data are transformed into virtually any of the other representations depicted in Figure 1, then it is possible to measure the similarity of two time series in that representation space, such that the distance is guaranteed to lower bound the true distance between the time series in the original space. (The exceptions are random mappings, which are only guaranteed to be within an epsilon of the true distance with a certain probability, trees, interpolation and natural language.) This simple fact is at the core of almost all algorithms in time
series data mining and indexing [0]. However, in spite of the fact that there are dozens of techniques for producing different variants of the symbolic representation [3, 5, 7], there is no known method for calculating the distance in the symbolic space while providing the lower bounding guarantee.

In addition to allowing the creation of lower bounding distance measures, there is one other highly desirable property of any time series representation, including a symbolic one. Almost all time series datasets are very high dimensional. This is a challenging fact because all non-trivial data mining and indexing algorithms degrade exponentially with dimensionality. For example, above 16-20 dimensions, index structures degrade to sequential scanning [6]. None of the symbolic representations that we are aware of allow dimensionality reduction [3, 5, 7]. There is some reduction in the storage space required, since fewer bits are required for each value; however, the intrinsic dimensionality of the symbolic representation is the same as the original data.

Figure 1: A hierarchy of all the various time series representations in the literature. The leaf nodes refer to the actual representation, and the internal nodes refer to the classification of the approach. The contribution of this paper is to introduce a new representation, the lower bounding symbolic approach.

There is no doubt that a new symbolic representation that remedies all the problems mentioned above would be highly desirable. More specifically, the symbolic representation should meet the following criteria: space efficiency, time efficiency (fast indexing), and correctness of answer sets (no false dismissals). In this work we formally formulate a novel symbolic representation and show its utility on other time series tasks.
Our representation is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic representation that lower bound corresponding popular distance measures defined on the original data. As we shall demonstrate, the latter feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. In particular, we will demonstrate the utility of our representation on the classic data mining tasks of clustering [9], classification [4], indexing [, 0, 30, 63], and anomaly detection [4, 34, 54].

The rest of this paper is organized as follows. Section 2 briefly discusses background material on time series data mining and related work. Section 3 introduces our novel symbolic approach, and discusses its dimensionality reduction, numerosity reduction and lower bounding abilities. Section 4 contains an experimental evaluation of the symbolic approach on a variety of data mining tasks. Impact of the symbolic approach is also discussed. Finally, Section 5 offers some conclusions and suggestions for future work.

2. Background and Related Work

Time series data mining has attracted enormous attention in the last decade. The review below is necessarily brief; we refer interested readers to [3, 53] for a more in-depth review. A preliminary version of this paper appears in [4].
2.1 Time Series Data Mining Tasks

While making no pretence to be exhaustive, the following list summarizes the areas that have seen the majority of research interest in time series data mining.

Indexing: Given a query time series Q, and some similarity/dissimilarity measure D(Q,C), find the most similar time series in database DB [,, 0, 30, 63].
Clustering: Find natural groupings of the time series in database DB under some similarity/dissimilarity measure D(Q,C) [9, 36].
Classification: Given an unlabeled time series Q, assign it to one of two or more predefined classes [4].
Summarization: Given a time series Q containing n datapoints where n is an extremely large number, create a (possibly graphic) approximation of Q which retains its essential features but fits on a single page, computer screen, executive summary etc [43].
Anomaly Detection: Given a time series Q, and some model of "normal" behavior, find all sections of Q which contain anomalies or "surprising/interesting/unexpected/novel" behavior [4, 34, 54].

Since the datasets encountered by data miners typically don't fit in main memory, and disk I/O tends to be the bottleneck for any data mining task, a simple generic framework for time series data mining has emerged [0]. The basic approach is outlined in Table 1.

Table 1: A generic time series data mining approach.
1. Create an approximation of the data, which will fit in main memory, yet retains the essential features of interest.
2. Approximately solve the task at hand in main memory.
3. Make (hopefully very few) accesses to the original data on disk to confirm the solution obtained in Step 2, or to modify the solution so it agrees with the solution we would have obtained on the original data.

It should be clear that the utility of this framework depends heavily on the quality of the approximation created in Step 1. If the approximation is very faithful to the original data, then the solution obtained in main memory is likely to be the same as, or very close to, the solution we would have obtained on the original data. The handful of disk accesses made in Step 3 to confirm or slightly modify the solution will be inconsequential compared to the number of disk accesses required had we worked on the original data. With this in mind, there has been great interest in approximate representations of time series, which we consider below.
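The three steps of the generic framework can be sketched for nearest-neighbor search as follows. This is a minimal illustration rather than the authors' implementation: the function names (`paa`, `euclidean`, `nearest_neighbor`) are hypothetical, an in-memory list stands in for the data on disk, and a simple piecewise-mean approximation with a lower-bounding distance is assumed as the approximation of Step 1.

```python
from math import sqrt


def paa(series, w):
    """Mean of each of w equal-sized frames (assumes len(series) % w == 0)."""
    frame = len(series) // w
    return [sum(series[i * frame:(i + 1) * frame]) / frame for i in range(w)]


def euclidean(q, c):
    return sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def nearest_neighbor(query, database, w):
    """Return (index, distance, accesses) of the exact nearest neighbor."""
    n = len(query)
    q_approx = paa(query, w)
    # Step 1: build the small approximations that would live in main memory.
    approximations = [paa(series, w) for series in database]
    # Step 2: solve approximately, ranking candidates by a lower bound.
    bounds = sorted(
        (sqrt(n / w) * euclidean(q_approx, approx), idx)
        for idx, approx in enumerate(approximations)
    )
    # Step 3: touch the original data only while the lower bound allows a win.
    best_idx, best_dist, accesses = None, float("inf"), 0
    for lower_bound, idx in bounds:
        if lower_bound >= best_dist:
            break  # no remaining candidate can beat the best so far
        accesses += 1  # stands in for one disk access
        dist = euclidean(query, database[idx])
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist, accesses


database = [[0, 0, 0, 0], [1, 1, 1, 1], [5, 5, 5, 5], [1, 2, 1, 2]]
print(nearest_neighbor([1, 1, 1, 1.5], database, w=2))
```

Because the ranking distance never exceeds the true distance, stopping at the first candidate whose bound reaches the best true distance cannot dismiss the real answer; the `accesses` counter shows how few originals had to be examined.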
2.2 Time Series Representations

As with most problems in computer science, the suitable choice of representation greatly affects the ease and efficiency of time series data mining. With this in mind, a great number of time series representations have been introduced, including the Discrete Fourier Transform (DFT) [0], the Discrete Wavelet Transform (DWT) [], Piecewise Linear, and Piecewise Constant models (PAA) [30], (APCA) [4, 30], and Singular Value Decomposition (SVD) [30]. Figure 2 illustrates the most commonly used representations.

Recent work suggests that there is little to choose between the above in terms of indexing power [3]; however, the representations have other features that may act as strengths or weaknesses. As a simple example, wavelets have the useful multiresolution property, but are only defined for time series that are an integer power of two in length [].

One important feature of all the above representations is that they are real valued. This limits the algorithms, data structures and definitions available for them. For example, in anomaly detection we cannot meaningfully define the probability of observing any particular set of wavelet coefficients, since the
probability of observing any real number is zero [38]. Such limitations have led researchers to consider using a symbolic representation of time series.

Figure 2: The most common representations for time series data mining (Discrete Fourier Transform, Piecewise Linear Approximation, Haar Wavelet, Adaptive Piecewise Constant Approximation). Each can be visualized as an attempt to approximate the signal with a linear combination of basis functions.

While there are literally hundreds of papers on discretizing (symbolizing, tokenizing, quantizing) time series [3, 7] (see [5] for an extensive survey), none of the techniques allows a distance measure that lower bounds a distance measure defined on the original time series. For this reason, the generic time series data mining approach illustrated in Table 1 is of little utility, since the approximate solution to the problem created in main memory may be arbitrarily dissimilar to the true solution that would have been obtained on the original data. If, however, one had a symbolic approach that allowed lower bounding of the true distance, one could take advantage of the generic time series data mining model, and of a host of other algorithms, definitions and data structures which are only defined for discrete data, including hashing, Markov models, and suffix trees. This is exactly the contribution of this paper. We call our symbolic representation of time series SAX (Symbolic Aggregate approXimation), and define it in the next section.

3. SAX: Our Symbolic Approach

SAX allows a time series of arbitrary length $n$ to be reduced to a string of arbitrary length $w$ ($w < n$, typically $w \ll n$). The alphabet size is also an arbitrary integer $a$, where $a > 2$. Table 2 summarizes the major notation used in this and subsequent sections.

Table 2: A summarization of the notation used in this paper
$C$: A time series $C = c_1, \ldots, c_n$
$\bar{C}$: A Piecewise Aggregate Approximation of a time series, $\bar{C} = \bar{c}_1, \ldots, \bar{c}_w$
$\hat{C}$: A symbol representation of a time series, $\hat{C} = \hat{c}_1, \ldots, \hat{c}_w$
$w$: The number of PAA segments representing time series $C$
$a$: Alphabet size (e.g., for the alphabet = {a, b, c}, $a = 3$)

Our discretization procedure is unique in that it uses an intermediate representation between the raw time series and the symbolic strings. We first transform the data into the Piecewise Aggregate Approximation (PAA) representation and then symbolize the PAA representation into a discrete string.
There are two important advantages to doing this:

Dimensionality Reduction: We can use the well-defined and well-documented dimensionality reduction power of PAA [30, 63], and the reduction is automatically carried over to the symbolic representation.

Lower Bounding: Proving that a distance measure between two symbolic strings lower bounds the true distance between the original time series is non-trivial. The key observation that
allowed us to prove lower bounds is to concentrate on proving that the symbolic distance measure lower bounds the PAA distance measure. Then we can prove the desired result by transitivity by simply pointing to the existing proofs for the PAA representation itself [3, 63].

We will briefly review the PAA technique before considering the symbolic extension.

3.1 Dimensionality Reduction Via PAA

A time series $C$ of length $n$ can be represented in a $w$-dimensional space by a vector $\bar{C} = \bar{c}_1, \ldots, \bar{c}_w$. The $i$-th element of $\bar{C}$ is calculated by the following equation:

$$\bar{c}_i = \frac{w}{n} \sum_{j = \frac{n}{w}(i-1)+1}^{\frac{n}{w} i} c_j \tag{1}$$

Simply stated, to reduce the time series from $n$ dimensions to $w$ dimensions, the data is divided into $w$ equal sized "frames". The mean value of the data falling within a frame is calculated and a vector of these values becomes the data-reduced representation. The representation can be visualized as an attempt to approximate the original time series with a linear combination of box basis functions as shown in Figure 3. For simplicity and clarity, we assume that $n$ is divisible by $w$. We relax this assumption later, in the section on the relaxation on the number of segments.

Figure 3: The PAA representation can be visualized as an attempt to model a time series with a linear combination of box basis functions. In this case, a sequence of length 128 is reduced to 8 dimensions.

The PAA dimensionality reduction is intuitive and simple, yet has been shown to rival more sophisticated dimensionality reduction techniques like Fourier transforms and wavelets [30, 3, 63].

We normalize each time series to have a mean of zero and a standard deviation of one before converting it to the PAA representation, since it is well understood that it is meaningless to compare time series with different offsets and amplitudes [3].

3.2 Discretization

Having transformed a time series database into the PAA we can apply a further transformation to obtain a discrete representation. It is desirable to have a discretization technique that will produce symbols with equiprobability [5, 45]. This is easily achieved since normalized time series have a Gaussian distribution [38]. To illustrate this, we extracted subsequences of length 128 from 8 different time series and plotted normal probability plots of the data as shown in Figure 4.
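The normalization and the PAA reduction of Section 3.1 can be sketched in a few lines. This is an illustrative sketch only: the function names `znorm` and `paa` are hypothetical, the series length is assumed to be divisible by the number of frames, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
from statistics import mean, stdev


def znorm(series):
    """Rescale a series to zero mean and unit standard deviation.

    Assumption: sample standard deviation; a constant series is not handled.
    """
    mu = mean(series)
    sigma = stdev(series)
    return [(x - mu) / sigma for x in series]


def paa(series, w):
    """Reduce a series of length n to w frame means (n divisible by w)."""
    n = len(series)
    if n % w != 0:
        raise ValueError("this sketch assumes n is divisible by w")
    frame = n // w
    return [mean(series[i * frame:(i + 1) * frame]) for i in range(w)]


raw = [2.0, 4.0, 6.0, 8.0, 7.0, 5.0, 3.0, 1.0]
print(paa(znorm(raw), 4))
```

Each output value is the mean of one frame of the normalized series, so the eight points above collapse to four coefficients.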
A normal probability plot is a graphical technique that shows if the data is approximately normally distributed []: an approximate straight line indicates that the data is approximately normally distributed. As the figure shows, the highly linear nature of the plots suggests that the data is approximately normal. For a large family of the time series data at our disposal, we notice that the Gaussian assumption is indeed true. For the small subset of data where the assumption is not
obeyed, the efficiency is slightly deteriorated; however, the correctness of the algorithm is unaffected. The correctness of the algorithm is guaranteed by the lower-bounding property of the distance measure in the symbolic space, which we will explain in the next section.

Given that the normalized time series have a highly Gaussian distribution, we can simply determine the "breakpoints" that will produce $a$ equal-sized areas under the Gaussian curve [38].

Definition 1. Breakpoints: breakpoints are a sorted list of numbers $B = \beta_1, \ldots, \beta_{a-1}$ such that the area under a $N(0,1)$ Gaussian curve from $\beta_i$ to $\beta_{i+1}$ equals $1/a$ ($\beta_0$ and $\beta_a$ are defined as $-\infty$ and $\infty$, respectively).

These breakpoints may be determined by looking them up in a statistical table. For example, Table 3 gives the breakpoints for values of $a$ from 3 to 10.

Figure 4: A normal probability plot of the distribution of values from subsequences of length 128 from 8 different datasets. The highly linear nature of the plot strongly suggests that the data came from a Gaussian distribution.

Table 3: A lookup table that contains the breakpoints that divide a Gaussian distribution in an arbitrary number (from 3 to 10) of equiprobable regions.
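Instead of a statistical table, the breakpoints of Definition 1 can also be computed from the inverse cumulative distribution function of the standard normal. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the function name `breakpoints` is a hypothetical choice.

```python
from statistics import NormalDist


def breakpoints(a):
    """Return beta_1 .. beta_{a-1}, cutting N(0,1) into a equiprobable regions."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # beta_i is the point below which a fraction i/a of the probability lies.
    return [std_normal.inv_cdf(i / a) for i in range(1, a)]


for a in range(3, 11):
    print(a, [round(b, 2) for b in breakpoints(a)])
```

Each row printed corresponds to one alphabet size, in the same spirit as the lookup table described above.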
Once the breakpoints have been obtained we can discretize a time series in the following manner. We first obtain a PAA of the time series. All PAA coefficients that are below the smallest breakpoint are mapped to the symbol "a", all coefficients greater than or equal to the smallest breakpoint and less than the second smallest breakpoint are mapped to the symbol "b", etc. Figure 5 illustrates the idea.

Figure 5: A time series is discretized by first obtaining a PAA approximation and then using predetermined breakpoints to map the PAA coefficients into SAX symbols. In the example above, with $n = 128$, $w = 8$ and $a = 3$, the time series is mapped to the word baabccbc.

Note that in this example the 3 symbols, "a", "b" and "c", are approximately equiprobable as we desired. We call the concatenation of symbols that represent a subsequence a word.

Definition 2. Word: A subsequence $C$ of length $n$ can be represented as a word $\hat{C} = \hat{c}_1, \ldots, \hat{c}_w$ as follows. Let $alpha_i$ denote the $i$-th element of the alphabet, i.e., $alpha_1 = a$ and $alpha_2 = b$. Then the mapping from a PAA approximation $\bar{C}$ to a word $\hat{C}$ is obtained as follows:

$$\hat{c}_i = alpha_j, \quad \text{iff} \quad \beta_{j-1} \le \bar{c}_i < \beta_j \tag{2}$$

We have now defined our symbolic representation (the PAA representation is merely an intermediate step required to obtain the symbolic representation).

Recently, [6] has empirically and theoretically shown some very promising clustering results for "clipping", that is to say, converting the time series to a binary vector. They demonstrated that discretizing the time series before clustering significantly improves the accuracy in the presence of outliers. We note that "clipping" is actually a special case of SAX, where $a = 2$.

3.3 Distance Measures

Having introduced the new representation of time series, we can now define a distance measure on it. By far the most common distance measure for time series is the Euclidean distance [3, 5]. Given two time series $Q$ and $C$ of the same length $n$, Eq. 3 defines their Euclidean distance, and Figure 6A illustrates a visual intuition of the measure.

$$D(Q, C) \equiv \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2} \tag{3}$$

If we transform the original subsequences into PAA representations, $\bar{Q}$ and $\bar{C}$, using Eq. 1, we can then obtain a lower bounding approximation of the Euclidean distance between the original subsequences by:

$$DR(\bar{Q}, \bar{C}) \equiv \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{q}_i - \bar{c}_i)^2} \tag{4}$$

This measure is illustrated in Figure 6B.
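The symbol mapping of Definition 2 and the two distances of Eq. 3 and Eq. 4 can be sketched as follows. The names `to_word`, `euclidean` and `paa_distance` are hypothetical, the alphabet is assumed to be the lowercase letters starting at "a", and the breakpoints are computed from the standard normal rather than read from a table.

```python
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist


def to_word(paa_coeffs, a):
    """Map PAA coefficients to symbols: beta_{j-1} <= c < beta_j gives alpha_j."""
    bps = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    # bisect_right counts the breakpoints <= value, i.e. the 0-based symbol index.
    return "".join(chr(ord("a") + bisect_right(bps, v)) for v in paa_coeffs)


def euclidean(q, c):
    """Eq. 3: Euclidean distance between two equal-length series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def paa_distance(q_paa, c_paa, n):
    """Eq. 4: PAA distance, scaled by the square root of the compression rate."""
    w = len(q_paa)
    return sqrt(n / w) * euclidean(q_paa, c_paa)


print(to_word([-1.0, 0.1, 1.0], a=3))
print(paa_distance([1.5, 3.5], [0.5, 0.5], n=4), euclidean([1, 2, 3, 4], [0, 1, 0, 1]))
```

With `a=2` the same mapping reduces to the "clipping" case mentioned above, since a single breakpoint at zero splits the values into two symbols.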
If we further transform the data into the symbolic representation, we can define a MINDIST function that returns the minimum distance between the original time series of two words:
$$MINDIST(\hat{Q}, \hat{C}) \equiv \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} \left( dist(\hat{q}_i, \hat{c}_i) \right)^2} \tag{5}$$

The function resembles Eq. 4 except for the fact that the distance between the two PAA coefficients has been replaced with the sub-function dist(). The dist() function can be implemented using a table lookup as illustrated in Table 4.

Table 4: A lookup table used by the MINDIST function. This table is for an alphabet of cardinality of 4, i.e. $a = 4$. The distance between two symbols can be read off by examining the corresponding row and column. For example, dist(a,b) = 0 and dist(a,c) = 0.67.

      a     b     c     d
a     0     0     0.67  1.34
b     0     0     0     0.67
c     0.67  0     0     0
d     1.34  0.67  0     0

The value in cell $(r, c)$ for any lookup table can be calculated by the following expression.

$$cell_{r,c} = \begin{cases} 0, & \text{if } |r - c| \le 1 \\ \beta_{\max(r,c)-1} - \beta_{\min(r,c)}, & \text{otherwise} \end{cases} \tag{6}$$

For a given value of the alphabet size $a$, the table needs only be calculated once, then stored for fast lookup. The MINDIST function can be visualized in Figure 6C.

Figure 6: A visual intuition of the three representations discussed in this work, and the distance measures defined on them. A) The Euclidean distance between two time series can be visualized as the square root of the sum of the squared differences of each pair of corresponding points. B) The distance measure defined for the PAA approximation can be seen as the square root of the sum of the squared differences between each pair of corresponding PAA coefficients, multiplied by the square root of the compression rate. C) The distance between two SAX representations of a time series requires looking up the distances between each pair of symbols, squaring them, summing them, taking the square root and finally multiplying by the square root of the compression rate.
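The lookup table of Eq. 6 and the MINDIST function of Eq. 5 can be sketched as below. The names `dist_table` and `mindist` are hypothetical, words are assumed to be strings over the lowercase letters starting at "a", and the breakpoints are computed rather than looked up.

```python
from math import sqrt
from statistics import NormalDist


def dist_table(a):
    """Build the a-by-a symbol distance table of Eq. 6 (0-based indices)."""
    bps = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    table = [[0.0] * a for _ in range(a)]
    for r in range(a):
        for c in range(a):
            if abs(r - c) > 1:
                # With 1-based symbols R = r + 1 and C = c + 1 this is
                # beta_{max(R,C)-1} - beta_{min(R,C)}; bps[k - 1] holds beta_k.
                table[r][c] = bps[max(r, c) - 1] - bps[min(r, c)]
    return table


def mindist(word_q, word_c, n, a):
    """Eq. 5: lower-bounding distance between two SAX words of equal length."""
    if len(word_q) != len(word_c):
        raise ValueError("words must have the same length")
    w = len(word_q)
    table = dist_table(a)  # in practice computed once per alphabet size
    squared = sum(
        table[ord(q) - ord("a")][ord(c) - ord("a")] ** 2
        for q, c in zip(word_q, word_c)
    )
    return sqrt(n / w) * sqrt(squared)


for row in dist_table(4):
    print([round(v, 2) for v in row])
print(mindist("abcd", "dcba", n=16, a=4))
```

Adjacent symbols are deliberately assigned distance zero: two values on either side of a shared breakpoint can be arbitrarily close, so zero is the only value that is guaranteed not to overestimate.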
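As mentioned, one of the most important characteristics of SAX is that it provides a lower-bounding distance measure. Below, we show that MINDIST lower-bounds the Euclidean distance in two steps. First, we will show that the PAA distance lower-bounds the Euclidean distance. The proof has appeared in [3] by the current author; for completeness, we repeat the proof here. Next, we will show that MINDIST lower-bounds the PAA distance, which in turn, by transitivity, shows that MINDIST lower-bounds the Euclidean distance.

Step 1: We need to show that the PAA distance lower-bounds the Euclidean distance; that is, $D(Q, C) \ge DR(\bar{Q}, \bar{C})$. We will show the proof on the case where there is a single PAA frame (i.e. mapping the time series into one single PAA coefficient). A more generalized proof for N frames can be obtained by applying the single-frame proof on every frame.

Proof: Using the same notations as Eq. 3 and Eq. 4, we want to prove that

$$\sqrt{\sum_{i=1}^{n} (q_i - c_i)^2} \ge \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{q}_i - \bar{c}_i)^2} \tag{7}$$

Let $\bar{Q}$ and $\bar{C}$ be the means of time series $Q$ and $C$, respectively. Since we are considering only the single-frame case, Ineq. 7 can be rewritten as:

$$\sqrt{\sum_{i=1}^{n} (q_i - c_i)^2} \ge \sqrt{n} \sqrt{(\bar{Q} - \bar{C})^2} \tag{8}$$

Squaring both sides we get

$$\sum_{i=1}^{n} (q_i - c_i)^2 \ge n (\bar{Q} - \bar{C})^2 \tag{9}$$

Each point $q_i$ in $Q$ can be represented in terms of $\bar{Q}$, i.e. $q_i = \bar{Q} - \Delta q_i$. The same applies to each point $c_i$ in $C$. Thus, Ineq. 9 can be rewritten as:

$$\sum_{i=1}^{n} \left( (\bar{Q} - \Delta q_i) - (\bar{C} - \Delta c_i) \right)^2 \ge n (\bar{Q} - \bar{C})^2 \tag{10}$$

Rearranging the left-hand side we get

$$\sum_{i=1}^{n} \left( (\bar{Q} - \bar{C}) - (\Delta q_i - \Delta c_i) \right)^2 \ge n (\bar{Q} - \bar{C})^2 \tag{11}$$

We can expand and rewrite Ineq. 11 as:

$$\sum_{i=1}^{n} \left( (\bar{Q} - \bar{C})^2 - 2 (\bar{Q} - \bar{C})(\Delta q_i - \Delta c_i) + (\Delta q_i - \Delta c_i)^2 \right) \ge n (\bar{Q} - \bar{C})^2 \tag{12}$$

By distributive law we get: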
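$$\sum_{i=1}^{n} (\bar{Q} - \bar{C})^2 - \sum_{i=1}^{n} 2 (\bar{Q} - \bar{C})(\Delta q_i - \Delta c_i) + \sum_{i=1}^{n} (\Delta q_i - \Delta c_i)^2 \ge n (\bar{Q} - \bar{C})^2 \tag{13}$$

Or

$$n (\bar{Q} - \bar{C})^2 - 2 (\bar{Q} - \bar{C}) \sum_{i=1}^{n} (\Delta q_i - \Delta c_i) + \sum_{i=1}^{n} (\Delta q_i - \Delta c_i)^2 \ge n (\bar{Q} - \bar{C})^2 \tag{14}$$

Recall that $q_i = \bar{Q} - \Delta q_i$, which means that $\Delta q_i = \bar{Q} - q_i$, and similarly, $\Delta c_i = \bar{C} - c_i$. Therefore, the summation part of the second term on the left-hand side of the inequality becomes:

$$\sum_{i=1}^{n} (\Delta q_i - \Delta c_i) = \sum_{i=1}^{n} \left( (\bar{Q} - q_i) - (\bar{C} - c_i) \right) = \left( n\bar{Q} - \sum_{i=1}^{n} q_i \right) - \left( n\bar{C} - \sum_{i=1}^{n} c_i \right) = (n\bar{Q} - n\bar{Q}) - (n\bar{C} - n\bar{C}) = 0$$

Substituting 0 into the second term on the left-hand side, Ineq. 14 becomes:

$$n (\bar{Q} - \bar{C})^2 - 0 + \sum_{i=1}^{n} (\Delta q_i - \Delta c_i)^2 \ge n (\bar{Q} - \bar{C})^2 \tag{15}$$

Cancelling out $n (\bar{Q} - \bar{C})^2$ on both sides of the inequality, we get

$$\sum_{i=1}^{n} (\Delta q_i - \Delta c_i)^2 \ge 0 \tag{16}$$

which always holds true, and hence completes the proof.

Step 2: Continuing from Step 1 and using the same methodology, we will now show that MINDIST lower-bounds the PAA distance; that is, we will show that

$$n (\bar{Q} - \bar{C})^2 \ge n \left( dist(\hat{Q}, \hat{C}) \right)^2 \tag{17}$$

Let a = 1, b = 2, and so forth. There are two possible scenarios:

Case 1: $|\hat{Q} - \hat{C}| \le 1$. In other words, the symbols representing the two time series are either the same, or consecutive from the alphabet, e.g. $\hat{Q} = \hat{C} = $ 'a', or $\hat{Q} = $ 'a' and $\hat{C} = $ 'b'. From Eq. 6, we know that the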
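MINDIST is 0 in this case. Therefore, the right-hand side of Ineq. 17 becomes zero, which makes the inequality always hold true.

Case 2: $|\hat{Q} - \hat{C}| > 1$. In other words, the symbols representing the two time series are at least two alphabets apart, e.g. $\hat{Q} = $ 'c' and $\hat{C} = $ 'a'. For simplicity, assume $\hat{Q} > \hat{C}$; the case where $\hat{Q} < \hat{C}$ can be proven in similar fashion. According to Eq. 6, $dist(\hat{Q}, \hat{C})$ is

$$dist(\hat{Q}, \hat{C}) = \beta_{\hat{Q}-1} - \beta_{\hat{C}} \tag{18}$$

For the example above, dist('c', 'a') $= \beta_2 - \beta_1$.

Recall that Eq. 2 states the following: $\hat{c}_i = alpha_j$, iff $\beta_{j-1} \le \bar{c}_i < \beta_j$.

So we know that

$$\beta_{\hat{Q}-1} \le \bar{Q} < \beta_{\hat{Q}} \quad \text{and} \quad \beta_{\hat{C}-1} \le \bar{C} < \beta_{\hat{C}} \tag{19}$$

Substituting Eq. 18 into Ineq. 17 we get

$$n (\bar{Q} - \bar{C})^2 \ge n (\beta_{\hat{Q}-1} - \beta_{\hat{C}})^2 \tag{20}$$

which implies that

$$|\bar{Q} - \bar{C}| \ge |\beta_{\hat{Q}-1} - \beta_{\hat{C}}| \tag{21}$$

Note that from our assumptions earlier that $|\hat{Q} - \hat{C}| > 1$ and $\hat{Q} > \hat{C}$ (i.e. $\bar{Q}$ is at a higher region than $\bar{C}$), we can drop the absolute value notations on both sides:

$$\bar{Q} - \bar{C} \ge \beta_{\hat{Q}-1} - \beta_{\hat{C}} \tag{22}$$

Rearranging the terms we get:

$$\bar{Q} - \beta_{\hat{Q}-1} \ge \bar{C} - \beta_{\hat{C}} \tag{23}$$

which we know always holds true since, from Ineq. 19, we know that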
$$\bar{Q} - \beta_{\hat{Q}-1} \ge 0 \quad \text{and} \quad \bar{C} - \beta_{\hat{C}} < 0 \tag{24}$$

This completes the proof for $\hat{Q} > \hat{C}$. The case where $\hat{Q} < \hat{C}$ can be proven similarly, and is omitted for brevity.

There is one issue we must address if we are to use a symbolic representation of time series. If we wish to approximate a massive dataset in main memory, the parameters $w$ and $a$ have to be chosen in such a way that the approximation makes the best use of the primary memory available. There is a clear tradeoff between the parameter $w$ controlling the number of approximating elements, and the value $a$ controlling the granularity of each approximating element.

It is unfeasible to determine the best tradeoff analytically, since it is highly data dependent. We can however empirically determine the best values with a simple experiment. Since we wish to achieve the tightest possible lower bounds, we can simply estimate the lower bounds over all possible feasible parameters, and choose the best settings.

$$\text{Tightness of Lower Bound} = \frac{MINDIST(\hat{Q}, \hat{C})}{D(Q, C)} \tag{25}$$

We performed such a test with a concatenation of 50 time series databases taken from the UCR time series data mining archive. For every combination of parameters we averaged the result of 100,000 experiments on subsequences of length 256. Figure 7 shows the results.

Figure 7: The empirically estimated tightness of lower bounds over the cross product of $a$ = [3…11] and $w$ = [2…9] (axes: word size $w$, alphabet size $a$, tightness of lower bound). The darker histogram bars illustrate combinations of parameters that require approximately equal space to store every possible word.

The results suggest that using a low value for $a$ results in weak bounds. While it is intuitive that larger alphabet sizes yield better results, there are diminishing returns as $a$ increases. If space is an issue, an alphabet size in the range 5 to 8 seems to be a good choice that offers a reasonable balance between space and tightness of lower bound (each alphabet within this range can be represented with just 3 bits). Increasing the alphabet size would require more bits to represent each alphabet.

We end this section with a visual comparison between SAX and the four most used representations in the literature (Figure 8). We can see that SAX preserves the general shape of the original time series.
Note that since SAX is a symbolic representation, the alphabets can be stored as bits rather than doubles, which results in a considerable amount of space-saving. Therefore, the SAX representation can afford to have higher dimensionality than the other real-valued approaches, while using less or the same amount of space.
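The tightness of lower bound (Eq. 25) is straightforward to estimate empirically. The sketch below is a toy version of such an experiment, not a reproduction of the one reported here: it uses seeded synthetic random walks instead of archive data, the helper names are hypothetical, the sample standard deviation is assumed for normalization, and the series length is assumed to be divisible by `w`.

```python
import random
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist, mean, stdev


def znorm(series):
    mu, sigma = mean(series), stdev(series)
    return [(x - mu) / sigma for x in series]


def paa(series, w):
    frame = len(series) // w  # assumes len(series) is divisible by w
    return [mean(series[i * frame:(i + 1) * frame]) for i in range(w)]


def tightness(q, c, w, a):
    """MINDIST of the two SAX words divided by the true Euclidean distance."""
    bps = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    zq, zc = znorm(q), znorm(c)
    n = len(zq)
    true_dist = sqrt(sum((x - y) ** 2 for x, y in zip(zq, zc)))
    sym_q = [bisect_right(bps, v) for v in paa(zq, w)]  # 0-based symbols
    sym_c = [bisect_right(bps, v) for v in paa(zc, w)]
    squared = 0.0
    for r, s in zip(sym_q, sym_c):
        if abs(r - s) > 1:  # Eq. 6: same or adjacent symbols contribute 0
            squared += (bps[max(r, s) - 1] - bps[min(r, s)]) ** 2
    return sqrt(n / w) * sqrt(squared) / true_dist


def random_walk(rng, n):
    value, walk = 0.0, []
    for _ in range(n):
        value += rng.gauss(0.0, 1.0)
        walk.append(value)
    return walk


rng = random.Random(0)
pairs = [(random_walk(rng, 64), random_walk(rng, 64)) for _ in range(20)]
for w, a in [(4, 3), (8, 5), (8, 8), (16, 8)]:
    avg = mean([tightness(q, c, w, a) for q, c in pairs])
    print(f"w={w:2d} a={a}: average tightness {avg:.3f}")
```

A ratio of 1 would mean the symbolic distance equals the true distance; because MINDIST is a lower bound, the ratio never exceeds 1.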
Figure 8: A visual comparison of SAX and the four most common time series data mining representations (DFT, PLA, Haar, APCA). A raw time series of length 128 is transformed into the word ffffffeeeddcbaabceedcbaaaaacddee. This is a fair comparison since the number of bits in each representation is the same.

3.4 Numerosity Reduction

We have seen that, given a single time series, our approach can significantly reduce its dimensionality. In addition, our approach can reduce the numerosity of the data for some applications.

Most applications assume that we have one very long time series $T$, and that manageable subsequences of length $n$ are extracted by use of a sliding window, then stored in a matrix for further manipulation [, 0, 30, 63]. Figure 9 illustrates the idea.

Figure 9: An illustration of the notation introduced in this section: A time series $T$ of length 128, the subsequence $C_{67}$, of length $n = 16$, and the first 8 subsequences extracted by a sliding window. Note that the sliding windows are overlapping.

When performing sliding windows subsequence extraction with any of the real-valued representations, we must store all $|T| - n + 1$ extracted subsequences (in dimensionality reduced form). However, imagine for a moment that we are using our proposed approach. If the first word extracted is aabbcc, and the window is shifted to discover that the second word is also aabbcc, we can reasonably decide not to include the second occurrence of the word in the sliding windows matrix. If we ever need to retrieve all occurrences of aabbcc, we can go to the location pointed to by the first occurrence, and remember to slide to the right, testing to see if the next window is also mapped to the same word. We can stop testing as soon as the word changes. This simple idea is very similar to the run-length-encoding data compression algorithm.

The utility of this optimization depends on the parameters used and the data itself, but it typically yields a numerosity reduction factor of two or three. However, many datasets are characterized by long periods of little or no movement, followed by bursts of activity (seismological data is an obvious example). On these datasets the numerosity reduction factor can be huge. Consider the example shown in Figure 10.
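The run-length style bookkeeping described in this section can be sketched as follows. The function name `reduce_numerosity` is hypothetical, and the input is assumed to be the list of SAX words already extracted by the sliding window, in window order.

```python
def reduce_numerosity(words):
    """Keep only the first window of each run of identical consecutive words.

    Returns (window_index, word) pairs; the index acts as the pointer to the
    first occurrence, from which the rest of the run can be recovered.
    """
    kept = []
    previous = None
    for index, word in enumerate(words):
        if word != previous:
            kept.append((index, word))
            previous = word
    return kept


windows = ["aabbcc", "aabbcc", "aabbcc", "aabccc", "aabccc", "aabbcc"]
print(reduce_numerosity(windows))
```

Only consecutive repeats are dropped, so a word that reappears after a different word is recorded again, which keeps every occurrence recoverable by sliding right from a stored index.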
Figure 10: Sliding window extraction on Space Shuttle Telemetry data (STS-57), with $n = 32$. At time point 61, the extracted word is aabbcc, and the next 401 subsequences also map to this word. Only a pointer to the first occurrence must be recorded, thus producing a large reduction in numerosity.

There is only one special case we must consider. As we noted in Section 3.1, we normalize each time series (including subsequences) to have a mean of zero and a standard deviation of one. However, if the subsequence contains only one value, the standard deviation is not defined. More troublesome is the case where the subsequence is almost constant, perhaps 31 zeros and a single small non-zero value. If we normalize this subsequence, the single differing element will have its value exploded to 5.48. This situation occurs quite frequently. For example, the last 200 time units of the data in Figure 10 appear to be constant, but actually contain tiny amounts of noise. If we were to normalize subsequences extracted from this area, the normalization would magnify the noise to large meaningless patterns.

We can easily deal with this problem: if the standard deviation of the sequence before normalization is below an epsilon ε, we simply assign the entire word to the middle-ranged alphabet (e.g. cccccc if $a = 5$).

3.5 Relaxation on the Number of Segments

So far we have described SAX with the assumption that the length of the time series is divisible by the number of segments, i.e. $n/w$ must be an integer. If $n$ is not divisible by $w$, there will be some points in the time series for which it is unclear which segment they belong to. For example, in Figure 11A, we are dividing 10 data points into 5 segments, and it is obvious that points 1 and 2 should be in segment 1; points 3 and 4 should be in segment 2; and so forth. In Figure 11B, we are dividing 10 data points into 3 segments. It is not clear which segment point 4 should go to: segment 1 or segment 2. The same problem holds for point 7. The assumption that $n$ must be divisible by $w$ clearly limits our choices of $w$, and is problematic if $n$ is a prime number. Here we show that this need not be the case and provide a simple solution when $n$ is not divisible by $w$. Instead of putting the whole point into a segment, we can put part of it.
For example, in Figure 11B, point 4 contributes 1/3 of its value to segment 1 and 2/3 to segment 2, and point 7 contributes 2/3 of its value to segment 2 and 1/3 to segment 3. This makes each segment contain exactly 3 1/3 data points and solves the indivisibility problem. This generalization is implemented in the later version of SAX, as well as some of the applications that utilize SAX.
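One simple way to realize this fractional assignment is to repeat every point `w` times and then cut the expanded series into `w` frames of `n` values each; a point that straddles a frame boundary then contributes to both frames in the stated proportions. The sketch below illustrates that idea only and is not necessarily how later SAX versions implement it; the name `paa_fractional` is hypothetical.

```python
from statistics import mean


def paa_fractional(series, w):
    """PAA for any w <= n, splitting boundary points between adjacent frames."""
    n = len(series)
    # Repeating each point w times gives n * w values, which always divide
    # evenly into w frames of n values; each copy carries weight 1/w of a point.
    expanded = [x for x in series for _ in range(w)]
    return [mean(expanded[k * n:(k + 1) * n]) for k in range(w)]


print(paa_fractional([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3))
print(paa_fractional([1, 2, 3, 4], 2))
```

For 10 points and 3 segments, the first frame holds three copies each of points 1 to 3 and one copy of point 4, which is exactly the 1/3 weight described above; when `n` is divisible by `w` the result matches ordinary PAA.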
Figure 11: A) 10 data points are divided into 5 segments. B) 10 data points are divided into 3 segments. The data points marked with circles contribute to two adjacent segments at the same time.

4. Experimental Validation of Our Symbolic Approach

In this section, we perform various data mining tasks using our symbolic approach and compare the results with other well-known existing approaches.

For clustering, classification, and anomaly detection, we compare the results with the classic Euclidean distance, and with other previously proposed symbolic approaches. Note that none of these other approaches use dimensionality reduction. In the next paragraphs we summarize the strawmen representations that we compare ours to. We choose these two approaches since they are typical representatives of approaches in the literature.

André-Jönsson and Badal [3] proposed the SDA algorithm that computes the changes between values from one instance to the next, and divides the range into user-predefined sections. The disadvantages of this approach are obvious: prior knowledge of the data distribution of the time series is required in order to set the breakpoints; and the discretized time series does not conserve the general shape or distribution of the data values.

Huang and Yu proposed the IMPACTS algorithm, which uses the change ratio between one time point and the next time point to discretize the time series [7]. The range of change ratios is then divided into equal-sized sections and mapped into symbols. The time series is converted to a discretized collection of change ratios. As with SAX, the user needs to define the cardinality of symbols.

4.1 Clustering

Clustering is one of the most common data mining tasks, being useful in its own right as an exploratory tool, and also as a subroutine in more complex algorithms [6,, 9]. We consider two clustering algorithms, one of hierarchical clustering, and one of partitional clustering.

4.1.1 Hierarchical Clustering

Comparing hierarchical clusterings is a very good way to compare and contrast similarity measures, since a dendrogram of size N summarizes $O(N^2)$ distance calculations [3].
The evaluation is typically subjective; we simply adjudge which distance measure appears to create the most natural groupings of the data. However, if we know the data labels in advance we can also make objective statements of the quality of the clustering. In Figure 12 we clustered nine time series from the Control Chart dataset, three each from the decreasing trend, upward shift and normal classes.
Figure 12: A comparison of the four distance measures' ability to cluster members of the Control Chart dataset (panels: Euclidean, SAX, IMPACTS (alphabet=8), SDA). Complete linkage was used as the agglomeration technique.

In this case we can objectively state that SAX is superior, since it correctly assigns each class to its own subtree. This is simply a side effect due to the smoothing effect of dimensionality reduction. Therefore, it is not surprising that SAX can sometimes outperform the simple Euclidean distance, especially on noisy data, or data with shifting on the time-axis. This fact is demonstrated in the dendrogram produced by Euclidean distance: the "normal" class, which contains a lot of noise, is not clustered correctly. More generally, we observed that SAX closely mimics Euclidean distance on various datasets.

The reasons that SDA and IMPACTS perform poorly, we observe, are that neither symbolic representation is very descriptive of the general shape of the time series, and that the lack of dimensionality reduction can further distort the results if the data is noisy. What SDA does is essentially differencing the time series, and then discretizing the resulting series. While differencing has been used historically in statistical time series analysis, its purposes (to remove some autocorrelation, and to make a time series stationary) are not always applicable in the determination of similarity in data mining. In addition, although computing the derivatives tells the type of change from one time point to the next (sharp increase, slight increase, sharp decrease, etc.), this approach doesn't appear very useful since time series data are typically noisy. More specifically, in addition to the overall trends or shapes, there are noises that appear throughout the entire time series. Without any smoothing or dimensionality reduction, these noises are likely to overshadow the actual characteristics of the time series.

To demonstrate why the "decreasing trend" and the "upward shift" classes are indistinguishable by the clustering algorithm for SDA, let's look at what the differenced series look like. Figure 13 shows the original time series and their corresponding series after differencing. It is clear that the differenced series from the same class are not any more similar than those from a different class.
As a matter of fact, when we compute the pairwise distances between all 6 differenced series, we find that the distances are not at all indicative of the classes to which the data belong. Table 5 and Table 6 show the inter- and the intra-class distances between the series (the series from the "decreasing trend" class are denoted as $A_i$, and the series from the "upward shift" class are denoted as $B_i$).

Interestingly, in [3], the authors show that taking the first derivatives (i.e. differencing) actually worsens the results when compared to using the raw data. They further show that performing piecewise normalization (i.e. normalization on fixed-sized windows rather than on the whole series) on the first derivatives improves the results. Our experimental results validate their observations, as SDA does not do any kind of normalization, whereas piecewise normalization is a part of SAX (the PAA step).
IMPACTS suffers from similar problems as SDA. In addition, it is clear that neither IMPACTS nor SDA can beat simple Euclidean distance, and the discussion above applies to all data mining tasks, since the problems lie in the nature of the representations.

Figure 13: (A) Time series from the decreasing trend class and the resulting series after differencing. (B) Time series from the upward shift class and the resulting series after differencing.

Table 5: Intra-class distances between the differenced time series from the decreasing trend class (rows and columns: A1, A2, A3).

Table 6: Inter-class distances between the differenced time series from the decreasing trend and the upward shift classes (rows: A1, A2, A3; columns: B1, B2, B3).

Partitional clustering

Although hierarchical clustering is a good sanity check for any proposed distance measure, it has limited utility for data mining because of its poor scalability. The most commonly used data mining clustering algorithm is k-means [], so for completeness we will consider it here. We performed k-means on both the original raw data, and our symbolic representation. Figure 14 shows a typical run of k-means on a space telemetry dataset. Both algorithms converge after the same number of iterations. Since the k-means algorithm seeks to optimize the objective function by minimizing the sum of squared intra-cluster error, we compare and plot the objective functions for each iteration. The objective function for a given clustering is given by Eq. 6, where $x_i$ is the time series, and $c_m$ is the cluster center of the cluster that $x_i$ belongs to. The smaller the objective function, the more compact (thus better) the clusters.

$$F = \sum_{m=1}^{k} \sum_{i=1}^{N} \lVert x_i - c_m \rVert^2 \qquad (6)$$

The results here are quite unintuitive and surprising: working with an approximation of the data gives better results than working with the original data. Fortunately, a recent paper offers a suggestion as to why this might be so. It has been shown that initializing the cluster centers on a low-dimension approximation of the data can improve the quality [6]; this is what clustering with SAX implicitly does.
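As a minimal sketch of the k-means objective described above, the function below sums the squared distance from each series to the center of the cluster it is assigned to. The function name and the list-based data layout are assumptions made for illustration; they are not taken from the paper.

```python
def kmeans_objective(series, centers, assignment):
    """Sum of squared intra-cluster error.

    series:     list of equal-length numeric lists (the time series)
    centers:    list of cluster centers, same length as each series
    assignment: assignment[i] is the index of the cluster of series[i]
    """
    total = 0.0
    for x, m in zip(series, assignment):
        center = centers[m]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return total


# Toy usage: two series around the first center, one exactly on the second.
toy_series = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
toy_centers = [[1.0, 0.0], [10.0, 10.0]]
toy_assignment = [0, 0, 1]
objective = kmeans_objective(toy_series, toy_centers, toy_assignment)
```

Evaluating this quantity after every iteration, once for the clustering found on the raw data and once for the clustering found on the symbolic approximation, gives the kind of curve that the comparison above refers to.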
Figure 14 (plot of the objective function against the number of iterations, for the raw data and for our symbolic approach): A comparison of the k-means clustering algorithm using SAX and using the raw data. The dataset was Space Shuttle telemetry, 1,000 subsequences of length 512. Surprisingly, working with the symbolic approximation produces better results than working with the original data.

Later in the paper we introduce another distance measure based on SAX. By applying it on clustering, we show that it outperforms the Euclidean distance measure.

4.2 Classification

Classification of time series has attracted much interest from the data mining community. Although special-purpose algorithms have been proposed [36], we will consider only the two most common classification algorithms for brevity, clarity of presentation and to facilitate independent confirmation of our findings.

4.2.1 Nearest Neighbor Classification

To compare different distance measures on nearest-neighbor classification, we use leaving-one-out cross validation. Firstly, we compare SAX with Euclidean distance, IMPACTS, SDA, and LP∞. Two classic synthetic datasets are used: the Cylinder-Bell-Funnel (CBF) dataset has 50 instances of time series for each of the three clusters, and the Control Chart has 100 instances for each of the six clusters [3]. Since SAX allows dimensionality and alphabet size as user input, and IMPACTS allows variable alphabet size, we ran the experiments on different combinations of dimensionality reduction and alphabet size. For the other approaches we applied the simple dimensionality reduction technique of skipping data points at a fixed interval. In Figure 15, we show the result with a dimensionality reduction of 4 to 1. Similar results were observed for other levels of dimensionality reduction. Once again, SAX's ability to beat Euclidean distance is probably due to the smoothing effect of dimensionality reduction; nevertheless, this experiment does show the superiority of SAX over the others proposed in the literature.
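The leaving-one-out protocol for nearest-neighbor classification can be sketched as follows. The distance function is a parameter, so the same loop could be run with Euclidean distance or with any symbolic distance; the names `euclidean` and `loocv_1nn_error` are hypothetical, and the brute-force search is an assumption chosen for clarity rather than speed.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def loocv_1nn_error(series, labels, dist=euclidean):
    """Leave-one-out error rate of a 1-nearest-neighbor classifier.

    Each series is classified by the label of its nearest neighbor among
    all the other series; the result is the fraction of wrong labels.
    """
    errors = 0
    for i, query in enumerate(series):
        nearest = min(
            (j for j in range(len(series)) if j != i),
            key=lambda j: dist(query, series[j]),
        )
        if labels[nearest] != labels[i]:
            errors += 1
    return errors / len(series)


# Toy usage: two well-separated groups of one-point "series".
toy = [[0.0], [0.1], [1.0], [1.1]]
clean_error = loocv_1nn_error(toy, ["a", "a", "b", "b"])
mixed_error = loocv_1nn_error(toy, ["a", "b", "a", "b"])
```

With the first labeling every nearest neighbor shares the query's label, so the error rate is 0; with the second labeling every nearest neighbor has the other label, so the error rate is 1.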
Figure 15 (plots of error rate against alphabet size on the Cylinder-Bell-Funnel and Control Chart datasets, for IMPACTS, SDA, Euclidean, LP max, and SAX): A comparison of five distance measures' utility for nearest neighbor classification. We tested different alphabet sizes for SAX and IMPACTS; SDA's alphabet size is fixed at 5.

Since both IMPACTS and SDA perform poorly compared to Euclidean distance and SAX, we will exclude them from the rest of the classification experiments. To provide a closer look at how SAX compares to Euclidean distance, we ran an extensive experiment and compared the error rates on 22 datasets available online. Each dataset is split into training and testing parts. We use the training part to search for the best values of the SAX parameters w (number of SAX words) and a (size of the alphabet):

For w, we search from 2 up to n/2 (n is the length of the time series). Each time we double the value of w.
For a, we search each value between 3 and 10.
If there is a tie, we use the smaller values.

The compression ratio (last column of the next table) is calculated as $\frac{w \cdot \lceil \log_2 a \rceil}{32 \cdot n}$, because for the SAX representation we only need $\lceil \log_2 a \rceil$ bits per word, while for the original time series we need 4 bytes (32 bits) for each value.

Then we classify the testing set based on the training set using a one-nearest-neighbor classifier and report the error rate. The results are shown in Table 7. We also summarize the results by plotting the error rates for each dataset as a 2-dimensional point: (EU_error, SAX_error). If a point falls within the lower triangle, then SAX is more accurate than Euclidean distance, and vice versa for the upper triangle. The plot is shown in Figure 16. From this experiment, we can conclude that SAX is competitive with Euclidean distance, but requires far less space.
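The parameter search and the compression ratio described above can be sketched as below. Several details are assumptions on my part rather than statements from the paper: that the search over w starts at 2 and doubles up to n/2, that a ranges over 3 to 10 inclusive, and that the number of bits per symbol is the ceiling of log2(a). The function names are hypothetical.

```python
def bits_per_symbol(a):
    """Smallest number of bits that can encode a distinct symbols.

    Equals ceil(log2(a)) for a >= 1, computed with integers only.
    """
    return (a - 1).bit_length()


def compression_ratio(w, a, n):
    """Size of a SAX word relative to the raw series.

    w symbols of bits_per_symbol(a) bits each, versus n values of 32 bits.
    """
    return (w * bits_per_symbol(a)) / (32 * n)


def parameter_grid(n):
    """Candidate (w, a) pairs: w doubles from 2 up to n/2, a runs from 3 to 10.

    Pairs are generated in increasing order, so taking the first pair with
    the best training error implements "on a tie, use the smaller values".
    """
    grid = []
    w = 2
    while w <= n // 2:
        for a in range(3, 11):
            grid.append((w, a))
        w *= 2
    return grid


# Toy usage: a series of length 128 reduced to 8 symbols over 8 letters.
ratio = compression_ratio(w=8, a=8, n=128)   # 24 bits versus 4096 bits
grid = parameter_grid(16)                    # w in {2, 4, 8}, a in 3..10
```

In a full experiment each pair from the grid would be scored by its 1-NN error on the training part, and the winning pair would then be used once on the testing part.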
Table 7: 1-NN comparison between Euclidean distance and SAX.
Columns: Name; Number of Classes; Size of Training Set; Size of Testing Set; Time Series Length; 1-NN EU Error; 1-NN SAX Error; w; a; Compression Ratio.
Datasets, one row each: Synthetic Control, Gun-Point, CBF, Face (all), OSU Leaf, Swedish Leaf, 50Words, Trace, Two Patterns, Wafer, Face (four), Lighting (listed twice), ECG, Adiac, Yoga, Fish, Plane, Car, Beef, Coffee, Olive Oil.

Figure 16 (scatter plot of the error rate of the SAX representation against the error rate of Euclidean distance; one region is labeled as the region where Euclidean distance is more accurate, the other as the region where the SAX representation is more accurate): Error rates for SAX and Euclidean distance on 22 datasets. The lower triangle is the region where SAX is more accurate than Euclidean distance, and the upper triangle is where Euclidean distance is more accurate than SAX.
4.2.2 Decision Tree Classification

Because of Nearest Neighbor's poor scalability, it is unsuitable for most data mining applications; instead, decision trees are the most common choice of classifier. While decision trees are defined for real data, attempting to classify time series using the raw data would clearly be a mistake, since the high dimensionality and noise levels would result in a deep, bushy tree with poor accuracy. In an attempt to overcome this problem, Geurts [4] suggests representing the time series as a Regression Tree (RT) (this representation is essentially the same as APCA [30], see Figure 1), and training the decision tree directly on this representation. The technique shows great promise. We compared SAX to the Regression Tree (RT) on two datasets; the results are in Table 8.

Table 8: A comparison of SAX with the specialized Regression Tree approach for decision tree classification. Our approach used an alphabet size of 6; both approaches used a dimensionality of 8.
Dataset; SAX; Regression Tree
3.04 ± ±.
CBF 0.97 ±.4 .4 ±.0

Note that while our results are competitive with the RT approach, the RT representation is undoubtedly superior in terms of interpretability [4]. Once again, our point is simply that our black box approach can be competitive with specialized solutions.

4.3 Query by Content (Indexing)

The majority of work on time series data mining appearing in the literature has addressed the problem of indexing time series for fast retrieval [53]. Indeed, it is in this context that most of the representations enumerated in Figure 1 were introduced [, 0, 30, 63]. Dozens of papers have introduced techniques to do indexing with a symbolic approach [3, 7], but without exception, the answer set retrieved by these techniques can be very different to the answer set that would be retrieved by the true Euclidean distance. It is only by using a lower bounding technique that one can guarantee retrieving the full answer set, with no false dismissals [0].

To perform query by content, we build an index using SAX, and compare it to an index built using the Haar wavelet approach [].
Since the datasets we use are large and disk-resident, and the reduced dimensionality could still be potentially high (or at least high enough that the performance degenerates to sequential scan if an R-tree were used [6]), we use the Vector Approximation (VA) file as our indexing algorithm. We note, however, that SAX could also be indexed by classic string indexing techniques such as suffix trees.

To compare performance, we measure the percentage of disk I/Os required in order to retrieve the one-nearest neighbor to a randomly extracted query, relative to the number of disk I/Os required for sequential scan. Since it has been forcibly shown that the choice of dataset can make a significant difference in the relative indexing ability of a representation, we tested on more than 50 datasets from the UCR Time Series Data Mining Archive. In Figure 17 we show 4 representative examples. The y-axis shows the index power in terms of the percentage of the data retrieved from the disk, compared to sequential scan. In almost all cases, SAX shows a superior reduction in the number of disk accesses. In addition, SAX does not have the limitation faced by the Haar Wavelet that the data length must be a power of two.
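The role of lower bounding in exact retrieval can be illustrated with a small in-memory sketch. It is not the VA-file index used in the experiments: it is a linear scan that skips the full Euclidean computation whenever a symbolic lower bound already rules a candidate out. The bound follows the usual SAX MINDIST definition, which is not restated in this section, so treat its exact form here as an assumption; all function names are hypothetical, and the series are assumed to be normalized with a length divisible by w.

```python
import bisect
import math
import random
from statistics import NormalDist


def breakpoints(a):
    """Breakpoints that cut N(0, 1) into a equiprobable regions."""
    return [NormalDist().inv_cdf(i / a) for i in range(1, a)]


def paa(series, w):
    """Mean of w equal-length frames (len(series) divisible by w)."""
    frame = len(series) // w
    return [sum(series[i * frame:(i + 1) * frame]) / frame for i in range(w)]


def sax_word(series, w, beta):
    """Symbols as integers 0..a-1: the region each PAA value falls in."""
    return [bisect.bisect_right(beta, v) for v in paa(series, w)]


def mindist(word_q, word_c, beta, n):
    """Lower bound on the Euclidean distance of the original series."""
    total = 0.0
    for r, c in zip(word_q, word_c):
        lo, hi = min(r, c), max(r, c)
        if hi - lo > 1:  # adjacent or equal symbols contribute zero
            total += (beta[hi - 1] - beta[lo]) ** 2
    return math.sqrt(n / len(word_q)) * math.sqrt(total)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_search(query, database, w, a):
    """Exact 1-NN search; returns (index, distance, fraction fetched)."""
    beta = breakpoints(a)
    q_word = sax_word(query, w, beta)
    best_idx, best_dist, fetched = None, math.inf, 0
    for idx, cand in enumerate(database):
        bound = mindist(q_word, sax_word(cand, w, beta), beta, len(query))
        if bound >= best_dist:
            continue  # the bound proves this candidate cannot be closer
        fetched += 1  # stands in for one disk access to the raw series
        d = euclidean(query, cand)
        if d < best_dist:
            best_idx, best_dist = idx, d
    return best_idx, best_dist, fetched / len(database)


# Toy usage on seeded random data.
rng = random.Random(7)
database = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(40)]
query = [rng.gauss(0, 1) for _ in range(32)]
idx, dist, fraction = nn_search(query, database, w=8, a=4)
```

Because the bound never exceeds the true distance, skipping a candidate can never discard the true nearest neighbor, which is the "no false dismissals" guarantee; the returned fraction plays the role of the percentage of data retrieved relative to a sequential scan.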
Figure 17 (bars for DWT Haar and SAX on the Ballbeam, Chaotic, Memory, and Winding datasets): A comparison of indexing ability of wavelets versus SAX. The Y-axis is the percentage of the data that must be retrieved from disk to answer a 1-NN query of length 256, when the dimensionality reduction ratio is 32 to 1 for both approaches.

4.4 Taking Advantage of the Discrete Nature of our Representation

In the previous sections we showed examples of how our proposed representation can compete with real-valued representations and the original data. In this section we illustrate examples of data mining algorithms that take explicit advantage of the discrete nature of our representation.

Detecting Novel/Surprising/Anomalous Behavior

A simple idea for detecting anomalous behavior in time series is to examine previously observed normal data and build a model of it. Data obtained in the future can be compared to this model and any lack of conformity can signal an anomaly [4]. In order to achieve this, in [34] we combined a statistically sound scheme with an efficient combinatorial approach. The statistical scheme is based on Markov chains and normalization. Markov chains are used to model the normal behavior, which is inferred from the previously observed data. The time- and space-efficiency of the algorithm comes from the use of a suffix tree as the main data structure. Each node of the suffix tree represents a pattern. The tree is annotated with a score obtained by comparing the support of a pattern observed in the new data with the support recorded in the Markov model. This apparently simple strategy turns out to be very effective in discovering surprising patterns. In the original work we use a simple symbolic approach, similar to IMPACTS [7]; here we revisit the work using SAX.

For completeness, we will compare SAX to two highly referenced anomaly detection algorithms that are defined on real-valued representations, the TSA-tree wavelet-based approach of Shahabi et al. [54] and the Immunology (IMM) inspired work of Dasgupta and Forrest [4]. We also include the Markov technique using IMPACTS and SDA in order to discover how much of the difference can be attributed directly to the representation. Figure 18 contains an experiment comparing all 5 techniques.
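The idea of scoring a pattern by comparing its observed support with the support expected under a Markov model of normal behavior can be sketched as follows. This is a deliberately simplified stand-in for the method of [34]: it assumes a first-order Markov chain, uses plain dictionaries instead of a suffix tree, and scores a pattern by the raw difference between observed and expected counts. All names are hypothetical.

```python
from collections import Counter


def train_markov(reference):
    """First-order Markov model estimated from a string of normal symbols."""
    symbol_counts = Counter(reference)
    outgoing = Counter(reference[:-1])  # symbols that have a successor
    pair_counts = Counter(zip(reference, reference[1:]))
    return symbol_counts, outgoing, pair_counts, len(reference)


def pattern_probability(pattern, model):
    """Probability of seeing the pattern at a given position under the model."""
    symbol_counts, outgoing, pair_counts, total = model
    p = symbol_counts[pattern[0]] / total
    for a, b in zip(pattern, pattern[1:]):
        if outgoing[a] == 0:
            return 0.0
        p *= pair_counts[(a, b)] / outgoing[a]
    return p


def surprise_scores(observed, model, m):
    """Observed minus expected count for every length-m pattern in observed."""
    windows = len(observed) - m + 1
    counts = Counter(observed[i:i + m] for i in range(windows))
    return {
        pattern: count - windows * pattern_probability(pattern, model)
        for pattern, count in counts.items()
    }


# Toy usage: normal behavior alternates a and b; the new data has a run of a.
model = train_markov("abab" * 50)
new_data = "abab" * 10 + "aaaa" + "abab" * 10
scores = surprise_scores(new_data, model, m=3)
most_surprising = max(scores, key=scores.get)
```

In this toy case the pattern aaa never occurs in the reference string, so its expected count is zero, and its three occurrences in the new data make it the most surprising pattern. A suffix tree would let the same comparison be done for all pattern lengths in linear time and space, which the dictionary version does not attempt.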
Figure 18 (panels I, II, III, IIII, V, VI, VII): A comparison of five anomaly detection algorithms on the same task. (I) The training data, a slightly noisy sine wave of length 1,000. (II) The time series to be examined for anomalies is a noisy sine wave that was created with the same parameters as the training sequence, then an assortment of anomalies were introduced at time periods 250, 500 and 750. (III and IIII) The Markov Model technique using the IMPACTS and SDA representations did not clearly discover the anomalies, and reported some false alarms. (V) The IMM anomaly detection algorithm appears to have discovered the first anomaly, but it also reported many false alarms. (VI) The TSA-Tree approach is unable to detect the anomalies. (VII) The Markov model-based technique using SAX clearly finds the anomalies, with no false alarms.

The results on this simple experiment are impressive. Since suffix trees and Markov models can be used only on discrete data, this offers a motivation for our symbolic approach. While all the other approaches, including the Markov Models using IMPACTS and SDA representations, the Immunology-based anomaly detection approach, and the TSA-Tree approach, did not clearly discover the anomalies and reported some false alarms, the SAX-based Markov Model clearly finds the anomalies with no false alarms.

Motif Discovery

It is well understood in bioinformatics that overrepresented DNA sequences often have biological significance [5, 9, 5]. A substantial body of literature has been devoted to techniques to discover such patterns [5, 57, 60]. In a previous work, we defined the related concept of time series motif [43]. Time series motifs are close analogues of their discrete cousins, although the definitions must be augmented to prevent certain degenerate solutions. The naïve algorithm to discover the motifs is quadratic in the length of the time series. In [43], we demonstrated a simple technique to mitigate the quadratic complexity by a large constant factor; nevertheless, this time complexity is clearly untenable for most real datasets. The symbolic nature of SAX offers a unique opportunity to avail of the wealth of bioinformatics research in this area. In particular, recent work by Tompa and Buhler holds great promise [60].
The authors show that many previously unsolvable motif discovery problems can be solved by hashing subsequences into buckets using a random subset of their features as a key, then doing some post-processing search on the hash buckets (see footnote 3). They call their algorithm PROJECTION. We carefully reimplemented the random projection algorithm of Tompa and Buhler, making minor changes in the post-processing step to allow for the fact that although we are hashing random projections of our symbolic representation, we actually wish to discover motifs defined on the original raw data [3]. Figure 19 shows an example of a motif discovered in an industrial dataset [8] using this technique. The patterns found are extremely similar to one another.

Footnote 3: Of course, this description greatly understates the contributions of this work. We urge the reader to consult the original paper.
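The hashing step of random projection can be sketched as below, under stated assumptions: the input is a list of equal-length SAX words (one per subsequence), each iteration picks a random subset of positions as the hash key, and pairs of words that land in the same bucket get a collision count. The post-processing on the raw data that the text mentions is not shown, and the function name, the parameters, and the toy words are all hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def random_projection_collisions(words, mask_size, iterations, seed=0):
    """Count, for each pair of words, how often they hash to the same bucket.

    words:      equal-length strings, e.g. SAX words of sliding windows
    mask_size:  number of positions kept as the hash key in each iteration
    iterations: number of random masks to try
    """
    rng = random.Random(seed)
    length = len(words[0])
    collisions = defaultdict(int)
    for _ in range(iterations):
        positions = sorted(rng.sample(range(length), mask_size))
        buckets = defaultdict(list)
        for idx, word in enumerate(words):
            key = "".join(word[p] for p in positions)
            buckets[key].append(idx)
        for members in buckets.values():
            for i, j in combinations(members, 2):
                collisions[(i, j)] += 1
    return collisions


# Toy usage: words 0 and 1 differ in a single position; the others are unrelated.
toy_words = ["abcab", "abcaa", "ccbba", "bacbc"]
hits = random_projection_collisions(toy_words, mask_size=2, iterations=50)
```

Pairs with high collision counts are the motif candidates: two words that differ in only a few positions collide whenever the random mask avoids those positions, which is why the scheme tolerates some noise. In the toy data only the pair (0, 1) agrees on at least two positions, so it is the only pair that can ever collide.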
Figure 19 (plot of the Winding dataset, angular speed of reel, with the two motif occurrences marked A and B): Above, a motif discovered in a complex dataset by the modified PROJECTION algorithm. Below, the motif is best visualized by aligning the two subsequences and "zooming in". The similarity of the two subsequences is striking, and hints at unexpected regularity.

Apart from the attractive scalability of the algorithm, there is another important advantage over other approaches. The PROJECTION algorithm is able to discover motifs even in the presence of noise. Our extension of the algorithm inherits this robustness to noise. We direct interested readers to [3] for more detailed discussion of this algorithm.

Visualization

Data visualization techniques are very important for data analysis, since the human eye has been frequently advocated as the ultimate data-mining tool. However, despite their illustrative nature, which can provide users better understanding of the data and intuitive interpretation of the mining results, there has been surprisingly little work on visualizing large time series datasets. One reason for this lack of interest is that time series data are also usually very massive in size. With limited pixel space and the typically enormous amount of data at hand, it is infeasible to display all the data on the screen at once, much less to find any useful information from the data. How to efficiently organize the data and present them in such a way that is intuitive and comprehensible to human eyes thus remains a great challenge. Ideally, the visualization technique should follow the Visual Information Seeking Mantra, as summarized by Dr. Ben Shneiderman: "Overview, zoom & filter, details-on-demand." In other words, it should be able to provide users the overview or summary of the data, and allow users to further investigate the interesting patterns highlighted by the tool.

To this end, we developed VizTree [4], a time series pattern discovery and visualization system based on augmenting suffix trees. VizTree visually summarizes both the global and local structures of time series data at the same time. In addition, it provides novel interactive solutions to many pattern discovery problems, including the discovery of frequently occurring patterns (motif discovery), surprising patterns (anomaly detection), and query by content.
The user interactive paradigm allows users to visually explore the time series, and perform real-time hypotheses testing. Since the use of a suffix tree requires that the input data be discrete, SAX is the perfect candidate for discretizing the time series data.

Compared to the existing time series visualization systems in the literature, VizTree is unique in several respects. First, almost all other approaches assume highly periodic time series, whereas VizTree makes no such assumption. Other methods typically require space (both memory space, and pixel space) that grows at least linearly with the length of the time series, making them untenable for mining massive datasets. Finally, VizTree allows us to visualize a much richer set of features, including global summaries of the differences between two time series, locally repeated patterns, anomalies, etc.

In VizTree, patterns are represented in a depth-limited tree structure, in which their frequencies of occurrence are encoded in the thicknesses of branches. The algorithm works by sliding a window across the time series and extracting subsequences of user-defined lengths. The subsequences are then discretized into strings by SAX and inserted into an augmented suffix tree. Each string is regarded as a pattern, and the frequency of occurrence for each pattern is encoded by the thickness of the branch: the thicker the branch, the more frequent the corresponding pattern. Motif discovery and anomaly detection can thus be easily achieved: those that occur frequently can be regarded as motifs, and those that occur rarely can be regarded as anomalies.

Figure 20 shows the screenshot of VizTree for anomaly detection on the Dutch power demand dataset. Electricity consumption is recorded every 15 minutes; therefore, for the year of 1997, there are 35,040 data points. The majority of the weeks follow the regular Monday-Friday, 5-working-day pattern, as shown by the thick branches. The thin branches denote the anomalies (in the sense that the electricity consumption is abnormal given the day of the week). Note that in VizTree, we reverse the alphabet ordering so the alphabets now read top-down rather than bottom-up (e.g. a is now the topmost branch, rather than the bottommost branch). This way, the string better describes the actual shape of the time series: a denotes the top region, b the middle region, c the bottom region. The top right window shows the subtree when we click on the 2nd child of the root node. Clicking on any of the existing branches in the main or the subtree window will plot the subsequences represented by them in the bottom right window. The highlighted, circled subsequence is retrieved by clicking on the branch bab. The zoom-in shows why it is an anomaly: it is the beginning of the three-day week during Christmas (Thursday and Friday off). The other thin branches denote other anomalies such as New Year's Day, Good Friday, Queen's Birthday, etc.

Figure 20 (VizTree screenshot with a zoom-in panel): Anomaly detection on power consumption data. The anomaly shown here is a short week during Christmas.

The evaluation for visualization techniques is usually subjective. Although VizTree clearly demonstrates its capability in detecting non-trivial patterns, in [40] we also devise a measure that quantifies the effectiveness of the algorithm. The measure, which we call the dissimilarity coefficient, describes how dissimilar two time series are, and ranges from 0 to 1. In essence, the coefficient summarizes the difference in behavior of each pattern (represented by a string) in two time series. More concretely, for each pattern, we count its respective numbers of occurrences in both time series, and see how much the frequencies differ.
We call this measure support, which is then weighted by the confidence, or the degree of interestingness of the pattern. For example, a pattern that occurs 120 times in time series A and 100 times in time series B is probably less significant than a pattern that occurs 20 times in A but zero times in B, even though the support for both cases is 20.

Subtracting the dissimilarity coefficient from 1 gives us a novel similarity measure that describes how similar two time series are. More details on the dissimilarity measure can be found in [40]. An important fact about this similarity measure is that, unlike a distance measure that computes point-to-point distances, it captures the global structure of the time series rather than local differences. This time-invariant feature is useful if we are interested in the overall structures of the time series. Figure 21 shows the dendrogram of the clustering result using the dissimilarity coefficient as the distance measure. It clearly demonstrates that the coefficient captures the dissimilarity very well and that all clusters are separated perfectly. Note that it is even able to distinguish the four different sets of heartbeats (from top down, cluster 1, 4, 5, and 6)!

Figure 21: Clustering result using the dissimilarity coefficient.

As a reference, we ran the same clustering algorithm using the widely-used Euclidean distance. The result is shown in Figure 22. Clearly, clustering using our dissimilarity measure returns superior results.

Figure 22: Clustering result using Euclidean distance.
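The pattern counting that underlies VizTree's branch thicknesses, and the support term of the dissimilarity coefficient, can be sketched as follows. The sketch assumes the time series have already been discretized into symbol strings, counts patterns with a dictionary rather than an augmented suffix tree, and takes support to be the absolute difference of the two counts. The confidence weighting and the normalization to the range 0 to 1 are defined in [40] and are not reproduced here. The function names are hypothetical.

```python
from collections import Counter


def pattern_counts(symbols, length):
    """Frequency of every substring of the given length (sliding window)."""
    return Counter(
        symbols[i:i + length] for i in range(len(symbols) - length + 1)
    )


def support(counts_a, counts_b):
    """Absolute difference in frequency for every pattern seen in either series."""
    patterns = set(counts_a) | set(counts_b)
    return {p: abs(counts_a[p] - counts_b[p]) for p in patterns}


# Sliding-window counts on a toy symbol string.
toy_counts = pattern_counts("abab", 2)

# The example from the text: 120 versus 100 occurrences, and 20 versus 0.
counts_a = Counter({"abc": 120, "bca": 20})
counts_b = Counter({"abc": 100})
toy_support = support(counts_a, counts_b)
```

Both patterns in the toy example have a support of 20, which shows why support alone is not enough: a pattern that is entirely absent from one series is more telling than one that is merely somewhat less frequent, and that is the role the text assigns to the confidence weighting.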
Smple Lear Regresso Regresso equato a equato that descrbes the average relatoshp betwee a respose (depedet) ad a eplaator (depedet) varable. 6 8 Slopetercept equato for a le m b (,6) slope. (,) 6 6 8
More informationClassic Problems at a Glance using the TVM Solver
C H A P T E R 2 Classc Problems at a Glace usg the TVM Solver The table below llustrates the most commo types of classc face problems. The formulas are gve for each calculato. A bref troducto to usg the
More informationModels for Selecting an ERP System with Intuitionistic Trapezoidal Fuzzy Information
JOURNAL OF SOFWARE, VOL 5, NO 3, MARCH 00 75 Models for Selectg a ERP System wth Itutostc rapezodal Fuzzy Iformato Guwu We, Ru L Departmet of Ecoomcs ad Maagemet, Chogqg Uversty of Arts ad Sceces, Yogchua,
More informationLecture 4. Materials Covered: Chapter 7 Suggested Exercises: 7.1, 7.5, 7.7, 7.10, 7.11, 7.19, 7.20, 7.23, 7.44, 7.45, 7.47.
TT 430, ummer 006 Lecture 4 Materals Covered: Chapter 7 uggested Exercses: 7., 7.5, 7.7, 7.0, 7., 7.9, 7.0, 7.3, 7.44, 7.45, 7.47.. Deftos. () Parameter: A umercal summary about the populato. For example:
More informationOverview. Eingebettete Systeme. Model of periodic tasks. Model of periodic tasks. Echtzeitverhalten und Betriebssysteme
Overvew Egebettete Systeme able of some kow preemptve schedulg algorthms for perodc tasks: Echtzetverhalte ud Betrebssysteme 5. Perodsche asks statc prorty dyamc prorty Deadle equals perod Deadle smaller
More informationDECISION MAKING WITH THE OWA OPERATOR IN SPORT MANAGEMENT
ESTYLF08, Cuecas Meras (Meres  Lagreo), 79 de Septembre de 2008 DECISION MAKING WITH THE OWA OPERATOR IN SPORT MANAGEMENT José M. Mergó Aa M. GlLafuete Departmet of Busess Admstrato, Uversty of Barceloa
More informationREGRESSION II: Hypothesis Testing in Regression. Excel Regression Output. Excel Regression Output. A look at the sources of Variation in the Model
REGRESSION II: Hypothess Testg Regresso Tom Ilveto FREC 408 Model Regressg SAT (Y o Percet Takg (X Y s the Depedet Varable State average SAT Score 999  SATOTAL X s the Idepedet Varable Percet of hgh school
More informationOnline Appendix: Measured Aggregate Gains from International Trade
Ole Appedx: Measured Aggregate Gas from Iteratoal Trade Arel Burste UCLA ad NBER Javer Cravo Uversty of Mchga March 3, 2014 I ths ole appedx we derve addtoal results dscussed the paper. I the frst secto,
More informationOn Error Detection with Block Codes
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 9, No 3 Sofa 2009 O Error Detecto wth Block Codes Rostza Doduekova Chalmers Uversty of Techology ad the Uversty of Gotheburg,
More informationCompressive Sensing over Strongly Connected Digraph and Its Application in Traffic Monitoring
Compressve Sesg over Strogly Coected Dgraph ad Its Applcato Traffc Motorg Xao Q, Yogca Wag, Yuexua Wag, Lwe Xu Isttute for Iterdscplary Iformato Sceces, Tsghua Uversty, Bejg, Cha {qxao3, kyo.c}@gmal.com,
More informationAn Approach to Evaluating the Computer Network Security with Hesitant Fuzzy Information
A Approach to Evaluatg the Computer Network Securty wth Hestat Fuzzy Iformato Jafeg Dog A Approach to Evaluatg the Computer Network Securty wth Hestat Fuzzy Iformato Jafeg Dog, Frst ad Correspodg Author
More information22. The accompanying data describe flexural strength (Mpa) for concrete beams of a certain type was introduced in Example 1.2.
. The accompayg data descrbe flexural stregth (Mpa) for cocrete beams of a certa type was troduced Example.. 9. 9.7 8.8 0.7 8.4 8.7 0.7 6.9 8. 8.3 7.3 9. 7.8 8.0 8.6 7.8 7.5 8.0 7.3 8.9 0.0 8.8 8.7.6.3.8.7
More informationSingle machine stochastic appointment sequencing and scheduling
Sgle mahe stohast aotmet sequeg ad shedulg We develo algorthms for a sgle mahe stohast aotmet sequeg ad shedulg roblem th atg tme, dle tme, ad overtme osts. Ths s a bas stohast shedulg roblem that has
More informationChapter 3. AMORTIZATION OF LOAN. SINKING FUNDS R =
Chapter 3. AMORTIZATION OF LOAN. SINKING FUNDS Objectves of the Topc: Beg able to formalse ad solve practcal ad mathematcal problems, whch the subjects of loa amortsato ad maagemet of cumulatve fuds are
More informationFractalStructured Karatsuba`s Algorithm for Binary Field Multiplication: FK
FractalStructured Karatsuba`s Algorthm for Bary Feld Multplcato: FK *The authors are worg at the Isttute of Mathematcs The Academy of Sceces of DPR Korea. **Address : U Jog dstrct Kwahadog Number Pyogyag
More informationOpinion Extraction, Summarization and Tracking in News and Blog Corpora
Opo Extrato, Suarzato ad Trakg ews ad Blog Corpora LuWe Ku, YuTg Lag ad HsHs Che Departet of Coputer See ad Iforato Egeerg atoal Tawa Uversty Tape, Tawa {lwku, eaga}@lg.se.tu.edu.tw; hhhe@se.tu.edu.tw
More informationRobust Realtime Face Recognition And Tracking System
JCS& Vol. 9 No. October 9 Robust Realtme Face Recogto Ad rackg System Ka Che,Le Ju Zhao East Cha Uversty of Scece ad echology Emal:asa85@hotmal.com Abstract here s some very mportat meag the study of realtme
More informationFuture Value of an Annuity
Future Value of a Auty After payg all your blls, you have $200 left each payday (at the ed of each moth) that you wll put to savgs order to save up a dow paymet for a house. If you vest ths moey at 5%
More informationFast, Secure Encryption for Indexing in a ColumnOriented DBMS
Fast, Secure Ecrypto for Idexg a ColumOreted DBMS Tgja Ge, Sta Zdok Brow Uversty {tge, sbz}@cs.brow.edu Abstract Networked formato systems requre strog securty guaratees because of the ew threats that
More informationThe simple linear Regression Model
The smple lear Regresso Model Correlato coeffcet s oparametrc ad just dcates that two varables are assocated wth oe aother, but t does ot gve a deas of the kd of relatoshp. Regresso models help vestgatg
More informationOBJECT TRACKING AND POSITIONING ON VIDEO IMAGES
OBJC RACKIG AD OIIOIG O VIDO IMAG ChFar Che, M Che Ceter for pae ad Remote eg Reearh, atoal Cetral verty, Chug L, AIWA fhe@rr.u.edu.tw 55 Commo ICWG V/III KY WORD: Vdeo, arget, rakg, Objet, Mathg ABRAC:
More informationADAPTATION OF SHAPIROWILK TEST TO THE CASE OF KNOWN MEAN
Colloquum Bometrcum 4 ADAPTATION OF SHAPIROWILK TEST TO THE CASE OF KNOWN MEAN Zofa Hausz, Joaa Tarasńska Departmet of Appled Mathematcs ad Computer Scece Uversty of Lfe Sceces Lubl Akademcka 3, 95 Lubl
More informationCSSE463: Image Recognition Day 27
CSSE463: Image Recogto Da 27 Ths week Toda: Alcatos of PCA Suda ght: roject las ad relm work due Questos? Prcal Comoets Aalss weght grth c ( )( ) ( )( ( )( ) ) heght sze Gve a set of samles, fd the drecto(s)
More informationPlastic Number: Construction and Applications
Scet f c 0 Advaced Advaced Scetfc 0 December,.. 0 Plastc Number: Costructo ad Applcatos Lua Marohć Polytechc of Zagreb, 0000 Zagreb, Croata lua.marohc@tvz.hr Thaa Strmeč Polytechc of Zagreb, 0000 Zagreb,
More informationRQM: A new ratebased active queue management algorithm
: A ew ratebased actve queue maagemet algorthm Jeff Edmods, Suprakash Datta, Patrck Dymod, Kashf Al Computer Scece ad Egeerg Departmet, York Uversty, Toroto, Caada Abstract I ths paper, we propose a ew
More informationStochastic Programming Models for International Asset Allocation Problems
Stohast Programmg Models or teratoal Asset Alloato Problems Herules Vladmrou Nolas Topaloglou, Stavros Zeos HERMES eter o omputatoal Fae & Eooms Shool o Eooms & Maagemet Uversty o yprus RsLab Meetg Madrd,
More informationSecurity Analysis of RAPP: An RFID Authentication Protocol based on Permutation
Securty Aalyss of RAPP: A RFID Authetcato Protocol based o Permutato Wag Shaohu,,, Ha Zhje,, Lu Sujua,, Che Dawe, {College of Computer, Najg Uversty of Posts ad Telecommucatos, Najg 004, Cha Jagsu Hgh
More informationGreen Master based on MapReduce Cluster
Gree Master based o MapReduce Cluster MgZh Wu, YuChag L, WeTsog Lee, YuSu L, FogHao Lu Dept of Electrcal Egeerg Tamkag Uversty, Tawa, ROC Dept of Electrcal Egeerg Tamkag Uversty, Tawa, ROC Dept of
More informationStatistical Intrusion Detector with InstanceBased Learning
Iformatca 5 (00) xxx yyy Statstcal Itruso Detector wth IstaceBased Learg Iva Verdo, Boja Nova Faulteta za eletroteho raualštvo Uverza v Marboru Smetaova 7, 000 Marbor, Sloveja va.verdo@sol.et eywords:
More informationNetwork dimensioning for elastic traffic based on flowlevel QoS
Network dmesog for elastc traffc based o flowlevel QoS 1(10) Network dmesog for elastc traffc based o flowlevel QoS Pas Lassla ad Jorma Vrtamo Networkg Laboratory Helsk Uversty of Techology Itroducto
More informationOptimal replacement and overhaul decisions with imperfect maintenance and warranty contracts
Optmal replacemet ad overhaul decsos wth mperfect mateace ad warraty cotracts R. Pascual Departmet of Mechacal Egeerg, Uversdad de Chle, Caslla 2777, Satago, Chle Phoe: +5626784591 Fax:+562689657 rpascual@g.uchle.cl
More informationConversion of NonLinear Strength Envelopes into Generalized HoekBrown Envelopes
Covero of NoLear Stregth Evelope to Geeralzed HoekBrow Evelope Itroducto The power curve crtero commoly ued lmtequlbrum lope tablty aaly to defe a olear tregth evelope (relatohp betwee hear tre, τ,
More informationProjection model for Computer Network Security Evaluation with intervalvalued intuitionistic fuzzy information. Qingxiang Li
Iteratoal Joural of Scece Vol No7 05 ISSN: 834890 Proecto model for Computer Network Securty Evaluato wth tervalvalued tutostc fuzzy formato Qgxag L School of Software Egeerg Chogqg Uversty of rts ad
More informationConstrained Cubic Spline Interpolation for Chemical Engineering Applications
Costraed Cubc Sple Iterpolato or Chemcal Egeerg Applcatos b CJC Kruger Summar Cubc sple terpolato s a useul techque to terpolate betwee kow data pots due to ts stable ad smooth characterstcs. Uortuatel
More informationn. We know that the sum of squares of p independent standard normal variables has a chi square distribution with p degrees of freedom.
UMEÅ UNIVERSITET Matematskstatstska sttutoe Multvarat dataaalys för tekologer MSTB0 PA TENTAMEN 00409 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multvarat dataaalys för tekologer B, 5 poäg.
More informationCIS603  Artificial Intelligence. Logistic regression. (some material adopted from notes by M. Hauskrecht) CIS603  AI. Supervised learning
CIS63  Artfcal Itellgece Logstc regresso Vasleos Megalookoomou some materal adopted from otes b M. Hauskrecht Supervsed learg Data: D { d d.. d} a set of eamples d < > s put vector ad s desred output
More informationPerformance Attribution. Methodology Overview
erformace Attrbuto Methodology Overvew Faba SUAREZ March 2004 erformace Attrbuto Methodology 1.1 Itroducto erformace Attrbuto s a set of techques that performace aalysts use to expla why a portfolo's performace
More informationDynamic Twophase Truncated Rayleigh Model for Release Date Prediction of Software
J. Software Egeerg & Applcatos 3 6369 do:.436/jsea..367 Publshed Ole Jue (http://www.scrp.org/joural/jsea) Dyamc Twophase Trucated Raylegh Model for Release Date Predcto of Software Lafe Qa Qgchua Yao
More informationChapter 12 Polynomial Regression Models
Chapter Polyomal Regresso Models A model s sad to be lear whe t s lear parameters. So the model ad y = + x+ x + β β β ε y= β + β x + β x + β x + β x + β xx + ε are also the lear model. I fact, they are
More informationChapter 3 0.06 = 3000 ( 1.015 ( 1 ) Present Value of an Annuity. Section 4 Present Value of an Annuity; Amortization
Chapter 3 Mathematcs of Face Secto 4 Preset Value of a Auty; Amortzato Preset Value of a Auty I ths secto, we wll address the problem of determg the amout that should be deposted to a accout ow at a gve
More informationSTATISTICAL ANALYSIS OF WIND SPEED DATA
Esşehr Osmagaz Üerstes Müh.Mm.Fa.Dergs C. XVIII, S.2, 2005 Eg.&Arh.Fa. Esşehr Osmagaz Uersty, Vol. XVIII, No: 2, 2005 STATISTICAL ANALYSIS OF WIND SPEED DATA Veysel YILMAZ, Haydar ARAS 2, H.Eray ÇELİK
More informationWe present a new approach to pricing Americanstyle derivatives that is applicable to any Markovian setting
MANAGEMENT SCIENCE Vol. 52, No., Jauary 26, pp. 95 ss 2599 ess 52655 6 52 95 forms do.287/msc.5.447 26 INFORMS Prcg AmercaStyle Dervatves wth Europea Call Optos Scott B. Laprse BAE Systems, Advaced
More informationReinsurance and the distribution of term insurance claims
Resurace ad the dstrbuto of term surace clams By Rchard Bruyel FIAA, FNZSA Preseted to the NZ Socety of Actuares Coferece Queestow  November 006 1 1 Itroducto Ths paper vestgates the effect of resurace
More informationThree Dimensional Interpolation of Video Signals
Three Dmesoal Iterpolato of Vdeo Sgals Elham Shahfard March 0 th 006 Outle A Bref reve of prevous tals Dgtal Iterpolato Bascs Upsamplg D Flter Desg Issues Ifte Impulse Respose Fte Impulse Respose Desged
More informationMaintenance Scheduling of Distribution System with Optimal Economy and Reliability
Egeerg, 203, 5, 48 http://dx.do.org/0.4236/eg.203.59b003 Publshed Ole September 203 (http://www.scrp.org/joural/eg) Mateace Schedulg of Dstrbuto System wth Optmal Ecoomy ad Relablty Syua Hog, Hafeg L,
More informationA New Bayesian Network Method for Computing Bottom Event's Structural Importance Degree using Jointree
, pp.277288 http://dx.do.org/10.14257/juesst.2015.8.1.25 A New Bayesa Network Method for Computg Bottom Evet's Structural Importace Degree usg Jotree Wag Yao ad Su Q School of Aeroautcs, Northwester Polytechcal
More informationAn IGRSSVM classifier for analyzing reviews of Ecommerce product
Iteratoal Coferece o Iformato Techology ad Maagemet Iovato (ICITMI 205) A IGRSSVM classfer for aalyzg revews of Ecommerce product Jaju Ye a, Hua Re b ad Hagxa Zhou c * College of Iformato Egeerg, Cha
More informationISyE 512 Chapter 7. Control Charts for Attributes. Instructor: Prof. Kaibo Liu. Department of Industrial and Systems Engineering UWMadison
ISyE 512 Chapter 7 Cotrol Charts for Attrbutes Istructor: Prof. Kabo Lu Departmet of Idustral ad Systems Egeerg UWMadso Emal: klu8@wsc.edu Offce: Room 3017 (Mechacal Egeerg Buldg) 1 Lst of Topcs Chapter
More informationThe Analysis of Development of Insurance Contract Premiums of General Liability Insurance in the Business Insurance Risk
The Aalyss of Developmet of Isurace Cotract Premums of Geeral Lablty Isurace the Busess Isurace Rsk the Frame of the Czech Isurace Market 1998 011 Scetfc Coferece Jue, 10.  14. 013 Pavla Kubová Departmet
More informationCommon pbelief: The General Case
GAMES AND ECONOMIC BEHAVIOR 8, 738 997 ARTICLE NO. GA97053 Commo pbelef: The Geeral Case Atsush Kaj* ad Stephe Morrs Departmet of Ecoomcs, Uersty of Pesylaa Receved February, 995 We develop belef operators
More information