Segment and combine approach for non-parametric time-series classification




Pierre Geurts and Louis Wehenkel
University of Liège, Department of Electrical Engineering and Computer Science, Sart-Tilman B28, B4000 Liège, Belgium
Postdoctoral researcher, FNRS, Belgium
{p.geurts,l.wehenkel}@ulg.ac.be

Abstract. This paper presents a novel, generic, scalable, autonomous, and flexible supervised learning algorithm for the classification of multivariate and variable length time series. The essential ingredients of the algorithm are randomization, segmentation of time-series, decision tree ensemble based learning of subseries classifiers, combination of subseries classification by voting, and cross-validation based temporal resolution adaptation. Experiments are carried out with this method on 10 synthetic and real-world datasets. They highlight the good behavior of the algorithm on a large diversity of problems. Our results are also highly competitive with existing approaches from the literature.

1 Learning to classify time-series

Time-series classification is an important problem from the viewpoint of its multitudinous applications. Specific applications concern the non-intrusive monitoring and diagnosis of processes and biological systems, for example to decide whether the system is in a healthy operating condition on the basis of measurements of various signals. Other relevant applications concern speech recognition and behavior analysis, in particular biometrics and fraud detection.

From the viewpoint of machine learning, a time-series classification problem is basically a supervised learning problem with temporally structured input variables. Among the practical problems faced while trying to apply classical (propositional) learning algorithms to this class of problems, the main one is to transform the non-standard input representation into a fixed number of scalar attributes which can be managed by a propositional base learner and at the same time retain information about the temporal properties of the original data.

One approach to solve this problem is to define a (possibly very large) collection of temporal predicates which can be applied to each time-series in order to compute (logical or numerical) features which can then be used as input representation for any base learner (e.g. [7, 8, 10, 11]). This feature extraction step can also be incorporated directly into the learning algorithm [1, 2, 14]. Another approach is to define a distance or similarity measure between time-series that takes into account temporal specific peculiarities (e.g. invariance with respect to time or amplitude rescaling) and then to use this distance measure in combination with nearest neighbors or other kernel-based methods [12, 13]. A potential advantage of these approaches is the possibility to bias the representation by exploiting prior problem specific knowledge. At the same time, this problem specific modeling step makes the application of machine learning non autonomous.

The approach investigated in this paper aims at developing a fully generic and off-the-shelf time-series classification method. More precisely, the proposed algorithm relies on a generic pre-processing stage which extracts from the time-series a number of randomly selected subseries, all of the same length, which are labeled with the class of the time-series from which they were taken. Then a generic supervised learning method is applied to the sample of subseries, so as to derive a subseries classifier. Finally, a new time-series is classified by aggregating the predictions of all its subseries of the said size. The method is combined with a ten-fold cross-validation wrapper in order to adjust automatically the size of the subseries to a given dataset. As base learners, we use tree-based methods because of their scalability and autonomy.

Section 2 presents and motivates the proposed algorithmic framework of segmentation and combination of time-series data and Section 3 presents an empirical evaluation of the algorithm on a diverse set of time-series classification tasks. Further details about this study may be found in [4].

2 Segment and Combine

Notations. A time-series is originally represented as a discrete time finite duration real-valued vector signal. The different components of the vector signal are called temporal attributes in what follows. The number of time-steps for a given temporal attribute is called its duration. We suppose that all temporal attributes of a given time-series have the same duration. On the other hand, the durations of different time-series of a given problem (or dataset) are not assumed to be identical. A given time series is related to a particular observation (or object).
A learning sample (or dataset) is a set (ordering is considered irrelevant at this level) of $N$ preclassified time-series denoted by

$$LS_N = \{ (a(d(o), o), c(o)) \mid o = 1, \ldots, N \},$$

where $o$ denotes an observation, $d(o) \in \mathbb{N}$ stands for the duration of the time-series, $c(o)$ refers to the class associated to the time-series, and

$$a(d(o), o) = (a_1(d(o), o), \ldots, a_n(d(o), o)), \quad a_i(d(o), o) = (a_i(1, o), \ldots, a_i(d(o), o)),$$

represents the vector of $n$ real-valued temporal attributes of duration $d(o)$. The objective of the time-series classification problem is to derive from $LS_N$ a classification rule $\hat{c}(a(d(o), o))$ which predicts output classes of an unseen time-series $a(d(o), o)$ as accurately as possible.

Training a subseries classifier. In its training stage, the segment and combine algorithm uses a propositional base learner to yield a subseries classifier from $LS_N$ in the following way:

Subseries sampling. For $i = 1, \ldots, N_s$, choose $o_i \in \{1, \ldots, N\}$ randomly, then choose a subseries offset $t_i \in \{0, \ldots, d(o_i) - l\}$ randomly, and create a scalar attribute vector

$$a^l_{t_i}(o_i) = (a_1(t_i + 1, o_i), \ldots, a_1(t_i + l, o_i), \ldots, a_n(t_i + 1, o_i), \ldots, a_n(t_i + l, o_i))$$

concatenating the values of all $n$ temporal attributes over the time interval $t_i + 1, \ldots, t_i + l$. Collect the samples in a training set of subseries

$$LS^l_{N_s} = \{ (a^l_{t_i}(o_i), c(o_i)) \mid i = 1, \ldots, N_s \}.$$

Classifier training. Use the base learner on $LS^l_{N_s}$ to build a subseries classifier. This classifier is supposed to return a class-probability vector $P^l_c(a^l)$.

Notice that when $N_s$ is greater than the total number of subseries of length $l$, no sampling is done and $LS^l_{N_s}$ is taken as the set of all subseries.

Classifying a time-series by votes on its subseries. For a new time-series $a(d(o), o)$, extract systematically all its subseries of length $l$, $a^l_i(o)$, $i \in \{0, \ldots, d(o) - l\}$, and classify it according to

$$\hat{c}(a(d(o), o)) = \arg\max_c \sum_{i=0}^{d(o)-l} P^l_c(a^l_i(o)).$$

Note that if the base learner returns 0/1 class indicators, the aggregation step merely selects the class receiving the largest number of votes.

Tuning the subseries length l. In addition to the choice of base learner discussed below, the sole parameters of the above method are the number of subseries $N_s$ and the subseries length $l$. In practice, the larger $N_s$, the higher the accuracy. Hence, the choice of the value of $N_s$ is only dictated by computational constraints. On the other hand, the subseries length $l$ should be adapted to the temporal resolution of the problem. Small values of $l$ force the algorithm to focus on local (shift-invariant) patterns in the original time-series while larger values of $l$ amount to considering the time-series more globally. In our method, we determine this length automatically by trying out a set of candidate values $l_i \le \min_{o \in LS_N} d(o)$, estimating for each $l_i$ the error rate by ten-fold cross-validation over $LS_N$, and selecting the value $l$ yielding the lowest error rate estimate.
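As an illustration, the sampling and voting steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the function names, the nested-list data layout and the toy data are hypothetical, and a 1-nearest-neighbour rule returning 0/1 class indicators stands in for the tree-based learners. Tuning of $l$ by cross-validation is not shown.

```python
import random


def flatten(ts, offset, length):
    """Concatenate the values of all temporal attributes over one window."""
    return [v for attr in ts for v in attr[offset:offset + length]]


def sample_subseries(series, labels, length, n_samples, rng):
    """Build the subseries learning sample as (vectors, classes).

    series: list of time series, each a list of n temporal attributes,
            each attribute a list of floats (durations may differ per series).
    """
    # Every admissible (series index, offset) pair.
    windows = [(o, off) for o, ts in enumerate(series)
               for off in range(len(ts[0]) - length + 1)]
    if n_samples < len(windows):
        # Choose a series at random, then an offset at random.
        eligible = [o for o, ts in enumerate(series) if len(ts[0]) >= length]
        windows = []
        for _ in range(n_samples):
            o = rng.choice(eligible)
            windows.append((o, rng.randint(0, len(series[o][0]) - length)))
    # Otherwise no sampling is done and all subseries are kept.
    vectors = [flatten(series[o], off, length) for o, off in windows]
    classes = [labels[o] for o, _ in windows]
    return vectors, classes


def make_1nn(vectors, classes, class_names):
    """Hypothetical stand-in base learner: 1-NN with 0/1 class indicators."""
    def predict_proba(x):
        j = min(range(len(vectors)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(vectors[k], x)))
        return [1.0 if c == classes[j] else 0.0 for c in class_names]
    return predict_proba


def classify(ts, length, predict_proba, class_names):
    """Sum the class vectors of all subseries of ts and take the arg max.

    Assumes ts has a duration of at least `length`.
    """
    totals = [0.0] * len(class_names)
    for off in range(len(ts[0]) - length + 1):
        for k, p in enumerate(predict_proba(flatten(ts, off, length))):
            totals[k] += p
    return class_names[max(range(len(class_names)), key=totals.__getitem__)]


# Toy data (an assumption for the example): two attributes, variable durations.
train_series = [
    [[0.0] * 6, [0.0] * 6],  # class "low"
    [[0.0] * 9, [0.0] * 9],  # class "low"
    [[0.0] * 7, [1.0] * 7],  # class "high"
    [[0.0] * 5, [1.0] * 5],  # class "high"
]
train_labels = ["low", "low", "high", "high"]
class_names = ["low", "high"]
length = 3
vectors, classes = sample_subseries(train_series, train_labels, length,
                                    1000, random.Random(0))
predict_proba = make_1nn(vectors, classes, class_names)
test_series = [[[0.1] * 8, [0.2] * 8], [[0.0] * 4, [0.9] * 4]]
predicted = [classify(ts, length, predict_proba, class_names)
             for ts in test_series]
```

Because the requested number of subseries (1000) exceeds the 19 available in the toy sample, all subseries are kept and the result does not depend on the random seed. Any learner exposing a class-probability function could replace `make_1nn` without changing `classify`.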
In principle, any propositional base learner (SVM, kNN, MLP, etc.) could be used in the above approach. However, for scalability reasons, we recommend using decision trees or ensembles of decision trees. In the trials in the next section we will compare the results obtained with two different tree-based methods, namely single unpruned CART trees and ensembles of extremely randomized trees. The extremely randomized trees algorithm (Extra-Trees) is described in detail in [4]. It grows a tree by selecting the best split from a small set of candidate random splits (both attribute and cut-point are randomized). This method strongly reduces variance without increasing bias too much. It is also significantly faster in the training stage than bagging or boosting, which search for optimal attribute and cut-points at each node. Notice that because the segment and combine approach has some intrinsic variance reduction capability, it is generally counterproductive to prune single trees in this context. For the same reason, the number of trees in the tree ensemble methods can be chosen reasonably small (25 in our experiments).

3 Empirical analysis

3.1 Benchmark problems

Table 1. Summary of datasets

Dataset   Src.  N_d   n   c   d(o)     Protocol      Best  Ref  {l_i}, {s_i}
CBF       1     798   1   3   128      10-fold cv    0.00  [7]  1,2,4,8,16,32,64,96,128
CC        2     600   1   6   60       10-fold cv    0.83  [1]  1,2,5,10,20,30
CBF-tr    1     5000  1   3   128      10-fold cv    -     -    1,2,4,8,16,32,64,96,128
Two-pat   1     5000  1   3   128      10-fold cv    -     -    1,2,4,8,16,32,64,96,128
TTest     1     999   3   3   81-121   10-fold cv    0.50  [7]  3,5,10,20,40,60
Trace     3     1600  4   16  268-394  holdout 800   0.83  [1]  10,25,50,100,150,200,250
Auslan-s  2     200   8   10  32-101   10-fold cv    1.50  [1]  1,2,5,10,20,30
Auslan-b  5     2566  22  95  45-136   holdout 1000  2.10  [7]  1,2,5,10,20,30,40
JV        2     640   12  9   7-29     holdout 270   3.80  [8]  2,3,5,7
ECG       4     200   2   2   39-152   10-fold cv    -     -    1,2,5,10,20,30,39

Sources:
1 http://www.montefiore.ulg.ac.be/~geurts/thesis.html
2 [5]
3 http://www2.ife.no
4 http://www-2.cs.cmu.edu/~bobski/pubs/tr01108.html
5 http://waleed.web.cse.unsw.edu.au/new/phd.html

Experiments are carried out on 10 problems. For the sake of brevity, we only report in Table 1 the main properties of the 10 datasets. We refer the interested reader to [4] and the references therein for more details. The second column gives the (web) source of the dataset.
The next four columns give the number $N_d$ of time-series in the dataset, the number of temporal attributes $n$ of each time-series, the number of classes $c$, and the range of values of the duration $d(o)$; the seventh column specifies our protocol to derive error rates; the eighth and ninth columns give respectively the best published error rate (with identical or comparable protocol to ours) and the corresponding reference; the last column gives the trial values used for the parameters $l$ and $s$. The first six problems are artificial problems specifically designed for the validation of time-series classification methods, while the last four problems correspond to real world problems.

3.2 Accuracy results

Accuracy results on each problem are gathered in Table 2. In order to assess the interest of the segment and combine approach, we compare it with a simple normalization technique [2, 6], which aims at transforming a time-series into a vector of fixed dimensionality of scalar numerical attributes: the time interval of each object is divided into $s$ equal-length segments and the average values of all temporal attributes along these segments are computed, yielding a new vector of $n \cdot s$ attributes which are used as inputs to the base learner. The two approaches are combined with single decision trees (ST) and ensembles of 25 Extra-Trees (ET) as base learners. The best result in each row is marked with an asterisk. For the segment and combine method, we randomly extracted 10,000 subseries. The optimal values of the parameters $l$ and $s$ are searched among the candidate values reported in the last column of Table 1. When the testing protocol is holdout, the parameters are adjusted by 10-fold cross-validation on the learning sample only; when the testing protocol is 10-fold cross-validation, the adjustment of these parameters is made for each of the ten folds by an internal 10-fold cross-validation. In this latter case, average values and standard deviations of the parameters $s$ and $l$ over the (external) testing folds are provided.

Table 2. Error rates (in %) and optimal values of s and l (* marks the best error rate in each row)

          Temporal normalization                 Segment&Combine (N_s = 10000)
          ST                  ET                 ST                  ET
Dataset   Err%   s            Err%    s          Err%   l            Err%   l
CBF       4.26   24.0 ± 8.0   0.38*   27.2 ± 7.3   1.25   92.8 ± 9.6    0.75   96.0 ± 0.0
CC        3.33   21.0 ± 17.0  0.67    41.0 ± 14.5  0.50   35.0 ± 6.7    0.33*  37.0 ± 4.6
CBF-tr    13.28  30.4 ± 13.3  2.51    30.4 ± 4.8   1.63*  41.6 ± 14.7   1.88   57.6 ± 31.4
Two-pat   25.12  8.0 ± 0.0    14.37   36.8 ± 46.1  2.00   96.0 ± 0.0    0.37*  96.0 ± 0.0
TTest     18.42  40.0 ± 0.0   13.61   40.0 ± 0.0   3.00   80.0 ± 0.0    0.80*  80.0 ± 0.0
Trace     50.13  50           40.62   50           8.25   250           5.00*  250
Auslan-s  19.00  5.5 ± 1.5    4.50    10.2 ± 4.0   5.00   17.0 ± 7.8    1.00*  13.0 ± 4.6
Auslan-b  22.82  10           4.51*   10           18.40  40.0 ± 0.0    5.16   40.0 ± 0.0
JV        16.49  2            4.59    2            8.11   3             4.05*  3
ECG       25.00  18.5 ± 10.0  15.50*  19.0 ± 9.4   25.50  29.8 ± 6.0    24.00  32.4 ± 8.5

From these results we first observe that Segment and Combine with Extra-Trees (ET) yields the best results on six out of ten problems.
On three other problems (CBF, CBF-tr, Auslan-b) its accuracy is close to the best one. Only on the ECG problem are the results somewhat disappointing with respect to the normalization approach. On the other hand, it is clear that the combination of the normalization technique with single trees (ST) is systematically (much) less accurate than the other variants.

We also observe that, both for normalization and segment and combine, the Extra-Trees always give significantly better results than single trees (see footnote 1). On the other hand, the improvement resulting from the segment and combine method is stronger for single decision trees than for Extra-Trees. Indeed, error rates of the former are reduced on average by 65% while error rates of the latter are only reduced by 30%. Actually, with segment and combine, single trees and Extra-Trees are close to each other in terms of accuracy on several problems, while they are not with normalization. This can be explained by the intrinsic variance reduction effect of the segment and combine method, which is due to the virtual increase of the learning sample size and the averaging step, and which somewhat mitigates the effect of variance reduction techniques like ensemble methods (see [3] for a discussion of bias and variance of the segment and combine method).

Footnote 1. There is only one exception, namely CBF-tr, where the ST method is slightly better than ET in the case of segment and combine.

From the values of $l$ in the last column of Table 2, it is clear that the optimal $l$ is a problem dependent parameter. Indeed, with respect to the average duration of the time-series, this optimal value ranges from 17% (on JV) to 80% (on TTest). This highlights the usefulness of the automatic tuning by cross-validation of $l$ as well as the capacity of the segment and combine approach to adapt itself to variable temporal resolutions.

A comparison of the results of the last two columns of Table 2 with the eighth column of Table 1 shows that the segment and combine method with Extra-Trees is actually quite competitive with the best published results. Indeed, on CBF, CC, TTest, Auslan-s, and JV, its results are very close to the best published ones (see footnote 2). Since on Trace, and to a lesser extent on Auslan-b, the results were less good, we ran a side-experiment to see if there is room for improvement. On Trace we were able (with Extra-Trees and $N_s = 15000$) to reduce the error rate from 5.00% to 0.875% by first resampling the time series into a fixed number of 268 time points. The same approach with 40 time points also decreased the error rate on Auslan-b from 5.16% to 3.94%.

3.3 Interpretability

Let us illustrate the possibility to extract interpretable information from the subseries classifiers. Actually, these classifiers provide for each time point a vector estimating the class-probabilities of subseries centered at this point. Hence, subseries that correspond to a high probability of a certain subset of classes can be considered as typical patterns of this subset of classes.
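To make this concrete, the per-time-point class-probability vectors can be computed by sliding a window along a series and querying the subseries classifier at each position. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and `toy_predict_proba` is a made-up two-class stand-in for a trained subseries classifier, not a model from the experiments.

```python
def probability_profile(ts, length, predict_proba):
    """Return a list of (centre, probabilities), one entry per subseries.

    ts: one time series as a list of temporal attributes (lists of floats
        of equal duration); predict_proba maps a flattened subseries to a
        class-probability vector.
    """
    profile = []
    for offset in range(len(ts[0]) - length + 1):
        window = [v for attr in ts for v in attr[offset:offset + length]]
        centre = offset + (length - 1) / 2  # 0-based index of the window centre
        profile.append((centre, predict_proba(window)))
    return profile


def most_typical_window(profile, class_index):
    """Pick the subseries with the highest predicted probability for a class."""
    return max(profile, key=lambda entry: entry[1][class_index])


def toy_predict_proba(window):
    """Hypothetical stand-in classifier (single attribute only): the
    probability of class 1 is the clipped value at the window centre."""
    p1 = min(1.0, max(0.0, window[len(window) // 2]))
    return [1.0 - p1, p1]


signal = [[0.0, 0.0, 0.0, 0.2, 0.9, 0.2, 0.0, 0.0]]  # one attribute, peak at index 4
profile = probability_profile(signal, 3, toy_predict_proba)
centre, probs = most_typical_window(profile, class_index=1)
```

With this toy classifier the selected window is the one centred on the peak of the signal, which is the sense in which a high-probability subseries acts as a typical pattern of its class.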
Figure 1 shows for example, in the top part, two temporal attributes for three instances of the Trace problem, respectively of classes 1, 3, and 5, and in the bottom part the evolution of the probabilities of these three classes as predicted for subseries (of length $l = 50$) as they move progressively from left to right on the time axis. The Class 3 signal (top middle) differs from the Class 1 signal (top left) only in the occurrence of a small sinusoidal pattern in one of the attributes (around $t = 200$); on the other hand, Classes 1 and 3 differ from Class 5 (top right) in the occurrence of a sharp peak in the other attribute (around $t = 75$ and $t = 100$ respectively). From the probability plots we see that, for $t \le 50$, the three classes are equally likely, but at the time where the peak appears ($t \in [60, 70]$) the probability of Class 5 decreases for the first two series (where a peak appears) and increases for the right-most series (where no peak appears). Subsequently, around $t = 170$, the subseries in the middle instance start to detect the sinusoidal pattern, which translates into an increase of the probability of Class 3, while for the two other time-series Classes 1 and 5 become equally likely and Class 3 relatively less.

[Figure 1: three panels (Class 1, Class 3, Class 5). Top row: two temporal attributes plotted against time. Bottom row: predicted probabilities of classes C1, C3 and C5 plotted against time.]
Fig. 1. Interpretability of Segment and Combine (Trace dataset, $N_s = 10000$, ET)

Notice that the voting scheme used to classify the whole time-series from its subseries amounts to integrating these curves along the time axis and deciding on the most likely class once all subseries have been incorporated. This suggests that, once a subseries classifier has been trained, the segment and combine approach can be used in real-time in order to classify signals through time.

Footnote 2. Note that on CBF, CC, TTest, and Auslan-s, our test protocols are not strictly identical to those published since we could not use the same ten folds. This may be sufficient to explain small differences with respect to results from the literature.

4 Conclusion

In this paper, we have proposed a new generic and non-parametric method for time-series classification which randomly extracts subseries of a given length from time-series, induces a subseries classifier from this sample, and classifies time-series by averaging the prediction over its subseries. The subseries length is automatically adapted by the algorithm to the temporal resolution of the problem. This algorithm has been validated on 10 benchmark problems, where it yielded results competitive with state-of-the-art algorithms from the literature. Given the diversity of benchmark problems and conceptual simplicity of our algorithm, this is a very promising result. Furthermore, the possibility to extract interpretable information from time-series has been highlighted.

There are several possible extensions of our work, such as more sophisticated aggregation schemes and multi-scale subseries extraction. These would make it possible to handle problems with more complex temporally related characteristic patterns

of variable time-scale. We have also suggested that the method could be used for real-time time-series classification, by adjusting the voting scheme.

The approach presented here for time-series is essentially identical to the work reported in [9] for image classification. Similar ideas could also be exploited to yield generic approaches for the classification of texts or biological sequences. Although these latter problems have different structural properties, we believe that the flexibility of the approach makes it possible to adjust it to these contexts in a straightforward manner.

References

1. J. Alonso González and J. J. Rodríguez Diez. Boosting interval-based literals: Variable length and early classification. In M. Last, A. Kandel, and H. Bunke, editors, Data mining in time series databases. World Scientific, June 2004.
2. P. Geurts. Pattern extraction for time-series classification. In L. de Raedt and A. Siebes, editors, Proceedings of PKDD 2001, 5th European Conference on Principles of Data Mining and Knowledge Discovery, LNAI 2168, pages 115-127, Freiburg, September 2001. Springer-Verlag.
3. P. Geurts. Contributions to decision tree induction: bias/variance tradeoff and time series classification. PhD thesis, University of Liège, Belgium, May 2002.
4. P. Geurts and L. Wehenkel. Segment and combine approach for non-parametric time-series classification. Technical report, University of Liège, 2005.
5. S. Hettich and S. D. Bay. The UCI KDD archive, 1999. Irvine, CA: University of California, Department of Information and Computer Science. http://kdd.ics.uci.edu.
6. M. W. Kadous. Learning comprehensible descriptions of multivariate time series. In Proceedings of the Sixteenth International Conference on Machine Learning, ICML 99, pages 454-463, Bled, Slovenia, 1999.
7. M. W. Kadous and C. Sammut. Classification of multivariate time series and structured data using constructive induction. Machine Learning, 58(1-2):179-216, February/March 2005.
8. M. Kudo, J. Toyama, and M. Shimbo. Multidimensional curve classification using passing-through regions. Pattern Recognition Letters, 20(11-13):1103-1111, 1999.
9. R. Marée, P. Geurts, J. Piater, and L. Wehenkel. Random subwindows for robust image classification. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005.
10. I. Mierswa and K. Morik. Automatic feature extraction for classifying audio data. Machine Learning, 58(1-2):127-149, February/March 2005.
11. R. T. Olszewski. Generalized feature extraction for structural pattern recognition in time-series data. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, 2001.
12. C. A. Ratanamahatana and E. Keogh. Making time-series classification more accurate using learned constraints. In Proceedings of SIAM, 2004.
13. H. Shimodaira, K.I. Noma, M. Nakai, and S. Sagayama. Dynamic time-alignment kernel in support vector machine. In Advances in Neural Information Processing Systems 14, NIPS2001, volume 2, pages 921-928, December 2001.
14. Y. Yamada, E. Suzuki, H. Yokoi, and K. Takabayashi. Decision-tree induction from time-series data based on standard-example split test. In Proceedings of the 20th International Conference on Machine Learning (ICML-2003), 2003.