This article is published in Ukrainian in "Herald of Zhytomyr Engineering-Technological Institute", 2003, No. 1, P. 181-186.




Tsakonas A.D., Dounias G.D., Shtovba S.D. Forecasting football match outcomes with a support vector machine // Herald of Zhytomyr Engineering-Technological Institute, 2003, No. 1, P. 181-186 (in Ukrainian).

UDC 681.3

A. Tsakonas, Ph.D. Student, University of the Aegean, Chios, Greece
G. Dounias, Ph.D., Lecturer, University of the Aegean, Chios, Greece
S. Shtovba, Ph.D., Assistant Professor, Vinnitsa State Technical University, Ukraine

FORECASTING FOOTBALL MATCH OUTCOMES WITH SUPPORT VECTOR MACHINES

Abstract. The paper proposes a method for forecasting the results of football games, based on the soft-computing technology of machine learning with a separating-hyperplane machine (support vector machine). The forecasting model developed in the paper takes into account the following team indicators: the difference in the number of unavailable leading players; the difference in the teams' playing dynamics; the difference in team class; the home-field factor; and the results of the teams' head-to-head meetings. Testing shows that the proposed model provides good agreement between the forecasted and actual results of football matches, which allows the separating-hyperplane machine to be recommended as a promising approach for forecasting the outcomes of various sports championships.

1. Introduction

The prediction of sport game results corresponds to an interesting real-world application of modern decision making and forecasting, while it can also be considered a good benchmark problem for testing diverse techniques of extrapolation and prediction under the difficult conditions of limited available statistics and uncertain influence factors. By terms and methodologies such as intelligent techniques, soft computing, or computational learning [1] we mean, in fact, a large variety of new, powerful techniques for intelligent data analysis, which provide a suitable way of handling the complexity, uncertainty and fuzziness of real-world problems.
The aim of the present paper is to demonstrate an example of how to predict football game winners by applying one specific modern intelligent technique, namely Support Vector Machines (SVM). Data representing the Ukrainian football championship during the last 10 years are used for the creation and testing of the intelligent prognostic models applied within this paper.

2. The problem statement

The task of creating football winner prediction models can be reduced to that of finding a functional mapping of the form:

y = f(x1, x2, ..., xn), y ∈ {d1, d2, d3}, (1)

where x denotes a vector of features (i.e. influence factors), such as team level, climate conditions, playing place, results of past games etc.; y denotes the football game result, taking one of the terms: d1 - «host team's win», d2 - «draw», and d3 - «guest team's win». For the needs of the SVM application the problem is re-stated as follows:

y = f(x1, x2, ..., xn), y ∈ {-1, 1}, (2)

where x denotes the same vector as previously; y denotes the football game result, where -1 corresponds to the result «host team will not win» and 1 corresponds to the result «guest team will not win».

3. Feature selection

The features carrying the major influence on the game prediction results always correspond to a subjective choice of every decision-maker; nevertheless, there are some common aspects taken into account by all decision makers. According to [2], the features finally taken into account are the following:

x1 - difference of infirmity factors (the number of traumatised and disqualified players of the host team minus the same number for the guest team);
x2 - difference of dynamics profiles (the score of the host team over the five last games minus the score of the guest team over the five last games);
x3 - difference of ranks (the host team's rank minus the guest team's rank in the current championship);
x4 - host factor (computed as HP/HG - GP/GG, where HP denotes the total home points of the host team in the current championship; HG is the number of home games played by the host team; GP is the total guest points of the guest team in the current championship; GG is the number of guest games played by the guest team);
x5 - personal score (the goal difference over all head-to-head games of the two teams within 10 years).

Note that the above features do not contain confidential information, and it is easy for the decision maker to know the feature values before the game.

4. Support Vector Machines

SVMs [3] correspond to a relatively new computational intelligence technique related to the machine learning concept. SVMs are used in pattern recognition as well as in regression estimation and linear operator inversion. SVMs have interesting attributes that distinguish them from other computational intelligence techniques, such as neural networks: SVMs are always able to find a global minimum, and they have a simple geometric interpretation. SVMs are also capable of handling large numbers of data points or attributes, and their learning is comparable in speed with that of neural networks. More specifically, in order to estimate a classification function such as:

f : R^n -> {±1}, (3)

the most important step is to select an estimate f from a suitably restricted class of functions, characterised by the so-called capacity of the learning machine.
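For concreteness, the five features listed in section 3 could be assembled from ordinary championship bookkeeping as in the sketch below; every record and field name here is a hypothetical illustration, not something taken from the paper's data set.

```python
# Sketch: building the feature vector x = (x1, ..., x5) of section 3.
# All field names are hypothetical illustrations, not from the paper.

def feature_vector(host, guest):
    """host/guest: dicts of (assumed) championship bookkeeping fields."""
    x1 = host["out_players"] - guest["out_players"]      # infirmity difference
    x2 = host["points_last5"] - guest["points_last5"]    # dynamics difference
    x3 = host["rank"] - guest["rank"]                    # rank difference
    # Host factor HP/HG - GP/GG: average home points of the host team
    # minus average guest (away) points of the guest team.
    x4 = host["home_points"] / host["home_games"] - guest["away_points"] / guest["away_games"]
    x5 = host["h2h_goal_diff"]                           # personal score vs this opponent
    return [x1, x2, x3, x4, x5]

host = {"out_players": 2, "points_last5": 10, "rank": 3,
        "home_points": 30, "home_games": 10, "h2h_goal_diff": 4}
guest = {"out_players": 1, "points_last5": 7, "rank": 8,
         "away_points": 10, "away_games": 10}
x = feature_vector(host, guest)   # [1, 3, -5, 2.0, 4]
```

In the paper, such raw feature values are subsequently normalized to the [-1, 1] range before training (section 5).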
Small capacities may not be sufficient to approximate complex functions, while large capacities may fail to generalize, which is the effect of what is called overfitting. In contrast to the neural networks' approach, where the early stopping method is used to avoid overfitting, in SVMs overfitting is limited according to the statistical theory of learning from small samples [3]. The simplest decision functions are the linear ones. In the case of SVM, the implementation of linear functions corresponds to finding a large-margin separation between two classes. This margin is the minimum distance of the training data points to the separation surface. The procedure to find the maximum margin separation is a convex quadratic problem [4]. An additional parameter enables the SVM to misclassify some outlying training data in order to get a larger margin between the rest of the training data, while the optimization remains a quadratic problem. If we transform the input data into a feature space F using a map such as:

Φ : R^n -> F, (4)

then a linear learning machine is extended to a non-linear one. In SVMs the latter procedure is applied implicitly. What we have to supply is the dot product of pairs of data points Φ(xi)·Φ(xj) in feature space F. Thus, to compute these dot products, we supply the so-called kernel functions that define the feature space via:

K(xi, xj) = Φ(xi)·Φ(xj). (5)

We do not need to know Φ, because the mapping is performed implicitly. SVMs can also learn which of the features implied by the kernel are distinctive for the two classes. The selection of an appropriate kernel function may boost the learning process.

4.1. The SVM algorithm

As assumed in section 3, we are given a training set S = {(x1, y1), ..., (xl, yl)}, where each point xi = (xi1, xi2, ..., xin) belongs to R^n, and yi ∈ {-1, 1} is a label that identifies the class of point xi.
The goal is to determine a function

f(x) = w·Φ(x) + b, (6)

where w = (w1, w2, ..., wm) and b are the parameters of the separating hyperplane, and Φ(x) = (φ1(x), ..., φm(x)) corresponds to a mapping from R^n into a feature space R^m. This is the standard Reproducing Kernel Hilbert Space mapping used for kernel learning machines [5].
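As a concrete illustration of the implicit mapping in eqs. (4)-(5) (an added example, not from the paper): for the homogeneous quadratic kernel K(x, z) = (x·z)^2 on R^2, the explicit map Φ(x) = (x1^2, x2^2, sqrt(2)·x1·x2) reproduces the kernel value as a plain dot product, so Φ never has to be evaluated during training.

```python
# Numeric check of the kernel trick, eq. (5): K(x, z) == phi(x) . phi(z)
# for the homogeneous quadratic kernel on R^2.
import math

def phi(x):
    # Explicit feature map for K(x, z) = (x . z)^2
    return (x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1])

def kernel(x, z):
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, z = (1.0, 2.0), (3.0, -1.0)
assert abs(kernel(x, z) - dot(phi(x), phi(z))) < 1e-9   # both equal 1
```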

According to Statistical Learning Theory [3], in order to obtain a function with controllable generalization capability, we need to control the Vapnik-Chervonenkis dimension of the function through structural risk minimization. SVMs are a practical implementation of this idea. The formulation of SVM leads to the following quadratic programming problem [5]:

Problem P1: Minimize (1/2) w·w + C Σi ξi,
subject to yi (w·Φ(xi) + b) >= 1 - ξi, ξi >= 0, i = 1, 2, ..., l,

where C is a positive penalty coefficient for a misclassification. The solution w* of this problem is given by the equation:

w* = Σi ai* yi Φ(xi), (7)

where a* = (a1*, a2*, ..., al*) is the solution of the following dual problem:

Problem P2: Maximize -(1/2) a^T D a + Σi ai,
subject to Σi yi ai = 0; 0 <= ai <= C, i = 1, 2, ..., l,

where D is the matrix with entries:

Dij = yi yj Φ(xi)·Φ(xj). (8)

By combining equations (6) and (7), the solution of Problem P1 is given by:

f(x) = Σi yi ai* Φ(xi)·Φ(x) + b*. (9)

The points xi for which ai* > 0 are called Support Vectors (SVs). They are the points that are either misclassified by the computed separating function or lie closer than a minimum distance (the margin of the solution) to the separating surface [5]. In many applications they form a small subset of the training points. For certain choices of the mapping Φ(x) we can express the dot product in the feature space defined by the φi's as Φ(xi)·Φ(xj) = K(xi, xj), where K is called the kernel of the Reproducing Kernel Hilbert Space defined by the φi's [5]. We may observe that the spatial complexity of Problem P2 depends only on the number of training points (through the l x l matrix D), independently of the dimensionality of the feature space. This observation allows us to extend the method to feature spaces of infinite dimension [3]. In practice, however, because of memory and speed requirements, Problem P2 imposes limitations on the size of the training set [6].

5. Results

Although our problem is actually a multi-class classification (predicting the winner with three possible outcomes: host win, guest win, draw), little or no research has been done on one-step multi-class SVMs [7]. Thus we solve this classification problem as a common regression problem, where the SVM algorithm has to minimize the mean square error.
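At toy scale, the dual machinery of section 4 can be checked end to end. The sketch below (an added illustration; the paper itself relies on a dedicated SVM solver) brute-forces Problem P2 on a grid for two one-dimensional points, recovers w* from eq. (7), and then b from a support vector strictly below the bound C:

```python
# Toy check of the dual Problem P2: two 1-D points, linear (dot) kernel.
# A real solver would use quadratic programming; a grid search suffices here.
xs = [1.0, -1.0]   # training points
ys = [1, -1]       # labels
C = 10.0           # penalty coefficient

def dual_objective(a):
    lin = sum(a)
    quad = 0.5 * sum(a[i] * a[j] * ys[i] * ys[j] * xs[i] * xs[j]
                     for i in range(2) for j in range(2))
    return lin - quad

best_val, best_a = float("-inf"), None
steps = 200
for k in range(steps + 1):
    a1 = C * k / steps
    a2 = a1   # the constraint sum(y_i * a_i) = 0 forces a1 == a2 here
    val = dual_objective((a1, a2))
    if val > best_val:
        best_val, best_a = val, (a1, a2)

w = sum(best_a[i] * ys[i] * xs[i] for i in range(2))   # eq. (7): w* = sum a_i y_i x_i
b = ys[0] - w * xs[0]   # from a support vector with 0 < a_i < C
# Both points end up as support vectors (a_i = 0.5 > 0), w = 1, b = 0,
# and the decision function f(x) = w * x + b separates them with margin 1.
```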
Then, in order to get the predicted outcome, the following rules are applied to the denormalized forecasted values: if forecasted_value >= 0, consider a positive or zero score result («guest team will not win»); if forecasted_value < 0, consider a negative score result («host team will not win»). Since SVM classification must be applied between two classes, we chose to ignore the draw case as a special (no-winner) case, keeping the sign of the output as the indicator of the predicted class. The algorithm was given as input a set of 105 training data records, and the SVM was tested on 70 test data records [2]. All data were normalized to the [-1, 1] range. The software applied was mySVM [8]. We selected as kernel function the dot function (simple multiplication), as we had no evidence for the appropriateness of other, more complex functions. We also set the capacity parameter of the SVM equal to C = 1000. This parameter has to be positive; its value is then divided by the number of examples used for training. The other important parameter (see section 4) is the insensitivity, known as epsilon, a constant by which the prediction may deviate from the functional value without being penalized. The algorithm sets both a positive (epsilon+) and a negative (epsilon-) insensitivity. Here we set epsilon = 0.01.
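The pre- and post-processing just described can be sketched as follows (an illustration with made-up numbers; the SVM regressor itself is represented only by a stub forecast value):

```python
# Sketch: [-1, 1] normalization of a feature column and the sign rule that
# turns a denormalized regression forecast into one of the two class labels.
def normalize(column):
    lo, hi = min(column), max(column)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in column]

def outcome(forecast):
    # forecast >= 0 -> "guest team will not win"; forecast < 0 -> "host team will not win"
    return "guest team will not win" if forecast >= 0.0 else "host team will not win"

assert normalize([0.0, 5.0, 10.0]) == [-1.0, 0.0, 1.0]

stub_forecast = 0.3   # stands in for a denormalized SVM regression output
assert outcome(stub_forecast) == "guest team will not win"
assert outcome(-0.2) == "host team will not win"
```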

The algorithm statistics are presented in detail in Table 1 and are explained as follows. Support Vectors is the number of support vectors produced. Bounded SVs is the number of support vectors at the upper bound; then the minimum and the maximum values of the alphas are shown. |w| is the 2-norm of the hyperplane vector, and VCdim is an estimate of the Vapnik-Chervonenkis dimension computed from the last two values. (w1, ..., w5) is the hyperplane vector for the attributes, and b is the additive constant of the hyperplane. The following results were obtained after 377 iterations: Train Set mean square error: 0.0597589; Test Set mean square error: 0.05367684. By applying the classification rules described in the previous paragraph we obtained the following results: correct prediction on the Test Set is 43 out of 70 examples (accuracy 61.4%).

Table 1 - Support vector learning output statistics

Parameter         Value
Support Vectors      97
Bounded SVs          90
min SV           -9.7087379
max SV            9.7087379
|w|               0.8035
VCdim            <= 1.3774434
w1                0.570
w2               -0.00445
w3                0.8758
w4                0.838793
w5                0.0998453
b                 0.0638468

In order to compare our model with other approaches, in Table 2 we consider results obtained by other computational intelligence approaches in previous work [2]. Those results were obtained for a prediction that included the draw result of the matches; thus their quotation here is indicative. Also, the results for the fuzzy model and the neural network include the classification score on a 175-element set (training and testing sets together). These results can help, however, to draw general conclusions on the effectiveness of the method on this data set.

Table 2 - Comparison of the SVM model with other approaches

Model                        Correct classification
Fuzzy model                  64 % (both sets)
Neural network               64 % (both sets)
Genetic programming model    64.8 % (test set)
Support Vector Machines      61.4 % (test set)

6. Conclusions - Further Research

This paper briefly demonstrates the application of modern statistical or entropy-based approaches, such as Support Vector Machines.
The latter, a relatively new computational intelligence approach, was implemented on the (for SVM theory) common ±1 outcome basis, with positive values corresponding to a «guest team will not win» outcome and negative values to a «host team will not win» outcome. These first results presented in the paper are indicative of the usability of SVMs, denoting the competitiveness of this approach among other intelligent approaches for data-driven forecasting and decision making. Further research in this domain may involve hybrid computational intelligence schemes (see [9] for a detailed review), as such approaches have proved in many cases capable of capturing nearly stochastic or chaotic processes, offering a high classification and prediction rate.

References

1. Zadeh L. Applied Soft Computing - Foreword // Applied Soft Computing, 2001, Vol. 1, P. 1-2.
2. Tsakonas A., Dounias G., Shtovba S., Vivdyuk V. Soft Computing-Based Result Prediction of Football Games // Proceedings of the First International Conference on Inductive Modelling, Lviv, 2002, Vol. 3, P. 15-21.

3. Vapnik V.N. Statistical Learning Theory. - Wiley-Interscience, 1998. - 736 p.
4. Vapnik V.N. The Nature of Statistical Learning Theory. - Springer-Verlag, 1999. - 304 p.
5. Cortes C., Vapnik V. Support-Vector Networks // Machine Learning, 1995, Vol. 20, P. 273-297.
6. Evgeniou T., Pontil M. Support Vector Machines with Clustering for Training with Very Large Datasets. - Springer: Lecture Notes in Computer Science, Vol. 2308, 2002. - P. 346-354.
7. Boser B., Guyon I., Vapnik V. A Training Algorithm for Optimal Margin Classifiers // Computational Learning Theory, 1992, Vol. 5, P. 144-152.
8. Ruping S. mySVM - Manual. Technical Report. - University of Dortmund, Computer Science Department, 2000.
9. Tsakonas A., Dounias G. Hybrid Computational Intelligence Schemes in Complex Domains: An Extended Review. - Springer: Lecture Notes in Computer Science, Vol. 2308, 2002. - P. 494-511.

Tsakonas Athanasios Demetrios, MSc, PhD Student, University of the Aegean, Chios, Greece. Scientific interests: Computational Intelligence, Decision Making, Wavelets, Chaos Theory. Tel.: (30937) 89399. E-mail: tsakonas@stt.aegean.gr.

Dounias Georgios D., PhD, Lecturer, University of the Aegean, Chios, Greece. Scientific interests: Computational Intelligence, Decision Making, Wavelets, Medical Applications of Artificial Intelligence. Tel.: (307)-94408. E-mail: g.dounias@aegean.gr.

Shtovba Serhiy Dmytrovych, PhD, Assistant Professor, Vinnitsa State Technical University, Vinnitsa, Ukraine. Scientific interests: Fuzzy Logic, Genetic Algorithms, Decision Making, Reliability, Quality Control. Tel.: (043)-440430. E-mail: serg@faksu.vstu.vinnica.ua.

The article was received by the editorial board in 2002.

A.D. Tsakonas, G.D. Dounias, S.D. Shtovba. Forecasting football match results with a separating-hyperplane machine. The article proposes a method for forecasting the results of football matches, based on the soft-computing technology of machine learning with a separating-hyperplane machine (support vector machine).
The forecasting model developed in the article takes into account the following team indicators: the difference in losses of leading players; the difference in the teams' playing dynamics; the difference in team class; the home-field factor; and the results of the teams' head-to-head meetings. Testing shows that the proposed model provides good agreement between the forecasted and actual results of football matches, which allows the separating-hyperplane machine to be recommended as a promising approach for forecasting the results of various sports championships.

A. Tsakonas, G. Dounias, S. Shtovba. Forecasting football match outcomes with support vector machines. A soft computing method for result prediction of football games, based on machine learning techniques such as support vector machines, is proposed in the article. The model takes into account the following features of the football teams: difference of infirmity factors; difference of dynamics profiles; difference of ranks; host factor; personal score of the teams. Testing shows that the proposed model achieves a satisfactory estimation of the actual game outcomes. The work concludes with the recommendation of the support vector machines technique as a powerful approach for the creation of result prediction models for diverse sports championships.