A Trading System for FTSE-100 Futures Using Neural Networks and Wavelets




Reprint from: BP Working Paper in Financial Economics Series (3), September 1997

D L Toulson, S P Toulson
Intelligent Financial Systems Limited
ifs@if5.com
www.if5.com

ABSTRACT

In this paper, we shall examine the combined use of the Discrete Wavelet Transform [7] and regularised neural networks to predict intra-day returns of the LIFFE FTSE-100 index future. The Discrete Wavelet Transform (DWT) has recently been used extensively in a number of signal processing applications [5, 6]. In this work, we shall propose the use of a specialised neural network architecture (WEAPON) that includes within it a layer of wavelet neurons. These wavelet neurons serve to implement an initial wavelet transformation of the input signal, which in this case will be a set of lagged returns from the FTSE-100 future. We derive a learning rule for the WEAPON architecture that allows the dilations and positions of the wavelet nodes to be determined as part of the standard back-propagation of error algorithm. This ensures that the child wavelets used in the transform are optimal in terms of providing the best discriminatory information for the prediction task. We then examine how the predictions obtained from committees of WEAPON networks may be exploited to establish trading rules for adopting positions in the FTSE-100 Index Future using a Signal Thresholded Trading System (STTS). The STTS operates by combining predictions of the future return estimates of a financial time series over a variety of different prediction horizons. A set of trading rules is then determined that act to optimise the risk adjusted performance (Sharpe Ratio) of the trading strategy using realistic assumptions for bid/ask spread, slippage and transaction costs.

1 INTRODUCTION

Over the past decade, the use of neural networks for financial and econometric applications has been widely researched. In particular, neural networks have been applied to the task of providing forecasts for various financial markets ranging from spot currencies to equity indexes. The implied use of these forecasts is often to develop systems to provide profitable trading recommendations. However, in practice, the success of neural network trading systems has been somewhat poor. This may be attributed to a number of factors. In particular, we can identify the following weaknesses in many approaches:

1. Data Pre-processing. Inputs to the neural network are often simple lagged returns (or even prices!). The dimension of this input information is often much too high in the light of the number of training samples likely to be available. Techniques such as Principal Components Analysis (PCA) and Discriminant Analysis can often help to reduce the dimension of the input data [11, 12]. In this paper, we present an alternate approach using the Discrete Wavelet Transform (DWT).

2. Model Complexity. Neural networks are often trained for financial forecasting applications without suitable regularisation techniques. Techniques such as Bayesian Regularisation [2, 3, 10] or simple weight decay help control the complexity of the mapping performed by the neural network and reduce the effect of over-fitting of the training data. This is particularly important in the context of financial forecasting due to the high level of noise present within the data.

3. Confusion Of Prediction And Trading Performance. Often researchers present results for financial forecasting in terms of root mean square prediction error or number of accurately forecast turning points. Whilst these values contain useful information about the performance of the predictor, they do not necessarily imply that a successful trading system may be based upon them. The performance of a trading system is usually dependent on the performance of the predictions at key points in the time series. This performance is not usually adequately reflected in the overall performance of the predictor averaged over all points of a large testing period.

We shall present a practical trading model in this paper that attempts to address each of these points.

2 THE PREDICTION MODEL

In this paper, we shall examine the use of committees of neural networks to predict future returns of the FTSE-100 Index Future over 15, 30, 60 and 90 minute prediction horizons. We shall then combine these predictions and determine from them a set of trading rules that will optimise risk adjusted performance (Sharpe Ratio). We shall use as input to each of the neural network predictors the previous 40 lagged minutely returns of the FTSE-100 Future. The required output shall be the predicted future return for the appropriate prediction horizon. This process is illustrated in Figure 1.

Figure 1: Predicting FTSE-100 Index Futures: 40 lagged returns are extracted from the FTSE-100 future time series. These returns are used as input to (WEAPON) MLPs. Different MLPs are trained to predict the return of the FTSE-100 future 15, 30, 60 and 90 minutes ahead.

A key consideration concerning this type of prediction strategy is how to encode the 40 available lagged returns as a neural network input vector. One possibility would be to simply use all 40 raw inputs. The problem with this approach is the high dimensionality of the input vectors. This will require us to use an extremely large set of training examples to ensure that the parameters of the model (the weights of the neural network) may be properly determined. Due to computational complexities and the non-stationarity of financial time series, using extremely large training sets is seldom practical. A preferable strategy is to attempt to reduce the dimension of the input information.
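The input/target construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_examples`, the use of simple (rather than logarithmic) returns, and the default values of `n_lags` and `horizon` are assumptions made here for the example.

```python
def build_examples(prices, n_lags=40, horizon=15):
    """Build (inputs, targets) pairs from a list of 1-minute prices.

    Each input is the n_lags most recent one-minute returns known at time t.
    Each target is the return from time t to time t + horizon.
    Simple returns are an assumption; the paper does not say which kind it uses.
    """
    # returns[k] is the return earned between prices[k] and prices[k + 1]
    returns = [prices[k + 1] / prices[k] - 1.0 for k in range(len(prices) - 1)]
    inputs, targets = [], []
    for t in range(n_lags, len(prices) - horizon):
        # Only returns that end at or before time t are used (no look-ahead)
        inputs.append(returns[t - n_lags:t])
        targets.append(prices[t + horizon] / prices[t] - 1.0)
    return inputs, targets


if __name__ == "__main__":
    demo_prices = [100.0 + k for k in range(60)]
    x, y = build_examples(demo_prices)
    print(len(x), len(x[0]), y[0])
```

One predictor per horizon would be trained on the pairs produced with the corresponding `horizon` value.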

A popular approach to reducing the dimension of inputs to neural networks is to use a Principal Components Analysis (PCA) transform to reduce information redundancy in the input vectors due to inter-component correlations. However, as we are working with lagged returns from a single financial time series we know, in advance, that there is little (auto) correlation in the lagged returns. In other work [11, 12], we have approached the problem of dimension reduction through the use of Discriminant Analysis techniques. These techniques were shown to lead to significantly improved performance in terms of prediction ability of the trained networks. However, such techniques do not, in general, take any advantage of our knowledge of the temporal structure of the input components, which in this case will be sequential lagged returns. Such techniques are also implicitly linear in their assumptions of separability, which may not be generally appropriate when seeking an optimal set of inputs to (non-linear) neural networks. We shall consider, as an alternative means of reducing the dimension of the input vectors, the use of the Discrete Wavelet Transform.

3 THE DISCRETE WAVELET TRANSFORM (DWT)

Figure 2: The discrete wavelet transform.

The Discrete Wavelet Transform [4, 5] has recently received much attention as a technique for the preprocessing of data in applications involving either the compact representation of the original data (i.e. data compression or factor analysis) or a discriminatory basis for pattern recognition and regression problems. The transform functions by projecting the original signal onto a sub-space spanned by a set of child wavelets derived from a particular Mother wavelet. For example, let us select the Mother wavelet to be the Mexican Hat function

$$\psi(t) = \frac{2}{\sqrt{3}} \pi^{-1/4} (1 - t^2) e^{-t^2/2} \qquad (1)$$

The wavelet children are then dilated and translated forms of (1), i.e.

$$\psi_{\tau,\xi}(t) = \frac{1}{\sqrt{\xi}} \psi\left(\frac{t - \tau}{\xi}\right) \qquad (2)$$

Now, let us select a finite subset C from the infinite set of possible child wavelets. Let the members of the subset be identified by the discrete values of position $\tau_i$ and scale $\xi_i$, $i = 1, \dots, K$, i.e.

$$C = \{\psi_{\tau_i,\xi_i},\ i = 1, \dots, K\} \qquad (3)$$

The i-th component of the projection of the original signal x onto the K dimensional space spanned by the child wavelets is then

$$y_i = \sum_{j=1}^{N} x_j \psi_{\tau_i,\xi_i}(j) \qquad (4)$$

The significant questions to be answered with respect to using the DWT to reduce the dimension of the input vectors to the neural network are firstly how many child wavelets should be used and, given that, what values of shift and dilation, $\tau_i$ and $\xi_i$, should be chosen? In this paper, we shall present a method of choosing a suitable set of child wavelets such that the transformation of the original data (the 40 lagged returns) will enhance the non-linear separability of different classes of signal (i.e. future positive and negative returns) whilst significantly reducing its dimension. We show how this may be achieved naturally by implementing the wavelet transform as a set of neurons contained in the first layer of a multi-layer perceptron.

4 THE WAVELET ENCODING A PRIORI ORTHOGONAL NETWORK (WEAPON)

The WEAPON architecture is shown below in Figure 3. The architecture is essentially a Multi Layer Perceptron (MLP) with an additional layer of special wavelet nodes. Each of these nodes represents a single child wavelet and its output response is simply the projection of the input vector onto that particular wavelet, i.e.

$$y_j = \sum_{i=1}^{N} x_i \psi_{\tau_j,\xi_j}(i) \qquad (5)$$

where in this case, $y_j$ is the output of the j-th wavelet node, $x_i$ is the i-th component of the input vector and $\tau_j$ and $\xi_j$ are respectively the shifts and dilations associated with the j-th wavelet node. The scales and shifts for each wavelet node are optimised by including them as parameters within the usual backprop training algorithm. We can thus determine the optimal set of scales and shifts appropriate for the prediction task we wish to solve. Details of the derivation of backprop for the wavelet neurons are given in Appendix A. In addition to this training rule to optimise the shifts and scales of the wavelet nodes, we have also devised mechanisms to control both the orthogonality of the individual wavelets and the regularisation of the complexity of the network mapping as a whole. Details of this are again included in Appendix A.

Figure 3: The WEAPON architecture. [Diagram labels: DWT, 40 lagged returns, pseudo weights, wavelet nodes, MLP; outputs at 15, 30, 60 and 90 min.]
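As a concrete illustration of the wavelet node layer, the sketch below evaluates the Mexican Hat mother wavelet, its dilated and translated children, and the projection of an input vector onto a set of child wavelets. The function names and the choice to index input components from 1 are assumptions made for this example; it is a sketch of the forward pass only, not the authors' implementation.

```python
import math


def mexican_hat(t):
    """Mexican Hat mother wavelet, normalised to unit energy."""
    return (2.0 / math.sqrt(3.0)) * math.pi ** -0.25 * (1.0 - t * t) * math.exp(-0.5 * t * t)


def child_wavelet(t, shift, dilation):
    """Child wavelet: the mother wavelet translated by `shift` and dilated by `dilation`."""
    return mexican_hat((t - shift) / dilation) / math.sqrt(dilation)


def wavelet_layer(x, shifts, dilations):
    """Project the input vector x onto each child wavelet (one output per wavelet node)."""
    outputs = []
    for shift, dilation in zip(shifts, dilations):
        # Input components are indexed 1..N, an assumption made for this sketch
        y = sum(x_i * child_wavelet(i, shift, dilation) for i, x_i in enumerate(x, start=1))
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    lagged_returns = [0.001 * math.sin(0.4 * k) for k in range(40)]
    print(wavelet_layer(lagged_returns, shifts=[10.0, 20.0, 30.0], dilations=[2.0, 4.0, 8.0]))
```

With K wavelet nodes the 40-dimensional input is reduced to K numbers, which are then passed on to the ordinary MLP layers.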

5 PREDICTING FTSE-100 FUTURES USING WEAPON NETWORKS

5.1 The Data

We shall apply the network architecture and training rules described in the previous section to the task of predicting future returns of the FTSE-100 index futures quoted on LIFFE. The historical price data used was tick-by-tick quotes of actual trades supplied by LIFFE. The data was pre-processed to a 1-minutely format by taking the average volume adjusted traded price during each minute. Missing values were filled in by interpolation but were marked un-tradable. Prices were obtained in this manner for the whole of January 1995 to June 1996 to yield approximately 00,000 distinct prices. The entire data set was then divided into three distinct subsets: training/validation, optimisation and test. We trained and validated the neural network models on the first six months of 1995 data. The prediction performance results quoted in this section are the results of applying the neural networks to the second six months of the 1995 data. The STTS trading model parameters (described in the next section) were also optimised using this period. We reserved the whole of 1996 for out-of-sample trading performance test purposes.

Figure 4: The FTSE-100 future January 1995 to January 1996.

5.2 Individual Predictor Performances

Tables 1 to 4 show the performances of four different neural network predictors for the four prediction horizons (15, 30, 60 and 90 minute ahead predictions). The predictors used were

1. A simple early-stopping MLP trained using all 40 lagged return inputs with an optimised number of hidden nodes found by exhaustive search (-3 nodes).
2. A standard weight decay MLP trained using all 40 lagged returns with the value of weight decay, lambda, optimised by cross validation.
3. An MLP trained with Laplacian weight decay and weight/node elimination (as in Williams [14]).
4. A WEAPON architecture using wavelet nodes, soft orthogonalisation constraints and Laplacian weight decay for weight/node elimination.

The performances of the architectures are shown in terms of

1. RMSE prediction error in terms of desired and actual network outputs.
2. Turning point accuracy: This is the number of times the network correctly predicts the sign of the future return.
3. Large turning point accuracy: This is the number of times that the network correctly predicts the sign of returns whose magnitude is greater than one standard deviation from zero (this measure is relevant in terms of expected trading system performance).
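The three performance measures listed above can be computed along the following lines. This is a hedged sketch: the function names are invented for the example, accuracies are reported as percentages, zero realised returns are excluded from the sign test, and the "one standard deviation" threshold is taken to be the population standard deviation of the realised returns. None of these details is specified in the text.

```python
import math
import statistics


def rmse(predicted, actual):
    """Root mean square error between predicted and realised returns."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def sign_accuracy(predicted, actual, min_magnitude=0.0):
    """Percentage of cases in which the predicted sign matches the realised sign.

    Only realised returns with |return| > min_magnitude are counted.
    """
    pairs = [(p, a) for p, a in zip(predicted, actual) if abs(a) > min_magnitude]
    if not pairs:
        return float("nan")
    hits = sum(1 for p, a in pairs if p * a > 0)
    return 100.0 * hits / len(pairs)


def large_sign_accuracy(predicted, actual):
    """Sign accuracy restricted to realised returns larger than one standard deviation."""
    threshold = statistics.pstdev(actual)  # assumption: population standard deviation
    return sign_accuracy(predicted, actual, min_magnitude=threshold)


if __name__ == "__main__":
    pred = [0.1, -0.2, 0.3, -0.1]
    real = [0.2, -0.1, -0.3, -0.4]
    print(rmse(pred, real), sign_accuracy(pred, real), large_sign_accuracy(pred, real))
```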

Prediction horizon   % Accuracy   Large % Accuracy   RMSE
15                   507%         5477%              0003
30                   504%         5955%              0039379
60                   569%         546%               007403
90                   5%           508%               0085858

Table 1: Results for MLP using early stopping

Prediction horizon   % Accuracy   Large % Accuracy   RMSE
15                   5075%        570%               00533
30                   500%         5608%              003459
60                   5335%        5409%              006099
90                   548%         574%               08560

Table 2: Results for weight decay MLP

Prediction horizon   % Accuracy   Large % Accuracy   RMSE
15                   55%          4855%              000467
30                   546%         5434%              00356
60                   464%         4348%              0064493
90                   5039%        508%               009000

Table 3: Results for Laplacian weight decay MLP

Prediction horizon   % Accuracy   Large % Accuracy   RMSE
15                   537%         5743%              00879
30                   579%         569%               00344
60                   546%         5794%              006044
90                   55%          588%               0088

Table 4: Results for WEAPON

We conclude that the WEAPON architecture and the simple weight decay architecture appear significantly better than the other two techniques. The WEAPON architecture appears to be particularly good at predicting the sign of large market movements.

5.3 Use of Committees for Prediction

In the previous section, we presented prediction performance results using a single WEAPON architecture applied to the four required prediction horizons. A number of authors have suggested the use of linear combinations of neural networks as a means of improving the robustness of neural networks for forecasting and other tasks. The basic idea of a committee is to independently train a number of neural networks and to then combine their outputs. Suppose we have trained N neural networks and that the output of the i-th net is given by $y_i(\mathbf{x})$. The committee response is given by

$$y(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i y_i(\mathbf{x}) + \alpha_0 \qquad (6)$$

where $\alpha_i$ is the weighting for the i-th network and $\alpha_0$ is the bias of the committee. The weightings $\alpha_i$ may either be simple averages (Basic Ensemble Method) or may be optimised using an OLS procedure (Generalised Ensemble Method). Specifically, the OLS weightings may be determined by

$$\boldsymbol{\alpha} = \Xi^{-1} \boldsymbol{\Gamma} \qquad (7)$$
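The two weighting schemes can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: it takes Ξ to be the matrix of averaged products of network outputs over the training set and Γ the vector of averaged output-target products, and the function names (`gem_weights`, `bem_weights`, `committee_response`) and the small Gaussian elimination solver are assumptions introduced here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_k] for row, b_k in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def bem_weights(n_networks):
    """Basic Ensemble Method: every network gets the same weight."""
    return [1.0 / n_networks] * n_networks


def gem_weights(outputs, targets):
    """Generalised Ensemble Method: OLS weights alpha = Xi^{-1} Gamma.

    outputs[t][i] is the output of network i on training example t.
    """
    n_examples = len(targets)
    n_networks = len(outputs[0])
    xi = [[sum(o[i] * o[j] for o in outputs) / n_examples for j in range(n_networks)]
          for i in range(n_networks)]
    gamma = [sum(o[i] * t for o, t in zip(outputs, targets)) / n_examples
             for i in range(n_networks)]
    return solve_linear_system(xi, gamma)


def committee_response(network_outputs, weights, bias=0.0):
    """Weighted sum of the individual network outputs plus the committee bias."""
    return sum(w * y for w, y in zip(weights, network_outputs)) + bias


if __name__ == "__main__":
    outs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
    tgts = [0.7, 0.3, 1.0, 1.1]
    print(gem_weights(outs, tgts))
```

If the networks are highly correlated, Ξ becomes close to singular and the OLS weights can be unstable, which is one reason simple averaging is often preferred in practice.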

In Equation (7), Ξ and Γ are defined in terms of the outputs of the individual trained networks and the training examples, i.e.

$$\Xi = [\xi_{ij}], \quad \xi_{ij} = \frac{1}{T} \sum_{t=1}^{T} y_i(\mathbf{x}_t)\, y_j(\mathbf{x}_t), \qquad \boldsymbol{\Gamma} = [\gamma_i], \quad \gamma_i = \frac{1}{T} \sum_{t=1}^{T} y_i(\mathbf{x}_t)\, t_t \qquad (8)$$

where $\mathbf{x}_t$ is the t-th input vector, $t_t$ is the corresponding target response and T is the number of training examples.

Below, we show the prediction performances of committees composed of five independently trained WEAPON architectures, for each of the prediction horizons. We conclude that the performances (in terms of RMSE) are superior to those obtained using a single WEAPON architecture. Turning point detection accuracy, however, is broadly similar.

Prediction horizon   % Accuracy   Large % Accuracy   RMSE
15                   535%         577%               00734
30                   534%         5698%              0036
60                   5447%        577%               00559
90                   559%         5869%              00809

Table 5: Results for Committees of five independently trained WEAPON architectures

6 THE SIGNAL THRESHOLDED TRADING SYSTEM

6.1 Background

One might think that if we have a neural network or other prediction model correctly predicting the future direction of a market 60 percent of the time, then it would be relatively straightforward to devise a profitable trading strategy. In fact, this is not necessarily the case. In particular one must consider the following:

- What are the effective transaction costs that are incurred each time we execute a round-trip trade?
- Over what horizon are we making the predictions? If the horizon is particularly short term (i.e. 15-minute ahead predictions on intra-day futures markets), is it really possible to get in and out of the market quickly enough and, more importantly, to get the quoted prices? In terms of building profitable trading systems it may be more effective to have a lower accuracy but longer prediction horizons.
- What level of risk is being assumed by taking the indicated positions? We may, for instance, want to optimise not just pure profit but perhaps some risk-adjusted measure of performance such as Sharpe Ratio or Sterling Ratio.

An acceptable trading system has to take account of some or all of the above considerations.

6.2 The Basic STTS Model

Assume we have P predictors making predictions about the expected FTSE-100 Futures returns. Each of the predictors makes predictions for a particular number of time steps ahead. Let the prediction of the i-th predictor at time t be denoted by $p_i(t)$. We shall define the normalised trading signal S(t) at time t to be:

$$S(t) = \sum_{i=1}^{P} \omega_i\, p_i(t) \qquad (9)$$

where $\omega_i$ is the weighting given to the i-th predictor. An illustration of this is given in Figure 5.

Figure 5: Weighted summation of predictions from four WEAPON committee predictors to give a single trading signal.

We shall base the trading strategy on the strength of the combined trading signal S(t) at any given time t. At time t we compare the trading signal S(t) with two thresholds, denoted by α and β. These two thresholds are used for the following decisions:

- α is the threshold that controls when to open a long or short trade.
- β is the threshold used to decide when to close out an open long or short trade.

At any given time t, the trading signal will be compared with the appropriate threshold for the current trading position. In particular, details of the actions defined for each trading position are found in Table 6:

Current position   Test            Action: Go
Flat               if S(t) > α     Long
Flat               if S(t) < -α    Short
Long               if S(t) < -β    Flat
Short              if S(t) > β     Flat

Table 6: Using the trading thresholds to decide which action to take

Figure 6 demonstrates the concept of using the two thresholds for trading. The two graphs shown in Figure 6 are the trading signals S(t) for each time t (top graph) and the associated prices p(t) displayed in the bottom graph. The price graph is colour coded for the different trading positions that are recommended: blue for a long recommendation, red for a short trading recommendation and grey otherwise. At the beginning of trading we are in a flat position. We shall open a trade if the absolute value of the trading signal exceeds α. At the time marked ❶ this is the case since the trading signal is greater than α. We shall open a long trade. Unless the trading signal falls below -β, this long trade will stay open. This condition is fulfilled at the time marked ❷, when we shall close out the long trade. We are now again in a flat position. At time ❸ the trading signal falls below -α, so we open a short trading position. This position is not closed out until the trading signal exceeds β, which occurs at time ❹ when the short trade is closed out.
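The threshold rules of Table 6 amount to a small state machine, sketched below. The function names and the string labels for the three positions are assumptions made for this example, and the sketch ignores execution delay and transaction costs.

```python
def trading_signal(predictions, weights):
    """Combined trading signal: weighted sum of the individual predictions."""
    return sum(w * p for w, p in zip(weights, predictions))


def next_position(position, signal, alpha, beta):
    """Apply the threshold rules of Table 6 to the current position."""
    if position == "flat":
        if signal > alpha:
            return "long"
        if signal < -alpha:
            return "short"
    elif position == "long":
        if signal < -beta:
            return "flat"
    elif position == "short":
        if signal > beta:
            return "flat"
    return position  # no threshold crossed: keep the current position


def run_stts(signals, alpha, beta):
    """Return the position held after each signal, starting from flat."""
    position = "flat"
    path = []
    for s in signals:
        position = next_position(position, s, alpha, beta)
        path.append(position)
    return path


if __name__ == "__main__":
    print(run_stts([0.1, 0.6, 0.2, -0.3, -0.7, -0.1, 0.3], alpha=0.5, beta=0.2))
```

In this sketch a long position is never reversed directly into a short one: it must first be closed out to flat, and a new trade can be opened at the next time step.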

Figure 6: Trading signals and prices.

7 RESULTS

An STTS trading system, as described above, was formed using as input 4 WEAPON committee predictors. Each committee contained five independently trained WEAPON networks and was trained to produce 15, 30, 60 and 90-minute ahead predictions, respectively. A screen-shot from the software used to perform this simulation (Amber) is shown below in Figure 7. The optimal values for the STTS thresholds α and β and the four STTS predictor weightings $\omega_i$ to use were found by assessing the performance of the STTS model on the optimisation data (last 6 months of 1995) using particular values for the parameters. The parameters were then optimised using simulated annealing with the objective function being the trading performance on this period measured in terms of Sharpe Ratio. In terms of trading conditions, it was assumed that there would be a three minute delay in opening or closing any trade and that the combined bid-ask spread / transaction charge for each round trip trade would be 8 points. Both are considered conservative estimates. After the optimal parameters for the STTS system were determined, the trading system was applied to the previously unseen data of the first half of 1996. Table 7 summarises the trading performance over the six-month test period in terms of over-all profitability, trading frequency and Sharpe Ratio.

Monthly net profitability in ticks               53
Average monthly trading frequency (roundtrip)    8
Sharpe ratio daily (monthly)                     036 (048)

Table 7: Results of trading system on the unseen test period
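For reference, the kind of objective used in the optimisation can be sketched as follows. This is only an illustration under stated assumptions: profit and loss is measured in points per period, the cost per round trip is passed in as a parameter, the risk-free rate is taken as zero, and the ratio is not annualised. The text does not give the exact Sharpe Ratio convention used, and the function names are invented for the example.

```python
import statistics


def net_pnl(gross_points, round_trips, cost_per_round_trip=8.0):
    """Per-period profit in points after charging a fixed cost per round-trip trade."""
    return [g - cost_per_round_trip * n for g, n in zip(gross_points, round_trips)]


def sharpe_ratio(pnl):
    """Mean profit divided by its sample standard deviation (no annualisation)."""
    spread = statistics.stdev(pnl)
    if spread == 0.0:
        return float("nan")
    return statistics.mean(pnl) / spread


if __name__ == "__main__":
    daily = net_pnl([20.0, -5.0, 30.0, 10.0], [1, 0, 2, 1])
    print(daily, sharpe_ratio(daily))
```

An optimiser such as simulated annealing would call a function like `sharpe_ratio` once per candidate setting of the thresholds and predictor weights.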

Figure 7: The Trading System for FTSE-100 futures. 40 lagged returns are extracted from the FTSE-100 future time series and, after standardisation, input to the 20 WEAPON predictors, arranged in four committees. Each committee is responsible for a particular prediction horizon. The predictions are then combined for each committee and passed on to the STTS trading module.

8 CONCLUSION

We have presented a complete trading model for adopting positions in the LIFFE FTSE-100 Future. In particular, we have developed a system that avoids the three weaknesses that we identified in the introduction, namely

1. Data Pre-Processing. We have constrained the effective dimension of the 40 lagged returns by imposing a Discrete Wavelet Transform on the input data via the WEAPON architecture. We have also, within the WEAPON architecture, devised a method for automatically discovering the optimal number of wavelets to use in the transform and also which shifts and dilations should be used.
2. Regularisation. We have applied Bayesian regularisation techniques to constrain the complexity of the prediction models. We have demonstrated the requirement for this by comparing the prediction performances of regularised and unregularised (early-stopping) neural network models.
3. STTS Trading Model. The STTS model is designed to transform predictions into actual trading strategies. Its objective criterion is therefore not RMS prediction error but the risk adjusted profit of the trading strategy. The model has been shown to provide relatively consistent profits in simulated out-of-sample high frequency trading over a 6-month period.

9 BIBLIOGRAPHY

[1] D.E. Rumelhart, G.E. Hinton, R.J. Williams. Learning Internal Representations By Error Propagation. In Parallel Distributed Processing, Chapter 8. MIT Press, 1986.
[2] D.J.C. MacKay. Bayesian Interpolation. Neural Comput. 4(3), 415-447, 1992.
[3] D.J.C. MacKay. A Practical Bayesian Framework For Backprop Networks. Neural Comput. 4(3), 448-472, 1992.
[4] B.A. Telfer, H. Szu, G.J. Dobeck. Time-Frequency, Multiple Aspect Acoustic Classification. World Congress on Neural Networks, pp. II-34 II-39, July 1995.
[5] D.P. Casasent, J.S. Smokelin. Neural Net Design of Macro Gabor Wavelet Filters for Distortion-Invariant Object Detection In Clutter. Optical Engineering, Vol. 33, No. 7, pp. 2264-2270, July 1994.
[6] H. Szu, B. Telfer. Neural Network Adaptive Filters For Signal Representation. Optical Engineering 31, 1907-1916, 1992.
[7] I. Daubechies. Orthonormal Bases of Compactly Supported Wavelets. Communications on Pure and Applied Mathematics, Vol. 41, No. 7, pp. 909-996, 1988.
[8] K. Fukunaga. Statistical Pattern Recognition (2nd Edition). Academic Press, 1990.
[9] M.F. Moller. A Scaled Conjugate Gradient Method For Fast Supervised Learning.
[10] W.L. Buntine, A.S. Weigend. Bayesian Back-Propagation. Complex Systems 5, 603-643.
[11] D.L. Toulson, S.P. Toulson. Use of Neural Network Ensembles for Portfolio Selection and Risk Management. Proc. Forecasting Financial Markets, Third International Conference, London, 1996.
[12] D.L. Toulson, S.P. Toulson. Use of Neural Network Mixture Models for Forecasting and Application to Portfolio Management. Sixth International Symposium on Forecasting, Istanbul, 1996.
[13] S.E. Fahlman. Faster Learning Variations On Back-Propagation: An Empirical Study. Proceedings Of The 1988 Connectionist Models Summer School, pp. 38-51. Morgan Kaufmann.
[14] P.M. Williams. Bayesian Regularisation and Pruning Using A Laplace Prior. Neural Computation, Vol. 5, 1993.
[15] Y. Meyer. Wavelets and Operators. Cambridge University Press, 1995.

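APPENDIX A

DERIVING BACKPROP FOR WEAPON

The MLP is usually trained using error backpropagation. Backprop requires the calculation of the partial derivatives of the data error $E_D$ with respect to each of the free parameters of the network (usually the weights and biases of the nodes). For the case of wavelet neurons, the weights between the neuron and the input pattern are not free but are constrained to assume discrete values of a particular child wavelet. The free parameters for the wavelet nodes are therefore not the weights, but the values of translation and dilation, τ and ξ. To optimise these parameters during training, we must obtain expressions for the partial derivatives of the error function with respect to these two wavelet parameters. The usual form of the backpropagation algorithm is:

$$\frac{\partial E}{\partial \omega_{i,j}} = \frac{\partial E}{\partial y_j} \frac{\partial y_j}{\partial \omega_{i,j}} \qquad (10)$$

The term $\partial E / \partial y_j$, often referred to as $\delta_j$, is the standard backpropagation of error term, which may be found in the usual way for the case of the wavelet nodes. The partial derivative $\partial y_j / \partial \omega_{i,j}$ must be substituted with the partial derivatives of the node output $y_j$ with respect to the wavelet parameters. For a given mother wavelet $\psi(x)$, consider the output of the wavelet node, given in Equation (4). Taking partial derivatives with respect to the translation and dilation yields:

$$\frac{\partial y_j}{\partial \tau_j} = -\xi_j^{-3/2} \sum_{i=1}^{N} x_i\, \psi'\left(\frac{i - \tau_j}{\xi_j}\right), \qquad \frac{\partial y_j}{\partial \xi_j} = -\sum_{i=1}^{N} x_i \left[ \frac{1}{2}\, \xi_j^{-3/2}\, \psi\left(\frac{i - \tau_j}{\xi_j}\right) + (i - \tau_j)\, \xi_j^{-5/2}\, \psi'\left(\frac{i - \tau_j}{\xi_j}\right) \right] \qquad (11)$$

Orthogonalisation of the Wavelet Nodes

A potential problem with using wavelet nodes is that duplication in the parameters of some of the wavelet nodes may occur. One way of avoiding this type of duplication would be to apply a soft constraint of orthogonality on the wavelets of the hidden layer. This could be done through the addition of the error function

$$E_\psi^W = \sum_{i=1}^{K} \sum_{j \neq i} \left\langle \psi_{\tau_i,\xi_i}, \psi_{\tau_j,\xi_j} \right\rangle^2 \qquad (12)$$

where $\langle \cdot, \cdot \rangle$ denotes the projection

$$\langle f, g \rangle = \sum_{i} f(i)\, g(i) \qquad (13)$$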
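The derivatives of a wavelet node output with respect to its shift and dilation can be checked numerically. The sketch below assumes the Mexican Hat mother wavelet and a child wavelet of the form $\xi^{-1/2}\psi((t-\tau)/\xi)$, with inputs indexed from 1; the function names are illustrative only. It computes the analytic derivatives obtained by the chain rule and compares them with central finite differences.

```python
import math

NORM = (2.0 / math.sqrt(3.0)) * math.pi ** -0.25


def psi(t):
    """Mexican Hat mother wavelet."""
    return NORM * (1.0 - t * t) * math.exp(-0.5 * t * t)


def dpsi(t):
    """First derivative of the Mexican Hat wavelet with respect to t."""
    return NORM * (t ** 3 - 3.0 * t) * math.exp(-0.5 * t * t)


def node_output(x, shift, dilation):
    """Output of one wavelet node: projection of x onto the child wavelet."""
    total = sum(x_i * psi((i - shift) / dilation) for i, x_i in enumerate(x, start=1))
    return total / math.sqrt(dilation)


def node_gradients(x, shift, dilation):
    """Analytic derivatives of the node output with respect to shift and dilation."""
    d_shift = 0.0
    d_dilation = 0.0
    for i, x_i in enumerate(x, start=1):
        u = (i - shift) / dilation
        d_shift -= x_i * dilation ** -1.5 * dpsi(u)
        d_dilation -= x_i * (0.5 * dilation ** -1.5 * psi(u)
                             + (i - shift) * dilation ** -2.5 * dpsi(u))
    return d_shift, d_dilation


if __name__ == "__main__":
    x = [math.sin(0.3 * k) for k in range(1, 41)]
    h = 1e-6
    analytic = node_gradients(x, 20.5, 4.0)
    numeric_shift = (node_output(x, 20.5 + h, 4.0) - node_output(x, 20.5 - h, 4.0)) / (2 * h)
    numeric_dil = (node_output(x, 20.5, 4.0 + h) - node_output(x, 20.5, 4.0 - h)) / (2 * h)
    print(analytic, (numeric_shift, numeric_dil))
```

During training, each of these derivatives would be multiplied by the back-propagated error term for the node to give the gradient of the error with respect to that wavelet parameter.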

In the previous section, backprop was derived in terms of the unregularised sum of squares data error term, $E_D$. We now add in an additional term for the orthogonality constraint to yield a combined error function M(W), given by

$$M(W) = \alpha E_D + \gamma E_\psi^W \qquad (14)$$

Weight and Node Elimination

A number of techniques have been suggested in the literature for node and/or weight elimination in neural networks. We shall adopt the technique proposed by Williams [14, 2, 3] and use a Laplacian prior as a natural method of eliminating redundant nodes. The Laplacian Prior on the weights implies an additional term in the error function, i.e.

$$M(W) = \alpha E_D + \gamma E_\psi^W + \beta E_W \qquad (15)$$

where $E_W$ is defined as

$$E_W = \sum_{i,j} |\omega_{i,j}| \qquad (16)$$

A consequence of this prior is that during training, weights are forced to adopt one of two positions. A weight can either adopt equal data error sensitivity as all the other weights or is forced to zero. This leads to skeletonisation of a network. During this process, weights, hidden nodes or input components may be removed from the architecture. As the weights emerging from redundant wavelet nodes will have negligible data error sensitivity, this will cause them to be eliminated.
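The regularised objective can be illustrated with a short sketch. It assumes the Mexican Hat child wavelets sampled at input positions 1 to N, an orthogonality penalty equal to the sum of squared inner products between distinct wavelets, and a Laplacian term equal to the sum of absolute weights. The squared form of the penalty and all function names are assumptions made for this example.

```python
import math


def mexican_hat(t):
    """Mexican Hat mother wavelet."""
    return (2.0 / math.sqrt(3.0)) * math.pi ** -0.25 * (1.0 - t * t) * math.exp(-0.5 * t * t)


def sampled_child(shift, dilation, n_inputs):
    """Child wavelet evaluated at the input positions 1..n_inputs."""
    return [mexican_hat((i - shift) / dilation) / math.sqrt(dilation)
            for i in range(1, n_inputs + 1)]


def inner_product(f, g):
    """Discrete projection of one sampled function onto another."""
    return sum(a * b for a, b in zip(f, g))


def orthogonality_penalty(shifts, dilations, n_inputs):
    """Sum of squared inner products over all pairs of distinct wavelet nodes."""
    children = [sampled_child(s, d, n_inputs) for s, d in zip(shifts, dilations)]
    total = 0.0
    for i, f in enumerate(children):
        for j, g in enumerate(children):
            if i != j:
                total += inner_product(f, g) ** 2
    return total


def laplacian_penalty(weights):
    """Sum of absolute values of all weights (weights given as a list of rows)."""
    return sum(abs(w) for row in weights for w in row)


def combined_error(data_error, orth_error, weight_error, alpha, gamma, beta):
    """Combined objective: weighted sum of data, orthogonality and weight terms."""
    return alpha * data_error + gamma * orth_error + beta * weight_error


if __name__ == "__main__":
    duplicated = orthogonality_penalty([10.0, 10.0], [2.0, 2.0], 40)
    separated = orthogonality_penalty([10.0, 30.0], [2.0, 2.0], 40)
    print(duplicated, separated)
```

Two wavelet nodes with identical parameters incur a large penalty, while well separated wavelets incur almost none, which is the behaviour the soft constraint is meant to encourage.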