Reinforcement Learning for Trading Systems and Portfolios
|
|
|
- Elijah Spencer
- 10 years ago
- Views:
Transcription
1 Reinforcement Learning for rading Systems and Portfolios John Moody and Matthew Saffell* Oregon Graduate Institute, CSE Dept. P.O. Box 91000, Portland, OR {moody, Abstract We propose to train trading systems by optimizing financial objective functions via reinforcement learning. he performance functions that we consider as value functions are profit or wealth, the Sharpe ratio and our recently proposed differential Sharpe ratio for online learning. In Moody & Wu (1997), we presented empirical results in controlled experiments that demonstrated the advantages of reinforcement learning relative to supervised learning. Here we extend our previous work to compare Q-Learning to a reinforcement learning technique based on real-time recurrent learning (RRL) that maximizes immediate reward. Our simulation results include a spectacular demonstration of the presence of predictability in the monthly Standard and Poors 500 stock index for the 25 year period 1970 through Our reinforcement trader achieves a simulated out-of-sample profit of over 4000% for this period, compared to the return for a buy and hold strategy of about 1300% (with dividends reinvested). his superior result is achieved with substantially lower risk. Introduction: Reinforcement Learning for rading he investor s or trader s ultimate goal is to optimize some relevant measure of trading system performance, such as profit, economic utility or risk-adjusted return. In this paper, we propose to use reinforcement learning to directly optimize such trading system performance functions, and we compare two different reinforcement learning methods. he first uses immediate rewards to train the trading systems, while the second (Q-Learning (Watkins)) approximates discounted future rewards. hese methodologies can be applied to optimizing systems designed to trade a single security or to trade a portfolio of securities. In addition, we propose a novel value function for risk adjusted return suitable for online learning: the differential Sharpe ratio. * he authors are also with Nonlinear Prediction Systems, el: (503) Copyright (c) 1998, American Association for Artificial Intelligence ( All rights reserved. rading system profits depend upon sequences of interdependent decisions, and are thus path-dependent. Optimal trading decisions when the effects of transactions costs, market impact and taxes are included require knowledge of the current system state. Reinforcement learning provides a more elegant means for training trading systems when state-dependent transaction costs are included, than do more standard supervised approaches (Moody, Wu, Liao & Saffell). he reinforcement learning algorithms used here include maximizing immediate reward and Q-Learning (Watkins). hough much theoretical progress has been made in recent years in the area of reinforcement learning, there have been relatively few successful, practical applications of the techniques. Notable examples include Neuro-gammon (esauro), the asset trader of Neuneier (1996), an elevator scheduler (Crites & Barto) and space-shuttle payload scheduler (Zhang & Dietterich). In this paper we present results for reinforcement learning trading systems that outperform the S&P 500 Stock Index over a 25-year test period, thus demonstrating the presence of predictable structure in US stock prices. Structure of rading Systems and Portfolios raders: Single Asset with Discrete Position Size In this section, we consider performance functions for systems that trade a single security with price series zt. he trader is assumed to take only long, neutral or short positions Ft E {-1, 0, 1} of constant magnitude. he constant magnitude assumption can be easily relaxed to enable better risk control. he position Ft is established or maintained at the end of each time interval t, and is re-assessed at the end of period t + 1. A trade is thus possible at the end of each time period, although nonzero trading costs will discourage excessive trading. A trading system return Rt is remized at the end of the time interval (t - 1, t] and includes the profit or loss resulting from the position Ft-1 held during that interval and any transaction cost incurred at time t due to a difference in the positions Ft-1 and Ft. In order to properly incorporate the effects of KDD
2 transactions costs, market impact and taxes ill a trader s decision making, the trader must have internal state information and must therefore be recurrent. An example of a single asset trading system that could take into account transactions costs and market impact would be one with tile following decision function: Ft = F(Ot;Ft-x,It) with It = {Zt, Zt-l, Zt-2,... ; Yt, Yt-1, Yt-2," } where Ot denotes the (learned) system parameters at time t and h denotes the information set at time t, which includes present and past values of the price series zt and an arbitrary number of other external variables denoted Yr. Portfolios: Continuous Quantities of Multiple Assets For trading multiple assets in general (typically including a risk-free instrument), a multiple output trading system is required. Denotiug a set of m markets with price series {{z~ } : a = 1,...,m}, the market return r~ for price series z~ for the period ending at time t is defined as ((za/~,a ~t t/-t-1) - 1). Defining portfolio weights of the a th asset as Fa0, a trader that takes only long positions must have portfolio weights that satisfy: F ~ _> 0 and m 1 Ea=l fa = One approach to imposing the constraints on the portfolio weights without requiring that a constrained optimization be performed is to use a trading system that has softmax outputs: Fa0 = {exp[d()]}/{~]~_-t exp[ff()]} for a = 1,...,m. Here, the fa0 could be linear or more complex functions of the inputs, such as a two layer neural network with sigmoidal internal units and linear outputs. Such a trading system call be optimized using unconstrained optimization methods. Denoting the sets of raw and normalized outputs collectively as vectors f0 and F0 respectively, a recursive trader will have structure Ft = softmax { ft ( Ot-1 ;Ft-,, It)}. Financial Performance Functions Profit and Wealth for raders and Portfolios rading systems can be optimized by maximizing performance functions U 0 such as profit, wealth, utility functions of wealth or performance ratios like the Sharpe ratio. he simplest and most natural performance function for a risk-insensitive trader is profit. We consider two cases: additive and multiplicative profits. ile transactions cost rate is denoted ~. Additive profits are appropriate to consider if each trade is for a fixed number of shares or contracts of security zt. his is often the case, for example, when trading small futures accounts or when trading standard US$ FX contracts ill dollar-denominated foreign currencies. With tile definitions r t = zt -- zt-1 and r[ l = z[ - z[_ for the price returns of a risky (traded) asset and a riskfree asset (like -Bills) respectively, the additive profit accunmlated over time periods with trading position size p > 0 is then defined as: P = P E R, (1) with P0 = 0 and typically F = Fo = 0. Equation (2) holds for continuous quantities also. he wealth defined as W = Wo + P. Multiplicative profits are appropriate when a fixed fraction of accumulated wealth v > 0 is invested in each long or short trade. Here, rt = (zt/zt-1-1) and, I t = (zl/zit_ 1-1). If no short sales are allowed and the leverage factor is set fixed at u = 1, the wealth at time is: wr = Wo {l+r,} (2) = WoII(l+(1-Ft-,)rIt+Ft-lrt) (1 - alf, - F,-ll). When multiple assets are considered, tile effective portfolio weightings change with each time step due to price movements. hus, maintaining constant or desired portfolio weights requires that adjustlnents in positions be made at each time step. he wealth after periods for a portfolio trading system is,=l,=l o:, , (3) W = Wol-[{I+Rt}=WoU F~~]z~ "~ 1-6 IF: -, a=l where Ft a is the effective portfolio ~ weight of asset a before readjusting, defined as Ft a { F a t-l( Zt/ a Z a t-l)}/ {~=l Fbt-ltlzblzbtl t-ljj~l. In (4), the first factor in the curly brackets is the increase in wealth over tile time interval t prior to rebalancing to achieve the newly specified weights Ff. he second factor is the reduction in wealth due to the rebalancing costs. Risk Adjusted Return: he Sharpe and Differential Sharpe Ratios Rather than maximizing profits, most modern fund managers attempt to maximize risk-adjusted return as advocated by Modern Portfolio heory. he Sharpe ratio is tile most widely-used measure of risk-adjusted return (Sharpe). Denoting as before the trading system returns for period t (including transactions costs) as Rt, tile Sharpe ratio is defined to be Average(Rt) S = Standard Deviation(Rt) (4) 280 Moody
3 where the average and standard deviation are estimated for periods t = {1,..., }. Proper on-line learning requires that we compute the influence on the Sharpe ratio of the return at time t. o accomplish this, we have derived a new objective function called the differential Sharpe ratio for on-line optimization of trading system performance (Moody, Wu, Liao & Saffell). It is obtained by considering exponential moving averages of the returns and standard deviation of returns in (4), and expanding to first order in the decay rate.: St.~ St-1 +.aa-~lo=o + 0(. 2) Noting that only the first order term in this expansion depends upon the return Rt at time t, we define the differential Sharpe ratio as: dst Bt-IAAt - 1At-IABt Ot =- d~ -- (Bt-, - A2t_,) 3/2 (5) where the quantities At and Bt are exponential moving estimates of the first and second moments of Rt: At =- At-1-1-.AAt = At-1 -t-.(rt - At-l) Bt = Bt-1 +.ABe = Bt-1 --b.(r2t -- Bt-1).(6) reating At-1 and Be-1 as numerical constants, note that, in the update equations controls the magnitude of the influence of the return Rt on the Sharpe ratio St. Hence, the differential Sharpe ratio represents the influence of the return Rt realized at time t on St. Reinforcement Learning for rading Systems and Portfolios he goal in using reinforcement learning to adjust the parameters of a system is to maximize the expected payoff or reward that is generated due to the actions of the system. his is accomplished through trial and error exploration of the environment. he system receives a reinforcement signal from its environment (a reward) that provides information on whether its actions are good or bad. he performance functions that we consider are functions of profit or wealth U(W) after a sequence of time steps, or more generally of the whole time sequence of trades U(W1, W2,..., W) as is the case for a path-dependent performance function like the Sharpe ratio. In either case, the performance function at time can be expressed as a function of the sequence of trading returns U(R1, R2,..., -R). We denote this by U in the rest of this section. Maximizing Immediate Utility Given a trading system model Ft(O), the goal is to adjust the parameters 0 in order to maximize U. his maximization for a complete sequence of trades can be done off-line using dynamic programming or batch versions of recurrent reinforcement learning algorithms. Here we do the optimization on-line using a standard reinforcement learning technique. his reinforcement learning algorithm is based on stochastic gradient ascent. he gradient of U with respect to the parameters 0 of the system after a sequence of trades is du(o) ~ du { drt dft drt dft-l ~ do -- -~t dft do- + dft-1 -~ J (7) he above expression as written with scalar Fi applies to the traders of a single risky asset, but can be trivially generalized to the vector case for portfolios. he system can be optimized in batch mode by repeatedly computing the value of U on forward passes through the data and adjusting the trading system parameters by using gradient ascent (with learning rate p) AO = pdu(o)/do or some other optimization method. A simple on-line stochastic optimization can be obtained by considering only the term in (7) that depends on the most recently realized return Rt during a forward pass through the data: dut(o) dut { dr ~t dft drt dft_, } d7 - drt dft do + dft_~ d~ he parameters are then updated on-line using AOt = pdut (Or)/dOt. Such an algorithm performs a stochastic optimization (since the system parameters Ot are varied during each forward pass through the training data), and is an example of immediate reward reinforcement learning. his approach is described in (Moody, Wu, Liao & Saffell) along with extensive simulation results. Q-Learning Besides explicitly training a trader to take actions, we can also implicitly learn correct actions through the technique of value iteration. In value iteration, an estimate of future costs or rewards is made for a given state, and the action is chosen that minimizes future costs or maximizes future rewards. Here we consider the specific technique named Q-Learning (Watkins), which estimates future rewards based on the current state and the current action taken. We can write the Q-function version of Bellman s equation as Q*(=,a)=, y=0 (9) where there are n states in the system and p=~(a) is the probability of transitioning from state x to state y given action a. he advantage of using the Q-function is that there is no need to know the system model p=y(a) in order to choose the best action. One simply calculates the best action as a* = argmaxa(q*(x, a)). he update rule for training a function approximator is Q(~, a) = Q(x, a) + p(u(z, a) + 3 maxb q*(y, b)), where p is a learning rate. Empirical Results Long/Short rader of a Single Security We have tested techniques for optimizing both profit and the Sharpe ratio in a variety of settings. We present (8) KDD
4 _i_ Figure 1: Boxplot for ensembles of 100 experiments comparing the performance of the "RRL" and "Qtrader" trading systeins on artificial price series. "Qtrader" outperforms the "RRL" system in terms of both (a) profit and (b) risk-adjusted returns (Sharpe ratio). he differences are statistically significant. results here on artificial data comparing the performance of a trading system, "RRL", trained using real time recurrent learning with the performance of a trading system, "Qtrader", implemented using Q-Learning. We generate log price series of length 10,000 as random walks with autoregressive trend processes. hese series are trending on short time scales and have a high level of noise. he results of our simulations indicate that the Q-Learning trading system outperforms the "RRL" trader that is trained to maximize immediate reward. he "RP~L" system is initialized randomly at the beginning, trained for a while using labelled data to initialize the weights, and then adapted using real-time recurrent learning to optimize the differential Sharpe ratio (5). he neural net that is used in the "Qtrader" system starts from random initial weights. It is trained repeatedly on the first 1000 data points with the discount parameter 7 set to 0 to allow the network to learn immediate reward. hen 7 is set equal to 0.9 and training continues with the second thousand data points being used as a validation set. he value function used here is profit. Figure 1 shows box plots summarizing test performances for ensembles of 100 experiments. In these simulations, the data are partitioned into a training set consisting of the first 2,000 samples and a test set containing the last 8,000 samples. Each trial has different realizations of the artificial price process and different random initial parameter values. he transaction costs are set at 0.2%, and we observe the cumulative profit and Sharpe ratio over the test data set. We find that the "Qtrader" system, which looks at future profits, significantly outperforms the "RRL" system which looks to maximize immediate rewards because it is effectively looking farther into the future when making decisions. S&:P 500 / Bill Asset Allocation System Long/Short Asset Allocation System and Data A long/short trading system is trained on monthly SgzP 500 stock index and 3-month Bill data to maximize! the differential Sharpe ratio. he S&P 500 target series is the total return index computed by reinvesting dividends. he 84 input series used in the trading systems include both financial and macroeconomic data. All data are obtained from Citibase, and the maeroeconomic series are lagged by one month to reflect reporting delays. A total of 45 years of monthly data are used, from January 1950 through December he first 20 years of data are used only for the initial training of the system. he test period is the 25 year period from January 1970 through December ile experimental results for the 25 year test period are true ex ante simulated trading results. For each year dnring 1970 through 1994, the system is trained on a moving window of the previous 20 years of data. For 1970, the system is initialized with random parameters. For the 24 subsequent years, the previously learned parameters are used to initialize the training. In this way, the system is able to adapt to changing market and economic conditions. Within the moving training window, the "RRL" systems use the first 10 years for stochastic optimization of system parameters, and the subsequent 10 years for validating early stopping of training. he networks are linear, and are regularized using quadratic weight decay during training with a regularization parameter of he "Qtrader" systems use a bootstrap sample of the 20 year training window for training, and the final 10 years of the training window are used for validating early stopping of training. he networks are two-layer feedforward networks with 30 tanh units in the hidden layer. Experimental Results he left panel in Figure 2 shows box plots summarizing the test performance for the full 25 year test period of the trading systems with various realizations of the initial system parameters over 30 trials for the "RRL" system, and 10 trials for the "Qtrader" system 1. he transaction cost is set at 0.5%. Profits are reinvested during trading, and multiplicative profits are used when calculating the wealth. he notches in the box plots indicate robust estimates of the 95% confidence intervals on the hypothesis that the median is equal to the performance of the buy and hold strategy. he horizontal lines show the performance of the "RRL" voting, "Qtrader" voting and buy and hold strategies for the same test period. he annualized monthly Sharpe ratios of the buy and hold strategy, the "Qtrader" voting strategy and the "RRL" voting strategy are 0.34, 0.63 and 0.83 respectively. he Sharpe ratios calculated here are for the excess returns of the strategies over the 3-month treasury bill rate. Figure 2 shows results for following the strategy of taking positions based on a majority vote of the ensembles of trading systems compared with the buy and hold strategy. We can see that the trading systems go short the S&P 500 during critical periods, such as the oil price Ien trials were done for the "Qtrader" system due to the amount of computation required in training the systems 282 Moody
5 ,d! lo ii I i I i -.1~.1 i. -tz - i I"-Il~t~lP.i -1~ i i i i~l ~ i i f iii i Iii I i, t i h Ii i ii i "[... :, i ii t i fll I ii!, _~ : ~,,~,,,,,,,, I~ II I_l_ll I ~L, tl IL_I ~j I_ 197"0 I I Figure 2: est results for ensembles of simulations using the S~:P 500 stock index and 3-month reasury Bill data over the time period. he figure shows the equity curves associated with the voting systems and the buy and hold strategy, as well as the voting trading signals produced by the systems. he solid curves correspond to the "Rlq.L" voting system performance, dashed curves to the "Qtrader" voting system and the dashed and dotted curves indicate the buy and hold performance. Initial equity is set to 1, and transaction costs are set at 0.5%. In both simulations, the traders avoid the dramatic losses that the buy and hold strategy incurred during In addition, the "R.RL" trader makes money during the crash of 1987, while the "Qtrader" system avoids the large losses associated with the buy and hold strategy during the same period. shock of 1974, the tight money (high interest rate) periods of the early 1980 s, the market correction of 1984 and the 1987 crash. his ability to take advantage of high treasury bill rates or to avoid periods of substantial stock market loss is the major factor in the long term success of these trading models. One exception is that the "I~RL" trading system remains long during the 1991 stock market correction associated with the Persian Gulf war, though the "Qtrader" system does identify the correction. On the whole though, the "Qtrader" system trades much more frequently than the "RRL" system, and in the end does not perform as well on this data set. From these results we find that both trading systems outperform the buy and hold strategy, as measured by both accumulated wealth and Sharpe ratio. hese differences are statistically significant and support the proposition that there is predictability in the U.S. stock and treasury bill markets during the 25 year period 1970 through A more detailed presentation of the "R.RL" results is presented in (Moody, Wu, Liao & Saffell). Conclusions and Extensions In this paper, we have trained trading systems via reinforcement learning to optimize financial objective functions including our recently proposed differential Sharpe ratio for online learning. We have also provided simulation results that demonstrate the presence of predictability in the monthly SSzP 500 Stock Index for the 25 year period 1970 through We have previously shown with extensive simulation results (Moody, Wu, Liao L; Saffell) that the "RRL" trading system significantly outperforms systems trained using supervised methods for traders of both single securities and portfolios. he superiority of reinforcement learning over supervised learning is most striking when state-dependent transaction costs are taken into account. Here we show that the Q-Learning approach can significantly improve on the "lq.rl" method when trading single securities, as it does for our artificial data set. However, the "Qtrader" system does not perform as well as the "Rlq.L" system on the S~zP 500 / Bill asset allocation problem, possibly due to its more frequent trading. his effect deserves further exploration. Acknowledgements We acknowledge support for this work from Nonlinear Prediction Systems and from DARPA under contract DAAH01-96-C-R026 and AASER grant DAAH References Crites, R. H. & Barto, A. G. (1996), Improving elevator performance using reinforcement learning, in D. S. ouretzky, M. C. Mozer & M. E. Hasselmo, eds, Advances in NIPS, Vol. 8, pp Moody, J. & Wu, L. (1997), Optimization of trading systems and portfolios, in Y. Abu-Mostafa, A. N. Refenes & A. S. Weigend, eds, Neural Networks in the Capital Markets, World Scientific, London. Moody, J., Wu, L., Liao, Y. & Saffell, M. (1998), Performance functions and reinforcement learning for trading systems and portfolios, Journal of Forecasting 17. o appear. Neuneier, R. (1996), Optimal asset allocation using adaptive dynamic programming, in D. S. ouretzky, M. C. Mozer & M. E. Hasselmo, eds, Advances in NIPS, Vol. 8, pp Sharpe, W. F. (1966), Mutual fund performance, Journal o] Business pp esauro, G. (1989), Neurogammon wins the computer olympiad, Neural Computation 1, Watkins, C. J. C. H. (1989), Learning with Delayed Rewards, PhD thesis, Cambridge University, Psychology Department. Zhang, W. & Dietterich,. G. (1996), High-performance job-shop schedttling with a time-delay td(a) network, in D. S. ouretzky, M. C. Mozer & M. E. Hasselmo, eds, Advances in NIPS, Vol. 8, pp KDD
Reinforcement Learning for Trading
Reinforcement Learning for rading John Moody and Matthew Saffell* Oregon Graduate Institute CSE Dept. P.O. Box 91000 Portland OR 97291-1000 {moody saffell }@cse.ogi.edu Abstract We propose to train trading
Learning to Trade via Direct Reinforcement
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 12, NO. 4, JULY 2001 875 Learning to Trade via Direct Reinforcement John Moody and Matthew Saffell Abstract We present methods for optimizing portfolios, asset
2 Reinforcement learning architecture
Dynamic portfolio management with transaction costs Alberto Suárez Computer Science Department Universidad Autónoma de Madrid 2849, Madrid (Spain) [email protected] John Moody, Matthew Saffell International
Optimization of technical trading strategies and the profitability in security markets
Economics Letters 59 (1998) 249 254 Optimization of technical trading strategies and the profitability in security markets Ramazan Gençay 1, * University of Windsor, Department of Economics, 401 Sunset,
Tikesh Ramtohul WWZ, University of Basel 19 th June 2009 (COMISEF workshop, Birkbeck College, London)
Tikesh Ramtohul WWZ, University of Basel 19 th June 2009 (COMISEF workshop, Birkbeck College, London) T.Ramtohul : An Automated Trading System based on Recurrent Reinforcement Learning, Birkbeck College,
2 Forecasting by Error Correction Neural Networks
Keywords: portfolio management, financial forecasting, recurrent neural networks. Active Portfolio-Management based on Error Correction Neural Networks Hans Georg Zimmermann, Ralph Neuneier and Ralph Grothmann
A Sarsa based Autonomous Stock Trading Agent
A Sarsa based Autonomous Stock Trading Agent Achal Augustine The University of Texas at Austin Department of Computer Science Austin, TX 78712 USA [email protected] Abstract This paper describes an autonomous
Intraday FX Trading: An Evolutionary Reinforcement Learning Approach
1 Civitas Foundation Finance Seminar Princeton University, 20 November 2002 Intraday FX Trading: An Evolutionary Reinforcement Learning Approach M A H Dempster Centre for Financial Research Judge Institute
Pattern Recognition and Prediction in Equity Market
Pattern Recognition and Prediction in Equity Market Lang Lang, Kai Wang 1. Introduction In finance, technical analysis is a security analysis discipline used for forecasting the direction of prices through
Design of an FX trading system using Adaptive Reinforcement Learning
University Finance Seminar 17 March 2006 Design of an FX trading system using Adaptive Reinforcement Learning M A H Dempster Centre for Financial Research Judge Institute of Management University of &
Comparing Neural Networks and ARMA Models in Artificial Stock Market
Comparing Neural Networks and ARMA Models in Artificial Stock Market Jiří Krtek Academy of Sciences of the Czech Republic, Institute of Information Theory and Automation. e-mail: [email protected]
Jae Won idee. School of Computer Science and Engineering Sungshin Women's University Seoul, 136-742, South Korea
STOCK PRICE PREDICTION USING REINFORCEMENT LEARNING Jae Won idee School of Computer Science and Engineering Sungshin Women's University Seoul, 136-742, South Korea ABSTRACT Recently, numerous investigations
A Multiagent Approach to Q-Learning for Daily Stock Trading Jae Won Lee, Jonghun Park, Member, IEEE, Jangmin O, Jongwoo Lee, and Euyseok Hong
864 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART A: SYSTEMS AND HUMANS, VOL. 37, NO. 6, NOVEMBER 2007 A Multiagent Approach to Q-Learning for Daily Stock Trading Jae Won Lee, Jonghun Park, Member,
On the Efficiency of Competitive Stock Markets Where Traders Have Diverse Information
Finance 400 A. Penati - G. Pennacchi Notes on On the Efficiency of Competitive Stock Markets Where Traders Have Diverse Information by Sanford Grossman This model shows how the heterogeneous information
Introduction to Machine Learning and Data Mining. Prof. Dr. Igor Trajkovski [email protected]
Introduction to Machine Learning and Data Mining Prof. Dr. Igor Trakovski [email protected] Neural Networks 2 Neural Networks Analogy to biological neural systems, the most robust learning systems
COMBINED NEURAL NETWORKS FOR TIME SERIES ANALYSIS
COMBINED NEURAL NETWORKS FOR TIME SERIES ANALYSIS Iris Ginzburg and David Horn School of Physics and Astronomy Raymond and Beverly Sackler Faculty of Exact Science Tel-Aviv University Tel-A viv 96678,
Classification of Bad Accounts in Credit Card Industry
Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition
Algorithmic Trading Session 1 Introduction. Oliver Steinki, CFA, FRM
Algorithmic Trading Session 1 Introduction Oliver Steinki, CFA, FRM Outline An Introduction to Algorithmic Trading Definition, Research Areas, Relevance and Applications General Trading Overview Goals
Portfolio Optimization
Portfolio Optimization Jaehyun Park Ahmed Bou-Rabee Stephen Boyd EE103 Stanford University November 22, 2014 Outline Single asset investment Portfolio investment Portfolio optimization Single asset investment
Chapter 1 INTRODUCTION. 1.1 Background
Chapter 1 INTRODUCTION 1.1 Background This thesis attempts to enhance the body of knowledge regarding quantitative equity (stocks) portfolio selection. A major step in quantitative management of investment
Using artificial intelligence for data reduction in mechanical engineering
Using artificial intelligence for data reduction in mechanical engineering L. Mdlazi 1, C.J. Stander 1, P.S. Heyns 1, T. Marwala 2 1 Dynamic Systems Group Department of Mechanical and Aeronautical Engineering,
Cost of Capital and Corporate Refinancing Strategy: Optimization of Costs and Risks *
Cost of Capital and Corporate Refinancing Strategy: Optimization of Costs and Risks * Garritt Conover Abstract This paper investigates the effects of a firm s refinancing policies on its cost of capital.
Lecture 12: The Black-Scholes Model Steven Skiena. http://www.cs.sunysb.edu/ skiena
Lecture 12: The Black-Scholes Model Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena The Black-Scholes-Merton Model
Monotonicity Hints. Abstract
Monotonicity Hints Joseph Sill Computation and Neural Systems program California Institute of Technology email: [email protected] Yaser S. Abu-Mostafa EE and CS Deptartments California Institute of Technology
Machine Learning in FX Carry Basket Prediction
Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines
Stock Option Pricing Using Bayes Filters
Stock Option Pricing Using Bayes Filters Lin Liao [email protected] Abstract When using Black-Scholes formula to price options, the key is the estimation of the stochastic return variance. In this
Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network
Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network Dušan Marček 1 Abstract Most models for the time series of stock prices have centered on autoregressive (AR)
CS 522 Computational Tools and Methods in Finance Robert Jarrow Lecture 1: Equity Options
CS 5 Computational Tools and Methods in Finance Robert Jarrow Lecture 1: Equity Options 1. Definitions Equity. The common stock of a corporation. Traded on organized exchanges (NYSE, AMEX, NASDAQ). A common
Inductive QoS Packet Scheduling for Adaptive Dynamic Networks
Inductive QoS Packet Scheduling for Adaptive Dynamic Networks Malika BOURENANE Dept of Computer Science University of Es-Senia Algeria [email protected] Abdelhamid MELLOUK LISSI Laboratory University
Living to 100: Survival to Advanced Ages: Insurance Industry Implication on Retirement Planning and the Secondary Market in Insurance
Living to 100: Survival to Advanced Ages: Insurance Industry Implication on Retirement Planning and the Secondary Market in Insurance Jay Vadiveloo, * Peng Zhou, Charles Vinsonhaler, and Sudath Ranasinghe
How To Understand The Value Of A Mutual Fund
FCS5510 Sample Homework Problems and Answer Key Unit03 CHAPTER 6. INVESTMENT COMPANIES: MUTUAL FUNDS PROBLEMS 1. What is the net asset value of an investment company with $10,000,000 in assets, $500,000
Threshold Autoregressive Models in Finance: A Comparative Approach
University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Informatics 2011 Threshold Autoregressive Models in Finance: A Comparative
An Implementation of Genetic Algorithms as a Basis for a Trading System on the Foreign Exchange Market
An Implementation of Genetic Algorithms as a Basis for a Trading System on the Foreign Exchange Market Andrei Hryshko School of Information Technology & Electrical Engineering, The University of Queensland,
NEURAL networks [5] are universal approximators [6]. It
Proceedings of the 2013 Federated Conference on Computer Science and Information Systems pp. 183 190 An Investment Strategy for the Stock Exchange Using Neural Networks Antoni Wysocki and Maciej Ławryńczuk
NEURAL NETWORKS AND REINFORCEMENT LEARNING. Abhijit Gosavi
NEURAL NETWORKS AND REINFORCEMENT LEARNING Abhijit Gosavi Department of Engineering Management and Systems Engineering Missouri University of Science and Technology Rolla, MO 65409 1 Outline A Quick Introduction
Answers to Concepts in Review
Answers to Concepts in Review 1. A portfolio is simply a collection of investments assembled to meet a common investment goal. An efficient portfolio is a portfolio offering the highest expected return
Dynamic Asset Allocation Using Stochastic Programming and Stochastic Dynamic Programming Techniques
Dynamic Asset Allocation Using Stochastic Programming and Stochastic Dynamic Programming Techniques Gerd Infanger Stanford University Winter 2011/2012 MS&E348/Infanger 1 Outline Motivation Background and
Computing Near Optimal Strategies for Stochastic Investment Planning Problems
Computing Near Optimal Strategies for Stochastic Investment Planning Problems Milos Hauskrecfat 1, Gopal Pandurangan 1,2 and Eli Upfal 1,2 Computer Science Department, Box 1910 Brown University Providence,
Valuation of Razorback Executive Stock Options: A Simulation Approach
Valuation of Razorback Executive Stock Options: A Simulation Approach Joe Cheung Charles J. Corrado Department of Accounting & Finance The University of Auckland Private Bag 92019 Auckland, New Zealand.
Algorithmic Trading Session 6 Trade Signal Generation IV Momentum Strategies. Oliver Steinki, CFA, FRM
Algorithmic Trading Session 6 Trade Signal Generation IV Momentum Strategies Oliver Steinki, CFA, FRM Outline Introduction What is Momentum? Tests to Discover Momentum Interday Momentum Strategies Intraday
Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies
Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies Drazen Pesjak Supervised by A.A. Tsvetkov 1, D. Posthuma 2 and S.A. Borovkova 3 MSc. Thesis Finance HONOURS TRACK Quantitative
Financial Development and Macroeconomic Stability
Financial Development and Macroeconomic Stability Vincenzo Quadrini University of Southern California Urban Jermann Wharton School of the University of Pennsylvania January 31, 2005 VERY PRELIMINARY AND
BINOMIAL OPTIONS PRICING MODEL. Mark Ioffe. Abstract
BINOMIAL OPTIONS PRICING MODEL Mark Ioffe Abstract Binomial option pricing model is a widespread numerical method of calculating price of American options. In terms of applied mathematics this is simple
B.3. Robustness: alternative betas estimation
Appendix B. Additional empirical results and robustness tests This Appendix contains additional empirical results and robustness tests. B.1. Sharpe ratios of beta-sorted portfolios Fig. B1 plots the Sharpe
The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network
, pp.67-76 http://dx.doi.org/10.14257/ijdta.2016.9.1.06 The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network Lihua Yang and Baolin Li* School of Economics and
Parallel Computing for Option Pricing Based on the Backward Stochastic Differential Equation
Parallel Computing for Option Pricing Based on the Backward Stochastic Differential Equation Ying Peng, Bin Gong, Hui Liu, and Yanxin Zhang School of Computer Science and Technology, Shandong University,
Option Pricing Theory and Applications. Aswath Damodaran
Option Pricing Theory and Applications Aswath Damodaran What is an option? An option provides the holder with the right to buy or sell a specified quantity of an underlying asset at a fixed price (called
From thesis to trading: a trend detection strategy
Caio Natividade Vivek Anand Daniel Brehon Kaifeng Chen Gursahib Narula Florent Robert Yiyi Wang From thesis to trading: a trend detection strategy DB Quantitative Strategy FX & Commodities March 2011 Deutsche
How to Time the Commodity Market
How to Time the Commodity Market Devraj Basu, Roel C.A. Oomen and Alexander Stremme June 2006 Abstract Over the past few years, commodity prices have experienced the biggest boom in half a century. In
Index tracking UNDER TRANSACTION COSTS:
MARKE REVIEWS Index tracking UNDER RANSACION COSS: rebalancing passive portfolios by Reinhold Hafner, Ansgar Puetz and Ralf Werner, RiskLab GmbH Portfolio managers must be able to estimate transaction
Using Markov Decision Processes to Solve a Portfolio Allocation Problem
Using Markov Decision Processes to Solve a Portfolio Allocation Problem Daniel Bookstaber April 26, 2005 Contents 1 Introduction 3 2 Defining the Model 4 2.1 The Stochastic Model for a Single Asset.........................
Neural Network Applications in Stock Market Predictions - A Methodology Analysis
Neural Network Applications in Stock Market Predictions - A Methodology Analysis Marijana Zekic, MS University of Josip Juraj Strossmayer in Osijek Faculty of Economics Osijek Gajev trg 7, 31000 Osijek
Channel Allocation in Cellular Telephone. Systems. Lab. for Info. and Decision Sciences. Cambridge, MA 02139. [email protected].
Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems Satinder Singh Department of Computer Science University of Colorado Boulder, CO 80309-0430 [email protected] Dimitri
Exam Introduction Mathematical Finance and Insurance
Exam Introduction Mathematical Finance and Insurance Date: January 8, 2013. Duration: 3 hours. This is a closed-book exam. The exam does not use scrap cards. Simple calculators are allowed. The questions
CHAPTER 5 PREDICTIVE MODELING STUDIES TO DETERMINE THE CONVEYING VELOCITY OF PARTS ON VIBRATORY FEEDER
93 CHAPTER 5 PREDICTIVE MODELING STUDIES TO DETERMINE THE CONVEYING VELOCITY OF PARTS ON VIBRATORY FEEDER 5.1 INTRODUCTION The development of an active trap based feeder for handling brakeliners was discussed
Sensex Realized Volatility Index
Sensex Realized Volatility Index Introduction: Volatility modelling has traditionally relied on complex econometric procedures in order to accommodate the inherent latent character of volatility. Realized
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
I.e., the return per dollar from investing in the shares from time 0 to time 1,
XVII. SECURITY PRICING AND SECURITY ANALYSIS IN AN EFFICIENT MARKET Consider the following somewhat simplified description of a typical analyst-investor's actions in making an investment decision. First,
Benchmarking Low-Volatility Strategies
Benchmarking Low-Volatility Strategies David Blitz* Head Quantitative Equity Research Robeco Asset Management Pim van Vliet, PhD** Portfolio Manager Quantitative Equity Robeco Asset Management forthcoming
Chapter 3 Fixed Income Securities
Chapter 3 Fixed Income Securities Road Map Part A Introduction to finance. Part B Valuation of assets, given discount rates. Fixed-income securities. Stocks. Real assets (capital budgeting). Part C Determination
Introduction, Forwards and Futures
Introduction, Forwards and Futures Liuren Wu Zicklin School of Business, Baruch College Fall, 2007 (Hull chapters: 1,2,3,5) Liuren Wu Introduction, Forwards & Futures Option Pricing, Fall, 2007 1 / 35
Automated Stock Trading and Portfolio Optimization Using XCS Trader and Technical Analysis. Anil Chauhan [email protected]
Automated Stock Trading and Portfolio Optimization Using XCS Trader and Technical Analysis Anil Chauhan [email protected] Master of Science Artificial Intelligence School of Informatics University
OPTIMAl PREMIUM CONTROl IN A NON-liFE INSURANCE BUSINESS
ONDERZOEKSRAPPORT NR 8904 OPTIMAl PREMIUM CONTROl IN A NON-liFE INSURANCE BUSINESS BY M. VANDEBROEK & J. DHAENE D/1989/2376/5 1 IN A OPTIMAl PREMIUM CONTROl NON-liFE INSURANCE BUSINESS By Martina Vandebroek
Test3. Pessimistic Most Likely Optimistic Total Revenues 30 50 65 Total Costs -25-20 -15
Test3 1. The market value of Charcoal Corporation's common stock is $20 million, and the market value of its riskfree debt is $5 million. The beta of the company's common stock is 1.25, and the market
VALUATION IN DERIVATIVES MARKETS
VALUATION IN DERIVATIVES MARKETS September 2005 Rawle Parris ABN AMRO Property Derivatives What is a Derivative? A contract that specifies the rights and obligations between two parties to receive or deliver
Betting with the Kelly Criterion
Betting with the Kelly Criterion Jane June 2, 2010 Contents 1 Introduction 2 2 Kelly Criterion 2 3 The Stock Market 3 4 Simulations 5 5 Conclusion 8 1 Page 2 of 9 1 Introduction Gambling in all forms,
Stock Market Forecasting Using Machine Learning Algorithms
Stock Market Forecasting Using Machine Learning Algorithms Shunrong Shen, Haomiao Jiang Department of Electrical Engineering Stanford University {conank,hjiang36}@stanford.edu Tongda Zhang Department of
THE LOW-VOLATILITY ANOMALY: Does It Work In Practice?
THE LOW-VOLATILITY ANOMALY: Does It Work In Practice? Glenn Tanner McCoy College of Business, Texas State University, San Marcos TX 78666 E-mail: [email protected] ABSTRACT This paper serves as both an
A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems
A Multi-agent Q-learning Framework for Optimizing Stock Trading Systems Jae Won Lee 1 and Jangmin O 2 1 School of Computer Science and Engineering, Sungshin Women s University, Seoul, Korea 136-742 [email protected]
Finanzdienstleistungen (Praxis) Algorithmic Trading
Finanzdienstleistungen (Praxis) Algorithmic Trading Definition A computer program (Software) A process for placing trade orders like Buy, Sell It follows a defined sequence of instructions At a speed and
Effective downside risk management
Effective downside risk management Aymeric Forest, Fund Manager, Multi-Asset Investments November 2012 Since 2008, the desire to avoid significant portfolio losses has, more than ever, been at the front
High Frequency Trading using Fuzzy Momentum Analysis
Proceedings of the World Congress on Engineering 2 Vol I WCE 2, June 3 - July 2, 2, London, U.K. High Frequency Trading using Fuzzy Momentum Analysis A. Kablan Member, IAENG, and W. L. Ng. Abstract High
Dynamic macro stress exercise including feedback effect
Dynamic macro stress exercise including feedback effect by Tokiko Shimizu * Institute for Monetary and Economic Studies Bank of Japan Abstract The goal of this study is to illustrate a viable way to explore
New Ensemble Combination Scheme
New Ensemble Combination Scheme Namhyoung Kim, Youngdoo Son, and Jaewook Lee, Member, IEEE Abstract Recently many statistical learning techniques are successfully developed and used in several areas However,
Duality in General Programs. Ryan Tibshirani Convex Optimization 10-725/36-725
Duality in General Programs Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: duality in linear programs Given c R n, A R m n, b R m, G R r n, h R r : min x R n c T x max u R m, v R r b T
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network
Chapter 2 The Research on Fault Diagnosis of Building Electrical System Based on RBF Neural Network Qian Wu, Yahui Wang, Long Zhang and Li Shen Abstract Building electrical system fault diagnosis is the
Financial Market Microstructure Theory
The Microstructure of Financial Markets, de Jong and Rindi (2009) Financial Market Microstructure Theory Based on de Jong and Rindi, Chapters 2 5 Frank de Jong Tilburg University 1 Determinants of the
Guaranteed Annuity Options
Guaranteed Annuity Options Hansjörg Furrer Market-consistent Actuarial Valuation ETH Zürich, Frühjahrssemester 2008 Guaranteed Annuity Options Contents A. Guaranteed Annuity Options B. Valuation and Risk
How can we discover stocks that will
Algorithmic Trading Strategy Based On Massive Data Mining Haoming Li, Zhijun Yang and Tianlun Li Stanford University Abstract We believe that there is useful information hiding behind the noisy and massive
Fixed Income Portfolio Management. Interest rate sensitivity, duration, and convexity
Fixed Income ortfolio Management Interest rate sensitivity, duration, and convexity assive bond portfolio management Active bond portfolio management Interest rate swaps 1 Interest rate sensitivity, duration,
When to Refinance Mortgage Loans in a Stochastic Interest Rate Environment
When to Refinance Mortgage Loans in a Stochastic Interest Rate Environment Siwei Gan, Jin Zheng, Xiaoxia Feng, and Dejun Xie Abstract Refinancing refers to the replacement of an existing debt obligation
Independent component ordering in ICA time series analysis
Neurocomputing 41 (2001) 145}152 Independent component ordering in ICA time series analysis Yiu-ming Cheung*, Lei Xu Department of Computer Science and Engineering, The Chinese University of Hong Kong,
The relation between news events and stock price jump: an analysis based on neural network
20th International Congress on Modelling and Simulation, Adelaide, Australia, 1 6 December 2013 www.mssanz.org.au/modsim2013 The relation between news events and stock price jump: an analysis based on
Eligibility Traces. Suggested reading: Contents: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998.
Eligibility Traces 0 Eligibility Traces Suggested reading: Chapter 7 in R. S. Sutton, A. G. Barto: Reinforcement Learning: An Introduction MIT Press, 1998. Eligibility Traces Eligibility Traces 1 Contents:
Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem
Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become
AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
How to time the stock market Using Artificial Neural Networks and Genetic Algorithms
How to time the stock market Using Artificial Neural Networks and Genetic Algorithms Donn S. Fishbein, MD, PhD Neuroquant.com ABSTRACT Despite considerable bad press that market timing has received in
Journal of Financial and Economic Practice
Analyzing Investment Data Using Conditional Probabilities: The Implications for Investment Forecasts, Stock Option Pricing, Risk Premia, and CAPM Beta Calculations By Richard R. Joss, Ph.D. Resource Actuary,
Lecture Notes: Basic Concepts in Option Pricing - The Black and Scholes Model
Brunel University Msc., EC5504, Financial Engineering Prof Menelaos Karanasos Lecture Notes: Basic Concepts in Option Pricing - The Black and Scholes Model Recall that the price of an option is equal to
Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
IBM SPSS Neural Networks 22
IBM SPSS Neural Networks 22 Note Before using this information and the product it supports, read the information in Notices on page 21. Product Information This edition applies to version 22, release 0,
SOME ASPECTS OF GAMBLING WITH THE KELLY CRITERION. School of Mathematical Sciences. Monash University, Clayton, Victoria, Australia 3168
SOME ASPECTS OF GAMBLING WITH THE KELLY CRITERION Ravi PHATARFOD School of Mathematical Sciences Monash University, Clayton, Victoria, Australia 3168 In this paper we consider the problem of gambling with
ECON 351: The Stock Market, the Theory of Rational Expectations, and the Efficient Market Hypothesis
ECON 351: The Stock Market, the Theory of Rational Expectations, and the Efficient Market Hypothesis Alejandro Riaño Penn State University June 8, 2008 Alejandro Riaño (Penn State University) ECON 351:
TD(0) Leads to Better Policies than Approximate Value Iteration
TD(0) Leads to Better Policies than Approximate Value Iteration Benjamin Van Roy Management Science and Engineering and Electrical Engineering Stanford University Stanford, CA 94305 [email protected] Abstract
