ALGORITHMIC TRADING USING MACHINE LEARNING TECH-
|
|
|
- Alban Briggs
- 10 years ago
- Views:
Transcription
1 ALGORITHMIC TRADING USING MACHINE LEARNING TECH- NIQUES: FINAL REPORT Chenxu Shao, Zheming Zheng Department of Management Science and Engineering December 12, 2013 ABSTRACT In this report, we present an automatic stock trading process, which relies on a hierarchy of a feature selecting method, multiple machine-learning algorithms as well as an online learning mechanism. Backward search was used in feature selection, while the local linear regression (LLR), l 1 regularized ν-support vector machine (ν-svm), and the multiple additive regression tree (MART), were chosen as the underlying algorithms. Our trading model is simplified from real life trading. One strength of our approach is that the model, regardless of its many simplified assumptions, is more sophisticated than many of the reported model which uses simple buy-and-hold strategies. In addition, applying the online learning mechanism greatly improves the prediction accuracy. The learning results are impressively robust, rendering our process promising candidates for real life algorithmic trading. 1 Introduction Nowadays investors and trading firms have been more aware of risks than any time in the past due to the non-stationary and chaotic stock markets under the impact of the 2008 financial crisis. Therefore, many firms now rely heavily on algorithmic trading in the stock markets, especially for high-frequency trading, because of the large amount of information and the importance of instantaneous decisions. It has always been a challenging task to predict stock prices or even their trends. Our project aims at predicting the short-term pricing trend of selected stocks and simulate the trading results with a simple strategy, to provide a key reference for improving algorithmic trading and better trading strategies. 1.1 Trading model In order to model the real life trading, we constructed a simplified trading model which is almost identical to the real life trading, with several simplifications: 1. only close price is considered (when selling/buying the stock, one would have to pay the bid/ask price, but they are relatively close to the close price) and trading actions happen each day before closing which is not realistic in real life. 2. The only transaction cost is the 0.3% close transaction fee 1 and income tax is not accounted. Despite these simplifications, the trading model is relatively realistic in that the actual transaction cost is considered compared to the models used in the literature 2;3; Concepts and terminologies We assume the trader has two kinds of assets: cash and stock shares. The present value (PV) of the trader s portfolio, is defined as the future amount of money that has been discounted to reflect its current value, as if it existed today. In this specific project, we assume the risk-free interest rate is zero 3, so the present value is calculated as: PV = Total Cash + Total Share Stock Price (1) The rate of return (ROR), is defined as the ratio of money gained or lost (whether realized or unrealized) on an investment relative to the amount of money invested. So, the total return of the portfolio (i.e., the growth rate of the money) is calculated as: Total ROR = PV last day PV first day PV first day (2) The daily return is just the daily growth rate of the money, and can be calculated from the following formula: Total ROR = (1 + Daily ROR) # of trading days 1 (3) 2 Methodology 2.1 Stock selection The dataset was downloaded from Bloomberg terminal with more than 20 indicators/features. The stock is almost randomly selected from S&P 500 index components, but we do include two basic criteria: stock price and volume. Higher stock price indicates that the stock may be a principal component of the index whereas high volume shows that the stock is traded actively. In this project, we selected and tested on 25 stocks from the S&P 500 index components, they are listed as follows: Table 1: List of selected S&P 500 index components (tickers only) AAPL AMZN AZO BAC BLK C CMG CSCO DAL EMC EXC GE GOOG INTC ISRG MA MS MSFT NVDA PCLN PFE SCHW T WFC WPO [email protected] [email protected] 3 Assuming the annual interest rate is 0.2%, then the daily interest rate is roughly which is negligible
2 2.2 Trading strategy The trading strategy is rather a simple one: knowing or believing that the stock price will grow tomorrow, one would long (buy) exactly one share of the stock, and one would sell all the holding shares and short (sell) an extra share only based on the knowledge or belief that the stock price will decrease tomorrow. Using this strategy in the simplified model, we managed to achieve a total return of 740.2% (for 731 days) for AAPL (APPLE INC.) corresponding to a daily return of %, provided that the future stock prices are known. The predetermined present value as a function of time, as well as a typical stock price time series, are shown in Fig. 1: Present value, labled (knowing future) tree (MART). MART was chosen because it uses data very efficiently and good results can be obtained with relatively small data sets. To determine which model(s) to use and if these models have more bias or variances so that we can tune the model(s), a learning analysis was performed. The learning curves of the four models are illustrated as follows: 1e 04 5e 04 5e 03 5e 02 5e 01 Learning curve of SVM Learning curve of LWLR Stock Price, AAPL Learning curve of Logistic Regression 1e 04 5e 04 5e 03 5e 02 5e 01 Learning curve of MART Figure 1: Top: PV v.s. time, Bottom: Stock price v.s. time 2.3 Feature construction The features mainly contain indicators acquired from the Bloomberg terminal. In addition, there are several indicators constructed with the data, such as Sharpe ratio, Treynor ratio, etc. A complete list of features are listed below: Table 2: List of features used in the model 10-day Volatility Quick ratio P/E ratio Net change EBITDA α β Risk premium Earning/share α for β Log ROR Benchmark ROR High price ROR Low Price ROR Volume ROR Williams.R% PVT Moving average / Price Sharpe ratio Treynor ratio Due to the limitation of space, we will not introduce all the indicators/features, but note the latter eleven features are calculated using the acquired data. The label is constructed with the closing price of the stock. If the closing price of the next day is greater than that of the present day, the label is set to 1, otherwise it is set to be -1 (for ν-svm) or 0 (for regressions). 2.4 Model selection In the beginning, we considered four kinds of models: ν- support vector machine (ν-svm), locally weighted linear regression (LWLR), logistic regression, and multiple additive regression Figure 2: Learning curves of the four models. The underlying stock here is AAPL. From the learning curves, it can be easily seen that ν-svm is less biased and clearly has a large variance, while LWLR, logistic regression and MART are biased. Since it is hard to construct more features but easy to reduce the size of features and change the size of training data, and considering the overall accuracy reflected from the learning curves, ν-svm and MART were selected as the underlying models. In addition, inspired by the idea of LLWR, a local linear regression (LLR) was also implemented, based on the belief that locally the stock price grows linear with time. 2.5 Feature selection The learning analysis rendered ν-svm having a relatively high variance. To tackle this issue, the feature selection was performed. As the amount of features (20) is considerably small, instead of using heavy machinery such as mutual information, a simple backward search was performed. For each of the stock, we leave one of the 20 features out and calculate the classification error. Then we have 20 classification errors corresponding to the absence of each of the 20 features. Next, We determine the 25% quantile of these errors and remove features without which the errors drop significantly, within the top 25%. Usually, 3-4 features are removed using this method. CS 229 FINAL REPORT 2
3 2.6 Online learning vs offline learning To implement the learning algorithms, we first compared the online learning mechanism versus the offline learning mechanism. In practice, the data of the stock from April 1, 2010 to December 31, 2010 were used as training data for offline learning and for the first prediction of online learning. Their accuracies and recalls as functions of initial training sizes are plotted as follows: The precision, recall and the accuracy for the LLR model are listed in Table.3. It can be seen that although the accuracy of LLR is not high, which ranges from 63% 70%, but it is quite robust, with a variance of In addition, the precision and recall remain the same level as the accuracy. Moreover, although its accuracy is not impressively high, the model maintains positive total returns. All these features render this model very stable and practical. Accuracy of online and offline learning Percentage Percentage Online learning Offline learning Online learning Offline learning Pre training size Recall of online and offline learning Pre training size Figure 3: Accuracy (Top) and Recall (Bottom) curves of the two learning mechanisms It can be seen from the above figure that the online learning mechanism has both higher accuracy and recall than the offline learning, which is reasonable: As the stock price is stochastic almost surely, the models need to be readjusted actively so that it catches the new trend/feature. It is also worth noting that for the online learning, when doing the prediction, we use the stock data at the opening time which is the beginning of a day and the prediction was performed for the closing time of a day. At many occasions, there may be correlations between the opening price and the closing price (which can be one of the reasons why online learning is much more accurate than offline learning), but generally this strategy is still of practical use as one can always trade between the opening time and closing time. Based on this result, online learning was then performed with the subsequent data from January 1, 2011 to November 9, On each step, the prediction was made for the current trading date and the trading actions were performed under the prediction, then the models were adjusted and retrained for the prediction for the next trading day. 3 Results and discussion 3.1 Local linear regression The local linear regression was trained with a subset of features with PVT and EBITDA removed due to the large absolute values, which may cause divergence. For each iteration, the model was trained with the data from the past few days (three in this case), and then the prediction is performed (referred as "local learning" in the later text). Table 3: Learning results of LLR model on 25 stocks Ticker Accuracy Precision Recall Total Return Total Return (label) AAPL 63.47% 64.75% 63.20% % % AMZN 68.95% 70.03% 67.57% % % AZO 67.44% 69.53% 68.81% 42.85% % BAC 67.85% 67.70% 66.76% 6.30% 19.94% BLK 67.17% 66.92% 70.51% 81.98% % C 65.25% 65.17% 64.09% 27.39% 76.95% CMG 68.67% 71.32% 70.05% % % CSCO 65.25% 64.54% 64.90% 5.27% 21.73% DAL 67.03% 68.12% 66.84% 8.72% 27.46% EMC 67.99% 67.14% 66.76% 12.46% 34.20% EXC 67.99% 66.30% 68.18% 11.40% 30.35% GE 68.54% 69.23% 69.60% 8.77% 20.25% GOOG 65.53% 67.40% 64.91% % % INTC 67.31% 68.22% 66.94% 5.33% 27.85% ISRG 67.03% 67.80% 65.40% % % MA 69.63% 72.45% 71.36% % % MS 65.80% 65.29% 65.65% 18.01% 40.69% MSFT 68.67% 68.84% 67.13% 4.56% 31.79% NVDA 65.39% 62.43% 65.89% 8.41% 32.33% PCLN 67.44% 69.25% 69.25% % % PFE 67.31% 68.29% 65.12% 1.52% 17.78% SCHW 71.00% 70.73% 71.51% 8.44% 25.57% T 69.77% 70.57% 73.32% 1.57% 22.89% WFC 67.44% 66.49% 69.06% 10.93% 35.34% WPO 67.03% 68.59% 70.18% % % The present value curses of AAPL using this model and predetermined label are plotted in Fig. 4. The present value grows with time, but is quite slow compared to the predetermined present value. In addition, it has many kinks, due to the many failures of predicting the trend of the future price. The total return of the trading strategy with this model is 176.2% for AAPL, corresponding to a daily return of %. which is much lower than the predetermined returns. In general, the total return with LLR model is always much lower than the predetermined returns as listed in Table 3. Despite the many issues, the LLR model managed to reach a positive returns for all 25 stocks for 731 days. And due to its simplicity ( easy to implement) and speed ( fast to train and predict), it can be a very powerful tool in high frequency trading. 3 CS 229 FINAL REPORT
4 Present value, local linear regression model Present value, SVM model Local LR model Labeled SVM model Labeled Figure 4: Left: PV of LLR, compared to the labelled PV, Right:PV of SVM, compared to the labelled PV. The underlying stock here is AAPL. 3.2 ν-support vector machine The l 1 regulated ν-svm (C = 1,ν 0.2, with Laplace Radial Basis Function (RBF) kernel) was trained with a subset of features using feature selection (e.g., in the case of AAPL, the four features: benchmark ROR, volatility, EBITDA and α for β were removed from the feature set). The precision, recall and the accuracy for the ν- SVM model are listed as follows: Table 4: Learning results of SVM model on 25 stocks Ticker Accuracy Precision Recall Total Return Tot. Ret. (label) AAPL 91.66% 92.66% 90.93% % % AMZN 90.56% 90.13% 91.35% % % AZO 92.34% 91.92% 93.81% % % BAC 95.49% 94.57% 96.40% 17.04% 19.94% BLK 92.75% 93.96% 91.69% % % C 90.83% 90.41% 91.16% 67.60% 76.95% CMG 89.88% 90.61% 90.61% % % CSCO 91.52% 92.07% 90.53% 20.17% 21.73% DAL 94.66% 94.67% 94.92% 26.30% 27.46% EMC 92.61% 92.63% 92.11% 33.15% 34.20% EXC 91.93% 91.98% 91.19% 29.49% 30.35% GE 94.80% 94.69% 95.20% 18.37% 20.25% GOOG 92.20% 92.15% 92.88% % % INTC 93.16% 93.05% 93.55% 26.89% 27.85% ISRG 93.30% 93.44% 93.19% % % MA 91.93% 92.06% 93.22% % % MS 93.30% 94.07% 92.24% 37.62% 40.69% MSFT 93.16% 92.62% 93.65% 29.77% 31.79% NVDA 92.61% 91.64% 92.71% 30.92% 32.33% PCLN 91.11% 91.93% 91.21% % % PFE 88.78% 87.80% 90.19% 14.72% 17.78% SCHW 93.71% 94.43% 92.88% 24.87% 25.57% T 92.34% 92.31% 93.26% 20.01% 22.89% WFC 92.75% 94.27% 90.88% 32.42% 35.34% WPO 91.66% 92.71% 91.52% % % Clearly, the ν-svm model has a very high accuracy, and good precision and recall as well. Compared to LLR model, the accuracy is significantly higher. As can be observed from the present value curve in Fig.4, the present value grows with time in a linear fashion, which is quite reasonable assuming the price change of the stock is nearly constant, thus with our trading strategy the amount one earns everyday is also nearly constant provided the prediction is correct. Although not obvious, it can be seen that the performance of the model is not very good near a sudden change of price trend (known as "change point", which is either a local optimum, or jump). The ν-svm model proved to be a very stable and powerful tool to predict stock price trend and together with the trading strategy, it managed to reach relatively high returns. It is worth noting that for different stocks the parameter ν and selected features are slightly different, and tweaking the parameters and performing ν-svm algorithm takes longer time than simply doing the linear regression. Hence, although ν-svm is more accurate, it is not as efficient as LLR. 3.3 Multiple additive regression tree The precision, recall and the accuracy for the MART model are listed in Table.5. It is observed that the accuracies and recalls fluctuate a lot, whereas the precisions remain a very high level. Although the averaged accuracy is 81%, its variance is , which is almost 10 times larger than that of LLR and ν-svm, indicating that the performance of this model varies greatly. Moreover, the total return of this model can even be significantly negative when the accuracy is actually high ( > 65%, see INTC and ISRG). This is because MART has a very poor recall, so that most of the time the trader is losing money by short selling the stock shares, which can cause significant loss when the trader is holding many shares in hand. The unstable performance and the negative returns of MART reveal its disadvantages: first, its outputs of regression can lie outside of the range [0,1] which would be classified as incorrect prediction; second, its extrapolation properties tend to be poor, which may cause the performance to be unstable; last, it is very sensitive to outliers, which further brings down the prediction accuracy. In addition to this, it takes many iterations for MART to converge, the total number of iterations required for convergence varies from Performing hundreds of iterations costs a relatively large amount of time which is another disadvantage of this method. Therefore, it is not only less stable compared to ν- SVM and LLR, but less efficient than the two models. CS 229 FINAL REPORT 4
5 Table 5: Learning results of MART model on 25 stocks Ticker Accuracy Precision Recall Total Return Tot. Ret. (label) AAPL 98.50% 99.73% 97.33% % % AMZN 92.34% 99.68% 85.14% % % AZO 98.63% 99.22% 98.20% % % BAC 53.08% % 4.99% 42.55% 19.94% BLK 69.90% 99.35% 41.29% % % C 72.50% % 44.48% 38.02% 76.95% CMG 99.73% 99.75% 99.75% % % CSCO 83.31% % 66.02% 8.54% 21.73% DAL 64.02% % 29.68% 28.61% 27.46% EMC 94.80% % 89.30% 30.05% 34.20% EXC 70.45% % 38.64% 22.34% 30.35% GE 85.23% % 71.20% 10.88% 20.25% GOOG 97.54% % 95.25% % % INTC 74.97% % 50.81% 2.62% 27.85% ISRG 65.39% % 31.06% % % MA 96.31% 99.73% 93.47% % % MS 60.19% % 19.39% 17.16% 40.69% MSFT 62.52% % 24.31% 21.99% 31.79% NVDA 81.26% % 60.06% 22.24% 32.33% PCLN 90.56% % 82.17% % % PFE 84.95% % 70.03% 7.82% 17.78% SCHW 68.95% % 37.81% 20.87% 25.57% T 99.32% % 98.70% 20.18% 22.89% WFC 60.47% % 20.17% 27.29% 35.34% WPO 98.77% % 97.69% % % The performances of the three methods are summarized as follows: Table 6: Performance of different models Model Avg. accuracy Avg. learning time Have negative returns LLR 67.40% 5-10s No ν-svm 92.36% 20-50s No MART 80.95% 1-5 min Yes Note this is only a rough estimation based on observations. As a result, ν-svm has the best overall performance. The time it used to learn and predict makes it best for daily trading. LLR turns out to be very stable and efficient, which can be used to assist high frequency trading. MART, which has a good average performance, is highly unstable and relatively slow in learning and predicting, and thus not suitable for algorithmic trading. 3.4 Further analysis on online learning Notice that from the learning curve we can see that the accuracy (= 1 test error) of linear regression ranges from 40% to 50%, whereas the actual averaged accuracy of LLR is 67.40%, which is higher. This is due to the advantage of online learning and local learning. Since the belief that the stock price is locally linear (but not globally), local learning should improve the quality of the prediction. Moreover, the online learning technique enables the model to evolve with time, so that it was adjusted and better for the next prediction. In addition, for ν-svm, its accuracy varies from 50% - 95% by the learning curve, whereas the actually averaged accuracy of online learning turned out to be 92.36%, which is a number in between. This is because although the online learning technique enables the model to adjust itself with time, its variance is certainly larger than the offline learning as it always have no less training data than the offline learning, which may reduce the accuracy. 4 Conclusion and future work In conclusion, we have developed a process flow of automatic trading using machine learning techniques. Feature construction, feature selection, and model selection were studied in detail. The online learning and offline learning mechanisms were also compared and discussed. The averaged prediction accuracy for the local linear regression model turned out to be 67.40% whereas that of the ν-support vector machine turned out to be 92.36%, thus showing our process a good candidate for algorithmic trading. Noting that ν-support vector machine usually fails to predict correctly when the data point is close to a sudden change which is almost random seen from the data itself, and that the information of a sudden change in stock price trend may be included in the text information such as daily news. It is believed that a further step would be to investigate the usage of text-mining techniques to assist the machine learning process proposed in this project. Furthermore, it is noted that improving the learning speed of ν- support vector machine as well as the efficiency of the code overall is needed. Finally, the real-time tests of the system as well as bringing more complexities (such as considering more realistic transaction costs) is necessary. 5 Acknowledgement We would like to thank Professor Andrew Ng for teaching this great course of machine learning. Indeed, we enjoyed the class very much. The materials and knowledge we learned from this class is very practical and useful, and we have already put what we learned into practice. We also thank Alex Yuqing Dai, for assisting us in downloading the stock data from the Bloomberg terminal and for the many discussions he had with us. References [1] [2] Dempster, M.A.H.; Leemans, V. Expert Syst. Appl., 2005, 30, [3] Fung, G.; Yu, J.; Lam, W. Advances in Knowledge Discovery and Data Mining, Springer Berlin Heidelberg, [4] Debbini, D.; Estin, P.; Goutagny, M. Modeling the Stock Market Using Twitter Sentiment Analysis 5 CS 229 FINAL REPORT
Machine Learning in Stock Price Trend Forecasting
Machine Learning in Stock Price Trend Forecasting Yuqing Dai, Yuning Zhang [email protected], [email protected] I. INTRODUCTION Predicting the stock price trend by interpreting the seemly chaotic market
Stock Market Forecasting Using Machine Learning Algorithms
Stock Market Forecasting Using Machine Learning Algorithms Shunrong Shen, Haomiao Jiang Department of Electrical Engineering Stanford University {conank,hjiang36}@stanford.edu Tongda Zhang Department of
Recognizing Informed Option Trading
Recognizing Informed Option Trading Alex Bain, Prabal Tiwaree, Kari Okamoto 1 Abstract While equity (stock) markets are generally efficient in discounting public information into stock prices, we believe
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
Multiple Kernel Learning on the Limit Order Book
JMLR: Workshop and Conference Proceedings 11 (2010) 167 174 Workshop on Applications of Pattern Analysis Multiple Kernel Learning on the Limit Order Book Tristan Fletcher Zakria Hussain John Shawe-Taylor
Using News Articles to Predict Stock Price Movements
Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 [email protected] 21, June 15,
CS229 Project Report Automated Stock Trading Using Machine Learning Algorithms
CS229 roject Report Automated Stock Trading Using Machine Learning Algorithms Tianxin Dai [email protected] Arpan Shah [email protected] Hongxia Zhong [email protected] 1. Introduction
Sentiment Score based Algorithmic Trading
Sentiment Score based Algorithmic Trading 643 1 Sukesh Kumar Ranjan, 2 Abhishek Trivedi, 3 Dharmveer Singh Rajpoot 1,2,3 Department of Computer Science and Engineering / Information Technology, Jaypee
How to Win the Stock Market Game
How to Win the Stock Market Game 1 Developing Short-Term Stock Trading Strategies by Vladimir Daragan PART 1 Table of Contents 1. Introduction 2. Comparison of trading strategies 3. Return per trade 4.
Applying Machine Learning to Stock Market Trading Bryce Taylor
Applying Machine Learning to Stock Market Trading Bryce Taylor Abstract: In an effort to emulate human investors who read publicly available materials in order to make decisions about their investments,
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
Section 1 - Overview; trading and leverage defined
Section 1 - Overview; trading and leverage defined Download this in PDF format. In this lesson I am going to shift the focus from an investment orientation to trading. Although "option traders" receive
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
Design of an FX trading system using Adaptive Reinforcement Learning
University Finance Seminar 17 March 2006 Design of an FX trading system using Adaptive Reinforcement Learning M A H Dempster Centre for Financial Research Judge Institute of Management University of &
Predicting the Stock Market with News Articles
Predicting the Stock Market with News Articles Kari Lee and Ryan Timmons CS224N Final Project Introduction Stock market prediction is an area of extreme importance to an entire industry. Stock price is
A Sarsa based Autonomous Stock Trading Agent
A Sarsa based Autonomous Stock Trading Agent Achal Augustine The University of Texas at Austin Department of Computer Science Austin, TX 78712 USA [email protected] Abstract This paper describes an autonomous
Forecasting stock markets with Twitter
Forecasting stock markets with Twitter Argimiro Arratia [email protected] Joint work with Marta Arias and Ramón Xuriguera To appear in: ACM Transactions on Intelligent Systems and Technology, 2013,
Fuzzy logic decision support for long-term investing in the financial market
Fuzzy logic decision support for long-term investing in the financial market Abstract This paper discusses the use of fuzzy logic and modeling as a decision making support for long-term investment decisions
Sentiment Analysis of Twitter Feeds for the Prediction of Stock Market Movement
Sentiment Analysis of Twitter Feeds for the Prediction of Stock Market Movement Ray Chen, Marius Lazer Abstract In this paper, we investigate the relationship between Twitter feed content and stock market
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
Beating the NCAA Football Point Spread
Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over
Hedge Fund Returns: You Can Make Them Yourself!
Hedge Fund Returns: You Can Make Them Yourself! Harry M. Kat * Helder P. Palaro** This version: June 8, 2005 Please address all correspondence to: Harry M. Kat Professor of Risk Management and Director
CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis
CS 229, Autumn 2011 Modeling the Stock Market Using Twitter Sentiment Analysis Team members: Daniel Debbini, Philippe Estin, Maxime Goutagny Supervisor: Mihai Surdeanu (with John Bauer) 1 Introduction
Black-Scholes-Merton approach merits and shortcomings
Black-Scholes-Merton approach merits and shortcomings Emilia Matei 1005056 EC372 Term Paper. Topic 3 1. Introduction The Black-Scholes and Merton method of modelling derivatives prices was first introduced
Two-State Option Pricing
Rendleman and Bartter [1] present a simple two-state model of option pricing. The states of the world evolve like the branches of a tree. Given the current state, there are two possible states next period.
Pattern Recognition and Prediction in Equity Market
Pattern Recognition and Prediction in Equity Market Lang Lang, Kai Wang 1. Introduction In finance, technical analysis is a security analysis discipline used for forecasting the direction of prices through
Knowledge Discovery in Stock Market Data
Knowledge Discovery in Stock Market Data Alfred Ultsch and Hermann Locarek-Junge Abstract This work presents the results of a Data Mining and Knowledge Discovery approach on data from the stock markets
Machine Learning Final Project Spam Email Filtering
Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE
Black-Scholes Equation for Option Pricing
Black-Scholes Equation for Option Pricing By Ivan Karmazin, Jiacong Li 1. Introduction In early 1970s, Black, Scholes and Merton achieved a major breakthrough in pricing of European stock options and there
Model Validation Turtle Trading System. Submitted as Coursework in Risk Management By Saurav Kasera
Model Validation Turtle Trading System Submitted as Coursework in Risk Management By Saurav Kasera Instructors: Professor Steven Allen Professor Ken Abbott TA: Tom Alberts TABLE OF CONTENTS PURPOSE:...
Portfolio Performance Measures
Portfolio Performance Measures Objective: Evaluation of active portfolio management. A performance measure is useful, for example, in ranking the performance of mutual funds. Active portfolio managers
FINANCIAL ECONOMICS OPTION PRICING
OPTION PRICING Options are contingency contracts that specify payoffs if stock prices reach specified levels. A call option is the right to buy a stock at a specified price, X, called the strike price.
E-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce
How to assess the risk of a large portfolio? How to estimate a large covariance matrix?
Chapter 3 Sparse Portfolio Allocation This chapter touches some practical aspects of portfolio allocation and risk assessment from a large pool of financial assets (e.g. stocks) How to assess the risk
Using Twitter to Analyze Stock Market and Assist Stock and Options Trading
University of Connecticut DigitalCommons@UConn Doctoral Dissertations University of Connecticut Graduate School 12-17-2015 Using Twitter to Analyze Stock Market and Assist Stock and Options Trading Yuexin
Neural Networks for Sentiment Detection in Financial Text
Neural Networks for Sentiment Detection in Financial Text Caslav Bozic* and Detlef Seese* With a rise of algorithmic trading volume in recent years, the need for automatic analysis of financial news emerged.
Bootstrapping Big Data
Bootstrapping Big Data Ariel Kleiner Ameet Talwalkar Purnamrita Sarkar Michael I. Jordan Computer Science Division University of California, Berkeley {akleiner, ameet, psarkar, jordan}@eecs.berkeley.edu
Stock Investing Using HUGIN Software
Stock Investing Using HUGIN Software An Easy Way to Use Quantitative Investment Techniques Abstract Quantitative investment methods have gained foothold in the financial world in the last ten years. This
Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies
Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies Drazen Pesjak Supervised by A.A. Tsvetkov 1, D. Posthuma 2 and S.A. Borovkova 3 MSc. Thesis Finance HONOURS TRACK Quantitative
Regularized Logistic Regression for Mind Reading with Parallel Validation
Regularized Logistic Regression for Mind Reading with Parallel Validation Heikki Huttunen, Jukka-Pekka Kauppi, Jussi Tohka Tampere University of Technology Department of Signal Processing Tampere, Finland
AN EVALUATION OF THE PERFORMANCE OF MOVING AVERAGE AND TRADING VOLUME TECHNICAL INDICATORS IN THE U.S. EQUITY MARKET
AN EVALUATION OF THE PERFORMANCE OF MOVING AVERAGE AND TRADING VOLUME TECHNICAL INDICATORS IN THE U.S. EQUITY MARKET A Senior Scholars Thesis by BETHANY KRAKOSKY Submitted to Honors and Undergraduate Research
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
Life Cycle Asset Allocation A Suitable Approach for Defined Contribution Pension Plans
Life Cycle Asset Allocation A Suitable Approach for Defined Contribution Pension Plans Challenges for defined contribution plans While Eastern Europe is a prominent example of the importance of defined
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
On the Predictability of Stock Market Behavior using StockTwits Sentiment and Posting Volume
On the Predictability of Stock Market Behavior using StockTwits Sentiment and Posting Volume Abstract. In this study, we explored data from StockTwits, a microblogging platform exclusively dedicated to
Decompose Error Rate into components, some of which can be measured on unlabeled data
Bias-Variance Theory Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Decomposition for Regression Bias-Variance Decomposition for Classification Bias-Variance
Eurodollar Futures, and Forwards
5 Eurodollar Futures, and Forwards In this chapter we will learn about Eurodollar Deposits Eurodollar Futures Contracts, Hedging strategies using ED Futures, Forward Rate Agreements, Pricing FRAs. Hedging
Music Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.
Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features
Using simulation to calculate the NPV of a project
Using simulation to calculate the NPV of a project Marius Holtan Onward Inc. 5/31/2002 Monte Carlo simulation is fast becoming the technology of choice for evaluating and analyzing assets, be it pure financial
Lecture Notes: Basic Concepts in Option Pricing - The Black and Scholes Model
Brunel University Msc., EC5504, Financial Engineering Prof Menelaos Karanasos Lecture Notes: Basic Concepts in Option Pricing - The Black and Scholes Model Recall that the price of an option is equal to
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Forecasting Trade Direction and Size of Future Contracts Using Deep Belief Network
Forecasting Trade Direction and Size of Future Contracts Using Deep Belief Network Anthony Lai (aslai), MK Li (lilemon), Foon Wang Pong (ppong) Abstract Algorithmic trading, high frequency trading (HFT)
Exchange Traded Funds
LPL FINANCIAL RESEARCH Exchange Traded Funds February 16, 2012 What They Are, What Sets Them Apart, and What to Consider When Choosing Them Overview 1. What is an ETF? 2. What Sets Them Apart? 3. How Are
Online Ensembles for Financial Trading
Online Ensembles for Financial Trading Jorge Barbosa 1 and Luis Torgo 2 1 MADSAD/FEP, University of Porto, R. Dr. Roberto Frias, 4200-464 Porto, Portugal [email protected] 2 LIACC-FEP, University of
Author Gender Identification of English Novels
Author Gender Identification of English Novels Joseph Baena and Catherine Chen December 13, 2013 1 Introduction Machine learning algorithms have long been used in studies of authorship, particularly in
A Stock Trading Algorithm Model Proposal, based on Technical Indicators Signals
Informatica Economică vol. 15, no. 1/2011 183 A Stock Trading Algorithm Model Proposal, based on Technical Indicators Signals Darie MOLDOVAN, Mircea MOCA, Ştefan NIŢCHI Business Information Systems Dept.
Option Valuation. Chapter 21
Option Valuation Chapter 21 Intrinsic and Time Value intrinsic value of in-the-money options = the payoff that could be obtained from the immediate exercise of the option for a call option: stock price
How can we discover stocks that will
Algorithmic Trading Strategy Based On Massive Data Mining Haoming Li, Zhijun Yang and Tianlun Li Stanford University Abstract We believe that there is useful information hiding behind the noisy and massive
Identifying Market Price Levels using Differential Evolution
Identifying Market Price Levels using Differential Evolution Michael Mayo University of Waikato, Hamilton, New Zealand [email protected] WWW home page: http://www.cs.waikato.ac.nz/~mmayo/ Abstract. Evolutionary
Options Pricing. This is sometimes referred to as the intrinsic value of the option.
Options Pricing We will use the example of a call option in discussing the pricing issue. Later, we will turn our attention to the Put-Call Parity Relationship. I. Preliminary Material Recall the payoff
Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes
Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk [email protected] Tom Kelsey ID5059-19-B &
Genetic algorithm evolved agent-based equity trading using Technical Analysis and the Capital Asset Pricing Model
Genetic algorithm evolved agent-based equity trading using Technical Analysis and the Capital Asset Pricing Model Cyril Schoreels and Jonathan M. Garibaldi Automated Scheduling, Optimisation and Planning
Automatic Inventory Control: A Neural Network Approach. Nicholas Hall
Automatic Inventory Control: A Neural Network Approach Nicholas Hall ECE 539 12/18/2003 TABLE OF CONTENTS INTRODUCTION...3 CHALLENGES...4 APPROACH...6 EXAMPLES...11 EXPERIMENTS... 13 RESULTS... 15 CONCLUSION...
Introduction to Logistic Regression
OpenStax-CNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction
I.e., the return per dollar from investing in the shares from time 0 to time 1,
XVII. SECURITY PRICING AND SECURITY ANALYSIS IN AN EFFICIENT MARKET Consider the following somewhat simplified description of a typical analyst-investor's actions in making an investment decision. First,
Lecture 6: Option Pricing Using a One-step Binomial Tree. Friday, September 14, 12
Lecture 6: Option Pricing Using a One-step Binomial Tree An over-simplified model with surprisingly general extensions a single time step from 0 to T two types of traded securities: stock S and a bond
Performance of pairs trading on the S&P 500 index
Performance of pairs trading on the S&P 500 index By: Emiel Verhaert, studentnr: 348122 Supervisor: Dick van Dijk Abstract Until now, nearly all papers focused on pairs trading have just implemented the
Active Versus Passive Low-Volatility Investing
Active Versus Passive Low-Volatility Investing Introduction ISSUE 3 October 013 Danny Meidan, Ph.D. (561) 775.1100 Low-volatility equity investing has gained quite a lot of interest and assets over the
Alliance Consulting BOND YIELDS & DURATION ANALYSIS. Bond Yields & Duration Analysis Page 1
BOND YIELDS & DURATION ANALYSIS Bond Yields & Duration Analysis Page 1 COMPUTING BOND YIELDS Sources of returns on bond investments The returns from investment in bonds come from the following: 1. Periodic
Optimization applications in finance, securities, banking and insurance
IBM Software IBM ILOG Optimization and Analytical Decision Support Solutions White Paper Optimization applications in finance, securities, banking and insurance 2 Optimization applications in finance,
Data Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
Overlapping ETF: Pair trading between two gold stocks
MPRA Munich Personal RePEc Archive Overlapping ETF: Pair trading between two gold stocks Peter N Bell and Brian Lui and Alex Brekke University of Victoria 1. April 2012 Online at http://mpra.ub.uni-muenchen.de/39534/
SPDR S&P Software & Services ETF
SPDR S&P Software & Services ETF Summary Prospectus-October 31, 2015 XSW (NYSE Ticker) Before you invest in the SPDR S&P Software & Services ETF (the Fund ), you may want to review the Fund's prospectus
Long-term Stock Market Forecasting using Gaussian Processes
Long-term Stock Market Forecasting using Gaussian Processes 1 2 3 4 Anonymous Author(s) Affiliation Address email 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
Hedging With a Stock Option
IOSR Journal of Business and Management (IOSR-JBM) e-issn: 2278-487X, p-issn: 2319-7668. Volume 17, Issue 9.Ver. I (Sep. 2015), PP 06-11 www.iosrjournals.org Hedging With a Stock Option Afzal Ahmad Assistant
Probability Models of Credit Risk
Probability Models of Credit Risk In discussing financial risk, it is useful to distinguish between market risk and credit risk. Market risk refers to the possibility of losses due to changes in the prices
MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS
MAXIMIZING RETURN ON DIRET MARKETING AMPAIGNS IN OMMERIAL BANKING S 229 Project: Final Report Oleksandra Onosova INTRODUTION Recent innovations in cloud computing and unified communications have made a
Using Text and Data Mining Techniques to extract Stock Market Sentiment from Live News Streams
2012 International Conference on Computer Technology and Science (ICCTS 2012) IPCSIT vol. XX (2012) (2012) IACSIT Press, Singapore Using Text and Data Mining Techniques to extract Stock Market Sentiment
The Volatility Index Stefan Iacono University System of Maryland Foundation
1 The Volatility Index Stefan Iacono University System of Maryland Foundation 28 May, 2014 Mr. Joe Rinaldi 2 The Volatility Index Introduction The CBOE s VIX, often called the market fear gauge, measures
The Binomial Option Pricing Model André Farber
1 Solvay Business School Université Libre de Bruxelles The Binomial Option Pricing Model André Farber January 2002 Consider a non-dividend paying stock whose price is initially S 0. Divide time into small
1 Interest rates, and risk-free investments
Interest rates, and risk-free investments Copyright c 2005 by Karl Sigman. Interest and compounded interest Suppose that you place x 0 ($) in an account that offers a fixed (never to change over time)
Advanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
Algorithmic Trading Session 1 Introduction. Oliver Steinki, CFA, FRM
Algorithmic Trading Session 1 Introduction Oliver Steinki, CFA, FRM Outline An Introduction to Algorithmic Trading Definition, Research Areas, Relevance and Applications General Trading Overview Goals
Relational Learning for Football-Related Predictions
Relational Learning for Football-Related Predictions Jan Van Haaren and Guy Van den Broeck [email protected], [email protected] Department of Computer Science Katholieke Universiteit
Equity forecast: Predicting long term stock price movement using machine learning
Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK [email protected] Abstract Long
BINOMIAL OPTIONS PRICING MODEL. Mark Ioffe. Abstract
BINOMIAL OPTIONS PRICING MODEL Mark Ioffe Abstract Binomial option pricing model is a widespread numerical method of calculating price of American options. In terms of applied mathematics this is simple
MHI3000 Big Data Analytics for Health Care Final Project Report
MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given
Midterm Exam:Answer Sheet
Econ 497 Barry W. Ickes Spring 2007 Midterm Exam:Answer Sheet 1. (25%) Consider a portfolio, c, comprised of a risk-free and risky asset, with returns given by r f and E(r p ), respectively. Let y be the
SMG... 2 3 4 SMG WORLDWIDE
U S E R S G U I D E Table of Contents SMGWW Homepage........... Enter The SMG............... Portfolio Menu Page........... 4 Changing Your Password....... 5 Steps for Making a Trade....... 5 Investor
