Stock Trend Prediction with Technical Indicators using SVM



Similar documents
A New Approach to Neural Network based Stock Trading Strategy

A GUIDE TO WL INDICATORS

Machine learning for algo trading

ALGORITHMIC TRADING USING MACHINE LEARNING TECH-

NEDBANK PRIVATE WEALTH STOCKBROKERS Graphical Analysis Manual

Disclaimer: The authors of the articles in this guide are simply offering their interpretation of the concepts. Information, charts or examples

Alerts & Filters in Power E*TRADE Pro Strategy Scanner

A Practical Guide to Technical Indicators; (Part 1) Moving Averages

MATHEMATICAL TRADING INDICATORS

Predicting Gold Prices

Cross Validation. Dr. Thomas Jensen Expedia.com

Nest Pulse Impact Document

Technical Indicators Tutorial - Forex Trading, Currency Forecast, FX Trading Signal, Forex Training Cour...

Chapter 2.3. Technical Analysis: Technical Indicators

Using News Articles to Predict Stock Price Movements

Data Mining. Nonlinear Classification

Stochastic Oscillator.

Decompose Error Rate into components, some of which can be measured on unlabeled data

Trading with the Intraday Multi-View Indicator Suite

Technical Analysis. Technical Analysis. Schools of Thought. Discussion Points. Discussion Points. Schools of thought. Schools of thought

JetBlue Airways Stock Price Analysis and Prediction

THE MACD: A COMBO OF INDICATORS FOR THE BEST OF BOTH WORLDS

Understanding the market

Highly Active Manual FX Trading Strategy. 1.Used indicators. 2. Theory Standard deviation (stddev Indicator - standard MetaTrader 4 Indicator)

Machine Learning in Stock Price Trend Forecasting

6.14. Oscillators and Indicators.

Beating the MLB Moneyline

Data Mining - Evaluation of Classifiers

Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes

Indicators. Applications and Pitfalls. Adam Grimes

Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms

Charting Glossary Version 1 September 2008

Using Bollinger Bands. by John Bollinger

TECHNICAL CHARTS UNDERSTANDING TECHNICAL CHARTS

Predicting Flight Delays

8 Day Intensive Course Lesson 5 Stochastics & Bollinger Bands

Beating the NCAA Football Point Spread

arxiv: v1 [stat.ml] 21 Jan 2013

Machine Learning Techniques for Stock Prediction. Vatsal H. Shah

Chapter 2.3. Technical Indicators

ChartGenie USER GUIDE

Data Mining Practical Machine Learning Tools and Techniques

Knowledge Discovery and Data Mining

NEURAL networks [5] are universal approximators [6]. It

Applying Machine Learning to Stock Market Trading Bryce Taylor

Hong Kong Stock Index Forecasting

VBM-ADX40 Method. (Wilder, J. Welles from Technical Analysis of Stocks and Commodities, February 1986.)

Tai Kam Fong, Jackie. Master of Science in E-Commerce Technology

CS229 Project Report Automated Stock Trading Using Machine Learning Algorithms

BROKER SERVICES AND PLATFORM

Making Sense of the Mayhem: Machine Learning and March Madness

Cross-Validation. Synonyms Rotation estimation

Knowledge Discovery and Data Mining

Retracements With TMV

How Well Do Traditional Momentum Indicators Work? Cynthia A. Kase, CMT President, Kase and Company, Inc., CTA October 10, 2006

A Stock Trading Algorithm Model Proposal, based on Technical Indicators Signals

Development of Trading Strategies

The profitability of MACD and RSI trading rules in the Australian stock market

DATA MINING TECHNIQUES AND APPLICATIONS

ANTS SuperGuppy. ANTS for esignal Installation Guide. A Guppy Collaboration For Success PAGE 1

Data Mining for Knowledge Management. Classification

Monotonicity Hints. Abstract

Predict the Popularity of YouTube Videos Using Early View Data

Forecasting stock markets with Twitter

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research

Sentiment analysis using emoticons

Professional Trader Series: Moving Average Formula & Strategy Guide. by John Person

Improved Trend Following Trading Model by Recalling Past Strategies in Derivatives Market

Music Mood Classification

How I Trade Profitably Every Single Month without Fail

Recognizing Informed Option Trading

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

A Survey of Systems for Predicting Stock Market Movements, Combining Market Indicators and Machine Learning Classifiers

New Work Item for ISO Predictive Analytics (Initial Notes and Thoughts) Introduction

Comparison of machine learning methods for intelligent tutoring systems

CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION

Neural Network Stock Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com

An Implementation of Genetic Algorithms as a Basis for a Trading System on the Foreign Exchange Market

Active Learning SVM for Blogs recommendation

SAIF-2011 Report. Rami Reddy, SOA, UW_P

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

Exploring the use of Big Data techniques for simulating Algorithmic Trading Strategies

Solving Mass Balances using Matrix Algebra

Stock Market Forecasting Using Machine Learning Algorithms

An introduction to Value-at-Risk Learning Curve September 2003

EVALUATION OF THE PAIRS TRADING STRATEGY IN THE CANADIAN MARKET

Market Velocity and Forces

NEST STARTER PACK. Omnesys Technologies. Nest Starter Pack. February, Page 1 of 36

Principal components analysis

FINANCIAL markets are driven by information. An

Social Media Mining. Data Mining Essentials

Long-term Stock Market Forecasting using Gaussian Processes

Distances, Clustering, and Classification. Heatmaps

Evaluation & Validation: Credibility: Evaluating what has been learned

Using Text and Data Mining Techniques to extract Stock Market Sentiment from Live News Streams

Car Insurance. Prvák, Tomi, Havri

Rule based Classification of BSE Stock Data with Data Mining

The Moving Average W. R. Booker II. All rights reserved forever and ever. And ever.

Machine Learning. Term 2012/2013 LSI - FIB. Javier Béjar cbea (LSI - FIB) Machine Learning Term 2012/ / 34

Equity forecast: Predicting long term stock price movement using machine learning

Transcription:

Stock Trend Prediction with Technical Indicators using SVM Xinjie Di dixinjie@gmail.com SCPD student from Apple Inc Abstract This project focuses on predicting stock price trend for a company in the near future. Unlike some other approaches which are concerned with company fundamental analysis (e.g. Financial reports, market performance, sentiment analysis etc.), the feature space is derived from the time series of the stock itself and is concerned with potential movement of past price. Tree algorithm is applied to feature selection and it suggests a subset of stock technical indicators are critical for predicting the stock trend. It explores different ways of validation and shows that overfitting tend to occur due to fundamentally noisy nature of a single stock price. Experiment results suggest that we are able to achieve more than 70% accuracy on predicting a 3-10 day average price trend with RBF kernelized SVM algorithm. Keywords: stock prediction, feature selection, SVM, stock technical indicator, scikit. 1 Introduction Short-term prediction of stock price trend has potential application for personal investment without high-frequency-trading infrastructure. Unlike predicing market index (as explored by previous years projects), single stock price tends to be affected by large noise and long term trend inherently converges to the company s market performance. So this project focuses on short-term (1-10 days) prediction of stock price trend, and takes the approach of analyzing the time series indicators as features to classify trend (Raise or Down). The validation model is chosen so that testing set always follows the training set in time span to simulate real prediction. Cross validated Grid Search on parameters of rbf-kernelized SVM is performed to fit the training data to balance the bias and variances. Although the efficient-market hyothesis suggests that stock price movements are governed by the random walk hypothesis and thus are inherently unpredictable, the experiment shows that with 1000 transaction days as training data, we are able to predict AAPL s next day acutual close price trend with 56% accuracy, better than the random walk; and more than 70% accuracy on next 3-day, 5-day, 7-day, 10-day price trend. In the end, we conclude that stock technical indicators are very effective and efficient features without any sentiment data in predicting short-term stock trend. 2 Feature Space 2.1 Data Collection The data is pulled from http://finance.yahoo.com. I picked the 3 stocks (AAPL, MSFT, AMZN) that have time span available from 2010-01-04 to 2014-12-10 to get enough data and 2 market index (NASDAQ and SP&500). The goal is to predict a single stock s trend (e.g. AAPL) using features derived from its time series plus those from NASDAQ and SP&500 for augmentation. 2.2 Features of Stock Indicators

A stock technical indicator is a series of data points that are derived by applying a function to the price data at time t and study period n. Below is a table of indicators that I compute from time series and transform to features: Indicators Name Description Formula WILLR Williams %R Determines where today s closing (highest-closed)/(highest-lowest)*100 price fell within the range on past 10- day s transaction. ROCR Rate of Compute (Price(t)/Price(t-n))*100 Change rate of change relative to previous trading intervals MOM Momentum Measures Price(t)-Price(t-n) the change RSI CCI ADX TRIX MACD OBV Relative Strength Commodity Channel Directional Triple Exponential Moving Moving Convergence Divergence On Balance Volume in price Suggests the overbought and oversold market signal. Identifies cyclical turns in stock price Discover if trend is developing Smooth the insignificant movements Use different EMA signal buy&sell Relates trading volume to to Avg(PriceUp)/(Avg(PriceUP)+Avg(PriceDown)*100 Where: PriceUp(t)=1*(Price(t)-Price(t-1)){Price(t)- Price(t-1)>0}; PriceDown(t)=1*(Price(t-1)-Price(t)){Price(t)- Price(t-1)<0}; Tp(t)-TpAvg(t,n)/(0.15*MD(t)) where: Tp(t)=(High(t)+Low(t)+Close(t))/3; TpAvg(t,n)=Avg(Tp(t)) over [t, t-1,, t-n+1]; MD(t)=Avg(Abs(Tp(t)-TpAvg(t,n))); Sum((+DI-(-DI))/(+DI+(-DI))/n TR(t)/TR(t-1) where TR(t)=EMA(EMA(EMA(Price(t)))) over n days period OSC(t)-EMAosc(t) where OSC(t)=EMA1(t)- EMA2(t); EMAosc(t)=EMAosc(t-1)+(k*OSC(t)- EMAosc(t-1)) OBV(t)=OBV(t-1)+/-Volume(t)

TSF ATR MFI Time Series Forcasting True Range Money Flow price change Calculates the linear regression of 20-day price Shows volatility of market Relates typeical price with Volume Linear Regression Estimate with 20-day price. ATR(t)=((n-1)*ATR(t-1)+Tr(t))/n where Tr(t)=Max(Abs(High-Low), Abs(Hight-Close(t-1)), Abs(Low-Close(t-1)); 100-(100/(1+Money Ratio)) where Money Ratio=(+Moneyflow/-Moneyflow); Moneyflow=Tp*Volume The above technical indicators cover different type of features: 1) Price change ROCR, MOM 2) Stock trend discovery ADX, MFI 3) Buy&Sell signals WILLR, RSI, CCI, MACD 4) Volatility signal ATR 5) Volume weights OBV 6) Noise elimination and data smoothing TRIX Also TSF is another feature that itself does linear regression to suggest trend. 2.2 Feature Construction and Data Labeling Those indicators are computed against different periods from 3-day to 20-day and each of them is treated as individual feature in feature space. E.g. ROCR3, ROCR6 means relative price to 3 days ago and to 6 days ago relatively. The feature set used for this project is defined as follows: full_features = ['Adj Close', 'OBV', 'Volume', 'RSI6', 'RSI12', 'SMA3', 'EMA6', 'EMA12', 'ATR14', 'MFI14','ADX14', 'ADX20', 'MOM1', 'MOM3','CCI12', 'CCI20', 'ROCR3', 'ROCR12','outMACD', 'outmacdsignal', 'outmacdhist', 'WILLR', 'TSF10', 'TSF20', 'TRIX', 'BBANDSUPPER', 'BBANDSMIDDLE', 'BBANDSLOWER'] The feature matrix X is defined as: X(t) is a row vector of [full_features(stocktopredict), full_features(sp&500), full_features(nasdaq)]. Hence the total features for an example is of size 3*len(full_features) = 84. Depends on what we want to predict, the data is labeledwith Up or Down. Y(t) is the label for data at time t, Y(t) = f(price(t+n),price(t+n-1), Price(t)) where function f has the value in {-1, 1}. In this setting, it is guaranteed that future price trend is unseen to any features. For predicting the next day trend, Y(t) = 1 if Price(t+1)>Price(t); Y(t) = -1 if Price(t+1)<Price(t).

For predicting the next 3-day average price trend, Y(t) = 1 if SMA(t+3)>SMA(t); Y(t) = -1 if SMA(t+3) < Price(t). 3 Model Fitting and Results 3.1 Feature Selection with ensemble Extremely Randomized Tree Algorithm Limited by the length of document, I m not describing the Extremely Randomized Tree algorithm here and it is in the reference. With total 84 features, there can exist a lot of noisy features that will overfit the training data and mislead the prediction. So the goal is to run an algorithm on training data to decide the ranking of features and pick only top 30% of features to feed into SVM classifier. The feature selection significantly helps to handle overfitting and below is a table of some top features chosen by Extremely Randomized Tree for each stock: AAPL WILLR,MOM3,ROCR3,RSI6,CCI12,CCI20,MOM1,N-ADX14,TRIX,N-CCI12.. 28 AMZN MOM1, N-MOM1, WILLR,SP-MOM1,ROCR3,CCI12,RSI6,MOM3,N-ROCR3 40 MSFT MOM1,RSI6,WILLR,N-MOM1,CCI12,SP-MOM1,MOM3,ROCR3,MACDHist 30 From the above table we found that AAPL s feature ranking tend to favor more longer-term indicators such as WILLR, CCI12 etc that reflects a general trend, whereas AMZN s ranking shows that those 1-3 day features are the highest which means it tends to overfit and has high variance. MSFT is in the middle. We will see in the next section that this agrees surprisingly well with the validation of predicting results. The last column number is the number of relevant features selected by ER tree. AMZN uses significantly more features than the other two which implies the trained model is likely to overfit. 3.2 RBF-Kernelized SVM and Grid Search on parameters The fitting model is soft-margin SVM with RBF kernel exp( x-p /l). We have two parameter to fit C and l. For each training set, I do 5-fold cross-validation and grid search on parameter pair <C,l> and pick the best parameter to do validation on test set. 3.3 Validation Model and Results To validate the prediction accuracy, we train on 950 examples and predict on next 50 days. And we repeat the training and testing with a step window 10 examples so that in total we get 10 testing precision, taking the mean of them we get the following. Preprocessing is applied to eleminate the mean and normalize the vaiance to 1. This is to avoid the magnitude difference of features will poise some significant weight which is undesirable. Company/Accuracy Next 3-day Next 5-day Next 7-day Next 10-day Apple 73.4% 71.41% 70.25% 71.13% Amazon 63% 65% 61.5% 71.25% Microsoft 64.5% 73% 77.125% 77.25% 4 Analysis and Conclusion The result shows that for Apple the prediction is robust and above 70%. For Amazon, it suggests that longer-period prediction (10-day) is significantly better than the shortterm. For Microsoft, the split is between 5-day, shorter than that it tends to have very high noise while 7-day and 10-day prediction is extremely good. These can be mapped to the top features extracted: Apple stock has

more generalized indicators weighted heavier than others so that it has low variance. Amazon s top features are all 1-3 days volatility, so that the model tend to overfit. Actually the experiment shows the training error for Amazon is 80% which is significantly higher than the other two stocks. Last result is directly predicting next day price trend, it shows on average it acheives 56% accuracy. This is very sound result since the next 1 day price is highly noise in stock market and should be close to random. With this model, we can do better than random walk consitently. The result is very helpful in real-world investment for non-hft inverster. By learning from past data we are able to get above 70% accurate prediction on the next couple day s trend. For future work, it worth adding sentiment data as features to augment the technical features. The challenge is how to eliminate as much as noise in sentiment data and quantify them. Reference: Learning to rank with extremely randomized trees Efficient Machine Learning Techniques for Stock Market Prediction Feature Investigation for Stock market Prediction An SVM-based Approach for Stock Market Trend Prediction http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:introduction_to_technical_indicators_and_oscillators FEATURE SELECTION FOR PRICE CHANGE PREDICTION