JetBlue Airways Stock Price Analysis and Prediction
Team Members: Lulu Liu, Jiaojiao Liu
DSO530 Final Project
Motivation

Started in February 2000, JetBlue Airways (JBLU) is a fast-growing, low-cost American airline. For stock investors, the airline industry is always a challenging one. However, JetBlue has been recognized as a value airline, based on its service, style and cost structure, according to investment research firm Trefis [1]. Equity analysts say that JetBlue will remain profitable in 2015 and that it is a strong-buy stock [2]. To understand how JetBlue will perform in the future, we focus on the following questions:

(1) Is JetBlue's stock price influenced by external factors such as the crude oil price, the jet fuel spot price and the treasury bill rate, and if so, how?
(2) How do these factors influence short-term and long-term stock performance?
(3) Is JetBlue a strong-buy stock or not?

Data Sources and Methodology

The sample data for this project cover ten years (April 25, 2005 to April 17, 2015) of daily airline stock prices, volume, the S&P 500 index, the daily crude oil price, the daily jet fuel spot price and daily treasury yield curve rates. The data come from several sources and need to be cleaned and merged before analysis:

Yahoo Finance database: daily stock price [3].
Economic Research, Federal Reserve Bank of St. Louis: corresponding S&P 500 data [4].
U.S. Energy Information Administration official website: daily WTI Crude Oil Price [5].
U.S. Energy Information Administration official website: daily Jet Fuel Spot Price FOB [8].
U.S. Department of the Treasury: U.S. Government Daily Treasury Yield Curve Rates, which we treat as the risk-free return in our project [7].

Variables

Dependent variables. To analyze the stock price, we define two dependent variables: (1) the stock closing price, which is continuous; (2) the price trend, which is categorical (up or down).

Independent variables. In this project, we use several price-related indicators to reflect stock performance.
The first indicator is the S&P 500, an American stock market index based on the market capitalizations of 500 large companies with common stock listed on the NYSE or NASDAQ [4]. The second indicator is the closing price in relation to the day's range (CRTDR), which indicates at which point along the day's range the closing price is located [6]. The third indicator is the crude oil price, downloaded from the U.S. Energy Information Administration website [5]. The fourth indicator is the daily Jet Fuel Spot Price FOB, also downloaded from the U.S. Energy Information Administration website [8]. The last three indicators used in our models are the 5-Year, 10-Year and 20-Year Daily Treasury Yield Curve Rates. In addition, we consider the stock's opening price, closing price, daily volume, daily return and cumulative return as independent variables.

We also created two new columns that may serve as better predictors or otherwise assist our model analysis:

1) Daily Return of JetBlue Stock Price: we computed this series from the daily closing price, as the current day's closing price minus the previous day's closing price, divided by the previous day's closing price.
2) Stock Price Moving Direction: we created this binary variable from the Daily Return of JetBlue Stock Price. If the daily return is above zero, we set the moving direction to UP; otherwise we set it to DOWN.

Models and Analysis

We first used technical analysis to visualize our original dataset (stock price and volume), since the historical performance of a stock is an indicator of its future performance. In order to predict the trend of the stock price, we divided our dataset into two parts: a training dataset and a testing dataset. We picked the first 1,800 JetBlue price data points from 2005.04 to 2015.04 as our training dataset and the remaining data points as the testing dataset. After dividing our dataset, we applied three methodologies to predict the testing dataset's stock price trend based on the estimates obtained from the training dataset.

The first method is regression analysis: we chose JetBlue Closing Price as our response variable and used a multiple linear regression model, a ridge regression model, a lasso regression model and regression tree analysis to predict it. The second method is classification analysis: we chose the binary variable Stock Price Moving Direction as our response variable and used logistic regression, linear and quadratic discriminant analysis, and the K-Nearest Neighbors (KNN) classification approach to predict the direction of the stock price movement. The last method is the Autoregressive Integrated Moving Average (ARIMA) model, which gives us a better understanding of the future stock price trend.
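The derived variables and the data split described above can be illustrated with a minimal Python sketch. This is not the authors' code (the report mentions R); the function names, the toy prices and the split size are hypothetical, and the CRTDR formula (close minus low, divided by high minus low) is the usual definition of the indicator and is assumed here.

```python
# Illustrative sketch only: pure-Python versions of the derived variables.
# All numbers below are made up; they are not JBLU data.

def daily_returns(closes):
    """(P_t - P_{t-1}) / P_{t-1} for every day after the first."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]

def directions(returns):
    """Label a day UP when its return is above zero, otherwise DOWN."""
    return ["UP" if r > 0 else "DOWN" for r in returns]

def crtdr(high, low, close):
    """Closing price in relation to the day's range (0 = at the low, 1 = at the high)."""
    if high == low:  # flat day: the range is zero, so CRTDR is undefined
        return None
    return (close - low) / (high - low)

def chronological_split(rows, n_train):
    """First n_train rows for training, the remainder for testing."""
    return rows[:n_train], rows[n_train:]

closes = [10.0, 10.5, 10.29, 10.29]  # hypothetical closing prices
rets = daily_returns(closes)
labels = directions(rets)
train, test = chronological_split(list(zip(rets, labels)), n_train=2)
```

Note that a day with a return of exactly zero is labeled DOWN, following the rule stated in the text.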
1) Technical Analysis

Technical analysis uses charts and technical indicators to identify patterns that may indicate future stock market movement. We produced several kinds of financial charts to visualize the data using the R package quantmod.

Price and Volume Chart

Here we provide three plots of the JBLU stock price: the daily trend, the monthly trend and the most recent three months. As we can see from the three charts below, JetBlue's historical price chart from 2012 to 2015 is significantly more bullish than the chart from 2009 to 2012. On April 28, 2015, the company announced quarterly earnings of 0.4 per share, which compares favorably with the industry average of -10.0% over the same period [10]. We treat these two points as a buy signal when making our final recommendation.
Figure 1: JBLU daily stock price
Figure 2: JBLU monthly stock price
Figure 3: JBLU most recent three months stock price trend

Moving Average Convergence Divergence

Moving average convergence divergence (MACD) is a trend-following momentum indicator that shows the relationship between two moving averages of prices. As shown in the chart below, during the period from October 2012 to April 2015 the MACD of JetBlue is above the signal line the majority of the time, which is a bullish signal and indicates that it may be time to buy. Conversely, when the MACD falls below the signal line, it is a bearish signal and indicates that it may be time to sell. In addition, when the MACD is above zero, the short-term average is above the long-term average, which signals upward momentum. As we can see from the chart, the signal line acts as an area of support and resistance for the indicator.
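The MACD logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the quantmod computation behind Figure 4: the 12/26/9 periods are the conventional choice and are assumed here because the report does not state them, the averages are seeded with the first value, and the MACD line is an absolute (not percentage) difference.

```python
def ema(values, span):
    """Exponential moving average with smoothing 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd(closes, fast=12, slow=26, signal=9):
    """Return (MACD line, signal line, histogram) for a list of closing prices."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]  # short-term minus long-term average
    signal_line = ema(macd_line, signal)                     # smoothed MACD
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram

# Hypothetical, steadily rising prices: MACD stays above zero and above its signal line.
prices = [10 + 0.1 * i for i in range(60)]
macd_line, signal_line, histogram = macd(prices)
bullish_days = sum(1 for h in histogram if h > 0)
```

A positive histogram value corresponds to the MACD being above the signal line, the condition the text reads as bullish.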
Figure 4: Moving average convergence divergence

Commodity Channel Index

The commodity channel index (CCI) is an oscillator that helps determine when an investment has been overbought or oversold. It measures the difference between the stock's current price and its average price over a given period. A high positive value indicates that the price is well above its average, which is a show of strength. Conversely, a low negative value means the price is below its average, which shows weakness. As we can see from the chart below, for the majority of the time JetBlue's commodity channel index surges above 100, which reflects strong price action that can signal an uptrend.
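For reference, the commodity channel index can be computed as sketched below in Python. This follows the commonly used definition (typical price, simple moving average, mean absolute deviation, constant 0.015); the 20-day window is an assumption, since the report does not state the period used for Figure 5, and the input values are made up.

```python
def cci(highs, lows, closes, n=20, c=0.015):
    """Commodity channel index; entries before the first full window are None."""
    typical = [(h + l + cl) / 3.0 for h, l, cl in zip(highs, lows, closes)]
    out = [None] * len(typical)
    for t in range(n - 1, len(typical)):
        window = typical[t - n + 1 : t + 1]
        sma = sum(window) / n                             # average typical price
        mean_dev = sum(abs(x - sma) for x in window) / n  # mean absolute deviation
        # A perfectly flat window has zero deviation, so the index is undefined there.
        out[t] = None if mean_dev == 0 else (typical[t] - sma) / (c * mean_dev)
    return out

# Hypothetical three-day example with a 3-day window.
example = cci([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0], n=3)
```

Readings above +100 are conventionally read as strength and readings below -100 as weakness, which matches the interpretation given in the text.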
Figure 5: Commodity channel index

2) Regression Analysis

Ridge Regression

We used JetBlue's closing price as y and all of our variables as Xs: JetBlue opening price, JetBlue daily volume, S&P 500 closing price, S&P 500 daily volume, JetBlue daily return, Jet Fuel Spot price index, Crude Oil price index, 5 Yr Treasury Bill Rate, 10 Yr Treasury Bill Rate and 20 Yr Treasury Bill Rate. Applying the ridge regression penalty to our data shrank the coefficients of the non-significant variables toward zero, and we also applied a variance inflation factor (VIF) analysis to locate highly correlated variables. After removing the non-significant and highly correlated variables, we revised our model to use only these four variables: JetBlue daily.volume, S&P 500 Closing price, Crude.Oil.Price and 20 Yr Treasury Bill Rate. As we can see from the table below, ridge regression performed well, with a low RMSE of 2.396. The summary of the coefficients, lambda plot and predicted y plot are shown below:
Ridge Regression
Best Lambda           0.3093
Mean Squared Error    5.7388
RMSE                  2.3956

Lasso Regression

Applying the lasso regression penalty to our data shrank the coefficients of the non-significant variables to exactly zero, and the variable selection results are the same as with the ridge regression penalty. We therefore used the same four variables: JetBlue daily.volume, S&P 500 Closing price, Crude.Oil.Price and 20 Yr Treasury Bill Rate. As we can see from the table below, lasso regression also gave a relatively low error, with an RMSE of 2.680. The summary of the coefficients, lambda plot and predicted y plot are shown below:
Lasso Regression
Best Lambda           0.0038
Mean Squared Error    7.1799
RMSE                  2.6795

Multiple Linear Regression Analysis

Using the same variables as in the ridge and lasso regressions, we also built a multiple linear regression (MLR) to analyze the relationship between the stock closing price and the other factors. After applying the MLR model to our dataset, we found that it gave an RMSE of 2.252. The summary of the coefficients and predicted y plot are shown below:
Multiple Linear Regression
Mean Squared Error         5.0722
RMSE                       2.2522
F-Statistic                574.7
P-Value                    < 2.2e-16
Adjusted R-squared         0.5606
Residual standard error    3.105

Regression Tree Analysis

As we can see from the full (unpruned) regression tree, Crude Oil Price ranked as the top factor affecting the closing price of JetBlue. We also used cross-validation to prune the full tree to see whether a pruned tree gives significantly better results than the full regression tree, but the result showed that the full tree already attained the minimum. Therefore our pruned and unpruned regression trees have exactly the same MSE, which is 11.41.

Regression Tree Analysis
Mean Squared Error    11.4106
RMSE                  3.3780
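The error figures reported for the four regression models are linked by RMSE = sqrt(MSE), so the RMSE is expressed in the same units as the closing price rather than as a percentage. The short Python sketch below recomputes the RMSE from each reported MSE as a consistency check; the helper names are hypothetical, and the `rmse` function shows how the metric would be computed from actual and predicted values, which are not available in the report.

```python
import math

# MSE values as reported in the tables above.
reported_mse = {
    "Ridge Regression": 5.7388,
    "Lasso Regression": 7.1799,
    "Multiple Linear Regression": 5.0722,
    "Regression Tree Analysis": 11.4106,
}

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# RMSE implied by each reported MSE, and the model with the smallest error.
implied_rmse = {name: math.sqrt(mse) for name, mse in reported_mse.items()}
best_model = min(implied_rmse, key=implied_rmse.get)
```

The implied values agree with the reported RMSE column to four decimal places.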
RMSE Comparison

See below for the RMSE comparison table. After comparing the root mean squared errors of these four models, we find that regression tree analysis is clearly weaker than ridge, lasso and MLR, with an RMSE close to 3.4, and that the multiple linear regression model has the lowest error of the four. In conclusion, all of our models provide some improvement and use a realistic, parsimonious set of input variables. However, we recommend the multiple linear regression model, as it provides the best predictive accuracy. This model is data driven and can be used to predict JetBlue's future stock price.

Model                         RMSE
Ridge Regression              2.3956
Lasso Regression              2.6795
Regression Tree Analysis      3.3780
Multiple Linear Regression    2.2522

3) Classification Analysis

Logistic Regression

We used JetBlue's stock trend direction (up or down) as y and all of the other variables as x: JetBlue opening price, JetBlue closing price, JetBlue daily volume, S&P 500 opening price, S&P 500 closing price, S&P 500 daily volume, Jet Fuel Spot price index, Crude Oil price index and 20 Yr Treasury Bill Rate. The summary of the regression is shown below:
The regression result shows that the opening and closing prices of JetBlue and the S&P 500 are statistically significant. We also calculated the misclassification rate, which is shown below:

Logistic Regression
Misclassification Error    0.1052

Linear Discriminant Analysis (LDA)

Since we have one categorical dependent variable, the stock trend direction, and several continuous independent variables, we applied linear discriminant analysis to our dataset. LDA determines a linear equation that predicts group membership; the coefficients of this equation are called discriminant coefficients. The purpose of the model is to choose the coefficient vector v that maximizes the distance between the means of the different categories. The summary of the model is shown below:
Linear Discriminant Analysis
Misclassification Error    0.1220

Quadratic Discriminant Analysis (QDA)

QDA is closely related to LDA, but it does not assume that every class has the same covariance matrix. We applied quadratic discriminant analysis to our dataset. The summary of the model is shown below:
Quadratic Discriminant Analysis
Misclassification Error    0.1445

K-Nearest Neighbors (KNN) Classification

KNN classifies each test observation according to the k training records closest to it; the optimal k is the one with the lowest error rate. Since KNN does not give insight into which variables are important and which are not useful, we used the knowledge from our previous data exploration and models to select the input variables. Since the attributes have different ranges, we first standardized them and then applied the KNN optimization model (see Appendix E for detailed R code).

K-Nearest Neighbor (Optimal k = 186)
Misclassification Error    0.4516

Figure 6: Test error rates based on cross-validation by KNN with different Ks
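The KNN procedure described above (standardize the attributes, classify by the nearest training records, report the misclassification rate) can be sketched as follows. This is a minimal pure-Python illustration and not the R code referenced in Appendix E; the toy data, the value of k and the choice to scale with training-set statistics are assumptions.

```python
import math
from collections import Counter

def standardize(train_rows, test_rows):
    """Scale every column using the TRAINING mean and standard deviation."""
    cols = list(zip(*train_rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1)) or 1.0
           for c, m in zip(cols, means)]

    def scale(rows):
        return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

    return scale(train_rows), scale(test_rows)

def knn_predict(train_x, train_y, query, k):
    """Majority label among the k training rows closest to the query (Euclidean distance)."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def misclassification_rate(actual, predicted):
    """Share of observations whose predicted label differs from the actual one."""
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical two-feature data on very different scales.
train_x = [[1.0, 100.0], [1.2, 110.0], [3.0, 300.0], [3.2, 310.0]]
train_y = ["DOWN", "DOWN", "UP", "UP"]
test_x = [[1.1, 105.0], [3.1, 305.0]]
test_y = ["DOWN", "UP"]

train_s, test_s = standardize(train_x, test_x)
predictions = [knn_predict(train_s, train_y, q, k=3) for q in test_s]
error = misclassification_rate(test_y, predictions)
```

In practice the same functions would be evaluated over a grid of k values, keeping the k with the lowest cross-validated error, as in Figure 6.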
Misclassification Rate Comparison

See below for the misclassification rate comparison table. After comparing the error rates of these four models, we find that the logistic regression model has the lowest misclassification rate compared with the LDA, QDA and KNN models. In conclusion, we recommend the logistic regression model for predicting JetBlue's future stock trend, as it provides the best predictive accuracy.

Model                              Misclassification Rate
Logistic Regression                0.1052
Linear Discriminant Analysis       0.1220
Quadratic Discriminant Analysis    0.1445
K-Nearest Neighbor                 0.4516

4) Time Series Stock Price Prediction Using the Autoregressive Integrated Moving Average (ARIMA) Model

ARIMA is a generalization of the autoregressive moving average (ARMA) model; we fit it to the time series data to forecast the stock price. An ARIMA model is classified as an "ARIMA(p,d,q)" model, where [9]:

p is the number of autoregressive terms
d is the number of nonseasonal differences needed for stationarity
q is the number of lagged forecast errors in the prediction equation

First, we generated a stationary sequence by taking the n-th difference of y. See below for the plots of the 1st and 2nd differenced sequences. To balance stationarity against over-differencing, we chose d = 2 in our model.

Figure 7: 1st difference sequence
Figure 8: 2nd difference sequence

Second, we need to find p and q to fit an ARIMA model. See below for the autocorrelation and partial autocorrelation of the series stockdiff. The autocorrelation graph shows that the 1st lag is within the boundary, and most of the other lags are within the boundary as well (except the 4th one). Therefore, we chose p = 1 and q = 4 in our model.

Figure 9: Autocorrelation
Figure 10: Partial autocorrelation

Third, after fitting the ARIMA(1,2,4) model, we forecasted the future stock trend. The summary of the model is shown below:
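The two diagnostic steps that precede the model fit, differencing the series and checking which autocorrelation lags fall outside the boundary, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R workflow: the approximate 95% boundary of 1.96 / sqrt(n) is the usual default for such plots, the series must not be constant, and the example data are made up.

```python
import math

def difference(series, d=1):
    """Apply first differencing d times (d = 2 gives the second difference)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (series must not be constant)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

def lags_outside_boundary(series, max_lag):
    """Lags whose autocorrelation exceeds the approximate 95% boundary."""
    bound = 1.96 / math.sqrt(len(series))
    r = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]

# Hypothetical series: squares become constant after two differences.
second_diff = difference([1, 4, 9, 16, 25], d=2)

# Hypothetical alternating series: every low-order lag is outside the boundary.
alternating = [1.0, -1.0] * 10
flagged = lags_outside_boundary(alternating, max_lag=3)
```

Fitting and forecasting the ARIMA(1,2,4) model itself requires a time series library and is not reproduced here.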
Figure 11: Forecast from ARIMA model

Conclusion

From our technical analysis of JetBlue's stock performance, we drew these five conclusions:

(1) The moving average convergence divergence indicates a bullish trend.
(2) The chart pattern indicates a strong upward trend.
(3) The volume pattern indicates that the stock is under accumulation.
(4) The 50-day moving average is rising, which shows a bullish trend.
(5) The 200-day moving average is also rising, which shows a bullish trend.

All of the models in the statistical analysis part provided valuable insights that helped us revise our approach. From the regression analysis, we recommend only the multiple linear regression model, as it provided the best predictive accuracy, with an MSE of 5.0722 (RMSE 2.2522). From the classification analysis, we can predict the JBLU future stock trend with a low misclassification rate of 10.52% using the logistic regression model. These two models are data driven and can be used to predict the future stock price of JetBlue and its competitors. In addition, our ARIMA prediction model supported our prediction of JetBlue's future stock trend, which is a strong upward trend between April 2015 and April 2016.

After the technical analysis and statistical model forecasting, we also performed a basic fundamental analysis of JetBlue stock. From our research, we found that JetBlue's fundamental performance is very strong: its revenue increased at an annual growth rate of 13% with operating income at 17% [10], its profit margin is much higher than the industry average, and the company has very low debt levels. Based on the trailing P/E ratio, JetBlue currently trades at a 39% discount to its airline industry peers. Since we have the S&P 500 price index, we also did a basic statistical analysis of JetBlue and the S&P 500. We found that JBLU has a high correlation (>= 0.4) with the S&P 500 index, and that over the last 90 days JBLU's standard deviation has been 2.1 while that of the S&P 500 index has been 0.7.

Recommendation

Based on the above statistical, technical and fundamental analysis, we have a strong-buy opinion on JetBlue's stock. We believe that JetBlue should remain profitable through 2016. Our analysis also suggests that JetBlue is currently trading at a discount compared with peers such as American Airlines and United, which makes the stock more attractive to value investors.
References

[1] Base Case: Trefis. https://www.trefis.com/stock/jblu/model/trefis?easyaccesstoken=provider_4eb86c3ba095bb2138f80caafc7b5b763971f1b9
[2] Why this young discount airline is attracting value investors. http://www.forbes.com/sites/genemarcial/2014/11/30/why-this-young-discount-airline-isattracting-value-investors/
[3] Yahoo Finance, JetBlue stock price. http://finance.yahoo.com/q/hp?s=jblu+historical+prices
[4] S&P 500, Economic Research, Federal Reserve Bank of St. Louis. http://en.wikipedia.org/wiki/s%26p_500 https://research.stlouisfed.org/fred2/series/sp500/downloaddata
[5] U.S. Energy Information Administration, Crude Oil Price WTI (Cushing, Oklahoma) in Dollars per Barrel. http://www.eia.gov/dnav/pet/pet_pri_spt_s1_d.html
[6] Closing Price in Relation to the Day's Range, and Equity Index Mean Reversion. http://qusma.com/2012/11/06/closing-price-in-relation-to-the-days-range-and-equity-indexmean-reversion/
[7] U.S. Department of the Treasury, Government Treasury Yield Curve Daily Rates from 04/2005-04/2015. http://www.treasury.gov/resource-center/data-chart-center/interestrates/pages/textview.aspx?data=yieldall
[8] U.S. Energy Information Administration, Jet Fuel Spot Price FOB in Dollars per Gallon. http://www.eia.gov/dnav/pet/hist/leafhandler.ashx?n=pet&s=eer_epjk_pf4_rgc_dpg&f=D
[9] ARIMA models for time series forecasting. http://people.duke.edu/~rnau/411arim.htm#pdq
[10] JetBlue Official Website, Investor page. http://investor.jetblue.com/investor-relations/financial-information/quarterly-results/28-04-2015.aspx