Stock Price Forecasting Using Information from Yahoo Finance and Google Trend
|
|
|
- Nelson French
- 10 years ago
- Views:
Transcription
1 Stock Price Forecasting Using Information from Yahoo Finance and Google Trend Selene Yue Xu (UC Berkeley) Abstract: Stock price forecasting is a popular and important topic in financial and academic studies. Time series analysis is the most common and fundamental method used to perform this task. This paper aims to combine the conventional time series analysis technique with information from the Google trend website and the Yahoo finance website to predict weekly changes in stock price. Important news/events related to a selected stock over a five- year span are recorded and the weekly Google trend index values on this stock are used to provide a measure of the magnitude of these events. The result of this experiment shows significant correlation between the changes in weekly stock prices and the values of important news/events computed from the Google trend website. The algorithm proposed in this paper can potentially outperform the conventional time series analysis in stock price forecasting. Introduction: There are two main schools of thought in the financial markets, technical analysis and fundamental analysis. Fundamental analysis attempts to determine a stock s value by focusing on underlying factors that affect a company s actual business and its future prospects. Fundamental analysis can be performed on industries or the economy as a whole. Technical analysis, on the other hand, looks at the price movement of a stock and uses this data to predict its future price movements. In this paper, both fundamental and technical data on a selected stock are collected from the Internet. Our selected company is Apple Inc. (aapl). We choose this stock mainly because it is popular and there is a large amount of information online that is relevant to our research and can facilitate us in evaluating ambiguous news. Our fundamental data is in the form of news articles and analyst opinions, whereas our technical data is in the form of historical stock prices. Scholars and researchers have developed many techniques to evaluate online news over the recent years. The most popular technique is text mining. But this method is complicated and subject to language biases. Hence we attempt to use information from the Yahoo finance website and the Google trend website to simplify the evaluation process of online news information.
2 In this paper, we first apply the conventional ARMA time series analysis on the historical weekly stock prices of aapl and obtain forecasting results. Then we propose an algorithm to evaluate news/events related to aapl stock using information from the Yahoo finance website and the Google trend website. We then regress the changes in weekly stock prices on the values of the news at the beginning of the week. We aim to use this regression result to study the relationship between news and stock price changes and improve the performance of the conventional stock price forecasting process. Literature review: The basic theory regarding stock price forecasting is the Efficient Market Hypothesis (EMH), which asserts that the price of a stock reflects all information available and everyone has some degree of access to the information. The implication of EMH is that the market reacts instantaneously to news and no one can outperform the market in the long run. However the degree of market efficiency is controversial and many believe that one can beat the market in a short period of time 1. Time series analysis covers a large number of forecasting methods. Researchers have developed numerous modifications to the basic ARIMA model and found considerable success in these methods. The modifications include clustering time series from ARMA models with clipped data 2, fuzzy neural network approach 3 and support vector machines model 4. Almost all these studies suggest that additional factors should be taken into account on top of the basic or unmodified model. The most common and important one of such factors is the online news information related to the stock. Many researchers attempt to use textual information in public media to evaluate news. To perform this task, various mechanics are developed, such as the AZFin text system 5, a matrix form text mining system 6 and named entities representation scheme 7. All of these processes require complex algorithm that performs text extraction and evaluation from online sources. Data: Weekly stock prices of aapl from the first week of September 2007 to the last week of August 2012 are extracted from the Yahoo finance website. This data set contains the open, high, low, close and adjusted close prices of aapl stock on every Monday throughout these five years. It also contains trading volume values on these days. To achieve consistency, the close prices are used as a general measure of stock price of aapl over the past five years.
3 We use the Key Developments feature under the Events tab on the Yahoo finance website to extract important events and news that are related to aapl stock over the past five years. The Key Developments of aapl from the first week of August 2007 to the last week of August 2012 are recorded. Most of these news comes from the Reuters news website. Reuters is an international news agency and a major provider of financial market data. Each piece of news is examined in greater details in order to determine whether the news should have positive or negative influence on the stock price. The news is then assigned a value of +1 or - 1 accordingly. If the influence of the news is highly controversial or ambiguous, then the news is assigned a zero value. The starting point of the Yahoo finance news data is set one month earlier than the starting point of the stock price data because we eventually want to study the relationship between news at one time and stock price at a later time. Google Trend is a public web facility of Google Inc. It shows how often a specific search term is entered relative to the total search volume on Google Search. It is possible to refine the request by geographical region and time domain. Google Trend provides a very rough measure of how much people talk about a certain topic at any point of time. Online search data has gained increasing relevance and attention in recent years. This is especially true in economic studies. Google Search Insights, a more sophisticated and advanced service displaying search trends data has been used to predict several economic metrics including initial claims for unemployment, automobile demand, and vacation destinations 8. Search data can also be used for measuring consumer sentiment 9. In this paper, weekly (every Sunday) search index of the term aapl on a global scale from the first week of August 2007 to the last week of August 2012 are extracted from the Google trend website. The starting point of the Google trend data is set one month earlier than the starting point of the stock price data because we eventually want to study the relationship between news at one time and stock price at a later time. In addition, this helps to align the news data from the Yahoo finance website with the data from the Google trend website. Assumptions: Several assumptions regarding the Google trend data, the Yahoo finance news information and the Yahoo finance stock price data are made: 1. The global Google trend index of the search term aapl shows how much people around the world search up aapl stock at a time and hence gives a rough measure of the impact of the news/events at that time. Note that here we acknowledge the roughness of this measure. After all, the objective of this paper is to simplify the process of news evaluation. 2. The major source of news information is the key development function on the Yahoo finance website. We assume that this function gives a list of significant news and important events related to Apple Inc. throughout the
4 past five years. In other words, we assume the criterion used by the Yahoo finance website when judging if certain news is important enough to be included in our analysis. Note that this list might not be a comprehensive list of the important news related to Apple Inc. over the past five years. But again, this paper attempts to simplify news selection process. Hence a simple criterion from the Yahoo finance website is used. 3. The historical weekly close prices of aapl reflect changes in the real values of aapl stock during this period of time. 4. Keeping other factors constant, positive news and negative news have equal impact on stock prices. Both positive news and negative news are assigned an initial magnitude of 1, which is later adjusted by the Google trend index values. The only difference is their sign: positive news is assigned to +1 while negative news is assigned to This paper assumes that the impact of news on stock prices follows an exponential function with a parameter of 1/7. The specific algorithm to calculate the value of news at a certain time after it is first released is shown in the next section. Essentially, we expect the news to die out after about two weeks.
5 Analysis of Data: 1. The basic ARIMA model analysis of the historical stock prices: To perform the basic ARIMA time series analysis on the historical stock prices, we first make a plot of the raw data, i.e. the weekly close prices of aapl over time. The plot is shown below: This plot shows that the close price of aapl increases in general over the past five years. However, there is no apparent pattern in the movement of the stock price. The variance of the stock price seems to increase slightly with time. The stock price is especially volatile near the end. These observations imply that a log or square- root transformation of the raw data might be appropriate in order to stabilize variance.
6 Since there is no linear or other discernable mathematical pattern in the data, we perform first degree differencing on the raw data and on the transformed data. We plot the results and compare them: Comparing the three plots, it is clear that the first differences of the raw data show increasing variance over time whereas the first differences of the transformed data show relatively stable variance over time. Hence we confirm that it is indeed desirable to perform transformation on the raw data. Comparing the second and the third plot, it seems that both the logarithm and the square- root transformations do a good job stabilizing the variance and the distributions of the first differences in both plots look random. We decide to use a square- root transformation because it seems to do a better job tuning down some of the extreme values such as the ones near the 50 th data point.
7 Next we plot the autocorrelation function and the partial autocorrelation function of the first differences of the transformed data: From these plots, it seems that the first differences of the transformed data are random. There doesn t seem to be an ARMA relationship in the first differences with time. The transformed stock prices essentially follow an ARIMA(0,1,0) process. This is essentially a random walk process. The random walk model in stock price forecasting has been commonly used and studied throughout history 10. The random walk model has similar implications as the efficient market hypothesis as they both suggest that one cannot outperform the market by analyzing historical prices of a certain stock.
8 2. Algorithm for computing the value of news at a certain time: As mentioned earlier, important news and events related to Apple Inc. over the past five years are recorded and assessed to be either +1 or - 1. An exponential algorithm is used to simulate the change of impact of a piece of news over time. The details of the algorithm is as follow: a. To start with, each day in the past five years is assigned a value of +1, - 1 or 0 depending on if there is important news/event on that day and if the news is positive, negative or neutral respectively. b. The impact of the news decreases exponentially such that n days later, the absolute value of the news/event becomes exp(- n/7) and the sign still follows the original sign of the news/event. This exponential form means that on the day that certain news/event occurs, the absolute value of that news/event is always exp(0)=1. One day later, the absolute value becomes exp(- 1/7)=86.69% of the original absolute value. A week later, the absolute value becomes exp(- 7/7)=36.79% of the original absolute value. Two weeks later, the absolute value becomes exp(- 14/7)=13.53% of the original absolute value. We design the algorithm in a way that news/event almost dies off after two weeks. c. On any day within the past five years, the value of news/event on that date is the sum of all news/events values on and before that date, which are calculated using the aforementioned algorithm. d. After computing the value of news/event for every single day in the past five years, the days corresponding to the Google trend dates (every Sunday in the past five years) are selected. The news/events value computed from the key developments feature on the Yahoo finance website gives the sign of news/event at a particular time and the Google trend index data gives the magnitude of the news/event at that time. They are multiplied together to give a measure of the final value of news/event at that point of time.
9 We plot the final values of news/events over time to get a rough picture of our news vector: From the plot, we see that a lot of the time the news values are close to zero. This means that there is little news to influence the public at those times. There are also some large spikes in the plot. For example, there is a big spike in the negative direction around the 20 th data point; this means that there is some very bad and significant news about Apple Inc. around this time. There is a big spike in the positive direction around the 220 th data point; this means that there is some very good and significant news about Apple Inc. around this time.
10 3. Regression of weekly stock price changes on the news values at the beginning of each week: In order to study the relationship between news/event at certain time and stock price at a later time, we regress the weekly stock price changes on the news values from the previous week. The summary of the regression is shown below: > summary(mod) Call: lm(formula = d1 ~ inf2, data = d) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** inf e-07 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 258 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 258 DF, p-value: 5.808e-07 The regression result shows that there seems to be a very significant and positive correlation (6.3290) between the weekly changes in stock prices and the news values at the beginning of each week. To put this into words, one unit increase in news values should result in roughly 6.33 unit increase in stock price change by the end of the week. However we notice that the R- squared value, which shows the proportion of the variability in the weekly stock price changes explained by the model, is very small ( ). This means that the model is not doing a good job predicting the changes in stock prices.
11 In order to further examine the regression data, we plot the weekly stock price changes against the news values from the previous week with the regression line drawn: From the plot, we notice that there are potential outliers in the data such as the point well above the regression line near news value - 2. An outlier is a data point that is markedly distant from the rest of the data and does not fit the current model. Identifying outliers helps to distinguish between truly unusual points and residuals that are large but not exceptional. We use the Jacknife residuals and the Bonferroni inequality method to come up with a conservative test at a level of 10% to filter out the outliers in our data. This procedure picks out the data point we just mentioned as the sole outlier in our data set. We exclude this data point for the rest of our analysis. The regression result and the plot of the new data set without the outlier are shown on the next page:
12 > summary(mod3) Call: lm(formula = d1 ~ inf2, data = d3) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) *** inf e-09 *** --- Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 12 on 257 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 257 DF, p-value: 9.498e-09
13 The regression results in a similar estimate of the coefficient on the news variable (7.3542), which is also significant at level This shows that the estimate for the correlation coefficient between weekly stock price changes and news values at the beginning of each week is relatively stable. There is slight improvement in the R- squared value (0.1205). A little over 10% of the variability in the weekly stock price changes can be explained by the model. Although the R- squared value shows slight increase, it is still very small, implying that the model is still not doing a great job predicting the changes in stock prices. We further examine the model by analyzing data leverages. If leverage is large, the data point will pull the fitted value toward itself. Such observations often have small residuals and do not show up as outliers on the residual plots. The fit looks good, but is dominated by a single observation. If the leverage of a data point is substantially greater than k/n, where k is the number of parameters being estimated and n is the sample size, then we say that it has high leverage. A rule of thumb says that if leverage is bigger than 2k/n, then this data point is a candidate for further consideration. When checking for high leverage points in our data set using R, we find 24 influential points. If we take out these points and perform regression on the remaining data. The regression summary and plot are shown here: > summary(mod2) Call: lm(formula = d1 ~ inf2, data = d2) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) ** inf Signif. codes: 0 *** ** 0.01 * Residual standard error: on 234 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 1 and 234 DF, p-value: 0.112
14 The estimate of the correlation coefficient between weekly stock price changes and news values is no longer significant. In fact, now the distribution of weekly stock price changes over news values seems rather random. Hence we suspect that the previous regression result is dominated by a number of highly influential data points in our sample and does not represent general situations very well.
15 4. Regression of weekly stock price changes on dummy variables representing different intervals of news values: In addition to studying the relationship between stock price changes and specific news values, we would like to examine the general behavior of stock price changes when the news value falls within a certain range. In order to do this, we divide the news values into different intervals, represent them by dummy variables and regress the weekly changes in stock prices onto these dummy variables (without an intercept term). When we perform a regression like this, the estimated coefficient for each dummy variable in the regression output is the average of the weekly stock price changes that correspond to news values within that particular interval. First we divide the news values into two general categories: negative and positive. The regression result is shown below: > summary(m) Call: lm(formula = d1 ~ pdum + ndum - 1) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) pdum ** ndum Signif. codes: 0 *** ** 0.01 * Residual standard error: on 258 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: on 2 and 258 DF, p-value: From the regression summary, we see that the mean of weekly stock price changes for positive news is It is significant at a level of This makes sense since we would expect positive news to be followed by a positive change in stock price. The mean of weekly stock price changes for negative news is It is not significant at any level smaller than 0.1. This contradicts to what we would have expected since normally negative news should be followed by a negative change in stock price.
16 Next we divide the news values into finer intervals: smaller than - 2, bigger than or equal to - 2 and smaller than - 1.5, bigger than or equal to and smaller than - 1, bigger than or equal to - 1 and smaller than - 0.5, bigger than or equal to and smaller than 0, bigger than or equal to 0 and smaller than 0.5, bigger than or equal to 0.5 and smaller than 1, bigger than or equal to 1 and smaller than 1.5, and finally bigger than or equal to 1.5. The regression result over these nine dummy variables (without intercept term) is shown below: > summary(m2) Call: lm(formula = d1 ~ dum1 + dum2 + dum3 + dum4 + dum5 + dum6 + dum7 + dum8 + dum9-1) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(> t ) dum dum dum dum dum * dum ** dum dum dum e-05 *** --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 251 degrees of freedom Multiple R-squared: 0.147, Adjusted R-squared: F-statistic: on 9 and 251 DF, p-value: 6.274e-06 From the regression summary, we see that the averages of weekly stock price changes for news values that fall into the first four intervals are negative. But only the averages in the first interval (<- 2) and the fourth interval (- 1~- 0.5) are significant, both at level 0.1. The other two are not significant at any level smaller than 0.1. Also we would have expected the mean of weekly stock price changes to be more negative in the second interval (- 2~- 1.5) than those in the third (- 1.5~- 1) and
17 fourth interval (- 1~- 0.5). The average of weekly stock price changes in the fifth interval (- 0.5~0) is very interesting. Contrary to our expectation that it should be negative, it is actually positive here and is significantly so (at a level 0.05). We will discuss about this in the next section. The averages of weekly stock price changes for news values that fall into the last four intervals, except for the seventh interval, are positive, just as what we would have expected. The averages in the sixth interval (0~0.5) and the last interval (>1.5) are significant at level 0.01 and level respectively. The mean in the eighth interval (1~1.5) is not significant at any level smaller than 0.1. The mean in the seventh interval (0.5~1) is a negative value, which contradicts to our expectation. But it is not significant at any level smaller than 0.1. All in all, when dividing news values into different intervals, the trend of average weekly stock price changes is similar to what we would have expected, with a couple of exceptions.
18 Discussion: 1. We assign the same initial magnitude of 1 to positive and negative news. This might not accurately reflect how people generally react to good and bad news. There are numerous studies showing that negative information has a much greater impact on individuals attitudes than does positive information 11. Hence it might be more reasonable to assign a slightly larger initial absolute value to negative news than to positive news. 2. A lot of the time, the judgment of whether a piece of news is good or bad is subjective. People might have different, even opposite, interpretations of the same information. Since the determination of the sign of news is done manually in our experiment, this process is subject to human biases. This is especially true when the news is ambiguous and not as significant as other news. For example, in the last section, we see that the stock price on average goes up when there is slightly negative news. It is possible that some of these news are deemed negative in our analysis but actually are considered goods news by popular opinion. It is hard to accurately and systematically determine whether news is good or bad in the opinion of the majority of the public when there is ambiguity. 3. We use an exponential function to simulate how fast the impact of news fades away. There might be better alternatives to simulate this process. Moreover, positive news and negative news might fade out in different manners. This is not accounted for in our analysis. 4. Our main news source is the key developments function on the Yahoo finance website, which mostly comes from the Reuters News website. This might not provide a comprehensive list of all the important news/events that took place over the past five years. 5. We only extract the Google trend index record for the term aapl. Often when people search about a company, they would type in the name of the company or the news directly instead of using a stock symbol. The Google trend data for more relevant terms could be added to the ones we get for aapl in order to get a more accurate measure of how much people care about the news related to this stock at a particular time. Conclusion: Our initial analysis shows significant correlation between news values and weekly stock price changes. But we need to be careful when using this result since it is likely that the result is dominated by a number of influential observations and is not reflective of the general trend. In general the weekly stock price changes within different intervals of news values behave in the same way as what we expect. But we also find some interesting exceptions worth looking into. There are a number of limitations and shortcomings in our experiment as mentioned in the discussion section and potential improvements can be made to our data collection and analysis method. Further researches can be done with possible improvements such as more refined search data and more accurate algorithm to compute news values.
19 Reference: 1. Gili Yen and Cheng- Few Lee, Efficient Market Hypothesis (EMH): Past, Present and Future in Review of Pacific Basin Financial Markets and Policies, vol 11, issue 2, A. J. Bagnall and G. J. Janacek, Clustering Time Series from ARMA Models with Clipped Data in KDD, W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, Eds. ACM, D. Marcek, Stock Price forecasting: Statistical, Classical and Fuzzy Neural Network Approach in MDAI, V. Torra and Y. Narukawa, Eds, vol Springer, A Hybrid ARIMA and Support Vector Machines Model in Stock Price Forecasting in Omega, the International Journal of Management Science. vol. 33, no. 3, Schumaker, Robert P., and Hsinchun Chen. "Textual Analysis of Stock Market Prediction using Breaking Financial News: The AZFin text system." ACM Transactions on Information Systems (TOIS) 27.2 (2009) 6. Deng, Shangkun, et al. "Combining Technical Analysis with Sentiment Analysis for Stock Price Prediction." Dependable, Autonomic and Secure Computing (DASC), 2011 IEEE Ninth International Conference on. IEEE, BABU, M. SURESH, DRN GEETHANJALI, and V. RATNA KUMARI. "TEXTUAL ANALYSIS OF STOCK MARKET PREDICTION USING FINANCIAL NEWS ARTICLES." The Technology World Quarterly Journal (2010) 8. Choi, Hyunyoung, and Hal Varian. "Predicting the present with google trends." Economic Record 88.s1 (2012): Preis, Tobias, Daniel Reith, and H. Eugene Stanley. "Complex dynamics of our economic life on different scales: insights from search engine query data." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences (2010): Fama, Eugene F. "Random walks in stock market prices." Financial Analysts Journal (1965): Soroka, Stuart N. "Good news and bad news: Asymmetric responses to economic information." Journal of Politics 68.2 (2006):
20 Appendix (R code): price=read.csv("/users/selenexu/desktop/econ Honor/aapl table.csv",header=true) price=price[order(price$date,decreasing=false),] trend=read.csv("/users/selenexu/desktop/econ Honor/aapl trends.csv",header=true) plot(trend$aapl,type='l') plot(price$close,type='l',xlab='time',ylab='close Prices',main='Weekly Close Prices of aapl') d1=diff(price$close) logd1=diff(log(price$close)) sd1=diff(sqrt(price$close)) par(mfrow=c(3,1)) plot(d1,type='l',xlab='time',ylab='difference',main='first Degree Differencing on Raw Data') plot(logd1,type='l',xlab='time',ylab='difference',main='first Degree Differencing on Logged Data') plot(sd1,type='l',xlab='time',ylab='difference',main='first Degree Differencing on Square-root Data') par(mfrow=c(2,1)) acf(sd1,main='autocorrelation Function of the First Differences') pacf(sd1,main='partial Autocorrelation Function of the First Differences') sd2=diff(sd1) par(mfrow=c(2,1)) acf(sd2,main='autocorrelation Function of the Second Differences') pacf(sd2,main='partial Autocorrelation Function of the Second Differences') arima(sqrt(price$close),order=c(0,2,1)) inf2=info[1:(length(info)-1)] d=data.frame(d1,inf2) mod=lm(d1~inf2,data=d)
21 summary(mod) plot(inf2,d1,xlab='news Values',ylab='Weekly Changes in Stock Prices',main='Changes in Stock Prices VS News with Regression Line') abline(a=summary(mod)$coefficients[1],b=summary(mod)$coefficients [2]) jack=rstudent(mod) plot(jack, ylab='jacknife residuals',main='jacknife residuals') q=abs(qt(.1/(260*2),( ))) jack[abs(jack)==max(abs(jack))] which(abs(jack)>q) d3=d[-which(abs(jack)>q),] mod3=lm(d1~inf2,data=d3) summary(mod3) plot(d3$inf2,d3$d1,xlab='news Values',ylab='Weekly Changes in Stock Prices',main='Changes in Stock Prices VS News (without outliers)') abline(a=summary(mod3)$coefficients[1],b=summary(mod3)$coefficien ts[2]) influence(mod)$h dim(d) which(influence(mod)$h > 2*2/260 ) influence(mod)$h[which(influence(mod)$h > 2*2/260 )] d2=d[-which(influence(mod)$h > 2*2/260 ),] plot(d2$inf2,d2$d1,xlab='news Values',ylab='Weekly Changes in Stock Prices',main='Changes in Stock Prices VS News (without influential points)') abline(a=summary(mod2)$coefficients[1],b=summary(mod2)$coefficien ts[2]) mod2=lm(d1~inf2,data=d2) summary(mod2) pdum=(inf2>0)*1 ndum=(inf2<0)*1 m=lm(d1~pdum+ndum-1) summary(m) x=d[inf2>0,] mean(x$d1) min(inf2)
22 max(inf2) dum1=(inf2<(-2))*1 dum2=(inf2>=(-2)&inf2<(-1.5))*1 dum3=(inf2>=(-1.5)&inf2<(-1))*1 dum4=(inf2>=(-1)&inf2<(-0.5))*1 dum5=(inf2>=(-0.5)&inf2<(0))*1 dum6=(inf2>=(0)&inf2<(0.5))*1 dum7=(inf2>=(0.5)&inf2<(1))*1 dum8=(inf2>=(1)&inf2<(1.5))*1 dum9=(inf2>=(1.5))*1 m2=lm(d1~dum1+dum2+dum3+dum4+dum5+dum6+dum7+dum8+dum9-1) summary(m2) #the following shows how to create the news vector news=read.csv("/users/selenexu/desktop/econ Honor/event.csv",header=TRUE) news$date=as.date(news$date,"%m/%d/%y") date=seq(from=as.date(" "), to=as.date(" "), by=1) length(date) match=match(news$date, date) fac=1:length(date) for (i in fac) {if (i %in% match) {fac[i]=news[match(i,match),3]} else {fac[i]=0}} mul=1:length(fac) for (i in mul) {mul[i]=sum(exp(-((i-1):0)*1/7)*fac[1:i])} plot(mul,type='l') d=date[5:length(date)] da=d[seq(1,length(d),7)] tail(da) weeklydate=da[1:(length(da)-2)] m=mul[5:length(mul)] m1=m[seq(1,length(m),7)] m2=m1[1:(length(m1)-2)] trend=read.csv("/users/selenexu/desktop/econ Honor/aapl trends.csv",header=true) infor=trend$aapl[5:length(trend$aapl)] info=infor*m2 plot(info,type='l', xlab='time',ylab='news Values',main='Values of News/Events over Time')
Multiple Linear Regression
Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate
5. Linear Regression
5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4
MSwM examples. Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech.
MSwM examples Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech February 24, 2014 Abstract Two examples are described to illustrate the use of
Chapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
We extended the additive model in two variables to the interaction model by adding a third term to the equation.
Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic
Introduction to time series analysis
Introduction to time series analysis Margherita Gerolimetto November 3, 2010 1 What is a time series? A time series is a collection of observations ordered following a parameter that for us is time. Examples
The Viability of StockTwits and Google Trends to Predict the Stock Market. By Chris Loughlin and Erik Harnisch
The Viability of StockTwits and Google Trends to Predict the Stock Market By Chris Loughlin and Erik Harnisch Spring 2013 Introduction Investors are always looking to gain an edge on the rest of the market.
MGT 267 PROJECT. Forecasting the United States Retail Sales of the Pharmacies and Drug Stores. Done by: Shunwei Wang & Mohammad Zainal
MGT 267 PROJECT Forecasting the United States Retail Sales of the Pharmacies and Drug Stores Done by: Shunwei Wang & Mohammad Zainal Dec. 2002 The retail sale (Million) ABSTRACT The present study aims
Psychology 205: Research Methods in Psychology
Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready
EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION
EDUCATION AND VOCABULARY MULTIPLE REGRESSION IN ACTION EDUCATION AND VOCABULARY 5-10 hours of input weekly is enough to pick up a new language (Schiff & Myers, 1988). Dutch children spend 5.5 hours/day
Master s Thesis. A Study on Active Queue Management Mechanisms for. Internet Routers: Design, Performance Analysis, and.
Master s Thesis Title A Study on Active Queue Management Mechanisms for Internet Routers: Design, Performance Analysis, and Parameter Tuning Supervisor Prof. Masayuki Murata Author Tomoya Eguchi February
Getting Correct Results from PROC REG
Getting Correct Results from PROC REG Nathaniel Derby, Statis Pro Data Analytics, Seattle, WA ABSTRACT PROC REG, SAS s implementation of linear regression, is often used to fit a line without checking
Using R for Linear Regression
Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional
Time Series Analysis
Time Series Analysis Identifying possible ARIMA models Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos
Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model
Correlation and Simple Linear Regression
Correlation and Simple Linear Regression We are often interested in studying the relationship among variables to determine whether they are associated with one another. When we think that changes in a
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
Analysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data
A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt
Time Series Analysis: Basic Forecasting.
Time Series Analysis: Basic Forecasting. As published in Benchmarks RSS Matters, April 2015 http://web3.unt.edu/benchmarks/issues/2015/04/rss-matters Jon Starkweather, PhD 1 Jon Starkweather, PhD [email protected]
5. Multiple regression
5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Earnings Announcement and Abnormal Return of S&P 500 Companies. Luke Qiu Washington University in St. Louis Economics Department Honors Thesis
Earnings Announcement and Abnormal Return of S&P 500 Companies Luke Qiu Washington University in St. Louis Economics Department Honors Thesis March 18, 2014 Abstract In this paper, I investigate the extent
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this
2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or
Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.$ and Sales $: 1. Prepare a scatter plot of these data. The scatter plots for Adv.$ versus Sales, and Month versus
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
Sentiment analysis on tweets in a financial domain
Sentiment analysis on tweets in a financial domain Jasmina Smailović 1,2, Miha Grčar 1, Martin Žnidaršič 1 1 Dept of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 2 Jožef Stefan International
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
Example G Cost of construction of nuclear power plants
1 Example G Cost of construction of nuclear power plants Description of data Table G.1 gives data, reproduced by permission of the Rand Corporation, from a report (Mooz, 1978) on 32 light water reactor
Time Series Analysis
Time Series 1 April 9, 2013 Time Series Analysis This chapter presents an introduction to the branch of statistics known as time series analysis. Often the data we collect in environmental studies is collected
Module 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
Comparing Nested Models
Comparing Nested Models ST 430/514 Two models are nested if one model contains all the terms of the other, and at least one additional term. The larger model is the complete (or full) model, and the smaller
Analysis of algorithms of time series analysis for forecasting sales
SAINT-PETERSBURG STATE UNIVERSITY Mathematics & Mechanics Faculty Chair of Analytical Information Systems Garipov Emil Analysis of algorithms of time series analysis for forecasting sales Course Work Scientific
Applying Machine Learning to Stock Market Trading Bryce Taylor
Applying Machine Learning to Stock Market Trading Bryce Taylor Abstract: In an effort to emulate human investors who read publicly available materials in order to make decisions about their investments,
Machine Learning Techniques for Stock Prediction. Vatsal H. Shah
Machine Learning Techniques for Stock Prediction Vatsal H. Shah 1 1. Introduction 1.1 An informal Introduction to Stock Market Prediction Recently, a lot of interesting work has been done in the area of
Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY
Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship
Violent crime total. Problem Set 1
Problem Set 1 Note: this problem set is primarily intended to get you used to manipulating and presenting data using a spreadsheet program. While subsequent problem sets will be useful indicators of the
Outline: Demand Forecasting
Outline: Demand Forecasting Given the limited background from the surveys and that Chapter 7 in the book is complex, we will cover less material. The role of forecasting in the chain Characteristics of
THE UNIVERSITY OF CHICAGO, Booth School of Business Business 41202, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Homework Assignment #2
THE UNIVERSITY OF CHICAGO, Booth School of Business Business 41202, Spring Quarter 2014, Mr. Ruey S. Tsay Solutions to Homework Assignment #2 Assignment: 1. Consumer Sentiment of the University of Michigan.
Predictive time series analysis of stock prices using neural network classifier
Predictive time series analysis of stock prices using neural network classifier Abhinav Pathak, National Institute of Technology, Karnataka, Surathkal, India [email protected] Abstract The work pertains
ADVANCED FORECASTING MODELS USING SAS SOFTWARE
ADVANCED FORECASTING MODELS USING SAS SOFTWARE Girish Kumar Jha IARI, Pusa, New Delhi 110 012 [email protected] 1. Transfer Function Model Univariate ARIMA models are useful for analysis and forecasting
Equity forecast: Predicting long term stock price movement using machine learning
Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK [email protected] Abstract Long
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
A Primer on Forecasting Business Performance
A Primer on Forecasting Business Performance There are two common approaches to forecasting: qualitative and quantitative. Qualitative forecasting methods are important when historical data is not available.
11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
From Saving to Investing: An Examination of Risk in Companies with Direct Stock Purchase Plans that Pay Dividends
From Saving to Investing: An Examination of Risk in Companies with Direct Stock Purchase Plans that Pay Dividends Raymond M. Johnson, Ph.D. Auburn University at Montgomery College of Business Economics
4. Simple regression. QBUS6840 Predictive Analytics. https://www.otexts.org/fpp/4
4. Simple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/4 Outline The simple linear model Least squares estimation Forecasting with regression Non-linear functional forms Regression
Stat 5303 (Oehlert): Tukey One Degree of Freedom 1
Stat 5303 (Oehlert): Tukey One Degree of Freedom 1 > catch
Penalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
CHAPTER 11: THE EFFICIENT MARKET HYPOTHESIS
CHAPTER 11: THE EFFICIENT MARKET HYPOTHESIS PROBLEM SETS 1. The correlation coefficient between stock returns for two non-overlapping periods should be zero. If not, one could use returns from one period
Week 5: Multiple Linear Regression
BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School
Knowledge Discovery in Stock Market Data
Knowledge Discovery in Stock Market Data Alfred Ultsch and Hermann Locarek-Junge Abstract This work presents the results of a Data Mining and Knowledge Discovery approach on data from the stock markets
Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index
Big Data, Socio- Psychological Theory, Algorithmic Text Analysis, and Predicting the Michigan Consumer Sentiment Index Rickard Nyman *, Paul Ormerod Centre for the Study of Decision Making Under Uncertainty,
Lucky vs. Unlucky Teams in Sports
Lucky vs. Unlucky Teams in Sports Introduction Assuming gambling odds give true probabilities, one can classify a team as having been lucky or unlucky so far. Do results of matches between lucky and unlucky
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions
A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.
Marketing Mix Modelling and Big Data P. M Cain
1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored
2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
The Effect of Maturity, Trading Volume, and Open Interest on Crude Oil Futures Price Range-Based Volatility
The Effect of Maturity, Trading Volume, and Open Interest on Crude Oil Futures Price Range-Based Volatility Ronald D. Ripple Macquarie University, Sydney, Australia Imad A. Moosa Monash University, Melbourne,
SAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
Modeling Lifetime Value in the Insurance Industry
Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting
Why High-Order Polynomials Should Not be Used in Regression Discontinuity Designs
Why High-Order Polynomials Should Not be Used in Regression Discontinuity Designs Andrew Gelman Guido Imbens 2 Aug 2014 Abstract It is common in regression discontinuity analysis to control for high order
Financial Trading System using Combination of Textual and Numerical Data
Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,
Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay. Solutions to Midterm
Booth School of Business, University of Chicago Business 41202, Spring Quarter 2015, Mr. Ruey S. Tsay Solutions to Midterm Problem A: (30 pts) Answer briefly the following questions. Each question has
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
STATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
Studying Achievement
Journal of Business and Economics, ISSN 2155-7950, USA November 2014, Volume 5, No. 11, pp. 2052-2056 DOI: 10.15341/jbe(2155-7950)/11.05.2014/009 Academic Star Publishing Company, 2014 http://www.academicstar.us
Simple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
The correlation coefficient
The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative
Multiple Linear Regression in Data Mining
Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple
Using News Articles to Predict Stock Price Movements
Using News Articles to Predict Stock Price Movements Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 9237 [email protected] 21, June 15,
E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F
Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,
International Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:
International Statistical Institute, 56th Session, 2007: Phil Everson
Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: [email protected] 1. Introduction
Threshold Autoregressive Models in Finance: A Comparative Approach
University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Informatics 2011 Threshold Autoregressive Models in Finance: A Comparative
Financial Risk Management Exam Sample Questions/Answers
Financial Risk Management Exam Sample Questions/Answers Prepared by Daniel HERLEMONT 1 2 3 4 5 6 Chapter 3 Fundamentals of Statistics FRM-99, Question 4 Random walk assumes that returns from one time period
Exercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
IBM SPSS Forecasting 22
IBM SPSS Forecasting 22 Note Before using this information and the product it supports, read the information in Notices on page 33. Product Information This edition applies to version 22, release 0, modification
Using simulation to calculate the NPV of a project
Using simulation to calculate the NPV of a project Marius Holtan Onward Inc. 5/31/2002 Monte Carlo simulation is fast becoming the technology of choice for evaluating and analyzing assets, be it pure financial
Demand Forecasting When a product is produced for a market, the demand occurs in the future. The production planning cannot be accomplished unless
Demand Forecasting When a product is produced for a market, the demand occurs in the future. The production planning cannot be accomplished unless the volume of the demand known. The success of the business
American Association for Laboratory Accreditation
Page 1 of 12 The examples provided are intended to demonstrate ways to implement the A2LA policies for the estimation of measurement uncertainty for methods that use counting for determining the number
Credibility and Pooling Applications to Group Life and Group Disability Insurance
Credibility and Pooling Applications to Group Life and Group Disability Insurance Presented by Paul L. Correia Consulting Actuary [email protected] (207) 771-1204 May 20, 2014 What I plan to cover
Ch.3 Demand Forecasting.
Part 3 : Acquisition & Production Support. Ch.3 Demand Forecasting. Edited by Dr. Seung Hyun Lee (Ph.D., CPL) IEMS Research Center, E-mail : [email protected] Demand Forecasting. Definition. An estimate
APPLICATION OF THE VARMA MODEL FOR SALES FORECAST: CASE OF URMIA GRAY CEMENT FACTORY
APPLICATION OF THE VARMA MODEL FOR SALES FORECAST: CASE OF URMIA GRAY CEMENT FACTORY DOI: 10.2478/tjeb-2014-0005 Ramin Bashir KHODAPARASTI 1 Samad MOSLEHI 2 To forecast sales as reliably as possible is
How To Identify A Churner
2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
Stock Market Volatility during the 2008 Financial Crisis
Stock Market Volatility during the 2008 Financial Crisis Kiran Manda * The Leonard N. Stern School of Business Glucksman Institute for Research in Securities Markets Faculty Advisor: Menachem Brenner April
JetBlue Airways Stock Price Analysis and Prediction
JetBlue Airways Stock Price Analysis and Prediction Team Member: Lulu Liu, Jiaojiao Liu DSO530 Final Project JETBLUE AIRWAYS STOCK PRICE ANALYSIS AND PREDICTION 1 Motivation Started in February 2000, JetBlue
Data Mining Techniques Chapter 6: Decision Trees
Data Mining Techniques Chapter 6: Decision Trees What is a classification decision tree?.......................................... 2 Visualizing decision trees...................................................
MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
ALGORITHMIC TRADING USING MACHINE LEARNING TECH-
ALGORITHMIC TRADING USING MACHINE LEARNING TECH- NIQUES: FINAL REPORT Chenxu Shao, Zheming Zheng Department of Management Science and Engineering December 12, 2013 ABSTRACT In this report, we present an
MULTIPLE REGRESSIONS ON SOME SELECTED MACROECONOMIC VARIABLES ON STOCK MARKET RETURNS FROM 1986-2010
Advances in Economics and International Finance AEIF Vol. 1(1), pp. 1-11, December 2014 Available online at http://www.academiaresearch.org Copyright 2014 Academia Research Full Length Research Paper MULTIPLE
