Prediction of Stock Performance in the Indian Stock Market Using Logistic Regression

Avijan Dutta
Associate Professor & Head, Department of Management Studies
National Institute of Technology, Durgapur, India

Gautam Bandopadhyay
Associate Professor
National Institute of Technology, Durgapur, India

Suchismita Sengupta
Associate Professor
IES Management College and Research Centre, Mumbai, India

ABSTRACT

The authors use logistic regression (LR), with various financial ratios as independent variables, to investigate indicators that significantly affect the performance of stocks actively traded on the Indian stock market. The study sample consists of the ratios of 30 large market capitalization companies over a four-year period. The study identifies and examines eight financial ratios that can classify the companies, with up to 74.6% accuracy, into two categories (good or poor) based on their rate of return. The paper asserts that the model developed can enhance an investor's stock price forecasting ability. Macroeconomic variables, which can also influence the share price, were not taken into account, however. The paper discusses the practical implications of using the LR method to predict the probability of good stock performance. The authors state that the model can be used by investors, fund managers, and investment companies to enhance their ability to select out-performing stocks.

Keywords: Classification of stock performance, Indian stock market, logistic regression, market rate of return, financial ratios, NIFTY

International Journal of Business and Information, Volume 7, Number 1, June 2012

1. INTRODUCTION

Global crashes do not occur all of a sudden but are preceded by local and regional crashes in emerging economies. Even when investors are not exposed to emerging stock markets, they should pay attention to these markets, as local crashes can affect developed markets. Moreover, the interdependence is relevant as well, in that interest rates, bond returns, and volatility also affect the probabilities of the different types of stock market crashes. It is important for shareholders and potential investors to use relevant financial information to enable them to make good investment decisions in the stock market.

Predicting stock performance is certainly very complicated and difficult. In the history of stock performance literature, no comprehensive, accurate model has been suggested to date for predicting stock market performance. A stock's performance can, to some extent, be analyzed based on financial indicators presented in the company's annual report. The annual report contains a vast amount of information that can be transformed into various ratios. Previous literature suggests that financial ratios are important tools for assessing future stock performance. Analysts, investors, and researchers use financial ratios to project future stock price trends. Ratio analysis has emerged, therefore, as one of the key parameters used by fund managers and investors to determine the intrinsic value of stock shares; thus, financial ratios are used extensively for the valuation of stock.

The study of financial ratios emerged as a new discipline after stock market crashes in the 1990s and early 2000s in the United States and parts of Europe and southern Asia. Today, ratios are used extensively in fundamental analysis to predict the future performance of a company. Various new ratios, such as book value and price/cash earnings per share, have been included in this discipline for share valuation.
Financial ratios help to form the basis of investor stock price expectations and, hence, influence investment decision making. The level of importance given to financial ratios differs from industry to industry and from one country to another. Thus, selecting appropriate ratios is crucial to increasing the prediction success rate. The objective of this paper is to apply statistical methods to survey and analyze financial data in order to develop a simplified model for interpretation. This study aims to develop a model for classifying stocks into two categories (good or poor), based on their rate of return. A company's stock is classified as good if its share returns perform above the market returns provided by the National Stock Exchange composite index of India; i.e., the NIFTY.

In this study, the logistic regression (LR) method has been used to classify selected companies, based on their performance. The LR method is used to predict the probability of good stock performance by fitting the variables to a logistic curve. Thus, LR is used to classify observations, described by a set of independent variables, into two or more mutually exclusive categories. It involves finding a linear combination of independent variables that reflects large differences in group means.

2. REVIEW OF LITERATURE

In stock performance literature, little attention has been given in the past to the Indian stock market. In recent years, however, there has been a greater focus on the market because of its rapid growth and its increasing potential for global investors. In light of the market's growing importance, more attention has been directed to studies concerning different classification techniques for measuring stock performance. A number of research papers predict stock performance as well as pricing of the stock index across the globe. Harvey [1995] observes that emerging market returns are usually more predictable than developed market returns because emerging market returns are more likely to be influenced by local information than developed markets. In recent literature, artificial neural networks (ANN) have been successfully used for modeling financial time series [Cheng, 1996; Van and Robert, 1997]. In the United States, several studies have examined the cross-sectional relationship between fundamental variables and stock returns. Fundamental variables such as earnings yield, cash flow yield, book-to-market ratio, and size are demonstrated to have some power in predicting stock returns [Fama and French, 1992].
Studies based on European markets also demonstrate similar findings. Ferson and Harvey [1993] observe that returns are predictable, to an extent, across a number of European markets (e.g., UK, France, and Germany). Jung and Boyd [1996], in their study of forecasting UK stock prices, suggest that the predictive strength of their stock performance models is quite significant. In the Japanese stock market, studies carried out by Jaffe and Westerfield [1985] and Kato et al. [1990] also demonstrate some evidence of predictability in the behavior of index returns.

Logistic regression (LR), which is helpful for predicting the presence or absence of a characteristic or outcome based on values of a set of predictor variables, is a multivariate analysis model [Lee, 2004]. LR has been applied repeatedly in the areas of corporate finance, banking, and investments. Multivariate discriminant analysis (MDA) has been used by many researchers for the default-prediction model. Altman [1968] was the pioneer in this work, whereas Ohlson [1980] later used LR to construct the default-prediction model. The early research on default prediction focuses on classifying firms as either defaulters or non-defaulters. Ohlson [1980] identifies this assumption of default prediction as an equal payoff state. Clearly, misclassifying a defaulted firm as a non-defaulted firm would have repercussions that are more severe for an investor or a loan officer than would be true in the opposite case. This research focuses, therefore, on the ability of the models to accurately rank defaulted and non-defaulted firms, based on their default probability.

Models predicting financial distress and bankruptcy have been widely applied as evaluation models providing credit-risk information. Ohlson [1980] used LR for this purpose and was followed by several authors such as Zavgren [1985]. Zmijewski [1984] followed the same approach using probit analysis. Öğüt and Aktaş [2009] found that data-mining techniques (ANN and SVM) are better suited to detect stock-price manipulation than multivariate statistical techniques such as discriminant analysis or LR, because the classification accuracy of data-mining techniques is better than that of multivariate techniques.

Min and Jeong [2009] proposed a new binary classification method for predicting corporate failure based on a genetic algorithm and validated its prediction power through empirical analysis. They compared its prediction accuracy with other methods such as multi-discriminant analysis, logistic regression, decision tree, and artificial neural network, and showed that the binary classification method they proposed can serve as a promising alternative to existing methods for bankruptcy prediction. Bildirici and Ersin [2009] proposed an ANN-APGARCH model to increase the forecasting performance of the APGARCH model. The ANN-extended versions of the GARCH models improved forecast results.

Mostafa [2010] showed that neuro-computational models are useful tools in forecasting stock exchange movements in emerging markets. The results also indicated that the quasi-Newton training algorithm produces fewer forecasting errors than other training algorithms. Because of the robustness and flexibility of modeling algorithms, neuro-computational models are expected to outperform traditional statistical techniques such as regression and ARIMA in forecasting price movements on stock exchanges. Li et al. [2010] used LR as a comparative method in order to build a better model for predicting stock returns effectively and efficiently. A 30-times holdout method was used in the assessment, along with the two commonly used methods in the top 10 data mining algorithms (the support vector machine and k-nearest neighbor) and the two baseline benchmark methods from the statistical area (MDA and LR). Li and Sun [2011] observed that multiple classifiers outperform single classifiers in terms of prediction accuracy and returns on investment. They showed that there is no significant difference between majority voting and bagging in prediction accuracy, but that the former has a better prediction accuracy for stock returns than the latter. Finally, the homogeneous multiple classifiers using neural networks by majority voting perform best when predicting stock returns. The two classical statistical methods (MDA and logit) have assumed a key role in the area of business failure prediction (BFP).

Chen [2011] carried out studies at the Taiwan Stock Exchange Corporation (TSEC) to improve the accuracy of the financial distress prediction model and collected 100 listed companies as the initial sample. The empirical experiment included 37 ratios comprising financial and other non-financial ratios, and used principal component analysis (PCA) to extract suitable variables.
Decision tree (DT) classification methods (C5.0, CART, and CHAID) and LR techniques were used to implement the financial distress prediction model. The experiments produced a satisfying result, verifying the possibility and validity of the proposed methods for the financial distress prediction of listed companies. Guresen et al. [2011] evaluated the effectiveness of neural network models, which are known to be dynamic and effective in stock market predictions. The models analyzed are multi-layer perceptron (MLP), the dynamic artificial neural network (DAN2), and hybrid neural networks that use generalized auto-regressive conditional heteroscedasticity (GARCH) to extract new input variables. The comparison for each model is presented from two viewpoints.

Swiderski et al. [2012] demonstrated a new approach to the automatic assessment of the financial condition of a company and developed a computerized classification system, applying WOE representation of data and LR, and using a support vector machine (SVM) as the final classifier. The applied method is a combination of a classical binary scoring approach and SVM classification. The application of this method to the assessment of the financial condition of companies, classified into five classes, has shown its superiority with respect to classical approaches.

In predictions made with MDA, such as the classification of default and non-default firms by Altman [1968] and subsequent researchers, it was assumed that the groups were of similar size. It was shown that the number of non-default firms was never more than twice the number of default firms. However, default or bankruptcy being a rare event, a very high proportion of the non-defaulters was excluded from the analysis.

Besides being used to predict corporate fiascos, ratios are also used for scaling or grouping industries according to the degree of risk. Horrigan [1965] found financial ratios to be successful predictors for bond rating. Metnyk and Mathur [1972] used ratios to classify corporations into similar risk groups and attempted to relate them to the companies' market rates of return, but they did not report favorable results. Conner [1973] studied five ratios, namely (1) total liabilities to net worth, (2) working capital to sales, (3) cash flow to number of common shares, (4) earnings per share to price per share, and (5) current liabilities to inventory, but found them to be poor indicators of return on common stock.
Different methodologies and financial ratios are used by various authors to classify the performance of firms. Kumar and Ravi [2007] carried out a comprehensive review of various work related to bankruptcy prediction problems and found that the neural network is the most widely used technique, followed by statistical models. McConnell, Haslem, and Gibson [1986] have indicated that qualitative data can provide additional information to forecast stock price performance more accurately.

The LR technique yields coefficients for each independent variable based on a sample of data [Huang, Chai, and Peng, 2007]. Logistic regression models (LRM) with two or more explanatory variables are widely used in practice [Haines et al., 2007]. The parameters of the LR model are commonly estimated by maximum likelihood [Pardo, Pardo, and Pardo, 2005]. The advantage of LR is that, through the addition of an appropriate link function to the usual linear regression model, the variables may be either continuous or discrete, or any combination of both types, and they do not necessarily have normal distributions [Lee, 2004]. The predicted values from the analysis can be interpreted as probabilities of a 0 or 1 outcome, or as membership in the target groups (categorical dependent variables). It has been observed that the probability of a 0 or 1 outcome is a non-linear function of the logit [Nepal, 2003].

Logistic regression is useful for situations in which it is required to predict the presence or absence of a characteristic or outcome based on values of a set of predictor variables. Logistic regression is similar, therefore, to a linear regression model, but is suited to models where the dependent variable is dichotomous. Logistic regression coefficients can be used to estimate odds ratios for each of the independent variables in the model. Logistic regression helps to form a multivariate regression between a dependent variable and several independent variables [Lee, Ryu and Kim, 2007]. It is designed to estimate the parameters of a multivariate explanatory model in situations where the dependent variable is dichotomous, and the independent variables are continuous or categorical.

Existing literature indicates that LR has rarely been used to build a model for predicting out-performing shares. Logistic regression has been used mostly for predicting financial distress and business failure. It has not been used for predicting share performance in India. As a destination for equity investment, India is a top performing emerging market.
In this context, the present study will provide useful information to shareholders and potential investors to enable them to make good decisions regarding investments.

3. RESEARCH OBJECTIVE AND METHODOLOGY

In this study, the relation between financial ratios and stock performance of the firms has been analyzed with the help of binary logistic regression. The earlier studies mentioned above have generally indicated that logistic regression, as used in the finance discipline, can be an effective tool for decision makers. It has also been recognized that financial ratios can enhance an investor's stock price forecasting ability. The objective of this study is to build a model using financial ratios of the firms for the purpose of predicting out-performing shares in the Indian stock market. This study aims, therefore, to answer two questions: (1) Can the yields of stocks be explained with the help of financial ratios? (2) Can we analyze stock yields using a logistic regression model? The study also examines the efficacy of ratios as predictors of stock performance.

Analysis of Model: Logistic Regression

Regression analysis is used to determine the magnitude of relationships between variables, to model those relationships, and to make predictions based on the models. Simple linear regression or multiple linear regression is applicable when this relationship is assumed to be linear [Davis, 2005]. However, a number of non-linear techniques could be used to obtain a more accurate regression if the relationship between variables is not linear in parameters. Logistic regression is preferred when the response variable can take only binary values (yes or no). The outcome of logistic regression is a function that describes how the probability of the event (yes or no) varies with the predictors [Tabachnick and Fidell, 2001]. Logistic regression can predict the likelihood, or the odds ratio, of the outcome based on the predictor variables, or covariates. The significance of logistic regression can be evaluated by the log likelihood test, given as the model chi-square test, evaluated at the p < 0.05 level, or the Wald statistic. Logistic regression has the advantage of being less affected than discriminant analysis when the normality of the variable cannot be assumed. It has the capacity to analyze a mix of all types of predictors [Hair, 1995].
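The relationship between probability, odds, log odds, and the odds ratio mentioned above can be illustrated with a minimal sketch. The function names are illustrative choices and do not come from the study.

```python
import math

def odds(p: float) -> float:
    """Odds of an event that has probability p (requires 0 < p < 1)."""
    return p / (1.0 - p)

def logit(p: float) -> float:
    """Natural log of the odds, the quantity that logistic regression models."""
    return math.log(odds(p))

def inverse_logit(z: float) -> float:
    """Map a log-odds value z back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(beta: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in a predictor
    whose logistic regression coefficient is beta."""
    return math.exp(beta)

p = 0.75
print(odds(p))                   # 3.0, i.e., odds of three to one
print(inverse_logit(logit(p)))   # recovers 0.75 up to floating-point rounding
```

A coefficient of zero gives an odds ratio of 1 (no effect on the odds); positive coefficients give odds ratios above 1, and negative coefficients give odds ratios below 1.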
Logistic regression, which assumes the errors are drawn from a binomial distribution, is formulated to predict and explain a binary categorical variable instead of a metric measure. In logistic regression, the dependent variable is a log odds or logit, which is the natural log of the odds. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. Generally, the dependent or response variable is dichotomous, such as presence/absence or success/failure. In instances where the independent variables are categorical, or a mix of continuous and categorical, logistic regression is preferred.

Since the probability of an event must lie between 0 and 1, it is unrealistic to model probabilities with linear regression techniques, because the linear regression model allows the dependent variable to take values greater than 1 or less than 0. The logistic regression model is a type of generalized linear model that extends the linear regression model by linking the range of real numbers to the 0-1 range. In the logistic regression model, the relationship between z and the probability of the event of interest is described by this link function:

$$p_i = \frac{e^{z_i}}{1 + e^{z_i}} = \frac{1}{1 + e^{-z_i}}$$

Figure 1. Logistic Regression Model. Here the y-axis is the predicted variable $p_i$ and the horizontal axis denotes the explanatory variable $z_i$.

or

$$z_i = \log\left(\frac{p_i}{1 - p_i}\right)$$

where $p_i$ is the probability that the $i$th case experiences the event of interest, and $z_i$ is the value of the unobserved continuous variable for the $i$th case. The z value is the log odds (logit). It is expressed by

$$z_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip}$$

where $x_{ij}$ is the $j$th predictor for the $i$th case, $\beta_j$ is the $j$th coefficient, and $p$ is the number of predictors. The βs are the regression coefficients, which are estimated through an iterative maximum likelihood method.

Logistic regression analysis does not require restrictive assumptions regarding the normal distribution of independent variables, equal dispersion matrices, or the prior probabilities of failure [Ohlson, 1980; Zavgren, 1985]. Rather, logistic regression is based on two assumptions: (1) it requires the dependent variable to be dichotomous, with the groups being discrete, nonoverlapping, and identifiable; and (2) it considers the cost of type I and type II error rates in the selection of the optimal cut-off probability. However, because of the subjectivity of the choice of these misclassification costs in practice, most researchers minimize the total error rate and, hence, implicitly assume equal costs of type I and type II errors [Ohlson, 1980; Zavgren, 1985].

Application of Logistic Regression

We begin this section with a discussion of data sources. Companies with large market capitalizations were considered, most of which are part of the NIFTY index. The financial data used in

this analysis was collected from the Web. The sample of the study was drawn from the 30 companies that are most actively traded on the Indian stock exchange, as given in Appendix 2. Financial ratios and stock prices for calculating return were then collected. In this research, a sample period consisting of four years was selected for classification purposes.

For the purpose of carrying out logistic regression analysis, a method is first required for classifying a company as a good or poor investment choice for a given year. Although there is no definitive method for defining a market investment as good or poor, in this study we use a method that is simple and objective: if the return on a company's stock over a given year rose above the market return, it is classified as a good investment option; otherwise, it is classified as a poor investment option. Here, the NIFTY (Index of National Stock Exchange) return has been taken as a proxy for market return. To obtain the return at the end of each financial year, the March ending prices were used for each year. The return was calculated using the following formula:

$$\text{Return of stock} = \frac{P_t - P_{t-1}}{P_{t-1}} \times 100$$

where $P_t$ is the price at year $t$ and $P_{t-1}$ is the price at year $t-1$.

$$\text{Market return} = \frac{\text{NIFTY}(t) - \text{NIFTY}(t-1)}{\text{NIFTY}(t-1)} \times 100$$

Similarly, NIFTY(t) is the NIFTY at year t, and NIFTY(t-1) is the NIFTY at year (t-1).

The sample in this study is based on the selection of 30 companies for a four-year period (2005 through 2008). The study consists of a sample size of 118 distinct company-year observations. As discussed, we have used a dependent variable with two categories (good or poor) and six independent variables. Initially, 16 financial ratios were taken for analysis. A normality test was conducted on all these explanatory variables. The results of the test are summarized and presented in Table 1, which shows that six variables are normal. The normality test was used to give a better prediction result. The table also shows that the P-value for all six variables is greater than 0.05, which implies that these variables are normal.

Table 1. One-Sample Kolmogorov-Smirnov Test
Columns: % Increase in Net Sales; Price/Cash Earning per Share; Book Value; Earning per Share; PBIDT/Sales; Sales/Net Assets
Rows: N; Normal Parameters (Mean, Std. Deviation); Most Extreme Differences (Absolute, Positive, Negative); Kolmogorov-Smirnov Z; Asymp. Sig. (2-tailed)

The variables were also tested using a Q-Q plot, as shown in Appendix 1. The variables that were not normal were not considered for further analysis. The six independent variables considered for final analysis are presented in Table 1. The six ratios are mostly valuation ratios, which generally determine the value of a share in the stock market. The dependent variable or outcome is dichotomous and, hence, has been rated GOOD = 1 and POOR = 0 to signify the investment choice. Out of the 118 samples, 68 have been classified as poor and 50 as good.

Table 2. Dependent Variables
Type of Company (based on stock market return)
GOOD    Return above market return; i.e., NIFTY
POOR    Return below market return; i.e., NIFTY

Table 3. Dependent Variable Encoding
Original Value    Internal Value
Poor              0
Good              1

Table 4. Independent Variables
Name of the Variable    Description of the Variable
NS                      Percentage Increase in Net Sales
CEPS                    Cash Earnings per Share
BV                      Book Value
PECEPS                  Price/Cash Earnings per Share
PE                      Price/Earning
PBIDTS                  Profit Before Interest Depreciation and Tax/Sales
SNA                     Sales/Net Assets
PEBV                    Price/Book Value
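The labeling rule described above can be sketched as follows. This is a minimal illustration, assuming the return is the simple percentage change between consecutive March-end prices; the function names and the prices are hypothetical and are not data from the study.

```python
def annual_return(price_now: float, price_prev: float) -> float:
    """Percentage change between two consecutive year-end prices."""
    return (price_now - price_prev) / price_prev * 100.0

def label_stock(stock_return: float, market_return: float) -> int:
    """Return 1 (GOOD) if the stock return is above the market return,
    otherwise 0 (POOR), matching the encoding GOOD = 1 and POOR = 0."""
    return 1 if stock_return > market_return else 0

# Hypothetical March-end values, for illustration only.
stock_ret = annual_return(price_now=120.0, price_prev=100.0)      # stock up 20%
market_ret = annual_return(price_now=5500.0, price_prev=5000.0)   # index up 10%
print(label_stock(stock_ret, market_ret))  # 1, i.e., GOOD
```

A stock whose return exactly equals the market return falls in the POOR class here, following the "otherwise" branch of the rule.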

4. EMPIRICAL RESULT AND ANALYSIS

The estimated results of the logistic regression model of the stock price return performance, for the whole sample, are summarized in Table 5. The final logistic regression equation is estimated by using maximum likelihood estimation for classifying a company:

Z = * NS * BV * CEPS + .069*PECEPS * PE * PBIDTS * SNA * PEBV,

where Z = log(p/(1 - p)), and p is the probability that the outcome is GOOD. With the above equation, it is possible to classify a company by calculating Z values. The p values can be obtained from the Z values as p = 1/(1 + e^(-Z)). If the p value is higher than 0.42, the stock is classified as good; if it is lower than 0.42, the stock is classified as poor.

Table 5 (Using SPSS). Variables in the Equation
Rows (Step 1a): NS, BV, CEPS, PECEPS, PE, PBIDTS, SNA, PEBV, Constant
Columns: B, S.E., Wald, df, Sig., Exp(B)
a. Variable(s) entered on step 1: NS, BV, CEPS, PECEPS, PE, PBIDTS, SNA, PEBV.

The ratio of B to S.E., squared, equals the Wald statistic. It provides the statistical significance of each estimated coefficient. If the logistic coefficient is statistically significant, we can interpret it in terms of how it impacts the estimated probability and thus the prediction of group membership. Several authors have identified problems with the use of the Wald statistic. Menard [1995] warns that, for large coefficients, standard error is inflated, lowering the Wald statistic (chi-square) value. Agresti [1996] states that the likelihood-ratio test is more reliable for small sample sizes than the Wald test. Maximization of Wald statistics indicates minimizing the standard error of the corresponding parameter. Wald statistics provide the significance test of the β coefficients.

Classification Accuracy

The following classification table helps to assess the performance of the model by cross-tabulating the observed response categories with the predicted response categories. For each case, the predicted response is the category treated as 1, if that category's predicted probability is greater than the user-specified cutoff. The cutoff value is taken at 0.5.

Table 6. Classification Table(a)
Observed Perf (POOR, GOOD) cross-tabulated against Predicted Perf (POOR, GOOD), with Percentage Correct per class and Overall Percentage.
a. The cut value is .410

This table shows the comparison of the observed and the predicted performance of the companies and the degree of their prediction accuracy. It also shows the degree of success of the classification for this sample. The number and percentage of cases correctly classified and misclassified are displayed. It is clear from this table that the poor companies have a 75% correct classification rate, whereas good companies have a 74% correct classification rate. Overall, correct classification was observed in 74.6% of original grouped cases.
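The arithmetic behind a classification table can be sketched as below. The helper names are illustrative. The cell counts in the example are not taken from Table 6; they are inferred from the reported figures (75% of the 68 POOR cases is 51, and 74% of the 50 GOOD cases is 37) and should be read as an assumption.

```python
def classification_table(observed, predicted_prob, cutoff):
    """Cross-tabulate observed classes (0 = POOR, 1 = GOOD) against the
    classes predicted when a probability above the cutoff means GOOD."""
    table = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
    for obs, prob in zip(observed, predicted_prob):
        pred = 1 if prob > cutoff else 0
        table[(obs, pred)] += 1
    return table

def percent_correct(table):
    """Return (POOR % correct, GOOD % correct, overall % correct)."""
    poor_total = table[(0, 0)] + table[(0, 1)]
    good_total = table[(1, 0)] + table[(1, 1)]
    correct = table[(0, 0)] + table[(1, 1)]
    return (100.0 * table[(0, 0)] / poor_total,
            100.0 * table[(1, 1)] / good_total,
            100.0 * correct / (poor_total + good_total))

# Keys are (observed, predicted). Counts inferred from the reported rates,
# not read from the original table.
implied = {(0, 0): 51, (0, 1): 17, (1, 0): 13, (1, 1): 37}
poor_pct, good_pct, overall_pct = percent_correct(implied)
print(round(poor_pct, 1), round(good_pct, 1), round(overall_pct, 1))  # 75.0 74.0 74.6
```

Under these inferred counts, 88 of the 118 cases are classified correctly, which reproduces the reported overall rate of 74.6%.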

The plot of the distribution of the firms against the probability is shown above. The graph shows another method to evaluate right and wrong predictions by plotting POOR (P) and GOOD (G) status. The cutoff probability for the decision taken is 0.42 (or 42%). Thus, using this cutoff value, any company whose score is higher than 0.42 would be predicted to be a good performing company, and any company with a score less than 0.42 would be classified as a poor performing company. However, there may be times when one would want to adjust this cutoff value. Neter et al. [1996] suggest three ways to select a cutoff value for predicting: (1) use the standard 0.5 cutoff value; (2) determine a cutoff value that will give the best predictive fit for the sample data, which is usually determined through trial and error; and (3) select a cutoff value that will separate the sample data into a specific proportion of the two states, based on a prior known proportion split in the population.
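The trial-and-error approach to choosing a cutoff can be sketched as a simple search over candidate values. This is an illustration under stated assumptions: the function names, the observed classes, the predicted probabilities, and the candidate grid are all hypothetical, and in-sample accuracy is used as the measure of predictive fit.

```python
def accuracy_at(observed, predicted_prob, cutoff):
    """Share of cases classified correctly when a predicted probability
    above the cutoff is labeled GOOD (1) and anything else POOR (0)."""
    hits = sum(1 for obs, prob in zip(observed, predicted_prob)
               if (1 if prob > cutoff else 0) == obs)
    return hits / len(observed)

def best_cutoff(observed, predicted_prob, candidates):
    """Return the candidate cutoff with the highest in-sample accuracy.
    When several candidates tie, the earliest one in the list is returned."""
    return max(candidates, key=lambda c: accuracy_at(observed, predicted_prob, c))

# Hypothetical observed classes and predicted probabilities.
observed = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.10, 0.20, 0.35, 0.40, 0.45, 0.55, 0.70, 0.90]
candidates = [i / 100 for i in range(5, 100, 5)]   # 0.05, 0.10, ..., 0.95

cutoff = best_cutoff(observed, probs, candidates)
print(cutoff, accuracy_at(observed, probs, cutoff))
```

A cutoff tuned on the same sample that was used to fit the model tends to overstate accuracy, so a holdout sample is a common safeguard when enough data is available.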

Tests of Goodness of Fit

The Hosmer-Lemeshow [1989] goodness of fit test is well known when data are obtained from a simple random survey. The procedure involves grouping the observations based on the expected probabilities and then testing the hypothesis that the difference between expected and observed events is approximately zero for all the groups. Hosmer and Lemeshow [1989] proposed a statistic that they show, through simulation, is distributed as chi-square when there is no replication in the subpopulations. This test is available only for binary response models. The Hosmer-Lemeshow [1989] statistic evaluates the goodness of fit by creating 10 ordered groups of subjects and then comparing the number actually in each group (observed) to the number predicted by the logistic regression model (predicted). Thus, the test statistic is a chi-square statistic with a desirable outcome of non-significance, indicating that the model prediction does not significantly differ from the observed.

The present study also estimated the Hosmer and Lemeshow statistic, which provides useful information about the calibration of the model. The observed significance level for the chi-square value (Hosmer and Lemeshow test) indicates acceptance of the null hypothesis of the model, meaning there is not much difference between observed and predicted values. This result shows that the model appears to fit the data reasonably well. The chi-square value (10.737) of this model at the 0.01 significance level indicates that logistic regression is very meaningful, in that the dependent variable is related to the specified independent variables.

Table 7 (Using SPSS). Hosmer and Lemeshow Test
Columns: Step, Chi-square, df, Sig.
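The grouping procedure described above can be sketched in a few lines. This is a simplified illustration rather than the SPSS implementation: it assumes groups are formed as near-equal slices of the cases sorted by predicted probability, it returns only the chi-square statistic and its degrees of freedom (number of groups minus 2), and the function name is illustrative.

```python
def hosmer_lemeshow(observed, predicted_prob, groups=10):
    """Return (chi-square statistic, degrees of freedom) for a
    Hosmer-Lemeshow style goodness-of-fit check.

    observed holds 0/1 outcomes; predicted_prob holds model probabilities.
    Assumes no group has a mean predicted probability of exactly 0 or 1.
    """
    n = len(observed)
    if n < groups:
        raise ValueError("need at least one case per group")
    # Order the cases by predicted probability, then cut into groups.
    pairs = sorted(zip(predicted_prob, observed))
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected = sum(prob for prob, _ in chunk)   # expected number of events
        actual = sum(obs for _, obs in chunk)       # observed number of events
        mean_prob = expected / size
        statistic += (actual - expected) ** 2 / (size * mean_prob * (1.0 - mean_prob))
    return statistic, groups - 2

# Hypothetical, well-calibrated toy data split into two groups.
probs = [0.25] * 4 + [0.75] * 4
outcomes = [0, 0, 0, 1, 1, 1, 1, 0]
print(hosmer_lemeshow(outcomes, probs, groups=2))
```

A small statistic relative to the chi-square distribution with the returned degrees of freedom corresponds to the desirable non-significant result; looking up the significance level requires a chi-square table or a statistics library.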

The omnibus tests are the measures of how well the model performs. They test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall.

Table 8 (Using SPSS). Omnibus Tests of Model Coefficients
Rows (Step 1): Step, Block, Model
Columns: Chi-square, df, Sig.

If the step were to remove a variable, the exclusion makes sense if the significance of the change is large (i.e., greater than 0.10). If the step were to add a variable, the inclusion makes sense if the significance of the change is small (i.e., less than 0.05).

5. CONCLUSION

This study used the binary logistic regression model to determine the factors that significantly affect the performance of a company in the stock market. The binary logistic regression method helps the investor to form an opinion about the shares to invest in. It may be observed that eight financial ratios can classify companies, with up to 74.6% accuracy, into two categories (good or poor), based on their rate of return. The eight financial ratios are:

Percentage change in net sales (NS)
Sales/net assets (SNA)
Price/cash earnings per share (PECEPS)
Price/book value (PEBV)
Price/earnings per share (PE)
PBIDT/sales (PBIDTS)
Cash earnings per share (CEPS)
Book value (BV)

When evaluated from the investors' point of view, we conclude that it is possible to predict out-performing shares by examining these ratios. Various methods are available for processing data for analysis, but in this study, we conclude that ratio methods have the capability to reveal maximum information content if the variables are chosen very carefully with regard to the purpose at hand. Ratios enjoy remarkable simplicity and, in spite of the problem of multicollinearity, the information they reveal is so direct to a particular decision-control situation that movements of a ratio give a picturesque representation of the movement of an actual business process.

In this study, data for 12 months were taken into consideration, and, at the end of the 12th month, share prices were compared with those of the previous year to determine performance. In further studies, data for each three-month period can be used, and different criteria can be defined for evaluating stock performance. This study used financial ratios as the only factor affecting share prices, but various other economic and management factors may also influence share prices. McConnell, Haslem, and Gibson [1986] have shown that qualitative data can provide additional information to forecast stock price performance more accurately. Further studies can use qualitative data to improve forecasting ability.

In the current study, only logistic regression was considered to build the model. Therefore, for further development, this study proposes to investigate and use other approaches, such as the genetic algorithm and the rough set approach, to increase the prediction accuracy.
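The year-over-year comparison used to determine performance can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the rule that a stock is "good" when its annual return exceeds a benchmark return (for example, the return on a market index such as NIFTY) is an assumed labeling criterion.

```python
def annual_return(price_now, price_year_ago):
    """Rate of return over 12 months from two closing prices."""
    if price_year_ago <= 0:
        raise ValueError("price_year_ago must be positive")
    return (price_now - price_year_ago) / price_year_ago


def label_performance(stock_return, benchmark_return):
    """Assumed rule: GOOD if the stock beats the benchmark, else POOR."""
    return "GOOD" if stock_return > benchmark_return else "POOR"
```

The same two functions would apply unchanged to three-month periods, as suggested for further studies, by passing prices three months apart and a quarterly benchmark return.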

Appendix 1

Appendix 2
Sample Data Set (118 Observations)

Year  Perf  Company  NS  EPS  BV  PECEPS  PBIDTS  SNA
2008  POOR  Tata motor
      POOR  Tata motor
      GOOD  Tata motor
      POOR  Tata motor
      GOOD  Tata Steel
      POOR  Tata Steel
      POOR  Tata Steel
      POOR  Tata Steel
      POOR  TCS
      GOOD  TCS
      POOR  TCS
      GOOD  Sterlite
      GOOD  Sterlite
      GOOD  Sterlite
      POOR  Sterlite
      GOOD  Tata Power
      POOR  Tata Power
      POOR  Tata Power
      POOR  Tata Power
      POOR  Satyam
      POOR  Satyam
      GOOD  Satyam
      POOR  Satyam
      GOOD  SBI
      POOR  SBI
      POOR  SBI
      POOR  SBI
2008  GOOD  Reliance Industries
      GOOD  Reliance Industries
      GOOD  Reliance Industries
      POOR  Reliance Industries
      GOOD  Reliance Energy
      POOR  Reliance Energy
      POOR  Reliance Energy
      POOR  Reliance Energy
      POOR  ONGC
      POOR  ONGC
      POOR  ONGC
      POOR  ONGC
      GOOD  NTPC
      POOR  NTPC
      POOR  NTPC
      POOR  NTPC
      POOR  Maruti
      POOR  Maruti
      GOOD  Maruti
      POOR  Maruti
      POOR  Mahindra & Mahindra
      GOOD  Mahindra & Mahindra
      GOOD  Mahindra & Mahindra
      POOR  Mahindra & Mahindra
      GOOD  L&T
      GOOD  L&T
      GOOD  L&T
      GOOD  L&T
2008  POOR  Jaiprakash
      POOR  Jaiprakash
      POOR  Jaiprakash
      POOR  Jaiprakash
      POOR  Infosys
      GOOD  Infosys
      POOR  Infosys
      POOR  Infosys
      GOOD  ITC
      POOR  ITC
      GOOD  ITC
      GOOD  ITC
      POOR  ICICI
      GOOD  ICICI
      POOR  ICICI
      POOR  ICICI
      GOOD  HDFC
      GOOD  HDFC
      GOOD  HDFC
      POOR  HDFC
      GOOD  HDFC BANK
      GOOD  HDFC BANK
      POOR  HDFC BANK
      POOR  HDFC BANK
      GOOD  Hindalco
      POOR  Hindalco
      POOR  Hindalco
      POOR  Hindalco
      POOR  Bharti Airtel
      GOOD  Bharti Airtel
      GOOD  Bharti Airtel
      GOOD  Grasim
      POOR  Grasim
      GOOD  Grasim
      POOR  Grasim
2008  GOOD  BHEL
      POOR  BHEL
      GOOD  BHEL
      POOR  BHEL
      POOR  Sun Pharma
      GOOD  Sun Pharma
      GOOD  Sun Pharma
      POOR  Sun Pharma
      GOOD  SAIL
      GOOD  SAIL
      POOR  SAIL
      GOOD  SAIL
      POOR  Dr Reddy
      POOR  Dr Reddy
      GOOD  Dr Reddy
      POOR  Dr Reddy
      POOR  Wipro
      POOR  Wipro
      GOOD  Wipro
      POOR  Wipro
      GOOD  Asian Paints
      POOR  Asian Paints
      GOOD  Asian Paints
      POOR  Asian Paints
      POOR  Shree_Cement
      GOOD  Shree_Cement
      GOOD  Shree_Cement
      GOOD  Shree_Cement

Appendix 3
Evaluation Data Set (22 Observations)

Year  Perf  Company  NS  EPS  BV  PECEPS  PBIDTS  SNA
2008  GOOD  TATA Tea
      POOR  TATA Tea
      GOOD  TATA Tea
      POOR  TATA Tea
      GOOD  India Infoline
      GOOD  India Infoline
      GOOD  Unitech Ltd
      GOOD  Unitech
      GOOD  Unitech
      GOOD  Unitech
      POOR  Berger Paints
      POOR  Berger Paints
      GOOD  Berger Paints
      GOOD  Berger Paints
      POOR  Pidilite
      POOR  Pidilite
      GOOD  Pidilite
      POOR  Pidilite
      POOR  TVS Motors
      POOR  TVS Motors
      GOOD  TVS Motors
      POOR  TVS Motors

REFERENCES

Altman, E.I. Financial ratios, discriminant analysis, and the prediction of corporate bankruptcy, Journal of Finance 23.
Awales, George S. Jr. Another look at the President's letter to stockholders, Financial Analysts Journal, 71-73, March-April.
Bhattacharya, Hrishikes. Total Management by Ratios, 2nd ed., New Delhi, India: Sage Publications.
Bildirici, Melike, and Özgür Ömer Ersin. Improving forecasts of GARCH family models with the artificial neural networks: An application to the daily returns in Istanbul Stock Exchange, Expert Systems with Applications 36(4).
Connor, M.C. On the usefulness of financial ratios to investors in common stock, The Accounting Review.
Chen, Mu-Yen. Predicting corporate financial distress based on integration of decision tree classification and logistic regression, Expert Systems with Applications 38(9).
Cheng, W.; L. Wanger; and Ch. Lin. Forecasting the 30-year US treasury bond with a system of neural networks, Journal of Computational Intelligence in Finance 4.
Davis, D. Business Research for Decision-Making, 1st ed., Belmont, CA: Thomson Brooks/Cole.
Dutta, A., et al. Classification and Prediction of Stock Performance using Logistic Regression: An Empirical Examination from Indian Stock Market: Redefining Business Horizons: McMillan Advanced Research Series.
Fama, E., and K. French. Permanent and temporary components of stock prices, Journal of Political Economy 96.
Ferson, W.E., and C.R. Harvey. 1993. The risk and predictability of international equity returns, Review of Financial Studies 6.
Guresen, Erkam, et al. Using artificial neural network models in stock market index prediction, Expert Systems with Applications 38(8).
Haines, L.M., et al. D-optimal designs for logistic regression in two variables, mODa 8 - Advances in Model-Oriented Design and Analysis, Physica-Verlag HD.
Hair, J.F., et al. Multivariate Data Analysis, 6th ed.: Pearson Education, Inc.
Hair, J.F. 1995. Multivariate Data Analysis with Readings, 4th ed., Englewood Cliffs, NJ: Prentice Hall.
Harvey, C.R. Predictable risk and returns in emerging markets, The Review of Financial Studies 8.
Horrigan, James O. Some empirical bases of financial ratio analysis, The Accounting Review.
Hosmer, David, and Stanley Lemeshow. 1989. Applied Logistic Regression, John Wiley and Sons, Inc.
Huang, Q.; Y. Cai; and J. Peng. Modeling the spatial pattern of farmland using GIS and multiple logistic regression: A case study of Maotiao River Basin, Guizhou Province, China, Environmental Modeling and Assessment 12(1).
Jaffe, J.; D. Keim; and R. Westerfield. Earnings yields, market values, and stock returns, Journal of Finance 44.
Jaffe, J., and R. Westerfield. Patterns in Japanese common stock returns: Day of the week and turn of the year effects, Journal of Financial and Quantitative Analysis 20.
Jung, C., and R. Boyd. Forecasting UK stock prices, Applied Financial Economics 6.
Kamruzzaman, J., and R.A. Sarker. ANN-based forecasting of foreign currency exchange rates, Neural Information Processing - Letters and Reviews 3(2).
Kato, K.; W. Ziemba; and S. Schwartz. Day of the week effects in Japanese stocks, in: E. Elton and M. Gruber, eds., Japanese Capital Markets, New York: Harper & Row.
Kumar, P.R., and V. Ravi. Bankruptcy prediction in banks and firms via statistical and intelligent techniques: A review, European Journal of Operational Research 180: 1-28.
Lee, S. Application of likelihood ratio and logistic regression models to landslide susceptibility mapping using GIS, Environmental Management 34(2).
Lee, S.; J. Ryu; and L. Kim. Landslide susceptibility analysis and its verification using likelihood ratio, logistic regression, and artificial neural network models: Case study of Youngin, Korea, Landslides 4.
Li, Hui, et al. 2010. Predicting business failure using classification and regression tree: An empirical comparison with popular classical statistical methods and top classification mining methods, Expert Systems with Applications 37(8).
Li, Hui, and Jie Sun. 2011. Empirical research of hybridizing principal component analysis with multivariate discriminant analysis and logistic regression for business failure prediction, Expert Systems with Applications 38(5).
McConnell, Dennis; John A. Haslem; and Virginia R. Gibson. 1986. The president's letter to stockholders: A new look, Financial Analysts Journal.
Menard, Scott. Applied Logistic Regression Analysis, Sage Publications. Series: Quantitative Applications in the Social Sciences.
Metnyk, Z.L., and Iqbal Mathur. Business risk homogeneity: A multivariate application and evaluation, Proceedings of the 1972 Midwest AIDS Conference.
Min, Jae H., and Chulwoo Jeong. A binary classification method for bankruptcy prediction, Expert Systems with Applications 36(3).
Öğüt, Hulisi, et al. 2009. Detecting stock-price manipulation in an emerging market: The case of Turkey, Expert Systems with Applications 36(9).
Mostafa, Mohamed M. Forecasting stock exchange movements using neural networks: Empirical evidence from Kuwait, Expert Systems with Applications 37(9).
Nepal, S.K. Trail impacts in Sagarmatha (Mt. Everest) National Park, Nepal: A logistic regression analysis, Environmental Management 32(3).
Neter, J.; W. Wasserman; C.J. Nachtsheim; and M.H. Kutner. Applied Linear Regression Models, 3rd ed., Chicago: Irwin.
Ohlson, J. Financial ratios and the probabilistic prediction of bankruptcy, Journal of Accounting Research 18.
Pardo, J.A.; L. Pardo; and M.C. Pardo. Minimum Φ-divergence estimator in logistic regression models, Statistical Papers 47.
Swiderski, Bartosz, et al. Multi-stage classification by using logistic regression and neural networks for assessment of financial condition of company, Decision Support Systems 52(2).
Tabachnick, B.G., and L.S. Fidell. Using Multivariate Statistics, 4th ed., Boston, Mass.: Allyn & Bacon.
Tsai, Chih-Fong, et al. Predicting stock returns by classifier ensembles, Applied Soft Computing 11(2).
Van, E., and J. Robert. The application of neural networks in the forecasting of share prices, Haymarket, VA, USA: Finance & Technology Publishing.
Zavgren, C. Assessing the vulnerability to failure of American industrial firms: A logistic analysis, Journal of Business Finance and Accounting 12(1).
Zmijewski, M.E. Methodological issues related to the estimation of financial distress prediction models, Journal of Accounting Research 22.

Web Links

ANON. Logistic regression. Wikipedia. [Online]. Available: [11th April 2009]
-risk+models:...-a papers.ssrn.com/sol3/delivery.cfm/ssrn_id485182_code pdf?abstractid=
FAFF, Robert. Investigating the performance of alternative default-risk models: option-based versus accounting-based approaches. Australian Journal of Management, Dec 1 [Online]. Available: -risk+models:...-a [11th April 2009]

ABOUT THE AUTHORS

Avijan Dutta is a faculty member in the Department of Management Studies, National Institute of Technology, Durgapur, India. He obtained his post-graduate degree in management from IIM-Ahmedabad, and received his Ph.D. from Jadavpur University. He has published several articles in leading journals. He was awarded the Silver Medal for the Best Research Paper at the Association of Indian Management Schools convention held at Hyderabad. He was also awarded 2nd place in the Best Case Writing competition at the AIMS Western Region conference. He has conducted in-company training programs for Indian companies such as Reliance and Sterlite Industries. His areas of research interest are capital markets and investment management.

Gautam Bandyopadhyay is an associate professor in the Department of Management Studies, National Institute of Technology, Durgapur. He has extensive experience in teaching and research activities. He received his Ph.D. from the Department of Mathematics at Jadavpur University. He is also a fellow member of the Institute of Cost & Works Accountants of India, and has presented several research papers at international conferences in India and elsewhere. He is the author of many research papers in peer-reviewed journals of national and international repute. He is now guiding several Ph.D. students, and has advised others who have completed their Ph.D.

Suchismita Sengupta is an associate professor at the IES Management College and Research Centre, Mumbai, India, and has 16 years' experience in teaching, research, and consultancy. She has an M.Com., an MBA, a master's degree in international business operations, and a Ph.D. in finance. She is actively involved in research and in the publication and review process of a few international journals. She has published nine papers in refereed national and international journals and has contributed book chapters published by Allied Publishers, Deep and Deep, and McMillan Advanced Research Series. She has provided consulting services to clients on business operations and has carried out various projects on a collaborative basis. She has experience in training and development, identification of training needs with competency mapping activities, and conducting training programs to enhance the efficiency of overall business operations. She has also reviewed research papers for foreign journals.