DISCRIMINANT ANALYSIS AND THE IDENTIFICATION OF HIGH VALUE STOCKS

Jo Vu (PhD)
Senior Lecturer in Econometrics
School of Accounting and Finance
Victoria University, Australia
Email: jo.vu@vu.edu.au

ABSTRACT

This study examines companies listed on the Australian Stock Exchange in an attempt to discriminate between stocks based on their potential value to shareholders. Using company characteristics and financial ratio data such as Market Capitalisation, Dividend per Share, Dividend Yield, Price/Earnings, Beta, Earnings per Share, Debt/Equity Ratio, Cash Flow, Book Value, Return on Assets and Return on Equity, a method of identifying high-value stocks is presented. Discriminant analysis is then used to predict which Australian stocks are likely to offer the greatest economic return for investors in the future.

Keywords: High-value Stocks, Stock Indicators/Measures, Value Investing, Discriminant Analysis.

I. INTRODUCTION

The last three decades have witnessed stock market booms and busts, economic prosperity and downturns, natural disasters, terrorist attacks, health scares and political upheavals. Most recently, the global financial crisis that began with the US sub-prime mortgage crisis wiped billions of dollars off the stock market and is considered one of the worst bear markets since the 1929 Stock Market Crash and the Great Depression. There is no doubt that stock markets have been through a bear market and are on the way to a slow recovery. Although the market seems more positive than it was 18 months ago, value investors remain wary and are not rushing back into the market, even at bargain prices. They would look closely at company balance sheets in an attempt to identify value stocks that are cheap relative to their assets.
The impact of the market downturn and uncertain economic times calls for new guidance in stock investment, so that investors' financial futures are protected and their retirements are financially comfortable. In response, investors, financial analysts and researchers have offered their insights into investment and the art of stock selection. Some investors favour companies with potential for growth and large profits in years to come, whereas others favour so-called value companies, with solid earnings and real assets at bargain prices.

But what does value investing actually mean? The traditional approach to value investing is associated with Benjamin Graham, who laid down its principles in the 1930s, and his well-known follower Warren Buffett. In Security Analysis, Graham and Dodd (2009) introduced the concept of intrinsic value, and in his classic book The Intelligent Investor, originally published in 1949, Graham defined value investing as the strategy of selecting stocks that trade for less than their intrinsic values (Graham, 2005). Following this philosophy, value investors distinguish between speculation and investment (Williams, 1938) and seek stocks that are underpriced, that is, stocks whose prices do not correspond with the company's fundamentals. In general, value stocks have sound balance sheets that show good financial health. Indicators such as Dividend per Share, Dividend Yield, Price/Earnings, Beta, Earnings, Debt/Equity Ratio, Cash Flow, Book Value, Return on Assets and Return on Equity are often examined to support an investor's decisions in equity markets. Typically, value investors select stocks with lower-than-average price-to-book or price-to-earnings ratios, or stocks with higher-than-average yields.

However, the big problem for value investing, as noted by Investopedia (an online investing education site), lies in the estimation of intrinsic value, as there is no single correct intrinsic value. Investopedia also points out that the very definition of value investing is quite subjective: given the same information, two investors can place different values on a company. Some investors look only at present assets and earnings; others base their strategies on future growth and cash flows. With such different methodologies, the key question is which financial indicators are most important in identifying high-value stock potential.
Hence, the objective is to determine whether stocks can be quantitatively identified as high-value from their characteristics and financial ratios. A discriminant model is developed in which a sample of one hundred Australian Stock Exchange (ASX) companies is analyzed to establish a function that best discriminates between companies in three mutually exclusive groups (low-, medium- and high-value stocks), and a further sample of fifty is used to predict group membership for cross-validation (total sample = 150). The reliability of the discriminant model is reflected in the accuracy of the group predictions based on the financial ratio data. The implications of the findings are discussed in relation to the benefits for investors in the Australian stock market who seek to maximize their investment return, and may also be relevant to overseas markets.

II. DATA SOURCES AND DEVELOPMENT OF THE MODEL

Data on ASX companies are obtained from the COMSEC website, which is the share trading division of the Commonwealth Bank of Australia Board (2010). The selection of the ASX companies is made on the basis of available data, as a sufficient number of stock measures is needed to obtain a reasonably complete set of determinant variables. A list of thirteen potentially useful variables (financial ratios) is compiled for analysis.

Figure 1 Characteristics of Company Profile

Variables (X_i)

Market Capitalisation - size of a publicly traded company. It is equal to the share price times the number of shares outstanding (shares that have been authorized, issued, and purchased by investors).

Dividend per Share (DPS) - payments made by a corporation to its shareholder members.

Dividend Yield - the average of the actual dividend over the last 12 months and the consensus projected dividend for the next 12 months, all divided by the current price.

Price/Earnings Ratio (P/E) - gives an indication of how much investors are willing to pay for each dollar's worth of earnings.

Price/Book Ratio (P/B) - the ratio of the current price per share to the book value per share.

Price Earnings Growth Ratio (PEG) - the ratio of the stock's P/E to its prospective earnings per share growth rate. A ratio of one is considered to represent fair value, and a ratio greater than one indicates a more "expensive" stock.

Beta - measures the sensitivity of the stock price to fluctuations of the market as a whole; this risk cannot be eliminated through diversification. A beta greater than one indicates greater volatility than the market.

Debt/Equity Ratio (D/E) - indicates what proportion of equity and debt the company is using to finance its assets. A high debt/equity ratio generally means that a company has been aggressive in financing its growth with debt.

Cash Flow statement - reflects a company's liquidity.

Earnings per Share - the amount of earnings per share of a company's stock.

Book Value - the net asset value of a company. The book value measures the value of the shareholders' ownership in the company, as measured by the last full-year balance sheet.

Return On Assets (ROA) - the ratio of earnings to total assets, a measure of a company's performance in terms of profitability; it also tells what earnings were generated from invested capital.

Return On Equity (ROE) - the amount of net income returned as a percentage of shareholders' equity. Return on equity measures a company's profitability by revealing how much profit a company generates with the money shareholders have invested.

The interaction of diverse variables whose relationships are unclear makes the prediction of stock price performance difficult. However, thirteen ratios (variables) commonly used in finance to determine a company's financial position are often used by investors and fund managers in predicting future earnings and stock price performance. As shown in Figure 1, twelve financial ratios (independent variables) are entered as potential predictors: Market Capitalisation; Dividend per Share; Dividend Yield (%); Price/Earnings (P/E); Price/Book (P/B); Price Earnings Growth (PEG); Beta; Debt/Equity (D/E); Cash Flow; Earnings per Share; Book Value; and Return On Assets (ROA). The thirteenth, Return on Equity (ROE), is the dependent variable; it is widely used by investors and financial analysts as one of the key performance measures.

Theoretically, discriminant analysis involves deriving a variate, the linear combination of independent variables that discriminates best between the objects (companies) in groups defined a priori (Hair et al., 2009; Johnson & Wichern, 2002). Here, the ASX company profiles are divided into three groups, High-Value, Medium-Value and Low-Value, based on their Return on Equity (ROE). Discrimination is achieved by calculating the variate's weights for each independent variable so as to maximize the difference between the company groups (i.e. the between-group variance relative to the within-group variance). The variate, also known as the discriminant function, is derived from an equation similar to multiple regression, and its general form is:

$$D = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_i X_i$$

Where:
D = the discriminant z score
\beta_0 = a constant/intercept
\beta_i = the discriminant coefficient (discriminant weight) for independent variable i
X_i = independent variable i

Discriminant function analysis is used to determine which of the twelve financial ratios (explanatory variables) discriminate between the three groups: High-Value, Medium-Value and Low-Value. The function provides a score that determines the group to which a case is predicted to belong. But first we need to perform a One-Way ANOVA to determine whether the differences in Return on Equity between the three groups are significant or could have occurred simply by chance.

Table 1 ANOVA (ROE)

                 Sum of Squares   df    Mean Square   F        Sig.
Between Groups   21731.929        2     10865.965     63.502   .000
Within Groups    16768.926        98    171.111
Total            38500.856        100
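The F statistic in Table 1 is the ratio of the between-groups mean square to the within-groups mean square. As a quick arithmetic check, the short Python sketch below recomputes the mean squares and F from the sums of squares and degrees of freedom reported in the table; the variable names are illustrative only.

```python
# Sums of squares and degrees of freedom copied from Table 1.
ss_between, df_between = 21731.929, 2
ss_within, df_within = 16768.926, 98

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The one-way ANOVA F statistic is the ratio of the two mean squares.
f_statistic = ms_between / ms_within

print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_statistic:.3f}")
```

The recomputed values agree with the Mean Square and F columns of Table 1 up to rounding.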

The analysis of variance (Table 1) shows that overall the group means differ significantly. That is, the ROE for 2009 differs significantly between ASX companies with low, medium and high levels of return (p-value = .000). A further test was conducted to determine the significance of the difference between the means for each pair of return levels (refer to Table 2).

Table 2 Multiple Comparisons (Tukey HSD); Dependent Variable: ROE

(I) Value   (J) Value   Mean Difference (I-J)   Standard Error   Sig.
Low         Medium      -10.037 *               2.925            .003
Low         High        -39.522 *               3.515            .000
Medium      Low          10.037 *               2.925            .003
Medium      High        -29.485 *               3.648            .000
High        Low          39.522 *               3.515            .000
High        Medium       29.485 *               3.648            .000

* The mean difference is significant at the .05 level.

Table 2 shows that the difference between each pair of ROE groupings is statistically significant (p-value = .003 for Low versus Medium and .000 for the other pairs). For each of the independent (predictor) variables above, significant differences exist between the groups. To determine the capacity of these variables to account for differences in company stock return, discriminant analysis is used to derive potential predictive functions. Discriminant analysis is then applied to predict which companies are likely to offer the greatest investment return for investors in the future.

Discriminant analysis requires a categorical dependent variable; in this case the investment return of each company is categorized into three groups: High (ROE>20), Medium (10<ROE<20) and Low (ROE<10). These groupings are an arbitrary selection, and more groups, or only two, could be used. The question of group categorisation is a practical one, and three groups provide a reasonable categorization. It is necessary to determine the characteristics of the company profile that are useful in discriminating between these three groups.
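The grouping rule can be written as a small function. The following Python sketch uses the thresholds given in the text; the treatment of values exactly equal to 10 or 20 is not specified in the paper, so placing them in the Medium group is an assumption made here, and the example ROE values are hypothetical.

```python
def roe_group(roe_percent: float) -> str:
    """Assign a value group from ROE (in per cent).

    Thresholds follow the text: High (ROE > 20), Low (ROE < 10).
    Assumption: values of exactly 10 or 20 are treated as Medium,
    because the paper does not say where the boundaries fall.
    """
    if roe_percent > 20:
        return "High"
    if roe_percent < 10:
        return "Low"
    return "Medium"


# Hypothetical ROE values, for illustration only (not the paper's data).
examples = [4.5, 10.0, 15.2, 20.0, 31.7]
groups = [roe_group(roe) for roe in examples]

for roe, group in zip(examples, groups):
    print(f"ROE {roe:5.1f}% -> {group}")
```

A different boundary convention would only change the treatment of companies whose ROE is exactly 10 or 20.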
The independent variables used in this study are: Market Capitalisation, Dividend per Share, Dividend Yield, P/E, P/B, P/E Growth (PEG), Beta, Debt/Equity Ratio, Cash Flow, Earnings per Share, Book Value and ROA, with ROE defining the groups. This list is not exhaustive of all possible measures, but it represents measures that are commonly collected, and so is a reasonable set of variables with which to test the applicability of the method. This is an exploratory study involving a large number of independent variables, and the objective is to identify which of them produce a parsimonious equation for predicting the ROE value group. A stepwise procedure is therefore used, as the goal is to identify the smallest set of independent variables that correctly classifies the largest number of cases on ROE. The stepwise approach involves entering the independent variables into the discriminant function one at a time on the basis of their discriminating power. By sequentially selecting the next best discriminating variable at each step, variables that are not useful in discriminating between the groups are eliminated, and a reduced set of variables is obtained. Note that the reduced set is almost as good as the complete set as long as the ratio of sample size to independent variables does not fall below 20:1.

Once the discriminant analysis has identified the variables that display significant differences between the three value groups, a test statistic is used to determine the overall discriminating power of the model. The statistic underlying Wilks' Lambda is the ratio of the between-groups sum of squares to the within-groups sum of squares:

$$\frac{\sum_{g=1}^{G} N_g \left[\bar{y}_g - \bar{y}\right]^2}{\sum_{g=1}^{G} \sum_{j=1}^{N_g} \left[y_{jg} - \bar{y}_g\right]^2}$$

Where:
G = total number of groups
g = group g, g = 1 to G
N_g = number of companies in group g
y_{jg} = score of company j in group g, j = 1 to N_g
\bar{y}_g = group mean (centroid)
\bar{y} = overall sample mean

When this ratio is maximized, the group means (centroids) of the three groups are spread as far apart as possible, while the dispersion of the individual points about their respective group means is reduced. Wilks' Lambda as reported in Tables 4 and 7 is the within-groups sum of squares as a share of the total, which for a single function equals 1/(1 + ratio), so maximizing the ratio corresponds to minimizing Wilks' Lambda. The centroids are located in the reduced space defined by the discriminant functions derived from the initial predictor variables. Differences in the location of these centroids show the dimensions along which the three groups differ. The centroid, or group mean, is calculated by averaging the discriminant scores of all the individual companies within a particular group:

$$\bar{y}_g = \frac{1}{N_g} \sum_{j=1}^{N_g} y_{jg}$$

Once the values of the discriminant coefficients are estimated, the discriminant score for each company is calculated and, based on this score, each company is assigned to one of the three groups: Low-, Medium- or High-Value. The assignments are made on the basis of the proximity of the company's score to the three group means (Figure 2).
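To make the centroid and sum-of-squares definitions concrete, the following Python sketch computes group centroids, the between-groups to within-groups ratio, and a nearest-centroid assignment. The discriminant scores are hypothetical toy values chosen for illustration, not the paper's data, and assignment by simple distance to the centroid is a simplified reading of the proximity rule described above.

```python
from statistics import mean

# Hypothetical discriminant scores per group (toy values, not the paper's data).
scores = {
    "Low": [-2.0, -1.5, -1.0],
    "Medium": [-0.5, 0.0, 0.5],
    "High": [2.0, 2.5, 3.0],
}

# Group centroid: the average discriminant score within each group.
centroids = {group: mean(values) for group, values in scores.items()}

# Overall mean across all companies.
grand_mean = mean(y for values in scores.values() for y in values)

# Between-groups sum of squares: group size times squared centroid deviation.
ss_between = sum(
    len(values) * (centroids[group] - grand_mean) ** 2
    for group, values in scores.items()
)

# Within-groups sum of squares: squared deviations from the own-group centroid.
ss_within = sum(
    (y - centroids[group]) ** 2
    for group, values in scores.items()
    for y in values
)

ratio = ss_between / ss_within


def nearest_group(score: float) -> str:
    """Assign a score to the group whose centroid is closest."""
    return min(centroids, key=lambda group: abs(score - centroids[group]))


print(centroids)
print(f"between/within ratio = {ratio:.3f}")
print(nearest_group(1.9), nearest_group(-0.2))
```

The larger the ratio, the further apart the centroids lie relative to the scatter within each group.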

[Figure 2: groups G1, G2 and G3 plotted against X1 and X2, with the discriminant axis D and regions of high, low and medium D values]

Finally, the discriminant function is assessed to determine how well it classifies each company into its a priori group, and whether it works equally well for each group of the dependent variable. For validation purposes, the total sample is divided into two subsamples, the analysis sample and the holdout sample. The analysis sample (n_1 = 100) is used to develop the discriminant function. The holdout sample (n_2 = 50) is used to test the discriminant function. This method of validating the function is referred to as split-sample validation or cross-validation.

III. EMPIRICAL RESULTS

Table 3 Box's Test of Equality of Covariance Matrices

Log Determinants
Value                  Rank   Log Determinant
Low                    4      18.612
Medium                 4      17.825
High                   4      20.799
Pooled within-groups   4      21.580

Test Results
Box's M       171.687
F   Approx.   7.756
    df1       20
    df2       10888.926
    Sig.      .000

Tests null hypothesis of equal population covariance matrices.

Table 3 shows the log determinants and Box's M value. Discriminant analysis assumes that the population covariance matrices are equal, and for this assumption to hold the log determinants should be equal. In this case, Box's M is significant (Box's M = 171.687, p < .001). However, with adequate sample sizes (n = 100) a significant value is often interpreted as not being of major importance, although unequal covariance matrices can still affect the accuracy of classification. Given that the sample size is large, we continue with the interpretation.
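The split-sample validation described above can be sketched in a few lines of Python. The paper does not state how companies were allocated to the analysis and holdout samples, so the random allocation, the fixed seed and the placeholder company identifiers below are all assumptions made for illustration.

```python
import random


def split_sample(ids, n_analysis, seed=0):
    """Randomly split ids into an analysis sample and a holdout sample.

    Assumption: random allocation with a fixed seed for reproducibility;
    the paper does not describe its allocation procedure.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled[:n_analysis], shuffled[n_analysis:]


# Placeholder identifiers standing in for the 150 ASX companies.
company_ids = list(range(150))

analysis, holdout = split_sample(company_ids, n_analysis=100)
print(len(analysis), len(holdout))
```

The discriminant function would then be estimated on `analysis` only and evaluated on `holdout`, so that the classification accuracy is not measured on the same cases used to fit the model.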

Table 4 Results of SPSS - Stepwise Discriminant Analysis on Four Variables

Step   Entered       Mean     Standard Deviation   Exact F Statistic   df1   df2   Sig.
1      ROA           13.887   11.848               36.649              2     63    .000
2      Earnings      69.464   67.905               19.549              4     124   .000
3      Book Value    5.197    4.689                17.971              6     122   .000
4      Debt/Equity   52.086   40.302               15.335              8     120   .000

At each step, the variable that minimizes the overall Wilks' Lambda is entered.

As mentioned previously, twelve financial ratios (variables) commonly used by investors and financial analysts as performance measures are considered as predictors. An F test is performed to test the discriminating power of each of the four variables retained: Return on Assets, Earnings, Book Value and Debt/Equity. All four are found to be significant at the .05 level, indicating significant differences in these variables between the three groups. The statistical information for the four significant variables is shown in Tables 4 and 5. In the Correlation Coefficients table the correlations between pairs of variables are not exceptionally high, i.e. there is no evidence of multicollinearity.

Table 5 Correlation Coefficients (Pearson Correlation)

              ROA      Earnings   Book Value   Debt/Equity
ROA           1.000    -.218      -.197        -.369
Earnings      -.218    1.000      .105         -.170
Book Value    -.197    .105       1.000        -.285
Debt/Equity   -.369    -.170      -.285        1.000

The Eigenvalues (Table 6) provide information regarding the two discriminant functions produced by the analysis. Note that only two discriminant functions are needed to separate three groups. The square of the canonical correlation provides an index of the overall model fit, which is interpreted as the proportion of variance explained (R-squared).

Table 6 Eigenvalues

Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          2.907 (a)    98.4            98.4           .863
2          .047 (a)     1.6             100.0          .211

a First 2 canonical discriminant functions were used in the analysis.

Table 6 shows that the first function has a canonical correlation of .863; since .863 squared is .7448, the first function explains 74.48% of the variation in the dependent variable Stock Groups, i.e. the discriminant function accounts for 74.48% of the between-group variability. The second function has a lower canonical correlation (.211) and explains only 4.45% of the variation in the dependent variable.

Table 7 Wilks' Lambda

Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1                     .245            86.625       8    .000
2                     .955            2.811        3    .422

Table 7 shows that discriminant function 1 is significant at the .05 level (p-value = .000). That means the function performs better than could be explained by chance. However, discriminant function 2 is not significant (p-value = .422). Hence, function 1 is used to predict the dependent variable, the ROE group. Wilks' Lambda is used here to test whether the differences between the groups are significant.

Table 8 Standardized Canonical Discriminant Function Coefficients

              Function
Debt/Equity   0.458
Earnings      1.062
Book Value    -0.664
ROA           0.902

The Standardized Canonical Discriminant Function Coefficients (Table 8) provide an index of the relative importance of the predictor variables, in the same way that standardized regression coefficients are interpreted. The larger the absolute value, the more important the variable; the sign indicates the direction of the relationship. Here, the function shows Earnings and ROA to be the most important variables, with standardized coefficients of 1.062 and .902 respectively. The next most important predictors of the ROE group are Book Value (absolute value .664) and the Debt/Equity ratio (.458). Hence, the most highly valued stocks are characterised by high earnings, high ROA and a low book value, which is interpreted here as under-pricing. The Debt/Equity coefficient is positive, so within this function a higher Debt/Equity ratio raises, rather than lowers, the discriminant score. Note also that, by comparing the book value to the company's market value, the book value can indicate whether a stock is under- or over-priced; the negative sign for book value (-.664) is read here as an indication that the stock is underpriced.

Table 9 Canonical Discriminant Function Coefficients

              Function
Debt/Equity   0.011
Earnings      0.017
Book Value    -0.143
ROA           0.110
(Constant)    -2.558
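The unstandardized coefficients in Table 9 define a scoring function that can be applied to any company for which the four ratios are available. The Python sketch below evaluates that function and assigns the nearest group centroid, using the centroid values reported in Table 10. The example company is hypothetical, its inputs are assumed to be measured in the same units as the paper's data, and nearest-centroid assignment is a simplification of the proximity rule described in the text (statistical packages may also weight by prior probabilities).

```python
# Unstandardized coefficients and constant from Table 9.
CONSTANT = -2.558
COEFFICIENTS = {
    "debt_equity": 0.011,
    "earnings": 0.017,
    "book_value": -0.143,
    "roa": 0.110,
}

# Group centroids from Table 10.
CENTROIDS = {"Low": -1.659, "Medium": -0.256, "High": 2.636}


def discriminant_score(company: dict) -> float:
    """Evaluate D = constant + sum of coefficient times ratio value."""
    return CONSTANT + sum(
        weight * company[name] for name, weight in COEFFICIENTS.items()
    )


def predicted_group(score: float) -> str:
    """Assign the group whose centroid is closest to the score."""
    return min(CENTROIDS, key=lambda group: abs(score - CENTROIDS[group]))


# Hypothetical company (illustrative values, not taken from the paper).
company = {"debt_equity": 50.0, "earnings": 70.0, "book_value": 5.0, "roa": 14.0}

score = discriminant_score(company)
print(f"D = {score:.3f}, predicted group: {predicted_group(score)}")
```

For this hypothetical company the score is close to zero, which lies nearest the Medium centroid.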

Using the Canonical Discriminant Function Coefficients in Table 9, the final discriminant function can be written as follows:

$$D = -2.558 + 0.011 X_1 + 0.017 X_2 - 0.143 X_3 + 0.110 X_4$$

Where:
X_1 = Debt/Equity
X_2 = Earnings
X_3 = Book Value
X_4 = Return On Assets
D = Discriminant Score

X_1 - Debt/Equity. The Debt/Equity ratio indicates what proportion of equity and debt the company is using to finance its assets. A high debt/equity ratio generally means that a company has been aggressive in financing its growth with debt. Of the three risk ratios studied (Beta, Earnings Stability and Debt/Equity), this one proved to be the most valuable, but it is the least important of the four predictors retained, behind Earnings, Book Value and Return on Assets.

X_2 - Earnings per Share - the amount of earnings per share of a company's stock.

X_3 - Book Value is also a determinant variable; it measures the total value of the company's assets that shareholders would receive if the company were liquidated.

X_4 - Return on Assets - an indicator of how profitable a company is relative to its assets; it also measures what earnings were generated from invested capital.

The discriminant coefficients for Debt/Equity, Earnings and Return on Assets are positive, while the coefficient for Book Value is negative. As mentioned before, the negative sign for book value is interpreted as indicating that the company's stock is underpriced. Overall, the higher a company's value, the higher its discriminant score.

Table 10 Functions at Group Centroids

Value    Function
Low      -1.659
Medium   -0.256
High     2.636

Unstandardized canonical discriminant functions evaluated at group means.

Table 10 provides the average value of the discriminant score (D) for each of the value groups. From this we can see the separation between groups (Low, Medium and High Value) and the values of the discriminant function associated with each group. The Low-Value group has a mean D score of -1.659, the Medium-Value group a mean D score of -0.256, and the High-Value group a mean D score of 2.636. Hence, individual stocks with a D score closer to 2.636 will be predicted to belong to the High-Value group.

Table 11 Predicted Group Membership (Cross-Validated)

                  Predicted Group Membership
        Value     Low     Medium   High    Total
Count   Low       13      7        1       21
        Medium    2       16       1       19
        High      0       1        9       10
%       Low       61.9    33.3     4.8     100.0
        Medium    10.5    84.2     5.3     100.0
        High      .0      10.0     90.0    100.0

a. 76.0% of Cross-Validated grouped cases correctly classified.

The Predicted Group Membership table provides a summary of how well the analysis classifies new cases that were not included in the original sample. Table 11 presents the number and associated percentage of cases correctly and incorrectly classified on the basis of the independent variables. Under cross-validation of the sample of 50 cases, 38 (76.0%) are correctly classified overall. This percentage is analogous to the coefficient of determination, R^2, in a regression model, which measures the percentage of the variation of the dependent variable (ROE) explained by the significant independent variables (ROA, Earnings, Book Value and Debt/Equity ratio). Of the High-Value group, 90% are correctly identified, with 10% misclassified as belonging to the Medium-Value group. 84.2% of the Medium-Value group and 61.9% of the Low-Value group are correctly classified.

IV. CONCLUDING REMARKS

This paper uses the discriminant analysis approach to establish a function which best discriminates between ASX stocks in three mutually exclusive groups (high-, medium- and low-value). In order to demonstrate the capabilities and limitations of discriminant analysis, a set of financial ratios was used to test the individual discriminating ability of the variables and the relative contribution of each to the total discriminating power of the function.
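The cross-validated hit ratio and the per-group accuracies quoted from Table 11 follow directly from the classification counts. As a check on that arithmetic, the Python sketch below recomputes them from the counts in the table; the variable names are illustrative only.

```python
# Cross-validated counts from Table 11.
# Rows are actual groups; columns are predicted Low, Medium, High.
GROUPS = ["Low", "Medium", "High"]
COUNTS = {
    "Low": [13, 7, 1],
    "Medium": [2, 16, 1],
    "High": [0, 1, 9],
}

total = sum(sum(row) for row in COUNTS.values())

# Correct classifications sit on the diagonal (actual group == predicted group).
correct = sum(COUNTS[group][i] for i, group in enumerate(GROUPS))
hit_ratio = correct / total

# Share of each actual group that was classified correctly.
per_group = {
    group: COUNTS[group][i] / sum(COUNTS[group])
    for i, group in enumerate(GROUPS)
}

print(f"{correct} of {total} correct ({hit_ratio:.1%})")
for group, share in per_group.items():
    print(f"{group}: {share:.1%}")
```

The recomputed figures match the 76.0% overall rate and the 61.9%, 84.2% and 90.0% group rates reported in Table 11.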
The results indicate that four of the twelve predictor variables display significant differences between groups: Earnings, ROA, Book Value and the Debt/Equity ratio, with Earnings and ROA the most important variables in discriminating between the high-, medium- and low-value groups. The discriminant model classified 76 per cent of stocks correctly overall, and 90 per cent of the high-value group were correctly classified. While the predictive power is not especially impressive, the implication is that an investor who already owns stocks whose performance appears dismal according to the discriminant model should review his or her portfolio and act accordingly. The analysis could also be useful in the selection of an efficient portfolio.

However, further investigation of this value-investing topic is required. Further study should investigate a different model, whether another widely used multivariate statistical technique or Artificial Neural Networks, to determine whether it can outperform discriminant analysis as a method of classifying value groups. Further research is also required on what high-value stocks have in common once they are identified (low P/E, large market capitalisation, Beta less than one, same sector, etc.). Would different sets of companies or different sectors have different discriminating variables? And would offshore markets such as China have the same predictors?

REFERENCES

COMSEC website accessed via https://www.comsec.com.au/default.aspx

Graham, B. (2005) The Intelligent Investor: the Classic Text on Value Investing, Harper Business, New York.

Graham, B. and Dodd, D. (2009) Security Analysis, 6th edition, McGraw-Hill, New York.

Hair, J.F., Black, W.C., Babin, B.J. and Anderson, R.E. (2009) Multivariate Data Analysis, 7th edition, Prentice Hall.

Investopedia website accessed on June 4, 2011 via http://www.investopedia.com/terms/v/valueinvesting.asp

Johnson, R.A. and Wichern, D.W. (2002) Applied Multivariate Statistical Analysis, International Edition, 5th edition, Pearson Education International, New Jersey.

Williams, J.B. (1938) The Theory of Investment Value, Harvard University Press.