Intern. J. of Research in Marketing 19 (2002) 151–166
www.elsevier.com/locate/ijresmar

Managerial evaluation of sales forecasting effectiveness: A MIMIC modeling approach

Heidi M. Winklhofer (a), Adamantios Diamantopoulos (b,*)

a The University of Nottingham, UK
b Marketing and Business Research Department, Loughborough University Business School, Ashby Road, Loughborough, Leicestershire, LE11 3TU, UK

* Corresponding author. Tel.: +44-1509-223123; fax: +44-1509-223961. E-mail address: a.diamantopoulos@lboro.ac.uk (A. Diamantopoulos).

Received 2 April 2001; received in revised form 20 March 2002; accepted 20 March 2002

Abstract

A Multiple Indicators and MultIple Causes (MIMIC) model is developed in which managerial evaluations of forecasting effectiveness are modeled as a function of different forecast performance criteria, namely, accuracy, bias, timeliness and cost. The model is estimated using data from a survey of export sales forecasting practices, and several hypotheses linking the aforementioned criteria to effectiveness are tested. The findings indicate that evaluations of forecasting effectiveness are equally influenced by short-term accuracy and the absence of overestimating bias, while timely delivery of the forecast to management is somewhat less important. Long-term accuracy, underestimation and the timing of production of the forecast are not found to impact on effectiveness. Implications of the results for forecasting practice are considered and future research directions identified. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Empirical study; Forecasting performance; Sales forecasting

1. Introduction

Although forecast performance is the main driver behind the effort put into forecasting by academics, businesses and consultants, the question of how managers evaluate their firm's forecasting effectiveness remains unanswered. In this context, the empirical forecasting literature has focused, almost exclusively, on accuracy as a measure of forecast success (e.g. Dalrymple, 1975, 1987; Mentzer & Cox, 1984a; Small, 1980; Watson, 1996), with a few studies also examining the presence/absence of bias (e.g. McHugh & Sparkes, 1983; Peterson, 1989, 1990). Accuracy and bias relate respectively to the magnitude and direction of the forecast error, namely, the difference between actual and forecast sales. This preoccupation with forecast error is also reflected in the methodological forecasting literature, where a large number of error measures have been proposed as indicators of forecast performance (reviewed in Armstrong, 1985; Armstrong & Collopy, 1992; Fildes, 1992; Mahmoud, 1984, 1987; Makridakis, 1993). Implicit in the equation "lower errors = better forecast performance" is the assumption that managers base their assessments of the effectiveness of a forecast primarily on an evaluation of forecast accuracy. Although additional considerations such as timeliness and cost have been raised in previous empirical research (e.g. Carbone & Armstrong, 1982; Mentzer & Cox, 1984b; Yokum & Armstrong, 1995), the studies
involved were exclusively concerned with the selection of different forecasting techniques rather than the forecasts themselves. This is an important distinction, as the performance of the forecast is not solely a function of the technique used but also of other factors such as the quality of the data (Armstrong, 2001a) and the forecasting expertise involved (Armstrong, Adya, & Collopy, 2001). Moreover, forecasting practices often involve combining forecasts generated by different techniques (Armstrong, 2001b) and/or judgementally adjusting model-based forecasts (Sanders & Ritzman, 2001). Bearing in mind that a forecast (as an output) and a forecasting technique are not the same, there is no explicit and systematic analysis in the literature of the factors underlying managerial evaluations of forecasting effectiveness and, in particular, of the relative importance of accuracy vs. other criteria in shaping such evaluations. It is this gap that the present study seeks to address, drawing from survey data relating to the forecasting practices of UK exporters. Specifically, we examine the extent to which managers' evaluations of their firms' overall forecasting effectiveness are influenced by forecast accuracy, forecast bias, timeliness and cost of the forecast. We also examine potential interactions amongst these criteria and investigate the role of the environment as a possible moderating influence. Our analysis is based on the specification and subsequent estimation of a Multiple Indicators and MultIple Causes (MIMIC) model (Jöreskog & Goldberger, 1975) using LISREL methodology. The intended contribution of our study is threefold. First, on the theoretical front, to the best of our knowledge, this is the first attempt that (a) systematically investigates the degree to which accuracy, bias, timeliness and cost impact upon managerial perceptions of forecasting effectiveness, (b) models these antecedents within a structural equation modeling (SEM) framework and explores their interactions, and (c) explicitly considers the environment as a moderating influence. Second, on the methodological front, our study proposes a multi-item operationalisation of forecasting effectiveness, enabling the assessment of measurement error in the focal construct, and also illustrates how SEM techniques can be profitably employed to investigate forecasting-related issues. In this context, past empirical research on forecasting practices has employed very basic techniques to analyze survey data (typically percentages and cross-tabulations, often without significance testing) and multivariate analyses have been very rare (for a review, see Winklhofer, Diamantopoulos, & Witt, 1996, and studies cited therein). Finally, on the practitioner front, our study offers empirically based insights as to the relative importance of different performance criteria used by managers to judge forecasting effectiveness in their firm. Such information is clearly of relevance to forecast preparers when designing/operating forecasting systems, as well as to consultants providing forecasting advice and training. For example, if accuracy is known to be a major influence shaping managerial perceptions of effectiveness, this can be used to justify the introduction of alternative forecasting methods and/or the collection of better input data.
Similarly, if a criterion like accuracy is found not to influence the evaluation of forecasting effectiveness, attention can be focused on educating managers by highlighting why accuracy considerations should be taken into account (e.g. by demonstrating the value of improved forecast accuracy, as done by Armstrong, 1985). Moreover, by identifying the relative importance of accuracy, bias, timeliness and cost, it becomes possible to set priorities directly reflecting managerial preferences for different forecast performance criteria. Again, if implementation of such priorities is seen to contradict principles of good forecasting practice (summarized in Armstrong, 2001c), action can be taken to inform managers of the potential negative consequences (hopefully leading to a revised set of priorities which is more consistent with sound forecasting principles). The next section introduces our model of forecasting effectiveness and discusses its major premises. Next, the literature on forecast performance is used to provide a conceptual rationale for the hypotheses depicted in the MIMIC model. This is followed by a description of the empirical data and the results from estimating the model. The paper concludes with a discussion of the implications of the findings and some suggestions for future research.

2. A multiple-indicators, multiple-causes model of forecasting effectiveness

Fig. 1 shows our proposed model of sales forecasting effectiveness, in which the latter is modeled as a latent variable (η) measured by multiple indicators (y₁–y₃).
[Fig. 1. Path diagram for MIMIC model of sales forecasting effectiveness.]

Seven antecedent variables (x₁–x₇), reflecting aspects of accuracy, bias, timeliness and cost, respectively, are shown to impact on forecasting effectiveness. In addition, the turbulence of the export environment (x₈) is included as a contextual variable.¹ The formal specification of the model is described by the following matrix equations:

y = Λη + ε,  (1)
η = Γx + ζ,  (2)

where y = (y₁, y₂, y₃) are indicators of η and x = (x₁, ..., x₈) are the antecedents ("causes") of η (both y and x are column vectors).² Eq. (1) indicates that the y's are congeneric measures of η (Jöreskog, 1971), while Eq. (2) shows that η is linear in the x's plus a random disturbance term (ζ). It is assumed that the ε's (errors in measurement) and ζ are uncorrelated (i.e. Cov(ε, ζ) = 0). The Λ (Lambda) matrix contains the λ-parameters (λ₁–λ₃), which reflect the loadings of the y-variables on the latent construct (η). The Γ (Gamma) matrix, on the other hand, contains the γ-parameters (γ₁–γ₈), which indicate the impact of the x-variables on η.

¹ Although no correlations among the x-variables are included in Fig. 1 (to avoid clutter), in general, Cov(xᵢ, xⱼ) ≠ 0 (i ≠ j), and such interrelationships are taken into account during parameter estimation.
² In terms of measurement theory, the x-variables are formative measures, whereas the y-variables are reflective measures; for a detailed discussion of the differences between the two types of measures, see Bagozzi and Fornell (1982), Bollen (1984), Bollen and Lennox (1991), and Fornell, Rhee, and Yi (1991).

In substantive terms, the proposed model implies that managers' evaluations of forecasting effectiveness are determined by the performance of the forecast in terms of accuracy, bias (absence of), timeliness and cost.
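To make Eqs. (1) and (2) concrete, the following Python sketch computes the covariance matrix implied by a small MIMIC model; all parameter values are purely illustrative assumptions, not estimates from this study. SEM programs such as LISREL minimize the discrepancy between such an implied matrix and the sample covariance matrix.

```python
import numpy as np

# Illustrative MIMIC model: 3 reflective indicators (y), 3 causes (x).
Lam = np.array([[1.0], [0.9], [0.8]])    # Lambda: loadings of y on eta
Gam = np.array([[-0.15, -0.15, 0.12]])   # Gamma: effects of x on eta
Phi = np.eye(3)                          # Cov(x); here uncorrelated causes
psi = 0.25                               # Var(zeta): disturbance variance
Theta = np.diag([0.20, 0.30, 0.15])      # Cov(eps): measurement error variances

# Eq. (2): Var(eta) = Gamma Phi Gamma' + psi
var_eta = Gam @ Phi @ Gam.T + psi

# Eq. (1): implied second moments of the observed variables
Sigma_yy = Lam @ var_eta @ Lam.T + Theta   # Cov(y, y)
Sigma_yx = Lam @ Gam @ Phi                 # Cov(y, x) = Lambda Gamma Phi
Sigma = np.block([[Sigma_yy, Sigma_yx],
                  [Sigma_yx.T, Phi]])      # full implied covariance of (y, x)
print(np.round(Sigma, 3))
```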
The advantages of modeling forecasting effectiveness along the lines of Fig. 1 are as follows. First, the relative importance of accuracy, bias, cost and timeliness considerations in influencing managers' evaluations of effectiveness can be identified, in addition to their joint (i.e. combined) impact. The former can be determined by examining the magnitude of the (standardized) γ-parameters and the imposition of equality constraints, and the latter by looking at the amount of residual (unexplained) variance in η. Second, interactions among the x-variables can be explored and the goodness-of-fit of models containing interaction terms contrasted against models with main effects only; this is achieved by undertaking nested model comparisons, evaluated via change-in-fit tests. The same approach can be used to examine the role of the environment as a possible moderator. Third, by applying structural equation modeling (SEM) techniques, the overall fit of the model(s) can be evaluated; specifically, the model-implied covariance matrix (Ŝ) can be compared to the actual (i.e. sample-based) covariance matrix (S) and the discrepancy between the two matrices (S − Ŝ) formally assessed (see analysis section). Fourth, forecasting effectiveness can be operationalized by means of multiple indicators; this is important since managers' evaluations of forecasting effectiveness are unlikely to be fully captured by a single-item measure. In our case, we use three items (y₁–y₃) to represent forecasting effectiveness (discussed in the methodology section). Fifth, the psychometric properties of the chosen operationalisation of effectiveness can be formally assessed and, thus, the dimensionality and reliability of the (multi-item) measure established. This is done by inspecting the magnitude and significance of the λ- and ε-parameters and computing the composite reliability of the items (i.e. y₁–y₃). Having presented the basic model structure, we now elaborate on the hypotheses implied by the model (and depicted in the γ-parameters) regarding the expected effects of accuracy, bias, timeliness and cost on forecasting effectiveness.

2.1. Accuracy

Accuracy is often used synonymously with forecast performance and is by far the most common dimension investigated in empirical forecasting studies. Mentzer and Cox (1984a, p. 144) point out that, in order to understand the concept of forecast accuracy, "it should be divided into the components of potential accuracy and achieved accuracy. Potential accuracy is the maximum obtainable accuracy for a given forecast situation." More specifically, the particular forecasting situation places constraints on the potential accuracy of a forecast; for example, "10 per cent accuracy might be a very accurate result for a company operating in a volatile market, but not very accurate for a company in a stable market" (Winklhofer & Diamantopoulos, 1996, p. 67). Similarly, it has been noted that "for products that are difficult to forecast... the same forecast error represents a better quality forecast than for products that are easy to forecast" (Hagdorn-van der Meijden, VanNunen, & Ramondt, 1994, p. 102). In light of the above, it seems appropriate to consider the impact of forecast accuracy on evaluations of forecast effectiveness while controlling for the volatility of the environment. This is one reason for including environmental turbulence (x₈) as a contextual variable in Fig. 1. Another reason is that the environment may moderate the effects of the various performance criteria on forecasting effectiveness (see Section 2.6). Given the importance attached to accuracy as a performance criterion (see introductory section), the following hypothesis is put forward:

H1. Forecast accuracy will be positively related to managerial evaluations of forecasting effectiveness.
Regarding measurement, while a wide range of accuracy measures has been developed in the methodological literature (see Armstrong & Collopy, 1992; Mahmoud, 1987), Mentzer and Kahn (1995) report that the majority of firms use the mean absolute percentage error (MAPE) to measure forecast accuracy. A large number of previous business surveys have employed MAPE as a forecast performance measure (e.g. Dalrymple, 1987; McHugh & Sparkes, 1983; Schnaars, 1984; Small, 1980; West, 1994). This is also the measure we use in our MIMIC model to operationalize forecast accuracy (x₁ and x₂ in Fig. 1).
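For reference, a minimal sketch of the MAPE computation (hypothetical sales figures; as discussed in Section 3.2, the survey itself collected error ranges rather than point estimates):

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error: mean of |actual - forecast| / actual, in %."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

# Hypothetical monthly export sales: actual vs. forecast
actual = [120, 135, 128, 140, 150, 145]
forecast = [115, 140, 130, 132, 155, 150]
print(f"MAPE = {mape(actual, forecast):.1f}%")  # magnitude of error, ignoring its direction
```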
2.2. Bias

One speaks of a biased forecast when, over several time periods, the forecast is consistently too optimistic or too pessimistic. In statistical terms, the forecast is biased if a nonrandom difference between an estimate and its true value can be observed. While a few studies have examined bias as a criterion for selecting among forecasting techniques (e.g. Peterson, 1989, 1990; Sanders & Manrodt, 1994), little is known about how bias influences managerial perceptions of effectiveness. For example, Erickson (1987, p. 453) argues that to managers who are interested in planning as well as forecasting, lack of bias is more important than accuracy, but fails to provide any evidence supporting his contention. Nevertheless, it could be expected that managerial evaluations of forecast effectiveness will not only be affected by the magnitude of forecast error (i.e. accuracy) but also by its direction (i.e. bias). This is because, in the real world, "the cost of under-forecasting is not the same as the cost of over-forecasting" (Ermer, 1991, p. 10), that is, the loss function involved may not be symmetrical. Thus, the following hypothesis can be proposed:

H2. Lack of bias will be positively related to managerial evaluations of forecasting effectiveness.

Since bias must reflect either under- or overestimation of actual sales, we include two indicators of bias in our MIMIC model (x₃ and x₄ in Fig. 1).

2.3. Timeliness

Remus and Simkin (1987) developed several guidelines for forecasts to be useful for decision making and started their list with the need for forecasts to be timely. Timeliness refers to the forecast being available to a decision maker in advance of having to make a decision based on that forecast. Timeliness is, therefore, a necessary condition for a forecast to be used. However, despite the obvious importance of timeliness, only a minority of empirical investigations have included it as a forecast performance indicator (e.g. Herbig, Milewicz, & Golden, 1994; Yokum & Armstrong, 1995). In our model, we use two measures of timeliness (x₅ and x₆ in Fig. 1) and expect a positive impact on forecasting effectiveness. Specifically:

H3. Timeliness will be positively related to managerial evaluations of forecasting effectiveness.

2.4. Cost

Generating accurate, unbiased, and timely forecasts is a cost-incurring activity; such costs include "initial development costs, maintenance costs (to keep the model up-to-date) and operating costs (time and dollars to make the forecasts)" (Armstrong, 2001a, p. 464). Although few exporters seem to systematically track their forecasting costs (Winklhofer & Diamantopoulos, 1996), in an exporting context, a key concern is the increased cost of obtaining appropriate data (Craig & Douglas, 2000); indeed, firms (particularly smaller companies) often have to "accept tradeoffs between the need for more accurate data and the limited resources available to accomplish the tasks" (Jeannet & Hennessey, 2001, p. 220). Hence, it is expected that:

H4. Costs will be negatively related to managerial evaluations of forecasting effectiveness.

2.5. Relative importance of accuracy, bias, cost and timeliness

In addition to the hypotheses advanced above regarding the expected individual impact of accuracy (H1), bias (H2), timeliness (H3) and cost (H4), we use our model to investigate the relative impact of these antecedent factors on forecasting effectiveness.³ In this context, according to the existing literature, accuracy would be expected to feature as the most important determinant of forecasting effectiveness. For example, Klein (1984, p. 1, emphasis added) states that "the forecast itself has an intrinsic importance and to the user community it is the all important thing. Its accuracy is the bottom line for the professional forecaster, much as the net profit is the bottom line for the chief executive of an enterprise."

³ Technically, in terms of the path diagram in Fig. 1, the focus shifts from an examination of the signs and significance of the γ-parameters to a comparison of their magnitude. This involves the introduction of equality constraints among the γ-parameters and subsequent comparison of the constrained parameter estimates with their unconstrained counterparts (see analysis section).
In addition, several studies have reported that accuracy is, overall, the number one criterion for selecting a forecasting technique (Carbone & Armstrong, 1982; Mahmoud, Rice, & Malhotra, 1986; Martin & Witt, 1988; Mentzer & Kahn, 1995; Mentzer & Cox, 1984a; Yokum & Armstrong, 1995). In this context, although the relative importance of accuracy vs. other selection criteria was found by Yokum and Armstrong (1995) to be dependent upon the specific conditions involved (i.e. number of forecasts, time series length and method used), in general, "there was much agreement across roles and across situations that accuracy was the most important criterion" (Yokum & Armstrong, 1995, p. 591).
Thus, it is hypothesized that:

H5. Accuracy will have the greatest impact on managerial evaluations of forecasting effectiveness.

With regard to forecast bias, unless the loss function for the firm is symmetrical (Armstrong, 1985), underestimation and overestimation are unlikely to be perceived as being equally harmful. On balance, it could be argued that a conservative forecast is preferable to an overoptimistic forecast, because "there are several serious consequences of...overoptimism, the most obvious being that businesses plan according to sales expectations that cannot be met" (Wheeler & Shelley, 1987, p. 57). Thus, an asymmetric effect of bias is expected, namely:

H6. Overestimation will have a greater (negative) impact on managerial evaluations of forecasting effectiveness than underestimation.
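To illustrate the asymmetry argument underlying H6 with a stylized numerical example (our own sketch, not part of the original study; the per-unit cost parameters are hypothetical), an asymmetric linear loss function makes the same absolute error more or less costly depending on its direction:

```python
def forecast_loss(actual, forecast, cost_over=3.0, cost_under=1.0):
    """Stylized asymmetric linear loss: each unit of overestimation
    (forecast > actual) is assumed to cost more than each unit of
    underestimation (hypothetical cost parameters)."""
    error = forecast - actual
    return cost_over * error if error > 0 else cost_under * (-error)

# The same 10-unit error is three times as costly when sales are overestimated:
print(forecast_loss(actual=100, forecast=110))  # -> 30.0
print(forecast_loss(actual=100, forecast=90))   # -> 10.0
```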
Finally, concerning timeliness and cost, although the former is clearly a necessary condition for a forecast to be used (Remus & Simkin, 1987), it is by no means a sufficient condition. A timely forecast which, however, is highly inaccurate and/or biased is unlikely to be thought of highly by a decision maker.⁴ Similarly, a forecast user is unlikely to rely on a forecast simply because it was cheap to produce, if that forecast is perceived to be inaccurate and/or highly biased. Thus, it is hypothesized that:

H7. Cost and timeliness will have the least impact on managerial evaluations of forecasting effectiveness.

⁴ Note that both studies reported in Yokum and Armstrong (1995) ranked timeliness lower than accuracy as a criterion for choosing a forecasting technique.

2.6. Interactions and moderating influences

In addition to the individual effects of accuracy, bias, timeliness and cost described in hypotheses H1–H4, it could be the case that interactions among the performance criteria also play a role in affecting managerial evaluations of forecasting effectiveness. For example, accuracy may have a stronger impact on effectiveness if the forecast is perceived to be unbiased than if it is highly biased. Similarly, timeliness may have a lesser effect on effectiveness if the forecast is seen to be inaccurate. Although the study of interactions is common in marketing studies (for a review, see Gatignon, 1993), neither the presence nor the nature of potential interactions among the different forecast performance criteria has been investigated in previous forecasting research. Given the present lack of knowledge regarding which criteria may interact with other criteria (and the form that any such interactions may take), it seems prudent to offer an exploratory hypothesis at this stage:

H8. Interactions among accuracy, bias, timeliness and cost will have an impact on managerial evaluations of forecasting effectiveness.

Finally, depending upon the environment faced by the firm in which forecasting activity takes place, the influence of the various performance criteria on managerial evaluations of effectiveness may also vary. For example, under conditions of high environmental turbulence, the impact of timeliness on effectiveness may be higher than if a stable environment is involved. Thus, the environment may play a moderating role in the relationships between effectiveness and the different performance criteria. Again, however, empirical evidence on such a role of the environment is lacking; although previous research has addressed direct links between the environment and performance criteria such as accuracy (e.g. McHugh & Sparkes, 1983; Sanders & Manrodt, 1994), no study has examined the environment as a moderating influence. Thus, only an exploratory hypothesis is proposed:

H9. Environmental turbulence will moderate the relationships between the forecast performance criteria and managerial evaluations of forecasting effectiveness.

3. Model operationalisation

3.1. Data collection

The empirical data on which the subsequent analysis is based were drawn from a survey of the forecasting practices of UK exporters in the manufacturing sector.
Specifically, following exploratory interviews with 12 firms and two separate mail pilots, 1330 firms derived from the Dun and Bradstreet UK database were targeted by means of a mail questionnaire. The recipients of the questionnaire were export directors, sales directors, marketing directors, managing directors or finance directors, depending upon the contact name in the Dun and Bradstreet database. In addition, since the contacted person may not always have been the most knowledgeable with regard to export sales forecasting, the recipient was asked in the cover letter to pass the questionnaire on to somebody else in the firm who he/she felt was in a better position to answer it. An analysis of the job titles of respondents confirmed the observation made in previous research that organisational responsibility for export sales forecasting varies widely across firms (see Winklhofer & Diamantopoulos, 1996). Specifically, 48% held top management positions, 27% were in sales and/or marketing, 10% in exporting, 6% in finance and accounting, and 8% held positions such as inventory controller and operations manager. The sample was stratified according to firm size as follows: 40% of firms had between 40 and 100 employees, 40% between 100 and 500 employees, and 20% more than 500 employees; the specific sample proportions reflected the size categories in the Dun and Bradstreet database. Two weeks after the initial mailing, a follow-up letter was sent to a stratified sample of 300 non-respondents using the same stratification approach as in the initial mailing. Altogether, 256 responses were obtained, of which 180 were usable. In order to gather first-hand information with regard to reasons for non-response, a telephone follow-up of 100 non-respondents was undertaken. This showed that the main reason for non-response was ineligibility (e.g. the firm did not exist any longer or undertook no exporting); among the potentially eligible respondents, the main reason for noncompletion was time pressure, which prohibited participation in any survey. After adjusting for ineligibles (Churchill, 1999; Wiseman & Billington, 1984), the effective response rate came to 18.5%. The latter is comparable with other surveys conducted in an industrial setting (Jobber & Blaesdale, 1987) and with surveys on forecasting issues (e.g. McHugh & Sparkes, 1983; White, 1986).
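A worked sketch of the adjustment for ineligibles follows; the eligibility share used below is our own back-calculation, chosen so as to reproduce the reported 18.5% (the paper itself reports only the mailing size, the usable responses and the final rate):

```python
mailed = 1330           # questionnaires mailed out
usable = 180            # usable responses obtained
eligible_share = 0.73   # assumed proportion of eligible firms in the mailing list,
                        # suggested by the telephone follow-up (hypothetical value)

# Effective response rate = usable responses / estimated eligible recipients
effective_rate = usable / (mailed * eligible_share)
print(f"Effective response rate = {effective_rate:.1%}")  # ~18.5%
```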
It needs to be emphasized, in this context, that a low response rate does not automatically mean that there has been non-response error. "Nonresponse error is a problem only when a difference between the respondents and the non-respondents leads the researchers to an incorrect conclusion or decision" (Tull & Hawkins, 1993, p. 184). In the present study, non-response error would be a cause for concern only if the main reasons for non-response were not related to the satisfaction of eligibility criteria but to respondent characteristics which directly and differentially affect responses to the substantive issues examined in the survey (Lesley, 1972). As an additional safety check against non-response error, early and late responses were compared following the successive waves procedure discussed by Armstrong and Overton (1977). Early responses were defined as usable questionnaires obtained within the first week following the initial mailing, while late responses were questionnaires received after the follow-up. No significant differences were found between the two groups, providing further evidence that non-response error is unlikely to be a major problem in the study. In terms of sample composition, the responding firms closely matched the initial sample stratification; specifically, 39.3% had between 50 and 100 employees, 41.4% fell in the 100–500 category and 17.4% employed more than 500 people. A χ² test revealed no significant difference between these proportions and the original sample distribution in terms of company size. Regarding the respondents' sector of activity, 67% of the firms were industrial companies, while the rest operated in consumer goods markets. On average, the respondent firms had been exporting for some 34 years and derived 47% of their total sales from export markets.

3.2. Variable measurement

Appendix A provides a full listing of the measures used to operationalize the various parts of the MIMIC model shown earlier in Fig. 1. The measures were developed on the basis of insights gained from the exploratory interviews (see Section 3.1) and an extensive literature search. The measures were subsequently pretested by means of protocol interviews (Reynolds & Diamantopoulos, 1998) with decision makers in local exporting firms. Forecasting effectiveness was operationalized by three items (y₁–y₃), reflecting the confidence that
decision makers have in the forecasts produced and their assessment of their firm's overall export forecasting capability relative to the industry and competition. An exploratory factor analysis showed that all three items loaded strongly on a single common factor; moreover, a reliability analysis revealed high internal consistency among the items concerned (Cronbach's α = 0.74). Regarding forecast accuracy, the magnitude of forecast error had to be captured; the latter is dependent both on the specific forecasting level and the associated time horizon (Mentzer & Cox, 1984a,b; Small, 1980; White, 1986). Accuracy was, therefore, measured in relation to a specific forecasting level (total export sales)⁵ and two time horizons, short- and medium-term; short-term forecasts typically reached between 1 and 3 months ahead, while medium-term forecasts between 3 and 6 months ahead. The specific accuracy measure employed was the mean absolute percentage error (MAPE). MAPE captures the absolute difference between actual and forecasted sales as a percentage of actual sales and shows the extent to which actual sales are under- or overestimated by the forecast. According to Makridakis (1993, p. 528), MAPE "is a relative measure that incorporates the best characteristics among the various accuracy criteria." Moreover, it is "the only measure... that means something to decision makers." In this context, Mentzer and Kahn (1995) found that MAPE was the most widely used accuracy measure in their sample of firms, and a strong preference for MAPE was also observed in the exploratory interviews conducted as part of this study. However, based on the results of his pretest, West (1994, p. 401) reported that asking respondents to provide point estimates of MAPE "proved to be somewhat intimidating to many firms and reduced the response of the survey considerably." Thus, in the present study, respondents were asked to provide information about their range of forecast errors in relation to short-term total export sales (x₁) and medium-term total export sales (x₂). Errors were recorded in one of the following ranges: 0–5%, 6–10%, 11–15%, 16–20%, >20%.

⁵ To be precise, respondents were presented with a list of seven forecasting levels (product item, product line/group, export customer, region within a country, country, country group, and total export sales) and were asked to respond only for those levels actually used in their firm. As the total export sales level was the most popular, the error information provided for this level was employed as a measure of actual forecast accuracy.

Forecast bias was captured by two items, requesting respondents to indicate the extent to which export sales forecasts tended to underestimate (x₃) or overestimate (x₄) export sales. Timeliness was also measured by two items, representing the degree to which forecasts were prepared (x₅) and received by decision makers (x₆) in a timely fashion. Cost was measured by an item indicating the extent to which data for export forecasting purposes was considered to be expensive (x₇). Finally, environmental turbulence was measured by combining the three scales developed by Jaworski and Kohli (1993) capturing the dimensions of technological, customer, and competitor turbulence; the application of the reliability formula for linear composites (Nunnally & Bernstein, 1994) resulted in a value of 0.816, which is very satisfactory.
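For reference, a minimal sketch of the internal-consistency check reported above (hypothetical 5-point Likert responses; the study reports α = 0.74 for the three effectiveness items):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses to y1-y3 on 5-point scales (strongly disagree ... agree)
scores = np.array([[4, 4, 3], [3, 3, 3], [4, 3, 4],
                   [2, 3, 2], [5, 4, 4], [3, 2, 3]])
print(round(cronbach_alpha(scores), 2))
```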
Appendix B lists the means, standard deviations and minimum/maximum values of all variables included in the MIMIC model; the relevant correlation matrix is also reproduced. It can be seen from the magnitude of the scale means in relation to the scales' endpoints that no response bias is evident. Similarly, the size of the standard deviations provides no evidence of a restriction-of-range problem for any of the variables involved (see also the maximum and minimum values). These checks suggest that the measures included in the model are of sufficient quality to enable estimation of parameters with confidence.

4. Model estimation

4.1. Fit assessment

The LISREL 8 (Jöreskog & Sörbom, 1993) structural equations program was used to estimate the MIMIC model presented earlier in Fig. 1. Consistent with the recommendations in the methodological literature,⁶ the covariance matrix of the observed variables was used as input and the model's overall fit was assessed on the following criteria: the χ² goodness-of-fit test; the root-mean-square error of approximation (RMSEA; Browne & Cudeck, 1993); the goodness-of-fit index (GFI; Jöreskog & Sörbom, 1982); and the comparative fit index (CFI; Bentler, 1990).⁷ In addition, an inspection of standardized residuals was undertaken and the construct reliability of the effectiveness variable calculated (Steenkamp & Van Trijp, 1991).

⁶ See, in particular, Baumgartner and Homburg (1996), Bollen and Long (1993), Hoyle and Panter (1995), and Marsh, Balla, and Han (1996).
⁷ For a technical discussion of these fit indices, together with relevant formulae, see Hair et al. (1998).
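As an aside, the χ² p-value and the RMSEA point estimate can be sketched as follows (textbook formulae applied to illustrative, hypothetical inputs; LISREL offers several χ² variants, so hand-computed values need not match printed output exactly):

```python
from scipy.stats import chi2

def fit_summary(chisq, df, n):
    """p-value of the chi-square test of exact fit and the usual RMSEA estimate."""
    p_value = chi2.sf(chisq, df)  # H0: the model-implied covariance matrix holds
    rmsea = (max(chisq / df - 1.0, 0.0) / (n - 1)) ** 0.5
    return p_value, rmsea

# Hypothetical example: chi-square of 20.0 on 16 df with n = 180 observations
p, rmsea = fit_summary(20.0, 16, 180)
print(f"p = {p:.3f}, RMSEA = {rmsea:.3f}")  # nonsignificant p suggests acceptable fit
```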
Estimation of the model resulted in a nonsignificant χ² statistic (χ² = 12.599, df = 16, p = 0.421), indicating a close correspondence between the model-based covariance matrix (Ŝ) and the observed (i.e. sample-based) covariance matrix (S).⁸ The model's good overall fit was supported by the remaining fit criteria (RMSEA = 0.0162; standardized RMR = 0.023; GFI = 0.979; CFI = 1.000). Moreover, none of the standardized residuals exceeded the value of 2.58 that would indicate specification error (Sharma, 1996). The model explained 27.3% of the variance in forecasting effectiveness, and the composite (construct) reliability of the latter came to 0.783; the latter exceeds the recommended 0.70 threshold suggested in the literature (e.g. Hair, Anderson, Tatham, & Black, 1998) and confirms that the three statements used to operationalize forecast effectiveness (y₁–y₃ in Appendix A) do a good job as joint measures of this construct. Finally, computation of the expected cross-validation index (ECVI), which assesses "whether a model is likely to cross-validate across samples of the same size from the same population" (Diamantopoulos & Siguaw, 2000, p. 85), resulted in a value of 1.073; this is lower than the corresponding values for the saturated (ECVI = 1.211) and independence (null) models (ECVI = 1.291) and is thus supportive of the stability of the model (see Byrne, 1998).

4.2. Hypothesis testing results: main effects

Having established the overall fit of the model, we now focus on the assessment of the model parameters associated with the research hypotheses presented in Section 2. A detailed examination of individual parameters is necessary because "it is possible that global measures of fit will indicate a satisfactory model but certain parameters corresponding to hypothesized relations may be nonsignificant" (Bagozzi & Yi, 1988, p. 80). Focussing initially on hypotheses H1–H4 regarding the individual effects of accuracy, bias, cost and timeliness on forecasting effectiveness, the following points can be made. First, regarding accuracy, only the short-term MAPE (x₁) had a significant impact on evaluations of forecasting effectiveness (γ₁ = −0.233, p < 0.05);⁹ the coefficient for medium-term MAPE (x₂), although in the correct direction,¹⁰ was not significant (γ₂ = −0.108, p > 0.10). Thus, H1 is partly supported. Second, in terms of bias, only (absence of) overestimation (x₄) influenced forecasting effectiveness (γ₄ = −0.195, p < 0.05); underestimation (x₃) had no impact (γ₃ = 0.018, p > 0.10). Thus, only partial support is obtained for hypothesis H2.

⁸ Since the χ² statistic "tests the hypothesis H₀ that the observed covariance matrix was generated by the hypothesized model against the alternative hypothesis H₁ that the covariance is an unrestricted matrix" (Long, 1983, p. 47), a nonsignificant χ² is indicative of good fit and a significant χ² of poor fit.
⁹ All parameter estimates reported are standardized coefficients.
¹⁰ Given that x₁ and x₂ measure forecast error, negative coefficients indicate a positive effect of accuracy on managerial evaluations of effectiveness.
Third, concerning timeliness, the results show that the key influence on effectiveness is when forecasts are received by decision makers (x₆) rather than when they are produced (x₅); only the parameter estimate of the former variable turned out to be significant and in the direction predicted by H3 (γ₆ = 0.206, p < 0.05; γ₅ = 0.027, p > 0.10). Fourth, cost (x₇) had no effect on effectiveness (γ₇ = 0.011, p > 0.10), hence offering no support for hypothesis H4. Finally, regarding the impact of environmental turbulence (x₈), this was found to be positively related to managerial evaluations of effectiveness (γ₈ = 0.316, p < 0.01); this suggests that firms facing dynamic export environments are more satisfied with their overall forecasting capability than firms operating in more stable environments. Although the above results can also be used to provide insights regarding hypotheses H5–H7 (e.g. the fact that overestimation (x₄) returned a significant coefficient, while underestimation (x₃) did not, is clearly consistent with hypothesis H6), a more rigorous approach for testing the relative influence of those performance criteria that did significantly impact on effectiveness was adopted.
This involved (a) the elimination of nonsignificant parameters and reestimation of the model,¹¹ and (b) the introduction of equality constraints accompanied by nested model comparisons. In our case, we simplified the model in Fig. 1 by dropping the paths between x₂, x₃, x₅, x₇ and η (or, what amounts to the same thing, by setting γ₂, γ₃, γ₅ and γ₇ equal to zero). Estimation of the revised MIMIC model also produced a good fit (χ² = 3.970, df = 8, p = 0.860; RMSEA = 0.000; standardized RMR = 0.027; GFI = 0.990; CFI = 1.000). Importantly, the elimination of nonsignificant linkages did not result in a deterioration in the model's fit. Specifically, the application of a χ² difference test (Δχ²) showed that the difference in fit between the original and the modified models was not significant (Δχ² = 8.629, df = 8, p > 0.10), indicating that the more parsimonious model with four predictors was just as effective in explaining forecasting effectiveness as the original eight-predictor model.

¹¹ In this context, deletion of some parameters may result in "a more parsimonious model that fits just as well as the original version" (Diamantopoulos & Siguaw, 2000, p. 102).

Next, to identify the relative influence of accuracy (short-term), bias (overestimation) and timeliness (time received) on effectiveness, we introduced equality constraints on the relevant parameters and compared several nested models; the results are shown in Table 1.

Table 1. Nested model comparison: equality constraints

Model                      χ²       df   p-value
M1: all γ's free            3.970    8   0.860
M2: γ₁ = γ₄ = γ₆           16.997   10   0.074
M3: γ₁ = γ₄, γ₆ free        3.973    9   0.913
M4: γ₁ = γ₆, γ₄ free       15.222    9   0.085
M5: γ₄ = γ₆, γ₁ free       12.675    9   0.178

Comparison    Δχ²      df   p-value
M2 − M1      13.027     2   0.000
M3 − M1       0.003     1   NS
M4 − M1      11.252     1   0.000
M5 − M1       8.705     1   0.000

NS = not significant.

It can be seen that the hypothesis that accuracy, bias and timeliness all have equal impacts on effectiveness (Model 2) must be rejected. The introduction of equality constraints on the parameters of the three performance criteria led to a marked deterioration in fit (see the M2 − M1 comparison); the same applies if equal impacts are assumed between accuracy and timeliness (Model 4) and between bias and timeliness (Model 5). However, modifying the initial model by setting equal parameters for accuracy and bias (Model 3) did not significantly affect fit (see the M3 − M1 comparison). This suggests that, in evaluating forecasting effectiveness, managers place equal emphasis on short-term accuracy (x₁) and lack of overestimation bias (x₄), but not on timeliness (x₆). Indeed, inspection of the relevant standardized parameters shows a smaller coefficient (in absolute terms) for timeliness (γ₆ = 0.205, p < 0.05) than for accuracy (γ₁ = −0.290, p < 0.01) or bias (γ₄ = −0.240, p < 0.01).¹² Looking at these results in the context of hypotheses H5–H7, the following picture can be painted. First, mixed support is obtained for hypothesis H5, according to which accuracy considerations should have the greatest impact on managers' evaluations of forecasting effectiveness. While accuracy is indeed important, this applies only to short-term accuracy; moreover, absence of bias (overestimation) is just as important.
Second, concerning hypothesis H6, the results indeed show that overestimation has a greater impact on managerial evaluations of effectiveness than underestimation; whereas the latter failed to even register a significant effect, overestimation was found to be just as influential as short-term accuracy (see above). Third, as hypothesized by H7, timeliness and cost are, in relative terms, the least important criteria shaping evaluations of effectiveness; timeliness is less influential than either accuracy or bias, whereas cost does not seem to have a significant impact at all. Note, however, that the results regarding cost may be partly due to the measurement of the latter and will be revisited in Section 5.

¹² Note that, in LISREL, when equality constraints are imposed on parameters of the model, "these will hold in the original solution but not in general in the standardized solutions" (Jöreskog & Sörbom, 1989, p. 39). Thus, although the raw coefficients for x₁ and x₄ are the same (γ₁ = γ₄ = −0.141), their standardized counterparts are not.
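The Δχ² comparisons in Table 1 are straightforward to verify; a minimal sketch (scipy assumed available):

```python
from scipy.stats import chi2

def chisq_diff_test(chisq_restricted, df_restricted, chisq_free, df_free):
    """Chi-square difference (likelihood-ratio) test for nested models: the
    constrained model's chi-square cannot be smaller, and the difference is
    itself chi-square distributed under the constrained model."""
    d_chisq = chisq_restricted - chisq_free
    d_df = df_restricted - df_free
    return d_chisq, d_df, chi2.sf(d_chisq, d_df)

# M3 (gamma1 = gamma4, gamma6 free) vs. M1 (all gammas free): no deterioration
print(chisq_diff_test(3.973, 9, 3.970, 8))    # (0.003, 1, p ~ 0.96)
# M2 (all three gammas constrained equal) vs. M1: clear deterioration
print(chisq_diff_test(16.997, 10, 3.970, 8))  # (13.027, 2, p < 0.01)
```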
4.3. Hypothesis testing results: interactions and moderating influences

To investigate potential interactions among the three performance criteria found to be significantly linked to managerial evaluations of effectiveness (hypothesis H8), product terms were created capturing accuracy × bias (x₁ × x₄), accuracy × timeliness (x₁ × x₆) and bias × timeliness (x₄ × x₆) interactions. In addition, the potential moderating role of the environment (hypothesis H9) was investigated by introducing the following interactive effects: x₁ × x₈, x₄ × x₈ and x₆ × x₈.¹³ Subsequently, Δχ² tests were used to compare models containing interactions with the model containing main effects only (Table 2).

¹³ Consistent with advice in the methodological literature (e.g. see Jaccard & Wan, 1996), all variables were mean-centered prior to forming product terms to avoid multicollinearity problems.

Table 2. Nested model comparison: interaction terms

Model                                              χ²      df   p-value
M1: main effects only                              3.970    8   0.860
M1a: as M1 plus interactions among accuracy,
     bias and timeliness                           7.152   14   0.929
M1b: as M1 plus interactions with environment      6.857   14   0.940
M1c: as M1 plus all interactions in M1a and M1b    9.531   20   0.976

Model comparison    Δχ²     df   p-value
M1a − M1           3.182     6   NS
M1b − M1           2.887     6   NS
M1c − M1           5.561    12   NS

NS = not significant.

The results show that the incorporation of two-way interactions among accuracy, bias and timeliness (Model 1a), or interactions with the environment (Model 1b), or both (Model 1c), does not result in a significant improvement in fit over the baseline (i.e. main-effects-only) model (Model 1). To ensure that the inability to detect any interaction effects was not due to low power (hence increasing the chance of a Type II error), the GPOWER program (Faul & Erdfelder, 1992) was employed to conduct a power analysis.¹⁴ Using a significance level of 5% (to control for Type I error), setting R² = 0.27 for the main-effects-only model (recall that 27.3% of the variance in effectiveness (η) was accounted for by the original model) and assuming that any interaction effects would account for an additional 10% of the variance in η (i.e. that ΔR² = 0.10), the power of the analyses in Table 2 came to 0.907 for Models 1a and 1b (containing seven predictors each) and 0.817 for Model 1c (containing ten predictors); these power values are very acceptable and consistent with Cohen's (1988) recommended 0.80 criterion. Thus, in light of the results in Table 2 and bearing in mind that statistical power is not a concern, hypotheses H8 and H9 are not supported by the data.

¹⁴ As Jaccard, Turrisi, and Wan (1990, p. 35) point out, power is important because "with low statistical power it is possible that a theoretically important interaction effect will go undetected by the researcher." See also Schmidt and Hunter (1978).

Table 3 summarizes the parameter estimates of the final MIMIC model, incorporating the main effects of accuracy (short-term), bias (overestimation) and timeliness on forecasting effectiveness, as well as the impact of the environment on the latter.

Table 3. Parameter estimates for final model

Parameter   Raw coefficient   Standardized coefficient   p-value
λ₁          1.000 (a)         0.779                      n/a
λ₂          0.858             0.642                      p < 0.01
λ₃          0.759             0.738                      p < 0.01
θ₁          0.176             0.393                      p < 0.01
θ₂          0.286             0.588                      p < 0.01
θ₃          0.132             0.456                      p < 0.01
γ₁          −0.141            −0.290                     p < 0.01
γ₄          −0.141            −0.240                     p < 0.01
γ₆          0.124             0.205                      p < 0.05
γ₈          0.022             0.323                      p < 0.01

(a) Fixed coefficient, to set the scale of the latent variable; θᵢ = var(εᵢ), i = 1, 2, 3.
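As a concrete illustration of the mean-centering step described in footnote 13 (a minimal sketch with hypothetical data, showing why centering reduces the collinearity between a main effect and its product term):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 5-point scores for two predictors, e.g. x1 and x4
x1 = rng.integers(1, 6, size=180).astype(float)
x4 = rng.integers(1, 6, size=180).astype(float)

raw_corr = np.corrcoef(x1, x1 * x4)[0, 1]           # main effect vs. raw product term
x1c, x4c = x1 - x1.mean(), x4 - x4.mean()           # mean-center first ...
centered_corr = np.corrcoef(x1c, x1c * x4c)[0, 1]   # ... then form the product term
print(round(raw_corr, 2), round(centered_corr, 2))  # collinearity drops sharply
```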
5. Discussion and conclusions

Several years ago, Makridakis (1981, pp. 307–308) argued that forecasting "should not be judged on the simple accuracy criterion but its role should be enlarged and be concerned with its ability to improve the decision making within organizations." In this
spirit, the present study sought to bring together different criteria of forecast performance and assess their impact on managers' evaluations of forecasting effectiveness. Our MIMIC model depicting the effects of accuracy, bias and timeliness on effectiveness showed good fit to the data and enabled several hypotheses to be tested in a multivariate framework. The results obtained show that forecast accuracy is not the only criterion affecting the evaluation of overall forecasting effectiveness by managers. Thus, caution should be exercised when interpreting the findings of previous empirical studies in which accuracy and effectiveness are treated interchangeably (for a review, see Winklhofer et al., 1996). Moreover, only short-term accuracy seems to materially influence evaluations of effectiveness, whereas medium-term accuracy does not. Two possible explanations might be offered for this result. Firstly, it could be the case that, at least with respect to export operations, firms are primarily concerned with short-term performance and, therefore, place particular emphasis on short-term forecasting; in this context, there is evidence to suggest that the mental maps of export managers are indeed characterized by a short-term orientation (Madsen, 1998). Alternatively, given that an increase in the time horizon generally decreases forecast accuracy (e.g. see Dalrymple, 1987; Mentzer & Cox, 1984a,b; Small, 1980), it could be that managers simply consider medium-term forecast errors an unreliable criterion when judging the forecasting effectiveness of their firm (and, hence, ignore them altogether). Needless to say, both explanations are speculative and in need of empirical verification in future studies. Consistent with expectations, aspects of bias (overestimation) and timeliness also influence evaluations of forecasting effectiveness; indeed, lack of bias is just as important as (short-term) accuracy. Given the greater influence of overestimation on forecasting effectiveness, an empirical analysis of the relative importance that companies place on over- vs. underestimation (and the reasons behind it) is an obvious area for future research. Similarly, given the influence of timeliness considerations on forecasting effectiveness, the factors that facilitate (or hinder) timely receipt of forecasts by decision makers need further investigation. Contrary to expectations, cost was not found to impact on evaluations of effectiveness. One reason for this could be that firms do not keep records on forecasting expenditures and, thus, do not know what their costs are (Cerullo & Avila, 1975; Dalrymple, 1987; Rothe, 1978; Winklhofer & Diamantopoulos, 1996). Another possible reason could be the rather narrow measure of cost (data collection costs) used in the present study. Although such costs are clearly important, particularly in an exporting context (Craig & Douglas, 2000), they are only part of the total costs associated with forecasting. A more comprehensive cost measure, ideally capturing development, maintenance and operating costs (Armstrong, 1985), may well reveal associations with evaluations of effectiveness. Hopefully, this limitation of our study will be rectified in future research.
Although the export environment was found to be linked to evaluations of effectiveness, it did not play a moderating role with respect to the impact of accuracy, bias and timeliness; neither were any interactions established among the performance criteria themselves. It therefore seems that the way each performance criterion affects evaluations of effectiveness is not conditional on the influence of the other criteria or the influence of the environment. Thus, a main-effects-only model is sufficient to capture the impact of accuracy, bias and timeliness on effectiveness. Having said that, given that the model explains about 27% of the variance in forecasting effectiveness, a substantial proportion of variation has still to be accounted for (which implies that additional influences need to be considered). One such influence relates to the management of the forecasting process. Specifically, the way forecasts are communicated within the firm could be vital for their acceptance by decision makers (Wheeler & Shelley, 1987). Indeed, successful forecasting not only implies that forecasts are used but that they are used consistently by the various functional areas (Lawless, 1990). Further dimensions worth exploring relate to the units in which forecasts are presented and the extent to which they are regarded as appropriate by decision makers (Drumm, 1993), as well as the form of presentation of the forecast (Webby, 1994; Willemain, 1989). In terms of managerial implications, the findings show that, in evaluating forecasting effectiveness, decision makers take a balanced view, utilising multiple criteria. This is encouraging, as relying on a single criterion is not recommended, since "no single measure gives an unambiguous indication of forecast performance" (Mathews & Diamantopoulos, 1994, p. 410). It is also consistent with recommendations
offered for assessing forecasting techniques, suggesting that "multiple criteria are desirable in the selection and evaluation of forecasting techniques" (Yokum & Armstrong, 1995, p. 596), as well as with broader advice in the literature stating that forecasting should not be judged on the basis of accuracy alone (e.g. Hogarth & Makridakis, 1981). Thus, collectively, the findings offer insights into developing forecasting systems that are responsive to managers' perceptions. Regarding the particular combination of criteria underlying the evaluation of forecasting effectiveness, the sole emphasis on short-term accuracy is somewhat worrying, as it may reflect a tendency to use forecasts as input to short-term export planning only. If this is the case, then the full benefits of forecasting for planning purposes may not be capitalized upon. Accordingly, drawing managers' attention to how forecasting can improve decision making and planning beyond the short term (e.g. see Armstrong, 1983; Makridakis, 1990; Remus & Simkin, 1987) may well be warranted. Indeed, the relationship between forecasting and export planning has not been empirically investigated in the literature; this represents yet another opportunity for future research. As far as timeliness and costs are concerned, the emphasis on the former serves to highlight the importance of providing timely forecasts to management (Remus & Simkin, 1987). Unless forecasts become available to decision makers at the time they are needed, their value is practically lost. This implies that forecast preparers and forecast users must collaborate to ensure that the forecasting system is, in fact, in tune with the firm's decision cycle. Regarding cost, notwithstanding the aforementioned limitations in measurement, we cannot but share Dalrymple's (1987, p. 389) concern that business firms apparently treat forecasting "as a free good... Another possibility is that forecasting is treated so casually that it is not even given a formal budget. Obviously neither condition is particularly desirable." The finding that cost considerations did not impact on evaluations of effectiveness suggests that managers either lack the cost information necessary on which to base judgments regarding the efficiency of their forecasting activities or, if they do have it, they ignore it for some reason. From a practical implication perspective, an examination of decision makers' awareness and evaluation of forecasting expenditures is needed to shed light on the cost issue. In conclusion, the model developed in this paper should be seen as a starting point in modeling forecasting effectiveness. Future studies will hopefully build on it by adding appropriate antecedents, improving on its measurement, and expanding its explanatory power.

Appendix A. Variable operationalisation for MIMIC model

Overall forecasting effectiveness
y₁ = Overall, we are as good in forecasting export sales as any firm in our industry.
y₂ = Our export decision makers have a lot of confidence in our export sales forecasts.
y₃ = Compared to our competitors in export markets, our export sales forecasting capability is superior.

Accuracy
x₁ = Short-term MAPE (%)
x₂ = Medium-term MAPE (%)

Bias
x₃ = A frequent problem with our sales forecasts is that they tend to underestimate export sales.
x₄ = We usually tend to overestimate what can be sold in our export markets.
Timeliness
x₅ = By the time our export sales forecasts have been prepared, important export decisions have already been made.*
x₆ = Decision makers in our firm often receive export sales forecasts too late to be of any real use.*

(* Recoded items (reverse directionality); y₁–y₃ and x₃–x₈ all measured on five-point Likert scales where 5 = strongly agree and 1 = strongly disagree.)

Cost
x₇ = The costs of obtaining data useful for export sales forecasting purposes are often prohibitive.

Environmental turbulence
x₈ = Composite measure obtained by summing the scores on Jaworski and Kohli's (1993) multi-item scales capturing technological, customer, and competitor turbulence, respectively.
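As a small illustration of how the starred (recoded) items and the turbulence composite are scored (a sketch under our own assumptions; the item-level wording of the Jaworski and Kohli (1993) subscales is not reproduced in the paper):

```python
import numpy as np

def reverse_code(scores, scale_max=5):
    """Reverse a Likert item (x5 and x6 are recoded so that higher = more timely)."""
    return (scale_max + 1) - np.asarray(scores)

# Hypothetical subscale scores for two respondents: technological, customer
# and competitor turbulence are summed into the composite x8.
tech = np.array([18, 22])
cust = np.array([15, 19])
comp = np.array([17, 21])
x8 = tech + cust + comp

print(reverse_code([1, 2, 5]))  # -> [5 4 1]
print(x8)                       # -> [50 62], on the scale of Appendix B (mean 50.71)
```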
Appendix B

B.1. Data characteristics

Variable   Mean    Std. deviation   Minimum   Maximum
y1         3.46    0.67             1         5
y2         3.33    0.71             2         5
y3         3.02    0.54             1         5
x1         2.17    1.09             1         5
x2         2.54    1.29             1         5
x3         2.55    0.80             1         4
x4         2.77    0.90             1         5
x5         3.45    0.88             2         5
x6         3.89    0.98             1         5
x7         2.81    0.96             1         5
x8         50.71   7.56             32.0      73.0

B.2. Correlation matrix

      y1      y2      y3      x1      x2      x3      x4      x5      x6      x7      x8
y1    1.000
y2    0.457   1.000
y3    0.582   0.465   1.000
x1    0.246   0.277   0.176   1.000
x2    0.294   0.235   0.159   0.605   1.000
x3    0.039   0.017   0.002   0.016   0.069   1.000
x4    0.193   0.186   0.143   0.200   0.488   0.105   1.000
x5    0.088   0.073   0.075   0.185   0.107   0.315   0.056   1.000
x6    0.226   0.253   0.199   0.347   0.299   0.216   0.138   0.374   1.000
x7    0.036   0.123   0.046   0.171   0.128   0.211   0.077   0.147   0.189   1.000
x8    0.163   0.095   0.122   0.225   0.209   0.067   0.249   0.130   0.066   0.213   1.000
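Since the covariance matrix of the observed variables served as model input (Section 4.1), it can be recovered from the Appendix B statistics as S = D·R·D, where D is the diagonal matrix of standard deviations and R the correlation matrix; a sketch (only the first few correlations are filled in here for brevity):

```python
import numpy as np

sd = np.array([0.67, 0.71, 0.54, 1.09, 1.29, 0.80,
               0.90, 0.88, 0.98, 0.96, 7.56])     # std. deviations from B.1

R = np.eye(11)                 # lower triangle to be filled from B.2, e.g.:
R[1, 0] = 0.457                # corr(y2, y1)
R[2, 0], R[2, 1] = 0.582, 0.465
R[3, 0], R[3, 1], R[3, 2] = 0.246, 0.277, 0.176
# ... remaining entries as listed in B.2 ...
R = R + np.tril(R, -1).T       # mirror the lower triangle to the upper

S = np.diag(sd) @ R @ np.diag(sd)   # covariance matrix: S = D R D
print(np.round(S[:3, :3], 3))
```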
Armstrong, J. S., & Overton, T. S. (1977, August). Estimating nonresponse bias in mail surveys. Journal of Marketing Research, 14, 396-402.
Bagozzi, R. P., & Fornell, C. (1982). Theoretical concepts, measurements, and meaning. In C. Fornell (Ed.), A second generation of multivariate analysis (pp. 24-38). New York: Praeger.
Bagozzi, R. P., & Yi, Y. (1988). On the evaluation of structural equation models. Journal of the Academy of Marketing Science, 16, 74-94.
Baumgartner, H., & Homburg, C. (1996). Applications of structural equation modelling in marketing and consumer research: A review. International Journal of Research in Marketing, 13, 139-161.
Bentler, P. M. (1990). Comparative fit indices in structural models. Psychological Bulletin, 107, 238-246.
Bollen, K. A. (1984). Multiple indicators: Internal consistency or no necessary relationship? Quality and Quantity, 18, 377-385.
Bollen, K. A., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110(2), 305-314.
Bollen, K. A., & Long, J. S. (1993). Testing structural equation models. Newbury Park, CA: Sage.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen, & J. S. Long (Eds.), Testing structural equation models. Newbury Park, CA: Sage.
Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS and SIMPLIS: Basic concepts, applications and programming. Mahwah, NJ: Lawrence Erlbaum Associates.
Carbone, R., & Armstrong, J. S. (1982). Evaluation of extrapolative forecasting methods: Results of a survey of academicians and practitioners. Journal of Forecasting, 1, 215-217.
Cerullo, M. J., & Avila, A. (1975). Sales forecasting practices: A survey. Managerial Planning, 24, 33-39.
Churchill Jr., G. A. (1999). Marketing research: Methodological foundations (7th ed.). Orlando, FL: Dryden Press.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York: Academic Press.
Craig, C. S., & Douglas, S. P. (2000). International marketing research (2nd ed.). Chichester: Wiley.
Dalrymple, D. J. (1975). Sales forecasting methods and accuracy. Business Horizons, 18, 69-73.
Dalrymple, D. J. (1987). Sales forecasting practices. International Journal of Forecasting, 3, 379-391.
Diamantopoulos, A., & Siguaw, J. A. (2000). Introducing LISREL. London: Sage Publications.
Drumm, W. J. (1993). Forecasting by consensus is riskier than it sounds. Journal of Business Forecasting, 12(1), 22-23.
Erickson, G. M. (1987). Marketing managers need more than forecasting accuracy. International Journal of Forecasting, 3, 453-455.
Ermer, C. M. (1991, Spring). Cost of error affects the forecasting model selection. Journal of Business Forecasting, 10-12.
Faul, F., & Erdfelder, E. (1992). GPOWER: A priori, post-hoc, and compromise power analysis for MS-DOS. Bonn, Germany: Dept. of Psychology, Bonn University.
Fildes, R. (1992). The evaluation of extrapolative forecasting methods. International Journal of Forecasting, 8, 81-98.
Fornell, C., Rhee, B.-D., & Yi, Y. (1991). Direct regression, reverse regression, and covariance structure analysis. Marketing Letters, 2(3), 309-320.
Gatignon, H. (1993). Marketing-mix models. In J. Eliashberg, & G. L. Lilien (Eds.), Handbooks in operations research and management science, Vol. 5 (pp. 697-732). Amsterdam: North-Holland.
Hagdorn-van der Meijden, L., Van Nunen, J., & Ramondt, A. (1994). Forecasting: Bridging the gap between sales and manufacturing. International Journal of Production Economics, 37(1), 101-114.
Hair Jr., J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Multivariate data analysis (5th ed.). Hemel Hempstead: Prentice-Hall International.
Herbig, P., Milewicz, J., & Golden, J. E. (1994). Differences in forecasting behaviour between industrial product firms and consumer product firms. Journal of Business and Industrial Marketing, 9, 60-69.
Hogarth, R. M., & Makridakis, S. (1981). Forecasting and planning: An evaluation. Management Science, 27, 115-138.
Hoyle, R. H., & Panter, A. T. (1995). Writing about structural equation models. In R. H. Hoyle (Ed.), Structural equation modelling (pp. 158-176). Newbury Park, CA: Sage.
Jaccard, J. C., Turrisi, R., & Wan, C. K. (1990). Interaction effects in multiple regression. Newbury Park, CA: Sage.
Jaccard, J. C., & Wan, C. K. (1996). LISREL approaches to interaction effects in multiple regression. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-114. Thousand Oaks, CA: Sage.
Jaworski, B. J., & Kohli, A. K. (1993). Market orientation: Antecedents and consequences. Journal of Marketing, 57, 53-70.
Jeannet, J.-P., & Hennessey, H. D. (2001). Global marketing strategies (5th ed.). Boston, MA: Houghton Mifflin.
Jobber, D., & Blaesdale, M. J. R. (1987). Interviewing in industrial market research: The state-of-the-art. Quarterly Review of Marketing, 12(2), 7-11.
Jöreskog, K. G. (1971). Statistical analyses of sets of congeneric tests. Psychometrika, 36, 109-134.
Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631-639.
Jöreskog, K. G., & Sörbom, D. (1982). Recent developments in structural equation modelling. Journal of Marketing Research, 19, 404-416.
Jöreskog, K. G., & Sörbom, D. (1989). LISREL 7: A guide to the program and applications (2nd ed.). Chicago, IL: SPSS.
Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8: User's reference guide. Mooresville, IN: Scientific Software.
Klein, L. R. (1984). The importance of the forecast. Journal of Forecasting, 3(1), 1-9.
Lawless, M. (1990). Effective sales forecasting: A management tool. Journal of Business Forecasting, 9(1), 2-11.
Leslie, L. L. (1972). Are high response rates essential for valid surveys? Social Science Research, 1, 323-334.
Long, J. S. (1983). Covariance structure models: An introduction to LISREL. Beverly Hills, CA: Sage Publications.
Madsen, T. K. (1998). Executive insights: Managerial judgment of export performance. Journal of International Marketing, 6(3), 82-93.
Mahmoud, E. (1984). Accuracy in forecasting: A survey. Journal of Forecasting, 3, 139-159.
Mahmoud, E. (1987). The evaluation of forecasts. In S. Makridakis, & S. C. Wheelwright (Eds.), The handbook of forecasting: A manager's guide (2nd ed.). New York: Wiley.
Mahmoud, E., Rice, G., & Malhotra, N. (1986). Emerging issues in sales forecasting and decision support systems. Journal of the Academy of Marketing Science, 16, 47-61.
Makridakis, S. (1981). Forecasting accuracy and the assumption of constancy. International Journal of Management Science, 9(3), 307-311.
Makridakis, S. (1990). Forecasting, planning and strategy for the 21st century. London: The Free Press.
Makridakis, S. (1993). Accuracy measures: Theoretical and practical concerns. International Journal of Forecasting, 9(4), 527-529.
Marsh, H. W., Balla, J. R., & Han, K.-T. (1996). An evaluation of incremental fit indices: A classification of mathematical and empirical properties. In G. A. Marcoulides, & R. E. Schumacker (Eds.), Advanced structural equation modelling: Issues and techniques (pp. 315-353). Mahwah, NJ: Lawrence Erlbaum Associates.
Martin, C., & Witt, C. (1988). Forecasting performance. Tourism Management, 9(4), 326-329.
Mathews, B. P., & Diamantopoulos, A. (1994). Towards a taxonomy of forecast error dimensions. Journal of Forecasting, 13, 409-416.
McHugh, A. K., & Sparkes, J. R. (1983). The forecasting dilemma. Management Accounting, 61(3), 30-34.
Mentzer, J., & Kahn, K. (1995). Forecasting technique familiarity, satisfaction, usage and application. Journal of Forecasting, 14(5), 465-476.
Mentzer, J. T., & Cox Jr., J. E. (1984a). A model of the determinants of achieved forecast accuracy. Journal of Business Logistics, 5, 143-155.
Mentzer, J. T., & Cox Jr., J. E. (1984b). Familiarity, application and performance of sales forecasting techniques. Journal of Forecasting, 3, 27-36.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Peterson, R. T. (1989). Sales force composite forecasting: An exploratory analysis. Journal of Business Forecasting, 8, 23-27.
Peterson, R. T. (1990). The role of experts' judgment in sales forecasting. Journal of Business Forecasting, 9(2), 16-21.
Remus, W., & Simkin, M. G. (1987). Integrating forecasting and decision making. In S. Makridakis, & S. C. Wheelwright (Eds.), The handbook of forecasting: A manager's guide (2nd ed.). New York: Wiley.
Reynolds, N., & Diamantopoulos, A. (1998). The effect of pretest method on error detection rates: Experimental evidence. European Journal of Marketing, 32(5/6), 480-498.
Rothe, J. (1978). Effectiveness of sales forecasting methods. Industrial Marketing Management, 7, 114-118.
Sanders, N. R., & Manrodt, K. B. (1994). Forecasting practices in US corporations: Survey results. Interfaces, 24(2), 92-100.
Sanders, N. R., & Ritzman, L. P. (2001). Judgmental adjustment of statistical forecasts. In J. S. Armstrong (Ed.), Principles of forecasting: A handbook for researchers and practitioners (pp. 405-416). Boston, MA: Kluwer Academic Publishing.
Schmidt, F. L., & Hunter, J. E. (1978). Moderator research and the law of small numbers. Personnel Psychology, 31, 215-232.
Schnaars, S. P. (1984). Situational factors affecting forecasting accuracy. Journal of Marketing Research, 21, 290-297.
Sharma, S. (1996). Applied multivariate techniques. New York: Wiley.
Small, R. L. (1980). Sales forecasting in Canada: A survey of practices. The Conference Board of Canada, Study No. 66.
Steenkamp, J.-B. E. M., & Van Trijp, H. C. M. (1991). The use of LISREL in validating marketing constructs. International Journal of Research in Marketing, 8, 283-299.
Tull, D. S., & Hawkins, D. I. (1993). Marketing research: Measurement and method (6th ed.). New York: Macmillan.
Watson, D. C. (1996). Forecasting in the Scottish electronics industry. International Journal of Forecasting, 12, 361-371.
Webby, R. (1994). Graphical support for the integration of event information into time series forecasting. Unpublished PhD dissertation, University of New South Wales.
West, D. C. (1994). Number of sales forecast methods and marketing management. Journal of Forecasting, 13(4), 395-407.
Wheeler, D. R., & Shelley, C. J. (1987). Toward more realistic forecasts for high-technology products. Journal of Business and Industrial Marketing, 2(3), 55-63.
White, H. R. (1986). Sales forecasting: Timesaving and profit-making strategies that work. London: Scott, Foresman and Company.
Willemain, T. (1989). Graphical adjustments of statistical forecasts. International Journal of Forecasting, 5, 179-185.
Winklhofer, H., & Diamantopoulos, A. (1996). First insights into export sales forecasting practice: A qualitative study. International Marketing Review, 13(4), 52-81.
Winklhofer, H., Diamantopoulos, A., & Witt, S. F. (1996). Forecasting practice: A review of the empirical literature and an agenda for future research. International Journal of Forecasting, 12, 193-221.
Wiseman, F., & Billington, M. (1984, August). Comment on a standard definition of response rates. Journal of Marketing Research, 21, 336-338.
Yokum, J. T., & Armstrong, J. S. (1995). Beyond accuracy: Comparison of criteria used to select forecasting methods. International Journal of Forecasting, 11, 591-597.