RESEARCH NOTE

Adapting Surveys for Nonprofit Research

William J. Ritchie, John J. Sherlock

Many useful survey instruments have been developed in the for-profit management arena, but they often require varying levels of adaptation for relevant application in the nonprofit context. This research note first explains key steps in the process for adapting and testing a survey instrument. It then illustrates how each step should be applied and reported using a case study adaptation of Hodgkinson's (1992) strategic locus of control instrument for nonprofit use.

IN RECENT YEARS, studies have highlighted various methodological issues associated with survey-related research in the nonprofit sector. Authors have addressed a number of topics, such as the impact of survey responses on measure interpretation (Brooks, 2004), issues with data collection (Brudney and Gazley, 2006; Rooney, Steinberg, and Schervish, 2004), and alternative methods of data collection (Markham and Couldry, 2007; Hager, Wilson, Pollak, and Rooney, 2003; Kennedy and Vargus, 2001). Since survey research plays such an important role in nonprofit data collection efforts, it is incumbent on authors to ensure that their instruments are valid proxies for the phenomena they intend to measure. However, instrument validation is often a complicated process, requiring expertise in multiple disciplines. For example, proper instrument evaluation requires skills in relevant research theory, statistical methods, and practice. Further complicating the survey instrument evaluation process is the fact that resources on statistical topics are dispersed among multiple academic disciplines with varying perspectives on research methods. In fact, many of the resources relating to research surveys are developed and validated in contexts outside the nonprofit organization domain and require some level of adaptation when used in the nonprofit domain.
For example, questionnaire wording in the for-profit realm often uses terms such as competitor, industry, and corporation.

NONPROFIT MANAGEMENT & LEADERSHIP, vol. 19, no. 3, Spring 2009 © 2009 Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com).
Figure 1. Scale Reporting Concepts: theoretical background of survey instrument → selection of scale → sample and scale descriptives → scale validity and evaluation → scale testing with factor analysis

Nonprofit researchers frequently undertake minor revisions to adapt such instruments to the nonprofit context, assuming no impact on the psychometric properties of the original survey instrument. Often, detailed documentation of validation efforts for these adapted surveys is not reported or, worse, not undertaken at all. The net effect of these practices presents three issues for nonprofit research. First, limited documentation may prompt readers to question the efficacy of the instrument. Second, there may be no validation paper trail for other nonprofit researchers and practitioners who desire to implement the same survey in other nonprofit contexts. Third, inappropriately applied instruments may produce studies with unrecognized Type II errors, where inaccurate measurements lead to invalid conclusions given the study's context.

This research note first explains key steps in the process for adapting and testing a survey instrument (Figure 1). It then illustrates how each step should be applied and reported using a case study adaptation of Hodgkinson's (1992) strategic locus of control (SLOC) instrument for nonprofit use.

Theoretical Background of Survey Instrument

Research studies with survey data should provide the theoretical background of the survey instrument. This should include citations indicating where the original instrument was introduced into the literature, as well as construct definitions, descriptions of the instrument, and the number of items in the scale. For example, in the case study used for this research note, we adapted and tested Hodgkinson's (1992) version of the Rotter (1966) locus of control (LOC) scale.
As such, an appropriate paragraph explaining the theoretical background might be written as follows: Based on Rotter's (1966) twenty-nine-item scale (twenty-three items without filler items), LOC has been used
to describe the various beliefs that people hold regarding whether reinforcements of their actions and outcomes are largely determined internally, by the individual actor, or externally, by events that are largely driven by uncontrollable forces (Miller, Kets de Vries, and Toulouse, 1982; Rotter, 1966). Those who have internal loci of control believe that achievements, success, and personal accomplishments are largely due to their own actions and initiatives (Carpenter and Golden, 1997). Conversely, those who have external loci of control believe that forces such as fate or luck largely determine their level of achievement.

In cases where the scale has been adapted, it is also necessary to present a rationale for such changes, for example: The Rotter (1966) scale has received criticism through the years from researchers in a variety of disciplines (Spector, 1982; Boone, 1988; Hodgkinson, 1992). For example, using a sample of nonprofit leaders, Adeyemi-Bello (2001) identified various validity issues with the Rotter (1966) scale, suggesting that there may be too many questions in the instrument to measure a single construct. In the strategy literature, Hodgkinson (1992) argued that the Rotter (1966) Internal-External (I-E) scale encourages individual respondents to present themselves in a socially desirable manner, thus calling into question the reliability of many traditional LOC findings. Furthermore, he suggested that the LOC construct is best evaluated with custom scales that are specific to unique disciplines of interest. To this end, Hodgkinson (1992) developed and validated the sixteen-item strategic locus of control (SLOC) scale.

Selection of Scale

Exemplary studies should offer a solid rationale for the selection of a particular scale. Van Maanen, Sorensen, and Mitchell (2007) provide an in-depth discussion relating to this topic.
The description of the scale used in a study should contain direct theoretical connections. The following provides an example: The Hodgkinson (1992) SLOC was selected for this analysis for a number of reasons. First, as mentioned earlier, the SLOC scale was shown to overcome many of the Rotter (1966) scale shortcomings. Second, the strategic orientation of the Hodgkinson (1992) scale is particularly relevant to nonprofit researchers in the light of recent calls in the literature for understanding relationships between strategic-level constructs
such as effectiveness and top manager behaviors (Ritchie and Eastwood, 2006; Stone, Bigelow, and Crittenden, 1999; Ritchie and Kolodinsky, 2003). The for-profit research linking LOC with organizational performance (Fusilier, Ganster, and Mayes, 1987; Govindarajan, 1989; Hollenbeck, Brief, Whitener, and Pauli, 1988; Mia, 1987; Storms and Spector, 1987) provides a rich background for similar studies in the nonprofit realm. Finally, the SLOC scale provides an opportunity to illustrate the types of adaptations necessary to bring traditional for-profit surveys into the nonprofit context.

Survey instruments should be as context specific as possible (Adler and Weiss, 1988). Research on the LOC construct has a history of adaptation to different contexts. As indicated by Hodgkinson (1992), examples of contextual adaptations can be found in research domains such as politics (Davis, 1983), economics (Furnham, 1986), and work settings (Spector, 1988).

Sample and Scale Descriptives

The sampling technique and questionnaire administration should be reported. This should include a description of the representative sample. A representative sample suggests that the survey respondents are well matched with the survey content. For example, a sample of customer service representatives in an organization would not be considered representative if the survey instrument was designed to measure organizational-level constructs such as the strategy formulation process. Sample characteristics such as sample size, item-to-response ratios, and response rates should also be reported.
Finally, scale-specific items such as number of items in the measure, the use of negatively worded items (if appropriate), and the response type (for example, Likert-type items) as well as the metrics (for example, range of the scale and interval) should also be reported (see Hinkin, 1995). A description of the sample and scale descriptives from the SLOC case study is as follows: The data in the current study were obtained from a representative sample of chief executives of nonprofit university and college foundations registered under IRS code, section 501(c)(3). The survey instrument was mailed to a nationwide sample of 263 chief executives of higher education foundations. The sample was identified using the National Center for Charitable Statistics (NCCS) (1998) database of nonprofit organizations. One hundred forty-four executives completed and returned the questionnaire, generating a response rate of 55 percent. The SLOC scale measured two opposing dimensions, internal and external LOC, to minimize response
pattern bias. Following Hodgkinson's (1992) original scale composition, we reverse-coded the eight internally worded items in the scale to an external orientation, resulting in a sixteen-item scale where higher numbers indicate greater external LOC orientation. The scale mean was 2.49 (using a Likert scale where 5 = strongly agree and 1 = strongly disagree) with a standard deviation of .38. Internal consistency analysis revealed a Cronbach's alpha (discussed later) coefficient of .74 for the sixteen-item scale. Corrected item-total correlations are presented in Table 1 and inter-item correlations are presented in Table 3.

Table 1. Strategic Locus of Control Scale
(R = reverse-coded item; corrected item-total correlation in parentheses)

1. There is very little my foundation can do in order to change the rules of competition among higher education foundations. (0.49)
2. (R) Many of the problems experienced by foundations can be avoided through careful planning and analysis. (0.15)
3. To a great extent the competitive environment in which my foundation operates is shaped by forces beyond its control. (0.50)
4. (R) Becoming a successful foundation is a matter of creating opportunities; luck has little or nothing to do with it. (0.16)
5. There is little point in the majority of institutionally related foundations taking an active interest in the wider concerns of other foundations because only the larger, more powerful foundations have any real influence. (0.33)
6. It is not always wise to make strategic plans far ahead because many things turn out to be a matter of good or bad fortune anyway. (0.33)
7. (R) My foundation can pretty much accomplish whatever it sets out to achieve. (0.29)
8. (R) Most foundations can have an influence in shaping the structure of the market. (0.36)
9. As regards competing in the marketplace, most foundations are victims of forces they cannot control. (0.57)
10. There is little point in engaging in strategic analyses and planning because often events occur that my foundation cannot control. (0.35)
11. (R) Usually foundations fail because they have not taken advantage of their opportunities. (0.15)
12. (R) My foundation is able to influence the basis on which it competes with other foundations. (0.59)
13. Foundations that rarely experience strategic problems are just plain lucky. (0.10)
14. (R) There is a direct connection between the interest you take in your competitors' foundations and the success of your own foundation. (0.01)
15. (R) My foundation has a direct role in shaping the environment in which it competes. (0.42)
16. Market opportunities in higher education foundations are largely predetermined by factors beyond my foundation's control. (0.60)
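The reverse-coding and the corrected item-total correlations reported in Table 1 can be computed directly from the raw response matrix. The sketch below uses randomly generated placeholder responses rather than the study's data; only the matrix dimensions and the reverse-coded item positions follow the case study.

```python
import numpy as np

# Hypothetical responses: 144 respondents x 16 Likert items (1-5).
# Random placeholders, not the study's data.
rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(144, 16)).astype(float)

# Reverse-code the eight internally worded items (scale items 2, 4, 7,
# 8, 11, 12, 14, 15) so that higher scores indicate external LOC.
reverse = [1, 3, 6, 7, 10, 11, 13, 14]  # 0-indexed positions
responses[:, reverse] = 6 - responses[:, reverse]

# Scale mean and standard deviation across respondents.
scale_scores = responses.mean(axis=1)
mean, sd = scale_scores.mean(), scale_scores.std(ddof=1)

# Corrected item-total correlation: each item against the total of the
# remaining items, so the item is not correlated with itself.
totals = responses.sum(axis=1)
corrected = [
    np.corrcoef(responses[:, i], totals - responses[:, i])[0, 1]
    for i in range(16)
]
print(round(mean, 2), round(sd, 2))
print([round(r, 2) for r in corrected])
```

Subtracting the focal item from the total before correlating is what makes the statistic "corrected"; correlating an item with a total that includes it inflates the coefficient.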
Table 2. Nested Model Comparisons, Reverse-Coded to External Emphasis

Model | RMR | RMSEA | AGFI | GFI | χ² | p | Δχ² | df | RFI | PNFI | CFI | Covariance to be added
1 | .089 | .12 | .72 | .79 | 314 | .00 | — | 104 | .43 | .44 | .60 | items 12 and 15
2 | .076 | .078 | .81 | .86 | 193 | .00 | p < .001 | 103 | .63 | .58 | .81 | items 1 and 16
3 | .075 | .065 | .83 | .87 | 164 | .00 | p < .001 | 102 | .67 | .61 | .86 | items 6 and 10
4 | .073 | .054 | .85 | .89 | 143 | .004 | p < .001 | 101 | .70 | .63 | .89 | items 7 and 12
5 | .068 | .049 | .86 | .89 | 134 | .012 | p < .001 | 100 | .72 | .64 | .91 | items 7 and 15
6 | .057 | .024 | .88 | .91 | 106 | .28 | p < .001 | 99 | .78 | .67 | .97 | —

Note: N = 142, using maximum likelihood estimation.

Scale Validity and Evaluation

In the adaptation or development of survey instruments, it is incumbent on the researcher to ensure that the questionnaire items accurately measure the intended phenomena of interest. Researchers have referred to this condition as the construct validity of the instrument (Bagozzi, Yi, and Phillips, 1991). However, due to the conceptual nature of the phenomena being measured, validity is difficult to ascertain with a single measurement and typically entails triangulating on multiple measurements (Schwab, 2005). A comprehensive review of these concepts is beyond the scope of this study since there are numerous statistical references on the market (for example, Schwab, 2005; Alreck and Settle, 1995). However, a brief overview of key indicators is discussed in the following paragraphs to serve as a point of departure for more in-depth study.

Content and Face Validity

A common starting point in scale construction is the assessment of content and face validities. Fulfillment of these criteria is considered by researchers to be a minimum requirement of a valid scale (Hinkin, 1995). Content validity is present when survey questionnaire items correctly characterize the phenomena of interest. This is typically assessed by having experts knowledgeable in the field of research review questionnaire items to ensure that they fit with the phenomena of study.
With regard to face validity, a measure meets this criterion if it appears valid to a sample similar to the study participants. When original scales are edited to fit a specific context, both content and face validity should be evaluated, and changes to the scale should be reported. This simple act of disclosure can provide valuable validation information to researchers desiring to conduct additional research, whether by adapting the scale for their own context or by using the adapted scale as is. The following example provides an application of these conditions for the adaptation of the SLOC instrument:
Upon review of the Hodgkinson (1992) SLOC instrument, practicing nonprofit managers and researchers in the field suggested changes in item wording to ensure content and face validity. To this end, terms such as corporation and industry were eliminated and replaced with more appropriate terms such as education foundation, other foundations, and nonprofit sector to fit the context. See Table 1 for a complete listing of the sixteen-item scale and changes to scale wording.

Internal Consistency/Reliability

Internal consistency/reliability is a type of convergent validity in which the correlation among the items in a scale is evaluated. Consistency/reliability is defined as "the percent of variance in an observed variable that is accounted for by the true scores on the underlying construct" (Hatcher and Stepanski, 1994, p. 507). It represents the degree to which the measurement scores are free from random error. A common measure of internal consistency is the coefficient alpha (Cronbach and Meehl, 1955; Cronbach, 1951). With a reporting range from zero to one, this measure provides an indicator of the extent to which items in a given scale or construct measurement are correlated with each other. Survey instruments with scales displaying alphas equal to or greater than the threshold level of .70 are generally considered to be internally consistent (Hatcher and Stepanski, 1994; Nunnally, 1978). Although internal consistency is the most common form of reliability reported, two additional types of reliability are worthy of note: interrater and stability reliability. Interrater reliability refers to the consistency of construct measurement across multiple raters, and stability (for example, test-retest) reliability involves consistency over time (Vogt, 1999). While reliability is statistically necessary for a valid construct, it is not sufficient in and of itself to establish construct validity.
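Coefficient alpha is straightforward to compute from the item variances and the variance of the summed scale. The following is a minimal sketch; the simulated data (six items driven by one latent trait, two hundred respondents) are illustrative assumptions, not the case study's data.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents x n_items) matrix."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Simulated example: six items that share one latent trait plus noise,
# so the items intercorrelate and alpha should clear the .70 threshold.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
items = latent + 0.8 * rng.normal(size=(200, 6))
print(round(cronbach_alpha(items), 2))
```

Because alpha rises mechanically with the number of items, a high alpha on a long scale is weaker evidence of item homogeneity than the same alpha on a short one.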
There is extensive literature on other proxies for construct validity, such as criterion and external validity. The following section provides an overview of factor analysis, a form of discriminant validity. Nunnally (1978) also referred to this criterion as factorial validity.

Scale Testing with Factor Analysis

Factor analytic techniques have been a widely used means of assessing construct validity (Thompson and Daniel, 1996). Researchers use exploratory factor analysis to evaluate survey questionnaire items that they believe to be representative of one or more constructs (for example, scales). The analysis is exploratory in the sense that the researcher uses the resulting coefficients (also referred to as factor loadings, representing the strength of membership of a given survey item on a hypothesized construct) as a tool to substantiate the retention of items in a construct; rival theoretical models are not tested. Furthermore,
various methods may provide evidence of the factor structure, such as the percentage of explained variance, eigenvalues greater than one, and rotation methods. Exploratory factor analysis may be undertaken with many popular statistical software packages, such as SPSS and SAS.

By contrast, confirmatory factor analysis (CFA) requires that the researcher determine construct operationalization a priori and then test this model against rival models (Thompson and Daniel, 1996). This method requires the use of structural equation modeling software (such as LISREL, EQS, and Amos), in which the covariance structures of a hypothesized model are evaluated for fit. In this case, the model represents the sum total of the proposed linkages between individual survey items and the constructs or the phenomenon of interest. It is noteworthy that model assessment is often performed on multiple constructs simultaneously and compared with several other models (involving a nesting of models), including a null model in which all survey items compose a single construct (Kline, 1998; Kelloway, 1998). The analysis is confirmatory in the sense that evidence of improved fit accrues as the items in the null model are parsed into additional constructs, ultimately ending with all the domains of interest to the researcher. A truly rigorous application of CFA requires that the researcher report goodness-of-fit indexes and chi-square difference tests (for competing nested models) to show the superiority of the final hypothesized model over the null model. A sample paragraph introducing the factor analytic techniques might be written as follows: The sixteen items, representative of the LOC construct, were subjected to a confirmatory factor analysis (CFA) using the covariance matrix and maximum likelihood estimation as implemented in LISREL version 8.7 (Jöreskog and Sörbom, 1993, 2004).
With 144 survey responses in this study, our sample was deemed adequate for the study (see Kline, 1998; Kelloway, 1998; Anderson and Gerbing, 1984, 1988; Bentler and Chou, 1987). The structural model for this analysis is presented in Figure 2.

As mentioned earlier, this case study used the process of nested model comparisons (for an extended discussion of CFA, nested models, and reporting standards, see Gorsuch, 1983; Anderson and Gerbing, 1988; Kelloway, 1998; Kline, 1998) to evaluate sequential theoretical models. The process of examining nested models involves testing a sequence of structural models beginning with a baseline or null model (Kelloway, 1998). The baseline model is then modified and reevaluated with subsequent structural models using the sequential chi-square difference test, which is calculated by subtracting the chi-square values between sequentially nested models. Each subsequent model in our analysis displayed (n − 1) degrees of freedom from the
previous model, to facilitate accurate interpretation of the chi-square comparison. Since the chi-square statistic has been shown to be asymptotically independent (Steiger, Shapiro, and Browne, 1985), the difference score can be readily evaluated for significance using standard chi-square tables.

Figure 2. Conceptual Model: Completely Standardized Solution, Final Model (Model 6). (Path diagram of the sixteen items loading on the external LOC construct, with standardized loadings and error terms; diagram not reproduced.)

A sample write-up for the nesting process is as follows: The nested model comparisons are presented in Table 2. The baseline (null) model (model 1) did not provide an adequate
Table 3. Correlation Matrix of Questionnaire Items
(Items ordered 2, 4, 7, 8, 11, 12, 14, 15, 1, 3, 5, 6, 9, 10, 13, 16; lower triangle shown. Each row lists the item's correlations with the items preceding it in this order.)

Item 4: .13
Item 7: .095, .13
Item 8: .190*, .016, .198*
Item 11: .122, .159, .08, .089
Item 12: .016, .077, .259**, .332**, .264**
Item 14: .026, .046, .016, .207*, .027, .018
Item 15: .089, .037, .152, .294**, .196*, .625**, .184*
Item 1: .165*, .056, .244**, .279**, .048, .338**, .076, .286**
Item 3: .052, .106, .171*, .099, .022, .342**, .149, .226**, .420**
Item 5: .066, .33, .135, .149, .033, .166*, .109, .143, .316**, .332**
Item 6: .172*, .170*, .076, .170*, .041, .188*, .065, .02, .153, .308**, .223**
Item 9: .064, .133, .208*, .260**, .128, .410**, .129, .255**, .370**, .493**, .373**, .233**
Item 10: .226**, .087, .122, .156, .039, .245**, .013, .049, .161, .293**, .148, .438**, .229**
Item 13: .026, .013, .091, .12, .112, .084, .184*, .026, .004, .224**, .087, .065, .188*, .094
Item 16: .065, .119, .267**, .284**, .12, .446**, .052, .428**, .489**, .401**, .277**, .187*, .504**, .189*, .08

n per item (in the same order): 144, 144, 143, 143, 144, 143, 144, 143, 143, 142, 144, 144, 144, 144, 144, 144

*Correlation significant at the p < .05 level (two-tailed). **Correlation significant at the p < .01 level (two-tailed).
fit for the data, as evidenced by the low fit indexes and high ratio of chi-square to df (chi-square = 314 (df = 104), GFI = .79, RMSEA = .12). Modification indexes generated by LISREL were examined to determine if the addition of potential covariances improved the psychometric properties of the model. It is noteworthy that paths added to the model must make sense theoretically. As covariances were added between instrument items, we evaluated comparative fit measures and chi-square difference tests to ensure that each additional covariance made a substantial contribution to the model. The final model (model 6) provided a superior fit to the data (chi-square = 107.65 (df = 99), GFI = .91, RMSEA = .025, RMR = .05).

Discussion

This research note provides nonprofit researchers and managers with a brief primer on how to evaluate the psychometric properties of a revised or adapted survey instrument. As such, we have emphasized the importance of not assuming that a modified scale is valid. It is equally important to report modifications to existing scales and validity tests, not only to demonstrate empirical rigor but also to help other researchers who desire to use adapted scales to further the field of nonprofit research. We have identified five key areas that should be reported when conducting survey-related research. First, the instrument's theoretical background should be explained, including relevant citations in the literature. Second, the researcher should provide a solid rationale for the selection of the scale, emphasizing the relationship between applicable theory and the study at hand. Third, the sample and scale descriptives should be reported. Fourth, since scale validity can be accurately assessed only by triangulating on multiple measures, it is incumbent on the researcher to provide several different validity tests. Finally, the scale should be subjected to CFA.
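The sequential chi-square difference test behind the nested model comparisons above is easy to reproduce by hand. The sketch below uses the baseline and first modified models from the case study (chi-square 314 with 104 df versus 193 with 103 df); the closed-form survival function applies only because the two models differ by a single degree of freedom.

```python
import math

def chi2_sf_1df(x: float) -> float:
    """P(X > x) for a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Fit statistics for two sequentially nested models (baseline model 1
# and model 2 from the case study's nested-model comparisons).
chisq_base, df_base = 314.0, 104
chisq_mod, df_mod = 193.0, 103

delta_chisq = chisq_base - chisq_mod  # chi-square difference
delta_df = df_base - df_mod           # difference of 1 df
p = chi2_sf_1df(delta_chisq)
print(delta_chisq, delta_df, p < .001)
```

For differences of more than one degree of freedom, the same logic applies, but the p-value must come from a chi-square table or a statistics library rather than this one-df closed form.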
The outcome from this evaluation process should be summarized in the results and discussion sections of the manuscript. As demonstrated in the examples provided, validity test results are typically presented using figures, charts, or tables and then discussed. It is not uncommon for the validity test results to show some scale deviations from acceptable psychometric standards. These deviations should be reported in the discussion section, along with some discussion of potential reasons for the deviation. A sample write-up is as follows: The results of this analysis indicate that, given our sample of nonprofit executives, the sixteen-item strategic LOC scale
demonstrates acceptable psychometric properties. For this case study, a number of methodological points are worthy of mention. First, as mentioned earlier, the entire sixteen-item scale demonstrated strong internal consistency (alpha = .74) by traditional standards, for example, Nunnally's (1978) cutoff criterion of .70. Furthermore, the CFA confirmed the unidimensionality of the scale and the overall fit of the data to the theoretical model. However, a review of corrected item-total correlations, the correlation matrix, and the loadings in the CFA indicated that responses to one item in the scale (question number 14) did not necessarily correspond with the majority of other items in the scale. We reviewed this item for a possible explanation for the poor fit. One possible explanation is the fact that this item mentions competing foundations. In the for-profit sector, one might easily conceive of competitors, while in the nonprofit context, it is possible that CEOs did not perceive such a stark contrast between competing foundations and their own foundation, causing some spurious responses to the item. Future versions of this scale might be improved by changing the wording in this item from competitors to other related foundations. Another alternative for researchers is the removal of this item from the scale. One of the benefits of having multiple-item scales is the opportunity to remove a particularly problematic survey item with potentially minimal impact on the outcomes of the study. The onus is on the researcher to substantiate removal or inclusion of problematic items. Second, on examination of the final structural model, it was clear that all covariances suggested by the modification indexes were representative of scale items that belonged to the constructs of either internal or external LOC.
The implication here is the possibility that the single construct (sixteen-item external LOC) might more accurately be represented as a two-factor, internal-external scale. Therefore, we conducted a post hoc analysis and tested a two-factor model and found that it did provide an excellent overall fit for the data, comparable to the one-factor model. However, the chi-square difference test between the single-factor model 6 and the two-factor model revealed that there was not a significant (p < .05) improvement in overall model fit in the two-factor model. One possible explanation is that there is substantial evidence in the literature that negative wording in survey instruments sometimes results in separate factors that are statistical artifacts (Herche and Engelland, 1996) rather than true construct characteristics. For example, when negatively worded items are reverse-coded, an
assumption is made by the researcher that the endorsement of a specific survey item is equivalent to rejecting the opposite of that same item. As noted in the Herche and Engelland (1996) article, Terborg and Peters (1974) found that differences in scores between standard and reverse-worded items appeared to occur independent of the effects of acquiescence. In addition, while reverse-item wording addresses the problem of acquiescence in surveys, it also introduces the possibility of confusion, particularly with long questionnaires. Such respondent confusion suggests a nonsystematic variation in responses, reducing the internal consistency of the items but not necessarily affecting their scale dimensionality (Herche and Engelland, 1996). As an additional check, we evaluated the correlation matrix and found that the externally worded items in the instrument had a greater number of significant correlations with one another (thirty-nine in all) than with the remaining internal items (only twenty-four), bolstering the notion that the two-factor model may be the result of method variance. By contrast, the internal construct displayed a similar number of correlations both within and between constructs, with twenty significant inter-item correlations versus twenty-three significant item correlations with the external scale. The Cronbach alpha for this scale is .56, low by conventional standards. These findings provide us with a reasonable level of confidence that the CFA results and internal consistency measure can be relied on and that any evidence suggesting a two-factor model or a poorly fitting single-factor model is most likely a statistical artifact.

Conclusion

We have highlighted a process for evaluating the psychometric properties of an adapted survey instrument.
Similar methods should be employed whenever nonprofit researchers endeavor to contextualize scales to the nonprofit setting. Following the process described is necessary to ensure that the research instrument used has adequate psychometric properties appropriate to the research being conducted. Doing so benefits not only the specific study for which the scale was adapted but also other nonprofit researchers who wish to use the scale in its adapted form. Hodgkinson's (1992) SLOC scale was used as the case study example because of its applicability to the nonprofit sector. In the light of the variety of LOC-type studies in the broader research community and the recent emphasis on executive and organization performance in the nonprofit literature, we
believe there are numerous opportunities for additional scholarly works using this adapted scale.

WILLIAM J. RITCHIE is an assistant professor in the Department of Management at James Madison University, Harrisonburg, Virginia.

JOHN J. SHERLOCK is an associate professor of human resources at Western Carolina University, Cullowhee, North Carolina.

References

Adeyemi-Bello, T. "Validating Rotter's (1966) Locus of Control Scale with a Sample of Not-for-Profit Leaders." Management Research News, 2001, 24, 25-34.
Adler, S., and Weiss, H. M. "Recent Developments in the Study of Personality and Organizational Behavior." In C. L. Cooper and I. T. Robertson (eds.), International Review of Industrial and Organizational Psychology. Hoboken, N.J.: Wiley, 1988.
Alreck, P. L., and Settle, R. B. The Survey Research Handbook. (2nd ed.) Chicago: Irwin, 1995.
Anderson, J. C., and Gerbing, D. W. "The Effects of Sampling Error on Convergence, Improper Solutions and Goodness-of-Fit Indices for Maximum Likelihood Confirmatory Factor Analysis." Psychometrika, 1984, 49, 155-173.
Anderson, J. C., and Gerbing, D. W. "Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach." Psychological Bulletin, 1988, 103(3), 411-423.
Bagozzi, R. P., Yi, Y., and Phillips, L. W. "Assessing Construct Validity in Organizational Research." Administrative Science Quarterly, 1991, 36(3), 421-458.
Bentler, P. M., and Chou, C. P. "Practical Issues in Structural Equation Modeling." Sociological Methods and Research, 1987, 16, 78-117.
Boone, C. "The Influence of the Locus of Control of Top Managers on Company Strategy, Structure and Performance." In P. V. Abeele (ed.), Psychology in Micro and Macro Economics: Proceedings of the Thirteenth Annual Colloquium of the International Association for Research in Economic Psychology. Leuven, Belgium: International Association for Research in Economic Psychology, 1988.
Brooks, A. C.
What Do Don t Know Responses Really Mean in Giving Surveys? Nonprofit and Voluntary Sector Quarterly, 2004, 33(3), 423 434. Brudney, J. L., and Gazley, B. Moving Ahead or Falling Behind? Volunteer Promotion and Data Collection. Nonprofit Management and Leadership, 2006, 16(3), 259 276. Carpenter, M. A., and Golden, B. R. Perceived Managerial Discretion: A Study of Cause and Effect. Strategic Management Journal, 1997, 18(3), 187 206.
Cronbach, L. J. Coefficient Alpha and the Internal Structure of Tests. Psychometrika, 1951, 16(3), 297–334.
Cronbach, L. J., and Meehl, P. E. Construct Validity in Psychological Tests. Psychological Bulletin, 1955, 52, 281–302.
Davis, J. Does Authority Generalize? Locus of Control Perceptions in Anglo-American and Mexican-American Adolescents. Political Psychology, 1983, 4, 101–120.
Furnham, A. Economic Locus of Control. Human Relations, 1986, 39, 29–43.
Fusilier, M. R., Ganster, D. C., and Mayes, B. T. Effects of Social Support, Role Stress, and Locus of Control on Health. Journal of Management, 1987, 13, 517–528.
Gorsuch, R. L. Factor Analysis. (2nd ed.) Mahwah, N.J.: Erlbaum, 1983.
Govindarajan, V. A Contingency Approach to Strategy Implementation at the Business-Unit Level. Strategic Management Journal, 1989, 10, 251–269.
Hager, M. A., Wilson, S., Pollak, T. H., and Rooney, P. M. Response Rates for Mail Surveys of Nonprofit Organizations. Nonprofit and Voluntary Sector Quarterly, 2003, 32(2), 252–261.
Hatcher, L., and Stepanski, E. J. A Step-by-Step Approach to Using the SAS System for Univariate and Multivariate Statistics. Cary, N.C.: SAS Institute, 1994.
Herche, J., and Engelland, B. Reversed-Polarity Items and Scale Unidimensionality. Journal of the Academy of Marketing Science, 1996, 24(4), 366–374.
Hinkin, T. R. A Review of Scale Development Practices in the Study of Organizations. Journal of Management, 1995, 21(5), 967–988.
Hodgkinson, G. P. Research Notes and Communications: Development and Validation of the Strategic Locus of Control Scale. Strategic Management Journal, 1992, 13, 311–317.
Hollenbeck, J. R., Brief, A. P., Whitener, E. M., and Pauli, K. E. An Empirical Note on the Interaction of Personality and Aptitude in Personnel Selection. Journal of Management, 1988, 14, 441–451.
Jöreskog, K., and Sörbom, D. LISREL 8: Structural Equation Modeling with the SIMPLIS Command Language. Mahwah, N.J.: Erlbaum, 1993.
Jöreskog, K., and Sörbom, D. LISREL 8.7. Chicago: Scientific Software International, 2004.
Kelloway, E. K. Using LISREL for Structural Equation Modeling: A Researcher's Guide. Thousand Oaks, Calif.: Sage, 1996.
Kelloway, E. K. Using LISREL for Structural Equation Modeling. Thousand Oaks, Calif.: Sage, 1998.
Kennedy, J. M., and Vargus, B. Challenges in Survey Research and Their Implications for Philanthropic Studies Research. Nonprofit and Voluntary Sector Quarterly, 2001, 30(3), 483–494.
Kline, R. B. Principles and Practice of Structural Equation Modeling. New York: Guilford Press, 1998.
Markham, T., and Couldry, N. Tracking Reflexivity of the (Dis)Engaged Citizen. Qualitative Inquiry, 2007, 13(5), 675–695.
Mia, L. Participation in Budgetary Decision Making, Task Difficulty, Locus of Control, and Employee Behavior: An Empirical Study. Decision Sciences, 1987, 18, 547–561.
Miller, D., Kets de Vries, M. F., and Toulouse, J. M. Top Executive Locus of Control and Its Relationship to Strategy-Making, Structure, and Environment. Academy of Management Journal, 1982, 25, 237–253.
National Center for Charitable Statistics. Core Files Database. Washington, D.C.: Urban Institute, 1998.
Nunnally, J. C. Psychometric Theory. (2nd ed.) New York: McGraw-Hill, 1978.
Ritchie, W. J., and Eastwood, K. Executive Functional Experience and Its Relationship to the Financial Performance of Nonprofit Organizations. Nonprofit Management and Leadership, 2006, 17(1), 67–82.
Ritchie, W. J., and Kolodinsky, R. W. Nonprofit Organization Financial Performance Measurement: An Evaluation of New and Existing Financial Performance Measures. Nonprofit Management and Leadership, 2003, 13(4), 367–381.
Rooney, P., Steinberg, K., and Schervich, P. G. Methodology Is Destiny: The Effect of Survey Prompts on Reported Levels of Giving and Volunteering. Nonprofit and Voluntary Sector Quarterly, 2004, 33(4), 628–654.
Rotter, J. B. Generalized Expectancies for Internal Versus External Control of Reinforcement. Psychological Monographs: General and Applied, 1966, 80(1), 1–28.
Schwab, D. P. Research Methods for Organizational Studies. (2nd ed.) Mahwah, N.J.: Erlbaum, 2005.
Spector, P. E. Behavior in Organizations as a Function of Employees' Locus of Control. Psychological Bulletin, 1982, 91, 482–497.
Spector, P. E. Development of the Work Locus of Control Scale. Journal of Occupational Psychology, 1988, 61, 335–340.
Steiger, J. H., Shapiro, A., and Browne, M. W. On the Multivariate Asymptotic Distribution of Sequential Chi-Square Statistics. Psychometrika, 1985, 50, 253–264.
Stone, M. M., Bigelow, B., and Crittenden, W. Research on Strategic Management in Nonprofit Organizations. Administration and Society, 1999, 31(3), 378–423.
Storms, P., and Spector, P. E. Relationship of Organizational Frustrations with Reported Behavioral Actions: The Moderating Effect of Locus of Control. Journal of Occupational Psychology, 1987, 60, 227–234.
Terborg, J. R., and Peters, L. H. Some Observations on Wording of Item-Stems for Attitude Questionnaires. Psychological Reports, 1974, 35, 463–466.
Thompson, B., and Daniel, L. G. Factor Analytic Evidence for the Construct Validity of Scores: A Historical Overview and Some Guidelines. Educational and Psychological Measurement, 1996, 56(2), 197–208.
Van Maanen, J., Sorensen, J. B., and Mitchell, T. R. The Interplay Between Theory and Method. Academy of Management Review, 2007, 32(4), 1145–1154.
Vogt, W. P. Dictionary of Statistics and Methodology. (2nd ed.) Thousand Oaks, Calif.: Sage, 1999.