How To Measure The Effectiveness Of College Sexual Assault Education



Psychology of Women Quarterly, 29 (2005), 374-388. Blackwell Publishing. Printed in the USA. Copyright 2005 Division 35, American Psychological Association. 0361-6843/05

SEXUAL ASSAULT EDUCATION PROGRAMS: A META-ANALYTIC EXAMINATION OF THEIR EFFECTIVENESS

Linda A. Anderson, Oregon State University
Susan C. Whiston, Indiana University

Meta-analyses of the effectiveness of college sexual assault education programs on seven outcome measure categories were conducted using 69 studies that involved 102 treatment interventions and 18,172 participants. Five of the outcome categories had significant average effect sizes (i.e., rape attitudes, rape-related attitudes, rape knowledge, behavioral intent, and incidence of sexual assault), while the outcome areas of rape empathy and rape awareness behaviors did not have average effect sizes that differed from zero. A significant finding of this study is that longer interventions are more effective than brief interventions in altering both rape attitudes and rape-related attitudes. Moderator analyses also suggest that the content of programming, type of presenter, gender of the audience, and type of audience may be associated with greater program effectiveness. Implications for research and practice are discussed.

The disturbingly high incidence of sexual assault experienced by college women has been widely documented over the last few decades (e.g., Abbey, Ross, McDuffie, & McAuslan, 1996; Brener, McMahon, Warren, & Douglas, 1999; Koss, Gidycz, & Wisniewski, 1987). Consequently, the need for sexual assault prevention on college campuses nationwide has become increasingly apparent. The federal government has acknowledged the importance of this issue by mandating that sexual assault prevention efforts be conducted on campuses that receive federal funding (Neville & Heppner, 2002). As a result, college education programs have emerged as one of the more popular strategies for sexual assault prevention.
Although interventions have been developed and implemented at various colleges and universities across the United States since the 1980s, few of these programs have been empirically evaluated (Schewe & O'Donohue, 1993a; Yeater & O'Donohue, 1999). McCall (1993) summarized this situation by contending, "[S]exual assault prevention programming remains a confused, scattered, and sporadic enterprise with little scientific underpinning" (p. 277). Fortunately, in recent years the research on these programs has expanded; however, drawing conclusions from the myriad of dissertations and publications in this area can be a daunting task. Consequently, despite increases in recent research, little is actually known about the overall effectiveness of these programs and whether they produce lasting attitudinal or behavioral changes (Heppner, Neville, Smith, Kivlighan, & Gershuny, 1999). In an attempt to understand the value of these programs, several narrative reviews of the literature have been published (e.g., Bachar & Koss, 2001; Breitenbecher, 2000; Gidycz, Rich, & Marioni, 2002; Lonsway, 1996; Schewe & O'Donohue, 1993a; Yeater & O'Donohue, 1999). While several reviewers have concluded that most programs display short-term effectiveness in altering rape-supportive attitudes, there is little understanding of the impact of these interventions beyond this point. Unfortunately, narrative review of such a broad range of findings has several limitations.

Author note: Linda A. Anderson, University Counseling and Psychological Services, Oregon State University; Susan Whiston, Department of Counseling & Educational Psychology, Indiana University. Address correspondence and reprint requests to: Linda A. Anderson, University Counseling and Psychological Services, Oregon State University, 500 Snell Hall, Corvallis, OR 97331. E-mail: Linda.Anderson@oregonstate.edu
First, past narrative reviews of the sexual assault education literature typically have not included unpublished studies, and thus may tend to inflate the effectiveness of programming (Brecklin & Forde, 2001; Breitenbecher, 2000). Furthermore, narrative reviews do not provide quantitative indices of the degree to which particular program approaches are effective, nor do they typically systematically identify factors that may moderate or influence program effectiveness. For many involved in sexual assault education, it is not sufficient to know if these interventions are generally effective because they are interested in developing programs that have the most significant impact on participants. Hence, identification and analysis of moderator variables may be particularly important. Meta-analysis is a technique that overcomes some of the limitations of narrative reviews and provides a numerical

indicator of program effectiveness (i.e., effect size) that allows individuals to determine the degree to which interventions are effective (Lipsey & Wilson, 2001). Furthermore, meta-analytic techniques can be used to identify variables that influence effect size (i.e., moderator variables).

There have been two previous meta-analytic reviews of sexual assault education programs. The first meta-analytic review of the sexual assault education literature (Flores & Hartlaub, 1998) yielded an average effect size of .30. This effect size indicates that those attending a sexual assault education program would have outcomes almost a third of a standard deviation better than participants who had not attended a program. Flores and Hartlaub's meta-analysis included only 11 studies that used rape-myth acceptance as the sole outcome measure. The second meta-analysis found an overall mean effect size of .35, based on 45 studies (Brecklin & Forde, 2001). Brecklin and Forde (2001) were also able to identify several variables that moderated effect size. They found that programs were more effective for men in single-gender than in mixed-gender groups. They also found that published studies had larger effect sizes, supporting the need to include both published and unpublished research in future reviews. However, Brecklin and Forde's (2001) meta-analysis considered only one category of outcome, rape attitudes.

The current study was designed to expand on previous analyses of the effectiveness of sexual assault education programs by (a) examining a more diverse set of outcomes and (b) exploring whether a wider spectrum of program factors (e.g., type of presenter, content of program) may influence program effectiveness. Concerning the first goal, a variety of outcome measures have been used by individual researchers to examine the impact of sexual assault education programs.
Until recently, most studies have relied exclusively upon attitudinal measures to assess the effectiveness of sexual assault education. Several researchers have questioned this restricted focus on attitudes as the only indicator of change (e.g., Heppner, Humphrey, Hillenbrand-Gunn, & DeBord, 1995; Lonsway, 1996; Schewe & O'Donohue, 1993a) because there is still debate on whether a reduction in rape-supportive attitudes will reduce the actual incidence of sexual assault. Consequently, investigators have begun to utilize measures that assess other outcomes, such as knowledge about sexual assault, behaviors thought to be associated with sexual assault, and incidence of perpetration or victimization. Due to the recent expansion of outcome measurement in the literature, it was considered important to incorporate diverse outcome measures in the current meta-analysis. In this investigation, seven different outcome variables were analyzed separately: rape attitudes, rape empathy, rape-related attitudes, rape knowledge, behavioral intentions, rape awareness behaviors, and incidence of sexual assault. In essence, seven separate meta-analyses were conducted, one for each distinct outcome variable. These outcomes represent various attitudinal, knowledge, and behavioral categories and were adapted from construct categories offered by Breitenbecher (2000). Although these outcome categories are likely to be correlated, they were analyzed separately to obtain information about the differential impact of programming on various constructs, particularly in light of concerns about the tenuous link between attitudes and behaviors. Lipsey and Wilson (2001) recommended this approach, as averaging effect sizes across all of these constructs would result in more ambiguous and less meaningful results. Attitude measurements employed in the sexual assault education literature represent a diverse range and thus were divided into three categories.
For the purpose of this investigation, dependent measures categorized as rape attitudes were those that assessed attitudes specific to rape, such as rape myth acceptance, attitudes toward rape, and rape victim blame. This category is similar to the outcomes analyzed in prior meta-analyses. Rape empathy was the second outcome category. It included scales designed to measure empathy specifically related to rape and the degree to which participants identified with either rape victims or perpetrators. This outcome was differentiated because reviewers have specifically identified empathy as a construct targeted in educational programming (Lonsway, 1996; Schewe, 2002). The third attitudinal construct was rape-related attitudes. The scales incorporated into this category did not measure rape-specific attitudes, but assessed attitudes thought to promote the occurrence of sexual assault. This category included measures of sex-role stereotyping, attitudes toward women, and adversarial sexual beliefs. These measures were not included in past meta-analyses; thus, this category may contribute additional information about the impact of programming. The fourth outcome, rape-related knowledge, consisted of measures of participants' factual knowledge about sexual assault. The final three outcome categories reflected varying dimensions of behavior, which are typically defined and assessed differently for women and men. Behavioral intentions included participants' self-reported intent to rape or to engage in certain dating behaviors. Rape awareness behaviors referred to the actual self-reported or observed behaviors of participants that may reflect heightened awareness about sexual assault (e.g., differences in dating behaviors or willingness to volunteer for rape prevention efforts). The final outcome category included in this study was the actual incidence of sexual assault perpetration and victimization following an intervention.
The second major goal of the current investigation was to examine the impact of several potential moderating variables. Descriptive characteristics of each study (e.g., published or unpublished) and its methods and procedures (e.g., sample size, random assignment, and time of follow-up measure) were incorporated in the moderator analysis to examine whether the type and quality of the research influence effect size. In addition, it was deemed critical to

examine both characteristics of the participants and the interventions to delineate the attributes of effective programs. Attending to issues of gender is particularly important in analyzing sexual assault education programs because rape is primarily a crime committed by males against females, and therefore there are likely to be gender differences on the outcome variables. The gender of the audience was targeted for moderator analysis because the question of whether all-male, all-female, or mixed groups are more effective has been frequently debated. In a narrative review, Breitenbecher (2000) tentatively concluded that single-gender programs appeared more effective. Similarly, Berkowitz (2002) and Rozee and Koss (2001) argued that mixed-gender programs are less effective because men may become defensive in the presence of women. Brecklin and Forde (2001) explored this issue in their meta-analysis and concluded that single-gender programs are indeed more effective at reducing men's rape-supportive attitudes. This investigation sought to replicate prior conclusions concerning rape attitudes, while also exploring the impact of single- and mixed-gender programming on additional outcome constructs.

Based on the recommendations of previous reviewers (e.g., Breitenbecher, 2000; Lonsway, 1996; Yeater & O'Donohue, 1999), additional moderators examined in this study included status of the intervention facilitator (e.g., peer, graduate student, or professional), type of population that received the intervention (e.g., fraternity members), length of the program, and intervention content. Surprisingly, previous reviews have not explored in detail the degree to which content influences program effectiveness; thus, practitioners have little information concerning what type of content to include to maximize a program's impact.
Based on a comprehensive review of the literature and an analysis of over 100 program descriptions, four general types of rape education programs were identified. Although some interventions exhibited elements of more than one type of program, an attempt was made to categorize the content of each program based upon its primary focus. Coding this particular item was somewhat subjective; however, this strategy provided at least an initial exploration of the effect of differences in content to guide practitioners in program development.

In sum, this meta-analysis expands on previous reviews by focusing on additional characteristics of the participants, facilitators, and content of the interventions and how these factors may moderate outcome. Furthermore, because previous reviews have examined the effect of sexual assault education programs on attitudinal variables only, this meta-analysis examined additional outcomes related to knowledge and behavior. Finally, additional statistical procedures as recommended by Hedges and Olkin (1985) and Lipsey and Wilson (2001) were implemented to calculate weighted effect sizes and to consider methodological factors of studies in drawing conclusions.

METHOD

Selection of Studies

Several strategies were used to ensure that virtually all pertinent data, both published and unpublished, were used in this meta-analysis. Initially, seven computerized reference database systems were searched: PsycINFO, ERIC, MEDLINE, Dissertation Abstracts Online, Criminal Justice Abstracts, Sociological Abstracts, and the Social Science Citation Index. A number of combinations of key words (e.g., rape, sexual assault, prevention, intervention) were used to identify pertinent studies, dissertations or theses, and program evaluation reports. The second step for identifying relevant studies was to examine the references of the articles obtained and of previous reviews.
The third step involved searching by hand the last 5 years of journals that have traditionally published articles relating to sexual assault, to find articles that were comparatively recent and not in databases or cited by other researchers. The final strategy was to contact authors who have published multiple articles concerning sexual assault education and to request their assistance in locating studies. For more detailed information concerning methodological procedures, refer to Anderson (2003).

To have been eligible for inclusion, studies must have examined an intervention intended to reduce negative attitudes and/or behaviors associated with sexual assault. Eligible studies must have measured the effectiveness of an intervention using one or more quantitative dependent measures designed to assess one of seven different outcome categories: rape attitudes, rape empathy, rape-related attitudes, rape knowledge, behavioral intentions, rape awareness behaviors, and/or incidence of sexual assault. Because the prevalence, definition, and general understanding of sexual assault may vary by culture, only studies that involved North American college students as participants were included in this meta-analysis. Furthermore, a study had to compare one or more interventions with a control group. Control groups could be placebo, wait-list, minimal-treatment, or no-treatment groups. Either random assignment of participants or pretests had to be given to ensure equivalence of groups on outcome measures prior to the intervention. Consistent with the suggestions of Hedges and Olkin (1985), neither treatment-versus-treatment nor pretest-posttest comparisons were included, so as to calculate the least biased estimate of the effectiveness of rape education programs. Finally, each study also needed to provide the necessary information to calculate effect sizes, such as means and standard deviations or other statistical test information.
Of the 120 studies identified as research on sexual assault education programs, 51 of those studies could not be used for this meta-analysis. Many of those studies (n = 19) were eliminated because they lacked a control group. Other studies (n = 11) were not included because they duplicated another study (e.g., a dissertation later published as a

journal article), and some studies (n = 10) did not provide sufficient information to calculate effect sizes. Other studies were eliminated because the participants were not college students (n = 4) or because the studies included no relevant dependent measures (n = 2), lacked pretest or random assignment (n = 4), or did not involve an intervention (n = 1). After screening, 69 studies met the criteria for inclusion in this review.

Coding Procedures

Consistent with the recommendations of Stock (1994), a manual was developed to guide coding. Attention was given to the coding of moderator variables to determine whether factors such as methodological sophistication or content of the intervention moderated the effect size of an investigation. As suggested by Lipsey and Wilson (2001), moderator variables were divided into three categories: source descriptors, methods and procedures, and substantive issues (i.e., characteristics of the participants and the intervention). Source descriptors coded for this study included publication form (journal, dissertation/thesis, unpublished) and year of publication. Information concerning methods and procedures was coded as follows: unit and type of assignment to conditions (individual random, group random, individual nonrandom, group nonrandom), nature of control group (placebo, no treatment, minimal information), time of follow-up measure, and overall quality of study. Guidelines were developed to standardize evaluation of the overall quality of the study (see Anderson, 2003). We further coded for attempts to control for social desirability or demand characteristics, sample attrition, standardization of measures, and reliability of measures; however, these factors were not included in the moderator analyses because the information was inconsistently reported.
Information coded about the participants included gender, mean age, race, gender of audience (all-female, women from a mixed-gender group, all-male, men from a mixed-gender group, women and men combined), and type of audience (Greek members, general students, high risk). Information about the intervention was coded as follows: status of facilitator (peer, graduate student, or professional), length of program, and content of program. The content of the program was coded as follows: (a) informative (providing factual information and statistics, review of myths and facts, discussion of consequences of rape, identification of characteristics of rape scenarios); (b) empathy focused (helping participants develop empathy for rape victims); (c) socialization focused (examining gender-role stereotyping, societal messages that influence rape, oppression); (d) risk reducing (teaching specific strategies to reduce one's risk of rape); (e) more than one content; (f) cannot determine; and (g) other. Theoretical foundation was also coded, but could not be analyzed as a moderator due to the dearth of studies that included this variable.

The first author coded each article, and the second author (a professor of counseling psychology) coded a randomly selected portion of articles (n = 12) so that reliability estimates could be obtained. The rate of agreement between the two coders was calculated according to the four types of moderator variables. The rate of agreement for source descriptors combined with methods and procedures was 95.5%, for participant characteristics 97.6%, and for intervention characteristics 92.1%.

Analysis

Statistical analysis was conducted in accordance with the guidelines provided by Lipsey and Wilson (2001). To identify effects on behavior, knowledge, and attitudes, separate meta-analyses were conducted for each of the seven outcome categories.
Each effect size (g) was calculated by subtracting the mean of the control group from the mean of the experimental group and dividing by the pooled standard deviation. When means and standard deviations were not available from primary research studies, procedures summarized by Lipsey and Wilson (2001) to estimate standardized mean difference effect sizes from other statistics (e.g., t tests) were utilized. Following the calculation of effect sizes, these values were transformed to d to correct for small-sample bias (Hedges & Olkin, 1985). Once effect sizes were calculated for each study, adjusted for bias and error, and combined within each construct, the standard error of the estimate was calculated and each d was then weighted by its inverse variance, which produced the aggregated effect size estimate d+ (Hedges & Olkin, 1985). This procedure also corrects for bias because it requires each effect size to be weighted by a value (the inverse variance weight) that represents its precision (Lipsey & Wilson, 2001).

Lipsey and Wilson (2001) and Rosenthal (1991) argued that effect sizes within a distribution need to be statistically independent and suggested using only one effect size for each construct examined. One method for ensuring independence in outcome measures is to not use data from the same measure more than one time. For example, some studies will assess outcome at the end of treatment and then later in a follow-up analysis to determine the long-term effect of the intervention. We opted to calculate effect sizes in this meta-analysis using measures from the last follow-up analysis conducted for each study. Given that effect size tends to decrease with time, this conservative method was chosen to enhance our understanding of the long-term effectiveness of programming, which may not be reflected in an immediate posttest. As another way to ensure independence of data, we averaged the effect sizes within each category to produce one effect size per outcome.
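The effect-size computation just described (experimental minus control mean over the pooled standard deviation, a small-sample correction to d, and inverse-variance weighting) can be sketched in a few lines. This is an illustrative Python sketch of the standard Hedges and Olkin (1985) formulas, not code from the study; the function names are mine.

```python
import math

def hedges_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference between treatment and control,
    corrected for small-sample bias (Hedges & Olkin, 1985)."""
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                   / (n_t + n_c - 2))
    g = (mean_t - mean_c) / sp
    # Small-sample correction factor converts g to d
    j = 1 - 3 / (4 * (n_t + n_c) - 9)
    return j * g

def inverse_variance_weight(d, n_t, n_c):
    """Weight for d: the reciprocal of its approximate sampling variance."""
    v = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return 1.0 / v

def mean_weighted_effect(ds, weights):
    """Aggregate effect size d+ across independent effect sizes."""
    return sum(w * d for w, d in zip(weights, ds)) / sum(weights)
```

With two groups of 50, means 10.5 versus 10.0, and unit standard deviations, `hedges_d` gives roughly .50 before shrinking slightly under the correction factor.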
Finally, when there was more than one treatment intervention within a single study, effect sizes were calculated for each treatment relative to the control group, to avoid losing valuable information about different types of programs. Due to our interest in examining gender issues, effect sizes for women and men were coded separately whenever possible, in an effort to determine whether women and men in single-gender groups benefit more from programming

than those in mixed-gender groups. However, some authors reported the results only for women and men combined.

Following the calculation of mean weighted effect sizes for each outcome construct, 95% confidence intervals were computed to determine the significance of each mean effect size. If the confidence interval does not include zero, then the mean effect is significantly different from zero (p < .05). Next, a homogeneity analysis was conducted to determine whether each effect size represented a common population mean. Homogeneity testing was performed according to procedures described by Hedges and Olkin (1985) and was based upon the Q statistic, which follows a chi-square distribution with k - 1 degrees of freedom (k being the number of effect sizes). If the effect size distribution was found to be heterogeneous (a significant Q value), this heterogeneity indicated that the dispersion of effect sizes around their mean was greater than would be expected from sampling error alone (Lipsey & Wilson, 2001). In this circumstance, analyses of moderator variables should be conducted in order to identify additional sources of variance. In addition, as recommended by Hedges and Olkin (1985), outliers were examined in an effort to determine if the removal of these values would cause the distribution to achieve homogeneity. For this investigation, an outlier was defined as an effect size that fell two standard deviations above or below the mean of the distribution, which is a common procedure according to Lipsey and Wilson (2001). Of the 13 variables identified as potential moderators, 7 were categorical and were investigated using a procedure analogous to analysis of variance (ANOVA; Hedges & Olkin, 1985). Modified weighted least squares regression (Hedges & Olkin, 1985) was utilized for analysis of the six continuous variables.

RESULTS

This meta-analysis included results from 69 articles representing 102 treatment interventions.
Because some interventions were evaluated using multiple outcome measures, and effect sizes were coded separately for women and men whenever possible, the results are based on 262 effect sizes. Approximately 43.9% of these effect sizes represent rape attitudes, 7.6% rape empathy, 21.4% rape-related attitudes, 6.9% knowledge, 9.5% behavioral intentions, 5.7% rape awareness behaviors, and 5% incidence of sexual assault. Before discussing the results of the seven separate meta-analyses, we will provide a brief overview of all of the studies included.

Studies Overview

The studies included in this meta-analytic review involved 18,172 participants, of which 48.7% were women (SD = 35). The participants' average age was 20.3 (SD = 2.3). Race could be determined in 71% of the studies as follows: 4.1% African American, 4.3% Asian American, 84% Caucasian, 3.1% Latino/a, and 4.9% other participants. The studies were authored between 1978 and 2002 (M = 1995.1, SD = 4.7); 58% were published in academic journals, 37.6% were dissertations or theses, and 4.3% were conference papers or other unpublished works. Regarding methodology, 68% of studies used random assignment (of either groups or individuals), while 17% of studies attempted to control for social desirability or demand characteristics. The number of outcome measures used in each study ranged from 1 to 26, with an average of 4.1 measures per study (SD = 3.7); however, only an average of 3.6 (SD = 2.7) of the measures could be coded. Reliability estimates were reported for the majority of measures. Regarding the type of control group employed, 59.4% used a no-treatment control, 10.1% used a wait-list control, 24.6% used a placebo intervention, and 5.8% used minimal-information groups for comparison with treatment groups. For additional information about the studies included, refer to Anderson (2003).
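The fixed-effect summary statistics reported for each outcome category below (the mean weighted effect size d+, its 95% confidence interval, and the Q homogeneity statistic) can be illustrated with a small sketch. Again, this is illustrative Python following the Hedges and Olkin (1985) formulas, not the authors' code; names and inputs are hypothetical.

```python
import math

def fixed_effect_summary(ds, weights):
    """Mean weighted effect size d+, its 95% confidence interval, and the
    Q homogeneity statistic (chi-square with k - 1 degrees of freedom)."""
    w_sum = sum(weights)
    d_plus = sum(w * d for w, d in zip(weights, ds)) / w_sum
    se = math.sqrt(1.0 / w_sum)  # standard error of d+
    ci = (d_plus - 1.96 * se, d_plus + 1.96 * se)
    # Q measures dispersion of effect sizes around their weighted mean;
    # a significant Q signals heterogeneity and motivates moderator analysis
    q = sum(w * (d - d_plus) ** 2 for w, d in zip(weights, ds))
    return d_plus, ci, q
```

If the returned confidence interval excludes zero, the mean effect is significant at p < .05, which is exactly the decision rule applied to the confidence intervals in Table 1.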
Concerning the effectiveness of the 102 interventions evaluated, Table 1 provides the overall effect sizes for each of the seven outcome categories. As Table 1 indicates, the mean weighted effect size for sexual assault education programs ranged from .061 for rape awareness behavior measures to .574 for measures of rape knowledge.

Table 1
Sample Size, Mean Weighted Effect Size, 95% Confidence Interval, Homogeneity Test, and Fail-Safe N for Seven Outcome Categories

Outcome Category          k     d+     95% C.I.      Homogeneity Test     Fail-Safe N
Rape attitudes          115   .211    .175/.246      Q(114) = 215.41**        569
Rape empathy             20   .072   -.003/.147      Q(19)  =  45.86**
Rape-related attitudes   56   .125    .076/.174      Q(55)  =  79.68*          86
Rape knowledge           18   .574    .498/.650      Q(17)  = 119.54**        118
Behavioral intent        25   .136    .054/.217      Q(24)  =  42.47*          16
Awareness behavior       15   .061   -.018/.140      Q(14)  =  19.24
Incidence                13   .101    .036/.167      Q(12)  =  33.47**          7

Note. k = number of effect sizes, d+ = mean weighted effect size, C.I. = confidence interval, and Q = Hedges and Olkin's (1985) measure of homogeneity. *p < .05. **p < .01.

As reflected by the confidence intervals, the effect sizes for rape knowledge, rape attitudes, behavioral intent, rape-related attitudes, and incidence of sexual assault were all significant at the .05 level. For the five significant effect sizes, fail-safe Ns (Rosenthal, 1991) are also provided in Table 1. This statistic provides an estimate of the number of unpublished studies with null results required to reduce the mean weighted effect size to a value that is no longer statistically significant. The five outcome categories that had significant effect sizes were examined to determine whether the average effect sizes represented a homogeneous group of effect sizes or a heterogeneous group that should be further explored through moderator analysis. Although rape knowledge and incidence had significant mean effect sizes and significant tests of homogeneity, there were too few studies in these categories to ascertain reliable findings; thus, moderator analyses were not conducted for these outcomes. Moderator analyses were performed on the outcome categories of rape attitudes, rape-related attitudes, and behavioral intentions because these tests of homogeneity were significant, and a sufficient number of studies were available in these categories for analysis.

Rape Attitudes

There were 115 effect sizes in this outcome category, based on 89 treatment interventions conducted within 57 studies, which produced an average effect size of .211. On average, 41% of the participants were women (SD = 29.5). The most common outcome measure was the Rape Myth Acceptance Scale (Burt, 1980), and the average time for follow-up assessment was 34.9 days. The average intervention lasted 142.6 minutes, but there was substantial variation in length of programs (SD = 362.1). For the homogeneity analysis, results (Q = 214.41, p < .001) indicated moderator analyses were warranted. Outlier analysis revealed the presence of six outliers.
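The two-standard-deviation outlier screen described in the method can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' procedure verbatim; the distribution of effect sizes below is hypothetical.

```python
def flag_outliers(effect_sizes, n_sd=2.0):
    """Flag effect sizes falling more than n_sd standard deviations above or
    below the mean of the distribution (the screening rule described by
    Lipsey & Wilson, 2001)."""
    k = len(effect_sizes)
    mean = sum(effect_sizes) / k
    sd = (sum((d - mean) ** 2 for d in effect_sizes) / (k - 1)) ** 0.5
    return [d for d in effect_sizes if abs(d - mean) > n_sd * sd]

# Hypothetical distribution of effect sizes with one extreme value
print(flag_outliers([0.15, 0.20, 0.25, 0.18, 0.22, 0.21, 0.19, 1.40]))  # → [1.4]
```

Flagged values are then removed provisionally to see whether the distribution becomes homogeneous; as described in the text, they are retained when their removal barely changes d+ or the Q statistic.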
However, deletion of these outliers only minimally reduced the overall effect size (i.e., from .211 to .209) and did not impact its significance. Furthermore, the Q statistic retained significance. These outliers were consequently retained for moderator analysis. The results of the moderator analysis for the seven categorical variables are presented in Table 2. When moderator variables are categorical, the interpretation is analogous to ANOVA, where Q_B reflects the portion of variability explained by the categorical variable and Q_W indicates the residual pooled within-groups portion. Journal articles were found to have a greater mean effect size than unpublished works. An analysis of the gender of the audience also revealed significant between-group differences. Women who received an intervention in an all-female group displayed the largest effect sizes, although only three studies were found, and this value was not significant. Finally, the content of the intervention appeared to moderate mean effect size: interventions categorized as risk reducing resulted in the largest mean effect size, while empathy-focused programs and programs in which the content could not be determined did not produce significant change. However, because of significant heterogeneity within most of these categories, caution should be exercised in interpreting these results. Random assignment, nature of control group, type of population, and status of facilitator were not found to be significant moderators of effect size. When a significant overall test of homogeneity (Q) exists and the moderator variables are continuous (as compared to categorical in the previous analysis), it is common to use regression to test which of the moderator variables contributes uniquely to variation in effect size.
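The ANOVA-analogous partition of heterogeneity into a between-class component (Q_B) and a pooled within-class component (Q_W) can be sketched as below. This is an illustrative reconstruction; the two classes and their values are hypothetical, not data from the review.

```python
def class_stats(effect_sizes, variances):
    """Fixed-effect mean and within-class homogeneity for one class."""
    w = [1.0 / v for v in variances]
    mean = sum(wi * di for wi, di in zip(w, effect_sizes)) / sum(w)
    q_w = sum(wi * (di - mean) ** 2 for wi, di in zip(w, effect_sizes))
    return sum(w), mean, q_w

def categorical_moderator(classes):
    """Partition heterogeneity across classes of a categorical moderator:
    Q_total = Q_B (between classes) + Q_W (pooled within classes)."""
    stats = [class_stats(d, v) for d, v in classes]
    q_w = sum(q for _, _, q in stats)                  # pooled within-class
    total_w = sum(w for w, _, _ in stats)
    grand_mean = sum(w * m for w, m, _ in stats) / total_w
    q_b = sum(w * (m - grand_mean) ** 2 for w, m, _ in stats)
    return q_b, q_w

# Hypothetical moderator with two classes (e.g., journal vs. unpublished)
journal = ([0.30, 0.50], [0.02, 0.02])
unpublished = ([0.10, 0.20], [0.02, 0.02])
q_b, q_w = categorical_moderator([journal, unpublished])
print(q_b, q_w)
```

Q_B is tested against a chi-square distribution with (number of classes − 1) degrees of freedom; a significant Q_B indicates the moderator explains real variation among effect sizes, much as a significant F would in ANOVA.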
Two indices are used to assess the overall fit of the regression models, with Q_R reflecting a partitioning of the variability into the portion associated with the regression model and Q_E indicating the variability unaccounted for by the model. Examination of the six predictor variables (see Table 3) revealed that only two of them (length of intervention and number of participants) made significant contributions to variation among rape attitude effect sizes. The positive direction of each B-weight indicates that longer interventions and larger sample sizes are both associated with more positive change in rape attitudes.

Rape-Related Attitudes

This outcome category contained 56 effect sizes, based upon 43 treatment interventions from 26 studies, that produced an effect size of .125. On average, 38% of the participants were women (SD = 33). The average length of the intervention was 240.1 (SD = 526.0) minutes, and the follow-up assessment was conducted an average of 36.7 (SD = 111.5) days after the termination of treatment. The homogeneity test revealed significant variation among effect sizes (Q = 79.68, p < .05). Three effect sizes met the criteria as outliers; however, because deletion of these outliers did not reduce the Q statistic below the threshold for significant heterogeneity and did not substantially change the average effect size, these effect sizes were retained in the moderator analysis. Concerning rape-related attitudes, the moderator analyses for the categorical variables appear in Table 4. Nonrandom assignment and nature of the control group appeared to impact the magnitude of effect size. Once again it should be noted that some of the categories did not obtain within-group homogeneity. In addition, only one study used a wait-list control group; this category was dropped from the analysis. Average effect sizes also differed significantly by the type of population that received the intervention, the status of the facilitator, and the content of the intervention.
Table 2
Categorical Moderators of Rape Attitudes Outcome

Variable & Class          k     d+    95% C.I.       Q_W       Q_B
Type of publication                                            4.00*
  Journal                60   .239    .19/.28      129.43
  Diss/thesis/unpub      55   .164    .11/.22       81.98
Random assignment                                              1.74
  Individual random      47   .215    .16/.27       94.92
  Group random           32   .241    .17/.31       27.33
  Nonrandom              36   .181    .12/.24       91.42
Nature of control group                                        7.15
  No treatment           71   .245    .20/.29      133.50
  Wait-list               5   .081   -.06/.22       10.59
  Attention placebo      33   .202    .13/.27       47.27
  Minimal information     6   .140    .03/.25       16.90
Type of population                                             7.49
  General students       94   .205    .17/.25      154.21
  Greek members          13   .290    .20/.38       40.22
  High-risk               5   .011   -.22/.24       10.39
  Other                   3   .045   -.21/.30        3.10
Gender of audience                                            14.83**
  Female/female group     3   .287   -.013/.587      2.43
  Female/mixed group     27   .236    .163/.310     50.52
  Male/male group        24   .111    .027/.196     38.46
  Male/mixed group       29   .124    .039/.209     36.25
  Female/male combined   32   .273    .217/.329     72.93
Status/facilitator                                             3.59
  Peer                   19   .246    .17/.33       29.81
  Graduate student       30   .172    .10/.25       30.22
  Professional           28   .233    .17/.30       68.28
  Combination            16   .154    .05/.26       38.94
  Unknown-N/A            22   .222    .12/.32       44.58
Content/intervention                                          29.48**
  Information            39   .257    .20/.31       69.78
  Empathy                 9   .130   -.02/.28       18.69
  Socialization          13   .327    .19/.46       19.17
  Risk reduction          5   .435    .28/.59       12.53
  More than one          36   .165    .10/.23       56.34
  Other                  13  -.019   -.14/.11        9.41

Note. k = number of effect sizes, d+ = mean weighted effect size, C.I. = confidence interval, Q_W = homogeneity within each class, and Q_B = homogeneity between classes. *p < .05. **p < .01.

However, these significant between-group findings are tempered by the lack of homogeneity within several of the categories. Modified weighted least squares regression analysis (see Table 5) was conducted for the six continuous variables in the same manner as previously described for the rape attitudes outcome, and the regression model was statistically significant. The overall rating of the quality of the study and the time of measurement each contributed significantly to the prediction of effect size. More specifically, studies with higher overall quality ratings and studies with a greater time lapse between the end of the intervention and the time of measurement were both associated with smaller effect sizes.
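The modified weighted least squares step, with the Q_R/Q_E fit partition described earlier, can be sketched under the same inverse-variance weighting. This is an illustrative reconstruction, not the authors' analysis code; the effect sizes and the moderator values (intervention length, in minutes) are hypothetical.

```python
import numpy as np

def weighted_meta_regression(d, v, x):
    """Regress effect sizes on a continuous moderator with inverse-variance
    weights (modified WLS; Hedges & Olkin, 1985). Q_R is the variability
    associated with the model, Q_E the residual left unexplained."""
    d = np.asarray(d, dtype=float)
    w = 1.0 / np.asarray(v, dtype=float)
    X = np.column_stack([np.ones_like(d), np.asarray(x, dtype=float)])
    W = np.diag(w)
    # weighted normal equations: (X'WX) b = X'W d
    b = np.linalg.solve(X.T @ W @ X, X.T @ W @ d)
    fitted = X @ b
    d_bar = np.sum(w * d) / np.sum(w)          # weighted grand mean
    q_r = np.sum(w * (fitted - d_bar) ** 2)    # explained by the model
    q_e = np.sum(w * (d - fitted) ** 2)        # residual heterogeneity
    return b, q_r, q_e

b, q_r, q_e = weighted_meta_regression(
    d=[0.10, 0.20, 0.30, 0.40],          # hypothetical effect sizes
    v=[0.01, 0.01, 0.01, 0.01],          # their sampling variances
    x=[60, 120, 180, 240],               # moderator: program length in minutes
)
print(b, q_r, q_e)
```

z tests on the coefficients (B divided by its corrected standard error) identify which moderators contribute uniquely, as reported in Tables 3 and 5, while a significant Q_E indicates the regression model is underspecified.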
Furthermore, consistent with the outcome category rape attitudes, length of intervention was significant, with longer interventions being associated with larger effect sizes.

Table 3
Continuous Moderators for Rape Attitudes Outcome

Predictor                        B       SE B      ß         z
Year of publication            .0091    .0049    .147     1.8548
Overall quality                .0021    .0192    .009      .1083
Length of intervention         .0003    .0001    .227     2.7921**
Percent attrition              .0009    .0010    .068      .8993
Time of measurement            .0002    .0004    .045      .5195
N of sample (after attrition)  .0002    .0001    .218     2.9792**

Note. Q_R(6) = 19.134, Q_E(82) = 196.279, and R² = .09. B = unstandardized regression coefficient, SE B = corrected standard error value, ß = standardized regression coefficient, and z = z-test of significance. **p < .01.

Behavioral Intentions

Based upon 25 effect sizes obtained from 22 treatment interventions within 16 studies, the overall d+ for studies using measures of behavioral intentions was .136. In this outcome category, 31% of the participants were women. The average length of the intervention was 215 (SD = 386.2) minutes, and the follow-up assessment was conducted an average of 19.4 (SD = 43.9) days later. The homogeneity test revealed significant variation among effect sizes. Outlier analysis revealed the presence of one outlier (d+ = .89); removal of this outlier from the distribution and recalculation of the average weighted effect size increased the value from .136 to .15, while the Q statistic dropped from 42.47 to 33.77, which borders on statistical significance (p = .068). This outlier was excluded from further moderator analysis because, unlike previously identified outliers, the study displayed several outstanding methodological characteristics that may have impacted the results and was not typical of the general pool of studies. Although this exclusion caused the Q statistic to drop below significance, moderator analysis was nevertheless conducted to explore variations in effect sizes. As Table 6 reflects, three variables appeared to moderate effect size for this outcome category: nature of the control group, gender of audience, and status of the facilitator. Modified weighted least squares regression analysis was conducted for the six continuous variables for this outcome as well. The regression model was not statistically significant, and the residual also did not attain statistical significance. This finding indicated that the regression model was not correctly specified for this outcome; thus, it was not appropriate to conclude that any of the regression coefficients was significantly different from zero.

DISCUSSION

The primary purpose of this meta-analysis was to investigate the effectiveness of sexual assault education programs on college campuses using both published and unpublished studies. Specifically, we were interested in whether these programs influenced attitudinal outcomes, knowledge measures, and behavioral indices. Seven separate meta-analyses were conducted to determine the impact programming has on these distinct outcome categories.
The results of this investigation indicate that the efficacy of sexual assault education programming on college campuses appears to differ depending on which types of outcomes are considered. The outcome category that evidenced the most positive change was rape knowledge, which produced a mean effect size of .57. This finding indicates that those who participated in a sexual assault education program displayed greater factual knowledge about rape than those who did not attend a program. The positive effect size for rape knowledge could be considered a medium effect using the guidelines suggested by Cohen (1988). The second largest effect size was found for the rape attitudes category (.21), which suggests that sexual assault education programming has a small but positive influence on rape attitudes. This mean effect size is somewhat lower than those in the previous meta-analyses of Brecklin and Forde (2001; .35) and Flores and Hartlaub (1998; .30). Possible reasons for these differences are the larger number of studies included in this investigation, implementation of controls for data dependence, and weighting of effect sizes as suggested by Hedges and Olkin (1985) and Lipsey and Wilson (2001). Although the effect sizes for behavioral intentions, rape-related attitudes, and incidence of sexual assault were statistically significant (.14, .12, and .10, respectively), the influence of sexual assault programs on these outcomes may be of little clinical significance, as these effect sizes do not reach the criteria for a small effect size (i.e., .20 to .40) as suggested by Cohen (1988). Sexual assault education programming did not appear to have any impact on rape empathy or rape awareness behaviors because the studies produced overall mean effect sizes that were not significantly different from zero. Consequently, the answer to the question "Are sexual assault education programs effective?" indeed depends upon the criteria used to define effectiveness (Breitenbecher, 2000).
If effectiveness is defined solely as a decrease in sexual assault, then there is little support available from the current pool of studies. Although a decline in incidence may be the ultimate goal of education programming, the extreme difficulty in obtaining accurate long-term statistics regarding involvement with sexual assault following an intervention (Schewe & O'Donohue, 1993a) indicates that additional outcomes should be considered. Our findings do indicate that sexual assault education programs are somewhat effective in changing attitudes toward rape and increasing rape knowledge. However, due to the dearth of studies using behavioral outcomes, more research using behavioral indices is needed before definitive conclusions can be reached. In addition, the effect sizes of certain outcome constructs were most likely influenced by characteristics of the intervention, participants, and research methodology. For example, rape attitudes may be more subject to demand characteristics than other outcomes because these attitudes are often overtly discussed and disputed in sexual assault education workshops (Lonsway, 1996), while other outcome measures may be less directly associated with program content. A factor that may have influenced the effect sizes for behavioral variables pertains to when the outcome was measured. In general, effect sizes tend to decrease when there is a longer time between when the intervention is delivered and when the outcome is measured. Outcomes such as rape awareness behaviors had longer average follow-up measurement times (89 days) compared to rape attitudes (35 days). Hence, it is difficult to determine whether the lower effect sizes associated with behavioral outcomes relative to attitudinal measures are due to program impact or to time of assessment. An advantage of meta-analytic methodology is the ability to examine variables that influence program outcome through moderator analysis.

Table 4
Categorical Moderators for Rape-Related Attitudes Outcome

Variable & Class          k     d+    95% C.I.       Q_W       Q_B
Type of publication                                             .81
  Journal                25   .139    .08/.20       48.85
  Diss/thesis/unpub      31   .090    .00/.18       30.03
Random assignment                                              9.03*
  Individual random      25   .056   -.02/.13       23.65
  Group random           14   .077   -.04/.19        5.08
  Nonrandom              17   .214    .14/.29       41.91
Nature of control group                                       14.92**
  No treatment           38   .200    .14/.26       53.26
  Wait-list               1
  Attention placebo      11   .053   -.06/.17        7.66
  Minimal information     6  -.028   -.14/.08        3.83
Type of population                                             6.38*
  General students       43   .098    .04/.16       53.62
  Greek members           8   .242    .13/.35       14.68
  High-risk               5   .011   -.22/.24        5.00
  Other                   0
Gender of audience                                             1.99
  Female/female group     1
  Female/mixed group     14   .114    .032/.196     33.09
  Male/male group        17   .169    .066/.273     19.20
  Male/mixed group       16   .094   -.002/.191     17.98
  Female/male combined    8   .123   -.001/.247      7.42
Status/facilitator                                            10.56*
  Peer                    6   .157    .02/.29       15.41
  Graduate student       21   .029   -.06/.12       12.23
  Professional           12   .209    .13/.29       27.83
  Combination            10   .171    .02/.32        5.64
  Unknown-N/A             7   .032   -.12/.18        8.01
Content/intervention                                          14.73*
  Information            19   .217    .13/.30       20.95
  Empathy                 4   .094   -.22/.41        1.60
  Socialization           8   .300    .11/.48        6.00
  Risk reduction          2   .203   -.15/.56        2.33
  More than one          17   .030   -.04/.10       18.27
  Other                   5   .125   -.03/.28       15.79

Note. k = number of effect sizes, d+ = mean weighted effect size, C.I. = confidence interval, Q_W = homogeneity within each class, and Q_B = homogeneity between classes. *p < .05. **p < .01.

A significant finding of this meta-analysis is that longer interventions (i.e., length of time exposed to material in minutes) seemed to be more effective in altering both rape attitudes and rape-related attitudes. Interestingly, the range in length of interventions was substantial (7 to 2,520 minutes); it seems sensible to conclude that a 7-minute intervention would be less effective than a much longer intervention.
Table 5
Continuous Moderators for Rape-Related Attitudes Outcome

Predictor                        B        SE B      ß         z
Year of publication            .0066     .0072    .123      .9192
Overall quality               -.0665     .0264   -.348    -2.5186*
Length of intervention         .0004     .0001    .481     3.1913**
Percent attrition              .0012     .0014    .109      .8195
Time of measurement           -.0014     .0005   -.410    -2.7104**
N of sample (after attrition)  .00002    .0001    .024      .1870

Note. Q_R(6) = 24.757, Q_E(49) = 54.779 (ns), and R² = .31. B = unstandardized regression coefficient, SE B = corrected standard error value, ß = standardized regression coefficient, and z = z-test of significance. *p < .05. **p < .01.

Although we did not specifically test single- versus multi-session programming, these findings suggest that semester-long courses or possibly multi-session workshops may be more effective in promoting positive change. Flores and Hartlaub (1998) and Brecklin and Forde (2001), however, did not find an association between the length of the intervention and program effectiveness. This difference may in part be due to their analysis of this variable as categorical, whereas we analyzed length of intervention as a continuous variable. We believe that, given the larger number of studies in our meta-analyses, our finding that longer interventions were more effective may more accurately represent the research in this area. Hence, we would encourage those designing educational programs to institute longer, more thorough interventions rather than brief programs. Because the attention span of students may be limited during one sitting, an educator might consider multi-session programming. This study also found that the status of the facilitator appears to influence changes in rape-related attitudes and behavioral intentions. Professional presenters were more successful, while graduate students and peer presenters were generally less successful in promoting positive changes. Although there should be some caution in interpreting these results, these findings do raise questions about the common practice of employing peer facilitators. Peer education is popular not only in rape education but also in a number of other health-related educational programs (e.g., substance abuse, HIV, sexuality); however, both Backett-Milburn and Wilson (2000) and Parkin and McKeganey (2000) have questioned whether there is sufficient research to support this prevalent approach. Walker and Avis (1999) suggested several reasons why peer intervention might fail, including a lack of investment in peer education (viewing peers as "cheap labor"); lack of appreciation of the complexity of the peer education process and the need for highly skilled personnel; and inadequate supervision, training, and support. Consequently, it may be beneficial to address these concerns in future research before any conclusions can be offered concerning the effectiveness of peer educators. Another significant moderator of effect size for both rape attitudes and rape-related attitudes was the content of the intervention.
The results suggest that interventions that focus on gender-role socialization, provide general information about rape, discuss rape myths/facts, and address risk-reduction strategies have a more positive impact on participants' attitudes than rape empathy programs and interventions with unspecified contents. However, some considerations should be addressed before concluding that rape empathy interventions are ineffective. First, the difference in effectiveness for these programs could be associated with the types of outcome measures utilized to assess positive change. These attitudinal measures tend to assess concepts discussed in myth/fact and socialization-focused programs, while these concepts may not be as directly addressed in empathy programs. In addition, these findings include only attitudinal data; thus, whether programs with different content have any differential impact upon the behavior of participants is unknown. Consequently, more empirical research examining the content of programming is needed.

Table 6
Categorical Moderators for Behavioral Intent Outcome

Variable & Class          k     d+    95% C.I.      Q_W       Q_B
Type of publication                                            .03
  Journal                12   .142    .01/.27      13.66
  Diss/thesis/unpub      12   .157    .05/.26      20.08
Random assignment                                             4.47
  Individual random      14   .074   -.04/.19      14.90
  Group random            5   .278    .12/.44       8.07
  Nonrandom               5   .187   -.01/.38       6.32
Nature of control group                                      10.38**
  No treatment           16   .235    .12/.35      21.73
  Attention placebo       6   .009   -.13/.14       1.67
Type of population                                            1.69
  General students       14   .178    .07/.28      26.62
  Greek members           4   .090   -.07/.25       2.38
  High-risk               4   .207   -.11/.52       3.07
Gender of audience                                           10.77*
  Female/female group     2   .552    .15/.96        .43
  Female/mixed group      2  -.095   -.33/.14        .00
  Male/male group        14   .133    .02/.24      18.98
  Male/mixed group        5   .265    .10/.43       3.59
  Female/male combined    1
Status/facilitator                                           12.51*
  Peer                    8   .043   -.07/.16       2.13
  Graduate student        5   .124   -.07/.32       4.98
  Professional            3   .449    .09/.81       1.04
  Combination             3   .168   -.08/.41       4.89
  Unknown-N/A             5   .427    .21/.64       8.22
Content/intervention                                          5.27
  Information             6   .095   -.05/.24       8.33
  Empathy                 5   .074   -.13/.27       1.96
  Socialization           4   .359    .13/.58       1.25
  Risk reduction          4   .105   -.07/.28       7.28
  More than one           5   .232    .03/.44       9.68

Note. k = number of effect sizes, d+ = mean weighted effect size, C.I. = confidence interval, Q_W = homogeneity within each class, and Q_B = homogeneity between classes. *p < .05. **p < .01.

Another pertinent finding was that programs that included more than one topic appeared to be less effective than more focused programs, which may indicate that more in-depth programming produces better outcomes than sessions that cover multiple topics more superficially. Furthermore, this result may be related to our finding that longer interventions are more effective: attempting to cover information too quickly may result in weak effects that have little long-term impact. A final issue to consider when evaluating the content of sexual assault education interventions is that the type of program offered may vary depending upon the gender of the participants.
Women are more likely to receive a risk-reduction intervention, while men may be more likely to receive an empathy intervention. Due to gender differences in rape attitudes and behaviors, the gender of participants may influence findings of overall effectiveness within a particular content category. Type of audience was also a significant moderator of effect size for rape-related attitudes. Greek members appeared to be the most positively impacted by educational programming, which is of interest because it is often thought that fraternity and sorority members are at greater risk to experience sexual assault (e.g., Copenhaver & Grauerholz, 1991; Sanday, 1990; Schwartz & DeKeseredy, 1997). Although high-risk populations did not appear to demonstrate positive changes in attitudes, this finding must be viewed with caution because this category included only five studies and consisted of heterogeneous groups. Consequently, more research is needed to explore the difference in responsiveness to education among specific high-risk groups. Another important moderator is the gender of the audience. For women, a significant positive effect size was found for rape attitudes when the program was conducted with mixed-gender groups. Although a relatively high effect size (.29) was found for women in all-female groups, this value was not significant and was based on only three studies. In contrast, tentative findings for behavioral intentions suggest that women may have a better outcome in an all-female setting and that mixed-gender programming may not be effective. However, these findings are based on only four studies, and thus further research is needed before any conclusion is drawn. Surprisingly, there was no evidence from these data that men are more likely to benefit from programming administered in all-male groups as compared to mixed-gender groups. Indeed, although the difference between the two groups was not significant, men from mixed-gender groups displayed a larger effect size for behavioral intentions.
These results contradict Brecklin and Forde's (2001) findings that single-gender programs were more effective for men than mixed-gender programs. Differences between our study and Brecklin and Forde's may provide some insights into reasons for the conflicting findings. It should be noted that Brecklin and Forde's results were related only to rape attitudes, and their meta-analysis did not include behavioral measures. In addition, the current investigation included a larger number of studies and controlled for data dependency. Because we included effect sizes only from the last follow-up evaluation, our study may also offer a more accurate indication of the longer-term effectiveness of programming. Considering the significance of this issue and the recent support for single-gender programs over mixed-gender programs (e.g., Berkowitz, 2002; Gidycz et al., 2002; Rozee & Koss, 2001; Schewe, 2002), more empirical research on this question is necessary. Schewe and O'Donohue (1993a) and Yeater and O'Donohue (1999) have voiced concerns about the lack of methodological sophistication in sexual assault education research and the potential to create programs based on misleading findings; therefore, particular attention was focused in this meta-analysis on incorporating research methodology and design variables into our analyses. Our findings suggest that studies that are published, are rated as lower quality, lack random assignment, have larger sample sizes, and employ no-treatment control groups have larger effect sizes. Collectively, these results suggest that low methodological standards may lead to potentially erroneous conclusions about the effectiveness of sexual assault education interventions. Although these findings are not consistent across every methodological characteristic and varied across outcome variables, methodological rigor is necessary in future research to provide more precise findings.
Brecklin and Forde (2001) also found publication bias in their meta-analysis, which suggests that unpublished studies should continue to be included in future reviews of this literature. Consistent with Brecklin and Forde (2001) and Flores and Hartlaub (1998), we found that for rape-related attitudes, the length of time between the end of treatment and the assessment of the impact of the program was a significant moderator of treatment effectiveness. Therefore, there are consistent findings that the positive effects of treatment tend to diminish over time.

Limitations

There are several limitations to this study that must be acknowledged. First, the results of any meta-analytic review are only as sound as the studies included in the analysis (Lipsey & Wilson, 2001). Although criteria were specified a priori to exclude studies with serious methodological problems, it should be noted that many studies contained some limitations, which in turn restricted the conclusions of this meta-analysis. Another limitation concerns the amount of unexplained variance found in many of the univariate moderator analyses, as well as the underspecification of the regression model for the rape attitudes outcome. These conditions suggest that several of the findings from the moderator analyses should be viewed with caution, because there were additional sources of variance that remain unexplained. In particular, the possibility of interaction effects must be considered because the findings of moderator analyses may be influenced by other potentially related variables. Moreover, the small number of studies included in the behavioral intentions moderator analysis also limits the generalizability of these findings. Although attempts were made to limit the number of moderator analyses, another issue concerns a potential Type I error due to the number of univariate analyses that were conducted for each outcome.
However, given our adherence to the procedures suggested by Hedges and Olkin (1985) and Lipsey and Wilson (2001; see Anderson, 2003, for details), the probability of a Type I error was reduced. Although attention was given to systematic and objective coding, certain moderator variables were more sensitive to coder subjectivity. In particular, it was challenging to code the content of each intervention, due to the variations in detail provided by the authors of each study. Hence, future research must attend to issues of treatment integrity before determining the effectiveness of certain content. A final limitation that must be noted is the small number of studies included in certain outcome categories (e.g., incidence). Results for these particular outcome categories should be viewed as only a tentative indication of effectiveness.

Implications for Practice

Our results suggest that sexual assault education interventions for college students tend to be more effective when they are longer, presented by professionals, and include content addressing risk reduction, gender-role socialization, or provision of information and discussion of myths and facts about sexual assault. In addition, there was support for both mixed- and single-gender programming; however, single-gender programming may tentatively be more helpful in some circumstances for women. Therefore, practitioners may wish to consider both single- and mixed-gender groups, depending upon the goals and topics of the presentation. Consequently, extended educational programming (e.g., longer than 1 hour) that provides longer exposure to material addressing multiple content areas in depth may be a more effective way to educate about issues of sexual assault. In addition, the format of an extended program may be varied; a portion of the information may be presented in a single-gender environment, while the remainder may be offered in a mixed-gender setting to maximize potential effectiveness. This approach may be particularly appropriate considering the amount of variance observed within groups during the moderator analyses, suggesting that no one particular strategy is effective for all individuals.
Implications for Research

There is a critical need for additional controlled studies that examine factors that moderate the effectiveness of sexual assault education programming. Given the inconsistent findings in meta-analytic studies, further exploration of the long-term impact of the length of the intervention and the gender of the audience is particularly important. In addition, our findings indicate that the evaluation of interventions for special populations is needed. Furthermore, increased effort should be extended toward developing and evaluating culturally relevant programming. The research literature has generally failed to include culturally diverse samples; consequently, the impact of programming upon individuals from racially and culturally diverse backgrounds is largely unknown. We identified only one study (Heppner et al., 1999) that attempted to evaluate the effectiveness of a culturally relevant intervention with African American and Caucasian fraternity men. The findings from this study suggest that African American fraternity men may respond more positively to a culturally relevant intervention than to a color-blind intervention. Although numerous authors have emphasized the importance of drawing from a theoretical framework in determining the content and format of an intervention (e.g., Gidycz et al., 2002; Heppner et al., 1995; Yeater & O'Donohue, 1999), the findings of this investigation suggest that currently there are very few programs that have a clear theoretical foundation. We suggest that researchers continue to explore differences in theoretically driven interventions; progress in this area may help to delineate components of effective programming. Furthermore, it is recommended that researchers include detailed information about program content in their publications so that their interventions may be replicated.
In accordance with past reviewers (e.g., Brecklin & Forde, 2001; Flores & Hartlaub, 1998), the results of this investigation indicate that future researchers should strengthen the methodological quality of their studies. Specifically, researchers should attempt to use random assignment of participants, placebo control groups, and longer follow-up periods. When assessing the effectiveness of programming, it is essential to include psychometrically sound measures and to recognize the inherent limitations of self-report instruments. Our final suggestion for researchers is to include a variety of outcome measures, particularly behavioral measures, when assessing the effectiveness of sexual assault education interventions. Although the impact of programming on behavioral intentions, related behaviors, and incidence of sexual assault produced only small effects, the limited number of behavioral measures utilized restricted the conclusions that could be drawn. In conclusion, as past reviewers have contended, it may be unrealistic to expect a standard 1-hour, one-session sexual assault education program to have a lasting impact on the attitudes, and particularly the behaviors, of participants. However, this review offered promising findings that certain characteristics may be associated with greater program effectiveness. Although it may be ideal for education to begin much earlier (i.e., in middle and high school) to promote enduring changes in attitudes and behaviors related to sexual assault, this is not under the control of college administrators. Colleges must address the attitudes, knowledge, and behaviors that students maintain upon entering college, because current conditions suggest that students are at considerable risk of experiencing sexual assault.
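For readers less familiar with how "small effects" are quantified and aggregated in a review of this kind, the following sketch illustrates the standard machinery: a bias-corrected standardized mean difference (Hedges' g, following Hedges & Olkin, 1985) for a single treatment-control comparison, and a DerSimonian-Laird random-effects pooling of several such effects. This is a minimal illustration only, not the authors' actual analysis code; all numbers and function names are hypothetical.

```python
import math

def hedges_g(m1, m2, s1, s2, n1, n2):
    """Bias-corrected standardized mean difference for two groups."""
    # Pooled standard deviation across treatment and control groups
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    # Hedges' small-sample correction factor
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    g = j * d
    # Approximate sampling variance of g
    var = (n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2))
    return g, var

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate."""
    w = [1 / v for v in variances]
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sum(w)
    # Q statistic: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c)
    # Re-weight each study by total (within + between) variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2

# Hypothetical example: one study's attitude-measure means/SDs,
# then pooling three hypothetical study-level effects.
g, v = hedges_g(10.5, 10.0, 2.0, 2.0, 30, 30)
pooled, se, tau2 = random_effects_pool([0.3, 0.5, 0.2], [0.05, 0.04, 0.06])
```

The within-group variance noted in the moderator analyses corresponds to a nonzero tau-squared here: when studies disagree beyond sampling error, the random-effects weights shrink toward equality and the pooled estimate becomes less certain.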
Therefore, the goal of enhancing the effectiveness of this programming must never falter, since improving these interventions is essential to enhancing the safety and well-being of both women and men on campuses nationwide. With the continued advancement of sexual assault prevention research and practice, including increased attention to factors such as theoretical bases of change, explicitly defined content, and longer exposure to educational information,

we must have faith that these obstacles can be overcome. This problem is far too pervasive to be ignored.

Initial submission: January 12, 2004
Initial acceptance: February 18, 2005
Final acceptance: July 25, 2005

386 ANDERSON AND WHISTON

REFERENCES

References marked with an asterisk indicate studies included in the meta-analysis.

Abbey, A., Ross, L. T., McDuffie, D., & McAuslan, P. (1996). Alcohol and dating risk factors for sexual assault among college women. Psychology of Women Quarterly, 20, 147–169.
Anderson, L. A. (2003). A meta-analysis of the effectiveness of sexual assault prevention education programming on American college campuses. Unpublished doctoral dissertation, Indiana University, Bloomington.
Anderson, L. A., Stoelb, M. P., Duggan, P., Hieger, B., Kling, K. H., & Payne, J. P. (1998). The effectiveness of two types of rape prevention programs in changing the rape-supportive attitudes of college students. Journal of College Student Development, 39, 131–142.
Bachar, K., & Koss, M. P. (2001). From prevalence to prevention: Closing the gap between what we know about rape and what we do. In C. M. Renzetti, J. L. Edelson, & R. K. Bergen (Eds.), Sourcebook on violence against women (pp. 117–142). Thousand Oaks, CA: Sage.
Backett-Milburn, K., & Wilson, S. (2000). Understanding peer education: Insights from a process evaluation. Health Education Research, 15, 85–96.
Beadner, S. D. (2000). Date rape attitudes intervention: A controlled outcome study. Unpublished master's thesis, University of Nevada, Las Vegas.
Berg, D. R., Lonsway, K. A., & Fitzgerald, L. F. (1999). Rape prevention education for men: The effectiveness of empathy induction techniques. Journal of College Student Development, 40, 219–234.
Berger, N. M. (1993). An exploration of the effectiveness of an acquaintance rape prevention program designed for male intercollegiate athletes. Unpublished doctoral dissertation, State University of New York at Buffalo.
Berkowitz, A. (2002).
Fostering men's responsibility for preventing sexual assault. In P. A. Schewe (Ed.), Preventing intimate partner violence: Developmentally appropriate interventions across the lifespan (pp. 163–196). Washington, DC: American Psychological Association.
Black, B., Weisz, A., Coats, S., & Patterson, D. (2000). Evaluating a psychoeducational sexual assault prevention program incorporating theatrical presentation, peer education, and social work. Research on Social Work Practice, 10, 589–606.
Borden, L. A., Karr, S. K., & Caldwell-Colbert, A. T. (1988). Effects of a university rape prevention program on attitudes and empathy toward rape. Journal of College Student Development, 29, 132–136.
Boulter, C. (1997). Effects of an acquaintance rape prevention program on male college students' endorsements of rape myth beliefs and sexually coercive behavior. Unpublished doctoral dissertation, Washington State University, Pullman.
Brecklin, L. R., & Forde, D. R. (2001). A meta-analysis of rape education programs. Violence and Victims, 16, 303–321.
Breitenbecher, K. H. (2000). Sexual assault on college campuses: Is an ounce of prevention enough? Applied and Preventive Psychology, 9, 23–52.
Breitenbecher, K. H., & Gidycz, C. A. (1998). An empirical evaluation of a program designed to reduce the risk of multiple sexual victimization. Journal of Interpersonal Violence, 13, 472–488.
Breitenbecher, K. H., & Scarce, M. (1999). A longitudinal evaluation of the effectiveness of a sexual assault education program. Journal of Interpersonal Violence, 14, 459–478.
Breitenbecher, K. H., & Scarce, M. (2001). An evaluation of the effectiveness of a sexual assault education program focusing on psychological barriers to resistance. Journal of Interpersonal Violence, 16, 387–407.
Brener, N. D., McMahon, P. M., Warren, C. W., & Douglas, K. A. (1999). Forced sexual intercourse and associated health-risk behaviors among female college students in the United States.
Journal of Consulting and Clinical Psychology, 67, 252–259.
Burt, M. R. (1980). Cultural myths and supports for rape. Journal of Personality and Social Psychology, 38, 217–229.
Calhoun, K. S., Gidycz, C. A., Loh, C., Wilson, A., Lueken, M., Outman, R. C., et al. (2001). Sexual assault prevention in high-risk women. Paper presented at the 35th Annual Convention of the Association for Advancement of Behavior Therapy, Philadelphia.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Copenhaver, S., & Grauerholz, E. (1991). Sexual victimization among sorority women: Exploring the link between sexual violence and institutional practices. Sex Roles, 24, 31–41.
Dallager, C., & Rosen, L. A. (1993). Effects of a human sexuality course on attitudes toward rape and violence. Journal of Sex Education and Therapy, 19, 193–199.
Davis, J. L. (1999). Effectiveness of a sexual assault education and prevention program for victims and nonvictims of sexual abuse. Unpublished doctoral dissertation, University of Arkansas, Fayetteville.
Davis, T. L. (1997). The effectiveness of a sex-role socialization focused date rape prevention program in reducing rape-supportive attitudes in college fraternity men. Unpublished doctoral dissertation, University of Iowa, Iowa City.
DeBates, R. (2002). The good guy approach: Assessing the effectiveness of a minimal educational intervention on college men's rape myth acceptance. Unpublished manuscript, Southern Oregon University, Ashland.
Duggan, L. M. (1998). The effectiveness of an acquaintance sexual assault prevention program in changing attitudes/beliefs and behavioral intent among college students. Unpublished doctoral dissertation, Temple University, Philadelphia.
Earle, J. P. (1996). Acquaintance rape workshops: Their effectiveness in changing the attitudes of first year college men. NASPA Journal, 34, 2–17.
Echols, K. L. (1998).
Dating relationships and sexual victimization: An intervention program with college freshman males. Unpublished doctoral dissertation, University of Alabama, Tuscaloosa.

Fischer, G. J. (1986). College student attitudes toward forcible date rape: Changes after taking a human sexuality course. Journal of Sex Education and Therapy, 12, 42–46.
Flores, S. A., & Hartlaub, M. G. (1998). Reducing rape-myth acceptance in male college students: A meta-analysis of intervention studies. Journal of College Student Development, 39, 438–448.
Fonow, M. M., Richardson, L., & Wemmerus, V. A. (1992). Feminist rape education: Does it work? Gender and Society, 6, 108–121.
Forst, L. S., Lightfoot, J. T., & Burrichter, A. (1996). Familiarity with sexual assault and its relationship to the effectiveness of acquaintance rape prevention programs. Journal of Contemporary Criminal Justice, 12, 28–44.
Foubert, J. D. (2000). The longitudinal effects of a rape-prevention program on fraternity men's attitudes, behavioral intent, and behavior. Journal of American College Health, 48, 158–163.
Foubert, J. D., & Marriott, K. A. (1997). Effects of a sexual assault peer education program on men's belief in rape myths. Sex Roles, 36, 259–268.
Foubert, J. D., & McEwen, M. K. (1998). An all-male rape prevention peer education program: Decreasing fraternity men's behavioral intent to rape. Journal of College Student Development, 39, 548–556.
Frazier, P., Valtinson, G., & Candell, S. (1994). Evaluation of a coeducational interactive rape prevention program. Journal of Counseling and Development, 73, 153–158.
Gibson, P. R. (1991). An intervention designed to modify attitudes toward acquaintance rape in college students. Unpublished doctoral dissertation, University of Rhode Island, Kingston.
Gidycz, C. A., Layman, M. J., Rich, C. L., Crothers, M., Gylys, J., Matorin, A., et al. (2001). An evaluation of an acquaintance rape prevention program. Journal of Interpersonal Violence, 16, 1120–1138.
Gidycz, C. A., Lynn, S. J., Rich, C. L., Marioni, N. L., Loh, C., Blackwell, L. M., et al. (2001).
The evaluation of a sexual assault risk reduction program: A multisite investigation. Journal of Consulting and Clinical Psychology, 69, 1073–1078.
Gidycz, C. A., Rich, C. L., & Marioni, N. L. (2002). Interventions to prevent rape and sexual assault. In J. Petrak & B. Hedge (Eds.), The trauma of adult sexual assault: Treatment, prevention, and policy (pp. 235–259). New York: Wiley.
Gilbert, B. J., Heesacker, M., & Gannon, L. J. (1991). Changing the sexual aggression-supportive attitudes of men: A psychoeducational intervention. Journal of Counseling Psychology, 38, 197–203.
Gillies, R. A. (1997). Providing direct counter arguments to challenge male audiences' attitudes toward rape. Unpublished doctoral dissertation, University of Missouri–Columbia.
Hanson, K. A., & Gidycz, C. A. (1993). Evaluation of a sexual assault prevention program. Journal of Consulting and Clinical Psychology, 61, 1046–1052.
Harrison, P. J., Downes, J., & Williams, M. D. (1991). Date and acquaintance rape: Perceptions and attitude change strategies. Journal of College Student Development, 32, 131–139.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Heppner, M. J., Humphrey, C. F., Hillenbrand-Gunn, T. L., & DeBord, K. A. (1995). The differential effects of rape prevention programming on attitudes, behavior, and knowledge. Journal of Counseling Psychology, 42, 508–518.
Heppner, M. J., Neville, H. A., Smith, K., Kivlighan, D. M., & Gershuny, B. S. (1999). Examining immediate and long-term efficacy of rape prevention programming with racially diverse college men. Journal of Counseling Psychology, 46, 16–26.
Hoehn, M. J. (1992). An investigation of the impact of educational programs on attitudes toward rape held by female undergraduate students. Unpublished master's thesis, Texas Woman's University, Denton.
Intons-Peterson, M. J., Roskos-Ewoldsen, B., Thomas, L., Shirley, M., & Blut, D. (1989).
Will educational materials reduce negative effects of exposure to sexual violence? Journal of Social and Clinical Psychology, 8, 256–275.
Jensen, L. A. (1993). College students' attitudes toward acquaintance rape: The effects of a prevention intervention using cognitive dissonance theory. Unpublished doctoral dissertation, University of Alabama, Tuscaloosa.
Johnson, P. J. R. (1978). The effects of rape education on male attitudes toward rape and women. Unpublished doctoral dissertation, Texas Woman's University, Denton.
Johnson, J. D., & Russ, I. (1989). Effects of salience of consciousness-raising information on perceptions of acquaintance versus stranger rape. Journal of Applied Social Psychology, 19, 1182–1197.
Kline, R. J. (1993). The effects of a structured group rape-prevention program on selected male personality correlates of abuse toward women. Unpublished doctoral dissertation, Lehigh University, Bethlehem, PA.
Koss, M. P., Gidycz, C. A., & Wisniewski, N. (1987). The scope of rape: Incidence and prevalence of sexual aggression and victimization in a national sample of higher education students. Journal of Consulting and Clinical Psychology, 55, 162–170.
Lanier, C. (1995). Evaluation of a date rape prevention program for new students in a university setting. Unpublished doctoral dissertation, University of Texas, Houston.
Layman-Guadalupe, M. J. (1996). Evaluation of an acquaintance rape awareness program: Differential impact upon acknowledged and unacknowledged rape victims. Unpublished doctoral dissertation, Ohio University, Athens.
Lenihan, G. O., & Rawlins, M. E. (1994). Rape supportive attitudes among Greek students before and after a date rape prevention program. Journal of College Student Development, 35, 450–455.
Lenihan, G. O., Rawlins, M. E., Eberly, C. G., Buckley, B., & Masters, B. (1992). Gender differences in rape supportive attitudes before and after a date rape education intervention. Journal of College Student Development, 33, 331–337.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
Lonsway, K. A. (1996). Preventing acquaintance rape through education: What do we know? Psychology of Women Quarterly, 20, 229–265.
Lonsway, K. A., Klaw, E. L., Berg, D. R., Waldo, C. R., Kothari, C., Mazurek, C. J., et al. (1998). Beyond "no means no": Outcomes of an intensive program to train peer facilitators for campus rape education. Journal of Interpersonal Violence, 13, 73–92.

Lonsway, K. A., & Kothari, C. (2000). First year campus rape education: Evaluating the impact of a mandatory intervention. Psychology of Women Quarterly, 24, 220–232.
Malamuth, N. M., & Check, J. V. P. (1984). Debriefing effectiveness following exposure to pornographic rape depictions. The Journal of Sex Research, 20, 1–13.
Mann, C. A., Hecht, M. L., & Valentine, K. B. (1988). Performance in a social context: Date rape versus date right. Central States Speech Journal, 3/4, 269–280.
Marx, B. P., Calhoun, K. S., Wilson, A. E., & Meyerson, L. A. (2001). Sexual revictimization prevention: An outcome evaluation. Journal of Consulting and Clinical Psychology, 69, 25–32.
McCall, G. J. (1993). Risk factors and sexual assault prevention. Journal of Interpersonal Violence, 8, 277–295.
McLeod, P. A. (1997). The impact of rape education on rape attributions and attitudes: Comparison of a feminist intervention and a miscommunication model intervention. Unpublished doctoral dissertation, University of South Carolina, Columbia.
Michener, S. O. (1996). An analysis of rape aggression defense as a method of self-empowerment for women. Unpublished doctoral dissertation, Walden University, Minneapolis, MN.
Murphy, D. K. (1997). Date rape prevention programs: Effects on college students' attitudes. Unpublished doctoral dissertation, Ball State University, Muncie, IN.
Nagler, M. (1993). Effect of an acquaintance rape prevention program on rape attitudes, knowledge, intent, and sexual assault reporting. Unpublished doctoral dissertation, Hofstra University, Hempstead, NY.
Nelson, E. S., & Torgler, C. C. (1990). A comparison of strategies for changing college students' attitudes toward acquaintance rape. Journal of Humanistic Education and Development, 29, 69–85.
Neville, H. A., & Heppner, M. J. (2002). Prevention and treatment of violence against women: An examination of sexual assault. In C. L. Juntunen & D.
Atkinson (Eds.), Counseling across the lifespan: Prevention and treatment (pp. 261–277). Thousand Oaks, CA: Sage.
Nichols, R. K. (1991). The impact of two types of rape education programs on college students' attitudes. Unpublished doctoral dissertation, University of Missouri–Columbia.
Northam, E. (1997). The evaluation of a university based acquaintance rape prevention program. Unpublished doctoral dissertation, University of Oklahoma, Norman.
Ostrowski, M. J. (1991). Incidence and attitudes involving sexual assault during courtship in an undergraduate college sample: The effects of education and empathy training. Unpublished doctoral dissertation, University of South Dakota, Vermillion.
Parkin, S., & McKeganey, N. (2000). The rise and rise of peer education approaches. Drugs: Education, Prevention and Policy, 7, 293–310.
Pinzone-Glover, H. A., Gidycz, C. A., & Jacobs, C. D. (1998). An acquaintance rape prevention program: Effects on attitudes toward women, rape-related attitudes, and perceptions of rape scenarios. Psychology of Women Quarterly, 22, 605–621.
Rosenthal, E. H., Heesacker, M., & Neimeyer, G. J. (1995). Changing the rape-supportive attitudes of traditional and non-traditional male and female college students. Journal of Counseling Psychology, 42, 171–177.
Rosenthal, R. (1991). Meta-analytic procedures for social research. Newbury Park, CA: Sage.
Rozee, P. D., & Koss, M. P. (2001). Rape: A century of resistance. Psychology of Women Quarterly, 25, 295–311.
Saberi, D. (1999). Acquaintance rape prevention: Changing rape-supportive attitudes of college students. Unpublished doctoral dissertation, Arizona State University, Tempe.
Sanday, P. R. (1990). Fraternity gang rape: Sex, brotherhood, and privilege on campus. New York: New York University Press.
Schewe, P. A. (2002). Guidelines for developing rape prevention and risk reduction interventions. In P. A.
Schewe (Ed.), Preventing intimate partner violence: Developmentally appropriate interventions across the lifespan (pp. 107–136). Washington, DC: American Psychological Association.
Schewe, P. A., & O'Donohue, W. (1993a). Rape prevention: Methodological problems and new directions. Clinical Psychology Review, 13, 667–682.
Schewe, P. A., & O'Donohue, W. (1993b). Sexual abuse prevention with high-risk males: The roles of victim empathy and rape myths. Violence and Victims, 8, 339–351.
Schewe, P. A., & O'Donohue, W. (1996). Rape prevention with high-risk males: Short-term outcome of two interventions. Archives of Sexual Behavior, 25, 455–471.
Schewe, P. A., & Shizas, B. A. (2002). Rape prevention with college-age males: Unexpected outcomes from a videotaped program versus a peer-mediated group discussion. Unpublished manuscript, University of Illinois, Chicago.
Schultz, S. K., Scherman, A., & Marshall, L. J. (2000). Evaluation of a university based date rape prevention program: Effect on attitudes and behavior related to rape. Journal of College Student Development, 41, 193–201.
Schwartz, M. D., & DeKeseredy, W. S. (1997). Sexual assault on the college campus: The role of male peer support. Thousand Oaks, CA: Sage.
Schwartz, M. D., & Wilson, N. (1993). We're talking but are they listening? The retention of information from sexual assault programming for college students. Free Inquiry in Creative Sociology, 21, 3–8.
Stock, W. A. (1994). Systematic coding for research synthesis. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 125–138). New York: Russell Sage Foundation.
Sweetser, S. S. (1995). Gender differences in attitudes toward and attributions of responsibility in date and acquaintance rape. Unpublished doctoral dissertation, Boston University, Boston.
Tarrant, J. M. (1997). Rape education for men: A comparison of two interventions. Unpublished doctoral dissertation, University of Missouri–Columbia.
Walker, S. A., & Avis, M. (1999).
Common reasons why peer education fails. Journal of Adolescence, 22, 573–577.
Wolford, M. J. (1993). The effects of educational programs about rape on the attitudes of first-year urban university students. Unpublished doctoral dissertation, Old Dominion University, Norfolk, VA.
Yeater, E. A. (2000). An evaluation of a sexual assault prevention program for female college students. Unpublished doctoral dissertation, University of Nevada, Reno.
Yeater, E. A., & O'Donohue, W. (1999). Sexual assault prevention programs: Current issues, future directions, and the potential efficacy of interventions with women. Clinical Psychology Review, 19, 739–771.