The Impact of Web-Based Instruction on Performance in an Applied Statistics Course

Robert J. Hall
Department of Educational Psychology
Texas A&M University
United States

Michael S. Pilant
Department of Mathematics
Texas A&M University
United States

R. Arlen Strader
Department of Educational Psychology
Texas A&M University
United States

Abstract: Forty-one graduate and undergraduate students enrolled in Summer or Fall sections of an applied statistics course self-selected into one of three groups. Group 1 subjects (n = 11) took chapter pretests online and printed out results. Subjects in group 2 (n = 18) printed copies of the pretests but did not take tests online. Group 3 subjects (n = 12) did not access the pretests. Pretests for exams 1 and 3 were made available through a class website using online-testing software developed by the authors. Demographic information (educational background and course learning and performance goals) and prerequisite math skills scores were collected on the first day of class. Groups could not be distinguished, a priori, on the basis of either demographic data or math skill scores. There were, however, statistically significant differences among groups on exam performance, favoring students in the online-pretest group. Factors influencing student performance, as well as issues for future study, are addressed.

Overview and Purpose

The area of human-computer interaction combines information from several academic areas (i.e., computer science, education, and psychology). For example, concepts inherent to technology such as flow, feedback, formatting, detail, organization, and consolidation have been relevant to cognitive psychology and to instructional design for a number of decades. According to Dillon and Zhu (1997), technology has "enhanced our ability to apply these [learning] principles over distributed learner populations with higher fidelity than previously" (p. 223).
One major component of instructional design, and one of the more difficult areas in Web-based instruction to verify, is "effectiveness" (Reeves & Reeves, 1997). Several key questions related to effectiveness must be considered in order to document how Web-based instruction affects academic performance. Questions related to course orientation (i.e., for-credit, not-for-credit, workforce related), learning objectives (i.e., factual, algorithmic, strategic), and nature of instruction (i.e., skills-based vs. general problem solving) highlight important issues facing developers of Web-based courses and instruction. The issue of concern for this study, however, is assessment: exploring how technology can add to our understanding of human learning and performance, and whether carefully designed Web-based supplemental study aids can be shown to affect classroom performance as measured by course exams. In that regard, we are interested in questions such as "How will performance metrics for a Web-based course be defined?" and "Will assessment focus primarily on pre-testing (for diagnostic purposes), post-testing (to determine subject mastery), or both?" These issues are interrelated in a complex manner and make questions about assessing the effectiveness of Web-based instruction challenging ones. The purpose of this study, then, is to evaluate the impact of online pretesting and feedback on outcome performance as measured by in-class multiple-choice and workout exams in educational statistics.
Method

Subjects. This study focuses on two cohorts of graduate and undergraduate students enrolled in a first-semester course in applied statistics for the behavioral sciences. The first cohort included 19 students (13 females, 6 males) enrolled in a five-week session during the summer of 1998. The second cohort included 22 students (16 females, 6 males) enrolled in the same class during the fall of 1998. The same instructor taught both courses.

Procedure. On the first day of class, students were given a password and student ID that enabled them to access the class website. All course materials, including HTML versions of the in-class lecture slides (Microsoft® PowerPoint), problems, assignments, announcements, threaded discussion groups, class rosters with links, Java-based statistics applets, and chapter pretests, were made available to students through the website. All but one of the 41 students participating in this study reported owning a computer, and only one student who owned a computer did not have a modem. In addition, there are many open-access labs at the University housing hundreds of computers with high-speed Ethernet connections. In other words, access to a computer should not have been an issue for any student taking this course.

Students were given a math prerequisite exam on the first day of class (59 questions) that covered basic arithmetic, advanced arithmetic, complex problems, and simple algebra skills. Students were not allowed to use calculators to answer any of the questions on the test. The math prerequisite test was used to compare math skill levels across treatment groups. Results for the three groups are summarized in Table 2. During the first week of class, students were required to fill out an online demographic questionnaire made up of 36 questions. Tables 3 and 4 summarize results from questionnaire data by treatment group. Finally, three days prior to each exam, chapter pretests were made available to students.

Instrument.
The authors have developed an assessment tool that addresses some of the issues outlined in the introduction. A series of linked components make up the online testing instrument. The first is an "administration" or "authoring" component that is Web-based and password protected. In this environment an instructor can build or modify questions and exams and create classroom databases. A second component provides students access to the actual interactive testing. Finally, the third part is the database itself, which uses the program Microsoft® Access. Note that although Microsoft® Access is used for data storage, it is completely hidden behind a Web front end and therefore can be used from Unix and Mac systems. One of the unusual features programmed into the online testing tool is the ability to capture information related to motivation through use of sliders that measure students' "familiarity" with content objectives prior to testing, and "confidence" about response accuracy as the test proceeds. This provides students and instructors with information regarding depth of preparation, level of skill, and degree of uncertainty. Each question is timed transparently to the student. The format of the questions is unstructured - any valid HTML code can be included. Question types currently supported are true/false, multiple choice, short workout, and detailed workout. Questions can be interactive (through embedded Java applets), and can contain graphics and multimedia. Responses are recorded and a printable summary of the student's work - including computer-graded questions (true/false and multiple choice), explanations, time spent answering individual questions, reported confidence and help, and reported familiarity with learning objectives - is returned when all questions have been answered. Students review one question at a time, and once questions are submitted, they cannot be revisited.
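The kind of record such a tool collects can be sketched as a small data structure. This is only an illustration under stated assumptions, not the authors' implementation, which the paper describes as a Web front end over a Microsoft Access database. The class names, field names, and the 0 to 100 slider range below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Question types the paper describes as graded by the computer.
COMPUTER_GRADED = {"true_false", "multiple_choice"}


@dataclass
class QuestionResponse:
    """One submitted answer (hypothetical field names)."""
    question_id: str
    question_type: str        # e.g. "multiple_choice", "short_workout"
    answer: str
    seconds_spent: float      # each question is timed
    confidence: int           # slider value; 0-100 range is an assumption
    correct: Optional[bool] = None  # set only for computer-graded items


@dataclass
class PretestSession:
    """One student's pass through a chapter pretest (hypothetical)."""
    student_id: str
    familiarity: Dict[str, int]  # objective -> slider value before testing
    responses: List[QuestionResponse] = field(default_factory=list)

    def submit(self, response: QuestionResponse) -> None:
        # Submitted questions cannot be revisited.
        if any(r.question_id == response.question_id for r in self.responses):
            raise ValueError(f"question {response.question_id} was already submitted")
        self.responses.append(response)

    def bookmark(self) -> int:
        # Index of the next unanswered question, used to resume later.
        return len(self.responses)

    def summary(self) -> Dict[str, float]:
        # Printable report: time, graded results, and reported confidence.
        graded = [r for r in self.responses
                  if r.question_type in COMPUTER_GRADED and r.correct is not None]
        n = len(self.responses)
        return {
            "answered": n,
            "total_seconds": sum(r.seconds_spent for r in self.responses),
            "graded": len(graded),
            "graded_correct": sum(1 for r in graded if r.correct),
            "mean_confidence": sum(r.confidence for r in self.responses) / n if n else 0.0,
        }


# Illustrative data only; not taken from the study.
session = PretestSession("s001", {"sampling distributions": 40})
session.submit(QuestionResponse("q1", "multiple_choice", "b", 95.0, 70, correct=True))
session.submit(QuestionResponse("q2", "short_workout", "t = 2.1", 310.0, 35))
print(session.summary())
```

Keeping time and confidence on each response, rather than only on the test as a whole, is what would allow a report to relate uncertainty to individual questions.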
If there is not enough time to complete a pretest, a bookmark feature allows the student to return to the test at the point where he or she stopped. Using this tool, instructors can maintain question, exam, student, and class databases, and link categories and/or objectives to individual questions. The authoring tool is completely accessible through the Web (although it is password protected for security reasons).

Results

Design. On the basis of the amount of time spent engaged with the online chapter pretests, students were placed into one of three groups. The first group, Pretested Online (n = 11), spent at least 20 minutes online working on each chapter pretest (15 to 18 multiple-choice questions and workout problems per test). The second group, Download Only (n = 18), was made up of those students who visited the test sites but only took enough time to print out the test. These students averaged less than five minutes with each pretest. The third group, No Access (n = 12), did not access the pretests, choosing instead to study for the exams without benefit of these study aids.

Analysis. A split-plot analysis of variance was used to analyze data from this study. The design had one between-subjects factor, Pretest Group (3 levels: Pretested Online/Download Only/No Access), and two within-subjects factors, Exam (3 levels: Exam 1/Exam 2/Exam 3) and Question Type (2 levels: Multiple Choice/Workout). All three main effects and one interaction effect, Exam x Question Type, were statistically significant. For Pretest Group, F(2, 38) = 6.31, p < .004, η² = .25, ω² = .21, power = .87. For Exam, F(2, 76) = 21.98, p < .001, η² = .37, power = [missing]. For Question Type, F(1, 38) = [missing], p < .001, η² = .82, power = [missing]. For Exam x Question Type, F(2, 76) = 8.60, p < .001, η² = .19, power = .96. Table 1 summarizes means and standard deviations for the different cells in the design.

Table 1: Descriptive statistics for exams by question type and group. MC = Multiple Choice; WO = Workout. [Rows: Pretested online (n = 11), Downloaded pretests (n = 18), No access (n = 12), Total. Columns: Ex. 1 MC%, Ex. 1 WO%, Ex. 2 MC%, Ex. 2 WO%, Ex. 3 MC%, Ex. 3 WO%. Cell values not recoverable.]

Discussion

In this study, we were interested in whether students who took pretests online (OT) and then reviewed printed answers and explanations would outperform, on in-class exams, students who only downloaded (DO) pretests and/or students who did not access (NA) pretests. Statistically significant differences separating the OT group from the NA group were obtained. Moreover, there were consistently observed, but not statistically significant, differences between the OT and DO groups and between the DO and NA groups. On the basis of this analysis, we are tempted to conclude that taking Web-based pretests resulted in better performance on the exams given in class. At best, however, this conclusion is preliminary. Students were not randomly assigned to treatment groups but rather self-selected into the respective groups. Thus, systematic bias tied to individual differences (e.g., ability, motivation, and educational background) inherent to the self-selection process could be cited to account for the better performance of the OT group.
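The effect sizes reported in the Analysis paragraph can be cross-checked against the reported F ratios. The sketch below assumes that η² there is partial eta squared and that ω² for the between-subjects factor follows the usual one-way formula; the paper does not say which formulas or software were used, and the function names are illustrative.

```python
def partial_eta_squared(f_ratio: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_ratio * df_effect) / (f_ratio * df_effect + df_error)


def omega_squared_between(f_ratio: float, df_effect: int, n_total: int) -> float:
    """Omega squared for a between-subjects effect, from F and total sample size."""
    numerator = df_effect * (f_ratio - 1.0)
    return numerator / (numerator + n_total)


# (label, F, df_effect, df_error) as reported in the text.
reported = [
    ("Pretest Group", 6.31, 2, 38),
    ("Exam", 21.98, 2, 76),
    ("Exam x Question Type", 8.60, 2, 76),
]

for label, f_ratio, df_effect, df_error in reported:
    eta = partial_eta_squared(f_ratio, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.2f}")

# Omega squared for Pretest Group, with N = 41 students.
print(f"Pretest Group: omega squared = {omega_squared_between(6.31, 2, 41):.2f}")
```

Under these assumptions the Pretest Group and Exam values match the reported .25, .21, and .37. The Exam x Question Type value comes out at about .18, slightly below the reported .19. The text does not make the source of that gap clear.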
Clearly, one factor that might promote the observed differences in performance would be general math ability. All students were given a math prerequisite exam on the first day of class. Table 2 summarizes exam results for each of the treatment groups. The Wilks' lambda F for a one-way multivariate ANOVA testing for group differences among the four dependent math skill measures was not statistically significant, F(8, 70) = 1.01, p = .434. Hence, students in the three treatment groups could not be distinguished, a priori, on the basis of their general math ability as measured by the prerequisite test scores. Given that the groups did not differ in general math ability, we turned to investigate factors, other than online pretesting, that might account for the observed group performance differences. Tables 3 and 4 summarize findings from the demographic questionnaire by treatment group. Table 3 presents frequency counts for variables related to educational background and course goals by treatment group. Proportional representation for the variables gender, class cohort, year in school, and number of hours employed during the semester was approximately the same for all groups. Similarly, almost all students reported grades of A or B in their high school algebra courses, with the ratio of A's to B's about 2 to 1. All but one student reported owning a computer, and when asked "How comfortable are you working with Web browsers to obtain information?" the least comfortable group was OT (61/100), followed by DO (74/100) and NA (77/100). Most students had not taken a computer course of any kind, and all students indicated that their grade goal for the class was an A or B. The primary learning goal reported for the OT and NA groups was conceptual understanding, but all three groups placed importance on developing computational skill. Finally, in all groups, most students anticipated that they would spend 7 to 12 hours a week studying for the class.
Information summarized in Table 3, then, argues more for homogeneity than for heterogeneity across groups. Next we looked to see if measures related to general motivation (e.g., confidence, importance, ability, or effort) might form a basis for differentiating groups, thus undermining the conclusion that the online testing experience was a major factor in determining exam performance. Table 4 summarizes the descriptive statistics that address questions tied to motivation. Univariate one-way ANOVAs were used to test for group differences for each of the rating measures. None were statistically significant. In sum, even though students were not randomly assigned to treatment groups, the only reliable difference among groups that we could establish to account for differences in exam performance was the nature of the interaction that students had with the chapter pretests.

Table 2: Prerequisite math score descriptive statistics by online pretest group. [Rows: Pretested online (n = 11), Downloaded pretests (n = 18), No access (n = 12), Total. Columns: Basic arithmetic (30 pts), Advanced arithmetic (14 pts), Complex arithmetic (11 pts), Simple algebra (4 pts), Total score (59 pts). Cell values not recoverable.]

Table 3: Demographic survey results summarized by pretest access group (n = 41). [Column groups: Tested Online (n = 11), Download Only (n = 18), No Access (n = 12). Survey questions and response categories: Gender (Male, Female); Class Cohort (Summer 98, Fall 98); Year in School (Undergrad, Grad); Hours Employed (None, < 20, > 20); HS Algebra Grade (A, B, C or <); Own a Computer? (Yes, No); Web Browser Use (< 1 wk, 1-2 wk, Daily); Computer Course? (Yes, No); Grade Goal (A, B, Other); Primary Learning Goal a (Comp., Concep., Prep.); Anticipated Study Hours (three ranges, the highest being > 12). Cell values not recoverable.]
a For primary learning goals, the numbers indicate "yes" responses. Students could check more than one area.
Key: Comp = Computational skills; Concep = Conceptual understanding; Prep = Future course preparation.

Further support for the impact of the pretests comes from student feedback.
When asked "How helpful were the online chapter pretests in preparing you for the exams?" students who took advantage of the Web-based pretests responded with either a 5 or 6 (out of a possible 6) on a Likert scale rating instrument. In addition, many students provided written comments indicating that the online pretests (and the student reports generated from the pretests) helped them to focus their study efforts and to prepare for the in-class exams.
Others reported that the online tests provided an experience comparable to the actual test in that there was no immediate feedback regarding whether their answers were right or wrong and that they felt anxious about their performance, as they would on an exam. For some students, then, the pretests helped to focus their final preparation for the in-class exams.

Table 4: Confidence, importance, and ability ratings (scale = 1 to 100) summarized by pretest access group (n = 41). [Column groups: Tested Online (n = 11), Download Only (n = 18), No Access (n = 12), each reporting a mean rating and n. Rows: Grade Goal Importance a, Grade Goal Confidence b, Learning Goal Importance c, Learning Goal Confidence d, Math Ability for "B" e, Math Ability for Learning Goal f, Personal Math Ability g, Effort for "B" h, Effort for Learning Goal i, Personal Level of Effort j. Cell values not recoverable.]
a How important is the final grade in this course to you?
b How confident are you that you can attain your goal for the course grade?
c How important is your primary learning goal to you?
d How confident are you that you can obtain your primary learning goal for this course?
e How much mathematical ABILITY do you think it would take to obtain a grade of at least a B in this course?
f How much mathematical ABILITY do you think it would take to acquire the computational skill and conceptual understanding for this course needed to prepare you for future courses and/or your career?
g How much mathematical ABILITY do you think you have?
h How much EFFORT do you think it would take to obtain a grade of at least a B in this course?
i How much EFFORT do you think it would take to acquire the computational skill and conceptual understanding for this course needed to prepare you for future courses and/or your career?
j How much EFFORT do you intend to devote to this course?

In contrast were those students who chose not to access the online tests and those who accessed the tests but chose only to print out the questions and explanations.
The latter group fared better on the in-class exams than students who did not access pretests, but their performance, on average, was almost a full grade (8%) lower than that of the group who took full advantage of the online testing package. Some students in each group did well on the exams, but one interpretation is that just looking at questions and answers apprises students of specific information that might be asked on an exam but does not prime them to search through networked information. These students are not faced with producing reasonable responses within a time frame or with identifying areas of uncertainty that, when resolved, might result in more in-depth understanding of concepts and facts. At this point, evidence for the impact of online pretesting on performance is suggestive, not conclusive. Students from two classes, taught by the same instructor covering material in basic statistics, self-selected into treatment groups. Group members differed in how they chose to prepare for in-class exams. Although reasons for their choices are not known, we do know that those students who took online pretests outperformed those who did not. Moreover, no differences in the self-selected groups based on educational background, grades, computer literacy, or ability to do basic math or simple algebra could be established. Student motivation as reflected in self-rated effort, ability, and course learning and performance goals also did not distinguish the groups, nor did the amount of time that students reported studying for the exams. In other words, the groups, although not randomly populated, appear similar with respect to many of the individual difference variables that often are cited as major factors influencing classroom performance (Mayer, 1999). The only reliable difference among groups in this study was in the online aids that students chose to help them prepare for the in-class exams.
Nonetheless, our explanation of findings places too great an emphasis on a retroactive search for limits to the interpretation of the observed treatment effect. A more proactive posture, providing upfront control (i.e., random assignment from a well-defined pool of subjects), is advocated for future research. One final piece of information, however, is noteworthy. Figure 1 depicts exam-score distributions across groups. Pretests were available only for exams 1 and 3. With the treatment effect withdrawn for exam 2, it might be argued that the groups should regress to some common level of performance. Clearly that does not happen. Group 3's performance remains generally inferior
to that of the other two groups. This suggests a more generalized difference among groups not captured in the data that we collected. But there is also an indication of an online testing effect, one that affects performance in ways different from just studying using copies of pretests (e.g., Group 2). For the OT group, the group that took most advantage of pretest materials, the interquartile range (IQR) of scoring expanded about 50% when there were no pretest study aids (exam 2). When pretests were again available (exam 3), variability within the IQR returned to exam 1 levels. Note, too, the consistency of IQRs across groups on exam 2. Moreover, in terms of median performance, the OT and DO groups are similar across exams, but the DO group evidences much greater variability on exams 1 and 3 than does the OT group. The DO and NA groups differ in terms of median scores, but the respective IQRs are similar. What does all this mean?

Figure 1: Total-test score distributions. [Panels: Exam 1, Exam 2, Exam 3; y-axis: Score %; x-axis: Online Pretest Group (OT, DO, NA).]

It may well be that reduced variability is the real impact of the online pretesting experience. That is, students in the OT group were more densely packed into a narrower scoring range than were students in the other two groups.

In summary, work to this point looks promising. There is a robust and powerful test-authoring tool that affords the capability of collecting speed, accuracy, confidence, and familiarity data on students placed in test-like conditions. Observed performance differences appear to reflect a treatment effect, but the sample, although homogeneous, is small. For these reasons, work in this area needs to be extended to include more subjects placed in experimental groups controlled by experimenters rather than in self-selected groups based on subject choices. From this study, however, further work in this area appears justified.

References

Dillon, A., & Zhu, E. (1997). Designing web-based instruction: A human-computer interaction perspective. In B. H. Khan (Ed.), Web-based instruction. Englewood Cliffs, NJ: Educational Technology Publications.

Mayer, R. E. (1999). The promise of educational psychology: Learning in the content areas. Upper Saddle River, NJ: Prentice Hall.

Reeves, T. C., & Reeves, P. M. (1997). Effective dimensions of interactive learning on the world wide web. In B. H. Khan (Ed.), Web-based instruction. Englewood Cliffs, NJ: Educational Technology Publications.

Acknowledgements

This project was supported by a grant from the Academy for Advanced Telecommunications and Technology, College of Science, Texas A&M University.