PREDICTING SUCCESS IN THE COMPUTER SCIENCE DEGREE USING ROC ANALYSIS

Arturo Fornés (arforser@fiv.upv.es), José A. Conejero (aconejero@mat.upv.es) 1, Antonio Molina (amolina@dsic.upv.es), Antonio Pérez (aperez@upvnet.upv.es), Eduardo Vendrell (even@isa.upv.es), Andrés Terrasa (aterrasa@dsic.upv.es), and Emilio Sanchis (esanchis@dsic.upv.es)

Affiliation: Facultad de Informática. Universidad Politécnica de Valencia, Spain.
Postal address: Facultad de Informática, Universidad Politécnica de Valencia, Camino de Vera s/n, E-46022 Valencia, Spain
Phone number: +34-963877200

ABSTRACT: ROC (Receiver Operating Characteristic) curves are widely used in medical decision-making. Given a measurable characteristic that takes a continuous value for every individual in a group, ROC analysis helps to determine a threshold on that value, which is then used to predict the classification of individuals into discrete classes. At the Facultad de Informática of Valencia (Spain), we have applied ROC analysis to our students' high school cumulative grade point average, to their grade point average in the university entrance examination (Selectividad), and to their rate of passed subjects. Our purpose is to prevent freshmen from dropping out of the Computer Science degree.

Keywords: ROC analysis, computer science degree, grade point average.

1. Assignment of students to Universities in Spain

Secondary school in Spain is divided into two parts: compulsory secondary education (12 to 16 years) and post-compulsory secondary education, which includes the baccalaureate or high school degree (Bachillerato) and the middle grade of vocational training. Each of these two post-compulsory options lasts two academic years. In order to enter the Computer Science degree, students must pass all high school subjects and sit a university entrance examination (Selectividad). This examination consists of a series of tests on the subjects taken during the high school period.

In the Spanish higher education system, the assignment of students to universities works as follows. Every student has a university access mark (hereafter UAM), which depends on the cumulative grade point average during high school (HSA, 60%) and on the grade point average in the university entrance examination (UEEA, 40%). All students can therefore be ranked by their UAM. Every student then submits an application to the administration in which he/she ranks his/her preferences on degrees and universities. Finally, the assignment is done as follows: the student with the highest UAM who has not yet been assigned to any degree is assigned to his/her most preferred degree that still has places available, and this step is repeated for the remaining students.

1 Contact author
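
The assignment procedure described above can be sketched in a few lines. This is only an illustrative sketch, not the administration's actual implementation: the function name `assign_degrees`, the student names, the degree names and the numbers of places are all hypothetical, and ties in UAM (which the text does not discuss) are simply broken by input order.

```python
def assign_degrees(students, capacity):
    """Assign each student to the first preferred degree with free places.

    students: list of (name, uam, preferences) tuples, where preferences
              is a list of degree names ordered from most to least preferred.
    capacity: dict mapping degree name -> number of available places.
    Returns a dict mapping name -> assigned degree (None if no place is left).
    """
    remaining = dict(capacity)  # copy, so the caller's dict is not modified
    assignment = {}
    # Students are processed from highest to lowest UAM.
    # Assumption: ties keep their input order (sorted() is stable).
    for name, uam, preferences in sorted(students, key=lambda s: s[1], reverse=True):
        for degree in preferences:
            if remaining.get(degree, 0) > 0:
                remaining[degree] -= 1
                assignment[name] = degree
                break
        else:
            # No preferred degree has places left.
            assignment[name] = None
    return assignment


# Hypothetical data, for illustration only.
students = [
    ("Ana", 8.4, ["CS", "Math"]),
    ("Luis", 7.9, ["CS", "Physics"]),
    ("Marta", 9.1, ["CS"]),
    ("Pau", 6.5, ["CS", "Math"]),
]
capacity = {"CS": 2, "Math": 1, "Physics": 1}
result = assign_degrees(students, capacity)
```

In this example the two CS places go to the two students with the highest UAM, and the remaining students fall back to their next preference. This is why the lowest UAM admitted to a degree depends on demand for that degree in a given year.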

For the sake of completeness, we point out that academic grades in Spain are given as a numerical value between 0 and 10. For further information we refer the reader to (Terrasa et al. 2006).

Marks          Name
0.0 to 4.9     Fail
5.0 to 6.9     Pass
7.0 to 8.9     Good
9.0 to 10.0    Excellent
10.0+          Distinction

Spanish Official Grading System

2. ROC Analysis

Although it was originally used mainly for recovering radio signals contaminated by noise, ROC analysis has been widely used in medicine for many decades (Zou 2002). In recent years it has also been introduced in artificial intelligence, machine learning, and data mining.

Let us consider the following prediction problem. Suppose that we have a group of individuals with a common measurable characteristic, and that the individuals are separated into two classes. If we want to predict their class by means of a test based on this characteristic, a threshold t0 on the values of the characteristic must be established. An individual with a value greater than or equal to t0 on the test is predicted as positive (p). Otherwise, it is predicted as negative (n).

Depending on the prediction and the actual class of each individual, there are four possible outcomes. If the prediction is p and the actual value is p, the outcome is called a true positive (TP); if the prediction is p but the actual value is n, it is called a false positive (FP). Conversely, a true negative (TN) occurs when both the prediction and the actual value are n, and a false negative (FN) when the prediction is n but the actual value is p. These values allow us to introduce the true positive rate (TPR), also known as sensitivity, as TPR = TP/(TP+FN). On the other hand, the false positive rate (FPR), also known as 1-specificity, is defined as FPR = FP/(FP+TN).

A Receiver Operating Characteristic curve, hereafter a ROC curve, is a graphical representation of the TPR (y axis) against the FPR (x axis) for a binary classifier as its discrimination threshold is varied. Since we are dealing with rates, this plot is always contained in the box [0,1]x[0,1].
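
As an illustration of these definitions, the following sketch counts the four outcomes for a given threshold and collects the points of a ROC curve, using the standard definitions TPR = TP/(TP+FN) and FPR = FP/(FP+TN). The function names and the sample marks are hypothetical, and the sketch assumes that both classes are present in the data (otherwise one of the rates is undefined).

```python
def confusion_counts(scores, labels, threshold):
    """Return (TP, FP, TN, FN) when score >= threshold is predicted positive."""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive and not is_positive:
            fp += 1
        elif not predicted_positive and is_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Return (TPR, FPR); assumes at least one positive and one negative."""
    tpr = tp / (tp + fn)  # sensitivity
    fpr = fp / (fp + tn)  # 1 - specificity
    return tpr, fpr


def roc_points(scores, labels):
    """Return the ROC curve as a list of (FPR, TPR) points.

    Every distinct score is tried as a threshold, from highest to lowest,
    so the curve runs from (0, 0) to (1, 1).
    """
    points = [(0.0, 0.0)]  # threshold above every score: nobody is positive
    for threshold in sorted(set(scores), reverse=True):
        tpr, fpr = rates(*confusion_counts(scores, labels, threshold))
        points.append((fpr, tpr))
    return points


# Hypothetical marks; True = positive class, False = negative class.
scores = [8.5, 7.9, 7.2, 6.8, 6.1, 5.5]
labels = [True, True, False, True, False, False]
counts_at_7 = confusion_counts(scores, labels, 7.0)
curve = roc_points(scores, labels)
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing along the curve.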

[Figure: image taken from www.wikipedia.org]

A perfect classifier would yield the point (0,1), in the upper left corner, which means that all positives are found and no false positives occur. The closer the ROC curve is to this point, the better the classifier. Points above the diagonal line therefore indicate better-than-random classification. The area under the curve (AUC) summarizes this behaviour: areas above 0.7 indicate an acceptable classifier. In contrast, points along the diagonal line correspond to a classifier that guesses at random; in this case the area under the curve is around 0.5. For further details on ROC analysis we refer the reader to (Fawcett 2004).
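
The area under the curve can be computed without drawing the curve. The sketch below uses the well-known rank interpretation of the AUC: the probability that a randomly chosen positive individual has a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auc` and the sample marks are hypothetical, and this is not necessarily how a statistics package computes the value.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    individual has the higher score; ties count as half a win.
    Assumes at least one positive and one negative individual.
    """
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical marks; True = positive class, False = negative class.
scores = [8.5, 7.9, 7.2, 6.8, 6.1, 5.5]
labels = [True, True, False, True, False, False]
area = auc(scores, labels)  # 8 of the 9 pairs are correctly ordered
```

A score that separates the two classes perfectly gives an area of 1, while a score that carries no information (for instance, the same value for everyone) gives 0.5, in agreement with the diagonal line described above.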

3. The syllabus of our Computer Science degree

The syllabus of the Computer Science degree at the Facultad de Informática of Valencia is made up of 10 semesters. The whole syllabus amounts to 375 Spanish credits (1 Spanish credit is equivalent to 10 teaching hours), divided into two stages. The first stage takes 6 semesters and the second one takes 4. One academic year comprises two semesters, and every semester lasts 14 weeks with around 37.5 credits (375 teaching hours). There are three types of subjects: compulsory, optional, and elective. In order to obtain the degree, all the compulsory subjects must be passed, together with a certain number of credits from optional and elective subjects. Compulsory subjects predominate in the first two years and in the fourth year of the degree. The distribution of credits is as follows:

STAGE       YEAR   COMPULSORY   OPTIONAL   ELECTIVE   TOTAL
1st Stage   1      66           0          6          72
1st Stage   2      72           0          6          78
1st Stage   3      33           30         12         75
2nd Stage   4      54           18         3          75
2nd Stage   5      15           48         12         75

The following table lists the first year subjects. Most of them last two semesters, and a high number of credits is devoted to basic engineering subjects (physics and mathematics). All the subjects, except Technical English, are compulsory.

First Year Subjects                             Semester   Credits
Calculus                                        A+B        12
Computer Fundamentals                           A+B        12
Fundamentals of Physics for Computer Science    A+B        9
Programming                                     A+B        12
Discrete Mathematics and Linear Algebra         A          9
Computer Technology                             B          6
Numerical Computation                           B          6
Elective Subjects (Technical English)           B          6

4. Details of our study

Our study has been carried out with all freshmen from 2001/2002 until 2005/2006, around 750 students. For every student we have collected the following data: HSA, the cumulative grade point average in the high school degree; UEEA, the grade point average in the university entrance examination; and UAM, the university access mark (which is calculated as 0.6*HSA+0.4*UEEA).
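
The data collected for each student, and the UAM formula in particular, can be written down as a small record type. This is only a sketch: the class name `StudentRecord` and the example marks are hypothetical, and the range check assumes that both marks lie on the 0 to 10 Spanish grading scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRecord:
    """Pre-university marks of one student (hypothetical record layout)."""
    hsa: float   # cumulative grade point average in high school
    ueea: float  # grade point average in the university entrance examination

    def __post_init__(self):
        # Assumption: both marks are on the 0-10 Spanish grading scale.
        for field_name in ("hsa", "ueea"):
            value = getattr(self, field_name)
            if not 0.0 <= value <= 10.0:
                raise ValueError(f"{field_name} must be between 0 and 10, got {value}")

    @property
    def uam(self):
        """University access mark: 60% HSA plus 40% UEEA."""
        return 0.6 * self.hsa + 0.4 * self.ueea


# Hypothetical student: HSA 7.5 and UEEA 6.0 give a UAM of about 6.9.
example = StudentRecord(hsa=7.5, ueea=6.0)
example_uam = example.uam
```

Because the HSA carries the larger weight, the UAM always lies between the two marks and closer to the HSA.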
These marks have been analyzed with the Analyse-it package in order to determine whether they are good classifiers for predicting whether a student will succeed or drop out. A student who continues in the degree is considered a positive. A student who drops out of the degree, even if he/she goes on to study another degree at our university, is considered a negative.

The area under the curve of these indicators is 0.66 for the HSA, 0.62 for the UEEA, and 0.67 for the UAM. Therefore, none of them is a good enough predictor. We must point out that two other indicators were also considered: the marks obtained in mathematics and in physics in the university entrance examination. They turned out not to be good classifiers, even for the mathematics and physics subjects of the Computer Science degree; in these cases the area under the curve is around 0.63. For the subjects of mathematics and physics the best classifier was the UAM, with an area under the curve of approximately 0.75.

In addition, the number of passed compulsory credits (PCC) was also considered as a predictor of student success. In fact, this is the best predictor, since its area under the curve is about 0.84.

An optimal threshold has then been computed for the HSA, UAM and PCC for every year. These points are optimal in the sense that they jointly minimize the frequency of false positives and false negatives.

5. Results

The table below shows the mean value and the optimal threshold of every predictor for the freshmen of each year.

                 2001    2002    2003    2004    2005
Mean of HSA      8.17    8.06    7.96    7.45    7.46
Optimal HSA      8.11    7.92    7.8     7.36    7.4
Mean of UAM      7.64    7.64    7.49    7.01    7.02
Optimal UAM      7.58    7.35    7.13    6.84    6.74
Mean of # PCC    48.55   51.34   42.28   32.89   33.85
Optimal # PCC    45      45      36      21      18

As can be seen, the optimal points for HSA and UAM decrease over the years. These results are strongly related to the fall in the UAM of the freshmen. The table below shows the lowest UAM among the freshmen of every year. The correlation coefficient between these values and the HSA and UAM optimal points is over 0.98.

                 2001    2002    2003    2004    2005
Lowest UAM       7.28    7.07    6.74    6.38    6.32

We should also point out that the number of passed credits is computed at the end of the academic year 2005-2006.
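
The correlation reported above can be re-derived from the two tables of this section. The sketch below assumes that the coefficient in question is the Pearson correlation coefficient (the text does not name it); the function `pearson` is a plain textbook implementation, and the lists are copied from the rows "Lowest UAM", "Optimal HSA" and "Optimal UAM".

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Values for 2001-2005, taken from the tables in this section.
lowest_uam = [7.28, 7.07, 6.74, 6.38, 6.32]
optimal_hsa = [8.11, 7.92, 7.8, 7.36, 7.4]
optimal_uam = [7.58, 7.35, 7.13, 6.84, 6.74]

r_hsa = pearson(lowest_uam, optimal_hsa)  # lowest UAM vs. optimal HSA
r_uam = pearson(lowest_uam, optimal_uam)  # lowest UAM vs. optimal UAM
```

With these five yearly values, both coefficients come out above 0.98 (roughly 0.98 for the HSA thresholds and close to 1.00 for the UAM thresholds), which is consistent with the figure stated in the text.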
Consequently, the credit counts cover 5 years for the freshmen of 2001-2002, 4 years for those of 2002-2003, and so on. This is why the optimal number of PCC appears to decrease so sharply over the years.

The true positive rate (TPR) and true negative rate (TNR) of these indicators are shown below. The results are presented as percentages.

         2001            2002            2003            2004            2005
         TPR     TNR     TPR     TNR     TPR     TNR     TPR     TNR     TPR     TNR
HSA      85.54   67.69   91.89   85.07   91.36   68.66   82.43   62.12   86.84   84.06
UAM      83.12   73.24   94.12   80.36   91.30   67.86   82.89   67.19   89.02   74.60
# PCC    93.88   48.00   97.09   65.79   92.71   63.46   93.55   40.43   95.05   54.55

We confirm that the best indicator is the number of PCC. Its TPR is around 95% and the mean of its TNR is 54%. The HSA and UAM only give good results for the true positives.

In addition, we have observed the following facts. Although the UAM has decreased, only 10% of the freshmen with a UAM in the first quartile have dropped out. Around 15% of the students who enter the degree drop out. They usually do so after the first year (66%) or after the second one (25%). The average number of passed compulsory credits for the freshmen who drop out is 6.5 out of 66. Of these students, 52.14% have passed 12 credits or fewer, that is, less than 16% of the academic year workload.

6. But do all of them really try?

As the number of passed compulsory credits of the students who have dropped out is very low, we have analyzed whether they actually sit the examinations. For the freshmen who have dropped out we have calculated the rate of passed exams over the number of exams they took.

Subject                                         Passed exams over exams taken
Calculus                                        8.10%
Computer Fundamentals                           24.12%
Fundamentals of Physics for Computer Science    7.96%
Programming                                     10.37%
Discrete Mathematics and Linear Algebra         19.59%
Computer Technology                             8.05%
Numerical Computation                           11.04%

Hardly any of the students who have passed programming, calculus, physics, and computer technology (the most difficult subjects) drop out of the degree. But one question remains open: do they really try? In Spain there are two exam periods for every subject during the academic year. The first one takes place right after the semester ends: January for the fall semester and June for the spring semester.
The second one is in June for fall semester subjects, and in September for spring semester and annual subjects. During the first exam period, these students usually sit 3 exams or fewer (18.2% sit 0 exams, 17.5% sit 1, 23.4% sit 2, 14.6% sit 3, and only 26.3% sit 4 exams or more). These rates decrease dramatically in the second period, since 64% of them sit 0 exams and 12.4% only one. Finally, we have observed that students who obtain a mark of 3 or below in programming in the first exam period decide to drop out straight away.

7. Conclusions

As we have seen, the HSA and UAM are two pre-university indicators that predict success very well, but only for the students with the higher marks. This is probably because they reflect continuous effort over long periods rather than success in particular exams.

In our University, we have developed a tutorial program called INTEGRA. Every professor is assigned as tutor to a few students (2-4). This information allows us to concentrate our efforts on the students predicted as negative in order to improve freshmen success.

Despite being the best predictor, the number of passed credits has a drawback in our syllabus: it can only be computed at the end of June, since nearly all the compulsory subjects of the first year are annual. At present, we only have this information once most of the students who drop out have already made their decision. To correct this, the new syllabus will consist entirely of one-semester subjects during the first year.

To sum up, we have realized that the most important thing is for students to succeed in the first exam period. Therefore, we have decided to start additional support classes for the freshmen. These will be strongly recommended for the students predicted as negative.

As future work, we plan to extend this study with other indicators, such as information about the student workload in every subject of the degree (Molina et al. 2007).

8. Acknowledgements

The authors acknowledge the support of the PACE Project, conducted at the Universidad Politécnica de Valencia.

9. Bibliography

Integra Program. http://www.upv.es/entidades/vai/menu_516916c.html

Fawcett, T. ROC Graphs: Notes and Practical Considerations for Researchers. Technical report, Palo Alto, USA; HP Laboratories, 2004.

Computer Science Degree. RESOLUCIÓN de 21 de septiembre de 2001, de la Universidad Politécnica de Valencia, por la que se ordena la publicación del plan de estudios de Ingeniero en Informática de la Facultad de Informática de esta Universidad.

Molina, A., Terrasa, A., Vendrell, E., Sanchis, E. ECTS evaluation in the Faculty of Computer Science of the Polytechnic University of Valencia. International Conference on Engineering Education ICEE 2007. September 2007, Coimbra, Portugal.

PACE Program. Plan General de la UPV para la Promoción y Dinamización de la Convergencia Europea. Vicerrectorado de Estudios y Convergencia Europea. Universidad Politécnica de Valencia. 2005. http://www.upv.es/entidades/vece/menu_592108c.html

Terrasa, A., Vendrell, E., Sanchis, E. and Conejero, A. The Spanish experience of adapting to the ECTS system. ECTS Assessment in Higher Education (I.S.B.N.: 1103-2685). Department of Educational Measurement - Umea University, pp. 129-139, 2006.

Vendrell, E., Terrasa, A., Conejero, J.A., Sanchis, E. A Project to Establish a Context for Teaching Innovation at the Faculty of Computer Science of Valencia. International Conference on Engineering Education, iNEER ICEE 2006. July 2006, Puerto Rico, USA.

Vivo, J.M., Sánchez de la Vega, M.M., and Franco, M. Estudio del Rendimiento Académico Universitario Basado en Curvas ROC. Revista de Investigación Educativa 22 (2) 327-340, 2004.

Zou, K.H. Receiver operating characteristic (ROC) literature research. 2002. Online bibliography available from http://splweb.bwh.harvard.edu:8000/pages/ppl/zou/roc.html

http://www.wikipedia.org