THE ROLE OF THE COMPUTER IN THE TEACHING OF STATISTICS. Tarmo Pukkila and Simo Puntanen University of Tampere, Finland



Similar documents
CORRELATION ANALYSIS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

Government of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

Statistics Graduate Courses

Statistics 104: Section 6!

9. Sampling Distributions

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

CHANCE ENCOUNTERS. Making Sense of Hypothesis Tests. Howard Fincher. Learning Development Tutor. Upgrade Study Advice Service

STAT 360 Probability and Statistics. Fall 2012

ECON 523 Applied Econometrics I /Masters Level American University, Spring Description of the course

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

1 Short Introduction to Time Series

STATISTICAL DATA ANALYSIS COURSE VIA THE MATLAB WEB SERVER

Teaching model: C1 a. General background: 50% b. Theory-into-practice/developmental 50% knowledge-building: c. Guided academic activities:

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

RUSRR048 COURSE CATALOG DETAIL REPORT Page 1 of 6 11/11/ :33:48. QMS 102 Course ID

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL

Example: Boats and Manatees

Diablo Valley College Catalog

TEACHING SCIENCES IN THE FINNISH COMPULSORY SCHOOL (PHYSICS, CHEMISTRY, BIOLOGY AND GEOGRAPHY) Brief description of the Finnish Educational system

Teaching Multivariate Analysis to Business-Major Students

Aims and structure of Japan Statistical Society Certificate

RELEVANT TO ACCA QUALIFICATION PAPER P3. Studying Paper P3? Performance objectives 7, 8 and 9 are relevant to this exam

Psychology. Academic Requirements. Academic Requirements. Career Opportunities. Minor. Major. Mount Mercy University 1

Module 5: Statistical Analysis

Bachelor's Degree in Business Administration and Master's Degree course description

Simulating Chi-Square Test Using Excel

Multivariate Normal Distribution

Truman College-Mathematics Department Math 125-CD: Introductory Statistics Course Syllabus Fall 2012

Module 3: Correlation and Covariance

From the help desk: Bootstrapped standard errors

Applied Statistics and Data Analysis

I ~ 14J... <r ku...6l &J&!J--=-O--

MATHEMATICAL METHODS OF STATISTICS

2. Linear regression with multiple regressors

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Overview Accounting for Business (MCD1010) Introductory Mathematics for Business (MCD1550) Introductory Economics (MCD1690)...

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

The impact of Discussion Classes on ODL Learners in Basic Statistics Ms S Muchengetwa and Mr R Ssekuma muches@unisa.ac.za, ssekur@unisa.ac.

COLLEGE ALGEBRA IN CONTEXT: Redefining the College Algebra Experience

Chapter 7 Factor Analysis SPSS

E3: PROBABILITY AND STATISTICS lecture notes

International College of Economics and Finance Syllabus Probability Theory and Introductory Statistics

IT-Support of Accounting Education

The primary goal of this thesis was to understand how the spatial dependence of

Understanding Confidence Intervals and Hypothesis Testing Using Excel Data Table Simulation

Prerequisite: High School Chemistry.

Simple linear regression

UNIT 1: COLLECTING DATA

Mathematical Knowledge level of Primary Education Department students

Premaster Statistics Tutorial 4 Full solutions

The Basics of System Dynamics: Discrete vs. Continuous Modelling of Time 1

Supplementary PROCESS Documentation

CALCULATIONS & STATISTICS

Factorial Invariance in Student Ratings of Instruction

How to Write a Successful PhD Dissertation Proposal

Chapter 21: The Discounted Utility Model

How to Write a Research Proposal

Network quality control

A teaching experience through the development of hypertexts and object oriented software.

Calculating the Probability of Returning a Loan with Binary Probability Models

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

APPLICATION FOR PART-TIME EMPLOYMENT AS A TUTOR TUTOR IN THE DOLCIANI MATHEMATICS LEARNING CENTER

Regression Analysis: A Complete Example

LOGNORMAL MODEL FOR STOCK PRICES

Graduate Certificate in Systems Engineering

Bachelor and Master of Science degrees in Mathematics and Statistics at University of Helsinki

Teaching the, Gaussian Distribution with Computers in Senior High School

Structure of Presentation. Stages in Teaching Formal Methods. Motivation (1) Motivation (2) The Scope of Formal Methods (1)

e-learning in College Mathematics an Online Course in Algebra with Automatic Knowledge Assessment

Chapter 3 Local Marketing in Practice

Section Format Day Begin End Building Rm# Instructor. 001 Lecture Tue 6:45 PM 8:40 PM Silver 401 Ballerini

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, Abstract. Review session.

Fairfield Public Schools

Issues in Information Systems Volume 15, Issue II, pp , 2014

Reform of the quality of higher education in Norway implications for the library teaching

TEACHING STATISTICS THROUGH DATA ANALYSIS

Report to the Academic Senate General Education Area B1 - Pilot Assessment Plan

Introducing the Multilevel Model for Change

Confidence Intervals for Cp

How To Use An Ss Effectively

TRINITY COLLEGE. Faculty of Engineering, Mathematics and Science. School of Computer Science & Statistics

Curriculum Guidelines for Bachelor of Arts Degrees in Statistical Science

Analysis of Bayesian Dynamic Linear Models

7. Normal Distributions

Notes on Determinant

Mgmt 469. Model Specification: Choosing the Right Variables for the Right Hand Side

Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias

School of Business TRINITY COLLEGE DUBLIN. Masters in Finance

Hypothesis testing - Steps

Topic #6: Hypothesis. Usage

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

From the help desk: Swamy s random-coefficients model

Transcription:

THE ROLE OF THE COMPUTER IN THE TEACHING OF STATISTICS Tarmo Pukkila and Simo Puntanen University of Tampere, Finland 1. Introduction In this paper we consider the nature of a statistics course and discuss the role of the computer in it. In particular, we discuss the the course for the first year-students at the university level. In this context, for example the following questions are essential: How much effort should be put e.g. into handling the mathematical properties of central concepts like mean, variance and correlation? What is the real difference between a mathematics and a statistics course, or a course in computer science? In fact, is a subject called statistics needed any more? Can mathematicians handle formulae and computer scientists computers more efficiently than statisticians? However, there is surely no justification in calling a course a real statistics cou.rse, when all we know of it is that the formulae or means etc. are carefully handled. in our paper we handle the concept of a real statistics course and discuss the essential role of the computer in it. The course which we are teaching is given in the Faculty of Economics and Administration at the University of Tampere. The computer has been used on a large scale in teaching basic statistics in the University of Tampere since the year 1975. Computer usage was begun applying the system SURVOI71 in data-analytical problems. Pukkila (1978) and Pukkila and Puntanen (1980) are papers describing the commencement of computer use in the University. After promising experiences of its use in data-analytical practical statistical problems, use in teaching was extended in 1978 to cover also theoretical concepts (sampling distributions, testing of hypotheses, regression models, analysis of variance models, time series models, generation of data from these etc.). For the illustration of such theoretical statistical concepts the KONSTA system (Pukkila, Puntanen and Stenman 1984) was developed at the Department of Mathematical Sciences. 2. Aims of the Basic Course When coming to the course most students have not yet had time to study especially deeply their main subjects, where later they have possibly to apply statistical methods. It might be argued in this kind of situation that the basic course in statistics comes too early in the students' program. J The basic course is obligatory for all students in the Faculty. This is in many cases the only reason why a student comes to the course, so that the attitudes of students towards the course are also not the best possible. The goal in teaching is that after the basic course students should be able to cope with statistical problems eventually arising in their major subjects. Among other things this means that they should be able independently to carry out a brief empirical statistical research. In order to reach this goal,

the students should be able to quire data, to handle them, interpret results and finally to write a report. Hopefully the course also arouses real interest for later statistical courses, especially among those studying an applied subject. On the other hand, the basic course has also to form a sound basis upon which to build later studies in statistics on the curriculum. 3. Natrire of Studies on the Basic Course Basic course is ihtended to be as practical as possible. This does not mean that theoretical matters are not handled on the course. Nevertheless practical problems have a central role. For the students the handling of practical problems begins with the finding and formulating of statistical problems. It continues to obtaining data and analyzing them and ends with the writing of a report. Let us consider some particular points in the teaching of statistics for first-year students. How much time should be given e.g. to the handling of the mathematical properties of basic concepts like mean, variance and correlation? Is such a course where these topics are treated in detail a real statistics course? What is the difference between a mathematics and a statistics course, or a course in computer science? In fact, is a subject called statistics needed any more? Can mathematicians handle formulae and computer scientists computers more efficiently than statisticians? In trying to answer the above questions it must be remembered that the course we are talking about is a first-year course. This is important because the first statistics course has a somewhat different role from later courses. Attitudes to statistics are to a great extent created on the first course, and it can be said that in statistical research attitudes mean a great deal: If they are not "correctn, then not even the correct formulae will help. Mathematical courses have longer traditions than those in statistics and all students have some idea of what mathematics is about, while statistics is a new subject to them. There is surely no justification in calling a course a real statistics course, when all we' know of it is that formulae of means etc. are handled carefully. Similarly a course which shows in detail how the computer can be used in statistical calculations is not necessarily a real statistics course. These conclusions can be drawn e.g. from one important aim of the first statistics course. After a course u student should be able independently to carry out empirical research. But knowing all formulae and computer commands does not necessarily imply that a student is able to write a report on his findings. These capabilities, however, are fundamental to the statistician's role. The writing of an accurate, understandable report is an essential part of the statistical research process. It happens every now and then that a student comes and says: "Here are the statistics I calculated. What next? Or do you sign my study book now?" Here the main point, statistically seen, is l1whut next?" It is not enough for a statistician (not even for a first-year statistician) to

know all the correct formulae and their effects or to use the computer correctly. A real statistics student should be able to combine different things in a sensible way; he should know the correct formulae, be able to calculate statistics and he should know what to do with the statistics he has calculated. And above all, he should be able to create statistical problems, i.e., he should have a statistical imagination. In simple terms: a mathematics student has to prove a theorem, a computer science student has to write a program which works in the desired way, a statistics student has to solve a real statistical problem, a data matrix being connected with it. These problems have rather different natures particularly a.s regards their levels of uniqueness. It is typical in an empirical statistical research that there are no unique problems with unique solutions. This is a matter that the first-year students should be faced with. The computer is a necessary slave in instilling into students an actual feeling for empirical statistical research. This slave allows the statistical imagination to work freely without tedious calculations being a limiting factor. But not all decisions can he made by this fast slave, because its common sense is more limited than that of a well-practised statistician. When speaking of the teaching of statistics it is useful to remember that statistics should help for example in making certain decisions. This means that statistical problems must be derived from real life. Real life research problems raise questions, and it is possible that statistics may help to answer these questions. Between questions and answers certain statistical methods are inevitably applied. This means that teaching statistics should not be dealing only with techniques for calculating means, variances, correlation coefficients etc. The technique should be combined with an empirical problem in the background. This is not easy especially on big courses, but a basic course in statistics should nevertheless contain a leitmotif. This backbone would also hopefully motivate students to study statistics. Besides the empirical problems the basic course also involves many matters of a theoretical nature. Common to these is that for many students they are rather difficult to understand. These theoretical matters are connected in one way or another with the concept of sampling distribution. When teaching theoretical statistical concepts connected with sampling distributions we have at one stage repeatedly to use phrases like: if we took from a certain population 100 simple random samples of a given size n and calculated the 95% confidence interval according to a certain formula for the population means p from each sample, then about 95 of these intervals would include the right value of p." If the teacher has only chalk and blackboard available, then in the practical teaching situation only phrases like the above are possible. But nowadays other kinds of equipment are also available. Why not use them? 4. The Organization of Computer Usaqe in the Teaching of Statistics In the University of Tampere the use of the computer was introduced on the basic courses in data-analytical problems in 1975. It was seen to be an

inconsistency while later statistical research obliged students almost invariably to use the computer, the basic course took no account of this. Our opinion was and is that in fact a course in statistics is the natural place to learn to use the computer in statistical problems. An important impulse to the introduction of computer use on the basic course was the availability of the statistical data processing system SURVOI71. This was developed in the Computer Centre of the University. In 1981 SURV0171 was implemented on a DEC 2060, which replaced the old Honeywelt system. The new system has about 100 lines for time-sharing terminals. The main feature of SURVOI71 is that it is extremely simple to learn and use, and that the students can use it interactively. SURVOI71 is used in studying empirical statistical problems on the basic course in statistics in the University of Tampere in such a way that every student solves some of the weekly exercises with the aid of the computer and using the system. These exercises are connected with a real data matrix from a practical research problem. The data, on the other hand, have been collected and stored in the memory of the computer during those years the computer has been used extensively in teaching. Students have also to collect their own data sets. It is quite essential that the data sets to be studied should be as interesting as possible. On the other hand, the data can well be messy, thus giving a taste of real data analysis. In using the computer in certain weekly exercises the aim is that the real situation be imitated as closely as possible. For this reason these exercises are defined rather loosely, especially in later stages of the course. We know that in the real world the problems are not at all clear and for this reason it is a good policy to make students acquainted with problems which also on definition look like real problems. It seems that, especially in the beginning, many students feel confusion when they are given a loosely defined problem which demands a greater independence than for example the precisely defined and clear problems of mathematics they have been accustomed to handle in school. On computers and SURVOI71 the students follow two lectures each of two hours. After each of these two lectures every student has a supervised terminal exercise of one hour. For terminal and also other exercises the students are divided into about ten groups. After these lectures and terminal exercises, which take place at the very beginning of the course, they use the terminals in the University independently in their exercises. Note that there must be two kinds of computer exercises: First those with only a few observations so that results calculated manually can he compared with the output of the computer, and secondly those concerning extensive real-life data. At this point it must be said that the practical problems described above form only a part of the exercises done by students. They must also solve more traditional exercises weekly. Among these exercises there are also problems that teach students to make calculations by hand.

Then some words about KONSTA. As was already mentioned in the introductory section, KONSTA is an interactive computer system used mainly to illustrate basic concepts of statistics. The system is a collection of separate programs which are closely interconnected. The programs of KONSTA can be divided into four parts as follows: Generation of samples from desired populations and calculation of relevant statistics. 0 Generation of observations from certain models such as standard linear model, time series, random walks, etc. KONSTA offers flexible possibilities to study the behaviour of various statistics when adding or removing observations. For example. one can generate pairs (x,y) such that x and y are random digits and calculate the regression and correlation coefficients after adding each observation. For handling data matrices there are particular programs planned to be especially useful in dynamic interactive data analysis when one wants to get quickly a rough idea of data and any slights changes in them. 5. Concluding Remarks Now it may be asked what the computer has brought to teaching. Briefly, the answer is: It has essentially helped in making our course a real statistics course. References Pukkila, T. (1978). The utilization of the computer in the teaching of statistics at elementary level. In COMPSTAT 1978, Proceedings in Computational Statistics (pp. 422-428). Wien: Physica-Verlag. Pukkila, T., & Puntanen, S. (1980). Computer usage on first statistics courses in the University of Tampere. In COMPSTAT 1980, Proceedings in Computational Statistics (pp. 145-1 51 ). Wien : Physica-Verlag. Pukkila, T., Puntamen, S., & Stenman, 0. (1984). The use of KONSTA in the teaching of statistics. Department of Mathematical Sciences, University of Tampere Finland, Report A 125, 118 pp.