MILCA - DISTANCE EDUCATION IN COMPUTATIONAL LINGUISTICS




Aljoscha Burchardt, Stephan Walter, Manfred Pinkal
Department of Computational Linguistics and Phonetics, Saarland University

I. Introduction

This paper discusses experiences from supplementing the Computational Linguistics (CL) curriculum at five German universities with distance education in the form of a network of online courses. The courses were developed and taught within the Germany-wide joint project MiLCA (Medienintensive Lehrmodule in der Computerlinguistik-Ausbildung, media-intensive teaching modules for education in CL, [2]), which was funded by the German Federal Ministry of Education and Research between 2001 and 2003. We present the contributions of the CL group at Saarbrücken university within this project. To provide the necessary context, we first sketch the goals of the project in relation to the general situation of CL education in Germany.

II. MiLCA Joint Project

Distance education for Computational Linguistics

CL is an interdisciplinary subject examining the production and processing of human language. Research is conducted with the aim of implementing software systems that model these capacities. Applications range from intelligent spellcheckers to flexible natural language dialogue systems. Several aspects of the subject make it an ideal candidate for the introduction of distance teaching methods, and in particular for e-learning.

As a subject at universities, CL is relatively young. While there is still only a small number of programs and departments dedicated solely to computational linguistics, the subject is often offered as a minor at various departments, ranging from computer science via cognitive science to linguistics. On the one hand, this reflects a diversity of topics in the field that makes it attractive for many students and researchers.
But on the other hand, it presents a major problem for establishing competitive standards of education at a larger number of locations: expertise is spread out, and institutes often have excellent experts teaching in some of the relevant areas but do not have the capacity to offer comprehensive curricula. In this situation, distance education is a promising way to co-ordinate resources.

Moreover, attitudes towards the use of new media are generally positive amongst students and teaching staff of CL, given the nature of the subject. Computer interaction and computer-based system demos play a great role in many fields of teaching. Techniques such as online co-operation and online information search are of everyday importance to a computational linguist, as tools and objects of study alike. Barriers to the introduction of media-based teaching modules are therefore probably lower than in many other areas. Finally, media competence is an important goal of training in its own right, and the educational use of new media helps to reach this goal through practical experience.

The use of distance education and new media offers more than a way for universities to benefit from each other's expertise and specialisation within existing degree programs. Within the Bologna process, Germany's major CL locations are now introducing Master's and Bachelor's degrees and are re-designing their curricula accordingly. This reform generally aims at more transparency and comparability between curricula. It comes with an increased orientation towards an international audience. Distance education can contribute here by providing a network of virtual courses that define a globally available reference for core fields of the subject. Online course material may also give students with different backgrounds the opportunity to catch up before joining a Master's program. This may take the form of self-study or even organised preparatory online lectures. Students can attend such courses from wherever they are, without having to move weeks before the actual Master's program starts.

Research and education in CL are characterised by a large degree of internationality. Major European locations are joined in networks and co-operate intensively in research projects as well as in teaching. The international post-graduate college Language Technology and Cognitive Systems, a co-operation between the universities of Saarbrücken and Edinburgh, as well as the newly introduced international Master's programs, are examples of institutionalised forms of international orientation in academic CL education. Distance education can foster such co-operations in various ways. First, online courses make it possible for students to still attend courses at their home university while they are visiting a partner site, or simply help them keep in touch with students and teachers at home. Second, distance education can contribute to the intensification of co-operations even when financial resources limit travel possibilities. In this way, co-operations can possibly also be extended to sites that otherwise could not participate at all. Perspectives in this direction were explored at a workshop organised by the MiLCA project in November 2003 that focused on the use of e-learning for CL in eastern European countries.

Project Goals

The MiLCA project was funded by the German Federal Ministry of Education and Research between 2001 and 2003 as part of the larger framework Neue Medien in der Bildung (new media in education), comprising nearly 600 academic projects aiming at the production of high-quality learning material with multimedia enhancements.
MiLCA tried to answer the needs and seize the chances sketched above by joining forces between five major German academic centres and further international associated partners. The participants in the project developed nine online course modules. In developing these modules, particular consideration was given to pedagogical and technical standards as well as to the issue of sustainability. While the network of course modules developed in the project covers most of the areas relevant for the subject, it is kept open to integrate contributions by new partners from CL and neighbouring disciplines. By bundling expertise in web, language, and text technology, as well as previous experiences with online teaching, the MiLCA consortium was able to develop most of the particular solutions relevant for its purposes within the project itself, in accordance with evolving standards. Moreover, international contacts of the participating institutes made it possible to integrate further experts as associated partners for technical as well as content development.

Project Members

The MiLCA consortium consists of the universities of Bonn, Gießen, Osnabrück, Saarbrücken, and Tübingen. Further contributions were made (amongst others) by researchers from LORIA, Nancy (France), Carnegie Mellon University, Pittsburgh, and Ohio State University, Columbus (USA), the University of Edinburgh (UK), and the University of Toronto (Canada).

Pedagogical standards were imposed through didactical consulting and through formative as well as summative didactic evaluation, conducted by project members from the psychological institute of Tübingen university. They provided support ranging from general guidelines, e.g. on the selection of didactic scenarios for the preparation of courses, to specific practical advice, e.g. on the selection of suitable media types for the effective communication of different kinds of content.
Evaluation was based on questionnaires assessing changes in subject-specific skills, general media competence, and attitude towards the use of new media. Further sources that were considered included protocol data and the performance of the students. Evaluation results were encouraging throughout, both as regards the effectiveness of online teaching for the contents considered and as regards the development of the students' attitudes towards new media.

Framework

The WWW was chosen as the main medium for the distribution of course material developed in the MiLCA project. Consequently, the development of multimedia enhancements was generally oriented towards deployment over the web, and the amount of non-standard software needed in addition to web browsers and freely available plug-ins was to be kept low. The restrictions imposed on the authors by this decision were outweighed by the benefits of easy availability and high compatibility of the resulting course material.

The formal nature of some areas of computational linguistics posed a particular challenge in this respect: techniques for the encoding and presentation of mathematical content over the web were still very much under development at the beginning of the project. Partners in the project agreed to adopt the emerging XML standard MathML ([4]). By now, MathML-encoded content can be displayed directly by the latest generations of most common web browsers.

The project aimed at bundling presentation, communication, and content development in one environment. While this aim could only partially be achieved during the project itself, the decisions that were made lay the ground for further steps in this direction. The consortium decided to use the teaching platform ILIAS (http://www.ilias.uni-koeln.de/ios/index-e.html). ILIAS is a widely accepted, rapidly developing open source solution. From the outset, ILIAS allowed the presentation of HTML-based course material in a standardised environment, connected with an e-mail and newsgroup system. Additionally, it provided navigation facilities and various supportive tools for the learner, as well as user and content management facilities for teaching and administration. Its major weakness lay in its inadequate support for authoring.
Course material in MiLCA was therefore created using external tools and was encoded in a generic XML format from which HTML was generated for presentation in ILIAS. New versions of ILIAS will allow direct import of MiLCA course material.

Particular consideration was given to the question of sustainability of the project results. Efforts were made on three levels.

All course modules were coded in XML, relative to a document type definition (DTD) developed for this purpose within the sub-project at Gießen University ([3]). The defined format is based on the concept of Learning Objects, each standing for a content unit at some level. Learning objects may be nested like chapters and sections in a book, but they may also be related to each other in various other ways. This gives the encoding scheme enough flexibility to reflect the specifics of hypertext and multimedia content. Additionally, the MiLCA DTD incorporates extensive didactic as well as technical metadata according to the international LOM standard. Content can be selected, re-grouped, and adapted on the grounds of this additional information. The XML format developed in MiLCA will become part of the ILIAS platform as an input specification.

Course modules developed in the project were taught and open to students at all participating sites during the project. Experiences from these test runs will serve as guidelines for establishing them as regular supplements within the curricula at the respective universities and as preparatory courses for Master's programs.

Moreover, steps were taken to disseminate the project results outside the MiLCA consortium through publications, presentations at conferences and fairs, summer schools, contacts to other online learning initiatives, and a dedicated workshop with a focus on online learning in the acceding countries of central and eastern Europe.
Future maintenance, distribution, and adaptation of course material, as well as upcoming administrative issues, will be taken care of by a collecting society founded by three members of the project consortium. The society has agreed upon an open content distribution model for the course material itself. Additional services (e.g. teaching and tutoring) will be offered on a case-by-case contract basis. Up to now, the users of the MiLCA material have been university students, but the training of employees in language technology sections of companies is envisioned as well. Interested parties are welcome to contact the collecting society.
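The idea of nested Learning Objects whose metadata drives content selection can be illustrated with a minimal Python sketch. The element and attribute names used here (`lo`, `meta`, `difficulty`, `title`) are hypothetical and are not taken from the MiLCA DTD; the `difficulty` attribute is only loosely modelled on the kind of educational metadata that LOM provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names are illustrative, not the MiLCA DTD.
SAMPLE = """
<lo id="semantics" title="Computational Semantics">
  <meta difficulty="basic"/>
  <lo id="lambda" title="Lambda Calculus">
    <meta difficulty="basic"/>
    <lo id="beta" title="Beta Reduction in Depth">
      <meta difficulty="advanced"/>
    </lo>
  </lo>
  <lo id="inference" title="Inference">
    <meta difficulty="advanced"/>
  </lo>
</lo>
"""


def select(lo, allowed):
    """Return a pruned copy of a learning object tree, or None if excluded."""
    meta = lo.find("meta")
    level = meta.get("difficulty") if meta is not None else "basic"
    if level not in allowed:
        return None
    copy = ET.Element("lo", dict(lo.attrib))
    for child in lo.findall("lo"):
        kept = select(child, allowed)
        if kept is not None:
            copy.append(kept)
    return copy


def outline(lo, depth=0):
    """List the titles of a learning object tree, indented by nesting depth."""
    lines = ["  " * depth + lo.get("title")]
    for child in lo.findall("lo"):
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE.strip())
basic_track = select(root, {"basic"})
print("\n".join(outline(basic_track)))
```

Selecting only the `basic` level yields a shorter course outline, while passing both levels reproduces the full structure. A real selection step would presumably consult many more metadata fields than this single attribute.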

III. MiLCA - Saarbrücken

The Saarbrücken CL group is the largest one in Germany, with large resources in teaching as well as research. The use of new media has long been an integral part of the teaching activities of the Saarbrücken CL group. Online course material was therefore not so much developed to fill gaps in the course schedule or to introduce completely new methods of teaching. Rather, the focus was on exploring the subject-specific opportunities and requirements of media usage for effective content presentation.

Media Usage

The interdisciplinary character of CL places special demands on the structural conception of teaching material. Most topics have a dual character: on the one hand, there are linguistic phenomena with different theoretical explanations. On the other hand, there are machine-executable algorithms that implement the treatment of the respective phenomena following these explanations. The didactic demands in each of the two cases often differ as much as the applicable methodology.

The use of media in our course material differs accordingly between these two types of content. The more theoretically oriented parts (apart from mathematical content) use standard hypertext and multimedia elements such as images, Flash animations, linked glossaries, references, etc. The XML markup, in particular the metadata, allows the specification of a fine-grained document structure which delimits material of different difficulty levels. For example, this makes it possible to mark sections which leave the basic line of argumentation. Here, hypertext allows for the integration of more in-depth discussion of particular issues for interested readers.

The second type of content is found in the practical sections, where computational implementations (i.e. fragments of program code) are presented.
A coding scheme for this purpose had to fulfil the following three desiderata. First, quotation of code from program files via symbolic references (instead of manual insertion of code snippets) had to be possible. Among other things, this technique keeps downloadable code and code that appears in the text consistent throughout a course. Second, program code had to be in XML markup. Apart from securing consistency with the project-wide standards, this makes it possible to use, e.g., XSLT transformations to generate plain code for downloading or HTML-formatted code for display. Third, code snippets integrated into the text had to be executable, e.g. in the form of hyperlinks. This allows the reader to inspect the behaviour of a program right where it is discussed. In particular, it can be used to give a direct impression of issues such as run-time or complexity of output.

No ready-made solution was available that could meet all of these requirements. A proposed XML format was adapted and combined with standard techniques such as CGI scripting to achieve the described behaviour.

Technical Solutions

The Saarbrücken MiLCA project has added three courses to the MiLCA pool: Computational Semantics, Algorithms for CL, and Natural Language Dialogue Systems. Details on the courses can be found at http://www.coli.uni-sb.de/cl/projects/milca/. It is not possible to present the contents of the course modules within the scope of this paper. Instead, an example from the course on Computational Semantics will be used to illustrate the interconnection between subject-specific didactic requirements and media usage.

One of the topics in Computational Semantics is the development of computer-based methods for natural language interpretation. Such computer programs analyse sentences in a first step by representing them as logical formulas. These formulas contain, so to speak, the sentence meanings in a computer-readable format.

As noted earlier, mathematical content (such as logical formulas) in MiLCA is coded in the MathML standard. The latest generation of the most common web browsers can display MathML with convincing results. Yet editing MathML directly remains cumbersome. In the Saarbrücken subproject, a chain of freely available tools was used to convert LaTeX, the widespread standard format for scientific publications, into MiLCA-XML. While the conversion could be done only semi-automatically for the general case, the generation of MathML from LaTeX was fully automated, so that mathematical content could be authored using LaTeX throughout. Documents were edited in an intermediate format in which the mathematical formulae were still in LaTeX within special tags.

Below you see an example formula which (in a common approach) describes the meaning of the word every:

[GIF image of the formula λP.λQ.∀x.(P@x → Q@x)]

This GIF was automatically generated from the following intermediate (LaTeX-based) format:

    <math>\lambda P.\lambda Q.\forall x.(P@x\to Q@x)</math>

More importantly, the following MathML has also been generated automatically from this format:

    <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
      <mml:mi>λ</mml:mi>
      <mml:mi>P</mml:mi>
      <mml:mo>.</mml:mo>
      <mml:mi>λ</mml:mi>
      <mml:mi>Q</mml:mi>
      [...]
    </mml:math>

In cases where a browser cannot display MathML, and for print versions, the GIF of the formula presented above can serve for display.
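The conversion step from the intermediate format to MathML can be illustrated with a toy Python sketch. It is not the tool chain used in the project, which the paper does not specify in detail. It handles only the handful of LaTeX commands and symbols that occur in the example formula, and the function names `latex_to_mathml` and `convert_document` are hypothetical.

```python
import re
from xml.sax.saxutils import escape

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Toy symbol table: LaTeX command -> (MathML element name, character).
COMMANDS = {
    r"\lambda": ("mi", "\u03bb"),
    r"\forall": ("mo", "\u2200"),
    r"\to": ("mo", "\u2192"),
}

# A command, a single letter, one of a few operator characters, or whitespace.
TOKEN = re.compile(r"\\[A-Za-z]+|[A-Za-z]|[.()@]|\s+")


def latex_to_mathml(src):
    """Translate a tiny LaTeX subset into a flat sequence of MathML tokens."""
    parts = []
    pos = 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if match is None:
            raise ValueError(f"unsupported input at position {pos}")
        tok = match.group(0)
        pos = match.end()
        if tok.isspace():
            continue
        if tok.startswith("\\"):
            if tok not in COMMANDS:
                raise ValueError(f"unsupported command {tok}")
            tag, text = COMMANDS[tok]
        elif tok.isalpha():
            tag, text = "mi", tok  # identifiers
        else:
            tag, text = "mo", tok  # operators and brackets
        parts.append(f"<mml:{tag}>{escape(text)}</mml:{tag}>")
    return f'<mml:math xmlns:mml="{MATHML_NS}">' + "".join(parts) + "</mml:math>"


def convert_document(text):
    """Replace every <math>LaTeX</math> span of the intermediate format."""
    return re.sub(
        r"<math>(.*?)</math>",
        lambda m: latex_to_mathml(m.group(1)),
        text,
        flags=re.S,
    )


print(latex_to_mathml(r"\lambda P.\lambda Q.\forall x.(P@x\to Q@x)"))
```

The sketch emits a flat token sequence without `mrow` grouping; a production converter would also have to deal with nesting, sub- and superscripts, and the many LaTeX constructs ignored here.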
The markup of programme code was realised in close co-operation with an industrial partner who adapted the proposed CodeML standard ([1]) to our needs. This made it possible to build a tool chain supporting the insertion of programme code into the textual material. In a first step, a code fragment like the following so-called clause of the programming language Prolog (which formulates the translation of the word every into the formula presented above):

    det(lambda(P,lambda(Q,forall(X,(P@X)>(Q@X)))))--> [every].

is automatically transformed to the following flat CodeML markup:

    <cpg class="predicate" id="firstlambda:det">
    <cpg class="clause" id="firstlambda:det:1">
    [...] det(lambda(P,lambda(Q,forall(X,(P@X)>(Q@X)))))--> [every]. [...]
    </cpg>

This code fragment can then be mentioned within the text by way of symbolic reference. The following reference cites this clause with the help of a self-explanatory unique name which is made up of the respective file name, the name of the so-called predicate, and the clause number:

    <coderef to="firstlambda:det:1"></coderef>
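The symbolic reference mechanism can be sketched in a few lines of Python. The project used an XSLT script for this step; the sketch below substitutes Python's `xml.etree.ElementTree` purely for illustration. The surrounding `course` and `p` elements, the choice of `pre` as display element, and the helper names are assumptions, and the sample is a complete, simplified stand-in for the elided CodeML fragment above.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical document: only <cpg> and <coderef> come from the
# paper. Prolog variables are written in upper case, as Prolog requires.
DOC = """
<course>
  <cpg class="predicate" id="firstlambda:det">
    <cpg class="clause" id="firstlambda:det:1">
      det(lambda(P,lambda(Q,forall(X,(P@X)&gt;(Q@X)))))--&gt; [every].
    </cpg>
  </cpg>
  <p>The determiner is translated by <coderef to="firstlambda:det:1"></coderef></p>
</course>
"""


def index_fragments(root):
    """Map every id found on a <cpg> element to that element."""
    return {el.get("id"): el for el in root.iter("cpg") if el.get("id")}


def plain_code(fragment):
    """Concatenate all text below a fragment, dropping the markup."""
    return "".join(fragment.itertext()).strip()


def resolve_coderefs(root):
    """Replace each <coderef to="..."> by a <pre> holding the quoted code."""
    fragments = index_fragments(root)
    for ref in list(root.iter("coderef")):
        target = ref.get("to")
        if target not in fragments:
            raise KeyError(f"dangling code reference: {target}")
        ref.tag = "pre"  # display element chosen for this sketch
        ref.attrib.clear()
        ref.text = plain_code(fragments[target])
    return root


root = resolve_coderefs(ET.fromstring(DOC.strip()))
print(root.find(".//pre").text)
```

Because the quoted text is looked up by identifier at transformation time, the code shown in the running text cannot drift away from the code in the program files. The same `plain_code` helper, applied to a whole predicate or file element, would yield the downloadable plain-text version.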

The automatic insertion and formatting of the respective code fragments for display is achieved by an XSLT script. This technique makes it possible to insert all program files used as XML elements within the main document and to refer to parts of the files as explained above. With the same technique, links can be inserted in the text that extract downloadable plain code files again.

IV. Conclusion

In this paper we have outlined the MiLCA project, a Germany-wide joint project on distance learning in the field of Computational Linguistics funded by the German Federal Ministry of Education and Research. The immediate result of the project is the pool of course material itself. The courses cover all main areas of CL, theoretical as well as applied, and a number of special topics. All material is available in a uniform XML markup including metadata, which can be imported into the open-source learning platform ILIAS. A collecting society has been founded to take care of the future administration and distribution of the course material.

As a second result of the project, a number of XML standards which existed only as proposals at the beginning of the project have been further developed, tested, and documented by project partners and within a number of co-operations with external experts.

The experiences we made in the Saarbrücken subproject were encouraging. The motivation and the results of the students were generally high. Apart from our positive experience with e-learning for CL, the empirical know-how we acquired during the authoring process has revealed a number of promising future research topics in the field of CL for e-learning: it has turned out that even relatively unsophisticated methods of CL can support the process of authoring course material enormously. For example, the rules of thumb we used in the generation of candidate abstracts for metadata markup have proven extremely helpful.
An interesting topic to be pursued further is how CL can further facilitate such kinds of authoring, editing, and displaying processes. Techniques that will become relevant here are automatic linking of documents and machine-aided summarization. Elaborate multilingual support and dynamic user adaptation are other possible further directions.

References:

1. Michael Kohlhase. CodeML: An Open Markup Format for the Content and Presentation of Program Code, 08 March 2004, http://www.mathweb.org/omdoc/projects/codeml.ps
2. Lothar Lemnitzer and Bernhard Schröder (eds.) (2003). Computerlinguistik -- Neue Wege in der Lehre. Themenheft von Sprache und Datenverarbeitung. IKP, Bonn
3. Henning Lobin and Maik Stührenberg (2003). XML-strukturierte Learning Objects. In: Lea Cyrus et al. (eds.), Sprache: Zwischen Theorie und Technologie
4. W3C Math Working Group (2003). Mathematical Markup Language (MathML) 2.0, 21 October 2003, http://www.w3.org/tr/mathml2/

Authors:

Prof. Dr. Manfred Pinkal, Aljoscha Burchardt, Stephan Walter
Department of Computational Linguistics and Phonetics
Universität des Saarlandes, Fachrichtung 4.7 Allgemeine Linguistik
Postfach 15 11 50
66041 Saarbrücken, Germany
{pinkal,albu,stwa}@coli.uni-sb.de