Teaching Operating Systems Administration with User Mode Linux

Renzo Davoli
Department of Computer Science
University of Bologna - Italy
renzo@cs.unibo.it

ABSTRACT
User Mode Linux is a virtual machine running on a GNU-Linux operating system. It is the right choice for teaching operating systems administration, as it does not need any dedicated hardware. It runs at user level (so it needs no root, i.e. administrator, access and introduces no related security threats) and it does not have the performance problems of an emulator. This paper describes how to set up a laboratory for teaching operating systems administration.

Categories and Subject Descriptors
D.4.0 [Operating Systems]: General; D.4.6 [Operating Systems]: Security and Protection - Access controls; D.4.3 [Operating Systems]: File Systems Management; D.4.4 [Operating Systems]: Communications Management - Network communication

General Terms
Experimentation, Performance, Security

Keywords
Teaching, Operating Systems, Administration, Virtual Machine, Security, Laboratory

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ITiCSE '04, June 28-30, 2004, Leeds, United Kingdom. Copyright 2004 ACM 1-58113-836-9/04/0006...$5.00.

1. INTRODUCTION
Operating Systems is a very wide area in Computer Science, and there is no doubt that a good knowledge of O.S. is needed for any professional role in Information Technology and Computer Science. This paper focuses on the System Administration sector and, in particular, on lab activities concerning administrative tools and procedures. There are proposals from many Universities to create specific courses in this field [20].

There are many examples in the literature of how to set up exercises and tools for many chapters in an Operating Systems syllabus. For instance, lab environment procedures and software have been set up for teaching kernel development using hardware emulators [7, 2, 6, 5, 13, 11, 15, 10, 19], also using networked environments [3, 4]. There is also material on teaching kernel development on bare hardware, using Minix [18] or the whole Linux kernel. Laboratory activity related to the use of system calls and libraries, or to scripting, does not require any special infrastructure beyond standard programming environments.

On the other hand, far fewer proposals have been made for teaching O.S. administration. A laboratory environment to teach system administration can be based on:

Hardware. This is an expensive choice in terms of technical staff activity, because of the need for a single machine per student, or group of students (or at least removable disk units) [1]. It is not possible for students to work at home, unless they have or buy a suitable hardware unit, or work remotely on the laboratory machines. Network access should be provided for driver downloads or searches for HOWTOs but, at the same time, students must have access to the bare hardware in order to install the kernel, so anonymous access and security threats are possible. If a single computer per student cannot be guaranteed, there is no way to test the configuration of services realistically, because computers must be rebooted for other students to do their exercises.

Hardware emulators [14, 16, 17, 22]. The drawback of this choice is performance. It is possible, with Bochs and Qemu, to run operating systems not designed for the host machine CPU architecture, but the emulation can be computationally heavy.
These emulators normally run at user level, but some of them (Plex86 and VMware) use kernel modules to increase performance, thus reducing the security level of the system.

Virtual Machines. This is an intermediate approach. It is efficient and safe: the interface between the virtual machine and the host operating system is at the system call layer, and the entire virtual system can run in user mode with standard user permissions.

2. DISCUSSION
User Mode Linux [21, 8] (U-ML; see footnote 1) is a virtual machine running on a GNU-Linux operating system. It has been implemented as a patch for the Linux kernel source code. By compiling the patched kernel for an architecture named um, the resulting code can be executed as a standard binary, instead of being an image for the system boot.

Footnote 1: The standard acronym for User-Mode Linux would be UML. In this paper U-ML is used instead of UML to avoid confusion with other acronyms widely used in Computer Science, like Unified Modeling Language.

[Figure 1: A schematic view of a GNU-Linux workstation running two U-ML virtual machines. Labels: U-ML machines A and B, VDE Switch, Real Network, Host Computer, Real Disk.]

When a U-ML kernel is executed (as a standard program of the host machine) it appears as if it were another Linux machine. After booting, showing the same set of messages that can be seen on the console of a real Linux machine, the user gets the login prompt. The processes of the U-ML machine are mapped onto processes of the host machine and there is no emulation in place. Only access to the hardware from inside the virtual machine is intercepted and processed by virtual devices.

Each user is the administrator of his/her own U-ML virtual computer. Almost all the practices of a system administrator can be tested inside a U-ML machine, which behaves like a real computer. U-ML is a set of standard processes for the host machine: it has no special privileges, does not need to be executed in set-uid mode, and does not need access to specific kernel modules. The execution of U-ML is as safe as the execution of any other user application [9].

It is also possible to compile new versions of U-ML kernels with different drivers and features. Kernel software bugs or erroneous configurations that would lead to a kernel panic on a real machine cause only the abnormal termination of the U-ML processes. There is no damage to security or availability for other users logged onto the same host computer. U-ML has become a standard tool for Linux kernel developers.

The following subsections give a short description of the U-ML virtual devices and their configuration for the OS Administration Lab (see Figure 1).

2.1 Virtual Distributed Ethernet (VDE)
U-ML provides different virtual network interfaces: ethertap, tuntap, daemon and slip. Ethertap and tuntap are interface modules to the corresponding modules of the host kernel. Root access or a setuid utility is required to activate the network interface on the host for ethertap or tuntap. Slip emulates the Serial Line IP protocol. It can be used to emulate serial communication between U-ML machines, but a teaching environment based on the slip interface would not be consistent with the real world: in the most common scenarios, computers are interconnected by Ethernet, or remote computers are connected using modems. Networks based on local interconnections of point-to-point serial lines between pairs of computers are quite rare.

We have used the daemon interface: an Ethernet network is emulated locally by a local daemon. The local communication is based on UNIX domain sockets. We have created a new daemon named vde-switch (based on the uml-switch tool) [12] that works as a real GNU-Linux daemon and also interfaces the virtual network to the host machine through the tuntap module. The host machine can then act as a router to interconnect the virtual network to the Internet. We have configured IP masquerading so that each U-ML machine is connected to the Internet as a host within a private intranet. Network clients running on U-ML virtual machines can communicate with the entire Internet, while servers are visible only to other U-ML machines and to the host computer. Students' U-ML processes have the same access permissions to the real network that are normally given to ordinary users.

VDE is also compatible with the hardware emulator MPS. MPS is a tool that can be used by students to create their own kernel (as described in [15]).
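
To make the role of the emulated Ethernet more concrete, the following Python sketch shows how a learning switch forwards frames between attached machines. It illustrates generic switch behavior only and is not the vde-switch code: the class LearningSwitch, the port names and the MAC addresses are hypothetical, and treating the daemon as a learning switch rather than a hub is an assumption made for this example.

```python
from dataclasses import dataclass, field

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class LearningSwitch:
    """Toy Ethernet switch (hypothetical, for illustration only)."""
    ports: set = field(default_factory=set)
    table: dict = field(default_factory=dict)  # MAC address -> port name

    def attach(self, port):
        """Plug a new endpoint (virtual machine or host interface) in."""
        self.ports.add(port)

    def forward(self, in_port, src, dst):
        """Return the sorted list of ports the frame is sent out of."""
        # Learn (or refresh) the port on which the source address lives.
        self.table[src] = in_port
        out = self.table.get(dst)
        if dst != BROADCAST and out is not None:
            # Known destination: deliver to one port, never back to the sender.
            return [] if out == in_port else [out]
        # Broadcast or unknown destination: flood to every other port.
        return sorted(p for p in self.ports if p != in_port)


if __name__ == "__main__":
    switch = LearningSwitch()
    for name in ("tap0", "uml-a", "uml-b"):  # assumed port names
        switch.attach(name)
    print(switch.forward("uml-a", "02:00:00:00:00:0a", BROADCAST))
    print(switch.forward("uml-b", "02:00:00:00:00:0b", "02:00:00:00:00:0a"))
```

In this sketch tap0 stands for the host's tuntap interface and uml-a, uml-b for two virtual machines. The first frame is flooded because it is a broadcast; once an address has been learned, frames addressed to it are delivered only to the port where it was seen.
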
With VDE it is possible to interoperate between real Linux boxes, U-ML virtual machines and MPS student-created operating systems. The MPS machine could provide simple services, as if it were an embedded system (e.g. domotics appliances). In this way it is possible to give students a vertical perspective: they will have tested and implemented educational samples of software at all levels (kernel, application, services).

Moreover, each user can start other instances of vde-switch, making it possible to create other virtual LANs. It is also possible to create vde-cables that interconnect local or remote vde-switches. Using the latter tool the student can join the virtual network at the University from home or a dormitory and test the services. Students can manage an entire virtual network on their own.

A vde-cable consists of two vde-plug units interconnected by a double pipe. Any kind of character-based remote execution service can be used to join a remote vde-switch. We use ssh: a vde-cable with ssh works as a general purpose encrypted tunnel. Students can also test, at home, support for Ethernet-compatible protocols not yet supported by their Internet providers (e.g. IPv6). Please note that the method also works when one of the ends (usually the end at the student's home) is behind a NAT masked subnet. Figure 2 shows an example of a virtual network that can be created using vde-switches and vde-cables.

[Figure 2: An example of complex virtual network infrastructure. Labels: LAB COMPUTER A, TUNTAP (routed to the world); LAB COMPUTER B (router); MPS; STUDENT'S PC, TUNTAP as default route (all applications on this real computer are directly connected to the virtual network).]

2.2 File Systems
Each file system of the U-ML virtual machine is normally mapped onto a single large file on the host computer. There are no special encodings or formats: the file is an image of the file system, which can be mounted using the loop option. There are two special file systems in U-ML: hostfs and Copy on Write (COW). Hostfs is used to access a subtree of the host computer file system, while COW is used to have read-write access to a read-only file system image, by writing modifications in a separate diff file. Obviously, from inside U-ML it is possible to mount network file systems.

We have decided to use a standard image file for the root file system, which is quite small, while /usr is mapped on a read-only shared image, opened in COW mode (see Figure 3). The /usr file system in a real environment is quite large (several GBs), and it is very expensive in terms of resources to give such large quotas of disk space to all the students. With our architecture one (or several) shared standard /usr disk images are created by the TAs. All the students can mount the chosen /usr file system in read/write COW mode. Only the changes to the shared /usr image count against their disk quotas.

2.3 Terminals
U-ML emulates serial terminals that can be connected to serial lines, pseudo terminals and xterm windows. It is also possible to interconnect the terminal lines of two U-ML virtual machines: they will behave as a pair of real computers interconnected by a serial line (e.g. a modem).

2.4 Exercises and organization of the class
Several sets of exercises can be organized using our lab environment. The following is a non-exhaustive list of possible assignments:

- Root level basic administration commands:
  - startup and shutdown;
  - user management (add/delete user, change the user's shell, ...);
  - log management (log analysis, storage of log files).
- Installation of applications and services (getting the source, configure, make, install, choice of the right path and organization).
- Use of a GNU-Linux distribution (package update and downgrade when needed, security updates).
- System administration scripting (writing and testing of shell/perl/python scripts for system administration).
- Installation and configuration of local network services:
  - configuration of IPv4 and IPv6 local network services;
  - NIS and LDAP installation and test usage;
  - configuration and test of a network file system infrastructure;
  - centralized printing services;
  - tests on network management systems.
- Installation and configuration of Internet services:
  - testing of IPv4/IPv6 DNS configurations for direct and reverse mapping;
  - HTTP servers: virtual hosting, CGI execution;
  - FTP server: anonymous and authorized access;
  - other services: News (nntp), irc, ...;
  - proxy services.
- Installation and administration of routers (zebra).
- Tests on security issues:
  - testing of iptables IP filtering and masquerading functions;
  - IPsec implementation;
  - installation of the SSL layer;
  - testing of IDS and security auditing software (e.g. nessus).
- Kernel driver development, change and testing:
  - kernel configuration and compilation;
  - installation/development of new drivers.

[Figure 3: An example of file system architecture used in the Lab. Labels: U-ML machines A, B and C (Student: Alice, Student: Bob, Student: Charlie), each with its own / (root) image and a COW DIFF over the shared /usr image; /home; Charlie's home dir on the host computer (using hostfs); NFS test: Bob's UML mounts a subdir of Charlie's home directory.]

Classes on system administration can be organized using a single path of exercises for all the students, or by creating the feeling of a project team. In the former case, the teacher chooses a set of interesting experiments as a subset of the list above, and gives the specifications to the students. In the latter case, several groups manage different machines and carry out different projects, as each group is responsible for a service. Each approach has pros and cons: the former could lead to a higher probability of cheating, while the latter raises an evaluation problem, since different exercises must be graded.

Experience can be exchanged either by organizing seminar lectures, where each group explains its implementation/configuration to the rest of the class, or by using reports to be published either on a web site or on a newsgroup. Students can thus appreciate the real-world consistency of their implementations. The teacher can also decide to interface the students' virtual world to the real Internet (provided he/she trusts the services). In this way students will be able to receive e-mail on their managed MTA or to show their web server.

3. CONCLUSIONS AND FURTHER DEVELOPMENTS
The environment is very promising: the range of possible exercises is very wide and very valuable in terms of teaching aids.
However, some system administration activities cannot be done in this U-ML lab, or the support has not been sufficiently tested.

The basic installation of the entire O.S. is an example. U-ML starts from a kernel and then mounts a partition. Disk partitions are mapped onto separate files, so there is no disk boot block. For the same reason it is not possible (at least not yet) to do exercises on HD partitioning or tests on lilo. We already have a Linux installation image: booting from that image the student can test a complete Debian network installation on U-ML. The current implementation still shows some slight differences between a real machine and a virtual machine (e.g. the device names and a custom fdisk implementation). We are confident that we will be able to release this tool quite soon.

We are studying solutions to also include exercises about practical aspects, administration and integration of virtual systems based on other operating systems. However, there is a tradeoff between a tricky implementation on U-ML that could behave differently from a real computer and a less efficient implementation on a hardware emulator, like Bochs or Qemu. In the former case, the student would use the same tool for the entire course or course module. In the latter, he/she has to learn how to use another tool. There are pros and cons to both approaches. We are studying fast emulation tools and techniques in order to create lab environments not only for exercises in system installation, but also to teach system administration for proprietary closed operating systems. VDE interfaces for Bochs, Plex86 and Qemu are still in the early stages of development.

A feature that should be implemented in the vde-switch is support for 802.1q tagged networks. Students could learn how to create virtual LANs on emulated switches.

We have not yet tested the use of multicast protocols. If performance is adequate for the application, several other exercises could be assigned (e.g. on multimedia content delivery).

Another interesting tool is modem emulation. It is already possible to interconnect two U-ML machines by an emulated serial line and PPP, but the problems related to modem control commands and specific protocols cannot be tested. The development of an emulated modem able to use AT commands, maybe with a virtual PABX (each U-ML modem is connected to a specific phone number), would help in testing utilities like modem dialers (e.g. chat). Students could learn how to set up dial-in services like those given by Internet providers, or how to set up a service using dial-out lines.

The infrastructure is also suitable for just about any type of exercise on computer networking. VDE code has been released under the GPL: students can implement their own switch with different characteristics (e.g. minimum spanning tree, which is not currently implemented in our vde-switch, or some kind of priority management, or even level 4 switching services). U-ML machines can work as routers, so students can test existing routing tools or implement their own. Phenomena like count-to-infinity or link state inconsistencies due to unstable lines can be reproduced by hand. NAT subnetworks or firewalling techniques can be tested using a U-ML machine with several interfaces connected to several vde-switches.

4. ACKNOWLEDGMENTS
I wish to thank Dr. D'Ascanio, who made preliminary tests on U-ML teaching environments, and Mr.
Cosimo Iaia, who is working to complete the Debian installation tool for U-ML.

5. REFERENCES
[1] R. R. Adams and C. Erickson. Linux in education: Teaching system administration with linux. Linux Journal, 2001.
[2] O. Babaoglu, M. Bussan, R. Drummond, and F. B. Schneider. Documentation for the chip computer system. Technical report, Department of Computer Science, Cornell University, 1988.
[3] E. Carniani and R. Davoli. The netwire emulator: A tool for teaching and understanding networks. In ACM 6th Conference on Innovation and Technology in Computer Science Education, ITiCSE 2001, pages 153-156, Canterbury, England, 2001.
[4] E. Carniani and R. Davoli. The netwire emulator: Teaching and understanding networks in both synthetic and real scenarios. In M. Roccetti, editor, Simulation Series, volume (34)1, pages 29-33. SCS, 2002. Presented at the International Conference on Simulation and Multimedia in Engineering Education, ICSEE 2002.
[5] W. A. Christopher, S. J. Procter, and T. E. Anderson. Nachos. http://www.cs.washington.edu/homes/tom/nachos/.
[6] W. A. Christopher, S. J. Procter, and T. E. Anderson. The nachos instructional operating system. In USENIX Winter 1993 Conference Proceedings, 1993. Best Paper Award.
[7] R. Davoli and M. Goldweber. New directions in operating systems courses using hardware simulators. In Proc. of International Conference on Simulation and Multimedia in Engineering Education (ICSEE), Orlando, 2003.
[8] J. D. Dike. User-mode linux. In Proc. of 2001 Ottawa Linux Symposium (OLS), Ottawa, 2001.
[9] J. D. Dike. Making linux safe for virtual machines. In Proc. of 2002 Ottawa Linux Symposium (OLS), Ottawa, 2002.
[10] G. Fankhauser. A mips r3000 simulator. http://www.tik.ee.etzh.ch/~gfa/sim/simulator.html.
[11] D. A. Holland, A. T. Lim, and M. I. Seltzer. A new instructional operating system. In ACM 33rd Technical Symposium on Computer Science Education SIGCSE 2002 Proceedings, pages 111-115, 2002.
[12] King enzo ark programs. http://www.bononia.it/~renzo/keap/.
[13] J. Larus. Spim: a mips r2000/r3000 simulator. http://www.cs.wisc.edu/~larus/spim.html.
[14] K. Lawton. Bochs project home page. http://bochs.sourceforge.net.
[15] M. Morsiani and R. Davoli. Learning operating system structure and implementation through the MPS computer system simulator. In Proceedings of the 30th SIGCSE Technical Symposium on Computer Science Education, pages 63-67, New Orleans, 1999.
[16] T. plex86 Team. Plex86 project home page. http://www.plex86.org.
[17] Qemu cpu emulator. http://fabrice.bellard.free.fr/qemu/index.org.html.
[18] A. S. Tanenbaum and A. S. Woodhull. Operating Systems Design and Implementation. Prentice Hall, 2nd edition, 1997.
[19] T. T. Team. Topsy - a teachable operating system. http://www.tik.ee.etzh.ch/~topsy.
[20] R. Tompkins. A new twist on teaching system administration. In Proc. of USENIX Tenth System Administration Conference (LISA), Chicago, October 1996.
[21] User-mode linux. http://www.usermodelinux.org/.
[22] Vmware inc. http://www.vmware.com/.