Multi-threaded FTP Client

Jeffrey Sharkey
University of Minnesota Duluth
Department of Computer Science
320 Heller Hall
1114 Kirby Drive
Duluth, Minnesota 55812-2496
Website: http://www.d.umn.edu/~shar0213/
E-mail: shar0213@d.umn.edu

Introduction

For my final project in CS 5651 Computer Networks, I chose to write a multithreaded FTP client. In this report, I will explain how I implemented the communication protocol using the IETF RFC [JP85] as a guide. I will cover some of the issues encountered and how I solved them, explain my design and coding architecture, and elaborate on how I tested the client.

Protocol Implemented

When writing my FTP client, I used two references. The first was IETF RFC 959: File Transfer Protocol [JP85], which outlines how clients and servers communicate to transfer files. It did an excellent job of explaining both the commands and the timeline surrounding certain events. The command-reply sequence diagram on pages 49-52 was very helpful in understanding how and when to expect responses from a correctly implemented server. Other
examples of connection state diagrams on the following pages also aided my understanding of the protocol.

The protocol specifies that clients initially connect to servers on port 21, and can open further connections on other ports as the session progresses. The first connection is known as the protocol-interpreter connection because it carries the protocol (command) traffic. It is analogous to a Telnet session in that it acts like a command-prompt interpreter. Then, as needed, the client can request data connections to the server. For each data connection, the client and server take opposite roles, chosen by the client: one endpoint must be active and the other passive. Because the client specifies its role first, the server must take the opposite role. The passive endpoint simply listens for and accepts data connections on a given port. The active endpoint then connects and drives the data transfer, while the passive endpoint simply does as told.

Most servers require authentication, but some do not. When it is required, the server requests it with a reply code once the TCP connection is established. At this point, the connection between the Transport layers is successful, and the Application layers are performing the authentication. Servers can accept anonymous connections or refuse them altogether.

My second reference was a pair of working FTP servers that I ran on
my local machine. The first was the Microsoft FTP server [Mic01] that comes as part of IIS. The second was GuildFTPd [MF04], a freeware Windows FTP server daemon. Since I did much of my work offline, these servers were helpful in testing my FTP client.

Implementation Detail

After examining the FTP protocol, I decided that it would be easiest to write my client in the Java programming language, because I knew it had excellent socket and multi-threading capabilities. Because a client is required to handle multiple command and data connections, it seemed logical to implement my client using multiple threads. This allowed me to write my program more efficiently and with fewer event loops. Java is also traditionally known for enforcing a higher level of programming etiquette; that is, being more object-oriented than C or C++.

I considered writing my client in C++, but decided against it after recalling some of the problems I ran into when trying to compile on the Solaris machines in the development laboratory. Another factor was that I would be working on this project on my personal Windows computer, where I did not have a C++ compiler at my disposal. Thus, I chose to write my client in Java.

As a side note, my client only supports ASCII file transfers. It can be easily extended to also support binary transfers.
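
As an illustration of the passive data connection described under Protocol Implemented, the sketch below decodes the reply a server sends when the client asks it to be the passive endpoint. In RFC 959 this is the 227 reply to the PASV command, which carries the server's listening address as six decimal bytes (h1,h2,h3,h4,p1,p2), with the port equal to p1 * 256 + p2. The class name PasvReply and its parse method are hypothetical and are not taken from the client described in this report; this is only a minimal sketch under those assumptions.

```java
import java.net.InetSocketAddress;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not from the original client): decodes a 227 reply. */
final class PasvReply {
    // RFC 959 encodes the data endpoint as six decimal bytes: h1,h2,h3,h4,p1,p2.
    private static final Pattern ADDRESS = Pattern.compile(
            "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

    private PasvReply() {}

    /** Returns the host and port the server is listening on for the data connection. */
    static InetSocketAddress parse(String replyLine) {
        if (!replyLine.startsWith("227")) {
            throw new IllegalArgumentException("Not a 227 reply: " + replyLine);
        }
        Matcher m = ADDRESS.matcher(replyLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No address in reply: " + replyLine);
        }
        int[] parts = new int[6];
        for (int i = 0; i < 6; i++) {
            parts[i] = Integer.parseInt(m.group(i + 1));
            if (parts[i] > 255) {
                throw new IllegalArgumentException("Byte out of range: " + parts[i]);
            }
        }
        String host = parts[0] + "." + parts[1] + "." + parts[2] + "." + parts[3];
        int port = parts[4] * 256 + parts[5]; // high byte first, then low byte
        // Unresolved: the caller decides when to open the actual socket.
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress data = parse("227 Entering Passive Mode (127,0,0,1,19,137)");
        System.out.println(data.getHostString() + ":" + data.getPort());
    }
}
```

Running the main method prints 127.0.0.1:5001, since 19 * 256 + 137 = 5001. A transfer thread would then open a socket to that address while the command connection stays free for further commands.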

To umbrella the concept of a single command connection and multiple data connections, I created a controller class responsible for all connections to a given server. It spawns transfer and authentication threads as invoked by its methods. It takes care of all of the underlying commands, so the caller needs no other knowledge of FTP.

Finally, I wrote a simple user interface class that acts like a command prompt for the user. It accepts commands according to the table below:

Command   Arguments      Description
OPEN                     Open command connection to server.
CLOSE                    Close command connection to server.
LOGIN                    Instantiate authentication thread.
DIR                      Acquire and dump directory listing.
GET       Remote name.   Download the given file.
PUT       Local name.    Upload the given file.
CD        Path name.     Change to the given directory.
DEBUG                    Toggle the debug mode of the client.
QUIT                     Quit the client.

In terms of software architecture, I used two design ideas. The first was multithreading, which lets my program run many file transfers simultaneously, each in its own thread. In addition, the user can continue to issue commands over the console connection while a transfer is in progress. Some multi-threaded servers support this feature. The second was the observer design pattern. I wrote my Console object to have a State associated with it. As the state changed, it would update any registered observers with
the news. Currently, only the Control class registers itself, but this design improves the overall reusability of the client.

Experiments

To test my client, I connected to various Internet and local FTP servers to ensure that it would work in most normal situations. I did not (and cannot) connect to every FTP server in existence. However, I am confident that my client works with most servers that follow the RFC mentioned earlier in this document. I connected to the following servers: localhost, ftp.d.umn.edu, ftp.cpinternet.com, ftp.irs.gov, and csdev01.d.umn.edu. I was surprised at how differently each of them behaved, but was happy to note that my client responded appropriately in each case.

In addition to host connection testing, I also tested how my client handled error codes coming back from the server. I set up three test cases. First, I made a directory where the user could only LIST and not RETR or STOR. My client successfully informed the user that they did not have permission to read or write in the directory when they tried to download or upload files. Second, I made a directory that allowed reading, but not writing, of files. The client let the user download, but explained the error when they tried to upload. Third, I made a directory where a user could download
and upload, but could not LIST. My client also correctly handled this case.

The performance of my client was not tested extensively. However, I can offer a few observations. Because Java runs on a virtual machine rather than as native code, it has been known to respond poorly under heavy load. With the advent of a Java-based OS, this may change. Also, multithreading has been known to cause deadlock in certain cases. I have implemented object locking in the most deadlock-prone portions of my program. However, there are still sections of my code that could cause deadlock.

Summary

Over the past few weeks, I have had the very educational experience of writing an FTP client from the ground up. It has shown me how far technology has come, and given me an overall appreciation for the ancestors of today's high-speed data transfer networks.

References

[JP85] J. Postel and J. Reynolds. File Transfer Protocol (RFC 959). October 1985.
[MF04] Matthew Flewelling, Brad Palmer, and Steve Poulsen. GuildFTPd 0.999.13, 2004.
[Mic01] Microsoft. Internet Information Services 5.1, 2001.