Project Report on Implementation and Testing of an HTTP/1.0 Webserver




Christian Fritsch, Krister Helbing, Fabian Rakebrandt, Tobias Staub
Practical Course Telematics
Teaching Assistant: Ingo Juchem
Instructor: Dr. Xiaoming Fu
SS 2005
Institute for Informatics, University of Göttingen, Germany

Abstract. This document reports on the project "Implementation and Testing of an HTTP/1.0 Webserver", which we carried out during the practical course Practicum Telematics in the summer semester 2005. We discuss some design aspects, give an overview of the implementation, and evaluate the performance results. The source code is released under the GPL at http://user.informatik.uni-goettingen.de/ teleprak/ss2005/webperf.

1 Introduction

This document describes the design, implementation and testing of an HTTP/1.0 web server written in the programming language C. The work was carried out as a practicum project during the summer semester 2005 at the University of Göttingen. The goal of the project was to build a web server that provides the basic functionality of HTTP/1.0 as specified in RFC 1945 [1]. The project requirements were the following: 1) the server should handle incoming HTTP requests in an appropriate way, 2) it has to be file system based, and 3) it should work with a thread pool to achieve faster transmission times. Although HTTP/1.0 web servers commonly support the Common Gateway Interface (CGI), our implementation currently does not; furthermore, we have not implemented the POST method.

For easy configuration, all MIME types are stored in a static file, so that they can easily be extended later. We also store all program settings in an external configuration file. This mechanism allows individual settings (e.g. the values for the index or default page) to be adjusted quickly; changes take effect after the server has been restarted.

We performed a number of tests with commonly available browsers to ensure that the server works correctly. This can easily be verified by retrieving various file types with different browsers such as Internet Explorer, Firefox or the text-based Lynx. In addition, we conducted several performance tests that allow a comparison between the Linux-based version and other web servers such as Apache. We also ported the web server to the Windows platform and compared the performance (advantages and disadvantages) on both operating systems. Finally, our results suggest that the number of threads used in the server is an important factor in determining the performance.

2 Problem Description

Developing a web server in C from scratch is not a simple task. First, one has to be acquainted with what the standard C libraries provide and how to map it onto the designated goals. Second, although other implementations of HTTP/1.0 web servers exist, there is no thread-based version that we can compare directly with our solution. Consequently, there is no fair competition between our implementation and others: the available servers are either HTTP/1.1 based and therefore offer improved performance, or they do not use multi-threading. We left this problem open in the rate tests.

3 Technical Approach

3.1 Requirements Referring to RFC 1945

The old version HTTP/0.9 was developed by Tim Berners-Lee in 1990 to transmit HTML documents. It supports a special form of asking the web server for files, called a Simple-Request: the client sends a request that consists only of the single method GET combined with the URI of the document. Retrieving data is the sole purpose of this protocol, which left clear room for improvement. That is why its successor came into being in 1996. HTTP/1.0 offers additional methods such as HEAD and POST to extend the functionality. From this version on, a so-called Full-Request is sent (and, correspondingly, a Full-Response is received). An HTTP/1.0 web server is expected to recognize the protocol version and to build its response according to the format the client used.

Apart from the request-line, headers are sent to provide optional information about the client. The headers of a request message are divided into general headers, request headers and entity headers. The general header contains the directives Date and Pragma (obsolete; HTTP/1.0 only knows the value no-cache). The Date value may be given in three different formats, but only two of them are actually still in use: the RFC 822 format (updated by RFC 1123) and ANSI C's asctime format. The request header defines, for example, the accepted file types, which tells the web server about all types the client supports. It also defines User-Agent and Referer, which give details about the browser and the URI that led to the document. The Authorization field carries the additional information that is requested when the web server returns the message 401 Unauthorized. The entity header describes the data that can be sent to the server by giving content information about the entity body. This header is only used in combination with the POST method.

In the simplest case the web server replies by sending only the entity body, which contains the data the client wants to retrieve. This form is called a Simple-Response and is the answer to a Simple-Request sent by an HTTP/0.9 client. An HTTP/1.0 server provides considerably more information: its reply starts with a status-line that represents the state of the requested document. The first token is always the protocol version both parties use, followed by the status code (five different classes) and a human-readable phrase for the status. After the status-line there can also be a general header, a response header or an entity header. The first and the last are the same as in a request message. Additional information about the server that cannot be placed in the status-line is put into the fields of the response header. These are, on the one hand, Location, which is given as an absolute URI, and on the other hand Server, which describes the software the origin server uses to handle a request. The entity body at the end of the reply contains the requested file, but it is only sent if the target exists and can be read.
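The distinction between a Simple-Request and a Full-Request can be illustrated with a short C sketch. This is not the parser used in Fabihttp; the struct layout and the name parse_request_line are hypothetical and chosen only for this example. A request line with two tokens is treated as an HTTP/0.9 Simple-Request, one with three tokens as a Full-Request.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical request record; the field names are illustrative only. */
struct request {
    char method[16];   /* "GET", "HEAD", ... */
    char uri[256];     /* requested URI */
    int  major, minor; /* protocol version; 0.9 for a Simple-Request */
};

/* Parse the first line of a request.
 * Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *line, struct request *req)
{
    char version[16] = "";
    int n = sscanf(line, "%15s %255s %15s", req->method, req->uri, version);

    if (n == 2) {
        /* Simple-Request (HTTP/0.9): only "GET <uri>" is allowed. */
        if (strcmp(req->method, "GET") != 0)
            return -1;
        req->major = 0;
        req->minor = 9;
        return 0;
    }
    /* Full-Request: the third token must be an HTTP-Version. */
    if (n == 3 && sscanf(version, "HTTP/%d.%d", &req->major, &req->minor) == 2)
        return 0;
    return -1;
}
```

A caller would pass the first line received from the client and then decide, based on the parsed version, whether to answer with a bare entity body or with a status-line and headers.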

These headers make it possible to use cache management by means of conditional GET requests. The web server then sends a file only if it has been modified or when the document has expired. Otherwise the browser falls back to locally stored data in order to avoid unnecessary transfers. For example, the message 304 Not Modified means that the browser receives just the status-line without the entity. The directives Date, Expires and Last-Modified thus support cache management by telling the system how long files should be stored temporarily. The HEAD method fetches only the header part of a complete reply, so that the browser can check whether a new version of a file is available. The client can also send data via POST with an entity body and specific entity header fields, but that feature is not implemented yet.

3.2 Using Threads

The web server uses thread pooling, which means that each request is processed in a single thread taken out of a pool. The listening procedure is therefore not blocked, because the request can be processed immediately. Otherwise no new client could send a request message while the web server is still busy with an earlier one. The only limitation is that the server is bound to the defined number of threads. Keep in mind, however, that both the performance and the failure rate depend on this value. To address this, we define the value in the configuration file, so that quick tests can determine the optimal number of threads.
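One common way to realize such a pool is a fixed set of worker threads that take jobs from a bounded queue. The following C sketch shows this pattern under stated assumptions: the names pool_start, pool_submit, pool_stop and the demo handler add_to_total are hypothetical and are not taken from the Fabihttp sources, and the jobs are plain integers standing in for accepted client sockets.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_SIZE  64   /* pending jobs the pool can hold */
#define MAX_WORKERS 64   /* upper bound on the pool size */

/* Hypothetical pool: a fixed set of workers fed from a bounded job queue. */
struct pool {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
    int jobs[QUEUE_SIZE];            /* e.g. accepted client sockets */
    int head, count, shutdown;
    void (*handler)(int job);        /* run by a worker for each job */
    pthread_t workers[MAX_WORKERS];
    int nworkers;
};

static void *worker(void *arg)
{
    struct pool *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->shutdown)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {         /* shutting down and queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        int job = p->jobs[p->head];
        p->head = (p->head + 1) % QUEUE_SIZE;
        p->count--;
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
        p->handler(job);             /* process outside the lock */
    }
}

/* Start up to nworkers threads; returns the number actually started. */
int pool_start(struct pool *p, int nworkers, void (*handler)(int))
{
    if (nworkers < 1 || nworkers > MAX_WORKERS)
        return 0;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    p->head = p->count = p->shutdown = p->nworkers = 0;
    p->handler = handler;
    while (p->nworkers < nworkers &&
           pthread_create(&p->workers[p->nworkers], NULL, worker, p) == 0)
        p->nworkers++;
    return p->nworkers;
}

/* Queue one job; blocks while the queue is full. */
void pool_submit(struct pool *p, int job)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_SIZE)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->jobs[(p->head + p->count) % QUEUE_SIZE] = job;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Let the workers finish the queued jobs, then join them. */
void pool_stop(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->shutdown = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nworkers; i++)
        pthread_join(p->workers[i], NULL);
}

/* Demo handler for illustration: a real server would serve the socket here. */
pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;
long total;
void add_to_total(int job)
{
    pthread_mutex_lock(&total_lock);
    total += job;
    pthread_mutex_unlock(&total_lock);
}
```

In a server, the listening thread would call pool_submit with each accepted socket, and the number of workers passed to pool_start would come from the configuration file. Submitting blocks when the queue is full, which is one possible way to bound the load; dropping the connection instead would be another.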

4 Implementation Details

4.1 Interfaces

Fig. 1. All participating interfaces of Fabihttp

fabihttp.c
  main: the function called by the operating system on program start
mime.c
  buildmime: loads the mime.txt file into memory
  findmime: called each time the MIME type of a document is requested
  release_mime: releases the memory used by the MIME types on program exit
config.c
  buildconfig: loads the options set in config.txt
comm.c
  create_socket: gets a socket from the operating system and returns its number
  wait_for_request: blocks the program until a request arrives
  send_response: sends the response
request.c
  main_loop: starts the master thread
thread.c
  masterthread: the main thread, which initializes the socket communication and starts threads on incoming requests
  new_thread: creates a new thread; called by masterthread
parse.c
  parserequest: scans the request string for HTTP commands and fills a request struct
response.c
  build_response: builds a response to a valid request or sends an error message

4.2 Program Initialization

Fig. 2. Initialization of Fabihttp

As shown in Figure 2, the main function first loads the MIME types and the configuration options into memory. If no options are set, the program's internal default values are used. After that, the main_loop function creates the masterthread, which establishes a signal handler that quits the program when the user issues the <Ctrl-C> command. The masterthread continues by creating the first socket on the specified HTTP port and starts to listen for HTTP requests on it. Both functions are located in the comm.c module. The program is now in a loop between waiting for incoming HTTP requests and creating a new thread that handles each of them.

4.3 Request Handling

Each time a request arrives at the established socket, the masterthread loop creates a new sub-thread based on the function handle_whole_request. This function causes the parse module to parse the request string for all relevant HTTP commands. Most important are the HTTP method (GET or HEAD), the desired Uniform Resource Identifier (URI) and the HTTP protocol version (0.9, 1.0 or 1.1). Due to the limitations of Fabihttp, all 1.1 requests are handled as 1.0 requests. With the filled request struct as parameter, the build_response function is called to generate an HTTP response string. Several answers are currently possible: <HTTP/1.0 400 Bad Request> if the HTTP method is undefined, <HTTP/1.0 404 Not Found> if the requested file does not exist, <HTTP/1.0 403 Forbidden> if the requested file is not accessible by <OTHER> or lies outside the specified wwwdir, and of course <HTTP/1.0 200 OK>. The response header fields carry information such as date, file size, server, content-length, last-modified and content-type. After generating the response string, build_response calls the send_response function with the response string and the name of the file to send. Finally, the request-handling thread detaches itself. The overall request handling design is depicted in Figure 3.

4.4 WinFabiHttp

In addition, we ported the Linux version of Fabihttp to the Windows platform; we call the port WinFabiHttp. A simple request-response dispatch by WinFabiHttp is shown in Figure 4. The port made it possible to compare the network performance of the different operating systems with one and the same program. Two major problems occurred during the development of the Windows version. The first was the lack of thread handling built into the operating system, and the second was the slightly different socket management. The first problem could be solved by using Pthreads-win32 [2]. Pthreads-win32 is a free Open Source Software implementation of the Threads component of the POSIX 1003.1c 1995 Standard (or later) for Microsoft's Win32 environment, distributed under the GNU Lesser General Public License (LGPL). Because the library is built using various exception handling schemes and compilers, and because it may not work reliably if these are mixed in an application, each version of the library has its own name. For WinFabiHttp the library files pthreadgc2.dll and libpthreadgc2.a were used.
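Returning to the request handling of Section 4.3, the choice between the four status codes can be sketched in C as follows. This is an illustration only, not the build_response code of Fabihttp: the function name choose_status is hypothetical, the path is assumed to be already canonicalized (for example with realpath()), and wwwdir is assumed to be given without a trailing slash. Checking the document root before the file's existence is a design choice of this sketch, so that requests outside wwwdir reveal nothing about other files.

```c
#include <string.h>
#include <sys/stat.h>

/* Pick the HTTP/1.0 status code for a request.
 * Assumptions of this sketch: `path` is an absolute, canonical
 * file-system path and `wwwdir` has no trailing slash. */
int choose_status(const char *method, const char *path, const char *wwwdir)
{
    struct stat st;
    size_t n = strlen(wwwdir);

    if (strcmp(method, "GET") != 0 && strcmp(method, "HEAD") != 0)
        return 400;                 /* undefined method */
    if (strncmp(path, wwwdir, n) != 0 || (path[n] != '/' && path[n] != '\0'))
        return 403;                 /* outside the document root */
    if (stat(path, &st) != 0)
        return 404;                 /* file does not exist */
    if (!S_ISREG(st.st_mode) || !(st.st_mode & S_IROTH))
        return 403;                 /* not a regular file readable by "other" */
    return 200;
}
```

Directories are rejected here for brevity; a server that maps a directory to its index page would resolve that page before calling such a function.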

Fig. 3. Request handling

The socket handling under MS Windows is covered by the winsock library. To use it, the additional includes <winsock2.h> and <mswsock.h> are required. Two incompatibilities were noticed during the Linux-to-Windows conversion: unlike Unix sockets, winsock has to be initialized before use, and the command for closing a socket connection is closesocket() instead of close(). At 40,468 bytes, the executable file is considerably larger than the 23,080-byte Linux binary. In return, the Windows version can deal with blanks and umlauts in the URI.
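The two incompatibilities can be hidden behind a small portability layer. The sketch below shows one way to do this with conditional compilation; the wrapper names net_init, net_close and net_cleanup and the type socket_t are hypothetical and do not come from the WinFabiHttp sources. WSAStartup(), WSACleanup() and closesocket() are the standard winsock calls.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
#else
#include <sys/socket.h>
#include <unistd.h>
typedef int socket_t;
#endif

/* Call once before any socket function; a no-op outside Windows.
 * Returns 0 on success. */
int net_init(void)
{
#ifdef _WIN32
    WSADATA wsa;
    return WSAStartup(MAKEWORD(2, 2), &wsa);
#else
    return 0;
#endif
}

/* Close a socket with the call each platform expects. */
int net_close(socket_t s)
{
#ifdef _WIN32
    return closesocket(s);
#else
    return close(s);
#endif
}

/* Release winsock resources on exit; a no-op outside Windows. */
void net_cleanup(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}
```

With such wrappers, the rest of the server code can stay identical on both platforms, which supports the kind of like-for-like comparison described in this report.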

Fig. 4. Simple request-response dispatch by WinFabiHttp

5 Results

5.1 Testing Environment

For our tests we used two different environments. The first one consisted of a Pentium 100 MHz with 24 MB of RAM running Debian Sarge (server I), an AMD Athlon XP 2400+ with 512 MB of RAM, also running Debian Sarge (server II), and, as client, an Intel Celeron M 1400 MHz with 512 MB of RAM running Kubuntu. The three computers were connected in a 100 Mbit/s LAN via a switch. This setting was used to measure the performance differences with regard to the host computer.

The second environment consisted of an AMD Athlon XP 2200+ with 512 MB of RAM, running Windows XP on the one hand and SuSE Linux 9.2 on the other hand. The client was an Intel Pentium III 700 MHz with 512 MB of RAM running SuSE Linux 9.2. These computers were also connected in a 100 Mbit/s LAN via a switch. This setting was used to measure the performance differences between the Linux version and the Windows port of our web server.

For the comparison of Fabihttp with other web servers we used the open source standard Apache [3], Monkey Httpd 0.9 [4] and the Windows port WinFabiHttp. The Monkey web server is also an open source project. It is an HTTP/1.1 server written in C; it is thread based but does not use a thread pool. As test tool we used httperf, an HTTP performance measurement tool. It speaks both HTTP/1.0 and HTTP/1.1 and can simulate a variety of workloads. At the end of each test run it provides detailed statistics subdivided into parts such as summary, requests, responses, errors and used network bandwidth.

5.2 Tests Part 1

First we measured the CPU usage of our web server and how different numbers of concurrent threads affect it (Figure 5). We used our first test environment for this analysis. As expected, we measured significant differences. The Pentium is capable of handling about 130 connections per second; beyond that limit its resources are exhausted. This means that if you want to host a private homepage with Fabihttp and do not expect that many concurrent visitors, you can make use of an old, disused computer.

Next we tested the response times depending on the number of threads. Figure 6 clearly shows that the Athlon handles requests faster than the Pentium. However, we did not obtain any conclusive results concerning the number of threads. That is why we decided to run additional tests with a lower number of threads (<64).

5.3 Tests Part 2

In the following test we analyzed the file transfer times depending on the file size. For this and the remaining tests we used our second environment.

Fig. 5. CPU usage

Figure 7 shows a linear increase of the transfer times, without significant differences between the web servers. We concluded that the 100 Mbit/s LAN is the bottleneck in this case, which is why we cannot gauge the performance of Fabihttp with this test.

Since we could not detect differences with respect to file size, we evaluated different send methods. We began by sending a single character at a time and then increased the buffer size. Finally, we compared the values with the sendfile method. This method is known to increase network throughput because the operating system does not copy the data before it is sent (zero-copy) and the network card is able to fetch the packet headers from a different place in memory. This relieves the operating system and the CPU and can accelerate network transactions by up to 85%, at least in theory [5]. The first conclusion is that sending data via the send_char() method is always inefficient (Figure 8). The performance improves when buffered methods are used, but there are limits to this as well: in our test we detected no further increase in performance with a buffer size of 20 bytes or higher. We could not observe better values for the sendfile method either.

The next test concerns the file transfer times of Fabihttp and the Windows port WinFabiHttp (Figure 9). As in our first file transfer time test, we could not detect significant differences between the two versions. The bottleneck in this scenario is again the 100 Mbit/s LAN. However, since Fabihttp is slightly faster than WinFabiHttp and both web servers are nearly identical, we assume that the Linux system we used handles file transfers in general more efficiently than Windows.

Fig. 6. Response times

5.4 Tests Part 3

This part deals with performance differences under variable configurations, with regard to different numbers of threads and different operating systems. The tests are meant to demonstrate that the number of threads affects the performance considerably and that the optimal number of threads depends on the operating system. The following five tests, all performed in the second environment, are intended to support this hypothesis:

Failure rates of different web servers (Figure 10): This test investigates the failure rates of the three web servers Monkey Httpd 0.9, Fabihttp and Apache with default values. The tests were performed with a range of 1 to 2000 connections per second. As one can see in Figure 10, Apache did its job very well, without any failures. Fabihttp had a few problems that are hardly worth mentioning. The failure rate of Monkey Httpd 0.9, however, increased sharply at 2000 connections per second. That might be caused by its different thread handling.

Success rate (default values), WinFabiHttp vs. Fabihttp (Figure 11): The next test concerns the success rates of Fabihttp and the Windows port WinFabiHttp. The test was performed with a range of 500 to 2000 connections per second. Fabihttp passed all tests with a success rate of nearly 100%. The success rate of WinFabiHttp, however, decreased appreciably at 1000 and 2000 connections per second. That might be caused by Windows internals.

Success rate: WinFabiHttp and Fabihttp with different request rates and threads (Figure 12): This test deals with different numbers of threads. It was performed with a range of 500 to 2000 connections per second, and both Fabihttp and WinFabiHttp were tested.

Fig. 7. File transfer time

The different numbers of threads led to different success rates for both web servers, and the two servers reacted very differently. During all tests the success rate of Fabihttp ranged between 96% and 100%, whereas the success rate of WinFabiHttp showed significant differences up to 20 threads.

Success rates with 2 threads, WinFabiHttp vs. Fabihttp (Figure 13): Here both web servers (WinFabiHttp and Fabihttp) were tested with two threads. The test was performed with a range of 1 to 2000 connections per second. One would expect similar results for both versions, but as one can see in Figure 13, Fabihttp had a success rate of 100% during all tests, while the success rate of WinFabiHttp decreased sharply. This indicates that the optimal number of threads depends on the operating system.

Failure rates with 2 and 64 threads, Monkey vs. Fabihttp (Figure 14): The last test investigates the failure rate of Fabihttp with 2 and 64 threads and of Monkey Httpd 0.9. Fabihttp with 2 threads had a failure rate of 0%, but Fabihttp with 64 threads as well as Monkey produced failures. These results indicate that a configurable number of threads can improve the performance.

Fig. 8. Different send methods

Fig. 9. File transfer time

Fig. 10. Failure rates

Fig. 11. Success rates

Fig. 12. Success rates

Fig. 13. Success rates

Fig. 14. Failure rates

6 Summary and Future Work

On the basis of the tests we draw the following three conclusions:
1. There is a certain number of threads that optimizes server performance; however, this number depends on the operating system used.
2. There is no significant performance benefit from non-standard send methods.
3. Linux seems to handle large numbers of requests better than Windows.

In the future it would make sense to pursue the following three points:
1. Adjust the optimal number of threads automatically.
2. Implement CGI support.
3. Upgrade to HTTP/1.1.

References

1. Berners-Lee, T., Fielding, R., Frystyk, H.: Hypertext Transfer Protocol HTTP/1.0. RFC 1945, Internet Engineering Task Force (1996)
2. (http://sources.redhat.com/pthreads-win32/)
3. (http://www.apache.org/)
4. (http://monkeyd.sourceforge.net/)
5. Wowra, J.P.: WWW server optimizations. Seminar report for Advanced Topics in Computer Networking (SS2005) (2005)