4D WebSTAR 5.1: Performance Advantages
CJ Holmes, Director of Engineering, 4D WebSTAR

OVERVIEW

This white paper discusses the performance benefits of 4D WebSTAR 5.1 compared to other Web servers such as Apache 1.3.20. It is designed to:

1) show the performance advantages offered by 4D WebSTAR;
2) provide tests that our customers can replicate and experiment with to tune the performance of their own servers (for this purpose, we have kept the configuration as simple as possible while still providing meaningful measurements);
3) discuss some of the factors affecting performance in Web servers in general; and
4) discuss some of the areas where we plan to improve 4D WebSTAR performance.

TEST SUITE PACKAGE

For your convenience, the same files used to generate the performance results have been provided with this white paper download. The archive includes a copy of http_load (a performance testing tool), all of the files used as content for these tests, and the files required by http_load.

INTERPRETING THE TEST RESULTS

To interpret the results of the performance testing, we will start with the bottom-line numbers and work our way into more detail. The following graph offers a visual representation of the test results:

[Graph: requests per second for each test, WebSTAR 5.1 versus Apache 1.3.20; horizontal axis from 0 to 1500]

Test       WebSTAR 5.1   Apache 1.3.20
large          293           312
small         1291          1297
index         1069           654
dirlist        698           280
php            188           195
search          67           6.3
overall        603           155

In this graph, the unit of measurement is requests per second, and higher numbers indicate faster performance. We will discuss each test in more detail later, including what is special about each file set, what the performance-limiting factors are, and how changes to the server or network configuration could affect the outcome.

TEST PLATFORM

The platform that was used for performance testing here does not represent the best setup possible.
Rather, it is a platform that most people would be able to recreate for their own testing purposes. One way to improve the testing platform would be to use the new generation of PowerMacs, which have an improved I/O architecture, faster bus speed, and larger L2 cache. Running these same tests on such a machine would result in mostly higher numbers.

The testing platform used here includes the following:

Server
- Dual 450-MHz PowerMac G4 (Machine ID 406)
- Software
  - WebSTAR 5.1
  - Apache 1.3.20 (Darwin), pre-installed with MacOS 10.1.2

Clients
- 3 single-processor PowerMac G4s
- http_load dated 04Jan2002, available from <http://www.acme.com/>

Network
- 100BaseT to every machine
- 100BaseT switch

Running the tests

In most cases the test was run from all three clients simultaneously using a command line similar to "./http_load -parallel 8 -seconds 120 both_small.txt". The results from all three clients were added together to arrive at the final numbers.

We chose http_load because of its simplicity and efficiency, even though it offers little in the way of protocol
options. We used 8 threads per client for a total of 24 threads. We chose this number because it did not require any special configuration for either server but was large enough to be non-trivial. Some users will want to test with many more threads, perhaps using http_load's throttle feature to simulate more clients using the server over a slower network.

Before each test was run for real from all three clients, it was run for just a few seconds from a single client to "warm up" the server. This eliminates initialization and caching ramp-up from the final numbers. In our tests this did not seem to make much of a difference, but it is good practice.

OVERALL

Overall, 4D WebSTAR was nearly four times as fast as Apache (603 versus 155 requests per second). This overall test is composed of several different kinds of requests:

- 55% small files
- 15% large files by URL
- 15% large index files
- 5% directory listings
- 5% PHP requests
- 5% search requests

This test attempts to simulate a real-world Web server. It consists mainly of small graphics, HTML files in the 20-40k range, and the occasional index file request. These static pages make up a total of 85% of the entire load. The remaining 15% are dynamic requests for directory listings, PHP, and search queries. Such a mix may or may not approximate your own Web site.

It is difficult to predict how changing the server or network setup will affect the outcome of such a test because there are so many variables involved. Instead, we discuss each kind of test separately and offer some advice on how to optimize performance for that case. Later we discuss how you can determine which kind of request should be optimized for the greatest effect on your particular server.

SMALL FILE TEST

When serving small files, 4D WebSTAR and Apache were evenly matched in our tests. The URLs for this test are in the file both_small.txt. It is composed of a collection of small graphics files, or "eye candy."
It is common for these files to be scattered all over a Web site. Our webmaster is quite good at keeping these files small, and many of them are under 1K. This test yields the highest requests/second numbers for both servers, but not much data is actually served.

The greatest limitations in this test are (a) how fast connection setup and tear-down can happen and (b) how fast the server is at parsing the request and making basic serving decisions, such as which file to serve and whether or not the request is valid.

Setting up and tearing down TCP connections is rather expensive. It involves a three-way handshake to set up a connection and a four-way handshake to tear it down. Thus, latency has a significant impact on this number, as no single request can be served in less than 7 times the latency between the two hosts. Using a gigabit switch can improve performance by lowering the time required to set up connections. Also, clients that use persistent connections and pipelined requests would see considerably higher performance, since they would avoid most of the TCP connection overhead. Since http_load does not support persistent connections, we are seeing a "worst-case scenario" here, which is useful for our purposes.

File I/O is a secondary consideration, but not an insignificant one. Once the connection has been accepted, half of the CPU time needed to process the request is consumed by opening and reading the file. This may come as a surprise to many people, since we normally think of disk operations as being an order of magnitude faster than the network. But consider the case of a very small file, under one kilobyte, which is already in the operating system's file cache. It takes 25-30 microseconds just to open the file and another 25-45 microseconds to read it, for a total of 50-75 microseconds. By comparison, it takes around 82 microseconds to send 1 kilobyte of data over a 100-Mbit network.
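
The comparison between file access time and wire time can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes 1 kilobyte means 1024 bytes and 100 Mbit means 100 million bits per second, it ignores all protocol overhead, and the function and constant names are made up for this example.

```python
def wire_time_us(payload_bytes: int, link_mbit_per_s: float = 100.0) -> float:
    """Microseconds to put payload_bytes on the wire, ignoring protocol overhead."""
    bits = payload_bytes * 8
    # bits divided by (megabits per second) gives microseconds
    return bits / link_mbit_per_s


# File-cache timings quoted in the text, in microseconds:
# opening takes 25-30, reading takes 25-45.
FILE_IO_US = (25 + 25, 30 + 45)

for size in (1024, 512, 256):
    print(f"{size:5d} bytes: wire {wire_time_us(size):6.2f} us, "
          f"file open+read {FILE_IO_US[0]}-{FILE_IO_US[1]} us")
```

For a 1024-byte file the wire time works out to 81.92 microseconds, which matches the figure of around 82 microseconds quoted above. At 512 bytes the wire time is already below the low end of the open-plus-read range.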
As the size of the file shrinks, the time to open and read the file stays the same, but the cost of sending it decreases. For this reason, 4D WebSTAR uses a data cache to keep small files in memory. If you have thousands of small files, you may want to increase 4D WebSTAR's data cache to be large enough to hold them all. Apple's version of Apache also has a similar built-in data cache.

LARGE FILE TEST

In this test, 4D WebSTAR and Apache performed nearly the same. The URLs for this test are in the file both_large.txt. It includes files that are between 20k and 70k in length, too large to be cached without modifying your server settings. This test produces low requests/second numbers but
very high bandwidth consumption. The small performance difference between the two servers is most likely due to other network traffic present when the test was conducted.

Again, disk I/O is a secondary factor for most Web sites. The exception is sites that need to serve many clients at a time (over a hundred), which is a more challenging problem for the disk hardware and raises the risk of exceeding the file system's limits on file descriptors. Both issues can be mitigated by increasing 4D WebSTAR's "maximum file size to cache" and "maximum cache size" settings so that more files, and larger files, are cached.

Using a faster network will result in better performance by both servers, but such a test isn't relevant to the majority of our users.

INDEX FILES

The URL for this test is in the file both_index.txt. It is simply a URL requesting the home page. It may not look like much, but it can be a challenging case for both servers. When a server receives a request that ends in a '/' character, it needs to decide which file in the directory should be served, or perhaps allow a plug-in such as the Directory Indexer to handle the request. Typically, there are several possibilities from which to choose.

In this test, 4D WebSTAR outperformed Apache by about 60%, handling over a thousand such requests per second.

For this test we configured Apache to accept three different index files: default.html, index.htm, and index.html, in that order. We did not change 4D WebSTAR's configuration, since it already looks for all three files by default. The test directory, however, is over 30 items long. This represents a bad-case scenario for both servers.

4D WebSTAR and Apache use very different algorithms for this. 4D WebSTAR uses the data cache to retrieve a listing of what is in the directory and searches through the directory looking for a match.
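
To make the cost of such a search concrete, here is a minimal sketch of a listing scan that compares every directory entry against every candidate index name and stops early only when the highest-priority name is found. It is an illustration consistent with the description in this paper, not 4D WebSTAR's actual code; the function names and the comparison counter are hypothetical, and file names are compared case-sensitively for simplicity. A set-based lookup is included as a generic alternative for contrast.

```python
def find_index_scan(listing, candidates):
    """Return (best matching index name or None, number of comparisons made)."""
    best_rank = None
    comparisons = 0
    for entry in listing:
        for rank, name in enumerate(candidates):
            comparisons += 1
            if entry == name:
                if best_rank is None or rank < best_rank:
                    best_rank = rank
                break
        if best_rank == 0:
            break  # the top-priority name was found, nothing can beat it
    match = candidates[best_rank] if best_rank is not None else None
    return match, comparisons


def find_index_hashed(listing, candidates):
    """Generic alternative: one pass to build a set, then one lookup per name."""
    names = set(listing)
    for name in candidates:
        if name in names:
            return name
    return None


candidates = ["default.html", "index.htm", "index.html"]
listing = [f"file{i}.gif" for i in range(30)] + ["index.html"]
print(find_index_scan(listing, candidates))
print(find_index_hashed(listing, candidates))
```

With 30 unrelated files ahead of index.html and three candidate names, the scan makes 93 comparisons before it returns. The set-based version touches each directory entry once and each candidate name at most once.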
4D WebSTAR's algorithm takes just a little longer as the size of the directory increases, but if the directory and the list of file names to check are both quite large, then the number of comparisons to be done grows polynomially (with the product of the two sizes). Unless the best match is found, all possibilities will be examined.

Apache creates each possible URL and issues a subrequest to the server, which re-invokes all of the overhead of making serving decisions. This can be efficient if the first try is a "hit," but each additional try invokes this overhead again. Each try is very expensive, but the cost grows in a linear fashion. When configured for a best-case scenario (look for index.html first), Apache performs this test at the same speed as 4D WebSTAR.

The best way to improve performance for both servers is to stick to a convention of always using the same index file name and to put that name at the top of your configuration. In 4D WebSTAR's case, it also helps to not have too many files in your directory, but that won't be true for much longer. We plan to change index file selection to an algorithm that is still as fast as Apache's best case, but grows linearly.

DIRECTORY LISTINGS

This test used two different URLs for the two servers, found in webstar_index.txt and apache_index.txt. Both URLs cause a server to perform a similar task: to list the contents of a directory sorted by the last modification date of the files. This is another very common case that is taken for granted by webmasters and users, but it can be quite expensive to serve.

Again, there was a large difference between the two servers, with 4D WebSTAR outperforming Apache by about 150%.

The crucial difference here is how the two servers perform directory lookups. Apache reads directories from the file system, while 4D WebSTAR retrieves the list from its cache. The cache checks to see if the directory has changed recently, and will reload the information if necessary.
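
The idea of a directory cache that reloads only when the directory has changed can be sketched as follows. This is a hypothetical illustration, not 4D WebSTAR's implementation: the class name, the reload counter, and the choice of the directory's modification time as the change signal are all assumptions. Note that a directory's modification time changes when entries are added, removed, or renamed, but not when an existing file's contents change, so a real server would need a more careful check.

```python
import os


class DirListingCache:
    """Hypothetical cache mapping a path to (directory mtime, sorted entry names)."""

    def __init__(self):
        self._entries = {}
        self.reloads = 0  # counts how often the file system was actually read

    def listing(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is None or cached[0] != mtime:
            # First request, or the directory changed: re-read from the file system.
            with os.scandir(path) as it:
                files = [(entry.stat().st_mtime, entry.name) for entry in it]
            files.sort(reverse=True)  # most recently modified first
            cached = (mtime, [name for _, name in files])
            self._entries[path] = cached
            self.reloads += 1
        return cached[1]
```

Repeated calls to listing() for an unchanged directory cost one stat call each, while the full directory read and sort happen only on the first call and after a change.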
But most of the time the data will be coming directly from the cache.

To configure 4D WebSTAR for this test, we added a realm that allowed directory listings inside any URL beginning with "/subfolder/". Directory listings are turned off in 4D WebSTAR by default for security reasons. No configuration changes were necessary in Apache.

PHP

The URL for this test is contained in both_php.txt, which consists of a single URL: <http://www.yourserver.org/hello.php>. This file contains a single PHP command, phpinfo(), which outputs about 40k of information relating to PHP's configuration.

The two servers performed nearly the same on this test, with Apache edging out 4D WebSTAR by about 4% (195 versus 188 requests per second).

4D WebSTAR 5.1 includes FastCGI, a network protocol used between Web servers and external programs such as PHP. The advantage of such a setup is that it allows the external program to run in a separate process or even on a totally separate machine. This improves overall reliability and, in the case of using an additional machine, overall performance. We configured PHP to be used as a FastCGI on port 8081, and launched PHP running 8 child processes.
On Apache we ran PHP as a plug-in. To configure Apache we uncommented the PHP-related lines in the httpd.conf settings file.

FastCGI is a new feature in 4D WebSTAR V, and has not been optimized yet. Since we see this as a key technology for 4D WebSTAR's open architecture, this feature is high on our list for performance enhancements.

SEARCH

Full-text search is a very expensive service. So expensive, in fact, that many shops use a separate machine for search services. We therefore needed only 1 thread from 1 client to produce a load:

    ./http_load -parallel 1 -seconds 120 webstar_search.txt

4D WebSTAR includes a search facility based on the Onix engine from LexTek International. On the Apache side we chose Ht://dig because of its popularity with Apache users.

The test URLs for the two servers are in webstar_search.txt and apache_search.txt, respectively. For indexable material we used the Ht://dig Web site. Both URLs query the appropriate index for the term "htdig" and sort the results by relevance.

4D WebSTAR Search outperformed Apache/Ht://dig by an order of magnitude on this test. This is due both to the overhead of CGIs and to the fact that Onix is a particularly fast engine.

Configuring 4D WebSTAR for this test was a matter of adding the Ht://dig home page to the existing preinstalled index. We set the crawler to go three URLs deep into the Ht://dig Web site.

It took slightly longer to configure Ht://dig. After downloading the source from www.htdig.org, we did the following:

We edited the CONFIG file and changed the following lines:

    prefix= /Library/WebServer/htdig
    CGIBIN_DIR= /Library/WebServer/CGI-Executables
    IMAGE_DIR= /Library/WebServer/Documents/htdig
    SEARCH_DIR= /Library/WebServer/Documents/htdig

From the terminal we changed directory into the Ht://dig directory and then compiled/installed it.
You'll need an administrator password to do this:

    sudo make install

Next, open /Library/WebServer/htdig/htdig.conf and add the line:

    max_hop_count: 3

Then build the index from the terminal with:

    sudo /Library/WebServer/htdig/bin/rundig

FINDING YOUR BOTTLENECKS

First and foremost, you cannot serve faster than your network will allow. If your Internet connection is full of people who are trying to download installers from your Web site (as is our case when we release a new version of 4D WebSTAR), there isn't much you can do to make it all go faster except to add more network capacity. To run into server limitations you need more than a single T1 line.

Second, make sure you have enough RAM in your server. You can run the top command from the terminal and look at the pageins and pageouts numbers. If the numbers in parentheses are high all the time, then your server is constantly using the disk for virtual memory and you need either more RAM or fewer applications running.

Third, analyze your logs to find out what your server is doing and how much of a load different kinds of files are placing on your server. The following instructions should help you analyze and weigh each kind of URL you serve. The request type with the most weight is the one whose optimization would benefit your overall performance the most. Here are some suggestions for analysis:

1) Figure out what to look for, such as small files, large files, frequently-used plug-ins, or CGIs (e.g. search or dirindex). Depending on your setup and how much time you want to spend on this, you may have just a few categories or over a dozen.
2) Find out from your logs what percentage of your requests fall into each category.
3) Measure how many of that kind of request you can do per second. Don't try to simply average the time-taken from the logs, because some of the numbers are so small that they fall below the millisecond resolution of the log format. Instead, use a tool like http_load to get a decent measurement.
Also, use a good sampling of that
kind of file. Simply requesting the same one over and over won't cut it in the real world. You will want a pretty thorough representation of that kind of file. For example, if you are testing your PHP pages, you will want to use the pages that are requested most often as part of your test suite.
4) Pick a baseline number. We'll use 1000 for this example, simply because it gives us an idea of how many milliseconds each request takes. But you can use any number you like, even numbers like 1 or pi, as long as you always use the same number.
5) Then compute the weight of each kind of request by dividing the baseline by the measured requests per second and multiplying the result by the fraction of your mix taken up by that category of request. The kinds of requests with the highest numbers are the ones your server is spending the most time on.

weight = (baseline / req_per_sec) * load_percentage

- WebSTAR Search (1000/67)*.05 = 0.75
- WebSTAR Large Files (1000/293)*.15 = 0.51
- WebSTAR Small Files (1000/1291)*.55 = 0.43
- WebSTAR PHP (1000/188)*.05 = 0.27
- WebSTAR Index Files (1000/1069)*.15 = 0.14
- WebSTAR DirIndexer (1000/698)*.05 = 0.07

If this were a real-life server mix and you were a small business that felt that your server performance needed to be improved, then you might consider the following measures, in more or less this order:

1) If your server's CPU load is often high, move your search services to a separate machine.
2) If the network is frequently full, move large downloads to a higher-bandwidth server or consider buying more bandwidth.
3) Make sure your data cache is large enough to hold all of your most commonly used small files. If many of them are being read off of a disk, it will affect your overall performance.

CONCLUSION

Measuring Web server performance is a complex task. Network characteristics, file I/O, CPU performance, and the specifics of your particular load all contribute to your server's performance. But it is possible to determine which factors are most important to your situation.

4D WebSTAR not only provides tremendous ease of administration, but is also efficient at a large number of very common dynamic requests. It gives administrators access to standard, high-performance interfaces such as FastCGI, and built-in advanced services such as Search. This advantage in handling dynamic content can result in a very large performance advantage over Apache on MacOS X.

HTTP://WWW.WEBSTAR.COM/51
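
As a worked sketch of the weighting step described under FINDING YOUR BOTTLENECKS, the short script below recomputes the weights from the requests-per-second figures and mix percentages given in this paper and ranks the categories. The variable and function names are made up for this example; substitute your own measurements and mix.

```python
BASELINE = 1000  # any constant works, as long as it is used consistently

# category: (measured requests per second, fraction of the request mix)
MEASUREMENTS = {
    "Search": (67, 0.05),
    "Large Files": (293, 0.15),
    "Small Files": (1291, 0.55),
    "PHP": (188, 0.05),
    "Index Files": (1069, 0.15),
    "DirIndexer": (698, 0.05),
}


def weight(req_per_sec, load_fraction, baseline=BASELINE):
    """weight = (baseline / req_per_sec) * load_percentage"""
    return (baseline / req_per_sec) * load_fraction


# Highest weight first: these are the categories the server spends the most time on.
ranked = sorted(
    ((weight(rps, frac), name) for name, (rps, frac) in MEASUREMENTS.items()),
    reverse=True,
)

for w, name in ranked:
    print(f"{name:12s} {w:.2f}")
```

Because the baseline only scales every weight by the same factor, changing it does not change the ranking, which is why any consistent value can be used.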