Reverse proxy for Tomcat
Project Plan

Anders Nyman
d03any@efd.lth.se
June 10, 2005
Version 1.2

This project is now implemented; binary and source code can be found at http://j2ep.sourceforge.net

1 Introduction

A reverse proxy is easily confused with a forward proxy, which allows clients a proxy connection to the Internet. A reverse proxy instead sits between your application servers and the Internet, providing a proxy for all incoming connections to your application.

Why do you need this kind of proxy? Isn't it enough to have your application server directly connected behind a firewall? No, there are many advantages gained when using a reverse proxy. Among them are that you have a single point of entry where you can do all the filtering needed, and that you can add new application servers seamlessly or change their locations without the client noticing anything different. A reverse proxy can also be used as a caching system; this is, however, not the specific goal of this project.

Even though there are many benefits of using a reverse proxy, there are also some negative aspects, such as the risk of your proxy going down and taking the entire application with it, or the fact that requests will be handled a bit slower through a proxy than when a client talks directly to the server.

There are already methods one can use to get a reverse proxy with Tomcat. One popular way is by using Apache and mod_proxy. However, there is a need for a simple, configurable proxy where you can add your own set of rules. Some people might also not feel comfortable using Apache, or think it is an unnecessary install just to get the proxy support. Therefore a web application implementing a reverse proxy will be made by extending the Rules and RulesChain setup from the balancer webapp.

2 Glossary

Balancer - The balancer webapp developed by Yoav Shapira and shipped with later versions of Tomcat 5.X.

Rule - Reference to the Rule class used in the balancer webapp. More generally, a condition that, if met, tells the proxy which application server to use.

RulesChain - A list of rules that is traversed to find a matching one.
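
To make the Rule and RulesChain terms concrete, here is a minimal sketch in plain Java. It is not the balancer webapp's code: the names ProxyRequest, ProxyRule, PathPrefixRule, IpPrefixRule and ProxyRuleChain, and the server URLs, are hypothetical and chosen only for illustration. The sketch also assumes that the first matching rule decides the target server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical, simplified request: only the fields the example rules inspect. */
final class ProxyRequest {
    final String clientIp;
    final String path;

    ProxyRequest(String clientIp, String path) {
        this.clientIp = clientIp;
        this.path = path;
    }
}

/** Hypothetical rule: a condition plus the server to use when the condition is met. */
interface ProxyRule {
    boolean matches(ProxyRequest request);
    String targetServer();
}

/** Matches when the request path starts with a given prefix. */
final class PathPrefixRule implements ProxyRule {
    private final String prefix;
    private final String target;

    PathPrefixRule(String prefix, String target) {
        this.prefix = prefix;
        this.target = target;
    }

    public boolean matches(ProxyRequest request) { return request.path.startsWith(prefix); }
    public String targetServer() { return target; }
}

/** Matches when the client IP starts with a given prefix, e.g. "217.213.". */
final class IpPrefixRule implements ProxyRule {
    private final String ipPrefix;
    private final String target;

    IpPrefixRule(String ipPrefix, String target) {
        this.ipPrefix = ipPrefix;
        this.target = target;
    }

    public boolean matches(ProxyRequest request) { return request.clientIp.startsWith(ipPrefix); }
    public String targetServer() { return target; }
}

/** An ordered list of rules; in this sketch the first matching rule wins. */
final class ProxyRuleChain {
    private final List<ProxyRule> rules = new ArrayList<>();

    ProxyRuleChain add(ProxyRule rule) {
        rules.add(rule);
        return this;
    }

    /** Walks the rules in order and returns the target of the first match, if any. */
    Optional<String> resolve(ProxyRequest request) {
        for (ProxyRule rule : rules) {
            if (rule.matches(request)) {
                return Optional.of(rule.targetServer());
            }
        }
        return Optional.empty();
    }
}

final class RuleChainDemo {
    public static void main(String[] args) {
        ProxyRuleChain chain = new ProxyRuleChain()
                .add(new IpPrefixRule("217.213.", "http://server-a.example:8080"))
                .add(new PathPrefixRule("/shop", "http://server-b.example:8080"));

        ProxyRequest request = new ProxyRequest("10.0.0.5", "/shop/cart");
        System.out.println(chain.resolve(request).orElse("no matching rule"));
    }
}
```

Because the chain is ordered, a request that satisfies several rules goes to the server of whichever rule was added first, so more specific rules would normally be placed ahead of general ones.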

3 Project Organisation

This is a one-man project by Anders Nyman, computer science student at Lunds Tekniska Högskola, located in Lund, Sweden. Visit http://www.lth.se/english/ and http://www.cs.lth.se/index.shtml.en for more information.

3.1 Bio

As a computer science student I like to have some side projects besides the schoolwork. Lately I have been coding a J2EE framework, sort of like Struts but with everything defined directly in a database instead of XML, so that I can easily add web pages and change their permissions using a web interface. You might ask yourself why I didn't start by extending Struts, but I really wanted to get down and try out some MVC things like using filters instead of servlets as the controller. Plus it's not as fun to start using Struts compared to coding your own MVC framework.

I have a good bit of experience coding J2EE applications with filters and servlets for Tomcat using an object-oriented approach, and overall good experience coding Java. I also have some previous experience with filtering requests and think a reverse proxy might come in handy for my own work. This makes me a perfect candidate for implementing a reverse proxy.

4 Work Description

The basis for the reverse proxy will be a new webapp that depends on the existing balancer webapp for its rules, rule chaining and rules parser. The Rule classes might be refactored out of the balancer and into a common package so that both applications share a common pool of classes. Development will be incremental at large, with a bit of reuse when coding the rules. The big increments are coding the rules and creating the proxy.

There are many steps to take in order to implement a reverse proxy, and a specific division of the steps is quite hard to make. Even so, the following tasks have been identified.

T1 - Code a proxy. A reverse proxy obviously has to be made; this task, however, only consists of coding a quick sample filter that proxies all requests to a specified web server. Several implementations might be tried at the start in order to identify one that performs well.

T2 - Identify and extract from balancer. Identify components in balancer that can be used for the proxy and extract them.

T3 - Make XML digestion work. Make sure the proxy is easily configurable using a balancer type of mapping.

T4 - Basic rules. Create a set of basic rules like IP filtering, filtering on query string, etc.

T5 - Integrate proxy with rules. The proxy will have to be integrated with the rules. It should use the RulesChain or an equivalent to identify which rule is met.

T6 - Documentation. Detailed documentation on how to set up the webapp, how to configure the included rules and how to write your own rules is to be written, following the guidelines for Tomcat documentation.

T7 - Composite Rule. A composite rule is a rule consisting of one or more other rules, so that several conditions can be combined. For example, requests should be sent to a specific server if the IP is in range 217.213.x.x and the request is made on a Friday; on a Thursday the same request should be sent to another server. This is a silly example, but there is a need for a composite rule.

T8 - XML configuration for composite rule. The XML layout will probably have to look a bit different if a composite rule is made; this might require changes to the XML digester.

A special note concerning T6 - Documentation: this task will be worked on continuously during the project. Instead of doing a big bang of documentation at the end, the documentation for each task will be written when that task is completed. This documentation also includes complete javadoc for all classes and methods.

4.1 Deliverables

A working reverse proxy with a set of rules for at least IP, URL, time and date, as well as a composite rule, is to be delivered. Documentation for the entire webapp, including javadoc, shall also be included in the release.

5 Time Schedule

Each task is listed with the time it will take to complete and the tasks it depends on. Two releases are set for the project: one on the twenty-ninth of July, in order to have an early release ready for OSCON, and a second and final release on the twenty-sixth of August, leaving a few days to fix any bugs before the first of September, when the work should be finished. These release dates are set in stone and will not change; however, bear in mind that the durations are estimates and all tasks might not be fully completed for the releases.

Task  Duration (days)  Dependencies
T1    6                None
T2    2                None
T3    4                T2
T4    4                T2
T5    4                T1, T4
T6    8                None
T7    2                T4
T8    4                T7

Included in the first release are tasks T1-T3, T5 and a start of T4. For the final release T4 is completed and T6-T8 are added.

6 Recording process

The progress of every task will be recorded: how long it took to finish and any problems encountered while implementing it.
The recording document can serve as a basis for future estimates and also helps make sure that tasks are being completed.
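
The durations and dependencies in the time schedule can also be checked mechanically, which gives a baseline to compare the recorded times against. The following is only a sketch under stated assumptions: the class ScheduleEstimate and its methods are hypothetical helpers, not part of the project, and the numbers are copied from the task table. It computes the total effort when one person does the tasks one at a time, and the earliest day a task could finish if it started as soon as its dependencies were done.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that sums and chains the estimates from the task table. */
final class ScheduleEstimate {
    /** Task durations in days, copied from the time schedule. */
    static final Map<String, Integer> DURATION = new LinkedHashMap<>();
    /** Direct dependencies of each task, copied from the time schedule. */
    static final Map<String, String[]> DEPENDS_ON = new LinkedHashMap<>();

    static {
        add("T1", 6);
        add("T2", 2);
        add("T3", 4, "T2");
        add("T4", 4, "T2");
        add("T5", 4, "T1", "T4");
        add("T6", 8);
        add("T7", 2, "T4");
        add("T8", 4, "T7");
    }

    private static void add(String task, int days, String... dependencies) {
        DURATION.put(task, days);
        DEPENDS_ON.put(task, dependencies);
    }

    /** Total effort in days when every task is done in turn by one person. */
    static int totalDays() {
        int sum = 0;
        for (int days : DURATION.values()) {
            sum += days;
        }
        return sum;
    }

    /** Earliest finish day of a task that starts right after its last dependency. */
    static int earliestFinish(String task) {
        int start = 0;
        for (String dependency : DEPENDS_ON.get(task)) {
            start = Math.max(start, earliestFinish(dependency));
        }
        return start + DURATION.get(task);
    }

    public static void main(String[] args) {
        System.out.println("Total effort: " + totalDays() + " days");          // 34
        System.out.println("Earliest finish of T8: day " + earliestFinish("T8")); // 12
    }
}
```

Since this is a one-man project, the sum of 34 days is the more relevant figure; the dependency chain T2, T4, T7, T8 only sets a lower bound of 12 days on how early the composite rule configuration could be finished.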

7 Revision history

June 7 2005 - Initial release.
June 8 2005 - Spelling and wording corrected.
June 10 2005 - Switched to LaTeX for a nicer layout.