Towards Query Caching in a Web Application Proxy

Sara Sprenkle
sprenkle@cs.duke.edu

January 10, 2002

Abstract

The bottleneck of Web content delivery is not serving static content but dynamic content, which requires data processing. We propose using a database cache to offload the demand on a database-driven Web service. This paper presents the design of a database cache and preliminary results on the overhead of maintaining such a cache.

1 Introduction

The web has evolved from an archive of distributed, static information into an interactive, wide-area application service provider. This evolution requires new techniques for improving web performance, as measured by client-perceived latency. One approach to maintaining low latency is to scale these applications. Web-based services that generate dynamic content do not receive the performance benefits of traditional static content caching. Dynamic Web services respond to user requests on demand by executing programs that return documents created from data stored on the server and elsewhere, the state of the server, and other information resources. The generated documents cannot be cached because they were created in response to a specific request; furthermore, the same request may return a different response because of changes to the data used to create the response, changes in the server's state, and so on. There have been many proposals for Internet infrastructures to improve the delivery of static content. The difficult question is determining which infrastructure to use. We will consider this question for a common scenario: many client requests for dynamic content from a Web application server, as depicted in Figure 1. We believe that caching the data used to generate dynamic content is a promising approach to improving client-perceived latency of dynamic content.
The problem with response caches is that they do not handle dynamic content; a proxy stores the response to a specific request but does not know how the response was generated. A response cache is unaware of the underlying data that was used to generate the document and therefore does not know when a cached response is stale because of changes to the data. By caching the data, we push the data closer to the users, decreasing response latency, network utilization, and Web application server load. Furthermore, the cache is an alternate, if stale, data source when the primary database is unreachable from a client because of system or network failure. Figure 2 illustrates the two cases for cache placement: the cache resides on the same node as the database, or the cache is placed into the network. We first experiment with placing the cache on the same node as the original database. We will address the issues of putting the cache in the network in future work.
Figure 1: Goal: An Internet architecture with the best performance, availability, and scalability as viewed by clients.

This work is a preliminary study used to guide future work in wide-area caching infrastructures. The infrastructure may or may not be database-driven. In the rest of the paper, we introduce our cache design and present the implemented prototype in Section 2. We discuss the overhead of maintaining a cache by analyzing the results of our experiments in Section 3. We present some ideas for continuing this work in Section 5.

2 Our Approach

We chose to use a semantic database cache. Support for using semantic caches is offered by the authors in [11]. We make some simplifying assumptions about the data and the queries on the data. We assume that there are no updates to the data, so we can ignore data consistency issues; we envision using an infrastructure like Ivory [15] to automatically handle consistency. Queries are SELECT-PROJECT-JOIN. Database accesses are divided into two categories: cacheable and non-cacheable. Non-cacheable accesses, which include creating and updating data, are directed to the primary. When possible, cacheable accesses are handled by the cache; if the required data is not resident at the cache, the request is redirected to the primary database. A cache can handle a request in two cases. If the query does a search by key, the cache first attempts to answer the query; if the cache does not contain the necessary data, the cache faults in the required tuples. If the query does a search on non-key fields, the cache first checks whether it has answered this query before. If it has, it computes and returns the response. Otherwise, it requests the result from the primary database.
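A minimal sketch of this dispatch logic, using in-memory maps in place of the prototype's DB2 tables (all class and method names here are hypothetical; the paper does not show the prototype's code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Toy model of the cache-side routing described above. A Map stands in for
// the primary database; `resident` holds tuples faulted into the cache, and
// `answered` records the query strings of previously answered non-key queries.
class QueryCacheSketch {
    final Map<Integer, String> primary;                 // stand-in for the primary DB
    final Map<Integer, String> resident = new HashMap<>();
    final Set<String> answered = new HashSet<>();
    int hits = 0, misses = 0;

    QueryCacheSketch(Map<Integer, String> primary) {
        this.primary = primary;
    }

    // Search by key: answer locally if the tuple is resident,
    // otherwise fault the full tuple in from the primary.
    String searchByKey(int key) {
        if (resident.containsKey(key)) {
            hits++;
            return resident.get(key);
        }
        misses++;
        String tuple = primary.get(key);
        resident.put(key, tuple);   // fault in the full tuple
        return tuple;
    }

    // Search on non-key fields: answerable locally only if this exact query
    // string was answered before; otherwise go to the primary and remember it.
    List<String> searchNonKey(String sql, Predicate<String> matches) {
        List<String> out = new ArrayList<>();
        if (answered.contains(sql)) {
            hits++;
            for (String t : resident.values())
                if (matches.test(t)) out.add(t);
            return out;
        }
        misses++;
        for (Map.Entry<Integer, String> e : primary.entrySet())
            if (matches.test(e.getValue())) {
                out.add(e.getValue());
                resident.put(e.getKey(), e.getValue());  // cache the result tuples
            }
        answered.add(sql);
        return out;
    }

    public static void main(String[] args) {
        QueryCacheSketch cache =
            new QueryCacheSketch(Map.of(1, "alpha", 2, "beta", 3, "beam"));
        cache.searchByKey(1);                 // miss: faults tuple 1 in
        cache.searchByKey(1);                 // hit: tuple 1 is now resident
        String q = "SELECT * FROM t WHERE v LIKE 'be%'";
        cache.searchNonKey(q, v -> v.startsWith("be"));  // miss: goes to primary
        cache.searchNonKey(q, v -> v.startsWith("be"));  // hit: query seen before
        System.out.println("hits=" + cache.hits + " misses=" + cache.misses);
        // prints hits=2 misses=2
    }
}
```

The exact-string matching on `answered` mirrors the hashtable of executed SQL strings described in Section 2.1; a real implementation would normalize or semantically compare queries rather than compare raw strings.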
When the cache needs to fault tuples, it requests full tuples from the primary database, ignoring projections and ordering. The assumption is that if one attribute of a tuple is requested, other attributes of that tuple will be accessed soon. Similar reasoning was also used in [13] for determining optimal cache plans. Furthermore, since the cache processes the query locally to handle projections, it will also handle ordering at the same time.

Figure 2: Place a semantic cache of the database into the network.

We considered having the cache use a remainder query to request from the primary only the results missing from the cache, thereby reducing communication, but this proved to be too computationally complex. The cache has to tell the primary what it has; a primary cannot be responsible for maintaining this information, especially when there could be multiple caches and each cache implements its own eviction policy. The cache can either tell the primary explicitly what is resident by sending the relevant tuples, or it can describe what it has by sending its list of answered queries. The reduced communication overhead did not warrant the cost of computing the remainder query.

Our approach is extensible in that it does not depend on the data format or the location of the cache. We can apply these ideas to the wide area and to other data formats if we determine that they are more appropriate for a set of applications.

2.1 Implementation Details

Our database cache consists of two pieces: a Java frontend for cache management and a DB2 database for storage. The frontend is written in Java 1.3, runs in the Java environment [12], and uses a JDBC driver to communicate with the database. We used ZQL [2], an SQL parser written in Java, to parse and recreate SQL statements easily. We used a hashtable to store the executed queries; the hashtable stores the SQL string, and string matching determines whether a query was previously executed.

3 Methodology and Results

In this section, we discuss our experiments for testing the performance of the cache. We tested our cache with a representative e-commerce application: an on-line bookstore based on the TPC-W benchmark [1].

3.1 TPC-W Implementation

A freely available, Java-based TPC-W release [6] was developed at the University of Wisconsin. The release includes Java data structures, servlets, and testing software.
We augmented the TPC-W release to use a database cache.
  Parameter   Comments                           Value
  NUM_ITEMS   Number of "books" in database      1
  NUM_RBE     Number of load generators          5
  MIX         Browsing or Shopping or Ordering   Browsing
  RAMP_TIME   Startup & shutdown allowance       10 sec
  INTERVAL    Measurement interval               600 sec

Table 1: TPC-W parameters used in all experiments.

  Statistic (in tuples)    Count
  Cache Access             33152
  Cache Hit                31896
  Cache Miss                1256
  Average Cache Hit Rate   .9645

Table 2: Example cache statistics; average cache hit rate is averaged over ten experiments.

The servlets are cleanly separated from database access. Of the 14 servlets, five contained only non-cacheable data accesses: Best Sellers, Customer Registration, Buy Confirm, Order Display, Admin Confirm.

3.2 Experiments

The cache/database and the Web server (Java's WebServer) are housed on separate Linux machines on the same LAN. For our initial experiments, we used a relatively small database, scaled by the test parameters summarized in Table 1. We chose the browsing mix to reduce the number of updates to the database. We tested our cache using the remote emulated browser provided with the Wisconsin TPC-W release. We ran the experiment several times with and without the cache so that we could compare the overhead incurred by maintaining the cache.

In another set of experiments, we tracked cache statistics. Cache misses are the tuples that are faulted in from the primary and were not already resident in the cache. Cache hits are successful key-field and non-key-field searches. Before the cache requests a result from the primary database, it computes the result of the query in its cache; the tuples in this result are also considered cache hits.

3.3 Results

Figures 3 and 4 show the results for application response time without and with the cache, respectively. We note that the communication overhead of redirecting non-cacheable accesses to the primary is relatively low, approximately one second. The New Product Search and Search Response, as expected, have long tails.
The first request for each search is slow when O(100) tuples are faulted in; however, subsequent searches are much faster. Over 50% of the search responses take less than 1 second, which is consistent with the primary database. Example cache hit results, measured in tuples, are in Table 2. Since an eviction policy was not implemented, the cache hit rate increased the longer the experiment ran. The statistics will be more meaningful once a cache eviction policy is implemented.
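As a consistency check on Table 2, the per-run hit rate is simply hits divided by accesses, counted in tuples; the .9645 in the table is the average over ten runs, while the single run shown works out to about .9621:

```java
// Recomputes the Table 2 statistics: accesses = hits + misses, and the
// per-run cache hit rate is hits / accesses, all counted in tuples.
class CacheStats {
    static double hitRate(long hits, long accesses) {
        return (double) hits / accesses;
    }

    public static void main(String[] args) {
        long accesses = 33152, hits = 31896, misses = 1256;  // Table 2 values
        System.out.println(hits + misses == accesses);       // prints true
        System.out.printf("hit rate = %.4f%n", hitRate(hits, accesses));
        // prints hit rate = 0.9621
    }
}
```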
Figure 3: Application response time, in seconds, without caching.

3.4 Discussion

The semantic cache is a promising approach to serving dynamic content faster. The overhead of loading a cache is relatively low compared to the benefits of caching. We can implement several different eviction policies to determine how to efficiently utilize the cache.

4 Related Work

We believe that the pieces for creating a database-driven Web application proxy are in place; putting the appropriate pieces together is the challenge. There is a rich body of work related to the problem of serving dynamic content efficiently. The primary difference among these approaches is the caching granularity.

4.1 Database Cache

We primarily used two papers to guide our effort. The authors of [14] and [11] both discuss using remainder queries to bring missing results to a cache. The authors of [11] effectively argue the case for semantic caches, backed by simulation results; to the best of our knowledge, they did not implement their approach in a database.

4.2 Datastore Cache

Cao's Active Cache [7] is a Java-based approach to caching dynamic content. The Active Cache implementation utilizes Java's security features to perform computation on cached documents. A cache applet is attached to a document so that proxies, without server interaction, can perform the necessary processing on a cached document to make it current. IBM's trigger monitor [8, 9] pre-generates the dynamic responses, e.g., HTML pages, to user requests.
The responses' dependencies on underlying data are used to determine when a response should be regenerated: when data changes occur, the responses that depend on the changed data are regenerated. Pre-generating pages greatly decreases the workload on the server, and since disk space is now inexpensive, pre-generating large numbers of pages is an option. However, if data changes at a much higher rate than the pages containing the data are requested, or if the data affects many pages that probably will not be requested, pre-generating the pages wastes resources.

Figure 4: Application response time, in seconds, with caching.

5 Future Work

The preliminary results are promising and encourage further work in this direction. Our algorithm for cache management is very simple; by adding some more "smarts" to the system, we could improve the cache hit rate and, thus, reduce the application response time. For example, two separate searches (title CONTAINS Baltimore, title CONTAINS Orioles) could cover another search (title CONTAINS Baltimore AND Orioles). This is one of many checks we could do before querying the primary.

We did not restrict the size of the cache. We believe that we can use the ideas proposed in [11] to manage our cache based on queries. Evicting based on queries will allow the cache to make good eviction decisions based upon the commonly asked queries. We can run experiments to determine the most effective caching strategies when presented with different query mixes, such as those suggested by the TPC-W benchmark. To improve the performance of our cache, we can use the cache optimization techniques proposed in [13] and others.

After we resolve basic database cache management issues, we want to extend these ideas to the wide area. Serving queries from database caches or Java datastores introduces security, trust, and resource allocation issues that are being researched by related efforts [3, 4, 10, 16, 17].
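One such check can be sketched as follows: if the cache already holds the result sets for the two single-term searches, a conjunctive search can be answered by intersecting them rather than querying the primary. This is a simplified sketch under our own assumptions; the class name and the string-keyed result store are hypothetical, not part of the prototype:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Answering `title CONTAINS Baltimore AND Orioles` from the cached results of
// `title CONTAINS Baltimore` and `title CONTAINS Orioles` by set intersection.
class ContainmentSketch {
    // Cached result sets, keyed by the single-term predicate that produced them.
    static final Map<String, Set<String>> cached = new HashMap<>();

    // Answer a conjunction of terms locally if every term was answered before;
    // return null when some term is uncached and the primary must be queried.
    static Set<String> answerConjunction(List<String> terms) {
        Set<String> result = null;
        for (String term : terms) {
            Set<String> partial = cached.get(term);
            if (partial == null) return null;       // fall back to the primary
            if (result == null) result = new HashSet<>(partial);
            else result.retainAll(partial);         // set intersection
        }
        return result;
    }

    public static void main(String[] args) {
        cached.put("Baltimore",
            new HashSet<>(List.of("Baltimore Blues", "Baltimore Orioles Yearbook")));
        cached.put("Orioles",
            new HashSet<>(List.of("Baltimore Orioles Yearbook", "Orioles Field Guide")));
        System.out.println(answerConjunction(List.of("Baltimore", "Orioles")));
        // prints [Baltimore Orioles Yearbook]
    }
}
```

This covers only conjunctions of exact, previously seen terms; general query containment for SELECT-PROJECT-JOIN queries is considerably harder, which is why the prototype starts with exact string matching.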
In this work, we did not address cache consistency issues, but we believe that we can use an infrastructure like Ivory [5, 15] to automatically handle consistency management. Allowing updates will affect the overall performance of the system, but one can envision several different policies to decrease the overhead of handling updates. For example, if the primary receives few updates, then with the mechanisms in Ivory it can push those updates to the caches so that a cache does not need to probe Ivory. Using Ivory, a cache could also handle local data reads and writes. An example of local data in the TPC-W suite is shopping carts. The application on the cache could handle browsing and adding items
into the cart; only when the customer chooses to buy the items does the primary database get notified of the transactions. Furthermore, we could use Ivory to handle eviction and also to explore the benefits of prefetching.

6 Conclusions

We have implemented a simple database cache and shown that there is relatively low overhead involved with loading a cache. The benefits of using a cache, such as reducing the load on the primary database, decreasing client-perceived latency, reducing network utilization, and improving application availability, outweigh the costs of maintaining a cache. There are many obvious directions for future research in cache eviction and consistency policies. We also plan to extend this work into the wide area.

Acknowledgments

We would like to thank Kevin Walsh for his extensions to the Wisconsin release of TPC-W to correct the code to match the TPC-W specifications, to fix some database inefficiencies and miscellaneous bugs, and to generalize the code for use with multiple databases.

References

[1] Transaction Processing Performance Council. http://www.tpc.org/tpcw/.

[2] ZQL. http://dyade.inrialpes.fr/membres/gibello/zql/zql.html.

[3] G. Banga, P. Druschel, and J. C. Mogul. Resource containers: A new facility for resource management in server systems. In Third Symposium on Operating Systems Design and Implementation, February 1999.

[4] E. Belani, A. Vahdat, T. Anderson, and M. Dahlin. The CRISIS wide area security architecture. In USENIX Security Symposium, January 1998.

[5] Geoff C. Berry, Jeffrey S. Chase, Landon P. Cox, and Amin Vahdat. Toward automatic state management for dynamic web services. In 1999 Network Storage Symposium, October 1999.

[6] Harold W. Cain, Ravi Rajwar, Morris Marden, and Mikko H. Lipasti. An architectural evaluation of Java TPC-W. In The Seventh International Symposium on High-Performance Computer Architecture, January 2001.

[7] P. Cao, J. Zhang, and K. Beach. Active cache: Caching dynamic contents.
In IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing (Middleware '98), The Lake District, England, September 1998.

[8] J. Challenger, A. Iyengar, and P. Dantzig. A scalable system for consistently caching dynamic data. In IEEE INFOCOM '99, March 1999.

[9] J. Challenger, A. Iyengar, K. Witting, C. Ferstat, and P. Reed. A publishing system for efficiently creating dynamic web content. In IEEE INFOCOM 2000, Tel-Aviv, Israel, March 2000.
[10] G. Czajkowski and T. von Eicken. JRes: A resource accounting interface for Java. In 1998 ACM OOPSLA Conference, October 1998.

[11] Shaul Dar, Michael J. Franklin, Björn Þór Jónsson, Divesh Srivastava, and Michael Tan. Semantic data caching and replacement. In The VLDB Journal, pages 330-341, 1996.

[12] James Gosling, Bill Joy, and Guy Steele. The Java Language Specification. Addison-Wesley, Reading, Massachusetts, 1996.

[13] Laura M. Haas, Donald Kossmann, and Ioana Ursu. Loading a cache with query results. In VLDB Conference, Edinburgh, Scotland, September 1999.

[14] Arthur M. Keller and Julie Basu. A predicate-based caching scheme for client-server database architectures. VLDB Journal: Very Large Data Bases, 5(1):35-47, 1996.

[15] Sara Sprenkle and Jeff Chase. Scaling Java-based dynamic web services. Technical Report CS-20-02, Duke University, Department of Computer Science, May 20.

[16] Amin Vahdat. Toward wide-area resource allocation. In Parallel and Distributed Processing Techniques and Applications, June 1999.

[17] D. S. Wallach, D. Balfanz, D. Dean, and E. W. Felten. Extensible security architectures for Java. In 16th ACM Symposium on Operating Systems Principles, pages 116-128, October 1997.