1 How to Monitor Performance

Contents
1.1. Introduction
1.2. Performance - some theory
1.3. Performance - basic rules
1.4. Recognizing some common performance problems
1.5. Monitoring, and optimizing, the performance
1.5.1. Tools and mechanisms
1.5.2. Interpreting the request.log
1.5.3. Caching
1.5.4. Analyzing Search
1.5.5. Monitoring Performance using JVisualVM
1.5.6. Performance when loading and editing Digital Assets

1.1 Introduction

CQ5 encompasses several applications, and interacts with several more. Performance (or the lack of it) is one of the first things that your users notice, so, as with any application with a user interface, performance is of key importance.

To optimize the performance of your CQ5 WCM installation you need to monitor various attributes of the instance and its behavior. This is primarily of interest to power users, system administrators and project managers.

1.2 Performance - some theory

The problems that cause performance issues are often difficult to track down, even when their effects are easy to see.

A basic starting point is a good knowledge of your system when it is operating as normal. If you don't know how your environment "looks" and "behaves" when it is performing properly, it can be difficult to locate the problem when performance deteriorates. This means that you should spend some time investigating your system when it is running smoothly and ensure that collecting performance information is an ongoing task. This will provide you with a basis for comparison should the performance suffer.

The following diagram illustrates the path that a request for CQ5 content can take, and therefore the number of different elements that can impact the performance.
Figure 1.1. CQ5 request - the web-chain

Performance is also a balance between Volume and Capacity:

Volume
    The amount of output that is processed and delivered by the system.
Capacity
    The system's ability to deliver the volume.

This can be illustrated in various locations throughout the web-chain.
Figure 1.2. Capacity vs. Volume

There are several functional areas which are often responsible for impacting the performance:

- Caching
- Application (your project) code
- Search functionality

1.3 Performance - basic rules

Certain rules should be kept in mind when optimizing performance:

1. Performance tuning must be part of every project.
2. Do not optimize early in the development cycle.
3. Performance is only as good as the weakest link.
4. Always think about capacity vs. volume.
5. Optimize important things first.
6. Never optimize without realistic goals.

Note
Bear in mind that the mechanism you use to measure performance will often affect exactly what you are trying to measure. You should always try to account for these discrepancies, and eliminate as much of their effect as possible; in particular, browser plug-ins should be de-activated wherever possible.

1.4 Recognizing some common performance problems

The following table lists common performance issues, together with proposals on how to spot and counteract them.
Table 1.1. Recognizing common performance problems

Area | Symptom(s) | To increase capacity... | To reduce volume...
Client | High client CPU usage. | Install a client CPU with higher performance. | Simplify (HTML) layout.
 | Low server CPU usage. | Upgrade to a faster browser. | Improve client-side cache.
 | Some clients fast, some slow. | |
Server | | |
Network | CPU usage low on both servers and clients. | Remove any network bottlenecks. | Improve/optimize the configuration of the client cache.
 | Browsing locally on the server is (comparatively) fast. | Increase network bandwidth. | Reduce the "weight" of your web pages (e.g. fewer images, optimized HTML).
Web-server | CPU usage on the web-server is high. | Cluster your web-servers. | Reduce the hits per page (visit).
 | | Use a hardware load-balancer. |
Application | Server CPU usage is high. | Cluster your CQ5 instances. | Search for, and eliminate, CPU and memory hogs (use code review, timing output, etc).
 | High memory consumption. | | Improve caching on all levels.
 | Slow response times. | | Optimize templates and components (e.g. structure, logic).
Repository | | |
Cache | | |

1.5 Monitoring, and optimizing, the performance

Performance issues may stem from a number of causes that have nothing to do with your website, including temporary slowdowns in connection speed, CPU load, and many more. They may affect either all of your visitors or only a subset of them.

All this information needs to be obtained, sorted and analyzed before you can optimize the performance.
If you experience a performance issue:

- Try to replicate it with one (or preferably more) standard web-browsers, on a different client that you know has good general performance, and/or on the server itself (if possible).
- Check whether anything (related to the system) has changed within an appropriate time span, and whether any of these changes could have impacted the performance.
- Collect as much information as possible to compare with your knowledge of the system under normal circumstances.

1.5.1 Tools and mechanisms

The following gives a short overview of some of the tools available for monitoring performance.
Note
Some of these will be dependent on your operating system.

Table 1.2. Tools and mechanisms for monitoring performance

Tool | Used to analyze... | Usage / More information...
request.log | Response times and concurrency. | See Interpreting the request.log.
truss/strace | Page loads. | Unix command. Include the misc.truss log level to INFO.
Thread dumps | Observe JVM threads. Identify contentions, locks and long-runners. | Dependent on the operating system, e.g. kill -QUIT <pid> on Unix/Linux, Ctrl-Break on Windows.
System calls | Identify timing issues. | Calls to System.currentTimeMillis() or com.day.util.timing are used to generate timestamps from your code, or via HTML-comments. Note: These should be implemented so that they can be activated / deactivated as required; when a system is running smoothly the overhead of collecting statistics will not be needed.
Apache Bench | Identify memory leaks, selectively analyze response time. | Basic usage is: ab -k -n <requests> -c <concurrency> <url>. For full details: http://httpd.apache.org/docs/2.0/programs/ab.html
Search Analysis | Execute search queries offline, identify response time of query, test and confirm result set. | See Analyzing Search.
JMeter | Load and functional tests. | http://jakarta.apache.org/jmeter/
JProfiler | In-depth CPU and memory profiling. | http://www.ej-technologies.com/
JConsole | Observe JVM metrics and threads. | Usage: jconsole. Note: With JDK 1.6 JConsole is extensible with plug-ins; for example, Top or TDA (Thread Dump Analyzer).
JVisualVM | Observe JVM metrics, threads, memory and profiling. | Usage: jvisualvm. See Monitoring Performance using JVisualVM.
truss/strace, lsof | In-depth kernel call and process analysis (Unix). | Unix/Linux commands.
Timing Statistics | See timing statistics for page rendering. | To see timing statistics for page rendering you can use Ctrl-Shift-U together with ?debugclientlibs=true set in the URL.
1.5.2 Interpreting the request.log

This file registers basic information about every request made to CQ5. Valuable conclusions can be drawn from this information.
1.5.2.1 Monitoring traffic on your website

The request log registers each request made, together with the response made:

    09:43:41 [66] -> GET /author/y.html HTTP/1.1
    09:43:41 [66] <- 200 text/html 797ms

By totaling all the GET entries within specific periods (e.g. over various 24-hour periods) you can make statements about the average traffic on your website.

1.5.2.2 Monitoring response times with the CQ5 request.log

A good starting point for performance analysis is the request log. You can find the request log at <cq-installation-dir>/crx-quickstart/logs. The log looks as follows (the lines are shortened for simplicity):

    31/Mar/2009:11:32:57 +0200 [379] -> GET /path/x HTTP/1.1
    31/Mar/2009:11:32:57 +0200 [379] <- 200 text/html 33ms
    31/Mar/2009:11:33:17 +0200 [380] -> GET /path/y HTTP/1.1
    31/Mar/2009:11:33:17 +0200 [380] <- 200 application/json 39ms

This log has one line per request or response. Each line contains:

- The date at which each request or response was made.
- The number of the request, in square brackets. This number matches for the request and the response.
- An arrow indicating whether this is a request (arrow pointing to the right) or a response (arrow to the left).
- For requests, the line contains:
    - the method (typically GET, HEAD or POST)
    - the requested page
    - the protocol
- For responses, the line contains:
    - the status code (200 means success, 404 means page not found)
    - the MIME type
    - the response time

Using small scripts, you can extract the required information from the log file and assemble the statistics you want. From these, you can see which pages or types of pages are slow, and whether the overall performance is satisfactory.

1.5.2.3 Monitoring search response times with the CQ5 request.log

Search requests are also registered in the log file:

    31/Mar/2009:11:35:34 +0200 [338] -> GET /author/playground/en/tools/search.html?query=dilbert&size=5&dispenc=utf-8 HTTP/1.1
    31/Mar/2009:11:35:34 +0200 [338] <- 200 text/html 1562ms
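Entries like the ones shown in this section can be paired by request number and summarized per page with a short script. The following Python sketch is one possible approach, not a tool shipped with CQ5: it assumes the line format shown in the examples above, and the names pair_requests and summarize are purely illustrative.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the examples in this section:
#   <date> [<number>] -> <method> <url> <protocol>
#   <date> [<number>] <- <status> <mime type> <time>ms
REQUEST = re.compile(
    r"^(?P<ts>.+?) \[(?P<id>\d+)\] -> (?P<method>\S+) (?P<url>\S+) (?P<proto>\S+)$"
)
RESPONSE = re.compile(
    r"^(?P<ts>.+?) \[(?P<id>\d+)\] <- (?P<status>\d{3}) (?P<mime>.+?) (?P<ms>\d+)ms$"
)


def pair_requests(lines):
    """Yield (method, url, status, ms) for each request/response pair."""
    pending = {}  # request number -> (method, url), waiting for its response
    for line in lines:
        line = line.strip()
        m = REQUEST.match(line)
        if m:
            pending[m.group("id")] = (m.group("method"), m.group("url"))
            continue
        m = RESPONSE.match(line)
        if m and m.group("id") in pending:
            method, url = pending.pop(m.group("id"))
            yield method, url, int(m.group("status")), int(m.group("ms"))


def summarize(lines):
    """Return {path: (count, average_ms, max_ms)}, ignoring query strings."""
    times = defaultdict(list)
    for _method, url, _status, ms in pair_requests(lines):
        times[url.split("?", 1)[0]].append(ms)
    return {p: (len(v), sum(v) / len(v), max(v)) for p, v in times.items()}


# Sample lines copied from the log excerpts in this section.
sample = """\
31/Mar/2009:11:32:57 +0200 [379] -> GET /path/x HTTP/1.1
31/Mar/2009:11:32:57 +0200 [379] <- 200 text/html 33ms
31/Mar/2009:11:33:17 +0200 [380] -> GET /path/y HTTP/1.1
31/Mar/2009:11:33:17 +0200 [380] <- 200 application/json 39ms
31/Mar/2009:11:35:34 +0200 [338] -> GET /author/playground/en/tools/search.html?query=dilbert&size=5&dispenc=utf-8 HTTP/1.1
31/Mar/2009:11:35:34 +0200 [338] <- 200 text/html 1562ms
""".splitlines()

stats = summarize(sample)
# List the slowest pages first.
for path, (count, avg, worst) in sorted(stats.items(), key=lambda kv: -kv[1][2]):
    print(f"{worst:>6}ms max  {avg:>8.1f}ms avg  {count:>3}x  {path}")
```

To run this against a real log, pass an open file object instead of the sample list, for example summarize(open("request.log")). Grouping by path without the query string is a design choice that treats all searches as one page; keep the full URL if you want to compare individual queries.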
So, as above, you can use scripts to extract the relevant information and build up statistics.

1.5.2.4 Monitoring the numbers and impacts of concurrent users

Again the request.log can be used to monitor concurrency and the system's reaction to it. Tests must be made to determine how many concurrent users the system can handle before a negative impact is seen. Again scripts can be used to extract results from the log file:

- monitor how many requests are made within a specific time span, e.g. one minute
- test the effects of a specific number of users all making the same requests at (as close as possible) the same time; e.g. 30 users clicking Save at the same time

    31/Mar/2009:11:45:29 +0200 [333] -> GET /author/libs/personalize/content/statics.close.gif HTTP/1.1
    31/Mar/2009:11:45:29 +0200 [334] -> GET /author/libs/personalize/content/statics.detach.gif HTTP/1.1
    31/Mar/2009:11:45:30 +0200 [335] -> GET /author/libs/cfc/content/imgs/logo.rzmnurccynwctpcxyubnitcoibmmw000.default.gif HTTP/1.1
    31/Mar/2009:11:45:32 +0200 [335] <- 304 text/html 0ms
    31/Mar/2009:11:45:33 +0200 [334] <- 200 image/gif 31ms
    31/Mar/2009:11:45:38 +0200 [333] <- 200 image/gif 31ms
    31/Mar/2009:11:45:42 +0200 [336] -> GET /author/libs/cfc/content/imgs/logo.rzmnurccynwctzrxunqbbqtvuucmbrrbuwxz0000.default.gif HTTP/1.1
    31/Mar/2009:11:45:43 +0200 [337] -> GET /author/titlebar_bg.gif HTTP/1.1
    31/Mar/2009:11:45:43 +0200 [336] <- 304 text/html 0ms
    31/Mar/2009:11:45:44 +0200 [337] <- 304 text/html 0ms

1.5.2.5 Using rlog.jar to find requests with long duration times

CQ includes various helper tools located in <cq-installation-dir>/crx-quickstart/opt/helpers. One of these, rlog.jar, can be used to quickly sort request.log so that requests are displayed by duration, from longest to shortest time.

The following command shows the possible arguments:

    $ java -jar rlog.jar
    Request Log Analyzer Version 21584 Copyright 2005 Day Management AG
    Usage:
      java -jar rlog.jar [options] <filename>
    Options:
      -h               Prints this usage.
      -n <maxresults>  Limits output to <maxresults> lines.
      -m <maxrequests> Limits input to <maxrequest> requests.
      -xdev            Exclude POST request to CQDE.

For example, you can run it with the request.log file as a parameter to show the 10 requests with the longest duration:

    $ java -jar ../opt/helpers/rlog.jar -n 10 request.log
    *Info * Parsed 464 requests.
    *Info * Time for parsing: 22ms
    *Info * Time for sorting: 2ms
    *Info * Total Memory: 1mb
    *Info * Free Memory: 1mb
    *Info * Used Memory: 0mb
    ------------------------------------------------------
    18051ms 31/Mar/2009:11:15:34 +0200 200 GET /content/geometrixx/en/company.html text/html
    2198ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/cq/widgets.js application/x-javascript
    1981ms 31/Mar/2009:11:15:11 +0200 200 GET /libs/wcm/content/welcome.html text/html
    1973ms 31/Mar/2009:11:15:52 +0200 200 GET /content/campaigns/geometrixx.teasers..html text/html
    1883ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/security/cq-security.js application/x-javascript
    1876ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/tagging/widgets.js application/x-javascript
    1869ms 31/Mar/2009:11:15:20 +0200 200 GET /libs/tagging/widgets/themes/default.js application/x-javascript
    1729ms 30/Mar/2009:16:45:56 +0200 200 GET /libs/wcm/content/welcome.html text/html; charset=utf-8
    1510ms 31/Mar/2009:11:15:34 +0200 200 GET /bin/wcm/contentfinder/asset/view.json/content/dam?_dc=1238490934657&query=&mimetype=image&_charset_=utf-8 application/json
    1462ms 30/Mar/2009:17:23:08 +0200 200 GET /libs/wcm/content/welcome.html text/html; charset=utf-8

Note
You may need to concatenate the individual request.log files if you need to do this operation on a large data sample.

1.5.3 Caching

The following diagram shows the different cache locations that can be used for the various content types.

Figure 1.3. What can be cached where?

The following can act as a rough guide for target values:
Figure 1.4. Cache vs. Uncached - maximum hits / second

Although there are many algorithms to ensure that data is retrieved from the source system when appropriate, circumstances can arise where the data residing in a cache is out of date. Retrieving every page individually is the only guaranteed method of ensuring your content is up-to-date, but it is very costly in terms of response time, and can cause knock-on effects. This is particularly relevant when using personalized pages, where at least some content of a page is dependent on the user, and the account they used to log in.

Figure 1.5. Cache speed vs. Data Integrity
1.5.3.1 Optimizing your content for cache performance

Make sure you use realistic cache settings for the browser cache. If you have disabled the browser cache for development, this may increase traffic and decrease responsiveness.

1.5.4 Analyzing Search

The first steps in analyzing the search function can be made with Monitoring search response times with the CQ5 request.log. However, once you have determined the response time, you may need to analyze why the request is taking the time it does, and what can be done to improve the response.

Further information about the underlying search functionality of CRX can be found at Searching in CRX.

1.5.5 Monitoring Performance using JVisualVM

The jvisualvm command has been available since JDK 1.6. After you have installed JDK 1.6 you can:

1. Either:
    a. Start your CQ5 instance using the -jconsole option, or
    b. Add the -Dcom.sun.management.jmxremote argument to the java command line that starts your JVM.
2. Run jvisualvm (normally found in the JDK 1.6 bin folder).
3. From within the Local application, double-click com.day.crx.quickstart.main.

Note
You can use this tool to generate thread dumps and memory heap dumps. This information is often requested by the technical support team.

1.5.6 Performance when loading and editing Digital Assets

Due to the large volume of data involved when loading and editing digital assets, performance can become an issue.
Two things affect performance here:

- CPU - multiple cores allow for smoother work when transcoding
- Hard disk - parallel RAID disks achieve the same

To improve performance you can consider the following:

- How many assets are going to be uploaded per day? A good estimate can be based on:
    - The timeframe in which edits will be made (typically the length of the working day, more for international operations).
    - The average size of images uploaded (and the size of renditions generated per image) in megabytes.
- Determine the average data rate. 80% of all edits will be made in 20% of the time, so in peak time you will have 4 times the average data rate. This is your performance goal.
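The estimate above can be worked through with a few lines of Python. This is a rough sketch under stated assumptions: it takes the daily volume to be the number of assets multiplied by the average upload size plus the rendition size, spread evenly over the editing timeframe. The function name peak_data_rate and all sample figures are hypothetical and should be replaced with values measured on your own system.

```python
def peak_data_rate(assets_per_day, avg_asset_mb, renditions_mb, working_hours):
    """Return (average, peak) data rate in megabytes per second."""
    # Assumed daily volume: each asset contributes its upload plus its renditions.
    total_mb = assets_per_day * (avg_asset_mb + renditions_mb)
    average = total_mb / (working_hours * 3600)
    # 80% of the edits in 20% of the time: 0.8 / 0.2 = 4 times the average rate.
    peak = average * (0.8 / 0.2)
    return average, peak


# Hypothetical figures, for illustration only.
average, peak = peak_data_rate(
    assets_per_day=1000, avg_asset_mb=5, renditions_mb=4, working_hours=8
)
print(f"average: {average:.4f} MB/s, peak: {peak:.4f} MB/s")
```

With these made-up figures, 9000 MB spread over 8 hours gives an average of 0.3125 MB/s and a peak of 1.25 MB/s, which would then be the throughput the CPU and disks need to sustain.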