Monitoring System Status

CHAPTER 14

This chapter describes how to monitor the health and activities of the system. It covers these topics:

- About Logged Information
- Event Logging
- Monitoring Performance

About Logged Information

The ACE Web Application Firewall and Manager include a rich set of features for monitoring system activities. These features include the Manager Dashboard, which presents customizable views of dynamic traffic statistics; the Performance Monitor; extensive error logging; the audit log, which shows policy changes in the Manager; and the incidents report.

This chapter describes the monitoring tools available in the Manager web console. For information on using external tools to monitor the system, such as SNMP and syslog, see the Cisco ACE Web Application Firewall Administration Guide.

The logs can enhance system security by providing information on potentially malicious traffic crossing your network. They identify requests that match a variety of attack signatures, including signatures designed to match SQL injection attacks or command injection attacks. The logs can also be used to identify problems with backend infrastructure, since server processing errors are captured and reported in the logs. The performance reporting tools can help you tune your system for best performance.

The Manager Dashboard displays a summary of the information provided by the logs. As the first page that appears after a successful login, it alerts you to conditions that may require attention, such as possible attacks. It can be customized to display the graphs of interest to you. Graphs are available that present the transaction rate, errors, and latency by service definition.

The types of logs in the ACE Web Application Firewall system include:

- The event log records data about system events that affect the processing and administrative activity of the ACE Web Application Firewall and Manager. Examples of events recorded by the event log are message transactions, system startup and shutdown, authentication of web console users, deployment of policies, and a variety of errors and other activities.
- The performance log keeps a variety of statistics on traffic in the system to assist performance analysis. It provides information on transaction count, processing time, backend round-trip time, and more. This information appears in the Performance Monitor and in the graphs that can be added to the Traffic Monitor section of the Manager Dashboard.

- The audit log shows user activity in the ACE Web Application Firewall Manager web console.

Logged information on a busy system can occupy a considerable amount of disk space on the appliance. To prevent resource exhaustion, when the log files on the appliance take up a particular amount of disk space, older log files are automatically deleted to make more space. This feature is intended to prevent unexpected shutdown of the appliance.

However, it's preferable to have log files copied to backup storage and removed from the appliance at regular intervals using a managed process. This way, the logged information is recoverable if necessary. For this purpose, you can set up a shell script that moves the files off the appliance at regular intervals. For more information on disk management, see the Cisco ACE Web Application Firewall Administration Guide.

Event Logging

The event log provides detailed information on the activities of the ACE Web Application Firewall and Manager. It displays information on traffic processing activities as well as on the internal operation of the ACE Web Application Firewall Manager and ACE Web Application Firewall. These events include control events (such as policy deployment), error notifications, and other events important to the operation of the system. This information can help you diagnose problems in the policy or network configuration of the system.

The system can write to the event log at several levels of detail. Each successively higher level of detail records more information. The logging levels are:

Table 14-1 Event logging levels

Level    Description
Alert    Critical system conditions that require immediate attention to prevent system failure.
Error    Error conditions that cause incorrect results or incorrect system behavior.
Warning  Conditions that appear to be incorrect and may cause unexpected system behavior or other undesirable results.
Notice   Normal but significant conditions, such as receipt or delivery of a message. This level of reporting produces one line of output for each message processed under normal conditions.
Info     Significant processing stages in the normal handling of message traffic. At this level, each message processed should produce several lines of output.
Debug    All information the ACE Web Application Firewall or Manager can report. Among other things, this level logs the body of every message the ACE Web Application Firewall processes.

Note: The debug-level information shown for a message may contain sensitive information, including passwords passed in a request. In general, this level of logging should be used only in testing or troubleshooting scenarios.

A busy ACE Web Application Firewall can generate a large number of event log records. Event information is passed to the Manager via syslog, which is carried over UDP and therefore offers best-effort delivery only. In extremely busy systems or in stress-testing scenarios, it's possible for event log information to be lost.
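The cumulative nature of the logging levels can be pictured with a short Python sketch. It is an illustration only, not the product's implementation: it assumes that a configured level records every event at that level and at all less detailed levels, which is how "each successively higher level of detail records more information" is read here. The function names are hypothetical.

```python
# Levels from Table 14-1, ordered from least to most detailed.
LEVELS = ["Alert", "Error", "Warning", "Notice", "Info", "Debug"]


def recorded_levels(configured_level):
    """Return the levels recorded when logging is set to configured_level.

    Assumption: a configured level includes all less detailed levels.
    """
    return LEVELS[: LEVELS.index(configured_level) + 1]


def is_recorded(event_level, configured_level):
    """Return True if an event at event_level would be written to the log."""
    return LEVELS.index(event_level) <= LEVELS.index(configured_level)
```

For example, under this assumption recorded_levels("Warning") yields Alert, Error, and Warning, while a Debug event is not recorded when the level is set to Notice.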

At the higher levels of detail (Notice, Info, and Debug), the system records so much information that it may affect the performance of the ACE Web Application Firewall. These logging levels are useful when investigating a problem, but should be avoided on an ongoing basis in a production system.

Configuring Event Logging

Event log items are generated by both the ACE Web Application Firewall and Manager. The types of events they generate are:

- The ACE Web Application Firewall event logs provide information mainly on the message processing activities of the system.
- The ACE Web Application Firewall Manager event logs provide information on administrative activities in the system.

In general, the ACE Web Application Firewall Manager event logs are useful to system administrators, while the ACE Web Application Firewall logs are helpful to both administrators and developers who are creating and testing service definitions in the policy.

The log level at which events are recorded can be configured separately for the Firewall and Manager.

If the Manager controls multiple clusters, the Event Log displays Firewall events only for the Firewalls in the current cluster. Manager events are shown for all clusters. For Manager events, the log description indicates the cluster affected by the event, by cluster name. For more information, see Chapter 16, Managing Firewall Clusters.

To set the event logging level, take the following steps:

Step 1  Log in to the web console as an Administrator user or a Privileged user with the Operations role.

Step 2  Display the System Management page in either of the following ways:
        - Click the System Management link in the navigation menu, or
        - If you're already viewing the Event Log page, click one of the edit links at the far right of the Current Event Logging pane.
        The ACE Web Application Firewall Manager displays the System Management page.

Step 3  Choose a value from the Log all Manager events of type menu for Manager logging, or from the corresponding menu for Firewall logging.

Step 4  Click the Set Log Level button next to the menu to confirm the new settings.

The new settings take effect immediately.

Client IP Logging

The Client IP option, which appears under the Global Policy Settings menu item, allows you to direct the Manager to use a value from an HTTP request header as the source client IP for purposes of logging and reporting. This option is useful when the ACE Web Application Firewall is deployed behind a load balancer that is configured to send the actual IP address of the client as an HTTP header, for example, in the X-Forwarded-For header.
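The idea behind the Client IP option can be sketched in Python. This is not the Manager's implementation. It assumes the header carries a comma-separated list with the original client address first, which is the common convention for X-Forwarded-For; the function name and the headers dictionary are hypothetical.

```python
def client_ip_from_headers(headers, peer_ip, header_name="X-Forwarded-For"):
    """Pick the client IP to log for a request.

    headers     -- mapping of HTTP header names to values (hypothetical input)
    peer_ip     -- address of the directly connected peer, such as the load balancer
    header_name -- header expected to carry the real client address

    Assumption: the header value is a comma-separated list whose first
    entry is the original client. Falls back to peer_ip when the header
    is missing or empty.
    """
    wanted = header_name.lower()
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == wanted:
            first = value.split(",")[0].strip()
            if first:
                return first
    return peer_ip
```

With the header "X-Forwarded-For: 203.0.113.7, 10.0.0.2" and a peer address of 10.0.0.2, the sketch selects 203.0.113.7 as the client IP. Without the header, it falls back to the peer address.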

When the option is enabled, the event logs contain the IP address extracted from the HTTP header in addition to the IP address of the load balancer.

To enable this option, in the Global Policy Settings page, click edit and check the Use specified HTTP header value as the client IP check box.

The default name of the HTTP header used for the client IP is X-Forwarded-For. The name of the HTTP header can be changed if the load balancer inserts the client IP value into a differently named header.

Viewing the Event Log

To view the event log, click the Event Log link in the Reports & Tools section of the navigation menu. By default, the ACE Web Application Firewall Manager displays events in the last hour.

The search and filter tools at the top of the Event Log Viewer enable you to filter the logs that are displayed. For example, you can choose to view only events generated for a particular ACE Web Application Firewall instance. You can also search by message GUID, the globally unique identifier assigned to a given message transaction by the ACE Web Application Firewall. In this case, the Event Log Viewer displays only events associated with the request or response with that ID.

Monitoring Performance

The Performance Monitor provides extensive performance information on the system, including message count, sizes, and processing time. The Performance Monitor can help you identify bottlenecks in the system and optimize performance at the ACE Web Application Firewall and backend infrastructure.

Information is presented on the page by handler group and endpoint. For each item, a variety of performance statistics are shown. For descriptions of each statistical category, see the online help accessed from the Performance Monitor page.

Figure 14-1 Performance Information

Statistics shown in the monitor should be regarded as approximate in some cases. In particular, messages that result in certain types of errors may not cause relevant statistics to be incremented as would be expected.

Filtering Performance Data by Time

The Performance Monitor includes controls that let you filter the information by time in various ways. Time filtering affects the console view as well as what information is exported to a file. You can show statistics by:

- A set time period ending at the present time, such as over the last hour or the last seven days.
- A time period starting at a set time, such as at 10AM, and ending at the present time.
- A set time period ending in the past, such as from 10AM to 8PM on a given date.

When analyzing performance data, keep in mind that the Manager's physical capacity for performance information is limited. When the Manager's performance data capacity is reached, the oldest performance information is lost. To conserve space and minimize this effect, the Manager consolidates information from smaller time frames into larger time frames over time, in effect lowering the resolution of performance data as it ages.

Therefore, while you can query the Manager performance information for a short time span from a relatively distant period of its operation, it's possible that the data returned is actually representative of a larger time period than requested. In this event, a notice at the top of the page indicates that the specified resolution is not available. Also, the actual values are reflected in the time filter fields at the top of the page.

The rate at which this data consolidation or loss occurs varies depending on the nature of the traffic in the system. The most significant factor in reaching the performance capacity is the number of separate virtual services and, in particular, the use of identity reporting, rather than the volume of traffic at the Firewall. As a rough guideline, for a policy with about 100 virtual services, each of which gets constant traffic flow (about a request every ten seconds) and with identity tracking disabled, the Manager may be expected to reach its performance data capacity in seven to eight months. For a policy with just ten virtual services and no identity tracking, the Manager may be able to retain performance data without loss for several years.

Data consolidation, on the other hand, may occur after several hours. Given ten virtual services that each receive a message every ten seconds, data would be consolidated into a five-minute time frame after about six-and-a-half hours. Eight days later, data from the five-minute time frames would be consolidated into a single one-hour time frame, and so on.

If you request information in the Performance Monitor for a time interval at a resolution for which data is not available, the interface presents the closest time range that is available, and indicates that time range at the top of the page.

If maintaining historical performance information is important to you, you should export performance data to a file regularly. The Manager supports performance data export in CSV and XML formats.

When the Manager consolidates performance information into records that correspond to a day, it does so along day boundaries determined in GMT.

Viewing Performance Information

To view performance information:

Step 1  Log in to the web console as an Administrator user, Privileged user, or Policy View user.

Step 2  Click the Performance Monitor link in the Reports & Tools section of the navigation menu.
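The data consolidation described under Filtering Performance Data by Time can be pictured with a small Python sketch. It only illustrates the idea of rolling fine-grained records up into coarser time frames; the Manager's actual storage format and consolidation schedule are not described in this chapter, and the function name and sample layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def consolidate(samples, bucket_seconds):
    """Roll (epoch_seconds, request_count) samples up into fixed-width buckets.

    Buckets start at multiples of bucket_seconds since the Unix epoch, so
    86400-second buckets fall on GMT day boundaries.
    """
    totals = defaultdict(int)
    for epoch_seconds, request_count in samples:
        bucket_start = epoch_seconds - (epoch_seconds % bucket_seconds)
        totals[bucket_start] += request_count
    return dict(sorted(totals.items()))


# One hour of traffic: ten virtual services, each receiving one request
# every ten seconds, recorded as one sample of 10 requests per interval.
start = int(datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc).timestamp())
samples = [(start + offset, 10) for offset in range(0, 3600, 10)]

five_minute = consolidate(samples, 300)   # twelve five-minute time frames
one_hour = consolidate(samples, 3600)     # a single one-hour time frame
```

Once only the one-hour record remains, a query for a five-minute span inside that hour can be answered only with the hourly totals, which is why the interface reports the closest available time range.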

The Performance Monitor page lists performance statistics for the service definitions in the policy, sorted into handler groups.

By default, the page displays statistics for all virtual services in your policy. The handler group row shows total statistics for all virtual services in that group. Under the group name, statistics are broken down by service. For a multiple-operation virtual service, statistics are not available for each operation in the virtual service, only for the entire virtual service.

You can use the controls at the top of the page to filter what information is displayed in various ways, such as by Firewall or time period.

There are a few points to note regarding these statistics:

- The Request Processing and Response Processing times represent the amount of time it takes the ACE Web Application Firewall to perform validation, consumer authentication, transformation, or any other processing steps specified by the policy on the message.
- The Service Latency column shows the time from the point at which the ACE Web Application Firewall sends the request to the backend service until it receives the response. It does not include the time the ACE Web Application Firewall spends processing the message.
- The total time it takes for message processing, including request processing, response processing, and service round trip, is indicated in the Processing Latency column.

These categories are shown in Figure 14-2.

Figure 14-2 Performance statistics categories
[Diagram labels: service consumer, ACE Web App Firewall, backend service; request processing time, response processing time, processing latency time]

The times indicated in the Performance Monitor are based on time-to-first-byte. This means that the timer starts when the first byte of the message is received by the Firewall, and ends when the first byte is transmitted to the network from the Firewall. Accordingly, the values can be affected by network conditions, particularly if messages are composed of multiple packets.

For information on each performance category, see the online help for the Performance Monitor page.

Exporting Performance Information to a File

If left on the ACE Web Application Firewall Manager of a busy ACE Web Application Firewall system, performance data is eventually lost. When the amount of performance data reaches the Manager's capacity, the oldest information is deleted to make space for new information. If you need to retain information indefinitely, you can export performance information to a file.

In addition to providing a mechanism for saving performance data indefinitely, the performance data export feature provides access to richer information than that provided in the Performance Monitor interface, with additional statistical categories for message processing times.

Performance data can be exported as XML data or to a comma-separated values (CSV) file. As in the Performance Monitor, statistics in the exported file are grouped by handler.

When viewing the Performance Monitor, note that handlers that have been moved between subpolicies are identified by internal object number, rather than by handler name, for their activity in the former subpolicy.

The information in exported files is presented differently from the Performance Monitor. The exported performance information should be considered raw data, in that it is not processed or organized for human readability.

Note the following differences between exported data and the Performance Monitor:

- Virtual services that have received traffic in the selected time frame are listed in the file. Virtual services that have not received requests do not appear in the generated file.
- The Performance Monitor shows message processing totals for each handler group. The exported file does not show total values in the same way; instead, it contains a record for each virtual service. If identity reporting is enabled, it contains a record for each identity that accessed the service, with a request count for that identity.
- The exported data file includes records for requests that were not serviced due to an error. They are indicated by an error count field with a value greater than 1.
- In addition to the time-to-first-byte measurement shown in the Performance Monitor, the exported file shows measurements for time-to-last-byte for each request and response.

To export performance data to an XML or CSV file:

Step 1  While logged into the web console as an Administrator user, Privileged user, or Policy View user, click the Performance Monitor link in the Reports & Tools section of the navigation menu.

Step 2  Use the Firewall and time controls to filter the information to be exported. In addition to affecting the view in the Performance Monitor, the filter controls, such as time spans, control what information is exported to a file.

Step 3  Click Update View.

Step 4  Choose the format of the output file, either:
        - XML, for an XML format file
        - CSV, for a comma-delimited file
        This choice does not affect what information is generated, only its format.

Step 5  Click Export Raw Data.

Step 6  In the File Save dialog, choose a file location and name for saving the export file.

After you save it, the file is generated and downloaded to the file location you specified.

The exported file contains all of the information shown in the Performance Monitor, plus some additional statistical categories. This information includes message error counts, such as access failures, and information on message size.

The XML file indicates the time frame represented by the data in the file with the Report element. The element has querystarttime and queryendtime attributes, which indicate the time period for which performance data was captured for the file.

The file provides extensive details on time-based performance measures. Note the following points on this performance data:

- Message timings are shown in microseconds (the Performance Monitor shows time in milliseconds).
- Time measurements include the following statistics:
  - Time-to-first-byte (TTFirst) is the time from when the Firewall receives the first byte of a message off the network until the time it starts sending the first byte of the message. The times shown in the Performance Monitor are time-to-first-byte.
  - Time-to-last-byte (TTLast) is the time from when the Firewall receives the last byte of a message until it sends the last byte of the message.
- In the names of the statistics categories, you can determine the message processing stage measured by the following identifiers:
  - Req is the request processing time, the amount of time the ACE Web Application Firewall spends processing the consumer request. An example is MinReqTTFirst.
  - Resp is the response processing time, the amount of time the ACE Web Application Firewall spends processing the response from the backend service. An example is MinRespTTFirst.
  - Source is the backend message roundtrip time, from when the outgoing request is sent to the service until the response is received back from the service. An example is MinSourceTTFirst.
  - Roundtrip is the total message processing time, which includes request processing, response processing, and the roundtrip to the backend service. An example is MinRoundtripTTFirst.

For a description of each statistical category, see the online help for the web console.
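When post-processing an exported file, the naming scheme above can be decoded mechanically. The following Python sketch is a hypothetical helper, not part of the product. It assumes each statistic name is a prefix (only Min appears in this chapter; any other prefix is accepted but unconfirmed), followed by a stage identifier and then TTFirst or TTLast. It also converts the exported microsecond values to the milliseconds shown in the Performance Monitor.

```python
import re

# Stage identifiers as described in this chapter.
STAGES = {
    "Req": "request processing time",
    "Resp": "response processing time",
    "Source": "backend message roundtrip time",
    "Roundtrip": "total message processing time",
}

# Assumed layout: <prefix><stage><TTFirst|TTLast>, for example MinReqTTFirst.
_NAME = re.compile(
    r"^(?P<prefix>[A-Za-z]*?)"
    r"(?P<stage>Req|Resp|Source|Roundtrip)"
    r"(?P<measure>TTFirst|TTLast)$"
)


def parse_stat_name(name):
    """Split a statistic name into (prefix, stage, measure).

    Returns None if the name does not follow the assumed layout.
    """
    match = _NAME.match(name)
    if match is None:
        return None
    return match.group("prefix"), match.group("stage"), match.group("measure")


def to_milliseconds(microseconds):
    """Convert an exported microsecond timing to milliseconds."""
    return microseconds / 1000.0
```

For example, parse_stat_name("MinSourceTTFirst") returns ("Min", "Source", "TTFirst"), and STAGES["Source"] gives the matching description from the list above.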