Guardium S-TAP: Application Note

Lightweight Host-Based Agent for Capturing All Database Activity

Highlights

- Unique in the industry, S-TAP is a lightweight, host-based probe (agent) that monitors all database activity on a server, regardless of the database type or the connection type used.
- S-TAP monitors both network traffic and local access by privileged users. It monitors all types of local access protocols on all major operating systems, including TCP, shared memory, Oracle BEQ, named pipes, TLI and IPC connections. In comparison, other database monitoring systems have significant gaps in their OS/protocol coverage.
- S-TAP does not require any database changes or changes in the way clients connect to the database. It does not rely on native DBMS logs or auditing utilities, and it preserves separation of duties because its configuration cannot be changed by DBAs and it can be managed independently by IT security teams.
- S-TAP is available for all Unix and Windows platforms and consistently supports all databases and connection types on all these platforms.
- S-TAP uses patented technology to ensure that it consumes very few host resources and can run on even highly loaded servers with minimal impact on applications and users. S-TAP has been benchmarked to collect over 1,000 audit records per second with less than a 3% performance hit.
- S-TAP is installed once per operating system instance, regardless of how many database instances and types are running on it.
- Communications between the S-TAP and the Guardium appliance can be encrypted if sensitive data is being sent over an insecure network.
- You can define filtering policies to control how much data is sent from the S-TAP to the Guardium server. These policies can be added and changed dynamically, based on business requirements, to limit both network communications and S-TAP activity on the host.
- S-TAP implements failover and load balancing to ensure that audit data is not lost if a target server is unavailable.
- S-TAP sends a periodic heartbeat to the Guardium server, so that you can generate a real-time alert if the agent is disabled or uninstalled.
- S-TAPs are centrally managed from a Web-based console and can be rapidly deployed using unattended installs (InstallShield, Linux RPM, Solaris package, etc.).
- S-TAP is currently deployed on many thousands of production databases across all verticals, including financial services, telecommunications, energy, retail & hospitality, manufacturing and media. For example, Dell deployed Guardium to hundreds of databases (Oracle & SQL Server, on both Windows and Linux) across 10 data centers worldwide in only 12 weeks using an S-TAP-based implementation (no SPAN ports).

Introduction

Guardium S-TAP (Software TAP) is a lightweight host-based probe (agent) that is installed on servers where database instances are installed. The S-TAP monitors all database activity in a way that is non-intrusive to the database and relays that activity to a Guardium Collector server for analysis, compliance reporting, forensics, and maintaining a secure audit trail. The S-TAP uses patented technology to ensure that it uses very few resources on the server, so that it can be used even on very highly loaded servers without affecting the availability of applications. Policies can be defined to tailor the amount of information sent to the Guardium server based on business requirements, and to define how to balance load and handle failure conditions.

Background

In order to monitor all database activity, the Guardium server needs to look at all requests going from users and applications to the database, as well as the result sets sent from the database back to the originating clients.
This data can arrive at the Guardium server in one of two ways: using traditional network sniffing techniques or using the S-TAP. S-TAP has rapidly become the preferred method within Guardium's enterprise customer base, as explained below.

Why Network Sniffing is Not Sufficient

Network sniffing can be done using either a network TAP or a switched SPAN port. In the first case, a network TAP is installed between the database and the switch, and a copy of all database traffic is forwarded to the Guardium server. Alternatively, a switch SPAN port can be configured to mirror to the Guardium server all data on the port(s) to which the database is connected. In both cases, all network activity is monitored by the Guardium server with nothing running on the host where the database is running. While this type of solution is technically elegant (due to the zero-impact attribute), it is often insufficient for the following reasons:

1. Some database activities are local to the host. For example, a DBA may ssh or telnet to the host and connect to the database server locally. A DBA can even have direct console or serial access to the host. None of this local activity would be monitored by a pure network-based approach, since no database activity traverses the network. The same is true when an application server resides on the same host as the database server: all activity is local to that host and network inspection is insufficient.

2. Network communications may be encrypted. For example, IPSEC or SSH tunnels may be used to protect sensitive data in transit. In this case, SPAN sessions and network TAPs produce data that cannot be used for monitoring.

3. For environments that have thousands of database servers, a network-based approach is impractical. SPAN sessions are scarce resources that network administrators do not freely give up, and putting in a network TAP per database is an expensive proposition. Moreover, a network-based approach for large environments creates maintainability challenges. Every time host connectivity to the network changes (e.g., a host moves from one switch to another), SPAN definitions may need to be altered or network TAPs moved.

The solution to all these problems is S-TAP. S-TAP is a lightweight probe that can monitor all database communications, local or remote, and is not dependent on network topology or gear.
Instead, S-TAP relies on operating system resources and as such, it monitors all connections (including local connections and connections encrypted at the OS level). S-TAP is easy to install and can easily be made a part of a gold build to be installed whenever a database is installed. Configuration elements define exactly what activity an S-TAP needs to monitor and how to behave in different circumstances.

Figure 1: S-TAP Supports all mainstream UNIX/Linux & Windows platforms.

OS Type                        Version               32-Bit & 64-Bit
Microsoft Windows              NT                    32-Bit
                               2000, 2003            Both (1)
Solaris - SPARC                8, 9, 10              Both
Solaris - Intel                10                    Both
IBM AIX                        5.1, 5.2, 5.3         Both
                               6.1                   64-Bit
SuSE Linux S/390 (z/Linux)     9, 10                 N/A
SuSE Linux Enterprise (2)      9, 10                 Both
Red Hat Enterprise Linux (2)   2 3                   Both
                               3, 4, 5               Both (1)
HP-UX                          11.00, 11.11, 11.31   Both
                               11.23 PA              32-Bit
                               11.23 IA64            64-Bit
HP Tru64 UNIX (3)              5.1A, 5.1B            64-Bit

(1) Itanium version also available.
(2) S-TAP for other Red Hat and SuSE Linux versions can typically be delivered in a few weeks.
(3) Local TCP monitoring only.
Integration with Your Existing Infrastructure

The first version of S-TAP was released with Guardium 4 in early 2005. S-TAP was the world's first host-based database activity monitoring agent and is today by far the most mature such agent. It is now in its fourth major release and has been deployed on some of the world's most demanding database servers.

S-TAP has been designed for rapid deployment and easy integration with your existing infrastructure. For example, Dell deployed Guardium to hundreds of databases (Oracle & SQL Server, running on Linux and Windows) across 10 data centers worldwide in only 12 weeks using an S-TAP-based implementation. Dell concluded that database monitoring using S-TAPs was a simpler approach, for both initial deployment and ongoing management, than monitoring via SPAN ports.

As shown in Figure 1 above, S-TAP is available for all mainstream Unix, Linux and Windows distributions. Full coverage means that any connection type, on any of the operating systems, is fully monitored without any need for reconfiguration. With S-TAP, no configuration is required at the operating system and none is required of the database. This is unique in the industry. Very few monitoring agents in the industry today support this non-intrusiveness on operating systems such as Solaris and Windows, and apart from S-TAP no other monitoring agent exists that also fully supports AIX, HP-UX and Linux. Furthermore, S-TAP support on all these operating systems fully covers TCP/IP connections, Oracle BEQ connections, Oracle and MySQL IPC connections, DB2 and Informix shared memory connections, Sybase and Informix TLI connections, etc. For Tru64, S-TAP does not include loadable kernel modules, and monitoring of non-TCP/IP communication is performed using a proxy architecture.

S-TAP is installed under a system account (root) and runs as a single process on the operating system.
On Windows it is installed as a Windows service, and on Unix as a daemon that, if killed, is restarted by the operating system.

S-TAP works at the operating system level and not at the database level. A database is a user-level program that gets services from the operating system. By viewing these service requests, S-TAP knows what the database is doing without having to be installed within or on the database. Therefore, S-TAP is not sensitive to the database type or version and does not affect the database in any way; the database is completely oblivious to the existence of an S-TAP. Since the S-TAP lives at the operating system level, there is a single S-TAP process no matter how many database instances are installed on the host. A single S-TAP can monitor any number of database instances of any type supported by Guardium.

S-TAP is installed using either an interactive installer or a non-interactive script. The latter allows S-TAP to be quickly installed on a large number of servers using a single configuration file that populates the mandatory configuration parameters for all the installed servers. On Windows the S-TAP is provided as an InstallShield installer. On Unix and Linux the script is either provided as a shell script or packaged within a native installer (such as a Solaris package, an HP-UX depot, an AIX BFF file or a Linux RPM). Figure 2 shows the disk space requirements for S-TAP on different platforms.

S-TAP has kernel-level components in addition to the daemon that runs in user mode. On Windows these are drivers, and on Unix/Linux they are loadable kernel modules. These kernel-level components ensure that all database activity can be monitored, that monitoring cannot be bypassed, that monitoring is done efficiently, and that no changes to the database are required. These modules do not interfere with, and interoperate with, other kernel module-based systems such as CA's SEOS and IBM's TAMOS.

Figure 2: S-TAP Disk Space Requirements
S-TAP Configuration & Operation

Once installed, the S-TAP will appear as an operating system process, e.g.:

    $ ps -ef | grep S-TAP
    root 25618 25575 0 14:40 pts/1 00:00:00 /usr/local/guardium/guard_s-tap/guard_s-tap /usr/local/guardium/guard_s-tap/guard_tap.ini

The S-TAP is controlled by a configuration file called guard_tap.ini. The S-TAP can be configured locally on the database server by modifying this file, but more typically it is configured from the Guardium server using the Web-based administration console (see Figure 3).

Figure 3: S-TAP Configuration Screen

S-TAP can report to one or more Guardium servers (for failover or load balancing; more on this later). The Guardium servers to which the S-TAP reports display the current status of the S-TAP. For example, Figure 4 shows an S-TAP status monitor on a Guardium server. An S-TAP maintains a heartbeat with its controlling servers, and if an S-TAP is down for some reason (e.g., the network is down or a superuser has uninstalled the S-TAP) then the Guardium server will immediately be aware of this fact. Built-in alerts and reports are provided to ensure that there is no downtime in monitoring through the S-TAP.

S-TAPs run as root and not as the database instance account. An S-TAP cannot be controlled by DBAs and its configuration cannot be changed by DBAs. This ensures that separation of duties is preserved.

Figure 4: S-TAP Status Monitoring Screen

Non-Intrusiveness

Non-intrusiveness is achieved through the S-TAP's kernel components. A database makes system calls, which the S-TAP monitors. Therefore, no changes ever need to be made to the database or to the way that clients connect to the database. Because S-TAP utilizes kernel modules/drivers, resource utilization is also minimized and the overall impact on the server is minimal. S-TAP has been deployed on production servers with as many as 128 cores with negligible performance impact (1).

Defining precisely how many resources an S-TAP will take is difficult, since it depends on application behavior and on the monitoring policy (how much of the database activity needs to be monitored). As a rule of thumb, S-TAP will not consume more than an average of 5% of server resources. Through filtering policies, this can be reduced further for servers that are running close to 100% utilization (see the Data Filtering section below).

(1) See related application note on S-TAP benchmark performance.

The S-TAP's memory footprint is very small. It will typically consume 20MB of RAM. Additionally, S-TAP maintains a buffer, in the form of a memory-mapped file, that is used when there is no connectivity to any of the Guardium servers (see the Failover and Load Balancing section below). This shows up as RAM used by S-TAP, but it is really a memory-mapped file. The size of this file is configurable through guard_tap.ini and is typically set to 100MB. For example, the following proc output shows an S-TAP that is consuming approximately 9MB of RAM with a total of 115MB of memory mapped (including the memory-mapped file):
    $ cat /proc/26392/status
    Name: guard_s-tap
    State: S (sleeping)
    Tgid: 26392
    Pid: 26392
    PPid: 25575
    TracerPid: 0
    Uid: 0 0 0 0
    Gid: 0 0 0 0
    FDSize: 256
    Groups: 0 128 1002 1005
    VmPeak: 115856 kb
    VmSize: 115852 kb
    VmLck: 0 kb
    VmHWM: 2112 kb
    VmRSS: 2112 kb
    VmData: 9204 kb
    VmStk: 84 kb
    VmExe: 1180 kb
    VmLib: 2836 kb
    VmPTE: 32 kb
    Threads: 1
    SigQ: 0/28658
    SigPnd: 0000000000000000
    ShdPnd: 0000000000000000
    SigBlk: 0000000000000000
    SigIgn: 0000000000011000
    SigCgt: 0000000000000000
    CapInh: 0000000000000000
    CapPrm: ffffffffffffffff
    CapEff: ffffffffffffffff
    CapBnd: ffffffffffffffff
    Cpus_allowed: 00000000,00000003
    Cpus_allowed_list: 0-1
    Mems_allowed: 1
    Mems_allowed_list: 0
    voluntary_ctxt_switches: 16
    nonvoluntary_ctxt_switches: 2

Data Filtering

The S-TAP configuration parameters allow you to specify precisely which database activity is monitored. The default is to monitor all connections to the database, but based on business requirements and deployment topology, S-TAP can be configured to filter out some connections.

The main reason that S-TAP supports filtering is network efficiency. When deciding whether to use an S-TAP-only deployment or a hybrid deployment (with SPAN ports used for network activity and S-TAP for local access), many factors come into play. Each approach has its pros and cons. Network engineers and operations personnel will often prefer an S-TAP-only approach because SPAN ports may not be readily available or because using SPAN ports requires change management procedures whenever host connectivity changes. However, one point that network engineers will make is that using S-TAP has the potential to double the network traffic of the database server. This is usually not an issue, since most network cards today are 1Gbps (or at least 100Mbps) and database traffic itself is usually less than 50Mbps. Still, using an agent will add to network traffic, and hence S-TAP supports filtering at a very granular level.
Because not all traffic going to the database always needs to be captured, the Guardium system allows filters to be applied at the S-TAP to avoid unnecessary network load. The ability to filter at a very granular level is a unique capability in the industry. Let's look at a few examples.

If network activity is captured through the use of a SPAN port or a network TAP, then you can create a filter that ignores all network activity and captures only local access. This type of filtering can be implemented either through policies (which can dynamically affect each connection differently) or at a global S-TAP level. The latter is done using the networks and exclude_networks parameters in guard_tap.ini. Each of these parameters can hold a list of expressions. Either parameter can be used on its own, and the two can be used in tandem to express any series of filters based on IPs. In the example below the S-TAP is told to monitor only local connections:

    networks=127.0.0.1/255.255.255.255
    exclude_networks=
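To make the interplay of the two parameters concrete, here is a minimal Python sketch of how an address could be checked against a networks list and an exclude_networks list. It is an illustration only, not Guardium's implementation. The function names are hypothetical, and two behaviours are assumptions rather than documented facts: that list entries are separated by commas, and that an empty networks list means "all addresses" while exclude_networks takes precedence.

```python
import ipaddress


def parse_network_list(value):
    """Parse ip/mask expressions such as '127.0.0.1/255.255.255.255'.

    Assumption: multiple expressions are separated by commas.
    """
    return [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in value.split(",")
        if item.strip()
    ]


def is_monitored(client_ip, networks, exclude_networks):
    """Return True if a client address would be captured under these lists.

    Assumed semantics: an empty 'networks' list means "all addresses",
    and 'exclude_networks' always wins over 'networks'.
    """
    addr = ipaddress.ip_address(client_ip)
    included = not networks or any(addr in net for net in networks)
    excluded = any(addr in net for net in exclude_networks)
    return included and not excluded


# Values from the example above: capture local (loopback) connections only.
networks = parse_network_list("127.0.0.1/255.255.255.255")
exclude_networks = parse_network_list("")

for ip in ("127.0.0.1", "192.168.222.40"):
    print(ip, is_monitored(ip, networks, exclude_networks))
```

With the values from the example, only the loopback address passes the check; a remote client address does not.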
Filtering is not limited to IP addresses; in fact, it can be based on any number of attributes that can be expressed in a policy rule. Another example involves batches and ETL scripts. These programs can generate a very large number of audit records that are almost never reviewed. There is therefore no reason to burden the network by sending all this traffic, so S-TAP filtering can be used to filter out those connections. Generally, filtering can be done at any level: by program name, by IP address, by user name, by SQL command type, etc. Granular policies are used to define these filters, so very complex filtering conditions can easily be expressed. Figure 5 shows a rule that specifies that S-TAP should filter out all connections made by SQL*Loader that occur between 1am and 3am (a time period that is defined elsewhere) and that connect as the user APPLOAD.

Figure 5: Policy rule for filtering data traffic by ignoring specified connections.

Finally, S-TAP can be told to send request data only. Database activity consists of requests (the queries sent by the client to the server) and responses (the result sets sent from the database back to the client). In some situations, business requirements dictate that both requests and responses be inspected (for example, when extrusion rules need to be applied or when exceptions need to be logged). In other cases only an audit trail of activity is required. In the latter case you can choose to have S-TAP send only the requests to the appliance, which significantly cuts down on network traffic between the S-TAP and the appliance (since result sets usually include much more data than requests). Here too, policy rules are used, so that very granular definitions can be applied to the decision of when to send result sets and when not to.
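As an illustration of attribute-based filtering, the following Python sketch models a rule like the one in Figure 5: ignore connections made by SQL*Loader as user APPLOAD between 1am and 3am. It is only a sketch of the matching logic, not the Guardium policy engine. The class and field names are hypothetical, and using "sqlldr" (the usual SQL*Loader executable name) as the program identifier is an assumption about how the program would be reported.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Connection:
    """Hypothetical summary of a database connection seen by the agent."""
    program: str
    db_user: str
    start: time


@dataclass(frozen=True)
class IgnoreRule:
    """Hypothetical 'ignore this connection' rule with a daily time window."""
    program: str
    db_user: str
    window_start: time
    window_end: time

    def matches(self, conn):
        # All three conditions must hold for the connection to be ignored.
        in_window = self.window_start <= conn.start < self.window_end
        return (
            conn.program.lower() == self.program.lower()
            and conn.db_user.upper() == self.db_user.upper()
            and in_window
        )


def should_forward(conn, rules):
    """Forward the connection's traffic unless some ignore rule matches."""
    return not any(rule.matches(conn) for rule in rules)


rules = [IgnoreRule("sqlldr", "APPLOAD", time(1, 0), time(3, 0))]

nightly_load = Connection("sqlldr", "APPLOAD", time(2, 15))
daytime_load = Connection("sqlldr", "APPLOAD", time(14, 0))
print(should_forward(nightly_load, rules))   # False: inside the ignore window
print(should_forward(daytime_load, rules))   # True: outside the window
```

Keeping each rule as a conjunction of attribute tests means that a connection is dropped only when every condition holds, so the same loader program run at another time, or under another user, is still audited.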
Failover and Load Balancing

The S-TAP agent does not work in a vacuum. It sends data to the appliance for analysis, parsing and evaluation. In fact, the S-TAP does as little as possible so as not to consume many resources on the host. Because of this, S-TAP has failover features to ensure that if an appliance becomes unavailable, the database activity is sent to another appliance.

S-TAP has been deployed on the largest database servers in the world (e.g., servers with 128 cores). Under extreme volumes and full audit conditions, more than one appliance may be required to sustain logging. Therefore, S-TAP implements load balancing, which allows it to send traffic to more than one appliance. Finally, S-TAP can also concurrently send the same traffic to more than one appliance, in support of an architecture that mandates immediate and complete disaster recovery properties.

Failover and load balancing are set up either by modifying the guard_tap.ini file or through the S-TAP administration console on the appliance. An S-TAP can have more than one Guardium host server defined. For example, to define a failover chain composed of two Guardium appliances:

    [SQLGuard_0]
    sqlguard_ip=192.168.222.40
    sqlguard_port=16016
    primary=1

    [SQLGuard_1]
    sqlguard_ip=192.168.222.41
    sqlguard_port=16016
    primary=2

This defines 192.168.222.40 as the primary Guardium server. If that server is unavailable (i.e., the S-TAP cannot initiate a TCP/IP connection), the S-TAP will start sending data to 192.168.222.41. Once the S-TAP is sending to 192.168.222.41, it will only go back to 192.168.222.40 if 192.168.222.41 becomes unavailable, if the S-TAP is restarted, or if the Guardium administrator forces a manual failover.

Given the two Guardium server definitions shown above, the described failover behavior occurs only if the following property is set:

    participate_in_load_balancing=0

If this parameter is set to participate_in_load_balancing=1 then the S-TAP will go into load balancing mode.
This means that database sessions will be (statistically) split, with approximately half sent to 192.168.222.40 and half sent to 192.168.222.41. If this parameter is set to participate_in_load_balancing=2 then all traffic will be sent all the time to both 192.168.222.40 and 192.168.222.41.

What happens if both appliances are unavailable? You can configure any number of appliances for failover; this is not limited to two appliances. For example, if you have 5 appliances you can use all five and split the S-TAPs to statistically make best use of all these servers. If these 5 appliances are used to monitor 100 physical servers, you can configure the first 20 S-TAPs to have a failover chain of the form <1,2,3,4,5>, the next 20 to have a failover chain of the form <2,3,4,5,1>, and so on, with the last 20 servers having a failover chain of the form <5,1,2,3,4>.

Finally, it is possible that all appliances are unreachable. It is unlikely that all the target appliances fail at the same time, but a network issue can account for such a condition. For example, if the database server itself loses connectivity to the network then no appliance will be reachable. In such a case users will not even be able to ssh to the host, so there should not be much activity on the server, but a user may be logged onto the console and may have access to the database. If this occurs, the appliance will record the precise time at which the S-TAP became unavailable and can send an alert. In addition, the S-TAP has a local file to which it writes data if none of the target servers are available. This is a memory-mapped file that stores activity while the S-TAP is unable to send data to any of the appliances. The file needs to be allocated on the file system and affects the amount of disk space required by S-TAP. The size of the file is controlled by a parameter in guard_tap.ini and is set to a default of 100MB:

    buffer_file_size=100
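The buffer file has a fixed size, so once it fills up only the most recent activity can be kept. The Python sketch below illustrates that overwrite-oldest behavior with a small in-memory stand-in. It is not Guardium's implementation: the class name CyclicalBuffer and its methods are hypothetical, and a real S-TAP uses a memory-mapped file rather than a Python deque.

```python
from collections import deque


class CyclicalBuffer:
    """Fixed-capacity buffer that discards the oldest records when full."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self._records = deque()
        self._used = 0

    def append(self, record):
        """Store a record, evicting the oldest ones if space runs out."""
        if len(record) > self.capacity_bytes:
            raise ValueError("record is larger than the whole buffer")
        self._records.append(record)
        self._used += len(record)
        while self._used > self.capacity_bytes:
            oldest = self._records.popleft()
            self._used -= len(oldest)

    def drain(self):
        """Yield buffered records oldest-first, e.g. once a server is reachable."""
        while self._records:
            record = self._records.popleft()
            self._used -= len(record)
            yield record


# Tiny capacity so the wrap-around is easy to see.
buf = CyclicalBuffer(capacity_bytes=10)
for rec in (b"aaaa", b"bbbb", b"cccc"):
    buf.append(rec)

# The first record was overwritten to make room for the third.
print(list(buf.drain()))
```

The trade-off shown here is the same one the buffer_file_size parameter controls: a larger capacity survives a longer outage before the oldest activity starts to be lost.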
The buffer is cyclical: if the 100MB are used up before one of the appliances is reachable by the S-TAP, it will start overwriting older activity. You may assign more space for this buffer, but in most implementations this is not needed. It is unlikely that all appliances will be down, and if they are, then normally there is no access to the host over the network and thus DBAs or other users/applications will not be connecting to the database.

Securing Communications between S-TAP and the Guardium Server

Communication between the S-TAP and the appliance is based on a purpose-built binary protocol. Data does not pass in clear text and is hard to decipher, but it is not encrypted and it is possible to extract the data through analysis. On Windows, this communication occurs on port 9500 and on Unix it occurs on port 16016. For the vast majority of implementations this is the recommended approach.

However, in some environments the data may be sensitive enough that the communication stream needs to be encrypted. Be aware that encrypting the traffic has a performance impact on the host, since the host performs the encryption. Encryption is not a trivial operation and you should assume an additional 5% hit in terms of resource utilization. The precise number depends on how much data the S-TAP is forced to send to the appliance, so you can use filtering to reduce this additional burden. As a rule of thumb, if you are not encrypting all communications from the database clients to the database, then you should not be encrypting the data between the S-TAP and the appliance.

There are two methods by which you can encrypt S-TAP communications. The first is to configure the S-TAP to use TLS. This is controlled by configuration parameters within the guard_tap.ini configuration file:

    use_tls=1
    failover_tls=1

If you set use_tls=1 then the S-TAP will attempt to initiate TLS-encrypted communication with the appliance.
On Windows this will occur on port 9501 and on Unix this will occur on port 16018. If you set failover_tls=1, then if the S-TAP cannot set up a TLS connection, it will set up a regular connection. If you do not want communication to occur unless it is encrypted, then set failover_tls=0. The S-TAP status monitor displays, in addition to other status values, whether the communication is encrypted, as can be seen in Figure 4. You can also set up alerts if you configure the S-TAP using failover_tls=1 but want to be notified if unencrypted communications occur.

If you require more control over encryption algorithms, ciphers, block chaining modes, etc., there is a second facility for encryption that uses SSH tunnels. In this case you set up a local SSH tunnel that uses the tunnel account on a Guardium appliance. You generate a public key on the host and upload it to the appliances with which the S-TAP will communicate (see Figure 6). You then point the S-TAP at the tunnel endpoint. The data will pass over an encrypted SSH tunnel on port 22 without the S-TAP itself doing the encryption.

Figure 6: Uploading a public key to set up secure communications using an SSH tunnel.
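The combined effect of use_tls and failover_tls can be summarized as an ordered list of connection attempts. The Python sketch below encodes the behavior and port numbers described in this section; the function name, the "windows"/"unix" platform labels and the return format are hypothetical and chosen purely for illustration.

```python
# Ports as given in this section: unencrypted and TLS, per platform.
PORTS = {
    "windows": {"plain": 9500, "tls": 9501},
    "unix": {"plain": 16016, "tls": 16018},
}


def connection_attempts(platform, use_tls, failover_tls):
    """Return the (mode, port) attempts an agent would make, in order.

    use_tls=0            -> a single unencrypted attempt
    use_tls=1, failover  -> TLS first, then fall back to unencrypted
    use_tls=1, no failover -> TLS only; never send unencrypted
    """
    ports = PORTS[platform]
    if not use_tls:
        return [("plain", ports["plain"])]
    attempts = [("tls", ports["tls"])]
    if failover_tls:
        attempts.append(("plain", ports["plain"]))
    return attempts


print(connection_attempts("unix", use_tls=1, failover_tls=1))
print(connection_attempts("windows", use_tls=1, failover_tls=0))
```

Reading the settings this way makes the policy choice explicit: failover_tls=1 favors continuity of the audit trail, while failover_tls=0 favors confidentiality by refusing to send data that cannot be encrypted.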
How S-TAP Works with Clusters

Many database environments are clustered, and there are many clustering environments and packages. Since the S-TAP works at the OS level, it is generally not sensitive to the clustering type. The general guideline is that an S-TAP should be installed on all nodes of the cluster and configured as though that node were the primary node. All S-TAPs will be fully functional; the difference is that on the active node the S-TAP will be reporting data because the database is active, while the other S-TAP will be idle because its database is idle. On active-active clusters or implementations such as Oracle RAC, every S-TAP will be reporting the data handled by its own node.

There is one form of clustering that requires a special configuration in the guard_tap.ini file. Some clusters operate such that, on the inactive node, the file system housing the database is not mounted. The file system is mounted only when failover occurs. In this scenario, the S-TAP on the inactive node needs to wait until the file system is mounted, because it needs to know where the database is installed (for most database types). If the database is an Oracle, DB2 or Informix database running on Unix, then setting the following initialization parameter will ensure that the S-TAP on the inactive node stays idle until the file system is mounted and then starts functioning normally:

    wait_for_db_exec=10

S-TAP Summary

S-TAP is Guardium's agent for capturing all database activity at the host level. S-TAP is currently deployed in some of the world's busiest database environments on a variety of operating systems, databases and connection protocols. It is a critical component of Guardium's enterprise solution for managing your entire database security, governance and compliance lifecycle (Figure 7).

S-TAP is a low-impact probe that allows massive audit data collection with very low overhead. For example, S-TAP has been benchmarked to collect over 1,000 audit records per second with less than a 3% performance hit. In comparison, conventional log-reading agents such as Oracle Audit Vault collectors and Lumigent agents show benchmarks in which collecting only 100 records per second consumes more than 5% of a server's resources, with resource consumption increasing linearly with the number of requests per second. Additionally, S-TAP implements advanced functionality such as failover, load balancing, encryption and filtering, allowing you not only to meet your audit and security requirements but also to optimize your implementation based on your business requirements.

Figure 7: Guardium manages the entire lifecycle of database security, governance and compliance.
About the Guardium Platform

Guardium's real-time database security and monitoring solution monitors all access to sensitive data, across all major DBMS platforms and applications, without impacting performance or requiring changes to databases or applications. The solution prevents unauthorized or suspicious activities by privileged insiders, potential hackers, and end-users of enterprise applications such as Oracle EBS, PeopleSoft, Siebel, JD Edwards, SAP, Business Intelligence and in-house systems. Additional modules are available for performing database vulnerability assessments, change and configuration auditing, data-level access control and blocking, data discovery and classification, and compliance workflow automation.

Forrester Research recently named Guardium a Leader across the board, with dominance and momentum on its side. Guardium earned the highest overall scores for Architecture, Current Offering and Corporate Strategy (The Forrester Wave: Enterprise Database Auditing And Real-Time Protection, Q4 2007, October 2007).

About Guardium

Guardium, the database security company, delivers the most widely used solution for ensuring the integrity of enterprise information and preventing information leaks from the data center. The company's enterprise security platform is now installed in more than 450 data centers worldwide, including 3 of the top 4 global banks; 3 of the top 5 insurers; 2 of the top 3 retailers; 2 of the leading global soft drink brands; 2 global auto makers; and the world's #1 PC manufacturer. Founded in 2002, Guardium was the first company to address the core data security gap by delivering a scalable enterprise platform that both protects databases in real time and automates the entire compliance auditing process.

For more information, please contact your Guardium partner, Regional Sales Manager or email info@guardium.com.

Copyright 2009 Guardium. All rights reserved.
Information in this document is subject to change without notice. Guardium, Safeguarding Databases, S-TAP and S-GATE are trademarks of Guardium. All other trademarks and service marks are the property of their respective owners. STAP-PN 0109