BlueBoX: A Policy-driven, Host-Based Intrusion Detection System

Suresh N. Chari, Pau-Chen Cheng
IBM Thomas J. Watson Research Center
Yorktown Heights, NY 10598, U.S.A.
{schari,pau}@watson.ibm.com

Abstract

In this paper we describe our experiences with building BlueBox, a host-based intrusion detection system. Our approach can be viewed as creating an infrastructure for defining and enforcing very fine-grained process capabilities in the kernel. These capabilities are specified as a set of rules (policies) for regulating access to system resources on a per executable basis. The language for expressing the rules is intuitive and sufficiently expressive to effectively capture security boundaries. We have prototyped our approach on the Linux 2.2.14 kernel, and have built rule templates for popular daemons such as Apache 2.0 and wu-ftpd. We are validating our design by testing against a comprehensive database of known attacks. Our system has been designed to minimize the kernel changes and performance impact and thus can be ported easily to new kernels. We will discuss the motivation and rationale behind BlueBox, its design, implementation on Linux, and related work.

1 Introduction

The two mechanisms predominantly used to secure application servers today are firewalls and network intrusion detection systems (IDSs). One of the attractive features of these mechanisms is that they are independent of the server and thus easily deployed. Firewalls control the flow of communication passing through them, and network IDSs detect possible attacks by monitoring the communication. While firewalls, when properly configured, serve their intended purpose, current network IDSs suffer from a number of limitations. Network IDSs typically analyze traffic on the network and either scan for patterns containing known attacks or detect statistically abnormal patterns. With the advent of traffic encryption protocols such as SSL [FKK96, DA97] and IPSEC [Atk95], a significant portion of traffic on the Internet is encrypted and therefore unavailable for examination. Also, there are well known ways to evade network IDSs [PN98].
Thus, increasingly, intrusion detection must move to the host server where the content is visible in the clear and these evasion techniques do not work. Our system, BlueBox, is such a host-based, real-time intrusion detection system, and it can also be configured to block intrusions.

To contrast our approach, we first look at mechanisms used in currently deployed host-based IDSs. They are primarily based on one of the following [DDW99, Jac99]:

Anomaly detection: Defined by a statistical profile of normal behavior [JV94, ALJ+93, FHSL96, DDNW98]. A pattern that deviates significantly from the normal profile is considered an attack.

Misuse detection: Defined by collections of signatures of known attacks [Jac99, Pax98, RLS+97, CDE+96]. Activities matching such patterns are considered attacks.

Conceptually, misuse detection is based on knowledge of bad behaviors (attacks) and anomaly detection is based on knowledge of good (normal) behaviors. If both techniques were perfect, then each would exactly complement the other: i.e., what is not bad is good and vice versa. In reality, neither technique is perfect. Misuse detection can never know all possible attacks and it usually classifies some good behaviors as attacks. Likewise, anomaly detection cannot cover all good behaviors and will mistake some attacks for good behaviors. Also, an entity's behavior profile will change as its usage pattern changes, so anomaly detection has to adapt its profile to these changes. This opens the possibility for an attacker to gradually increase its level of malicious activities until these activities are considered normal.

Our policy-driven technique, like the concept of sandboxing, tries to define the boundary between the good and the bad as a set of rules. These rules specify what an executable program or script is allowed to do, and attempts
to violate them are considered intrusions. The rules governing a process define precisely which system resources a process can access and in what way. Section 3 gives an overview of the scope of the rules. The rules are defined through a precise understanding of the expected behavior of the program. They can be defined using existing templates, audit trails, configuration and, if necessary, program semantics. The rules are specified off line, compiled into a machine-readable binary which is associated with the program, and loaded into the kernel when the program is executed. Rule enforcement happens when the program is executed in the context of a process: the behavior of the process is checked and constrained according to the rules. The enforcement is done in the kernel during invocations of system calls. The concept of sandboxing has appeared in numerous contexts including IDS, and we discuss this in Section 2.

We believe that the policy-based approach of BlueBox and like systems offers a number of advantages over the traditional attack signature based or profile based approaches. They include:

The security boundary is much more precisely defined in terms of the intended use of the sensitive system resources. Rules are based on understanding a program's behavior and not on attack signatures or time-variant, incomplete statistical profiles of normal behavior. This has two advantages: (1) unknown attacks can be detected, (2) previously unseen but legitimate behaviors would not be mistaken for attacks. Therefore the false positive and (hopefully) the false negative rates will be lower.

Another potential win is the manageability of the IDS, especially as compared to statistical profiling based techniques. There is no need to constantly maintain and update an attack signature database or statistical profiles. Since the rules are precisely defined in terms of system resources and not by attacks, there will be very few updates, if any, of rules for an application running on a particular platform.
Perhaps the most important advantage of BlueBox's policy-based approach is that detection is done in real time. Therefore there is the option to block an unauthorized access or act.

On the other hand, since the rules are defined on access to system resources, there are disadvantages as compared to other IDSs. Some of them are:

Version migration: Since different versions of applications may access different resources, every version will require modified sets of rules. However, in our experience with the Apache http server, minor version changes impact the rules very minimally. Even with major version changes, large chunks of the rule sets can be reused.

In-memory attacks: Since the checks on process behavior are made only when the process makes a system call, attacks which stay in memory cannot be detected.

The rest of the paper is organized as follows: Section 2 surveys related work and compares it with our system. Section 3 gives an overview of the specification and generation of rules. Section 4 presents the technical details of our design and implementation, the precise scope of rules and the system architecture. Section 5 presents a few examples of how BlueBox thwarts several well known attacks and also details experiences on specifying rules. Section 6 discusses the performance impact of the IDS and we conclude in Section 7.

2 Related Work

Restricting program behavior based on externally specified rules has a very long history dating back to the reference monitors of operating systems several decades ago. In this section, we highlight more recent mechanisms and compare them with our work. Some of the systems are very different from BlueBox while others are very similar.

2.1 Language Based Mechanisms

There are a large number of language-based mechanisms to restrict program behavior based on policy. They range from the theoretical program correctness methodology of using asserts, to the popular type-based mechanisms enforced by the loader such as the famed Java Virtual Machine [JVM01].
While the security guarantees promised by these mechanisms are stronger than ours, they make very strong and, in some cases, unrealistic assumptions about the trusted computing base (TCB). Some classes of such systems include the following:

2.1.1 Program Correctness Based Mechanisms

This method has been the subject of extensive research spanning decades. Recently, these mechanisms have been proposed as effective mechanisms to mitigate exposures [UES00]. While theoretically elegant, they are largely restricted to checks in user space. Hence, the TCB needed for these mechanisms to be effective is unrealistic, since all the checks inserted into the user space program must be executed. This is rarely realized in commercial operating systems: an attacker mounting a buffer overflow attack is in no way restricted by any of the checks inserted in the original program.
2.1.2 Type Based Mechanisms

The celebrated Java Virtual Machine is a classic example of a system which enforces strong checks on interpreted byte code. For this mechanism to work, one has to extend the TCB to include the interpreter and loader. In several controlled environments this is possible; however, it is not realistic, for reasons of performance, to have daemons such as the http server run in this environment.

2.2 System Call Pattern Based Systems

These systems identify intrusions by an initial training phase, where exhaustive testing is used to identify the accepted set of patterns in system call sequences, and then flag an intrusion if there are erroneous patterns in system calls made by daemons in an actual run. Some examples are discussed in [FHSL96, DDNW98]. The main advantage of these systems is the minimized impact on the kernel, i.e., one needs to make few changes to the kernel to implement them. However, they cannot offer strong security guarantees. Firstly, their efficacy requires exhaustive training to identify normal patterns, which, if not done correctly, can result in a large number of false positives. Secondly, they are very sensitive to the exact version of the monitored software: small changes in source code can yield very different system call patterns. For example, the Apache http daemon can be configured to run using processes or threads, and the system call patterns are considerably different. Since BlueBox tries to capture the resources the daemon uses, there are very few changes between the rules for the two versions.

2.3 Kernel Based Reference Monitors

In the last few years there has been a renewed interest in sandboxing by intercepting system calls made by processes. We describe some systems and highlight the similarities and differences.

2.3.1 LIDS

The Linux Intrusion Detection System (LIDS) [XB01] aims to extend the concept of capabilities present in the basic Linux system by defining fine-grained file access capabilities for each process. BlueBox's rules for file system objects are very similar to this.
The complete rule set of BlueBox is a strict superset of that of the LIDS system. Among the several additional features of BlueBox is the state information, which is useful in thwarting some attacks as described in Section 5.

2.3.2 A Program as a Finite State Machine

Sekar et al. [SU99] present a system which combines language-based systems with system call intercept based systems. Their approach is to model a process with a state diagram describing its functionality and then enforce this state diagram in the kernel during system call invocation. They achieve strong security guarantees since the state diagram captures exact process semantics. The main drawback of this system is the difficulty in generating the required state diagrams for a new process. Also, we conjecture, based on our experience in incorporating state, that the performance penalty in enforcing the rules could be somewhat high.

2.3.3 Generic Software Wrappers

Generic software wrappers [KFBK00, FBF99] are a mechanism to enforce various access control and intrusion detection checks triggered by events during process execution. The infrastructure registers various scripts to be run based on events, monitors process execution for these events, and executes the registered scripts when the events occur. This is a powerful infrastructure which can integrate numerous approaches to system security under one unifying framework. The main drawbacks of this approach are the complexity of writing scripts and the performance impact in such a complex framework. We believe that our approach is much more intuitive and has substantially better performance.

2.3.4 Other Sandboxing Systems

The system that comes closest to our system is the work of Bernaschi et al. [BGM00]. Their system architecture is very similar to ours and the main differences are in the syntax and semantics of the rules themselves. The placements of different parts of the system within the kernel are also very different.
Our placement aims to minimize impact on the kernel code by placing a wrapper around kernel system call handlers, while their placement tries to minimize performance impact. Our system is extensible to newer versions of the kernel since, by and large, the same wrapper should work for newer kernels.

The Domain and Type Enforcement (DTE) based system by Walker et al. [WSB+96] groups file system objects into sets called types and puts a subject (an executable) into a domain which has specific access rights to types. It does not provide protection for resources other than file system objects and seems to incur more complexity than BlueBox when providing fine granularity control.

2.4 User Space System Call Introspection

A valid criticism of systems such as BlueBox is the modifications to the kernel required to install and enforce process behavior rules. To circumvent this, one approach is to use existing monitoring infrastructure in kernels, such as ptrace, to have monitors which
reside in user space [Wag99, JS00]. The monitor sits in a separate process and intercepts system calls made by the monitored process using ptrace; the monitor process can then enforce the rules by examining the intercepted system calls and their parameters and possibly modifying the parameters or terminating the calls. As pointed out by the authors [JS00], this approach has a few drawbacks. Firstly, since rules are enforced in the context of the monitor process, there is some overhead due to context switching and copying data from one process's context to the other's. Also, there are cases when the monitored process is not entirely under the control of the monitor due to the implementation of ptrace.

3 BlueBox Policy Specification and Generation

Since an attack on a system must access sensitive system resources in unintended ways to be successful, a BlueBox policy defines and enforces rules controlling a process's access to system resources, thus thwarting unintended access. We categorize system resources and the types of access to them in Table 1. Features of our current rule specification include:

Access permissions to file system objects.
Access to the file system, e.g., mount, unmount.
Permitted uid and gid transitions.
Signals which can be sent, received, blocked, ignored and handled.
Process characteristics, such as scheduling priorities, which can be modified.
Elementary controls for other system resources such as IPC objects, sockets and ioctl calls. This is an area that requires more study for more comprehensive rules.

To make the policy specification expressive, we provide an allowed system calls list as a coarse level of control that is effective in thwarting a number of attacks. Since system resources must be accessed through system calls, disallowing invocations of a system call disallows access to resources. For instance, most server processes don't need to mount or unmount file systems, so mount and unmount are not in their allowed lists, and an invocation of either will be considered an intrusion regardless of the invocation's parameters.
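As a rough illustration of the allowed system calls list, the following Python sketch stores the list as a bit mask and treats any call outside it as an intrusion. It is a minimal sketch under stated assumptions, not BlueBox's kernel code: the 32-bit word size, the helper names (make_mask, is_allowed, check_syscall) and the small call-number table are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: an allowed-system-calls list kept as a bit mask.
WORD_BITS = 32  # assumed width of each unsigned integer in the mask

# Illustrative call numbers only; a real table would come from the kernel headers.
SYSCALL_NR = {"read": 3, "write": 4, "open": 5, "mount": 21, "unmount": 22}


def make_mask(allowed_numbers, n_words=8):
    """Build a bit mask (list of words) with one bit set per allowed call."""
    mask = [0] * n_words
    for nr in allowed_numbers:
        # Bit 0 is the LSB of the first word, as in the paper's description.
        mask[nr // WORD_BITS] |= 1 << (nr % WORD_BITS)
    return mask


def is_allowed(mask, nr):
    """Return True if call number nr has its bit set in the mask."""
    word = nr // WORD_BITS
    if nr < 0 or word >= len(mask):
        return False
    return ((mask[word] >> (nr % WORD_BITS)) & 1) == 1


def check_syscall(mask, name):
    """Coarse check: the call's parameters are never inspected here."""
    return "allow" if is_allowed(mask, SYSCALL_NR[name]) else "intrusion"


# A server policy that never needs mount or unmount simply omits them.
server_mask = make_mask(SYSCALL_NR[n] for n in ("read", "write", "open"))
print(check_syscall(server_mask, "open"))   # allow
print(check_syscall(server_mask, "mount"))  # intrusion
```

The point of the coarse check is that a disallowed call is rejected before any per-resource rule is consulted, which keeps the common rejection path cheap.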
We have identified 72 harmless system calls, each of which either has no security implications or is not supported by the Linux 2.2.14 kernel. These calls are listed in Appendix A. The policy for a program can also be marked inheritable: this is useful for a script, where each program executed by the script can share the script's policy.

Based on our experience, for a given program there are several mechanisms and tools one could use to build and specify the rules.

Intended semantics: The most comprehensive way to generate the correct rules for a program is by looking through the intended semantics of the program. While this can be daunting for big servers such as Apache, we have found that for several cgi-bin scripts this is the easiest way to capture rules, since these scripts typically access few resources.

Configuration: For servers such as Apache which can be configured to run in different ways, configuration files need to be used (either manually or automatically) to create rules.

Audit trails: A very straightforward mechanism to generate large chunks of the rules is to inspect system call audit trails. For a number of servers and scripts we have found this to be the simplest method.

Existing templates: For large and popular servers such as the Apache httpd, we envision existing rule templates which can automatically be customized to new installations. Our reference server is the Apache httpd, for which we have developed a template. We are currently investigating rule templates for large application servers and hope to include rule templates for the most common configurations of application servers such as the IBM WebSphere [WEB01].

While these mechanisms sound daunting for nontrivial programs, as we discuss in Section 5, we believe that the amount of extra work is manageable. For our prototypical application of web servers, most of the rules need to be done once, with little customization for new servers.
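The audit trail mechanism described above can be sketched as a simple aggregation step. The Python fragment below is only an illustration: the record format (operation, absolute path) and the function name rules_from_audit are assumptions made here, and the permission letters follow the legend of Figure 2 (r, w, x, c, a). A real audit trail would need parsing and wildcard generalization before it became a usable rule set.

```python
from collections import defaultdict

# Assumed mapping from an observed operation to a Figure 2 permission letter.
OP_TO_PERM = {"read": "r", "write": "w", "execute": "x",
              "create": "c", "append": "a"}
PERM_ORDER = "rwxca"  # fixed order so the output is stable


def rules_from_audit(records):
    """Collect, per path, the set of permissions seen in the audit records.

    records: iterable of (operation, absolute_path) tuples (hypothetical format).
    Returns a dict mapping each path to a permission string.
    """
    seen = defaultdict(set)
    for operation, path in records:
        seen[path].add(OP_TO_PERM[operation])
    return {
        path: "".join(p for p in PERM_ORDER if p in granted)
        for path, granted in sorted(seen.items())
    }


trail = [
    ("read", "/etc/hosts"),
    ("create", "/usr/local/apache2/logs/access_log"),
    ("write", "/usr/local/apache2/logs/access_log"),
    ("append", "/usr/local/apache2/logs/access_log"),
    ("read", "/etc/hosts"),
]
for path, perms in rules_from_audit(trail).items():
    print(path, perms)
```

Rules derived this way cover only the behavior that was actually exercised, so they would normally be reviewed against the program's configuration and intended semantics.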
4 Technical Details

In this section we first discuss the BlueBox system architecture to show how a policy is defined and enforced, then we discuss policy specification in detail, and conclude with a discussion of BlueBox's impact on the kernel.

4.1 System Architecture

The BlueBox system architecture is shown in Figure 1. The architecture includes two parts:

Policy Specification and Parsing: A BlueBox policy for an executable program is specified in a human-readable form using a text editor and then parsed into a binary file by a parser program. This part is done off line and before the program is executed. Details are in Section 4.2.
Table 1. Types of resources and access

Resources | Types of access
File system objects | create, open, read, write, execute, removal, link to, change of access permissions, change of ownership
File systems | mount, unmount, types of mount
Identities | acquire, release, inherit
Processes (address spaces, signals, ...) | read, write, deliver
CPU cycles, process scheduling priority | raise
System clock | set, read
System/kernel memory | read, write
IPC objects: pipes, semaphores, message queues, shared memory, ... | create, open (attach), read, write
Devices, network | create/attach, open, read, write, io control, removal, link to, change of access permissions, change of ownership
Privileges | acquire, release, raise, lower

Policy Loading and Enforcement: Since BlueBox policies are meant to control access to system resources which can only be accessed through system calls, the natural place for rule enforcement is at the kernel system call entry point. Our prototype on Linux 2.2.14 places an enforcer module at the kernel system call entry point to enforce rules. The enforcer has built-in knowledge of what categories of resources each call may access, so it can check the parameters of the invoked system call against the rules.

Since it is impractical to write policies for all processes on a system, we added a new system call to mark a process as being "monitored"; this status will be passed on to its children and cannot be unmarked. As a tool, we have a simple wrapper program which marks itself as monitored and then execve's the real program to pass on the monitored status to the new process image. When loading the new image, the modified execve system call handler (footnote 1) loads the rules into the kernel and starts enforcement. If no rules for the new image are found, then the process will try to inherit and share the rules of the old image; if these rules are not inheritable or do not exist, then the process will be crippled, i.e., it is only allowed to make harmless system calls. Rules are read-only after being loaded. Each monitored process is allocated a kernel memory buffer (footnote 2) to hold its private BlueBox state, which can change as the process executes. More discussion on BlueBox process state is given in Section 4.3. When a process forks, the child process shares the parent's policy but will be given a copy of the parent's BlueBox state. A process's BlueBox state will be reset when it execve's a program.

Footnote 1: The API for execve is not changed.
Footnote 2: At present, the size of the buffer is one page or [...] bytes.

[Figure 1. BlueBox Architecture. Off-line: human-readable rules, rule parser, binary rule file. On-line: process (user space); rule enforcer, system call handler and binary rule (kernel). The binary rule is loaded into the kernel at execve.]

4.2 Rules for Different Types of Resources

In this section we discuss rules for three types of resources, namely file system objects, uid/gid lists and signals; each has particular syntax and semantics. We believe the syntax and semantics discussed here can represent most, if not all, BlueBox rules.

4.2.1 Rules for File System Objects

Rules on file system objects are encoded as a tree which mimics the hierarchy of files on a UNIX system. The policy of a program includes one such tree encoding the program's access rights to file system objects. Figure 2 shows a part of the specification of rules on file system objects for the Apache 2.0 HTTP Server. Each node in the tree records access rights to a (set of) file system object(s). The root of a tree corresponds to the root of the hierarchy of files. Like a UNIX file system, each node has a name. Unlike a UNIX file system, the name can contain the UNIX shell-like wildcard characters * and ?, with the same interpretation as in a UNIX shell. The only exception is that a leaf node with the name [...] represents an entire subtree; for example, [...] matches any file in the subtree under [...]. Limited support for character classes is also provided (footnote 3). A node's name can also contain environment variables, and these are evaluated when the policy is being loaded into the kernel. For example, if a rule is [...] and the value of [...] is joe, then the process will have read access to all HTML files under [...].

When a process makes a system call to access a file system object, the object's absolute pathname is matched against the tree. If a path in the tree matches the object's pathname, then the access rights in the last node of the path determine if this invocation of the system call is allowed. Besides the usual read, write, execute, create and append, access rights to file system objects also include: delete, hard link to, soft link to, shared lock, exclusive lock, truncate. There are also rights related to directories used as file system mount points: (a) mount point: a directory can be a mount point, (b) unmount: a file system mounted on a directory can be unmounted; and rights related to swapping devices: (c) swapon: a device can be a swapping device, (d) swapoff: a device can be released from being a swapping device. A node in the tree may also be associated with a list of uids and a list of gids (see Section 4.2.2). These lists are the allowed new user and group ownerships for file system objects matching the node.

Footnote 3: Character ranges (e.g., [...]) and the [...] character are not allowed in a character class.

4.2.2 Rules for Identities

Rules on identities (uids or gids) are encoded as lists of singular integers and ranges (footnote 4) such as [...]. The basic operation on such a list is to check if a specific integral value is in it. Each program's policy has a uid list and a gid list. These lists are the new identities a process running the program is allowed to assume.
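The pathname matching of Section 4.2.1 can be sketched in a few lines of Python. This is a hedged illustration, not the BlueBox matcher: rules are kept as a flat list rather than a tree, the first matching rule is assumed to win, the subtree marker "**" is a hypothetical stand-in for the leaf name that did not survive in the text, and Python's fnmatchcase only approximates shell wildcard semantics (for instance, it lets * match a leading dot). The path is assumed to be an already canonical absolute pathname.

```python
from fnmatch import fnmatchcase

SUBTREE = "**"  # hypothetical marker for "the entire subtree below this node"


def path_matches(pattern, path):
    """Match an absolute path against a rule pattern, one component at a time.

    Wildcards are applied per component, so '*' never crosses a '/'.
    """
    pat_parts = pattern.strip("/").split("/")
    path_parts = path.strip("/").split("/")
    for i, pat in enumerate(pat_parts):
        if pat == SUBTREE and i == len(pat_parts) - 1:
            # A trailing subtree marker matches anything strictly below the prefix.
            return len(path_parts) > i
        if i >= len(path_parts) or not fnmatchcase(path_parts[i], pat):
            return False
    return len(pat_parts) == len(path_parts)


def access_allowed(rules, path, right):
    """rules: list of (pattern, rights-string); the first matching rule decides."""
    for pattern, rights in rules:
        if path_matches(pattern, path):
            return right in rights
    return False  # no rule matches: access is not granted


apache_rules = [
    ("/usr/local/apache2/htdocs/*.html", "r"),
    ("/usr/local/apache2/cgi-bin/*", "x"),
    ("/usr/local/apache2/logs/**", "wca"),
]
print(access_allowed(apache_rules, "/usr/local/apache2/htdocs/index.html", "r"))
print(access_allowed(apache_rules, "/etc/shadow", "r"))
```

Matching component by component mirrors the tree encoding described in the text: each pattern component corresponds to one node name, and the rights attached to the last matched node decide the outcome.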
A process has three types of identities: real, effective and saved [MBKQ96]. Since a process can freely exchange the values of different types of ids or assign one to the other, the BlueBox enforcer does not make a distinction among the three types of ids when checking the rules. In other words, when a system call requests new uids or gids, the enforcer only allows one of the following two cases:

1. the uids/gids are in the set of uids/gids which the process already has, or
2. the uids/gids are in the process's uid/gid list and the following condition is met: if the process's [...] has gone through the transition [...] and asks to change its [...] to [...], then [...] equals [...]. This condition is meant to prevent an attacker from hopping over different uids.

An integer list can also represent rules on system resources with integral values, such as scheduling priorities.

Footnote 4: It may contain non-negative and negative integers; e.g., uids could be negative or non-negative.
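A small Python sketch may help make the integer list and the two permitted cases concrete. It is an illustration under assumptions: the list is represented as Python integers and (low, high) tuples, the function names are invented here, and the additional transition condition of case 2 is deliberately not modeled because its details are not recoverable from the text.

```python
def in_list(int_list, value):
    """Membership test for a list of single integers and inclusive (low, high) ranges."""
    for item in int_list:
        if isinstance(item, tuple):
            low, high = item
            if low <= value <= high:
                return True
        elif item == value:
            return True
    return False


def id_change_allowed(current_ids, id_list, requested):
    """Decide whether a request for new uids (or gids) is permitted.

    current_ids: ids the process already holds; real, effective and saved ids
                 are pooled, since the enforcer does not distinguish them.
    id_list:     the policy's uid (or gid) list.
    requested:   the ids named in the system call.
    """
    # Case 1: every requested id is one the process already has.
    if all(r in current_ids for r in requested):
        return True
    # Case 2: every requested id is in the policy list.
    # (The extra transition condition from the text is not modeled here.)
    return all(in_list(id_list, r) for r in requested)


uid_list = [48, (1000, 1999)]   # hypothetical policy: one uid plus a range
held = {0, 1001}                # ids currently held by the process
print(id_change_allowed(held, uid_list, [1500]))  # True
print(id_change_allowed(held, uid_list, [2]))     # False
```

Whether the two cases apply to the request as a whole or to each id separately is not stated; the sketch assumes the former.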
pathname                                   access permissions   creation mode
/* r: read, w: write, x: execute, c: create, a: append */
/* shared libraries */
/etc/ld.so.*                               r
/lib/*                                     r
/* system configuration files */
/etc/host.conf                             r
/etc/hosts                                 r
/etc/passwd                                r
/etc/group                                 r
/etc/resolv.conf                           r
/* Apache files */
/usr/local/apache2/conf/*                  r
/usr/local/apache2/htdocs/*.html           r
/usr/local/apache2/logs/error_log          wca                  666
/usr/local/apache2/logs/access_log         wca                  666
/usr/local/apache2/logs/referer_log        wca                  666
/usr/local/apache2/logs/agent_log          wca                  666
/usr/local/apache2/logs/httpd.pid          wc                   644
/usr/local/apache2/cgi-bin/*               x

Figure 2. Partial rules for Apache file access

4.2.3 Rules for Signals

Rules for signals are encoded as a bit mask (footnote 5), which is an array of unsigned integers used as bit vectors and represents the set of non-negative integers whose corresponding bits are set. Bits in a bit mask are numbered sequentially, starting from the LSB of the first integer, numbered zero, to the MSB of the last integer. Unlike an integer list, set operations can be easily performed on bit masks.

For rules on handling received signals, BlueBox puts signals into four subsets: (1) those that can be blocked (CBB), (2) those that can be ignored (CBI), (3) those that can be default (CBD): their handlers can be the default handlers, (4) those that can be handled (CBH): their handlers can be assigned by the process. These subsets can intersect in any possible way. Since a UNIX/Linux system does not support other types of treatment for received signals, if a signal is in only one subset, then "can be" becomes "must be". For example, signals that are only in the CBB subset are signals that must be blocked. Besides maintaining four bit masks for the four "can be" subsets, BlueBox also computes and maintains the "must be blocked" subset for performance reasons. An array of pointers to handlers for the CBH subset is also maintained; Section 4.3 gives more details on this array.

Footnote 5: Bit masks are also used to encode the allowed system calls list.

4.3 Per Process State

Incorporating process state into rules can protect a process against a much larger number of potential attacks.
Several daemons, especially setuid programs, start out with the real uid as root, setting only the effective uid to that of a user, while retaining the possibility of acquiring root state to do privileged operations. If such a daemon is subverted, the attacker can then reacquire root privileges. One such example is described in the attack on the wu-ftpd daemon in Section 5. Incorporating state into the system call checks impacts performance, as process state needs to be updated and checked. We have chosen to keep a small amount of process state so as to minimize the performance impact. Our guiding principle is to add state only when absolutely necessary. Parts of the state we maintain are:

Identity state: The main state component we maintain is the current process identity state. The states we note
are the initial root state, the user state, and the reroot state when the process becomes root again. For each state, there is a separate edition of the rules dictating which system calls are allowed, but all states share the same set of file system access permissions. Daemons typically switch back to root state only for a short while to do a few privileged operations, and this can be effectively controlled by just changing the allowed system calls.

System call count: Another process state component is the number of times certain system calls are made. Currently, this is enabled only for the fork and waitpid system calls. For each call we keep the current count and the maximum allowed. This component is useful in two situations. First, we can use it to stop denial of service (DOS) attacks which repeatedly consume system resources via system calls: e.g., an attacker could repeatedly fork child processes. The second situation where this might be useful is in controlling scripts which execute arbitrary shell commands. Since the shell script forks processes to execute different commands, this can control the number of commands the process can execute. While this by itself does not offer more security, it does so in combination with other rules.

Signal handlers: Another DOS attack is to have signals handled incorrectly, resulting in errant process behavior. This can be done by registering a wrong signal handler. Since there is no way for the IDS to identify the correct signal handler, it assumes that the first handler registered is the right handler and does not permit any change to it.

Our philosophy on adding state to the rules is that we add state only when there is substantial benefit to be gained, either in strengthening security guarantees or in making it easier to specify rules for a particular process. We note that our process state is substantially smaller than that of the system proposed by Sekar et al. [SU99].

4.4 Kernel Impact

A very important design criterion for our system was to minimize the impact on the kernel. The placement of functionality has been carefully done to reduce impact on the kernel.
Our reference intrusion avoidance implementation on Linux has an intercept at the system call entry point, and minor hooks in the kernel code for process creation and termination (the fork, execve and exit system calls). The total impact on the kernel sources is limited to about 10 lines of assembly and 20 lines of C code. The rest of the enforcement process and the code to parse, allocate memory for and install rules are in a completely independent module. The patches to the kernel are very simple and do not change the semantics of the remaining code, nor do they interfere with other parts of the system. A very valid concern is the portability of BlueBox across different versions of the kernel: we believe that the points in the Linux kernel which we have intercepted are very stable and unlikely to change in revisions of the kernel.

On Linux, where it is easier to allocate memory as pages, each process usually needs no more than 2 pages (8K) to store all IDS related structures. Of course, we use only a smaller subset of this depending on rule size etc. Substantial portions of the rules are shared by processes and any child thread/process that they spawn. This can be reduced with elementary optimization.

5 Examples

In this section we illustrate how our framework can be effectively used to thwart well-known attacks. The examples also illustrate how rules for various processes can be defined.

5.1 Phf cgi-bin with Apache

The phf cgi-bin script was a sample script which came with the earlier distributions of Apache as an example of how cgi-bin scripts could be written. Figure 3 shows the relevant parts of the code for the phf script.

    /* transform http request into options */
    /* remove shell characters from options */
    escape_shell_command("/sbin/ph options");
    popen("/sbin/ph options", "r");

Figure 3. The PHF cgi-bin script

The script first syntactically transforms the incoming http request into a list of options for a fictional program ph and then spawns (using popen) a shell to execute ph with the created options.
The escape_shell_cmd subroutine escapes shell characters which may be present in the options string. The fatal bug was that it did not escape the newline (\n) character: the attack simply ensured that an arbitrary command was executed by passing the new command after a newline character in the options.

This is a good example of how straightforward it is to write effective rules. By design, the script invokes two commands: /bin/sh (through the popen library call) and the program /sbin/ph. Thus a very natural set of rules is to allow read and execute access to these files. Besides shared
libraries, the process accesses no other objects. Marking these rules as inheritable ensures that the process which executes /bin/sh can only execute these two programs, and the attack is thwarted. Note that the process can execute these as many times as it wants.

5.2 Buffer Overflow in wwwcount

The wwwcount program is a popular cgi program which maintains a count of the number of hits on a website and displays this in a graphical form. It is widely used, although on non-sensitive web sites. The earlier versions of the program suffer from a well known buffer overflow attack which can be used to execute an arbitrary program on the web site. It is almost trivial to define the rules for this script. From the definition, or from an inspection of the system call audit trace for this process, we can derive the proper file accesses: these are all restricted to a single directory based on the initial configuration of the program. No executable is in the rules; in fact, the execve system call is not in the allowed system call list.

5.3 wu-ftpd Buffer Overflow

This example illustrates how to use the state maintenance part of our system to enforce sophisticated checks. wu-ftpd is the ftp daemon developed at Washington University in St. Louis and is one of the more popular ftp daemons in use today. There have been a number of attempts to model the behavior of the daemon to detect intrusions [SU99]. At a very high level, the ftp daemon starts running as root, waits for a user to log in by authentication and sets its effective uid to that of the user. For the rest of the session, the daemon has as its effective uid that of the authenticated user. It is thus in an unprivileged state, except when it needs to bind sockets to the well known ftp data port. Since this is a privileged port, the bind operation can only be done in privileged state, so the daemon becomes root again. The only system calls made by the daemon in this state are socket, bind and setuid to user. Figure 4 describes this state diagram of the ftp daemon.
From this functional description we can easily identify one portion of the rules for the ftp daemon. In the initial state it starts as root and is permitted to make most of the system calls; in the second state it has a nonzero uid and is permitted, among others, the setuid system call to become root again. In the third state the daemon is only allowed to execute the socket, bind and setuid to user system calls. Note that this is only a subset of the entire rule set, and it illustrates how the rules thwart a well known attack. This subset of the rules is shown in Figure 5. The earlier versions of this daemon were susceptible to an attack where a regular user authenticated and overflowed the process heap [WUF]. Then arbitrary code could be executed in the reroot state, e.g., to spawn a root shell on the server. Using the subset of the BlueBox rules described above, we can mitigate the damage due to this attack. The only system calls the attacker can execute in the reroot state are socket, bind and setuid to user; the attacker has no potential access to file system objects, i.e., all other sensitive system calls are disallowed. Although there is no way in the kernel to distinguish the normal setting of uid to root by the ftp daemon from the user state and the attacker setting uid to root after the buffer overflow, this is the best protection one can expect.

The examples that we have described in this section highlight several important features of the semantics of the rules in our system. They also illustrate the security guarantees the system can provide. For instance, in the case of the phf attack, the system guarantees that the only executables are /bin/sh and /sbin/ph. However, the attack can make the system endlessly execute these binaries, resulting in a denial of service. In the ftpd example, we are unable to detect that the buffer overflowed, yet we are able to substantially mitigate the damage that the attacker can do.
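The guarantee for the phf example can be illustrated with a short sketch. It is written in Python for readability and is not the BlueBox rule language: FileRule, ProcessRules, exec_image and PHF_RULES are hypothetical names, shared libraries are omitted, and the assumption that a new program image keeps exactly the rules marked as inherited is our reading of the inheritance semantics described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRule:
    path: str
    perms: frozenset          # subset of {"read", "write", "execute"}
    inherited: bool = False   # assumed to survive an execve


class ProcessRules:
    """File access rules attached to one process."""

    def __init__(self, rules):
        self.rules = {r.path: r for r in rules}

    def permits(self, path, perm):
        rule = self.rules.get(path)
        return rule is not None and perm in rule.perms


def exec_image(current, path):
    """Check an execve against the rules and return the rules of the new image.

    Assumption: the new image keeps only the inherited rules of its parent.
    """
    if not current.permits(path, "execute"):
        raise PermissionError(f"execve({path!r}) denied by rules")
    return ProcessRules([r for r in current.rules.values() if r.inherited])


# Rules for the phf script: read and execute on the two programs it invokes.
PHF_RULES = ProcessRules([
    FileRule("/bin/sh", frozenset({"read", "execute"}), inherited=True),
    FileRule("/sbin/ph", frozenset({"read", "execute"}), inherited=True),
])
```

Under these assumptions, exec_image(PHF_RULES, "/bin/sh") yields a shell that may still execute /bin/sh or /sbin/ph any number of times, which matches the denial of service caveat, while an attempt to execute any other program raises PermissionError.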
Another important feature is that the rules for a large number of programs are very easy to write and can potentially be derived from a single examination of the audit trail. Even in the more sophisticated example of the ftp daemon, we believe our approach is substantially simpler than the state diagram based approach advocated by [SU99].

6 Performance

One of the main design guidelines for BlueBox is to minimize the performance impact. Crucial design decisions about how much state to incorporate into the rules were driven primarily by how much it impacts the performance of the process being monitored. The prototypical application we use to measure the performance is the Apache 2.0 web server daemon. The results for this daemon are representative as it exercises most of the checks implemented for the various system calls. In fact, many of the compute intensive system call checks, such as open, read and fcntl, are used substantially. Other processes will typically use fewer such calls and hence the performance impact on the Apache httpd daemon will be an upper bound.

6.1 Testbed

Our tests ran the WebStone benchmark of server performance with the following parameters: there is a single client machine generating load, and it has between three and eight threads generating requests for the server. These were chosen such that the resulting load does not saturate the server with or without BlueBox. The load generated by the clients is entirely static content. Testing under dynamic content would result in a larger penalty due to the overhead of loading rules for each script that is invoked. Both the
WebStone client and the Apache server were put on a gigabit ethernet to ensure that no effects of large network latencies were observed in the results.

Figure 4. State diagram of the wu-ftp daemon. States: Initial Root State, Nonzero user id, Reroot State. Transitions: user authentication (Initial Root State to Nonzero user id), reroot for bind (Nonzero user id to Reroot State), after bind (Reroot State to Nonzero user id).

Figure 5. Subset of the state dependent rules for the ftp daemon. Initial Root State: link to libraries, open sockets, read /etc/passwd, setuid to user, etc. Nonzero user id: access user files, setuid to root, etc. Reroot State: only the system calls bind, socket and setuid (back to user).

6.2 Test Results

Figure 6 shows the performance of the Apache 2.0 web server with and without BlueBox under various server load factors. We anticipate a % performance penalty for the Apache 2.0 server running on the Linux 2.2.14 kernel.

6.3 Bottlenecks

The main performance bottleneck in enforcing the system call checks for the Apache server is pathname resolution. For each request, the Apache server opens a file and then uses sendfile to send it over the socket. For each request, we perform a full name resolution operation to match the right file name with a node on the tree of file system object rules, which eliminates security holes. This can be additionally optimized by caching and marking certain names as fully resolved. Another way to reduce this overhead of name resolution is to have mandatory access control type labels [DoD85, SEL] on file system objects and move the check entirely to the file system, i.e., the file system will check the labels for permission before it opens the file.

The results shown in Figure 6 were generated entirely using static content. Dynamic content requires the server to load another process and thus load the rules for this new process, which adds to the performance penalty. This can be somewhat mitigated by caching the data structures representing rules for frequently used cgi-bin scripts. We are in the process of implementing this in the BlueBox implementation on Linux.
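The name resolution check and the caching optimization discussed above can be sketched as follows. This is a user-space illustration in Python, not the kernel implementation: the RULES table, its paths, and the functions permissions_for and check_open are hypothetical, and posixpath.normpath only normalizes the name lexically, whereas a real resolution would also have to follow symbolic links.

```python
import posixpath
from functools import lru_cache

# Hypothetical rule tree: the deepest matching directory decides the permissions.
RULES = {
    "/": frozenset(),
    "/var/www": frozenset({"read"}),
    "/var/www/private": frozenset(),
}


@lru_cache(maxsize=1024)
def permissions_for(resolved_path):
    """Longest-prefix match of a fully resolved path against the rule tree."""
    node = resolved_path
    while True:
        if node in RULES:
            return RULES[node]
        parent = posixpath.dirname(node)
        if parent == node:      # reached the top without finding a rule: deny
            return frozenset()
        node = parent


def check_open(cwd, name, perm="read"):
    """Resolve the name first, then consult the (cached) rule lookup."""
    resolved = posixpath.normpath(posixpath.join(cwd, name))
    return perm in permissions_for(resolved)
```

Resolving before matching is what keeps a name such as ../../etc/passwd from slipping past a rule for /var/www. The cache is keyed on the resolved name, so a repeated request for the same file skips the tree walk; a real implementation would also need to invalidate entries when the file system changes.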
Using these optimizations, we expect that the performance penalty for the Apache daemon will be close to 5%. We believe that this penalty is not excessive given the security guarantees one can obtain using this system.

7 Conclusion

We have presented BlueBox, a simple system for sandboxing applications which can substantially mitigate security exposures of processes. We believe that our system is a simple and comprehensive way to incorporate checks on the execution of programs at the time of invocation of system calls. We have described rules for important servers such as the Apache daemon, and a number of popular cgi-bin scripts; these rules can be used as templates across installations, with new rules written for the individual scripts. Our rule syntax and semantics are simple and yet quite effective in catching a large number of known attacks. Since
performance has been a motivating factor in our design, we have achieved our security guarantees with minimal impact on the performance. On a much larger scale, we believe that much more effective security can be achieved by integrating the attack signature based systems, statistical profile based systems and sandboxing systems such as the one described in this paper. As depicted in Figure 7, the signature based approach detects attacks from the outside, the statistical profile approach detects anomalies inside, and the sandboxing approach stops attacks on the boundary.

Figure 6. Performance of the Apache 2.0 with and without system call checks. Plot title: Performance comparison under WebStone 2.5 load. Series: Apache 2.0 on Linux 2.2; Apache 2.0 on Linux 2.2 + IDS. Axes: CPU utilization (55 to 95) and Connections/sec (700 to 1000).

8 Acknowledgements

This work has benefited substantially from discussions with a large number of people. In particular, we would like to acknowledge the contributions of Pankaj Rohatgi, Josyula R. Rao, David Safford and Douglas Schales of IBM Research, and Hervé Debar, who was at IBM Zurich Research Lab.

References

[ALJ 93] Debra Anderson, Teresa F. Lunt, Harold Javitz, Ann Tamaru, and Alfonso Valdes. SAFEGUARD FINAL REPORT: Detecting Unusual Program Behavior Using the NIDES Statistical Component. Technical report, Computer Science Laboratory, SRI International, Menlo Park, California, USA, December 1993.
[Atk95] Randall Atkinson. Security Architecture for the Internet Protocol. Internet RFC 1825, August 1995.
[BGM00] M. Bernaschi, E. Gabrielli, and L. Mancini. Enhancements to the Linux Kernel for Blocking Buffer Overflow Based Attacks, http://www.iac.m.cn.it/ newweb/tecno/papes/bufovep, August 2000.
[CDE 96] Mark Crosbie, Bryn Dole, Todd Ellis, Ivan Krsul, and Eugene Spafford. IDIOT Users Guide. Technical Report CSD-TR-96-050, COAST Laboratory, Dept. of Computer Sciences, Purdue University, September 1996.
[DA97] Tim Dierks and Christopher Allen. The TLS Protocol Version 1.0. IETF draft-ietf-tls-protocol-02.txt, March 1997.
[DDNW98] Hervé Debar, Marc Dacier, Mehdi Nassehi, and Andreas Wespi. Fixed vs. Variable Length Patterns for Detecting Suspicious Process Behavior, Research Report, No. RZ3012, IBM Research Division, Zurich Research Lab, April 1998.
[DDW99] Hervé Debar, Marc Dacier, and Andreas Wespi. Towards a taxonomy of intrusion detection systems. Computer Networks, 31, 1999.
[DoD85] US Department of Defense Trusted Computer System Evaluation Criteria, DOD 5200.28-STD,
http://www.radium.ncsc.mil/tpep/library/rainbow/index.html, December 1985.

Figure 7. Integrated Defense. Diagram labels: Integrated Defense System; attack signature detection; sandboxing; anomaly detection.

[FBF99] Timothy Fraser, Lee Badger, and Mark Feldman. Hardening COTS software with generic software wrappers. In Proceedings of the IEEE Symposium on Security and Privacy, 1999.
[FHSL96] Stephanie Forrest, Steven A. Hofmeyr, Anil Somayaji, and Thomas A. Longstaff. A Sense of Self for UNIX Processes. In IEEE Symposium on Security and Privacy, 1996.
[FKK96] Alan O. Freier, Philip Karlton, and Paul C. Kocher. The SSL Protocol Version 3.0. IETF draft-ietf-tls-ssl-version3-00.txt, November 1996.
[Jac99] Kathleen A. Jackson. INTRUSION DETECTION SYSTEM (IDS) Product Review, IBM internal confidential document, IBM Research Division, Zurich Research Lab, April 1999.
[JS00] K. Jain and R. Sekar. User-Level Infrastructure for System Call Interposition: A Platform for Intrusion Detection and Confinement. In Proceedings of the Network and Distributed Systems Security Symposium, February 2000.
[JV94] Hal Javitz and Alfonso Valdes. The NIDES statistical component description and justification. Technical report, Computer Science Laboratory, SRI International, Menlo Park, California, USA, March 1994.
[JVM01] The Java Virtual Machine, http://www.javasoft.com, 2001.
[KFBK00] Calvin Ko, Timothy Fraser, Lee Badger, and Douglas Kilpatrick. Detecting and Countering System Intrusions Using Software Wrappers. In Proceedings of the 9th USENIX Security Symposium, August 2000.
[MBKQ96] Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman. The Design and Implementation of the 4.4 BSD Operating System, pages 67, 540. Addison Wesley, New York City, New York, USA, 1996.
[Pax98] Vern Paxson. Bro: A System for Detecting Network Intruders in Real Time. In the 7th USENIX Security Symposium, 1998.
[PN98] Thomas H. Ptacek and Timothy N. Newsham. Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection, http://www.nai.com, January 1998.
[RLS 97] Marcus J. Ranum, Kent Landfield, Mike Stolarchuk, Mark Sienkiewicz, Andrew Lambeth, and Eric Wall. Implementing a Generalized Tool for Network Monitoring. In the 11th USENIX Systems Administrator Conference, 1997.
[SEL] Security Enhanced Linux. Available online at http://www.nsa.gov/selinux/.
[SU99] R. Sekar and P. Uppuluri. Synthesizing Fast Intrusion Detection Systems from High-Level Specifications. In the 8th USENIX Security Symposium, pages 63-78, August 1999.
[UES00] Úlfar Erlingsson and Fred B. Schneider. IRM enforcement of Java stack inspection. In IEEE Symposium on Security and Privacy, 2000.
[Wag99] David A. Wagner. Janus: an approach for confinement of untrusted applications. Technical Report CSD-99-1056, University of California at Berkeley, August 1999.
[WEB01] WebSphere V4.0 Advanced Edition Handbook. Online at http://www.redbooks.ibm.com/redpieces/pdfs/sg246176.pdf, November 2001.
[WSB 96] Kenneth M. Walker, Daniel F. Sterne, M. Lee Badger, Michael J. Petkac, David L. Sherman, and Karen A. Oostendorp. Confining Root Programs with Domain and Type Enforcement. In the 6th USENIX Security Symposium, July 1996.
[WUF] Source code to exploit the heap overflow in wu-ftpd. Online at http://olive.efi.h/ cv/ secuity/bugs/munixes/ wuftpd15.html.
[XB01] Huagang Xie and Philippe Biondi. The Linux Intrusion Detection Project, http://www.lids.org, 2001.

Appendix A Harmless System Calls

Each of these system calls either has no security implications or is not supported by the Linux 2.2.14 kernel.
afs_syscall             alarm                   break                   brk
capget                  chdir                   fchdir                  fdatasync
fstat                   fstat64                 fstatfs                 fsync
ftime                   get_kernel_syms         getcwd                  getdents
getegid                 geteuid                 getgid                  getgroups
getitimer               getpgid                 getpgrp                 getpid
getppid                 getpriority             getresgid               getresuid
getrlimit               getrusage               getsid                  gettimeofday
getuid                  gtty                    lock                    lstat
lstat64                 mpx                     msync                   nanosleep
_newselect              oldfstat                oldlstat                oldolduname
olduname                poll                    prof                    query_module
readlink                sched_get_priority_max  sched_get_priority_min  sched_getparam
sched_getscheduler      sched_rr_get_interval   sched_yield             setitimer
sgetmask                stat                    stat64                  statfs
stty                    sysfs                   sysinfo                 syslog
time                    times                   uname                   ustat
vfork                   vhangup                 vm86                    wait4

Table 2. Harmless System Calls