INTRUSION PREVENTION AND EXPERT SYSTEMS

By Avi Chesla, avic@v-secure.com

Introduction

Over the past few years, the market has developed new expectations of the security industry, especially of the intrusion detection systems industry. One of the most challenging expectations is that intrusion detection products be able not only to detect attacks, but also to prevent them in real time. This demand forces systems to be more independent of the human factor: operations that were usually conducted by the security expert must now be performed automatically by the systems themselves. The systems the market is seeking are called intrusion prevention systems.

The motivation behind the market's demand to transition from intrusion detection to intrusion prevention rests on two foundations:

1. The growth in the sophistication and frequency of attacks over the last few years.
2. More and more organizations depend closely on the Internet in order to conduct profitable business.

Together these factors result in a demand for products with stronger processing power and faster response to attacks that threaten Internet connectivity and application integrity. In most cases, the human security expert cannot respond within the required time. Intrusion prevention systems are needed in order to respond accurately and in a timely manner and thus meet the demands of the market. When analyzing which technologies will best fit the market's expectations, it becomes clear that developing an intrusion prevention system involves an integration of advanced behavioral analysis technologies.

This article begins with a short explanation of the motivation behind the market's demand for intrusion prevention systems. It then explains the difficulties that this demand raises from a technological point of view.
We then go on to explain why any intrusion prevention system, in order to be effective, must take over some responsibilities that were previously in the hands of the human security experts. The article includes a general explanation of the human brain's assessment methods and how the security expert uses these methods to assess a communication as an attack, suspicious activity, or normal Internet activity. The main issue is to characterize behavioral analysis technologies that will meet the goal of emulating the human security expert. Technologies such as adaptive expert decision engines and closed-feedback systems are briefly explained.
Old Approach and New Demands

Intrusion detection systems (IDS) can generally be characterized as sensors. The sensor's duty is to monitor traffic and alert whenever a deterministic security rule is violated. Quite a few doubts have been raised of late regarding the effectiveness of this method, especially when prevention actions need to be implemented automatically according to the IDS's alerts. The power of the sensor rests on its ability to watch and report when certain rules are breached.

In the past, the market's expectations of an intrusion detection system never included the independent generation of automatic prevention measures. Therefore, IDSs were developed as passive devices; in other words, sensors. The effectiveness of the IDS is based on the assumption that there is always, or almost always, security expert personnel in place to analyze its reports, decide whether a report poses a real threat, figure out the action that would eliminate the threat, and apply it before too much damage is caused. Based on this assumption, it was a logical decision by the IDS vendors to separate the detection and prevention responsibilities, letting the IDS take charge of the more basic operations, namely monitoring and alerting, and assigning the security expert to complete the work that requires intelligence.

IDS vendors followed this approach and, as a result, developed systems that are strongly dependent on the human factor (i.e., the human security expert). This expert is responsible for analyzing the IDS reports, filtering false positive events, deciding on the most appropriate countermeasures, and implementing them. Resting on the assumption that the expert's typical response time is acceptable, the synergy between the IDS and the human factor became a standard market requirement.
Although some traditional IDS products can be configured to communicate with third-party devices that will do the blocking for them (for example, firewalls and routers), this method of prevention is limited by the filtering capabilities of the third-party devices, which are usually not granular enough to accurately mitigate attacks without disturbing the communication of legitimate users.

Over the past few years, a few critical conditions have changed. Today we see a significant increase in Internet use by businesses, and the reliability and speed of the Internet have become critical for businesses to remain competitive. The Internet has become much faster and is the basis for thousands of proprietary and public applications. Organizations of all types and sizes have become heavily dependent on their own Internet infrastructures and, more significantly, on those of third parties, in order to conduct profitable business. Every moment without a transaction is a pure loss of business revenue. Moreover, a successful hack of a company's Internet application exposes a weakness that hurts the company's reputation beyond the actual loss of revenue. This dependence on the Internet and its critical role for businesses make Internet connections and public applications the most attractive targets for attackers.

The lack of expert staff to analyze and respond to an increasing volume of attack activity has led the market to conclude that a security product needs to be able to automatically generate real-time prevention measures, and thus eliminate the dependency on the human factor. The information security industry has branded these new systems intrusion prevention systems.

The Challenge

A short examination of the requirements for automated real-time prevention reveals a major difficulty. Real-time prevention assumes that the system comprises some kind of computerized intelligence that will emulate the operations previously conducted by the security expert.
Without this intelligence, any system that was previously required to perform only sensor duties and is now also intended to prevent the detected attacks will generate false prevention measures. False prevention is something the market cannot accept under any circumstances. In order to understand and confront the challenge of emulating the security expert, let us first characterize a potential process that the human brain executes in order to arrive at conclusions. Understanding this process will hopefully lead us to some conclusions regarding the technologies that may be effective in emulating the security expert.

Human Assessment Methods

In everyday life, the human brain encounters problems that involve varying degrees of freedom. These problems, whether they have to do with the analysis of communication systems or with basic physical operations such as walking, driving, etc., can be extremely complicated. Despite this, they are all handled successfully by the human brain.

Degree of Freedom

A degree of freedom of a system is analogous to an independent variable of a mathematical function. All of a system's degrees of freedom must be specified to fully characterize the system at any given time. In the simplest cases of physical systems, a degree of freedom is an independent displacement or rotation that a system may exhibit. Solving a multiple-degrees-of-freedom problem analytically requires a very complicated mathematical procedure. Our brain does not really have the ability to perform such complicated mathematical procedures; nevertheless, we are still able to handle problems that include many degrees of freedom.

The Analytical Approach and the Human Approach

The question of how we are able to solve multiple-degrees-of-freedom problems so quickly, without actually solving the analytical equations, has not yet been answered. However, a few suggestions for systems that could emulate human brain operations have been raised over the past two or three decades.
One of them follows the assumptions presented in this section.

Qualitative Categories

In order to see, feel or hear, we use our sensors (our eyes, ears, etc.). Although the sensors' inputs can be very precise, we map the environmental inputs we receive into qualitative categories. When we sense heat, for example, different intervals of temperature will be associated with different qualitative categories. The same goes for quantities such as velocity, distance, etc. Every type of variable has its own set of qualitative categories, constructed over time in an adaptive manner.

Figures 1 and 2 illustrate two types of qualitative categories and how inputs from the environment are mapped into the domains of these categories. This set of qualitative categories enables us to map precise inputs into the illustrated groups. The position of every input on the x-axis and the category's shape define the output, which is the weight (y-axis). The weight represents the degree to which each input belongs to a specific category.
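To make this mapping concrete, such qualitative categories can be modeled as simple membership functions that convert a precise input into a weight between 0 and 1. The following is a minimal sketch only; the category names and break points are invented for illustration and are not the values behind the article's figures.

```python
# A minimal sketch of qualitative categories as trapezoidal membership
# functions. All category names and break points below are illustrative
# assumptions; the article's figures define their own shapes.

def trapezoid(x, a, b, c, d):
    """Weight (0..1): rises from a to b, stays at 1.0 from b to c, falls from c to d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical "distance" categories in miles, loosely analogous to Figure 2.
DISTANCE_CATEGORIES = {
    "very close": (-1, 0, 5, 10),
    "close":      (5, 15, 25, 40),
    "far":        (30, 60, 80, 120),
    "far away":   (100, 200, 10**6, 10**6 + 1),
}

def classify(x, categories):
    """Map one precise input into a weight for every qualitative category."""
    return {name: trapezoid(x, *shape) for name, shape in categories.items()}

weights = classify(70, DISTANCE_CATEGORIES)  # 70 miles between two people
# "far" receives full weight; "very close" receives none.
```

Adapting the same set of shapes to a different environment, as in the asteroid example later in the article, would amount to shifting and scaling the break points along the x-axis while keeping the order and styles of the shapes unchanged.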
Adaptation

After qualitative categories have been shaped and positioned along their reference axis, it is assumed that their order and shape styles will not change over time, unless a drastic change in the environment's rules takes place. However, the positions of the categories can be shifted along the reference axis (i.e., the x-axis), and the categories' actual shapes can change within their styles, according to an adaptive process.

For example, if we take the distance set of category shapes (very close, close, etc. in Figure 2) and use it to quantify the distance between our location and that of a person standing in front of us, then 70 miles will be considered far away. In most cases, this seems a reasonable decision. But if we want to use the same set of distance category shapes to quantify how near an asteroid is, then a 70-mile input will have to produce a "very close" output. The adaptation process helps us to shift and to scale (shrink or stretch) the category shapes along the reference distance axis according to the environment that surrounds us. Each environment defines a different scale; in this case, a different scale of distance.

To illustrate this adaptation process, let's examine the adapted qualitative categories in Figure 3. Compared to Figure 2, the x-axis scale was adapted to fit a different environment, such as one that needs to measure the distance between an asteroid and Earth. As shown, the order of the categories along the x-axis and their shapes did not change.

Correlation Rules (Intelligence)

After the inputs are mapped into categories, giving each one a suitable weight (level of belonging), expert rules that define the relationships between the weights need to be established. As opposed to the differential operators used to correlate the variables in multiple-degrees-of-freedom mathematical equations, these rules are much simpler. For example:

1. If the asteroid DISTANCE is far away AND its VELOCITY is slow, then LEVEL OF ALERT IS LOW; else
2. If the asteroid DISTANCE is not far AND its VELOCITY is slow, then LEVEL OF ALERT IS MEDIUM; else
3. If the asteroid DISTANCE is close AND its VELOCITY is medium, then LEVEL OF ALERT IS HIGH.

A set of such rules creates a correlation that in the end generates a decision, followed by an action or inaction. As more cases (rules) are adapted and accumulated, the decision becomes more robust; as long as these rules are logically consistent, the level of intelligence becomes higher.

Closed Feedback

Closed-feedback operations are necessary for any kind of system that isn't purely analytical, like the human brain or the alternative we present in this article. The brain constantly examines the actual results of its actions and compares them to the desired results. This operation is responsible for tuning actions until an acceptable result is achieved.

The Security Expert

Let us apply the previously described process to the operations that the information security expert needs to perform.

Figure 1: Temperature Qualitative Categories
Figure 2: Distance Qualitative Categories

In order for the security expert to be able to analyze communication parameters and decide about their level of threat and the appropriate prevention methods, the following operations are required:

1. Sensors: sensors are the tools that enable the security expert to watch and aggregate communication characteristic parameters. With the sensors' inputs, the security expert can create qualitative categories.

2. Creating Qualitative Categories: the security expert adapts to the network environment. He needs to know which services are running inside the network, which types of protocols these services use, and how these protocols are distributed. He also needs to know the approximate: Packet rates. Number of requests generated to his servers.
Protocol error replies returning from his servers.

The security expert builds qualitative categories in his mind. These categories are no different from the ones described in the previous section. According to his acquired knowledge, he adapts a shape and position for each category, probably in much the same way as the adaptation process described in the previous section. For example, the number of protocol error replies can be characterized as seen in Figure 4. It should be emphasized that a communication parameter characterized as high in one environment can be characterized as low in another, according to the adaptation process.

3. Correlation (Intelligence): relying on an assessment of each communication parameter independently of the others will usually lead to a wrong decision (usually called a false positive decision). Therefore, the security expert correlates all the weights (degrees of belonging to a category) through logical rules he has constructed in his mind. These expert rules are deterministic relationships that will eventually define the level of decision accuracy. In the case of error replies (Figure 4), the security expert might adhere to the following rules (adding additional parameters) in order to come to a decision:

a. If the error rate is high AND the number of source IP addresses that cause the errors is high, then LEVEL OF THREAT IS MEDIUM; else
b. If the error rate is high AND the number of source IP addresses that cause the errors is low, then LEVEL OF THREAT IS HIGH.

If rule b is true, then there is a higher probability that the cause of these protocol error replies is a real attacker.

4. Closed-Feedback Operation: in order to reduce false positives, the security expert conducts closed-feedback operations. These operations enable him to fix inaccurate decisions. When a decision on some kind of action (prevention measure) is made, the expert checks the results of this action. If the difference between the desired result and the actual result is acceptable, then the same action can be continued. If the difference isn't acceptable, then the expert stops using the last action and continues to search for a more appropriate one.

A Technology Gap

Without adopting a technology that provides an appropriate alternative to at least some of the security expert's operations, the transition from a system that acts as a sensor to a system that is supposed to automatically block attacks cannot be made.
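Such a correlation step, like the two error-reply rules above, can be sketched as a tiny rule engine that combines category weights with a minimum (AND) operator. This is an illustrative sketch only: the ramp thresholds and parameter ranges are invented assumptions, not values a real deployment would use unadapted.

```python
# A hedged sketch of the rule-correlation step. Thresholds (50-100 for the
# error rate, 10-100 for source IP counts) are invented assumptions; an
# adaptive system would learn them from its own network environment.

def high(x, lo=50.0, hi=100.0):
    """Weight for the qualitative category "high", as a simple rising ramp."""
    return max(0.0, min(1.0, (x - lo) / (hi - lo)))

def low(x, lo=50.0, hi=100.0):
    """Weight for "low", modeled here as the complement of "high"."""
    return 1.0 - high(x, lo, hi)

def threat_level(error_rate, source_ip_count):
    """Evaluate rules a and b from the text and fire the stronger one."""
    err_high = high(error_rate)
    # Rule a: error rate high AND many distinct sources -> MEDIUM threat.
    rule_a = min(err_high, high(source_ip_count, 10, 100))
    # Rule b: error rate high AND few distinct sources -> HIGH threat.
    rule_b = min(err_high, low(source_ip_count, 10, 100))
    if rule_b >= rule_a:
        return ("high", rule_b)
    return ("medium", rule_a)

# Many error replies caused by a handful of source IPs: likely one attacker.
level, weight = threat_level(error_rate=120.0, source_ip_count=5)
```

The minimum operator implements the AND in each rule: a rule fires only as strongly as its weakest condition, which mirrors how the expert discounts a conclusion when one of its premises is doubtful.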
Applicable Behavioral Analysis Technologies and Expert Systems (ES) Tools

An expert system is software that works with both knowledge and information. Expert systems aid in formulating a decision the way an expert in the field might. In order to do this, human expert rules need to be formulated in such a way that the system can use them in the decision-making process. Expert systems provide a way of drawing definite conclusions from vague, ambiguous or imprecise information. Therefore, expert system algorithms can overcome the analysis difficulties that Internet communication usually raises. Some of the generic components of an expert system are described below:

Knowledge Base: a store of factual and heuristic knowledge. This knowledge can be expressed through mathematical functions that formulate qualitative category shapes, as described in the previous sections.

Decision Engine: inference mechanisms for manipulating the outputs (weights) of each category function in order to form a line of reasoning in solving a problem. The inference mechanism can be constructed through chaining of IF-THEN rules such as those described before as correlation rules.

Knowledge Acquisition System: this system helps to build knowledge bases. Collecting knowledge is needed in order to adapt to the network's environment. This knowledge is important for tuning the system's decisions and can be understood as the adaptation process described before.

Figure 3: Adapted Qualitative Categories
Figure 4: Error Replies Qualitative Categories

Closed-Feedback Systems: feedback control is an error-driven strategy; corrections are made on the basis of the difference between the system's current state and the desired state. In the simplest case of linear feedback control, the corrections are proportional to the magnitude of the difference, or error. Closed-feedback algorithms help to minimize false positive decisions. Figure 5 describes the closed-feedback process.
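The proportional correction just described can be sketched in a few lines. This is a toy model under an invented assumption: each correction is taken to move the observed state by exactly the correction amount, which a real mitigation process would not guarantee.

```python
# A minimal sketch of linear (proportional) closed feedback: the correction
# is proportional to the error between desired and observed state. The
# "plant" behavior (observed += correction) is an invented stand-in for a
# real prevention process.

def proportional_feedback(desired, actual, gain=0.5):
    """Return a correction proportional to the error (desired - actual)."""
    return gain * (desired - actual)

# Toy loop: drive the observed attack traffic toward an acceptable residual
# level by repeatedly tuning the mitigation action.
desired_residual = 10.0   # acceptable leftover attack traffic (arbitrary units)
observed = 100.0
for _ in range(20):
    correction = proportional_feedback(desired_residual, observed)
    observed += correction  # assume the action shifts the state by `correction`

# With gain 0.5, the error halves each iteration and soon becomes negligible.
```

With a gain between 0 and 1, each pass shrinks the error, which is the tuning-until-acceptable behavior the text attributes to both the brain and the closed-feedback controller.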
After a decision takes place (1, process), the system checks the difference between the existing and desired results (2) and generates actions accordingly. The desired result is adapted from the environment (3, adapted knowledge base / desired set) and compared to the existing result. The closed-feedback operation is responsible for correcting the process (4, controller) until an acceptable result is achieved.

Conclusions

Over the last two years, the requirements for network intrusion prevention systems (NIPS) have been defined in the following ways:

1. In-line Devices: as opposed to sensor (passive) devices that usually sit out of line, IPS products must have the capability to sit in-line, thus enabling very fast responses to attacks.

2. Stability and Redundancy: as an in-line device, an IPS must be extremely reliable. This forces IPS vendors to develop products that support redundancy and fail-over capabilities.

3. Reduced False Positives: in-line devices that automatically block attacks must have a negligible percentage of false positive detections.

4. Behavioral Analysis: in order to reduce the high number of false positives usually associated with traditional network IDS sensors, a NIPS needs to include behavioral analysis technologies alongside state-of-the-art traditional technologies such as attack signature detection and protocol anomaly (enforcement of protocol rules) detection engines.

The first two requirements are mainly a matter of engineering. The other two might involve a lot more than that. Yet overcoming the challenge of automatic prevention forces IPS vendors to answer these requirements as well. Behavioral analysis technologies need to be integrated into intrusion prevention systems in order to perform some of the operations that were previously the responsibility of the security expert.

As long as human intelligence remains an unsolved mystery, we cannot expect an intrusion prevention system to provide us with a complete solution, and it will always be necessary to flag suspicious activity for further human investigation. However, in this article, we have reviewed and characterized the process by which the human security expert comes to conclusions. These characteristics are similar to the ones that exist today in expert systems. The field of expert systems is a developed discipline, researched all over the world in both academic institutions and industry. In the future it will be beneficial to use the expert system techniques outlined in this article in order to successfully emulate the security expert. Unfortunately, the majority of IPS vendors have not yet integrated behavioral analysis capabilities, as distinct from the traditional ones, into their products.
Therefore, we will have to wait a little longer before being able to assess the actual limitations or effectiveness of IPS products.

Figure 5: Closed-Feedback System

Avi Chesla currently serves as Director of Research and Product Management for Vsecure Technologies (US) Inc., a developer of innovative intrusion prevention products. He is a graduate of physics and mathematics at Tel Aviv University and has been focusing on next-generation security solutions since 2000. Avi can be contacted at avic@v-secure.com.