A Framework for Stateful Inspection

A Framework for Stateful Inspection - Applied to TCP and Linux Netfilter

Group D505c: Mikkel Refsgaard Bech, Torben Vinther Schmidt, Carsten Stiborg
Department of Computer Science, Aalborg University
Jan 14th, 2002

Project group: D505c
Group members: Mikkel Refsgaard Bech, Torben Vinther Schmidt, Carsten Stiborg
Supervisors: Mikkel Christiansen, Emmanuel Fleury
Number of copies: 7
Number of pages: 66

Synopsis

This project examines the concept of stateful inspection and how it is applied in firewalls. First, a framework containing a modeling language for networking protocols is derived. This is used to model the Transmission Control Protocol (TCP), which serves as a case study. Then the concept of stateful inspection is applied to TCP in order to get a model for use in firewalls. Afterwards, the Linux Netfilter code is reverse engineered in order to create a model for comparison with the model for stateful inspection on TCP. The result of the comparison is a proposal for improved stateful inspection within Linux Netfilter. The report concludes that the framework was applicable in improving stateful inspection within Linux Netfilter.

Aalborg University - Fredrik Bajers Vej 7E - DK-9220 Aalborg - Phone +45 96 35 80 80 - Telefax +45 98 15 98 89

This project was prepared in the DAT5 semester at the Department of Computer Science at Aalborg University, Distributed Systems and Semantics division, in the fall of 2001 and January 2002. It has been made with help from the open source community and their resources (homepages, FAQs, HOWTOs, and mailing lists).

The goal of the project was to provide a framework for improving implementations of stateful inspection firewalls. We propose a framework for modeling network protocols and use this to model stateful inspection on TCP. We have reverse engineered an implementation of stateful inspection in order to model it. Based on a comparison of the models we propose improvements for the implementation. Our approach is not deeply theoretical but emphasizes application of our findings to practical use.

Throughout the report, there are examples, tables, and figures. These are all enumerated within each chapter, e.g. Figure 3-1 means the first figure in chapter 3. Appendices A and B contain important parts of the source code of the implementation.

Aalborg, January 14th, 2002
Mikkel Refsgaard Bech, Torben Vinther Schmidt, Carsten Stiborg

1 Introduction
  1.1 Firewalls
    1.1.1 The Generic Firewall
    1.1.2 Open Systems Interconnection
  1.2 Packet, Stateful Inspection, and Application Firewalls
    1.2.1 Packet Firewalls
    1.2.2 Stateful Inspection Firewalls
    1.2.3 Application Layer Firewalls
  1.3 Alignment and comparison
    1.3.1 Comparison of Firewalls
    1.3.2 Improving Stateful Inspection Firewalls
  1.4 Project Objective
  1.5 The Structure of the Report
2 Framework
  2.1 The Formalism
  2.2 Graphical representation
3 The Transmission Control Protocol
  3.1 The Protocol
    3.1.1 The TCP Header
  3.2 Connection Phases
    3.2.1 Connection Establishment
    3.2.2 Established Connection
    3.2.3 Connection Termination
  3.3 TCA for a TCP Connection
  3.4 Summary

    4.2.3 Resetting the Connection
  4.3 The Stateful Inspection Model
  4.4 Stateful Inspection Issues
    4.4.1 Model Completeness
    4.4.2 Passive or Active
    4.4.3 Other Protocols
  4.5 Summary
5 The Netfilter Implementation
  5.1 Linux and Netfilter
    5.1.1 Netfilter Architecture
    5.1.2 Netfilter Modules
  5.2 Implementation of TCP State Tracking
    5.2.1 Handshake Check
    5.2.2 Transitions
  5.3 The State Table
    5.3.1 The State Entry
  5.4 Summary
6 Applying the Stateful Inspection TCP Model to Netfilter
  6.1 Similarities
  6.2 Differences
  6.3 Improvement Proposals
7 Conclusion
A Excerpt of ip_conntrack_core.c
B ip_conntrack_proto_tcp.c

Figure 1-1 A typical proxy relaying FTP requests for a HTTP browser.
Figure 1-2 The SOCKS proxy relays the client requests through the SOCKS layer.
Figure 2-1 A TCA containing three states.
Figure 3-1 The header format for TCP.
Figure 3-2 Connection establishment for the client.
Figure 3-3 Connection establishment for the server.
Figure 3-4 Active connection termination, shown as a TCA.
Figure 3-5 Passive connection termination, shown as a TCA.
Figure 3-6 This TCA shows how a TCP connection acts as specified.
Figure 3-7 The retransmission workaround.
Figure 4-1 The ACK is blocked even though it is legal, because the state has changed back.
Figure 4-2 The final state diagram for stateful inspection on TCP for the client.
Figure 4-3 The final state diagram for stateful inspection on TCP for the server.
Figure 5-1 The Netfilter architecture.
Figure 5-2 Schematic representation of the minimal Netfilter struct used at each state entry.
Figure 5-3 State changes, client side.
Figure 5-4 State changes, server side.

Table 1-1 The TCP/IP model aligned to the OSI model.
Table 1-2 Where firewalls operate in accordance to the ISO and TCP/IP models.
Table 5-1 Timeout periods for TCP as defined by ip_conntrack.

During the last 10 years the use of computer networks has exploded. A network is a number of computers that are connected by some medium that allows them to communicate. Networks are established internally in almost all companies and professional organizations. Some networks provide connectivity through them to other networks in order to share resources. This creates a larger network, which is called the Internet. A network may not be willing to share all of the resources it provides with others; some resources are meant only for the private network.

Each of the computers in a network has an address so it can be contacted. This implies that anyone who wants to access this computer from any other network is able to do so. It also means that this person could attack the computer and gain access to confidential information, which is of course unwanted. Therefore policies are created for what is allowed on a network. Such a policy can be expressed in many ways including, but not limited to, something as simple as a piece of written paper or a list in the network administrator's head.

However, since the policy can be ignored by someone on a different network, a method for enforcing these policies is needed. One way to enforce the policies against other networks is by using a firewall.

1.1 Firewalls

In this section we will define the generic firewall and introduce our perspective on different types of firewalls. In order to understand in which environment firewalls work, they will be aligned to the Open Systems Interconnection (OSI) reference model. The general types of firewalls that we will introduce are packet firewalls, stateful inspection firewalls and application firewalls. First, the generic firewall will be defined.

1.1.1 The Generic Firewall

As mentioned earlier, networks are composed of connected computers. Networks are used to send messages from one machine to another. The passing of messages between computers is called communication. We define the private network as a network over which we have administrative control and which we want to protect from public networks, i.e. networks that we cannot control, e.g. the Internet. The firewall is a machine which sits on the border between networks and inspects communication that enters or leaves the network for messages that our policy does not allow. To define what

of formal rules for how communication between computers must proceed. These protocols exist on different layers.

1.1.2 Open Systems Interconnection

In order to find out which layers protocols and firewalls work on, we look at the Open Systems Interconnection (OSI) reference model. This is a standard for network protocols on different layers proposed by the International Organization for Standardization (ISO) [Tan96]. Each layer has a well-defined behavior and the layers do not overlap. The layers are listed below, starting from the bottom layer and ascending to the top.

Physical: The physical layer is the actual hardware. It is this layer that handles the actual communication; the data type considered on this layer is bits.

Data Link: The data link layer handles the errors originating in the physical layer, thereby assuring that the data sent and received are alike. This layer works with frames.

Network: The network layer is the first layer to handle end-to-end communication; the previous layers only consider the next step in the route. This layer considers the data sent and received as messages, or rather packets.

Transport: The transport layer takes care of sending the data received from the session layer, splitting it up into smaller pieces if needed, and concatenating the data it receives from the network layer if it has been split. The main goal of the transport layer is to separate the network layer from the session layer: the network layer changes with the hardware it is built upon, while the session layer depends only on the operating system it is implemented in. [RAD01]

Session: The session layer enlarges the service provided by the transport layer. While the transport layer only handles data going in one direction, the session layer adds the possibility of traffic going in both directions. The session layer also implements the concept of checkpoints in the received stream. If either the sender or the receiver breaks down, the checkpoint is used to restart the transmission from its position.

Presentation: The presentation layer takes care of presenting data types like integers, strings, characters, floats, etc. correctly on different types of machines.

Application: The application layer contains many different protocols, like email, directory lookup, remote job execution, etc.

application. Because the OSI model is built in layers, an implementation on one layer can be replaced by another implementation without affecting the other layers. A more thorough description of the different layers can be found in [Tan96].

The OSI model is very general, and this may be a hindrance when designing network protocols. Therefore, designers tend to cut corners. This is the case with the most commonly used protocols on the Internet, the Transmission Control Protocol (TCP) and the Internet Protocol (IP) [TMW97]. These are almost always used together and are therefore also known as TCP/IP. In Table 1-1 the TCP/IP model is aligned to the OSI model. As shown, the physical and data link layers are combined into one, since the software needed to control the physical layer always follows the network interface. The presentation and session layers from the OSI model do not exist in TCP/IP. Some session procedures are, however, present in the TCP protocol, e.g. two-way connections. Presentation is also commonly handled on the application layer, to the extent needed for that protocol.

Layer  OSI           TCP/IP
7      Application   Application
6      Presentation  -
5      Session       -
4      Transport     Transport (TCP)
3      Network       Internet (IP)
2      Data Link     Host-to-network
1      Physical      Host-to-network

Table 1-1: The TCP/IP model aligned to the OSI model.

1.2 Packet, Stateful Inspection, and Application Firewalls

We will take a closer look at three of the most common types of firewalls, namely packet, stateful inspection, and application firewalls. In Section 1.3 we will place each of these types of firewalls in alignment to the models in Table 1-1.

1.2.1 Packet Firewalls

These are firewalls that operate on the Internet and Transport layers of the TCP/IP model and focus on packets. This means that decisions are made based on single packets, i.e. without considering the context.
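To make the single-packet decision concrete, the following is a minimal sketch of a first-match packet filter. The `Packet` and `Rule` types, their field names, and the default-deny fallback are assumptions made for illustration; they are not taken from any particular firewall implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    protocol: str  # e.g. "TCP" or "UDP"
    src_ip: str    # dotted decimal, e.g. "192.0.2.7"
    dst_ip: str


@dataclass(frozen=True)
class Rule:
    action: str                       # "allow" or "deny"
    protocol: Optional[str] = None    # None matches any protocol
    src_prefix: Optional[str] = None  # None matches any source address

    def matches(self, pkt: Packet) -> bool:
        # Every field that is set must match; unset fields are wildcards.
        if self.protocol is not None and pkt.protocol != self.protocol:
            return False
        if self.src_prefix is not None and not pkt.src_ip.startswith(self.src_prefix):
            return False
        return True


def filter_packet(rules, pkt: Packet) -> str:
    """Return the action of the first matching rule (hypothetical default: deny)."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return "deny"


# Allow TCP packets whose source address starts with 192, deny everything else.
RULES = [Rule("allow", protocol="TCP", src_prefix="192."), Rule("deny")]
```

Note that `filter_packet` looks only at the packet it is given: nothing is remembered between calls, which is exactly the "no context" property of a packet firewall.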

Example 1

    Allow protocol TCP source IP = 192.*.*.*
    Deny ALL

This says that TCP packets with a source IP starting with the number 192 are allowed. If a packet does not match this rule, the next rule denies it.

1.2.2 Stateful Inspection Firewalls

Stateful inspection firewalls operate on the network, transport, and application layers. They focus on protocols rather than single packets or data. We have found no definition of what stateful inspection is, and firewalls that are claimed to be stateful differ a lot in functionality. In order to clarify what we mean by stateful inspection we will define it.

In stateful inspection the protocol and the states of the protocol are known. Likewise, all messages of a communication are known, and the state of each communication is known. On the basis of this we define stateful inspection as:

Determining whether the state of a communication conforms with the state of the corresponding protocol.

This implies that the events dictated in a protocol are known and that the sequence of these events is also known. We say that when a certain event has happened the protocol is in a certain state. Likewise, a communication has the same events and therefore the same states. However, it can at some point differ from the protocol, and when this happens it is known.

While stateful inspection only determines the state of communication, stateful inspection firewalling is the act of filtering on the basis of the state of communication. In the rest of the report, stateful inspection refers to the act of filtering using stateful inspection. What is done is ensuring that a communication behaves the way it should. If we discover that a communication changes state, we check whether it is a legal state change. If it is not, we react by discarding or rejecting the message; otherwise we allow it to pass and note that the state has changed for that connection.

In addition to this, rule filtering can be done, i.e. determining by rules which kind of message is allowed based on the state of communication. This could be a rule which says that the communication is not allowed to change to a certain state, e.g. a rule that says that a certain type of communication is not allowed to be initiated.
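The check described above (is this a legal state change, and do the rules permit the resulting state?) can be sketched as follows. This is a toy illustration under stated assumptions: the three-state protocol in `LEGAL`, the event names, and the `StatefulInspector` class are all hypothetical and do not model TCP or any real firewall.

```python
# Hypothetical toy protocol: (current state, event) -> next state.
LEGAL = {
    ("idle", "request"): "waiting",
    ("waiting", "reply"): "idle",
    ("idle", "close"): "closed",
}


class StatefulInspector:
    def __init__(self, forbidden_states=()):
        self.state = {}                         # connection id -> current state
        self.forbidden = set(forbidden_states)  # rule filtering on target states

    def inspect(self, conn_id, event):
        """Return "accept" for a legal, permitted state change, else "drop"."""
        current = self.state.get(conn_id, "idle")  # unseen connections start idle
        nxt = LEGAL.get((current, event))
        if nxt is None:
            return "drop"      # the protocol does not allow this event here
        if nxt in self.forbidden:
            return "drop"      # legal in the protocol, but denied by a rule
        self.state[conn_id] = nxt  # note the new state for this connection
        return "accept"
```

For example, a reply that arrives before any request is dropped, because `("idle", "reply")` is not a legal transition, while the same reply is accepted once a request has moved the connection to `waiting`.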

A proxy firewall is an application that relays packets for a client. The main idea is to deny everything except that which is explicitly allowed to pass through the firewall. What is allowed through the firewall is determined by rules for who is allowed to use it and by what the firewall supports. Proxies are used for various reasons in addition to firewalling, e.g. a proxy can be used to mask the real IP address of the clients using it. It can also be used to translate messages between protocols, e.g. letting a client use a Hyper Text Transfer Protocol (HTTP) browser to browse a File Transfer Protocol (FTP) site, communicating through HTTP with the proxy server, while the proxy server communicates with the FTP server through FTP. This is illustrated in Figure 1-1.

Figure 1-1: A typical proxy relaying FTP requests for a HTTP browser.

SOCKS Proxy

Similar to the proxies previously described, the SOCKS proxy works as illustrated in Figure 1-2. A client runs a program that works as a layer between the operating system and the application when the application accesses the network. Basically, it replaces the send and recv primitives used with sockets, which makes it operating system independent. The SOCKS client then connects to the SOCKS server, which in turn relays the communication to the application server that the client application is trying to connect to. The SOCKS client works like a wrapper for network communication, and the SOCKS server works as a gateway for the application, denying all other communication. Some applications have their own SOCKS client and therefore have no need for an external one. The SOCKS proxy technology has been developed by NEC [NEC01] and is free for non-commercial use.

Figure 1-2: The SOCKS proxy relays the client requests through the SOCKS layer.

MIMEsweeper Tools

Some firewalls look specifically at the content of the data, e.g. whether it is email, news, etc. A typical example is the MIMEsweeper product family, such as Mailsweeper and Websweeper. These are not placed on the border of networks, but instead with the applicable service, i.e. Mailsweeper is placed on the same machine that runs the mail server. MIMEsweeper protects by providing a plug-in interface for handling different application protocols. This is done e.g. by providing an anti-virus tool for the Mailsweeper interface. Mailsweeper then uses this anti-virus tool on each mail going to or from the mail server. The MIMEsweeper tools contain a very large interface for policy declaration; this is needed since data is separated into many distinct types on this layer. When defining the policies one should consider each possible data type that can be transferred through the network, e.g. movies, pictures, and documents, which can be a quite demanding task. [MIM01]

1.3 Alignment and comparison

In Table 1-2 each of the firewalls is placed according to the layer it operates on, using the layers from Table 1-1. We can see that stateful inspection firewalls include filtering on the application, transport, and network layers. All that is really needed for stateful inspection firewalls is a protocol for which states exist. This means that the concept of stateful inspection could in theory be used on any layer of the OSI model.

1.3.1 Comparison of Firewalls

When considering which type of firewall to use, there are different points which speak for each of them, and we will discuss them here.

2 Data Link Host-to-network
1 Physical

Table 1-2: Where firewalls operate in accordance to the ISO and TCP/IP models.

Which kind of firewall one chooses depends on what kind of security is needed. It is possible to combine different kinds of firewalls for more expressiveness and more security. However, combining them could lead to a difficult configuration of the firewalls due to overlaps and conflicts, as is often the case when using several programs for the same purpose. In other words, it is preferable to have only one firewall which covers everything.

Whether a firewall operates in kernel space or in user space is essential for the efficiency of the filter. In kernel space, interaction with the Transport and Network layers is much faster, thus allowing for a higher load of communication. However, kernel space is very limited and very little can be stored there. Filters that work in user space access the network through kernel space and are thus slower than filters in kernel space, because they have to go through this layer. However, user space has much more storage space. As we want the firewall to be efficient, we tend towards those that run in kernel space. Packet filters do not need to store anything and can therefore easily run in kernel space. Stateful inspection firewalls need to store information about the communication, but not the content of messages, and can also run in kernel space. Application firewalls need to store the content of messages, which can be quite a lot, and must therefore run in user space.

We also want a firewall to be able to express our policies fully, and we therefore tend towards those which have a lot of expressiveness. When it comes to expressiveness, packet filters are limited because they cannot express relations in communication. What is meant by this is that if we allow communication from a computer inside the network to a computer outside the network, we also allow that outside computer to communicate in the other direction. We cannot express that we only allow replies, and not requests, from that computer. Stateful inspection overcomes this by considering the communication instead of single packets. Stateful inspection firewalls can express that the computer outside is allowed to reply only if the computer within the network started the communication. When considering application firewalls we have even more information to base our rules on. These are able to express that certain content is not allowed within a message.

Where we want both expressiveness and efficiency from a firewall, stateful inspection offers the best tradeoff between the two. This, along with its ability to operate on all layers, makes it a desirable type of firewall.
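The "replies only" policy that a packet filter cannot express can be sketched with a small amount of stored state. The `ReplyTracker` class and its host-pair bookkeeping below are assumptions made for illustration; a real firewall would track far more (ports, protocol state, timeouts).

```python
class ReplyTracker:
    """Allow outside hosts to talk to inside hosts only after being contacted."""

    def __init__(self, inside_hosts):
        self.inside = set(inside_hosts)
        self.initiated = set()  # (inside host, outside host) pairs seen so far

    def allow(self, src, dst):
        if src in self.inside:
            # Traffic from the private network is allowed; remember who
            # was contacted so that replies can be recognised later.
            if dst not in self.inside:
                self.initiated.add((src, dst))
            return True
        # Traffic from outside passes only if the inside host started it.
        return (dst, src) in self.initiated
```

The only information kept per communication is a pair of addresses, not message content, which is consistent with the observation above that stateful inspection stores little enough to run in kernel space.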

done. This should both be in terms of a protocol and a current implementation of a stateful inspection firewall. The purpose is for the framework to pinpoint problems within the implementation and/or vice versa. In order to apply the framework, a model of the protocol from a firewall's point of view has to be made in addition to a model of the implementation. Comparing these will show differences and similarities. On the basis of this comparison, improvement proposals for stateful inspection within the implementation must be made.

1.4 Project Objective

When networks are joined together, security becomes an issue. Network administrators need to protect their networks from malicious use, especially from outsiders. Firewalls play an essential role in protecting networks from outsiders. Filtering communication on different network layers using only one tool is a step towards a better firewall, since information on all layers can be inspected, and decisions whether to accept or drop packets can be made from more information. One such type is the stateful inspection firewall, which gathers information about the state of communication going to and from the network. It is on the basis of this information that it decides what kind of communication is allowed. Because this information can be gathered on different layers, it has an advantage over packet filters and application firewalls. Because it runs in kernel space it is reasonably efficient. Alas, this advantage does not come without a cost: information has to be stored in the limited kernel space.

The goal of this project is:

To examine the concept of stateful inspection for use in firewalls and reach a framework for stateful inspection firewalls, the behavior of which is well-defined and unambiguous. This framework is proved by example by improving an implementation through the use of the framework.

Through this project we contribute to the development of stateful inspection in firewalls with:

1. A framework for modeling network protocols for the purpose of stateful inspection.
2. A model of the Transmission Control Protocol for use with stateful inspection.
3. A reverse engineered model of an implementation.
4. A proposal for an improved design of the implementation, based on a comparison of the models.

using requirement specifications for TCP. Once this model has been created we will consider the model and the role of a firewall to create a model for stateful inspection on TCP. We then reverse engineer a current implementation of stateful inspection, model it using the language, and compare it to the model for stateful inspection on TCP. Based on a comparison of these two models we will propose a new design for stateful inspection in the implementation. The new design must be an improvement over the old and must be implementable.

1.5 The Structure of the Report

In Chapter 2 we will derive a modeling language for networking protocols. In Chapter 3 we will examine the TCP protocol and propose a model of a correctly behaving connection. In Chapter 4 we will discuss stateful inspection for use in firewalls and propose a model of stateful inspection on TCP. In Chapter 5 we will reverse engineer an implementation of stateful inspection and make a model of the code. In Chapter 6 the model for stateful inspection on TCP and the implementation will be compared, and the final design will be proposed. We will conclude in Chapter 7.

In the introduction we stated that one of the problems in stateful inspection is the lack of a formalized model for describing the procedures of stateful inspection. In this chapter we will present such a model, using automata to depict state and state changes within protocols. However, as regular deterministic automata do not suffice for modeling protocols, we need to expand the concept of regular deterministic automata. We will first present an overview of the contents of a generic network protocol, then describe the formalism, and finally present an alternate graphical form of the formalism.

2.1 The Formalism

Through this section a formal language will be defined to describe generic protocols. Before designing such a formal language, it must be clear what a generic network protocol consists of.

Automata, Messages, Time

A protocol consists of a set of rules that dictate what sort of action should be carried out in a given situation. An automaton is a well-defined modeling language [Sip96] that we can use to describe a protocol, since it changes state on the basis of the current state. We can say that protocols change their state because they act on a given situation; that is, they remember their situation, they are in a state.

In a network protocol packets are transmitted and received, and through these packets data are exchanged. It should be possible to model this in the framework, since data could be of meaning to the protocol, forcing a change of state.

All protocols must somehow apply rules for ending their data exchange, and doing nothing is a quite legitimate way of managing it. However, the protocol must then also handle the case where nothing happens in the end. Timeouts are a way to handle the closing of a protocol. Even if the data exchange is nicely closed with special purpose packets, it may be a problem if those packets do not arrive, so time is still an important part of a protocol.
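The three ingredients named above (remembered state, messages that can force a state change, and time) can be sketched as a small state machine. This is only an illustrative sketch and not the formalism of this report: the `TimedProtocol` class, the state and message names, and the choice to move to a `closed` state on timeout are all assumptions. Time is passed in explicitly as `now` so that the behaviour is deterministic.

```python
from dataclasses import dataclass


@dataclass
class TimedProtocol:
    transitions: dict        # (state, message) -> next state
    timeouts: dict           # state -> maximum seconds allowed in that state
    state: str = "start"
    entered_at: float = 0.0  # time at which the current state was entered

    def _expire(self, now):
        # Assumption: a state that has outlived its timeout closes the protocol.
        limit = self.timeouts.get(self.state)
        if limit is not None and now - self.entered_at > limit:
            self.state = "closed"
            self.entered_at = now

    def on_message(self, message, now):
        """Apply a timeout check, then any transition enabled by the message."""
        self._expire(now)
        key = (self.state, message)
        if key in self.transitions:
            self.state = self.transitions[key]
            self.entered_at = now
        return self.state
```

A usage example: with transitions `{("start", "hello"): "open", ("open", "bye"): "closed"}` and a 30 second timeout on `open`, a `hello` at time 0 opens the exchange, and any message arriving after time 30 finds it closed even though no `bye` was ever received.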

The models for network communication will from now on be described as Timed Counter Automata (TCA), as they include a subset of both timed automata, introduced in [AD90], and counter automata, as described in [Min67]. A TCA is a 6-tuple $(Q, q_0, M, V, \delta, F)$, where:

1. $Q$ is a finite set of states. Each state contains a timeout, described by a bound $\mathit{Timeout} \in \mathbb{N} \cup \{\infty\}$ on $t$, where $t$ is a global clock. Timeout describes the maximum amount of time the automaton is allowed to stay in the same state.

2. $q_0 \in Q$ is the initial state. When the automaton is started, the global clock $t$ is equal to $0$.

3. $M$ is a finite set of messages. A message describes the contents of a packet which the automaton models, for example a predefined bit sequence in the header of a packet. The set is defined by the variable names of the contents of all packets that can be transmitted or received. Before creating a TCA it must be clear which messages exist and what their contents are.

4. $V$ is a finite set of bounded variables. It consists of the set of global variables, $v_{\mathrm{global}} \subseteq V$, and the set of variables contained within the packet, $v_{\mathrm{packet}} \subseteq V$.

5. $\delta \subseteq Q \times M \times G \times U \times Q$ is the set of transitions, where $G$ is the set of all possible guards and $U$ is the set of all possible updates. A guard is a boolean expression which must evaluate to true for a transition to be taken. A guard is either true, a comparison of a variable $x$ or the clock $t$ with a constant $c$ or a variable $y$, or a combination of such guards, with $x, y \in v_{\mathrm{global}} \cup v_{\mathrm{packet}}$ and $c \in \mathbb{N}$. In an update, variables must be set equal to either constants or other variables. Every entry is defined by an expression based on the following grammar:

$u \in U$, where an update $u$ is either $x := c$, $x := y$, or a sequence of updates, with $x, y \in v_{\mathrm{global}} \cup v_{\mathrm{packet}}$ and $c \in \mathbb{N}$. On each transition the global clock $t$ is reset to $0$.

6. $F \subseteq Q$ is the set of accept states. An accept state is equivalent to the termination of a protocol.

2.2 Graphical Representation

Now that we have defined the TCA as a 6-tuple, we define a graphical representation of it. The graphical representation is not as strict as the formal description. The general rules which should be followed are:

- States are drawn as ellipses, with the name on the upper line inside the ellipse and the timeout period below. The timeout period is written without the $t$, as it is always the same. Moreover, when $\mathit{Timeout} = \infty$ no timeout value is written, nor is the corresponding transition shown, since it would never be taken.
- The initial state is drawn as a double-lined ellipse.
- Transitions are drawn as arrows between the states. Conditions on the transitions (guards, messages, and assignments) are written above or to the right of the transitions they affect, and the lines are ANDed together.
- Accept states are drawn as rectangles.

These are only general guidelines; in general a TCA should be depicted in whatever way fits best, using the given syntax, see Example 2.

Example 2. Figure 2-1 illustrates a simple automaton with three states. One is the initial state, A, and one is an accept state, end. From the initial state a transition can be taken when a message of the type packet is transmitted. Before being transmitted, this message has its local variable Answer assigned the value of the global variable The_Answer, whose value is not specified here. The model does not care where this variable is set; it could be on another level of the protocol or simply in a part not described. The state A does indeed have a timeout

transition will be taken, which leads to the accepting state end.

[Figure 2-1: A TCA containing three states. States: A (initial), B (timeout 11000), end (accept). Transition labels: packet! with packet!.answer := The_Answer; packet? with packet?.answer = 42; t = 11000.]
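The 6-tuple and Example 2 can be made concrete with a small executable sketch. The code below is only an illustration under stated assumptions, not part of the formalism: the class and function names (`TCA`, `Transition`, `example_2`) are hypothetical, guards and updates are represented as Python callables instead of the grammar above, the edges A to B and B to end are our reading of Figure 2-1, and the value 42 for The_Answer is assumed because the text leaves it unspecified.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Vars = Dict[str, int]  # variable name -> value


@dataclass
class Transition:
    source: str
    message: str                          # e.g. "packet!" (send) or "packet?" (receive)
    guard: Callable[[Vars, Vars], bool]   # (global variables, packet variables) -> bool
    update: Callable[[Vars, Vars], None]  # assignments performed when the transition is taken
    target: str


@dataclass
class TCA:
    initial: str
    accept: Set[str]
    transitions: List[Transition]
    timeouts: Dict[str, Tuple[int, str]] = field(default_factory=dict)  # state -> (limit, target)
    variables: Vars = field(default_factory=dict)
    state: str = field(init=False, default="")
    clock: int = field(init=False, default=0)

    def __post_init__(self) -> None:
        self.state = self.initial

    def advance(self, elapsed: int) -> None:
        """Let time pass; take the timeout transition once the limit is reached."""
        self.clock += elapsed
        if self.state in self.timeouts:
            limit, target = self.timeouts[self.state]
            if self.clock >= limit:
                self.state, self.clock = target, 0

    def step(self, message: str, packet: Vars) -> bool:
        """Take the first enabled transition for this message and reset the clock."""
        for tr in self.transitions:
            if (tr.source == self.state and tr.message == message
                    and tr.guard(self.variables, packet)):
                tr.update(self.variables, packet)
                self.state, self.clock = tr.target, 0
                return True
        return False

    def accepted(self) -> bool:
        return self.state in self.accept


def example_2() -> TCA:
    """The automaton of Example 2, with assumed edges A -> B -> end."""
    def send_answer(g: Vars, p: Vars) -> None:
        p["answer"] = g["The_Answer"]      # packet!.answer := The_Answer

    return TCA(
        initial="A",
        accept={"end"},
        transitions=[
            Transition("A", "packet!", lambda g, p: True, send_answer, "B"),
            Transition("B", "packet?", lambda g, p: p.get("answer") == 42,
                       lambda g, p: None, "end"),
        ],
        timeouts={"B": (11000, "end")},
        variables={"The_Answer": 42},      # assumed value, not given in the text
    )
```

Calling `step` with a message that has no enabled transition leaves the automaton where it is, which mirrors a packet that does not fit the model; `advance` models the timeout transition from B to end.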

In this chapter we look at how a TCP connection behaves and focus on what is relevant for stateful inspection from a firewall's point of view. We use the previously defined Timed Counter Automaton to provide a model for TCP.

3.1 The Protocol

In this section we focus on a specific protocol, namely TCP. We have chosen this protocol as our case study because, as shown in [TMW97], most of the traffic on the Internet uses it. TCP provides a connection-oriented, reliable channel of communication between two peers. The basic TCP is described in RFC 793 [Pos81]. To provide a reliable connection-oriented channel of communication, the protocol specification describes a header, connection establishment, connection termination and data communication. These elements of the protocol are described in the following, with focus on what is needed for stateful inspection.

3.1.1 The TCP Header

  Bit 0                          Bit 16                        Bit 32
  Source Port                    | Destination Port
  Sequence Number
  Acknowledgment Number
  Data Offset | Reserved | URG ACK PSH RST SYN FIN | Window
  Checksum                       | Urgent Pointer
  Options                                          | Padding
  data

Figure 3-1: The header format for TCP.

The header is a chunk of data prepended to each data packet. The format of the header is shown in Figure 3-1. It contains the port on the sender from which the packet is sent and the port on the receiver for which it is intended. The port numbers, together with the IP-addresses from

received. In TCP it is not required that every packet received is acknowledged; instead, every byte transferred must be acknowledged. The field named data offset describes how many 32-bit words are present in the TCP header. The reserved field is not used and must be set to 0.

The following 6 bits are used to determine what type of data is attached and which fields in the packet are used. The first bit is the Urgent flag (URG), which signals that urgent mode is activated. The Acknowledge flag (ACK) indicates that this packet acknowledges some received data. The Push flag (PSH) is a remnant from early implementations of TCP. Its meaning is to push the data sent forward to the receiving application without waiting for further data. The Reset flag (RST) is a control flag that tells the peer that the connection has to be dropped, e.g. as a reply to a faulty synchronization. The Synchronize flag (SYN) is used to start the connection by synchronizing the sequence numbers of the two parties. The Final flag (FIN) is used to signal that there is no more data from the sender. A packet with one of the flags set is referred to as being of that type, e.g. a SYN/ACK packet is a packet with the SYN and ACK flags set.

The Window field contains the size of the sender's current data buffer, implying that no more data than specified in this field can be sent to the sender. If the receiving TCP implementation has not yet delivered the data to the appropriate application, the window size becomes smaller. In other words, this is the size of the data buffer minus the size of the data it contains. The Checksum field is used to minimize the risk of badly transferred data. The Urgent Pointer points to the last byte of urgent data. The Options field is a variable-length field. It can contain multiple options selected for the current packet. Each option may vary in size, and Padding is applied so that the header is composed of whole 32-bit words.
The fields that are interesting for stateful inspection are those that change the state of the connection, identify the connection, or can otherwise be used to confirm the correctness of the connection. These fields are:

- Source port and destination port, because they, together with the source and destination address from the IP layer, uniquely identify the connection.
- Sequence number and acknowledgment number, because they show the progress of the connection. The data length of a packet and the window size are also needed for flow control, which determines how a connection can progress.
- The flags ACK, RST, SYN and FIN, because they determine and change the state of the connection.
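As an illustration of the list above, the following sketch extracts exactly these fields from the fixed part of a TCP header laid out as in Figure 3-1. It is a minimal example, not taken from any implementation discussed in this report; the names `TcpFields` and `parse_tcp_header` are hypothetical, and options are ignored.

```python
import struct
from typing import FrozenSet, NamedTuple

# Bit masks of the six flags, counted from the least significant bit.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpFields(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int          # header length in 32-bit words
    flags: FrozenSet[str]
    window: int


def parse_tcp_header(segment: bytes) -> TcpFields:
    """Extract the fields relevant for stateful inspection from a TCP segment."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # Network byte order: two 16-bit ports, two 32-bit numbers,
    # 16 bits of data offset/reserved/flags, and the 16-bit window.
    src, dst, seq, ack, off_flags, window = struct.unpack("!HHIIHH", segment[:16])
    flags = frozenset(name for name, bit in FLAG_BITS.items() if off_flags & bit)
    return TcpFields(src, dst, seq, ack, off_flags >> 12, flags, window)
```

A packet is then "of type SYN/ACK" in the sense used above when `flags == {"SYN", "ACK"}`.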

3.2 The Model

A TCP connection session can be divided into three phases, namely establishing the connection, transmission of data and connection termination. In this section we describe these phases. First we consider some modeling issues and then move on to connection establishment. For each part we make a TCA.

In order to simplify the figures a special notation is used. A transition with a SYN? is equal to packet? with (packet?.synflag = true), (packet?.ackflag = false), (packet?.finflag = false) and (packet?.rstflag = false). As this is quite long and there is no ambiguity in using the shorter version, the short version is used. Likewise a FIN/ACK is a packet with both the FIN and the ACK flag set. Also, to avoid misunderstanding, SYN?.ack is used instead of packet?.ack to denote the acknowledgment field in the packet.

A number of different variables are used. TYPE is the role the machine takes. This is commonly determined by the implementation and can only be client or server. The ack, seq, win, and dl in the packet scope are respectively the acknowledgment field, the sequence number field, the window size field, and the data length of the packet. The data length is not a field in itself; it is calculated as the total length from the IP header minus the IP header length field from the IP header and the data offset field from the TCP header of the current packet.

3.2.1 Connection Establishment

Before data can be transferred between a client and a server, a connection must be established. In TCP this is done by synchronizing sequence numbers. The client sends a packet with the SYN flag set and includes its initial sequence number (ISN) in the packet. The server must reply to this packet with an ACK packet, acknowledging the ISN plus one, since a packet with the SYN flag set demands that the sequence number is incremented. This is shown in Figure 3-2. However, so far the establishment has only been one-way. A similar process is needed for the server, which is shown in Figure 3-3.
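The data length calculation and the "ISN plus one" rule described above can be written out as follows. This is a small sketch with hypothetical function names. It assumes, as in the IP and TCP header formats, that both header length fields count 32-bit words, and that sequence numbers are 32-bit values that wrap around.

```python
SEQ_MOD = 2 ** 32  # sequence and acknowledgment numbers are 32-bit values


def tcp_data_length(ip_total_length: int, ip_header_words: int,
                    tcp_data_offset_words: int) -> int:
    """dl: IP total length minus the IP header and the TCP header (in bytes)."""
    dl = ip_total_length - 4 * ip_header_words - 4 * tcp_data_offset_words
    if dl < 0:
        raise ValueError("header lengths exceed the IP total length")
    return dl


def expected_ack(seq: int, dl: int) -> int:
    """Acknowledgment for a SYN packet: seq + max(dl, 1), as in the figures."""
    return (seq + max(dl, 1)) % SEQ_MOD
```

For a SYN packet without data, `expected_ack(isn, 0)` yields the ISN plus one, which is the value the server must acknowledge.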
This implies that the server must send a packet to the client with the SYN flag set, containing the server's ISN, and that the client must reply with an ACK. The established state is dashed because this is not the complete figure and more will follow. The normal course is for the client to send its ISN and the server to respond with a SYN/ACK packet, which acknowledges the client's ISN and carries the server's own ISN. The client then responds with an acknowledgment of the server's ISN. However, TCP also allows both peers to initiate a connection at the same time. This means that they simultaneously send their ISNs and acknowledge each other's ISN. These two scenarios would both result in the client considering its connection

arise. To compensate for packet loss, retransmissions are used. For example, if the SYN packet sent by the client does not reach the server, the client will simply time out and retransmit its packet. If the packet lost was the ACK packet sent by the server, the result is much the same: the client still times out and re-sends its SYN packet, and the server sends an ACK in response, assuming that its previous ACK packet was lost. The IP layer can also cause what may look like errors, e.g. by delivering packets out of order.

[Figure 3-2: Connection establishment for the client. States: start, synsent, synrcvd, established. Transition labels: SYN!; SYN? with ack := SYN?.seq + max(SYN?.dl, 1) and ACK!; SYN/ACK? with seq = SYN/ACK?.ack, ack := SYN/ACK?.seq + max(SYN/ACK?.dl, 1) and ACK!; ACK? with seq = ACK?.ack, ACK?.dl <= win and ack := ACK?.seq + ACK?.dl.]

3.2.2 Established Connection

In this section we do not denote the two peers of the connection as client and server, since they both transmit and receive data and neither differs from the other. We instead call them sender and receiver. In this phase of the connection data flows from the sender to the receiver. To ensure that the data arrives at the other end of the connection, that it can be ordered, and that it is not duplicated, we use

the sequence number and acknowledgment number. The window size of the receiver is used to ensure that the sender does not send more than the receiver can receive. If this is done anyway, the excess packets are not acknowledged. If a packet is missing, data following it will not be acknowledged until the missing packet has been received. Acknowledgments themselves are not acknowledged; only a packet that presents a new sequence number is acknowledged.

[Figure 3-3: Connection establishment for the server. Transition from synrcvd to established on ACK? with guards seq = ACK?.ack and ACK?.dl <= win, and update ack := ACK?.seq + ACK?.dl.]

If a packet is lost, the current information about the state of the sender is also lost. The sender whose packet is lost will, if the packet contained data, retransmit it as a result of the receiver not acknowledging the data. The unacknowledged data will first be acknowledged when the receiver retransmits the lost acknowledgment, is ready to transmit something else, or if additional data are sent from the sender. Until then, the retransmissions continue and are dropped as they are received.

Acknowledgments are not required to be sent right away when a packet is received, as TCP acknowledges the individual bytes continuously. If packets are transmitted from a peer that has not yet acknowledged data, the ACK flag is set and the acknowledgment number contains the next sequence number to be received. In most implementations the peer that has received data and not yet sent an acknowledgment waits 200 ms before transmitting the acknowledgment. This is done in order to avoid transmitting additional data onto the network, since all the packets received within the 200 ms can be acknowledged in one packet, and acknowledgments are piggy-backed onto other transmitted packets if possible. [Ste94]
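The acknowledgment behavior described above (bytes are acknowledged cumulatively, data after a missing packet is not acknowledged, data beyond the window is not acknowledged, and retransmissions of already acknowledged data are dropped) can be sketched as a toy receiver. This is an illustration only: the class name `Receiver` is hypothetical, the window is kept constant, overlapping segments and sequence number wrap-around are ignored, and delayed acknowledgments are not modeled.

```python
from typing import Dict


class Receiver:
    """Toy receiver that returns the acknowledgment number for each segment."""

    def __init__(self, next_seq: int, window: int) -> None:
        self.next_seq = next_seq          # next byte expected = acknowledgment number
        self.window = window              # simplification: fixed window size
        self.pending: Dict[int, int] = {}  # seq -> data length of out-of-order segments

    def receive(self, seq: int, dl: int) -> int:
        end = seq + dl
        if end <= self.next_seq:
            return self.next_seq          # retransmission of acknowledged data: dropped
        if end > self.next_seq + self.window:
            return self.next_seq          # exceeds the window: not acknowledged
        if seq > self.next_seq:
            self.pending[seq] = dl        # a packet is missing: hold this one back
            return self.next_seq
        self.next_seq = end               # in-order data: advance
        while self.next_seq in self.pending:
            self.next_seq += self.pending.pop(self.next_seq)
        return self.next_seq
```

With `Receiver(1000, 500)`, a segment starting at 1100 is held back until the segment starting at 1000 arrives, after which both are acknowledged at once.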

are different from each other. The only exception is that the connection is defined as half-closed after one side has terminated its direction of the connection; data can still be transferred in the direction that has not been closed. The connection termination is a two-way handshake. First, one of the sides, in our example the client, sends a packet with the FIN flag set. The server must reply to the FIN packet with an ACK packet whose acknowledgment number is the received sequence number incremented by one or, if the received FIN packet contains data, increased by the data length. Moreover, the server must consider whether data has been lost between the last received packet and the FIN packet. As usual, packets can vanish or be delayed, and in that case they must be retransmitted. The connection is considered closed from the client to the server when the ACK packet is received by the client. The server closes its connection in the same manner.

If the very last acknowledgment is lost, this causes a retransmission of the FIN packet from the server to the client (as the server closes its connection last in our example), even though the connection could already be thought of as closed. To avoid unnecessary packets being sent, the connection will be in a Closewait state. In theory it can stay there forever, since no timeout period has been specified [Pos81]. In the case of simultaneous closing, the only difference from the above is that neither peer can be sure that its last packet arrived at the other peer. Figure 3-4 and Figure 3-5 depict two TCAs describing the closing phase of the connection for the client and the server respectively, the client being the peer which actively closes the connection and the server the peer which passively closes it.

3.3 TCA for TCP Connection

Here we present a TCA describing the entire TCP connection. The TCA is a concatenation of the previously depicted TCAs. The model is depicted in Figure 3-6 and we call this the TCP peer model. There are a few remarks to the figure. Some transitions have been omitted in order to keep the figure small. Retransmissions are not modeled in the figure. Retransmissions are basically a loop on the same state. However, since two transitions can lead to the same state, a workaround is needed to avoid wrong modeling: the state can be split in two, as exemplified in Figure 3-7. As this would make the figure very large, it has been omitted. Also, on every state succeeding the established state there are transitions with the guard seq = ACK/RST?.ack going to closed, and loops on ACK packets

[Figure 3-4: Active connection termination, shown as a TCA. States visible in the figure: finwait2, closing, timewait (left on t = timeout), closed.]

[Figure 3-5: Passive connection termination, shown as a TCA. States: established, closewait, lastack, closed.]

[Figure 3-6: This TCA shows how a TCP connection acts as specified. States: start, listen, synsent, synrecv, established, finwait1, finwait2, closing, timewait, closewait, lastack, closed.]

This is of course not the case for the closed state, as this is an accepting state.

[Figure 3-7: The retransmission workaround.]

3.4 Summary

In this chapter we have studied the TCP protocol with focus on the details that are important for stateful inspection. We have studied the connection phases and made a model for the behavior of each of the peers in a connection. This model represents a single peer and is the same for both peers.

In the previous chapter we looked at the behavior of TCP. From this behavior we derived a model of how a peer should react. In this chapter the behavior of TCP is examined from a stateful inspection firewall's point of view and a new model is derived. Afterwards we address other problems concerning modeling stateful inspection. More specifically, we focus on whether a firewall should be active or passive, and on the behavior of protocols that differ significantly from TCP.

4.1 The Man In The Middle

We start by examining where the stateful inspection firewall is placed. Communication takes place between two points, and it is somewhere between these points that a firewall exists. Where exactly it is placed, be it closer to point A than point B or vice versa, does not matter, since we only need to filter at some point between the peers. In an arbitrary network, literally millions of different routes could exist for a single message to reach its destination, and virtually every message could travel its own route. This problem is eliminated by forcing packets to travel through one point, e.g. by making it physically impossible to go another way. This is necessary because the intention of a firewall is to ensure that unwanted traffic is filtered away, and therefore all traffic must travel through the firewall. In other words, to get traffic in or out of the protected network everything has to pass the firewall, and thereby we have ensured that a firewall is always in between the peers.

Considering communication on a network, a peer can only tell which messages it has received and sent. A single peer does not know whether the messages it has sent have actually been received by the other peer. Likewise a firewall only knows which messages it has seen; it does not know whether they are actually received by a peer afterwards. To make matters worse, it is not guaranteed that a message sent before another will be received before the other.

4.2 Stateful Inspection on TCP

As shown, TCP can be described as a TCA to which any valid TCP connection conforms. Examining whether a TCP connection is valid is therefore a question of changing the state of the connection according to every received packet. We will now look at how we can use this to create a TCA for how a stateful inspection firewall should react to a TCP connection. As said previously, when a connection is viewed from a point in the network, be it one of the peers or a point in between them, it is only possible to say what has come through that point. The firewall, which is in the middle, does not know whether a packet it has seen is received by the receiver or if a response has

positives and false negatives, in other words, the number of wrongly blocked packets and the number of accepted packets which do not belong to the connection. All this will be taken into consideration as we look at how to design a stateful inspection firewall for TCP.

4.2.1 Handshake

The initializing handshake is a three-way handshake or two two-way handshakes. At first there is no connection, and the state is equal to the TCP state Listening or Start, depending on whether the peer is the server or the client. Then a SYN packet is sent from the client to the server and the connection changes to a state equal to Synsent. We will always consider the client to be the one which first sent a SYN packet, even in a simultaneous connection establishment. The server then receives the packet, responds with an ACK to the SYN and sends a SYN itself, commonly combined in a SYN/ACK packet. Again the connection changes state according to what has been sent. Similarly, in the case of simultaneous connection establishment both peers change state accordingly.

Similarly, the handshakes used to close the connection change the state of the connection. Again, as it is not specified who sends the first FIN packet, both scenarios have to be taken into account, as well as simultaneous FINs.

As with TCP there are transitions that allow state changes. By filtering out packets which are not legal state change packets, it is possible to securely make rules for opening and closing connections. In both cases we check that the acknowledgments acknowledge the correct sequence number. This is done by storing the sequence number from the SYN packets and waiting for the corresponding acknowledgments. After synchronization this is done by checking that the data length plus sequence number is equal to the acknowledgment number. Retransmissions should also be allowed through the firewall. To do this a looping state change is used to allow the last seen packet through the firewall.
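The handshake inspection described above can be sketched as a small tracker that stores the sequence numbers from the SYN packets, waits for the matching acknowledgments, and lets a retransmission of the last seen packet through without changing state. This is a simplified illustration, not the model of this report: the class name `HandshakeTracker` and its state names are hypothetical, and only the ordinary three-way handshake is covered, not simultaneous establishment.

```python
from typing import Iterable, Optional

SEQ_MOD = 2 ** 32


class HandshakeTracker:
    """Firewall-side view of the three-way handshake (sketch)."""

    def __init__(self) -> None:
        self.state = "start"
        self.client_isn: Optional[int] = None
        self.server_isn: Optional[int] = None

    def see(self, sender: str, flags: Iterable[str], seq: int, ack: int = 0) -> bool:
        """Return True if the packet is let through the firewall."""
        flags = frozenset(flags)
        if sender == "client" and flags == {"SYN"}:
            if self.state == "start":
                self.state, self.client_isn = "synsent", seq
                return True
            # Retransmitted SYN: loop on the current state, do not go back.
            return self.state in ("synsent", "synrcvd") and seq == self.client_isn
        if sender == "server" and flags == {"SYN", "ACK"}:
            acks_client = (self.client_isn is not None
                           and ack == (self.client_isn + 1) % SEQ_MOD)
            if self.state == "synsent" and acks_client:
                self.state, self.server_isn = "synrcvd", seq
                return True
            # Retransmitted SYN/ACK: same loop.
            return self.state == "synrcvd" and acks_client and seq == self.server_isn
        if sender == "client" and flags == {"ACK"}:
            if self.state == "synrcvd" and ack == (self.server_isn + 1) % SEQ_MOD:
                self.state = "est"
                return True
        return False
```

Packets whose acknowledgment number does not match a stored sequence number are rejected and leave the state unchanged.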
How retransmissions are handled in an established connection is a more difficult matter, and we need more information to handle it; we will therefore come back to this later. However, due to retransmissions it is not always easy to inspect a connection between peers. Consider the following example:

Example 3. A client initiates a connection by sending a SYN. This is registered by the firewall. The server replies with a SYN/ACK, which the firewall also registers, and the firewall now waits for an ACK. However, the client times out before receiving the SYN/ACK and retransmits the SYN. There are two possible actions for the firewall when it sees the SYN: either it can change back to its previous state or it can remain in the same state.

The transmission of packets is illustrated in Figure 4-1. Although this would not halt the connection, a legal ACK is blocked, and this must never happen. Therefore the state must remain the same even after seeing a SYN/ACK from the server and getting a SYN from the client, and none of the packets must be dropped.

[Figure 4-1: The ACK is blocked even though it is legal, because the state has changed back. Message sequence between Client, Firewall and Server: SYN, SYN, SYN/ACK, ACK, SYN/ACK.]

By checking sequence numbers and acknowledgment numbers we have ensured that no malicious packet in the synchronization or termination of connections is able to interfere with the connection. However, the connection can still be interfered with by inserting a FIN packet in the established state, thereby closing the connection. Therefore, measures must be taken to minimize this risk.

4.2.2 Window Size and Sequence Number

The goal of the connection establishment phase is to synchronize the sequence numbers. The sequence number is then used to determine whether all packets have been received or not. Simple state inspection would be to change the state every time a higher sequence number has been seen and when these have been acknowledged. This works well with TCP, but as IP does not guarantee that packets are received in the order in which they were sent, this would lead to dropping of out-of-order packets.
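The problem with such simple inspection can be seen in a few lines. The sketch below is a deliberately naive inspector (the name `NaiveInspector` is hypothetical) that only lets through the packet carrying exactly the next expected sequence number; it is shown to illustrate the shortcoming, not as a proposed design.

```python
class NaiveInspector:
    """Accepts only the packet that continues the highest sequence number seen."""

    def __init__(self, next_seq: int) -> None:
        self.next_seq = next_seq

    def accept(self, seq: int, dl: int) -> bool:
        if seq != self.next_seq:
            return False          # anything out of order is dropped
        self.next_seq = seq + dl
        return True
```

If two legal packets with sequence numbers 1000 and 1100 are reordered by the IP layer, the packet starting at 1100 arrives first and is dropped even though it belongs to the connection. This motivates testing against a range of acceptable values instead of a single expected value.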

Data Boundaries

For any given packet, $s$ is the sequence number and $n$ is the length of the data. The notation $s_B$ is host B's sequence number and $n_B$ is the data length of the packet from B. We know that we are not allowed to send more data than the receiver can accept, and thus the upper boundary for a given data packet is this:

(last byte in packet) $\le$ (maximum byte that host A is allowed to send)

The last byte in a packet is known to be the same as the sequence number plus the data length of the packet. The maximum byte that A is allowed to send is the last acknowledged number from B, $a_B$, plus the last window size from B, $\mathit{win}_B$, which host B has sent and which has been seen by A. The function $\mathrm{last}_X(x)$ returns the last $x$ seen by $X$, e.g. $\mathrm{last}_A(s_B)$ is the last sequence number from host B that A has seen. Note that the last is not the same as the current. From this we get:

$$ s_A + n_A \le \mathrm{last}_A(a_B) + \mathrm{last}_A(\mathit{win}_B) \qquad \text{(4-1)} $$

There is an exception to (4-1). When a window size of zero is announced by host B, host A will start a so-called persist timer that probes host B for a non-zero window size. This is done by periodically sending 1 byte data packets to B until a response packet from B containing a non-zero window has been seen by A [SW95, page 827]. This is the only exception to the data boundaries [Pos81, section 1.5]. The function $\max(x, y)$ returns the larger of $x$ and $y$. From this the upper boundary is:

$$ s_A + n_A \le \mathrm{last}_A(a_B) + \mathrm{last}_A(\max(\mathit{win}_B, 1)) \qquad \text{(4-2)} $$

Considering the firewall in the middle, we can also say that this holds for the firewall, here denoted $F$. Thus we get the upper boundary for what the firewall should accept:

$$ s_A + n_A \le \mathrm{last}_F(a_B) + \mathrm{last}_F(\max(\mathit{win}_B, 1)) \qquad \text{(4-3)} $$

Moving on to the lower boundary, we know that host A will only send data that has not been acknowledged, and thus we have:

$$ s_A \ge \mathrm{last}_A(a_B) \qquad \text{(4-4)} $$

$$ s_A \ge \mathrm{last}_A(s_A) + \mathrm{last}_A(n_A) - \mathrm{last}_A(\max(\mathit{win}_B, 1)) \qquad \text{(4-6)} $$

Finally, if we move the point of view to the firewall, we see that the following holds, and our lower data boundary for the firewall is:

$$ s_A \ge \mathrm{last}_F(s_A) + \mathrm{last}_F(n_A) - \mathrm{last}_F(\max(\mathit{win}_B, 1)) \qquad \text{(4-7)} $$

Acknowledgment Boundaries

We now move on to host B acknowledging data from host A. The notation $a_B$ is the acknowledgment number of the current packet which we are examining from host B. Considering the upper boundary for acknowledgments, we know that data which has not been sent cannot be acknowledged. Thus we get:

$$ a_B \le \mathrm{last}_B(s_A) + \mathrm{last}_B(n_A) \qquad \text{(4-8)} $$

Again, considering this from the firewall's point of view, we get the upper boundary for the firewall:

$$ a_B \le \mathrm{last}_F(s_A) + \mathrm{last}_F(n_A) \qquad \text{(4-9)} $$

Matters are, however, complicated when considering the lower boundary for acknowledgments. Intuitively we know that there is no need to acknowledge data which has already been acknowledged, and we can use the boundary:

$$ a_B \ge \mathrm{last}_F(a_B) \qquad \text{(4-10)} $$

However, we do not know if packets arrive out of order. If two acknowledgment packets arrive out of order, the packet which is out of bound will be blocked. As we do not want to interfere with a protocol that is behaving properly, as is the case here, we need to refine our view of the lower boundary. We define a new value which we call MAXACKWINDOW. This value is slightly greater than the maximum value of a TCP window size. Using this we get:

$$ a_B \ge \mathrm{last}_F(s_A) + \mathrm{last}_F(n_A) - \mathrm{MAXACKWINDOW} \qquad \text{(4-11)} $$

By making MAXACKWINDOW larger than the maximum possible TCP window size we have ensured that no valid ACK is blocked. The observation is that all the data in the window is potentially being acknowledged, under the window everything is acknowledged and everything

Using the Boundaries

The boundaries are fairly simple to test, and each packet that passes the firewall is tested against each boundary. If a packet that is not within these boundaries is received by the firewall, it is dropped. With this the window for malicious packets getting through is minimized. Alas, the window cannot be removed entirely because packets can arrive out of order. The four boundaries are as follows.

  Boundary                 Condition
  Upper Data               $s_A + n_A \le \mathrm{last}_F(a_B) + \mathrm{last}_F(\max(\mathit{win}_B, 1))$
  Lower Data               $s_A \ge \mathrm{last}_F(s_A) + \mathrm{last}_F(n_A) - \mathrm{last}_F(\max(\mathit{win}_B, 1))$
  Upper Acknowledgment     $a_B \le \mathrm{last}_F(s_A) + \mathrm{last}_F(n_A)$
  Lower Acknowledgment     $a_B \ge \mathrm{last}_F(s_A) + \mathrm{last}_F(n_A) - \mathrm{MAXACKWINDOW}$

4.2.3 Resetting the Connection

Using all the above boundaries the connection is reasonably secured from malicious data. However, the network medium is volatile and may corrupt connections to an extent where they must be terminated. This termination is handled by the special reset (RST) packets, and these also have to be accounted for. As described, RST packets must be allowed to pass through the firewall since they are an essential part of the protocol. An example of when a RST packet is sent is the following: a client sends a SYN which the server receives, and an old SYN is then received by the server afterwards. The server does not know which of these SYNs it should react to, so it sends a RST packet. In this case the connection should be dropped. Likewise there are other scenarios where a RST is allowed. Of course a connection should not be terminated just because a RST is received; the RST must have been sent by one of the peers and must be validated by checking that it acknowledges the right sequence number. This is similar to the TCP peer model, Figure 3-6. Instead of going to the closed state, where all stored information about the connection is removed, the transition goes to the timewait state, because of retransmissions.

4.3 The Stateful Inspection Model

In section 4.2 we have considered the SYN, ACK, FIN, and RST flags of the TCP header.
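The four boundaries of Section 4.2.2 translate directly into two range checks. The sketch below is one possible reading, not the implementation examined later in this report. The names `Seen`, `data_in_bounds` and `ack_in_bounds` are hypothetical, the concrete value chosen for `MAXACKWINDOW` is an assumption (the text only requires it to be slightly larger than the maximum window size), and sequence number wrap-around is ignored for clarity.

```python
from dataclasses import dataclass

# Assumed value: the largest unscaled TCP window (65535) plus some slack.
MAXACKWINDOW = 65535 + 1500


@dataclass
class Seen:
    """Values the firewall has last seen, i.e. last_F(...) in the text."""
    last_seq_a: int   # last sequence number from host A
    last_len_a: int   # last data length from host A
    last_ack_b: int   # last acknowledgment number from host B
    last_win_b: int   # last window size from host B


def data_in_bounds(s_a: int, n_a: int, seen: Seen) -> bool:
    """Upper and lower data boundary for a packet from A."""
    window = max(seen.last_win_b, 1)      # 1 allows the persist-timer probe
    upper = seen.last_ack_b + window
    lower = seen.last_seq_a + seen.last_len_a - window
    return s_a + n_a <= upper and s_a >= lower


def ack_in_bounds(a_b: int, seen: Seen) -> bool:
    """Upper and lower acknowledgment boundary for a packet from B."""
    sent = seen.last_seq_a + seen.last_len_a
    return sent - MAXACKWINDOW <= a_b <= sent
```

A packet is let through only if the check for its role succeeds; a real filter would additionally update `Seen` after each accepted packet and compare sequence numbers modulo 2^32.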
We have also considered the sequence number field, acknowledgment number field, the packet data

state in the client figure and the synrecv in the server figure the before-mentioned boundaries are applied but omitted from the figure to save space. The reason for having two figures is that it is more appropriate to view the connection as what comes from each peer. In other words, as we only º can say what ËØ Ø ÙÐÁÒ Ô Ø ÓÒÁ Ù has been sent by a peer, and thus what state it is in, we model this. The variables Ø Ø andë Ø Ø are shared values, which means that an update in one model updates the other model. We use a rewriting of to in order to save space. Moreover, we have not considered timeouts in every state in the two TCAs since none is specified in the RFC [Pos81]. º º½ÅÓ Ð ÓÑÔÐ Ø Ò We have discussed TCP specific issues, now we will address some more general issues on modeling stateful inspection. More specifically we look at issues which TCP does not cover and issues that are non-protocol related. It is important that the firewall conforms to the specification of the protocol that it inspects, otherwise packets that should have been discarded may pass through and packets that should have been allowed may be blocked. This kind of behavior is of course unacceptable and should be avoided. As attacks often use behavior that is not described in the protocol specification, they therefore leave the reaction to attacks to the implementation of the protocol. These rarely have attack protection as this should be handled on other layers. The SITCP model covers a basic TCP implementation as is specified in RFC 793 [Pos81]. However, there are issues which we do not cover in the model, e.g. TCP option. These were, at time of writing, not relevant for the model. If the specification is changed at any time, in a way which conflicts with the model, the model has to be updated accordingly. If the model is not up to date and do not allow every scenario possible as specified by the specification, blocking and accepting of wrong packets can happen. 

Passive or Active

Another way to possibly break conformance is to interfere with the connection, e.g. by sending packets on behalf of one of the communicating parties. In some cases this is desired for reasons of security or performance. Each case should be studied thoroughly to determine the consequences before using such a feature. The purpose of a firewall is not to interfere with communication, but to ensure that only the communication that the administrator of the firewall accepts gets through. If the firewall plays an

Figure 4-2: The final state diagram for stateful inspection on TCP for the client. (States: synsent, est, finwait, timewait, closed. Transitions are guarded by the received flags, the shared variable Sstate, and checks such as ACK?.ack = seq + max(dl,1) and win >= dl; timewait leads to closed when t = 2MSL.)

Figure 4-3: The final state diagram for stateful inspection on TCP for the server. (States: start, synrcvd, est, finwait, timewait, closed. Transitions are guarded by the received flags, the shared variable Cstate, and checks such as SYN/ACK?.ack = seq + max(dl,1); timewait leads to closed when t = 2MSL.)

The paradox is that it is not really possible to determine whether a communication shows evasive behavior because of an attack or because of a faulty peer or a bad implementation on the peer. We want to be active if it is an attack, but not if it is a faulty peer. If it can be verified that the communication semantics are the same when a firewall is active as when it is passive, then there is no harm in making the firewall active, as this could be used to speed up communications and ensure prevention of attacks. However, most implementations are not formally verified but rather verified by thorough testing, as this is generally easier to do.

Other Protocols

We have applied stateful inspection to TCP. This is not the only protocol that exists, as it is not always appropriate to use TCP for everything, e.g. with streaming media it is sometimes better that only some of the data gets through to ensure the flow of the stream. Therefore other protocols with less restrictive semantics, like UDP, exist. We will therefore examine the possibility of applying stateful inspection to other protocols.

User Datagram Protocol

Examining UDP, it is impossible to talk about states as it is connectionless and has no predefined sequence of events. It is in fact stateless in itself [Pos80]. If we define UDP as having two states, opened and closed, one way to look at it would be to say that if the first packet from A is allowed to B by the rules, any subsequent packet going either from A to B or from B to A is allowed within a given timeout. The timeout is necessary because we cannot tell from the protocol itself when the sender has finished its transmission. This behavior is highly vulnerable to packet insertion as it only checks for sender and receiver (and the corresponding ports). As the only measurable metrics are the two ports and the two IP addresses, this is the only thing that can be done. What actually should be done in the case of UDP is an implementation choice.
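
The two-state view of UDP described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names udp_flow, udp_flow_open and udp_flow_allows, as well as the 30 second timeout, are hypothetical and are not taken from any implementation discussed in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pseudo-state for one UDP "connection" between peers A and B. */
struct udp_flow {
    uint32_t ip_a, ip_b;      /* IP addresses of the two peers */
    uint16_t port_a, port_b;  /* the corresponding UDP ports */
    bool     opened;          /* false = closed, true = opened */
    uint64_t last_seen;       /* time of the last accepted packet, in seconds */
};

/* Assumed value; the timeout to use is an implementation choice. */
#define UDP_FLOW_TIMEOUT 30u

/* Called when the rules allow the first packet from A to B. */
static void udp_flow_open(struct udp_flow *f,
                          uint32_t src_ip, uint16_t src_port,
                          uint32_t dst_ip, uint16_t dst_port, uint64_t now)
{
    assert(f != NULL);
    f->ip_a = src_ip;  f->port_a = src_port;
    f->ip_b = dst_ip;  f->port_b = dst_port;
    f->opened = true;
    f->last_seen = now;
}

/* Returns true if a later packet matches the opened flow in either
 * direction and arrives within the timeout; refreshes the timer if so. */
static bool udp_flow_allows(struct udp_flow *f,
                            uint32_t src_ip, uint16_t src_port,
                            uint32_t dst_ip, uint16_t dst_port, uint64_t now)
{
    bool a_to_b, b_to_a;

    assert(f != NULL);
    if (!f->opened)
        return false;
    if (now - f->last_seen > UDP_FLOW_TIMEOUT) {
        f->opened = false;    /* timed out: back to the closed state */
        return false;
    }
    a_to_b = src_ip == f->ip_a && src_port == f->port_a &&
             dst_ip == f->ip_b && dst_port == f->port_b;
    b_to_a = src_ip == f->ip_b && src_port == f->port_b &&
             dst_ip == f->ip_a && dst_port == f->port_a;
    if (!a_to_b && !b_to_a)
        return false;
    f->last_seen = now;
    return true;
}
```

Note that the check uses nothing but the two addresses and the two ports, which is exactly why an inserted packet carrying the right four values would be accepted.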

Dynamic Host Configuration Protocol

Although UDP itself does not have states, other protocols that use UDP as transport protocol do have states. One such protocol is the Dynamic Host Configuration Protocol (DHCP). Considering DHCP in brief, the client makes a request for a DHCP server, after which all available DHCP servers respond to that request by announcing that they exist. The client then chooses a server (commonly the first to reply) and requests a configuration from it. The server then replies with a configuration and the client is configured. It is very clear already at this point that DHCP has states, even though the protocol is more detailed [Dro93]. So it is possible to talk about states in DHCP even though it uses UDP, where this is not possible.

Considering this, allowing a connection to port 21 on the TCP level is not enough, as the data is transferred on another connection. As these connections are established on unknown ports, it is impossible to make fitting rules for them. Therefore state inspection on this protocol's layer is needed. However, it is undesirable to model the starting of related connections in the model itself, as there can be a virtually unlimited number of related connections (in reality you can only have as many as you have free sockets, but this is still a lot). Instead it is more appropriate to have a model for the control channel and a model for the data channel. In the model of a control channel you specify that, when needed, a new instance of the data channel model is started.

Summary

By assuming nothing and concluding only state changes for the single peer based on what it has sent, it was possible to model stateful inspection on TCP. Also, it was possible to minimize the window in which malicious packets are accepted by using information about sequence and acknowledgment numbers and window sizes. It is also clear that there is a need for state inspection on protocols other than TCP. Although certain protocols may be without states, this does not rule out filtering on protocols which depend on these protocols. If protocols have a relationship, this does not affect filtering on other layers, i.e. there is no state relationship between protocols.

In this chapter we will focus on the implementation of stateful inspection in Netfilter, specifically the way state keeping is handled and how the list of connection states is maintained. But first we will briefly explain what Linux and Netfilter are.

5.1 Linux and Netfilter

Linux is described by www.linux.org as "a free Unix-type operating system originally created by Linus Torvalds with the assistance of developers around the world. Developed under the GNU General Public License, the source code for Linux is freely available to everyone." We chose Linux because the source code is available and because of the architecture of Netfilter, described in Section 5.1.1. Netfilter is also built modularly, which means that modules can be replaced and loaded separately. These modules can be made and tested in user space before using them in kernel space, which means that the operating system does not suffer if a module should prove faulty. Netfilter is described in the Linux netfilter Hacking HOWTO [Rus01] as "a framework for packet mangling" ... that ... "defines 'hooks' (IPv4 defines 5) which are well-defined points in a packet's traversal of that protocol stack" ... and that ... "At each of these points, the protocol will call the netfilter framework with the packet and the hook number."

5.1.1 Netfilter Architecture

The modules described in Section 5.1.2 all register themselves on one or more of the hooks. Thereby the modules can interact with the packet and do whatever they want with it. Figure 5-1 on the next page shows how packets traverse the hooks in Netfilter. Packets from the network enter at pre-routing and packets from local processes enter at local out. The packets leave the machine after the hook post-routing. The hook local in is called when packets are destined for a process on the machine; the forward hook is called for packets that are just being forwarded through the machine. For a firewall this is of course the way most packets go.
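
The traversal just described can be summarized in a small C sketch. The enum and function names below are hypothetical and chosen for illustration only; they are not the identifiers used by Netfilter itself.

```c
#include <assert.h>
#include <stddef.h>

/* The five IPv4 hooks, under illustrative (non-kernel) names. */
enum nf_hook_sketch {
    HOOK_PRE_ROUTING,
    HOOK_LOCAL_IN,
    HOOK_FORWARD,
    HOOK_LOCAL_OUT,
    HOOK_POST_ROUTING
};

/* The three ways a packet can pass through the machine. */
enum packet_path {
    PATH_INCOMING,   /* from the network to a local process */
    PATH_FORWARDED,  /* from the network, routed on to the network */
    PATH_OUTGOING    /* from a local process to the network */
};

/* Writes the hooks a packet passes, in order, into out[] and returns
 * how many hooks were written (at most 3). */
static size_t hooks_for_path(enum packet_path path, enum nf_hook_sketch out[3])
{
    assert(out != NULL);
    switch (path) {
    case PATH_INCOMING:
        out[0] = HOOK_PRE_ROUTING;
        out[1] = HOOK_LOCAL_IN;
        return 2;
    case PATH_FORWARDED:
        out[0] = HOOK_PRE_ROUTING;
        out[1] = HOOK_FORWARD;
        out[2] = HOOK_POST_ROUTING;
        return 3;
    case PATH_OUTGOING:
        out[0] = HOOK_LOCAL_OUT;
        out[1] = HOOK_POST_ROUTING;
        return 2;
    }
    return 0;
}
```

For a pure firewall most packets take the PATH_FORWARDED route, so a module that only registers on the forward hook sees most, but not all, of the traffic.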
At each of the hooks it is possible to interact with the packet, e.g. alter the content or drop the packet. The routing points are not hooks; they simply determine the destination of the packet, because the destination could have been changed by a module.

5.1.2 Netfilter Modules

Here we briefly describe the most basic modules that hook onto the Netfilter architecture and are relevant for stateful inspection. Others are Mangling and NAT, but these are not considered since they are not relevant.

Figure 5-1: The Netfilter architecture.

In addition to these modules, there is a mandatory IP Tables module. The IP Tables module is only a framework for the different parts in Netfilter, so it is not relevant to consider in this project.

The Conntrack Module

Of the several modules in Netfilter there is one module which is dedicated to remembering the different connections, namely the Connection Tracking module, which will be referred to as the Conntrack module. It is an independent module which other modules can contact to get information about a given connection, amongst other things the state of a connection. In the Netfilter architecture, as seen in Figure 5-1, the Conntrack module hooks in at four points: pre-routing, local in, local out, and post-routing. We will exemplify the reason for this.

Example 4 Two machines, A and B, are each on their own network, with a machine working as both firewall and NAT, machine F, in the middle. A starts a connection to B. First we have to store the connection from A to B (and B to A); afterwards we can NAT the packets. Now because of NAT we also have to store the connection from F to B (and B to F). Every time we have to store the connection before NAT, because NAT uses this information itself to determine which packets should be translated to what before sending them on.

From this we can see that we need to track before NAT and after NAT. This explains the hooks at pre-routing and post-routing. There is also a special case which adds the need to hook at local in and local out. If the machine acting as firewall has multiple IP addresses, e.g. one for the firewall and one for a web server, packets can still go through NAT, but will no longer pass through the forward hook. Instead they pass through local in and local out, because the web server and the firewall are the same machine. So because we still need to track after NAT, we need to add hooks at local in and local out.

The Filter Module

The Filter module is the module which keeps track of the rules set through the interface. It also determines on the basis of these rules which packets to drop, reject, or accept. The rules can be

matched the rules.

The Module Containing TCP Model

The Filter module contains only the rule checking part, and as this is not a focus of this project we do not need to consider it further. The State module is only a wrapper, so this also has no relevance for this project. Finally, the Conntrack module keeps states and contains the model for state changes, so it is here we find the relevant code for this project. This module is described in the following section.

5.2 Implementation of TCP State Tracking

The current implementation is rather incomplete when it comes to checking TCP. We have found that there is a handshake check on synchronization and that the state of each of the peers in the connection is tracked. The following sections describe the code of these checks. The code for stateful inspection of TCP in the Conntrack module is located in the file ip_conntrack_proto_tcp.c in the kernel code [T 01]. We have also included this file in appendix B on page 61, and throughout this section we will consider only this piece of code. The function tcp_packet has the responsibility of determining whether the packets that are inspected should be dropped or not. The code in the following is from that file.

5.2.1 Handshake

The code checks the sequence number of the synchronization handshake. This is done by saving the sequence number of the SYN packet and comparing it with the acknowledgment number of the following ACK packet. This is only done for the server's sequence number.

    if (oldtcpstate == TCP_CONNTRACK_SYN_SENT
        && CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY
        && tcph->syn && tcph->ack)
            conntrack->proto.tcp.handshake_ack
                    = htonl(ntohl(tcph->seq) + 1);

This code checks that:

- the previous state is SYN_SENT

    if (oldtcpstate == TCP_CONNTRACK_SYN_RECV
        && CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL
        && tcph->ack && !tcph->syn
        && tcph->ack_seq == conntrack->proto.tcp.handshake_ack)
            set_bit(IPS_ASSURED_BIT, &conntrack->status);

This code checks that:

- the previous state is SYN_RECV
- the current packet is from the client
- the ACK flag is set and the SYN flag is not set
- the acknowledgment number is equal to the previously saved number

If all this is true, then the assured bit is set, meaning that we now consider this connection to be established.

5.2.2 Transitions

In the code there is functionality that behaves like our derived TCA model of stateful inspection on TCP. It is an array that is used to find the state of a connection based on the flags, the direction of the packet and the previous state of the connection. The new state is determined by indexing a three dimensional array (called tcp_conntracks). The first dimension has two fields that correspond to the directions of the connection. The second dimension has five fields that correspond to the flags rst, syn, fin, ack and no flag set. The flags are in a prioritized order and only one flag is selected, so if more than one flag is set, the flag with the highest priority is chosen. The third dimension has ten fields corresponding to the states that a TCP connection can be in, plus two administrative states, i.e. siv (MAX_CONNECTION) and sno (NO_CONNECTION) [Pos81].

    static enum tcp_conntrack tcp_conntracks[2][5][TCP_CONNTRACK_MAX] = {
        {
    /*      ORIGINAL */
    /*        sno, ses, sss, ssr, sfw, stw, scl, scw, sla, sli */
    /*syn*/  {sss, ses, sss, ssr, sss, sss, sss, sss, sss, sli },

    /*        sno, ses, sss, ssr, sfw, stw, scl, scw, sla, sli */
    /*syn*/  {ssr, ses, ssr, ssr, ssr, ssr, ssr, ssr, ssr, ssr },
    /*fin*/  {scl, scw, sss, stw, stw, stw, scl, scw, sla, sli },
    /*ack*/  {scl, ses, sss, ssr, sfw, stw, scl, scw, scl, sli },
    /*rst*/  {scl, scl, scl, scl, scl, scl, scl, scl, sla, sli },
    /*none*/ {siv, siv, siv, siv, siv, siv, siv, siv, siv, siv }
        }
    };

In the following code, the first line saves the state of the connection. This is needed to determine the new state. The second line indexes the array and sets the new state.

    oldtcpstate = conntrack->proto.tcp.state;
    newconntrack = tcp_conntracks
                       [CTINFO2DIR(ctinfo)]
                       [get_conntrack_index(tcph)][oldtcpstate];

Although the code tracks the state of each of the peers in the connection, the code never rejects packets based on the state.

The State Table

As mentioned earlier, it is necessary to remember the state of the connection in order to inspect received packets and act on them. The state information remembered should contain just enough data to make it possible to recreate the parts of the previous state that are important for inspecting the currently received packet. Netfilter solves this by using a state table, in which state information can be stored and retrieved. Netfilter implements the state table, among other code, in the file ip_conntrack_core.c; this file will be partly explained in this section. The parts which are discussed are included in appendix A on page 55.

In Netfilter the state table is implemented as a hashtable with the size:

\[
\text{hash\_table\_entries}(RAM) =
\begin{cases}
16 & \text{if } RAM < 2\,\text{MB} \\
\dfrac{RAM}{16384 \cdot \text{size\_of\_entry}} & \text{if } 2\,\text{MB} \le RAM \le 1\,\text{GB} \\
8192 & \text{if } 1\,\text{GB} < RAM
\end{cases}
\]

with size_of_entry = 8 bytes.
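
A sizing rule of this kind can be sketched in C as follows. The sketch assumes that the number of buckets is the amount of memory divided by 16384 and by the entry size, with a lower bound of 16 buckets and a fixed 8192 buckets above 1 GB; these constants reflect our reading of the size formula and should be checked against the source. The function name hash_table_entries is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per hash table entry, as stated in the text. */
#define SIZE_OF_ENTRY 8u

/* Returns the number of hash buckets for a machine with ram_bytes of
 * memory, under the assumptions named in the lead-in. */
static uint64_t hash_table_entries(uint64_t ram_bytes)
{
    const uint64_t one_gb = 1024ull * 1024ull * 1024ull;
    uint64_t entries;

    if (ram_bytes > one_gb)
        return 8192;                      /* large machines: fixed size */

    entries = ram_bytes / 16384u / SIZE_OF_ENTRY;
    if (entries < 16)
        entries = 16;                     /* small machines: lower bound */

    assert(entries <= 8192);              /* holds for ram_bytes <= 1 GB */
    return entries;
}
```

Under these assumptions a machine with 32 MB of memory gets 256 buckets, and the two bounds meet the middle case exactly at 2 MB and 1 GB.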

        /* ntohl because more differences in low bits. */
        /* To ensure that halves of the same connection don't hash
           clash, we add the source per-proto again. */
        return (ntohl(tuple->src.ip + tuple->dst.ip
                      + tuple->src.u.all + tuple->dst.u.all
                      + tuple->dst.protonum)
                + ntohs(tuple->src.u.all))
               % ip_conntrack_htable_size;
    }

The hash key is constructed from the packet's source and destination IP addresses along with another number, referenced as all; in TCP this is the port number. Using only the numbers described so far, the hash key would be the same for packets arriving from both sides of the connection. The protocol number does not change that. However, the packet source's all field is added once again, which makes the two hash keys different from each other. The ntohl (network to host long) and ntohs (network to host short) functions convert the numbers from network byte order (big endian) to host byte order, which is little endian on common x86 machines. This is done to generate more dispersion for input values that are quite similar.

State Table Management

Each packet received by the connection tracking module is evaluated in order to determine whether the packet is already part of a connection, or whether a new connection should be created. The function ip_conntrack_in is called by Netfilter each time a packet is received. ip_conntrack_in first determines which connection the received packet is part of, that is, which entry in the state table should be used to determine whether the packet is valid. The function resolve_normal_ct is called by ip_conntrack_in to handle the connection finding. If no matching entry is found in the hash chain, an attempt is made to create a new entry. Of course this depends on the received packet; if the packet cannot start a connection it must be rejected. The actual creation of state entries is done by the function init_conntrack.
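
The effect of adding the source's all field a second time can be illustrated with a simplified C sketch. It works on values that are assumed to be in host byte order already, so the ntohl and ntohs conversions are left out; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a connection tuple (illustrative only). */
struct tuple_sketch {
    uint32_t src_ip, dst_ip;
    uint16_t src_all, dst_all;  /* for TCP: the port numbers */
    uint8_t  protonum;
};

/* Same arithmetic shape as the hash above, without byte order handling. */
static uint32_t hash_tuple_sketch(const struct tuple_sketch *t,
                                  uint32_t table_size)
{
    uint32_t symmetric;

    assert(t != NULL && table_size > 0);
    /* This sum is identical for both directions of a connection. */
    symmetric = t->src_ip + t->dst_ip + t->src_all + t->dst_all + t->protonum;
    /* Adding the source's field once more breaks the symmetry. */
    return (symmetric + t->src_all) % table_size;
}
```

With a client at 10.0.0.1 port 40000 talking to a server at 10.0.0.2 port 80 and a table of 256 buckets, the two directions land in different buckets. The keys can still coincide when the two port numbers are congruent modulo the table size.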
However, init_conntrack is not merely used to insert connections into the state table; it is also used to remove unconfirmed connections if the state table is already full. The algorithm tries to remove an arbitrary state entry if no memory is available to create a new one. If no arbitrary connection can be removed, the algorithm tries to remove an entry from the same hash chain as the one it is about to insert into. The policy used by ip_conntrack is to remove connections that have not yet been confirmed, that is, connections which have only seen a client request but not a server reply and thus do not yet qualify as connections. This specific policy

the TCP timeouts implemented by Netfilter.

    State            Timeout
    No connection    30 minutes
    SYN_SENT         2 minutes
    SYN_RECEIVED     1 minute
    ESTABLISHED      5 days
    FIN_WAIT         2 minutes
    TIME_WAIT        2 minutes
    CLOSED           10 seconds
    CLOSE_WAIT       1 minute
    LAST_ACK         30 seconds
    LISTEN           2 minutes

Table 5-1: Timeout periods for TCP as defined by ip_conntrack.

The State Entry

Having discussed the workings of the state table, one thing remains, namely the state entry, which we will now focus on. The state entry contains a large amount of data. The main part of a connection tracking entry is the struct called ip_conntrack. When Netfilter is compiled without IRC, NAT, and Masquerading, a single entry takes up 198 bytes. A schematic representation of a connection tracking entry is depicted in Figure 5-2 on the following page. The figure is constructed as follows: the arrows indicate that a certain field is of a specific struct type. When a type is declared as a union, undirected lines indicate which primitive data types can be stored in it. The connection point for the hash chain is the list in ip_conntrack_tuple_hash. Besides being the connection point of the hash table, ip_conntrack_tuple_hash also contains the struct tuple. Tuple holds the IP numbers of the source and the destination of the connection, along with port information and the like, used to identify a connection. The pointer ctrack connects ip_conntrack_tuple_hash to the main struct, namely ip_conntrack. The connection tracking entry contains much data; among the most interesting is the struct time_list, which contains the time at which the connection tracking entry becomes obsolete and can be deleted. The fields of the ip_conntrack struct gather the different parts of the connection tracking table. Here follows an examination of them:

Figure 5-2: Schematic representation of the minimal Netfilter struct used at each state entry.

connection is marked as being used.

expected: Contains a list of expected connections, that is, connections that are related to this connection.

*helper: Points to a helper that may be of service with the connection tracking of higher level protocols. The helper contains a function that can, e.g., determine which packets to let through the firewall and which to deny, whether a related connection should be allowed, and so forth.

master: Is a nf_ct_info struct. If the current connection was opened as a related connection, master will point to the ct_general of the ip_conntrack that allowed this connection to be created.

infos: These nf_ct_infos specify what relation this packet has to the connection tracking. They are initially set to the ct_general of the ip_conntrack to which the nf_ct_infos belong.

help: This union contains structures for the state keeping of higher level protocols. Higher level protocols include FTP and IRC.

proto: This union contains structures for protocols supported by Netfilter connection tracking.

ct_general: A counter used to count all connections that are mastered by the holding connection, along with timers and infos.

Summary

In this chapter we have gathered information about how stateful inspection works in Netfilter. From this we have made two TCAs that show how the state changes for each of the directions: from the client, Figure 5-3 on the next page, and from the server, Figure 5-4 on page 47. Combined, these are a model of stateful inspection in Netfilter, and we call this the NFSI (Netfilter stateful inspection) model. In addition to what was previously stated about the figures, a flag matches any combination of that flag with flags of lower priority. E.g. FIN? means both FIN? and FIN/ACK?. This is due to the prioritizing of flags in the code.
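
The flag prioritization can be sketched in C as follows. We assume here that the priority order is rst, syn, fin, ack, as listed in the description of the second array dimension, and that the returned index follows the row order of the transition table (syn, fin, ack, rst, none). The type and function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Row indices in the order the table rows appear: syn, fin, ack, rst, none. */
enum flag_index { IDX_SYN, IDX_FIN, IDX_ACK, IDX_RST, IDX_NONE };

/* Simplified stand-in for the flag bits of a TCP header. */
struct tcp_flags_sketch {
    bool syn, fin, ack, rst;
};

/* Selects exactly one row: the set flag with the highest priority wins. */
static enum flag_index select_flag(const struct tcp_flags_sketch *f)
{
    assert(f != NULL);
    if (f->rst)
        return IDX_RST;
    if (f->syn)
        return IDX_SYN;
    if (f->fin)
        return IDX_FIN;
    if (f->ack)
        return IDX_ACK;
    return IDX_NONE;
}
```

A packet with both FIN and ACK set therefore selects the fin row, which is why FIN? in the figures covers both FIN? and FIN/ACK?.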

Figure 5-3: State changes, client side.

Figure 5-4: State changes, server side.

In Chapter 3 and Chapter 4 we built a model of a TCP connection seen respectively from the peers, the TCP peer model, and from the firewall, the SITCP model. In Chapter 5 we examined the implementation of stateful inspection on Linux Netfilter and created a model of how Netfilter works, the NFSI model. Here, we will compare the SITCP model and the NFSI model in order to find the points where they are alike, where they differ, and what is lacking.

Similarities

On some points the NFSI model and the SITCP model are alike. These points are:

1. They are described in a statemachine-like way.
2. They track the connection of each of the peers.
3. They protect the connection phase.

Ad 1. We have described the SITCP model as a TCA and we were also able to describe the NFSI model as such. This shows that our representation can represent a real implementation, and it makes it easier to compare the models.

Ad 2. Both the NFSI model and the SITCP model have a separate sub model for each of the peers. It is useful to have two models simply because the behavior of the peers is different. Therefore we have no single state for a connection, but rather two states, one for each peer. It is also the case that we do not assume anything about a connection. State changes are based on facts. It is possible to combine the models into one using what we know about computational theory, but it would give no advantages and make the model grow large [Sip96].

Ad 3. Both the NFSI model and the SITCP model follow the connection phase by checking sequence numbers and acknowledgment numbers. This is an important phase of the connection because it makes sure that both peers are aware of the connection. This is also where the initial sequence number is exchanged, which is stored in order to ensure correct flow. However, the NFSI model does not check the client's sequence number, only the server's.

sizes.

Ad 1. The NFSI model tracks the state changes, but it does not use them for anything other than timeouts. The SITCP model tracks the state of the peers and decides whether to allow or drop a packet based on this state. The NFSI model should do this in all phases and for the entire life of the connection. As filtering packets based on the state is what stateful inspection is, this is essential. In the NFSI model, the connection handshake is tracked, but only for precisely the second and third packet. However, it is not the state change code that tracks the correctness of the handshake, but separate handshake code as described in Section 5.2.1. Since state changes are not tracked, neither are the packets that terminate a connection, the FIN packets. Therefore, all tracked connections time out only after a relatively long timeout. This makes NFSI vulnerable in environments where many connections are made, because the number of tracked connections is limited. This limit is imposed because the state of the connection is stored in kernel space. If many connections are made within the length of the timeout, the result would be that new connections are denied.

Ad 2. The NFSI model neither tracks nor checks the sequence numbers, the acknowledgment numbers or the window size once the connection is established. All these numbers have influence on the state of the connection and are important in narrowing down the window of opportunity for malicious packets.

Improvement Proposals

The current implementation is good and the design is in many ways similar to the SITCP model, but security can be enhanced by adding some functionality. The functionality we propose to add is this:

1. Improve the state tracking code.
2. Improve sequence, acknowledgment, and window checking code.
3. Reduce the timeout on properly closed connections.
4. Reduce the size of the state table.
5. Add the ability to change the maximum number of concurrent connections and the method for maintaining the state table.

Ad 3. Reducing the timeout on closed connections will help free up resources used by the state table, which in turn allows a higher number of connections over time.

Ad 4. The state table should be minimized as much as possible in terms of space usage. This means reducing the size of each entry in the table as much as reasonably possible.

Ad 5. The maximum number of concurrent connections should be configurable; at present it is a fixed size in the code, based on the installed memory of the machine. The state table is maintained, but the user has no way of specifying how it should be maintained. More advanced methods of maintenance should be user configurable.

Our motivation for doing this project was to improve security for a network connected to other networks. More specifically, we wanted to improve stateful inspection firewalls that protect a single network from other networks. Our ambition was to make a framework for modeling network protocols for stateful inspection. We wanted to prove the applicability of the framework by providing a model of TCP for stateful inspection and to apply the framework to an existing implementation to find points to improve.

We have provided a framework to describe network protocols in a formalized way. This framework consists of a definition of stateful inspection and a language to describe network protocols. We believe that this framework can be used to make and improve implementations of firewalls that use stateful inspection. We also believe that we are able to model any kind of protocol for use with stateful inspection. We have used this framework on TCP to make a model of how this protocol should be statefully inspected. We have also used this framework on a current implementation of stateful inspection to derive a model of this implementation. From these models we have found that the implementation can be expanded with some important functionality. This functionality includes better state tracking, more checks on sequence and acknowledgment numbers, and reduction of the timeout on closed connections to avoid exhausting limited resources.

Future work could be to fully design the improvements for stateful inspection in Netfilter, and then to implement and test the proposed improvements. Future work could also be to build a general implementation that would accept any protocol described in our framework, in other words, to model other protocols and include these in the stateful inspection firewall. Another thing that requires further investigation is how to react to attacks on the firewall and whether the firewall should insert packets in the communication to protect a peer.

init_conntrack

These functions are used to insert state entries into the state table of Netfilter and also to remove them in certain situations. They have been taken from lines 471 through 703 of the file ip_conntrack_core.c in the Netfilter source code [T 01].

The first function, init_conntrack, inserts and initializes a new ip_conntrack. However, it is also used to remove unconfirmed connections if no memory is available for allocating the connection. Lines 487 through 502 try to insert the connection. If more connections exist than ip_conntrack_max allows, an attempt is first made to remove a random connection; this is done by incrementing the static variable drop_next on each run-through. drop_next identifies the hash chain from which a connection should be dropped next. A connection is not dropped unconditionally; it is dropped only if it has not been replied to, which is handled in the function early_drop. The rest of the function initializes the separate fields of the struct ip_conntrack. In line 526 the function protocol->new is executed; it initializes the state keeping for TCP or whatever protocol is used above the IP layer.

     /* Allocate a new conntrack: we return -ENOMEM if classification
        failed due to stress.  Otherwise it really is unclassifiable. */
     static struct ip_conntrack_tuple_hash *
     init_conntrack(const struct ip_conntrack_tuple *tuple,
475                 struct ip_conntrack_protocol *protocol,
                    struct sk_buff *skb)
     {
         struct ip_conntrack *conntrack;
         struct ip_conntrack_tuple repl_tuple;
480      size_t hash, repl_hash;
         struct ip_conntrack_expect *expected;
         int i;
         static unsigned int drop_next = 0;

485      hash = hash_conntrack(tuple);

         if (ip_conntrack_max &&
             atomic_read(&ip_conntrack_count) >= ip_conntrack_max) {
             /* Try dropping from random chain, or else from the
490             chain about to put into (in case they're trying to
                bomb one hash chain). */
             if (drop_next >= ip_conntrack_htable_size)
                 drop_next = 0;
             if (!early_drop(&ip_conntrack_hash[drop_next++])
495              && !early_drop(&ip_conntrack_hash[hash])) {
                 if (net_ratelimit())
                     printk(KERN_WARNING
                            "ip_conntrack: table full, dropping"
                            " packet.\n");
500              return ERR_PTR(-ENOMEM);

         if (!conntrack) {
             DEBUGP("Can't allocate conntrack.\n");
             return ERR_PTR(-ENOMEM);
         }
515
         memset(conntrack, 0, sizeof(struct ip_conntrack));
         atomic_set(&conntrack->ct_general.use, 1);
         conntrack->ct_general.destroy = destroy_conntrack;
         conntrack->tuplehash[IP_CT_DIR_ORIGINAL].tuple = *tuple;
520      conntrack->tuplehash[IP_CT_DIR_ORIGINAL].ctrack = conntrack;
         conntrack->tuplehash[IP_CT_DIR_REPLY].tuple = repl_tuple;
         conntrack->tuplehash[IP_CT_DIR_REPLY].ctrack = conntrack;
         for (i=0; i < IP_CT_NUMBER; i++)
             conntrack->infos[i].master = &conntrack->ct_general;
525
         if (!protocol->new(conntrack, skb->nh.iph, skb->len)) {
             kmem_cache_free(ip_conntrack_cachep, conntrack);
             return NULL;
         }
530      /* Don't set timer yet: wait for confirmation */
         init_timer(&conntrack->timeout);
         conntrack->timeout.data = (unsigned long)conntrack;
         conntrack->timeout.function = death_by_timeout;

535      /* Mark clearly that it's not in the hash table. */
         conntrack->tuplehash[IP_CT_DIR_ORIGINAL].list.next = NULL;

         /* Write lock required for deletion of expected.  Without
            this, a read-lock would do. */
540      WRITE_LOCK(&ip_conntrack_lock);
         conntrack->helper = LIST_FIND(&helpers, helper_cmp,
                                       struct ip_conntrack_helper *,
                                       &repl_tuple);
         /* Need finding and deleting of expected ONLY if we win race */
545      expected = LIST_FIND(&expect_list, expect_cmp,
                              struct ip_conntrack_expect *, tuple);
         /* If master is not in hash table yet (ie. packet hasn't left
            this machine yet), how can other end know about expected?
            Hence these are not the droids you are looking for (if
550         master ct never got confirmed, we'd hold a reference to it
            and weird things would happen to future packets). */
         if (expected && is_confirmed(expected->expectant)) {
             /* Welcome, Mr. Bond.  We've been expecting you... */
             conntrack->status = IPS_EXPECTED;
555          conntrack->master.master = &expected->expectant->ct_general;
             IP_NF_ASSERT(conntrack->master.master);
             LIST_DELETE(&expect_list, expected);
             expected->expectant = NULL;
             nf_conntrack_get(&conntrack->master);
560      }
         atomic_inc(&ip_conntrack_count);
         WRITE_UNLOCK(&ip_conntrack_lock);

         if (expected && expected->expectfn)
565          expected->expectfn(conntrack);

match for the ip_conntrack_tuple is sought. If none is found, an attempt is made to start a new connection by calling init_conntrack; if this also fails, NULL or an error value is returned. However, if the connection was found or created, the current info for the connection is set (lines 596 through 619) and the ip_conntrack is returned.

     /* On success, returns conntrack ptr, sets skb->nfct and ctinfo */
570  static inline struct ip_conntrack *
     resolve_normal_ct(struct sk_buff *skb,
                       struct ip_conntrack_protocol *proto,
                       int *set_reply,
                       unsigned int hooknum,
575                    enum ip_conntrack_info *ctinfo)
     {
         struct ip_conntrack_tuple tuple;
         struct ip_conntrack_tuple_hash *h;

580      IP_NF_ASSERT((skb->nh.iph->frag_off & htons(IP_OFFSET)) == 0);

         if (!get_tuple(skb->nh.iph, skb->len, &tuple, proto))
             return NULL;

585      /* look for tuple match */
         h = ip_conntrack_find_get(&tuple, NULL);
         if (!h) {
             h = init_conntrack(&tuple, proto, skb);
             if (!h)
590              return NULL;
             if (IS_ERR(h))
                 return (void *)h;
         }

595      /* It exists; we have (non-exclusive) reference. */
         if (DIRECTION(h) == IP_CT_DIR_REPLY) {
             *ctinfo = IP_CT_ESTABLISHED + IP_CT_IS_REPLY;
             /* Please set reply bit if this packet OK */
             *set_reply = 1;
600      } else {
             /* Once we've had two way comms, always ESTABLISHED. */
             if (h->ctrack->status & IPS_SEEN_REPLY) {
                 DEBUGP("ip_conntrack_in: normal packet for %p\n",
                        h->ctrack);
605              *ctinfo = IP_CT_ESTABLISHED;
             } else if (h->ctrack->status & IPS_EXPECTED) {
                 DEBUGP("ip_conntrack_in: related packet for %p\n",
                        h->ctrack);
                 *ctinfo = IP_CT_RELATED;
610          } else {
                 DEBUGP("ip_conntrack_in: new packet for %p\n",
                        h->ctrack);
                 *ctinfo = IP_CT_NEW;
             }
615          *set_reply = 0;
         }
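The way resolve_normal_ct derives the connection info from the packet direction and the status bits can be tried out in isolation. The following is a minimal sketch, not kernel code: the SKETCH_ names, their numeric values and the function sketch_classify are assumptions made for illustration and only mirror the order of the tests in the listing.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's direction, status and
 * ctinfo constants; the numeric values are chosen for this sketch. */
enum sketch_dir { SKETCH_DIR_ORIGINAL, SKETCH_DIR_REPLY };

enum sketch_status {
    SKETCH_EXPECTED   = 1 << 0,  /* connection was expected by a helper */
    SKETCH_SEEN_REPLY = 1 << 1   /* traffic seen in both directions */
};

enum sketch_ctinfo {
    SKETCH_ESTABLISHED,
    SKETCH_RELATED,
    SKETCH_NEW,
    SKETCH_IS_REPLY              /* added to a base value for reply packets */
};

/* Same order of tests as in the listing: reply direction first, then
 * "seen reply", then "expected", otherwise a new connection. */
static int sketch_classify(enum sketch_dir dir, unsigned int status,
                           bool *set_reply)
{
    if (dir == SKETCH_DIR_REPLY) {
        *set_reply = true;       /* caller marks the reply bit later */
        return SKETCH_ESTABLISHED + SKETCH_IS_REPLY;
    }
    *set_reply = false;
    if (status & SKETCH_SEEN_REPLY)
        return SKETCH_ESTABLISHED;
    if (status & SKETCH_EXPECTED)
        return SKETCH_RELATED;
    return SKETCH_NEW;
}
```

Under these assumptions a packet in the original direction of a connection that has both bits set is classified as established, because the "seen reply" test comes before the "expected" test.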

mand packet executed from the struct proto.

     /* Netfilter hook itself. */
     unsigned int ip_conntrack_in(unsigned int hooknum,
                                  struct sk_buff **pskb,
                                  const struct net_device *in,
625                               const struct net_device *out,
                                  int (*okfn)(struct sk_buff *))
     {
         struct ip_conntrack *ct;
         enum ip_conntrack_info ctinfo;
630      struct ip_conntrack_protocol *proto;
         int set_reply;
         int ret;

         /* FIXME: Do this right please. --RR */
635      (*pskb)->nfcache |= NFC_UNKNOWN;

     /* Doesn't cover locally-generated broadcast, so not worth it. */
     #if 0
         /* Ignore broadcast: no connection. */
640      if ((*pskb)->pkt_type == PACKET_BROADCAST) {
             printk("broadcast packet!\n");
             return NF_ACCEPT;
         } else if (((*pskb)->nh.iph->daddr & htonl(0x000000ff))
                    == htonl(0x000000ff)) {
645          printk("should bcast: %u.%u.%u.%u->%u.%u.%u.%u "
                    "(sk=%p, ptype=%u)\n",
                    NIPQUAD((*pskb)->nh.iph->saddr),
                    NIPQUAD((*pskb)->nh.iph->daddr),
                    (*pskb)->sk, (*pskb)->pkt_type);
650      }
     #endif

         /* Previously seen (loopback)?  Ignore.  Do this before
            fragment check. */
655      if ((*pskb)->nfct)
             return NF_ACCEPT;

         /* Gather fragments. */
         if ((*pskb)->nh.iph->frag_off & htons(IP_MF|IP_OFFSET)) {
660          *pskb = ip_ct_gather_frags(*pskb);
             if (!*pskb)
                 return NF_STOLEN;
         }

665      proto = find_proto((*pskb)->nh.iph->protocol);

         /* It may be an icmp error... */
         if ((*pskb)->nh.iph->protocol == IPPROTO_ICMP
             && icmp_error_track(*pskb, &ctinfo, hooknum))
670          return NF_ACCEPT;

         if (!(ct = resolve_normal_ct(*pskb, proto,&set_reply,hooknum,&ctinfo)))
             /* Not valid part of a connection */

             /* Invalid */
685          nf_conntrack_put((*pskb)->nfct);
             (*pskb)->nfct = NULL;
             return NF_ACCEPT;
         }

690      if (ret != NF_DROP && ct->helper) {
             ret = ct->helper->help((*pskb)->nh.iph, (*pskb)->len,
                                    ct, ctinfo);
             if (ret == -1) {
                 /* Invalid */
695              nf_conntrack_put((*pskb)->nfct);
                 (*pskb)->nfct = NULL;
                 return NF_ACCEPT;
             }
         }
700      if (set_reply)
             set_bit(IPS_SEEN_REPLY_BIT, &ct->status);

         return ret;
     }
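The "Gather fragments" test in ip_conntrack_in masks the frag_off field of the IP header with the more-fragments flag and the 13-bit fragment offset. The following is a minimal user-space sketch of that test. The names SKETCH_IP_MF, SKETCH_IP_OFFSET and sketch_is_fragment are hypothetical; the mask values are assumed to be the usual IPv4 ones (0x2000 for the flag and 0x1FFF for the offset).

```c
#include <stdint.h>
#include <arpa/inet.h>   /* htons() */

/* Assumed IPv4 header constants (host byte order). */
#define SKETCH_IP_MF     0x2000u  /* "more fragments" flag */
#define SKETCH_IP_OFFSET 0x1FFFu  /* 13-bit fragment offset */

/* frag_off_net is the frag_off header field in network byte order.
 * A packet is a fragment if more fragments follow (first or middle
 * fragment) or if its offset is non-zero (middle or last fragment). */
static int sketch_is_fragment(uint16_t frag_off_net)
{
    return (frag_off_net & htons(SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}
```

A packet carrying only the don't-fragment flag (0x4000) is not a fragment under this test, while a last fragment, which has the more-fragments flag cleared, is still caught through its non-zero offset.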

The following source code has been taken from the file ip_conntrack_proto_tcp.c of the Netfilter source code [T 01].

  1  #define NO_VERSION
     #include <linux/types.h>
     #include <linux/sched.h>
     #include <linux/timer.h>
  5  #include <linux/netfilter.h>
     #include <linux/module.h>
     #include <linux/in.h>
     #include <linux/ip.h>
     #include <linux/tcp.h>
 10  #include <linux/netfilter_ipv4/ip_conntrack.h>
     #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
     #include <linux/netfilter_ipv4/lockhelp.h>

     #if 0
 15  #define DEBUGP printk
     #else
     #define DEBUGP(format, args...)
     #endif

 20  /* Protects conntrack->proto.tcp */
     static DECLARE_RWLOCK(tcp_lock);

     /* FIXME: Examine ipfilter's timeouts and conntrack transitions more
        closely.  They're more complex. --RR */
 25
     /* Actually, I believe that neither ipmasq (where this code is stolen
        from) nor ipfilter do it exactly right.  A new conntrack machine taking
        into account packet loss (which creates uncertainty as to exactly
        the conntrack of the connection) is required.  RSN.  --RR */
 30
     static const char *tcp_conntrack_names[] = {
         "NONE",
         "ESTABLISHED",
         "SYN_SENT",
 35      "SYN_RECV",
         "FIN_WAIT",
         "TIME_WAIT",
         "CLOSE",
         "CLOSE_WAIT",
 40      "LAST_ACK",
         "LISTEN"
     };

     #define SECS *HZ
 45  #define MINS * 60 SECS
     #define HOURS * 60 MINS
     #define DAYS * 24 HOURS

 50  static unsigned long tcp_timeouts[]
     = { 30 MINS,    /* TCP_CONNTRACK_NONE, */

     #define sno TCP_CONNTRACK_NONE
     #define ses TCP_CONNTRACK_ESTABLISHED
 65  #define sss TCP_CONNTRACK_SYN_SENT
     #define ssr TCP_CONNTRACK_SYN_RECV
     #define sfw TCP_CONNTRACK_FIN_WAIT
     #define stw TCP_CONNTRACK_TIME_WAIT
     #define scl TCP_CONNTRACK_CLOSE
 70  #define scw TCP_CONNTRACK_CLOSE_WAIT
     #define sla TCP_CONNTRACK_LAST_ACK
     #define sli TCP_CONNTRACK_LISTEN
     #define siv TCP_CONNTRACK_MAX

 75  static enum tcp_conntrack tcp_conntracks[2][5][TCP_CONNTRACK_MAX] = {
         {
     /*      ORIGINAL */
     /*        sno, ses, sss, ssr, sfw, stw, scl, scw, sla, sli */
     /*syn*/  {sss, ses, sss, ssr, sss, sss, sss, sss, sss, sli },
 80  /*fin*/  {stw, sfw, sss, stw, sfw, stw, scl, stw, sla, sli },
     /*ack*/  {ses, ses, sss, ses, sfw, stw, scl, scw, sla, ses },
     /*rst*/  {scl, scl, sss, scl, scl, stw, scl, scl, scl, scl },
     /*none*/ {siv, siv, siv, siv, siv, siv, siv, siv, siv, siv }
         },
 85      {
     /*      REPLY */
     /*        sno, ses, sss, ssr, sfw, stw, scl, scw, sla, sli */
     /*syn*/  {ssr, ses, ssr, ssr, ssr, ssr, ssr, ssr, ssr, ssr },
     /*fin*/  {scl, scw, sss, stw, stw, stw, scl, scw, sla, sli },
 90  /*ack*/  {scl, ses, sss, ssr, sfw, stw, scl, scw, scl, sli },
     /*rst*/  {scl, scl, scl, scl, scl, scl, scl, scl, sla, sli },
     /*none*/ {siv, siv, siv, siv, siv, siv, siv, siv, siv, siv }
         }
     };
 95
     static int tcp_pkt_to_tuple(const void *datah, size_t datalen,
                                 struct ip_conntrack_tuple *tuple)
     {
         const struct tcphdr *hdr = datah;
100
         tuple->src.u.tcp.port = hdr->source;
         tuple->dst.u.tcp.port = hdr->dest;

         return 1;
105  }

     static int tcp_invert_tuple(struct ip_conntrack_tuple *tuple,
                                 const struct ip_conntrack_tuple *orig)
     {
110      tuple->src.u.tcp.port = orig->dst.u.tcp.port;
         tuple->dst.u.tcp.port = orig->src.u.tcp.port;
         return 1;
     }

115  /* Print out the per-protocol part of the tuple. */
     static unsigned int tcp_print_tuple(char *buffer,

     {
         enum tcp_conntrack state;

130      READ_LOCK(&tcp_lock);
         state = conntrack->proto.tcp.state;
         READ_UNLOCK(&tcp_lock);

         return sprintf(buffer, "%s ", tcp_conntrack_names[state]);
135  }

     static unsigned int get_conntrack_index(const struct tcphdr *tcph)
     {
         if (tcph->rst) return 3;
140      else if (tcph->syn) return 0;
         else if (tcph->fin) return 1;
         else if (tcph->ack) return 2;
         else return 4;
     }
145
     /* Returns verdict for packet, or -1 for invalid. */
     static int tcp_packet(struct ip_conntrack *conntrack,
                           struct iphdr *iph, size_t len,
                           enum ip_conntrack_info ctinfo)
150  {
         enum tcp_conntrack newconntrack, oldtcpstate;
         struct tcphdr *tcph = (struct tcphdr *)((u_int32_t *)iph + iph->ihl);

         /* We're guaranteed to have the base header, but maybe not the
155         options. */
         if (len < (iph->ihl + tcph->doff) * 4) {
             DEBUGP("ip_conntrack_tcp: Truncated packet.\n");
             return -1;
         }
160
         WRITE_LOCK(&tcp_lock);
         oldtcpstate = conntrack->proto.tcp.state;
         newconntrack
             = tcp_conntracks
165          [CTINFO2DIR(ctinfo)]
             [get_conntrack_index(tcph)][oldtcpstate];

         /* Invalid */
         if (newconntrack == TCP_CONNTRACK_MAX) {
170          DEBUGP("ip_conntrack_tcp: Invalid dir=%i index=%u conntrack=%u\n",
                    CTINFO2DIR(ctinfo), get_conntrack_index(tcph),
                    conntrack->proto.tcp.state);
             WRITE_UNLOCK(&tcp_lock);
             return -1;
175      }

         conntrack->proto.tcp.state = newconntrack;

         /* Poor man's window tracking: record SYN/ACK for handshake check */
180      if (oldtcpstate == TCP_CONNTRACK_SYN_SENT
             && CTINFO2DIR(ctinfo) == IP_CT_DIR_REPLY

             if (del_timer(&conntrack->timeout))
                 conntrack->timeout.function((unsigned long)conntrack);
         } else {
195          /* Set ASSURED if we see see valid ack in ESTABLISHED after SYN_RECV */
             if (oldtcpstate == TCP_CONNTRACK_SYN_RECV
                 && CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL
                 && tcph->ack && !tcph->syn
                 && tcph->ack_seq == conntrack->proto.tcp.handshake_ack)
200              set_bit(IPS_ASSURED_BIT, &conntrack->status);

             ip_ct_refresh(conntrack, tcp_timeouts[newconntrack]);
         }

205      return NF_ACCEPT;
     }

     /* Called when a new connection for this protocol found. */
     static int tcp_new(struct ip_conntrack *conntrack,
210                     struct iphdr *iph, size_t len)
     {
         enum tcp_conntrack newconntrack;
         struct tcphdr *tcph = (struct tcphdr *)((u_int32_t *)iph + iph->ihl);

215      /* Don't need lock here: this conntrack not in circulation yet */
         newconntrack
             = tcp_conntracks[0][get_conntrack_index(tcph)]
             [TCP_CONNTRACK_NONE];

220      /* Invalid: delete conntrack */
         if (newconntrack == TCP_CONNTRACK_MAX) {
             DEBUGP("ip_conntrack_tcp: invalid new deleting.\n");
             return 0;
         }
225
         conntrack->proto.tcp.state = newconntrack;
         return 1;
     }

230  struct ip_conntrack_protocol ip_conntrack_protocol_tcp
     = { { NULL, NULL }, IPPROTO_TCP, "tcp",
         tcp_pkt_to_tuple, tcp_invert_tuple, tcp_print_tuple, tcp_print_conntrack,
         tcp_packet, tcp_new, NULL };
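The table lookup performed by tcp_packet and tcp_new can be followed by hand for a three-way handshake. The following is a minimal user-space sketch under stated assumptions: the transition table and the flag priority (RST, then SYN, FIN, ACK) are copied from the listing above, while the enum tcp_state, struct flags and the functions flag_index and step are hypothetical scaffolding added for illustration. Locking, timeouts and the handshake_ack check are left out.

```c
/* States in the column order of the listing's table; siv marks an
 * invalid transition (TCP_CONNTRACK_MAX in the listing). */
enum tcp_state { sno, ses, sss, ssr, sfw, stw, scl, scw, sla, sli, siv };

enum { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

/* Hypothetical stand-in for the flag bits of struct tcphdr. */
struct flags { unsigned syn:1, fin:1, ack:1, rst:1; };

static const enum tcp_state transitions[2][5][siv] = {
    { /* ORIGINAL */
    /*        sno, ses, sss, ssr, sfw, stw, scl, scw, sla, sli */
    /*syn*/  {sss, ses, sss, ssr, sss, sss, sss, sss, sss, sli},
    /*fin*/  {stw, sfw, sss, stw, sfw, stw, scl, stw, sla, sli},
    /*ack*/  {ses, ses, sss, ses, sfw, stw, scl, scw, sla, ses},
    /*rst*/  {scl, scl, sss, scl, scl, stw, scl, scl, scl, scl},
    /*none*/ {siv, siv, siv, siv, siv, siv, siv, siv, siv, siv}
    },
    { /* REPLY */
    /*syn*/  {ssr, ses, ssr, ssr, ssr, ssr, ssr, ssr, ssr, ssr},
    /*fin*/  {scl, scw, sss, stw, stw, stw, scl, scw, sla, sli},
    /*ack*/  {scl, ses, sss, ssr, sfw, stw, scl, scw, scl, sli},
    /*rst*/  {scl, scl, scl, scl, scl, scl, scl, scl, sla, sli},
    /*none*/ {siv, siv, siv, siv, siv, siv, siv, siv, siv, siv}
    }
};

/* Row index, with the same priority as get_conntrack_index. */
static unsigned int flag_index(struct flags f)
{
    if (f.rst) return 3;
    else if (f.syn) return 0;
    else if (f.fin) return 1;
    else if (f.ack) return 2;
    else return 4;
}

/* One transition: the new state, or siv if the packet is invalid. */
static enum tcp_state step(enum tcp_state old, int dir, struct flags f)
{
    return transitions[dir][flag_index(f)][old];
}
```

Starting from sno, a SYN in the original direction leads to sss, the SYN/ACK in the reply direction to ssr (the SYN bit takes priority over the ACK bit), and the final ACK in the original direction to ses. A packet with none of the four flags set maps to siv, which the listing treats as invalid.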

[AD90] Rajeev Alur and David Dill. Automata for modeling real-time systems. pages 322-335, 1990.
[Dro93] R. Droms. Dynamic host configuration protocol. Technical Report RFC 1541, October 1993.
[Mil80] R. (Robin) Milner. A calculus of communicating systems, volume 92. Springer-Verlag Inc., New York, NY, USA, 1980.
[MIM01] MIMEsweeper. Mimesweeper, October 2001. http://www.mimesweeper.com.
[Min67] Marvin Lee Minsky. Computation: Finite and Infinite Machines. Prentice-Hall Inc., Englewood Cliffs, New Jersey, 1967.
[NEC01] NEC Corporation. Socks, October 2001. http://www.socks.nec.com.
[Pos80] Jon B. Postel. User datagram protocol. Technical Report RFC 768, August 1980.
[Pos81] Jon B. Postel. Transmission Control Protocol. Technical Report RFC 793, SRI International, 1981.
[Pos85] Jon B. Postel. File Transfer Protocol. Technical Report RFC 959, SRI International, 1985.
[RAD01] RAD Data Communications. The transport layer, October 2001. http://www.rad.com/networks/1994/osi/transp.htm.
[Roo00] Guido van Rooij. Real Stateful TCP Packet Filtering in IP Filter. 2nd International SANE Conference, March 2000.
[Rus01] Paul "Rusty" Russell. Linux netfilter Hacking HOWTO. October 2001. http://netfilter.samba.org/unreliable-guides/netfilter-hacking-howto/index.html.
[Sip96] Michael Sipser. Introduction to the Theory of Computation. PWS, 1996.
[Ste94] W. Richard Stevens. TCP/IP Illustrated, Volume 1. Addison-Wesley Publishing Company, One Jacob Way, Reading, Massachusetts, first edition, 1994.
[SW95] W. Richard Stevens and Gary R. Wright. TCP/IP Illustrated, Volume 2: The Implementation. Addison-Wesley Publishing Company, One Jacob Way, Reading, Massachusetts, 1995.
[T 01] Linus Torvalds et al. Linux kernel 2.4 source code, 2001. http://www.kernel.org/.
