Network Design and Implementation of Proxies



Similar documents
IP - The Internet Protocol

Network Simulation Traffic, Paths and Impairment

Ethernet. Ethernet. Network Devices

Proxy Server, Network Address Translator, Firewall. Proxy Server

High Performance VPN Solutions Over Satellite Networks

Transport Layer Protocols

VLAN und MPLS, Firewall und NAT,

High-Performance IP Service Node with Layer 4 to 7 Packet Processing Features

Computer Network. Interconnected collection of autonomous computers that are able to exchange information

Network Address Translation (NAT)

This topic lists the key mechanisms use to implement QoS in an IP network.

Frequently Asked Questions

Protocols and Architecture. Protocol Architecture.

Network Layer. Introduction Datagrams and Virtual Circuits Routing Traffic Control. Data delivery from source to destination.

Firewalls and VPNs. Principles of Information Security, 5th Edition 1

We will give some overview of firewalls. Figure 1 explains the position of a firewall. Figure 1: A Firewall

Optimizing TCP Forwarding

TCP Offload Engines. As network interconnect speeds advance to Gigabit. Introduction to

QoS Parameters. Quality of Service in the Internet. Traffic Shaping: Congestion Control. Keeping the QoS

Considerations In Developing Firewall Selection Criteria. Adeptech Systems, Inc.

INTERNET SECURITY: THE ROLE OF FIREWALL SYSTEM

Operating Systems Design 16. Networking: Sockets

Dissertation Title: SOCKS5-based Firewall Support For UDP-based Application. Author: Fung, King Pong

OpenFlow Based Load Balancing

UPPER LAYER SWITCHING

Understanding TCP/IP. Introduction. What is an Architectural Model? APPENDIX

Firewall Introduction Several Types of Firewall. Cisco PIX Firewall

Lehrstuhl für Informatik 4 Kommunikation und verteilte Systeme. Firewall

ICOM : Computer Networks Chapter 6: The Transport Layer. By Dr Yi Qian Department of Electronic and Computer Engineering Fall 2006 UPRM

Communications and Computer Networks

Quality of Service in the Internet. QoS Parameters. Keeping the QoS. Traffic Shaping: Leaky Bucket Algorithm

Firewall Design Principles

PART OF THE PICTURE: The TCP/IP Communications Architecture

Module 1. Introduction. Version 2 CSE IIT, Kharagpur

CS 78 Computer Networks. Internet Protocol (IP) our focus. The Network Layer. Interplay between routing and forwarding

Network Defense Tools

Access Control: Firewalls (1)

INTRODUCTION TO FIREWALL SECURITY

Encapsulating Voice in IP Packets

Internet Firewall CSIS Packet Filtering. Internet Firewall. Examples. Spring 2011 CSIS net15 1. Routers can implement packet filtering

Improving DNS performance using Stateless TCP in FreeBSD 9

12. Firewalls Content

Distributed Systems 3. Network Quality of Service (QoS)

Internet Protocol: IP packet headers. vendredi 18 octobre 13

Wedge Networks: Transparent Service Insertion in SDNs Using OpenFlow

Fig : Packet Filtering

A host-based firewall can be used in addition to a network-based firewall to provide multiple layers of protection.

IMPLEMENTATION OF INTELLIGENT FIREWALL TO CHECK INTERNET HACKERS THREAT

hp ProLiant network adapter teaming

Stateful Inspection Technology

VOICE OVER IP AND NETWORK CONVERGENCE

GlobalSCAPE DMZ Gateway, v1. User Guide

Cisco Integrated Services Routers Performance Overview

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago

Cisco PIX vs. Checkpoint Firewall

Guide to TCP/IP, Third Edition. Chapter 3: Data Link and Network Layer TCP/IP Protocols

NAT TCP SIP ALG Support

Per-Flow Queuing Allot's Approach to Bandwidth Management

A Brief Overview of VoIP Security. By John McCarron. Voice of Internet Protocol is the next generation telecommunications method.

What is CSG150 about? Fundamentals of Computer Networking. Course Outline. Lecture 1 Outline. Guevara Noubir noubir@ccs.neu.

IP address format: Dotted decimal notation:

CSET 4750 Computer Networks and Data Communications (4 semester credit hours) CSET Required IT Required

Protecting and controlling Virtual LANs by Linux router-firewall

Network Programming TDC 561

SE 4C03 Winter 2005 Firewall Design Principles. By: Kirk Crane

Quality of Service Analysis of site to site for IPSec VPNs for realtime multimedia traffic.

A Transport Protocol for Multimedia Wireless Sensor Networks

Chapter 3. Internet Applications and Network Programming

Improving Quality of Service

Overview. Securing TCP/IP. Introduction to TCP/IP (cont d) Introduction to TCP/IP

Chapter 3. TCP/IP Networks. 3.1 Internet Protocol version 4 (IPv4)

CHAPTER 3 STATIC ROUTING

Content Distribution Networks (CDN)

Linux Network Security

Switch Fabric Implementation Using Shared Memory

Names & Addresses. Names & Addresses. Hop-by-Hop Packet Forwarding. Longest-Prefix-Match Forwarding. Longest-Prefix-Match Forwarding

Chapter 9 Firewalls and Intrusion Prevention Systems

20-CS X Network Security Spring, An Introduction To. Network Security. Week 1. January 7

Objectives of Lecture. Network Architecture. Protocols. Contents

MANAGING NETWORK COMPONENTS USING SNMP

Network Security TCP/IP Refresher

PROTECTING INFORMATION SYSTEMS WITH FIREWALLS: REVISED GUIDELINES ON FIREWALL TECHNOLOGIES AND POLICIES

Network Management and Monitoring Software

VMWARE WHITE PAPER 1

21.4 Network Address Translation (NAT) NAT concept

CMPT 471 Networking II

Lecture 23: Firewalls

What is Firewall? A system designed to prevent unauthorized access to or from a private network.

Understanding Latency in IP Telephony

First Workshop on Open Source and Internet Technology for Scientific Environment: with case studies from Environmental Monitoring

SFWR ENG 4C03 Class Project Firewall Design Principals Arash Kamyab March 04, 2004

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Proxies. Chapter 4. Network & Security Gildas Avoine

Overview of Computer Networks

Transport and Network Layer

Deploying Firewalls Throughout Your Organization

Smart Queue Scheduling for QoS Spring 2001 Final Report

Transcription:

Advanced Security Proxies: An Architecture and Implementation for High- Performance Network Firewalls Roger Knobbe, Andrew Purtell, Stephen Schwab {rknobbe, apurtell, sschwab}@nai.com TIS Labs at Network Associates 3415 S. Sepulveda Blvd., Suite 700 Los Angeles, CA 90034 Abstract -- The TIS Labs Advanced Security Proxies (ASP) project is investigating software architectures for highperformance firewalls to enable the secure use of next generation networks. The project objective is to demonstrate an architecture and implementation in which protocol-specific proxies control when data transmission is allowed across the firewall, but which allows the proxy a range of options in determining how that data transits the firewall. By employing proxies that selectively use a range of lower-level protocol stack features, this novel architecture provides higher performance and greater flexibility in determining exactly what information the proxies examine. These decisions are made at the granularity of each proxied connection. We describe the firewall design and implementation and report preliminary experimental results using Fast Ethernet. I. INTRODUCTION The Advanced Security Proxies (ASP) project at TIS Labs is experimenting with novel firewall architectures and implementations suitable for use in high-performance network environments. Firewall implementations have traditionally been classified as either packet filtering firewalls, in which individual packets are examined and permitted or denied according to policy, or proxy firewalls. In a proxy firewall, the client or process initiating the connection contacts an intermediary, known as a proxy, running on the firewall. After establishing a connection with the proxy, and some additional authentication is performed between the client and the proxy, a second connection is established from the proxy to the server or other remote process listening for a network connection request. Thereafter, the client and server both carry on their normal activities, including sending and receiving application-level protocol messages. However, the messages are relayed through the proxy firewall, which performs all lower-level / protocol handling, and delivers the data to the application proxy. The application proxy checks the message, to prevent illegal activity or possible attack, and then forwards the message, once again requiring the firewall to perform all the / processing involved in transmitting data. While proxies allow for the implementation of high assurance firewalls, they have often required that all data flow up and down the protocol stacks, even though much of the data was simply being copied from one side of the firewall to the other side. As network speeds have increased, protocol stacks on general purpose operating systems have not been sufficiently fast to keep pace. As a result, proxy firewalls had become speed bumps on the network. One way of mitigating these performance limitations has been to direct some classes of network traffic through packet filtering software on the firewall hosts. Under proxy control, the packet filters can be programmed to permit packets belonging to approved connections to pass through the firewall. At the completion of a proxied connection, the packet filter rules are dynamically removed, plugging the hole that is temporarily opened to carry the traffic. In order to support higher speed network technologies with proxy firewalls, we are studying a variety of methods by which a proxy can control the lower-layers of the network hardware or software in order to optimize the flow of data across the firewall. Our research objective is to demonstrate an approach that can keep pace with increasing network speeds without resorting to expensive, overly specialized custom hardware solutions. Our current implementation goal is focused on reaching OC- 12 ATM speeds of 622 Mbits/second, which represent a significantly higher performance network than is usually protected using a proxy firewall. Our initial implementation work has utilized 100 Mbit/s FastEthernet, with OC-12 ATM experimentation on the roadmap. The rest of this paper continues with an overview of the architecture of the ASP firewall. This is followed by sections describing the implementation of the firewall using the Scout operating system, and experimental measurements reporting on the performance of the current prototype. The paper concludes with a review of related work, an outline of our future work and conclusions. This work was sponsored by the Defense Advanced Research Projects Agency and the Department of Defense and managed by USAF Rome Laboratory under contract number F30602-97-C-0336.

II. ARCHITECTURE The ASP firewall architecture is designed to provide multiple fast datapaths across the firewall between the protected and unprotected networks. These network services, accessed through well-defined interfaces, are utilized by ASP firewall proxies to accelerate the traffic transiting the network boundary. Because each proxy retains full control over when traffic is switched to a fast datapath, security functions can be applied to the specific application protocol messages that must be inspected and checked against the security policy enforced by the proxy. For example, an FTP proxy might check that each FTP command including a file pathname uses a bounded length to prevent a buffer overrun attack. In contrast, network traffic which is not subject to examination by the proxy may be directed over a fast datapath, reducing the delay experienced by each individual packet, and reducing the overall load on the firewall host system. For connections, this will often allow greater throughput, while the firewall as a whole will expend more of its computational resources on performing useful proxy activity rather than just copying data bytes from one network protocol stack endpoint to another endpoint. The architecture envisions multiple fast datapaths, corresponding to distinct implementation layers and degree of processing of network traffic. In an ATM environment, cells on switched virtual circuits (SVCs) might be the unit of traffic to be forwarded by the network plumbing. In more general environments, the datagram, with or without defragmentation support, can be forwarded. A connection forwarder would deliver connection establishment and termination notifications to a proxy while immediately forwarding packets on an established connection. Finally, the approach is not limited to connection-oriented traffic. Other classes of traffic including UDP based multimedia protocols or media level signaling protocols (such as ATM UNI or ILMI) can also be handled within this framework. For a number of reasons, the ASP firewall prototype described below is a software-only implementation of the architecture. In addition, we have focused exclusively on -oriented connection traffic and -based fast datapaths as the former are the most widely deployed proxy firewalls, and the latter are the most generically supported WAN networks. However, the architecture itself is applicable to a wider range of network technologies and hardware implementations. After describing our implementation and experiments, utilizing the Scout extensible operating system, we also identify requirements for hardware support for the architecture in future work. III. IMPLEMENTATION Two separate fast datapath facilities have been implemented in the prototype ASP firewall. Known as CUT and NETTAP, both implementations are designed to accelerate the handling of packets traversing the firewall. The difference between the two is that CUT is designed to carry traffic belonging to a connection that has not been established prior to the proxy issuing commands to CUT, while NETTAP is designed to handle traffic belonging to a previously extant connection. In particular NETTAP can switch traffic from the currently proxied connection onto the fast datapath. To evaluate the performance of the firewall acceleration facilities, FTP and HTTP proxies have been implemented that utilize CUT and NETTAP respectively. Both proxies and fast datapath mechanisms have been implemented in the Scout extensible operating system. A. Scout Operating System The Scout O.S. [1], developed at the University of Arizona, is a modular O.S. directed toward communication-oriented applications. These included thin-client Java stations, file- and web-servers, QoS routers and firewalls. The key abstraction of Scout is the path, which replaces the process as the execution-time entity created to service a user request. Unlike a process, a path is explicitly tied to one or more I/O devices. Its purpose is to handle incoming messages across the set of composable modules that comprise the path. The Scout infrastructure provides general facilities for manipulating messages; scheduling threads associated with each path; creating, managing and destroying paths; and demultiplexing messages from an input device to a path capable of processing it. Paths are composed of sequences of modules (confusingly called routers by the Scout developers) that process messages as they traverse the path. Many of the modules are designed to process one layer of a network protocol stack, and there are modules for,, ARP and Ethernet devices (), as well as specific network devices. A Scout kernel is configured by specifying all the modules that will be present in the system, and the router (module) graph indicating which modules have connecting edges. Each module has a number of interfaces, drawn from a small number of types defined by the Scout infrastructure. Interfaces may only be connected to each other if the are of the same type, reducing, though not eliminating, configuration errors. The workhorse type in the system is NET, corresponding to an abstract network interface over which a message between two layers in a protocol stack may be passed. The usual convention for naming protocol stack interfaces is to refer to them by orientation such as UP or DOWN. Thus the module s DOWN:NET interface

is connected to the module s UP:NET interface in an ordinary protocol stack. Fig. 1 illustrates a firewall configuration with two protocol stacks. At boot time, each router is initialized, with addresses or routing entries. The Proxy calls pathcreate creating a path from the Proxy to. This passive path waits for a connection request. A software demultiplexing mechanism delivers each incoming packet to the path responsible for processing it. Demultiplexing is accomplished using a distributed approach in which each module attempts to match the fields in its own network layer with one or more possible paths. In the case of, if no established connection exists, the demultiplexing search would first match the incoming packet s address with the local host, then the packet type () and port number with the passive path previously created. In response to the connection request, pathcreate is called to construct an active path. Each module along the path initializes perstage state used by that module in processing packets. control block variables specifying sequence and acknowledgement numbers are stored here, along with congestion control windows and retransmission timers. After the connection closes, a pathdestroy operation frees the stages associated with each module, and also removes the demultiplexing entries for that path. B. Optimized Firewall Datapaths Following earlier proxy firewall optimization work, we implemented a pair of identical / protocol stacks with one or two modules at the top, inspecting and shuttling application messages from one side of the network to the other. Without any datapath optimization modules, the system performs all the protocol stack processing associated with any other proxy firewall. However, since Scout is a single address space system, data is passed via reference wherever possible. The message handling library takes care to avoid extra byte Proxy Path Command Path Optimized Path F ARP CUT PROXY Fig. 1. Configuration of ASP Firewall with CUT ARP copying as headers are removed during input processing and latter added during output processing. The modules described below provide fast forwarding datapaths. CUT, illustrated in Fig. 1, provides an optimized datapath by connecting via its DOWN interface to two different modules. In addition, a command interface is provided and used by the FTP proxy to enable the flow of packets over the fast path. CUT connects to at the same point where a module would connect. An CUT command includes parameters describing a connection, consisting of the source and destination addresses and port numbers. Optional arguments may include the selection of different initial sequence numbers on each side of the firewall connection. In response to the command, CUT creates a path in two steps. First, pathcreate is called to build a path from CUT down one protocol stack, through and modules. CUT uses a hook in the module to add this path to the demultiplexing map. From the Scout point-of-view, CUT represents an optimized form of. More importantly, the demultiplexing tree implementation is deterministic, and the module is configured at compile time to demultiplex all packets with protocol field equal to 6, the standard value for. Once the first half of the path is established, a pathextend call is made to repeat the process on the other protocol stack connected to the opposite network interface. PathExtend is functionally equivalent to pathcreate, except the starting point is the existing path, and new stages are added in turn for each module of the new protocol stack. Once the optimized CUT path is established between the two networks, the FTP proxy proceeds to forward the application level message which triggers the file transfer. FTP uses a separate connection for data, which means the original control connection does not need to be optimized. It remains quiescent while the file data moves over the CUT path. After the operation completes, the CUT module observes the connection teardown by recording the packets with FIN flags, and the respective acknowledgements of these FIN packets by each side. At that point, CUT tears down the optimized datapath using pathdestroy, and the proxy is notified of the change. NETTAP, illustrated in Fig. 2, provides applicationlevel proxies with several additional approaches to establishing and managing fast datapaths. The NETTAP module is inserted directly between the and modules, in a position where it can intercept and redirect the traffic under proxy control. When using NETTAP, a typical application will first establish two connections and will initially fully proxy data between these connections. Our HTTP proxy currently makes use of the NETTAP module. HTTP traffic is collected and examined until a complete HTTP GET request has been received. The simple proxy implements a policy that

filters out URLs, protecting some private URLs on the server, while allowing the rest to be accessed across the firewall. Other policies are easily implemented. Once the proxy determines that the GET request is allowed, it forwards it to the web server, and issues a NETTAP JOIN command, resulting in the HTTP response being carried over an optimized datapath. Unlike the CUT module, the NETTAP module switches the same connection used to carry the HTTP traffic onto the fast datapath. NETTAP affords the proxying application with a range of cost-benefit tradeoffs by operating in one of three selectable modes. The first, JOIN, optimizes host and application processing out of the path completely, in an analogous manner to CUT. Once the application has switched a connection to this mode, it cannot receive data on, send data to, or control either connection on the path. The second, SNOOP, establishes a fast datapath but also collects and queues payload data, which is delivered to the application as available CPU cycles permit. In this mode, the application may monitor traffic and, if needed, terminate both connections, but cannot guarantee that some hostile traffic has not already been forwarded before the moment it is detected. The third mode, INSPECT, addresses this issue by queuing and delaying the forwarding of traffic until the application has signaled that the collected data may be sent. Of these modes, JOIN has the least impact on system resources, and INSPECT the greatest. Both CUT and NETTAP perform many of the same low-level operations on packets. addresses and port numbers may be changed to facilitate network address translation and port hiding. In addition, the sequence and acknowledgement fields of the header must be translated between the two halves of the network connection. numbers the bytes transmitted from sender to receiver in order to provide reliable service. Acknowledgements in the header of packets traveling in the opposite direction indicate the greatest Control Operations Proxy Path Optimized Path NETTAP PROXY NETTAP Fig. 2. Configuration of ASP Firewall with NETTAP Maximum Throughput (megabytes per second) 100.00 10.00 1.00 4k 8k 16k 32k 64k 128k File Size (bytes) proxy packet filter no firewall ASP Fig. 3. FTP Firewall Performance sequence number received, and confirm delivery of data to the host. Sequence numbers do not start from any fixed initial value. Instead, each host picks an initial sequence number. In order for a fast datapath to be created, the firewall must modify the sequence and acknowledgement fields by adding or subtracting a constant value corresponding to the difference in the initial sequence numbers. Finally, the and checksum fields must be updated to reflect any changes made to the headers. In our implementation, we use an efficient differential checksum algorithm. Instead of examining all the data in the segment to re-compute a fresh checksum, the original checksum is used as a starting point, and is modified based on the old and new values of the header fields that change. IV. EXPERIMENTS Initial measurements of firewall performance have been collected using a testbed consisting of 450 Mhz Pentium II PCs with 128 Mbytes of RAM and 100 Mbit/second full-duplex FastEthernet interfaces. A client and server machine are each connected via switches to a firewall machine with two separate network interfaces. Fig. 3 compares the performance of the ASP FTP proxy using CUT against a proxy and packet filter firewall. The peak CPU utilization of the ASP firewall was 30%. Fig. 4 compares the performance of the ASP HTTP proxy using NETTAP JOIN mode against a proxy and packet filtering firewall. For large transfer sizes, the ASP firewall provides nearly the same throughput as the control case with no firewall between the client and web server. The peak CPU utilization of this configuration was 28%, in contrast to 61% with the NETTAP SNOOP mode. Smaller transfer sizes result in performance similar to traditional proxies, as the costs of connection setup and inspecting the HTTP GET messages dominate overall performance in this regime. 256k 512k 1024k

Maximum Throughput (megabits per second) 100 10 1 4k 8k 16k V. RELATED WORK 32k 64k 128k proxy packet filter no firewall ASP 256k 512k File Size (Bytes) Fig. 4. HTTP Firewall Performance Several other researchers have investigated ways of improving firewall proxy performance by optimizing the underlying network protocol stacks. Spatscheck et. al. [1] describes a connection optimization for Scout paths and a firewall proxy that utilizes it. Maltz and Bhagwat [2] have implemented similar splicing mechanisms in a Unix kernel to support firewall and web caching applications. Fall [3] investigated more general mechanisms for optimizing data copying between both files and network sockets, and introduced a Unix interface to control optimization. VI. FUTURE DIRECTION AND CONCLUSION There are many additional features we would like to implement in our ASP firewall prototype. While the current Scout system required a simple change to the module to allow CUT to insert paths into the demultiplexing map, a more sophisticated Scout infrastructure would allow CUT to lookup and access the demultiplexing map dynamically. This would remove a dependency between the CUT module on low-level internals of the module. Also, while the path operations are conceptually very simple, the actual interface in Scout requires developers to directly manipulate the pointers linking the various stages of the path together. This code is difficult to debug, extremely dense, and highly prone to coding errors resulting in deformed paths. An enhanced version of the system should provide a higher-level interface to aid developers in the construction and management of paths. An additional NETTAP command, SPLIT, is currently in development, which will resume the full proxying of a connection by switching the traffic away from the fast datapath and back to the two distinct / protocol stacks below the proxy. SPLIT will provide the ability for proxies to periodically re-inspect a connection. A security policy might require re-examination of a 1024k connection after a fixed time period, or after a specified number of bytes or packets had traversed the fast path. Once the ASP architecture is well established, it should be feasible to implement key pieces in hardware, allowing proxy firewalls to scale with increasing network bandwidth. Many hardware routing devices include the capability of updating the checksum, but do not have the capability to update the checksum. This is probably due to the fact that all routers must decrement the time-to-live field, requiring changes to the checksum, but do not change any fields. One clearly identified hardware requirement to support optimized datapaths in firewalls is a method for differential or constant time checksum recalculation. The demultiplexing of packets to different destinations involves the inspection of both and fields. The hardware requirements for firewalls vs. routing are not dramatically different, except that flags are also useful as part of the demultiplexing rules. However, the rate at which new connections are established and destroyed may be much greater than the rate at which routing table changes propagate to forwarding engines in routers. Firewalls require an interface to the demultiplexing mechanisms that supports many more changes per second, and with small, bounded latency. Finally, firewalls need to track and record connection termination. A simple state machine for recording the sequence number of packets with the FIN flag, and generating a notification when the corresponding acknowledgement number is seen would also be a requirement for implementing all of the optimized network plumbing in hardware. As more proxies are implemented that utilize optimize datapaths, the configuration of firewall security policies becomes more complicated. More investigation of the various ways in which security policies exploit dynamic firewall behavior to improve performance without compromising network boundary integrity is needed. ACKNOWLEDGEMENTS John Sebes conceived of the original ASP project goals and architecture, and contributed to its success. Hilarie Orman suggested the use of Scout as a firewall platform. REFERENCES [1] D. Mosberger and L. Peterson. Making paths explicit in the Scout operating system. In Proceedings of Second USENIX Symposium on Operating Systems Design and Implementation, October 1996, Seattle, Washington, pages 153-167. [2] M.K. Ranum and F.M. Avolio. A toolkit and methods for Internet firewalls. In Proceedings of the Summer 1994 USENIX Conference, Boston, Massachusetts, pages 37-44. [3] D. Maltz and P. Bhagwat. Splicing for Application Layer Proxy Performance. IBM Research Report RC 21139, March 17, 1998. [4] B. Cheswick and S. Bellovin. Firewalls and Internet Security. Reading, MA, Addison-Wesley, 1994. [5] K. Fall, A Peer-to-Peer I/O System in Support of I/O Intensive Worloads, Ph.D. Thesis, 1993, Department of Computer Science, U.C. San Diego. [6] O. Spatscheck, J. Hansen, J. Hartman and L. Peterson. Optimizing Forwarder Performance, Technical Report 98-01, Department of Computer Science, The University of Arizona, February 9, 1998.