
Maximizing the Availability of Distributed Software Services

by

Peter Clutterbuck
Bachelor of Science (Maths and Computing), CQU Australia, 1990
Graduate Diploma in Information Technology, CQU Australia, 1993
Master of Science (Computing), QUT Australia, 1996

Thesis submitted in accordance with the regulations for the Degree of Doctor of Philosophy

Information Security Research Centre
Faculty of Information Technology
Queensland University of Technology

2 November, 2005

Keywords

Availability, denial of service, wait time, replication, cluster, distributed service, trust, authentication, redirection, dispatching, scheduling, filtering, option.

Abstract

In a commercial Internet environment, the quality of service experienced by a user is critical to competitive advantage and business survivability. The availability and response time of a distributed software service are central components of the overall quality of service provided to users. Traditionally, availability is a measure of service down time: it measures the probability that the service will be live, and is expressed in terms of failure occurrence and repair or recovery time. Response time is a measure of the time taken from when the service request is made to when service provision occurs for the user. Deteriorating response time is also a valuable indicator of denial of service attacks, which continue to pose a significant threat to service availability. The concept of the service cluster is increasingly being deployed to improve service availability and response time. Cluster processor replication increases service availability. Cluster dispatching of service requests across the replicated cluster processors increases service scalability and therefore improves response time.

This thesis commences with a review of the research and current technology in the area of distributed software service availability. The review aims to identify any deficiencies within that area and propose critical features that mitigate those deficiencies. The three critical features proposed relate to user wait time, cluster dispatching, and the trust-based filtering of service requests. The user wait time proposal is that the availability of a distributed service should reflect both the liveness probability level and the probabilistic user access time of the service. The cluster dispatching proposal is that dispatching processing overhead is a function of the number of Internet Protocol (IP) datagrams/Transmission Control Protocol (TCP) segments that are received by the dispatcher in respect of each service request. Consequently the number of IP datagrams/TCP segments should be minimised, ideally so that for each incoming service request there is one IP datagram/TCP segment. The trust-based filtering proposal is that the level of trust in respect of each service request should be identified by the service, as this is critical in mitigating distributed denial of service attacks and therefore maximising the availability of the service.

A conceptual availability model which supports the three critical features within an Internet clustered service environment is then described. The conceptual model proposes an expanded availability definition and then describes the realization of this definition via additional capabilities positioned within the Transport layer of the Internet communication environment. The additional capabilities of this model also facilitate the minimization of cluster dispatcher processing load and the identification by the cluster dispatcher of request trust level. The model is then implemented within the Linux kernel. The implementation involves the addition of several options to the existing TCP specification and also the addition of several functions to the existing Socket API. The implementation is subsequently evaluated in a dispatcher-based clustered service environment.

Contents

Keywords
Abstract
List of Figures
Declaration
Previously Published Material
Acknowledgements

Chapter 1: Availability - A Literature Review
1.1 Introduction
1.2 Availability and Reliability
1.3 Service Clustering
1.3.1 Availability via Clustering
1.3.2 Scalability via Clustering
1.4 Availability and Denial of Service
1.4.1 Kernel Based Denial of Service Protection
1.4.2 Distributed Denial of Service
1.5 Critical Features
1.5.1 User Wait Time
1.5.2 Server Side Dispatching
1.5.3 Trust
1.6 Conclusion

Chapter 2: An Availability Maximising Model
2.1 Introduction
2.2 Distributed Service Infrastructure
2.2.1 Internet Service, Cluster, Management Agents and Cluster Policies
2.2.2 Naming Service, Network Infrastructure, Process Interface
2.2.3 Service Request, Service Provision and Processing Type
2.3 Infrastructure Deficiencies
2.3.1 A Limiting Availability Definition
2.3.2 Non-Optimal Dispatcher Load
2.3.3 Service Request Validation
2.4 Research Solutions
2.4.1 An Extended Definition
2.4.2 Decreasing Cluster Dispatcher Load
2.4.3 Trust Based Filtering
2.5 Detailed Design
2.5.1 Wait-Time Objects
2.5.2 Redirect Objects
2.5.3 Trust-Based Filter Objects
2.6 Conclusion

Chapter 3: Evaluation - Experimental Methodology
3.1 Introduction
3.2 Network Infrastructure Overview
3.3 Experimental Methodology: Deployment of Prototype Infrastructure
3.3.1 Extended User-API
3.3.2 IPSec and FreeS/WAN
3.4 The Test Application and its Interaction with the Prototype Infrastructure
3.4.1 Cluster Dispatcher Application Design
3.4.2 Cluster Processing Application Design
3.4.3 Key Requesting Client Application Design
3.5 Evaluation Criteria
3.6 Conclusion

Chapter 4: Model Implementation - Protocol Level
4.1 Overview
4.2 Existing TCP Specification
4.2.1 TCP Core Operations
4.2.2 TCP Options
4.2.3 TCP State Information
4.2.4 TCP Interfaces
4.3 TCP Protocol Changes
4.3.1 Cluster Initialization
4.3.1.1 Cluster Dispatcher Initialization
4.3.1.2 Cluster Processing Node Initialization
4.3.2 Service Request Processing
4.3.2.1 Client-Dispatcher Service Request Processing
4.3.2.2 Client-Cluster Node Service Request Processing
4.3.2.3 Cluster Node Dispatcher Update of Wait Time
4.3.3 Trust Based Service Request Filtering
4.4 Conclusion

Chapter 5: Model Implementation - Linux Kernel and Socket API
5.1 Introduction
5.2 Linux Networking Overview
5.3 Linux System Calls
5.4 Connection Initialization and Passive Opening
5.4.1 Event 1 - Socket Initialization
5.4.2 Event 2 - Port Binding
5.4.3 Event 3 - Cluster Dispatcher Passive Opening
5.4.4 Event 4 - Cluster Node Passive Opening
5.5 Client Connection with Cluster Dispatcher
5.5.1 Event 5 - Client Connect Call
5.5.2 Event 6 - Cluster Dispatcher Receipt of SYN Segment
5.5.3 Event 7 - Client Receipt of Redirecting SYN+ACK Segment
5.6 Client Connection with Cluster Node
5.6.1 Event 8 - Client Sending Redirected SYN
5.6.2 Event 9 - Cluster Node Receives Redirected SYN
5.6.3 Event 10 - Client Receives Cluster Node's SYN+ACK
5.6.4 Event 11 - Node Receives Client's ACK
5.7 Update Wait Time - Cluster Node to Dispatcher
5.7.1 Event 12 - Node Calculates Updated Processing Time
5.7.2 Event 13 - Cluster Dispatcher Receives Node Updated Wait Time
5.8 Trust Based Filtering
5.9 Conclusion

Chapter 6: Experimental Results
6.1 Introduction
6.2 Minimised Wait Times and Performance
6.2.1 Minimised Wait Time Testing
6.2.2 Operating System Kernel Performance
6.2.3 Packet Count Performance
6.3 Denial of Service Resilience
6.3.1 LVS/TUN DoS Protection Algorithm
6.3.2 TCP Redirection DoS Protection
6.3.3 DoS Protection Algorithm and Timing Comparison
6.4 API Usability
6.4.1 Standard Client-Server API Usage
6.4.2 Research Client-Server API Usage
6.4.3 Cluster Dispatcher Research API Usage
6.4.4 Cluster Processing Node Research API Usage
6.5 API Compatibility
6.6 Evaluation Summary and Limitations
6.7 Conclusion

Chapter 7: Conclusion
7.1 Introduction
7.2 A Summary - Chapters 1 to 5
7.3 Evaluation Results and Research Conclusions
7.4 Future Directions
7.4.1 Server Processing Paradigm
7.4.2 Valid Initial Sequence Number Computation
7.4.3 Interoperability Constraints
7.5 Final Comments

Bibliography

Appendices
Appendix A Linux Network Programming (TCP) Description
Appendix B TCP Options

List of Figures

1.1 Replicated Service Availability Management
1.2 Network Load Balancing
1.3 Virtual Server via NAT
1.4 Virtual Server via IP Tunnelling (LVS/TUN)
1.5 The Maximum Waiting Time Concept
2.1 The Dispatcher-Based Cluster Paradigm - An Overview
2.2 The Dispatcher-Based Cluster Paradigm - A Detailed View
2.3 Process Interface Concept
2.4 The Internet TCP/IP Communication Model
2.5 Service Provision and Service Provision Commencement
2.6 Dispatching via IP Tunnelling
2.7 Example State Information - LVS/TUN
2.8 Layers Involved with Dispatcher Processing
2.9 Distributed Denial of Service Vulnerability
2.10 Model Processing Overview
2.11 Wait-Time Objects - Initialisations
2.12 Wait-Time Objects - Cyclical Processing Actions
2.13 Redirect Objects - Processing Actions
2.14 Initialisation Phase: Trust-Based Filter Object
2.15 Filtering Phase: Trust-Based Filtering Object
3.1 Network Infrastructure Overview
4.1 Protocol Layering in RFC 793
4.2 TCP State Transition Diagram
4.3 TCP Options
4.4 TCP State Information
4.5 Send Sequence Number State
4.6 Receive Sequence Number State
4.7 Wait Time Option Prepared by Client
4.8 Wait Time Option Prepared by Cluster Dispatcher
4.9 Redirection Option Prepared by Cluster Dispatcher
4.10 Wait Time Update Option Prepared by Cluster Node
5.1 TCP Event Overview
5.2 The Linux Network Layer Structure
5.3 Four Tier User-Level Library Function Design
5.4 Two Tier Library-Kernel Function Design
5.5 Function Call Hierarchy for socket()
5.6 Socket Data Structure Architecture
5.7 Sock Struct and Selected Members
5.8 Function Call Hierarchy for bind()
5.9 TCP Bind Bucket Tables
5.10 Bind Data Structures
5.11 Function Call Hierarchy for sys_listen_and_send_redirect()
5.12 The Listen Queue
5.13 Function Call Hierarchy for listen_and_receive_redirect()
5.14 The SYN Queue Data Structures
5.15 Function Call Hierarchy for connect_with_wait_time()
5.16 Wait Time Option Design
5.17 ESTABLISHED and BIND Tables
5.18 Function Hierarchy for SYN Processing by Dispatcher's TCP
5.19 Dispatcher TCP's Wait Time Option
5.20 Dispatcher TCP's Redirection Option
5.21 Function Hierarchy for SYN+ACK+Options Processing
5.22 Function Hierarchy for Sending (Redirected) SYN
5.23 Function Hierarchy for Processing Incoming (Redirected) SYN
5.24 Function Hierarchy for Processing Incoming SYN+ACK
5.25 Function Hierarchy for Processing Incoming ACK
5.26 The Accept Queue
5.27 Function Hierarchy following update_wait_time()
5.28 Update Wait Time Option
5.29 Function Hierarchy for Processing Incoming Wait Time Update
5.30 Function Hierarchy following trust_filter()
6.1 Software/Hardware Infrastructure
6.2 Maximum Client Wait Times
6.3 Timing Comparison via select() and readtsc
6.4 API Call Duration
6.5 Packet Count Statistics for Redirection TCP and IP Tunnelling
6.6 Dispatcher Event Count - LVS/TUN Dispatching
6.7 Dispatcher Event Count - TCP Redirection Dispatching
6.8 Denial of Service Protection Summary
6.9 API Categories and Member Functions
6.10 Standard Socket API Usage
6.11 Client Research API Usage
6.12 Cluster Dispatcher API Usage
6.13 Cluster Processing Node Research API Usage
6.14 Interoperability of Research and Standard Components
6.15 Summarised Test Results

Declaration

The work contained in this thesis has not been previously submitted for a degree or diploma at any higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signed: ........................  Date: ........................

Previously Published Material

The following papers have been published or presented, and contain material based on the content of the thesis.

[1] Peter Clutterbuck and George Mohay. Increased Availability for Networked Software Services. Australasian Journal of Information Systems, Volume 7, Number 2, pages 13-19. May 2000. This publication relates to material from Chapters 1 and 2 of the thesis.

[2] Peter Clutterbuck and George Mohay. Measuring Distributed Software Service Availability and Response Time via the Socket API and TCP Options. Proceedings of the International Conference on Internet Computing (IC'02), Volume 3, pages 1074-1080. CSREA Press USA. June 2002. This publication relates to material from Chapters 2, 4, and 5 of the thesis.

[3] Peter Clutterbuck and George Mohay. Internet Service Cluster Dispatching via a TCP Redirection Option. Proceedings of the International Conference on Internet Computing (IC'03), Volume 1, pages 381-387. CSREA Press USA. June 2003. This publication relates to material from Chapters 2, 4, and 5 of the thesis.

[4] Peter Clutterbuck and George Mohay. Increased Availability and Scalability for Clustered Services via the Wait Time Calculation, Trust Based Filtering and Redirection of TCP Connection Requests. Proceedings of the IEEE TENCON Conference, Volume 1, pages 328-332. IEEE, October 2003. This publication relates to material from Chapters 4, 5, and 6 of the thesis.

Acknowledgements

I would like to thank my principal supervisor Professor George Mohay. I am very grateful to Professor Mohay for his guidance, support and patience over the past several years. He has greatly assisted me in all areas of the research process. I wish to express my thanks to my associate supervisor Professor Mark Looi. I wish to also acknowledge the help and support I received during the initial stages of this research from the Distributed Systems Technology Centre (QUT). In particular I would like to thank Dr. Tim Redhead, Dean Povey and Simon Gibson. These three people made me feel very much part of their team and were very supportive in many important areas. My final acknowledgement is to my wife Ricki. I greatly valued and appreciated her encouragement and support during all stages of this research. Thank you very much.

Chapter 1: Availability - A Literature Review

1.1 Introduction

The research described within this thesis has the central aim of maximising the availability of a distributed software service. Availability, confidentiality, and integrity are listed consistently [18], [20], [27] as the three major deliverables of computing security research. Of these, [27] describes availability as the most difficult. [20] states that this difficulty has been caused by the temporal dimension that is associated with the availability challenge. The research described within this thesis focuses upon the following three specific research goals:

- The availability of a distributed software service should reflect both the liveness probability level and the probabilistic user access time of the software service.
- The dispatcher processing overhead is a function of the number of IP datagrams/TCP segments that are received or sent by the dispatcher in respect of each service request. Consequently the number of IP datagrams/TCP segments should be minimised, ideally so that for each incoming service request there is one IP datagram/TCP segment.
- The level of trust in respect of each service request received by the service cluster should be identified, as this is critical in mitigating distributed denial of service attacks and therefore in maximising the availability of a distributed software service.

The remainder of this chapter will describe the detailed derivation and meaning of these three research goals. The derivation and meaning of these three goals require the analysis of the major research themes that constitute the existing body of availability theory, together with the identification of the critical feature(s) within each major research theme. Each critical feature is then analysed in terms of its necessity and sufficiency as a core part of a complete availability theory. This analysis facilitates a progress check of where the established availability research has reached, and where future research may proceed most productively. Indeed this progress check ultimately points to the three research goals underpinning this research.

The remainder of this chapter will unfold as follows. Section 1.2 discusses how availability is closely linked with reliability, and how both concepts are subsets of dependable computing. Section 1.3 describes how current availability assurance strategies are primarily based on countering any single point of failure within the software service. Consequently, availability requires service replication, and this is facilitated via the concept of service clustering. Section 1.4 analyses how research into denial of service, a significant threat to software service availability, has developed historically along two main streams. Initial denial of service research focused upon solution strategies placed within the operating system kernel and in some cases other operating system (driver) modules.

Later denial of service research recognised that network software services were increasingly vulnerable to denial of service attacks mounted from remote machines. This later research, now known as distributed denial of service research, recognises that no single denial of service solution strategy exists, and that any practical level of protection must be based on several cooperating solution strategies that are located architecturally within the software service, within the operating system kernel/operating system driver modules, and also at strategic points across the network itself. Section 1.5 describes the critical features of each of the three major availability themes. These critical features in turn produce the three research goals that have been listed at the commencement of this chapter. Section 1.6 concludes the chapter and points to the content of Chapter 2.

1.2 Availability and Reliability

A very early examination of software availability theory is contained within [1]. This description begins by describing reliability and availability as important performance measures for the quality of a system. Availability is introduced as a function applicable to a system that tolerates shutdown times caused by either planned or unplanned outages. The availability function, A(t), is defined as the probability that the system is operating at time t. In contrast, the reliability function R(t) is the probability that the system has operated over the interval 0 to t. The most important items for consideration are how frequently the system goes down and for how long it stays down. An important parameter that characterizes this down time is the mean time to (or between) failures (MTTF). In the general case Markov modelling is used to analyse system availability and derive the required probabilities. An important difference between A(t) and R(t) is their steady-state behaviour: as t becomes large, all reliability functions approach zero, whereas availability functions reach some steady-state value. As an example, if the MTTF is 10^4 hours (a little over 1 year), and the mean repair time is 100 hours (about 4 days), then a steady-state availability probability of 0.99 is produced.

Similar availability metrics are surveyed in [2]. Reliability is the conditional probability, at a given confidence level, that a system will perform its intended function properly without failure and satisfy specified performance requirements during a given time interval [0, t] when used in the manner and for the purpose intended while operating under the specified application and environment stress levels. Instantaneous availability, A(t), is the probability that a system is performing properly at time t, and is equal to reliability for a system that does not tolerate shutdowns of any type. Steady state availability is the probability that a system will be operational at any random point of time, and is expressed as the expected fraction of time a system is operational during the period it is required to be operational.

More recently availability is defined in [3] as one of several reliability metrics. Reliability is described as the most important dynamic characteristic of almost all software systems. Informally, the reliability of a software system is a measure of how well users think it provides the services that they require. More formally, reliability is usually defined as the probability of failure-free operation for a specified time in a specified environment for a specified purpose.
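
The steady-state figure quoted from [1] above can be checked with the usual steady-state availability formula, which expresses availability in terms of MTTF and the mean time to repair (MTTR). The following is a worked illustration of that calculation rather than material drawn from [1] itself:

```latex
% Steady-state availability in terms of MTTF and mean repair time (MTTR),
% using the example figures quoted above: MTTF = 10^4 hours, MTTR = 100 hours.
\[
  A_{\infty} \;=\; \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}
             \;=\; \frac{10^{4}}{10^{4} + 100}
             \;\approx\; 0.99
\]
```
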
In the main, software reliability metrics have evolved from hardware reliability metrics. Metrics used for software reliability specification are as follows:

POFOD - The probability of failure on demand. This is a measure of the likelihood that the system will fail when a service request is made. For example, a POFOD of 0.001 means that 1 out of 1000 service requests may fail. POFOD is most relevant within safety-critical systems.

ROCOF - The rate of failure occurrence. This is a measure of the frequency of occurrence with which unexpected behaviour is likely to occur. For example, a ROCOF of 2/100 means that 2 failures are likely to occur in each 100 time units. ROCOF is most relevant to operating systems and transaction processing systems.

MTTF - The mean time to failure. This is a measurement of the time between observed system failures. For example, an MTTF of 500 means that 1 failure can be expected every 500 time units. It is the reciprocal of ROCOF if the system is not being changed. MTTF is most relevant to systems with long transaction processing times.

AVAIL - Availability is a measure of how likely the system is to be available for use. For example, an availability of 0.998 means that in every 1000 time units, the system is likely to be available for 998 of these. Availability is most relevant to continuously running systems.

Availability is the metric with which users or system operators are mostly concerned. Availability is the complement of down-time and as such takes into account the elapsed repair or restart time when a system failure occurs. If repair or restart time is brief, it is possible to have acceptable availability within a system that displays low reliability as measured by metrics such as ROCOF or MTTF.

Strategies for achieving high levels of availability are discussed in [4]. The concept of downtime is essential in defining and achieving availability. Downtime is described as follows: if a user cannot get his job done on time, the system is down. This strict definition is required because the system is provided for users to work in an efficient and timely way. When circumstances prevent a user from achieving this goal, the system is down. Causes of downtime include people factors, planned outages, environmental problems, hardware difficulties and software failures (server software and network software).

1.3 Service Clustering

The clustered internet service concept takes the single service process of the traditional service architecture, and replicates this process to form a cluster or group of service processes. The cluster concept increases service availability and scalability by providing fault tolerance and greater processing power via hardware redundancy and software replication. The cluster concept is comprehensively treated in [5]. This section will firstly describe how availability is increased by clustering, and then describe how scalability is added to enhance the availability management model.

1.3.1 Availability via Clustering

The increased availability provided by a clustered service is based upon the replicated service availability management that has been described in [6], [7], [8], [9], [10], [11], and [12]. Replicated service availability, as it is presented in [7], is illustrated in Figure 1.1. [7] assumes an asynchronous distributed system consisting of multiple host machines or nodes linked together into an arbitrary network topology. This service hosting network in turn has some arbitrary link with the service user community. The primary entities within the model include the distributed service, a service group, an availability policy, and a management service.

The distributed service is characterised by its state, type and operational implementation. A service group is a collection of replicated operational implementations (i.e., processes). The service group consists of a single primary implementation and several backup implementations. A service group may comprise only those nodes capable of speedy communication with all nodes within the group. Group membership is dynamic. An availability policy is a tuple (replication, synchronization). Replication defines the number of service backup implementations maintained within the distributed system. Synchronization defines the mapping of updates from the primary state to the backup state. Close synchronization describes one-to-one updating across all group members, whilst loose synchronization describes less frequent updating across the group. A management service is the system entity that implements the availability policy of the distributed service. A management service is implemented as a group of cooperating management agents. The management agent groups maintain consistent state information. A management agent is located within each node hosting a member of the service group. A management agent may be constituted as part of the service member implementation or as a standalone process that is separate from the service member implementation (process).

[Figure 1.1: Replicated Service Availability Management - the general user community reaches an arbitrary service network via a service network gateway; a service primary process and a service secondary process run on separate hosting nodes, each with management service agents, with close-synchronization state updating across the service group (replication of one backup).]

The model manages distributed service availability by ensuring that operational primary and backup implementations exist within the distributed system at all times. Each primary and backup implementation, depending on the level of synchronization specified within the availability policy, possesses uniform service state information.
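
The availability policy tuple and service group described in [7] can be pictured as simple data structures. The C sketch below is purely illustrative: the type and field names are hypothetical, and are not taken from [7] or from the implementation described later in this thesis.

```c
/* Illustrative only: a hypothetical representation of the availability
 * policy tuple (replication, synchronization) and a service group.      */
#include <netinet/in.h>

enum sync_mode {
    SYNC_CLOSE,   /* one-to-one state updating across all group members  */
    SYNC_LOOSE    /* less frequent state updating across the group       */
};

struct availability_policy {
    unsigned int   replication;      /* number of backup implementations */
    enum sync_mode synchronization;  /* primary-to-backup update mapping */
};

struct group_member {
    struct in_addr node_addr;        /* node hosting this implementation */
    int            is_primary;       /* 1 for the single primary process */
};

struct service_group {
    struct availability_policy policy;
    unsigned int               member_count;  /* membership is dynamic   */
    struct group_member        members[32];   /* arbitrary upper bound   */
};
```
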

The only threat to model viability is total, simultaneous failure of all nodes. The model makes no distinction between network level communication faults, hardware or operating system faults at a node level, or service implementation (process) failure. A group member isolated or extinguished for any reason is removed from the group until the isolation is resolved. When group reconfigurations occur for any failure reason, the management service across the newly formed group selects a replacement primary server, and then creates an additional backup on a suitable host node within the distributed system.

1.3.2 Scalability via Clustering

Whilst the availability management model described above provides hardware and software fault tolerance, it does not fully provide the scalability required for distributed services that process large and variable volumes of work. This is because secondary group members in the availability management model are utilised as failover backups and not for routine processing. This design choice reflects how that model viewed availability as a service liveness issue only. The cluster concept allows the addition of increased scalability to the availability management model by introducing load balancing strategies so that all replicated group members routinely participate in processing. These load balancing strategies therefore form a central issue in clustering. Load balancing strategies display a four-way taxonomy [14]: client side, server side Domain Name Service (DNS) Round-Robin, server side filtering, and server side dispatching.

Developments in client side load balancing include Berkeley's Smart Client [13]. The Smart Client requires that the internet service provide an applet running on the client side. The applet makes requests to the cluster to collect load information on all servers, and then chooses a server based on that information. The applet tries other servers within the cluster if it finds the initially selected server has failed. The Smart Client, as with other client side load balancing approaches, is not client-transparent and consequently requires modification of client applications. The client side approach also displays the potential to increase network traffic because of the extra cluster probing that is involved.

DNS Round-Robin (DNS RR or DNS aliasing) is used by a significant number of Web servers to distribute load across the Web servers cooperating to provide the service [13]. DNS support for load balancing is described in [14]. A single logical hostname for the service is mapped onto multiple IP addresses. Each IP address represents a processing member of the service cluster. When a client resolves the hostname, alternative IP addresses are provided in a round-robin fashion. [13] outlines two major problems with DNS Round-Robin load balancing. Firstly, the randomised load balancing of DNS RR will not work as well for requests demonstrating wide variance in processing time (e.g., some web requests may pull many pages from a site, and others may pull only one or very few). Secondly, DNS RR cannot account for geographic load balancing since DNS does not possess knowledge of client location or server capabilities. [15] also describes the unreliability of DNS RR: when a server node fails, the appropriate change in IP mapping will take time to propagate through DNS. The change must be made within the appropriate DNS zone file(s). The delay is further exacerbated by the heavy use of caching name servers within DNS. Therefore client resolution requests will continue for some time to have the failed IP address returned to them.
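
To make the DNS RR mechanism concrete, the short C sketch below resolves a clustered service name and prints every address returned. Under DNS RR the ordering of these addresses rotates between resolutions, and a client wishing to tolerate a failed member simply attempts the next address in the list. The host name www.example-cluster.org is invented for illustration; this is not code from the thesis.

```c
/* Resolve a round-robin service name and list all advertised cluster addresses. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(void)
{
    struct addrinfo hints, *res, *p;
    char addr[INET_ADDRSTRLEN];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_INET;      /* IPv4 cluster members */
    hints.ai_socktype = SOCK_STREAM;  /* TCP service          */

    if (getaddrinfo("www.example-cluster.org", "http", &hints, &res) != 0)
        return 1;

    /* DNS RR: each resolution returns the same addresses in rotated order.
     * A client can attempt connect() to each in turn until one succeeds.  */
    for (p = res; p != NULL; p = p->ai_next) {
        struct sockaddr_in *sin = (struct sockaddr_in *)p->ai_addr;
        inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr));
        printf("cluster member: %s\n", addr);
    }

    freeaddrinfo(res);
    return 0;
}
```
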

Load balancing by server side filtering is the strategy underpinning Microsoft Network Load Balancing [16]. This clustering technology is included in the Windows Advanced Server and Datacenter Server operating systems. Network Load Balancing is described with reference to Figure 1.2, which shows four Network Load Balancing (NLB) hosts, or cluster members, supported by optional shared storage. The clustered service is assigned a primary IP address which is advertised to the user community via DNS. The clustered service consists of a maximum of thirty-two cluster members. Each cluster member is assigned a unique host priority in the range 1 to 32, where lower numbers denote higher priorities. The cluster member with the highest host priority (i.e., lowest numeric value) is called the default host. The essence of NLB is that service requests are filtered and not dispatched. This means that all incoming service requests are received by all cluster members at the device driver level: there is no front end host receiving all requests and dispatching/routing each request to the most appropriate cluster member.

[Figure 1.2: Network Load Balancing (NLB) - Internet/intranet traffic arriving via a router/gateway and hub at the Network Load Balancing hosts (cluster members), maximum of 32.]

All service requests to the cluster arrive (via the hub) at each host and are then passed to the Network Load Balancing Driver, which is positioned on each host between the LAN device driver and TCP/IP. The Network Load Balancing Driver on each host performs a statistical mapping to determine which host should handle the request. This mapping uses a randomization function that calculates a host priority based on the client's IP address, port number, and other state information. The Network Load Balancing Driver then passes the accepted service requests to TCP/IP and discards the remaining requests.
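
The following C fragment sketches the general idea of such per-request filtering: every host sees the request, each independently applies the same deterministic mapping of client address and port to a host index, and only the selected host passes the request up to TCP/IP. The hash used here is arbitrary and purely illustrative; it is not Microsoft's actual NLB mapping function.

```c
/* Illustrative filtering decision: all hosts receive the request, each
 * computes the same mapping, and only the selected "owner" accepts it.
 * The hash below is arbitrary and is NOT the real NLB algorithm.        */
#include <stdint.h>
#include <stdio.h>

static unsigned owner_of(uint32_t client_ip, uint16_t client_port,
                         unsigned n_hosts)
{
    uint32_t h = client_ip ^ (((uint32_t)client_port << 16) | client_port);
    h ^= h >> 13;
    h *= 0x9e3779b1u;          /* arbitrary mixing constant     */
    h ^= h >> 16;
    return h % n_hosts;        /* host index 0 .. n_hosts - 1   */
}

int main(void)
{
    unsigned n_hosts = 4;              /* cluster of four members         */
    unsigned my_index = 2;             /* this host's position in cluster */
    uint32_t client_ip = 0xC0A80A07u;  /* example client 192.168.10.7     */
    uint16_t client_port = 51234;

    if (owner_of(client_ip, client_port, n_hosts) == my_index)
        printf("accept request (pass up to TCP/IP)\n");
    else
        printf("discard request (another member will accept it)\n");
    return 0;
}
```
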

The NLB literature states that filtering delivers higher network throughput than dispatcher-based solutions. Whilst NLB requires no dedicated hardware, it can only be implemented on an Ethernet or FDDI LAN, and indeed only on a single segment of a LAN. No wide area deployment of the cluster members is possible. NLB does not monitor the detailed workload of the software services comprising the cluster. The NLB literature states that where client connections produce widely varying loads on the server, Network Load Balancing's load balancing algorithm is less effective. NLB's architecture takes advantage of the cluster subnet's hub to simultaneously deliver all incoming network traffic to each and every cluster host. The NLB literature states that if the client-side network connections to the switch are significantly faster than the server-side connections, incoming traffic can occupy a prohibitively large portion of the server-side bandwidth.

The NLB literature also states that NLB does not manage any incoming IP traffic other than TCP traffic, User Datagram Protocol (UDP) traffic, and Generic Routing Encapsulation (GRE) traffic (as part of PPTP traffic) for specified ports. It does not filter IGMP, ARP, the Internet Control Message Protocol (ICMP), or other IP protocols. All such traffic is passed unchanged to the TCP/IP protocol software on all of the hosts within the cluster. As a result, the cluster can generate duplicate responses from certain point-to-point TCP/IP programs (such as ping) when the cluster IP address is used. This appears to suggest a vulnerability whereby an NLB cluster may be used to amplify the impact of a Smurf attack [71] on another network. The Smurf attack occurs when an attacker sends spoofed ICMP ECHO packets to the address of the amplifying network. The source address of the packets is forged to make it appear as if the victim system has initiated the request.

Server-side dispatching is the load balancing strategy within the Linux Virtual Server architecture (LVS) [15]. Linux Virtual Server is implemented (for Internet services) in two main ways: via Network Address Translation (LVS/NAT) and via IP tunnelling (LVS/TUN). Virtual Server via NAT is illustrated in Figure 1.3.

[Figure 1.3: Virtual Server via NAT - user requests from the Internet/intranet arrive at the virtual IP address of a load balancing Linux box, which forwards them via a switch/hub to Real Servers 1 to 3 on a private network.]

The operation of Linux Virtual Server via NAT is as follows. When a user accesses the service provided by the server cluster, the request packet destined for the virtual IP address (the external IP address for the load balancer) arrives at the load balancer. The load balancer examines the packet's destination address and port number. If they are matched for a virtual service according to the virtual server rule table, a real server is chosen from the cluster by a scheduling algorithm, and the connection is added into the hash table which records all established connections. Then the destination address and port of the packet are rewritten to those of the chosen server, and the packet is forwarded to the server. When further incoming packets belonging to this connection arrive at the load balancer (from the user), these packets are also rewritten and forwarded to the chosen server. When reply packets come back (from the chosen server), the load balancer rewrites the source address and port of the packets to those of the virtual service.

The major disadvantage of this dispatching approach is a lack of scalability due to the time taken by the load balancer in rewriting both the incoming and outgoing packets. [15] reports that the time taken by a Pentium machine to rewrite a packet of average length 536 bytes is around 60µs. [15] reports the maximum throughput of the tested Pentium load balancer is 8.93 MBytes/s, and that it can schedule 22 real servers if the average throughput of the real servers is 400 Kbytes/s. A second disadvantage of the NAT approach is that the cluster machines must be deployed within a private network. Private IP addresses must be used for the cluster members (address spaces 10.0.0.0 to 10.255.255.255, 172.16.0.0 to 172.31.255.255, and 192.168.0.0 to 192.168.255.255).

Server-side dispatching via IP tunnelling (LVS/TUN) is also used as a load balancing strategy within the Linux Virtual Server architecture. IP tunnelling is a technique that allows IP datagrams for one IP address to be wrapped and redirected to another IP address. The original IP datagram is wrapped (encapsulated) inside another IP datagram. The Linux Virtual Server architecture via IP tunnelling is illustrated in Figure 1.4.

[Figure 1.4: Virtual Server via IP Tunnelling (LVS/TUN) - user requests arrive at the virtual IP address of a load balancing Linux box and are forwarded through IP tunnels to Real Servers 1 to 3 on a private network; replies go directly from the real servers to the user.]

The workflow for virtual server via IP tunnelling is as follows. The user directs the request to the virtual IP address of the cluster. The request is received by the load balancer, which then proceeds to establish which of the cluster servers should action the request. The IP datagram containing the service request is then wrapped in another IP datagram, and sent to the selected server. The selected server unwraps the IP datagram (thereby obtaining the details of the original connection request TCP and IP headers), actions the request, and sends the service results directly to the user.

This is the central difference between the NAT and IP tunnelling load balancing strategies. Under NAT, traffic into the cluster and traffic out of the cluster all goes through the load balancer, and there is substantial overhead in processing the bidirectional traffic. Under IP tunnelling, traffic into the cluster comes via the load balancer, but all reply traffic is sent directly by the processing server to the user. This produces a substantial saving in the processing load of the load balancer. Consequently, [15] states that the LVS/TUN load balancer can handle huge amounts of requests; it may schedule over 100 real servers and will not be the bottleneck of the system.
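
For orientation, both LVS forwarding modes described above are normally configured on the load balancer with the standard ipvsadm administration tool. The virtual and real server addresses below are invented for illustration, and the exact options should be checked against the LVS documentation for the kernel in use:

```sh
# LVS/NAT: define a virtual service, then add real servers reached by
# rewriting (masquerading, -m) packets to private addresses.
ipvsadm -A -t 192.0.2.10:80 -s rr            # virtual IP, round-robin scheduling
ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.2:80 -m
ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.3:80 -m

# LVS/TUN: the same virtual service, but requests are IP-in-IP encapsulated
# (-i) and the real servers reply to the user directly.
ipvsadm -A -t 192.0.2.10:80 -s rr
ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.2:80 -i
ipvsadm -a -t 192.0.2.10:80 -r 10.0.0.3:80 -i
```
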

1.4 Availability and Denial of Service

Denial of service (DoS) is defined in [58] as an inhibition of service: the attacker prevents a server from providing a service. Denial of service poses the same threat as an infinite delay. In [59] a denial of service attack is characterized by an explicit attempt to prevent the legitimate use of a service. DoS protection has been a long term fundamental goal within the security research community. Unfortunately DoS has also proven to be a non-static problem and has quickly evolved with the growth of computer networks. Distributed denial of service (DDoS) is the current manifestation in today's distributed systems of the original denial of service challenge [17]. [59] describes DDoS as being the deployment of multiple attacking entities to attain the DoS goal.

It is also clear that denial of service attacks have increased significantly with the explosive growth of Internet computing. In 1983 Gligor reported in [18] that "In the past ten years more than fifty distinct examples of denial of service have been presented in the literature." By 2001 [19] reported measurements by CAIDA/UCSD (the Cooperative Association for Internet Data Analysis, http://www.caida.org) that detected more than 12,000 attacks against more than 5,000 victims during a three week study in February of that year. [20] states that 38% of security professionals surveyed declared that their sites had been the object of at least one DoS attack in the previous year. [21] describes in 2000 the largest malicious assault in the history of the Net, involving DoS attacks against corporate web sites including CNN, E*Trade, ZDNet, and Datek.

This section will present denial of service research as it has developed along two paths. The first path will describe what is termed kernel based denial of service protection. This protection has almost entirely consisted of resource access control solutions that are implemented within the operating system kernel/operating system (driver) modules. The second path will describe the research focus on distributed denial of service (DDoS) protection. This will show that the nature of DDoS renders the traditional operating system kernel based, access control model inadequate for DDoS protection, and that a much more network oriented and multi-party collaborative approach must be taken in solving the challenge. Indeed the description of DDoS research will show clearly that in overall terms denial of service remains a significant security challenge that requires a variety of mitigation controls. The detailed taxonomy of denial of service attack strategies and defences provided in [60] describes over thirty protection mechanisms that are deployed within the victim's network, the intermediate network(s), and also the source network. It must be stressed that the denial of service mitigation control that will be suggested in this chapter (and more fully defined in subsequent chapters) is one further control mechanism designed to work in concert with all other accepted controls.

1.4.1 Operating System Kernel Based Denial of Service Protection

Operating system kernel based denial of service (DoS) theory was initially developed in [18], and subsequently expanded in [22], [23], [24], [25], [26], [27], [28], [29] and [63].
The central thread through all of this research is that DoS protection should be positioned within a trusted computing base and should consist of resource/service access monitoring and control functions that aim to enforce some form of service wait