Netfilter Failover. Connection Tracking State Replication. Krisztián Kovács <hidden@sch.bme.hu> 2003.08.17



Similar documents
Fault tolerant stateful firewalling with GNU/Linux. Pablo Neira Ayuso Proyecto Netfilter University of Sevilla

Linux Virtual Server Tutorial

Active-Active Servers and Connection Synchronisation for LVS

Introduction to Linux Virtual Server and High Availability

How to replicate the fire: HA for netfilter based firewalls

Troubleshooting and Maintaining Cisco IP Networks Volume 1

ct_sync: state replication of ip_conntrack

Linas Virbalas Continuent, Inc.

Introduction. Linux Virtual Server for Scalable Network Services. Linux Virtual Server. 3-tier architecture of LVS. Virtual Server via NAT

IP Alarm System Resiliency on Disaster Recovery and ISP changes. A better alternative to DNS-based solutions

Netfilter s connection tracking system

Active-Active Servers and Connection Synchronisation for LVS

A Transport Protocol for Multimedia Wireless Sensor Networks

Using VyOS as a Firewall

Intro to Linux Kernel Firewall

How To Build A Virtual Server Cluster In Linux 2003

Linux Virtual Server Clusters

Managing and Maintaining Windows Server 2008 Servers

High Availability Solutions for the MariaDB and MySQL Database

Limitations on Monitored Lines

ICOM : Computer Networks Chapter 6: The Transport Layer. By Dr Yi Qian Department of Electronic and Computer Engineering Fall 2006 UPRM

Load Balancing und Load Sharing

GLBP - Gateway Load Balancing Protocol

Ulogd2, Advanced firewall logging

TCP for Wireless Networks

CCNP Switch Questions/Answers Implementing High Availability and Redundancy

High Availability Solutions with MySQL

13th IFIP/IEEE International Workshop on Distributed Systems: Operations & Management

Active-Active Firewall Cluster Support in OpenBSD

How To. Configure Ethernet Protection Switching Ring (EPSR) to Protect a Ring from Loops. Introduction

Stateful Firewalls. Hank and Foo

R3: Windows Server 2008 Administration. Course Overview. Course Outline. Course Length: 4 Day

Open Source Firewall

Availability Digest. Redundant Load Balancing for High Availability July 2013

A Standard Modest WebSite

AdvOSS Session Border Controller

Evaluating the Cisco ASA Adaptive Security Appliance VPN Subsystem Architecture

Protecting the Home Network (Firewall)

Configuring CSS Remote Access Methods

Internet Ideal: Simple Network Model

High Availability. Vyatta System

Tomás P. de Miguel DIT-UPM. dit UPM

Astaro Deployment Guide High Availability Options Clustering and Hot Standby

Owner of the content within this article is Written by Marc Grote

Fully Managed Secure Data Sharing (a cloud service)

High Performance Cluster Support for NLB on Window

Module 1: Overview of Network Infrastructure Design This module describes the key components of network infrastructure design.

Linux High Availability

Middleboxes. Firewalls. Internet Ideal: Simple Network Model. Internet Reality. Middleboxes. Firewalls. Globally unique idenpfiers

CSCE 465 Computer & Network Security

Designing a Windows Server 2008 Network Infrastructure

High Availability. Vyatta System

Creating Web Farms with Linux (Linux High Availability and Scalability)

Routing Security Server failure detection and recovery Protocol support Redundancy

Table of Contents. Cisco Blocking Peer to Peer File Sharing Programs with the PIX Firewall

Digi Certified Transport Technician Training Course (DCTT)

Resource Utilization of Middleware Components in Embedded Systems

Open-Xchange Server High availability Daniel Halbe, Holger Achtziger

High availability on the Catalyst Cloud

Chapter 1 - Web Server Management and Cluster Topology

Microsoft Windows Server 2008: MS-6435 Designing Network and Applications Infrastructure MCITP 6435

FatPipe Networks

Data Replication INSTALATION GUIDE. Open-E Data Storage Server (DSS ) Integrated Data Replication reduces business downtime.

UNIVERSITY OF BOLTON CREATIVE TECHNOLOGIES COMPUTING AND NETWORK SECURITY SEMESTER TWO EXAMINATIONS 2014/2015 NETWORK SECURITY MODULE NO: CPU6004

H0/H2/H4 -ECOM100 DHCP & HTML Configuration. H0/H2/H4--ECOM100 DHCP Disabling DHCP and Assigning a Static IP Address Using HTML Configuration

Load Balancing and Sessions. C. Kopparapu, Load Balancing Servers, Firewalls and Caches. Wiley, 2002.

Netfilter / IPtables

ΕΠΛ 674: Εργαστήριο 5 Firewalls

Intercommunication between two MyPBX (via VoIP Trunk)

TIBCO Rendezvous Network Server Glossary

Volume Replication INSTALATION GUIDE. Open-E Data Storage Server (DSS )

ACCESS 9340 and 9360 Meter Ethernet Communications Card ETHER

Advanced Transportation Management Systems

ACHILLES CERTIFICATION. SIS Module SLS 1508

Integrated Traffic Monitoring

Mail-SeCure Load Balancing

Oracle Cluster File System on Linux Version 2. Kurt Hackel Señor Software Developer Oracle Corporation

Course Outline: Designing a Windows Server 2008 Network Infrastructure

PATROL Console Server and RTserver Getting Started

Highly Available Service Environments Introduction

OpenFlow and Onix. OpenFlow: Enabling Innovation in Campus Networks. The Problem. We also want. How to run experiments in campus networks?

Cisco Networking Academy CCNP Multilayer Switching

Disfer. Sink - Sensor Connectivity and Sensor Android Application. Protocol implementation: Charilaos Stais (stais AT aueb.gr)

Improving the Performance of TCP Using Window Adjustment Procedure and Bandwidth Estimation

Common Server Setups For Your Web Application - Part II

EMC CENTERA VIRTUAL ARCHIVE

Recommended IP Telephony Architecture

Adding scalability to legacy PHP web applications. Overview. Mario Valdez-Ramirez

AFW: Automating host-based firewalls with Chef

Magnum Network Software DX

IPv6 Specific Issues to Track States of Network Flows

TESTING & INTEGRATION GROUP SOLUTION GUIDE

Exam : EE : F5 BIG-IP V9 Local traffic Management. Title. Ver :

ΕΠΛ 475: Εργαστήριο 9 Firewalls Τοίχοι πυρασφάλειας. University of Cyprus Department of Computer Science

Using Fuzzy Logic Control to Provide Intelligent Traffic Management Service for High-Speed Networks ABSTRACT:

A Step-by-Step Guide to Synchronous Volume Replication (Block Based) over a WAN with Open-E DSS

Demystifying cluster-based fault-tolerant Firewalls

Transcription:

Netfilter Failover Connection Tracking State Replication Krisztián Kovács <hidden@sch.bme.hu> 2003.08.17 1

Original idea Harald's OLS 2002 paper: How To Replicate The Fire HA For Netfilter Based Firewalls The problem: Netfilter is a stateful packet filter, we have to replicate the conntrack entries NAT data 2

Network architecture 3

Cluster interface and internals Master-slave system Cluster is reachable through a virtual IP, packets sent to this IP are processed by the master An internal replication network is available (high-speed, secure, etc.) All nodes have the same configuration Two (almost) separate problems to solve: IP failover State replication 4

IP failover Don't reinvent the wheel: use VRRP (Virtual Router Redundancy Protocol) Provides: master election protocol virtual IP configuration on all interfaces VRRP implementation: keepalived daemon originally developed for LVS We need to be able to get notified when state changes occur: this is possible with keepalived 5

State replication Two roles: master and slave Master generates replication protocol messages from conntrack events Slave listens for these messages and updates the conntrack entries accordingly 6

ct_sync architecture 7

ct_sync implementation 8

Problems Dynamic structures (linked lists, pointers) Data structures have to be serialized Cluster-level unique IDs are needed instead of pointers Harald's approach: since the address of the ip_conntrack structure does not change, use that as an ID This is not correct: when failover happens, the addresses change (they are not guaranteed to be the same on the slaves!) 9

Problems Refresh and timers Refresh events occur too often, so they cannot be replicated This causes inconsistency: if the timers are not refreshed on the slaves, conntrack entries may timeout too early Because of this, timers are not activated on slaves, they are initialized at creation, and started only if a slave->master transition occurs 10

Replication protocol Protocol messages must not be tracked NOTRACK should be used Multicast UDP based NACK (Negative ACKnowledgement) based error recovery Each packet has a sequence number, master has the last N packets sent in memory If a slave detects that the received packet has an invalid sequence number, it requests retransmission 11

Error recovery Every packet sent by the master contains the sequence number of the oldest packet which can be retransmitted Slaves can detect some cases when recovery is trivially impossible Retransmission request suppression The protocol is very stupid Not scalable: a few missing packets cause retransmission storms (slaves do not store packets with too recent sequence numbers) 12

Protocol messages Message grouping is not implemented yet... :( 13

Protocol messages Every message is self-contained (not incremental): contains every information about a given conntrack entry or expectation update/delete messages for conntrack entries/expectations 14

Configuration Module parameters Initial node status (master/slave) Node ID Replication interface name procfs entries Change node status based on keepalived events 15

Test system 16

Future work I. Complete rewrite and cleanup :) (for a new and stable ctnetlink, maybe...) Better (scalable, more intelligent) replication protocol Should it be architecture and version independent? Message grouping: more messages in one packet 17

Future work II. Full resynchronization Maintenance of the cluster is really hard without it Retransmission of the whole conntrack table is impossible (slow, locks up master) Possible solution: transmission of fake update messages (eg. for unchanged entries) when master is idle 18

Future work III. Real testing Due to lack of hardware, I was unable to test the system on real HW UML was used throughout the development and testing In its current state it's not architecture independent 19

Open problems Is a userspace solution based on ctnetlink feasible? (With performance in mind) Load balancing instead of failover 20