The Need for SIP-Enabled Application Delivery Controllers
Table of Contents

Introduction
The Growing Deployment of SIP Communication
Application Delivery Controllers Will Become Standard for High Capacity Deployments
History and Current State of Load-Balancing and High Availability (HA) in VoIP
The Need for SIP-Enabled ADCs
The Role of the L4 Load Balancer
The Role of the SIP Proxy
The Advantages of a SIP ADC That Also Serves as a SIP Proxy
Simplicity
Request and Response Routing
TCP Splitting
Aging
Refer and Replaces
Transport Conversion and TLS Offload
Summary
About Radware SIP Director
About Radware

Smart Network. Smart Business.
Introduction

The traditional public switched telephone network (PSTN) is known for its robustness. We, as users, take telephony service availability as a given: we expect to hear a dial tone and to be able to make a call with decent quality whenever we pick up the handset. This is well demonstrated in the movie The Day After Tomorrow, when the character Jack Hall (Dennis Quaid) picks up a public pay phone and makes a call while New York City is under a deadly arctic storm that destroys large parts of the city and its populace. In reality, we have seen failures of time division multiplexing (TDM) networks only in extreme cases such as 9/11.

This type of reliability and robustness is expected from the newly deployed VoIP network, but is it available today? What is required to achieve this goal? This white paper provides an overview of the solution required to address this need and gives practical examples that demonstrate why such a solution must be SIP-enabled in order to function as a standard SIP entity in the network. Drawing upon over 10 years of extensive field experience, Radware's solutions and technology provide its wide customer base with service continuity and reliability and are complemented by the company's expertise.

The Growing Deployment of SIP Communication

Since its early days in the 1990s, VoIP has evolved from an early-adopter technology for low-rate, long-distance phone calls into a service that is gradually replacing the TDM network, with SIP as the protocol of choice. SIP and adjacent protocols and standards provide a wide range of multimedia value-added services that converge fixed and mobile networks with the Web 2.0 services delivered over these networks, with the goal of providing an enhanced user experience.
An example is allowing users to start an instant message (IM) session and then immediately turn it into a full video conference and collaboration session that invites additional parties to connect via the web from home, the office and on the go. VoIP brings forth a new architecture, the IP Multimedia Subsystem (IMS), currently being adopted and deployed by carriers and operators. This new architecture allows for the creation of such services and the integration of new and traditional services provided by external companies through standard interfaces.

With the increasing deployment of SIP and IMS, carriers and service providers are required to provide five-nines (99.999%) service availability and high levels of service quality for the growing volume of communication traffic. This in turn imposes carrier-grade performance requirements on the vendors and system integrators building solutions and products for these networks, posing new challenges for making these solutions network-ready.

Application Delivery Controllers Will Become Standard for High Capacity Deployments

As SIP deployment continues to grow, it stands today where HTTP stood in the late 1990s in terms of growth potential and availability requirements. At that time, load balancing, scalability and service availability were typically addressed within the scope of the product or application. As traffic increased and applications became mission-critical, the importance of scalability and performance increased and service availability became critical. It became clear that a dedicated solution was required to serve this need. This resulted in the evolution of the Application Delivery Controller (ADC).
Looking today at VoIP, and specifically SIP, from the standpoint of the ADC evolution leads to a clear conclusion: for SIP to continue replacing legacy TDM telephony, it must adopt the same approach of an external, purpose-built ADC. With such a solution, carrier-grade requirements for availability and load balancing can be addressed by a dedicated, off-the-shelf ADC. This reduces time-to-service for service providers and vendors and significantly simplifies development and deployment.

A SIP-enabled ADC should support the following capabilities:

1. SIP Service Availability and Disaster Recovery: service availability assurance covering local server failure, local and global server fail-over, and site disaster recovery. This is accomplished by identifying server failures and managing traffic redirection to a specific backup server or remote site.
2. Optimized SIP Performance, Scalability and Acceleration: local and global load balancing that allows traffic to be distributed to remote server farms for handling traffic peaks. Servers may be added to a cluster seamlessly, providing a scalable architecture that increases both performance (calls per second) and overall capacity (concurrent calls). Traffic is accelerated by offloading CPU-intensive tasks such as TLS to dedicated hardware (HW).
3. Optimal Call Completion: achieved through accurate routing of traffic and by maintaining session persistency. Messages are distributed to backend servers based on their capability to support them (e.g., server load, service type). Persistency is critical, as a persistency mismatch may cause session failure.
4. Interoperability: a SIP-enabled ADC must be fully standards-compliant and validated at SIP interoperability events such as SIPit. Additionally, it should enhance interoperability by performing protocol mitigation, for example through transport conversion and SIP/SIPS scheme conversion for TLS traffic.
5. Security: required both on the IP level, for threats such as distributed denial-of-service (DDoS), and against SIP-specific threats such as SIP layer floods, worms and SIP protocol exploits (e.g., buffer overflows, SQL injections).

History and Current State of Load-Balancing and High Availability (HA) in VoIP

Carrier-grade capabilities for VoIP service availability and load balancing are commonly developed today as an integral part of the application. Typical models of high availability and load balancing are limited to the current server cluster in architectures such as N+1 or N+M. Resources are not viewed globally for failure recovery, disaster recovery or load balancing in support of resource optimization. The fact that these solutions are integrated into the application limits their scope and scalability options. This type of solution also typically consumes resources that, from an ROI perspective, would be better used to increase application performance than to perform tasks an external ADC can handle.

From a development resource perspective, including partial ADC functions in the scope of the application requires repeated development of proprietary solutions. It is better to invest resources in application differentiation and functionality than in developing functions that can be obtained from an external, dedicated ADC.

Approaches to load balancing VoIP traffic have evolved as deployments have increased. Historically, in TDM networks and later in early VoIP deployments, load was split between servers based on phone number prefixes, destinations and type of service. This static distribution model is inefficient and typically results in an unbalanced system, wasted resources and low ROI. Each server cluster, statically handling a specific group of prefixes or a service, must be designed to support peak traffic since there is no way to change traffic distribution dynamically.
Additionally, peak traffic periods may vary between server clusters based on time zone or other behavioral characteristics. This results in low server utilization, as a cluster typically handles traffic levels well below its designed peak. The static distribution model is also management-intensive, as it requires frequent changes to the static traffic distribution configuration because the load in each group may change during the day or week, or over a longer period. Scalability in this type of architecture is problematic. Instead of scaling globally shared resources as a whole, each segment must be scaled separately. This increases scalability costs and can potentially lead to overload of specific server clusters.

Other models use a dispatcher for the initial message of the communication, but subsequent traffic is routed directly to the handling server.

[Figure 1: First message dispatcher model. The common dispatcher selects a server for the first message; the first message and its response pass through the dispatcher; following messages go directly to the selected SIP server.]

In this architecture model, if a server fails mid-call there is no way for the dispatcher to change message routing to preserve service continuity, so backups need to be implemented between the servers themselves. This imposes greater load on the servers, as they are required to perform health monitoring of one another and to keep data and states synchronized. Server resources are thus allocated to these tasks instead of to the application, reducing ROI.

Another common model incorporates broadcast and hash-based decision methods. In this approach the front-end dispatcher broadcasts traffic to all backend servers. Each server then computes a hash function to decide whether it should handle the call. This approach is used in Microsoft's Network Load Balancer (NLB).
This model introduces additional load on all servers and creates issues in case of server failure and recovery, as the routing decision is made at the backend servers instead of at the front end. There is no way for the front-end dispatcher to change load balancing and routing rules in response to changes in traffic load and server availability.

[Figure 2: Broadcast & hash-based decision. A message received by the common dispatcher is sent to all servers; each server makes a hash-based decision on whether to handle the call.]

An external ADC that functions as a SIP Proxy in the message routing will be able to decide on changes in message routing to a new service server in case of local failure, or can even decide to route messages to a different location for disaster recovery.
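The broadcast and hash-based decision model described above can be illustrated with a minimal sketch. It assumes that each server hashes the Call-ID and keeps the message only when the result matches its own index; the hash function, the function names and the Call-ID value are hypothetical and are not taken from Microsoft NLB or any other product.

```python
import hashlib


def owner_index(call_id: str, server_count: int) -> int:
    """Map a Call-ID to one server index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


def handlers(call_id: str, live_servers: list[int], server_count: int) -> list[int]:
    """Return the live servers that would keep a broadcast message.

    Every live server receives the message and runs the same hash; only the
    server whose index matches the hash result handles the call.
    """
    owner = owner_index(call_id, server_count)
    return [server for server in live_servers if server == owner]
```

With all three servers alive, `handlers(call_id, [0, 1, 2], 3)` returns exactly one server. If that server fails and the survivors keep hashing over three slots, the result is an empty list: nobody picks up the call until every backend agrees on a new membership, which is the failure and recovery issue noted above.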
The Need for SIP-Enabled ADCs

Solutions available today for HTTP and Web load balancing and availability address only part of the challenges SIP networks face. SIP, as a real-time communication protocol, imposes new challenges that require SIP-enabled solutions to complement current HTTP application delivery solutions. Trying to solve this challenge by adding SIP as just another protocol supported by the L4 ADC, with some configuration and scripting rules to support it, is an insufficient solution that in many cases results in actual failure of common SIP scenarios.

The ITU-OCAF (Open Communication Architecture Forum) has defined a three-tiered SIP load balancing architecture that includes an L4 load balancer, a SIP stateless proxy and the application. Adopting this approach significantly simplifies integration and deployment in the network. It addresses SIP service availability and traffic load balancing in a simple, SIP-standardized way, with each component handling the tasks for which it is best designed.

[Figure 3: ITU-OCAF three-tiered SIP load balancing architecture. The SIP Director Application Delivery Controller, comprising the Load Balancer and SIP Proxy, sits between the IP network and the SIP elements it serves: SIP Applications & Services (e.g., application/feature server, IVR, messaging application, voice/video conferencing), SIP Border Elements (e.g., SBC), SIP Core Network Services (e.g., x-CSCF, softswitch) and Call Centers.]
The Role of the L4 Load Balancer

The load balancer component handles tasks on the IP level, in addition to some simple L7 functions it may also perform. These include:

- Health Monitoring: on the IP level using ping and ARP, and on the SIP level using OPTIONS.
- Local and Global Traffic Distribution: distributing traffic between SIP Proxies, between servers in a cluster and globally between server farms. This allows for better resource utilization, as server farms need not be designed for extreme peak load; instead they can distribute traffic to other locations in extreme cases.
- Global Disaster Recovery: in case of site failure, a load balancer at another site takes over the Virtual IP (VIP) of the failed location and handles its traffic through the VIP.
- Service IP Virtualization: using the DNS service record, the load balancer can route traffic to the server supporting the required service. This results in complete service virtualization, allowing one VIP to represent a service regardless of where it actually resides and how it is globally distributed.
- Security Gateway (GW): protects against DDoS and other IP level attacks, freeing backend servers from unsolicited traffic overload that may halt service or degrade its quality. This has a greater effect on real-time services such as VoIP, where delay significantly reduces service quality.
- Converged HTTP Load Balancing: many applications today, especially at the application server level, provide service through a combination of VoIP and Web applications. These require ADC solutions that support converged traffic.

The Role of the SIP Proxy

The SIP Proxy component functions as a standard SIP Stateless Proxy and is critical for allowing simplified configuration and deployment, as well as for assuring correct and interoperable function of the SIP load balancer. Functions of the SIP Proxy include:

- Simplified Application Integration: the SIP Proxy is configured as the outbound proxy of the backend servers. The result is a simplified integration without complicated workarounds, with only SIP traffic directed to the proxy. Solutions that do not function as a SIP Proxy are required to function as the default GW, which requires all traffic (SIP and non-SIP) to be routed through them and complicates integration with the network.
- Rule-Based SIP Routing: the ADC routes SIP messages based on SIP standard routing rules such as message type (e.g., INVITE and REGISTER) and Request-URI (RFC 3261). Additionally, it can make routing decisions based on any other message or message body parameter, recognizing identifiers such as IMS traffic or user capabilities.
- SIP Application Layer Persistency: persistency is performed by taking an active role in the SIP message routing process as a standard SIP entity. This ensures accurate and correct function based on SIP semantics, as well as support for advanced SIP scenarios.
- Transport Conversion and Traffic Acceleration: the SIP Proxy supports all common SIP transports and converts between them, increasing interoperability in case backend servers support only a subset of these transports. It also allows for TLS offload to reduce the processing requirements of backend servers.
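As an illustration of rule-based SIP routing on message type and Request-URI, the following minimal sketch matches the request line of a SIP message against an ordered rule list. The `Rule` type, the pool names and the host-suffix matching are assumptions made for this example; it does not describe SIP Director's configuration model, and it ignores IPv6 literals and non-SIP URI schemes.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    method: Optional[str]       # None matches any method
    host_suffix: Optional[str]  # None matches any Request-URI host
    pool: str                   # hypothetical name of a backend server pool


def parse_request_line(message: str) -> tuple[str, str]:
    """Return (method, request_uri) from the first line of a SIP request."""
    first_line = message.split("\r\n", 1)[0]
    method, uri, version = first_line.split(" ")
    if version != "SIP/2.0":
        raise ValueError("not a SIP/2.0 request line")
    return method, uri


def uri_host(uri: str) -> str:
    """Extract the host from a sip: or sips: URI, dropping user, port and params."""
    rest = uri.split(":", 1)[1]                  # strip the scheme
    hostport = rest.split("@", 1)[-1]            # strip optional userinfo
    hostport = hostport.split(";", 1)[0].split("?", 1)[0]
    return hostport.split(":", 1)[0].lower()


def route(message: str, rules: list[Rule]) -> str:
    """Return the pool of the first rule that matches the request."""
    method, uri = parse_request_line(message)
    host = uri_host(uri)
    for rule in rules:
        if rule.method is not None and rule.method != method:
            continue
        if rule.host_suffix is not None and not host.endswith(rule.host_suffix):
            continue
        return rule.pool
    raise LookupError("no rule matched")
```

A rule list such as `[Rule("REGISTER", None, "registrars"), Rule("INVITE", "conf.example.com", "conference"), Rule(None, None, "default")]` would send registrations to one pool, conference invitations to another, and everything else to a default pool.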
The Advantages of a SIP ADC That Also Serves as a SIP Proxy

Using a SIP-enabled ADC that functions as a SIP Proxy in the network is a natural, standard and simple way to guarantee SIP service availability. While L4-only solutions are flexible and it is possible to use them, their inherent drawbacks are breaks in SIP message flow and the creation of interoperability issues. Below are a few examples of common SIP scenarios, comparing how each is handled by a SIP-enabled ADC and by a common ADC that doesn't function as a SIP Proxy.

Simplicity

An ADC for SIP is required to perform SIP-based operations based on message content analysis. Such operations include session management and routing, transport conversion and security functions. By acting as a SIP Proxy, the ADC can perform these operations in a SIP standard manner, avoid interoperability issues and simplify integration.

An L4 ADC is required to function as a default GW for the traffic to be routed through it. This results in all traffic being routed through the ADC regardless of whether it is SIP signaling, media or another type of traffic. The ADC will need to inspect all traffic and filter out the non-SIP traffic; making the decision based on the well-known port 5060 is insufficient, as traffic may arrive on other ports. This increases the load on the ADC, overwhelming it with irrelevant traffic. By contrast, a SIP-enabled ADC can be configured as an Inbound Proxy for all traffic pointed to it by the DNS servers or peering SIP elements, and as the Outbound Proxy of the backend servers. This in turn causes all SIP traffic, and only SIP traffic, to be routed to it.

Functioning as a SIP Proxy enables simple and standard solutions for many of the ADC requirements that are otherwise addressed in a complex and inefficient way when the ADC does not function as a SIP Proxy.

Request and Response Routing

The SIP specification includes many requirements and definitions for routing of messages and responses.
SIP routing is based on message content and not on IP routing logic. Routing varies based on the type of entity, request and service, and on whether the message is the first request of a session, a forked request, a follow-on request or a response. Additionally, persistency is required so that the same server handles all messages of the same dialog or of related dialogs, such as all calls of a conference, and to avoid unnecessary message and information exchange between cluster servers in case of persistency mismatch. Persistency should be flexible enough to be based on varying and multiple parameters such as Call-ID, ConferenceID, User and any combination of message or message body parts. This requires understanding of SIP message content and semantics, as well as tracking the state of each SIP call.

For an ADC to perform correct and accurate routing and persistency, and so avoid additional unnecessary hops between backend servers, the ADC must understand message content and session/service logic. Trying to impose this on users by requiring them to define complex scripting and rules will increase time-to-service, complicate integration, create interoperability issues and break SIP robustness. Thus, only by functioning as a SIP Proxy, as an integral entity in the network that follows the SIP specification, can the ADC accommodate these requirements sufficiently.

Below is an example of complex message routing and persistency based on the type of message and its source and destination. The diagram demonstrates a scenario in which the system requires the same server to handle the registrations and calls of a specific UA. In such a case the load balancer is required to maintain persistency based on different headers depending on message type and source/destination.
[Figure 4: Complex routing and persistency scenarios. The diagram shows UA-1 and UA-2 in the local network, UA-3 and UA-4, a peering network, SIP Director and backend servers S1, S2 and S3, with the numbered flows described in the legend below.]

Flow legend:

(1) UA-1 from the local network performs registration by sending a REGISTER request. The load balancer routes the request to S1 and keeps the relation between the Contact header and S1 for persistency of future messages. Re-REGISTER messages are also routed to S1.
(2) UA-2 from the local network performs registration by sending a REGISTER request. The load balancer routes the request to S2 and keeps the relation between the Contact header and S2 for persistency of future messages. Re-REGISTER messages are also routed to S2.
(3) UA-3 from a peering network sends an INVITE request trying to connect a call with UA-1. The load balancer routes the request to S1 based on the To header.
(4) UA-1 sends an INVITE trying to call UA-2. The load balancer sends the request to S1 based on the From header and then to S2 based on the To header.

The examples in this diagram demonstrate the complex routing and persistency scenarios that are possible, where multiple and varying headers as well as the actual scenario must be considered for accurate persistency and routing. Only an ADC acting as a SIP Proxy is capable of handling these requirements.
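The four flows in the legend can be traced with a minimal persistency table. The sketch below is an assumption-laden simplification: users are identified by a plain string rather than by parsed Contact, From and To headers, new users are assigned round-robin, and the class and method names are hypothetical.

```python
class PersistencyTable:
    """Remember which backend server owns each registered user."""

    def __init__(self, servers: list[str]) -> None:
        self._servers = list(servers)
        self._bindings: dict[str, str] = {}
        self._next = 0

    def on_register(self, user: str) -> str:
        """Bind a user to a server on first REGISTER; re-REGISTERs reuse it."""
        if user not in self._bindings:
            self._bindings[user] = self._servers[self._next % len(self._servers)]
            self._next += 1
        return self._bindings[user]

    def on_invite(self, from_user: str, to_user: str) -> list[str]:
        """Return the servers an INVITE visits: the caller's, then the callee's.

        Users without a binding (for example, callers from a peering network)
        contribute no hop.
        """
        hops: list[str] = []
        for user in (from_user, to_user):
            server = self._bindings.get(user)
            if server is not None and server not in hops:
                hops.append(server)
        return hops
```

After `on_register("ua1")` and `on_register("ua2")` bind the users to S1 and S2, `on_invite("ua3", "ua1")` yields `["S1"]` (flow 3, by To header) and `on_invite("ua1", "ua2")` yields `["S1", "S2"]` (flow 4, by From and then To header).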
TCP Splitting

SIP allows for the sending of messages on reliable connections. SIP also defines a mechanism for reusing a single reliable connection for multiple calls and/or for upstream and downstream messages. In this context a SIP load balancer should be able to receive messages on one TCP connection and split them across multiple TCP connections in order to distribute traffic between several servers. The multiplexed traffic arriving on one TCP connection may carry messages belonging to multiple calls in the same TCP stream. Splitting these TCP streams while making sure all messages of a specific call are routed to the same server requires SIP logic in the load balancer. This functionality should also be supported in the opposite direction, by multiplexing traffic received on multiple TCP connections onto one connection.

In general, the requirement is that the SIP load balancer supports all SIP transport requirements and terminates the TCP connection on both the IP and SIP levels. Failing to support this leaves the load balancer unable to distribute traffic between the backend servers. This requirement translates into the need for the SIP load balancer to function as a SIP Proxy.

Below is an example of a scenario requiring support for TCP splitting and multiplexing.

[Figure 5: TCP splitting and multiplexing. UA-1 through UA-n in the public network reach a SIP Proxy, which holds one TCP connection to SIP Director; SIP Director performs TCP splitting of inbound traffic and TCP multiplexing of outbound traffic toward servers S1, S2 and S3 in the service center.]

In the diagram above, multiple calls are routed through a SIP Proxy that forwards them to a SIP-enabled ADC, which then distributes them between backend servers in the service center. Response messages and new requests initiated by the service center servers are multiplexed by the ADC and sent to the public SIP Proxy on one TCP connection.
This is possible because the internal SIP Proxy coupled with the SIP-enabled ADC terminates the inbound TCP connection on both the IP and SIP levels and opens a new connection with each backend server. A SIP ADC that doesn't function as a SIP Proxy will receive messages for all calls on the same TCP connection from the SIP Proxy in the public network. It will not be able to distribute calls between servers in the service center, because it cannot split the multiplexed traffic while maintaining call persistency; thus, all messages will be sent to the same backend server.
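To make the splitting step concrete, the sketch below frames individual SIP messages out of one multiplexed TCP byte stream using the Content-Length header and then pins each Call-ID to a backend server. It is a simplified illustration under stated assumptions: header folding is not handled, a missing Content-Length is treated as zero, and the `CallDistributor` class with its round-robin assignment is hypothetical.

```python
from typing import Optional


def header_value(message: bytes, name: str, compact: str) -> Optional[str]:
    """Return the first value of a header, matching its long or compact name."""
    head = message.split(b"\r\n\r\n", 1)[0].decode("utf-8")
    for line in head.split("\r\n")[1:]:  # skip the start line
        field, _, value = line.partition(":")
        if field.strip().lower() in (name, compact):
            return value.strip()
    return None


def split_stream(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Cut complete SIP messages out of a TCP stream; return them and the remainder."""
    messages: list[bytes] = []
    while True:
        head_end = buffer.find(b"\r\n\r\n")
        if head_end < 0:
            break  # headers not complete yet
        length = int(header_value(buffer[:head_end + 4], "content-length", "l") or 0)
        total = head_end + 4 + length
        if len(buffer) < total:
            break  # body not complete yet
        messages.append(buffer[:total])
        buffer = buffer[total:]
    return messages, buffer


class CallDistributor:
    """Keep every message of a call on the server chosen for its first message."""

    def __init__(self, servers: list[str]) -> None:
        self._servers = list(servers)
        self._by_call: dict[Optional[str], str] = {}

    def server_for(self, message: bytes) -> str:
        call_id = header_value(message, "call-id", "i")
        if call_id not in self._by_call:
            index = len(self._by_call) % len(self._servers)
            self._by_call[call_id] = self._servers[index]
        return self._by_call[call_id]
```

Feeding the stream through `split_stream` and then calling `server_for` on each framed message lets two calls that share one inbound connection land on different servers, while a later BYE for the first call follows its INVITE to the same server.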
Aging

Aging of SIP traffic is required to determine whether calls are still active. SIP has defined mechanisms for checking validity and activity in cases where a BYE message wasn't sent or didn't arrive due to network problems. Because the duration of SIP sessions and of periods of signaling inactivity is unpredictable and may be long, IP-based mechanisms for determining connection activity are not sufficient. Additionally, SIP can also use UDP as its transport, where connections do not exist. For this reason SIP has defined protocol-specific mechanisms for refreshing SIP sessions and validating their activity. This is done through session timers. Only a SIP-enabled ADC functioning as a SIP Proxy is able to take part in this process and clear void sessions. Other ADCs may suffer resource leaks because void sessions remain in memory, or may delete active sessions they mistake for expired ones.

Refer and Replaces

SIP supports transfer and consultation calls through the REFER request and the Replaces header. In such a scenario UA-1 may call UA-2; after the call is connected, UA-2 may put UA-1 on hold and connect a call with UA-3 (the consultation call). After UA-2 consults with UA-3, it sends a REFER request to UA-1 indicating that UA-1 is referred to UA-3 and that the new call replaces the call with UA-2 (Replaces header = UA-2). This in turn causes UA-1 to invite UA-3 to a new call with a Replaces header equal to UA-2, to indicate that it replaces the consultation call between UA-2 and UA-3. Both the call between UA-1 and UA-2 and the call between UA-2 and UA-3 are then disconnected.

In this scenario all calls must be handled by the same backend server in order to keep track of the calls and for billing purposes. Keeping the relation between calls and server persistency requires the ADC to understand SIP message semantics and to support these SIP requirements.
[Figure 6: Transfer with consultation requiring complex persistency. The diagram shows UA-2 in the local network, UA-1 and UA-3, a peering network, SIP Director and backend servers S1, S2 and S3, with the numbered flows described in the legend below.]

Flow legend:

(1) UA-1 calls UA-2.
(2) UA-2 consults with UA-3.
(3) UA-2 sends a REFER to UA-1 indicating that it is referred to UA-3 and that the new call replaces the call with UA-2.
(4) UA-1 sends an INVITE to UA-3 with Replaces header = UA-2. The consultation call between UA-2 and UA-3 is then disconnected, as is the original call between UA-1 and UA-2.
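The persistency requirement in this transfer scenario can be sketched as follows. The example assumes the Replaces header value has the form defined in RFC 3891, a Call-ID followed by `to-tag` and `from-tag` parameters, and routes an INVITE that carries it to the server already holding the replaced call. The `DialogPersistency` class and the `default_server` fallback are hypothetical names used only for illustration.

```python
from typing import Optional


def replaced_call_id(replaces_value: str) -> str:
    """Extract the Call-ID from a Replaces header value (Call-ID;to-tag=..;from-tag=..)."""
    return replaces_value.split(";", 1)[0].strip()


class DialogPersistency:
    """Track which backend server handles each call, following Replaces links."""

    def __init__(self) -> None:
        self._server_by_call: dict[str, str] = {}

    def bind(self, call_id: str, server: str) -> None:
        """Record the server chosen for an established call."""
        self._server_by_call[call_id] = server

    def route_invite(self, call_id: str, replaces: Optional[str], default_server: str) -> str:
        """Pick the server for a new INVITE.

        If the INVITE replaces a known call, reuse that call's server so that
        all related calls stay together; otherwise use the default choice.
        """
        server = None
        if replaces is not None:
            server = self._server_by_call.get(replaced_call_id(replaces))
        if server is None:
            server = default_server
        self._server_by_call[call_id] = server
        return server
```

If the original call and the consultation call were both bound to S1, an INVITE whose Replaces value names the consultation call is routed to S1 even when the load-based default would have been S3, which keeps call tracking and billing on a single server.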
Transport Conversion and TLS Offload

SIP defines various transports on which SIP traffic may be carried. These include UDP, TCP and TLS. As the SIP specification has evolved, mandatory transport support has changed, eventually requiring UDP and TCP to be supported by all entities and TLS by SIP Proxies. Still, since requirements have changed over time (and because some implementations are not complete), there are entities with partial transport support. By supporting all SIP transports, a SIP-enabled ADC may convert between them and increase the interoperability of backend servers. Additionally, it may terminate TLS traffic in purpose-built HW, thus accelerating traffic and reducing round trip delay. Transport conversion and TLS offload require changing SIP message content and therefore can be performed only by an entity functioning as a standard SIP Proxy.

Summary

In conclusion, as SIP deployments continue to increase and related services become increasingly mission-critical, availability, scalability and security requirements have likewise become essential for deployment. This evolution has resulted in the need for off-the-shelf SIP Application Delivery Controllers (ADCs) that can accommodate carrier-grade operational, architectural and development requirements in one purpose-built component, as the ITU-OCAF has defined. Using a SIP-enabled ADC that functions as a SIP Proxy in the network is a natural, standard and simple way to guarantee SIP service availability. While L4-only solutions are flexible and it is possible to use them, their inherent drawbacks are breaks in SIP message flow and the creation of interoperability issues.

About Radware SIP Director

Radware's SIP Director is the industry's first intelligent SIP-enabled Application Delivery Controller (ADC), designed on the ITU-OCAF three-tiered SIP load balancing architecture.
With SIP Director, carriers can guarantee out-of-the-box, carrier-grade SIP service delivery, enabling reliable, optimized, interoperable and scalable SIP services across the carrier network. As a dedicated, SIP-enabled ADC, Radware's SIP Director ensures efficient, lossless message routing and communication across all SIP entities, regardless of differences in their standards support or features. SIP Director also provides the ability to track, understand and accurately bill for complex and long SIP sessions. Radware's SIP Director leverages the company's 10 years of experience in application delivery to extend the same carrier-grade features and benefits to SIP service delivery. With SIP service delivery guarantees, companies gain the following advantages:

- High availability, fail-over and disaster recovery
- Scalability and performance
- Simplicity and flexibility
- SIP interoperability and feature support
- System security
- Reduced time-to-market

About Radware

Radware (NASDAQ:RDWR) is the industry leader in intelligent application delivery solutions. Wireline and mobile carriers rely on Radware to optimize IP services, guaranteeing high availability, maximum performance, service and network integrity and efficient infrastructure utilization. With Radware, carriers are able to ensure the performance and quality of IP-based services, reduce service delivery costs and introduce new services and Internet business models to grow revenues and IP service profitability. Let Radware make your network business-smart to get the most value from current and future IP services. For more information, please visit www.radware.com.

© 2008 Radware, Ltd. All Rights Reserved. Radware and all other Radware product and service names are registered trademarks of Radware in the U.S. and other countries. All other trademarks and names are the property of their respective owners.