Value-Added Services with Sandvine Divert
A Sandvine Technology Showcase

Contents
Executive Summary
Introduction to Value-Added Services and Service Chaining
Sandvine Divert
Enabling Service Function Chains
Solutions Partner Ecosystem
How Sandvine Divert addresses IETF Problem Areas
Topological Dependencies
Configuration Complexity
Constrained High Availability
Consistent Ordering of Service Functions
Application of Service Policy
Transport Dependence
Elastic Service Delivery
Traffic Selection Criteria
Limited End-to-End Service Visibility
Per-Service (re)classification
Symmetric Traffic Flows
Multi-Vendor Service Functions
Future Extensions
Conclusion
Characteristics of Sandvine Divert

Executive Summary

There are two practical options available to a communications service provider (CSP) for deploying value-added services (VAS): integration/embedding within a PCEF, and redirecting from a PCEF to physical or virtual service nodes. Using the 12 problem areas for service function chaining identified by the IETF's Internet Working Group as a basis of comparison, the external redirection model is determined to be the superior approach.

Sandvine Divert is a patented technology that enables intelligent management of multiple service functions in a communications network. Divert redirects processed traffic flows on an application-aware, subscriber-aware, and/or content/MIME-type-aware basis to physical or virtual third-party systems for further processing, before returning traffic to the wire.

Divert places each service function logically inline on a per-flow basis after a connection is established and the application and/or content type is known. This feature, combined with Sandvine's traffic classification technology, gives Divert the maximum theoretically achievable efficiency for traffic redirection, far beyond rudimentary port-based redirection and heuristic guessing.

Sandvine's traffic redirection is a proven solution, having enabled more than 40 VAS and service function chain deployments worldwide through our Solutions Partner Ecosystem, a collaboration of partners and pre-integrated joint solutions. With Sandvine already in the network, CSPs can simply plug-and-play our ecosystem partners' offerings to get an end-to-end solution.
Introduction to Value-Added Services and Service Chaining

In telecommunications, value-added services (VAS) come in many forms: consumer and enterprise, and those that generate incremental revenue and those that do not. To deploy a single VAS, there are two practical options available to a communications service provider (CSP): integration and redirection.

A topic closely related to VAS is service chaining (i.e., service function chaining), a technique for selecting and steering data traffic flows through various service functions that is being investigated and developed by the Internet Engineering Task Force (IETF) Network Working Group.

In order to realize the full promise and potential of VAS and service chaining, the IETF has identified a number of challenges that a VAS deployment approach must overcome. [1] These challenges, or problem areas, provide a framework by which potential enablement solutions can be evaluated and compared.

An integrated approach is viable only if a CSP has a firm understanding of precisely what service functions they want to deploy, if the list of service functions is very small and unlikely to change, and if the integrated service functions are of sufficient quality to fulfill the requirements.

However, redirection-based enablement is the superior option overall: when done correctly, redirection can overcome all of the challenges identified by the IETF. Most importantly, redirection preserves choice and flexibility: the CSP is free to choose any vendor for any service function, and can introduce or remove service functions as needs change over time. [2]

[1] The problem statement can be found here: http://datatracker.ietf.org/doc/draft-ietf-sfc-problem-statement/
[2] A detailed examination of the different techniques for enabling value-added services and service function chains, which leads to the conclusions repeated in this introduction, is available in the Sandvine whitepaper Value-Added Services and Service Chaining: Deployment Considerations and Challenges.
Sandvine Divert

Sandvine Divert is a patented [3] technology that enables intelligent arbitration and management of multiple service functions in a telecommunications network. Divert allows the Sandvine Policy Traffic Switch (PTS), our PCEF/TDF, to redirect traffic flows on an application-aware, subscriber-aware, and/or content/MIME-type-aware basis to physical or virtual third-party systems (e.g., value-added services and other service nodes and functions) for further processing. The service nodes then return the processed traffic to the PTS for reinsertion onto the wire (Figure 1).

Figure 1 - Sandvine Divert in action: redirection is subscriber-, application-, and content type-aware, with automatic load-balancing and health-checking of service function availability

Perhaps the most powerful differentiator for Divert is its ability to logically inline service functions on a per-flow basis after a connection is established and the application and/or content type is known. This feature, combined with the PTS traffic classification technology, gives Divert the maximum theoretically achievable efficiency for traffic redirection, far beyond rudimentary port-based redirection (e.g., all port 80) and heuristic guessing (e.g., all port 80 that has small packets). [4] Essentially, the PTS filters large volumes of traffic to find the specific flows that a given service function requires, thereby maximizing the efficiency of the various service function deployments (i.e., fewer units required) and simplifying operations for the CSP.
It is important to note that even appliances with integrated/embedded service functions (e.g., multiple functions within a single chassis) and VAS need to redirect relevant traffic to the integrated processing blades and specific processes, and Sandvine's intellectual property reserves Divert's superior efficiency in this regard.

The Divert mechanism also balances load across groups of service functions, and includes health-checks to automatically shift load in the case of a failure on one or more of the nodes. Furthermore, new nodes (either physical or virtual) can be dynamically added to the group of service functions as network demands require, and traffic will balance across the enlarged group. In the event that a service function becomes unavailable, Divert will not steer to it and traffic will continue uninterrupted.

[3] US 2004/0193714 A1
[4] Other vendors have also employed variations of the late-bind technique whereby they answer the SYN for every flow in an effort to build state and application-awareness; this technique has generally fallen out of favor due to a propensity for outages (e.g., if there's a SYN-flood attack, if the destination server is down and the load balancer continues trying to build state, etc.).

Enabling Service Function Chains

To enable service function chains, the PTS redirects traffic flows to multiple service function nodes in series. For instance, non-video HTTP flows may be sequentially steered to a parental control filtering node first and subsequently through a cache, while HTTP video flows may be steered first through a video optimization service function, and then on to the cache. The solution includes a generic switch to perform a VLAN translation function and to increase the number of external ports available for connecting the service nodes.

Figure 2 - Different traffic types take different paths through the service function chain

Rules configured on the external switch route packets through sequences of service functions, in both directions, with sequences starting and ending at the external PTS service ports. The PTS has at least two physical external service ports connected to the switch and steers traffic flows to these labeled ports based on network policy conditions. Packets of chained flows are labeled with VLAN numbers configured in policy [5], which are then mapped to tags. This labeling completely abstracts network configuration and topology from network policy, which has benefits that are explained further in this document.

Critically, the two directions of a chained flow are steered to different ports such that they traverse service node chains in the opposite order. The routing logic on the external switch is deterministic based on the ingress port of a packet and the VLAN tag in the packet header. When chained traffic returns to the PTS, the packet's original L2 header is restored. Health checks for sequences are sent over the service plane, and serve both as keep-alive transmissions and to enable MAC learning on the VLANs being used.

The paramount goal in a consumer network is typically to optimize, or at least protect, the subscriber experience, and most service chaining options are rejected because they introduce latency that noticeably degrades service. Latency is particularly important in service chaining cases precisely because the same flow needs to be handled by more than one service function, each with its own overhead.

Sandvine's Divert capability was architected to ensure that a flow is only inspected by a PTS once (while retaining the capability to count packets pre- and post-divert, for business intelligence). This ensures that the redirection action itself always introduces minimal latency overhead, even when multiple redirections are required (of course, the latency introduced by each service function itself is unavoidable).

Solutions Partner Ecosystem

Sandvine's traffic redirection technology has enabled more than 40 VAS and service function chain deployments worldwide through our Solutions Partner Ecosystem, a collaboration of partners and pre-integrated joint solutions.
The ecosystem enables solutions through proven partnerships with best-of-breed technology vendors, including PeerApp (content caching), Netsweeper (content and parental controls), and OpenWave Mobility (advanced monetization services). With Sandvine already in the network, CSPs can simply plug-and-play our ecosystem partners' offerings to get an end-to-end solution. [6]

[5] When this document refers to something as being configured in policy, defined in policy, etc., it means that rules are expressed in the SandScript policy language; more information about SandScript is available here: https://www.sandvine.com/technology/sandscript.html
[6] More information about the Solutions Partner Ecosystem, including an extended list of the partner members, is available at http://www.sandvine.com/solutions/partner-enabled-solutions/

How Sandvine Divert addresses IETF Problem Areas

The IETF has identified 12 Problem Areas that represent the primary challenges related to service chaining. These areas and their general implications for service function enablement are discussed in detail in the Sandvine whitepaper Value-Added Services and Service Chaining: Deployment Considerations and Challenges. This section does not reproduce the text of those problem areas, but explains how Sandvine Divert addresses the issues identified by the IETF.

Topological Dependencies

The Sandvine Divert mechanism abstracts service functions from the physical network topology, while simultaneously optimizing resource utilization. Unlike other implementations that require configuration changes on data plane routers to enable VLANs or, even more fundamentally, changes to ensure flow symmetry, Sandvine enables service chaining without any reconfiguration of the existing network data plane or routing. This characteristic applies equally to mid-deployment alterations, such as the addition of a new service function to an existing deployment.

The ordering is logically strict in the sense that flows are sequenced according to a defined order set in policy, but the service functions can be deployed in any physical order. The sequencing order is maximally granular and in no case is there a need to establish an alternate topology. Regardless of whether a service function is delivered via a physical device or a virtual one, the ordering is determined by the VLAN translation process and divorced from any hard-wired series.

Moreover, Sandvine is unique in enabling multiple orthogonal/independent Divert conditions in policy construction. Descriptions of classification engines usually focus on granular classification of the flow protocol or application (including classification based on server response packets), but with Sandvine this can also be layered with device type segmentation, subscriber state, and other factors. Determinants of the sequence ordering (e.g., network policy conditions) can be freely combined to achieve highly flexible ordering.

Configuration Complexity

It is imperative that the enablement solution maintain flexibility, ease of configuration, and ease of reordering of service functions. Since Sandvine's Divert architecture abstracts service chain configuration from the network topology, changing the order of service functions should not require any change to network wiring.

Configuring Divert to enable service chains is relatively easy when the mechanism is understood, with all changes being expressed in policy. It is instructive to go a little bit deeper into the Divert mechanism.
To enable service chains, the PTS redirects traffic out a service port with a VLAN tag that signifies which sequence of service nodes a particular direction of a flow needs to traverse. Each direction of the flow has its packets tagged with a different VLAN. All interfaces of all divert service functions must have different VLANs/labels.

The generic switch (recall Figure 2) provides a single point of VLAN translation, which serves to direct the traffic (based on port arrival and VLAN tag) through each of the successive service nodes. Upon entering the switch after the final service function is performed, the VLAN tag is changed a final time in order for the packet to arrive back at the PTS to be reinserted into the existing network data plane.

The following process illustrates a typical flow sequence:

1. The PTS specifies a configurable pair of VLAN tags for a service chain destination (e.g., service nodes A and B)
   a. A unique tag pair is required for every 'destination' (i.e., each unique chain of service functions)
2. Upstream (subscriber to Internet) traffic is sent out on PTS service port 1 with VLAN tag A and destination MAC address of service port 2, and returns on service port 2 with VLAN tag B
3. Downstream (Internet to subscriber) traffic is sent out on service port 2 with VLAN tag B and destination MAC address of service port 1, and returns on service port 1 with VLAN tag A
4. When diverted traffic returns to the PTS, the destination MAC address is replaced with the original external MAC address of the packet; in the service chaining case, the rewriting of the MAC addresses in the Ethernet header is handled automatically by Sandvine, on the PTS

The role of the generic switch (which is assumed to be MAC address transparent) in this process is also relatively simple:

1. The switch is configured such that a packet arriving from the PTS with VLAN tag A goes to Service Function 1, then Service Function 2 (potentially on through Service Function N); the VLAN tag is flipped to B upon return to the PTS
2. In the opposite direction, it is configured such that a packet arriving with VLAN tag B goes to Service Function N (as required), then Service Function 2, then Service Function 1; the VLAN tag is flipped to A upon return to the PTS

There is an inherent assumption of flow stickiness that is important in this context. Once a service chain is selected, it should not be changed, because modification of the flow path by a service function within the chain could break the chain or the flow. This idea is pertinent to the discussion in subsequent sections (in particular Per-Service (re)classification). [7]

The PTS includes a number of commands to simplify troubleshooting and configuration. The show policy destinations command displays details on all the destinations that have been configured in policy, including Divert destinations. Details include the service node name, status, flow counts, and errors. The command also applies to service chain sequences, including further details on connective interfaces, ports, VLANs, MTU, maximum connection limitations, alarms, etc.
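The tag-pair scheme in the numbered steps earlier in this section can be sketched as a lookup table keyed on PTS egress port and VLAN tag. This is a minimal illustration only, not the PTS or switch implementation: the port labels, tag values (2001, 2002), service-function names, and the functions build_switch_rules and steer are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# All identifiers below (port labels, tag values, service-function names)
# are hypothetical and chosen only for illustration.
PORT_1, PORT_2 = "pts-service-port-1", "pts-service-port-2"

Key = Tuple[str, int]               # (PTS egress port, VLAN tag)
Path = Tuple[List[str], str, int]   # (hops in order, PTS return port, return tag)


def build_switch_rules(chain: List[str], tag_a: int, tag_b: int) -> Dict[Key, Path]:
    """Build both directions of one chain from a single VLAN tag pair."""
    if tag_a == tag_b:
        raise ValueError("a chain needs two distinct VLAN tags")
    return {
        # Upstream: leaves port 1 with tag A, returns on port 2 with tag B.
        (PORT_1, tag_a): (list(chain), PORT_2, tag_b),
        # Downstream: leaves port 2 with tag B, returns on port 1 with tag A,
        # traversing the service functions in the opposite order.
        (PORT_2, tag_b): (list(reversed(chain)), PORT_1, tag_a),
    }


def steer(rules: Dict[Key, Path], egress_port: str, vlan_tag: int) -> Path:
    """Deterministic lookup based on ingress port and VLAN tag."""
    try:
        return rules[(egress_port, vlan_tag)]
    except KeyError:
        raise ValueError(
            f"no chain configured for {egress_port} / VLAN {vlan_tag}"
        ) from None


rules = build_switch_rules(["ServiceFunction1", "ServiceFunction2"], tag_a=2001, tag_b=2002)
upstream = steer(rules, PORT_1, 2001)
downstream = steer(rules, PORT_2, 2002)
print(upstream)
print(downstream)
```

Because both directions are derived from one tag pair, the reverse ordering for return traffic follows from the chain definition instead of being configured separately.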
The show interface configuration command is similar to show policy destinations but adds further details at the individual interface level, including port number, administrative status, operational status, MTU, autonegotiation status, MAC address, aliasing, interface role, etc.

The show interface vlan command displays the policy database table for external port VLAN configuration, providing a simple tool for administrators to monitor and validate service function ordering:

    cli > show interface vlan
    Port  VLAN
    ----  -----------
    1-1   1, 2, 3, 2001, 2002, 3000
    1-2   1, 2, 3, 2010, 2012, 2090

[7] It is evident that the IETF working group is contemplating allowing service functions in the chain to alter the sequence, but this should be avoided. If there is not bidirectional symmetry in the flow and service function elements in the chain can make routing decisions, then this is a recipe for problems. It is preferable to keep things simple by requiring elements in the chain to have simple and symmetric forwarding rules.

Constrained High Availability

Service chaining with Sandvine Divert allows for unconstrained high availability at both the Sandvine and service function levels.
Sandvine achieves linear scalability and high availability via PTS clustering [8], which enables the inspection of every packet and every flow while delivering efficient scaling for carrier-grade throughput and growth. In a service function chain deployment, multiple PTS units in a cluster can redirect to one or more of the generic switches in a mesh configuration, ensuring there is no weak point in the availability model. Thus, it is possible to deploy in such a manner that any PTS in the cluster and/or any generic switch could fail while preserving overall availability and redundancy. [9]

High availability within the end-to-end service chain can also be achieved for service functions, through the deployment of redundant nodes. The Divert functionality includes load balancing and health-check mechanisms for both individual service functions and end-to-end service chains.

It is worth noting that while a failure in one service node will bring down a service chain, the chains themselves may be deployed in a highly available/redundant fashion. This is best illustrated with a simple example in which there are two service functions, each with two active service nodes. This equates to four chains:

Service function 1 has service nodes A + B
Service function 2 has service nodes C + D

The resultant chains from this scenario would be: AC, AD, BC, BD (for outbound traffic)

If service node A were to become unavailable, service node B would ensure that service function 1 continued to be delivered. Moreover, neither of the service function 2 nodes would become unavailable because of the failure.

Note that the above example of course provides a distinct logical ordering (not necessarily physical ordering): AC, AD, BC, BD for outbound and CA, DA, CB, DB for return traffic, in this case.
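The chain counts in this example can be checked with a short enumeration. The sketch below is illustrative only; the function names outbound_chains and return_chains are introduced here and are not Sandvine terminology. It builds every outbound chain as one node per service function, derives the return chains by reversing each one, and shows which chains survive the failure of node A.

```python
from itertools import product
from typing import FrozenSet, List

# Node groups from the example: service function 1 = A + B, service function 2 = C + D.
SERVICE_FUNCTIONS: List[List[str]] = [["A", "B"], ["C", "D"]]


def outbound_chains(functions: List[List[str]],
                    failed: FrozenSet[str] = frozenset()) -> List[str]:
    """One healthy node per service function, in the configured logical order."""
    healthy = [[node for node in nodes if node not in failed] for nodes in functions]
    return ["".join(combo) for combo in product(*healthy)]


def return_chains(chains: List[str]) -> List[str]:
    """Return traffic traverses each chain in the opposite order."""
    return [chain[::-1] for chain in chains]


all_out = outbound_chains(SERVICE_FUNCTIONS)
all_back = return_chains(all_out)
after_a_fails = outbound_chains(SERVICE_FUNCTIONS, failed=frozenset({"A"}))

print(all_out)        # ['AC', 'AD', 'BC', 'BD']
print(all_back)       # ['CA', 'DA', 'CB', 'DB']
print(after_a_fails)  # ['BC', 'BD']
```

With node A down, the two chains that pass through it disappear, while both service function 2 nodes remain reachable through node B.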
It is also entirely possible in the Sandvine model to have complete flexibility in the service function ordering, such that for certain flows, service function 1 would come before service function 2 (in a given direction) under certain conditions, and after it for other conditions (in the same direction). Therefore, if either service function could come first in a given direction (depending on various policy conditions), the resultant chains from this scenario would be: AC, AD, BC, BD, CA, CB, DA, DB (with the corresponding opposite of each for return traffic).

One scenario where this might be desirable is a case with two service functions: firewall and intrusion-detection. Certain customers might want the intrusion-detection inside the firewall, while others might want it outside.

[8] To learn more about clustering, please read the technology showcase, Policy Traffic Switch Clusters: Overcoming Routing Asymmetry and Achieving Scale
[9] Sandvine utilizes MSTP/RSTP to enable such a high-availability model without creating loops.

Consistent Ordering of Service Functions

Essentially, an enablement mechanism must allow the CSP to define and control a specific and consistent (subject to conscious decisions to change) ordering of service functions within the chain. As mentioned and illustrated previously, in Configuration Complexity, Sandvine Divert offers complete flexibility in service function ordering.

The VLAN architecture enforces consistent ordering and this consistency is aided by having service function elements that cannot subvert the ordering. This is also important in the context of service function elements that may be injecting packets (and must know where to put them).
It is worth noting that the concept of service function ordering is somewhat problematic, in the sense that any order is reversed for the return path of a flow. The text in the IETF problem statement seems to assume that service chains are unidirectional (i.e., that one would need to build two chains to accommodate bidirectional traffic flows). This assumption is misleading and may deflect attention from the most common real-world use cases, which typically require symmetric service chains.

In CSP networks, we think of subscribers on one side and Internet on the other:

Subscriber<->A<->B<->Internet

Any order is necessarily reversed between the two directions as packets travel each way over the lifespan of a flow. Similarly, in data center networks, one thinks of the servers on one side and users on the other. Again, each direction uses opposite sequences.

Application of Service Policy

While Sandvine Divert does utilize VLAN tagging for sequencing flows to service nodes, in the Sandvine implementation there is a separation of network policy from the service chains. There is a relatively simple configuration task to allocate VLAN IDs to set up the chains, and the network policy control only needs to refer to the abstract labeled chains. Business logic is separated from the network. This approach differentiates Sandvine's Divert from other VLAN-based service chain implementations.

The system includes the concept of a destination manager that is tasked with creating and storing a list of Divert destination pointers, including the child destinations. The child destination list is only used at load time to compute Divert sequence properties (for example, the maximum number of permitted connections). The structure also allows for the recursive inclusion of sequences within sequences, although infinite recursion is avoided since sequences are only able to contain previously defined destinations or sequences.
The system also keeps track of VLANs that have been used, in order to avoid VLAN collision; this characteristic is automatically enforced by the internal policy parser, and the list of which VLANs are configured and by whom is readable. If there is a request to configure a VLAN that has already been used by another user, an alarm is raised.

The IETF problem area states that, "Per-service function packet classification is inefficient and prone to errors, duplicating functionality across service functions." Sandvine's Divert does not suffer from this drawback, as packet classification occurs once on the PTS and the service function chain is defined end-to-end.

Transport Dependence

Access/transport agnosticism is another advantage of Divert that yields highly flexible service chains: Sandvine can deploy in any IP network, providing consistent service chaining functionality within and across DSL, cable, FTTx, WiFi, WiMAX, and 2G/3G/4G mobile networks.

The Sandvine PTS can also detect and redirect a variety of tunneled (e.g., GTP, GRE, L2TP, Q-in-Q, and IP-in-IP) and encapsulated (e.g., MPLS, EoMPLS, and VLAN) traffic to service functions. The PTS removes the flow headers when initiating the Divert action and reapplies the same headers when the traffic is returned from the service nodes at the end of the service chain. This support for encapsulated and/or tunneled traffic is critical for deployment flexibility and enables service nodes to be used irrespective of their support or lack of support for encapsulated traffic.
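The strip-and-reapply behavior described above can be pictured with a small model. This is a conceptual sketch under stated assumptions, not a description of how the PTS is built: the Packet and HeaderStore types, the per-flow key, and the header labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    flow_id: str        # hypothetical flow key, e.g. derived from the 5-tuple
    headers: List[str]  # encapsulation/tunnel headers, outermost first
    payload: bytes


class HeaderStore:
    """Remembers the headers removed from each diverted flow."""

    def __init__(self) -> None:
        self._saved: Dict[str, List[str]] = {}

    def strip(self, pkt: Packet) -> Packet:
        """Save the flow's headers and hand a bare packet to the service chain."""
        self._saved[pkt.flow_id] = list(pkt.headers)
        return Packet(pkt.flow_id, [], pkt.payload)

    def restore(self, pkt: Packet) -> Packet:
        """Reapply the saved headers when the packet returns from the chain."""
        if pkt.flow_id not in self._saved:
            raise KeyError(f"no saved headers for flow {pkt.flow_id}")
        return Packet(pkt.flow_id, list(self._saved[pkt.flow_id]), pkt.payload)


store = HeaderStore()
original = Packet("flow-1", ["MPLS", "VLAN"], b"GET / HTTP/1.1")
diverted = store.strip(original)     # what a service node would see
returned = store.restore(diverted)   # what goes back onto the wire
print(diverted.headers, returned.headers)
```

Keying the saved headers by flow means the service nodes never need to understand the encapsulation in use.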
While the Sandvine PTS is transport agnostic, service function elements in the chain see only simple IP packets (i.e., service function elements are freed to be network-agnostic), residing on a normal Ethernet LAN. They can, for example, inject packets, and the PTS will apply the correct header.

Elastic Service Delivery

The primary difference between service chaining using Sandvine's Divert capability and the one implied in the IETF problem statement is the separation of the service layer from the conventional data plane; that is, the problem statement assumes no separation and extends this assumption into the challenges that result. Having a separation facilitates ease of management and elastic capacity provisioning and service delivery.

Although the Sandvine Divert technique utilizes VLANs, the service layer is isolated conceptually, below or within the PTS. Other VLAN-based models impose requirements around complex and labor-intensive router reconfiguration, classification of subsets of traffic into the VLAN, etc., all of which is based on an assumption that the network traffic topology and flow must be changed.

Conversely, in the Sandvine model, all the work is in the PTS or the generic fan-out switch below it, which greatly simplifies the configuration process without requiring any changes to the overall network topology. This model minimizes the cost of implementation for these solutions, while minimizing points of failure and operational complexity and risk.

The Sandvine service chaining implementation enables complete elasticity of service function processing capacity, with additional capacity added as required. Capacity today is typically elastic in the sense that each individual service function can be sized according to the exact amount of addressable traffic redirected by the PTS (i.e., as opposed to the inefficiencies inherent with port-based redirection techniques).
There are also a number of sub-features available that aim to enhance the graceful incorporation of additional capacity. For example, traffic can be applied slowly to destinations within a service function group. This enables traffic that is steered to each destination to ramp up on startup and each time the destination transitions from being down to being up. This prevents the service node from being overloaded and allows a new node that has just been added (e.g., to meet capacity demands) to perform the necessary tasks to process the flows that it receives.

Another aspect of the elasticity is the dynamic selection of health-checked paths; when elements are brought up and down, the load-balancing mechanism dynamically adapts to use only those that are up.

Perhaps the most exciting innovations in the area of elastic service delivery are being made in the network functions virtualization (NFV) and software-defined networking (SDN) realms, as multiple virtualized service functions are co-resident on commercial off-the-shelf (COTS) hardware; from the Sandvine perspective, whether a service function is physical or virtual makes no difference. [10]

[10] It is also worth noting that the Sandvine system can be fully virtualized. Interested readers can find out more at https://www.sandvine.com/technology/virtualization/

Traffic Selection Criteria

This problem area presupposes that there is no means of efficiently and effectively determining what traffic should go to what service functions, but this is not the case. What is true, however, is that the degrees of efficiency and effectiveness vary greatly.
It is very important to note that even with integrated solutions, the VAS component typically resides on a separate blade or processing group; as a result, the platform must still redirect traffic to these processors, even though the redirection is at a process level or internal to a larger chassis.

In the theoretical best case, only traffic that meets criteria specific to a service function gets sent to that service function. Such criteria might include traffic pertaining to a particular subscriber, subscriber segment, device, application, protocol, CDN, route, video resolution, video provider, etc.

As stated previously, other approaches to service chaining use port-based redirection, or combine port-based redirection with a heuristic guess to achieve slightly better efficiency, because they lack Sandvine's ability to keep TCP state and remap window/ISN/options; they typically use load balancers that have evolved from an enterprise environment, with systems designed to be placed in front of a server with a known application and content. Sandvine Divert allows connection to the server first, and recognition of both the protocol and the server response that indicates MIME type, plus the consideration of other orthogonal policy conditions (e.g., subscriber state, client device), before redirecting to third-party service functions.

Consider a very simple example in which a CSP wishes to implement a video optimization service function. Sandvine's Divert capability leverages the traffic classification capabilities of the PTS to identify protocols of interest and redirect only optimizable traffic to the video optimization platform (e.g., redirect RTMP, but not proprietary Netflix). Moreover, the Divert mechanism has the ability to redirect based on MIME type so that Flash video assets (i.e., .flv) can be differentiated from the rest of an HTTP flow.
In contrast, a solution that uses PBR/port-based load balancer redirection will be forced to try to capture Flash video by redirecting port 80 traffic, which suffers from a number of limitations:

1. While the majority of traffic on port 80 is HTTP, not everything is, so the efficiency of that approach is immediately compromised
2. There is HTTP traffic on ports other than port 80, so the efficacy is doubly suspect
3. Not all HTTP is optimizable (only the Flash video content is) so this approach is again inefficient
4. Redirecting inappropriate content in this manner creates potential for high-profile failures

The limitations of the port-based approach, which still apply even when heuristics are added, are obvious when one looks at video protocols such as RTMP and RTSP. Both of these video protocols have a separate control and data channel, and the data (bearer) is on a random port. RTMP can also use UDP for the bearer. This is before one even considers the operational complexity, cost, and risk associated with altering routing tables to enable PBR.

Network operators minimize capital expenses (i.e., number of service node appliances) and operating expenses (e.g., complexity and revenue share percentage margins) by leveraging Sandvine's application and subscriber awareness via the condition-based policies that determine steering and sequencing. Put simply, precisely choosing which traffic to redirect means fewer service nodes are required.

In the same way that application and content-type awareness ensure that only pertinent flows are redirected, the same holds true for subscribers of a particular service. This enables only the flows belonging to users of a particular service (e.g., Video Pack or Parental Control) to be redirected to appropriate service functions, minimizing the number of nodes required.
Depending on the service offering, this also enables subscribers to be given the option to opt in to or opt out of a particular service, such as parental controls, via a portal.
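The contrast between port-based and condition-based selection described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the Flow record, its field names, the "video/x-flv" MIME string, and both functions are hypothetical and do not represent Sandvine policy syntax or any PTS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical summary of a flow after classification (illustrative only)."""
    server_port: int
    protocol: str                      # protocol as recognized by classification
    mime_type: str = ""                # MIME type from the server response, if any
    subscriber_services: frozenset = frozenset()


def port_based_redirect(flow: Flow) -> bool:
    """PBR-style rule: redirect anything seen on port 80."""
    return flow.server_port == 80


def condition_based_redirect(flow: Flow) -> bool:
    """Redirect only optimizable video for subscribers entitled to the service."""
    entitled = "Video Pack" in flow.subscriber_services
    optimizable = flow.protocol == "RTMP" or (
        flow.protocol == "HTTP" and flow.mime_type == "video/x-flv"
    )
    return entitled and optimizable


if __name__ == "__main__":
    video_pack = frozenset({"Video Pack"})
    samples = {
        "web page on port 80": Flow(80, "HTTP", "text/html", video_pack),
        "Flash video on port 8080": Flow(8080, "HTTP", "video/x-flv", video_pack),
        "RTMP on port 1935": Flow(1935, "RTMP", "", video_pack),
        "Flash video, no Video Pack": Flow(80, "HTTP", "video/x-flv"),
    }
    for name, flow in samples.items():
        print(name, port_based_redirect(flow), condition_based_redirect(flow))
```

In this sketch the port-based rule sends the ordinary web page to the optimizer and misses both the Flash video on a non-standard port and the RTMP flow, whereas the condition-based rule selects only the optimizable flows of entitled subscribers.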
Traditionally, CSPs have tried to achieve this result in mobile networks with IP address range segmentation, whereby the mobile operator creates distinct IP address pools for subscribers based on the combination of services they have purchased or are entitled to. These IP address ranges are then used to route the subscribers' traffic to the appropriate network elements. Unfortunately, this approach leads to an inefficient use of the CSP's IP address space and is a difficult model to maintain and scale, requiring a great deal of manual management.

Limited End-to-End Service Visibility

Sandvine has developed a set of tools that give visibility into the service chain and allow for inline health-checking of every node within the chain. Health checks are defined for individual destinations within a group, by adding service chain sequences to a destination group definition in policy:

    destination Group1 group destinations \
        HttpVideoOptimizeAndFilter
    if <condition> then divert destination Group1 \
        from client interface FromClient \
        from server interface FromServer

The list of destinations in a sequence does not enforce an order or network configuration but rather implies a parent-child relationship among destinations known to the PTS. For example, the divert sequence destination inherits properties from its child destinations (i.e., service nodes), such as the maximum number of connections permitted (i.e., the lowest of these values among the service nodes), TCP SYN handling configuration, etc. These inherited values can also be explicitly altered by specifying different values for the service chain.

Sandvine supports health check mechanisms for individual service chain sequence destinations, including ICMP Ping and HTTP response validations, both of which are sent from the PTS over the control/service plane.
These are effectively Layer 3 health checks and rely on the existence of a Layer 3 service function (virtual or physical) that is reachable over the management port of the PTS NPU (network processing unit) controller [11]. However, in isolation, these health check mechanisms would be insufficient since:

1. There is also a requirement to ensure that each service node endpoint in a service chain is checked (PTS NPU module => service node 1 => service node 2 => service node 3 => PTS NPU module) as well as the return path (PTS NPU module => service node 3 => service node 2 => service node 1 => PTS NPU module); this also covers the internal PTS service plane
2. A service chain is functionally a Layer 2 network path

[11] For an extended description of the NPU, please refer to either one of two Sandvine technology showcases: Maximizing Performance with Core and Processor Affinity, and Policy Traffic Switch Clusters: Overcoming Routing Asymmetry and Achieving Scale; both are available at www.sandvine.com

Therefore, in addition to ICMP Ping and HTTP response health checks, Sandvine offers an additional Layer 2 inline health check for these sequences. The following is an example of what the policy syntax looks like for an inline health check:

    healthcheck name inline interval time timeout time \
        {retry number} \
        {retry_failure number} \
        {src_addr ip} \
        {dst_addr ip} \
        {ttl number}
where:

    src_addr is the source address in the Ping packet
    dst_addr is the destination address in the Ping packet
    ttl is the time to live on the Ping packet

By default, the source and destination addresses are private IP addresses, because the inline health check is a Layer 2 mechanism and relies on VLAN switching to traverse the service chain sequence in each direction. The IP addresses are necessary because the inline health check is represented as an IP packet. The TTL ensures that if the health check packets are ever leaked out of the Layer 2 network, they will quickly be dealt with by neighboring Layer 3 devices.

Inline health checks are based on packets transmitted by each PTS processing unit through policy-declared sequence destinations. The packets are based on an IP packet with the following format:

    Ethernet Header | VLAN Tag | IP Header | Payload

where:

    Ethernet Header is populated with the primary and secondary MAC addresses for the Divert (service) interface
    VLAN Tag is populated with the VLANs specified for each side of the Divert (service) interface
    IP Header is populated with the source/destination IP addresses and TTL, as specified by the health check policy definition. If no IP addresses are defined, then acceptable defaults are used instead (i.e., 10.0.0.0 and 10.0.0.1).
    Payload is the payload of the packet, consisting of a set of fields that enable further identification of the packet as a health check packet and mapping of the packet back to the originating Divert sequence.

When an inline health check is applied to a service chain, separate health checks are registered: one for each VLAN/interface defined in the sequence. This ensures that there is health-checking in both directions through the service chain. If the health check timer expires before the health check completes, the system consults a configurable retry number and retry failure number and continues health checking.
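The frame layout described above (Ethernet header, VLAN tag, IP header, payload) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the PTS implementation: the function names, the default TTL of 1, the use of IP protocol number 1 (ICMP), the single 802.1Q tag, and the opaque payload bytes are all hypothetical choices. Only the field order and the default addresses 10.0.0.0 and 10.0.0.1 come from the text.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum (ones' complement sum of 16-bit words) over a header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_health_check_frame(dst_mac: bytes, src_mac: bytes, vlan_id: int,
                             payload: bytes, src_addr: str = "10.0.0.0",
                             dst_addr: str = "10.0.0.1", ttl: int = 1) -> bytes:
    """Assemble Ethernet header + VLAN tag + IPv4 header + payload.

    Assumptions (not from the source): one 802.1Q tag, protocol 1 (ICMP),
    a default TTL of 1, and an opaque payload supplied by the caller.
    """
    # Ethernet addresses, 802.1Q TPID, VLAN ID (priority bits zero), EtherType IPv4
    ethernet = dst_mac + src_mac + struct.pack("!HHH", 0x8100, vlan_id & 0x0FFF, 0x0800)

    total_length = 20 + len(payload)

    def ip_header(checksum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45, 0, total_length,     # version/IHL, DSCP, total length
            0, 0,                      # identification, flags/fragment offset
            ttl, 1, checksum,          # TTL, protocol (assumed ICMP), checksum
            socket.inet_aton(src_addr),
            socket.inet_aton(dst_addr),
        )

    # Compute the checksum over the header with a zeroed checksum field
    ip = ip_header(ipv4_checksum(ip_header(0)))
    return ethernet + ip + payload


if __name__ == "__main__":
    frame = build_health_check_frame(
        dst_mac=bytes.fromhex("020000000002"),
        src_mac=bytes.fromhex("020000000001"),
        vlan_id=100,
        payload=b"hypothetical-sequence-id",
    )
    print(len(frame), frame.hex())
```

A low TTL is used here to reflect the stated intent that leaked health check packets should be discarded quickly by neighboring Layer 3 devices; the value actually used by the product is not given in the text.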
If a service node is determined to be down, it is designated as down and new flows are no longer steered to it. If the node comes back online (as detected by the health check algorithm), all existing flows resume being steered to it.

Per-Service (re)classification

This problem area presupposes that the service chain is not configured in an end-to-end manner, but with Sandvine Divert this is not the case: service chains are determined in an end-to-end fashion and there is no need for each service node to determine the next hop. While Sandvine therefore solves this problem (e.g., minimizing resource consumption/computational overhead and ensuring differences in classification functionality between service functions do not limit
service chaining realization), it is conceivable that permitting service nodes to play a more active role in reclassification might be desirable in certain cases. However, as noted in Configuration Complexity and Consistent Ordering of Service Functions, doing so is generally a bad idea. Each service function in the chain should be agnostic about its upstream and downstream nodes. Allowing service functions within a chain to second-guess the ordering can break bidirectional flows. Any reclassification must necessarily be limited in the output chains it can select; it must not be possible for a downstream service function to be bypassed. Therefore, any reclassification would need to be constrained to subchains that return to the original chain.

One possibility that may yield benefits in emergent network service chaining models that leverage NFV is enabling each virtual service node to optionally inject packets directly into the next-hop node. Traditionally, as each packet arrives at a service function node, headers are inspected, an operation is performed, a new packet is (re)generated on the output side, and the process repeats at each step in the n-length service chain. There is a significant computational cost associated with bringing packets in and out of virtual machines. Theoretically, once the first service node makes changes, it should be possible for the hypervisor to inject the same output directly into the successive virtual machine(s) in the series, thereby obviating this processing.

Today, Sandvine leverages the superior classification capabilities of the Sandvine PTS, which achieves maximal flow steering granularity while removing any onus from service nodes for acting as determinants and/or sharing a common forwarding/injection framework.

Symmetric Traffic Flows

Sandvine PTS units completely resolve network asymmetry from the standpoint of the service functions, presenting redirected traffic in-order and symmetrically.
Critically, and unlike market alternatives that rely on sharing state between multiple boxes, Sandvine's asymmetry resolution works in any access network, with any number of asymmetric routing paths [12].

[12] To learn exactly how we achieve this result, please read the technology showcase, Policy Traffic Switch Clusters: Overcoming Routing Asymmetry and Achieving Scale

Multi-Vendor Service Functions

Perhaps more than any other problem area, this one strongly favors Sandvine Divert over all competing approaches. With an integrated/embedded model, the CSP can only use those service functions that are already integrated (or could be integrated via additional effort). Practically, this restriction prevents the CSP from choosing between a range of best-of-breed options to select the optimal choice.

Conversely, Sandvine Divert preserves the CSP's ability to choose service functions from any vendor, provided they can interoperate with Divert (in Divert, all service nodes function as bumps in the wire, which is an insertion model that virtually all vendors support). In practice, there are rarely issues, as evidenced by the more than 40 successful Divert-based deployments running worldwide.

Future Extensions

There are a number of promising emergent technologies that could be applied to service function chaining and VAS-enablement. One such area is NFV; Sandvine can already Divert to either physical or virtualized service nodes, but there is definitely a concerted, industry-wide push for virtualization of network functions onto Intel/x86 platforms and into carrier data centers/private clouds.
Another promising technology is the nascent VXLAN (Virtual Extensible LAN) technology. VXLAN is a proposed encapsulation protocol that adds a 24-bit segment ID to each data frame, thereby overcoming the primary limitation of existing VLAN technology: the limit of 4,096 unique network IDs imposed by the 12-bit VLAN ID. The 24-bit segment ID expands the number of unique IDs to roughly 16 million (2^24 = 16,777,216), allowing tunnels/logical networks to be generated for an unlimited (in practical terms) number of service chains. VXLAN offers a virtual overlay on existing L3 networks to enable elastic computing architectures and the logical segmentation required to run such multi-tenant deployments over shared network infrastructure.

The orchestration of such a service-delivery overlay also extends the promise of automatically calculating all possible permutations of service functions and then generating distinct tunnels for each permutation in order to steer traffic appropriately, and many are looking to SDN for this orchestration. The focus of SDN is the separation of data and control plane components in the network, which is a concept that is highly synergistic with dynamic service chaining (i.e., through a shared abstraction of network infrastructure and the intelligent and automated provisioning of network resources through a central and separate controller).

For this reason, SDN holds significant promise for service chaining, although as it has initially been conceived by the ONF (Open Networking Foundation), it is primarily focused on applications and lacks subscriber awareness. For example, the ONF wrote recently about a logically centralized controller that maintains a global inventory of all network resources and completely controls resource allocation in response to evolving application-specified traffic demands. There is no implied subscriber state and profile awareness in this virtualized model.
However, the open APIs above the SDN controller allow integration with a PCRF (or other entities) to enrich the system with this kind of awareness. We already see an analog in the existing Sandvine architecture with the separation of data plane functions from the control plane, and we will continue to further enhance SDN models by extending awareness to the subscriber.

Sandvine has long worked within the PCC (Policy and Charging Control) architectural framework (among others), which employs points of detection, decision, enforcement, and charging on the network, all of which are well-defined and broadly understood. Conversely, SDN aims to establish an entire network path (i.e., service chain), but where PCC is subscriber-oriented, SDN is more suited today for an IP-centric data center environment. We see an opportunity to combine aspects of PCC, SDN, and NFV to deliver on CSPs' need for subscriber-based service chaining in a virtualized environment, with a network-wide decoupling of the infrastructure control plane.

Future iterations of Divert might therefore incorporate an SDN Controller, to signal the orchestration of virtualized assets in the data center or cloud. Using open, standard SDN interfaces, the Sandvine Policy Engine [13] can provide subscriber and application awareness to service nodes (something referred to as the provisioning/furnishing of "dataplane metadata"), and map subscribers and applications to IP layer information, thereby allowing efficient control of a service chain of virtualized functions and applications on a per-subscriber basis. In such a model, individual cores can dynamically be put into commission via orchestration, through bare metal allocation or hypervisors, enabling completely dynamic and elastic allocation for peak needs.
This approach preserves and extends Sandvine's existing efficiencies in capacity, density, and performance by providing service functions with steering, session load balancing, asymmetry handling, and data partitioning for throughput interfaces.

[13] Learn more here: https://www.sandvine.com/platform/policy-engine.html
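The identifier-space arithmetic behind the VXLAN discussion earlier in this section can be checked with a short Python sketch. It assumes, purely for illustration, that every ordered chain built from a pool of distinct service functions consumes one network ID; the function names are hypothetical, and this counting model is not taken from any Sandvine or VXLAN specification.

```python
import math

VLAN_ID_BITS = 12    # width of the VLAN ID, giving the 4,096 limit in the text
VXLAN_ID_BITS = 24   # width of the VXLAN segment ID


def id_space(bits: int) -> int:
    """Number of distinct identifiers available in a field of the given width."""
    return 2 ** bits


def ordered_chain_count(n_functions: int) -> int:
    """Count ordered chains of length 1..n drawn without repetition.

    Illustrative model only: each distinct ordering of each non-empty
    subset of service functions is treated as a separate chain.
    """
    return sum(math.perm(n_functions, k) for k in range(1, n_functions + 1))


if __name__ == "__main__":
    print("VLAN IDs:", id_space(VLAN_ID_BITS))
    print("VXLAN segment IDs:", id_space(VXLAN_ID_BITS))
    for n in range(1, 9):
        chains = ordered_chain_count(n)
        print(n, chains,
              "fits VLAN" if chains <= id_space(VLAN_ID_BITS) else "exceeds VLAN")
```

Under this counting model, six service functions yield 1,956 ordered chains, which fits within a 12-bit ID space, while seven yield 13,699, which does not; both remain far below the 16,777,216 values of a 24-bit segment ID.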
Conclusion

Sandvine believes that Sandvine Divert, which is in commercial use in networks around the world, represents a viable, proven technique for network service function chaining and VAS-enablement; importantly, we believe that Divert overcomes or suitably addresses the problem areas outlined by the IETF.

Characteristics of Sandvine Divert

The table below summarizes how Sandvine's Divert functionality addresses the IETF problem areas.

Consideration | Sandvine Divert
Topological Dependencies | Since Divert abstracts service functions from the physical network topology, there is no topological dependence.
Configuration Complexity | Divert preserves rich configuration options and provides GUI and CLI tools that simplify the tasks as much as possible.
Constrained High Availability | Divert allows for unconstrained high availability at both the Sandvine and service node levels.
Consistent Ordering of Service Functions | Divert offers complete flexibility in service function ordering, and provides CSPs with tools to impose and verify the ordering of service functions.
Application of Service Policy | Since Divert configures service function chains in an end-to-end manner, and separates the chains from the physical network topology, it does not suffer from the problems identified in this problem area.
Transport Dependence | Divert is completely agnostic of access and transport, and supports both tunneled and encapsulated traffic. Furthermore, since Divert can remove and reapply headers, it makes it possible to redirect this traffic to service functions (which will see only simple IP packets).
Elastic Service Delivery | Divert enables complete elasticity of service function processing capacity.
Traffic Selection Criteria | Divert is based on Sandvine's patented technology to send only precisely relevant traffic to a service function; in other words, Divert achieves the theoretical maximum efficiency, and no alternative can match this efficiency.
Limited End-to-End Service Visibility | Divert provides CSPs with tools to provide visibility into the service function chain and to allow for inline health-checking of every service function node.
Per-Service (re)classification | Since the service function chain is configured end-to-end with Divert, there is no per-service (re)classification.
Symmetric Traffic Flows | The Sandvine deployment completely resolves routing asymmetry from the perspective of the service functions.
Multi-Vendor Service Functions | Divert preserves the CSP's ability to choose service functions from any vendor.