Experiences with Class of Service (CoS) Translations in IP/MPLS Networks

Rameshbabu Prabagaran & Joseph B. Evans
Information and Telecommunications Technology Center
Department of Electrical Engineering & Computer Science
University of Kansas, Lawrence, KS 66045
{ramesh, evans}@ittc.ku.edu

Abstract

This paper presents some experiences with Class of Service (CoS) translation in IP and MPLS based networks. IP provides CoS in the form of eight priority classes that can be used to distinguish between a variety of traffic types. Since most layer-2 technologies provide support for strict QoS, an appropriate translation from the coarse-grained IP CoS to the fine-grained layer-2 QoS is fundamental to obtaining the desired end-to-end throughput. Multiprotocol Label Switching (MPLS), residing between IP and layer 2 in the protocol stack, provides an interface for translating IP CoS to an appropriate layer-2 QoS. This paper presents some of the results obtained by using MPLS CoS with relative and fixed bandwidth allocation to MPLS classes. Experiments were conducted to observe the effects of per-CoS WFQ and CBQ inside the MPLS cloud on fixed-size high-bandwidth traffic and on bursty traffic. It was found that MPLS CoS performed relative allocation of bandwidth and prevented starvation of lower priority flows inside the MPLS core. This paper also discusses some of the experiments conducted to evaluate the effects of improper CoS mapping as a packet traverses multiple networks.

1. Introduction

Class of Service (CoS) support has become an indispensable function in many large service provider networks today because of the competitive nature of the Internet and the diversity in customer needs. CoS support in traditional IP routed cores has been provided by a variety of queueing and scheduling mechanisms, two of the most important being Weighted Fair Queueing (WFQ) [1] and Class Based Queueing (CBQ) [2].
WFQ is flow based and provides easy support for relative bandwidth allocation, while CBQ is link-sharing based and provides easy support for fixed bandwidth allocation. Both mechanisms primarily utilize the precedence bits in the IP header [7] to determine the behavior a packet receives at a particular node in the network. The behavior a packet receives as it traverses the path from source to destination is also partly dictated by the Quality of Service (QoS) guarantees that the link layer can provide. Since most layer-2 technologies provide only strict QoS support, an appropriate translation from the coarse-grained IP CoS (a total of eight precedence levels) to the fine-grained layer-2 QoS is fundamental to network design. The introduction of Multi-Protocol Label Switching (MPLS) [3] inside the core to enable faster switching, QoS guarantees, and traffic-engineering capabilities has led to more efficient techniques that address the IP CoS to layer-2 QoS translation problem.

This paper presents some experiences with CoS translation in IP and MPLS based networks. Effects of relative and fixed bandwidth allocation on fixed-size and bursty traffic are discussed, based on experimental results. Issues involved in improper CoS mapping as a packet traverses multiple networks are also explained, along with supporting experimental results.

This paper is organized as follows. Section 2 outlines some of the techniques put forth by the IETF for IP to MPLS CoS mapping over an arbitrary layer-2 technology. Section 3 describes the test environment and the tools used in conducting the experiments. Section 4 summarizes the measurements made with relative and fixed bandwidth allocation in IP/MPLS networks. Section 5 discusses the effects of improper CoS mapping when a packet traverses multiple networks. Section 6 contains some final comments.

2. Techniques for IP to MPLS CoS mapping

The MPLS based peer model employs IP intelligence at every hop in the core, since the core routers take part in routing. As a result, congestion is not pushed towards the edges as in the overlay model, leading to more efficient bandwidth utilization. The IETF has proposed two ways in which IP CoS can be mapped to MPLS CoS [4]. In one model, the ToS octet [5] in the IP header is copied onto the EXP field [6] of the MPLS shim header, and appropriate packet treatment is given based on the value contained in the EXP field. When spanning multiple domains, either the pipe model or the uniform model [4] can be used consistently to provide appropriate treatment to the packet. In the other model, an MPLS signaling protocol such as LDP [9] or RSVP-TE [10] is used to signal N labels per class per IP source-destination pair. This model provides IP treatment to the packet at the edge and MPLS over layer-2 treatment in the core. As an example, if the core is ATM based, this model provides per-CoS WFQ and per-CoS WRED [11] at the edge and per-CoS WFQ and per-CoS WEPD [12] in the core.

The experiments that were conducted employed the second model for providing MPLS CoS; the layer-2 medium used was ATM. The reasons for preferring the second model to the first were twofold. First, congestion can be managed at every hop (IP hop or ATM hop) and a discard is possible at every hop, unlike the first model, where there cannot be a loss in the ATM fabric. Second, resource allocation is per CoS per link in the second model, while it is per pair of edge routers in the first model.

The following briefly describes the steps involved in CoS operation.
(i) The IP Type of Service (ToS) for a packet is set in the host (or router). The precedence bits define the CoS to be applied to the packet, as given in table 1.
(ii) The packet is queued in the Label Edge Router (LER) according to its CoS.
(iii) The MPLS CoS bits are mapped to an ATM label VC in the LSR at the edge of the ATM cloud.
(iv) Queueing of ATM cells is done based on their CoS in the ATM Label Switch Router (LSR), which can be inferred from the label value.
(v) The labeled packet is received at the egress LER and, after removal of the label, is forwarded with the appropriate CoS.

Table 1. IP CoS to MPLS class mapping
Precedence in IP header | Class mapped
0 or 4                  | Class 0
1 or 5                  | Class 1
2 or 6                  | Class 2
3 or 7                  | Class 3

Figure 1. Test scenario

3. Test environment and tools

The test scenario used for conducting the experiments is shown in figure 1. The scenario consisted of two clouds, one being an MPLS cloud and the other being just an edge router of an IP cloud. Two Cisco 7206s and one Cisco 7507 were used as edge routers, and a Cisco 12008 GSR served as the core router. Throughout this paper, the following notation is used to denote the hosts and routers:

Cisco 7507 - Router R1
Cisco 7206 at the edge of the MPLS cloud - Router R2
Cisco 7206 at the edge of the IP cloud - Router R3
Cisco 12008 GSR - Router P
Linux system connected by 100 Mbps Ethernet to R1 - H1
Linux system connected by the ATM interface to R1 - H2
Linux system connected to R3 - H3
Linux system connected to R2 - H4

H1, H2 and H4 served as traffic generators while H3 served as the traffic sink. The links under congestion were R1-P, P-R2 and R2-R3. Traffic was generated using a tool called Netspec [8]. Netspec provides the capability to send traffic either to fill the maximum bandwidth of the pipe using fixed-size packets or in bursts with a specified period and limited to a specified rate. Netspec also provides the capability to set the ToS byte in the IP header for a particular flow. Cisco IOS 12.0.7(T) was the IOS image used on the routers for testing.
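The two mappings described above, copying the precedence bits into the EXP field (the first model) and folding the eight precedence values onto four MPLS classes (table 1), can be sketched in a few lines of Python. This is an illustrative sketch, not router code; the function names are ours, though the shim layout follows RFC 3032.

```python
def exp_from_tos(tos: int) -> int:
    # First model: copy the IP precedence (top three bits of the
    # ToS octet) into the 3-bit EXP field of the shim header.
    return (tos >> 5) & 0x7

def build_shim(label: int, exp: int, bottom: bool, ttl: int) -> bytes:
    # Pack one 4-byte MPLS shim entry (RFC 3032 layout):
    # 20-bit label | 3-bit EXP | 1-bit bottom-of-stack | 8-bit TTL.
    assert 0 <= label < 2**20 and 0 <= exp < 8 and 0 <= ttl < 256
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return word.to_bytes(4, "big")

def mpls_class(precedence: int) -> int:
    # Table 1: the eight precedence values fold onto four MPLS
    # classes, precedence p and p + 4 sharing class p mod 4.
    if not 0 <= precedence <= 7:
        raise ValueError("IP precedence is a 3-bit field")
    return precedence % 4
```

For example, a precedence-6 packet (ToS octet 0xC0) carries EXP 6 in the first model but belongs to class 2 under the table 1 folding.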
4. Measurements with bandwidth allocation to MPLS classes

Experiments were conducted using the test scenario illustrated in figure 1. R1, R2 and P were configured to run MPLS in downstream-on-demand mode with independent control. OSPF [13] was run inside the MPLS cloud, i.e. on R1, R2 and P, to learn IGP routes inside the AS. I-BGP was configured on R1 and R2, and networks connected to R1 were redistributed into BGP [14]. E-BGP was run between R3 and R2, since R3 was treated as an edge of a different network. GNU Zebra was run on H1, H2 and H3, and E-BGP sessions were established between the Linux hosts and the leaf routers. This enabled the dissemination of the routes belonging to the hosts into all other hosts and proper connectivity between the traffic sources and the traffic sink.

The preliminary objectives were to test the effects of relative and fixed bandwidth allocation to classes using WFQ and CBQ inside the MPLS cloud for full-blast and bursty traffic. The performance measures used to distinguish the results were the received and transmitted throughput at the sink and sources, respectively. No measurements on delay and delay jitter were performed due to resource constraints.

4.1 Baseline testing

Experiments were conducted to quantify the transmitted and received throughput at the sources and the sink without any CoS features enabled on any of the routers. Two full-blast flows with different ToS bits set were sent from H1 and H2, respectively, to H3. It can be seen from figure 1 that the flows traverse the links R1-P, P-R2 and R2-R3 before reaching the sink. In addition, a rate-limited flow of 10 Mbps was sent from H4 to the sink to cause additional congestion. All flows were of the UDP [15] traffic type. The transmitted and received throughputs observed are shown in table 2. The low received throughputs are due to the limited buffering capacity of the Cisco 7200s under high UDP loads.
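The baseline flows are distinguished only by their ToS bits. As a sketch of how a Netspec-style generator might mark a flow (a hypothetical helper, not Netspec's actual API), the ToS octet of a UDP socket can be set with a standard socket option:

```python
import socket

def marked_udp_socket(precedence: int) -> socket.socket:
    # Open a UDP socket whose outgoing packets carry the given IP
    # precedence in the top three bits of the ToS octet, the way a
    # traffic generator would mark one flow per priority level.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, precedence << 5)
    return s
```

A precedence-6 flow, for instance, would use a socket created with `marked_udp_socket(6)`, giving a ToS octet of 0xC0.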
4.2 Effect of relative bandwidth allocation to MPLS classes using WFQ

Since relative bandwidth support is easily provisioned using WFQ at a particular node, this set of experiments used per-CoS WFQ at the edge and core. Four classes were created (as in table 1) and a portion of the link bandwidth was allocated to each of the four classes. As a result, four LSPs were created per source-destination pair that shared the bandwidth allocated to each class, i.e. all class 1 LSPs, irrespective of source and destination, shared the bandwidth allocated to class 1.

4.2.1 Test with full-blast traffic

Four tests were conducted to study the effect of allotting different amounts of bandwidth to the MPLS classes when traffic belonging to different priorities was sent from the source to the sink. The description of the four tests and the observed results follow. In all the tests, two UDP full-blast flows (flow1 and flow2) with IP precedence of 0 and 6 were sent from H2 and H1, respectively, to the sink. In addition, a rate-limited 10 Mbps flow was sent from H4 to H3. IP CoS in the form of WFQ was enabled in the IP edge router, R3, and MPLS CoS in the form of per-CoS WFQ was enabled in all the routers in the MPLS cloud. The relative bandwidth assignment for routers in the IP and MPLS clouds for each of the tests is given in table 3. Figure 2 illustrates the received throughput for each of the traffic flows in the four tests.

Table 3. Assignment of relative bandwidth to MPLS classes
        Bandwidth allocated (% of link bandwidth)
Test #  Class 0  Class 1  Class 2  Class 3
Test1   10       20       40       30
Test2   10       10       70       10
Test3   1        0        99       0
Test4   99       0        1        0

Table 2. Baseline testing
Tx node  IP ToS set  Transmitted (Mbps)  Received (Mbps)
H2       0           131.71              37.445
H1       6           95.51               35.23
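The label layout this configuration creates, one LSP per (source, destination, class) so that all LSPs of a class draw on that class's bandwidth pool, can be sketched as follows. The names and the flat label numbering are illustrative assumptions, not the routers' actual allocation scheme.

```python
CLASSES = [0, 1, 2, 3]

def build_lsps(edge_pairs):
    # Return a label table keyed by (src, dst, class): four LSPs
    # per edge pair, one per MPLS class, each with its own label.
    table = {}
    label = 16  # labels 0-15 are reserved in MPLS
    for src, dst in edge_pairs:
        for cls in CLASSES:
            table[(src, dst, cls)] = label
            label += 1
    return table
```

With two edge pairs this yields eight distinct labels, and a scheduler can recover the class (and hence the bandwidth pool) of any packet from its label alone, as step (iv) in section 2 requires.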
Figure 2. Received throughput for full-blast traffic with WFQ

From the graph, the following can be observed.
(i) Unused bandwidth allocated to a class is shared by packets belonging to other classes.
(ii) There is no strict allocation of bandwidth; allocation is on a relative basis, as required for MPLS CoS. Tests 1 and 2 illustrate this behavior.
(iii) Allocating 99% of the bandwidth to a specific class does not starve packets belonging to other classes. This can be seen from tests 3 and 4, where a particular class is assigned only 1% of the bandwidth but packets belonging to that class are not dropped heavily.

It was also observed that MPLS signaling took place when the relative bandwidth parameter was changed at the ingress of the MPLS LSPs.

4.2.2 Test with bursty traffic

Since the majority of Internet traffic is bursty, tests were conducted to observe the effects of MPLS CoS on bursty traffic. The MPLS domain was configured to operate in multi-VC [4] mode and, as in the earlier tests, relative bandwidth was configured for each of the MPLS classes. In these tests, however, the relative bandwidth parameters were kept constant and traffic belonging to different priorities was sent from the source to the destination. The primary objective behind these tests was to observe how MPLS CoS reacts to traffic belonging to different priorities in the presence and absence of congestion. The description of the tests follows. In all of the tests, the relative bandwidth configuration was 10% to class 0, 20% to class 1, 40% to class 2, and 30% to class 3, with the percentages denoting the share of the aggregate link bandwidth allotted to the class. Three bursty flows (flow1, flow2, and flow3) were constrained to a bandwidth of 20 Mbps each and one bursty flow (flow4) was confined to a bandwidth of 6 Mbps on the path from H1 to H3. In addition, a full-blast flow (flow5) was sent from H2 to H3 to fill the available bandwidth on the links.
As before, a background 10 Mbps flow was sent from H4 to H3. The precedence levels associated with the different flows in the four tests are given in table 4. Precedence 6 was used to denote the highest priority because most network control traffic, such as routing updates and management protocol packets, uses IP precedence 6.

Table 4. Assignment of IP precedence to traffic flows with WFQ
Test #  Flow1  Flow2  Flow3  Flow4  Flow5
1       0      0      0      0      6
2       1      2      4      0      6
3       1      2      0      0      6
4       6      6      6      6      0

Figure 3 illustrates the received throughput for bursty traffic. From the graph and the above tests, the following were inferred.
(i) Higher priority traffic received a larger share of the unused bandwidth than lower priority traffic. This can be observed from tests 1, 2 and 3.
(ii) From tests 1 and 2, it can be seen that flow5 suffered a drop in received throughput when the bandwidth available to class 2 was shared by flows 2 and 5.
(iii) Bursty traffic had better aggregate throughput (105 Mbps + 10 Mbps) than full-blast traffic (85 Mbps + 10 Mbps) for the same per-CoS configuration.
(iv) MPLS CoS enabled using per-CoS WFQ did not allow a particular flow to be starved, even if the flow's class had a configured bandwidth less than the actual bandwidth of the flow.
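The relative-sharing behavior behind these inferences can be sketched with a simple fluid model: each backlogged class gets at least its weighted share of the link, and bandwidth a class cannot use is redistributed to the rest in proportion to their weights. This is an illustrative model (assuming positive weights), not router code.

```python
def wfq_shares(link, weights, demands):
    # Relative (WFQ-style) allocation: iteratively grant each still-
    # backlogged class its weighted share of the remaining capacity,
    # capping at its demand, until the link is full or all demands
    # are met -- so an under-used class's leftover is shared out.
    alloc = {c: 0.0 for c in weights}
    remaining = link
    active = {c for c in weights if demands[c] > 0}
    while active and remaining > 1e-9:
        total_w = sum(weights[c] for c in active)
        grants = {c: remaining * weights[c] / total_w for c in active}
        for c in active:
            alloc[c] += min(grants[c], demands[c] - alloc[c])
        remaining = link - sum(alloc.values())
        active = {c for c in active if demands[c] - alloc[c] > 1e-9}
    return alloc
```

Using the Test1 weights (10/20/40/30) on a 100 Mbps link with class 2 demanding only 10 Mbps, class 2 keeps its 10 Mbps while its unused 30 Mbps is split 10:20:30 among the other classes, the behavior observed in figure 2.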
Figure 3. Received throughput for bursty traffic with WFQ

4.3 Effect of bandwidth allocation to MPLS classes using CBQ

Experiments were conducted to observe the effects of Class Based Queueing on bursty traffic. The primary objective behind this sequence of tests was to observe starvation and fairness in MPLS CoS implemented using CBQ. CBQ was configured on all interfaces inside the MPLS cloud and on R3. In all of the tests, the precedence 6 class was allotted 50 Mbps of bandwidth and the precedence 1 class was allotted 20 Mbps. Traffic not conforming to the above two classes was configured with a bandwidth of 10 Mbps. As before, three bursty flows rate-limited to 20 Mbps and one bursty flow rate-limited to 6 Mbps were sent from H1 to H3, along with a full-blast flow from H2 to H3. A background 10 Mbps flow was also sent from H4 to H3. The precedence associated with the different flows is given in table 5.

Table 5. Assignment of IP precedence to traffic flows with CBQ
Test #  Flow1  Flow2  Flow3  Flow4  Flow5
1       0      0      0      1      6
2       1      1      1      1      6
3       6      6      6      6      0

Figure 4 illustrates the received throughput in the CBQ case.

Figure 4. Received throughput for bursty traffic with CBQ

It can be observed from the graph that CBQ does not forward packets strictly based on the fixed bandwidth allocated to a class. This is done primarily to ensure that packets belonging to classes with little or zero bandwidth allocation do not get starved. The difference in received throughput for flow5 in tests 1, 2 and 3 clearly indicates the relative-bandwidth nature of CBQ as used to provide MPLS CoS.

5. Effects of improper CoS mapping

When a packet traverses multiple domains with different layer-2 technologies, or technologies that can cause a different treatment to be applied to the packet in one domain versus another, there is a possibility that the packet will receive undesirable processing in transit.
As an example, if a packet belonging to precedence 6 is mapped onto the ATM CBR service in one domain and onto the ATM UBR service in another domain, then the end-to-end throughput and end-to-end behavior of the packet may not match the desired values. Hence, it becomes
imperative to map CoS properly at the edges and to reinforce a similar mapping inside the core.

Tests were conducted to study the effects of turning off CoS in either the MPLS or the IP domain of figure 1. Various combinations were tested; a description of the tests follows. In all the tests, two full-blast UDP flows were sent from H1 and H2, respectively, to the sink H3. The flow from H1 had a precedence of 0 and the flow from H2 had a precedence of 6. A 10 Mbps rate-limited flow was also sent from H4 to H3. Test 1 consisted of baseline testing, as described in section 4.1. In test 2, CoS treatment was turned off inside the MPLS cloud while CoS treatment was given to packets at the IP edge (R3); WFQ was configured on all the interfaces of R3. In test 3, CoS was configured on the interfaces connecting R3 and R2 alone; no CoS feature was configured on any of the other interfaces. In test 4, CoS was configured on all the interfaces in the IP and MPLS clouds; the CoS treatment was provided by employing per-CoS WFQ inside the MPLS domain and WFQ at the IP edge. In test 5, CoS was configured on all the routers inside the MPLS domain, and no CoS features were enabled on the interface connecting R2 and R3.

The received throughput at H3 for the two full-blast flows is illustrated in figure 5. From tests 2 and 3, it can be seen that CoS enabled on R3 alone has no effect on received throughput: since the packets get best-effort treatment in the first domain (the MPLS domain), there is no change in observed throughput. Upon enabling CoS in the first domain (test 4), it can be observed that the higher priority flow gets better aggregate throughput than the lower priority flow. When CoS was turned off in the second domain (the IP edge), the throughput measured for the higher priority flow was less than that with CoS enabled.
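The mismatch these tests probe can be stated as a pair of per-domain mapping tables. The tables below are hypothetical, chosen to mirror the CBR-versus-UBR example earlier in this section; consistent CoS mapping means a packet sees the same service class in every domain it crosses.

```python
# Hypothetical per-domain tables: IP precedence -> ATM service class.
# DOMAIN_B is deliberately mis-provisioned, demoting precedence 6
# to best effort.
DOMAIN_A = {6: "CBR", 4: "VBR-nrt", 0: "UBR"}
DOMAIN_B = {6: "UBR", 4: "UBR", 0: "UBR"}

def treatments(precedence, path):
    # Service class the packet receives in each domain on its path;
    # unmapped precedence values fall back to best effort (UBR).
    return [domain.get(precedence, "UBR") for domain in path]

def mapping_consistent(precedence, path):
    # End-to-end CoS is preserved only if every domain agrees.
    return len(set(treatments(precedence, path))) == 1
```

Here a precedence-6 packet crossing DOMAIN_A then DOMAIN_B receives CBR treatment in the first domain but UBR in the second, so its end-to-end behavior is governed by the weaker of the two, exactly the degradation measured in test 5.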
These tests illustrate, to a certain extent, how the presence or absence of CoS features affects end-to-end packet behavior and throughput when a packet traverses multiple domains.

Figure 5. Received throughput for full-blast flows traversing multiple networks

6. Conclusion

This paper presented some experiences with Class of Service (CoS) translations in IP and MPLS networks. The effects of per-CoS WFQ and CBQ inside an MPLS domain were evaluated using laboratory experiments. It was observed that MPLS CoS provided relative bandwidth allocation to traffic classes, hence preventing starvation of low priority traffic. The behavior was analyzed for both full-blast and bursty UDP traffic with respect to transmitted and received throughputs. The effects of improper CoS mapping as a packet traverses multiple networks have also been studied and discussed.

Future work should focus on testing with Differentiated Services (DiffServ) instead of IP CoS, and architectures for proper CoS mapping when a packet traverses multiple networks should be developed. This effort focused mainly on UDP traffic; it would be enlightening to study the effects of MPLS CoS on TCP traffic as well. It would also be useful to study the effects of IP to MPLS CoS mapping over IEEE 802.3 and IEEE 802.11 technologies.
References

[1] A. Demers, S. Keshav and S. Shenker, "Analysis and Simulation of a Fair Queueing Algorithm," Proceedings of ACM SIGCOMM, pp. 3-12, 1989.
[2] S. Floyd and V. Jacobson, "Link-sharing and Resource Management Models for Packet Networks," IEEE/ACM Transactions on Networking, Vol. 3, No. 4, pp. 365-386, Aug 1995.
[3] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label Switching Architecture," RFC 3031, Jan 2001.
[4] F. Le Faucheur, et al., "MPLS Support of Differentiated Services," work in progress, Apr 2001.
[5] P. Almquist, "Type of Service in the Internet Protocol Suite," RFC 1349, Jul 1992.
[6] E. Rosen, et al., "MPLS Label Stack Encoding," RFC 3032, Jan 2001.
[7] DARPA, "Internet Protocol," RFC 791, Sep 1981.
[8] Netspec, traffic generator tool, www.ittc.ukans.edu/netspec
[9] L. Andersson, P. Doolan, N. Feldman, A. Fredette, B. Thomas, "LDP Specification," RFC 3036, Jan 2001.
[10] D. Awduche, L. Berger, D.-H. Gan, T. Li, V. Srinivasan, G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels," work in progress, Feb 2001.
[11] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp. 397-413, Aug 1993.
[12] A. Romanow and S. Floyd, "Dynamics of TCP Traffic over ATM Networks," IEEE JSAC, Vol. 13, No. 4, pp. 633-641, May 1995.
[13] J. Moy, "OSPF Version 2," RFC 2328, Apr 1998.
[14] Y. Rekhter, T. Li, "A Border Gateway Protocol 4 (BGP-4)," RFC 1771, Mar 1995.
[15] J. Postel, "User Datagram Protocol," RFC 768, Aug 1980.