Application Note

Testing VoIP on MPLS Networks

Why does MPLS matter for VoIP?

Multi-protocol label switching (MPLS) enables a common IP-based network to be used for all network services and for multiple customers of a network operator. It allows IP networks to carry voice, data, and video traffic with differentiated service-level performance parameters. MPLS also enables virtual private network (VPN) services over IP networks, so that a network operator can offer private networking services to multiple customers on a shared infrastructure. Although MPLS may be used with non-IP networks, it is IP networking, and more specifically data, voice, and video services over IP networks, that makes MPLS an attractive and growing technology.

MPLS is used to ensure that all packets in a particular flow take the same route over a backbone. Deployed by many telcos and service providers, MPLS can enable traffic engineering to deliver the quality of service (QoS) required to support real-time voice and video, as well as service level agreements (SLAs) that guarantee bandwidth.

An MPLS network ingress element attaches labels to IP packets. These labels instruct the routers and switches in the network where to forward the packets based on pre-established IP routing information. Label switched paths (LSPs) are defined in routing tables and are used to send tagged packets on specific paths through the network. LSPs represent a new type of virtual path for segregating traffic in an IP network.

MPLS has been applied to implementing Virtual Private Networks, which are a key revenue generator for service providers offering enterprise services. MPLS has also been applied to enabling Class of Service (CoS) for multi-service networks (voice, data, video) in conjunction with techniques including DiffServ.
For example, MPLS is used to distinguish and prioritize VoIP traffic on a common IP network to ensure VoIP is delivered with higher QoS objectives, and even to offer different levels of VoIP service quality. MPLS can also be used to manage VoIP performance within a VPN. The different QoS requirements of VoIP traffic, whether on a wholesale basis when transported en masse on a common IP network or within VPNs, can be met by using MPLS in conjunction with DiffServ, proper traffic engineering, and other techniques.

Challenges with VoIP

Voice is a real-time service. It must be delivered with minimal delay (150 milliseconds end-to-end is a common recommendation), and it must be reproduced as a constant bit stream on the egress network or endpoint. Because of the delay requirements, IP retransmissions are not allowed. Therefore, packets that are dropped on the network, or late packets dropped by a jitter buffer, are not saved or reproduced. IP's best-effort delivery and non-deterministic routing introduce delay and, more importantly, variance in delay, also known as packet jitter, in the voice transmission. VoIP packets may be lost because they are dropped in router queues or by the jitter buffer.

IP Performance Parameter    End user and IP network impacts
Packet Loss                 Impacts voice clarity
Packet Delay                Impacts voice delay and echo delay; increases packet loss
Packet Jitter               Impacts voice clarity; increases packet loss
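To make the notion of packet jitter concrete, the following is a minimal sketch of the running interarrival jitter estimate defined in RFC 3550 (the RTP specification). It is not taken from this note or from any analyzer; the function name is hypothetical, and it assumes send and receive times are already expressed in milliseconds on comparable clocks, whereas RTP itself works in timestamp units.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550, section 6.4.1.

    Hypothetical helper for illustration: both lists hold one entry per
    packet, in milliseconds. Only differences in transit time matter, so a
    constant clock offset between sender and receiver cancels out.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        transit_prev = recv_times_ms[i - 1] - send_times_ms[i - 1]
        transit_cur = recv_times_ms[i] - send_times_ms[i]
        # Change in transit time between consecutive packets.
        d = abs(transit_cur - transit_prev)
        # Exponential smoothing with gain 1/16, as in RFC 3550.
        jitter += (d - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    # Packets sent every 20 ms; the third one arrives 16 ms late.
    print(interarrival_jitter([0, 20, 40], [50, 70, 106]))  # prints 1.0
```

Because the estimate is smoothed, a single late packet moves it only slightly, which is one reason average jitter figures can hide short bursts of late packets.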
VoIP packet loss, delay, and jitter are the key performance indicators of IP network performance. These IP network performance conditions interact with VoIP processes (codecs, jitter buffers, packet loss concealment, echo cancellers, silence suppression, etc.) in complex ways to impact voice quality. The complex interactions of these processes and conditions can get out of control and not only degrade voice quality, but do so in a way that is difficult to diagnose and analyze. Figure 1 illustrates some of the cause-effect relationships inherent among network conditions, network performance parameters, VoIP processing parameters, and end user service quality.

Figure 1

Any notion that end user voice quality can be determined by average loss and jitter measurements must be challenged by this representation. It is only in the context of a packet stream that these performance indicators matter. Therefore, analysis for the purpose of troubleshooting and diagnosing problems must focus on a problem domain. A problem domain may be a segment of the network, traffic for a specific customer, traffic from a specific endpoint, a virtual network, a call, or any combination of these.

How to address these problems

By the time a network operator finally determines that service quality has deteriorated, frustrated customers may have already hung up on both the call and the service. To attract and retain customers, network operators must have effective means for detecting service impairments and quickly diagnosing the root cause so that the impairments can be fixed and the service levels restored. Considering the effectiveness and popularity of service-level management and customer-centric management techniques, a top-down approach to problem resolution makes sense.
This approach simplifies and expedites problem resolution by allowing network technicians to use service-level parameters to represent the impacts of network-level parameters, thus abstracting a complex level of diagnostics and presenting information for human consumption. This approach also makes more efficient use of limited personnel. A problem is quantified and assessed based on its potential impact on customers, thus focusing the valuable time of technicians on resolving the problems that impact customers and on restoring service levels.
It is ultimately problems that impact end user service that must be detected. Since several network performance parameters can affect end user service in complex ways, it is useful to measure and present end user service performance as a starting point. Such testing can be performed using intrusive or non-intrusive methods. Ultimately, a combination of both provides the most robust and effective means.

For diagnosis and root-cause analysis of customer-impacting problems in a live network, non-intrusive testing is the most common and efficient means. Non-intrusive testing is simple and relatively inexpensive because it does not require end-to-end equipment and control; it can be done at a single point in the network. It also does not consume network bandwidth or traffic resources. It offers visibility into network performance (e.g., packet loss and jitter) and, based on new developments that enable speech quality to be accurately predicted, into end user service, including voice quality scores such as MOS and R Factor. Finally, non-intrusive testing can measure actual customer traffic. Therefore, it can be used for many Operations Support Systems (OSS) and Business Support Systems (BSS) applications.

A top-down approach, such as that illustrated in Figure 2, starts with detecting a customer-impacting problem. This can be done with service-level metrics such as MOS or R Factor measurements, which are provided via non-intrusive testing on the JDSU Network Analyzer.

Figure 2: A top-down approach to VoIP problem resolution

When a problem is detected, a technician identifies exactly what the problem is and where it is on the network. Typically, VoIP problems are due to excessive RTP packet loss or jitter on the network. Identifying this requires analysis of RTP packet loss and jitter for individual VoIP call streams. However, prior to delving into the next layers, it is important to know the nature of impairments such as loss and jitter.
That is, are these impairments bursty or random? Are they transient or persistent? Do they occur in both streams/directions of a single call or only in one? Do they occur at the beginning of calls or later? It is also valuable to locate where on the network they may be occurring. By performing several single-point measurements on different segments of the network (using a portable Network Analyzer, for example), one can quickly isolate these impairments. A little intuition, along with knowledge of the network topology and the results of the RTP analysis, can expedite this isolation process.

Finally, in order to fix the problem, one must diagnose the root cause. This requires analysis of the underlying network layers. The Network Analyzer provides integrated real-time layer 1-7 analysis of nearly all network technologies and protocols in use today to expedite the VoIP problem resolution process. An example follows.

Step 1: A network technician views a list of VoIP calls on a network link and the corresponding MOS for each call. The technician sorts calls on MOS from lowest to highest, and then sees several calls with MOS falling below 3.0.
Step 2: The technician views RTP performance statistics for bad calls and checks whether excessive packet loss or jitter is directly correlated with drops in MOS. This is shown in the Network Analyzer screen in Figure 3. In Figure 3a, MOS, packet loss, packet jitter, and overall QoS are graphed normalized to their pre-set red/yellow/green thresholds. One can see that when MOS breaks a threshold into the red zone, it precedes a red zone entry of packet loss. This is also indicated in Figure 3b, which shows MOS, packet loss, and packet jitter graphed at their absolute values. Here, a drop in MOS score (indicating a drop in voice quality) directly precedes a rise in packet loss. One can correlate the voice quality impairment with packet loss. One also sees a rise in packet jitter, but not a corresponding drop in MOS, indicating that the packet jitter was not high enough to cause a noticeable impairment in voice quality. The Network Analyzer also shows the technician the nature of loss or jitter: random or bursty, persistent or transient.

Figure 3a: Graph of Threshold Normalized MOS, packet loss, packet jitter and overall QoS
Figure 3b: Graph of Absolute MOS, packet loss, packet jitter and overall QoS

If poor MOS values are observed, but there is no packet loss on the network and average jitter for each call is within acceptable limits (e.g., 50 msec), view frame-by-frame jitter to see if a burst of late packets occurs. This is shown in Figure 4. Such a burst may not contribute significantly to an average jitter measurement, but the late packets will nonetheless be dropped by the jitter buffer at the destination. This has the same impact as packets dropped on the network, and may contribute to voice quality problems.

Figure 4: Bursts of high packet jitter
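The point about bursts hiding behind an acceptable average can be illustrated with a short sketch. It is not the analyzer's algorithm; the function name, the `min_run` parameter, and the sample values are assumptions chosen for illustration. It takes per-packet delay variation in milliseconds and reports runs of consecutive packets that exceed a given jitter buffer depth.

```python
def late_packet_bursts(jitter_ms, buffer_ms, min_run=2):
    """Return (first_index, last_index) for each run of late packets.

    Hypothetical helper: a packet counts as late when its delay variation
    exceeds the jitter buffer depth, and a burst is at least `min_run`
    consecutive late packets.
    """
    bursts = []
    start = None
    for i, value in enumerate(jitter_ms):
        if value > buffer_ms:
            if start is None:
                start = i  # a new run of late packets begins here
        elif start is not None:
            if i - start >= min_run:
                bursts.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the capture.
    if start is not None and len(jitter_ms) - start >= min_run:
        bursts.append((start, len(jitter_ms) - 1))
    return bursts


if __name__ == "__main__":
    # 95 packets with 5 ms variation, then a burst of 5 packets at 120 ms.
    samples = [5] * 95 + [120] * 5
    print(sum(samples) / len(samples))       # prints 10.75, well under 50 msec
    print(late_packet_bursts(samples, 60))   # prints [(95, 99)]
```

In this made-up capture the average stays far below the 50 msec guideline, yet five consecutive packets would miss a 60 msec jitter buffer and be discarded.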
Step 3: Drill down to integrated layer 1-7 analysis to determine the root cause of the service quality problem, regardless of the infrastructure, technologies, and protocols used to transport VoIP. The Network Analyzer provides extensive measurement capability over all common layer 1 and 2 technologies and protocols.

View layer 1-3 performance measurements to determine if high utilization, low throughput, errored frames, or another network condition has contributed.
Run Expert Analyzer and Protocol Vitals to see key network health and performance measurements at layers 2 and 3.
See layer 2-7 protocol events and stats available in Commentators, Connection Stats, and Protocol Stats.
View physical layer events in Line Vitals: performance and events on Ethernet, ATM, multiport ATM IMA, FR, PoS, HDLC, and SDLC.

Underlying most VoIP problems is a network problem. If VoIP problems are to be fixed and service levels restored, the capability to diagnose the underlying network problem is essential. The Network Analyzer provides this top-to-bottom approach and complete layer 1-7 analysis capability across all network topologies, technologies, and protocols.

After the root cause is diagnosed and fixed, one can verify the restoration of service levels using end-to-end active testing methods such as those provided by the JDSU Voice Quality Tester (VQT). The VQT can measure end user voice quality from many different end user access points, including analog POTS service, 10/100 Ethernet ports, IP phones, PBX phones, and other phones. The VQT is useful for certifying service levels on new VoIP deployments and for verifying service levels after performing a problem resolution operation as described above.

How MPLS-enabled analysis helps

The application of MPLS to IP networks adds a new challenge, but also a new opportunity, for VoIP analysis and testing.
The challenge is that VoIP traffic from multiple different virtual networks and service class tiers is mixed on common physical links. The opportunity is that MPLS provides a means to separate this traffic for targeted analysis. It simply requires the right tools to do this. The Network Analyzer includes many MPLS capabilities that not only enable targeted VoIP analysis over MPLS in specific domains (VPNs, LSPs, service tiers, etc.), but also enable analysis of underlying network layers, including MPLS networks.

When MPLS is used in DiffServ architectures to provide prioritization for different services, diagnosing performance problems can be significantly expedited by focusing the analysis domain on specific LSPs. Using the Network Analyzer, one can see VoIP performance for all traffic on a link and for traffic in specific domains. These domains can be based on MPLS LSP, IP address, VLAN, ATM VP.VC, or FR DLCI.

To analyze in a specific domain, one first applies a capture filter for a specific LSP. A filter can be applied for up to 6 label values in a label stack. One can further filter on specific values for Class of Service (CoS). This enables one to then analyze VoIP for that LSP, for targeted analysis of performance for that service tier. One can view traffic performance, for both VoIP and non-VoIP traffic, per LSP and CoS.

For underlying network analysis, without LSP capture filters applied, one can view traffic stats (utilization, DLL errors, throughput) per LSP for all LSPs. This indicates whether performance impairments are specific to LSPs. Such conditions may be addressed by applying different prioritizations or routes. One can also view frame size distribution per LSP to see if excessive segmentation could contribute to network performance issues.

When MPLS is used to provide VPN services, diagnosing performance problems can be significantly expedited by again focusing the analysis domain on specific LSPs for specific VPNs. Using the Network Analyzer, one can see VoIP performance for traffic on a specific LSP for a VPN. Other domains for targeted traffic analysis include VLAN, ATM VP.VC, and FR DLCI.

One first applies a capture filter for a specific LSP. Since label stacks are sometimes used to differentiate traffic within a VPN, a filter can be applied for up to 6 stacked values in a label stack. One can further filter on specific values for Class of Service (CoS). This enables one to then analyze VoIP for a specific VPN, and even for a specific tier of service or type of traffic (e.g., VoIP) for a VPN. One can view traffic performance, for both VoIP and non-VoIP traffic, per LSP and CoS.

For underlying network analysis, without LSP capture filters applied, one can view traffic stats (utilization, DLL errors, throughput) per LSP for all LSPs. This indicates whether performance impairments are specific to LSPs, and may trigger the need to re-engineer routing or prioritization for a specific VPN. One can also view frame size distribution per LSP to see if excessive segmentation could contribute to network performance issues.

Problem solving guide using the DNA PRO or DNA MX

Plan
Establish VoIP service objectives to determine measurement thresholds for detecting and troubleshooting problems, and for verifying service restoration.
Primary service objectives refer to end user metrics such as those for voice quality: clarity, delay, echo, etc. These will drive what the secondary service objectives should be.
Secondary service objectives refer to performance parameters that are needed to deliver primary service objectives. These include IP network performance parameters such as packet loss, jitter, and delay.

Connect
Connect the DNA PRO or MX at the points of IP network access for VoIP gateways and phones. The DNA may be connected to the network segment via a Test Access Port (TAP) or data access switch, or may be connected directly in-line or off a switch span port.
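The label-stack and CoS filtering described in the MPLS analysis section can be sketched in a few lines. This is an illustrative sketch, not the Network Analyzer's filter implementation. It relies on the standard MPLS label stack entry layout (a 32-bit word holding a 20-bit label, a 3-bit field commonly used for CoS, a bottom-of-stack bit, and an 8-bit TTL). The function names, the choice to match CoS on the outermost entry, and the label values in the usage example are assumptions.

```python
import struct


def parse_label_stack(data, max_depth=6):
    """Decode up to `max_depth` MPLS label stack entries from raw bytes.

    Hypothetical helper: `data` starts at the first label stack entry.
    The depth limit of 6 mirrors the filter depth mentioned in the text.
    """
    entries = []
    offset = 0
    while offset + 4 <= len(data) and len(entries) < max_depth:
        (word,) = struct.unpack_from("!I", data, offset)
        entry = {
            "label": word >> 12,          # top 20 bits
            "cos": (word >> 9) & 0x7,     # 3-bit EXP / Traffic Class field
            "bottom": bool((word >> 8) & 0x1),
            "ttl": word & 0xFF,
        }
        entries.append(entry)
        offset += 4
        if entry["bottom"]:
            break
    return entries


def matches_filter(entries, labels, cos=None):
    """True if the stack begins with `labels` and, optionally, the
    outermost entry carries the requested CoS value (an assumption)."""
    stack = [e["label"] for e in entries]
    if stack[: len(labels)] != list(labels):
        return False
    return cos is None or (bool(entries) and entries[0]["cos"] == cos)


def encode_entry(label, cos, bottom, ttl):
    """Build one label stack entry; used here only to create test data."""
    return struct.pack("!I", (label << 12) | (cos << 9) | (int(bottom) << 8) | ttl)


if __name__ == "__main__":
    # Made-up two-label stack: outer LSP label 100, inner VPN label 2001.
    raw = encode_entry(100, 5, False, 64) + encode_entry(2001, 5, True, 64)
    stack = parse_label_stack(raw)
    print(matches_filter(stack, [100, 2001], cos=5))  # prints True
    print(matches_filter(stack, [200]))               # prints False
```

Matching on a prefix of the stack lets one filter select a whole LSP (outer label only) or a single VPN inside it (outer and inner labels).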
Configure
Set up capture filters for a specific MPLS LSP and/or CoS. This applies to MPLS domain analysis; for other domains, filter on VLAN, VP.VC, DLCI, address, or protocol.
To optimize real-time performance, set up a capture filter on protocol = RTP. When troubleshooting root cause, disable this capture filter to view non-VoIP traffic and events on the network.
To optimize real-time performance, configure the RTCP Monitor measurement to generate MOS scores and RTP stats for 20 RTP streams.

Monitor and detect
To run in real time, begin the RTCP Monitor measurement in the Network Analyzer.
Sort on the MOS column to place the calls with the worst overall quality on top.
When MOS falls below threshold for one or more calls, stop the capture to perform drill-down analysis on the traffic and events that contributed to the drop in MOS.

Identify problem
View the graph of MOS, packet loss, and jitter measurements (see Figure 3) for the worst calls to determine the nature of the problem: bursty loss, random loss, or excessive jitter (which contributes to loss at the jitter buffer).
If jitter is excessive, note the round-trip delay measurement to see if that parameter is contributing to jitter.
On the graph, right-click on a spike in a measurement and select the frame-by-frame graph. Compare high frame-by-frame jitter values with jitter buffer settings. If measured jitter is more than the jitter buffer setting, VoIP packets are probably being dropped. This has the same effect as packet loss.
On the frame-by-frame graph, right-click on a spike in a measurement or a missing packet, and select to see decodes. This will show the RTP packet in context with other frames received.

Analyze root cause
View layer 1-3 performance measurements to determine if high utilization, low throughput, errored frames, or another network condition has contributed. There are many measurements on the Network Analyzer to help determine what is happening on the network. Some examples are:
Run Protocol Vitals from the capture buffer and observe pre-filter stats to determine if any particular type of traffic is congesting the network.
Run Expert Analyzer, Protocol Stats, Connection Stats, and Node Stats from the capture buffer to determine if any events or data traffic are impacting VoIP traffic.
View layer 1 physical layer events in Line Vitals.
Disable capture filters and run a data capture to collect all traffic on the network.
View the LSP Statistics measurement to see performance of all LSPs on the network link. If VLAN, VP.VC, or DLCI are used, run these measurements to see performance of all virtual channels.
If packets are dropped on the network (determined by the TNA packet loss measurement) or at the jitter buffer (determined by the TNA packet jitter measurement), then that is a potential source of poor voice quality. Determine if VoIP can be prioritized higher in queues, or rerouted during peak traffic periods. Also try the G.711 codec to reduce the impact of dropped packets, or try smaller frame sizes (10 msec instead of 20 or 30 msec).

Verify
After diagnosing the root cause and restoring the service, verify that service levels are restored by measuring end-to-end service quality using the VQT.
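The suggestion under Analyze root cause to try smaller frame sizes involves a trade-off: each dropped packet loses less speech, but the packet rate and header overhead rise. The arithmetic can be sketched as follows. The figures assume G.711 at 64 kbit/s, uncompressed IPv4, UDP, and RTP headers (20 + 8 + 12 bytes), one 4-byte MPLS label, and no layer 2 framing; the function and constant names are hypothetical.

```python
G711_BITRATE_BPS = 64_000        # G.711 codec payload rate
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12
MPLS_LABEL_BYTES = 4             # per label stack entry


def g711_stream_cost(frame_ms, mpls_labels=1):
    """Return payload size, packet rate, and bandwidth for one direction.

    Illustrative arithmetic only: layer 2 framing and any header
    compression are ignored (assumptions stated in the lead-in).
    """
    payload_bytes = G711_BITRATE_BPS * frame_ms // 8000
    packets_per_second = 1000 / frame_ms
    packet_bytes = (payload_bytes + IP_UDP_RTP_HEADER_BYTES
                    + mpls_labels * MPLS_LABEL_BYTES)
    return {
        "payload_bytes": payload_bytes,
        "packets_per_second": packets_per_second,
        "bandwidth_bps": packet_bytes * 8 * packets_per_second,
        "speech_lost_per_drop_ms": frame_ms,
    }


if __name__ == "__main__":
    for frame_ms in (10, 20, 30):
        print(frame_ms, g711_stream_cost(frame_ms))
```

Under these assumptions, moving from 20 msec to 10 msec frames halves the speech lost per dropped packet while raising per-direction bandwidth from 81.6 kbit/s to 99.2 kbit/s, so the change is worth verifying against link utilization on the affected LSP.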
Conclusion

MPLS enables service providers to deliver new services governed by specific SLAs and CoS. These services comprise the triple play: offering real-time voice and video along with data on a common network. It is the real-time services like voice that will generate the most revenue. The ability to keep these services running at quality levels that meet customer expectations is crucial to retaining customers and realizing revenues. While MPLS introduces new challenges to diagnosing and troubleshooting service-level problems, advanced tools like the JDSU Network Analyzer make this job simple and fast for next generation network engineers and technicians.

Figure 5: Distributed Network Analyzer: Solutions for voice, data, video and mobile network test.

Product specifications and descriptions in this document subject to change without notice. JDS Uniphase Corporation VOIPMPLS.AN.NSD.TM.AE June 2010