How To: Diagnose Poor VoIP Calls through Diagnostics

When problems arise with poor VoIP phone calls, where do you start looking to troubleshoot the issue? This is one of the many challenges of managing an IP Telephony environment. The end user has to deal with an issue they rarely came across in the traditional telephony world: a bad phone call. Couple that with a vague explanation from the end user of what the bad call was like (laden with echoes, noise, and delays in the conversation), and as a voice or network engineer you may have a hard time figuring out where to start diagnosing the issue. Is it the IP Telephony system? Is the network having issues? Is the user simply imagining an echo? How do you find out what is really going on?

Another factor to consider is that a call is only as good as its weakest link. If one hop on the path of a phone call crosses a router with an incorrect QoS configuration, or even has the dreaded half-duplex mode still enabled, the entire phone call will suffer.

Soft phones are also heavily in use. They are easy to deploy and cost much less than VoIP phones. However, they are just as susceptible to the same problems as real phones, and they bring other factors into play, such as the performance of the desktop or laptop, that may hinder call quality.

Luckily, the industry has some defined standards for measuring call quality. They are as follows:

Metrics                     Acceptable    Good
MOS (Mean Opinion Score)    >3.6          >4.08
R-Value                     >70           >80
Delay                       <400ms        <150ms
Jitter                      <150ms        <60ms
Packet loss                 <1%           <0.5%

Furthermore, we have to choose which codec to use for audio compression, depending on the bandwidth available to us. Common codecs include G.711u/a and G.729. The difference is that the latter produces a compressed voice payload one-eighth the size.
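The codec trade-off above can be made concrete with a little arithmetic. The following is a minimal sketch, not something taken from Vivinet Diagnostics. It assumes the nominal codec bit rates (64 kbit/s for G.711, 8 kbit/s for G.729), a 20 ms packetisation interval, and 40 bytes of IPv4, UDP, and RTP headers per packet. Layer-2 overhead is ignored, and the function names are purely illustrative.

```python
# Assumptions (not from the article): 20 ms of audio per packet and
# 40 bytes of IPv4 + UDP + RTP headers; layer-2 framing is ignored.
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP

# Nominal codec bit rates in bits per second.
CODEC_BITRATE_BPS = {"G.711": 64_000, "G.729": 8_000}


def payload_bytes(codec, packet_ms=20):
    """Bytes of encoded audio carried in one packet."""
    bits_per_packet = CODEC_BITRATE_BPS[codec] * packet_ms // 1000
    return bits_per_packet // 8


def ip_bandwidth_bps(codec, packet_ms=20):
    """One-way bandwidth of a call at the IP layer, headers included."""
    packets_per_second = 1000 / packet_ms
    packet_size = payload_bytes(codec, packet_ms) + HEADER_BYTES
    return packet_size * 8 * packets_per_second


if __name__ == "__main__":
    for codec in CODEC_BITRATE_BPS:
        print(codec, payload_bytes(codec), "bytes/packet,",
              ip_bandwidth_bps(codec) / 1000, "kbit/s")
```

Under these assumptions the G.729 payload is one-eighth of the G.711 payload, but because the headers do not shrink, the bandwidth on the wire falls by a smaller factor.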
Because the voice packet is compressed, there is considerably less margin for error before network issues affect voice quality. Things really do need to be working well if you are using VoIP over small network links.

NetIQ's Vivinet Diagnostics application gives us a means to find the cause of the problem. The test is based on synthetic data generated between two locations on the internal network. This way we can at least isolate whether the problem is internal or whether it is part of the WAN link (if there is one). NetIQ can manage IP Telephony solutions from Cisco, Nortel, Avaya, or Microsoft, but in the example below we will focus on using software endpoints (probes on the network) to generate a synthetic call using the same codec, and possibly the same QoS tags, that are in use in the production voice system. We will then
execute a manual test between these two endpoints to determine what degradation occurs in the phone call and where to start troubleshooting the issue. This type of diagnosis is vendor-agnostic, as it works with any IP Telephony platform. However, the most detailed network analysis is achieved on networks that use Cisco or Nortel switches or routers.

What you need:

1. NetIQ Performance endpoints
   a. Download and install for the operating system of choice:
      i. http://www.netiq.com/support/pe/upgrade.asp
2. NetIQ Vivinet Diagnostics
   a. Download and install (please contact your local NetIQ Representative for a trial key - http://www.netiq.com/trial/contacts.asp )
      i. You can download a trial here > http://www.netiq.com/products/vd/default.asp
3. A PC to run the Diagnostics application (can be a laptop, desktop, or server running Windows Server 2000 SP3+, or 2003 Server)
4. At least 2 PCs to run the endpoint software (can again be a laptop, desktop, or server running Windows 2000 SP3+, XP, 2003 Server, or Linux)

It is strongly recommended that the PCs running the endpoint software sit as close to the phones on the network as possible. By this we mean the same VLAN, if possible. That way we get the most accurate depiction of all the hops in the network that the voice traffic traverses to reach its destination.

In the main Diagnostics window, you can define your call test. Here we will define the IP addresses of the endpoints. However, if you provide the logon information for a Cisco Call Manager or Nortel CS1K, you could also target real phones.
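The QoS tag mentioned earlier is typically a DSCP value carried in each IP packet. As a rough illustration of what tagging synthetic voice traffic involves (this is not how the NetIQ endpoints are implemented), the sketch below marks a UDP socket with DSCP 46 (Expedited Forwarding), a marking commonly used for voice. The choice of EF and the function names are assumptions, and some operating systems, Windows in particular, may ignore the IP_TOS socket option.

```python
import socket

# Assumption: voice is marked Expedited Forwarding (DSCP 46).
# Check the value your production voice system actually uses.
EF_DSCP = 46


def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP value into the upper bits of the 8-bit TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp=EF_DSCP):
    """Return a UDP socket whose outgoing packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The operating system may silently ignore or override this option.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print("TOS byte for EF:", hex(dscp_to_tos(EF_DSCP)))
```

If a router along the path does not honour or preserve this marking, the test call loses its priority at that hop, which is the kind of incorrect QoS configuration this diagnosis is meant to expose.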
Once you have chosen the appropriate codec and QoS tag (if one is in use), running the diagnostic automatically populates the trace of the call.

Note: Diagnostics needs the SNMP read strings in use on the routers and switches in order to interrogate them. Enter them in the Options > SNMP menu.

Once the diagnosis is complete, red X marks and warning signs show where issues were found. Clicking on an object or link yields information about that device or connection.
Next we see information collected on the network devices and links. The real guts of a problem with a call come from the report view. This view takes all the issues that Diagnostics has found during its analysis (based on its built-in knowledge) and presents them in a prioritised list, with the most common issues first. It also shows the results of the synthetic call measured against the industry standard call quality targets described above. We therefore see not only the quality of the test call, but also all the possible issues along the way that could have caused problems. The analysis also takes into account that RTP (voice stream) traffic is one-way, so a separate trace is made for each direction. This lets us determine whether we are running into asymmetric routing issues.

The following screen is a summary of the problems for the report view.
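To illustrate how measured results relate to the quality targets listed earlier in this article, here is a minimal sketch that rates each metric as good, acceptable, or poor. It is not NetIQ's scoring logic; the dictionary keys and function names are hypothetical. The R-value to MOS conversion uses the published E-model formula from ITU-T G.107, included only to show how the two scores relate.

```python
# Targets copied from the table earlier in this article.
# "higher" means bigger is better; "lower" means smaller is better.
TARGETS = {
    "mos": ("higher", 3.6, 4.08),
    "r_value": ("higher", 70, 80),
    "delay_ms": ("lower", 400, 150),
    "jitter_ms": ("lower", 150, 60),
    "loss_pct": ("lower", 1.0, 0.5),
}


def rate(metric, value):
    """Rate one measurement as 'good', 'acceptable', or 'poor'."""
    direction, acceptable, good = TARGETS[metric]
    if direction == "lower":
        # Negate so that "bigger is better" applies in both cases.
        value, acceptable, good = -value, -acceptable, -good
    if value > good:
        return "good"
    if value > acceptable:
        return "acceptable"
    return "poor"


def rate_call(measurements):
    """Rate every metric in a dict such as {'delay_ms': 120, ...}."""
    return {name: rate(name, value) for name, value in measurements.items()}


def r_to_mos(r):
    """Convert an E-model R-value to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


if __name__ == "__main__":
    sample = {"delay_ms": 120, "jitter_ms": 80, "loss_pct": 2.0}
    print(rate_call(sample))
    print("R=70 maps to MOS", round(r_to_mos(70), 2))
```

With this formula an R-value of 70 corresponds to a MOS of roughly 3.6, which lines up with the "Acceptable" row of the table.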
This report can be treated as the starting point for troubleshooting. Let's take a closer look at some of the common issues affecting VoIP calls today:

1. Network impairments
   a. Echoes in a call, words cutting out, delay in the conversation: these are all examples of how network impairments affect the call. Packet loss, end-to-end delay, and high congestion on ports are typical causes of poor call quality.
   b. Incorrect priority queuing for RTP streams
   c. Jitter buffer loss running high
   d. Half-duplex configuration
   e. Congestion on the ports
   f. Asymmetric routing
   g. High link utilisation
2. Soft phone considerations
   a. Operating system performance
   b. Resource constraints on the desktop
   c. Insufficient bandwidth when connecting in via VPN over slow links (public wi-fi locations, etc.)
   d. Poor packetisation delay
3. WAN links
   a. Incorrect QoS configurations
   b. High congestion

These are just a sample of the issues that could be affecting a phone call. Many more things come into play. However, having a starting point to focus on can save considerable time and effort during in-depth problem resolution. The time it takes to resolve these issues is key, because by moving to IP Telephony you have created a new mission-critical application to manage on your existing data network: voice.

Contact your NetIQ Sales Representative about managing your IP Telephony environments. http://www.netiq.com/trial/contacts.asp

This document was written by Haf Saba, Snr. NetIQ Solutions Specialist, Asia Pacific. haf.saba@netiq.com