VoIP network planning guide Document Reference: Volker Schüppel 08.12.2009
1 CONTENT

1 CONTENT
2 SCOPE
3 BANDWIDTH
  3.1 Control data
  3.2 Audio codec
  3.3 Packet size and protocol overhead
    3.3.1 UDP Protocol overhead
    3.3.2 IP Protocol overhead
    3.3.3 Network Protocol overhead
  3.4 Jitter
  3.5 Voice activity detection
  3.6 Example
4 JITTER
5 PRIORITIES
  5.1 Differentiated Services Code Point
6 NETWORK PROVIDER
7 LINKS
2 SCOPE

This document is written for all users who are going to install Artist VoIP products. Most of the information applies to VoIP in general, while some of it is specific to Artist.

Unlike other audio technologies (analog I/O, AES etc.), VoIP transmission needs accurate planning beforehand. If the network is not planned, it is very likely that the installation will fail or that the transmission will be unreliable.

The reader must have basic knowledge of IP networks, including IP addressing. The document covers the three most important topics of an installation: bandwidth, jitter and priorities.

The following sample network shows a very typical VoIP installation in which two locations are interconnected by a wide area network (WAN). While the local area networks are often fast and reliable, the WAN is the limiting factor, and that is where the focus lies.
3 BANDWIDTH

A VoIP transmission consists of two parts: control (signaling) data and audio data. The control data bandwidth is nearly constant and cannot be influenced by the user. The audio data bandwidth depends on several factors:

- Configured audio codec
- Configured packet size
- Network type
- Jitter
- Voice activity

The total bandwidth is the sum of five parts:

bw_Total = bw_ControlData + bw_AudioCodec + bw_UdpProtocolOverhead + bw_IpProtocolOverhead + bw_NetworkOverhead

3.1 Control data

The bandwidth for the control data is nearly constant.

bw_ControlData = 20 kb/s
3.2 Audio codec

Each audio codec has different properties, and one of them is the audio data bandwidth. This is the bandwidth used for the raw audio data, excluding any network protocol overhead.

Codec               bw_AudioCodec
All G.711           64 kb/s
PCM 8K              128 kb/s
All RARe            64 kb/s
G.722 64 kps PLC    64 kb/s
G.722 48 kps PLC    48 kb/s

3.3 Packet size and protocol overhead

VoIP traffic is not sent over the network as a continuous stream; it is divided into packets. For each VoIP channel the user can individually select the size of the packets: 20 ms, 40 ms, 80 ms or 160 ms. The default is 20 ms.

Before the transmitter can send a packet, it has to wait until enough audio is available, e.g. 40 ms for a 40 ms packet. The delay therefore depends on the packet size.
Small packets cause less delay but more protocol overhead, because more packets have to be sent for the same amount of audio. E.g. 20 ms packets cause 8 times the protocol overhead of 160 ms packets.

3.3.1 UDP Protocol overhead

The audio data is encapsulated in UDP datagrams, which adds protocol overhead. UDP adds 64 bits of overhead to each packet, resulting in additional bandwidth.

Audio packet size   Packets / second   bw_UdpProtocolOverhead
20 ms               50                 3.2 kb/s
40 ms               25                 1.6 kb/s
80 ms               12.5               0.8 kb/s
160 ms              6.25               0.4 kb/s
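The table values follow from the packet rate multiplied by the per-packet header size. The following Python lines are a minimal sketch of that arithmetic; the function and constant names are chosen here for illustration and are not part of any Artist software.

```python
UDP_HEADER_BITS = 64  # per-packet UDP overhead stated in section 3.3.1


def packets_per_second(packet_ms: int) -> float:
    """Packets sent per second for a given audio packet size in ms."""
    return 1000 / packet_ms


def overhead_kbps(header_bits: int, packet_ms: int) -> float:
    """Bandwidth in kb/s caused by a fixed per-packet header."""
    return header_bits * packets_per_second(packet_ms) / 1000


if __name__ == "__main__":
    for packet_ms in (20, 40, 80, 160):
        rate = packets_per_second(packet_ms)
        udp = overhead_kbps(UDP_HEADER_BITS, packet_ms)
        print(f"{packet_ms} ms: {rate} packets/s, {udp} kb/s UDP overhead")
```

The same function can be reused for any other per-packet header by passing a different header size in bits.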
3.3.2 IP Protocol overhead

The UDP datagrams are encapsulated in IP datagrams. IP adds 160 bits of overhead to each packet.

Audio packet size   Packets / second   bw_IpProtocolOverhead
20 ms               50                 8 kb/s
40 ms               25                 4 kb/s
80 ms               12.5               2 kb/s
160 ms              6.25               1 kb/s

3.3.3 Network Protocol overhead

The network protocol depends on the network type. E.g. Ethernet uses the Ethernet protocol. Wide area networks are based on DSL, cable, E1, T1 etc., use other protocols, and therefore cause different overhead. Thus the same IP traffic results in different amounts of network traffic on the LAN and on the WAN.

This chapter covers only Ethernet networks as an example. It shows how network protocol overhead is calculated, so that the reader can adjust the calculation for other network types.

The Ethernet protocol adds 144 bits of overhead to each Ethernet packet. One audio packet is encapsulated in one Ethernet packet.

Audio packet size   Packets / second   bw_NetworkOverhead
20 ms               50                 7.2 kb/s
40 ms               25                 3.6 kb/s
80 ms               12.5               1.8 kb/s
160 ms              6.25               0.9 kb/s
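To adjust the calculation for another network type, only the per-packet size of the network protocol header changes. The sketch below sums the UDP, IP and network protocol overhead and makes the network header size a parameter. The names are illustrative, and the header size for a network other than Ethernet must be taken from the specification of that network; no such value is assumed here.

```python
UDP_BITS = 64        # per packet, section 3.3.1
IP_BITS = 160        # per packet, section 3.3.2
ETHERNET_BITS = 144  # per packet, section 3.3.3


def protocol_overhead_kbps(packet_ms: int, link_bits: int = ETHERNET_BITS) -> float:
    """UDP + IP + network protocol overhead in kb/s for one VoIP channel.

    link_bits is the per-packet overhead of the network protocol in bits.
    It defaults to the Ethernet value; pass the value of the actual
    network type (e.g. the WAN) to adapt the calculation.
    """
    packets_per_second = 1000 / packet_ms
    return (UDP_BITS + IP_BITS + link_bits) * packets_per_second / 1000


if __name__ == "__main__":
    # Overhead on Ethernet for the default 20 ms packet size.
    print(protocol_overhead_kbps(20))
    # IP-level overhead only (no network protocol header counted).
    print(protocol_overhead_kbps(20, link_bits=0))
```

For 20 ms packets on Ethernet this gives 18.4 kb/s, which is the sum of the three 20 ms table entries (3.2 + 8 + 7.2).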
3.4 Jitter

Jitter is the time variation of the VoIP packet transmission (see chapter 4). It influences the bandwidth, because the transmission is not constant when there is network jitter. Within a short time window, fewer packets than expected may be transmitted, resulting in a temporarily lower bandwidth. The opposite can also happen, resulting in a temporarily higher bandwidth. Theoretically the bandwidth required for a short moment could be infinite.

It is nearly impossible to calculate the bandwidth variation caused by jitter beforehand. As a rule of thumb, plan for a bandwidth reserve of 25%. The user should check the jitter and the bandwidth variation once the system is installed.

3.5 Voice activity detection

Artist VoIP channels have a configurable voice activity detection (VAD) function. It is enabled by default, which means that the audio transmission stops when the audio level drops below the Vox threshold. In this case the bandwidth is reduced to bw_ControlData (see 3.1).

The network should always be designed to carry the bandwidth of a permanent audio signal, but VAD is a useful feature for reducing the data volume in practice.
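The 25% reserve and the rule to design for a permanent audio signal can be combined into a simple capacity check. The sketch below is only an illustration: the function names are made up for this example, and the 2000 kb/s link capacity is a hypothetical value, not a recommendation.

```python
import math

JITTER_RESERVE = 0.25  # 25% bandwidth reserve suggested in section 3.4


def planned_bandwidth_kbps(channel_kbps: float, channels: int = 1) -> float:
    """Bandwidth to plan for, including the jitter reserve.

    channel_kbps is the bandwidth of one channel with permanent audio,
    so savings from voice activity detection are deliberately ignored.
    """
    return channel_kbps * channels * (1 + JITTER_RESERVE)


def max_channels(link_kbps: float, channel_kbps: float) -> int:
    """Number of channels that fit on a link when the reserve is kept."""
    return math.floor(link_kbps / planned_bandwidth_kbps(channel_kbps))


if __name__ == "__main__":
    channel = 102.4  # kb/s, default channel settings on Ethernet
    print(planned_bandwidth_kbps(channel))   # one channel plus reserve
    print(max_channels(2000, channel))       # hypothetical 2000 kb/s link
```

The result is a planning figure only; the actual jitter and bandwidth variation still have to be checked on the installed system.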
3.6 Example

This chapter provides enough information to calculate the bandwidth for all variants of VoIP configurations. As an example, this is the calculation for the VoIP channel default settings.

G.722 (64 kps), 20 ms packet size, Ethernet:

bw_Total = bw_ControlData + bw_AudioCodec + bw_UdpProtocolOverhead + bw_IpProtocolOverhead + bw_NetworkOverhead
         = 20 kb/s + 64 kb/s + 3.2 kb/s + 8 kb/s + 7.2 kb/s
         = 102.4 kb/s
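The same calculation can be repeated for other configurations. The following sketch uses the values from sections 3.1 to 3.3 and assumes an Ethernet network; the dictionary keys are shortened labels chosen for this example, not identifiers from the Artist configuration software.

```python
CONTROL_BPS = 20_000  # bw_ControlData, section 3.1

# bw_AudioCodec in bit/s, section 3.2 (keys are illustrative labels)
CODEC_BPS = {
    "G.711": 64_000,
    "PCM 8K": 128_000,
    "RARe": 64_000,
    "G.722 64": 64_000,
    "G.722 48": 48_000,
}

# Per-packet overhead in bits: UDP + IP + Ethernet, section 3.3
HEADER_BITS = 64 + 160 + 144


def total_bandwidth_kbps(codec: str, packet_ms: int) -> float:
    """bw_Total in kb/s for one channel on Ethernet with permanent audio."""
    packets_per_second = 1000 / packet_ms
    overhead_bps = HEADER_BITS * packets_per_second
    return (CONTROL_BPS + CODEC_BPS[codec] + overhead_bps) / 1000


if __name__ == "__main__":
    for packet_ms in (20, 40, 80, 160):
        print(packet_ms, "ms:", total_bandwidth_kbps("G.722 64", packet_ms), "kb/s")
```

For the default settings (G.722 at 64 kb/s, 20 ms packets) the function returns 102.4 kb/s, matching the manual calculation.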
4 JITTER

Jitter is the time variation of the VoIP packet transmission. It is a typical problem of connectionless or packet-switched networks. Because the information is divided into packets, each packet can travel along a different path from the sender to the receiver. Technically, jitter is the measure of the variability of the network latency over time.

The solution is a jitter (receive) buffer that equalizes the variation. The receive buffer size can be configured for each VoIP channel in the Artist system. A larger receive buffer can handle greater jitter, but it increases the delay.

Receive buffer size   Maximum jitter
80 ms                 20 ms
160 ms                40 ms
320 ms                80 ms
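Because a larger buffer adds delay, the usual choice is the smallest buffer that still covers the jitter measured on the installed network. The sketch below illustrates that selection with the table values; it assumes that the maximum jitter values are given in milliseconds, and the function name is made up for this example.

```python
from typing import Optional

# Receive buffer size in ms -> maximum jitter it can handle
# (jitter values assumed to be in ms).
BUFFER_MAX_JITTER_MS = {80: 20, 160: 40, 320: 80}


def smallest_buffer_ms(measured_jitter_ms: float) -> Optional[int]:
    """Smallest receive buffer that handles the measured jitter.

    Returns None if the jitter exceeds what the largest buffer handles,
    in which case the network itself has to be improved.
    """
    for buffer_ms in sorted(BUFFER_MAX_JITTER_MS):
        if measured_jitter_ms <= BUFFER_MAX_JITTER_MS[buffer_ms]:
            return buffer_ms
    return None


if __name__ == "__main__":
    for jitter in (12, 30, 80, 120):
        print(jitter, "ms jitter ->", smallest_buffer_ms(jitter))
```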
5 PRIORITIES

VoIP traffic is often transmitted together with other traffic in the same network. That is the main reason why VoIP traffic is delayed and subject to jitter. The bottlenecks in the network are switches, routers and wide area networks with limited bandwidth. If the network is shared between VoIP and other data services, the administrator must plan priorities.

IP traffic can be prioritized; this is called quality of service (QoS). IP packets can be marked with type of service (ToS) bits in the IP header. In IP version 4, routers and switches are not required to support quality of service, but in professional equipment support is very common. When IP traffic is marked with a higher quality of service, network equipment can forward it ahead of ordinary traffic.

Artist channels can be configured individually with a ToS field value in the VoIP property sheet. The configured value is used in the IP header of all packets. How network equipment interprets the ToS field is not exactly specified, but a very common way is the Differentiated Services Code Point (DSCP).
5.1 Differentiated Services Code Point

Differentiated Services (DiffServ) is a model in which traffic is treated by intermediate systems with relative priorities based on the type of service (ToS) field. Defined in RFC 2474 and RFC 2475, the DiffServ standard supersedes the original specification for defining packet priority described in RFC 791. DiffServ increases the number of definable priority levels by reallocating bits of an IP packet for priority marking.

The DiffServ architecture defines the DiffServ (DS) field, which supersedes the ToS field in IPv4, to make per-hop behavior (PHB) decisions about packet classification and traffic conditioning functions such as metering, marking, shaping and policing. The RFCs do not dictate how PHBs are implemented; this is the responsibility of the vendor.

DS5 DS4 DS3 DS2 DS1 DS0 ECN ECN

DSCP   six bits (DS5-DS0)
ECN    two bits, currently unused

DiffServ uses the most significant bits (DS5, DS4 and DS3) for priority setting.

Precedence level   DS5 DS4 DS3   Description
7                  111           Link layer and routing protocol keep alive (lowest latency and jitter)
6                  110           Used for IP routing protocols
5                  101           Expedited Forwarding (Voice and Video)
4                  100           Controlled Load (Streaming Multimedia)
3                  011           Excellent Effort (Business Critical)
2                  010           Standard (Spare)
1                  001           Background
0                  000           Best effort
DS2 and DS1 specify the drop probability; bit DS0 is always zero.

Drop probability   DS2 DS1 DS0
Low                010
Medium             100
High               110

Example for precedence level 5, drop probability low:

101 010 00 = 0xA8 = 168
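The ToS field value for other combinations can be derived in the same way by placing the precedence bits, the drop probability bits and two zero ECN bits next to each other. The following sketch shows the bit arithmetic; the function and dictionary names are chosen for this example only.

```python
# DS2 DS1 DS0 bit patterns for the drop probability (DS0 is always zero)
DROP_BITS = {"low": 0b010, "medium": 0b100, "high": 0b110}


def tos_byte(precedence: int, drop: str) -> int:
    """ToS field value for a precedence level (0-7) and a drop probability."""
    if not 0 <= precedence <= 7:
        raise ValueError("precedence level must be between 0 and 7")
    dscp = (precedence << 3) | DROP_BITS[drop]  # DS5..DS0
    return dscp << 2                            # append two zero ECN bits


if __name__ == "__main__":
    value = tos_byte(5, "low")
    print(f"{value:08b} = 0x{value:02X} = {value}")  # 10101000 = 0xA8 = 168
```

Whether the decimal or the hexadecimal form has to be entered depends on the configuration dialog in use.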
6 NETWORK PROVIDER

Typically a network provider offers data services with a specified network quality, often at different levels (and prices). For example, an offer could look like this:

Quality level                   Basic   Advanced   Primary   Voice
Packet loss (end to end)        1%      0.30%      0.20%     0.10%
Round trip delay (end to end)   -       80 ms      80 ms     60 ms
Jitter (end to end)             -       -          30 ms     12 ms

A specified network should always be used if possible. The administrator can then plan the network beforehand, and the network quality does not change during operation. If the network is unspecified, you do not know what will happen in practice.
7 LINKS

IP Protocol:
http://en.wikipedia.org/wiki/ip

UDP Protocol:
http://en.wikipedia.org/wiki/user_datagram_protocol

Understanding Jitter in Packet Voice Networks:
http://www.cisco.com/en/us/tech/tk652/tk698/technologies_tech_note09186a00800945df.shtml

Understanding Delay in Packet Voice Networks:
http://www.cisco.com/en/us/tech/tk652/tk698/technologies_white_paper09186a00800a8993.shtml