Enhancement of VoIP over IEEE 802.11 WLAN via Dual Queue Strategy

Jeonggyun Yu+, Sunghyun Choi+, Jaehwan Lee*
+ Multimedia & Wireless Networking Laboratory, School of Electrical Engineering, Seoul National University
* NESPOT Project Group, Service Development Laboratory, Korea Telecom

Abstract- Today's IEEE 802.11 Wireless LAN (WLAN) is an excellent solution for broadband wireless networking. However, it lacks the capability to properly support real-time services such as Voice-over-IP (VoIP). In this paper, we present a simple and viable approach to enhance VoIP performance over the WLAN by implementing two queues along with strict priority queuing on top of the Medium Access Control (MAC) controller, e.g., in the device driver of the WLAN cards. We find via extensive simulations that the proposed scheme is remarkably effective for the VoIP service in an infrastructure-based WLAN where VoIP coexists with non-real-time traffic, thanks to the flow control mechanism of TCP, which is typically used for non-real-time traffic today. Due to its simplicity, the proposed scheme should be readily deployable in existing WLANs via simple software upgrades for enhanced VoIP services.

Keywords- IEEE 802.11 WLAN, VoIP, TCP, Dual Queue

I. INTRODUCTION

In recent years, IEEE 802.11 WLAN has gained a prevailing position in the market for (indoor) broadband wireless access networking. The IEEE 802.11 standard defines the medium access control (MAC) layer and the physical (PHY) layer specifications. The mandatory part of the MAC is called the distributed coordination function (DCF), which is based on Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA). Today, most devices implement the DCF only. Because of the contention-based channel access nature of the DCF, it supports only best-effort service without guaranteeing any Quality of Service (QoS).
Until today, the main usage of WLANs has been limited to Internet-based non-real-time (NRT) data services like Web browsing and file transfer. Recently, the need for real-time (RT) services such as Voice over IP (VoIP) and audio/video (AV) streaming over WLANs has been increasing drastically. However, current devices are not capable of properly supporting RT services, which are delay-sensitive while tolerant of some losses. The emerging IEEE 802.11e MAC, which is an amendment of the existing MAC, will provide QoS. The standardization of IEEE 802.11e is still ongoing, although it is in the final stage. Even after the standardization is finalized, it may take some time for the first 802.11e-compliant QoS-enabled WLAN equipment to become available in the market. Moreover, it may be difficult to upgrade or replace the existing access points (APs) for QoS support even when 802.11e-compliant devices are available. The main problem is that such an upgrade requires the existing WLAN hardware to be replaced, since the 802.11e MAC cannot be implemented by merely upgrading the firmware of an existing MAC controller chip.

In this paper, we consider a software upgrade-based approach to provide limited QoS for VoIP service enhancement over the WLAN. Our proposed scheme implements dual queues on top of the MAC controllers. In practice, these two queues can be implemented in the device driver of the WLAN devices. RT and NRT packets (footnote 1) are classified and enqueued into one of the two queues. Then, we implement strict priority queuing to serve these two queues in order to give priority to the RT packets; the NRT queue is never served as long as the RT queue is non-empty.

A similar, but more complicated approach has been reported in . The authors implemented two queues in the device driver of a WLAN network interface card (NIC).
They also implemented a rather complicated scheduling algorithm based on the earliest deadline first (EDF) algorithm for RT packets and an adaptive traffic smoother for NRT packets to regulate the amount of NRT traffic. The performance evaluation was carried out on a real testbed working in the ad-hoc mode of the WLAN. Our approach does not implement any scheduling algorithm more complicated than strict priority queuing. However, we find that this simple scheduling policy is good enough to provide good QoS to VoIP packets in the presence of TCP-based NRT traffic in the infrastructure-based WLAN. This turns out to be true due to TCP's flow control mechanism as well as the fact that the downlink (i.e., AP-to-stations) in the infrastructure-based WLAN is the bottleneck of the network performance. Note that many Internet applications run over TCP.

The rest of the paper is organized as follows. In Section II, the IEEE 802.11 MAC is briefly reviewed. The proposed dual queue scheme is presented in Section III. Section IV discusses the VoIP scheme under consideration and its admission control. After comparing the conventional single queue and the proposed dual queue schemes via simulations in Section V, we conclude in Section VI by discussing future work.

Footnote 1: In 802.11 terms, "packet" is not the correct term, but we use the term "packet" for simplicity throughout the paper.

II. IEEE 802.11 MAC

The IEEE 802.11 legacy MAC defines two coordination functions, namely, the mandatory DCF based on
CSMA/CA and the optional point coordination function (PCF) based on a poll-and-response mechanism. Most of today's devices operate in the DCF mode only. We briefly overview how the DCF works here, as the proposed dual queue scheme runs on top of the DCF-based MAC.

The MAC works with a single first-in-first-out (FIFO) transmission queue. The CSMA/CA of the DCF works as follows: when a packet arrives at the head of the transmission queue, if the channel is busy, the MAC waits until the medium becomes idle, then defers for an extra time interval, called the DCF Interframe Space (DIFS). If the channel stays idle during the DIFS deferral, the MAC then starts the backoff process by selecting a random backoff counter. For each idle slot time interval, the backoff counter is decremented. When the counter reaches zero, the packet is transmitted. The timing of DCF channel access is illustrated in Fig. 1.

Each station maintains a contention window (CW), which is used to select the random backoff counter. The backoff counter is a random integer drawn from a uniform distribution over the interval [0, CW]. If the channel becomes busy during a backoff process, the backoff is suspended. When the channel becomes idle again, and stays idle for an extra DIFS time interval, the backoff process resumes with the suspended backoff counter value. For each successful reception of a packet, the receiving station immediately acknowledges by sending an acknowledgement (ACK) packet. The ACK packet is transmitted after a short IFS (SIFS), which is shorter than the DIFS. If an ACK packet is not received after the data transmission, the packet is retransmitted after another random backoff. The CW size is initially assigned CWmin, and increases to $2(CW + 1) - 1$ when a transmission fails.
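The contention window rule described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names are hypothetical, CW_MIN = 31 is the value used later in the admission control calculation, and the cap CW_MAX = 1023 is assumed here for the 802.11b PHY rather than taken from the text.

```python
import random

CW_MIN = 31    # CWmin used later in the paper's capacity calculation
CW_MAX = 1023  # assumed CWmax for the 802.11b PHY (not stated in the text above)


def next_cw(cw, cw_max=CW_MAX):
    """Contention window after a failed transmission: 2*(CW+1)-1, capped at cw_max."""
    return min(2 * (cw + 1) - 1, cw_max)


def draw_backoff(cw, rng=random):
    """Backoff counter drawn uniformly from the integers in [0, CW]."""
    return rng.randint(0, cw)


def cw_sequence(failures, cw_min=CW_MIN, cw_max=CW_MAX):
    """CW values seen by one packet across the given number of consecutive failures."""
    cw = cw_min
    seq = [cw]
    for _ in range(failures):
        cw = next_cw(cw, cw_max)
        seq.append(cw)
    return seq


if __name__ == "__main__":
    print(cw_sequence(6))
    print(draw_backoff(CW_MIN))
```

Each failure roughly doubles the range from which the backoff counter is drawn, which spreads retransmissions of colliding stations over more slots.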
[Fig. 1. IEEE 802.11 DCF access scheme]

All of the MAC parameters, including SIFS, DIFS, Slot Time, CWmin, and CWmax, depend on the underlying physical layer (PHY). Table I shows these values for the 802.11b PHY. The 802.11b PHY supports four transmission rates, namely, 1, 2, 5.5, and 11 Mbps. We assume the 802.11b PHY in this paper, mainly due to its wide deployment base, even though the proposed dual queue scheme should work with any PHY.

TABLE I
MAC PARAMETERS FOR 802.11B PHY

Parameters     SIFS (usec)   DIFS (usec)   Slot (usec)   CWmin   CWmax
802.11b PHY    10            50            20            31      1023

III. DUAL QUEUE STRATEGY

Our approach is to implement two queues inside the AP. Specifically, these queues are implemented above the MAC controller, i.e., in the device driver of the NIC, so that packet scheduling can be performed at the driver level. Note that the MAC controller cannot be modified by anyone other than the corresponding chip vendor. Fig. 2 shows the device driver structure for both the original device driver and a modified device driver supporting our approach. In the original driver, there is basically a single FIFO queue for packet transmission. A packet from the higher layer or from the wireline port (in the case of the AP) undergoes header processing and so on, and is forwarded to the MAC controller for transmission. The MAC controller itself also has a single FIFO queue.

We implement two FIFO queues, called the RT and NRT queues, at the device driver level as shown in Fig. 2 (b). We classify each packet to be transmitted as either RT or NRT. Current IP datagrams do not carry any information about the corresponding applications or QoS requirements, and hence we use the port number as well as the UDP packet type to classify an RT packet.
That is, the device driver is provided with the specific port number information of the RT applications under consideration. A VoIP application typically uses a pre-assigned range of port numbers along with the Real-time Transport Protocol (RTP) over UDP. For transmission scheduling, we implement simple strict priority queuing so that the NRT queue is never served as long as the RT queue is not empty.

[Fig. 2. Device driver structures: (a) original transmit function; (b) modified transmit function]

As stated above, the MAC controller itself has a FIFO queue, which we refer to as the MAC HW queue in this paper. The performance of the proposed dual queue scheme can be compromised when the MAC HW queue is large, due to the queuing delay within the MAC HW queue. That is, if the MAC HW queue can accommodate more than one packet, a voice packet dequeued from the RT queue must wait until all the preceding packets in the MAC HW queue are transmitted. The problem here is that the MAC HW queue size is implementation-dependent and vendor-specific, so it differs across MAC controllers, and the size is typically not configurable. We will evaluate the effect of the MAC HW queue size on the performance of the dual queue later.
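The classification and strict priority rule of this section can be sketched in a few lines of Python. This is a minimal illustration, not the driver implementation: the class and field names are hypothetical, the RT port range is an arbitrary placeholder (the text only says that the driver is given the RT port numbers), and tail drop on a full queue is an assumption.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical UDP port range for the VoIP application (placeholder values).
RT_UDP_PORTS = range(16384, 16484)


@dataclass
class Packet:
    proto: str        # "udp" or "tcp"
    dst_port: int
    payload: bytes = b""


class DualQueue:
    """Two FIFO queues served with strict priority: RT first, then NRT."""

    def __init__(self, rt_size=50, nrt_size=500):
        self.rt = deque()
        self.nrt = deque()
        self.rt_size = rt_size
        self.nrt_size = nrt_size

    @staticmethod
    def is_rt(pkt):
        # RT classification by UDP packet type plus configured port range.
        return pkt.proto == "udp" and pkt.dst_port in RT_UDP_PORTS

    def enqueue(self, pkt):
        """Classify and enqueue; return False if the packet is tail-dropped (assumed policy)."""
        if self.is_rt(pkt):
            queue, limit = self.rt, self.rt_size
        else:
            queue, limit = self.nrt, self.nrt_size
        if len(queue) >= limit:
            return False
        queue.append(pkt)
        return True

    def dequeue(self):
        """Return the next packet for the MAC; NRT is served only when RT is empty."""
        if self.rt:
            return self.rt.popleft()
        if self.nrt:
            return self.nrt.popleft()
        return None


if __name__ == "__main__":
    q = DualQueue()
    q.enqueue(Packet("tcp", 80))
    q.enqueue(Packet("udp", 16390))
    print(q.dequeue())  # the UDP voice packet leaves first despite arriving second
```

In a real driver, dequeue would be called only when the MAC HW queue has room, so the smaller that hardware queue is, the closer the behavior comes to true strict priority.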
IV. VOIP AND ADMISSION CONTROL

In this section, we briefly discuss the VoIP codec under consideration and the VoIP admission control, which is essential to provide acceptable QoS to the voice traffic.

A. Voice-over-IP (VoIP)

There are many types of voice codecs used in IP telephony, e.g., G.711, G.723.1, G.726, G.728, and G.729. These codecs have different bit rates and complexities. In this paper, we consider G.711, the simplest voice codec. G.711 is a standard generating a 64 kbps stream, based on 8-bit pulse code modulation (PCM) with a sampling rate of 8000 samples/second. Even though it achieves the worst compression among peer voice codecs, it is often used in practice thanks to its simplicity. For example, we have observed using a network traffic capturing tool that G.711 is used in Microsoft MSN Messenger for the VoIP application.

The number of samples per VoIP packet is another important factor. The codec defines the size of a sample, but the total number of samples conveyed in a packet affects how many packets are generated per second. There is basically a trade-off: the larger the packet size (i.e., the more samples carried per packet), the longer the packetization delay, but the lower the packetization overhead, as analyzed below. In this paper, we assume that a VoIP packet is generated every 20 msec, i.e., with 160 bytes (= 8 kbytes/sec * 20 msec) of voice data. We also assume that RTP over UDP is used for the VoIP transfer. When an IP datagram is transferred over the WLAN, the datagram is typically encapsulated by an IEEE 802.2 Sub-Network Access Protocol (SNAP) header. Note that all these assumptions are very typical in the real world. Accordingly, the VoIP packet size at the MAC Service Access Point (SAP) (footnote 2) becomes:

160-byte DATA + 12-byte RTP header + 8-byte UDP header + 20-byte IP header + 8-byte SNAP header = 208 bytes per VoIP packet

Footnote 2: The MAC SAP is the interface between the MAC and the higher layer, i.e., the IEEE 802.2 Logical Link Control (LLC).

B. VoIP Admission Control

Clearly, the number of allowable VoIP sessions over the WLAN should be limited to maintain acceptable QoS. The maximum number of VoIP sessions over the WLAN can be approximately calculated as follows. We first calculate the time to transmit a VoIP packet over the 802.11b PHY at 11 Mbps without any transmission failure, assuming that (1) the ACK packet is transmitted at 2 Mbps, and (2) the long PHY preamble is used. These two assumptions are valid in real WLANs. Note that for a successful MAC packet transfer, the following five events happen in order: (1) DIFS deferral; (2) backoff; (3) packet transmission; (4) SIFS deferral; and (5) ACK transmission. Then, the VoIP packet transfer time is determined to be about 981 µsec by adding the following three values as well as one SIFS (= 10 µsec) and one DIFS (= 50 µsec):

1) VoIP MAC packet transmission time:
   = 192-µsec PLCP preamble/header + (24-byte MAC header + 4-byte CRC + 208-byte payload) / 11 Mbps = 363 µsec
2) ACK transmission time at 2 Mbps:
   = 192-µsec PLCP preamble/header + 14-byte ACK packet / 2 Mbps = 248 µsec
3) Average backoff duration:
   = 31 (CWmin) * 20 µsec (one Slot Time) / 2 = 310 µsec

A VoIP session consists of two senders, each of which transmits a packet every 20 msec, since the session is interactive. Then, we find that about 20 voice packets (= 20 msec / 981 µsec) can be transmitted during a 20 msec interval. Accordingly, we estimate that about 10 VoIP sessions can be admitted into an IEEE 802.11b WLAN. We discuss this issue further based on the simulation results later.

V. COMPARATIVE PERFORMANCE EVALUATION

In this section, we comparatively evaluate the performance of the original single queue scheme and our dual queue scheme using the ns-2 simulator to show the utility of the dual queue scheme for the VoIP service over an infrastructure WLAN environment.
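As a cross-check before turning to the simulations, the capacity estimate of Section IV.B can be reproduced with a few lines of Python. The sketch below uses only the timing values quoted in that section; the function and constant names are hypothetical, and keeping the ACK rate at 2 Mbps when the data rate is lowered is an assumption of this sketch.

```python
PLCP_US = 192.0               # long PLCP preamble + header
SIFS_US = 10.0
DIFS_US = 50.0
SLOT_US = 20.0
CW_MIN = 31
MAC_OVERHEAD_BYTES = 24 + 4   # MAC header + CRC
ACK_BYTES = 14
VOIP_MSDU_BYTES = 160 + 12 + 8 + 20 + 8  # voice data + RTP + UDP + IP + SNAP = 208


def tx_time_us(nbytes, rate_mbps):
    """Airtime of a frame: PLCP overhead plus payload bits at the given rate."""
    return PLCP_US + nbytes * 8 / rate_mbps


def voip_transfer_time_us(data_rate_mbps=11.0, ack_rate_mbps=2.0):
    """DIFS + average backoff + data frame + SIFS + ACK for one VoIP packet."""
    data = tx_time_us(MAC_OVERHEAD_BYTES + VOIP_MSDU_BYTES, data_rate_mbps)
    ack = tx_time_us(ACK_BYTES, ack_rate_mbps)
    backoff = CW_MIN * SLOT_US / 2
    return DIFS_US + backoff + data + SIFS_US + ack


def max_voip_sessions(data_rate_mbps=11.0, ack_rate_mbps=2.0, interval_ms=20.0):
    """Sessions that fit in one packetization interval (two packets per session)."""
    packets = interval_ms * 1000 / voip_transfer_time_us(data_rate_mbps, ack_rate_mbps)
    return int(packets // 2)


if __name__ == "__main__":
    print(round(voip_transfer_time_us(), 1))  # per-packet transfer time in usec
    print(max_voip_sessions(11.0))            # estimate at 11 Mbps
    print(max_voip_sessions(2.0))             # estimate at 2 Mbps (ACK rate assumed 2 Mbps)
```

With these inputs the sketch gives 10 sessions at 11 Mbps, matching the estimate above, and 5 sessions at 2 Mbps, which agrees with the figure quoted in the admission control discussion of Section V.A.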
We use the 802.11b PHY for our simulations, and all the stations transmit packets at 11 Mbps, which is the highest transmission rate of the 802.11b PHY. Two different types of traffic are used for our simulations, namely, voice and data. The voice traffic is modeled by a two-way constant bit rate (CBR) session according to the G.711 codec as described in Section IV.A. The data traffic application is modeled by a unidirectional FTP/TCP flow with a 1460-byte packet size and a 12-packet (or 17520-byte) receive window size. This application corresponds to the upload or download of a large file via FTP. Note that the maximum segment size (MSS) of TCP across the popular Ethernet is 1460 bytes. We have also learned by sniffing the network traffic that a 17520-byte TCP receive window size is commonly used in Microsoft Windows XP. This is different from the size of 42 packets claimed to be common in . However, it should also be noted that, as far as we have found, the receive window size is heavily dependent on the underlying OS and its configuration, and is often adapted depending on the application and network setup.

[Fig. 3. Network topology for simulations (elements shown: Voice 1 ... Voice n, Data 1 ... Data m, AP, Switch, FTP Server, Voice Gateway)]

The network topology for our simulations is shown in Fig. 3. Each station involved in a VoIP session generates and receives only voice traffic. The other stations either generate
or receive only TCP packets, and each of them handles only one TCP flow, i.e., the number of TCP flows corresponds to the combined number of upstream and downstream TCP stations. This topology can often be found in real WLANs with mixed VoIP and Internet traffic.

A. VoIP Capacity for Admission Control

First, we have simulated pure VoIP scenarios in order to evaluate the admission control policy. Note that there is no difference between the single queue and dual queue schemes in this case. Fig. 4 shows both delay and packet drop rate as the number of VoIP sessions increases. The packet drops occur at the transmitting stations since we have used limited-size queues (of 50 and 100 packets) in both the AP and the stations. We observe from the simulation results that up to 11 VoIP sessions can be admitted into the system, since the downlink drop rate is over 0.1 with 12 VoIP sessions, and such a high packet drop rate is not acceptable in practice. In Section IV.B, we estimated that 10 VoIP sessions can be admitted. Our under-estimation is due to the fact that the average backoff duration is reduced as the number of stations increases, up to the point where the packet collision effect becomes dominant. Nevertheless, our simple calculation-based estimate was quite close. We find that up to 11 VoIP sessions, there is not much difference between the queue sizes of 50 and 100, since there should not be many queued VoIP packets anyway. Accordingly, we use 50 packets for the RT queue size for the rest of the paper.

[Fig. 4. Capacity of IEEE 802.11b for VoIP: (a) delay of voice traffic; (b) drop rate of voice traffic]

Longer delay and higher drop rate are observed for the downlink transmissions with more than 11 VoIP sessions. This is due to the fact that the downlink is disadvantaged compared to the uplink (i.e., station-to-AP) since the downlink (i.e., the AP's transmission) is shared by multiple VoIP sessions.
Under the DCF access rule, the AP basically gets channel access opportunities only as often as each of the other competing stations does for its uplink transfers. It should also be noted that admission control should be performed more carefully, considering the link condition between the AP and each individual station. Note that our simulation results are based on the 11 Mbps data transmission rate. The difficulty is that the channel condition fluctuates over time due to station mobility, time-varying interference, etc. One possible admission control policy could be to admit a smaller number of VoIP sessions. For example, when all the stations transmit packets at 2 Mbps instead of 11 Mbps, the number of admissible VoIP sessions becomes 5 according to the analysis in Section IV.B. If all the admitted (up to five) VoIP stations can transmit/receive packets at 11 Mbps thanks to good channel conditions, there will be plenty of residual bandwidth, which can be utilized by other types of traffic, e.g., the NRT TCP traffic considered in the following.

B. Comparison of Single Queue and Dual Queue

In this scenario, we simulate a single VoIP session with various numbers of TCP flows in order to evaluate the effect of the number of TCP sessions on the VoIP performance. We consider three different queue sizes (i.e., 50, 100, and 500 packets) at the AP. There are always equal numbers of upstream and downstream TCP flows. That is, the value of one on the x-axis represents the case with one upstream and one downstream TCP flow. Here, we assume that the MAC HW queue size is equal to one packet. Fig. 5 presents both the delay and packet drop rate performance with the single queue scheme. We can see in Fig. 5 (a) that the delay of voice packets with the single queue increases linearly with the number of TCP flows when the queue size is large enough (i.e., queue size of 500 in our simulations).
This is because the average number of queued packets at the AP increases linearly as the number of TCP flows increases. Note that with TCP, there can be a number of outstanding TCP packets (including both data and ACK packets) inside the network, i.e., between a station and the FTP server in our simulation environment in Fig. 3. The number of outstanding TCP packets is determined by the minimum of the receive window size and the congestion window size. We observe from the simulations that the bottleneck link is the downlink of the WLAN, i.e., the AP's downlink transmissions, and hence virtually all the outstanding packets are queued in the AP queue. This is the reason why the VoIP packet delay increases linearly as the number of TCP flows increases with the queue size of 500. However, the situation is a bit different for the queue sizes of 50 and 100. That is, the delay increases very slowly or is almost saturated beyond a certain number of TCP flows. This slow delay increase is due to the packet drops caused by buffer overflow, as confirmed by Fig. 5 (b).

Fig. 6 shows the VoIP delay performance as the number of TCP flows increases with the proposed dual queue scheme for three different NRT queue sizes. Note that we do not show the packet drop rate performance since no packet drop has been observed in this case. We first observe a
significant reduction of the delay with the dual queue; the worst-case delay is now about 11 msec. We can infer that the delay of downlink voice traffic is mainly due to the queuing delay with the single queue, while it is mainly due to the wireless channel access delay in the MAC HW queue with the dual queue.

[Fig. 5. Delay and drop rate of voice packets with single queue: (a) delay; (b) drop rate]

[Fig. 6. Delay of voice packets with dual queue]

Fig. 6 shows that the delays of both uplink and downlink voice packets with 50- and 100-packet NRT queues increase as the number of TCP flows increases. However, with the queue size of 500 packets, there is almost no change in delay irrespective of the number of TCP flows. This is somewhat counter-intuitive since TCP is known to be aggressive, and hence there should be more uplink contention as the number of TCP flows increases, thus degrading the voice delay performance.

However, if we delve into the TCP behavior more carefully, the observed delay performance looks very reasonable. As discussed above, when the queue size is large enough, most TCP packets (either ACK or data) are accumulated at the AP because of the bottleneck WLAN downlink. Therefore, for example, the source station of an upstream TCP flow can transmit a TCP data packet only when it receives a TCP ACK packet from the AP, provided that a timeout does not occur. This basically leaves only one or two stations with TCP flows actively contending for the channel irrespective of the total number of TCP flows. This is the reason why the delay performance is rather stable across all the TCP flow numbers when the NRT queue size is 500. Note that with 10 upstream and 10 downstream TCP flows, there will be up to 240 (= 10 * 2 * 12) TCP packets enqueued at the NRT queue since the receive window size is 12, and 500 is more than enough in this case.

On the other hand, when the queue size is small such that some TCP packets get dropped due to buffer overflow of the NRT queue, there will be retransmission timeout events for some TCP flows, which makes more stations with upstream TCP flows actively contend for the channel in order to retransmit TCP data packets. This is the reason why the delay performance gets worse with the NRT queue sizes of 50 and 100 as the number of TCP flows increases. This kind of TCP behavior still exists in the single queue case, but it is not observed since the delay performance is dominated by the queuing delay discussed above.

[Fig. 7. Aggregated TCP throughput in dual queue]

Fig. 7 shows the aggregated throughputs of upstream and downstream TCP flows, which are measured at the AP, with the dual queue scheme. We basically observe unfairness between upstream and downstream TCP flows with the queue sizes of 50 and 100, while the unfairness is not observed with the queue size of 500. This is again because some TCP packets are dropped with the queue sizes of 50 and 100. For example, with the queue size of 100, the unfairness is observed once the number of TCP flows reaches 5, at which point the maximum number of outstanding TCP packets becomes 120 (= 5 * 2 * 12), which is larger than the queue size. Because TCP ACKs are cumulative, a later ACK makes up for the loss of previously dropped TCP ACK packets; upstream TCP flows are therefore less affected by the packet drops at the AP, thus achieving a higher throughput than the downstream TCP flows. This observation is in the same line as , and implies that the queue size for the AP should be large enough in order to
avoid the unfairness between uplink and downlink. This is good for us since our dual queue scheme performs better in terms of delay with large NRT queues. We do not show the TCP throughput performance with the single queue, but basically the same behavior is observed, where the throughput values are slightly larger than in the dual queue case since the VoIP and TCP packets are treated in the same manner with the single queue.

C. Effect of MAC HW Queue Size

In order to evaluate the effect of the MAC HW queue size on the dual queue performance, we have simulated 1, 5, and 10 VoIP sessions as the MAC HW queue size increases from 1 to 8 packets. Five upstream and five downstream TCP sessions were also introduced along with the VoIP sessions. Fig. 8 shows that the delay of downlink voice packets increases linearly with the MAC HW queue size due to the queuing delay at the MAC HW queue. We also find that the downlink delay is affected by the number of VoIP sessions. With a small MAC HW queue, the more VoIP sessions there are, the larger the downlink delay, due to the queuing delay at the RT queue. However, as the MAC HW queue size increases, the delay increases more slowly when there are more VoIP sessions, and with the MAC HW queue size of 8, the worst delay is observed with a single VoIP session. The reason can be understood as follows: with a large MAC HW queue, there are likely to be more VoIP packets (and hence fewer TCP packets) in the MAC HW queue when there are more VoIP sessions. Since a VoIP packet's transmission time is likely to be shorter than a TCP data packet's transmission time due to the packet size difference, the queuing delay at the MAC HW queue is smaller with more VoIP sessions. The final and most important observation is that the delay performance with a large MAC HW queue is still within an acceptable range for typical VoIP applications, i.e., under 25 msec.

[Fig. 8. Effect of MAC HW queue size]

VI. CONCLUSION AND FUTURE WORK

In this paper, we have presented a dual queue scheme, which implements two queues in the device driver of the MAC controller so that packet scheduling based on strict priority queuing can be conducted at the driver level to prioritize VoIP packets. Based on simulations, we compared the original single queue scheme and the proposed dual queue scheme to demonstrate that the performance of VoIP can be enhanced significantly through our scheme when VoIP and TCP traffic coexist. The reason why simple scheduling above the MAC controller works surprisingly well is the behavior of TCP flow control in a WLAN with the downlink as the bottleneck link. We are currently implementing the dual queue scheme in a real testbed based on the HostAP driver for Intersil Prism 2.5-based WLAN devices. Our eventual goal is to incorporate the dual queue scheme into the real deployment of the largest WLAN-based hotspot service in Korea.

REFERENCES

[1] IEEE, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications, Reference number ISO/IEC 8802-11:1999(E), IEEE Std 802.11, 1999 edition.
[2] IEEE, Supplement to Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications: Higher-speed Physical Layer Extension in the 2.4 GHz Band, IEEE Std 802.11b-1999.
[3] IEEE 802.11 Working Group, Draft Supplement to Part 11: Wireless Medium Access Control (MAC) and physical layer (PHY) specifications: Medium Access Control (MAC) Enhancements for Quality of Service (QoS), IEEE 802.11e/D5.0, July.
[4] Stefan Mangold, Sunghyun Choi, Guido R. Hiertz, Ole Klein, and Bernhard Walke, "Analysis of IEEE 802.11e for QoS Support in Wireless LANs," accepted to IEEE Wireless Communications Magazine, July.
[5] Sunghyun Choi, Javier del Prado, Sai Shankar N, and Stefan Mangold, "IEEE 802.11e Contention-Based Channel Access (EDCF) Performance Evaluation," in Proc. IEEE ICC'03, Anchorage, Alaska, USA, May.
[6] Saar Pilosof et al., "Understanding TCP fairness over Wireless LAN," in Proc. IEEE INFOCOM'03, vol. 2, March.
[7] Amit Jain, Daji Qiao, and Kang G. Shin, "RT-WLAN: A Soft Real-Time Extension to the ORiNOCO Linux Device Driver," in Proc. IEEE PIMRC'03, Beijing, China, Sept.
[8] Daji Qiao, Sunghyun Choi, and Kang G. Shin, "Goodput Analysis and Link Adaptation for IEEE 802.11a Wireless LANs," IEEE Trans. on Mobile Computing (TMC), vol. 1, no. 4, October-December.
[9] Daniel Collins, Carrier Grade Voice over IP, 2nd Ed., McGraw-Hill, September.
[10] The Network Simulator ns-2, online link.
[11] Airopeek, online link.
[12] Jouni Malinen, Host AP driver for Intersil Prism2/2.5/3, online link.
[13] KT NESPOT, online link.