Implementation of Active Queue Management in a Combined Input and Output Queued Switch

Bartek Wydrowski and Moshe Zukerman
ARC Special Research Centre for Ultra-Broadband Information Networks, EEE Department, The University of Melbourne, Parkville, Vic. 3010, Australia
{b.wydrowski,m.zukerman}@ee.mu.oz.au

Abstract- This paper investigates the implementation of a class of active queue management (AQM) algorithms whose measure of congestion includes packet arrival rate, such as REM and GREEN, in a combined input and output queued (CIOQ) switch. We propose a structure with one AQM per output port and we analyze the constraints on the switch fabric speed that this imposes. A two queue model of a CIOQ switch is developed and simulated to validate our design and compare its performance with a droptail queue switch.

1. INTRODUCTION

Quality of Service (QoS) measures such as packet delay, packet loss and utilization of capacity are largely governed by congestion control. Extensive research has been performed on the analysis and design of various active queue management (AQM) algorithms. The controller structure, stability and speed have been studied analytically and empirically [1][9][10]. The most promising AQM architectures, such as REM [1], GREEN [9] and Receding Horizon Control [10], involve the measurement of packet arrival rate as a means of measuring the congestion. They maintain very low packet backlog, low packet loss and a high link utilization compared to the existing algorithms deployed on the Internet, such as droptail and RED [2]. However, the new algorithms have so far been studied only in the context of a single queue link with a well-defined service capacity. Devices such as combined input and output queued (CIOQ) switches [6] have multiple queues whose capacities depend on cross traffic between multiple input and output interfaces. This paper proposes a structure for implementing these new arrival rate based (RB) AQM algorithms in a CIOQ switch.
1.1 AQM BACKGROUND

An AQM algorithm signals the congestion level at a link to the source algorithms transmitting over the link, either by dropping packets or by marking explicit congestion notification (ECN) [11] capable packets, at a rate $P(t)$ at time $t$. In general, we can approximate an AQM algorithm as a function which computes the packet marking or dropping rate $P(t)$ from link congestion state variables:

$$P(t+1) = F(P(t), C(t), B(t), \lambda(t), \dot{\lambda}(t), \ldots) \tag{1}$$

where $C(t)$ is the capacity of the bottleneck link at time $t$, $B(t)$ is the packet backlog, $\lambda(t)$ is the packet arrival rate (b/s), and $\dot{\lambda}(t)$ is the derivative of $\lambda(t)$. We will call AQMs backlog based (BB) when $P(t) = F(B(t))$, where $F(x)$ is a positive and non-decreasing function for $x > 0$ and $F(0) = 0$. For RB schemes, $F(\cdot)$ includes at least $\lambda(t)$ in its parameters, but may have other inputs as well.

RB AQMs such as REM and GREEN are integration processes, which can be approximated by $P(t+1) = P(t) + \alpha(\lambda(t) - u\,C(t))$, where $\alpha$ is the control gain and $u$ controls the target utilization, with $u < 1$. Note that in steady state, when $P(t+1) = P(t)$, the marking/dropping rate is such that $\lambda(t) = u\,C(t)$, and therefore $\lambda(t) < C(t)$ independently of the backlog. This means that an RB AQM can clear the buffer of backlog and achieve very low queuing delay.

Current switches are being built using BB AQM control, typically with droptail or RED AQM. However, it is known [1][9] that BB schemes suffer from a fundamental shortcoming that RB schemes do not. A BB process couples the congestion notification rate $P(t)$ to the backlog size $B(t)$ through the positive increasing AQM function $P(t+1) = F(B(t))$. Therefore during congestion, when the congestion feedback signal needs to increase, the backlog also necessarily increases. For example, in a droptail queue of size $B_{max}$, $F(B(t))$ is a threshold function, with $P(t+1) = 0$ when $B(t) \le B_{max}$, and $P(t+1) = 1$ when $B(t) > B_{max}$. This means that the droptail queue is unable to signal congestion by packet dropping unless the queue is full, i.e. $B(t) > B_{max}$.
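As a rough illustration of the contrast drawn above, the following Python sketch places the integrating RB update next to the droptail threshold function. The function names, the clamping of P to [0, 1] and the numeric values are assumptions made for this example; they are not taken from REM or GREEN.

```python
def rb_update(p, arrival_rate, capacity, alpha, u):
    """One step of the integrator P(t+1) = P(t) + alpha * (lambda(t) - u * C(t)).

    Clamping the result to [0, 1] is an assumption of this sketch, made so
    that P can be used directly as a marking/dropping probability.
    """
    p_next = p + alpha * (arrival_rate - u * capacity)
    return min(1.0, max(0.0, p_next))


def droptail_rate(backlog, b_max):
    """Droptail threshold function: 0 while B(t) <= B_max, 1 once B(t) > B_max."""
    return 1.0 if backlog > b_max else 0.0


if __name__ == "__main__":
    capacity, u, alpha = 100.0, 0.5, 0.001  # illustrative values only
    # Arrival rate above u*C: the RB signal rises regardless of the backlog.
    print(rb_update(0.2, 80.0, capacity, alpha, u))
    # Arrival rate equal to u*C: the RB signal is at its fixed point.
    print(rb_update(0.2, 50.0, capacity, alpha, u))
    # Droptail gives no signal until the backlog exceeds the buffer size.
    print(droptail_rate(10, 100), droptail_rate(101, 100))
```

The RB update reacts to the rate error alone, so it can raise the congestion signal while the buffer is still empty, whereas the droptail function depends only on the backlog.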
Since a positive dropping rate $P(t)$ is required to control TCP sources, the droptail queue must be full for a percentage of the time. This results in a large mean backlog. The RED algorithm has a piecewise linear increasing $F(B(t))$ function. This also necessitates an increase of backlog to increase the dropping/marking rate. Simulation results in this paper
demonstrate how, unlike BB AQM, RB AQM can control the source rates to maintain low backlog.

1.2 SWITCH BACKGROUND

A switch consists of three major components: the input interfaces, the output interfaces and the switching fabric. Higher end switches consist of separate I/O modules and a separate switch fabric in the form of a back plane for interconnecting the modules. The I/O modules may have a number of different physical interfaces. In this paper, we will use this modular topology of a switch for our proposal and analysis.

[Fig. 1. CIOQ switch: input queues 1 to N, switch fabric, output queues 1 to N.]

[Fig. 2. I/O module: input interface, output interface.]

The switching fabric is the core element of the switch. It is a resource shared by all of the I/O modules to forward packets from the input interface to the output interface. Different switching fabrics are discussed in [6]. This paper will focus on a space division switch fabric which has a dedicated channel for each fabric output port. Such switches are typically implemented using a cross-bar architecture [7]. The switch fabric normally operates at a rate S times faster than the I/O modules [4], where S is called the speedup.

As discussed in detail in [4], there are a number of different strategies for the structure of queues. The CIOQ architecture studied in this paper, shown in Fig. 1, is used in many commercial high-performance switches. To avoid head-of-line (HOL) blocking [4], each input interface has a separate queue for packets destined for different output interfaces. An example CIOQ I/O module is shown in Fig. 2. There are four separate input queues for each of the two input interfaces, which allow the switch to have four different output interfaces. The module has two output interfaces, and therefore has two output queues.

The remainder of this paper is organized as follows: in Section 2 we describe the proposed architecture and in Section 3 we state the capacity constraints that this architecture imposes on the switch design.
In Section 4 we describe a method for modeling the multiple queue switch, which we simulate in Section 5.

2. PROPOSED RB AQM ARCHITECTURE

Fig. 3 illustrates an example of the proposed architecture of an RB AQM switch. We conceptually separate the input and output functionality of an I/O module into separate virtual input and output modules interconnected by a switch fabric. For each physical output port, there is one instance of the AQM control algorithm.

In our analysis, the switch fabric is abstracted from its physical implementation. The parameters which completely define the switch fabric are the input port capacity $C_{Ai}$ for each input port $i$, and the output port capacity $C_{Bj}$ for each output port $j$. We assume that, given these capacity constraints, the switch fabric is able to forward any input to output port traffic.

There are $I$ input modules in the switch. Each input module $d$ has a set of input interfaces $I_d$. Each input interface $n$ is an element in the set $I_d$ and has capacity $C_n$. The aggregate traffic arriving at input interface $n$ is $\lambda_n$. The traffic arriving at input interface $n$ destined for output interface $m$ is $\lambda_{nm}$. Each input module has a separate queue for each output interface destination. Each input module $d$ has a work conserving scheduler $S_d$, for example a round robin scheduler, which determines the next packet to forward to the switch fabric input port. The aggregate traffic forwarded from input module $d$ to a switch input port is $\lambda_{Ad}$.

There are $O$ output modules in the switch. Each output module $e$ has a set of output interfaces $O_e$. Each output interface $m$ is an element in the set $O_e$ and has capacity $C_{Om}$. The traffic output by an output interface $m$ is $\lambda_{Om}$. The traffic arriving at the input of the output module $e$ from the switch fabric output port is $\lambda_{Be}$.

[Fig. 3. RB AQM switch architecture.]
Since each AQM has the structure given by (1), the total packet arrival rate $\lambda_m(t)$ and packet backlog $B_m(t)$ destined for output interface $m$, as well as the output interface capacity $C_{Om}(t)$, must be measured. The capacity $C_{Om}(t)$ can be estimated by measuring the packet size and transmission time, and the derivative term $\dot{\lambda}_m(t)$ can be calculated from a measurement of $\lambda_m(t)$. To measure $\lambda_m(t)$ and $B_m(t)$, each input module must measure and store $\lambda_{nm}(t)$ and the backlog of packets in
input interface $n$ destined for output interface $m$, $B_{nm}(t)$. These $B_{nm}(t)$ and $\lambda_{nm}(t)$ measurements need to be communicated from the input module to the AQM algorithm at most once per packet arrival. The total packet arrival rate $\lambda_m(t)$ and backlog $B_m(t)$ are calculated as inputs into the AQM algorithm:

$$\lambda_m(t) = \sum_n \lambda_{nm}(t) \tag{2}$$

$$B_m(t) = \sum_n B_{nm}(t) + B_{Bm}(t) \tag{3}$$

where $B_{Bm}(t)$ is the packet backlog in the output interface. Now that each input of the AQM has been described, the AQM (1) can be evaluated to update the marking/dropping rate $P_m(t)$.

The marking and dropping strategies differ. For ECN capable packets, the mark-front strategy investigated in [5] reduces the feedback time of the congestion signal. Therefore ECN capable packets are randomly marked in the output module at the head of the output queue. Non-ECN capable packets are randomly dropped, and this should be done at the earliest point in the switch. This prevents a packet which will ultimately be dropped from being forwarded through the switch, and avoids wasting switch fabric and buffer capacity.

3. RB AQM SWITCH FABRIC CONSTRAINTS

In this section we formally specify the switch fabric capacity requirements for controlling congestion with a single AQM controller per output port. Let us assume end-to-end flow control is stable and source rates converge to a steady state at or below the network capacity. We will show that, as long as the output interface capacity is the only bottleneck in the steady state, a single AQM per output port design is sufficient. The two conditions for this are:

1. For any input module $d$, the switch fabric input port capacity $C_{Ad}$ connecting to it must be able to carry all of the incoming traffic from module $d$:

$$C_{Ad} \ge \sum_{n \in I_d} C_n, \quad d = 1, 2, \ldots, I \tag{4}$$

2. For any output module $e$, the switch fabric output port capacity $C_{Be}$ driving the module must have enough capacity to fully load the physical capacity of all of the output interfaces of the module:

$$C_{Be} > \sum_{m \in O_e} C_{Om}, \quad e = 1, 2, \ldots, O \tag{5}$$

In steady state, with hats denoting steady-state rates:

$$C_{Om} \ge \hat{\lambda}_m = \sum_n \hat{\lambda}_{nm} \tag{6}$$

The AQM holds each output interface at $\hat{\lambda}_m = u\,C_{Om}(t)$ with $u < 1$, so (6) is satisfied, and combining it with (5) gives:

$$C_{Be} > \sum_{m \in O_e} \sum_n \hat{\lambda}_{nm} \tag{7}$$

Therefore we have shown that, given the minimum capacity conditions of the switch fabric (4) and (5), a single AQM at the output interface is capable of controlling the traffic to match it to the single bottleneck. When the traffic is not in steady state, (6) and (7) may be violated. Note that (7) cannot be violated without violating (6). The AQM always controls the congestion signal to steer the system towards $\lambda(t) = u\,C(t)$, and so restores (6). By restoring (6), (7) is also restored. So, although the switch fabric output port capacity may become a second bottleneck in transient conditions, it is only necessary to perform congestion control on the one bottleneck.

4. TWO QUEUE MODEL

To simulate the switch shown in Fig. 3 in transient conditions, the model is simplified. In this section we show that, for the purpose of analyzing the amount of backlog in the input and output queues, the multi-queue switch is equivalent to a two queue network, as shown in Fig. 4.

The growth of an input interface queue that buffers traffic from input interface $n$ destined for output interface $m$ is the arrival rate into this queue, $\lambda_{nm}(t)$, minus the service rate of the queue. Since by (4) the only possible bottleneck in serving the $nm$ input queue is the switch fabric output port, the available capacity of this port is the service rate available for the $nm$ input interface queue. We will assume worst case conditions in our analysis. The least amount of capacity available at the switch fabric output port for an output interface $m$ occurs when all other output interfaces on the same module are fully loaded:

$$\mathit{MinB}_m = C_{Be} - \sum_{y \in O_e,\; y \ne m} C_{Oy}$$

where $e$ is the output module of interface $m$. Then the growth of the input interface $n$ buffer with packets for the output interface $m$ is:

$$\dot{B}_{nm} = \lambda_{nm} - \Big( \mathit{MinB}_m - \sum_{k \ne n} \lambda_{Akm} \Big)$$

where $\lambda_{Akm}(t)$ is the traffic rate from input queue $k$ to output queue $m$.
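The two fabric capacity conditions of Section 3 and the worst-case fabric capacity used in this section can be checked numerically. The Python sketch below is illustrative only: the function names, the list-based layout of the capacities and the example values are assumptions of this sketch, with all capacities expressed in the same (arbitrary) unit.

```python
def fabric_conditions_hold(c_a, input_caps, c_b, output_caps):
    """Check the two switch fabric capacity conditions.

    c_a[d]          fabric input port capacity of input module d
    input_caps[d]   capacities of the input interfaces of module d
    c_b[e]          fabric output port capacity of output module e
    output_caps[e]  capacities of the output interfaces of module e
    """
    # Condition 1: each fabric input port carries all traffic of its module.
    inputs_ok = all(c_a[d] >= sum(input_caps[d]) for d in range(len(c_a)))
    # Condition 2: each fabric output port can fully load its module (strict).
    outputs_ok = all(c_b[e] > sum(output_caps[e]) for e in range(len(c_b)))
    return inputs_ok and outputs_ok


def min_fabric_capacity(c_be, output_caps_e, m):
    """Worst-case fabric output capacity left for interface m of one module:
    the port capacity minus the full capacity of every other interface."""
    others = sum(c for y, c in enumerate(output_caps_e) if y != m)
    return c_be - others


if __name__ == "__main__":
    # One input module and one output module, two interfaces of capacity 10 each.
    print(fabric_conditions_hold([20], [[10, 10]], [30], [[10, 10]]))
    print(min_fabric_capacity(30, [10, 10], 0))
```

Whenever the second condition holds, the worst-case capacity returned for an interface is strictly larger than that interface's own capacity, which is why the output interface remains the bottleneck even when its neighbours are fully loaded.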
Let us sum the growth of all of the input queues with traffic destined for output interface $m$ in all of the input modules:

$$\dot{B}_{Am} = \sum_n \dot{B}_{nm} \tag{8}$$

where $\dot{B}_{Am}$ represents the growth rate of a virtual aggregate input queue for output interface $m$. Note that the total traffic from the input interfaces $n$ destined for output interface $m$, $\sum_k \lambda_{Akm}$, is at most the capacity of the switch fabric output port available, $\mathit{MinB}_m$, assuming all other output interfaces on the output module are fully loaded:

$$\sum_k \lambda_{Akm} \le \mathit{MinB}_m$$

Then the worst case growth rate of the virtual input queue for output interface $m$, and the growth rate of the $m$-th output queue, are:

$$\dot{B}_{Am} = \sum_n \lambda_{nm} - \sum_k \lambda_{Akm}, \qquad \dot{B}_{Bm} = \sum_k \lambda_{Akm} - C_{Om}$$

The dynamics of the total backlog in the switch for output interface $m$ can now be expressed as the sum of the dynamics of two separate queues:

$$\dot{B}_m = \dot{B}_{Am} + \dot{B}_{Bm} \tag{9}$$

Note that (9) only describes the amount of backlog in the switch, and not the ordering of packets, since the behavior of the schedulers in the switch was not modeled.

[Fig. 4. Model of switch.]

5. RB AQM SIMULATION

RB AQM and droptail switches were both simulated in two scenarios with TCP traffic. The network topology for both experiments is defined in Fig. 5. The switch fabric speedup is S. The switch has four input interfaces connected to links with different propagation delays. Each source node S1 to S4 is capable of hosting an arbitrary number of TCP sessions. Each TCP session originates from a random source, S1 to S4, and terminates at D. When on, TCP sessions are saturated. TCP sessions have uniform random duration from 3 to 3 sec. The input and output queue sizes are 2 packets each.

[Fig. 5. Model of RB AQM switch: sources S1 to S4 feed a virtual input buffer, which feeds the output buffer towards destination D.]

The RB AQM used in the switch is the GREEN AQM:

$$P_m(t+1) = P_m(t) + \Delta P_m(t)\, U\big(\lambda_m(t) - u\,C_{Om}(t)\big)$$

where

$$U(x) = \begin{cases} +1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$

and

$$\Delta P_m(t) = \max\big(\mathrm{abs}(\alpha(\lambda_m(t) - u\,C_{Om}(t))),\; k\big)$$

where, for output interface $m$, $u$ controls the target utilization, $\alpha$ is the control gain, and $k$ is a constant which limits the minimum adjustment to $P_m(t)$. The values of $P_m(t)$, $\lambda_m(t)$ and $C_{Om}(t)$ are updated with every packet arrival. For both experiments α =3x -6, k =x -, u =.98. All sources were ECN enabled.

5.1 EXPERIMENT 1: EFFECT OF SPEEDUP

RB AQM and droptail switches were simulated.
The mean backlog for a range of switch fabric speedups S was measured, and is shown in Fig. 6 and 7. For each speedup, the mean backlog was taken over the simulation period. During the simulation period 2 TCP sessions were started and stopped, with uniform random start times over the simulation period. Notice that for the full range of speedup S, the RB AQM switch has substantially lower mean total backlog than the droptail switch. As discussed, the droptail switch maintains full buffers because it is unable to signal congestion until the buffer overflows. For both the droptail and RB AQM switches, a speedup of 2 results in almost all of the backlog being in the output queue. This means that for such speedups the output buffers should be larger than the input buffers, and that any service differentiation for scheduling of packets needs only to be performed at the output queue, since this is where almost all queuing delay occurs.

5.2 EXPERIMENT 2: EFFECT OF LOAD

RB AQM and droptail switches were simulated with speedup S = 2. The mean backlog and output link utilization were measured for a range of traffic loads, shown here in Fig. 8, 9 and 10. A total of N TCP sessions, where N = 2, 4 22, were started and stopped with uniform random start times over the simulation period. Note that the utilization of the RB AQM trial was slightly lower than that of the droptail trial, and the backlog of the RB AQM trial was significantly lower. This agrees with the fundamental result of queueing theory that, for a random arrival process, reducing the mean backlog reduces utilization. The droptail queue maintains a very high utilization, which the RB AQM reduces slightly in exchange for a significant reduction in packet backlog. Given a maximum queue size, RB AQM can control the tradeoff between utilization and backlog (here with the $u$ parameter [9]), unlike the droptail queue.
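The GREEN update used by the simulated switch in Section 5 can be sketched in a few lines of Python. This is a minimal sketch, not the simulator used for the experiments: the function names, the clamping of P to [0, 1], the toy source model (arrival rate proportional to 1 - P, which is not a TCP model) and all numeric values are assumptions chosen for illustration.

```python
def green_step(p, arrival_rate, capacity, u, alpha, k):
    """One GREEN update: step P up when the arrival rate is at or above the
    target u*C and down otherwise, by max(|alpha * error|, k)."""
    error = arrival_rate - u * capacity
    delta_p = max(abs(alpha * error), k)
    direction = 1.0 if error >= 0 else -1.0
    # Clamping to [0, 1] is an assumption of this sketch.
    return min(1.0, max(0.0, p + delta_p * direction))


def toy_closed_loop(steps, offered=20.0, capacity=10.0, u=0.9,
                    alpha=0.01, k=0.001):
    """Drive green_step with a hypothetical source whose rate falls linearly
    with P. Returns the arrival rate after the given number of updates."""
    p = 0.0
    for _ in range(steps):
        arrival = offered * (1.0 - p)
        p = green_step(p, arrival, capacity, u, alpha, k)
    return offered * (1.0 - p)


if __name__ == "__main__":
    # With these illustrative values the rate settles close to u * capacity.
    print(toy_closed_loop(200))
```

In this toy loop the arrival rate settles near the target u times capacity and then hovers around it in steps set by k, which reflects the role of k as the minimum adjustment to P.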
6. CONCLUSION

This paper presented an architecture for implementing RB AQM in a CIOQ switch. Only a single AQM controller per output port was shown to be required. Simulations were performed to verify the proposed architecture and compare its performance to the existing droptail architecture. It was shown that RB AQM can achieve a low mean backlog whilst maintaining a high utilization.

[Fig. 6. Droptail AQM backlog: mean backlog (pkt) of Bb (out) and Ba (inp) vs. speedup factor.]

[Fig. 7. RB AQM backlog: mean backlog (pkt) of Bb (out) and Ba (inp) vs. speedup factor.]

[Fig. 8. Utilisation vs. load (TCP sessions started), RB and droptail.]

[Fig. 9. Backlog vs. load (droptail): Bb (inp) and Ba (out) vs. load (TCP sessions started).]

[Fig. 10. Backlog vs. load (RB AQM): Bb (inp) and Ba (out) vs. load (TCP sessions started).]

REFERENCES

1. S. H. Low and D. E. Lapsley, "Optimization Flow Control, I: Basic Algorithm and Convergence", IEEE/ACM Transactions on Networking, vol. 7, part 6, pp. 861-875, Dec. 1999.
2. S. Floyd and V. Jacobson, "Random early detection gateways for congestion avoidance", IEEE/ACM Transactions on Networking, 1(4):397-413, August 1993.
3. Hewlett-Packard Product Literature, "HP ProCurve Switch 5300xl Series Reviewer's Guide", 2002. http://www.hp.com/go/hpprocurve
4. Emilio Leonardi, Marco Mellia, Fabio Neri and Marco Ajmone Marsan, "On the stability of input-queued switches with speed-up", IEEE/ACM Transactions on Networking, vol. 9, no. 1, pp. 104-118, 2001.
5. Chunlei Liu and Raj Jain, "Improving Explicit Congestion Notification with the Mark-Front Strategy", Computer Networks, vol. 35, no. 2-3, pp. 185-201, February 2001.
6. Pankaj Gupta, "Scheduling in Input Queued Switches: A Survey", June 1996, unpublished manuscript, http://klamath.stanford.edu/~pankaj/research.html
7. Vinita Singhal and Robert Le, "High-Speed Buffered Crossbar Switch Design Using Virtex-EM Devices", Xilinx Application Note, www.xilinx.com, 2000.
8. S. T.
Chuang, A. Goel, N. McKeown and B. Prabhakar, "Matching output queueing with combined input and output queueing", IEEE J. Select. Areas Commun., vol. 17, pp. 1030-1039, Dec. 1999.
9. Bartek Wydrowski and Moshe Zukerman, "GREEN: An Active Queue Management Algorithm for a Self Managed Internet", Proceedings of ICC 2002, New York, vol. 4, pp. 2368-2372, 2002.
10. K. B. Kim and S. H. Low, "Analysis and design of AQM for Stabilizing TCP", Caltech Technical Report CaltechCSTR:2002.009, March 2002.
11. K. K. Ramakrishnan, S. Floyd and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", IETF RFC 3168, Proposed Standard, September 2001.