Router-based Congestion Control through Control Theoretic Active Queue Management

Alberto Giglio

Master's Degree Project
Stockholm, Sweden 2004
IR-RT-EX-421
Contents

1 Introduction
  1.1 TCP basics
  1.2 Active Queue management - AQM
  1.3 Thesis objectives
  1.4 Thesis overview
2 State of the art
  2.1 TCP Congestion Control
    2.1.1 Limits of the end-to-end congestion control
  2.2 AQM algorithms
  2.3 Explicit Congestion Notification - ECN
  2.4 Random Early Detection - RED
  2.5 Control Theoretic Designs
    2.5.1 The model of TCP dynamics
    2.5.2 Feedback structure description
    2.5.3 Stability analysis
    2.5.4 Performance comparison
3 Model analysis
  3.1 Stochastic modelling
  3.2 Control theoretical analysis
    3.2.1 Stability range
    3.2.2 Time delay
  3.3 Algorithms overview
4 NS-2 and simulation settings
  4.1 NS-2
    4.1.1 Which language?
  4.2 Description of the experiments
    4.2.1 Experiment 1
    4.2.2 Experiment 2
    4.2.3 Experiment 3
    4.2.4 Experiment 4
    4.2.5 Elephant vs. Mice traffic
    4.2.6 Experiment 5
    4.2.7 Experiment 6
    4.2.8 Experiment 7 - TCP vs. UDP
5 The PID controller
  5.1 Introduction
    5.1.1 Design based on the network behavior
    5.1.2 Targets in the design
  5.2 The PID tuning
    5.2.1 PID tuning methods
    5.2.2 The resulting compensator
    5.2.3 Result analysis
  5.3 Digital implementation
    5.3.1 Approximation methods
    5.3.2 Approximation choice
  5.4 The simulation results
6 Internal Model Control - IMC
  6.1 Closed-loop criteria
  6.2 IMC design
  6.3 Controller design and tuning
  6.4 Static design vs. Adaptive
  6.5 Algorithm implementation
  6.6 The simulation results
  6.7 Adaptive implementation
    6.7.1 Tracking and estimating parameters
  6.8 The theoretical results
  6.9 The simulation results
7 Optimal control
  7.1 Sensitivity function
  7.2 Linear Quadratic control
  7.3 Stability analysis
  7.4 Observer
  7.5 The resulting controller
  7.6 The simulation results
8 Conclusions
  8.1 Future work
A NS-2
B OTcl Code
Chapter 1 Introduction

The Transmission Control Protocol (TCP) is the most widely used transport-layer protocol in the Internet. It provides reliable end-to-end transmission of packets and runs on top of the IP protocol, which is mainly responsible for the addressing and fragmentation of the data. Its main feature is error-free data transmission, which is fundamental for most data exchanges; on the other hand, there is also a fast-growing demand for best-effort transmission, caused by the coexistence of data and multimedia applications on the Internet. The transport protocol that serves this kind of traffic is the User Datagram Protocol (UDP).

In Figure 1.1 the transport protocols are placed in the TCP/IP stack and compared to the standard layers defined by the Open Systems Interconnection (OSI) reference model. The OSI stack is an architectural model that describes the functions of data communication protocols: each layer is self-contained and provides services to the layer above using the services of the layers below.

1.1 TCP basics

TCP ensures error-free data transmission through a feedback system based on acknowledgments of received packets. Each packet sent by the end-point A carries a sequence number and is routed along one of the possible paths to its destination end-point B; the sequence number is needed because packets can arrive out of order, depending on the routes chosen through the network. When a packet is received, B sends back an acknowledgment (ACK) to A, confirming the packet arrival. A timeout is started when the packet is sent from A: if the ACK is not received before the timeout expires, the packet is sent again.

The fundamental dynamics of TCP are window-based. The congestion window is the number of packets the transmitting source can send at the same time without waiting for a cumulative ACK.
This specification provides a high utilization of the available bandwidth and is the basis of the congestion control mechanism: when a bottleneck link discards packets because of an overflow in its routing buffer, or a packet is lost or arrives corrupted at the end-point, the sender A reduces its congestion window and hence decreases the load it offers to the network. This dynamic is called additive increase - multiplicative decrease, and it satisfied the Internet requirements until the beginning of the 1990s. However, the exponential growth of Internet usage, together with the larger size of transferred files and multimedia applications, has drastically increased the load in the network. This leads to a higher probability of congestion (longer queues at the router buffers), increases the Round Trip Time (RTT) and requires several retransmissions of packets when queues saturate.

[Figure 1.1: OSI and TCP/IP layer structure with relative protocols. OSI layers: Application, Presentation, Session, Transport, Network, Data Link, Physical. TCP/IP layers: Application (FTP, HTTP, TELNET), Transport (TCP, UDP), Network (IP), Data Link, Physical.]

1.2 Active Queue management - AQM

To counteract the network performance deterioration caused by the increased load, new TCP specifications have been proposed (e.g. Tahoe, Reno, Vegas), but they still provide only end-to-end congestion control functionality and leave the routers with a drop-tail policy. The new challenge has moved to queue management: instead of waiting until congestion is present, i.e. for timeouts or triple duplicate ACKs, before decreasing the transmission window of A, the aim of Active Queue Management (AQM) is to prevent full queues by signalling congestion to the transmitter before it happens.

In [6] a short analysis of some proposals for AQM is presented: from simple drop-tail modifications to control-theoretic designs, through the Random Early Detection (RED) proposals. Different points of view can be found in the scientific community, but it seems clear that more complex implementations of control theoretic designs will lead to better results, also taking into consideration the increased computational power of routers and the lower cost of memory.
1.3 Thesis objectives

The aim of this thesis is to analyze whether the control of the existing models developed in [2] can be improved and to test alternative solutions, evaluating them in network simulations. The discrete simulator NS-2 [23] is used to test the implementations of the models studied in theory: the challenge is to obtain an algorithm that ensures stable queue behavior and fast response to perturbations in the network.

1.4 Thesis overview

This thesis is organized as follows. Chapter 2 describes the end-to-end congestion control performed by the TCP protocol and the improvements offered by router-based schemes. The model of the TCP flows competing for their routing is analyzed in Chapter 3, where the limits to possible improvements are set. Chapter 4 describes the network simulator NS-2 and the experiment settings for testing the algorithms. A Proportional-Integral-Derivative controller, an improvement on the PI controller described in [2], is analyzed and tested in Chapter 5. Chapter 6 describes a static and an adaptive algorithm based on a common Internal Model Control design. Internal disturbances are taken into consideration with a Linear Quadratic controller in Chapter 7, in order to balance the effects of the state feedback. For each algorithm a theoretical description is given, then the simulation results are presented together with those of the chosen benchmark (the PI controller). Chapter 8 summarizes the results, presents the final conclusions of the thesis work and suggests possibilities for future improvement.

For a more complete description, a short guide to the network simulator NS-2 is given in Appendix A; the OTcl code for the simulations is included in Appendix B.
Chapter 2 State of the art

The widespread use of the TCP transport protocol in the Internet and the growing volume of data sent through the Web have in recent years stimulated the interest of researchers in the new problems of network congestion and in improving the performance of the standard protocols. Backward compatibility is a great challenge in this field.

2.1 TCP Congestion Control

In order to understand the dynamics of router-based congestion control, the end-to-end TCP congestion control must first be introduced, as it represents the basic layer of control in every TCP data exchange. As stated in the introductory chapter, TCP is a transport-layer protocol, and it is the first layer in the stack that works end-to-end (see Figure 2.1).

[Figure 2.1: End-to-end communication for the transport protocol TCP. The end-points A and B implement the Application, Transport, Network, Data Link and Physical layers; Router 1 and Router 2 implement only the Network, Data Link and Physical layers.]

The main modes in one of the standard TCP versions (TCP New Reno) are:

- Slow Start
- Congestion Avoidance
- Fast Retransmit
- Fast Recovery

The first two are used by the TCP sender to control the amount of data being injected into the network. The congestion window (cwnd) is a sender-side limit on the amount of data the sender can transmit into the network before receiving an ACK, while the receiver's advertised window (rwnd) is a receiver-side limit on the amount of outstanding data. The minimum of cwnd and rwnd governs data transmission. The slow-start algorithm is used to probe the available capacity of an unknown link at the beginning of a transfer, or after a congestion avoidance round, in order not to congest the network. In case of a timeout event or a duplicate ACK, the congestion avoidance algorithm halves the transmission window (cwnd). A variable is added to the TCP per-connection state to determine whether the slow-start or the congestion avoidance algorithm is used, in a fully cooperative environment.

[Figure 2.2: Time-flow events for retransmission causes: time-out and triple-duplicate ack]

The fast retransmit algorithm is used by the TCP sender to react to losses on the channel, and it is started upon the arrival of 3 duplicate ACKs. It forces the retransmission of the supposedly missing packet before the time-out occurs. After the single retransmission, the fast recovery algorithm governs the sending of new data, in order to increase the utilization of the available bandwidth: in case of a triple duplicate ACK, in fact, the use of the slow-start algorithm would strongly decrease the sending rate, while the congestion on the link has already been resolved. A detailed description of the algorithms and the protocol can be found in [8].
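The interplay of the four modes can be sketched with a per-round window update. This is only an illustrative simplification, not the New Reno specification: the function name `cwnd_trace`, the per-round granularity, the slow-start threshold variable `ssthresh` with a floor of 2 packets, and the collapse of fast retransmit and fast recovery into a single halving step are all assumptions of this sketch.

```python
def cwnd_trace(n_rounds, td_loss_rounds, timeout_rounds, ssthresh=32.0):
    """Return the congestion window (in packets) at the start of each round.

    td_loss_rounds: rounds in which a triple duplicate ACK is detected.
    timeout_rounds: rounds in which a retransmission timeout fires.
    """
    cwnd = 1.0
    trace = []
    for rnd in range(n_rounds):
        trace.append(cwnd)
        if rnd in timeout_rounds:
            # Timeout: remember half the window and restart from slow start.
            ssthresh = max(cwnd / 2.0, 2.0)
            cwnd = 1.0
        elif rnd in td_loss_rounds:
            # Triple duplicate ACK: fast retransmit + fast recovery,
            # simplified here to halving the window without slow start.
            ssthresh = max(cwnd / 2.0, 2.0)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: the window doubles every round.
            cwnd = min(cwnd * 2.0, ssthresh)
        else:
            # Congestion avoidance: one extra packet per round.
            cwnd += 1.0
    return trace


if __name__ == "__main__":
    print(cwnd_trace(12, td_loss_rounds={8}, timeout_rounds=set(), ssthresh=16.0))
```

The resulting trace has the sawtooth shape of Figure 2.3: exponential growth up to the threshold, linear growth afterwards, and a halving at the duplicate-ACK event.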
[Figure 2.3: Cwnd evolution for TCP New Reno (cwnd in bytes versus time, showing the slow start, congestion avoidance, single retransmission (fast retransmit) and fast recovery phases)]

2.1.1 Limits of the end-to-end congestion control

An analysis of the behavior of aggregate TCP flows is presented in [9]: this research shows how the TCP behavior depends on the system considered, leading to a variable level of performance. In an environment that is constantly changing and growing, such as the modern Internet, there is a need for a standard behavior under different network conditions.

Moreover, TCP congestion control acts merely as an end-to-end system and does not take into consideration the network that carries the packets. Its only aim is to perform a reliable, error-free data exchange, and the algorithms that work to get this result are obliged to avoid link congestion. This policy performs late corrections and cannot ensure a stable steady-state load in the network.

AQM was developed from the network point of view, in order to work in cooperation with the TCP protocol, but also as a distributed mediator for the throughput maximization performed by the TCP protocol. In the following sections some of the most interesting proposals for AQM are presented, in order to give a broader view of the ongoing research and to open the way for this thesis work.

2.2 AQM algorithms

In the following sections some of the main steps in AQM research are described, in order to give a general idea of the problem:

- Explicit Congestion Notification - ECN
- Random Early Detection - RED
- Control Theoretic Algorithms

2.3 Explicit Congestion Notification - ECN

ECN is a functionality that has been designed to work together with AQM algorithms in order to improve their performance. It requires modifications in both the IP and the TCP packet headers.
[Figure 2.4: The ToS/TC octet. Bits 0-5: DSCP - Differentiated Services; bits 6-7: ECN (ECT and CE).]

  ECT  CE   Meaning
  0    0    Non ECT-capable
  0    1    Non ECT-capable and packet dropped
  1    0    ECT-capable and no congestion
  1    1    Congestion Experienced

The AQM algorithm implemented in a router, instead of dropping incoming packets to prevent congestion, can set the CE (Congestion Experienced) codepoint in the IP header of the packet and route it to the next hop. The routers that subsequently receive the packet with the CE bit set do not modify the field and treat the packet as a normal one. In order to avoid the loss of information, ECN-capable packets should have the DF (Don't Fragment) bit set. The destination end-point acknowledges the received packet but sets the congestion bit in the TCP ACK, so that the sender can react in the same way as for a packet loss (halving the window size), with the advantage of not having lost the packet.

At the IP level, two bits are added in the packet header: the ECN-Capable Transport (ECT) bit and the CE bit. The first indicates whether the end-points of the transport protocol are ECN-capable, while the second signals a congestion problem in the network. The two bits are placed inside the ToS (Type of Service) octet of IPv4 or inside the TC (Traffic Class) octet of IPv6 and occupy bits 6 and 7, as shown in Figure 2.4. According to the standardization of the IP header, bits 6 and 7 were defined as Currently Unused, and an experimental use of the ECN bits has been allowed to use that space (RFC278). The position of the field inside the IP header is shown in Figure 2.5.

As shown in Figure 2.1, the IP protocol manages the routing operations at each hop, while TCP works on end-to-end communication. Following the classical TCP negotiation scheme (handshake), the end-points must first declare whether they are ECN-capable.
The chain of events that follows a congestion notification is the following:

- If the sender is ECN-capable, the ECT codepoint is set in the IP packet headers
- An ECN-capable router, which decides to drop one incoming packet, looks at the ECT flag and sets the CE, forwarding the packet
- The receiver checks the CE codepoint and sets the ECN-Echo - ECE (Figure 2.6) in the next TCP ACK sent to the sender
- The sender receives the packet and reacts as if a packet had been dropped
- The TCP header of the next packet has the CWR bit set in order to acknowledge the reception of the ECN notification
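The chain of events above can be sketched as a small simulation. It is a minimal illustration under stated assumptions, not an implementation of any real stack: the classes `Packet`, `Ack` and `Sender` and the functions `router_forward` and `receiver_ack` are hypothetical names, the headers are reduced to the ECT, CE, ECE and CWR bits, and the sender's reaction is modelled as a plain halving of the window.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    ect: bool = False  # ECN-Capable Transport bit (IP header)
    ce: bool = False   # Congestion Experienced bit (IP header)
    cwr: bool = False  # Congestion Window Reduced flag (TCP header)


@dataclass
class Ack:
    ece: bool = False  # ECN-Echo flag (TCP header)


def router_forward(pkt, wants_to_drop):
    """Return the forwarded packet, or None if the router drops it."""
    if not wants_to_drop:
        return pkt
    if pkt.ect:
        pkt.ce = True  # mark instead of dropping
        return pkt
    return None        # non-ECN traffic is dropped as usual


def receiver_ack(pkt):
    """The receiver echoes the CE codepoint back in the ECE flag."""
    return Ack(ece=pkt.ce)


class Sender:
    def __init__(self, cwnd=16.0, ecn_capable=True):
        self.cwnd = cwnd
        self.ecn_capable = ecn_capable
        self.cwr_pending = False

    def send(self):
        # The next packet after a notification carries the CWR flag.
        pkt = Packet(ect=self.ecn_capable, cwr=self.cwr_pending)
        self.cwr_pending = False
        return pkt

    def on_ack(self, ack):
        if ack.ece:
            # React as if a packet had been dropped: halve the window.
            self.cwnd = max(1.0, self.cwnd / 2.0)
            self.cwr_pending = True


if __name__ == "__main__":
    sender = Sender(cwnd=16.0)
    marked = router_forward(sender.send(), wants_to_drop=True)
    sender.on_ack(receiver_ack(marked))
    print(sender.cwnd, sender.send().cwr)
```

The point of the sketch is that the congestion signal reaches the sender without the packet being lost: the marked packet is still delivered, and only the window is reduced.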
[Figure 2.5: The IP packet header (fields shown: IHL, ToS, total length, identification, fragment offset, TTL, protocol, header checksum, source address, destination address, options, data)]

[Figure 2.6: The TCP packet header (header length, reserved bits and the flags CWR, ECE, URG, ACK, PSH, RST, SYN, FIN)]

ECN has problems when there is a tunnel connection: in IP security protocols, such as IPsec, the IP header is encapsulated in an outer packet, and the inner fields cannot be changed by the hops in the tunnel (in order to prevent malicious modification). If a router inside the tunnel wants to signal incoming congestion, the only way to do it is to drop packets. One solution is to negotiate the use of ECN during the IPsec handshakes at the tunnel endpoints, so that its use can be disabled by a security administrator in case the risks outweigh the benefits.

2.4 Random Early Detection - RED

The aim of the RED design, which was presented in 1993 [12], is congestion avoidance through low delay and high throughput in the network. The average queue size is kept low in order to minimize the time delay and to allow bursty traffic to enter the queue without overloading it. Another challenge is to break the synchronization of flows (which is what happens in drop-tail systems) and to avoid periodic fluctuations. In order to avoid the bias against bursty traffic, a randomized marking function is used, with the effect of a dropping probability proportional to the bandwidth usage. RED is usually described in cooperation with a normal TCP implementation, but it can also be applied along with an ECN policy. The pseudo-code of the RED algorithm is the following:

    for each packet arrival
        calculate the average queue size avg
        if min_th <= avg < max_th
            calculate probability p_a
            with probability p_a:
                mark the arriving packet
        else if max_th <= avg
            mark the arriving packet
        end if
    end for

[Figure 2.7: RED dynamics (low-pass filter $K/(s+K)$ followed by the marking profile $p(q)$ defined by $min_{th}$, $max_{th}$ and $p_{max}$)]

The first action RED performs is to calculate the average queue size: this can be done according to the queue size either in bytes or in number of packets. Measuring in bytes accounts for the different types of TCP connections (there is a big difference between packet sizes for HTTP and FTP connections) and reflects the delay in the queue more closely. In order to average the queue length, a filter is used: the parameter $w_q$ ($0 \le w_q \le 1$), which weights the current queue measurement against the previous average value, represents the time constant of the low-pass filter that performs the averaging (EWMA - Exponential Weighted Moving Average). This parameter is difficult to tune, as a high value makes the system more responsive but lets large transient fluctuations and bursty traffic affect the estimate of queue congestion.

The other two main parameters in RED are the thresholds $min_{th}$ and $max_{th}$, which delimit the regions of random dropping and full dropping of the incoming packets. In Figure 2.7 the block diagram of the RED structure is represented as a cascade of a low-pass filter and a non-linear gain element. Another important parameter is $p_{max}$, which represents the dropping probability at the maximum threshold. For the general RED, the curve between the two thresholds is linear and its gradient $L_{RED}$ can be easily derived from the other parameters: $L_{RED} = p_{max} / (max_{th} - min_{th})$.

All the previously introduced parameters must be tuned in order to satisfy the performance requirements in different working conditions. The difficulty of reaching common criteria for RED tuning has led researchers either to look for modifications of the main system or to avoid its use.
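The pseudo-code and the parameters described above can be put together in a short runnable sketch. It is a simplified illustration, not the reference RED implementation: the class name `RedQueue` and its method names are hypothetical, the queue is measured in packets, and the marking probability is taken directly from the linear profile (the original RED additionally spreads marks using a count of packets since the last mark, which is omitted here).

```python
import random


class RedQueue:
    """Simplified RED marking decision (illustrative sketch)."""

    def __init__(self, min_th, max_th, p_max, w_q, seed=None):
        self.min_th = min_th
        self.max_th = max_th
        self.p_max = p_max
        self.w_q = w_q
        self.avg = 0.0
        self.rng = random.Random(seed)

    def marking_probability(self):
        """Piecewise-linear profile: 0 below min_th, 1 from max_th upwards."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        slope = self.p_max / (self.max_th - self.min_th)  # gradient L_RED
        return slope * (self.avg - self.min_th)

    def on_arrival(self, queue_len):
        """Update the EWMA of the queue length and decide whether to mark."""
        self.avg = (1.0 - self.w_q) * self.avg + self.w_q * queue_len
        return self.rng.random() < self.marking_probability()


if __name__ == "__main__":
    red = RedQueue(min_th=5, max_th=15, p_max=0.1, w_q=0.2, seed=1)
    for q in [0, 4, 8, 12, 16, 20, 20, 20]:
        marked = red.on_arrival(q)
        print(q, round(red.avg, 2), marked)
```

With a small `w_q` the average reacts slowly to the instantaneous queue, which is the trade-off between responsiveness and sensitivity to bursts discussed above.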
Many proposals for modified versions of RED have been studied and tested in simulations, with various results: some of them are Adaptive RED [13], [14], Gentle RED [15] and Blue [16]. RED can reach good performance compared to traditional end-to-end congestion control, but its response is quite slow and its steady-state behavior is oscillatory because of the two thresholds. Recently, other solutions that use control theory to model and control the system have been proposed. Previous research in this field is examined in the following section and used as a starting point for further analysis and experimentation.

2.5 Control Theoretic Designs

In recent years, thanks to the development of good mathematical models of TCP dynamics, a control theoretical approach to congestion avoidance has been proposed, and the debate on this subject is increasingly lively in the scientific community. An introduction to the subject is presented here and then further developed in Chapter 3.

2.5.1 The model of TCP dynamics

A mathematical model of the TCP behavior is proposed in [3], and simulations performed according to it showed an accurate match with TCP dynamics. The model is introduced here, and a complete analysis can be found in Section 3.1. It takes into consideration the two possible retransmission causes, timeouts and triple duplicate ACKs:

$$\dot{W}(t) = \frac{1}{R(t)} - \bigl(1 - Q(W)\bigr)\,\frac{W(t)\,W(t-R(t))}{2R(t-R(t))}\,p(t-R(t)) + \bigl(1 - W(t)\bigr)\,Q(W)\,\frac{W(t-R(t))}{R(t-R(t))}\,p(t-R(t))$$

$$\dot{q}(t) = \sum_{i=1}^{N} \frac{W(t)}{R(t)} - C \qquad (2.1)$$

where

- $W$ = expected TCP window size
- $q$ = expected queue length
- $R$ = round trip time
- $C$ = link capacity
- $N$ = number of TCP sessions
- $p$ = probability of packet dropping
- $t$ = time

The function $Q(W)$ determines the probability that a loss is caused by a timeout (rather than by a triple duplicate ACK), given that the window size is $W$ at the time of the loss. A simplified expression is $Q(W) = \min(1, 3/W)$.

The model is simplified in [1]: it ignores the timeout mechanism, but it is well suited for a small-signal linearization and for preparing the system for a control theoretical analysis. The resulting model is:

$$\dot{W}(t) = \frac{1}{R(t)} - \frac{W(t)\,W(t-R(t))}{2R(t-R(t))}\,p(t-R(t)) \qquad (2.2)$$

$$\dot{q}(t) = \frac{W(t)}{R(t)}\,N(t) - C \qquad (2.3)$$
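The behavior of the simplified model (2.2)-(2.3) can be explored numerically. The following is a minimal sketch under explicit assumptions that are not part of the source: a forward-Euler integration with a fixed step, a constant marking probability `p` (no AQM in the loop), a round-trip time composed of a fixed propagation delay of 0.2 s plus the queuing delay $q/C$, and a queue clipped at zero. The function name `simulate_fluid_model` and all parameter names are hypothetical.

```python
import math


def simulate_fluid_model(p, n_flows=60, capacity=3750.0, prop_delay=0.2,
                         t_end=100.0, dt=0.001):
    """Euler integration of (2.2)-(2.3) with constant marking probability p.

    Returns the final window size (packets) and queue length (packets).
    """
    steps = int(t_end / dt)
    w_hist = [1.0]          # W at each step, needed for the delayed terms
    r_hist = [prop_delay]   # R at each step
    q = 0.0
    for k in range(steps):
        w = w_hist[k]
        r = r_hist[k]
        # Index of the state one round-trip time ago (clipped at t = 0).
        d = max(0, k - int(round(r / dt)))
        w_del, r_del = w_hist[d], r_hist[d]
        dw = 1.0 / r - w * w_del / (2.0 * r_del) * p   # equation (2.2)
        dq = n_flows * w / r - capacity                # equation (2.3)
        q = max(q + dq * dt, 0.0)                      # the queue cannot be negative
        w_hist.append(w + dw * dt)
        r_hist.append(prop_delay + q / capacity)
    return w_hist[-1], q


if __name__ == "__main__":
    p0 = 2.0 / 15.375 ** 2
    w_final, q_final = simulate_fluid_model(p0)
    print(round(w_final, 3), round(q_final, 1), round(math.sqrt(2.0 / p0), 3))
```

Setting the derivatives in (2.2) and (2.3) to zero gives the equilibrium $W^2 p = 2$ and $W = RC/N$, so with a constant `p` the window should settle near $\sqrt{2/p}$ and the queue near the value for which $R = NW/C$; the simulation can be used to check this.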
[Figure 2.8: Linearized system]

In [1] equations (2.2) and (2.3) are linearized around the operating point $(W_0, q_0, p_0)$, which leads to the following:

$$\delta\dot{W}(t) = -\frac{N}{R_0^2 C}\bigl(\delta W(t) + \delta W(t-R_0)\bigr) - \frac{R_0 C^2}{2N^2}\,\delta p(t-R_0) \qquad (2.4)$$

$$\delta\dot{q}(t) = \frac{N}{R_0}\,\delta W(t) - \frac{1}{R_0}\,\delta q(t) \qquad (2.5)$$

A further simplification is possible when the TCP transmission window is much larger than 1. This assumption is reasonable for typical network conditions, and the time delay that affects the window dynamics can then be ignored. The final system is represented as a block diagram in Figure 2.8, which shows the plant of the feedback structure.

2.5.2 Feedback structure description

In control theory the output is fed back to the input in order to compensate for errors. A feedback system is roughly made of two parts: the plant, which describes the system dynamics, and a compensator, which must ensure that the system is robustly stable and has a good time response to the inputs. The AQM action can be seen as a compensator that works together with the TCP dynamics in order to stabilize them.

The block diagram with the feedback network is shown in Figure 2.9. The function $P(s)$ is the plant transfer function and takes into consideration the queue dynamics ($P_{queue}(s)$) and the TCP behavior ($P_{tcp}(s)$). The plant $P(s)$ is derived by taking the Laplace transform of equations (2.4) and (2.5) after the time delay for the window size is cancelled:

$$P(s) = \frac{\dfrac{C^2}{2N}\,e^{-sR_0}}{\left(s + \dfrac{2N}{R_0^2 C}\right)\left(s + \dfrac{1}{R_0}\right)} \qquad (2.6)$$

[Figure 2.9: Feedback block diagram (AQM controller in feedback with the plant $e^{-sR_0}P(s)$, with input $\delta p$ and output $\delta q$)]

From the control theoretical point of view, every AQM implementation can be tested on this structure: in [1] a thorough analysis of RED as the feedback block is proposed, while in [2] the performance of a Proportional controller and of a Proportional-Integral controller is tested against RED. The experiments and implementations proposed in these two articles are recreated here, and the results in the frequency domain as well as in the time domain are reported to create a benchmark for the subsequent improvements. The common test settings are the following:

- $C$ = 15 Mbps = 3750 packets/sec (packets of fixed length 500 bytes)
- $N^-$ = 60
- $R^+$ = 0.246 sec

where $N^-$ is a lower bound on the number of flows and $R^+$ is an upper limit on the round-trip time.

2.5.3 Stability analysis

The two main parameters to be checked in the frequency domain are the phase margin and the gain margin. The phase margin (PM) is defined as $\omega_{pm} + 180^\circ$, where $\omega_{pm}$ is the phase response at the frequency where the magnitude response is 0 dB: it can be viewed here as a measure of the amount of time-delay uncertainty the system can tolerate. The gain margin (GM) is the reciprocal of the magnitude response at the frequency where the phase response is $-180^\circ$, and can be thought of as the factor by which the open-loop gain of a stable system can be multiplied before it becomes unstable.

In the time domain the main parameters to be checked are the overshoot, the settling time and the rise time. No information is lost in the conversion between the frequency and time representations, so the time-domain response is usually considered a confirmation of the results obtained in the frequency domain.

2.5.4 Performance comparison

From the Bode diagrams in Figures 2.10, 2.11 and 2.12, the main difference among the three systems is the crossover frequency ($\omega_c$): for the RED implementation it is 0.05 rad/sec, for the Proportional compensator 1.5 rad/sec, while for the Proportional-Integral it drops to
0.5 rad/sec. The crossover frequency is inversely proportional to the response time of the system, which means that to get a faster system $\omega_c$ must be as high as possible.

[Figure 2.10: RED: frequency and time response. Bode diagram: Gm = 32.835 dB (at 1.0047 rad/sec), Pm = 88.844 deg (at 0.050222 rad/sec); step response.]

[Figure 2.11: Proportional compensator: frequency and time response. Bode diagram: Gm = 10.036 dB (at 3.8568 rad/sec), Pm = 68.204 deg (at 1.4953 rad/sec); step response.]

[Figure 2.12: Proportional-Integral: frequency and time response. Bode diagram: Gm = 18.748 dB (at 3.4989 rad/sec), Pm = 75.137 deg (at 0.52871 rad/sec); step response.]

The fastest system (the one with the lowest settling time) is the Proportional one, but the fact that the plant has no pole at the origin causes a steady-state error for a constant reference signal, which means that the difference between the desired output and the closed-loop output does not converge to zero. At first sight this does not seem to be a problem, because the fast variations in the network conditions call for a fast system that responds quickly to load changes; however, in [2] the results of a worst-case simulation in NS-2 are reported, in which the Proportional controller causes an overload of the queue. This simulation does not seem to be meaningful for the analysis of the problem, as the Proportional controller is implemented there together with the RED non-linear gain function, which, as reported above, is not easy to tune for the general case.

In order to avoid this kind of problem, a Proportional-Integral (PI) compensator is proposed in [2]: it is able (if used with the considered plant) to drive the steady-state error to zero, as it has a pole placed at the origin. The diagram of the PI feedback system is shown in Figure 2.13, emphasizing the role of the desired operating point, which is the reference value that the output should reach. The resulting system performs well, but its step response is slow.
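The stability margins of the PI loop can be checked numerically from the plant (2.6) and the common test settings. The sketch below is only an illustration under assumptions that are not stated in the text: the PI compensator is taken as $k\,(s/z + 1)/s$ with its zero $z$ placed on the TCP pole $2N/(R^2C)$, and the gain $k$ is normalized so that the loop has unit magnitude at $\omega = z$. The names `plant`, `pi_controller`, `loop` and `bisect` are hypothetical helpers.

```python
import cmath
import math

C = 3750.0   # link capacity, packets/sec
N = 60       # number of TCP flows
R = 0.246    # round-trip time, sec

Z = 2.0 * N / (R ** 2 * C)   # TCP pole of the plant, about 0.53 rad/sec


def plant(s):
    """Plant transfer function of equation (2.6)."""
    return (C ** 2 / (2.0 * N)) * cmath.exp(-s * R) / ((s + Z) * (s + 1.0 / R))


def pi_controller(s, k):
    """Assumed PI compensator with its zero on the TCP pole."""
    return k * (s / Z + 1.0) / s


def loop(w, k):
    """Open-loop frequency response at angular frequency w."""
    s = 1j * w
    return pi_controller(s, k) * plant(s)


def bisect(f, lo, hi, iterations=200):
    """Find a sign change of f between lo and hi by bisection."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Gain chosen so that the gain crossover frequency is w_gc = Z.
K = 1.0 / abs(loop(Z, 1.0))
w_gc = Z
phase_margin = 180.0 + math.degrees(cmath.phase(loop(w_gc, K)))

# Phase crossover: the response crosses the negative real axis (Im = 0).
w_pc = bisect(lambda w: loop(w, K).imag, w_gc, 5.0)
gain_margin_db = -20.0 * math.log10(abs(loop(w_pc, K)))

if __name__ == "__main__":
    print(round(phase_margin, 2), round(gain_margin_db, 2), round(w_pc, 3))
```

Under these assumptions the computed margins come out close to the values annotated on the Bode diagram of Figure 2.12, which suggests that the benchmark PI design uses this kind of pole-zero cancellation.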
In Chapter 5 a PID (Proportional-Integral-Derivative) controller is tuned and analyzed, in order to verify whether a more complex controller applied to the TCP dynamics improves performance.
[Figure 2.13: Block diagram of the PI controller, with reference value $q_0$]
Chapter 3 Model analysis

After the short introduction in Section 2.5 to the model used in this master thesis, a more accurate description is given here. The work of V. Misra, W. B. Gong, D. Towsley and C. V. Hollot from the University of Massachusetts is explained in order to create the link between queueing theory and control theory.

3.1 Stochastic modelling

The additive increase - multiplicative decrease scheme of TCP is driven by losses in the network, and it is through losses that TCP tries to detect congestion. From a network-centric point of view, indications of losses arrive at the source from the network at a certain rate (in different forms, e.g. duplicate ACKs or gaps in sequence numbers indicating timeout losses). The arrival process of losses can be modelled as a Poisson process: the reasons behind this assumption are the high number of hops that a packet passes through before reaching its destination and the large number of flows at each router.

- High number of hops: at each router the packet is enqueued or dropped (marked if ECN is used). If the overflow process at each router is stationary and orderly, the overall congestion process is the sum of the individual processes. As the number of hops increases, Khinchine's theorem [20] states that the process approaches a Poisson behavior.
- High number of flows at the router: the losses seen by each flow can be considered a sampling of the buffer overflow process at each router, i.e. a probabilistic thinning. As the number of flows at each router increases, the thinned process approaches a Poisson process according to Kallenberg's theorem [21].

With the previous statements, the loss process seen by a single flow at a router is Poisson-like, and the multiple hops that define the path make the aggregate loss process closer to a Poisson process.
The traffic is then modelled as a fluid: the increases in the window size are considered continuous and represented by $dt/RTT$, as they take place every round-trip time (in the absence of losses). The losses, each modelled as a Poisson stream, are of two kinds (shown in Figure 2.2): triple duplicate ACK (TD) and timeout (TO). A TD loss causes the window size to be halved, while after a TO the transmission window is reduced to 1 and the source stays silent for an exponentially growing period of $T, 2T, \dots, 64T$, depending on the number of successive TO losses detected.

The round-trip time is specific to each single flow and takes the form

$$R_i(t) = a_i + \frac{q(t)}{C}, \qquad (3.1)$$

where $i$ identifies the flow, $a_i$ denotes the propagation delay and $q(t)/C$ is the queuing delay. The packet losses of flow $i$ are described as a Poisson process $\{N_i(t)\}$ with time-varying rate $\lambda_i(t)$ (the time-varying rate can model the different independent marking schemes of AQM). Identifying the window size as $W_i(t)$, the following equation describes its behavior:

$$dW_i(t) = \frac{dt}{RTT_i(q(t))} - \frac{W_i(t)}{2}\,dN_i(t) + \bigl(1 - W_i(t)\bigr)\,dN_i(t) \qquad (3.2)$$

The first term accounts for the additive increase part of TCP, the second reflects the multiplicative decrease and the third the timeout behavior.

[Figure 3.1: Evolution of window size with TD and TO losses]

An example of the time evolution of a TCP congestion window is shown in Figure 3.1: $Z_i^{TO}$ is the duration of a timeout sequence, $Z_i^{TD}$ is the time interval between two consecutive timeout sequences, and $S_i = Z_i^{TD} + Z_i^{TO}$.

Let $M_i$ be defined as the number of packets sent during the period $S_i$. The throughput is then defined as

$$B = \frac{E[M]}{E[S]} \qquad (3.3)$$

and it accounts for the number of packets sent. In Figure 3.1 the following variables can be defined:

- $n_i$: number of TD periods in the interval $Z_i^{TD}$
- $Y_{ij}$: number of packets sent in the $j$-th TD interval
- $A_{ij}$: duration of the $j$-th period
- $X_{ij}$: number of rounds in the $j$-th period
- $W_{ij}$: window size at the end of the period

In order to derive $E[n]$ it is useful to observe that during $Z_i^{TD}$ there are $n_i$ TD periods, where each of the first $n_i - 1$ ends in a TD, while the last one ends with a TO. This means that there is one TO out of $n_i$ loss indications. Calling $Q$ the probability that a loss indication ending a TD period is a TO, it follows that $Q = 1/E[n]$.

Considering the last two rounds in a TD period and knowing the packet exchange for the triple duplicate ACK, it is possible to derive an expression for $Q$. Given that there is a sequence of losses in one round, the probability that the first $k$ packets are ACKed in a round of $w$ packets is

$$A(w, k) = \frac{(1-p)^k\,p}{1 - (1-p)^w} \qquad (3.4)$$

The probability that, in the last round before a TO (when $n$ packets are sent), $m$ packets are ACKed in sequence and the others are lost is:

$$C(n, m) = \begin{cases} (1-p)^m\,p & m \le n-1 \\ (1-p)^n & m = n \end{cases} \qquad (3.5)$$

Then $Q(w)$, the probability that a loss in a window of size $w$ is a TO, is

$$Q(w) = \begin{cases} 1 & \text{if } w \le 3 \\ \displaystyle\sum_{k=0}^{2} A(w, k) + \sum_{k=3}^{w} A(w, k) \sum_{m=0}^{2} C(k, m) & \text{otherwise} \end{cases} \qquad (3.6)$$

which shows that a TO occurs if the number of packets successfully transmitted in the penultimate round is less than 3, or if the number of packets successfully transmitted in the last round is less than 3. Assuming that the starting instants of the losses in each round are independent, and performing some algebraic manipulations, $Q(w)$ becomes

$$Q(w) = \min\left(1, \frac{\bigl(1 - (1-p)^3\bigr)\bigl(1 + (1-p)^3\,(1 - (1-p)^{w-3})\bigr)}{1 - (1-p)^w}\right) \qquad (3.7)$$

Noting that

$$\lim_{p \to 0} Q(w) = \frac{3}{w}, \qquad (3.8)$$

the probability of a timeout loss is given by

$$Q(w) \approx \min\left(1, \frac{3}{w}\right) \qquad (3.9)$$
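The quality of the approximation (3.9) can be checked by evaluating it against the closed form (3.7) for a few window sizes. The snippet below is a small numerical sketch; the function names `q_timeout` and `q_timeout_approx` are hypothetical, and windows of size 3 or less are handled by returning 1 directly, as in the first case of (3.6).

```python
def q_timeout(w, p):
    """Probability that a loss in a window of size w is a timeout, eq. (3.7)."""
    if w <= 3:
        return 1.0
    a = 1.0 - (1.0 - p) ** 3
    b = 1.0 + (1.0 - p) ** 3 * (1.0 - (1.0 - p) ** (w - 3))
    return min(1.0, a * b / (1.0 - (1.0 - p) ** w))


def q_timeout_approx(w):
    """Small-p approximation of the timeout probability, eq. (3.9)."""
    return min(1.0, 3.0 / w)


if __name__ == "__main__":
    for w in (2, 5, 10, 20, 40):
        print(w, round(q_timeout(w, 0.01), 4), round(q_timeout_approx(w), 4))
```

For small loss probabilities the two expressions stay close, which is what justifies using the simpler form $\min(1, 3/w)$ in the fluid model.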
Formula (3.2) then becomes

$$dW_i(t) = \frac{dt}{R_i(q(t))} - \bigl(1 - Q(W_i)\bigr)\frac{W_i(t)}{2}\,dN_i(t) + Q(W_i)\bigl(1 - W_i(t)\bigr)\,dN_i(t) \tag{3.10}$$

The approach followed to get the probability of timeout losses led to a formulation of the throughput in (3.3), which considers a complete period between timeout losses. As the model is used to approach congestion control algorithms, it is a reasonable hypothesis to state that no timeout losses are experienced, especially when using the ECN mechanism (even if ECN acknowledgment packets can be lost in the network, if the congestion algorithm is implemented in the whole network it should prevent packets from being dropped). The model used in [2] does not take into account timeout losses in order to perform a linearization of the model, and the throughput is expressed in a single round view, taking the form $W_i(t)/R_i(t)$. Ignoring the timeout term in (3.2), the TCP window size behavior becomes

$$dW_i(t) = \frac{dt}{R_i(q(t))} - \frac{W_i(t)}{2}\,dN_i(t) \tag{3.11}$$

Taking the expected values of each side, eq. (3.11) becomes

$$E[dW_i(t)] = E\left[\frac{dt}{R_i(q(t))}\right] - \frac{E[W_i(t)\,dN_i(t)]}{2} \tag{3.12}$$

Even if the window size and the arrivals of losses at the source are not independent, (3.12) can be approximated as follows, without losing the matching behavior of the model:

$$dE[W_i] = E\left[\frac{dt}{R_i(q)}\right] - \frac{E[W_i]\,\lambda_i(t)}{2}\,dt, \tag{3.13}$$

where $\lambda_i(t)$ represents the arrival rate in the Poisson process of the loss indications to the source. Considering one single congested router, a loss indication arrives at the source approximately one round trip time ($R_i$) after a packet has been dropped (or marked) at the queue. This approximation is necessary because the number of hops and the position of the congested router in the network are variable (the delay between the marking of a packet and the loss indication to the source can take values in the range $[R_i/2, R_i]$ according to the position where congestion is experienced).
In AQM schemes the marking probability is proportional to the load of the queue: knowing the throughput $B(t - R_i)$ at the time $t - R_i$, the rate of loss indications that the source gets at time $t$ is given by $p(\bar{x}(t - R_i))\,B(t - R_i)$, whose expected value can be expressed as

$$\lambda_i(t) = p\bigl(\bar{x}(t - R_i)\bigr)\,\frac{\bar{W}_i(t - R_i)}{R_i\bigl(\bar{q}(t - R_i)\bigr)} \tag{3.14}$$

Substituting (3.14) into (3.13) and approximating the expected value of the RTT with the RTT corresponding to the expected queue length, it becomes

$$dE[W_i] \approx \frac{dt}{R_i(\bar{q})} - \frac{\bar{W}_i\,\bar{W}_i(t - R_i)}{2R_i\bigl(\bar{q}(t - R_i)\bigr)}\,p\bigl(\bar{x}(t - R_i)\bigr)\,dt \tag{3.15}$$
The approximation made on the RTT expected value is possible considering the large deterministic contribution to the RTT (propagation delay) and the limited random part (queuing delay), which becomes more and more stable as the number of flows increases. The differential form of (3.15) is then

$$\frac{d\bar{W}_i}{dt} = \frac{1}{R_i(\bar{q})} - \frac{\bar{W}_i\,\bar{W}_i(t - R_i)}{2R_i\bigl(\bar{q}(t - R_i)\bigr)}\,p\bigl(\bar{x}(t - R_i)\bigr) \tag{3.16}$$

and must be coupled with a differential equation for the queue length. Lindley's equation [22] for discrete systems defines the queue behavior at time $t$ as

$$q(t) = q(t - R_i) - 1_{q(t)}\,C + \sum_{i=1}^{N} W_i(t - R_i), \tag{3.17}$$

where $1_{q(t)}$ takes value 1 if $q(t) > 0$ and 0 otherwise. Equation (3.17) can be transformed into a differential equation

$$\frac{dq(t)}{dt} = -1_{q(t)}\,C + \sum_{i=1}^{N} \frac{W_i}{R_i(q)} \tag{3.18}$$

Taking the expectation of both sides, it becomes

$$\frac{d\bar{q}(t)}{dt} = -E\bigl[1_{q(t)}\bigr]\,C + \sum_{i=1}^{N} \frac{\bar{W}_i}{R_i(\bar{q})}, \tag{3.19}$$

where the same approximation on the RTT performed in (3.16) is used. For a bottleneck router, the size of the queue is positive with probability close to 1: considering the traffic as a fluid, at each instant a packet is routed according to the service capacity of the queue. Thus, the differential behavior of the queue length becomes

$$\frac{d\bar{q}(t)}{dt} \approx -C + \sum_{i=1}^{N} \frac{\bar{W}_i}{R_i(\bar{q})} \tag{3.20}$$

Equations (3.16) and (3.20) describe an accurate model of the dynamics of TCP.

3.2 Control theoretical analysis

Considering flows with identical throughput competing for the service, the model is then described by the following coupled nonlinear differential equations:

$$\begin{aligned} \dot{W}(t) &= \frac{1}{R(t)} - \frac{W(t)\,W(t - R(t))}{2R(t - R(t))}\,p(t - R(t)) \\ \dot{q}(t) &= \frac{W(t)}{R(t)}\,N(t) - C, \end{aligned} \tag{3.21}$$

where
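W = expected TCP window size (packets)
q = expected queue length (packets)
R = round trip time (seconds)
C = link capacity (packets/second)
N = number of TCP sessions
p = probability of packet dropping/marking
t = time

The dropping/marking probability $p$ can take values in $[0, 1]$, while $W$ and $q$ are upper bounded by the network configuration. In order to treat the model with common control theoretical methods, a linearization of the system is necessary. The small signal linearization is performed around the operating point $(W_0, q_0, p_0)$, with $W$ and $q$ as states and $p$ as input of the system. The operating point is calculated in the steady-state condition, where a stabilized ($\dot{W} = 0$ and $\dot{q} = 0$) behavior is expected: it is defined by

$$\dot{W} = 0 \;\Rightarrow\; W_0^2\,p_0 = 2 \tag{3.22}$$

$$\dot{q} = 0 \;\Rightarrow\; W_0 = \frac{R_0\,C}{N} \tag{3.23}$$

In order to perform the linearization, the number of flows $N$ and the round trip time $R$ are approximated as constants along the process. The temporary functions $f$, $g$ are used to perform partial derivatives on $\dot{W}$ and $\dot{q}$:

$$\begin{aligned} f(W, W_R, q, p) &= \frac{1}{R} - \frac{W(t)\,W_R(t)}{2R}\,p(t - R) \\ g(W, q) &= \frac{W(t)}{R}\,N - C, \end{aligned} \tag{3.24}$$

where $W_R(t) = W(t - R)$. Evaluating the partial derivatives at the operating point defined in (3.22) and (3.23) and substituting them in

$$\begin{aligned} \delta\dot{W}(t) &= \frac{\partial f}{\partial W}\,\delta W(t) + \frac{\partial f}{\partial W_R}\,\delta W_R(t) + \frac{\partial f}{\partial q}\,\delta q(t) + \frac{\partial f}{\partial p}\,\delta p(t - R) \\ \delta\dot{q}(t) &= \frac{\partial g}{\partial W}\,\delta W(t) + \frac{\partial g}{\partial q}\,\delta q(t), \end{aligned} \tag{3.25}$$

the linearized model becomes (knowing that $R$ is expressed as $q/C + T_p$):

$$\begin{aligned} \delta\dot{W}(t) &= -\frac{N}{R_0^2 C}\,\delta W(t) - \frac{N}{R_0^2 C}\,\delta W_R(t) + 0\,\delta q - \frac{R_0 C^2}{2N^2}\,\delta p(t - R_0) \\ \delta\dot{q}(t) &= \frac{N}{R_0}\,\delta W(t) - \frac{1}{R_0}\,\delta q(t), \end{aligned} \tag{3.26}$$

whose short formulation is

$$\begin{aligned} \delta\dot{W}(t) &= -\frac{N}{R_0^2 C}\bigl(\delta W(t) + \delta W(t - R_0)\bigr) - \frac{R_0 C^2}{2N^2}\,\delta p(t - R_0) \\ \delta\dot{q}(t) &= \frac{N}{R_0}\,\delta W(t) - \frac{1}{R_0}\,\delta q(t) \end{aligned} \tag{3.27}$$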
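The steady-state conditions (3.22) and (3.23) can be evaluated directly. The sketch below is only illustrative: the function name is hypothetical, and the numeric values (capacity 3750 packets/sec, 60 flows, R = 0.246 sec, and in particular the propagation delay of 0.2 sec) are assumptions chosen for the example.

```python
def operating_point(capacity: float, n_flows: int, rtt: float, prop_delay: float):
    """Return (W0, q0, p0) for constant N and R, following (3.22) and (3.23)."""
    w0 = rtt * capacity / n_flows        # (3.23): W0 = R0 * C / N
    p0 = 2.0 / w0**2                     # (3.22): W0^2 * p0 = 2
    q0 = capacity * (rtt - prop_delay)   # from R0 = q0 / C + Tp
    return w0, q0, p0


if __name__ == "__main__":
    # Assumed example values; Tp = 0.2 s is hypothetical.
    w0, q0, p0 = operating_point(capacity=3750.0, n_flows=60, rtt=0.246, prop_delay=0.2)
    print(f"W0 = {w0:.3f} packets, q0 = {q0:.1f} packets, p0 = {p0:.5f}")
```

With these assumed values the window at equilibrium is about 15 packets and the corresponding marking probability is slightly below 1%, which illustrates how small the input signal is around the operating point.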
After the small signal linearization process the states of the model are $\delta W \doteq W - W_0$ and $\delta q \doteq q - q_0$, while the input is represented by $\delta p \doteq p - p_0$. The delayed contribution to the window size in the first equation in (3.27) can be considered at the time $t$, because its operational values do not change much in an infinitesimal view if the window is much larger than one. This assumption is reasonable in the considered networks, and the simplified model becomes

$$\begin{aligned} \delta\dot{W}(t) &= -\frac{2N}{R_0^2 C}\,\delta W(t) - \frac{R_0 C^2}{2N^2}\,\delta p(t - R_0) \\ \delta\dot{q}(t) &= \frac{N}{R_0}\,\delta W(t) - \frac{1}{R_0}\,\delta q(t) \end{aligned} \tag{3.28}$$

In Figure 2.8 a representation of the system with blocks was presented, while the transfer function representation is given in (2.6).

3.2.1 Stability range

The differential equations (3.28) can also be represented with a state-space formulation: this method is widely used in control theory because inputs, states and outputs are simply related by matrices. The common structure is given by

$$\begin{aligned} \dot{x}(t) &= A(t)\,x(t) + B(t)\,u(t) \\ y(t) &= C(t)\,x(t) + D(t)\,u(t), \end{aligned} \tag{3.29}$$

where $x$ are the states, $u$ the inputs and $y$ the outputs of the system. The stability of a system is ensured when the eigenvalues of the matrix $A$ have negative real parts: from (3.28) the state matrix is given by

$$A = \begin{bmatrix} -\dfrac{2N}{R_0^2 C} & 0 \\ \dfrac{N}{R_0} & -\dfrac{1}{R_0} \end{bmatrix}, \quad \text{whose eigenvalues are } -\frac{2N}{R_0^2 C} \text{ and } -\frac{1}{R_0} \tag{3.30}$$

As all the network parameters ($N$, $R$, $C$) are positive quantities, the eigenvalues are always negative and the TCP dynamics together with the queue behavior is open-loop stable at the equilibrium around the operating points. The use of controllers to make the system faster and to reduce the deviation of the output can lead to instability. The analysis of the algorithms studied in this master thesis considers stability as an important issue, and theoretical results as well as limitations in the use of the controllers will be presented for each solution. The analysis of stability for the controllers in this master thesis work is performed in the frequency domain with phase and gain margins.
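The open-loop eigenvalues of (3.30) can be checked numerically for a given set of network parameters. This is a small sketch using only the standard library; the helper names and the parameter values (3750 packets/sec, 60 flows, 0.246 sec) are assumptions made for illustration.

```python
import math


def state_matrix(capacity: float, n_flows: float, rtt: float):
    """State matrix of the simplified linear model (3.28), states (dW, dq)."""
    return [
        [-2.0 * n_flows / (rtt**2 * capacity), 0.0],
        [n_flows / rtt, -1.0 / rtt],
    ]


def eigenvalues_2x2(a):
    """Eigenvalues of a real 2x2 matrix with real spectrum, smallest first."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace**2 - 4.0 * det
    if disc < 0.0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return (trace - root) / 2.0, (trace + root) / 2.0


if __name__ == "__main__":
    fast, slow = eigenvalues_2x2(state_matrix(3750.0, 60.0, 0.246))
    print(fast, slow)  # both negative: open-loop stable
```

Because the matrix is lower triangular, the eigenvalues are simply its diagonal entries; the generic formula is used here only to confirm that numerically.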
Stability is guaranteed only if both margins are positive: the general idea in control theory is to have margins as large as possible, in order to prevent disturbances or changes in the system from leading to instability. As the realistic range of the network parameters is limited, the strategy behind the stability check is to create a number of worst cases against which the stability of the system is tested. The magnitude of the margins then becomes a minor issue, the positive sign of both margins being the main evidence considered for a stable system. Even if a system with low margins can easily be led into instability by small changes in the network, those changes are accounted for by the worst-case tests.

3.2.2 Time delay

The input to the plant (probability of packet dropping/marking) is delayed by one RTT, which means that the effect of a change in the value of the input at the time $t$ can influence the states and the output only at $t + R$. Disturbances or changes in the input that last less than $R$ cannot be properly counteracted by the controller. As the input itself is influenced by the network parameters through feedback, the fast-changing dynamics of the flows entering the router is difficult to capture. In the frequency domain, the delay present in the system limits the achievement of cut-off frequencies $\omega_c$ higher than $1/R$. The controllers developed in this master thesis try to reach the best possible results, but the problems described above create strong limitations to performance improvements.

3.3 Algorithms overview

The algorithms studied in this master thesis work use a common queueing and service structure, but they differ in the way they calculate the probability of dropping/marking packets to prevent congestion and stabilize the queue size. Contrary to what RED does, the control theoretical implementations use the instantaneous queue size to calculate the dropping probability: the time constant of the low pass filter (representing the Exponential Weighted Moving Average system) is a source of tuning problems in RED and it can be avoided in faster systems.
In all the presented implementations the sampling frequency is quite low (between 5 or 16 Hz), and this makes the computational effort decrease by several orders of magnitude compared to RED. The router queue follows a FIFO (First-In First-Out) policy, with a preliminary choice of incoming packets according to the queue length (and network conditions, for adaptive algorithms). The implementations can be represented with the flow chart in Figure 3.2 and the OTcl code can be found in Appendix B.
[Figure 3.2: Queuing scheme for incoming packets. Flow chart with the following elements: a packet arrives at the router; the queue length is checked (queue full? if yes, the packet is dropped); p is calculated at every sampling instant; a random u is picked from a uniform distribution; u and p are compared (u > p?); the ECT bit in the IP header is checked (ECN capable?); the CE flag is set; the packet is enqueued.]
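The queuing scheme of Figure 3.2 can be sketched as a single decision function. The branch wiring below is an assumption inferred from the flow chart labels (a packet with u > p is enqueued untouched; otherwise it is marked if ECN capable and dropped if not), and all names are hypothetical rather than taken from the NS-2 implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    ecn_capable: bool        # ECT bit set in the IP header
    ce_marked: bool = False  # Congestion Experienced flag


def handle_arrival(queue, packet, p, buffer_size, rng=random.random):
    """Process one arriving packet; return 'dropped', 'marked' or 'enqueued'.

    p is the dropping/marking probability computed at the last sampling instant.
    """
    if len(queue) >= buffer_size:   # queue full: forced drop
        return "dropped"
    u = rng()                       # uniform random number in [0, 1)
    if u > p:                       # assumed branch: no congestion signal
        queue.append(packet)
        return "enqueued"
    if packet.ecn_capable:          # signal congestion by marking
        packet.ce_marked = True
        queue.append(packet)
        return "marked"
    return "dropped"                # non-ECN packet: signal by dropping


if __name__ == "__main__":
    q = []
    print(handle_arrival(q, Packet(ecn_capable=True), p=0.1, buffer_size=100))
```

Passing the random source as a parameter keeps the function deterministic under test, while the controller that updates p at each sampling instant stays outside this per-packet path.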
Chapter 4
NS-2 and simulation settings

The mathematical model of the TCP dynamics is an approximation of the real behavior and the control theoretical analysis of the AQM algorithms is a proper starting point: a simulation program is needed to add a random contribution to the model and fully capture the network behavior. Only after the validation of the analytical results with a complex simulator can an algorithm be judged as performing well or not.

4.1 NS-2

In order to test all the designs that are studied during this master thesis work, they are implemented in the discrete event simulator NS-2. NS-2 is an open-source simulator developed and constantly updated through contributions of researchers and students. Version 2.27, which is used for this project, includes various AQM implementations, among them the Proportional-Integral controller. Thanks to the common structure and other similar points, the PI implementation is used as the starting point to write the code for the PID and the other controllers. Once the simulator is installed on a Linux emulator (Cygwin) and the verification test is passed, each new queue implementation must be recognized by the program: this means that first the implementation in C++ is added to the Queue directory along with the working variables, and then the simulator is taught where to look for the new implementation. All the steps needed to perform these operations are described in Appendix A.

4.1.1 Which language?

NS-2 is written in C++, and the objects described in C++ are called from OTcl (Object-oriented Tcl). This is the language used to describe the topology of the network, the type of transmission, routing options, and everything regarding the simulation (timers, protocols). In order to perform the tests on the implementations, both languages are used: with OTcl the basic topology (nodes and links) is described, the flows with their relative delay times are set and the queue policy is decided.
C++ is used to implement the AQM algorithms, which take the signals coming from the network as inputs (in this case the queue length at sampling times).
[Figure 4.1: The general topology of the simulations. Sources 1 to n are connected through a single router (with its buffer) to destinations 1 to n.]

4.2 Description of the experiments

Analyzing the simulations used in [1] and [2], a group of 7 experiments has been designed. A broad range of network conditions is represented in order to perform a good comparison among the existing algorithms and the ones developed during the master thesis work. The results of the experiments for each design can be found in Sections 5.4, 6.6, 6.9, 7.6 and ??. In all 7 experiments, the TCP version used is New Reno and the ECN marking protocol is used. In the following sections a description of the settings for each experiment is given, while the code used in NS-2 can be found in Appendix B.

4.2.1 Experiment 1

The first experiment is performed around the working conditions used to linearize the system in [1]. Except for some simplifications made during the linearization process, this simulated environment should correspond to the average one used in Matlab to test the theoretical designs, from which the time and frequency-domain graphs come. The settings are:

Number of flows (N): 6 FTP flows (the average size of the FTP packets can be omitted, as the PID controller does not follow a packet-size policy)
Fixed Round-Trip Time: 2 ms (with queue length at the reference point)
Capacity of the router: 3750 pkt/sec (corresponding to 15 Mb/s for an average packet size of 500 Bytes)

4.2.2 Experiment 2

On the same topology as Experiment 1, this experiment differs in the variable RTT of the packets and in a block of some flows between 1 and 12 seconds of the simulation: this is done in order to test the response of the system to sudden changes in the network. One of the aims of AQM algorithms is to keep the routing queue stable, whatever happens in
the network: a block of flows is a significant shock for the router. Here is the list of features of the experiment:

N: 6 FTP flows
RTT uniformly varying between 13 ms and 22 ms
Block of 2 flows (1/3 of the total number) between 1 and 12 s

4.2.3 Experiment 3

This experiment uses the same topology as Experiment 1, with a reduced number of flows: it was observed in the theoretical analysis that a small number of flows (as well as a high round-trip time) leads towards instability. The behavior of the system under these conditions must be taken into consideration, in order to avoid an unresponsive queue length.

Number of flows N: 3 FTP flows
RTT: 2 ms

4.2.4 Experiment 4

It is also important to check the behavior of the controller when the number of flows grows considerably, i.e. in the situation when the congestion control should be most effective, managing many connections at the same time. All the parameters are kept fixed as in Experiments 1 and 3, except for

N: 4 FTP flows

Periodical behavior and number of flows

After having introduced the first experiments, which focus on the number of flows as the main parameter, some basic considerations can be added: the router gets packets from different sources which try to transmit at the same time, and enqueues the packets in its buffer. In the case of standard TCP, the buffer is filled until it gets full and then packets are discarded, causing the TCP protocol to wait for the timeout (or a triple duplicate ACK) and for a new transmission of that packet. As the buffer is full for most of the packets which arrive in the range of time when congestion happens, many of the sources will behave in the same way, waiting for a signal from the other end and then reducing the transmission window. This creates a periodical behavior of the sources and of the queue length, with very bad performance.
Even ECN cannot solve this problem, as the transmitters are warned about possible congestion all together and react in the same way: by reducing the transmission window. With the use of an AQM algorithm the problem of synchronization of flows is solved thanks to the probability of dropping packets (or marking, in case of ECN) being distributed between 0 and 1. This causes only a part of the packets to be dropped or marked, with a smoother global behavior. In all the AQM algorithms (except for some adaptive systems) the number of flows is an important factor. If the flows are few, then each signal that a source gets from the router weighs a lot in the general view and it is difficult to compensate for it with the other sources. If the number of connections is high, then the reduction of a transmission window from one source is compensated by the continuous transmission of the other ones, and the result is a much more stable queue length.

4.2.5 Elephant vs. Mice traffic

In the next 3 experiments the behavior of the algorithms under different kinds of flows will be studied. The mathematical system analyzed in [3] and used in [1] and [2], as well as in this master thesis as a fixed starting point, uses the parameter N (the number of flows) in the description of the system dynamics. This parameter should represent the total number of flows that pass through the router, but not all the flows have the same characteristics. In the first 4 experiments the flows are only made of FTP packets: usually these are called ELEPHANTS, because the flows are long-lived and the transmission tries to reach a Constant Bit Rate (CBR). For the stability of the router buffer these flows are beneficial, because they last for a long time and no big changes happen. Along with Elephant transmissions there are also the so-called MICE flows: short-lived TCP connections used for Web traffic. HTTP traffic is made of small page requests and small downloads from the servers that host the web pages, in a discontinuous way. From the point of view of a router, this kind of traffic causes a more variable behavior of the queue and some AQM algorithms clearly try to avoid it.
The control theoretic algorithms take into consideration the number of packets in the queue, setting an average value based on statistical measurements, and not their size: this policy lets all kinds of packets be processed in the same way, so HTTP transmissions can flow along with the FTP ones.

4.2.6 Experiment 5

In this experiment a mixture of FTP and HTTP packet flows is sent through the router, in order to analyze the behavior of the congestion control algorithms in the presence of a high number of flows with different characteristics. Source nodes that host the Web clients have been added in order to keep them separate from the FTP connections, together with a (cache + server) on the destination side (see Figure 4.2). There are several ways to define a flow: it can be considered as the unique link between two IP addresses, between two gateways, or as each single Application-level connection. By setting different source nodes for different kinds of traffic, the behavior of HTTP packets can be studied separately from the common TCP transport protocol with FTP. These are the settings for this experiment:

6 FTP flows
18 HTTP flows (18 clients and 1 server)
fixed RTT

[Figure 4.2: The topology for mixed traffic. FTP sources 1 to n and HTTP sources 1 to m are connected through the router (with its buffer) to FTP destinations 1 to n and to an HTTP cache and HTTP server.]

4.2.7 Experiment 6

The structure of Experiment 5 is kept fixed, increasing the number of both FTP and HTTP connections.

18 FTP flows
36 HTTP flows (36 clients and 1 server)

4.2.8 Experiment 7 - TCP vs. UDP

TCP is not the only transport-level protocol used on the Internet. Another protocol, UDP, is growing along with multimedia and interactive content on the Web. As described in the introductory chapter, UDP offers a best-effort communication: this means that if some packets are lost between the two end-points, the protocol continues sending data regardless of the errors. It has no adaptive system and no variable transmission window. No AQM algorithm is able to signal an incoming congestion to the UDP protocol and, after a packet is discarded by the router, the flow from the source continues with the previous throughput (which is different from the goodput). This behavior should intuitively cause a steady increase of the queue length. In order to build a complex environment, a mixture of FTP, HTTP and constant bit-rate UDP packets is sent through the router: in order to test the robustness of the algorithm against unresponsive traffic, the number of UDP flows is set at a much higher value than the statistical measurements in real networks. The statistical percentage of UDP protocol use varies according to the router placement in the network and evolves quite rapidly. The TCP flows are set as in Experiment 5, and unresponsive traffic is chosen to represent 1% of the FTP load.

6 FTP flows
18 HTTP flows (18 clients and 1 server)
6 CBR UDP flows
Chapter 5
The PID controller

The PID control algorithm is used for the control of almost all loops in the process industries, and is the basis for more advanced control strategies. Its application in a wide range of environments makes it a good candidate for the control of TCP dynamics too.

5.1 Introduction

As analyzed in Chapter 2, all the proposed AQM algorithms have some points of strength as well as some weaknesses. Considering the control theoretic designs, it should be clear that the Proportional controller is very fast and responsive but has a steady-state error that does not let the queue stabilize on a reference value, being dependent on the system parameters; the round trip time of a flow cannot be considered stabilized, as the queuing delay varies according to the network conditions. On the other hand, the Proportional-Integral design avoids the steady-state error (thanks to the integral part), but slows down the responsiveness by almost one order of magnitude. This effect creates a very stable feedback system, capable of working on a wide spectrum of the three variables, but with a poor capacity to react to changes in the network (at least from the point of view of more complex systems). From control theory it is well known that a derivative part in a PID design helps to reduce the overshoot and the settling time: with the presence of a derivative effect, the integral part may be increased to make the system faster with a relatively low overshoot. The tuning should always take into consideration

5.1.1 Design based on the network behavior

When thinking about designing a PID controller to regulate the queue length of a router placed in the middle of a computer network, the first idea is that network conditions are highly varying: the tuning of a PID should not be optimized around an operating point, but for a wide range of values of the network variables.
According to this consideration and following the ideas presented in Section 3.2.1, several operating points (worst cases too) are identified and tested. The PM and GM (see Section 2.5.3) can be viewed as the distance between the test point and instability: in case of a positive result in the worst-case analysis, PM and GM can be lower than the suggested thresholds without any problem.

[Figure 5.1: Block diagram of PID controller. Blocks: PID and Plant Dynamic; signals: q_ref, δq, δp, p_0, p and q.]

5.1.2 Targets in the design

As written previously, the PI design has paved the way for better performing systems. Before starting to tune the PID controller, some ideas are set as the basis of the work:

1. A fast response is the most important aim for the controller design
2. Overshoot is not the primary problem
3. The chosen design should prevent instability in a large set of conditions, even if this limits the optimization for some specific cases

With the idea that the implementation of this AQM system could be used in a wide range of routing environments, it is fundamental to create a general structure that guarantees reliable performance in any case. As observed in the simulations with NS-2, a higher overshoot does not affect performance negatively, while a slower rise time has a strong effect on responsiveness.

5.2 The PID tuning

In this master thesis work the TCP linearized system derived in [1] (here called Plant Dynamic) is used to find the best setting of the PID controller. In Figure 5.1, $q_{ref}$ represents the reference number of packets towards which the system tries to converge. A PID controller can be seen as the sum of a Proportional, an Integral and a Derivative part:

$$G(s) = K_P + \frac{K_I}{s} + K_D\,s \tag{5.1}$$

Tuning the PID controller is equivalent to finding the values $K_P$, $K_I$ and $K_D$ that make the system robustly stable.
5.2.1 PID tuning methods

As the PID controller is one of the most used feedback structures in several applications, standardized tuning methods have been studied. The most famous were developed by Ziegler and Nichols (Z-N) in the early 1940s: the methods can be applied to both open-loop and closed-loop systems, and use the identification of periodical oscillations in the output after modifications of the parameters to set fixed relations. Another method is manual tuning, based on sequences of simple changes in the parameters, looking for an unstable behavior of the closed-loop response. One parameter at a time (first $K_I$, then $K_P$ and in the end $K_D$) is increased until the system reaches instability, and then decreased by an order of magnitude. At the end of the process a finer tuning is needed according to the performance required. This second method has been used in this master thesis and a lot of attention has been paid to the final step of the process, in order to avoid instability for the worst cases.

5.2.2 The resulting compensator

The PID transfer function can be derived directly from (5.1) and is

$$G(s) = \frac{K_D s^2 + K_P s + K_I}{s} \tag{5.2}$$

The complete feedback system is shown in Figure 5.2.

[Figure 5.2: The feedback structure. The input enters the PID controller, followed by the plant P(s); the output is fed back to the input.]

From equations (2.6) and (5.2) the following open-loop transfer function is derived:

$$H(s) = G(s)P(s) = \left(\frac{K_D s^2 + K_P s + K_I}{s}\right) \frac{\dfrac{C^2}{2N}\,e^{-sR}}{\left(s + \dfrac{2N}{R^2 C}\right)\left(s + \dfrac{1}{R}\right)} \tag{5.3}$$

In order to compare the stability and performance of the tuned feedback system, the same network conditions used in [2] are set. They are the following: $C = 3750$ packets/sec, $N = 60$, $R = 0.246$ sec. Moreover, the time delay is approximated by a second-order Padé approximation. The PID tuning process has the aim of making the system faster ($\omega_c$ around 1 rad/sec) than the one regulated by the PI controller, while keeping the variance of the simulated queue length as small as possible.
Given the highly varying nature of the process, the main parameters checked are the stability and the simulation results.
[Figure 5.3: A parametrical analysis of instability area, with algorithm test points. Axes: round trip time (sec) versus number of flows; the test points are numbered 1 to 5.]

In Figure 5.3 the instability area for the system is shown: with the capacity of the router fixed (C = 15 Mbps), a high RTT and a low number of flows push the system towards instability. After the tuning of the controller (using the manual tuning described in Section 5.2.1), the values of $K_D$, $K_P$ and $K_I$ are set as

$$K_D = 5 \cdot 10^{-6}, \qquad K_P = 4 \cdot 10^{-5}, \qquad K_I = 4 \cdot 10^{-5}$$

$$H(s) = \frac{0.5859s^4 - 9.604s^3 + 6.547s^2 + 815.2s + 929.5}{s^5 + 28.98s^4 + 312.5s^3 + 963.4s^2 + 426.2s} \tag{5.4}$$

The open-loop transfer function (5.4) is used in the frequency domain to show the PM and GM of the system, while the time response is evaluated by applying a step with constant amplitude to the closed feedback system. The relation between open loop and closed loop (given that $H(s) = G(s)P(s)$) is

$$H_{cl.loop}(s) = \frac{G(s)P(s)}{1 + G(s)P(s)} \tag{5.5}$$

The frequency response and the step response of the system for a number of different network conditions (represented by the circles in Figure 5.3) are shown in Figure 5.4: the reference settings (see above) allow the algorithm to be compared with the graphs in Section 2.5.
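The gain crossover frequency and the phase margin of the tuned loop can be estimated directly from (5.3), using the exact delay term instead of the Padé approximation. The sketch below is illustrative only: it assumes the reference conditions C = 3750 packets/sec, N = 60, R = 0.246 sec and the gains $K_D = 5 \cdot 10^{-6}$, $K_P = K_I = 4 \cdot 10^{-5}$, and the function names and bisection bracket are hypothetical choices.

```python
import cmath
import math

# Assumed reference conditions and PID gains.
C, N, R = 3750.0, 60.0, 0.246
KP, KI, KD = 4e-5, 4e-5, 5e-6


def open_loop(w: float) -> complex:
    """Open-loop response H(jw) = G(jw) P(jw) of (5.3), with the exact delay."""
    s = 1j * w
    pid = (KD * s**2 + KP * s + KI) / s
    plant = (C**2 / (2.0 * N)) * cmath.exp(-s * R) / (
        (s + 2.0 * N / (R**2 * C)) * (s + 1.0 / R)
    )
    return pid * plant


def gain_crossover(lo: float = 0.5, hi: float = 3.0, iterations: int = 60) -> float:
    """Bisection on |H(jw)| = 1, assuming |H| decreases across [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if abs(open_loop(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    wc = gain_crossover()
    pm = 180.0 + math.degrees(cmath.phase(open_loop(wc)))
    print(f"crossover = {wc:.3f} rad/s, phase margin = {pm:.1f} deg")
```

Repeating the calculation for other values of N and R gives a quick way to scan worst cases, although the bisection bracket would need to be revisited for each parameter set.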
[Figure 5.4: Proportional-Integral-Derivative: frequency and time responses. Panels: frequency response and step response (amplitude versus time in sec) for the test points 1 to 5.]

5.2.3 Result analysis

If the PID time response is compared with the other control theoretic implementations, it turns out that its margins and its time response are close to the characteristics of the Proportional controller, with the advantage of having no steady-state error. In the time-response graph it is possible to observe an overshoot of around 25% of the input signal, but in Section 5.1.2 it was stated that overshoot does not decrease the performance of the system, and simulations with smoother and slower systems confirm it. The Proportional controller shows an excellent behavior on average, while it starts oscillating in some particular conditions of the network. This PID design has almost the same fast response as a Proportional controller (the rise time is between 1 and 2 seconds, which means 5 times faster than the PI design proposed in [2]), and the output seems to converge to the reference value under a wide range of network conditions.

5.3 Digital implementation

In order to implement the PID controller inside the routing algorithm, the transfer function must be converted into discrete-time form. The discrete-time form represents the system, sampled at a given frequency. The sampling frequency ($f_s$) must be chosen to be at least 2 times the loop bandwidth of the system (by the Nyquist-Shannon sampling theorem), in order not to lose useful information about the behavior of the system. The PID design has the cut-off frequency ($\omega_c$) at 1.22 rad/sec, and the sampling frequency is chosen to be 10 or 20 times $\omega_c / 2\pi$, which means 2-4 Hz.
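As a worked check of the 2-4 Hz range, taking the multiples of the loop bandwidth as 10 and 20 (an assumption consistent with the stated result) and $\omega_c = 1.22$ rad/sec:

```latex
\frac{\omega_c}{2\pi} = \frac{1.22}{2\pi} \approx 0.194\ \text{Hz}, \qquad
10 \cdot \frac{\omega_c}{2\pi} \approx 1.94\ \text{Hz}, \qquad
20 \cdot \frac{\omega_c}{2\pi} \approx 3.88\ \text{Hz}
```

Both values are far above the Nyquist limit of twice the loop bandwidth, so any higher sampling frequency is a deliberate oversampling choice.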
In [2] the PI controller is oversampled at 160 Hz: for testing the PID implementation the same frequency is used, in order to have a more accurate comparison. The discretization is made on the controller transfer function, and from Figure 5.1 the input and output components can be identified: $\delta q = q - q_{ref}$ is the input and $\delta p = p - p_0$ is the output of the PID controller. Considering that $q$ and $p$ are respectively the queue length and the probability of dropping incoming packets measured at each sampling time, and that $p_0$ can be set to 0 (as the aim of the controller is to reach a stable queue behaviour), it becomes:

$$\frac{p}{\delta q} = H(s), \tag{5.6}$$

where $H(s)$ denotes here the controller transfer function (5.1).

5.3.1 Approximation methods

There are several possibilities for approximating a Laplace transfer function in the z-domain and for obtaining the algorithm for the software implementation. Three methods have been tested, to analyze the different behaviors: they are listed and then described, from the simplest to the most advanced.

Finite differences
Backward difference + Tustin
Tustin

Finite differences

With this method it is possible to translate the transfer function from the s-domain directly to the working algorithm. The finite difference is the discrete analog of the derivative:

$$f'(x) = \frac{df}{dt} \approx \frac{f(x) - f(x - h)}{h},$$

where $h$ is the sampling period, and the usual infinitesimal quantity given by the limit of $h$ to 0 (as in the derivative definition) is replaced by a finite quantity. The following steps illustrate how to get the algorithm:

$$\frac{p}{\delta q} = K_P + \frac{K_I}{s} + K_D\,s = \frac{K_P s + K_I + K_D s^2}{s} \tag{5.7}$$

Cross multiplication leads to

$$p\,s = \delta q\,(K_P s + K_I + K_D s^2)$$

$$\dot{p} = K_P\,\delta\dot{q} + K_I\,\delta q + K_D\,\delta\ddot{q}$$

At this point the finite differences can replace the derivatives: the indexes $k$ identify which sampled time the variable refers to, and $T_s$ is the sampling period ($t = kT_s$).

$$\frac{p_k - p_{k-1}}{T_s} = K_P\,\frac{\delta q_k - \delta q_{k-1}}{T_s} + K_I\,\delta q_k + K_D\,\frac{\delta q_k + \delta q_{k-2} - 2\,\delta q_{k-1}}{T_s^2}$$

$$p_k = p_{k-1} + K_P(\delta q_k - \delta q_{k-1}) + K_I T_s\,\delta q_k + K_D\,\frac{\delta q_k + \delta q_{k-2} - 2\,\delta q_{k-1}}{T_s} \tag{5.8}$$
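The recursion (5.8) maps directly to a one-line update. The following is a minimal sketch with hypothetical names; the clamping of the result to [0, 1] in the usage loop is an added assumption (p is a probability) and is not part of (5.8), and the gains and the 160 Hz sampling rate are assumed example values.

```python
def pid_finite_difference(p_prev, dq, dq_1, dq_2, kp, ki, kd, ts):
    """One update of (5.8); dq, dq_1, dq_2 are the queue errors at k, k-1, k-2."""
    return (
        p_prev
        + kp * (dq - dq_1)
        + ki * ts * dq
        + kd * (dq + dq_2 - 2.0 * dq_1) / ts
    )


if __name__ == "__main__":
    # Assumed gains and sampling period (160 Hz).
    kp, ki, kd, ts = 4e-5, 4e-5, 5e-6, 1.0 / 160.0
    p, dq_1, dq_2 = 0.0, 0.0, 0.0
    for dq in (0.0, 5.0, 12.0, 20.0, 20.0):  # illustrative queue errors (packets)
        p = pid_finite_difference(p, dq, dq_1, dq_2, kp, ki, kd, ts)
        p = min(1.0, max(0.0, p))            # assumption: keep p a valid probability
        dq_2, dq_1 = dq_1, dq
        print(p)
```

With a constant error only the integral term contributes, so p grows by $K_I T_s\,\delta q$ per sample, which is the mechanism that removes the steady-state error.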
Knowing that $\delta q = q - q_{ref}$, it is possible to get the formula of the digital implementation with reference to the values of the state variables introduced before.

$$p_k = p_{k-1} + K_P(q_k - q_{k-1}) + T_s K_I (q_k - q_{ref}) + \frac{K_D}{T_s}(q_k - 2q_{k-1} + q_{k-2}) \tag{5.9}$$

This implementation requires 5 state variables ($p_k$, $p_{k-1}$, $q_k$, $q_{k-1}$, $q_{k-2}$), in addition to the constant reference $q_{ref}$, instead of the 4 in the PI design, and the calculation is performed at every sampling instant.

Backward difference + Tustin

Another method is to approximate the integral part of the controller with a bilinear conversion (Tustin) and the derivative part with a backward difference substitution. These two transformations are applied to the variable $s$ in order to express the transfer function in terms of $z$. The relations are the following:

TUSTIN: $s = \dfrac{2}{T_s}\,\dfrac{z-1}{z+1}$

BACKWARD RECTANGULAR: $s = \dfrac{z-1}{T_s z}$

Substituting $s$ in (5.7) as described before, the following steps can be made:

$$\frac{p}{\delta q} = K_P + \frac{T_s K_I}{2}\,\frac{z+1}{z-1} + \frac{K_D}{T_s}\,\frac{z-1}{z}$$

$$\frac{p}{\delta q} = K_P + \frac{T_s K_I}{2}\,\frac{1+z^{-1}}{1-z^{-1}} + \frac{K_D}{T_s}\,(1-z^{-1})$$

$$\frac{p}{\delta q} = \frac{2K_P T_s(1-z^{-1}) + K_I T_s^2(1+z^{-1}) + 2K_D(1-z^{-1})^2}{2T_s(1-z^{-1})}$$

After cross multiplication and expansion of the terms $\delta q$, the algorithm becomes

$$p_k = p_{k-1} + K_P(q_k - q_{k-1}) + \frac{K_I T_s}{2}(q_k + q_{k-1} - 2q_{ref}) + \frac{K_D}{T_s}(q_k + q_{k-2} - 2q_{k-1}) \tag{5.10}$$

In order to reduce the number of computations and optimize the algorithm, (5.10) can be rewritten as

$$p_k = p_{k-1} + a(q_k - q_{ref}) - b(q_{k-1} - q_{ref}) + c(q_{k-2} - q_{ref}), \tag{5.11}$$

where $a = K_P + \dfrac{K_I T_s}{2} + \dfrac{K_D}{T_s}$, $b = K_P - \dfrac{K_I T_s}{2} + \dfrac{2K_D}{T_s}$ and $c = \dfrac{K_D}{T_s}$.
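The equivalence between the expanded form (5.10) and the compact form (5.11) can be checked numerically. The sketch below uses hypothetical gains and queue values chosen only for illustration; the function names are not part of the thesis.

```python
def pid1_coefficients(k_p, k_i, k_d, t_s):
    """Coefficients a, b, c of the compact recursion (5.11)."""
    a = k_p + k_i * t_s / 2.0 + k_d / t_s
    b = k_p - k_i * t_s / 2.0 + 2.0 * k_d / t_s
    c = k_d / t_s
    return a, b, c

def step_expanded(p_prev, q, q1, q2, q_ref, k_p, k_i, k_d, t_s):
    """One update of the expanded recursion (5.10); q1, q2 are q_{k-1}, q_{k-2}."""
    return (p_prev
            + k_p * (q - q1)
            + k_i * t_s / 2.0 * (q + q1 - 2.0 * q_ref)
            + k_d / t_s * (q + q2 - 2.0 * q1))

def step_compact(p_prev, q, q1, q2, q_ref, a, b, c):
    """One update of the compact recursion (5.11)."""
    return p_prev + a * (q - q_ref) - b * (q1 - q_ref) + c * (q2 - q_ref)

# Hypothetical gains, sampling period and queue samples (illustration only).
gains = (4e-5, 1e-5, 5e-6, 1.0 / 16.0)
a, b, c = pid1_coefficients(*gains)
p_long = step_expanded(0.1, 250.0, 240.0, 225.0, 200.0, *gains)
p_short = step_compact(0.1, 250.0, 240.0, 225.0, 200.0, a, b, c)
print(p_long, p_short)
```

The compact form needs three multiplications per sample instead of recomputing the gain products each time, which is the optimization the text refers to.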
Figure 5.5: Cascade of transfer functions ($H_1(z)$ followed by $H_2(z)$, equivalent to $H_1(z) \cdot H_2(z)$)

Tustin

If both the integral and the derivative part of the transfer function are approximated with the bilinear method, the denominator of the transfer function in the z-domain has an order higher than one. The digital implementation can be performed by dividing $H(z)$ into the cascade of 2 transfer functions of the first order (see Figure 5.5), in order to reduce the number of support variables.

$$\frac{p}{\delta q} = K_P + \frac{T_s K_I}{2}\,\frac{z+1}{z-1} + K_D\,\frac{2}{T_s}\,\frac{z-1}{z+1} \tag{5.12}$$

With a common denominator the formula becomes

$$\frac{p}{\delta q} = \frac{K_P(1 - z^{-2}) + K_I\,\frac{T_s}{2}(1 + z^{-1})^2 + K_D\,\frac{2}{T_s}(1 - z^{-1})^2}{(1 - z^{-1})(1 + z^{-1})} \tag{5.13}$$

Substituting the values of $K_P$, $K_I$ and $K_D$ of the chosen tuning and the sampling time of 1/16 sec, the resulting formula becomes

$$\frac{p}{\delta q} = .1641\,\frac{(1 - .9927z^{-1})(1 - .9582z^{-1})}{(1 + z^{-1})(1 - z^{-1})} \tag{5.14}$$

The variable $x$ is used to keep the state of the intermediate values between the two transfer functions. The steps to get the final implementation are the following:

$$\frac{x}{\delta q} = .1641\,\frac{1 - .9927z^{-1}}{1 + z^{-1}}$$

$$x(1 + z^{-1}) = .1641\,\delta q\,(1 - .9927z^{-1}),$$

which give

$$x_k = -x_{k-1} + .1641\,\delta q_k - .16\,\delta q_{k-1} \tag{5.15}$$

The same operations are made on the second term:

$$\frac{p}{x} = \frac{1 - .9582z^{-1}}{1 - z^{-1}}$$

$$p_k = p_{k-1} + x_k - .9582\,x_{k-1} \tag{5.16}$$

Expressions (5.15) and (5.16) represent the recursive implementation of the controller.
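The cascade idea can be illustrated with a small sketch that runs the two first-order stages and compares them with the equivalent second-order difference equation. The gain and zero values below are hypothetical placeholders, not the tuned values; the stage equations follow from $x(1 + z^{-1})$ and $p(1 - z^{-1})$ as in (5.15) and (5.16), with zero initial conditions assumed.

```python
def cascade_tustin(dq_samples, gain, z1, z2):
    """H(z) = gain*(1 - z1 z^-1)(1 - z2 z^-1) / ((1 + z^-1)(1 - z^-1)),
    implemented as two first-order stages with the support variable x."""
    x_prev = p_prev = dq_prev = 0.0
    out = []
    for dq in dq_samples:
        x = -x_prev + gain * dq - gain * z1 * dq_prev   # stage 1, cf. (5.15)
        p = p_prev + x - z2 * x_prev                    # stage 2, cf. (5.16)
        out.append(p)
        x_prev, p_prev, dq_prev = x, p, dq
    return out

def direct_form(dq_samples, gain, z1, z2):
    """Same H(z) as one second-order recursion:
    p_k = p_{k-2} + gain*(dq_k - (z1 + z2) dq_{k-1} + z1 z2 dq_{k-2})."""
    p1 = p2 = d1 = d2 = 0.0
    out = []
    for dq in dq_samples:
        p = p2 + gain * (dq - (z1 + z2) * d1 + z1 * z2 * d2)
        out.append(p)
        p2, p1 = p1, p
        d2, d1 = d1, dq
    return out

samples = [1.0, 0.5, -0.25, 2.0, 0.0, 1.5]        # arbitrary test input
via_cascade = cascade_tustin(samples, 0.5, 0.9, 0.8)
via_direct = direct_form(samples, 0.5, 0.9, 0.8)
print(via_cascade[:2], via_direct[:2])
```

Both routines keep the same number of past values here; the cascade is mainly convenient because each stage is a first-order update that can be derived and checked on its own.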
5.3.2 Approximation choice

The 3 approximations discussed in the previous section have been presented to show how the corresponding algorithms differ, even though all of them are digital conversions of the same continuous-time controller. While the first has only a theoretical aim, the other two are good approximations of the continuous process and both can be implemented. These two implementations have been tested in the network simulator NS-2 and different results have been observed in transient behavior: both of them will be presented and compared to the PI controller.

Transient response has been ignored in previous research, as the network has been thought to be stable and with fixed parameters. Experiment 2, with the block of flows, clearly shows how a disturbance in the network can influence the queue behavior and how much time it takes to come back to the set reference load. In this master thesis work the queue length is plotted from the initial time: a rapid increase in the number of flows is a challenge for the controller and can happen at any instant.

5.4 The simulation results

The 2 PID implementations tested in NS-2 can be described by the following instructions, called at every sampling time:

EULER + TUSTIN approximation (PID1)

    p = a*(q-q_ref) - b*(q_old-q_ref) + c*(q_old2-q_ref) + p_old
    p_old = p
    q_old2 = q_old
    q_old = q

Sampling frequency: 16 Hz
a = .84
b = .16
c = .8

TUSTIN approximation (PID2)

    x = -x_old + b*(q-q_ref) - c*(q_old-q_ref)
    p = p_old + x - a*x_old
    p_old = p
    q_old = q
    x_old = x

Sampling frequency: 16 Hz
a = .9582
b = .1641
c = .16

Figure 5.6: Experiment 1: PID vs. PI

In the following pages the graphical results of the PID implementation are compared with the PI from [2]. The description of the experiment settings can be found in Section 4.2. For a clearer comprehension of the results, the distributions of the queue length for each experiment are plotted in Figures 5.14 and 5.15, and the numerical data of a modified standard deviation (calculated assuming the mean to be 2 along all the process) are reported in Table 5.1.

In Figure 5.6 the basic network settings are used, i.e. the conditions around which the controller has been tuned. Even if the setup recalls the theoretical average situation, the behavior of the PID controller cannot be considered the optimal one: it was designed to give similar responses over a wide range of situations, using a fixed point (settings as in experiment 1) to tune the parameters for stability. Results show a faster transient response of both PID controller implementations compared to the PI, and a smaller variance around the reference value of 2 packets.

Figure 5.8 presents results quite similar to the first experiment concerning the initial response. In the closer view between 100 and 120 seconds (when 2 flows out of 6 are blocked) it can be observed that the PID algorithm is fast to level the queue length around the reference value, while the PI takes almost all the 20 seconds to react to the change in the number of flows. PI and PID1 behave similarly after the flows are restarted: for both configurations the queue increases and the reference level is reached in around 1 seconds. In the first two experiments (N=6) the PI controller cannot prevent a buffer overflow in the transient time, while PID1 starts to stabilize the queue from the beginning.
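The modified standard deviation used for the numerical comparison measures the spread of the queue length around the reference value instead of around the sample mean. A minimal sketch follows, assuming a division by the number of samples (the thesis does not state whether $n$ or $n-1$ is used); the function name and the sample values are illustrative only.

```python
import math

def deviation_around_reference(queue_samples, q_ref):
    """Root-mean-square deviation of the queue samples from q_ref.

    Unlike the ordinary standard deviation, the sample mean is not
    estimated: the reference value is used in its place.
    Assumption: normalization by n (not n - 1).
    """
    if not queue_samples:
        raise ValueError("queue_samples must not be empty")
    squared = sum((q - q_ref) ** 2 for q in queue_samples)
    return math.sqrt(squared / len(queue_samples))

# Illustrative trace: the queue oscillates around a hypothetical reference.
print(deviation_around_reference([190.0, 210.0, 205.0, 195.0], 200.0))
```

A controller that settles on a value different from the reference is penalized by this measure even if its queue is perfectly steady, which is why it suits a comparison of reference tracking.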
The two tested implementations of the PID controller show different behaviors in transient times: PID1 is able to avoid queue overflow when all the connections are started at the same time, and it stabilizes the queue length faster than PID2. On the contrary, when a part of the flows is restarted after a block of 20 seconds, PID2 performs a faster stabilization.

Figure 5.7: Transient time for Ex.1 and block of flows in Ex.2

Figure 5.8: Experiment 2: PID vs. PI

Figure 5.9: Experiment 3: PID vs. PI

The result of experiment 3 (Figure 5.9) shows a similar behavior of the three systems in case of a low number of flows. The best results achieved by the PID algorithms are shown in Figure 5.10: with 4 FTP flows competing for the slots in the router queue, the PID controllers have fast initial responses (around 1 s compared to 6 s for the PI) and small variances around the reference point.

Figures 5.11 and 5.12 represent the queue evolution when the router is loaded with FTP and HTTP flows. Compared to the experiments with FTP packets only, the introduction of small client-server packets does not affect the performance of the controllers, whose behaviors are very similar to experiments 2 and 4. The introduction of unresponsive flows increases the variance of the queue length for both PI and PID controllers.

The distributions of the queue length show the behavior of PID1 and PID2: these implementations do not totally prevent queue overflow, but they can decrease the number of packet retransmissions in transient time. PID1 experiences overflow only in Experiments 4 and 6, while PID2 can prevent the queue from becoming full only in Experiment 3. The prevention of overflows, even during settling time, is useful both for the router, to get a stabilized queue length and keep margins for unresponsive traffic, and for end users, to improve throughput.
Figure 5.10: Experiment 4: PID vs. PI

Figure 5.11: Experiment 5: PID vs. PI

Algorithm   Ex 1      Ex 2      Ex 3      Ex 4      Ex 5      Ex 6      Ex 7
PI          66.6727   65.547    31.8862   25.11     64.8226   14.551    88.841
PID1        43.1862   39.9757   31.9656   7.9475    38.8274   76.5422   49.828
PID2        56.968    43.2889   3.1869    89.1458   52.762    68.2996   54.542

Table 5.1: Standard deviation of queue length for the whole simulation time
Figure 5.12: Experiment 6: PID vs. PI

Figure 5.13: Experiment 7: PID vs. PI

Figure 5.14: PID1 - Distribution of queue length values (panels: (a) Ex.1, (b) Ex.2, (c) Ex.3, (d) Ex.4, (e) Ex.5, (f) Ex.6, (g) Ex.7)

Figure 5.15: PID2 - Distribution of queue length values (panels: (a) Ex.1, (b) Ex.2, (c) Ex.3, (d) Ex.4, (e) Ex.5, (f) Ex.6, (g) Ex.7)
Chapter 6

Internal Model Control - IMC

In this chapter the Internal Model Control design is presented and analyzed; the controller is then used with the model of the TCP dynamics developed in [1] to test the possibilities of improving performance in the simulated system. This control system design strategy was proposed by M. Morari and C. E. Garcia [17] in 1982, and it is very useful for controlling systems whose models may contain errors or approximations.

6.1 Closed-loop criteria

From a closed-loop point of view, there are several performance criteria that a control design tries to achieve:

- Stable closed loop
- Effects of disturbances are minimized
- Rapid and smooth responses
- No offset
- Robust control system (the system is insensitive to changes in the process conditions and to errors in the process model)

These criteria cannot all be satisfied at the same time with a normal control system, because of conflicts and trade-offs. The main ones are between response speed and the effects of disturbances on the system, and between robustness and performance. In the previous chapters the controller design stressed the compromise between stability and performance: in the case of the PID design, performance was treated as a second-order criterion, in order not to lose stability.
Figure 6.1: Internal Model Control block diagram

6.2 IMC design

The Internal Model Control design takes the model uncertainty into account and makes it straightforward to relate the controller settings to the model parameters. The block diagram in Figure 6.1 shows the IMC approach. The transfer function $G$ represents the real TCP dynamics (in this thesis work, the NS-2 simulated behavior), while $G^*$ is the approximated linearized system derived in [1]. The output of the controller (p = probability of dropping/marking a packet) is sent to both processes, whose outputs are subtracted and fed back. The reference value $q_{ref}$ is combined with the feedback quantity to create the new controller input.

From the theory of feedback block structures it is known that the transfer function of the forward link in an encapsulated system is given by the closed-loop formulation divided by the feedback link. Following this rule, the IMC structure can be transformed into a classical feedback block diagram (shown in Figure 6.2), with the transformation

$$G_c = \frac{G_c^*}{1 - G_c^* G^*} \tag{6.1}$$

A proof of the relation is given in the following steps, with a direct analysis of the IMC structure. If the inner feedback system with the process model is isolated from the process itself, the following relation can be obtained:

$$p = q_{ref}\,G_c^* - q\,G_c^* + p\,G^* G_c^* \tag{6.2}$$

Grouping input and output terms, it becomes:

$$p\,(1 - G^* G_c^*) = G_c^*\,(q_{ref} - q), \tag{6.3}$$

which can be expressed as the following transfer function

$$\frac{p}{q_{ref} - q} = \frac{G_c^*}{1 - G^* G_c^*} \tag{6.4}$$

In order to design the controller $G_c^*$ (which is related to the process model) the following steps are followed:
Figure 6.2: IMC equivalent feedback block diagram

- $G^*$ is factored into $G^*_+$ and $G^*_-$, where the first term contains any time delays and right-half-plane zeros of the function $G^*$.
- The controller $G_c^*$ is defined as $\dfrac{1}{G^*_-}\,f$, where $f$ is a low-pass filter with static gain one.

The function $f$ is used to make the controller a proper transfer function (i.e. the order of the denominator is equal to or higher than that of the numerator) and to tune the controller. The only factor that is inverted is $G^*_-$, as inverting the other factor would create poles in the right-half plane, which lead to instability, and an inverse delay term $e^{+st}$, which would require knowledge of future events. The function $f$ has the form

$$f = \frac{1}{(\lambda s + 1)^r}, \tag{6.5}$$

where $r$ is used to add the right number of poles to compensate the zeros and $\lambda$ becomes the only tuning parameter in the IMC controller design.

As this structure uses the process inside the feedback structure, it is not possible to tune the controller exactly for the wanted results: the control theoretical analysis is therefore done under the hypothesis that the process model matches the real dynamics. Once the classical feedback structure has been derived and the controller tuned, the process $G$ is substituted by its model $G^*$: further modifications in the controller tuning can be made after the implementation of the algorithm in the simulator NS-2.

6.3 Controller design and tuning

In order to get the final controller, the controller of the IMC structure, $G_c^*$, must be designed first, following the main steps written before. The process model from [1] is

$$P(s) = \frac{\frac{C^2}{2N}\,e^{-sR}}{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)} \tag{6.6}$$

The transfer function (6.6) presents a time delay term, which is ignored when building $G_c^*$ in order to keep it proper. It becomes:

$$G_c^* = \frac{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)}{\frac{C^2}{2N}} \cdot \frac{1}{(\lambda s + 1)^2}, \tag{6.7}$$
where $r$ (order of the low-pass filter) has been set to 2 in order to balance the number of zeros and poles. If $\lambda$ is chosen positive, (6.7) is ensured to be a proper and stable transfer function. The next step is to get the formulation of $G_c$ for the equivalent system: substituting (6.7) in (6.1), with the delay of the model represented by the first-order approximation $e^{-sR} \approx \left(1 - \frac{R}{2}s\right)/\left(1 + \frac{R}{2}s\right)$, it becomes:

$$G_c = \frac{\dfrac{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)}{\frac{C^2}{2N}(\lambda s + 1)^2}}{1 - \dfrac{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)}{\frac{C^2}{2N}(\lambda s + 1)^2} \cdot \dfrac{\frac{C^2}{2N}\left(1 - \frac{R}{2}s\right)}{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)\left(1 + \frac{R}{2}s\right)}} \tag{6.8}$$

With some simplifications and reordering of terms the transfer function of the controller in the classical feedback structure is expressed as:

$$G_c = \frac{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)\left(1 + \frac{R}{2}s\right)}{\frac{C^2}{2N}\left[\frac{R\lambda^2}{2}s^3 + \lambda(\lambda + R)s^2 + (2\lambda + R)s\right]} \tag{6.9}$$

The controller is then tuned with the help of frequency and time responses. The network parameters are set as follows: C = 3750 packets/sec, N = 60, R = 0.246 sec. The plots of margins (in frequency domain) and step response (time domain) of the system under default conditions are shown in Figure 6.3. The value of $\lambda$ used for the controller is 0.1; it has been tuned according not only to the results of the control theory, but also after the analysis of the behavior of several implementations in the network simulator. It can be observed how fast the controller is to reach the input, with a typical non-minimum-phase behavior without overshoot.

6.4 Static design vs. Adaptive

The IMC design strategy for the controller opens two scenarios. A static algorithm can be derived, and the performance of the system will then depend on the network conditions. This is the way chosen for the previous controllers, because the resulting implementation is easy to transfer to routers that already use a RED congestion control algorithm. On the other hand, the inner structure of a controller designed from an IMC structure gives the possibility of creating an adaptive controller, directly linked to the network operating conditions.
The practical implementation of an adaptive controller, which is built on network parameters, depends on the quality of the parameter estimation: only access to these data lets the controller work properly. In Section 6.5 a fixed implementation is studied and tested, while the study of an adaptive one is presented in Section 6.7.
Figure 6.3: IMC: frequency and time response (Bode diagram: Gm = 8.98 dB at 9.2 rad/sec, Pm = 63.6 deg at 2.33 rad/sec)

6.5 Algorithm implementation

The previously tuned controller must be digitalized in order to get a C++ implementation to be used inside the simulator NS-2. Under the network conditions described in Section 6.3 the controller can be expressed as

$$G_c(s) = \frac{0.0008533s^3 + 0.01086s^2 + 0.0337s + 0.01491}{s^3 + 28.13s^2 + 362.6s} \tag{6.10}$$

The transfer function is then digitalized to get an expression in the z-domain. The sampling frequency must be chosen according to the cut-off frequency ($\omega_c$) of the open-loop system. In the IMC design it is 2.33 rad/sec, compared to 1.22 rad/sec of the PID controller: the system is then oversampled at 80 Hz, considering that the response is twice as fast. With a Tustin approximation of the variable $s$, the resulting transfer function in the z-domain becomes

$$G_c(z) = \frac{0.0007752z^3 - 0.002207z^2 + 0.002093z - 0.0006612}{z^3 - 2.657z^2 + 2.361z - 0.7045}, \tag{6.11}$$

which can be factorized in zero-pole form as

$$G_c(z) = \frac{0.00077524\,(z - 0.9934)(z - 0.9504)(z - 0.9033)}{(z - 1)(z^2 - 1.657z + 0.7045)} \tag{6.12}$$

The two main parameters of the network (number of flows and RTT) are varied over a wide range to test the stability of the system: in Figure 6.4 the colored area represents the conditions that lead to instability. Compared to the PID working area, the IMC-based algorithm shows stability problems with a low number of flows, while being more stable in other conditions.

Figure 6.4: A parametrical analysis of instability area, with algorithm test points (x-axis: number of flows; y-axis: Round Trip Time in sec.)

As the denominator has order 3, the transfer function is transformed into the series of two simpler transfer functions of lower order. As for the PID implementation, a support variable $x$ is used to keep the intermediate values. The following steps describe the creation of the final simulation implementation:

$$\frac{x}{dq} = 0.00077524\,\frac{1 - 0.9934z^{-1}}{1 - z^{-1}} \tag{6.13}$$

Cross multiplication gives

$$x(1 - z^{-1}) = 0.00077524\,dq\,(1 - 0.9934z^{-1})$$

$$x_k = x_{k-1} + 7.7524 \cdot 10^{-4}\,dq_k - 7.7012 \cdot 10^{-4}\,dq_{k-1} \tag{6.14}$$

The second part of the controller gives

$$p\,(1 - 1.657z^{-1} + 0.7045z^{-2}) = x\left[(1 - 0.9504z^{-1})(1 - 0.9033z^{-1})\right]$$

$$p_k = 1.657p_{k-1} - 0.7045p_{k-2} + x_k - 1.8537x_{k-1} + 0.8585x_{k-2} \tag{6.15}$$

Putting all together, the recursive instructions are the following:

$$\begin{aligned} p_k &= 1.657p_{k-1} - 0.7045p_{k-2} + x_k - 1.8537x_{k-1} + 0.8585x_{k-2} \\ x_k &= x_{k-1} + 7.7524 \cdot 10^{-4}\,dq_k - 7.7012 \cdot 10^{-4}\,dq_{k-1} \end{aligned} \tag{6.16}$$

The frequency function and the step response of the system for a number of different network conditions (represented by the circles in Figure 6.4) are shown in Figure 6.5. The step response of test 3 is not shown, because it is unstable.
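The zero-pole form of the discretized controller can be cross-checked by mapping the roots of the continuous controller (6.9) through the Tustin substitution. The sketch below assumes the operating point N = 60, C = 3750 packets/sec, R = 0.246 sec, λ = 0.1 and a sampling frequency of 80 Hz; these are the values that reproduce the coefficients of the z-domain controller, and the function names are illustrative.

```python
import cmath

def tustin_map(s_root, f_s):
    """Map an s-plane root to the z-plane under s = 2*f_s*(z - 1)/(z + 1)."""
    return (2.0 * f_s + s_root) / (2.0 * f_s - s_root)

def imc_controller_roots(n_flows, capacity, rtt, lam):
    """Zeros and poles (s-plane) of the classical-feedback controller (6.9)."""
    zeros = [-2.0 * n_flows / (rtt ** 2 * capacity), -1.0 / rtt, -2.0 / rtt]
    # The denominator of (6.9) is proportional to s*(s^2 + b1*s + b0).
    b1 = 2.0 * (lam + rtt) / (lam * rtt)
    b0 = 2.0 * (2.0 * lam + rtt) / (lam ** 2 * rtt)
    root = cmath.sqrt(b1 * b1 - 4.0 * b0)
    poles = [0.0, (-b1 + root) / 2.0, (-b1 - root) / 2.0]
    return zeros, poles

# Assumed operating point (see the lead-in above).
zeros_s, poles_s = imc_controller_roots(60, 3750.0, 0.246, 0.1)
zeros_z = [tustin_map(s, 80.0) for s in zeros_s]
poles_z = [tustin_map(s, 80.0) for s in poles_s]

# Coefficients of the quadratic factor z^2 - pole_sum*z + pole_prod.
pole_sum = (poles_z[1] + poles_z[2]).real
pole_prod = (poles_z[1] * poles_z[2]).real
print([round(z, 4) for z in zeros_z], round(pole_sum, 4), round(pole_prod, 4))
```

Under these assumptions the mapped zeros and the quadratic pole factor should come out close to the values of the factorized controller, and the integrator pole at s = 0 maps to z = 1.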
Figure 6.5: Internal Model Control: frequency and time responses

6.6 The simulation results

The following pseudo-code is executed at every sampling time:

    x = x_old + e*(q-q_ref) - f*(q_old-q_ref)
    p = a*p_old - b*p_old2 + x - c*x_old + d*x_old2
    p_old2 = p_old
    p_old = p
    q_old = q
    x_old2 = x_old
    x_old = x

with

Sampling frequency: 80 Hz
a = 1.657
b = 0.7045
c = 1.8537
d = 0.8585
e = 7.7524e-4
f = 7.7012e-4
Algorithm   Ex 1      Ex 2      Ex 3      Ex 4      Ex 5      Ex 6      Ex 7
PI          66.6727   65.547    31.8862   25.11     64.8226   14.551    88.841
PID1        43.1862   39.9757   31.9656   7.9475    38.8274   76.5422   49.828
IMC         34.6913   29.2412   29.983    14.8258   28.183    67.4426   38.4543

Table 6.1: Standard deviation of queue length for the whole simulation time

The graphical results from the simulations are compared with the PI controller, which was set as performance benchmark.

Figure 6.6: Experiment 1: IMC vs. PI

From the plots of the simulation results and the numerical data it is easy to observe that the IMC design leads to responses similar to the PID implementations, but it improves the queue stabilization. Moreover, the distribution of the queue length along the simulation time in Figure 6.13 shows that IMC avoids buffer overflow in most of the cases (except for Experiments 4 and 6).

6.7 Adaptive implementation

As discussed in Section 6.4, the IMC controller structure allows a direct design for an adaptive implementation. The design follows the same rules as the static version, but it keeps all the variables inside the formulation: these variables are then tracked or estimated by the router to apply appropriate changes in the shape of the controller.

The adaptive design of the controller, together with a good estimation of the network parameters, solves the problems of instability that the PID and static IMC designs showed: since the controller is built from a stable set of parameters, stability cannot be lost if the parameters are estimated correctly.
Figure 6.7: Experiment 2: IMC vs. PI

Figure 6.8: Experiment 3: IMC vs. PI

Figure 6.9: Experiment 4: IMC vs. PI

Figure 6.10: Experiment 5: IMC vs. PI

Figure 6.11: Experiment 6: IMC vs. PI

Figure 6.12: Experiment 7: IMC vs. PI

Figure 6.13: IMC - Distribution of queue length values (panels: (a) Ex.1, (b) Ex.2, (c) Ex.3, (d) Ex.4, (e) Ex.5, (f) Ex.6, (g) Ex.7)
From equation (6.9) further steps can be taken to obtain a formulation similar to (6.12): first the numerator is expressed in proper zero form and the denominator is simplified.

$$G_c(s) = \frac{2N}{C^2\lambda^2} \cdot \frac{\left(s + \frac{2N}{R^2 C}\right)\left(s + \frac{1}{R}\right)\left(s + \frac{2}{R}\right)}{s^3 + \frac{2\lambda(\lambda + R)}{\lambda^2 R}s^2 + \frac{2(2\lambda + R)}{\lambda^2 R}s} \tag{6.17}$$

The z-transform with a Tustin approximation is then applied to this expression, and the resulting form is

$$G_c(z) = \frac{N(80R^2C + N)(160R + 1)(80R + 1)}{40R^3C^3\,(12800\lambda^2R + 160\lambda^2 + 160\lambda R + 2\lambda + R)} \cdot \frac{\left(z - \frac{80R^2C - N}{80R^2C + N}\right)\left(z - \frac{160R - 1}{160R + 1}\right)\left(z - \frac{80R - 1}{80R + 1}\right)}{(z - 1)\left(z^2 + z\,\frac{4\lambda + 2R - 25600\lambda^2R}{12800\lambda^2R + 160\lambda^2 + 160\lambda R + 2\lambda + R} + \frac{12800\lambda^2R - 160\lambda^2 - 160\lambda R + 2\lambda + R}{12800\lambda^2R + 160\lambda^2 + 160\lambda R + 2\lambda + R}\right)} \tag{6.18}$$

This long formula can be simplified for 2 reasons:

- The parameter $\lambda$ can be kept fixed ($\lambda = 0.1$) even in the adaptive algorithm, as performance and robustness do not vary much while the number of computations per round is heavily decreased.
- The algorithm is tested on a single router with known capacity: supposing that the implementation is applied case by case to the routers of a larger network, the parameter $C$ can be set as constant.

After these considerations, (6.18) becomes

$$G_c(z) = \frac{N(3 \cdot 10^5 R^2 + N)(160R + 1)(80R + 1)}{2.1094 \cdot 10^{12}\,R^3\,(145R + 1.8)} \cdot \frac{\left(z - \frac{3 \cdot 10^5 R^2 - N}{3 \cdot 10^5 R^2 + N}\right)\left(z - \frac{160R - 1}{160R + 1}\right)\left(z - \frac{80R - 1}{80R + 1}\right)}{(z - 1)\left(z^2 + z\,\frac{-254R + 0.4}{145R + 1.8} + \frac{113R - 1.4}{145R + 1.8}\right)} \tag{6.19}$$

Expression (6.19) now has the same structure as (6.12), and a digital implementation can be easily derived with the same steps used for the static implementation. It can be observed that in (6.19) there are only 2 free parameters (R = Round Trip Time and N = number of flows): the following section explains how these parameters are tracked by the router in the considered topology and what happens in a broader network.

6.7.1 Tracking and estimating parameters

In this master thesis work a simple topology has been considered: the transmission of packets between two end-points through a single router.
This structure is appropriate for studying the effects of congestion control algorithms because, if each router is able to stabilize its buffer at the chosen threshold, a network of routers will show the same steady-state behavior.
For the static implementations considered in previous chapters, the parameters of the network could affect the performance of the controller's response, but they did not need to be known for the controller to work correctly. For the adaptive IMC algorithm (as well as for all the other adaptive strategies) the network parameters must be known to the router for the correct action of the controller.

Parameters that can be tracked by the router by analyzing the fields of packet headers or the buffer status are evaluated independently of the other network elements: the number of flows N can be counted by each single router by building a table of source/destination IP addresses. Other parameters (e.g. the Round Trip Time) depend on a broader range of variables and cannot be directly evaluated by the router: for those parameters statistical measurements and estimation theory help to find values that follow the real ones.

The definition of RTT, which must be evaluated for building the adaptive IMC controller, is

$$R = \frac{q}{C} + T_P, \tag{6.20}$$

where the first term represents the queueing time and $T_P$ includes propagation time and router latency. Each physical link between the nodes of a network has a different length and different construction specifications. Once pushed into the link, a bit travels at a speed that depends on the medium (twisted-pair copper wire, coaxial cable, optical fiber and so on) and is in any case limited by the speed of light. The length of each single link determines the time of travel; the number of hops (a hop is defined as a passage from one router to another), as well as the packet processing delays inside each router on the path, determine the propagation time for each packet. Expression (6.20) is made of a trackable term (given by the instantaneous queue length divided by the router packet processing capacity) and a quantity that must be estimated.
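How the trackable queueing term feeds the adaptive controller can be sketched as follows. The code computes the RTT from the queue length and then the gain, zeros and quadratic denominator coefficients of the adaptive z-domain controller for a generic Tustin discretization. The defaults C = 3750 packets/sec, λ = 0.1 and a sampling frequency of 80 Hz are assumptions, as are the function names and the value of $T_P$ in the example.

```python
def round_trip_time(queue_len, capacity, t_prop):
    """RTT: queueing delay q/C plus the propagation/latency term T_P."""
    return queue_len / capacity + t_prop

def adaptive_imc_coefficients(n_flows, rtt, capacity=3750.0, lam=0.1, f_s=80.0):
    """Gain, z-plane zeros and (c1, c0) of the factor z^2 + c1*z + c0 for the
    Tustin-discretized IMC controller, recomputed from N and R."""
    k = 2.0 * f_s
    s_zeros = (2.0 * n_flows / (rtt ** 2 * capacity), 1.0 / rtt, 2.0 / rtt)
    z_zeros = tuple((k - s) / (k + s) for s in s_zeros)
    d2 = k * k * lam * lam * rtt / 2.0 + k * lam * (lam + rtt) + 2.0 * lam + rtt
    d1 = 2.0 * (2.0 * lam + rtt) - k * k * lam * lam * rtt
    d0 = k * k * lam * lam * rtt / 2.0 - k * lam * (lam + rtt) + 2.0 * lam + rtt
    gain = (n_flows * rtt * (k + s_zeros[0]) * (k + s_zeros[1]) * (k + s_zeros[2])
            / (capacity ** 2 * k * d2))
    return gain, z_zeros, (d1 / d2, d0 / d2)

# Hypothetical situation: 200 queued packets and T_P = 0.2 s.
rtt_now = round_trip_time(200.0, 3750.0, 0.2)
gain_now, zeros_now, denom_now = adaptive_imc_coefficients(60, rtt_now)
print(rtt_now, gain_now, zeros_now, denom_now)
```

Evaluated at N = 60 and R = 0.246 sec, the same function should return values close to the static controller, which gives a quick consistency check between the two designs.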
For the network topology considered in this project the propagation time is set and its average is kept fixed in each experiment: the main aim of this master thesis is to analyze the performance and stability of congestion control algorithms, and separate scenarios are used to test them. Further improvements towards a more general and realistic implementation can be derived from studies focusing on the difficult aspect of RTT estimation.

Active flows estimation method

Instead of measuring the number of flows by compiling a large per-flow table, an interesting estimation method is proposed in [19]: it has been developed to stabilize the RED algorithm, but it can be easily adapted to work with control theoretical methods. The basic idea is to compare every packet that arrives at the buffer with a randomly chosen packet that preceded it within a short range of time: they can be part of the same flow or of different flows. In order to increase the chance of a good estimation, a support buffer is created to keep track of a range of time longer than the buffer capacity: the buffer contains a number M of recent active flows, each with a timestamp (to compare the temporal distance of two packets in the same flow) and a counter for the number of packets of the considered flow.
The buffer is empty in the beginning; as a packet arrives at the router queue, its flow identifier (source address, destination address, etc.) is compared to the one of a randomly chosen packet in the support buffer. If they match, the counter of the existing entry is increased by one, otherwise the arriving packet is added to the buffer. When the support buffer is full and the arriving packet does not match the randomly picked entry, with probability $p$ the new packet overwrites the existing entry, formally starting a new flow counter, and with probability $(1 - p)$ no change is made. The support buffer is only used to estimate the number of flows: each packet arriving at the router queue, after being compared with an entry of the support buffer, can then be marked or dropped according to the AQM policy.

Keeping the probability $p$ constant, the time it takes for the support buffer to lose the memory of past active flows is $M/p$ packets. Depending on the choice of the flow identifiers, the two parameters can be properly tuned. An estimate $P(t)$ of the frequency of matches around the time of the arrival of the t-th packet at the queue is kept. Defining

$$Match(t) = \begin{cases} 0 & \text{if no match,} \\ 1 & \text{if match} \end{cases} \tag{6.21}$$

the following relation can be written:

$$P(t) = (1 - \alpha)P(t-1) + \alpha\,Match(t), \tag{6.22}$$

where $\alpha$ is a weight between 0 and 1. $P(t)$ can be considered an estimate of the frequency of matches (two packets being part of the same flow) as well as of the probability that an arriving packet matches a randomly chosen one.
Each packet that arrives at the queue has probability $\pi_i$ of belonging to flow $i$; supposing that these probabilities do not change over time and knowing that an entry in the support buffer represents flow $i$ with probability $\pi_i$, the probability that an incoming packet matches the picked entry, i.e. is part of the same active flow, is

$$P\{Match(t) = 1\} = \sum_i \pi_i^2 \tag{6.23}$$

In case of flows with identical intensity, $\pi_i = \frac{1}{N}$ for $1 \le i \le N$ and the previous formula becomes

$$P\{Match(t) = 1\} = \frac{1}{N} \tag{6.24}$$

In the symmetrical case, $P(t)^{-1}$ represents a good estimation of the number of flows, and it can be extended to the asymmetrical case, as an approximate value is already good enough for use in the AQM algorithm.
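The estimation procedure described above can be sketched as a small class. This is only an illustration of the idea, not the implementation of [19]: the class and attribute names are hypothetical, the entry layout (flow identifier, counter, timestamp) and the timestamp refresh on a match are assumptions, and the parameter values in the example are arbitrary.

```python
import random

class FlowCountEstimator:
    """Support buffer with random comparison and an exponentially weighted
    match frequency P(t); the flow count is estimated as 1/P(t)."""

    def __init__(self, size, overwrite_prob, alpha, rng=None):
        self.size = size                      # M: capacity of the support buffer
        self.overwrite_prob = overwrite_prob  # p: overwrite probability when full
        self.alpha = alpha                    # weight of the newest observation
        self.rng = rng or random.Random(0)
        self.entries = []                     # [flow_id, counter, timestamp]
        self.match_freq = 0.0                 # P(t)

    def observe(self, flow_id, now=0.0):
        """Process one arriving packet and return Match(t) (0 or 1)."""
        match = 0
        if not self.entries:
            self.entries.append([flow_id, 0, now])
        else:
            idx = self.rng.randrange(len(self.entries))
            entry = self.entries[idx]
            if entry[0] == flow_id:
                match = 1
                entry[1] += 1
                entry[2] = now                # assumption: refresh timestamp
            elif len(self.entries) < self.size:
                self.entries.append([flow_id, 0, now])
            elif self.rng.random() < self.overwrite_prob:
                self.entries[idx] = [flow_id, 0, now]
        self.match_freq = (1.0 - self.alpha) * self.match_freq + self.alpha * match
        return match

    def estimated_flows(self):
        """Estimate of N as the inverse of the match frequency."""
        return 1.0 / self.match_freq if self.match_freq > 0 else float("inf")

# Example with 20 flows of identical intensity (arbitrary parameters).
estimator = FlowCountEstimator(size=100, overwrite_prob=0.25, alpha=0.001)
traffic = random.Random(1)
for _ in range(20000):
    estimator.observe(traffic.randrange(20))
print(estimator.estimated_flows())
```

With a single flow every comparison after the first packet is a match, so the estimate converges to 1; with many flows the estimate is noisy and needs a large number of packets to settle, which reflects the slow convergence of this kind of estimator.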
68 CHAPTER 6. INTERNAL MODEL CONTROL - IMC 1 5 Step Response 1 1 2 3 4 5.8 5.6 1 1 3 1 2 1 1 1 1 1 1 2 1 3 5 1 15 2 25 3 35 4 1 3 1 2 1 1 1 1 1 1 2 1 3 1 2 3 4 5 Amplitude.4.2.2.4.5 1 1.5 2 2.5 3 Time (sec) Figure 6.14: Adaptive IMC: frequency and time response 6.8 The theoretical results The controller (6.19) is tested under different network conditions to show its theoretical robustness: in Figure 6.5 the plot of margins and time response for the standard parameters, corresponding to the static algorithm, were shown to be compared with other algorithms. The adaptive algorithm is now tested under the network conditions used in sections 5.2.2 and 6.5: in Figure 6.14 it can be easily observed that the results are extremely similar among them. The adaptive algorithm ensures a common response for every choice of network parameters (performing well also in cases of other algorithms instability). The possibility of getting the same high level of performance once the algorithm is implemented into the network is strictly linked to the quality of the parameter tracking and their computational speed. 6.9 The simulation results The adaptive algorithm is first tested under an ideal condition, i.e. with full knowledge of the network parameters. This means that the number of flows and the Round-Trip Time for each experiment are set into the program and they are not estimated: also the changes into the network are passed to the program at the same time they happen (e.g. the block of flows in Experiment 2). This case represents a benchmark for evaluating the performance of both the static and the adaptive implementation: if the flows are long-lived FTP connections, the full knowledge of their number makes the algorithm reach the best performance. In case of a
mixture of FTP and HTTP flows, the flows carry different weights in the eyes of the controller, and an adaptive implementation that relies on a flow-id computation does not follow the optimal design.

Algorithm    Ex 1      Ex 2      Ex 3      Ex 4      Ex 5      Ex 6      Ex 7
IMC          34.6913   29.2412   29.983    14.8258   28.183    67.4426   38.4543
Ideal AIMC   33.6992   3.7818    33.552    18.14     26.4164   24.7617   37.884

Table 6.2: Standard deviation of queue length for the whole simulation time

Figure 6.15: Ideal AIMC vs IMC: Experiment 4 and 6

Looking at Table 6.2, the values of the adaptive IMC are very similar in all the simulations, while the static design shows a rather large range of standard deviations. It is also interesting to look at the results of Experiment 2: when the Round-Trip Time of the flows is random, evaluating the instantaneous queue delay cannot give the algorithm a better way to control the different flows. Figure 6.15 shows the largest improvements of the ideal adaptive IMC design: knowing the number of flows when the router is highly loaded helps to obtain a good stabilization of the queue and a fast response.

It must be observed that these results are theoretical: the estimation of the number of flows and of the round-trip time is a complicated subject, and even the method described in Section 6.7.1 takes a very long time to reach an acceptable estimate. The results of Experiment 3 show that an adaptive AQM algorithm gives better results than its corresponding static implementation in simple networks, but its use in real
networks presents many difficulties. In the case of adaptive IMC design, an advantage may be the stability of the controller: the interaction between more advanced estimation methods and this controller may reach very high performance.
Chapter 7 Optimal control

The algorithms studied in the previous chapters focus on the transient response of the process, trying to make it as fast as possible. Optimal control takes into consideration a balance between speed and disturbance rejection, and it is the key to obtaining a robust controller for the complex dynamics of TCP flows competing for routing.

7.1 Sensitivity function

The Gain Margin and the Phase Margin give useful information about the stability of a system. However, they have limitations as guides to the design of realistic control problems. The considered system has two inputs: a reference signal (constant step of height 2) and unknown disturbances, caused by the interactions of multiple flows competing for the service at the router and by the congestion control. So far the dynamic performance has been described by the transient response to a step input, which represented the reference input. The disturbances were not taken into consideration.

Figure 7.1 shows the closed-loop block diagram of the system: the reference input and the additive disturbances of the process are represented, while the measurement noise on the output is omitted because it is not present in the considered system.

Figure 7.1: Closed-loop block diagram (reference input, controller C(s), plant P(s), additive disturbances, output)

Let $Y$ be the output and $D$ the disturbances. From the closed loop, the output can
be expressed as

\[ Y = \frac{1}{1 + CP}\, D = S D, \tag{7.1} \]

where the factor $S$ is called the sensitivity function. It also represents the transfer function between the reference input and the error between the output and the reference input. The complementary sensitivity function $T$ is defined as $1 - S$ and has the form (using the notation of Figure 7.1)

\[ T = \frac{CP}{1 + CP}. \tag{7.2} \]

The goal in the controller's design is to attenuate the disturbances that enter the system ($|S|$ small) and at the same time to follow the reference signal ($|T| = 0$ dB). The two functions are complementary, so both must be taken into consideration when designing a system. Linear Quadratic control is able to balance the two properties, in order to build the optimal controller according to the requirements and the constraints of the system.

7.2 Linear Quadratic control

In order to design a linear quadratic controller, the system is usually described with state-space equations. Given the system

\[ \dot{x} = Ax + Bu, \qquad y = Cx + Du, \tag{7.3} \]

the controller can be described as

\[ u = F_r r - F_y y, \tag{7.4} \]

where $r$ is the reference signal and $y$ is the output of the system. The Linear Quadratic controller tries to balance the states $x$ and the input $u$. A cost function is defined by

\[ \frac{1}{2} \int_0^T \left( x'(t)\, Q_1\, x(t) + u'(t)\, Q_2\, u(t) \right) dt, \tag{7.5} \]

where $Q_1$ and $Q_2$ are the weights of the two variables and $x'$ denotes the transpose of $x$. The cost function represents the weighted sum of the energy of the state and of the control, and it must be minimized. The design parameters $Q_1$ and $Q_2 > 0$ must first be chosen according to the system and to the desired results; an optimal system with good response is then obtained through several iterations.

The theory behind the linear quadratic problem is briefly explained here; the synthesis of the controller is then carried out with a mathematical program, because of the iterative nature of the process. First the Hamiltonian is defined as

\[ H(x, \lambda, t) = \frac{1}{2} \left( x' Q_1 x + u' Q_2 u \right) + \lambda' (Ax + Bu). \tag{7.6} \]
The minimum principle states that the optimal control and the optimal state trajectories must satisfy the following equations:

\[ \dot{x} = \frac{\partial H}{\partial \lambda}, \qquad x(0) = x_0 \tag{7.7} \]

\[ -\dot{\lambda} = \frac{\partial H}{\partial x}, \qquad \lambda(T) = 0 \tag{7.8} \]

\[ \frac{\partial H}{\partial u} = 0 \tag{7.9} \]

where $\lambda$ is the adjoint state. Using the rules for differentiating matrices and vectors, in the LQ regulator case they become

\[ \dot{x} = Ax + Bu, \qquad x(0) = x_0 \tag{7.10} \]

\[ -\dot{\lambda} = Q_1 x + A' \lambda, \qquad \lambda(T) = 0 \tag{7.11} \]

\[ u^* = -Q_2^{-1} B' \lambda \tag{7.12} \]

where $u^*$ is the optimal control. These coupled linear differential equations form a two-point boundary value problem (TPBVP), which is difficult to solve numerically. If the optimal control is substituted into the state equation, the system becomes

\[ \begin{bmatrix} \dot{x} \\ \dot{\lambda} \end{bmatrix} = \begin{bmatrix} A & -B Q_2^{-1} B' \\ -Q_1 & -A' \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = H \begin{bmatrix} x \\ \lambda \end{bmatrix}. \]

The matrix $H$ is called the Hamiltonian matrix and it is important in Linear Quadratic theory. It turns out that the complex TPBVP does not have to be solved directly, and the solution can be reached through the following steps. First the substitution

\[ \lambda = P x \tag{7.13} \]

is made. Differentiating both sides of (7.13) with respect to time and substituting from (7.10)-(7.13), the following relation is obtained:

\[ \frac{d\lambda}{dt} = \frac{dP}{dt} x + P \frac{dx}{dt} = \frac{dP}{dt} x + P A x - P B Q_2^{-1} B' P x = -Q_1 x - A' P x. \tag{7.14} \]

Equation (7.14) must hold for any state $x$, hence a sufficient condition for the optimal control is that $P$ satisfies the Riccati differential equation

\[ -\frac{dP}{dt} = A' P + P A + Q_1 - P B Q_2^{-1} B' P, \qquad P(T) = 0. \tag{7.15} \]

This is called the finite horizon problem (the integral is taken between 0 and $T$) and it gives a linear time-varying controller. If $T$ is made to approach infinity, it turns out that, under mild conditions, $P(t) \to P$, and the positive definite solution of the algebraic Riccati equation (ARE) results in an asymptotically stable closed-loop system.
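The convergence of the finite horizon solution towards the solution of the ARE can be illustrated on a scalar system, where $A$, $B$, $Q_1$, $Q_2$ and $P$ are plain numbers. The sketch below is only an illustration under that assumption: the numerical values and function names are hypothetical, and a simple explicit Euler scheme is used instead of the mathematical software employed in this work.

```python
import math


def riccati_backward(a, b, q1, q2, horizon, dt=1e-3):
    """Integrate -dP/dt = 2aP + q1 - (b^2/q2) P^2 backwards from P(T) = 0.

    Returns P(0) for the scalar finite horizon problem.
    """
    p = 0.0
    for _ in range(int(round(horizon / dt))):
        dp_dt = -(2.0 * a * p + q1 - (b * b / q2) * p * p)
        p -= dt * dp_dt  # one Euler step backwards in time
    return p


def are_solution(a, b, q1, q2):
    """Positive root of the scalar ARE 2aP + q1 - (b^2/q2) P^2 = 0."""
    return q2 * (a + math.sqrt(a * a + b * b * q1 / q2)) / (b * b)


a, b, q1, q2 = -1.0, 1.0, 1.0, 1.0           # hypothetical scalar plant and weights
p_inf = are_solution(a, b, q1, q2)
gain = b * p_inf / q2                         # scalar version of K = Q2^{-1} B' P
print(riccati_backward(a, b, q1, q2, 0.1))    # short horizon: far from the ARE value
print(riccati_backward(a, b, q1, q2, 10.0))   # long horizon: close to the ARE value
print(p_inf, a - b * gain)                    # ARE solution and closed-loop pole
```

With a long horizon the backward integration settles on the positive ARE root, and the closed-loop pole $a - bK$ is negative, in line with the asymptotic stability stated above.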
The resulting controller has the form

\[ u = -K x, \tag{7.16} \]

where $K$ is given by

\[ K = Q_2^{-1} B' P \tag{7.17} \]

and the ARE is

\[ A' P + P A + Q_1 - P B Q_2^{-1} B' P = 0. \tag{7.18} \]

The choice of $Q_1$ and $Q_2$ determines the controller design. First of all the plant must be expressed in a state-space representation, so that the states and their weights inside the quadratic cost function can be considered. From Section 2.5.2 the following transfer function of the TCP dynamics and queue behavior is recalled:

\[ P(s) = \frac{\dfrac{C^2}{2N}\, e^{-s R_0}}{\left( s + \dfrac{2N}{R_0^2 C} \right) \left( s + \dfrac{1}{R_0} \right)}. \tag{7.19} \]

Using the standard values of the network parameters

    C = 15 Mbps = 3750 packets/sec (packets of fixed length 500 bytes)
    N = 60
    R_0 = 0.246 sec

and expressing the delay with a first-order Pade approximation, the system takes the form

\[ \begin{bmatrix} \dot{z} \\ \delta\dot{w} \\ \delta\dot{q} \end{bmatrix} = \begin{bmatrix} -8.13 & 0 & 0 \\ 122.1 & -0.5288 & 0 \\ 0 & 243.9 & -4.065 \end{bmatrix} \begin{bmatrix} z \\ \delta w \\ \delta q \end{bmatrix} + \begin{bmatrix} 64 \\ -480.5 \\ 0 \end{bmatrix} \delta p, \qquad \delta q = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} z \\ \delta w \\ \delta q \end{bmatrix}, \tag{7.20} \]

where $z$ is an additional state created by the delay approximation. The entries follow from (7.19): $2N/(R_0^2 C) = 0.5288$, $1/R_0 = 4.065$, $2/R_0 = 8.13$, $N/R_0 = 243.9$ and $R_0 C^2/(2N^2) = 480.5$. Figure 7.2 presents a block diagram of the process.

The states of the plant do not all have the same importance: the dynamics of the intermediate states $z$ and $\delta w$ are not an issue for the performance optimization. The output $y$ (corresponding to the third state $\delta q$) and the input $\delta p$ are the two variables that must be weighted inside the cost function. One way to set the weights $Q_1$ and $Q_2$ is to use the inverses of the squares of the values that the error and the control are expected to take. Some constraints must be taken into consideration: a strong control with high variation can lead to saturation, while realistic limitations should be respected when weighting the error.

In the considered system the control input is the probability of packet dropping/marking: after its calculation at every sampling instant, its value must be bounded
between 0 and 1. In case of strong control the algorithm loses the property of synchronization avoidance.

Figure 7.2: State-space graphical representation

The minimization of the error between the measured output and the reference output is the most important issue for the LQ design in this master thesis work: since the optimal controller created by this technique is a balance between two different properties, the output error can be weighted heavily enough to reach much better performance.

After these theoretical considerations and several simulations of different final algorithms in the network simulator NS-2, the following values have been assigned to the weights:

    Q_1 = 0.1
    Q_2 = 1

and the cost function to be minimized is

\[ \frac{1}{2} \int_0^\infty \left( \delta q'(t)\, Q_1\, \delta q(t) + u'(t)\, Q_2\, u(t) \right) dt, \tag{7.21} \]

where $\delta q = q - q_{ref}$, because the system has been linearized around the reference point.

Once the weights are set, iterative evaluations of the algebraic Riccati equation (ARE) lead to the state feedback controller, which feeds all the states back into the input (see Figure 7.3).

Figure 7.3: State feedback

The measurements of all the states are not available at the router, as the only parameter that is stored over time is the queue length. The Luenberger observer is a smart way to estimate the states from the output, and it is described in Section 7.4.
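As a cross-check on the state-space model used in this section, the pole locations and gains of (7.19) with a first-order Pade delay can be recomputed from the network parameters. The sketch below assumes C = 3750 packets/s (15 Mbps with 500-byte packets), N = 60 and a round-trip time of 0.246 s; these values are an assumption of this example, chosen because they reproduce the printed coefficients 0.5288, 8.13 and 243.9. The function and key names are hypothetical.

```python
def plant_coefficients(capacity, flows, rtt):
    """Coefficients of the TCP/queue plant with a first-order Pade delay."""
    return {
        "tcp_pole": 2.0 * flows / (rtt ** 2 * capacity),          # 2N / (R^2 C)
        "queue_pole": 1.0 / rtt,                                  # 1 / R
        "pade_pole": 2.0 / rtt,                                   # 2 / R
        "window_gain": rtt * capacity ** 2 / (2.0 * flows ** 2),  # R C^2 / (2 N^2)
        "queue_gain": flows / rtt,                                # N / R
    }


# Assumed network parameters (see the paragraph above).
coeffs = plant_coefficients(capacity=3750.0, flows=60, rtt=0.246)
for name, value in coeffs.items():
    print(f"{name}: {value:.4f}")

# The product of the two gains is the numerator C^2 / (2N) of the plant.
print(coeffs["window_gain"] * coeffs["queue_gain"])
```

The two poles of the plant and the Pade pole appear with a negative sign on the diagonal of the state matrix, which is why the open-loop model is stable for these parameters.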
7.3 Stability analysis

The Linear Quadratic controller has an interesting property concerning its stability margins,

\[ \mathrm{GM} \in [0.5, \infty), \qquad \mathrm{PM} \ge 60^\circ, \]

which can be demonstrated through the so-called Return Difference Inequality (RDI) [25]. Let $y$ be the output measurement and $y_c$ the regulated output. In general

\[ y = Cx, \qquad y_c = C_q x, \]

which give two transfer functions

\[ G(s) = C \Phi(s) B, \qquad G_q(s) = C_q \Phi(s) B, \tag{7.22} \]

where $\Phi(s) = (sI - A)^{-1}$. $G(s)$ is the transfer function between the input and the measurements, while $G_q(s)$ is the transfer function between the (input + disturbances) and the controlled outputs. After $Q_1$ is defined as $C_q' C_q$, $Q_2$ is set to 1 and the open loop of the system is identified from Figure 7.3, the following equality can be written:

\[ |1 + k \phi(j\omega) b|^2 = 1 + |G_q(j\omega)|^2. \tag{7.23} \]

As the right-hand side is never smaller than 1, the following inequalities are obtained:

\[ |1 + k \phi(j\omega) b| \ge 1 \quad \text{or} \quad |J(j\omega)| \ge 1 \quad \text{or} \quad |S(j\omega)| \le 1, \tag{7.24} \]

where $J(s)$ represents the return difference and $S(s)$ is the sensitivity function. Because of these inequalities, the Nyquist plot of the open-loop transfer function of an LQ-controller based design always stays outside the unit circle centered at $(-1, 0)$. The stability margins are then derived with geometric arguments.

7.4 Observer

The only measured state/output is the queue length, from which $\delta q = q - q_{ref}$ can be calculated. The Luenberger observer is a dynamical system that asymptotically estimates the states using information on the output and the input of the system. Figure 7.4 presents the complete functional structure: $L$ represents the array of values which multiply the output error in order to feed back the states of the estimator. It can be designed only if $(C, A)$ is detectable, i.e. if the information on the output is enough to observe all the states.
The system (7.20) satisfies this requirement, and the estimator must be designed to be faster than the state-feedback controller obtained through the solution of the ARE: its poles are placed to the left of the slowest pole of the controller, in order to guarantee fast convergence.
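The behaviour of a Luenberger observer can be sketched on a small example. The two-state plant below is hypothetical and is not the TCP/queue model (7.20); it has poles at -1 and -2, and the observer gain is chosen by hand so that both observer poles lie at -5, i.e. to the left of the plant poles. The simulation uses a plain Euler scheme and only the Python standard library.

```python
def simulate_observer(steps=4000, dt=0.001):
    """Euler simulation of a plant and its Luenberger observer.

    Returns the history of the largest absolute state estimation error.
    """
    # Hypothetical plant (NOT the TCP/AQM model): poles at s = -1 and s = -2.
    A = [[0.0, 1.0], [-2.0, -3.0]]
    B = [0.0, 1.0]
    C = [1.0, 0.0]
    # Gain chosen so that det(sI - (A - L C)) = (s + 5)^2.
    L = [7.0, 2.0]

    x = [1.0, -1.0]   # true state, unknown to the observer
    xh = [0.0, 0.0]   # observer estimate
    errors = []
    for _ in range(steps):
        u = 1.0                               # known input
        y = C[0] * x[0] + C[1] * x[1]         # measured output
        yh = C[0] * xh[0] + C[1] * xh[1]      # output predicted by the observer
        dx = [A[i][0] * x[0] + A[i][1] * x[1] + B[i] * u for i in range(2)]
        dxh = [A[i][0] * xh[0] + A[i][1] * xh[1] + B[i] * u + L[i] * (y - yh)
               for i in range(2)]
        x = [x[i] + dt * dx[i] for i in range(2)]
        xh = [xh[i] + dt * dxh[i] for i in range(2)]
        errors.append(max(abs(x[i] - xh[i]) for i in range(2)))
    return errors


errors = simulate_observer()
print(errors[0], errors[-1])
```

Only the output and the input are used to correct the estimate, yet the error on both states decays, and it does so at the rate set by the observer poles rather than by the plant poles.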
Figure 7.4: LQ controller with state observer

7.5 The resulting controller

The design of the observer and the design of the controller are two independent problems and can be carried out separately. As written in Section 7.2, the controller is designed with mathematical software, and the results are reported here. With the choice of weights

    Q_1 = 0.1
    Q_2 = 1

the optimal controller for the state-space system (7.20) becomes

\[ K = \begin{bmatrix} 94.385 & 12.139 & 0.124 \end{bmatrix}. \tag{7.25} \]

The estimator poles are set at -2 and the following result is obtained:

\[ L = \begin{bmatrix} 0.562 \\ 3.797 \\ 47.2761 \end{bmatrix}. \tag{7.26} \]

From Figure 7.4 the state-space representation of the observer can be written as

\[ \dot{\hat{x}} = A \hat{x} + B u + L (\delta q - \delta\hat{q}), \qquad \dot{\hat{x}} = (A - LC) \hat{x} + B u + L\, \delta q, \tag{7.27} \]

where both the input $u$ and the output $\delta q = q - q_{ref}$ are used. The output of the controller is given by

\[ u = -K \hat{x}. \tag{7.28} \]

The controller is then transformed into a transfer function representation and sampled with a sampling frequency of 2 Hz, which was chosen after the evaluation of several simulation results. The discrete transfer functions are

\[ \frac{976.344\, (z + 0.2332)(z + 0.3119)}{(z - 0.454)^3} \tag{7.29} \]
between the output $q_k - q_{ref}$ and the input $p_k$, and

\[ \frac{0.297\, (z + 0.45)(z + 0.1197)}{(z - 0.454)^3} \tag{7.30} \]

between the input $p_{k-1}$ and the input $p_k$. In order to obtain a form suitable for a C++ implementation, (7.29) and (7.30) are transformed respectively into

\[ \frac{976.344\, (1 + 0.2332 z^{-1})(1 + 0.3119 z^{-1})\, z^{-1}}{(1 - 0.454 z^{-1})^3} \tag{7.31} \]

and

\[ \frac{0.297\, (1 + 0.45 z^{-1})(1 + 0.1197 z^{-1})\, z^{-1}}{(1 - 0.454 z^{-1})^3}. \tag{7.32} \]

With the same manipulations used for the PID and IMC designs, the probability of packet dropping/marking is calculated at every sampling instant as

    x1=a*x1_old-b*p_old2-b*c*p_old3
    x2=a*x2_old+g*(q_old-q_ref)+g*h*(q_old2-q_ref)
    p1=d*p_old-e*p_old2+x1+f*x1_old
    p2=d*p_old-e*p_old2+x2+i*x2_old
    p=p1+p2
    p_old3=p_old2
    p_old2=p_old
    p_old=p
    q_old2=q_old
    q_old=q
    x1_old=x1
    x2_old=x2

with a = 0.454, b = 976.3453, c = 0.2332, d = 0.98, e = 0.2612, f = 0.3119, g = 0.297, h = 0.45,
i = 0.1197.

Figure 7.5: Experiment 1: LQR vs. PI

In order to reduce the variance of the queue length, the calculation of p is made with a higher frequency than the designed one. The resulting algorithm is not a proper LQ system, because the theoretical design does not match the practical implementation, but the final performance is the main goal. The sampling frequency is then chosen as 5 Hz, and the reference queue length must be modified in order to keep the average around 2. The reference length is set at 23 packets, which corresponds to the use of a transfer function after the reference input.

7.6 The simulation results

The graphical results and the numerical data of the modified standard deviation show how well this algorithm performs, compared not only to the PI controller but also to the other tested implementations. One of its main features is the prevention of buffer overflow in all the considered cases (see the queue-size distributions in Figure 7.12); the setup speed and the reaction time to changes in the network, together with a compact distribution of the queue occupancy, are the other aspects that make this algorithm reach very good performance.

With a low number of flows, which is the most difficult case for queue stabilization, the controller seems to behave asymmetrically: when the reference value of 2 packets is crossed with a negative derivative (the controller is trying to reduce the number of packets in the queue), some low peaks can be observed. This is attributed to the difference between the sampling frequency used to design the controller and the one actually implemented: this choice was made in order to reduce the variance of the process, and some peaks under the reference value are accepted in the general view of a well-performing algorithm.

In Experiment 2 (Figure 7.6) the block of active flows causes a small decrease in the
average queue size and an increase of the low peaks, but the algorithm does not seem to be much influenced by changes in the network conditions, and it reacts very fast.

Figure 7.6: Experiment 2: LQR vs. PI

Algorithm    Ex 1      Ex 2      Ex 3      Ex 4      Ex 5      Ex 6      Ex 7
PI           66.6727   65.547    31.8862   25.11     64.8226   14.551    88.841
LQR          22.68     18.8615   31.8651   47.331    21.322    15.6593   24.1967

Table 7.1: Standard deviation of queue length for the whole simulation time
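The update equations listed in Section 7.5 can be sketched in Python as follows. This is a transcription of those equations, not the C++ code used in NS-2: the class name is hypothetical, the coefficients are copied as printed in the text, the internal states are assumed to start at zero with the queue at its reference value, and the clamping of p to the interval [0, 1] before it is stored is an assumption based on the remark in Section 7.2 that the probability must be bounded.

```python
# Coefficients copied as printed in Section 7.5.
COEF = dict(a=0.454, b=976.3453, c=0.2332, d=0.98, e=0.2612,
            f=0.3119, g=0.297, h=0.45, i=0.1197)


class LQRecurrence:
    """Hypothetical container for the recurrence state of the LQ controller."""

    def __init__(self, q_ref):
        self.q_ref = q_ref
        self.x1_old = self.x2_old = 0.0
        self.p_old = self.p_old2 = self.p_old3 = 0.0
        self.q_old = self.q_old2 = q_ref   # assumed: queue starts at the reference

    def update(self, q):
        """Compute the dropping/marking probability for one sampling instant."""
        k = COEF
        x1 = (k["a"] * self.x1_old - k["b"] * self.p_old2
              - k["b"] * k["c"] * self.p_old3)
        x2 = (k["a"] * self.x2_old + k["g"] * (self.q_old - self.q_ref)
              + k["g"] * k["h"] * (self.q_old2 - self.q_ref))
        p1 = k["d"] * self.p_old - k["e"] * self.p_old2 + x1 + k["f"] * self.x1_old
        p2 = k["d"] * self.p_old - k["e"] * self.p_old2 + x2 + k["i"] * self.x2_old
        p = min(1.0, max(0.0, p1 + p2))    # assumed clamping to a valid probability
        # Shift the stored history by one sample.
        self.p_old3, self.p_old2, self.p_old = self.p_old2, self.p_old, p
        self.q_old2, self.q_old = self.q_old, q
        self.x1_old, self.x2_old = x1, x2
        return p


controller = LQRecurrence(q_ref=200.0)    # hypothetical reference queue length
for q in (200.0, 200.0, 1200.0, 1200.0):
    print(controller.update(q))
```

Because the equations use only past queue samples, a change in the measured queue length affects the probability one sampling instant later; the clamping then keeps the result inside [0, 1] even for large errors.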
Figure 7.7: Experiment 3: LQR vs. PI

Figure 7.8: Experiment 4: LQR vs. PI

Figure 7.9: Experiment 5: LQR vs. PI

Figure 7.10: Experiment 6: LQR vs. PI

Figure 7.11: Experiment 7: LQR vs. PI

Figure 7.12: Linear Quadratic - Distribution of queue length values; panels (a) Ex. 1, (b) Ex. 2, (c) Ex. 3, (d) Ex. 4, (e) Ex. 5, (f) Ex. 6, (g) Ex. 7
Chapter 8 Conclusions

The aim of this thesis work was to improve the performance of a control theoretic AQM algorithm. The reference algorithm was based on the PI controller described in [2]. Table 8.1 gives the complete list of standard deviations for the tested algorithms: the implementations are listed in order of general performance, from the reference to the best. The stabilization of the queue occupancy improves with the later algorithms, but two cases show unexpected results. In Experiment 3 the results are very similar to one another and it is difficult to improve the performance: the low number of flows reduces the differences among the algorithms. With a high number of active flows (Experiment 4) the reference algorithm takes a long time to reach the desired queue length: the settling time is greatly reduced with more complex control theoretic designs, even if the results are worse than in other cases.

Together with performance, the stability of the controllers has also been studied: the importance of stability is justified by the chaotic nature of the Internet and by highly changeable network parameters (Round-Trip Time and number of active flows, as well as router capacity). Both the adaptive version of the IMC-design controller and the Linear Quadratic controller are stable over the range of possible values of the network parameters. The adaptive version of the IMC controller has been designed and tested in order to understand the possibilities of performance improvement compared to the static version: the limited network topology and the number of implementations to be tested have not allowed a complete study of this subject.

8.1 Future work

With the results obtained in this master thesis work, further implementations may be designed: a different tuning of the already tested controllers, or more complex controllers such as the Linear Quadratic Gaussian (LQG) controller, may deserve future research.
In order to design an LQG controller, a deep study of the interaction of flows competing for service at the queue is necessary: the disturbances that influence the controller must be described precisely.
Algorithm    Ex 1      Ex 2      Ex 3      Ex 4      Ex 5      Ex 6      Ex 7
PI           66.6727   65.547    31.8862   25.11     64.8226   14.551    88.841
PID1         43.1862   39.9757   31.9656   7.9475    38.8274   76.5422   49.828
IMC          34.6913   29.2412   29.983    14.8258   28.183    67.4426   38.4543
Ideal AIMC   33.6992   3.7818    33.552    18.14     26.4164   24.7617   37.884
Robust       24.94     24.221    38.1319   28.166    23.696    18.138    31.1913
LQR          22.68     18.8615   31.8651   47.331    21.322    15.6593   24.1967

Table 8.1: Standard deviation of queue length for the whole simulation time

After the simulations on a simple topology, a larger network should be created and the effects of the interactions among routers may be studied, possibly using the experience gained from research on Round-Trip Time estimation. Moreover, the model that describes the dynamics could be improved by considering the new TCP functionalities (slow start, congestion avoidance, etc.), which improve the end-to-end congestion control.
Appendix A NS-2

This appendix gives a short step-by-step guide to the basics of the network simulator NS-2, for the reproduction of the experiments.

The installation of NS-2 (version 2.27) can be performed without problems if the instructions on Nicolas Christin's webpage ( http://www.sims.berkeley.edu/christin/ns-cygwin.shtml ) are followed exactly. All the listed libraries must be added to the workspace so that the simulator can recreate its working structure. The libraries already contain most of the objects that will be used in the simulations: the only objects that must be modified are the queue algorithms (which are the main subject of research in this master thesis).

The queuing algorithms are defined with 2 files, *.cc and *.h, written in C++: the first describes all the functions, while the second declares the variables and functions used in the corresponding file. The two files must be inserted in the folder /ns-allinone-2.27/ns-2.27/queue, which contains all the queuing implementations.

The implementations use some parameters that can be modified without changing the *.cc file. The file ns-default.tcl can be found in the folder /ns-allinone-2.27/ns-2.27/tcl/lib: for each new implementation, the set of parameters that are passed to the program is added there. According to the list of parameters reported in each chapter, they are inserted in the following way (IMC is taken as example):

    #Queue/IMC
    Queue/IMC set bytes_ false
    Queue/IMC set queue_in_bytes_ false
    Queue/IMC set a_ .77524
    Queue/IMC set b_ .7712
    Queue/IMC set c_ 1.8537
    Queue/IMC set e_ 1.657
    Queue/IMC set f_ .745
    Queue/IMC set w_ 8
    Queue/IMC set qref_ 2
    Queue/IMC set mean_pktsize_ 5
    Queue/IMC set setbit_ true
    Queue/IMC set prob_ 0
    Queue/IMC set curq_ 0

Before compiling the program, the queue controller must be added to the library of functions of the network simulator. This is done by inserting the following lines in the file ns-lib.tcl in /ns-allinone-2.27/ns-2.27/tcl/lib:

    Simulator instproc simplex-link { n1 n2 bw delay qtype args } {
        ...
        if {[string first "RED" $qtype] != -1 ||
            [string first "PI" $qtype] != -1 ||
            [string first "PID" $qtype] != -1 ||
            [string first "IMC" $qtype] != -1 ||
            [string first "AIMC" $qtype] != -1 ||
            [string first "LQOBS" $qtype] != -1 ||
            [string first "Vq" $qtype] != -1 ||
            [string first "REM" $qtype] != -1 ||
            [string first "GK" $qtype] != -1 ||
            [string first "RIO" $qtype] != -1} {
            $q link [$link_($sid:$did) set link_]
        }
        ...
    }

where all the tested implementations are listed. In the file makefile in /ns-allinone-2.27/ns-2.27 the new implementations are made known to the program by adding the following lines to the OBJ_CC part:

    OBJ_CC = \
        ...
        queue/red-pd.o queue/pi.o queue/pid.o queue/imc.o \
        queue/aimc.o queue/lqobs.o queue/vq.o queue/rem.o queue/gk.o

After each modification to the implementations, the program must be recompiled: the command make, typed at the prompt in the folder /ns-allinone-2.27/ns-2.27, performs this action.

Once the queuing algorithm is set, the topology and the network settings must be designed by writing a *.tcl file. Appendix B reports the general code used in this master thesis.
The results (queue size over time) are stored and processed with a mathematical program, in order to perform comparisons with the chosen benchmark and to obtain statistical data.
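The post-processing step can be sketched in Python. The sketch assumes that the queue trace contains lines of the form `Q <time> <queue length>`, which is the pattern selected by the awk filter of the finish procedure in Appendix B; the function names, the sample lines and the option of measuring the deviation around a reference value instead of the mean are assumptions of this example, not a description of the program actually used.

```python
import math


def parse_queue_trace(lines):
    """Return (time, queue_length) pairs from lines of the form 'Q <time> <value>'."""
    samples = []
    for line in lines:
        fields = line.split()
        if len(fields) > 2 and fields[0] == "Q":
            samples.append((float(fields[1]), float(fields[2])))
    return samples


def standard_deviation(values, center=None):
    """Root mean square deviation around 'center' (the mean when center is None)."""
    if center is None:
        center = sum(values) / len(values)
    return math.sqrt(sum((v - center) ** 2 for v in values) / len(values))


# Hypothetical trace excerpt; lines starting with 'a' are ignored.
trace = ["Q 0.0 190", "a 0.0 195", "Q 0.1 210", "Q 0.2 200"]
queue = [length for _, length in parse_queue_trace(trace)]
print(standard_deviation(queue))               # deviation around the sample mean
print(standard_deviation(queue, center=200))   # deviation around a reference value
```

To process a real trace, the list of lines would be read from the file written by the simulation, for example with `open("all.q").read().splitlines()`.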
Appendix B OTcl Code

The OTcl code used in the 7 simulations is reported here: one common structure is given, and the particular settings of the different experiments are added and explained.

First the simulation object is created:

    set ns [new Simulator]

then the number of flows (TCP and UDP) is set:

    # number of FTP flows (it varies according to the experiments)
    set r 6
    # number of HTTP connections (expressed as a multiple of FTP connections)
    set s [expr $r*3]
    # number of UDP flows for Experiment 7 - 1% of the FTP flows
    set u [expr $r/1]

The topology (nodes and links) is then designed:

    # define core nodes (routers)
    set nc [$ns node]
    set nc1 [$ns node]

    # define source and destination nodes for FTP connections
    set i 1
    while {$i <= $r} {
        set src($i) [$ns node]
        set des($i) [$ns node]
        incr i
    }

    # define router properties: capacity, latency time, QUEUING SCHEME
    # and queue maximal length
    $ns duplex-link $nc $nc1 15Mb 5ms PI
    $ns queue-limit $nc $nc1 8

    # define links between end-point nodes and core nodes,
    # with bandwidth and propagation delay
    set i 1
    while {$i <= $r} {
        $ns duplex-link $src($i) $nc 1Mb 2ms DropTail
        $ns duplex-link $des($i) $nc1 1Mb 2ms DropTail
        incr i
    }

The Round Trip Time is given by 2*(propagation time + latency time + queuing time). The queuing time is calculated at the reference queue length.

In Experiment 2 the Round Trip Time is made to vary between 13 ms and 22 ms. The propagation delay of each link is then drawn from a uniform distribution, and the values dly($i) are used as propagation times:

    # random generator
    set rng [new RNG]
    $rng seed

    # parameters for random variables in link delay
    set RVdly [new RandomVariable/Uniform]
    $RVdly set min_ 5
    $RVdly set max_ 2
    $RVdly use-rng $rng

    for {set i 1} {$i <= $r} {incr i} {
        set dly($i) [expr [$RVdly value]]
    }
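The Round Trip Time expression given in this appendix can be evaluated with a few lines of Python. The numerical values below are assumptions of this example (20 ms propagation time, 50 ms latency time, a reference queue of 200 packets and a capacity of 3750 packets/s); they are chosen because they give a result close to the round-trip time of 0.246 s used in Chapter 7, and the function name is hypothetical.

```python
def round_trip_time(propagation_s, latency_s, q_ref_pkts, capacity_pps):
    """Round Trip Time = 2 * (propagation time + latency time + queuing time).

    The queuing time is evaluated at the reference queue length.
    """
    queuing_s = q_ref_pkts / capacity_pps
    return 2.0 * (propagation_s + latency_s + queuing_s)


# Assumed values, see the paragraph above.
rtt = round_trip_time(propagation_s=0.020, latency_s=0.050,
                      q_ref_pkts=200, capacity_pps=3750.0)
print(f"{rtt:.4f} s")
```

The queuing term shows why the reference queue length is itself a design parameter of the loop: a longer reference queue increases the round-trip time seen by every flow.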
After the physical and network-layer elements have been specified, the FTP transport-level connections must be created:

    set i 1
    while {$i <= $r} {
        set tcp($i) [$ns create-connection TCP/Newreno $src($i) TCPSink $des($i) ]
        set ftp($i) [$tcp($i) attach-source FTP]
        $ftp($i) set packetsize 5
        incr i
    }

The packet size is set to 5 Bytes (approximate average size) for Experiments 1-4 and to 15 Bytes for Experiments 5-7, as small HTTP packets contribute to decreasing the average packet size.

The HTTP client-server structure (Experiments 5-7) is designed as follows:

    # cache + server
    set serv [$ns node]
    set cac [$ns node]
    $ns duplex-link $cac $serv 1Mb ms DropTail

    # source nodes and links for http
    set i 1
    while {$i <= $s} {
        set src1($i) [$ns node]
        $ns duplex-link $src1($i) $nc 1Mb 2ms DropTail
        incr i
    }
    $ns duplex-link $nc1 $cac 1Mb 2ms DropTail

The client-server structure needs the following setup parameters:

    set pgp [new PagePool/Math]
    set tmp [new RandomVariable/Constant]
    $tmp set val_ 124 ;# average page size
    $pgp ranvar-size $tmp
    set tmp [new RandomVariable/Exponential] ;# Age generator
    $tmp set avg_ 5
    # average page age
    $pgp ranvar-age $tmp

    set server [new Http/Server $ns $serv]
    $server set-page-generator $pgp

The Poisson process of page requests is set in the following lines:

    for {set i 1} {$i <= $s} {incr i} {
        set client($i) [new Http/Client $ns $src1($i)]
        set tmp [new RandomVariable/Exponential] ;# Poisson process
        $tmp set avg_ 5 ;# average request interval
        $client($i) set-interval-generator $tmp
        $client($i) set-page-generator $pgp
    }
    set cache [new Http/Cache $ns $cac]

In Experiment 7 the interaction between TCP and UDP flows is studied: the following lines describe the UDP network (the packets are sent between the same hosts that have established FTP connections).

    set i 1
    while {$i <= $u} {
        set udp($i) [new Agent/UDP]
        $ns attach-agent $src($i) $udp($i)
        set cbr($i) [new Application/Traffic/CBR]
        $cbr($i) attach-agent $udp($i)
        $udp($i) set packetsize_ 536
        set null($i) [new Agent/Null]
        $ns attach-agent $des($i) $null($i)
        $ns connect $udp($i) $null($i)
        incr i
    }

The simulations are 2 seconds long and the data are collected over the whole period of simulation. A tracing variable is added to the router queue in order to record its behavior.

    for {set i 1} {$i <= $r} {incr i} {
        $ns at 0.0 "$ftp($i) start"
    }
    $ns at 0.0 "start-http"
    for {set j 1} {$j <= $u} {incr j} {
        $ns at 0.0 "$cbr($j) start"
    }

    # Tracing the queue
    set pidq [[$ns link $nc $nc1] queue]
    set tchan_ [open all.q w]
    $pidq trace curq_
    $pidq attach $tchan_

    $ns at 2 "finish"

The two procedures start-http and finish, which are called in the Tcl script, must be placed after the basic elements of the simulation.

    proc start-http {} {
        global ns cache server client
        global s
        set i 1
        while {$i <= $s} {
            $client($i) connect $cache
            incr i
        }
        $cache connect $server
        for {set i 1} {$i <= $s} {incr i} {
            $client($i) start-session $cache $server
        }
    }

    # Define finish procedure (include post-simulation processes)
    proc finish {} {
        global tchan_
        set awkcode {
            {
                if ($1 == "Q" && NF>2) {
                    print $2, $3 >> "temp.q";
                    set end $2
                }
                else if ($1 == "a" && NF>2)
                    print $2, $3 >> "temp.a";
            }
        }
        set f [open temp.queue w]
        puts $f "TitleText: pid"
        puts $f "Device: Postscript"
        if { [info exists tchan_] } {
            close $tchan_
        }
        exec rm -f temp.q temp.a
        exec touch temp.a temp.q
        exec awk $awkcode all.q
        puts $f \"queue
        exec cat temp.q >@ $f
        close $f
        exec xgraph -bb -tk -x time -y queue temp.queue &
        exit
    }

The simulation object is then executed:

    $ns run
Bibliography

[1] C.V. Hollot, V. Misra, D. Towsley, W. Gong, A Control Theoretic Analysis of RED - Proceedings of IEEE Infocom 2001, Anchorage, Alaska, April 2001
[2] C.V. Hollot, V. Misra, D. Towsley, W. Gong, On Designing Improved Controllers for Routers Supporting TCP Flows - Proceedings of IEEE Infocom 2001, Anchorage, Alaska, April 2001
[3] V. Misra, W. Gong, D. Towsley, Fluid-based Analysis of a Network of AQM Routers Supporting TCP Flows with an Application to RED - Proceedings of ACM/SIGCOMM, 2000
[4] V. Misra, W. Gong, D. Towsley, Stochastic Differential Equation Modeling and Analysis of TCP-Windowsize Behavior - Technical report ECE-TR-CCS-99-1-1, 1999
[5] J. Padhye, V. Firoiu, D. Towsley, J. Kurose, Modeling TCP Throughput: A Simple Model and its Empirical Validation - Proceedings of SIGCOMM 98, 1998
[6] S. Ryu, C. Rump, C. Qiao, Advances in Internet Congestion Control - IEEE Communications Surveys and Tutorials, Vol. 5, 2003
[7] K. Claffy, H.W. Braun, G.C. Polyzos, A parameterizable methodology for Internet traffic flow profiling - IEEE Journal on Selected Areas in Communications, 13(8) pp. 1481-1494, 1995
[8] M. Allman, V. Paxson, W. Stevens, TCP Congestion Control, RFC 2581, April 1999
[9] A. Veres, M. Boda, The Chaotic Nature of TCP Congestion Control - Infocom 2000, Tel Aviv, 2000
[10] S. Floyd, TCP and Explicit Congestion Notification - ACM Computer Communication Review, vol. 24 no. 5 pp. 10-23, October 1994
[11] K. Ramakrishnan, S. Floyd, D. Black, The Addition of Explicit Congestion Notification (ECN) to IP, RFC 3168, September 2001
[12] S. Floyd, V. Jacobson, Random Early Detection Gateways for Congestion Avoidance - IEEE/ACM Transactions on Networking, 1(4), August 1993
[13] S. Floyd, R. Gummadi, S. Shenker, Adaptive RED: An Algorithm for Increasing the Robustness of RED's Active Queue Management - Proceedings of ACM/SIGCOMM, 2001
[14] W. Feng, D. Kandlur, D. Saha, K. Shin, Techniques for Eliminating Packet Loss in Congested TCP/IP Networks - Technical report, University of Michigan, CSE-TR-349-97, November 1997
[15] S. Floyd, Recommendation on using the gentle variant of RED - http://www.aciri.org/floyd/red/gentle.html, March 2000
[16] W. Feng, D. Kandlur, D. Saha, K. Shin, Blue: A New Class of Active Queue Management Algorithms - Technical report, UM CSE-TR-387-99, 1999
[17] C.E. Garcia, M. Morari, Internal Model Control: A unifying review and some new results - Industrial Engineering Chemical Process Design and Development, no. 21 pp. 308-323, American Chemical Society, 1982
[18] X. Deng, S. Yi, G. Kesidis, C.R. Das, A Control Theoretic Approach in Designing Adaptive AQM Schemes - Technical report, CSE Department, The Pennsylvania State University (http://cse.psu.edu/ xdeng/research.html)
[19] T.J. Ott, T.V. Lakshman, L. Wong, SRED: Stabilized RED - Proceedings of IEEE Infocom 1999
[20] A. Khinchine, Mathematical Methods in the Theory of Queuing, Hafner Publishing Co., New York, 1960
[21] O. Kallenberg, Limits of compound and thinned point processes - Journal of Applied Probability, vol. 12 pp. 269-278, 1975
[22] N. O'Connell, Large Deviations in Queuing Networks
[23] ns-2 Network Simulator, Obtain via http://www.isi.edu/nsnam/ns/
[24] The VINT Project, The NS manual (formerly ns Notes and Documentation) (http://www.isi.edu/nsnam/ns/doc)
[25] B. Shahian, M. Hassul, Control System Design using MATLAB, Prentice Hall, 2003