On Growth of Parallelism within Routers and Its Impact on Packet Reordering 1




A. A. Bare 1, A. P. Jayasumana, and N. M. Piratla 2
Department of Electrical & Computer Engineering, Colorado State University, Fort Collins, CO 80523, USA
1 Agilent Technologies, Inc., 900 S. Taft Av., Loveland, CO 80537, USA
2 Deutsche Telekom Laboratories, Ernst-Reuter-Platz 7, D-10587 Berlin, Germany

Abstract

The network link speeds increase at a higher rate compared to processing speeds. This, coupled with the increase in size of router tables, demands higher levels of parallelism within router hardware. However, such parallelism introduces unintended consequences that may potentially negate some of the performance gains provided by the improved technology. The growth trends of computing speeds, link speeds, and routing table sizes are used to evaluate one such consequence, packet reordering within routers. Results presented show the trends related to the degree of hardware parallelism and packet reordering.

I. INTRODUCTION

As the speeds of physical links and networks increase beyond a gigabit per second, and the ratio of end-to-end latency to packet transmission time increases by orders of magnitude, certain phenomena that were insignificant and safely ignored assume substantial importance. In fact, some of these second-order effects, unless countered, can negate to a significant degree the gains provided by faster physical links and routing/switching hardware, and will have an adverse impact on the end-to-end performance seen by the applications. These unavoidable phenomena include, among others, delay jitter and packet reordering. Jitter has received attention only with respect to real-time applications such as VoIP, and the effects of reordering were safely ignored. According to Moore's law, the CPU computing speed approximately doubles every 18 months [1, 2], while recent trends indicate that network link speed approximately doubles every nine months [3, 4]. Thus, the network link speeds increase at a faster rate than the computing speed.
The Internet itself is growing in size, resulting in an increase in routing table sizes in backbone routers [5]. Consequently, the amount of processing to be performed by the routers will increase at a faster rate than the rate of increase in computing power. Routers will rely on architectures that use an increasing number of processors working in parallel to counter the additional computation requirements. However, processing packets from the same stream in parallel processors aggravates the problem of reordering. The two obvious solutions, avoiding parallel processing of packets of the stream by sending all of them (or at least those that have the potential for reordering) to the same processor, and buffering packets and forcing them to leave in order, pose challenges that require additional resources.

In this paper, we analyze the impact of the increasing gap between link speed and CPU performance, and of the growth of router table sizes, on the parallelism required within routers, and the resulting effect on reordering. By scaling the CPU speed, link speed, routing table size and the number of flows, we show that the parallelism in routers has to increase significantly, an unintended and inevitable result of which is reordering. While the present generation of high-performance routers compensates for internal reordering using input tracking and output buffering techniques, such techniques do not scale well with the increasing performance gap, and result in higher router latency. This paper, while not presenting solutions to this dilemma, attempts to quantify the parallelism and make a case for dealing with such secondary effects proactively in hardware architectures and protocols as speeds continue to evolve.

Section II reviews the router functionality and architecture, and addresses the impacts of packet reordering. The rates of change of CPU speeds, link speeds and routing table sizes are related to parallelism within routers in Section III.
Section IV presents simulation-based trends for packet reordering for different levels of parallelism. Conclusions are in Section V.

II. BACKGROUND ON ROUTERS AND REORDERING

A. Routers

A router performs two basic tasks, route processing and packet forwarding. Routers share information about network conditions, the routing information base (RIB), with peers using protocols such as OSPF, RIP, and BGP. Using this information, each router builds and maintains a routing table, which is then used to decide the appropriate outgoing interface for forwarding each incoming packet. The basic functional components of a router are shown in Fig. 1. Switching fabric is the hardware that transfers the packets between the line cards. Line cards are the physical link media whose design depends on the network link technology being used, e.g., OC-3.

1 This research was supported in part by NSF ITR Grant No. 0121546 and by Agilent Technologies.
Proceedings of the 2007 15th IEEE Workshop on Local and Metropolitan Area Networks, 1-4244-1100-9/07/$25.00 2007 IEEE

[Fig. 1. Essential elements inside a router and the path a packet follows. Labeled elements: Routing Software, Routing Table, Processing - Network Processors, Switching Fabric, Buffer Space.]

Forwarding of a packet involves a number of subtasks such as extracting the packet header, identifying the destination address, finding the outgoing interface from the routing table with prefix matching, adding the link layer header to the packet for the outgoing link, placing the packets in the corresponding queues and making decisions about dropping packets. Network processor units (NPUs), specialized CPUs such as the Intel IXP2400, are used to perform these operations. A high-performance router typically has several network processors, and depending on the architecture it may have a separate NPU per line interface or a pool of NPUs for packets arriving at several interfaces [6,7,8,9]. A single NPU may also contain multiple processing engines.

As a packet enters the line input interface of a router, its header is removed and passed through the switch fabric to the packet processor. Using the routing table, this processor determines how to forward the packet and sends back the header after updating it. The line card integrates the new header with the packet and sends the entire packet to the outbound line card [10,11]. All the cells are then sent to the shared memory pool for temporary storage while the IP address is being looked up and the outgoing interface gets ready for sending. This is the store part of a store-and-forward router. The buffer manager creates a query based upon the information extracted from the packet to determine the outgoing interface of the packet. The query is assigned to an NPU that carries out a longest-prefix match over the routing table. The decision is conveyed to the buffer manager, which sends a notification to the I/O manager of the corresponding outgoing interface. The notification is queued at the outgoing interface.
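
The longest-prefix match performed by the NPU can be illustrated with a minimal sketch. It is not the lookup structure of any particular router: the table contents and interface names are hypothetical, and a linear scan is used only for clarity, whereas production forwarding engines typically rely on tries or TCAM hardware.

```python
import ipaddress

# Hypothetical routing table (prefix -> outgoing interface); not from the paper.
ROUTES = {
    "0.0.0.0/0": "if0",
    "10.0.0.0/8": "if1",
    "10.1.0.0/16": "if2",
    "10.1.2.0/24": "if3",
}


def longest_prefix_match(dst, routes=ROUTES):
    """Return (prefix, interface) of the most specific route covering dst, or None."""
    addr = ipaddress.ip_address(dst)
    best = None  # (network, interface) of the longest matching prefix so far
    for prefix, iface in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return None if best is None else (str(best[0]), best[1])
```

A destination such as 10.1.2.3 matches all four entries here, and the /24 entry wins because it is the most specific.
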
When the notification reaches the head of the queue, the I/O manager reads all the corresponding cells from the shared memory. The checksum and the link layer header are added and the packet is sent over the link. The operations including query formation, routing table lookup, queuing and then sending the packet form the forward part of the store-and-forward router.

Switch fabric with distributed buffer memory and forwarding [11] is a router architecture in which each line card is equipped with its own forwarding engine to transmit data via a switch fabric to any other line card as needed. This reduces the load on each forwarding engine to that of the corresponding incoming line. However, as the link speeds continue to outstrip processing speeds, parallelism will have to be introduced within the line cards, at which time effects such as reordering introduced by shared memory architectures will appear in routers with this architecture as well.

A complex device such as a backbone router may fail to keep the order of a flow of packets passing through it. Packets in a flow may take different paths. If a packet suffers excessive delay, then it may reach the destination after its successor. This parallelism in processing packets is the primary reason for reordering. Reasons for packet reordering in high-speed routers include the following:

- Routers utilize multiple parallel NPUs for meeting performance requirements. A single NPU may even have within itself multiple processing engines, which work on several queries at the same time. Simultaneous processing of packets from the same stream establishes a race condition that may cause a packet to finish processing after its successor. The severity of this effect is high when the inter-packet gap among the packets in a stream is low.
- Per-packet load balancing distributes packets towards the same destination over multiple outgoing interfaces for even link utilization [7].
- A router may be designed using multiple shared-memory systems or small-sized routers to construct a large-capacity router. Difficulty of coordination among these units causes the order of packets to change [11].
- With head-of-line (HOL) blocking, the outgoing interface of a packet stream may not wait for a blocked packet in the same stream, to sustain interface throughput goals, thereby causing reordering [11].
- Due to route flapping, the packets in the same stream may be erroneously sent over different paths, resulting in out-of-orderliness [12].

In the Juniper M160, with four parallel processors, each with a capacity of 2.5 Gbps, serving a single 10 Gbps interface, reordering was a concern [13,14]. Corrective action was taken in the design of the subsequent generations of high-end Juniper routers, e.g., T640, to avoid this problem. Other notable instances of reordering include that in BD6808/6816 [14].

The recent high-end routers attempt to reduce or avoid reordering by either a) input buffering, i.e., tracking the packets at the input to identify the individual streams, and forwarding the packets of the same stream to the same queue, thus preventing reordering, or b) output buffering, i.e., buffering packets at the output of the router to ensure that the packets belonging to the same stream are released in the order of their entry into the node [15]. NPUs from vendors such as IBM, Motorola, Vitesse, TI and Intel have built-in hardware to track individual flows [16]. Rearrangement of packets, using input tracking or output buffering, requires identifying each flow, recording the incoming sequence of packets in each flow, and establishing the correct order at the outgoing interface using the recorded sequence. It is also possible to add a time stamp as packets come into the router, and to buffer and release them without causing reordering. However, as the ratio of link speed to processing speed increases further, these solutions appear to be non-scalable. Furthermore, buffering packets in the router for taking corrective action adds to the end-to-end packet latency.
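
One common way to realize the input-tracking idea above, sending all packets of a stream to the same queue, is to hash a flow identifier to an NPU index. The sketch below assumes a five-tuple flow key and a CRC-32 hash; both are illustrative choices made here, not details given in the text or taken from the cited routers.

```python
import zlib


def flow_key(src_ip, dst_ip, src_port, dst_port, proto):
    """Build a byte string identifying a flow (assumed five-tuple definition)."""
    return f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()


def pick_npu(key, n_npus):
    """Map a flow key to an NPU index; the same key always yields the same index."""
    return zlib.crc32(key) % n_npus
```

Packets of one flow then share a FIFO queue and cannot overtake each other inside the router. Note that a static mapping ignores load, so heavy flows may concentrate on a single NPU.
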
As the networks move from a bandwidth-limited regime to a latency-limited regime [17], increasing the latency is not an attractive option. In fact, an increase in latency itself may contribute to reduced throughput for many applications, thus negating some of the benefits of increased link speeds.

B. Impact of Reordering

Recovery from reordering is the responsibility of the end nodes according to the end-to-end design argument for the Internet as well as the best-effort delivery model. End-point recovery from reordering has worked well in the past with TCP, or with application-level buffering in the case of UDP. When out-of-order packets are received, TCP perceives it as loss of packets, resulting in deterioration of performance due to the following [18]:

- The number of unnecessary retransmissions increases, resulting in a drop in throughput.
- The congestion window becomes very small due to multiple fast retransmissions, causing problems in raising the window size and resulting in decreased bandwidth utilization.
- Due to multiple retransmissions, the round trip time (RTT) is not sampled frequently, thus degrading the estimate of RTT.
- Performance of the receiver also suffers, because whenever the receiver receives out-of-order packets, it has to buffer all of them, and they need to be sorted as well.
- Detection of loss of packets is delayed because of out-of-order delivery, due to which the retransmission request for a lost packet is sent only when TCP times out.
- Due to reverse path reordering, i.e., reordering of acknowledgements, TCP loses its self-clocking property (the property that it only sends packets when another packet is acknowledged), resulting in bursty transmissions and possible congestion.

Packet reordering can severely degrade the end-to-end performance [19]. For certain applications based on UDP, e.g., VoIP, an out-of-order packet arriving after its playback time has elapsed is treated as lost, decreasing the perceived quality of voice on the receiver side, but still consuming NIC and processing resources.
Corrective action is possible with buffering at the receiver as long as the delay is not excessive, but the amount of resources for recovery will increase with the degree and extent of reordering, and with the bit rate of the application (e.g., video over IP will require significantly higher bit rates compared to VoIP).

III. TRENDS IN COMPUTING AND LINK SPEEDS

The processor-level parallelism within routers is dictated by the growth rates in link speeds, processing speeds and routing table sizes. In this section, we consider these growth trends and evaluate the parallelism required for future high-performance routers.

A. Increase in Network Link Speed

Let α be the factor by which the network link speed for a high-end router increases during time period $T$ (months elapsed from some initial $T_0$), i.e., if s is the initial link speed, the link speed will be (s * α) after the period $T$. As the link speed almost doubles every 9 months,

$$\alpha = 2^{T/9} \qquad (1)$$

B. Increase in Computing Speed

Let β be the factor of increase in the computing speed during time period $T$. Applying Moore's law to the network processors:

$$\beta = 2^{T/18} \qquad (2)$$

C. Increase in BGP Table Size

The increasing number of subnets in the Internet, and the usage of CIDR (Classless Inter-Domain Routing), load balancing, etc., have caused the number of prefixes in the routing table of a generic backbone router to increase almost exponentially between the years 1998 and 2000 [20], and almost quadratically after that [21]. The data obtained from AS1221 [5] is used to estimate the trend corresponding to the size of the BGP table. The quadratic equation fitted to data from 2000-2004 is given by:

$$S = 7.804 \times 10^{-13}\, T_u^2 - 0.00076\, T_u + 9.603 \times 10^{4} \qquad (3)$$

S is the typical number of BGP entries per router at the time instance $T_u$, which is in the form of a UNIX timestamp. Although long-term observations indicate that a higher-order fit may be better, we use the quadratic relationship. The result would be, if anything, an underestimate of the complexity of routing tables.
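
Eqs. (1) to (3) are simple enough to evaluate directly. The following sketch, with function names chosen here purely for illustration, computes the link and CPU growth factors for a period of T months and the fitted BGP table size for a UNIX timestamp.

```python
def link_speed_factor(t_months):
    """Eq. (1): link speed doubles every 9 months."""
    return 2 ** (t_months / 9)


def cpu_speed_factor(t_months):
    """Eq. (2): computing speed doubles every 18 months."""
    return 2 ** (t_months / 18)


def bgp_table_size(t_unix):
    """Eq. (3): quadratic fit to the AS1221 data; t_unix is a UNIX timestamp in seconds."""
    return 7.804e-13 * t_unix ** 2 - 0.00076 * t_unix + 9.603e4
```

For T = 18 months the link speed grows by a factor of 4 while the computing speed only doubles, which is the gap the rest of this section builds on. Since the fit covers 2000-2004, evaluating `bgp_table_size` far outside that range is an extrapolation.
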
We use γ to represent the factor by which the size of the BGP table increases during the time period $T$.

D. Increase in the Number of NPUs

Next we derive a simple approximation for the number of NPUs required (n) after the time duration $T$, given that the router at the initial time required m NPUs. Assuming that network usage increases proportionately to network capacity, i.e., the link utilization remains constant, and that the packet lengths remain the same, the number of packets per second arriving over the link also increases by the same factor, α. The amount of work for processing these packets thus increases by the same proportion. The amount of computation required for processing each packet (looking up the router entry, etc.) is considered to increase logarithmically with the BGP table size. Thus, the overall amount of computation that the router has to perform increases by a factor

$$\omega = \alpha \log_2(\gamma) \qquad (4)$$

Considering β as the factor of increase in the computing speed during the time period $T$, the following relationship can be formulated for m and n:

$$n = \frac{\omega\, m}{\beta} \qquad (5)$$

Combining the impacts of scaling on these parameters, we see that the increase in parallelism in routers is inevitable.

E. Mean Processing Time

The amount of time taken by an NPU for processing a packet is computed using the results of the experiments presented in [22], which were carried out to measure the single-hop delay of a packet through an operational router in a backbone IP network. The router transit time of a packet is observed to increase linearly with the length of the packet. This time includes the time spent in the address lookup process, the transfer of the packet from the input to the appropriate outgoing interface, and the time spent in the queue at the outgoing interface. The first two of these three operations are mandatory for every packet and determine a minimum processing time for each packet, whereas the queuing operation may not be required for each packet. The relationship between the length of a packet and the mean processing time in [22] is:

$$d(L) = 0.0213\,L + 25 \qquad (6)$$

where L is the length of the packet in bytes. Thus, $d(L)$ represents the router transit time (in µs) of a packet of length L bytes, minus the time spent by the packet in the queue at the outgoing interface. During the time period $T$, as the computing speed and the processing work increase by factors β and $\log_2(\gamma)$ respectively, the mean processing time $d_T(L)$ taken by a router for a packet of length L bytes is given in µs by:

$$d_T(L) = (0.0213\,L + 25)\,\frac{\log_2(\gamma)}{\beta} \qquad (7)$$

IV. SIMULATION-BASED PREDICTIONS

A simulator depicting the functionality of a simple router, scaled to handle future generations based on the discussions in Section III, has been implemented [23]. The goal of the experiments was to study the reordering induced in packet sequences, due to parallel processing of the packets by multiple processors, in future generation routers. Fig. 2 represents the high-level functionality of the simulated router. Multiple packet streams arrive at different line interfaces and are dispatched to an NPU by the dispatcher. Each packet had a sequence number and a stream identifier. One of the randomly selected streams was designated as the stream of interest, which is used for measuring the amount of reordering. Thus, the remaining traffic emulates background traffic, irrespective of the line interface that it entered through. The primary configurable traffic parameters in the simulator were: (a) number of packet streams, (b) nature of the traffic or statistical traffic distribution, (c) packet size distribution, (d) size of packets in the main stream alone, (e) total link bandwidth, and (f) utilization of the link by the aggregate traffic. Studies have shown that the Internet traffic is self-similar in nature [24].
Thus, the simulator was configured to generate self-similar traffic [25]. Packet size density over the Internet is trimodal, with higher frequencies for packet sizes 40-44, 552-576 and 1500 bytes [26,27]. Assuming no jumbo frames, the packet size density follows the trimodal characteristic. The packet size in a given stream was held constant, but packet sizes of different streams were chosen using the trimodal packet size distribution. The mean number of packets generated per unit time per stream was approximately the same. A line-card input carries a large number of streams, and all the results presented are for a randomly selected stream with packet size 1500 B.

[Fig. 2. Functionality of the simulated router. Labeled elements: Traffic Generator, Dispatcher, Queues, NPU Controller, NPUs, Main Collector, Aggregate Traffic, Processed Traffic.]

The router had a designated number of simulated NPUs, based on scaled parameters per Eq. (5), to process the packets arriving at line interfaces. Each NPU was provided with an input queue whose size was limited by the delay-bandwidth product. A packet dispatcher unit distributes the incoming packets to the input queues of the NPUs, using either a round-robin scheme or shortest-queue-first. As an NPU finishes processing a packet, it picks up the next packet in its queue, or waits for a packet to arrive if the queue is empty. Meanwhile, the processed packet exits the router.

Following the computations of α and β based on $T$, the values of the factor of increase in BGP table size γ, the number of NPUs n using Eq. (5) and the mean processing time using Eq. (7) were computed. Table I summarizes the parameter values used for the simulation. The link utilization, unless otherwise stated, was kept at 0.5 during the simulations. In this paper, we present results only for the round-robin scheme for the packet dispatcher, which is the more optimistic case, as the amount of reordering in this case is lower compared to the shortest-queue-first case.
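
The dispatch-and-process loop described above can be sketched in a few lines. This is not the simulator of [23]: it assumes unbounded queues, takes the arrival times as given rather than generating self-similar traffic, and the function and parameter names are invented here. It only shows how round-robin dispatch to parallel NPUs with length-dependent service times can change the departure order.

```python
def simulate_round_robin(packets, n_npus, proc_time):
    """Return packet indices in departure order.

    packets:   list of (arrival_time_us, length_bytes), in arrival order
    n_npus:    number of parallel NPUs, each with its own FIFO queue
    proc_time: function mapping length in bytes to processing time in microseconds
    """
    free_at = [0.0] * n_npus          # time at which each NPU becomes idle
    departures = []
    for i, (t_arr, length) in enumerate(packets):
        npu = i % n_npus              # round-robin dispatcher
        start = max(t_arr, free_at[npu])
        free_at[npu] = start + proc_time(length)
        departures.append((free_at[npu], i))
    departures.sort()                 # order in which packets leave the router
    return [i for _, i in departures]


def oc768_proc_time(length):
    """Mean processing time (microseconds) from the OC-768 row of Table I."""
    return 0.00237 * length + 2.785
```

With two NPUs, a 1500-byte packet arriving at t = 0 µs followed by a 40-byte packet at t = 0.1 µs leaves the router after the shorter packet, since the two are processed concurrently; with a single NPU the original order is preserved.
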
The reordering of the outgoing packets was measured using two metrics, Reorder Density (RD) and Reorder Buffer-occupancy Density (RBD) [28,29,30]. RD is the distribution of the displacements of packets from their original positions, normalized with respect to the number of packets. An early packet corresponds to a negative displacement and a late packet to a positive displacement. RBD is the normalized histogram of the occupancy of a hypothetical buffer that would allow the recovery from out-of-order delivery of packets. If an arriving packet is early, it is added to a hypothetical buffer until it can be released in order. The occupancy of this buffer after each arrival is used as the measure of reordering. A threshold, used to declare a packet as lost, keeps the complexity of computation within bounds. RD and RBD are able to capture reordering more comprehensively compared to existing metrics [31].
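
The two metrics can be illustrated with a simplified sketch. It assumes the received sequence is a permutation of 0..N-1, i.e., no losses or duplicates, and it omits the loss threshold mentioned above, so it should be read as an illustration of the definitions rather than as the full algorithms of [28,29,30].

```python
from collections import Counter


def reorder_density(arrivals):
    """RD: normalized histogram of (arrival position - sequence number).

    Early packets give negative displacements, late packets positive ones.
    """
    n = len(arrivals)
    hist = Counter(pos - seq for pos, seq in enumerate(arrivals))
    return {d: c / n for d, c in sorted(hist.items())}


def reorder_buffer_density(arrivals):
    """RBD: normalized histogram of hypothetical reorder-buffer occupancy."""
    n = len(arrivals)
    expected = 0            # next sequence number that can be released in order
    buffered = set()        # early packets waiting for their predecessors
    hist = Counter()
    for seq in arrivals:
        if seq == expected:
            expected += 1
            while expected in buffered:   # release any packets now in order
                buffered.remove(expected)
                expected += 1
        else:
            buffered.add(seq)             # early packet: hold it
        hist[len(buffered)] += 1          # occupancy after this arrival
    return {k: c / n for k, c in sorted(hist.items())}
```

For the arrival order 0, 2, 1, 3, packet 2 is one position early and packet 1 one position late, and the buffer holds one packet after a single arrival out of four.
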

TABLE I. PARAMETERS USED IN SIMULATIONS FOR DIFFERENT LINK SPEEDS

Link change
OC-3 to      α     β      γ     ω       n    Mean processing time (µs)
OC-12        4     2.00   1.60  2.71    3    0.00722 * L + 8.476
OC-24        8     2.83   1.87  7.22    5    0.00680 * L + 7.977
OC-48        16    4.00   2.16  17.77   9    0.00592 * L + 6.944
OC-96        32    5.66   2.46  41.56   15   0.00489 * L + 5.736
OC-192       64    8.00   2.77  94.07   24   0.00391 * L + 4.593
OC-384       128   11.3   3.10  208.9   37   0.00307 * L + 3.608
OC-768       256   16.0   3.44  456.3   57   0.00237 * L + 2.785

[Fig. 3. RD and RBD variation for different incoming link speeds (at 50% utilization). Curves: OC-12, OC-24, OC-48, OC-96, OC-384, OC-768. Axes: RD vs. Earliness/Lateness; RBD vs. Buffer-Occupancy.]

[Fig. 4. RBD of reordering through the simulated router for different link utilizations (for OC-768, packet size 1500 bytes). Curves: link utilization 10%, 30%, 50%, 70%, 90%. Axes: RBD vs. Buffer-Occupancy.]

[Fig. 5. RBD variation with packet size (with stream of interest occupying 10 Mbps on a 50% utilized OC-768). Curves: packet size in bytes 40, 256, 576, 1024, 1500. Axes: RBD vs. Buffer-Occupancy.]

With increasing link speeds and routing table sizes, it is observed that packet reordering will increase significantly in the backbone routers. Fig. 3 depicts the RBD and RD of the packet sequences in a stream. The RBD indicates that the reorder buffer is occupied by at least one packet 10% of the time for OC-384, and 17% of the time for OC-768. RD indicates that only 84% of the packets arrive at the expected position for OC-384 and only 74% for OC-768. The amount of reordering is much higher than that observed in a typical multi-hop link today, despite the fact that reordering due to per-packet scheduling on multiple outgoing links [32] is still not accounted for.
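
As a consistency check, the derived columns of Table I can be recomputed from α and γ using Eqs. (1), (2), (4), (5) and (7). The sketch below does this under one assumption that is not stated in the text: the initial number of NPUs is taken as m = 2, chosen here only because it reproduces the n column after rounding. The function name is also illustrative.

```python
import math

M_INITIAL = 2  # ASSUMPTION: not given in the text; inferred from the n column of Table I


def table_row(alpha, gamma, m=M_INITIAL):
    """Return (beta, omega, n, slope, intercept) for given alpha and gamma.

    slope and intercept describe the mean processing time slope * L + intercept in µs.
    """
    t_months = 9 * math.log2(alpha)      # invert Eq. (1)
    beta = 2 ** (t_months / 18)          # Eq. (2)
    omega = alpha * math.log2(gamma)     # Eq. (4)
    n = omega * m / beta                 # Eq. (5), before rounding
    scale = math.log2(gamma) / beta      # scaling factor in Eq. (7)
    return beta, omega, n, 0.0213 * scale, 25 * scale
```

For the OC-12 row (α = 4, γ = 1.60) this gives β = 2, ω ≈ 2.71 and n ≈ 2.7, which rounds to 3. Because the γ values in the table are themselves rounded, the recomputed entries agree with the printed ones only approximately.
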
Ref. [28] shows the RD in a cascade of two subnets to be the convolution of the RDs of the individual subnets; thus, the overall displacement of a packet from its original position will be significantly higher when there are multiple routers in the path of a packet stream. Figures 4 and 5 indicate the variation of reordering, in terms of RBD, of a flow in an OC-768 link with the link utilization and the packet size respectively. Larger packets occupy the link longer, thus helping reduce the amount of reordering. Higher link utilization aggravates the problem of reordering. From RBD, it is observed that using a buffer to restore order, whether attempted at the end-point or within the router with output buffering, will result in significant buffer utilization as the gap between link speed and processing speed increases.

The values in Table I should only be considered as indicative of trends in parallelism as opposed to absolute values for each generation of technology. Factors such as possible improvements in router table lookup (e.g., hardware implementations), optical switching, etc., could certainly alter the rate of change associated with these trends. Other possible remedies for reducing reordering include increasing the packet length, or even switching bursts of packets instead of individual packets.

V. CONCLUSIONS

The increasing gap between link and processing speeds and the shrinking packet transmission time with respect to end-to-end latencies will result in second-order effects in networks that have been ignored from protocol and performance points of view. We considered packet reordering introduced within routers to show the increasing trends in such second-order effects. Countering such effects within routers as well as end nodes will require increasing resources. Schemes such as load balancing and DiffServ will only increase these effects, which in turn will negatively impact the very goals these techniques are geared towards: better performance, efficiency and QoS. For example, a load-balancing scheme aimed at reducing overall congestion may result in more reordering, resulting in more retransmissions, thus contributing toward congestion. The need for developing an understanding of these secondary phenomena and their impact on end-to-end performance (as opposed to just the primary effects such as throughput, loss and delay) thus cannot be overemphasized. This topic has received attention only recently, which is understandable given its negligible impact at sub-Gbps rates. There is a lack of understanding of these effects, and no theoretical foundation exists for modeling them, let alone predicting them. There is also a need to identify proper metrics for measuring and characterizing these phenomena. Proper understanding of such phenomena can lead to modifications to protocols and architectures that can counter their effects. For example, the acknowledgement transmission policy of TCP may be changed based on measured or estimated values for reordering to overcome deterioration of performance.
Tradeoffs involved in recovery from these effects have to be considered as well. Reordering may be dealt with at the end nodes with additional buffers, modifications to TCP, etc., at the cost of increased complexity at the end nodes. Alternatively, proactive measures can be taken within routers, such as ensuring in-order release, but these solutions will come at the cost of increased latency. Hardware at the end-node only has to deal with its own flows, while solutions at the router need to accommodate all the parallel flows passing through each link.

REFERENCES

[1] G. Moore, "Cramming more components onto integrated circuits," Electronics, vol. 38, no. 8, Apr. 1965.
[2] Moore's Law, http://www.intel.com/research/silicon/mooreslaw.htm.
[3] V. Subbiah and P. Pandit, "The unbreakable approach for deploying Oracle9i Real Application Clusters," NetApp Tech Library, TR 3218, Oct. 2002, http://www.netapp.com/library/tr/3218.pdf.
[4] J. Latta, "NGN - The future of networking," WAVE Report, issue 2054, Nov. 2000, http://www.wave-report.com/other-html-files/ngn1.htm.
[5] BGP Reports, http://bgp.potaroo.net/as1221/bgp-active.html.
[6] M. Kohler, "Network processor overview - Intel, Lucent, Sitera, C-Port," CMP Media, Inc., 2000, http://www.netrino.com/articles/networkprocessors/.
[7] Cisco, "Load balancing with Cisco express forwarding," http://www.cisco.com/warp/public/cc/pd/ifaa/pa/much/prodlit/loadb_an.pdf.
[8] L. Kencl and J.-Y. Le Boudec, "Adaptive load sharing for network processors," Proc. of INFOCOM 2002, June 2002, pp. 545-554.
[9] Switch Architectures, Light Reading, http://www.lightreading.com/document.asp?site=lightreading&doc_id=25989&page_number=3.
[10] C. Partridge, P. Carvey, et al., "A fifty gigabit per second IP router," IEEE/ACM Transactions on Networking, 6(3), 1998, pp. 237-47.
[11] Cisco white paper, "The evolution of high-end router architectures: Basic scalability and performance considerations for evaluating large-scale router designs," http://www.cisco.com/en/us/products/hw/routers/ps167/products_white_paper09186a0080091fdf.shtml.
[12] D. Howe, "Flapping router," http://burks.brighton.ac.uk/burks/foldoc/18/43.htm.
[13] Internet Core Router Test, http://www.lightreading.com/document.asp?doc_id=4009&page_number=8.
[14] M. Przybylski, B. Belter, and A. Binczewski, "Shall we worry about packet reordering," Computational Methods in Science and Technology, 11(2), 141-146, 2005.
[15] H. Liu, "A trace driven study of packet level parallelism," Proc. of International Conference on Communications (ICC 02), New York, NY, 2002, pp. 2191-2195.
[16] end2end-interest mailing list, "Reordering in Routers," http://www.postel.org/pipermail/end2end-interest/2003-August/003420.html.
[17] L. Kleinrock, "The latency/bandwidth tradeoff in Gigabit networks," IEEE Communications, April 1992, pp. 36-40.
[18] M. Laor and L. Gendel, "The effect of packet reordering in a backbone link on application throughput," IEEE Network, Sept./Oct. 2002, pp. 28-36.
[19] C. Semeria, "Internet Processor II ASIC: Rate limiting and traffic policing features," White paper, Juniper Networks, 2000, http://www.juniper.net/solutions/literature/white_papers/200005.pdf.
[20] G. Huston, "Analyzing the Internet BGP routing table," Cisco Systems, http://www.cisco.com/warp/public/759/ipj_4-1/ipj_4-1_bgp.html.
[21] T. Bu, L. Gao and D. Towsley, "On characterizing BGP routing table growth," Computer Networks, 45(1), 45-54, May 2004.
[22] K. Papagiannaki, S. Moon, C. Fraleigh, P. Thiran, F. Tobagi, and C. Diot, "Analysis of measured single-hop delay from an operational backbone network," Proc. of IEEE INFOCOM, June 2002, pp. 535-544.
[23] A. A. Bare, "Measurement and analysis of packet reordering," Master's Thesis, Department of Computer Science, Colorado State University, 2004.
[24] A. Feldmann, A. C. Gilbert, W. Willinger, and T. G. Kurtz, "The changing nature of network traffic: Scaling phenomena," ACM Computer Communication Review, vol. 28, no. 2, Apr. 1998, pp. 5-29.
[25] G. Kramer, "Generator of self-similar network traffic," http://wwwcsif.cs.ucdavis.edu/~kramer/code/trf_gen2.html.
[26] K. Claffy, Greg Miller, and Kevin Thompson, "The nature of the beast: recent traffic measurements from an Internet backbone," CAIDA, http://www.caida.org/outreach/papers/1998/inet98/inet98.html.
[27] Caida, "Size and Sequencing," http://www.caida.org/analysis/learn/packetsizes/.
[28] N. M. Piratla, A. P. Jayasumana and A. A. Bare, "RD: A formal, comprehensive metric for packet reordering," Proc. Networking 2005, Lecture Notes in Computer Science 3462, pp. 78-89, May 2005.
[29] A. P. Jayasumana, N. M. Piratla, A. A. Bare, T. Banka, R. Whitner and J. McCollom, "Reorder Density and Reorder Buffer-occupancy Density - Metrics for Reordering Measurements," IETF draft, Revised April 2007, http://www.cnrl.colostate.edu/reorder/draft-jayasumana-reorder-density-07.txt.
[30] N. M. Piratla, A. P. Jayasumana, A. A. Bare and T. Banka, "Reorder Buffer-Occupancy Density and its Application for Evaluation of Reordering," Computer Communications (2007).
[31] N. M. Piratla and A. P. Jayasumana, "Metrics for Reordering - A Comparative Analysis," International Journal of Communication Systems (IJCS), to appear.
[32] N. M. Piratla and A. P. Jayasumana, "Reordering of Packets due to Multipath Forwarding - An Analysis," Proc. IEEE Int. Conf. on Communications (ICC 2006), Istanbul, June 2006.