Trends in TCP/IP Retransmissions and Resets



Concordia Chen, Mrunal Mangrulkar, Naomi Ramos, and Mahasweta Sarkar
{cychen, mkulkarn, msarkar, naramos}@cs.ucsd.edu

Abstract

As the Internet grows larger, measuring and characterizing its dynamics becomes essential for purposes ranging from the optimization of future network equipment to modeling the effects of new protocols on the existing traffic mix. One important aspect of Internet connections is bandwidth utilization with data sent using the TCP protocol. In this paper, we discuss findings from a study of 65,535 TCP flows between 8083 Internet sites. We look into two major bandwidth utilization problems: HTTP connections reset unnecessarily by impatient users and bandwidth wasted on retransmissions. For HTTP connections, we provide an algorithm to distinguish a reset of an impatient user from a network reset. For bandwidth wasted, we calculated the amount of data retransmitted and also analyzed the goodput and throughput per flow. We discovered that in our one-hour trace, unnecessary resets occur 10.6% of the time and that retransmissions waste 3.47% of the total bandwidth.

1 Introduction

Analysis of Internet traffic behavior is a major area of research today. As the Internet grows larger, measuring and characterizing its dynamics becomes essential for purposes ranging from the optimization of future network equipment to modeling the effects of new protocols on the existing traffic mix. Since TCP/IP is the primary protocol being used in today's Internet, it is important to characterize its performance. While there are various aspects of TCP that can be analyzed, such as congestion control and end-to-end routing behaviors, bandwidth utilization is one of the most important statistics in TCP connections. To understand bandwidth utilization, two main questions were put forth: What percentage of HTTP connections is reset unnecessarily? What percentage of bandwidth is wasted on retransmissions?
Using data collected by the NLANR/MOAT Network Analysis Infrastructure (NAI) project and analysis software from CAIDA's CoralReef project, trends were identified in TCP packet transmission over a one-hour time period. The study was a passive analysis of 65,535 TCP flows between 8083 Internet sites. Statistics such as throughput, goodput, and the number of resets that occur due to image-impatient clients (e.g. individuals who reset their HTTP connections before the entire web page is transferred) were collected.

In §2, we discuss the methodology and the tools used for our analysis. In §3, we provide the results and the analysis used to answer our questions. We found that in our one-hour trace, unnecessary resets occur 10.6% of the time and that retransmissions waste only a small percentage, 3.47%, of the total bandwidth. In §4, we discuss other interesting observations made during the course of this project. In §5, we briefly summarize our findings and discuss future directions for evaluating retransmission and reset trends.

2 Analysis Methodology and Tools

The raw data used in the study was provided by the NLANR/MOAT Network Analysis Infrastructure (NAI) project. A monitor was placed on an OC-3c link at the New Zealand Internet Exchange and collected the data. [footnote 1]

Footnote 1: The monitor was connected to a Switched Port Analyzer (SPAN) using a 100BaseTX Fast Ethernet. Timestamps are consequently skewed compared to their arrival or departure times at the input/output ports of the switch, as the total capacity of the switch is higher than the monitoring uplink and packets arriving from different ports at the same time need to be queued before being delivered to the SPAN port. The chance of traffic being lost before being monitored is fairly low, as the total bandwidth at the switch currently peaks at around 10-12 MBits/sec.

Due to limited resources, mainly disk space, we did our analysis on a one-hour trace extracted from a continuous five-day trace. Although the switching network used in this trace was ATM, the study concentrated on the TCP/IP layer, and only the 40 bytes of header information extracted from the first ATM cell of each packet were analyzed. Since the traces were encoded in the DAG format, CAIDA's CoralReef software was modified to reduce the raw trace to a set of data that could be analyzed with other tools. Although the analysis for the study could have been done entirely within CoralReef, it was decided that an external program would be able to provide flexibility and spatial efficiency.

This data set consisted of specific information found within the TCP/IP header, such as source/destination IP address, source/destination port address, window size, number of bytes of data sent, ACK number, sequence number, and the SYN, FIN, RST and ACK flags. In addition, the timestamp was recorded, and the protocol number given by the monitor was used to verify that the packet was in fact TCP/IP.

Working with the one-hour trace of data extracted with the CoralReef utilities, flows characterized by the 4-tuple {IP source address, IP destination address, IP source port, IP destination port} were individually analyzed to yield statistics on retransmissions and impatient resets. Table 1 summarizes the data gathering and analysis techniques.

Table 1: Data Gathering and Analysis Techniques

| Tool | Input | Measurements: active/passive | Measurements: functions | Measurements: time scope | Output |
|---|---|---|---|---|---|
| Perl script | CoralReef-extracted header information | Passive | Count total bytes sent, retransmitted bytes, number of resets; throughput calculation; goodput calculation; measure bandwidth utility | 1 hour from start of transmission | File and graphs |

Since the study was focused on retransmission and resets, the statistics calculated from observed measurements included bandwidth wasted due to retransmissions, throughput, goodput, and total number of impatient resets.

The bandwidth wasted due to retransmissions, BW, was calculated using the total number of bytes retransmitted, r, and the total transmission time, t, in seconds, with the following equation.

$$BW = \frac{r}{t}$$

The total throughput, T, was calculated by looking at the total transmitted bytes, b, over the total transmission time, t, in seconds.

$$T = \frac{b}{t}$$

Total goodput, G, was defined with respect to the total transmitted bytes, b, and the total retransmitted bytes, r, over the transmission time, t, in seconds.

$$G = \frac{b - r}{t}$$
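The definitions of BW, T and G given in this section can be written out directly. The Python sketch below is illustrative only: the function names and the example flow (100,000 bytes sent, 3,000 of them retransmitted, over 20 seconds) are assumptions made for the example and are not taken from the trace or from the authors' Perl script.

```python
def _check_duration(duration_s: float) -> None:
    # All three rates divide by the transmission time, so it must be positive.
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")


def wasted_bandwidth(retransmitted_bytes: int, duration_s: float) -> float:
    """BW = r / t, in bytes per second."""
    _check_duration(duration_s)
    return retransmitted_bytes / duration_s


def throughput(total_bytes: int, duration_s: float) -> float:
    """T = b / t, in bytes per second."""
    _check_duration(duration_s)
    return total_bytes / duration_s


def goodput(total_bytes: int, retransmitted_bytes: int, duration_s: float) -> float:
    """G = (b - r) / t, in bytes per second."""
    _check_duration(duration_s)
    return (total_bytes - retransmitted_bytes) / duration_s


if __name__ == "__main__":
    # Hypothetical flow, chosen only to illustrate the arithmetic.
    b, r, t = 100_000, 3_000, 20.0
    print(throughput(b, t), wasted_bandwidth(r, t), goodput(b, r, t))
```

By construction G = T - BW, so goodput and throughput differ only by the retransmission rate. As a check against the paper's own figures, 878,172,211 bytes over a 3,600-second trace gives a throughput of about 243,936.73 bytes/sec, which agrees with the "used bandwidth" entry in Table 3.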

In order to calculate resets due to image-impatient users, the following algorithm was used:

    If either source or destination port = 80 [footnote 2]
        If RST = 1
            If bytes transmitted < 10,000
                Increment number of resets

Footnote 2: We have assumed that in most cases packets sent between any port number higher than 1023 and a well-known port number below 1023 are generated by the same protocol (e.g., HTTP on port 80). This matches typical end host behavior, in which clients allocate ephemeral ports from the range 1024 to 32767.

Since we were looking for retransmissions due to impatient users, it was necessary to observe the handshake at the beginning of the transmission. This algorithm was therefore run only on the flows that began within the 1-hour time period. A distinction between connections was needed to decide which resets were due to network failures versus those due to image-impatient users. Looking at several images on the Internet, we discovered that a small image would be at least 20KB. We assumed that a user fitting this profile would terminate the connection halfway through, which gives the 10,000-byte threshold used in the algorithm. Although this number can vary greatly, it provides a reasonable estimate of connections reset unnecessarily.

3 Results and Analysis

All of the results have been obtained by analyzing the traces associated with a particular flow. Table 2, below, is a sample of some of the packets sent within a flow. We generated these traces for every 4-tuple that was found in the first hour of the trace.

Table 2: Sample of a Flow Data Structure

| Time | Src IP | Dst IP | Length | Src Port | Dst Port | Seq Num | Window Size | Ack Number | ACK | FIN | RST | SYN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9.48764210E+08 | 10.0.19.232 | 10.0.0.26 | 44 | 1091 | 83 | 842060 | 8192 | 0 | 0 | 0 | 0 | 1 |
| 9.48764210E+08 | 10.0.0.26 | 10.0.19.232 | 44 | 83 | 1091 | 1749083406 | 8760 | 842061 | 1 | 0 | 0 | 1 |
| 9.48764210E+08 | 10.0.19.232 | 10.0.0.26 | 40 | 1091 | 83 | 842061 | 8760 | 1749083407 | 1 | 0 | 0 | 0 |
| 9.48764210E+08 | 10.0.19.232 | 10.0.0.26 | 40 | 1091 | 83 | 842061 | 32768 | 1749083407 | 1 | 0 | 0 | 0 |
| 9.48764210E+08 | 10.0.19.232 | 10.0.0.26 | 618 | 1091 | 83 | 842061 | 32768 | 1749083407 | 1 | 0 | 0 | 0 |

After running the algorithms on 65,535 TCP flows between 8083 Internet sites, a summary of the transmission characteristics for each flow was created. The flow summary contained the following information per connection: start time, end time, sent bytes, retransmitted bytes, duplicate ACKs, outstanding ACKs, and the number of RST bits set.

As described earlier, this data was used to calculate throughput per flow, goodput per flow, bandwidth wasted due to retransmissions, total bytes retransmitted during the one-hour transmission, and total number of resets.

3.1 Connections Reset

Whether or not a user resets a connection depends mostly on the individual using the link at the time the trace was taken. Hence, it becomes difficult to make a generalized statement about the mentality of users using passive data. Following the algorithm described in Section 2, the number of connections reset by impatient users in the hour-long period totaled 6,947 out of 65,535 TCP connections, meaning that approximately 10.6% of the connections were reset. The number of resets is a significant portion of the total number of connections.
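The reset rule from Section 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' Perl script: the `FlowSummary` record and its field names are hypothetical, and applying the rule once per flow (rather than once per RST packet) and requiring an observed handshake are assumptions drawn from the surrounding description.

```python
from dataclasses import dataclass
from typing import Iterable

HTTP_PORT = 80
IMPATIENT_BYTE_LIMIT = 10_000  # half of the assumed 20 KB minimum image size


@dataclass
class FlowSummary:
    """Hypothetical per-flow record; field names are illustrative."""
    src_port: int
    dst_port: int
    rst_count: int        # number of packets in the flow with RST set
    bytes_sent: int       # total bytes transmitted in the flow
    saw_handshake: bool   # True if the flow's start was seen in the trace


def is_impatient_reset(flow: FlowSummary) -> bool:
    """Apply the three nested conditions of the rule to one flow."""
    is_http = HTTP_PORT in (flow.src_port, flow.dst_port)
    was_reset = flow.rst_count > 0
    is_short = flow.bytes_sent < IMPATIENT_BYTE_LIMIT
    return flow.saw_handshake and is_http and was_reset and is_short


def impatient_reset_fraction(flows: Iterable[FlowSummary]) -> float:
    """Fraction of all flows classified as impatient resets."""
    flows = list(flows)
    if not flows:
        return 0.0
    return sum(is_impatient_reset(f) for f in flows) / len(flows)
```

With the counts reported in this section, 6,947 flagged flows out of 65,535 gives 6947 / 65535, or about 10.6%.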

It is highly likely that the major portion of these resets would have led to retransmission, wasted bandwidth, and increased congestion. It would be difficult to calculate the retransmissions that would have resulted from these resets because every time a new connection is established, a new port number is issued. By looking at the passive data, we were unable to decide whether successive connections to the same IP address were requesting the same data. A more viable approach to understanding the percentage of HTTP connections that are reset unnecessarily would be to use an active methodology. By observing a large group of users and their browsing behavior, a more accurate generalization can be made.

3.2 Retransmissions

TCP retransmissions should only occur when it is certain that a packet to be retransmitted was actually lost. As seen in [2], redundant retransmissions can also occur because of lost acknowledgments, coarse feedback, and bad retransmission timeouts. This study looks at the bandwidth wasted due to all types of retransmissions. A summary of the statistics gathered for retransmissions can be found in Table 3 below.

Table 3: Summary Statistics for Retransmissions

| Statistic | Value |
|---|---|
| Total bytes sent/received | 878,172,211 |
| Retransmitted bytes | 30,537,579 |
| % of retransmitted bytes | 3.47% |
| Used bandwidth (bytes/sec) | 243,936.7253 |
| Wasted bandwidth (bytes/sec) | 8,483.5375 |
| % of wasted bandwidth | 3.47% |
| Total packets sent | 1,910,081 |
| Total packets retransmitted | 65,006 |
| % of packets resent | 3.403% |

Figure 1, below, depicts the bandwidth wasted due to retransmissions as a percentage of the flows. This pie chart shows the amount of retransmission that occurred during the traces.

Figure 1: Bandwidth Wasted Due to Retransmissions

In order to compare our results, we looked at the percentage of redundant retransmissions reported in [2]. That study indicated that approximately 1% to 6% of all packets sent were redundantly retransmitted. Our results fall within this range. Although this observation could be misused to make a generalization about TCP connections, it is important to note that TCP can display a wide variety of behavior, making generalization about any of its characteristics difficult. For example, this can be observed by using the netstat facilities available on Solaris and Windows operating systems to actively measure the segments retransmitted by each computer. The percentages of retransmissions for these systems were significantly smaller than our calculated percentage of retransmitted packets. On a Windows NT machine, approximately 0.010% of segments were retransmitted, and on a Solaris machine, a reported 0.014% of segments were retransmitted. Because this facility looked solely at the retransmissions sent by computers that were mainly clients, the smaller results were expected.

3.2.1 Throughput & Goodput

Throughput helps us estimate the utilization of the available bandwidth. With the virtual channel configured as Classical IP over ATM, the connection had a packet peak rate of 250,000 bytes/sec set in each direction (500,000 bytes/sec when the two trace files are aggregated). The throughput measured occupies 48.78% of the total available bandwidth on the OC-3c link.

Figure 2a, b: Throughput and Goodput versus Starting Time

Figure 2a shows the throughput per flow plotted against its reported start time. This graph shows the variation in throughput per flow throughout the one-hour period. Note that the high peaks indicate flows with high bandwidth utilization. The graphs are constructed using 30,000 flows randomly chosen from the data set, because of a limitation of the graphing program used to plot the data. Similarly, Figure 2b illustrates the goodput obtained by these flows. Comparing the two graphs, it can be seen that they resemble each other. With goodput defined as $G = \frac{b - r}{t}$, this result was predictable: with retransmitted bytes accounting for only 3.47% of the total bytes transmitted, the goodput graph should be similar to the throughput graph.

4 Other Observations

Piggybacking

An interesting observation in the traces was that some of the source/destination pairs were not piggybacking their acknowledgements. The same sequence number was used on two consecutive packets, one containing an acknowledgement for a packet received and the other containing a payload. While this pattern is expected following the initial handshake, it was observed several times within one flow. A possible explanation could be that the server/client did not want to delay acknowledgments to wait for the send buffer to fill.
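One way to look for the non-piggybacked pattern described above is sketched below in Python. It is a rough illustration under stated assumptions, not the method used in the study: the `Packet` record is hypothetical, "consecutive" is interpreted as consecutive packets from the same sender within one flow, and the payload length is assumed to be already known (for instance, IP length minus 40 header bytes when no TCP options are present).

```python
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical per-packet record for a single flow, in time order."""
    src: str          # sender IP address
    seq: int          # TCP sequence number
    payload_len: int  # bytes of TCP payload (0 for a pure ACK)
    ack_flag: bool    # True if the ACK bit is set


def count_separate_acks(packets: Iterable[Packet]) -> int:
    """Count data packets that reuse the sequence number of an immediately
    preceding pure ACK from the same sender, i.e. cases where the ACK was
    sent on its own instead of riding on the data packet."""
    last_by_sender = {}
    count = 0
    for pkt in packets:
        prev = last_by_sender.get(pkt.src)
        if (
            prev is not None
            and prev.ack_flag
            and prev.payload_len == 0   # previous packet carried no data
            and pkt.payload_len > 0     # current packet carries data
            and pkt.seq == prev.seq     # same sequence number reused
        ):
            count += 1
        last_by_sender[pkt.src] = pkt
    return count
```

Applied to the last three rows of Table 2 (two 40-byte ACK packets followed by a 618-byte packet, all with sequence number 842061), this counts one occurrence, which is the expected post-handshake case; occurrences later in a flow are the ones the observation above is concerned with.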

Duplicate ACKs

By looking at the TCP flow traces, the congestion avoidance scheme being used for the TCP connection could be partially identified. Most of the connections waited for three duplicate ACKs before retransmitting a packet. This behavior identified the congestion avoidance scheme as being Reno or Tahoe. The total number of duplicate ACKs found in the one-hour trace was 284,258. This is reasonable, considering that a packet is retransmitted for every three duplicate ACKs received on average.

Window Size

In our analysis, the window size stayed almost constant throughout the hour-long period. This was contradictory to what was expected. TCP implements congestion avoidance by cutting the window size to half of the value at which congestion started. TCP's policy of retransmitting after every three duplicate ACKs was also observed. This should have been accompanied by a decrease in window size. However, no such decrease was observed.

5 Conclusion

After conducting an analysis of the TCP traffic workload seen at a single measurement site inside a major Internet traffic exchange point, the following conclusions can be made:

We discovered that 10.6% of the connections were reset due to image-impatient users. We felt that this was a significant portion of the connections. We assume that a major portion of these resets would have led to retransmissions, leading to wasted bandwidth and increased congestion. It would be difficult to calculate the retransmissions that would have resulted from these resets because every time a new connection is established, a new port number is issued. By looking at the passive data, we were unable to decide whether successive connections to the same IP address were requesting the same data.

The retransmission trends that we observed in our analyses showed a 3.4% retransmission rate over all packets sent. This was in accordance with other retransmission results reported in previous studies [2]. However, due to the wide variation of TCP behavior, we could not generalize the results.
Some common assumptions, such as piggybacking of data with acknowledgements and a decrease in window size at the onset of congestion, were violated. However, we did observe that retransmissions occurred after three duplicate ACKs, which is consistent with the TCP congestion avoidance mechanism of the most popular TCP implementations (e.g. Reno and Tahoe).

Future directions for understanding trends regarding retransmission and resets would include more extensive analysis of traces over a longer period of time, which would help characterize TCP trends more accurately. An active measurement of resetting trends with image-impatient users could lead to interesting conclusions.

6 Acknowledgements

This work would not have been possible without the CoralReef tools provided by CAIDA, data provided by the NLANR/MOAT Network Analysis Infrastructure (NAI) project, aid provided by David Moore, and the insight into TCP/IP given by Geoff Voelker.

References

[1] Peterson and Davie. Computer Networks: A Systems Approach. Morgan Kaufmann, 2000 (Second Edition).
[2] Paxson, V. End-to-End Internet Packet Dynamics, Proc. SIGCOMM 97, Sept. 1997.
[3] Paxson, V. End-to-End Routing Behavior in the Internet, Proc. SIGCOMM 96, August 1996.
[4] Stevens, W. TCP/IP Illustrated, Volume 1: The Protocols. Addison Wesley, 1994.
[5] McCreary, S. and Claffy, kc. Trends in Wide Area IP Traffic Patterns, 2000.