BACKBONE INTERNET TRAFFIC INTENSITIES AND STATISTICS IN SUDAN

Hiba M. Osman
Sudan Telecom Co. Ltd.
P. O. Box 11155 Khartoum, Khartoum, Sudan
Fax: +249 183 798504
email: hiba@sudatel.net

Sami M. Sharif
Department of Electrical and Electronic Engineering
Faculty of Engineering and Architecture, University of Khartoum
P. O. Box 321 Khartoum, Khartoum, Sudan
smsharif@uofk.edu

Huda M. A. ElHag
Department of Computer Science
Faculty of Mathematical Sciences, University of Khartoum
P. O. Box 321 Khartoum, Khartoum, Sudan
hudaalhajj@uofk.edu

1 Abstract

To properly model the Internet, a thorough understanding of the characteristics of network traffic is needed. These characteristics can be obtained by measurement and analysis of actual network traffic. Internet traffic is known to exhibit self-similarity and long-range dependence in both local area networks and wide area networks. This paper presents backbone Internet traffic intensities and statistics in the Sudan in the period from 17th April 2002 to 30th September 2002. The traffic measurements were carried out using MRTG (Multi Router Traffic Grapher) software at an Internet core router.

Keywords: Internet traffic analysis, Internet traffic statistics, Internet traffic distribution, self-similarity, long-range dependence.

2 Introduction

To properly model the Internet, a thorough understanding of the characteristics of network traffic is needed. These characteristics can be obtained by measurement and analysis of actual network traffic. A great deal of work has been done on the measurement and analysis of network traffic, covering both local area networks and wide area networks. For both local area networks [1][2] and wide area networks [3][4], it has been shown that network traffic is self-similar and long-range dependent in nature [5][6]. This paper presents backbone Internet traffic intensities and statistics in the Sudan in the period from 17th April 2002 to 30th September 2002.
The traffic measurements for the international links were carried out using MRTG (Multi Router Traffic Grapher) [7] software at the Internet core router. Satellites provide international connectivity through an earth station to four international Internet service providers. Local Internet connectivity is provided through a Frame Relay network. Two types of Internet connectivity are provided: committed information rate and burstable rate.

3 Measurement Setup and Data Collection

The measurement setup consisted of satellites at an earth station and a small dish using DVB/IP technology at Khartoum Center, which provided connectivity through four links to international ISPs. Table 1 shows the names and specifications of the Internet links. All of the links terminate at the Internet core router; I1, I2 and I3 are connected through serial ports, while I4 is connected via Ethernet. Internet service is provided to organizations and local service providers through the Frame Relay network. Figure 1 shows the top-level measurement connections. There are two types of Internet connectivity, committed information rate and burstable rate [8].

Table 1. Link Specifications (Reference [8])

Link Name   Link Capacity              Link Type
I1          2 Mbps                     Full Duplex
I2          2 Mbps                     Full Duplex
I3          2 Mbps                     Full Duplex
I4          6 Mbps CBR / 12 Mbps BBR   Simplex Downlink

Figure 1. Measurement Connectivity

Data was collected at the Internet gateway router from the international ports (I1, I2, I3 and I4) during the period from 17-4-2002 to 30-9-2002 [8]. The trace consisted of the following data: link name, link type (upload / download), international service provider, protocol used, bandwidth, upload and download bytes/second, and queuing delay and utilization for a Frame Relay switch for one day.

4 Backbone Traffic Analysis

This section describes the traffic patterns per day, week and month for the international Internet links.

Daily Traffic Pattern

As shown in Figure 2, the input traffic intensity starts to grow from 8 am and reaches its peak between 12 noon and 2 pm. It then decreases sharply from 4 pm to 7 pm and has another peak between 10 pm and 1 am. The peak at night is mostly generated by traffic from internal ISPs. Traffic generated by the local ISPs accounts for about 42.9%. The ISPs serve home users, who mainly access the Internet at night because telephone and Internet service is cheaper then. Business users generate the daytime peak; the drop between 4 pm and 7 pm corresponds to the end of the business day. The output traffic follows the same behavior as the input.

Figure 2. Daily Traffic Variations for International Internet links, for the period April 2002 to September 2002

Table 2. Peak Hour Intensity and Utilization for International Links

Link Name   Peak Hour   Intensity (byte)   Utilization (%)
I1          3 pm        73193              28.8
I2          3 pm        125331             51
I3          1 pm        122692             49.9
I4          2 pm        97507              18.6

Table 3. International Links Load Distributions

Link Name   Received Traffic Load Percentage (%)   Transmitted Traffic Load Percentage (%)
I1          18                                     33
I2          32                                     33
I3          30                                     34
I4          20                                     0
Total       100                                    100

Table 2 shows the peak hour for each link, together with the intensity and utilization at that hour. The four links show the same daily traffic variations for input and output traffic, with slight differences in the values of incoming traffic.
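The peak hour reported for each link can be located by averaging the intensity readings that fall in each hour of the day and picking the hour with the largest mean. The following is a minimal sketch of that step, assuming the readings are available as (hour of day, bytes per second) pairs; the function names and the sample values are hypothetical and are not taken from the measured trace.

```python
from collections import defaultdict


def hourly_means(samples):
    """Mean intensity per hour of day.

    samples: iterable of (hour_of_day, bytes_per_second) readings.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, intensity in samples:
        totals[hour] += intensity
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


def peak_hour(samples):
    """Return (hour, mean intensity) for the busiest hour of day."""
    means = hourly_means(samples)
    hour = max(means, key=means.get)
    return hour, means[hour]


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    readings = [(13, 100.0), (13, 300.0), (15, 400.0), (15, 100.0), (2, 50.0)]
    print(peak_hour(readings))
```

Averaging per hour before taking the maximum keeps a single bursty reading from deciding the peak hour.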
From the statistics gathered, it is observed that the received and transmitted traffic are shared between the four links as shown in Table 3. From Table 3 it is clear that the load-balancing technique used for the output traffic is efficient, giving near-exact load sharing across I1, I2 and I3 (33%, 33% and 34%). For the incoming traffic, the I2 and I3 links carry nearly the same load (32% and 30%), which indicates that the BGP metrics used to control the traffic give good results. The load on I1 is lower than on I2 and I3 because no routing policy is used to share the traffic load of this link; it depends only on the AS-path length metric. Link I4 is statically routed: no dynamic routing protocol is used, and its customers' traffic is not shared with the other links. Its 20% share indicates that these customers generate less traffic than the other customers.

As shown in Figure 3, both the input and output traffic have their lowest values on Friday and their peak values on Wednesday. The difference between the traffic on Fridays and Wednesdays is always less than 10%. The weekly traffic variations are therefore not significant, because most of the users are home users whose access on Friday is the same as on the other days of the week.

Figure 3. Weekly traffic variations for international Internet links, for the period April 2002 to September 2002

Figure 4 shows the monthly growth of the Internet traffic. The data shows the average traffic utilization per month for the period from April to September 2002. There was no observable change in the incoming traffic, while the outgoing traffic doubled. This is because during this period most of the ISPs began to have their own download links and relied on the Sudatel backbone for the upload only.

Figure 4. Monthly traffic variations for international Internet links, for the period April 2002 to September 2002

The traffic mean and standard deviation for the links I1, I2 and I3 were calculated for the period between 15 August 2002 and 30 September 2002. Table 4 shows the mean and standard deviation for input and output utilization. All the links have comparable mean and standard deviation for the outgoing traffic. For the incoming traffic, I2 and I3 have similar statistics, but the mean incoming traffic for link I1 is roughly half that of links I2 and I3. This means that fewer of the accessed web sites have their shortest path through I1 than through I2 and I3. The technique used to distribute upload traffic between the three links worked well and gave 1:1 load balancing.

Table 4. Mean and standard deviation for upload and download Utilization

Link Name   Statistic        Input Utilization (%)   Output Utilization (%)
I1          Mean             18.64                   32.55
I1          Std. Deviation   11.88                   16.64
I2          Mean             37.88                   36.24
I2          Std. Deviation   18.54                   17.59
I3          Mean             32.55                   34.78
I3          Std. Deviation   16.37                   16.80

Traffic Distribution

To find the statistical distribution of the received traffic, the probability of each traffic utilization value was calculated from the 5-minute averaged utilization readings for the period from 15 August 2002 to 30 September 2002. Figure 5 shows the observed distribution for each link compared with the corresponding Poisson distribution. The Poisson distribution for each graph is calculated from the utilization values in Table 4. As shown in Figure 5, the arrivals on all of the links do not follow a Poisson arrival model; hence the queuing strategies that apply to a Poisson arrival model may not be suitable for this kind of traffic.

Figure 5. Distribution of Incoming Traffic compared to Poisson distributions

5 Self-Similarity

A phenomenon that is self-similar looks the same or behaves the same when viewed at different degrees of magnification or at different scales on a dimension. This dimension can be space or time [9]. Figure 6 compares traffic that is non-self-similar (left) and traffic that is self-similar (right).
For the non-self-similar traffic, the traffic smooths out as the time scale gets longer, whereas the self-similar traffic tends to look the same at short and long time scales.

Figure 6. Comparison between self-similar and non-self-similar stochastic processes

The data used to check for self-similarity was taken from one of the aggregate Frame Relay links. The measurement set covered an 8-hour period with a resolution of 12 seconds. Figure 7 shows the traffic variations on different time scales. On the x-axis each time unit is 12 seconds. The traffic in Figure 7 maintains the same behavior when viewed at different scales of aggregation.

Figure 7. Traffic at different time scales

Figure 8 shows plots of the number of packets per unit time for a measurement set from 24 December 2002, which consists of over 7 hours of continuous monitoring of the Internet traffic. Each subsequent plot is obtained from the previous one by increasing the time resolution by a factor of 2 and displaying a chosen sub-interval. The first plot covers a period of 7 hours, the second 3.5 hours, and so on. All the plots look similar to one another and all involve a fair amount of burstiness; thus Internet traffic tends to look the same at large scales and at small scales.

The variance-time plot was calculated for the same data using the same aggregation levels as in Figure 8. The resulting variance-time plot is shown in Figure 9. A straight line was fitted to the values of log10(normalized variance) against log10(m) in order to find the slope; the fit has an intercept of 0.0159 and a coefficient of magnitude 1.0379 on log10(m). Taking the slope of the line relating log10(var) to log10(m) as 1 gives β = 1 and H = 1. This indicates that the traffic is self-similar with H = 1.

6 Utilization and Queuing Delay

Figure 10 shows a plot of delay versus utilization for a measurement set from 17 October 2002, which consisted of over 8 hours of continuous monitoring of the University of Khartoum Internet Frame Relay link. The y-axis shows the normalized queuing delay, averaged over 14-second polling intervals and divided by the maximum allowable delay value of the switch. The x-axis shows the percentage utilization of the link. The dark line shows the general trend of the plot. It is observed that the delay increases sharply for utilization values above 55%.
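A general trend line for a delay-utilization plot of this kind can be approximated by averaging the normalized delay within fixed-width utilization bins. The following is a minimal sketch under that assumption; the 5% bin width, the function names and the sample values are illustrative choices and are not taken from the measurements.

```python
from collections import defaultdict


def normalize_delays(delays, max_allowable_delay):
    """Divide each delay by the switch's maximum allowable delay."""
    return [d / max_allowable_delay for d in delays]


def binned_trend(samples, bin_width=5.0):
    """Mean normalized delay per utilization bin.

    samples: iterable of (utilization_percent, normalized_delay).
    Returns a list of (bin_lower_edge, mean_delay), sorted by utilization.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for utilization, delay in samples:
        # Lower edge of the bin that contains this utilization value.
        edge = int(utilization // bin_width) * bin_width
        totals[edge] += delay
        counts[edge] += 1
    return [(edge, totals[edge] / counts[edge]) for edge in sorted(totals)]


if __name__ == "__main__":
    # Hypothetical (utilization %, normalized delay) pairs, for illustration only.
    points = [(12, 0.25), (14, 0.75), (57, 1.0)]
    for edge, mean_delay in binned_trend(points):
        print(f"{edge:5.1f}%  {mean_delay:.3f}")
```

A threshold such as the one reported above would show up in the output as the first bin at which the mean delay rises well above the values in the lower bins.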
Figure 11 shows delay versus utilization for an aggregate Internet link with low load.

Figure 8. Plots of the number of packets per unit time for a measurement set from 24 December 2002
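The variance-time analysis of Section 5 can be sketched as follows. This is a minimal illustration and not the procedure used to produce Figure 9: it assumes aggregation by averaging non-overlapping blocks of m samples, normalizes by the variance of the unaggregated series, and fits a least-squares line in log-log coordinates. Under that convention the fitted slope equals the negative of β. The function names and the synthetic input are hypothetical.

```python
import math
import random
import statistics


def aggregate(series, m):
    """Average consecutive non-overlapping blocks of m samples."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance_time_points(series, levels):
    """Return (log10(m), log10(normalized variance)) for each level m."""
    base_var = statistics.pvariance(series)  # assumed non-zero
    points = []
    for m in levels:
        blocks = aggregate(series, m)
        if len(blocks) < 2:
            continue  # too few blocks to estimate a variance
        var = statistics.pvariance(blocks)
        if var > 0:
            points.append((math.log10(m), math.log10(var / base_var)))
    return points


def fit_line(points):
    """Least-squares fit y = intercept + slope * x; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def demo():
    """Fit a variance-time line to synthetic independent counts."""
    rng = random.Random(0)
    # Stand-in data, not the measured trace. For independent samples the
    # block-mean variance decays as 1/m, so the slope is near -1.
    counts = [rng.randint(0, 100) for _ in range(4096)]
    levels = [1, 2, 4, 8, 16, 32, 64]  # doubling, as in the aggregation steps
    intercept, slope = fit_line(variance_time_points(counts, levels))
    return intercept, slope, -slope  # last value is beta under this convention


if __name__ == "__main__":
    intercept, slope, beta = demo()
    print(f"intercept={intercept:.4f} slope={slope:.4f} beta={beta:.4f}")
```

Replacing the synthetic counts with a measured packet-count series and reusing the same aggregation levels would give the points of a variance-time plot and the fitted line in one pass.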
Figure 9. Variance-time

Figure 10. Delay-Utilization

Figure 11. Delay-Utilization

7 Conclusion

This paper gives the backbone Internet traffic intensities and statistics. The traffic examined came from the Internet gateway router; using this data it was shown that Internet traffic is self-similar in nature. The data was measured at relatively coarse time scales; in future work, measurements should be taken at finer time scales. The ratio of the accessed web sites that have the shortest path through I2 and I3 to those that have the shortest path through I1 is 4:1. The peak utilization is at 2 pm on weekdays, and the traffic shows no variation across the week except for a slight drop on Fridays. On a monthly basis the upload traffic doubled in four months, but there was no increase in the download traffic. The packet delay in the Frame Relay switches was found to increase sharply when the utilization exceeded 55%; this should be taken into account for links that carry delay-sensitive applications.

In the future it is predicted that demand for Internet applications such as video conferencing, distance learning, remote banking and e-commerce will emerge in Sudan. Some of these services are Quality of Service sensitive applications. The performance effects of traffic self-similarity have to be taken into account when designing networks that carry such applications. The data used in this research was gathered over a few months, so the results obtained may need to be recalculated for long-term measurements. For further research, large storage devices, statistical analysis tools and packet sniffers should be used to gather traffic data from the routers and switches.

References

[1] Leland W. E., Willinger W., Taqqu M. S. and Wilson D. V., On The Self Similar Nature Of Ethernet Traffic, Proceedings of the ACM SIGCOMM 93, September 1993.
[2] Leland W. E., Willinger W., Taqqu M. S. and Wilson D. V., On The Self Similar Nature Of Ethernet Traffic (Extended Version), IEEE / ACM Transactions on Networking, Vol. 2, 1994, pp. 1-15.
[3] Crovella M. E. and Bestavros A., Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes, Proceedings of the 1996 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, May 1996.
[4] Paxson V. and Floyd S., Wide-Area Traffic: The Failure of Poisson Modeling, IEEE / ACM Transactions on Networking, Vol. 3, 1995, pp. 226-244.
[5] Crovella M. E., Taqqu M. S. and Bestavros A., Heavy-Tailed Probability Distributions In The World Wide Web, in A Practical Guide To Heavy Tails: Statistical Techniques And Applications, R. Adler, R. Feldman and M. S. Taqqu, editors, Boston: Birkhauser, 1998.
[6] Willinger W., Paxson V., Riedi R. and Taqqu M. S., Long-Range Dependence and Data Network Traffic, in Theory And Applications of Long-Range Dependence, P. Doukhan, G. Oppenheim and M. S. Taqqu, editors, Boston: Birkhauser, 2002.
[7] MRTG (Multi Router Traffic Grapher), http://people.ee.ethz.ch/~oetiker/webtools/mrtg/
[8] H. M. Osman, Internet backbone network traffic in Sudan, masters thesis, Khartoum University, Khartoum, Sudan, 2002.
[9] W. Stallings, High-speed networks: TCP/IP and ATM design principles (Prentice Hall, 1998).