Internet Traffic Analysis and the Unidirectional Classifier

Transcription

1 Classification of emerging protocols in the presence of asymmetric routing M. Crotti, F. Gringoli, L. Salgarelli Università degli Studi di Brescia, Brescia, Italy, Summary. The growing communication needs of things, such as sensors, automated controllers, etc., are rapidly showing their impact on the Internet. On one hand, the nature of the traffic that needs to be transported is changing, with the emergence of new communication protocols and standards. On the other hand, the way this traffic is routed is also changing: the practice of asymmetric routing is quickly spreading even to the edges of the network. In this scenario the challenge of traffic classification and protocol recognition, that is crucial to control and manage network traffic, is even harder than today. In this paper we describe how to optimize a simple statistical traffic classifier to recognize the unknown traffic that can be generated by emerging protocols. We carry the analysis further to understand how asymmetric routing can impact its accuracy. Our experimental results suggest that the deterioration of accuracy is not very significant in terms of decrease in true positives, while it can be substantial in terms of increasing the false positive rates. Furthermore, our work suggests that some protocols carry more information relevant to traffic classification in one direction than in the other. 1 Introduction A widely accepted notion of the Internet of Things (IoT) is referred to the possibility of equipping everyday objects with adequate technology in order to allow them to communicate using TCP/IP with other objects, identify themselves or even participate to distributed computing [1]. A rapid technological evolution is taking the IoT phenomenon an essential step further: by embedding short-range transceivers into a wide array of everyday items, new forms of communication are growing not only between people and things, but also between things themselves. These new devices are expected to produce significant amount of traffic not only locally, but also globally, for example when swarms of sensors are probed and accessed remotely through the Internet by their automated controllers.

2 2 M. Crotti, F. Gringoli, L. Salgarelli In order to maintain and improve the way QoS delivery and security are dealt with in the IoT, traffic analysis mechanisms need to be able to cope with the emergence of the new protocols used by things, and with how such protocols impact the current Internet. At the same time, both economical and technological reasons are behind another change that the Internet is experiencing, and that will be even more accelerated by the advent of the IoT. Asymmetric routing [2, 3], a practice already common in the Internet core, is going to affect parts of the network that are closer and closer to the edges in a few years [4]. In order to maintain their effectiveness in this environment, Internet traffic analysis techniques need to be made robust to the effects of asymmetric routing. In this paper we focus on the two problems described above, and present the analysis of a commonly used traffic classification technique with respect to its capabilities of recognizing unknown protocols and its ability to operate when only one of the one of the two counter propagating half-flows is available. 1.1 Research contributions In this work we offer two main research contributions. First, we introduce an effective and modular statistical classification technique that optimizes the precision of a statistical classifier in recognizing the emerging (i.e. unknown ) protocols in an asymmetrical routing environment. Then we investigate on how the technique can be expanded so as to make it work effectively on bidirectional flows. Second, we analyze on three data sets the differences in classification accuracy when half-flows are considered as opposed to bidirectional ones. The traces we use are quite different with respect to the environment in which they were captured, the type of protocols they contain, and the time at which the captures were made. This should ensure the generality of the results of our analysis. The rest of the paper is organized as follows: Section 2 describes an optimization technique conceived for the chosen unidirectional classifier. We then introduce in Section 3 two bidirectional classifiers that can be obtained combining the results of the unidirectional verdicts. In Section 4 we describe a couple of performance parameters that are useful in determining the improvement introduced by the bidirectional classifiers. Section 5 details the datasets we used in our experiments. In Section 6 we report numerical results, and we analyze them in Section 7. Finally, Section 8 concludes the paper. 2 Tuning the unidirectional classifiers The classification algorithm that we developed in [5] and [6] has been extended in this work in order to maximize the capability of the classifier to recognize emerging protocols. To do so, we introduced two free parameters:

3 Classification of emerging protocols in the presence of asymmetric routing 3 the acceptance threshold Tacc i for each of the ω i protocols under observation, and the number n p of packets used to compute the anomaly score. We choose to fix those parameters by means of an optimization algorithm based on the maximization of the classifier s overall recall. The recall measurement R is a performance parameter widely adopted in the Machine Learning (ML) literature [7] and, in our context, it is defined as follows: T pi R i =, T pi + F ni where T pi denotes the true positives that are the half-flows correctly labeled as belonging to protocol ω i and F ni denotes the false negatives, i.e., the half-flows incorrectly labeled as not belonging to ω i. The value of R i for each protocol ω i Ω (i.e. the set of known protocols) is obtained by running the classification algorithm on a set G Ω composed by flows generated by application protocols inside Ω that didn t enter the training phase. Similarly, R n+1 is obtained by classifying a set G ωn+1 composed by flows that have been generated by other protocols than those in Ω. Those two sets compose the optimization set G: G = G Ω G ωn+1. We finally determine Tacc i and n p that maximize the classifier s overall recall by imposing the following cost function: ( n ) max R i (T i n p acc, n p ) + R n+1 (n p ). (1) i=1 3 Combining unidirectional classifiers In this section we propose two methods to combine the outputs of the unidirectional classifier applied to F I and F R (i.e., the two half-flows propagating respectively from Initiator to Responder and vice-versa) to obtain a bidirectional classifier. The first technique aims at the maximization of the classifier s recall, and is based on a maximum a posteriori probability (MAP) approach. The second technique aims at the maximization of the classifier s precision P that is defined as follows: P = T p T p + F p. The precision parameter is also widely adopted in the ML literature [7] and, differently from the recall parameter, increases when the false positives F p decrease and indicates the reliability of the classifier s verdict. In our context the precision maximization is based on the evaluation of coincident fault probability of unidirectional classifiers (i.e., when both classifiers emit the same wrong verdict).

4 4 M. Crotti, F. Gringoli, L. Salgarelli 3.1 First technique: Recall maximization Here we describe how the MAP approach can be applied to the output of the unidirectional classifiers described in Section 2. Starting from the elements that belong to the optimization set G we gather the unidirectional classification results and we build, for each of the n + 1 classes, a classification matrix: A (ωi) = a (ωi) 1, a (ωi) 1,n a (ωi) i,i a (ωi) n+1, a(ωi) n+1,n+1 Here a (ωi) j,k represents the percentage of flows belonging to ω i class marked by the F I classifier as belonging to ω j class and marked by the F R classifier as belonging to ω k class. From a statistical point of view A (ωi) can be seen as a joint probability matrix and each a (ωi) j,k element represents a conditional probability:. a (ωi) j,k = P (ˆω I = ω j, ˆω R = ω k ω i ). (2) If the Maximum A-posteriori Probability condition is imposed to the output of the classifier that combines F I and F R verdicts, under the assumption that the flows are uniformly distributed across the ω i classes, the estimated ˆω class for each i, j would be: ( ˆω(j, k) = arg max ω i a (ωi) j,k ). (3) The behavior of the statistical classifier based on a MAP estimation can be summarized with the following classification matrix ˆΩ: ˆΩ = ˆω(1, 1) ˆω(1, n + 1) ˆω(i, i) ˆω(n + 1, 1) ˆω(n + 1, n + 1) In practice, the input to the above matrix are the two outputs of the unidirectional classifier applied to F I and F R. Such outputs serve as indexes to the element of the matrix that indicates the bidirectional classifier s outcome. 3.2 Second technique: Precision maximization The proposed unidirectional classification algorithm, as explained in [5, 6], can assign a half-flow to any of a known classes of protocols Ω or mark the flow as unknown (i.e., as belonging to the ill-defined class ω n+1 ). Since we aim at the maximization of precision, the way of combining unidirectional verdicts is straightforward: we assign a flow to a class ω i Ω if the classifiers verdicts

5 Classification of emerging protocols in the presence of asymmetric routing 5 on the two halves of the flow agree, otherwise the flow is marked as unknown (i.e., assigned to class ω n+1 ). Applying the described Precision Maximization criteria (MaxP), the elements ˆω(i, j) of the classification matrix ˆΩ would be set up as follows: { ωi if i = j ˆω(i, j) = ω n+1 otherwise 4 Comparing the unidirectional and bidirectional classifiers In the previous sections we proposed two different methods for combining the information coming from F I and F R into a bidirectional classifier. Since in this paper we are interested in comparing the results of the bidirectional approaches with the unidirectional ones, here, starting from the literature, we develop a couple of performance indicators that we will use in Section 7 to help us with the analysis. 4.1 MAP: recall comparison Basically we need a measure to determine if and how a MAP approach impacts on the Recall parameter with respect to a unidirectional classifier. Since the optimum recall results R I and R R of the unidirectional classifiers are known, they can be used as a lower bound for the recall of combined classification as follows: R min = max (R I, R R ). The estimation of the overall recall for the MAP approach is simply obtained with the following: R = 1 n+1 f ˆΩ(ω i ), (4) n + 1 where f ˆΩ(ω i ) represents the recall of the MAP classifier for each class ω i of the evaluation set and it is derived as follows: f ˆΩ(ω i ) = i=1 n+1 a j,k=1 ˆω(j,k)=ω i (ω i) j,k. Obviously, if R > R min the bidirectional classifier achieves better results than each of the unidirectional classifiers. In order to give an indication of the improvement of classification accuracy due to the MAP approach we define the parameter R :

6 6 M. Crotti, F. Gringoli, L. Salgarelli R = R R min 1 R min, normalized in order to give a non-negative value when the MAP approach leads to better results with respect to the unidirectional classification and a maximum value of 1 in case of perfect classification. 4.2 MaxP: independence of wrong verdicts As previously depicted, the chosen approach for Precision Maximization imposes that a classification result is accepted only if both unidirectional classifiers emit the same verdict. Therefore, a desirable property for the maximization of precision would be an high rate of coincident hits (i.e. the two classifiers emit the correct verdict) and a low rate of coincident fault. This property holds when the classifiers faults are independent or, at least weakly dependent, therefore a measure of independence for the faulty classification is needed. Several works of literature such as [8, 9] adopt the so-called diversity measures in order to determine the degree of dependence of classifiers. One of them is the double fault measure Df (defined by G.Giacinto et al. [10]) that allows to determine the rate of double fault of two-class classifiers. Since in this paper we are working with a couple of multiple-class classifiers that can emit two types of verdict (i.e., the pattern belongs to ω i Ω or it belongs to ω n+1 ) we need to refine the definition of double fault measure given in. [10] taking into account the fact that two classifiers can fail emitting two distinct wrong verdicts. We are interested in the analysis of the frequency of coincident faulty verdicts, therefore we state that a coincident fault happens when both unidirectional classifiers emit the same wrong verdict. Hence, the frequency of coincident fault Df(i) for the i th class is: Df(i) = n j=1 j i a (ωi) j,j, and then the overall coincident fault frequency is simply derived by the following: Df = 1 n+1 Df(i). n + 1 The degree of dependency of the classifiers is obtained by comparing Df with the coincident fault probability of independent classifiers Df 0 and the coincident fault probability of positively correlated classifiers Df 1. In our case the behavior of F I and F R classifiers are known and the probability that both classifiers assign a pattern to a class are: i=1

7 Classification of emerging protocols in the presence of asymmetric routing 7 P 0 (i, j) = P 0 ( ˆω c = ˆω s = ω j ω i ) = P ( ˆω c = ω j ω i ) P ( ˆω s = ω j ω i ), if the two classifiers are independent and: P 1 (i, j) = min (P ( ˆω c = ω j ω i ), P ( ˆω s = ω j ω i )), in case the classifiers are correlated. Therefore the probability of coincident fault for the i th class can be estimated as follows: Df l (i) = n P l (i, j). (5) j=1 j i Here l = 0 gives the double fault measure for independent classifiers and l = 1 is related to the completely dependent classifiers. The estimation of overall coincident fault probability is therefore: Df l = 1 n+1 Df l (i), n + 1 i=1 where l assumes the same meaning of Equation 5. Finally, the parameter P can be obtained as follows: P = 1 Df Df 0 Df 1 Df 0. Here P assumes nonnegative values. If 0 < P < 1 the classifiers are positively correlated in emitting the wrong verdict, otherwise (i.e., P > 1) they are negatively correlated. 5 Datasets We base our experimental analysis on the results obtained applying the classifiers described in this paper to three different datasets, as described below. 5.1 UNIBS dataset (2006) The packet traces from this set were collected in early 2006 at the border router of of our Faculty network. Having full monitor access to this router, we can apply pattern-matching mechanisms to assess the actual application that has generated each TCP flow, in some cases with the help of manual inspection. Because of this, we consider the traffic information derived from UNIBS relatively reliable with respect to the pre-classification, i.e., with respect to

8 8 M. Crotti, F. Gringoli, L. Salgarelli knowing, independently from our classifier, which application generated each flow ( ground truth ). Our network infrastructure comprises several 1000Base-TX segments routed through a Linux-based dual-processor box and includes about a thousand workstations with different operating systems. All the traffic traces were gathered on the 100Mb/s link connecting the edge router to the Internet for a period of three weeks: a total of 50GB of traffic was collected by running Tcpdump [11] for fifteen minutes every hour. According to the optimization procedure, three different and independent datasets have been collected during the course of several weeks: A training set composed of six protocols. For each protocol we fixed a maximum amount of 20 thousand flows. We chose to train the classifier with the protocols a network administrator would be interested to monitor, such as mail, web, chat and file transfer, and that in the each of the datasets account for the vast majority of exchanged traffic. An optimization set composed of the six protocols mentioned above and another set of flows, G ωn+1 that have not been generated by the applications that appear in the training set. To obtain G ωn+1 we simply collected the network traffic at the edge router and then we extracted the certified G Ω flows. The remaining flows compose G ωn+1. A payload inspection of flows composing G ωn+1 for the UNIBS dataset revealed a wide variety of protocols, such as Kazaa, Gnutella, SSH, Netbios,.... An evaluation set E = E Ω E ωn+1 composed by the same protocols of the optimization set and used to verify the classifier s ability to recognize protocols different than those used during the training phase. 5.2 LBNL dataset ( ) The LBNL traffic traces we used were collected at the Lawrence Berkeley National Laboratory under the Enterprise Tracing Project [12]. The packet traces were obtained at the two central routers of the LBNL network and they contain more than one hundred hours of traffic generated from several thousand internal hosts. The traffic traces are public, but they are completely anonymized, so ascertaining the ground truth on the application behind each recorded flow is not possible. Therefore, for this set, we built protocol sets according to the TCP destination port number of each flow, an accepted practice in these cases [13, 14]. We used the traffic traces captured on December 15 and 16, 2004 to obtain the training and the optimization sets and those captured on January 6 and 7, 2005 to build the evaluation set. Once again we performed the training by using the most frequently used port numbers in the dataset.

9 Classification of emerging protocols in the presence of asymmetric routing 9 CAIDA UNIBS LBNL F I F R F I F R F I F R Recall ω n+1 Ω ω i ω j Table 1. Unidirectional classification: optimal results obtained after the analysis of the first three data packets. 5.3 CAIDA dataset (2002) We built this data set starting from three hour long traces obtained by the Cooperative Association for Internet Data Analysis (CAIDA) [15], and collected at the AMES Internet Exchange (AIX) along an OC48 link on August 14, We used flows extracted from the first hour (corresponding to the interval UTC) to build the training set the optimization set and from the third hour ( UTC) to buld the evaluation set. As for the previous set, these traces are also anonymized, so port numbers are used as indicators of each protocol. The selection of flows composing the training, optimization and evaluation sets has been made following the same approach adopted for the LBNL and UNIBS datasets. 6 Results 6.1 Unidirectional classification Here we report the classification results obtained by tuning the unidirectional classification algorithm with the optimum parameters. For each of the three datasets we show the overall recall, the misclassification rate for untrained protocols (i.e., ω n+1 Ω) and the misclassification rate for trained protocols (i.e., ω i ω j ). As it can be seen in Table 1, the classifier takes the right decision in almost 90% of cases for the UNIBS and LBNL dataset after the analysis of the first three F I or F R data packets. A small portion of traffic belonging to trained protocols is misclassified but a noteworthy portion of traffic belonging to the ω n+1 class is incorrectly marked as belonging to the Ω set. The CAIDA dataset exibits the worst classification results since the classifier reaches, in the best case, only the 83% of recall and a large portion of ω n+1 traffic is labeled as belonging to Ω set. 6.2 Bidirectional classification Here we show classification results obtained by applying the two bidirectional classification techniques described in Section 3.

10 10 M. Crotti, F. Gringoli, L. Salgarelli CAIDA UNIBS LBNL MaxP MAP MaxP MAP MaxP MAP Recall ω n+1 Ω ω i ω j P R Table 2. Classification results obtained with the two bidirectional classifiers. As can be seen in Table 2 where the bold numbers evidence a classification verdict that surpasses the best results obtained using a unidirectional classifier, the effects of combining classification verdicts has a positive impact on the Recall parameter when a MAP approach is adopted (it is always greater than 90% for each of the three network environments) and brings to substantive reduction of misclassification rates choosing a MaxP approach that, in the worst case decreases of more than 40% and in the best case of over 90% (CAIDA dataset, ω n+1 Ω). Finally the last line of the table shows the values of the comparison parameters P and R that will be taken into account in the next Section where the unidirectional and bidirectional classification results will be compared. 7 Discussion Looking at the results obtained for unidirectional and bidirectional classification, the impact of asymmetric routing on the correctness of classifier s verdict is clearly visible : we are losing both in classification accuracy and precision when we choose to (or we are forced to) deploy a unidirectional classifier. In the following, first we analyze the overall results evidencing the differences between unidirectional and bidirectional classifiers on the different network environments, then we deepen the analysis on each network environment showing how the classification accuracy, for different protocols, often depends on the direction of half-flows. 7.1 Overall results The results obtained by combining unidirectional classifiers are, for the CAIDA dataset, definitely better than those obtained with unidirectional classification. As shown in Table 2, with a MaxP approach the misclassification of flows belonging to untrained protocols falls from 60% to 5%. This behavior, due to the substantial incorrelation between F I and F R classifiers in emitting wrong verdicts is well captured by the comparison parameter P that measures the tendency of the unidirectional classifiers in emitting the same wrong verdict. This fact reflects also on the classification results obtained applying

11 Classification of emerging protocols in the presence of asymmetric routing 11 a MAP approach: the misclassification rates substantially decrease and the overall recall that increases from 83% to 91% is evidenced by R parameter that, as explained in Section 4, measures the recall improvement between unidirectional and bidirectional classification results. Similar values of the comparison parameters P and R hold for the UNIBS dataset and as a consequence a similar behavior is observed in terms of improvement of overall recall and decrease of overall misclassification rate. Obviously, since the accuracy of the unidirectional classifiers of the UNIBS dataset is greater than the one of the CAIDA dataset (e.g., the misclassification rate of ω n+1 traffic settles around 30% in the worst case), the accuracy improvement is not as remarkable as the one obtained for the CAIDA network environment. On the other hand the low value of comparison parameter P computed for the LBNL dataset evidences a substantial correlation between the two classifiers verdicts. This reflects on the classification results of the bidirectional classifiers: applying a MaxP approach, the misclassification rate of ω n+1 traffic is not dramatically different from the one obtained by the best unidirectional classifier (it shifts from 24.1% to 13.5%) and the MAP approach slightly improves the classification results in comparison to the ones of the F I unidirectional classifier. 7.2 On the impact of contrasting classifiers verdicts In the previous section we stated that the correlation between F I and F R classifiers verdicts can impact on the results of bidirectional classification. Here we deepen the analysis showing what could happen if classifiers verdicts differ. CAIDA UNIBS F I F R Joint F I F R Joint Port R P R P MAP MaxP Protocol R P R P MAP MaxP torrent ω n http LBNL F I F R Joint Port R P R P MAP MaxP Table 3. Example of different accuracy in unidirectional classification In Table 3 for each of the three analyzed dataset, we show a couple of protocols that reach high accuracy in classification (in terms of precision and recall) just for one traffic direction.

12 12 M. Crotti, F. Gringoli, L. Salgarelli It can be seen that the pairing techniques show classification results that, at least, are comparable with the best results obtained for unidirectional classification (e.g. the results shown for the LBNL dataset) but if the classifiers are not strictly correlated in emitting the classification verdicts, as in the case of ω n+1 traffic for the CAIDA dataset or HTTP traffic for the UNIBS dataset, the accuracy of the bidirectional classification surpass the best unidirectional classifier results. Once more we are demonstrating that when unidirectional classifiers exhibit uncorrelated behaviors, an improvement in classification results could be expected if classifiers verdicts are combined. We are still not able to determine which network conditions make the unidirectional classifiers emitting uncorrelated verdicts but we are able to determine the degree of correlation between those verdicts an then, we can qualitatively estimate the performance improvement that could be expected combining the classification verdicts. 8 Conclusions In this paper we have proposed a supervised statistical traffic classifier based on the analysis of a small set of features of the network traffic. Our aim is the development of an effective classification system that can deal with the continuous growth of emerging communication protocols. Such growth, that is typical of IoT scenarios, requires a classification system that should, at least, be able to recognize if a communication belongs to a set of services or it belongs to a new protocol in order to implement on the (un)classified flows QoS policies or security policies. In order to deal with another emerging practice, i.e., asymmetric routing, that is spreading even on the last legs of Internet connectivity, we have also analyzed how the precision of the statistical traffic classifier can be affected by the availability of both halves of flows during the decision process. Experimental results support a hypothesis that is quite easy to understand: the classifier that can work on flows in their entirety achieves better results than one that is forced to operate on half-flows only. However, reducing our analysis to a mere empirical proof of this assumption, would be a mistake. First of all, the numerical comparison between the unidirectional and bidirectional classifiers points out that the loss of accuracy when only half-flows are available is much more substantial in terms of increased false positives as opposed to decreased true positives. Second, the accuracy loss of unidirectional classifiers varies through the different network environments: the more heterogeneous a network environment is, the less the unidirectional classification is precise. Finally, the analysis points out that some protocols seem to exhibit more statistical correlation between the two half-flows composing each session than others. We think that this fact needs to be taken into consideration when designing mechanisms that measure traffic flows, including traffic classifiers:

13 Classification of emerging protocols in the presence of asymmetric routing 13 certain protocols are more amenable than others at reducing their observation to only one direction of traffic. References 1. Jan S. Rellermeyer, Michael Duller, Ken Gilmer, Damianos Maragkos, Dimitrios Papageorgiou, and Gustavo Alonso. The software fabric for the internet of things. In IOT, pages , Z. Morley Mao, L. Qiu, J. Wang, and Y. Zhang. On AS-level path inference. In SIGMETRICS 05: Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, pages , New York, NY, USA, ACM. 3. Y. He, M. Faloutsos, and S. Krishnamurthy. Quantifying routing asymmetry in the internet at the as level. In Proceedings of the GLOBECOM 2004 Conference, Dallas, Texas, USA, IEEE Computer Society Press. 4. The Cooperative Association for Internet Data Analysis (CAIDA) - Observing routing asymmetry in Internet traffic M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli. Traffic Classification through Simple Statistical Fingerprinting. ACM SIGCOMM Computer Communication Review, 37(1):5 16, Jan M. Dusi, F. Gringoli, and L. Salgarelli. IP Traffic Classification for QoS Guarantees: the Independence of Packets. In Proceedings of The 1st IEEE International Workshop on IP Multimedia Communications (IPMC 2008), St. Thomas, U.S. Virgin Islands, Aug Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, October L. I. Kuncheva and C. J. Whitaker. Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy. Mach. Learn., 51(2): , D. Chen, D. Hua, K. Sirlantzis, and X. Ma. On the Relation Between Dependence and Diversity in Multiple Classifier Systems. In ITCC 05: Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC 05) - Volume I, pages , Washington, DC, USA, IEEE Computer Society. 10. G. Giacinto, F. Roli, and G. Fumera. Design of effective multiple classifier systems by clustering of classifiers. In Proc. of ICPR2000, 15th Int. Conference on Pattern Recognition, pages 3 8, Tcpdump/Libpcap LBNL/ICSI Enterprise Tracing Project T. Karagiannis, A. Broido, M. Faloutsos, and K. C. Claffy. Transport layer identification of P2P traffic. In IMC 04: Proceedings of the 4th ACM SIGCOMM conference on Internet measurement, pages , New York, NY, USA, ACM Press. 14. T. Karagiannis, A. Broido, N. Brownlee, K.C. Claffy, and M. Faloutsos. Is P2P dying or just hiding? In Proceedings of the GLOBECOM 2004 Conference, Dallas, Texas, USA, IEEE Computer Society Press. 15. The Cooperative Association for Internet Data Analysis (CAIDA).