Monitoring Large Flows in Network



Jing Li, Chengchen Hu, Bin Liu
Department of Computer Science and Technology, Tsinghua University
Beijing, P. R. China, 100084
{ l-j02, hucc03 }@mails.tsinghua.edu.cn, liub@tsinghua.edu.cn

ABSTRACT
In order to provide high-quality network management, traffic scheduling and network security, we need per-flow traffic information. However, on a high-speed network link the number of flows is so large that it is infeasible to gather and process the information of all flows accurately and efficiently. Several studies have shown that a small percentage of flows accounts for most of the total network throughput. Therefore, monitoring just these high-speed flows is an efficient and practical method of traffic control. Before large flows can be monitored, they must first be identified. Several papers have studied the problem of finding large flows, but the proposed methods are either too complicated or demand relatively large memory and computation resources. In this paper, we present and analyze a simple but efficient algorithm to identify the elephant flows. The proposed method needs only a few counters and can find nearly all of the predefined large flows.

KEY WORDS
Network, Measurement, Monitoring, Large flow, QoS

1. Introduction
Accurate traffic measurement and monitoring are critical to network management. Monitoring per-flow traffic poses a great challenge because the number of simultaneously existing flows may be extremely large. Fang and Peterson [1] used a variety of traces to show that the number of flows between end-host pairs can reach up to one million in one hour. The same paper reported over 0.5M simultaneously existing flows even with aggregation. To keep up with the increase in the number of flows, a naïve solution would adopt DRAM to maintain the measured records on a per-flow basis. However, the gap between DRAM speed and line speed makes it infeasible to update per-flow/per-packet counters.
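The gap between DRAM speed and line speed can be illustrated with a quick back-of-envelope calculation (our own sketch, not from the paper; the link rates, the 40-byte minimum packet and the ~50 ns DRAM random-access latency are assumed figures):

```python
# Back-of-envelope check (illustrative assumptions, not figures from the
# paper): why per-packet DRAM counter updates fall behind at high line rates.

def per_packet_budget_ns(link_bps: float, min_packet_bytes: int) -> float:
    """Time available to process one packet at line rate, in nanoseconds."""
    packets_per_sec = link_bps / (min_packet_bytes * 8)
    return 1e9 / packets_per_sec

dram_rmw_ns = 2 * 50.0  # assumed: one read + one write per counter update

for name, rate in [("OC-48 (2.5 Gb/s)", 2.5e9), ("OC-192 (10 Gb/s)", 1e10)]:
    budget = per_packet_budget_ns(rate, 40)
    verdict = "ok" if dram_rmw_ns <= budget else "too slow"
    print(f"{name}: {budget:.0f} ns per packet, DRAM update {dram_rmw_ns:.0f} ns -> {verdict}")
```

With these assumed numbers, a 40-byte-packet stream at 10 Gb/s leaves only 32 ns per packet, well below one DRAM read-modify-write, which is the infeasibility the paragraph above refers to.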
Here we meet the paradox of memory size and speed. One solution to this paradox is sampling. Take Cisco NetFlow [2] as an example: as a flow-level measurement tool for backbone links, it adopts DRAM to keep its flow counters and updates statistics only for sampled packets. However, it has two problems. One is processing overhead: updating the DRAM slows down the forwarding rate. The other is storage overhead: the amount of data generated by NetFlow can overwhelm the collection server or its network connection. For example, article [3] reports a loss rate of up to 90% when using basic NetFlow.

A common observation found in many measurement studies [1, 3] is that a small percentage of flows accounts for a large percentage of the total traffic. [1] shows that 9% of the flows account for 90% of the byte traffic. Furthermore, in many circumstances, knowledge of these large flows is sufficient. [1] suggested achieving scalable DiffServ by providing selective treatment only to a small number of large flows. [3] underlines the importance of knowledge of "heavy hitters" for decisions about network upgrades. Thus, Estan et al. [4] conclude that it is infeasible to accurately measure all flows on high-speed links, but that many applications do benefit from the accurate measurement of only a few large flows. Therefore, it is efficient and practical to shift from per-flow monitoring to large-flow monitoring.

In this paper, we propose a new algorithm that identifies the active large flows in network traffic. Only a small external SRAM or on-chip embedded memory is required to hold the counters for the large flows, which makes the scheme well suited to high-speed networks. A flow identifier is a computational function of some related fields within the IP header, which exclusively distinguishes one flow from another. Large flows are defined as those that send more than a given threshold of the link capacity (say 1%) during a given measurement interval (e.g. one second).
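A flow identifier of the kind just described can be sketched as follows (our own illustration, not the paper's code; the 5-tuple key and the use of hashlib are assumptions — the paper's own simulation in Section 5 keys flows on the source IP alone):

```python
# Illustrative sketch: a flow identifier computed as a function of
# IP-header fields, here the conventional 5-tuple. Any deterministic
# function that separates flows would do; sha1 is an arbitrary choice.
import hashlib

def flow_id(src_ip: str, dst_ip: str, proto: int, src_port: int, dst_port: int) -> int:
    """Map a packet's 5-tuple to a fixed-size flow identifier."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")

# Packets of the same flow map to the same ID; distinct flows map to
# distinct IDs (up to hash collisions).
a = flow_id("10.0.0.1", "10.0.0.2", 6, 1234, 80)
b = flow_id("10.0.0.1", "10.0.0.2", 6, 1234, 80)
c = flow_id("10.0.0.3", "10.0.0.2", 6, 1234, 80)
print(a == b, a != c)  # prints: True True
```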
The proposed algorithm requires a small amount of memory and reports, at the end of the measurement interval, the flow IDs of the flows that exceed a predefined threshold. We evaluate the proposed algorithm in the following three aspects:
1. Hit rate: the number of large flows found divided by the actual total number of large flows.
2. Redundancy: the number of all flows reported by our algorithm divided by the actual number of large flows.
3. Searching delay: how long it takes to find a large flow.

The remainder of the paper is organized as follows. Section 2 describes related work. Sections 3 and 4 introduce a large flow monitoring architecture together with a large flow identification algorithm. Section 5 presents simulation results on a real trace to verify our algorithm. Section 6 compares our algorithm with the sample and hold algorithm given by [4]. Section 7 discusses some implementation issues. Section 8 concludes the paper.

2. Related Work
Many applications, such as DDoS attack detection and network troubleshooting, benefit from fast and accurate flow monitoring. As mentioned before, flow monitoring is becoming increasingly challenging as transmission links become faster and the absolute volume of network traffic increases sharply. A network monitor must examine every incoming packet and run quickly enough to keep up with the network speed; otherwise it faces the risk of missing packets. Note that although the large number of small flows carries very little traffic, it challenges the scalability of per-flow monitoring in the Internet, especially on high-speed links. Several papers have proposed ways to deal with this situation; they suggest giving priority to the monitoring of large "elephant" flows and neglecting the small "mice" [4, 5].

The first work [4] presented two approaches to identify large flows. The first one (sample and hold) applies packet sampling, but after packet classification: if a packet belongs to a flow record already present in the flow cache, the flow record is updated; if not, a new flow record is created with probability p. Thus, once created, a flow record accounts for all subsequent packets of the designated flow. The second approach spends S seconds identifying (through a multistage hash filter and a threshold value T) the flows that bring in more than T traffic. During the next S seconds, only traffic coming from these flows is collected; in parallel, a new identification phase begins, and its results are used to select the flows for the following S seconds. The second work [5] proposed to report, from the meter to the collector, only the flow records whose size is greater than a threshold L.
Flows smaller than L are reported with a probability that decreases with the size itself. The paper then describes how to compensate at the collector for the unreported flows, so that a generic application can build reliable, unbiased estimates of a generic flow aggregation (called a "color"). The approach targets the reduction of reporting traffic between the meter and the collector, but assumes that the meter's memory and processing resources are large and fast enough to record all the simultaneous flows. The third work [6] supposes that the flow cache can host a limited number n of records. As long as its occupancy is below n, a new flow record is created for each arriving flow. Once the flow cache occupancy reaches n, the scheme tries to keep in the cache the flows receiving the greatest amount of traffic. The work presents a methodology (GROWTH RATE) that, through a metric, decides whether to keep the current pool of monitored flows or to admit a new one, removing the flow currently having the lowest metric value. The methodology computes the metric of a newly arriving flow and of all existing flows to make a decision. However, it needs to estimate the flow growth rate and prediction time; furthermore, it needs an additional mechanism to find the flow with the smallest metric.

3. Large Flow Monitoring Architecture
Figure 1 shows our large flow monitoring architecture. It includes three units: the Large Flow List Unit, the Large Flow Filter Unit and the Timer Unit. The Large Flow List Unit keeps the IDs of all large flows that have been identified by the proposed algorithm, the Large Flow Filter Unit finds the elephant flows, and the Timer Unit clears the Large Flow List at every preset time interval. We use the following method to identify the large flows. When a packet arrives, we extract the packet's flow ID and look it up. If it is in the Large Flow List, we transfer the packet to the Application Specific Processing Unit, which processes the information of the incoming packet in depth.
For example, the Application Specific Processing Unit can be a traffic manager that schedules the large flows differently, or an IDS that determines whether the flows belong to a DoS attack. If the flow ID is not in the Large Flow List, we transfer the packet to the Flow Filter Unit to determine whether it belongs to a newly arriving large flow. Even if a flow is temporarily identified as a large one, it might turn into a small one some time later, so after a time interval the Timer Unit clears the Large Flow List.

Figure 1: Large flow monitoring architecture

4. Algorithm
In this section, we describe the large flow filtering algorithm in detail. Here, large flows are defined as flows whose speed is more than one percent of the line speed. Before presenting the algorithm's formal description, we first introduce the basic idea. We select a flow randomly and add its total throughput into a counter; at the same time we decrease the counter at a rate of linespeed/100. If the flow's speed is more than one percent of the line speed, the counter will keep increasing and pass a predefined threshold. If not, the counter will keep decreasing toward zero. So, by observing the counter's value, we can distinguish whether a flow is a big one or a normal one. If it is a big one, we put it into a large flow list; if it is a normal one, we simply ignore it and select another flow to continue the large flow identification process. There exists a problem here: if we use only one counter, then while we count one flow's throughput, packets belonging to other flows are ignored. To avoid this, we use an array of counters; every flow's packets are first hashed to one of these counters, and then we add the packet's length to the counter and decrease the counter as described above.

Now, we describe our algorithm formally. We use a flow record array, called flow_record[], to filter the big flows. We define init_time as the algorithm's start time and current_time as a packet's arrival time. We use an assistant counter called meter: we set meter = LineSpeed*(current_time - init_time)/100 when a packet arrives, so the meter's value is updated every time a new packet comes. Every flow record has two fields. One field, called flow_id, stores the currently monitored flow's identifier; the other, called counter, stores the monitored flow's throughput plus the meter's value at the time the monitored flow's ID was written to the flow record. When a packet, call it packet_one, arrives, we hash the packet's flow ID to one of the flow records, call it flow_record[x].
1) If the value of flow_record[x].counter is smaller than the meter's, we stop monitoring the flow represented by flow_record[x].flow_id, replace the value of flow_record[x].flow_id with the flow ID of packet_one, and set flow_record[x].counter to the meter's current value plus the packet length of packet_one.
2) If flow_record[x].counter is greater than the meter's and the flow ID of packet_one is equal to flow_record[x].flow_id, we add the length of packet_one to flow_record[x].counter. Otherwise we ignore the newly arriving packet. If flow_record[x].counter minus the meter's value is greater than the predefined threshold, we put flow_record[x].flow_id into the large flow list and clear the corresponding counter to zero.
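Rules 1) and 2) above can be rendered in executable form as follows (our own Python sketch, not the paper's implementation; Python's built-in hash() stands in for the hash function, and the line speed and threshold constants are illustrative assumptions):

```python
# Runnable sketch of the per-packet update rules 1) and 2) above.
LINE_SPEED = 100_000   # assumed link speed in bytes/s; the meter grows at 1% of it
THRESHOLD = 5_000      # bytes by which a counter must exceed the meter to be reported
COUNTER_NUM = 100

flow_record = [{"flow_id": None, "counter": 0.0} for _ in range(COUNTER_NUM)]
large_flows = set()    # the Large Flow List
init_time = 0.0

def on_packet(flow_id, length, current_time):
    meter = (current_time - init_time) * LINE_SPEED / 100
    rec = flow_record[hash(flow_id) % COUNTER_NUM]
    if rec["counter"] < meter:              # rule 1: evict and monitor this flow
        rec["flow_id"] = flow_id
        rec["counter"] = meter + length
    elif flow_id == rec["flow_id"]:         # rule 2: accumulate the monitored flow
        rec["counter"] += length
    if rec["counter"] - meter > THRESHOLD:  # report a large flow and reset
        large_flows.add(rec["flow_id"])
        rec["counter"] = 0.0

# Synthetic traffic over 20 s: one flow at 5% of line speed (5 B every 1 ms)
# and one at 0.1% (5 B every 50 ms). Only the fast flow should be reported.
for k in range(1, 20001):
    t = k / 1000.0
    on_packet("elephant", 5, t)
    if k % 50 == 0:
        on_packet("mouse", 5, t)
print("elephant" in large_flows, "mouse" in large_flows)  # prints: True True -> no; prints: True False
```

The fast flow's counter gains on the meter (5% versus 1% of line speed) until it crosses the threshold, while the slow flow's record is always re-seeded by rule 1 before it can accumulate anything.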
We note that every time a flow record's flow_id field is set, the record's counter field is set to the meter's current value plus the incoming packet's length. From that time on, the flow record's counter increases at the speed of the flow while the meter's value increases at one percent of the link speed. So, by comparing the counter's value and the meter's value, we can determine whether a flow's speed exceeds one percent of the line speed. By adjusting the threshold value, we can control how long a flow must stay fast before we put it into the large flow list. The algorithm's pseudo code is presented below:

    double init_time;
    double current_time;
    int Line_Speed = LINESPEED;
    int Threshold = THRESHOLD;
    int counter_num = COUNTER_NUM;
    int meter = 0;
    typedef struct record {
        int flow_id;
        int counter;
    } Record;
    Record flow_record[COUNTER_NUM];
    List Badflowidlist;

    for (int i = 0; i < counter_num; i++)
        flow_record[i].counter = 0;
    init_time = 0;
    while (true) {
        wait for a packet p to come;
        current_time = p's arrival time;
        meter = (current_time - init_time) * Line_Speed / 100;
        int temp = hash(p.flow_id);
        if (flow_record[temp].counter < meter) {
            flow_record[temp].counter = p.length + meter;
            flow_record[temp].flow_id = p.flow_id;
        } else if (p.flow_id == flow_record[temp].flow_id) {
            flow_record[temp].counter = flow_record[temp].counter + p.length;
        }
        if ((flow_record[temp].counter - meter) > Threshold) {
            Badflowidlist.add(flow_record[temp].flow_id);
            flow_record[temp].counter = 0;
        }
    }

5. Real Trace Simulation
In the real network trace simulation, we used traces provided by http://pma.nlanr.net. We defined all the packets from the same source IP as a flow. We tested our algorithm on two network traces: the first trace is an OC-3 trace and the second one is an OC-12 trace. In our simulation, we used one hundred counters, i.e. we set counter_num = 100. For the OC-3 trace simulation, we set the threshold to (10+i) KB, 0 <= i < 10; if a counter exceeds the threshold, our algorithm reports the corresponding flow_id. For the OC-12 trace simulation, we set the threshold to (40+4i) KB, 0 <= i < 10.

Figure 2 shows the average number of large flows (those whose throughput is over one percent of the link rate). This value is obtained through offline computation. Both the OC-3 and OC-12 traces contain 90 seconds of IP packet headers. We compute the number of flows exceeding 1% of the link bandwidth every second, and then average over the 90 seconds. We find that in the OC-3 trace about thirty flows have a throughput of more than one percent of the link rate, whereas in the OC-12 trace about four flows do.

Figure 2: Large flow number

We define hit rate as the number of large flows found divided by the actual number of large flows, as defined before. Figure 3 (a) shows the hit rate of our algorithm for different thresholds on the OC-3 trace, and Figure 3 (b) shows the hit rate for different thresholds on the OC-12 trace. We can see that, although few counters are used (compared to [4]), the proposed algorithm finds the flows with throughput above one percent of the link throughput very accurately.

Figure 3 (a): Hit rate (OC-3). Figure 3 (b): Hit rate (OC-12)

Redundancy is defined as the number of all flows reported divided by the actual number of large flows, as defined before. Figure 4 (a) presents the redundancy of the proposed algorithm for different thresholds on the OC-3 trace, while Figure 4 (b) presents the redundancy on the OC-12 trace. When the threshold is relatively low, the redundancy is two to three. As the number of large flows is not very large, this redundancy is acceptable. To find all flows above one percent, the corresponding redundancy of the algorithms proposed in [3] is also on the order of one hundred, so our algorithm is more memory efficient.

Figure 4 (a): Redundancy (OC-3). Figure 4 (b): Redundancy (OC-12)

We define delay as the time difference between the arrival of the first packet of a large flow and the time when the large flow is identified by the proposed algorithm. Figures 5 (a) and (b) show the delay of the proposed algorithm on the OC-3 and OC-12 traces respectively. From the simulation results on the two traces, the delay of the proposed algorithm is about 0.1 second. This delay is necessary in order to identify a large flow, and it is not very long.

Figure 5 (a): Delay (OC-3). Figure 5 (b): Delay (OC-12)

6. Comparison with the Sample and Hold Algorithm
Now we compare the algorithm presented in this paper with the sample and hold algorithm introduced by [4]. As listed in Figure 6, both algorithms can achieve a very good hit rate: with sample and hold you need to compute the sample probability, and with our algorithm you need to properly set the threshold.
Our algorithm achieves very low redundancy; in other words, most of the flows reported by our algorithm are big flows. With sample and hold, to find the flows whose throughput is more than one percent of the link throughput, it reports tens of thousands of flows, and it is your own task to find the real big ones among them. Both algorithms need to process every packet, but our algorithm's computational complexity is very low; in contrast, sample and hold needs to implement a probabilistic decision that is not easy to realize, especially at high speed. Our algorithm needs only very little memory to work: to find the flows whose throughput is more than one percent of the link throughput, one hundred records are enough for our algorithm, whereas sample and hold needs one hundred times as much memory. In high-speed routers, more memory means more power consumption and degraded stability. Figure 6 lists some characteristics of the algorithm presented in this paper and of the sample and hold algorithm, under the assumption that we want to find the flows whose throughput is more than one percent of the total link throughput.

                              Our Algorithm                           Sample and Hold
    Hit rate                  almost 100% if the threshold            almost 100% if the sample
                              is properly set                         probability is pre-computed
    Redundancy                very low, far below ten                 very high, around one hundred
    Memory requirement        very small, one hundred records         very large, tens of thousands
                              suffice                                 of records
    Computation complexity    some additions, subtractions            requires a probabilistic
                              and comparisons                         decision per packet

Figure 6: Comparison with sample and hold

7. Implementation Issues
To implement the algorithm proposed in this paper, we first have to determine whether a flow ID is in the Large Flow List. This is a time-consuming task if the Large Flow List contains too many entries. The problem can be solved by adopting a small CAM in a hardware implementation, or by organizing the Large Flow List as a hash table in a software implementation.

8. Conclusion
In this paper, we present a simple but novel algorithm to identify large flows, analyze its performance, and demonstrate its practicability through real network trace simulation. In future research, we will address questions such as how to use the information about these large flows to aid in implementing QoS and network security, as well as high-quality network management and measurement.

9. Acknowledgment
This work is supported by NSFC (No. 60173009 and No. 60373007), the China 863 High-tech Plan (No. 2003AA115110 and No. 2002AA103011-1) and the China/Ireland Science and Technology Collaboration Research Fund (CI-2003-02).

References
[1] W. Fang and L. Peterson, Inter-AS traffic patterns and their implications, Proceedings of IEEE Global Telecommunications Conference 1999, Rio de Janeiro, Brazil, 1999, 1859-1868.
[2] Cisco NetFlow, http://www.cisco.com/warp/public/732/Tech/netflow.
[3] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True, Deriving traffic demands for operational IP networks: methodology and experience, IEEE/ACM Transactions on Networking, 9(3), 2001, 265-280.
[4] C. Estan and G. Varghese, New directions in traffic measurement and accounting, Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, San Francisco, CA, 2001, 75-80.
[5] N.G. Duffield, C. Lund, and M. Thorup, Charging from sampled network usage, Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement, San Francisco, CA, 2001, 245-256.
[6] M. Molina, A scalable and efficient methodology for flow monitoring in the Internet, Proceedings of the 18th International Teletraffic Conference, Berlin, Germany, 2003.