Get Your FIX: Flow Information export Analysis and Visualization Joint Techs Workshop, Madison, Wisconsin, July 19, 2006 Dave Plonka plonka@doit.wisc.edu Division of Information Technology, Computer Sciences Wisconsin Advanced Internet Lab
Outline
Visualization Tools
What is FlowScan?
What are IP Flows?
Interpreting Sample FlowScan Graphs
FlowScan Hardware & Software Components
Graphs of Network Events & Anomalies
Examining Traffic Rate Measurements at Fine Timescales
What is FlowScan? FlowScan is a freely-available network traffic reporting and visualization tool. Its development began in December 1998, and it was first released in March 2000. There are hundreds of users today including campuses and ISPs. FlowScan analyzes data exported by Internet Protocol routers.
What does FlowScan do? FlowScan counts IP flows by protocol, application, user population, or Internet connection. Protocols include TCP and UDP. Applications include email (SMTP) and file sharing (e.g. KaZaA). User populations are subnets such as schools or departments. Internet connections are transit and peering links between Autonomous Systems.
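The counting described above can be sketched in a few lines of Python. This is only an illustration, not FlowScan's own code or configuration: the port map, the subnet table, and the flow dictionary keys are all hypothetical.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup tables; a real deployment would define its own.
PROTOCOLS = {1: "icmp", 6: "tcp", 17: "udp"}
APP_PORTS = {25: "smtp", 80: "http", 1214: "kazaa"}
POPULATIONS = {
    "engineering": ipaddress.ip_network("10.1.0.0/16"),
    "housing": ipaddress.ip_network("10.2.0.0/16"),
}


def classify(flow):
    """Return (protocol, application, population) labels for one flow dict."""
    proto = PROTOCOLS.get(flow["proto"], "other")
    # Match on either port, since the well-known port may be on either end.
    app = (APP_PORTS.get(flow["dst_port"])
           or APP_PORTS.get(flow["src_port"])
           or "other")
    src = ipaddress.ip_address(flow["src"])
    dst = ipaddress.ip_address(flow["dst"])
    population = "unknown"
    for name, net in POPULATIONS.items():
        if src in net or dst in net:
            population = name
            break
    return proto, app, population


def count_flows(flows):
    """Count flows per protocol, per application, and per user population."""
    counts = {"protocol": Counter(), "application": Counter(),
              "population": Counter()}
    for flow in flows:
        proto, app, population = classify(flow)
        counts["protocol"][proto] += 1
        counts["application"][app] += 1
        counts["population"][population] += 1
    return counts
```

Each flow is counted once in every dimension, so the per-protocol, per-application, and per-population totals are independent views of the same traffic.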
What is a Flow? An IP flow is a unidirectional series of IP packets of a given protocol (and port where applicable), traveling between a source and destination, within a certain period of time. K. Claffy, G. Polyzos, H.-W. Braun, c. 1993.
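A minimal sketch of this definition: packets that share a source, destination, protocol, and port pair are grouped into one unidirectional flow, and a gap longer than an idle timeout starts a new flow. The 60-second timeout and the packet tuple layout are assumptions made for illustration, not values taken from any particular router.

```python
from dataclasses import dataclass

IDLE_TIMEOUT = 60.0  # seconds; an assumed value for illustration


@dataclass
class Flow:
    key: tuple      # (src, dst, proto, src_port, dst_port)
    first: float    # timestamp of the first packet
    last: float     # timestamp of the most recent packet
    packets: int = 0
    octets: int = 0


def packets_to_flows(packets, idle_timeout=IDLE_TIMEOUT):
    """Summarize time-ordered packets into unidirectional flows.

    Each packet is (timestamp, src, dst, proto, src_port, dst_port, length).
    """
    active = {}
    finished = []
    for ts, src, dst, proto, sport, dport, length in packets:
        key = (src, dst, proto, sport, dport)
        flow = active.get(key)
        if flow is not None and ts - flow.last > idle_timeout:
            # Idle for too long: close the old flow and start a new one.
            finished.append(flow)
            flow = None
        if flow is None:
            flow = Flow(key, ts, ts)
            active[key] = flow
        flow.last = ts
        flow.packets += 1
        flow.octets += length
    finished.extend(active.values())
    return finished
```

Because the key is ordered (source first), the two directions of one connection always produce separate flows.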
What is a Flow? These flows represent an FTP file transfer that lasted 9 seconds. Two bidirectional Internet connections, comprising a total of 430 packets containing 380,122 bytes, are summarized into just five flows.
Background on Flows & Router-based Flow-Export The notion of flow profiling was introduced by the research community. Today, flow profiling is built into some networking devices for operational and accounting purposes. Vendor implementations include Cisco NetFlow, Juniper cflow, Riverstone (formerly Cabletron) LFAP, and Foundry (InMon) sFlow. These essentially use the definition introduced by [ClaffyPB] with timeout and TCP stateful inspection. The "IP Flow Information export" (IPFIX) Working Group in the IETF has defined the IPFIX protocol, which is currently under review for publication on a standards track.
A Flow Record Diagram by Daniel W. McRobb, from the cflowd configuration documentation, 1998-1999.
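The diagram itself is not reproduced here. As a stand-in, the sketch below unpacks one flow record using the 48-byte NetFlow version 5 record layout. That the diagram depicts this exact format is an assumption; the field names follow Cisco's v5 documentation, and the function name is made up for this example.

```python
import ipaddress
import struct

# NetFlow v5 flow record: 48 bytes, network byte order (assumed format).
V5_RECORD = struct.Struct("!IIIHHIIIIHHBBBBHHBBH")
FIELDS = (
    "srcaddr", "dstaddr", "nexthop", "input", "output",
    "dPkts", "dOctets", "first", "last", "srcport", "dstport",
    "pad1", "tcp_flags", "prot", "tos", "src_as", "dst_as",
    "src_mask", "dst_mask", "pad2",
)


def parse_v5_record(data):
    """Unpack one 48-byte record into a dict keyed by field name."""
    record = dict(zip(FIELDS, V5_RECORD.unpack(data)))
    # Render the three address fields as dotted-quad strings.
    for name in ("srcaddr", "dstaddr", "nexthop"):
        record[name] = str(ipaddress.IPv4Address(record[name]))
    return record
```

The source and destination address, protocol, and port fields identify the flow; `dPkts`, `dOctets`, `first`, and `last` carry the packet count, byte count, and start and end times that summarize it.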
Interpreting FlowScan Graphs Horizontal axis is time, current time to the right. Vertical axis indicates magnitude of measurement, usually in bits, packets, or flows per second. Outbound traffic is upwards, inbound traffic is downwards (mnemonic: pejorative `bottom feeders'). Colored bars show traffic classification and are stacked (not overlaid) to show the total.
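The axis conventions can be made concrete with a small sketch that turns per-class byte counts for one sample into stacked bars in bits per second, positive for outbound and negative for inbound. The 300-second sample interval and the function name are assumptions for illustration, not details stated in these slides.

```python
SAMPLE_SECONDS = 300  # assumed sample interval


def stacked_bars(octets_by_class, direction, sample_seconds=SAMPLE_SECONDS):
    """Return [(label, bottom, top)] in bits per second for one time step.

    Outbound bars stack upward from zero; inbound bars stack downward.
    """
    sign = 1 if direction == "out" else -1
    bars = []
    base = 0.0
    for label, octets in octets_by_class.items():
        rate = sign * octets * 8 / sample_seconds
        # Each bar starts where the previous one ended (stacked, not overlaid).
        bars.append((label, base, base + rate))
        base += rate
    return bars
```

The `top` value of the last bar is the total rate in that direction, which is why the outer edge of a stacked graph can be read as aggregate traffic.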
Hardware and Software Components
Router-based Flow Export The flow collector stores flows exported by the router. [Diagram labels: LAN (four), Internet.] Diagram by Mark Fullmer (author of flow-tools), 2002.
Ethernet Flow Probe The flow probe is connected to a switch port in traffic mirror mode. [Diagram labels: Workstation A, Workstation B, Campus.] Diagram by Mark Fullmer (author of flow-tools), 2002.
Events & Anomalies Denial-of-Service Probes, Scans Worm Propagation Flash Crowds Distributed Denial-of-Service
Inbound DSL DoS Flood A campus DSL user's host (640 Kbps download) received 50,000 packets per second, which totaled over 10 megabits per second.
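A quick worked calculation puts these figures in proportion. It assumes decimal units (1 Kbps = 1,000 bits per second) and uses the stated 10 Mbps as a lower bound, since the slide says "over".

```python
link_bps = 640_000        # DSL download capacity
flood_pps = 50_000        # observed packet rate
flood_bps = 10_000_000    # observed bit rate (lower bound)

# How many times the flood exceeds the link capacity.
oversubscription = flood_bps / link_bps

# Average packet size implied by the two observed rates, in bytes.
avg_packet_bytes = flood_bps / flood_pps / 8

print(f"flood is {oversubscription:.1f}x the link capacity")
print(f"average packet size is at least {avg_packet_bytes:.0f} bytes")
```

Under these assumptions the flood is more than fifteen times what the DSL line can deliver, and it consists of very small packets.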
Active Hosts... indications of Network Abuse
Code Red Worm Propagation The following graph (next slide) plots the difference between the number of UW-Madison IP addresses that have transmitted traffic and the number that have received traffic. These values are plotted independently for each of UW-Madison's four class B networks. This metric represents the number of campus host IP addresses that participated in "monologues": one-way exchanges of IP information with hosts in the outside world. A negative value indicates that more addresses have received IP traffic than have generated outbound IP traffic. As such, negative numbers in the plot are often an indication of inbound "scanning" or probing behavior (such as that done by the hosts in the outside world that were infected with the Code Red worm) because those scans often attempt to talk to unused campus IP addresses or to hosts which simply do not respond because of firewall policies.
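The metric described above can be sketched as follows, under the assumption that each flow is reduced to a (source, destination) address pair and that only traffic crossing the campus border is counted. The function name and the example network are hypothetical.

```python
import ipaddress


def monologue_metric(flows, network):
    """Campus addresses that transmitted minus campus addresses that received.

    flows is an iterable of (src, dst) address strings for one sample period;
    network is one campus network in CIDR notation.
    """
    net = ipaddress.ip_network(network)
    senders, receivers = set(), set()
    for src, dst in flows:
        s = ipaddress.ip_address(src)
        d = ipaddress.ip_address(dst)
        if s in net and d not in net:
            senders.add(s)      # campus host sent traffic outward
        elif d in net and s not in net:
            receivers.add(d)    # campus address was sent traffic from outside
    return len(senders) - len(receivers)
```

An outside host probing many unused campus addresses adds to the receiver set without adding any senders, which pushes the value negative.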
Code Red Worm "Monologues"
Flash Crowds Larry Niven's 1973 SF short story "Flash Crowd" predicted that one consequence of cheap teleportation would be huge crowds materializing almost instantly at the sites of interesting news stories. Twenty years later the term passed into common use on the Internet to describe exponential spikes in website or server usage when one passes a certain threshold of popular interest. http://www.tuxedo.org/~esr/jargon/html/entry/flash-crowd.html
Linux Release Events
RedHat 7.2 Flows
Photo: Michael Rothbart, Illustration: Kandis Elliot The Blooming of the Titan Arum http://www.news.wisc.edu/titanarum/ On Thursday, June 7, 2001, UW-Madison's 8-foot, 5-inch tall titan opened up gradually over the course of six hours. This illustration shows the Titan Arum in bud, left, and in full bloom, center. Inside the base of the spadix (the fleshy central column of the flower) are over a thousand tiny flowers, right.
The Blooming of the Titan Arum - 2001
The Blooming of the Titan Arum 2005 http://www.news.wisc.edu/titanarum/
Outbound Distributed DoS flood from 30+ Campus Hosts
The Same ICMP DDoS flood was also observed by FlowScan at another campus...
The Knight IRC Robot Coordinated via Internet Relay Chat (IRC) using "robots". Independent observations reported aggregates over 500 Mbps. The same DDoS flood was also observed by FlowScan at other campuses...
Traffic Rate Measurements at Fine Timescales What if we did packet and bit rate measurements at one second intervals? What equipment is necessary? Is this useful to characterize and detect anomalies? What characteristics can be exposed to expedite abuse detection?
NeTraMet NeTraMet is a configurable flow-based measurement software package. It has a broad notion of flows (not just 5-tuple IP flows). We configure it to measure the packet and bit rates, in each direction, on a dot1q Gigabit Ethernet link and save them into 120 one-second buckets. The values are fetched via SNMP at 15-second intervals, stored in a flow data file, and parsed by scripts (fdf2rrd); the rates are then stored in an RRD file configured to store at one-second intervals.
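The bucket scheme can be illustrated with a small ring buffer: 120 one-second slots are overwritten in rotation, and a reader that polls every 15 seconds collects the most recent seconds before they are reused. This is a sketch of the idea only; it is not NeTraMet's implementation, ruleset language, or SNMP interface, and the class and method names are invented.

```python
BUCKETS = 120        # one-second buckets kept by the meter
POLL_INTERVAL = 15   # seconds between reads


class RateRing:
    """Per-second packet and byte counters stored in a fixed ring."""

    def __init__(self, size=BUCKETS):
        self.size = size
        self.packets = [0] * size
        self.octets = [0] * size
        self.stamp = [None] * size  # which second each slot currently holds

    def add_packet(self, ts, length):
        sec = int(ts)
        slot = sec % self.size
        if self.stamp[slot] != sec:
            # The slot still holds an older second: reset it before reuse.
            self.stamp[slot] = sec
            self.packets[slot] = 0
            self.octets[slot] = 0
        self.packets[slot] += 1
        self.octets[slot] += length

    def read(self, start, end):
        """Return [(second, packets_per_sec, bits_per_sec)] for start <= second < end.

        The requested range must not span more than `size` seconds.
        """
        rows = []
        for sec in range(start, end):
            slot = sec % self.size
            if self.stamp[slot] == sec:
                rows.append((sec, self.packets[slot], self.octets[slot] * 8))
            else:
                rows.append((sec, 0, 0))  # no packets seen, or already overwritten
        return rows
```

With 120 buckets and a 15-second poll, the reader can miss several consecutive polls before any one-second sample is overwritten.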
Network Monitoring Cards (Endace DAG4.3GE) + NeTraMet and rrdtool software on Linux
3306/tcp (MySQL) probe/flood to 64k campus IPs
512/tcp (exec) probe to 64k campus IP addresses
UDP flood from one host to one campus host
UDP flood from three hosts to one campus host
Credits & Thanks Flow-related tools: Tobi Oetiker (rrdtool, MRTG) Mark Fullmer (flow-tools) Carter Bullard (argus) Nevil Brownlee (NeTraMet)
Resources
FlowScan: http://net.doit.wisc.edu/~plonka/flowscan/ http://wwwstats.net.wisc.edu
Argus: http://www.qosient.com/argus/
flow-tools: http://www.splintered.net/sw/flow-tools/
cflowd: http://www.caida.org/tools/measurement/cflowd/
CoralReef: http://www.caida.org/tools/measurement/coralreef/
IP Flow Information export, an IETF Working Group: http://ipfix.doit.wisc.edu
Resources
Freely-available IPFIX implementations: http://libipfix.sourceforge.net http://www.ntop.org http://www.cert.org/netsa/tools/fixbuf/
Related analysis tools: Wisconsin Netpy: http://wail.cs.wisc.edu/netpy/