DNSFlow- Enabling netflowlike telemetry for DNS Kyle Creyts, Manish Karir Merit Network Inc. Joe Eggleston, Craig Labovitz DeepField Networks
Intro An analogy to (gasp) telecom networks: Data plane scalable telemetry via Netflow Operationally incredibly useful Embedded everywhere Well-developed analysis ecosystem Control (signaling) plane DNS Potential for broad operational visibility (beyond dns system itself) Embedded nowhere (some syslog support) Nascent analysis ecosystem DNS + Netflow Lends DNS-based context to analysis of data trafffic
DNSFlow DNSFlow protocol/reference implementation for efficient export of DNS responses Primary operational insight from recursive responses Also compelling security applications for iterative responses (see ISC Passive DNS) Goal Open access to data Vendor-neutral, open, standards-based, cheap (free), efficient, lightweight, flexible deployment
Challenges Scale of data But, most queries redundant 1:20 compression ratio possible Even more with smart sampling 20,000 15,000 10,000 5,000 0 New responses/hour
Status Open source, version 1 Simple (hundreds of lines of ANSI C) Flow like output NMSG (Passive DNS) interoperability layer Upcoming version IPFIX Easy integration with existing tools. Support for compression, sampling and built-in algorithms identifying redundant queries Deployed in several production networks today https://github.com/deepfield/dnsflow
Deployment Runs on any *nix. Anywhere with visibility of recursive responses. Recursive resolver SPAN/mirror port of resolver Discussions towards embedding in open source and commercial DNS Servers Collection/ Analysis DNSFlow from tap or server plugin Commodity PC or VM Recursive DNS Iterative DNS DNS Server Data Traffic Users
Using DNSFlow Data Phishing Notification of phishing activity used in conjunction with DNS blacklists such as phishtank, malwaredomains.com, or SURBL when a user is successfully phished (opens a link) with a domain in the blacklist, it will incur a DNS lookup that DNS answer will appear in DNSflow can query DNSflow data, see which users (client IPs) were phished, and which domain(s)-ip pair they were victim to alert_id set_id IP f/ response 7907823 4819173 xxx. 235.133.xxx 7924562 4829607 xxx. 233.142.xxx client IP timestamp BL source name from DNS 10.1.1.2 2012-04-09 00:57:57.379013 10.1.1.1 2012-04-09 05:45:08.815553 malwaredomains malwaredomains www.sexy-screensavers.com. www.meb.gov.tr.
Building DNSMapDB Goal: To characterize the DNS-IP mappings as observed at a regional ISP Two different types of entries in DNSMapDB largely-static and largelydynamic After first 7 days we continue to learn new pairs but at a constant rate of roughly 1000pairs/hour 20,000 18,000 16,000 14,000 12,000 10,000 8,000 6,000 4,000 2,000 0 New pairs/hour (7 days)
Characterizing DNSMAPdb There are two different contributors to DNSMapDB Mostly static infrastructure Largely dynamic mappings which originate from CDNs and massive web hosting dynamic infrastructure Shortest name 2 character and longest 235, but average is 27 characters. Total of 308 unique TLDs Top 5 TLDs account for almost 90% of the DNS names observed edu, 2% us, 2% org, 7% net, 15% com, 62%
Second level domain distribution How big is a given second level domain most have very few name-ip mappings < 60 second level domains have more 10 or more mappings Looking at the other end of the spectrum very few have large number of entries but how big do these get? Atleast 40 have more than 1000 entries each - top 8 have 10K or more each 100,000 2nd-Level Domains 10,000 1,000 100 10 1 1 10 100 1,000 10,000 100,000 Name/IP Pairs
IP Address distribution by /8 Distribution of 200K unique IPs in DNSMapDB by slash8 Number of IPs in DNSMAPdb 10000 9000 8000 7000 6000 5000 4000 3000 2000 1000 0 *.amazonaws.com *.tumblr.com 64 50 0 8 14 19 31 39 46 54 59 64 69 75 80 85 90 95 100 107 112 117 122 127 132 137 142 147 152 157 162 167 172 177 183 189 195 200 205 210 215 220 230 173 174 184 *.cloudfront.net 208 216 Slash 8
DNSMapDB Entries by /8 Primary contributors to DNSMapDB entries 90000 80000 205 Cloudfront 52K 70000 60000 50000 40000 30000 20000 Apple 12K 2o7 9K Akadns 10K 66 17 Blogspot 12K Google 6K 74 Mcafee 13K Spamhaus 9K Mail-abuse 3K 127 Edgesuite 4K Akamai 4K 192 209 Google 27K 216 Cloudfront 12K 10000 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125 130 135 140 145 150 155 160 165 170 175 180 185 190 195 200 205 210 215 220 225 230 235 240 245 250 255
Thank You Kyle Creyts <kcreyts@merit.edu> Joe Eggleston <joe@deepfield.net>