DNS Caching: Critical Operator Infrastructure and the Last Piece of the Puzzle
Adam Obszyński, CISSP, CCIE #8557
Regional Sales Engineer, Eastern Europe
aobszynski@infoblox.com
A long time ago: AD 2000
Two Kinds of External DNS Servers
Authoritative name servers
- Hosting company.com (corporate web site: www.company.com)
- Internet users > http://www.company.com
Forwarders (aka resolvers, DNS cache)
- Enable web surfing, sending emails, etc.
- Internal applications
- Internal users > http://www.google.com
[Diagram labels: Internet, BIND DNS, Webserver, Mailserver, Ethernet]
What are we talking about today?
- Which piece of the puzzle interests us?
- Why are we thinking about DNS cache?
- How can it be done better, or maybe best?
- How did others do it?
Bandwidth -> Core Cisco.com
Bandwidth -> Access http://blogs.broughturner.com/
Serialization -> Access: it was true in 1999 and 2000. Not today :-) (Cisco.com)
DNS: Scale
Number of Queries: YES
Cause of Increase:
- DNS prefetching function
- 28-times increase in one year
- Firefox -> enabled 06.2009.*
- Auto Update
- Web History
Source: NTT Information Sharing Platform Laboratories
DNS Not Just Glue...
Web Prefetching Srinivas Krishnan and Fabian Monrose Department of Computer Science University of North Carolina at Chapel Hill
Web Delay Sample: Fast Web Performance Starts with DNS
- http://techcrunch.com/: 300 objects++, 60++ domains
Source: http://blog.catchpoint.com/
Web Delay Sample 2: Fast Web Performance Starts with DNS
Two components to DNS latency:
- Latency client <-> server
- Latency caches <-> name servers
  - Cache misses
  - Under-provisioning
  - Malicious traffic
Source: https://developers.google.com/
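The two latency components can be combined into a rough average, shown here as a minimal sketch. It assumes a cache hit costs only the client-to-resolver round trip and a miss adds one upstream resolution; the function name and the numbers are illustrative assumptions, not measurements from the slides.

```python
def expected_lookup_latency_ms(client_rtt_ms, hit_rate, upstream_ms):
    """Average lookup time seen by a client of a caching resolver.

    client_rtt_ms: round trip between client and caching server
    hit_rate:      fraction of queries answered from cache (0..1)
    upstream_ms:   extra time a cache miss spends asking name servers
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    # Every query pays the client leg; only misses pay the upstream leg.
    return client_rtt_ms + miss_rate * upstream_ms


# Illustrative numbers only: 10 ms to the cache, 75% hits, 200 ms per miss.
print(expected_lookup_latency_ms(10.0, 0.75, 200.0))  # prints 60.0
```

Under this model, raising the hit rate and moving the cache closer to the client are independent levers on the same average.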
DNS Challenges
- Data traffic explosion drives increasing DNS load
  - The rise of applications such as Facebook and of mobile devices is causing huge growth in DNS traffic
- Customer satisfaction is critical
  - Unsatisfied mobile customers readily switch providers
- A distributed DNS approach places caching servers closer to the customer
  - Because response time is critical to the customer experience
  - But centralized management now becomes a critical requirement
Costs of Maintaining DNS Infrastructure Are on the Rise
- More DNS servers = higher management costs
- Security vulnerability patching costs are high
- Securing DNS infrastructure requires additional equipment and skills
- High-availability implementations require significant expense and skills
TASK: Update the DNS software on 15 name servers
[Chart: time] BIND: 200-330 min. Infoblox: 5-20 min. (400-1000% faster)
How ISPs Deal with DNS Today*
- Increase the number of DNS servers
- Use faster underlying server hardware
- Use load balancers to handle load and IPSs to handle vulnerabilities
- Code expensive customized changes into DNS software
Mitigations of DNS Cache Problems
- Over-provisioning
  - Caching DNS resolvers demand a lot of network input/output
  - They are highly vulnerable to cache poisoning (cache miss rate)
  - Prepare for DoS/DDoS (over-provision with many machines)
- Load-balancing for shared caching
  - Possible backfire -> reduced cache hit rate (independent caches)
  - Load-balance without fragmentation
  - Think about 2 levels
    - Close to the user -> small cache with the most popular names
    - 2nd level -> distributed per name
- Distributed clusters for geographical coverage
  - Closer to your users -> less latency
  - DNS Anycast (details later)
- BUT, centralized HUGE servers can help with fragmentation!
  - Low latency from the user to the data center is needed
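One way to load-balance without fragmenting the cache is to pick the second-level server from the query name, so each name is cached in exactly one place. The following is a minimal sketch under that assumption; `shard_for_name` and the shard labels are hypothetical, and the slides do not say which distribution function was used.

```python
import hashlib


def shard_for_name(qname, shards):
    """Pick the second-level cache for a query name by hashing the name."""
    if not shards:
        raise ValueError("at least one shard is required")
    # DNS names are case-insensitive and may carry a trailing dot,
    # so normalize before hashing to keep one name on one shard.
    key = qname.rstrip(".").lower().encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard labels for illustration.
caches = ["cache-a", "cache-b", "cache-c"]
same = shard_for_name("WWW.Example.COM.", caches) == shard_for_name("www.example.com", caches)
print(same)  # prints True
```

Plain modulo hashing remaps most names when the number of shards changes; a consistent-hashing scheme would reduce that churn at the cost of more code.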
DNS Anycast
[Diagram: two DNS cache servers share the anycast address 10.0.0.1 and each sends routing advertisements; client queries to 10.0.0.1 are routed to the nearest server.]
2007 Infoblox Inc. All Rights Reserved.
DNS Anycast
[Diagram: the route to one DNS cache is removed; queries to 10.0.0.1 are automatically re-routed to the next-nearest DNS cache, which still advertises the anycast address 10.0.0.1.]
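The failover behavior in the anycast diagrams can be modeled in a few lines. This is a toy sketch, not a routing implementation: in a real network the routing protocol withdraws and selects routes, and the `AnycastNode` type, the `cost` metric, and the node names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AnycastNode:
    name: str
    cost: int                 # routing metric as seen from the client
    advertising: bool = True  # False once the route has been removed


def apply_health_check(node, healthy):
    """Withdraw the route when the local DNS cache fails its check."""
    node.advertising = healthy


def nearest_node(nodes):
    """Return the lowest-cost node still advertising the anycast address."""
    candidates = [n for n in nodes if n.advertising]
    if not candidates:
        raise RuntimeError("no DNS cache is advertising the anycast address")
    return min(candidates, key=lambda n: n.cost)


nodes = [AnycastNode("cache-near", cost=5), AnycastNode("cache-far", cost=20)]
print(nearest_node(nodes).name)       # prints cache-near
apply_health_check(nodes[0], healthy=False)
print(nearest_node(nodes).name)       # prints cache-far
```

Clients keep using the same address throughout; only the set of advertised routes changes.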
Don't use risky (or old) DNS software (TCP Case)
[tcpdump output; addresses and some line ends are truncated in the source]
Trace 1:
41.53: Flags [S], seq 3070710725, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 172155998 ecr 0,sackOK,eol], length 0
49744: Flags [S.], seq 3594360937, ack 3070710726, win 65535, options [mss 1460,nop,wscale 3,sackOK,TS val 1909669925 ecr 172155998],
41.53: Flags [.], ack 1, win 8235, options [nop,nop,ts val 172156005 ecr 1909669925], length 0
41.53: Flags [P.], seq 1:20, ack 1, win 8235, options [nop,nop,ts val 172156005 ecr 1909669925], length 19 52227+ SOA? . (17)
49744: Flags [P.], seq 1:748, ack 20, win 8326, options [nop,nop,ts val 1909669936 ecr 172156005], length 747 52227*- 1/13/22 SOA (745
41.53: Flags [.], ack 748, win 8188, options [nop,nop,ts val 172156016 ecr 1909669936], length 0
41.53: Flags [F.], seq 20, ack 748, win 8192, options [nop,nop,ts val 172156019 ecr 1909669936], length 0
49744: Flags [.], ack 21, win 8326, options [nop,nop,ts val 1909669946 ecr 172156019], length 0
41.53: Flags [.], ack 748, win 8192, options [nop,nop,ts val 172156025 ecr 1909669946], length 0
49744: Flags [F.], seq 748, ack 21, win 8326, options [nop,nop,ts val 1909669946 ecr 172156019], length 0
41.53: Flags [.], ack 749, win 8192, options [nop,nop,ts val 172156025 ecr 1909669946], length 0
Trace 2:
29.53: Flags [S], seq 2260025309, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 172152327 ecr 0,sackOK,eol], length 0
49743: Flags [S.], seq 2528398468, ack 2260025310, win 5792, options [mss 1460,sackOK,TS val 2332945284 ecr 172152327,nop,wscale 2],
29.53: Flags [.], ack 1, win 8235, options [nop,nop,ts val 172152328 ecr 2332945284], length 0
29.53: Flags [P.], seq 1:20, ack 1, win 8235, options [nop,nop,ts val 172152328 ecr 2332945284], length 19 14386+ SOA? . (17)
49743: Flags [.], ack 20, win 1448, options [nop,nop,ts val 2332945285 ecr 172152328], length 0
49743: Flags [P.], seq 1:3, ack 20, win 1448, options [nop,nop,ts val 2332945286 ecr 172152328], length 2
29.53: Flags [.], ack 3, win 8235, options [nop,nop,ts val 172152329 ecr 2332945286], length 0
49743: Flags [P.], seq 3:748, ack 20, win 1448, options [nop,nop,ts val 2332945287 ecr 172152329], length 745 34048 [b2&3=0x1] [13a] [
29.53: Flags [.], ack 748, win 8188, options [nop,nop,ts val 172152330 ecr 2332945287], length 0
29.53: Flags [F.], seq 20, ack 748, win 8192, options [nop,nop,ts val 172152332 ecr 2332945287], length 0
49743: Flags [F.], seq 748, ack 21, win 1448, options [nop,nop,ts val 2332945292 ecr 172152332], length 0
29.53: Flags [.], ack 749, win 8192, options [nop,nop,ts val 172152333 ecr 2332945292], length 0
Source: https://labs.ripe.net/
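In the second trace the responder sends the 2-byte length prefix in its own segment (seq 1:3, length 2) and the message body in the next one (seq 3:748). A reader that expects the whole response from a single read will break on this. The sketch below shows one robust way to read a length-prefixed DNS message over TCP; the helper names are hypothetical and this is not taken from any particular DNS implementation.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes, however the peer split them into segments."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_dns_tcp_message(sock):
    """Read one DNS message framed with a 2-byte big-endian length."""
    (length,) = struct.unpack("!H", recv_exact(sock, 2))
    return recv_exact(sock, length)


# Local demonstration: the prefix and the body are written separately.
left, right = socket.socketpair()
left.sendall(struct.pack("!H", 5))
left.sendall(b"hello")
print(read_dns_tcp_message(right))  # prints b'hello'
left.close()
right.close()
```

The loop in `recv_exact` is the important part: a stream socket may return fewer bytes than requested, so both the prefix and the body have to be read until complete.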
Cache Poisoning Checklist by Cricket Liu
- Use dedicated forwarders
- Run the most robust server code
- Split external/internal and forwarders
- Filter traffic to/from your forwarders
Other Cases
- For DNSSEC, size is important :-)
  - TCP: check your ACLs
  - EDNS/DNSSEC: check your firewalls
- Spoofing: check RFC 5452 for security
- DNS cache pollution
  - RFC1918 ranges (AS112)
  - .local & .localhost domains
- Flood
- Educate your users!
- Newest concepts: DNS cache server per user?
- Hardened OS
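Cache pollution from RFC1918 reverse lookups and .local/.localhost names can be spotted before a query is sent upstream. The following is a minimal sketch of such a check; the function name is hypothetical, it only recognizes full four-octet reverse names, and what the resolver then answers is left out.

```python
import ipaddress

RFC1918_NETS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
LOCAL_SUFFIXES = (".local", ".localhost")
REVERSE_SUFFIX = ".in-addr.arpa"


def should_answer_locally(qname):
    """True if the name should not be forwarded to the public DNS."""
    name = qname.rstrip(".").lower()
    if name in ("local", "localhost") or name.endswith(LOCAL_SUFFIXES):
        return True
    if name.endswith(REVERSE_SUFFIX):
        labels = name[: -len(REVERSE_SUFFIX)].split(".")
        # Only full PTR names (four octets) are handled in this sketch.
        if len(labels) == 4 and all(label.isdigit() for label in labels):
            try:
                addr = ipaddress.ip_address(".".join(reversed(labels)))
            except ValueError:
                return False
            return any(addr in net for net in RFC1918_NETS)
    return False


print(should_answer_locally("1.0.168.192.in-addr.arpa."))  # prints True
print(should_answer_locally("www.example.com"))            # prints False
```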
Devices vs. Solutions: Dedicated vs. Self-made
A dedicated DNS cache appliance does not stop answering queries from cache when capacity limits for cache misses are reached.
[Chart: average latency (seconds), BIND 9.8 vs. HW DNS cache]
Focus: Dedicated vs. Self-made
Note how the response rate drops off at 35k queries per second. This is a result of the total number of outstanding recursive requests hitting the processing limit.
/ Servers
Google, OpenDNS and more
Number of Servers/Appliances Needed to Reach 500K and 1M DNS QPS

                     Needed for 500K DNS QPS   Needed for 1M DNS QPS
BIND                 13                        25
HW DNS appliance     1                         1

A hardware DNS appliance can achieve over 1M DNS QPS.
BIND requires 13 servers to reach 500K DNS QPS and 25 servers to reach 1M DNS QPS.
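The server counts follow from dividing the target load by per-server capacity and rounding up. The sketch below reproduces the slide's figures under an assumed 40,000 QPS per BIND server; that per-server value is inferred from the 13 and 25 server counts and is not stated in the slides.

```python
import math


def servers_needed(target_qps, per_server_qps):
    """Smallest number of servers whose combined capacity covers the target."""
    if per_server_qps <= 0:
        raise ValueError("per_server_qps must be positive")
    return math.ceil(target_qps / per_server_qps)


ASSUMED_BIND_QPS = 40_000          # assumption inferred from the slide
ASSUMED_APPLIANCE_QPS = 1_000_000  # slide says "over 1M"; lower bound used

print(servers_needed(500_000, ASSUMED_BIND_QPS))         # prints 13
print(servers_needed(1_000_000, ASSUMED_BIND_QPS))       # prints 25
print(servers_needed(1_000_000, ASSUMED_APPLIANCE_QPS))  # prints 1
```

This counts raw capacity only; spare servers for redundancy or attack headroom would come on top.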
DNS Challenges They Had
- ISPs need reliable, high-performance DNS servers
- Limited options for carrier-grade server hardware
- Need field-replaceable, hot-swappable PSU/fan/HDD
- DNS queries/sec performance needs to be high
- Avoid buying and managing a large number of servers
- Reduce support cost
- Protection against network threats is a growing concern
- Traditional ISP DNS uses BIND software on generic servers
- Extensive maintenance burden
- Customers want to move away from software-only solutions
- Need a high-performance appliance, plus ease of management
- No field software installs to customer units
- SLA
Questions? aobszynski@infoblox.com
Anti DoS/DDoS Techniques
- TCP SYN flood: The code tracks the number of SYN requests per second. If the number goes above a threshold, it examines the requests to see whether the clients are responding with ACKs. If not, the clients are added to a temporary gray list and any pending connections are torn down.
- UDP flood: If a high number of packets with a very small payload is received from a client or pool of clients, the client IP address is placed on a gray list. All traffic from addresses on the gray list is dropped for 60 seconds, after which the address is removed from the gray list.
- Spoofed source addresses: The attack involves sending a spoofed TCP SYN packet (connection initiation) to an open port, with the target host's IP address as both source and destination.
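The gray list with a 60-second hold can be sketched as a small expiring set. This is an illustration of the idea only, not the appliance's code; the `GrayList` class and its injectable `clock` parameter are assumptions, and the detection logic that decides when to add an address is left out.

```python
import time


class GrayList:
    """Temporary block list: entries expire after hold_seconds."""

    def __init__(self, hold_seconds=60.0, clock=time.monotonic):
        self._hold = hold_seconds
        self._clock = clock      # injectable so the expiry can be tested
        self._expires = {}       # ip -> time at which the block ends

    def add(self, ip):
        """Start (or restart) the hold period for this address."""
        self._expires[ip] = self._clock() + self._hold

    def is_blocked(self, ip):
        """True while traffic from this address should be dropped."""
        expiry = self._expires.get(ip)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires[ip]  # hold period over: remove the entry
            return False
        return True


gray = GrayList()
gray.add("192.0.2.10")                    # documentation-range address
print(gray.is_blocked("192.0.2.10"))      # prints True
print(gray.is_blocked("198.51.100.7"))    # prints False
```

A monotonic clock is used by default so that wall-clock adjustments cannot shorten or extend the hold period.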