Classifying DNS Heavy User Traffic by using Hierarchical Aggregate Entropy. 2012/3/5 Keisuke Ishibashi, Kazumichi Sato NTT Service Integration Labs

Similar documents
Clear and Present Danger Increase in Number of DNS AAAA Queries

Part 5 DNS Security. SAST01 An Introduction to Information Security Martin Hell Department of Electrical and Information Technology

Understand Names Resolution

1. Domain Name System

HTG XROADS NETWORKS. Network Appliance How To Guide: EdgeDNS. How To Guide

Appendix D: Configuring Firewalls and Network Address Translation

DNS Root NameServers

How To Guide Edge Network Appliance How To Guide:

Chapter 25 Domain Name System Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Domain Name System (DNS)

Lesson 13: DNS Security. Javier Osuna GMV Head of Security and Process Consulting Division

Introduction to the Domain Name System

Decoding DNS data. Using DNS traffic analysis to identify cyber security threats, server misconfigurations and software bugs

Extending Black Domain Name List by Using Co-occurrence Relation between DNS queries

DNS Domain Name System

- Domain Name System -

Detecting Mass-Mailing Worm Infected Hosts by Mining DNS Traffic Data

Forouzan: Chapter 17. Domain Name System (DNS)

Local DNS Attack Lab. 1 Lab Overview. 2 Lab Environment. SEED Labs Local DNS Attack Lab 1

Internet Engineering Task Force (IETF) Request for Comments: Category: Standards Track February 2013 ISSN:

CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION

The Domain Name System

Domain Name System. Heng Sovannarith

DNS : Domain Name System

DNS. Computer networks - Administration 1DV202. fredag 30 mars 12

DNS. The Root Name Servers. DNS Hierarchy. Computer System Security and Management SMD139. Root name server. .se name server. .

Operational Problems in IPv6: Fallback and DNS issues

Hosting more than one FortiOS instance on. VLANs. 1. Network topology

ARP and DNS. ARP entries are cached by network devices to save time, these cached entries make up a table

Using the Domain Name System for System Break-ins

Windows 2008 Server. Domain Name System Administración SSII

LASTLINE WHITEPAPER. Using Passive DNS Analysis to Automatically Detect Malicious Domains

The Continuing Denial of Service Threat Posed by DNS Recursion (v2.0)

ECE 4321 Computer Networks. Network Programming

Internet Security [1] VU Engin Kirda

A Plan for the Continued Development of the DNS Statistics Collector

How to Add Domains and DNS Records

BITS-Pilani Hyderabad Campus CS C461/IS C461/CS F303/ IS F303 (Computer Networks) Laboratory 3

Computer Networks: Domain Name System

F5 Intelligent DNS Scale. Philippe Bogaerts Senior Field Systems Engineer mailto: Mob.:

Security A to Z the most important terms

Network Security CS 192

How To - Configure Virtual Host using FQDN How To Configure Virtual Host using FQDN

Domain Name Abuse Detection. Liming Wang

Internet-Praktikum I Lab 3: DNS

The Domain Name System

CSIS 3230 Computer Networking Principles, Spring 2012 Lab 7 Domain Name System (DNS)

Domain Name System (DNS) Fundamentals

Wharf T&T Limited DDoS Mitigation Service Customer Portal User Guide

Collax Active Directory

Motivation. Domain Name System (DNS) Flat Namespace. Hierarchical Namespace

The Domain Name System

Protecting DNS Critical Infrastructure Solution Overview. Radware Attack Mitigation System (AMS) - Whitepaper

Internet Monitoring via DNS Traffic Analysis. Wenke Lee Georgia Institute of Technology

Module 2. Configuring and Troubleshooting DNS. Contents:

Domain Name Service (DNS) Training Division, NIC New Delhi

3. The Domain Name Service

How to Configure Split DNS

DNS Session 4: Delegation and reverse DNS. Joe Abley AfNOG 2006 workshop

New possibilities in latest OfficeScan and OfficeScan plug-in architecture

THE MASTER LIST OF DNS TERMINOLOGY. v 2.0

Computer Networks: DNS a2acks CS 1951e - Computer Systems Security: Principles and Prac>ce. Domain Name System

DNS Pharming Attack Lab

Chapter 2 Application Layer

Detecting Search Lists in Authoritative DNS

what s in a name? taking a deeper look at the domain name system mike boylan penn state mac admins conference

DNS. DNS Fundamentals. Goals of this lab: Prerequisites: LXB, NET

On Entropy in Network Traffic Anomaly Detection

DNS and BIND Primer. Pete Nesbitt linux1.ca. April 2012

Domain Name System (DNS) RFC 1034 RFC

THE MASTER LIST OF DNS TERMINOLOGY. First Edition

HOW TO CONFIGURE PASS-THRU PROXY FOR ORACLE APPLICATIONS

Harness Your Internet Activity!

1 hours, 30 minutes, 38 seconds Heavy scan. All scanned network resources. Copyright 2001, FTP access obtained

How To Protect A Network From Attack From A Hacker (Hbss)

Domain Name System (DNS) Security By Diane Davidowicz 1999 Diane Davidowicz

Akamai CDN, IPv6 and DNS security. Christian Kaufmann Akamai Technologies DENOG 5 14 th November 2013

DNS. Spring 2016 CS 438 Staff 1

DNS Domain Name System

How to Configure the Windows DNS Server

IPv6 support in the DNS

DNSSEC and DNS Proxying

Enterprise Architecture Office Resource Document Design Note - Domain Name System (DNS)

Domain Name System (DNS)

Where is Hong Kong in the secure Internet infrastructure development. Warren Kwok, CISSP Internet Society Hong Kong 12 August 2011

IPv6 Support in the DNS. Workshop Name Workshop Location, Date

Application Protocols in the TCP/IP Reference Model

Network Monitoring using MMT:

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

DNS Noise: Measuring the Pervasiveness of Disposable Domains in Modern DNS Traffic

Basic DNS Course. Module 1. DNS Theory. Ron Aitchison ZYTRAX, Inc. Page 1 of 24

HW2 Grade. CS585: Applications. Traditional Applications SMTP SMTP HTTP 11/10/2009

Quick Start for Network Agent. 5-Step Quick Start. What is Network Agent?

Transcription:

Classifying DNS Heavy User Traffic by using Hierarchical Aggregate Entropy 2012/3/5 Keisuke Ishibashi, Kazumichi Sato NTT Service Integration Labs

Motivation Network resources are consumed by a small number of heavy users Controlling traffic from heavy users is a crucial task for efficient use of network. filtering, rate limiting, charging Before controlling the heavy user traffic, we need to understand what type of traffic they send If heavy user traffic are mostly anomalous, then filtering such traffic is rather acceptable. Anomalous traffic: DDoS attack, spam, illegal file exchange etc. Thus, we need to classify heavy user traffic whether normal or abnormal In this talk, we focus on heavy users in DNS traffic, one of the most important control traffic in the Internet 2

Bogus traffic in DNS DNS: mainly used for mapping domain name to IP address or vice versa Two types of servers: caching server and authoritative server Bogus queries are consuming resources of both DNS authoritative servers and caching servers repeated queires for a single name (bug?) scaning quereis for non exiting names (worm?) www.google.com? Authoritative Server 192.168.0.1 www.google.com? Cache Server 192.168.0.1 [Wessels] D. Wessels et.al., Wow, That's a Lot of Packets, " PAM 2003. [Toyono] T. Toyono et. al., An analysis of the queries from the view point of caching servers, 2007 DNS-Operations Workshop. 3

Motivation cont d Most of bogus queries are sent by small number of heavy clients [toyono] Filtering queries sent by those heavy clients is efficient to protect DNS server resources 4 However, not all queries from heavy clients are bogus PTR queries from web servers (analog) Aggregated queries from DNS proxies

Normal heavy user DNS prefetch resolves all the domain names of the URLs in a browsed web page before the URL is actually clicked Faster web but burst (unnecessary) queries for a page that contains huge URLs Log analyzer Log analyzers in web server send reverse queries ( resolve domain names for IP addresses) for addresses in their access logs What organizations access our web servers? 5

Entropy based classification 6 Needs to classify heavy clients into normal users and abnormal users Classify heavy clients by their query pattern How to capture query patterns? Use of entropy of queries in domain name spaces Repeat queries Domain name space # of queries Small entropy Scan queries Entropy of legitimate queries: expected to lie between them Domain name space # of queries Large entropy

#of queries #of queries #of queries #of queries Hierarchical Aggregate Entropy(1/2) Does not have information on spatial characteristics Independent on where queries concentrate or diverse in domain name spaces Only depends on how queries concentrate or diverse coarse graining Same entropy Hierarchical Aggregate Entropy Aggregating queries accordance to its hierarchical structure and calculate entropy for each hierarchy coarse graining 7 Large entropy Small entropy

Hierarchical Aggregate Entropy(2/2) DNS: tree based hierarchical structure Fully qualified domain name (FQDN): www.google.com, www.ntt.co.jp FQDN-level entropy H(D (0) D (1) )): deviation in www.example.org level Second level domain (SLD):.google.com SLD-level entropy H(D (1) D (2) )) :deviation in example.org level Top level domain (TLD) :.com,.net,.jp TLD-level entropy H(D (2) ) dispersion in queries for com,.net,.jp Domain ドメインツリー Tree Query Distribution クエリ 分 布 Level-2 (TLD) Level 1 (SLD).arpa in-addr.arpa 192. in-addr.apra 10. in-addr.apra example.com root.com com lb. example.com.net example.net...net Identify the deviation occurs intra TLD or inter TLD. Level 0 (FQDN) 0.0.168.192. in-addr.apra 0.0.0.10. in-addr.apra www. example.com mail.lb. example.com ns. example.net www. example.net 8

Experimental results Calculate entropies of top 10,000 heavy clients for DNS traffic monitored at DNS caching servers Entropies from normal clients concentrated in a specific region Clients whose entropies are out of the region can be expected to be abnormal 9

Classification by using SVM Extract normal domain by using SVM Training Data: manually labeled data for host sending over 1 query per second SVM(Support Vector Machine): generates boundary between normal region and abnormal region based on the training data 10 Training Data Classification Results

Accuracy of classification Evaluate the accuracy with 10 cross-fold validation Separate training data into 10 groups. Classify host in a group by using training data of the rest of nine groups and compare the classification results and manual label. 10% improvement can be achievement by using hierarchical aggregate entropy Entropy Hierarchical Aggregate Entropy Mis-classification ratio (FP+NP) 8.7% FQDN Entropy 18.9% SLD Entropy 23.3% TLD Entropy 19.8% 11

Reverse queries (IP addr -> FQDN) have common SLD and TLD -> entropy is zero. 1.0.168.192.in-addr.arpa. Apply our hierarchical aggregate entropy to IP address part 1.0.168.192.in-addr.arpa. TLD Reverse query SLD TLD SLD Confirm concentration of entropy of normal clients to a domain 12

Effect of DNS prefetch Extract Firefox users, and compare their entropies before and after Firefox implements DNS prefetch After the implementation, ratio of Firefox users among heavy users increases, and that of normal heavy users increases as well. Filtering queries from heavy users may impede Internet access of normal users 13

Conclusion Propose the use of hierarchical aggregate entropies to classify DNS heavy clients Can capture spatial dispersion of queries among domain name spaces Entropies from normal clients concentrated in a specific region Experimental results show that the proposed method achieve 10 % improvement in classification accuracy 14