A Plan for the Continued Development of the DNS Statistics Collector




Background

The DNS Statistics Collector ("DSC") software was initially developed under the National Science Foundation grant "Improving the Integrity of Domain Name System (DNS) Monitoring and Protection" (NSF grant SCI-0427144), a joint project between the Cooperative Association for Internet Data Analysis (CAIDA) and the Internet Systems Consortium (ISC). Most of the work was performed by Duane Wessels of The Measurement Factory, Inc. (TMF) under subcontract with CAIDA. The software was actively developed between 2004 and 2009. During 2010 a few bugs were fixed, but no new features were added.

The DSC intellectual property is jointly owned by TMF and ISC. TMF has publicly stated its desire to transfer its rights in DSC to the Domain Name System Operations Analysis and Research Center (DNS-OARC), contingent upon ISC also transferring its rights to DNS-OARC.

DSC is in use at a number of Top-Level Domains and other large-scale DNS operations. A number of DNS-OARC members contribute DSC data to DNS-OARC. In October 2010 a number of DSC users and other interested parties met in Denver, Colorado to discuss DSC's future. The general consensus of that meeting was that DSC development should continue, under the administration of DNS-OARC. The purpose of this document is to describe an initial work plan for enhancing DSC.

Terminology, Description of Architecture, and Implementation Details

DSC has a distributed design. COLLECTOR processes run at or very near DNS server software. The collector receives DNS messages via Unix packet capture (libpcap) facilities. DNS messages are analyzed and then enumerated into multiple DATASETS.
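The idea of enumerating messages into datasets can be sketched as follows. This is a minimal illustration, not DSC's actual internals: the `Dataset` class, its fields, and the sample messages are all hypothetical, and a real collector does this work in C against live captured packets.

```python
from collections import Counter

# Hypothetical sketch: a dataset is a named set of counters keyed on one or
# more dimensions extracted from each DNS message.
class Dataset:
    def __init__(self, name, key_fn):
        self.name = name
        self.key_fn = key_fn   # extracts the counted dimension(s) from a message
        self.counts = Counter()

    def count(self, message):
        self.counts[self.key_fn(message)] += 1

    def flush(self):
        # Called once per 60-second interval; returns and resets the counts.
        snapshot, self.counts = self.counts, Counter()
        return snapshot

# Two example datasets: queries by qtype, and (TLD, qtype) pairs.
qtype = Dataset("qtype", lambda m: m["qtype"])
qtype_vs_tld = Dataset("qtype_vs_tld",
                       lambda m: (m["qname"].rstrip(".").rsplit(".", 1)[-1], m["qtype"]))

for msg in ({"qname": "www.example.com", "qtype": "A"},
            {"qname": "host.invalid-tld", "qtype": "AAAA"}):
    qtype.count(msg)
    qtype_vs_tld.count(msg)

interval = qtype.flush()
print(interval["A"], interval["AAAA"])  # 1 1
```

A dataset is thus just a mapping from key tuples to counts, accumulated over one reporting interval, which is what makes the XML/dat-file representations described below straightforward.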

The collector outputs its datasets at 60-second intervals. Datasets are written to disk as an XML file. Datasets are then transmitted to one or more PRESENTERS, typically over SSH. A presenter receives datasets from multiple collectors. Incoming XML files are EXTRACTED and their contents appended to DAT-FILES. The presenter produces graphs via a web interface by reading the dat-files.

DSC stores data in a SERVER/NODE hierarchy. For example, consider a DNS Root server with numerous anycast instances. The server may be equal to k.root-servers.net and the nodes equal to geographic locations, such as Amsterdam, Berlin, and Tokyo. Another approach is to equate the server name with the zone ("uk") and then set node names equal to NS record names (ns1.nic.uk, ns2.nic.uk, etc).

The presenter also functions as a storage archive. The dat-files are kept indefinitely, while XML files are generally removed after 1-7 days. For some datasets there is no loss of information going from an XML file to a dat-file. For larger datasets, summarization occurs such that some time granularity is lost.

The collector was designed to give DSC users the flexibility to create their own datasets. Datasets are defined in the collector's configuration file. However, the creation of a new dataset does not, unfortunately, automatically allow it to be displayed by a presenter. Adding new datasets (or graphs) to a presenter requires updates to the DSC source code.

The collector is written in C and uses a third-party library that requires some C++ code. The collector's support scripts are written mostly in Bourne Shell, with one in Perl. The presenter code is written mostly in Perl, with a handful of Bourne Shell support scripts.

What follows is a list of proposed DSC enhancements based on discussions that took place during the October meeting in Denver.

DSC Architectural Enhancements

Server/Node -> General Hierarchy

Some DSC users find the Server/Node naming convention restrictive.
It would be helpful to support a more general, hierarchical naming scheme. For example, uk/lon/ns1/a may specify the .uk zone, location LON(don), nameserver NS1, server A.
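A generalized hierarchy might be handled as in the sketch below. The level names ("zone", "location", "nameserver", "instance") are illustrative only; presumably a real deployment would define its own levels in configuration.

```python
# Sketch of a generalized naming hierarchy replacing the fixed Server/Node
# pair. LEVELS would come from local configuration, not be hard-coded.
LEVELS = ("zone", "location", "nameserver", "instance")

def parse_path(path):
    # Split a path like "uk/lon/ns1/a" into labeled hierarchy levels.
    parts = path.strip("/").split("/")
    if len(parts) > len(LEVELS):
        raise ValueError("path deeper than the configured hierarchy")
    return dict(zip(LEVELS, parts))

print(parse_path("uk/lon/ns1/a"))
# {'zone': 'uk', 'location': 'lon', 'nameserver': 'ns1', 'instance': 'a'}
```

Note that a path may also be partial (e.g., uk/lon), which would support aggregating data at intermediate levels of the hierarchy.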

This would be a relatively fundamental change that will require updating many parts of the code. Note that currently the XML files output by the collector do not contain the server or node names. The server and node names are specified when the collector's cron job uploads data to a presenter. Thus, if a dataset is sent to multiple presenters, it may appear under different server/node names.

Solve the Problem of Unknown Datasets in the Presenter

Currently it is possible to create new datasets in a collector. However, the presenter will not know how to display the data. This is, essentially, because datasets have names, rather than types. Datasets should be assigned types. Then the administrator can create new datasets of known types (possibly adding some filters) and have them automatically displayed by the presenter.

Representation of Data Between Collector and Presenter

There was an early decision to use XML for data representation between collectors and presenters. One reason was the belief that there would be third-party tools that desire to interact with DSC. For example, a DNS server implementation may collect statistics internally and publish them to a DSC presenter. Or, somebody might develop an improved presenter interface. In reality there are no third-party tools that interoperate with DSC at this level.

The choice of XML and Perl in the presenter causes the extractor processes to be inefficient. As discussed during our workshop, there is not strong opposition to XML per se, but something should be done to improve the speed of the extraction processes. This could mean switching to a different representation (JSON or YAML, for example). Or perhaps the XML parsing simply needs to be rewritten, perhaps in a compiled, rather than scripted, language.

Move Valid Domains from Presenter to Collector

DSC was initially designed to collect data from DNS Root nameservers. One interesting statistic at Root nameservers is the number of queries for invalid TLDs.
We get this from the qtype_vs_tld dataset, which is a set of counts of (TLD, Qtype) pairs. The DSC grapher configuration contains domain_list and valid_domains directives that it uses when displaying valid and invalid TLDs.
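The valid/invalid split that the presenter performs today can be sketched as follows. The counts and the valid-domains list below are made up for illustration; moving this check into the collector would mean classifying queries at capture time rather than at display time.

```python
# Sketch of splitting qtype_vs_tld counts into valid and invalid TLDs using
# a valid-domains list. All data here is illustrative.
valid_domains = {"com", "net", "org", "uk"}

qtype_vs_tld = {
    ("com", "A"): 120,
    ("uk", "AAAA"): 15,
    ("local", "A"): 40,    # queries for an invalid TLD
    ("belkin", "A"): 7,    # likewise
}

valid = {k: n for k, n in qtype_vs_tld.items() if k[0] in valid_domains}
invalid = {k: n for k, n in qtype_vs_tld.items() if k[0] not in valid_domains}

print(sum(invalid.values()))  # 47 queries for invalid TLDs
```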

These mechanisms should be further generalized. In particular, the notion of valid domains should be moved into the collector. For a (non-root) authoritative nameserver, the valid domains would be set to the list of domains that the nameserver is authoritative for.

Improve Documentation

Some users have commented that the documentation should be improved.

DSC Collector Enhancements

Collector Message Separation

There are many cases where the packet capture stream coming into DSC contains multiple nodes that should be separated. This may happen when a DNS operator has a number of customers and each customer should receive a DSC feed with only its own data. It may also happen when span ports are utilized to aggregate packet streams from multiple servers into a single collection box. Currently a DSC user is required to run multiple DSC collector instances, perhaps with a BPF filter rule, to separate the streams.

Instead, DSC should support the ability to supply data for multiple nodes from a single collector instance. The DSC administrator should be able to specify the method of message separation. At least the following should be supported:

- By server IP address
- By query name matching

Optionally it may be useful to support separation by MAC-layer addresses. When message separation is employed, the collector may receive queries that do not belong to any defined node. Such orphaned messages should either be dropped by the collector, or one of the nodes should be designated as a default.

Centralize Configs, Distribute to Collectors

Currently DSC has no features to assist with distribution of collector configuration files. The administrator must manually configure each collector. For some users, there may be 50-100 collector sites to configure (nearly identically). DSC needs a feature where collectors can request their configuration from a presenter or other central location. The presenter seems a logical choice since collectors already contact presenters every 60 seconds.
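The message-separation methods listed under Collector Message Separation might be sketched as below. The node names, rule format, and default-node behavior are all hypothetical; a real implementation would do this in the C collector against captured packets.

```python
import ipaddress

# Sketch: assign each captured query to a node by server IP address or by
# query-name matching, with a designated default node for orphaned messages.
NODE_RULES = [
    ("ams", {"server_ip": ipaddress.ip_address("192.0.2.10")}),
    ("customer-a", {"qname_suffix": ".customer-a.example"}),
]
DEFAULT_NODE = "default"

def node_for(server_ip, qname):
    server_ip = ipaddress.ip_address(server_ip)
    qname = qname.lower().rstrip(".")
    for node, rule in NODE_RULES:
        if rule.get("server_ip") == server_ip:
            return node
        suffix = rule.get("qname_suffix")
        if suffix and qname.endswith(suffix):
            return node
    return DEFAULT_NODE

print(node_for("192.0.2.10", "www.example.com."))          # ams
print(node_for("203.0.113.1", "www.customer-a.example."))  # customer-a
print(node_for("203.0.113.1", "www.example.org."))         # default
```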

This feature may also require support for include files in the collector configuration, so that only some parts of the configuration come from the presenter and other parts are configured locally. For example:

    # dsc.conf
    interface eth1;
    local-address 1.2.3.4;
    include "datasets.conf";

Configuration Builder

Develop a user interface to build collector configuration files. For example:

- Standard templates to use as starting points (e.g., root, TLD, authority)
- Checkboxes to select/deselect certain datasets
- Ability to clone an existing dataset

Dynamic Collector Configuration

To reconfigure a collector, the administrator must generate the new collector configuration file and manually restart the collector processes. In the best case the collector will lose data for one interval (60 seconds), perhaps longer if there are errors or additional changes become necessary. It should be possible to reconfigure a running DSC collector. Obviously, the collector must validate the new configuration before using it. If a new configuration contains errors, then the old configuration remains in use.

Attack Resilience

DSC currently has no features that can help it avoid becoming a DoS victim. If a nameserver comes under attack, DSC may not be able to keep up with all datasets at the full rate. It would be nice to have a feature whereby DSC detected overload conditions and then updated only a subset of datasets to reduce its resource consumption.

Collector Cron

Currently the collector uses standard Unix cron jobs to transmit XML files to presenters. Having these cron jobs means that the administrator must take care to ensure that both are running. It is possible that the collector stops running and cron

keeps running, or vice-versa. The administrator must also worry about where cron job output emails are sent. It would be nice if the collector itself handled these tasks so that the cron job configuration could be placed inside the collector configuration file.

DSC Presenter Enhancements

Backend Storage API

Currently, for most users, the presenter stores data in the Unix filesystem. There is a branch in the subversion repository that stores data in an SQL (MySQL or Postgres) database. At least one DSC user reports success with SQL storage. Today the filesystem backend storage is barely fast enough. To allow better scaling for DSC in the future, DSC should support alternative storage backends. These may include the Unix filesystem, traditional SQL, SQLite, and NoSQL such as Cassandra. Users should be able to choose a storage scheme that meets their needs.

The first step to modular backend storage is to define and implement a storage API for DSC. The API will allow the user to choose backends and allow different applications to read the DSC data (see Anomaly Detection below). The storage API may include the following functionality:

- Initialize store for a server/node
- Write data from collector to the store
- Read data from the store
- Perform periodic (e.g., daily) maintenance

Improve Presentation and Interaction of Graphs

Today DSC uses relatively old-fashioned web technologies. Although the web interface is a CGI script, the output is mostly just simple HTML and embedded (static) images. A number of newer web technologies exist that DSC could benefit from to make it more responsive: AJAX, XML-RPC, and HTML5.

Overall Speed Improvements

Currently the extractor component (the part that reads XML files and writes dat-files) is the least efficient. This is perhaps due to the use of Perl, use of XML, and use of the Unix filesystem.

The extractor and presenter in general must be made to operate more quickly and more efficiently. There is no obvious path to be taken at this time. It will be necessary to experiment and explore different options to understand where improvements can be made.

Support for Different Views

The presenter could benefit from a views feature whereby different users are allowed to see different things, or different levels of detail. For example, guest users may see only delayed, non-interactive data. Some users might only see data aggregated at the server level, while others can drill down deeper into the server/node hierarchy.

Anomaly Detection

During our Denver meeting we discussed whether or not DSC should provide some anomaly detection. There was consensus that this should not be a core feature of DSC, but rather, with a good Storage API, it will be possible to develop external anomaly detection applications.
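The storage API proposed under Backend Storage API might be sketched as below. The interface name, method names, and signatures are hypothetical, chosen only to mirror the four functions listed there; the in-memory backend stands in for the filesystem, SQL, SQLite, or NoSQL implementations that would satisfy the same interface.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of the proposed DSC storage API.
class DscStore(ABC):
    @abstractmethod
    def init_node(self, server, node): ...          # initialize store for a server/node

    @abstractmethod
    def write(self, server, node, dataset, timestamp, counts): ...  # collector -> store

    @abstractmethod
    def read(self, server, node, dataset, start, end): ...          # store -> readers

    @abstractmethod
    def maintain(self): ...                          # periodic (e.g., daily) maintenance

# A trivial in-memory backend to show the interface in use.
class MemoryStore(DscStore):
    def __init__(self):
        self.data = {}

    def init_node(self, server, node):
        self.data.setdefault((server, node), {})

    def write(self, server, node, dataset, timestamp, counts):
        self.data[(server, node)].setdefault(dataset, {})[timestamp] = counts

    def read(self, server, node, dataset, start, end):
        series = self.data[(server, node)].get(dataset, {})
        return {t: c for t, c in series.items() if start <= t <= end}

    def maintain(self):
        pass  # e.g., expire intervals older than the retention window

store = MemoryStore()
store.init_node("k.root-servers.net", "ams")
store.write("k.root-servers.net", "ams", "qtype", 1288000000, {"A": 10, "AAAA": 3})
print(store.read("k.root-servers.net", "ams", "qtype", 1287999999, 1288000060))
```

An external anomaly-detection application would need only the read side of such an interface, which is what makes keeping anomaly detection out of the DSC core practical.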