A Plan for the Continued Development of the DNS Statistics Collector

Background

The DNS Statistics Collector (DSC) software was initially developed under the National Science Foundation grant "Improving the Integrity of Domain Name System (DNS) Monitoring and Protection" (NSF grant SCI-0427144), a joint project between the Cooperative Association for Internet Data Analysis (CAIDA) and the Internet Systems Consortium (ISC). Most of the work was performed by Duane Wessels of The Measurement Factory, Inc. (TMF) under subcontract with CAIDA. The software was actively developed between 2004 and 2009. During 2010 a few bugs were fixed, but no new features were added.

The DSC intellectual property is jointly owned by TMF and ISC. TMF has publicly stated its desire to transfer its rights in DSC to the Domain Name System Operations Analysis and Research Center (DNS-OARC), contingent upon ISC also transferring its rights to DNS-OARC.

DSC is in use at a number of Top-Level Domains and other large-scale DNS operations. A number of DNS-OARC members contribute DSC data to DNS-OARC. In October 2010 a number of DSC users and other interested parties met in Denver, Colorado to discuss DSC's future. The general consensus of that meeting was that DSC development should continue, under the administration of DNS-OARC. The purpose of this document is to describe an initial work plan for enhancing DSC.

Terminology, Description of Architecture, and Implementation Details

DSC has a distributed design. COLLECTOR processes run at or very near DNS server software. The collector receives DNS messages via Unix packet capture (libpcap) facilities. DNS messages are analyzed and then enumerated into multiple DATASETS.
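For illustration, a dataset definition in the collector's configuration file looks roughly like the following. This follows the general shape of the dsc.conf dataset directive (a name, a protocol, one or two labeled indexers that bucket messages, and optional filters); exact directives and indexer names may differ between DSC versions.

```
# Count queries by query type (one-dimensional dataset)
dataset qtype dns All:null Qtype:qtype queries-only;

# Count queries by (TLD, Qtype) pairs (two-dimensional dataset)
dataset qtype_vs_tld dns Tld:tld Qtype:qtype queries-only;
```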
The collector outputs its datasets at 60-second intervals. Datasets are written to disk as an XML file. Datasets are then transmitted to one or more PRESENTERS, typically over SSH. A presenter receives datasets from multiple collectors. Incoming XML files are EXTRACTED and their contents appended to DAT-FILES. The presenter produces graphs via a web interface by reading the dat-files.

DSC stores data in a SERVER/NODE hierarchy. For example, consider a DNS Root server with numerous anycast instances. The server may be equal to k.root-servers.net and the nodes equal to geographic locations, such as Amsterdam, Berlin, and Tokyo. Another approach is to equate the server name with the zone (uk) and then set node names equal to NS record names (ns1.nic.uk, ns2.nic.uk, etc.).

The presenter also functions as a storage archive. The dat-files are kept indefinitely, while XML files are generally removed after 1-7 days. For some datasets there is no loss of information going from an XML file to a dat-file. For larger datasets, summarization occurs such that some time granularity is lost.

The collector was designed to give DSC users the flexibility to create their own datasets. Datasets are defined in the collector's configuration file. However, the creation of a new dataset does not, unfortunately, automatically allow it to be displayed by a presenter. Adding new datasets (or graphs) to a presenter requires updates to the DSC source code.

The collector is written in C and uses a third-party library that requires some C++ code. The collector's support scripts are written mostly in Bourne Shell, with one in Perl. The presenter code is written mostly in Perl, with a handful of Bourne Shell support scripts.

What follows is a list of proposed DSC enhancements based on discussions that took place during the October meeting in Denver.

DSC Architectural Enhancements

Server/Node -> General Hierarchy

Some DSC users find the Server/Node naming convention restrictive.
It would be helpful to support a more general, hierarchical naming scheme. For example, uk/lon/ns1/a may specify the .uk zone, nameserver NS1, location LON(don), server A.
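A hierarchical name like the one above could be handled as a slash-separated path. The following minimal sketch shows one way to split such a path into labeled components; the zone/location/nameserver/server layout is an assumption for illustration, since DSC today knows only the two-level server/node scheme.

```python
# Sketch: parsing a hierarchical node name into labeled components.
# The label order below (zone, location, nameserver, server) is a
# hypothetical layout, not anything DSC currently defines.

def parse_node_path(path, labels=("zone", "location", "nameserver", "server")):
    """Split a slash-separated node path into a dict of labeled components."""
    parts = path.strip("/").split("/")
    if len(parts) > len(labels):
        raise ValueError("path deeper than the configured hierarchy")
    return dict(zip(labels, parts))

result = parse_node_path("uk/lon/ns1/a")
```

Shorter paths (e.g., uk/lon) would simply omit the deeper labels, which is one way the scheme could stay backward compatible with today's two-level names.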
This would be a relatively fundamental change that will require updating many parts of the code. Note that currently the XML files output by the collector do not contain the server or node names. The server and node names are specified when the collector's cron job uploads data to a presenter. Thus, if a dataset is sent to multiple presenters, it may appear under different server/node names.

Solve the Problem of Unknown Datasets in the Presenter

Currently it is possible to create new datasets in a collector. However, the presenter will not know how to display the data. This is, essentially, because datasets have names, rather than types. Datasets should be assigned types. Then the administrator can create new datasets of known types (possibly adding some filters) and have them automatically displayed by the presenter.

Representation of Data Between Collector and Presenter

There was an early decision to use XML for data representation between collectors and presenters. One reason was the belief that there would be third-party tools desiring to interact with DSC. For example, a DNS server implementation might collect statistics internally and publish them to a DSC presenter. Or somebody might develop an improved presenter interface. In reality there are no third-party tools that interoperate with DSC at this level.

The choice of XML and Perl in the presenter causes the extractor processes to be inefficient. As discussed during our workshop, there is no strong opposition to XML per se, but something should be done to improve the speed of the extraction processes. This could mean switching to a different representation (JSON or YAML, for example). Or perhaps the XML parsing simply needs to be rewritten, perhaps in a compiled, rather than scripted, language.

Move Valid Domains from Presenter to Collector

DSC was initially designed to collect data from DNS Root nameservers. One interesting statistic at Root nameservers is the number of queries for invalid TLDs.
We get this from the qtype_vs_tld dataset, which is a set of counts of (TLD, Qtype) pairs. The DSC grapher configuration contains domain_list and valid_domains directives that it uses when displaying valid and invalid TLDs.
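The valid/invalid split the grapher performs can be sketched as a simple partition of the (TLD, Qtype) counts against a valid-domain list. The counts and the domain list below are illustrative, not real DSC output, and the function name is hypothetical.

```python
# Sketch: splitting qtype_vs_tld counts into valid and invalid TLDs,
# roughly what the grapher's valid_domains handling amounts to.

from collections import Counter

def split_by_validity(counts, valid_domains):
    """Partition (tld, qtype) counts by whether the TLD is a valid domain."""
    valid, invalid = Counter(), Counter()
    for (tld, qtype), n in counts.items():
        bucket = valid if tld.lower() in valid_domains else invalid
        bucket[(tld, qtype)] += n
    return valid, invalid

counts = {("uk", "A"): 120, ("com", "AAAA"): 80, ("local", "A"): 33}
valid, invalid = split_by_validity(counts, {"uk", "com", "net"})
```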
These mechanisms should be further generalized. In particular, the notion of valid domains should be moved into the collector. For a (non-root) authoritative nameserver, the valid domains would be set to the list of domains that the nameserver is authoritative for.

Improve Documentation

Some users have commented that the documentation should be improved.

DSC Collector Enhancements

Collector Message Separation

There are many cases where the packet capture stream coming into DSC contains multiple nodes that should be separated. This may happen when a DNS operator has a number of customers and each customer should receive a DSC feed with only its own data. It may also happen when span ports are utilized to aggregate packet streams from multiple servers into a single collection box. Currently a DSC user is required to run multiple DSC collector instances, perhaps with a BPF filter rule, to separate the streams.

Instead, DSC should support the ability to supply data for multiple nodes from a single collector instance. The DSC administrator should be able to specify the method of message separation. At least the following should be supported:

- By server IP address
- By query name matching

Optionally it may be useful to support separation by MAC-layer addresses. When message separation is employed, the collector may receive queries that do not belong to any defined node. Such orphaned messages should either be dropped by the collector, or one of the nodes should be designated as a default.

Centralize Configs, Distribute to Collectors

Currently DSC has no features to assist with distribution of collector configuration files. The administrator must manually configure each collector. For some users, there may be 50-100 collector sites to configure (nearly identically). DSC needs a feature where collectors can request their configuration from a presenter or other central location. The presenter seems a logical choice since collectors already contact presenters every 60 seconds.
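On the collector side, central distribution would amount to overlaying locally configured settings on top of a centrally fetched configuration. The sketch below assumes a drastically simplified "key value;" configuration format; how the central text is actually transported (for example over the existing SSH channel) is left open, and parse_conf/merge are illustrative names only.

```python
# Sketch: merging a centrally distributed configuration with local
# overrides. parse_conf handles only trivial "key value;" lines, far
# simpler than real dsc.conf.

def parse_conf(text):
    """Parse 'key value;' lines into a dict (very simplified dsc.conf)."""
    conf = {}
    for line in text.splitlines():
        line = line.strip().rstrip(";")
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition(" ")
        conf[key] = value
    return conf

def merge(central, local):
    """Local settings (interface, addresses) override central defaults."""
    merged = dict(central)
    merged.update(local)
    return merged

central = parse_conf("run_dir /var/run/dsc;\ninterface eth0;")
local = parse_conf("interface eth1;\nlocal-address 1.2.3.4;")
conf = merge(central, local)
```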
This feature may also require support for include files in the collector configuration, so that only some parts of the configuration come from the presenter and other parts are configured locally. For example:

    # dsc.conf
    interface eth1;
    local-address 1.2.3.4;
    include datasets.conf;

Configuration Builder

Develop a user interface to build collector configuration files. For example:

- Standard templates to use as starting points (e.g., root, TLD, authority)
- A checkbox to select/deselect certain datasets
- The ability to clone an existing dataset

Dynamic Collector Configuration

To reconfigure a collector, the administrator must generate the new collector configuration file and manually restart the collector processes. In the best case the collector will lose data for one interval (60 seconds); perhaps longer if there are errors or additional changes become necessary. It should be possible to reconfigure a running DSC collector. Obviously, the collector must validate the new configuration before using it. If a new configuration contains errors, then the old configuration remains in use.

Attack Resilience

DSC currently has no features that can help it avoid becoming a DoS victim. If a nameserver comes under attack, DSC may not be able to keep up with all datasets at the full rate. It would be nice to have a feature whereby DSC detects overload conditions and then updates only a subset of datasets to reduce its resource consumption.

Collector Cron

Currently the collector uses standard Unix cron jobs to transmit XML files to presenters. Having these cron jobs means that the administrator must take care to ensure that both are running. It is possible that the collector stops running and cron
keeps running, or vice versa. The administrator must also worry about where cron job output emails are sent. It would be nice if the collector itself handled these tasks so that the cron job configuration could be placed inside the collector configuration file.

DSC Presenter Enhancements

Backend Storage API

Currently, for most users, the presenter stores data in the Unix filesystem. There is a branch in the subversion repository that stores data in an SQL (MySQL or Postgres) database. At least one DSC user reports success with SQL storage. Today the filesystem backend storage is barely fast enough. To allow better scaling for DSC in the future, DSC should support alternative storage backends. These may include the Unix filesystem, traditional SQL, SQLite, and NoSQL such as Cassandra. Users should be able to choose a storage scheme that meets their needs.

The first step to modular backend storage is to define and implement a storage API for DSC. The API will allow the user to choose backends and allow different applications to read the DSC data (see Anomaly Detection below). The storage API may include the following functionality:

- Initialize store for a server/node
- Write data from collector to the store
- Read data from the store
- Perform periodic (e.g., daily) maintenance

Improve Presentation and Interaction of Graphs

Today DSC uses relatively old-fashioned web technologies. Although the web interface is a CGI script, the output is mostly just simple HTML and embedded (static) images. A number of newer web technologies exist that DSC could benefit from to make it more responsive: AJAX, XML-RPC, and HTML5.

Overall Speed Improvements

Currently the extractor component (the part that reads XML files and writes dat-files) is the least efficient. This is perhaps due to the use of Perl, use of XML, and use of the Unix filesystem.
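To make the cost concrete, the extraction step boils down to parsing each XML dataset file and appending its counts to a dat-file. The sketch below uses a deliberately simplified XML layout; real DSC XML and dat-file formats are more involved, so this only illustrates the parse-and-append work whose speed is at issue.

```python
# Sketch: the extractor's parse-and-append step on a simplified,
# made-up XML layout (not the real DSC schema).

import xml.etree.ElementTree as ET

SAMPLE = """
<dataset name="qtype" start_time="1288000000">
  <row key="A" count="1200"/>
  <row key="AAAA" count="340"/>
</dataset>
"""

def extract(xml_text):
    """Parse one dataset file into (start_time, {key: count})."""
    root = ET.fromstring(xml_text)
    counts = {r.get("key"): int(r.get("count")) for r in root.findall("row")}
    return int(root.get("start_time")), counts

def dat_line(ts, counts):
    """Render one dat-file line: timestamp followed by key:count pairs."""
    pairs = " ".join(f"{k}:{v}" for k, v in sorted(counts.items()))
    return f"{ts} {pairs}"

ts, counts = extract(SAMPLE)
line = dat_line(ts, counts)
```

Doing this in a scripting language once per dataset, per node, per minute is where the inefficiency accumulates; a compiled extractor or a cheaper wire format would attack exactly this loop.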
The extractor and presenter in general must be made to operate more quickly and more efficiently. There is no obvious path to take at this time. It will be necessary to experiment and explore different options to understand where improvements can be made.

Support for Different Views

The presenter could benefit from a views feature whereby different users are allowed to see different things, or different levels of detail. For example, guest users may see only delayed, non-interactive data. Some users might only see data aggregated at the server level, while others can drill down deeper into the server/node hierarchy.

Anomaly Detection

During our Denver meeting we discussed whether or not DSC should provide some anomaly detection. There was consensus that this should not be a core feature of DSC, but rather that, with a good Storage API, it will be possible to develop external anomaly detection applications.
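A storage API covering the four functions listed under Backend Storage API might look like the following minimal sketch. The method names and the trivial in-memory backend are illustrative assumptions, not an existing DSC interface; real backends would be filesystem, SQL, SQLite, or NoSQL implementations of the same interface.

```python
# Sketch: a hypothetical DSC storage API with one trivial backend.
# Interface names are invented for illustration.

from abc import ABC, abstractmethod

class DscStore(ABC):
    @abstractmethod
    def init_node(self, server, node): ...   # initialize store for a server/node
    @abstractmethod
    def write(self, server, node, dataset, ts, counts): ...  # collector -> store
    @abstractmethod
    def read(self, server, node, dataset): ...               # store -> readers
    @abstractmethod
    def maintain(self): ...                  # periodic (e.g., daily) maintenance

class MemoryStore(DscStore):
    """Trivial in-memory backend; real ones would use disk, SQL, etc."""
    def __init__(self):
        self.data = {}
    def init_node(self, server, node):
        self.data.setdefault((server, node), {})
    def write(self, server, node, dataset, ts, counts):
        self.data[(server, node)].setdefault(dataset, []).append((ts, counts))
    def read(self, server, node, dataset):
        return self.data[(server, node)].get(dataset, [])
    def maintain(self):
        pass  # e.g., expire XML-age data or summarize old rows

store = MemoryStore()
store.init_node("k.root-servers.net", "ams")
store.write("k.root-servers.net", "ams", "qtype", 1288000000, {"A": 1200})
rows = store.read("k.root-servers.net", "ams", "qtype")
```

An external anomaly-detection application would then consume DSC data purely through the read side of such an interface, without touching presenter internals.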