The original RFC should
be regarded as being authoritative.
Network Working Group                                           A. Romao
Request for Comments: 1713                                          FCCN
FYI: 27                                                    November 1994
Category: Informational
DNS owes much of its success to its distributed administration. Each component (called a zone, the same as a domain in most cases) is seen as an independent entity, responsible for what happens inside its domain of authority, for how and what information changes, and for letting the tree grow downwards by creating new components.
On the other hand, many inconsistencies arise from this distributed nature: many administrators make mistakes in the way they configure their domains and when they delegate authority to sub-domains; many of them don't even know how to do these things properly, letting problems last and propagate. Also, many problems occur due to bad implementations of both DNS clients and servers, especially very old ones, either by not following the standards or by being error prone, creating or allowing many of the above problems to happen.
All these anomalies make DNS less efficient than it could be, causing
trouble to network operations, thus affecting the overall Internet. This
document tries to show how important it is to have DNS properly managed,
including what is already in place to help administrators take better
care of their domains.
Some prior knowledge from the reader is assumed, both on DNS basics
and some other tools (e.g., dig and nslookup), which are not analyzed in
detail here; hopefully they are well-known enough from daily usage.
By default, host just maps host names to Internet addresses, querying the default servers or some specific one. It is possible, though, to get any kind of data (resource records) from any name server by specifying different query types and classes and asking for verbose or debugging output. You can also control several parameters like recursion, retry times, timeouts, use of virtual circuits vs. datagrams, etc., when talking to name servers. This way you can simulate resolver behavior, in order to find any problems associated with resolver operations (which is to say, with any application using the resolver library). As a query program it may be as powerful as others like nslookup or dig.
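To make the notions of query type, query class and recursion control more concrete, the following is a minimal sketch in Python of how a single-question query is laid out on the wire according to RFC 1035. It is not taken from host; the function name build_query, the fixed query identifier and the small tables of codes are choices made for this illustration only.

```python
import struct

# Numeric codes as assigned in RFC 1035; only a few are listed here.
QTYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "PTR": 12,
          "MX": 15, "AXFR": 252, "ANY": 255}
QCLASSES = {"IN": 1, "CH": 3, "HS": 4, "ANY": 255}


def build_query(name, qtype="A", qclass="IN", recurse=True, query_id=0x1234):
    """Return the wire-format bytes of a single-question DNS query.

    build_query and its parameters are hypothetical names for this sketch.
    """
    # Only the RD (recursion desired) bit is ever set in the flags word.
    flags = 0x0100 if recurse else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)

    stripped = name.rstrip(".")
    labels = stripped.split(".") if stripped else []
    qname = b""
    for label in labels:
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError("bad label: %r" % label)
        # Each label is preceded by a one-byte length.
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # the root label terminates the name

    return header + qname + struct.pack("!HH", QTYPES[qtype], QCLASSES[qclass])
```

Sent as a UDP datagram to port 53 of a name server, such a packet corresponds to the "datagram" case above; over a virtual circuit (TCP) the same bytes are preceded by a two-byte length field.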
As a debugger, host analyzes some set of the DNS space (e.g., an entire zone) and produces reports with the results of its operation. To do this, host first performs a zone transfer, which may be recursive, getting information from a zone and all its sub-zones. This data is then analyzed as requested by the arguments given on the command line. Note that zone transfers are done by contacting authoritative name servers for that zone, so it must be possible to make this kind of request from such servers: some of them refuse zone transfers (except from secondaries) to avoid congestion.
With host you may look for anomalies like those concerning authority (e.g., lame delegations, described below) or some more exotic cases like extrazone hosts (a host of the form host.some.dom.ain, where some.dom.ain is not a delegated zone of dom.ain). These errors are reported upon explicit request on the command line, but you may get a variety of other messages as side effects of host's operations. These may be mere warnings (which may be suppressed) or serious errors. In fact, warning messages are not that trivial: most of them are due to misconfigured zones, so it might not be a good idea to just ignore them.
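The definition of an extrazone host given above can be sketched as a small Python function. This is only an illustration working on names already obtained from a zone transfer; it is not how host itself is implemented, and the function name and argument layout are assumptions made for the example.

```python
def extrazone_hosts(zone, host_names, delegated_zones):
    """Return the names lying more than one label below `zone` whose
    intermediate domains are not delegated zones of `zone`.

    All three parameters are hypothetical: `host_names` stands for the
    owner names of address records found in the zone data, and
    `delegated_zones` for the sub-domains that have NS records there.
    """
    zone = zone.lower().rstrip(".")
    delegated = {d.lower().rstrip(".") for d in delegated_zones}
    suffix = "." + zone
    found = []
    for name in host_names:
        plain = name.lower().rstrip(".")
        if not plain.endswith(suffix):
            continue  # not inside the zone at all
        labels = plain[: -len(suffix)].split(".")
        if len(labels) < 2:
            continue  # host.dom.ain: directly inside the zone
        # Every domain between the host and the zone origin,
        # e.g. some.dom.ain for host.some.dom.ain.
        parents = [".".join(labels[i:]) + suffix for i in range(1, len(labels))]
        if not any(parent in delegated for parent in parents):
            found.append(name)
    return found
```

For example, with zone dom.ain and sub.dom.ain as its only delegated zone, host.some.dom.ain is reported while www.dom.ain and pc.sub.dom.ain are not.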
Error messages have to do with serious anomalies, either with the packets
exchanged with the queried servers (size errors, invalid ancounts, nscounts
and the like), or others related to the DNS information itself (also called
"status messages" in the program's documentation): inconsistencies between
SOA records as shown by
different servers for a domain, unexpected address-to-name mappings,
name servers not responding, not reachable, not running or not existing
at all, and so on.
Host performs all its querying on-line, i.e., it only works with data received from name servers, which means you have to query a name server more than once if you want to get different kinds of reports on some particular piece of data. You can always arrange arguments in such a way that you get all information you want by running it once, but if you forget something or for any reason have to run it again, this means extra zone transfers, extra load on name servers, extra DNS traffic.
Host is an excellent tool, if used carefully. Like most other
querying programs it may generate lots of traffic, just by issuing a simple
command. Apart from that, its resolver simulation and debug capabilities
make it useful to find many common and some not so common DNS configuration
errors, as well as generate useful reports and statistics about the DNS
tree. As an example, RIPE (Reseaux IP Europeens) NCC uses it to generate
a monthly European hostcount, giving an overview of the evolution of
Internet usage in Europe. Along with these counts, error reports are
generated, one per country, and all of this information is made available
in the RIPE archive.
Dnswalk checks domain configurations stored locally, with data arranged hierarchically in directories, resembling the DNS tree organization of domains. To set up this information, dnswalk may first perform zone transfers from authoritative name servers. You can have a recursive transfer of a domain and its sub-domains, though you should be careful when doing this, as it may generate a great amount of traffic. If the data is already present, dnswalk may skip these transfers, provided that it is up to date.
Dnswalk looks for inconsistencies in resource records, such as MX records and aliases pointing to aliases or to unknown hosts, incoherent PTR, A and CNAME records, invalid characters in names, missing trailing dots, unnecessary glue information, and so on. It also does some checking on authority information, namely lame delegations and domains with only one name server. It is easy to use: you only have to specify the domain to analyze and some optional parameters, and the program does the rest. Only one domain (and its sub-domains, if that's the case) can be checked at a time, though.
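Two of the checks just listed (MX records and aliases pointing to aliases or to unknown hosts) can be sketched in Python as follows. This is not dnswalk's code (dnswalk is a Perl program); the record layout and the function name are assumptions made for the illustration, and "unknown" here only means "not present in the data supplied".

```python
def check_targets(records):
    """Report CNAME and MX records whose target is an alias or is unknown.

    `records` is a hypothetical list of (owner, type, target) tuples; for
    MX records the target is the mail exchanger (the preference is omitted).
    """
    norm = [(owner.lower().rstrip("."), rtype.upper(), target.lower().rstrip("."))
            for owner, rtype, target in records]
    aliases = {owner for owner, rtype, _ in norm if rtype == "CNAME"}
    hosts = {owner for owner, rtype, _ in norm if rtype == "A"}

    problems = []
    for owner, rtype, target in norm:
        if rtype not in ("CNAME", "MX"):
            continue
        if target in aliases:
            problems.append((owner, rtype, "points to alias " + target))
        elif target not in hosts:
            problems.append((owner, rtype, "points to unknown host " + target))
    return problems
```

A target outside the analyzed data is reported as unknown by this sketch; a real checker would have to query for it before drawing that conclusion.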
While in the process of checking data, dnswalk uses dig and resolver
routines (gethostbyXXXX from the Perl library) a lot, to get such data
as authority information from the servers of the analyzed domains, names
from IP addresses so as to verify the existence of PTR records, aliases
and so on. So, besides the zone transfers you may count on some more
extra traffic (maybe not negligible if you are debugging a relatively large
amount of data and care about query retries and timeouts), just by running
the program.
It's easy to create a lame delegation: the most common case happens when an administrator changes the NS list for his domain, dropping one or more servers from that list, without informing his parent domain administration, who delegated him authority over the domain. From then on the parent name server announces one or more servers for the domain, which will receive queries for something they don't know about. (On the other hand, servers may be added to the list without the parent's servers knowing, thus hiding valuable information from them - this is not a lame delegation, but shouldn't happen either.) Other examples are the inclusion of a name in an NS list without telling the administrator of that host, or when a server suddenly stops providing name service for a domain.
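The two situations described above amount to a comparison between the NS list announced by the parent and the NS list kept by the domain itself. A minimal Python sketch of that comparison follows; the function and key names are assumptions made for the example, and the two lists are presumed to have been obtained beforehand.

```python
def norm(name):
    """Lower-case a domain name and drop its trailing dot."""
    return name.lower().rstrip(".")


def compare_ns_lists(parent_ns, child_ns):
    """Compare the parent's NS list for a domain with the domain's own.

    Servers announced only by the parent are candidates for lame
    delegations; servers listed only by the domain are hidden from the
    parent. Both result keys are hypothetical names for this sketch.
    """
    parent = {norm(n) for n in parent_ns}
    child = {norm(n) for n in child_ns}
    return {
        "possibly_lame": sorted(parent - child),
        "unknown_to_parent": sorted(child - parent),
    }
```

A server is only a candidate at this point: to confirm a lame delegation one would still have to query it and see that its answer for the domain is not authoritative.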
To detect and warn DNS administrators all over the world about this
kind of problem, Bryan Beecher from University of Michigan wrote lamers,
a program to analyze named (the well-known BIND name server) logging information
[2]. To produce useful logs, named was patched to detect and log
lame delegations (this patch was originally written
by Don Lewis from Silicon Systems and is now part of the latest release
of BIND thanks to Bryan Beecher, so it is expected to be widely available
in the near future). Lamers is a small shell script that simply scans
these logs and reports the lame delegations found. This reporting
is done by sending mail to the hostmasters of the affected domains, as
stated in the SOA record for each of them. If this is not possible, the
message is sent to the affected name servers' postmasters instead.
Manual processing is needed in case of bounces, caused by careless setup
of those records or invalid postmaster addresses. A report of the
errors found by the U-M servers is also posted twice a month on the USENET
newsgroup comp.protocols.tcp-ip.domains.
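The hostmaster address "as stated in the SOA record" is stored in the RNAME field as a domain name: following RFC 1035, its first label is the local part of the mailbox, and a dot inside that local part is escaped with a backslash. The Python sketch below shows this conversion and the postmaster fallback; it is an illustration only, not code from lamers (which is a shell script), and the function names are assumptions.

```python
def rname_to_mailbox(rname):
    """Convert an SOA RNAME such as 'hostmaster.dom.ain.' to a mail address."""
    name = rname.rstrip(".")
    local = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "\\" and i + 1 < len(name):
            # Escaped character (typically a dot) belongs to the local part.
            local.append(name[i + 1])
            i += 2
        elif ch == ".":
            # The first unescaped dot separates local part and mail domain.
            return "".join(local) + "@" + name[i + 1:]
        else:
            local.append(ch)
            i += 1
    raise ValueError("no domain part in RNAME: %r" % rname)


def fallback_recipients(name_servers):
    """Postmaster addresses of the affected name servers (hypothetical helper)."""
    return ["postmaster@" + ns.rstrip(".") for ns in name_servers]
```

Bounces of the kind mentioned above typically come from RNAME fields that do not follow this convention, for instance when a literal "@" was typed into the zone file.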
If you ever receive such a report, you should study it carefully in order to find and correct problems in your domain, or see if your servers are being affected by the spreading of erroneous information. Better yet, lamers could be run on your servers to detect more lame delegations (U-M can't see them all!). Also, if you receive mail reporting a lame delegation affecting your domain or some of your hosts, please don't just ignore it or flame the senders. They're really trying to help!
You can get lamers from ftp://terminator.cc.umich.edu/dns/lame-delegations.
To look for this kind of problem, Paul Mockapetris and Steve Hotz, from the Information Sciences Institute, wrote a C-shell script called DOC (Domain Obscenity Control), an automated domain testing tool that uses dig to query the appropriate name servers about authority for a domain and analyzes the responses.
DOC limits its analysis to authority data, since the authors anticipated that people would complain about such things as invasion of privacy. Also, at the time it was written most domains were so messy that they thought there wouldn't be much point in checking anything deeper until the basic problems were fixed.
Only one domain is analyzed each time: the program checks if all the
servers for the parent domain agree about the delegation information for
the domain. DOC then picks a list of name servers for the domain
(obtained from one of the parent's servers) and starts checking on their
information, querying each of them: it looks for the SOA record, checks if
the response is authoritative, compares the various records retrieved,
gets each one's list of NS records, and compares the lists (both among
these servers and with the parent's). For those servers inside the
domain, the program also looks for their PTR records.
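The sequence of checks just described can be sketched in Python over answers that have already been collected. This is not DOC's code (DOC is a C-shell script driving dig); the data layout, the function name and the wording of the findings are assumptions made for the illustration, and the PTR lookups are left out.

```python
def norm(name):
    """Lower-case a domain name and drop its trailing dot."""
    return name.lower().rstrip(".")


def check_delegation(parent_views, server_answers):
    """Return a list of findings about one domain's authority data.

    parent_views:   {parent server: NS names it lists for the domain}
    server_answers: {domain server: None if it did not respond, else
                     {"aa": bool, "serial": int, "ns": NS names it gave}}
    Both structures are hypothetical and stand for query results.
    """
    findings = []
    views = {p: frozenset(norm(n) for n in ns) for p, ns in parent_views.items()}
    if len(set(views.values())) > 1:
        findings.append("parent servers disagree about the delegation")
    delegated = set().union(*views.values())

    serials = set()
    for server, answer in sorted(server_answers.items()):
        if answer is None:
            findings.append(server + ": no response")
            continue
        if not answer["aa"]:
            findings.append(server + ": answer is not authoritative")
            continue
        serials.add(answer["serial"])
        if {norm(n) for n in answer["ns"]} != delegated:
            findings.append(server + ": NS list differs from the parent's")
    if len(serials) > 1:
        findings.append("SOA serial numbers differ: "
                        + ", ".join(str(s) for s in sorted(serials)))
    return findings
```

An empty result means that, as far as these checks go, the parent and all the domain's servers agree.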
Due to several factors, DOC seems to have frozen since its first public
release, back in 1990. Within the distribution there is an RFC draft
about automated domain testing, which was never published. Nevertheless,
it may provide useful reading. The software can be fetched from ftp://ftp.uu.net/networking/ip/dns/doc.2.0.tar.Z.
The DDT tools work on cached DNS data, i.e., data stored locally after performing zone transfers (presently done by a slightly modified version of BIND's named-xfer, called ddt-xfer, which allows recursive transfers) from the appropriate servers, rather than querying name servers on-line each time they run. This option was taken for several reasons [3]: (1) efficiency, since data is read from disk, avoiding network transit delays, (2) reduced network traffic, since data has to be fetched only once and the programs can then be run over it as many times as you wish, and (3) accessibility - in countries with limited Internet access, as was the case in Portugal at the time DDT was in its first stages, this may be the only practical way to use the tools.
Point (2) above deserves some special consideration. First, it is not
entirely true that there are no additional queries while processing the
information: one of the tools, the authority checker, queries (via dig)
each domain's purported name servers in order to test the consistency of
the authority information they provide about the domain. Second, it may
be argued that when the actual tests are done the information used may
be out of date. While this is true, you should note that this is the
nature of DNS: if you obtain some piece of information, you can't be sure
that one second later it is still valid. Furthermore, if your source was
not the primary for the domain, then you can't even be sure of its
validity at the exact moment you got it in the first place. But
experience shows that if you see an error, it is likely to be there in
the next version of the domain information (and if it isn't, nothing was
lost by having detected it in the past). On the other hand, of course,
there's little point in checking one month old data...
The list of errors looked for includes lame delegations, version number mismatches between servers (this may be a transient problem), non-existing servers, domains with only one server, unnecessary glue information, MX records pointing to hosts not in the analyzed domain (this may not be an error; it just points out possibly strange or expensive mail-routing policies), MX records pointing to aliases, A records without the respective PTR and vice-versa, missing trailing dots, hostnames with no data (A or CNAME records), aliases pointing to aliases, and some more. Given the specialized nature of each tool, it is possible to look for a well defined set of errors, instead of having the data analyzed in all possible ways.
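As an illustration of one item in this list, the check for "A records without the respective PTR and vice-versa" can be sketched in Python over data that is already stored locally. This is not DDT code (the DDT tools are written in Perl); the function name and the record layout are assumptions, and only IPv4 addresses are considered.

```python
import ipaddress


def ptr_mismatches(a_records, ptr_records):
    """Compare forward and reverse data.

    a_records:   list of (host name, IPv4 address) pairs
    ptr_records: list of (in-addr.arpa name, host name) pairs
    Returns (A records lacking a matching PTR, PTR records lacking a
    matching A), each as a sorted list of (reverse name, host name).
    """
    def norm(name):
        return name.lower().rstrip(".")

    # reverse_pointer turns 192.0.2.10 into 10.2.0.192.in-addr.arpa
    expected = {(ipaddress.ip_address(addr).reverse_pointer, norm(host))
                for host, addr in a_records}
    actual = {(norm(rev), norm(host)) for rev, host in ptr_records}
    return sorted(expected - actual), sorted(actual - expected)
```

Because the comparison is made on (reverse name, host name) pairs, a PTR record that exists but points to a different name shows up in both lists.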
Except for ddt-xfer, all the programs are written in Perl. A new
release may come into existence in the near future, after a thorough review
of the methods used, the set of errors checked for and some bug fixing
(in particular, a Perl version of ddt-xfer is expected). In the meantime,
the latest version is available from
ftp://ns.dns.pt/pub/dns/ddt-2.0.1.tar.gz.
While the panacea is yet to be found (claims are made that the latest official version of BIND is a great step forward [5]), work has been done in order to identify sources of anomalies, as a first approach in the search for a solution. The Checker Project is one such effort, developed at the University of Southern California [6]. It consists of a set of C code patched into BIND's named, for monitoring server activity and building a database with the history of that operation (queries and responses). It is then possible to generate reports from the database summarizing activity and identifying behavioral patterns from client requests, looking for anomalies. The named code alteration is small and simple unless you want to have PEC checking enabled (see below). You may find sources and documentation at ftp://catarina.usc.edu/pub/checker.
Checker only does this kind of collection and reporting; it does not try to enforce any rules on the administrators of the defective sites by any means whatsoever. The authors hope that simply exhibiting the evidence is a reason strong enough for those administrators to have their problems fixed.
An interesting feature is PEC (proactive error checking): the server pretends to be unresponsive for some queries by randomly choosing some name and refusing to reply to queries on that name during a pre-determined period. Those queries are recorded, though, to try to reason about the retry and timeout schemes used by name servers and resolvers. It is expected that properly implemented clients will choose another name server to query, while defective ones will keep on trying with the same server. This feature seems to be still under testing, as it is not completely clear yet how to interpret the results. A PEC-only error checker is available from USC that is much simpler than the full error checker. It examines another name server client every 30 minutes to see if this client causes excessive load.
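The idea behind PEC can be illustrated with a small simulation in Python. This is in no way the Checker code (which is C patched into named); the class name, the way time is passed in and the retry threshold are all assumptions chosen for the example.

```python
import random
from collections import Counter


class PecProbe:
    """Pretend to be unresponsive for one randomly chosen name for a while,
    recording which clients keep retrying (hypothetical sketch)."""

    def __init__(self, names, duration, now, rng=random):
        # Pick the name to go silent on, and until when.
        self.blocked_name = rng.choice(sorted(names))
        self.until = now + duration
        self.retries = Counter()

    def should_answer(self, client, qname, now):
        """Return False (and record the query) while the probe is active."""
        if qname == self.blocked_name and now < self.until:
            self.retries[client] += 1
            return False
        return True

    def suspects(self, threshold):
        """Clients that queried the silent name more than `threshold` times."""
        return sorted(c for c, n in self.retries.items() if n > threshold)
```

A well-behaved client is expected to appear only a few times in the record before moving on to another server; what threshold separates it from a defective one is exactly the kind of question noted above as not yet settled.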
Presently Checker has been running on a secondary for the US domain for more than a year with little trouble. The authors feel confident it should run on any BSD platform (at least SunOS) without problems, and it is planned to be included as part of the BIND name server.
Checker is part of a research project led by Peter Danzig from USC,
aimed at implementing probabilistic error checking mechanisms like PEC on
distributed systems [7]. DNS is one such system
and it was chosen as the platform for testing the validity of these techniques
over the NSFnet. It is hoped to achieve enough knowledge to provide
means to improve performance and reliability of distributed systems. Anomalies
like undetected server failures, query loops, bad retransmission backoff
algorithms, misconfigurations and resubmission of requests after negative
replies are some of the targets for these checkers to detect.
The reference taken was the contrib directory in the latest BIND distribution (where some of the above programs can also be found). There you will find tools for creating your DNS configuration files and NIS maps from /etc/hosts and vice-versa, or for generating PTR records from A records (these things may be important as a means of avoiding common typing errors and inconsistencies between those tables), syntax checkers for zone files, programs for querying and monitoring name servers, all the small programs presented in [8], and more. It is worth spending some time looking at them; maybe you'll find that program you were planning to write yourself. The latest public version of BIND can be found at ftp://gatekeeper.dec.com/pub/misc/vixie/4.9.2-940221.tar.gz. As of this writing BIND-4.9.3 is in its final beta stages and a public release is expected soon, also at gatekeeper.dec.com.
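To show why generating both tables from a single source avoids inconsistencies, here is a minimal Python sketch that derives A and PTR records from text in /etc/hosts format. It is not one of the programs in the contrib directory; the function name and the simplified one-line record format are assumptions made for the illustration, and only the first name on each line and IPv4 addresses are handled.

```python
import ipaddress


def hosts_to_records(hosts_text, origin):
    """Build matching A and PTR record lines from /etc/hosts-style text.

    Unqualified names are completed with `origin` (a hypothetical
    parameter standing for the zone being generated).
    """
    origin = origin.strip(".")
    a_lines, ptr_lines = [], []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) < 2:
            continue
        address, name = fields[0], fields[1]
        try:
            reverse = ipaddress.IPv4Address(address).reverse_pointer + "."
        except ValueError:
            continue  # not an IPv4 address; skipped in this sketch
        fqdn = name if "." in name else name + "." + origin
        fqdn = fqdn.rstrip(".") + "."
        a_lines.append("%s IN A %s" % (fqdn, address))
        ptr_lines.append("%s IN PTR %s" % (reverse, fqdn))
    return a_lines, ptr_lines
```

Since each PTR line is produced from the same input line as its A record, a mistyped address or name can no longer make the two tables disagree with each other.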
You may also want to consider using a version control system like SCCS
or RCS to keep your configuration files consistent through updates,
or use tools like M4 macros to generate those files. As stated above,
it's important to avoid human-generated errors, which create problems that
are difficult to track down, since they're often hidden behind some
mistyped name. Errors like this may end up in many queries for a
non-existing name, just to mention the least serious kind. See [9] for a
description of the most common errors made while configuring domains.
You may simply want to perform some sanity checks on your own domain, without any further concerns. Or you may want to participate in some kind of global effort to monitor name server traffic, either for research purposes or just to point out the "trouble-queries" that flow around.
Whatever your interest may be, you can almost surely find a tool to suit it. Eliminating problems like those described in this document is a major contribution to the efficiency of an important piece of the Internet mechanism. To get an idea of this importance, think of all the applications that depend on it, and not just to get addresses out of names. Many systems rely on DNS to store, retrieve and spread the information they need: Internet electronic mail was already mentioned (see [10] for details) and work is in progress to integrate X.400 operations with DNS [11]; others include "remote printing" services [12], distributed file systems and network routing. These features may be accomplished by some standard, well-known resource records [13], or by new, experimental ones [14, 15]. Even if some of them don't succeed, one may well expect some more load on the DNS.
The ubiquitous DNS thus deserves a great deal of attention, perhaps
much more than it generally has. One may say that it is a victim
of its own success: if a user triggers an excessive amount of queries only
to have one request satisfied, he won't worry about it (in fact, he won't
notice it), won't complain to his system administrator, and things will
just go on like this. Of course, DNS was designed to resist and provide
its services despite all these anomalies. But by doing so it is frequently
forgotten, as long as people can Telnet or ftp. As DNS will be given
new responsibilities, as pointed out in the paragraph above, the problems
described in this text will grow more serious and new ones may appear
(notably security ones [16], with a lot of work presently in progress
addressing security in DNS), if nothing is done to purge them.
[2] Beecher, B., "Dealing With Lame Delegations", Univ. Michigan,
    LISA VI, October 1992.
[3] Frazao, J. and J. L. Martins, "Ddt - Domain Debug Tools, A
    Package to Debug the DNS Tree", Dept. Informatica Faculdade
    Ciencias Univ. Lisboa, DI-FCUL-1992-04, January 1992.
[4] Danzig, P., "Probabilistic Error Checkers: Fixing DNS", Univ.
    Southern California, Technical Report, February 1992.
[5] Kumar, A., J. Postel, C. Neuman, P. Danzig and S. Miller, "Common
    DNS Implementation Errors and Suggested Fixes", RFC 1536,
    USC/Information Sciences Institute, October 1993.
[6] Miller, S. and P. Danzig, "The Checker Project, Installation and
    Operator's Manual", Univ. Southern California, TR CS94-560, 1994.
[7] Danzig, P., K. Obraczka and A. Kumar, "An Analysis of Wide-Area
    Name Server Traffic", Univ. Southern California, TR 92-504, 1992.
[8] Albitz, P. and C. Liu, "DNS and BIND", O'Reilly and Associates
    Inc., October 1992.
[9] Beertema, P., "Common DNS Data File Configuration Errors",
    RFC 1537, CWI, October 1993.
[10] Partridge, C., "Mail Routing and the Domain System", STD 14,
     RFC 974, CSNET CIC BBN Laboratories Inc., January 1986.
[11] Allocchio, C., A. Bonito, B. Cole, S. Giordano and R. Hagens,
     "Using the Internet DNS to Distribute RFC1327 Mail Address
     Mapping Tables", RFC 1664, GARR, Cisco Systems Inc., Centro
     Svizzero Calcolo Scientifico, ANS, August 1994.
[12] Malamud, C. and M. Rose, "Principles of Operation for the TPC.INT
     Subdomain: General Principles and Policy", RFC 1530, Internet
     Multicasting Service, Dover Beach Consulting Inc., October 1993.
[13] Rosenbaum, R., "Using the Domain Name System to Store Arbitrary
     String Attributes", RFC 1464, Digital Equipment Corporation,
     May 1993.
[14] Everhart, C., L. Mamakos, R. Ullmann and P. Mockapetris (Ed.),
     "New DNS RR Definitions", RFC 1183, Transarc, Univ. Maryland,
     Prime Computer, Information Sciences Institute, October 1990.
[15] Manning, B., and R. Colella, "DNS NSAP Resource Records",
     RFC 1706, USC/Information Sciences Institute, NIST, October 1994.
[16] Gavron, E., "A Security Problem and Proposed Correction With
     Widely Deployed DNS Software", RFC 1535, ACES Research Inc.,
     October 1993.
Phone: +351 1 294 28 44
Fax: +351 1 295 77 86
EMail: artur@fct.unl.pt