Domain name system
The domain name system (DNS) stores and associates many types of information with domain names, but most importantly, it translates domain names (computer hostnames) to IP addresses. It also lists mail exchange servers accepting e-mail for each domain. In providing a worldwide keyword-based redirection service, DNS is an essential component of contemporary Internet use.
Useful for several reasons, the DNS pre-eminently makes it possible to attach easy-to-remember domain names (such as "wikipedia.org") to hard-to-remember IP addresses (such as 207.142.131.206). Humans take advantage of this when they recite URLs and e-mail addresses. In a subsidiary function, the domain name system makes it possible for people to assign authoritative names without needing to communicate with a central registrar each time.
The practice of using a name as a more human-legible abstraction of a machine's numerical address on the network predates even TCP/IP, and goes all the way back to the ARPAnet era. Originally, each computer on the network retrieved a file called HOSTS.TXT from SRI (now SRI International), which mapped an address (such as 192.0.34.166) to a name (such as www.example.net). The Hosts file still exists on most modern operating systems, either by default or through configuration, and allows users to specify an IP address to use for a hostname without checking the DNS. This file now serves primarily for troubleshooting DNS errors or for mapping local addresses to more memorable names. (The Hosts file can also help in ad-blocking, and spyware may utilize it to hijack a computer.) But a system based on a HOSTS.TXT file had an inherent limitation: every time a given computer's address changed, every computer that wanted to communicate with it needed an update to its Hosts file.
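As a rough illustration of the name-to-address table a Hosts file provides, the following sketch parses text in the common "address, then one or more names" layout. The function name `parse_hosts`, the sample text, and the first-entry-wins rule are assumptions made for this example, not part of any specification quoted here.

```python
def parse_hosts(text):
    """Parse hosts-file style text into a {hostname: address} mapping."""
    table = {}
    for line in text.splitlines():
        # Drop anything after a '#' comment marker, then trim whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Assumption of this sketch: the first entry seen for a name wins.
            table.setdefault(name.lower(), address)
    return table


# Hypothetical sample content, using the address from the text above.
SAMPLE = """\
# address        name(s)
192.0.34.166     www.example.net
127.0.0.1        localhost loopback
"""

hosts = parse_hosts(SAMPLE)
print(hosts["www.example.net"])
```

Every machine holding such a table needs its own copy updated when an address changes, which is the scaling problem described above.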
The growth of networking called for a more scalable system: one that recorded a change in a host's address in one place only. Other hosts would learn about the change dynamically through a notification system, thus completing a globally accessible network of all hosts' names and their associated IP addresses. Enter the DNS.
Paul Mockapetris invented the DNS in 1983; the original specifications appear in RFC 882 and RFC 883. In 1987, the publication of RFC 1034 and RFC 1035 updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several more-recent RFCs have proposed various extensions to the core DNS protocols.
Mockapetris wrote the first implementation of DNS. The following year (1984), four Berkeley students (Douglas Terry, Mark Painter, David Riggle and Songnian Zhou) wrote the first Unix implementation. Ralph Campbell maintained Terry et al.'s work after that. In 1985, Kevin Dunlap of Digital Equipment Corporation substantially rewrote the DNS implementation and renamed it BIND. Mike Karels, Phil Almquist and Paul Vixie have maintained BIND since then. A port of BIND to the Windows NT platform took place in the early 1990s.
Figure: Domain names, arranged in a tree, cut into zones, each served by a nameserver.
The domain name space consists of a tree of domain names. Each node or leaf in the tree has an associated resource record, which holds the information associated with the domain name. The tree sub-divides into zones. A zone consists of a collection of connected nodes authoritatively served by an authoritative DNS nameserver. (Note that a single nameserver can host several zones.)
When a system administrator wants to let another administrator control a part of the domain name space within his or her zone of authority, he or she can
delegate control to the other administrator. This splits a part of the old zone off into a new zone, which comes under the authority of the second administrator's nameservers. The old zone becomes no longer authoritative for what comes under the authority of the new zone.
A resolver looks up the information associated with nodes. A resolver knows how to communicate with name servers by sending DNS requests, and heeding DNS responses. Resolving usually entails recursing through several name servers to find the needed information.
Some resolvers function simplistically and can only communicate with a single name server. These simple resolvers rely on a recursing name server to perform the work of finding information for them.
Understanding the parts of a domain name
A domain name usually consists of two or more parts (technically labels), separated by dots. For example: wikipedia.org.
* The rightmost label conveys the top-level domain (for example, the address en.wikipedia.org has the top-level domain org).
* Each label to the left specifies a subdivision or subdomain of the domain above it. Note that "subdomain" expresses relative dependence, not absolute dependence: for example, wikipedia.org is a subdomain of the org domain, and en.wikipedia.org is a subdomain of the domain wikipedia.org. In theory, this subdivision can go down to 127 levels deep, and each label can contain up to 63 characters, as long as the whole domain name does not exceed a total length of 255 characters. But in practice some domain registries have shorter limits than that.
* DNS refers to a domain name that has one or more associated IP addresses as a hostname. For example, the en.wikipedia.org and wikipedia.org domains are both hostnames, but the org domain is not.
The DNS consists of a hierarchical set of DNS servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains "beneath" it. The hierarchy of authoritative DNS servers matches the hierarchy of domains. At the top of the hierarchy stand the root servers: the servers to query when looking up (resolving) a top-level domain name (TLD).
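The label rules described earlier in this section (dot-separated labels, at most 63 characters per label, 127 levels, and 255 characters overall) can be sketched as a small check. The function names are hypothetical, and the sketch tests lengths only; it does not check which characters are allowed.

```python
MAX_LABEL = 63    # characters per label, as stated in the text
MAX_NAME = 255    # total length of the whole name, as stated in the text
MAX_LEVELS = 127  # maximum depth of subdivision, as stated in the text


def split_labels(name):
    """Split a domain name into its labels, ignoring one trailing dot."""
    return name.rstrip(".").split(".")


def check_name(name):
    """Return True if the name respects the length limits listed above."""
    labels = split_labels(name)
    if len(name) > MAX_NAME or len(labels) > MAX_LEVELS:
        return False
    return all(1 <= len(label) <= MAX_LABEL for label in labels)


def parent_domains(name):
    """List the domains a name is a subdomain of, nearest first."""
    labels = split_labels(name)
    return [".".join(labels[i:]) for i in range(1, len(labels))]


print(split_labels("en.wikipedia.org")[-1])   # the top-level domain label
print(parent_domains("en.wikipedia.org"))
print(check_name("en.wikipedia.org"))
```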
The address resolution mechanism
(This description deliberately uses the fictional
.example TLD in accordance with the DNS guidelines themselves.)
In theory, a full host name may have several name segments (e.g. ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, most public users of Internet services encounter full host names of just three segments (ahost.inadomain.example, and most often www.inadomain.example).
For querying purposes, software interprets the name segment by segment, from right to left, using an iterative search procedure. At each step along the way, the program queries a corresponding DNS server to provide a pointer to the next server which it should consult.
Figure: A DNS recurser consults three nameservers to resolve the address www.wikipedia.org.
As originally envisaged, the process was as simple as:
1. The local system is pre-configured with the known addresses of the root servers in a file of root hints, which the local administrator needs to update periodically from a reliable source to keep up with the changes that occur over time.
2. The resolver queries one of the root servers to find the server authoritative for the next level down (so in the case of our simple hostname, a root server would be asked for the address of a server with detailed knowledge of the example top-level domain).
3. It queries this second server for the address of a DNS server with detailed knowledge of the second-level domain (inadomain.example in our example).
4. It repeats the previous step to progress down the name, until the final step, which returns the final address sought rather than the address of the next DNS server.
The diagram illustrates this process for the real host www.wikipedia.org.
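The steps above can be imitated with an in-memory toy. This is only a sketch: the server labels, the delegation table, and the address 192.0.2.10 are invented for illustration, and no network traffic takes place.

```python
# Hypothetical delegation data standing in for real name servers.
SERVERS = {
    "root": {"referrals": {"example": "tld-server"}},
    "tld-server": {"referrals": {"inadomain.example": "domain-server"}},
    "domain-server": {"addresses": {"ahost.inadomain.example": "192.0.2.10"}},
}


def resolve(name, start="root"):
    """Follow referrals from the root until some server knows the address.

    Returns the address and the list of servers consulted, in order.
    """
    labels = name.split(".")
    server = start
    path = [server]
    while True:
        data = SERVERS[server]
        if name in data.get("addresses", {}):
            return data["addresses"][name], path
        # Look for the longest suffix of the name that this server delegates.
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in data.get("referrals", {}):
                server = data["referrals"][suffix]
                path.append(server)
                break
        else:
            raise LookupError(f"no server knows about {name}")


address, consulted = resolve("ahost.inadomain.example")
print(address, consulted)
```

Each pass through the loop corresponds to one query in the list above: the current server either answers or points to the next server to consult.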
The mechanism in this simple form has a difficulty: it places a huge operating burden on the collective of root servers, with each and every search for an address starting by querying one of them. Given how critical the root servers are to the overall function of the system, such heavy use would create an insurmountable bottleneck for the trillions of queries placed every day. In practice there are two key additions to the mechanism.
* Firstly, the DNS resolution process allows for local recording and subsequent consultation of the results of a query (or caching) for a period of time after a successful answer (the server providing the answer initially dictates the period of validity, which may vary from just seconds to days or even weeks). In our illustration, having found a list of addresses of servers capable of answering queries about the .example domain, the local resolver will not need to make the query again until the validity of the currently known list expires, and so on for all subsequent steps. Hence having successfully resolved the address of ahost.inadomain.example it is not necessary to repeat the process for some time since the address already reached will be deemed reliable for a defined period, and resolution of anotherhost.anotherdomain.example can commence with already knowing which servers can answer queries for the .example domain. Caching significantly reduces the rate at which the most critical name servers have to respond to queries, adding the extra benefit that subsequent resolutions are not delayed by network transit times for the queries and responses.
* Secondly, most domestic and small-business clients "hand off" address resolution to their ISP's DNS servers to perform the look-up process, thus allowing for the greatest benefit from those same ISPs having busy local caches serving a wide variety of queries and a large number of users.
These additions to the mechanism are discussed in greater detail below.
Circular Dependencies and Glue Records
Name servers in delegations appear listed by name, rather than by IP address. This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred. Since this can introduce a circular dependency if the nameserver referred to is under the domain for which it is authoritative, the nameserver providing the delegation must in that case also provide the IP address of the next nameserver. This record is called a glue record.
For example, assume that the sub-domain en.wikipedia.org contains further sub-domains (such as something.en.wikipedia.org) and that the authoritative nameserver for these lives at ns1.en.wikipedia.org. A computer trying to resolve something.en.wikipedia.org will thus first have to resolve ns1.en.wikipedia.org. Since ns1 is also under the en.wikipedia.org subdomain, resolving ns1.en.wikipedia.org requires resolving ns1.en.wikipedia.org, which is exactly the circular dependency mentioned above. The dependency is broken by the glue record in the nameserver of wikipedia.org that provides the IP address of ns1.en.wikipedia.org directly to the requestor, enabling it to bootstrap the process by figuring out where ns1.en.wikipedia.org is located.
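The condition under which a glue record is needed, as described above, is simply that the nameserver's own name lies inside the zone being delegated. A minimal sketch of that test follows; the function name `needs_glue` and the second nameserver name are hypothetical.

```python
def needs_glue(zone, nameserver):
    """Return True if the nameserver's name lies at or below the zone it serves.

    In that case its address cannot be looked up without first reaching the
    nameserver itself, so the parent zone has to supply the address as glue.
    """
    zone = zone.rstrip(".").lower()
    nameserver = nameserver.rstrip(".").lower()
    return nameserver == zone or nameserver.endswith("." + zone)


print(needs_glue("en.wikipedia.org", "ns1.en.wikipedia.org"))  # circular case
print(needs_glue("en.wikipedia.org", "ns1.example.net"))       # name lives elsewhere
```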
When an application (such as a web browser) tries to find the IP address of a domain name, it doesn't necessarily follow all of the steps outlined in the theory above. We will first look at the concept of caching, and then outline the operation of DNS in "the real world."
Caching and time to live
Because of the huge volume of requests generated by a system like the DNS, the designers wished to provide a mechanism to reduce the load on individual DNS servers. The mechanism devised provided that when a DNS resolver (i.e. client) received a DNS response, it would cache that response for a given period of time. A value (set by the administrator of the DNS server handing out the response) called the time to live, or TTL, defines that period of time. Once a response goes into cache, the resolver will consult its cached (stored) answer; only when the TTL expires (or when an administrator manually flushes the response from the resolver's memory) will the resolver contact the DNS server for the same information.
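The caching behaviour just described can be sketched as a small class. This is an illustrative model rather than how any particular resolver is implemented; the class name `TTLCache` and the injectable `clock` parameter are assumptions made so the example can be exercised without waiting for real time to pass.

```python
import time


class TTLCache:
    """Keep answers until their time to live (in seconds) has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (answer, expiry time)

    def put(self, name, answer, ttl):
        """Store an answer together with the moment it stops being valid."""
        self._entries[name] = (answer, self._clock() + ttl)

    def get(self, name):
        """Return the cached answer, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        answer, expires = entry
        if self._clock() >= expires:
            # TTL has run out: forget the answer so the caller asks the server again.
            del self._entries[name]
            return None
        return answer

    def flush(self):
        """Manual flush, as an administrator might do."""
        self._entries.clear()


cache = TTLCache()
cache.put("www.example.net", "192.0.34.166", ttl=300)
print(cache.get("www.example.net"))
```

A caller would query the DNS server only when `get` returns `None`, then store the fresh answer with the TTL that came with it.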
Generally, the Start of Authority (SOA) record specifies the default time to live for a zone. The SOA record has the parameters:
* Serial: the zone serial number, incremented when the zone file is modified, so the slave and secondary name servers know when the zone has been changed and should be reloaded.
* Refresh: the number of seconds between update requests from secondary and slave name servers.
* Retry: the number of seconds the secondary or slave will wait before retrying when the last attempt has failed.
* Expire: the number of seconds a secondary or slave will wait before considering the data stale if it cannot reach the primary name server.
* Minimum: previously used to determine the minimum TTL, this now sets how long negative answers may be cached (negative caching).
(Newer versions of BIND (named) will accept the suffixes 'M', 'H', 'D' or 'W', indicating a time interval of minutes, hours, days and weeks respectively.)
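To make the suffix notation concrete, the sketch below converts a timer value with one of the four suffixes mentioned above into seconds. It handles only a plain number or a number followed by a single suffix; the function name and the sample timer values are hypothetical.

```python
# Seconds per unit for the suffixes named in the text.
UNITS = {"M": 60, "H": 3600, "D": 86400, "W": 604800}


def to_seconds(value):
    """Convert a value such as '3H' or '86400' into a number of seconds."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


# Hypothetical SOA timers, written with and without suffixes.
timers = {"refresh": "3H", "retry": "15M", "expire": "1W", "minimum": "86400"}
for field, raw in timers.items():
    print(field, to_seconds(raw))
```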
Caching time
As a noteworthy consequence of this distributed and caching architecture, changes to the DNS do not always take effect immediately and globally. This is best explained with an example: if an administrator has set a TTL of 6 hours for the host www.wikipedia.org, and then changes the IP address to which www.wikipedia.org resolves at 12:01pm, the administrator must consider that a person who cached a response with the old IP address at 12:00pm will not consult the DNS server again until 6:00pm. The period between 12:01pm and 6:00pm in this example is called caching time, which is best defined as a period of time that begins when you make a change to a DNS record and ends after the maximum amount of time specified by the TTL expires. This leads to an important logistical consideration when making changes to the DNS: not everyone is necessarily seeing the same thing you're seeing.
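The arithmetic in this example can be written out directly. The calendar date below is arbitrary and chosen only so the times can be represented; the variable names are assumptions of this sketch.

```python
from datetime import datetime, timedelta

ttl = timedelta(hours=6)
cached_at = datetime(2024, 1, 1, 12, 0)   # a resolver caches the old address at 12:00pm
changed_at = datetime(2024, 1, 1, 12, 1)  # the administrator changes the record at 12:01pm

# The resolver keeps using its cached answer until the TTL runs out.
expires_at = cached_at + ttl

# Caching time for this resolver: from the change until its cached copy expires.
caching_time = expires_at - changed_at

print(expires_at.strftime("%I:%M%p"), caching_time)
```

A resolver that cached the old answer just before the change would hold it for almost the full TTL, which is why the TTL bounds the caching time.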
RFC 1537 helps to convey basic rules for how to set the TTL.
Note that the term "propagation", although very widely used, does not describe the effects of caching well. Specifically, it implies (1) that when you make a DNS change, it somehow spreads to all other DNS servers (instead, other DNS servers check in with yours as needed), and (2) that you do not have control over the amount of time the record is cached (you have complete control for all DNS records on your domain, except your NS records and any authoritative DNS servers that use your domain name).
Many people incorrectly refer to a mysterious 48 hour or 72 hour propagation time when you make a DNS change. When one changes the NS records for one's domain or the IP addresses for hostnames of authoritative DNS servers using one's domain (if any), there can be a lengthy period of time before all DNS servers use the new information. This is because those records are handled by the zone parent DNS servers (for example, the .com DNS servers if your domain is example.com), which typically publish them with a TTL of 48 hours, so other DNS servers may cache them for that long. However, those DNS changes will be immediately available to any DNS servers that do not have them cached. And any DNS changes on your domain other than the NS records and authoritative DNS server names can be nearly instantaneous, if you choose for them to be (by lowering the TTL once or twice ahead of time, and waiting until the old TTL expires before making the change).
DNS in the real world
Figure: DNS resolving from program to OS-resolver to ISP-resolver to greater system.
Users generally do not communicate directly with a DNS resolver. Instead DNS resolution takes place transparently in client applications such as web browsers (like Internet Explorer, Opera, Mozilla Firefox, Safari, Netscape Navigator, etc.), mail clients (Outlook Express, Mozilla Thunderbird, etc.), and other Internet applications. When a request is made which necessitates a DNS lookup, such programs send a resolution request to the local DNS resolver in the operating system, which in turn handles the communications required.
The DNS resolver will almost invariably have a cache (see above) containing recent lookups. If the cache can provide the answer to the request, the resolver will return the value in the cache to the program that made the request. If the cache does not contain the answer, the resolver will send the request to a designated DNS server or servers. In the case of most home users, the Internet service provider to which the machine connects will usually supply this DNS server: such a user will either configure that server's address manually or allow DHCP to set it; however, where systems administrators have configured systems to use their own DNS servers, their DNS resolvers will generally point to their own nameservers. This name server will then follow the resolution process outlined above, until it either successfully finds a result, or does not. It then returns its results to the DNS resolver; assuming it has found a result, the resolver duly caches that result for future use, and hands the result back to the software which initiated the request.
Broken resolvers
An additional level of complexity emerges when resolvers violate the rules of the DNS protocol. Some people have suggested that a number of large ISPs have configured their DNS servers to violate rules (presumably to allow them to run on less-expensive hardware than a fully-compliant resolver), such as by disobeying TTLs, or by indicating that a domain name does not exist just because one of its name servers does not respond.
As a final level of complexity, some applications such as Web browsers also have their own DNS cache, in order to reduce use of the DNS resolver library itself. This practice can add extra difficulty to DNS debugging, as it obscures which data is fresh, or lies in which cache. These caches typically have very short caching times of the order of 1 minute. A notable exception is Internet Explorer; recent versions cache DNS records for 30 minutes.
Other DNS applications
The system outlined above provides a somewhat simplified scenario. The DNS includes several other functions:
* Hostnames and IP addresses do not necessarily match on a one-to-one basis. Many hostnames may correspond to a single IP address: combined with virtual hosting, this allows a single machine to serve many web sites. Alternatively a single hostname may correspond to many IP addresses: this can facilitate fault tolerance and load distribution, and also allows a site to move physical location seamlessly.
* There are many uses of DNS besides translating names to IP addresses. For instance, Mail transfer agents use DNS to find out where to deliver e-mail for a particular address. The domain to mail exchanger mapping provided by MX records accommodates another layer of fault tolerance and load distribution on top of the name to IP address mapping.
* Sender Policy Framework takes advantage of a DNS record type, the TXT record.
* To provide resilience in the event of computer failure, multiple DNS servers provide coverage of each domain. In particular, thirteen root servers exist worldwide. DNS programs or operating systems have the IP addresses of these servers built in. At least nominally, the USA hosts all but three of the root servers. However, because many root servers actually implement anycast, where many different computers can share the same IP address to deliver a single service over a large geographic region, most of the physical (rather than nominal) root servers now operate outside the USA.
The DNS uses TCP and UDP on port 53 to serve requests. Almost all DNS queries consist of a single UDP request from the client followed by a single UDP reply from the server. TCP typically comes into play only when the response data size exceeds 512 bytes, or for such tasks as zone transfer. Some operating systems such as HP-UX are known to have resolver implementations that use TCP for all queries, even when UDP would suffice.
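To give a sense of how small a typical query is compared with the 512-byte figure, the sketch below assembles a query message following the layout in RFC 1035 (a 12-byte header followed by one question). Nothing is sent over the network; the function names and the query ID are arbitrary choices for this example.

```python
import struct

UDP_LIMIT = 512  # response size above which TCP typically comes into play


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, query_id=0x1234, qtype=1, qclass=1):
    """Build a query for an A record (type 1) in the Internet class (class 1)."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: ID, flags, then counts of questions, answers, authority, additional.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question


packet = build_query("www.wikipedia.org")
print(len(packet), len(packet) <= UDP_LIMIT)
```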
Extensions to DNS
EDNS is an extension of the DNS protocol which enhances the transport of DNS data in UDP packets, and adds support for expanding the space of request and response codes. It is described in RFC 2671.
Implementations of DNS
For a commented list of DNS server-side implementations, see Comparison of DNS server software.
* RFC 882 Concepts and Facilities (obsoleted by RFC 1034)
* RFC 883 Domain Names: Implementation specification (obsoleted by RFC 1035)
* RFC 1032 Domain administrators guide
* RFC 1033 Domain administrators operations guide
* RFC 1034 Domain Names - Concepts and Facilities.
* RFC 1035 Domain Names - Implementation and Specification
* RFC 1101 DNS Encodings of Network Names and Other Types
* RFC 1183 New DNS RR Definitions
* RFC 1706 DNS NSAP Resource Records
* RFC 1876 Location Information in the DNS (LOC)
* RFC 1886 DNS Extensions to support IP version 6
* RFC 1912 Common DNS Operational and Configuration Errors
* RFC 1995 Incremental Zone Transfer in DNS
* RFC 1996 A Mechanism for Prompt Notification of Zone Changes (DNS NOTIFY)
* RFC 2136 Dynamic Updates in the domain name system (DNS UPDATE)
* RFC 2181 Clarifications to the DNS Specification
* RFC 2308 Negative Caching of DNS Queries (DNS NCACHE)
* RFC 2317 Classless IN-ADDR.ARPA delegation
* RFC 2671 Extension Mechanisms for DNS (EDNS0)
* RFC 2672 Non-Terminal DNS Name Redirection
* RFC 2782 A DNS RR for specifying the location of services (DNS SRV)
* RFC 2845 Secret Key Transaction Authentication for DNS (TSIG)
* RFC 2874 DNS Extensions to Support IPv6 Address Aggregation and Renumbering
* RFC 3403 Dynamic Delegation Discovery System (DDDS) (NAPTR records)
Important categories of data stored in the DNS include the following:
* An A record or address record maps a hostname to a 32-bit IPv4 address.
* An AAAA record or IPv6 address record maps a hostname to a 128-bit IPv6 address.
* A CNAME record or canonical name record makes one domain name an alias of another. Lookups for the alias are answered with the records of the canonical name; the alias applies to that name only, not to names beneath it.
* An MX record or mail exchange record maps a domain name to a list of mail exchange servers for that domain.
* A PTR record or pointer record maps an IPv4 address to the canonical name for that host. Setting up a PTR record for a hostname in the in-addr.arpa domain that corresponds to an IP address implements reverse DNS lookup for that address. For example (at the time of writing), www.icann.net has the IP address 192.0.34.164, but a PTR record maps 164.34.0.192.in-addr.arpa to its canonical name, referrals.icann.org.
* An NS record or name server record maps a domain name to a list of DNS servers authoritative for that domain. Delegations depend on NS records.
* An SOA record or start of authority record specifies the DNS server providing authoritative information about an Internet domain, the email of the domain administrator, the domain serial number, and several timers relating to refreshing the zone.
* An SRV record is a generalized service location record.
* A TXT record allows an administrator to insert arbitrary text into a DNS record. For example, this record is used to implement the Sender Policy Framework specification.
* NAPTR records (NAPTR stands for "Naming Authority Pointer") are a newer type of DNS record that support regular expression based rewriting.
Other types of records simply provide information (for example, a LOC record gives the physical location of a host), or experimental data (for example, a WKS record gives a list of servers offering some well known service such as HTTP or POP3 for a domain).
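The in-addr.arpa name used for the PTR example in the list above is formed by reversing the four parts of the IPv4 address. The sketch below builds it by hand and compares the result with Python's standard `ipaddress` module; the function name is an assumption of this example.

```python
import ipaddress


def reverse_name(address):
    """Build the in-addr.arpa name under which an IPv4 address's PTR record lives."""
    octets = address.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"


manual = reverse_name("192.0.34.164")
# The standard library exposes the same name as the reverse_pointer property.
from_stdlib = ipaddress.ip_address("192.0.34.164").reverse_pointer
print(manual, manual == from_stdlib)
```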
Domain names must use only a subset of ASCII characters: the Roman alphabet in upper and lower case, the digits 0 through 9, the dot, and the hyphen. This prevented the representation of names and words of many languages natively. ICANN has approved the Punycode-based IDNA system, which maps Unicode strings into the valid DNS character set, as a workaround to this issue. Some registries have adopted IDNA.
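As an illustration of the mapping IDNA performs, Python's built-in "idna" codec (which implements the older IDNA 2003 rules) converts a Unicode name to its ASCII form and back. The helper names and the sample name below are assumptions chosen for this example.

```python
def to_ascii(name):
    """Convert a possibly non-ASCII domain name to its ASCII (Punycode) form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name):
    """Convert an ASCII-form domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


sample = "b\u00fccher.example"  # "bücher.example", written with an escape
ascii_form = to_ascii(sample)
print(ascii_form)
print(to_unicode(ascii_form) == sample)
```

Names that are already plain ASCII pass through unchanged, so existing domain names are unaffected by the encoding.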
DNS was not originally designed with security in mind, and thus has a number of security issues. DNS responses are traditionally not cryptographically signed, leading to many attack possibilities; DNSSEC modifies DNS to add support for cryptographically signed responses. There are various extensions to support securing zone transfer information as well.
Some domain names can spoof other, similar-looking domain names. For example, "paypal.com" and "paypa1.com" are different names, yet users may be unable to tell the difference. This problem is much more serious in systems that support internationalized domain names, since many characters that are different (from the point of view of ISO 10646) appear identical on typical computer screens.
Registrant
No one in the world really "owns" a domain name except the Network Information Centre (NIC), or domain name registry. Most of the NICs in the world receive an annual fee from a legal user in order for the legal user to utilise the domain name (i.e. a sort of leasing agreement exists, subject to the registry's terms and conditions). Depending on the naming conventions of the various registries, legal users become commonly known as "registrants" or as "domain holders".
ICANN holds a complete list of domain registries in the world. One can find the legal user of a domain name by looking in the WHOIS database held by most domain registries.
For most of the more than 240 country code top-level domains (ccTLDs), the domain registries hold the authoritative WHOIS (registrant, name servers, expiry dates, etc.). For instance, DENIC, the German NIC, holds the authoritative WHOIS for a .DE domain name.
However, some domain registries, such as VeriSign, use a registry-registrar model. Hundreds of domain name registrars, such as eNom, actually perform the domain name registration with the end-user. With this method of distribution, the registry only has to manage the relationship with the registrar, and the registrar maintains the relationship with the end-users, or "registrants". For .COM and .NET domain names, the registry, VeriSign, holds a basic WHOIS (registrar, name servers, etc.). One can find the detailed WHOIS (registrant, name servers, expiry dates, etc.) at the registrars.
Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted a so-called "thick" registry approach, i.e. keeping the authoritative WHOIS with the various registries instead of the registrars.
Administrative contact
A registrant usually designates an administrative contact to manage the domain name. In practice, the administrative contact usually has the most immediate power over a domain. Management functions delegated to the administrative contacts may include (for example):
* the obligation to conform to the requirements of the domain registry in order to retain the right to use a domain name
* authorisation to update the physical address, e-mail address, telephone number, etc. in WHOIS
Technical contact
A technical contact manages the name servers of a domain name. The many functions of a technical contact include:
* making sure the configurations of the domain name conform to the requirements of the domain registry
* updating the domain zone
* providing the 24x7 functionality of the name servers (that leads to the accessibility of the domain name)
Billing contact
The party whom a NIC invoices.
Name servers
The authoritative name servers that host the domain name zone of a domain name.
Many investigators have voiced criticism of the methods currently used to control ownership of domains. Critics commonly claim abuse by monopolies or near-monopolies, such as VeriSign, Inc. Particularly noteworthy was the VeriSign Site Finder system, which redirected all unregistered .com and .net domains to a VeriSign webpage; it was rapidly removed after widespread criticism.
There is also significant disquiet regarding United States political influence over the Internet Corporation for Assigned Names and Numbers (ICANN). This was a significant issue in the attempt to create a .xxx top-level domain and sparked greater interest in alternative DNS roots that would be beyond the control of any single country.
Truth in Domain Names Act
In the United States, the "Truth in Domain Names Act", in combination with the PROTECT Act, forbids the use of a misleading domain name with the intention of attracting people into viewing a visual depiction of sexually explicit conduct on the Internet.
* cybersquatting
* domain hack
* dynamic DNS
* DNS cache poisoning
* DNSSEC
* ICANN
* Root nameserver
* DNS hosting service
* EveryDNS
* geodomain
* NBNS
* NIS
* RFC 1034 Domain Names - Concepts and Facilities
* RFC 1035 Domain Names - Implementation and Specification