Date: Mon, 25 Mar 2002 18:47:10 +0100
From: Brad Knowles <brad.knowles@skynet.be>
To: Tim <tim@sleepy.wojomedia.com>, Terry Lambert <tlambert2@mindspring.com>
Cc: chat@FreeBSD.ORG
Subject: Re: qmail (Was: Maintaining Access Control Lists )
Message-ID: <p05101519b8c51042d9db@[10.0.1.8]>
In-Reply-To: <20020325140022.GA23251@sleepy.wojomedia.com>
References: <p05101505b8c430e28572@[10.0.1.9]>
 <000c01c1d3ab$6d2c6960$6600a8c0@penguin>
 <p05101509b8c47b17d088@[10.0.1.8]>
 <20020325015236.A97552@futuresouth.com>
 <p0510150eb8c48ba6b1f4@[10.0.1.8]>
 <3C9EFED0.DB176CB8@mindspring.com>
 <20020325115207.GA22032@sleepy.wojomedia.com>
 <3C9F1A16.207EA23E@mindspring.com>
 <20020325140022.GA23251@sleepy.wojomedia.com>

At 8:00 AM -0600 2002/03/25, Tim wrote:
> First, I am assuming that you serialize the administration
> script (no parallel scripts going on).
Big shops can't afford to do this. The locking has to be done at
a lower level.
> If primary/secondary has the exact same zones, then with djbdns it
> looks like this:
>
> database -> ns1
> rsync ns1 ns2
Right. But rsync isn't a part of the DNS standard protocol.
> I agree with your points. On the other hand, djbdns
> solves a specific set of user needs very well (basically, those
> that maintain n servers each of which containing the same zones).
> I think it really depends on your needs.
Sigh.... It looks like I'm going to have to publicly post my
list of 18 things that I have found wrong so far with djbdns, as
opposed to simply sending it privately to a few individuals. So be
it:
1. By default, tinydns does not hand out referrals to questions it
is asked about zones it does not control. I believe that this
violates the spirit of the RFCs, if not the letter.
2. By default, tinydns does not support the use of TCP at all. This
most definitely violates the spirit of the RFCs, as well as the
letter (if a DNS query via UDP results in truncation, you're
supposed to re-do the query using TCP instead).
Indeed, if you want to support TCP under tinydns, you have to
configure an optional program called "axfrdns", which was
intended to handle zone transfers, but also happens to share the
same database as tinydns, and can handle generic TCP queries.
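The fallback rule in point 2 can be sketched roughly as follows. This is
only an illustration in Python, not code from tinydns or BIND: the names
build_query, is_truncated, and resolve are made up for the example, and
the two transport functions are supplied by the caller so that the logic
can be shown without touching the network.

```python
import struct

TC_BIT = 0x0200  # "truncated" flag in the 16-bit DNS header flags word


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query (class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question


def is_truncated(response):
    """Return True if the TC bit is set in a DNS response header."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_BIT)


def resolve(query, send_udp, send_tcp):
    """Try UDP first; if the answer is truncated, repeat the query over TCP.

    send_udp and send_tcp are hypothetical caller-supplied callables that
    take the raw query bytes and return the raw response bytes.
    """
    response = send_udp(query)
    if is_truncated(response):
        response = send_tcp(query)
    return response
```

A server that never answers on TCP leaves a client following this logic
with nothing better than the truncated UDP answer.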
3. The suggested method for copying contents of DNS zones is rsync,
scp, or other remote copy tools. The DNS standard method of
zone transfers (query type "axfr") is only supported as an
additional, discouraged method.
4. Without a patch from a third party, tinydns does not listen to
more than one IP address. If you have a multi-homed server, you
have to apply a patch from someone other than the author,
before you can get it to listen on more than one
address/interface.
5. Without a patch from a third party, tinydns does not support the
standard "NOTIFY" protocol of informing secondary nameservers
that the zone has been updated, and that they need to check the
SOA serial number and download a new copy (if they don't already
have it).
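For reference, a NOTIFY message is an ordinary DNS message with the
opcode set to 4 and a question naming the zone's SOA record (RFC 1996).
The sketch below shows roughly what a master would send; build_notify is
a name invented for this example, and setting the AA bit follows the
examples in the RFC rather than anything tinydns-specific.

```python
import struct

OPCODE_NOTIFY = 4  # RFC 1996
AA_BIT = 0x0400    # authoritative answer
TYPE_SOA = 6
CLASS_IN = 1


def build_notify(zone, message_id=1):
    """Build a minimal NOTIFY request for the given zone name."""
    flags = (OPCODE_NOTIFY << 11) | AA_BIT
    header = struct.pack("!HHHHHH", message_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in zone.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", TYPE_SOA, CLASS_IN)
```

On receipt, a secondary is expected to check the SOA serial number and
start a zone transfer if its copy is out of date.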
6. Without a third party patch, tinydns does not support standard
SRV records (which are intended to ultimately replace MX
records, as well as perform similar functions for services other
than mail).
7. You cannot set an alternative SOA contact address (other than
what is hard-coded within tinydns), if you do not have a patch
from a third party.
8. Like tinydns, dnscache will not bind to more than one IP address
without a third party patch.
9. Because they are separate programs, you can't have both tinydns
and dnscache listening to the same IP address(es) on the same
server.
While this is not the recommended mode of configuration, some
sites don't have the luxury of having separate
authoritative-only and caching/recursive-only server(s), and
need to mix them both on one machine (or set of machines). With
the BIND 9 "view" mechanism, this is relatively easy to do.
With djbdns, this is impossible.
10. There aren't even any patches that can get djbdns to
implement TSIG, Dynamic DNS, or DNSSEC, nor are they ever likely
to be created (my understanding is that the author is strongly
opposed to them).
Unfortunately, as time goes on and more and more people deploy
things like IPv6 or IPsec-based VPNs, or simply care about being
able to cryptographically prove that their servers are handing
out only the correct information and that clients are able to
cryptographically verify this fact (think: electronic banking),
these kinds of features are going to become ever more
commonplace.
Note that, with the advent of BIND 9, you can create a
caching-only server that will validate cryptographically signed
records, and all clients can benefit even if they do not
themselves implement any of the new DNSSEC features.
11. There are a number of things that djbdns does which I believe to
be outright bugs. However, the author of this package simply
refuses to accept that his code could be anything less than 100%
perfect, and while he claims to have a "bounty" that he will pay
for any bug that is found, in reality he is the one that gets to
define what he accepts as a "bug", and has repeatedly
demonstrated a tendency to openly refuse to accept some
purported bug, but then to quietly fix the code with future
releases.
So, let's look at some of these bugs:
A. When an IQUERY is sent to a djbdns server, it will
respond with the opcode set to QUERY. (It should simply
copy the opcode, not make something up.)
B. dnscache (the caching server) does not respond to
queries that have the RD bit clear. (It should instead
simply answer from its cache, without recursing
the DNS tree.)
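Both of these behaviours concern the flags word in the DNS header
(RFC 1035, section 4.1.1). The sketch below shows the behaviour I would
expect: the reply echoes the opcode and the RD bit from the query. The
function names are invented for this example and do not come from
djbdns or BIND.

```python
import struct

QR_BIT = 0x8000       # set in responses
OPCODE_MASK = 0x7800  # bits 11-14 of the flags word
RD_BIT = 0x0100       # recursion desired


def opcode(message):
    """Extract the 4-bit opcode (0 = QUERY, 1 = IQUERY, ...)."""
    (flags,) = struct.unpack("!H", message[2:4])
    return (flags & OPCODE_MASK) >> 11


def recursion_desired(message):
    """Return True if the RD bit is set."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & RD_BIT)


def response_flags(query):
    """Flags for a reply that echoes the query's opcode and RD bit."""
    (flags,) = struct.unpack("!H", query[2:4])
    return QR_BIT | (flags & OPCODE_MASK) | (flags & RD_BIT)
```

A cache that sees recursion_desired(query) return False would then
answer from whatever it already holds instead of staying silent.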
11. Unfortunately, there is very little documentation available
for djbdns. Whereas for BIND you will discover at least four or
five separate books directly related to BIND on Unix (and one
for BIND on Windows NT), and at least sixteen different books
that are related to the DNS in general (not including books
where the DNS forms just a relatively small part of the whole),
there are no books available on djbdns.
If you're at a site all by yourself, and you don't have access
to normal mailing lists, newsgroups, or other support services,
then whatever books you can carry in with you are your last
refuge of assistance.
12. Commercial support for djbdns is also questionable. Yes,
there are a few groups listed at <http://www.tinydns.org/>, but
how big are they? How long have they been around? How likely
are they to still be here in six months? How many combined
decades of experience do they have in designing DNS protocols or
programs to serve them?
These are not the kinds of questions you need to ask when you go
to the folks at Nominum (see <http://www.nominum.com/>), since
they wrote BIND 9 under contract to the ISC, and they have their
own implementation of a nameserver (not based on BIND) which is
used as the core of the Global Name Service solution that they
offer. Indeed, the GNS is a unique service that is not
available from any other company in the world,
and even goes beyond the current technical implementation of a
root nameserver or a gTLD nameserver. Note that Nominum also
offers free secondary nameservice to anyone in the world through
secondary.com (see <http://www.secondary.com/>) for up to five
domains and fewer than 100 records per person/organization, and
they use the GNS to provide this service.
If you're looking for BIND support in Europe, you can talk to
Nominum, or you can look to the folks at Men & Mice (see
<http://www.menandmice.com/>), who also have a service agreement
with the ISC.
13. Training for djbdns is minimal or non-existent. Contrariwise,
if you go to any major Unix or Linux conference in the world,
they will probably have training available on BIND.
In partnership with the ISC, the two official providers of
training on DNS and BIND are Nominum (see
<http://www.nominum.com/services/training/index.html>), and in
Europe, both Nominum and the company Men & Mice (see
<http://www.menandmice.com/DNS-training/index.html>) provide
training.
In addition, other third parties also provide training on the
DNS, BIND, etc. Quick searches at Google
(see <http://www.google.com/>) turned up Genesis Communication
(see <http://www.genesiscom.ch/Services/S_prfeduEdns.htm>),
Intranet Solutions (see
<http://www.intranet-solutions.com/home/services/training/dnstraining.htm>),
and VeriSign (see
<http://www.verisign.com/training/courses/dns/>). I'm sure there
are plenty of others; these are just the ones I turned up with a
trivial search and which I found to be of interest.
Of course, note that Cricket Liu (one of the co-authors of the
book _DNS and BIND_) recently left his previous employer
(NSI/VeriSign), and is now employed by Men & Mice. He continues
to work for them doing many of the same things he used to do
before -- including providing training and consulting services,
etc....
14. One argument frequently used to support the use of djbdns
over BIND is performance. Upon further investigation, this
claim simply does not hold water.
Benchmarks published by Rick Jones have clearly shown that BIND
can scale up to at least 12,000 DNS queries per second, and
there is every indication that BIND 9.2 will be able to go
considerably higher. The best benchmarks available for tinydns
indicate that it can handle at least 500 queries per second, but
that is the highest number reported. Other people on the
bind-users mailing list have indicated that they have performed
their own (as yet unpublished) benchmarks of tinydns, and that
it had notable performance problems that BIND did not suffer.
The best published benchmarks from the author for dnscache
report a query handling rate of less than one million records
over a 4.5 hour period of time, which works out to an average of
less than sixty-two queries per second. Even if you look at
numbers of queries per CPU second, the best numbers they can
provide are 13.7 million queries over a four week period of time
with 128 minutes of CPU time used (an average of slightly less
than 1784 queries per CPU second).
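The two averages above can be checked with a few lines of Python. Since
the published figures are "less than one million" and "13.7 million",
the results are upper bounds rather than exact rates; the variable
names are just for this example.

```python
def rate(queries, seconds):
    """Average queries per second over the given period."""
    return queries / seconds


# Under one million queries in 4.5 hours of wall-clock time.
wall_clock = rate(1_000_000, 4.5 * 3600)

# 13.7 million queries in 128 minutes of CPU time.
per_cpu_second = rate(13_700_000, 128 * 60)

print(f"{wall_clock:.1f} queries per second")          # 61.7
print(f"{per_cpu_second:.1f} queries per CPU second")  # 1783.9
```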
Compare this with the requirement in RFC 2010 "Operational
Criteria for Root Name Servers" (since obsoleted by RFC 2870
"Root Name Server Operational Requirements") that the machine
and software in question be able to handle at least 2000 queries
per second, and be scalable to levels higher than that. Indeed,
recent reports have indicated that a.root-servers.net
(considered by many to be the "primary" root nameserver) is
currently handling around 12,000 DNS queries per second at peak.
Preliminary benchmarks published on the bind-users mailing list
have indicated that, on the same hardware, there is little or no
performance benefit to using dnscache as opposed to BIND 9.1.2,
and when these tests are re-run with BIND 9.2, I'm sure that it
will come out even faster.
15. One of the recommended features of dnscache is the ability to
limit the amount of memory that the nameserver will use. If you
attempt to go over this limit, the nameserver will start
throwing away some old data, in order to fit the new data in.
On the surface, this sounds like a good idea, and a way to avoid
the problems you can get into if BIND ever starts paging and/or
swapping. However, on further reflection, this is, at very
best, a false economy.
Let's take the case where an unexpected process starts chugging
through a large log of a webserver, and it wants to do reverse
DNS resolution on all those collected IP addresses. If these
queries were aimed at your main caching nameserver, odds are
that you probably would never again need the answers to these
queries, at least not before they time out of the cache
naturally.
If the cache of this nameserver were relatively full of "real"
data already, the problem is that this real data would be thrown
away to make room for the new one-time-only queries. Even if a
Least-Recently-Used (LRU) algorithm were employed to choose the
answers that can best afford to be flushed (to make room for the
new answers to be stored), there is a significant amount of
overhead that the nameserver would have to go through in order
to simply perform the garbage-collection and memory flushing
routines.
Since an LRU algorithm probably won't be used, the overhead of
garbage collection and memory flushing would be much, much
greater -- and everything else the server does will suffer.
This also greatly increases the probability that a query will be
performed that would otherwise have been in the cache, except
for the fact that it was flushed to make room for new answers,
thus resulting in data thrashing.
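To make the flushing problem concrete, here is a toy model in Python of
a fixed-size cache with LRU eviction. It is not how dnscache manages
its memory (as noted above, it probably does not use LRU at all); the
class, the capacity of 100 entries, and the host names are all made up
for the illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=100)

# The "real" working set fills the cache.
working_set = [f"host{i}.example.com" for i in range(100)]
for name in working_set:
    cache.put(name, "192.0.2.1")

# A log-processing job issues 100 one-off reverse lookups.
for i in range(100):
    cache.put(f"{i}.2.0.192.in-addr.arpa", "client.example.net")

survivors = sum(cache.get(name) is not None for name in working_set)
print(survivors)  # 0
```

Even in this best case, every entry of the working set has been flushed
by the time the one-off job finishes, and each of those names has to be
fetched from the Internet again.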
Indeed, it is swap thrashing that was the intended goal to
avoid, but local disk typically has a latency measured in terms
of single digit milliseconds. Contrariwise, DNS queries that
have to go out to the Internet and come back are frequently
measured in terms of tens, hundreds, or even thousands of
milliseconds. Therefore, you will have traded a known serious
problem with local swap thrashing for an unknown and quite
probably much, much more serious remote data thrashing.
Moreover, once you get yourself into a thrashing scenario like
this, odds are that it will be at least as bad and unrecoverable
as it would be to have had BIND doing local swap thrashing.
No, it is much better to have a known problem such as local swap
thrashing that is relatively easily detected with tools like ps,
top, vmstat, iostat, sar, etc... than it is to have a mysterious
slow nameserver problem that you cannot easily detect, debug, or
resolve.
16. Unfortunately, a lot of the reasons the author gives for
running djbdns instead of BIND are related to problems in older
versions of BIND which have been fixed or are largely non-issues
in later releases of BIND 9.
For example, he makes a big point of tinydns being better than
BIND, because while the process is starting up, it still answers
queries. While previous versions of BIND would not answer
queries during startup, this is no longer a problem with BIND 9.
Dan also makes a great deal of the fact that the djbdns tools
run as a user other than root, and in chroot() environments.
While the "monolithic setuid root" situation was an issue with
older versions of BIND, even more recent releases of BIND 8
could be easily run as a non-privileged user in a chroot()
environment, and this is the preferred method of running BIND 9.
Contrariwise, one of the legitimate big complaints about older
versions of BIND is that they implemented zone transfers in a
separate program. If the database was large, then the
fork()/exec() overhead was large, and the system could seriously
thrash itself to death as it copied all those pages (for systems
without copy-on-write), only to immediately throw them away
again when it fired up the named-xfer program. With BIND 9,
this problem is solved by having a separate thread inside the
server handling zone transfers, and no fork()/exec() is done.
However, tinydns/axfrdns goes back to the fork()/exec() model
that was so greatly despised.
17. The license under which the djbdns collection (and other
programs from the author) is published does not meet the
definition of "open source", according to the book _Open
Sources_ (see
<http://www.oreilly.com/catalog/opensources/book/perens.html>).
You may not care about this issue, but some people do. If you
do care about this issue, then this may be very important to
you. Certainly, the ability to legally make modifications to the
code is fairly important in this arena, and if you can't legally
do so, then it is much less useful to you to have the source
code at all.
18. Finally, I want to touch on the subject of the
trustworthiness of the code itself.
One of the basic concepts of the Open Source movement is that,
as the number of people participating in the project grows, and
as the collected design & implementation experience of those
people grows, and the number of sites running the software
grows, and the number of different types of hardware running the
software grows, then your probability of rapidly discovering
bugs (and being able to fix them) quickly approaches 100%, even
for the deepest bugs in the system.
Unfortunately, the djbdns programs are all large enough, and
dependent on enough system libraries, that they simply cannot be
proven to be bug-free. Once you reach this level of project
size, other factors start being more important than keeping down
absolute code size of the individual units.
In the case of djbdns, they have a relatively small community of
developers who are not all that experienced, a relatively small
number of sites that are running the software, and the software
is running on a relatively small sample of different types of
hardware. They just don't have enough aggregate CPU time spent
executing each and every line of code, and enough decades of
combined DNS protocol design experience that they can begin to
compare with BIND 9.
--
Brad Knowles, <brad.knowles@skynet.be>
Do you hate Microsoft? Do you hate Outlook? Then visit the Anti-Outlook
page at <http://www.rodos.net/outlook/> and see how much fun you can have.
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
