Date:      Sat, 30 Dec 2006 13:05:57 -0800
From:      Doug Barton <dougb@FreeBSD.org>
To:        Chuck Swiger <cswiger@mac.com>
Cc:        FreeBSD mailing list <freebsd-stable@freebsd.org>
Subject:   Re: BIND-9.3.2 (from 5.5-STABLE) segfault under load...
Message-ID:  <4596D4B5.5080004@FreeBSD.org>
In-Reply-To: <459582C6.4010200@mac.com>
References:  <459582C6.4010200@mac.com>

Chuck Swiger wrote:
> Hi--
> 
> I had named segfault a day or so ago under high load ("adnslogres -c
> 200" against a webserver logfile) after logging the following:

Hard to tell if your problem here is related to running on 5.5 or not,
but of course recommendation number one is to consider upgrading to
6.x. Recommendation number two is to upgrade BIND to 9.3.3, preferably
by upgrading to 6.2-RC2, or by upgrading to the head of RELENG_5, or
as a last resort by using the port, with or without the option to
replace the base BIND.

> [ ... ]
>> Dec 28 03:38:56 <daemon.notice> pi named[1853]: enforced
>> delegation-only for 'AR' (ctina.ar/A/IN) from 137.39.1.3#53

If you're using this option, please make sure that you know why you
are using it, and what the potential side effects are. That discussion
is off topic for this list, but feel free to take it up on
bind-users@isc.org if you wish.

>> Dec 28 03:50:23 <daemon.warn> pi named[1853]: client 127.0.0.1#53077:
>> no more recursive clients: quota reached

There is extensive discussion about this problem in the bind-users
archives. Take a look at file:///usr/share/doc/bind9/arm/ and check
out the quota options to get this adjusted to where it needs to be for
your situation. Alternatively, if you're sure that the excess load is
caused by the adnslogres program, try lowering the number of
concurrent connections.
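
As a rough sketch of the quota adjustment (the value below is only an
illustration, not a recommendation; pick one based on your load and
available memory), the relevant named.conf option is
recursive-clients, which defaults to 1000:

options {
	...
	// illustrative value only; the default is 1000
	recursive-clients 2000;
	...
};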

>> Dec 28 03:50:24 <daemon.warn> pi last message repeated 258 times
>> Dec 28 03:50:24 <kern.info> pi kernel: pid 1853 (named), uid 53:
>> exited on signal 11

As a guess, I'd say that this is the problem: you're just hammering
the thing too hard. There's no point in pursuing this much further
until you've at least upgraded BIND, though.

> Named is being invoked via "-4 -u bind -c named.conf -t /var/named"; but
> it could not dump core as /var/named is owned by root. 

Check out the dump-file directive in the ARM. I have a directory in
the chroot called /var/dump, owned by the bind user, and the following
in my named.conf:

options {
	...
	dump-file "/var/dump/named_dump.db";
	...
};

> I've changed
> that temporarily so I ought to be able to get a corefile if I can
> reproduce it.

See above.

> As the subject mentions, this is a Dell 1850 (rackmount PowerEdge)
> running FreeBSD-5.5 & BIND-9.3.2; until just now, everything had been
> running stably for months at a time.

I assume you've checked the usual suspects: dead fans, other hardware
problems, etc.?


hth,

Doug

-- 

    This .signature sanitized for your protection


