Date: Sat, 20 Oct 2007 11:32:08 -0700 (PDT)
From: Doug Barton <dougb@FreeBSD.org>
To: Jeremy Chadwick <koitsu@FreeBSD.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: BIND 9.3.4 assertion failure on restart
Message-ID: <alpine.BSF.0.9999.0710201122130.50892@qbhto.arg>
In-Reply-To: <20071018193322.GA23372@eos.sc1.parodius.com>
References: <20071018193322.GA23372@eos.sc1.parodius.com>

Jeremy,

I saw this on Thursday, but I also saw that Mark had answered you and I
had to focus on $REAL_LIFE so sorry for the delay.

On Thu, 18 Oct 2007, Jeremy Chadwick wrote:

> The following is a reproducible problem on a couple of our DNS servers:
> (one running 6.2-STABLE, one running 7.0-PRERELEASE):
>
> pid 52308 (named), uid 53: exited on signal 6
> Oct 18 12:10:21 anubis named[52308]: /usr/src/lib/bind/isc/../../../contrib/bind9/lib/isc/task.c:1238: INSIST((((manager->tasks).head == ((void *)0)) ? isc_boolean_true : isc_boolean_false)) failed
> Oct 18 12:10:21 anubis named[52308]: exiting (due to assertion failure)
>
> The problem only occurs when using "/etc/rc.d/named restart". Doing a
> manual "/etc/rc.d/named stop" then "/etc/rc.d/named start" does not
> induce the problem.

I'm currently working on some improvements to the rc.d/named script that
should help with that issue (unrelated to the bug Mark mentioned in BIND
9.3.4).

> There was one random Internet user who posted about the same issue:
>
> http://forums.devshed.com/dns-36/weird-loggs-470845.html
>
> There's nothing bizarre about our BIND configuration on these boxes.
> I've re-written it (by hand) a couple times hoping it might be some
> syntax problem or other oddity, but it doesn't appear to be. We're not
> chrooting,

You probably should be. :) You're correct in thinking that it's not a
factor for this issue though.

> and there's no jails. Only thing "non-standard" in rc.conf that's
> named-related is named_flags="-4".

Yeah, that's both harmless and common.

> Both boxes exhibiting this problem are running on identical hardware
> (C2Ds, same memory amount, etc.), with an SMP kernel. The 7.0 box uses
> the ULE scheduler, while the 6.2 box uses the 4BSD scheduler. I mention
> this because the master server (running 6.2-STABLE on different
> hardware, non-SMP kernel, single-core P4 CPU) uses CPUTYPE?=prescott and
> does not have this problem.

If you're running on 6.x and/or BIND 9.3.x you should definitely not use
threads, and your idea of using -n1 is probably a good idea as well
(even if the bug were not present). I saw your followup to this post so
I'm a little unclear as to what hardware we're talking about, but if
you're using a dual core or SMP machine I strongly encourage you to
upgrade to 7.0 and BIND 9.4.1-P1. Both new versions have significant
improvements in how they handle threads, and Kris has done some great
work profiling that combination and shown that it significantly
outperforms 6.2 and 9.3.x.

> I can't provide access to these boxes, but I can provide the
> configuration files and zones (there are not many) to those I trust
> (dougb@ that means you :) ).

Heh, thanks.

hth,

Doug

-- 
This .signature sanitized for your protection