From owner-freebsd-questions@FreeBSD.ORG  Fri Mar 18 18:41:56 2005
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CA77E16A4CE
	for <freebsd-questions@freebsd.org>;
	Fri, 18 Mar 2005 18:41:56 +0000 (GMT)
Received: from aphrodite.gwi.net (aphrodite.gwi.net [207.5.128.164])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 40B0A43D39
	for <freebsd-questions@freebsd.org>;
	Fri, 18 Mar 2005 18:41:56 +0000 (GMT)	(envelope-from jcoombs@gwi.net)
Received: from failure (murdoc.gwi.net [207.5.142.8])
	by aphrodite.gwi.net (8.12.9p2/8.12.9) with SMTP id j2IIft0m024280
	for <freebsd-questions@freebsd.org>;
	Fri, 18 Mar 2005 13:41:55 -0500 (EST)	(envelope-from jcoombs@gwi.net)
Message-ID: <06da01c52bea$6d1f32f0$1700a8c0@failure>
From: "Joshua Coombs" <jcoombs@gwi.net>
To: <freebsd-questions@freebsd.org>
Date: Fri, 18 Mar 2005 13:43:49 -0500
MIME-Version: 1.0
Content-Type: text/plain;
	format=flowed;
	charset="iso-8859-1";
	reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.2527
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527
Subject: Bind Wierdness
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 18 Mar 2005 18:41:56 -0000

Hello.

While trying to track down periodic RADIUS failures, I discovered that BIND
was periodically timing out, and occasionally even responding incorrectly
with a failure.  We were originally running 9.2.3 built from ports on
FreeBSD 4.9p11, with a mem limit set at 900M and a maint interval of 60
minutes.  The failures were 61 minutes apart, like clockwork.

We moved up to 9.3.0, again built from ports, and continued to observe the
same problem.  I then built from src, enabling threading, with no luck.  A
quick discussion with the port maintainer pointed out that 9.3.1 would have
'major threading fixes' for FreeBSD, so I waited for it to come out.  Now
that it's out, I've built it, threading enabled, and still have the periodic
outages.  I've currently got the maint interval set at 15 mins, and the
failures are tracking that period like clockwork.

At the moment, my primary source of data is my RADIUS server monitoring,
as I don't have a direct long-term DNS monitor going yet.  I've been
testing by running nslookup requests inside while loops from the command
line and observing the output.
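
A timestamped probe loop would make it easier to line up failures with the
maint interval than watching nslookup output by eye.  The following is only
a minimal sketch, not what was actually run here: the function names, the
2-second "slow" threshold and the 1-second delay are all assumptions, and it
relies on the system resolver (/etc/resolv.conf) pointing at the BIND server
under test.

```python
import socket
import time
from datetime import datetime


def probe(name, resolve=socket.gethostbyname):
    """Resolve `name` once; return (ok, seconds_elapsed, detail)."""
    start = time.monotonic()
    try:
        detail = resolve(name)
        ok = True
    except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
        detail = str(exc)
        ok = False
    return ok, time.monotonic() - start, detail


def monitor(name, count, delay=1.0, slow=2.0, resolve=socket.gethostbyname):
    """Probe `name` `count` times; print and return lines for bad lookups.

    A lookup counts as bad if it fails outright or takes longer than
    `slow` seconds (an arbitrary threshold chosen for illustration).
    """
    problems = []
    for _ in range(count):
        ok, elapsed, detail = probe(name, resolve)
        if not ok or elapsed > slow:
            stamp = datetime.now().isoformat(timespec="seconds")
            line = f"{stamp} ok={ok} {elapsed:.3f}s {detail}"
            print(line, flush=True)
            problems.append(line)
        time.sleep(delay)
    return problems
```

Calling something like monitor("www.example.com", count=3600) would cover
about an hour at one probe per second; the timestamps of the logged lines
can then be compared against the configured interval.  Note that a local
resolver cache could mask failures, so querying a name that is not cached
locally gives a more honest picture.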

The host system for BIND is running at 9 to 14% CPU load, even during the
maint windows, so I don't believe the host system is overloaded.

How should I proceed to diagnose and correct this?  I've posted to the
bind-users list, where a few others seem to have noticed similar problems,
but no one there has offered any diagnostic hints.

Joshua Coombs