From: Sven Willenberger
To: current@freebsd.org
Date: Wed, 30 Jun 2004 15:34:05 -0400
Message-Id: <1088624045.1179.25.camel@lanshark.dmv.com>
Subject: Stack backtrace: how can I help?

My abilities to dig into kernel routines, etc. are very limited, so I am
asking how I can help those who may be able to fix this recurring problem.
This has been posted by me and others with no response from anyone, apart
from one reply saying "it must be a bug".

Under heavy load, on 5.2.1-P8 systems, I get a stack backtrace relating to
flushing dirty buffers (ffs_fsync). The relevant code from ffs_softdep.c
(src/sys/ufs/ffs/ffs_softdep.c,v 1.149 2003/10/23 21:14:08 jhb):

getdirtybuf(bpp, mtx, waitfor)
	struct buf **bpp;
	struct mtx *mtx;
	int waitfor;
{
	struct buf *bp;
	int error;

	/*
	 * XXX This code and the code that calls it need to be reviewed to
	 * verify its use of the vnode interlock.
	 */
	for (;;) {
		if ((bp = *bpp) == NULL)
			return (0);
		if (bp->b_vp == NULL)
			backtrace();
	.....

It does seem related to the load created by perl (these machines run
spamassassin through either mimedefang or milter-spamc). The machines are
now running perl 5.8.4; the upgrade made no difference, and I am still
getting these backtraces. Each machine filters roughly 120K email messages
per day.

a) What additional information would be of help here?

b) What can I do to help troubleshoot this? For the most part the machines
   recover after the backtrace (of course they are inoperable while the
   trace is being generated, which creates further backlog for the other
   machines in the cluster), although occasionally it will cause a panic
   and either reboot or hang at sync.

c) Is it possible to cvsup the latest ffs files and make install those
   without killing the machine?