Date: Wed, 10 Apr 2002 14:54:21 -0400 (EDT)
From: Bruce Campbell <bruce@engmail.uwaterloo.ca>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: Danny Schales <dan@coes.LaTech.edu>, Rolandas Naujikas <rolnas@takas.lt>, Doug White <dwhite@resnet.uoregon.edu>, Wilko Bulte <wkb@freebie.xs4all.nl>, Paul Horechuk <phorechuk@docucom.ca>, stable@FreeBSD.ORG, jah4007@cs.rit.edu
Subject: Re: nfs_fsync: not dirty error in 4.5-RELEASE (possible solution)
Message-ID: <Pine.GSO.4.05.10204101444200.3560-100000@engmail.uwaterloo.ca>
In-Reply-To: <200203292234.g2TMYpq67679@apollo.backplane.com>
After experiencing 9 such panics in a 10 day period, on 3 different
machines, I applied the below possible solution to just one of the 3
systems. That was 4 days ago. Since then, none of the 3 systems has
panicked ;-(
During the last panics, I was able to determine that the only user
connected was one who was over quota. He then moved to another of the 3
servers and was the only one connected there, and then that one panicked
too.
Despite a number of over quota experiments, I was unable to reproduce
the condition.
On Fri, 29 Mar 2002, Matthew Dillon wrote:
> Ok, I am putting this back on the main list.
>
> After looking at a kernel core that Danny graciously provided, I believe
> I have located the problem.
>
> The core shows NFS panicking on a struct buf showing up on the vnode's
> v_dirtyblkhd list that is not marked B_DELWRI.
>
> After examining the core I found that the buffer was marked B_INVAL,
> and I found a case in brelse() where B_DELWRI is cleared on a buffer
> marked B_DELWRI|B_INVAL without moving it out of the vnode's v_dirtyblkhd
> list. Specifically, line 1214 of kern/vfs_bio.c:
>
>     /*
>      * If B_INVAL, clear B_DELWRI. We've already placed the buffer
>      * on the correct queue.
>      */
>     if ((bp->b_flags & (B_INVAL|B_DELWRI)) == (B_INVAL|B_DELWRI)) {
>         bp->b_flags &= ~B_DELWRI;
>         --numdirtybuffers;
>         numdirtywakeup(lodirtybuffers);
>     }
>
> I believe that the correct fix is to change this code to:
>
>     /*
>      * If B_INVAL, clear B_DELWRI. We've already placed the buffer
>      * on the correct queue.
>      */
>     if ((bp->b_flags & (B_INVAL|B_DELWRI)) == (B_INVAL|B_DELWRI))
>         bundirty(bp);
>
> I would appreciate it if everyone who is able to easily reproduce this
> panic would test this fix and post your results back to the list. If
> this solves the problem I will commit it to -current and -stable.
>
> -Matt
> Matthew Dillon
> <dillon@backplane.com>
>
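For anyone following the thread without the kernel source at hand, here is a rough user-space sketch of the invariant Matt describes: a buffer should sit on the vnode's dirty list exactly when B_DELWRI is set. This is a toy model only, not kernel code. The flag values, struct toy_buf, enum vlist, and every function name below are made up for illustration, and it assumes that bundirty() both clears B_DELWRI and moves the buffer off the dirty list, which is what the proposed fix implies.

```c
#include <assert.h>

/* Illustrative flag values only; the real ones are defined in <sys/buf.h>. */
#define B_DELWRI 0x1    /* delayed write pending, i.e. the buffer is dirty */
#define B_INVAL  0x2    /* buffer contents are invalid */

/* Hypothetical stand-in for the vnode's dirty and clean buffer lists. */
enum vlist { ON_CLEAN, ON_DIRTY };

/* Toy buffer: only the two pieces of state this model needs. */
struct toy_buf {
    int        b_flags;
    enum vlist b_list;
};

/* The invariant: on the dirty list if and only if B_DELWRI is set. */
int invariant_holds(const struct toy_buf *bp)
{
    int dirty_flag = (bp->b_flags & B_DELWRI) != 0;
    int dirty_list = (bp->b_list == ON_DIRTY);
    return dirty_flag == dirty_list;
}

/* Model of bundirty(): clear the flag AND requeue the buffer (assumed). */
void toy_bundirty(struct toy_buf *bp)
{
    if (bp->b_flags & B_DELWRI) {
        bp->b_flags &= ~B_DELWRI;
        bp->b_list = ON_CLEAN;
    }
}

/* Old brelse() path: clears the flag but leaves the buffer where it is. */
void old_brelse_path(struct toy_buf *bp)
{
    if ((bp->b_flags & (B_INVAL | B_DELWRI)) == (B_INVAL | B_DELWRI))
        bp->b_flags &= ~B_DELWRI;
}

/* Proposed brelse() path: let the bundirty() model do both steps. */
void new_brelse_path(struct toy_buf *bp)
{
    if ((bp->b_flags & (B_INVAL | B_DELWRI)) == (B_INVAL | B_DELWRI))
        toy_bundirty(bp);
}

/* Run both paths on identical buffers and check the invariant afterwards. */
int run_demo(void)
{
    struct toy_buf a = { B_DELWRI | B_INVAL, ON_DIRTY };
    struct toy_buf b = { B_DELWRI | B_INVAL, ON_DIRTY };

    old_brelse_path(&a);
    assert(!invariant_holds(&a));   /* still on dirty list, flag gone */

    new_brelse_path(&b);
    assert(invariant_holds(&b));    /* flag gone and list updated */
    return 0;
}
```

Calling run_demo() from a main() exercises both paths. In the model, the old path leaves a buffer on the dirty list without B_DELWRI, which matches the state the core dump showed; the proposed path keeps the flag and the list in step.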
