From owner-freebsd-hackers  Tue Feb 23 12:43:32 1999
Date: Tue, 23 Feb 1999 12:43:22 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199902232043.MAA53144@apollo.backplane.com>
To: Luoqi Chen <luoqi@watermarkgroup.com>
Cc: dfr@nlsystems.com, freebsd-hackers@FreeBSD.ORG, mjacob@feral.com
Subject: Re: Panic in FFS/4.0 as of yesterday - update
References:  <199902231930.OAA21444@lor.watermarkgroup.com>


:>     getnewbuf() is starting vfs_bio_awrite()'s on essentially random
:>     buffers - not necessarily just buffers related to the VFS recursion.
:>     This means that it is possible for it to recurse through unrelated
:>     bp's and overflow the stack.
:> 
:Hmm, vfs_bio_awrite() only tries to write bufs already in core, no buffer
:reconstitution is needed. I fail to see why it would recurse back into
:getnewbuf().

     If there is no VFS layering, then true.  If there *IS* VFS layering,
     then the VOP_BWRITE() may try to constitute another buffer.

     For example, if you are going through a VN device which is file-backed,
     it must do a BMAP operation, which may constitute a new buffer (or
     several).  This loops back into the getnewbuf() code, which may then
     decide to do the same thing all over again with a different random
     bp and cause another recursion.  And so on, until the supervisor stack
     goes poof.
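
     The recursion described above can be pictured with a toy user-space
     model.  This is only a sketch, not kernel code: toy_getnewbuf(),
     toy_awrite() and MAX_DEPTH are hypothetical names that stand in for
     getnewbuf(), vfs_bio_awrite() and the finite supervisor stack.

```c
#define MAX_DEPTH 64    /* hypothetical stand-in for the finite kernel stack */

int toy_getnewbuf(int layered, int depth);

/*
 * Toy stand-in for vfs_bio_awrite(): push out one delayed-write buffer.
 * Returns the depth at which the call chain bottomed out, or -1 if the
 * chain hit MAX_DEPTH.
 */
int toy_awrite(int layered, int depth)
{
    if (!layered)
        return depth;   /* buffer is already in core, nothing to constitute */

    /*
     * Layered case (for example a file-backed VN device): the write needs
     * a BMAP-style lookup that may constitute a new buffer, which re-enters
     * the buffer allocator one level deeper.
     */
    return toy_getnewbuf(layered, depth + 1);
}

/*
 * Toy stand-in for getnewbuf(): to free up space it flushes some dirty
 * buffer, not necessarily one related to the caller.
 */
int toy_getnewbuf(int layered, int depth)
{
    if (depth >= MAX_DEPTH)
        return -1;      /* the point where a real stack would overflow */

    return toy_awrite(layered, depth);
}
```

     With these assumptions, toy_getnewbuf(0, 0) bottoms out immediately,
     while toy_getnewbuf(1, 0) keeps nesting until it reaches MAX_DEPTH.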

:>     getnewbuf() appears to have the same problem that the ufs fsync code
:>     has -- it's assuming that when it converts a DELWRI bp to async, that
:>     the I/O operation will either be in-progress or completely resolved
:>     after the call.  But there are cases, such as with softupdates, where
:>     this isn't true, where the bp may be requeued synchronously due to
:>     there being unresolved dependencies.  In this case, both getnewbuf()
:>     and the ufs fsync code will potentially loop on the same bp over 
:>     and over again.
:> 
:Is this what caused the "bmsafemap" panic? I'll take a look.

    It shouldn't be the cause... the bmsafemap panic wasn't supposed to
    occur at all, no matter what the dirty buffer situation.

    I currently have a case added to report the BMSAFEMAP condition and
    ignore it, and all my problems went away.  I've added debugging code
    supplied by Kirk (just a vprintf(), really) so when it happens again
    I'll be able to give him some more meaningful information regarding
    the type of vnode the condition occurred on.
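
    Separately from the bmsafemap panic, the requeue loop described in the
    quoted text (a DELWRI bp that comes back still dirty because of
    unresolved dependencies) can be pictured with a toy model.  This is a
    sketch only: struct toy_buf, toy_flush() and toy_sync() are hypothetical
    names, and the pass limit is an assumption added for illustration, not
    the fix used in the kernel.

```c
/* Hypothetical toy buffer: dirty flag plus a count of unresolved deps. */
struct toy_buf {
    int dirty;
    int deps;
};

/*
 * Toy "async write".  Returns 1 if the buffer was written, or 0 if it was
 * requeued still dirty because dependencies remain unresolved.
 */
int toy_flush(struct toy_buf *bp)
{
    if (bp->deps > 0)
        return 0;       /* requeued synchronously, still dirty */

    bp->dirty = 0;
    return 1;
}

/*
 * Toy flush loop that assumes each flush makes progress.  Without the
 * max_passes limit it would spin forever on a buffer whose dependencies
 * never resolve.  Returns the number of passes used, or -1 on giving up.
 */
int toy_sync(struct toy_buf *bp, int max_passes)
{
    int passes = 0;

    while (bp->dirty) {
        if (passes == max_passes)
            return -1;  /* same bp came back dirty every time */
        toy_flush(bp);
        passes++;
    }
    return passes;
}
```

    A buffer with deps == 0 is clean after one pass; a buffer with deps > 0
    is handed back unchanged on every pass, which is the loop at issue.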

					-Matt

:-lq

					Matthew Dillon 
					<dillon@backplane.com>