Date: Tue, 23 Feb 1999 10:48:01 -0800 (PST)
From: Matthew Dillon
Message-Id: <199902231848.KAA51270@apollo.backplane.com>
To: Luoqi Chen
Cc: dfr@nlsystems.com, dillon@apollo.backplane.com, freebsd-hackers@FreeBSD.ORG, mjacob@feral.com
Subject: Re: Panic in FFS/4.0 as of yesterday - update
References: <199902231444.JAA02311@lor.watermarkgroup.com>

:> No, don't disable it. Unless you want the process to overflow it's
:> supervisor stack, that is!
:>
:It won't overflow kernel stack in this case, which was reentrancy rather
:than recursion. I don't see any real danger of recursion unless there's
:a broken layered FS implementation, which the comment says it tries to
:protect against, in which case we really should fix the fs instead.

getnewbuf() is starting vfs_bio_awrite()'s on essentially random
buffers, not necessarily just buffers related to the VFS recursion.
This means that it can recurse through unrelated bp's and overflow
the stack.

:> failure as a stack recursion counter but judging from the comments, it
:> was designed to handle both conditions.
:>
:I don't think the code was designed to protect from too many 'starting up'
:I/O's (it would not panic if this is the case), but true run-away situations.
:
:> I think the proper solution is to have getnewbuf() speed up the syncer
:> daemon to retire the dirty buffers in the case where getnewbuf()
:> gets itself tied into knots, then wait and return NULL. Also, I think
:
:This sounds good. There's a variable just for that: rushjob :)
:> we need to implement a hard wait if numfreebuffers < lofreebuffers
:
:The test is in getblk(), but I agree it belongs to getnewbuf().
:
:> and the caller to getnewbuf() is not the syncer daemon ( update_proc ),
:
:I'm not sure if this exemption is useful -- there's not much we can do if
:we run out of KVA space.
:
:> but allow it otherwise. writerecursion would then simply block waiting
:> for the syncer when it gets too big rather then panic.
:>
:Then the name "writerecursion" would be a little misleading, now it becomes
:a variable to limit too many async I/O's being started at one time.
:
:-lq

getnewbuf() appears to have the same problem that the ufs fsync code
has -- it assumes that when it converts a DELWRI bp to async, the I/O
operation will either be in progress or completely resolved after the
call. But there are cases, such as with softupdates, where this isn't
true: the bp may be requeued synchronously because it still has
unresolved dependencies. In this case, both getnewbuf() and the ufs
fsync code will potentially loop on the same bp over and over again.

                                        -Matt
                                        Matthew Dillon
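
For illustration only, the looping problem described above can be modelled
in a few lines of user-space C. This is a toy sketch, not FreeBSD code:
struct toy_buf, toy_awrite(), toy_wait_for_syncer() and toy_flush() are
hypothetical names invented here, and the "syncer" is assumed to resolve
one dependency per buffer each time it runs. The point is only that a
flush loop which sees a write attempt hand back a still-dirty buffer
should wait (with a bound) instead of rescanning the same bp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a kernel buffer; NOT FreeBSD's struct buf. */
struct toy_buf {
    int  id;
    bool dirty;      /* models a delayed-write (DELWRI) buffer */
    int  deps_left;  /* unresolved dependencies, as with softupdates */
};

/*
 * Hypothetical "start an async write". If dependencies remain, the
 * buffer comes back still dirty (requeued); otherwise it is cleaned.
 * Returns true only if the buffer actually became clean.
 */
static bool toy_awrite(struct toy_buf *bp)
{
    if (bp->deps_left > 0)
        return false;
    bp->dirty = false;
    return true;
}

/* Assumed syncer model: each run resolves one dependency per buffer. */
static void toy_wait_for_syncer(struct toy_buf *bufs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (bufs[i].deps_left > 0)
            bufs[i].deps_left--;
}

/*
 * Flush every dirty buffer. When a whole pass cleans nothing, wait for
 * the syncer instead of spinning on the same buffers. Returns the number
 * of waits taken, or -1 after giving up once max_waits is exhausted.
 */
static int toy_flush(struct toy_buf *bufs, size_t n, int max_waits)
{
    int waits = 0;

    assert(bufs != NULL || n == 0);
    for (;;) {
        size_t dirty = 0, cleaned = 0;

        for (size_t i = 0; i < n; i++) {
            if (!bufs[i].dirty)
                continue;
            dirty++;
            if (toy_awrite(&bufs[i]))
                cleaned++;
        }
        if (dirty == cleaned)
            return waits;        /* nothing is left dirty */
        if (cleaned == 0) {
            /* No progress: every remaining buffer was requeued. */
            if (waits == max_waits)
                return -1;       /* bounded: report failure, do not spin */
            toy_wait_for_syncer(bufs, n);
            waits++;
        }
    }
}
```

In this model, a buffer with two unresolved dependencies is cleaned after
two waits, while a caller that allows too few waits gets -1 back rather
than an endless loop. How a real kernel should wait, and what it should do
on failure, is outside what this sketch can show.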