Date: Thu, 25 Feb 1999 01:16:16 -0800 (PST)
From: Matthew Dillon
Message-Id: <199902250916.BAA02250@apollo.backplane.com>
To: Matthew Dillon
Cc: Luoqi Chen, dfr@nlsystems.com, freebsd-hackers@FreeBSD.ORG, mjacob@feral.com
Subject: Re: Panic in FFS/4.0 as of yesterday - update
References: <199902231444.JAA02311@lor.watermarkgroup.com> <199902231848.KAA51270@apollo.backplane.com>

Oof! Well, that takes the cake. I just had a supervisor stack overflow due to the writerecursion junk in getnewbuf(). 5 levels of (FFS + VN + NFS-FILE) resulted in 60+ subroutine levels in DDB's traceback. That is after DDB faulted on itself with just me hitting return. Doh.

I think I have a solution. Normally, B_DELWRI buffers are pushed out by the syncer, and getnewbuf() never needs to deal with them. It is only during heavy buffered write I/O that the dirty buffers trash the buffer queues.

At the moment, getnewbuf() attempts to convert B_DELWRI buffers into async writes in order to push them out. This is what leads to the recursion problem. I don't think we can afford to recurse even once, therefore we cannot push out B_DELWRI buffers in getnewbuf(). At all. I think the proper solution is to speed up the syncer when getnewbuf() gets starved.

I also see a fairly serious potential low-free-buffer lockup. At the moment, getblk() (which calls getnewbuf()) treats 'lofreebuffers' as the low buffer limit and blocks if numfreebuffers < lofreebuffers.
I believe we must relax that rule if curproc == updateproc (i.e. the syncer), thus guaranteeing sufficient buffers for the syncer to be able to operate in extreme conditions.

So, to wit, I am going to start testing changes that remove the async write code from getnewbuf() entirely and implement the emergency reserve for updateproc. It should be fairly straightforward. I'll get the thing reliable and then submit a patch for review.

-Matt
Matthew Dillon
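
The emergency reserve described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the patch itself: numfreebuffers, lofreebuffers, curproc and updateproc are the names used in the post, while struct proc's contents, must_block(), try_getbuf(), the two example processes and the idea that the syncer may run the pool all the way down to zero are hypothetical stand-ins. A real kernel would sleep where this sketch returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's process structure. */
struct proc { int pid; };

struct proc syncer_proc = { 1 };          /* hypothetical: the syncer */
struct proc writer_proc = { 2 };          /* hypothetical: an ordinary writer */
struct proc *updateproc = &syncer_proc;   /* name from the post */

int numfreebuffers = 8;   /* name from the post; value is illustrative */
int lofreebuffers = 4;    /* name from the post; value is illustrative */

/*
 * Returns true if the caller would have to block instead of taking a
 * buffer. Ordinary callers block below lofreebuffers. The syncer is
 * assumed (not confirmed by the post) to block only when nothing is left.
 */
bool
must_block(const struct proc *curproc)
{
    if (curproc == updateproc)
        return numfreebuffers <= 0;
    return numfreebuffers < lofreebuffers;
}

/*
 * Try to take one free buffer. Returns false where the kernel would
 * sleep; no write is ever started from here, so there is no recursion.
 */
bool
try_getbuf(const struct proc *curproc)
{
    if (must_block(curproc))
        return false;
    numfreebuffers--;
    assert(numfreebuffers >= 0);
    return true;
}
```

With 8 free buffers and lofreebuffers set to 4, an ordinary writer obtains 5 buffers before it would sleep, and the remaining 3 stay available to the syncer so it can keep flushing dirty buffers.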