From owner-freebsd-hackers Sat Feb 20 4:17:48 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from herring.nlsystems.com (nlsys.demon.co.uk [158.152.125.33])
    by hub.freebsd.org (Postfix) with ESMTP id C94C510E05
    for ; Sat, 20 Feb 1999 04:17:40 -0800 (PST)
    (envelope-from dfr@nlsystems.com)
Received: from localhost (dfr@localhost)
    by herring.nlsystems.com (8.9.3/8.8.8) with ESMTP id MAA53242;
    Sat, 20 Feb 1999 12:17:10 GMT
    (envelope-from dfr@nlsystems.com)
Date: Sat, 20 Feb 1999 12:17:10 +0000 (GMT)
From: Doug Rabson
To: Matthew Dillon
Cc: freebsd-hackers@FreeBSD.ORG
Subject: Re: Panic in FFS/4.0 as of yesterday
In-Reply-To: <199902190915.BAA31066@apollo.backplane.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Fri, 19 Feb 1999, Matthew Dillon wrote:

> :On Thu, 18 Feb 1999, Matthew Jacob wrote:
> :
> :> Oh, btw- I should clarify a little about this test and some spice to the
> :> mix... The same test on a system that is 25% of the cpu power and 25% of
> :> the memory running solaris 2.7 Intel not only successfully has always runs
> :> this test but also retains a quite acceptable responsiveness. Please don't
> :> make me claim Slowlaris is better!
> :
> :I'm sure that something very wrong is happening, don't worry. Hopefully, I
> :will be able to see something.
> :
> :--
> :Doug Rabson                Mail:  dfr@nlsystems.com
> :Nonlinear Systems Ltd.     Phone: +44 181 442 9037
>
> I've started testing the VN device. So far I've found it to be
> extremely unstable when using an NFSV2 or NFSV3 file as backing
> store. I'm going to try using an MFS based file as backing store
> next to see whether the problem is with the VN device or the NFS device.
>
> I've gotten the bmsafemap softupdates panic with softupdates mounted
> filesystems sitting on top of VN, but that was with the NFS-backed VN
> test which was unstable even without softupdates so I don't know if
> that is a real crash.
>
> I haven't tried reproducing the softupdates panic on its own merits
> yet. I want to fix VN first.

I've just been looking at the responsiveness problem associated with
Matt Jacob's bulk writing test and I can see what is happening (although
I'm not sure what to do about it).

The system is unresponsive because the root inode is locked virtually
all of the time. This is the result of a lock cascade leading back to a
single process which is trying to rewrite a block of the directory in
which the test is running (synchronously, since the fs is not using
softupdates). That process is waiting for its i/o to complete before
unlocking the directory. Unfortunately its buffer is the last one on the
drive's buffer queue and, in the one instance which I examined in the
debugger, there were 647 buffers ahead of it, most of them writes of
about 8k. About 4MB of the buffers on the queue came from a *single*
process, which seems extreme.

The i/o for directories is being hugely delayed by the several bulk
writing threads which the test has managed to start up, and any
directory which stays locked for long can easily lead to a locked root
vnode (especially since there is a herd of processes in the test trying
to create files in the same directory).

I modified my source tree to use bufq_insert_tail instead of
bufqdisksort in scsi_da.c, which made no difference to the
responsiveness problem (it probably made it worse, since it guarantees
that the directory i/o is delayed by the maximum amount of time).

It seems to me that there should be a mechanism to prevent the queued
i/o lists from becoming so long (over 5MB is queued on the machine which
I have in the debugger), perhaps by throttling the writers if they start
too much asynchronous i/o.
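
The throttling idea can be illustrated with a toy model. This is a
Python sketch, not kernel code, and it rests on several assumptions that
are not in the message: the device retires exactly one buffer per tick
in strict FIFO order (roughly the bufq_insert_tail case), every bulk
buffer is 8k, and a throttled writer simply sleeps until the queue
drains. The function name and the max_queued_bytes parameter are
hypothetical, chosen only for this illustration.

```python
from collections import deque

WRITE_SIZE = 8 * 1024  # "about 8k" per queued buffer, as in the message


def ticks_until_directory_write(bulk_buffers, max_queued_bytes=None):
    """Count device ticks until a tail-queued directory write completes.

    Assumed model (not the real driver): the device retires one buffer
    per tick in FIFO order. Bulk writers submit bulk_buffers buffers
    before the directory write arrives. If max_queued_bytes is set, a
    writer whose buffer would push the queue past the cap is put to
    sleep, so that buffer never gets ahead of the directory write.
    """
    queue = deque()
    queued_bytes = 0
    for _ in range(bulk_buffers):
        over_cap = (max_queued_bytes is not None
                    and queued_bytes + WRITE_SIZE > max_queued_bytes)
        if over_cap:
            break  # throttle: the remaining writers sleep
        queue.append("bulk")
        queued_bytes += WRITE_SIZE

    # The synchronous directory rewrite goes on the tail of the queue.
    queue.append("dir")

    ticks = 0
    while queue:
        ticks += 1
        if queue.popleft() == "dir":
            return ticks
    raise AssertionError("directory write was never queued")
```

With the 647 buffers seen in the debugger and no cap,
ticks_until_directory_write(647) returns 648. With a purely illustrative
cap of 256k, ticks_until_directory_write(647, max_queued_bytes=256 * 1024)
returns 33, because only 32 bulk buffers are admitted ahead of the
directory write. A sorted queue (bufqdisksort) would order the buffers
differently, so these figures only describe the FIFO case.
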
I wonder if this queue-length problem can be treated as a problem
similar to the swapper latency issues which John Dyson was talking
about.

I haven't seen the panic which Matt reported yet, but I imagine that
it's an overload condition caused by the extreme amounts of pending i/o.

--
Doug Rabson                Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.     Phone: +44 181 442 9037