From: Matthew Dillon
Date: Mon, 1 Mar 1999 00:10:27 -0800 (PST)
Message-Id: <199903010810.AAA42400@apollo.backplane.com>
To: Matthew Jacob, freebsd-hackers@FreeBSD.ORG
Subject: Re: Panic in FFS/4.0 as of yesterday - update

    New test patches are available for the getnewbuf() bug, and a couple
    more bugs have been found.  Only the 'VFS_BIO Fixed to getnewbuf'
    patch is described in this email.

        http://www.backplane.com/FreeBSD4/

    The patch needs more extensive testing.

    Bugs fixed:

    * getnewbuf() recursion reduced to 1 level.  5 levels is too deep
      for some VFS stacks and can overflow the supervisor stack.

    * A number of low-memory deadlock situations fixed by redoing the
      way buffers are written out and freed up.

    * numfreebuffer, numdirtybuffer accounting fixed - it was broken.

    Old bugs not fixed:

    * I/O saturation is still a problem.  There is no easy solution -
      even reverting to synchronous I/O doesn't help, because the STEST
      script starts up 50 processes doing writes.  This isn't a serious
      bug under normal operating conditions, but it is annoying.

    New bugs found (and not fixed):

    * exec_map can only hold 16 exec'ing processes at once.  It needs a
      counter and a sleep/wait.  Sometimes when I ran Matt J's 'breakit'
      script, a background program would 'Abort trap' out instantly on
      startup because more than 16 were doing an exec at once.  This can
      only occur in a low-memory or heavy-I/O situation.

    * Another bug in vfs.ffs.doreallocblks found.
      When writing large files (running Matt J's STEST again) with
      doreallocblks on and softupdates enabled, blocks associated with
      the file are apparently reallocated as the file is being written.
      When writing 50 4MB files (200MB), over 350MB of disk space can be
      used during the test because softupdates is unable to get in and
      sync the bitmaps.  If the test is paused, softupdates catches up.
      Otherwise you can run out of disk space.

      Also, if the filesystem runs out of space during the test, aka
      'filesystem full', softupdates sometimes panics with the error:

        panic: softdep_setup_blkmapdep: found block

    * Possible hang with NFS (still under test).  Sometimes NFS gets
      into loops during flushdirtybuffers() where it tries to rewrite
      the same block over and over again.  I don't know why yet.

                                        -Matt
                                        Matthew Dillon
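
    On the exec_map item above: a rough userland sketch of what "a
    counter and a sleep/wait" could look like, assuming POSIX threads
    stand in for the kernel's sleep/wakeup.  This is not taken from the
    patch.  The names slot_limiter, limiter_acquire(),
    limiter_try_acquire(), limiter_release() and EXEC_SLOTS are
    hypothetical; only the figure 16 comes from the report.  The idea is
    that a 17th caller sleeps until a slot is released instead of
    failing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical limit; 16 is the exec_map figure quoted in the report. */
#define EXEC_SLOTS 16

struct slot_limiter {
    pthread_mutex_t lock;
    pthread_cond_t  freed;   /* signalled whenever a slot is released */
    int             in_use;  /* number of slots currently held */
};

void limiter_init(struct slot_limiter *l)
{
    pthread_mutex_init(&l->lock, NULL);
    pthread_cond_init(&l->freed, NULL);
    l->in_use = 0;
}

/* Sleep until a slot is free, then take it. */
void limiter_acquire(struct slot_limiter *l)
{
    pthread_mutex_lock(&l->lock);
    while (l->in_use >= EXEC_SLOTS)
        pthread_cond_wait(&l->freed, &l->lock);
    l->in_use++;
    pthread_mutex_unlock(&l->lock);
}

/* Non-blocking variant: returns false instead of sleeping. */
bool limiter_try_acquire(struct slot_limiter *l)
{
    bool ok;

    pthread_mutex_lock(&l->lock);
    ok = l->in_use < EXEC_SLOTS;
    if (ok)
        l->in_use++;
    pthread_mutex_unlock(&l->lock);
    return ok;
}

/* Give a slot back and wake one sleeper, if any. */
void limiter_release(struct slot_limiter *l)
{
    pthread_mutex_lock(&l->lock);
    assert(l->in_use > 0);
    l->in_use--;
    pthread_cond_signal(&l->freed);
    pthread_mutex_unlock(&l->lock);
}
```

    In this sketch a caller would bracket its use of the limited
    resource with limiter_acquire() and limiter_release(); how that maps
    onto the real exec path is left open here.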