From: Terry Lambert
Message-Id: <199611051800.LAA06616@phaeton.artisoft.com>
Subject: Re: More info on the daily panics...
To: ponds!rivers@dg-rtp.dg.com (Thomas David Rivers)
Date: Tue, 5 Nov 1996 11:00:11 -0700 (MST)
Cc: terry@lambert.org, dyson@freebsd.org, freebsd-hackers@freebsd.org
In-Reply-To: <199611050428.XAA00313@lakes.water.net> from "Thomas David Rivers" at Nov 4, 96 11:28:03 pm

> > Time for a repost, it seems...
> >
> ... description delete ...
>
> Err, umm, ... doesn't that only fix the "free vnode isn't"
> panic?  That's really not what I'm seeing...  I'm seeing
> inode allocation panics coming from ffs_valloc.c.

No.  It fixes the freelist wrap error.

> I'm seeing panics in ffs_vfree() that seem to be that the
> inode is clear when it shouldn't be.

vclean() will cause it to be marked clear if:

1)  The overflow occurs by exactly *1*; this will cause the new
    inode to overwrite the old.
2)  The vnode that is reallocated is freed by the original owner;
    this will cause the [new] inode to be cleared.
3)  You attempt a second operation on the second vnode reference
    to the same object.
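
To make conditions 1-3 concrete, here is a rough user-space model of
the overflow-by-one case. It is only a sketch, not kernel code: every
model_* name is hypothetical, the structures are drastically simplified
stand-ins for the real vnode and inode (only the v_data and i_vnode
links are kept), and an i_mode of 0 is assumed to mean "clear".

```c
#include <assert.h>
#include <stddef.h>

struct model_vnode;

/* Simplified stand-in for the in-core inode (hypothetical). */
struct model_inode {
    struct model_vnode *i_vnode; /* back pointer to the owning vnode */
    int i_mode;                  /* assumption: 0 means "clear" */
};

/* Simplified stand-in for the vnode (hypothetical). */
struct model_vnode {
    void *v_data;                /* per-FS private data: here, the inode */
};

/* Attach an inode to a vnode, overwriting any previous owner's link. */
void model_attach(struct model_vnode *vp, struct model_inode *ip, int mode)
{
    assert(vp != NULL && ip != NULL);
    ip->i_vnode = vp;
    ip->i_mode = mode;
    vp->v_data = ip;
}

/* Model of cleaning: clears whichever inode the vnode points to *now*. */
void model_vclean(struct model_vnode *vp)
{
    struct model_inode *ip = vp->v_data;

    if (ip != NULL)
        ip->i_mode = 0;
    vp->v_data = NULL;
}

/* Invariant: the vnode an inode points to must point back at that inode. */
int model_backref_ok(const struct model_inode *ip)
{
    return ip->i_vnode != NULL && ip->i_vnode->v_data == (const void *)ip;
}

/* Replays conditions 1-3; returns 1 if the new inode ends up clear. */
int model_overflow_by_one(void)
{
    struct model_vnode vn = { NULL };
    struct model_inode old_ip = { NULL, 0 }, new_ip = { NULL, 0 };

    model_attach(&vn, &old_ip, 0100644); /* original owner */
    model_attach(&vn, &new_ip, 0100644); /* 1) vnode reused while in use */

    /* old_ip still references vn, but vn no longer references old_ip. */
    int stale = !model_backref_ok(&old_ip);

    model_vclean(old_ip.i_vnode);        /* 2) original owner frees "its" vnode */

    /* 3) the next operation through new_ip would find a clear inode. */
    return stale && new_ip.i_mode == 0;
}
```

Calling model_overflow_by_one() from a small main() replays the
sequence; in this model the back-reference check on the old inode
already fails at step 1, before anything has been cleared.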

If the overflow occurs by 2 or more (alloc a/alloc b/alloc c/free b/
free a) then you will get "free vnode isn't"... you will see this
during a directory lookup operation for a create, especially in
msdosfs.  It tends to be hacked around on a per-FS basis because the
VOP_LOCK code is duplicated instead of shared, and everyone implements
it slightly differently because the lock data area is off the inode
instead of the vnode.

Like I said in the patch, understand what the patch is working around
and you will understand why it's needed.  The patch to the create op
in vfs_vnops.c that was made a bit ago only hacks around the problem.

> I'm thinking the problem is either the inode allocation bit
> (cg_inosused) is being cleared when it shouldn't be, or it
> isn't being set when it shouldn't be...  That would readily
> explain the panic's I'm seeing...

Consider instead what would happen if ffs reused a vnode when it was
not truly free... it would rewrite the inode data pointer.  The
original inode would still point to the same vnode, but the inode
that that vnode now points to could be free.

So a reference by inode "works" (but gives the wrong vnode and
buffers), but a reference by vnode fails.

This also explains the occasional write warnings, and the occasional
library corruption, FWIW.  Cleaning an inode from the hash with an
associated vnode would cause the data buffers from the second file to
be applied to the first.  I believe this is the source of a number of
MSDOSFS "bugs"; MSDOSFS is more sensitive because an inode *is* a
directory entry.  The buffer swapping means that even a read-only
mounted FS can get writes on overflow.

You can verify this by ASSERT'ing that the inode reference in the
vnode the inode points to points back to the inode in question (this
may fail on null-layer stacking, however, since there the vnode data
pointer points to another vnode).

Part of "the true fix" must be to zone devices and pass writes
through zone filtering...
this is partially done anyway, but there are currently four or five
logical device interfaces, and not all of them do it... only the
disklabel devices really get it.  Basically, there is a need for a
common "logical device descriptor" which applies to all partitioning
mechanisms.

> Now, could the vfs_subr.c changes you suggest cause this
> to happen?  If the ffs_xxxx routines and data are properly
> isolated - seems like that wouldn't be the case...  But,
> I'm not file-system guru.

Yes, it can cause the error.  Try the ASSERT.

I've been loath to discuss this in detail without patches in hand,
since it's a serious reliability problem... unfortunately, patches
require an orthogonal infrastructure (layering fixes to make everyone
lock the same way so it can be worked around, then logical-to-physical
device mapping by descriptor through a mapping layer for a real fix
that simply disallows out-of-zone writes, then per-FS allocation of
the vnode as a subelement of the in-core inode and a VOP_VRELE, etc.).

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.