Date: Fri, 24 Apr 1998 06:02:35 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: eivind@yes.no (Eivind Eklund)
Cc: fs@FreeBSD.ORG
Subject: Re: Problem which FSCK doesn't fix
Message-ID: <199804240602.XAA06993@usr09.primenet.com>
In-Reply-To: <19980423234857.51461@follo.net> from "Eivind Eklund" at Apr 23, 98 11:48:57 pm

> A users problem: A partition passes fsck (with -f) fine, but causes
> panic()s sometimes ("free vnode isn't"), and makes dump loop 'forever'
> (or at least well past 1700%, which seems 17x too much :-)

The "free vnode isn't" is an inherent race condition having to do
with vclean and how VOP_LOCK can't be trusted because of the way it
is implemented.

This is the basis of my "vclean must die" crusade and my "VOP_LOCK
must become veto based" crusade.

Basically, it's pretty glaringly obvious, if you can get the whole
vnode life cycle in your head at one time.  Going to a veto basis is
just to make sure that all FS's use the corrected code, more than a
real requirement (assuming this is FFS; if not, the FS probably has
bogus semantics, since most of them do).

The problem occurs when an allocation is made while another
allocation is sleeping.  I'd explain it, but it's not simple to
explain, and you'd be better off loading all the code in your head
and pondering it yourself, rather than having me load an analogue of
the code in your head and stepping you through it.

When you look at it without taking the whole thing into account at
one time, it looks like it can't happen.  Yet users report the error,
don't they?

> I have a binary copy of the partition, a copy of the disklabel output
> for the disk, and a copy of fdisk output for the disk.
>
> Anybody that could clue me in to what's the best way to approach this
> (or feel a desire to take over ;-)?

The "free vnode isn't" error is totally unrelated to this one.

The dump looping I don't understand; how large is the image of this
thing?

In general, dump is seriously neglected.  I would not expect a dump
problem, but neither would I be terribly surprised at one.  Code does
not mutate, but there are occasions when it is allowed to be
selectively maintained.

You may want to look at reverting the reporting format changes and
some other minor changes that occurred about 8-10 months ago.
This may or may not fix things, but the closer to the CSRG code base
the code is, the more likely that some academic type sat around for a
quarter or more thinking about the next-to-last change before
charging in and twiddling bits.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message