Date:      Fri, 27 Sep 1996 14:24:07 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        guido@gvr.win.tue.nl (Guido van Rooij)
Cc:        pst@shockwave.com, FreeBSD-hackers@freebsd.org
Subject:   Re: cvs commit: src/sbin/fsdb fsdb.c
Message-ID:  <199609272124.OAA10412@phaeton.artisoft.com>
In-Reply-To: <199609271904.VAA01907@gvr.win.tue.nl> from "Guido van Rooij" at Sep 27, 96 09:04:07 pm

> The strange thing is that this should be impossible. Anyway,
> the problem is that sometimes a filesystem passes fsck but still makes
> the kernel panic with a bad dir: mangled entry (or something like that).
> The reason is that the size of the directory is beyond the last datablock,
> thus effectively making a sparse directory file (at least in my case).
> Fsck doesn't find anything because it only examines the present datablocks.
> The kernel does see such a non-present block as a bunch of zeros. And
> that causes the panic because a non-used directory chunk should have a
> reclen field of 255. The fix (until fsck is fixed) is to fsdb the filesystem,
> chdir to the bad dir and do an ls. You will then see the last entry and you
> can reset the size of the directory until just after that entry.

This FS *was* fsck'ed after a crash, or it *wasn't* fsck'ed after a
crash?

If it *wasn't*, then the loop was created in the FS code.

If it *was*, then the fsck code is faulty.

I have already fixed one fault in the lost+found creation handling
(root inode link count).  If a crash occurred after a directory entry
removal, but prior to the VOP_TRUNCATE, the FS would appear to be in a
consistent state.

Such a crash should not mark the FS clean.

The correct mechanism for recovery would be for fsck to traverse the
last directory block in a directory to make sure it has at least one
valid entry and, if it does not, perform a full traversal with a file
truncation to complete the directory "shrink".
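
A minimal sketch of that check, under stated assumptions: the directory
is held in memory as whole 512-byte chunks, and the entry header below is
a simplified, hypothetical stand-in for the real on-disk structure.  The
names dent_hdr, chunk_has_valid_entry and shrunk_dir_size are
illustrative only and are not taken from the fsck sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRBLKSIZ 512   /* assumed directory chunk size */

/* Simplified, hypothetical entry header (not the real struct direct). */
struct dent_hdr {
    uint32_t ino;     /* 0 means the slot is unused */
    uint16_t reclen;  /* bytes from this entry to the next one */
    uint8_t  type;
    uint8_t  namlen;
};

/*
 * Scan one chunk.  Returns 1 if it holds at least one in-use entry,
 * 0 if it is well formed but empty, -1 if it is mangled.
 */
int
chunk_has_valid_entry(const unsigned char *chunk)
{
    size_t off = 0;
    int used = 0;

    while (off < DIRBLKSIZ) {
        struct dent_hdr h;

        if (DIRBLKSIZ - off < sizeof(h))
            return -1;          /* no room left for a header */
        memcpy(&h, chunk + off, sizeof(h));
        if (h.reclen < sizeof(h) || (h.reclen & 3) != 0 ||
            h.reclen > DIRBLKSIZ - off)
            return -1;          /* zero, unaligned or overlong reclen */
        if (h.ino != 0)
            used = 1;
        off += h.reclen;
    }
    return used;
}

/*
 * Return the size the directory should be truncated to: unchanged if
 * the last chunk holds a valid entry, otherwise the end of the last
 * chunk that does, found by a full forward traversal.
 */
size_t
shrunk_dir_size(const unsigned char *dir, size_t size)
{
    size_t off, keep = DIRBLKSIZ;   /* never drop the first chunk */

    if (size < DIRBLKSIZ || size % DIRBLKSIZ != 0)
        return size;                /* out of scope for this sketch */
    if (chunk_has_valid_entry(dir + size - DIRBLKSIZ) == 1)
        return size;
    for (off = 0; off < size; off += DIRBLKSIZ)
        if (chunk_has_valid_entry(dir + off) == 1)
            keep = off + DIRBLKSIZ;
    return keep;
}
```

A zero-filled chunk, such as the non-present block described in the
quoted text, fails the reclen test and is reported as mangled, so it is
dropped by the shrink.  The sketch does not try to repair mangled chunks
in the middle of a directory; a real fsck pass would have to.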


Since the lost+found handling and the truncate-back were the two major
semantic changes that affect fsck (all other operations *should* be
idempotent), this should be the last one lurking in the "4.4 semantic
changes" for fsck.


So: can you tell me if the condition resulted from fsck not catching
it after a crash, or if it resulted from normal operation of the FS?


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


