Date:      Mon, 20 Nov 2000 09:01:03 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        dwalton@acm.org
Cc:        julian@elischer.org (Julian Elischer), fs@FreeBSD.ORG
Subject:   Re: corrupted filesystem
Message-ID:  <200011200901.CAA04577@usr06.primenet.com>
In-Reply-To: <3A17FFA9.30580.3B5C5FD@localhost> from "Dave Walton" at Nov 19, 2000 04:28:25 PM

> > there is a file/dir bit in the directory  as well
> > obviously they disagree..
> 
> Ah.  Now it makes sense.
> 
> > hopefully if you can run fsck -y you can find the inodes of the 
> > user's individual directories when they are put in lost+found 
> 
> Isn't it possible for fsck -y to cause more damage while it tries to 
> repair things?

It will probably delete everything as "unreferenced".  It's pretty
clear that your /usr directory got hosed, which probably means the
root directory ("/") of the FS that is mounted as /usr.

This will have happened because the directory entry block that
was at the top level was undergoing modification at the time of
the crash.  The most likely reason for this was a create or a
rename of a file or directory in /usr, resulting in a compaction
of the directory entry block containing the damaged entries.

You could "fix" this by dumping the contents of the top level
directory on the device, and sifting through it by hand with a
copy of /sys/ufs/ufs/dir.h in hand.

"Fixed", you will have added back references to the inodes of the
directories (and perhaps files, if there were any) under /usr.

See the comments in the dir.h file referenced above for details
on the layout of the directory entries and what makes an entry
"deleted" or still alive.
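
To make the "sifting by hand" concrete, here is a minimal sketch of
walking one dumped directory block.  It assumes the record header
described in dir.h (32-bit d_ino, 16-bit d_reclen, 8-bit d_type,
8-bit d_namlen, then the name), a little-endian machine, and a
512-byte directory block.  The names walk_dirblock and make_entry
are made up for this illustration; check the field layout against
your own copy of dir.h before trusting it.

```python
import struct

DIRBLKSIZ = 512                    # assumed directory block size
HEADER = struct.Struct("<IHBB")    # d_ino, d_reclen, d_type, d_namlen (little-endian assumed)

def walk_dirblock(block):
    """Yield (offset, d_ino, d_reclen, d_type, name) for each record in one block."""
    off = 0
    while off + HEADER.size <= len(block):
        d_ino, d_reclen, d_type, d_namlen = HEADER.unpack_from(block, off)
        # A record must be 4-byte aligned, hold its own name, and stay in the block.
        if (d_reclen % 4 or d_reclen < HEADER.size + d_namlen
                or off + d_reclen > len(block)):
            break  # corrupt record: stop rather than guess
        start = off + HEADER.size
        name = block[start:start + d_namlen].decode("latin-1")
        yield off, d_ino, d_reclen, d_type, name
        off += d_reclen

def make_entry(d_ino, d_reclen, d_type, name):
    """Build one raw record, zero-padded to d_reclen (used here only to make test data)."""
    raw = name.encode("latin-1")
    return (HEADER.pack(d_ino, d_reclen, d_type, len(raw)) + raw).ljust(d_reclen, b"\0")
```

Records that come back with d_ino of 0 are the unused ones; the
rest are candidates for the references you want to add back.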

If any of your entries are the first entry in a directory block,
you are probably screwed for that entry, since the way a directory
entry is deleted from the front of a block is to zero its inode
number.  The one exception to this will be the first one, since
that entry will be for "." (and the next for "..").  As the root
of a filesystem, we know that the inode there will be "2".
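
As a sketch of that one recoverable case: the code below checks
whether the first record of a block has a zeroed inode number and,
for the first block of a filesystem's root directory, puts inode 2
back into "." and "..".  It assumes the same little-endian record
header as dir.h describes (d_ino, d_reclen, d_type, d_namlen); the
function names are invented for this example, and it works on a
copy in memory rather than on the disk.

```python
import struct

ROOT_INO = 2  # inode number of a filesystem's root directory

def dead_first_entry(block):
    """Return True if the block's first record has a zeroed inode number."""
    (d_ino,) = struct.unpack_from("<I", block, 0)
    return d_ino == 0

def restore_root_dot(block):
    """Return a copy of the root's first block with "." and ".." set to inode 2."""
    fixed = bytearray(block)
    d_reclen, _d_type, d_namlen = struct.unpack_from("<HBB", fixed, 4)
    if fixed[8:8 + d_namlen] != b".":
        raise ValueError("first record is not '.'")
    struct.pack_into("<I", fixed, 0, ROOT_INO)
    # ".." is the record immediately after "."; in the root it is also inode 2.
    _reclen2, _type2, namlen2 = struct.unpack_from("<HBB", fixed, d_reclen + 4)
    if fixed[d_reclen + 8:d_reclen + 8 + namlen2] != b"..":
        raise ValueError("second record is not '..'")
    struct.pack_into("<I", fixed, d_reclen, ROOT_INO)
    return bytes(fixed)
```

For any other first-in-block record with a zeroed inode number,
there is nothing in the record itself to restore from.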

Adjusting the d_type for the home directory should be easy.  If
the problem is the type on the inode itself, you are in much
worse trouble, since this will mean that the inode that had the
home directory in it has been subsequently reused for a normal
file, and the data is probably destroyed.
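
A small sketch of telling the two cases apart, given the d_type
byte from the directory entry and the mode word from the inode.
The constants are the usual ones (DT_DIR is 4, DT_REG is 8, and the
file type lives in the top four bits of the mode); diagnose_home is
a made-up name for this example.

```python
IFMT = 0o170000    # mask for the file-type bits of an inode mode
IFDIR = 0o040000   # directory
IFREG = 0o100000   # regular file
DT_DIR = 4
DT_REG = 8

def iftodt(mode):
    """Convert an inode mode to the matching d_type value."""
    return (mode & IFMT) >> 12

def diagnose_home(d_type, di_mode):
    """Classify a directory entry that is supposed to name a directory."""
    if (di_mode & IFMT) != IFDIR:
        return "inode is not a directory: probably reused, hard case"
    if d_type != DT_DIR:
        return "inode is a directory, d_type is wrong: fix the entry"
    return "consistent"
```

For example, diagnose_home(DT_REG, 0o040755) reports the easy case,
and diagnose_home(DT_DIR, 0o100644) reports the hard one.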

This can also be recovered, with difficulty (you will probably
need to write a specialized tool for this, unless you are willing
to live with everything being placed in lost+found).


My suggestion would be to do an image backup of the FS to tape
(preferably twice, just in case; that's a hell of a lot of data)
_NOW_, and that you do it _BEFORE_ you do anything else.
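
If tape is not at hand, the same idea can be sketched as a plain
image copy to another disk.  This is only an illustration, not a
replacement for a proper backup tool: image_copy is a made-up
helper that reads in whole multiples of 512 bytes and returns a
SHA-256 digest, so that two copies made in two passes can be
compared afterwards.

```python
import hashlib

SECTOR = 512  # assumed disk block size

def image_copy(src, dst, sectors_per_read=128):
    """Copy src to dst in sector-multiple chunks; return the SHA-256 of the data."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(SECTOR * sectors_per_read)
        if not chunk:
            break  # end of the source
        dst.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()
```

You would open the raw device read-only in binary mode as src and
the image file as dst; if the digests of two passes differ, the
source did not read back the same way twice and neither copy
should be trusted blindly.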

Realize that your previous attempts at recovery, if an automatic
recovery was attempted, may have damaged your data by clearing
inodes that should not have been cleared.  If during your manual
fsck, you overrode the "no", you could likewise have damaged the
data.


As a general rule, fsck exists not to recover data, but to return
the FS to a consistent state.  It will likely recover all that it
can to lost+found, but that may not be everything.  What it does
recover to lost+found will be inode number named directories and
files.  Since directories are recovered first, this should mean
that subhierarchies underneath will remain intact, and all you will
lose is the names of the directories.

But beware: with a corrupt root inode, all bets are off: fsck will
not be able to recover, since if / is not a directory, then it will
not be able to create /lost+found within the FS.  So your first
order of business _MUST_ be to recover as much of your / on that
device as possible, so that what isn't recovered by fixing that
will be recoverable into a /lost+found which can be created by the
fsck process.


Generally, when I run into this type of failure, I sit down on a
different system and write some tools specific to the recovery task
at hand; this can be a cluster-grep, a finder-of-directory-inodes
(with non-zero reference counts), a simple raw directory editor,
etc.: whatever is needed for the specific task.
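
As an illustration of how small such a tool can be, here is a
sketch of a finder-of-directory-inodes.  It assumes the on-disk
inode is 128 bytes with a 16-bit mode at offset 0 and a signed
16-bit link count at offset 2, little-endian; verify that against
the dinode definition on your system.  It only scans a buffer of
inodes you have already read; turning a slot index back into an
inode number depends on the cylinder group layout and is left out.

```python
import struct

DINODE_SIZE = 128                  # assumed on-disk inode size
IFMT, IFDIR = 0o170000, 0o040000   # file-type mask and directory type

def find_directory_inodes(inode_area):
    """Return slot indexes of directory inodes with a non-zero link count."""
    hits = []
    for slot in range(len(inode_area) // DINODE_SIZE):
        di_mode, di_nlink = struct.unpack_from("<Hh", inode_area, slot * DINODE_SIZE)
        if (di_mode & IFMT) == IFDIR and di_nlink > 0:
            hits.append(slot)
    return hits
```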


PS: Your job is going to be much harder, since block devices were
murdered, so unless your system predated the murder, you will have
to be very careful to only read and write in disk block sized
units on disk block boundaries.  For almost all disks, this will
be 512-byte chunks on 512-byte boundaries.  Emergency recovery was much
easier before block devices were shot in the head.  8-(.
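
A minimal sketch of what "block sized units on block boundaries"
means in practice: widen any request out to whole 512-byte sectors,
read that, and slice out the bytes you actually wanted.  The helper
names are invented for this example, and the sector size is the
assumption stated above.

```python
SECTOR = 512  # assumed disk block size

def aligned_span(offset, length, sector=SECTOR):
    """Return (start, size) of the sector-aligned span covering the request."""
    start = offset - offset % sector
    end = -(-(offset + length) // sector) * sector  # round up to a sector boundary
    return start, end - start

def read_unaligned(dev, offset, length, sector=SECTOR):
    """Read `length` bytes at `offset` using only aligned, whole-sector I/O."""
    start, size = aligned_span(offset, length, sector)
    dev.seek(start)
    buf = dev.read(size)
    skip = offset - start
    return buf[skip:skip + length]
```

Writes need the same treatment in reverse: read the aligned span,
patch the bytes in memory, and write the whole span back.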

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

