Date:      Wed, 1 Dec 2010 03:12:10 -0800
From:      Garrett Cooper <yanegomi@gmail.com>
To:        Peter Holm <pho@freebsd.org>
Cc:        Marshall Kirk McKusick <mckusick@mckusick.com>, Kostik Belousov <kostikbel@gmail.com>, current@freebsd.org
Subject:   Re: How a full fsck screwed up my SU+J filesystem
Message-ID:  <AANLkTi=ab471f1%2BWLepOEXnci=LhMJzvH98a3=MM1NHq@mail.gmail.com>
In-Reply-To: <20101201110008.GA50719@x2.osted.lan>
References:  <1FA8A18C-9350-4C2D-B034-768566ACB718@gmail.com> <20101201110008.GA50719@x2.osted.lan>

On Wed, Dec 1, 2010 at 3:00 AM, Peter Holm <pho@freebsd.org> wrote:
> On Wed, Dec 01, 2010 at 01:28:06AM -0800, Garrett Cooper wrote:
>>       So... I was doing a portmaster -af today because vlc stopped
>> playing audio (for some reason ... I kind of went on a pkg_cutleaves
>> rampage and probably deinstalled too much stuff), and the machine
>> hardlocked during an upgrade. I did a soft reboot and saw messages
>> along the lines of "your journal and filesystem mount time
>> mismatched; running a full fsck". I figured "ok, sure..." and let it
>> do it's thing. Problem was that it pruned a lot of stuff from my /usr
>> partition -- including the .sujournal !!! So now it's stuck at
>> Mounting local file systems: stating:
>>
>> Failed to find journal.  Use tunefs to create one
>> Failed to start journal: 2
>>
>>       (I assume the 2 means ENOENT). All of the above were
>> printf(9)'s from the kernel.
>>       Now the machine won't continue in multiuser mode (doesn't
>> respond to interrupts, no panic, etc). Going into ddb, I don't see
>> anything in info_threads (just a bunch of references to sched_switch,
>> a few to fork_trampoline, cpustop_handler, and kdb_enter). I'm going
>> to try and massage the machine back to life from single user mode,
>> but the fact that this died in this way (i.e. .sujournal getting
>> nuked by a full fsck) is a bit disheartening for SU+J :(... It would
>> be nice if at least the fsck aborted before going and nuking the
>> journal :/... (or at the very least if the file wasn't removable --
>> i.e. SF_NOUNLINK).
>>       Here's to hoping I can resuscitate the filesystem...
>
> Thank you for reporting this.
>
> I was able to reproduce the problem by:
>
> tunefs -j enable /dev/md5a
> mount /dev/md5a /mnt
> chflags 0 /mnt/.sujournal
> rm -f /mnt/.sujournal
> umount /mnt
> mount /dev/md5a /mnt
>
> The mount(1) is now stuck in mntref.
>
> http://people.freebsd.org/~pho/stress/log/kostik404.txt
>
> A sequence of "tunefs -j disable" + "tunefs -j enable" should get
> you going.

    I had to run fsck -y again, unmount the partition, and then run
tunefs -j enable on it; after that I was able to get it to work.
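
    For anyone who lands in the same state, the sequence above can be
sketched roughly as the script below. This is only an illustration, not
a tested recovery procedure: recover_suj and the RUN variable are
made-up names, the final mount step is an addition of mine, and the
device and mount point are whatever applies to your system. By default
it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): list the recovery steps
# described above for a given device and mount point.
# With RUN=1 in the environment the commands are executed instead of
# printed; run them by hand from single user mode if in any doubt.
recover_suj() {
    dev=$1
    mnt=$2
    for cmd in \
        "fsck -y $dev" \
        "umount $mnt" \
        "tunefs -j enable $dev" \
        "mount $dev $mnt"
    do
        if [ "${RUN:-0}" = "1" ]; then
            # Unquoted on purpose so the string splits into command + args.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}
```

    Calling "recover_suj /dev/md5a /mnt" (the names from Peter's
reproduction, used here purely as placeholders) prints the four
commands in order. Peter's suggestion of "tunefs -j disable" followed
by "tunefs -j enable" may be worth trying first.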
    It looks like some files were lost -- it would be interesting to
know how much damage was done. Thankfully it was just /usr (mostly
/usr/local, because I was recompiling ports) and not my more critical
data partition(s).
Thanks!
-Garrett



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AANLkTi=ab471f1%2BWLepOEXnci=LhMJzvH98a3=MM1NHq>