Date:      Wed, 1 Dec 2010 16:27:48 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Peter Holm <pho@freebsd.org>
Cc:        Garrett Cooper <yanegomi@gmail.com>, Marshall Kirk McKusick <mckusick@mckusick.com>, current@freebsd.org
Subject:   Re: How a full fsck screwed up my SU+J filesystem
Message-ID:  <20101201142748.GN2392@deviant.kiev.zoral.com.ua>
In-Reply-To: <20101201110008.GA50719@x2.osted.lan>
References:  <1FA8A18C-9350-4C2D-B034-768566ACB718@gmail.com> <20101201110008.GA50719@x2.osted.lan>

On Wed, Dec 01, 2010 at 12:00:08PM +0100, Peter Holm wrote:
> On Wed, Dec 01, 2010 at 01:28:06AM -0800, Garrett Cooper wrote:
> > So... I was doing a portmaster -af today because vlc stopped playing
> > audio (for some reason ... I kind of went on a pkg_cutleaves rampage
> > and probably deinstalled too much stuff), and the machine hardlocked
> > during an upgrade. I did a soft reboot and saw messages along the
> > lines of "your journal and filesystem mount time mismatched; running
> > a full fsck". I figured "ok, sure..." and let it do it's thing.
> > Problem was that it pruned a lot of stuff from my /usr partition --
> > including the .sujournal !!! So now it's stuck at Mounting local file
> > systems: stating:
> >
> > Failed to find journal.   Use tunefs to create one
> > Failed to start journal: 2
> >
> > (I assume the 2 means ENOENT). All of the above were printf(9)'s from
> > the kernel.
> > Now the machine won't continue in multiuser mode (doesn't respond to
> > interrupts, no panic, etc). Going into ddb, I don't see anything in
> > info_threads (just a bunch of references to sched_switch, a few to
> > fork_trampoline, cpustop_handler, and kdb_enter). I'm going to try and
> > massage the machine back to life from single user mode, but the fact
> > that this died in this way (i.e. .sujournal getting nuked by a full
> > fsck) is a bit disheartening for SU+J :(... It would be nice if at
> > least the fsck aborted before going and nuking the journal :/... (or
> > at the very least if the file wasn't removable -- i.e. SF_NOUNLINK).
> > 	Here's to hoping I can resuscitate the filesystem...
> > Thanks,
> > -Garrett
>
> Thank you for reporting this.
>
> I was able to reproduce the problem by:
>
> tunefs -j enable /dev/md5a
> mount /dev/md5a /mnt
> chflags 0 /mnt/.sujournal
> rm -f /mnt/.sujournal
> umount /mnt
> mount /dev/md5a /mnt
>
> The mount(1) is now stuck in mntref.
>
> http://people.freebsd.org/~pho/stress/log/kostik404.txt
>
> A sequence of "tunefs -j disable" + "tunefs -j enable" should get
> you going.

Removing the journal by hand certainly falls into the "do not do it
then" category.

The problem in kostik404 is that ffs_mount() did not clean up the
vnodes instantiated during the mount. Activating the softdep journal
instantiates at least the root vnode, and the journal vnode if one is
found. The following patch fixed it for me.

diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c
index 94951e4..72f40da 100644
--- a/sys/ufs/ffs/ffs_vfsops.c
+++ b/sys/ufs/ffs/ffs_vfsops.c
@@ -928,6 +928,7 @@ ffs_mountfs(devvp, mp, td)
 		if ((fs->fs_flags & FS_DOSOFTDEP) &&
 		    (error = softdep_mount(devvp, mp, fs, cred)) != 0) {
 			free(fs->fs_csp, M_UFSMNT);
+			ffs_flushfiles(mp, FORCECLOSE, td);
 			goto out;
 		}
 		if (fs->fs_snapinum[0] != 0)
