Date: Thu, 22 Jan 2004 23:08:23 +1100 (EST)
From: Bruce Evans
To: Matthias Andree
cc: freebsd-current@freebsd.org
Subject: Re: How to fsck -CURRENT on next reboot [ext2fs]
Message-ID: <20040122215703.E8399@gamplex.bde.org>

On Wed, 21 Jan 2004, Matthias Andree wrote:

> To have your FreeBSD -CURRENT fsck all ufs file systems at next reboot, do:
>
> 1. mount_ext2fs an ext2 or ext3 file system for read/write
> 2. change a file (or touch or rm one)
> 3. umount that ext2fs again
>
> 4. See how it complains about giving up on fsync first for the ext2fs, then
>    for devfs.

This hopefully only happens after mount -u to convert from ro to rw.  It
happens because GEOM has enforced the open mode on the disk device for a
long time, but ext2fs is still missing the hack to always open rw in case
mount -u is used to convert from ro to rw.
This gives lots of unwriteable buffers whose write is retried endlessly
every second or 2 but never gets as far as the driver, and strange results
for subsequent accesses to the file since the buffers are not invalid.

This was fixed in msdosfs about 6 months ago.  It was apparently more
serious there because there was no delay between the retries.

The following patch doesn't merge the comments about this from ffs
because they have some style bugs.

%%%
Index: ext2_vfsops.c
===================================================================
RCS file: /home/ncvs/src/sys/gnu/ext2fs/ext2_vfsops.c,v
retrieving revision 1.114
diff -u -2 -r1.114 ext2_vfsops.c
--- ext2_vfsops.c	5 Nov 2003 11:56:58 -0000	1.114
+++ ext2_vfsops.c	22 Jan 2004 10:39:38 -0000
@@ -656,5 +656,9 @@
 	ronly = (mp->mnt_flag & MNT_RDONLY) != 0;
 	vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY, td);
+#ifdef notyet
 	error = VOP_OPEN(devvp, ronly ? FREAD : FREAD|FWRITE, FSCRED, td, -1);
+#else
+	error = VOP_OPEN(devvp, FREAD|FWRITE, FSCRED, td, -1);
+#endif
 	VOP_UNLOCK(devvp, 0, td);
 	if (error)
@@ -736,5 +740,9 @@
 	if (bp)
 		brelse(bp);
+#ifdef notyet
 	(void)VOP_CLOSE(devvp, ronly ? FREAD : FREAD|FWRITE, NOCRED, td);
+#else
+	(void)VOP_CLOSE(devvp, FREAD|FWRITE, NOCRED, td);
+#endif
 	if (ump) {
 		bsd_free(ump->um_e2fs->s_es, M_EXT2MNT);
@@ -791,6 +799,10 @@
 
 	ump->um_devvp->v_rdev->si_mountpoint = NULL;
+#ifdef notyet
 	error = VOP_CLOSE(ump->um_devvp, ronly ? FREAD : FREAD|FWRITE,
 		NOCRED, td);
+#else
+	error = VOP_CLOSE(ump->um_devvp, FREAD|FWRITE, NOCRED, td);
+#endif
 	vrele(ump->um_devvp);
 	bsd_free(fs->s_es, M_EXT2MNT);
%%%

> 5. Now reboot.
>
> 6. syncher gives up on 3 buffers (I have / /var and /usr - coincidence?)
>
> 7. At boot, fsck complains about non-cleanly umounted file systems

Syncing on reboot was broken in rev.1.14 of ext2fs/fs.h and associated changes.
This undoes the quick fix for the previous implementation of the bug:

bug: rev.1.1: never release the buffers before ext2_umount()
fix: rev.1.3: half release the buffers using B_LOCKED
     rev.1.4: release the buffers some more by not setting B_DELWRI on them
     rev.1.6: be more careful about setting B_DELWRI
bug: rev.1.14: half release the buffers using BUF_KERNPROC() instead of
     B_LOCKED.  This has the disadvantage of not actually working.

> For added fun, try instead of 5., try:
>
> 5B. umount -f on the ext2fs that you modified instead of reboot
> 6B. See the kernel panic
> 7B. savecore logs "reboot after panic: vinvalbuf: dirty bufs"

I didn't test this.  Do you get the "fsync: giving up on dirty" message in
this case?

I don't see how vinvalbuf can panic exactly.  However, it seems to have a
race:

% 		while (vp->v_numoutput) {
% 			vp->v_iflag |= VI_BWAIT;
% 			error = msleep(&vp->v_numoutput, VI_MTX(vp),
% 			    slpflag | (PRIBIO + 1), "vinvlbuf", slptimeo);
% 			if (error) {
% 				VI_UNLOCK(vp);
% 				return (error);
% 			}
% 		}

This seems OK, except it may fail, and both ffs and ext2fs panic in
reload if it fails there.

% 		if (!TAILQ_EMPTY(&vp->v_dirtyblkhd)) {
% 			VI_UNLOCK(vp);
% 			if ((error = VOP_FSYNC(vp, cred, MNT_WAIT, td)) != 0)
% 				return (error);

This is very likely to fail, since we have bogusly unwriteable blocks,
but if it fails that VOP_FSYNC() (= vop_stdfsync() + an update for ext2fs)
fails and we don't panic here at least.

% 			/*
% 			 * XXX We could save a lock/unlock if this was only
% 			 * enabled under INVARIANTS
% 			 */
% 			VI_LOCK(vp);

There seems to be a window in which new i/o can be started.  This may be
more likely because we have lots of bogusly unwriteable blocks to retry.

% 			if (vp->v_numoutput > 0 ||
% 			    !TAILQ_EMPTY(&vp->v_dirtyblkhd))
% 				panic("vinvalbuf: dirty bufs");

This panic somehow occurred.

% 		}

There is still the fundamental problem that there is no way to discard
unwriteable blocks (either bogus ones or ones with actual i/o errors).
Even vinvalbuf() doesn't really understand them.
In the umount -f case, it should do something like direct VOP_FSYNC() to
only retry a limited number of times, then forcibly discard them if they
are still unwriteable.

> e2fsprogs (with mke2fs, the ext2fs newfs tool) are available as port
> from http://mandree.home.pages.de/freebsd/e2fsprogs/ while Tytso
> (e2fsprogs maintainer) reviews my changes - feedback on my port is
> appreciated. The e2fsprogs stuff from official ports doesn't work
> because it assumes that block devices are buffered.

About time this was fixed :-).

Bruce
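
The bounded-retry idea for the umount -f case can be sketched in user
space.  This is only an illustration under stated assumptions, not kernel
code and not part of the patch: struct fake_buf, flush_pass() and
forced_flush() are hypothetical names that do not exist in FreeBSD, and
the "writable" flag merely stands in for whether a write would ever get
as far as the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a buffer; this is NOT struct buf. */
struct fake_buf {
	bool dirty;	/* contents still need to be written */
	bool writable;	/* false models a "bogusly unwriteable" block */
};

/*
 * Hypothetical helper: attempt one write pass over all dirty buffers.
 * Returns the number of buffers that are still dirty afterwards.
 */
size_t
flush_pass(struct fake_buf *bufs, size_t n)
{
	size_t i, remaining = 0;

	for (i = 0; i < n; i++) {
		if (!bufs[i].dirty)
			continue;
		if (bufs[i].writable)
			bufs[i].dirty = false;	/* write succeeded */
		else
			remaining++;		/* write can never succeed */
	}
	return (remaining);
}

/*
 * Hypothetical forced flush: retry at most max_tries passes, then
 * discard whatever is still dirty instead of retrying forever.
 * Returns the number of buffers that had to be discarded.
 */
size_t
forced_flush(struct fake_buf *bufs, size_t n, int max_tries)
{
	size_t i, remaining = 0;
	int attempt;

	assert(max_tries > 0);
	for (attempt = 0; attempt < max_tries; attempt++) {
		remaining = flush_pass(bufs, n);
		if (remaining == 0)
			return (0);	/* everything was written */
	}
	/* Still unwriteable after the retry limit: drop the data. */
	for (i = 0; i < n; i++)
		bufs[i].dirty = false;
	return (remaining);
}
```

With one writable and one unwriteable dirty buffer, forced_flush() writes
the first, gives up on the second after max_tries passes, and reports one
discarded buffer; nothing is left dirty for a later check to trip over.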