Date: Thu, 23 Aug 2007 15:02:24 +1000 (EST)
From: Ian Smith
To: Chris
Cc: freebsd-questions@freebsd.org, Bill Moran
In-Reply-To: <3aaaa3a0708220716m5601bb4ewc8688225291ae7bd@mail.gmail.com>
Subject: Re: fsck strangeness

On Wed, 22 Aug 2007, Chris wrote:
> On 20/08/07, Ian Smith wrote:
> > Sorry for the repeat post folks, but I goofed last time, leaving out the
> > subject line while replying to the digest. Still curious .. Ian
> > =======
> >
> > On Sat, 18 Aug 2007 21:32:28 +0200 Erik Trulsson wrote:
> > > On Sat, Aug 18, 2007 at 08:21:42PM +0100, Christopher Key wrote:
> > > > Hello,
> > > >
> > > > I'm having some rather strange behaviour with fsck.
> > > >
> > > > When I boot the system, it asserts that all the file systems are clean, but
> > > > subsequently running an fsck on /dev/ad8s1e (mounted as /var) detects
> > > > errors.
> > > > Even if this first check is run whilst the file system is mounted,
> > > > and is hence run in NO WRITE mode, a second check doesn't find block
> > > > errors. If I then unmount the file system and check the disk, it's fine,
> > > > as indeed it is if I unmount, remount, then check. However, if I then
> > > > reboot, the process repeats, and an fsck immediately after reboot will find
> > > > errors again. If I bring the system up in single user mode, and run fsck
> > > > either before or after mounting /var, it finds no errors.
> > > >
> > > > I'm running 6.2_RELEASE with a custom kernel based upon generic-smp, but
> > > > with a lot of unnecessary bits removed, and geom_mirror compiled in. I
> > > > don't think it's the drive that's at fault, all the other partitions in the
> > > > slice are fine, it's a fairly new drive, and it passes a self test quite
> > > > happily. Included below is a transcript that attempts to show what's going
> > > > on in detail, is there anything else relevant?
> > > >
> > > > Can anyone suggest what might be going on and how to fix it, or suggest
> > > > some slightly better diagnostics? Apologies if this is an RTFM issue, I
> > > > have had a good dig through the handbook, but can't seem to find anything
> > > > that helps.
> >
> > > Running fsck on a file system that has been mounted read/write will almost
> > > always report spurious errors and can really screw up the disk if it tries
> > > to 'correct' those errors.
> >
> > I'm a bit confused by this. I've been running 'fsck -n' over FreeBSD
> > systems since 2.2.6, and modulo seeing the at-the-time inconsistencies
> > on those filesystems in /etc/fstab that are mounted, as Chris reported
> > and as are expected, I've never had a problem with it, nor seen the sort
> > of inconsistent results between runs that Chris is reporting.
> >
> > > You should normally not run fsck on a mounted filesystem and you should
> > > *NEVER* run fsck on a filesystem that has been mounted read/write.
> >
> > This seems to imply that using the -n switch may have different results
> > than not using it and having fsck determine 'NO WRITE' itself from the
> > fact that it's noticed that the fs is mounted? Are you suggesting by
> > "can really screw up the disk if it tries to 'correct' those errors"
> > that fsck might WRITE to a mounted fs that it's showing as 'NO WRITE'?
> >
> > I've never had any screwups with it, but then I've always specified -n.
> >
> > Later Bill Moran said:
> >
> > > Don't run fsck on mounted filesystems unless they're mounted read-only.
> > >
> > > Although, it's possible I misunderstood your description of the problem.
> >
> > so I'm still curious, and am wondering if Chris using SMP kernel and/or
> > geom_mirror might have anything to do with this? Or whether his use of
> > 'umount -f' might be (or cause) the problem indicated by his results?
> >
> > > > # umount -f /var
> > > >
> > > >
> > > > # mount /var

> If it's bad to run fsck on a filesystem mounted read/write, then why does
> background fsck do it? Or are you talking about foreground fsck only?

Well I was referring to foreground fsck, and I still don't know why
running it on a mounted fs is 'bad' when fsck runs in 'NO WRITE' mode
anyway when it finds a fs is mounted, hence my query above.

My knowledge of this is thin, despite reading McKusick's paper through
several times, but we're told that background fsck runs on a snapshot
of the fs concerned. How any bg fsck corrections are woven back into
the live fs later is still a mystery to me, but that's because I still
have only a barely superficial understanding of how snapshots work ..

I still feel that your 'umount -f /var' seems potentially hairy, but
can't say if that might explain the behaviour you were reporting.

Cheers, Ian
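
Since the thread turns on whether a filesystem is mounted read/write or read-only when fsck is run, here is a minimal sketch of checking that first. The helper name `mount_mode` is hypothetical, and it assumes the usual FreeBSD `mount` output format (`/dev/ad8s1e on /var (ufs, local, soft-updates)`, with `read-only` listed among the options for read-only mounts) and mount points without spaces. It only reports the mode and does not run fsck itself.

```shell
#!/bin/sh
# mount_mode (hypothetical helper): read `mount`-style output on stdin and
# print "rw", "ro" or "unmounted" for the mount point given as $1.
# Assumes lines of the form: <device> on <mountpoint> (<options>)
mount_mode() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp {
            found = 1
            # A read-only mount is assumed to list "read-only" in its options.
            mode = ($0 ~ /\(.*read-only.*\)/) ? "ro" : "rw"
        }
        END { print (found ? mode : "unmounted") }
    '
}

# Typical use (not executed here):
#   mount | mount_mode /var
```

If this prints `rw` for /var, then the advice quoted above applies: a foreground check should be limited to 'fsck -n', and any inconsistencies it reports may simply reflect the live state of the filesystem at that moment.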