Date:      Thu, 22 Jul 2010 11:21:45 -0400
From:      "Mikhail T." <mi+thun@aldan.algebra.com>
To:        Kirk McKusick <mckusick@mckusick.com>
Cc:        fs@freebsd.org
Subject:   Re: background fsck considered harmful? (Re: panic: handle_written_inodeblock: bad size)
Message-ID:  <4C486209.7050402@aldan.algebra.com>
In-Reply-To: <201007220335.o6M3Z1ZT062733@chez.mckusick.com>
References:  <201007220335.o6M3Z1ZT062733@chez.mckusick.com>

On 21.07.2010 23:35, Kirk McKusick wrote:
> Foreground fsck checks all the disk
> metadata every time, so hard disk errors are captured immediately
> before they have had a chance to accumulate. But background fsck
> users blame it because it has not found them.
>    
I don't blame the program itself -- not if it was deliberately /designed/ to 
only do partial checking. However, I was under the impression that 
background fsck was meant to do the same job as the "real" one, and 
that whenever it did not, it was simply a bug in the /implementation/.

I suspect this misconception is shared by plenty of other users... 
Indeed, even if an inquisitive admin wanted to find out, fsck(8) gives 
absolutely no warning to that effect -- it simply states that 
background fsck will be attempted whenever possible.
> If you have small disk systems, running foreground fsck is an
> acceptable solution (and indeed I would recommend it). But when
> you are running systems with 20Tb of disks, you are not willing
> to have your system down for 10 hours after every crash.
>
> A reasonable intermediate solution is to use background fsck by
> default, but schedule down time to run a full fsck once a month
> or so to check for accumulated hard disk errors.
>    
Maybe filesystems smaller than, say, 100GB (a default threshold, subject to 
the admin's adjustment) should always be fsck-ed in the foreground? This 
should at least cover the system filesystems (such as / and /var) on 
typical installations...
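
To make the size rule concrete, here is a minimal sketch of the decision in C. It is an illustration only: the function name, the macro, and the convention that a threshold of 0 disables the rule are all hypothetical, and nothing like this exists in fsck(8) today.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical default threshold of 100GB, in bytes.
 * This is NOT an existing fsck(8) or rc.conf tunable.
 */
#define FG_FSCK_DEFAULT_THRESHOLD (100ULL * 1024 * 1024 * 1024)

/*
 * Return true when a filesystem of fs_bytes should be checked in the
 * foreground rather than in the background.
 * Assumption: a threshold of 0 means the admin disabled the size rule.
 */
static bool
prefer_foreground_fsck(uint64_t fs_bytes, uint64_t threshold_bytes)
{
	if (threshold_bytes == 0)
		return (false);
	/* Strictly smaller than the threshold: check in the foreground. */
	return (fs_bytes < threshold_bytes);
}
```

With the default, an 8GB / would be checked in the foreground, while a 20TB data volume would still be left to background fsck.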

A stern warning should also be issued whenever a background fsck is 
attempted -- for whatever reason. Something like:

    background fsck, although faster, may be unable to detect certain
    rare forms of filesystem corruption. You are advised to perform a
    full fsck on %s on a regular basis. See fsck(8).

should go into the right place under fsck_ffs/ -- I am not sure where exactly...
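
As a rough sketch of how the %s in the warning text above might be filled in, the helper below formats the message into a caller-supplied buffer with snprintf(3). The helper name is hypothetical and this is not code from fsck_ffs; where such a call would actually belong in the sources remains an open question.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the proposed warning for filesystem
 * fsname into buf. Returns what snprintf(3) returns, i.e. the length
 * the full message needs, so a result >= len means truncation.
 */
static int
format_bgfsck_warning(char *buf, size_t len, const char *fsname)
{
	return (snprintf(buf, len,
	    "background fsck, although faster, may be unable to detect certain\n"
	    "rare forms of filesystem corruption. You are advised to perform a\n"
	    "full fsck on %s on a regular basis. See fsck(8).\n",
	    fsname));
}
```

A caller would pass the device or mount point being checked (for example "/var") and print the buffer to the console or hand it to the system log.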

Below is a simple patch for the top-level fsck(8). Somebody more 
knowledgeable about the details should augment fsck_ffs(8) -- it currently 
lists the inconsistencies checked for without mentioning the 
difference in coverage between full and background modes...

    diff -U 2 -r1.38.2.1 fsck.8
    --- fsck.8      3 Aug 2009 08:13:06 -0000       1.38.2.1
    +++ fsck.8      22 Jul 2010 15:19:25 -0000
    @@ -170,4 +170,12 @@
      When running in background mode,
      only one file system at a time will be checked.
    +
    +.Sy Warning:
    +because background fsck is performed while the filesystem
    +is in use, it is limited to checking for only the most commonly
    +occurring filesystem abnormalities. Under certain circumstances,
    +some errors can escape background fsck. It is recommended that you
    +perform full fsck on your systems once in a while -- or whenever
    +you encounter filesystem-related panics.
      .It Fl t Ar fstype
      Invoke

Yours,

    -mi




