Date:      Wed, 21 Jul 2010 20:35:01 -0700
From:      Kirk McKusick <mckusick@mckusick.com>
To:        "Mikhail T." <mi+thun@aldan.algebra.com>
Cc:        fs@freebsd.org
Subject:   Re: background fsck considered harmful? (Re: panic: handle_written_inodeblock: bad size) 
Message-ID:  <201007220335.o6M3Z1ZT062733@chez.mckusick.com>
In-Reply-To: <4C476370.6030907@aldan.algebra.com> 

> Date: Wed, 21 Jul 2010 17:15:28 -0400
> From: "Mikhail T." <mi+thun@aldan.algebra.com>
> Organization: Virtual Estates, Inc.
> To: Kirk McKusick <mckusick@mckusick.com>
> Cc: fs@freebsd.org
> Subject: Re: background fsck considered harmful?
> 
> 21.07.2010 16:15, Kirk McKusick:
> > Certainly disabling background fsck will eliminate that from your
> > possible set of issues and may prevent a recurrence. It does mean
> > that after a crash you will have to wait while your filesystems
> > are checked before your system will come up. If your filesystems
> > are below 0.5TB that should be tolerable.
> >
> > The longer term solution is to use journaled soft updates when they
> > become available in 9.0.
>    
> We are about to ship 8.1 -- with background fsck enabled by default 
> possibly causing problems requiring far more admin time (and involving 
> real data-loss).
> 
> If the existing fsck cannot be improved to fix the fs properly when 
> running in background mode, just as well as when it is running 
> pre-mount, then, IMHO, it should not be enabled by default.
> 
> Crashes are quite rare, and waiting once in a while for fsck to rumble 
> through would be better than having some people enter a vicious 
> circle of mysterious panics (even if Jeremy's ongoing work makes them 
> slightly less mysterious).
> 
> Respectfully yours,
> 
>     -mi

I believe that you are being excessively harsh on background fsck.
Generally the problems are caused by hard disk errors. Because
background fsck checks only a small subset of the disk, it does
not find them, and once enough of them accumulate they cause
difficult problems. Foreground fsck checks all the disk metadata
every time, so hard disk errors are caught immediately, before
they have had a chance to accumulate. Users of background fsck
then blame it for not having found them.

If you have small disk systems, running foreground fsck is an
acceptable solution (and indeed I would recommend it). But when
you are running systems with 20TB of disks, you are not willing
to have your system down for 10 hours after every crash.

A reasonable intermediate solution is to use background fsck by
default, but to schedule downtime about once a month to run a full
fsck that checks for accumulated hard disk errors.
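
As a minimal sketch of that arrangement, assuming the stock
background_fsck variable from rc.conf(5): the fragment below keeps
background checking enabled at boot. The commented-out command shows one
way a forced full check might be run during the scheduled downtime; the
device name /dev/ada0p2 is hypothetical, and the filesystem is assumed
to be unmounted (for example, in single-user mode).

```shell
# /etc/rc.conf fragment (sh syntax): keep background fsck on by default.
# Setting this to "NO" instead forces a foreground check before the
# system comes up after a crash.
background_fsck="YES"

# During scheduled downtime, with the filesystem unmounted, a full
# forced check could be run by hand. Hypothetical device name:
#   fsck_ffs -f -y /dev/ada0p2
```

On systems with small disks, where a foreground check is tolerable,
the same variable set to "NO" gives the foreground behavior on every
boot after a crash.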

	Kirk McKusick


