Date: Wed, 21 Jul 2010 20:35:01 -0700
From: Kirk McKusick <mckusick@mckusick.com>
To: "Mikhail T." <mi+thun@aldan.algebra.com>
Cc: fs@freebsd.org
Subject: Re: background fsck considered harmful? (Re: panic: handle_written_inodeblock: bad size)
Message-ID: <201007220335.o6M3Z1ZT062733@chez.mckusick.com>
In-Reply-To: <4C476370.6030907@aldan.algebra.com>

> Date: Wed, 21 Jul 2010 17:15:28 -0400
> From: "Mikhail T." <mi+thun@aldan.algebra.com>
> Organization: Virtual Estates, Inc.
> To: Kirk McKusick <mckusick@mckusick.com>
> Cc: fs@freebsd.org
> Subject: Re: background fsck considered harmful?
>
> 21.07.2010 16:15, Kirk McKusick:
> > Certainly disabling background fsck will eliminate that from your
> > possible set of issues and may prevent a recurrence. It does mean
> > that after a crash you will have to wait while your filesystems
> > are checked before your system will come up. If your filesystems
> > are below 0.5Tb that should be tolerable.
> >
> > The longer term solution is to use journaled soft updates when they
> > become available in 9.0.
>
> We are about to ship 8.1 -- with background fsck enabled by default
> possibly causing problems requiring far more admin time (and involving
> real data-loss).
>
> If the existing fsck can not be improved to properly fix the fs, when
> running in background mode, just as well as when it is running
> pre-mount, then, IMHO, it should not be enabled by default.
>
> Crashes are quite rare and waiting once in a while for fsck to rumble
> through would be better, than to have some people enter into a vicious
> circle of mysterious panics (even if Jeremy's ongoing work makes them
> slightly less mysterious).
>
> Respectfully yours,
>
>     -mi

I believe that you are being excessively harsh on background fsck.
Generally the problems are caused by hard-disk errors. Because
background fsck only checks a small subset of the disk, it does not
find them, and so when they eventually accumulate enough they cause
difficult problems. Foreground fsck checks all the disk metadata
every time, so hard-disk errors are captured immediately, before they
have had a chance to accumulate. But background fsck users blame it
because it has not found them.

If you have small disk systems, running foreground fsck is an
acceptable solution (and indeed I would recommend it). But when
you are running systems with 20Tb of disks, you are not willing
to have your system down for 10 hours after every crash. A
reasonable intermediate solution is to use background fsck by
default, but schedule down time to run a full fsck once a month
or so to check for accumulated hard-disk errors.

	Kirk McKusick

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201007220335.o6M3Z1ZT062733>