Date:      Mon, 02 Apr 2018 22:23:39 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-fs@FreeBSD.org
Subject:   [Bug 227204] Combination of gmirror and enabled softupdates journalling cause slow filesystem degradation
Message-ID:  <bug-227204-3630-41Scw6sRhR@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-227204-3630@https.bugs.freebsd.org/bugzilla/>
References:  <bug-227204-3630@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227204

Kirk McKusick <mckusick@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Closed
                 CC|                            |mckusick@FreeBSD.org
         Resolution|---                         |Works As Intended

--- Comment #1 from Kirk McKusick <mckusick@FreeBSD.org> ---
This is a problem that is endemic to all overwriting filesystems that use
journalling. Specifically, the journal only checks and corrects things that it
knows need to be fixed. Under normal circumstances it knows about everything
that might be wrong. Unfortunately most disks are run with `write cache
enabled' which means that they can lie about completing writes to stable store.
Specifically they report that a write is on the platter (or in the flash) when
in fact it is only in the disk's volatile cache. If there is a power-fail
event, they are usually able to flush their cache, but not always. Since the
journal has been told that the write completed, it does not check for the
missed write and the corresponding corruption of the filesystem remains until a
full fsck is run (which checks all of the metadata integrity). If the missed
write was an update to a cylinder-group map, then you can end up
double-allocating a block (such as you see in your example). When an attempt is
made to free a double-allocated block you will get a system panic with "freeing
free block".
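
The failure sequence described above can be illustrated with a toy model. This
is only a sketch under stated assumptions, not UFS or soft-updates code: the
CachedDisk class, the key names ("cg_map", "inode_a", "inode_b"), the
four-block filesystem, and the string used as a journal entry are all
hypothetical and exist only for illustration.

```python
class CachedDisk:
    """Toy disk (hypothetical) with a volatile write cache."""

    def __init__(self, initial):
        self.stable = dict(initial)  # contents that survive power loss
        self.cache = {}              # volatile cache, lost on power failure

    def write(self, key, value):
        # The write is acknowledged while it is still only in the cache.
        self.cache[key] = value
        return True

    def read(self, key):
        return self.cache.get(key, self.stable[key])

    def flush(self):
        self.stable.update(self.cache)
        self.cache.clear()

    def power_fail(self):
        # Cached writes that were never flushed are silently dropped.
        self.cache.clear()


def simulate():
    disk = CachedDisk({"cg_map": frozenset(), "inode_a": None, "inode_b": None})
    journal = set()  # operations the journal still considers incomplete

    # Allocate block 0 to file A. The inode update reaches stable storage.
    journal.add("alloc 0 -> A")
    disk.write("inode_a", 0)
    disk.flush()

    # The cylinder-group map update is acknowledged but stays in the cache,
    # so the journal retires the entry as completed.
    if disk.write("cg_map", frozenset({0})):
        journal.discard("alloc 0 -> A")

    disk.power_fail()

    # Journal recovery only examines entries it believes are unfinished.
    entries_to_check = sorted(journal)

    # The allocator trusts the stale map and hands out block 0 again.
    cg_map = disk.read("cg_map")
    free_block = min(b for b in range(4) if b not in cg_map)
    disk.write("inode_b", free_block)
    disk.write("cg_map", cg_map | {free_block})
    disk.flush()

    return entries_to_check, disk.read("inode_a"), disk.read("inode_b")
```

In this model simulate() returns an empty list of journal entries to check,
yet both files claim block 0. Freeing one file marks the block free while the
other still uses it, and freeing the second is the point at which a real
system would hit the "freeing free block" panic.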

Some systems have tried periodically forcing a full fsck (on the order of every
month or so) to catch these types of errors, but the disruption if the reboot
happened during a busy period led them to drop this practice. Still it is a
good idea to periodically run a full fsck just to ensure that your filesystems
stay healthy. If this is not practical you should consider using ZFS which
provides a great deal more redundancy and integrity though requires
considerably more resources (disk + CPU + memory) for a given storage load than
does UFS.
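
To show why a full check catches what journal recovery misses, here is a rough
sketch of a whole-filesystem scan for double-allocated blocks. It is not how
fsck_ffs is implemented; the function name find_duplicate_blocks and the
representation of inodes as a plain dict of block lists are assumptions made
for illustration. The point is only that the scan visits every inode instead
of relying on a record of what might be wrong.

```python
from collections import defaultdict


def find_duplicate_blocks(inodes):
    """Return {block: [inode, ...]} for blocks claimed by more than one inode.

    `inodes` is a hypothetical mapping of inode number -> list of block
    numbers, standing in for the on-disk inode tables.
    """
    owners = defaultdict(list)
    # Walk every inode and record who claims each block.
    for inode_number, blocks in inodes.items():
        for block in blocks:
            owners[block].append(inode_number)
    # Any block with more than one claimant is double-allocated.
    return {block: who for block, who in owners.items() if len(who) > 1}
```

For example, find_duplicate_blocks({2: [10, 11], 3: [11, 12], 4: [13]})
reports that block 11 is claimed by inodes 2 and 3. The cost is proportional
to the total amount of metadata rather than to the size of the journal, which
is why the full check takes long enough to be disruptive.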

I am closing this report with "Works as Intended" as that is the closest to
"This is a known shortcoming of journalled overwriting filesystems".

-- 
You are receiving this mail because:
You are the assignee for the bug.


