Date:      Tue, 16 Aug 2016 21:02:34 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-fs@FreeBSD.org
Subject:   [Bug 211013] Write error to UFS filesystem with softupdates panics machine
Message-ID:  <bug-211013-3630-1vEOM1gQh8@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-211013-3630@https.bugs.freebsd.org/bugzilla/>
References:  <bug-211013-3630@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211013

--- Comment #1 from commit-hook@freebsd.org ---
A commit references this bug:

Author: mckusick
Date: Tue Aug 16 21:02:30 UTC 2016
New revision: 304239
URL: https://svnweb.freebsd.org/changeset/base/304239

Log:
  Bug 211013 reports that a write error to a UFS filesystem running
  with softupdates panics the kernel. The problem arises when there
  is a transient write error on certain metadata blocks: directory
  blocks (PAGEDEP), inode blocks (INODEDEP), indirect pointer blocks
  (INDIRDEPS), and cylinder group blocks (BMSAFEMAP, but only when
  journaling is enabled). When the write is retried, one of the
  routines called by softdep_disk_io_initiation panics because the
  I/O is "already started".

  These dependency types potentially need to do roll-backs when called
  by softdep_disk_io_initiation before doing a write and then a
  roll-forward when called by softdep_disk_write_complete after the
  I/O completes.  The panic happens when there is a transient error.
  At the top of softdep_disk_write_complete we check to see if the
  write had an error and if an error occurred we just return.  This
  return is correct most of the time because the main role of the routines
  called by softdep_disk_write_complete is to process the now-completed
  dependencies so that the next I/O steps can happen.

  But for the four types listed above, returning early means they do
  not get to do their roll-forward operations. This causes the panic
  when softdep_disk_io_initiation gets called on the second attempt
  to do the write and the roll-back routines find that the roll-backs
  have already been done. As an aside, I note that there is also the
  problem that the buffer will have been unlocked and thus made
  visible to the filesystem and to user applications with the
  roll-backs still in place.

  The way to resolve the problem is to add a flag (WRITESUCCEEDED) to
  the routines called by softdep_disk_write_complete for the four
  dependency types noted, indicating whether the write was successful.
  If the write does not succeed, they just undo the roll-backs (that
  is, roll forward) and then return. If the write was successful, they
  also do their usual processing of the now-completed dependencies.
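
  The flag-based approach described above can be illustrated with a
  small user-space sketch. This is not the code in ffs_softdep.c: the
  toy_dep structure, the toy_* functions, and the flag value are all
  hypothetical stand-ins, and a boolean return takes the place of the
  kernel panic. It only shows the idea that the completion routine
  always undoes the roll-back, and processes dependencies only when
  the write succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value; only the name comes from the commit log. */
#define WRITESUCCEEDED 0x0001

/* Toy stand-in for a dependency structure; not the kernel's layout. */
struct toy_dep {
    int  live_value;      /* value the in-memory buffer should expose */
    int  safe_value;      /* older value that is safe to put on disk */
    int  buf_value;       /* what the buffer currently holds */
    bool io_started;      /* true while a roll-back is in place */
    bool deps_processed;  /* true once post-write processing has run */
};

/*
 * Before the write: roll the buffer back to the safe value.
 * Returns false in the case where the kernel would panic with
 * "already started".
 */
bool toy_initiate(struct toy_dep *dep)
{
    if (dep->io_started)
        return false;
    dep->buf_value = dep->safe_value;
    dep->io_started = true;
    return true;
}

/*
 * After the write: always undo the roll-back, but process the
 * now-completed dependencies only if the write succeeded.
 */
void toy_write_complete(struct toy_dep *dep, int flags)
{
    dep->buf_value = dep->live_value;
    dep->io_started = false;
    if ((flags & WRITESUCCEEDED) == 0)
        return;
    dep->deps_processed = true;
}

/* A failed write followed by a retry; returns 1 if the retry is clean. */
int toy_retry_after_error(void)
{
    struct toy_dep dep = { .live_value = 2, .safe_value = 1, .buf_value = 2 };

    if (!toy_initiate(&dep))
        return 0;
    toy_write_complete(&dep, 0);               /* transient write error */
    if (dep.deps_processed || dep.buf_value != 2)
        return 0;
    if (!toy_initiate(&dep))                   /* retry the write */
        return 0;
    toy_write_complete(&dep, WRITESUCCEEDED);
    return dep.deps_processed && dep.buf_value == 2;
}
```

  In this sketch, an early return placed before the first two
  statements of toy_write_complete would leave io_started set after
  the failed write, and the retry's call to toy_initiate would then
  fail, which corresponds to the panic described in this log.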

  The fix was tested by selectively injecting write errors for buffers
  holding dependencies of each of the four types noted above and then
  verifying that the kernel no longer panicked and that, following the
  successful retry of the write, the filesystem could be unmounted
  and checked cleanly.

  PR: 211013
  Reviewed by: kib

Changes:
  head/sys/ufs/ffs/ffs_softdep.c
  head/sys/ufs/ffs/softdep.h

-- 
You are receiving this mail because:
You are the assignee for the bug.