Date:      Wed, 1 Dec 2021 11:15:14 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        FreeBSD <freebsd-stable@freebsd.org>
Subject:   ZFS deadlocks triggered by HDD timeouts
Message-ID:  <CAOtMX2hMu7qXqHt5rhi9CBNDRERpWshcF%2BR9N_VQOrYvYFERQg@mail.gmail.com>

On a stable/13 build from 16-Sep-2021 I see frequent ZFS deadlocks
triggered by HDD timeouts.  The timeouts are probably caused by
genuine hardware faults, but they didn't lead to deadlocks in
12.2-RELEASE or 13.0-RELEASE.  Unfortunately I don't have much
additional information.  ZFS's stack traces aren't very informative,
and dmesg doesn't show anything besides the usual information about
the disk timeout.  I don't see anything obviously related in the
commit history for that time range, either.

Has anybody else observed this phenomenon?  Or does anybody have a
good way to deliberately inject timeouts?  CAM makes it easy enough to
inject an error, but not a timeout.  If it did, then I could bisect
the problem.  As it is, I can only reproduce it on production servers.

-Alan