Date:      Wed, 1 Dec 2021 11:24:57 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Alan Somers <asomers@freebsd.org>
Cc:        FreeBSD <freebsd-stable@freebsd.org>
Subject:   Re: ZFS deadlocks triggered by HDD timeouts
Message-ID:  <CANCZdfo7W-eFoQ6X4y0rY=k5in6T7Ledjhes39ToO9ZXLXyVbw@mail.gmail.com>
In-Reply-To: <CAOtMX2hMu7qXqHt5rhi9CBNDRERpWshcF+R9N_VQOrYvYFERQg@mail.gmail.com>
References:  <CAOtMX2hMu7qXqHt5rhi9CBNDRERpWshcF+R9N_VQOrYvYFERQg@mail.gmail.com>

On Wed, Dec 1, 2021, 11:16 AM Alan Somers <asomers@freebsd.org> wrote:

> On a stable/13 build from 16-Sep-2021 I see frequent ZFS deadlocks
> triggered by HDD timeouts.  The timeouts are probably caused by
> genuine hardware faults, but they didn't lead to deadlocks in
> 12.2-RELEASE or 13.0-RELEASE.  Unfortunately I don't have much
> additional information.  ZFS's stack traces aren't very informative,
> and dmesg doesn't show anything besides the usual information about
> the disk timeout.  I don't see anything obviously related in the
> commit history for that time range, either.
>
> Has anybody else observed this phenomenon?  Or does anybody have a
> good way to deliberately inject timeouts?  CAM makes it easy enough to
> inject an error, but not a timeout.  If it did, then I could bisect
> the problem.  As it is I can only reproduce it on production servers.
>

Which SIM is this? Timeouts are tricky because they have many sources, some
of which are nonlocal.

Warner
