From nobody Fri Nov  5 19:05:25 2021
X-Original-To: freebsd-fs@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 9880C1846D9C
	for <freebsd-fs@mlmmj.nyi.freebsd.org>; Fri,  5 Nov 2021 19:05:31 +0000 (UTC)
	(envelope-from cross+freebsd@distal.com)
Received: from relay.wiredblade.com (relay.wiredblade.com [168.235.95.80])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id 4Hm9232mVPz3pvH
	for <freebsd-fs@freebsd.org>; Fri,  5 Nov 2021 19:05:31 +0000 (UTC)
	(envelope-from cross+freebsd@distal.com)
Received: from mail.distal.com (pool-108-48-165-176.washdc.fios.verizon.net [108.48.165.176])
	by relay.wiredblade.com with ESMTPSA
	(version=TLSv1.2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256)
	; Fri, 5 Nov 2021 19:05:29 +0000
Received: from smtpclient.apple (<unknown> [2001:420:c0c4:1005::15])
	by tristain.distal.com (OpenSMTPD) with ESMTPSA id ba53751d (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256:NO)
	for <freebsd-fs@freebsd.org>;
	Fri, 5 Nov 2021 15:05:28 -0400 (EDT)
From: Chris Ross <cross+freebsd@distal.com>
Content-Type: text/plain;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
List-Help: <mailto:freebsd-fs+help@freebsd.org>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Subscribe: <mailto:freebsd-fs+subscribe@freebsd.org>
List-Unsubscribe: <mailto:freebsd-fs+unsubscribe@freebsd.org>
Sender: owner-freebsd-fs@freebsd.org
Mime-Version: 1.0 (Mac OS X Mail 15.0 \(3693.20.0.1.32\))
Subject: Re: ZFS operations hanging, but no visible errors?
Date: Fri, 5 Nov 2021 15:05:25 -0400
References: <B28E52F4-F475-4CF6-BE0C-F5C803AD5757@distal.com>
 <20211105173935.7aa53269@fabiankeil.de>
 <86999084-7007-4F08-A4C4-4A835A7E1C78@distal.com>
 <CAOeNLur6Db=vcwMS_0Tmwdim6JGFX-0sYLd=hOjgZzBLBEm4fQ@mail.gmail.com>
 <0AABEDF8-665F-465F-9792-0B0BE6CAE97F@distal.com>
To: freebsd-fs <freebsd-fs@freebsd.org>
In-Reply-To: <0AABEDF8-665F-465F-9792-0B0BE6CAE97F@distal.com>
Message-Id: <33253C33-9A8D-403B-A7E9-C511EA4ED34A@distal.com>
X-Mailer: Apple Mail (2.3693.20.0.1.32)
X-Rspamd-Queue-Id: 4Hm9232mVPz3pvH
X-Spamd-Bar: ----
Authentication-Results: mx1.freebsd.org;
	none
X-Spamd-Result: default: False [-4.00 / 15.00];
	 REPLY(-4.00)[];
	 TAGGED_FROM(0.00)[freebsd]
X-ThisMailContainsUnwantedMimeParts: N



> On Nov 5, 2021, at 13:12, Chris Ross <cross+freebsd@distal.com> wrote:
>
> Okay.  Despite everything I had running being stuck I was able to log into the console, and coincidentally or not, things have now recovered.  Well, the old commands/sessions didn’t, but I can log in again.  I can’t get to the tmux session it seems, but.
>
> I’m able to run that sysctl, which has a lot of data.  The last records all about two hours ago are:
>
> 1636125429   metaslab.c:2538:metaslab_unload(): metaslab_unload: txg 1033689, spa tank, vdev_id 1, ms_id 854, weight 780000000000001, selected txg 1033574 (601067 ms ago), alloc_txg 1033313, loaded 5902891 ms ago, max_size 2147475456
> 1636125429   metaslab.c:2538:metaslab_unload(): metaslab_unload: txg 1033689, spa tank, vdev_id 2, ms_id 88, weight 880000000000001, selected txg 1033574 (601067 ms ago), alloc_txg 1020497, loaded 864138 ms ago, max_size 17179869184
> 1636125429   metaslab.c:2538:metaslab_unload(): metaslab_unload: txg 1033689, spa tank, vdev_id 1, ms_id 859, weight 780000000000001, selected txg 1033574 (601067 ms ago), alloc_txg 1033029, loaded 2201252 ms ago, max_size 2147475456
> 1636125429   metaslab.c:2538:metaslab_unload(): metaslab_unload: txg 1033689, spa tank, vdev_id 1, ms_id 860, weight 780000000000001, selected txg 1033574 (601067 ms ago), alloc_txg 1033229, loaded 3395548 ms ago, max_size 2147303424
> 1636125429   metaslab.c:2538:metaslab_unload(): metaslab_unload: txg 1033689, spa tank, vdev_id 1, ms_id 863, weight 7c0000000000001, selected txg 1033574 (601067 ms ago), alloc_txg 1033448, loaded 4046753 ms ago, max_size 4294926336
>
> Not sure if that helps….
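
As an aside, the quoted records are regular enough to pick apart
mechanically when comparing metaslabs.  A minimal Python sketch,
assuming every line has exactly the layout of the five records quoted
above; the names RECORD and parse_unload are illustrative only and are
not part of ZFS, and the sample line is copied from the quoted output
rather than read from the sysctl:

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the five quoted records only (an assumption);
# any line with a different layout simply does not match.
RECORD = re.compile(
    r"(?P<ts>\d+)\s+metaslab\.c:\d+:metaslab_unload\(\): metaslab_unload: "
    r"txg (?P<txg>\d+), spa (?P<spa>\S+), vdev_id (?P<vdev_id>\d+), "
    r"ms_id (?P<ms_id>\d+), weight (?P<weight>[0-9a-f]+), "
    r"selected txg (?P<selected_txg>\d+) \((?P<selected_ms_ago>\d+) ms ago\), "
    r"alloc_txg (?P<alloc_txg>\d+), loaded (?P<loaded_ms_ago>\d+) ms ago, "
    r"max_size (?P<max_size>\d+)"
)


def parse_unload(line):
    """Return a dict of fields from one metaslab_unload record, or None."""
    m = RECORD.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    for key in list(rec):
        # The pool name and the hex weight stay as strings.
        if key not in ("spa", "weight"):
            rec[key] = int(rec[key])
    # The leading number is treated as seconds since the Unix epoch.
    rec["when"] = datetime.fromtimestamp(rec["ts"], tz=timezone.utc)
    return rec


sample = (
    "1636125429   metaslab.c:2538:metaslab_unload(): metaslab_unload: "
    "txg 1033689, spa tank, vdev_id 2, ms_id 88, weight 880000000000001, "
    "selected txg 1033574 (601067 ms ago), alloc_txg 1020497, "
    "loaded 864138 ms ago, max_size 17179869184"
)
rec = parse_unload(sample)
print(rec["when"].isoformat(), rec["vdev_id"], rec["ms_id"],
      rec["max_size"] // 2**30, "GiB")
```

Because the function uses search() rather than match(), a leading "> "
quote prefix on the line does not get in the way.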

Okay.  Following up just to close out the “active” state of the issue.
The machine became unresponsive again moments after the above.  The
kernel was functional, as I was able to switch between multiple virtual
consoles, but logging in only yielded a “Last login” line, then nothing
else.  Carriage returns were echoed on the consoles, but nothing else
happened.

I issued a Ctrl-Alt-Delete, and it began stopping things, then hit the
90-second watchdog timeout and reported that it was terminating the
shutdown abnormally.  The kernel did eventually report “All buffers
synced.”, then nothing else.

After about 10 minutes, I tried Ctrl-Alt-Delete again, and then
power-cycled the box.

I’d still be interested in hearing any theories about what happened,
but I no longer have the device in this state to test.

                - Chris