Date:      Thu, 24 Aug 2023 12:22:00 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        Current FreeBSD <freebsd-current@freebsd.org>
Subject:   Re: ZFS deadlock in 14
Message-ID:  <40D0C681-C28B-47C2-B913-90A56CFD69D4@yahoo.com>
In-Reply-To: <1AC87B79-6B65-402B-B65F-CCFFCC503861@yahoo.com>
References:  <4FFAE432-21FE-4462-9162-9CC30A5D470A.ref@yahoo.com> <4FFAE432-21FE-4462-9162-9CC30A5D470A@yahoo.com> <b8f819f0-9f6d-af87-64f9-35e37fbf4b2c@FreeBSD.org> <1AC87B79-6B65-402B-B65F-CCFFCC503861@yahoo.com>

On Aug 23, 2023, at 13:37, Mark Millard <marklmi@yahoo.com> wrote:
>
> On Aug 23, 2023, at 11:40, Alexander Motin <mav@FreeBSD.org> wrote:
>
>> On 22.08.2023 14:24, Mark Millard wrote:
>>> Alexander Motin <mav_at_FreeBSD.org> wrote on
>>> Date: Tue, 22 Aug 2023 16:18:12 UTC :
>>>> I am waiting for final test results from George Wilson and then will
>>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>>> there are still not many reviewers for the PR, since the code is not
>>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>>> have tested and/or reviewed the PR, you may comment on it.
>>> I had written to the list that when I tried to test the system
>>> doing poudriere builds (initially with your patches) using
>>> USE_TMPFS=no so that zfs had to deal with all the file I/O, I
>>> instead got only one builder that ended up active, the others
>>> never reaching "Builder started":
>>
>>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>>> . . .
>>> 362     0 root         40    0  27076Ki   13776Ki CPU19   19   4:23   0.00% cpdup -i0 -o ref 32
>>> 349     0 root         53    0  27076Ki   13776Ki vlruwk  22   4:20   0.01% cpdup -i0 -o ref 31
>>> 328     0 root         68    0  27076Ki   13804Ki vlruwk   8   4:30   0.01% cpdup -i0 -o ref 30
>>> 304     0 root         37    0  27076Ki   13792Ki vlruwk   6   4:18   0.01% cpdup -i0 -o ref 29
>>> 282     0 root         42    0  33220Ki   13956Ki vlruwk   8   4:33   0.01% cpdup -i0 -o ref 28
>>> 242     0 root         56    0  27076Ki   13796Ki vlruwk   4   4:28   0.00% cpdup -i0 -o ref 27
>>> . . .
>>> But those processes did show CPU?? on occasion, as well as
>>> *vnode less often. None of the cpdup's was stuck in
>>> Removing your patches did not change the behavior.
>>
>> Mark, to me "vlruwk" looks like a limit on number of vnodes.  I was
>> not deep in that area at least recently, so somebody with more
>> experience there could try to diagnose it.  At very least it does not
>> look related to the ZIL issue discussed in this thread, at least with
>> the information provided, so I am not surprised that the mentioned
>> patches do not affect it.
>
> Thanks for the information. Good to know. I'll redirect this to be a
> different discussion.

Mateusz Guzik had me revert 138a5dafba31 (which is for
sys/kern/vfs_subr.c), which was enough to allow me to
run bulk -a with USE_TMPFS=no usefully. (There is now a
new sys/kern/vfs_subr.c patch for me to try instead.)

So I used the reverted context to test without your patches
to see if I'd get a deadlock from a bulk -a with USE_TMPFS=no
usage. It is past 9200 finished in about 18 hrs of building.
No deadlock. (I do not plan on letting the bulk -a run to
completion.)

The 3 load averages are normally over 100 and the MaxObs
figures for the 3 are currently: 349.68, 264.30, 243.16
(for a 32 hardware-thread system).

So it looks like when I try again with Mateusz's new patch,
trying with your patches would not be much of a test for
preventing deadlocks in this context. It would be more of a
cross check on whether other types of issues show up. It is
not clear how useful such testing might be.

It might be that the high load average bulk -a style makes
the deadlocks in question less likely for some reason.

===
Mark Millard
marklmi at yahoo.com




