Date:      Sat, 19 Aug 2023 11:40:59 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Current FreeBSD <freebsd-current@freebsd.org>
Subject:   Re: ZFS deadlock in 14
Message-ID:  <0F2C42B4-36FF-443A-A174-5B0CC57C4FC7@yahoo.com>
In-Reply-To: <59FCB309-4A55-4924-98C4-7ACCA70FD299@yahoo.com>
References:  <59FCB309-4A55-4924-98C4-7ACCA70FD299@yahoo.com>

We will see how long the following high load average bulk -a
configuration survives a build attempt, using a non-debug kernel
for this test.

I've applied:

# fetch -o- https://github.com/openzfs/zfs/pull/15107.patch | git -C /usr/main-src/ am --dir=sys/contrib/openzfs
-                                                       13 kB  900 kBps    00s
Applying: Remove fastwrite mechanism.

# fetch -o- https://github.com/openzfs/zfs/pull/15122.patch | git -C /usr/main-src/ am --dir=sys/contrib/openzfs
-                                                       45 kB 1488 kBps    00s
Applying: ZIL: Second attempt to reduce scope of zl_issuer_lock.

on a ThreadRipper 1950X (32 hardware threads) that is at
main 6b405053c997:

Thu, 10 Aug 2023
. . .
   • git: cd25b0f740f8 - main - zfs: cherry-pick fix from openzfs Martin Matuska
   • git: 28d2e3b5dedf - main - zfs: cherry-pick fix from openzfs Martin Matuska
. . .
   • git: 6b405053c997 - main - OpenSSL: clean up botched merges in OpenSSL 3.0.9 import Jung-uk Kim

So the patched tree starts from one that already
contains the two cherry-picks as well.

The ThreadRipper 1950X boots from a bectl BE and
that zfs media is all that is in use here.

I'm setting up to test starting a bulk -a using
ALLOW_MAKE_JOBS=yes along with allowing 32 builders.
So load averages could potentially reach 32*32 or so
at times. There is 128 GiBytes of RAM and:

# swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/gpt/OptBswp480 503316480        0 503316480     0%
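
The "32*32 or so" figure above can be spelled out as a rough upper bound. This is only a sketch: the variable names are hypothetical, and it assumes each of the 32 builders may run as many make jobs as there are hardware threads (32), which is what ALLOW_MAKE_JOBS=yes is taken to permit here.

```shell
#!/bin/sh
# Rough upper bound on concurrently runnable build processes.
# Assumption: every builder uses one make job per hardware thread.
builders=32            # builders allowed in the bulk run
jobs_per_builder=32    # assumed make jobs per builder (hardware threads)

peak=$((builders * jobs_per_builder))
echo "potential peak runnable processes: $peak"   # prints 1024
```

Real load averages would normally stay well below this bound, since most ports do not keep 32 jobs busy for their whole build.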

I'm not so sure that such a high load average bulk -a
is reasonable for a debug kernel build: I'm unsure of
the resource usage involved and whether everything could
be tracked as needed. So I'm testing a non-debug build
for now.

I have built the kernels (nodbg and dbg), installed
the nodbg kernel, rebooted, and started:

# poudriere bulk -jmain-amd64-bulk_a -a
. . .
[00:01:22] Building 34042 packages using up to 32 builders
. . .

The ports tree is from back in mid-July.

I have a patched up top that records and reports
various MaxObs???? figures (Maximum Observed). It
was recently reporting:

. . .;  load averages: 119.56, 106.79,  71.54 MaxObs: 184.08, 112.10,  71.54
1459 threads:  . . ., 273 MaxObsRunning
. . .
Mem: . . ., 61066Mi MaxObsActive, 10277Mi MaxObsWired, 71371Mi MaxObs(Act+Wir+Lndry)
. . .
Swap: . . ., 61094Mi MaxObs(Act+Lndry+SwapUsed), 71371Mi MaxObs(Act+Wir+Lndry+SwapUsed)
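
For readers without a patched top, the "maximum observed" idea for the load averages can be approximated with a small filter. This is a hypothetical sketch, not the patch used above: the function name max_obs is made up, and it only assumes that each input line carries the 1, 5, and 15 minute load averages as three numbers.

```shell
#!/bin/sh
# Hypothetical helper (not the patched top): reads "l1 l5 l15" samples,
# one triple per line, and prints the running per-column maximum.
max_obs() {
    awk '
        NF >= 3 {
            for (i = 1; i <= 3; i++)
                if (!seen || ($i + 0) > max[i])
                    max[i] = $i + 0
            seen = 1
            printf("MaxObs: %.2f, %.2f, %.2f\n", max[1], max[2], max[3])
        }
    '
}

# Illustrative input only; these numbers are made up for the example.
printf '%s\n' '2.50 1.00 0.50' '1.75 1.20 0.60' | max_obs
```

On FreeBSD the samples could presumably be fed from a loop around `sysctl -n vm.loadavg` with the surrounding braces stripped, though that pipeline is an assumption and is not shown here.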

===
Mark Millard
marklmi at yahoo.com