Date: Sat, 19 Aug 2023 13:41:56 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Current FreeBSD <freebsd-current@freebsd.org>
Subject: Re: ZFS deadlock in 14
Message-ID: <C5747BF8-724E-43B7-88D0-A9F70485E7E1@yahoo.com>
In-Reply-To: <3AA253E3-C4F0-4AA3-9C37-D77E7527A458@yahoo.com>
References: <59FCB309-4A55-4924-98C4-7ACCA70FD299@yahoo.com> <0F2C42B4-36FF-443A-A174-5B0CC57C4FC7@yahoo.com> <3AA253E3-C4F0-4AA3-9C37-D77E7527A458@yahoo.com>
[I forgot to adjust USE_TMPFS for the purpose of the test.
So I'll later be starting over.]

On Aug 19, 2023, at 12:18, Mark Millard <marklmi@yahoo.com> wrote:

> On Aug 19, 2023, at 11:40, Mark Millard <marklmi@yahoo.com> wrote:
>
>> We will see how long the following high load average bulk -a
>> configuration survives a build attempt, using a non-debug kernel
>> for this test.
>>
>> I've applied:
>>
>> # fetch -o- https://github.com/openzfs/zfs/pull/15107.patch | git -C /usr/main-src/ am --dir=sys/contrib/openzfs
>> -                                                       13 kB  900 kBps    00s
>> Applying: Remove fastwrite mechanism.
>>
>> # fetch -o- https://github.com/openzfs/zfs/pull/15122.patch | git -C /usr/main-src/ am --dir=sys/contrib/openzfs
>> -                                                       45 kB 1488 kBps    00s
>> Applying: ZIL: Second attempt to reduce scope of zl_issuer_lock.
>>
>> on a ThreadRipper 1950X (32 hardware threads) that is at
>> main 6b405053c997:
>>
>> Thu, 10 Aug 2023
>> . . .
>> • git: cd25b0f740f8 - main - zfs: cherry-pick fix from openzfs Martin Matuska
>> • git: 28d2e3b5dedf - main - zfs: cherry-pick fix from openzfs Martin Matuska
>> . . .
>> • git: 6b405053c997 - main - OpenSSL: clean up botched merges in OpenSSL 3.0.9 import Jung-uk Kim
>>
>> So it is based on starting with the 2 cherry-picks as
>> well.
>>
>> The ThreadRipper 1950X boots from a bectl BE and
>> that zfs media is all that is in use here.
>>
>> I'm setting up to test starting a bulk -a using
>> ALLOW_MAKE_JOBS=yes along with allowing 32 builders.
>> So 32*32 or so, potentially, for load average(s)
>> at times. There is 128 GiBytes of RAM and:
>>
>> # swapinfo
>> Device              1K-blocks     Used     Avail Capacity
>> /dev/gpt/OptBswp480 503316480        0 503316480     0%
>>
>> I'm not so sure that such a high load average bulk -a
>> is reasonable for a debug kernel build: unsure of
>> resource usage for such and if everything could be
>> tracked as needed. So I'm testing a non-debug build
>> for now.
>>
>> I have built the kernels (nodbg and dbg), installed
>> the nodbg kernel, rebooted, and started:
>>
>> # poudriere bulk -jmain-amd64-bulk_a -a
>> . . .
>> [00:01:22] Building 34042 packages using up to 32 builders
>> . . .
>>
>> The ports tree is from back in mid-July.
>>
>> I have a patched up top that records and reports
>> various MaxObs???? figures (Maximum Observed). It
>> was recently reporting:
>>
>> . . .; load averages: 119.56, 106.79, 71.54 MaxObs: 184.08, 112.10, 71.54
>> 1459 threads: . . ., 273 MaxObsRunning
>> . . .
>> Mem: . . ., 61066Mi MaxObsActive, 10277Mi MaxObsWired, 71371Mi MaxObs(Act+Wir+Lndry)
>> . . .
>> Swap: . . ., 61094Mi MaxObs(Act+Lndry+SwapUsed), 71371Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>
> Status report at about 1 hr in:
>
> [main-amd64-bulk_a-default] [2023-08-19_11h04m26s] [parallel_build:] Queued: 34435 Built: 1929 Failed: 9 Skipped: 2569 Ignored: 358 Fetched: 0 Tobuild: 29570 Time: 00:59:59
>
> Not hung up yet.
>
> From about 10 minutes after that:
>
> . . . load averages: 205.56, 181.58, 153.68 MaxObs: 213.78, 182.26, 153.68
> 1704 threads: . . ., 311 MaxObsRunning
> . . .
> Mem: . . ., 100250Mi MaxObsActive, 16857Mi MaxObsWired, 124879Mi MaxObs(Act+Wir+Lndry)
> . . .
> Swap: . . . 5994Mi MaxObsUsed, 116589Mi MaxObs(Act+Lndry+SwapUsed), 127354Mi MaxObs(Act+Wir+Lndry+SwapUsed)

Just realized that I'd forgotten to reconfigure the
USE_TMPFS=all to be USE_TMPFS=no so what I've done
so far is not a great test. I'll still probably let
it reach 3hr and get the summary information before
I stop it, adjust USE_TMPFS, and start over from
scratch.

===
Mark Millard
marklmi at yahoo.com
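
[Editorial sketch, not part of the original message.] The settings discussed above could look roughly like the poudriere.conf-style fragment below. It is an illustration under stated assumptions, not the configuration actually used. The message names only ALLOW_MAKE_JOBS, USE_TMPFS, and the 32 builders. Setting the builder count through PARALLEL_JOBS is an assumption, and peak_jobs is a hypothetical helper that only spells out the "32*32 or so" estimate.

```shell
# Sketch of the poudriere.conf settings named in the message.
# Assumption: the builder count is set via PARALLEL_JOBS; the message
# only says that 32 builders were allowed.
PARALLEL_JOBS=32
ALLOW_MAKE_JOBS=yes
# The value the test was meant to use (the first run still had USE_TMPFS=all).
USE_TMPFS=no

# Hypothetical helper: rough upper bound on concurrent make jobs,
# assuming every builder may run as many jobs as there are hardware threads.
peak_jobs() {
    echo $(( $1 * $2 ))
}

peak_jobs "$PARALLEL_JOBS" 32   # builders * hardware threads
```

With 32 builders on 32 hardware threads the helper prints 1024. That is a ceiling rather than a prediction; the load averages reported in the message peaked at a little over 200.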