Date: Fri, 8 Sep 2023 21:54:14 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Martin Matuska <mm@FreeBSD.org>, Alexander Motin <mav@FreeBSD.org>, Glen Barber <gjb@FreeBSD.org>
Cc: Current FreeBSD <freebsd-current@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, Pawel Jakub Dawidek <pjd@freebsd.org>
Subject: Re: main [and, likely, stable/14]: do not set vfs.zfs.bclone_enabled=1 with that zpool feature enabled because it still leads to panics
Message-ID: <8746A218-F83A-40E7-95F8-5EC1E36411C1@yahoo.com>
In-Reply-To: <A906A64F-3CAF-49E4-9C11-1A188FD22881@yahoo.com>
References: <7CE2CAAF-8BB0-4422-B194-4A6B0A4BC12C@yahoo.com>
 <08B7E72B-78F1-4ACA-B09D-E8C34BCE2335@yahoo.com>
 <20230907184823.GC4090@FreeBSD.org>
 <F4ED7034-6776-402C-8706-DED08F41455E@yahoo.com>
 <4f4e2b68-57e0-a475-e2bd-1f2b8844ebfe@FreeBSD.org>
 <354C5B8C-4216-4171-B8C2-8E827817F8E5@yahoo.com>
 <8B8B3707-4B37-4621-8124-D6A77CAF6879@yahoo.com>
 <15df58d3-4603-132f-112e-d10a6d4419bf@FreeBSD.org>
 <2a25427c-5a61-3f72-4e31-b7666741d38d@FreeBSD.org>
 <63717d32-f340-1320-3335-85135d1b62bc@FreeBSD.org>
 <05C47E15-640D-41AD-9C4C-73A1D5041CF4@yahoo.com>
 <A906A64F-3CAF-49E4-9C11-1A188FD22881@yahoo.com>
On Sep 8, 2023, at 18:19, Mark Millard <marklmi@yahoo.com> wrote:

> On Sep 8, 2023, at 17:03, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On Sep 8, 2023, at 15:30, Martin Matuska <mm@FreeBSD.org> wrote:
>>
>>> I can confirm that the patch fixes the panic caused by the provided script on my test systems.
>>> Mark, would it be possible to try poudriere on your system with a patched kernel?
>>
>> . . .
>>
>> On 9. 9. 2023 0:09, Alexander Motin wrote:
>>> On 08.09.2023 09:52, Martin Matuska wrote:
>>>> . . .
>>>
>>> Thank you, Martin. I was able to reproduce the issue with your script and found the cause.
>>>
>>> I first thought the issue is triggered by the `cp`, but it appeared to be triggered by `cat`. It also got copy_file_range() support, but later than `cp`. That is probably why it slipped through testing. This patch fixes it for me: https://github.com/openzfs/zfs/pull/15251 .
>>>
>>> Mark, could you please try the patch?
>>
>> If all goes well, this will end up reporting that the
>> poudriere bulk -a is still running but has gotten past,
>> say, 320+ port->package builds finished (so: more than
>> double observed so far for the panic context). Later
>> would be a report with a larger figure. A normal run
>> I might let go for 6000+ ports and 10 hr or so.
>>
>> Notes as I go . . .
>>
>> Patch applied, built, and installed to the test media.
>> Also, booted:
>>
>> # uname -apKU
>> FreeBSD amd64-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT amd64 1500000 #75 main-n265228-c9315099f69e-dirty: Thu Sep 7 13:28:47 PDT 2023 root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-dbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-DBG amd64 amd64 1500000 1500000
>>
>> Note that this is with a debug kernel (-dbg- in path and -DBG in
>> the GENERIC* name). Also, the vintage of what it is based on has:
>>
>> git: 969071be938c - main - vfs: copy_file_range() between multiple mountpoints of the same fs type
>>
>> The usual sort of sequencing previously reported to get to this
>> point. Media update starts with the rewind to the checkpoint in
>> hopes of avoiding oddities from the later failure.
>>
>> . . . :
>>
>> [main-amd64-bulk_a-default] [2023-09-08_16h31m51s] [parallel_build:] Queued: 34588 Built: 414 Failed: 0 Skipped: 39 Ignored: 335 Fetched: 0 Tobuild: 33800 Time: 00:30:41
>>
>>
>> So 414 and still building.
>>
>> More later. (It may be a while.)
>>
>
> [main-amd64-bulk_a-default] [2023-09-08_16h31m51s] [parallel_build:] Queued: 34588 Built: 2013 Failed: 2 Skipped: 179 Ignored: 335 Fetched: 0 Tobuild: 32059 Time: 01:42:47
>
> and still going. (FYI: The failures are expected.)
>
> After a while I might stop it and start over with a non-debug
> kernel installed instead.

I did ^C after 2.5 hr (with 2447 built):

^C[02:30:05] Error: Signal SIGINT caught, cleaning up and exiting
[main-amd64-bulk_a-default] [2023-09-08_16h31m51s] [sigint:] Queued: 34588 Built: 2447 Failed: 5 Skipped: 226 Ignored: 335 Fetched: 0 Tobuild: 31575 Time: 02:29:59
[02:30:05] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-08_16h31m51s
[02:30:05] Cleaning up
[02:38:04] Unmounting file systems
Exiting with status 1

I'll switch it over to a non-debug kernel and, probably,
world and setup/run another test.

. . . (time goes by) . . .

Hmm. This did not get sent when I wrote the above. FYI,
non-debug test status:

[main-amd64-bulk_a-default] [2023-09-08_19h51m52s] [parallel_build:] Queued: 34588 Built: 2547 Failed: 5 Skipped: 239 Ignored: 335 Fetched: 0 Tobuild: 31462 Time: 01:59:58

I may let it run overnight.

===
Mark Millard
marklmi at yahoo.com
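
For anyone wanting to compare the debug and non-debug runs from the status
lines quoted above, here is a minimal Python sketch. It is not part of
poudriere. The field layout is assumed only from the lines shown in this
message, and the names parse_status and builds_per_hour are hypothetical
helpers made up for illustration.

```python
import re

# Field layout assumed from the status lines quoted in this message;
# this is not an official poudriere format description.
STATUS_RE = re.compile(
    r"Queued: (?P<queued>\d+)\s+Built: (?P<built>\d+)\s+Failed: (?P<failed>\d+)"
    r".*?Time: (?P<h>\d+):(?P<m>\d+):(?P<s>\d+)"
)


def parse_status(line):
    """Extract counters and elapsed seconds from one status line (hypothetical helper)."""
    m = STATUS_RE.search(line)
    if m is None:
        raise ValueError("not a recognised status line")
    elapsed = int(m["h"]) * 3600 + int(m["m"]) * 60 + int(m["s"])
    return {
        "queued": int(m["queued"]),
        "built": int(m["built"]),
        "failed": int(m["failed"]),
        "elapsed_s": elapsed,
    }


def builds_per_hour(status):
    """Average packages built per hour of elapsed time so far."""
    return status["built"] * 3600 / status["elapsed_s"]


if __name__ == "__main__":
    debug_line = (
        "[main-amd64-bulk_a-default] [2023-09-08_16h31m51s] [sigint:] "
        "Queued: 34588 Built: 2447 Failed: 5 Skipped: 226 Ignored: 335 "
        "Fetched: 0 Tobuild: 31575 Time: 02:29:59"
    )
    nondebug_line = (
        "[main-amd64-bulk_a-default] [2023-09-08_19h51m52s] [parallel_build:] "
        "Queued: 34588 Built: 2547 Failed: 5 Skipped: 239 Ignored: 335 "
        "Fetched: 0 Tobuild: 31462 Time: 01:59:58"
    )
    for label, line in (("debug", debug_line), ("non-debug", nondebug_line)):
        print(label, round(builds_per_hour(parse_status(line))), "built/hr")
```

The averages are only a rough comparison: both runs were sampled early in
the bulk build, and the mix of ports built changes as a run proceeds.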
