Date:      Thu, 24 Aug 2023 07:54:33 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        Current FreeBSD <freebsd-current@freebsd.org>
Subject:   Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder
Message-ID:  <CAGudoHEWun73q5dPHreCv3cZ-+QrRT-CxyzrzsF3nsPi0k46rQ@mail.gmail.com>
In-Reply-To: <BB4A5EFC-2807-446C-83E0-330AFDCA28AC@yahoo.com>
References:  <4FFAE432-21FE-4462-9162-9CC30A5D470A.ref@yahoo.com> <4FFAE432-21FE-4462-9162-9CC30A5D470A@yahoo.com> <b8f819f0-9f6d-af87-64f9-35e37fbf4b2c@FreeBSD.org> <5D23E6BE-A25C-4190-BB2C-A2D3511ABD90@yahoo.com> <CAGudoHGMvFvSFwT5jotpiDg2+79O2jv1Fx4eAEH0yijs5pAJow@mail.gmail.com> <BB4A5EFC-2807-446C-83E0-330AFDCA28AC@yahoo.com>

On 8/24/23, Mark Millard <marklmi@yahoo.com> wrote:
> On Aug 23, 2023, at 15:10, Mateusz Guzik <mjguzik@gmail.com> wrote:
>
>> On 8/23/23, Mark Millard <marklmi@yahoo.com> wrote:
>>> [Forked off the ZFS deadlock 14 discussion, per feedback.]
>>>
>>> On Aug 23, 2023, at 11:40, Alexander Motin <mav@FreeBSD.org> wrote:
>>>
>>>> On 22.08.2023 14:24, Mark Millard wrote:
>>>>> Alexander Motin <mav@FreeBSD.org> wrote on Tue, 22 Aug 2023 16:18:12 UTC:
>>>>>> I am waiting for final test results from George Wilson and then will
>>>>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>>>>> there are still not many reviewers for the PR, since the code is not
>>>>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>>>> has tested and/or reviewed the PR, you may comment on it.
>>>>> I had written to the list that when I tried to test the system
>>>>> doing poudriere builds (initially with your patches) using
>>>>> USE_TMPFS=no, so that ZFS had to handle all the file I/O, only
>>>>> one builder ever became active; the others never reached
>>>>> "Builder started":
>>>>
>>>>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>>>>> . . .
>>>>> 362     0 root         40    0  27076Ki   13776Ki CPU19   19   4:23   0.00% cpdup -i0 -o ref 32
>>>>> 349     0 root         53    0  27076Ki   13776Ki vlruwk  22   4:20   0.01% cpdup -i0 -o ref 31
>>>>> 328     0 root         68    0  27076Ki   13804Ki vlruwk   8   4:30   0.01% cpdup -i0 -o ref 30
>>>>> 304     0 root         37    0  27076Ki   13792Ki vlruwk   6   4:18   0.01% cpdup -i0 -o ref 29
>>>>> 282     0 root         42    0  33220Ki   13956Ki vlruwk   8   4:33   0.01% cpdup -i0 -o ref 28
>>>>> 242     0 root         56    0  27076Ki   13796Ki vlruwk   4   4:28   0.00% cpdup -i0 -o ref 27
>>>>> . . .
>>>>> But those processes did show CPU?? on occasion, as well as *vnode
>>>>> less often, so none of the cpdup's was stuck in vlruwk the whole time.
>>>>> Removing your patches did not change the behavior.
>>>>
>>>> Mark, to me "vlruwk" looks like a limit on the number of vnodes.  I have
>>>> not been deep in that area recently, so somebody with more experience
>>>> there could try to diagnose it.  At the very least it does not look
>>>> related to the ZIL issue discussed in this thread, at least with the
>>>> information provided, so I am not surprised that the mentioned patches
>>>> do not affect it.
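>>>>
>>>> If it is the limit, then watching the counters side by side while the
>>>> cpdup's sit in "vlruwk" should make it obvious: vfs.numvnodes pinned
>>>> near kern.maxvnodes while vfs.vnode_alloc_sleeps keeps climbing. A
>>>> minimal sampling loop (just the sysctls already quoted in this thread,
>>>> nothing ZFS-specific) would be something like:
>>>>
>>>> while true; do
>>>>         date
>>>>         sysctl kern.maxvnodes vfs.numvnodes vfs.freevnodes vfs.vnode_alloc_sleeps
>>>>         sleep 10
>>>> done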
>>>
>>> I did the above intending to test the deadlock in my context but
>>> ended up not getting that far when I tried to make zfs handle all
>>> the file I/O (USE_TMPFS=no and no other use of tmpfs or the like).
>>>
>>> The zfs context is a simple single partition on the boot media. I
>>> use ZFS for bectl BE use, not for other typical reasons. The media
>>> here is PCIe Optane 1.4T media. The machine is a ThreadRipper
>>> 1950X, so first generation. 128 GiBytes of RAM. 491520 MiBytes of
>>> swap, also on that Optane.
>>>
>>> # uname -apKU
>>> FreeBSD amd64-ZFS 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #112
>>> main-n264912-b1d3e2b77155-dirty: Sun Aug 20 10:01:48 PDT 2023
>>> root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG
>>> amd64 amd64 1400096 1400096
>>>
>>> The GENERIC-DBG variant of the kernel did not report any issues in
>>> earlier testing.
>>>
>>> The later-referenced /usr/obj/DESTDIRs/main-amd64-poud-bulk_a was
>>> installed from the same build.
>>>
>>> # zfs list
>>> NAME                                          USED  AVAIL  REFER  MOUNTPOINT
>>> zoptb                                        79.9G   765G    96K  /zoptb
>>> zoptb/BUILDs                                 20.5G   765G  8.29M  /usr/obj/BUILDs
>>> zoptb/BUILDs/alt-main-amd64-dbg-clang-alt    1.86M   765G  1.86M  /usr/obj/BUILDs/alt-main-amd64-dbg-clang-alt
>>> zoptb/BUILDs/alt-main-amd64-nodbg-clang-alt  30.2M   765G  30.2M  /usr/obj/BUILDs/alt-main-amd64-nodbg-clang-alt
>>> zoptb/BUILDs/main-amd64-dbg-clang            9.96G   765G  9.96G  /usr/obj/BUILDs/main-amd64-dbg-clang
>>> zoptb/BUILDs/main-amd64-dbg-gccxtc           38.5M   765G  38.5M  /usr/obj/BUILDs/main-amd64-dbg-gccxtc
>>> zoptb/BUILDs/main-amd64-nodbg-clang          10.3G   765G  10.3G  /usr/obj/BUILDs/main-amd64-nodbg-clang
>>> zoptb/BUILDs/main-amd64-nodbg-clang-alt      37.2M   765G  37.2M  /usr/obj/BUILDs/main-amd64-nodbg-clang-alt
>>> zoptb/BUILDs/main-amd64-nodbg-gccxtc         94.6M   765G  94.6M  /usr/obj/BUILDs/main-amd64-nodbg-gccxtc
>>> zoptb/DESTDIRs                               4.33G   765G   104K  /usr/obj/DESTDIRs
>>> zoptb/DESTDIRs/main-amd64-poud               2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud
>>> zoptb/DESTDIRs/main-amd64-poud-bulk_a        2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
>>> zoptb/ROOT                                   13.1G   765G    96K  none
>>> zoptb/ROOT/build_area_for-main-amd64         5.03G   765G  3.24G  none
>>> zoptb/ROOT/main-amd64                        8.04G   765G  3.23G  none
>>> zoptb/poudriere                              6.58G   765G   112K  /usr/local/poudriere
>>> zoptb/poudriere/data                         6.58G   765G   128K  /usr/local/poudriere/data
>>> zoptb/poudriere/data/.m                       112K   765G   112K  /usr/local/poudriere/data/.m
>>> zoptb/poudriere/data/cache                   17.4M   765G  17.4M  /usr/local/poudriere/data/cache
>>> zoptb/poudriere/data/images                    96K   765G    96K  /usr/local/poudriere/data/images
>>> zoptb/poudriere/data/logs                    2.72G   765G  2.72G  /usr/local/poudriere/data/logs
>>> zoptb/poudriere/data/packages                3.84G   765G  3.84G  /usr/local/poudriere/data/packages
>>> zoptb/poudriere/data/wrkdirs                  112K   765G   112K  /usr/local/poudriere/data/wrkdirs
>>> zoptb/poudriere/jails                          96K   765G    96K  /usr/local/poudriere/jails
>>> zoptb/poudriere/ports                          96K   765G    96K  /usr/local/poudriere/ports
>>> zoptb/tmp                                    68.5M   765G  68.5M  /tmp
>>> zoptb/usr                                    35.1G   765G    96K  /usr
>>> zoptb/usr/13_0R-src                          2.64G   765G  2.64G  /usr/13_0R-src
>>> zoptb/usr/alt-main-src                         96K   765G    96K  /usr/alt-main-src
>>> zoptb/usr/home                                181M   765G   181M  /usr/home
>>> zoptb/usr/local                              5.08G   765G  5.08G  /usr/local
>>> zoptb/usr/main-src                            833M   765G   833M  /usr/main-src
>>> zoptb/usr/ports                              26.4G   765G  26.4G  /usr/ports
>>> zoptb/usr/src                                  96K   765G    96K  /usr/src
>>> zoptb/var                                    52.6M   765G    96K  /var
>>> zoptb/var/audit                               356K   765G   356K  /var/audit
>>> zoptb/var/crash                               128K   765G   128K  /var/crash
>>> zoptb/var/db                                 49.7M   765G    96K  /var/db
>>> zoptb/var/db/pkg                             49.4M   765G  49.4M  /var/db/pkg
>>> zoptb/var/db/ports                            164K   765G   164K  /var/db/ports
>>> zoptb/var/log                                1.61M   765G  1.61M  /var/log
>>> zoptb/var/mail                                632K   765G   632K  /var/mail
>>> zoptb/var/tmp                                 128K   765G   128K  /var/tmp
>>>
>>> # poudriere jail -jmain-amd64-bulk_a -i
>>> Jail name:         main-amd64-bulk_a
>>> Jail version:      14.0-ALPHA2
>>> Jail arch:         amd64
>>> Jail method:       null
>>> Jail mount:        /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
>>> Jail fs:
>>> Jail updated:      2021-12-04 14:55:22
>>> Jail pkgbase:      disabled
>>>
>>>
>>>
>>> So I set up another test, with some related information shown
>>> before, during, and after. The sysctl output is from a different
>>> ssh session than the bulk -a run.
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 0
>>> vm.stats.vm.v_vnodepgsin: 272429
>>> vm.stats.vm.v_vnodeout: 0
>>> vm.stats.vm.v_vnodein: 12461
>>> vfs.vnode_alloc_sleeps: 0
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 962766
>>> vfs.vnodes_created: 2538980
>>> vfs.numvnodes: 1056233
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 91878
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> # poudriere bulk -jmain-amd64-bulk_a -a
>>> . . .
>>> [00:01:34] Building 34042 packages using up to 32 builders
>>> [00:01:34] Hit CTRL+t at any time to see build progress and stats
>>> [00:01:34] [01] [00:00:00] Builder starting
>>> [00:01:57] [01] [00:00:23] Builder started
>>> [00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
>>> [00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
>>> [00:03:22] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
>>> [00:03:22] [02] [00:00:00] Builder starting
>>> [00:03:22] [03] [00:00:00] Builder starting
>>> . . .
>>> [00:03:22] [31] [00:00:00] Builder starting
>>> [00:03:22] [32] [00:00:00] Builder starting
>>> [00:03:31] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1: Success
>>> [00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
>>> . . .
>>>
>>> Note that only [01] makes progress: no new "Builder started"
>>> notices occur. top shows 31 of the pattern:
>>> cpdup -i0 -o ref ??
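>>>
>>> For reference, kernel stacks for the stuck cpdup's can be grabbed from
>>> the other ssh session with just procstat (no debugger needed), e.g.:
>>>
>>> # procstat -kk $(pgrep -d ' ' cpdup)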
>>>
>>> Then, during the 31 cpdup's showing vlruwk most of the time:
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 582398
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34296
>>> vfs.vnode_alloc_sleeps: 2994
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 2209662
>>> vfs.vnodes_created: 12206299
>>> vfs.numvnodes: 2214071
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 459
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> After waiting a while, the cpdup's still mostly show the vlruwk status:
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 583527
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34396
>>> vfs.vnode_alloc_sleeps: 8053
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 2210166
>>> vfs.vnodes_created: 12212061
>>> vfs.numvnodes: 2215106
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 497
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> ^C[00:14:55] Error: Signal SIGINT caught, cleaning up and exiting
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 584474
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34591
>>> vfs.vnode_alloc_sleeps: 17584
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 2210796
>>> vfs.vnodes_created: 12222343
>>> vfs.numvnodes: 2216564
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 539
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>> [main-amd64-bulk_a-default] [2023-08-23_13h58m08s] [sigint:] Queued: 34435 Built: 2 Failed: 0 Skipped: 35 Ignored: 358 Fetched: 0 Tobuild: 34040 Time: 00:14:36
>>> [00:16:13] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-08-23_13h58m08s
>>> [00:16:49] Cleaning up
>>> load: 5.28  cmd: sh 77057 [vlruwk] 141.63r 0.00u 30.98s 28% 6932k
>>> #0 0xffffffff80b76ebb at mi_switch+0xbb
>>> #1 0xffffffff80bc960f at sleepq_timedwait+0x2f
>>> #2 0xffffffff80b76610 at _sleep+0x1d0
>>> #3 0xffffffff80c5b2dc at vn_alloc_hard+0x2ac
>>> #4 0xffffffff80c50a12 at getnewvnode_reserve+0x92
>>> #5 0xffffffff829afb12 at zfs_zget+0x22
>>> #6 0xffffffff8299ca8d at zfs_dirent_lookup+0x16d
>>> #7 0xffffffff8299cb5f at zfs_dirlook+0x7f
>>> #8 0xffffffff829ac410 at zfs_lookup+0x350
>>> #9 0xffffffff829a782a at zfs_freebsd_cachedlookup+0x6a
>>> #10 0xffffffff80c368ad at vfs_cache_lookup+0xad
>>> #11 0xffffffff80c3b6d8 at cache_fplookup_final_modifying+0x188
>>> #12 0xffffffff80c38766 at cache_fplookup+0x356
>>> #13 0xffffffff80c43fb2 at namei+0x112
>>> #14 0xffffffff80c62e5b at kern_funlinkat+0x13b
>>> #15 0xffffffff80c62d18 at sys_unlink+0x28
>>> #16 0xffffffff83b8e583 at filemon_wrapper_unlink+0x13
>>> #17 0xffffffff81049a79 at amd64_syscall+0x109
>>>
>>> [00:26:28] Unmounting file systems
>>> Exiting with status 1
>>>
>>> # sysctl -a | grep vnode
>>> kern.maxvnodes: 2213808
>>> kern.ipc.umtx_vnode_persistent: 0
>>> kern.minvnodes: 553452
>>> vm.vnode_pbufs: 2048
>>> vm.stats.vm.v_vnodepgsout: 22844
>>> vm.stats.vm.v_vnodepgsin: 585384
>>> vm.stats.vm.v_vnodeout: 890
>>> vm.stats.vm.v_vnodein: 34798
>>> vfs.vnode_alloc_sleeps: 27578
>>> vfs.wantfreevnodes: 553452
>>> vfs.freevnodes: 61362
>>> vfs.vnodes_created: 20135479
>>> vfs.numvnodes: 59860
>>> vfs.cache.debug.vnodes_cel_3_failures: 0
>>> vfs.cache.stats.heldvnodes: 208
>>> debug.vnode_domainset: <NULL>
>>> debug.sizeof.vnode: 448
>>> debug.fail_point.status_fill_kinfo_vnode__random_path: off
>>> debug.fail_point.fill_kinfo_vnode__random_path: off
>>>
>>>
>>> For reference (from after):
>>>
>>> # kldstat
>>> Id Refs Address                Size Name
>>> 1   95 0xffffffff80200000  274b308 kernel
>>> 2    1 0xffffffff8294c000   5d5238 zfs.ko
>>> 3    1 0xffffffff82f22000     7718 cryptodev.ko
>>> 4    1 0xffffffff83b10000     3390 acpi_wmi.ko
>>> 5    1 0xffffffff83b14000     3220 intpm.ko
>>> 6    1 0xffffffff83b18000     2178 smbus.ko
>>> 7    1 0xffffffff83b1b000     2240 cpuctl.ko
>>> 8    1 0xffffffff83b1e000     3360 uhid.ko
>>> 9    1 0xffffffff83b22000     4364 ums.ko
>>> 10    1 0xffffffff83b27000     33c0 usbhid.ko
>>> 11    1 0xffffffff83b2b000     3380 hidbus.ko
>>> 12    1 0xffffffff83b2f000     4d20 ng_ubt.ko
>>> 13    6 0xffffffff83b34000     abb8 netgraph.ko
>>> 14    2 0xffffffff83b3f000     a250 ng_hci.ko
>>> 15    4 0xffffffff83b4a000     2670 ng_bluetooth.ko
>>> 16    1 0xffffffff83b4d000     83a0 uftdi.ko
>>> 17    1 0xffffffff83b56000     4e58 ucom.ko
>>> 18    1 0xffffffff83b5b000     3360 wmt.ko
>>> 19    1 0xffffffff83b5f000     e268 ng_l2cap.ko
>>> 20    1 0xffffffff83b6e000    1bf68 ng_btsocket.ko
>>> 21    1 0xffffffff83b8a000     38f8 ng_socket.ko
>>> 22    1 0xffffffff83b8e000     3250 filemon.ko
>>> 23    1 0xffffffff83b92000     4758 nullfs.ko
>>> 24    1 0xffffffff83b97000     73c0 linprocfs.ko
>>> 25    3 0xffffffff83b9f000     be70 linux_common.ko
>>> 26    1 0xffffffff83bab000     3558 fdescfs.ko
>>> 27    1 0xffffffff83baf000    31b20 linux.ko
>>> 28    1 0xffffffff83be1000    2ed40 linux64.ko
>>>
>>>
>>> Note that before the "Cleaning up" notice, vfs.freevnodes shows as
>>> being around (for example) 2210796, but after "Exiting with status"
>>> it is 61362. vfs.vnodes_created similarly stays in the ballpark of
>>> 12222343 and then jumps to 20135479. Likewise, vfs.numvnodes goes
>>> 2216564 -> 59860.
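>>>
>>> (For an exact before/after comparison, the same counters could be
>>> snapshotted around a run; file names below are arbitrary, e.g.:
>>>
>>> sysctl vfs.freevnodes vfs.vnodes_created vfs.numvnodes > /tmp/vnodes.before
>>> # ... run and then interrupt the bulk ...
>>> sysctl vfs.freevnodes vfs.vnodes_created vfs.numvnodes > /tmp/vnodes.after
>>> diff -u /tmp/vnodes.before /tmp/vnodes.after )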
>>>
>>>
>>>
>>> Anything else I should gather and report as basic information?
>>>
>>
>> This is a known problem, but it is unclear if you should be running
>> into it in this setup.
>
> The change fixed the issue, so I do run into it for this setup.
> See below.
>
>> Can you try again but this time *revert*
>> 138a5dafba312ff39ce0eefdbe34de95519e600d, like so:
>> git revert 138a5dafba312ff39ce0eefdbe34de95519e600d
>>
>> You may want to switch to a different branch first, for example:
>> git checkout -b vfstesting
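>>
>> i.e., roughly (use whatever path your source tree actually lives at,
>> and rebuild and install the kernel as usual afterwards):
>>
>> cd /usr/main-src    # adjust to your tree
>> git checkout -b vfstesting
>> git revert 138a5dafba312ff39ce0eefdbe34de95519e600d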
>
> # git -C /usr/main-src/ diff sys/kern/vfs_subr.c
> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
> index 0f3f00abfd4a..5dff556ac258 100644
> --- a/sys/kern/vfs_subr.c
> +++ b/sys/kern/vfs_subr.c
> @@ -3528,25 +3528,17 @@ vdbatch_process(struct vdbatch *vd)
>         MPASS(curthread->td_pinned > 0);
>         MPASS(vd->index == VDBATCH_SIZE);
>
> +       mtx_lock(&vnode_list_mtx);
>         critical_enter();
> -       if (mtx_trylock(&vnode_list_mtx)) {
> -               for (i = 0; i < VDBATCH_SIZE; i++) {
> -                       vp = vd->tab[i];
> -                       vd->tab[i] = NULL;
> -                       TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
> -                       TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
> -                       MPASS(vp->v_dbatchcpu != NOCPU);
> -                       vp->v_dbatchcpu = NOCPU;
> -               }
> -               mtx_unlock(&vnode_list_mtx);
> -       } else {
> -               for (i = 0; i < VDBATCH_SIZE; i++) {
> -                       vp = vd->tab[i];
> -                       vd->tab[i] = NULL;
> -                       MPASS(vp->v_dbatchcpu != NOCPU);
> -                       vp->v_dbatchcpu = NOCPU;
> -               }
> +       for (i = 0; i < VDBATCH_SIZE; i++) {
> +               vp = vd->tab[i];
> +               TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
> +               TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
> +               MPASS(vp->v_dbatchcpu != NOCPU);
> +               vp->v_dbatchcpu = NOCPU;
>         }
> +       mtx_unlock(&vnode_list_mtx);
> +       bzero(vd->tab, sizeof(vd->tab));
>         vd->index = 0;
>         critical_exit();
>  }
>
> Still with:
>
> # grep USE_TMPFS= /usr/local/etc/poudriere.conf
> # EXAMPLE: USE_TMPFS="wrkdir data"
> #USE_TMPFS=all
> #USE_TMPFS="data"
> USE_TMPFS=no
>
>
> That allowed the other builders to eventually reach "Builder started"
> and later activity, with [00:05:50] [27] [00:02:29] Builder started
> being the first non-[01] to do so. No vlruwk's were observed in what
> I saw in top:
>
> # poudriere bulk -jmain-amd64-bulk_a -a
> . . .
> [00:01:34] Building 34042 packages using up to 32 builders
> [00:01:34] Hit CTRL+t at any time to see build progress and stats
> [00:01:34] [01] [00:00:00] Builder starting
> [00:01:57] [01] [00:00:23] Builder started
> [00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
> [00:03:08] [01] [00:01:11] Finished ports-mgmt/pkg | pkg-1.20.4: Success
> [00:03:21] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
> [00:03:21] [02] [00:00:00] Builder starting
> [00:03:21] [03] [00:00:00] Builder starting
> [00:03:21] [04] [00:00:00] Builder starting
> [00:03:21] [05] [00:00:00] Builder starting
> [00:03:21] [06] [00:00:00] Builder starting
> [00:03:21] [07] [00:00:00] Builder starting
> [00:03:21] [08] [00:00:00] Builder starting
> [00:03:21] [09] [00:00:00] Builder starting
> [00:03:21] [10] [00:00:00] Builder starting
> [00:03:21] [11] [00:00:00] Builder starting
> [00:03:21] [12] [00:00:00] Builder starting
> [00:03:21] [13] [00:00:00] Builder starting
> [00:03:21] [14] [00:00:00] Builder starting
> [00:03:21] [15] [00:00:00] Builder starting
> [00:03:21] [16] [00:00:00] Builder starting
> [00:03:21] [17] [00:00:00] Builder starting
> [00:03:21] [18] [00:00:00] Builder starting
> [00:03:21] [19] [00:00:00] Builder starting
> [00:03:21] [20] [00:00:00] Builder starting
> [00:03:21] [21] [00:00:00] Builder starting
> [00:03:21] [22] [00:00:00] Builder starting
> [00:03:21] [23] [00:00:00] Builder starting
> [00:03:21] [24] [00:00:00] Builder starting
> [00:03:21] [25] [00:00:00] Builder starting
> [00:03:21] [26] [00:00:00] Builder starting
> [00:03:21] [27] [00:00:00] Builder starting
> [00:03:21] [28] [00:00:00] Builder starting
> [00:03:21] [29] [00:00:00] Builder starting
> [00:03:21] [30] [00:00:00] Builder starting
> [00:03:21] [31] [00:00:00] Builder starting
> [00:03:21] [32] [00:00:00] Builder starting
> [00:03:29] [01] [00:00:08] Finished print/indexinfo | indexinfo-0.3.1: Success
> [00:03:30] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
> [00:04:49] [01] [00:01:19] Finished devel/gettext-runtime | gettext-runtime-0.22: Success
> [00:04:52] [01] [00:00:00] Building devel/gmake | gmake-4.3_2
> [00:05:20] [01] [00:00:28] Finished devel/gmake | gmake-4.3_2: Success
> [00:05:23] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
> [00:05:50] [27] [00:02:29] Builder started
> [00:05:50] [27] [00:00:00] Building devel/npth | npth-1.6
> [00:05:50] [08] [00:02:29] Builder started
> [00:05:50] [08] [00:00:00] Building devel/boost-jam | boost-jam-1.82.0
> [00:05:51] [16] [00:02:30] Builder started
> [00:05:51] [16] [00:00:00] Building devel/libdaemon | libdaemon-0.14_1
> [00:05:51] [19] [00:02:30] Builder started
> [00:05:51] [19] [00:00:00] Building graphics/libpotrace | libpotrace-1.16
> [00:05:51] [14] [00:02:30] Builder started
> [00:05:51] [14] [00:00:00] Building sysutils/gnome_subr | gnome_subr-1.0
> [00:05:51] [17] [00:02:30] Builder started
> [00:05:51] [17] [00:00:00] Building misc/hicolor-icon-theme | hicolor-icon-theme-0.17
> [00:05:51] [26] [00:02:30] Builder started
> [00:05:51] [26] [00:00:00] Building devel/libinotify | libinotify-20211018
> [00:05:51] [12] [00:02:30] Builder started
> [00:05:51] [12] [00:00:00] Building devel/evdev-proto | evdev-proto-5.8
> [00:05:51] [29] [00:02:30] Builder started
> [00:05:51] [29] [00:00:00] Building devel/opencl | opencl-3.0.14
> [00:05:51] [05] [00:02:30] Builder started
> [00:05:51] [05] [00:00:00] Building devel/autoconf-switch | autoconf-switch-20220527
> [00:05:51] [11] [00:02:30] Builder started
> [00:05:51] [11] [00:00:00] Building textproc/html2text | html2text-1.3.2a_1,1
> [00:05:51] [25] [00:02:30] Builder started
> [00:05:51] [04] [00:02:30] Builder started
> [00:05:51] [25] [00:00:00] Building audio/speexdsp | speexdsp-1.2.1
> [00:05:51] [04] [00:00:00] Building converters/libiconv | libiconv-1.17
> [00:05:51] [22] [00:02:30] Builder started
> [00:05:51] [22] [00:00:00] Building dns/public_suffix_list | public_suffix_list-20230630
> [00:05:51] [20] [00:02:30] Builder started
> [00:05:51] [02] [00:02:30] Builder started
> [00:05:51] [24] [00:02:30] Builder started
> [00:05:51] [31] [00:02:30] Builder started
> [00:05:51] [20] [00:00:00] Building print/gsfonts | gsfonts-8.11_8
> . . .
>
> With the prior code I had waited a very long time without any such
> progress for [02]..[32].
>
> Now testing for the zfs deadlock issue should be possible for
> this setup.
>

Thanks for testing. I wrote a fix:

https://people.freebsd.org/~mjg/vfs-recycle-fix.diff

It applies to the *stock* kernel (i.e., without the revert).
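
For example, against an un-reverted tree (using the /usr/main-src path
from your earlier mail; patch(1) would work too if the tree is not a git
checkout):

cd /usr/main-src
fetch https://people.freebsd.org/~mjg/vfs-recycle-fix.diff
git apply vfs-recycle-fix.diff
# then rebuild and install the kernel as usual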

-- 
Mateusz Guzik <mjguzik gmail.com>


