Date:      Wed, 23 Aug 2023 14:50:38 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Current FreeBSD <freebsd-current@freebsd.org>
Subject:   poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds  in first builder
Message-ID:  <5D23E6BE-A25C-4190-BB2C-A2D3511ABD90@yahoo.com>
In-Reply-To: <b8f819f0-9f6d-af87-64f9-35e37fbf4b2c@FreeBSD.org>
References:  <4FFAE432-21FE-4462-9162-9CC30A5D470A.ref@yahoo.com> <4FFAE432-21FE-4462-9162-9CC30A5D470A@yahoo.com> <b8f819f0-9f6d-af87-64f9-35e37fbf4b2c@FreeBSD.org>

[Forked off the ZFS deadlock 14 discussion, per feedback.]

On Aug 23, 2023, at 11:40, Alexander Motin <mav@FreeBSD.org> wrote:

> On 22.08.2023 14:24, Mark Millard wrote:
>> Alexander Motin <mav_at_FreeBSD.org> wrote on
>> Date: Tue, 22 Aug 2023 16:18:12 UTC :
>>> I am waiting for final test results from George Wilson and then will
>>> request quick merge of both to zfs-2.2-release branch. Unfortunately
>>> there are still not many reviewers for the PR, since the code is not
>>> trivial, but at least with the test reports Brian Behlendorf and Mark
>>> Maybee seem to be OK to merge the two PRs into 2.2. If somebody else
>>> have tested and/or reviewed the PR, you may comment on it.
>> I had written to the list that when I tried to test the system
>> doing poudriere builds (initially with your patches) using
>> USE_TMPFS=no so that zfs had to deal with all the file I/O, I
>> instead got only one builder that ended up active, the others
>> never reaching "Builder started":
>
>> Top was showing lots of "vlruwk" for the cpdup's. For example:
>> . . .
>>  362     0 root         40    0  27076Ki   13776Ki CPU19   19   4:23   0.00% cpdup -i0 -o ref 32
>>  349     0 root         53    0  27076Ki   13776Ki vlruwk  22   4:20   0.01% cpdup -i0 -o ref 31
>>  328     0 root         68    0  27076Ki   13804Ki vlruwk   8   4:30   0.01% cpdup -i0 -o ref 30
>>  304     0 root         37    0  27076Ki   13792Ki vlruwk   6   4:18   0.01% cpdup -i0 -o ref 29
>>  282     0 root         42    0  33220Ki   13956Ki vlruwk   8   4:33   0.01% cpdup -i0 -o ref 28
>>  242     0 root         56    0  27076Ki   13796Ki vlruwk   4   4:28   0.00% cpdup -i0 -o ref 27
>> . . .
>> But those processes did show CPU?? on occasion, as well as
>> *vnode less often. None of the cpdup's was stuck in
>> Removing your patches did not change the behavior.
>
> Mark, to me "vlruwk" looks like a limit on number of vnodes.  I was not
> deep in that area at least recently, so somebody with more experience
> there could try to diagnose it.  At very least it does not look related
> to the ZIL issue discussed in this thread, at least with the information
> provided, so I am not surprised that the mentioned patches do not affect it.

I did the above intending to test the deadlock in my context but
ended up not getting that far when I tried to make zfs handle all
the file I/O (USE_TMPFS=no and no other use of tmpfs or the like).

The ZFS context is a simple single partition on the boot media. I use
ZFS for bectl BE use, not for the other typical reasons. The media here
is a PCIe Optane 1.4T device. The machine is a ThreadRipper 1950X, so
first generation, with 128 GiBytes of RAM and 491520 MiBytes of swap,
also on that Optane.

# uname -apKU
FreeBSD amd64-ZFS 14.0-ALPHA2 FreeBSD 14.0-ALPHA2 amd64 1400096 #112 main-n264912-b1d3e2b77155-dirty: Sun Aug 20 10:01:48 PDT 2023     root@amd64-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1400096 1400096

The GENERIC-DBG variant of the kernel did not report any issues in
earlier testing.

The later-referenced /usr/obj/DESTDIRs/main-amd64-poud-bulk_a was
installed from the same build.

# zfs list
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
zoptb                                        79.9G   765G    96K  /zoptb
zoptb/BUILDs                                 20.5G   765G  8.29M  /usr/obj/BUILDs
zoptb/BUILDs/alt-main-amd64-dbg-clang-alt    1.86M   765G  1.86M  /usr/obj/BUILDs/alt-main-amd64-dbg-clang-alt
zoptb/BUILDs/alt-main-amd64-nodbg-clang-alt  30.2M   765G  30.2M  /usr/obj/BUILDs/alt-main-amd64-nodbg-clang-alt
zoptb/BUILDs/main-amd64-dbg-clang            9.96G   765G  9.96G  /usr/obj/BUILDs/main-amd64-dbg-clang
zoptb/BUILDs/main-amd64-dbg-gccxtc           38.5M   765G  38.5M  /usr/obj/BUILDs/main-amd64-dbg-gccxtc
zoptb/BUILDs/main-amd64-nodbg-clang          10.3G   765G  10.3G  /usr/obj/BUILDs/main-amd64-nodbg-clang
zoptb/BUILDs/main-amd64-nodbg-clang-alt      37.2M   765G  37.2M  /usr/obj/BUILDs/main-amd64-nodbg-clang-alt
zoptb/BUILDs/main-amd64-nodbg-gccxtc         94.6M   765G  94.6M  /usr/obj/BUILDs/main-amd64-nodbg-gccxtc
zoptb/DESTDIRs                               4.33G   765G   104K  /usr/obj/DESTDIRs
zoptb/DESTDIRs/main-amd64-poud               2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud
zoptb/DESTDIRs/main-amd64-poud-bulk_a        2.16G   765G  2.16G  /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
zoptb/ROOT                                   13.1G   765G    96K  none
zoptb/ROOT/build_area_for-main-amd64         5.03G   765G  3.24G  none
zoptb/ROOT/main-amd64                        8.04G   765G  3.23G  none
zoptb/poudriere                              6.58G   765G   112K  /usr/local/poudriere
zoptb/poudriere/data                         6.58G   765G   128K  /usr/local/poudriere/data
zoptb/poudriere/data/.m                       112K   765G   112K  /usr/local/poudriere/data/.m
zoptb/poudriere/data/cache                   17.4M   765G  17.4M  /usr/local/poudriere/data/cache
zoptb/poudriere/data/images                    96K   765G    96K  /usr/local/poudriere/data/images
zoptb/poudriere/data/logs                    2.72G   765G  2.72G  /usr/local/poudriere/data/logs
zoptb/poudriere/data/packages                3.84G   765G  3.84G  /usr/local/poudriere/data/packages
zoptb/poudriere/data/wrkdirs                  112K   765G   112K  /usr/local/poudriere/data/wrkdirs
zoptb/poudriere/jails                          96K   765G    96K  /usr/local/poudriere/jails
zoptb/poudriere/ports                          96K   765G    96K  /usr/local/poudriere/ports
zoptb/tmp                                    68.5M   765G  68.5M  /tmp
zoptb/usr                                    35.1G   765G    96K  /usr
zoptb/usr/13_0R-src                          2.64G   765G  2.64G  /usr/13_0R-src
zoptb/usr/alt-main-src                         96K   765G    96K  /usr/alt-main-src
zoptb/usr/home                                181M   765G   181M  /usr/home
zoptb/usr/local                              5.08G   765G  5.08G  /usr/local
zoptb/usr/main-src                            833M   765G   833M  /usr/main-src
zoptb/usr/ports                              26.4G   765G  26.4G  /usr/ports
zoptb/usr/src                                  96K   765G    96K  /usr/src
zoptb/var                                    52.6M   765G    96K  /var
zoptb/var/audit                               356K   765G   356K  /var/audit
zoptb/var/crash                               128K   765G   128K  /var/crash
zoptb/var/db                                 49.7M   765G    96K  /var/db
zoptb/var/db/pkg                             49.4M   765G  49.4M  /var/db/pkg
zoptb/var/db/ports                            164K   765G   164K  /var/db/ports
zoptb/var/log                                1.61M   765G  1.61M  /var/log
zoptb/var/mail                                632K   765G   632K  /var/mail
zoptb/var/tmp                                 128K   765G   128K  /var/tmp

# poudriere jail -jmain-amd64-bulk_a -i
Jail name:         main-amd64-bulk_a
Jail version:      14.0-ALPHA2
Jail arch:         amd64
Jail method:       null
Jail mount:        /usr/obj/DESTDIRs/main-amd64-poud-bulk_a
Jail fs:
Jail updated:      2021-12-04 14:55:22
Jail pkgbase:      disabled



So, I set up another test, with some related information shown before,
during, and after. The sysctl output is from a different ssh session
than the one running the bulk -a.
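
Next time, if sampling a smaller set of the counters at fixed intervals
during the run would be more useful, I expect something like the
following in yet another ssh session would do (the 30 second interval
being an arbitrary choice on my part):

# while true; do date; sysctl vfs.numvnodes vfs.freevnodes vfs.vnode_alloc_sleeps vfs.cache.stats.heldvnodes; sleep 30; done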

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 0
vm.stats.vm.v_vnodepgsin: 272429
vm.stats.vm.v_vnodeout: 0
vm.stats.vm.v_vnodein: 12461
vfs.vnode_alloc_sleeps: 0
vfs.wantfreevnodes: 553452
vfs.freevnodes: 962766
vfs.vnodes_created: 2538980
vfs.numvnodes: 1056233
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 91878
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

# poudriere bulk -jmain-amd64-bulk_a -a
. . .
[00:01:34] Building 34042 packages using up to 32 builders
[00:01:34] Hit CTRL+t at any time to see build progress and stats
[00:01:34] [01] [00:00:00] Builder starting
[00:01:57] [01] [00:00:23] Builder started
[00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
[00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
[00:03:22] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:03:22] [02] [00:00:00] Builder starting
[00:03:22] [03] [00:00:00] Builder starting
. . .
[00:03:22] [31] [00:00:00] Builder starting
[00:03:22] [32] [00:00:00] Builder starting
[00:03:31] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
. . .

Note that only [01] makes progress: no new "Builder started"
notices appear. top shows 31 processes matching the pattern:
cpdup -i0 -o ref ??
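
If kernel stacks for all of the stuck cpdup processes would be useful,
I expect something like the following would capture them in one pass,
since they all match that pattern (procstat -kk being the way I know
of to get kernel stacks of running processes):

# procstat -kk $(pgrep -x cpdup)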

Then, while the 31 cpdup's were showing vlruwk most of the time:

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 582398
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34296
vfs.vnode_alloc_sleeps: 2994
vfs.wantfreevnodes: 553452
vfs.freevnodes: 2209662
vfs.vnodes_created: 12206299
vfs.numvnodes: 2214071
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 459
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

After waiting a while, with the cpdup's still mostly in vlruwk:

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 583527
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34396
vfs.vnode_alloc_sleeps: 8053
vfs.wantfreevnodes: 553452
vfs.freevnodes: 2210166
vfs.vnodes_created: 12212061
vfs.numvnodes: 2215106
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 497
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

^C[00:14:55] Error: Signal SIGINT caught, cleaning up and exiting

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 584474
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34591
vfs.vnode_alloc_sleeps: 17584
vfs.wantfreevnodes: 553452
vfs.freevnodes: 2210796
vfs.vnodes_created: 12222343
vfs.numvnodes: 2216564
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 539
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off

[main-amd64-bulk_a-default] [2023-08-23_13h58m08s] [sigint:] Queued: 34435 Built: 2     Failed: 0     Skipped: 35    Ignored: 358   Fetched: 0     Tobuild: 34040  Time: 00:14:36
[00:16:13] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-08-23_13h58m08s
[00:16:49] Cleaning up
load: 5.28  cmd: sh 77057 [vlruwk] 141.63r 0.00u 30.98s 28% 6932k
#0 0xffffffff80b76ebb at mi_switch+0xbb
#1 0xffffffff80bc960f at sleepq_timedwait+0x2f
#2 0xffffffff80b76610 at _sleep+0x1d0
#3 0xffffffff80c5b2dc at vn_alloc_hard+0x2ac
#4 0xffffffff80c50a12 at getnewvnode_reserve+0x92
#5 0xffffffff829afb12 at zfs_zget+0x22
#6 0xffffffff8299ca8d at zfs_dirent_lookup+0x16d
#7 0xffffffff8299cb5f at zfs_dirlook+0x7f
#8 0xffffffff829ac410 at zfs_lookup+0x350
#9 0xffffffff829a782a at zfs_freebsd_cachedlookup+0x6a
#10 0xffffffff80c368ad at vfs_cache_lookup+0xad
#11 0xffffffff80c3b6d8 at cache_fplookup_final_modifying+0x188
#12 0xffffffff80c38766 at cache_fplookup+0x356
#13 0xffffffff80c43fb2 at namei+0x112
#14 0xffffffff80c62e5b at kern_funlinkat+0x13b
#15 0xffffffff80c62d18 at sys_unlink+0x28
#16 0xffffffff83b8e583 at filemon_wrapper_unlink+0x13
#17 0xffffffff81049a79 at amd64_syscall+0x109

[00:26:28] Unmounting file systems
Exiting with status 1

# sysctl -a | grep vnode
kern.maxvnodes: 2213808
kern.ipc.umtx_vnode_persistent: 0
kern.minvnodes: 553452
vm.vnode_pbufs: 2048
vm.stats.vm.v_vnodepgsout: 22844
vm.stats.vm.v_vnodepgsin: 585384
vm.stats.vm.v_vnodeout: 890
vm.stats.vm.v_vnodein: 34798
vfs.vnode_alloc_sleeps: 27578
vfs.wantfreevnodes: 553452
vfs.freevnodes: 61362
vfs.vnodes_created: 20135479
vfs.numvnodes: 59860
vfs.cache.debug.vnodes_cel_3_failures: 0
vfs.cache.stats.heldvnodes: 208
debug.vnode_domainset: <NULL>
debug.sizeof.vnode: 448
debug.fail_point.status_fill_kinfo_vnode__random_path: off
debug.fail_point.fill_kinfo_vnode__random_path: off


For reference (gathered after the run):

# kldstat
Id Refs Address                Size Name
 1   95 0xffffffff80200000  274b308 kernel
 2    1 0xffffffff8294c000   5d5238 zfs.ko
 3    1 0xffffffff82f22000     7718 cryptodev.ko
 4    1 0xffffffff83b10000     3390 acpi_wmi.ko
 5    1 0xffffffff83b14000     3220 intpm.ko
 6    1 0xffffffff83b18000     2178 smbus.ko
 7    1 0xffffffff83b1b000     2240 cpuctl.ko
 8    1 0xffffffff83b1e000     3360 uhid.ko
 9    1 0xffffffff83b22000     4364 ums.ko
10    1 0xffffffff83b27000     33c0 usbhid.ko
11    1 0xffffffff83b2b000     3380 hidbus.ko
12    1 0xffffffff83b2f000     4d20 ng_ubt.ko
13    6 0xffffffff83b34000     abb8 netgraph.ko
14    2 0xffffffff83b3f000     a250 ng_hci.ko
15    4 0xffffffff83b4a000     2670 ng_bluetooth.ko
16    1 0xffffffff83b4d000     83a0 uftdi.ko
17    1 0xffffffff83b56000     4e58 ucom.ko
18    1 0xffffffff83b5b000     3360 wmt.ko
19    1 0xffffffff83b5f000     e268 ng_l2cap.ko
20    1 0xffffffff83b6e000    1bf68 ng_btsocket.ko
21    1 0xffffffff83b8a000     38f8 ng_socket.ko
22    1 0xffffffff83b8e000     3250 filemon.ko
23    1 0xffffffff83b92000     4758 nullfs.ko
24    1 0xffffffff83b97000     73c0 linprocfs.ko
25    3 0xffffffff83b9f000     be70 linux_common.ko
26    1 0xffffffff83bab000     3558 fdescfs.ko
27    1 0xffffffff83baf000    31b20 linux.ko
28    1 0xffffffff83be1000    2ed40 linux64.ko


Note that before the "Cleaning up" notice, vfs.freevnodes showed as
being around (for example) 2210796, but after "Exiting with status 1"
it showed 61362. vfs.vnodes_created behaves similarly, staying in the
ballpark of up to 12222343 and then jumping to 20135479. Likewise,
vfs.numvnodes went from 2216564 to 59860.
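
Also for what it is worth, the vn_alloc_hard frame in the earlier stack
seems consistent with the vnode counters: during the stuck period,
vfs.numvnodes sat just above kern.maxvnodes while vfs.vnode_alloc_sleeps
kept climbing (samples: during the run, later during the run, and just
after the ^C):

kern.maxvnodes:                     2213808
vfs.numvnodes:           2214071 -> 2215106 -> 2216564
vfs.vnode_alloc_sleeps:     2994 ->    8053 ->   17584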



Anything else I should gather and report as basic information?
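
One experiment I can try, if it seems worthwhile (just a guess on my
part, not something anyone has suggested): temporarily raise the vnode
limit before another bulk -a attempt and see whether the cpdup's still
pile up in vlruwk, e.g. something like:

# sysctl kern.maxvnodes=4427616

where 4427616 is simply double the current 2213808 figure (an arbitrary
choice on my part).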

===
Mark Millard
marklmi at yahoo.com



