Date: Mon, 4 Sep 2023 22:06:45 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Alexander Motin <mav@FreeBSD.org>
Cc: dev-commits-src-main@freebsd.org, Current FreeBSD <freebsd-current@freebsd.org>
Subject: Re: An attempted test of main's "git: 2ad756a6bbb3" "merge openzfs/zfs@95f71c019" that did not go as planned
Message-ID: <4E888EC8-4B06-4139-90DF-59AFBA700A5F@yahoo.com>
In-Reply-To: <5724006A-F4C4-42CA-98A9-90C5EB914F5E@yahoo.com>
References: <F20BD354-2E10-4CD9-88FB-99DB1E405A52.ref@yahoo.com> <F20BD354-2E10-4CD9-88FB-99DB1E405A52@yahoo.com> <c9c5096b-33d9-1cb6-d446-97b99f324fce@FreeBSD.org> <673A446E-6F94-451E-910F-079F678C5289@yahoo.com> <2BDD30B5-6248-4EC3-83C8-0499E0717D1D@yahoo.com> <C78808EB-FDE2-432D-8309-A7017DFC1BCE@yahoo.com> <69028684-c2cd-8f0c-617f-d0763c08dbe4@FreeBSD.org> <8D832756-6754-4D1E-AE8D-E716FE08F747@yahoo.com> <5605506b-5059-fb72-3e5a-741863d54444@FreeBSD.org> <5724006A-F4C4-42CA-98A9-90C5EB914F5E@yahoo.com>
On Sep 4, 2023, at 18:39, Mark Millard <marklmi@yahoo.com> wrote:

> On Sep 4, 2023, at 10:05, Alexander Motin <mav@FreeBSD.org> wrote:
>
>> On 04.09.2023 11:45, Mark Millard wrote:
>>> On Sep 4, 2023, at 06:09, Alexander Motin <mav@FreeBSD.org> wrote:
>>>> per_txg_dirty_frees_percent is directly related to the delete
>>>> delays we see here. You are forcing ZFS to commit transactions
>>>> at each 5% of the dirty ARC limit, which is 5% of 10% of memory
>>>> size. I haven't looked at that code recently, but I guess setting
>>>> it too low can make ZFS commit transactions too often, increasing
>>>> write inflation for the underlying storage. I would propose you
>>>> restore the default and try again.
>>> While this machine is different, the original problem was worse than
>>> the issue here: the load average was less than 1 for most of the
>>> parallel bulk build when 30 was used. The fraction of time waiting
>>> was much longer than with 5. If I understand right, for a given type
>>> of context both too high and too low a value can lead to increased
>>> elapsed time, and getting it set near optimal is a non-obvious
>>> exploration.
>>
>> IIRC this limit was modified several times since originally
>> implemented. Maybe it could benefit from another look if the default
>> 30% is not good. It would be good if generic ZFS issues like this
>> were reported to OpenZFS upstream to be visible to a wider public.
>> Unfortunately I have several other projects I must work on, so if it
>> is not a regression I can't promise I'll take it right now; anybody
>> else is welcome.
>
> As I understand it, there are contexts where 5 is inappropriate
> and 30 works fairly well: no good single answer as to what
> value range will avoid problems.
>
>>> An overall point for the goal of my activity is: what makes a
>>> good test context for checking if ZFS is again safe to use?
>>> Maybe other tradeoffs make, say, 4 hardware threads more
>>> reasonable than 32.
>>
>> Thank you for your testing. The best test is one that nobody else
>> runs. It also correlates with the topic of "safe to use", which also
>> depends on what it is used for. :)
>
> Looks like use of a M.2 Samsung SSD 960 PRO 1TB with a
> non-debug FreeBSD build is suitable for the bulk -a -J128
> test (no ALLOW_MAKE_JOBS variants enabled, USE_TMPFS=no in
> use) on the 32 hardware thread system. (The swap partition
> in use is from the normal environment's PCIe Optane media.)
> The %idle, the load averages, and %user stayed reasonable
> in a preliminary test. One thing it does introduce is trim
> management (both available and potentially useful). (Optane
> media does not support or need it.) No
> per_txg_dirty_frees_percent adjustment involved (still 5).
>
> I've learned to not use ^T for fear of /bin/sh aborting
> and messing up poudriere's context. So I now monitor with:
>
> # poudriere status -b
>
> in a separate ssh session.
>
> I'll note that I doubt I'd try for a complete bulk -a .
> I'd probably stop it if I noticed that the number of
> active builders dropped off for a notable time (normal
> waiting for prerequisites appearing to be why).

So much for that idea. It has reached a state of staying under
3500 w/s and up to 4.5 ms/w (normally above 2 ms/w), with %busy
wandering in the range 85% to 101%. Lots of top output shows
tx->tx waits. There is some read and other activity as well. Of
course, the kBps figures are larger than in the earlier USB3
context (larger kB figures).
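For reference, in case anyone wants to repeat or vary this: the
knobs involved here are, as far as I know, a runtime-writable
sysctl (the stock OpenZFS parameter name) plus per-pool zpool
commands. A minimal sketch, with zamd64 being the pool in this
test, the 5 shown matching this test's setting, and 30 being the
OpenZFS default:

# sysctl vfs.zfs.per_txg_dirty_frees_percent
vfs.zfs.per_txg_dirty_frees_percent: 5
# sysctl vfs.zfs.per_txg_dirty_frees_percent=30
# zpool trim -w zamd64
# zpool get autotrim zamd64
# zpool set autotrim=on zamd64

(-w makes zpool trim wait for completion; autotrim is a pool
property that can be toggled on a live pool.) For rough scale,
taking the "5% of 10% of memory size" above literally on a
hypothetical 32 GiByte machine: the dirty maximum would default
to about 3.2 GiBytes, so queued frees would force a transaction
commit at about 160 MiBytes (ignoring any separate cap on the
dirty maximum).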
It reached about 1350 port->package builds over the first hour
after the 2nd "Builder started". autotrim is off. Doing a
"zpool trim -w zamd64" leads to . . . larger w/s figures during
the process. So there is more exploring to do at some point.
Possibly:

autotrim
per_txg_dirty_frees_percent

For now, I'm just running "zpool trim -w zamd64" once in a
while so the process continues better.

Still no evidence of deadlocks. No evidence of builds failing
from corruptions.

. . .

At around the end of the 2nd hour: 2920 or so built.

Still no evidence of deadlocks. No evidence of builds failing
from corruptions.

. . .

I've turned on autotrim without stopping the bulk build.

. . .

At around the end of the 3rd hour: 4080 or so built.

Still no evidence of deadlocks. No evidence of builds failing
from corruptions.

Looks like the %idle has been high for a significant time. I
think I'll stop this specific test and clean things out. Looks
like lang/guile* are examples of ports not respecting the lack
of ALLOW_MAKE_JOBS use.

Hmm. The ^C ended up with:

^C[03:41:07] Error: Signal SIGINT caught, cleaning up and exiting
[main-amd64-bulk_a-default] [2023-09-04_17h49m28s] [sigint:] Queued: 34588 Built: 4331 Failed: 12 Skipped: 303 Ignored: 335 Fetched: 0 Tobuild: 29607 Time: 03:41:05
[03:41:07] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-04_17h49m28s
[03:41:19] [95] [00:06:59] Finished databases/php80-pdo_sqlite | php80-pdo_sqlite-8.0.30: Failed: build-depends
[03:41:20] [60] [00:10:53] Finished www/php82-session | php82-session-8.2.9: Success
[03:41:20] [19] [00:06:42] Finished databases/pecl-memcache@php80 | php80-pecl-memcache-8.2: Failed: build-depends
[03:41:23] [62] [00:06:40] Finished net-mgmt/icingaweb2-module-incubator@php81 | icingaweb2-module-incubator-php81-0.20.0: Success
[03:41:24] Cleaning up
exit: cannot open ./var/run/116_nohang.pid: No such file or directory
exit: cannot open ./var/run/93_nohang.pid: No such file or directory

and JID 253's processes are stuck in the STOP state, along with
the related JID 0 processes. These are for x11-toolkits/libXaw .
I do not know how long it might have been stuck.
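The kernel stack listings that follow are in the style of
procstat's kernel-stack display; a sketch of capturing such a
listing, assuming base FreeBSD's procstat and using the stuck
PID from below as the example:

# procstat -kk 3896
# procstat -kk -a

(the first shows kernel stacks for one PID, the second for all
processes).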
/usr/local/libexec/poudriere/sh -e /usr/local/share/poudriere/bulk.sh -J128 -jmain-amd64-bulk_a -a

is stuck in the killpg state:

 3896 101784 sh - mi_switch+0xbb _sx_xlock_hard+0x3e1 killpg1+0x193 kern_kill+0x23c amd64_syscall+0x109 fast_syscall_common+0xf8

The STOPped processes, listed in most-recent-birth-time-first order:

86826 634820 sh - mi_switch+0xbb thread_suspend_switch+0xc3 cursig+0x7b9 ast_sig+0x1df ast_handler+0x88 ast+0x20 doreti_ast+0x1c
86825 321240 sh - mi_switch+0xbb thread_suspend_check+0x260 sig_intr+0x4c fork1+0x186 sys_fork+0x54 amd64_syscall+0x109 fast_syscall_common+0xf8
86550 308525 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 pipe_read+0x420 dofileread+0x86 sys_read+0xbe amd64_syscall+0x109 fast_syscall_common+0xf8
 3998 637224 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3976 298566 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3974 309336 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3962 577230 gmake - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3958 635598 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3915 668148 make - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
 3912 638108 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_wait_sig+0x9 _sleep+0x1e3 kern_wait6+0x51c sys_wait4+0x7d amd64_syscall+0x109 fast_syscall_common+0xf8
46142 212631 sh - mi_switch+0xbb sleepq_catch_signals+0x2ab sleepq_timedwait_sig+0x12 _cv_timedwait_sig_sbt+0x10c seltdwait+0x75 kern_select+0x95e sys_select+0x57 amd64_syscall+0x109 fast_syscall_common+0xf8

===
Mark Millard
marklmi at yahoo.com