From: Mark Millard
Date: Sat, 19 Aug 2023 17:33:47 -0700
Subject: Re: ZFS deadlock in 14 [USE_TMPFS=no poudriere messed up from the start, lots of "vlruwk"]
To: Current FreeBSD
Cc: Alexander Motin
In-Reply-To: <8D0C1422-CE60-4266-8051-2296C3E9B7D7@yahoo.com>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Aug 19, 2023, at 16:27, Mark Millard wrote:

> On Aug 19, 2023, at 15:41, Mark Millard wrote:
>
>> On Aug 19, 2023, at 13:41, Mark Millard wrote:
>>
>>> [I forgot to adjust USE_TMPFS for the purpose of the test.
>>> So I'll later be starting over.]
>>>
>>> . . .
>>
>> I finally got around to starting a from-scratch bulk -a
>> again (based on USE_TMPFS=no this time). This is with
>> 15107.patch and 15122.patch applied. This is a non-debug
>> kernel experiment.
>>
>> Interestingly it got:
>>
>> [00:01:34] [01] [00:00:00] Builder starting
>> [00:01:57] [01] [00:00:23] Builder started
>> [00:01:57] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.4
>> [00:03:09] [01] [00:01:12] Finished ports-mgmt/pkg | pkg-1.20.4: Success
>> [00:03:21] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
>> [00:03:21] [02] [00:00:00] Builder starting
>> [00:03:21] [03] [00:00:00] Builder starting
>> [00:03:21] [04] [00:00:00] Builder starting
>> [00:03:21] [05] [00:00:00] Builder starting
>> [00:03:21] [06] [00:00:00] Builder starting
>> [00:03:21] [07] [00:00:00] Builder starting
>> [00:03:22] [08] [00:00:00] Builder starting
>> [00:03:22] [09] [00:00:00] Builder starting
>> [00:03:22] [10] [00:00:00] Builder starting
>> [00:03:22] [11] [00:00:00] Builder starting
>> [00:03:22] [12] [00:00:00] Builder starting
>> [00:03:22] [13] [00:00:00] Builder starting
>> [00:03:22] [14] [00:00:00] Builder starting
>> [00:03:22] [15] [00:00:00] Builder starting
>> [00:03:22] [16] [00:00:00] Builder starting
>> [00:03:22] [17] [00:00:00] Builder starting
>> [00:03:22] [18] [00:00:00] Builder starting
>> [00:03:22] [19] [00:00:00] Builder starting
>> [00:03:22] [20] [00:00:00] Builder starting
>> [00:03:22] [21] [00:00:00] Builder starting
>> [00:03:22] [22] [00:00:00] Builder starting
>> [00:03:22] [23] [00:00:00] Builder starting
>> [00:03:22] [24] [00:00:00] Builder starting
>> [00:03:22] [25] [00:00:00] Builder starting
>> [00:03:22] [26] [00:00:00] Builder starting
>> [00:03:22] [27] [00:00:00] Builder starting
>> [00:03:22] [28] [00:00:00] Builder starting
>> [00:03:22] [29] [00:00:00] Builder starting
>> [00:03:22] [30] [00:00:00] Builder starting
>> [00:03:22] [31] [00:00:00] Builder starting
>> [00:03:22] [32] [00:00:00] Builder starting
>> [00:03:30] [01] [00:00:09] Finished print/indexinfo | indexinfo-0.3.1: Success
>> [00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
>>
>> and is still that way minutes later.
>>
>> ^T shows:
>>
>> [00:03:31] [01] [00:00:00] Building devel/gettext-runtime | gettext-runtime-0.22
>> load: 13.02 cmd: sh 2187 [vlruwk] 570.19r 0.62u 38.60s 9% 3948k
>> #0 0xffffffff80b7701b at mi_switch+0xbb
>> #1 0xffffffff80bc976f at sleepq_timedwait+0x2f
>> #2 0xffffffff80b76770 at _sleep+0x1d0
>> #3 0xffffffff80c5b435 at vn_alloc_hard+0x2a5
>> #4 0xffffffff80c50b72 at getnewvnode_reserve+0x92
>> #5 0xffffffff829b9b12 at zfs_zget+0x22
>> #6 0xffffffff829a6a8d at zfs_dirent_lookup+0x16d
>> #7 0xffffffff829a6b5f at zfs_dirlook+0x7f
>> #8 0xffffffff829b6410 at zfs_lookup+0x350
>> #9 0xffffffff829b182a at zfs_freebsd_cachedlookup+0x6a
>> #10 0xffffffff80c36a0d at vfs_cache_lookup+0xad
>> #11 0xffffffff80c45141 at vfs_lookup+0x581
>> #12 0xffffffff80c44238 at namei+0x238
>> #13 0xffffffff80c63b5e at kern_statat+0xee
>> #14 0xffffffff80c64237 at sys_fstatat+0x27
>> #15 0xffffffff81049a79 at amd64_syscall+0x109
>> #16 0xffffffff8101f11b at fast_syscall_common+0xf8
>> [main-amd64-bulk_a-default] [2023-08-19_15h14m10s] [parallel_build:] Queued: 34435 Built: 2 Failed: 0 Skipped: 35 Ignored: 358 Fetched: 0 Tobuild: 34040 Time: 00:10:52
>>  ID  TOTAL    ORIGIN                  PKGNAME              PHASE PHASE    TMPFS CPU%  MEM%
>> [01] 00:07:29 devel/gettext-runtime | gettext-runtime-0.22 build 00:06:32       25.4% 0%
>> [00:11:25] Logs: /usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-08-19_15h14m10s
>>
>> Note the 3:31->11:25 .
>>
>> Top is showing lots of "vlruwk".
>> For example:
>>
>> 362 0 root 40 0 27076Ki 13776Ki CPU19  19 4:23 0.00% cpdup -i0 -o ref 32
>> 349 0 root 53 0 27076Ki 13776Ki vlruwk 22 4:20 0.01% cpdup -i0 -o ref 31
>> 328 0 root 68 0 27076Ki 13804Ki vlruwk  8 4:30 0.01% cpdup -i0 -o ref 30
>> 304 0 root 37 0 27076Ki 13792Ki vlruwk  6 4:18 0.01% cpdup -i0 -o ref 29
>> 282 0 root 42 0 33220Ki 13956Ki vlruwk  8 4:33 0.01% cpdup -i0 -o ref 28
>> 242 0 root 56 0 27076Ki 13796Ki vlruwk  4 4:28 0.00% cpdup -i0 -o ref 27
>>
>> In other words, it is messed up from the start, not
>> just later.
>>
>> It does suggest that the dbg kernel should not end up with
>> resource problems: not much gets very far. So I'll
>> probably stop it, substitute the debug kernel, reboot,
>> and try again.
>
> Still for the nodbg kernel . . .
>
> The "vlruwk" processes do occasionally show a CPU??
> instead. Nothing seems stuck in only one STATE. (Live
> lock?)
>
> As for using the dbg kernel instead . . .
>
> Most of the time the processes are showing CPU?? and more
> progress is made in building, but basically with one builder.
> vlruwk does show up, gradually for a larger fraction
> of the time. ref 02 .. ref 32 are still in cpdup -i0 -o .
> *vnode is showing up some as well. No process looks to
> be stuck in just one of those. (Live lock?)
>
> The debug kernel is not reporting anything during this
> so far.
>
> (some time goes by)
>
> At this point vlruwk is fairly commonly what mostly displays
> for the cpdup's that are not finishing, but none are stuck
> in vlruwk .
>
> Looks like I should try without the 2 patches (15107 and
> 15122).
>

I reverted the 2 patches, rebuilt the kernels, installed
the nodbg one, rebooted, and tried again, still using
USE_TMPFS=no . . .

The cpdup's show the same general mix of vlruwk/CPU??/*vnode
activity. ref 02 .. ref 32 still do not complete,
leaving 1 builder active.
In other words, the patches did not make a difference that
I've noticed for what I'm reporting.

===
Mark Millard
marklmi at yahoo.com
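
For anyone wanting to quantify the vlruwk/CPU??/*vnode mix instead of eyeballing top, here is a minimal sketch in Python that tallies the STATE column for the cpdup processes. It is not something used in the report: the function name tally_states is hypothetical, and the column order (PID JID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND) is an assumption inferred from the sample rows quoted above, so it may need adjusting for other top display options.

```python
from collections import Counter

# Sample rows copied from the top output quoted in the message.
# Assumed column order (inferred from the sample, not confirmed):
# PID JID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND ...
SAMPLE = """\
362 0 root 40 0 27076Ki 13776Ki CPU19 19 4:23 0.00% cpdup -i0 -o ref 32
349 0 root 53 0 27076Ki 13776Ki vlruwk 22 4:20 0.01% cpdup -i0 -o ref 31
328 0 root 68 0 27076Ki 13804Ki vlruwk 8 4:30 0.01% cpdup -i0 -o ref 30
304 0 root 37 0 27076Ki 13792Ki vlruwk 6 4:18 0.01% cpdup -i0 -o ref 29
282 0 root 42 0 33220Ki 13956Ki vlruwk 8 4:33 0.01% cpdup -i0 -o ref 28
242 0 root 56 0 27076Ki 13796Ki vlruwk 4 4:28 0.00% cpdup -i0 -o ref 27
"""

STATE_FIELD = 7     # index of STATE under the assumed column order
COMMAND_FIELD = 11  # index of the command name under the same assumption


def tally_states(text, command="cpdup"):
    """Count how many processes named `command` are in each STATE."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and rows for other commands.
        if len(fields) <= COMMAND_FIELD or fields[COMMAND_FIELD] != command:
            continue
        state = fields[STATE_FIELD]
        # Fold CPU0, CPU19, ... into one bucket, matching the "CPU??" shorthand.
        if state.startswith("CPU"):
            state = "CPU??"
        counts[state] += 1
    return counts


if __name__ == "__main__":
    for state, n in tally_states(SAMPLE).most_common():
        print(f"{state:8} {n}")
```

Running the same tally over several saved top snapshots taken a few seconds apart would show whether any one cpdup stays in vlruwk across all of them or whether, as described above, the processes keep moving between states.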