From nobody Fri Jan 28 15:16:30 2022
Subject: Re: devel/llvm13 failed to reclaim memory on 8 GB Pi4 running -current [UFS context: used the whole swap space too]
From: Mark Millard
Date: Fri, 28 Jan 2022 07:16:30 -0800
To: bob prohaska
Cc: Free BSD
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On 2022-Jan-28, at 00:31, Mark Millard wrote:

> [Back to omitting Mark Johnston.]
>
> On 2022-Jan-27, at 23:40, Mark Millard wrote:
>
>> [Mark Johnston included: handling was different than under ZFS.]
>>
>> On 2022-Jan-27, at 22:35, Mark Millard wrote:
>>
>>> [Back to omitting Mark Johnston.]
>>>
>>> On 2022-Jan-27, at 22:00, Mark Millard wrote:
>>>
>>>> On 2022-Jan-27, at 21:55, Mark Millard wrote:
>>>>
>>>>> On 2022-Jan-27, at 17:43, Mark Millard wrote:
>>>>>
>>>>>> On 2022-Jan-27, at 15:30, bob prohaska wrote:
>>>>>>
>>>>>>> On Thu, Jan 27, 2022 at 02:21:44PM -0800, Mark Millard wrote:
>>>>>>>>
>>>>>>>> Okay. I just started a poudriere bulk devel/llvm13 build
>>>>>>>> in a ZFS context:
>>>>>>>>
>>>>>>>> . . .
>>>>>>>> [00:00:37] Pkg: +BE_AMDGPU -BE_FREEBSD +BE_NATIVE -BE_STANDARD +BE_WASM +CLANG +DOCS +EXTRAS -FLANG +LIT +LLD +LLDB +MLIR -OPENMP -PYCLANG
>>>>>>>> [00:00:37] New: +BE_AMDGPU -BE_FREEBSD -BE_NATIVE +BE_STANDARD +BE_WASM +CLANG +DOCS +EXTRAS +FLANG +LIT +LLD +LLDB +MLIR +OPENMP +PYCLANG
>>>>>>>> . . .
>>>>>>>> [00:01:27] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
>>>>>>>>
>>>>>>> Is this ARM hardware, or an emulator?
>>>>>>
>>>>>> 8 GiByte RPi4B, USB3 NVMe media with a ZFS partition. The content
>>>>>> is a slightly modified copy of the HoneyComb's PCIe slot Optane
>>>>>> media.
>>>>>>
>>>>>> The UFS-based 8 GiByte RPi4B is also based on copying from the
>>>>>> same Optane media, both for the system materials and various
>>>>>> ports/packages/poudriere related materials. (Not, necessarily,
>>>>>> other things.)
>>>>>>
>>>>>>> I've been using plain old make in /usr/ports/devel,
>>>>>>> might it be informative to try a poudriere build as well?
>>>>>>
>>>>>> The Pkg:, New:, and llvm13 lines I listed are poudriere(-devel)
>>>>>> output. I am doing my builds via poudriere, with ALLOW_PARALLEL_JOBS=
>>>>>> and USE_TMPFS="data" in use.
>>>>>>
>>>>>> I have a context in which almost all prerequisites had already
>>>>>> been built. (The change in options led to 2 very small ports
>>>>>> being built before devel/llvm13's build started in a builder.)
>>>>>>
>>>>>> (You might not have a jail that already has the prerequisites.)
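[A minimal sketch of poudriere.conf lines matching what is described
above, for anyone wanting to reproduce this kind of setup. The variable
names are the ones quoted in the thread; the values shown are my
assumptions, not taken from Mark's actual configuration:

  # /usr/local/etc/poudriere.conf (sketch; values are illustrative)
  USE_TMPFS="data"          # tmpfs only for builder bookkeeping data
  ALLOW_PARALLEL_JOBS=yes   # assumed value; the thread only shows the name

A bulk run for just this port would then look something like
"poudriere bulk -j <jailname> -p <portstree> devel/llvm13", with the
jail and ports-tree names being whatever the local setup uses.]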
>>>>>>
>>>>>>> One would expect the added overhead to increase memory use.
>>>>>>
>>>>>> Well, from the context I started in, only devel/llvm13 is being
>>>>>> built once it starts. Once it gets to the build phase (after
>>>>>> dependencies and such are set up), there is not much overhead
>>>>>> because the only activity is the one builder and it is only
>>>>>> building llvm13 --via make in the builder. At the end there
>>>>>> would be extra activity as poudriere finishes up. During the
>>>>>> build phase, I only expect minor overhead from poudriere
>>>>>> monitoring the build logs and such.
>>>>>>
>>>>>> I expect that the mere fact that a poudriere jail is in use
>>>>>> for the builder to execute in does not contribute to
>>>>>> significantly increasing the system's memory use or changing
>>>>>> the system's memory use pattern.
>>>>>>
>>>>>> There are some other differences in my context. The instances
>>>>>> of main [so: 14] are non-debug builds (but with symbols). The
>>>>>> builds are optimized for the RPi4B (and others) via use of
>>>>>> -mcpu=cortex-a72. My /usr/main-src/ does have some
>>>>>> personal changes in it. (Some messaging about the kills is
>>>>>> part of that.)
>>>>>>
>>>>>> The RPi4B's are using:
>>>>>>
>>>>>> over_voltage=6
>>>>>> arm_freq=2000
>>>>>> sdram_freq_min=3200
>>>>>> force_turbo=1
>>>>>>
>>>>>> (There are heat-sinks, fans, and good power supplies.)
>>>>>>
>>>>>> The media in use are USB3 1 TB Samsung Portable SSD T7
>>>>>> Touch units. I'm unlikely to see "swap_pager: indefinite
>>>>>> wait buffer:" notices if the cause were the media
>>>>>> performance. (You have spinning rust, if I remember right.)
>>>>>>
>>>>>> I do not have a monitoring script making a huge log file
>>>>>> during the build. So less is competing for media access
>>>>>> or leading to other overheads. (But, as I remember, you
>>>>>> have gotten the problem without having such a script
>>>>>> running.)
>>>>>
>>>>> ZFS context:
>>>>>
>>>>> Well, the ZFS example used up all the swap space, according
>>>>> to my patched top. This means that my setting of
>>>>> vm.pfault_oom_attempts is not appropriate for this context:
>>>>>
>>>>> # Delay when persistent low free RAM leads to
>>>>> # Out Of Memory killing of processes:
>>>>> vm.pageout_oom_seq=120
>>>>> #
>>>>> # For plenty of swap/paging space (will not
>>>>> # run out), avoid pageout delays leading to
>>>>> # Out Of Memory killing of processes:
>>>>> vm.pfault_oom_attempts=-1
>>>>> #
>>>>> # For possibly insufficient swap/paging space
>>>>> # (might run out), increase the pageout delay
>>>>> # that leads to Out Of Memory killing of
>>>>> # processes (showing defaults at the time):
>>>>> #vm.pfault_oom_attempts= 3
>>>>> #vm.pfault_oom_wait= 10
>>>>> # (The product of the two is the total wait, but
>>>>> # there are other potential tradeoffs in the factors
>>>>> # multiplied, even for nearly the same total.)
>>>>>
>>>>> I'll need to retest with something more like the
>>>>> commented-out vm.pfault_oom_attempts and
>>>>> vm.pfault_oom_wait figures in order to see the
>>>>> intended handling of the test case.
>>>>>
>>>>> What are you using for each of:
>>>>> vm.pageout_oom_seq ?
>>>>> vm.pfault_oom_attempts ?
>>>>> vm.pfault_oom_wait ?
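[A concrete way to inspect and adjust these, as a sketch: the sysctl
names are the stock FreeBSD ones discussed above, and the example
values are illustrative, taken from the commented-out defaults:

  # Show the current OOM-related tunables:
  sysctl vm.pageout_oom_seq vm.pfault_oom_attempts vm.pfault_oom_wait

  # Example: allow page-fault-path OOM kills after roughly
  # 3 * 10 = 30 seconds of stalled page-ins (the product of the
  # two vm.pfault_oom_* values), while keeping a long delay before
  # free-RAM-shortage kills:
  sysctl vm.pageout_oom_seq=120
  sysctl vm.pfault_oom_attempts=3
  sysctl vm.pfault_oom_wait=10

The same assignments can be made persistent via /etc/sysctl.conf.]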
>>>>>
>>>>> For reference, for ZFS:
>>>>>
>>>>> last pid:   380;  load averages:  1.50,  3.07,  3.93  MaxObs:  5.71,  4.92,  4.76    up 0+07:23:14  21:23:43
>>>>> 68 threads:   1 running, 65 sleeping, 2 waiting, 19 MaxObsRunning
>>>>> CPU: 13.3% user,  0.0% nice,  4.9% system,  0.9% interrupt, 80.8% idle
>>>>> Mem: 4912Mi Active, 167936B Inact, 1193Mi Laundry, 1536Mi Wired, 40960B Buf, 33860Ki Free, 6179Mi MaxObsActive, 6476Mi MaxObsWired, 7820Mi MaxObs(Act+Wir+Lndry)
>>>>> ARC: 777086Ki Total, 132156Ki MFU, 181164Ki MRU, 147456B Anon, 5994Ki Header, 457626Ki Other
>>>>>      59308Ki Compressed, 254381Ki Uncompressed, 4.29:1 Ratio
>>>>> Swap: 8192Mi Total, 8192Mi Used, 0K Free, 100% Inuse, 19572Ki In, 3436Ki Out, 8192Mi MaxObsUsed, 14458Mi MaxObs(Act+Lndry+SwapUsed), 15993Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>>>>>
>>>>> Console:
>>>>> (Looks like I misremembered adjusting the "out of swap space"
>>>>> wording for the misnomer message.)
>>>>>
>>>>> swap_pager: out of swap space
>>>>> swp_pager_getswapspace(18): failed
>>>>> swap_pager: out of swap space
>>>>> swp_pager_getswapspace(1): failed
>>>>> swp_pager_getswapspace(1): failed
>>>>> swap_pager: out of swap space
>>>>> swp_pager_getswapspace(1): failed
>>>>> swp_pager_getswapspace(7): failed
>>>>> swp_pager_getswapspace(24): failed
>>>>> swp_pager_getswapspace(3): failed
>>>>> swp_pager_getswapspace(18): failed
>>>>> swp_pager_getswapspace(17): failed
>>>>> swp_pager_getswapspace(1): failed
>>>>> swp_pager_getswapspace(12): failed
>>>>> swp_pager_getswapspace(23): failed
>>>>> swp_pager_getswapspace(30): failed
>>>>> swp_pager_getswapspace(3): failed
>>>>> swp_pager_getswapspace(2): failed
>>>>>
>>>>> . . . Then a bunch of time with no messages . . .
>>>>>
>>>>> swp_pager_getswapspace(5): failed
>>>>> swp_pager_getswapspace(28): failed
>>>>>
>>>>> . . . Then a bunch of time with no messages . . .
>>>>>
>>>>> Top again:
>>>>>
>>>>> last pid:   382;  load averages:  0.73,  1.00,  2.40  MaxObs:  5.71,  4.92,  4.76    up 0+07:31:26  21:31:55
>>>>> 70 threads:   1 running, 65 sleeping, 4 waiting, 19 MaxObsRunning
>>>>> CPU:  0.1% user,  0.0% nice,  5.6% system,  0.0% interrupt, 94.3% idle
>>>>> Mem: 3499Mi Active, 4096B Inact, 2612Mi Laundry, 1457Mi Wired, 40960B Buf, 34676Ki Free, 6179Mi MaxObsActive, 6476Mi MaxObsWired, 7820Mi MaxObs(Act+Wir+Lndry)
>>>>> ARC: 777154Ki Total, 135196Ki MFU, 178330Ki MRU, 5995Ki Header, 457631Ki Other
>>>>>      59520Ki Compressed, 254231Ki Uncompressed, 4.27:1 Ratio
>>>>> Swap: 8192Mi Total, 8192Mi Used, 0K Free, 100% Inuse, 409600B In, 4096B Out, 8192Mi MaxObsUsed, 14458Mi MaxObs(Act+Lndry+SwapUsed), 15993Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>>>>>
>>>>> I then used top to kill ninja and the 4 large compiles
>>>>> that were going on. I'll change:
>>>>>
>>>>> vm.pfault_oom_attempts
>>>>> vm.pfault_oom_wait
>>>>>
>>>>> and reboot and start over.
>>>>>
>>>>> I expect that the ongoing UFS test will likely end up
>>>>> similarly and that similar adjustments and restarts
>>>>> will be needed because of actually running out of
>>>>> swap space.
>>>>
>>>> I forgot to report:
>>>>
>>>> [00:01:27] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
>>>> [07:49:17] [01] [07:47:50] Finished devel/llvm13 | llvm13-13.0.0_3: Failed: build
>>>>
>>>> So the swap space filling happened somewhat before
>>>> that much time had passed.
>>>
>>> ZFS context:
>>>
>>> I will not start the next bulk until just before bed. I do not
>>> want it to fail while I'm not monitoring it.
>>>
>>> The last 4 compile starts reported in the log are:
>>>
>>> [ 64% 4725/7265] . . . flang/lib/Evaluate/fold.cpp
>>> [ 65% 4726/7265] . . . flang/lib/Evaluate/fold-character.cpp
>>> [ 65% 4727/7265] . . . flang/lib/Evaluate/check-expression.cpp
>>> [ 65% 4728/7265] . . . flang/lib/Evaluate/fold-designator.cpp
>>>
>>> But it is possible that one or more of these had completed and
>>> some earlier one(s) was(were) still running.
>>>
>>> So, if you do not need the Fortran compiler, you can probably
>>> avoid the problem by setting the options for devel/llvm13 to
>>> not build flang.
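[If you want to try that, a sketch of the usual ways to turn off the
FLANG option. The option name is the one shown in the Pkg:/New: lines
earlier; the exact commands are an illustration, not something posted
in this thread:

  # Interactively, via the options dialog for the port:
  poudriere options devel/llvm13

  # Or non-interactively, in the make.conf that poudriere uses
  # (commonly /usr/local/etc/poudriere.d/make.conf):
  devel_llvm13_UNSET+=FLANG

Either way, the next bulk run should build llvm13 without the Fortran
front end.]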
>>
>> UFS context:
>>
>> . . .;  load averages: . . .  MaxObs:  5.47,  4.99,  4.82
>> . . . threads: . . ., 14 MaxObsRunning
>> . . .
>> Mem: . . ., 6457Mi MaxObsActive, 1263Mi MaxObsWired, 7830Mi MaxObs(Act+Wir+Lndry)
>> Swap: 8192Mi Total, 8192Mi Used, 0K Free, 100% Inuse, 8192Mi MaxObsUsed, 14758Mi MaxObs(Act+Lndry+SwapUsed), 16017Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>>
>> Console:
>>
>> swap_pager: out of swap space
>> swp_pager_getswapspace(4): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(4): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(9): failed
>> swp_pager_getswapspace(4): failed
>> swp_pager_getswapspace(7): failed
>> swp_pager_getswapspace(29): failed
>> swp_pager_getswapspace(9): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(4): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(10): failed
>>
>> . . . Then some time with no messages . . .
>>
>> vm_pageout_mightbe_oom: kill context: v_free_count: 7740, v_inactive_count: 1
>> Jan 27 23:01:07 CA72_UFS kernel: pid 57238 (c++), jid 3, uid 0, was killed: failed to reclaim memory
>> swp_pager_getswapspace(2): failed
>>
>> Note: The "vm_pageout_mightbe_oom: kill context:" notice is one
>> of the few parts of an old reporting patch Mark J. had supplied
>> (long ago) that still fits in the modern code (or that I was
>> able to keep updated enough to fit, anyway). It is another of
>> the personal updates that I keep in my source trees, such as in
>> /usr/main-src/ .
>>
>> diff --git a/sys/vm/vm_pageout.c b/sys/vm/vm_pageout.c
>> index 36d5f3275800..f345e2d4a2d4 100644
>> --- a/sys/vm/vm_pageout.c
>> +++ b/sys/vm/vm_pageout.c
>> @@ -1828,6 +1828,8 @@ vm_pageout_mightbe_oom(struct vm_domain *vmd, int page_shortage,
>>           * start OOM.  Initiate the selection and signaling of the
>>           * victim.
>>           */
>> +        printf("vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u\n",
>> +            vmd->vmd_free_count, vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt);
>>          vm_pageout_oom(VM_OOM_MEM);
>>
>>          /*
>>
>> Again, I'd used vm.pfault_oom_attempts inappropriately
>> for running out of swap (although with UFS it did do
>> a kill fairly soon):
>>
>> # Delay when persistent low free RAM leads to
>> # Out Of Memory killing of processes:
>> vm.pageout_oom_seq=120
>> #
>> # For plenty of swap/paging space (will not
>> # run out), avoid pageout delays leading to
>> # Out Of Memory killing of processes:
>> vm.pfault_oom_attempts=-1
>> #
>> # For possibly insufficient swap/paging space
>> # (might run out), increase the pageout delay
>> # that leads to Out Of Memory killing of
>> # processes (showing defaults at the time):
>> #vm.pfault_oom_attempts= 3
>> #vm.pfault_oom_wait= 10
>> # (The product of the two is the total wait, but
>> # there are other potential tradeoffs in the factors
>> # multiplied, even for nearly the same total.)
>>
>> I'll change:
>>
>> vm.pfault_oom_attempts
>> vm.pfault_oom_wait
>>
>> and reboot --and start the bulk somewhat before
>> going to bed.
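[After such a reboot, a quick way to watch whether kills or swap
exhaustion recur, without carrying a local kernel patch. This is a
sketch using standard FreeBSD tools, a suggestion rather than anything
from the thread; the kill reasons and the swap_pager messages go to the
console and the kernel message buffer:

  # Look for OOM kills and swap exhaustion after the fact:
  dmesg | grep -E 'was killed:|swap_pager|swp_pager_getswapspace'

  # Or watch the messages arrive live via syslog:
  tail -f /var/log/messages]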
>>
>> For reference:
>>
>> [00:02:13] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
>> [07:37:05] [01] [07:34:52] Finished devel/llvm13 | llvm13-13.0.0_3: Failed: build
>>
>> [ 65% 4728/7265] . . . flang/lib/Evaluate/fold-designator.cpp
>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-integer.cpp
>> FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-integer.cpp.o
>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-logical.cpp
>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-complex.cpp
>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-real.cpp
>>
>> So the flang/lib/Evaluate/fold-integer.cpp one was the one killed.
>>
>> Notably, the specific sources being compiled are different
>> than in the ZFS context report. But this might be because
>> of my killing ninja explicitly in the ZFS context, before
>> killing the running compilers.
>>
>> Again, using the options to avoid building the Fortran
>> compiler probably avoids such memory use --if you do not
>> need the Fortran compiler.
>
> Notes on the 2 types of "out of swap space" notices
> in the main [so: 14] code:
>
> One place that can lead to a variation of such notices is:
>
> void
> vm_pageout_oom(int shortage)
> {
> . . .
>         if (bigproc != NULL) {
>                 switch (shortage) {
>                 case VM_OOM_MEM:
>                         reason = "failed to reclaim memory";
>                         break;
>                 case VM_OOM_MEM_PF:
>                         reason = "a thread waited too long to allocate a page";
>                         break;
>                 case VM_OOM_SWAPZ:
>                         reason = "out of swap space";
>                         break;
>                 default:
>                         panic("unknown OOM reason %d", shortage);
>                 }
>                 if (vm_panic_on_oom != 0 && --vm_panic_on_oom == 0)
>                         panic("%s", reason);
>                 PROC_LOCK(bigproc);
>                 killproc(bigproc, reason);
> . . .
> }
>
> It is the VM_OOM_SWAPZ path that produces the misnomer
> text during killproc(bigproc, reason).
>
> But there is another place that produces an accurate message
> that does not, of itself, mention kills:
>
> static void
> swp_sizecheck(void)
> {
>
>         if (swap_pager_avail < nswap_lowat) {
>                 if (swap_pager_almost_full == 0) {
>                         printf("swap_pager: out of swap space\n");
>                         swap_pager_almost_full = 1;
>                 }
>         } else {
>                 swap_pager_full = 0;
>                 if (swap_pager_avail > nswap_hiwat)
>                         swap_pager_almost_full = 0;
>         }
> }
>
> It looks like you were getting the 2nd (accurate) form
> of message generation. At this point kills have not
> started and the message is not about kills at all.
>
> Thus my initial notes about misnomer messages for your
> specific context were wrong. Sorry.

A typo in the commands prevented the poudriere bulks
from starting last night. So I've just started the UFS
one (no sleep command first this time). Later I'll
start the ZFS one.

===
Mark Millard
marklmi at yahoo.com