List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm
Sender: owner-freebsd-arm@freebsd.org
Subject: Re: FYI: aarch64 main [so: 14] system hung up with a large amount of memory in use (given the RAM+SWAP configuration) but lots of swap left
Date: Fri, 19 Nov 2021 11:08:58 -0800
References: <50D9DD1F-6949-412B-AE86-46E6F0129E8B@yahoo.com>
To: freebsd-current, "freebsd-arm@freebsd.org"
In-Reply-To:
 <50D9DD1F-6949-412B-AE86-46E6F0129E8B@yahoo.com>
Reply-To: marklmi@yahoo.com
From: Mark Millard via freebsd-current

On 2021-Nov-13, at 03:40, Mark Millard wrote:

> On 2021-Nov-13, at 03:20, Mark Millard wrote:
>
>
>> While attempting to see if I could repeat a bugzilla report in a
>> somewhat different context, I had the system hang up to the
>> point that ^C and ^Z did not work and ^T did not echo out what
>> would be expected for poudriere (or even the kernel backtrace).
>> I was able to escape to ddb.
>>
>> The context was a Cortex-A72 based aarch64 system using:
>>
>> # poudriere jail -jmain-CA7 -i
>> Jail name: main-CA7
>> Jail version: 14.0-CURRENT
>> Jail arch: arm.armv7
>> Jail method: null
>> Jail mount: /usr/obj/DESTDIRs/main-CA7-poud
>> Jail fs:
>> Jail updated: 2021-06-27 17:58:33
>> Jail pkgbase: disabled
>>
>> # uname -apKU
>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 main-n250455-890cae197737-dirty: Thu Nov 4 13:43:17 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400040 1400040
>>
>> It is a non-debug build (but with symbols).
>>
>> 16 cortex-A72 cores, 64 GiBytes RAM, root on ZFS, 251904Mi swap,
>> USE_TMPFS=all in use. ALLOW_PARALLEL_JOBS= in use too.
>> (Mentioned only for context: I've no specific evidence if other
>> contexts would also have failed, say, USE+TMPFS="data" or UFS.)

Of course not a "+": USE_TMPFS="data"

>> When I looked around at the db> prompts I noticed one
>> oddity (I'm no expert at such inspections):
>>
>> db> show allchains
>> . . .
>> chain 92:
>>  thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>>  thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>>  thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>>  thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>>  thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>>  thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL
>> . . . (thousands of more instances of that line content,
>> I never found the last) . . .
>>
>> My patched top (that reports some "maximum observed" (MaxObs???)
>> figures) was showing (having hung up with the system):
>>
>> last pid: 18816; load averages: 10.11, 16.76, 18.73 MaxObs: 115.65, 103.13, 96.36 up 8+06:52:04 20:30:57
>> 324 threads: 17 running, 305 sleeping, 2 waiting, 147 MaxObsRunning
>> CPU: 2.8% user, 0.0% nice, 97.1% system, 0.0% interrupt, 0.0% idle
>> Mem: 19044Ki Active, 331776B Inact, 73728B Laundry, 6950Mi Wired, 69632B Buf, 558860Ki Free, 47709Mi MaxObsActive, 12556Mi MaxObsWired, 59622Mi MaxObs(Act+Wir+Lndry)
>> ARC: 2005Mi Total, 623319Ki MFU, 654020Ki MRU, 2048Ki Anon, 27462Ki Header, 745685Ki Other
>>      783741Ki Compressed, 3981Mi Uncompressed, 5.20:1 Ratio
>> Swap: 251904Mi Total, 101719Mi Used, 150185Mi Free, 40% Inuse, 3432Ki In, 3064Ki Out, 101719Mi MaxObsUsed, 101737Mi MaxObs(Act+Lndry+SwapUsed), 109816Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>>
>> (Based on the 20:30:57 time shown, it had been hung up for over
>> 2 hours when I got to it.)
>>
>> There were no console messages. /var/log/messages had its
>> last message at 18:57:52. No out-of-swap or such
>> messages.
>>
>>
>> I did get a dump via the db> prompt.
>>
>
> In retrying the poudriere-devel run experiment I'm
> getting various builds that are generating
> multi-GiByte log files (and growing) that have
> lines like:
>
> thread 'rustc' panicked at 'capacity overflow', library/alloc/src/raw_vec.rs:559:5
> stack backtrace:
> note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
>
> error: internal compiler error: unexpected panic
>
> note: the compiler unexpectedly panicked. this is a bug.
>
> note: we would appreciate a bug report:
> https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md
>
>
> note: rustc 1.55.0 running on armv7-unknown-freebsd
>
> note: compiler flags: -C embed-bitcode=no -C debuginfo=2 -C linker=cc --crate-type lib
>
> note: some of the compiler flags provided by cargo are hidden
>
> query stack during panic:
> #0 [trimmed_def_paths] calculating trimmed def paths
> #1 [lint_mod] linting module `transitions`
> #2 [analysis] running analysis passes on this crate
> end of query stack
> thread 'rustc' panicked at 'cannot panic during the backtrace function', library/std/src/../../backtrace/src/lib.rs:147:13
> stack backtrace:
>    0: 0x4710076c - ::fmt::h4428caffcb182c5b
>    1: 0x471c9d00 - core::fmt::write::h91f4a7678561fd61
>    2: 0x470e2180 -
>    3: 0x470ebd40 -
>    4: 0x470eb824 -
>    5: 0x41ed4848 -
>    6: 0x470ec690 - std::panicking::rust_panic_with_hook::h6bc4b7e83060df25
>    7: 0x47100f0c -
>    8: 0x47100900 -
>    9: 0x470ec374 -
> . . .
>   65: 0x470ee71c -
>   66: 0x401361bc -
>   67: 0x40135cd8 - pthread_create
>   68: 0x40138b9c - pthread_peekjoin_np
>   69: 0x40138b9c - pthread_peekjoin_np
>   70: 0x40138b9c - pthread_peekjoin_np
>   71: 0x40138b9c - pthread_peekjoin_np
>   72: 0x40138b9c - pthread_peekjoin_np
>   73: 0x40138b9c - pthread_peekjoin_np
> . . . massive repetition of pthread_peekjoin_np lines . . .
>
> (I used USE_TMPFS="data" to avoid tmpfs memory usage this
> time.)
>

In trying stress tests such as:

# stress --vm 16 --vm-bytes 16G --vm-stride 4096 --vm-hang 30

I've not reproduced a hang under root-on-ZFS (or root-on-UFS).

Using USE_TMPFS="data" in poudriere has not produced the hangup
in any bulk rebuild runs. (I eventually have to stop such
bulk runs because of the indefinitely growing files.)
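
One way to notice the indefinitely growing files before a bulk run has to be stopped by hand is to scan the log tree for oversized files. The following is only a minimal sketch under stated assumptions: LOG_ROOT is taken from the log path that appears in the poudriere status output later in this message, while THRESHOLD_BYTES (1 GiB) and the helper name find_large_files are hypothetical choices made for illustration, not anything poudriere provides.

```python
import os
import stat

# Assumptions for illustration only: the log root is the path shown in the
# poudriere status output in this message; the 1 GiB threshold is arbitrary.
LOG_ROOT = "/usr/local/poudriere/data/logs/bulk"
THRESHOLD_BYTES = 1 << 30


def find_large_files(root, threshold_bytes):
    """Return (size, path) pairs for regular files of at least
    threshold_bytes under root, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished between listing and stat
            if stat.S_ISREG(info.st_mode) and info.st_size >= threshold_bytes:
                hits.append((info.st_size, path))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    for size, path in find_large_files(LOG_ROOT, THRESHOLD_BYTES):
        print(f"{size / (1 << 30):8.2f} GiB  {path}")
```

Run periodically during a bulk build, this would list any log that has passed the threshold. With USE_TMPFS=all the same scan would have to be pointed at the builders' tmpfs mounts as well, whose locations are not given here.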

These, with the earlier, suggest that the tmpfs use via
USE_TMPFS=all in poudriere contributes to the hangup condition
involved, as do the indefinitely growing files in various
tmpfs instances for various builders.

Side notes:

Luckily, with the Optane PCIe or Optane U.2-via-M.2-adapter
based media, USE_TMPFS=all is only a little faster than
USE_TMPFS="data" for the bulk builds. Example from UFS context
test results that happen to be handy:

# poudriere status -a
=>> Warning: Looking up all matching builds. This may take a while.
SET PORTS   JAIL       BUILD                STATUS QUEUE BUILT FAIL SKIP IGNORE FETCH REMAIN TIME     LOGS
. . .
-   default 13_0R-CA72 2021-11-17_17h29m50s done     485   481    4    0      0     0      0 06:07:05 /usr/local/poudriere/data/logs/bulk/13_0R-CA72-default/2021-11-17_17h29m50s
-   default 13_0R-CA72 2021-11-18_12h42m09s done     485   481    4    0      0     0      0 06:02:27 /usr/local/poudriere/data/logs/bulk/13_0R-CA72-default/2021-11-18_12h42m09s
. . .

Using portable USB3 SSD media with USE_TMPFS="data" took noticeably
longer:

-   default 13_0R-CA72 2021-11-16_02h28m29s done     485   481    4    0      0     0      0 07:55:25 /usr/local/poudriere/data/logs/bulk/13_0R-CA72-default/2021-11-16_02h28m29s

These bulk runs do not involve the indefinitely growing files.
Much of the time goes into gcc11, llvm12, llvm13, and rust builds, for
example.

The builds use ALLOW_PARALLEL_JOBS= and had 16 builders, hence the
maximum observed load averages (via my personal variation of top):

USB3 SSD USE_TMPFS="data" test:
last pid: . . .; load averages: . . . MaxObs: 124.65, 106.26, 90.03

Optane USE_TMPFS="data" test:
last pid: . . .; load averages: . . . MaxObs: 99.60, 84.64, 78.15

Optane USE_TMPFS=all test:
last pid: . . .; load averages: . . . MaxObs: 113.79, 97.04, 84.88

The context is a 16 Cortex-A72 HoneyComb system, 64 GiBytes of RAM.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
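
As a rough check on "only a little faster" and "noticeably longer", the TIME column values from the three bulk runs listed above can be compared directly. This is a small arithmetic sketch only: the helper names are hypothetical, and since the message does not say which of the two roughly six hour runs used which USE_TMPFS setting, they are labeled by build date.

```python
def hms_to_seconds(text):
    """Convert an HH:MM:SS string from the TIME column to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def percent_longer(slower, faster):
    """How much longer `slower` took than `faster`, as a percentage."""
    return 100.0 * (slower - faster) / faster


# TIME values copied from the three bulk runs listed in the message.
run_nov17 = hms_to_seconds("06:07:05")
run_nov18 = hms_to_seconds("06:02:27")
run_usb3 = hms_to_seconds("07:55:25")

if __name__ == "__main__":
    print(f"Nov 17 vs Nov 18 run:   {percent_longer(run_nov17, run_nov18):.1f}% longer")
    print(f"USB3 SSD vs Nov 17 run: {percent_longer(run_usb3, run_nov17):.1f}% longer")
    print(f"USB3 SSD vs Nov 18 run: {percent_longer(run_usb3, run_nov18):.1f}% longer")
```

The two roughly six hour runs differ by 278 seconds (about 1.3%), while the USB3 SSD run took about 29.5% to 31.2% longer than either of them.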