Date: Sat, 13 Nov 2021 03:40:06 -0800 From: Mark Millard via freebsd-current <freebsd-current@freebsd.org> To: freebsd-current <freebsd-current@freebsd.org>, "freebsd-arm@freebsd.org" <arm@freebsd.org> Subject: Re: FYI: aarch64 main [so: 14] system hung up with a large amount of memory in use (given the RAM+SWAP configuration) but lots of swap left Message-ID: <50D9DD1F-6949-412B-AE86-46E6F0129E8B@yahoo.com> In-Reply-To: <D08880FF-514B-455B-8E6A-101A7DF558F6@yahoo.com> References: <D08880FF-514B-455B-8E6A-101A7DF558F6@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 2021-Nov-13, at 03:20, Mark Millard <marklmi at yahoo.com> wrote: > While attempting to see if I could repeat a bugzilla report in a > somewhat different context, I has the system hang up to the > point that ^C and ^Z did not work and ^T did not echo out what > would be expected for poudriere (or even the kernel backtrace). > I was able to escape to ddb. >=20 > The context was Cortex-A72 based aarch64 system using: >=20 > # poudriere jail -jmain-CA7 -i > Jail name: main-CA7 > Jail version: 14.0-CURRENT > Jail arch: arm.armv7 > Jail method: null > Jail mount: /usr/obj/DESTDIRs/main-CA7-poud > Jail fs: =20 > Jail updated: 2021-06-27 17:58:33 > Jail pkgbase: disabled >=20 > # uname -apKU > FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 = main-n250455-890cae197737-dirty: Thu Nov 4 13:43:17 PDT 2021 = root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm6= 4.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400040 1400040 >=20 > It is a non-debug build (but with symbols). >=20 > 16 cortex-A72 cores, 64 GiBytes RAM, root on ZFS, 251904Mi swap, > USE_TMPFS=3Dall in use. ALLOW_PARALLEL_JOBS=3D in use too. > (Mentioned only for context: I've no specific evidence if other > contexts would also have failed, say, USE+TMPFS=3D"data" or UFS.) >=20 > When I looked around at the db> prompts I noticed one > oddity (I'm no expert at such inspections): >=20 > db> show allchains > . . . > chain 92: > thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL > thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL > thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL > thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL > thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL > thread 100671 (pid 15928, make) is blocked on lockmgr 0%A%0EXCL > . . . (thousands of more instances of that line content, > I never found the last) . . . >=20 > My patched top (that reports some "maximum observed" (MaxObs???) > figures) was showing (having hung up with the system): >=20 > last pid: 18816; load averages: 10.11, 16.76, 18.73 MaxObs: 115.65, = 103.13, 96.36 = up 8+06:52:04 20:30:57 > 324 threads: 17 running, 305 sleeping, 2 waiting, 147 MaxObsRunning > CPU: 2.8% user, 0.0% nice, 97.1% system, 0.0% interrupt, 0.0% idle > Mem: 19044Ki Active, 331776B Inact, 73728B Laundry, 6950Mi Wired, = 69632B Buf, 558860Ki Free, 47709Mi MaxObsActive, 12556Mi MaxObsWired, = 59622Mi MaxObs(Act+Wir+Lndry) > ARC: 2005Mi Total, 623319Ki MFU, 654020Ki MRU, 2048Ki Anon, 27462Ki = Header, 745685Ki Other > 783741Ki Compressed, 3981Mi Uncompressed, 5.20:1 Ratio > Swap: 251904Mi Total, 101719Mi Used, 150185Mi Free, 40% Inuse, 3432Ki = In, 3064Ki Out, 101719Mi MaxObsUsed, 101737Mi = MaxObs(Act+Lndry+SwapUsed), 109816Mi MaxObs(Act+Wir+Lndry+SwapUsed) >=20 > (Based on the 20:30:57 time shown, it had been hung up for over > 2 hours when I got to it.) >=20 > There were no console messages. /var/log/messages had its > last message at 18:57:52. No out-of-swap or such > messages. >=20 >=20 > I did get a dump via the db> prompt. >=20 In retrying the poudriere-devel run expiriment I'm getting various builds that are generating multi-GiByte log files (and growing) that have lines like: thread 'rustc' panicked at 'capacity overflow', = library/alloc/src/raw_vec.rs:559:5 stack backtrace: note: Some details are omitted, run with `RUST_BACKTRACE=3Dfull` for a = verbose backtrace. error: internal compiler error: unexpected panic note: the compiler unexpectedly panicked. this is a bug. note: we would appreciate a bug report:=20 = https://github.com/rust-lang/rust/issues/new?labels=3DC-bug%2C+I-ICE%2C+T-= compiler&template=3Dice.md note: rustc 1.55.0 running on armv7-unknown-freebsd note: compiler flags: -C embed-bitcode=3Dno -C debuginfo=3D2 -C = linker=3Dcc --crate-type lib note: some of the compiler flags provided by cargo are hidden query stack during panic: #0 [trimmed_def_paths] calculating trimmed def paths #1 [lint_mod] linting module `transitions` #2 [analysis] running analysis passes on this crate end of query stack thread 'rustc' panicked at 'cannot panic during the backtrace function', = library/std/src/../../backtrace/src/lib.rs:147:13 stack backtrace: 0: 0x4710076c - <std::sys_common::backtrace::_print::DisplayBacktrace = as core::fmt::Display>::fmt::h4428caffcb182c5b 1: 0x471c9d00 - core::fmt::write::h91f4a7678561fd61 2: 0x470e2180 - <unknown> 3: 0x470ebd40 - <unknown> 4: 0x470eb824 - <unknown> 5: 0x41ed4848 - <unknown> 6: 0x470ec690 - = std::panicking::rust_panic_with_hook::h6bc4b7e83060df25 7: 0x47100f0c - <unknown> 8: 0x47100900 - <unknown> 9: 0x470ec374 - <unknown> . . . 65: 0x470ee71c - <unknown> 66: 0x401361bc - <unknown> 67: 0x40135cd8 - pthread_create 68: 0x40138b9c - pthread_peekjoin_np 69: 0x40138b9c - pthread_peekjoin_np 70: 0x40138b9c - pthread_peekjoin_np 71: 0x40138b9c - pthread_peekjoin_np 72: 0x40138b9c - pthread_peekjoin_np 73: 0x40138b9c - pthread_peekjoin_np . . . massive repitition of pthread_peekjoin_np lines . . . (I used USE_TMPFS=3D"data" to avoid tmpfs memory usage this time.) =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?50D9DD1F-6949-412B-AE86-46E6F0129E8B>