Date: Sat, 15 Jun 2024 02:23:53 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 279742] 14.1-RELEASE hangs compiling pspp requiring reboot
Message-ID: <bug-279742-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279742

            Bug ID: 279742
           Summary: 14.1-RELEASE hangs compiling pspp requiring reboot
           Product: Base System
           Version: 14.0-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: dgilbert@eicat.ca

Created attachment 251458
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=251458&action=edit
core.txt of crash.

I clicked on 14.0-STABLE because 14.1-RELEASE was not yet a choice.

I upgraded my poudriere box to 14.1, created a new jail for 14.1, and
launched into a "-a" build pretty much immediately after returning from
BSDCan.

The build machine is a Threadripper 1900X with 128G of RAM and 140TB of
disk in RAID-Z2. It has run poudriere builds stably, almost constantly,
since I upgraded it to its current state about 3 years ago.

After the first poudriere hang, I instrumented things like temperatures.
None of these spiked, but the hang happened again and again. After a
while, it was clear that compiling pspp was the trigger. Note that pspp
would have compiled under 14.0 less than a week before (i.e., just before
BSDCan).

I had to get debugging into my kernel and learn how to make it drop into
the debugger. That took a couple of tries, with the machine repeatedly
crashing while pspp was building all the while. top was up in the window
I keep open, and this was the last top output on display.

last pid: 31372;  load averages: 21.72, 32.46, 41.6670    up 0+04:34:59  20:36:48
220 processes: 12 running, 192 sleeping, 2 zombie, 14 waiting
CPU: 21.7% user,  0.0% nice, 40.4% system,  0.0% interrupt, 37.8% idle
Mem: 32M Active, 264K Inact, 124G Wired, 604M Free
ARC: 16G Total, 230M MFU, 334M MRU, 22M Anon, 15G Header, 191M Other
     107M Compressed, 460M Uncompressed, 4.28:1 Ratio
Swap: 256G Total, 98G Used, 158G Free, 38% Inuse, 2868K In, 3612K Out

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C    TIME    WCPU COMMAND
61759 root        2 166  i10    60M  2616K vofflo  3   40:43  75.20% pspp-output
 6367 root        2 166  i10    88M  2604K vofflo 15   36:06  73.63% pspp-output
15409 root        2 166  i10    92M  2608K vofflo  2   33:53  72.64% pspp-output
81893 root        2 166  i10    86M  2600K CPU12  12   34:05  72.04% pspp-output
78622 root        2 166  i10    57M  2588K CPU11  11   28:42  69.19% pspp-output
25531 root        2 166  i10    95M  2616K CPU5    5   27:00  68.84% pspp-output
81789 root        2 166  i10    42M  2584K CPU6    6   23:16  65.11% pspp-output
87988 root        2 166  i10   102M  2596K CPU7    7   20:57  64.28% pspp-output
11364 root        2 166  i10    57M  2612K CPU10  10   19:50  64.14% pspp-output
23538 root        2 166  i10    66M  2604K CPU11  11   21:09  63.94% pspp-output
61379 root        2 166  i10    93M  2624K tmpfs   4   21:10  63.46% pspp-output
85836 root        2 166  i10    74M  2608K CPU14  14   19:19  62.69% pspp-output
58400 root        2 166  i10    76M  2440K RUN     5   13:26  56.27% pspp-output
58294 root        2 166  i10    72M  2444K CPU1    1   14:44  56.15% pspp-output
70050 root        2 166  i10    48M  2440K RUN     1   12:46  56.10% pspp-output
 2561 root        1  20    0   303M  1728K select 12    1:09   0.40% smbd
 2502 postgres    1  20    0   173M  1012K select  4    0:13   0.16% postgres
65067 root        1  20    0    17M  1452K CPU9    9    0:21   0.14% top
 2577 root        1  20    0    17M  1216K select  9    0:40   0.13% tmux
72517 root        6 166  i10  2310M   452K uwait   1    9:40   0.07% ghc-9.6.4
 8903 root       45 166  i10    34G  4716K uwait   0   12:30   0.06% java
 2503 postgres    1  20    0    31M   684K select  7    0:08   0.05% postgres
37351 root        1  20    0    22M   328K select  9    0:03   0.05% sshd
 2190 root        1  20    0    14M   172K select  6    0:00   0.03% syslogd
72294 root       11 166  i10   345M  1664K kqread  4    0:02   0.01% node
 2294 root        1  20    0   280M   228K select 11    0:00   0.01% httpd
 1192 root        1  20    0    18M   340K select  0    0:27   0.01% mountd
 1162 ntpd        1  20    0    23M   520K select 12    0:01   0.01% ntpd
95259 root        1  20    0    12M   328K ttyin   4    0:03   0.01% cu
 1749 uwsgi       1  20    0    57M   412K kqread 12    0:01   0.00% uwsgi-3.8
36420 root        1  20    0    19M   544K select  5    0:01   0.00% minicom
 1307 root        1  20    0   164M   460K kqread  8    0:00   0.00% php-fpm
 1253 root      128  68    0    12M  2316K rpcsvc 11    0:13   0.00% nfsd
91926 root        2 166  i10    74M  2908K pfault 15  123:07   0.00% pspp-output
72530 root       11 166  i10  7498M   836K pfault  5   99:32   0.00% node
46100 root       18 166  i10   261G   932K uwait   4   18:33   0.00% dotnet
73028 root        1 166  i10   165M  4096B WAIT   11    3:56   0.00% <pkg-static>
 2955 root        1 166  i10    15M  4096B wait   13    3:24   0.00% <sh>
93083 root        1 166  i10   195M  4096B WAIT   13    3:14   0.00% <pkg-static>
22537 root        6 166  i10   298M    17M uwait  10    1:22   0.00% ld.lld
 2588 root        1 166  i10    22M   224K select  6    1:02   0.00% sh
24257 root        1 166  i10   145M  4096B WAIT   12    0:42   0.00% <pkg-static>
 1301 www         1  20    0    27M  4096B WAIT    5    0:32   0.00% <nginx>
90301 root       14 166  i10   261G   260K uwait   4    0:24   0.00% dotnet

It's worth noting that switching virtual terminals (Alt-F<n>) still works
after this happens, but no other input is recognized: I can't hit return
in a window, and shells going through the machine to others don't
continue their output.

When it happened this time, I dropped to KDB and dumped. core.txt is
attached.

NOTE: this is repeatable. I have been through the cycle 6 times so far.

-- 
You are receiving this mail because:
You are the assignee for the bug.