Date: Mon, 24 Nov 2025 19:56:56 -0800
From: bob prohaska
To: freebsd-arm@freebsd.org
Subject: Re: Arm v7 RPi2 -current unresponsive to debugger escape during buildworld

On Mon, Nov 24, 2025 at 06:07:26PM -0800, bob prohaska wrote:
> A few minutes ago a Pi2 running buildworld for -current locked up again, with no
> response to the debugger escape.
>
> The system was swapping fairly hard but not stuck, maybe 600 MB in use, eventually
> swap use declined but in minutes it got stuck with top displaying:
>
> last pid: 51520;  load averages:  2.82,  2.96,  2.96    up 1+02:48:27  16:27:58
> 57 processes:  3 running, 54 sleeping
> CPU: 66.4% user,  0.0% nice, 16.0% system,  0.3% interrupt, 17.4% idle
> Mem: 183M Active, 540M Inact, 416K Laundry, 175M Wired, 98M Buf, 19M Free
> Swap: 2048M Total, 23M Used, 2025M Free, 1% Inuse
>
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
> 51497 root          5  59    0   352M   208M uwait    3   0:08 161.51% ld.lld
> 51518 root          1 101    0   167M    71M CPU1     1   0:03  87.52% cc
> 51520 root          1  59    0   167M    72M RUN      2   0:03  67.88% cc
> 11811 root          1   0    0  6724K  1456K CPU0     0   5:51   0.46% top
>  2047 root          1   0    0  4676K   704K select   0   1:08   0.09% powerd
>  2206 bob           1   0    0    14M  1212K select   0   0:46   0.06% sshd-session
>  2119 root          1   9    0    14M  2320K select   1   1:27   0.00% sshd
>
> The over-100% utilization for cpu 3 looks somewhat implausible.
> Might it suggest anything significant? I haven't seen this from
> top in a long time, so relatively speaking it's new behavior.
>
> There were no console warnings of any kind.

A few minutes after a restart of the -j3 buildworld that the machine
had been working on, a spontaneous debugger excursion occurred:

dev = da0s2d, block = 47297583, fs = /usr
panic: ffs_blkfree_cg: freeing free frag
cpuid = 2
time = 1764039108
KDB: stack backtrace:
db_trace_self() at db_trace_self
	 pc = 0xc0626ee4  lr = 0xc0076968 (db_trace_self_wrapper+0x30)
	 sp = 0xc4eeda50  fp = 0xc4eedb68
db_trace_self_wrapper() at db_trace_self_wrapper+0x30
	 pc = 0xc0076968  lr = 0xc0307a60 (vpanic+0x140)
	 sp = 0xc4eedb70  fp = 0xc4eedb90
	 r4 = 0x00000100  r5 = 0x00000000
	 r6 = 0xc0775b24  r7 = 0xc0b7a8e4
vpanic() at vpanic+0x140
	 pc = 0xc0307a60  lr = 0xc0307920 (vpanic)
	 sp = 0xc4eedb98  fp = 0xc4eedb9c
	 r4 = 0x00000000  r5 = 0xd76da900
	 r6 = 0x00003a80  r7 = 0x00000007
	 r8 = 0x00003a87  r9 = 0xc5f3a7d8
	r10 = 0xd76cd000
vpanic() at vpanic
	 pc = 0xc0307920  lr = 0xc0586354 (ffs_blkfree_cg+0x7a8)
	 sp = 0xc4eedba4  fp = 0xc4eedc28
	 r4 = 0x00000007  r5 = 0x00003a87
	 r6 = 0xc5f3a7d8  r7 = 0xd76cd000
	 r8 = 0xc4eedb9c  r9 = 0xc0307920
	r10 = 0xc4eedba4
ffs_blkfree_cg() at ffs_blkfree_cg+0x7a8
	 pc = 0xc0586354  lr = 0xc0581ce8 (ffs_blkfree+0x100)
	 sp = 0xc4eedc30  fp = 0xc4eedc90
	 r4 = 0x00001000  r5 = 0x00000004
	 r6 = 0x00000000  r7 = 0x02d1b42f
	 r8 = 0xc4eedce0  r9 = 0x0169436a
	r10 = 0x00000000
ffs_blkfree() at ffs_blkfree+0x100
	 pc = 0xc0581ce8  lr = 0xc05b1cd0 (freework_freeblock+0x790)
	 sp = 0xc4eedc98  fp = 0xc4eedd00
	 r4 = 0x0169436a  r5 = 0xc4eedce0
	 r6 = 0x00001000  r7 = 0x00000001
	 r8 = 0xc078f34c  r9 = 0xdd44f3d0
	r10 = 0xd76da900
freework_freeblock() at freework_freeblock+0x790
	 pc = 0xc05b1cd0  lr = 0xc05a30fc (handle_workitem_freeblocks+0x1f8)
	 sp = 0xc4eedd08  fp = 0xc4eedd50
	 r4 = 0xdd44f380  r5 = 0xd734d4c8
	 r6 = 0xd76da900  r7 = 0xc07a28e4
	 r8 = 0xd734d480  r9 = 0xd734d4c0
	r10 = 0xffffffff
handle_workitem_freeblocks() at handle_workitem_freeblocks+0x1f8
	 pc = 0xc05a30fc  lr = 0xc059b79c (process_worklist_item+0x22c)
	 sp = 0xc4eedd58  fp = 0xc4eedda0
	 r4 = 0x00000004  r5 = 0xc078f34c
	 r6 = 0xdd44f380  r7 = 0xd76da900
	 r8 = 0xd76da900  r9 = 0xc4eedd60
	r10 = 0xffffffff
process_worklist_item() at process_worklist_item+0x22c
	 pc = 0xc059b79c  lr = 0xc05963c8 (softdep_process_worklist+0xc4)
	 sp = 0xc4eedda8  fp = 0xc4eeddd8
	 r4 = 0x00000000  r5 = 0xc078f34c
	 r6 = 0x0000000a  r7 = 0x00000028
	 r8 = 0xd76da900  r9 = 0x00000000
	r10 = 0xc4ff6a80
softdep_process_worklist() at softdep_process_worklist+0xc4
	 pc = 0xc05963c8  lr = 0xc0599f1c (softdep_flush+0x130)
	 sp = 0xc4eedde0  fp = 0xc4eede18
	 r4 = 0xc4ff6a80  r5 = 0x00000001
	 r6 = 0x00200000  r7 = 0x00000000
	 r8 = 0xc078f34c  r9 = 0xc4ff6a88
	r10 = 0xd76da900
softdep_flush() at softdep_flush+0x130
	 pc = 0xc0599f1c  lr = 0xc02bb8cc (fork_exit+0xa0)
	 sp = 0xc4eede20  fp = 0xc4eede38
	 r4 = 0xc4eede40  r5 = 0xc4fa1800
	 r6 = 0xc0599dec  r7 = 0xc4f74e40
	 r8 = 0xc4ff6a80  r9 = 0xc0b8d184
	r10 = 0xc4f72c00
fork_exit() at fork_exit+0xa0
	 pc = 0xc02bb8cc  lr = 0xc06296f4 (swi_exit)
	 sp = 0xc4eede40  fp = 0x00000000
	 r4 = 0xc0599dec  r5 = 0xc4ff6a80
	 r6 = 0x7ff747b2  r7 = 0x00002710
	 r8 = 0xc10051d0 r10 = 0xc4f72c00
swi_exit() at swi_exit
	 pc = 0xc06296f4  lr = 0xc06296f4 (swi_exit)
	 sp = 0xc4eede40  fp = 0x00000000
KDB: enter: panic
[ thread pid 18 tid 100086 ]
Stopped at      kdb_enter+0x54: ldrb    r15, [r15, r15, ror r15]!
db>

The machine had been idling for a couple of hours, so it probably
finished its background fsck before buildworld was restarted. This is
the first time in recent memory that buildworld has ended in the
debugger, if in fact buildworld was the cause.

Thanks for reading,

bob prohaska