Date: Mon, 2 Dec 2024 23:38:23 -0800
From: Mark Millard <marklmi@yahoo.com>
To: "mmel@freebsd.org" <mmel@FreeBSD.org>
Cc: FreeBSD ARM List <freebsd-arm@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject: Re: Official armv7 PkgBase kernel-NODEBUG installation's USB2 boot gets "Fatal kernel mode data abort: 'Alignment Fault' on write" very early, at least on an OrangePi+ 2ed
Message-ID: <D8C9F4E8-242D-4106-A55B-EDA986E1CEA5@yahoo.com>
In-Reply-To: <4c29d5cb-0a31-4131-a3a9-846fd4ce926f@FreeBSD.org>
References: <EE03BBF6-30A9-4716-A97D-1B21C6643519@yahoo.com> <74468E62-5A74-4FB4-94F6-599E5BA3A9A1@yahoo.com> <B2AD5C3E-B142-472D-8909-186DBFC4FEFE@yahoo.com> <4c29d5cb-0a31-4131-a3a9-846fd4ce926f@FreeBSD.org>
Top post identifying a new context:

Now that stable/14 is based on LLVM19, stable/14 is broken
like main [so: 15] was, at least in part. Some MFC activity
looks to be required in order to boot armv7 via stable/14
now. More may be required.

The failure looks like:

. . .
mmc1: <MMC/SD bus> on aw_mmc1
mmc1: No compatible cards found on bus
aw_mmc1: Spurious interrupt - no active request, rint: 0x00000004
mmc2: <MMC/SD bus> on aw_mmc0
mmcsd1: 32GB <SDHC SL32G 8.0 SN 006A919A MFG 02/2015 by 3 SD> at mmc2 50.0MHz/4bit/32768-block
mmc2: Failed to set VCCQ for card at relative address 43690
uhub0: 1 port with 1 removable, self powered
uhub2: 1 port with 1 removable, self powered
uhub5: 1 port with 1 removable, self powered
uhub8: 1 port with 1 removable, self powered
Root mount waiting for: usbus3
Fatal kernel mode data abort: 'Alignment Fault' on write
trapframe: 0xc6b7dc10
FSR=00000801, FAR=db0b901b, spsr=20000013
r0 =db0b9000, r1 =00000000, r2 =00000006, r3 =00000024
r4 =db058c80, r5 =00000000, r6 =00000001, r7 =00000006
r8 =c6b7dd20, r9 =c0b324fc, r10=c08ef8dc, r11=c6b7dcb8
r12=00000000, ssp=c6b7dca0, slr=c019f774, pc =c019f524

panic: Fatal abort
cpuid = 1
time = 3
KDB: stack backtrace:
Fatal kernel mode data abort: 'Translation Fault (L1)' on read
trapframe: 0xc6b7d970
FSR=00000005, FAR=a3e89ab0, spsr=200001d3
r0 =c6b7da24, r1 =00000001, r2 =a3e89aad, r3 =63622c6d
r4 =c0866e44, r5 =f3bea0b1, r6 =0000a776, r7 =81000000
r8 =c0813294, r9 =c0b229c4, r10=c6b7db1c, r11=c6b7da18
r12=c6b7dad8, ssp=c6b7da00, slr=c0665720, pc =c066974c

panic: Fatal abort
cpuid = 1
time = 3
KDB: stack backtrace:
Fatal kernel mode data abort: 'Translation Fault (L1)' on read
trapframe: 0xc6b7d6f0
FSR=00000005, FAR=a3e89ab0, spsr=200001d3
r0 =c6b7d7a4, r1 =00000001, r2 =a3e89aad, r3 =63622c6d
r4 =c0866e44, r5 =f3bea0b1, r6 =0000a776, r7 =81000000
r8 =c0813294, r9 =c0b229c4, r10=c6b7d89c, r11=c6b7d798
r12=c6b7d858, ssp=c6b7d780, slr=c0665720, pc =c066974c

panic: Fatal abort
cpuid = 1
time = 3
KDB: stack backtrace:
Fatal kernel mode data abort: 'Translation Fault (L1)' on read
trapframe: 0xc6b7d470
FSR=00000005, FAR=a3e89ab0, spsr=200001d3
r0 =c6b7d524, r1 =00000001, r2 =a3e89aad, r3 =63622c6d
r4 =c0866e44, r5 =f3bea0b1, r6 =0000a776, r7 =81000000
r8 =c0813294, r9 =c0b229c4, r10=c6b7d61c, r11=c6b7d518
r12=c6b7d5d8, ssp=c6b7d500, slr=c0665720, pc =c066974c

After that the L1 translation fault repeats over and over.

On Nov 8, 2024, at 04:49, Michal Meloun <mmel@freebsd.org> wrote:

> On 08.11.2024 4:15, Mark Millard wrote:
>> [I narrowed the artifact kernel range for the change in the type of
>> failure that happens.]
>> On Nov 7, 2024, at 17:43, Mark Millard <marklmi@yahoo.com> wrote:
>>> [The change to LLVM 19 is what leads to the 'Alignment
>>> Fault' on write failure. Details later below.]
>>>
>>> On Nov 7, 2024, at 01:42, Mark Millard <marklmi@yahoo.com> wrote:
>>>
>>>> Note: Unfortunately, the panics here are too early for a
>>>> dump device to be available.
>>>>=20 >>>> Context started PkgBase upgrade from: >>>>=20 >>>> # uname -apKU >>>> FreeBSD OPiP2E-RPi2v1p1 15.0-CURRENT FreeBSD 15.0-CURRENT = main-n272821-37798b1d5dd1 GENERIC-NODEBUG arm armv7 1500025 1500025 >>>>=20 >>>> Installed packages to be UPGRADED: >>>> FreeBSD-dtb: 15.snap20241009161500 -> 15.snap20241028121139 = [base] >>>> FreeBSD-kernel-generic: 15.snap20241011221604 -> = 15.snap20241106134422 [base] >>>> FreeBSD-kernel-generic-dbg: 15.snap20241011221604 -> = 15.snap20241106134422 [base] >>>> FreeBSD-kernel-generic-mmccam: 15.snap20241011221604 -> = 15.snap20241106134422 [base] >>>> FreeBSD-kernel-generic-mmccam-dbg: 15.snap20241011221604 -> = 15.snap20241106134422 [base] >>>> FreeBSD-kernel-generic-nodebug: 15.snap20241011221604 -> = 15.snap20241106134422 [base] >>>> FreeBSD-kernel-generic-nodebug-dbg: 15.snap20241011221604 -> = 15.snap20241106134422 [base] >>>> FreeBSD-src-sys: 15.snap20241011221604 -> = 15.snap20241106160110 [base] >>>>=20 >>>> (Those were installed but the FreeBSD-dtb had linux 6.4 >>>> dtb files, not the 6.8 ones. 6.8 ones from a personal build >>>> were copied to where they need to be. I've separately >>>> reported the 6.4 vs. 6.8 issue.) >>>>=20 >>>> # ~/pkgbase-snapshot-list.sh >>>> Via pkg-static info -C -x '^FreeBSD-' . . . >>>> 1 FreeBSD-*-15.snap20241106160110 >>>> 6 FreeBSD-*-15.snap20241106134422 >>>> 1 FreeBSD-*-15.snap20241028121139 >>>> 3 FreeBSD-*-15.snap20241011221604 >>>> 2 FreeBSD-*-15.snap20241011210446 >>>> 38 FreeBSD-*-15.snap20241011182434 >>>> 4 FreeBSD-*-15.snap20241011073851 >>>> 5 FreeBSD-*-15.snap20241010141501 >>>> 1 FreeBSD-*-15.snap20241010120743 >>>> 296 FreeBSD-*-15.snap20241009161500 >>>> Instead via /var/cache/pkg/*.snap*.pkg . . . 
>>>> 1 FreeBSD-*-15.snap20241106160110 >>>> 6 FreeBSD-*-15.snap20241106134422 >>>> 1 FreeBSD-*-15.snap20241028121139 >>>> 10 FreeBSD-*-15.snap20241011221604 >>>> 2 FreeBSD-*-15.snap20241011210446 >>>> 38 FreeBSD-*-15.snap20241011182434 >>>> 4 FreeBSD-*-15.snap20241011073851 >>>> 5 FreeBSD-*-15.snap20241010141501 >>>> 1 FreeBSD-*-15.snap20241010120743 >>>> 297 FreeBSD-*-15.snap20241009161500 >>>>=20 >>>>=20 >>>> The failure (kernel-GENERIC-NODEBUG): >>>>=20 >>>> . . . >>>> Root mount waiting for: usbus3 CAM >>>> Fatal kernel mode data abort: 'Alignment Fault' on write >>>> trapframe: 0xc6c9ac10 >>>> FSR=3D00000801, FAR=3Ddb23209b, spsr=3D20000013 >>>> r0 =3Ddb232080, r1 =3D00000000, r2 =3D00000006, r3 =3D00000024 >>>> r4 =3Ddb19e280, r5 =3D00000000, r6 =3D00000001, r7 =3D00000006 >>>> r8 =3Dc6c9ad20, r9 =3Dc0b7973c, r10=3Dc092074c, r11=3Dc6c9acb8 >>>> r12=3D00000000, ssp=3Dc6c9aca0, slr=3Dc01b01d8, pc =3Dc01aff88 >>>>=20 >>>> panic: Fatal abort >>>> cpuid =3D 1 >>>> time =3D 3 >>>> KDB: stack backtrace: >>>> db_trace_self() at db_trace_self >>>> pc =3D 0xc0667004 lr =3D 0xc0078630 = (db_trace_self_wrapper+0x30) >>>> sp =3D 0xc6c9a9c8 fp =3D 0xc6c9aae0 >>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30 >>>> pc =3D 0xc0078630 lr =3D 0xc0328db8 (vpanic+0x140) >>>> sp =3D 0xc6c9aae8 fp =3D 0xc6c9ab08 >>>> r4 =3D 0x00000100 r5 =3D 0x00000000 >>>> r6 =3D 0xc084d1f1 r7 =3D 0xc0b69a94 >>>> vpanic() at vpanic+0x140 >>>> pc =3D 0xc0328db8 lr =3D 0xc0328c78 (vpanic) >>>> sp =3D 0xc6c9ab10 fp =3D 0xc6c9ab14 >>>> r4 =3D 0xc6c9ac10 r5 =3D 0x00000013 >>>> r6 =3D 0xdb23209b r7 =3D 0x00000001 >>>> r8 =3D 0x00000801 r9 =3D 0x00000013 >>>> r10 =3D 0xdb23209b >>>> vpanic() at vpanic >>>> pc =3D 0xc0328c78 lr =3D 0xc068c8e8 (abort_align) >>>> sp =3D 0xc6c9ab1c fp =3D 0xc6c9ab48 >>>> r4 =3D 0x00000001 r5 =3D 0x00000801 >>>> r6 =3D 0x00000013 r7 =3D 0xdb23209b >>>> r8 =3D 0xc6c9ab14 r9 =3D 0xc0328c78 >>>> r10 =3D 0xc6c9ab1c >>>> abort_align() at abort_align >>>> pc =3D 0xc068c8e8 
lr =3D 0xc068c958 (abort_align+0x70) >>>> sp =3D 0xc6c9ab50 fp =3D 0xc6c9ab68 >>>> r4 =3D 0xc6d21c00 r10 =3D 0xdb23209b >>>> abort_align() at abort_align+0x70 >>>> pc =3D 0xc068c958 lr =3D 0xc068c5e0 (abort_handler+0x430) >>>> sp =3D 0xc6c9ab70 fp =3D 0xc6c9ac08 >>>> r4 =3D 0x00000000 r10 =3D 0xdb23209b >>>> abort_handler() at abort_handler+0x430 >>>> pc =3D 0xc068c5e0 lr =3D 0xc0669868 (exception_exit) >>>> sp =3D 0xc6c9ac10 fp =3D 0xc6c9acb8 >>>> r4 =3D 0xdb19e280 r5 =3D 0x00000000 >>>> r6 =3D 0x00000001 r7 =3D 0x00000006 >>>> r8 =3D 0xc6c9ad20 r9 =3D 0xc0b7973c >>>> r10 =3D 0xc092074c >>>> exception_exit() at exception_exit >>>> pc =3D 0xc0669868 lr =3D 0xc01b01d8 = (usb_msc_auto_quirk+0xfc) >>>> sp =3D 0xc6c9aca0 fp =3D 0xc6c9acb8 >>>> r0 =3D 0xdb232080 r1 =3D 0x00000000 >>>> r2 =3D 0x00000006 r3 =3D 0x00000024 >>>> r4 =3D 0xdb19e280 r5 =3D 0x00000000 >>>> r6 =3D 0x00000001 r7 =3D 0x00000006 >>>> r8 =3D 0xc6c9ad20 r9 =3D 0xc0b7973c >>>> r10 =3D 0xc092074c r12 =3D 0x00000000 >>>> bbb_command_start() at bbb_command_start+0x4c >>>> pc =3D 0xc01aff88 lr =3D 0xc01b01d8 = (usb_msc_auto_quirk+0xfc) >>>> sp =3D 0xc6c9acc0 fp =3D 0xc6c9acf8 >>>> r4 =3D 0xdb16d800 r5 =3D 0xdb19e280 >>>> r6 =3D 0x00000001 r10 =3D 0xc092074c >>>> usb_msc_auto_quirk() at usb_msc_auto_quirk+0xfc >>>> pc =3D 0xc01b01d8 lr =3D 0xc01a4bd8 = (usb_alloc_device+0x9c4) >>>> sp =3D 0xc6c9ad00 fp =3D 0xc6c9ad68 >>>> r4 =3D 0x00000000 r5 =3D 0x00000001 >>>> r6 =3D 0x00000000 r7 =3D 0x00000002 >>>> r8 =3D 0xdb16d800 r9 =3D 0xda241c78 >>>> r10 =3D 0x000003ee >>>> usb_alloc_device() at usb_alloc_device+0x9c4 >>>> pc =3D 0xc01a4bd8 lr =3D 0xc01ad16c (uhub_explore+0x494) >>>> sp =3D 0xc6c9ad70 fp =3D 0xc6c9adc0 >>>> r4 =3D 0x00000000 r5 =3D 0x00000000 >>>> r6 =3D 0xdb16e800 r7 =3D 0x00000000 >>>> r8 =3D 0xdb18c200 r9 =3D 0x00000001 >>>> r10 =3D 0x00000000 >>>> uhub_explore() at uhub_explore+0x494 >>>> pc =3D 0xc01ad16c lr =3D 0xc0198654 (usb_bus_explore+0x1d4) >>>> sp =3D 0xc6c9adc8 fp =3D 0xc6c9add8 >>>> 
r4 =3D 0xda241c78 r5 =3D 0xdb16e800 >>>> r6 =3D 0x00000000 r7 =3D 0xda241d6c >>>> r8 =3D 0xc09b0b5f r9 =3D 0x00000001 >>>> r10 =3D 0xda241d1c >>>> usb_bus_explore() at usb_bus_explore+0x1d4 >>>> pc =3D 0xc0198654 lr =3D 0xc01b22d0 (usb_process+0x124) >>>> sp =3D 0xc6c9ade0 fp =3D 0xc6c9ae10 >>>> r4 =3D 0xda241d0c r5 =3D 0xda241d14 >>>> usb_process() at usb_process+0x124 >>>> pc =3D 0xc01b22d0 lr =3D 0xc02da4f0 (fork_exit+0xb0) >>>> sp =3D 0xc6c9ae18 fp =3D 0xc6c9ae38 >>>> r4 =3D 0xc6c9ae40 r5 =3D 0xc6d21c00 >>>> r6 =3D 0xc6d08740 r7 =3D 0xda241d0c >>>> r8 =3D 0xc01b21ac r9 =3D 0x00000000 >>>> r10 =3D 0x00000000 >>>> fork_exit() at fork_exit+0xb0 >>>> pc =3D 0xc02da4f0 lr =3D 0xc06697fc (swi_exit) >>>> sp =3D 0xc6c9ae40 fp =3D 0x00000000 >>>> r4 =3D 0xc01b21ac r5 =3D 0xda241d0c >>>> r6 =3D 0x00000000 r7 =3D 0x00000000 >>>> r8 =3D 0x00000000 r10 =3D 0x00000000 >>>> swi_exit() at swi_exit >>>> pc =3D 0xc06697fc lr =3D 0xc06697fc (swi_exit) >>>> sp =3D 0xc6c9ae40 fp =3D 0x00000000 >>>> KDB: enter: panic >>>> [ thread pid 14 tid 100069 ] >>>> Stopped at kdb_enter+0x54: ldrb r15, [r15, r15, ror r15]! >>>> db> >>>=20 >>> Using just available official artifact kernels for testing >>> I've established that 0953460ce149 (and various from before >>> that) does not have the problem: >>>=20 >>> Wed, 23 Oct 2024 >>> =E2=80=A2 git: 5c92f84bb607 - main - LinuxKPI: update = rcu_dereference_*() and lockdep_is_held() Bjoern A. 
Zeeb >>> =E2=80=A2 git: 6fa91acca40d - main - conf/NOTES: Remove trailing = whitespace Li-Wen Hsu >>> =E2=80=A2 git: 91b7b225b2ce - main - LINT: Add mac_do Li-Wen Hsu >>> =E2=80=A2 git: 419249c1cacc - main - Revert "LINT: Add mac_do" = Li-Wen Hsu >>> =E2=80=A2 Re: git: 419249c1cacc - main - Revert "LINT: Add = mac_do" Baptiste Daroussin >>> =E2=80=A2 Re: git: 13da1af1cd67 - main - libcxxrt: Update to = upstream 698997bfde1f John Baldwin >>> =E2=80=A2 Re: git: 419249c1cacc - main - Revert "LINT: Add = mac_do" John Baldwin >>> =E2=80=A2 git: 0953460ce149 - main - libc: fix access mode tests = in fmemopen(3) Ed Maste >>>=20 >>> So the above one worked. >>>=20 >>> The next available kernel to test was f3dbef108212 (the bump for = LLVM19 >>> at the end of the below): >>>=20 >>> =E2=80=A2 RE: git: 6a07e67fb7a8 - main - vm_meter: Fix laundry = accounting Mark Millard >>> =E2=80=A2 git: 6b9f7133aba4 - main - libc: Add one more check in = new fmemopen test Ed Maste >>> =E2=80=A2 git: 0fca6ea1d4ee - main - Merge llvm-project main = llvmorg-19-init-18630-gf2ccf80136a0 Dimitry Andric >>> =E2=80=A2 git: 36b606ae6aa4 - main - Merge llvm-project = release/19.x llvmorg-19.1.0-rc1-0-ga4902a36d5c2 Dimitry Andric >>> =E2=80=A2 git: 3f157662c0ef - main - Tentatively apply = https://github.com/llvm/llvm-project/pull/101403 Dimitry Andric >>> =E2=80=A2 git: d575077527d4 - main - bsd.sys.mk: for clang >=3D = 19, similar to gcc >=3D 8.1, turn off -Werror for = -Wcast-function-type-mismatch. Dimitry Andric >>> =E2=80=A2 git: 36d486cc2ecd - main - Fix enum warning in = ath_hal's ar9002 Dimitry Andric >>> =E2=80=A2 git: 6846ab2fb663 - main - libcxx simd_utils.h: only = enable _LIBCPP_HAS_ALGORITHM_VECTOR_UTILS for clang >=3D 15, since older = versions do not support the required builtins. Dimitry Andric >>> =E2=80=A2 git: 81e300df5e65 - main - libcxx atomic_ref.h: add = typename keyword for difference_type declarations, otherwise older clang = versions cannot compile this header. 
Dimitry Andric >>> =E2=80=A2 git: 6b4981df6008 - main - libcxx cstdlib, cwchar: = avoid using long long functions if not supported, even for older = compilers that do not support the using_if_exists attribute. Dimitry = Andric >>> =E2=80=A2 git: 2f6d6eaf2d51 - main - libcxx-compat: revert = llvmorg-19-init-18063-g561246e90282: Dimitry Andric >>> =E2=80=A2 git: 04f5b79cfa49 - main - libcxx-compat: revert = llvmorg-19-init-18062-g4dfa75c663e5: Dimitry Andric >>> =E2=80=A2 git: e8054e44f4ca - main - libcxx-compat: revert = llvmorg-19-init-17853-g578c6191eff7: Dimitry Andric >>> =E2=80=A2 git: 0bec0529b1d7 - main - libcxx-compat: revert = llvmorg-19-init-17728-g30cc12cd818d: Dimitry Andric >>> =E2=80=A2 git: e8847079df1b - main - libcxx-compat: revert = llvmorg-19-init-17727-g0eebb48fcfbc: Dimitry Andric >>> =E2=80=A2 git: 2f2ebe758bea - main - libcxx-compat: revert = llvmorg-19-init-17473-g69fecaa1a455: Dimitry Andric >>> =E2=80=A2 git: 1199d38d8ec7 - main - libcxx-compat: revert = llvmorg-19-init-8667-g472b612ccbed: Dimitry Andric >>> =E2=80=A2 git: a7b2d7f261b8 - main - libcxx-compat: revert = llvmorg-19-init-5639-ga10aa4485e83: Dimitry Andric >>> =E2=80=A2 git: f3859a1a13a1 - main - libcxx-compat: revert = llvmorg-19-init-4504-g937a5396cf3e: Dimitry Andric >>> =E2=80=A2 git: 072b5fb698ab - main - libcxx-compat: revert = llvmorg-19-init-4003-g55357160d0e1: Dimitry Andric >>> =E2=80=A2 git: b60301d8b594 - main - libcxx-compat: don't remove = headers that were reintroduced by reverts Dimitry Andric >>> =E2=80=A2 git: 2e861daab905 - main - libcxx-compat: install = headers that were reintroduced by reverts Dimitry Andric >>> =E2=80=A2 git: ff6c8447844b - main - libcxx-compat: update = libcxx.imp for headers that were reintroduced by reverts Dimitry Andric >>> =E2=80=A2 git: 52418fc2be8e - main - Merge llvm-project = release/19.x llvmorg-19.1.0-rc2-0-gd033ae172d1c Dimitry Andric >>> =E2=80=A2 git: 62987288060f - main - Merge llvm-project = release/19.x 
llvmorg-19.1.0-rc3-0-g437434df21d8 Dimitry Andric
>>> • git: 6c4b055cfb6b - main - Merge llvm-project release/19.x llvmorg-19.1.0-rc4-0-g0c641568515a Dimitry Andric
>>> • git: 835c3a3e69af - main - Merge commit 6dbdb8430b49 from llvm git (by Nikolas Klauser): Dimitry Andric
>>> • git: c80e69b00d97 - main - Merge llvm-project release/19.x llvmorg-19.1.0-0-ga4bf6cd7cfb1 Dimitry Andric
>>> • git: 6e516c87b6d7 - main - Merge llvm-project release/19.x llvmorg-19.1.1-0-gd401987fe349 Dimitry Andric
>>> • git: 5deeebd8c6ca - main - Merge llvm-project release/19.x llvmorg-19.1.2-0-g7ba7d8e2f7b6 Dimitry Andric
>>> • git: f3dbef108212 - main - Bump __FreeBSD_version for llvm 19.1.2 merge Dimitry Andric
>>>
>>> f3dbef108212 gets the:
>>>
>>> "Fatal kernel mode data abort: 'Alignment Fault' on write"
>>>
>>> boot failure for artifact kernel. 6b9f7133aba4 does not
>>> seem a likely source of the problem, basically leaving the
>>> LLVM changes as what is at issue.
>>>
>>> I'll note that artifact kernels are witness kernels. So
>>> this exploration adds to the distinctions observed
>>> compared to the prior notes.
>>>
>>>> Looking at bbb_command_start() 's pc:
>>>>
>>>> # llvm-addr2line -e /boot/kernel.GENERIC-NODEBUG/kernel 0xc01aff88
>>>> /home/pkgbuild/worktrees/main/sys/dev/usb/usb_msctest.c:554
>>>>
>>>> What leads to that line is:
>>>>
>>>> /*------------------------------------------------------------------------*
>>>>  * bbb_command_start - execute a SCSI command synchronously
>>>>  *
>>>>  * Return values
>>>>  * 0: Success
>>>>  * Else: Failure
>>>>  *------------------------------------------------------------------------*/
>>>> static int
>>>> bbb_command_start(struct bbb_transfer *sc, uint8_t dir, uint8_t lun,
>>>>     void *data_ptr, size_t data_len, void *cmd_ptr, size_t cmd_len,
>>>>     usb_timeout_t data_timeout)
>>>> {
>>>>         sc->lun = lun;
>>>>         sc->dir = data_len ? dir : DIR_NONE;
>>>>         sc->data_ptr = data_ptr;
>>>>         sc->data_len = data_len;
>>>>         sc->data_rem = data_len;
>>>>         sc->data_timeout = (data_timeout + USB_MS_HZ);
>>>>         sc->actlen = 0;
>>>>         sc->error = 0;
>>>>         sc->cmd_len = cmd_len;
>>>>         memset(&sc->cbw->CBWCDB, 0, sizeof(sc->cbw->CBWCDB));
>>>>
>>>> The memset line is line 554 of sys/dev/usb/usb_msctest.c .
>>>
>>> The below looks to be a separate problem based on
>>> some later FreeBSD kernel update than the above.
>>>
>>>> I'll note that attempting to use the WITNESS variant of the kernel
>>>> ( /boot/kernel/ ) gets a different, even earlier failure:
>>>>
>>>> . . .
>>>> VT: init without driver.
>>>> panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) pmap @ /home/pkgbuild/worktrees/main/sys/arm/arm/pmap-v6.c:6455
>>>
>>> I do know that d021d3b3c675 at the end of the below
>>> shows this failure --before the system has a chance
>>> to get the usb related write alignment failure
>>> reported above.
>>>
>>> I have not explored where in the below range the
>>> behavior changes (for what is available as an
>>> official artifact kernel). It seems unlikely that
>>> any of the below would actually boot: it is likely
>>> a question of which of the 2 (or more) failure
>>> types happen for each instead.
>> The last before "Thu, 24, Oct 2024" was:
>> • git: 8b2e7da70855 - main - llvm19: permit incremental builds from llvm18 Brooks Davis
>> That is the last available artifact kernel that gets the
>> original usb related write alignment type of failure.
>>> Thu, 24 Oct 2024
>>> • git: 34951b0b9e78 - main - swap_pager: move scan_all_shadowed, use iterators Doug Moore
>>> • git: 2ac21f2c98ed - main - x86 specialreg.h: visually align %cr4 and MSR_EFER bit mask definitions Konstantin Belousov
>>> • git: cc11bc1150d5 - main - x86 specialreg.h: add all defined bits for %cr4 Konstantin Belousov
>>> • git: cc4b25f10211 - main - x86 specialreg: reorder %cr3 bits masks definitions by value Konstantin Belousov
>>> • git: 5999b74e9637 - main - x86 specialreg: add bit masks definitions for LAM in %cr3 Konstantin Belousov
>>> • git: 6308db659f2a - main - x86 specialreg: add bit masks definitions for EFER features Konstantin Belousov
>>> • git: 9f718b57b846 - main - x86 specialreg: add bit masks definitions for LASS and LAM features Konstantin Belousov
>>> • git: 3360a15898ce - main - net: route: convert routing statistics to a sysctl Kyle Evans
>>> • Re: git: 3360a15898ce - main - net: route: convert routing statistics to a sysctl Kyle Evans
>>> • git: 77b70ad751df - main - e1000: Move I219 LM19/V19 to ADL Kevin Bowling
>> The last above is the first available artifact kernel that
>> gets the different error. There are no armv7 artifact
>> kernels between 8b2e7da70855 and 77b70ad751df .
>> So something from 34951b0b9e78 .. 77b70ad751df leads to
>> the change in the type of failure. I've no clue what.
>> It looked to me like the x86 commits and e1000 commit had
>> no chance of contributing to the armv7 context. Thus
>> who I added to the CC vs. did not add.
>>> =E2=80=A2 git: d64442a89896 - main - arm{,64}: use genassym for = INTR_ROOT_* values Kyle Evans >>> =E2=80=A2 git: 536c8d948e85 - main - intrng: change = multi-interrupt root support type to enum Kyle Evans >>> =E2=80=A2 git: 4f12b529f404 - main - sys/intr.h: formally depend = on machine/intr.h Kyle Evans >>> =E2=80=A2 git: a5b1eecbed07 - main - Apply workaround for = building llvm-project with WITHOUT_LLVM_ASSERTIONS Dimitry Andric >>> =E2=80=A2 git: 1c83996beda7 - main - Adjust = LLVM_ENABLE_ABI_BREAKING_CHECKS depending on NDEBUG Dimitry Andric >>> =E2=80=A2 git: b2dd4970c7b5 - main - dev/gpio: Mask all pl011 = interrupts Andrew Turner >>> =E2=80=A2 git: 3b03e1bb8615 - main - intrng: Store the IPI = priority Andrew Turner >>> =E2=80=A2 git: 6204391e99ca - main - arm64: Check TDP_NOFAULTING = in a data abort Andrew Turner >>> =E2=80=A2 git: a84653c5db25 - main - arm64: Don't enable = interrupts when in a spinlock Andrew Turner >>> =E2=80=A2 git: d7f930b80e89 - main - arm64: Implement = efi_rt_arch_call Andrew Turner >>> =E2=80=A2 git: 8efb1500d4f1 - main - arm64: Enable handling EFI = runtime service faults Andrew Turner >>> =E2=80=A2 git: 9693241188aa - main - sound: Call DSP_REGISTERED = before PCM_DETACHING Christos Margiolis >>> =E2=80=A2 git: bb5e3ac1a7b7 - main - sound: Use DSP_REGISTERED in = dsp_clone() Christos Margiolis >>> =E2=80=A2 git: a4111e9dc722 - main - sound: Change PCMDIR_* = numbering Christos Margiolis >>> =E2=80=A2 git: 802c78f5194e - main - sound: Untangle dsp_cdevs[] = and dsp_unit2name() confusion Christos Margiolis >>> =E2=80=A2 git: b1bb6934bb87 - main - sound: Fix build error in = chm_mkname() KASSERT Christos Margiolis >>> =E2=80=A2 git: ce20b48a60fb - main - sctp: improve debug output = Michael Tuexen >>> =E2=80=A2 git: e4ac0183a1a8 - main - sctp: cleanup Michael Tuexen >>> =E2=80=A2 git: 8c8ebbb04518 - main - bhyve ahci: Improve = robustness of TRIM handling John Baldwin >>> =E2=80=A2 git: f0bc751d6fb4 - main - csa: Use 
pci_find_device to = simplify clkrun_hack John Baldwin >>> =E2=80=A2 git: d96ba5a62365 - main - config: Remove a stray = semicolon Zhenlei Huang >>> =E2=80=A2 git: 56b17de1e836 - main - makefs: Remove a stray = semicolon Zhenlei Huang >>> =E2=80=A2 git: 88b71d1fe054 - main - arm64: rockchip: Remove a = stray semicolon Zhenlei Huang >>> =E2=80=A2 git: b4856b8e9d87 - main - LinuxKPI: Remove stray = semicolons Zhenlei Huang >>> =E2=80=A2 git: 75ff90814aec - main - enic: Remove a stray = semicolon Zhenlei Huang >>> =E2=80=A2 git: 6ccf4f4071c5 - main - mana: Remove stray = semicolons Zhenlei Huang >>> =E2=80=A2 git: 86a2c910c05c - main - mpi3mr: Remove a stray = semicolon Zhenlei Huang >>> =E2=80=A2 git: 36756195a342 - main - ocs_fc: Remove a stray = semicolon Zhenlei Huang >>> =E2=80=A2 git: 2f395cfda8b5 - main - tcp cc: Remove a stray = semicolon Zhenlei Huang >>> =E2=80=A2 git: f3a097d0312c - main - netstat: switch to using the = sysctl-exported stats for live stats Kyle Evans >>> =E2=80=A2 git: 656991b0c629 - main - locks: augment lock_class = with lc_trylock method Gleb Smirnoff >>> =E2=80=A2 git: efcb2ec8cb81 - main - callout: provide = CALLOUT_TRYLOCK flag Gleb Smirnoff >>> =E2=80=A2 git: bffebc336f4e - main - tcp: use CALLOUT_TRYLOCK for = the TCP callout Gleb Smirnoff >>> =E2=80=A2 git: d021d3b3c675 - main - tcp: get rid of = TDP_INTCPCALLOUT Gleb Smirnoff >>>> cpuid =3D 0 >>>> time =3D 1 >>>> KDB: stack backtrace: >>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read >>>> trapframe: 0xc0f14568 >>>> FSR=3D00000005, FAR=3Ddb7fcfb1, spsr=3D200001d3 >>>> r0 =3Dc0f1465c, r1 =3D00000001, r2 =3Ddb7fcfae, r3 =3D1b000a4e >>>> r4 =3Dc07fc55c, r5 =3D8fce1b89, r6 =3D00006f3e, r7 =3D81000000 >>>> r8 =3Dc07c4b6c, r9 =3Dc094ace8, r10=3Dc09741d8, r11=3Dc0f14618 >>>> r12=3Dc0f146c4, ssp=3Dc0f145fc, slr=3Dc0601428, pc =3Dc062686c >>>>=20 >>>> panic: Fatal abort >>>> cpuid =3D 0 >>>> time =3D 1 >>>> KDB: stack backtrace: >>>> Fatal kernel mode data abort: 
'Translation Fault (L1)' on read >>>> trapframe: 0xc0f141f0 >>>> FSR=3D00000005, FAR=3Ddb7fcfb1, spsr=3D200001d3 >>>> r0 =3Dc0f142e4, r1 =3D00000001, r2 =3Ddb7fcfae, r3 =3D1b000a4e >>>> r4 =3Dc07fc55c, r5 =3D8fce1b89, r6 =3D00006f3e, r7 =3D81000000 >>>> r8 =3Dc07c4b6c, r9 =3Dc094ace8, r10=3Dc09741d8, r11=3Dc0f142a0 >>>> r12=3Dc0f1434c, ssp=3Dc0f14284, slr=3Dc0601428, pc =3Dc062686c >>>>=20 >>>> panic: Fatal abort >>>> cpuid =3D 0 >>>> time =3D 1 >>>> KDB: stack backtrace: >>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read >>>> trapframe: 0xc0f13e78 >>>> FSR=3D00000005, FAR=3Ddb7fcfb1, spsr=3D200001d3 >>>> r0 =3Dc0f13f6c, r1 =3D00000001, r2 =3Ddb7fcfae, r3 =3D1b000a4e >>>> r4 =3Dc07fc55c, r5 =3D8fce1b89, r6 =3D00006f3e, r7 =3D81000000 >>>> r8 =3Dc07c4b6c, r9 =3Dc094ace8, r10=3Dc09741d8, r11=3Dc0f13f28 >>>> r12=3Dc0f13fd4, ssp=3Dc0f13f0c, slr=3Dc0601428, pc =3Dc062686c >>>>=20 >>>> panic: Fatal abort >>>> cpuid =3D 0 >>>> time =3D 1 >>>> KDB: stack backtrace: >>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read >>>> trapframe: 0xc0f13b00 >>>> FSR=3D00000005, FAR=3Ddb7fcfb1, spsr=3D200001d3 >>>> r0 =3Dc0f13bf4, r1 =3D00000001, r2 =3Ddb7fcfae, r3 =3D1b000a4e >>>> r4 =3Dc07fc55c, r5 =3D8fce1b89, r6 =3D00006f3e, r7 =3D81000000 >>>> r8 =3Dc07c4b6c, r9 =3Dc094ace8, r10=3Dc09741d8, r11=3Dc0f13bb0 >>>> r12=3Dc0f13c5c, ssp=3Dc0f13b94, slr=3Dc0601428, pc =3Dc062686c >>>>=20 >>>> panic: Fatal abort >>>> cpuid =3D 0 >>>> time =3D 1 >>>> KDB: stack backtrace: >>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read >>>> trapframe: 0xc0f13788 >>>> FSR=3D00000005, FAR=3Ddb7fcfb1, spsr=3D200001d3 >>>> r0 =3Dc0f1387c, r1 =3D00000001, r2 =3Ddb7fcfae, r3 =3D1b000a4e >>>> r4 =3Dc07fc55c, r5 =3D8fce1b89, r6 =3D00006f3e, r7 =3D81000000 >>>> r8 =3Dc07c4b6c, r9 =3Dc094ace8, r10=3Dc09741d8, r11=3Dc0f13838 >>>> r12=3Dc0f138e4, ssp=3Dc0f1381c, slr=3Dc0601428, pc =3Dc062686c >>>>=20 >>>> . . . 
>>>>
>>>> Looking:
>>>>
>>>> # llvm-addr2line -e /boot/kernel.GENERIC-NODEBUG/kernel 0xc062686c
>>>> /home/pkgbuild/worktrees/main/sys/vm/uma_core.c:5676
>>>>
>>>> static int
>>>> sysctl_handle_uma_zone_frees(SYSCTL_HANDLER_ARGS)
>>>> {
>>>>         uma_zone_t zone = arg1;
>>>>         uint64_t cur;
>>>>
>>>>         cur = uma_zone_get_frees(zone);
>>>>         return (sysctl_handle_64(oidp, &cur, 0, req));
>>>> }
>>>>
>>>> The "return" line is 5676 of sys/vm/uma_core.c .
>>>>
>>>>
>>>> Also, for what leads up to:
>>>>
>>>> /home/pkgbuild/worktrees/main/sys/arm/arm/pmap-v6.c:6455
>>>>
>>>> /*
>>>>  * The implementation of pmap_fault() uses IN_RANGE2() macro which
>>>>  * depends on the fact that given range size is a power of 2.
>>>>  */
>>>> CTASSERT(powerof2(NB_IN_PT1));
>>>> CTASSERT(powerof2(PT2MAP_SIZE));
>>>>
>>>> #define IN_RANGE2(addr, start, size) \
>>>>     ((vm_offset_t)(start) == ((vm_offset_t)(addr) & ~((size) - 1)))
>>>>
>>>> /*
>>>>  * Handle access and R/W emulation faults.
>>>>  */
>>>> int
>>>> pmap_fault(pmap_t pmap, vm_offset_t far, uint32_t fsr, int idx, bool usermode)
>>>> {
>>>>         pt1_entry_t *pte1p, pte1;
>>>>         pt2_entry_t *pte2p, pte2;
>>>>
>>>>         if (pmap == NULL)
>>>>                 pmap = kernel_pmap;
>>>>
>>>>         /*
>>>>          * In kernel, we should never get abort with FAR which is in range of
>>>>          * pmap->pm_pt1 or PT2MAP address spaces. If it happens, stop here
>>>>          * and print out a useful abort message and even get to the debugger
>>>>          * otherwise it likely ends with never ending loop of aborts.
>>>>          */
>>>>         if (__predict_false(IN_RANGE2(far, pmap->pm_pt1, NB_IN_PT1))) {
>>>>                 /*
>>>>                  * All L1 tables should always be mapped and present.
>>>>                  * However, we check only current one herein. For user mode,
>>>>                  * only permission abort from malicious user is not fatal.
>>>>                  * And alignment abort as it may have higher priority.
>>>>                  */
>>>>                 if (!usermode || (idx != FAULT_ALIGN && idx != FAULT_PERM_L2)) {
>>>>                         CTR4(KTR_PMAP, "%s: pmap %#x pm_pt1 %#x far %#x",
>>>>                             __func__, pmap, pmap->pm_pt1, far);
>>>>                         panic("%s: pm_pt1 abort", __func__);
>>>>                 }
>>>>                 return (KERN_INVALID_ADDRESS);
>>>>         }
>>>>         if (__predict_false(IN_RANGE2(far, PT2MAP, PT2MAP_SIZE))) {
>>>>                 /*
>>>>                  * PT2MAP should be always mapped and present in current
>>>>                  * L1 table. However, only existing L2 tables are mapped
>>>>                  * in PT2MAP. For user mode, only L2 translation abort and
>>>>                  * permission abort from malicious user is not fatal.
>>>>                  * And alignment abort as it may have higher priority.
>>>>                  */
>>>>                 if (!usermode || (idx != FAULT_ALIGN &&
>>>>                     idx != FAULT_TRAN_L2 && idx != FAULT_PERM_L2)) {
>>>>                         CTR4(KTR_PMAP, "%s: pmap %#x PT2MAP %#x far %#x",
>>>>                             __func__, pmap, PT2MAP, far);
>>>>                         panic("%s: PT2MAP abort", __func__);
>>>>                 }
>>>>                 return (KERN_INVALID_ADDRESS);
>>>>         }
>>>>
>>>>         /*
>>>>          * A pmap lock is used below for handling of access and R/W emulation
>>>>          * aborts. They were handled by atomic operations before so some
>>>>          * analysis of new situation is needed to answer the following question:
>>>>          * Is it safe to use the lock even for these aborts?
>>>>          *
>>>>          * There may happen two cases in general:
>>>>          *
>>>>          * (1) Aborts while the pmap lock is locked already - this should not
>>>>          * happen as pmap lock is not recursive. However, under pmap lock only
>>>>          * internal kernel data should be accessed and such data should be
>>>>          * mapped with A bit set and NM bit cleared. If double abort happens,
>>>>          * then a mapping of data which has caused it must be fixed. Further,
>>>>          * all new mappings are always made with A bit set and the bit can be
>>>>          * cleared only on managed mappings.
>>>>          *
>>>>          * (2) Aborts while another lock(s) is/are locked - this already can
>>>>          * happen. However, there is no difference here if it's either access or
>>>>          * R/W emulation abort, or if it's some other abort.
>>>>          */
>>>>
>>>>         PMAP_LOCK(pmap);
>>>>
>>>> That "PMAP_LOCK(pmap);" line is line 6455 of sys/arm/arm/pmap-v6.c .
>>>>
>>>>
>>>> FYI: Running the prior kernel.GENERIC-NODEBUG/ ( called
>>>> kernel.GENERIC-NODEBUG.good/ ) continues to operate
>>>> normally. I do not have the older PkgBase kernel/ around
>>>> to try, unfortunately.
>> I'll remind that this is from using official FreeBSD builds
>> of the kernel versions that I tested, not from my personal
>> build context.
>> ===
>> Mark Millard
>> marklmi at yahoo.com
> Hi Mark,
>
> Please see https://reviews.freebsd.org/D47485
>
> Unfortunately, I see 2 problems with llvm 19.
>
> The first is a regression: the compiler generates an inline memset() that accesses non-aligned data with sub-optimal instructions (with word access). This regression triggers a bug in the kernel (which should be fixed in D47485).
>
> The second, the "panic: acquiring blockable sleep lock", is due to a bug in lld. It mis-links the ".ARM.exidx" section in the output binary, which is used by the stack unwinder in the kernel.
> I don't have a fix for this for now, so you have to use the linker from llvm18 as a workaround.
>
> I'm not sure if I have enough free cycles to manage both issues on the llvm side...

===
Mark Millard
marklmi at yahoo.com