Date: Sat, 24 Jun 2023 12:16:16 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Current FreeBSD <freebsd-current@freebsd.org>, freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: aarch64 main-n263493-4e8d558c9d1c-dirty (so: 2023-Jun-10) Kyuafile run: "Fatal data abort" crash during vnet_register_sysinit
Message-ID: <8E9937A8-1563-49C2-A1B1-150864C09AA0@yahoo.com>
In-Reply-To: <FAF014A1-88B5-4CAE-8A5C-2C2065528003@yahoo.com>
References: <3FD359F8-CFCC-400F-B6DE-B635B747DE7F@yahoo.com> <FAF014A1-88B5-4CAE-8A5C-2C2065528003@yahoo.com>
On Jun 24, 2023, at 10:49, Mark Millard <marklmi@yahoo.com> wrote:

> On Jun 24, 2023, at 10:00, Mark Millard <marklmi@yahoo.com> wrote:
>
>> The running system build is a non-debug build (but
>> with symbols not stripped).
>>
>> The HoneyComb's console log shows:
>>
>> . . .
>> GEOM_STRIPE: Device stripe.IMfBZr destroyed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
>> GEOM_NOP: Device md0.nop removed.
>> GEOM_NOP: Device md0.nop created.
>> GEOM_NOP: Device md0.nop removed.
>> Fatal data abort:
>> x0: ffffa02506e64400
>> x1: ffff0001ea401880 (g_raid3_post_sync + 3a145f8)
>> x2: 4b
>> x3: a343932b0b22fb30
>> x4: 0
>> x5: 3310b0d062d0e1d
>> x6: 1d0e2d060d0b3103
>> x7: 0
>> x8: ea325df8
>> x9: ffff0001eec946d0 ($d.6 + 0)
>> x10: ffff0001ea401880 (g_raid3_post_sync + 3a145f8)
>> x11: 0
>> x12: 0
>> x13: ffff000000cd8960 (lock_class_mtx_sleep + 0)
>> x14: 0
>> x15: ffffa02506e64405
>> x16: ffff0001eec94860 (_DYNAMIC + 160)
>> x17: ffff00000063a450 (ifc_attach_cloner + 0)
>> x18: ffff0001eb290400 (g_raid3_post_sync + 48a3178)
>> x19: ffff0001eec94600 (vnet_epair_init_vnet_init + 0)
>> x20: ffff000000fa5b68 (vnet_sysinit_sxlock + 18)
>> x21: ffff000000d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x22: ffff000000d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x23: ffffa0000042e500
>> x24: ffffa0000042e500
>> x25: ffff000000ce0788 (linker_lookup_set_desc + 0)
>> x26: ffffa0203cdef780
>> x27: ffff0001eec94698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
>> x28: ffff000000d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
>> x29: ffff0001eb290430 (g_raid3_post_sync + 48a31a8)
>> sp: ffff0001eb290400
>> lr: ffff0001eec82a4c ($x.1 + 3c)
>> elr: ffff0001eec82a60 ($x.1 + 50)
>> spsr: 60000045
>> far: ffff0002d8fba4c8
>> esr: 96000046
>> panic: vm_fault failed: ffff0001eec82a60 error 1
>> cpuid = 14
>> time = 1687625470
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x13c
>> panic() at panic+0x44
>> data_abort() at data_abort+0x2fc
>> handle_el1h_sync() at handle_el1h_sync+0x14
>> --- exception, esr 0x96000046
>> $x.1() at $x.1+0x50
>> vnet_register_sysinit() at vnet_register_sysinit+0x114
>> linker_load_module() at linker_load_module+0xae4
>> kern_kldload() at kern_kldload+0xfc
>> sys_kldload() at sys_kldload+0x60
>> do_el0_sync() at do_el0_sync+0x608
>> handle_el0_sync() at handle_el0_sync+0x44
>> --- exception, esr 0x56000000
>> KDB: enter: panic
>> [ thread pid 70419 tid 101003 ]
>> Stopped at kdb_enter+0x44: str xzr, [x19, #3200]
>> db>
>>
>> I'll see if a re-run is repeatable.
>>
>
> It repeats:
>
> GEOM_STRIPE: Device stripe/stripe.VkbPk1 deactivated.
> GEOM_STRIPE: Disk md1 removed from stripe.VkbPk1.
> GEOM_STRIPE: Disk md0 removed from stripe.VkbPk1.
> GEOM_STRIPE: Device stripe.VkbPk1 destroyed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5885952, length=8192)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> g_vfs_done():md0.nop[READ(offset=5935104, length=4096)]error = 5
> GEOM_NOP: Device md0.nop removed.
> GEOM_NOP: Device md0.nop created.
> GEOM_NOP: Device md0.nop removed.
> Fatal data abort:
> x0: ffffa0003b1a9500
> x1: ffff00021b530260
> x2: 4b
> x3: a343932b0b22fb30
> x4: 0
> x5: 3310b0d062d0e1d
> x6: 1d0e2d060d0b3103
> x7: 0
> x8: ea325df8
> x9: ffff00021d6946d0 ($d.6 + 0)
> x10: ffff00021b530260
> x11: 0
> x12: 0
> x13: ffff000000cd8960 (lock_class_mtx_sleep + 0)
> x14: 0
> x15: ffffa0003b1a9505
> x16: ffff00021d694860 (_DYNAMIC + 160)
> x17: ffff00000063a450 (ifc_attach_cloner + 0)
> x18: ffff00021a6ea400
> x19: ffff00021d694600 (vnet_epair_init_vnet_init + 0)
> x20: ffff000000fa5b68 (vnet_sysinit_sxlock + 18)
> x21: ffff000000d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x22: ffff000000d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x23: ffffa00000431500
> x24: ffffa00000431500
> x25: ffff000000ce0788 (linker_lookup_set_desc + 0)
> x26: ffffa02e1ab6d180
> x27: ffff00021d694698 (__set_sysinit_set_sym_if_epairmodule_sys_init + 0)
> x28: ffff000000d8e000 (sdt_vfs_vop_vop_spare4_return + 0)
> x29: ffff00021a6ea430
> sp: ffff00021a6ea400
> lr: ffff00021d682a4c ($x.1 + 3c)
> elr: ffff00021d682a60 ($x.1 + 50)
> spsr: 60000045
> far: ffff0003079ba4c8
> esr: 96000046
> panic: vm_fault failed: ffff00021d682a60 error 1
> cpuid = 1
> time = 1687628622
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x13c
> panic() at panic+0x44
> data_abort() at data_abort+0x2fc
> handle_el1h_sync() at handle_el1h_sync+0x14
> --- exception, esr 0x96000046
> $x.1() at $x.1+0x50
> vnet_register_sysinit() at vnet_register_sysinit+0x114
> linker_load_module() at linker_load_module+0xae4
> kern_kldload() at kern_kldload+0xfc
> sys_kldload() at sys_kldload+0x60
> do_el0_sync() at do_el0_sync+0x608
> handle_el0_sync() at handle_el0_sync+0x44
> --- exception, esr 0x56000000
> KDB: enter: panic
> [ thread pid 36377 tid 100985 ]
> Stopped at kdb_enter+0x44: str xzr, [x19, #3200]
> db>
>
>
> For reference, the output of the run in the ssh
> session ends with:
>
> . . .
> sys/kqueue/libkqueue/kqueue_test:main -> passed [48.258s]
> sys/mac/bsdextended/ugidfw_test:main -> skipped: mac_bsdextended not loaded [0.006s]
> sys/mac/portacl/nobody_test:main -> skipped: MAC_PORTACL is unavailable. [0.010s]
> sys/mac/portacl/root_test:main -> skipped: MAC_PORTACL is unavailable. [0.010s]
> sys/mqueue/mqueue_test:mqtest1 -> passed [0.025s]
> sys/mqueue/mqueue_test:mqtest2 -> passed [0.025s]
> sys/mqueue/mqueue_test:mqtest5 -> passed [0.025s]
> sys/net/if_ovpn/if_ovpn_c:tcp -> skipped: if_ovpn not loaded [0.006s]
> sys/netinet/arp:arp_add_success ->
>
> That should give some extra information about the context
> of failure.

So I installed, booted, and tried my debug build. It
failed the same way in the same place, with no extra
console reporting for the crash by the debug code: no
assertion failures or WITNESS reports or the like
first.

===
Mark Millard
marklmi at yahoo.com
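
As a reading aid for the two esr values in the backtraces above, here is a minimal sketch (not part of the original message) that splits an AArch64 ESR_ELx value into its EC, IL, and ISS fields. The function name decode_esr and the two lookup tables are illustrative assumptions, and the tables are deliberately partial: they cover only the exception classes and fault status codes needed for the values in this log.

```python
# Illustrative sketch: split an AArch64 ESR_ELx value into its fields.
# decode_esr and both tables are hypothetical helpers for this note,
# not FreeBSD code; the tables are intentionally partial.

EC_NAMES = {
    0x15: "SVC instruction execution in AArch64 state",
    0x25: "Data Abort taken without a change in Exception level",
}

DFSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
}


def decode_esr(esr):
    """Return the architectural fields of a 32-bit ESR_ELx value."""
    ec = (esr >> 26) & 0x3F   # bits [31:26]: exception class
    il = (esr >> 25) & 0x1    # bit  [25]:    instruction length flag
    iss = esr & 0x1FFFFFF     # bits [24:0]:  instruction specific syndrome
    fields = {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this partial table"),
        "il": il,
        "iss": iss,
    }
    if ec == 0x25:
        # For data aborts, ISS bit 6 is WnR (1 = write) and
        # ISS bits [5:0] are the data fault status code.
        dfsc = iss & 0x3F
        fields["wnr"] = (iss >> 6) & 0x1
        fields["dfsc"] = dfsc
        fields["dfsc_name"] = DFSC_NAMES.get(dfsc, "not in this partial table")
    return fields


if __name__ == "__main__":
    for value in (0x96000046, 0x56000000):
        print(hex(value), decode_esr(value))
```

Under this decoding, 0x96000046 is a same-exception-level data abort caused by a write (WnR = 1) with a level 2 translation fault, and 0x56000000 is the SVC exception for the kldload system call. That matches the data_abort() and do_el0_sync() frames in the backtraces; it does not by itself identify why the faulting address was unmapped.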