Date: Mon, 21 Mar 2022 10:11:44 -0500
From: Mike Karels <mike@karels.net>
To: Kristof Provost <kp@FreeBSD.org>
Cc: freebsd-net@freebsd.org
Subject: Re: kernel epoch crash in IPv4 multicast code
Message-ID: <202203211511.22LFBiHG041121@mail.karels.net>
In-Reply-To: Your message of Mon, 21 Mar 2022 13:41:15 +0100. <9E6CA0F5-5E02-4458-8D9F-C7F8F1715BFC@FreeBSD.org>

Kristof wrote:
> On 18 Mar 2022, at 19:02, Mike Karels wrote:
> > It looks like the IPv4 multicast code has not been fully converted to
> > use epochs.  I installed this week's snapshot of -current, configured
> > and started mrouted, and started rwhod -m.  The system crashed shortly
> > thereafter with this:
> >
> > panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_output.c:343
> > cpuid = 15
> > time = 1647609865
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01b51a39d0
> > vpanic() at vpanic+0x17f/frame 0xfffffe01b51a3a20
> > panic() at panic+0x43/frame 0xfffffe01b51a3a80
> > ip_output() at ip_output+0x15f9/frame 0xfffffe01b51a3b80
> > phyint_send() at phyint_send+0x107/frame 0xfffffe01b51a3be0
> > ip_mdq() at ip_mdq+0x259/frame 0xfffffe01b51a3c60
> > X_ip_mrouter_set() at X_ip_mrouter_set+0x9e4/frame 0xfffffe01b51a3d30
> > sosetopt() at sosetopt+0xee/frame 0xfffffe01b51a3d80
> > kern_setsockopt() at kern_setsockopt+0xad/frame 0xfffffe01b51a3de0
> > sys_setsockopt() at sys_setsockopt+0x24/frame 0xfffffe01b51a3e00
> > amd64_syscall() at amd64_syscall+0x12e/frame 0xfffffe01b51a3f30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe01b51a3f30
> > --- syscall (105, FreeBSD ELF64, sys_setsockopt), rip = 0x821b72dda, rsp = 0x8204c06f8, rbp = 0x8204c0750 ---
> > KDB: enter: panic
> >
> > The kgdb backtrace is appended.
> >
> > It looks like ip_mroute is protected in the forwarding path (it's called
> > from ip_input) and the output path, but not in the setup path from
> > setsockopt().  At least the MRT_ADD_MFC call needs to enter an epoch.
> > I tried adding epoch handling in add_mfc(), and that seems to work.
> > The alternative would be to do it in Xip_mrouter_set() so it would cover
> > all the calls.  Any opinions?
> >
>
> Your analysis looks reasonable.
> I think I'd suggest adding the NET_EPOCH_ENTER() calls in add_mfc(). We already do that in add_vif(), so we'd be following existing choices.
> I'd also suggest adding NET_EPOCH_ASSERT() to everything which directly or indirectly calls ip_output(). That should help us catch other potential issues like this one.

Thanks.  I had already added one assert; I added one in send_packet()
as well.

For anyone interested, this is now in review:
https://reviews.freebsd.org/D34624.

		Mike

> Br,
> Kristof
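
For readers following along, the pattern discussed above (enter the net epoch in add_mfc(), and assert it in everything that leads to ip_output()) can be sketched roughly as below. This is a userland model for illustration only, not the change in D34624: the macros are hypothetical stand-ins that reuse the kernel names NET_EPOCH_ENTER/NET_EPOCH_EXIT/NET_EPOCH_ASSERT, the model_* functions are invented for the example, and a failed assert is counted rather than causing a panic.

```c
#include <assert.h>

/*
 * Userland model only.  These are hypothetical stand-ins for the kernel's
 * NET_EPOCH_ENTER/NET_EPOCH_EXIT/NET_EPOCH_ASSERT, not the real macros.
 */
struct epoch_tracker { int saved_depth; };

static int epoch_depth;      /* > 0 while "inside" the modeled net epoch */
static int epoch_violations; /* counts failed asserts instead of panicking */

#define NET_EPOCH_ENTER(et) do { (et).saved_depth = epoch_depth++; } while (0)
#define NET_EPOCH_EXIT(et)  do {					\
	assert(epoch_depth > 0);	/* catch an unbalanced exit */	\
	epoch_depth = (et).saved_depth;					\
} while (0)
#define NET_EPOCH_ASSERT()  do {					\
	if (epoch_depth == 0)						\
		epoch_violations++;					\
} while (0)

/* Stand-in for ip_output(): the caller must already be in the epoch. */
static void
model_ip_output(void)
{
	NET_EPOCH_ASSERT();
}

/* Stand-in for the ip_mdq() -> phyint_send() -> ip_output() chain. */
static void
model_ip_mdq(void)
{
	NET_EPOCH_ASSERT();	/* assert early, close to the entry point */
	model_ip_output();
}

/* Setup path as in the crash: reached from setsockopt() with no epoch. */
static void
model_add_mfc_unprotected(void)
{
	model_ip_mdq();
}

/* Setup path with the suggested change: enter the epoch in add_mfc(). */
static void
model_add_mfc_protected(void)
{
	struct epoch_tracker et;

	NET_EPOCH_ENTER(et);
	model_ip_mdq();
	NET_EPOCH_EXIT(et);
}
```

In this model, calling model_add_mfc_unprotected() records one violation at each assert along the call chain, while model_add_mfc_protected() records none. That is the argument for placing asserts in the intermediate functions: the failure shows up nearer to the caller that forgot to enter the epoch.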