Date: Wed, 26 Apr 2017 14:53:55 -0700 From: Adrian Chadd <adrian.chadd@gmail.com> To: Chris Torek <torek@torek.net> Cc: FreeBSD Net <freebsd-net@freebsd.org> Subject: Re: three multicast fixes Message-ID: <CAJ-VmomrdDd567Lg=Kon%2Bx-XOg=u0f8Y9WYHa54QCN3_j4rQxQ@mail.gmail.com> In-Reply-To: <CAJ-Vmok1tDCbD8puxfFRrFKkNNTeSpJoEh1DWaU3wzbXnSvaJA@mail.gmail.com> References: <201704171358.v3HDwmF3071624@elf.torek.net> <CAJ-Vmok1tDCbD8puxfFRrFKkNNTeSpJoEh1DWaU3wzbXnSvaJA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Actually - into individual reviews.freebsd.org reviews? -adrian On 26 April 2017 at 14:53, Adrian Chadd <adrian.chadd@gmail.com> wrote: > hiya, > > can you throw these in bugs? > > > -a > > > On 17 April 2017 at 06:58, Chris Torek <torek@torek.net> wrote: >> The first is mostly cosmetic, it's just something I observed when >> I turned on multicast debug to find the bug that the second patch >> is for. >> >> The second patch is a kludge and if anyone has a better fix, have >> at it. :-) Note that it's also hard to provoke the bug. The only >> way I know to observe it somewhat reliably is to create and >> destroy bridge interfaces, using a debug kernel with memory debug >> enabled, while adding and deleting hardware interfaces to the >> bridge so that they repeatedly join and leave multicast groups. >> Eventually you hit the race and crash. >> >> The third is simple enough to observe. It may affect only the lagg >> driver, which implements SIOCDELMULTI by removing and re-adding all >> the multicast addresses other than the one actually deleted (which >> is of course really-removed by then). If running lagg on a VIMAGE >> kernel, leaving a multicast group causes this occasional crash: >> >> Fatal trap 12: page fault while in kernel mode >> cpuid = 0; apic id = 00 >> fault virtual address = 0x28 >> fault code = supervisor read data, page not present >> instruction pointer = 0x20:0xffffffff80adf6ea >> stack pointer = 0x28:0xfffffe009117e810 >> frame pointer = 0x28:0xfffffe009117e8b0 >> code segment = base 0x0, limit 0xfffff, type 0x1b >> = DPL 0, pres 1, long 1, def32 0, gran 1 >> processor eflags = interrupt enabled, resume, IOPL = 0 >> current process = 0 (thread taskq) >> >> rt_newmaddrmsg() at rt_newmaddrmsg+0x2a/frame 0xfffffe009117e8b0 >> if_addmulti() at if_addmulti+0x24c/frame 0xfffffe009117e940 >> lagg_ether_cmdmulti() at lagg_ether_cmdmulti+0x168/frame 0xfffffe009117e980 >> lagg_ioctl() at lagg_ioctl+0xad/frame 0xfffffe009117ea60 >> in_leavegroup() at in_leavegroup+0xb9/frame 0xfffffe009117eab0 >> inp_gcmoptions() at inp_gcmoptions+0x1a8/frame 0xfffffe009117eb20 >> taskqueue_run_locked() at taskqueue_run_locked+0x117/frame 0xfffffe009117eb80 >> taskqueue_thread_loop() at taskqueue_thread_loop+0xc8/frame 0xfffffe009117ebb0 >> fork_exit() at fork_exit+0x85/frame 0xfffffe009117ebf0 >> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe009117ebf0 >> --- trap 0, rip = 0, rsp = 0, rbp = 0 --- >> >> Here, "curvnet" is NULL in the lagg_ioctl(), so that when >> rt_newmaddrmsg() does: >> >> if (V_route_cb.any_count == 0) >> >> we crash with the offset of the per-vnet route_cb data. >> >> (The patches are in Git format, I applied them to the read-only >> freebsd master tree on GitHub.) >> >> Chris >> >> From 65145b74d3dae68feddcf8a2aad235c3f0a981e9 Mon Sep 17 00:00:00 2001 >> From: Chris Torek <chris.torek@gmail.com> >> Date: Mon, 8 Feb 2016 18:55:30 -0800 >> Subject: [PATCH 1/3] ip multicast debug: fix strings vs defines >> >> Turning on multicast debug made multicast failure worse >> because the strings and #define values no longer matched >> up. Fix them, and make sure they stay matched-up. >> --- >> sys/netinet/in_mcast.c | 31 +++++++++++++++++++++---------- >> 1 file changed, 21 insertions(+), 10 deletions(-) >> >> diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c >> index c7fff9463128..eef560e7aca7 100644 >> --- a/sys/netinet/in_mcast.c >> +++ b/sys/netinet/in_mcast.c >> @@ -2920,7 +2920,14 @@ sysctl_ip_mcast_filters(SYSCTL_HANDLER_ARGS) >> >> #if defined(KTR) && (KTR_COMPILE & KTR_IGMPV3) >> >> -static const char *inm_modestrs[] = { "un", "in", "ex" }; >> +static const char *inm_modestrs[] = { >> + [MCAST_UNDEFINED] = "un", >> + [MCAST_INCLUDE] = "in", >> + [MCAST_EXCLUDE] = "ex", >> +}; >> +_Static_assert(MCAST_UNDEFINED == 0 && >> + MCAST_EXCLUDE + 1 == nitems(inm_modestrs), >> + "inm_modestrs: no longer matches #defines"); >> >> static const char * >> inm_mode_str(const int mode) >> @@ -2932,16 +2939,20 @@ inm_mode_str(const int mode) >> } >> >> static const char *inm_statestrs[] = { >> - "not-member", >> - "silent", >> - "idle", >> - "lazy", >> - "sleeping", >> - "awakening", >> - "query-pending", >> - "sg-query-pending", >> - "leaving" >> + [IGMP_NOT_MEMBER] = "not-member", >> + [IGMP_SILENT_MEMBER] = "silent", >> + [IGMP_REPORTING_MEMBER] = "reporting", >> + [IGMP_IDLE_MEMBER] = "idle", >> + [IGMP_LAZY_MEMBER] = "lazy", >> + [IGMP_SLEEPING_MEMBER] = "sleeping", >> + [IGMP_AWAKENING_MEMBER] = "awakening", >> + [IGMP_G_QUERY_PENDING_MEMBER] = "query-pending", >> + [IGMP_SG_QUERY_PENDING_MEMBER] = "sg-query-pending", >> + [IGMP_LEAVING_MEMBER] = "leaving", >> }; >> +_Static_assert(IGMP_NOT_MEMBER == 0 && >> + IGMP_LEAVING_MEMBER + 1 == nitems(inm_statestrs), >> + "inm_statetrs: no longer matches #defines"); >> >> static const char * >> inm_state_str(const int state) >> -- >> 2.12.1 >> >> >> From 099e1b2c059fe962c1dd91af0a6735f02f611ac3 Mon Sep 17 00:00:00 2001 >> From: Chris Torek <torek@ixsystems.com> >> Date: Wed, 10 Feb 2016 18:41:39 -0800 >> Subject: [PATCH 2/3] ip multicast: fix use-after-free >> >> During if_detach(), we get a race where a closing socket is >> releasing multicast data (via inp_freemoptions()) at the same >> time as igmp_ifdetach() is releasing all multicast data for >> the interface, resulting in a potential double teardown and >> double free. >> >> This bug has been present since late 2011: >> >> Author: jhb <jhb@FreeBSD.org> >> >> Defer the work of freeing IPv4 multicast options from a >> socket to an asychronous task. ... >> >> It is very hard to trip over. You must create and delete >> interfaces (bridges that join real interfaces are good >> candidates) repeatedly, and even then, if M_IPMADDR (in_multi >> data structure) memory is not reused for something else during >> the race, the reference count in inm->inm_refcount is an >> unsigned int, so it decrements from the left-over 0 to >> 4294967295, avoiding a second free. >> >> Turning on the memory debug options (scribbling values over >> freed memory) catches the problem more quickly, though you >> still must create and destroy interfaces that partcipate in >> multicast to see it. >> >> The fix here is a kludge, but should serve until the entire >> network locking code and up/downcall system is reworked. >> --- >> sys/netinet/in.c | 4 ++++ >> sys/netinet/in_mcast.c | 11 +++++++++++ >> sys/netinet/ip_var.h | 1 + >> 3 files changed, 16 insertions(+) >> >> diff --git a/sys/netinet/in.c b/sys/netinet/in.c >> index e1611d50b4f0..50cfb188c442 100644 >> --- a/sys/netinet/in.c >> +++ b/sys/netinet/in.c >> @@ -1013,6 +1013,8 @@ in_purgemaddrs(struct ifnet *ifp) >> struct in_multi *inm, *tinm; >> struct ifmultiaddr *ifma; >> >> + inp_freemopt_wait(); /* XXX kludge */ >> + >> LIST_INIT(&purgeinms); >> IN_MULTI_LOCK(); >> >> @@ -1043,6 +1045,8 @@ in_purgemaddrs(struct ifnet *ifp) >> igmp_ifdetach(ifp); >> >> IN_MULTI_UNLOCK(); >> + >> + inp_freemopt_wait(); /* XXX kludge */ >> } >> >> struct in_llentry { >> diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c >> index eef560e7aca7..1d1258701fa2 100644 >> --- a/sys/netinet/in_mcast.c >> +++ b/sys/netinet/in_mcast.c >> @@ -1585,6 +1585,17 @@ inp_freemoptions(struct ip_moptions *imo) >> taskqueue_enqueue(taskqueue_thread, &imo_gc_task); >> } >> >> +/* >> + * Wait for inp_freemoptions() to complete any current work. >> + * Used because inp_freemoptions is no longer synchronous. >> + */ >> +void >> +inp_freemopt_wait(void) >> +{ >> + >> + taskqueue_drain(taskqueue_thread, &imo_gc_task); >> +} >> + >> static void >> inp_freemoptions_internal(struct ip_moptions *imo) >> { >> diff --git a/sys/netinet/ip_var.h b/sys/netinet/ip_var.h >> index f7e58d18635a..e51bcffc4fc2 100644 >> --- a/sys/netinet/ip_var.h >> +++ b/sys/netinet/ip_var.h >> @@ -200,6 +200,7 @@ extern struct pr_usrreqs rip_usrreqs; >> #define V_drop_redirect VNET(drop_redirect) >> >> void inp_freemoptions(struct ip_moptions *); >> +void inp_freemopt_wait(void); >> int inp_getmoptions(struct inpcb *, struct sockopt *); >> int inp_setmoptions(struct inpcb *, struct sockopt *); >> >> -- >> 2.12.1 >> >> >> From e27ff26bd086c46f2b1cce8eb95f4625cc4333fc Mon Sep 17 00:00:00 2001 >> From: Chris Torek <torek@ixsystems.com> >> Date: Sat, 5 Mar 2016 15:02:33 -0800 >> Subject: [PATCH 3/3] ip_multicast: defer vnet restore in in_leavegroup >> >> Note: in_leavegroup() does all its real work in >> in_leavegroup_locked(). For discussion here the two] >> functions might as well be equivalent. >> >> In in_leavegroup_locked(), when we're shedding a multicast >> group, we may (or may not) delete it from an interface via >> the igmp_change_state() call. This is where we currently >> set the multicast's vnet, and then restore the old vnet on >> return. >> >> However, a few lines later we use inm_release_locked() to >> release the inet multicast data structure, and that in turn >> may -- not necessarily will, only if the inm really is being >> freed -- call if_delmulti_ifma(), which may -- not necessarily >> will, again -- call the interface's SIOCDELMULTI ioctl >> (if and only if there is an interface and this was the last >> ref to this multicast address). >> >> For (at least) the lagg interface, we still need the current >> vnet to be valid during the SIOCDELMULTI. So, don't restore >> the old vnet until we've not only finished the IGMP code but >> also inm_release_locked(). >> --- >> sys/netinet/in_mcast.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c >> index 1d1258701fa2..4549c952e058 100644 >> --- a/sys/netinet/in_mcast.c >> +++ b/sys/netinet/in_mcast.c >> @@ -1275,12 +1275,12 @@ in_leavegroup_locked(struct in_multi *inm, /*const*/ struct in_mfilter *imf) >> CTR1(KTR_IGMPV3, "%s: doing igmp downcall", __func__); >> CURVNET_SET(inm->inm_ifp->if_vnet); >> error = igmp_change_state(inm); >> - CURVNET_RESTORE(); >> if (error) >> CTR1(KTR_IGMPV3, "%s: failed igmp downcall", __func__); >> >> CTR2(KTR_IGMPV3, "%s: dropping ref on %p", __func__, inm); >> inm_release_locked(inm); >> + CURVNET_RESTORE(); >> >> return (error); >> } >> -- >> 2.12.1 >> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-VmomrdDd567Lg=Kon%2Bx-XOg=u0f8Y9WYHa54QCN3_j4rQxQ>