From owner-freebsd-net@freebsd.org Wed Apr 26 21:53:27 2017 Return-Path: Delivered-To: freebsd-net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 564E7D51BA8 for ; Wed, 26 Apr 2017 21:53:27 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com [IPv6:2a00:1450:400c:c09::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D85915FA for ; Wed, 26 Apr 2017 21:53:26 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: by mail-wm0-x230.google.com with SMTP id r190so1813481wme.1 for ; Wed, 26 Apr 2017 14:53:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=744W87Erlw5RBMg9IlCp9G+HqfM7ZLAipedFy2v9G/U=; b=q3rD2XTQRWOKsj1c976SnNqOr67SRDzO8B16XomF7JT/nFLwtVX58fQvj2sRg1BOqC gJEHCaW9aWETE15ZsM9riVK/DMiklh8hTp/CowFLSF4AyMXjU7kGRlotl7PA60DiFkUZ cS8iYIyQS2Rp3B8jMktDjrjyL8cLJVnmgV8dQH6LU/Y5Dq1ZDjhZkV+CP5T49ADMbCmw ELLqg6ZoauGS5VecNCYQR0tkwlr5fd6a+Sl/uZGWgcjEnvlgxQTcfBiG6/buAcSATaXR mD+A4wcku6C+TiLzwMOKej3ZWa9EDQWZtP4nY1H++sIOH61f0yMg9ozeGMsJ0gX6oiHG W3fA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=744W87Erlw5RBMg9IlCp9G+HqfM7ZLAipedFy2v9G/U=; b=jk45kwmFiQiLli/ZdEzL8RZKMImvDqo3BqiymuksqIfN7P81x79wTG7YgLzYR/msap TzQqO3xQ/Ne5Scd7Dxi7rZfz7Au8i+O7xYRbdLSKregAbjDkXTvbBpSoi2vMiWv5Xox9 k+kjqt1kBJFJAjKTGSsKwFZR/ERLCZS/yTkBf8mrBBaXSTyBgODwvSFq7qPd0rK62qUR AZ2WhxDxosrhx3nXgGXQ/IqU0mkwy57zUmjroxFTJCNQKYNyoGrGA6p1BOrWJSV2Jh1L uBJgwupcngFwvNKByasF/n1JWOZquTIUAxnUFGadojEyYz/0j/6SKwTYiD+ZqcBABxiD wwqw== X-Gm-Message-State: AN3rC/7GBdtJofLDsGg/FO030nHJQU773hsaH9MUAQZBzotFhnv7W0r0 40b/+nz3jpUiqUw+4D5rMdH8evw2C2FU X-Received: by 10.28.150.213 with SMTP id y204mr159476wmd.138.1493243605209; Wed, 26 Apr 2017 14:53:25 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.172.197 with HTTP; Wed, 26 Apr 2017 14:53:24 -0700 (PDT) In-Reply-To: <201704171358.v3HDwmF3071624@elf.torek.net> References: <201704171358.v3HDwmF3071624@elf.torek.net> From: Adrian Chadd Date: Wed, 26 Apr 2017 14:53:24 -0700 Message-ID: Subject: Re: three multicast fixes To: Chris Torek Cc: FreeBSD Net Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Apr 2017 21:53:27 -0000 hiya, can you throw these in bugs? -a On 17 April 2017 at 06:58, Chris Torek wrote: > The first is mostly cosmetic, it's just something I observed when > I turned on multicast debug to find the bug that the second patch > is for. > > The second patch is a kludge and if anyone has a better fix, have > at it. :-) Note that it's also hard to provoke the bug. The only > way I know to observe it somewhat reliably is to create and > destroy bridge interfaces, using a debug kernel with memory debug > enabled, while adding and deleting hardware interfaces to the > bridge so that they repeatedly join and leave multicast groups. > Eventually you hit the race and crash. > > The third is simple enough to observe. It may affect only the lagg > driver, which implements SIOCDELMULTI by removing and re-adding all > the multicast addresses other than the one actually deleted (which > is of course really-removed by then). If running lagg on a VIMAGE > kernel, leaving a multicast group causes this occasional crash: > > Fatal trap 12: page fault while in kernel mode > cpuid = 0; apic id = 00 > fault virtual address = 0x28 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff80adf6ea > stack pointer = 0x28:0xfffffe009117e810 > frame pointer = 0x28:0xfffffe009117e8b0 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 0 (thread taskq) > > rt_newmaddrmsg() at rt_newmaddrmsg+0x2a/frame 0xfffffe009117e8b0 > if_addmulti() at if_addmulti+0x24c/frame 0xfffffe009117e940 > lagg_ether_cmdmulti() at lagg_ether_cmdmulti+0x168/frame 0xfffffe009117e980 > lagg_ioctl() at lagg_ioctl+0xad/frame 0xfffffe009117ea60 > in_leavegroup() at in_leavegroup+0xb9/frame 0xfffffe009117eab0 > inp_gcmoptions() at inp_gcmoptions+0x1a8/frame 0xfffffe009117eb20 > taskqueue_run_locked() at taskqueue_run_locked+0x117/frame 0xfffffe009117eb80 > taskqueue_thread_loop() at taskqueue_thread_loop+0xc8/frame 0xfffffe009117ebb0 > fork_exit() at fork_exit+0x85/frame 0xfffffe009117ebf0 > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe009117ebf0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Here, "curvnet" is NULL in the lagg_ioctl(), so that when > rt_newmaddrmsg() does: > > if (V_route_cb.any_count == 0) > > we crash with the offset of the per-vnet route_cb data. > > (The patches are in Git format, I applied them to the read-only > freebsd master tree on GitHub.) > > Chris > > From 65145b74d3dae68feddcf8a2aad235c3f0a981e9 Mon Sep 17 00:00:00 2001 > From: Chris Torek > Date: Mon, 8 Feb 2016 18:55:30 -0800 > Subject: [PATCH 1/3] ip multicast debug: fix strings vs defines > > Turning on multicast debug made multicast failure worse > because the strings and #define values no longer matched > up. Fix them, and make sure they stay matched-up. > --- > sys/netinet/in_mcast.c | 31 +++++++++++++++++++++---------- > 1 file changed, 21 insertions(+), 10 deletions(-) > > diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c > index c7fff9463128..eef560e7aca7 100644 > --- a/sys/netinet/in_mcast.c > +++ b/sys/netinet/in_mcast.c > @@ -2920,7 +2920,14 @@ sysctl_ip_mcast_filters(SYSCTL_HANDLER_ARGS) > > #if defined(KTR) && (KTR_COMPILE & KTR_IGMPV3) > > -static const char *inm_modestrs[] = { "un", "in", "ex" }; > +static const char *inm_modestrs[] = { > + [MCAST_UNDEFINED] = "un", > + [MCAST_INCLUDE] = "in", > + [MCAST_EXCLUDE] = "ex", > +}; > +_Static_assert(MCAST_UNDEFINED == 0 && > + MCAST_EXCLUDE + 1 == nitems(inm_modestrs), > + "inm_modestrs: no longer matches #defines"); > > static const char * > inm_mode_str(const int mode) > @@ -2932,16 +2939,20 @@ inm_mode_str(const int mode) > } > > static const char *inm_statestrs[] = { > - "not-member", > - "silent", > - "idle", > - "lazy", > - "sleeping", > - "awakening", > - "query-pending", > - "sg-query-pending", > - "leaving" > + [IGMP_NOT_MEMBER] = "not-member", > + [IGMP_SILENT_MEMBER] = "silent", > + [IGMP_REPORTING_MEMBER] = "reporting", > + [IGMP_IDLE_MEMBER] = "idle", > + [IGMP_LAZY_MEMBER] = "lazy", > + [IGMP_SLEEPING_MEMBER] = "sleeping", > + [IGMP_AWAKENING_MEMBER] = "awakening", > + [IGMP_G_QUERY_PENDING_MEMBER] = "query-pending", > + [IGMP_SG_QUERY_PENDING_MEMBER] = "sg-query-pending", > + [IGMP_LEAVING_MEMBER] = "leaving", > }; > +_Static_assert(IGMP_NOT_MEMBER == 0 && > + IGMP_LEAVING_MEMBER + 1 == nitems(inm_statestrs), > + "inm_statetrs: no longer matches #defines"); > > static const char * > inm_state_str(const int state) > -- > 2.12.1 > > > From 099e1b2c059fe962c1dd91af0a6735f02f611ac3 Mon Sep 17 00:00:00 2001 > From: Chris Torek > Date: Wed, 10 Feb 2016 18:41:39 -0800 > Subject: [PATCH 2/3] ip multicast: fix use-after-free > > During if_detach(), we get a race where a closing socket is > releasing multicast data (via inp_freemoptions()) at the same > time as igmp_ifdetach() is releasing all multicast data for > the interface, resulting in a potential double teardown and > double free. > > This bug has been present since late 2011: > > Author: jhb > > Defer the work of freeing IPv4 multicast options from a > socket to an asychronous task. ... > > It is very hard to trip over. You must create and delete > interfaces (bridges that join real interfaces are good > candidates) repeatedly, and even then, if M_IPMADDR (in_multi > data structure) memory is not reused for something else during > the race, the reference count in inm->inm_refcount is an > unsigned int, so it decrements from the left-over 0 to > 4294967295, avoiding a second free. > > Turning on the memory debug options (scribbling values over > freed memory) catches the problem more quickly, though you > still must create and destroy interfaces that partcipate in > multicast to see it. > > The fix here is a kludge, but should serve until the entire > network locking code and up/downcall system is reworked. > --- > sys/netinet/in.c | 4 ++++ > sys/netinet/in_mcast.c | 11 +++++++++++ > sys/netinet/ip_var.h | 1 + > 3 files changed, 16 insertions(+) > > diff --git a/sys/netinet/in.c b/sys/netinet/in.c > index e1611d50b4f0..50cfb188c442 100644 > --- a/sys/netinet/in.c > +++ b/sys/netinet/in.c > @@ -1013,6 +1013,8 @@ in_purgemaddrs(struct ifnet *ifp) > struct in_multi *inm, *tinm; > struct ifmultiaddr *ifma; > > + inp_freemopt_wait(); /* XXX kludge */ > + > LIST_INIT(&purgeinms); > IN_MULTI_LOCK(); > > @@ -1043,6 +1045,8 @@ in_purgemaddrs(struct ifnet *ifp) > igmp_ifdetach(ifp); > > IN_MULTI_UNLOCK(); > + > + inp_freemopt_wait(); /* XXX kludge */ > } > > struct in_llentry { > diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c > index eef560e7aca7..1d1258701fa2 100644 > --- a/sys/netinet/in_mcast.c > +++ b/sys/netinet/in_mcast.c > @@ -1585,6 +1585,17 @@ inp_freemoptions(struct ip_moptions *imo) > taskqueue_enqueue(taskqueue_thread, &imo_gc_task); > } > > +/* > + * Wait for inp_freemoptions() to complete any current work. > + * Used because inp_freemoptions is no longer synchronous. > + */ > +void > +inp_freemopt_wait(void) > +{ > + > + taskqueue_drain(taskqueue_thread, &imo_gc_task); > +} > + > static void > inp_freemoptions_internal(struct ip_moptions *imo) > { > diff --git a/sys/netinet/ip_var.h b/sys/netinet/ip_var.h > index f7e58d18635a..e51bcffc4fc2 100644 > --- a/sys/netinet/ip_var.h > +++ b/sys/netinet/ip_var.h > @@ -200,6 +200,7 @@ extern struct pr_usrreqs rip_usrreqs; > #define V_drop_redirect VNET(drop_redirect) > > void inp_freemoptions(struct ip_moptions *); > +void inp_freemopt_wait(void); > int inp_getmoptions(struct inpcb *, struct sockopt *); > int inp_setmoptions(struct inpcb *, struct sockopt *); > > -- > 2.12.1 > > > From e27ff26bd086c46f2b1cce8eb95f4625cc4333fc Mon Sep 17 00:00:00 2001 > From: Chris Torek > Date: Sat, 5 Mar 2016 15:02:33 -0800 > Subject: [PATCH 3/3] ip_multicast: defer vnet restore in in_leavegroup > > Note: in_leavegroup() does all its real work in > in_leavegroup_locked(). For discussion here the two] > functions might as well be equivalent. > > In in_leavegroup_locked(), when we're shedding a multicast > group, we may (or may not) delete it from an interface via > the igmp_change_state() call. This is where we currently > set the multicast's vnet, and then restore the old vnet on > return. > > However, a few lines later we use inm_release_locked() to > release the inet multicast data structure, and that in turn > may -- not necessarily will, only if the inm really is being > freed -- call if_delmulti_ifma(), which may -- not necessarily > will, again -- call the interface's SIOCDELMULTI ioctl > (if and only if there is an interface and this was the last > ref to this multicast address). > > For (at least) the lagg interface, we still need the current > vnet to be valid during the SIOCDELMULTI. So, don't restore > the old vnet until we've not only finished the IGMP code but > also inm_release_locked(). > --- > sys/netinet/in_mcast.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c > index 1d1258701fa2..4549c952e058 100644 > --- a/sys/netinet/in_mcast.c > +++ b/sys/netinet/in_mcast.c > @@ -1275,12 +1275,12 @@ in_leavegroup_locked(struct in_multi *inm, /*const*/ struct in_mfilter *imf) > CTR1(KTR_IGMPV3, "%s: doing igmp downcall", __func__); > CURVNET_SET(inm->inm_ifp->if_vnet); > error = igmp_change_state(inm); > - CURVNET_RESTORE(); > if (error) > CTR1(KTR_IGMPV3, "%s: failed igmp downcall", __func__); > > CTR2(KTR_IGMPV3, "%s: dropping ref on %p", __func__, inm); > inm_release_locked(inm); > + CURVNET_RESTORE(); > > return (error); > } > -- > 2.12.1 > > _______________________________________________ > freebsd-net@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"