Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 26 Apr 2017 14:53:55 -0700
From:      Adrian Chadd <adrian.chadd@gmail.com>
To:        Chris Torek <torek@torek.net>
Cc:        FreeBSD Net <freebsd-net@freebsd.org>
Subject:   Re: three multicast fixes
Message-ID:  <CAJ-VmomrdDd567Lg=Kon%2Bx-XOg=u0f8Y9WYHa54QCN3_j4rQxQ@mail.gmail.com>
In-Reply-To: <CAJ-Vmok1tDCbD8puxfFRrFKkNNTeSpJoEh1DWaU3wzbXnSvaJA@mail.gmail.com>
References:  <201704171358.v3HDwmF3071624@elf.torek.net> <CAJ-Vmok1tDCbD8puxfFRrFKkNNTeSpJoEh1DWaU3wzbXnSvaJA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Actually - into individual reviews.freebsd.org reviews?



-adrian


On 26 April 2017 at 14:53, Adrian Chadd <adrian.chadd@gmail.com> wrote:
> hiya,
>
> can you throw these in bugs?
>
>
> -a
>
>
> On 17 April 2017 at 06:58, Chris Torek <torek@torek.net> wrote:
>> The first is mostly cosmetic, it's just something I observed when
>> I turned on multicast debug to find the bug that the second patch
>> is for.
>>
>> The second patch is a kludge and if anyone has a better fix, have
>> at it.  :-) Note that it's also hard to provoke the bug.  The only
>> way I know to observe it somewhat reliably is to create and
>> destroy bridge interfaces, using a debug kernel with memory debug
>> enabled, while adding and deleting hardware interfaces to the
>> bridge so that they repeatedly join and leave multicast groups.
>> Eventually you hit the race and crash.
>>
>> The third is simple enough to observe.  It may affect only the lagg
>> driver, which implements SIOCDELMULTI by removing and re-adding all
>> the multicast addresses other than the one actually deleted (which
>> is of course really-removed by then).  If running lagg on a VIMAGE
>> kernel, leaving a multicast group causes this occasional crash:
>>
>>   Fatal trap 12: page fault while in kernel mode
>>   cpuid = 0; apic id = 00
>>   fault virtual address   = 0x28
>>   fault code              = supervisor read data, page not present
>>   instruction pointer     = 0x20:0xffffffff80adf6ea
>>   stack pointer           = 0x28:0xfffffe009117e810
>>   frame pointer           = 0x28:0xfffffe009117e8b0
>>   code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                           = DPL 0, pres 1, long 1, def32 0, gran 1
>>   processor eflags        = interrupt enabled, resume, IOPL = 0
>>   current process         = 0 (thread taskq)
>>
>>   rt_newmaddrmsg() at rt_newmaddrmsg+0x2a/frame 0xfffffe009117e8b0
>>   if_addmulti() at if_addmulti+0x24c/frame 0xfffffe009117e940
>>   lagg_ether_cmdmulti() at lagg_ether_cmdmulti+0x168/frame 0xfffffe009117e980
>>   lagg_ioctl() at lagg_ioctl+0xad/frame 0xfffffe009117ea60
>>   in_leavegroup() at in_leavegroup+0xb9/frame 0xfffffe009117eab0
>>   inp_gcmoptions() at inp_gcmoptions+0x1a8/frame 0xfffffe009117eb20
>>   taskqueue_run_locked() at taskqueue_run_locked+0x117/frame 0xfffffe009117eb80
>>   taskqueue_thread_loop() at taskqueue_thread_loop+0xc8/frame 0xfffffe009117ebb0
>>   fork_exit() at fork_exit+0x85/frame 0xfffffe009117ebf0
>>   fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe009117ebf0
>>   --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>
>> Here, "curvnet" is NULL in the lagg_ioctl(), so that when
>> rt_newmaddrmsg() does:
>>
>>         if (V_route_cb.any_count == 0)
>>
>> we crash with the offset of the per-vnet route_cb data.
>>
>> (The patches are in Git format, I applied them to the read-only
>> freebsd master tree on GitHub.)
>>
>> Chris
>>
>> From 65145b74d3dae68feddcf8a2aad235c3f0a981e9 Mon Sep 17 00:00:00 2001
>> From: Chris Torek <chris.torek@gmail.com>
>> Date: Mon, 8 Feb 2016 18:55:30 -0800
>> Subject: [PATCH 1/3] ip multicast debug: fix strings vs defines
>>
>> Turning on multicast debug made multicast failure worse
>> because the strings and #define values no longer matched
>> up.  Fix them, and make sure they stay matched-up.
>> ---
>>  sys/netinet/in_mcast.c | 31 +++++++++++++++++++++----------
>>  1 file changed, 21 insertions(+), 10 deletions(-)
>>
>> diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c
>> index c7fff9463128..eef560e7aca7 100644
>> --- a/sys/netinet/in_mcast.c
>> +++ b/sys/netinet/in_mcast.c
>> @@ -2920,7 +2920,14 @@ sysctl_ip_mcast_filters(SYSCTL_HANDLER_ARGS)
>>
>>  #if defined(KTR) && (KTR_COMPILE & KTR_IGMPV3)
>>
>> -static const char *inm_modestrs[] = { "un", "in", "ex" };
>> +static const char *inm_modestrs[] = {
>> +       [MCAST_UNDEFINED] = "un",
>> +       [MCAST_INCLUDE] = "in",
>> +       [MCAST_EXCLUDE] = "ex",
>> +};
>> +_Static_assert(MCAST_UNDEFINED == 0 &&
>> +              MCAST_EXCLUDE + 1 == nitems(inm_modestrs),
>> +              "inm_modestrs: no longer matches #defines");
>>
>>  static const char *
>>  inm_mode_str(const int mode)
>> @@ -2932,16 +2939,20 @@ inm_mode_str(const int mode)
>>  }
>>
>>  static const char *inm_statestrs[] = {
>> -       "not-member",
>> -       "silent",
>> -       "idle",
>> -       "lazy",
>> -       "sleeping",
>> -       "awakening",
>> -       "query-pending",
>> -       "sg-query-pending",
>> -       "leaving"
>> +       [IGMP_NOT_MEMBER] = "not-member",
>> +       [IGMP_SILENT_MEMBER] = "silent",
>> +       [IGMP_REPORTING_MEMBER] = "reporting",
>> +       [IGMP_IDLE_MEMBER] = "idle",
>> +       [IGMP_LAZY_MEMBER] = "lazy",
>> +       [IGMP_SLEEPING_MEMBER] = "sleeping",
>> +       [IGMP_AWAKENING_MEMBER] = "awakening",
>> +       [IGMP_G_QUERY_PENDING_MEMBER] = "query-pending",
>> +       [IGMP_SG_QUERY_PENDING_MEMBER] = "sg-query-pending",
>> +       [IGMP_LEAVING_MEMBER] = "leaving",
>>  };
>> +_Static_assert(IGMP_NOT_MEMBER == 0 &&
>> +              IGMP_LEAVING_MEMBER + 1 == nitems(inm_statestrs),
>> +              "inm_statetrs: no longer matches #defines");
>>
>>  static const char *
>>  inm_state_str(const int state)
>> --
>> 2.12.1
>>
>>
>> From 099e1b2c059fe962c1dd91af0a6735f02f611ac3 Mon Sep 17 00:00:00 2001
>> From: Chris Torek <torek@ixsystems.com>
>> Date: Wed, 10 Feb 2016 18:41:39 -0800
>> Subject: [PATCH 2/3] ip multicast: fix use-after-free
>>
>> During if_detach(), we get a race where a closing socket is
>> releasing multicast data (via inp_freemoptions()) at the same
>> time as igmp_ifdetach() is releasing all multicast data for
>> the interface, resulting in a potential double teardown and
>> double free.
>>
>> This bug has been present since late 2011:
>>
>>     Author: jhb <jhb@FreeBSD.org>
>>
>>     Defer the work of freeing IPv4 multicast options from a
>>     socket to an asychronous task. ...
>>
>> It is very hard to trip over.  You must create and delete
>> interfaces (bridges that join real interfaces are good
>> candidates) repeatedly, and even then, if M_IPMADDR (in_multi
>> data structure) memory is not reused for something else during
>> the race, the reference count in inm->inm_refcount is an
>> unsigned int, so it decrements from the left-over 0 to
>> 4294967295, avoiding a second free.
>>
>> Turning on the memory debug options (scribbling values over
>> freed memory) catches the problem more quickly, though you
>> still must create and destroy interfaces that partcipate in
>> multicast to see it.
>>
>> The fix here is a kludge, but should serve until the entire
>> network locking code and up/downcall system is reworked.
>> ---
>>  sys/netinet/in.c       |  4 ++++
>>  sys/netinet/in_mcast.c | 11 +++++++++++
>>  sys/netinet/ip_var.h   |  1 +
>>  3 files changed, 16 insertions(+)
>>
>> diff --git a/sys/netinet/in.c b/sys/netinet/in.c
>> index e1611d50b4f0..50cfb188c442 100644
>> --- a/sys/netinet/in.c
>> +++ b/sys/netinet/in.c
>> @@ -1013,6 +1013,8 @@ in_purgemaddrs(struct ifnet *ifp)
>>         struct in_multi         *inm, *tinm;
>>         struct ifmultiaddr      *ifma;
>>
>> +       inp_freemopt_wait();    /* XXX kludge */
>> +
>>         LIST_INIT(&purgeinms);
>>         IN_MULTI_LOCK();
>>
>> @@ -1043,6 +1045,8 @@ in_purgemaddrs(struct ifnet *ifp)
>>         igmp_ifdetach(ifp);
>>
>>         IN_MULTI_UNLOCK();
>> +
>> +       inp_freemopt_wait();    /* XXX kludge */
>>  }
>>
>>  struct in_llentry {
>> diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c
>> index eef560e7aca7..1d1258701fa2 100644
>> --- a/sys/netinet/in_mcast.c
>> +++ b/sys/netinet/in_mcast.c
>> @@ -1585,6 +1585,17 @@ inp_freemoptions(struct ip_moptions *imo)
>>         taskqueue_enqueue(taskqueue_thread, &imo_gc_task);
>>  }
>>
>> +/*
>> + * Wait for inp_freemoptions() to complete any current work.
>> + * Used because inp_freemoptions is no longer synchronous.
>> + */
>> +void
>> +inp_freemopt_wait(void)
>> +{
>> +
>> +       taskqueue_drain(taskqueue_thread, &imo_gc_task);
>> +}
>> +
>>  static void
>>  inp_freemoptions_internal(struct ip_moptions *imo)
>>  {
>> diff --git a/sys/netinet/ip_var.h b/sys/netinet/ip_var.h
>> index f7e58d18635a..e51bcffc4fc2 100644
>> --- a/sys/netinet/ip_var.h
>> +++ b/sys/netinet/ip_var.h
>> @@ -200,6 +200,7 @@ extern struct       pr_usrreqs rip_usrreqs;
>>  #define        V_drop_redirect         VNET(drop_redirect)
>>
>>  void   inp_freemoptions(struct ip_moptions *);
>> +void   inp_freemopt_wait(void);
>>  int    inp_getmoptions(struct inpcb *, struct sockopt *);
>>  int    inp_setmoptions(struct inpcb *, struct sockopt *);
>>
>> --
>> 2.12.1
>>
>>
>> From e27ff26bd086c46f2b1cce8eb95f4625cc4333fc Mon Sep 17 00:00:00 2001
>> From: Chris Torek <torek@ixsystems.com>
>> Date: Sat, 5 Mar 2016 15:02:33 -0800
>> Subject: [PATCH 3/3] ip_multicast: defer vnet restore in in_leavegroup
>>
>> Note: in_leavegroup() does all its real work in
>> in_leavegroup_locked().  For discussion here the two]
>> functions might as well be equivalent.
>>
>> In in_leavegroup_locked(), when we're shedding a multicast
>> group, we may (or may not) delete it from an interface via
>> the igmp_change_state() call.  This is where we currently
>> set the multicast's vnet, and then restore the old vnet on
>> return.
>>
>> However, a few lines later we use inm_release_locked() to
>> release the inet multicast data structure, and that in turn
>> may -- not necessarily will, only if the inm really is being
>> freed -- call if_delmulti_ifma(), which may -- not necessarily
>> will, again -- call the interface's SIOCDELMULTI ioctl
>> (if and only if there is an interface and this was the last
>> ref to this multicast address).
>>
>> For (at least) the lagg interface, we still need the current
>> vnet to be valid during the SIOCDELMULTI.  So, don't restore
>> the old vnet until we've not only finished the IGMP code but
>> also inm_release_locked().
>> ---
>>  sys/netinet/in_mcast.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sys/netinet/in_mcast.c b/sys/netinet/in_mcast.c
>> index 1d1258701fa2..4549c952e058 100644
>> --- a/sys/netinet/in_mcast.c
>> +++ b/sys/netinet/in_mcast.c
>> @@ -1275,12 +1275,12 @@ in_leavegroup_locked(struct in_multi *inm, /*const*/ struct in_mfilter *imf)
>>         CTR1(KTR_IGMPV3, "%s: doing igmp downcall", __func__);
>>         CURVNET_SET(inm->inm_ifp->if_vnet);
>>         error = igmp_change_state(inm);
>> -       CURVNET_RESTORE();
>>         if (error)
>>                 CTR1(KTR_IGMPV3, "%s: failed igmp downcall", __func__);
>>
>>         CTR2(KTR_IGMPV3, "%s: dropping ref on %p", __func__, inm);
>>         inm_release_locked(inm);
>> +       CURVNET_RESTORE();
>>
>>         return (error);
>>  }
>> --
>> 2.12.1
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-VmomrdDd567Lg=Kon%2Bx-XOg=u0f8Y9WYHa54QCN3_j4rQxQ>