Date:      Tue, 21 Jul 2020 23:05:14 -0700
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Peter Libassi <peter@libassi.se>
Cc:        Marko Zec <zec@fer.hr>, freebsd-net@freebsd.org, freebsd-current@freebsd.org
Subject:   Re: somewhat reproducable vimage panic
Message-ID:  <20200722060514.GF4213@funkthat.com>
In-Reply-To: <38F5A3A6-B578-4BA4-8F69-C248163CB6E0@libassi.se>
References:  <20200721091654.GC4213@funkthat.com> <20200721113153.42d83119@x23> <20200721202323.GE4213@funkthat.com> <38F5A3A6-B578-4BA4-8F69-C248163CB6E0@libassi.se>

Peter Libassi wrote this message on Wed, Jul 22, 2020 at 06:54 +0200:
> Is this related to 
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234985 and https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238326

Definitely not 234985..  I'm using ue interfaces, and so they don't
get destroyed while the jail is going away...

I don't think it's 238326 either.  This panic is 100% reproducible for
me and it's in the IP multicast code..  It looks like in_multi isn't
holding an interface or address reference (or otherwise synchronizing
with teardown) while the deferred release waits to run...
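
To illustrate what I mean about the missing reference, here's roughly
the shape of it.  Completely untested, the helper and list names are
made up, and I haven't checked whether an if_ref() is actually enough
to keep the containing vnet alive, so treat it as a sketch only:

/*
 * Sketch only: whoever queues the inm for deferred release would pin
 * the ifnet first ("inm_defer_release" and "inm_free_deferred" are
 * stand-ins for whatever in_mcast.c really uses)...
 */
static void
inm_defer_release(struct in_multi *inm)
{

	if_ref(inm->inm_ifp);		/* pin the ifnet (and, I hope, its vnet) */
	SLIST_INSERT_HEAD(&inm_free_deferred, inm, inm_nrele);
}

/* ...and the release task would drop the pin once inm_release() is done: */
SLIST_FOREACH_SAFE(inm, &inm_free_tmp, inm_nrele, tinm) {
	struct ifnet *ifp = inm->inm_ifp;

	SLIST_REMOVE_HEAD(&inm_free_tmp, inm_nrele);
	CURVNET_SET(ifp->if_vnet);
	inm_release(inm);
	CURVNET_RESTORE();
	if_rele(ifp);			/* drop the queue-time reference */
}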

> > On 21 July 2020, at 22:23, John-Mark Gurney <jmg@funkthat.com> wrote:
> > 
> > Marko Zec wrote this message on Tue, Jul 21, 2020 at 11:31 +0200:
> >> On Tue, 21 Jul 2020 02:16:55 -0700
> >> John-Mark Gurney <jmg@funkthat.com> wrote:
> >> 
> >>> I'm running:
> >>> FreeBSD test 13.0-CURRENT FreeBSD 13.0-CURRENT #0 r362596: Thu Jun 25
> >>> 05:02:51 UTC 2020
> >>> root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
> >>> amd64
> >>> 
> >>> and I'm working on improving the if_ure driver.  I've put together a
> >>> little script (attached) that I'm using to test the driver..
> >>> It puts a couple ue interfaces each into their own jail, configures
> >>> them, and tries to pass traffic.  This assumes that the two interfaces
> >>> are connected together.
> >>> 
> >>> Pretty regularly when destroying the jails, I get the following
> >>> panic: CURVNET_SET at /usr/src/sys/netinet/in_mcast.c:626
> >>> inm_release() curvnet=0 vnet=0xfffff80154c82a80
> >> 
> >> Perhaps the attached patch could help? (disclaimer: not even
> >> compile-tested)
> > 
> > The patch compiled, but it just moved the panic earlier than before.
> > 
> > #4  0xffffffff80bc2123 in panic (fmt=<unavailable>)
> >    at ../../../kern/kern_shutdown.c:839
> > #5  0xffffffff80d61726 in inm_release_task (arg=<optimized out>, 
> >    pending=<optimized out>) at ../../../netinet/in_mcast.c:633
> > #6  0xffffffff80c2166a in taskqueue_run_locked (queue=0xfffff800033cfd00)
> >    at ../../../kern/subr_taskqueue.c:476
> > #7  0xffffffff80c226e4 in taskqueue_thread_loop (arg=<optimized out>)
> >    at ../../../kern/subr_taskqueue.c:793
> > 
> > Now it panics at the location of the new CURVNET_SET and not the
> > old one..
> > 
> > Ok, I decided to dump the contents of the vnet, and it looks like
> > it's a use-after-free:
> > (kgdb) print/x *(struct vnet *)0xfffff8012a283140
> > $2 = {vnet_le = {le_next = 0xdeadc0dedeadc0de, le_prev = 0xdeadc0dedeadc0de}, vnet_magic_n = 0xdeadc0de, 
> >  vnet_ifcnt = 0xdeadc0de, vnet_sockcnt = 0xdeadc0de, vnet_state = 0xdeadc0de, vnet_data_mem = 0xdeadc0dedeadc0de, 
> >  vnet_data_base = 0xdeadc0dedeadc0de, vnet_shutdown = 0xde}
> > 
> > The patch did seem to make it happen quicker, or maybe I was just
> > luckier this morning...
> > 
> >>> (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> >>> #1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:394
> >>> #2  0xffffffff80bc6250 in kern_reboot (howto=260)
> >>>    at /usr/src/sys/kern/kern_shutdown.c:481
> >>> #3  0xffffffff80bc66aa in vpanic (fmt=<optimized out>, ap=<optimized
> >>> out>) at /usr/src/sys/kern/kern_shutdown.c:913
> >>> #4  0xffffffff80bc6403 in panic (fmt=<unavailable>)
> >>>    at /usr/src/sys/kern/kern_shutdown.c:839
> >>> #5  0xffffffff80d6553b in inm_release (inm=0xfffff80029043700)
> >>>    at /usr/src/sys/netinet/in_mcast.c:630
> >>> #6  inm_release_task (arg=<optimized out>, pending=<optimized out>)
> >>>    at /usr/src/sys/netinet/in_mcast.c:312
> >>> #7  0xffffffff80c2521a in taskqueue_run_locked
> >>> (queue=0xfffff80003116b00) at /usr/src/sys/kern/subr_taskqueue.c:476
> >>> #8  0xffffffff80c26294 in taskqueue_thread_loop (arg=<optimized out>)
> >>>    at /usr/src/sys/kern/subr_taskqueue.c:793
> >>> #9  0xffffffff80b830f0 in fork_exit (
> >>>    callout=0xffffffff80c26200 <taskqueue_thread_loop>, 
> >>>    arg=0xffffffff81cf4f70 <taskqueue_thread>,
> >>> frame=0xfffffe0049e99b80) at /usr/src/sys/kern/kern_fork.c:1052
> >>> #10 <signal handler called>
> >>> (kgdb) 
> >>> 
> >>> I have the core files so I can get additional information.
> >>> 
> >>> Let me know if you need any additional information.
> >>> 
> >> 
> > 
> >> Index: sys/netinet/in_mcast.c
> >> ===================================================================
> >> --- sys/netinet/in_mcast.c	(revision 363386)
> >> +++ sys/netinet/in_mcast.c	(working copy)
> >> @@ -309,8 +309,10 @@
> >> 	IN_MULTI_LOCK();
> >> 	SLIST_FOREACH_SAFE(inm, &inm_free_tmp, inm_nrele, tinm) {
> >> 		SLIST_REMOVE_HEAD(&inm_free_tmp, inm_nrele);
> >> +		CURVNET_SET(inm->inm_ifp->if_vnet);
> >> 		MPASS(inm);
> >> 		inm_release(inm);
> >> +		CURVNET_RESTORE();
> >> 	}
> >> 	IN_MULTI_UNLOCK();
> >> }
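
For what it's worth, the 0xdeadc0de fill in the dump above is the
kernel's freed-memory poison, so the vnet the deferred release is
trying to CURVNET_SET into has already been torn down by the time the
task runs, which is what the vnet_magic_n check behind CURVNET_SET
catches.  Another angle might be to flush the deferred releases before
the vnet goes away, something like the untested sketch below (the task
symbol and the sysinit subsystem/ordering are guesses, not the real
names in in_mcast.c):

/*
 * Sketch only: drain any pending inm_release_task() work while the
 * dying vnet is still valid.  "inm_free_task" is a stand-in for
 * whatever task in_mcast.c actually queues on taskqueue_thread.
 */
static void
inm_flush_deferred(const void *unused __unused)
{

	taskqueue_drain(taskqueue_thread, &inm_free_task);
}
VNET_SYSUNINIT(inm_flush_deferred, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY,
    inm_flush_deferred, NULL);

No idea whether that would run early enough relative to the ifnet
teardown, but it would at least narrow the window.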

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."


