Date: Tue, 7 Apr 2020 19:23:47 -0400
From: Mark Johnston <markj@freebsd.org>
To: Eric Joyner <erj@freebsd.org>
Cc: Hans Petter Selasky <hps@selasky.org>, freebsd-net@freebsd.org,
    shurd <shurd@freebsd.org>, John Baldwin <jhb@freebsd.org>,
    Drew Gallatin <gallatin@netflix.com>
Subject: Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]
Message-ID: <20200407232347.GA5605@raichu>
In-Reply-To: <CA%2Bb0zg-JM1rjO_OPh16sgM3Hm2hbzePNaW5bcxiL9aOpJ_vsOA@mail.gmail.com>
References: <CA%2Bb0zg809EGMS1Ngr38BSb1yNpDqxbCnAv9eC%2BcDwbMQ5t%2BqXQ@mail.gmail.com>
    <20200212222219.GE83892@raichu>
    <CAKdFRZjdiz_axuweksNUHis7jPKXHqOmhQg%2BQWzpVnsKY%2Bcrmg@mail.gmail.com>
    <20200328225150.GA82767@raichu>
    <CAKdFRZgm43LmjJ9dYDBGM8EV0ePRMLPr4YW_tPELANXQGpqpCA@mail.gmail.com>
    <CA%2Bb0zg_k=8nMhapa=T=yTcSJcUrrnG=AfQB%2Be0gPcCrgkbWtCQ@mail.gmail.com>
    <20200331192024.GE97238@raichu>
    <CA%2Bb0zg9z7srroWLtV_poedghXjCr0GvHv95cu4JzFrRdZoaeWw@mail.gmail.com>
    <20200406212903.GA55712@raichu>
    <CA%2Bb0zg-JM1rjO_OPh16sgM3Hm2hbzePNaW5bcxiL9aOpJ_vsOA@mail.gmail.com>
On Mon, Apr 06, 2020 at 02:34:50PM -0700, Eric Joyner wrote:
> On Mon, Apr 6, 2020 at 2:29 PM Mark Johnston <markj@freebsd.org> wrote:
>
> > On Mon, Apr 06, 2020 at 02:19:25PM -0700, Eric Joyner wrote:
> > > Mark,
> > >
> > > I think I was mistaken about the backtrace looking the same. I was looking
> > > at it from within ddb, and I think I focused on the
> > > epoch_block_handler_preempt line and didn't notice that it only stopped
> > > there this time. Here's the new one I've got from kgdb:
> >
> > Thanks. Could you try to print "td->td_name" from frame 4? It should
> > also be available as er->er_blockedtd. Basically, I'm trying to verify
> > that the interrupt thread itself isn't the one that we're waiting for,
> > else there is another bug to be fixed.
> >
> > If you can provide kernel symbols and vmcore, I'd be happy to look at it
> > myself.
> >
>
> Here's what I get:
>
> (kgdb) frame 4
> #4  epoch_block_handler_preempt (global=0xfffff80003de0100,
>     cr=0xfffffe00dee85900, arg=0x0) at /usr/src/sys/kern/subr_epoch.c:507
> 507     }
> (kgdb) print td->td_name
> $1 = "if_io_tqg_31\000\000\000\000\000\000\000"
> (kgdb) print er->er_blockedtd
> $2 = (struct thread *) 0x0

I spent some time looking at the core. It looks like we have yet
another problem: the gtaskqueue code won't exit the net epoch if it is
constantly running a net task. Could you please retry with the patches
from before, and this one included?

diff --git a/sys/kern/subr_gtaskqueue.c b/sys/kern/subr_gtaskqueue.c
index f52f32204644..2b1386a612ee 100644
--- a/sys/kern/subr_gtaskqueue.c
+++ b/sys/kern/subr_gtaskqueue.c
@@ -345,7 +345,7 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
 	struct epoch_tracker et;
 	struct gtaskqueue_busy tb;
 	struct gtask *gtask;
-	bool in_net_epoch;
+	bool in_net_epoch;
 
 	KASSERT(queue != NULL, ("tq is NULL"));
 	TQ_ASSERT_LOCKED(queue);
@@ -361,20 +361,19 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
 		TQ_UNLOCK(queue);
 
 		KASSERT(gtask->ta_func != NULL, ("task->ta_func is NULL"));
-		if (!in_net_epoch && TASK_IS_NET(gtask)) {
-			in_net_epoch = true;
+		if (TASK_IS_NET(gtask)) {
 			NET_EPOCH_ENTER(et);
-		} else if (in_net_epoch && !TASK_IS_NET(gtask)) {
+			in_net_epoch = true;
+		}
+		gtask->ta_func(gtask->ta_context);
+		if (in_net_epoch) {
 			NET_EPOCH_EXIT(et);
 			in_net_epoch = false;
 		}
-		gtask->ta_func(gtask->ta_context);
 
 		TQ_LOCK(queue);
 		wakeup(gtask);
 	}
-	if (in_net_epoch)
-		NET_EPOCH_EXIT(et);
 	LIST_REMOVE(&tb, tb_link);
 }
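
For anyone following along, the effect of the change can be modelled in
userspace. This is only a rough sketch, not kernel code: toy_enter(),
toy_exit() and the two *_longest_section() helpers are hypothetical names
made up for illustration, and they stand in for NET_EPOCH_ENTER(),
NET_EPOCH_EXIT() and the loop in gtaskqueue_run_locked(). Each helper
returns the largest number of tasks that ran inside a single epoch
section, which is roughly how long a waiter in epoch_drain_callbacks()
would be held up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an epoch section; NOT the real NET_EPOCH_* macros. */
struct toy_epoch {
	bool	entered;	/* is the thread inside a section right now? */
};

static void
toy_enter(struct toy_epoch *e)
{
	assert(!e->entered);	/* sections are never nested in this model */
	e->entered = true;
}

static void
toy_exit(struct toy_epoch *e)
{
	assert(e->entered);
	e->entered = false;
}

/* Old policy: stay in the section across consecutive net tasks. */
static size_t
old_policy_longest_section(const bool *is_net, size_t n)
{
	struct toy_epoch e = { false };
	size_t run = 0, longest = 0;

	for (size_t i = 0; i < n; i++) {
		if (!e.entered && is_net[i]) {
			toy_enter(&e);
			run = 0;
		} else if (e.entered && !is_net[i]) {
			toy_exit(&e);
		}
		/* The task body would run here. */
		if (e.entered && ++run > longest)
			longest = run;
	}
	if (e.entered)
		toy_exit(&e);
	return (longest);
}

/* New policy: enter and exit the section around every net task. */
static size_t
new_policy_longest_section(const bool *is_net, size_t n)
{
	struct toy_epoch e = { false };
	size_t longest = 0;

	for (size_t i = 0; i < n; i++) {
		if (is_net[i])
			toy_enter(&e);
		/* The task body would run here. */
		if (e.entered) {
			toy_exit(&e);
			longest = 1;
		}
	}
	return (longest);
}
```

With a queue that only ever holds net tasks, the old loop keeps one
section open for the whole run, while the new loop never holds a section
for more than one task. In the real taskqueue thread the queue may never
drain, so under the old policy the section may never end.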