Date:      Thu, 9 Apr 2020 14:02:51 -0700
From:      Eric Joyner <erj@freebsd.org>
To:        Mark Johnston <markj@freebsd.org>
Cc:        Hans Petter Selasky <hps@selasky.org>, freebsd-net@freebsd.org, shurd <shurd@freebsd.org>,  John Baldwin <jhb@freebsd.org>, Drew Gallatin <gallatin@netflix.com>
Subject:   Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]
Message-ID:  <CA%2Bb0zg9DZys8v--Rwtg1qBkz8XbByehq6vr-xmLtjenNGgRKiQ@mail.gmail.com>
In-Reply-To: <20200407232347.GA5605@raichu>
References:  <CA%2Bb0zg809EGMS1Ngr38BSb1yNpDqxbCnAv9eC%2BcDwbMQ5t%2BqXQ@mail.gmail.com> <20200212222219.GE83892@raichu> <CAKdFRZjdiz_axuweksNUHis7jPKXHqOmhQg%2BQWzpVnsKY%2Bcrmg@mail.gmail.com> <20200328225150.GA82767@raichu> <CAKdFRZgm43LmjJ9dYDBGM8EV0ePRMLPr4YW_tPELANXQGpqpCA@mail.gmail.com> <CA%2Bb0zg_k=8nMhapa=T=yTcSJcUrrnG=AfQB%2Be0gPcCrgkbWtCQ@mail.gmail.com> <20200331192024.GE97238@raichu> <CA%2Bb0zg9z7srroWLtV_poedghXjCr0GvHv95cu4JzFrRdZoaeWw@mail.gmail.com> <20200406212903.GA55712@raichu> <CA%2Bb0zg-JM1rjO_OPh16sgM3Hm2hbzePNaW5bcxiL9aOpJ_vsOA@mail.gmail.com> <20200407232347.GA5605@raichu>

On Tue, Apr 7, 2020 at 4:24 PM Mark Johnston <markj@freebsd.org> wrote:

> On Mon, Apr 06, 2020 at 02:34:50PM -0700, Eric Joyner wrote:
> > On Mon, Apr 6, 2020 at 2:29 PM Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Mon, Apr 06, 2020 at 02:19:25PM -0700, Eric Joyner wrote:
> > > > Mark,
> > > >
> > > > I think I was mistaken about the backtrace looking the same. I was
> > > > looking at it from within ddb, and I think I focused on the
> > > > epoch_block_handler_preempt line and didn't notice that it only
> > > > stopped there this time. Here's the new one I've got from kgdb:
> > >
> > > Thanks.  Could you try to print "td->td_name" from frame 4?  It should
> > > also be available as er->er_blockedtd.  Basically, I'm trying to verify
> > > that the interrupt thread itself isn't the one that we're waiting for,
> > > else there is another bug to be fixed.
> > >
> > > If you can provide kernel symbols and vmcore, I'd be happy to look at
> > > it myself.
> > > _______________________________________________
> > > freebsd-net@freebsd.org mailing list
> > > https://lists.freebsd.org/mailman/listinfo/freebsd-net
> > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> > >
> >
> > Here's what I get:
> >
> > (kgdb) frame 4
> > #4  epoch_block_handler_preempt (global=0xfffff80003de0100,
> > cr=0xfffffe00dee85900, arg=0x0) at /usr/src/sys/kern/subr_epoch.c:507
> > 507     }
> > (kgdb) print td->td_name
> > $1 = "if_io_tqg_31\000\000\000\000\000\000\000"
> > (kgdb) print er->er_blockedtd
> > $2 = (struct thread *) 0x0
>
> I spent some time looking at the core.  It looks like we have yet
> another problem: the gtaskqueue code won't exit the net epoch if it is
> constantly running a net task.  Could you please retry with the patches
> from before, and this one included?
>
> diff --git a/sys/kern/subr_gtaskqueue.c b/sys/kern/subr_gtaskqueue.c
> index f52f32204644..2b1386a612ee 100644
> --- a/sys/kern/subr_gtaskqueue.c
> +++ b/sys/kern/subr_gtaskqueue.c
> @@ -345,7 +345,7 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
>         struct epoch_tracker et;
>         struct gtaskqueue_busy tb;
>         struct gtask *gtask;
> -       bool in_net_epoch;
> +       bool in_net_epoch;
>
>         KASSERT(queue != NULL, ("tq is NULL"));
>         TQ_ASSERT_LOCKED(queue);
> @@ -361,20 +361,19 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
>                 TQ_UNLOCK(queue);
>
>                 KASSERT(gtask->ta_func != NULL, ("task->ta_func is NULL"));
> -               if (!in_net_epoch && TASK_IS_NET(gtask)) {
> -                       in_net_epoch = true;
> +               if (TASK_IS_NET(gtask)) {
>                         NET_EPOCH_ENTER(et);
> -               } else if (in_net_epoch && !TASK_IS_NET(gtask)) {
> +                       in_net_epoch = true;
> +               }
> +               gtask->ta_func(gtask->ta_context);
> +               if (in_net_epoch) {
>                         NET_EPOCH_EXIT(et);
>                         in_net_epoch = false;
>                 }
> -               gtask->ta_func(gtask->ta_context);
>
>                 TQ_LOCK(queue);
>                 wakeup(gtask);
>         }
> -       if (in_net_epoch)
> -               NET_EPOCH_EXIT(et);
>         LIST_REMOVE(&tb, tb_link);
>  }
>
>

Yeah, I'll give it a spin and try to get back to you before the end of the
week.

- Eric


