From: Eric Joyner
Date: Thu, 9 Apr 2020 14:29:45 -0700
Subject: Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]
To: Mark Johnston
Cc: Hans Petter Selasky, freebsd-net@freebsd.org, shurd, John Baldwin, Drew Gallatin
List-Id: Networking and TCP/IP with FreeBSD

On Thu, Apr 9, 2020 at 2:02 PM Eric Joyner wrote:

> On Tue, Apr 7, 2020 at 4:24 PM Mark Johnston wrote:
>
>> On Mon, Apr 06, 2020 at 02:34:50PM -0700, Eric Joyner wrote:
>> > On Mon, Apr 6, 2020 at 2:29 PM Mark Johnston wrote:
>> >
>> > > On Mon, Apr 06, 2020 at 02:19:25PM -0700, Eric Joyner wrote:
>> > > > Mark,
>> > > >
>> > > > I think I was mistaken about the backtrace looking the same. I was
>> > > > looking at it from within ddb, and I think I focused on the
>> > > > epoch_block_handler_preempt line and didn't notice that it only
>> > > > stopped there this time. Here's the new one I've got from kgdb:
>> > >
>> > > Thanks. Could you try to print "td->td_name" from frame 4? It should
>> > > also be available as er->er_blockedtd. Basically, I'm trying to
>> > > verify that the interrupt thread itself isn't the one that we're
>> > > waiting for, else there is another bug to be fixed.
>> > >
>> > > If you can provide kernel symbols and vmcore, I'd be happy to look at
>> > > it myself.
>> > > _______________________________________________
>> > > freebsd-net@freebsd.org mailing list
>> > > https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>> > >
>> >
>> > Here's what I get:
>> >
>> > (kgdb) frame 4
>> > #4  epoch_block_handler_preempt (global=0xfffff80003de0100,
>> >     cr=0xfffffe00dee85900, arg=0x0) at /usr/src/sys/kern/subr_epoch.c:507
>> > 507     }
>> > (kgdb) print td->td_name
>> > $1 = "if_io_tqg_31\000\000\000\000\000\000\000"
>> > (kgdb) print er->er_blockedtd
>> > $2 = (struct thread *) 0x0
>>
>> I spent some time looking at the core. It looks like we have yet
>> another problem: the gtaskqueue code won't exit the net epoch if it is
>> constantly running a net task. Could you please retry with the patches
>> from before, and this one included?
>>
>> diff --git a/sys/kern/subr_gtaskqueue.c b/sys/kern/subr_gtaskqueue.c
>> index f52f32204644..2b1386a612ee 100644
>> --- a/sys/kern/subr_gtaskqueue.c
>> +++ b/sys/kern/subr_gtaskqueue.c
>> @@ -345,7 +345,7 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
>>  	struct epoch_tracker et;
>>  	struct gtaskqueue_busy tb;
>>  	struct gtask *gtask;
>> -	bool in_net_epoch;
>> +	bool in_net_epoch;
>>
>>  	KASSERT(queue != NULL, ("tq is NULL"));
>>  	TQ_ASSERT_LOCKED(queue);
>> @@ -361,20 +361,19 @@ gtaskqueue_run_locked(struct gtaskqueue *queue)
>>  		TQ_UNLOCK(queue);
>>
>>  		KASSERT(gtask->ta_func != NULL, ("task->ta_func is NULL"));
>> -		if (!in_net_epoch && TASK_IS_NET(gtask)) {
>> -			in_net_epoch = true;
>> +		if (TASK_IS_NET(gtask)) {
>>  			NET_EPOCH_ENTER(et);
>> -		} else if (in_net_epoch && !TASK_IS_NET(gtask)) {
>> +			in_net_epoch = true;
>> +		}
>> +		gtask->ta_func(gtask->ta_context);
>> +		if (in_net_epoch) {
>>  			NET_EPOCH_EXIT(et);
>>  			in_net_epoch = false;
>>  		}
>> -		gtask->ta_func(gtask->ta_context);
>>
>>  		TQ_LOCK(queue);
>>  		wakeup(gtask);
>>  	}
>> -	if (in_net_epoch)
>> -		NET_EPOCH_EXIT(et);
>>  	LIST_REMOVE(&tb, tb_link);
>>  }
>>
>
> Yeah, I'll give it a spin and try to get back to you before the end of the
> week.
>
> - Eric
>

I was able to try it out just now, and it looks like this (and all of the
other patches) finally causes the problem to not appear! I can unload the
driver while iavf1 is receiving heavy traffic!

- Eric
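
For anyone reading this thread later, here is a rough userspace sketch of why the last patch matters. It is not kernel code: fake_enter(), fake_exit(), fake_run_task() and the longest_section_*() helpers are made-up names, and a finite array of flags stands in for the task queue. It only assumes what the quoted diff shows, namely that the old loop stayed inside the net epoch across consecutive net tasks, while the patched loop enters and exits around each one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the net epoch section; not the kernel API. */
struct fake_epoch {
	int depth;		/* 1 while inside the section, 0 otherwise */
	int run;		/* tasks run inside the current section */
	int longest_run;	/* most tasks run inside any single section */
};

static void
fake_enter(struct fake_epoch *e)
{
	assert(e->depth == 0);	/* this sketch never nests sections */
	e->depth = 1;
	e->run = 0;
}

static void
fake_exit(struct fake_epoch *e)
{
	assert(e->depth == 1);
	e->depth = 0;
}

/* "Run" one task, counting it only if it executes inside the section. */
static void
fake_run_task(struct fake_epoch *e)
{
	if (e->depth > 0 && ++e->run > e->longest_run)
		e->longest_run = e->run;
}

/* Pre-patch shape: the section is held across consecutive net tasks. */
int
longest_section_old(const bool *is_net, size_t n)
{
	struct fake_epoch e = { 0, 0, 0 };
	bool in_net_epoch = false;

	for (size_t i = 0; i < n; i++) {
		if (!in_net_epoch && is_net[i]) {
			in_net_epoch = true;
			fake_enter(&e);
		} else if (in_net_epoch && !is_net[i]) {
			fake_exit(&e);
			in_net_epoch = false;
		}
		fake_run_task(&e);
	}
	if (in_net_epoch)
		fake_exit(&e);
	return (e.longest_run);
}

/* Patched shape: enter and exit the section around each net task. */
int
longest_section_new(const bool *is_net, size_t n)
{
	struct fake_epoch e = { 0, 0, 0 };
	bool in_net_epoch = false;

	for (size_t i = 0; i < n; i++) {
		if (is_net[i]) {
			fake_enter(&e);
			in_net_epoch = true;
		}
		fake_run_task(&e);
		if (in_net_epoch) {
			fake_exit(&e);
			in_net_epoch = false;
		}
	}
	return (e.longest_run);
}
```

With an array of all-true flags, longest_section_old() returns the length of the array, since the section is never left while net tasks keep arriving. longest_section_new() returns 1 for the same input. In the real kernel the queue under heavy receive traffic plays the role of the all-true array, which would be consistent with the drain on unload never completing before the patch.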