Date: Tue, 31 Mar 2020 15:20:24 -0400
From: Mark Johnston
To: Eric Joyner
Cc: freebsd-net@freebsd.org, Hans Petter Selasky, John Baldwin, shurd, Drew Gallatin, Gleb Smirnoff
Subject: Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]
Message-ID: <20200331192024.GE97238@raichu>
References: <20200130030911.GA15281@spy> <20200212222219.GE83892@raichu> <20200328225150.GA82767@raichu>
List-Id: Networking and TCP/IP with FreeBSD

On Tue, Mar 31, 2020 at 12:14:20PM -0700, Eric Joyner wrote:
> Mark,
>
> I tried out a kernel with the tip of CURRENT with both D24214 and D24215
> applied, and I still see the problem.
> As well, after doing a "sysctl debug.kdb.enter=1" and viewing the stack
> trace there for kldunload, it appears to be similar to the one I posted
> in my last post.

Can you show it?  I don't see how it could be the same, since with the
patch we are no longer calling sched_bind() from the epoch scan callback.

>
> - Eric
>
> On Mon, Mar 30, 2020 at 1:19 PM Eric Joyner wrote:
>
> > On Sat, Mar 28, 2020 at 3:52 PM Mark Johnston wrote:
> >
> >> On Wed, Mar 11, 2020 at 04:32:40PM -0700, Eric Joyner wrote:
> >> > Mark,
> >> >
> >> > I did get some time to get back and retry this; however your second patch
> >> > still doesn't solve the problem. Looking into it a bit, it looks like the
> >> > kldunload process isn't hitting the code you've changed; it's hanging in
> >> > epoch_wait_preempt() in if_detach_internal(), which is immediately before
> >> > epoch_drain_callbacks().
> >> >
> >> > I did a kernel dump while it was hanging, and this is the backtrace for the
> >> > kldunload process:
> >>
> >> I see.  I think the callback can be made much simpler and avoid the
> >> problematic sched_bind() calls.  I wrote a patch that allows waiting
> >> threads to lend scheduling priority to a preempted thread blocked in an
> >> epoch section, based on some code I wrote to implement preemptible SMR
> >> sections.  If waiting for a running thread, the callback just spins.
> >>
> >> This might be enough to solve your problem, I posted the two lightly
> >> tested patches here:
> >> https://reviews.freebsd.org/D24214
> >> https://reviews.freebsd.org/D24215
> >>
> >> If we hit a situation where a reader is preempted and then its CPU is
> >> hogged by a high-priority kernel thread, this still won't be enough, but
> >> I suspect it'll solve your case.  Would you be able to test?
> >>
> >
> > Yeah, I'll try them out.
> >
> > - Eric
> >
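
For readers following the thread, the wait policy described in the quoted
message of Mar 28 (spin while the reader is running, lend priority when the
reader has been preempted) can be modelled roughly as below. This is a
userspace sketch only, not the code in D24214 or D24215: the names
reader_state, struct reader, wait_action and wait_policy are hypothetical,
and the convention that a lower number means a higher priority is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader states; not the kernel's actual representation. */
enum reader_state { READER_RUNNING, READER_PREEMPTED };

/* Hypothetical record for a thread still inside an epoch section. */
struct reader {
	enum reader_state state;
	int prio;	/* assumption: lower value means higher priority */
};

/* What the waiter decides to do about one outstanding reader. */
enum wait_action { WAIT_SPIN, WAIT_LEND_PRIO };

/*
 * Model of the policy described in the thread: if the reader is on a
 * CPU, the waiter just spins; if the reader was preempted, the waiter
 * lends its own priority so the reader can be scheduled and leave the
 * section.  The reader's priority is only ever raised, never lowered.
 */
static enum wait_action
wait_policy(struct reader *r, int waiter_prio)
{
	assert(r != NULL);

	if (r->state == READER_RUNNING)
		return (WAIT_SPIN);

	if (waiter_prio < r->prio)
		r->prio = waiter_prio;
	return (WAIT_LEND_PRIO);
}
```

In this model, a preempted reader at priority 120 waited on by a thread at
priority 80 ends up at 80, while a running reader is left untouched. It does
not cover the case mentioned in the same message, where the preempted
reader's CPU is hogged by a high-priority kernel thread.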