From: Hans Petter Selasky
Date: Wed, 29 Jan 2020 22:41:29 +0100
To: Eric Joyner, freebsd-net@freebsd.org
Subject: Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]
Message-ID: <0e2e97f2-df75-3c6f-9bdd-e8c2ab7bf79e@selasky.org>

On 2020-01-29 22:30, Eric Joyner wrote:
> Hi freebsd-net,
>
> We've encountered an issue with unloading the iavf(4) driver on FreeBSD
> 12.1 (and stable). On a VM with two iavf(4) interfaces, if we send heavy
> traffic to iavf1 and try to kldunload the driver, the kldunload process
> hangs on iavf0 until iavf1 stops receiving traffic.
>
> After some debugging, it looks like epoch_drain_callbacks() [via
> if_detach_internal()] tries to switch CPUs to run on one that iavf1 is
> using for RX processing, but since iavf1 is busy, it can't make the switch,
> so cpu_switch() just hangs and nothing happens until iavf1's RX thread
> stops being busy.
>
> I can work around this by inserting a kern_yield(PRI_USER) somewhere in one
> of the iavf txrx functions that iflib calls into (e.g.
> iavf_isc_rxd_available), but that's not a proper fix. Does anyone know what
> to do to prevent this from happening?
>
> Wildly guessing, does maybe epoch_drain_callbacks() need a higher priority
> than the PI_SOFT used in the group taskqueues used in iflib's RX processing?
>

Hi,

Which scheduler is this? ULE or BSD?

EPOCH(9) expects some level of round-robin scheduling on the same
priority level. Setting a higher priority on EPOCH(9) might cause epoch
to start spinning w/o letting the lower priority thread which holds the
EPOCH() section to finish.

--HPS
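
To make the priority argument above concrete, here is a small user-space toy model in C. It is only a sketch under loud assumptions: one CPU, two threads, strict-priority scheduling with round-robin between equal priorities, and a smaller number meaning a higher priority. The function name toy_drain_ticks() and all of its parameters are hypothetical; none of this is FreeBSD scheduler or epoch(9) code.

```c
#include <assert.h>

/*
 * Toy model, NOT kernel code. A "reader" thread needs `work` ticks of CPU
 * time to leave its section. A "waiter" thread spins until the reader is
 * done. Smaller priority value = higher priority (assumption of this model).
 *
 * Returns the tick at which the waiter observes the reader has finished,
 * or -1 if the waiter is still spinning after max_ticks.
 */
int
toy_drain_ticks(int reader_prio, int waiter_prio, int work, int max_ticks)
{
	int remaining = work;      /* reader ticks still needed */
	int last_ran_reader = 0;   /* round-robin state for equal priorities */

	assert(work > 0 && max_ticks > 0);

	for (int tick = 1; tick <= max_ticks; tick++) {
		int run_reader;

		if (remaining == 0)
			run_reader = 0;  /* reader finished, not runnable */
		else if (reader_prio != waiter_prio)
			run_reader = reader_prio < waiter_prio; /* strict priority */
		else
			run_reader = !last_ran_reader;          /* take turns */

		if (run_reader)
			remaining--;     /* reader makes progress */
		else if (remaining == 0)
			return (tick);   /* waiter runs and sees the section done */
		/* otherwise the waiter burned this tick spinning */

		last_ran_reader = run_reader;
	}
	return (-1);
}
```

In this model, equal priorities let both threads take turns, so the waiter finishes after the reader has had its ticks. Giving the waiter the higher priority makes it spin forever, because the reader never gets the CPU. That is the failure mode the reply warns about. A very large `work` value with the reader at higher priority loosely resembles the busy RX thread in the original report.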