Subject: Re: Issue with epoch_drain_callbacks and unloading iavf(4) [using iflib]
From: Hans Petter Selasky
To: Eric Joyner
Cc: freebsd-net@freebsd.org
Date: Thu, 30 Jan 2020 02:12:05 +0100

On 2020-01-29 22:44, Eric Joyner wrote:
> On Wed, Jan 29, 2020 at 1:41 PM Hans Petter Selasky wrote:
>
>> On 2020-01-29 22:30, Eric Joyner wrote:
>>> Hi freebsd-net,
>>>
>>> We've encountered an issue with unloading the iavf(4) driver on FreeBSD
>>> 12.1 (and stable). On a VM with two iavf(4) interfaces, if we send heavy
>>> traffic to iavf1 and try to kldunload the driver, the kldunload process
>>> hangs on iavf0 until iavf1 stops receiving traffic.
>>>
>>> After some debugging, it looks like epoch_drain_callbacks() [via
>>> if_detach_internal()] tries to switch CPUs to run on one that iavf1 is
>>> using for RX processing, but since iavf1 is busy, it can't make the switch,
>>> so cpu_switch() just hangs and nothing happens until iavf1's RX thread
>>> stops being busy.
>>>
>>> I can work around this by inserting a kern_yield(PRI_USER) somewhere in one
>>> of the iavf txrx functions that iflib calls into (e.g.
>>> iavf_isc_rxd_available), but that's not a proper fix. Does anyone know what
>>> to do to prevent this from happening?
>>>
>>> Wildly guessing, does maybe epoch_drain_callbacks() need a higher priority
>>> than the PI_SOFT used in the group taskqueues used in iflib's RX processing?
>>>
>>
>> Hi,
>>
>> Which scheduler is this? ULE or BSD?
>>
>> EPOCH(9) expects some level of round-robin scheduling on the same
>> priority level. Setting a higher priority on EPOCH(9) might cause epoch
>> to start spinning w/o letting the lower priority thread which holds the
>> EPOCH() section to finish.
>>
>> --HPS
>>
>>
> Hi Hans,
>
> kern.sched.name gives me "ULE"
>

Hi Eric,

epoch_drain_callbacks() depends on epoch_call_task() getting to run, and
epoch_call_task() is executed from a GTASKQUEUE at PI_SOFT. Also,
epoch_drain_callbacks() runs at the priority of the calling thread; if that
priority is lower than PI_SOFT and a gtaskqueue is spinning heavily, then
the drain won't make progress. On a single-CPU system you will be toast in
this situation regardless, if there is no free CPU time for EPOCH().
In general, if epoch_call_task() doesn't get execution time, you will have a
problem. Maybe add a flag to iflib which stops the grouptasks before
detaching the network interface?

--HPS
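
The starvation argument above can be illustrated with a toy userland model.
This is only a sketch under stated assumptions: one CPU, strict fixed
priorities, round-robin among equal priorities, and an RX task that never
goes idle. It is not ULE and it calls no kernel API. All toy_* names and the
numeric priorities are hypothetical; the only fact borrowed from FreeBSD is
that a lower priority number means a more urgent thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy thread: not a kernel structure. */
struct toy_thread {
	int prio;	/* lower value = more urgent, as in FreeBSD */
	int work_left;	/* ticks still needed; negative = never idle */
	int ran;	/* ticks actually received */
};

/*
 * Pick the runnable thread with the lowest prio value. The scan starts
 * after the thread that ran last, so equal priorities rotate round-robin.
 * Returns -1 when nothing is runnable.
 */
static int
toy_pick(const struct toy_thread *t, size_t n, size_t last)
{
	int best = -1;

	for (size_t i = 1; i <= n; i++) {
		size_t idx = (last + i) % n;

		if (t[idx].work_left == 0)
			continue;	/* finished, not runnable */
		if (best < 0 || t[idx].prio < t[best].prio)
			best = (int)idx;
	}
	return (best);
}

/* Run the single toy CPU for a fixed number of ticks. */
static void
toy_run(struct toy_thread *t, size_t n, int ticks)
{
	size_t last;

	assert(n > 0);
	last = n - 1;
	for (int tick = 0; tick < ticks; tick++) {
		int cur = toy_pick(t, n, last);

		if (cur < 0)
			break;
		t[cur].ran++;
		if (t[cur].work_left > 0)
			t[cur].work_left--;
		last = (size_t)cur;
	}
}

/*
 * An always-busy "RX" task shares the CPU with a "drain" thread that
 * needs a single tick. Returns how many ticks the drain thread got.
 */
static int
toy_drain_progress(int rx_prio, int drain_prio, int ticks)
{
	struct toy_thread t[2] = {
		{ rx_prio, -1, 0 },	/* RX task: spins forever */
		{ drain_prio, 1, 0 },	/* drain: one tick of work */
	};

	toy_run(t, 2, ticks);
	return (t[1].ran);
}
```

In this model toy_drain_progress(10, 20, 1000) returns 0: the drain thread is
less urgent than the spinning task and never gets its tick. With equal
priorities, toy_drain_progress(10, 10, 1000) returns 1, because round-robin
hands the drain thread the CPU once. Stopping the spinning task before the
drain, as suggested above, would remove the competition altogether.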