From owner-cvs-src@FreeBSD.ORG Fri Dec 8 03:14:26 2006 Return-Path: X-Original-To: cvs-src@FreeBSD.org Delivered-To: cvs-src@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3F8AF16A403; Fri, 8 Dec 2006 03:14:26 +0000 (UTC) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214]) by mx1.FreeBSD.org (Postfix) with ESMTP id 997A143CAB; Fri, 8 Dec 2006 03:13:30 +0000 (GMT) (envelope-from marius@alchemy.franken.de) Received: from alchemy.franken.de (localhost [127.0.0.1]) by alchemy.franken.de (8.13.8/8.13.8/ALCHEMY.FRANKEN.DE) with ESMTP id kB83ENf5012491; Fri, 8 Dec 2006 04:14:24 +0100 (CET) (envelope-from marius@alchemy.franken.de) Received: (from marius@localhost) by alchemy.franken.de (8.13.8/8.13.8/Submit) id kB83ENnd012490; Fri, 8 Dec 2006 04:14:23 +0100 (CET) (envelope-from marius) Date: Fri, 8 Dec 2006 04:14:23 +0100 From: Marius Strobl To: Bruce Evans Message-ID: <20061208031423.GC86517@alchemy.franken.de> References: <200612060218.kB62IfVn046324@repoman.freebsd.org> <20061206164242.A32496@delplex.bde.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061206164242.A32496@delplex.bde.org> User-Agent: Mutt/1.4.2.2i Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/pci if_xl.c if_xlreg.h X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Dec 2006 03:14:26 -0000 On Wed, Dec 06, 2006 at 06:01:48PM +1100, Bruce Evans wrote: > On Wed, 6 Dec 2006, Marius Strobl wrote: > > >marius 2006-12-06 02:18:41 UTC > > > > FreeBSD src repository > > > > Modified files: > > sys/pci if_xl.c if_xlreg.h > > Log: > > - Use the xl_stats_update() callout instead of if_slowtimo() for > > driving xl_watchdog() in order to avoid races accessing if_timer. > > It's a shame to force all NIC drivers to manage the timeout for this. > Most have a timeout for other purposes so I couldn't see how to save > much code using a callback, but a callback would be cleaner. (To avoid > the race, just move the decrement of the count to drivers.) If I understand you right, you would either break the ABI by adding another pointer for the callback to struct ifnet or the API by moving decrementing the count from if_slowtimo() to the NIC drivers. I think that the potential slight inrease in cleanliness doesn't outweigh the cost, not even for -CURRENT, let alone that you're not allowed to MFC it. > > > While at it relax the watchdog a bit by reloading it in xl_txeof()/ > > xl_txeof_90xB() if there are still packets enqueued. > > Er, xl_txeof_90xB() was one of the 20-30% of txeof() routines that > handled this correctly. The timeout shouldn't be touched in xx_txeof() > except to clear it when all descriptors have been handled. > > Reloading it without even checking that at least one descriptor has > been handled is almost as bad as clearing it without checking, since > xx_txeof() may be called when there is nothing for it to do. This is > most broken in the DEVICE_POLLING case. Then xx_txeof() will be called > every few mS, much more often than the counter is decrememented, so > unconditionally reloading it ensures that it never works. > > Reloading it only if some progress has been made is mainly a small > pessimization. If the initial timeout is not long enough for all the > descriptors, then something is wrong and it is probably better to let > the watchdog try to fix it than to risk packets trickling out at a > rate of 1 every ( - epsilon) seconds, especially since it > takes more code to implement the trickling. However, the trickling > case is very unlikely to occur and most or all xx_start() routines > don't worry about it -- they just reload the full timeout and don't > waste time calculating a timeout based on the number of new and old > descriptors. You're right, I missed the DEVICE_POLLING case, in which xl_txeof()/xl_txeof_90xB() is called unconditionally, effectively circumventing that the watchdog ever fires with my change in place. Thanks for pointing this out! I still think that for the interrupt driven case the change is fine though, as xl_txeof()/xl_txeof_90xB() then is only called if the chip signals XL_STAT_DOWN_COMPLETE (similar to what most NIC drivers do) which in fact is a sign of chip actvity. Generally I think that if a NIC shows TX activity (this also includes f.e. signaling TX collision) that should contribute to not letting the watchdog fire as it typically means resetting the chip. The basic idea probably is dynamic adaption of the watchdog period without the need to calculate it upfront based on new and old descriptors, link type etc. With the count variable(s) in the softc one could also easily implement detection of trickling. What I don't get is why all the DEVICE_POLLING-enabled drivers I looked at call xx_rxeof() and xx_txeof() unconditionally, i.e. without checking the RX/TX stats/ interrupt cause bits (provided that the NIC updates them when interrupts are disabled, i.e. if told to not deliver interrupts to the bus), unlike they do in the interrupt driven case. Somewhat looks like this was copied&pasted from sis(4), potentially without reason for other NICs. Could one of the polling(4) gurus please explain why most if not all xx_poll()/xx_poll_locked() do that? On Wed, Dec 06, 2006 at 06:18:47PM +1100, Bruce Evans wrote: > > Reference for a previous fix: if_rl.c 1.134. This actually reloads > the timeout to 5 if it is 0 AND there is an unhandled tx descriptor. > I think this case shouldn't happen (the watchdog should have handled > it), and if it does it should be handled by resetting. Maybe it only > happened due to the race decrementing if_timer. I don't see how this can happen either, so I'll just revert the parts of if_xl.c 1.207 that reload the timer in xl_txeof()/xl_txeof_90xB(), at least for now. Marius