From owner-cvs-src@FreeBSD.ORG  Fri Dec  8 03:14:26 2006
Return-Path: <owner-cvs-src@FreeBSD.ORG>
X-Original-To: cvs-src@FreeBSD.org
Delivered-To: cvs-src@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 3F8AF16A403;
	Fri,  8 Dec 2006 03:14:26 +0000 (UTC)
	(envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 997A143CAB;
	Fri,  8 Dec 2006 03:13:30 +0000 (GMT)
	(envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (localhost [127.0.0.1])
	by alchemy.franken.de (8.13.8/8.13.8/ALCHEMY.FRANKEN.DE) with ESMTP id
	kB83ENf5012491; Fri, 8 Dec 2006 04:14:24 +0100 (CET)
	(envelope-from marius@alchemy.franken.de)
Received: (from marius@localhost)
	by alchemy.franken.de (8.13.8/8.13.8/Submit) id kB83ENnd012490;
	Fri, 8 Dec 2006 04:14:23 +0100 (CET) (envelope-from marius)
Date: Fri, 8 Dec 2006 04:14:23 +0100
From: Marius Strobl <marius@alchemy.franken.de>
To: Bruce Evans <bde@zeta.org.au>
Message-ID: <20061208031423.GC86517@alchemy.franken.de>
References: <200612060218.kB62IfVn046324@repoman.freebsd.org>
	<20061206164242.A32496@delplex.bde.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20061206164242.A32496@delplex.bde.org>
User-Agent: Mutt/1.4.2.2i
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/pci if_xl.c if_xlreg.h
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 08 Dec 2006 03:14:26 -0000

On Wed, Dec 06, 2006 at 06:01:48PM +1100, Bruce Evans wrote:
> On Wed, 6 Dec 2006, Marius Strobl wrote:
> 
> >marius      2006-12-06 02:18:41 UTC
> >
> > FreeBSD src repository
> >
> > Modified files:
> >   sys/pci              if_xl.c if_xlreg.h
> > Log:
> > - Use the xl_stats_update() callout instead of if_slowtimo() for
> >   driving xl_watchdog() in order to avoid races accessing if_timer.
> 
> It's a shame to force all NIC drivers to manage the timeout for this.
> Most have a timeout for other purposes so I couldn't see how to save
> much code using a callback, but a callback would be cleaner.  (To avoid
> the race, just move the decrement of the count to drivers.)

If I understand you right, you would either break the ABI by
adding another pointer for the callback to struct ifnet or
the API by moving decrementing the count from if_slowtimo()
to the NIC drivers. I think that the potential slight inrease
in cleanliness doesn't outweigh the cost, not even for -CURRENT,
let alone that you're not allowed to MFC it.

> 
> >   While at it relax the watchdog a bit by reloading it in xl_txeof()/
> >   xl_txeof_90xB() if there are still packets enqueued.
> 
> Er, xl_txeof_90xB() was one of the 20-30% of txeof() routines that
> handled this correctly.  The timeout shouldn't be touched in xx_txeof()
> except to clear it when all descriptors have been handled.
> 
> Reloading it without even checking that at least one descriptor has
> been handled is almost as bad as clearing it without checking, since
> xx_txeof() may be called when there is nothing for it to do.  This is
> most broken in the DEVICE_POLLING case.  Then xx_txeof() will be called
> every few mS, much more often than the counter is decrememented, so
> unconditionally reloading it ensures that it never works.
> 
> Reloading it only if some progress has been made is mainly a small
> pessimization.  If the initial timeout is not long enough for all the
> descriptors, then something is wrong and it is probably better to let
> the watchdog try to fix it than to risk packets trickling out at a
> rate of 1 every (<timeout> - epsilon) seconds, especially since it
> takes more code to implement the trickling.  However, the trickling
> case is very unlikely to occur and most or all xx_start() routines
> don't worry about it -- they just reload the full timeout and don't
> waste time calculating a timeout based on the number of new and old
> descriptors.

You're right, I missed the DEVICE_POLLING case, in which
xl_txeof()/xl_txeof_90xB() is called unconditionally,
effectively circumventing that the watchdog ever fires
with my change in place. Thanks for pointing this out!
I still think that for the interrupt driven case the
change is fine though, as xl_txeof()/xl_txeof_90xB() then
is only called if the chip signals XL_STAT_DOWN_COMPLETE
(similar to what most NIC drivers do) which in fact is
a sign of chip actvity. Generally I think that if a NIC
shows TX activity (this also includes f.e. signaling
TX collision) that should contribute to not letting
the watchdog fire as it typically means resetting the
chip. The basic idea probably is dynamic adaption of the
watchdog period without the need to calculate it upfront
based on new and old descriptors, link type etc. With the
count variable(s) in the softc one could also easily
implement detection of trickling.
What I don't get is why all the DEVICE_POLLING-enabled
drivers I looked at call xx_rxeof() and xx_txeof()
unconditionally, i.e. without checking the RX/TX stats/
interrupt cause bits (provided that the NIC updates
them when interrupts are disabled, i.e. if told to not
deliver interrupts to the bus), unlike they do in the
interrupt driven case. Somewhat looks like this was
copied&pasted from sis(4), potentially without reason
for other NICs. Could one of the polling(4) gurus please
explain why most if not all xx_poll()/xx_poll_locked()
do that?


On Wed, Dec 06, 2006 at 06:18:47PM +1100, Bruce Evans wrote:
>
> Reference for a previous fix: if_rl.c 1.134.  This actually reloads
> the timeout to 5 if it is 0 AND there is an unhandled tx descriptor.
> I think this case shouldn't happen (the watchdog should have handled
> it), and if it does it should be handled by resetting.  Maybe it only
> happened due to the race decrementing if_timer.

I don't see how this can happen either, so I'll just
revert the parts of if_xl.c 1.207 that reload the timer
in xl_txeof()/xl_txeof_90xB(), at least for now.

Marius