From: Mike Silbersack
To: Luigi Rizzo
Cc: net@freebsd.org
Date: Fri, 28 Jun 2002 02:35:20 -0500 (CDT)
Subject: Re: interface stalling on tx ?
In-Reply-To: <20020627230348.A54937@iguana.icir.org>
Message-ID: <20020628022611.K70821-100000@patrocles.silby.com>

On Thu, 27 Jun 2002, Luigi Rizzo wrote:

> I thought that upon transmission the driver somehow registered a
> timeout to take care of these events, but maybe I am wrong ?
>
> Have other people seen this problem too ?
>
> cheers
> luigi

The watchdog timer code in most of the drivers is rather conservative,
and may not detect mid-transfer stalls.  I'll use the dc driver as an
example:

In dc_start, if_timer = 5 is set.  Then, in dc_txeof, if_timer = 0,
disabling the watchdog timer.  This means that after a _single_ frame is
sent, any subsequent stall will not be recovered from by the watchdog.

In the vr driver, we were having problems where such stalls could be
caused by high load, and the ifconfig up / down process was getting
annoying to users.  I worked around this by setting if_timer = 5 every
time vr_txeof was entered, only setting if_timer = 0 at the point when
the _entire_ transmit buffer list was emptied.  (See if_vr.c rev 1.49 to
see how I did it in that driver.)

You should be able to do something similar in all of the drivers, and I
have indeed thought of doing so.  Could you code up and test such a
patch for whatever card you are using in your test environment to see if
it is a successful workaround?

Of course, in an ideal world all drivers would recover in a graceful
fashion.  However, taking advantage of the watchdog timer to reset stuck
cards seems like an adequate workaround.  So far, I can't see any
downside to this approach.  If the card never locks up, then the change
is superfluous.  When it does, the change is a lifesaver.

Apologies if parts of this message sound like babbling; I should be
sleeping at this moment in time. :)

Mike "Silby" Silbersack