From: John Baldwin
To: Matthew Dillon
Cc: "Bjoern A. Zeeb", FreeBSD current mailing list
Subject: Re: nve locking fixes round 2
Date: Mon, 28 Nov 2005 12:03:59 -0500
Message-Id: <200511281204.01545.jhb@freebsd.org>
In-Reply-To: <200511251609.jAPG9TRQ059536@apollo.backplane.com>
References: <200511231406.06282.jhb@freebsd.org> <20051125142713.M23990@maildrop.int.zabbadoz.net> <200511251609.jAPG9TRQ059536@apollo.backplane.com>

On Friday 25 November 2005 11:09 am, Matthew Dillon wrote:
> :...
> :
> :>    The reason I set sc->pending_txs to 0 in DFly after the reinit is
> :>    because when a watchdog timeout occurs and you reset the device,
> :>    *ALL* mbufs still sitting in the transmit ring are lost.  They will
> :>    never be acknowledged, ever.  So pending_txs will never drop back to
> :>    0 on its own.  This is what led to continuous watchdog timeout reports
> :>    when, in fact, only one timeout actually occurred.
> :
> :the problem is that with some versions of the hardware you are not
> :even able to get the first packet out.
> :
> :--
> :Bjoern A. Zeeb                          bzeeb at Zabbadoz dot NeT
>
>     I'm not sure if it's the same as what happened to me, but I believe
>     I have observed this as well.  But at least in my case it turned out
>     to be a bug in (if_nv.c for DFly) that issued ABI calls before
>     resetting the hardware.  I think it had something to do with nv_stop()
>     being called before the initial hardware reset and nv_stop() then making an
>     ABI call or two that expected the hardware to already be in a sane state
>     (when it wasn't).  You'd have to look at the DFly commit to see for sure.

Yes, I have that patch in my tree, though I'm not sure it is in the patch I
posted.  I'll update the patch to include that.  Actually, try the first hunk
in the patch at http://www.FreeBSD.org/~jhb/patches/nve_dffixes.patch; it is
the change Matt is referring to.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
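
The quoted point about pending_txs can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the nve or if_nv.c code: apart from the counter name pending_txs, every identifier below (sketch_softc, sketch_start, sketch_reinit, sketch_watchdog, sketch_count_reports) is hypothetical, and the "hardware" is simulated by simply never acknowledging the queued packets.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a driver softc; only pending_txs comes from the thread. */
struct sketch_softc {
    int pending_txs;       /* transmits queued but not yet acknowledged */
    int watchdog_reports;  /* how many timeouts the watchdog has reported */
};

/* Queue one packet for transmit (hypothetical helper). */
void sketch_start(struct sketch_softc *sc)
{
    sc->pending_txs++;
}

/*
 * Hypothetical device reset.  In this model the reset discards everything
 * in the transmit ring, so those packets will never be acknowledged.  The
 * software counter is deliberately left untouched here.
 */
void sketch_reinit(struct sketch_softc *sc)
{
    (void)sc;
}

/*
 * One watchdog tick.  If transmits are still outstanding, report a timeout
 * and reset the device.  With clear_pending set, the counter is zeroed
 * after the reset, which is the behaviour described in the quoted text.
 */
bool sketch_watchdog(struct sketch_softc *sc, bool clear_pending)
{
    if (sc->pending_txs == 0)
        return false;          /* nothing outstanding, no timeout */
    sc->watchdog_reports++;
    sketch_reinit(sc);
    if (clear_pending)
        sc->pending_txs = 0;   /* lost packets will never complete */
    return true;
}

/* Queue three packets that are never acknowledged, then run the watchdog. */
int sketch_count_reports(bool clear_pending, int ticks)
{
    struct sketch_softc sc = { 0, 0 };
    for (int i = 0; i < 3; i++)
        sketch_start(&sc);
    for (int i = 0; i < ticks; i++)
        sketch_watchdog(&sc, clear_pending);
    return sc.watchdog_reports;
}
```

In this model, calling sketch_count_reports(false, 5) reports a timeout on every tick because the counter never drains, while sketch_count_reports(true, 5) reports a single timeout. That matches the "continuous watchdog timeout reports" symptom described above, under the assumption that nothing else decrements the counter after a reset.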