From owner-freebsd-current@FreeBSD.ORG Fri Nov 25 16:09:56 2005
Date: Fri, 25 Nov 2005 08:09:29 -0800 (PST)
From: Matthew Dillon
Message-Id: <200511251609.jAPG9TRQ059536@apollo.backplane.com>
To: "Bjoern A. Zeeb"
References: <200511231406.06282.jhb@freebsd.org> <200511242329.jAONTEkr055790@apollo.backplane.com> <20051125142713.M23990@maildrop.int.zabbadoz.net>
Cc: FreeBSD current mailing list
Subject: Re: nve locking fixes round 2
List-Id: Discussions about the use of FreeBSD-current

:...
:>
:>    The reason I set sc->pending_txs to 0 in DFly after the reinit is
:>    because when a watchdog timeout occurs and you reset the device,
:>    *ALL* mbufs still sitting in the transmit ring are lost.  They will
:>    never be acknowledged, ever.  So pending_txs will never drop back to 0 on
:>    its own.  This is what led to continuous watchdog timeout reports
:>    when, in fact, only one timeout actually occurred.
:
:the problem is that with some versions of the hardware you are not
:even able to get the first packet out.
:
:--
:Bjoern A. Zeeb				bzeeb at Zabbadoz dot NeT

    I'm not sure if it's the same as what happened to me, but I believe
    I have observed this as well.  But at least in my case it turned out
    to be a bug in if_nv.c (for DFly) that issued ABI calls before
    resetting the hardware.  I think it had something to do with nv_stop()
    being called before the initial hardware reset, and nv_stop() then
    making an ABI call or two that expected the hardware to already be
    in a sane state (when it wasn't).  You'd have to look at the DFly
    commit to see for sure.

					-Matt
					Matthew Dillon
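
    To make the quoted pending_txs argument concrete, here is a minimal
    C sketch of the failure mode.  It is not the actual nve or if_nv.c
    code: every name in it (sim_softc, sim_start, sim_reinit,
    sim_watchdog, sim_count_timeouts) is hypothetical, the hardware is
    reduced to two counters, and only pending_txs is taken from the
    thread.  It assumes the watchdog reports a timeout whenever
    pending_txs is non-zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a driver softc (assumption,
 * not the real nve/nv structure). */
struct sim_softc {
    int pending_txs;  /* packets handed to the hardware, not yet acked */
    int ring_count;   /* descriptors sitting in the simulated TX ring  */
};

/* Queue one packet for transmit. */
static void sim_start(struct sim_softc *sc)
{
    sc->ring_count++;
    sc->pending_txs++;
}

/* Resetting the device discards everything in the ring, so no
 * completion will ever arrive for those packets. */
static void sim_reinit(struct sim_softc *sc)
{
    sc->ring_count = 0;
}

/* One watchdog tick.  Returns true if a timeout was reported.
 * clear_pending selects whether pending_txs is zeroed after reinit. */
static bool sim_watchdog(struct sim_softc *sc, bool clear_pending)
{
    if (sc->pending_txs == 0)
        return false;            /* nothing outstanding, no timeout */
    sim_reinit(sc);
    if (clear_pending)
        sc->pending_txs = 0;     /* lost packets will never be acked */
    return true;
}

/* Stall three packets once, then run `ticks` watchdog ticks and
 * count how many of them report a timeout. */
static int sim_count_timeouts(bool clear_pending, int ticks)
{
    struct sim_softc sc = { 0, 0 };
    int reports = 0;

    assert(ticks >= 0);
    for (int i = 0; i < 3; i++)
        sim_start(&sc);
    for (int i = 0; i < ticks; i++) {
        if (sim_watchdog(&sc, clear_pending))
            reports++;
    }
    return reports;
}
```

    Under these assumptions, sim_count_timeouts(true, 5) reports a
    single timeout for the single stall, while
    sim_count_timeouts(false, 5) reports one on every tick, because
    nothing is left in the ring to bring the counter back down.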