From: Landon Fuller
Date: Fri, 11 Aug 2006 09:47:54 -0700
To: Jack Vogel
Cc: Daniel Ryslink, freebsd-net@freebsd.org
Subject: Re: Problems with em interfaces on FreeBSD 6.1

On Aug 11, 2006, at 09:22, Jack Vogel wrote:

> On 8/11/06, Gleb Smirnoff wrote:
>> Daniel,
>>
>> On Fri, Aug 11, 2006 at 01:42:32PM +0200, Daniel Ryslink wrote:
>> D> We have started to use the em driver only recently, after the upgrade to
>> D> gigabit connectivity (100 MBit NICs from Intel used the fxp driver).
>> D>
>> D> As for the frequency of the incidents, here is a grep of the messages:
>> D>
>> D> ~~~~~~~~~~~~ ~~~~~~~~~~~~~~
>> D> Aug 4 22:35:23 b2 kernel: em0: watchdog timeout -- resetting
>> D> Aug 5 00:09:20 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 5 06:08:59 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 6 12:38:16 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 6 20:39:47 b2 kernel: em0: watchdog timeout -- resetting
>> D> Aug 7 18:37:29 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 8 07:27:48 b2 kernel: em0: watchdog timeout -- resetting
>> D> Aug 8 09:38:17 b2 kernel: em0: watchdog timeout -- resetting
>> D> Aug 8 12:54:54 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 8 22:41:17 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 9 05:17:24 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 9 10:56:10 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 9 20:10:06 b2 kernel: em1: watchdog timeout -- resetting
>> D> Aug 11 08:41:44 b2 kernel: em0: watchdog timeout -- resetting
>> D> Aug 11 10:35:43 b2 kernel: em0: watchdog timeout -- resetting
>> D> ~~~~~~~~~~~~ ~~~~~~~~~~~~~~
>> D>
>> D> The driver used is version 3.2.18 (I wanted to use the Intel 6.1.4 as a
>> D> module, but I have found out that I made a mistake and accidentally loaded
>> D> the old 3.2.18 driver).
>> D>
>> D> I have a dilemma now - which new driver to try? The 6.0.5 submitted to the
>> D> current FreeBSD 6.1 branch (modified by you, I believe, on 8th August), or
>> D> the newest driver from Intel 6.1.4? Do you think one of these drivers
>> D> could solve my problems?
>>
>> I'm not sure whether a new driver will solve your problems. You should give
>> a try to 6.1-STABLE which has 6.0.5 in it. The difference between 6.1.4 and
>> 6.0.5 is quite small, I doubt that 6.1.4 is worth a try in your case.
>
> Gleb is right, the differences between my 6.0.5 and 6.1.4 driver are minor
> and don't seem to have anything to do with your problem.
>
> I am happy Gleb got my code merged with tip of STABLE and would take
> that driver code if I were you, it will become 6.2 before long :)
>
> Watchdogs happen because of transmit cleanup failing, your instances
> are pretty widely separated, it looks like some external network problem
> perhaps?

We saw this issue here on SMP systems running 6.1; I've been meaning to
set up a reproduction case in the lab and dig into the issue further.

Disabling the mpsafe network stack (debug.mpsafenet=0) is our temporary
work-around; rwatson mentioned that this has the effect of forcing the
interrupt handler for if_em to not run in parallel with the transmit
code, which is likely what caused the problem to disappear.

-landonf
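
The remark that the resets are widely separated can be checked against the quoted log. What follows is only a rough sketch in Python, not something posted in the thread: it assumes the syslog lines are exactly as quoted, that all of them are from 2006 (syslog does not record the year), and the names `parse_watchdog_events` and `gap_hours` are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Watchdog lines copied from the quoted grep output.
LOG = """\
Aug 4 22:35:23 b2 kernel: em0: watchdog timeout -- resetting
Aug 5 00:09:20 b2 kernel: em1: watchdog timeout -- resetting
Aug 5 06:08:59 b2 kernel: em1: watchdog timeout -- resetting
Aug 6 12:38:16 b2 kernel: em1: watchdog timeout -- resetting
Aug 6 20:39:47 b2 kernel: em0: watchdog timeout -- resetting
Aug 7 18:37:29 b2 kernel: em1: watchdog timeout -- resetting
Aug 8 07:27:48 b2 kernel: em0: watchdog timeout -- resetting
Aug 8 09:38:17 b2 kernel: em0: watchdog timeout -- resetting
Aug 8 12:54:54 b2 kernel: em1: watchdog timeout -- resetting
Aug 8 22:41:17 b2 kernel: em1: watchdog timeout -- resetting
Aug 9 05:17:24 b2 kernel: em1: watchdog timeout -- resetting
Aug 9 10:56:10 b2 kernel: em1: watchdog timeout -- resetting
Aug 9 20:10:06 b2 kernel: em1: watchdog timeout -- resetting
Aug 11 08:41:44 b2 kernel: em0: watchdog timeout -- resetting
Aug 11 10:35:43 b2 kernel: em0: watchdog timeout -- resetting
"""

# Explicit month table so parsing does not depend on the system locale.
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<clock>\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>em\d+): watchdog timeout"
)


def parse_watchdog_events(log, year=2006):
    """Group watchdog timestamps by interface name (year is an assumption)."""
    events = defaultdict(list)
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a watchdog line
        hour, minute, second = (int(p) for p in match.group("clock").split(":"))
        stamp = datetime(year, MONTHS[match.group("mon")],
                         int(match.group("day")), hour, minute, second)
        events[match.group("ifname")].append(stamp)
    return events


def gap_hours(stamps):
    """Return the gaps, in hours, between consecutive timestamps."""
    ordered = sorted(stamps)
    return [(later - earlier).total_seconds() / 3600.0
            for earlier, later in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    for ifname, stamps in sorted(parse_watchdog_events(LOG).items()):
        gaps = gap_hours(stamps)
        print(f"{ifname}: {len(stamps)} resets, "
              f"shortest gap {min(gaps):.1f} h, longest gap {max(gaps):.1f} h")
```

Running it prints one summary line per interface; a shortest gap measured in hours rather than seconds is what "widely separated" amounts to here. It says nothing about the cause, it only tallies the log.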