From: Sven Willenberger
To: Jack Vogel
Cc: freebsd-stable@freebsd.org
Subject: Re: CARP and em0 timeout watchdog
Date: Fri, 27 Apr 2007 09:25:18 -0400
Message-Id: <1177680318.8713.1.camel@lanshark.dmv.com>
In-Reply-To: <1177094694.5457.31.camel@lanshark.dmv.com>
References: <1176911436.7416.8.camel@lanshark.dmv.com>
 <1177084316.5457.5.camel@lanshark.dmv.com>
 <20070420160431.GA17356@icarus.home.lan>
 <2a41acea0704201017n42d4e987l77752ee8f7ca9f1f@mail.gmail.com>
 <1177091905.5457.17.camel@lanshark.dmv.com>
 <2a41acea0704201127x319be08cw869efe1dd02a046e@mail.gmail.com>
 <1177094694.5457.31.camel@lanshark.dmv.com>
List-Id: Production branch of FreeBSD source code

On Fri, 2007-04-20 at 14:44 -0400, Sven Willenberger wrote:
> On Fri, 2007-04-20 at 11:27 -0700, Jack Vogel wrote:
> > On 4/20/07, Sven Willenberger wrote:
> > > On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
> > > > On 4/20/07, Jeremy Chadwick wrote:
> > > > > On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > > > > > Having done more diagnostics I have found out it is not CARP related at
> > > > > > all. It turns out that the same timeouts will happen when ftp'ing to the
> > > > > > physical address IPs as well. There is also an odd situation here
> > > > > > depending on which protocol I use. The two boxes are connected to a Dell
> > > > > > Powerconnect 2616 gig switch with CAT6. If I scp files from the
> > > > > > 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
> > > > > > hiccup (I used dd to create various sized testfiles from 32M to 1G in
> > > > > > size and just scp testfile* to the other box). On the other hand, if I
> > > > > > connect to 192.168.0.19 using ftp (either active or passive) where ftp
> > > > > > is being run through inetd, the interface resets (watchdog) within
> > > > > > seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
> > > > > > changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
> > > > > > such behavioral differences between scp and ftp?
> > > > >
> > > > > You'll get a much higher throughput rate with FTP than you will with
> > > > > SSH, simply because encryption overhead is quite high (even with the
> > > > > Blowfish cipher). With a very fast processor and on a gigE network
> > > > > you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
> > > > > That's the only difference I can think of.
> > > > >
> > > > > The watchdog resets I can't explain; Jack Vogel should be able to assist
> > > > > with that. But it sounds like the resets only happen under very high
> > > > > throughput conditions (which is why you'd see it with FTP but not SSH).
> > > >
> > > > What kind of hardware is this interface? Watchdogs mean TX cleanup
> > > > isn't happening in a reasonable time; without further data it's hard to
> > > > know what might be going on.
> > > >
> > > > Jack
> > >
> > > from pciconf:
> > >
> > > em0@pci13:0:0: class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03
> > > hdr=0x00
> > >     vendor = 'Intel Corporation'
> > >     device = 'PRO/1000 PM'
> > >     class = network
> > >     subclass = ethernet
> > > em1@pci14:0:0: class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00
> > > hdr=0x00
> > >     vendor = 'Intel Corporation'
> > >     class = network
> > >     subclass = ethernet
> > >
> > > em0 is the interface in question.
> > >
> > > from dmesg:
> > >
> > > em0: port
> > > 0x4000-0x401f mem 0xe0300000-0xe031ffff irq 16 at device 0.0 on pci13
> > >
> > > em1: port
> > > 0x5000-0x501f mem 0xe0400000-0xe041ffff irq 17 at device 0.0 on pci14
> >
> > OH, this is an 82573, and I've posted a firmware patcher a couple
> > different times, there is a bit in the MANC register that is incorrectly
> > programmed in some vendors' systems. Can you search email for
> > that patcher, it needs to run from DOS. If you are unable to find
> > it let me know and I'll resend you a copy.
> >
> > Jack
>
> If you are referring to the dcgdis.ThisIsZip attachment, I found it in
> earlier threads, thanks. Will work on patching the nics and will keep
> the list updated.
>
> Thanks again.
>
> Sven
>

I am happy to report that the firmware patch seems to have fixed the
issue and I can transfer data across the gigE network without the
watchdog timeouts and lockups.

Thanks again!!

Sven
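
For anyone landing on this thread with the same watchdog timeouts: the chip= field in the pciconf output quoted above packs the PCI device ID into the upper 16 bits and the vendor ID into the lower 16, so chip=0x108c8086 is Intel (0x8086) device 0x108c, the part Jack identified as an 82573. Below is a minimal sketch for pulling those IDs out of saved pciconf output. The function names and the regular expression are my own illustrative assumptions, not part of any FreeBSD tool, and the sketch only reports IDs; it does not decide whether a given NIC needs the firmware patch.

```python
import re

# Matches the first line of a pciconf entry as quoted in this thread, e.g.
#   em0@pci13:0:0: class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03
# (hypothetical pattern written for this example)
ENTRY_RE = re.compile(
    r"^(\w+)@pci\S*\s.*?\bchip=0x([0-9a-fA-F]{8})\b", re.MULTILINE
)

INTEL_VENDOR_ID = 0x8086


def decode_chip(chip):
    """Split a 32-bit chip value into (vendor_id, device_id)."""
    return chip & 0xFFFF, (chip >> 16) & 0xFFFF


def intel_devices(pciconf_text):
    """Return {driver_name: device_id} for Intel entries in pciconf output."""
    found = {}
    for name, chip_hex in ENTRY_RE.findall(pciconf_text):
        vendor, device = decode_chip(int(chip_hex, 16))
        if vendor == INTEL_VENDOR_ID:
            found[name] = device
    return found


if __name__ == "__main__":
    # Sample input copied from the pciconf lines quoted in this thread.
    sample = (
        "em0@pci13:0:0: class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03\n"
        "em1@pci14:0:0: class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00\n"
    )
    for name, device in sorted(intel_devices(sample).items()):
        print(f"{name}: Intel device 0x{device:04x}")
```

Feeding it the two entries from this thread gives device 0x108c for em0 and 0x109a for em1; compare those against the device IDs in your em(4) driver source before assuming the MANC fix applies.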