Date:      Fri, 20 Apr 2007 11:50:19 -0700
From:      "Jack Vogel" <jfvogel@gmail.com>
To:        "Brian McCann" <bjmccann@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: CARP and em0 timeout watchdog
Message-ID:  <2a41acea0704201150r26babb4clbcbfb4cda09a853a@mail.gmail.com>
In-Reply-To: <2b5f066d0704201137j61b25e2bo24712323e7ca821e@mail.gmail.com>
References:  <1176911436.7416.8.camel@lanshark.dmv.com> <1177084316.5457.5.camel@lanshark.dmv.com> <20070420160431.GA17356@icarus.home.lan> <2a41acea0704201017n42d4e987l77752ee8f7ca9f1f@mail.gmail.com> <1177091905.5457.17.camel@lanshark.dmv.com> <2a41acea0704201127x319be08cw869efe1dd02a046e@mail.gmail.com> <2b5f066d0704201137j61b25e2bo24712323e7ca821e@mail.gmail.com>

On 4/20/07, Brian McCann <bjmccann@gmail.com> wrote:
> On 4/20/07, Jack Vogel <jfvogel@gmail.com> wrote:
> > On 4/20/07, Sven Willenberger <sven@dmv.com> wrote:
> > > On Fri, 2007-04-20 at 10:17 -0700, Jack Vogel wrote:
> > > > On 4/20/07, Jeremy Chadwick <koitsu@freebsd.org> wrote:
> > > > > On Fri, Apr 20, 2007 at 11:51:56AM -0400, Sven Willenberger wrote:
> > > > > > Having done more diagnostics I have found out it is not CARP related at
> > > > > > all. It turns out that the same timeouts will happen when ftp'ing to the
> > > > > > physical address IPs as well. There is also an odd situation here
> > > > > > depending on which protocol I use. The two boxes are connected to a Dell
> > > > > > Powerconnect 2616 gig switch with CAT6. If I scp files from the
> > > > > > 192.168.0.18 to the 192.168.0.19 box I can transfer gigs worth without a
> > > > > > hiccup (I used dd to create various sized testfiles from 32M to 1G in
> > > > > > size and just scp testfile* to the other box). On the other hand, if I
> > > > > > connect to 192.168.0.19 using ftp (either active or passive) where ftp
> > > > > > is being run through inetd, the interface resets (watchdog) within
> > > > > > seconds (a few MBs) of traffic. Enabling polling does nothing, nor does
> > > > > > changing net.inet.tcp.{recv,send}space. Any ideas why I would be seeing
> > > > > > such behavioral differences between scp and ftp?
> > > > >
> > > > > You'll get a much higher throughput rate with FTP than you will with
> > > > > SSH, simply because encryption overhead is quite high (even with the
> > > > > Blowfish cipher).  With a very fast processor and on a gigE network
> > > > > you'll probably see 8-9MByte/sec via SSH while 60-70MByte/sec via FTP.
> > > > > That's the only difference I can think of.
> > > > >
> > > > > The watchdog resets I can't explain; Jack Vogel should be able to assist
> > > > > with that.  But it sounds like the resets only happen under very high
> > > > > throughput conditions (which is why you'd see it with FTP but not SSH).
> > > >
> > > > What kind of hardware is this interface? Watchdogs mean TX cleanup
> > > > isn't happening in a reasonable time; without further data it's hard to
> > > > know what might be going on.
> > > >
> > > > Jack
> > >
> > > from pciconf:
> > >
> > > em0@pci13:0:0:  class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03
> > > hdr=0x00
> > >     vendor   = 'Intel Corporation'
> > >     device   = 'PRO/1000 PM'
> > >     class    = network
> > >     subclass = ethernet
> > > em1@pci14:0:0:  class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00
> > > hdr=0x00
> > >     vendor   = 'Intel Corporation'
> > >     class    = network
> > >     subclass = ethernet
> > >
> > > em0 is the interface in question.
> > >
> > > from dmesg:
> > >
> > > em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port
> > > 0x4000-0x401f mem 0xe0300000-0xe031ffff irq 16 at device 0.0 on pci13
> > >
> > > em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port
> > > 0x5000-0x501f mem 0xe0400000-0xe041ffff irq 17 at device 0.0 on pci14
> >
> > OH, this is an 82573, and I've posted a firmware patcher a couple of
> > different times. There is a bit in the MANC register that is incorrectly
> > programmed in some vendors' systems. Can you search email for
> > that patcher? It needs to run from DOS. If you are unable to find
> > it, let me know and I'll resend you a copy.
> >
> > Jack
> > _______________________________________________
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >
>
> FWIW, I've got 82546B cards and it's happening to me as well, but I'm
> on 6.1.  I'm upgrading to 6.2 and trying polling as we speak.
>
> --Brian

This is not the same problem. Until you are running 6.2 RELEASE it's
a whole other ballpark: there were locking issues between the driver
and the net layer that were fixed.

Jack
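
For readers following the thread: the chip= values in the quoted pciconf output pack the PCI device ID into the upper 16 bits and the vendor ID into the lower 16 bits. This is how a line like chip=0x108c8086 can be tied to a specific Intel part. The sketch below is not part of the original message; it is a minimal Python illustration, and the function name split_pciconf_chip is hypothetical.

```python
def split_pciconf_chip(chip):
    """Split a pciconf 'chip=' value into (vendor_id, device_id).

    Assumes the layout shown in the quoted output: device ID in the
    upper 16 bits, vendor ID in the lower 16 bits.
    """
    vendor_id = chip & 0xFFFF           # low 16 bits, e.g. 0x8086 (Intel)
    device_id = (chip >> 16) & 0xFFFF   # high 16 bits
    return vendor_id, device_id


# chip= values copied from the pciconf output quoted in this thread
for name, chip in (("em0", 0x108C8086), ("em1", 0x109A8086)):
    vendor_id, device_id = split_pciconf_chip(chip)
    print(f"{name}: vendor=0x{vendor_id:04x} device=0x{device_id:04x}")
```

Both interfaces report vendor 0x8086 (Intel). The device IDs differ (0x108c for em0, 0x109a for em1), and the device ID is what identifies the specific controller.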


