Date:      Fri, 5 Mar 2004 11:54:00 +0000
From:      ict technician <ict@cardinalnewman.coventry.sch.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: em0 checksum errors
Message-ID:  <200403051154.00447.ict@cardinalnewman.coventry.sch.uk>
In-Reply-To: <200403041208.46884.ict@cardinalnewman.coventry.sch.uk>
References:  <200403021228.17716.ict@cardinalnewman.coventry.sch.uk> <200403021434.27420.ict@cardinalnewman.coventry.sch.uk> <200403041208.46884.ict@cardinalnewman.coventry.sch.uk>

On Thursday 04 March 2004 12:08 pm, ict technician wrote:
> On Tuesday 02 March 2004 2:34 pm, ict technician wrote:
> > On Tuesday 02 March 2004 12:28 pm, ict technician wrote:
> > > I've been testing an application which uses UDP. I was having
> > > difficulties so I started taking packet dumps. I noticed that many
> > > packets have bad checksums. The errors are mostly on UDP packets but I
> > > do see some TCP packets with errors also. It also occurs with system
> > > services (e.g. dns/ssh) when the new app isn't running.
> > >
> > > This is reproducible on more than one system, although the NICs are
> > > probably from the same batch, as 5 of the 7 in use came from one box.
> > > Systems are 4.9-RELEASEp1/p2.
> > >
> > > The cards are Intel PRO/1000 MT Server. I'll get the numbers off the
> > > card shortly.
> > >
> > > One box on stable (18th Feb) seems okay so I'm going to try stable on
> > > my test box and see if that cures it.
> > >
> > > I won't spam the list with the dump.
> >
> > replies to self - how uncouth.
> >
> > While it was building I decided to re-read the recent thread
> > http://docs.freebsd.org/cgi/mid.cgi?6.0.3.0.0.20040226131930.10513908
> >
> > I'd discounted this as I wasn't seeing the EEPROM message.
> >
> > Sure enough, moving the em0 card seems to fix the problem.
> >
> > I'll reply to self again once I confirm the conflicting item ;)
>
> Aaarrrgghhh, evil PC hardware.
>
> Once I'd moved the NIC it decided to work. However, I moved the card back
> to its original slot and could no longer reproduce the fault.
>
> All the other boxes with these cards are production but I can play with the
> "backup server" which only needs to run at night.
>
> Tried a cold boot. No joy.
> Then I swapped the NIC. No joy. Noted that its number differs: C31527-002
> vs A92165-004. Not listed as supported but according to Intel it's the same
> part.
>
> Much swapping of cards to no avail. Trying more stuff :((

Drat and double drat! It appears to be "a feature". The "broken" packets 
appear to deliver okay.

Now, if I understand things correctly, these cards can do checksum offloading.
I'm guessing that tcpdump snarfs outgoing packets before the card has had a
chance to fill in the checksum. Can A N Expert confirm my conjecture?
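
To illustrate the conjecture, here is a minimal Python sketch of the RFC 1071 Internet checksum applied to a UDP datagram. It is not taken from the em driver or from tcpdump. The helper names (build_udp, udp_checksum_ok) and the addresses are made up for the example, and the "offloaded" case simply assumes the stack leaves only the pseudo-header sum in the checksum field for the NIC to finish; what really sits in that field depends on the driver.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp_pseudo_header(src: str, dst: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header covered by the UDP checksum (protocol 17)."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!BBH", 0, 17, udp_len))

def build_udp(src, dst, sport, dport, payload, offloaded=False):
    """Hypothetical helper: build a UDP segment as a capture might see it."""
    length = 8 + len(payload)
    pseudo = udp_pseudo_header(src, dst, length)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    if offloaded:
        # ASSUMPTION: only the (uncomplemented) pseudo-header sum is present;
        # the NIC would complete the checksum after the capture point.
        csum = ~internet_checksum(pseudo) & 0xFFFF
    else:
        # Software checksum; a computed 0 is sent as 0xFFFF for UDP.
        csum = internet_checksum(pseudo + header + payload) or 0xFFFF
    return struct.pack("!HHHH", sport, dport, length, csum) + payload

def udp_checksum_ok(src, dst, segment: bytes) -> bool:
    """Verify the way a sniffer would: the sum over everything must be zero."""
    pseudo = udp_pseudo_header(src, dst, len(segment))
    return internet_checksum(pseudo + segment) == 0

if __name__ == "__main__":
    src, dst = "192.168.0.1", "192.168.0.2"  # made-up addresses
    done = build_udp(src, dst, 1234, 53, b"hello")
    early = build_udp(src, dst, 1234, 53, b"hello", offloaded=True)
    print("software checksum verifies:", udp_checksum_ok(src, dst, done))
    print("captured before NIC fix-up verifies:", udp_checksum_ok(src, dst, early))
```

If the guess is right, only packets sent by the capturing host should show up as bad, and the same packets sniffed on the receiving end should verify.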

In any case, can anyone confirm the result? I'm running:
# tcpdump -lv -s1500 | grep bad

Weird thing is that one box (only) works. Naturally this is the box I tested on.
For the record it's a GA-7VAXP-A Ultra.

I'm willing to take a look at this, but I'm no kernel hacker. Is the problem
in em, bpf, tcpdump, or the network stack? It's probably not the recent PAE
related changes since I tried 4.7, 4.8, 4.9, and 5.2.1RC. I also tried the
latest driver from the Intel site.

I still need to track down the network problems with the new app, and having a 
broken tcpdump is cramping my style.

Cheers

-- 
i j hart

ICT Technician
Cardinal Newman Catholic School & Community College



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200403051154.00447.ict>