Date:      Tue, 9 Mar 2004 12:51:37 +0000
From:      ict technician <ict@cardinalnewman.coventry.sch.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: em0 checksum errors
Message-ID:  <200403091251.37386.ict@cardinalnewman.coventry.sch.uk>
In-Reply-To: <200403051154.00447.ict@cardinalnewman.coventry.sch.uk>
References:  <200403021228.17716.ict@cardinalnewman.coventry.sch.uk> <200403041208.46884.ict@cardinalnewman.coventry.sch.uk> <200403051154.00447.ict@cardinalnewman.coventry.sch.uk>

On Friday 05 March 2004 11:54 am, ict technician wrote:
> On Thursday 04 March 2004 12:08 pm, ict technician wrote:
> > On Tuesday 02 March 2004 2:34 pm, ict technician wrote:
> > > On Tuesday 02 March 2004 12:28 pm, ict technician wrote:
> > > > I've been testing an application which uses UDP. I was having
> > > > difficulties so I started taking packet dumps. I noticed that many
> > > > packets have bad checksums. The errors are mostly on UDP packets but
> > > > I do see some TCP packets with errors also. This occurs on system
> > > > applications without the new app. running, e.g. dns/ssh
> > > >
> > > > This is reproducible on more than one system, although the NICs are
> > > > probably from the same batch, as I bought a box of 5 out of 7 in use.
> > > > Systems are 4.9-RELEASEp1/p2.
> > > >
> > > > The cards are Intel PRO/1000 MT Server. I'll get the numbers off the
> > > > card shortly.
> > > >
> > > > One box on stable (18th Feb) seems okay so I'm going to try stable on
> > > > my test box and see if that cures it.
> > > >
> > > > I won't spam the list with the dump.
> > >
> > > replies to self - how uncouth.
> > >
> > > While it's building I decide to re-read the recent thread
> > > http://docs.freebsd.org/cgi/mid.cgi?6.0.3.0.0.20040226131930.10513908
> > >
> > > I'd discounted this as I wasn't seeing the EEPROM message.
> > >
> > > Sure enough, moving the em0 card seems to fix the problem.
> > >
> > > I'll reply to self again once I confirm the conflicting item ;)
> >
> > Aaarrrgghhh, evil PC hardware.
> >
> > Once I'd moved the NIC it decided to work. However, I moved the card back
> > to its original slot and could no longer reproduce the fault.
> >
> > All the other boxes with these cards are production but I can play with
> > the "backup server" which only needs to run at night.
> >
> > Tried a cold boot. No joy.
> > Then I swapped the NIC. No joy. Noted that it's different C31527-002 vs
> > A92165-004. Not listed as supported but according to Intel it's the same
> > part.
> >
> > Much swapping of cards to no avail. Trying more stuff :((
>
> Drat and double drat! It appears to be "a feature". The "broken" packets
> appear to deliver okay.
>
> Now if I understand things correctly these cards can do checksum
> offloading. I'm guessing that the packets are snarfed before the card can
> fix-up the checksum. Can A N Expert confirm my conjecture?
>
> In any case can anyone confirm the result? I'm doing
> #tcpdump -lv -s1500 | grep bad
>
> Weird thing is one box (only) works. Naturally this is the box I tested on.
> For the record it's a GA-7VAXP-A Ultra.
>
> I'm willing to take a look at this, but I'm no kernel hacker. Is it in
> em/bpf/tcpdump/network stack? It's probably not the recent PAE related
> changes since I tried 4.7, 4.8, 4.9, and 5.2.1RC. Also tried latest driver
> from Intel site.
>
> I still need to track down the network problems with the new app, and
> having a broken tcpdump is cramping my style.
>
> Cheers

Haven't worked out how the driver works yet, but disabling the hardware
checksum "fixes" the problem for me. I've filed a PR, kern/63982,
and I'll copy it to those lovely Intel people.
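
To illustrate the conjecture quoted above, here is a minimal Python sketch (not taken from the driver or from tcpdump) of the standard Internet checksum from RFC 1071 applied to a UDP datagram. It assumes, purely for illustration, that the stack leaves the checksum field as zero for the card to fill in later; the addresses, ports and payload are made up. A capture taken before the card writes the real value would then fail verification even though the packet on the wire is fine.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip: bytes, dst_ip: bytes, src_port: int,
                 dst_port: int, payload: bytes, checksum_field: int = 0) -> int:
    """Checksum over the IPv4 pseudo-header, UDP header and payload.

    With checksum_field=0 this returns the value a sender should store.
    With the stored value passed in, a result of 0 means "checksum OK".
    """
    length = 8 + len(payload)  # UDP header is 8 bytes
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)  # 17 = UDP
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum_field)
    return internet_checksum(pseudo + header + payload)


# Hypothetical datagram, used only for this example.
src = bytes([192, 168, 0, 1])
dst = bytes([192, 168, 0, 2])
payload = b"hello"

correct = udp_checksum(src, dst, 1234, 53, payload)

# What the wire would carry once the checksum has been filled in.
wire_ok = udp_checksum(src, dst, 1234, 53, payload, checksum_field=correct) == 0

# What a capture would see if the field were still the zero placeholder.
# (In real UDP a zero field means "no checksum"; it stands in here for
# whatever not-yet-final value the stack hands to the card.)
early_ok = udp_checksum(src, dst, 1234, 53, payload, checksum_field=0) == 0

print("wire copy verifies: ", wire_ok)
print("early copy verifies:", early_ok)
```

If the conjecture holds, only packets sent by the capturing host should show up as bad; received packets, and the same traffic captured on another machine, should verify.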

-- 
i j hart

ICT Technician
Cardinal Newman Catholic School & Community College


