From owner-freebsd-stable@FreeBSD.ORG  Sun Feb  8 13:16:17 2009
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 11828106566B;
	Sun,  8 Feb 2009 13:16:17 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84])
	by mx1.freebsd.org (Postfix) with ESMTP id AEEDE8FC68;
	Sun,  8 Feb 2009 13:16:16 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from pampa.cs.huji.ac.il ([132.65.80.32])
	by kabab.cs.huji.ac.il with esmtp
	id 1LW9WA-0003Fu-O4; Sun, 08 Feb 2009 15:16:14 +0200
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2
To: Robert Watson <rwatson@FreeBSD.org>
In-reply-to: <alpine.BSF.2.00.0902081252300.1129@fledge.watson.org> 
References: <E1LW5Ht-0000VH-D8@kabab.cs.huji.ac.il> 
	<20090208091656.GA31876@test71.vk2pj.dyndns.org>
	<E1LW60v-0000zC-B2@kabab.cs.huji.ac.il>
	<20090208104253.GB31876@test71.vk2pj.dyndns.org>
	<alpine.BSF.2.00.0902081252300.1129@fledge.watson.org>
Comments: In-reply-to Robert Watson <rwatson@FreeBSD.org>
	message dated "Sun, 08 Feb 2009 12:56:21 +0000."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sun, 08 Feb 2009 15:16:14 +0200
From: Danny Braniss <danny@cs.huji.ac.il>
Message-ID: <E1LW9WA-0003Fu-O4@kabab.cs.huji.ac.il>
Cc: Peter Jeremy <peter@vk2pj.dyndns.org>, freebsd-stable@freebsd.org
Subject: Re: impossible packet length ... 
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 08 Feb 2009 13:16:17 -0000

> On Sun, 8 Feb 2009, Peter Jeremy wrote:
> 
> > On 2009-Feb-08 11:31:45 +0200, Danny Braniss <danny@cs.huji.ac.il> wrote:
> >> Q: with rxcsum on, and a bad checksum packet is received, is it
> >>   dropped by the NIC? if not, then it somewhat explains the behaviour
> >
> > If checksum offloading is working correctly then a bad packet should be 
> > dropped by the NIC.  If checksum offloading isn't working correctly then you 
> > can wind up in the situation where both the NIC and the driver think the 
> > other party has verified the checksum.  It's also possible that you may be 
> > running into corruption during DMA transfer from the NIC to RAM.  ISTR there 
> > have been some issues reported recently with checksum offloading on some 
> > NICs - though I don't have details to hand - you might like to search the 
> > lists.
> >
> >> changing the nic is tough, but if needed will be done.
> >
> > If disabling checksum offloading fixes the problem and the additional CPU 
> > load is acceptable (at least until you find a real fix) then there's no need 
> > to change NICs.
> 
> Actually, my understanding was that packets with bad checksums are delivered 
> to software, and a flag in the descriptor ring header for each packet tells us 
> whether the checksum was (a) checked and (b) validated by the hardware.  We 
> then propagate these to mbuf flags so that higher stack layers know whether or 
> not to calculate the checksum themselves.  Regardless of the specifics, 
> though, packets with checked but bad checksums shouldn't make it to the socket 
> layer where they would be visible to NFS.  If the NIC is marking apparently 
> bad packets as good, there are a number of possible sources -- be it bad 
> checksum handling in the card or corruption between the card and higher levels 
> of the stack (a DMA problem, as you point out, would have this symptom).

Looking at the bce source, it's not clear (to me :-). If errors are detected in
bce_rx_intr(), the packet gets dropped, which is how I would expect an
offloaded checksum error to be treated, but that does not seem to be the case.
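
To make the question concrete, here is a rough sketch in C of the model
Robert describes, as I understand it: frame-level errors cause a drop, while a
bad offloaded checksum only means the packet is handed up without a "checksum
valid" mark, so the stack verifies it in software. Every name and bit value
below (the sk_ and SK_ identifiers) is made up for illustration; this is not
the bce(4) descriptor layout, not the real mbuf flags, and not what
bce_rx_intr() actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical RX descriptor status bits; NOT the real bce(4) layout. */
#define SK_RXD_L4_CSUM_CHECKED 0x01u /* hardware attempted the TCP/UDP checksum */
#define SK_RXD_L4_CSUM_BAD     0x02u /* hardware found a checksum mismatch */
#define SK_RXD_FRAME_ERROR     0x04u /* CRC, alignment or length error */

/* Hypothetical per-packet flag, standing in for the mbuf checksum flags. */
#define SK_PKT_CSUM_VALID      0x01u

enum sk_verdict { SK_DROP, SK_DELIVER };

struct sk_rx_desc { uint32_t status; };     /* what the NIC wrote for this frame */
struct sk_pkt     { uint32_t csum_flags; }; /* what the stack sees */

/*
 * Decide what the driver does with one received frame.
 * Frame errors are dropped in the driver. A checksum that was checked and
 * found good is marked valid. A checksum that was not checked, or was
 * checked and found bad, is delivered unmarked.
 */
enum sk_verdict
sk_rx_classify(const struct sk_rx_desc *d, struct sk_pkt *p)
{
    assert(d != NULL && p != NULL);

    p->csum_flags = 0;
    if (d->status & SK_RXD_FRAME_ERROR)
        return SK_DROP;
    if ((d->status & SK_RXD_L4_CSUM_CHECKED) &&
        !(d->status & SK_RXD_L4_CSUM_BAD))
        p->csum_flags |= SK_PKT_CSUM_VALID;
    return SK_DELIVER;
}

/* The upper layer recomputes the checksum unless the driver vouched for it. */
bool
sk_stack_must_verify(const struct sk_pkt *p)
{
    return (p->csum_flags & SK_PKT_CSUM_VALID) == 0;
}
```

If the driver follows this model, a corrupted packet can only reach NFS when
the hardware reports "checked and good" for data that is bad, or when the data
is damaged after the check (e.g. during DMA).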

danny