From: Danny Braniss
To: Robert Watson
Cc: Peter Jeremy, freebsd-stable@freebsd.org
Subject: Re: impossible packet length ...
Date: Sun, 08 Feb 2009 15:16:14 +0200
References: <20090208091656.GA31876@test71.vk2pj.dyndns.org> <20090208104253.GB31876@test71.vk2pj.dyndns.org>
Comments: In-reply-to Robert Watson message dated "Sun, 08 Feb 2009 12:56:21 +0000."

> On Sun, 8 Feb 2009, Peter Jeremy wrote:
>
> > On 2009-Feb-08 11:31:45 +0200, Danny Braniss wrote:
> >> Q: with rxcsum on, and a bad checksum packet is received, is it
> >> dropped by the NIC? if not, then it somewhat explains the behaviour
> >
> > If checksum offloading is working correctly then a bad packet should be
> > dropped by the NIC.  If checksum offloading isn't working correctly then you
> > can wind up in the situation where both the NIC and the driver think the
> > other party has verified the checksum.
> > It's also possible that you may be
> > running into corruption during DMA transfer from the NIC to RAM.  ISTR there
> > have been some issues reported recently with checksum offloading on some
> > NICs - though I don't have details to hand - you might like to search the
> > lists.
> >
> >> changing the nic is tough, but if needed will be done.
> >
> > If disabling checksum offloading fixes the problem and the additional CPU
> > load is acceptable (at least until you find a real fix) then there's no need
> > to change NICs.
>
> Actually, my understanding was that packets with bad checksums are delivered
> to software, and flags in the descriptor ring header for each packet tell us
> whether the checksum was (a) checked and (b) validated by the hardware.  We
> then propagate these to mbuf flags so that higher stack layers know whether or
> not to calculate the checksum themselves.  Regardless of the specifics,
> though, packets with checked but bad checksums shouldn't make it to the socket
> layer where they would be visible to NFS.  If the NIC is marking apparently
> bad packets as good, there are a number of possible sources -- be it bad
> checksum handling in the card, or corruption between the card and higher
> levels of the stack (a DMA problem, as you point out, would have this symptom).

Looking at the bce source, it's not clear (to me :-).
If errors are detected in bce_rx_intr(), the packet gets dropped, which
I would expect to be the treatment of an offloaded checksum error, but it
seems that is not the case.

danny
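
The flag propagation described in the quoted text can be sketched roughly
as follows.  This is a small user-space model only, not code from bce(4) or
the FreeBSD network stack: every name in it (RXD_*, PKT_*, struct pkt,
rx_propagate_csum, sw_csum_ok, stack_input) is hypothetical and chosen for
illustration.  It assumes the hardware reports two bits per received packet,
"checked" and "bad", and that the stack falls back to a software checksum
only when the hardware did not check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits the NIC writes into its RX descriptor. */
#define RXD_CSUM_CHECKED 0x1u   /* hardware attempted verification */
#define RXD_CSUM_BAD     0x2u   /* hardware found a mismatch */

/* Hypothetical per-packet flags, loosely modelled on the idea of mbuf flags. */
#define PKT_CSUM_CHECKED 0x1u
#define PKT_CSUM_VALID   0x2u

struct pkt {
    uint32_t       csum_flags;
    const uint8_t *data;        /* bytes covered by the checksum */
    size_t         len;
};

enum verdict { PKT_ACCEPT, PKT_DROP };

/* Driver side: record the hardware verdict on the packet; do not drop here. */
static void rx_propagate_csum(uint32_t rxd_status, struct pkt *p)
{
    assert(p != NULL);
    p->csum_flags = 0;
    if (rxd_status & RXD_CSUM_CHECKED) {
        p->csum_flags |= PKT_CSUM_CHECKED;
        if (!(rxd_status & RXD_CSUM_BAD))
            p->csum_flags |= PKT_CSUM_VALID;
    }
}

/* Software fallback: one's-complement sum over 16-bit big-endian words.
 * Data that includes a correct checksum field sums to 0xffff. */
static bool sw_csum_ok(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;   /* pad odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);  /* fold carries back in */
    return sum == 0xffffu;
}

/* Stack side: trust the hardware verdict if present, else verify in software. */
static enum verdict stack_input(const struct pkt *p)
{
    if (p->csum_flags & PKT_CSUM_CHECKED)
        return (p->csum_flags & PKT_CSUM_VALID) ? PKT_ACCEPT : PKT_DROP;
    return sw_csum_ok(p->data, p->len) ? PKT_ACCEPT : PKT_DROP;
}
```

In this model a packet that the hardware checked and flagged as bad is
dropped by stack_input() rather than by the driver.  A packet the hardware
wrongly reports as checked and good is accepted without any software check,
which is the failure mode suspected in this thread; with the "checked" bit
clear (the model's stand-in for disabling rxcsum) the same packet is caught
by sw_csum_ok().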