From owner-freebsd-stable@FreeBSD.ORG  Mon Feb  9 06:46:14 2009
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6B3611065672;
	Mon,  9 Feb 2009 06:46:14 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84])
	by mx1.freebsd.org (Postfix) with ESMTP id 1AA208FC1A;
	Mon,  9 Feb 2009 06:46:14 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from pampa.cs.huji.ac.il ([132.65.80.32])
	by kabab.cs.huji.ac.il with esmtp
	id 1LWPuG-000DW9-96; Mon, 09 Feb 2009 08:46:12 +0200
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2
To: Robert Watson <rwatson@FreeBSD.org>
In-reply-to: <alpine.BSF.2.00.0902081427260.1129@fledge.watson.org> 
References: <E1LW5Ht-0000VH-D8@kabab.cs.huji.ac.il> 
	<20090208091656.GA31876@test71.vk2pj.dyndns.org>
	<E1LW60v-0000zC-B2@kabab.cs.huji.ac.il>
	<20090208104253.GB31876@test71.vk2pj.dyndns.org>
	<alpine.BSF.2.00.0902081252300.1129@fledge.watson.org>
	<E1LW9WA-0003Fu-O4@kabab.cs.huji.ac.il>
	<alpine.BSF.2.00.0902081318000.89719@fledge.watson.org>
	<E1LW9yH-0003Xm-KS@kabab.cs.huji.ac.il>
	<alpine.BSF.2.00.0902081427260.1129@fledge.watson.org>
Comments: In-reply-to Robert Watson <rwatson@FreeBSD.org>
	message dated "Sun, 08 Feb 2009 14:33:33 +0000."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Mon, 09 Feb 2009 08:46:12 +0200
From: Danny Braniss <danny@cs.huji.ac.il>
Message-ID: <E1LWPuG-000DW9-96@kabab.cs.huji.ac.il>
Cc: Peter Jeremy <peter@vk2pj.dyndns.org>, freebsd-stable@freebsd.org
Subject: Re: impossible packet length ... 
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Feb 2009 06:46:14 -0000

> 
> On Sun, 8 Feb 2009, Danny Braniss wrote:
> 
> >> On Sun, 8 Feb 2009, Danny Braniss wrote:
> >>
> >>> looking at the bce source, it's not clear (to me :-). If errors are 
> >>> detected in bce_rx_intr(), the packet gets dropped, which I would expect 
> >>> to be the treatment of an offloded chekcum error, but it seems that is not 
> >>> the case.
> >>
> >> I think we're thinking of different checksums -- devices/device drivers 
> >> drop frames with bad ethernet checksums, but not IP and above layer 
> >> checksums.
> >
> > I know I'm stepping on thin ice hear - haven't touched Stevens for a while, 
> > (and I doubt it mentions offloading), but if the offload checksum is bad, 
> > why not just drop the packet?
> >
> > The way I read the driver, if the offload checksum is on, and if no errors 
> > where detected, then it's marked as ok.
> 
> There are a few good reasons I can think of, but this is hardly a 
> comprehensive list:
> 
> (1) If there are bad higher level checksums on the wire, you want to see them
>      in tcpdump, so allow them to get up to a higher layer if network layer
>      checksums aren't good.
> 
> (2) It's a matter of local policy as to whether UDP checksums (for example)
>      are observed or not.
> 
> (3) If you're forwarding or bridging packets, it should be up to the end nodes
>      how they deal with bad UDP checksums on packets to them, not the routers.

ok, I can understand the logic.

> 
> Looking at if_bce.c, the following seems to be reasonable logic; first, 
> ethernet-layer checksums:
> 
> 5902                 /* Check the received frame for errors. */
> 5903                 if (status & (L2_FHDR_ERRORS_BAD_CRC |
> 5904                         L2_FHDR_ERRORS_PHY_DECODE | 
> L2_FHDR_ERRORS_ALIGNMENT |
> 5905                         L2_FHDR_ERRORS_TOO_SHORT  | 
> L2_FHDR_ERRORS_GIANT_FRAME)) {
> 5906
> 5907                         /* Log the error and release the mbuf. */
> 5908                         ifp->if_ierrors++;
> 5909                         DBRUN(sc->l2fhdr_status_errors++);
> 5910
> 5911                         m_freem(m0);
> 5912                         m0 = NULL;
> 5913                         goto bce_rx_int_next_rx;
> 5914                 }
> 
> I.e., if there are ethernet-level CRC failures, drop the packet.
> 
> 5922                 /* Validate the checksum if offload enabled. */
> 5923                 if (ifp->if_capenable & IFCAP_RXCSUM) {
> 5924
> 5925                         /* Check for an IP datagram. */
> 5926                         if (!(status & L2_FHDR_STATUS_SPLIT) &&
> 5927                                 (status & L2_FHDR_STATUS_IP_DATAGRAM)) {
> 5928                                 m0->m_pkthdr.csum_flags |= 
> CSUM_IP_CHECKED;
> 5929
> 5930                                 /* Check if the IP checksum is valid. */
> 5931                                 if ((l2fhdr->l2_fhdr_ip_xsum ^ 0xffff) == 
> 0)
> 5932                                         m0->m_pkthdr.csum_flags |= 
> CSUM_IP_VALID;
> 5933                         }
> 5934
> 5935                         /* Check for a valid TCP/UDP frame. */
> 5936                         if (status & (L2_FHDR_STATUS_TCP_SEGMENT |
> 5937                                 L2_FHDR_STATUS_UDP_DATAGRAM)) {
> 5938
> 5939                                 /* Check for a good TCP/UDP checksum. */
> 5940                                 if ((status & (L2_FHDR_ERRORS_TCP_XSUM |
> 5941                                               L2_FHDR_ERRORS_UDP_XSUM)) 
> == 0) {
> 5942                                         m0->m_pkthdr.csum_data =
> 5943                                             l2fhdr->l2_fhdr_tcp_udp_xsum;
> 5944                                         m0->m_pkthdr.csum_flags |= 
> (CSUM_DATA_VALID
> 5945                                                 | CSUM_PSEUDO_HDR);
> 5946                                 }
> 5947                         }
> 5948                 }
> 
> Only look at higher level checksums if policy enables it on the interface; 
> then, only if the hardware has a view on the IP-layer checksums, propagte that 
> information to the mbuf flags from the descriptor ring entry flags, both 
> whether or not the checksum was verified, and whether or not it was good.  If 
> policy disables it, or the hardware expresses no view, we don't set flags, 
> which simply defers checksumming to a higher layer (if required -- for 
> forwarded packets, we won't test UDP-layer checksums at all).

I missed line 5928, and as usual, your explanation is most educational!
The comment in line 5939 is a bit missleading, the way I read the code, it
does not check for good checksum.

Cheers,
	danny