From owner-svn-src-user@FreeBSD.ORG Mon Nov 19 20:23:56 2012
From: Michael Tuexen
Date: Mon, 19 Nov 2012 21:23:53 +0100
Subject: Re: svn commit: r243291 - in user/andre/tcp_workqueue/sys: net netinet netinet6 netipsec netpfil/ipfw netpfil/pf
To: Andre Oppermann
Cc: src-committers@freebsd.org, svn-src-user@freebsd.org
Message-Id: <4973333F-FDA4-412E-91F2-64639BDB04A6@fh-muenster.de>
In-Reply-To: <50AA831A.5000808@freebsd.org>
References: <201211191804.qAJI4IXX014601@svn.freebsd.org> <50AA831A.5000808@freebsd.org>
List-Id: SVN commit messages for the experimental "user" src tree

On Nov 19, 2012, at 8:06 PM, Andre Oppermann wrote:

> On 19.11.2012 19:57, Michael Tuexen wrote:
>> On Nov 19, 2012, at 7:04 PM, Andre Oppermann wrote:
>>
>>> Author: andre
>>> Date: Mon Nov 19 18:04:17 2012
>>> New Revision: 243291
>>> URL: http://svnweb.freebsd.org/changeset/base/243291
>>>
>>> Log:
>>>   Convert IP, IPv6, UDP, TCP and SCTP to the new checksum offloading
>>>   semantics on the inbound and outbound path.
>>>
>>>   In short for inbound there are two levels the offloading NIC can
>>>   set:
>>>
>>>    CSUM_L3_CALC for an IP layer 3 checksum calculated by the NIC;
>>>    CSUM_L3_VALID set when the calculated checksum matches the one
>>>                  in the packet;
>>>    CSUM_L4_CALC for a UDP/TCP/SCTP layer 4 checksum calculated by
>>>                  the NIC;
>>>    CSUM_L4_VALID set when the calculated checksum matches the one
>>>                  in the packet.
>>>
>>>   From this follows that a packet failed checksum verification when
>>>   only *_CALC is set but not *_VALID. The NIC is expected to deliver
>>>   a failed packet up the stack anyways for capture by BPF and to
>>>   record protocol specific checksum mismatch statistics.
>>>
>>>   The old approach with CSUM_DATA_VALID and CSUM_PSEUDO_HDR could not
>>>   signal a failed packet. A failed packet was delivered into the stack
>>>   and the protocol had to recalculate the checksum for verification
>>>   every time to detect that as the absence of CSUM_DATA_VALID didn't
>>>   signal that the packet was broken. It was only saying that the
>>>   checksum wasn't calculated by the NIC, which actually wasn't the case.
>>>
>>>   Drag the other stack infrastructure, including packet filters, along
>>>   as well.
>> I looked at the code for SCTP. If the NIC reports that it computed the
>> checksum and the checksum is not reported as valid, you are dropping the
>> packet. The problem with this code is that at least some NICs report
>> for small SCTP packets that the checksum is wrong. To deal with this,
>> I did the checksum verification in software in case the hardware reported
>> a bad checksum. Are you planning to deal with this in the specific drivers?
>
> Yes, in that case the definition is that the driver must not set
> CSUM_L4_CALC and CSUM_L4_VALID so the protocol can calculate the
> checksum itself. If it is known that, for example, packets shorter
> than 128 bytes are not correctly calculated by hardware the driver
> must check for that when it sets the checksum flags.
>
> Do you have a list of NICs that are broken with SCTP/CRC32c in
> particular cases?
If I remember correctly, igb with packets up to 64 bytes (such that
there is final Ethernet padding).
>
> It may be helpful to add an INVARIANTS case where the checksum is
> always recomputed by the protocol as well and compared with the
> outcome of the hardware calculation. That way we can catch misbehaving
> NICs early on.
Not sure we need this. If the driver reports the checksum as bad and
the transport layer drops the packet, the connection will eventually
break. We also have counters... So it should show up during testing.

Best regards
Michael
>
> --
> Andre
>
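
For readers following the thread, the inbound flag semantics from the commit log and the driver-side rule Andre describes can be sketched as below. This is only an illustration under stated assumptions: the flag names CSUM_L4_CALC and CSUM_L4_VALID come from the commit message, but the bit values, the enum, both function names, and the min_trusted_len parameter are hypothetical and are not taken from the tcp_workqueue branch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define CSUM_L4_CALC	0x0001u	/* NIC calculated the layer 4 checksum */
#define CSUM_L4_VALID	0x0002u	/* calculated checksum matched the packet */

/* Hypothetical verdicts a transport protocol could derive from the flags. */
enum csum_verdict {
	CSUM_VERIFY_IN_SOFTWARE,	/* NIC made no claim; protocol must check */
	CSUM_ACCEPT,			/* NIC calculated and the checksum matched */
	CSUM_FAILED			/* NIC calculated and the checksum did not match */
};

/*
 * Protocol side: only *_CALC without *_VALID means the packet failed
 * verification. Without *_CALC the protocol verifies the checksum itself
 * (a stray VALID bit without CALC is ignored in this sketch).
 */
static enum csum_verdict
l4_csum_verdict(uint32_t csum_flags)
{
	if ((csum_flags & CSUM_L4_CALC) == 0)
		return (CSUM_VERIFY_IN_SOFTWARE);
	if ((csum_flags & CSUM_L4_VALID) != 0)
		return (CSUM_ACCEPT);
	return (CSUM_FAILED);
}

/*
 * Driver side: if the hardware is known to miscalculate packets shorter
 * than min_trusted_len (a hypothetical per-NIC constant), set neither
 * flag so that the protocol falls back to software verification.
 */
static uint32_t
driver_l4_csum_flags(size_t pkt_len, size_t min_trusted_len, int hw_says_ok)
{
	if (pkt_len < min_trusted_len)
		return (0);
	return (CSUM_L4_CALC | (hw_says_ok ? CSUM_L4_VALID : 0u));
}
```

With this split, a short SCTP packet that the hardware wrongly reports as bad never reaches the CSUM_FAILED case: the driver passes no flags and the protocol recomputes the checksum in software.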