Date: Sun, 1 Feb 2009 11:49:57 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: "Per Hurtig (work)" <per.hurtig@kau.se>
Cc: freebsd-net@freebsd.org
Subject: Re: TCP gets special treatment?
Message-ID: <alpine.BSF.2.00.0902011141400.47005@fledge.watson.org>
In-Reply-To: <a846cbcf0901280249w31265880x88c60762a111b7d8@mail.gmail.com>
References: <a846cbcf0901280249w31265880x88c60762a111b7d8@mail.gmail.com>
On Wed, 28 Jan 2009, Per Hurtig (work) wrote:

> How differently are TCP packets treated compared to e.g. SCTP packets, while
> traversing the FreeBSD network stack (up to and including the IP-layer when
> using ipfw)? I do not assume that the firewall (ipfw) is explicitly
> configured to check for established sessions or any TCP specifics. Are there
> a lot of TCP-specific optimizations conducted by lower layers anyways
> (besides possible checksum offloading)?

Hi Per:

On the whole, TCP packets are treated like any other packet until they reach
the tcp_input() function in the input path, and once they've entered
ip_output() in the output path. There are some exceptions that I'm aware of,
including:

- ipfw(4) has special knowledge of the layout and semantics of TCP packets,
  including stateful tracking of TCP connections, etc. ipfw(4) is able to use
  (output) or look up (input) the local socket for the purpose of identifying
  the credential that was, or may be, associated with the packet. Many of us
  consider this highly dubious behavior, subject to race conditions and
  unexpected semantics, but it appears to be popular functionality. Other
  firewall packages, including pf(4), have this functionality as well.

- The IP input protocol dispatch (in_proto.c) doesn't set PR_LASTHDR for TCP
  (and UDP, for that matter) because IPSEC policy is aware of TCP-level
  properties, meaning that some IPSEC processing (policy checking) isn't
  performed in the normal IPSEC input path and is instead deferred to the TCP
  input path. See ip_ipsec.c.

- Various sorts of checksum offload and segmentation offload require TCP
  segments to be handled outside of the core TCP routines. This includes
  ip_output(), where deferred checksum calculations will be performed if it
  turns out the output interface doesn't support hardware checksumming, and
  where TSO segments may be rejected. It also includes device drivers that
  perform (for example) TSO and LRO and are therefore aware (in some form) of
  TCP processing. tcp_lro.c, for example, is called entirely from the device
  driver in order to perform early reassembly, if the device driver supports
  it (primarily 10 Gbps drivers).

Robert N M Watson
Computer Laboratory
University of Cambridge
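
As an illustration of the deferred-checksum fallback mentioned in the last item above, here is a minimal user-space sketch in C. It is not the kernel's code: struct pkt, the two flag bits, and output_fixup_cksum() are hypothetical names invented for this example, and the checksum routine is a plain RFC 1071 sum over a flat buffer of at most 64 KB rather than an mbuf chain. The idea it shows is only the decision itself: the transport layer marks the packet as still needing a checksum, and the output path computes it in software when the chosen interface cannot.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; not the kernel's actual names or values. */
#define PKT_NEEDS_TCP_CSUM 0x01u /* transport layer deferred the checksum */
#define IF_CAP_TCP_CSUM    0x01u /* interface can checksum TCP in hardware */

/* Hypothetical flat packet; the real stack uses mbuf chains. */
struct pkt {
    uint8_t  *data;       /* bytes covered by the checksum */
    size_t    len;        /* number of bytes in data */
    uint32_t  csum_flags; /* pending checksum work */
    uint16_t  csum;       /* finished checksum is stored here */
};

/* RFC 1071 Internet checksum over a byte buffer (big-endian 16-bit words). */
static uint16_t inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)               /* odd trailing byte is padded with zero */
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)           /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Called on the output path once the interface is known.
 * Returns 1 if the checksum was computed in software, 0 if nothing
 * was pending or the work was left to the hardware.
 */
static int output_fixup_cksum(struct pkt *m, uint32_t if_caps)
{
    if ((m->csum_flags & PKT_NEEDS_TCP_CSUM) == 0)
        return 0;               /* nothing was deferred */
    if (if_caps & IF_CAP_TCP_CSUM)
        return 0;               /* leave the flag set for the NIC */
    m->csum = inet_cksum(m->data, m->len);
    m->csum_flags &= ~PKT_NEEDS_TCP_CSUM;
    return 1;
}
```

Calling output_fixup_cksum(&m, 0) models an interface without checksum offload, so the sum is computed immediately; passing IF_CAP_TCP_CSUM leaves the packet untouched for the hardware to finish.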