Date: Fri, 27 May 2016 07:32:53 -0700 From: John Baldwin <jhb@freebsd.org> To: "Bjoern A. Zeeb" <bz@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r300764 - head/sys/netinet Message-ID: <2386120.9zOMGESfe3@ralph.baldwin.cx> In-Reply-To: <EF74E59A-8842-4BAE-90BE-EEE00F2745A5@FreeBSD.org> References: <201605261835.u4QIZblp066704@repo.freebsd.org> <EF74E59A-8842-4BAE-90BE-EEE00F2745A5@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Friday, May 27, 2016 12:14:28 AM Bjoern A. Zeeb wrote: > > > On 26 May 2016, at 18:35 , John Baldwin <jhb@FreeBSD.org> wrote: > > > > Author: jhb > > Date: Thu May 26 18:35:37 2016 > > New Revision: 300764 > > URL: https://svnweb.freebsd.org/changeset/base/300764 > > > > Log: > > Don't reuse the source mbuf in tcp_respond() if it is not writable. > > > > Not all mbufs passed up from device drivers are M_WRITABLE(). In > > particular, the Chelsio T4/T5 driver uses a feature called "buffer packing" > > to receive multiple frames in a single receive buffer. The mbufs for > > these frames all share the same external storage so are treated as > > read-only by the rest of the stack when multiple frames are in flight. > > Previously tcp_respond() would blindly overwrite read-only mbufs when > > INVARIANTS was disabled or panic with an assertion failure if INVARIANTS > > was enabled. Note that the new case is a bit of a mix of the two other > > cases in tcp_respond(). The TCP and IP headers must be copied explicitly > > into the new mbuf instead of being inherited (similar to the m == NULL > > case), but the addresses and ports must be swapped in the reply (similar > > to the m != NULL case). > > Is the same true for ICMP(v6) replies? Hmm, icmp_error() always allocates a new packet, but icmp_respond() does not. There's no assertion to trip in that case. :-/ However, even if this change really isn't quite "perfect". Specifically, with "buffer packing" the multiple mbufs sharing a single backing buffer via m_ext are all using non-overlapping regions of that buffer, and each mbuf does have exclusive access to its portion of the buffer. You just can't grow it. So, if a reuse does not extend the buffer for these particular mbufs then it is actually ok. (So icmp_respond() is actually ok, and tcp_respond() is also probably ok in practice.) However, we don't have any way to distinguish this use case from multiple mbufs sharing possibly-overlapping regions of a buffer. Fixing icmp_respond() is probably simpler as you can just m_dup() if the source is not writable and then m_freem() the original and reuse the clone as 'm' for the rest of the function. -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?2386120.9zOMGESfe3>