Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 27 May 2016 07:32:53 -0700
From:      John Baldwin <jhb@freebsd.org>
To:        "Bjoern A. Zeeb" <bz@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r300764 - head/sys/netinet
Message-ID:  <2386120.9zOMGESfe3@ralph.baldwin.cx>
In-Reply-To: <EF74E59A-8842-4BAE-90BE-EEE00F2745A5@FreeBSD.org>
References:  <201605261835.u4QIZblp066704@repo.freebsd.org> <EF74E59A-8842-4BAE-90BE-EEE00F2745A5@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Friday, May 27, 2016 12:14:28 AM Bjoern A. Zeeb wrote:
> 
> > On 26 May 2016, at 18:35 , John Baldwin <jhb@FreeBSD.org> wrote:
> > 
> > Author: jhb
> > Date: Thu May 26 18:35:37 2016
> > New Revision: 300764
> > URL: https://svnweb.freebsd.org/changeset/base/300764
> > 
> > Log:
> >  Don't reuse the source mbuf in tcp_respond() if it is not writable.
> > 
> >  Not all mbufs passed up from device drivers are M_WRITABLE().  In
> >  particular, the Chelsio T4/T5 driver uses a feature called "buffer packing"
> >  to receive multiple frames in a single receive buffer.  The mbufs for
> >  these frames all share the same external storage so are treated as
> >  read-only by the rest of the stack when multiple frames are in flight.
> >  Previously tcp_respond() would blindly overwrite read-only mbufs when
> >  INVARIANTS was disabled or panic with an assertion failure if INVARIANTS
> >  was enabled.  Note that the new case is a bit of a mix of the two other
> >  cases in tcp_respond().  The TCP and IP headers must be copied explicitly
> >  into the new mbuf instead of being inherited (similar to the m == NULL
> >  case), but the addresses and ports must be swapped in the reply (similar
> >  to the m != NULL case).
> 
> Is the same true for ICMP(v6) replies?

Hmm, icmp_error() always allocates a new packet, but icmp_respond() does not.
There's no assertion to trip in that case. :-/

However, even if this change really isn't quite "perfect".  Specifically,
with "buffer packing" the multiple mbufs sharing a single backing buffer via
m_ext are all using non-overlapping regions of that buffer, and each mbuf does
have exclusive access to its portion of the buffer.  You just can't grow it.
So, if a reuse does not extend the buffer for these particular mbufs then it
is actually ok.  (So icmp_respond() is actually ok, and tcp_respond() is also
probably ok in practice.)  However, we don't have any way to distinguish this
use case from multiple mbufs sharing possibly-overlapping regions of a
buffer.

Fixing icmp_respond() is probably simpler as you can just m_dup() if the source
is not writable and then m_freem() the original and reuse the clone as 'm' for
the rest of the function.

-- 
John Baldwin



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?2386120.9zOMGESfe3>