From owner-svn-src-head@freebsd.org Fri May 27 14:43:52 2016 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8C436B4B82D; Fri, 27 May 2016 14:43:52 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6D9621704; Fri, 27 May 2016 14:43:52 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 56DD9B958; Fri, 27 May 2016 10:43:51 -0400 (EDT) From: John Baldwin To: "Bjoern A. Zeeb" Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r300764 - head/sys/netinet Date: Fri, 27 May 2016 07:32:53 -0700 Message-ID: <2386120.9zOMGESfe3@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.2-STABLE; KDE/4.14.3; amd64; ; ) In-Reply-To: References: <201605261835.u4QIZblp066704@repo.freebsd.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 27 May 2016 10:43:51 -0400 (EDT) X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 May 2016 14:43:52 -0000 On Friday, May 27, 2016 12:14:28 AM Bjoern A. Zeeb wrote: > > > On 26 May 2016, at 18:35 , John Baldwin wrote: > > > > Author: jhb > > Date: Thu May 26 18:35:37 2016 > > New Revision: 300764 > > URL: https://svnweb.freebsd.org/changeset/base/300764 > > > > Log: > > Don't reuse the source mbuf in tcp_respond() if it is not writable. > > > > Not all mbufs passed up from device drivers are M_WRITABLE(). In > > particular, the Chelsio T4/T5 driver uses a feature called "buffer packing" > > to receive multiple frames in a single receive buffer. The mbufs for > > these frames all share the same external storage so are treated as > > read-only by the rest of the stack when multiple frames are in flight. > > Previously tcp_respond() would blindly overwrite read-only mbufs when > > INVARIANTS was disabled or panic with an assertion failure if INVARIANTS > > was enabled. Note that the new case is a bit of a mix of the two other > > cases in tcp_respond(). The TCP and IP headers must be copied explicitly > > into the new mbuf instead of being inherited (similar to the m == NULL > > case), but the addresses and ports must be swapped in the reply (similar > > to the m != NULL case). > > Is the same true for ICMP(v6) replies? Hmm, icmp_error() always allocates a new packet, but icmp_respond() does not. There's no assertion to trip in that case. :-/ However, even if this change really isn't quite "perfect". Specifically, with "buffer packing" the multiple mbufs sharing a single backing buffer via m_ext are all using non-overlapping regions of that buffer, and each mbuf does have exclusive access to its portion of the buffer. You just can't grow it. So, if a reuse does not extend the buffer for these particular mbufs then it is actually ok. (So icmp_respond() is actually ok, and tcp_respond() is also probably ok in practice.) However, we don't have any way to distinguish this use case from multiple mbufs sharing possibly-overlapping regions of a buffer. Fixing icmp_respond() is probably simpler as you can just m_dup() if the source is not writable and then m_freem() the original and reuse the clone as 'm' for the rest of the function. -- John Baldwin