From owner-freebsd-net Fri May 31 13:45:29 2002 Delivered-To: freebsd-net@freebsd.org Received: from InterJet.dellroad.org (adsl-63-194-81-26.dsl.snfc21.pacbell.net [63.194.81.26]) by hub.freebsd.org (Postfix) with ESMTP id 6440537B413 for ; Fri, 31 May 2002 13:45:02 -0700 (PDT) Received: from arch20m.dellroad.org (arch20m.dellroad.org [10.1.1.20]) by InterJet.dellroad.org (8.9.1a/8.9.1) with ESMTP id NAA58363; Fri, 31 May 2002 13:26:07 -0700 (PDT) Received: (from archie@localhost) by arch20m.dellroad.org (8.11.6/8.11.6) id g4VKP9t02205; Fri, 31 May 2002 13:25:09 -0700 (PDT) (envelope-from archie) From: Archie Cobbs Message-Id: <200205312025.g4VKP9t02205@arch20m.dellroad.org> Subject: Re: m_split() considered harmful In-Reply-To: "from Julian Elischer at May 31, 2002 12:54:39 pm" To: Julian Elischer Date: Fri, 31 May 2002 13:25:09 -0700 (PDT) Cc: Bosko Milekic , Archie Cobbs , freebsd-net@FreeBSD.ORG X-Mailer: ELM [version 2.4ME+ PL88 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII Sender: owner-freebsd-net@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Julian Elischer writes: > '90-'95 under MACH/BSD and when I moved the code to FreeBSD1.x it cam > along.. there was no M_LEADINGSPACE/M_TRAILINGSPACE macro at that time.. > I did it in my code explicitly. > > It was not used in standard code because only in my own protocol stack > did I know that the damned object was not shared and writable.. > This has improved with time.. > > Having the size set to 0 however stopped users from hitting > cases where the WRITABILITY of the ext objext has not been correctly > tracked as it always returns "not enough space" (sometimes -ve). > > If we change this we need to audit the WRITABILTY.. > e.g. it is not checked in M_TRAILINGSPACE It's not clear whether the caller of M_TRAILINGSPACE()/M_LEADINGSPACE() is responsible for checking for writability, or the macros themselves. It seems to make more sense that the caller would be responsible... why would you call M_TRAILINGSPACE() unless you wanted to write something in there? I agree that we should audit for writability, however, the only case where this could become an issue with this patch is if somewhere a call to m_split() is made, followed by a call to M_TRAILINGSPACE(), followed by a write to the mbuf data area because M_TRAILINGSPACE() > 0. In fact, this would have to happen to 2 or more copies of the cluster for there to be a problem. I'd say this is very unlikely, but still possible. Any *other* use of the mbuf as writable after a call to m_split() is already an existing bug and is not made any worse by this patch. As a temporary saftey measure, I'll add M_WRITABLE(m) into the M_TRAILINGSPACE() macro. However, I see this as a temporary hack; the correct fix is to put the burden of writability on the caller. After all, M_TRAILINGSPACE() doesn't modify the mbuf data! That is, what we really need is a more general audit for code that writes into mbufs that might be read-only -- and, as one special case of tha, code that calls M_TRAILINGSPACE()/M_LEADINGSPACE() before writing into an mbuf. FYI, any easy way to do this would be to add const'ness to mtod(): #define mtod(m, t) ((const t)((m)->m_data)) and then look for GCC warnings. Any takers?? :-) Updated patch below. -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com Index: sys/mbuf.h =================================================================== RCS file: /home/ncvs/src/sys/sys/mbuf.h,v retrieving revision 1.89 diff -u -r1.89 mbuf.h --- sys/mbuf.h 5 Feb 2002 02:00:56 -0000 1.89 +++ sys/mbuf.h 31 May 2002 20:24:41 -0000 @@ -344,6 +344,9 @@ /* * Compute the amount of space available * before the current start of data in an mbuf. + * + * The M_WRITABLE() is a temporary, conservative safety measure: the burden + * of checking writability of the mbuf data area rests solely with the caller. */ #define M_LEADINGSPACE(m) \ ((m)->m_flags & M_EXT ? \ @@ -354,10 +357,14 @@ /* * Compute the amount of space available * after the end of data in an mbuf. + * + * The M_WRITABLE() is a temporary, conservative safety measure: the burden + * of checking writability of the mbuf data area rests solely with the caller. */ #define M_TRAILINGSPACE(m) \ - ((m)->m_flags & M_EXT ? (m)->m_ext.ext_buf + \ - (m)->m_ext.ext_size - ((m)->m_data + (m)->m_len) : \ + ((m)->m_flags & M_EXT ? \ + (M_WRITABLE(m) ? (m)->m_ext.ext_buf + (m)->m_ext.ext_size \ + - ((m)->m_data + (m)->m_len) : 0) : \ &(m)->m_dat[MLEN] - ((m)->m_data + (m)->m_len)) /* Index: kern/uipc_mbuf.c =================================================================== RCS file: /home/ncvs/src/sys/kern/uipc_mbuf.c,v retrieving revision 1.91 diff -u -r1.91 uipc_mbuf.c --- kern/uipc_mbuf.c 12 Apr 2002 00:01:50 -0000 1.91 +++ kern/uipc_mbuf.c 31 May 2002 20:24:45 -0000 @@ -560,6 +560,11 @@ * Partition an mbuf chain in two pieces, returning the tail -- * all but the first len0 bytes. In case of failure, it returns NULL and * attempts to restore the chain to its original state. + * + * Note that the resulting mbufs might be read-only, because the new + * mbuf can end up sharing an mbuf cluster with the original mbuf if + * the "breaking point" happens to lie within a cluster mbuf. Use the + * M_WRITABLE() macro to check for this case. */ struct mbuf * m_split(struct mbuf *m0, int len0, int wait) @@ -609,7 +614,6 @@ n->m_flags |= M_EXT; n->m_ext = m->m_ext; MEXT_ADD_REF(m); - m->m_ext.ext_size = 0; /* For Accounting XXXXXX danger */ n->m_data = m->m_data + len; } else { bcopy(mtod(m, caddr_t) + len, mtod(n, caddr_t), remain); To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-net" in the body of the message