Date: Mon, 28 Aug 2000 10:42:00 -0700 (PDT) From: Archie Cobbs <archie@whistle.com> To: freebsd-arch@freebsd.org Subject: Proposal to clarify mbuf handling rules (fwd) Message-ID: <200008281742.KAA69859@bubba.whistle.com>
index | next in thread | raw e-mail
David O'Brien correctly pointed out that I should have sent
the email below to -arch and -net instead of -current and
-net. So at the risk of repetition for some folks, here it is
again for anybody on -arch but not -net or -current. Followups
should go to -net I suppose. Sorry for the confusion.
Thanks,
-Archie
> From owner-freebsd-net@FreeBSD.ORG Sun Aug 27 14:38:34 2000
> From: Archie Cobbs <archie@whistle.com>
> Message-Id: <200008272125.OAA66159@bubba.whistle.com>
> Subject: Proposal to clarify mbuf handling rules
> To: freebsd-net@FreeBSD.ORG, freebsd-current@FreeBSD.ORG
>
> In looking at some of the problems relating to divert, bridging,
> etc., it's apparent that lots of code is breaking one of the rules
> for handling mbufs: that mbuf data can sometimes be read-only.
>
> Each mbuf may be either a normal mbuf or a cluster mbuf (if the
> mbuf flags contains M_EXT). Cluster mbufs point to an entire page
> of memory, and this page of memory may be shared by more than one
> cluster mbuf (see m_copypacket()). This effectively makes the mbuf
> data read-only, because a change to one mbuf affects all of the
> mbufs, not just the one you're working on. There have been (and
> still are) several FreeBSD bugs because of this subtlety.
>
> A test for an mbuf being "read-only" is:
>
> if ((m->m_flags & M_EXT) != 0 && MEXT_IS_REF(m)) ...
>
> So an implicit rule for handling mbufs is that they should be
> treated as read-only unless/until you either check that it's not,
> and/or pullup a new (non-cluster) mbuf that covers the data area
> that you're going to modify.
>
> However, many routines that take an mbuf parameter assume that the
> mbuf given to them is modifiable and proceed to write all over it.
> A few examples are: ip_input(), in_arpinput(), tcp_input(),
> divert_packt(), etc.
>
> In practice, this is often not a problem because the mbuf is actually
> modifiable (because there are no other references to it). But this
> is just because we've been lucky. When you throw things like bridging,
> dummynet, divert, and netgraph into the mix, not to mention other
> site-specific hacks, then these assumptions no longer hold. At the
> minimum these assumptions should be clearly commented, but that's
> not even the case right now.
>
> Routines that don't change any data, or that only do m_pullup(),
> M_PREPEND(), m_adj(), etc. don't have a problem.
>
> So I'd like to propose a mini-project to clarify and fix this problem.
> Here is the propsal:
>
> 1. All routines that take an mbuf as an argument must not assume
> that any mbuf in the chain is modifyable, unless expclicitly
> and clearly documented (in the comment at the top of the function)
> as doing so.
>
> 2. For routines that don't modify data, incorporate liberal use
> of the "const" keyword to make this clear. For example, change
>
> struct ip *ip;
> ip = mtod(m, struct ip *);
>
> to:
>
> const struct ip *ip;
> ip = mtod(m, const struct ip *);
>
> 3. For any routines that do need to modify mbuf data, but don't
> assume anything about the mbuf, alter those routines to do
> an m_pullup() when necessary to make the data are they are
> working on modifiable. For example:
>
> struct ip *ip;
>
> /* Pull up IP header */
> if (m->m_len < sizeof(*ip) && !(m = m_pullup(m, sizeof(*ip))))
> return;
> ip = mtod(m, struct ip *);
>
> #ifdef NEW_CODE_BEING_ADDED
> /* Make sure the IP header area is writable */
> if ((m->m_flags & M_EXT) != 0 && MEXT_IS_REF(m)) {
> /* m_pullup() *always* prepends a fresh, non-cluster mbuf */
> if ((m = m_pullup(m, sizeof(struct ip))) == 0)
> return;
> ip = mtod(m, struct ip *);
> }
> #endif
>
> /* Modify the header */
> ip->ip_len = 123;
> ...
>
> The only negative is the addition of the NEW_CODE_BEING_ADDED code
> in the relevant places. In practice this test will usually fail,
> as most mbufs are modifiable, so there should be no noticable
> slowdown. However, robustness should improve, especially when
> bridging, diverting, etc.
>
> What do people think? If this is generally agreeable I'll try to
> work on putting together a patch set for review.
___________________________________________________________________________
Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message
home |
help
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200008281742.KAA69859>
