Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 27 Nov 2006 14:56:03 +0300
From:      Yar Tikhiy <yar@comp.chem.msu.su>
To:        Sam Leffler <sam@errno.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: m_copy & if_simloop
Message-ID:  <20061127115602.GB77085@comp.chem.msu.su>
In-Reply-To: <4569E158.5070800@errno.com>
References:  <20061126180306.GA64912@comp.chem.msu.su> <4569E158.5070800@errno.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, Nov 26, 2006 at 10:47:52AM -0800, Sam Leffler wrote:
> Yar Tikhiy wrote:
> > Hi folks,
> > 
> > A friend user reported to me that rwhod wouldn't work in CURRENT
> > due to broken outgoing packets.  Here's an example:
> > 
> > 16:15:28.212810 IP truncated-ip - 6865 bytes missing! (tos 0x0, ttl  64, id 28554, offset 0, flags [none], proto: UDP (17), length: 7169, bad cksum 11c (->c64b)!) 10.10.10.4.513 > 10.10.10.255.513: UDP, length 276
> >         0x0000:  4500 1c01 6f8a 0000 4011 011c 0a0a 0a04  E...o...@.......
> >                       ^^^^                ^^^^ broken fields
> >         0x0010:  0a0a 0aff 0201 0201 011c 0000 0101 0000  ................
> >         0x0020:  4565 9ef0 0000 0000 6467 0000 0000 0000  Ee......dg......
> >         0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >         0x0040:  0000 0000 0000 0000 0000 001a 0000 000e  ................
> >         0x0050:  0000 0005 4564 a5e7 7474 7976 3000 0000  ....Ed..ttyv0...
> >         0x0060:  726f 6f74 0000 0000 4565 9d4a 0000 01a6  root....Ee.J....
> >         0x0070:  7474 7976 3100 0000 726f 6f74 0000 0000  ttyv1...root....
> >         0x0080:  4565 9d4d 0000 000c 7474 7976 3200 0000  Ee.M....ttyv2...
> >         0x0090:  726f 6f74 0000 0000 4565 9d4f 0000 0099  root....Ee.O....
> >         0x00a0:  7474 7976 3300 0000 726f 6f74 0000 0000  ttyv3...root....
> >         0x00b0:  4565 9d52 0000 019e 7474 7976 3400 0000  Ee.R....ttyv4...
> >         0x00c0:  726f 6f74 0000 0000 4565 9d54 0000 019c  root....Ee.T....
> >         0x00d0:  7474 7976 3500 0000 726f 6f74 0000 0000  ttyv5...root....
> >         0x00e0:  4565 9d59 0000 0198 7474 7976 3600 0000  Ee.Y....ttyv6...
> >         0x00f0:  726f 6f74 0000 0000 4565 9d5b 0000 0195  root....Ee.[....
> >         0x0100:  7474 7976 3700 0000 726f 6f74 0000 0000  ttyv7...root....
> >         0x0110:  4565 9d5e 0000 0000 7474 7970 3100 0000  Ee.^....ttyp1...
> >         0x0120:  7961 7200 0000 0000 4565 8361 0000 04b2  yar.....Ee.a....
> > 
> > BTW, the problem manifests itself only if the packet is longer than
> > 256 bytes.
> > 
> > The problem seems to stem from the following.  In ether_output(),
> > the broadcast packet is copied and looped back up the stack, now
> > via if_simloop().  The copy has been made with m_copy() since 4.4BSD.
> > I can't tell about the old days, but today m_copy() alias m_copym()
> > will copy mbufs but not mbuf clusters, which results in an effectively
> > read-only copy.  Such a copy must not be passed up the stack because
> > the stack is free to change it and thus destroy the original.  For
> > a long time, enough leading bytes were in plain mbuf(s) for the bug
> > to stay unnoticed.  However, the pattern changed in CURRENT some
> > day and -- here we are.
> > 
> > The problem can be cured by using m_dup() in place of m_copy()
> > (verified).
> > 
> > Is my analysis correct?
> 
> Sounds likely.  The read-only'ness definitely.

Thanks!

> > If so, here's an idea of a general fix.  Several source files do
> > the following:
> > 
> > 	struct mbuf *mcopy = m_copy(m, 0, M_COPYALL);
> > 	/* some even don't check mcopy for NULL here! */
> > 	if_simloop(ifp, mcopy, family, hdrlen);
> > 
> > It's common code, so just a flag to if_simloop() cound be introduced
> > meaning "m_dup() the packet properly".  E.g.:
> > 
> > 	if_simloop(ifp, m, family, hdrlen, M_DUP);
> > 
> > In STABLE, M_COPYALL can be added to hdrlen instead to preserve the
> > ABI.  M_COPYALL is defined as 1000000000 now, which allows for the
> > trick.
> > 
> > Comments?  Thanks!
> 
> What you suggest seems ok.  You might also look at m_unshare which was
> added for similar purpose but is not exactly what you want.  It may be
> possible to combine m_copy+m_unshare code (not calls) to create the new
> mbuf chain more efficiently.

I hope I understood your idea right:  Currently, m_dup() will produce
some copy of a mbuf chain; OTOH, as we have to copy the data around
anyway, it's a good chance to optimize its layout as m_unshare()
does.   Is this your point?  In fact, m_dup() just puts the data
in mbuf clusters instead of producing an identical, or otherwise
suboptimal, copy of the original mbuf chain, so it should be good for
the purpose.

-- 
Yar



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061127115602.GB77085>