Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 10 Jul 2004 10:27:23 -0400 (EDT)
From:      Robert Watson <rwatson@freebsd.org>
To:        Daniel Lang <dl@leo.org>
Cc:        current@freebsd.org
Subject:   Re: panic: m_copym, length > size of mbuf chain
Message-ID:  <Pine.NEB.3.96L.1040710092144.19581J-100000@fledge.watson.org>
In-Reply-To: <20040710105017.GA61243@atrbg11.informatik.tu-muenchen.de>

next in thread | previous in thread | raw e-mail | index | archive | help

On Sat, 10 Jul 2004, Daniel Lang wrote:

> So I come back to the issue. As I already wrote, I guess I can rule out
> hardware problems now. I did a very thorough test with the Dell
> diagnosis utilities which showed no problems. 

Thanks!

> Also, after John's patch I did not see any WITNESS related problems (so
> far) again. But I had the m_copy panic again (see subject). This time I
> did file a PR and did some more detailed gdb analysis. It is all
> documented at:
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/68889
> 
> I am puzzled, because the stack frame on entering m_copym has 0x0 as
> first argument (m), however in the previous frame when m_copy() is
> called, the struct mbuf* argument is valid. 
> 
> Ok, I just realized that there is a difference m_copy()  and m_copym() 
> are apparently different functions. Is this a makro/#define discrepancy
> it seems that that m_copym() is the function which is called in this
> line of code. 
> 
> Ah, I found it: 
> 
> sys/mbuf.h:#define m_copy(m, o, l) m_copym((m), (o), (l), M_DONTWAIT) 
> 
> so, the puzzle remains, since the arguments passed are kept, except that
> M_DONTWAIT flag is added.
> 
> Is this a trashed stack? 

Possibly, but notice that the m_copym() function modifies its copy of 'm'
in the stack as part of its work -- it uses 'm' to iterate the mbuf chain
passed in in order to move to the necessary starting offset for the copy.
Note that the requested offset ('off0') is 737, and the requested 'len' is
at least 1222, so the loop starting at line 369 will walk until it either
gets far enough or the "offset > size" assertion triggers:

        while (off > 0) {
                KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain"));
                if (off < m->m_len)
                        break;
                off -= m->m_len;
                m = m->m_next;
        }

Since that assertion didn't trigger, we can assume m_copym() successfully
walked at least 'off0' (737) bytes.  The problem appears to be that it ran
out of mbufs in which to find data to copy, as it hit the end of the chain
(m == NULL):

        while (len > 0) {
                if (m == NULL) {
                        KASSERT(len == M_COPYALL,
                            ("m_copym, length > size of mbuf chain"));
                        break;
                }

So the initial conclusion is that the caller requested that more data be
copied from the chain than is actually present in the chain.  This
suggests a bug in socket buffer management or the TCP code.  It's
interesting to note that the socket buffer believes it contains less than
the requested length -- 'so_snd.sb_mbcnt' is 1536, which is arguably less
than 737 + 1222 (although we don't know, I think, if it's iterated or not
and therefore decreased the value of 'len').  Could you print the value of
'top' in the m_copym() stack?  That will tell us if it's on the first mbuf
or not.

It sounds like the socket buffer state may be inconsistent with the TCP
PCB state, or that the expectations in tcp_offset() are wrong.  I've CC'd
Paul because he's had his hands in the new SACK code that was merged, and
it has its hands in that bit of the output code.  Here are some things you
might want to try:

(1) Try running with TCP SACK disabled.  Set the
    'net.inet.tcp.sack.enable' sysctl to 0 to try this.

(2) Try adding some assertions just before the copy to m_copy() in
    tcp_output().  I'd suggest something like the following:

Index: tcp_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
retrieving revision 1.95
diff -u -r1.95 tcp_output.c
--- tcp_output.c	23 Jun 2004 21:04:37 -0000	1.95
+++ tcp_output.c	10 Jul 2004 14:12:29 -0000
@@ -740,6 +740,15 @@
 #endif
 		m->m_data += max_linkhdr;
 		m->m_len = hdrlen;
+		if (off + len > so->so_snd.sb_cc) {
+			printf("tcp_output: not enough data to copy\n");
+			printf("off=%d\n", off);
+			printf("len=%ld\n", len);
+			printf("so->so_snd.sb_cc=%d\n", so->so_snd.sb_cc);
+			printf("m_length(so->so_snd.sb_mb, NULL) == %d\n",
+			    m_length(so->so_snd.sb_mb, NULL));
+			panic("Down she goes...");
+		}
 		if (len <= MHLEN - hdrlen - max_linkhdr) {
 			m_copydata(so->so_snd.sb_mb, off, (int) len,
 			    mtod(m, caddr_t) + hdrlen);

    These printfs are oriented at making sure the socket buffer state is
    consistent, and don't attempt to dump the TCP state used to construct
    'len' or 'off'.  If the buffer is consistent (i.e., sb_cc ==
    m_length()), then the next logical thing to do is figure out how
    (len + off) became greater than the available buffer space.  If you
    look up a bit higher in tcp_output(), you'll see the values being set,
    trimmed, etc, based on various values in the tcpcb.

(3) If the socket buffer is inconsistent -- i.e., the length of the buffer
    in the socket buffer meta-data doesn't match the actual length of the
    mbufs in the buffer -- try compiling in "options SOCKBUF_DEBUG", which
    turns on additional socket buffer diagnostics. 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.NEB.3.96L.1040710092144.19581J-100000>