From owner-freebsd-hackers  Fri Oct  8 11:33:53 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from bubba.whistle.com (bubba.whistle.com [207.76.205.7])
	by hub.freebsd.org (Postfix) with ESMTP id 6864E14A29
	for <freebsd-hackers@freebsd.org>; Fri,  8 Oct 1999 11:33:50 -0700 (PDT)
	(envelope-from archie@whistle.com)
Received: (from archie@localhost)
	by bubba.whistle.com (8.9.2/8.9.2) id LAA92201;
	Fri, 8 Oct 1999 11:33:03 -0700 (PDT)
From: Archie Cobbs <archie@whistle.com>
Message-Id: <199910081833.LAA92201@bubba.whistle.com>
Subject: Re: 3.3-STABLE panic in m_copym
In-Reply-To: <25159.939366752@verdi.nethelp.no> from "sthaug@nethelp.no" at "Oct 8, 1999 09:12:32 am"
To: sthaug@nethelp.no
Date: Fri, 8 Oct 1999 11:33:03 -0700 (PDT)
Cc: freebsd-hackers@freebsd.org
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

sthaug@nethelp.no writes:
> I have a Compaq Proliant 3000 (2 x PII-333) running 3.3-STABLE which has
> crashed several times with the following backtrace:
> 
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:285
> #1  0xc0144299 in panic (fmt=0xc023eb04 "m_copym") at ../../kern/kern_shutdown.c:446
> #2  0xc015ac7e in m_copym (m=0xc141ae80, off0=10788, len=1216, wait=1) at ../../kern/uipc_mbuf.c:435
> #3  0xc019286a in tcp_output (tp=0xd0be8960) at ../../netinet/tcp_output.c:505
> #4  0xc0194106 in tcp_usr_send (so=0xd0ae9640, flags=0, m=0xc1420680, nam=0x0, control=0x0, p=0xd0e95b20) at ../../netinet/tcp_usrreq.c:395
> #5  0xc015c4b2 in sosend (so=0xd0ae9640, addr=0x0, uio=0xd0ee5f10, top=0xc1420680, control=0x0, flags=0, p=0xd0e95b20)
>     at ../../kern/uipc_socket.c:530
> #6  0xc01525dc in soo_write (fp=0xc210c600, uio=0xd0ee5f10, cred=0xc1fce600, flags=0) at ../../kern/sys_socket.c:82
> #7  0xc014f46a in dofilewrite (p=0xd0e95b20, fp=0xc210c600, fd=7, buf=0x806f0f4, nbyte=8192, offset=-1, flags=0)
>     at ../../kern/sys_generic.c:363
> #8  0xc014f373 in write (p=0xd0e95b20, uap=0xd0ee5f94) at ../../kern/sys_generic.c:298
> #9  0xc021f39b in syscall (frame={tf_es = 39, tf_ds = -1078001625, tf_edi = 671806342, tf_esi = 7, tf_ebp = -1077949676, 
>       tf_isp = -789684252, tf_ebx = 0, tf_edx = 434759, tf_ecx = 0, tf_eax = 4, tf_trapno = 7, tf_err = 2, tf_eip = 134533700, tf_cs = 31, 
>       tf_eflags = 518, tf_esp = -1077949700, tf_ss = 39}) at ../../i386/i386/trap.c:1100
> #10 0xc020b2ac in Xint0x80_syscall ()
> 
> The panic is the following loop in m_copym:
> 
> 	while (off > 0) {
> 		if (m == 0)
> 			panic("m_copym");
> 		if (off < m->m_len)
> 			break;
> 		off -= m->m_len;
> 		m = m->m_next;
> 	}
> 
> so it seems to be running off the end of the mbuf chain before having
> verified the whole length. Following the m_next pointers, starting with
> the mbuf pointer from the calling routine, I get a total of 5 mbufs in
> this chain, with the following lengths:
> 
> 0xc141ae80      2048
> 0xc13fef80      2008
> 0xc1446e00      2048
> 0xc147fe80      872
> 0xc1420680      1216

This may or may not be helpful, but...

Packet mbufs carry redundant length information: the header mbuf holds
the total packet length (m->m_pkthdr.len), which must equal the sum of
the lengths of the individual mbufs in the chain (each mbuf's m->m_len).
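
To make that invariant concrete, here is a rough userland sketch of the
check I have in mind. The struct below is a simplified stand-in I made up
for illustration (the real struct mbuf has many more fields and reaches
m_pkthdr through a union), and chain_len() and pkthdr_consistent() are
hypothetical helper names, not kernel functions.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's mbuf: only the fields discussed
 * here. This layout is an assumption for illustration, not the real one. */
struct pkthdr {
	int len;			/* total length of the whole packet */
};

struct mbuf {
	struct mbuf *m_next;		/* next mbuf in this chain */
	int m_len;			/* bytes of data in this mbuf */
	struct pkthdr m_pkthdr;		/* only meaningful in the first mbuf */
};

/* Hypothetical helper: sum m_len over every mbuf in the chain. */
int
chain_len(const struct mbuf *m)
{
	int total = 0;

	for (; m != NULL; m = m->m_next)
		total += m->m_len;
	return (total);
}

/* Hypothetical helper: 1 if the header total matches the chain, else 0. */
int
pkthdr_consistent(const struct mbuf *m0)
{
	return (m0 != NULL && m0->m_pkthdr.len == chain_len(m0));
}
```

A check along these lines, dropped in just before the failing call under
a debugging option, would at least show whether the two counts disagree
by the time the chain gets there.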

I think these numbers getting out of sync is a common source of bugs.
For example, code that builds an mbuf chain by concatenating mbufs
may forget to update the m->m_pkthdr.len field.
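
As an illustration of the kind of bookkeeping that gets forgotten, here
is a minimal sketch of an append that keeps the header total in sync.
Again, the struct is a cut-down stand-in of my own, and pkt_append() is a
hypothetical name, not an existing kernel routine.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's mbuf (assumed layout, for
 * illustration only). */
struct pkthdr {
	int len;			/* total length of the whole packet */
};

struct mbuf {
	struct mbuf *m_next;		/* next mbuf in this chain */
	int m_len;			/* bytes of data in this mbuf */
	struct pkthdr m_pkthdr;		/* only meaningful in the first mbuf */
};

/*
 * Hypothetical helper: append chain n to the packet headed by m0 and
 * update the header's total length to match. m0 must not be NULL.
 */
void
pkt_append(struct mbuf *m0, struct mbuf *n)
{
	struct mbuf *last = m0;
	struct mbuf *p;
	int added = 0;

	/* Find the current tail of the packet. */
	while (last->m_next != NULL)
		last = last->m_next;
	last->m_next = n;

	/* Count the bytes just linked in. */
	for (p = n; p != NULL; p = p->m_next)
		added += p->m_len;

	/* The step that is easy to forget. */
	m0->m_pkthdr.len += added;
}
```

If the last statement is left out, the chain itself is still fine, but
anything that trusts m_pkthdr.len will be working from a stale total.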

You might look at where the packet originated, i.e., in dofilewrite() (?)

-Archie

___________________________________________________________________________
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com

