From owner-freebsd-hackers Fri Oct 8 11:33:53 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from bubba.whistle.com (bubba.whistle.com [207.76.205.7])
    by hub.freebsd.org (Postfix) with ESMTP id 6864E14A29
    for ; Fri, 8 Oct 1999 11:33:50 -0700 (PDT)
    (envelope-from archie@whistle.com)
Received: (from archie@localhost)
    by bubba.whistle.com (8.9.2/8.9.2) id LAA92201;
    Fri, 8 Oct 1999 11:33:03 -0700 (PDT)
From: Archie Cobbs
Message-Id: <199910081833.LAA92201@bubba.whistle.com>
Subject: Re: 3.3-STABLE panic in m_copym
In-Reply-To: <25159.939366752@verdi.nethelp.no> from "sthaug@nethelp.no" at "Oct 8, 1999 09:12:32 am"
To: sthaug@nethelp.no
Date: Fri, 8 Oct 1999 11:33:03 -0700 (PDT)
Cc: freebsd-hackers@freebsd.org
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

sthaug@nethelp.no writes:
> I have a Compaq Proliant 3000 (2 x PII-333) running 3.3-STABLE which has
> crashed several times with the following backtrace:
>
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:285
> #1  0xc0144299 in panic (fmt=0xc023eb04 "m_copym") at ../../kern/kern_shutdown.c:446
> #2  0xc015ac7e in m_copym (m=0xc141ae80, off0=10788, len=1216, wait=1) at ../../kern/uipc_mbuf.c:435
> #3  0xc019286a in tcp_output (tp=0xd0be8960) at ../../netinet/tcp_output.c:505
> #4  0xc0194106 in tcp_usr_send (so=0xd0ae9640, flags=0, m=0xc1420680, nam=0x0, control=0x0, p=0xd0e95b20) at ../../netinet/tcp_usrreq.c:395
> #5  0xc015c4b2 in sosend (so=0xd0ae9640, addr=0x0, uio=0xd0ee5f10, top=0xc1420680, control=0x0, flags=0, p=0xd0e95b20)
>     at ../../kern/uipc_socket.c:530
> #6  0xc01525dc in soo_write (fp=0xc210c600, uio=0xd0ee5f10, cred=0xc1fce600, flags=0) at ../../kern/sys_socket.c:82
> #7  0xc014f46a in dofilewrite (p=0xd0e95b20, fp=0xc210c600, fd=7, buf=0x806f0f4, nbyte=8192, offset=-1, flags=0)
>     at ../../kern/sys_generic.c:363
> #8  0xc014f373 in write (p=0xd0e95b20, uap=0xd0ee5f94) at ../../kern/sys_generic.c:298
> #9  0xc021f39b in syscall (frame={tf_es = 39, tf_ds = -1078001625, tf_edi = 671806342, tf_esi = 7, tf_ebp = -1077949676,
>     tf_isp = -789684252, tf_ebx = 0, tf_edx = 434759, tf_ecx = 0, tf_eax = 4, tf_trapno = 7, tf_err = 2, tf_eip = 134533700, tf_cs = 31,
>     tf_eflags = 518, tf_esp = -1077949700, tf_ss = 39}) at ../../i386/i386/trap.c:1100
> #10 0xc020b2ac in Xint0x80_syscall ()
>
> The panic is the following loop in m_copym:
>
>     while (off > 0) {
>         if (m == 0)
>             panic("m_copym");
>         if (off < m->m_len)
>             break;
>         off -= m->m_len;
>         m = m->m_next;
>     }
>
> so it seems to be running off the end of the mbuf chain before having
> verified the whole length. Following the m_next pointers, starting with
> the mbuf pointer from the calling routine, I get a total of 5 mbufs in
> this chain, with the following lengths:
>
> 0xc141ae80  2048
> 0xc13fef80  2008
> 0xc1446e00  2048
> 0xc147fe80   872
> 0xc1420680  1216

This may or may not be helpful, but...

Packet mbufs contain redundant information: the header mbuf contains
the total length (m->m_pkthdr.len), which must be equal to the sum of
the lengths of the individual mbufs in the chain (m->m_len).

I think these numbers getting out of sync is a common source of bugs.
For example, code that builds an mbuf chain by concatenating mbufs
forgets to update the m->m_pkthdr.len field.

You might look at where the packet originated, i.e., in dofilewrite() (?)

-Archie

___________________________________________________________________________
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
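
A note on the numbers in the report: the five m_len values listed above
add up to 2048 + 2008 + 2048 + 872 + 1216 = 8192 bytes, which is less
than the off0=10788 passed to m_copym, so the offset walk exhausts the
chain before reaching the requested offset. The sketch below shows the
kind of consistency check the reply suggests. It is only an illustration
in user-space C: struct toy_mbuf, build_chain(), chain_len(),
chain_consistent() and offset_in_chain() are hypothetical names, not
kernel APIs, and the struct is a simplified stand-in for the real
struct mbuf.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *m_next;  /* next mbuf in the chain */
    int m_len;                /* bytes of data in this mbuf */
    int pkthdr_len;           /* total length; meaningful in the first mbuf only */
};

/* Link an array of toy mbufs into a chain with the given per-mbuf lengths. */
static void build_chain(struct toy_mbuf *v, const int *lens, size_t n,
                        int pkthdr_len)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        v[i].m_len = lens[i];
        v[i].m_next = (i + 1 < n) ? &v[i + 1] : NULL;
        v[i].pkthdr_len = 0;
    }
    v[0].pkthdr_len = pkthdr_len;
}

/* Sum m_len over every mbuf in the chain. */
static int chain_len(const struct toy_mbuf *m)
{
    int total = 0;
    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/* Return 1 if the header's total length matches the sum of the m_len fields. */
static int chain_consistent(const struct toy_mbuf *m)
{
    return m != NULL && m->pkthdr_len == chain_len(m);
}

/*
 * Same walk as the quoted m_copym loop, but returning 0 at the point
 * where the kernel code would call panic("m_copym").
 */
static int offset_in_chain(const struct toy_mbuf *m, int off)
{
    while (off > 0) {
        if (m == NULL)
            return 0;
        if (off < m->m_len)
            break;
        off -= m->m_len;
        m = m->m_next;
    }
    return 1;
}
```

Built with the five lengths from the report, offset_in_chain() returns 0
for an offset of 10788, matching the panic. Whether the real cause is a
stale m_pkthdr.len or some other length counter getting out of step with
the chain cannot be determined from the backtrace alone.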