Date: Sat, 10 Jul 2004 12:50:17 +0200
From: Daniel Lang <dl@leo.org>
To: Robert Watson <rwatson@freebsd.org>
Cc: current@freebsd.org
Subject: Re: panic: m_copym, length > size of mbuf chain
Message-ID: <20040710105017.GA61243@atrbg11.informatik.tu-muenchen.de>
In-Reply-To: <Pine.NEB.3.96L.1040707122259.37929D-100000@fledge.watson.org>
References: <20040707162154.GB45200@atrbg11.informatik.tu-muenchen.de> <Pine.NEB.3.96L.1040707122259.37929D-100000@fledge.watson.org>

Hi Robert,

Robert Watson wrote on Wed, Jul 07, 2004 at 12:24:59PM -0400:
[..]
> Just to try ruling out possibilities -- have you run an extensive set of
> hardware diagnostics? Most server class hardware ships with a decent
> diagnostics disk, and I'm sure we can find some for you in the event your
> hardware didn't come with some. While it's quite possibly a software
> problem, tracking hardware problems using software symptoms constitutes
> undesirable pain and so it wouldn't hurt to give that a spin. I remember
> seeing your earlier e-mails about running with WITNESS increasing the
> chances of pain -- this could be a bug in WITNESS as you suggest, or it
> could be that WITNESS increases the opportunities for a variety of locking
> related races by increasing the cost of lock/unlock operations.
[..]

So I come back to the issue. As I already wrote, I think I can rule
out hardware problems now: I ran a very thorough test with the Dell
diagnostic utilities, which showed no problems. Also, since John's
patch I have not seen any WITNESS-related problems again (so far).

But I had the m_copy panic again (see subject). This time I filed
a PR and did some more detailed gdb analysis. It is all documented at:

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/68889

I am puzzled, because the stack frame on entering m_copym() has 0x0
as its first argument (m), whereas in the previous frame, where
m_copy() is called, the struct mbuf * argument is valid.

OK, I just realized that there is a difference: m_copy() and
m_copym() are apparently different functions. Is this a
macro/#define discrepancy? It seems that m_copym() is the function
that is actually called in this line of code. Ah, I found it:

sys/mbuf.h:#define m_copy(m, o, l) m_copym((m), (o), (l), M_DONTWAIT)

So the puzzle remains, since the arguments are passed through
unchanged, except that the M_DONTWAIT flag is added.

Is this a trashed stack?

Cheers,
 Daniel
-- 
IRCnet: Mr-Spock - Cool people don't move, they just hang around.
Daniel Lang * dl@leo.org * ++49 89 289 18532 * http://www.leo.org/~dl/
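
To illustrate the macro relationship discussed in the message, here is a
minimal user-space sketch in C. It is not kernel code: struct mbuf is
reduced to a hypothetical two-field stand-in, m_copym() is a mock that only
records its arguments, M_DONTWAIT gets a placeholder value, and
struct copym_call, last_call and demo_m_copy() are names invented for this
illustration. Only the #define line is taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Placeholder value for this sketch only; the kernel headers define the real one. */
#define M_DONTWAIT 1

/* Simplified stand-in for the kernel's struct mbuf (hypothetical layout). */
struct mbuf {
	struct mbuf *m_next;
	int m_len;
};

/* Hypothetical record of the arguments the mock m_copym() received. */
struct copym_call {
	struct mbuf *m;
	int off;
	int len;
	int wait;
};

static struct copym_call last_call;

/* Mock of m_copym(): copies nothing, only records its arguments. */
static struct mbuf *
m_copym(struct mbuf *m, int off0, int len, int wait)
{
	last_call.m = m;
	last_call.off = off0;
	last_call.len = len;
	last_call.wait = wait;
	return (m);
}

/* The macro as quoted from sys/mbuf.h in the message. */
#define m_copy(m, o, l) m_copym((m), (o), (l), M_DONTWAIT)

/* Calls m_copy() and reports what m_copym() actually received. */
static struct copym_call
demo_m_copy(struct mbuf *m, int off, int len)
{
	/* The preprocessor turns this into m_copym((m), (off), (len), M_DONTWAIT). */
	(void)m_copy(m, off, len);
	printf("m_copym(m=%p, off=%d, len=%d, wait=%d)\n",
	    (void *)last_call.m, last_call.off, last_call.len, last_call.wait);
	return (last_call);
}
```

Calling demo_m_copy(&mb, 10, 20) from a main() with a local struct mbuf mb
shows the pointer, offset and length arriving in m_copym() unchanged, with
only the flag added. In this sketch, at least, the macro expansion alone
cannot turn a valid pointer into 0x0, and because m_copy() is a macro a
debugger has no separate m_copy frame to show.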