Date:      Thu, 25 Aug 2011 23:56:11 +0200
From:      Kip Macy <kip.macy@gmail.com>
To:        Charlie Martin <crmartin@sgi.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Where to ask about a 7.2 bug, and debugging sys/queue.h errors
Message-ID:  <CAJ7_N2iRaJ9%2BSFf%2BuujbtNJ97K=L_AdP6kYoxfwFij4CY%2Bwrgg@mail.gmail.com>
In-Reply-To: <4E56BB99.6030706@sgi.com>
References:  <4E56BB99.6030706@sgi.com>

On Thu, Aug 25, 2011 at 11:16 PM, Charlie Martin <crmartin@sgi.com> wrote:
> We're having a crash in some internal code running on FreeBSD 7.2
> (specifically 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE and yeah, I know it's
> quite a bit behind) in which after 18-30 hours of running load tests, the
> code panics with:
>
> panic: Bad link elm 0xffffff0044c09600 next->prev != elm
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0xffffffff8019119a = db_trace_self_wrapper+0x2a
> panic() at 0xffffffff80307c72 = panic+0x182
> devfs_populate_loop() at 0xffffffff802a43a8 = devfs_populate_loop+0x548
>
>
> First question: where's the most appropriate place to ask about this kind of
> bug on a back version.

Probably -stable. I don't know how many developers are still running
7. Most are on 8 at this point.

> Second: does this remind anyone of any bugs? Googling came up with a few
> somewhat similar things but hasn't provided much insight so far.

This panic is very common when list updates aren't adequately serialized.
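
To make the failure mode concrete, here is a minimal userland sketch of
the kind of consistency check behind a "next->prev != elm" message, plus
one way to serialize updates. Everything in it is an assumption for
illustration: struct node, struct list, link_ok() and the atomic_flag
spin lock are hypothetical stand-ins, not the sys/queue.h macros or the
locking devfs actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type; names are illustrative, not taken from devfs. */
struct node {
    struct node *next;
    struct node *prev;
};

struct list {
    struct node head;   /* sentinel: head.next is first, head.prev is last */
    atomic_flag lock;   /* tiny spin lock standing in for a kernel mutex */
};

void list_init(struct list *l)
{
    l->head.next = &l->head;
    l->head.prev = &l->head;
    atomic_flag_clear(&l->lock);
}

void list_lock(struct list *l)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set(&l->lock))
        ;
}

void list_unlock(struct list *l)
{
    atomic_flag_clear(&l->lock);
}

/* Returns 1 if elm's successor points back at elm, 0 otherwise. */
int link_ok(const struct node *elm)
{
    return elm->next->prev == elm;
}

void list_insert_tail(struct list *l, struct node *elm)
{
    assert(elm != NULL);
    list_lock(l);               /* all four link writes happen under the lock */
    elm->prev = l->head.prev;
    elm->next = &l->head;
    l->head.prev->next = elm;
    l->head.prev = elm;
    list_unlock(l);
}

void list_remove(struct list *l, struct node *elm)
{
    list_lock(l);
    assert(link_ok(elm));       /* a userland analogue of the panic check */
    elm->prev->next = elm->next;
    elm->next->prev = elm->prev;
    elm->next = elm->prev = elm;
    list_unlock(l);
}
```

If two threads ran the body of list_insert_tail() or list_remove() on
the same list without the lock, their link writes could interleave and
leave a node whose successor no longer points back at it, which is the
state link_ok() rejects.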

> Third: I tried compiling with the sys/queue.h QUEUE_MACRO_DEBUG defined in
> order to get more useful information from the panic. The kernel build fails
> in pmap.c when this macro is defined, giving an error saying the CTASSERT
> macro is resolving to a negative array size. Is there any particular secret
> to using this macro (like, no one goes there any more?)

This is because you are running amd64 and the pv_entry constants
were defined assuming the default (smaller) list entry structure. I
once fixed this in a local tree, but I think I was so dismayed at the
"obviousness" of the bug I was tracking down that I neglected to
commit the pmap update. It shouldn't be too hard to calculate the
correct constants.
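
As a rough sketch of that calculation: a compile-time assert in the
negative-array-size style fails when a structure stops matching the
size it was tuned for, and the entry count has to be recomputed from
the larger entry. The structs below are simplified stand-ins, not the
real pmap.c or sys/queue.h definitions, and SKETCH_CTASSERT and
entries_that_fit() are hypothetical names.

```c
#include <stddef.h>

/* Compile-time assert in the negative-array-size style: a false
 * condition declares char[-1], which the compiler rejects. */
#define SKETCH_CTASSERT_(x, line) \
    typedef char sketch_ctassert_##line[(x) ? 1 : -1]
#define SKETCH_CTASSERT_X(x, line) SKETCH_CTASSERT_(x, line)
#define SKETCH_CTASSERT(x) SKETCH_CTASSERT_X(x, __LINE__)

/* Illustrative link fields, not the real TAILQ_ENTRY. */
struct link_plain {
    void *next;
    void **prev;
};

struct link_debug {
    void *next;
    void **prev;
    /* Stand-ins for extra trace fields a debug build might record. */
    const char *lastfile;
    int lastline;
    int prevline;
    const char *prevfile;
};

struct entry_plain { unsigned long va; struct link_plain link; };
struct entry_debug { unsigned long va; struct link_debug link; };

/* Holds on any ordinary ABI: the debug entry is strictly larger. */
SKETCH_CTASSERT(sizeof(struct entry_debug) > sizeof(struct entry_plain));

/* How many fixed-size entries fit in a page after a fixed header. */
size_t entries_that_fit(size_t page_size, size_t header_size,
                        size_t entry_size)
{
    if (entry_size == 0 || header_size > page_size)
        return 0;
    return (page_size - header_size) / entry_size;
}
```

With assumed sizes of a 4096-byte page, a 64-byte header and 24-byte
entries, entries_that_fit() gives 168 with nothing left over. If the
header grew to 88 bytes and each entry to 48, it would give 83 with 24
bytes to spare, so the count, any bitmap sized from it, and the padding
would all need adjusting before a size assertion could pass again.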

Cheers


