Date: Thu, 25 Aug 2011 23:56:11 +0200
Message-ID: <CAJ7_N2iRaJ9+SFf+uujbtNJ97K=L_AdP6kYoxfwFij4CY+wrgg@mail.gmail.com>
From: Kip Macy <kip.macy@gmail.com>
To: Charlie Martin <crmartin@sgi.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Where to ask about a 7.2 bug, and debugging sys/queue.h errors

On Thu, Aug 25, 2011 at 11:16 PM, Charlie Martin <crmartin@sgi.com> wrote:
> We're having a crash in some internal code running on FreeBSD 7.2
> (specifically 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE and yeah, I know it's
> quite a bit behind) in which after 18-30 hours of running load tests, the
> code panics with:
>
> panic: Bad link elm 0xffffff0044c09600 next->prev != elm
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0xffffffff8019119a = db_trace_self_wrapper+0x2a
> panic() at 0xffffffff80307c72 = panic+0x182
> devfs_populate_loop() at 0xffffffff802a43a8 = devfs_populate_loop+0x548
>
>
> First question: where's the most appropriate place to ask about this kind of
> bug on a back version.

Probably -stable. I don't know how many developers are still running
7. Most are on 8 at this point.

> Second: does this remind anyone of any bugs? Googling came up with a few
> somewhat similar things but hasn't provided much insight so far.

This panic is very common when list updates aren't adequately serialized.
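
To illustrate the check behind this kind of panic message, here is a minimal
userland sketch. It is not the actual sys/queue.h code: `struct node`,
`link_ok`, and `insert_after` are hypothetical names, and the layout is only
loosely modelled on a tail-queue entry. It shows the invariant named in the
panic text (the successor's prev must point back at elm) and why an insert
that touches several pointers needs to run under one lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element type, loosely modelled on a tail-queue entry. */
struct node {
    struct node  *next;   /* next element, or NULL at the tail */
    struct node **prev;   /* address of the previous element's next pointer */
    int           value;
};

/* Invariant from the panic text: the successor must point back at elm. */
bool link_ok(struct node *elm)
{
    return elm->next == NULL || elm->next->prev == &elm->next;
}

/*
 * Insert elm after listelm. This is four separate stores, so it is not
 * atomic: callers are assumed to hold a single lock around the whole
 * update. Without that, two writers can interleave and leave a pair of
 * links that no longer agree with each other.
 */
void insert_after(struct node *listelm, struct node *elm)
{
    elm->next = listelm->next;
    if (elm->next != NULL)
        elm->next->prev = &elm->next;
    listelm->next = elm;
    elm->prev = &listelm->next;
}
```

Calling `link_ok()` on an element before unlinking it is the userland
equivalent of the check that fired here; a false result means some other
writer changed the neighbouring links without holding the same lock.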

> Third: I tried compiling with the sys/queue.h QUEUE_MACRO_DEBUG defined in
> order to get more useful information from the panic. The kernel build fails
> in pmap.c when this macro is defined, giving an error saying the CTASSERT
> macro is resolving to a negative array size. Is there any particular secret
> to using this macro (like, no one goes there any more?)

This is because you are running amd64 and the pv_entry constants
were defined assuming the default (smaller) list entry structure. I
once fixed this in a local tree, but I think I was so dismayed at the
"obviousness" of the bug I was tracking down that I neglected to
commit the pmap update. It shouldn't be too hard to calculate the
correct constants.
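
As a rough illustration of why the build fails, here is a userland sketch.
All names and sizes in it are hypothetical; it does not reproduce the
kernel's pmap definitions. A compile-time assertion of the kind described
can be built from an array typedef whose size goes negative when the
condition is false. Adding debug fields to a list entry grows every
structure that embeds it, so a per-chunk count tuned for the smaller layout
has to be recomputed.

```c
#include <stddef.h>

/* Compile-time assertion in the negative-array-size style (illustrative name). */
#define SKETCH_CTASSERT(cond, name) typedef char name[(cond) ? 1 : -1]

/* Hypothetical list entry without debug fields. */
struct entry_plain {
    void *next;
    void *prev;
};

/* Hypothetical list entry with extra trace fields, as a debug build might add. */
struct entry_debug {
    void       *next;
    void       *prev;
    const char *lastfile;
    int         lastline;
    const char *prevfile;
    int         prevline;
};

/* Hypothetical items embedding each kind of entry. */
struct item_plain { unsigned long va; struct entry_plain link; };
struct item_debug { unsigned long va; struct entry_debug link; };

#define CHUNK_BYTES 4096  /* assumed page-sized container */

/* How many items fit in one chunk; depends on the item size. */
#define ITEMS_PER_CHUNK(type) (CHUNK_BYTES / sizeof(type))

/* These hold, so the typedefs get size 1 and the file compiles. */
SKETCH_CTASSERT(sizeof(struct item_debug) > sizeof(struct item_plain),
    debug_item_is_larger);
SKETCH_CTASSERT(ITEMS_PER_CHUNK(struct item_debug) * sizeof(struct item_debug)
    <= CHUNK_BYTES, debug_items_fit_when_recomputed);

/*
 * Reusing the count computed for the plain layout with the debug layout
 * would overflow the chunk; asserting that it fits would produce the
 * "negative array size" error:
 *
 * SKETCH_CTASSERT(ITEMS_PER_CHUNK(struct item_plain) *
 *     sizeof(struct item_debug) <= CHUNK_BYTES, stale_constant);
 */
```

Deriving the count from `sizeof` rather than hard-coding it is one way to
keep such an assertion true under both layouts.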

Cheers