From owner-freebsd-stable@FreeBSD.ORG  Wed May 25 20:03:45 2005
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id DE0FB16A41C
	for <freebsd-stable@freebsd.org>; Wed, 25 May 2005 20:03:45 +0000 (GMT)
	(envelope-from stephen@math.missouri.edu)
Received: from sccmmhc92.asp.att.net (sccmmhc92.asp.att.net [204.127.203.212])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9033143D48
	for <freebsd-stable@freebsd.org>; Wed, 25 May 2005 20:03:45 +0000 (GMT)
	(envelope-from stephen@math.missouri.edu)
Received: from [10.0.0.4] (12-216-244-56.client.mchsi.com[12.216.244.56])
	by sccmmhc92.asp.att.net (sccmmhc92) with ESMTP
	id <20050525200342m9200gd979e>; Wed, 25 May 2005 20:03:43 +0000
Message-ID: <4294DA1D.1030202@math.missouri.edu>
Date: Wed, 25 May 2005 15:03:41 -0500
From: Stephen Montgomery-Smith <stephen@math.missouri.edu>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.8) Gecko/20050521
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: freebsd-stable@FreeBSD.ORG
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: 
Subject: releng 5 panic (again)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 25 May 2005 20:03:46 -0000

Please help me!

I know that I am getting few responses to my emails - I am guessing that 
my situation is difficult.  I would appreciate any ideas on how to 
gather further diagnostics.

I am regularly getting panics with instruction pointer equal to 
0xc0611c69.  I am not able to get any dumps - the dumpon directive is 
simply ignored.

(I did get one dump (for some reason), but that was with a kernel that 
was not made with config -g, and new kernels made afterwards seem 
significantly different, despite having exactly the same size.)

The code at this instruction pointer is

(kgdb) list *0xc0611c69
0xc0611c69 is in fill_kinfo_thread (../../../kern/kern_proc.c:748).
743                     }
744
745                     kg = td->td_ksegrp;
746
747                     /* things in the KSE GROUP */
748                     kp->ki_estcpu = kg->kg_estcpu;
749                     kp->ki_slptime = kg->kg_slptime;
750                     kp->ki_pri.pri_user = kg->kg_user_pri;
751                     kp->ki_pri.pri_class = kg->kg_pri_class;
752

so I'm guessing that kp is not correct.
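
One note on that guess: the statement at line 748 goes through two 
pointers, reading via kg and writing via kp, so a fault at that address 
could in principle come from either one.  The following is only a 
userland sketch to make that explicit.  The struct layouts, the field 
types, the mock_ names, and the copy_ksegrp_fields() helper are all 
hypothetical stand-ins, not the real definitions from the kernel headers; 
only the field names are taken from the listing above.

```c
#include <stddef.h>

/* Hypothetical userland stand-ins; the field types are placeholders. */
struct mock_priority {
	unsigned char pri_class;
	unsigned char pri_user;
};

struct mock_ksegrp {
	unsigned int  kg_estcpu;
	unsigned int  kg_slptime;
	unsigned char kg_user_pri;
	unsigned char kg_pri_class;
};

struct mock_kinfo_proc {
	unsigned int         ki_estcpu;
	unsigned int         ki_slptime;
	struct mock_priority ki_pri;
};

/*
 * Mirrors the four assignments at kern_proc.c:748-751, but reports
 * which pointer is bad instead of faulting.
 * Returns 0 on success, -1 if kp is NULL, -2 if kg is NULL.
 */
int
copy_ksegrp_fields(struct mock_kinfo_proc *kp, const struct mock_ksegrp *kg)
{
	if (kp == NULL)
		return (-1);	/* the write side would fault */
	if (kg == NULL)
		return (-2);	/* the read side would fault */

	kp->ki_estcpu = kg->kg_estcpu;
	kp->ki_slptime = kg->kg_slptime;
	kp->ki_pri.pri_user = kg->kg_user_pri;
	kp->ki_pri.pri_class = kg->kg_pri_class;
	return (0);
}
```

A NULL check like this only catches a NULL pointer.  A stale pointer 
to freed memory would pass the check and still go wrong.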

Because of the consistency of the instruction pointer value from panic 
to panic, I really am thinking that this is not a hardware issue.

I will try any reasonable test you guys have for me.  Right now I am 
switching off HTT to see if that is the issue.  This is a dual Xeon system.

I am willing to provide a copy of the program that I'm guessing is 
causing the problem.  It is a multithreaded program that is very CPU 
intensive, although most of the internals of the code are from the fftw3 port.

One interesting thing about this program is that when I run it, top says 
that about 45% CPU is being used (which with 4 logical CPUs means that 
almost 2 CPUs are being used), but the program itself is reported as 
running at about 80% CPU (which I am guessing means 0.8 of one 
CPU is being used).  It seems to me that there is some disparity in the 
accounting.
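
To make the arithmetic behind that comparison explicit, here is a 
minimal sketch in C.  It assumes the interpretation guessed at above: 
that the system-wide figure is a fraction of all logical CPUs combined 
and that the per-process figure is a fraction of a single CPU.  The 
function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* System-wide percentage -> number of logical CPUs kept busy. */
double
busy_cpus_from_system_pct(double system_pct, int ncpus)
{
	assert(ncpus > 0);
	return (system_pct / 100.0 * ncpus);
}

/* Per-process percentage -> fraction of one CPU (assumed meaning). */
double
busy_cpus_from_process_pct(double process_pct)
{
	return (process_pct / 100.0);
}

/* Print both figures for the numbers quoted in this message. */
void
print_accounting_gap(void)
{
	double sys = busy_cpus_from_system_pct(45.0, 4);
	double proc = busy_cpus_from_process_pct(80.0);

	printf("system-wide: %.1f CPUs, process: %.1f CPUs, gap: %.1f\n",
	    sys, proc, sys - proc);
}
```

Under those assumptions the system-wide figure works out to about 1.8 
CPUs against 0.8 for the process, so roughly one CPU's worth of time is 
not attributed to the program.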

Maybe it is a problem with the math/fftw3 code.  But it still shouldn't 
cause crashes.

Please help me.  I am sure that this is a difficult problem, but I just 
don't know how to provide you any further decent diagnostic information.

Thanks, Stephen