From owner-freebsd-current  Sun Sep 10 20:48:31 2000
Delivered-To: freebsd-current@freebsd.org
Received: from pike.osd.bsdi.com (pike.osd.bsdi.com [204.216.28.222])
	by hub.freebsd.org (Postfix) with ESMTP id 55F9337B422
	for <current@FreeBSD.ORG>; Sun, 10 Sep 2000 20:48:29 -0700 (PDT)
Received: (from jhb@localhost)
	by pike.osd.bsdi.com (8.9.3/8.9.3) id UAA31972;
	Sun, 10 Sep 2000 20:48:16 -0700 (PDT)
	(envelope-from jhb)
From: John Baldwin <jhb@pike.osd.bsdi.com>
Message-Id: <200009110348.UAA31972@pike.osd.bsdi.com>
Subject: Re: New Fatal trap in Current SMP (random.dev changes ??)
In-Reply-To: <4.3.2.7.2.20000910202520.00b17498@pozo.com> from Manfred Antar
 at "Sep 10, 2000 08:30:14 pm"
To: Manfred Antar <null@pozo.com>
Date: Sun, 10 Sep 2000 20:48:16 -0700 (PDT)
Cc: current@FreeBSD.ORG
X-Mailer: ELM [version 2.4ME+ PL68 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Manfred Antar wrote:
> At 08:16 PM 9/10/2000 -0700, John Baldwin wrote:
> >Manfred Antar wrote:
> >> From a new kernel compiled from sources current 7:30 pm 9/10/00 pacific time
> >> Although the first occurrence happened this morning after compiling a kernel 
> >> after the random dev changes.
> >> The system boots and mounts disks. When it gets to this point:
> >> Additional routing options: TCP extensions=NO TCP keepalive=YES.
> >> routing daemons:.
> >> I think the random dev kicks in at this point
> >> 
> >> Fatal trap 12: page fault while in kernel mode
> >> cpuid = 1; lapic.id = 0c000000
> >> fault virtual address = 0x2c
> >> fault code            = supervisor read, page not present
> >> instruction pointer   = 0x8:0xc014f280
> >> stack pointer         = 0x10:0xc9a74f84
> >> frame pointer         = 0x10:0xc9a74f9c
> >> code segment          = base 0x0, limit 0xfffff, type 0x1b
> >>                       = DPL 0, pres 1,def 32 1,gran 1
> >> processor flags       = interrupt enabled, resume, IOPL = 0
> >> current process       = 2 (random)
> >> trap number           = 12
> >> panic: page fault
> >> cpuid = 1; lapic.id = 0x000000
> >> boot() called on cpu#1
> >> 
> >> syncing disks.......
> >> 
> >> The machine then is frozen and needs a reset to work again
> >> the debugger is unavailable
> I backed out to the old versions of the random dev files:
> /sys/dev/randomdev:
> harvest.c
> randomdev.c
> yarrow.c
> yarrow.h
> /sys/sys:
> random.h
> 
> And the panic goes away
> Manfred

I've got it fixed.  The code is using TAILQ_REMOVE and TAILQ_FIRST to
pull entries out of a tailq while it is walking it via TAILQ_FOREACH.
Changing it to use a while(!TAILQ_EMPTY) instead of using TAILQ_FOREACH
fixes it.  I'll be committing the fix in just a sec.
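
For anyone following along, here is a minimal userland sketch of the
pattern described above.  It is not the actual randomdev code: the
struct event type and the fill_queue(), drain_queue(), and demo() names
are made up for illustration, and it assumes a <sys/queue.h> that
provides the TAILQ macros.  TAILQ_FOREACH advances by reading the
current entry's link after the loop body runs, so removing (and then
freeing or requeueing) that entry inside the body leaves the loop
following stale pointers.  Draining with while (!TAILQ_EMPTY()) always
restarts from the head instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical entry type, for illustration only. */
struct event {
	int value;
	TAILQ_ENTRY(event) link;
};

TAILQ_HEAD(event_queue, event);

/* Append n entries with values 0..n-1 to the tail of the queue. */
void
fill_queue(struct event_queue *q, int n)
{
	for (int i = 0; i < n; i++) {
		struct event *e = malloc(sizeof(*e));
		assert(e != NULL);
		e->value = i;
		TAILQ_INSERT_TAIL(q, e, link);
	}
}

/*
 * Unsafe pattern (shown only as a comment, do not use):
 *
 *	TAILQ_FOREACH(e, q, link) {
 *		TAILQ_REMOVE(q, e, link);
 *		free(e);	(the loop then reads e's link to advance)
 *	}
 */

/* Safe drain: always take the current head.  Returns entries removed. */
int
drain_queue(struct event_queue *q)
{
	int removed = 0;

	while (!TAILQ_EMPTY(q)) {
		struct event *e = TAILQ_FIRST(q);
		TAILQ_REMOVE(q, e, link);
		free(e);
		removed++;
	}
	return (removed);
}

/* Build a queue of n entries, drain it, and report how many came out. */
int
demo(int n)
{
	struct event_queue q;
	int removed;

	TAILQ_INIT(&q);
	fill_queue(&q, n);
	removed = drain_queue(&q);
	return (TAILQ_EMPTY(&q) ? removed : -1);
}
```

Calling demo(5) should drain all five entries and leave the queue empty.
Where <sys/queue.h> provides TAILQ_FOREACH_SAFE, that is another option
when only some entries need to be removed during the walk.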

-- 

John Baldwin <jhb@bsdi.com> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.cslab.vt.edu/~jobaldwi/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
