Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 10 Mar 2003 23:51:52 +0100
From:      Thomas Moestl <tmoestl@gmx.net>
To:        Tim Robbins <tjr@FreeBSD.org>
Cc:        John Baldwin <jhb@FreeBSD.org>, Kris Kennaway <kris@obsecurity.org>, alfred@FreeBSD.org, current@FreeBSD.org, Poul-Henning Kamp <phk@phk.freebsd.dk>
Subject:   Re: NULL pointer problem in pid selection ?
Message-ID:  <20030310225151.GA2803@crow.dom2ip.de>
In-Reply-To: <20030311084346.A63542@dilbert.robbins.dropbear.id.au>
References:  <20030308213535.GE56020@rot13.obsecurity.org> <XFMail.20030310130015.jhb@FreeBSD.org> <20030311084346.A63542@dilbert.robbins.dropbear.id.au>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 2003/03/11 at 08:43:46 +1100, Tim Robbins wrote:
> On Mon, Mar 10, 2003 at 01:00:15PM -0500, John Baldwin wrote:
> 
> > On 08-Mar-2003 Kris Kennaway wrote:
> > > On Sat, Mar 08, 2003 at 11:46:34AM +0100, Poul-Henning Kamp wrote:
> > >> 
> > >> Just got this crash on -current, and I belive I have seen similar
> > >> before.  addr2line(1) reports the faulting address to be
> > >>      ../../../kern/kern_fork.c:395
> > >> which is in the inner loop of pid collision avoidance.
> > > 
> > > I've been running this patch from Alfred for the past month or so on
> > > bento, which has fixed a similar panic I was seeing regularly.
> > 
> > Using just a shared lock instead of an xlock should be ok there.  You
> > aren't modifying the process tree, just looking at it.  OTOH, the
> > proc lock is supposed to protect p_grp and p_session, so they shouldn't
> > be NULL. :(
> 
> I have a suspiscion that the bug is actually in wait1():
> 
>         sx_xlock(&proctree_lock);
> 	[...]
> 	/*
> 	 * Remove other references to this process to ensure
> 	 * we have an exclusive reference.
> 	 */
> 	leavepgrp(p);
> 
> 	sx_xlock(&allproc_lock);
> 	LIST_REMOVE(p, p_list); /* off zombproc */
> 	sx_xunlock(&allproc_lock);
> 
> 	LIST_REMOVE(p, p_sibling);
> 	sx_xunlock(&proctree_lock);
> 
> 
> Shouldn't we be removing the process from zombproc before setting
> p_pgrp to NULL via leavepgrp()? Does this even matter at all when both
> fork1() and wait1() are still protected by Giant?

Hmmm, I think you're right; if allproc_lock happens to be contested in
fork1() (which can happen because it it is locked without Giant held
in some places, and because sleeping with an sx lock is allowed),
we'll go to sleep there, dropping Giant. This opens up a race, since
wait1() can now proceed until after the leavepgrp() before blocking;
when allproc_lock is released, fork1() will be the first to pick it
up, and this panic will happen.
Seems that I relied on Giant too much when I first took a look into
that code :)

	- Thomas

-- 
Thomas Moestl <tmoestl@gmx.net>	http://www.tu-bs.de/~y0015675/
              <tmm@FreeBSD.org>	http://people.FreeBSD.org/~tmm/
PGP fingerprint: 1C97 A604 2BD0 E492 51D0  9C0F 1FE6 4F1D 419C 776C

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20030310225151.GA2803>