From owner-freebsd-arch  Tue Oct  8  7:14:30 2002
Message-ID: <XFMail.20021008101429.jhb@FreeBSD.org>
X-Mailer: XFMail 1.5.2 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200210080322.g983MIvU034090@gw.catspoiler.org>
Date: Tue, 08 Oct 2002 10:14:29 -0400 (EDT)
From: John Baldwin <jhb@FreeBSD.org>
To: Don Lewis <dl-freebsd@catspoiler.org>
Subject: Re: [jmallett@FreeBSD.org: [PATCH] Reliable signal queues, etc.,
Cc: arch@FreeBSD.org, jmallett@FreeBSD.org


On 08-Oct-2002 Don Lewis wrote:
> On  7 Oct, John Baldwin wrote:
>> 
>> On 07-Oct-2002 Don Lewis wrote:
> 
>>> Probably, but the list is also modified in the exit code.  All those
>>> processes that we are sending SIGKILL to are removing themselves from
>>> the list.
>> 
>> Processes dying from SIGKILL that we send them aren't a problem since
>> we have already read their p_peers member before we kill them.  That's
>> the point of 'nq'.  The problem is that 'nq' could exit and could be
>> an invalid pointer.  If a process later in the list after 'nq' died
>> that is not a problem either.  Well, how about this:
> 
> I missed your use of nq, even though this is a fairly common way of
> handling similar problems if there is only a single thread.
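
(A minimal user-space sketch of the 'nq' idiom discussed above, not code
from the patch.  struct peer, kill_peer() and kill_all_peers() are
hypothetical names; only p_peers and nq come from this thread.  It shows
the single-threaded case only: the successor is read before the current
entry is torn down.  It does not protect against nq itself exiting
concurrently, which is the case the lock is meant to cover.)

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct proc; only the peer link is modelled. */
struct peer {
    struct peer *p_peers;   /* next peer in the list */
    int killed;             /* set once we have "sent SIGKILL" */
};

/* Simulates exit: the entry unlinks itself, so its p_peers is gone. */
static void kill_peer(struct peer *q)
{
    assert(!q->killed);     /* each entry is torn down only once */
    q->killed = 1;
    q->p_peers = NULL;
}

/* Walk the list, reading the successor before killing the current entry. */
static int kill_all_peers(struct peer *head)
{
    struct peer *q, *nq;
    int n = 0;

    for (q = head; q != NULL; q = nq) {
        nq = q->p_peers;    /* read the link first ... */
        kill_peer(q);       /* ... then q may be invalidated safely */
        n++;
    }
    return n;
}
```

Had the loop advanced with q = q->p_peers after the kill, it would have
stopped after the first entry, because the link is already cleared.
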
> 
>> http://www.FreeBSD.org/~jhb/patches/ppeers.patch
> 
> That's pretty much what I had envisioned.  I have a little bit of a
> concern that funnelling a single mutex could be a bottleneck in some
> cases, but it is simple, safe, and otherwise low overhead.

Well, the mutex is only used in the RFTHREAD case most of the time.  In
the one place where it is acquired unconditionally, it is released almost
immediately in the !RFTHREAD case.

> It looks like we've got a potential lock order reversal problem, though.
> In fork1() we grab ppeers_lock while holding a couple of PROC_LOCKs,
> while in the first part of exit1() we grab ppeers_lock before PROC_LOCK.
> My caffeine level is insufficient to judge whether P_WEXIT checking
> would save us in practice.

Bah, fixed the reversal, thanks.  We still need the P_WEXIT check in
fork1() since otherwise a new peer or child could be added after we
have finished going through the entire list.  Hmm, adding this is ugly
though because we really need to check after we acquire the ppeers_lock and
do the actual hookup.  Hmm, we can move the RFTHREAD stuff a lot earlier
and then this isn't such a big deal.  Ok, I've updated the patch again.
One note: I've got a question about how to handle the error condition
in that case in fork1().  I'm really starting to think that instead of
returning an error, the peer process should just go ahead and call
exit1() in this case since it is about to be killed anyway.
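
(A rough user-space sketch of the two points above, using pthread mutexes
in place of the kernel locks.  It is not taken from the patch.  The lock
order shown, ppeers_lock before the per-process lock, is an assumption
matching what exit1() is described as doing; the struct layout, the
P_WEXIT value, the ESRCH return and the function names peer_hookup() and
leader_begin_exit() are all hypothetical.)

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define P_WEXIT 0x1                 /* hypothetical value: leader is exiting */

struct proc {
    pthread_mutex_t p_lock;         /* stands in for PROC_LOCK */
    int p_flag;
    struct proc *p_peers;           /* next peer in the leader's list */
    struct proc *p_leader;
};

static pthread_mutex_t ppeers_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Fork side: take the locks in the same order as the exit side, and check
 * P_WEXIT only after ppeers_lock is held, right before the hookup.
 */
static int peer_hookup(struct proc *leader, struct proc *newp)
{
    int error = 0;

    assert(newp->p_leader == NULL); /* not hooked up yet */
    pthread_mutex_lock(&ppeers_lock);
    pthread_mutex_lock(&leader->p_lock);
    if (leader->p_flag & P_WEXIT) {
        error = ESRCH;              /* assumed errno; leader is going away */
    } else {
        newp->p_leader = leader;
        newp->p_peers = leader->p_peers;
        leader->p_peers = newp;
    }
    pthread_mutex_unlock(&leader->p_lock);
    pthread_mutex_unlock(&ppeers_lock);
    return error;
}

/* Exit side: same lock order, so the two paths cannot deadlock. */
static void leader_begin_exit(struct proc *leader)
{
    pthread_mutex_lock(&ppeers_lock);
    pthread_mutex_lock(&leader->p_lock);
    leader->p_flag |= P_WEXIT;      /* no new peers after this point */
    pthread_mutex_unlock(&leader->p_lock);
    pthread_mutex_unlock(&ppeers_lock);
}
```

Because the flag is tested while ppeers_lock is held, a peer cannot be
linked in after the exiting leader has finished walking the list.  The
sketch returns an error on that path; calling exit1() there instead, as
suggested above, would replace the error assignment.
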

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
