From: John Baldwin
To: Don Lewis
Cc: arch@FreeBSD.ORG, jmallett@FreeBSD.ORG
Subject: Re: [jmallett@FreeBSD.org: [PATCH] Reliable signal queues, etc.,
Date: Mon, 07 Oct 2002 14:54:42 -0400 (EDT)
In-Reply-To: <200210051000.g95A0ZvU023752@gw.catspoiler.org>

On 05-Oct-2002 Don Lewis wrote:
> On 5 Oct, Juli Mallett wrote:
>
>> diff -Nrdu -x *CVS* -x *dev* sys/kern/kern_exit.c kernel/kern/kern_exit.c
>> --- sys/kern/kern_exit.c    Tue Oct  1 12:15:51 2002
>> +++ kernel/kern/kern_exit.c Sat Oct  5 01:20:57 2002
>
>> @@ -209,12 +210,12 @@
>>          PROC_LOCK(p);
>>          if (p == p->p_leader) {
>>              q = p->p_peers;
>> +            PROC_UNLOCK(p);
>>              while (q != NULL) {
>> -                PROC_LOCK(q);
>>                  psignal(q, SIGKILL);
>> -                PROC_UNLOCK(q);
>>                  q = q->p_peers;
>>              }
>> +            PROC_LOCK(p);
>>              while (p->p_peers)
>>                  msleep(p, &p->p_mtx, PWAIT, "exit1", 0);
>>          }
>
> This scary looking fragment of code in exit1() is relying on the lock on
> p->p_leader being continuously held to prevent the p_peers list from
> changing while the list traversal is in progress.  The code in
> kern_fork.c and elsewhere in kern_exit.c holds a lock on p_leader while
> the list modifications are done.
>
> The existing code looks like it could deadlock if q is locked because it
> is in fork() or exit().  Process p will block when it tries to lock q,
> and q will block when it tries to lock its p_leader, which happens to be
> p.

Ugh.  Probably the code should be changed to do something like this:

--- kern_exit.c 2 Oct 2002 23:12:01 -0000       1.181
+++ kern_exit.c 7 Oct 2002 18:48:18 -0000
@@ -203,17 +203,18 @@
      */
 
     p->p_flag |= P_WEXIT;
-    PROC_UNLOCK(p);
 
     /* Are we a task leader? */
-    PROC_LOCK(p);
     if (p == p->p_leader) {
         q = p->p_peers;
         while (q != NULL) {
+            nq = q->p_peers;
+            PROC_UNLOCK(p);
             PROC_LOCK(q);
             psignal(q, SIGKILL);
             PROC_UNLOCK(q);
-            q = q->p_peers;
+            PROC_LOCK(p);
+            q = nq;
         }
         while (p->p_peers)
             msleep(p, &p->p_mtx, PWAIT, "exit1", 0);

Also, we might should check P_WEXIT and abort in fork1() if it is set.
(We don't appear to do that presently.)

-- 

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
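
For anyone who wants to poke at the locking order outside the kernel, here is a rough userland sketch of the hand-off in the proposed patch: read the next link while the leader is locked, drop the leader's lock, lock the peer, signal it, unlock it, then retake the leader's lock. Everything in it is an assumption made for illustration: struct task, task_init(), task_add_peer() and kill_peers() are hypothetical names, pthread mutexes stand in for PROC_LOCK/PROC_UNLOCK, and a "killed" flag stands in for psignal(q, SIGKILL). It is not the kernel code, and it does not try to answer whether nq stays valid while the leader is unlocked.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct proc; NOT the FreeBSD definition. */
struct task {
    pthread_mutex_t mtx;    /* plays the role of p_mtx */
    struct task *leader;    /* plays the role of p_leader */
    struct task *peers;     /* plays the role of p_peers (next peer) */
    int killed;             /* set where the kernel would psignal() */
};

static void
task_init(struct task *t, struct task *leader)
{
    pthread_mutex_init(&t->mtx, NULL);
    t->leader = leader;
    t->peers = NULL;
    t->killed = 0;
}

/* List modifications are done with the leader's lock held. */
static void
task_add_peer(struct task *leader, struct task *peer)
{
    pthread_mutex_lock(&leader->mtx);
    peer->peers = leader->peers;
    leader->peers = peer;
    pthread_mutex_unlock(&leader->mtx);
}

/*
 * Walk the leader's peer list without ever holding the leader's lock
 * and a peer's lock at the same time.  Returns the number of peers
 * that were marked.
 */
static int
kill_peers(struct task *p)
{
    struct task *q, *nq;
    int n = 0;

    assert(p != NULL);
    pthread_mutex_lock(&p->mtx);
    if (p == p->leader) {
        q = p->peers;
        while (q != NULL) {
            nq = q->peers;                  /* read link while p is locked */
            pthread_mutex_unlock(&p->mtx);  /* drop p before taking q */
            pthread_mutex_lock(&q->mtx);
            q->killed = 1;                  /* stand-in for psignal(q, SIGKILL) */
            pthread_mutex_unlock(&q->mtx);
            pthread_mutex_lock(&p->mtx);    /* retake p before moving on */
            q = nq;
            n++;
        }
    }
    pthread_mutex_unlock(&p->mtx);
    return (n);
}
```

The point of the ordering is that a peer blocked on its leader's lock (as it would be in fork() or exit()) can never be waited on by a leader that is still holding that lock, which is the deadlock Don describes.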