From: Don Lewis
Date: Mon, 7 Oct 2002 20:22:18 -0700 (PDT)
Subject: Re: [jmallett@FreeBSD.org: [PATCH] Reliable signal queues, etc.,
To: jhb@FreeBSD.org
Cc: jmallett@FreeBSD.org, arch@FreeBSD.org
Message-Id: <200210080322.g983MIvU034090@gw.catspoiler.org>

On 7 Oct, John Baldwin wrote:
>
> On 07-Oct-2002 Don Lewis wrote:
>> Probably, but the list is also modified in the exit code.  All those
>> processes that we are sending SIGKILL to are removing themselves from
>> the list.
>
> Processes dying from SIGKILL that we send them aren't a problem since
> we have already read their p_peers member before we kill them.  That's
> the point of 'nq'.  The problem is that 'nq' could exit and could be
> an invalid pointer.  If a process later in the list after 'nq' died
> that is not a problem either.  Well, how about this:

I missed your use of nq, even though this is a fairly common way of
handling similar problems if there is only a single thread.

> http://www.FreeBSD.org/~jhb/patches/ppeers.patch

That's pretty much what I had envisioned.
I have a little bit of a concern that funnelling everything through a
single mutex could be a bottleneck in some cases, but it is simple,
safe, and otherwise low overhead.

It looks like we've got a potential lock order reversal problem, though.
In fork1() we grab ppeers_lock while holding a couple of PROC_LOCKs,
while in the first part of exit1() we grab ppeers_lock before PROC_LOCK.
My caffeine level is insufficient to judge whether P_WEXIT checking
would save us in practice.
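
To make the lock order concern concrete, here is a minimal user-space
sketch. It is not kernel code and not taken from the patch: the lock
IDs, acquire()/release(), fork_path() and exit_path() are all
hypothetical names, and no real mutexes are taken. It only records
which lock was held when another was acquired, and reports when a later
path takes the same two locks in the opposite order, roughly the kind
of check a lock order verifier performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock IDs standing in for PROC_LOCK and ppeers_lock. */
enum { LK_PROC, LK_PPEERS, LK_COUNT };

/* seen_before[a][b] is true once b has been acquired while a was held. */
static bool seen_before[LK_COUNT][LK_COUNT];
static bool held[LK_COUNT];

/*
 * Record the acquisition of lk.  Returns true if some currently held
 * lock was, on an earlier path, acquired while lk was held, i.e. the
 * two locks have now been taken in both orders.
 */
static bool acquire(int lk)
{
    bool reversal = false;

    assert(lk >= 0 && lk < LK_COUNT);
    for (int h = 0; h < LK_COUNT; h++) {
        if (!held[h] || h == lk)
            continue;
        if (seen_before[lk][h])
            reversal = true;        /* opposite order seen earlier */
        seen_before[h][lk] = true;  /* remember this order */
    }
    held[lk] = true;
    return reversal;
}

static void release(int lk)
{
    assert(lk >= 0 && lk < LK_COUNT);
    held[lk] = false;
}

/* Order described for fork1(): PROC_LOCK first, then ppeers_lock. */
static bool fork_path(void)
{
    bool r = acquire(LK_PROC);
    r |= acquire(LK_PPEERS);
    release(LK_PPEERS);
    release(LK_PROC);
    return r;
}

/* Order described for the start of exit1(): ppeers_lock, then PROC_LOCK. */
static bool exit_path(void)
{
    bool r = acquire(LK_PPEERS);
    r |= acquire(LK_PROC);
    release(LK_PROC);
    release(LK_PPEERS);
    return r;
}
```

Calling fork_path() and then exit_path() makes the second call report a
reversal, since the two paths disagree about which lock comes first.
With real mutexes and two threads, that disagreement is what allows
each thread to hold one lock while waiting forever for the other. The
sketch says nothing about whether a P_WEXIT check closes the window in
practice.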