From owner-freebsd-stable@FreeBSD.ORG  Mon Apr  3 16:37:13 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@FreeBSD.org
Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 71D6416A420;
	Mon,  3 Apr 2006 16:37:13 +0000 (UTC)
	(envelope-from tgl@sss.pgh.pa.us)
Received: from sss.pgh.pa.us (sss.pgh.pa.us [66.207.139.130])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 079BC43D73;
	Mon,  3 Apr 2006 16:37:04 +0000 (GMT)
	(envelope-from tgl@sss.pgh.pa.us)
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
	by sss.pgh.pa.us (8.13.6/8.13.6) with ESMTP id k33Gb4Ns014655;
	Mon, 3 Apr 2006 12:37:04 -0400 (EDT)
To: Robert Watson <rwatson@FreeBSD.org>
In-reply-to: <20060403164139.D36756@fledge.watson.org> 
References: <20060402163504.T947@ganymede.hub.org>
	<25422.1144016604@sss.pgh.pa.us> <25526.1144017388@sss.pgh.pa.us>
	<20060402213921.V947@ganymede.hub.org>
	<26524.1144026385@sss.pgh.pa.us>
	<20060402222843.X947@ganymede.hub.org>
	<26796.1144028094@sss.pgh.pa.us>
	<20060402225204.U947@ganymede.hub.org>
	<26985.1144029657@sss.pgh.pa.us>
	<20060402231232.C947@ganymede.hub.org>
	<27148.1144030940@sss.pgh.pa.us>
	<20060402232832.M947@ganymede.hub.org>
	<20060402234459.Y947@ganymede.hub.org>
	<27417.1144033691@sss.pgh.pa.us>
	<20060403164139.D36756@fledge.watson.org>
Comments: In-reply-to Robert Watson <rwatson@FreeBSD.org>
	message dated "Mon, 03 Apr 2006 16:49:52 +0100"
Date: Mon, 03 Apr 2006 12:37:04 -0400
Message-ID: <14654.1144082224@sss.pgh.pa.us>
From: Tom Lane <tgl@sss.pgh.pa.us>
Cc: "Marc G. Fournier" <scrappy@postgresql.org>, pgsql-hackers@postgresql.org,
	freebsd-stable@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject: Re: [HACKERS] semaphore usage "port based"? 
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Apr 2006 16:37:13 -0000

Robert Watson <rwatson@FreeBSD.org> writes:
> However, pid's in general uniquely identify a process only at the time they 
> are recorded.  So any pid returned here is necessarily stale -- even if there
> is another process with the pid returned by GETPID, it may actually be a 
> different process that has ended up with the same pid.  The longer the gap 
> since the last semaphore operation, the more likely (presumably) it is that 
> the pid has been recycled.  And on modern systems with thousands of processes
> and high process turn-over (i.e., systems with CGI and other sorts of 
> scripting), pid reuse can happen quickly.  Is your use of the pid here 
> consistent with the fact that pid's are reused quickly after process exit?

That's a fair question, but in the context of the code I believe we are
behaving reasonably.  The reason this code exists is to provide some
insurance against leaking semaphores when a postmaster process is
terminated unexpectedly (ye olde often-recommended-against "kill -9
postmaster", for instance).  If the PID returned by GETPID is
nonexistent or belongs to a process not owned by the postgres userid
then we assume that the semaphore set can be recycled.  We could get
fooled by PID recycling if the PID returned by GETPID belongs to a
postgres-owned process that isn't actually the original owner, but
the penalty is just that we'll fail to recycle semaphores that could
be released.  Not very harmful, and not very probable either, unless
you're running postgres under a userid that's used for a lot of other
stuff too.  There is not much risk of long-term leakage of many
semaphore sets, even if you've got lots of postmaster crashes going on
(which I sure hope you don't).  The code is designed to retry the same
semaphore keys on each cycle of life, so you'd have to get fooled by
chance coincidence of existing PIDs every time over many cycles to
have a severe resource-leakage problem.  (BTW, Marc, that's the reason
for *not* randomizing the key selection as you suggested.)
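
For anyone following along without the source handy, here is a minimal
sketch in C of the check described above.  It is not the actual
PostgreSQL code: the names classify_last_pid and classify_sema_set are
made up for illustration, and I am assuming that "owned by the postgres
userid" is tested by whether kill(pid, 0) succeeds, which holds only
when the caller is not root.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <unistd.h>             /* getpid(), used in the usage note */

/* Hypothetical verdict type; not a PostgreSQL identifier. */
typedef enum { SEMA_IN_USE, SEMA_RECYCLABLE } sema_verdict;

/*
 * Decide whether the process that last touched a semaphore set still
 * looks like a live process running under our userid.
 */
sema_verdict classify_last_pid(pid_t last_pid)
{
    /* No usable PID recorded: be conservative and leave the set alone. */
    if (last_pid <= 0)
        return SEMA_IN_USE;

    /*
     * Signal 0 delivers nothing; it only checks existence and permission.
     * Success means the PID exists and we may signal it, i.e. it runs
     * under our userid (assumption: we are not root).
     */
    if (kill(last_pid, 0) == 0)
        return SEMA_IN_USE;

    /* ESRCH: no such process.  EPERM: exists, but under another userid. */
    if (errno == ESRCH || errno == EPERM)
        return SEMA_RECYCLABLE;

    return SEMA_IN_USE;         /* unexpected errno: do not recycle */
}

/* Ask the kernel which PID last operated on semaphore semnum of semid. */
sema_verdict classify_sema_set(int semid, int semnum)
{
    int last_pid = semctl(semid, semnum, GETPID);

    if (last_pid < 0)
        return SEMA_IN_USE;     /* cannot tell, so do not recycle */
    return classify_last_pid((pid_t) last_pid);
}
```

For example, classify_last_pid(getpid()) reports SEMA_IN_USE, since the
calling process exists and may signal itself.  Every doubtful case falls
on the SEMA_IN_USE side, so a wrong guess costs an unrecycled set rather
than a live one being stomped on.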

So I think the code is pretty bulletproof as long as it's running on a
system that behaves per the SysV spec.  The problem in the current FBSD
situation is that the jail mechanism is exposing semaphore sets across
jails, but not exposing the existence of the owning processes.  That
behavior is inconsistent: if process A can affect the state of a sema
set that process B can see, it's surely unreasonable to pretend that A
doesn't exist.

			regards, tom lane