From owner-freebsd-questions@FreeBSD.ORG  Fri Jan  7 18:16:48 2005
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5F3A516A4CE
	for <freebsd-questions@freebsd.org>;
	Fri,  7 Jan 2005 18:16:48 +0000 (GMT)
Received: from stewie.obfuscated.net (stewie.obfuscated.net [66.118.188.125])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D0B6B43D3F
	for <freebsd-questions@freebsd.org>;
	Fri,  7 Jan 2005 18:16:47 +0000 (GMT)	(envelope-from m@obmail.net)
Received: from [192.168.1.103] (653259hfc120.tampabay.rr.com [65.32.59.120])
	(using TLSv1 with cipher RC4-SHA (128/128 bits))
	(No client certificate requested)
	by stewie.obfuscated.net (Postfix) with ESMTP id E53F36104;
	Fri,  7 Jan 2005 13:16:46 -0500 (EST)
In-Reply-To: <20050107173333.GA865@procyon.nekulturny.org>
References: <BE030CE7.15722%joe@jwebmedia.com>
	<41DDB2A7.8020001@wilderness.dyn.dhs.org>
	<41DE0F6F.3040303@taborandtashell.net> <1105100701.640.6.camel@chaucer>
	<20050107173333.GA865@procyon.nekulturny.org>
Mime-Version: 1.0 (Apple Message framework v619)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Message-Id: <2EBCB4AD-60D8-11D9-B88F-00039367611E@obmail.net>
Content-Transfer-Encoding: 7bit
From: M <m@obmail.net>
Date: Fri, 7 Jan 2005 13:15:59 -0500
To: Danny MacMillan <flowers@users.sourceforge.net>
X-Mailer: Apple Mail (2.619)
cc: Mike Jeays <Mike.Jeays@rogers.com>
cc: Laurence Sanford <lauasanf@wilderness.dyn.dhs.org>
cc: tkelly-freebsd-questions@taborandtashell.net
cc: FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: Remote upgrade possible?
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>,
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 07 Jan 2005 18:16:48 -0000


On Jan 7, 2005, at 12:33 PM, Danny MacMillan wrote:

> I haven't looked at the code, but your assertion is extremely unlikely.
> I really want to say "impossible" but as I said, I haven't looked at
> the code.  If FreeBSD loaded entire executable images into RAM when
> starting new processes, it would perform very poorly.  What is more
> likely is that the kernel keeps the image file open during program
> execution.  When the xterm binary is replaced, the old binary is still
> on disk in its old location, it just doesn't have any directory
> entries pointing to it.  Since the kernel still has the file open it
> won't be overwritten.  Hence the kernel can and will still load
> pages from the old image.  This is a function of the same behaviour
> that causes df and du output to differ in some cases.
>
> The lsof(8) utility seems to bear this out, as each process seems to
> keep each image (program and shared object files) open during
> execution.
>
> A new instance of xterm would use the new, upgraded binary.
>

When you run a program, the program that launches it first makes a copy
of itself in the process table, and the two share code pages. This is
done through fork(). The new process, called the child, then calls one
of the exec() functions, all of which end up in a single syscall,
execve(). execve() uses namei() to get the vnode pointer. Each vnode
has three reference counts: v_usecount, v_holdcnt and v_writecount. A
vnode is not recycled until both the usecount and holdcnt are 0. When
namei() is called it calls VREF(), which is vref(), which does

         vp->v_usecount++;

so while the program is running the vnode can't be recycled, and that
holds from a point in time before the program is actually loaded into
memory. execve() calls exec_map_first_page(). Without tearing this
apart, I'm going to guess that it memory maps the first page of text
(code) through the VM subsystem, as evidenced by the conspicuous calls
to vm_page*() functions, so I'd conclude the file is memory mapped.
Presuming the command you're calling isn't a shell script or other
script, execve() then cleans up the environment (descriptors marked
close-on-exec are closed and caught signal handlers are reset, so they
don't carry over), sets up the process's environment, lets the calling
(forking) process know it can continue on its merry way, and sets
uid/gid if necessary/possible. It looks like the scheduler takes care
of the rest. (I'll be honest here: as far as I can tell the code trails
off at this point into parts that are jumped to in case of error.) In
any case we have an increased usecount.
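
The userland side of the fork()/execve() sequence described above can
be sketched as follows. This is only an illustration of the two calls,
not kernel code; the function name spawn_and_wait() is made up for the
example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: fork a child, replace its image with the program
 * at `path` via execve(), and return the child's exit status.
 * Returns -1 if fork()/waitpid() fails or the child did not exit normally.
 */
int
spawn_and_wait(const char *path, char *const argv[])
{
	pid_t pid;
	int status;

	assert(path != NULL && argv != NULL);

	pid = fork();			/* child starts as a copy of the parent */
	if (pid == -1)
		return -1;

	if (pid == 0) {
		/* Child: execve() only returns if it failed. */
		execve(path, argv, environ);
		_exit(127);
	}

	/* Parent: wait for the child to finish. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with "/bin/sh" and the argument vector { "sh", "-c", "exit 7",
NULL } should hand back the shell's exit status.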

Now we are going to unlink that file and create a new one.

After some basic checks (you can't remove the root of a file system,
for example) unlink() will call VOP_REMOVE(), which calls vrele(), which
decrements the usecount when it's greater than one. In this case it
MUST be, because the xterm process has one count on it and the file
entry has another (hard links to the file may have additional counts on
it).
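
To make the counting argument concrete, here is a toy userland model of
it. It only mirrors the bookkeeping as described in this message (one
count for the file entry, one for the running process); struct
toy_vnode, toy_vref() and toy_vrele() are invented names and this is
not the kernel's actual vnode code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vnode; only the use count is modelled. */
struct toy_vnode {
	int  usecount;	/* plays the role of v_usecount */
	bool recycled;	/* set once nothing references the vnode */
};

/* Take a reference, as vref() does with vp->v_usecount++. */
void
toy_vref(struct toy_vnode *vp)
{
	vp->usecount++;
}

/* Drop a reference; the vnode is recycled only when the count hits 0. */
void
toy_vrele(struct toy_vnode *vp)
{
	assert(vp->usecount > 0);
	if (--vp->usecount == 0)
		vp->recycled = true;
}
```

Starting from a count of one (the file entry), a toy_vref() for the exec
followed by a toy_vrele() for the unlink leaves the count at one and the
vnode alive; only the final toy_vrele() at process exit recycles it.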

Therefore it appears that you can unlink the file and install a new
copy; the old file will remain on the disk to serve the memory mapped
image used by the running process. I'm going to presume that when a
process exits it decrements the usecount for the vnode, which, once it
reaches 0, should put the vnode on the free list.
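
The same behaviour is easy to see from userland with an ordinary open
file instead of a running binary. The sketch below assumes a writable
/tmp; read_after_unlink() and the temporary file name are made up for
the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a temporary file, keep it open, unlink it,
 * then read the data back through the still-open descriptor.
 * Returns the link count reported after unlink(), or -1 on error.
 * On success `buf` holds the bytes read from the unlinked file.
 */
int
read_after_unlink(char *buf, size_t buflen)
{
	char path[] = "/tmp/unlink_demo.XXXXXX";	/* assumed location */
	const char *msg = "old image";
	struct stat sb;
	ssize_t n;
	int fd;

	assert(buf != NULL && buflen > 0);

	fd = mkstemp(path);
	if (fd == -1)
		return -1;
	if (write(fd, msg, strlen(msg)) != (ssize_t)strlen(msg))
		goto fail;
	if (unlink(path) == -1)		/* remove the only directory entry */
		goto fail;
	if (fstat(fd, &sb) == -1)	/* the file still exists for this fd */
		goto fail;
	if (lseek(fd, 0, SEEK_SET) == -1)
		goto fail;
	n = read(fd, buf, buflen - 1);
	if (n < 0)
		goto fail;
	buf[n] = '\0';
	close(fd);			/* last reference gone; space is freed */
	return (int)sb.st_nlink;

fail:
	close(fd);
	unlink(path);			/* harmless if already unlinked */
	return -1;
}
```

After the unlink() the name is gone, but the data stays readable until
the descriptor is closed, which is the situation the running xterm is
in with respect to its old binary.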