Date: Fri, 7 Jan 2005 13:15:59 -0500
From: M <m@obmail.net>
To: Danny MacMillan <flowers@users.sourceforge.net>
Cc: FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: Remote upgrade possible?
Message-ID: <2EBCB4AD-60D8-11D9-B88F-00039367611E@obmail.net>
In-Reply-To: <20050107173333.GA865@procyon.nekulturny.org>
References: <BE030CE7.15722%joe@jwebmedia.com> <41DDB2A7.8020001@wilderness.dyn.dhs.org> <41DE0F6F.3040303@taborandtashell.net> <1105100701.640.6.camel@chaucer> <20050107173333.GA865@procyon.nekulturny.org>
On Jan 7, 2005, at 12:33 PM, Danny MacMillan wrote:

> I haven't looked at the code, but your assertion is extremely unlikely.
> I really want to say "impossible" but as I said, I haven't looked at
> the code. If FreeBSD loaded entire executable images into RAM when
> starting new processes, it would perform very poorly. What is more
> likely is that the kernel keeps the image file open during program
> execution. When the xterm binary is replaced, the old binary is still
> on disk in its old location, it just doesn't have any directory
> entries pointing to it. Since the kernel still has the file open it
> won't be overwritten. Hence the kernel can and will still load
> pages from the old image. This is a function of the same behaviour
> that causes df and du output to differ in some cases.
>
> The lsof(8) utility seems to bear this out, as each process seems to
> keep each image (program and shared object files) open during
> execution.
>
> A new instance of xterm would use the new, upgraded binary.
>

When you run a program, the program that launches it first makes a copy of itself in the process table, and the two share code pages. This is done through fork(). The new process, called the child, then calls one of the exec() family of functions, all of which end up in a single syscall, execve().

execve() uses namei() to get the vnode pointer. Each vnode has three reference counts: v_usecount, v_holdcnt and v_writecount. A vnode is not recycled until both the usecount and the holdcnt are 0. When namei() is called it calls VREF(), which is vref(), which does vp->v_usecount++; so the vnode of a running program can't be recycled, starting from a point in time before the program is actually loaded into memory.

execve() calls exec_map_first_page(). Without tearing this apart, I'm going to guess that this memory maps the first page of text (code) through the VM subsystem, as evidenced by the conspicuous calls to vm_page*() functions, so I'd conclude the file is memory mapped.
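To make the fork()/execve() sequence above concrete, here is a minimal userland sketch. It only shows the calling side, not the kernel internals discussed in this message. run_program() is a made-up helper name for illustration, not anything from the FreeBSD source.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * run_program() is a hypothetical helper, not a FreeBSD API.
 * It forks, has the child replace itself through execve(), and
 * returns the child's exit status to the parent (-1 on failure).
 */
int
run_program(const char *path, char *const argv[])
{
	pid_t pid;
	int status;

	assert(path != NULL && argv != NULL);

	/* fork() makes the copy of the caller in the process table. */
	pid = fork();
	if (pid == -1)
		return (-1);

	if (pid == 0) {
		/* Child: swap in the new program image. */
		execve(path, argv, environ);
		_exit(127);	/* reached only if execve() failed */
	}

	/* Parent: carries on, here by waiting for the child to exit. */
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Calling it with "/bin/sh" and the arguments "sh", "-c", "exit 7" should hand back the shell's exit status, assuming /bin/sh exists on the system.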
Presuming it turns out the command you're calling isn't a shell script or other script, execve() cleans up the environment so file descriptors and signal handlers don't get shared, sets up the process's environment, lets the calling (forking) process know it can continue on its merry way, sets uid/gid if necessary/possible, and it looks like the scheduler takes care of the rest (I'll be honest here, so far as I can tell the code trails off at this point into parts that are jumped to in case of error). In any case we have an increased usecount.

Now we are going to unlink that file and create a new one. After some basic checks (you can't remove the root of a file system, for example) unlink() will call VOP_REMOVE(), which calls vrele(), which decrements the usecount when it's greater than one, which in this case it MUST be because the xterm process has one count on it and the file entry has another (hard links to the file may have additional counts on it). Therefore it appears that you can unlink the file and install a new copy, and the old file will remain on the disk to serve the memory mapped image used by the running process. I'm going to presume that when a process exits it decrements the usecount for the vnode, which, when it reaches 0, should put the vnode on the free list.
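The unlink-and-replace behaviour can also be checked from userland without reading kernel code. The sketch below uses an ordinary open file descriptor as a stand-in for the reference a running program holds on its image, which is an assumption on my part, not the exact mechanism execve() uses. old_data_survives_unlink() is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * old_data_survives_unlink() is a hypothetical helper. It returns 1 if
 * a descriptor opened before unlink() still reads the old contents
 * after a new file has been installed under the same name, 0 if it
 * does not, and -1 on a setup error.
 */
int
old_data_survives_unlink(const char *path)
{
	static const char old_data[] = "old image";
	static const char new_data[] = "new image";
	char buf[sizeof(old_data)];
	int oldfd, newfd, ok;

	assert(path != NULL);

	/* "Install" the old file and keep it open, like a running program. */
	oldfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (oldfd == -1)
		return (-1);
	if (write(oldfd, old_data, sizeof(old_data)) !=
	    (ssize_t)sizeof(old_data) || unlink(path) == -1) {
		close(oldfd);
		return (-1);
	}

	/* The "upgrade": the name is gone, so create a new file under it. */
	newfd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (newfd == -1) {
		close(oldfd);
		return (-1);
	}
	if (write(newfd, new_data, sizeof(new_data)) !=
	    (ssize_t)sizeof(new_data)) {
		close(newfd);
		close(oldfd);
		unlink(path);
		return (-1);
	}
	close(newfd);

	/* The old descriptor should still see the old contents. */
	ok = lseek(oldfd, 0, SEEK_SET) == 0 &&
	    read(oldfd, buf, sizeof(buf)) == (ssize_t)sizeof(buf) &&
	    memcmp(buf, old_data, sizeof(old_data)) == 0;

	close(oldfd);	/* last reference dropped; old file can be reclaimed */
	unlink(path);	/* remove the new file as well */
	return (ok);
}
```

The path passed in should be a scratch file you don't mind losing, since the helper creates and removes it.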