Date:      Sun, 15 Feb 2004 13:18:59 -0500 (EST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Melvyn Sopacua <freebsd-current@webteckies.org>
Cc:        current@FreeBSD.org
Subject:   Re: Jails that keep hanging around
Message-ID:  <Pine.NEB.3.96L.1040215130356.56481G-100000@fledge.watson.org>
In-Reply-To: <200402151714.26631.freebsd-current@webteckies.org>

On Sun, 15 Feb 2004, Melvyn Sopacua wrote:

> I have yet to figure out what triggers the bug, but I end up with
> 'running' jails, without any processes. So I thought I'd create 'jld' to
> remove a jail. However - prison_find isn't exported to userland.
> Probably for good reason.

Jails are reference-counted objects hung off of process credentials, which
are also reference-counted objects.  So a jail can't evaporate until the
last credential referencing that object also evaporates.  So you may be
dealing with one of two things:

(1) A reference leak in the jail or credential code.

(2) A reference that hasn't gone away for some legitimate (but obscure)
    reason.
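As a rough illustration of the reference counting described above, here is a
minimal userland sketch in C.  It is not the kernel code: struct model_prison,
struct model_cred, and every function name below are hypothetical stand-ins
for the real jail and credential structures, and the freed_flag field exists
only so the demonstration can observe when the jail goes away.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a jail: NOT the kernel's real structure. */
struct model_prison {
	int refs;		/* number of holders, mostly credentials */
	int *freed_flag;	/* set to 1 when the jail is finally freed */
};

/* Hypothetical model of a credential that points at a jail. */
struct model_cred {
	int refs;			/* processes, sockets, files, ... */
	struct model_prison *prison;	/* holds one reference on the jail */
};

/* Create a jail with a single reference owned by the caller. */
struct model_prison *
model_prison_new(int *freed_flag)
{
	struct model_prison *pr = malloc(sizeof(*pr));

	if (pr == NULL)
		return (NULL);
	pr->refs = 1;
	pr->freed_flag = freed_flag;
	*freed_flag = 0;
	return (pr);
}

/* Drop one jail reference; the jail is freed when the last one goes. */
void
model_prison_rele(struct model_prison *pr)
{
	assert(pr->refs > 0);
	if (--pr->refs == 0) {
		*pr->freed_flag = 1;
		free(pr);
	}
}

/* Create a credential in the jail; it takes its own jail reference. */
struct model_cred *
model_cred_new(struct model_prison *pr)
{
	struct model_cred *cr = malloc(sizeof(*cr));

	if (cr == NULL)
		return (NULL);
	cr->refs = 1;
	cr->prison = pr;
	pr->refs++;
	return (cr);
}

/* Another object (say, a socket) caches the credential. */
void
model_cred_hold(struct model_cred *cr)
{
	cr->refs++;
}

/* Drop one credential reference; the last drop releases the jail. */
void
model_cred_rele(struct model_cred *cr)
{
	assert(cr->refs > 0);
	if (--cr->refs == 0) {
		model_prison_rele(cr->prison);
		free(cr);
	}
}
```

In this model, if a process credential is created in the jail, something
else calls model_cred_hold() on it, and the process then exits (one
model_cred_rele()), the jail is still allocated.  It is only freed once the
second holder also calls model_cred_rele().  A missing release anywhere in
that chain is case (1); a holder that simply hasn't finished yet is case (2).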

Here are some places you might have credentials that "hang on" -- i.e.,
other kernel structures that cache references to credentials for one
reason or another:

struct buf	

struct file	Open file descriptors cache the credential of the process
		that created them.  If you pass a file descriptor out of a
		jail using a UNIX domain socket, then the jail will remain
		referenced until that descriptor is finally closed.

struct mount	When a file system is mounted, the mount structure
		describing the file system caches the credential of the
		process that performed the mount.  Since this can't be done in a
		jail, not likely the problem. 

struct sigio	When sigio (signal generation on I/O readiness) is
		configured for sockets or other objects, the credential of
		the process setting up sigio is cached, so as to authorize the
		later signal delivery.

struct socket	Open sockets also cache the credential of the process that
		created them.  If a socket is passed out, or referenced by
		another part of the kernel, the jail it is attached to will
		continue to exist until the socket is closed. 

kernel accounting	The kernel accounting code caches the credential
			of the process that turns on accounting. Since
			accounting can't be turned on in a jail, not
			likely the problem.

kernel alq	When tracing to disk is enabled in the kernel, the
		credential used to open the target file is cached for use
		in later I/O.

UFS quotas, attributes	When the files that hold UFS meta-data are set up,
			the credential used for setup is cached for later
			I/O.

struct tcpcb	TCP connections cache the socket credential during time
		wait so that IPFW-related uid and gid checks can be
		performed even once the socket has been released. 


So it seems there are generally two ways a jail might continue to be
referenced: a service is set up caching a credential, or an object is set
up caching a credential.  In common practice, neither of these prevents a
jail from evaporating: most services using cached credentials can't be set
up from jails, and most objects that cache credentials are only referenced
from processes in the jail, so when the processes all exit, the cached
credentials also evaporate.  The only real exception to this is the tcpcb
-- TCP connection remnants can last for quite a while after their socket
exits, since they follow the TCP state machine (which has long waits).  So
this could be it -- check netstat and see if there are any largely closed
TCP sessions (for example, in time wait) from the jails.  FYI, this is not
a complete list of
credential references, but it accounts for most of them.

Are you using any services that pass references to sockets or other file
descriptor objects in and out of the jail using UNIX domain sockets?  If
so, that could also be it.
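For reference, this is roughly what such descriptor passing looks like from
userland, using the standard SCM_RIGHTS control message on a UNIX domain
socket.  It is only a sketch: send_fd() and recv_fd() are hypothetical helper
names, and each message carries exactly one descriptor plus one dummy data
byte.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Control buffer sized and aligned for one descriptor. */
union fd_ctl {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over UNIX domain socket sock; 0 on success, -1 on error. */
int
send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor from sock; returns the new descriptor or -1. */
int
recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return (fd);
}
```

The descriptor returned by recv_fd() refers to the same open file or socket
as the one that was sent.  So if the sender was inside a jail and the
receiver is outside it, the object (and the credential it caches) stays
referenced until the receiver closes its copy as well.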

> Should I worry about these jails or is it harmless: 

It's probably harmless unless there's a leak.  Jails are fairly
light-weight objects, so if it takes a little longer to GC due to TCP,
it's OK.  On the other hand, if there's a leak, that's very bad; likewise,
if you have an application passing credentials in and out of the jail
(e.g., a jail management tool), it may need to be modified slightly so
that it releases the credentials sooner after the jail exits.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research


