Date:      Tue, 12 Nov 2002 23:09:45 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        Daniel O'Connor <doconnor@gsoft.com.au>, Hans Zaunere <zaunere@yahoo.com>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: Shared files within a jail
Message-ID:  <3DD1FAB9.82607C41@mindspring.com>
References:  <20021113034726.75787.qmail@web12801.mail.yahoo.com> <1037159767.66058.34.camel@chowder.localdomain> <200211130530.gAD5UxNt067928@apollo.backplane.com>

Matthew Dillon wrote:
>     Try using null mounts.  The warning is in there because making the
>     null mount code work is a real hack and the authors aren't entirely
>     sure that everything's gotten covered.  That said, use of a null mount
>     is certainly a lot safer if the stuff behind the mount is mostly
>     static.

The problem is in the VM object alias code.  Specifically, the
getpages/putpages VOPs have to be implemented in terms of read/write,
so that there are not two vm_object_t's that refer to the same
data.  There is no "upcall" to notify an upper layer of changes in
a lower layer, so with two objects nothing would guarantee coherency.

This basically means that the "pig tricks" that most people who
don't know any better do, like using both mmap() and file I/O
against the same file, require explicit calls to msync() to
ensure cache coherency.  Most people who write code these days
don't expect to have to call msync, and even if they expect to,
they're not entirely sure of when/why/how to call it.
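
For anyone unsure what the explicit msync() looks like, here is a
minimal sketch, assuming a POSIX system.  The helper name
mmap_write_then_read() and the scratch path are made up for
illustration.  It dirties a page through a shared mapping, calls
msync() with MS_SYNC, and only then reads the same bytes back
through the descriptor.  On a filesystem with a unified buffer
cache the read would likely see the data anyway; the msync() is
what you need when that guarantee is missing.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write through a MAP_SHARED mapping, msync(),
 * then read the same bytes back with ordinary file I/O.
 * Returns 0 if the read sees the mapped write, -1 otherwise.
 */
int mmap_write_then_read(const char *path)
{
    static const char msg[] = "written through the mapping";
    const size_t len = 4096;
    char buf[sizeof(msg)];
    char *map;
    int fd;
    int rc = -1;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The file must be at least as long as the region we touch. */
    if (ftruncate(fd, (off_t)len) != 0)
        goto out_close;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    /* Dirty the page through the mapping. */
    memcpy(map, msg, sizeof(msg));

    /* Push the dirty page out before touching the file with read(). */
    if (msync(map, len, MS_SYNC) != 0)
        goto out_unmap;

    /* Ordinary file I/O against the same file. */
    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto out_unmap;

    if (memcmp(buf, msg, sizeof(msg)) == 0)
        rc = 0;

out_unmap:
    munmap(map, len);
out_close:
    close(fd);
    unlink(path);   /* scratch file, remove it */
    return rc;
}
```

Going the other direction (write() through the descriptor, then
look at the mapping) is the harder case, and this sketch does not
cover it.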

This is the same reason that dropping the getpages/putpages VOPs
from the SMBFS implementation "fixes" the "cp" problem (by making
"cp" work like "dd", by converting the getpages() request into a
read() request, instead).  But doing that introduces the same
cache coherency problems, again.

You can basically ignore this problem entirely, since your mounts
are going to be read-only, and you aren't going to have to worry
about someone dirtying pages through a nullfs mount.


>     Note that you can also use localhost NFS mounts to replicate pieces of
>     filesystems within jails, but you need to remember that the kernel
>     will wind up caching multiple copies of the data for these two cases
>     and that NFS has file locking issues.

Yes.  This will also work, if the man page for nullfs turns out to
be "too scary".  ;^).  Same coherency issues.

>     Finally, keep in mind that disk space these days is quite cheap.
>     Duplicating the data is not as bad a way to go as you might think, and
>     it allows you to incrementally upgrade each jail.  It may suffice to use
>     the null mount trick *only* for the big binaries and libraries that you
>     really want to share, and it may be reasonable to use softlinks to
>     accomplish it, like this:

And, in fact, this is what I tend to do.  But since the case in point
is for MySQL/Apache/etc., there's probably a lot more jail instances
than what you are used to seeing.  This is a shared hosting platform,
which is trying to pretend it's not shared, right?

If you go this route, you may want to bump up the number of inodes
by quite a bit above the default...

-- Terry
