FreeBSD Mail Archives

Date:      Fri, 8 Jan 1999 00:44:47 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        dillon@apollo.backplane.com (Matthew Dillon)
Cc:        tlambert@primenet.com, dyson@iquest.net, pfgiffun@bachue.usc.unal.edu.co, freebsd-hackers@FreeBSD.ORG
Subject:   Re: questions/problems with vm_fault() in Stable
Message-ID:  <199901080044.RAA22486@usr01.primenet.com>
In-Reply-To: <199901072306.PAA35328@apollo.backplane.com> from "Matthew Dillon" at Jan 7, 99 03:06:21 pm

OK, the MFS stuff first.

> :>     on a soft block.  For example, UFS/FFS was never designed to terminate
> :>     on memory, much less swap-backed memory.  Then came along MFS and
> :>     suddenly there were (and still are) all sorts of problems.
> :
> :I'd argue that MFS is an inappropriate use of the UFS code, since the
> :UFS code doesn't acknowledge the idea of shrinking block-backing
> :objects, and barely (with severe fragmentation based degradation)
> :recognizes growing block-backing objects.
> 
>     I have no idea what you are talking about here.

I'm saying that the MFS problems you are noting are artifacts of
the implementation, not the idea, and that if you go back to
first principles, and correctly implement the idea, then you
won't have those problems.

MFS can't give up unused metadata pages because it's implemented on
top of something that thinks that, once something has been used, it's
dirty, and it has to have backing store forever afterward.

The MFS problems dealing with metadata and cylinder group allocation
(basically layout policy) are an artifact of the policy.  MFS can't
support dropping a cylinder group worth of metadata reference because
FFS/UFS can't support that, because, quite simply, they weren't
designed with the idea that the underlying backing store could change
size.

You can grow an MFS in cylinder group units, at the cost of unequal
fragmentation bias on the existing (more used than the new unused
area) cylinder groups.  This works because it works in FFS, too.
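
A minimal sketch of that growth bias, under a loud assumption: the
"most free blocks wins" policy below is a toy stand-in, not the real
FFS allocator, and the names pick_group and allocate are hypothetical.
It only shows that once an empty cylinder group is appended, new
allocations stop being spread evenly across the groups.

```python
def pick_group(free):
    """Toy policy (assumption, not FFS): pick the group with most free blocks."""
    return max(range(len(free)), key=lambda g: free[g])


def allocate(free, nblocks):
    """Place nblocks one at a time; return how many landed in each group."""
    placed = [0] * len(free)
    for _ in range(nblocks):
        g = pick_group(free)
        if free[g] == 0:
            raise RuntimeError("filesystem full")
        free[g] -= 1
        placed[g] += 1
    return placed


# Two old, mostly full groups; "grow" by appending one empty group.
free = [10, 10]
free.append(100)
placed = allocate(free, 50)
print(placed)
```

In this toy model all 50 new blocks land in the appended group, and
the two old groups stay exactly as full as they were.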


MFS itself is a piece of crap when it comes to juggling backing
store requirements because FFS/UFS is a piece of crap under the
same circumstances, and MFS is a derivative.


>     its idea of a fixed block device, just like UFS.  A VOP
>     mechanism already exists to handle block freeing and, in
>     fact, after the 15th, MFS will start to use it.... file
>     fragments and meta-data will still get unnecessary swap
>     assignments, but full file blocks will not.  This is a
>     major improvement and it took me all of 40 minutes to add
>     the support for it.

This is *not* a major improvement.  It's a trivial improvement
which does nothing to address the issue of fragmentation.  The
FFS/UFS combination on a fixed backing store is relatively immune
to fragmentation because of the way the backing store is used via
what is, in effect, a statistical hash of blocks into the available
space for blocks.

The recovery mechanism you outline deals with breaking pages back
to the system for reuse, but *aggravates* the fragmentation issue
to an almost unholy level, which just gets worse if you try and add
cylinder groups to "grow" the MFS.


The *only* solution to "the MFS problem" is an FS architecture
that *expects* the underlying block store to break on page
boundaries, and either self-defragments or otherwise is made
fragmentation immune.  Probably the most correct way to do this
is to place the memory consumed by the FS into a separate address
space that isn't the kernel's address space, such that the kernel
can feel free to relocate its pages out from under it to
transparently defragment it.
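
A minimal sketch of the relocation idea, assuming a simple logical
page to physical frame map stands in for the separate address space.
PagedStore, page_map and compact are hypothetical names for
illustration, not anything in the FreeBSD VM. The point is only that
the FS addresses logical pages, so the frames can be moved underneath
it without the FS noticing.

```python
FREE = object()  # sentinel marking an unused physical frame


class PagedStore:
    """Toy page-granular store: the FS sees logical pages, never frames."""

    def __init__(self, nframes):
        self.frames = [FREE] * nframes  # physical frames
        self.page_map = {}              # logical page -> frame index

    def write(self, page, data):
        if page not in self.page_map:
            # First free frame; raises ValueError if the store is full.
            self.page_map[page] = self.frames.index(FREE)
        self.frames[self.page_map[page]] = data

    def read(self, page):
        return self.frames[self.page_map[page]]

    def release(self, page):
        """Give a page back; its frame becomes reusable."""
        self.frames[self.page_map.pop(page)] = FREE

    def compact(self):
        """Move live frames to the front; logical page numbers are unchanged."""
        live = sorted(self.page_map.items(), key=lambda kv: kv[1])
        new_frames = [FREE] * len(self.frames)
        for new_idx, (page, old_idx) in enumerate(live):
            new_frames[new_idx] = self.frames[old_idx]
            self.page_map[page] = new_idx
        self.frames = new_frames


store = PagedStore(4)
for page, data in enumerate("abcd"):
    store.write(page, data)
store.release(1)
store.release(2)
store.compact()
print(store.read(0), store.read(3), store.page_map)
```

After compact(), logical page 3 still reads back the same data even
though its frame moved from index 3 to index 1.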


Note:

This means that there is an interface separation between a file
as a device, and a VM using the device as a backing store.  This
incidentally means you exit and reenter the VM on either side of
a VFS whose files are being used as a swap store (which means that
there is a procedural interface instead of a VM alias in order to
ensure coherency between the VM object consuming the file and the
VM object backing the file on the underlying device).



					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199901080044.RAA22486>
