Date: Fri, 8 Jan 1999 00:44:47 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: dillon@apollo.backplane.com (Matthew Dillon)
Cc: tlambert@primenet.com, dyson@iquest.net, pfgiffun@bachue.usc.unal.edu.co, freebsd-hackers@FreeBSD.ORG
Subject: Re: questions/problems with vm_fault() in Stable
Message-ID: <199901080044.RAA22486@usr01.primenet.com>
In-Reply-To: <199901072306.PAA35328@apollo.backplane.com> from "Matthew Dillon" at Jan 7, 99 03:06:21 pm

OK, the MFS stuff first.

> :> on a soft block.  For example, UFS/FFS was never designed to terminate
> :> on memory, much less swap-backed memory.  Then came along MFS and
> :> suddenly were (and still are) all sorts of problems.
> :
> :I'd argue that MFS is an inappropriate use of the UFS code, since the
> :UFS code doesn't acknowledge the idea of shrinking block-backing
> :objects, and barely (with severe fragmentation based degradation)
> :recognizes growing block-backing objects.
>
>     I have no idea what you are talking about here.

I'm saying that the MFS problems you are noting are artifacts of
the implementation, not the idea, and that if you go back to first
principles, and correctly implement the idea, then you won't have
those problems.

MFS can't give up unused metadata pages because it's implemented
on top of something that thinks that, once something has been
used, it's dirty, and it has to have backing store forever
afterward.

The MFS problems dealing with metadata and cylinder group
allocation (basically layout policy) are an artifact of the
policy.

MFS can't support dropping a cylinder group worth of metadata
reference because FFS/UFS can't support that, because, quite
simply, they weren't designed with the idea that the underlying
backing store could change size.

You can grow an MFS in cylinder group units, at the cost of
unequal fragmentation bias on the existing (more used than the
new unused area) cylinder groups.  This works because it works
in FFS, too.

MFS itself is a piece of crap when it comes to juggling backing
store requirements because FFS/UFS is a piece of crap under the
same circumstances, and MFS is a derivative.

>     its idea of a fixed block device, just like UFS.  A VOP
>     mechanism already exists to handle block freeing and, in
>     fact, after the 15th, MFS will start to use it.... file
>     fragments and meta-data will still get unnecessary swap
>     assignments, but full file blocks will not.  This is a
>     major improvement and it took me all of 40 minutes to add
>     the support for it.

This is *not* a major improvement.  It's a trivial improvement
which does nothing to address the issue of fragmentation.

The FFS/UFS combination on a fixed backing store is relatively
immune to fragmentation because of the way the backing store
is used via what is, in effect, a statistical hash of blocks
into the available space for blocks.

The recovery mechanism you outline deals with handing pages
back to the system for reuse, but *aggravates* the fragmentation
issue to an almost unholy level, which just gets worse if you
try and add cylinder groups to "grow" the MFS.

The *only* solution to "the MFS problem" is an FS architecture
that *expects* the underlying block store to break on page
boundaries, and either self-defragments or otherwise is made
fragmentation immune.

Probably the most correct way to do this is to place the memory
consumed by the FS into a separate address space that isn't
the kernel's address space, such that the kernel can feel free
to relocate its pages out from under it to transparently
defragment it.

Note: This means that there is an interface separation between
a file as a device, and a VM using the device as a backing store.
This incidentally means you exit and reenter the VM on either
side of a VFS whose files are being used as a swap store (which
means that there is a procedural interface instead of a VM alias
in order to ensure coherency between the VM object consuming the
file and the VM object backing the file on the underlying device).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message