Date: Sat, 09 Jun 2001 10:14:49 -0700
From: Terry Lambert <tlambert2@mindspring.com>
To: Peter Wemm <peter@wemm.org>
Cc: "Patrick W. Penzias Dirks" <pwd@apple.com>, FreeBSD-FS@FreeBSD.ORG, FreeBSD-Arch@FreeBSD.ORG
Subject: Re: Support for pivot_root-like system call?
Message-ID: <3B225989.6E2110@mindspring.com>
References: <20010608153011.D7AF1380C@overcee.netplex.com.au>

Peter Wemm wrote:
> Terry, the 'cache coherency' bugs have been fixed in -current for ~8
> months now (September 2000). The infrastructure changes for this are
> subject to a call-for-review right now for a merge to 4.x.

Peter, the 'cache coherency' bugs have only been fixed for the trivial case where the pages at the top are identical to the pages at the bottom of a vnode stack.

In the case of a transforming stack, the "final VP" that should be returned is actually an intermediate VP, and you need to take a write fault in the putpages in order to do the correct layer boundary transition.

As a simple case, consider an FS that at the top presents one page, but at the bottom presents two pages from which that one page is derived. This could be a cryptographic FS, or it could be an FS that converts from ISO-8859-1 to ISO-10646, etc. The point is that the page contents will undergo a transformation.

The problem is the same one that the explicit cache coherency code addressed in the historical "nullfs" workaround, which resolved the getpages/putpages in one layer by calling the read/write function in the underlying layer. That workaround was a kludge.

When FreeBSD moved to a unified VM and buffer cache, it erroneously removed the "hint points" at which an explicit coherency call would occur to synchronize the VM and buffer cache views of an object. This is precisely the code that is needed to synchronize a vm_object_t with the backing vm_object_t after a transformation.

What FreeBSD has now will work for about 1/4 of the proposed uses of stacking FS layers. It will _NOT_ work for most of the interesting uses of a stacking architecture, which involve MUX'es (e.g. "translucent FS" for "writing" to a CDROM) or "proxy FS" for debugging your FS code in user space, etc. I have around 16 examples where the current code still fails.

Really, you want to define the actual device I/O in terms of a "disk block FS".

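As a rough illustration of the one-page-over-two-pages case above, here is a toy user-space model in Python. It is not FreeBSD VFS or VM code and does not reproduce the getpages/putpages interfaces; every name in it (PAGE_SIZE, LowerObject, TransformingLayer, getpage, putpage, invalidate) is hypothetical. It assumes the lower object stores each ISO-8859-1 character as a two-byte big-endian code unit, so one upper page is derived from two lower pages, and it shows why an explicit coherency call is needed when the lower object changes underneath a cached upper page.

```python
PAGE_SIZE = 8  # deliberately tiny for the example; not a real page size


class LowerObject:
    """Hypothetical backing object: pages of 2-byte big-endian code units."""

    def __init__(self, npages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(npages)]


class TransformingLayer:
    """Hypothetical upper layer: one ISO-8859-1 page over two lower pages."""

    def __init__(self, lower):
        self.lower = lower
        self.cache = {}  # upper page index -> cached upper page contents

    def getpage(self, idx):
        # Fill the upper page by transforming the two lower pages it is
        # derived from; afterwards, serve it from the upper layer's cache.
        if idx not in self.cache:
            raw = bytes(self.lower.pages[2 * idx] + self.lower.pages[2 * idx + 1])
            self.cache[idx] = bytearray(raw.decode("utf-16-be").encode("latin-1"))
        return self.cache[idx]

    def putpage(self, idx, data):
        # Writing one upper page crosses the layer boundary: the contents
        # are transformed and land in two lower pages.
        if len(data) != PAGE_SIZE:
            raise ValueError("upper page must be exactly PAGE_SIZE bytes")
        self.cache[idx] = bytearray(data)
        wide = bytes(data).decode("latin-1").encode("utf-16-be")
        self.lower.pages[2 * idx][:] = wide[:PAGE_SIZE]
        self.lower.pages[2 * idx + 1][:] = wide[PAGE_SIZE:]

    def invalidate(self, idx):
        # Explicit coherency call: drop the cached upper page so the next
        # getpage() re-derives it from the lower pages.
        self.cache.pop(idx, None)


if __name__ == "__main__":
    lower = LowerObject(2)
    top = TransformingLayer(lower)
    top.putpage(0, b"caf\xe9 ok!")
    lower.pages[0][1] = ord("C")   # lower object modified behind the upper layer
    print(bytes(top.getpage(0)))   # stale view from the upper cache
    top.invalidate(0)
    print(bytes(top.getpage(0)))   # re-derived from the two lower pages
```

In this model the upper and lower pages are never byte-identical, so sharing pages between the layers cannot keep them coherent; something has to run the transformation at the boundary and something has to tell the upper layer when its derived page is stale.
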
The thing that most people apparently fail to "get" is that there is a significant difference between a stacking layer and a local media FS. A local media FS has to interact with the VM system (or buffer cache, or whatever). We need to couch UFS in terms of talking to a local media FS on top of which it is stacked.

Unfortunately, it seems that most people are unwilling to actually email John Heidemann, the architect of the stacking code in 4.4BSD (and therefore FreeBSD).

Frankly, it really pisses me off that this stuff doesn't work in FreeBSD.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message