Message-ID: <3DA20F40.D3C3FD59@mindspring.com>
Date: Mon, 07 Oct 2002 15:48:32 -0700
From: Terry Lambert
To: Wilko Bulte
Cc: Peter Wemm, Mikhail Teterin, arch@FreeBSD.ORG
Subject: Re: swapon some regular file

Wilko Bulte wrote:
> > > vnconfig/mdconfig work because that basically adds the logical -> physical
> > > translation step.  I'd just as soon not have to mess with this though.
> >
> > It would be useful to be able to ask a file for its list of
> > physical blocks on the underlying device, so that you could
> > sort them into contiguous extents, and then use *those*,
> > instead of eating the translation overhead, each time...
>
> Sounds a bit like VMS (IIRC..)
>
> :)

Yes.  Windows does the same thing, itself, for its own swap files.
If you write an IFS for Windows and do not implement this nominally
optional interface, you will not be able to configure swapping on
the device in question.

I was really more concerned with Peter's point, about getting around
the problem of translation.  One way to do it would be to front-load
the cost, so that you only incur the overhead one time.  The cost of
swapping to a file with a vnconfig'ed device is that you pay the
FS<->block translation penalty on each and every I/O.  Creating this
interface would avoid it.

I haven't really paid a heck of a lot of attention to Poul's version
of the slice code, as far as implementation details go.  The overhead
of translation layers, at least for linear translation, should be
possible to optimize out by way of ordered block lists plus the
underlying device: most of these translations can be completed
statically, once.

Poul said that there was a per-layer cost to using GEOM, which
implies that he doesn't do this: in theory, you should be able to
collapse all references to a single layer, no matter what.  If he did
this, you could do the same thing for a discontiguous aggregate array
of block extents -- which means that they could come from a file, or
stripe sets, or whatever.

The interesting thing in the swap case is that you care more about
the physicality of the blocks anyway: the implication there is that
translation layers are a bad idea, no matter what, if there is more
than one of them.

The other interesting thing in the swap case is that there is not a
"right" order: so long as the blocks are physically contiguous, and
the translation proceeds reversibly, you will get the same data in as
out, even if the logical ordering of the blocks is not maintained at
the upper layer (all you *really* care about is that the data at a
logical offset in is the same as the data at the logical offset out).

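To make the front-loading idea concrete, here is a rough sketch in
Python.  It is not code from GEOM, vnconfig, or any FreeBSD interface;
the names Extent, build_extents, shift_extents and ExtentMap are all
hypothetical, and the block numbers are made up.  It assumes the file
can report its physical block numbers, sorts them into contiguous
extents once, folds a linear layer (such as a slice offset) into the
same table, and then resolves each I/O with a single lookup.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Extent(NamedTuple):
    logical: int   # first logical block covered by this extent
    physical: int  # first physical block on the underlying device
    length: int    # number of contiguous blocks in the extent


def build_extents(physical_blocks: List[int]) -> List[Extent]:
    """Sort a file's physical blocks and merge adjacent ones into extents.

    Logical numbers are assigned in sorted physical order.  For swap that
    is acceptable: only a stable, reversible mapping is needed.
    """
    extents: List[Extent] = []
    logical = 0
    for blk in sorted(set(physical_blocks)):
        if extents and extents[-1].physical + extents[-1].length == blk:
            last = extents[-1]
            extents[-1] = last._replace(length=last.length + 1)
        else:
            extents.append(Extent(logical, blk, 1))
        logical += 1
    return extents


def shift_extents(extents: List[Extent], offset: int) -> List[Extent]:
    """Fold a linear layer (e.g. a slice starting at `offset`) into the table."""
    return [e._replace(physical=e.physical + offset) for e in extents]


class ExtentMap:
    """Collapsed translation table: one lookup per I/O, built once."""

    def __init__(self, extents: List[Extent]) -> None:
        self.extents = list(extents)
        self.starts = [e.logical for e in self.extents]

    def translate(self, logical_block: int) -> int:
        i = bisect_right(self.starts, logical_block) - 1
        if i < 0:
            raise IndexError(logical_block)
        e = self.extents[i]
        if logical_block >= e.logical + e.length:
            raise IndexError(logical_block)
        return e.physical + (logical_block - e.logical)
```

Usage would look something like
ExtentMap(shift_extents(build_extents(blocks), slice_start)): the
per-layer work happens once at configuration time, and each later
translate() call is a binary search over the collapsed table.
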
The only layers you cannot collapse are content translation and
content size change (e.g. a crypto layer using a one-time-pad, or a
compression layer using a fixed compression ratio and block
reallocation to approximate physical contiguity).  At this point,
pretty much everything except the sample AES layer could have its
overhead squeezed out.

-- Terry