From: Alexander Leidinger
To: delphij@delphij.net
Cc: freebsd-current@FreeBSD.org, Robert Watson, phk@FreeBSD.org, Brian Candler
Date: Sat, 19 Nov 2005 15:42:54 +0100
Subject: Re: Logical volume management
Message-ID: <20051119154254.49627a5a@Magellan.Leidinger.net>
References: <20051118114308.GA11281@uk.tiscali.com> <20051118130223.z87u1gbswwg0kckw@netchild.homeip.net>

On Sat, 19 Nov 2005 01:47:02 +0800 Xin LI wrote:

> On 11/18/05, Alexander
Leidinger wrote:
> > This statement is a little bit simplified, but you're describing a
> > stripped and crippled down version of Solaris ZFS (at least from the
> > functionality point of view).
>
> Sorry for being more or less OT: Is there some ZFS whitepaper or
> in-depth description available? I think some of their concepts are
> very attractive and may worthy to see whether we should do some
> similar...

With a quick search on sun.com I found:
  http://www.sun.com/software/solaris/zfs.jsp
  http://www.sun.com/emrkt/campaign_docs/expertexchange/knowledge/solaris_zfs.html

I've read about it in a study guide, which is more straightforward from
an administration POV than these pages.

What I like about it, besides the reliability part (checksums and the
like, have a look at the above URLs), is the built-in growfs/shrinkfs
part:

From an administration point of view you assign some raw storage space
to a pool. A pool is a source of storage space for the FS the user
sees. Such a pool may be mirrored, striped, whatever. But this RAID part
is under the control of ZFS. It decides on its own where to mirror
(unlike normal mirroring, not every block is on every disk or in the
same location on multiple disks; it's possible that one block of data
is on disc1, LBA 5 and on disc3, LBA 23565) and how the storage is used
(striping, mirroring). As soon as you add new storage, it gets used.

To limit the size a particular mountpoint can occupy, you define an
FS-quota (this is distinct from user quotas). If you need more space
there, you just increase the FS-quota (or decrease it, if you assigned
too much there). There is no need to newfs in the sense of our current
newfs implementation; you still have to create a ... let's call it
namespace, which is allowed to use a portion of a pool. Then you mount
this namespace somewhere in your directory tree.

So it abstracts the volume management: you just have to say "use the
raw storage space from A, B and C" and it does the rest.
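To make the administrative model above a bit more concrete, here is a
toy sketch in Python. It only does the bookkeeping for a pool, its
namespaces and their FS-quotas; it does not model mirroring, striping or
block placement. All names (Pool, add_device, create_fs, set_quota,
write) are made up for illustration and are not the real ZFS interface.
Two behaviours are assumptions of the sketch, not statements about ZFS:
quotas may add up to more than the pool holds, and a quota cannot be
shrunk below the space already used.

```python
# Toy model of the pool / FS-quota idea described above.
# All names here are hypothetical and are NOT the real ZFS interface.

class Pool:
    def __init__(self):
        self.capacity = 0      # total raw space assigned to the pool
        self.used = 0          # space consumed by all namespaces together
        self.filesystems = {}  # name -> {"quota": int or None, "used": int}

    def add_device(self, size):
        """Assign more raw storage; it is usable immediately."""
        self.capacity += size

    def create_fs(self, name, quota=None):
        """Create a namespace that draws from the pool (no newfs step)."""
        self.filesystems[name] = {"quota": quota, "used": 0}

    def set_quota(self, name, quota):
        """Grow or shrink the FS-quota (assumption: never below usage)."""
        fs = self.filesystems[name]
        if quota is not None and quota < fs["used"]:
            raise ValueError("quota is below the space already used")
        fs["quota"] = quota

    def write(self, name, size):
        """Account for `size` units written to namespace `name`."""
        fs = self.filesystems[name]
        if fs["quota"] is not None and fs["used"] + size > fs["quota"]:
            raise OSError("FS-quota exceeded for " + name)
        if self.used + size > self.capacity:
            raise OSError("pool is out of space")
        fs["used"] += size
        self.used += size


pool = Pool()
pool.add_device(100)              # raw storage A
pool.create_fs("home", quota=40)
pool.create_fs("var", quota=80)   # assumption: quotas may overcommit
pool.write("home", 40)            # fills the home quota exactly
pool.set_quota("home", 60)        # "growfs": just raise the quota
pool.write("home", 10)
pool.add_device(50)               # raw storage B gets used right away
pool.write("var", 80)             # fits only because of the added device
```

The point of the sketch is that growing or shrinking a filesystem is
just a change to one number, and that new raw storage benefits every
namespace in the pool without touching any of them.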

ZFS also extends the quota methodology to cover not only users, but the
FS on raw storage itself.

To address the concerns of Poul-Henning: both of those features could be
implemented without needing each other (more or less... I don't have an
idea how one would share the same pool between multiple FS-namespaces in
such an orthogonal scheme, but with a little bit of thinking someone may
find a solution); you just need some more administrative commands to do
the work. You also need a nice framework in the kernel which allows
propagating some information from one layer to another (e.g. a size
change).

The combination of both into one "black box" seems easier to me: you
don't have to change the ABI/API in the kernel (if you don't have the
necessary infrastructure already).

Bye,
Alexander.

-- 
0 and 1. Now what could be so hard about that?
http://www.Leidinger.net    Alexander @ Leidinger.net
  GPG fingerprint = C518 BC70 E67F 143F BE91  3365 79E2 9C60 B006 3FE7