To: "Greg 'groggy' Lehey" <grog@freebsd.org>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Wed, 07 Jan 2004 16:52:52 +1030."
             <20040107062252.GQ7617@wantadilla.lemis.com> 
Date: Wed, 07 Jan 2004 08:49:23 +0100
Message-ID: <25392.1073461763@critter.freebsd.dk>
cc: FreeBSD Architecture Mailing List <arch@freebsd.org>
Subject: Re: Vinum and GEOM: the future 

In message <20040107062252.GQ7617@wantadilla.lemis.com>, "Greg 'groggy' Lehey" 
writes:

>It's also very clear that GEOM offers significant advantages in this
>area (but also more room for users to shoot themselves in the foot;

I think it is important to keep a clear distinction between "GEOM" (the
infrastructure component) on one side and the GEOM classes on the other.

The intent from the start was that all policy would live in the
classes, not in the infrastructure code, and as a result, you can
seriously penetrate your feet with a badly thought-out or badly
implemented class.

On the other hand, it is also possible to write classes in a way that
prevents such footshooting; at least that's the verdict so far.

>the quote above continues: "the only restriction is that cycles in the
>graph will not be allowed.").  The question I have is: what other
>advantages does it offer?

Parents praising kids is a dreadful thing to listen to, but as I
see it, the main advantage is to give us the necessary infrastructure
to do all the weird things we want to do, without running into problems
with recursion, kernel-stack overruns and needless code duplication.
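
To make the recursion point concrete, here is a toy Python sketch (not
GEOM code, and not a claim about how GEOM actually schedules its
requests): each layer hands the request to a work queue instead of
calling the layer below it, so the call stack stays flat however deep
the stacking goes.  The name run_stack and the layer count are made up
for the illustration.

```python
from collections import deque


def run_stack(depth, payload):
    """Pass a request down `depth` pass-through layers via a work queue.

    Returns the request as it arrives at the bottom, plus the largest
    number of requests that were pending at any one time.
    """
    queue = deque([(0, payload)])          # entries are (layer index, request)
    max_pending = 0
    while queue:
        max_pending = max(max_pending, len(queue))
        layer, request = queue.popleft()
        if layer == depth:
            return request, max_pending    # reached the bottom "disk"
        # Hand off to the next layer instead of calling into it.
        queue.append((layer + 1, request))


result, pending = run_stack(100_000, b"io")
```

A hundred thousand nested function calls would go well past Python's
default recursion limit; the queued version never has more than one
request pending.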

Infrastructure is always hard to argue for in advance, but once in place
it rapidly becomes nearly invisible and after a while people start to
generalize from it.

Before GEOM, nobody ever asked me to be able to encrypt only one copy
of a mirror ("We take that disk home for the night") because obviously
mirroring was something you did with CCD or Vinum, and neither did
anything like encryption.  Now with GEOM you should see some of the
requests I get...
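
As a toy illustration of that kind of request (userland Python,
nothing to do with the real GEOM API; the class names Disk, XorCrypt
and Mirror are invented here, and XOR is only a stand-in for real
encryption), stacking an encryption layer under just one leg of a
mirror could look like this:

```python
class Disk:
    """Toy block store: a bytearray addressed by byte offset."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, buf):
        self.data[offset:offset + len(buf)] = buf

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])


class XorCrypt:
    """Stand-in for an encryption layer (XOR is NOT real encryption)."""

    def __init__(self, lower, key):
        self.lower, self.key = lower, key

    def _xor(self, buf):
        return bytes(b ^ self.key for b in buf)

    def write(self, offset, buf):
        self.lower.write(offset, self._xor(buf))

    def read(self, offset, length):
        return self._xor(self.lower.read(offset, length))


class Mirror:
    """Writes go to every leg; reads are served from the first leg."""

    def __init__(self, *legs):
        self.legs = legs

    def write(self, offset, buf):
        for leg in self.legs:
            leg.write(offset, buf)

    def read(self, offset, length):
        return self.legs[0].read(offset, length)


office_disk = Disk(16)
take_home_disk = Disk(16)
# Only the take-home leg sits behind the "encryption" layer.
mirror = Mirror(office_disk, XorCrypt(take_home_disk, key=0x5A))
mirror.write(0, b"secret")
```

The mirror neither knows nor cares that one of its legs is scrambled;
that is the whole point of keeping the transformations separate.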

>(http://lca2004.linux.org.au/, in case you're interested, and yes,
>they specifically asked for a paper about Vinum.  Go figure), and I've
>come up with the following list of Vinum features:
>
>  [...]
>
>Interestingly, none of these touch GEOM as far as I can see.  Am I
>missing something?

Yes and no.  None of these are GEOM's jobs, they are all stuff which
the GEOM classes should do.  (Of course GEOM should make it as easy
as possible and offer sensible libraries etc).

If you put all of Vinum into one GEOM class, like I did with CCD,
then you would basically have the same situation as before GEOM,
except that a lot of the magic code you had to write for Vinum can
be replaced by facilities GEOM offers as standard.  The entire
auto-discovery thing, for instance.

>Based on this understanding, my intentions for Vinum currently don't
>go beyond replacing the following:
>
>- Replace the objects volume, plex and subdisk with a corresponding
>  geom.  I expect this to enable a more arbitrary means of joining
>  together the objects, but that's about all.
>- Replace the ioctls with gctl_s.  This seems to be more cosmetic than
>  functional, though also a good idea.

Well, as I've said before, I would really suggest you just start
out by making Vinum a single GEOM class, where you use consumers
instead of calling devsw()->strategy() to access the disks Vinum
lives on, and offer providers instead of dev_t's for access to the
Vinum entities you expose (volumes, plexes, etc.).
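
A rough sketch of that shape, as a toy Python model rather than kernel
code: the names Provider, DiskProvider, Consumer and ConcatVolume, and
the labels "da0", "da1" and "vinum/vol0", are all invented for the
illustration and do not correspond to the real GEOM structures or
functions.  The class reaches its disks only through consumers, and
what it exposes upward is just another provider.

```python
class Provider:
    """Named access point that a class offers to the layers above it."""

    def __init__(self, name, size):
        self.name, self.size = name, size


class DiskProvider(Provider):
    """Bottom of the stack: a toy disk backed by a bytearray."""

    def __init__(self, name, size):
        super().__init__(name, size)
        self._data = bytearray(size)

    def read(self, offset, length):
        return bytes(self._data[offset:offset + length])

    def write(self, offset, buf):
        self._data[offset:offset + len(buf)] = buf


class Consumer:
    """Attachment from a class down to one provider; all I/O goes via it."""

    def __init__(self, provider):
        self.provider = provider

    def read(self, offset, length):
        return self.provider.read(offset, length)

    def write(self, offset, buf):
        self.provider.write(offset, buf)


class ConcatVolume(Provider):
    """Toy volume that concatenates the disks behind its consumers."""

    def __init__(self, name, consumers):
        super().__init__(name, sum(c.provider.size for c in consumers))
        self.consumers = consumers

    def _locate(self, offset):
        # Map a volume offset to (consumer, offset within that disk).
        for c in self.consumers:
            if offset < c.provider.size:
                return c, offset
            offset -= c.provider.size
        raise ValueError("offset beyond end of volume")

    def write(self, offset, buf):
        for i, byte in enumerate(buf):     # byte at a time keeps it simple
            c, local = self._locate(offset + i)
            c.write(local, bytes([byte]))

    def read(self, offset, length):
        out = bytearray()
        for i in range(length):
            c, local = self._locate(offset + i)
            out += c.read(local, 1)
        return bytes(out)


da0 = DiskProvider("da0", 4)
da1 = DiskProvider("da1", 4)
vol = ConcatVolume("vinum/vol0", [Consumer(da0), Consumer(da1)])
vol.write(2, b"abcd")                      # straddles the two disks
```

Because the volume is itself a provider, anything that can sit on a
disk can sit on the volume without knowing the difference.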

I know ScottL started working on RaidFrame, and listening to him
during the process was very much like "OK, this is no longer necessary
...  this goes ... don't need this ... get rid of that ..." and
I would hope the vinum experience would be the same.

This would not result in changes to the vinum user experience or
require documentation changes, and it would give you a good clean
starting point for further work.

>This will certainly be worthwhile, but somehow I was expecting more.
>Can anybody suggest other things that could be changed with benefit?

Not off the top of my head.

In the long term, I would hope that we end up with one very general
MIRROR class, one very general RAID5 class, etc.

These would just do the basic disk-request transformations, and would
not contain any autoconfiguration or the like.

The other classes, of which VINUM could be a good example, would add
the "high-level" intelligence, by autodiscovering metadata and, based
on that metadata, configuring STRIPE, MIRROR and RAID5 modules to
do the right thing.
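
A minimal sketch of that split, again as toy Python and not GEOM code:
Mirror is a pure transformation with no discovery logic, and a separate
high-level function reads labels and decides what to build.  The
metadata layout (a dict with "volume" and "layout" keys), the function
name autodiscover and the disk names are assumptions made up for this
example; real on-disk metadata would look nothing like this.

```python
from collections import defaultdict


class Disk:
    """Toy disk; `metadata` stands in for an already-parsed on-disk label."""

    def __init__(self, name, size, metadata=None):
        self.name = name
        self.metadata = metadata
        self.data = bytearray(size)

    def write(self, offset, buf):
        self.data[offset:offset + len(buf)] = buf

    def read(self, offset, length):
        return bytes(self.data[offset:offset + length])


class Mirror:
    """Generic transformation only: knows nothing about labels."""

    def __init__(self, legs):
        self.legs = list(legs)

    def write(self, offset, buf):
        for leg in self.legs:
            leg.write(offset, buf)

    def read(self, offset, length):
        return self.legs[0].read(offset, length)


def autodiscover(disks):
    """High-level policy: group labelled disks by volume, build mirrors."""
    groups = defaultdict(list)
    for disk in disks:
        if disk.metadata and disk.metadata.get("layout") == "mirror":
            groups[disk.metadata["volume"]].append(disk)
    return {name: Mirror(members) for name, members in groups.items()}


disks = [
    Disk("da0", 8, {"volume": "home", "layout": "mirror"}),
    Disk("da1", 8, {"volume": "home", "layout": "mirror"}),
    Disk("da2", 8),                        # unlabelled, so left alone
]
volumes = autodiscover(disks)
volumes["home"].write(0, b"hi")
```

A different high-level class could parse a different metadata format
and still reuse the same Mirror unchanged.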

Other such "high-level" classes might be RAIDFRAME, "VERITAS" and
"AIX-LVM".

But this is long term stuff, and we need to crawl first.

For now I'm content with GEOM giving us the ability to implement
transformations in a clean way.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.