From owner-freebsd-hackers Thu Nov 12 18:22:01 1998
Return-Path:
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id SAA04687 for freebsd-hackers-outgoing; Thu, 12 Nov 1998 18:22:01 -0800 (PST) (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from allegro.lemis.com (allegro.lemis.com [192.109.197.134]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id SAA04655 for ; Thu, 12 Nov 1998 18:21:57 -0800 (PST) (envelope-from grog@freebie.lemis.com)
Received: from freebie.lemis.com (freebie.lemis.com [192.109.197.137]) by allegro.lemis.com (8.9.1/8.9.0) with ESMTP id MAA26879; Fri, 13 Nov 1998 12:51:36 +1030 (CST)
Received: (from grog@localhost) by freebie.lemis.com (8.9.1/8.9.0) id MAA01634; Fri, 13 Nov 1998 12:51:34 +1030 (CST)
Message-ID: <19981113125134.M781@freebie.lemis.com>
Date: Fri, 13 Nov 1998 12:51:34 +1030
From: Greg Lehey
To: Mike Smith
Cc: hackers@FreeBSD.ORG
Subject: Re: [Vinum] Stupid benchmark: newfsstone
References: <19981111103028.L18183@freebie.lemis.com> <199811120600.WAA08044@dingo.cdrom.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.91.1i
In-Reply-To: <199811120600.WAA08044@dingo.cdrom.com>; from Mike Smith on Wed, Nov 11, 1998 at 10:00:40PM -0800
WWW-Home-Page: http://www.lemis.com/~grog
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Wednesday, 11 November 1998 at 22:00:40 -0800, Mike Smith wrote:
>> On Monday, 9 November 1998 at 22:38:04 -0800, Mike Smith wrote:
>>>
>>> Just started playing with Vinum. Gawd Greg, this thing seriously needs
>>> a "smart" frontend to do the "simple" things.
>>
>> Any suggestions? After seeing people just banging out RAID
>> configurations with GUIs, I thought that this is probably a Bad
>> Thing. If you don't understand what you're doing, you shouldn't be
>> doing it.
>
> That's not entirely true.

We'll argue that some other time.

>> The four-layer concepts used by Veritas and Vinum have always been
>> difficult to understand. I'm trying to work out how to explain them
>> better, but taking the Microsoft-style "don't worry, little boy, I'll
>> do it all for you" approach is IMO not the right way.
>
> I think it's a mistake to conceal all the workings, but it's also a
> mistake to assume that for the "common case", you need to thrust all of
> it into the novice's face.
>
> The "common case" for RAID applications seems to be: "I have these
> disk units, and I want to make them into a RAID volume". So the
> required functionality is:
>
> 1) Input the disks to participate in the volume.

drive a device /dev/da0h
drive b device /dev/da1h
drive c device /dev/da2h
drive d device /dev/da3h
drive e device /dev/da4h

> 2) Input the RAID model to be used.

plex org raid5 256k

> Step 2 should check the sizes of the disks selected in step 1, and make
> it clear that you can only get striped or RAID 5 volumes if the disks
> are all the same size.

You haven't said how big you want it yet.

> If they're within 10% or so of each other, it should probably ignore
> the excess on the larger drives.

Why? That would be a waste. Just say:

sd drive a length 5g
sd drive b length 5g
sd drive c length 5g
sd drive d length 5g
sd drive e length 5g

This is the way it is at the moment.
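
Put together, those fragments make up a complete configuration file,
roughly like the sketch below (the volume name "raid" is only an
example, and the 5g figures are the subdisk sizes from above); you'd
then hand the file to "vinum create":

drive a device /dev/da0h
drive b device /dev/da1h
drive c device /dev/da2h
drive d device /dev/da3h
drive e device /dev/da4h
volume raid
  plex org raid5 256k
    sd drive a length 5g
    sd drive b length 5g
    sd drive c length 5g
    sd drive d length 5g
    sd drive e length 5g

The resulting volume then shows up under /dev/vinum with whatever name
you gave it.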

Agreed, it would be nice to find a maximum size available. Currently
you need to do this:

$ vinum ld -v | grep Avail
Available: 128985600 bytes (123 MB)
Available: 172966400 bytes (164 MB)
Available: 24199680 bytes (23 MB)
Available: 24199680 bytes (23 MB)

(don't worry about the tiny disks, you've seen these ones before :-)

I'll think about how to tell the system that you want a maximum size,
but in production environments these are things you think about before
you start putting the hardware together.

None of this requires a GUI, of course, and IMO none of it is any
easier with a GUI.

>>> There was an interesting symptom observed in striped mode, where the
>>> disks seemed to have a binarily-weighted access pattern.
>>
>> Can you describe that in more detail? Maybe I should consider
>> relating stripe size to cylinder group size.
>
> I'm wondering if it was just a beat pattern related to the stripe size
> and cg sizes. Basically, the first disk in the group of 4 was always
> active. The second would go inactive for a very short period of time
> on a reasonably regular basis. The third for slightly longer, and the
> fourth for longer still, with the intervals for the third and fourth
> being progressively shorter.

Right. I'm still planning to think about it.

>>> It will get more interesting when I add two more 9GB drives and four
>>> more 4GB units to the volume; especially as I haven't worked out if I
>>> can stripe the 9GB units separately and then concatenate their plex
>>> with the plex containing the 4GB units; my understanding is that all
>>> plexes in a volume contain copies of the same data.
>>
>> Correct. I need to think about how to do this, and whether it's worth
>> the trouble. It's straightforward with concatenated plexes, of
>> course.
>
> Yes, and it may be that activities will be sufficiently spread out over
> the volume that this won't be a problem.

Possibly. If you're using UFS it should be.

>>> Can you nest plexes?
>>
>> No.
>
> That's somewhat unfortunate, but probably contributes to code
> simplicity. 8)

Well, no. I can see the requirement for easy extensibility, but nesting
plexes isn't the way to do it. Finding a more flexible mapping between
plexes and subdisks is.

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message