Date:      Wed, 3 Nov 1999 14:40:37 -0500
From:      Greg Lehey <grog@mojave.sitaranetworks.com>
To:        Kelly Yancey <kbyanc@posi.net>, Bernd Walter <ticso@cicely.de>
Cc:        freebsd-fs@FreeBSD.ORG
Subject:   Re: feature list journalled fs
Message-ID:  <19991103144037.41321@mojave.sitaranetworks.com>
In-Reply-To: <Pine.BSF.4.05.9911031032500.26857-100000@kronos.alcnet.com>; from Kelly Yancey on Wed, Nov 03, 1999 at 11:40:24AM -0500
References:  <19991103105333.A89617@cicely7.cicely.de> <Pine.BSF.4.05.9911031032500.26857-100000@kronos.alcnet.com>

On Wednesday,  3 November 1999 at 11:40:24 -0500, Kelly Yancey wrote:
> On Wed, 3 Nov 1999, Bernd Walter wrote:
>
>>>
>>>   I am under the impression that you can only enlarge a vinum volume if it
>>> is in a RAID 0 configuration (concatenation). Obviously, it would be very
>>> difficult to enlarge a RAID 1 or RAID 5 configuration as it would require
>>> restriping the data across all disks; I'm not familiar with any product,
>>> hardware or software, that can do this.
>>
>> In case of striping, which is valid for Raid5 and concatenated Raid0
>> configurations, it is not simply possible to do.  But think of a
>> Raid5 volume which is extended by concatenating another Raid5
>> set.  This is not doable with vinum - but I'm sure that this won't
>> happen before anyone is using such a feature.
>
>   That sounds more like a RAID 5/0 config. While I've never seen a
> hardware vendor advertise support for such a creature, it should
> theoretically be possible.
>   However, vinum volumes can only provide mirroring between plexes so
> it is impossible for vinum to extend a volume composed of RAID 5 plexes
> via concatenation. On the other hand, I see that Greg has "Extending
> striped and RAID-5 plexes" on his TODO list for vinum, presumably by
> [shudder] restriping everything.

That's what I'm thinking of.  Yes, it's slow and ugly, but it's a
function that people want.  The obvious current alternative is to back
up the entire volume to tape, rebuild the volume (including
reinitializing in the case of RAID-5) and restore the data.  By
comparison, restriping looks pretty :-)

There is another way to do this now, on line, if you have enough disk:
create another plex, start it, remove the original plex, remodel it
nearer to the heart's desire, and start it.  It's slow, but not as
slow as backing up to tape, and you can continue to access the volume
while you're doing it.

>>>   Besides the fact that this would be an issue for any RAID controller
>>
>> No.
>> Most controllers I have seen increase the size of a disk - not a volume.
>
>   Sorry, I was thinking about the software in RAID controllers in the same
> terms as vinum. You are correct, though, that to the OS it appears as a
> single disk which has been enlarged. The same thing, though, is true with
> vinum; it should appear simply as though the disk were enlarged (albeit a
> "virtual disk").

Correct.  I don't really see a difference here, except maybe in
terminology.  Note that many operating systems refer to disks as
volumes, however.

>   No file system should care whether a disk is a "real" disk or a
> "virtual" disk or else a "virtual" disk isn't very virtual.

Almost correct.  It's useful to understand the geometry of a stripe
set when setting up ufs; it's very easy to end up with all cylinder
groups on the same spindle.
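
To illustrate the point, here is a minimal sketch in Python.  It
assumes a deliberately simple model: a striped plex maps each stripe
to the next disk in turn, and cylinder groups are laid out back to
back from offset 0.  The function names and all sizes are
hypothetical, and real ufs and Vinum layouts are more involved.

```python
def spindle_of(offset_kb, stripe_kb, ndisks):
    """Disk index holding the given volume offset in a simple striped layout."""
    return (offset_kb // stripe_kb) % ndisks


def cg_start_spindles(cg_kb, stripe_kb, ndisks, ncg):
    """Disk on which each of the first ncg cylinder groups begins."""
    return [spindle_of(i * cg_kb, stripe_kb, ndisks) for i in range(ncg)]


if __name__ == "__main__":
    # Hypothetical sizes: 32 MB cylinder groups on 4 disks.
    # 32768 kB is an exact multiple of 4 * 256 kB, so every group starts on disk 0.
    print(cg_start_spindles(32768, 256, 4, 8))
    # 32768 kB is not a multiple of 4 * 192 kB, so the starts land on several disks.
    print(cg_start_spindles(32768, 192, 4, 8))
```

In this model, whenever the cylinder group size is an exact multiple
of (stripe size * number of disks), every group begins on the same
disk; choosing a stripe size that breaks that alignment spreads the
starts over more than one disk.
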

>>> also. Anyone with a RAID controller can add a new disk to their RAID 0 and
>>> enlarge the virtual disk. Those controllers aren't going to tell you about
>>> the increased disk size any more than vinum does. Beyond that, who is to
>>
>> They don't need to, because the partition the fs is on won't increase if the
>> virtual disk is getting bigger.
>
>   I need to clarify terminology here just for myself, because otherwise
> we're getting into confusing territory...
>
>   partition: UNIX-style partitions of which there can be 8 (lettered a-h);
> 	     exist in the disklabel of a slice.
>   slice: PC-style partitioning of disk space of which there can be 4;
> 	     exist in the master boot record.
>
>   vinum doesn't support partitions; I don't know whether it supports
> slices.

Vinum does support partitions, because there's nothing you can do to
stop it doing so.  They just don't make sense in a Vinum context.

>   Now, if vinum supports slices, then vinum doesn't care what filesystem
> one puts on it (ie how it is sliced up). In which case, one could use
> vinum to manage a virtual disk with NTFS on one slice and FFS on another.
>   However, if it does not support slices, which I suspect it doesn't, then
> the entire volume must be dedicated to a single file system. So
> arguably, yes, if someone were to extend the size of the virtual disk
> (presumably by adding physical disks to the plex), it would be reasonable
> to assume that any existing filesystem should be extended to fill the new
> space.

Slices are supported too, at least as far as the underlying disk code
is fooled by a Vinum volume.  But they don't make sense.

>   What I can't figure out is why Greg doesn't support slicing
> / partitioning the virtual disk (this is really the only thing that
> prevents it from being 100% transparent in my estimation).

As I said, they are supported, but they don't make sense.  Vinum has
its own, more flexible method for subdividing disks.

> With a MBR, vinum could be used to hold any filesystem (ie. NTFS,
> ext2, or FAT32) or any combination thereof;

It can now.  You don't need an MBR, since the bootstrap doesn't
understand Vinum.  And the usefulness of ext2 or NTFS file systems is
limited, since Linux and NT don't understand Vinum.

> with a disklabel vinum wouldn't require kludges like newfs -v.

newfs -v is needed because newfs *without* -v is a kludge.  It
shouldn't assume anything from the name of a partition.

>>> say that the entire size of the new, enlarged, virtual disk is supposed to be
>>> dedicated to FFS. Is it not possible, however unlikely, for a sysadmin to
>>> add disk space to a RAID array and partition it as say FAT32?
>>
>> That's why it may be interesting to add such hooks to disklabel.
>
>   You are saying so that when someone updates the disklabel to specify a
> larger partition, the hooks would be used to notify the filesystem which
> could then do the dirty work?
>   You haven't happened to visit the Pacific Northwest recently, perhaps near
> the town of Redmond, WA? :) Seriously, such hooks would have to be in the
> kernel, not the disklabel program, in the off chance someone uses a tool
> other than disklabel to edit the partition table.

I suppose it's possible to get the Vinum daemon to do this.  In
principle the idea makes sense, but it would need to be done right.  I
can think of a lot of more important stuff to do first. 

>>>   I think what Greg was getting at as far as the file system is concerned,
>>> vinum just looks like a disk. Whatever else vinum may be, to the file
>>> system it just looks like a disk.
>>>
>>>> I have some ideas about how to get FFS resizeable without needing to freeze or
>>>> umount it before and without losing inodes.
>>>
>>>   This is great, but I think that "vinum hooks" are no more needed than
>>> "ccd hooks" or "DPT hooks". User-land tools should allow the administrator
>>> to resize the file system at the administrator's discretion. Beyond the
>>> technical issues of providing hooks to automatically extend file systems,
>>> there is the social implication of whether that is what the user wanted.
>>> User-land tools solve both problems.
>>
>> DPT should be obsolete because they don't change the size of a partition.
>> ccd's should be partitioned too and are not that useful any more compared to
>> vinum.
>> vinum and disklabel are the hooks, but I think vinum is more useful.
>> Greg already is about to implement spare disk support.
>> What about a kind of spare disk which is scheduled to increase a FS
>> automatically if running out of space.
>> Features like this need interaction between the fs and the volumemanager.
>> Of course Hardware Raid's are a point too - but that's more difficult.
>
>   Basically what we need is a filesystem-specific resize function which
> userland tools could invoke via a syscall to request that a filesystem be
> resized, and the filesystem itself would do the implementation.

Resizing a file system is not a thing you can do in a system call.
Much needs to be done in user context.

> Assuming vinum remains the special case of only allowing one file
> system on it,

I'd rather hope that this should become the norm.

> it would be safe for it to call the filesystem resize routine when it
> brings the spare on-line.  However, personally I would like to see
> vinum become a true virtual disk, 

It is :-)

> allowing multiple file systems. 

It doesn't make any sense to do this.

> In which case, I don't see where anything other than userland tools
> would access this interface.

That's the case at the moment.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message



