From owner-freebsd-hackers Fri Mar 31 23:05:18 1995
Return-Path: hackers-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id XAA03090 for hackers-outgoing; Fri, 31 Mar 1995 23:05:18 -0800
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16]) by freefall.cdrom.com (8.6.10/8.6.6) with SMTP id XAA03078 for ; Fri, 31 Mar 1995 23:05:16 -0800
Received: by cs.weber.edu (4.1/SMI-4.1.1) id AA08921; Fri, 31 Mar 95 23:58:48 MST
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9504010658.AA08921@cs.weber.edu>
Subject: Re: large filesystems/multiple disks
To: henrich@crh.cl.msu.edu (Charles Henrich)
Date: Fri, 31 Mar 95 23:58:47 MST
Cc: freebsd-hackers@FreeBSD.org
In-Reply-To: <199504010545.VAA01084@freefall.cdrom.com> from "Charles Henrich" at Apr 1, 95 00:45:28 am
X-Mailer: ELM [version 2.4dev PL52]
Sender: hackers-owner@FreeBSD.org
Precedence: bulk

> Are there any plans/work in progress for allowing FreeBSD the ability to have
> filesystems span multiple physical volumes (ala Logical Volume Manager on
> AIX/OSF) ?

Both I and Phil Neiswanger have toyed with this idea. At one time, I
had logical volume management working badly under 386BSD 0.1 patchkit 1
for ESDI drives.

The main issue is that the concept of partitioning/slicing/whatever
truly needs to be divorced from device nodes. FreeBSD current is
actually moving further away from this (or closer to it using the
logical but roundabout incremental improvement approach, depending on
your point of view).

One thing that is critical is that devices be where you left them, so
dynamic disk ID assignment is right out, at least at the kernel level
(a logical partition could be dynamically renamed relatively
painlessly).

The main problem is that there needs to be file system support for the
idea of additional disk space, i.e., one place where you can add things
on. This will work with IBM's JFS (obviously) or with a log structured
file system, but precious little else.
UFS is particularly badly suited to doing this. If you do what SPRITE
does and shove all the inodes in one area and all of the data blocks in
another, you can sort of do this for UFS. The alternative is to
preallocate a very large number of inodes in the first place (which is
what I did) or to back up, remkfs the file system after adding the
storage, and restore everything.

It should also be noted that this type of arrangement is extremely
fragile -- for n disks, it's on the order of n^2 more fragile than file
systems that don't span disks at all. You shouldn't attempt this type
of thing without being ready to do backups. Basically, a failure of one
disk could theoretically take out all of your real file systems sitting
on logical partitions that spanned that one disk. Pretty gruesome,
really.

The stuff I had working relatively happily used ESDI drives and relied
on a working Bad144 mechanism, so it's kind of double-damned; if I were
to do it today, it'd be a complete rewrite.

One of the major pieces is the management piece to determine which 4M
chunk on a physical disk is allocated to which logical disk slice.
Needless to say, I binary edited this, so I never wrote one of these;
you'd have to write one of them as well.

The main gain is just-in-time meeting of storage requirements on huge
databases that grow slowly and incrementally. The next most popular use
is to add swap space to a system by growing the logical partition
that's the swap area -- AIX is very swap hungry, being even more
obscenely radical about memory overcommit than most systems. With the
ability to swap on files (which BSD has), this is largely a useless
application.

So while it is a cool feature, it has limited practical utility.

Terry Lambert
terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.
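
For illustration only, here is a rough sketch in Python of what the management piece mentioned above might look like: a table recording which 4M chunk on which physical disk belongs to which logical disk slice. It is not the code from the 386BSD work. The names ChunkMap, grow, translate, and disks_used, and the disk names in the usage note, are all hypothetical, and the first-fit allocation policy is an assumption.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4M chunks, as described in the post


class ChunkMap:
    """Hypothetical table mapping logical slices onto physical 4M chunks."""

    def __init__(self, disk_chunks):
        # disk_chunks: {disk name: number of 4M chunks on that disk}
        self.free = [(disk, i) for disk, n in disk_chunks.items()
                     for i in range(n)]
        self.volumes = {}  # slice name -> ordered list of (disk, chunk index)

    def grow(self, volume, nchunks):
        """Append nchunks free chunks to a logical slice (first fit)."""
        if nchunks > len(self.free):
            raise ValueError("not enough free chunks")
        taken, self.free = self.free[:nchunks], self.free[nchunks:]
        self.volumes.setdefault(volume, []).extend(taken)

    def translate(self, volume, offset):
        """Map a byte offset in a logical slice to (disk, byte offset on disk)."""
        if offset < 0:
            raise IndexError("negative offset")
        index, within = divmod(offset, CHUNK_SIZE)
        chunks = self.volumes[volume]
        if index >= len(chunks):
            raise IndexError("offset beyond end of logical slice")
        disk, chunk = chunks[index]
        return disk, chunk * CHUNK_SIZE + within

    def disks_used(self, volume):
        """Physical disks whose failure would damage this logical slice."""
        return {disk for disk, _ in self.volumes[volume]}
```

With two hypothetical two-chunk disks "sd0" and "sd1", growing a slice by three chunks makes it span both; disks_used then reports both disks, which is the fragility described above: losing either one damages the slice.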