From: Chuck McCrobie
Date: Wed, 19 Dec 2001 20:24:38 -0500
To: freebsd-fs@freebsd.org
Subject: Re: Real world Root Resizing (was Re: Proposed auto-sizing patch ...
Message-ID: <3C213DD6.3CAD0C3C@cablespeed.com>
References: <200112191945.OAA04975@repulse.cnchost.com>

Bakul Shah wrote:
>
> I wonder if one can devise a syscall interface to do this
> safely without requiring detailed knowledge of the FS layout
> and replicating a lot of FS code in user mode.
>
> * For shrinking a partition you need a syscall to limit
>   disk block allocation.  Something like
>
>       int fs_alloc(const char* mountpoint, size_t offset, size_t limit);
>
>   This would do all allocation the [offset..limit) range
>   until the next call.  Even if you grew a file outside this
>   range, the new blocks will be allocated here.  A filesystem
>   that does not implement this functionality returns ENOSYS.
>   offset and limit are in disk blocksize unit but may need to
>   be rounded up to some FS specific parameter (such as
>   cylinder group size for FFS).
>
> * For defragmenting you need a way to move file data.
>   Something like
>
>       int frealloc(fd, offset, count, addr)
>
>   offset & count must be multiples of disk block size.
>   addr is a hint as to where these blocks should be moved.
>   The call fails if the suggested new blocks are in use.
>
>   The FS code atomically (at syscall level) moves specified
>   blocks to the new area.
>

Windows 2000 provides a "MOVE FILE DATA" IOCTL to the file system.  The
file system is supposed to move the referenced file data to the
specified location.  The location is specified by disk lbn.  A "MOVE
FILE DATA" request may specify a location which is now occupied (but
wasn't when the caller last looked).  The file system is supposed to
ignore the request in that case.

> * You also need to be able to get to various freelists.
>

Windows 2000 also provides a "GET SPACE BITMAP" IOCTL to the file
system.  The file system is supposed to return an up-to-date bitmap
describing the allocation of space in the partition.

> I can't see how defragmentation can be done without some
> knowledge of FS layout but perhaps most of the details can be
> abstracted out well enough that the same interface can be
> used for different FSes.
>

I guess making a file physically contiguous might be a good start.  I
think the FFS cluster code attempts to keep files contiguous...  Perhaps
extracting or exposing the generic logic in the FFS code would work.

Would it be possible to also move inodes around?  My understanding of
the idea behind "dir pref" is to keep the inodes of files in the same
directory contiguous.  Do other pieces (NFS?) keep track of inodes by
their location, or does the inode number imply location?  That is, does
moving an inode from one location to another break things higher up?

> You would run this on a quiescent system but there is no need
> to unmount the FS or even bring the system down to single
> user.
>
> Placement of files can also be changed once you have this
> interface.  One idea is to sample file access time.  Files
> that gets read frequently can be moved to reduce seek time.
> Files with similar access time can be clustered and so on.
> What would be better than sampling atime is keeping read
> stats in each inode: each time a file is read and the atime
> is to be updated, increment a small counter (but make it
> `stick' when it reaches max).  This counter is zeroed when
> the stats are gathered by a user program.  I am not holding
> my breath though.
>
> Comments?
>
> -- bakul
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-fs" in the body of the message
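
To make the bitmap-plus-move idea from this thread a little more
concrete, here is a minimal user-space sketch in C.  Everything in it is
hypothetical: the names space_bitmap, blk_in_use, blk_set and
sketch_move_blocks, the 64-block toy partition, and the choice of
EBUSY/EINVAL as return values are assumptions made for illustration.  It
is not the Windows 2000 interface and not an existing FreeBSD API.  It
models only the allocation state (no file data is copied) and shows the
one rule both messages agree on: a move whose target range is already in
use is refused and changes nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy partition size, in disk blocks. */
#define SKETCH_NBLOCKS 64

/* Toy allocation bitmap: a set bit means the block is in use. */
struct space_bitmap {
    uint8_t bits[SKETCH_NBLOCKS / 8];
};

/* Report whether block blk is marked in use. */
bool blk_in_use(const struct space_bitmap *bm, size_t blk)
{
    return ((bm->bits[blk / 8] >> (blk % 8)) & 1u) != 0;
}

/* Mark block blk as used or free. */
void blk_set(struct space_bitmap *bm, size_t blk, bool used)
{
    if (used)
        bm->bits[blk / 8] |= (uint8_t)(1u << (blk % 8));
    else
        bm->bits[blk / 8] &= (uint8_t)~(1u << (blk % 8));
}

/*
 * Move count blocks starting at src to the range starting at dst.
 * Returns 0 on success, EINVAL for a bad or overlapping range or a
 * source block that is not allocated, and EBUSY if any target block
 * is already in use.  On any error the bitmap is left unchanged.
 */
int sketch_move_blocks(struct space_bitmap *bm, size_t src, size_t dst,
                       size_t count)
{
    size_t i;

    assert(bm != NULL);

    /* Range checks, written to avoid size_t overflow. */
    if (count == 0 || count > SKETCH_NBLOCKS ||
        src > SKETCH_NBLOCKS - count || dst > SKETCH_NBLOCKS - count)
        return EINVAL;

    /* Simplification: overlapping source and target are rejected. */
    if (src < dst + count && dst < src + count)
        return EINVAL;

    /* Every source block must currently be allocated. */
    for (i = 0; i < count; i++)
        if (!blk_in_use(bm, src + i))
            return EINVAL;

    /* Check the whole target range before touching anything. */
    for (i = 0; i < count; i++)
        if (blk_in_use(bm, dst + i))
            return EBUSY;

    /* A real file system would copy the data and update block
     * pointers here; the sketch only updates the bitmap. */
    for (i = 0; i < count; i++) {
        blk_set(bm, dst + i, true);
        blk_set(bm, src + i, false);
    }
    return 0;
}
```

A defragmenter built on something like this would fetch the bitmap, pick
a free target range, and issue the move.  If the range has been
allocated in the meantime it gets EBUSY back, fetches a fresh bitmap and
tries elsewhere.  Checking the whole target range before changing any
state is what keeps a refused request from leaving a half-moved file.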