Date:      Mon, 29 Apr 1996 11:51:56 -0700
From:      Mitchell Erblich <merblich@ossi.com>
To:        freebsd-fs@FreeBSD.ORG, pvh@leftside.its.uct.ac.za
Subject:   Re: Compressing filesystem: Technical issues
Message-ID:  <199604291851.LAA07027@guacamole.ossi.com>


Peter et al.,

	I would take into consideration what the typical type of file to be
	compressed is, and what the benefits are versus the tradeoffs.  Disks are
	already too slow; doesn't the overhead of uncompressing blocks on demand,
	in a random access pattern, add a delay to every access of the fs object?
	However, I will proceed with the assumption that this approach may have
	some merit.

	I am unfamiliar with the Netware implementation, so I will ignore
	comparisons with it.

	Since I am not the designer/architect of this idea, I will make some
	obvious assumptions: the type of file should not be a directory, a
	symbolic link, fragments, anonymous memory on a swapfs, etc.

	Note: fragments are pieces of separate files that can be merged together
	so they share a single block.

	1) Depending on the usage of fragments within the fs, and weighing the
	overhead of a compression/decompression algorithm against its possible
	benefits, I would also eliminate text or binary files greater than some
	(as yet undetermined) size that would not be able to use a fragment
	within their last block, since that space cannot be used anyway.

	2) This code would have to be able to keep track of free fs blocks, since
	a normal compression algorithm will most likely have to allocate blocks
	for the compressed copy before the original blocks are freed (a rough
	free-space check is sketched after this list).

	3) Assuming that the original blocks were allocated somewhat contiguously,
	the algorithm may trade fs object access speed for fs object size, and it
	may cause a larger number of the objects within the fs to be subject to
	seeks between EACH block access.

	4) That the compression algorithm NOT modify the fs object's modification
	time (see the timestamp sketch after this list).

	5) That accesses to the fs compressed object may or may not cause the entire
	   fs object to become uncompressed.

	6) Any fs object within this fs should probably carry a new magic number,
	so that compression-unaware code cannot consume the compressed object
	directly, along with a series of tests of what happens when such accesses
	do occur (see the header sketch after this list).
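
	As a rough illustration of point 2, here is a minimal user-space check,
	assuming the sweep is done by a daemon rather than in the kernel; the
	function name and the 64-block margin are placeholders of my own, not
	part of any existing interface:

/*
 * Before compressing a file, check that the filesystem can hold a
 * complete compressed copy alongside the original, since the original
 * blocks are only freed after the copy is written.
 */
#include <sys/param.h>
#include <sys/mount.h>
#include <sys/stat.h>

static int
enough_room_to_compress(const char *path)
{
	struct statfs fs;
	struct stat st;
	long long needed;

	if (statfs(path, &fs) == -1 || stat(path, &st) == -1)
		return (0);

	/* Worst case: compression saves nothing, so budget the full size. */
	needed = (st.st_size + fs.f_bsize - 1) / fs.f_bsize;

	/* Keep a safety margin (invented here) so the sweep never fills the fs. */
	return (fs.f_bavail > needed + 64);
}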
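
	For point 4, a small sketch of the save-and-restore approach, again from
	user space; compress_in_place() is only a placeholder, and note that
	utimes() cannot keep ctime from changing:

/*
 * Capture the object's access and modification times before
 * compressing it in place, then put them back afterwards so the
 * compressor leaves the timestamps (other than ctime) untouched.
 */
#include <sys/stat.h>
#include <sys/time.h>

extern int compress_in_place(const char *path);	/* placeholder */

static int
compress_preserving_times(const char *path)
{
	struct stat st;
	struct timeval tv[2];

	if (stat(path, &st) == -1)
		return (-1);

	tv[0].tv_sec = st.st_atime;	/* access time */
	tv[0].tv_usec = 0;
	tv[1].tv_sec = st.st_mtime;	/* modification time */
	tv[1].tv_usec = 0;

	if (compress_in_place(path) == -1)
		return (-1);

	return (utimes(path, tv));
}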
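
	For point 6, one possible shape of a per-object header carrying such a
	magic number; the layout, values, and CFS_* names are invented purely
	for illustration:

/*
 * Illustrative on-disk header for a compressed fs object.  Aware code
 * verifies the header before handing data to anything else; anything
 * without a valid header is treated as "not compressed".
 */
#include <stdint.h>
#include <string.h>

#define CFS_MAGIC	0x43465331U	/* "CFS1", an invented value */
#define CFS_VERSION	1

struct cfs_header {
	uint32_t ch_magic;	/* must equal CFS_MAGIC */
	uint32_t ch_version;	/* format revision */
	uint64_t ch_orig_size;	/* uncompressed length in bytes */
	uint64_t ch_comp_size;	/* compressed length in bytes */
};

/* Return 1 if buf begins with a plausible compressed-object header. */
static int
cfs_header_ok(const void *buf, size_t len)
{
	struct cfs_header h;

	if (len < sizeof(h))
		return (0);
	memcpy(&h, buf, sizeof(h));
	return (h.ch_magic == CFS_MAGIC && h.ch_version == CFS_VERSION);
}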

	Because of the above, unless the fs object is allocated contiguously (to
	eliminate seeks, considering possible interleaving and such), and can be
	reallocated contiguously both as a compressed object and later as an
	uncompressed object, it should not be done.

	Even when this can happen, there is the question of whether the unused
	block portion is necessarily bad.  Assume that, one time period after the
	compression, the fs object is appended to; this uncompressed, unused
	block portion can then be used without a new block allocation in some
	cases.

	And last but not least, the COST of hardware (SCSI, etc.) drives is
	decreasing rapidly on a dollar-per-MB basis, which minimizes the possible
	usefulness of such a fs.

	AND, lastly, I think a better approach is to decrease fs object access
	time for double and possibly triple indirect fs object implementations.
	One way I am exploring this is the use of pre-contiguous allocations for
	large fs objects and variable block sizing.  I am currently attempting to
	implement, on my fs at home, blocks of 256k in size and larger.
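
	To make that last point a bit more concrete, here is a rough, invented
	sketch of an extent-style map, where a large fs object is described by a
	few (start, length) runs instead of double/triple indirect blocks; none
	of these names or limits come from any existing code:

/*
 * A contiguous pre-allocation shows up as a single run, and a
 * logical-to-physical lookup becomes a short scan instead of several
 * indirect-block reads.
 */
#include <stdint.h>

#define CFS_MAXRUNS	8	/* invented limit */

struct cfs_extent {
	uint64_t ce_start;	/* first physical fs block of the run */
	uint32_t ce_len;	/* number of blocks in the run */
};

struct cfs_extmap {
	uint32_t cm_nruns;
	struct cfs_extent cm_run[CFS_MAXRUNS];
};

/* Map a logical block number to a physical block; ~0 means unmapped. */
static uint64_t
cfs_bmap(const struct cfs_extmap *m, uint64_t lbn)
{
	uint32_t i;

	for (i = 0; i < m->cm_nruns; i++) {
		if (lbn < m->cm_run[i].ce_len)
			return (m->cm_run[i].ce_start + lbn);
		lbn -= m->cm_run[i].ce_len;
	}
	return (~(uint64_t)0);	/* not mapped */
}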

Mitchell Erblich : merblich@ossi.com
Senior Software Engineer
PS : I speak for myself and not my company.

--------------------------------------------------------------------



> From owner-freebsd-fs@freefall.freebsd.org Thu Apr 25 17:01 PDT 1996
> Date: Fri, 26 Apr 1996 00:30:07 +0200 (SAT)
> From: Peter van Heusden <pvh@leftside.its.uct.ac.za>
> X-Sender: pvh@leftside
> To: freebsd-fs@FreeBSD.ORG
> Subject: Compressing filesystem: Technical issues
> MIME-Version: 1.0
> X-Loop: FreeBSD.org
> 
> I'm slowly getting started on the issue of writing a compressing 
> filesystem for BSD. The situation thus far:
> 
> 1) I'm thinking of a model much like the Netware 4.x one, where a file is 
> compressed if it has not been 'touched' (ie. read or written) in a 
> certain time (e.g. a week). It is then decompressed on being 'touched'.
> 
> 2) I think the correct approach is to base the filesystem on the existing 
> ufs code, and just add a flag which can sit in the i_flag field of the 
> inode which states whether this file is compressed or not. On a 
> successful read or write (i.e. one where data has actually been moved 
> to/from disk successfully) the to_be_compressed flag can be cleared.
> 
> 3) I am as yet uncertain about some of the design of the mark and sweep 
> process which would do the compressing. My current thinking is that this 
> would be a daemon spawned at mount time which would cycle through the 
> inodes (in numerical order) doing the mark 'n sweep thing using a new 
> filesystem specific ioctl. An unmount would have to gracefully kill the 
> daemon process, of course. I'm currently not certain where to put the 
> temporary data during compression... in memory? In a filesystem?
> 
> 4) I'll have to think up a good compression strategy which allows 
> recovery from corruption, etc etc.
> 
> Anyway, in my mind, issue 3, the process to do the compressing, is the 
> one I am having the most problems with. Any suggestions on the design of 
> something like this would be appreciated.
> 
> Thanks,
> Peter
> 


