From owner-freebsd-fs Mon Apr 29 11:55:49 1996
Return-Path: owner-fs
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id LAA03935 for fs-outgoing; Mon, 29 Apr 1996 11:55:49 -0700 (PDT)
Received: from loki.ossi.com (loki-8.ossi.com [192.240.8.1]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id LAA03930 for ; Mon, 29 Apr 1996 11:55:46 -0700 (PDT)
Received: from guacamole.ossi.com (guacamole.ossi.com [192.240.4.30]) by loki.ossi.com (8.6.11/8.6.11) with ESMTP id LAA15783; Mon, 29 Apr 1996 11:54:49 -0700
Received: (from merblich@localhost) by guacamole.ossi.com (8.6.11/8.6.11) id LAA07027; Mon, 29 Apr 1996 11:51:56 -0700
Date: Mon, 29 Apr 1996 11:51:56 -0700
From: Mitchell Erblich
Message-Id: <199604291851.LAA07027@guacamole.ossi.com>
To: freebsd-fs@FreeBSD.ORG, pvh@leftside.its.uct.ac.za
Subject: Re: Compressing filesystem: Technical issues
X-Sun-Charset: US-ASCII
Sender: owner-fs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Peter et al.,

I would take into consideration what typical type of file would be
compressed and what the benefits are versus the tradeoffs.  Disks are
already too slow; doesn't the overhead of uncompressing the blocks on
demand in a random access pattern add a delay to the fs object?
However, I will proceed on the assumption that this approach may have
some merit.  I am unfamiliar with the NetWare implementation, so I
will ignore comparisons with it.

Since I am not the designer/architect of this idea, I will make some
obvious assumptions: the type of file shouldn't be a directory type,
symbolic link type, fragments, anonymous memory on a swapfs, etc.

Note: fragments are pieces of separate files that can be merged
together so they share a single block.

1) Depending on the usage of fragments within the fs, and assuming the
   overhead of a compression/decompression algorithm versus its
   possible benefits, I would also exclude text or binary files greater
   than some threshold size that are unable to use a fragment within
   their last block, since this space cannot be used anyway.

2) This code would have to keep track of free fs blocks, as a normal
   compression algorithm will most likely have to allocate new blocks
   before the old ones are freed.

3) Assuming that the original blocks might have been allocated somewhat
   contiguously, the algorithm may trade off fs object access speed
   against fs object size, and it may cause a larger number of objects
   within the fs to be subject to seeks between EACH block access.

4) The compression algorithm must NOT modify the fs object modification
   time.

5) Accesses to the compressed fs object may or may not cause the entire
   fs object to become uncompressed.

6) Any compressed object within this fs should probably carry a new
   magic number, so that software not aware of fs compression cannot
   use the new object, and there should be a series of tests of what
   happens when such accesses do occur.

Because of the above, unless the fs object is allocated contiguously
(to eliminate seeks, considering possible interleaving and such), and
can be reallocated contiguously both for its compressed form and for
its later uncompressed form, it should not be done.  Even when this
can happen, there is the question of whether the unused block portion
is necessarily bad.  Assume that one time period after the compression
the fs object is appended to; this non-compressed unused block portion
can then be used without a new block allocation in some cases.
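As a concrete illustration of points 2 and 6 above, here is a minimal
sketch in plain user-level C of how a per-object flag plus a magic
number could gate access, so that compression-unaware code can be kept
away from a compressed object.  Every identifier here (FS_COMP_MAGIC,
IC_COMPRESSED, demo_inode) is invented for the sketch and is not taken
from the real ufs inode layout:

#include <stdio.h>
#include <stdint.h>

#define FS_COMP_MAGIC 0x436f4d50u  /* hypothetical magic stamped on compressed objects */
#define IC_COMPRESSED 0x0001       /* hypothetical per-inode flag bit */

struct demo_inode {
    uint32_t i_magic;  /* magic number, valid only for compressed objects */
    uint32_t i_flag;   /* per-object flag word */
    uint16_t i_mode;   /* type bits: directories, symlinks, etc. would be excluded */
};

/* An object is treated as compressed only when both the flag and the
 * magic agree; anything else falls through to the normal uncompressed
 * code path, which keeps non-aware accesses detectable and testable. */
static int is_compressed_object(const struct demo_inode *ip)
{
    return (ip->i_flag & IC_COMPRESSED) != 0 && ip->i_magic == FS_COMP_MAGIC;
}

int main(void)
{
    struct demo_inode ip = { FS_COMP_MAGIC, IC_COMPRESSED, 0 };
    printf("compressed: %s\n", is_compressed_object(&ip) ? "yes" : "no");
    return 0;
}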
And last but not least, the COST of hardware (SCSI, etc.) drives is
decreasing rapidly on a dollar-per-MB basis, which minimizes this
possible use of a fs.  And lastly, I think a better approach is to
decrease fs object access time in double and possibly triple indirect
fs object implementations.  One way I am exploring this approach is
the use of pre-contiguous allocations for large fs objects and
variable block sizing.  I am currently attempting to implement, on my
fs at home, blocks of 256k in size and larger.

Mitchell Erblich : merblich@ossi.com
Senior Software Engineer

PS: I speak for myself and not my company.

--------------------------------------------------------------------

> From owner-freebsd-fs@freefall.freebsd.org Thu Apr 25 17:01 PDT 1996
> Date: Fri, 26 Apr 1996 00:30:07 +0200 (SAT)
> From: Peter van Heusden
> X-Sender: pvh@leftside
> To: freebsd-fs@FreeBSD.ORG
> Subject: Compressing filesystem: Technical issues
> MIME-Version: 1.0
> X-Loop: FreeBSD.org
>
> I'm slowly getting started on the issue of writing a compressing
> filesystem for BSD. The situation thus far:
>
> 1) I'm thinking of a model much like the Netware 4.x one, where a file is
> compressed if it has not been 'touched' (ie. read or written) in a
> certain time (e.g. a week). It is then decompressed on being 'touched'.
>
> 2) I think the correct approach is to base the filesystem on the existing
> ufs code, and just add a flag which can sit in the i_flag field of the
> inode which states whether this file is compressed or not. On a
> successful read or write (i.e. one where data has actually been moved
> to/from disk successfully) the to_be_compressed flag can be cleared.
>
> 3) I am as yet uncertain about some of the design of the mark and sweep
> process which would do the compressing. My current thinking is that this
> would be a daemon spawned at mount time which would cycle through the
> inodes (in numerical order) doing the mark 'n sweep thing using a new
> filesystem specific ioctl. An unmount would have to gracefully kill the
> daemon process, of course. I'm currently not certain where to put the
> temporary data during compression... in memory? In a filesystem?
>
> 4) I'll have to think up a good compression strategy which allows
> recovery from corruption, etc etc.
>
> Anyway, in my mind, issue 3, the process to do the compressing, is the
> one I am having the most problems with. Any suggestions on the design of
> something like this would be appreciated.
>
> Thanks,
> Peter

From owner-freebsd-fs Mon Apr 29 13:21:54 1996
Return-Path: owner-fs
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id NAA09133 for fs-outgoing; Mon, 29 Apr 1996 13:21:54 -0700 (PDT)
Received: from novell.com (prv-ums.Provo.Novell.COM [137.65.40.4]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id NAA09108 for ; Mon, 29 Apr 1996 13:21:43 -0700 (PDT)
Received: from INET-PRV-Message_Server by fromGW with Novell_GroupWise; Mon, 29 Apr 1996 14:21:20 -0600
Content-Type: text/plain
Message-ID:
X-Mailer: Novell GroupWise 4.1
Date: Mon, 29 Apr 1996 14:27:38 -0600
From: DARREND@novell.com (Darren Davis)
To: freebsd-fs@FreeBSD.ORG, pvh@leftside.its.uct.ac.za, merblich@ossi.com
Subject: Re: Compressing filesystem: Technical issues - Reply
Sender: owner-fs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

>>> Mitchell Erblich 4/29 12:51pm >>>
> Peter et al.,
>
> I would take into consideration what typical type of
> file would be compressed and what the benefits are versus the tradeoffs.
> Disks are already too slow; doesn't the overhead of uncompressing the
> blocks on demand in a random access pattern add a delay to the fs
> object?  However, I will proceed on the assumption that this approach
> may have some merit.

Actually, systems tend to be I/O bound more than compute bound.  By
compressing a file, you are potentially trading off I/Os for CPU
cycles (a good tradeoff, I believe).  Your I/Os will be smaller, but
the CPU must expend cycles to uncompress the data.  I have seen, on
some systems with fast CPUs, an increase in system performance due to
the smaller I/Os involved with compressed files.

Darren R. Davis
Senior Software Engineer
Novell, Inc.

From owner-freebsd-fs Tue Apr 30 12:13:07 1996
Return-Path: owner-fs
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id MAA28358 for fs-outgoing; Tue, 30 Apr 1996 12:13:07 -0700 (PDT)
Received: from loki.ossi.com (loki-8.ossi.com [192.240.8.1]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id MAA28353 for ; Tue, 30 Apr 1996 12:13:04 -0700 (PDT)
Received: from guacamole.ossi.com (guacamole.ossi.com [192.240.4.30]) by loki.ossi.com (8.6.11/8.6.11) with ESMTP id MAA18245; Tue, 30 Apr 1996 12:12:02 -0700
Received: (from merblich@localhost) by guacamole.ossi.com (8.6.11/8.6.11) id MAA07811; Tue, 30 Apr 1996 12:09:13 -0700
Date: Tue, 30 Apr 1996 12:09:13 -0700
From: Mitchell Erblich
Message-Id: <199604301909.MAA07811@guacamole.ossi.com>
To: freebsd-fs@freebsd.org, pvh@leftside.its.uct.ac.za, merblich@ossi.com, DARREND@novell.com
Subject: Re: Compressing filesystem: Technical issues - Reply
X-Sun-Charset: US-ASCII
Sender: owner-fs@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Darren,

I would almost agree with you, but the fs only compresses items after
a specified time frame, which means that the fs objects must first be
accessed in uncompressed format and are compressed only after the
object has been put onto the disk.

Then there is the question of increased seeks between file blocks.  If
increasing the number of seeks drops our transfer rate from 40MB/sec
to 5MB/sec, we had better have an 8-to-1 compression ratio just to
stay even for your assumption to hold.  However, what we really need
to see is what happens under the worst type of access, random, for
there would be no read-ahead or write-behind algorithm to help us.
After the fs object is in compressed format and is then
demand-accessed on the disk, I believe our latency to access the fs
block will outweigh any gain in perceived bandwidth from transferring
less data (because it is in compressed format).

Another item for consideration is that our compressed fs block data is
not equal to the uncompressed physical memory block data; however,
this just leads to more complication.

So, back to my original philosophy of contiguous files with extremely
large block sizes.  If a fs object has a significant block size, a
triple indirect access can be avoided in very large files.  This
eliminates a fs access and is a speed improvement.  Sun Microsystems
did not implement a triple indirect block on SunOS 4.1.X because of
this.  (I have no idea what they are doing for 64-bit fs accesses on
their Ultras.)

Mitchell Erblich : merblich@ossi.com
Fujitsu Open Systems Solutions, Inc.
Senior Software Engineer

PS: I speak for myself and not my company.
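Mitchell's 8-to-1 figure follows directly from his assumed rates, and
a few lines of C make the break-even arithmetic explicit.  The
40MB/sec and 5MB/sec numbers are his illustrative assumptions, not
measurements:

#include <stdio.h>

int main(void)
{
    double contiguous_rate = 40.0; /* MB/sec, sequential access (assumed) */
    double seeky_rate      =  5.0; /* MB/sec, seek-dominated access (assumed) */

    /* To move the same logical data in the same time, the compressed
     * image must shrink by contiguous_rate / seeky_rate, i.e. the
     * point where (size / ratio) / seeky_rate == size / contiguous_rate. */
    double break_even = contiguous_rate / seeky_rate;
    printf("break-even compression ratio: %.0f:1\n", break_even);
    return 0;
}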
---------------------------------------------------------------------

> From owner-freebsd-fs@freefall.freebsd.org Mon Apr 29 14:19 PDT 1996
> Date: Mon, 29 Apr 1996 14:27:38 -0600
> From: DARREND@novell.com (Darren Davis)
> To: freebsd-fs@freebsd.org, pvh@leftside.its.uct.ac.za, merblich@ossi.com
> Subject: Re: Compressing filesystem: Technical issues - Reply
> X-Loop: FreeBSD.org
>
> >>> Mitchell Erblich 4/29 12:51pm >>>
> > Peter et al.,
> >
> > I would take into consideration what typical type of
> > file would be compressed and what the benefits are versus the
> > tradeoffs.  Disks are already too slow; doesn't the overhead of
> > uncompressing the blocks on demand in a random access pattern add
> > a delay to the fs object?  However, I will proceed on the
> > assumption that this approach may have some merit.
>
> Actually, systems tend to be I/O bound more than compute bound.
> By compressing a file, you are potentially trading off I/Os for CPU
> cycles (a good tradeoff, I believe).  Your I/Os will be smaller, but
> the CPU must expend cycles to uncompress the data.  I have seen, on
> some systems with fast CPUs, an increase in system performance due
> to the smaller I/Os involved with compressed files.
>
> Darren R. Davis
> Senior Software Engineer
> Novell, Inc.

From owner-freebsd-fs Tue Apr 30 12:52:58 1996
Return-Path: owner-fs
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id MAA01925 for fs-outgoing; Tue, 30 Apr 1996 12:52:58 -0700 (PDT)
Received: from novell.com (sjf-ums.sjf.novell.com [130.57.10.171]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id MAA01920 for ; Tue, 30 Apr 1996 12:52:55 -0700 (PDT)
Received: from INET-SJF-Message_Server by fromGW with Novell_GroupWise; Tue, 30 Apr 1996 12:49:51 -0700
Content-Type: text/plain
Message-ID:
X-Mailer: Novell GroupWise 4.1
Date: Tue, 30 Apr 1996 12:58:51 -0700
From: DARREND@novell.com (Darren Davis)
To: freebsd-fs@freebsd.org, pvh@leftside.its.uct.ac.za, DARREND@novell.com, merblich@ossi.com, merblich@ossi.com
Subject: Re: Compressing filesystem: Technical issues - Reply - Reply
Sender: owner-fs@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I guess I didn't catch the part about files being compressed in the
background.  I was thinking more along the lines of having a utility
that would compress a file on demand.  The compression would be at the
vnode layer, so that all my standard utilities would do ordinary reads
and writes to the vnode layer and get uncompressed data.  Then, in
order to inspect which files are compressed, I would change ls and the
file metadata so that a z would appear in the leftmost column, as
follows:

bash$ ls -l bin
total 12
zrwxr-xr-x  1 darrend  darrend  85 Apr  5 16:29 acroread
zrwxr-xr-x  1 darrend  darrend  78 Feb 15 14:44 title
-rwxr-xr-x  1 darrend  darrend  74 Apr  2 14:54 wordview
bash$

The files acroread and title are compressed; wordview is not.  Having
a compress-on-demand utility would allow me to set up a script to
peruse the tree, find files of a certain age, and compress them.  Once
they are compressed I should have smaller I/Os, since fewer blocks
would be used.  I guess I am envisioning a compression layer so that I
don't have the usual block compression problem of trying to guess how
much space a file is going to take when compressed (most just guess
that 2:1 is average).  So what I would have is something not much
different from gzip, with the benefit of having it transparent through
the vnode layer so that my legacy utilities don't have to pipe through
gzip.
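The transparency Darren describes can be modeled in a few lines of
user-level C: callers issue an ordinary read, and the layer hands back
uncompressed bytes regardless of how the file is stored.  This is only
a sketch of the idea, not the vnode interface; zfile, comp_read, and
demo_decompress are invented names, and the identity "decompressor"
stands in for a real algorithm:

#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define ZF_COMPRESSED 0x1  /* hypothetical "stored compressed" flag */

struct zfile {
    int flags;
    const char *stored;  /* the on-"disk" bytes */
};

/* Placeholder for whatever decompressor the layer would call; here it
 * just copies, so the sketch stays self-contained. */
static size_t demo_decompress(const char *in, char *out, size_t outlen)
{
    size_t n = strlen(in);
    if (n > outlen)
        n = outlen;
    memcpy(out, in, n);
    return n;
}

/* The caller always receives uncompressed data, whichever way the
 * file happens to be stored underneath. */
static ssize_t comp_read(const struct zfile *zp, char *buf, size_t len)
{
    size_t n;

    if (zp->flags & ZF_COMPRESSED)
        return (ssize_t)demo_decompress(zp->stored, buf, len);
    n = strlen(zp->stored);
    if (n > len)
        n = len;
    memcpy(buf, zp->stored, n);
    return (ssize_t)n;
}

int main(void)
{
    struct zfile f = { ZF_COMPRESSED, "hello from the compression layer" };
    char buf[64];
    ssize_t n = comp_read(&f, buf, sizeof(buf) - 1);

    buf[n] = '\0';
    printf("%s\n", buf);
    return 0;
}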
Of course this has the BIG problem that the whole file would need to
be uncompressed when used, which makes it problematic for large files.
Though I guess I still believe the data showing the average file size
to be around 22K.  As for large files, things like MPEG streams and
JPEGs are already compressed, so you would never realize a benefit
there anyway.  This is why I prefer compress-on-demand to background
compression based on file age.

Just my thoughts (and most likely flawed),

Darren R. Davis
Senior Software Engineer
Novell, Inc.

From owner-freebsd-fs Tue Apr 30 16:24:48 1996
Return-Path: owner-fs
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id QAA22477 for fs-outgoing; Tue, 30 Apr 1996 16:24:48 -0700 (PDT)
Received: from hauki.clinet.fi (root@hauki.clinet.fi [194.100.0.1]) by freefall.freebsd.org (8.7.3/8.7.3) with ESMTP id QAA22449 for ; Tue, 30 Apr 1996 16:24:13 -0700 (PDT)
Received: from cantina.clinet.fi (root@cantina.clinet.fi [194.100.0.15]) by hauki.clinet.fi (8.7.5/8.6.4) with ESMTP id CAA10574; Wed, 1 May 1996 02:24:00 +0300 (EET DST)
Received: (hsu@localhost) by cantina.clinet.fi (8.7.3/8.6.4) id CAA05011; Wed, 1 May 1996 02:23:59 +0300 (EET DST)
Date: Wed, 1 May 1996 02:23:59 +0300 (EET DST)
Message-Id: <199604302323.CAA05011@cantina.clinet.fi>
From: Heikki Suonsivu
To: Peter van Heusden
Cc: freebsd-fs@freebsd.org
In-reply-to: Peter van Heusden's message of 26 Apr 1996 04:49:56 +0300
Subject: Re: Compressing filesystem: Technical issues
Organization: Clinet Ltd, Espoo, Finland
References:
Sender: owner-fs@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

In article Peter van Heusden writes:

   I'm slowly getting started on the issue of writing a compressing
   filesystem for BSD. The situation thus far:

   1) I'm thinking of a model much like the Netware 4.x one, where a
   file is compressed if it has not been 'touched' (ie. read or
   written) in a certain time (e.g. a week). It is then decompressed
   on being 'touched'.
   ...

You might want to have a look at

@inproceedings{Burrows:92,
  AUTHOR       = "Michael Burrows and Charles Jerian and Butler Lampson
                  and Timothy Mann",
  TITLE        = "On-line Data Compression in a Log-structured File System",
  BOOKTITLE    = "ASPLOS V Proceedings",
  ORGANIZATION = "ACM",
  YEAR         = 1992,
  MONTH        = "October",
  PAGES        = "2--9",
  ABSTRACT     = "{We have incorporated on-line data compression into
    the low levels of a log-structured file system (Rosenblum's {\it
    Sprite LFS}).  Each block of data or meta-data is compressed as it
    is written to the disk and decompressed as it is read.  The
    log-structuring overcomes the problems of allocation and
    fragmentation for variable-sized blocks.  We observe compression
    factors ranging from 1.6 to 2.2, using algorithms running from 1.7
    to 0.4 MBytes per second in software on a DECstation 5000/200.
    System performance is degraded by a few percent for normal
    activities (such as compiling or editing), and as much as a factor
    of 1.6 for file system intensive operations (such as copying
    multi-megabyte files).  Hardware compression devices mesh well with
    this design.  Chips are already available that operate at speeds
    exceeding disk transfer rates, which indicates that hardware
    compression would not only remove the performance degradation we
    observed, but might well increase the effective disk transfer rate
    beyond that obtainable from a system without compression.}"
}

It looks like implementing this on BSD LFS would not be too hard, as
the idea is trivial and BSD LFS is based on the same idea.  I do not
know how stable BSD LFS is; has anyone tried it lately?
--
Heikki Suonsivu, Täysikuu 10 C 83 / 02210 Espoo / FINLAND, hsu@clinet.fi
mobile +358-40-5519679  work +358-0-4375360  fax -4555276  home -8031121
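The reason the log-structured design in the Burrows et al. paper
quoted above sidesteps the usual block-compression allocation problem
is that a log only ever appends: each block is compressed on its way
to the disk, and the variable-sized result is written at the log tail,
so no fixed-size slot has to be guessed in advance.  A toy sketch of
that write path in C, with compress_block() as a placeholder rather
than anything from the paper:

#include <stdio.h>
#include <string.h>

#define BLKSZ 4096

static char   log_area[8 * BLKSZ];  /* stand-in for one disk segment */
static size_t log_tail;             /* next free byte in the log */

/* Placeholder compressor: pretend every block halves in size. */
static size_t compress_block(const char *in, char *out)
{
    size_t outlen = BLKSZ / 2;
    memcpy(out, in, outlen);
    return outlen;
}

/* Append one logical block; its log address is simply (offset, length),
 * so variable-sized compressed blocks cause no fragmentation. */
static int log_write_block(const char *blk, size_t *offp, size_t *lenp)
{
    char cbuf[BLKSZ];
    size_t clen = compress_block(blk, cbuf);

    if (log_tail + clen > sizeof(log_area))
        return -1;  /* segment full; a real LFS would start a new one */
    memcpy(log_area + log_tail, cbuf, clen);
    *offp = log_tail;
    *lenp = clen;
    log_tail += clen;
    return 0;
}

int main(void)
{
    char blk[BLKSZ];
    size_t off, len;

    memset(blk, 'x', sizeof(blk));
    if (log_write_block(blk, &off, &len) == 0)
        printf("block stored at log offset %zu, %zu bytes\n", off, len);
    return 0;
}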