From owner-freebsd-current  Wed Mar 15 09:32:29 1995
Return-Path: current-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id JAA05807 for current-outgoing; Wed, 15 Mar 1995 09:32:29 -0800
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16]) by freefall.cdrom.com (8.6.10/8.6.6) with SMTP id JAA05800 for <current@freebsd.org>; Wed, 15 Mar 1995 09:32:28 -0800
Received: by cs.weber.edu (4.1/SMI-4.1.1)
	id AA13323; Wed, 15 Mar 95 10:26:09 MST
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9503151726.AA13323@cs.weber.edu>
Subject: Re: MINFREE change to 8%
To: swallace@ece.uci.edu (Steven Wallace)
Date: Wed, 15 Mar 95 10:26:08 MST
Cc: current@FreeBSD.org
In-Reply-To: <199503150733.AA07893@balboa.eng.uci.edu> from "Steven Wallace" at Mar 14, 95 11:33:21 pm
X-Mailer: ELM [version 2.4dev PL52]
Sender: current-owner@FreeBSD.org
Precedence: bulk

> Could someone explain to the ignorant what in the world you are hashing?
> (you are searching for something and use some algorithm as your hash
> index into the 10% reserve space on the disk?  (is this reserve
> distributed or continuous at a specific portion of the disk?)  Then
> you find something... ?)  How do skiplist compare and how would they
> be implemented?

You are hashing the block locations on the disk, with a high probablity
that you won't get a collision in allocation until disk usage hits
80%, a lower probability to 85%, a pretty low probablity fo 90%+.
This is all documented in Knuth, "Sorting and Searching".  Knuth
should be required reading for graduation with a CS degree.

The reserve is distributed.  Based on a used vs. free block count, you
decide whether to hack or not (the not case means "disk full" to all
but root).

The skiplists are more generally expensive, meaning 1-2 extra compares
per reference, but will effectively allow you to use all of disk space
instead of hunting around for a long time in the case of a hash
collision.  They are a trade off, and not always a desirable one, but
if people can't understand why their 2G unformatted drive becomes 1.7G
when formatted (15% loss of space), then they aren't going to buy
off on it becoming 1.5G (another 10%) when the establish another format
subsidiary to the first.

An implementation of skiplists is available via anonymous FTP from the
host ftp.mv.com.

The implementation at this site is the one from the Doctor Dobbs Journal
articles on skiplists last year or so.

Note that this isn't a condemnation of skiplists; they are appropriate
to solve the "user want to overuse the drive at some penalty to their
performance" problem.  A product I recently worked on uses skiplists
in its locking subsystem to facilitate IPC, and ends up with 150
times the lock/release rate of Tuxedo because of this.

It's just that I don't necessarily agree that allowing the user to do
that is preferrable to making them install a log structure file system
or some other alternate approach to using an unreasonable amount of
the disk.


Oh, and preemptively:  Time vs. space optimization has to do with a
preference to use frags on a per file basis (the first case) or to
start sharing them (at which point it become expensive to access
them but you get less per file waste).  Basically, in time optimized
access, instead of using an existing frag to finish out the file, a
new frag is allocated and partially used instead.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.