Date: Wed, 15 Mar 95 10:26:08 MST From: terry@cs.weber.edu (Terry Lambert) To: swallace@ece.uci.edu (Steven Wallace) Cc: current@FreeBSD.org Subject: Re: MINFREE change to 8% Message-ID: <9503151726.AA13323@cs.weber.edu> In-Reply-To: <199503150733.AA07893@balboa.eng.uci.edu> from "Steven Wallace" at Mar 14, 95 11:33:21 pm
next in thread | previous in thread | raw e-mail | index | archive | help
> Could someone explain to the ignorant what in the world you are hashing? > (you are searching for something and use some algorithm as your hash > index into the 10% reserve space on the disk? (is this reserve > distributed or continuous at a specific portion of the disk?) Then > you find something... ?) How do skiplist compare and how would they > be implemented? You are hashing the block locations on the disk, with a high probablity that you won't get a collision in allocation until disk usage hits 80%, a lower probability to 85%, a pretty low probablity fo 90%+. This is all documented in Knuth, "Sorting and Searching". Knuth should be required reading for graduation with a CS degree. The reserve is distributed. Based on a used vs. free block count, you decide whether to hack or not (the not case means "disk full" to all but root). The skiplists are more generally expensive, meaning 1-2 extra compares per reference, but will effectively allow you to use all of disk space instead of hunting around for a long time in the case of a hash collision. They are a trade off, and not always a desirable one, but if people can't understand why their 2G unformatted drive becomes 1.7G when formatted (15% loss of space), then they aren't going to buy off on it becoming 1.5G (another 10%) when the establish another format subsidiary to the first. An implementation of skiplists is available via anonymous FTP from the host ftp.mv.com. The implementation at this site is the one from the Doctor Dobbs Journal articles on skiplists last year or so. Note that this isn't a condemnation of skiplists; they are appropriate to solve the "user want to overuse the drive at some penalty to their performance" problem. A product I recently worked on uses skiplists in its locking subsystem to facilitate IPC, and ends up with 150 times the lock/release rate of Tuxedo because of this. It's just that I don't necessarily agree that allowing the user to do that is preferrable to making them install a log structure file system or some other alternate approach to using an unreasonable amount of the disk. Oh, and preemptively: Time vs. space optimization has to do with a preference to use frags on a per file basis (the first case) or to start sharing them (at which point it become expensive to access them but you get less per file waste). Basically, in time optimized access, instead of using an existing frag to finish out the file, a new frag is allocated and partially used instead. Terry Lambert terry@cs.weber.edu --- Any opinions in this posting are my own and not those of my present or previous employers.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9503151726.AA13323>