Date: Sun, 30 Dec 2007 19:59:38 -0800 From: perryh@pluto.rain.com To: aryeh.friedman@gmail.com, des@des.no Cc: freebsd-hackers@freebsd.org, ivoras@freebsd.org Subject: Re: BSD license compatible hash algorithm? Message-ID: <4778692a.rVZUrFy36ANLKL7F%perryh@pluto.rain.com> In-Reply-To: <4778294C.3030905@gmail.com> References: <5950EE0C-383D-4D6B-9991-A0DEABD2ADE4@u.washington.edu> <20071228003716.GB48997@lor.one-eyed-alien.net> <B8D4C3C6-B867-4550-9F17-4DC6930D10E2@u.washington.edu> <fl2qiv$qoh$1@ger.gmane.org> <4774EF27.90307@gmail.com> <86r6h5zpr3.fsf@ds4.des.no> <4778294C.3030905@gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
"Aryeh M. Friedman" <gmail.com!aryeh.friedman@agora.rdrop.com> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dag-Erling Sm??rgrav wrote: > > "Aryeh M. Friedman" <aryeh.friedman@gmail.com> writes: > >> All hashs have issues with pooling.... see > >> http://www.burtleburtle.net/bob/hash/index.html... btw it is > >> a old wives tale that the number of buckets should be prime > >> (mostly based on the very weak implementation Knuth offered) > > > > Not an "old wives' tale", but rather an easy way to implement a > > hash algorithm that is good enough for most simple uses: metric > > modulo table size, where metric is a number derived from the > > item in such a manner as to give a good spread. > > ... the above only applies if your using a very primitive hash > like Knuth's multiplication one.... every modern hash I know of > should have 2^k buckets actually for some k ... It very much depends on what is used for a rehash (collision step) value. The step value and the table size must have no common factor larger than 1, or there will be edge cases (bugs) in which some combinations of hash and step values will cause the table to appear full when in fact it is not. Making the table size prime is one simple way of preventing such problems, while still allowing the rehash value to depend on the key (thus reducing the liklihood of collision on the second probe). At the other extreme, the table can be any size at all if the step value is 1 (and the "modulo table size" operation will certainly be more efficient if the table size is 2^k).
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4778692a.rVZUrFy36ANLKL7F%perryh>