Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 31 Dec 2007 12:36:29 +0100
From:      =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= <des@des.no>
To:        "Aryeh M. Friedman" <aryeh.friedman@gmail.com>
Cc:        freebsd-hackers@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: BSD license compatible hash algorithm?
Message-ID:  <86ejd3t7sy.fsf@ds4.des.no>
In-Reply-To: <4778294C.3030905@gmail.com> (Aryeh M. Friedman's message of "Sun\, 30 Dec 2007 18\:27\:08 -0500")
References:  <5950EE0C-383D-4D6B-9991-A0DEABD2ADE4@u.washington.edu> <20071228003716.GB48997@lor.one-eyed-alien.net> <B8D4C3C6-B867-4550-9F17-4DC6930D10E2@u.washington.edu> <fl2qiv$qoh$1@ger.gmane.org> <4774EF27.90307@gmail.com> <86r6h5zpr3.fsf@ds4.des.no> <4778294C.3030905@gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
"Aryeh M. Friedman" <aryeh.friedman@gmail.com> writes:
> Dag-Erling Sm=C3=B8rgrav <des@des.no> writes:
> > "Aryeh M. Friedman" <aryeh.friedman@gmail.com> writes:
> > > All hashs have issues with pooling.... see
> > > http://www.burtleburtle.net/bob/hash/index.html... btw it is a
> > > old wives tale that the number of buckets should be prime (mostly
> > > based on the very weak implementation Knuth offered)
> > Not an "old wives' tale", but rather an easy way to implement a
> > hash algorithm that is good enough for most simple uses: metric
> > modulo table size, where metric is a number derived from the item
> > in such a manner as to give a good spread.
> Sorry for taking a while to reply.... but the above only applies if
> your using a very primitive hash like Knuth's multiplication one....

You are overlooking a very common case: the use of a hash table to
implement an in-memory dictionary (aka associative array) where the key
is an integer with poor variability in the high-order bits.  K % N where
K is the key and N is the size of the table requires very little code,
is reasonably efficient, and, provided that N is prime, gives a
reasonably good spread (excpet in pathological cases where the values of
K are clustered around multiples of N).

> every modern hash I know of should have 2^k buckets actually for some
> k<2^32 [in almost all cases <2^16 except for algorithms like the one I
> mentioned I am working on which sets k=3Dn where n=3Dthe bit count of the
> key].

I certainly hope not.  2^(2^32) is several of billion orders of
magnitude more than the number of elementary particles in the known
universe (currently estimated at 10^80).  Even 2^(2^16) is too big by
about sixty thousand orders of magnitude.

DES
--=20
Dag-Erling Sm=C3=B8rgrav - des@des.no



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?86ejd3t7sy.fsf>