From: Dag-Erling Smørgrav
To: "Aryeh M. Friedman"
Cc: freebsd-hackers@freebsd.org, Ivan Voras
Subject: Re: BSD license compatible hash algorithm?
Date: Mon, 31 Dec 2007 12:36:29 +0100
Message-ID: <86ejd3t7sy.fsf@ds4.des.no>
In-Reply-To: <4778294C.3030905@gmail.com>

"Aryeh M. Friedman" writes:
> Dag-Erling Smørgrav writes:
> > "Aryeh M. Friedman" writes:
> > > All hashs have issues with pooling.... see
> > > http://www.burtleburtle.net/bob/hash/index.html... btw it is a
> > > old wives tale that the number of buckets should be prime (mostly
> > > based on the very weak implementation Knuth offered)
> > Not an "old wives' tale", but rather an easy way to implement a
> > hash algorithm that is good enough for most simple uses: metric
> > modulo table size, where metric is a number derived from the item
> > in such a manner as to give a good spread.
> Sorry for taking a while to reply.... but the above only applies if
> your using a very primitive hash like Knuth's multiplication one....

You are overlooking a very common case: the use of a hash table to
implement an in-memory dictionary (aka associative array) where the key
is an integer with poor variability in the high-order bits.  K % N,
where K is the key and N is the size of the table, requires very little
code, is reasonably efficient, and, provided that N is prime, gives a
reasonably good spread (except in pathological cases where the values
of K are clustered around multiples of N).

> every modern hash I know of should have 2^k buckets actually for some
> k<2^32 [in almost all cases <2^16 except for algorithms like the one I
> mentioned I am working on which sets k=n where n=the bit count of the
> key].

I certainly hope not.  2^(2^32) is roughly 10^(1.3 * 10^9), which is
more than a billion orders of magnitude above the number of elementary
particles in the known universe (currently estimated at 10^80).  Even
2^(2^16), roughly 10^19728, is too big by nearly twenty thousand orders
of magnitude.

DES
-- 
Dag-Erling Smørgrav - des@des.no