From owner-freebsd-hackers@FreeBSD.ORG Tue May 6 07:02:19 2008 Return-Path: Delivered-To: hackers@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3CD221065671; Tue, 6 May 2008 07:02:19 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (ZIM.MIT.EDU [18.95.3.101]) by mx1.freebsd.org (Postfix) with ESMTP id 07CD08FC28; Tue, 6 May 2008 07:02:18 +0000 (UTC) (envelope-from das@FreeBSD.ORG) Received: from zim.MIT.EDU (localhost [127.0.0.1]) by zim.MIT.EDU (8.14.2/8.14.2) with ESMTP id m466Pu2o068340; Tue, 6 May 2008 02:25:56 -0400 (EDT) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by zim.MIT.EDU (8.14.2/8.14.2/Submit) id m466Puxj068339; Tue, 6 May 2008 02:25:56 -0400 (EDT) (envelope-from das@FreeBSD.ORG) Date: Tue, 6 May 2008 02:25:56 -0400 From: David Schultz To: Roman Divacky Message-ID: <20080506062556.GA68171@zim.MIT.EDU> Mail-Followup-To: Roman Divacky , hackers@FreeBSD.ORG References: <20080505204350.GA45321@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080505204350.GA45321@freebsd.org> Cc: hackers@FreeBSD.ORG Subject: Re: hashinit versus phashinit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 May 2008 07:02:19 -0000 On Mon, May 05, 2008, Roman Divacky wrote: > hi > > when we want to use a hash table in kernel we call "hashinit" which > initializes a hash table with power-of-2 size. There's also "phashinit" > that creates hash table of size that is a prime number. This was > added in 1995 by davidg@ but it is not used anywhere in the kernel. > > phk@ commited rev. 1.30 of vfs_cache.c replacing phashinit with hashinit > stating that it's better because it replaces a division with logical and. > is this reason still valid today? (the commit was done almost 11 years ago) > > is there still any reason why not use the phashinit instead of hashinit? > I believe using prime-sized hash table might have positive performance > impact... There's a tradeoff. The argument for using powers of 2 is that division takes many times longer than a logical AND. The argument for using primes is that if your hash function isn't as good as you thought it was, or if the data has some regularity you weren't expecting, you can get screwed a lot more easily with a power of 2 hash table. With a prime-sized hash table, you only get screwed if lots of your data is congruent modulo the prime, which is very rare. Most general-purpose hash implementations I've used (e.g., GNU libstdc++, Sun JDK, Microsoft .NET) use prime table sizes, probably to make it less likely that programmers will shoot themselves in the foot with pathological data or bad hash functions.