From owner-freebsd-hackers Mon Jan 29 14: 8: 1 2001 Delivered-To: freebsd-hackers@freebsd.org Received: from devonshire.cnchost.com (devonshire.concentric.net [207.155.248.12]) by hub.freebsd.org (Postfix) with ESMTP id 82DE637B698 for ; Mon, 29 Jan 2001 14:07:43 -0800 (PST) Received: from bitblocks.com (ts016d06.sjc-ca.concentric.net [206.173.237.18]) by devonshire.cnchost.com id RAA02603; Mon, 29 Jan 2001 17:00:55 -0500 (EST) [ConcentricHost SMTP Relay 1.10] Message-ID: <3A75E812.F1182BF6@bitblocks.com> Date: Mon, 29 Jan 2001 14:00:50 -0800 From: Bakul Shah X-Mailer: Mozilla 4.72 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Poul-Henning Kamp Cc: Matt Dillon , hackers@FreeBSD.ORG Subject: Re: [kernel patch] fcntl(...) to close many descriptors References: <24289.980802466@critter> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-hackers@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.ORG > >If you can get to old CACMs see `Minimal Perfect Hash Functions Made Simple' > >by Richard J. Cichelli, Comm. of ACM, Jan 1980. AFAIK gperf uses some > >variation of that algorithm and may have some details. A minimal perfect hash > >function is only worth it (IMHO) when the set of input keys is mostly fixed and > >the hash function is used many many times (e.g. programming language keywords). > > And even then it's seldom worth it according to the people behind the LCC > compiler... I'd be interested in a reference if you have one [I don't doubt you, just curious). I used gperf to generate such a function in a verilog parser and came to the same conclusion but it can't be generalized to the syscall (or for that matter database) case. The reason it doesn't buy much in a compiler is because what is not a keyword is an identifier and you end up doing a symbol table lookup on it. So you may as well just do a symbol table search for everything (also, typically there are more identifiers than keywords in a program so it all comes out in a wash). This is not the case for a simple lookup (database or syscall) -- you don't do another lookup on the same string if the first search fails. I agree that for a very small number of syscalls a simple hash or even a strcmp is good enough but that doesn't scale well. Min. perfect hash function is about as good as it gets *if* one wants to allow for a large number of loadable syscalls *without* apriori assignment of numbers. With an IANA like central number assigning authority, preassignment of syscall numbers is far better. -- bakul To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message