Date: Mon, 4 Aug 2014 23:56:41 +0200 From: Jilles Tjoelker <jilles@stack.nl> To: Baptiste Daroussin <bapt@FreeBSD.org> Cc: arch@FreeBSD.org, hackers@FreeBSD.org Subject: Re: Add ohash (OpenBSD hash) into libutil Message-ID: <20140804215641.GA80850@stack.nl> In-Reply-To: <20140727230110.GI50802@ivaldir.etoilebsd.net> References: <20140727230110.GI50802@ivaldir.etoilebsd.net>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Jul 28, 2014 at 01:01:10AM +0200, Baptiste Daroussin wrote: > If noone objects I would like to bring ohash from OpenBSD into our > libutil. > Ohash is already in base (usr.bin/m4/ohash) having it into libutil > will increase compatibility with OpenBSD as well as allowing the rest > of the base system to be able to benefit from ohash implementation. Following OpenBSD's general low priority for binary compatibility issues, the functions have the application allocate struct ohash. For use in a symbol-versioned library that promises backwards compatibility, it would be better to leave the type incomplete in the public header and have the library allocate it. Adding these functions encourages using open addressing (no chaining) because that's what's available. This may be OK but it might be bad in specific situations. The automatic resizing of the hash table, absent in many home-grown hash table implementations, may help avoid such problems somewhat. The code in ohash_resize() that may set the hash table size to UINT_MAX violates the requirement that the probe interval (incr) must be relative prime to the hash table size (otherwise ohash_lookup_interval() might loop indefinitely examining only a subset of the table, while the rest of the table still has free space). Normally, the code enforces this requirement by making the hash table size a power of two and the probe interval an odd number. On another note, if the hash table size is always a power of two, the slow hv % h->size can be replaced by hv & (h->size - 1). The incr should probably only be calculated when needed, or using a method that does not involve dividing by a variable. (Note that hv doesn't contain sufficient bits for an independent i and hv if the hash table is larger than 65536 entries, so such large hash tables will be used less uniformly.) Utilities that value startup and fork performance (such as sh(1)) do not want to depend on additional DSOs (why is libutil separate from libc?). -- Jilles Tjoelker
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140804215641.GA80850>