Date: Thu, 16 Nov 2006 17:52:32 +0900 From: JINMEI Tatuya / =?ISO-2022-JP?B?GyRCP0BMQEMjOkgbKEI=?= <jinmei@isl.rdc.toshiba.co.jp> To: Max Laier <max@love2party.net> Cc: David Malone <dwmalone@maths.tcd.ie>, freebsd-hackers@freebsd.org, freebsd-net@freebsd.org Subject: Re: ipv6 connection hash function wanted ... Message-ID: <y7v64dfixin.wl%jinmei@isl.rdc.toshiba.co.jp> In-Reply-To: <200611142020.53178.max@love2party.net> References: <200611141709.26644.max@love2party.net> <20061114190930.GA23096@walton.maths.tcd.ie> <200611142020.53178.max@love2party.net>
next in thread | previous in thread | raw e-mail | index | archive | help
>>>>> On Tue, 14 Nov 2006 20:20:47 +0100, >>>>> Max Laier <max@love2party.net> said: >> > Any ideas? Any papers that deal with this problem? >> >> Assuming you don't want to use one of the standard cryptographic >> ones (which I can imagine being a bit slow for something done >> per-packet), then one option might be to use a simpler hash that >> is keyed. Choose the key at boot/module load time and make it hard >> to produce collisions unless you know the key. > That's exactly what I am looking for ... now I need someone[tm] - with > better Math-Knowledge than mine - to write such a thing down in a simple > formula :-) i.e. take those bits from there and there and XOR them with > your canary yada-yada-yada ... If you want something whose behavior is mathematically guaranteed, I'd recommend universal hashing as already suggested in this thread. One example implementation is given below, assuming the hash key is source and destination IPv6 addresses and transport layer ports(*). As shown in the implementation, it's pretty simple and should run reasonably fast. Also, as long as the random values are reasonably unpredictable, the expected probability of producing the same hash value for arbitrarily-chosen two different keys is 1/65537. In general, for any prime number p as the value of LARGE_PRIME, this probability is 1/p (so if 65537 is too large, we can use a smaller number, e.g., 1009, depending on the desired balance between collision risk and memory footprint for hash buckets). (*)Technically, using in6_addr to represent an IPv6 address may not be enough; we may want to take into account zone indices (sin6_scope_id) as part of hash keys. JINMEI, Tatuya Communication Platform Lab. Corporate R&D Center, Toshiba Corp. jinmei@isl.rdc.toshiba.co.jp #define HASHKEYLEN 38 /* sizeof(in6_addr) * 2 + sizeof(in_port_t) * 2 */ #define LARGE_PRIME 65537 /* * This should be filled with unpredictable random values between 0 * and LARGE_PRIME-1 at startup time. */ u_int32_t random_parameters[HASHKEYLEN]; u_int32_t hash(struct in6_addr *saddr, struct in6_addr *daddr, in_port_t sport, in_port_t dport) { int i, j = 0; u_int32_t accumulated = 0; u_char *cp; for (i = 0; i < sizeof(*saddr); i++) accumulated += saddr->s6_addr[i] * random_parameters[j++]; for (i = 0; i < sizeof(*saddr); i++) accumulated += daddr->s6_addr[i] * random_parameters[j++]; cp = (u_char *)&sport; for (i = 0; i < sizeof(sport); i++) accumulated += cp[i] * random_parameters[j++]; cp = (u_char *)&dport; for (i = 0; i < sizeof(dport); i++) accumulated += cp[i] * random_parameters[j++]; return (accumulated % LARGE_PRIME); }
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?y7v64dfixin.wl%jinmei>