From owner-freebsd-hackers@FreeBSD.ORG Thu Nov 16 20:01:10 2006 Return-Path: X-Original-To: freebsd-hackers@freebsd.org Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5446B16A412 for ; Thu, 16 Nov 2006 20:01:10 +0000 (UTC) (envelope-from youshi10@u.washington.edu) Received: from mxout4.cac.washington.edu (mxout4.cac.washington.edu [140.142.33.19]) by mx1.FreeBSD.org (Postfix) with ESMTP id EF12743D6B for ; Thu, 16 Nov 2006 20:01:09 +0000 (GMT) (envelope-from youshi10@u.washington.edu) Received: from smtp.washington.edu (smtp.washington.edu [140.142.32.141]) by mxout4.cac.washington.edu (8.13.7+UW06.06/8.13.7+UW06.09) with ESMTP id kAGK19Yc001988 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 16 Nov 2006 12:01:09 -0800 X-Auth-Received: from [128.208.4.96] (dzihan.cs.washington.edu [128.208.4.96]) (authenticated authid=youshi10) by smtp.washington.edu (8.13.7+UW06.06/8.13.7+UW06.09) with ESMTP id kAGK19ed018054 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Thu, 16 Nov 2006 12:01:09 -0800 Message-ID: <455CC385.10309@u.washington.edu> Date: Thu, 16 Nov 2006 12:01:09 -0800 From: Garrett Cooper User-Agent: Thunderbird 1.5.0.8 (X11/20061108) MIME-Version: 1.0 To: freebsd-hackers@freebsd.org References: <200611141709.26644.max@love2party.net> <20061114190930.GA23096@walton.maths.tcd.ie> <200611142020.53178.max@love2party.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 5.2.2.285561, Antispam-Engine: 2.5.0.283055, Antispam-Data: 2006.11.16.114932 X-Uwash-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_TEXT_ONLY 0, __MIME_VERSION 0, __SANE_MSGID 0, __USER_AGENT 0' Subject: Re: ipv6 connection hash function wanted ... X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Nov 2006 20:01:10 -0000 JINMEI Tatuya / ???? wrote: >>>>>> On Tue, 14 Nov 2006 20:20:47 +0100, >>>>>> Max Laier said: >>>>>> > > >>>> Any ideas? Any papers that deal with this problem? >>>> >>> Assuming you don't want to use one of the standard cryptographic >>> ones (which I can imagine being a bit slow for something done >>> per-packet), then one option might be to use a simpler hash that >>> is keyed. Choose the key at boot/module load time and make it hard >>> to produce collisions unless you know the key. >>> > > >> That's exactly what I am looking for ... now I need someone[tm] - with >> better Math-Knowledge than mine - to write such a thing down in a simple >> formula :-) i.e. take those bits from there and there and XOR them with >> your canary yada-yada-yada ... >> > > If you want something whose behavior is mathematically guaranteed, I'd > recommend universal hashing as already suggested in this thread. > > One example implementation is given below, assuming the hash key is > source and destination IPv6 addresses and transport layer ports(*). > As shown in the implementation, it's pretty simple and should run > reasonably fast. Also, as long as the random values are reasonably > unpredictable, the expected probability of producing the same hash > value for arbitrarily-chosen two different keys is 1/65537. In > general, for any prime number p as the value of LARGE_PRIME, this > probability is 1/p (so if 65537 is too large, we can use a smaller > number, e.g., 1009, depending on the desired balance between collision > risk and memory footprint for hash buckets). > > (*)Technically, using in6_addr to represent an IPv6 address may not be > enough; we may want to take into account zone indices (sin6_scope_id) > as part of hash keys. > > JINMEI, Tatuya > Communication Platform Lab. > Corporate R&D Center, Toshiba Corp. > jinmei@isl.rdc.toshiba.co.jp > > #define HASHKEYLEN 38 /* sizeof(in6_addr) * 2 + sizeof(in_port_t) * 2 */ > #define LARGE_PRIME 65537 > > /* > * This should be filled with unpredictable random values between 0 > * and LARGE_PRIME-1 at startup time. > */ > u_int32_t random_parameters[HASHKEYLEN]; > > u_int32_t > hash(struct in6_addr *saddr, struct in6_addr *daddr, > in_port_t sport, in_port_t dport) > { > int i, j = 0; > u_int32_t accumulated = 0; > u_char *cp; > > for (i = 0; i < sizeof(*saddr); i++) > accumulated += saddr->s6_addr[i] * random_parameters[j++]; > for (i = 0; i < sizeof(*saddr); i++) > accumulated += daddr->s6_addr[i] * random_parameters[j++]; > cp = (u_char *)&sport; > for (i = 0; i < sizeof(sport); i++) > accumulated += cp[i] * random_parameters[j++]; > cp = (u_char *)&dport; > for (i = 0; i < sizeof(dport); i++) > accumulated += cp[i] * random_parameters[j++]; > > return (accumulated % LARGE_PRIME); > } Jinmei, Wouldn't you get some overflow if (pardon my set notation) "INT_MAX < {saddr,daddr}->s6_addr[i] * random_parameters[j++]", which would implicitly introduce collision into your algorithm. Or is the overall set size sufficiently large not to worry about this particular issue? -Garrett