Date: Wed, 24 Aug 2011 18:02:35 +0300
From: Gleb Kurtsou <gleb.kurtsou@gmail.com>
To: Hiroki Sato <hrs@FreeBSD.org>
Cc: kostikbel@gmail.com, rmacklem@uoguelph.ca, pjd@FreeBSD.org, current@FreeBSD.org, kaduk@MIT.EDU
Subject: Re: fsid change of ZFS?
Message-ID: <20110824150235.GA46460@tops>
In-Reply-To: <20110824.213458.806017948592590395.hrs@allbsd.org>
References: <1614657395.247867.1314130280524.JavaMail.root@erie.cs.uoguelph.ca>
 <20110823212301.GE1697@garage.freebsd.pl>
 <20110824082119.GJ17489@deviant.kiev.zoral.com.ua>
 <20110824.213458.806017948592590395.hrs@allbsd.org>

On (24/08/2011 21:34), Hiroki Sato wrote:
> Kostik Belousov <kostikbel@gmail.com> wrote
>   in <20110824082119.GJ17489@deviant.kiev.zoral.com.ua>:
>
> ko> On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
> ko> > On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
> ko> > > Here's the patch. (Hiroki could you please test this, thanks, rick.)
> ko> > > ps: If the white space gets trashed, the same patch is at:
> ko> > >     http://people.freebsd.org/~rmacklem/fsid.patch
> ko> >
> ko> > The patch is fine by me. Thanks, Rick!
> ko>
> ko> Sorry, I am late.
> ko>
> ko> It seems that the probability of the collisions for the hash is quite high.
> ko> Due to the fixup procedure, the resulting typenum will depend on the order
> ko> of the module initialization, isn't it ? IMO, it makes the patch goal not
> ko> met.
>
>  I tried the following two experiments (the complete results are
>  attached) to confirm the probability:
>
>  1. [fsidhash1.txt]
>     well-known vfc_name and the names "[a-z]fs" (# of names is 36)
>     with no fix-up recalculation.
>
>  2. [fsidhash2.txt]
>     well-known vfc_name and the names "[a-z][a-z]fs" (# of names is 710)
>     with no fix-up recalculation.
>
>  There is no collision in the case 1.  And when [a-z][a-z]fs are
>  included the average number of the collided names in the same hash
>  value is 4.43 (i.e. 160 different hash values are generated, the
>  theoretical best number is (710 entries / 256 buckets) = 2.77).

Could you run the same test with fnv_32_buf()? Collision rate is likely
to be lower.

>  At least, vfc_names we currently have in our kernel code have no
>  collision, fortunately.  As you noticed "[a-z][a-z]fs" is an
>  impractical data set and these results cannot explain the
>  characteristics for all possible and practical vfc_names, so whether
>  this hash is reasonable or not depends on how we think of them.
>  Comments or other better idea?
>
> -- Hiroki
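
For reference, the FNV test suggested above could be sketched roughly as follows. This is only a userland approximation, not the code from the patch or the attached experiments. It reimplements 32-bit FNV-1 locally instead of using the kernel's fnv_32_buf(), and the names prefixed with sketch_ are hypothetical. The initial value is the standard FNV-1 offset basis, which may differ from the constant the FreeBSD header uses, and folding the hash to 256 buckets by masking the low 8 bits is an assumption. The name set is also assumed to be only the 676 "[a-z][a-z]fs" names, without the well-known vfc_names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Standard 32-bit FNV-1 constants.  Assumption: the FreeBSD header may
 * define a different initial value for its fnv_32_buf().
 */
#define SKETCH_FNV1_32_INIT	0x811c9dc5U
#define SKETCH_FNV_32_PRIME	0x01000193U
#define SKETCH_NBUCKETS		256

/* FNV-1: multiply by the prime, then XOR in the next byte. */
static uint32_t
sketch_fnv_32_buf(const void *buf, size_t len, uint32_t hval)
{
	const uint8_t *s = buf;

	while (len-- != 0) {
		hval *= SKETCH_FNV_32_PRIME;
		hval ^= *s++;
	}
	return (hval);
}

/*
 * Hash every "[a-z][a-z]fs" name and return how many of the 256
 * buckets end up holding at least one name.
 */
static int
sketch_count_buckets(void)
{
	unsigned char used[SKETCH_NBUCKETS];
	char name[5];
	uint32_t h;
	int a, b, distinct;

	memset(used, 0, sizeof(used));
	name[2] = 'f';
	name[3] = 's';
	name[4] = '\0';
	distinct = 0;
	for (a = 'a'; a <= 'z'; a++) {
		for (b = 'a'; b <= 'z'; b++) {
			name[0] = (char)a;
			name[1] = (char)b;
			h = sketch_fnv_32_buf(name, strlen(name),
			    SKETCH_FNV1_32_INIT);
			/* Assumption: fold to 8 bits by masking. */
			if (!used[h & 0xff]) {
				used[h & 0xff] = 1;
				distinct++;
			}
		}
	}
	return (distinct);
}
```

Dividing 676 by the value returned from sketch_count_buckets() gives the average number of names per used bucket. That is the same kind of figure as the 4.43 quoted above, though not directly comparable, since the name set and the folding differ.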