Date:      Wed, 24 Aug 2011 18:02:35 +0300
From:      Gleb Kurtsou <gleb.kurtsou@gmail.com>
To:        Hiroki Sato <hrs@FreeBSD.org>
Cc:        kostikbel@gmail.com, rmacklem@uoguelph.ca, pjd@FreeBSD.org, current@FreeBSD.org, kaduk@MIT.EDU
Subject:   Re: fsid change of ZFS?
Message-ID:  <20110824150235.GA46460@tops>
In-Reply-To: <20110824.213458.806017948592590395.hrs@allbsd.org>
References:  <1614657395.247867.1314130280524.JavaMail.root@erie.cs.uoguelph.ca> <20110823212301.GE1697@garage.freebsd.pl> <20110824082119.GJ17489@deviant.kiev.zoral.com.ua> <20110824.213458.806017948592590395.hrs@allbsd.org>

On (24/08/2011 21:34), Hiroki Sato wrote:
> Kostik Belousov <kostikbel@gmail.com> wrote
>   in <20110824082119.GJ17489@deviant.kiev.zoral.com.ua>:
> 
> ko> On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
> ko> > On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
> ko> > > Here's the patch. (Hiroki could you please test this, thanks, rick.)
> ko> > > ps: If the white space gets trashed, the same patch is at:
> ko> > >    http://people.freebsd.org/~rmacklem/fsid.patch
> ko> >
> ko> > The patch is fine by me. Thanks, Rick!
> ko>
> ko> Sorry, I am late.
> ko>
> ko> It seems that the probability of collisions for the hash is quite high.
> ko> Due to the fixup procedure, the resulting typenum will depend on the order
> ko> of module initialization, won't it? IMO, that means the goal of the patch
> ko> is not met.
> 
>  I tried the following two experiments (the complete results are
>  attached) to confirm the probability:
> 
>  1. [fsidhash1.txt]
> 	well-known vfc_name and the names "[a-z]fs" (# of names is 36)
> 	with no fix-up recalculation.
> 
>  2. [fsidhash2.txt]
> 	well-known vfc_name and the names "[a-z][a-z]fs" (# of names is 710)
> 	with no fix-up recalculation.
> 
>  There is no collision in case 1.  When [a-z][a-z]fs are included,
>  the average number of names sharing the same hash value is 4.43
>  (i.e. 160 different hash values are generated; the theoretical
>  best is (710 entries / 256 buckets) = 2.77).
Could you run the same test with fnv_32_buf()? The collision rate is
likely to be lower.
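
A rough userland sketch of such a test, for illustration only. It
reimplements the 32-bit FNV-1 loop with the standard published offset
basis and prime rather than including <sys/fnv_hash.h>, so the initial
value may differ from the one the kernel header uses. The xor-fold down
to 8 bits is my assumption about how the result would be reduced to 256
buckets, and the helper names (fnv1_32, fold8, distinct_buckets_aafs)
are made up for this example. It covers only the 676 synthetic
"[a-z][a-z]fs" names, not the well-known vfc_names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard 32-bit FNV-1 parameters (assumed; not read from the kernel header). */
#define FNV1_32_OFFSET 2166136261u
#define FNV1_32_PRIME  16777619u

/* FNV-1: multiply by the prime, then xor in the next byte. */
static uint32_t
fnv1_32(const void *buf, size_t len, uint32_t hval)
{
	const unsigned char *s = buf;

	while (len-- != 0) {
		hval *= FNV1_32_PRIME;
		hval ^= *s++;
	}
	return (hval);
}

/* Hypothetical reduction to 8 bits: xor-fold all four bytes together. */
static unsigned
fold8(uint32_t h)
{
	h ^= h >> 16;
	h ^= h >> 8;
	return (h & 0xff);
}

/* Count how many of the 256 buckets the names "[a-z][a-z]fs" occupy. */
static unsigned
distinct_buckets_aafs(void)
{
	unsigned char used[256] = { 0 };
	char name[] = "aafs";
	unsigned k, n = 0;

	for (char a = 'a'; a <= 'z'; a++) {
		for (char b = 'a'; b <= 'z'; b++) {
			name[0] = a;
			name[1] = b;
			k = fold8(fnv1_32(name, strlen(name), FNV1_32_OFFSET));
			if (!used[k]) {
				used[k] = 1;
				n++;
			}
		}
	}
	return (n);
}
```

Dividing 676 by the value returned from distinct_buckets_aafs() gives
the average number of names per occupied bucket, which can be compared
with the 4.43 figure quoted above.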

>  Fortunately, the vfc_names we currently have in our kernel code do
>  not collide.  As you noticed, "[a-z][a-z]fs" is an impractical data
>  set, and these results cannot describe the behavior for all possible
>  and practical vfc_names, so whether this hash is reasonable depends
>  on how we think of them.
>  Comments or better ideas?
> 
> -- Hiroki


