From owner-svn-src-all@FreeBSD.ORG Tue Jul 2 05:08:29 2013 Return-Path: Delivered-To: svn-src-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 503FDCE3; Tue, 2 Jul 2013 05:08:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail106.syd.optusnet.com.au (mail106.syd.optusnet.com.au [211.29.132.42]) by mx1.freebsd.org (Postfix) with ESMTP id F21F41654; Tue, 2 Jul 2013 05:08:28 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail106.syd.optusnet.com.au (Postfix) with ESMTPS id DF5553C0EAB; Tue, 2 Jul 2013 14:41:57 +1000 (EST) Date: Tue, 2 Jul 2013 14:41:57 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: "Pedro F. Giffuni" Subject: Re: svn commit: r252484 - head/sys/ufs/ffs In-Reply-To: <201307012143.r61Lhemi067176@svn.freebsd.org> Message-ID: <20130702130818.V865@besplex.bde.org> References: <201307012143.r61Lhemi067176@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=K8x6hFqI c=1 sm=1 a=K9A2evXCcLcA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=Nw6HGA-A-RMA:10 a=j_m6SzapsullctZUoX8A:9 a=CjuIK1q_8ugA:10 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Jul 2013 05:08:29 -0000 On Mon, 1 Jul 2013, Pedro F. Giffuni wrote: > Log: > Change i_gen in UFS to an unsigned type. > > Revert the simplification of the i_gen calculation. > It is still a good idea to avoid zero values and for the case > of old filesystems there is probably no advantage in using > the complete 32 bits anyways. Er, this doesn't revert it. > Modified: head/sys/ufs/ffs/ffs_vfsops.c > ============================================================================== > --- head/sys/ufs/ffs/ffs_vfsops.c Mon Jul 1 21:41:12 2013 (r252483) > +++ head/sys/ufs/ffs/ffs_vfsops.c Mon Jul 1 21:43:40 2013 (r252484) > @@ -1791,7 +1791,7 @@ ffs_vgetf(mp, ino, flags, vpp, ffs_flags > * already have one. This should only happen on old filesystems. > */ > if (ip->i_gen == 0) { > - ip->i_gen = arc4random(); > + ip->i_gen = arc4random()/2 + 1; Missing spaces around binary operator. The old version was formatted normally. > if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) { > ip->i_flag |= IN_MODIFIED; > DIP_SET(ip, i_gen, ip->i_gen); Also, IIRC ext2fs still hasn't caught up with the change from random() to arc4random() in 2003. random() / 2 + 1 doesn't work as intended on 64-bit arches, since random() returns u_long, so when u_long is 64 bits, dividing it by 2 leaves 63 bits, which is too many for i_gen. Also, random(9) is internally broken on 64-bit arches. It shouldn't exist. It is not random() at all, but just rand() with the clamp to RAND_MAX removed. Its linear congruential generator is suboptimal for 64 bits, and other parts of its algorithm are just broken for 64 bits. It uses longs internally to try to fake unsigned 31 bits, but the faking is broken in different ways depending on the size of long. It is documented to return a 31-bit value, but on 32-bit arches it seems to be possible for it to return 0xfffffffff, and on 64-bit arches it does return full 64-bit values. random(9) was cloned from rand(3). The userland versions have been edited a bit, but still have most of the bugs: random(3) uses an internal copy of rand(3) named good_rand() for seeding. If USE_WEAK_SEEDING is defined, this uses a weaker linear congruential generator (LCG). This uses int32_t instead of long internally, and doesn't attempt to reduce the value to 31 bits. Otherwise, the same LCG as random(9) is used, with the same buggy code except for changing the longs to 32-bits. Using int32_t gives the same overflow bugs on all (supported) arches, and avoids returning invalid values half the time on 64-bit arches. There are 2 callers of good_rand(), and only 1 of them discards bits above the 31st. srand() also supports USE_WEAK_SEEDING. It uses u_long internally, so it is actually correct. The internal value has the number of bits in a u_long and is generated without overflow and without any bias in the reduction to 31 bits. Then returning this value as an int in by taking the value modulo ((u_long)RAND_MAX + 1) gives a correct reduction to 31 bits when RAND_MAX is 0x7fffffff (or 15 bits if RAND_MAX is 0x7fff, etc.). srand() in the !USE_WEAK_SEEDING case still uses the same buggy code as random(9), with type long, so the internal values overflow and the inernal reduction to 31 bits is buggy, with the bugs depending on the size of long and other things. But it is mostly saved by taking the value modulo ((u_long)RAND_MAX + 1). This reduces to a valid value and leaves only minor biases from the buggy earlier reduction. random(6) used to have bugs related to the buggy internal reduction, and the biases from these were noticed. It uses floating point, so the reduction was easier, but it was still done wrong, by dividing by LONG_MAX instead of RANDOM_MAX_PLUS1. Using LONG_MAX is like using 0x7fffffff in random(9), but obviously buggier since the range for both is documented to be [0,0x7fffffff]. Hard-coding 0x7fffffff in random(6) would have been equally buggy. I think 0x80000000 is correct in both, but in random(9) this assumes too much about type sizes and layouts. The correct method in integer code is to take an unsigned modulus with divisor 0x80000000. Let the compiler optimize this to masking with 0x7fffffff. This depends on the maximum value plus 1 being representable in an unsigned type. For rand(), this occurs because rand() returns int, and for random(3), this occurs because random(3) returns long. Another bug in random(9) is that it returns u_long, so its API is different from random(3), but since it is documented to return only 31 bits, it is not really different except in the buggy cases where it returns 32-64 bits. Bruce