From owner-svn-src-head@freebsd.org Tue May 24 14:47:28 2016 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 49739B481E0 for ; Tue, 24 May 2016 14:47:28 +0000 (UTC) (envelope-from pfg@FreeBSD.org) Received: from nm48-vm7.bullet.mail.bf1.yahoo.com (nm48-vm7.bullet.mail.bf1.yahoo.com [216.109.115.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 11A871A57 for ; Tue, 24 May 2016 14:47:27 +0000 (UTC) (envelope-from pfg@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1464101240; bh=VySKh377OV8Nj1W8csiYM3fQNmLLxHpN9apdvgJmTk8=; h=Subject:To:References:Cc:From:Date:In-Reply-To:From:Subject; b=XOOf9wdM6qpQYAAQ35cnOPbPYEL3T1LFAL4FV7VP1GvG07XERekzm0fIgiPa1P7pG2PdLO41pVnMR+dbpka5kkBDLETmfC80uuoFzDbfdu47GzYu0aybimc0t5G7uFQMcfAmt5FBQG7rolJhtOvWnXvlCQilHRquOcLWZE0Mac7GXxkcii/TdtinY5uNhDBgZUlNOCTggQmOd0FcVmcCdtapxLdnu3LLgnOArI80itTo63p0vSQ+pMRRTwRF2iZt9MDPVFOsEwPvDGyYefTPrN0IBu3JW/Mg+eYoOeAhIZbyUK+KSLbvonHuAf9bxlGHpwyzYVnz3cv4FEAvuyzHKg== Received: from [66.196.81.171] by nm48.bullet.mail.bf1.yahoo.com with NNFMP; 24 May 2016 14:47:20 -0000 Received: from [68.142.230.72] by tm17.bullet.mail.bf1.yahoo.com with NNFMP; 24 May 2016 14:47:20 -0000 Received: from [127.0.0.1] by smtp229.mail.bf1.yahoo.com with NNFMP; 24 May 2016 14:47:20 -0000 X-Yahoo-Newman-Id: 543301.25536.bm@smtp229.mail.bf1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: FEjh0UQVM1kt_5UIkCeZS8ri1MvAaGtvhjCIhypCGqU6FV3 MGmDKiHm9qwYaw3zpNzOe654P7JQJKVU4_Dwxxt0_ny00MroEhFP7v8ssiYH QHtlZtYlhQ3T5OGyy9byPQoayvsELuLccbbrRKZ4_LAgr.znXOq9jdfen.Pt 01i5MlxjDUMsG1.OdpC50KLk890HKjwu8zQznwWSZyo3gCvaxSBRcHl0xMQv 2GGqvZl3QPbcyVeqJVoCWLh89moT1CYgF7H9L4ICXBM0XACUUR24Dnm8DA3r xv418uLhmFIv4eqLhV7DcGAkIHFuerFTR6oLvV4p4zaxIaQQLlDkeCtojymG _WxtqrVAWRXGJhpfeXV4LEmCY4gazUGGJTWovSmZvOEzp5z4ncuuat9EK3H4 u2IPOorq85N5HMN06KBdCGVXEqDBNKznDf827MgDhncjstZksKT6a1TnGGXm 25r5Y2mEVr1gjdW1IQDDbSh11w3Y_JkkTiojKhEnvjn_uGU06qjiF8zrgBx1 cREJZac8YOPMt5wL4iGXnQBpLRS5REOb1 X-Yahoo-SMTP: xcjD0guswBAZaPPIbxpWwLcp9Unf Subject: Re: svn commit: r300423 - in head/sys: fs/ext2fs ufs/ffs To: Bruce Evans , Kevin Lo References: <201605221431.u4MEVKXC007524@repo.freebsd.org> <20160524195127.C931@besplex.bde.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org From: Pedro Giffuni Message-ID: Date: Tue, 24 May 2016 09:47:38 -0500 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <20160524195127.C931@besplex.bde.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 May 2016 14:47:28 -0000 On 05/24/16 07:20, Bruce Evans wrote: > On Sun, 22 May 2016, Kevin Lo wrote: > >> Log: >> arc4random() returns 0 to (2**32)−1, use an alternative to initialize >> i_gen if it's zero rather than a divide by 2. >> >> With inputs from delphij, mckusick, rmacklem >> >> Reviewed by: mckusick >> ... >> Modified: head/sys/fs/ext2fs/ext2_alloc.c >> ============================================================================== >> >> --- head/sys/fs/ext2fs/ext2_alloc.c Sun May 22 14:13:20 2016 >> (r300422) >> +++ head/sys/fs/ext2fs/ext2_alloc.c Sun May 22 14:31:20 2016 >> (r300423) >> @@ -408,7 +408,8 @@ ext2_valloc(struct vnode *pvp, int mode, >> /* >> * Set up a new generation number for this inode. >> */ >> - ip->i_gen = arc4random(); >> + while (ip->i_gen == 0 || ++ip->i_gen == 0) >> + ip->i_gen = arc4random(); > > This is a correct implementation of ffs's intended method, but ffs's > intended method is wrong (see below for its wrongness). > > Correctness depends on i_gen having type uint32_t in ext2fs. This > makes the code +ip_i_gen undead, so i_gen is re-randomized occasionally > (averaged over all inodes, once for every ~2 billionth reuse of an > inode, which is practically never. Bugs in ffs prevent it being done > even that often there. So the re-randomization is almost useless. I > think it is slighty worse than useless, since it may give the same > i_gen immediately, while always incrementing (but skipping 0) always > gives a new i_gen. > > ext2fs might not need this at all. For ffs, then special case of > i_gen == 0 must be handled because we still pretend to support file > systems created by newfs versions almost 25 years old. newfs didn't > initialize di_gen back then, so all inodes started with di_gen == 0. > ext2fs is less that 25 years old, so it has less history to support. > I agree ext2fs, and likely UFS2, can handle the case of di_gen == 0. The additional question is where do we actually need to do randomization in ext2fs? UFS does the randomization when formatting so ffs_vget doesn't do randomization at all. arc4random is AFAICT, only required for very old UFS1. I lost interest in ext2fs and can't be bothered to look in linux ;). >> >> vfs_timestamp(&ts); >> ip->i_birthtime = ts.tv_sec; >> >> Modified: head/sys/fs/ext2fs/ext2_vfsops.c >> ============================================================================== >> >> --- head/sys/fs/ext2fs/ext2_vfsops.c Sun May 22 14:13:20 2016 >> (r300422) >> +++ head/sys/fs/ext2fs/ext2_vfsops.c Sun May 22 14:31:20 2016 >> (r300423) >> @@ -998,7 +998,8 @@ ext2_vget(struct mount *mp, ino_t ino, i >> * already have one. This should only happen on old filesystems. >> */ >> if (ip->i_gen == 0) { >> - ip->i_gen = random() + 1; >> + while (ip->i_gen == 0) >> + ip->i_gen = arc4random(); > > This is correct, but might be unnecessary (see above). "on old > filesystems" > was copied from ffs where it meant "on file systems created by versions > of newfs that didn't initialize di_gen". Such file systems are restricted > to ffs1. But i_gen stil occurs due to bugs in newfs -- it is missing this > loop, so it sometimes sets i_gen to the random value of 1, and we > can't/don't > tell the difference between this and unitialized. > > I think ext2fs is more like ffs2 in a relevant way here. Both have the > feature of speeding up newfs by only writing a few inodes. Thus most > inode allocations occur in the kernel, and it hardly matters if newfs > initialized i_gen for the few inodes that it initialized. extfs and > ext2fs also have an inode non-clearing feature on unlink that might > be relevant. I forget if ffs2 has this. So the code should be simplified > by never expecting newfs to initialize i_gen to nonzero. It already > almost does this, except in the comment. For this, it is necessary to > skip i_gen == 0 (mod 2**32). > >> if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) >> ip->i_flag |= IN_MODIFIED; >> } >> >> Modified: head/sys/ufs/ffs/ffs_alloc.c >> ============================================================================== >> >> --- head/sys/ufs/ffs/ffs_alloc.c Sun May 22 14:13:20 2016 (r300422) >> +++ head/sys/ufs/ffs/ffs_alloc.c Sun May 22 14:31:20 2016 (r300423) >> @@ -1102,8 +1102,8 @@ dup_alloc: >> /* >> * Set up a new generation number for this inode. >> */ >> - if (ip->i_gen == 0 || ++ip->i_gen == 0) >> - ip->i_gen = arc4random() / 2 + 1; >> + while (ip->i_gen == 0 || ++ip->i_gen == 0) >> + ip->i_gen = arc4random(); > > This is broken due to an old type error. In ffs, i_gen has type uint64_t, > so it is physically impossible for ++ip->i_gen to wrap to 0. > (i_gen is initialized from di_gen, and that has type uint32_t, so > i_gen is initially <= UINT32_MAX and it would take 584 years to wrap > with the modest inode recycling period of 1 nanosecond.) > So the above is an obfuscated way of writing: > > if (ip->i_gen == 0) { > /* > * This value means uninitialized (or a bug). Init now. > * The loop is to not have the usual bug here. > */ > do > ip->i_gen = arc4random(); > while (ip->i_gen == 0); > } else > ip->i_gen++; > > Now it is clear that i_gen can grow far above UINT32_MAX. But usually > it doesn't. Growth above UINT32_MAX gets truncated when the vnode is > recycled. Overflow occurs with i_gen is stored to di_gen, and growth > resumes at a small truncated value. > > The type error gives truncation on most uses of i_gen: di_gen, ufid_gen > and ueh_i_gen are all 32 bits. va_gen is 32-bits on 32-bit arches but > is 64 bits on 64-bit arches. Various bugs result. The bugs are mostly > features. It is not very useful to re-randomize on reaching the 32-bit > boundary. The bugfeature normally avoids this. If i_gen were not > truncated to 32 bits when the vnode is recycled (or on unmount), and > if it were consistently truncated (that means, truncate it to 32 bits > in va_gen on 64-bit arches), then the bugfeature would work perfectly. > The top 32 bits in i_gen would then be unused except to record history > for a few trillion years for most inodes. > >> DIP_SET(ip, i_gen, ip->i_gen); >> if (fs->fs_magic == FS_UFS2_MAGIC) { >> vfs_timestamp(&ts); >> @@ -2080,7 +2080,8 @@ gotit: >> bzero(ibp->b_data, (int)fs->fs_bsize); >> dp2 = (struct ufs2_dinode *)(ibp->b_data); >> for (i = 0; i < INOPB(fs); i++) { >> - dp2->di_gen = arc4random() / 2 + 1; >> + while (dp2->di_gen == 0) >> + dp2->di_gen = arc4random(); > > This seems to be correct. It is only for the ffs2 case, and di_gen was > initialized to 0 by the bzero(), but the while loop is easier to read > that the more optimal do-while loop that I wrote above. > >> dp2++; >> } >> /* >> >> Modified: head/sys/ufs/ffs/ffs_vfsops.c >> ============================================================================== >> >> --- head/sys/ufs/ffs/ffs_vfsops.c Sun May 22 14:13:20 2016 >> (r300422) >> +++ head/sys/ufs/ffs/ffs_vfsops.c Sun May 22 14:31:20 2016 >> (r300423) >> @@ -1768,7 +1768,8 @@ ffs_vgetf(mp, ino, flags, vpp, ffs_flags >> * already have one. This should only happen on old filesystems. >> */ >> if (ip->i_gen == 0) { >> - ip->i_gen = arc4random() / 2 + 1; >> + while (ip->i_gen == 0) >> + ip->i_gen = arc4random(); > > This also seems to be correct. Now the compiler can easily optimize the > while loop to a do-while loop, since a previous check for i_gen == 0 is > visible. > > "should" in the comment is correct, except this should never happen > now since file systems that were old when it was written now shouldn't > exist. However, this does happen now, mostly for new file systems, > due to a bug in newfs: newfs is missing this while loop, so it sometimes > initializes di_gen to 0. Then we can't/don't tell if di_gen was > initialized to a random value, so we-re-randomize it. > >> if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) { >> ip->i_flag |= IN_MODIFIED; >> DIP_SET(ip, i_gen, ip->i_gen); > > All versions of newfs seem to have had buggy initialization: > - they didn't initialize di_gen (except to 0) until 1997 > - the first 1997 version used /dev/urandom to initialize "long ret;". > Assignment of this to "int32_t di_gen;" overflowed on 64-bit arches > (alpha?). This gave a negative value which gave further overflows > or suprising sign extensions mainly when assigned to "u_long va_gen'"; > most other types were consistently signed. > - the next 1997 version used random(). This fixed the generation of > negative values. 0 was still generated > - the 2003 version used arc4random(). This gave negative values again. > 0 was still generated. This is essentially the current version. The > current version uses a wrapper function like the first 1997 version. > - newfs doesn't seem to have ever had the version that did *random() / 2 > + 1 > like the kernel did. It thus escaped having the overflow/sign extension > bugs that the kernel had. Dividing by 2 was apparently supposed to avoid > these bugs. It worked with random() since random() returns at most > INT32_MAX. But arc4random() returns at mist UINT32_MAX. Dividing this > by 2 gives (u_int)INT32_MAX and adding 1 gives a value that overflows > when assigned to int32_t. This was later fixed by changing lots of > int32_t to uint32_t. > > I'm not sure about the security aspects of randomizing i_gen. The re- > randomization is so accidental and infrequent that security must be > unimportant. But I think a unique generation (over all inodes on all > file systems) would be better than any randomness. FreeBSD-1 was closer > to having that -- i_gen was initialized to the global nextgennumber++ if > it was zero; it was not incremented for inode use. The globabl would > have to be 64 bits now. Ensuring uniqueness is not easy since it means > that you have to check inode numbers on not-very-trusted newly mounted > file systems against all inode numbers already in use. Perhaps it works > to always use a new set for every new mount (ignore the ones on disk). > > Bruce Yes, I have always wondered: who is detecting if we have a repeated generation number, the end user? And then if we get a repeated generation number and try to fix it, how do we know that a new random value we generate to replace the duplicate is unused? I am rather afraid to hear the answers :-/. Pedro.