From owner-freebsd-bugs@FreeBSD.ORG Fri May 8 02:54:02 2009 Return-Path: Delivered-To: freebsd-bugs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D3BE810656A3 for ; Fri, 8 May 2009 02:54:02 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 730E78FC22 for ; Fri, 8 May 2009 02:54:02 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c122-106-172-38.carlnfd1.nsw.optusnet.com.au [122.106.172.38]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n482roaC029671 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 8 May 2009 12:53:57 +1000 Date: Fri, 8 May 2009 12:53:50 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: John Kilburg In-Reply-To: <200905080130.n481U6pm000135@freefall.freebsd.org> Message-ID: <20090508120355.S1497@besplex.bde.org> References: <200905080130.n481U6pm000135@freefall.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-bugs@freebsd.org Subject: Re: kern/133980: panic: ffs_valloc: dup alloc X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 May 2009 02:54:03 -0000 On Fri, 8 May 2009, John Kilburg wrote: > Subject: Re: kern/133980: panic: ffs_valloc: dup alloc > > I am guessing now that this is a problem that I've seen mentioned > in passing as a possibility in a few postings about the maximum > number of inodes for UFS2. > > Using the default settings the filesystem ended up with more than > 2^31 inodes. I decreased the inode density to about half the > default (~1.4M inodes) and now things seem to be working perfectly. > I rsync'd with 3 sessions, did a make buildworld and otherwise > loaded the thing with no problems at all. > > It would be nice if newfs (or something) warned people about the > effective 2^31 inode limit for ufs2 until the problem is spotted. > I guess the real fix is to switch to a filesystem that doesn't > have static inode allocation. :) ino_t is uint32_t, so the limit should be 2^32. This might be easy to fix. I think newfs just allocates space for inode blocks without ever counting inodes (and thus without checking that all the inode blocks are usable). It could easily count. Then ffs should simply refuse to allocate inodes with numbers >= 2^32, but it has the following bad code in ffs_alloc.c: ffs_valloc(): % ino = (ino_t)ffs_hashalloc(pip, cg, ipref, mode, ffs_nodealloccg); This blindly truncates any value >= 2^32 returned by ffs_hashalloc(). (ffs_hashalloc() returns an ffs2 (sic) block number (int64_t) but is abused to to allocate and return an inode number here.) % static ufs2_daddr_t % ffs_nodealloccg(ip, cg, ipref, mode) This used to return ino_t. Then the function itself was invalidly cast to a common type in the above use of it as a parameter to ffs_hashalloccg(), so as to break the warning about this type mismatch. Now a common return type is used instead. This is valid, but we should be careful not to return an out-of-range value. % struct inode *ip; % int cg; % ufs2_daddr_t ipref; % int mode; % { % ... % return (cg * fs->fs_ipg + ipref); This seems to be the main place where overflow may occur. Although the function can now return a 64-bit value exceeding 2^32-1, cg is plain int and fs_ipg is int32_t, so the multiplication will overflow at 2^31. ipref has type ino_t, but it is only a small adjustment within the inode block, so adding it normally won't overflow or get near 2^32. Thus overflow is most likely to occur only in the multiplication here. % /*VARARGS5*/ Historical garbage -- variadic args are neither supported or used here. % static ufs2_daddr_t % ffs_hashalloc(ip, cg, pref, size, allocator) % struct inode *ip; % int cg; % ufs2_daddr_t pref; % int size; /* size for data blocks, mode for inodes */ % allocfcn_t *allocator; % { % struct fs *fs; % ufs2_daddr_t result; % ... % * 1: preferred cylinder group % */ % result = (*allocator)(ip, cg, pref, size); Since (*allocator)() returns the common type ufs2_daddr_t and is always assigned to 'ufs2_daddr_t result' and then returned as a ufs2_daddr_t, no further overflows are possible before the cast in ffs_valloc(). (For uses of ffs_hashalloc() to allocate block numbers for ffs1, overflow would occur on assignment to a ufs1_daddr_t without even a cast if ffs_hashalloc() somehow generated an out-of-range value. ffs1 uses the same block allocator as ffs2 and depends on newfs only generating values that can't cause overflow for this to work. So does ffs2, but overflow for 64-bit block numbers is far off.) Bruce