Date: Sat, 10 Jun 2006 09:10:59 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Robert Watson <rwatson@freebsd.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: heavy NFS writes lead to corrup summary in superblock
Message-ID: <20060610075606.B14403@delplex.bde.org>
In-Reply-To: <20060609172713.A31718@fledge.watson.org>
References: <200606091451.k59EpQnt039643@lurza.secnetix.de> <20060609172713.A31718@fledge.watson.org>

On Fri, 9 Jun 2006, Robert Watson wrote:

> On Fri, 9 Jun 2006, Oliver Fromme wrote:
>> ...
>> On a 300 GB file system using the default newfs parameters, you have about
>> 36 million inodes.  So using UFS1 will save about 4500 MB of space (vs.
>> UFS2).  However, with an inode density of 2^18 there are only 1 million
>> inodes, so UFS1 makes only a difference of 136 MB.
>
> Ah, I took "A few very large files" to mean "A few very large files that are
> probably too large for UFS1 to represent, as very large is getting very large
> lately" :-).  Switching to UFS1 under those circumstances would be
> problematic.

I don't know about UFS1 and UFS2 ;), but with the same block size ffs1
can represent much larger files than ffs2.  This is because the data type
for block numbers in ffs2 is twice as large as in ffs1, so indirect blocks
can store twice as many block numbers in ffs1 as in ffs2.  There are 3
levels of indirect blocks, so this factor of 2 applies 3 times, giving a
ratio of about 8 for the maximum file size in ffs1 vs ffs2.  More precisely:

    nindir = blocksize / sizeof(blocknumber)
    maxfilesize = (nindir^3 + O(nindir^2)) * blocksize

(I'm now confused about block vs fragment addressing.  I think block
numbers are actually frag numbers, so I had fragsize in the rightmost
term in the above, but newfs uses blocksize.  The following numbers may
be off by a factor of blocksize/fragsize from this.)

The default block/frag size of 16KB/2KB thus gives a maxfilesize of about
4K^3 * 16KB = 1024TB in ffs1 but only 2K^3 * 16KB = 128TB in ffs2.
A block/frag size of 64KB/8KB thus gives a maxfilesize of about
16K^3 * 64KB = 262144TB in ffs1 but only 32768TB in ffs2.

The file size limit that ffs2 increases is the maximum size of a non-sparse
file.  This is quite different.  Now the limit is physical addressability of
blocks.
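
Returning briefly to the indirect-block arithmetic above, here is a minimal Python sketch of the leading term of that formula.  It assumes 4-byte block numbers in ffs1 and 8-byte block numbers in ffs2 (inferred from the 4K and 2K entries per 16KB block used in the message), and it ignores the O(nindir^2) terms and the direct blocks.  The function name is hypothetical and is not taken from any ffs source.

```python
# Sketch of the indirect-block limit: maxfilesize ~= nindir^3 * blocksize.
# Assumption: block numbers are 4 bytes in ffs1 and 8 bytes in ffs2.
# The O(nindir^2) terms and the direct blocks are deliberately ignored.

TB = 2**40


def approx_max_file_size(block_size, blkno_size):
    """Return the leading term of the limit, in bytes."""
    nindir = block_size // blkno_size  # block numbers per indirect block
    return nindir**3 * block_size


for block_size in (16 * 1024, 64 * 1024):
    ffs1 = approx_max_file_size(block_size, 4)
    ffs2 = approx_max_file_size(block_size, 8)
    print(f"{block_size // 1024}KB blocks: "
          f"ffs1 ~{ffs1 // TB}TB, ffs2 ~{ffs2 // TB}TB, ratio {ffs1 // ffs2}")
```

With these assumptions the script reproduces the figures quoted above (1024TB vs 128TB for 16KB blocks, 262144TB vs 32768TB for 64KB blocks), and the ratio is 8 in both cases.
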
I think the block numbers really are fragment numbers in this context
(but beware of errors by a factor of blocksize/fragsize in the following),
so the limit is:

    maxphysfilesize = (maxblocknumber + 1) * fragsize - 1

The default block/frag size of 16KB/2KB thus gives a maxphysfilesize of about
4G*4G * 2KB = 32G TB in ffs2 but only 4G * 2KB = 8TB in ffs1.
A block/frag size of 64KB/8KB thus gives a maxphysfilesize of about
4G*4G * 8KB = 128G TB in ffs2 but only 4G * 8KB = 32TB in ffs1.

You can also easily use larger fragments if you want a larger maxphysfilesize
in ffs1.  The limit with 64KB-frags is 128TB.  Larger sizes require
increasing limits in vfs_bio starting with MAXBSIZE.  The latter and even
the former would give ffs file systems that wouldn't work in most
implementations of ffs.

Maximum file sizes (both physical and virtual) are also limited by other
implementation details:

(1) in FreeBSD before FreeBSD-5, vfs_bio and disk drivers can only access
    1TB, so physical file sizes larger than 1TB cannot work since physical
    _filesystem_ sizes larger than 1TB cannot work.
(2) in some versions of FreeBSD-5 (maybe only in pre-release versions), a
    bug in ffs1 causes truncation of disk addresses before they reach
    vfs_bio, so physical _filesystem_ sizes larger than 1TB cannot work
    in ffs1.
(3) ffs1 has a bogus internal limit of

        maxfilesize = (maxblocknumber + 1) / 2 * blocksize - 1

    This confuses maxfilesize with maxphysfilesize and is obviously off by
    a factor of 2 and is probably off by a factor of blocksize/fragsize too.
    For most choices of block/frag sizes (all except 4K/512 IIRC), it limits
    maxfilesize unnecessarily, but the error factors in it result in it only
    limiting maxphysfilesize for non-default choices.

Bruce
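
The maxphysfilesize formula in the message can be checked with a short Python sketch.  It assumes block numbers are fragment numbers and are treated as unsigned 32-bit values in ffs1 and unsigned 64-bit values in ffs2, which is what the "4G" and "4G*4G" factors in the message imply.  The function name is hypothetical.

```python
# Sketch of maxphysfilesize = (maxblocknumber + 1) * fragsize - 1.
# Assumption: block numbers address fragments and are unsigned,
# 32 bits wide in ffs1 and 64 bits wide in ffs2.

TB = 2**40
G = 2**30


def approx_max_phys_file_size(blkno_bits, frag_size):
    """Return the largest physically addressable file size, in bytes."""
    max_block_number = 2**blkno_bits - 1
    return (max_block_number + 1) * frag_size - 1


for frag_size in (2 * 1024, 8 * 1024):
    ffs1 = approx_max_phys_file_size(32, frag_size)
    ffs2 = approx_max_phys_file_size(64, frag_size)
    # Add 1 before dividing so the results come out as round numbers.
    print(f"{frag_size // 1024}KB frags: "
          f"ffs1 ~{(ffs1 + 1) // TB}TB, ffs2 ~{(ffs2 + 1) // TB // G}G TB")
```

Under these assumptions the output matches the 8TB and 32G TB figures for 2KB fragments and the 32TB and 128G TB figures for 8KB fragments.
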