Date: Sat, 26 Mar 2005 16:30:48 -0500
From: David Schultz <das@FreeBSD.ORG>
To: Scott Long <scottl@samsco.org>
Cc: freebsd-fs@FreeBSD.ORG
Subject: Re: UFS Subdirectory limit.
Message-ID: <20050326213048.GA33703@VARK.MIT.EDU>
In-Reply-To: <4244EAFD.1030304@samsco.org>
References: <200503260011.aa53448@salmon.maths.tcd.ie> <20050326031018.GB41481@VARK.MIT.EDU> <4244EAFD.1030304@samsco.org>

On Fri, Mar 25, 2005, Scott Long wrote:
> David Schultz wrote:
> >On Sat, Mar 26, 2005, David Malone wrote:
> >
> >>There was a discussion on comp.unix.bsd.freebsd.misc about two weeks
> >>ago, where someone had an application that used about 150K
> >>subdirectories of a single directory. They wanted to move this
> >>application to FreeBSD, but discovered that UFS is limited to 32K
> >>subdirectories, because UFS's link count field is a signed 16 bit
> >>quantity. Rewriting the application wasn't an option for them.
> >>
> >>I had a look at how hard it would be to fix this. The obvious route
> >>of increasing the size of the link count field is tricky because
> >>it means changing the struct stat, which has a 16 bit link count
> >>field. This would imply ABI breakage, though it might be worth it.
> >
> >
> >Why not just...
> >
> >- make a new st_nlink field that's 32 bits and put it in the spare
> >  32-bit field in struct stat
> >
> >- rename the old st_nlink to st_onlink and leave it at 16 bits
> >
> >- the kernel would fill in st_onlink with min(st_nlink,SHRT_MAX)
>
> I thought that we already discussed this in the past year. There are
> significant compatibility concerns here. What happens if you use an
> old fsck binary on a new filesystem? Since you haven't changed the
> magic, it has no way of knowing that nlink needs to be handled
> differently. It would make it impossible to share a filesystem between
> different versions of FreeBSD, let alone any other BSD.

First of all, I was only talking about how to avoid badly breaking
the stat ABI, not about how to avoid breaking the on-disk FS format.
However, I think a similar trick could be applied to the disk inode.
There are 24 bytes of reserved space in the UFS2 inode that current
versions of fsck ignore, and four of them could be used to store a
larger nlink field.
The old nlink field would still be kept up-to-date by newer kernels,
which would provide backward compatibility for older kernels and
versions of fsck *provided* that no inode has a link count above
32767. Clearly there's a fundamental limitation that older software
won't be able to properly handle large directories, but at least
small directories in the new format would be backwards compatible.

The only other problem that comes to mind is that older versions of
fsck and older kernels could cause the two nlink fields to get out
of sync. However, for directories, new kernels should be able to
figure out the correct nlink value from the directory contents when
this happens, since hard links to directories are not allowed. For
regular files, it should be safe to assume the larger nlink value is
the correct one; this may leak storage, but a new version of fsck
would be able to reclaim it. Furthermore, this benign inconsistency
would only happen in bizarre situations, such as switching from a new
kernel to an old kernel, adding or removing hard links using the
older kernel, and then switching back to the new kernel.