From owner-freebsd-fs@FreeBSD.ORG Tue Mar 22 01:28:11 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 0B92916A4CE
    for ; Tue, 22 Mar 2005 01:28:11 +0000 (GMT)
Received: from mta7.srv.hcvlny.cv.net (mta7.srv.hcvlny.cv.net [167.206.4.202])
    by mx1.FreeBSD.org (Postfix) with ESMTP id BFE2343D31
    for ; Tue, 22 Mar 2005 01:28:10 +0000 (GMT)
    (envelope-from geoffo@comcast.net)
Received: from [192.168.1.100] (ool-43516ad0.dyn.optonline.net [67.81.106.208])
    by mta7.srv.hcvlny.cv.net (iPlanet Messaging Server 5.2 HotFix 1.25 (built Mar 3 2004))
    with ESMTP id <0IDQ0018XC2ZE7@mta7.srv.hcvlny.cv.net>
    for freebsd-fs@freebsd.org; Mon, 21 Mar 2005 20:28:12 -0500 (EST)
Date: Mon, 21 Mar 2005 20:28:09 -0500
From: Geoff
To: freebsd-fs@freebsd.org
Message-id: <1111454889.25432.7.camel@cave.localdomain>
MIME-version: 1.0
X-Mailer: Evolution 2.0.3-1.2.101mdk
Content-type: text/plain
Content-transfer-encoding: 7BIT
Subject: fstat -v failures
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Mar 2005 01:28:11 -0000

Running 'fstat -v' on my "FreeBSD 4.8-RELEASE #1" system returns a long
string of failed attempts to read filedesc of pids as follows:

-bash-2.05b# fstat -v
USER CMD PID FD MOUNT INUM MODE SZ|DV R/W
can't read filedesc at 0xc8ea7a00 for pid 26279
can't read filedesc at 0xcaf4dd00 for pid 97453
[...]
can't read filedesc at 0xc8d03600 for pid 5463
can't read filedesc at 0xc8cf0400 for pid 5439

I'm trying to find out what files are open on my system to combat a
"Too Many Open Files" error. Any help would be appreciated.
- Geoff

From owner-freebsd-fs@FreeBSD.ORG Fri Mar 25 04:23:32 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 20A4516A4CE
    for ; Fri, 25 Mar 2005 04:23:32 +0000 (GMT)
Received: from web20429.mail.yahoo.com (web20429.mail.yahoo.com [66.163.170.252])
    by mx1.FreeBSD.org (Postfix) with SMTP id 872C443D31
    for ; Fri, 25 Mar 2005 04:23:31 +0000 (GMT)
    (envelope-from dhutch9999@yahoo.com)
Received: (qmail 83276 invoked by uid 60001); 25 Mar 2005 04:23:31 -0000
Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
    b=SXZUusmhDPVd4NTFKgBaY604x98WAl1dSQ68uwrhjaGz1McslU4qB+m1Ej5RoBLpVnKoQqy74mzEPwVMEYIjqaY3d50WBxnr78vrHhF//fHDe3rNjmz+222VEQdnCf0l/+SjxDr/QobN6eoAaOncfU3wxx3zhu6xCYOgewSquQw= ;
Message-ID: <20050325042331.83274.qmail@web20429.mail.yahoo.com>
Received: from [65.35.48.3] by web20429.mail.yahoo.com via HTTP;
    Thu, 24 Mar 2005 20:23:31 PST
Date: Thu, 24 Mar 2005 20:23:31 -0800 (PST)
From: DH
To: mohammad babaei
In-Reply-To: 6667
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Content-Filtered-By: Mailman/MimeDel 2.1.1
cc: freebsd-fs@freebsd.org
Subject: Re: fsck output
X-List-Received-Date: Fri, 25 Mar 2005 04:23:32 -0000

I know this may sound silly - have you run fsck on those fs while they
were unmounted - running fsck on a mounted fs can produce "odd"
results......

If not, unmount the fs and then fsck it. If that would cause problems
with running applications reboot into single user mode and run fsck
( that would be the preferred scenario ).
David Hutchens III
Network Technician

Doug White wrote:
On Wed, 9 Mar 2005, mohammad babaei wrote:

> Dear Sirs,
>
> we have a production server running FreeBSD 4.10 RELEASE , when we run
> fsck on the machine it's out put is something like this:
> ========
[...]
> ** /dev/ad0s1e (NO WRITE)
> ** Last Mounted on /tmp
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> UNREF FILE I=102 OWNER=www MODE=100600
> SIZE=0 MTIME=Mar 8 22:46 2005
> RECONNECT? no
>
>
> CLEAR? no
[..]
> ** /dev/ad0s1g (NO WRITE)
> ** Last Mounted on /var
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> UNREF FILE I=668162 OWNER=mysql MODE=100600
> SIZE=0 MTIME=Mar 8 22:41 2005
> CLEAR? no
>
> UNREF FILE I=668163 OWNER=mysql MODE=100600
> SIZE=0 MTIME=Mar 8 22:41 2005
> CLEAR? no
>
> UNREF FILE I=668171 OWNER=mysql MODE=100600
> SIZE=0 MTIME=Mar 8 22:41 2005
> CLEAR? no
>
> ** Phase 5 - Check Cyl groups
> FREE BLK COUNT(S) WRONG IN SUPERBLK
> SALVAGE? no
>
> SUMMARY INFORMATION BAD
> SALVAGE? no
>
> BLK(S) MISSING IN BIT MAPS
> SALVAGE? no
>
> ==========
> [*] Is this a normal output for fsck or not?
> [*] Will it cause hard disk failure in the future ?

What options did you run fsck with? Under normal circumstances after a
crash these types of messages are typical.

--
Doug White | FreeBSD: The Power to Serve
dwhite@gumbysoft.com | www.FreeBSD.org

From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 00:11:18 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 88EBD16A4CE
    for ; Sat, 26 Mar 2005 00:11:18 +0000 (GMT)
Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11])
    by mx1.FreeBSD.org (Postfix) with SMTP id B6DFA43D1D
    for ; Sat, 26 Mar 2005 00:11:17 +0000 (GMT)
    (envelope-from dwmalone@maths.tcd.ie)
Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ;
    26 Mar 2005 00:11:17 +0000 (GMT)
To: freebsd-fs@freebsd.org
Date: Sat, 26 Mar 2005 00:11:16 +0000
From: David Malone
Message-ID: <200503260011.aa53448@salmon.maths.tcd.ie>
Subject: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 00:11:18 -0000

There was a discussion on comp.unix.bsd.freebsd.misc about two weeks
ago, where someone had an application that used about 150K
subdirectories of a single directory. They wanted to move this
application to FreeBSD, but discovered that UFS is limited to 32K
subdirectories, because UFS's link count field is a signed 16 bit
quantity. Rewriting the application wasn't an option for them.

I had a look at how hard it would be to fix this. The obvious route
of increasing the size of the link count field is tricky because
it means changing the struct stat, which has a 16 bit link count
field. This would imply ABI breakage, though it might be worth it.

I had a think about other ways to fix this problem. One way around
this limitation is to change the link count semantics so that ".."
doesn't contribute to a directory's link count (this may seem silly,
but I've included a bit of a rationale below).
I've produced a patch at:

    http://www.maths.tcd.ie/~dwmalone/dircount_hack

which adds options to newfs (-D) and tunefs (-d) that set a new flag
in the filesystem making it use this new link counting scheme. (If you
enable the flag with tunefs you should run fsck to recalculate the link
counts.) The patch also makes filesystems with this flag be of type
"wfs", so that fts knows not to use the link-count-stat shortcut.

I'd appreciate any feedback on this patch. Please don't use it on
important filesystems, as it may chew your files! I've done some
basic testing, including making a directory with 70K subdirectories,
and it seems to work.

David.

---

Originally, I guess the link count was used to decide when you can
free the blocks belonging to a file. As directories are basically
implemented as files, they inherit link counts. However today, the
real test for "can you deallocate a directory" is "is it empty".
Thus the link count for directories isn't authoritative for
deallocation, though it does provide a shortcut (nlink > 2 => not
empty).

The link count also provides some extra consistency that fsck can
check. However, as far as I can tell, the link count doesn't provide
any extra information that can actually be used to fix inconsistencies.

The other place that directory link counts are used is as a shortcut
in userland when estimating the number of subdirectories that a
directory has. This is used in the fts code to avoid calling stat()
on every entry. Since this shortcut only works for ufs-like
filesystems, we already have code for dealing with this.
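
The link-count arithmetic behind this message can be sketched as a small model. This is an illustration only, not code from the dircount_hack patch or from the UFS sources. It assumes that a directory traditionally holds one link for its entry in the parent, one for its own ".", and one for the ".." of each subdirectory, and that the ceiling is the largest signed 16-bit value. All function names are hypothetical.

```python
# Illustrative model only; the names below are hypothetical and nothing
# here is taken from the dircount_hack patch or the UFS sources.
NLINK_MAX = 2**15 - 1  # largest value of a signed 16-bit link count (32767)


def traditional_dir_nlink(subdirs: int) -> int:
    """Link count under the traditional scheme.

    One link for the entry in the parent directory, one for the
    directory's own ".", and one for the ".." inside each subdirectory.
    """
    return 2 + subdirs


def max_subdirs_traditional() -> int:
    """Most subdirectories that fit before the 16-bit count overflows."""
    return NLINK_MAX - 2


def hacked_dir_nlink(subdirs: int) -> int:
    """Link count when ".." entries no longer contribute: always 2."""
    return 2


def looks_non_empty(nlink: int) -> bool:
    """The shortcut described in the message: nlink > 2 => not empty."""
    return nlink > 2


if __name__ == "__main__":
    print(max_subdirs_traditional())
    print(traditional_dir_nlink(150_000) > NLINK_MAX)
    print(looks_non_empty(hacked_dir_nlink(70_000)))
```

Under the modified scheme the shortcut never fires, which is why consumers of the link count such as fts have to be told not to rely on it.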
From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 03:10:21 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id E628216A4CE
    for ; Sat, 26 Mar 2005 03:10:21 +0000 (GMT)
Received: from VARK.MIT.EDU (VARK.MIT.EDU [18.95.3.179])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 8F45C43D49
    for ; Sat, 26 Mar 2005 03:10:21 +0000 (GMT)
    (envelope-from das@FreeBSD.ORG)
Received: from VARK.MIT.EDU (localhost [127.0.0.1])
    by VARK.MIT.EDU (8.13.3/8.13.1) with ESMTP id j2Q3AIVF041632;
    Fri, 25 Mar 2005 22:10:18 -0500 (EST)
    (envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
    by VARK.MIT.EDU (8.13.3/8.13.1/Submit) id j2Q3AIrY041631;
    Fri, 25 Mar 2005 22:10:18 -0500 (EST)
    (envelope-from das@FreeBSD.ORG)
Date: Fri, 25 Mar 2005 22:10:18 -0500
From: David Schultz
To: David Malone
Message-ID: <20050326031018.GB41481@VARK.MIT.EDU>
Mail-Followup-To: David Malone , freebsd-fs@FreeBSD.ORG
References: <200503260011.aa53448@salmon.maths.tcd.ie>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200503260011.aa53448@salmon.maths.tcd.ie>
cc: freebsd-fs@FreeBSD.ORG
Subject: Re: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 03:10:22 -0000

On Sat, Mar 26, 2005, David Malone wrote:
> There was a discussion on comp.unix.bsd.freebsd.misc about two weeks
> ago, where someone had an application that used about 150K
> subdirectories of a single directory. They wanted to move this
> application to FreeBSD, but discovered that UFS is limited to 32K
> subdirectories, because UFS's link count field is a signed 16 bit
> quantity. Rewriting the application wasn't an option for them.
>
> I had a look at how hard it would be to fix this.
> The obvious route
> of increasing the size of the link count field is tricky because
> it means changing the struct stat, which has a 16 bit link count
> field. This would imply ABI breakage, though it might be worth it.

Why not just...

- make a new st_nlink field that's 32 bits and put it in the spare
  32-bit field in struct stat

- rename the old st_nlink to st_onlink and leave it at 16 bits

- the kernel would fill in st_onlink with max(st_nlink,SHORT_MAX)

From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 04:57:21 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id C3A6B16A4CE;
    Sat, 26 Mar 2005 04:57:21 +0000 (GMT)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 3A53543D55;
    Sat, 26 Mar 2005 04:57:21 +0000 (GMT)
    (envelope-from scottl@samsco.org)
Received: from [192.168.254.11] (junior-wifi.samsco.home [192.168.254.11])
    (authenticated bits=0)
    by pooker.samsco.org (8.13.1/8.13.1) with ESMTP id j2Q4tVvL083352;
    Fri, 25 Mar 2005 21:55:31 -0700 (MST)
    (envelope-from scottl@samsco.org)
Message-ID: <4244EAFD.1030304@samsco.org>
Date: Fri, 25 Mar 2005 21:54:21 -0700
From: Scott Long
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.5) Gecko/20050218
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: David Schultz
References: <200503260011.aa53448@salmon.maths.tcd.ie> <20050326031018.GB41481@VARK.MIT.EDU>
In-Reply-To: <20050326031018.GB41481@VARK.MIT.EDU>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.8 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on pooker.samsco.org
cc: David Malone
cc: freebsd-fs@freebsd.org
Subject: Re: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 04:57:21 -0000

David Schultz wrote:
> On Sat, Mar 26, 2005, David Malone wrote:
>
>>There was a discussion on comp.unix.bsd.freebsd.misc about two weeks
>>ago, where someone had an application that used about 150K
>>subdirectories of a single directory. They wanted to move this
>>application to FreeBSD, but discovered that UFS is limited to 32K
>>subdirectories, because UFS's link count field is a signed 16 bit
>>quantity. Rewriting the application wasn't an option for them.
>>
>>I had a look at how hard it would be to fix this. The obvious route
>>of increasing the size of the link count field is tricky because
>>it means changing the struct stat, which has a 16 bit link count
>>field. This would imply ABI breakage, though it might be worth it.
>
>
> Why not just...
>
> - make a new st_nlink field that's 32 bits and put it in the spare
>   32-bit field in struct stat
>
> - rename the old st_nlink to st_onlink and leave it at 16 bits
>
> - the kernel would fill in st_onlink with max(st_nlink,SHORT_MAX)

I thought that we already discussed this in the past year. There are
significant compatibility concerns here. What happens if you use an
old fsck binary on a new filesystem? Since you haven't changed the
magic, it has no way of knowing that nlink needs to be handled
differently. It would make it impossible to share a filesystem between
different versions of FreeBSD, let alone any other BSD. It's hard to
justify bumping the magic for just a change like this, since it would
basically mean that you've created UFS3. Also, the more important
concern is that large directories simply don't scale in UFS. Lookups
are a linear operation, and while DIRHASH helps, it really doesn't scale
well to 150k entries.
I think the reason that there isn't more pressure to
fix the nlink size is because most people realize that it just won't
provide any real benefit. It would be much more worthwhile to introduce
a UFS3 that uses a more efficient directory layout (B-tree?) to provide
real value to increasing the nlink limitation.

Scott

From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 09:02:11 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 6B50216A4CE;
    Sat, 26 Mar 2005 09:02:11 +0000 (GMT)
Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11])
    by mx1.FreeBSD.org (Postfix) with SMTP id 63E2E43D1D;
    Sat, 26 Mar 2005 09:02:10 +0000 (GMT)
    (envelope-from dwmalone@maths.tcd.ie)
Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ;
    26 Mar 2005 09:02:09 +0000 (GMT)
To: David Schultz
In-reply-to: Your message of "Fri, 25 Mar 2005 22:10:18 EST." <20050326031018.GB41481@VARK.MIT.EDU>
Date: Sat, 26 Mar 2005 09:02:09 +0000
From: David Malone
Message-ID: <200503260902.ab77979@salmon.maths.tcd.ie>
cc: freebsd-fs@FreeBSD.ORG
Subject: Re: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 09:02:11 -0000

> - make a new st_nlink field that's 32 bits and put it in the spare
>   32-bit field in struct stat
> - rename the old st_nlink to st_onlink and leave it at 16 bits
> - the kernel would fill in st_onlink with max(st_nlink,SHORT_MAX)

Hmmm - interesting - I hadn't realised there was spare space in
struct stat. I guess we could get away with this and there's space
in both ufs1 and ufs2 inodes. I think we'd need to redefine nlink_t,
which would need an ABI bump.

One problem I can think of might be non-obvious failures of old
programs on directories with lots of subdirectories.
The hacky scheme ends up with a link count of 2 on all directories,
which produces a reasonably obvious failure.

David.

From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 09:35:54 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id ABBCE16A4CE
    for ; Sat, 26 Mar 2005 09:35:54 +0000 (GMT)
Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11])
    by mx1.FreeBSD.org (Postfix) with SMTP id C72D943D54
    for ; Sat, 26 Mar 2005 09:35:53 +0000 (GMT)
    (envelope-from dwmalone@maths.tcd.ie)
Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ;
    26 Mar 2005 09:35:52 +0000 (GMT)
To: Scott Long
In-reply-to: Your message of "Fri, 25 Mar 2005 21:54:21 MST." <4244EAFD.1030304@samsco.org>
X-Request-Do:
Date: Sat, 26 Mar 2005 09:35:52 +0000
From: David Malone
Message-ID: <200503260935.aa92067@salmon.maths.tcd.ie>
cc: freebsd-fs@freebsd.org
Subject: Re: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 09:35:54 -0000

[I'm not sure if Scott's comments were in relation to David Schultz's
suggestion or mine...]

> I thought that we already discussed this in the past year. There are
> significant compatibility concerns here. What happens if you use an
> old fsck binary on a new filesystem? Since you haven't changed the
> magic, it has no way of knowing that nlink needs to be handled
> differently.

In the scheme I suggested, a flag in the superblock is used to
record when the new scheme is in use. I guess das's scheme would
do something similar. In my case, when you run an old fsck, the
link counts are just set to their traditional values by fsck (unless
you have a directory with too many links, in which case fsck will
do whatever it has always done in this situation).
(You also don't need to turn such a feature on by default. Both
schemes can also be read-only compatible with old systems too.)

> Also, the more important
> concern is that large directories simply don't scale in UFS. Lookups
> are a linear operation, and while DIRHASH helps, it really doesn't scale
> well to 150k entries.

It seems to work passably well actually, not that I've benchmarked
it carefully at this size. My junkmail maildir has 164953 entries
at the moment, and is pretty much continuously appended to without
creating any problems for the machine it lives on. Dirhash doesn't
care if the entries are subdirectories or files.

If the directory entries are largely static, the name cache should
do all the work, and it is well capable of dealing with lots of
files.

We should definitely look at what sort of filesystem features we're
likely to need in the future, but I just wanted to see if we can
offer people a solution that doesn't mean waiting for FreeBSD 6 or
7.

(164955 files now ;-)

David.
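
The disagreement about lookup cost can be made concrete with a toy model. It is only a sketch: it does not reproduce how UFS_DIRHASH or the name cache is implemented, and the names `linear_lookup`, `build_index` and `indexed_lookup` are hypothetical. It contrasts a scan over every directory entry with a lookup through an in-memory index built once from those entries.

```python
from typing import Dict, List, Optional, Tuple

# Simplified directory entry: (name, inode number). Hypothetical layout,
# not the on-disk UFS "struct direct".
Entry = Tuple[str, int]


def linear_lookup(entries: List[Entry], name: str) -> Optional[int]:
    """Scan entries in order; cost grows with the size of the directory."""
    for entry_name, inode in entries:
        if entry_name == name:
            return inode
    return None


def build_index(entries: List[Entry]) -> Dict[str, int]:
    """Build an in-memory hash index once, in a single pass."""
    return dict(entries)


def indexed_lookup(index: Dict[str, int], name: str) -> Optional[int]:
    """Look up through the index; cost does not depend on entry order."""
    return index.get(name)


if __name__ == "__main__":
    entries = [(f"dir{i}", i) for i in range(150_000)]
    index = build_index(entries)
    print(linear_lookup(entries, "dir149999"))
    print(indexed_lookup(index, "dir149999"))
```

The index costs memory proportional to the number of entries, which is one reason real behaviour at 150K entries is a matter for measurement rather than this model.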
From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 21:31:22 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id D5E6B16A4CE
    for ; Sat, 26 Mar 2005 21:31:22 +0000 (GMT)
Received: from VARK.MIT.EDU (VARK.MIT.EDU [18.95.3.179])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 4931B43D39
    for ; Sat, 26 Mar 2005 21:31:22 +0000 (GMT)
    (envelope-from das@FreeBSD.ORG)
Received: from VARK.MIT.EDU (localhost [127.0.0.1])
    by VARK.MIT.EDU (8.13.3/8.13.1) with ESMTP id j2QLVAbm033834;
    Sat, 26 Mar 2005 16:31:10 -0500 (EST)
    (envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
    by VARK.MIT.EDU (8.13.3/8.13.1/Submit) id j2QLUmYK033827;
    Sat, 26 Mar 2005 16:30:48 -0500 (EST)
    (envelope-from das@FreeBSD.ORG)
Date: Sat, 26 Mar 2005 16:30:48 -0500
From: David Schultz
To: Scott Long
Message-ID: <20050326213048.GA33703@VARK.MIT.EDU>
Mail-Followup-To: Scott Long , David Malone , freebsd-fs@FreeBSD.ORG
References: <200503260011.aa53448@salmon.maths.tcd.ie> <20050326031018.GB41481@VARK.MIT.EDU> <4244EAFD.1030304@samsco.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4244EAFD.1030304@samsco.org>
cc: David Malone
cc: freebsd-fs@FreeBSD.ORG
Subject: Re: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 21:31:23 -0000

On Fri, Mar 25, 2005, Scott Long wrote:
> David Schultz wrote:
> >On Sat, Mar 26, 2005, David Malone wrote:
> >
> >>There was a discussion on comp.unix.bsd.freebsd.misc about two weeks
> >>ago, where someone had an application that used about 150K
> >>subdirectories of a single directory.
> >>They wanted to move this
> >>application to FreeBSD, but discovered that UFS is limited to 32K
> >>subdirectories, because UFS's link count field is a signed 16 bit
> >>quantity. Rewriting the application wasn't an option for them.
> >>
> >>I had a look at how hard it would be to fix this. The obvious route
> >>of increasing the size of the link count field is tricky because
> >>it means changing the struct stat, which has a 16 bit link count
> >>field. This would imply ABI breakage, though it might be worth it.
> >
> >
> >Why not just...
> >
> >- make a new st_nlink field that's 32 bits and put it in the spare
> >  32-bit field in struct stat
> >
> >- rename the old st_nlink to st_onlink and leave it at 16 bits
> >
> >- the kernel would fill in st_onlink with max(st_nlink,SHORT_MAX)
>
> I thought that we already discussed this in the past year. There are
> significant compatibility concerns here. What happens if you use an
> old fsck binary on a new filesystem? Since you haven't changed the
> magic, it has no way of knowing that nlink needs to be handled
> differently. It would make it impossible to share a filesystem between
> different versions of FreeBSD, let alone any other BSD.

First of all, I was only talking about how to avoid badly breaking
the stat ABI, not about how to avoid breaking the on-disk FS
format. However, I think a similar trick could be applied to the
disk inode. There are 24 bytes of reserved space in the UFS2 inode
that current versions of fsck ignore, and four of them could be
used to store a larger nlink field. The old nlink field would
still be kept up-to-date by newer kernels, which would provide
reverse compatibility for older kernels and versions of fsck
*provided* that no directories have more than 32767 files.
Clearly there's a fundamental limitation that older software won't
be able to properly handle large directories, but at least small
directories in the new format would be backwards compatible.
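
The two-field idea described above can be sketched as follows. This is a model under stated assumptions, not the layout of struct stat or of the UFS2 inode: the function names and the packed byte layout are hypothetical. The thread writes max(st_nlink,SHORT_MAX) for the legacy value, but keeping the old field within 16 bits requires the smaller of the two, so the sketch assumes min() was intended, with 32767 as the signed 16-bit maximum.

```python
import struct
from typing import Tuple

SHRT_MAX = 32767  # maximum of a signed 16-bit field

# Hypothetical packed layout: a signed 16-bit legacy count followed by an
# unsigned 32-bit full count, little-endian, no padding.
_LAYOUT = "<hI"


def legacy_nlink(nlink: int) -> int:
    """Value stored in the old 16-bit field: the true count, clamped."""
    return min(nlink, SHRT_MAX)


def encode_link_counts(nlink: int) -> bytes:
    """Store both the clamped legacy count and the full 32-bit count."""
    if not 0 <= nlink <= 0xFFFFFFFF:
        raise ValueError("link count does not fit in 32 bits")
    return struct.pack(_LAYOUT, legacy_nlink(nlink), nlink)


def decode_link_counts(raw: bytes) -> Tuple[int, int]:
    """Return (legacy count seen by old software, full count)."""
    old, new = struct.unpack(_LAYOUT, raw)
    return old, new


def old_reader_sees_exact(nlink: int) -> bool:
    """Old software reads the correct value only while nlink fits."""
    return nlink <= SHRT_MAX


if __name__ == "__main__":
    print(decode_link_counts(encode_link_counts(5)))
    print(decode_link_counts(encode_link_counts(150_000)))
```

Small counts round-trip identically through both fields, which corresponds to the backwards-compatible case for small directories. Once the count exceeds the clamp, the legacy field can no longer be trusted by old readers.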

The only other problem that comes to mind is that older versions
of fsck and older kernels could cause the two nlink fields to get
out of date. However, for directories, new kernels should be able
to figure out the correct nlink value from the directory contents
when this happens, since hard links to directories are not allowed.
For regular files, it should be safe to assume the larger nlink
value is the correct one; this may leak storage, but a new version
of fsck would be able to reclaim it. Furthermore, this benign
inconsistency would only happen in bizarre situations, such as
switching from a new kernel to an old kernel, adding or removing
hard links using the older kernel, and then switching back to the
new kernel.

From owner-freebsd-fs@FreeBSD.ORG Sat Mar 26 22:08:12 2005
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id F35AC16A4CE;
    Sat, 26 Mar 2005 22:08:11 +0000 (GMT)
Received: from critter.freebsd.dk (f170.freebsd.dk [212.242.86.170])
    by mx1.FreeBSD.org (Postfix) with ESMTP id DC09D43D41;
    Sat, 26 Mar 2005 22:08:10 +0000 (GMT)
    (envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
    by critter.freebsd.dk (8.13.3/8.13.1) with ESMTP id j2QM86k9017694;
    Sat, 26 Mar 2005 23:08:06 +0100 (CET)
    (envelope-from phk@critter.freebsd.dk)
To: David Schultz
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Sat, 26 Mar 2005 16:30:48 EST." <20050326213048.GA33703@VARK.MIT.EDU>
Date: Sat, 26 Mar 2005 23:08:06 +0100
Message-ID: <17693.1111874886@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
cc: David Malone
cc: freebsd-fs@freebsd.org
Subject: Re: UFS Subdirectory limit.
X-List-Received-Date: Sat, 26 Mar 2005 22:08:12 -0000

In message <20050326213048.GA33703@VARK.MIT.EDU>, David Schultz writes:
>On Fri, Mar 25, 2005, Scott Long wrote:
>> David Schultz wrote:
>> >On Sat, Mar 26, 2005, David Malone wrote:
>> >
>> >>There was a discussion on comp.unix.bsd.freebsd.misc about two weeks
>> >>ago, where someone had an application that used about 150K
>> >>subdirectories of a single directory. They wanted to move this
>> >>application to FreeBSD, but discovered that UFS is limited to 32K
>> >>subdirectories, because UFS's link count field is a signed 16 bit
>> >>quantity. Rewriting the application wasn't an option for them.

Has anybody here wondered how much searching a 150K directory would
suck performance wise ?

I realize that with dir-hashing and vfs-cache it is not as bad as it
used to be, but I still think it will be unpleasant performance wise.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.