From owner-freebsd-fs@FreeBSD.ORG Sun Nov 16 15:51:52 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id A1B7A16A4CE
    for ; Sun, 16 Nov 2003 15:51:52 -0800 (PST)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 4DD7543FD7
    for ; Sun, 16 Nov 2003 15:51:51 -0800 (PST)
    (envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
    by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id hAGNpOeF086319;
    Sun, 16 Nov 2003 15:51:31 -0800 (PST)
    (envelope-from truckman@FreeBSD.org)
Message-Id: <200311162351.hAGNpOeF086319@gw.catspoiler.org>
Date: Sun, 16 Nov 2003 15:51:24 -0800 (PST)
From: Don Lewis
To: kmarx@vicor.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: freebsd-fs@FreeBSD.org
cc: mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 Nov 2003 23:51:52 -0000

On 31 Oct, To: kmarx@vicor.com wrote:
> On 31 Oct, Ken Marx wrote:
> You may get better results if you change the minbfree calculation from
> (avgbfree - avgbfree/4) to just (avgbfree).
>
> I'm somewhat tempted to change the calculation to:
>     min(avgbfree, max(1, (avgbfree - avgbfree/4), (dirsize/fs->fs_bsize)))
> where the last term works out to 4500 with your tunefs parameters.

I tried a variation of this on my -CURRENT box and it benchmarked
consistently worse.  I've got a "spare" 10 GB partition which I first
copied my /usr/ports/packages to, and then filled by repeatedly tarring
my /usr/ports tree over to it.  The partition was 100% full, including
the reserve space, after four iterations.

With minbfree set to max((avgbfree - avgbfree/4), 1) here are two
iterations (the fifth line of timing data is for the 'rm -rf' command):

     1310.47 real         5.48 user       141.90 sys
     1336.78 real         5.62 user       152.27 sys
     1368.84 real         6.02 user       151.75 sys
     1359.70 real         5.55 user       154.01 sys
      423.44 real         2.25 user       107.26 sys

     1300.56 real         5.65 user       148.82 sys
     1372.20 real         5.79 user       152.25 sys
     1359.01 real         6.03 user       152.63 sys
     1380.90 real         5.31 user       153.71 sys
      437.22 real         2.20 user       105.61 sys

With minbfree set to
    max(min(max(avgbfree - avgbfree / 4, dirsize / fs->fs_bsize),
        avgbfree), 1)
I get the following:

     1314.61 real         5.66 user       175.43 sys
     1350.40 real         6.12 user       179.15 sys
     1386.86 real         6.32 user       179.12 sys
     1418.60 real         5.74 user       181.64 sys
      508.67 real         2.67 user       119.66 sys

     1361.19 real         5.97 user       176.94 sys
     1327.63 real         5.72 user       179.60 sys
     1376.16 real         6.33 user       179.72 sys
     1356.47 real         6.07 user       180.24 sys
      462.67 real         2.30 user       119.18 sys

I'm using the newfs defaults, but dirsize is recalculated as the
filesystem fills if the appropriate value is larger than what is
calculated from the parameters set by newfs.

I suspect the problem is the large bimodal distribution in file size in
my benchmark, with zillions of little files, but also a number of large
package files and source distfiles.  The large files muck up the dirsize
calculation because they are actually distributed across multiple
cylinder groups and only the first maxbpg blocks are allocated in the
original cylinder group.  This would be easy to account for in the
avgfilesize * avgfpdir formula, but I don't know how to handle this in
the curdirsize formula (other than the degenerate case where most files
are larger than maxbpg).

Since I probably won't have time to get anything different tested before
the -CURRENT code freeze, do you have any objections if I just MFC the
code that I previously committed to -CURRENT?  It certainly seems to
perform better than the original code which is still in 4-STABLE.

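For reference, the two minbfree variants benchmarked in this message can be written out as small C functions. This is only an illustrative sketch: the helper and function names (lmin, lmax, minbfree_simple, minbfree_dirsize) are made up for this example and are not taken from the FreeBSD sources, and the real ffs_dirpref() code may compute its inputs differently.

```c
#include <assert.h>

/* Hypothetical helpers: smaller and larger of two counts. */
static long lmin(long a, long b) { return a < b ? a : b; }
static long lmax(long a, long b) { return a > b ? a : b; }

/*
 * First variant from the message:
 *     max((avgbfree - avgbfree/4), 1)
 * i.e. roughly 75% of the average free blocks per group, never below 1.
 */
long minbfree_simple(long avgbfree)
{
    return lmax(avgbfree - avgbfree / 4, 1);
}

/*
 * Second variant from the message:
 *     max(min(max(avgbfree - avgbfree/4, dirsize/bsize), avgbfree), 1)
 * The expected directory size (in blocks) can raise the threshold,
 * but never above avgbfree and never below 1.
 */
long minbfree_dirsize(long avgbfree, long dirsize, long bsize)
{
    assert(bsize > 0);
    return lmax(lmin(lmax(avgbfree - avgbfree / 4, dirsize / bsize),
                     avgbfree), 1);
}
```

With made-up inputs of avgbfree = 80 and dirsize/bsize = 64, the first function returns 60 and the second returns 64; the two only differ when the directory-size term exceeds three quarters of avgbfree.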
From owner-freebsd-fs@FreeBSD.ORG Sun Nov 16 19:32:04 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 9770116A4CE
    for ; Sun, 16 Nov 2003 19:32:04 -0800 (PST)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 7358143FCB
    for ; Sun, 16 Nov 2003 19:32:03 -0800 (PST)
    (envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
    by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id hAH3VleF086693;
    Sun, 16 Nov 2003 19:31:51 -0800 (PST)
    (envelope-from truckman@FreeBSD.org)
Message-Id: <200311170331.hAH3VleF086693@gw.catspoiler.org>
Date: Sun, 16 Nov 2003 19:31:47 -0800 (PST)
From: Don Lewis
To: kmarx@vicor.com
In-Reply-To: <200311162351.hAGNpOeF086319@gw.catspoiler.org>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: freebsd-fs@FreeBSD.org
cc: mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 03:32:04 -0000

On 16 Nov, Don Lewis wrote:

>> I'm somewhat tempted to change the calculation to:
>>     min(avgbfree, max(1, (avgbfree - avgbfree/4), (dirsize/fs->fs_bsize)))
>> where the last term works out to 4500 with your tunefs parameters.
>
> I tried a variation of this on my -CURRENT box and it benchmarked
> consistently worse.  I've got a "spare" 10 GB partition which I first
> copied my /usr/ports/packages to, and then filled by repeatedly tarring
> my /usr/ports tree over to it.  The partition was 100% full, including
> the reserve space, after four iterations.

I just looked again, and it is more than 100% full, but only slightly
into the reserve space.

> With minbfree set to max((avgbfree - avgbfree/4), 1) here are two
> iterations (the fifth line of timing data is for the 'rm -rf' command):
>
>      1310.47 real         5.48 user       141.90 sys
>      1336.78 real         5.62 user       152.27 sys
>      1368.84 real         6.02 user       151.75 sys
>      1359.70 real         5.55 user       154.01 sys
>       423.44 real         2.25 user       107.26 sys
>
>      1300.56 real         5.65 user       148.82 sys
>      1372.20 real         5.79 user       152.25 sys
>      1359.01 real         6.03 user       152.63 sys
>      1380.90 real         5.31 user       153.71 sys
>       437.22 real         2.20 user       105.61 sys
>
> With minbfree set to
>     max(min(max(avgbfree - avgbfree / 4, dirsize / fs->fs_bsize),
>         avgbfree), 1)
> I get the following:
>
>      1314.61 real         5.66 user       175.43 sys
>      1350.40 real         6.12 user       179.15 sys
>      1386.86 real         6.32 user       179.12 sys
>      1418.60 real         5.74 user       181.64 sys
>       508.67 real         2.67 user       119.66 sys
>
>      1361.19 real         5.97 user       176.94 sys
>      1327.63 real         5.72 user       179.60 sys
>      1376.16 real         6.33 user       179.72 sys
>      1356.47 real         6.07 user       180.24 sys
>       462.67 real         2.30 user       119.18 sys
>
> I'm using the newfs defaults, but dirsize is recalculated as the
> filesystem fills if the appropriate value is larger than what is
> calculated from the parameters set by newfs.

I filled up the file system again with the
minbfree = max((avgbfree - avgbfree/4), 1) version of the code.  Based
on the output of df and dumpfs, I calculate:

    avgfilesize = 18K
    curdirsize  = 83K
    avgbfree    = 864
    avgifree    = 14631

What surprises me is the poor distribution of free space across the
cylinder groups in the file system.  I now suspect the culprit is
minifree.  The current code calculates minifree as 75% of avgifree, or
about 10973.  There are some cylinder groups that are less than half
full (capacity is 11761 blocks/group) in this filesystem, but their free
inode counts are near the 10K minifree limit.  It looks like the free
inode count should be de-emphasized if the filesystem will run out of
blocks before it runs out of inodes, and vice-versa if inodes are likely
to be exhausted first.

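The arithmetic behind these figures can be reproduced with a few lines of C. The sketch below is a simplification for illustration only: the function names are hypothetical, only the two criteria discussed here (free blocks and free inodes) are modeled, and the totals (nbfree 51887, nifree 877882, ncg 60) are the ones reported by dumpfs for this filesystem.

```c
#include <assert.h>

/* Average of a filesystem-wide total over ncg cylinder groups. */
long per_group_avg(long total, long ncg)
{
    assert(ncg > 0);
    return total / ncg;
}

/* 75% of an average, using integer arithmetic. */
long three_quarters(long avg)
{
    return avg * 3 / 4;
}

/*
 * Hypothetical two-criteria test: a cylinder group qualifies only if it
 * has at least minbfree free blocks AND at least minifree free inodes.
 */
int cg_acceptable(long nbfree, long nifree, long minbfree, long minifree)
{
    return nbfree >= minbfree && nifree >= minifree;
}
```

Under these assumptions the averages come out to 864 free blocks and 14631 free inodes per group, and the 75% thresholds to 648 and 10973. A group with 8344 free blocks but only 10972 free inodes (one of the rows in the dumpfs output) would be rejected on the inode criterion alone, which is the imbalance described above.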
I now suspect that the other version of the minbfree code was more
likely to bail out because it could not find any cylinder groups that
met both selection criteria and used the fallback code, which probably
selected the cylinder groups that were already full but had a large
number of free inodes.

Something to ponder ...

#df -k /mnt
Filesystem  1K-blocks     Used  Avail Capacity  Mounted on
/dev/da0s2a  10890186 10057300 -38328   100%    /mnt
#df -i /mnt
Filesystem  1K-blocks     Used  Avail Capacity  iused  ifree %iused  Mounted on
/dev/da0s2a  10890186 10057300 -38328   100%   535236 877882   38%   /mnt
#/usr/obj/usr/src/sbin/dumpfs/dumpfs /dev/da0s2a | head -20
magic   19540119 (UFS2) time    Sun Nov 16 18:34:10 2003
superblock location     65536   id      [ 3ec6b0f1 ec1a8944 ]
ncg     60      size    5622734 blocks  5445093
bsize   16384   shift   14      mask    0xffffc000
fsize   2048    shift   11      mask    0xfffff800
frag    8       shift   3       fsbtodb 2
minfree 8%      optim   time    symlinklen 120
maxbsize 16384  maxbpg  2048    maxcontig 8     contigsumsize 8
nbfree  51887   ndir    120875  nifree  877882  nffree  1347
bpg     11761   fpg     94088   ipg     23552
nindir  2048    inopb   64      maxfilesize     140806241583103
sbsize  2048    cgsize  16384   csaddr  3000    cssize  2048
sblkno  40      cblkno  48      iblkno  56      dblkno  3000
cgrotor 32      fmod    0       ronly   0       clean   0
avgfilesize 16384       avgfpdir 64
flags   soft-updates
fsmnt   /mnt
volname         swuid   0
#/usr/obj/usr/src/sbin/dumpfs/dumpfs /dev/da0s2a | grep bfree
nbfree 51887 ndir 120875 nifree 877882 nffree 1347
cs[].cs_(nbfree,ndir,nifree,nffree):
nbfree 3953 ndir 2859 nifree 10786 nffree 3
nbfree 0 ndir 2114 nifree 14323 nffree 12
nbfree 0 ndir 2833 nifree 10755 nffree 31
nbfree 0 ndir 2866 nifree 10984 nffree 19
nbfree 0 ndir 2810 nifree 10983 nffree 15
nbfree 0 ndir 2122 nifree 14324 nffree 62
nbfree 0 ndir 2853 nifree 10887 nffree 39
nbfree 2 ndir 2872 nifree 10990 nffree 62
nbfree 0 ndir 2851 nifree 10985 nffree 18
nbfree 0 ndir 2787 nifree 11019 nffree 17
nbfree 0 ndir 2898 nifree 10982 nffree 15
nbfree 4 ndir 2769 nifree 10982 nffree 6
nbfree 5 ndir 2889 nifree 10984 nffree 18
nbfree 44 ndir 8 nifree 23544 nffree 21
nbfree 0 ndir 2525 nifree 12424 nffree 22
nbfree 0 ndir 2416 nifree 12654 nffree 5
nbfree 3 ndir 2510 nifree 12652 nffree 16
nbfree 0 ndir 2486 nifree 12655 nffree 56
nbfree 2 ndir 2475 nifree 12654 nffree 1
nbfree 21 ndir 2694 nifree 11847 nffree 44
nbfree 14 ndir 2230 nifree 13649 nffree 83
nbfree 196 ndir 4 nifree 23432 nffree 57
nbfree 16 ndir 587 nifree 20844 nffree 9
nbfree 2 ndir 0 nifree 23552 nffree 3
nbfree 5 ndir 2351 nifree 13182 nffree 15
nbfree 0 ndir 2801 nifree 11026 nffree 8
nbfree 0 ndir 2808 nifree 10985 nffree 12
nbfree 0 ndir 1738 nifree 15765 nffree 37
nbfree 0 ndir 1687 nifree 15991 nffree 24
nbfree 0 ndir 1668 nifree 15973 nffree 17
nbfree 0 ndir 1755 nifree 15992 nffree 47
nbfree 597 ndir 2901 nifree 10985 nffree 7
nbfree 1660 ndir 2945 nifree 10963 nffree 2
nbfree 0 ndir 2798 nifree 10984 nffree 10
nbfree 116 ndir 1736 nifree 15993 nffree 12
nbfree 0 ndir 1755 nifree 15978 nffree 8
nbfree 8344 ndir 2860 nifree 10972 nffree 1
nbfree 780 ndir 2770 nifree 10953 nffree 28
nbfree 0 ndir 2 nifree 23321 nffree 51
nbfree 0 ndir 0 nifree 23552 nffree 18
nbfree 2821 ndir 2791 nifree 10970 nffree 0
nbfree 91 ndir 0 nifree 23552 nffree 46
nbfree 1838 ndir 999 nifree 19117 nffree 2
nbfree 6455 ndir 2807 nifree 11028 nffree 4
nbfree 6232 ndir 2871 nifree 11027 nffree 0
nbfree 5 ndir 0 nifree 23552 nffree 17
nbfree 4 ndir 982 nifree 19170 nffree 58
nbfree 40 ndir 5 nifree 23496 nffree 78
nbfree 3169 ndir 484 nifree 21386 nffree 0
nbfree 36 ndir 3 nifree 23509 nffree 47
nbfree 3136 ndir 2852 nifree 10976 nffree 6
nbfree 4860 ndir 2876 nifree 10981 nffree 2
nbfree 34 ndir 2082 nifree 14219 nffree 57
nbfree 0 ndir 2087 nifree 14362 nffree 14
nbfree 1953 ndir 2867 nifree 10982 nffree 11
nbfree 0 ndir 2115 nifree 14365 nffree 14
nbfree 223 ndir 1997 nifree 14292 nffree 17
nbfree 5226 ndir 2876 nifree 10938 nffree 7
nbfree 0 ndir 2100 nifree 14094 nffree 32
nbfree 0 ndir 2048 nifree 14360 nffree 4

From owner-freebsd-fs@FreeBSD.ORG Sun Nov 16 21:22:58 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id E476716A4CE;
    Sun, 16 Nov 2003 21:22:58 -0800 (PST)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 5C25C43FBF;
    Sun, 16 Nov 2003 21:22:58 -0800 (PST)
    (envelope-from bright@elvis.mu.org)
Received: by elvis.mu.org (Postfix, from userid 1192)
    id 513262ED462; Sun, 16 Nov 2003 21:22:58 -0800 (PST)
Date: Sun, 16 Nov 2003 21:22:58 -0800
From: Alfred Perlstein
To: fs@freebsd.org
Message-ID: <20031117052258.GB35957@elvis.mu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
cc: bp@freebsd.org
cc: marcel@freebsd.org
Subject: open cookies
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 05:22:59 -0000

I'm starting to do the gruntwork of getting us per-open cookies for file
operations.  If someone can explain what needs to be done that would
speed things up. :)

If you're unclear as to what I'm talking about, what I mean is the
"private_data" field in Linux's "struct file".

Please keep me cc'd as this is the only list I'm currently subscribed
to.

My main question is... should the cookies be returned from VOP_CREATE,
VOP_LOOKUP, VOP_MKNOD, etc.. (all the ones that have an OUT/INOUT of
*vpp) or should we only care about VOP_OPEN?

--
- Alfred Perlstein
- Research Engineering Development Inc.
- email: bright@mu.org cell: 408-480-4684

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 08:05:56 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 404AD16A4CE;
    Mon, 17 Nov 2003 08:05:56 -0800 (PST)
Received: from newman.gte.com (newman.gte.com [132.197.8.26])
    by mx1.FreeBSD.org (Postfix) with ESMTP id B28AD43F75;
    Mon, 17 Nov 2003 08:05:54 -0800 (PST)
    (envelope-from ak03@gte.com)
Received: from h132-197-179-27.gte.com (kanpc.gte.com [132.197.179.27])
    by newman.gte.com (8.9.1/8.9.1) with ESMTP id LAA29847;
    Mon, 17 Nov 2003 11:05:53 -0500 (EST)
Received: from kanpc.gte.com (localhost [IPv6:::1])hAHG5pl0017656;
    Mon, 17 Nov 2003 11:05:53 -0500 (EST)
    (envelope-from ak03@gte.com)
Date: Mon, 17 Nov 2003 11:05:50 -0500
From: Alexander Kabaev
To: Alfred Perlstein
Message-Id: <20031117110550.6eb58bf3.ak03@gte.com>
In-Reply-To: <20031117052258.GB35957@elvis.mu.org>
References: <20031117052258.GB35957@elvis.mu.org>
Organization: Verizon Data Services
X-Mailer: Sylpheed version 0.9.6claws71 (GTK+ 1.2.10; i386-portbld-freebsd5.1)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
cc: bp@freebsd.org
cc: marcel@freebsd.org
cc: fs@freebsd.org
Subject: Re: open cookies
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 16:05:56 -0000

On Sun, 16 Nov 2003 21:22:58 -0800
Alfred Perlstein wrote:

>
> If you're unclear as to what I'm talking about, what I mean is the
> "private_data" field in Linux's "struct file".

How do you plan to deal with stacked FSes?

--
Alexander Kabaev

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 09:29:24 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id E962516A4CF;
    Mon, 17 Nov 2003 09:29:24 -0800 (PST)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 56ECD43FB1;
    Mon, 17 Nov 2003 09:29:24 -0800 (PST)
    (envelope-from bright@elvis.mu.org)
Received: by elvis.mu.org (Postfix, from userid 1192)
    id 492122ED46C; Mon, 17 Nov 2003 09:29:24 -0800 (PST)
Date: Mon, 17 Nov 2003 09:29:24 -0800
From: Alfred Perlstein
To: Alexander Kabaev
Message-ID: <20031117172924.GE35957@elvis.mu.org>
References: <20031117052258.GB35957@elvis.mu.org>
    <20031117110550.6eb58bf3.ak03@gte.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20031117110550.6eb58bf3.ak03@gte.com>
User-Agent: Mutt/1.4.1i
cc: bp@freebsd.org
cc: marcel@freebsd.org
cc: fs@freebsd.org
Subject: Re: open cookies
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 17:29:25 -0000

* Alexander Kabaev [031117 08:06] wrote:
> On Sun, 16 Nov 2003 21:22:58 -0800
> Alfred Perlstein wrote:
>
> >
> > If you're unclear as to what I'm talking about, what I mean is the
> > "private_data" field in Linux's "struct file".
>
> How do you plan to deal with stacked FSes?

Wouldn't the stacking layer be responsible for taking care of the
lower layer's cookie?

struct nullfscookie {
    void *lowercookie;
};

--
- Alfred Perlstein
- Research Engineering Development Inc.
- email: bright@mu.org cell: 408-480-4684

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 10:13:06 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id A3AA416A4CE;
    Mon, 17 Nov 2003 10:13:06 -0800 (PST)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 59B2D43FDD;
    Mon, 17 Nov 2003 10:13:05 -0800 (PST)
    (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
    by fledge.watson.org (8.12.9p2/8.12.9) with ESMTP id hAHIB1Mg067880;
    Mon, 17 Nov 2003 13:11:01 -0500 (EST)
    (envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)hAHIB1xK067877;
    Mon, 17 Nov 2003 13:11:01 -0500 (EST)
    (envelope-from robert@fledge.watson.org)
Date: Mon, 17 Nov 2003 13:11:01 -0500 (EST)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: Alfred Perlstein
In-Reply-To: <20031117052258.GB35957@elvis.mu.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: bp@freebsd.org
cc: marcel@freebsd.org
cc: fs@freebsd.org
Subject: Re: open cookies
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 18:13:06 -0000

On Sun, 16 Nov 2003, Alfred Perlstein wrote:

> I'm starting to do the gruntwork of getting us per-open cookies for file
> operations.  If someone can explain what needs to be done that would
> speed things up. :)
>
> If you're unclear as to what I'm talking about, what I mean is the
> "private_data" field in Linux's "struct file".
>
> Please keep me cc'd as this is the only list I'm currently subscribed
> to.
>
> My main question is... should the cookies be returned from VOP_CREATE,
> VOP_LOOKUP, VOP_MKNOD, etc.. (all the ones that have an OUT/INOUT of
> *vpp) or should we only care about VOP_OPEN?

I implemented about 90% of this previously and did not commit it.  In
general, the notion of "session" corresponds well to the notion of "file
descriptor"; I found that this meant only VOPs that could be performed on
a vnode pulled out of a file descriptor were relevant.  When a VOP is
dual-purpose: i.e., can be called using both "by name" and "with a
session", or even just "without a session", I used NULL for the cookie
argument to the VOP.  Since we nominally support file system stacking, I
found that, much as you concluded, we needed a cookie rather than passing
struct file into each VOP, which works with the top layer but not for
lower layers.  As we stuff a lower layer vnode reference into the
per-vnode state, we now have to stuff per-open state material into each
layer's per-open state.

My general conclusion was that this over-complicated our VFS
substantially, and that the struct file state in Linux was generally used
only for multiply instantiated devices.  With devfs cloning, all the cases
I was interested in (things like /dev vmware nodes) are addressed.  Since
none of our non-specfs nodes required any notion of state, I found I was
touching a lot of code to minimal benefit.  What's your motivation for
adding this support, and can it be added in a way that doesn't introduce
new arguments to most VOPs, and introduce a host of potential bugs?  I
don't doubt it can be done right, but it's a fairly complex solution that
has to be motivated by complex requirements...

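To make the layering concrete, here is a minimal user-space sketch of a stacking layer that wraps the lower layer's per-open cookie and passes NULL through when there is no session. It only models the idea discussed in this thread: struct nullfscookie is the shape suggested earlier, while struct lowercookie, lower_op() and null_op() are hypothetical names invented for the example and do not correspond to any real VOP or kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-open state kept by a bottom-layer file system. */
struct lowercookie {
    int ops;    /* illustrative payload: operations seen on this open */
};

/* Per-open state of a stacking layer: it only remembers the lower cookie. */
struct nullfscookie {
    void *lowercookie;
};

/* Unwrap the cookie to pass down; a NULL cookie ("no session") stays NULL. */
void *null_lower_cookie(const struct nullfscookie *nc)
{
    return nc != NULL ? nc->lowercookie : NULL;
}

/* Hypothetical lower-layer operation that tolerates a missing session. */
int lower_op(void *cookie)
{
    struct lowercookie *lc = cookie;

    assert(lc == NULL || lc->ops >= 0);
    if (lc == NULL)
        return 0;           /* called "by name", no per-open state */
    return ++lc->ops;       /* called with a session */
}

/* Hypothetical stacking-layer operation: unwrap, then call the layer below. */
int null_op(struct nullfscookie *nc)
{
    return lower_op(null_lower_cookie(nc));
}
```

Even in this toy form, every operation in the stacking layer has to unwrap and forward the cookie, which hints at why threading per-open state through most VOPs touches so much code.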
Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 10:28:12 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 7CA0B16A4CE;
    Mon, 17 Nov 2003 10:28:12 -0800 (PST)
Received: from sploot.vicor-nb.com (sploot.vicor-nb.com [208.206.78.81])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 5BEAA43F75;
    Mon, 17 Nov 2003 10:28:11 -0800 (PST)
    (envelope-from kmarx@vicor.com)
Received: from vicor.com (localhost [127.0.0.1])
    by sploot.vicor-nb.com (8.12.8/8.12.8) with ESMTP id hAHIMG5i097350;
    Mon, 17 Nov 2003 10:22:16 -0800 (PST)
    (envelope-from kmarx@vicor.com)
Message-ID: <3FB911D8.5080300@vicor.com>
Date: Mon, 17 Nov 2003 10:22:16 -0800
From: Ken Marx
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6a) Gecko/20031105
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Don Lewis
References: <200311162351.hAGNpOeF086319@gw.catspoiler.org>
In-Reply-To: <200311162351.hAGNpOeF086319@gw.catspoiler.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
cc: freebsd-fs@FreeBSD.org
cc: mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 18:28:12 -0000

Don Lewis wrote:
> On 31 Oct, To: kmarx@vicor.com wrote:
>
>>On 31 Oct, Ken Marx wrote:
>
>
>>You may get better results if you change the minbfree calculation from
>>(avgbfree - avgbfree/4) to just (avgbfree).
>>
>>I'm somewhat tempted to change the calculation to:
>>    min(avgbfree, max(1, (avgbfree - avgbfree/4), (dirsize/fs->fs_bsize)))
>>where the last term works out to 4500 with your tunefs parameters.
>
>
> I tried a variation of this on my -CURRENT box and it benchmarked
> consistently worse.  I've got a "spare" 10 GB partition which I first
> copied my /usr/ports/packages to, and then filled by repeatedly tarring
> my /usr/ports tree over to it.  The partition was 100% full, including
> the reserve space, after four iterations.
>
> With minbfree set to max((avgbfree - avgbfree/4), 1) here are two
> iterations (the fifth line of timing data is for the 'rm -rf' command):
>
>      1310.47 real         5.48 user       141.90 sys
>      1336.78 real         5.62 user       152.27 sys
>      1368.84 real         6.02 user       151.75 sys
>      1359.70 real         5.55 user       154.01 sys
>       423.44 real         2.25 user       107.26 sys
>
>      1300.56 real         5.65 user       148.82 sys
>      1372.20 real         5.79 user       152.25 sys
>      1359.01 real         6.03 user       152.63 sys
>      1380.90 real         5.31 user       153.71 sys
>       437.22 real         2.20 user       105.61 sys
>
> With minbfree set to
>     max(min(max(avgbfree - avgbfree / 4, dirsize / fs->fs_bsize),
>         avgbfree), 1)
> I get the following:
>
>      1314.61 real         5.66 user       175.43 sys
>      1350.40 real         6.12 user       179.15 sys
>      1386.86 real         6.32 user       179.12 sys
>      1418.60 real         5.74 user       181.64 sys
>       508.67 real         2.67 user       119.66 sys
>
>      1361.19 real         5.97 user       176.94 sys
>      1327.63 real         5.72 user       179.60 sys
>      1376.16 real         6.33 user       179.72 sys
>      1356.47 real         6.07 user       180.24 sys
>       462.67 real         2.30 user       119.18 sys
>
> I'm using the newfs defaults, but dirsize is recalculated as the
> filesystem fills if the appropriate value is larger than what is
> calculated from the parameters set by newfs.
>
> I suspect the problem is the large bimodal distribution in file size in
> my benchmark, with zillions of little files, but also a number of large
> package files and source distfiles.  The large files muck up the dirsize
> calculation because they are actually distributed across multiple
> cylinder groups and only the first maxbpg blocks are allocated in the
> original cylinder group.  This would be easy to account for in the
> avgfilesize * avgfpdir formula, but I don't know how to handle this in
> the curdirsize formula (other than the degenerate case where most files
> are larger than maxbpg).
>
> Since I probably won't have time to get anything different tested before
> the -CURRENT code freeze, do you have any objections if I just MFC the
> code that I previously committed to -CURRENT?  It certainly seems to
> perform better than the original code which is still in 4-STABLE.
>
>

Don, any fine points you put on our coarse level of testing here are
fine with us.  I believe any of the versions you suggest will keep us
well out of the crippling behavior that originally brought this up.

I was able to run a couple more tests here, and *believe* that the fix
to the hash table in vfs_bio.c will provide some relief for cg block
searches when things do fall into the linear search case.

k
--
Ken Marx, kmarx@vicor-nb.com
The entire team is behind the concept that we establish strategic
alliances and identify trends in the context of the team building.
- http://www.bigshed.com/cgi-bin/speak.cgi

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 11:31:05 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 712D416A4CE;
    Mon, 17 Nov 2003 11:31:05 -0800 (PST)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 9021F43FBD;
    Mon, 17 Nov 2003 11:31:04 -0800 (PST)
    (envelope-from bright@elvis.mu.org)
Received: by elvis.mu.org (Postfix, from userid 1192)
    id 8771B2ED46D; Mon, 17 Nov 2003 11:31:04 -0800 (PST)
Date: Mon, 17 Nov 2003 11:31:04 -0800
From: Alfred Perlstein
To: Robert Watson
Message-ID: <20031117193104.GH35957@elvis.mu.org>
References: <20031117052258.GB35957@elvis.mu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.1i
cc: bp@freebsd.org
cc: marcel@freebsd.org
cc: fs@freebsd.org
Subject: Re: open cookies
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 19:31:05 -0000

* Robert Watson [031117 10:13] wrote:
>
> On Sun, 16 Nov 2003, Alfred Perlstein wrote:
>
> > I'm starting to do the gruntwork of getting us per-open cookies for file
> > operations.  If someone can explain what needs to be done that would
> > speed things up. :)
>
> I implemented about 90% of this previously and did not commit it.  In
> general, the notion of "session" corresponds well to the notion of "file
> descriptor"; I found that this meant only VOPs that could be performed on
> a vnode pulled out of a file descriptor were relevant.  When a VOP is
> dual-purpose: i.e., can be called using both "by name" and "with a
> session", or even just "without a session", I used NULL for the cookie
> argument to the VOP.  Since we nominally support file system stacking, I
> found that, much as you concluded, we needed a cookie rather than passing
> struct file into each VOP, which works with the top layer but not for
> lower layers.  As we stuff a lower layer vnode reference into the
> per-vnode state, we now have to stuff per-open state material into each
> layer's per-open state.
>
> My general conclusion was that this over-complicated our VFS
> substantially, and that the struct file state in Linux was generally used
> only for multiply instantiated devices.  With devfs cloning, all the cases
> I was interested in (things like /dev vmware nodes) are addressed.  Since
> none of our non-specfs nodes required any notion of state, I found I was
> touching a lot of code to minimal benefit.  What's your motivation for
> adding this support, and can it be added in a way that doesn't introduce
> new arguments to most VOPs, and introduce a host of potential bugs?  I
> don't doubt it can be done right, but it's a fairly complex solution that
> has to be motivated by complex requirements...

I just wanted to support the way that Linux does stuff.  Are you saying
that it's taken care of?

--
- Alfred Perlstein
- Research Engineering Development Inc.
- email: bright@mu.org cell: 408-480-4684

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 11:40:55 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 36D8A16A4CE;
    Mon, 17 Nov 2003 11:40:55 -0800 (PST)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 1F65843FAF;
    Mon, 17 Nov 2003 11:40:54 -0800 (PST)
    (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
    by fledge.watson.org (8.12.9p2/8.12.9) with ESMTP id hAHJcoMg070940;
    Mon, 17 Nov 2003 14:38:50 -0500 (EST)
    (envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)hAHJcoIf070937;
    Mon, 17 Nov 2003 14:38:50 -0500 (EST)
    (envelope-from robert@fledge.watson.org)
Date: Mon, 17 Nov 2003 14:38:50 -0500 (EST)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: Alfred Perlstein
In-Reply-To: <20031117193104.GH35957@elvis.mu.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: bp@freebsd.org
cc: marcel@freebsd.org
cc: fs@freebsd.org
Subject: Re: open cookies
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 17 Nov 2003 19:40:55 -0000

On Mon, 17 Nov 2003, Alfred Perlstein wrote:

> * Robert Watson [031117 10:13] wrote:
> > My general conclusion was that this over-complicated our VFS
> > substantially, and that the struct file state in Linux was generally used
> > only for multiply instantiated devices.  With devfs cloning, all the cases
> > I was interested in (things like /dev vmware nodes) are addressed.  Since
> > none of our non-specfs nodes required any notion of state, I found I was
> > touching a lot of code to minimal benefit.  What's your motivation for
> > adding this support, and can it be added in a way that doesn't introduce
> > new arguments to most VOPs, and introduce a host of potential bugs?  I
> > don't doubt it can be done right, but it's a fairly complex solution that
> > has to be motivated by complex requirements...
>
> I just wanted to support the way that Linux does stuff.  Are you saying
> that it's taken care of?

I'm saying we can support most of the interesting things I know of that
need this state already using devfs clone support.  I'm wondering if you
have in mind anything further that can't be accomplished with clone
support.  I.e., something that requires session state for something other
than /dev entries?

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 11:42:00 2003
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 0D7D316A4CE;
    Mon, 17 Nov 2003 11:42:00 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 5DA8643FBF;
    Mon, 17 Nov 2003 11:41:58 -0800 (PST)
    (envelope-from phk@phk.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
    by critter.freebsd.dk (8.12.10/8.12.10) with ESMTP id hAHJfqfO049749;
    Mon, 17 Nov 2003 20:41:56 +0100 (CET)
    (envelope-from phk@phk.freebsd.dk)
To: Alfred Perlstein
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Mon, 17 Nov 2003 11:31:04 PST."
<20031117193104.GH35957@elvis.mu.org> Date: Mon, 17 Nov 2003 20:41:52 +0100 Message-ID: <49748.1069098112@critter.freebsd.dk> cc: bp@freebsd.org cc: marcel@freebsd.org cc: Robert Watson cc: fs@freebsd.org Subject: Re: open cookies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2003 19:42:00 -0000

In message <20031117193104.GH35957@elvis.mu.org>, Alfred Perlstein writes:

>I just wanted to support the way that Linux does stuff.  Are you saying
>that it's taken care of?

All the cases I've heard of have been related to devices, and the
"clone" stuff in DEVFS seems to handle that for people.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 11:47:40 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5EC5E16A4CF; Mon, 17 Nov 2003 11:47:40 -0800 (PST) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id E78AF43FDD; Mon, 17 Nov 2003 11:47:38 -0800 (PST) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id DAE672ED475; Mon, 17 Nov 2003 11:47:38 -0800 (PST) Date: Mon, 17 Nov 2003 11:47:38 -0800 From: Alfred Perlstein To: Robert Watson Message-ID: <20031117194738.GK35957@elvis.mu.org> References: <20031117193104.GH35957@elvis.mu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i cc: bp@freebsd.org cc: marcel@freebsd.org cc: fs@freebsd.org Subject: Re: open cookies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list
List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2003 19:47:40 -0000 * Robert Watson [031117 11:40] wrote: > > I'm saying we can support most of the interesting things I know of that > need this state already using devfs clone support. I'm wondering if you > have in mind anything further that can't be accomplished with clone > support. I.e., something that requires session state for something other > than /dev entries? Not really. I didn't realize that the problem had been taken care of. -- - Alfred Perlstein - Research Engineering Development Inc. - email: bright@mu.org cell: 408-480-4684 From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 12:56:13 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3290B16A4CE for ; Mon, 17 Nov 2003 12:56:13 -0800 (PST) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id D956A43FBD for ; Mon, 17 Nov 2003 12:56:11 -0800 (PST) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id hAHKtreF088763; Mon, 17 Nov 2003 12:55:57 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <200311172055.hAHKtreF088763@gw.catspoiler.org> Date: Mon, 17 Nov 2003 12:55:53 -0800 (PST) From: Don Lewis To: kmarx@vicor.com In-Reply-To: <3FB911D8.5080300@vicor.com> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: freebsd-fs@FreeBSD.org cc: mckusick@beastie.mckusick.com Subject: Re: 4.8 ffs_dirpref problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2003 20:56:13 -0000 On 17 Nov, Ken Marx wrote: > > Don Lewis wrote: >> Since I 
probably won't have time to get anything different tested before
>> the -CURRENT code freeze, do you have any objections if I just MFC the
>> code that I previously committed to -CURRENT?  It certainly seems to
>> perform better than the original code which is still in 4-STABLE.

> Don, any fine points you put on our coarse level of testing here
> are fine with us. I believe any of the versions you suggest will
> keep us well out of the crippling behavior that originally brought
> this up.

Ok, I'll do the commit as soon as I can do some testing on my -STABLE
box.

> I was able to run a couple more tests here, and *believe* that the
> fix to the hash table in vfs_bio.c will provide some relief
> for cg block searches when things do fall into the linear search case.

I'll see about cranking out a patch to use a Fibonacci hash.  It'll
probably be a little while before I can find sufficient time, though.

From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 13:03:14 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 82F8116A4D2; Mon, 17 Nov 2003 13:03:14 -0800 (PST) Received: from sploot.vicor-nb.com (sploot.vicor-nb.com [208.206.78.81]) by mx1.FreeBSD.org (Postfix) with ESMTP id E423D43F75; Mon, 17 Nov 2003 13:03:09 -0800 (PST) (envelope-from kmarx@vicor.com) Received: from vicor.com (localhost [127.0.0.1]) by sploot.vicor-nb.com (8.12.8/8.12.8) with ESMTP id hAHKvF5i099584; Mon, 17 Nov 2003 12:57:16 -0800 (PST) (envelope-from kmarx@vicor.com) Message-ID: <3FB9362B.4030601@vicor.com> Date: Mon, 17 Nov 2003 12:57:15 -0800 From: Ken Marx User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6a) Gecko/20031105 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Don Lewis References: <200311172055.hAHKtreF088763@gw.catspoiler.org> In-Reply-To: <200311172055.hAHKtreF088763@gw.catspoiler.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding:
7bit cc: freebsd-fs@FreeBSD.org cc: mckusick@beastie.mckusick.com Subject: Re: 4.8 ffs_dirpref problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2003 21:03:14 -0000

Don Lewis wrote:
> On 17 Nov, Ken Marx wrote:
>
>>Don Lewis wrote:
>
>
>>>Since I probably won't have time to get anything different tested before
>>>the -CURRENT code freeze, do you have any objections if I just MFC the
>>>code that I previously committed to -CURRENT?  It certainly seems to
>>>perform better than the original code which is still in 4-STABLE.
>
>
>>Don, any fine points you put on our coarse level of testing here
>>are fine with us. I believe any of the versions you suggest will
>>keep us well out of the crippling behavior that originally brought
>>this up.
>
>
> Ok, I'll do the commit as soon as I can do some testing on my -STABLE
> box.
>

Great. Please let us know when this happens. In fact,
I kind of got lost which you were planning to commit.
Can you point me to it, and I'll do one last overnight run.

Again many thanks for this.

>
>>I was able to run a couple more tests here, and *believe* that the
>>fix to the hash table in vfs_bio.c will provide some relief
>>for cg block searches when things do fall into the linear search case.
>
>
> I'll see about cranking out a patch to use a Fibonacci hash.  It'll
> probably be a little while before I can find sufficient time, though.
>

Ditto the above: thanks/keep us posted. Our clients are
anxious to have a 'final' kernel to run with. I think we'll
just give them what you commit, and sneak the hash fix in with
the security patch or some such. So, no rush, but do let me
know if you think it might happen sooner than, say, 2 weeks
so I can try and get it all in one release to them.

regards,
k.
-- Ken Marx, kmarx@vicor-nb.com It is self-evident that we must sharpen our pencils and revise the expectations surrounding the requirements. - http://www.bigshed.com/cgi-bin/speak.cgi From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 13:27:47 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E1B4E16A4F0 for ; Mon, 17 Nov 2003 13:27:46 -0800 (PST) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7692443F3F for ; Mon, 17 Nov 2003 13:27:45 -0800 (PST) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id hAHLRTeF088888; Mon, 17 Nov 2003 13:27:33 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <200311172127.hAHLRTeF088888@gw.catspoiler.org> Date: Mon, 17 Nov 2003 13:27:29 -0800 (PST) From: Don Lewis To: kmarx@vicor.com In-Reply-To: <200311170331.hAH3VleF086693@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: freebsd-fs@FreeBSD.org cc: mckusick@beastie.mckusick.com Subject: Re: 4.8 ffs_dirpref problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2003 21:27:47 -0000 On 16 Nov, Don Lewis wrote: > On 16 Nov, Don Lewis wrote: > >>> I'm somewhat tempted to change the calculation to: >>> min(avgbfree, max(1, (avgbfree - avgbfree/4), (dirsize/fs->fs_bsize))) >>> where the last term works out to 4500 with your tunefs parameters. >> >> I tried a variation of this on my -CURRENT box and it benchmarked >> consistently worse. I've got a "spare' 10 GB partition which first >> copied my /usr/ports/packages to, and then filled by repeatedly tarring >> my /usr/ports tree over to it. 
The partition was 100% full, including
>> the reserve space, after four iterations.
>
> I just looked again, and it is more than 100% full, but only slightly
> into the reserve space.
>
>> With minbfree set to max((avgbfree - avgbfree/4), 1) here are two
>> iterations (the fifth line of timing data is for the 'rm -rf' command):
>>
>>      1310.47 real         5.48 user       141.90 sys
>>      1336.78 real         5.62 user       152.27 sys
>>      1368.84 real         6.02 user       151.75 sys
>>      1359.70 real         5.55 user       154.01 sys
>>       423.44 real         2.25 user       107.26 sys
>>
>>      1300.56 real         5.65 user       148.82 sys
>>      1372.20 real         5.79 user       152.25 sys
>>      1359.01 real         6.03 user       152.63 sys
>>      1380.90 real         5.31 user       153.71 sys
>>       437.22 real         2.20 user       105.61 sys
>>
>> With minbfree set to
>>   max(min(max(avgbfree - avgbfree / 4, dirsize / fs->fs_bsize),
>>       avgbfree), 1)
>> I get the following:
>>
>>      1314.61 real         5.66 user       175.43 sys
>>      1350.40 real         6.12 user       179.15 sys
>>      1386.86 real         6.32 user       179.12 sys
>>      1418.60 real         5.74 user       181.64 sys
>>       508.67 real         2.67 user       119.66 sys
>>
>>      1361.19 real         5.97 user       176.94 sys
>>      1327.63 real         5.72 user       179.60 sys
>>      1376.16 real         6.33 user       179.72 sys
>>      1356.47 real         6.07 user       180.24 sys
>>       462.67 real         2.30 user       119.18 sys
>>
>> I'm using the newfs defaults, but dirsize is recalculated as the
>> filesystem fills if the appropriate value is larger than what is
>> calculated from the parameters set by newfs.
>
> I filled up the file system again with the
>     minbfree = max((avgbfree - avgbfree/4), 1)
> version of the code.
>
> Based on the output of df and dumpfs, I calculate:
>     avgfilesize = 18K
>     curdirsize = 83K
>     avgbfree = 864
>     avgifree = 14631
>
> What surprises me is the poor distribution of free space across the
> cylinder groups in the file system.  I now suspect the culprit is
> minifree.  The current code calculates minifree as 75% of avgifree, or
> about 10973.
There are some cylinder groups that are less than half
> full (capacity is 11761 blocks/group) in this filesystem, but their free
> inode counts are near the 10K minifree limit.  It looks like the free
> inode count should be de-emphasized if the filesystem will run out of
> blocks before it runs out of inodes, and vice-versa if inodes are likely
> to be exhausted first.  I now suspect that the other version of the
> minbfree code was more likely to bail out because it could not find any
> cylinder groups that met both selection criteria and used the fallback
> code, which probably selected the cylinder groups that were already full
> but had a large number of free inodes.  Something to ponder ...

I ran another test with minifree set to a small value, which effectively
removed it from the cylinder group selection criteria.  I used
    max(min(max(avgbfree - avgbfree / 4, dirsize / fs->fs_bsize), avgbfree), 1)
for minbfree.  The results were similar to the previous
max((avgbfree - avgbfree/4), 1) tests.

     1337.34 real         5.69 user       150.63 sys
     1323.58 real         5.87 user       157.96 sys
     1347.14 real         5.52 user       159.77 sys
     1361.57 real         5.37 user       160.50 sys
      419.49 real         2.52 user       114.75 sys

     1344.53 real         5.47 user       157.03 sys
     1326.97 real         4.77 user       151.57 sys
     1322.67 real         4.69 user       153.00 sys
     1367.49 real         5.91 user       160.45 sys
      409.95 real         2.59 user       114.20 sys

     1330.93 real         5.37 user       156.93 sys
     1374.03 real         5.59 user       159.14 sys
     1367.17 real         5.41 user       160.84 sys
     1318.14 real         5.50 user       159.75 sys
      411.94 real         2.22 user       114.86 sys

I took a snapshot of the cylinder group state at about 75% full as well
as at 100%.  Even at 75%, there are a number of cylinder groups that are
totally full.  I think that one of the problems is that the dirpref
allocator lingers too long on a given cylinder group.  It should probably
move to a new cylinder group before the old one is totally full,
somewhere around the minfree reserve level.
Also, as the file system fills and a large number of the cylinder groups
are totally filled, the average free space per cylinder group will be
quite small, so the dirpref code will consider cylinder groups with only
a small amount of free space as candidates even though there may be other
cylinder groups that are nearly empty that would be better choices.

75%
dumpfs /dev/da0s2a | grep nbfree
nbfree 191340 ndir 94629 nifree 994237 nffree 1232
cs[].cs_(nbfree,ndir,nifree,nffree):
nbfree 7256 ndir 1976 nifree 14679 nffree 5
nbfree 7592 ndir 1976 nifree 14853 nffree 7
nbfree 35 ndir 663 nifree 20677 nffree 32
nbfree 5992 ndir 35 nifree 23096 nffree 3
nbfree 0 ndir 2965 nifree 10371 nffree 29
nbfree 0 ndir 2465 nifree 12592 nffree 83
nbfree 38 ndir 2463 nifree 12630 nffree 39
nbfree 115 ndir 2461 nifree 12736 nffree 44
nbfree 45 ndir 2462 nifree 12440 nffree 31
nbfree 16 ndir 2461 nifree 12778 nffree 36
nbfree 644 ndir 408 nifree 21729 nffree 56
nbfree 65 ndir 2966 nifree 10759 nffree 58
nbfree 2516 ndir 2462 nifree 12452 nffree 1
nbfree 2859 ndir 2964 nifree 10626 nffree 7
nbfree 723 ndir 2964 nifree 10517 nffree 18
nbfree 2678 ndir 2967 nifree 10184 nffree 24
nbfree 4279 ndir 2983 nifree 10730 nffree 0
nbfree 0 ndir 2982 nifree 10215 nffree 40
nbfree 0 ndir 549 nifree 20947 nffree 44
nbfree 0 ndir 0 nifree 23552 nffree 10
nbfree 0 ndir 724 nifree 20416 nffree 16
nbfree 38 ndir 0 nifree 23552 nffree 67
nbfree 0 ndir 1200 nifree 17872 nffree 12
nbfree 0 ndir 2963 nifree 10769 nffree 7
nbfree 0 ndir 2963 nifree 10506 nffree 17
nbfree 0 ndir 0 nifree 23552 nffree 17
nbfree 0 ndir 2963 nifree 10765 nffree 4
nbfree 2 ndir 2963 nifree 10240 nffree 18
nbfree 4266 ndir 2983 nifree 10137 nffree 1
nbfree 9442 ndir 2982 nifree 10321 nffree 0
nbfree 9415 ndir 2963 nifree 10476 nffree 4
nbfree 10594 ndir 1194 nifree 18382 nffree 4
nbfree 2 ndir 0 nifree 23552 nffree 39
nbfree 8212 ndir 3050 nifree 10268 nffree 1
nbfree 10508 ndir 1288 nifree 17943 nffree 6
nbfree 1 ndir 0 nifree 23552 nffree 4
nbfree 11381 ndir 0 nifree 23552 nffree 0
nbfree 11391 ndir 0 nifree 23552 nffree 0
nbfree 0 ndir 2 nifree 23321 nffree 51
nbfree 0 ndir 0 nifree 23552 nffree 18
nbfree 7902 ndir 40 nifree 22960 nffree 3
nbfree 91 ndir 0 nifree 23552 nffree 46
nbfree 7862 ndir 0 nifree 23552 nffree 0
nbfree 8433 ndir 0 nifree 23552 nffree 0
nbfree 9341 ndir 0 nifree 23552 nffree 0
nbfree 5 ndir 0 nifree 23552 nffree 17
nbfree 8880 ndir 0 nifree 23552 nffree 0
nbfree 11 ndir 1958 nifree 14708 nffree 58
nbfree 12 ndir 1962 nifree 15043 nffree 54
nbfree 2151 ndir 1957 nifree 14900 nffree 20
nbfree 40 ndir 1958 nifree 15136 nffree 29
nbfree 5764 ndir 1957 nifree 14470 nffree 31
nbfree 6517 ndir 1959 nifree 15192 nffree 1
nbfree 8163 ndir 1976 nifree 14941 nffree 6
nbfree 4107 ndir 1956 nifree 15229 nffree 8
nbfree 3 ndir 1975 nifree 14289 nffree 37
nbfree 0 ndir 1974 nifree 15026 nffree 18
nbfree 6475 ndir 1976 nifree 14747 nffree 7
nbfree 0 ndir 1974 nifree 14882 nffree 43
nbfree 5200 ndir 1975 nifree 14912 nffree 1

100%
dumpfs /dev/da0s2a | grep nbfree
nbfree 51875 ndir 120875 nifree 877882 nffree 1443
cs[].cs_(nbfree,ndir,nifree,nffree):
nbfree 3167 ndir 2963 nifree 10330 nffree 6
nbfree 3583 ndir 2982 nifree 10562 nffree 4
nbfree 52 ndir 663 nifree 20677 nffree 39
nbfree 4265 ndir 2982 nifree 10131 nffree 0
nbfree 4185 ndir 2982 nifree 10340 nffree 7
nbfree 9 ndir 2465 nifree 12592 nffree 60
nbfree 2 ndir 2463 nifree 12630 nffree 34
nbfree 1642 ndir 2461 nifree 12736 nffree 19
nbfree 38 ndir 2462 nifree 12440 nffree 31
nbfree 3008 ndir 2461 nifree 12778 nffree 36
nbfree 0 ndir 633 nifree 20564 nffree 42
nbfree 0 ndir 2963 nifree 10778 nffree 22
nbfree 0 ndir 2460 nifree 12459 nffree 12
nbfree 0 ndir 2963 nifree 10667 nffree 7
nbfree 0 ndir 2963 nifree 10491 nffree 3
nbfree 51 ndir 2963 nifree 10626 nffree 35
nbfree 0 ndir 2963 nifree 10547 nffree 18
nbfree 2 ndir 2963 nifree 10673 nffree 38
nbfree 0 ndir 549 nifree 20947 nffree 40
nbfree 0 ndir 0 nifree 23552 nffree 11
nbfree 3 ndir 0 nifree 23552 nffree 0
nbfree 87 ndir 0 nifree 23552 nffree 51
nbfree 0 ndir 1319 nifree 17311 nffree 5
nbfree 30 ndir 2963 nifree 10498 nffree 17
nbfree 4586 ndir 2983 nifree 10062 nffree 2
nbfree 0 ndir 0 nifree 23552 nffree 19
nbfree 9401 ndir 388 nifree 21774 nffree 5
nbfree 2 ndir 3473 nifree 8167 nffree 113
nbfree 103 ndir 3470 nifree 8345 nffree 28
nbfree 395 ndir 3471 nifree 7913 nffree 64
nbfree 1 ndir 3467 nifree 8476 nffree 5
nbfree 1690 ndir 3486 nifree 8049 nffree 7
nbfree 5065 ndir 3486 nifree 8302 nffree 2
nbfree 5762 ndir 3485 nifree 8214 nffree 4
nbfree 5 ndir 3472 nifree 8363 nffree 9
nbfree 0 ndir 2356 nifree 13130 nffree 33
nbfree 0 ndir 0 nifree 23552 nffree 6
nbfree 0 ndir 0 nifree 23552 nffree 11
nbfree 0 ndir 2 nifree 23321 nffree 51
nbfree 0 ndir 0 nifree 23552 nffree 18
nbfree 0 ndir 40 nifree 22960 nffree 6
nbfree 6 ndir 0 nifree 23552 nffree 48
nbfree 0 ndir 0 nifree 23552 nffree 51
nbfree 506 ndir 0 nifree 23552 nffree 22
nbfree 0 ndir 2965 nifree 10371 nffree 52
nbfree 0 ndir 0 nifree 23552 nffree 17
nbfree 139 ndir 2969 nifree 10603 nffree 63
nbfree 0 ndir 1958 nifree 14708 nffree 43
nbfree 37 ndir 1962 nifree 15043 nffree 57
nbfree 237 ndir 1957 nifree 14900 nffree 17
nbfree 0 ndir 1958 nifree 15136 nffree 21
nbfree 0 ndir 2964 nifree 10118 nffree 12
nbfree 805 ndir 3005 nifree 10331 nffree 6
nbfree 561 ndir 2964 nifree 10525 nffree 10
nbfree 5 ndir 2199 nifree 14133 nffree 19
nbfree 0 ndir 1975 nifree 14289 nffree 25
nbfree 2 ndir 1974 nifree 15026 nffree 11
nbfree 2437 ndir 2923 nifree 10441 nffree 5
nbfree 4 ndir 1974 nifree 14882 nffree 36
nbfree 2 ndir 2963 nifree 10451 nffree 8

I think it would work better if dirpref were converted to a two pass
algorithm.  The first pass would only consider those cylinder groups that
had more than minfree space.  If this first pass failed, the second pass
would look at all cylinder groups.
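
The two pass idea can be sketched in user-space C roughly as follows. This is only an illustration and not the ffs_dirpref() code: struct cg_summary, scan_groups(), pick_cg_two_pass() and the reserve_blocks parameter (a per-group stand-in for the minfree reserve) are all hypothetical names, and it assumes the second pass simply falls back to the ordinary minbfree/minifree thresholds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cylinder-group summary; not the kernel's struct csum. */
struct cg_summary {
	long nbfree;	/* free blocks in this group */
	long nifree;	/* free inodes in this group */
};

/*
 * Scan all ncg groups, starting at 'start' and wrapping around.  Return
 * the first group with at least min_bfree free blocks and min_ifree free
 * inodes, or -1 if no group qualifies.
 */
static int
scan_groups(const struct cg_summary *cgs, int ncg, int start,
    long min_bfree, long min_ifree)
{
	int i, cg;

	for (i = 0; i < ncg; i++) {
		cg = (start + i) % ncg;
		if (cgs[cg].nbfree >= min_bfree &&
		    cgs[cg].nifree >= min_ifree)
			return (cg);
	}
	return (-1);
}

/*
 * Two pass selection.  Pass 1 only accepts groups that still have more
 * than reserve_blocks free blocks (and that also meet minbfree).  If no
 * group passes, pass 2 retries over all groups with minbfree alone.
 */
int
pick_cg_two_pass(const struct cg_summary *cgs, int ncg, int start,
    long minbfree, long minifree, long reserve_blocks)
{
	long pass1_bfree;
	int cg;

	assert(cgs != NULL && ncg > 0 && start >= 0);

	pass1_bfree = reserve_blocks + 1;	/* strictly above the reserve */
	if (pass1_bfree < minbfree)
		pass1_bfree = minbfree;

	cg = scan_groups(cgs, ncg, start, pass1_bfree, minifree);
	if (cg >= 0)
		return (cg);
	return (scan_groups(cgs, ncg, start, minbfree, minifree));
}
```

With numbers like those in the dumps above (11761 blocks per group, so an 8% reserve is about 940 blocks, assuming the default minfree), the first pass would skip a group with nbfree 50 even though it clears a small minbfree, and would only fall back to it when no emptier group exists.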
Another change that I suspect would help is rather than comparing cylinder groups to minbfree and minifree, calculate how many directories containing avgfilesperdir files of size avgfilesize they could hold, and then calculate the average and minimum threshold values of that. It would be an interesting project to write a filesystem allocation simulator to test different allocation algorithms without having to bang on physical disks. From owner-freebsd-fs@FreeBSD.ORG Mon Nov 17 19:48:08 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BA6AC16A4CE for ; Mon, 17 Nov 2003 19:48:08 -0800 (PST) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1501A43FAF for ; Mon, 17 Nov 2003 19:48:06 -0800 (PST) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id hAI3lmeF089505; Mon, 17 Nov 2003 19:47:53 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <200311180347.hAI3lmeF089505@gw.catspoiler.org> Date: Mon, 17 Nov 2003 19:47:48 -0800 (PST) From: Don Lewis To: kmarx@vicor.com In-Reply-To: <3FB9362B.4030601@vicor.com> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: freebsd-fs@FreeBSD.org cc: mckusick@beastie.mckusick.com Subject: Re: 4.8 ffs_dirpref problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Nov 2003 03:48:08 -0000 On 17 Nov, Ken Marx wrote: > > > Don Lewis wrote: >> Ok, I'll do the commit as soon as I can do some testing on my -STABLE >> box. >> > > Great. Please let us know when this happens. In fact, > I kind of got lost which you were planning to commit. 
> Can you point me to it, and I'll do one last overnight run.

I just committed the version that sets minbfree to:
    max(1, avgbfree - avgbfree / 4)

You may want to continue to use the version that you are already running
which sets minbfree to avgbfree.  I'm not committing my more complex
version because it benchmarked worse for me than the version I
committed.

I'm pretty sure that we can do better than this, but it will require a
fair amount of tweaking and benchmarking.  For now, this version should
work a lot better than the previous version of the code.

>>
>>>I was able to run a couple more tests here, and *believe* that the
>>>fix to the hash table in vfs_bio.c will provide some relief
>>>for cg block searches when things do fall into the linear search case.
>>
>>
>> I'll see about cranking out a patch to use a Fibonacci hash.  It'll
>> probably be a little while before I can find sufficient time, though.
>>
>
> Ditto the above: thanks/keep us posted. Our clients are
> anxious to have a 'final' kernel to run with. I think we'll
> just give them what you commit, and sneak the hash fix in with
> the security patch or some such. So, no rush, but do let me
> know if you think it might happen sooner than, say, 2 weeks
> so I can try and get it all in one release to them.

I had some time to crank out a patch.  Give this a try and compare it to
your hash patch.  It hasn't blown up my system, but I don't have any
benchmark data on it.  You can just do the test where you fill the
remaining space in the filesystem.  You won't need to do a newfs and
start from scratch.  It would be great if you could compare the hash
bucket sizes for the different versions of the hash.
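
A bucket-size comparison of this kind can also be tried in user space before touching a kernel. The sketch below copies the old and new hash expressions from the patch that follows, but everything around them is an assumption made for illustration: old_hash(), fib_hash(), max_chain() and the nbits parameter (log2 of the table size, so that the mask is 2^nbits - 1 and the shift is 32 - nbits) are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hash signature used by this sketch: table has 2^nbits buckets. */
typedef unsigned (*hash_fn)(uintptr_t vnp, int64_t bn, unsigned nbits);

/* The expression removed by the patch: shifted pointer plus block number. */
static unsigned
old_hash(uintptr_t vnp, int64_t bn, unsigned nbits)
{
	uint32_t mask;

	assert(nbits >= 1 && nbits <= 31);
	mask = ((uint32_t)1 << nbits) - 1;
	return ((uint32_t)((vnp >> 7) + (uintptr_t)(int)bn) & mask);
}

/* The expression added by the patch: fold to 32 bits, multiply, shift. */
static unsigned
fib_hash(uintptr_t vnp, int64_t bn, unsigned nbits)
{
	uint64_t key64;
	uint32_t key32, mask;

	assert(nbits >= 1 && nbits <= 31);
	mask = ((uint32_t)1 << nbits) - 1;
	key64 = (uint64_t)vnp + (uint64_t)bn;
	key32 = (uint32_t)(key64 + (key64 >> 32));
	/* Multiply modulo 2^32 by 2654435769, then keep the top nbits. */
	return (((uint32_t)((uint64_t)key32 * 2654435769u)) >>
	    (32 - nbits)) & mask;
}

/*
 * Hash n block numbers belonging to one vnode and return the length of
 * the longest bucket chain (0 if the counter array cannot be allocated).
 */
static size_t
max_chain(hash_fn h, uintptr_t vnp, const int64_t *bns, size_t n,
    unsigned nbits)
{
	size_t i, c, worst = 0;
	size_t *count = calloc((size_t)1 << nbits, sizeof(*count));

	if (count == NULL)
		return (0);
	for (i = 0; i < n; i++) {
		c = ++count[h(vnp, bns[i], nbits)];
		if (c > worst)
			worst = c;
	}
	free(count);
	return (worst);
}
```

One pattern worth feeding it is a set of block numbers that are all multiples of the table size: the old expression only looks at the low bits of the sum, so every such key lands in the same bucket, while the multiplicative version takes its index from the high bits of the product. Real key traces from a running system would be a better test than this synthetic case.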
Index: sys/kern/vfs_bio.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.242.2.21
diff -u -r1.242.2.21 vfs_bio.c
--- sys/kern/vfs_bio.c	9 Aug 2003 16:21:19 -0000	1.242.2.21
+++ sys/kern/vfs_bio.c	18 Nov 2003 02:10:55 -0000
@@ -140,6 +140,7 @@
 	&bufreusecnt, 0, "");
 
 static int bufhashmask;
+static int bufhashshift;
 static LIST_HEAD(bufhashhdr, buf) *bufhashtbl, invalhash;
 struct bqueues bufqueues[BUFFER_QUEUES] = { { 0 } };
 char *buf_wmesg = BUF_WMESG;
@@ -160,7 +161,20 @@
 struct bufhashhdr *
 bufhash(struct vnode *vnp, daddr_t bn)
 {
-	return(&bufhashtbl[(((uintptr_t)(vnp) >> 7) + (int)bn) & bufhashmask]);
+	u_int64_t hashkey64;
+	int hashkey;
+
+	/*
+	 * Fibonacci hash, see Knuth's
+	 * _Art of Computer Programming, Volume 3 / Sorting and Searching_
+	 *
+	 * We reduce the argument to 32 bits before doing the hash to
+	 * avoid the need for a slow 64x64 multiply on 32 bit platforms.
+	 */
+	hashkey64 = (u_int64_t)(uintptr_t)vnp + (u_int64_t)bn;
+	hashkey = (((u_int32_t)(hashkey64 + (hashkey64 >> 32)) * 2654435769u) >>
+	    bufhashshift) & bufhashmask;
+	return(&bufhashtbl[hashkey]);
 }
 
 /*
@@ -319,8 +333,9 @@
 bufhashinit(caddr_t vaddr)
 {
 	/* first, make a null hash table */
+	bufhashshift = 29;
 	for (bufhashmask = 8; bufhashmask < nbuf / 4; bufhashmask <<= 1)
-		;
+		bufhashshift--;
 	bufhashtbl = (void *)vaddr;
 	vaddr = vaddr + sizeof(*bufhashtbl) * bufhashmask;
 	--bufhashmask;

From owner-freebsd-fs@FreeBSD.ORG Tue Nov 18 08:26:05 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B34F816A4CE for ; Tue, 18 Nov 2003 08:26:05 -0800 (PST) Received: from citi.umich.edu (citi.umich.edu [141.211.133.111]) by mx1.FreeBSD.org (Postfix) with ESMTP id 199B543FBF for ; Tue, 18 Nov 2003 08:26:05 -0800 (PST) (envelope-from rees@citi.umich.edu) Received: from citi.umich.edu (dumaguete.citi.umich.edu
[141.211.133.51]) by citi.umich.edu (Postfix) with ESMTP id A59842080C for ; Tue, 18 Nov 2003 11:26:03 -0500 (EST) To: freebsd-fs@freebsd.org From: Jim Rees Date: Tue, 18 Nov 2003 11:26:03 -0500 Sender: rees@citi.umich.edu Message-Id: <20031118162603.A59842080C@citi.umich.edu> Subject: NFSv4 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Nov 2003 16:26:05 -0000 You may have noticed that there is now an NFS v4 client in the kernel. It was written at the Center for Information Technology Integration at the University of Michigan by me and several others. I would welcome any comments, questions, or suggestions you might have. Unfortunately, servers are hard to come by. There is no FreeBSD server. There is a server in linux 2.6, and one in the Network Appliance filer. There is one available for OpenBSD 2.8. Hummingbird has one for Windows. Things I'll be working on in the next few days: - man pages for idmapd and mount_nfs4 - make it possible to build a v4 module separate from v3 Right now there is no security, but it will be coming soon. NFSv4 uses gss-rpc, with kerberos and lipkey (spkm) mechanisms. 
You can find more info here: http://www.citi.umich.edu/projects/nfsv4/ From owner-freebsd-fs@FreeBSD.ORG Tue Nov 18 10:19:33 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6886D16A4CF; Tue, 18 Nov 2003 10:19:33 -0800 (PST) Received: from sploot.vicor-nb.com (sploot.vicor-nb.com [208.206.78.81]) by mx1.FreeBSD.org (Postfix) with ESMTP id F0CE143FDD; Tue, 18 Nov 2003 10:19:31 -0800 (PST) (envelope-from kmarx@vicor.com) Received: from vicor.com (localhost [127.0.0.1]) by sploot.vicor-nb.com (8.12.8/8.12.8) with ESMTP id hAIIDC3g038538; Tue, 18 Nov 2003 10:13:13 -0800 (PST) (envelope-from kmarx@vicor.com) Message-ID: <3FBA6138.3000500@vicor.com> Date: Tue, 18 Nov 2003 10:13:12 -0800 From: Ken Marx User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.6a) Gecko/20031105 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Don Lewis References: <200311180347.hAI3lmeF089505@gw.catspoiler.org> In-Reply-To: <200311180347.hAI3lmeF089505@gw.catspoiler.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-fs@FreeBSD.org cc: mckusick@beastie.mckusick.com Subject: Re: 4.8 ffs_dirpref problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Nov 2003 18:19:33 -0000 Don Lewis wrote: > On 17 Nov, Ken Marx wrote: > >> >>Don Lewis wrote: > > >>>Ok, I'll do the commit as soon as I can do some testing on my -STABLE >>>box. >>> >> >>Great. Please let us know when this happens. In fact, >>I kind of got lost which you were planning to commit. >>Can you point me to it, and I'll do one last overnight run. 
>
>
> I just committed the version that sets minbfree to:
>     max(1, avgbfree - avgbfree / 4)
>
> You may want to continue to use the version that you are already running
> which sets minbfree to avgbfree.  I'm not committing my more complex
> version because it benchmarked worse for me than the version I
> committed.
>
> I'm pretty sure that we can do better than this, but it will require a
> fair amount of tweaking and benchmarking.  For now, this version
> should work a lot better than the previous version of the code.
>
>
>>>>I was able to run a couple more tests here, and *believe* that the
>>>>fix to the hash table in vfs_bio.c will provide some relief
>>>>for cg block searches when things do fall into the linear search case.
>>>
>>>
>>>I'll see about cranking out a patch to use a Fibonacci hash.  It'll
>>>probably be a little while before I can find sufficient time, though.
>>>
>>
>>Ditto the above: thanks/keep us posted. Our clients are
>>anxious to have a 'final' kernel to run with. I think we'll
>>just give them what you commit, and sneak the hash fix in with
>>the security patch or some such. So, no rush, but do let me
>>know if you think it might happen sooner than, say, 2 weeks
>>so I can try and get it all in one release to them.
>
>
> I had some time to crank out a patch.  Give this a try and compare it to
> your hash patch.  It hasn't blown up my system, but I don't have any
> benchmark data on it.  You can just do the test where you fill the
> remaining space in the filesystem.  You won't need to do a newfs and
> start from scratch.  It would be great if you could compare the hash
> bucket sizes for the different versions of the hash.
> > > Index: sys/kern/vfs_bio.c > =================================================================== > RCS file: /home/ncvs/src/sys/kern/vfs_bio.c,v > retrieving revision 1.242.2.21 > diff -u -r1.242.2.21 vfs_bio.c > --- sys/kern/vfs_bio.c 9 Aug 2003 16:21:19 -0000 1.242.2.21 > +++ sys/kern/vfs_bio.c 18 Nov 2003 02:10:55 -0000 > @@ -140,6 +140,7 @@ > &bufreusecnt, 0, ""); > > static int bufhashmask; > +static int bufhashshift; > static LIST_HEAD(bufhashhdr, buf) *bufhashtbl, invalhash; > struct bqueues bufqueues[BUFFER_QUEUES] = { { 0 } }; > char *buf_wmesg = BUF_WMESG; > @@ -160,7 +161,20 @@ > struct bufhashhdr * > bufhash(struct vnode *vnp, daddr_t bn) > { > - return(&bufhashtbl[(((uintptr_t)(vnp) >> 7) + (int)bn) & bufhashmask]); > + u_int64_t hashkey64; > + int hashkey; > + > + /* > + * Fibonacci hash, see Knuth's > + * _Art of Computer Programming, Volume 3 / Sorting and Searching_ > + * > + * We reduce the argument to 32 bits before doing the hash to > + * avoid the need for a slow 64x64 multiply on 32 bit platforms. > + */ > + hashkey64 = (u_int64_t)(uintptr_t)vnp + (u_int64_t)bn; > + hashkey = (((u_int32_t)(hashkey64 + (hashkey64 >> 32)) * 2654435769u) >> > + bufhashshift) & bufhashmask; > + return(&bufhashtbl[hashkey]); > } > > /* > @@ -319,8 +333,9 @@ > bufhashinit(caddr_t vaddr) > { > /* first, make a null hash table */ > + bufhashshift = 29; > for (bufhashmask = 8; bufhashmask < nbuf / 4; bufhashmask <<= 1) > - ; > + bufhashshift--; > bufhashtbl = (void *)vaddr; > vaddr = vaddr + sizeof(*bufhashtbl) * bufhashmask; > --bufhashmask; > > Most excellent. I'll try and get you some info by end of today. Thanks! k -- Ken Marx, kmarx@vicor-nb.com Clearly we must down size and set up weekly meetings on the object, etc. 
- http://www.bigshed.com/cgi-bin/speak.cgi From owner-freebsd-fs@FreeBSD.ORG Tue Nov 18 13:33:02 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8659616A4CE for ; Tue, 18 Nov 2003 13:33:02 -0800 (PST) Received: from mail.allcaps.org (mail.allcaps.org [206.251.247.157]) by mx1.FreeBSD.org (Postfix) with ESMTP id C01CE43FB1 for ; Tue, 18 Nov 2003 13:33:01 -0800 (PST) (envelope-from bsder@allcaps.org) Received: from mail.allcaps.org (localhost [127.0.0.1]) by mail.allcaps.org (Postfix) with ESMTP id 0615FD844C; Tue, 18 Nov 2003 13:34:02 -0800 (PST) Received: from localhost (bsder@localhost)hAILY1gs034422; Tue, 18 Nov 2003 13:34:01 -0800 (PST) X-Authentication-Warning: mail.allcaps.org: bsder owned process doing -bs Date: Tue, 18 Nov 2003 13:34:01 -0800 (PST) From: "Andrew P. Lentvorski, Jr." To: Jim Rees In-Reply-To: <20031118162603.A59842080C@citi.umich.edu> Message-ID: <20031118133248.V34409@mail.allcaps.org> References: <20031118162603.A59842080C@citi.umich.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-fs@freebsd.org Subject: Re: NFSv4 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Nov 2003 21:33:02 -0000 On Tue, 18 Nov 2003, Jim Rees wrote: > Unfortunately, servers are hard to come by. There is no FreeBSD server. > There is a server in linux 2.6, and one in the Network Appliance filer. > There is one available for OpenBSD 2.8. Hummingbird has one for Windows. Is there one in Solaris? Or is it unsuitable for some reason? 
-a From owner-freebsd-fs@FreeBSD.ORG Wed Nov 19 02:53:22 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7809516A4CE for ; Wed, 19 Nov 2003 02:53:22 -0800 (PST) Received: from mail-out.ukr.net (mail-out.ukr.net [212.42.65.71]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4FCD743FE3 for ; Wed, 19 Nov 2003 02:53:19 -0800 (PST) (envelope-from technix@ukr.net) Received: from storage.ukr.net ([212.42.65.69]) by mail-out.ukr.net with esmtp ID 1AMPxQ-000Cxk-00; Wed, 19 Nov 2003 12:53:12 +0200 Received: from mail by storage.ukr.net with local ID 1AMPxQ-000EwV-00 for freebsd-fs@freebsd.org; Wed, 19 Nov 2003 12:53:12 +0200 Received: from [193.41.172.68] by www1.ukr.net with HTTP; Wed, 19 Nov 2003 10:53:12 +0000 (GMT) From: "Sergei Mozhaisky" To: freebsd-fs@freebsd.org Mime-Version: 1.0 X-Mailer: mPOP Web-Mail 2.19 X-Originating-IP: 10.1.1.1 via proxy [193.41.172.68] Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Message-Id: Date: Wed, 19 Nov 2003 12:53:12 +0200 X-Scanner: exiscan for exim4 (http://duncanthrax.net/exiscan/) *1AMPxQ-000EwV-00*ec9zhISPHMo* Subject: Compressed filesystem for FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Sergei Mozhaisky List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 10:53:22 -0000 Hello everyone. I am searching for a compressed filesystem for FreeBSD. Why do I need a compressed filesystem? I am the developer of a FreeBSD-based LiveCD, and the main problem with the LiveCD is loading speed - it's too slow compared with Knoppix or other Linux LiveCDs. Using a compressed filesystem would improve software loading speed and allow more software to fit on the CD.
I found information about the "FiST" project in the freebsd-fs archives, and it is almost what I need: ftp://ftp.filesystems.org/pub/fist/fistgen-0.0.7.tar.gz But the gzipfs module does not compile on FreeBSD (the developers said that the size-changing algorithm they use currently works only on Linux). So the question is simple: are there any implementations of compressed filesystems for FreeBSD (even as unofficial projects)? This would help a lot in projects such as a FreeBSD-based LiveCD. -- Best regards, [ http://technix.melitopol.zp.ua/ ] Mozhaisky Sergei (techniX) [ http://frenzy.icc.melitopol.net/ ] From owner-freebsd-fs@FreeBSD.ORG Wed Nov 19 08:25:12 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1C18316A4CE for ; Wed, 19 Nov 2003 08:25:12 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 13C1143FA3 for ; Wed, 19 Nov 2003 08:25:11 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (localhost [127.0.0.1]) by fledge.watson.org (8.12.9p2/8.12.9) with ESMTP id hAJGN2Mg008620; Wed, 19 Nov 2003 11:23:02 -0500 (EST) (envelope-from robert@fledge.watson.org) Received: from localhost (robert@localhost)hAJGMxRR008617; Wed, 19 Nov 2003 11:23:02 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Wed, 19 Nov 2003 11:22:59 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Sergei Mozhaisky In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: freebsd-fs@freebsd.org Subject: Re: Compressed filesystem for FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 16:25:12 -0000 On Wed, 19 Nov 2003, Sergei Mozhaisky wrote: > Hello everyone.
> > I am searching for compressed filesystem for FreeBSD. > > Why do I need compressed filesystem? I am developer of FreeBSD-based > LiveCD, and the main problem of liveCD is loading speed - it's too slow, > comparing with Knoppix or other Linux LiveCD. Usage of compressed > filesystem will improve software loading speed and allow to put more > software to CD. > > I found info about project "FiST" in freebsd-fs archives, this is almost > what I need: > ftp://ftp.filesystems.org/pub/fist/fistgen-0.0.7.tar.gz But gzipfs > module does not compile in FreeBSD (developers said that size-changing > algorithm they used currently works only in Linux) So the question is > simple: is there any implementations of compressed filesystems for > FreeBSD (even as unofficial projects)? This will help a lot in such > projects as FreeBSD-based LiveCD. If your file system is read-only, I wonder if the easier path wouldn't be to implement a compression layer for GEOM. That would keep you out of the business of alternative or stacked file systems, which is a painful business to be in :-). You could use regular UFS or cd9660, and then compress the resulting image using a custom (or off-the-shelf) tool. The trick will be to have random access to the compressed data...
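One way to get that random access, sketched here only as an illustration: cut the image into fixed-size clusters, compress each cluster independently, and keep a table of where every compressed cluster starts. A read at any logical offset then needs just one table lookup and one cluster to be inflated. The layout and the names below (struct cimage, struct cextent, cimage_locate) are hypothetical and do not describe an existing GEOM class or tool; the compression step itself is left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical image layout (an assumption for illustration only):
 * toc[i] is the byte offset of compressed cluster i inside the image,
 * and toc[nclusters] marks the end of the last compressed cluster.
 */
struct cimage {
	uint32_t	cluster_size;	/* uncompressed bytes per cluster */
	uint32_t	nclusters;	/* number of clusters in the image */
	const uint64_t	*toc;		/* nclusters + 1 compressed offsets */
};

/* What has to be read and inflated to satisfy one logical offset. */
struct cextent {
	uint64_t	coff;	/* start of the compressed cluster */
	uint64_t	clen;	/* compressed length of that cluster */
	uint32_t	skip;	/* bytes to skip once it is inflated */
};

/*
 * Map a logical (uncompressed) byte offset to the single compressed
 * cluster that covers it.  Returns 0 on success and -1 if the offset
 * lies beyond the end of the image.
 */
int
cimage_locate(const struct cimage *ci, uint64_t loff, struct cextent *ce)
{
	uint64_t idx;

	if (ci->cluster_size == 0)
		return (-1);
	idx = loff / ci->cluster_size;
	if (idx >= ci->nclusters)
		return (-1);
	ce->coff = ci->toc[idx];
	ce->clen = ci->toc[idx + 1] - ci->toc[idx];
	ce->skip = (uint32_t)(loff % ci->cluster_size);
	return (0);
}
```

The cluster size is the tuning knob in such a scheme: larger clusters usually compress better, but every small read then pays for inflating a whole cluster.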
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-fs@FreeBSD.ORG Wed Nov 19 08:58:49 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id ACCC516A4CE; Wed, 19 Nov 2003 08:58:49 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id D8ED443F93; Wed, 19 Nov 2003 08:58:47 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.10/8.12.10) with ESMTP id hAJGwiEs065367; Wed, 19 Nov 2003 17:58:45 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Robert Watson From: "Poul-Henning Kamp" In-Reply-To: Your message of "Wed, 19 Nov 2003 11:22:59 EST." Date: Wed, 19 Nov 2003 17:58:44 +0100 Message-ID: <65366.1069261124@critter.freebsd.dk> cc: freebsd-fs@freebsd.org Subject: Re: Compressed filesystem for FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 16:58:49 -0000 In message , Robert Watson writes: >If your file system is read-only, I wonder if the easier path wouldn't be >to implement a compression layer for GEOM. Somebody else talked about this concept recently, but I forgot who and can't find the email right now... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
From owner-freebsd-fs@FreeBSD.ORG Wed Nov 19 09:32:41 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0A8BB16A4CE for ; Wed, 19 Nov 2003 09:32:41 -0800 (PST) Received: from vsmtp1.tin.it (vsmtp1.tin.it [212.216.176.221]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1819B43F93 for ; Wed, 19 Nov 2003 09:32:40 -0800 (PST) (envelope-from flag@libero.it) Received: from willow.homeunix.org (80.183.95.114) by vsmtp1.tin.it (7.0.019) id 3FB901560015B851; Wed, 19 Nov 2003 18:32:38 +0100 Received: by willow.homeunix.org (Postfix, from userid 1001) id 7A65820B3; Wed, 19 Nov 2003 18:36:24 +0100 (CET) Date: Wed, 19 Nov 2003 18:36:24 +0100 From: Paolo Pisati To: Poul-Henning Kamp Message-ID: <20031119173624.GA4093@tin.it> References: <65366.1069261124@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <65366.1069261124@critter.freebsd.dk> User-Agent: Mutt/1.4.1i cc: freebsd-fs@freebsd.org Subject: Re: Compressed filesystem for FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 17:32:41 -0000 On Wed, Nov 19, 2003 at 05:58:44PM +0100, Poul-Henning Kamp wrote: > In message , Robert Watson writes: > > >If your file system is read-only, I wonder if the easier path wouldn't be > >to implement a compression layer for GEOM. > > Somebody else talked about this concept recently, but I forgot who and > can't find the email right now... Present! =) We talked about this topic in private emails, and you told me it would be simpler to start hacking on geom-gate and write a userland implementation, but my five-minute trip into geom-gate land didn't help me much... =) Let's see what the future brings...
-- Paolo Italian FreeBSD User Group: http://www.gufi.org From owner-freebsd-fs@FreeBSD.ORG Wed Nov 19 11:27:50 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 49E0D16A4D0 for ; Wed, 19 Nov 2003 11:27:50 -0800 (PST) Received: from mxsf26.cluster1.charter.net (mxsf26.cluster1.charter.net [209.225.28.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6DDF143FBD for ; Wed, 19 Nov 2003 11:27:40 -0800 (PST) (envelope-from ups@stups.com) Received: from stups.com ([209.187.143.11])hAJJKMJa070428; Wed, 19 Nov 2003 14:20:25 -0500 (EST) (envelope-from ups@stups.com) Received: from tree.com (localhost [127.0.0.1]) by stups.com (8.9.3/8.9.3) with ESMTP id OAA14798; Wed, 19 Nov 2003 14:20:21 -0500 Message-Id: <200311191920.OAA14798@stups.com> X-Mailer: exmh version 2.0.2 To: Sergei Mozhaisky In-Reply-To: Message from "Sergei Mozhaisky" of "Wed, 19 Nov 2003 12:53:12 +0200." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 19 Nov 2003 14:20:21 -0500 From: Stephan Uphoff cc: freebsd-fs@freebsd.org Subject: Re: Compressed filesystem for FreeBSD X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Nov 2003 19:27:50 -0000 Sergei Mozhaisky wrote: > Hello everyone. > > I am searching for compressed filesystem for FreeBSD. > > Why do I need compressed filesystem? I am developer of FreeBSD-based > LiveCD, and the main problem of liveCD is loading speed - it's too > slow, comparing with Knoppix or other Linux LiveCD. Currently the f_iosize of a CD filesystem is set using the f_bsize. (normally 2048) This forces vnode_pager_generic_getpages() to use vnode_pager_input_smlfs(). Replacing the line: sbp->f_iosize = sbp->f_bsize; /* XXX */ in cd9660_vfsops.c with: sbp->f_iosize = (sbp->f_bsize > PAGE_SIZE) ? 
sbp->f_bsize : PAGE_SIZE; should be possible (Warning not tested!) and should help you with the loading speed. Stephan From owner-freebsd-fs@FreeBSD.ORG Thu Nov 20 02:55:38 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2E02416A4CE; Thu, 20 Nov 2003 02:55:38 -0800 (PST) Received: from razorbill.mail.pas.earthlink.net (razorbill.mail.pas.earthlink.net [207.217.121.248]) by mx1.FreeBSD.org (Postfix) with ESMTP id 090DF43FE5; Thu, 20 Nov 2003 02:55:35 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from user-2ivfjcl.dialup.mindspring.com ([165.247.205.149] helo=mindspring.com) by razorbill.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 1AMmTC-0003pJ-00; Thu, 20 Nov 2003 02:55:31 -0800 Message-ID: <3FBC9D7A.5ECAE855@mindspring.com> Date: Thu, 20 Nov 2003 02:54:50 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Alfred Perlstein References: <20031117193104.GH35957@elvis.mu.org> <20031117194738.GK35957@elvis.mu.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a433e589e8d7b90b859b690266565dc6c6667c3043c0873f7e350badd9bab72f9c350badd9bab72f9c cc: bp@freebsd.org cc: marcel@freebsd.org cc: Robert Watson cc: fs@freebsd.org Subject: Re: open cookies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2003 10:55:38 -0000 Alfred Perlstein wrote: > * Robert Watson [031117 11:40] wrote: > > I'm saying we can support most of the interesting things I know of that > > need this state already using devfs clone support. I'm wondering if you > > have in mind anything further that can't be accomplished with clone > > support. 
I.e., something that requires session state for something other > > than /dev entries? > > Not really. I didn't realize that the problem had been taken care of. It hasn't been taken care of until it's possible to run multiple instances of VMWare on FreeBSD, which works just fine on Linux. -- Terry From owner-freebsd-fs@FreeBSD.ORG Thu Nov 20 03:41:46 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4560416A4CE for ; Thu, 20 Nov 2003 03:41:46 -0800 (PST) Received: from techno.sub.ru (webmail.sub.ru [213.247.139.22]) by mx1.FreeBSD.org (Postfix) with SMTP id 3384743FAF for ; Thu, 20 Nov 2003 03:41:44 -0800 (PST) (envelope-from tarkhil@over.ru) Received: (qmail 66057 invoked by uid 0); 20 Nov 2003 11:41:50 -0000 Received: from unknown (HELO tarkhil.over.ru) (213.148.23.65) by webmail.sub.ru with SMTP; 20 Nov 2003 11:41:50 -0000 Date: Thu, 20 Nov 2003 14:41:43 +0300 From: Alex Povolotsky To: fs@freebsd.org Message-Id: <20031120144143.6cf73e06.tarkhil@over.ru> X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i386-portbld-freebsd4.6.2) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: "Reverse union" mount possible? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2003 11:41:46 -0000 Hello! Quoting the mount(8) man page: union Causes the namespace at the mount point to appear as the union of the mounted file system root and the existing directory. Lookups will be done in the mounted file system first. If those operations fail due to a non-existent file the underlying directory is then accessed. All creates are done in the mounted file system.
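To make the lookup order concrete, here is a small userland model of it; it is only a sketch and not how the kernel implements the union option. The function name union_lookup and its "reverse" argument are hypothetical: with reverse set to 0 it follows the order quoted above (mounted tree first, underlying directory only after ENOENT), and with reverse set to 1 it models the order a "reverse union" would use.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Userland model of the union lookup order (illustration only).
 * "upper" is the mounted tree, "lower" the underlying directory.
 * The hypothetical "reverse" flag swaps which one is tried first.
 * On success the resolved path is left in "out" and 0 is returned;
 * on failure -1 is returned with errno set.
 */
int
union_lookup(const char *upper, const char *lower, const char *name,
    int reverse, char *out, size_t outlen)
{
	const char *order[2];
	struct stat sb;
	int i, n;

	order[0] = reverse ? lower : upper;
	order[1] = reverse ? upper : lower;
	for (i = 0; i < 2; i++) {
		n = snprintf(out, outlen, "%s/%s", order[i], name);
		if (n < 0 || (size_t)n >= outlen) {
			errno = ENAMETOOLONG;
			return (-1);
		}
		if (stat(out, &sb) == 0)
			return (0);
		/* Only a missing file falls through to the next layer. */
		if (errno != ENOENT)
			return (-1);
	}
	errno = ENOENT;
	return (-1);
}
```

Creates are not modeled here; per the text above they always go to the mounted file system, whereas the reverse variant being asked about would send them to the underlying one.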
Is it somehow possible, or how complex a patch would it take, to get a "reverse union" mount, with lookups done first in the underlying file system and file creation done there as well? In case I'm trying to invent a square wheel, here is the problem: I need to create several jails with as many common files as possible, and with the ability to update software in all jails at once as well as in specific jails. Right now, I'm readonly mount_null'ing /bin, /sbin, /usr/bin, /usr/sbin, /usr/lib, /usr/include, /usr/libexec, /usr/share. With a dozen jails, there are too many mounts for my liking, and about twice a week I experience a panic. Probably it's nullfs-related. I think that reverse-union mounting a common tree (over NFS, for stability, though that doesn't really matter) could help me a lot. I.e. mount -o runion 127.0.0.1:/usr/jail/common /usr/jail/jail1 mount -o runion 127.0.0.1:/usr/jail/common /usr/jail/jail2 etc. would reduce the number of mounts and make it possible to update software in the common subtree for all jails as well as in each particular jail. Please point out my mistakes ;-) -- Alex.
From owner-freebsd-fs@FreeBSD.ORG Thu Nov 20 06:05:18 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CDB8F16A4CE; Thu, 20 Nov 2003 06:05:18 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id A56BE43FD7; Thu, 20 Nov 2003 06:05:17 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (localhost [127.0.0.1]) by fledge.watson.org (8.12.9p2/8.12.9) with ESMTP id hAKE36Mg019542; Thu, 20 Nov 2003 09:03:06 -0500 (EST) (envelope-from robert@fledge.watson.org) Received: from localhost (robert@localhost)hAKE35E4019531; Thu, 20 Nov 2003 09:03:05 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Thu, 20 Nov 2003 09:03:05 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Terry Lambert In-Reply-To: <3FBC9D7A.5ECAE855@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: bp@freebsd.org cc: Alfred Perlstein cc: fs@freebsd.org cc: marcel@freebsd.org Subject: Re: open cookies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2003 14:05:18 -0000 On Thu, 20 Nov 2003, Terry Lambert wrote: > Alfred Perlstein wrote: > > * Robert Watson [031117 11:40] wrote: > > > I'm saying we can support most of the interesting things I know of that > > > need this state already using devfs clone support. I'm wondering if you > > > have in mind anything further that can't be accomplished with clone > > > support. I.e., something that requires session state for something other > > > than /dev entries? > > > > Not really. I didn't realize that the problem had been taken care of. 
> > It hasn't been taken care of until it's possible to run multiple > instances of VMWare on FreeBSD, which works just fine on Linux. Well, I don't know if the driver has been updated, but the infrastructure to let the driver do what it needs to do is present. If we still have a VMware2 port (what I have a license for), maybe I'll take a look at it this evening. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-fs@FreeBSD.ORG Thu Nov 20 12:42:02 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5622416A4CE; Thu, 20 Nov 2003 12:42:02 -0800 (PST) Received: from mx.nsu.ru (mx.nsu.ru [212.192.164.5]) by mx1.FreeBSD.org (Postfix) with ESMTP id B0B1743FE3; Thu, 20 Nov 2003 12:42:00 -0800 (PST) (envelope-from fjoe@iclub.nsu.ru) Received: from mail by mx.nsu.ru with drweb-scanned (Exim 3.35 #1 (Debian)) id 1AMwew-0004yk-00; Fri, 21 Nov 2003 03:48:18 +0600 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mx.nsu.ru with esmtp (Exim 3.35 #1 (Debian)) id 1AMweu-0004uA-00; Fri, 21 Nov 2003 03:48:16 +0600 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.8p2/8.12.8) with ESMTP id hAKKfoJQ060296; Fri, 21 Nov 2003 02:41:50 +0600 (NS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.8p2/8.12.8/Submit) id hAKKflHS060294; Fri, 21 Nov 2003 02:41:47 +0600 (NS) (envelope-from fjoe) Date: Fri, 21 Nov 2003 02:41:47 +0600 From: Max Khon To: Robert Watson Message-ID: <20031120204147.GB60068@iclub.nsu.ru> References: <3FBC9D7A.5ECAE855@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-Envelope-To: rwatson@freebsd.org, tlambert2@mindspring.com, bp@freebsd.org, bright@mu.org, fs@freebsd.org, marcel@freebsd.org cc: bp@freebsd.org 
cc: marcel@freebsd.org cc: Alfred Perlstein cc: fs@freebsd.org Subject: Re: open cookies X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Nov 2003 20:42:02 -0000 Hello! On Thu, Nov 20, 2003 at 09:03:05AM -0500, Robert Watson wrote: > > > > I'm saying we can support most of the interesting things I know of that > > > > need this state already using devfs clone support. I'm wondering if you > > > > have in mind anything further that can't be accomplished with clone > > > > support. I.e., something that requires session state for something other > > > > than /dev entries? > > > > > > Not really. I didn't realize that the problem had been taken care of. > > > > It hasn't been taken care of until it's possible to run multiple > > instances of VMWare on FreeBSD, which works just fine on Linux. > > Well, I don't know if the driver has been updated, but the infrastructure > to let the driver do what it needs to do is present. If we still have a > VMware2 port (what I have a license for), maybe I'll take a look at it > this evening. It is already possible to run multiple instances of VMWare 3 (which can be found in ports). 
/fjoe From owner-freebsd-fs@FreeBSD.ORG Thu Nov 20 16:53:25 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B48CC16A4CE for ; Thu, 20 Nov 2003 16:53:25 -0800 (PST) Received: from smtp01.syd.iprimus.net.au (smtp01.syd.iprimus.net.au [210.50.30.52]) by mx1.FreeBSD.org (Postfix) with ESMTP id C04F343FE1 for ; Thu, 20 Nov 2003 16:53:24 -0800 (PST) (envelope-from tim@robbins.dropbear.id.au) Received: from robbins.dropbear.id.au (210.50.217.136) by smtp01.syd.iprimus.net.au (7.0.020) id 3F8B009E00F3C372; Fri, 21 Nov 2003 11:53:23 +1100 Received: by robbins.dropbear.id.au (Postfix, from userid 1000) id 7E1B6611E; Fri, 21 Nov 2003 11:57:06 +1100 (EST) Date: Fri, 21 Nov 2003 11:57:06 +1100 From: Tim Robbins To: Alex Povolotsky Message-ID: <20031121005706.GA67377@wombat.robbins.dropbear.id.au> References: <20031120144143.6cf73e06.tarkhil@over.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20031120144143.6cf73e06.tarkhil@over.ru> User-Agent: Mutt/1.4.1i cc: fs@freebsd.org Subject: Re: "Reverse union" mount possible? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 00:53:25 -0000 On Thu, Nov 20, 2003 at 02:41:43PM +0300, Alex Povolotsky wrote: > Is it somehow possible, or how complex patch will require to get "reverse union" > mount, with first lookup in underlying system, and file creation there as well? I believe unionfs can do this (mount_unionfs -b option). mount's "union" option only has a small subset of unionfs's features. > Right now, I'm readonly mount_null'ing /bin, /sbin, /usr/bin, /usr/sbin, > /usr/lib, /usr/include, /usr/libexec, /usr/share. 
With a dozen jails, there > are too many mounts to my liking, and about twice a week I experience panic. > Probably it's nullfs-related. Nullfs is known to be buggy in -stable. In particular, it seems to deadlock under load / when vnodes start getting recycled. Tim From owner-freebsd-fs@FreeBSD.ORG Thu Nov 20 18:01:33 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 466D816A4CE; Thu, 20 Nov 2003 18:01:33 -0800 (PST) Received: from filer.fsl.cs.sunysb.edu (filer.fsl.cs.sunysb.edu [130.245.126.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 39BAB43FCB; Thu, 20 Nov 2003 18:01:32 -0800 (PST) (envelope-from ezk@fsl.cs.sunysb.edu) Received: from agora.fsl.cs.sunysb.edu (IDENT:GMArlqKIdnY2X11vJLvmobYr0TntQ3oP@agora.fsl.cs.sunysb.edu [130.245.126.12])hAL210A1022051; Thu, 20 Nov 2003 21:01:00 -0500 Received: from agora.fsl.cs.sunysb.edu (IDENT:nk2IqdIH1aXza6IBqlj0leFPcksrZfc4@localhost.localdomain [127.0.0.1]) hAL21Vg9015272; Thu, 20 Nov 2003 21:01:31 -0500 Received: (from ezk@localhost) by agora.fsl.cs.sunysb.edu (8.12.8/8.12.8/Submit) id hAL21VDZ015268; Thu, 20 Nov 2003 21:01:31 -0500 Date: Thu, 20 Nov 2003 21:01:31 -0500 Message-Id: <200311210201.hAL21VDZ015268@agora.fsl.cs.sunysb.edu> From: Erez Zadok To: Tim Robbins In-reply-to: Your message of "Fri, 21 Nov 2003 11:57:06 +1100." <20031121005706.GA67377@wombat.robbins.dropbear.id.au> X-MailKey: Erez_Zadok cc: fs@freebsd.org Subject: Re: "Reverse union" mount possible? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 02:01:33 -0000 In message <20031121005706.GA67377@wombat.robbins.dropbear.id.au>, Tim Robbins writes: > On Thu, Nov 20, 2003 at 02:41:43PM +0300, Alex Povolotsky wrote: [...] 
> > Right now, I'm readonly mount_null'ing /bin, /sbin, /usr/bin, /usr/sbin, > > /usr/lib, /usr/include, /usr/libexec, /usr/share. With a dozen jails, there > > are too many mounts to my liking, and about twice a week I experience panic. > > Probably it's nullfs-related. > > Nullfs is known to be buggy in -stable. In particular, it seems to deadlock > under load / when vnodes start getting recycled. My fist stackable templates were ported to fbsd 4.x and 5.0 not too long ago. We ran extensive tests to ensure that the code is stable. While it's possible we missed stuff, it might help if someone checked what is different about my "wrapfs" vs. Nullfs. We may have fixed bugs in wrapfs not realizing that they originally came from the base Nullfs we started with. Cheers, Erez. From owner-freebsd-fs@FreeBSD.ORG Fri Nov 21 01:45:53 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2BFD716A4CE for ; Fri, 21 Nov 2003 01:45:53 -0800 (PST) Received: from techno.sub.ru (webmail.sub.ru [213.247.139.22]) by mx1.FreeBSD.org (Postfix) with SMTP id 3826F43FBD for ; Fri, 21 Nov 2003 01:45:51 -0800 (PST) (envelope-from tarkhil@over.ru) Received: (qmail 68224 invoked by uid 0); 21 Nov 2003 09:45:52 -0000 Received: from unknown (HELO tarkhil.over.ru) (213.148.23.65) by webmail.sub.ru with SMTP; 21 Nov 2003 09:45:52 -0000 Date: Fri, 21 Nov 2003 12:45:49 +0300 From: Alex Povolotsky To: Erez Zadok Message-Id: <20031121124549.44895c7c.tarkhil@over.ru> In-Reply-To: <200311210201.hAL21VDZ015268@agora.fsl.cs.sunysb.edu> References: <20031121005706.GA67377@wombat.robbins.dropbear.id.au> <200311210201.hAL21VDZ015268@agora.fsl.cs.sunysb.edu> X-Mailer: Sylpheed version 0.9.6claws (GTK+ 1.2.10; i386-portbld-freebsd4.6.2) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit cc: fs@freebsd.org Subject: Re: "Reverse union" mount possible? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 09:45:53 -0000 On Thu, 20 Nov 2003 21:01:31 -0500 Erez Zadok wrote: EZ> > > Right now, I'm readonly mount_null'ing /bin, /sbin, /usr/bin, EZ> > > /usr/sbin,/usr/lib, /usr/include, /usr/libexec, /usr/share. With EZ> > > a dozen jails, there are too many mounts to my liking, and about EZ> > > twice a week I experience panic. Probably it's nullfs-related. EZ> > EZ> > Nullfs is known to be buggy in -stable. In particular, it seems to EZ> > deadlock under load / when vnodes start getting recycled. EZ> EZ> My fist stackable templates were ported to fbsd 4.x and 5.0 not too EZ> long ago. We ran extensive tests to ensure that the code is stable. EZ> While it's EZ> possible we missed stuff, it might help if someone checked what is EZ> different about my "wrapfs" vs. Nullfs. We may have fixed bugs in EZ> wrapfs not realizing that they originally came from the base Nullfs EZ> we started with. Sounds cool. Where can I get the source? BTW, I've heard that nullfs/unionfs doesn't allow code sharing. Does wrapfs do it? -- Alex. 
From owner-freebsd-fs@FreeBSD.ORG Fri Nov 21 07:59:33 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 345F016A4CE for ; Fri, 21 Nov 2003 07:59:33 -0800 (PST) Received: from filer.fsl.cs.sunysb.edu (filer.fsl.cs.sunysb.edu [130.245.126.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 009C943FBD for ; Fri, 21 Nov 2003 07:59:32 -0800 (PST) (envelope-from ezk@fsl.cs.sunysb.edu) Received: from agora.fsl.cs.sunysb.edu (IDENT:Aj3oZg3YmzlmPav8B6zExme5/eT1l9gJ@agora.fsl.cs.sunysb.edu [130.245.126.12])hALFwrWh012766; Fri, 21 Nov 2003 10:58:53 -0500 Received: from agora.fsl.cs.sunysb.edu (IDENT:DV5MFM96mFeBs8rMCj5r6IuXBP+S1ac3@localhost.localdomain [127.0.0.1]) hALFxPg9015236; Fri, 21 Nov 2003 10:59:25 -0500 Received: (from ezk@localhost) by agora.fsl.cs.sunysb.edu (8.12.8/8.12.8/Submit) id hALFxOLr015232; Fri, 21 Nov 2003 10:59:24 -0500 Date: Fri, 21 Nov 2003 10:59:24 -0500 Message-Id: <200311211559.hALFxOLr015232@agora.fsl.cs.sunysb.edu> From: Erez Zadok To: Alex Povolotsky In-reply-to: Your message of "Fri, 21 Nov 2003 12:45:49 +0300." <20031121124549.44895c7c.tarkhil@over.ru> X-MailKey: Erez_Zadok cc: Erez Zadok cc: fs@freebsd.org Subject: Re: "Reverse union" mount possible? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 15:59:33 -0000 In message <20031121124549.44895c7c.tarkhil@over.ru>, Alex Povolotsky writes: > On Thu, 20 Nov 2003 21:01:31 -0500 > Erez Zadok wrote: > > EZ> > > Right now, I'm readonly mount_null'ing /bin, /sbin, /usr/bin, > EZ> > > /usr/sbin,/usr/lib, /usr/include, /usr/libexec, /usr/share. With > EZ> > > a dozen jails, there are too many mounts to my liking, and about > EZ> > > twice a week I experience panic. Probably it's nullfs-related. 
> EZ> > > EZ> > Nullfs is known to be buggy in -stable. In particular, it seems to > EZ> > deadlock under load / when vnodes start getting recycled. > EZ> > EZ> My fist stackable templates were ported to fbsd 4.x and 5.0 not too > EZ> long ago. We ran extensive tests to ensure that the code is stable. > EZ> While it's > EZ> possible we missed stuff, it might help if someone checked what is > EZ> different about my "wrapfs" vs. Nullfs. We may have fixed bugs in > EZ> wrapfs not realizing that they originally came from the base Nullfs > EZ> we started with. > > Sounds cool. Where can I get the source? FiST home page: http://www1.cs.columbia.edu/~ezk/research/fist/ > BTW, I've heard that nullfs/unionfs doesn't allow code sharing. Does wrapfs do it? What do you mean by "code sharing"? Licensing? All of the freebsd fist templates use the BSD license. > Alex. Erez. From owner-freebsd-fs@FreeBSD.ORG Fri Nov 21 10:44:09 2003 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5A6EC16A4CF for ; Fri, 21 Nov 2003 10:44:09 -0800 (PST) Received: from comp.chem.msu.su (comp-ext.chem.msu.su [158.250.32.157]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8E27343FB1 for ; Fri, 21 Nov 2003 10:44:05 -0800 (PST) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.12.3p3/8.12.3) with ESMTP id hALIi1dK040855; Fri, 21 Nov 2003 21:44:01 +0300 (MSK) (envelope-from yar@comp.chem.msu.su) Received: (from yar@localhost) by comp.chem.msu.su (8.12.3p3/8.12.3/Submit) id hALIhsED040849; Fri, 21 Nov 2003 21:43:54 +0300 (MSK) (envelope-from yar) Date: Fri, 21 Nov 2003 21:43:53 +0300 From: Yar Tikhiy To: "Matthew N. 
Dodd" Message-ID: <20031121184353.GC37908@comp.chem.msu.su> References: <20030322151656.GA34184@comp.chem.msu.su> <20030322211504.X8716@sasami.jurai.net> <20030324155445.GA67925@comp.chem.msu.su> <20030324225320.B96310@iclub.nsu.ru> <20030324181446.GB74771@comp.chem.msu.su> <20030324131809.E39864@sasami.jurai.net> <20030325125620.GD27905@comp.chem.msu.su> <20030927232851.S35442@sasami.jurai.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030927232851.S35442@sasami.jurai.net> User-Agent: Mutt/1.5.3i cc: fs@FreeBSD.ORG Subject: Re: HFS/HFS Plus driver and tools for 5.x are available X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Nov 2003 18:44:09 -0000 On Sat, Sep 27, 2003 at 11:29:14PM -0400, Matthew N. Dodd wrote: > On Tue, 25 Mar 2003, Yar Tikhiy wrote: > > Since introducing a new file system won't disrupt any existing > > functionality, this code will be committed to -CURRENT as soon as we > > reach an agreement with Apple on the licensing issue. Although APSL > > allows us to use this code today, it would be better if Apple moved it > > to the BSD license; and there is hope it's possible. > > Any status update on this matter? > > Thanks. I apologize for my staying silent so long. Frankly, I had nothing to answer till today. Robert Watson, who undertook the mission of talking to Apple on the issue, has just got Apple's resolution. Apple is against relicensing HFS+ under a BSD license since HFS+ contains some important intellectual property. OTOH, Apple thinks that the move to APSL2 will help open source projects with the legal issues involved. Robert Watson also added that now core@ would investigate possible issues regarding APSL2 before reaching a final decision on how to treat code covered by that license within our source tree. 
Therefore the issue of integrating the HFS code into FreeBSD has been clarified at least partially: Alas, the HFS code will keep its status of contributed software. -- Yar