From owner-freebsd-fs@FreeBSD.ORG Tue Oct 28 18:38:03 2003
From: Ken Marx <kmarx@vicor.com>
Date: Tue, 28 Oct 2003 18:32:59 -0800
To: Kirk McKusick
cc: freebsd-fs@freebsd.org, julian@elischer.org
Subject: Re: 4.8 ffs_dirpref problem

Kirk McKusick wrote:
>> Date: Thu, 23 Oct 2003 17:58:54 -0700
>> From: Ken Marx
>> To: Kirk McKusick
>> Subject: Re: 4.8 ffs_dirpref problem
>>
>> Hi Kirk,
>>
>> I had a few minutes before heading out, so I tried getting a list
>> of block numbers in the bufferhash bucket that seemed to have lots
>> of hits.  The depth changes, of course, but I caught it at one
>> point at a depth of 600 or so:
>>
>> /kernel: dumpbh( 250 )
>> /kernel: bp[1]: b_vp=0xcfa3d480, b_lblkno=52561, b_flags=0x20100020
>> /kernel: bp[2]: b_vp=0xcf3c5d00, b_lblkno=345047104, b_flags=0x200000a0
>> ...
>>
>> For no good reason, I sorted by block number and looked at the
>> differences between block number values.  It varies a bit, but of
>> 522 block numbers, 494 of them have a difference of 65536.
>>
>> Er, some duplicates also show up, but the b_flags values differ.
>>
>> I'm not cc'ing fs@freebsd on this just in case it's being seen as
>> getting out of control.  Feel free to fold them back in.
>>
>> Thanks again,
>> k.
>
> It does look like the hash function is having some trouble.  It has
> been completely revamped in 5.0, but it is still using a
> "power-of-2" hashing scheme in 4.X.  I highly recommend trying a
> scheme with a non-power-of-2 base.  Perhaps something as simple as
> changing the hashing to use modulo rather than logical & (e.g., in
> bufhash, change from & bufhashmask to % bufhashmask).
>
> 	Kirk McKusick

Hi,

Hope this isn't seen as spamming the list, but this should be the
last of it, I hope.  I'll summarize the findings briefly.  More
details at:

	http://www.bigshed.com/kernel/raid_full_problem

and/or you can find our patches for what we finally did at:

	http://www.bigshed.com/kernel/ffs_vfsbio.diff

We did re-newfs our raid as Kirk suggested.  Stupidly, our data file
and some test results were lost in the process (doh!), so we had to
use a slightly different data file for re-testing -- still 1.5 GB of
mixed file and directory sizes.

Anyway, it would appear that the new fs settings (average file
size = 48k, average files per dir = 1500) help some, but performance
still suffers as the disk fills.

We have a sample 'fix' for the hash table in vfs_bio.c that uses all
the blkno bits.  It's in the diff link above; use it as you see fit.
However, it too doesn't really address our symptoms significantly.
Darn.  We still bog down to 1 MB/sec and > 90% system time.

The only thing that really addressed our problem was going back to
the 4.4 dirpref logic.  We added a sysctl OID to support this on a
system-wide basis.  That's also in the diff patch.  It would be nice
if we could do this on a per-fs basis via fs.h's fs_flags or some
such, but perhaps this is too messy for future support.  We can live
with system-wide 4.4 semantics if necessary, as Doug White mentioned.

If any of this does get addressed in the 4.8 code, please let us
(er, julian@vicor.com) know so we can clean up our kernel tree.

Of course, any comments, suggestions, or flames are totally welcome.
Thanks again for everyone's patience and assistance.

regards,
k
-- 
Ken Marx, kmarx@vicor-nb.com
Ramp up the solution space!!
		- http://www.bigshed.com/cgi-bin/speak.cgi

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 29 01:01:06 2003
From: Don Lewis <truckman@FreeBSD.org>
Date: Wed, 29 Oct 2003 00:59:32 -0800 (PST)
To: kmarx@vicor.com
cc: freebsd-fs@FreeBSD.org, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

On 28 Oct, Ken Marx wrote:
> Kirk McKusick wrote:
>> It does look like the hash function is having some trouble.
>> [snip]
>
> We have a sample 'fix' for the hash table in vfs_bio.c that uses
> all the blkno bits.  It's in the diff link above; use it as you see
> fit.  However, it too doesn't really address our symptoms
> significantly.  Darn.  We still bog down to 1 MB/sec and > 90%
> system time.

A Fibonacci hash, like the one I implemented in kern/kern_mtxpool.c
rev 1.8, might be a good choice here, since it tends to distribute
the keys fairly uniformly.  I think this is a secondary issue, though.
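[For reference, a minimal sketch of the multiplicative (Fibonacci)
hashing idea Don mentions -- not the actual kern/kern_mtxpool.c 1.8
code; the table size and names are illustrative.  The multiplier
2654435769 is floor(2^32 / phi), Knuth's suggested constant: the
multiplication mixes all of the key's bits, and the high bits of the
product select the bucket, so keys sharing a common stride still
spread out.]

	#include <stdint.h>

	#define HASH_TABLE_BITS	8	/* assumed: 2^8 buckets */

	static uint32_t
	fib_hash(uint32_t key)
	{
		/* Keep the high bits of the 32-bit product. */
		return ((key * 2654435769u) >> (32 - HASH_TABLE_BITS));
	}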
I think the real problem is the following code in ffs_dirpref():

	avgifree = fs->fs_cstotal.cs_nifree / fs->fs_ncg;
	avgbfree = fs->fs_cstotal.cs_nbfree / fs->fs_ncg;
	avgndir = fs->fs_cstotal.cs_ndir / fs->fs_ncg;
	[snip]
	maxndir = min(avgndir + fs->fs_ipg / 16, fs->fs_ipg);
	minifree = avgifree - fs->fs_ipg / 4;
	if (minifree < 0)
		minifree = 0;
	minbfree = avgbfree - fs->fs_fpg / fs->fs_frag / 4;
	if (minbfree < 0)
		minbfree = 0;
	[snip]
	prefcg = ino_to_cg(fs, pip->i_number);
	for (cg = prefcg; cg < fs->fs_ncg; cg++)
		if (fs->fs_cs(fs, cg).cs_ndir < maxndir &&
		    fs->fs_cs(fs, cg).cs_nifree >= minifree &&
		    fs->fs_cs(fs, cg).cs_nbfree >= minbfree) {
			if (fs->fs_contigdirs[cg] < maxcontigdirs)
				return ((ino_t)(fs->fs_ipg * cg));
		}
	for (cg = 0; cg < prefcg; cg++)
		if (fs->fs_cs(fs, cg).cs_ndir < maxndir &&
		    fs->fs_cs(fs, cg).cs_nifree >= minifree &&
		    fs->fs_cs(fs, cg).cs_nbfree >= minbfree) {
			if (fs->fs_contigdirs[cg] < maxcontigdirs)
				return ((ino_t)(fs->fs_ipg * cg));
		}

If the file system is more than 75% full, minbfree will be zero,
which will allow new directories to be created in cylinder groups
that have no free blocks for either the directory itself or for any
files created in that directory.  If this happens, allocating the
blocks for the directory and its files will require ffs_alloc() to
do an expensive search across the cylinder groups for each block.
It looks to me like minbfree needs to equal, or at least be a lot
closer to, avgbfree.

A similar situation exists with minifree.  Please note that the
fallback algorithm uses the condition:
	fs->fs_cs(fs, cg).cs_nifree >= avgifree

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 29 17:25:57 2003
From: Ken Marx <kmarx@vicor.com>
Date: Wed, 29 Oct 2003 17:20:50 -0800
To: Don Lewis
cc: freebsd-fs@FreeBSD.org, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

Don Lewis wrote:
> [snip]
> A Fibonacci hash, like the one I implemented in kern/kern_mtxpool.c
> rev 1.8, might be a good choice here, since it tends to distribute
> the keys fairly uniformly.  I think this is a secondary issue,
> though.
>
> I think the real problem is the following code in ffs_dirpref():
> [snip]
> If the file system is more than 75% full, minbfree will be zero,
> which will allow new directories to be created in cylinder groups
> that have no free blocks for either the directory itself or for any
> files created in that directory.  If this happens, allocating the
> blocks for the directory and its files will require ffs_alloc() to
> do an expensive search across the cylinder groups for each block.
> It looks to me like minbfree needs to equal, or at least be a lot
> closer to, avgbfree.
>
> A similar situation exists with minifree.

Interesting.  We (Vicor) will defer to the experts here, but are very
willing to test anything you come up with.

thanks,
k
-- 
Ken Marx, kmarx@vicor-nb.com
I insist that we do the right thing and be accountable for the
realistic goals.
		- http://www.bigshed.com/cgi-bin/speak.cgi

From owner-freebsd-fs@FreeBSD.ORG Wed Oct 29 22:42:21 2003
From: Don Lewis <truckman@FreeBSD.org>
Date: Wed, 29 Oct 2003 22:41:32 -0800 (PST)
To: kmarx@vicor.com
cc: freebsd-fs@FreeBSD.org, gluk@ptci.ru, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

On 29 Oct, Ken Marx wrote:
> Don Lewis wrote:
>> I think the real problem is the following code in ffs_dirpref():
>> [snip]
>> If the file system is more than 75% full, minbfree will be zero,
>> which will allow new directories to be created in cylinder groups
>> that have no free blocks for either the directory itself or for
>> any files created in that directory.  If this happens, allocating
>> the blocks for the directory and its files will require
>> ffs_alloc() to do an expensive search across the cylinder groups
>> for each block.  It looks to me like minbfree needs to equal, or
>> at least be a lot closer to, avgbfree.

Actually, I think the expensive search will only happen for the
first block in each file (and the other blocks will be allocated in
the same cylinder group), but if you are creating tons of files that
are only one block long ...

>> A similar situation exists with minifree.  Please note that the
>> fallback algorithm uses the condition:
>> 	fs->fs_cs(fs, cg).cs_nifree >= avgifree
>
> Interesting.  We (Vicor) will defer to the experts here, but are
> very willing to test anything you come up with.

You might try the lightly tested patch below.
It tweaks the dirpref algorithm so that cylinder groups with free
space >= 75% of the average free space and free inodes >= 75% of the
average number of free inodes are candidates for allocating the
directory.  It will not choose a cylinder group that does not have
at least one free block and one free inode.

It also decreases maxcontigdirs as the free space decreases, so that
a cluster of directories is less likely to cause the cylinder group
to overflow.  I think it would be better to tune maxcontigdirs
individually for each cylinder group, based on the free space in
that cylinder group, but that is more complex ...

Index: sys/ufs/ffs/ffs_alloc.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_alloc.c,v
retrieving revision 1.64.2.2
diff -u -r1.64.2.2 ffs_alloc.c
--- sys/ufs/ffs/ffs_alloc.c	21 Sep 2001 19:15:21 -0000	1.64.2.2
+++ sys/ufs/ffs/ffs_alloc.c	30 Oct 2003 06:01:38 -0000
@@ -696,18 +696,18 @@
 	 * optimal allocation of a directory inode.
 	 */
 	maxndir = min(avgndir + fs->fs_ipg / 16, fs->fs_ipg);
-	minifree = avgifree - fs->fs_ipg / 4;
-	if (minifree < 0)
-		minifree = 0;
-	minbfree = avgbfree - fs->fs_fpg / fs->fs_frag / 4;
-	if (minbfree < 0)
-		minbfree = 0;
+	minifree = avgifree - avgifree / 4;
+	if (minifree < 1)
+		minifree = 1;
+	minbfree = avgbfree - avgbfree / 4;
+	if (minbfree < 1)
+		minbfree = 1;
 	cgsize = fs->fs_fsize * fs->fs_fpg;
 	dirsize = fs->fs_avgfilesize * fs->fs_avgfpdir;
 	curdirsize = avgndir ? (cgsize - avgbfree * fs->fs_bsize) / avgndir : 0;
 	if (dirsize < curdirsize)
 		dirsize = curdirsize;
-	maxcontigdirs = min(cgsize / dirsize, 255);
+	maxcontigdirs = min((avgbfree * fs->fs_bsize) / dirsize, 255);
 	if (fs->fs_avgfpdir > 0)
 		maxcontigdirs = min(maxcontigdirs,
 		    fs->fs_ipg / fs->fs_avgfpdir);

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 00:49:13 2003
From: "Markus F. Meisinger" <markus@mk-meisinger.at>
Date: Thu, 30 Oct 2003 09:51:21 +0100
To: freebsd-fs@freebsd.org
Subject: mount_smbfs and MS Windows Domain Controllers

Dear list,

If I try to mount a share on a Microsoft Windows 2003 Domain
Controller, I get "Permission denied".  The password is correct,
because if I use a wrong password I get "Authentication error".
Mounting shares from other Windows workstations works fine.
For example:

	mount_smbfs -I 192.168.0.2 -W MK-MEISINGER //markus@frosch/library /mnt

	192.168.0.2	IP of the W2K3 domain controller
	MK-MEISINGER	name of the domain
	markus		user in the domain
	frosch		W2K3 domain controller

I think I have to reconfigure the domain controller, but how?

Thanks for your help,
Markus

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 03:55:01 2003
From: Terry Lambert <tlambert2@mindspring.com>
Date: Thu, 30 Oct 2003 03:53:59 -0800
To: "Markus F. Meisinger"
cc: freebsd-fs@freebsd.org
Subject: Re: mount_smbfs and MS Windows Domain Controllers

"Markus F. Meisinger" wrote:
> If I try to mount a share on a Microsoft Windows 2003 Domain
> Controller, I get "Permission denied".  The password is correct,
> because if I use a wrong password I get "Authentication error".
> Mounting shares from other Windows workstations works fine.

You need to add your machine to the domain.  To do this, you will
need the admin password for the domain controller.  See the SAMBA
FAQ.

-- Terry

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 06:37:49 2003
From: "Dave Hart" <davehart@davehart.com>
Date: Thu, 30 Oct 2003 14:37:48 -0000
To: "Terry Lambert", "Markus F. Meisinger"
cc: freebsd-fs@freebsd.org
Subject: RE: mount_smbfs and MS Windows Domain Controllers

I don't think domain membership would interfere with authentication
in this case.  While I didn't find any easy instructions on how to
change the setting, Windows Server 2003 DCs require SMB signing by
default.  Probably the version of SAMBA involved cannot sign its SMB
operations.

Dave Hart

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 10:29:56 2003
From: "Markus F. Meisinger" <markus@mk-meisinger.at>
Date: Thu, 30 Oct 2003 19:32:01 +0100
To: freebsd-fs@freebsd.org
Subject: Re: mount_smbfs and MS Windows Domain Controllers

Thank you for your answers.  Adding the FreeBSD machine to the domain
was not necessary; I had to turn off digital signing of the SMB
packets on the domain controller, and now it works.

I use a standard FreeBSD 5.0 installation.  What version of
mount_smbfs can sign its packets?

Markus

Dave Hart wrote:
> I don't think domain membership would interfere with authentication
> in this case.  While I didn't find any easy instructions on how to
> change the setting, Windows Server 2003 DCs require SMB signing by
> default.  Probably the version of SAMBA involved cannot sign its
> SMB operations.
From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 11:12:39 2003
From: Ken Marx <kmarx@vicor.com>
Date: Thu, 30 Oct 2003 11:07:20 -0800
To: Don Lewis
cc: freebsd-fs@FreeBSD.org, gluk@ptci.ru, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

Don Lewis wrote:
> [snip]
> Actually, I think the expensive search will only happen for the
> first block in each file (and the other blocks will be allocated in
> the same cylinder group), but if you are creating tons of files
> that are only one block long ...
> [snip]
> You might try the lightly tested patch below.  It tweaks the
> dirpref algorithm so that cylinder groups with free space >= 75% of
> the average free space and free inodes >= 75% of the average number
> of free inodes are candidates for allocating the directory.  It
> will not choose a cylinder group that does not have at least one
> free block and one free inode.
>
> It also decreases maxcontigdirs as the free space decreases, so
> that a cluster of directories is less likely to cause the cylinder
> group to overflow.  I think it would be better to tune
> maxcontigdirs individually for each cylinder group, based on the
> free space in that cylinder group, but that is more complex ...
> [patch snipped]

Thanks Don,

re:
> cylinder group), but if you are creating tons of files that are
> only one block long ...

Not terribly scientific, but when our test bogs down, it's often in
a directory with 6400 one-block files.  So your comment seems
plausible.

Anyway, I just tested your patch -- again on an unloaded system,
repeatedly untarring a 1.5 GB file, starting at 97% capacity, with:

	tunefs: average file size: (-f) 49152
	tunefs: average number of files in a directory: (-s) 1500

It takes about 74 system seconds per 1.5 GB untar:
-------------------------------------------
/dev/da0s1e 558889580 497843972 16334442  97% 6858407 63316311 10% /raid
      119.23 real      1.28 user     73.09 sys
/dev/da0s1e 558889580 499371100 14807314  97% 6879445 63295273 10% /raid
      111.69 real      1.32 user     73.65 sys
/dev/da0s1e 558889580 500898228 13280186  97% 6900483 63274235 10% /raid
      116.67 real      1.44 user     74.19 sys
/dev/da0s1e 558889580 502425356 11753058  98% 6921521 63253197 10% /raid
      114.73 real      1.25 user     75.01 sys
/dev/da0s1e 558889580 503952484 10225930  98% 6942559 63232159 10% /raid
      116.95 real      1.30 user     74.10 sys
/dev/da0s1e 558889580 505479614  8698800  98% 6963597 63211121 10% /raid
      115.29 real      1.39 user     74.25 sys
/dev/da0s1e 558889580 507006742  7171672  99% 6984635 63190083 10% /raid
      114.01 real      1.16 user     74.04 sys
/dev/da0s1e 558889580 508533870  5644544  99% 7005673 63169045 10% /raid
      119.95 real      1.32 user     75.05 sys
/dev/da0s1e 558889580 510060998  4117416  99% 7026711 63148007 10% /raid
      114.89 real      1.33 user     74.66 sys
/dev/da0s1e 558889580 511588126  2590288  99% 7047749 63126969 10% /raid
      114.91 real      1.58 user     74.64 sys
/dev/da0s1e 558889580 513115254  1063160 100% 7068787 63105931 10% /raid
tot:         1161.06 real     13.45 user    742.89 sys

Compares pretty favorably to our naive, retro 4.4 dirpref hack,
which averages in the mid-to-high 60s:
--------------------------------------------------------------------
/dev/da0s1e 558889580 497843952 16334462  97% 6858406 63316312 10% /raid
      110.19 real      1.42 user     65.54 sys
/dev/da0s1e 558889580 499371080 14807334  97% 6879444 63295274 10% /raid
      105.47 real      1.47 user     65.09 sys
/dev/da0s1e 558889580 500898208 13280206  97% 6900482 63274236 10% /raid
      110.17 real      1.48 user     64.98 sys
/dev/da0s1e 558889580 502425336 11753078  98% 6921520 63253198 10% /raid
      131.88 real      1.49 user     71.20 sys
/dev/da0s1e 558889580 503952464 10225950  98% 6942558 63232160 10% /raid
      111.61 real      1.62 user     67.47 sys
/dev/da0s1e 558889580 505479594  8698820  98% 6963596 63211122 10% /raid
      131.36 real      1.67 user     90.79 sys
/dev/da0s1e 558889580 507006722  7171692  99% 6984634 63190084 10% /raid
      115.34 real      1.49 user     65.61 sys
/dev/da0s1e 558889580 508533850  5644564  99% 7005672 63169046 10% /raid
      110.26 real      1.39 user     65.26 sys
/dev/da0s1e 558889580 510060978  4117436  99% 7026710 63148008 10% /raid
      116.15 real      1.51 user     65.47 sys
/dev/da0s1e 558889580 511588106  2590308  99% 7047748 63126970 10% /raid
      112.74 real      1.37 user     65.01 sys
/dev/da0s1e 558889580 513115234  1063180 100% 7068786 63105932 10% /raid
             1158.36 real     15.01 user    686.57 sys

Without either, we'd expect timings of 5-20 minutes when things are
going poorly.

Happy to test further if you have tweaks to your patch or things
you'd like us to test in particular -- e.g., load, newfs, etc.

k.
-- 
Ken Marx, kmarx@vicor-nb.com
As a company we must not put the cart before the horse and set up
weekly meetings on the solution space.
		- http://www.bigshed.com/cgi-bin/speak.cgi

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 11:30:00 2003
From: Don Lewis <truckman@FreeBSD.org>
Date: Thu, 30 Oct 2003 11:28:53 -0800 (PST)
To: kmarx@vicor.com
cc: freebsd-fs@FreeBSD.org, gluk@ptci.ru, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

On 30 Oct, Ken Marx wrote:
> [snip]
> Happy to test further if you have tweaks to your patch or things
> you'd like us to test in particular -- e.g., load, newfs, etc.

You might want to try your hash patch along with my patch to see if
decreasing the maximum hash chain lengths makes a difference in
system time.
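[The vfs_bio.c hash patch itself is only available via the URL posted
earlier in the thread; as a rough sketch of the idea it describes
(illustrative names and table size, not the real 4.x BUFHASH() code),
folding the high bits of the block number into the bucket index keeps
keys that share their low bits -- such as the 65536 stride seen in
the long chains -- from all landing in one bucket.]

	#include <stdint.h>

	#define BUFHASH_MASK	1023	/* assumed: 1024 buckets, 2^n - 1 */

	/* vp_token stands in for the vnode pointer's integer value. */
	static unsigned int
	bufhash_index(uintptr_t vp_token, uint32_t blkno)
	{
		unsigned int h;

		h = (unsigned int)(vp_token >> 7) + blkno;
		h ^= h >> 16;		/* fold the high blkno bits down */
		return (h & BUFHASH_MASK);
	}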
From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 12:33:19 2003
From: Ken Marx <kmarx@vicor.com>
Date: Thu, 30 Oct 2003 12:28:10 -0800
To: Don Lewis
cc: freebsd-fs@FreeBSD.org, gluk@ptci.ru, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

Don Lewis wrote:
> [snip]
> You might want to try your hash patch along with my patch to see if
> decreasing the maximum hash chain lengths makes a difference in
> system time.

Sorry, I should have mentioned: both tests included our hash patch.

I just re-ran with the hash code reverted to the original.  Sorry to
say there's no appreciable difference.  Using your dirpref patch,
it's still approximately 75 system seconds per 1.5 GB:

	tot:  1185.54 real  14.15 user  747.43 sys

Perhaps the dirpref patch lowers the frequency of having to search
so much, and hence exercises the hash table less.  Or I'm doing
something lame.

k
-- 
Ken Marx, kmarx@vicor-nb.com
We have to move.
We must not stand pat and achieve closure on customer's customer.
		- http://www.bigshed.com/cgi-bin/speak.cgi

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 14:51:47 2003
From: Kirk McKusick <mckusick@beastie.mckusick.com>
Date: Thu, 30 Oct 2003 14:37:16 -0800
To: Don Lewis
cc: Ken Marx, freebsd-fs@FreeBSD.org, gluk@ptci.ru, julian@elischer.org
Subject: Re: 4.8 ffs_dirpref problem

Don, your change appears to be quite helpful and is performance
neutral on my benchmarks.  So, I suggest that you check it into
-current.  It is also small enough that it would be a good candidate
to MFC to -stable.

	Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 14:52:20 2003
From: Kirk McKusick <mckusick@beastie.mckusick.com>
Date: Thu, 30 Oct 2003 14:35:41 -0800
To: Ken Marx
cc: freebsd-fs@FreeBSD.org, Don Lewis, gluk@ptci.ru, julian@elischer.org
Subject: Re: 4.8 ffs_dirpref problem

I know it takes a lot of time, but I would like to hear the results
when you do the initial loading of the filesystem using Don's code,
as that may well affect the set of choices that it has.

	Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG Thu Oct 30 19:00:47 2003
From: Don Lewis <truckman@FreeBSD.org>
Date: Thu, 30 Oct 2003 18:59:56 -0800 (PST)
To: kmarx@vicor.com
cc: freebsd-fs@FreeBSD.org, gluk@ptci.ru, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

On 30 Oct, Ken Marx wrote:
> [snip]
> Perhaps the dirpref patch lowers the frequency of having to search
> so much, and hence exercises the hash table less.

I'm pretty sure that is the situation, but I wasn't sure if there
was still enough hash table usage to make balancing it worthwhile.
It sounds like there isn't.

It looks like there is still about a 10% increase in system time for
my modifications to dirpref versus reverting to the old algorithm.
It would be nice to know where the extra time is being spent, but
that would probably require some kernel profiling.

It might also be interesting to play with the cylinder group
selection criteria.  Should minbfree be 75% of avgbfree, 100% of
avgbfree, or 125% of avgbfree?  Probably not anything over 100%,
since the algorithm would fail if all the cylinder groups were
evenly filled ...
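[To make the 75%-full observation concrete, a small worked example
under assumed cylinder-group parameters -- representative numbers,
not taken from Ken's raid -- shows how the old minbfree formula
degenerates to zero while the patched one keeps a real floor.]

	#include <stdio.h>

	int
	main(void)
	{
		/* Assumed values, not from the thread. */
		int fs_fpg = 65536;	/* fragments per cylinder group */
		int fs_frag = 8;	/* fragments per block: 8192 blocks/cg */
		int avgbfree = 819;	/* ~90% full: ~10% of 8192 blocks free */
		int old_min, new_min;

		old_min = avgbfree - fs_fpg / fs_frag / 4;	/* 819 - 2048 */
		if (old_min < 0)
			old_min = 0;	/* old code: every group passes */

		new_min = avgbfree - avgbfree / 4;		/* 819 - 204 */
		if (new_min < 1)
			new_min = 1;	/* patched: >= 1 free block required */

		printf("old minbfree = %d, new minbfree = %d\n",
		    old_min, new_min);	/* prints 0 and 615 */
		return (0);
	}

[Whenever a group is less than 25% free, i.e. the file system is more
than 75% full, the old subtraction goes negative and is clamped to 0,
matching Don's observation; the patched formula always demands at
least 75% of the average free space.]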
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 31 01:00:53 2003
From: Terry Lambert <tlambert2@mindspring.com>
Date: Fri, 31 Oct 2003 00:59:32 -0800
To: Dave Hart
cc: freebsd-fs@freebsd.org
Subject: Re: mount_smbfs and MS Windows Domain Controllers

Dave Hart wrote:
> I don't think domain membership would interfere with authentication
> in this case.  While I didn't find any easy instructions on how to
> change the setting, Windows Server 2003 DCs require SMB signing by
> default.  Probably the version of SAMBA involved cannot sign its
> SMB operations.

I'm pretty sure the Darwin 7.0 SMBFS code can do this; it's derived
from the FreeBSD code, it's available for download from the Apple
site for back-porting the changes to FreeBSD, and I'm pretty sure
the license was left exactly as it was when the code was imported
from FreeBSD.

-- Terry

From owner-freebsd-fs@FreeBSD.ORG Fri Oct 31 08:31:57 2003
From: Ken Marx <kmarx@vicor-nb.com>
Date: Fri, 31 Oct 2003 08:26:47 -0800
To: Don Lewis
cc: kmarx@vicor.com, freebsd-fs@FreeBSD.org, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

Kirk McKusick wrote:
> I know it takes a lot of time, but I would like to hear the results
> when you do the initial loading of the filesystem using Don's code,
> as that may well affect the set of choices that it has.

Ok.  I've done this using a 48k avgfilesize and 1500 filesperdir.
I left our hashtable patch in.

I can give details, but Don's code seems to average a healthy 64-65
system seconds per 1.5 GB untar, i.e., basically equivalent to the
4.4 code.  But after 90% full or so it starts consuming more time --
up to 90-130 seconds (system time).  This increase is not always
monotonic; timings in the 60 sec range do occur.  (I double-checked
this, re-running starting at 97%.)

The 4.4 dirpref code seems a bit better in this regime, staying
mostly in the 60-70 sec range.

I'm now starting from a newfs'd raid with the default settings, and
running Don's patch (with the hashtable patch).  It should be done
in 10 hrs or so.  This matters to us, because we'd like to avoid
having to newfs all our production raids.

On Thu, Oct 30, 2003 at 06:59:56PM -0800, Don Lewis wrote:
> I'm pretty sure that is the situation, but I wasn't sure if there
> was still enough hash table usage to make balancing it worthwhile.
> It sounds like there isn't.

In this case, at least.  But is it still possible that some
situation might arise in which the quadratic search fails enough to
warrant having the more efficient hash for the multiple linear
searches?  I guess I'm asking if folks are inclined to go for some
hashtable patch (in addition to Don's dirpref patch).

I should probably compare hashtable vs. no-hashtable kernels with NO
dirpref patch.  I can do that (though not until tonight) if it would
help in deciding.  Let me know.
> It looks like there is still about a 10% increase in system time for
> my modifications to dirpref versus reverting to the old algorithm.
> It would be nice to know where the extra time was being spent, but
> that would probably require some kernel profiling.

Right - in the high-capacity regime, at least. We can profile kernels
here, but I'm running low on time I can easily devote to this. Julian
is due back in a few weeks, though. Let me know, and I'll do whatever
I can.

> It might also be interesting to play with the cylinder group
> selection criteria. Should minbfree be 75% of avgbfree, 100% of
> avgbfree, or 125% of avgbfree?

Probably not anything over 100%, since the algorithm would fail if all
the cylinder groups were evenly filled ... (I'm way out of my depth
here...)

Thanks again for the continued help with all this,
k
--
Ken Marx, kmarx@vicor-nb.com
Analyze progress on the family jewels!! - http://www.bigshed.com/cgi-bin/speak.cgi

From owner-freebsd-fs@FreeBSD.ORG Fri Oct 31 11:13:42 2003
From: Don Lewis <truckman@FreeBSD.org>
Date: Fri, 31 Oct 2003 11:12:49 -0800 (PST)
To: kmarx@sploot.vicor-nb.com
Cc: kmarx@vicor.com, freebsd-fs@FreeBSD.org, julian@elischer.org, mckusick@beastie.mckusick.com
Subject: Re: 4.8 ffs_dirpref problem

On 31 Oct, Ken Marx wrote:
> Kirk McKusick wrote:
>> I know it takes a lot of time, but I would like to hear of the
>> results when you do the initial loading of the filesystem using
>> Don's code as that may well affect the set of choices that it has.
>
> Ok. I've done this using 48k avgfilesize and 1500 filesperdir. I
> left our hashtable patch in.
>
> I can give details, but Don's code seems to average a healthy 64-65
> sec per 1.5 GB untar, i.e., basically equivalent to the 4.4 code.
> But after the disk is 90% or so full it starts consuming more time -
> up to 90-130 seconds (system time). This increase is not always
> monotonic; timings in the 60 sec range do occur. (I double-checked
> this, re-running starting at 97%.)
>
> The 4.4 dirpref code seems a bit better in this regime, staying
> mostly in the 60-70 sec range.

I suspect that the problem is caused by setting {minbfree,minifree} to
75% of {avgbfree,avgifree}, which is still allowing cylinder groups to
overflow even with my patch. I'm somewhat hesitant to go to 100%,
since there may not be many cylinder groups with free blocks and
inodes both above average.

Can you send me the first part of the dumpfs output for this
filesystem?
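To make the trade-off concrete, the acceptance test in question has
roughly this shape - a toy, self-contained sketch with illustrative
names, not the actual ffs_alloc.c code:

#include <stdio.h>

/*
 * Toy model: a cylinder group is an acceptable home for a new
 * directory only if both its free block and free inode counts are at
 * least some percentage of the per-group averages.
 */
struct cgsum {
	long	nbfree;		/* free blocks in this group */
	long	nifree;		/* free inodes in this group */
};

static int
pick_cg(const struct cgsum *cgs, int ncg, int pct)
{
	long totb = 0, toti = 0, minbfree, minifree;
	int i;

	for (i = 0; i < ncg; i++) {
		totb += cgs[i].nbfree;
		toti += cgs[i].nifree;
	}
	minbfree = (totb / ncg) * pct / 100;	/* pct = 75, 100, 125 */
	minifree = (toti / ncg) * pct / 100;

	for (i = 0; i < ncg; i++)
		if (cgs[i].nbfree >= minbfree && cgs[i].nifree >= minifree)
			return (i);		/* first acceptable group */
	return (-1);				/* nothing qualifies */
}

int
main(void)
{
	struct cgsum cgs[4] = {
		{ 100, 50 }, { 100, 50 }, { 100, 50 }, { 100, 50 }
	};

	/* evenly filled groups: 75% and 100% accept, 125% accepts none */
	printf("75%%: %d  100%%: %d  125%%: %d\n",
	    pick_cg(cgs, 4, 75), pick_cg(cgs, 4, 100), pick_cg(cgs, 4, 125));
	return (0);
}

With evenly filled groups, anything over 100% rejects every group,
which is the failure mode Ken pointed out earlier in the thread; the
open question is how much headroom below 100% still keeps individual
groups from overflowing.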
From owner-freebsd-fs@FreeBSD.ORG Fri Oct 31 15:46:57 2003
From: "Robert J. Adams (jason)" <radams@siscom.net>
Date: Fri, 31 Oct 2003 18:49:06 -0500
To: freebsd-fs@freebsd.org
Subject: Re: >1 systems 1 FS

Hello,

Looking at this from a different approach: let's say the filesystem
consists of only a few huge files (INN CNFS news spools, for example).
The number and size of these files wouldn't change, but another system
would be updating their contents. Would this work? Does the fs cache
cache file contents?

BTW, thanks for all the insight on this thus far, guys!
-jason

From owner-freebsd-fs@FreeBSD.ORG Fri Oct 31 18:57:11 2003
From: Ken Marx <kmarx@vicor.com>
Date: Fri, 31 Oct 2003 18:52:02 -0800
To: Kirk McKusick
Cc: Ken Marx, freebsd-fs@FreeBSD.org, Don Lewis, gluk@ptci.ru, julian@elischer.org
Subject: Re: 4.8 ffs_dirpref problem

On Thu, Oct 30, 2003 at 02:35:41PM -0800, Kirk McKusick wrote:
> I know it takes a lot of time, but I would like to hear of the results
> when you do the initial loading of the filesystem using Don's code as
> that may well affect the set of choices that it has.
>
> Kirk McKusick

Just a followup: I emailed this morning about re-doing this from a
scratch newfs with the afore-mentioned overrides for average file size
and files per directory. I've now run after newfs'ing with no
overrides, again with Don's patch and our hashtable patch.

The results are pretty similar to before: things march along at about
64-65 sec per 1.5 GB untar until the low 90%'s, where it bogs down a
bit. This time the peak times are a bit higher, but still acceptable.
That's nice for us, as we don't have to re-newfs our production disks
to take advantage of Don's patch.

Below, after a brief aside on the settings, are cut/pastes from the
end of both runs.
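(Aside: the "# tunefs:" lines below are tunefs output showing the
settings in effect for each run. The overrides in the first run amount
to an invocation of roughly this shape - device name assumed from the
df output that follows, run while the filesystem is unmounted:

	tunefs -f 49152 -s 1500 /dev/da0s1e

The second run shows the then-default values of 16384 and 64. Each
block below alternates a df -i line for /raid with the time figures
for the corresponding 1.5 GB untar.)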
# tunefs: average file size: (-f) 49152
# tunefs: average number of files in a directory: (-s) 1500
---------------------------------------------------------------------
/dev/da0s1e 558889580 441340370 72838044 86% 6080003 64094715 9% /raid
    154.11 real 1.67 user 63.63 sys
/dev/da0s1e 558889580 442867498 71310916 86% 6101041 64073677 9% /raid
    181.47 real 1.55 user 62.14 sys
/dev/da0s1e 558889580 444394628 69783786 86% 6122079 64052639 9% /raid
    155.28 real 1.36 user 63.30 sys
/dev/da0s1e 558889580 445921756 68256658 87% 6143117 64031601 9% /raid
    170.80 real 1.65 user 63.68 sys
/dev/da0s1e 558889580 447448884 66729530 87% 6164155 64010563 9% /raid
    121.58 real 1.84 user 65.67 sys
/dev/da0s1e 558889580 448976012 65202402 87% 6185193 63989525 9% /raid
    124.90 real 1.53 user 66.31 sys
/dev/da0s1e 558889580 450503140 63675274 88% 6206231 63968487 9% /raid
    126.03 real 1.48 user 66.31 sys
/dev/da0s1e 558889580 452030268 62148146 88% 6227269 63947449 9% /raid
    129.77 real 1.52 user 66.29 sys
/dev/da0s1e 558889580 453557398 60621016 88% 6248307 63926411 9% /raid
    124.69 real 1.49 user 67.11 sys
/dev/da0s1e 558889580 455084526 59093888 89% 6269345 63905373 9% /raid
    137.29 real 1.46 user 68.44 sys
/dev/da0s1e 558889580 456611654 57566760 89% 6290383 63884335 9% /raid
    126.54 real 1.84 user 66.92 sys
/dev/da0s1e 558889580 458138782 56039632 89% 6311421 63863297 9% /raid
    140.16 real 1.49 user 69.78 sys
/dev/da0s1e 558889580 459665910 54512504 89% 6332459 63842259 9% /raid
    139.00 real 1.68 user 72.60 sys
/dev/da0s1e 558889580 461193038 52985376 90% 6353497 63821221 9% /raid
    143.07 real 1.76 user 78.86 sys
/dev/da0s1e 558889580 462720168 51458246 90% 6374535 63800183 9% /raid
    164.61 real 1.57 user 94.91 sys
/dev/da0s1e 558889580 464247296 49931118 90% 6395573 63779145 9% /raid
    162.26 real 1.77 user 107.84 sys
/dev/da0s1e 558889580 465774424 48403990 91% 6416611 63758107 9% /raid
    169.80 real 1.64 user 100.17 sys
/dev/da0s1e 558889580 467301552 46876862 91% 6437649 63737069 9% /raid
    170.83 real 1.70 user 80.28 sys
/dev/da0s1e 558889580 468828680 45349734 91% 6458687 63716031 9% /raid
    134.12 real 1.68 user 66.30 sys
/dev/da0s1e 558889580 470355808 43822606 91% 6479725 63694993 9% /raid
    134.38 real 1.92 user 65.51 sys
/dev/da0s1e 558889580 471882938 42295476 92% 6500763 63673955 9% /raid
    119.38 real 1.41 user 66.21 sys
/dev/da0s1e 558889580 473410066 40768348 92% 6521801 63652917 9% /raid
    123.21 real 1.75 user 66.18 sys
/dev/da0s1e 558889580 474937194 39241220 92% 6542839 63631879 9% /raid
    125.09 real 1.75 user 66.63 sys
/dev/da0s1e 558889580 476464322 37714092 93% 6563877 63610841 9% /raid
    129.32 real 1.65 user 67.14 sys
/dev/da0s1e 558889580 477991452 36186962 93% 6584916 63589802 9% /raid
    129.29 real 1.54 user 68.44 sys
/dev/da0s1e 558889580 479518580 34659834 93% 6605954 63568764 9% /raid
    147.42 real 1.50 user 81.64 sys
/dev/da0s1e 558889580 481045710 33132704 94% 6626992 63547726 9% /raid
    149.96 real 1.49 user 74.04 sys
/dev/da0s1e 558889580 482572838 31605576 94% 6648030 63526688 9% /raid
    175.23 real 1.97 user 101.63 sys
/dev/da0s1e 558889580 484099966 30078448 94% 6669068 63505650 10% /raid
    182.27 real 1.79 user 115.61 sys
/dev/da0s1e 558889580 485627094 28551320 94% 6690106 63484612 10% /raid
    134.85 real 1.44 user 77.27 sys
/dev/da0s1e 558889580 487154222 27024192 95% 6711144 63463574 10% /raid
    208.96 real 1.63 user 91.81 sys
/dev/da0s1e 558889580 488681350 25497064 95% 6732182 63442536 10% /raid
    148.43 real 1.74 user 111.99 sys
/dev/da0s1e 558889580 490208480 23969934 95% 6753220 63421498 10% /raid
    151.99 real 1.51 user 115.23 sys
/dev/da0s1e 558889580 491735608 22442806 96% 6774258 63400460 10% /raid
    146.03 real 1.76 user 109.72 sys
/dev/da0s1e 558889580 493262736 20915678 96% 6795296 63379422 10% /raid
    171.04 real 1.67 user 132.48 sys
/dev/da0s1e 558889580 494789864 19388550 96% 6816334 63358384 10% /raid
    144.08 real 1.61 user 107.65 sys
/dev/da0s1e 558889580 496316992 17861422 97% 6837372 63337346 10% /raid
    149.30 real 1.75 user 112.30 sys
/dev/da0s1e 558889580 497844120 16334294 97% 6858410 63316308 10% /raid
    147.50 real 1.71 user 114.39 sys
/dev/da0s1e 558889580 499371248 14807166 97% 6879448 63295270 10% /raid
    153.79 real 1.65 user 118.08 sys
/dev/da0s1e 558889580 500898378 13280036 97% 6900486 63274232 10% /raid
    144.62 real 1.71 user 109.23 sys
/dev/da0s1e 558889580 502425506 11752908 98% 6921524 63253194 10% /raid
    134.38 real 1.33 user 98.33 sys
/dev/da0s1e 558889580 503952634 10225780 98% 6942562 63232156 10% /raid
    106.89 real 1.44 user 71.75 sys
/dev/da0s1e 558889580 505479762 8698652 98% 6963600 63211118 10% /raid
    138.70 real 1.50 user 103.18 sys
/dev/da0s1e 558889580 507006890 7171524 99% 6984638 63190080 10% /raid
    106.71 real 1.31 user 67.53 sys
/dev/da0s1e 558889580 508534020 5644394 99% 7005676 63169042 10% /raid
    112.68 real 1.41 user 72.57 sys
/dev/da0s1e 558889580 510061148 4117266 99% 7026714 63148004 10% /raid
    146.53 real 1.41 user 101.87 sys
/dev/da0s1e 558889580 511588276 2590138 99% 7047752 63126966 10% /raid
    134.13 real 1.61 user 95.51 sys
/dev/da0s1e 558889580 513115404 1063010 100% 7068790 63105928 10% /raid
total: 38098.69 real 492.64 user 22852.88 sys

# tunefs: average file size: (-f) 16384
# tunefs: average number of files in a directory: (-s) 64
---------------------------------------------------------------------
/dev/da0s1e 558889580 456611290 57567124 89% 6290373 63884345 9% /raid
    110.60 real 1.39 user 64.17 sys
/dev/da0s1e 558889580 458138418 56039996 89% 6311411 63863307 9% /raid
    112.72 real 1.45 user 64.43 sys
/dev/da0s1e 558889580 459665546 54512868 89% 6332449 63842269 9% /raid
    111.78 real 1.29 user 65.39 sys
/dev/da0s1e 558889580 461192674 52985740 90% 6353487 63821231 9% /raid
    114.09 real 1.38 user 65.11 sys
/dev/da0s1e 558889580 462719802 51458612 90% 6374525 63800193 9% /raid
    112.38 real 1.39 user 65.89 sys
/dev/da0s1e 558889580 464246930 49931484 90% 6395563 63779155 9% /raid
    108.89 real 1.35 user 65.48 sys
/dev/da0s1e 558889580 465774058 48404356 91% 6416601 63758117 9% /raid
    107.14 real 1.31 user 66.02 sys
/dev/da0s1e 558889580 467301186 46877228 91% 6437639 63737079 9% /raid
    110.17 real 1.35 user 66.06 sys
/dev/da0s1e 558889580 468828314 45350100 91% 6458677 63716041 9% /raid
    103.04 real 1.35 user 66.41 sys
/dev/da0s1e 558889580 470355442 43822972 91% 6479715 63695003 9% /raid
    103.65 real 1.33 user 67.43 sys
/dev/da0s1e 558889580 471882570 42295844 92% 6500753 63673965 9% /raid
    111.54 real 1.39 user 67.63 sys
/dev/da0s1e 558889580 473409698 40768716 92% 6521791 63652927 9% /raid
    119.68 real 1.49 user 70.97 sys
/dev/da0s1e 558889580 474936826 39241588 92% 6542829 63631889 9% /raid
    117.99 real 1.43 user 72.54 sys
/dev/da0s1e 558889580 476463954 37714460 93% 6563867 63610851 9% /raid
    124.48 real 1.33 user 82.17 sys
/dev/da0s1e 558889580 477991084 36187330 93% 6584906 63589812 9% /raid
    128.24 real 1.53 user 79.34 sys
/dev/da0s1e 558889580 479518212 34660202 93% 6605944 63568774 9% /raid
    133.42 real 1.73 user 94.41 sys
/dev/da0s1e 558889580 481045340 33133074 94% 6626982 63547736 9% /raid
    145.93 real 1.65 user 101.37 sys
/dev/da0s1e 558889580 482572468 31605946 94% 6648020 63526698 9% /raid
    135.85 real 1.60 user 84.61 sys
/dev/da0s1e 558889580 484099596 30078818 94% 6669058 63505660 10% /raid
    115.91 real 1.58 user 75.65 sys
/dev/da0s1e 558889580 485626724 28551690 94% 6690096 63484622 10% /raid
    160.66 real 1.64 user 118.11 sys
/dev/da0s1e 558889580 487153852 27024562 95% 6711134 63463584 10% /raid
    157.99 real 1.55 user 117.62 sys
/dev/da0s1e 558889580 488680980 25497434 95% 6732172 63442546 10% /raid
    171.06 real 1.63 user 127.95 sys
/dev/da0s1e 558889580 490208108 23970306 95% 6753210 63421508 10% /raid
    178.18 real 1.80 user 131.84 sys
/dev/da0s1e 558889580 491735236 22443178 96% 6774248 63400470 10% /raid
    167.70 real 1.64 user 113.13 sys
/dev/da0s1e 558889580 493262364 20916050 96% 6795286 63379432 10% /raid
    214.04 real 1.70 user 154.12 sys
/dev/da0s1e 558889580 494789492 19388922 96% 6816324 63358394 10% /raid
    191.70 real 1.74 user 135.61 sys
/dev/da0s1e 558889580 496316620 17861794 97% 6837362 63337356 10% /raid
    199.92 real 1.90 user 157.57 sys
/dev/da0s1e 558889580 497843748 16334666 97% 6858400 63316318 10% /raid
    281.91 real 2.00 user 192.30 sys
/dev/da0s1e 558889580 499370876 14807538 97% 6879438 63295280 10% /raid
    198.64 real 1.57 user 144.68 sys
/dev/da0s1e 558889580 500898004 13280410 97% 6900476 63274242 10% /raid
    271.23 real 1.97 user 213.29 sys
/dev/da0s1e 558889580 502425132 11753282 98% 6921514 63253204 10% /raid
    224.62 real 1.75 user 160.66 sys
/dev/da0s1e 558889580 503952260 10226154 98% 6942552 63232166 10% /raid
    233.05 real 1.65 user 168.42 sys
/dev/da0s1e 558889580 505479388 8699026 98% 6963590 63211128 10% /raid
    217.44 real 2.09 user 160.40 sys
/dev/da0s1e 558889580 507006516 7171898 99% 6984628 63190090 10% /raid
    284.76 real 1.97 user 219.23 sys
/dev/da0s1e 558889580 508533644 5644770 99% 7005666 63169052 10% /raid
    234.51 real 1.67 user 170.36 sys
/dev/da0s1e 558889580 510060772 4117642 99% 7026704 63148014 10% /raid
    236.64 real 2.00 user 173.65 sys
/dev/da0s1e 558889580 511587900 2590514 99% 7047742 63126976 10% /raid
    518.78 real 1.86 user 411.95 sys
/dev/da0s1e 558889580 513115028 1063386 100% 7068780 63105938 10% /raid
total: 39100.60 real 483.88 user 23886.66 sys

--
Ken Marx, kmarx@vicor-nb.com
Agree. Agree. But still we have to right size and step upto the plate on the milestones. - http://www.bigshed.com/cgi-bin/speak.cgi