From: Mike Andrews
To: freebsd-stable@lists.freebsd.org
Date: Thu, 18 Mar 2004 15:19:25 -0500 (EST)
Subject: Weird NFSvs rdirplus issues
Message-ID: <20040318144807.E95350@mindcrime.bit0.com>

I've been experimenting with readdirplus and running into two bizarre situations.

Bizarre situation #1 is the easy one: when you try to put the acdirmin/max and acregmin/max options on an NFS filesystem in /etc/fstab, mount (actually mount_nfs) will dump core on 4.9-RELEASE-p4 but not on 5.2.1-RELEASE-p3:

  # grep acreg /etc/fstab
  server:/fs /mnt nfs,ro,-lis,acdirmin=0,acdirmax=1,acregmin=0,acregmax=1 0 0
  # mount /mnt
  mount: server:/fs: Segmentation fault

(the core file left behind is for mount_nfs, not mount, though)

However, running mount_nfs at the command line will work, even on 4.9:

  # mount_nfs -lis -o acdirmin=0,acdirmax=1,acregmin=0,acregmax=1 server:/fs /mnt

Looks like some kind of parsing error whose fix hasn't been MFC'ed? (I haven't been able to check 4.9-STABLE yet to see if the fix made it there.)

Bizarre situation #2 is why I was messing with those options in the first place... With readdirplus enabled (i.e. 'rdirplus' or '-l' in the fstab mount options), files sometimes, but not always, disappear when they're written -- which is just a bit alarming. :)

One way to reproduce this easily is to have /usr/ports be an NFS mount and try to build a port that does patches -- the .orig file will be created but the patched file will be gone, which causes the build to fail with "no such file or directory" errors. A specific port I've seen this on is the lang/ruby16 port:

  # mount_nfs -l server:/fs /usr/ports
  # cd /usr/ports/lang/ruby16
  # make clean
  ===> Cleaning for ruby-1.6.8.2003.10.15_1
  # make
  ===> Vulnerability check disabled
  ===> Extracting for ruby-1.6.8.2003.10.15_1
  >> Checksum OK for ruby/ruby-1.6.8.tar.gz.
  >> Checksum OK for ruby/ruby-1.6.8-2003.04.19.diff.bz2.
  >> Checksum OK for ruby/ruby-1.6.8-2003.04.19-2003.10.15.diff.bz2.
  ===> Patching for ruby-1.6.8.2003.10.15_1
  ===> Applying distribution patches for ruby-1.6.8.2003.10.15_1
  /bin/rm -rf /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/Win32API
  /bin/mv /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/gdbm /admin3.usr/ports/lang/ruby16/work/
  /bin/mv /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/tcltklib /admin3.usr/ports/lang/ruby16/work/
  /bin/mv /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/tk /admin3.usr/ports/lang/ruby16/work/
  ===> Configuring for ruby-1.6.8.2003.10.15_1
  /usr/bin/touch /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/configure
  configure: WARNING: you should use --build, --host, --target
  grep: ./version.h: No such file or directory
  checking build system type... i386-portbld-freebsd4
  [snip]
  ===> Building for ruby-1.6.8.2003.10.15_1
  make: don't know how to make version.h. Stop
  *** Error code 2

version.h is indeed missing, but version.h.orig exists (provided you comment "${FIND} ${PATCH_WRKSRC} -name '*.orig' -delete" out of the ruby16 port's Makefile, that is). It's not limited to just this one port anyway. Removing the -l from the mount_nfs makes the problem go away.

I tried making a simple shell script to create a file, and apply a patch to it, but I can't reproduce the problem that way. There must be something a bit more complicated going on... I've also had files saved in vim (but not nvi) vanish as soon as they're saved, but not consistently.

Thinking it might have to do with attribute caching, I've tried experimenting with the attribute cache settings, and can make the problem go away (or at least this particular symptom of it go away) by setting acregmin=0 and acregmax=1. This is how I ran into the fstab parsing coredump above... Setting the sysctl vfs.nfs.access_cache_timeout to 0 doesn't help.

On the NFS server end, I've tried FreeBSD 4.9-RELEASE-p4, and two Netapps (one Ontap 5.3.7, one Ontap 6.3.3), and on the client end, FreeBSD 4.9-RELEASE-p4 and 5.2.1-RELEASE-p3.
Every combination does the same thing.

The motivating factor in using readdirplus at all is that it drastically reduces CPU load, Ethernet load, and NFS ops/sec on the Netapps, which are not exactly cheap to upgrade CPU in. When I turned readdirplus off to stop the file corruption, the Netapp's CPU load pegged at 100% around the clock. But interestingly, it doesn't seem to raise the actual disk spindle ops/sec; probably the extra stat() calls are being handled from its disk cache. Still, the overhead of quadrupling the number of NFS calls is too much for it...

So now I'm having to pick between high CPU load and randomly losing files in unpredictable cases. Obviously there's got to be some pattern to it, and it seems to have to do with attribute caching; I just haven't nailed it down 100%. If the fstab parsing issue could be fixed, I could just disable attribute caching entirely and leave readdirplus on, which in theory should give me reliability and still keep the load down...

So if there are any NFS gurus that can make any sense out of this, or tell me what stupid thing I forgot to read up on, let me know :) In particular: how does the access_cache_timeout sysctl interact with the ac(dir|reg)(min|max) settings, why is the sysctl set to 2 in -stable and 60 in -current, and is there maybe something that should be invalidating the cache other than the timeout that isn't happening? Or is the timeout all we have because of the statelessness of NFS?

Mike Andrews * mandrews@bit0.com * http://www.bit0.com
"The truth is, you never find the truth." Carpe cavy! It's not news, it's Fark.com.
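
For anyone who wants to poke at situation #2 without building a whole port, here is a rough starting point, not a confirmed reproducer (a simple create-and-patch script did not trigger the problem, as noted above). It assumes that patch backs a file up by renaming it to .orig and then creates a new file under the old name, and it adds a directory listing beforehand so that a readdirplus mount has attributes cached for the old file. The function name check_rename_rewrite and the file names are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper, not from the original report: mimic the
# rename-then-recreate sequence that patch(1) is assumed to perform,
# then check that the recreated file is still visible by name.
# Usage: check_rename_rewrite DIR [ROUNDS]
check_rename_rewrite() {
    dir=$1
    rounds=${2:-20}
    missing=0
    i=1
    while [ "$i" -le "$rounds" ]; do
        f="$dir/f$i.txt"
        echo "original $i" > "$f"
        # List the directory so the client reads its entries (and,
        # on a -l mount, their attributes) before the file changes.
        ls -l "$dir" > /dev/null
        # Assumed patch(1) behaviour: back up by rename, then create
        # a new file under the old name.
        mv "$f" "$f.orig"
        echo "patched $i" > "$f"
        # Look the name up again, as configure or make would.
        if [ ! -f "$f" ]; then
            echo "vanished: $f"
            missing=$((missing + 1))
        fi
        i=$((i + 1))
    done
    # Succeed only if every recreated file was still visible.
    [ "$missing" -eq 0 ]
}
```

Running something like "check_rename_rewrite /usr/ports/tmp 100" once on a mount with -l and once without, on a scratch directory, would show whether this pattern alone is enough; if it is not, the trigger presumably involves more than a rename followed by a create.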