From owner-freebsd-stable@FreeBSD.ORG  Thu Mar 18 12:19:48 2004
Date: Thu, 18 Mar 2004 15:19:25 -0500 (EST)
From: Mike Andrews <mandrews@bit0.com>
To: freebsd-stable@lists.freebsd.org
Message-ID: <20040318144807.E95350@mindcrime.bit0.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Weird NFSvs rdirplus issues

I've been experimenting with readdirplus and running into two bizarre situations.

Bizarre situation #1 is the easy one: when you try to put the acdirmin/max
and acregmin/max options on an NFS filesystem in /etc/fstab, mount
(actually mount_nfs) will dump core on 4.9-RELEASE-p4 but not on
5.2.1-RELEASE-p3:

# grep acreg /etc/fstab
server:/fs /mnt nfs,ro,-lis,acdirmin=0,acdirmax=1,acregmin=0,acregmax=1 0 0
# mount /mnt
mount: server:/fs: Segmentation fault

(The core file left behind is from mount_nfs, not mount, though.)
However, running mount_nfs at the command line works, even on 4.9:

# mount_nfs -lis -o acdirmin=0,acdirmax=1,acregmin=0,acregmax=1 server:/fs /mnt

Looks like some kind of parsing bug whose fix hasn't been MFC'ed yet?
(I haven't been able to check 4.9-STABLE yet to see if the fix made it there.)
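
Until the fstab parsing is sorted out, one possible workaround is to leave
the entry out of /etc/fstab and have a local startup script run mount_nfs
directly.  A minimal sketch, reusing exactly the command line that works
above; the helper name nfs_mount_cmd is made up for illustration, and
calling it from a startup script is an assumption, not something tested
here:

```shell
#!/bin/sh
# Hypothetical helper: print the mount_nfs command line that works when run
# by hand, so a startup script can use it instead of the /etc/fstab entry.
nfs_mount_cmd() {
    server_path=$1    # e.g. server:/fs
    mount_point=$2    # e.g. /mnt
    # Attribute-cache options taken verbatim from the fstab entry above.
    ac_opts="acdirmin=0,acdirmax=1,acregmin=0,acregmax=1"
    echo "mount_nfs -lis -o $ac_opts $server_path $mount_point"
}

# Dry run: only prints the command.  Run the printed line as root to mount.
nfs_mount_cmd server:/fs /mnt
```

This only sidesteps the crash; it says nothing about where the parsing
actually goes wrong.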


Bizarre situation #2 is why I was messing with those options in the first
place...


With readdirplus enabled (i.e. 'rdirplus' or '-l' in the fstab mount
options) files sometimes, but not always, disappear when they're written
-- which is just a bit alarming.  :)  One way to reproduce this easily is
to have /usr/ports be an NFS mount and try to build a port that applies
patches -- the .orig file will be created but the patched file will be
gone, which causes the build to fail with "no such file or directory"
errors.  A specific port I've seen this on is the lang/ruby16 port:

# mount_nfs -l server:/fs /usr/ports
# cd /usr/ports/lang/ruby16
# make clean
===>  Cleaning for ruby-1.6.8.2003.10.15_1
# make
===>  Vulnerability check disabled
===>  Extracting for ruby-1.6.8.2003.10.15_1
>> Checksum OK for ruby/ruby-1.6.8.tar.gz.
>> Checksum OK for ruby/ruby-1.6.8-2003.04.19.diff.bz2.
>> Checksum OK for ruby/ruby-1.6.8-2003.04.19-2003.10.15.diff.bz2.
===>  Patching for ruby-1.6.8.2003.10.15_1
===>  Applying distribution patches for ruby-1.6.8.2003.10.15_1
/bin/rm -rf /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/Win32API
/bin/mv /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/gdbm /admin3.usr/ports/lang/ruby16/work/
/bin/mv /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/tcltklib /admin3.usr/ports/lang/ruby16/work/
/bin/mv /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/ext/tk /admin3.usr/ports/lang/ruby16/work/
===>  Configuring for ruby-1.6.8.2003.10.15_1
/usr/bin/touch /admin3.usr/ports/lang/ruby16/work/ruby-1.6.8/configure
configure: WARNING: you should use --build, --host, --target
grep: ./version.h: No such file or directory
checking build system type... i386-portbld-freebsd4
[snip]
===>  Building for ruby-1.6.8.2003.10.15_1
make: don't know how to make version.h. Stop
*** Error code 2

version.h is indeed missing, but version.h.orig exists (provided you
comment "${FIND} ${PATCH_WRKSRC} -name '*.orig' -delete" out of the ruby16
port's Makefile, that is).  It's not limited to just this one port anyway.

Removing the -l from the mount_nfs command makes the problem go away.

I tried making a simple shell script to create a file, and apply a patch
to it, but I can't reproduce the problem that way.  There must be
something a bit more complicated going on...
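
A slightly more aggressive test might be worth trying.  The sketch below
assumes (this is a guess, not something confirmed) that the trigger is the
rename sequence patch uses: move the original to .orig, write the new
contents to a temporary file, rename that into place, and then have
something list the directory.  The function name check_rename_pattern and
the test file names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical reproduction helper.  Repeats a patch-like rename sequence in
# the given directory and prints how many times the final file was missing.
check_rename_pattern() {
    dir=$1
    count=${2:-100}
    missing=0
    i=0
    while [ "$i" -lt "$count" ]; do
        f="$dir/testfile.$i"
        echo "original" > "$f"
        # Mimic patch: keep the old contents as .orig, write the new contents
        # to a temporary file, then rename the temporary file into place.
        mv "$f" "$f.orig"
        echo "patched" > "$f.tmp"
        mv "$f.tmp" "$f"
        # List the directory (on an rdirplus mount this should go through
        # readdirplus), then check that the new file is still visible.
        ls "$dir" > /dev/null
        if [ ! -f "$f" ]; then
            missing=$((missing + 1))
        fi
        rm -f "$f" "$f.orig"
        i=$((i + 1))
    done
    echo "$missing"
}
```

Run it against a scratch directory on the rdirplus mount, for example
"check_rename_pattern /mnt 1000".  Any result other than 0 would mean the
file vanished at least once; a result of 0 proves nothing either way.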

I've also had files saved in vim (but not nvi) vanish as soon as they're
saved, but not consistently.

Thinking it might have to do with attribute caching, I've tried
experimenting with the attribute cache settings, and can make the problem
go away (or at least this particular symptom of it go away) by setting
acregmin=0 and acregmax=1.  This is how I ran into the fstab parsing
coredump above...
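
For reference, here is a rough model of what those two settings do.  It
assumes the client picks the cache lifetime for a regular file as one tenth
of the time since the file was last modified, clamped between acregmin and
acregmax.  That formula is an assumption about the client code, not
something verified here, and the function name effective_ac_timeout is made
up:

```shell
#!/bin/sh
# Sketch of the assumed attribute-cache lifetime calculation: age/10, clamped
# to the range [acmin, acmax].  All values are in seconds.
effective_ac_timeout() {
    age=$1      # seconds since the file was last modified
    acmin=$2    # acregmin (or acdirmin for a directory)
    acmax=$3    # acregmax (or acdirmax for a directory)
    t=$((age / 10))
    if [ "$t" -lt "$acmin" ]; then
        t=$acmin
    fi
    if [ "$t" -gt "$acmax" ]; then
        t=$acmax
    fi
    echo "$t"
}

# With acregmin=0 and acregmax=1, as used above:
effective_ac_timeout 5 0 1      # file modified 5 seconds ago
effective_ac_timeout 600 0 1    # file modified 10 minutes ago
```

If the model is right, acregmin=0 and acregmax=1 mean cached attributes for
a freshly written file are never trusted at all, and those for an older
file are trusted for at most one second.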

Setting the sysctl vfs.nfs.access_cache_timeout to 0 doesn't help.

On the NFS server end, I've tried FreeBSD 4.9-RELEASE-p4, and two Netapps
(one Ontap 5.3.7, one Ontap 6.3.3), and on the client end, FreeBSD
4.9-RELEASE-p4 and 5.2.1-RELEASE-p3.  Every combination does the same
thing.

The motivating factor in using readdirplus at all is that it drastically
reduces CPU load, ethernet load, and NFS ops/sec on the Netapps, which are
not exactly cheap to upgrade CPU in.  When I turned readdirplus off to
stop the files from vanishing, the Netapp's CPU load pegged at 100% around
the clock.  But interestingly, it doesn't seem to raise the actual disk spindle
ops/sec; probably the extra stat() calls are being handled from its disk
cache.  Still, the overhead of quadrupling the number of NFS calls is too
much for it...

So now I'm having to pick between high CPU load and randomly losing files
in unpredictable cases.  Obviously there's got to be some pattern to it,
and it seems to have to do with attribute caching; I just haven't nailed it
down 100%.  If the fstab parsing issue could be fixed, I could just disable
attribute caching entirely and leave readdirplus on, which in theory
should give me reliability and still keep the load down...

So if there are any NFS gurus that can make any sense out of this, or tell
me what stupid thing I forgot to read up on, let me know :)

In particular, I'd like to know how the access_cache_timeout sysctl
interacts with the ac(dir|reg)(min|max) settings, why the sysctl is set to
2 in -stable and 60 in -current, and whether something other than the
timeout should be invalidating the cache but isn't.  Or is the timeout
all we have because of the statelessness of NFS?


Mike Andrews  *  mandrews@bit0.com  *  http://www.bit0.com
"The truth is, you never find the truth."     Carpe cavy!
It's not news, it's Fark.com.