Date:      Tue, 23 Jan 2001 01:56:12 -0800
From:      Guy Harris <gharris@flashcom.net>
To:        Neil Brown <neilb@cse.unsw.edu.au>
Cc:        Matthias Andree <matthias.andree@stud.uni-dortmund.de>, Linux NFS mailing list <nfs@lists.sourceforge.net>, reiserfs-list@namesys.com, FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   Re: [NFS] Incompatible: FreeBSD 4.2 client, Linux 2.2.18 nfsv3 server, ReiserFS
Message-ID:  <20010123015612.H345@quadrajet.flashcom.com>

> Beware of error messages reported by tcpdump... they are misleading.
> The error code in the tcp packet is an "NFS error code" which should
> not be confused with an "Unix-errno" error code, though there are
> sometimes similarities.
> tcpdump seems to assume that nfs error codes *are* unix error codes.

"Fixed in 3.6"; tcpdump 3.6 doesn't assume NFS error codes are errno
values for the current host.
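
A minimal sketch of the idea, not tcpdump's actual code: a decoder can
carry its own table of NFS status values instead of handing the wire
value to "strerror()".  The names "nfs_status_name" and
"nfs_status_string()" are made up for illustration; the numeric values
are a subset of the nfsstat3 values in RFC 1813.

```c
#include <stddef.h>
#include <stdint.h>

/* A subset of the NFSv3 status values (nfsstat3 in RFC 1813). */
struct nfs_status_name {
	uint32_t    status;
	const char *text;
};

static const struct nfs_status_name nfs_status_names[] = {
	{     0, "OK" },
	{     1, "Not owner" },
	{     2, "No such file or directory" },
	{     5, "I/O error" },
	{    13, "Permission denied" },
	{    30, "Read-only file system" },
	{    70, "Stale NFS file handle" },
	{ 10001, "Illegal NFS file handle" },
};

/*
 * Hypothetical helper: translate an NFS status taken from the wire
 * (already converted to host byte order) without consulting the
 * local errno table.
 */
const char *nfs_status_string(uint32_t status)
{
	size_t i;

	for (i = 0; i < sizeof nfs_status_names / sizeof nfs_status_names[0]; i++)
		if (nfs_status_names[i].status == status)
			return nfs_status_names[i].text;
	return "Unknown NFS error";
}
```

With a table like this, 70 decodes as a stale file handle on any host,
regardless of what number that host's <errno.h> gives ESTALE.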

However, the "116" in the

	23:22:29.528764 emma1.nfs > freebsd.526363636: reply ok 116
	    access ERROR: Read-only file system (ttl 64, id 12889)

line isn't the error value, it's the length of the packet;
"nfsreply_print()" in tcpdump 3.5 does:

	if (!nflag)
		(void)printf("%s.nfs > %s.%u: reply %s %d",
			     ipaddr_string(&ip->ip_src),
			     ipaddr_string(&ip->ip_dst),
			     (u_int32_t)ntohl(rp->rm_xid),
			     ntohl(rp->rm_reply.rp_stat) == MSG_ACCEPTED?
				     "ok":"ERR",
			     length);
	else
		(void)printf("%s.%u > %s.%u: reply %s %d",
			     ipaddr_string(&ip->ip_src),
			     NFS_PORT,
			     ipaddr_string(&ip->ip_dst),
			     (u_int32_t)ntohl(rp->rm_xid),
			     ntohl(rp->rm_reply.rp_stat) == MSG_ACCEPTED?
			     	"ok":"ERR",
			     length);

EROFS is 30 on both Linux/x86 and FreeBSD (and 30 goes, I suspect, all
the way back to V6 UNIX or earlier, so I suspect just about any UNIX
with AT&T code in it, or even that once upon a time had AT&T code in it,
e.g. most of the commercial UNIXes, use 30 for EROFS).

So...

> So my guess is that the reiserfs code which tries to support nfsd (which
> may well involve some patches to knfsd) is having problems, and wants
> to return ESTALE, but returns it without converting to an NFS error
> code.

..."Read-only file system" probably means you got back EROFS from the
server, and the ReiserFS code is probably not to blame - the Linux nfsd
in 2.2.18 appears to assume that the underlying file system returns
Linux errno values (or the negative of same), and maps them to NFS error
values.
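
The mapping step described above might look roughly like the following.
This is a sketch under assumptions, not the 2.2.18 nfsd source: the
function name "errno_to_nfsstat3()" is hypothetical, only a handful of
values are covered, and anything unrecognized is lumped into
NFS3ERR_IO.  The NFS3ERR_* numbers are the ones RFC 1813 defines.

```c
#include <errno.h>

/* NFSv3 status values from RFC 1813 (subset). */
enum nfsstat3 {
	NFS3_OK       = 0,
	NFS3ERR_PERM  = 1,
	NFS3ERR_NOENT = 2,
	NFS3ERR_IO    = 5,
	NFS3ERR_ACCES = 13,
	NFS3ERR_ROFS  = 30,
	NFS3ERR_STALE = 70
};

/*
 * Hypothetical helper: turn a host errno, or the negative of one as
 * kernel code often returns it, into the status that goes on the wire.
 */
enum nfsstat3 errno_to_nfsstat3(int err)
{
	if (err < 0)
		err = -err;

	switch (err) {
	case 0:      return NFS3_OK;
	case EPERM:  return NFS3ERR_PERM;
	case ENOENT: return NFS3ERR_NOENT;
	case EACCES: return NFS3ERR_ACCES;
	case EROFS:  return NFS3ERR_ROFS;
	case ESTALE: return NFS3ERR_STALE;
	default:     return NFS3ERR_IO;	/* assumption: catch-all */
	}
}
```

For EROFS the host value and the wire value happen to coincide at 30;
for ESTALE they need not, which is why the translation has to happen
on the server before the reply goes out.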

I note, however, that Solaris's NFS server, as I remember, behaves
differently: if you make a V3 ACCESS call asking for ACCESS3_MODIFY
and/or ACCESS3_EXTEND (and/or, I suspect, ACCESS3_DELETE), and the file
being asked about is on a read-only file system, it just returns access
permissions with those bits turned off - RFC 1813 doesn't list
NFS3ERR_ROFS as one of the error returns permissible from ACCESS.

(I remember this because I saw a note on the "toasters" mailing list,
where somebody's Solaris client was trying to write to a file on a
NetApp file server when that file was in a snapshot, and those files
aren't writable; we returned EROFS to that attempt, and the Solaris
client apparently decided "well, maybe he hasn't remounted the file
system read/write yet", and just kept trying, to no avail.

I figured the underlying problem was that we weren't returning "no, you
can't write it" to an ACCESS call, and that, were we to do so, the
"open()" of the file would fail, and the Solaris box wouldn't try to
write to it in the first place; I sent mail back to the person who sent
the note, asking him to try it with a Solaris NFS server, and, sure
enough, the Solaris box did, as I remember, what I described above.

I filed a bug inside NetApp about this, and our server was fixed to do
what the Solaris server does.  As the original implementor of V3 on
NetApp boxes, I'll cheerfully take the 50 lashes with a wet noodle
called for here. :-))

Perhaps the Linux server should, in "nfsd_access()", treat "nfserr_rofs"
the same way it treats "nfserr_perm" and "nfserr_acces", and just say
the access type is denied but the access query succeeded, doing the same
thing that Solaris and a future release of the NetApp software will do.

(It looks as if the FreeBSD NFS server already does that - it treats all
errors from "nfsrv_access()" as meaning "access denied", not "access
call failed", so it treats EROFS in that fashion.  The same probably
applies to other BSDs.)
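
To make the suggested ACCESS behaviour concrete, here is a rough sketch
of it.  It is not the Linux, Solaris, or FreeBSD server code;
"access3_reply()", "access_check_fn" and "readonly_fs_check()" are
hypothetical names, and the per-bit callback is just an assumed way of
structuring the check.  The ACCESS3_* bits and NFS3ERR_* values are
those from RFC 1813.

```c
#include <stddef.h>
#include <stdint.h>

/* ACCESS3 request bits from RFC 1813. */
#define ACCESS3_READ    0x0001
#define ACCESS3_LOOKUP  0x0002
#define ACCESS3_MODIFY  0x0004
#define ACCESS3_EXTEND  0x0008
#define ACCESS3_DELETE  0x0010
#define ACCESS3_EXECUTE 0x0020

/* NFSv3 status values from RFC 1813 (subset). */
enum {
	NFS3_OK       = 0,
	NFS3ERR_PERM  = 1,
	NFS3ERR_IO    = 5,
	NFS3ERR_ACCES = 13,
	NFS3ERR_ROFS  = 30
};

/* Hypothetical callback: status for one access bit on one file. */
typedef int (*access_check_fn)(uint32_t bit, void *ctx);

/*
 * Build an ACCESS reply.  "Permission denied", "not owner" and
 * "read-only file system" all mean the bit is not granted, but the
 * query itself still succeeds; any other error fails the call.
 */
int access3_reply(uint32_t requested, access_check_fn check, void *ctx,
		  uint32_t *granted)
{
	static const uint32_t bits[] = {
		ACCESS3_READ, ACCESS3_LOOKUP, ACCESS3_MODIFY,
		ACCESS3_EXTEND, ACCESS3_DELETE, ACCESS3_EXECUTE
	};
	size_t i;

	*granted = 0;
	for (i = 0; i < sizeof bits / sizeof bits[0]; i++) {
		int status;

		if (!(requested & bits[i]))
			continue;
		status = check(bits[i], ctx);
		switch (status) {
		case NFS3_OK:
			*granted |= bits[i];
			break;
		case NFS3ERR_PERM:
		case NFS3ERR_ACCES:
		case NFS3ERR_ROFS:
			break;		/* denied, but the query worked */
		default:
			return status;	/* the ACCESS call itself failed */
		}
	}
	return NFS3_OK;
}

/* Example checker: a file on a read-only file system. */
int readonly_fs_check(uint32_t bit, void *ctx)
{
	(void)ctx;
	if (bit & (ACCESS3_MODIFY | ACCESS3_EXTEND | ACCESS3_DELETE))
		return NFS3ERR_ROFS;
	return NFS3_OK;
}
```

Asking for ACCESS3_READ | ACCESS3_MODIFY | ACCESS3_EXTEND with the
example checker yields NFS3_OK with only ACCESS3_READ granted, so a
client doing "open()" for writing gets a plain refusal rather than an
error on the ACCESS call itself.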

As for why the V2 client doesn't appear to have this problem, the V2
client doesn't make an ACCESS call, because NFS V2 doesn't have an
ACCESS call to make.

As for why it's a problem with ReiserFS but not ext2fs, if the ReiserFS
file system in question wasn't mounted read-only, perhaps somehow it's
bogusly reporting read-onlyness to the NFS server.  If it *was* mounted
read-only, was the ext2 file system also mounted read-only?  If not,
that might explain it.

As for why it's a problem with the FreeBSD client but not the Solaris
client, I'm not sure - a quick look at the 4.2 client code doesn't seem
to show any way in which the EROFS is "sticky" to the extent that it
affects all client accesses, as it doesn't cache the result of an ACCESS
call that failed.  It may be that the Solaris client simply ignores
NFS3ERR_ROFS from an ACCESS call and does an access check based on the
permission bits, rather than returning EROFS, whilst the FreeBSD client
returns EROFS; if ReiserFS is returning EROFS bogusly, that might cause
the symptoms in question.

