Date: Fri, 22 May 2020 19:43:02 +0100
From: "Norman Gray" <norman.gray@glasgow.ac.uk>
To: Greg Veldman <freebsd@gregv.net>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: Documentation and debugging for NFSv4
Message-ID: <AD3C4097-8DEF-41E1-81FE-4C4CD9DC4FA0@glasgow.ac.uk>
In-Reply-To: <20200522162937.GJ1068@aurora.gregv.net>
References: <D3388CA5-84AA-48F4-8B47-8B94EFA4305A@glasgow.ac.uk> <20200522162937.GJ1068@aurora.gregv.net>

Greg, hello.

Thanks for your questions. I didn't describe the problem in detail in
the first message, to keep it shorter.

On 22 May 2020, at 17:29, Greg Veldman wrote:

First things first:

> FYI NFS problems in general usually come down to either firewall
> settings or MTU mismatches. If there are any firewalls between
> the client or server you may wish to temporarily disable them to
> test (or add an allow all rule). Also check that the MTU is
> the same for the whole path.

There are no firewalls in between, and indeed on one pair of machines
they're on the same subnet.

I'd be surprised it's an MTU problem (though I'm not ruling anything
out) because I can do some mounts, so I _think_ that rules that out?

> What symptoms are you seeing? Does it mount at all? If not
> does it give you an error message? Does it mount but is it
> not usable? Does NFSv3 work or is any version broken?

The following is a description of a sequence of things more-or-less
working as expected, followed by the now-clearer puzzle below '----'.

server:/etc/exports

    /tank/home -maproot=nobody -network=172.16.45.0/24
    /tank/home/astro -maproot=nobody -network=172.16.45.0/24
    /tank/home/astro/norman -maproot=nobody -network=172.16.45.0/24
    V4: /tank/home -network=172.16.45.0/24

There are numerous other lines in the exports file, and each of the
above lines is almost duplicated, to refer to the networks for the
machines `client1`, `client2` and `client3` below.

I don't _particularly_ want /tank/home and /tank/home/astro exported,
but I learn in exports(5) that if I'm exporting ZFS filesystems (which
these are) then intermediate FSs must be exported as well.

The V4 line 'turns on' the NFSv4 server, and marks /tank/home as the
'root' for this purpose.
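
A minimal sketch of the path relationship that the V4 line implies, assuming (as the mounts later in this message suggest) that an NFSv4 client names a filesystem relative to the V4 root, while a v3 client uses the full server path. The helper `nfs4_path` is hypothetical, written only for illustration; it is not part of FreeBSD or of any NFS tooling.

```shell
#!/bin/sh
# Hypothetical helper (not a real NFS tool): given the root named on the
# "V4:" line of /etc/exports and a server-side path, print the path an
# NFSv4 client would be expected to ask for.
nfs4_path() {
    v4root=${1%/}          # drop a trailing slash, so a root of "/" becomes ""
    server_path=$2
    case "$server_path" in
        "$v4root")
            # The export is the V4 root itself.
            printf '/\n' ;;
        "$v4root"/*)
            # Strip the V4 root prefix, keeping the leading slash.
            printf '%s\n' "${server_path#"$v4root"}" ;;
        *)
            echo "not under the V4 root: $server_path" >&2
            return 1 ;;
    esac
}

nfs4_path /tank/home /tank/home/astro/norman    # prints /astro/norman
```

With the root set to `/`, the function returns the server path unchanged, which would be the case where v3 and v4 clients use the same mount path.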

server:/etc/rc.conf

    mountd_enable="YES"
    nfs_server_enable="YES"
    nfsv4_server_enable="YES"
    rpcbind_enable="YES"

server# rpcinfo -s
   program version(s) netid(s)                  service   owner
    100000 2,3,4      local,udp6,tcp6,udp,tcp   rpcbind   superuser
    100005 3,1        tcp,udp,tcp6,udp6         mountd    superuser
    100003 3,2        tcp6,tcp,udp6,udp         nfs       superuser

Ie, no version 4 for the 'nfs' service.

On the client (CentOS 7.8, Linux automount version 5.0.7-109.el7):

client1# mount -tnfs server:/astro/norman /mnt
client1# mount | grep norman
server:/astro/norman on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.20.45.184,local_lock=none,addr=130.209.45.61)

(note 'absolute' path relative to NFS root), and

norman@client1$ flock /mnt/try.lock echo hello
hello

That is, this is mounted as NFSv4, and file-locking works. Good.

The same is true with an Ubuntu12 machine ('client2') also running
automount v5.0.6 (yes, Ubuntu 12; this work is part of a sequence of
steps to let that machine out of the dungeon it's currently in for its
own good).

client# mount -tnfs server:/astro/norman /mnt

The same thing works with the mounted filesystem given as
`server:astro/norman` (ie, without the leading slash). This was the
result of a hint on some other forum post (and the exports(5) page
doesn't make it clear which one is correct). This also seems to work in
this particular case, but an anomaly is that the mount command shows
this as server:/astro/norman as if it were an absolute path.

If, in contrast, we do

client1# mount -tnfs server:/tank/home/astro/norman /mnt
client1# mount | grep norman
server:/tank/home/astro/norman on /mnt type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=130.209.45.61,mountvers=3,mountport=780,mountproto=udp,local_lock=none,addr=130.209.45.61)

and

norman@client1$ flock /mnt/try.lock echo hello

...hangs for 4 minutes (what timeout is this?). Also, this appears to
be an NFS3 mount.

I do not think I would have been able to predict that would happen.
nfsv4(4) says "setting [rootdir] to anything other than ``/'' will
result in clients being required to use different mount paths for NFSv4
than for NFS Version 2 or 3.", but it doesn't say why, or what the
different paths might be, which is frustrating.

So what's happening here is something like the client, in the first
case, trying a v4 mount of /astro/norman with success, and then, in the
second case, trying a v4 mount of /tank/home/astro/norman, which fails,
so the client falls back to a v3 mount of that path (which doesn't work
for me, because of the different flock semantics in NFSv4). Again, I
don't know if this is a sane heuristic, or if the manpages are telling
me I should have predicted that.

So far so good. Client1 and client2 are two different Linux distros, on
two different subnets (albeit with almost the same automount version,
slightly surprisingly), and both can mount the server filesystem.

For what it's worth, the same thing works when using an automount map
appropriately configured via LDAP.

I haven't tried any FreeBSD clients here: I'm serving almost exclusively
CentOS clients (plus occasional legacy rogues, of course).
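
Because the client can apparently fall back from v4 to v3 without saying so, the negotiated version is only visible in the `vers=` option of the `mount` output. A small sketch for pulling that value out of one such line; the function name `nfs_vers` is hypothetical, and the sample line is a shortened copy of the client1 output above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the vers= option from a single
# line of Linux `mount` output. The "[(,]" before "vers=" keeps the
# pattern from matching inside "mountvers=".
nfs_vers() {
    printf '%s\n' "$1" | sed -n 's/.*[(,]vers=\([0-9.]*\).*/\1/p'
}

# Shortened copy of the client1 NFSv4 mount line shown earlier.
mount_line='server:/astro/norman on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,proto=tcp)'
nfs_vers "$mount_line"    # prints 4.1
```

The Linux nfs(5) page also documents a `vers=` mount option for requesting a specific version; whether pinning it turns the silent fallback into a visible mount failure against this server is an assumption that has not been tested here.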

----

I now go to another CentOS 7.8 machine

client3# mount -tnfs server:/astro/norman /mnt
client3# mount|grep /mnt
server:/astro/norman on /mnt type nfs4 (rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=130.209.202.212,local_lock=none,addr=130.209.45.61)
client3# ls /mnt
ls: reading directory /mnt: Input/output error

Nothing! The same thing -- an apparently successful mount, and then an
I/O error -- happens when I let the automounter do the work.

If I try

client3# mount -tnfs server:/tank/home/astro/norman /mnt
mount.nfs: Stale file handle

that is, the server path that should end up with a v3 mount, I get a
separate NFS problem.

So: My real problem is that at this point I have run out of techniques
to diagnose this. The server looks... OK (modulo the 100003 3/4 version
puzzle I've mentioned above), and some mounts work (including on another
CentOS machine at the same OS version, which I've not knowingly
configured significantly differently). There's nothing relevant in the
logs.

I have a suspicion that if I simply rebooted everything, I'd clear out
some caches and this might start working as I expect (as it happens,
client3 is scheduled for a reboot tonight anyway, so we'll find out),
but that's never been a fully satisfactory solution even if it worked.

I've always felt uncomfortable with the (perceived) lack of NFS
diagnostics for this sort of situation, but in a good number of years,
this is the first time it's really bitten me.

> NFSv4 introduces a couple of new things that can also be issues.
> The first is the v4 tree root, and the second is the concept
> of ID mapping to avoid the need to sync UID/GID numbers. If
> you can make it work with v3 but not v4, I'd start looking there.
> Specifically make sure you have a "V4" line in /etc/exports on
> the FreeBSD server, and make sure nfsuserd is running and
> configured with the same domain name as in /etc/idmapd.conf on
> the Linux side (and make sure rpc.idmapd is running on Linux).

I haven't done anything clever with user mapping, because I don't
_think_ I need to (do I?). All of the machines in question refer to the
same LDAP directory for account and mount-map information. Is this
something I should care about nonetheless, whether or not it's related
to this mount problem?

Thanks for reading this far.

Best wishes,

Norman

-- 
Norman Gray : http://www.astro.gla.ac.uk/users/norman/it/
Research IT Coordinator : School of Physics and Astronomy
// My current template week for IT tasks is: Monday, Tuesday, and Friday