Date: Tue, 15 Oct 2013 10:29:22 -0500
From: Rick Romero <rick@havokmon.com>
To: freebsd-fs@freebsd.org
Subject: Re: NFS locks, rpcbind port = 0 failed? - try #2
Message-ID: <20131015102922.Horde.Wr6o_T_ZY1aX0Z7nkPNeAw2@beta.vfemail.net>
In-Reply-To: <124950221.41251023.1381844070456.JavaMail.root@uoguelph.ca>
References: <124950221.41251023.1381844070456.JavaMail.root@uoguelph.ca>

Quoting Rick Macklem <rmacklem@uoguelph.ca>:

> Rick Romero wrote:
>> Quoting Rick Macklem <rmacklem@uoguelph.ca>:
>>
>> Rick Romero wrote:
>> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
>> freebsd-questions - trying a different angle, maybe it'll jostle
>> someone's memory. Don't mean to cross-post, but as I pay more
>> attention to the lists I'm reading, this seems to be the better
>> list for NFS issues.
>>
>> I have a FreeBSD 9.2 VM at an offsite hosting company. hostname
>> nl101vpn
>> OpenVPN is installed on it, routed not bridged mode.
>> I have multiple OSs installed on the local network. I'm already
>> exporting NFS off 9.1 with working file locks.
>>
>> What I see -
>> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or Linux -
>> locks do not work.
>> export nfsv3 from any local system, mount on nl101vpn - locks work.
>> export nfsv3 from locally installed VM, mount on any local host or
>> nl101vpn - locks work. No OpenVPN installed on it though. This was
>> to test if virtio drivers might be causing the problem.
>>
>> I even ran a tcpdump to see if something was getting lost - both
>> sides match, nothing is getting dropped
>>
>> nl101vpn - /var/log/messages:
>> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote rpcbind, stat = 0, port = 0 (why port 0?)
>> Oct 14 12:23:02 nl101 last message repeated 109 times
>> Oct 14 12:25:48 nl101 last message repeated 177 times
>>
>> I tried binding rpcbind to the VPN interface, but that doesn't seem
>> to work. tcpdump shows no packets trying to leave the 'Internet'
>> interface.
>>
>> So I haven't exhausted every combination, or completely 100%
>> replicated what's happening offsite, but it's getting pretty
>> ridiculous now... I'm lost, and I need NFS locking to work.
>> Help :)
>>
>> For rpcbind to work, IP broadcast needs to work between the hosts
>> and I suspect that the VPN doesn't support that.
>>
>> Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
>> to work, but I am not sure. (There are command line options for
>> these daemons that allow you to set specific port #s, but I don't
>> think that will fix the problem, since they still need rpcbind to
>> tell them the port# for the remote machines.) These protocols were
>> designed in the 1980s for use on a LAN.
>>
>> Now, nfsv4 shouldn't care less about rpcbind, rpc.lockd. NFSv4
>> locking is handled as a part of the NFSv4 protocol and always uses
>> port #2049.
>> I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
>> the mount has not fallen back to NFSv3. (For FreeBSD, specify "nfsv4"
>> as a mount option. For Linux, specify "vers=4" as a mount option.)
>> You can check what the mount is actually using via "nfsstat -m".
>> If you assumed the locking for NFSv4 wasn't working because of these
>> messages, that isn't the case. If you are using NFSv4 for all mounts,
>> you don't need to run rpc.lockd at all (at least for FreeBSD, I'm
>> not sure what the daemons do w.r.t. Linux).
>>
>> Hi Rick,
>>
>> Yeah - I thought the VPN might pose a problem, but I can get locks
>> from the VM side (nl101vpn) via NFS3 back to the main site. So it
>> doesn't seem to be an issue with the VPN. After that I created a
>> local VM to ensure it wasn't a virtio thing, and then upgraded the
>> remote VM to 9.2 (to rule out any funky custom options the host may
>> have thrown into their 9.1 installer). nada.
>>
>> So I'm re-trying with NFS4 - though my mount does show it was mounted
>> nfs4 (from Linux) last time I tried:
>> nl101vpn:/first on /mnt type nfs4 (rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)
>>
>> After trying again, still doesn't work. Though now I noticed the
>> error is different.
>> I have a little perl script that I test with, and the line is:
>> flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file: $!\n";
>> Before I would get 'permission denied' - which (IIRC) would also
>> happen when I forgot to run lockd or statd.
>> Now with NFS4 it says, 'Bad file descriptor'
>
> Well NFSv4 supports POSIX byte range locking (the fcntl() stuff),
> so if you can switch your testing/apps to that, you might find
> it works. (I have no idea what a Linux nfs4 mount will do with a
> flock() call.)
>
>> After much more testing I've gotten a single Linux VM (but no other
>> VMs on the same host, or my other host, yet they're all from the
>> same template) to get a lock (on NFSv3)
>> Failed locks show in the logs:
>> NLM: failed to contact remote rpcbind, stat = 0, port = 0 (FreeBSD)
>> NLM: failed to contact remote rpcbind, stat = 7, port = 28416 (Linux)
>>
>> I have 2 FreeBSD boxes, 9.1 and 7.2. Both don't seem to be relaying
>> their ports?
>> The Linux ones that fail have a port, but apparently can't be
>> contacted. Or is 'port' in the log not really referring to a port
>> number?
>>
>> The OpenVPN 'server' is another Linux VM - but on a different host
>> than the working Linux VM :P None of the Linux VMs on that host can
>> get a lock.
>> How's THAT for weird? :)
>>
>> I need some aspirin.
>
> Maybe someone else can suggest something that will help. If your
> machines/VMs have multiple IP host addresses, the "-h" option can
> be useful on all these daemons.
>
> I'll also mention that, although I don't know the NSM (rpc.statd)
> well, my basic understanding of it is that it "pings" other hosts
> to see if they are "up" (sometimes using IP broadcast, I think?).
> If it doesn't get a response, it marks the host "down" and then
> rpc.lockd doesn't use that host.
>
> I've never trusted rpc.lockd and friends, but others have found
> them useful.
> I believe a reliable LAN/WAN with support for IP broadcast and no
> port# blocking is needed for it to have any chance of working
> reliably.

Long story short - while my perl script doesn't properly test for locks
(thanks for that, I wouldn't have moved on without it) - the
applications I need to run seem to be working correctly over NFSv4.

Thanks for your time!

Rick
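
For anyone landing on this thread later: the suggestion above was to test with POSIX byte-range locks (the fcntl() family) rather than flock(). Below is a minimal sketch of such a test in Python, not the script used in the thread. The function name try_shared_lock is hypothetical. It opens the file read-only and asks for a non-blocking shared lock through fcntl.lockf(), which wraps the fcntl() locking calls.

```python
import fcntl


def try_shared_lock(path: str) -> bool:
    """Return True if a shared POSIX byte-range lock can be taken on path.

    Hypothetical helper for illustration only. The file is opened for
    reading because a shared (read) fcntl lock needs a readable descriptor.
    """
    with open(path, "r") as handle:
        try:
            # LOCK_NB makes the call fail immediately instead of blocking
            # if another process holds a conflicting lock.
            fcntl.lockf(handle, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as exc:
            print(f"Can't get shared lock on {path}: {exc}")
            return False
        # Release explicitly; closing the file would also drop the lock.
        fcntl.lockf(handle, fcntl.LOCK_UN)
        return True
```

Calling try_shared_lock() on a file under the NFSv4 mount (for example something below /mnt, with the file name being whatever test file you create) exercises the same lock path that fcntl()-based applications use. As a possible explanation for the 'Bad file descriptor' error, and this is an assumption rather than anything confirmed in the thread: the Linux flock(2) manual page says NFS clients emulate flock() as fcntl() byte-range locks, so a shared lock on a handle that was opened write-only could fail that way.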
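
A note on the question in the thread of whether 'port' in the Linux NLM log line is really a port number. This is only an observation about the arithmetic, not something confirmed by anyone in the thread: 28416 is 0x6F00, and swapping its two bytes gives 0x006F = 111, the well-known rpcbind port. So the logged value may simply be port 111 printed in network byte order on a little-endian machine. The helper name swap16 below is hypothetical.

```python
import struct


def swap16(value: int) -> int:
    """Reverse the two bytes of a 16-bit unsigned integer."""
    # Pack as big-endian, then reinterpret the same two bytes as little-endian.
    return struct.unpack("<H", struct.pack(">H", value))[0]


logged_port = 28416  # value from the Linux "NLM: failed to contact remote rpcbind" line
print(swap16(logged_port))
```

If that reading is right, the Linux clients were trying to reach rpcbind on its normal port and failing for some other reason, while the FreeBSD "port = 0" case is still unexplained.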