Date: Mon, 14 Oct 2013 20:04:58 -0500
From: Rick Romero <rick@havokmon.com>
To: freebsd-fs@freebsd.org
Subject: Re: NFS locks, rpcbind port = 0 failed? - try #2
Message-ID: <20131014200458.Horde.QjqQXEfm9A5k9357kfSZGQ1@beta.vfemail.net>
In-Reply-To: <485946006.40991171.1381788977647.JavaMail.root@uoguelph.ca>
References: <485946006.40991171.1381788977647.JavaMail.root@uoguelph.ca>

Quoting Rick Macklem <rmacklem@uoguelph.ca>:

> Rick Romero wrote:
>> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
>> freebsd-questions - trying a different angle, maybe it'll jostle
>> someone's memory. Don't mean to cross-post, but as I pay more
>> attention to the lists I'm reading, this seems to be the better
>> list for NFS issues.
>>
>> I have a FreeBSD 9.2 VM at an offsite hosting company, hostname
>> nl101vpn. OpenVPN is installed on it, routed not bridged mode.
>> I have multiple OSs installed on the local network. I'm already
>> exporting NFS off 9.1 with working file locks.
>>
>> What I see -
>> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or
>> Linux - locks do not work.
>> export nfsv3 from any local system, mount on nl101vpn - locks work.
>> export nfsv3 from locally installed VM, mount on any local host or
>> nl101vpn - locks work. No OpenVPN installed on it though. This was
>> to test if virtio drivers might be causing the problem.
>>
>> I even ran a tcpdump to see if something was getting lost - both
>> sides match, nothing is getting dropped.
>>
>> nl101vpn - /var/log/messages:
>> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote rpcbind, stat = 0, port = 0 (why port 0?)
>> Oct 14 12:23:02 nl101 last message repeated 109 times
>> Oct 14 12:25:48 nl101 last message repeated 177 times
>>
>> I tried binding rpcbind to the VPN interface, but that doesn't
>> seem to work. tcpdump shows no packets trying to leave the
>> 'Internet' interface.
>>
>> So I haven't exhausted every combination, or completely 100%
>> replicated what's happening offsite, but it's getting pretty
>> ridiculous now... I'm lost, and I need NFS locking to work.
>> Help :)
>
> For rpcbind to work, IP broadcast needs to work between the hosts
> and I suspect that the VPN doesn't support that.
>
> Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
> to work, but I am not sure. (There are command line options for
> these daemons that allow you to set specific port #s, but I don't
> think that will fix the problem, since they still need rpcbind to
> tell them the port# for the remote machines.) These protocols were
> designed in the 1980s for use on a LAN.
>
> Now, nfsv4 shouldn't care less about rpcbind, rpc.lockd. NFSv4 locking
> is handled as a part of the NFSv4 protocol and always uses port #2049.
> I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
> the mount has not fallen back to NFSv3. (For FreeBSD, specify "nfsv4"
> as a mount option. For Linux, specify "vers=4" as a mount option.)
> You can check what the mount is actually using via "nfsstat -m".
> If you assumed the locking for NFSv4 wasn't working because of these
> messages, that isn't the case. If you are using NFSv4 for all mounts,
> you don't need to run rpc.lockd at all (at least for FreeBSD, I'm
> not sure what the daemons do w.r.t. Linux).

Hi Rick,

Yeah - I thought the VPN might pose a problem, but I can get locks from
the VM side (nl101vpn) via NFS3 back to the main site. So it doesn't
seem to be an issue with the VPN. After that I created a local VM to
ensure it wasn't a virtio thing, and then upgraded the remote VM to 9.2
(to rule out any funky custom options the host may have thrown into
their 9.1 installer). Nada.

So I'm re-trying with NFS4 - though my mount does show it was mounted
nfs4 (from Linux) last time I tried:

nl101vpn:/first on /mnt type nfs4 (rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)

After trying again, it still doesn't work, though now I noticed the
error is different. I have a little perl script that I test with, and
the line is:

flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file: $!\n";

Before, I would get 'permission denied' - which (IIRC) would also happen
when I forgot to run lockd or statd.
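
For anyone who wants to repeat the check on their own mount, the flock test described above boils down to roughly the following. This is a minimal Python sketch of the same idea, not the perl script used here; the function name `try_shared_lock` is made up for illustration, and the file is opened read-only, which is an assumption since the open mode of the original script isn't shown.

```python
import errno
import fcntl
import sys


def try_shared_lock(path):
    """Open path read-only and request a non-blocking shared flock.

    Returns (True, "ok") on success, or (False, "<ERRNO>: <message>")
    so that EACCES, EBADF, ENOLCK and friends are easy to tell apart.
    """
    try:
        with open(path, "r") as fh:
            # LOCK_NB makes the call fail immediately instead of hanging
            # if the lock cannot be granted.
            fcntl.flock(fh, fcntl.LOCK_SH | fcntl.LOCK_NB)
            fcntl.flock(fh, fcntl.LOCK_UN)
        return True, "ok"
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return False, f"{name}: {exc.strerror}"


# Usage: python locktest.py /path/on/the/nfs/mount/somefile
if __name__ == "__main__" and len(sys.argv) == 2:
    print(try_shared_lock(sys.argv[1]))
```

One thing that may be worth ruling out: Linux NFS clients emulate flock() with byte-range locks, so a shared lock needs a descriptor that is open for reading and an exclusive lock needs one open for writing. Whether that has anything to do with the errors in this thread is not known.
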

Now with NFS4 it says 'Bad file descriptor'.

After much more testing I've gotten a single Linux VM (but no other VMs
on the same host, or my other host, yet they're all from the same
template) to get a lock (on NFSv3).

Failed locks show in the logs:

NLM: failed to contact remote rpcbind, stat = 0, port = 0 (FreeBSD)
NLM: failed to contact remote rpcbind, stat = 7, port = 28416 (Linux)

I have 2 FreeBSD boxes, 9.1 and 7.2. Neither seems to be relaying its
port? The Linux ones that fail have a port, but apparently can't be
contacted. Or is 'port' in the log not really referring to a port
number?

The OpenVPN 'server' is another Linux VM - but on a different host than
the working Linux VM :P None of the Linux VMs on that host can get a
lock.

How's THAT for weird? :) I need some aspirin.

Rick
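
On the question of whether 'port' in the log is really a port number: one reading worth checking, and it is only an assumption that nothing in this thread confirms, is that the value is printed in network byte order. The arithmetic below shows that 28416 is 0x6f00, which becomes 111 when its two bytes are swapped, and 111 is the well-known rpcbind port. The helper name `swap16` is made up for illustration.

```python
def swap16(value):
    """Swap the two bytes of a 16-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


logged_port = 28416  # taken from the "port = 28416 (Linux)" log line
print(hex(logged_port), swap16(logged_port))  # 0x6f00 111
print(swap16(0))  # 0, so the FreeBSD "port = 0" case is unaffected
```

If that reading is right, the Linux line would mean rpcbind on port 111 could not be reached, rather than some odd high port.
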