Date: Tue, 15 Oct 2013 09:34:30 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Rick Romero <rick@havokmon.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: NFS locks, rpcbind port = 0 failed? - try #2
Message-ID: <124950221.41251023.1381844070456.JavaMail.root@uoguelph.ca>
In-Reply-To: <20131014200458.Horde.QjqQXEfm9A5k9357kfSZGQ1@beta.vfemail.net>

Rick Romero wrote:
> Quoting Rick Macklem <rmacklem@uoguelph.ca>:
>
> > Rick Romero wrote:
> >> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
> >> freebsd-questions - trying a different angle, maybe it'll jostle
> >> someone's memory.  Don't mean to cross-post, but as I pay more
> >> attention to the lists I'm reading, this seems to be the better
> >> list for NFS issues.
> >>
> >> I have a FreeBSD 9.2 VM at an offsite hosting company.  hostname
> >> nl101vpn
> >> OpenVPN is installed on it, routed not bridged mode.
> >> I have multiple OSs installed on the local network. I'm already
> >> exporting NFS off 9.1 with working file locks.
> >>
> >> What I see -
> >> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or
> >> Linux - locks do not work.
> >> export nfsv3 from any local system, mount on nl101vpn - locks
> >> work.
> >> export nfsv3 from locally installed VM, mount on any local host or
> >> nl101vpn - locks work.  No OpenVPN installed on it though. This was
> >> to test if virtio drivers might be causing the problem.
> >>
> >> I even ran a tcpdump to see if something was getting lost - both
> >> sides match, nothing is getting dropped.
> >>
> >> nl101vpn - /var/log/messages:
> >> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote rpcbind, stat = 0, port = 0  (why port 0?)
> >> Oct 14 12:23:02 nl101 last message repeated 109 times
> >> Oct 14 12:25:48 nl101 last message repeated 177 times
> >>
> >> I tried binding rpcbind to the VPN interface, but that doesn't
> >> seem to work.  tcpdump shows no packets trying to leave the
> >> 'Internet' interface.
> >>
> >> So I haven't exhausted every combination, or completely 100%
> >> replicated what's happening offsite, but it's getting pretty
> >> ridiculous now... I'm lost, and I need NFS locking to work.
> >> Help :)
> >
> > For rpcbind to work, IP broadcast needs to work between the hosts
> > and I suspect that the VPN doesn't support that.
> >
> > Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
> > to work, but I am not sure. (There are command line options for
> > these daemons that allow you to set specific port #s, but I don't
> > think that will fix the problem, since they still need rpcbind to
> > tell them the port# for the remote machines.) These protocols were
> > designed in the 1980s for use on a LAN.
> >
> > Now, NFSv4 shouldn't care about rpcbind or rpc.lockd. NFSv4 locking
> > is handled as a part of the NFSv4 protocol and always uses port
> > #2049.
> > I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
> > the mount has not fallen back to NFSv3. (For FreeBSD, specify
> > "nfsv4" as a mount option. For Linux, specify "vers=4" as a mount
> > option.) You can check what the mount is actually using via
> > "nfsstat -m".
> > If you assumed the locking for NFSv4 wasn't working because of
> > these messages, that isn't the case. If you are using NFSv4 for all
> > mounts, you don't need to run rpc.lockd at all (at least for
> > FreeBSD, I'm not sure what the daemons do w.r.t. Linux).
>
> Hi Rick,
>
> Yeah - I thought the VPN might pose a problem, but I can get locks
> from the VM side (nl101vpn) via NFS3 back to the main site.  So it
> doesn't seem to be an issue with the VPN. After that I created a
> local VM to ensure it wasn't a virtio thing, and then upgraded the
> remote VM to 9.2 (to rule out any funky custom options the host may
> have thrown into their 9.1 installer).  nada.
>
> So I'm re-trying with NFS4 - though my mount does show it was mounted
> nfs4 (from Linux) last time I tried:
> nl101vpn:/first on /mnt type nfs4
> (rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)
>
> After trying again, still doesn't work.  Though now I noticed the
> error is different.  I have a little perl script that I test with,
> and the line is:
> flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file: $!\n";
> Before I would get 'permission denied' - which (IIRC) would also
> happen when I forgot to run lockd or statd.
> Now with NFS4 it says, 'Bad file descriptor'
>

Well, NFSv4 supports POSIX byte range locking (the fcntl() stuff), so
if you can switch your testing/apps to that, you might find it works.
(I have no idea what a Linux nfs4 mount will do with a flock() call.)

> After much more testing I've gotten a single Linux VM (but no other
> VMs on the same host, or my other host, yet they're all from the same
> template) to get a lock (on NFSv3).
> Failed locks show in the logs:
> NLM: failed to contact remote rpcbind, stat = 0, port = 0      (FreeBSD)
> NLM: failed to contact remote rpcbind, stat = 7, port = 28416  (Linux)
>
> I have 2 FreeBSD boxes, 9.1 and 7.2.  Both don't seem to be relaying
> their ports?
> The Linux ones that fail have a port, but apparently can't be
> contacted. Or is 'port' in the log not really referring to a port
> number?
>
> The OpenVPN 'server' is another Linux VM - but on a different host
> than the working Linux VM :P  None of the Linux VMs on that host can
> get a lock.
> How's THAT for weird? :)
>
> I need some aspirin.
>

Maybe someone else can suggest something that will help.

If your machines/VMs have multiple IP host addresses, the "-h" option
can be useful on all these daemons.

I'll also mention that, although I don't know the NSM (rpc.statd) well,
my basic understanding of it is that it "pings" other hosts to see if
they are "up" (sometimes using IP broadcast, I think?). If it doesn't
get a response, it marks the host "down" and then rpc.lockd doesn't use
that host.

I've never trusted rpc.lockd and friends, but others have found them
useful. I believe a reliable LAN/WAN with support for IP broadcast and
no port# blocking is needed for it to have any chance of working
reliably.

rick

> Rick
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
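
The reply above suggests moving the lock test from flock() to POSIX byte-range locking (the fcntl() interface). A minimal sketch of such a test in Python follows. It is an illustration, not the script from the thread: the function name `try_shared_range_lock` and its parameters are hypothetical, and it says nothing about whether a given NFS mount will grant the lock. It only shows what the fcntl-style request looks like.

```python
import fcntl
import os


def try_shared_range_lock(path, start=0, length=0):
    """Try a non-blocking POSIX shared (read) lock on a byte range.

    Returns True if the lock was granted, False if it was refused.
    length=0 means "from start to end of file".
    (Hypothetical helper, not part of the original thread.)
    """
    # A POSIX read lock needs a descriptor that is open for reading;
    # requesting one on a write-only descriptor fails with EBADF.
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # lockf() in Python's fcntl module wraps fcntl() record
            # locking, not the BSD flock() call.
            fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB,
                        length, start, os.SEEK_SET)
        except OSError:
            return False
        # Release the range again so the probe leaves no lock behind.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
        return True
    finally:
        os.close(fd)
```

Calling `try_shared_range_lock("/mnt/somefile")` against a file on the NFS mount (the path is a placeholder) would give a yes/no answer comparable to the Perl flock() test, but through the byte-range locking path.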