Date:      Tue, 15 Oct 2013 10:29:22 -0500
From:      Rick Romero <rick@havokmon.com>
To:        freebsd-fs@freebsd.org
Subject:   Re: NFS locks, rpcbind port = 0 failed? - try #2
Message-ID:  <20131015102922.Horde.Wr6o_T_ZY1aX0Z7nkPNeAw2@beta.vfemail.net>
In-Reply-To: <124950221.41251023.1381844070456.JavaMail.root@uoguelph.ca>
References:  <124950221.41251023.1381844070456.JavaMail.root@uoguelph.ca>

  Quoting Rick Macklem <rmacklem@uoguelph.ca>:

> Rick Romero wrote:
>> Quoting Rick Macklem <rmacklem@uoguelph.ca>:
>>
>> Rick Romero wrote:
>> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
>> freebsd-questions - trying a
>> different angle; maybe it'll jostle someone's memory.  Don't mean to
>> cross-post, but as I pay more attention to the lists I'm reading,
>> this
>> seems to be the better list for NFS issues.
>>
>> I have a FreeBSD 9.2 VM at an offsite hosting company.  hostname
>> nl101vpn
>> OpenVPN is installed on it, routed not bridged mode.
>> I have multiple OSs installed on the local network. I'm already
>> exporting NFS
>> off 9.1 with working file locks.
>>
>> What I see -
>> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or
>> Linux
>> -
>> locks do not work.
>> export nfsv3 from any local system, mount on nl101vpn - locks
>> work.
>> export nfsv3 from locally installed VM, mount on any local host or
>> nl101vpn
>> - locks work.  No OpenVPN installed on it though. This was to test
>> if
>> virtio drivers might be causing the problem.
>>
>> I even ran a tcpdump to see if something was getting lost - both
>> sides
>> match, nothing is getting dropped
>>
>> nl101vpn - /var/log/messages:
>> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote
>> rpcbind,
>> stat =
>> 0, port = 0  (why port 0?)
>> Oct 14 12:23:02 nl101 last message repeated 109 times
>> Oct 14 12:25:48 nl101 last message repeated 177 times
>>
>> I tried binding rpcbind to the VPN interface, but that doesn't
>> seem
>> to
>> work.  tcpdump shows no packets trying to leave the 'Internet'
>> interface.
>>
>> So I haven't exhausted every combination, or completely 100%
>> replicated
>> what's happening offsite, but it's getting pretty ridiculous now...
>> I'm
>> lost, and I need NFS locking to work.
>> Help :)
>>
>> For rpcbind to work, IP broadcast needs to work between the hosts
>> and I suspect that the VPN doesn't support that.
>>
>> Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
>> to work, but I am not sure. (There are command line options for
>> these daemons that allow you to set specific port #s, but I don't
>> think that will fix the problem, since they still need rpcbind to
>> tell them the port# for the remote machines.) These protocols were
>> designed in the 1980s for use on a LAN.
>>
>> Now, NFSv4 couldn't care less about rpcbind or rpc.lockd. NFSv4
>> locking
>> is handled as a part of the NFSv4 protocol and always uses port
>> #2049.
>> I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
>> the mount has not fallen back to NFSv3. (For FreeBSD, specify
>> "nfsv4"
>> as a mount option. For Linux, specify "vers=4" as a mount option.)
>> You can check what the mount is actually using via "nfsstat -m".
>> If you assumed the locking for NFSv4 wasn't working because of
>> these
>> messages, that isn't the case. If you are using NFSv4 for all
>> mounts,
>> you don't need to run rpc.lockd at all (at least for FreeBSD, I'm
>> not sure what the daemons do w.r.t. Linux).
>>
>>   Hi Rick,
>>
>> Yeah - I thought the VPN might pose a problem, but I can get locks
>> from the
>> VM side (nl101vpn) via NFS3 back to the main site.  So it doesn't
>> seem to
>> be an issue with the VPN. After that I created a local VM to ensure
>> it
>> wasn't a virtio thing, and then upgraded the remote VM to 9.2 (to
>> rule out
>> any funky custom options the host may have thrown into their 9.1
>> installer).  nada.
>>
>> So I'm re-trying with NFS4 - though my mount does show it was mounted
>> nfs4
>> (from Linux) last time I tried:
>> nl101vpn:/first on /mnt type nfs4
>> (rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)
>>
>> After trying again, still doesn't work.  Though now I noticed the
>> error is
>> different.  I have a little perl script that I test with, and the
>> line is:
>> flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file:
>> $!\n";
>> Before I would get 'permission denied' - which (IIRC) would also
>> happen when
>> I forgot to run lockd or statd.
>> Now with NFS4 it says, 'Bad file descriptor'
>
> Well NFSv4 supports POSIX byte range locking (the fcntl() stuff),
> so if you can switch your testing/apps to that, you might find
> it works. (I have no idea what a Linux nfs4 mount will do with a
> flock() call.)
>
>> After much more testing I've gotten a single Linux VM (but no other
>> VMs on
>> the same host, or my other host, yet they're all from the same
>> template) to
>> get a lock (on NFSv3)
>> Failed locks show in the logs:
>> NLM: failed to contact remote rpcbind, stat = 0, port = 0
>>     (FreeBSD)
>> NLM: failed to contact remote rpcbind, stat = 7, port = 28416
>>   (Linux)
>>
>> I have 2 FreeBSD boxes, 9.1 and 7.2.  Neither seems to be relaying
>> its port?
>> The Linux ones that fail have a port, but apparently can't be
>> contacted. Or
>> is 'port' in the log not really referring to a port number?
>>
>> The OpenVPN 'server' is another Linux VM - but on a different host
>> than the
>> working Linux VM :P  None of the Linux VMs on that host can get a
>> lock.
>> How's THAT for weird? :)
>>
>> I need some aspirin.
>
> Maybe someone else can suggest something that will help. If your
> machines/VMs have multiple IP host addresses, the "-h" option can
> be useful on all these daemons.
>
> I'll also mention that, although I don't know the NSM (rpc.statd)
> well, my basic understanding of it is that it "pings" other hosts
> to see if they are "up" (sometimes using IP broadcast, I think?).
> If it doesn't get a response, it marks the host "down" and then
> rpc.lockd doesn't use that host.
>
> I've never trusted rpc.lockd and friends, but others have found
> them useful. I believe a reliable LAN/WAN with support for IP
> broadcast and no port# blocking is needed for it to have any chance
> of working reliably.
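
On the quoted question of whether 'port' in the NLM log line is really a
port number: one possible reading, offered as a guess and not something
confirmed in this thread, is that the value was logged in network byte
order. 28416 is 0x6F00, and swapping its two bytes gives 111, the
well-known rpcbind port. The sketch below only shows that arithmetic;
swap16 is a hypothetical helper, and it says nothing about the
"port = 0" case.

```python
def swap16(value):
    """Swap the two bytes of a 16-bit unsigned integer."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


# Value taken from the "(Linux)" log line quoted above.
logged_port = 28416

print(hex(logged_port))     # 0x6f00
print(swap16(logged_port))  # 111
```

If that reading is right, the failing request was aimed at the normal
rpcbind port and simply got no usable answer.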

Long story short - while my perl script doesn't properly test for locks
(thanks for that, I wouldn't have moved on without it) - the applications I
need to run seem to be working correctly over NFSv4.
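
For anyone who wants to repeat the test with POSIX byte-range locks
(the fcntl() route suggested above) instead of flock(), here is a
minimal sketch in Python rather than the original Perl. The function
name and the idea of probing with a non-blocking shared lock are my own
assumptions, not something from the thread. Note that a POSIX read lock
needs a descriptor opened for reading, which may be related to the
'Bad file descriptor' error, though I have not verified that.

```python
import errno
import fcntl
import os


def try_shared_range_lock(path, start=0, length=0):
    """Try a non-blocking POSIX shared (read) lock on a byte range.

    Returns True if the lock was granted (and then released again),
    False if a conflicting lock is held elsewhere. A length of 0
    means "from start to end of file". Other errors are re-raised.
    """
    # A read lock requires a descriptor opened for reading.
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB,
                        length, start, os.SEEK_SET)
        except OSError as exc:
            # EACCES/EAGAIN mean the range is locked by someone else.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False
            raise
        # Release the lock before closing the descriptor.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
        return True
    finally:
        os.close(fd)
```

Calling try_shared_range_lock() on an existing file under the NFSv4
mount point (for example somewhere under /mnt from the mount output
quoted above; the file name is up to you) exercises the fcntl() path
that NFSv4 locking is built around.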

Thanks for your time!

Rick


