Date:      Tue, 15 Oct 2013 09:34:30 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Rick Romero <rick@havokmon.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: NFS locks, rpcbind port = 0 failed? - try #2
Message-ID:  <124950221.41251023.1381844070456.JavaMail.root@uoguelph.ca>
In-Reply-To: <20131014200458.Horde.QjqQXEfm9A5k9357kfSZGQ1@beta.vfemail.net>

Rick Romero wrote:
> Quoting Rick Macklem <rmacklem@uoguelph.ca>:
>
> > Rick Romero wrote:
> >> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
> >> freebsd-questions - trying a
> >> different angle maybe it'll jostle someone's memory. Don't mean to
> >> cross-post, but as I pay more attention to the lists I'm reading,
> >> this
> >> seems to be the better list for NFS issues.
> >>
> >> I have a FreeBSD 9.2 VM at an offsite hosting company. hostname
> >> nl101vpn
> >> OpenVPN is installed on it, routed not bridged mode.
> >> I have multiple OSs installed on local network. I'm already
> >> exporting NFS
> >> off 9.1 with working file locks.
> >>
> >> What I see -
> >> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or
> >> Linux
> >> -
> >> locks do not work.
> >> export nfsv3 from any local system, mount on nl101vpn - locks
> >> work.
> >> export nfsv3 from locally installed VM, mount on any local host or
> >> nl101vpn
> >> - locks work. No OpenVPN installed on it though. This was to test
> >> if
> >> virtio drivers might be causing the problem.
> >>
> >> I even ran a tcpdump to see if something was getting lost - both
> >> sides
> >> match, nothing is getting dropped
> >>
> >> nl101vpn - /var/log/messages:
> >> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote
> >> rpcbind,
> >> stat =
> >> 0, port = 0 (why port 0?)
> >> Oct 14 12:23:02 nl101 last message repeated 109 times
> >> Oct 14 12:25:48 nl101 last message repeated 177 times
> >>
> >> I tried binding rpcbind to the VPN interface, but that doesn't
> >> seem
> >> to
> >> work. tcpdump shows no packets trying to leave the 'Internet'
> >> interface.
> >>
> >> So I haven't exhausted every combination, or completely 100%
> >> replicated
> >> what's happening offsite, but it's getting pretty ridiculous now...
> >> I'm
> >> lost, and I need NFS locking to work.
> >> Help :)
> >
> > For rpcbind to work, IP broadcast needs to work between the hosts
> > and I suspect that the VPN doesn't support that.
> >
> > Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
> > to work, but I am not sure. (There are command line options for
> > these daemons that allow you to set specific port #s, but I don't
> > think that will fix the problem, since they still need rpcbind to
> > tell them the port# for the remote machines.) These protocols were
> > designed in the 1980s for use on a LAN.
> >
> > Now, nfsv4 shouldn't care less about rpcbind, rpc.lockd. NFSv4
> > locking
> > is handled as a part of the NFSv4 protocol and always uses port
> > #2049.
> > I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
> > the mount has not fallen back to NFSv3. (For FreeBSD, specify
> > "nfsv4"
> > as a mount option. For Linux, specify "vers=4" as a mount option.)
> > You can check what the mount is actually using via "nfsstat -m".
> > If you assumed the locking for NFSv4 wasn't working because of
> > these
> > messages, that isn't the case. If you are using NFSv4 for all
> > mounts,
> > you don't need to run rpc.lockd at all (at least for FreeBSD, I'm
> > not sure what the daemons do w.r.t. Linux).
>
>   Hi Rick,
>
> Yeah - I thought the VPN might pose a problem, but I can get locks
> from the
> VM side (nl101vpn) via NFS3 back to the main site. So it doesn't
> seem to
> be an issue with the VPN. After that I created a local VM to ensure
> it
> wasn't a virtio thing, and then upgraded the remote VM to 9.2 (to
> rule out
> any funky custom options the host may have thrown into their 9.1
> installer). nada.
>
> So I'm re-trying with NFS4 - though my mount does show it was mounted
> nfs4
> (from Linux) last time I tried:
> nl101vpn:/first on /mnt type nfs4
> (rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)
>
> After trying again, still doesn't work. Though now I noticed the
> error is
> different. I have a little perl script that I test with, and the
> line is:
> flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file:
> $!\n";
> Before I would get 'permission denied' - which (IIRC) would also
> happen when
> I forgot to run lockd or statd.
> Now with NFS4 it says, 'Bad file descriptor'
>=20
Well, NFSv4 supports POSIX byte-range locking (the fcntl() stuff),
so if you can switch your testing/apps to that, you might find
it works. (I have no idea what a Linux NFSv4 mount will do with a
flock() call.)
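
To make the fcntl() suggestion concrete, here is a minimal sketch of a
non-blocking shared byte-range lock over a whole file. It is written in
Python rather than Perl only to keep it short; the function name and the
path in the comment are made up for illustration and are not from the
test script in this thread.

```python
import fcntl

def try_shared_lock(path):
    """Try a non-blocking POSIX (fcntl) shared lock on the whole file.

    Returns (True, None) on success or (False, errno) on failure.
    """
    # A shared (read) lock needs a descriptor that is open for reading.
    with open(path, "rb") as f:
        try:
            # lockf() is implemented with fcntl() byte-range locks;
            # the default length of 0 means "to end of file".
            fcntl.lockf(f, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as e:
            return False, e.errno
        fcntl.lockf(f, fcntl.LOCK_UN)
        return True, None

# Hypothetical usage on the NFS mount:
#   print(try_shared_lock("/mnt/lockfile"))
```

One thing that may be worth checking on the Linux side: the flock(2)
man page says that since Linux 2.6.12 the NFS client emulates flock()
with fcntl() byte-range locks, so the mode the file was opened with
matters there. Whether that explains the 'Bad file descriptor' error is
only a guess.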

> After much more testing I've gotten a single Linux VM (but no other
> VMs on
> the same host, or my other host, yet they're all from the same
> template) to
> get a lock (on NFSv3)
> Failed locks show in the logs:
> NLM: failed to contact remote rpcbind, stat = 0, port = 0
>     (FreeBSD)
> NLM: failed to contact remote rpcbind, stat = 7, port = 28416
>   (Linux)
>
> I have 2 FreeBSD boxes, 9.1 and 7.2. Both don't seem to be relaying
> their
> ports?
> The Linux ones that fail have a port, but apparently can't be
> contacted. Or
> is 'port' in the log not really referring to a port number?
>
> The OpenVPN 'server' is another Linux VM - but on a different host
> than the
> working Linux VM :P None of the Linux VMs on that host can get a
> lock.
> How's THAT for weird? :)
>
> I need some aspirin.
>
Maybe someone else can suggest something that will help. If your
machines/VMs have multiple IP host addresses, the "-h" option can
be useful on all these daemons.
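
On the question of whether 'port' in the log is really a port number:
28416 is 0x6F00, which is 111 (0x006F, the usual rpcbind port) with its
two bytes swapped. So the Linux-side line may just be the rpcbind port
printed in network byte order on a little-endian machine. That is an
inference from the arithmetic, not something checked against the NLM
source, and it says nothing about why the FreeBSD side shows port 0.
The swap can be checked like this (the helper name is made up):

```python
def swap16(value):
    """Swap the two bytes of a 16-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")

# 28416 == 0x6F00; swapping the bytes gives 0x006F == 111.
print(hex(28416), swap16(28416))
```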

I'll also mention that, although I don't know the NSM (rpc.statd)
well, my basic understanding of it is that it "pings" other hosts
to see if they are "up" (sometimes using IP broadcast, I think?).
If it doesn't get a response, it marks the host "down" and then
rpc.lockd doesn't use that host.

I've never trusted rpc.lockd and friends, but others have found
them useful. I believe a reliable LAN/WAN with support for IP
broadcast and no port# blocking is needed for it to have any chance
of working reliably.
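
The "no port# blocking" part can at least be spot-checked from each
client. A minimal sketch, assuming TCP and the standard ports 111
(rpcbind) and 2049 (nfsd); it does not test UDP or broadcast, and the
ports used by rpc.lockd/rpc.statd vary unless they are pinned with the
daemons' command line options. The function name is made up.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False

# Hypothetical usage, run from a client across the VPN:
#   for port in (111, 2049):
#       print(port, tcp_port_open("nl101vpn", port))
```

A True result only shows that the port is reachable over TCP, not that
the RPC service behind it answers correctly.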

rick

> Rick
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"


