Date: Mon, 14 Oct 2013 20:04:58 -0500
From: Rick Romero
To: freebsd-fs@freebsd.org
Subject: Re: NFS locks, rpcbind port = 0 failed? - try #2
Message-ID: <20131014200458.Horde.QjqQXEfm9A5k9357kfSZGQ1@beta.vfemail.net>
In-Reply-To: <485946006.40991171.1381788977647.JavaMail.root@uoguelph.ca>
References: <485946006.40991171.1381788977647.JavaMail.root@uoguelph.ca>

Quoting Rick Macklem:

> Rick Romero wrote:
>> This is a continuation of "9.1 VM nfs3 & locks over VPN" from
>> freebsd-questions - trying a different angle maybe it'll jostle
>> someones memory.  Don't mean to cross-post, but as I pay more
>> attention to the lists I'm reading, this seems to be the better
>> list for NFS issues.
>>
>> I have a FreeBSD 9.2 VM at an offsite hosting company.  hostname
>> nl101vpn
>> OpenVPN is installed on it, routed not bridged mode.
>> I have multiple OSs installed on local network. I'm already
>> exportings NFS off 9.1 with working file locks.
>>
>> What I see -
>> export nfsv3 or nfsv4 from nl101vpn, mount on local FreeBSD or
>> Linux - locks do not work.
>> export nfsv3 from any local system, mount on nl101vpn - locks work.
>> export nfsv3 from locally installed VM, mount on any local host or
>> nl101vpn - locks work.  No OpenVPN installed on it though. This was
>> to test if virtio drivers might be causing the problem.
>>
>> I even ran a tcpdump to see if something was getting lost - both
>> sides match, nothing is getting dropped
>>
>> nl101vpn - /var/log/messages:
>> Oct 14 12:21:01 nl101 kernel: NLM: failed to contact remote rpcbind, stat = 0, port = 0  (why port 0?)
>> Oct 14 12:23:02 nl101 last message repeated 109 times
>> Oct 14 12:25:48 nl101 last message repeated 177 times
>>
>> I tried binding rpcbind to the VPN interface, but that doesn't seem
>> to work.  tcpdump shows no packets trying to leave the 'Internet'
>> interface.
>>
>> So I haven't exhausted every combination, or completely 100%
>> replicated whats happening offsite, but it's getting pretty
>> ridiculous now... I'm lost, and I need NFS locking to work.
>> Help :)
>
> For rpcbind to work, IP broadcast needs to work between the hosts
> and I suspect that the VPN doesn't support that.
>
> Without rpcbind, I don't think you can get rpc.lockd/rpc.statd
> to work, but I am not sure. (There are command line options for
> these daemons that allow you to set specific port #s, but I don't
> think that will fix the problem, since they still need rpcbind to
> tell them the port# for the remote machines.) These protocols were
> designed in the 1980s for use on a LAN.
>
> Now, nfsv4 shouldn't care less about rpcbind, rpc.lockd. NFSv4 locking
> is handled as a part of the NFSv4 protocol and always uses port #2049.
> I'd suggest you try NFSv4 again and make sure it is using NFSv4 and
> the mount has not fallen back to NFSv3. (For FreeBSD, specify "nfsv4"
> as a mount option. For Linux, specify "vers=4" as a mount option.)
> You can check what the mount is actually using via "nfsstat -m".
> If you assumed the locking for NFSv4 wasn't working because of these
> messages, that isn't the case. If you are using NFSv4 for all mounts,
> you don't need to run rpc.lockd at all (at least for FreeBSD, I'm
> not sure what the daemons do w.r.t. Linux).

Hi Rick,

Yeah - I thought the VPN might pose a problem, but I can get locks from
the VM side (nl101vpn) via NFSv3 back to the main site.  So it doesn't
seem to be an issue with the VPN.

After that I created a local VM to ensure it wasn't a virtio thing, and
then upgraded the remote VM to 9.2 (to rule out any funky custom options
the host may have thrown into their 9.1 installer).  Nada.

So I'm re-trying with NFSv4 - though my mount does show it was mounted
nfs4 (from Linux) last time I tried:

nl101vpn:/first on /mnt type nfs4 (rw,relatime,vers=4,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.16.1.92,minorversion=0,local_lock=none,addr=10.9.8.6)

After trying again, it still doesn't work.  Though now I noticed the
error is different.  I have a little perl script that I test with, and
the line is:

flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file: $!\n";

Before, I would get 'permission denied' - which (IIRC) would also happen
when I forgot to run lockd or statd.  Now with NFSv4 it says 'Bad file
descriptor'.

After much more testing I've gotten a single Linux VM to get a lock (on
NFSv3), but no other VMs on the same host or on my other host, even
though they're all from the same template.

Failed locks show in the logs:

NLM: failed to contact remote rpcbind, stat = 0, port = 0    (FreeBSD)
NLM: failed to contact remote rpcbind, stat = 7, port = 28416  (Linux)

I have 2 FreeBSD boxes, 9.1 and 7.2.  Neither seems to be relaying its
port?  The Linux ones that fail have a port, but apparently can't be
contacted.  Or is 'port' in the log not really referring to a port
number?

The OpenVPN 'server' is another Linux VM - but on a different host than
the working Linux VM :P  None of the Linux VMs on that host can get a
lock.  How's THAT for weird? :)

I need some aspirin.

Rick
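
P.S. For anyone wanting to reproduce the test: a minimal sketch of a lock check along the lines of the one-line perl test above, written in Python rather than perl. The function name try_shared_lock and the lock file path are made up for illustration and are not from the original script. It only reports which errno the shared flock() fails with, so that 'Permission denied' (EACCES) and 'Bad file descriptor' (EBADF) can be told apart; it does not diagnose the cause.

```python
import errno
import fcntl
import os


def try_shared_lock(path):
    """Try a non-blocking shared flock() on path.

    Returns (True, None) on success, or (False, errno_name) on failure.
    try_shared_lock is a hypothetical helper, not part of any library.
    """
    # Open read-write so the descriptor is usable for either lock type.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except OSError as exc:
        # For example EACCES ("Permission denied") or
        # EBADF ("Bad file descriptor").
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    else:
        # Release the lock again; this is only a probe.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True, None
    finally:
        os.close(fd)
```

Run against a file on the NFS mount, for example try_shared_lock("/mnt/lockfile") (the file name is an assumption), and compare the result with the same call on a local file system.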