Date:      Sat, 06 Feb 2010 21:33:23 -0800
From:      Julian Elischer <julian@elischer.org>
To:        "M. Warner Losh" <imp@bsdimp.com>
Cc:        net@freebsd.org
Subject:   Re: How does rpc.lockd know where to send a request
Message-ID:  <4B6E50A3.6080804@elischer.org>
In-Reply-To: <20100206.212042.925196285631243946.imp@bsdimp.com>
References:  <20100206.191153.401093655925072575.imp@bsdimp.com>	<4B6E2B40.1070405@elischer.org> <20100206.212042.925196285631243946.imp@bsdimp.com>

M. Warner Losh wrote:
> In message: <4B6E2B40.1070405@elischer.org>
>             Julian Elischer <julian@elischer.org> writes:
> : M. Warner Losh wrote:
> : > I have a problem.  All systems are running freebsd-current from
> : > sometime in the last month, although similar systems running
> : > 8.0-RELEASE exhibit exactly the same problem.  rpc.lockd on an NFS
> : > client is doing something that baffles my mind entirely, maybe you can
> : > help.  Please bear with me, this is a little complicated, but I wanted
> : > to include all the details.
> : > I have a host, let's call it dune.  dune is at 10.0.0.5.  dune is also
> : > the master for the carp interface 10.0.0.99.  It is running rpc.lockd
> : > and is an nfs server.  I've told nfs, rpcbind, lockd and statd to only
> : > listen on address 10.0.0.99.
> : > I have a second host.  maud-dib is 10.0.0.8.  I do "mount
> : > 10.0.0.99:/dune /dune" on maud-dib.  Wireshark shows all the traffic
> : > going to 10.0.0.99.  All is happy in the world.  When I start, there's
> : > no ARP entry for 10.0.0.5 on 10.0.0.8, nor is there after the mount.
> : > Until I do the following 'lockf /dune/imp/junk ls' (I have write perms
> : > to /dune/imp).  At this point, rpc.lockd hangs.  I get the message
> : > "10.0.0.99:/dune: lockd not responding" which seems odd.  lockd is
> : > really there.  However, wireshark shows the NLM traffic going to IP
> : > address 10.0.0.5.  maud-dib has no carp interfaces.
> : > That's odd.  So my question is 'how does lockd know where to go to
> : > talk the NLM protocol?'
> : > 
> : 
> : my recollection is that maud-dib will send an initial packet to dune
> : and dune will respond but that the response may come from 10.0.0.5,
> : after which maud-dib will redirect all requests there, which will not
> : work because dune is not listening there.
> : 
> : the problem is that dune's daemon is setting a local address of
> : INADDR_ANY (0.0.0.0) which tells the packets to use a from
> : address that is the address of the interface that they exit from.
> : 
> : Since 10.0.0.5 is the primary address on that interface, that gets
> : selected.
> : you may try some trickery where you add the .5 address AFTER the .99
> : address so that the .99 is the primary address.
> 
> Actually, it looks like this is getting returned, as an ASCII string
> '10.0.0.5' in frame 68 in response to the GETADDR call.  Since I've
> told it specifically '-h 10.0.0.99' I'd have thought it would respect
> that.  Since it is supposed to be bound to 10.0.0.99, I'd proffer the
> argument this is a bug in rpcbind's implementation of GETADDR.
> 
> I never would have thought it would have been returned as an ASCII
> string, but you live and learn, eh?
> 
> Now, on to fixing the bug.
> 
> Warner
> 
> P.S. http://people.freebsd.org/~imp/wireshark.dat has the trace I'm
> referring to (and I've posted it in another message on this thread).
> 
> : > I did a packet capture from before I did the mount on maud-dib.  I can
> : > see the NFS mount, the NFS traffic, all to 10.0.0.99.  I then see an
> : > ARP for 10.0.0.5, followed by the NLM request from 10.0.0.8 to
> : > 10.0.0.5.  This gets an ICMP port unreachable message, since I told
> : > nfs, et al, to bind only to 10.0.0.99.
> : > So, I thought, 'the answer is obvious, I'll just look for the packet
> : > that has the string "dune" in it' (which is the hostname of 10.0.0.5).
> : > No packets have that string in it, other than the mount packet which
> : > has /dune in it.  Nor is there any DNS activity doing a lookup.  Nor
> : > is there any static mapping in /etc/hosts on 10.0.0.8.
> : > Next thought: Oh, somebody like portmapper or the NFS protocol from
> : > 10.0.0.99 is telling 10.0.0.8's rpc.lockd (or something else) to do
> : > locking requests to 10.0.0.5.  That's trivial to find, I think to
> : > myself.  I'll look for the octets 0a 00 00 05 (hex).  The only
> : > instances of that are in the ARP packet, the NLM request and the ICMP
> : > unreachable packets.  No other packets include these bytes.  Nor do
> : > any include the reverse.
> : > Right after the mount, there's nothing in the connection table that
> : > points to 10.0.0.5, only 10.0.0.99.
> : > So I'm having a serious WTF moment.  How the heck is this even
> : > possible?  Any ideas on where to look for where this gets set and/or
> : > communicated?
> : > thanks a bunch for any insight that you can give...
> : > Warner
> : > _______________________________________________
> : > freebsd-net@freebsd.org mailing list
> : > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> : > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> : 
> : 
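
On the GETADDR point: the reply carries the service address as a
"universal address" string, not as raw address bytes, which would explain
why a search for the octets 0a 00 00 05 finds nothing outside the ARP, NLM
and ICMP frames.  For UDP and TCP over IPv4 the string is the dotted-quad
address followed by two more decimal fields holding the high and low bytes
of the port.  A minimal Python sketch of decoding one follows; the function
name parse_uaddr and the port fields in the sample string are made up for
illustration and are not taken from the trace.

```python
import socket


def parse_uaddr(uaddr):
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (ip, port)."""
    fields = uaddr.split(".")
    if len(fields) != 6:
        raise ValueError("expected 6 dot-separated fields: %r" % uaddr)
    ip = ".".join(fields[:4])
    socket.inet_aton(ip)  # raises OSError if the address part is malformed
    p1, p2 = int(fields[4]), int(fields[5])
    if not (0 <= p1 <= 255 and 0 <= p2 <= 255):
        raise ValueError("port fields out of range: %r" % uaddr)
    # The port is sent as two decimal bytes, high byte first.
    return ip, p1 * 256 + p2


# Hypothetical reply string; only the address part matches the thread.
sample = "10.0.0.5.3.255"
ip, port = parse_uaddr(sample)

# The packed 4-byte form of the address does not occur in the ASCII form,
# so a byte search for 0a 00 00 05 cannot match the GETADDR reply.
packed = socket.inet_aton(ip)
found = packed in sample.encode("ascii")
print(ip, port, found)
```

Searching the capture for the ASCII text "10.0.0.5" instead of the packed
bytes is what turns up the frame Warner mentions.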


try swapping the addresses on the interface.
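
The difference between a wildcard bind and a bind to one address is easy
to see with a pair of UDP sockets.  This is only a sketch in Python, with
the loopback address standing in for 10.0.0.99 so that it runs on any
machine; the helper name bound_address is made up, and this is not how
rpcbind or rpc.lockd are actually written.

```python
import socket


def bound_address(host):
    """Bind a UDP socket to (host, ephemeral port) and return its local IP."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind((host, 0))  # port 0 lets the kernel pick a free port
        return s.getsockname()[0]
    finally:
        s.close()


# Wildcard bind: no fixed local address, the kernel chooses one per packet.
wildcard = bound_address("0.0.0.0")
# Specific bind: 127.0.0.1 is a stand-in for the carp address.
specific = bound_address("127.0.0.1")
print(wildcard, specific)
```

A socket bound to the wildcard has no fixed source address, so its replies
leave with whatever address the kernel selects for the outgoing interface,
which is the behaviour described above.  A socket bound to one address
always answers from that address.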



