Date: Sat, 06 Feb 2010 21:33:23 -0800
From: Julian Elischer <julian@elischer.org>
To: "M. Warner Losh" <imp@bsdimp.com>
Cc: net@freebsd.org
Subject: Re: How does rpc.lockd know where to send a request
Message-ID: <4B6E50A3.6080804@elischer.org>
In-Reply-To: <20100206.212042.925196285631243946.imp@bsdimp.com>
References: <20100206.191153.401093655925072575.imp@bsdimp.com>
 <4B6E2B40.1070405@elischer.org>
 <20100206.212042.925196285631243946.imp@bsdimp.com>

M. Warner Losh wrote:
> In message: <4B6E2B40.1070405@elischer.org>
>             Julian Elischer <julian@elischer.org> writes:
> : M. Warner Losh wrote:
> : > I have a problem. All systems are running freebsd-current from
> : > sometime in the last month, although similar systems running
> : > 8.0-RELEASE exhibit exactly the same problem. rpc.lockd on an NFS
> : > client is doing something that baffles my mind entirely, maybe you can
> : > help. Please bear with me, this is a little complicated, but I wanted
> : > to include all the details.
> : > I have a host, let's call it dune. dune is at 10.0.0.5. dune is also
> : > the master for the carp interface 10.0.0.99. It is running rpc.lockd
> : > and is an nfs server. I've told nfs, rpcbind, lockd and statd to only
> : > listen on address 10.0.0.99.
> : > I have a second host. maud-dib is 10.0.0.8. I do "mount
> : > 10.0.0.99:/dune /dune" on maud-dib. Wireshark shows all the traffic
> : > going to 10.0.0.99. All is happy in the world. When I start, there's
> : > no ARP entry for 10.0.0.5 on 10.0.0.8, nor is there after the mount.
> : > Until I do the following 'lockf /dune/imp/junk ls' (I have write perms
> : > to /dune/imp). At this point, rpc.lockd hangs. I get the message
> : > "10.0.0.99:/dune: lockd not responding" which seems odd. lockd is
> : > really there. However, wireshark shows the NLM traffic going to IP
> : > address 10.0.0.5. maud-dib has no carp interfaces.
> : > That's odd. So my question is 'how does lockd know where to go to
> : > talk the NLM protocol?'
> : >
> :
> : my recollection is that maud-dib will send an initial packet to dune
> : and dune will respond but that the response may come from 10.0.0.5,
> : after which maud-dib will redirect all requests there, which will not
> : work because dune is not listening there.
> :
> : the problem is that dune's daemon is setting a local address of
> : INADDR_ANY (0.0.0.0) which tells the packets to use a from
> : address that is the address of the interface that they exit from.
> :
> : Since 10.0.0.5 is the primary address on that interface, that gets
> : selected.
> : you may try some trickery where you add the .5 address AFTER the .99
> : address so that the .99 is the primary address.
>
> Actually, it looks like this is getting returned, as an ASCII string
> '10.0.0.5' in frame 68 in response to the GETADDR call. Since I've
> told it specifically '-h 10.0.0.99' I'd have thought it would respect
> that. Since it is supposed to be bound to 10.0.0.99, I'd proffer the
> argument this is a bug in rpcbind's implementation of GETADDR.
>
> I never would have thought it would have been returned as an ASCII
> string, but you live and learn, eh?
>
> Now, on to fixing the bug.
>
> Warner
>
> P.S. http://people.freebsd.org/~imp/wireshark.dat has the trace I'm
> referring to (and I've posted it in another message on this thread).
>
> : > I did a packet capture from before I did the mount on maud-dib. I can
> : > see the NFS mount, the NFS traffic, all to 10.0.0.99. I then see an
> : > ARP for 10.0.0.5, followed by the NLM request from 10.0.0.8 to
> : > 10.0.0.5. This gets an ICMP port unreachable message, since I told
> : > nfs, et al, to bind only to 10.0.0.99.
> : > So, I thought, 'the answer is obvious, I'll just look for the packet
> : > that has the string 'dune' in it (which is the hostname of 10.0.0.5).
> : > No packets have that string in it, other than the mount packet which
> : > has /dune in it. Nor is there any DNS activity doing a lookup. Nor
> : > is there any static mapping in /etc/hosts on 10.0.0.8.
> : > Next thought: Oh, somebody like portmapper or the NFS protocol from
> : > 10.0.0.99 is telling 10.0.0.8's rpc.lockd (or something else) to do
> : > locking requests to 10.0.0.5. That's trivial to find, I think to
> : > myself. I'll look for the octets 0a 00 00 05 (hex). The only
> : > instances of that are in the ARP packet, the NLM request and the ICMP
> : > unreachable packets. No other packets include these bytes. Nor do
> : > any include the reverse.
> : > Right after the mount, there's nothing in the connection table that
> : > points to 10.0.0.5, only 10.0.0.99.
> : > So I'm having a serious WTF moment. How the heck is this even
> : > possible? Any ideas on where to look for where this gets set and/or
> : > communicated?
> : > thanks a bunch for any insight that you can give...
> : > Warner
> : > _______________________________________________
> : > freebsd-net@freebsd.org mailing list
> : > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> : > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> :
> : try swapping the addresses on the interface.
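
The GETADDR observation quoted above explains why the search for the
octets 0a 00 00 05 found nothing: rpcbind hands back the address as text,
not as four binary bytes. The sketch below illustrates that with the
IPv4 "universal address" form h1.h2.h3.h4.p1.p2, where the port is
p1*256 + p2. It is only an illustration, under these assumptions: the
function names are made up for this example, the port 1011 is
hypothetical and not taken from the trace, and the XDR length prefix and
padding that wrap the string in a real reply are left out.

```python
import socket


def hostport_to_uaddr(host, port):
    """Build an IPv4 universal address string 'h1.h2.h3.h4.p1.p2'."""
    socket.inet_aton(host)  # raises OSError if host is not a valid IPv4 address
    if not 0 <= port <= 65535:
        raise ValueError("port out of range: %r" % (port,))
    return "%s.%d.%d" % (host, port >> 8, port & 0xFF)


def uaddr_to_hostport(uaddr):
    """Split 'h1.h2.h3.h4.p1.p2' back into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("not an IPv4 universal address: %r" % (uaddr,))
    host = ".".join(parts[:4])
    socket.inet_aton(host)  # validate the host part
    p1, p2 = int(parts[4]), int(parts[5])
    if not (0 <= p1 <= 255 and 0 <= p2 <= 255):
        raise ValueError("port octets out of range: %r" % (uaddr,))
    return host, p1 * 256 + p2


# Hypothetical reply text: dune's primary address plus a made-up lockd port.
reply_text = hostport_to_uaddr("10.0.0.5", 1011).encode("ascii")

# What a binary search for the address looks for: 0a 00 00 05.
raw_octets = socket.inet_aton("10.0.0.5")

print(reply_text)                  # the address as it travels in the reply
print(raw_octets in reply_text)    # binary pattern is absent from the text
print(b"10.0.0.5" in reply_text)   # ASCII pattern is present
print(uaddr_to_hostport(reply_text.decode("ascii")))
```

Searching a capture for the ASCII bytes of "10.0.0.5" rather than the
binary octets is what would have pointed at the GETADDR reply directly.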