Date: Wed, 20 Dec 1995 18:16:16 -0700
From: Nate Williams <nate@rocky.sri.MT.net>
To: hackers@FreeBSD.org, isp@FreeBSD.org
Subject: BSD networking code guru needed?
Message-ID: <199512210116.SAA01444@rocky.sri.MT.net>
I'm seeing a fairly significant bug in FreeBSD's networking code with regard to routing and arp, and I'm looking for someone who can help me figure this out. I've asked Garrett to help, but he has been too busy, so I'm now looking for someone else who is familiar with the networking code in FreeBSD. I can reproduce the problem at will and can tell whoever helps me how to re-create it.

I'm surprised that none of the ISPs have seen this, but I suspect they aren't using proxy-arp, or aren't seeing folks reconnect from broken PPP connections as fast as we do.

Basically, I'm using proxy-arp to set up routing for a couple of 'portable' computers which can exist either on our local network or at home. I'm using the same IP address for both locations, and I'm using proxy-arp to allow the machines to sit behind our PPP server (a FreeBSD box).

When the line goes down while the remote machines are talking with a machine on our local network, the PPP process (correctly) removes the proxy-arp entry from the PPP server. However, packets are still being sent to that box from other machines on our network, which causes the server box to send out an arp request onto the ethernet and add an incomplete arp entry and route to the ethernet.

This is acceptable *except* when the remote box reconnects to the server, which causes PPP to proxy-arp again for the remote box. What *should* happen is that the incomplete arp entry and route are removed from the tables and replaced with the now-valid proxy-arp entry. What actually happens is that the proxy-arp entry is added to the table *after* the incomplete arp entry, so the server machine doesn't know to route traffic to the remote machine via the PPP link. PPP is doing the correct thing, and the remote machine is sending data to the server, but the server doesn't know the correct route to get back to it, since it assumes the route is via the ethernet.
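The ordering problem described above can be illustrated with a toy model. This is only a sketch in Python, not the FreeBSD kernel code: it assumes the table behaves like an ordered list searched first-match, which is inferred from the description in this message rather than from the source. The names `ArpEntry`, `lookup`, `add_observed` and `add_expected`, and the `via` labels, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArpEntry:
    ip: str
    mac: Optional[str]   # None models an "(incomplete)" entry
    via: str             # hypothetical label: "ethernet" or "ppp"
    proxy: bool = False  # True models a published proxy-arp entry


def lookup(table: List[ArpEntry], ip: str) -> Optional[ArpEntry]:
    """Return the first entry matching ip (assumed first-match search)."""
    for entry in table:
        if entry.ip == ip:
            return entry
    return None


def add_observed(table: List[ArpEntry], entry: ArpEntry) -> None:
    """Model of the reported behaviour: the new entry lands after stale ones."""
    table.append(entry)


def add_expected(table: List[ArpEntry], entry: ArpEntry) -> None:
    """Model of the desired behaviour: drop incomplete entries for the same
    address before installing the new one."""
    table[:] = [e for e in table
                if not (e.ip == entry.ip and e.mac is None)]
    table.append(entry)


if __name__ == "__main__":
    ip = "204.182.243.100"
    proxy = ArpEntry(ip, "0:80:48:e8:27:63", "ppp", proxy=True)

    observed = [ArpEntry(ip, None, "ethernet")]  # stale incomplete entry
    add_observed(observed, proxy)
    print("observed:", lookup(observed, ip))

    expected = [ArpEntry(ip, None, "ethernet")]
    add_expected(expected, proxy)
    print("expected:", lookup(expected, ip))
```

In this model, `add_observed` leaves the incomplete ethernet entry in front, so the lookup never reaches the proxy-arp entry; `add_expected` removes the incomplete entry first, so the lookup returns the PPP entry.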
This problem occurs with normal arp as well as proxy-arp, so you can have up to three arp entries for a single IP address in the arp table. Here's what happens on my server box right now:

ws1.sri.MT.net (204.182.243.100) at (incomplete)
ws1.sri.MT.net (204.182.243.100) at 0:80:48:e8:27:63 permanent published
ws1.sri.MT.net (204.182.243.100) at 0:80:48:e8:27:63 permanent published (proxy only)

Fun, huh?

I've got kernel dumps where the bogosity is occurring, back-traces, all sorts of programs to trigger the bug, and more information than you'll ever want to describe the problem, but I'm beating my head against the wall trying to figure out the code flow, so I'm appealing to the BSD gurus to help.

This problem won't occur if the arp entries time out on both the remote host and the server box. If that happens, then the proxy-arp entry which gets added by PPPD is the first in the arp table, and routing is correct until the line goes down.

I've checked, and neither SunOS 4.1 nor Solaris 2.4 has this bug. I don't have root access on any other OSes to test this out.

Please help!

Nate