Date:      Wed, 6 Jun 2001 15:32:59 +0300
From:      Ruslan Ermilov <ru@FreeBSD.org>
To:        Garrett Wollman <wollman@FreeBSD.org>
Cc:        net@FreeBSD.org
Subject:   Cached socket routes problem
Message-ID:  <20010606153259.B15851@sunbay.com>

----- Forwarded message from Ruslan Ermilov <ru@FreeBSD.org> -----

Date: Wed, 6 Jun 2001 15:32:05 +0300
From: Ruslan Ermilov <ru@FreeBSD.org>
To: Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/27890: FreeBSD not always seems to take the best route
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010606122904.A81971@curry.mchp.siemens.de>; from andre.albsmeier@mchp.siemens.de on Wed, Jun 06, 2001 at 12:29:04PM +0200

On Wed, Jun 06, 2001 at 12:29:04PM +0200, Andre Albsmeier wrote:
> Thanks for helping...
> 
> On Wed, 06-Jun-2001 at 11:24:19 +0300, Ruslan Ermilov wrote:
> >
> > ...
> > 
> > I can't reproduce this problem on my 4.3-STABLE box.
> > 
> > Yes, the UDP socket has the reference to the protocol-cloned
> > route to the destination host S through the router 1 initially,
> > and UDP packets go through that router.
> > 
> > In my tests, router 1 (192.168.1.1) was the host *not* configured
> > to act as the router, so all "foreign" packets sent to it got
> 
> OK, I have blocked packets coming from C on router 1. So
> I think I got the same config as you.
> 
> 
> > silently ignored.  I used the ports/net/netcat utility to connect
> > to the UDP `echo' port of the destination S (192.168.2.1):
> > 
> > Fig.1: Initial state, before UDP socket is open.
> > 
> > : # netstat -arn
> > : Destination        Gateway            Flags     Refs     Use     Netif Expire
> > : default            192.168.1.1        UGSc        0        2      rl0
> > : 127.0.0.1          127.0.0.1          UH          1        6      lo0
> > : 192.168.1          link#1             UC          3        0      rl0 =>
> > 
> > 
> > Fig.2: We connect(2) UDP socket to the "echo" port on S (192.168.2.1).
> > 
> > : # nc -u 192.168.2.1 echo
> > : ping1
> > : ping2
> > : ping3
> > [...]
> > 
> > As you can see, we receive no echos back.
> 
> OK, same here.
> 
> 
> > Fig.3: Routing table after UDP socket is open.
> > 
> > : # netstat -arn
> > : Destination        Gateway            Flags     Refs     Use     Netif Expire
> > : default            192.168.1.1        UGSc        1        2      rl0
> > : 127.0.0.1          127.0.0.1          UH          1        6      lo0
> > : 192.168.1          link#1             UC          4        0      rl0 =>
> > : 192.168.2.1        192.168.1.1        UGHW        1       14      rl0
> > 
> > The route to S (192.168.2.1) was cloned (W) from the `default' route.
> > refcnt=1 on the 192.168.2.1 route indicates that the UDP socket holds
> > a reference to this route.
> 
> Same here:
> 
> 192.168.2.1       192.168.1.1        UGHW        1      425     fxp0
> 
> 
> > Fig.4: I manually add the route to the 192.168.2 network.
> > 
> > : # route add -net 192.168.2   192.168.1.2 
> > : add net 192.168.2: gateway 192.168.1.2 
> 
> OK, I don't add it manually but wait until routed messages from
> 192.168.1.2 bring it back.
> 
> 
> > 
> > Fig.5: Routing table after the route to the 192.168.2 network was added.
> > 
> > : # netstat -arn
> > : Destination        Gateway            Flags     Refs     Use     Netif Expire
> > : default            192.168.1.1        UGSc        1        2      rl0
> > : 127.0.0.1          127.0.0.1          UH          1        6      lo0
> > : 192.168.1          link#1             UC          4        0      rl0 =>
> > : 192.168.2          192.168.1.2        UGSc        0        0      rl0
> 
> Yup, same here
> 
> 
> > As you can see, the route to the 192.168.2.1 host is deleted from the routing
> > table.  It isn't actually freed yet, as it has a non-zero reference count
> > (the UDP socket still holds a reference to it); instead it gets marked as
> > DOWN, and will be freed and reallocated in ip_output() on the next use.
> > 
> > Fig.6: We continue to send UDP datagrams.
> > 
> > : # nc -u 192.168.2.1 echo (continued)
> > : ping4
> > : ping4
> > : ping5
> > : ping5
> > : ping6
> > : ping6
> > 
> > As you can see, this time we get the echos back.
> 
> Yes, same here :-(
> 
> 
> > Fig.7: Routing table after we sent more UDP datagrams.
> > 
> > : # netstat -arn -finet
> > : Destination        Gateway            Flags     Refs     Use     Netif Expire
> > : default            192.168.1.1        UGSc        0        2      rl0
> > : 127.0.0.1          127.0.0.1          UH          1        6      lo0
> > : 192.168.1          link#1             UC          4        0      rl0 =>
> > : 192.168.2          192.168.1.2        UGSc        1        3      rl0
> > 
> > The refcount on the 192.168.2 route has grown to 1, indicating that the
> > UDP socket now holds a reference to this route.  The `Use' count of 3
> > corresponds to our three UDP datagrams (ping4, ping5, and ping6).
> > 
> > Could you please repeat these steps in your environment, and try to
> > detect where it behaved differently in your case.
> 
> It doesn't behave differently, that's interesting. May I ask you to
> try it using syslogd?
> 
> - Let host C log to host S (with the route installed).
> - Watch C's messages appear on S.
> - Delete C's route to S (via router 2)
> - Let host C log again (run tcpdump on router 1 to see the packets come in)
> - Install the route to S (via router 2) again on C
> - Log more stuff. If you don't see the packets go into router 1 anymore
>   I am really lost...
> 
Yes, I have reproduced the problem here.  My test was missing one step.
OK, now about what happens here.

Initially, there is a route to S (192.168.2.1), cloned from the
network route, through router 2 (192.168.1.2), and the UDP socket
uses it.  When this route and the 192.168.2 network route disappear,
ip_output() detects on the next write (!) that the route to S is
DOWN and "allocates" (caches) another route, which happens to be
the "default" route pointing to router 1 (192.168.1.1).  Later, when
the route to the 192.168.2 network is installed again, it is not
taken into account, because the cached ("default") route is still
UP.

Unfortunately, there is no easy way to fix this.  Checking for
the best-match route on every write may be too time consuming.
As a workaround, you can delete and re-add your "default"
route.  This worked for me here.  `route delete default' removes
the "default" route from the routing table, but because the route
has a refcnt>0 it is not freed immediately; it is only marked
as DOWN.  On this UDP socket's next write, ip_output() will detect
that the cached route is DOWN, free it, and allocate a new
route, which this time will be the route to the 192.168.2 network
through router 2 (192.168.1.2).
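
The sequence above, including the workaround, can be replayed in a
small user-space model.  This is only a sketch under simplifying
assumptions, not the kernel code: the names Route, RoutingTable and
Socket are hypothetical, the socket caches the matching table entry
directly instead of a cloned host route, and send() stands in for the
"is the cached route DOWN?" check done in ip_output().

```python
import ipaddress


class Route:
    """Toy routing entry (hypothetical, not the kernel's rtentry)."""

    def __init__(self, dest, gateway):
        self.net = ipaddress.ip_network(dest)
        self.gateway = gateway
        self.up = True  # cleared when the route is deleted from the table


class RoutingTable:
    def __init__(self):
        self.routes = []

    def add(self, dest, gateway):
        self.routes.append(Route(dest, gateway))

    def delete(self, dest):
        net = ipaddress.ip_network(dest)
        for r in self.routes:
            if r.net == net:
                r.up = False  # anyone still holding a reference sees DOWN
        self.routes = [r for r in self.routes if r.net != net]

    def lookup(self, addr):
        # Best match = longest prefix among the routes containing addr.
        ip = ipaddress.ip_address(addr)
        matches = [r for r in self.routes if ip in r.net]
        return max(matches, key=lambda r: r.net.prefixlen, default=None)


class Socket:
    """Toy connected UDP socket that caches its route."""

    def __init__(self, table, dst):
        self.table, self.dst, self.cached = table, dst, None

    def send(self):
        # A new lookup happens only if there is no cached route or it is DOWN.
        if self.cached is None or not self.cached.up:
            self.cached = self.table.lookup(self.dst)
        return self.cached.gateway if self.cached else None


table = RoutingTable()
table.add("0.0.0.0/0", "192.168.1.1")        # default via router 1
table.add("192.168.2.0/24", "192.168.1.2")   # network route via router 2
sock = Socket(table, "192.168.2.1")

log = [sock.send()]                          # uses router 2
table.delete("192.168.2.0/24")               # network route disappears
log.append(sock.send())                      # falls back to default
table.add("192.168.2.0/24", "192.168.1.2")   # network route comes back
log.append(sock.send())                      # cached default is still UP
table.delete("0.0.0.0/0")                    # workaround: delete default...
table.add("0.0.0.0/0", "192.168.1.1")        # ...and re-add it
log.append(sock.send())                      # cached route is DOWN: re-lookup
print(log)
```

In this model the third send still goes to 192.168.1.1 even though
a better route exists, and only the delete/re-add of the default
route makes the socket pick up 192.168.1.2 again.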

The actual fix would be to notify the protocol (from within the
routing code) whenever its routing table is modified.  The time of
this notification could be saved in a variable as a timestamp,
and every PCB-cached route could carry a similar timestamp
indicating when the "caching" took place.  With that,
ip_output() would "invalidate" a cached route if it was
cached before the last routing table modification.
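
To make the idea concrete, here is a minimal user-space sketch of the
timestamp scheme.  It is an illustration only, with assumptions that
are mine and not taken from any existing code: a plain counter
(`generation') plays the role of the timestamp, and the class and
attribute names are hypothetical.

```python
import ipaddress


class RoutingTable:
    def __init__(self):
        self.routes = {}      # ip_network -> gateway
        self.generation = 0   # stands in for "time of last modification"

    def add(self, dest, gateway):
        self.routes[ipaddress.ip_network(dest)] = gateway
        self.generation += 1  # the "notification" of a table change

    def delete(self, dest):
        del self.routes[ipaddress.ip_network(dest)]
        self.generation += 1

    def lookup(self, addr):
        # Best match = longest prefix among the routes containing addr.
        ip = ipaddress.ip_address(addr)
        matches = [n for n in self.routes if ip in n]
        if not matches:
            return None
        return self.routes[max(matches, key=lambda n: n.prefixlen)]


class Socket:
    """Toy PCB: remembers the gateway and when it was cached."""

    def __init__(self, table, dst):
        self.table, self.dst = table, dst
        self.cached_gw = None
        self.cached_gen = -1  # older than any table state
        self.lookups = 0      # counts full best-match lookups

    def send(self):
        # Invalidate the cache if it predates the last table modification.
        if self.cached_gen < self.table.generation:
            self.cached_gw = self.table.lookup(self.dst)
            self.cached_gen = self.table.generation
            self.lookups += 1
        return self.cached_gw


table = RoutingTable()
table.add("0.0.0.0/0", "192.168.1.1")        # default via router 1
table.add("192.168.2.0/24", "192.168.1.2")   # network route via router 2
sock = Socket(table, "192.168.2.1")

log = [sock.send(), sock.send()]             # second send reuses the cache
table.delete("192.168.2.0/24")
log.append(sock.send())                      # falls back to default
table.add("192.168.2.0/24", "192.168.1.2")
log.append(sock.send())                      # picks up router 2 again
print(log, sock.lookups)
```

The per-write cost is a single integer comparison; a full lookup
is repeated only after the table has actually changed, at the price
of invalidating every cached route on any modification, related to
the destination or not.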

I could probably try to implement this, if no one else can
come up with a better idea.


Cheers,
-- 
Ruslan Ermilov		Oracle Developer/DBA,
ru@sunbay.com		Sunbay Software AG,
ru@FreeBSD.org		FreeBSD committer,
+380.652.512.251	Simferopol, Ukraine

http://www.FreeBSD.org	The Power To Serve
http://www.oracle.com	Enabling The Information Age

----- End forwarded message -----
