Date: Fri, 07 May 2004 23:12:53 +0200
From: Andre Oppermann <andre@freebsd.org>
To: Mike Tancsa <mike@sentex.net>
Cc: freebsd-current@freebsd.org
Subject: Re: routing bug?
Message-ID: <409BFBD5.5050101@freebsd.org>
In-Reply-To: <6.0.3.0.0.20040507151017.08816e60@64.7.153.2>
References: <6.0.3.0.0.20040507151017.08816e60@64.7.153.2>

Mike Tancsa wrote:
> A follow up to the post below on stable. We tried the same program on
> OpenBSD, and it does the same thing as RELENG_4. However, HEAD as of
> yesterday and 5.2.1 do not show the same behaviour as RELENG_4. Does
> this ring a bell with anyone ? Ideally, we would like to see the same
> behaviour on RELENG_4 as it causes major grief for us with raccoon when
> dynamic interfaces come and go.

In -current protocol cloning is gone and pointers to an rtentry are no
longer stored in the inpcb. This causes a route lookup to be done for
every packet that goes out.

In RELENG_4 the same happens only if you use a UDP socket that is not
bound to a specific address. It will do a route lookup every time a UDP
packet is sent to determine the source address, and thus it will pick up
any changes that happen in the routing table.

However, if you bind the UDP socket to a specific address, a pointer to
the rtentry that is valid at the time of the bind is stored. This is
only revisited when the rtentry the pointer points to is deleted. Then
the rtentry pointer is set to NULL, and with the next packet sent the
route is looked up afresh and stored again.

The problem you have is that it will never switch from an existing,
working route to a more specific one because it never tries to find one.
The only fix with the code in RELENG_4 is to delete the default route,
send a packet and reinsert the default route. Reexamining all inpcb
rtentry pointers when a new route is installed is a very expensive
operation because it requires many table walks through all inpcbs.

In -current this rtentry caching has been removed because of the
complexity it added to the network stack and to the fine-grained locking
needed for efficient SMP. While there is certainly some cost to doing a
route lookup for every packet sent, this is nothing extraordinary: it
happens every time the machine forwards a packet as a router, and it is
rather efficient. That is not to say it is perfect, but it sucks a lot
less than before.

Work is underway to allow for much better optimized pluggable routing
tables in the network stack, especially for IPv4.

I hope this explains the effect you are seeing. A backport or MFC of the
protocol cloning removal is not possible: it is a rather extensive
change to the network stack, which is outside the RELENG_4 scope.

There is one easy hack you can do for UDP sockets: simply RTFREE() the
rtentry pointed to by the inpcb early on in udp_output() so that a new
one is reacquired later on. If the pointer is NULL the code will simply
redo the routing table lookup and store a new pointer to the rtentry.
With this, every time you send a UDP packet you'll get the correct
route.

A slightly more extensive hack would avoid storing the rtentry pointer
in the inpcb at all. I once did this on a RELENG_4 machine without any
ill effects (if done correctly).

The same applies to OpenBSD and NetBSD.

-- 
Andre

> ---Mike
>
> At 09:59 AM 07/05/2004, Gabor wrote:
>
>> I am experiencing some weird routing phenomena.
>> When I open a UDP socket and send datagrams to an address(172.30.1.1)
>> and then remove that route(route delete 172.30.1.1) then my packets
>> switch from going out the route specific interface(rl0) to going out
>> the default interface(fxp0). This is as expected. Then I add back
>> the route (route add 172.30.1.1 10.0.2.2) and the packets swing back
>> to the route specific interface(rl0). However, if I bind my socket to
>> a source address(172.16.24.33), when I remove the route and then add
>> it back, the packets continue to go out the default interface(fxp0)
>> instead of going out the route specific interface(rl0).
>>
>> This is on 4.9 STABLE. I was unable to reproduce this on 5.2.1. What
>> changes have there been that haven't been MFC'ed?
>>
>> =0= udp-test # netstat -nr
>> Routing tables
>>
>> Internet:
>> Destination        Gateway            Flags    Refs      Use  Netif Expire
>> default            192.168.43.1       UGSc        2     2440   fxp0
>> 10.0.2/24          link#1             UC          2        0    rl0
>> 10.0.2.1           00:50:fc:32:52:a7  UHLW        0        2    lo0
>> 10.0.2.2           00:0e:0c:05:09:19  UHLW        1        3    rl0     69
>> 172.0.0.1          172.0.0.1          UH          0        0    lo0
>> 172.30.1.1         10.0.2.2           UGHS        0        0    rl0
>> 192.168.7          link#1             UC          0        0    rl0
>> 192.168.43         link#2             UC          4        0   fxp0
>> 192.168.43.1       00:50:bf:33:63:70  UHLW        4    56111   fxp0   1079
>> 192.168.43.31      link#2             UHLW        1      249   fxp0
>> 192.168.43.157     00:a0:c9:4b:a5:f4  UHLW        6  2168880   fxp0    771
>> 192.168.43.242     link#2             UHLW        1   718878   fxp0
>>
>> =0= udp-test # ifconfig
>> rl0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>         inet 192.168.7.2 netmask 0xffffff00 broadcast 192.168.7.255
>>         inet 10.0.2.1 netmask 0xffffff00 broadcast 10.0.2.255
>>         ether 00:50:fc:32:52:a7
>>         media: Ethernet 10baseT/UTP
>>         status: active
>> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>         inet 192.168.43.26 netmask 0xffffff00 broadcast 192.168.43.255
>>         ether 00:01:80:3d:b4:4f
>>         media: Ethernet autoselect (100baseTX <full-duplex>)
>>         status: active
>> ppp0: flags=8010<POINTOPOINT,MULTICAST> mtu 1500
>> faith0: flags=8002<BROADCAST,MULTICAST> mtu 1500
>> ds0: flags=8008<LOOPBACK,MULTICAST> mtu 65532
>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
>>         inet 172.16.24.33 netmask 0xffff0000
>>         inet 172.0.0.1 netmask 0xffff0000
>> tun0: flags=8010<POINTOPOINT,MULTICAST> mtu 1500
>>
>> =0= udp-test # cat udp-test.c
>> #include <sys/types.h>
>> #include <sys/time.h>
>> #include <sys/socket.h>
>> #include <netinet/in.h>
>> #include <netdb.h>
>> #include <unistd.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <ctype.h>
>>
>> int send_pkt(unsigned char *src, unsigned char *dest, unsigned short port);
>>
>> int
>> main(int argc, char **argv)
>> {
>>     unsigned a, b, c, d, a2, b2, c2, d2;
>>     unsigned port;
>>     unsigned char src[4], dest[4];
>>
>>     if (argc != 3) {
>>         fprintf(stderr,
>>                 "Usage: %s <source> <dest>:<port>\n",
>>                 argv[0]);
>>         return 1;
>>     }
>>
>>     if (sscanf(argv[1], "%u.%u.%u.%u", &a, &b, &c, &d) == 4
>>         && a < 256 && b < 256 && c < 256 && d < 256
>>         && sscanf(argv[2], "%u.%u.%u.%u:%u", &a2, &b2, &c2, &d2, &port) == 5
>>         && a2 < 256 && b2 < 256 && c2 < 256 && d2 < 256 && port < 65536) {
>>         /* OK */
>>         src[0] = a;
>>         src[1] = b;
>>         src[2] = c;
>>         src[3] = d;
>>         dest[0] = a2;
>>         dest[1] = b2;
>>         dest[2] = c2;
>>         dest[3] = d2;
>>         send_pkt(src, dest, port);
>>     }
>>     else {
>>         fprintf(stderr,
>>                 "Usage: %s <source> <dest>:<port>\n",
>>                 argv[0]);
>>         return 1;
>>     }
>>
>>     return 0;
>> }
>>
>> int
>> send_pkt(unsigned char *src, unsigned char *dest, unsigned short port)
>> {
>>     int s, len, cnt, rc, on;
>>     struct protoent *proto;
>>     struct sockaddr_in to, from;
>>     char data[1024];
>>
>>     if (!(proto = getprotobyname("udp"))) {
>>         perror("getprotobyname");
>>         return -1;
>>     }
>>
>>     if ((s = socket(PF_INET, SOCK_DGRAM, proto->p_proto)) < 0) {
>>         perror("socket");
>>         return -1;
>>     }
>>     on = 1;
>>
>>     memset(&from, 0, sizeof from);
>>     from.sin_family = AF_INET;
>>     from.sin_port = htons(0);
>>     memcpy(&from.sin_addr.s_addr, src, 4);
>>     fprintf(stderr,
>>             "bind:%d\n",
>>             bind(s, (struct sockaddr*)&from, sizeof from));
>>
>>     memset(&to, 0, sizeof to);
>>     to.sin_family = AF_INET;
>>     to.sin_port = htons(port);
>>     memcpy(&to.sin_addr.s_addr, dest, 4);
>>     len = 58;
>>     cnt = 0;
>>     while (1) {
>>         memset(data, cnt, len);
>>         rc = sendto(s, data, len, 0, (struct sockaddr*)&to, sizeof to);
>>         if (rc < 0)
>>             perror("");
>>         fprintf(stderr, "%d %d\n", rc, cnt);
>>         sleep(5);
>>         ++cnt;
>>     }
>>     close(s);
>>
>>     return 0;
>> }
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>
>