Date: Fri, 27 Jan 2006 14:34:54 -0500
From: Kurt Miller <lists@intricatesoftware.com>
To: freebsd-hackers@freebsd.org
Cc: Daniel Eischen <deischen@freebsd.org>
Subject: Re: read hang on datagram socket
Message-ID: <200601271434.54776.lists@intricatesoftware.com>
In-Reply-To: <200601271042.04315.lists@intricatesoftware.com>
References: <Pine.GSO.4.43.0601270909190.10667-100000@sea.ntplx.net> <200601271042.04315.lists@intricatesoftware.com>

On Friday 27 January 2006 10:42 am, Kurt Miller wrote:
> On Friday 27 January 2006 9:16 am, Daniel Eischen wrote:
> > On Thu, 26 Jan 2006, Kurt Miller wrote:
> >
> > > On Thursday 26 January 2006 7:26 pm, Daniel Eischen wrote:
> > > >
> > > > The modified version does not hang on 5.2.  Do you have multiple
> > > > interfaces on your 5.4 box?
> > >
> > > No, the 5.4 box is virtually identical to the 6.0 box. I set them both
> > > up at the same time from initial installs for the project.
> > >
> > > truk@freebsd5-4$ ifconfig
> > > lnc0: flags=108843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> > >         inet6 fe80::250:56ff:fe40:451a%lnc0 prefixlen 64 scopeid 0x1
> > >         inet 172.16.1.36 netmask 0xffffff00 broadcast 172.16.1.255
> > >         ether 00:50:56:40:45:1a
> > > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
> > >         inet 127.0.0.1 netmask 0xff000000
> > >         inet6 ::1 prefixlen 128
> > >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
> >
> > [ ... ]
> >
> > > > What happens when you try using non-zero IP addresses and ports?
> > > >
> > >
> > > Setting the ports doesn't affect the problem, however setting the
> > > addresses does. It really seems like binding to INADDR_ANY only binds
> > > to loopback address 127.0.0.1 and not all the interfaces.
> > >
> > > If sock1 is bound to the hostAddress and sock2 connects to sock1 at
> > > the hostAddress it works ok. If sock1 is bound to INADDR_ANY and sock2
> > > connects to sock1 using INADDR_ANY it works, but any mixture of
> > > using INADDR_ANY with the hostAddress fails.
> >
> > According to Stevens' Network Programming, when binding to
> > INADDR_ANY, the operating system doesn't assign an address
> > until the first write.  This is unlike the port, where using
> > port 0, an ephemeral port is assigned right away.  I don't
> > have the book handy right now, so I forgot if the INADDR_ANY
> > behavior is only when you have multiple interfaces or not.
>
> The book I'm using is not that clear about it (Advanced
> Programming in the UNIX Environment). It does say that using
> connect with a datagram socket will receive datagrams only
> from the address specified, which seems related to the problem.
>
> > > Unfortunately, I don't have control over the addresses, the java
> > > programs do. This particular jck test binds the first socket with
> > > INADDR_ANY (InetAddress.getByName("0.0.0.0")) and connects the second
> > > socket to the first using the hostAddress (InetAddress.getLocalHost()).
> >
> > You can try sending a byte before getting the address for the
> > port and see if that works.  Do you have anything weird, like
> > not having a default route (gateway)?
>
> Yes, sending a byte before doing the connect(sock2, &sock1Addres
> does work. However, calling connect/send/read after that fails too.
> The problem appears to be related to sock1's selection of its source
> address, or perhaps the connect call is ignoring the hostAddress and
> using the loopback address. The netstat output leads me to believe
> it is the latter. It is behaving like a mismatch between the source
> address of the message and the address enforced by the connect call.
>
> I've confirmed that the sock1Addr struct is filled in correctly with
> the hostAddress and port of sock1 and sock2Addr is filled in correctly
> with the hostAddress and port of sock2.
>
> I've got a pretty standard setup. All three machines are using DHCP
> to get their addresses, default route and name servers. I've set
> the dhcp server to give them the same IP addresses each time. Here's
> the routing table for each:
>
> truk@freebsd6-0$ netstat -r -f inet
> Routing tables
>
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            172.16.1.1         UGS         0       34   lnc0
> localhost          localhost          UH          0        0    lo0
> 172.16.1/24        link#1             UC          0        0   lnc0
> 172.16.1.1         00:00:24:c2:47:b5  UHLW        2        0   lnc0    671
> 172.16.1.10        00:13:46:c9:0a:5c  UHLW        1        0   lnc0   1103
> 172.16.1.72        00:12:f0:b5:f4:6c  UHLW        1      118   lnc0    961
>
> truk@freebsd5-4$ netstat -r -f inet
> Routing tables
>
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            172.16.1.1         UGS         0      112   lnc0
> localhost          localhost          UH          1       19    lo0
> 172.16.1/24        link#1             UC          0        0   lnc0
> 172.16.1.1         00:00:24:c2:47:b5  UHLW        1        0   lnc0     40
> 172.16.1.10        00:13:46:c9:0a:5c  UHLW        0        0   lnc0   1106
> 172.16.1.36        localhost          UGHS        0        7    lo0
> 172.16.1.72        00:12:f0:b5:f4:6c  UHLW        0     3151   lnc0    749
>
> $ netstat -r -f inet #freebsd4-11
> Routing tables
>
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            172.16.1.1         UGSc        1        0   lnc0
> localhost          localhost          UH          1        0    lo0
> 172.16.1/24        link#1             UC          2        0   lnc0
> 172.16.1.1         00:00:24:c2:47:b5  UHLW        2        0   lnc0   1200
> 172.16.1.30        localhost          UGHS        0        0    lo0
> 172.16.1.72        00:12:f0:b5:f4:6c  UHLW        1      112   lnc0   1195
>
> Thanks for the ideas and suggestions.

The problem turned out to be related to how dhcp sets up the
routing table. Switching to a fixed address setup adjusted
the routing table and now the program works. Go figure.

Here's the routing table when using a fixed address:

Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            172.16.1.1         UGS         0       48   lnc0
localhost          localhost          UH          0        0    lo0
172.16.1/24        link#1             UC          0        0   lnc0
172.16.1.1         00:00:24:c2:47:b5  UHLW        1        0   lnc0   1200
172.16.1.20        00:07:e9:47:0f:f9  UHLW        0        2   lnc0    429
172.16.1.21        00:40:96:39:b6:f9  UHLW        0        3   lnc0   1197
172.16.1.36        00:50:56:40:45:1a  UHLW        0        1    lo0
172.16.1.72        00:12:f0:b5:f4:6c  UHLW        0      637   lnc0   1187

Notice the difference in the gateway for 172.16.1.36.

Thanks for all the help and suggestions.

-Kurt
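
For anyone who wants to check the same thing on their own box, here is a
minimal C sketch of the setup discussed in this thread. It is not the jck
test and not the program I was debugging; udp_probe() is a hypothetical
helper made up for illustration. It binds sock1 to INADDR_ANY on an
ephemeral port, connects sock2 to sock1 at the address you pass in, sends
one byte, and reports the source address that sock1 actually sees. sock1 is
deliberately left unconnected and given a receive timeout, so the call
returns instead of hanging. If the reported source differs from the address
you passed in, that would be consistent with the mismatch guessed at above,
though this sketch does not prove the cause.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Binds sock1 to INADDR_ANY, connects sock2 to sock1 at host_ip,
 * sends one byte, and writes the datagram's source address (as seen
 * by sock1) into src. Returns 0 on success, -1 on error or timeout.
 */
int udp_probe(const char *host_ip, char *src, socklen_t src_len)
{
    struct sockaddr_in a1, dst, from;
    socklen_t len = sizeof(a1), from_len = sizeof(from);
    struct timeval tv = { 2, 0 };   /* give up after 2 seconds */
    char byte = 'x';
    int rc = -1;

    assert(host_ip != NULL && src != NULL);

    int s1 = socket(AF_INET, SOCK_DGRAM, 0);
    int s2 = socket(AF_INET, SOCK_DGRAM, 0);
    if (s1 < 0 || s2 < 0)
        goto out;

    /* sock1: wildcard address, ephemeral port, bounded receive wait. */
    memset(&a1, 0, sizeof(a1));
    a1.sin_family = AF_INET;
    a1.sin_addr.s_addr = htonl(INADDR_ANY);
    a1.sin_port = 0;
    if (bind(s1, (struct sockaddr *)&a1, sizeof(a1)) < 0 ||
        getsockname(s1, (struct sockaddr *)&a1, &len) < 0 ||
        setsockopt(s1, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        goto out;

    /* sock2: connect to sock1's port at the caller-supplied address. */
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = a1.sin_port;
    if (inet_pton(AF_INET, host_ip, &dst.sin_addr) != 1)
        goto out;
    if (connect(s2, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
        send(s2, &byte, 1, 0) != 1)
        goto out;

    /* Receive on sock1 and record where the datagram came from. */
    if (recvfrom(s1, &byte, 1, 0, (struct sockaddr *)&from, &from_len) != 1)
        goto out;
    if (inet_ntop(AF_INET, &from.sin_addr, src, src_len) == NULL)
        goto out;
    rc = 0;
out:
    if (s1 >= 0)
        close(s1);
    if (s2 >= 0)
        close(s2);
    return rc;
}
```

Call it from a small main() with the machine's own address (172.16.1.36 on
my 5.4 box) and a buffer of INET_ADDRSTRLEN bytes, then compare the string
it fills in with the address you passed.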