Date: Thu, 08 May 2003 15:39:08 +0200
From: "Ian Freislich" <ianf@za.uu.net>
To: Lars Köller <Lars.Koeller@Uni-Bielefeld.DE>
Cc: freebsd-questions@freebsd.org
Subject: Re: Please, Urgent: Need ideas/help to solve PR bin/51586
Message-ID: <43122.1052401148@wcom.com>
In-Reply-To: Your message of "Thu, 08 May 2003 13:46:16 +0200." <200305081146.h48BkHP13996@rayadm.hrz.uni-bielefeld.de>
References: <200305081146.h48BkHP13996@rayadm.hrz.uni-bielefeld.de>

Lars wrote:
> > rresvport_af(3) returns this error because I suspect that it thinks
> > this address is already in use, perhaps because the address/port
> > pair is in TIME_WAIT, although I don't have time to test this
> > suspicion and my network programming and protocol experience is not
> > good enough to say this is the case outright without testing.
>
> NO,NO! Netstat says nothing about that. Even I tune msl time to go out
> of TIME_WAIT very fast (only intranet connection on same switch!).
> The ethereal dump in the PR shown, that an initial communication takes
> place, but the final ACK to establish the connection fails!

Interesting. I set up rshd and inetd exactly as you did and ran your
test script, and it broke in almost exactly the same way it did for you:

    while true
    do
        /usr/bin/rsh brane -l ianf pwd; ret=$?
        if [ "$ret" != "0" ]
        then
            echo "Return Code: $ret"
            break
        fi
    done

It loops several hundred times and then immediately prints:

    /usr/home/ianf
    /usr/home/ianf
    /usr/home/ianf
    select: protocol failure in circuit setup
    Return Code: 1

At this point, on the server 'brane', I get the following in
/var/log/messages:

    May  8 14:23:10 brane rshd[16886]: can't get stderr port: Can't assign requested address

This message is logged by rshd when it is unable to open the connection
for stderr back to the originating rsh client.

Have you turned on net.inet.tcp.blackhole=2, which would result in ICMP
port unreachable messages not being sent?

What is the output of 'netstat -anf inet | grep -v TIME_WAIT' on
machine2 after you get the timeout connecting to machine2? Is the
tcp *.514 LISTEN line missing after you get the timeout? What do you
get in your messages file on machine2 (the one running the rsh server)?

I suspect that you're not getting ICMP port unreachable after inetd
silently terminated the shell service because of rshd's exit code, so
your connection timed out.
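The LISTEN check asked about above can be scripted rather than read off by eye. The following is only a sketch: the function name has_shell_listener is made up for illustration, and it assumes the usual netstat -an column layout in which the local address is the fourth field and the connection state is the last field.

```sh
#!/bin/sh
# has_shell_listener: hypothetical helper, not part of any system tool.
# Reads `netstat -anf inet` output on stdin and prints "present" if a
# TCP socket is listening on *.514 (the shell service), else "missing".
# Assumption: local address is field 4 and the state is the last field.
has_shell_listener() {
    awk '
        $1 ~ /^tcp/ && $4 == "*.514" && $NF == "LISTEN" { found = 1 }
        END { if (found) print "present"; else print "missing" }
    '
}

# Usage on the rsh server, after the client has timed out:
#   netstat -anf inet | grep -v TIME_WAIT | has_shell_listener
```

If this prints "missing" right after the timeout, that would be consistent with inetd no longer offering the shell service; "present" would point elsewhere.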
> > (/usr/src/libexec/rshd, apply this, make and make install the patched rshd)
> >
> > --- rshd.c.orig Thu May 8 12:55:46 2003
> > +++ rshd.c Thu May 8 12:43:31 2003
> > @@ -296,7 +296,7 @@
> >                  s = rresvport_af(&lport, af);
> >                  if (s < 0) {
> >                          syslog(LOG_ERR, "can't get stderr port: %m");
> > -                        exit(1);
> > +                        exit(0);
> >                  }
> >                  if (port >= IPPORT_RESERVED ||
> >                      port < IPPORT_RESERVED/2) {
> >
> > I know this is a horrible solution and shouldn't be committed, but
> > at least you have a work-around so you can get your virus scanner
> > farm up in the mean time while someone fixes this propperly.
>
> This dosen't help, cause the port can be reserved by the rshd. The
> problem is the establishing of the connection, so this is not the right
> place in the source.

Which port is reserved by rshd? An incoming connection is established
on 514. rshd reads a port number off that connection and initiates a
connection back to the originator on the specified port. Both of these
connections need to be established for the shell service to come up.

I'm not sure that I trust the tcpdump in your PR, because I tried to
dump the entire run of the script on both my test servers and the two
dumps didn't match, and some sequences were out of order. Only when I
dumped the packets to a file and used tcpdump to read the file did the
dumps from each server match.

Here's a good rsh session:

15:04:31.944902 196.7.162.26.1001 > 196.7.162.25.514: S 242763540:242763540(0) win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 483614 0> (DF)
15:04:31.944965 196.7.162.25.514 > 196.7.162.26.1001: S 1769914383:1769914383(0) ack 242763541 win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 14908587 483614> (DF)
15:04:31.945271 196.7.162.26.1001 > 196.7.162.25.514: . ack 1 win 33304 <nop,nop,timestamp 483614 14908587> (DF)
15:04:31.945572 196.7.162.26.1001 > 196.7.162.25.514: P 1:6(5) ack 1 win 33304 <nop,nop,timestamp 483614 14908587> (DF)
15:04:31.945600 196.7.162.25.514 > 196.7.162.26.1001: . ack 6 win 57915 <nop,nop,timestamp 14908587 483614> (DF)
15:04:31.952264 196.7.162.25.929 > 196.7.162.26.1000: S 206573132:206573132(0) win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 14908588 0> (DF)
15:04:31.952525 196.7.162.26.1000 > 196.7.162.25.929: S 740063972:740063972(0) ack 206573133 win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 483615 14908588> (DF)
15:04:31.952560 196.7.162.25.929 > 196.7.162.26.1000: . ack 1 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.953030 196.7.162.26.1001 > 196.7.162.25.514: P 6:11(5) ack 1 win 33304 <nop,nop,timestamp 483615 14908587> (DF)
15:04:31.953064 196.7.162.25.514 > 196.7.162.26.1001: . ack 11 win 57915 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.953316 196.7.162.26.1001 > 196.7.162.25.514: P 11:20(9) ack 1 win 33304 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.953334 196.7.162.25.514 > 196.7.162.26.1001: . ack 20 win 57911 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.954560 196.7.162.25.514 > 196.7.162.26.1001: P 1:2(1) ack 20 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.954787 196.7.162.26.1001 > 196.7.162.25.514: . ack 2 win 33303 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.958429 196.7.162.25.514 > 196.7.162.26.1001: P 2:17(15) ack 20 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.958516 196.7.162.25.514 > 196.7.162.26.1001: F 17:17(0) ack 20 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.958697 196.7.162.26.1001 > 196.7.162.25.514: . ack 17 win 33296 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.958795 196.7.162.26.1001 > 196.7.162.25.514: . ack 18 win 33296 <nop,nop,timestamp 483615 14908588> (DF)
15:04:31.959146 196.7.162.25.929 > 196.7.162.26.1000: F 1:1(0) ack 1 win 57920 <nop,nop,timestamp 14908588 483615> (DF)
15:04:31.959440 196.7.162.26.1000 > 196.7.162.25.929: . ack 2 win 33304 <nop,nop,timestamp 483616 14908588> (DF)
15:04:31.961198 196.7.162.26.1001 > 196.7.162.25.514: F 20:20(0) ack 18 win 33304 <nop,nop,timestamp 483616 14908588> (DF)
15:04:31.961239 196.7.162.25.514 > 196.7.162.26.1001: . ack 21 win 57920 <nop,nop,timestamp 14908589 483616> (DF)
15:04:31.961303 196.7.162.26.1000 > 196.7.162.25.929: F 1:1(0) ack 2 win 33304 <nop,nop,timestamp 483616 14908588> (DF)
15:04:31.961321 196.7.162.25.929 > 196.7.162.26.1000: . ack 2 win 57919 <nop,nop,timestamp 14908589 483616> (DF)

And here's the last one, which failed:

15:04:31.984458 196.7.162.26.999 > 196.7.162.25.514: S 3911362959:3911362959(0) win 65535 <mss 1460,nop,wscale 1,nop,nop,timestamp 483618 0> (DF)
15:04:31.984514 196.7.162.25.514 > 196.7.162.26.999: S 834974100:834974100(0) ack 3911362960 win 57344 <mss 1460,nop,wscale 0,nop,nop,timestamp 14908591 483618> (DF)
15:04:31.984863 196.7.162.26.999 > 196.7.162.25.514: . ack 1 win 33304 <nop,nop,timestamp 483618 14908591> (DF)
15:04:31.985141 196.7.162.26.999 > 196.7.162.25.514: P 1:5(4) ack 1 win 33304 <nop,nop,timestamp 483618 14908591> (DF)
15:04:31.985165 196.7.162.25.514 > 196.7.162.26.999: . ack 5 win 57916 <nop,nop,timestamp 14908591 483618> (DF)
15:04:31.992888 196.7.162.25.514 > 196.7.162.26.999: F 1:1(0) ack 5 win 57920 <nop,nop,timestamp 14908592 483618> (DF)
15:04:31.993164 196.7.162.26.999 > 196.7.162.25.514: . ack 2 win 33304 <nop,nop,timestamp 483619 14908592> (DF)
15:04:31.993698 196.7.162.26.999 > 196.7.162.25.514: F 5:5(0) ack 2 win 33304 <nop,nop,timestamp 483619 14908592> (DF)
15:04:31.993737 196.7.162.25.514 > 196.7.162.26.999: . ack 6 win 57920 <nop,nop,timestamp 14908592 483619> (DF)

You'll notice the absence of the second SYN from 196.7.162.25 to
196.7.162.26; instead, 196.7.162.25 immediately sends a FIN. It was at
this point that rshd couldn't get the second port and terminated the
connection.
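Spotting the missing second SYN by eye gets tedious over several hundred sessions, so here is a rough way to count them in a saved text dump. It is a sketch under assumptions: the name count_callback_syns is invented for illustration, and the field positions match the tcpdump text format shown in this message (sender in field 2, flags in field 5); other tcpdump versions print a different layout.

```sh
#!/bin/sh
# count_callback_syns: hypothetical helper for illustration only.
# Reads tcpdump text output on stdin and prints how many pure SYNs
# (flag field "S" with no "ack") were sent by the host given as $1.
# In an rsh session these are the server's connections back to the
# client for stderr. Assumes field 2 is "addr.port" of the sender and
# field 5 is the flag field, as in the dumps shown in this message.
count_callback_syns() {
    awk -v srv="$1" '
        index($2, srv ".") == 1 && $5 == "S" && $0 !~ / ack / { n++ }
        END { print n + 0 }
    '
}

# Usage, reading a capture that was first written to a file:
#   tcpdump -n -r rsh.pcap | count_callback_syns 196.7.162.25
```

Comparing this count with the number of client SYNs to port 514 should show how many sessions never got their stderr connection. The capture file name rsh.pcap in the usage line is just a placeholder.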
> However, the mailserver, which calls the rsh client is a solaris
> 8 machine :-(

That's not a problem, because I believe the problem to be in rshd and,
most likely, in libc in rresvport_af(3).

Ian