From owner-freebsd-net@FreeBSD.ORG Thu May  8 06:39:23 2003
From: "Ian Freislich"
Sender: ianf@wcom.com
To: Lars Köller
cc: freebsd-net@freebsd.org
cc: freebsd-questions@freebsd.org
Subject: Re: Please, Urgent: Need ideas/help to solve PR bin/51586
Date: Thu, 08 May 2003 15:39:08 +0200
Message-ID: <43122.1052401148@wcom.com>
In-reply-to: Your message of "Thu, 08 May 2003 13:46:16 +0200." <200305081146.h48BkHP13996@rayadm.hrz.uni-bielefeld.de>
References: <200305081146.h48BkHP13996@rayadm.hrz.uni-bielefeld.de>

Lars wrote:
> > rresvport_af(3) returns this error because I suspect that it thinks
> > this address is already in use, perhaps because the address/port
> > pair is in TIME_WAIT, although I don't have time to test this
> > suspicion and my network programming and protocol experience is not
> > good enough to say this is the case outright without testing.
>
> NO,NO! Netstat says nothing about that. Even I tune msl time to go out
> of TIME_WAIT very fast (only intranet connection on same switch!).
> The ethereal dump in the PR shown, that an initial communication takes
> place, but the final ACK to establish the connection fails!

Interesting. I set up rshd and inetd exactly like you did and ran your
test script, and it broke in almost exactly the same way it did for you:

    while true
    do
            /usr/bin/rsh brane -l ianf pwd; ret=$?
            if [ "$ret" != "0" ]
            then
                    echo "Return Code: $ret"
                    break
            fi
    done

It loops several hundred times and then immediately prints:

    /usr/home/ianf
    /usr/home/ianf
    /usr/home/ianf
    select: protocol failure in circuit setup
    Return Code: 1

At this point on the server 'brane' I get the following in
/var/log/messages:

    May  8 14:23:10 brane rshd[16886]: can't get stderr port: Can't assign requested address

This message is logged by rshd when it is unable to open the connection
for stderr back to the originating rsh client.

Have you turned on net.inet.tcp.blackhole=2 which would result in ICMP
port unreachable messages not being sent?
What is the output of 'netstat -anf inet | grep -v TIME_WAIT' on machine2
after you get the timeout connecting to machine2? Is the tcp *.514
LISTEN line missing after you get the timeout? What do you get in your
messages file on machine2 (the one running the rsh server)?

I suspect that you're not getting ICMP port unreachable after inetd
silently terminated the shell service because of rshd's exit code, so
your connection timed out.

> > (/usr/src/libexec/rshd, apply this, make and make install the patched rshd)
> > --- rshd.c.orig Thu May  8 12:55:46 2003
> > +++ rshd.c      Thu May  8 12:43:31 2003
> > @@ -296,7 +296,7 @@
> >                         s = rresvport_af(&lport, af);
> >                         if (s < 0) {
> >                                 syslog(LOG_ERR, "can't get stderr port: %m");
> > -                               exit(1);
> > +                               exit(0);
> >                         }
> >                         if (port >= IPPORT_RESERVED ||
> >                             port < IPPORT_RESERVED/2) {
> >
> > I know this is a horrible solution and shouldn't be committed, but
> > at least you have a work-around so you can get your virus scanner
> > farm up in the mean time while someone fixes this propperly.
>
> This dosen't help, cause the port can be reserved by the rshd. The
> problem is the establishing of the connection, so this is not the right
> place in the source.

Which port is reserved by rshd? An incoming connection is established
on 514. rshd reads a port number off that connection and initiates a
connection back to the originator on the specified port. Both of these
connections need to be established for the shell service to come up.

I'm not sure that I trust the tcpdump in your PR, because when I tried
to dump the entire run of the script on both my test servers the two
dumps didn't match and some sequences were out of order. Only when I
dumped the packets to a file and used tcpdump to read the file did the
dumps from each server match.
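To make the port handshake above a bit more concrete, here is a minimal C sketch of the step where rshd reads the port number off the 514 connection. I'm assuming the client sends the number as ASCII decimal digits terminated by a NUL byte. The function names and the SKETCH_PORT_RESERVED constant are made up for illustration and are not rshd's own code. The range test is the same one that appears in the rshd.c hunk quoted above.

```c
#include <stddef.h>

/* Stand-in for IPPORT_RESERVED (1024); redefined so the sketch compiles anywhere. */
#define SKETCH_PORT_RESERVED 1024

/*
 * Hypothetical helper: parse the stderr port that the rsh client sends
 * first on the port-514 connection.  Assumed wire format: ASCII decimal
 * digits followed by a NUL byte.
 * Returns the port (0 means "no stderr channel requested"),
 * or -1 if the input is malformed or has no NUL within len bytes.
 */
int sketch_parse_stderr_port(const char *buf, size_t len)
{
    long port = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\0')
            return (int)port;          /* terminator reached */
        if (buf[i] < '0' || buf[i] > '9')
            return -1;                 /* not a digit */
        port = port * 10 + (buf[i] - '0');
        if (port > 65535)
            return -1;                 /* not a valid TCP port */
    }
    return -1;                         /* ran out of bytes before the NUL */
}

/*
 * Same range test as the quoted rshd.c hunk: the port the client asks
 * for must lie in [SKETCH_PORT_RESERVED/2, SKETCH_PORT_RESERVED - 1].
 * Returns 1 if acceptable, 0 otherwise.
 */
int sketch_port_is_acceptable(int port)
{
    return !(port >= SKETCH_PORT_RESERVED || port < SKETCH_PORT_RESERVED / 2);
}
```

For example, sketch_parse_stderr_port("1000", 5) gives 1000, which passes the range test. In the good session that follows, the client's first push on 514 is 5 bytes, which would fit "1000" plus the NUL, and the back-connection does go to port 1000.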
Here's a good rsh session:

15:04:31.944902 196.7.162.26.1001 > 196.7.162.25.514: S 242763540:242763540(0) win 65535 (DF)
15:04:31.944965 196.7.162.25.514 > 196.7.162.26.1001: S 1769914383:1769914383(0) ack 242763541 win 57344 (DF)
15:04:31.945271 196.7.162.26.1001 > 196.7.162.25.514: . ack 1 win 33304 (DF)
15:04:31.945572 196.7.162.26.1001 > 196.7.162.25.514: P 1:6(5) ack 1 win 33304 (DF)
15:04:31.945600 196.7.162.25.514 > 196.7.162.26.1001: . ack 6 win 57915 (DF)
15:04:31.952264 196.7.162.25.929 > 196.7.162.26.1000: S 206573132:206573132(0) win 57344 (DF)
15:04:31.952525 196.7.162.26.1000 > 196.7.162.25.929: S 740063972:740063972(0) ack 206573133 win 65535 (DF)
15:04:31.952560 196.7.162.25.929 > 196.7.162.26.1000: . ack 1 win 57920 (DF)
15:04:31.953030 196.7.162.26.1001 > 196.7.162.25.514: P 6:11(5) ack 1 win 33304 (DF)
15:04:31.953064 196.7.162.25.514 > 196.7.162.26.1001: . ack 11 win 57915 (DF)
15:04:31.953316 196.7.162.26.1001 > 196.7.162.25.514: P 11:20(9) ack 1 win 33304 (DF)
15:04:31.953334 196.7.162.25.514 > 196.7.162.26.1001: . ack 20 win 57911 (DF)
15:04:31.954560 196.7.162.25.514 > 196.7.162.26.1001: P 1:2(1) ack 20 win 57920 (DF)
15:04:31.954787 196.7.162.26.1001 > 196.7.162.25.514: . ack 2 win 33303 (DF)
15:04:31.958429 196.7.162.25.514 > 196.7.162.26.1001: P 2:17(15) ack 20 win 57920 (DF)
15:04:31.958516 196.7.162.25.514 > 196.7.162.26.1001: F 17:17(0) ack 20 win 57920 (DF)
15:04:31.958697 196.7.162.26.1001 > 196.7.162.25.514: . ack 17 win 33296 (DF)
15:04:31.958795 196.7.162.26.1001 > 196.7.162.25.514: . ack 18 win 33296 (DF)
15:04:31.959146 196.7.162.25.929 > 196.7.162.26.1000: F 1:1(0) ack 1 win 57920 (DF)
15:04:31.959440 196.7.162.26.1000 > 196.7.162.25.929: . ack 2 win 33304 (DF)
15:04:31.961198 196.7.162.26.1001 > 196.7.162.25.514: F 20:20(0) ack 18 win 33304 (DF)
15:04:31.961239 196.7.162.25.514 > 196.7.162.26.1001: . ack 21 win 57920 (DF)
15:04:31.961303 196.7.162.26.1000 > 196.7.162.25.929: F 1:1(0) ack 2 win 33304 (DF)
15:04:31.961321 196.7.162.25.929 > 196.7.162.26.1000: . ack 2 win 57919 (DF)

And here's the last one that failed:

15:04:31.984458 196.7.162.26.999 > 196.7.162.25.514: S 3911362959:3911362959(0) win 65535 (DF)
15:04:31.984514 196.7.162.25.514 > 196.7.162.26.999: S 834974100:834974100(0) ack 3911362960 win 57344 (DF)
15:04:31.984863 196.7.162.26.999 > 196.7.162.25.514: . ack 1 win 33304 (DF)
15:04:31.985141 196.7.162.26.999 > 196.7.162.25.514: P 1:5(4) ack 1 win 33304 (DF)
15:04:31.985165 196.7.162.25.514 > 196.7.162.26.999: . ack 5 win 57916 (DF)
15:04:31.992888 196.7.162.25.514 > 196.7.162.26.999: F 1:1(0) ack 5 win 57920 (DF)
15:04:31.993164 196.7.162.26.999 > 196.7.162.25.514: . ack 2 win 33304 (DF)
15:04:31.993698 196.7.162.26.999 > 196.7.162.25.514: F 5:5(0) ack 2 win 33304 (DF)
15:04:31.993737 196.7.162.25.514 > 196.7.162.26.999: . ack 6 win 57920 (DF)

You'll notice the absence of the second SYN from 196.7.162.25 to
196.7.162.26, and instead 196.7.162.25 immediately sends a FIN. It was
at this point that rshd couldn't get the second port and terminated the
connection.

> However, the mailserver, which calls the rsh client is a solaris
> 8 machine :-(

That's not a problem because I believe the problem to be in rshd and
most likely in libc in rresvport_af(3).

Ian