From: Olaf Seibert
To: Rick Macklem
Cc: danny@cs.huji.ca.il, freebsd-stable@freebsd.org, dfr@freebsd.org, Olaf Seibert
Date: Mon, 2 Nov 2009 11:09:58 +0100
Subject: Re: 8.0-RC1 NFS client timeout issue
Message-ID: <20091102100958.GY841@twoquid.cs.ru.nl>
References: <20091027164159.GU841@twoquid.cs.ru.nl> <20091029135239.GX841@twoquid.cs.ru.nl>
List-Id: Production branch of FreeBSD source code

On Sun 01 Nov 2009 at 17:17:15 -0500, Rick Macklem wrote:
> On Thu, 29 Oct 2009, Olaf Seibert wrote:
>
> > Thanks, it looks like it should do the trick. I can't try it before
> > monday, though.
> >
> Although I think the patch does avoid sending the request on the
> partially closed connection, it doesn't fix the "real problem",
> so I don't know if it is worth testing?

Well, I tested it anyway, just in case. It seems to work fine for me, so
far. I don't see your extra RSTs either. Maybe that is because in my
case the client used a different port number for the new connection.
(Usually, this is controlled by the socket option SO_REUSEADDR.)

Here is a new packet trace. I had to cut out some packets since I forgot
to kill some (failing) mount attempts of another directory on the same
server.

(sorry again for the long lines)

No.  Time        Source          Destination     Protocol Info
486  60.438406   xxx.xxx.31.43   xxx.xxx.16.142  NFS      V3 LOOKUP Call (Reply In 487), DH:0x61b8eb12/date
487  60.438629   xxx.xxx.16.142  xxx.xxx.31.43   NFS      V3 LOOKUP Reply (Call In 486) Error:NFS3ERR_NOENT
488  60.538796   xxx.xxx.31.43   xxx.xxx.16.142  TCP      hello-port > nfs [ACK] Seq=36477 Ack=44701 Win=8192 Len=0 TSV=228817 TSER=1575935

last real action on old connection (client port "hello-port")

537  420.437763  xxx.xxx.16.142  xxx.xxx.31.43   TCP      nfs > hello-port [FIN, ACK] Seq=44701 Ack=36477 Win=49232 Len=0 TSV=1611935 TSER=228817
538  420.437805  xxx.xxx.31.43   xxx.xxx.16.142  TCP      hello-port > nfs [ACK] Seq=36477 Ack=44702 Win=8192 Len=0 TSV=588734 TSER=1611935

server ends connection

563  605.334262  xxx.xxx.31.43   xxx.xxx.16.142  TCP      hello-port > nfs [FIN, ACK] Seq=36477 Ack=44702 Win=8192 Len=0 TSV=773641 TSER=1611935

some time later, client now ends connection before sending its request
on new connection (port 875)

564  605.334303  xxx.xxx.31.43   xxx.xxx.16.142  TCP      875 > nfs [SYN] Seq=0 Win=65535 Len=0 MSS=1460 WS=5 TSV=773641 TSER=0
565  605.334440  xxx.xxx.16.142  xxx.xxx.31.43   TCP      nfs > hello-port [ACK] Seq=44702 Ack=36478 Win=49232 Len=0 TSV=1630424 TSER=773641
566  605.334564  xxx.xxx.16.142  xxx.xxx.31.43   TCP      nfs > 875 [SYN, ACK] Seq=0 Ack=1 Win=49232 Len=0 TSV=1630424 TSER=773641 MSS=1460 WS=0
567  605.334588  xxx.xxx.31.43   xxx.xxx.16.142  TCP      875 > nfs [ACK] Seq=1 Ack=1 Win=66592 Len=0 TSV=773641 TSER=1630424

new connection set up

568  605.334605  xxx.xxx.31.43   xxx.xxx.16.142  NFS      V3 ACCESS Call (Reply In 570), FH:0x008002a2
569  605.334828  xxx.xxx.16.142  xxx.xxx.31.43   TCP      nfs > 875 [ACK] Seq=1 Ack=141 Win=49092 Len=0 TSV=1630424 TSER=773641

and in use

> I'm hoping that the "Help TCP Wizards..." thread I just started
> on freebsd-current comes up with something.
>
> At least I can reproduce the problem now. (For some reason, I have
> to reboot the Solaris10 server before the problem appears for me.
> I can't think why this matters, but that's networking for you:-)

Maybe it depends on server load or something. This particular server is
a central file server at a university, so it may be under more pressure
to terminate unused connections.

> rick

-Olaf.
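
The timing in the trace above can be checked with a short sketch. It
assumes the Time column is seconds relative to the start of the capture;
the dictionary keys and variable names are illustrative labels and do not
come from the capture itself.

```python
# Timestamps copied from the Time column of the packet trace.
# Assumption: values are seconds since the start of the capture.
events = {
    "last_client_ack": 60.538796,   # packet 488, last activity on old connection
    "server_fin": 420.437763,       # packet 537, server sends FIN
    "client_fin": 605.334262,       # packet 563, client sends its own FIN
    "new_syn": 605.334303,          # packet 564, SYN from client port 875
}

# How long the old connection sat idle before the server closed its side.
idle_before_server_fin = events["server_fin"] - events["last_client_ack"]

# How long the connection stayed half-closed before the client closed too.
half_closed_time = events["client_fin"] - events["server_fin"]

# Delay between the client's FIN and the SYN for the new connection.
fin_to_syn = events["new_syn"] - events["client_fin"]

print(f"idle before server FIN:   {idle_before_server_fin:.3f} s")
print(f"half-closed until client: {half_closed_time:.3f} s")
print(f"client FIN to new SYN:    {fin_to_syn * 1e6:.0f} us")
```

Read this way, the server closed its side after roughly 360 seconds of
idle time, the connection then stayed half-closed for about 185 seconds,
and the client opened the new connection within the same millisecond as
its own FIN.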