From: Rick Macklem
Date: Sun, 10 Jan 2010 16:36:18 -0500 (EST)
To: Mikolaj Golub
Cc: freebsd-fs@FreeBSD.org
Subject: Re: FreeBSD NFS client/Linux NFS server issue
In-Reply-To: <86ocl272mb.fsf@kopusha.onet>

On Sun, 10 Jan 2010, Mikolaj Golub wrote:

>
> For one of the incident we were tcpdumping "problem" NFS connection for about
> 1 hour and during this hour an activity was observed only once:
>
> 08:20:38.281422 IP (tos 0x0, ttl 64, id 56110, offset 0, flags [DF], proto TCP (6), length 140) 172.30.10.27.344496259 > 172.30.10.121.2049: 88 access fh[1:9300:10df8001] 003f
> 08:20:38.281554 IP (tos 0x0, ttl 64, id 26624, offset 0, flags [DF], proto TCP (6), length 52) 172.30.10.121.2049 > 172.30.10.27.971: ., cksum 0xca5e (correct), 89408667:89408667(0) ack 1517941890 win 46
>
> The client sent rpc ACCESS request for root exported inode, received tcp ack
> response (so tcp connection was ok) but did not receive any RPC reply from the
> server.
>
> So it looks like the problem on NFS server side. But for me it looks a bit
> strange that freebsd client is sending rpc packets so rarely. Shouldn't it
> retransmit them more frequently? For another incident we monitored tcp
> connection for 4 minutes and did not see any packets then. Unfortunately we
> can't run tcpdumping long time as these are production servers and we need to
> reboot hosts to restore normal operations.
>

For NFSv3 over TCP, there was no RFC specification, so client behaviour
when the server failed to reply to an RPC was essentially undefined.
(For NFSv4, a client isn't allowed to retry a non-NULL RPC on the same
TCP connection and a server is expected to reply to all RPCs received
on the connection or do a disconnect, but that's NFSv4 not NFSv3.)

I think the new krpc in FreeBSD8 does do a slow timeout on RPCs over TCP
for NFSv3 and eventually does a retry, but I didn't write the code, so
I'm not absolutely sure.
(I'll try and remember to take a look, or maybe dfr can comment?)
However, this krpc code isn't used for FreeBSD7.

Bottom line is I don't think the FreeBSD7 client does a retry until it
sees the TCP connection break, and if the server is neither replying to
the RPC nor disconnecting the TCP connection, it'll be stuck as you
describe.

I think you have three choices:
1 - Fix the NFS server so that it does reply or disconnects, if that
    is possible. (I have no idea if the Linux NFS server can be
    reconfigured?)
2 - Switch to using UDP (which will retry RPCs when no reply is
    received).
3 - Try a FreeBSD8 system and see if it works ok, then upgrade if
    that's practical?

rick
ps: As an historical note, I think I implemented NFS over TCP before
    anyone else and assumed that a server would reply to all RPC
    requests, so retries at the RPC level wouldn't be necessary.
    Others, like Sun, implemented NFS over TCP with RPC timeout/retries
    and then slowly came over to my way of thinking, but it wasn't
    spelled out until NFSv4.
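
The difference between a client that only waits for a reply and one
that retries at the RPC level can be shown with a toy simulation. This
is only a sketch under stated assumptions: SilentServer, call_rpc and
the timer model are hypothetical names made up for illustration, and
none of it is taken from the FreeBSD or Linux NFS code.

```python
class SilentServer:
    """Hypothetical server that silently ignores the first `drops` requests."""

    def __init__(self, drops):
        self.drops = drops
        self.seen = 0

    def handle(self, xid):
        # Return a reply for this transaction id, or None when the request
        # is swallowed (the TCP layer would still ACK the segment).
        self.seen += 1
        if self.seen <= self.drops:
            return None
        return ("reply", xid)


def call_rpc(server, xid, retry_on_timeout, max_timeouts=5):
    """Send one RPC and wait for its reply.

    Returns (reply, sends). reply is None if the caller is still
    waiting after max_timeouts timer intervals have passed.
    """
    sends = 1
    reply = server.handle(xid)
    timeouts = 0
    while reply is None and timeouts < max_timeouts:
        timeouts += 1  # one timer interval passes with no reply
        if retry_on_timeout:
            sends += 1
            reply = server.handle(xid)  # retransmit the same request
        # Without RPC-level retry the client just keeps waiting.
    return reply, sends


if __name__ == "__main__":
    # Client that never retries while the connection stays up.
    print(call_rpc(SilentServer(drops=1), xid=7, retry_on_timeout=False))
    # Client that retransmits after a timeout (as a UDP mount would).
    print(call_rpc(SilentServer(drops=1), xid=7, retry_on_timeout=True))
```

With a server that drops a single request, the non-retrying client is
still waiting when the simulation ends, while the retrying client gets
its reply on the second send.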