From: Rick Macklem
To: freebsd-fs@freebsd.org
Cc: John
Date: Sun, 18 Dec 2011 19:59:13 -0500 (EST)
Message-ID: <255844377.375232.1324256353832.JavaMail.root@erie.cs.uoguelph.ca>
Subject: NFS client UDP retransmit timer busted for 8.n/9.n (patch)

Thanks to recent detective work done by jwd@, a problem w.r.t. retransmit timeouts for UDP mounts (both old and new NFS clients) has been identified.

The kernel rpc has two timeouts for UDP:
1 - a timeout that causes the RPC request to be retransmitted on the same socket, using the same xid. This one defaults to 3 seconds and can be set via CLSET_RETRY_TIMEOUT. (FreeBSD currently always uses the 3 second default.)
2 - a timeout that causes the socket to be destroyed and a fresh one created. The request is then sent on this new socket, with a different xid.

The problem with #2 is that the retransmitted RPC request will miss a server's Duplicate Request Cache (DRC), because of the different xid. As such, #2 should be much larger than #1. However, #2 defaults to 1 second (i.e. smaller than #1 -> trouble!)

One way to avoid this problem is to set #2 to a much larger value via the "timeout=" mount option. (Btw, the value is in 1/10 seconds, so "timeout=600" sets it to 60 seconds.)

I now have a patch that I believe deals with this correctly. It sets #1 to the "timeout=" value (default 1 second) and #2 to a much larger value. (#2 timeouts are what the kernel rpc counts as retries, so for "soft" mounts, I set #2 to "nm_retry * nm_timeout / 2" and "retries = 2", so that the request fails after "nm_retry * nm_timeout", which I think is the correct semantics.)

This patch is attached and is also available at:
  http://people.freebsd.org/~rmacklem/udp-timer.patch
(jwd@, this patch is updated from what I emailed you, so you probably want it :-)

In summary, if you are using NFS mounts over UDP on FreeBSD 8 or 9 systems, you either want to use "timeout=600" or try the patch. You are pretty badly broken otherwise.

Hopefully, this patch can make it into -current/head soon,
rick
ps: jhb@, could you maybe review this, thanks, rick.
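
For readers who want to check the soft-mount arithmetic described above, here is a minimal sketch in Python. It is not the patch code: the names UdpTimers, soft_mount_timers and seconds_until_failure are hypothetical, and it assumes nm_timeout is expressed in the same 1/10 second units as the "timeout=" mount option. The input values in the usage lines are illustrative only.

```python
from dataclasses import dataclass

# Assumption: values are in 1/10 second, like the "timeout=" mount option.
TENTHS_PER_SECOND = 10


@dataclass
class UdpTimers:
    """Hypothetical container for the two UDP timeouts described above."""
    retransmit_tenths: int  # timeout #1: resend on the same socket, same xid
    reconnect_tenths: int   # timeout #2: new socket, different xid
    retries: int            # number of #2 timeouts before a soft mount fails


def soft_mount_timers(nm_timeout: int, nm_retry: int) -> UdpTimers:
    """Apply the rule from the email for "soft" mounts.

    #1 is set to the "timeout=" value, #2 to nm_retry * nm_timeout / 2,
    and retries to 2. Integer division is an assumption made here; an odd
    product rounds down by one unit.
    """
    return UdpTimers(
        retransmit_tenths=nm_timeout,
        reconnect_tenths=nm_retry * nm_timeout // 2,
        retries=2,
    )


def seconds_until_failure(timers: UdpTimers) -> float:
    """Total time before a soft mount gives up: retries * timeout #2."""
    return timers.retries * timers.reconnect_tenths / TENTHS_PER_SECOND


# Illustrative values only: timeout=10 (1 second) and nm_retry=10.
timers = soft_mount_timers(nm_timeout=10, nm_retry=10)
print(timers)
print(seconds_until_failure(timers), "seconds until a soft mount fails")
```

With these example inputs, #2 comes out five times larger than #1, and the total time before failure equals nm_retry * nm_timeout, which is the semantics the email aims for.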