From owner-freebsd-fs@FreeBSD.ORG Sat Mar 20 00:58:16 2010
Date: Fri, 19 Mar 2010 21:11:03 -0400 (EDT)
From: Rick Macklem
To: John Baldwin
In-Reply-To: <201003190831.00950.jhb@freebsd.org>
References: <4BA3613F.4070606@comcast.net> <201003190831.00950.jhb@freebsd.org>
Cc:
    freebsd-fs@freebsd.org, User Questions, bseklecki@noc.cfi.pgh.pa.us
Subject: Re: FreeBSD NFS client goes into infinite retry loop

On Fri, 19 Mar 2010, John Baldwin wrote:

> On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
>> Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as an
>> NFS server to provide user home directories which get mounted across a
>> few machines (all 6.3-RELEASE). For the past few weeks we have been
>> running into problems where one particular client will go into an
>> infinite loop where it is repeatedly trying to write data which causes
>> the NFS server to return "reply ok 40 write ERROR: Input/output error
>> PRE: POST:". This retry loop can cause between 20mbps and 500mbps of

I'm afraid I don't quite understand what you mean by "causes the NFS
server to return "reply ok 40 write ERROR..."". Is this something logged
by syslog (I can't find a printf like this in the kernel sources) or is
this something that tcpdump is giving you or ???

Why I ask is that it seems to say that the server is returning EIO (or
maybe 40 == EMSGSIZE). The server should return ESTALE (NFSERR_STALE)
after a file has been deleted. If it is returning EIO, then that will
cause the client to keep trying to write the dirty block to the server.
(EIO is interpreted by the client as a "transient error".)

[good stuff snipped]

>>
>> I have a feeling that using NFS in such a matter may simply be prone to
>> such problems, but what confuses me is why the NFS client system is
>> infinitely retrying the write operation and causing itself so much grief.
>
> Yes, your feeling is correct. This sort of race is inherent to NFS if you do
> not use some sort of locking protocol to resolve the race.
> The infinite retries sound like a client-side issue. Have you been able to
> try a newer OS version on a client to see if it still causes the same
> behavior?
>

As John notes, having one client delete a file while another is trying
to write it is not a good thing. However, the server should return
ESTALE after the file is deleted, and that tells the client that the
write can never succeed, so it marks the buffer cache block invalid and
returns the error to the app. (The app. may not see it, if it doesn't
check for error returns upon close as well as write, but that's another
story...)

If you could look at a packet trace via wireshark when the problem
occurs, it would be nice to see what the server is returning. (If it
isn't ESTALE and the file no longer exists on the server, then that's a
server problem.) If it is returning ESTALE, then the client is busted.
(At a glance, the client code looks like it would handle ESTALE as a
fatal error for the buffer cache, but that doesn't mean it isn't broken,
just that it doesn't appear wrong. Also, it looks like mmap'd writes
won't recognize a fatal write error and will just keep trying to write
the dirty page back to the server. Take this with a big grain of salt,
since I just took a quick look at the sources. FreeBSD6->8 appear to be
pretty much the same as far as this goes, in the client.)

Please let us know if you can see the server's error reply code.

Good luck with it, rick

ps: If the server isn't returning ESTALE, you could try switching to the
    experimental nfs server and see if it exhibits the same behaviour.
    ("-e" option on both mountd and nfsd, assuming the server is FreeBSD8.)
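
On the point about an app. not seeing the error unless it checks close
as well as write: a minimal sketch in C of what that checking might look
like on the application side. The helper name write_file_checked() is
made up for illustration and is not taken from the FreeBSD sources or
from the application in this thread; it only uses the standard open(2),
write(2), fsync(2) and close(2) calls. The assumption is that a failed
write-back on an NFS mount (ESTALE, for instance) may only be reported
by fsync or close, not by the write call that queued the data.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write len bytes from buf to
 * path.  Returns 0 on success or an errno value on failure.
 */
int
write_file_checked(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, error = 0;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return (errno);

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			error = errno;
			break;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Force the cached data out now, so a failure is reported here. */
	if (error == 0 && fsync(fd) == -1)
		error = errno;

	/*
	 * Always close the descriptor, and keep the first error seen.  A
	 * failure that only shows up at close time is still a lost write.
	 */
	if (close(fd) == -1 && error == 0)
		error = errno;

	return (error);
}
```

A caller would treat any non-zero return as "the data may not be on the
server", e.g. compare the result against ESTALE to tell a deleted file
apart from other failures.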