From: perryh@pluto.rain.com
Date: Wed, 11 Mar 2009 01:09:47 -0700
To: kheuer2@gwdg.de
Cc: freebsd-questions@freebsd.org
Subject: Re: Is NFS Locking Reliable?

> Our NFS servers for user home directories are on FreeBSD (6.4),
> MacOSX (10.5), Linux (still 2.4 kernel) and Tru64-UNIX boxes; NFS
> clients are mostly Linux (2.6 kernel) and FreeBSD (6.4, 7.0, but
> w/o kernel lockd) systems.

I have seen problems with NFS locking even in completely homogeneous
environments.  With a mix like that, I would not trust it as far as
I could throw a Cray :)

> There are periods of several days without problems, but from time
> to time, on one, two, or several (but not all) clients application
> processes which use locking suddenly hang in kernel mode - namely
> firefox, opera, pine.

Lockups are probably the least of your concerns, at least where pine
is involved.  Dunno what sort of data firefox and opera are
protecting from race conditions, but I suppose pine is being used
for email.  Cases will arise wherein mail mysteriously disappears,
because the client and the delivery agent were both updating the
inbox at the same time.  Often there will be no noticeable symptoms,
except for users wondering what happened to that important message
they were supposed to have gotten (and which the MTA log shows was
in fact delivered).

Never export an inbox read/write if reliability of mail delivery is
needed.  Use IMAP instead.

> It seems to be no specific operating system problem - all
> combinations of clients and servers are involved.

I suspect the reason NFS locking is so troublesome is that it
presents problems which are fundamentally incomputable.  Prior to
restoration of communication, how can any automaton possibly
distinguish between

* a temporary loss of the communication link (but the peer is
  still running and the link will eventually be re-established),
  and

* the peer has crashed, and will eventually reboot?
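
The dilemma in the two cases above can be illustrated with a toy
model.  This is only a sketch under stated assumptions, not how any
real lockd behaves: the names Fault, observe and
indistinguishable_prefix are made up for illustration, and the model
simply assumes that every probe times out for the duration of the
outage, whatever its cause.

```python
from enum import Enum


class Fault(Enum):
    # The two cases a lock server cannot tell apart (hypothetical labels).
    LINK_DOWN = "link down, peer still running"
    PEER_CRASHED = "peer crashed, will reboot"


def observe(fault, outage_ticks, total_ticks):
    """Return what a server sees when probing a peer once per tick.

    Assumption of this toy model: during the outage every probe times
    out regardless of the cause.  Only the first reply afterwards
    differs: a peer that kept running reports the same boot, a
    rebooted peer reports a new one.
    """
    seen = []
    for tick in range(total_ticks):
        if tick < outage_ticks:
            seen.append("timeout")
        elif fault is Fault.LINK_DOWN:
            seen.append("alive, same boot")
        else:
            seen.append("alive, new boot")
    return seen


def indistinguishable_prefix(a, b):
    """Count the leading observations that are identical in both runs."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


link = observe(Fault.LINK_DOWN, outage_ticks=5, total_ticks=7)
crash = observe(Fault.PEER_CRASHED, outage_ticks=5, total_ticks=7)

# Every observation made during the outage is the same in both runs,
# so no decision rule based on them can separate the two cases.
same = indistinguishable_prefix(link, crash)
print(same)
```

In this model the two runs first differ at the tick where
communication is restored, which is too late for a server that had
to decide during the outage whether to keep or release the peer's
locks.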