From: John Hein
To: net@freebsd.org
Date: Wed, 8 Jul 2009 17:01:35 -0600
Subject: Re: network lock manager (lockd) deadlocked in 'rpcrecv'
Message-ID: <19029.9551.628427.146587@gromit.timing.com>
In-Reply-To: <19029.5367.534192.928426@gromit.timing.com>
References: <19029.4145.296260.915327@gromit.timing.com> <19029.5367.534192.928426@gromit.timing.com>
List-Id: Networking and TCP/IP with FreeBSD

John Hein wrote at 15:51 -0600 on Jul 8, 2009:
 > John Hein wrote at 15:31 -0600 on Jul 8, 2009:
 >  > I have a home directory on FreeBSD 7.2-stable (20090705), amd64.
 >  > It is serving up the directory over nfs (v3, tcp), and now
 >  > I'm seeing lots of 'lockd not responding' on Fedora 10 & 11 systems.
 . .
 > Also in dmesg:
 >
 > NLM: failed to contact remote rpcbind, stat = 5, port = 28416
 . .

Here's some good information. This seems to happen when there are
2 or more Fedora systems trying to access locks via lockd.

Rebooting the Fedora box that reports 'lockd not responding' frees up
the deadlocked FreeBSD lockd.

Disabling the firewall on the Fedora boxes _also_ helps.

This doesn't necessarily completely implicate or exonerate lockd.
But what should lockd do when the remote box asks for a lock but
doesn't complete the RPC dialog? Is there a way we can deal with
this problem and not have lockd deadlock?
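
A note on reading the dmesg line quoted above, as a rough sketch rather than a diagnosis. In the ONC RPC `enum clnt_stat`, the value 5 is `RPC_TIMEDOUT`. The port value 28416 is 0x6F00, which is 111 (the rpcbind port) with its two bytes swapped. The helper names `swap16` and `decode` are made up for illustration, and the idea that the kernel printed the port in network byte order on a little-endian amd64 host is an assumption, not something confirmed from the lockd source.

```python
# Sketch for decoding "stat = 5, port = 28416" from the NLM dmesg line.
# Assumption (not confirmed from the kernel source): the port was printed
# without converting it from network byte order on a little-endian host.

# First few values of enum clnt_stat from the ONC RPC headers.
CLNT_STAT = {
    0: "RPC_SUCCESS",
    1: "RPC_CANTENCODEARGS",
    2: "RPC_CANTDECODERES",
    3: "RPC_CANTSEND",
    4: "RPC_CANTRECV",
    5: "RPC_TIMEDOUT",
}


def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value (hypothetical helper)."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


def decode(stat: int, port: int) -> tuple[str, int]:
    """Return the clnt_stat name and the byte-swapped port (hypothetical helper)."""
    return CLNT_STAT.get(stat, "unknown"), swap16(port)


print(decode(5, 28416))  # ('RPC_TIMEDOUT', 111)
```

If that reading is right, the server timed out while contacting rpcbind on port 111 of the client, which would at least be consistent with the firewall on the Fedora boxes dropping those packets.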