From: Eric Crist
Date: Wed, 13 Oct 2010 07:42:19 -0500
To: freebsd-current@freebsd.org
Cc: Thomas Johnson
Subject: NFSv3 + 8.1 + rpc.[lockd|statd] issues
Message-Id: <7B979EC3-3829-424A-9187-E1294371B192@secure-computing.net>

Hey folks,

We have a machine running FreeBSD 8.1-RELEASE-p1 acting as an NFS server hosting 3 ZFS file systems on an external enclosure. A bunch of machines, a mix of 4.11, 7.1, and 8.x systems, act as NFS clients to this server. Running dmesg on the NFS server shows no errors at all, but the three different clients show differing errors. On the panther example below, it was reported last night that a 48MB file took about 90 minutes to transfer.

I'm working on upgrading the 7.1 system to 8.1 now, so I'm not quite as concerned with that, but the rpcbind errors that show on both 7.1 and 8.1 are causing core dumps on some of our applications.

Any help is appreciated.

=== Data below ===

ecrist@jaguar-1:~-> date
Wed Oct 13 07:33:46 CDT 2010
ecrist@jaguar-1:~-> dmesg
ecrist@jaguar-1:~-> uname -a
FreeBSD jaguar-1.claimlynx.com 8.1-RC2 FreeBSD 8.1-RC2 #1: Wed Jul 14 11:34:02 CDT 2010 root@jaguar-1.claimlynx.com:/usr/obj/usr/src/sys/GENERIC-CARP amd64
ecrist@jaguar-1:~-> uptime
7:33AM up 83 days, 8:42, 2 users, load averages: 1.08, 1.25, 0.94
ecrist@jaguar-1:~->

On the clients, however, many of them are reporting assorted problems. The 7.1 system reports the following:

ecrist@panther:~-> date
Wed Oct 13 07:34:43 CDT 2010
ecrist@panther:~-> dmesg
...
Can't start NLM - unable to contact NSM
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/production: not responding
...
ecrist@panther:~-> uname -a
FreeBSD panther.claimlynx.com 7.1-RELEASE-p3 FreeBSD 7.1-RELEASE-p3 #2: Sun Mar 22 08:21:50 CDT 2009 root@cougar.claimlynx.com:/usr/obj/usr/src/sys/SMP-ASR i386
ecrist@panther:~-> uptime
7:34AM up 30 days, 16:13, 4 users, load averages: 0.97, 1.00, 0.91
ecrist@panther:~->

Our 4.11 system:

ecrist@puma:~-> date
Wed Oct 13 07:38:09 CDT 2010
ecrist@puma:~-> dmesg
got bad cookie vp 0xe93fd240 bp 0xcfa2d2ec
got bad cookie vp 0xe859e740 bp 0xcfa96644
...
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/archive: not responding
nfs server jaguar.stor:/array/archive: is alive again
nfs server jaguar.stor:/array/production: not responding
nfs server jaguar.stor:/array/production: is alive again
nfs server jaguar.stor:/array/production: not responding
...
ecrist@puma:~-> uname -a
FreeBSD puma.claimlynx.com 4.11-RELEASE-p2 FreeBSD 4.11-RELEASE-p2 #1: Wed Apr 13 18:25:25 CDT 2005 drue@puma.claimlynx.com:/usr/obj/usr/src/sys/PUMA i386
ecrist@puma:~-> uptime
7:38AM up 30 days, 15:27, 1 user, load averages: 0.02, 0.02, 0.00
ecrist@puma:~->

And, finally, an 8.1 system:

ecrist@puma-2:~-> date
Wed Oct 13 07:39:27 CDT 2010
ecrist@puma-2:~-> dmesg
...
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
NLM: failed to contact remote rpcbind, stat = 5, port = 28416
ecrist@puma-2:~-> uname -a
FreeBSD puma-2.claimlynx.com 8.1-RELEASE FreeBSD 8.1-RELEASE #2: Mon Aug 2 12:50:40 CDT 2010 root@jaguar-1.claimlynx.com:/usr/obj/usr/src/sys/GENERIC-CARP amd64
ecrist@puma-2:~-> uptime
7:39AM up 70 days, 18:25, 3 users, load averages: 0.00, 0.00, 0.00
ecrist@puma-2:~->
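
A note on reading the "NLM: failed to contact remote rpcbind" lines in the dmesg output above. The following is a minimal Python sketch, not part of the original report, that decodes the two numbers in those lines. It rests on two assumptions: that the kernels use the standard Sun RPC `clnt_stat` numbering, where 5 is `RPC_TIMEDOUT`, and that the port was printed without being converted from network byte order (unconfirmed here). Under the second assumption, 28416 (0x6F00) is port 111 (0x006F), the well-known rpcbind port. The names `CLNT_STAT`, `swap16`, and `decode_nlm_line` are made up for illustration.

```python
import re
import struct

# Subset of the Sun RPC `enum clnt_stat` values.
# Assumption: the kernels in this report use the standard numbering.
CLNT_STAT = {
    0: "RPC_SUCCESS",
    3: "RPC_CANTSEND",
    4: "RPC_CANTRECV",
    5: "RPC_TIMEDOUT",
}

# Matches the numeric fields of the NLM message as it appears in dmesg.
NLM_LINE = re.compile(r"stat = (\d+), port = (\d+)")


def swap16(value):
    """Swap the two bytes of a 16-bit integer."""
    return struct.unpack("<H", struct.pack(">H", value))[0]


def decode_nlm_line(line):
    """Return (status name, port as logged, port with bytes swapped), or None."""
    match = NLM_LINE.search(line)
    if match is None:
        return None
    stat, port = int(match.group(1)), int(match.group(2))
    return CLNT_STAT.get(stat, "unknown"), port, swap16(port)


line = "NLM: failed to contact remote rpcbind, stat = 5, port = 28416"
print(decode_nlm_line(line))  # ('RPC_TIMEDOUT', 28416, 111)
```

If that reading is right, the message describes a timeout while contacting rpcbind on its usual port 111, not a connection attempt to an odd port 28416.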