From: Ben Mesander
Date: Wed, 27 Oct 2004 10:29:55 -0600
To: current@freebsd.org
Subject: -current NFSv2 and NFSv3 issues
Message-ID: <16767.52483.986921.670394@piglet.timing.com>

Hi all,

We're seeing some data corruption and performance problems here with NFSv3/TCP on a machine running a -current NFS client against a 4.8 NFS server. The problems can be reproduced by doing a 'make buildworld' over NFS.

With NFSv2 over UDP we could buildworld with -j8, and the build would usually - but not always - complete successfully. The failures appeared to possibly be a result of mtime.tv_usec not being checked for files over NFS (i.e., one build step creates a .depend file, and another step tries to use it before it "appears" over NFS).

We decided to try NFSv3/TCP to see if we could get better performance. However, with buildworld and -j8, we reliably see gcc or some other toolchain component dump core during the build. With -j1 things complete successfully, but a buildworld -j1 of -current takes 5 hours over a dedicated 100baseT network, and the Ethernet never gets even close to being saturated, so the underlying network transport doesn't seem to be the bottleneck. We appear to have sufficient nfsds and nfsiods, in that they don't all seem to be incurring appreciable CPU time.

Any clues as to the data corruption issue? Should we expect NFSv3 over TCP to outperform NFSv2 over UDP?

Thanks,
Ben
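
To make the suspected mtime.tv_usec problem above concrete, here is a minimal sketch of how a freshness check that ignores the sub-second part of the modification time can miss an update made within the same second. This is a toy model, not the FreeBSD NFS client code: the `Mtime` type and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mtime:
    """Hypothetical stand-in for a timeval-style modification time."""
    sec: int   # whole seconds
    usec: int  # microseconds within that second


def looks_unchanged_seconds_only(cached: Mtime, server: Mtime) -> bool:
    # Compares only whole seconds, so the usec field is never consulted.
    return cached.sec == server.sec


def looks_unchanged_full(cached: Mtime, server: Mtime) -> bool:
    # Compares both fields, so a same-second rewrite is detected.
    return (cached.sec, cached.usec) == (server.sec, server.usec)


# The client cached the file's attributes, then another build step
# rewrote the file 400 ms later, still inside the same second.
cached = Mtime(sec=1098894595, usec=100_000)
server = Mtime(sec=1098894595, usec=500_000)

print(looks_unchanged_seconds_only(cached, server))  # True: stale copy trusted
print(looks_unchanged_full(cached, server))          # False: change noticed
```

With a highly parallel build such as -j8, several steps can easily touch the same file within one second, which is when the difference between the two checks would matter.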