From owner-freebsd-bugs@FreeBSD.ORG Tue Mar 30 16:00:17 2010
Date: Tue, 30 Mar 2010 16:00:16 GMT
Message-Id: <201003301600.o2UG0Gxq088374@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Bruce Evans
Reply-To: Bruce Evans
Subject: Re: misc/145189: nfsd performs abysmally under load
List-Id: Bug reports

The following reply was made to PR misc/145189; it has been noted by GNATS.

From: Bruce Evans
To: Rich Ercolani
Cc: freebsd-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org
Subject: Re: misc/145189: nfsd performs abysmally under load
Date: Wed, 31 Mar 2010 02:50:16 +1100 (EST)

On Tue, 30 Mar 2010, Rich Ercolani wrote:

>> Description:
> nfsd performs abysmally on this machine under conditions in which
> Solaris's NFS implementation is reasonably fast, and while local IO
> to the same filesystems is still zippy.

Please don't format lines for 200+ column terminals.

Does it work better when limited to 1 thread (nfsd -n 1)?  In at
least some versions of nfsd (or maybe in nfsiod), multiple threads
fight each other under load.

> For instance, copying a 4GB file over NFSv3 from a ZFS filesystem
> with the following flags
> [rw,nosuid,hard,intr,nofsc,tcp,vers=3,rsize=8192,wsize=8192,sloppy,addr=X.X.X.X]
> (Linux client, the above is the server), I achieve 2 MB/s,
> fluctuating between 1 and 3.  (pv reports 2.23 MB/s avg)
>
> Locally, on the server, I achieve 110-140 MB/s (at the end of pv,
> it reports 123 MB/s avg).
>
> I'd assume network latency, but nc with no flags other than port
> achieves 30-50 MB/s between server and client.
>
> Latency is also abysmal - ls on a randomly chosen homedir full of
> files, according to time, takes:
>  real 0m15.634s
>  user 0m0.012s
>  sys  0m0.097s
> while on the local machine:
>  real 0m0.266s
>  user 0m0.007s
>  sys  0m0.000s

It probably is latency.  nfs is very latency-sensitive when there
are lots of small files.  Transfers of large files shouldn't be
affected so much.

> The server in question is a 3GHz Core 2 Duo, running FreeBSD
> RELENG_8.  The kernel conf, DTRACE_POLL, is just the stock AMD64
> kernel with all of the DTRACE-related options turned on, as well as
> the option to enable polling in the NIC drivers, since we were
> wondering if that would improve our performance.

Enabling polling is a good way to destroy latency.
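Both of those are cheap to check.  Something like the following
should do it (untested here; I'm assuming the NIC is em0, so
substitute your driver and unit, and <client> is your client's
hostname):

    # turn polling back off on the interface (reverts to interrupts)
    ifconfig em0 -polling

    # stop the running server and restart it with a single thread
    /etc/rc.d/nfsd stop
    nfsd -u -t -n 1

    # watch the wire latency while a transfer is running
    ping -c 10 <client>

If the single-threaded server is faster, the threads fighting each
other is the likely problem; if turning polling off fixes it, it is
latency.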
A ping latency of more than about 50uS causes noticeable loss of
performance for nfs, but LAN latency is usually a few times higher
than that, and polling without increasing the clock interrupt
frequency to an excessively high value gives a latency at least 20
times higher than that.  Also, -current with debugging options is so
bloated that even localhost has a ping latency of about 50uS on a
Core2 (up from 2uS for FreeBSD-4 on an AthlonXP).  Anyway, try nfs
on localhost to see if reducing the latency helps.

> We tested this with a UFS directory as well, because we were
> curious if this was an NFS/ZFS interaction - we still got 1-2 MB/s
> read speed and horrible latency while achieving fast throughput and
> latency local to the server, so we're reasonably certain it's not
> "just" ZFS, if there is indeed any interaction there.

After various tuning and bug fixing (now partly committed by others),
I get improvements like the following on low-end systems with ffs (I
don't use zfs):
- very low end with 100Mbps ethernet: little change; bulk transfers
  always went at near wire speed (about 10 MB/S)
- low end with 1Gbps ethernet: bulk transfers up from 20MB/S to
  45MB/S (local ffs 50MB/S); buildworld over nfs of a 5.2 world down
  from 1200 seconds to 800 seconds (this one is very
  latency-sensitive; it takes about 750 seconds on local ffs)

> Read speed of a randomly generated 6500 MB file on UFS over NFSv3
> with the same flags as above: 1-3 MB/s, averaging 2.11 MB/s
>
> Read speed of the same file, local to the server: consistently
> between 40-60 MB/s, averaging 61.8 MB/s [it got faster over time -
> presumably UFS was aggressively caching the file, or something?]

You should use a file size larger than the size of main memory to
prevent caching, especially for reads.  That is 1GB on my low-end
systems.

> Read speed of the same file over NFS again, after the local test:
> Amusingly, worse (768 KB/s-2.2 MB/s, with random stalls - average
> reported 270 KB/s(!)).

The random stalls are typical of the problem with the nfsd's getting
in each other's way, and/or of related problems.  The stalls that I
saw were very easy to see in real time using "netstat -I <interface>
1" -- they happened every few seconds and lasted a second or two.
But they were never long enough to reduce the throughput by more
than a factor of 3, so I always got over 19 MB/S.  The throughput
was reduced by approximately the ratio of stalled time to
non-stalled time.

>> How-To-Repeat:
> 1) Mount multiple NFS filesystems from the server
> 2) Watch as your operations latency and throughput rapidly sink to
> near-zero

Multiple active nfs mounts are probably a different problem.  You
certainly need more than 1 nfsd and/or nfsiod to handle them, and
the stalls might be a result of not having enough daemons.
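If you want to experiment with the daemon counts, the knobs are
roughly these (the counts are only examples, and the sysctl names
are from memory; your client is Linux, so only the server side
applies to it):

    # server side, in /etc/rc.conf (the default is -u -t -n 4):
    nfs_server_enable="YES"
    nfs_server_flags="-u -t -n 8"

    # FreeBSD client side: the nfsiods are kernel threads and their
    # number is clamped by sysctls, adjustable at runtime:
    sysctl vfs.nfs.iodmin=4
    sysctl vfs.nfs.iodmax=8

Bruce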