From: Rich
Sender: rincebrain@gmail.com
To: Bruce Evans
Cc: freebsd-bugs@freebsd.org, freebsd-gnats-submit@freebsd.org
Date: Tue, 30 Mar 2010 12:29:37 -0400
Message-ID: <5da0588e1003300929k50be2963h64d520f9f942e3a2@mail.gmail.com>
In-Reply-To: <20100331021023.Y919@besplex.bde.org>
References: <201003300501.o2U51afE033587@www.freebsd.org> <20100331021023.Y919@besplex.bde.org>
Subject: Re: misc/145189: nfsd performs abysmally under load

On Tue, Mar 30, 2010 at 11:50 AM, Bruce Evans wrote:
> Does it work better when limited to 1 thread (nfsd -n 1)?  In at least
> some versions of it (or maybe in nfsiod), multiple threads fight each other
> under load.

It doesn't seem to - nfsd -n 1 still ranges between 1-3 MB/s for files > RAM
on server or client (6 and 4 GB, respectively).

>> For instance, copying a 4GB file over NFSv3 from a ZFS filesystem with the
>> following flags
>> [rw,nosuid,hard,intr,nofsc,tcp,vers=3,rsize=8192,wsize=8192,sloppy,addr=X.X.X.X](Linux
>> client, the above is the server), I achieve 2 MB/s, fluctuating between 1
>> and 3. (pv reports 2.23 MB/s avg)
>>
>> Locally, on the server, I achieve 110-140 MB/s (at the end of pv, it
>> reports 123 MB/s avg).
>>
>> I'd assume network latency, but nc with no flags other than port achieves
>> 30-50 MB/s between server and client.
>>
>> Latency is also abysmal - ls on a randomly chosen homedir full of files,
>> according to time, takes:
>> real    0m15.634s
>> user    0m0.012s
>> sys     0m0.097s
>> while on the local machine:
>> real    0m0.266s
>> user    0m0.007s
>> sys     0m0.000s
>
> It probably is latency.  nfs is very latency-sensitive when there are lots
> of small files.  Transfers of large files shouldn't be affected so much.

Sure, and next on my TODO is to look into whether 9.0-CURRENT makes
certain ZFS high-latency things perform better.

>> The server in question is a 3GHz Core 2 Duo, running FreeBSD RELENG_8.
>> The kernel conf, DTRACE_POLL, is just the stock AMD64 kernel with all of the
>> DTRACE-related options turned on, as well as the option to enable polling in
>> the NIC drivers, since we were wondering if that would improve our
>> performance.
>
> Enabling polling is a good way to destroy latency.  A ping latency of
> more than about 50uS causes noticeable loss of performance for nfs, but
> LAN latency is usually a few times higher than that, and polling without
> increasing the clock interrupt frequency to an excessively high value
> gives a latency of at least 20 times higher than that.  Also, -current
> with debugging options is so bloated that even localhost has a ping
> latency of about 50uS on a Core2 (up from 2uS for FreeBSD-4 on an
> AthlonXP).  Anyway try nfs on localhost to see if reducing the latency
> helps.

Actually, we noticed that throughput appeared to get marginally better while
causing occasional bursts of crushing latency, but yes, we have it on in the
kernel without using it in any actual NICs at present. :)

But yes, I'm getting 40-90+ MB/s, occasionally slowing to 20-30 MB/s,
average after copying a 6.5 GB file of 52.7 MB/s, on localhost IPv4,
with no additional mount flags.

{r,w}size=8192 on localhost goes up to 80-100 MB/s, with occasional
sinks to 60 (average after copying another, separate 6.5 GB file: 77.3
MB/s).
Also:
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.015 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.049 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.012 ms

64 bytes from [actual IP]: icmp_seq=0 ttl=64 time=0.019 ms
64 bytes from [actual IP]: icmp_seq=1 ttl=64 time=0.015 ms

>> We tested this with a UFS directory as well, because we were curious if
>> this was an NFS/ZFS interaction - we still got 1-2 MB/s read speed and
>> horrible latency while achieving fast throughput and latency local to the
>> server, so we're reasonably certain it's not "just" ZFS, if there is indeed
>> any interaction there.
>
> After various tuning and bug fixing (now partly committed by others) I get
> improvements like the following on low-end systems with ffs (I don't use
> zfs):
> - very low end with 100Mbps ethernet: little change; bulk transfers always
>   went at near wire speed (about 10 MB/S)
> - low end with 1Gbps/S: bulk transfers up from 20MB/S to 45MB/S (local ffs
>   50MB/S).  buildworld over nfs of 5.2 world down from 1200 seconds to 800
>   seconds (this one is very latency-sensitive.  Takes about 750 seconds on
>   local ffs).

Is this on 9.0-CURRENT, or RELENG_8, or something else?

>> Read speed of a randomly generated 6500 MB file on UFS over NFSv3 with the
>> same flags as above: 1-3 MB/s, averaging 2.11 MB/s
>> Read speed of the same file, local to the server: consistently between
>> 40-60 MB/s, averaging 61.8 MB/s [it got faster over time - presumably UFS
>> was aggressively caching the file, or something?]
>
> You should use a file size larger than the size of main memory to prevent
> caching, especially for reads.  That is 1GB on my low-end systems.

I didn't mention the server's RAM, explicitly, but it has 6 GB of real
RAM, and the files used were 6.5-7 GB each in that case (I did use a 4GB
file earlier - I've avoided doing that again here).
>> Read speed of the same file over NFS again, after the local test:
>> Amusingly, worse (768 KB/s-2.2 MB/s, with random stalls - average reported
>> 270 KB/s(!)).
>
> The random stalls are typical of the problem with the nfsd's getting
> in each other's way, and/or of related problems.  The stalls that I
> saw were very easy to see in real time using "netstat -I
> 1" -- they happened every few seconds and lasted a second or 2.  But
> they were never long enough to reduce the throughput by more than a
> factor of 3, so I always got over 19 MB/S.  The throughput was reduced
> by approximately the ratio of stalled time to non-stalled time.

I believe it. I'm seeing at least partially similar behavior here, when
I mention the performance drops where transfer briefly pauses and then
picks up again in the localhost case, even with nfsd -n 1 and nfsiod -n
1.

- Rich
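
As a rough sanity check on the numbers in this thread, the two effects discussed (per-request latency and periodic stalls) can be put into a back-of-envelope model. This is only a sketch under loud assumptions: the function names are hypothetical, 1 MB is taken as 10^6 bytes, the client is assumed to keep a fixed number of read requests in flight (real NFS clients do read-ahead, so `outstanding=1` is a worst case), and stalls are modelled as a simple duty cycle in which nothing moves while stalled.

```python
def implied_request_time_ms(block_bytes, throughput_mb_s, outstanding=1):
    """Per-request time (ms) implied by an observed throughput.

    Assumes `outstanding` requests of `block_bytes` each are always in
    flight and that 1 MB = 1,000,000 bytes (both are assumptions).
    """
    bytes_per_second = throughput_mb_s * 1_000_000
    return outstanding * block_bytes / bytes_per_second * 1000.0


def stalled_throughput(peak_mb_s, run_s, stall_s):
    """Average throughput when transfers alternate between running at
    `peak_mb_s` for `run_s` seconds and moving nothing for `stall_s`
    seconds (a simple duty-cycle assumption, not a measured model)."""
    return peak_mb_s * run_s / (run_s + stall_s)


if __name__ == "__main__":
    # rsize=8192 and the 2.23 MB/s average reported by pv in the thread.
    print(f"{implied_request_time_ms(8192, 2.23):.2f} ms per 8 KB read")
    # Hypothetical stall pattern: 1 s running, 2 s stalled, 60 MB/s peak.
    print(f"{stalled_throughput(60.0, 1.0, 2.0):.1f} MB/s average")
```

With one 8 KB read outstanding, 2.23 MB/s works out to roughly 3.7 ms per request, far above the 0.015-0.049 ms ping times shown, which suggests the time is being lost somewhere other than on the wire. The second function shows how a transfer that is stalled two seconds out of every three loses a factor of 3.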