Date:      Tue, 30 Mar 2010 12:29:37 -0400
From:      Rich <rercola@acm.jhu.edu>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-bugs@freebsd.org, freebsd-gnats-submit@freebsd.org
Subject:   Re: misc/145189: nfsd performs abysmally under load
Message-ID:  <5da0588e1003300929k50be2963h64d520f9f942e3a2@mail.gmail.com>
In-Reply-To: <20100331021023.Y919@besplex.bde.org>
References:  <201003300501.o2U51afE033587@www.freebsd.org> <20100331021023.Y919@besplex.bde.org>

On Tue, Mar 30, 2010 at 11:50 AM, Bruce Evans <brde@optusnet.com.au> wrote:
> Does it work better when limited to 1 thread (nfsd -n 1)?  In at least
> some versions of it (or maybe in nfsiod), multiple threads fight each
> other under load.

It doesn't seem to - nfsd -n 1 still ranges between 1-3 MB/s for files
larger than RAM on the server or client (6 and 4 GB, respectively).
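
For completeness, this is roughly how I pinned it to a single thread on
RELENG_8 (assuming the stock rc scripts; -u -t are the usual defaults):

  # /etc/rc.conf - run one nfsd thread instead of the default pool
  nfs_server_flags="-u -t -n 1"

  # then restart the server side
  /etc/rc.d/nfsd restart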

>> For instance, copying a 4GB file over NFSv3 from a ZFS filesystem with
>> the following flags
>> [rw,nosuid,hard,intr,nofsc,tcp,vers=3,rsize=8192,wsize=8192,sloppy,addr=X.X.X.X]
>> (Linux client, the above is the server), I achieve 2 MB/s, fluctuating
>> between 1 and 3. (pv reports 2.23 MB/s avg)
>>
>> Locally, on the server, I achieve 110-140 MB/s (at the end of pv, it
>> reports 123 MB/s avg).
>>
>> I'd assume network latency, but nc with no flags other than port
>> achieves 30-50 MB/s between server and client.
>>
>> Latency is also abysmal - ls on a randomly chosen homedir full of files,
>> according to time, takes:
>> real    0m15.634s
>> user    0m0.012s
>> sys     0m0.097s
>> while on the local machine:
>> real    0m0.266s
>> user    0m0.007s
>> sys     0m0.000s
>
> It probably is latency.  nfs is very latency-sensitive when there are
> lots of small files.  Transfers of large files shouldn't be affected so
> much.

Sure, and next on my TODO is to look into whether 9.0-CURRENT handles
some of these high-latency ZFS operations better.
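
(For reference, the Linux-side mount that produces those flags is roughly
the following; the export path and mountpoint are placeholders:

  mount -t nfs -o vers=3,tcp,hard,intr,nosuid,rsize=8192,wsize=8192 \
      server:/tank/home /mnt/home

nofsc, sloppy, and addr= seem to get added by mount.nfs on its own.)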

>> The server in question is a 3GHz Core 2 Duo, running FreeBSD RELENG_8.
>> The kernel conf, DTRACE_POLL, is just the stock AMD64 kernel with all of
>> the DTRACE-related options turned on, as well as the option to enable
>> polling in the NIC drivers, since we were wondering if that would
>> improve our performance.
>
> Enabling polling is a good way to destroy latency.  A ping latency of
> more than about 50uS causes noticeable loss of performance for nfs, but
> LAN latency is usually a few times higher than that, and polling without
> increasing the clock interrupt frequency to an excessively high value
> gives a latency at least 20 times higher than that.  Also, -current
> with debugging options is so bloated that even localhost has a ping
> latency of about 50uS on a Core2 (up from 2uS for FreeBSD-4 on an
> AthlonXP).  Anyway try nfs on localhost to see if reducing the latency
> helps.

Actually, we noticed that polling appeared to make throughput marginally
better while causing occasional bursts of crushing latency, but yes, we
have it compiled into the kernel without enabling it on any actual NICs
at present. :)
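
This is roughly how we checked that it really is off (the interface name
is a placeholder; our machines use various drivers):

  # no POLLING in the interface's options line means it isn't enabled there
  ifconfig em0

  # kern.clockrate shows hz; we haven't raised it from the default
  sysctl kern.clockrate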

But yes, over localhost IPv4 with no additional mount flags I'm getting
40-90+ MB/s, occasionally slowing to 20-30 MB/s, for an average of 52.7
MB/s after copying a 6.5 GB file. With {r,w}size=8192 on localhost it
goes up to 80-100 MB/s, with occasional dips to 60 (average after copying
another, separate 6.5 GB file: 77.3 MB/s).
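
In case the details matter, the localhost test was basically the following
(export path, mountpoint, and file names are placeholders):

  mount -t nfs 127.0.0.1:/tank/scratch /mnt/loop
  pv /mnt/loop/bigfile1 > /dev/null    # 6.5 GB file, read through NFS

  # then again with the smaller block sizes
  umount /mnt/loop
  mount -t nfs -o rsize=8192,wsize=8192 127.0.0.1:/tank/scratch /mnt/loop
  pv /mnt/loop/bigfile2 > /dev/null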

Also:
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.015 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.049 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.012 ms
64 bytes from [actual IP]: icmp_seq=0 ttl=64 time=0.019 ms
64 bytes from [actual IP]: icmp_seq=1 ttl=64 time=0.015 ms

>> We tested this with a UFS directory as well, because we were curious if
>> this was an NFS/ZFS interaction - we still got 1-2 MB/s read speed and
>> horrible latency while achieving fast throughput and latency local to
>> the server, so we're reasonably certain it's not "just" ZFS, if there is
>> indeed any interaction there.
>
> After various tuning and bug fixing (now partly committed by others) I
> get improvements like the following on low-end systems with ffs (I don't
> use zfs):
> - very low end with 100Mbps ethernet: little change; bulk transfers
>   always went at near wire speed (about 10 MB/S)
> - low end with 1Gbps: bulk transfers up from 20MB/S to 45MB/S (local ffs
>   50MB/S).  buildworld over nfs of 5.2 world down from 1200 seconds to
>   800 seconds (this one is very latency-sensitive.  Takes about 750
>   seconds on local ffs).

Is this on 9.0-CURRENT, or RELENG_8, or something else?

>> Read speed of a randomly generated 6500 MB file on UFS over NFSv3 with
>> the same flags as above: 1-3 MB/s, averaging 2.11 MB/s
>> Read speed of the same file, local to the server: consistently between
>> 40-60 MB/s, averaging 61.8 MB/s [it got faster over time - presumably
>> UFS was aggressively caching the file, or something?]
>
> You should use a file size larger than the size of main memory to prevent
> caching, especially for reads.  That is 1GB on my low-end systems.

I didn't mention the server's RAM explicitly, but it has 6 GB of real
RAM, and the files used in that case were 6.5-7 GB each (I did use a
4GB file earlier - I've avoided doing that again here).
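
For reference, the larger test files were generated with something along
these lines (the path is a placeholder):

  # ~6.5 GB of random data, comfortably bigger than the server's 6 GB of RAM
  dd if=/dev/random of=/tank/scratch/bigfile1 bs=1m count=6500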

>> Read speed of the same file over NFS again, after the local test:
>> Amusingly, worse (768 KB/s-2.2 MB/s, with random stalls - average
>> reported 270 KB/s(!)).
>
> The random stalls are typical of the problem with the nfsd's getting
> in each other's way, and/or of related problems.  The stalls that I
> saw were very easy to see in real time using "netstat -I <interface>
> 1" -- they happened every few seconds and lasted a second or 2.  But
> they were never long enough to reduce the throughput by more than a
> factor of 3, so I always got over 19 MB/S.  The throughput was reduced
> by approximately the ratio of stalled time to non-stalled time.

I believe it. I'm seeing at least partially similar behavior here: the
performance drops I mentioned above, where the transfer briefly pauses
and then picks up again in the localhost case, happen even with
nfsd -n 1 and nfsiod -n 1.
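
I'll watch it the way you describe on the next run; roughly (interface
name is a placeholder):

  # per-second interface counters while the copy runs - a stall shows up
  # as an interval where the packet/byte counts barely move
  netstat -I em0 1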

- Rich


