From owner-freebsd-hackers@FreeBSD.ORG  Sat Oct 20 12:52:13 2012
Subject: Re: NFS server bottlenecks
From: Nikolay Denev <ndenev@gmail.com>
Date: Sat, 20 Oct 2012 15:52:03 +0300
To: Ivan Voras <ivoras@freebsd.org>
Cc: "freebsd-hackers@freebsd.org Hackers" <freebsd-hackers@freebsd.org>,
 Rick Macklem <rmacklem@uoguelph.ca>


On Oct 20, 2012, at 3:11 PM, Ivan Voras <ivoras@freebsd.org> wrote:

> On 20 October 2012 13:42, Nikolay Denev <ndenev@gmail.com> wrote:
>
>> Here are the results from testing both patches:
>> http://home.totalterror.net/freebsd/nfstest/results.html
>> Both tests ran for about 14 hours (a bit too long, but I wanted to
>> compare different ZFS recordsize settings),
>> and each was run right after a fresh reboot.
>> The only noticeable difference seems to be many more context switches
>> with Ivan's patch.
>
> Thank you very much for your extensive testing!
>
> I don't know how to interpret the rise in context switches; as this is
> kernel code, I'd expect no context switches. I hope someone else can
> explain.
>
> But, you have also shown that my patch doesn't do any better than
> Rick's even on a fairly large configuration, so I don't think there's
> value in adding the extra complexity, and Rick knows NFS much better
> than I do.
>
> But there are a few other things I'm interested in: why does your
> load average spike to almost 20, and how come that with 24 drives in
> RAID-10 you only push 600 Mbit/s through the 10 Gbit/s Ethernet?
> Have you tested your drive setup locally (AESNI shouldn't be a
> bottleneck, you should be able to encrypt well into the GByte/s
> range) and the network?
>
> If you have the time, could you repeat the tests but with a recent
> Samba server and a CIFS mount on the client side? This is probably not
> important, but I'm just curious how it would perform on your
> machine.

I've now started this test locally.
From earlier iozone runs I remember the local speed being much better,
but I will wait for this test to finish, as that will give a better
comparison.

But I think there is still something fishy… I have had cases where I
reached 1000MB/s over NFS (from network stats, not local machine stats),
but sometimes it is very slow even for a file that is completely in ARC.
Rick mentioned that this could be due to RPC overhead and network round
trip time, but earlier in this thread I ran a test on the server alone,
mounting the NFS-exported ZFS dataset locally and reading from it with "dd":

> To take the network out of the equation I redid the test by mounting
> the same filesystem over NFS on the server:
>
> [18:23]root@goliath:~#  mount -t nfs -o rw,hard,intr,tcp,nfsv3,rsize=1048576,wsize=1048576 localhost:/tank/spa_db/undo /mnt
> [18:24]root@goliath:~# dd if=/mnt/data.dbf of=/dev/null bs=1M
> 30720+1 records in
> 30720+1 records out
> 32212262912 bytes transferred in 79.793343 secs (403696120 bytes/sec)
> [18:25]root@goliath:~# dd if=/mnt/data.dbf of=/dev/null bs=1M
> 30720+1 records in
> 30720+1 records out
> 32212262912 bytes transferred in 12.033420 secs (2676900110 bytes/sec)
>
> During the first run I saw several nfsd threads in top, along with dd,
> and again zero disk I/O.
> There was an increase in memory usage because of the double buffering
> ARC->buffercache.
> The second run was with all of the nfsd threads totally idle, and read
> directly from the buffercache.
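
To make the two dd summaries above easier to compare with the link
speeds discussed in this thread, they can be converted to MB/s and
Gbit/s. What follows is only a sketch: the helper name "rate" is made up
for illustration, and it assumes 1 MB = 10^6 bytes and 1 Gbit = 10^9 bits.

```shell
#!/bin/sh
# rate BYTES SECONDS
# Hypothetical helper: turn a dd summary into MB/s and Gbit/s.
# Assumes decimal units (1 MB = 10^6 bytes, 1 Gbit = 10^9 bits).
rate() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        bps = bytes / secs    # bytes per second
        printf "%.1f MB/s, %.2f Gbit/s\n", bps / 1e6, bps * 8 / 1e9
    }'
}

rate 32212262912 79.793343   # first dd run (served by nfsd)
rate 32212262912 12.033420   # second dd run (read from the buffercache)
```

With the figures quoted above this gives roughly 403.7 MB/s (3.23 Gbit/s)
for the first run and 2676.9 MB/s (21.42 Gbit/s) for the second. Both
runs went over a localhost mount, so these are not wire rates.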