From: Jason Keltz <jas@cse.yorku.ca>
To: freebsd-fs@freebsd.org
Date: Wed, 28 Nov 2012 16:20:46 -0500
Subject: ZFS/NFS performance file server

I've been experimenting with filebench for ZFS performance testing on my new file server running 9.1-RC3. Using filebench's "fileserver" workload and testing a variety of pool configurations, I can clearly see the performance differences between the various ZFS layouts, and I can repeat the tests and get similar results each time. I really like this tool! I feel the fileserver workload is probably more representative of my actual file server workload than, say, iozone or bonnie (in fact, probably heavier than my real workload). At present I'm just running the fileserver workload with its default configuration.
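For anyone who wants to reproduce the numbers, the invocation is nothing special. It's roughly the following, using filebench's interactive mode (a sketch only, and details vary by filebench version; /tank/fbtest is a placeholder path on the pool under test, and the 60-second run length is arbitrary):

    # interactive filebench session, run locally on the server
    filebench
    filebench> load fileserver
    filebench> set $dir=/tank/fbtest
    filebench> run 60

With no other "set" lines, everything else (the 50 worker threads, file sizes, and so on) stays at the personality's defaults; the per-operation lines and the "IO Summary" line quoted below are what it prints at the end of the run.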
After configuring the zpool on my system (presently 11 mirrored vdevs, but soon to be a triple mirror), filebench gave me these numbers:

statfile1          297131ops  4945ops/s    0.0mb/s  1.1ms/op  0us/op-cpu  [0ms - 254ms]
deletefile1        297141ops  4945ops/s    0.0mb/s  2.6ms/op  0us/op-cpu  [0ms - 269ms]
closefile3         297143ops  4945ops/s    0.0mb/s  0.1ms/op  0us/op-cpu  [0ms - 14ms]
readfile1          297144ops  4945ops/s  657.2mb/s  0.2ms/op  0us/op-cpu  [0ms - 29ms]
openfile2          297150ops  4946ops/s    0.0mb/s  1.1ms/op  0us/op-cpu  [0ms - 254ms]
closefile2         297150ops  4946ops/s    0.0mb/s  0.0ms/op  0us/op-cpu  [0ms - 13ms]
appendfilerand1    297150ops  4946ops/s   38.6mb/s  0.7ms/op  0us/op-cpu  [0ms - 247ms]
openfile1          297154ops  4946ops/s    0.0mb/s  1.1ms/op  0us/op-cpu  [0ms - 268ms]
closefile1         297155ops  4946ops/s    0.0mb/s  0.0ms/op  0us/op-cpu  [0ms - 13ms]
wrtfile1           297168ops  4946ops/s  621.1mb/s  0.8ms/op  0us/op-cpu  [0ms - 67ms]
createfile1        297172ops  4946ops/s    0.0mb/s  2.2ms/op  0us/op-cpu  [0ms - 55ms]

57845: 64.858: IO Summary: 3268658 ops, 54401.548 ops/s, (4945/9891 r/w), 1316.9mb/s, 0us cpu/op, 3.3ms latency

Next, I wanted to try NFS testing from a Linux client. All of my NFS clients run RHEL 6.3. I believe the filebench fileserver workload is intended to be run locally on the file server, but I don't see any reason why I can't run it on a client as well. Later, I'll probably reduce the default 50 threads of activity to something much smaller (say, 2) and run the tool on 100 or more hosts simultaneously, to get a better idea of whether all clients get the performance I expect without the server being too heavily loaded (there's a sketch of that per-client tweak a little further down). However, that's for another day! For people who use filebench: does this approach make sense?

I ran into some interesting results in my NFS testing, after which I'm not quite sure I will use NFSv4 for this file server project, but I'm interested in feedback. Note that the file server has a 1 Gbit/s network connection, but most clients are on 100 Mbit/s.

Here's filebench running from a 100 Mbit/s client connected to the new file server over NFSv4, with the above pool all to itself:

21512: 258.862: IO Summary: 38537 ops, 642.082 ops/s, (58/117 r/w), 15.0mb/s, 3430us cpu/op, 280.9ms latency

With the client on a 100 Mbit/s link, the maximum throughput I would expect is about 12.5 MB/s. I'm not using compression at the moment (I know that could inflate the number), and I exported and re-imported the pool before running the test, and unmounted and remounted it on the client. So where does the extra 2.5 MB/s come from? I assume it's caching that occurs during the test. I need to understand more clearly how filebench generates its files, but I suspect it would be hard to get away from caching altogether. I don't really expect to actually get 100 Mbit/s out of a 100 Mbit/s link, so I'm curious how much of the 15 MB/s is "real".

When I run the identical filebench test from the same 100 Mbit/s client against my OLD file server (CentOS 4, NFSv3, ext3 on an old 3ware card with a 6-disk RAID10, also gigabit), the numbers I get are:

22991: 257.779: IO Summary: 46672 ops, 777.635 ops/s, (71/142 r/w), 18.1mb/s, 1543us cpu/op, 227.9ms latency

NFSv4 to the new server: 15 MB/s. NFSv3 to the old server: 18 MB/s. Hmmm...

The same 100 Mbit/s client to the new file server, with NFSv3 instead of NFSv4:

22460: 369.760: IO Summary: 64934 ops, 1081.895 ops/s, (98/197 r/w), 25.4mb/s, 2203us cpu/op, 166.7ms latency

If I repeat that test, as long as I zpool export/import on the server and unmount/mount the filesystem on the client, that number is very consistent.
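For completeness, the "reset" between runs is nothing more exotic than the following (pool, export path, and host names are placeholders rather than my real layout, and on the NFSv4 side the path you mount depends on the V4: root line in /etc/exports):

    # on the server: cycle the pool so data cached in the ARC from the previous run is dropped
    zpool export tank
    zpool import tank

    # on the RHEL 6.3 client: remount the export, v3 or v4 depending on the test
    umount /mnt/test
    mount -t nfs -o vers=3 newfs:/tank/export /mnt/test      # NFSv3 run
    # mount -t nfs4 newfs:/tank/export /mnt/test             # NFSv4 run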
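And the per-client tweak I mentioned above for the eventual 100-host test would just be one extra "set" before the run, along these lines (again only a sketch, with a placeholder mount point):

    # on each client, scale the workload down from the default 50 worker threads
    filebench> load fileserver
    filebench> set $dir=/mnt/test
    filebench> set $nthreads=2
    filebench> run 60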
So: NFSv4 gives 15 MB/s and NFSv3 gives 25.4 MB/s. A 10 MB/s difference? There has been some discussion on the list before about the performance difference between NFSv3 and NFSv4, and I suspect this is just another sample of it. (I don't think this is a FreeBSD thing - I think it's an NFSv4 thing.)

So what happens if the client is on the same gigabit network as the new file server?

NFSv4:
27669: 180.309: IO Summary: 55844 ops, 930.619 ops/s, (84/170 r/w), 21.6mb/s, 763us cpu/op, 193.3ms latency

NFSv3:
28891: 176.382: IO Summary: 72068 ops, 1201.054 ops/s, (109/219 r/w), 28.1mb/s, 555us cpu/op, 150.0ms latency

Hey, wait... sure, they're "closer", but 28.1 MB/s? Before, I was trying to understand how my throughput could be OVER 100 Mbit/s -- now it's significantly under gigabit?

That got me thinking: I don't have a separate ZIL, so just to see the effect on overall performance I disabled the ZIL (zfs set sync=disabled; see the sketch at the end of this message) and repeated the NFSv3 and NFSv4 tests from the same gigabit client:

NFSv4:
30077: 128.518: IO Summary: 49346 ops, 822.368 ops/s, (75/150 r/w), 19.2mb/s, 736us cpu/op, 218.9ms latency

NFSv3:
29484: 110.466: IO Summary: 293843 ops, 4897.058 ops/s, (445/891 r/w), 115.7mb/s, 532us cpu/op, 36.4ms latency

The results were consistent; I repeated them many times. Rick -- any ideas on this one!? NFSv3 now gets close to gigabit speed, but NFSv4 stays really slow -- in fact slower with sync disabled than it was with the ZIL in place!? (Clearly, I must be doing something wrong.)

I'm not going to disable the ZIL on a production system, but this testing suggests two things:

1) I'm probably better off sticking with NFSv3 for this server. NFSv4 was just "simpler", and I believe the new ACLs could be helpful in my environment.

2) Without a fast ZIL, one 100 Mbit/s client didn't seem to have performance issues, but when I have 100+ clients at 100 Mbit/s all talking to the server, I suspect a dedicated ZIL device will matter a lot more. Even with just one client talking to the file server at gigabit speed, it looks like a fast ZIL is necessary.

Unfortunately, in this respect I'm told that SSD ZILs under FreeBSD 9.X can be tricky business: with no TRIM support, and no secure-erase support (except in HEAD), there may be performance degradation once garbage collection kicks in after the SSD has been used for a while. Sigh. Can you win!?

Feedback?

Jason.
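P.S. For anyone following along, the ZIL "toggle" above is just the per-dataset sync property, and the longer-term option I'm weighing is a dedicated SSD log device. A rough sketch, with placeholder dataset, pool, and disk names:

    # testing only: disable synchronous write semantics on the exported dataset
    zfs set sync=disabled tank/export
    # ... run the benchmark ...
    # restore the default behaviour afterwards
    zfs set sync=standard tank/export

    # the eventual option: give the pool a dedicated (ideally mirrored) log device
    zpool add tank log mirror da20 da21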