Date:      Wed, 11 Oct 2017 14:05:12 +0100
From:      Kate Dawson <k4t@3msg.es>
To:        freebsd-questions@freebsd.org
Subject:   FreeBSD ZFS file server with SSD HDD
Message-ID:  <20171011130512.GE24374@apple.rat.burntout.org>


Hi, 

Currently running a FreeBSD NFS server with a zpool comprising

12 x 1TB hard disk drives arranged as pairs of mirrors in a stripe set ( RAID 10 )

An additional 2 x 960GB SSD have been added. These two SSDs are each
partitioned, with a small partition being used for the ZIL log and a larger
partition arranged as L2ARC cache.
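
For context, the layout is roughly equivalent to something created along
these lines ( device names below are placeholders, and I'm assuming the two
ZIL partitions were added as a mirrored log vdev; the actual devices on this
host differ ):

    # six two-way mirrors striped together ( RAID 10 equivalent )
    zpool create tank \
        mirror da0 da1 mirror da2 da3 mirror da4 da5 \
        mirror da6 da7 mirror da8 da9 mirror da10 da11

    # small SSD partitions as a mirrored SLOG, the larger ones as L2ARC
    zpool add tank log mirror ada0p1 ada1p1
    zpool add tank cache ada0p2 ada1p2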

Additionally the host has 64GB RAM and 16 CPU cores ( AMD Opteron, 2GHz ).

A dataset from the pool is exported via NFS to a number of Debian
GNU/Linux hosts running the Xen hypervisor. These run several disk-image
backed virtual machines.

In general use, the FreeBSD NFS host sees very little read IO, which is to be
expected as the RAM cache ( ARC ) and L2ARC are designed to minimise the
amount of read load on the disks.
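
For what it's worth, the ARC and L2ARC hit/miss counters are exposed via
sysctl, so the read-side efficiency can be checked directly, e.g.

    # primary ARC hits vs. misses
    sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses

    # L2ARC hits vs. misses
    sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses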

However we're starting to see high load ( mostly IO wait ) on the Linux
virtualisation hosts and virtual machines, with kernel timeouts
occurring, resulting in crashes and instability.

I believe this may be due to the limited number of random write IOPS available
from the zpool backing the NFS export.
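
If the Xen hosts issue their NFS writes synchronously ( common for VM disk
images ), each write has to be committed via the SLOG before it is
acknowledged, so the write path matters far more than the read path here.
The relevant dataset settings can be checked with something like the
following ( tank/vmimages is a placeholder for the exported dataset ):

    # how synchronous writes are handled on the exported dataset
    zfs get sync,recordsize,logbias tank/vmimages

    # server-side NFS RPC counters ( watch the write and commit counts )
    nfsstat -s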

I can get sequential writes and reads to and from the NFS server at
speeds that approach the maximum the network provides ( currently 1Gb/s
with jumbo frames, and I could increase this by bonding multiple interfaces together. )
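
( If bonding ever becomes worthwhile, an LACP lagg in rc.conf would be the
usual route on the FreeBSD side. A sketch, with igb0/igb1 standing in for
whatever NICs the host actually has, and assuming the switch ports are
configured for LACP:

    ifconfig_igb0="up mtu 9000"
    ifconfig_igb1="up mtu 9000"
    cloned_interfaces="lagg0"
    ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 192.0.2.10/24 mtu 9000"
)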

However day to day usage does not show network utilisation anywhere near
this maximum.

If I look at the output of `zpool iostat -v tank 1` I see that every
five seconds or so, the number of write operations goes to > 2k.
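
That five second cadence matches the ZFS transaction group sync interval
( vfs.zfs.txg.timeout, 5 seconds by default on FreeBSD ), so these bursts
look like the accumulated writes being flushed out to the mirrors:

    # transaction group sync interval, in seconds
    sysctl vfs.zfs.txg.timeout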

I think this shows that I'm hitting the limit of what the spinning disks
can provide under this workload.
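
To confirm that, per-disk utilisation during those bursts seems the most
direct metric; on the FreeBSD side something like:

    # %busy and per-operation latency for the physical disks
    gstat -p -I 1s

    # extended per-device statistics, refreshed every second
    iostat -x -w 1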

As a cost-effective way to improve this ( rather than replacing the
whole chassis ) I was considering replacing the 1TB HDDs with 1TB SSDs,
for the improved IOPS.
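
If that route is taken, my understanding is the swap can be done in place,
one disk at a time, waiting for each resilver to complete before moving on
( device names are placeholders ):

    # replace one leg of a mirror with a new SSD, then wait for the resilver
    zpool replace tank da0 da12
    zpool status tank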

I wonder if there are any opinions within the community here on:

1. What metrics can I gather to confirm that disk write IO is the bottleneck?

2. Whether the proposed solution will have the required effect, that is, a
decrease in the IOWAIT on the GNU/Linux virtualisation hosts? ( A rough
Linux-side measurement sketch follows below. )
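
For that second point, before/after numbers from iostat on the Debian hosts
( sysstat package ) should show whether per-device wait times actually drop:

    # per-device await / %util on the Xen hosts, refreshed every second
    iostat -x 1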

I hope this is a well-formed question.

Regards, 

Kate Dawson
-- 
"The introduction of a coordinate system to geometry is an act of violence"

