Date: Wed, 11 Oct 2017 14:56:30 +0000 (UTC)
From: Paul Pathiakis <pathiaki2@yahoo.com>
To: Adam Vande More <amvandemore@gmail.com>, Kate Dawson <k4t@3msg.es>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: FreeBSD ZFS file server with SSD HDD
Message-ID: <447544877.450092.1507733790531@mail.yahoo.com>
In-Reply-To: <CA%2BtpaK3Cga3SKmbKnRts_SSp=D4qk9p%2BaTzNBZeEqDuvQGVd9A@mail.gmail.com>
References: <20171011130512.GE24374@apple.rat.burntout.org> <CA%2BtpaK3Cga3SKmbKnRts_SSp=D4qk9p%2BaTzNBZeEqDuvQGVd9A@mail.gmail.com>
Hi,
This is sort of a complementary answer. Seems like a cool setup.
One thing worth doing as an SA is to build detailed operational awareness with history: a solid SNMP monitoring tool with graphing, so that you can see what was happening at an exact point in time with respect to network status, I/O status, CPU and other resources. There are a good number of them in ports/pkgs, such as Zabbix, Nagios, Munin and Ganglia. My favorite is OpenNMS, which I was going to do the port for but ran out of time.
They support point-in-time graphing. Once you resolve this issue, get one of these tools up and running so you can 'see' the status of everything at a given time. (And you can give read-only access to management so they can ooh and aah at it.) It is very useful for seeing where the bottleneck is, if there is one. Otherwise, tuning parameters for everything (ZFS, NFS, NIC, etc.) will have to be tried, as you can't easily narrow down where the problem is.
Just trying to help in the long term.
P.
From: Adam Vande More <amvandemore@gmail.com>
To: Kate Dawson <k4t@3msg.es>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Sent: Wednesday, October 11, 2017 10:41 AM
Subject: Re: FreeBSD ZFS file server with SSD HDD
On Wed, Oct 11, 2017 at 8:05 AM, Kate Dawson <k4t@3msg.es> wrote:
> Hi,
>
> Currently running a FreeBSD NFS server with a zpool comprising
>
> 12 x 1TB hard disk drives arranged as mirrored pairs in a stripe set
> ( RAID 10 )
>
> An additional 2 x 960GB SSDs have been added. These two SSDs are
> partitioned, with a small partition being used for the ZIL log and a
> larger partition used for L2ARC cache.
>
> Additionally the host has 64GB RAM and 16 CPU cores (AMD Opteron 2GHz)
>
> A dataset from the pool is exported via NFS to a number of Debian
> GNU/Linux hosts running the Xen hypervisor. These run several disk-image
> based virtual machines.
>
> In general use, the FreeBSD NFS host sees very little read IO, which is
> to be expected, as the RAM cache and L2ARC are designed to minimise the
> amount of read load on the disks.
>
> However we're starting to see high load ( mostly IO WAIT ) on the Linux
> virtualisation hosts, and virtual machines - with kernel timeouts
> occurring resulting in crashes and instability.
>
> I believe this may be due to the limited number of random write IOPS
> available on the zpool NFS export.
>
> I can get sequential writes and reads to and from the NFS server at
> speeds that approach the maximum the network provides ( currently 1Gb/s
> + Jumbo Frames, and I could increase this by bonding multiple interfaces
> together. )
>
> However day to day usage does not show network utilisation anywhere near
> this maximum.
>
> If I look at the output of `zpool iostat -v tank 1` I see that every
> five seconds or so, the number of write operations goes to > 2k
>
> I think this shows that I'm hitting the limit that the spinning disks
> can provide in this workload.
>
I doubt that is the cause. It is more likely you have vfs.zfs.txg.timeout
set to the default. Have you tried any other ZFS or NFS tuning? If so,
please share those details.
Does gstat reveal anything useful?
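
The five-second rhythm in the `zpool iostat` output is what one would expect if writes are being batched into transaction groups and flushed on the vfs.zfs.txg.timeout interval (the current value can be read with `sysctl vfs.zfs.txg.timeout`). One way to check this, rather than eyeballing the output, is to measure the spacing between write bursts. The following is a minimal Python sketch, assuming the per-second pool-level write counts have already been collected into a list. The function name burst_intervals and the sample values are hypothetical and are not measurements from this thread; only the 2k threshold comes from the original report.

```python
from statistics import median


def burst_intervals(write_ops, threshold):
    """Return the gaps, in samples, between the starts of successive bursts.

    write_ops is a list of write-operation counts taken at a fixed
    interval (one per second if collected with `zpool iostat tank 1`).
    A burst starts at a sample that exceeds threshold when the previous
    sample did not.
    """
    starts = [
        i
        for i, ops in enumerate(write_ops)
        if ops > threshold and (i == 0 or write_ops[i - 1] <= threshold)
    ]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Hypothetical one-second samples, made up for illustration only.
samples = [40, 35, 50, 45, 2100, 38, 42, 47, 51, 2300, 44, 39, 41, 48, 2250, 36]

gaps = burst_intervals(samples, threshold=2000)
typical_gap = median(gaps)
print("gaps between bursts (s):", gaps)
print("typical gap (s):", typical_gap)
```

If the typical gap tracks the configured timeout, and moves when the timeout is changed, the bursts are more likely periodic flushes than a sign that the disks are saturated. Sustained high %busy in gstat between bursts would point the other way.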
--
Adam
_______________________________________________
freebsd-questions@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
Date: Wed, 11 Oct 2017 16:07:44 +0100
From: tech-lists <tech-lists@zyxst.net>
To: freebsd-questions@freebsd.org
Subject: Re: problems with pkg: Operation timed out
Message-ID: <20171011150743.GA23107@acer.zyxst.net>
Hi,
On Wed, Oct 11, 2017 at 08:07:39AM -0500, Antonio Olivares wrote:
>root@aceraspire:~ # traceroute pkg.freebsd.org
>traceroute to pkgmir.geo.freebsd.org (96.47.72.71), 64 hops max, 40 byte packets
> 1 10.155.142.1 (10.155.142.1) 0.914 ms 9.633 ms 0.952 ms
> 2 172.17.0.1 (172.17.0.1) 4.262 ms 2.012 ms 1.911 ms
> 3 * * *
> 4 * * *
> 5 * * *
> 6 * *^C
That indicates a problem with your internet connection (or whatever is after
172.17.0.1) rather than with pkg.freebsd.org.
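
The reasoning here is that the last hop to answer marks how far packets demonstrably got; everything after it timed out. A minimal Python sketch of that reading, assuming the traceroute output has been saved as text without the mail quoting prefix (the function name last_responding_hop is hypothetical):

```python
def last_responding_hop(traceroute_output):
    """Return (hop_number, first_responder) for the last hop that answered.

    Returns None if no hop answered. A hop counts as silent when every
    probe field starts with "*" (this also covers an interrupted line
    such as "6 * *^C").
    """
    last = None
    for line in traceroute_output.splitlines():
        fields = line.split()
        # Skip the banner line and anything that is not a numbered hop.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        responders = [f for f in fields[1:] if not f.startswith("*")]
        if responders:
            last = (int(fields[0]), responders[0])
    return last


# The output quoted in the message above.
sample = """\
traceroute to pkgmir.geo.freebsd.org (96.47.72.71), 64 hops max, 40 byte packets
 1 10.155.142.1 (10.155.142.1) 0.914 ms 9.633 ms 0.952 ms
 2 172.17.0.1 (172.17.0.1) 4.262 ms 2.012 ms 1.911 ms
 3 * * *
 4 * * *
 5 * * *
 6 * *^C
"""

print(last_responding_hop(sample))
```

Silent hops are not proof of a fault on their own, since some routers simply do not answer traceroute probes, so this only narrows down where to look next.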
--
J.
