Date:      Fri, 13 Jul 2018 12:10:50 -0700
From:      Jim Long <list@museum.rain.com>
To:        Mike Tancsa <mike@sentex.net>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Disk/ZFS activity crash on 11.2-STABLE [SOLVED]
Message-ID:  <20180713191050.GA98371@g5.umpquanet.com>
In-Reply-To: <20180712214248.GA98578@g5.umpquanet.com>
References:  <20180711212959.GA81029@g5.umpquanet.com> <5ebd8573-1363-06c7-cbb2-8298b0894319@sentex.net> <20180712183512.GA75020@g5.umpquanet.com> <a069a076-df1c-80b2-1116-787e0a948ed9@sentex.net> <20180712214248.GA98578@g5.umpquanet.com>

On Thu, Jul 12, 2018 at 02:42:48PM -0700, Jim Long wrote:
> On Thu, Jul 12, 2018 at 02:49:53PM -0400, Mike Tancsa wrote:
> 
> --snip--
> 
> > I would try and set a ceiling. On
> > RELENG_11 you dont need to reboot
> > 
> > Try
> > sysctl -w vfs.zfs.arc_max=77946198016
> > 
> > which shaves off 20G from what ARC can gobble up. Not sure if thats your
> > issue, but it is an issue for some users.
> > 
> > If you are still hurting for caching, an SSD drive or NVME and make it a
> > caching device for your pool.
> > 
> > and what does
> > zpool status
> > show ?
> 
> I set the limit to the value you suggested, and the next test ran less
> than three minutes before the machine rebooted, with no crash dump produced.
> 
> I further reduced the limit to 50G and it's been running for about 50 minutes
> so far.  Fingers crossed.  I do have L2ARC I can add if need be.
> 
> I'll keep you posted on how this run goes.
> 
> Thank you,
> 
> Jim

It appears that limiting the ARC size did it.  The 'zfs send -R' was
able to complete with ARC limited to 50G, and a second run with a 60G
ARC limit also completed.
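
For the archives, here is a minimal sketch of how such a limit can be
computed and applied on a running system.  It assumes "50G" means
50 GiB, and the helper name gib_to_bytes is hypothetical, used only for
illustration.  The script just prints the sysctl command rather than
running it, so it can be reviewed first.

```shell
#!/bin/sh
# Convert a size in GiB to bytes (assumption: "50G" means 50 GiB).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Byte value for a 50 GiB ARC ceiling.
limit=$(gib_to_bytes 50)

# Print the command instead of running it; apply it as root on the
# FreeBSD host once the value looks right.
echo "sysctl vfs.zfs.arc_max=${limit}"

# To compare the ceiling with the current ARC size afterwards:
#   sysctl vfs.zfs.arc_max kstat.zfs.misc.arcstats.size
```

The same helper with 60 gives the value for the second run.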

That is a very handy tunable to know about: it lets you reduce the cache
size on a running system when needed, for example to free up RAM.  I
was curious to find the answer to your query about the average size of
files on the system, so I ran 'zdb -b' on the pool.  That process
began to page large amounts of RAM out to swap, which made the
system rather sluggish, especially once I decided to kill the zdb
process.  By lowering the ARC size limit, I was able to temporarily free
some RAM so that the process could succumb to the SIGKILL signal.

Thank you very much for your advice in guiding me to this resolution!

Jim



