Date:      Tue, 28 Sep 2010 15:25:32 +0200
From:      Willem Jan Withagen <wjw@digiware.nl>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        stable@freebsd.org, "avg@icyb.net.ua >> Andriy Gapon" <avg@icyb.net.ua>, fs@freebsd.org
Subject:   Re: Still getting kmem exhausted panic
Message-ID:  <4CA1ECCC.4070801@digiware.nl>
In-Reply-To: <20100928115047.GA62142@icarus.home.lan>
References:  <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 28-9-2010 13:50, Jeremy Chadwick wrote:
> On Tue, Sep 28, 2010 at 01:24:28PM +0200, Willem Jan Withagen wrote:
>> This is with stable as of yesterday, but with an untuned ZFS box I
>> was still able to generate a kmem exhausted panic.
>> Hard panic, just 3 lines.
>>
>> The box contains 12GB of memory and runs on a 6-core (with HT) Xeon.
>> 6 x 2TB WD Caviar Black in raidz2 with a 2 x 512MB mirrored log.
>>
>> The box died while rsyncing 5.8T from its partnering system.
>> (that was the only activity on the box)
> 
> It would help if you could provide output from the following commands
> (even after the box has rebooted):

It is currently in the process of a zfs receive of that same 5.8T.

> $ sysctl -a | egrep ^vm.kmem
> $ sysctl -a | egrep ^vfs.zfs.arc
> $ sysctl kstat.zfs.misc.arcstats

> sysctl -a | egrep ^vm.kmem
vm.kmem_size_scale: 3
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 4156850176

> sysctl -a | egrep ^vfs.zfs.arc
vfs.zfs.arc_meta_limit: 770777088
vfs.zfs.arc_meta_used: 33449648
vfs.zfs.arc_min: 385388544
vfs.zfs.arc_max: 3083108352

>  sysctl kstat.zfs.misc.arcstats
kstat.zfs.misc.arcstats.hits: 3119873
kstat.zfs.misc.arcstats.misses: 98710
kstat.zfs.misc.arcstats.demand_data_hits: 3043947
kstat.zfs.misc.arcstats.demand_data_misses: 3699
kstat.zfs.misc.arcstats.demand_metadata_hits: 67981
kstat.zfs.misc.arcstats.demand_metadata_misses: 90005
kstat.zfs.misc.arcstats.prefetch_data_hits: 121
kstat.zfs.misc.arcstats.prefetch_data_misses: 48
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 7824
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 4958
kstat.zfs.misc.arcstats.mru_hits: 34828
kstat.zfs.misc.arcstats.mru_ghost_hits: 21736
kstat.zfs.misc.arcstats.mfu_hits: 3077133
kstat.zfs.misc.arcstats.mfu_ghost_hits: 47605
kstat.zfs.misc.arcstats.allocated: 5507025
kstat.zfs.misc.arcstats.deleted: 5349715
kstat.zfs.misc.arcstats.stolen: 4468221
kstat.zfs.misc.arcstats.recycle_miss: 83995
kstat.zfs.misc.arcstats.mutex_miss: 231
kstat.zfs.misc.arcstats.evict_skip: 130461
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 592200836608
kstat.zfs.misc.arcstats.evict_l2_ineligible: 11000092160
kstat.zfs.misc.arcstats.hash_elements: 20585
kstat.zfs.misc.arcstats.hash_elements_max: 150543
kstat.zfs.misc.arcstats.hash_collisions: 761847
kstat.zfs.misc.arcstats.hash_chains: 780
kstat.zfs.misc.arcstats.hash_chain_max: 6
kstat.zfs.misc.arcstats.p: 2266075295
kstat.zfs.misc.arcstats.c: 2410082200
kstat.zfs.misc.arcstats.c_min: 385388544
kstat.zfs.misc.arcstats.c_max: 3083108352
kstat.zfs.misc.arcstats.size: 2410286720
kstat.zfs.misc.arcstats.hdr_size: 7565040
kstat.zfs.misc.arcstats.data_size: 2394099200
kstat.zfs.misc.arcstats.other_size: 8622480
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 85908
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
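
To make the numbers above easier to read, here is a minimal Python sketch that parses a few of the `name: value` lines and derives some ratios. The helper name `parse_sysctl` is made up for illustration. The relationships it checks (arc_max sitting exactly 1 GiB below vm.kmem_size, arc_min being arc_max/8, arc_meta_limit being arc_max/4) hold for these particular figures; whether they reflect how the defaults are actually computed is an assumption, not something this message confirms.

```python
# Values copied verbatim from the sysctl output above (all in bytes,
# except hits/misses, which are counters).
SYSCTL_TEXT = """\
vm.kmem_size: 4156850176
vfs.zfs.arc_meta_limit: 770777088
vfs.zfs.arc_min: 385388544
vfs.zfs.arc_max: 3083108352
kstat.zfs.misc.arcstats.hits: 3119873
kstat.zfs.misc.arcstats.misses: 98710
kstat.zfs.misc.arcstats.size: 2410286720
"""


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integers (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            values[name] = int(value)
    return values


stats = parse_sysctl(SYSCTL_TEXT)

hits = stats["kstat.zfs.misc.arcstats.hits"]
misses = stats["kstat.zfs.misc.arcstats.misses"]
hit_ratio = hits / (hits + misses)

kmem_size = stats["vm.kmem_size"]
arc_max = stats["vfs.zfs.arc_max"]
arc_size = stats["kstat.zfs.misc.arcstats.size"]

# How much of kmem the ARC occupies right now, and how much kmem
# would remain if the ARC grew all the way to arc_max.
arc_share_of_kmem = arc_size / kmem_size
headroom = kmem_size - arc_max

print(f"ARC hit ratio:       {hit_ratio:.1%}")
print(f"ARC size / kmem:     {arc_share_of_kmem:.1%}")
print(f"kmem_size - arc_max: {headroom} bytes")
```

With these figures the untuned arc_max leaves 1 GiB of kmem for everything else in the kernel.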


>> So the obvious conclusion would be that auto-tuning for ZFS on
>> 8.1-STABLE is not quite there yet.
>>
>> So I guess that we still need tuning advice even for 8.1,
>> in order to prevent a hard panic.
> 
> Andriy Gapon provides this general recommendation:
> 
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059114.html
> 
> The advice I've given for RELENG_8 (as of the time of this writing),
> 8.1-STABLE, and 8.1-RELEASE, is that for amd64 you'll need to tune:

Well, advice seems to vary, and the latest I understood was that
8.1-STABLE did not need any tuning. (The other system, with a much older
kernel, is tuned according to what most here are suggesting.)
And I was certainly led to believe that ever since 8.0 panics were no
longer among us......
> 
> vm.kmem_size
> vfs.zfs.arc_max

real memory  = 12889096192 (12292 MB)
avail memory = 12408684544 (11833 MB)

So that prompts vm.kmem_size=18G.
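
The arithmetic behind that figure can be sketched in Python. This assumes the 18G value comes from taking 1.5 times the nominal 12 GB of RAM; the message does not state the factor explicitly, so treat it as an inference. The helper `to_mib` is a made-up name.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Byte counts as reported in the two lines above.
real_memory = 12889096192
avail_memory = 12408684544


def to_mib(n_bytes):
    """Whole MiB in a byte count, rounded down (hypothetical helper)."""
    return n_bytes // MIB


# Proposed vm.kmem_size, expressed in bytes.
proposed_kmem = 18 * GIB

# Ratio of the proposed kmem size to the reported real memory.
factor = proposed_kmem / real_memory

print(f"real memory  = {to_mib(real_memory)} MB")
print(f"avail memory = {to_mib(avail_memory)} MB")
print(f"18G is {factor:.2f} x real memory")
```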

From the other post:
> As to arc_max/arc_min, set them based on your needs according to general
> ZFS recommendations.

I'm seriously at a loss as to what the general recommendations would be.

The other box has 8G of memory.
Its loader.conf:
vm.kmem_size="14G"      # 2* phys RAM size for ZFS perf.
vm.kmem_size_scale="1"
vfs.zfs.arc_min="1G"
vfs.zfs.arc_max="6G"

So I'd select something like 11G for arc_max on a box with 12G mem.
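
To put the two sets of values side by side, here is a small Python sketch. It only restates the settings mentioned in this message and makes the arithmetic explicit; it does not judge whether 11G is a sensible arc_max. The helpers `parse_size`, `summarize` and `render`, and the use of nominal RAM sizes (8 and 12 GiB), are assumptions made for illustration.

```python
GIB = 1024 ** 3


def parse_size(text):
    """Convert a loader.conf style size such as "14G" to bytes.

    Hypothetical helper: only the G suffix used in this message is handled.
    """
    if not text.endswith("G"):
        raise ValueError(f"unsupported size: {text!r}")
    return int(text[:-1]) * GIB


# Settings of the other (8G) box, copied from its loader.conf above.
other_box = {
    "vm.kmem_size": "14G",
    "vm.kmem_size_scale": "1",
    "vfs.zfs.arc_min": "1G",
    "vfs.zfs.arc_max": "6G",
}

# Values floated in this message for the 12G box (not a recommendation).
candidate = {
    "vm.kmem_size": "18G",
    "vfs.zfs.arc_max": "11G",
}


def summarize(settings, phys_ram_gib):
    """Report how arc_max relates to kmem_size and to nominal RAM."""
    kmem = parse_size(settings["vm.kmem_size"])
    arc_max = parse_size(settings["vfs.zfs.arc_max"])
    return {
        "arc_max_below_kmem": arc_max < kmem,
        "ram_left_gib": phys_ram_gib - arc_max // GIB,
    }


def render(settings):
    """Format settings as loader.conf lines."""
    return "\n".join(f'{name}="{value}"' for name, value in settings.items())


print(summarize(other_box, 8))
print(summarize(candidate, 12))
print(render(candidate))
```

Going by nominal sizes, the 8G box leaves 2G of RAM outside the ARC limit, while an 11G arc_max on the 12G box would leave only 1G.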

> I believe the trick -- Andriy, please correct me if I'm wrong -- is the
> tuning of vfs.zfs.arc_max, which is now a hard limit rather than a "high
> watermark".

> I can't provide tuning advice for i386.

This is amd64.

- --WjW
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)

iQEcBAEBAgAGBQJMoezMAAoJEP4k4K6R6rBhEScIAI/rZH5/VTmASMGyEYu4NZHU
SSFo3TOSOkYPEJicd8/NgM7w7D3xgMA0Xse0fu3tQOsjX940Z6fUKvnM7LCX2OJK
vvkW0LpGuKbv/9sFFvkklodjkArtRzzoptLtiCVsaYsoieRqnmYMpBxU9WFYCY2I
HoRx1nMbArg2HvKPzeZjf9knnQaU6YOR/PUiFBo6YuHkDJ40noqRElewbPEiOVZz
zqnUh90ZDFVdHMYNuZegOKtfSVCA1AifHR3e7+zn8jSco/+svESd7tBIxmHZWQ8u
BA1AKyYVTHs+wKsTw2J7u1v8yg74HxJNyVqwPRP048Z8onoPlGgtnFCTWbl2ICU=
=KiyH
-----END PGP SIGNATURE-----


