Date: Sun, 13 Sep 2009 23:21:38 +0200 (CEST)
From: Peter Much <pmc@citylink.dinoex.sub.org>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/138790: ZFS ceases caching when mem demand is high
Message-ID: <200909132121.n8DLLcxT065515@disp.oper.dinoex.org>
Resent-Message-ID: <200909132250.n8DMo5SO095532@freefall.freebsd.org>
>Number:         138790
>Category:       kern
>Synopsis:       ZFS ceases caching when mem demand is high
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Sun Sep 13 22:50:05 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Peter Much
>Release:        FreeBSD 7.2-STABLE i386
>Organization:
n/a
>Environment:
System: FreeBSD disp.oper.dinoex.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Mon Aug 10 20:38:48 CEST 2009
    root@disp.oper.dinoex.org:/usr/src/sys/i386/compile/D1R72V1 i386

7.2-STABLE as of July, with ZFS version 13
>Description:
The system was originally equipped with 256 MB memory and runs a lot of
(seldom used) processes, so about 350 MB was paged out and there was some
ongoing competition for memory - which the VM pager handled well.

I activated ZFS for a selected few filesystems where I need
journalling/CoW. The intention was not to increase performance, but to
increase crash-safety, for a database only.

I noticed that the actual size of the ARCache shrank down to ~1 MB and
only metadata would be cached. This makes write performance incredibly
bad, because for every, say, 1 kB write a 128 kB read has to happen first.

As a remedy I increased memory to 768 MB, but the only effect was that the
ARCache now shrank to 2 MB instead of 1 MB.

Reading the source showed me that ZFS will try to shrink the ARCache as
soon as Free+Cache memory gets down anywhere near the uppermost threshold
(vm.v_free_target + vm.v_cache_min). So some modification seems necessary
here, as it appears unacceptable that one has to increase the installed
memory by maybe 4 or 5 times only to enable ZFS on a machine that would
otherwise function suitably.

I currently do not know how the behaviour is on big machines with a couple
of GB of RAM - but I think there the free list will also run low if enough
processes compete for memory.
>How-To-Repeat:
Start some processes that use up the available memory. The best choice
might be ruby processes that do garbage collection and will therefore be
considered active and not candidates for swapping (otherwise
"sysctl vm.swap_idle_enabled=1" would be another option). Then watch the
difference kstat.zfs.misc.arcstats.size - vfs.zfs.arc_meta_used (which
should be the amount of payload data currently being cached) go toward
zero.
>Fix:
Since ZFS write performance becomes horribly bad when there is no caching
available at all, I suggest that a certain minimum of caching should be
preserved even if the free list is quite low. I therefore changed the
code in the following way (see attached patch), and the results are now
OK for my purpose.

Experiments showed that there is a certain risk that the machine may
experience a freeze/lockdown when working on these parameters. A crashdump
analysis then gave me the hint that this seems to happen when the number
of arc_anon buffers gets too high and requests for further buffers are
declined. The machine then seems to block all activity and steadily
increase kstat.zfs.misc.arcstats.memory_throttle_count until the watchdog
reboots it. I have not yet experienced this effect with the now attached
patch, but further evaluation seems necessary.

*** sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c.orig	Wed Aug  5 20:45:41 2009
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Sat Sep  5 00:19:30 2009
***************
*** 1821,1831 ****
  #ifdef _KERNEL
  
  	/*
  	 * If pages are needed or we're within 2048 pages
  	 * of needing to page need to reclaim
  	 */
! 	if (vm_pages_needed || (vm_paging_target() > -2048))
  		return (1);
  
  	if (needfree)
--- 1821,1835 ----
  #ifdef _KERNEL
  
+ 	if (vm_page_count_min())
+ 		return (1);
+ 
  	/*
  	 * If pages are needed or we're within 2048 pages
  	 * of needing to page need to reclaim
  	 */
! 	if ((vm_pages_needed || (vm_paging_target() > -2048)) &&
! 	    (arc_size > arc_c_min))
  		return (1);
  
  	if (needfree)
***************
*** 3338,3344 ****
  		available_memory += MIN(evictable_memory,
  		    arc_size - arc_c_min);
  	}
  
! 	if (inflight_data > available_memory / 4) {
  		ARCSTAT_INCR(arcstat_memory_throttle_count, 1);
  		return (ERESTART);
  	}
--- 3342,3348 ----
  		available_memory += MIN(evictable_memory,
  		    arc_size - arc_c_min);
  	}
  
! 	if ((inflight_data > available_memory / 4) && (arc_size > arc_c_min)) {
  		ARCSTAT_INCR(arcstat_memory_throttle_count, 1);
  		return (ERESTART);
  	}
>Release-Note:
>Audit-Trail:
>Unformatted: