From: Alexander Leidinger
To: Ben Kelly
Cc: current@freebsd.org, fs@freebsd.org
Date: Sat, 18 Apr 2009 09:48:21 +0200
Subject: Re: ZFS: unlimited arc cache growth?
Message-ID: <20090418094821.00002e67@unknown>
References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net>
List-Id: Discussions about the use of FreeBSD-current

On Fri, 17 Apr 2009 10:04:15 -0400
Ben Kelly wrote:

> On Apr 17, 2009, at 8:50 AM, Alexander Leidinger wrote:
> > to fs@, please CC me, as I'm not subscribed.
> >
> > I monitored (by hand) for a while the sysctls
> > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size.
> > Both grow way higher (at some point I've seen more than 500M) than
> > what I have configured in vfs.zfs.arc_max (40M).
> >
> > After a while FS operations (e.g. pkgdb -F with about 900
> > packages... my specific workload is the fixup of gnome packages
> > after the removal of the obsolete libusb port) get very slow (in
> > my specific example I let the pkgdb run several times over night
> > and it still is not finished).
> >
> > The big problem with this is that at some point in time the
> > machine reboots (panic, page fault, page not present, during a
> > fork1). I have the impression (beware, I have a watchdog
> > configured, as I don't know if a triggered WD would cause the same
> > panic, the following is just a guess) that I run out of memory of
> > some kind (I have 1G RAM, i386, max kmem size 700M). I restarted
> > pkgdb several times after a reboot, and it continues to process the
> > libusb removal, but hey, this is annoying.
> >
> > Does someone see something similar to what I describe (mainly the
> > growth of the arc cache way beyond what is configured)? Anyone
> > with some ideas what to try?
>
> Can you provide the rest of the arcstats from sysctl? Also, does
> your arc_reclaim_thread process get any cycles when this problem
> occurs? What happens if you kill the pkgdb -F manually before it
> completes? Does the arc cache size come back down or is it stuck at
> the abnormally high level?

I haven't tried killing pkgdb and looking at the stats, but on the idle
machine (reboot after the panic and 5h of no use by me... the machine
fetches my mail, has a webmail + mysql + imap interface and is a
fileserver) the size is double my max value. Again, there's no real
load at this time, just fetching my mail (most traffic from the
FreeBSD lists) and a little bit of SpamAssassin filtering of it. When
I logged in this morning, the machine had been rebooted by a panic
about 5h earlier and no FS traffic was going on (100% idle).

Currently the arc_reclaim_thread has 0:12 of accumulated CPU time, the
wcpu is at 0%, but it is in the running state. The machine is about
80% idle.

Here are all zfs sysctls as of now (pkgdb started 5min ago):
---snip---
# sysctl -a | grep zfs
vfs.zfs.arc_meta_limit: 10485760
vfs.zfs.arc_meta_used: 130211600
vfs.zfs.mdcomp_disable: 0
vfs.zfs.arc_min: 22937600
vfs.zfs.arc_max: 41943040
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.block_cap: 256
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 1
vfs.zfs.recover: 0
vfs.zfs.txg.synctime: 5
vfs.zfs.txg.timeout: 30
vfs.zfs.scrub_limit: 10
vfs.zfs.vdev.cache.bshift: 13
vfs.zfs.vdev.cache.size: 5242880
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.ramp_rate: 2
vfs.zfs.vdev.time_shift: 6
vfs.zfs.vdev.min_pending: 4
vfs.zfs.vdev.max_pending: 6
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.version.zpl: 3
vfs.zfs.version.vdev_boot: 1
vfs.zfs.version.spa: 13
vfs.zfs.version.dmu_backup_stream: 1
vfs.zfs.version.dmu_backup_header: 2
vfs.zfs.version.acl: 1
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0
kstat.zfs.misc.arcstats.hits: 2483157
kstat.zfs.misc.arcstats.misses: 604115
kstat.zfs.misc.arcstats.demand_data_hits: 187200
kstat.zfs.misc.arcstats.demand_data_misses: 78685
kstat.zfs.misc.arcstats.demand_metadata_hits: 2295957
kstat.zfs.misc.arcstats.demand_metadata_misses: 525430
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 0
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
kstat.zfs.misc.arcstats.mru_hits: 1621026
kstat.zfs.misc.arcstats.mru_ghost_hits: 32102
kstat.zfs.misc.arcstats.mfu_hits: 862131
kstat.zfs.misc.arcstats.mfu_ghost_hits: 18804
kstat.zfs.misc.arcstats.deleted: 550853
kstat.zfs.misc.arcstats.recycle_miss: 287993
kstat.zfs.misc.arcstats.mutex_miss: 2
kstat.zfs.misc.arcstats.evict_skip: 654418
kstat.zfs.misc.arcstats.hash_elements: 5363
kstat.zfs.misc.arcstats.hash_elements_max: 8569
kstat.zfs.misc.arcstats.hash_collisions: 133396
kstat.zfs.misc.arcstats.hash_chains: 739
kstat.zfs.misc.arcstats.hash_chain_max: 5
kstat.zfs.misc.arcstats.p: 41943040
kstat.zfs.misc.arcstats.c: 41943040
kstat.zfs.misc.arcstats.c_min: 22937600
kstat.zfs.misc.arcstats.c_max: 41943040
kstat.zfs.misc.arcstats.size: 130467088
kstat.zfs.misc.arcstats.hdr_size: 730456
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.vdev_cache_stats.delegations: 2728
kstat.zfs.misc.vdev_cache_stats.hits: 297326
kstat.zfs.misc.vdev_cache_stats.misses: 368918
---snip---

Bye,
Alexander.
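
To put the sysctl values in this mail in proportion: arcstats.size is roughly 3.1 times c_max, and arc_meta_used is about 12.4 times arc_meta_limit, so nearly all of the overshoot is metadata. The following is only a minimal Python sketch for checking those ratios, assuming sysctl output in the plain `name: value` form shown in this mail. The helper names `parse_sysctl` and `arc_overshoot` are made up for illustration and are not part of any FreeBSD tool.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of ints.

    Lines without a ': ' separator or with a non-numeric value
    (shell prompts, '---snip---' markers) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def arc_overshoot(stats):
    """Return how many times over their limits the ARC counters are."""
    return {
        "size_vs_c_max": (
            stats["kstat.zfs.misc.arcstats.size"]
            / stats["kstat.zfs.misc.arcstats.c_max"]
        ),
        "meta_used_vs_meta_limit": (
            stats["vfs.zfs.arc_meta_used"]
            / stats["vfs.zfs.arc_meta_limit"]
        ),
    }


# The four relevant values, copied from the sysctl listing in this mail.
SAMPLE = """\
vfs.zfs.arc_meta_limit: 10485760
vfs.zfs.arc_meta_used: 130211600
kstat.zfs.misc.arcstats.c_max: 41943040
kstat.zfs.misc.arcstats.size: 130467088
"""

if __name__ == "__main__":
    for name, ratio in arc_overshoot(parse_sysctl(SAMPLE)).items():
        print(f"{name}: {ratio:.2f}x")
```

On a live system the same functions could be fed the output of `sysctl -a | grep zfs` instead of the embedded sample, for instance from a periodic job that logs the ratios while pkgdb runs.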