From: Christopher Key <cjk32@cam.ac.uk>
Date: Thu, 22 Oct 2009 14:57:42 +0100
To: freebsd-questions@freebsd.org
Subject: Re: ZFS: Strange performance issues

Christopher Key wrote:
> I'm running FreeBSD 7.2 amd64 on a system with 2GB RAM.  I've a ZFS pool
> using raidz1 over five 2TB SATA drives connected via a port multiplier
> and a RR2314 card.
>
> I can write to a filesystem on this pool at approximately 20MB/s:
>
> # dd if=/dev/urandom of=$FS/testdump bs=1m count=1k
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes transferred in 47.096440 secs (22798790 bytes/sec)
>
> and zpool iostat -v is consistent with this:
>
>                    capacity     operations    bandwidth
> pool            used  avail   read  write   read  write
> --------------  -----  -----  -----  -----  -----  -----
> films           6.37T  2.69T     11    440  53.2K  23.0M
>   raidz1        6.37T  2.69T     11    440  53.2K  23.0M
>     da0             -      -      1    214  88.2K  5.58M
>     da1             -      -      1    209  88.6K  5.58M
>     da2             -      -      1    211  76.0K  5.70M
>     da3             -      -      1    213  88.6K  5.77M
>     da4             -      -      1    213  88.6K  5.71M
> --------------  -----  -----  -----  -----  -----  -----
>
> However, the read behaviour is strange:
>
> # dd if=$FS/testdump of=/dev/null bs=1m count=1k
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes transferred in 40.392055 secs (26582996 bytes/sec)
>
> but here, zpool iostat -v is odd:
>
>                    capacity     operations    bandwidth
> pool            used  avail   read  write   read  write
> --------------  -----  -----  -----  -----  -----  -----
> films           6.37T  2.69T  1.52K      0   194M      0
>   raidz1        6.37T  2.69T  1.52K      0   194M      0
>     da0             -      -    420      0  48.2M      0
>     da1             -      -    274      0  24.0M      0
>     da2             -      -    270      0  24.4M      0
>     da3             -      -    418      0  48.4M      0
>     da4             -      -    418      0  48.4M      0
> --------------  -----  -----  -----  -----  -----  -----
>
> Notice that dd was reading at ~27MB/s, but ZFS was reading from the vdev
> at ~200MB/s.  Also odd are the reduced read rates on da1 and da2.
>
> I'm struggling to understand what's happening to the extra data being
> read.  The most likely scenario seems to be that ZFS is inflating its
> read size, knowing that it won't delay the transfer significantly, and
> hoping to pull in some useful data to the cache.
> However, it's either failing to cache this data correctly, or the file
> is highly non-contiguous and the extra data read doesn't contain
> anything useful to our read.
>
> I'm also somewhat surprised by the poor performance of the pool.  From
> memory, when it was first configured (on identical hardware and
> software), I could write at ~130MB/s and read at ~200MB/s.
>
> One conclusion is that the pool is suffering from something akin to
> fragmentation, perhaps with files always being allocated from very
> small blocks.  The vast majority of the data comprises large (~1GB)
> files that are written to one 'import' pool, moved to the main pool,
> then never modified.  There are, however, a lot (~5000) of small (<1kB)
> files that get rewritten half-hourly, and I'm wondering if that might
> be causing problems and confusing ZFS's block sizing algorithm.
>
> Can anyone shed any light on what might be going on, or how to further
> diagnose this problem?  Do any of my tentative conclusions make any
> sense?


I've restarted the system, and the problems have disappeared.  I can now
write at ~50MB/s and read at ~200MB/s, including reading from files that
were written before the reboot, when both reading and writing were slow.
This seems to rule out any problem with the on-disk layout of the data.
Also, the pool read rate now matches the filesystem read rate (as opposed
to 200MB/s vs. 27MB/s before the reboot).

The only conclusion that I can come to now is that caching was simply
non-existent before the reboot.  Comparing the output of

# sysctl -a | grep zfs

before and after the reboot (see below), the significant differences seem
to be the twentyfold increases in "kstat.zfs.misc.arcstats.p",
"kstat.zfs.misc.arcstats.c" and "kstat.zfs.misc.arcstats.size".  I'm not
sure of the correct interpretation of these, but if they really do
indicate a 20MB cache vs. a 400MB cache, I could well believe that would
explain the difference.

I'm not sure what might have caused the reduction in cache size; there
aren't any significant memory pressures that I'm aware of, unless there
was excessive content in /tmp.

Any thoughts?
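
In case it's useful, here's how I plan to keep an eye on the ARC counters
in question if the slowdown reproduces.  This is only a sketch built on
the arcstats sysctls quoted below; the 5-second interval is an arbitrary
choice:

# Print the ARC target and current size every 5 seconds (Ctrl-C to stop).
while true; do
    date
    sysctl kstat.zfs.misc.arcstats.p \
           kstat.zfs.misc.arcstats.c \
           kstat.zfs.misc.arcstats.size
    sleep 5
done

If the ARC really is collapsing towards ~20MB, one tentative workaround
(assuming the shrinking ARC is the cause, which I haven't verified) would
be to raise its floor with the vfs.zfs.arc_min loader tunable in
/boot/loader.conf and reboot.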
Kind regards,

Christopher Key


Before:

vfs.zfs.arc_min: 21353344
vfs.zfs.arc_max: 512480256
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 0
kstat.zfs.misc.arcstats.hits: 14715277
kstat.zfs.misc.arcstats.misses: 1013356
kstat.zfs.misc.arcstats.demand_data_hits: 10180440
kstat.zfs.misc.arcstats.demand_data_misses: 119214
kstat.zfs.misc.arcstats.demand_metadata_hits: 4026012
kstat.zfs.misc.arcstats.demand_metadata_misses: 141654
kstat.zfs.misc.arcstats.prefetch_data_hits: 116044
kstat.zfs.misc.arcstats.prefetch_data_misses: 745305
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 392781
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 7183
kstat.zfs.misc.arcstats.mru_hits: 2456403
kstat.zfs.misc.arcstats.mru_ghost_hits: 12154
kstat.zfs.misc.arcstats.mfu_hits: 11832908
kstat.zfs.misc.arcstats.mfu_ghost_hits: 84764
kstat.zfs.misc.arcstats.deleted: 1853402
kstat.zfs.misc.arcstats.recycle_miss: 262586
kstat.zfs.misc.arcstats.mutex_miss: 1180
kstat.zfs.misc.arcstats.evict_skip: 20216195
kstat.zfs.misc.arcstats.hash_elements: 6645
kstat.zfs.misc.arcstats.hash_elements_max: 34534
kstat.zfs.misc.arcstats.hash_collisions: 588095
kstat.zfs.misc.arcstats.hash_chains: 715
kstat.zfs.misc.arcstats.hash_chain_max: 7
kstat.zfs.misc.arcstats.p: 412187380
kstat.zfs.misc.arcstats.c: 415807010
kstat.zfs.misc.arcstats.c_min: 21353344
kstat.zfs.misc.arcstats.c_max: 512480256
kstat.zfs.misc.arcstats.size: 416037376

After:

vfs.zfs.arc_min: 21353344
vfs.zfs.arc_max: 512480256
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 0
kstat.zfs.misc.arcstats.hits: 14715277
kstat.zfs.misc.arcstats.misses: 1013356
kstat.zfs.misc.arcstats.demand_data_hits: 10180440
kstat.zfs.misc.arcstats.demand_data_misses: 119214
kstat.zfs.misc.arcstats.demand_metadata_hits: 4026012
kstat.zfs.misc.arcstats.demand_metadata_misses: 141654
kstat.zfs.misc.arcstats.prefetch_data_hits: 116044
kstat.zfs.misc.arcstats.prefetch_data_misses: 745305
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 392781
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 7183
kstat.zfs.misc.arcstats.mru_hits: 2456403
kstat.zfs.misc.arcstats.mru_ghost_hits: 12154
kstat.zfs.misc.arcstats.mfu_hits: 11832908
kstat.zfs.misc.arcstats.mfu_ghost_hits: 84764
kstat.zfs.misc.arcstats.deleted: 1853402
kstat.zfs.misc.arcstats.recycle_miss: 262586
kstat.zfs.misc.arcstats.mutex_miss: 1180
kstat.zfs.misc.arcstats.evict_skip: 20216195
kstat.zfs.misc.arcstats.hash_elements: 6645
kstat.zfs.misc.arcstats.hash_elements_max: 34534
kstat.zfs.misc.arcstats.hash_collisions: 588095
kstat.zfs.misc.arcstats.hash_chains: 715
kstat.zfs.misc.arcstats.hash_chain_max: 7
kstat.zfs.misc.arcstats.p: 412187380
kstat.zfs.misc.arcstats.c: 415807010
kstat.zfs.misc.arcstats.c_min: 21353344
kstat.zfs.misc.arcstats.c_max: 512480256
kstat.zfs.misc.arcstats.size: 416037376
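
P.S.  If the inflated vdev reads come back after this reboot, one way I
might test the prefetch theory from the quoted message above is roughly
the following.  This is only a sketch: vfs.zfs.prefetch_disable appears
in the sysctl output above, but on 7.2 it may be a boot-time tunable
that has to go in /boot/loader.conf rather than be set at runtime.

# Tentatively disable file-level prefetch, then repeat the read test
# while sampling the pool in another terminal every 5 seconds.
sysctl vfs.zfs.prefetch_disable=1
dd if=$FS/testdump of=/dev/null bs=1m count=1k
zpool iostat -v films 5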