From: Christopher Key <cjk32@cam.ac.uk>
Date: Tue, 22 Dec 2009 10:10:39 +0000
To: freebsd-questions@freebsd.org
Subject: Re: ZFS: Strange performance issues
Message-ID: <4B309B1F.5060800@cam.ac.uk>
In-Reply-To: <4AE064D6.2030504@cam.ac.uk>
References: <4ADC5318.2010706@cam.ac.uk> <4AE064D6.2030504@cam.ac.uk>

Hello,

I'm seeing exactly the same issues on my 5-disk raidz pool again. In
brief: I can only write to the filesystem at ~20MB/s and read from it
at ~25MB/s. Normally I'd expect to be able to write at nearer
50-100MB/s, and read at ~200MB/s.

The values from "zpool iostat -v" are also strange. When writing to
the pool, I get sensible values: the pool write rate matches the rate
reported by dd, and each of the five drives reports a quarter of this
rate. When reading from the filesystem, however, the pool read rate is
~200MB/s, with each drive reading at ~40MB/s, while dd still reports
only ~25MB/s.

The last time this happened, I concluded that ZFS had simply ceased to
perform any caching, which would apparently explain the above
observations. A reboot fixed the issue, and the assumption of caching
problems seemed borne out by the change in kstat.zfs.misc.arcstats.p,
.c and .size:

Before reboot*:

kstat.zfs.misc.arcstats.p: 21353344
kstat.zfs.misc.arcstats.c: 21353344
kstat.zfs.misc.arcstats.size: 29774848

After reboot:

kstat.zfs.misc.arcstats.p: 412187380
kstat.zfs.misc.arcstats.c: 415807010
kstat.zfs.misc.arcstats.size: 416037376

I had guessed that this caching problem was due to some unexpected
pressure on memory, but had rebooted the system before thinking to
check! Seeing the same issues again, I am now in a position to
investigate further.

I see the same reduced values in kstat.zfs.misc.arcstats.*:

kstat.zfs.misc.arcstats.p: 21353344
kstat.zfs.misc.arcstats.c: 21353344
kstat.zfs.misc.arcstats.size: 29774848

I'm not sure exactly how to interpret the stats from top, but there
doesn't appear to be anything amiss:

Mem: 340M Active, 765M Inact, 686M Wired, 976K Cache, 211M Buf, 162M Free
Swap: 8192M Total, 1152K Used, 8191M Free

Before I restart this system, can anyone suggest anything to further
diagnose what's going on here?
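For reference, the throughput and per-vdev figures above come from
runs along these lines; the pool name "tank" and the test file path
are placeholders for my actual paths:

    # sequential write test (~20MB/s observed; 50-100MB/s expected)
    dd if=/dev/zero of=/tank/testfile bs=1m count=4096

    # sequential read test (~25MB/s observed; ~200MB/s expected)
    dd if=/tank/testfile of=/dev/null bs=1m

    # per-vdev activity, sampled every 5 seconds while dd runs
    zpool iostat -v tank 5

    # the ARC figures quoted above
    sysctl kstat.zfs.misc.arcstats.p kstat.zfs.misc.arcstats.c \
        kstat.zfs.misc.arcstats.size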
What exactly do arcstats.p and arcstats.c mean, and would increasing
arcstats.c_min perhaps help? (If so, presumably that would be done via
the loader tunables; see the sketch at the end of this mail.)

Kind Regards,

Christopher Key

* The stats before the reboot in my previous post were incorrect. I
still have them archived, and include them here for future reference:

vfs.zfs.arc_min: 21353344
vfs.zfs.arc_max: 512480256
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 0
kstat.zfs.misc.arcstats.hits: 512541171
kstat.zfs.misc.arcstats.misses: 39560635
kstat.zfs.misc.arcstats.demand_data_hits: 156634097
kstat.zfs.misc.arcstats.demand_data_misses: 5743443
kstat.zfs.misc.arcstats.demand_metadata_hits: 303622664
kstat.zfs.misc.arcstats.demand_metadata_misses: 5425923
kstat.zfs.misc.arcstats.prefetch_data_hits: 4293179
kstat.zfs.misc.arcstats.prefetch_data_misses: 28037490
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 47991231
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 353779
kstat.zfs.misc.arcstats.mru_hits: 43042878
kstat.zfs.misc.arcstats.mru_ghost_hits: 16196037
kstat.zfs.misc.arcstats.mfu_hits: 420079251
kstat.zfs.misc.arcstats.mfu_ghost_hits: 3400520
kstat.zfs.misc.arcstats.deleted: 29558675
kstat.zfs.misc.arcstats.recycle_miss: 14747429
kstat.zfs.misc.arcstats.mutex_miss: 12390
kstat.zfs.misc.arcstats.evict_skip: 330353811
kstat.zfs.misc.arcstats.hash_elements: 1410
kstat.zfs.misc.arcstats.hash_elements_max: 30816
kstat.zfs.misc.arcstats.hash_collisions: 10467654
kstat.zfs.misc.arcstats.hash_chains: 31
kstat.zfs.misc.arcstats.hash_chain_max: 8
kstat.zfs.misc.arcstats.p: 21353344
kstat.zfs.misc.arcstats.c: 21353344
kstat.zfs.misc.arcstats.c_min: 21353344
kstat.zfs.misc.arcstats.c_max: 512480256
kstat.zfs.misc.arcstats.size: 29774848
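Following up on the c_min question above: the kstat.zfs.misc.arcstats.*
values are read-only, so if raising the ARC floor does turn out to
help, I assume the relevant knobs are the vfs.zfs.arc_min/arc_max
loader tunables shown in the list above. A minimal sketch of what I
have in mind for /boot/loader.conf; the 128MB figure is purely an
illustrative guess, not a recommendation:

    # /boot/loader.conf -- sizes are illustrative only
    vfs.zfs.arc_min="134217728"   # raise the ARC floor from ~20MB to 128MB
    vfs.zfs.arc_max="512480256"   # ceiling left at its current value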