Date: Tue, 08 Jun 2010 08:20:06 -0700
From: "Bradley W. Dutton" <brad@duttonbros.com>
To: Jeremy Chadwick
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS performance of various vdevs (long post)

Quoting Jeremy Chadwick:

> On Mon, Jun 07, 2010 at 05:32:18PM -0700, Bradley W. Dutton wrote:
>> I know it's pretty simple but for checking throughput I thought it
>> would be ok. I don't have compression on and based on the drive
>> lights and gstat, the drives definitely aren't idle.
>
> Try disabling prefetch (you have it enabled) and try setting
> vfs.zfs.txg.timeout="5". Some people have reported a "sweet spot" with
> regards to the last parameter (needing to be adjusted if your disks are
> extremely fast, etc.), as otherwise ZFS would be extremely "bursty" in
> its I/O (stalling/deadlocking the system at set intervals). By
> decreasing the value you essentially do disk writes more regularly
> (with less data), and depending upon the load and controller, this may
> even out performance.

I tested some of these settings.
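For reference, the two knobs went into /boot/loader.conf roughly as below.
This is a sketch from memory rather than a copy of the file, and the
prefetch tunable name is the one I take Jeremy to mean, so double-check it
against your own system before relying on it:

# /boot/loader.conf -- settings for the "txg=5 no prefetch" runs
vfs.zfs.prefetch_disable="1"    # turn off ZFS file-level prefetch
vfs.zfs.txg.timeout="5"         # commit a transaction group every 5 seconds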
With the timeout set to 5, not much changed write-wise (keep in mind these
results are from the Nvidia/WD RE2 combo). With txg=5 and prefetch disabled
I saw read speeds go down considerably:

# normal/jbod txg=5 no prefetch
zpool create bench /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14
dd if=/bench/test.file of=/dev/null bs=1m
12582912000 bytes transferred in 59.330330 secs (212082286 bytes/sec)
compared to
12582912000 bytes transferred in 34.668165 secs (362952928 bytes/sec)

zpool create bench raidz /dev/ad4 /dev/ad6 /dev/ad10 /dev/ad12 /dev/ad14
dd if=/bench/test.file of=/dev/null bs=1m
12582912000 bytes transferred in 71.135696 secs (176886046 bytes/sec)
compared to
12582912000 bytes transferred in 45.825533 secs (274582993 bytes/sec)

Running the same tests on the raidz2 Supermicro/Hitachi setup didn't yield
any difference in writes, but the reads were slower:

zpool create tank raidz2 /dev/da0 /dev/da1 /dev/da2 /dev/da3 /dev/da4 /dev/da5 /dev/da6 /dev/da7
dd if=/tank/test.file of=/dev/null bs=1m
12582912000 bytes transferred in 44.118409 secs (285207745 bytes/sec)
compared to
12582912000 bytes transferred in 32.911291 secs (382328118 bytes/sec)

I rebooted and reran these numbers just to make sure they were consistent.

>> >The higher CPU usage might be due to the device driver or the
>> >interface card being used.
>>
>> Definitely a plausible explanation. If this was the case would the 8
>> parallel dd processes exhibit the same behavior? or is the type of
>> IO affecting how much CPU the driver is using?
>
> It would be the latter.
>
> Also, I believe this Supermicro controller has been discussed in the
> past. I can't remember if people had outright failures/issues with it
> or if people were complaining about sub-par performance. I could also
> be remembering a different Supermicro controller.
>
> If I had to make a recommendation, it would be to reproduce the same
> setup on a system using an Intel ICH9/ICH9R or ICH10/ICH10R controller
> in AHCI mode (with ahci.ko loaded, not ataahci.ko) and see if things
> improve. But start with the loader.conf tunables I mentioned above --
> segregate each test.
>
> I would also recommend you re-run your tests with a different blocksize
> for dd. I don't know why people keep using 1m (Linux websites?). Test
> the following increments: 4k, 8k, 16k, 32k, 64k, 128k, 256k. That's
> about where you should stop.

I tested with bs=8k, 16k, 32k, 64k, 128k, and 1m, and the results all
looked similar. As such I stuck with bs=1m because it makes the dd count
easier to work out.

> Otherwise, consider installing ports/benchmarks/bonnie++ and try that.
> That will also get you concurrent I/O tests, I believe.

I may give this a shot, but I'm most interested in lower concurrency since
I have larger files with only a couple of readers/writers. As Bob noted, a
bunch of mirrors in the pool would definitely be faster for concurrent IO.

Thanks for the help,
Brad
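P.S. In case anyone wants to repeat the blocksize comparison, the sweep
amounted to a loop along these lines. It's a rough sketch rather than my
exact commands; the /bench/test.file path and the ~12000MB total are taken
from the runs above, and the count table just scales each blocksize to that
same total:

#!/bin/sh
# Rough sketch of the blocksize sweep (not the exact commands I ran).
# Every pass moves the same ~12000MB; count is scaled to the blocksize
# so the totals stay comparable.
for bs in 8k 16k 32k 64k 128k 1m; do
    case ${bs} in
        8k)   count=1536000 ;;
        16k)  count=768000  ;;
        32k)  count=384000  ;;
        64k)  count=192000  ;;
        128k) count=96000   ;;
        1m)   count=12000   ;;
    esac
    echo "== bs=${bs} =="
    # write test: compression is off, so /dev/zero is a fair source
    dd if=/dev/zero of=/bench/test.file bs=${bs} count=${count}
    # read test: pull the whole file back
    dd if=/bench/test.file of=/dev/null bs=${bs}
done

One caveat with a sketch like this: reading the file straight back after
writing it can be served largely from the ARC, so for numbers comparable to
the ones above you'd want to recreate the pool (or reboot) between the
write and read passes, and adjust the /bench path to whichever pool you're
testing.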