From: Ivan Voras
To: freebsd-current@freebsd.org
Date: Thu, 03 Dec 2009 10:00:25 +0100
Subject: Re: NCQ vs UFS/ZFS benchmark [Was: Re: FreeBSD 8.0 Performance (at Phoronix)]
In-Reply-To: <4B170FCB.3030102@FreeBSD.org>

Alexander Motin wrote:
> Ivan Voras wrote:
>> If you have a drive to play with, could you also check UFS vs ZFS on
>> both ATA & AHCI, to see if the I/O scheduling of ZFS plays nicely?
>>
>> For benchmarks I suggest blogbench and bonnie++ (in ports) and, if you
>> want to bother, randomio: http://arctic.org/~dean/randomio
>
> gstat showed that most of the time only one request at a time was
> running on the disk. It looks like read or read-modify-write operations
> (due to the many short writes in the test pattern) are heavily
> serialized in UFS, even when several processes work on the same file.
> This almost eliminated the effect of NCQ in this test.
>
> Test 2: Same as before, but without the O_DIRECT flag:
> ata(4), 1 process, first tps: 78
> ata(4), 1 process, second tps: 469
> ata(4), 32 processes, first tps: 83
> ata(4), 32 processes, second tps: 475
> ahci(4), 1 process, first tps: 79
> ahci(4), 1 process, second tps: 476
> ahci(4), 32 processes, first tps: 93
> ahci(4), 32 processes, second tps: 488

OK, so this is UFS with normal caching.

> The data doesn't fit into the cache. Multiple parallel requests give
> some effect even with the legacy driver, but with NCQ enabled they give
> much more, almost doubling performance!

Have you seen queueing in gstat for ZFS+NCQ?
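I usually just watch the L(q) column in gstat for the disk under test,
something like this (a minimal sketch; the device names are assumptions,
typically ad4 with ata(4) and ada0 with ahci(4)):

    # show only providers that are actually busy and watch the L(q)
    # column of the test disk (e.g. ad4 or ada0)
    gstat -a

If L(q) stays at 0 or 1 the drive never sees more than one outstanding
request, so NCQ cannot help; values well above 1 mean requests really
are being queued.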
> Test 4: Same as 3, but with kmem_size=1900M and arc_max=1700M:
> ata(4), 1 process, first tps: 90
> ata(4), 1 process, second tps: ~160-300
> ata(4), 32 processes, first tps: 112
> ata(4), 32 processes, second tps: ~190-322
> ahci(4), 1 process, first tps: 90
> ahci(4), 1 process, second tps: ~140-300
> ahci(4), 32 processes, first tps: 180
> ahci(4), 32 processes, second tps: ~280-550

And this is ZFS with some tuning. I have also seen high variation in
ZFS performance, so that looks normal.

> As a conclusion:
> - In this particular test ZFS scaled well with parallel requests,
> effectively using multiple disks. NCQ showed great benefits, but i386
> constraints significantly limited ZFS's caching abilities.
> - UFS behaves very poorly in this test. Even with a parallel workload
> it often serializes device accesses. Maybe the results would be
> different if

I wouldn't say UFS behaves poorly, judging from your results. It looks
like only the multi-process case is bad on UFS. For single-process
access the difference in favour of ZFS is only ~10 tps on the first
pass, and on the second pass UFS is apparently much better in all cases
but the last. A large variation between runs could also explain this.

Also, did you use the whole drive for the file system? In cases like
this it would be interesting to create a special partition (in all
cases, on all drives) covering only a small segment of the disk,
thinking of the drive as rotational media made of cylinders. For
example, a 30 GB partition covering only the outer tracks (see the
sketch at the end of this message).

> there were a separate file for each process, or with some other
> options, but I think the pattern I have used is also possible in some
> applications. The only benefit UFS showed here is more effective
> memory management on i386, leading to higher cache effectiveness.
>
> It would be nice if somebody explained that UFS behavior.

Possibly, read-only access to the in-memory cache structures is
protected by read locks, which are efficient, while ARC is more
complicated than it's worth? But others probably have better guesses :)
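Regarding the partition idea above, a rough sketch of what I mean,
assuming the test disk shows up as ada0 and using GPT (this of course
wipes the disk):

    # one 30 GB partition at the start of the disk, i.e. the outer tracks
    gpart create -s GPT ada0
    gpart add -t freebsd-ufs -s 30G ada0   # creates ada0p1, for the UFS runs
    newfs -U /dev/ada0p1

    # for the ZFS runs, the same idea with a freebsd-zfs partition:
    #   gpart add -t freebsd-zfs -s 30G ada0
    #   zpool create testpool ada0p1

(Depending on the gpart version, the size may have to be given in
sectors rather than with the 30G shorthand.)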