From: Alexander Motin
Message-ID: <4B170FCB.3030102@FreeBSD.org>
Date: Thu, 03 Dec 2009 03:09:31 +0200
To: Ivan Voras
References:
 <1259583785.00188655.1259572802@10.7.7.3>
 <1259659388.00189017.1259647802@10.7.7.3>
 <1259691809.00189274.1259681402@10.7.7.3>
 <1259695381.00189283.1259682004@10.7.7.3>
In-Reply-To: <1259695381.00189283.1259682004@10.7.7.3>
Cc: FreeBSD-Current
Subject: NCQ vs UFS/ZFS benchmark [Was: Re: FreeBSD 8.0 Performance (at Phoronix)]

Ivan Voras wrote:
> If you have a drive to play with, could you also check UFS vs ZFS on
> both ATA & AHCI? To try and see if the IO scheduling of ZFS plays nicely.
>
> For benchmarks I suggest blogbench and bonnie++ (in ports) and if you
> want to bother, randomio, http://arctic.org/~dean/randomio .

I have looked at randomio and found that it is also tuned to test a
physical drive, and it does almost the same as raidtest. The main
difference is that raidtest uses pre-generated test patterns, so its
results are much more repeatable. What bonnie++ does is another
question; I prefer to trust results which I can explain.

So I have spent several hours quickly comparing UFS and ZFS in several
scenarios, using the ata(4) and ahci(4) drivers. This is not rigorous
research, but I have checked every number at least twice, and some
unexpected or deviating ones even more.

I pre-wrote a 20GB file on each empty file system and used raidtest to
generate a random mix of 10000 read/write requests of random size
(512B - 128KB) to those files. Every single run took about a minute,
and the total transfer size per run was about 600MB. I have used the
same request pattern in all tests.
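To illustrate what a pre-generated request pattern means here, below is
a minimal Python sketch. It is not raidtest's code or its file format:
the fixed seed, the 50/50 read/write split, the uniform size
distribution and the 512-byte alignment are all assumptions made for
illustration, and make_pattern is a hypothetical name. Only the request
count, the size range and the file size come from the description
above.

```python
import random

SECTOR = 512                  # assumed alignment unit (not stated in the post)
FILE_SIZE = 20 * 1024**3      # 20GB test file, as described above
MIN_LEN = 512                 # smallest request size, from the post
MAX_LEN = 128 * 1024          # largest request size, from the post


def make_pattern(count=10000, seed=1):
    """Return a repeatable list of (op, offset, length) tuples."""
    rng = random.Random(seed)  # fixed seed: the same pattern on every call
    pattern = []
    for _ in range(count):
        op = rng.choice(("read", "write"))  # assumed 50/50 mix
        # Length is a whole number of sectors between MIN_LEN and MAX_LEN.
        length = rng.randint(MIN_LEN // SECTOR, MAX_LEN // SECTOR) * SECTOR
        # Choose a sector-aligned offset that keeps the request in the file.
        offset = rng.randint(0, (FILE_SIZE - length) // SECTOR) * SECTOR
        pattern.append((op, offset, length))
    return pattern


if __name__ == "__main__":
    pattern = make_pattern()
    total = sum(length for _, _, length in pattern)
    print(f"{len(pattern)} requests, {total / 1e6:.0f} MB total")
```

Because the pattern depends only on the seed, every driver and file
system combination can be given exactly the same sequence of requests.
With uniformly distributed sizes the average request is about 64KB, so
10000 requests add up to roughly 650MB, which is in line with the
"about 600MB" per run mentioned above, although the real size
distribution may differ.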
Test 1: raidtest with O_DIRECT flag (default) on UFS file system:

ata(4), 1 process               tps: 70
ata(4), 32 processes            tps: 71
ahci(4), 1 process              tps: 72
ahci(4), 32 processes           tps: 81

gstat showed that most of the time only one request at a time was
running on the disk. It looks like read or read-modify-write operations
(due to many short writes in the test pattern) are heavily serialized
in UFS, even when several processes are working with the same file.
This almost eliminated the effect of NCQ in this test.

Test 2: Same as before, but without O_DIRECT flag:

ata(4), 1 process, first        tps: 78
ata(4), 1 process, second       tps: 469
ata(4), 32 processes, first     tps: 83
ata(4), 32 processes, second    tps: 475
ahci(4), 1 process, first       tps: 79
ahci(4), 1 process, second      tps: 476
ahci(4), 32 processes, first    tps: 93
ahci(4), 32 processes, second   tps: 488

Without the O_DIRECT flag UFS was able to fit all accessed data into
the buffer cache on the second run. The second run uses the buffer
cache for all reads and writes are not serialized, but the NCQ effect
is minimal in this situation. The first run is still mostly serialized.

Test 3: Same as 2, but with ZFS (i386 without tuning):

ata(4), 1 process, first        tps: 75
ata(4), 1 process, second       tps: 73
ata(4), 32 processes, first     tps: 98
ata(4), 32 processes, second    tps: 97
ahci(4), 1 process, first       tps: 77
ahci(4), 1 process, second      tps: 80
ahci(4), 32 processes, first    tps: 139
ahci(4), 32 processes, second   tps: 142

Data doesn't fit into cache. Multiple parallel requests give some
benefit even with the legacy driver, but with NCQ enabled they give
much more, almost doubling performance!

Test 4: Same as 3, but with kmem_size=1900M and arc_max=1700M:

ata(4), 1 process, first        tps: 90
ata(4), 1 process, second       tps: ~160-300
ata(4), 32 processes, first     tps: 112
ata(4), 32 processes, second    tps: ~190-322
ahci(4), 1 process, first       tps: 90
ahci(4), 1 process, second      tps: ~140-300
ahci(4), 32 processes, first    tps: 180
ahci(4), 32 processes, second   tps: ~280-550

Data is slightly cached on the first run and heavily cached on the
second. But even this amount of memory (the maximum I can dedicate on
my i386) is not enough to cache all the data. The second run gives a
different device access pattern each time and very random results.

Test 5: Same as 3, but with 2 disks:

ata(4), 1 process, first        tps: 80
ata(4), 1 process, second       tps: 79
ata(4), 32 processes, first     tps: 186
ata(4), 32 processes, second    tps: 181
ahci(4), 1 process, first       tps: 79
ahci(4), 1 process, second      tps: 110
ahci(4), 32 processes, first    tps: 287
ahci(4), 32 processes, second   tps: 290

Data doesn't fit into cache. The second disk gives almost no
improvement for serialized requests. Multiple parallel requests double
the speed even with the legacy driver, because requests are spread
between the drives. Adding NCQ support raises the speed significantly
further.

As a conclusion:
- In this particular test ZFS scaled well with parallel requests,
  effectively using multiple disks. NCQ showed great benefits. But i386
  constraints significantly limited ZFS caching abilities.
- UFS behaved very poorly in this test. Even with a parallel workload
  it often serializes device accesses. Maybe the results would be
  different if there were a separate file for each process, or with
  some other options, but I think the pattern I have used is also
  possible in some applications. The only benefit UFS showed here is
  more effective memory management on i386, leading to higher cache
  effectiveness.

It would be nice if somebody explained that UFS behavior.

-- 
Alexander Motin