From owner-freebsd-hackers@FreeBSD.ORG Fri Jul  4 11:37:31 2014
Message-ID: <53B691EA.3070108@citrix.com>
Date: Fri, 4 Jul 2014 13:37:14 +0200
From: Roger Pau Monné
To: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Strange IO performance with UFS

Hello,

I'm doing some tests on IO performance using fio, and I've found something
weird when using UFS and large files. I have the following very simple
sequential fio workload:

[global]
rw=write
size=10g
bs=4k

[job1]

The box has 6GB of RAM. While running this workload I also run
`iostat -xz -w 1` in parallel. The result of fio is pretty disappointing
in terms of performance:

bw=33309KB/s, iops=8327

The output of iostat is the following:

device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     266.1 299.0 34000.8 38243.4    1  92.4 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     236.7 235.7 30295.1 30168.9   30  61.0 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     301.8 224.7 38272.7 28674.4   80  49.3  95
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     185.1 274.8 23687.5 35168.7   15  92.4 105
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     258.4 238.1 33077.3 30475.7   36  57.1 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     200.3 213.4 25634.5 27319.4    8  72.7 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     243.3 233.7 31053.3 29919.1   31  57.4 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     243.5 228.5 31169.7 29244.1   49  73.2  99

So we are performing almost the same amount of reads as writes, even though
the workload is purely a sequential write. This doesn't seem right to me; I
expected the number of reads to be much lower.
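In case it helps to reproduce this without fio, the workload boils down to a
plain sequential write loop. A minimal sketch of what I mean (the file path
and the lack of any tuning are just for illustration; this is not what fio
executes internally):

#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Sequential 4k writes until 10GB have been written, mirroring the
 * fio job above (rw=write, size=10g, bs=4k).
 */
int
main(void)
{
	char buf[4096];			/* bs=4k */
	off_t total = 10ULL << 30;	/* size=10g */
	off_t done;
	int fd;

	memset(buf, 0xa5, sizeof(buf));
	fd = open("/mnt/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return (1);
	for (done = 0; done < total; done += sizeof(buf))
		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
			return (1);
	close(fd);
	return (0);
}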
I've also added the following DTrace probe in order to figure out where those
reads are coming from, and the stack trace of all these read bios is always
the same:

kernel`g_io_request+0x384
kernel`g_part_start+0x2c3
kernel`g_io_request+0x384
kernel`g_part_start+0x2c3
kernel`g_io_request+0x384
kernel`ufs_strategy+0x8a
kernel`VOP_STRATEGY_APV+0xf5
kernel`bufstrategy+0x46
kernel`cluster_read+0x5e6
kernel`ffs_balloc_ufs2+0x1be2
kernel`ffs_write+0x310
kernel`VOP_WRITE_APV+0x166
kernel`vn_write+0x2eb
kernel`vn_io_fault_doio+0x22
kernel`vn_io_fault1+0x78
kernel`vn_io_fault+0x173
kernel`dofilewrite+0x85
kernel`kern_writev+0x65
kernel`sys_write+0x63

The probe used is the following:

io:::start
/args[0] && (args[0]->bio_cmd == BIO_READ)/
{
        @traces[stack()] = count();
}

If I lower the file size in the fio workload to, for example, 4GB (so the
file fits in the box's 6GB of RAM), everything seems fine and I see almost
no reads in iostat:

bw=84953KB/s, iops=21238

device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0 694.6     0.0 88912.2   82 111.4 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       4.0 559.4   159.3 71014.2   67  99.6  99
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       1.9 630.8   124.8 80617.0   63  90.6 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0 673.3     0.0 86177.9   80 107.2  99
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       7.0 564.5   381.7 72260.6    4  94.1 101
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       2.9 641.8    92.2 82113.9   55 101.3 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       1.9 638.9   151.2 81773.4   54  90.4 100

Is this something expected/known? Am I doing something wrong in the tests?

Thanks, Roger.
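P.S. For completeness, I gathered the traces by saving the probe above to a
file and running it alongside fio, something like the following (the file
name is arbitrary):

# dtrace -s bio-reads.d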