Date:      Fri, 4 Jul 2014 13:37:14 +0200
From:      Roger Pau Monné <roger.pau@citrix.com>
To:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Strange IO performance with UFS
Message-ID:  <53B691EA.3070108@citrix.com>

Hello,

I'm doing some tests on IO performance using fio, and I've found
something weird when using UFS and large files. I have the following
very simple sequential fio workload:

[global]
rw=write
size=10g
bs=4k

[job1]
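
To reproduce, save the job file above (e.g. as seq-write.fio; the name
is arbitrary) and run:

fio seq-write.fio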

In this case the box has 6GB of RAM, and while the fio workload runs I
also run `iostat -xz -w 1` in parallel. The result of fio is pretty
disappointing in terms of performance:

bw=33309KB/s, iops=8327

The output of iostat is the following:

device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     266.1 299.0 34000.8 38243.4    1  92.4 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     236.7 235.7 30295.1 30168.9   30  61.0 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     301.8 224.7 38272.7 28674.4   80  49.3  95
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     185.1 274.8 23687.5 35168.7   15  92.4 105
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     258.4 238.1 33077.3 30475.7   36  57.1 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     200.3 213.4 25634.5 27319.4    8  72.7 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     243.3 233.7 31053.3 29919.1   31  57.4 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0     243.5 228.5 31169.7 29244.1   49  73.2  99

So we are performing almost the same amount of reads as writes to disk,
even though the workload is a pure sequential write. This doesn't seem
right to me; I was expecting the number of reads to be much lower. One
explanation that would fit the numbers is read-modify-write of partial
file system blocks: bs=4k is smaller than the default 32k UFS block
size, so overwriting a block that has already been evicted from the
buffer cache means reading it back from disk first, which would make the
bytes read roughly match the bytes written.

I've also added a DTrace probe (shown after the trace) in order to
figure out where those reads are coming from; the stack trace of all
these read bios is always the same:

              kernel`g_io_request+0x384
              kernel`g_part_start+0x2c3
              kernel`g_io_request+0x384
              kernel`g_part_start+0x2c3
              kernel`g_io_request+0x384
              kernel`ufs_strategy+0x8a
              kernel`VOP_STRATEGY_APV+0xf5
              kernel`bufstrategy+0x46
              kernel`cluster_read+0x5e6
              kernel`ffs_balloc_ufs2+0x1be2
              kernel`ffs_write+0x310
              kernel`VOP_WRITE_APV+0x166
              kernel`vn_write+0x2eb
              kernel`vn_io_fault_doio+0x22
              kernel`vn_io_fault1+0x78
              kernel`vn_io_fault+0x173
              kernel`dofilewrite+0x85
              kernel`kern_writev+0x65
              kernel`sys_write+0x63

The probe used is the following:

io:::start
/args[0] && (args[0]->bio_cmd == BIO_READ)/
{
        @traces[stack()] = count();
}
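
A follow-up idea: the same probe could also aggregate the size and
offset of those reads. If they all turn out to be full 32k file system
blocks, that would support the read-modify-write theory above. A minimal
sketch, using the standard bio_bcount and bio_offset fields of struct
bio (I haven't run this variant yet):

io:::start
/args[0] && (args[0]->bio_cmd == BIO_READ)/
{
        @sizes["read size (bytes)"] = quantize(args[0]->bio_bcount);
        @offsets["read offset (bytes)"] = quantize(args[0]->bio_offset);
}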

If I lower the file size of the fio workload to 4GB, for example (so
the whole file fits in RAM), everything seems fine and I see almost no
reads in iostat:

bw=84953KB/s, iops=21238

device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0 694.6     0.0 88912.2   82 111.4 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       4.0 559.4   159.3 71014.2   67  99.6  99
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       1.9 630.8   124.8 80617.0   63  90.6 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       0.0 673.3     0.0 86177.9   80 107.2  99
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       7.0 564.5   381.7 72260.6    4  94.1 101
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       2.9 641.8    92.2 82113.9   55 101.3 100
                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
ada0       1.9 638.9   151.2 81773.4   54  90.4 100

Is this something expected/known? Am I doing something wrong in the tests?

Thanks, Roger.


