Date: Tue, 07 Jul 2009 16:26:29 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: freebsd-arch@freebsd.org
Subject: Re: DFLTPHYS vs MAXPHYS
Message-ID: <4A534D05.1040709@FreeBSD.org>
In-Reply-To: <1246915383.00136290.1246904409@10.7.7.3>
References: <1246746182.00135530.1246735202@10.7.7.3>
 <1246792983.00135712.1246781401@10.7.7.3>
 <1246796580.00135722.1246783203@10.7.7.3>
 <1246814582.00135806.1246803602@10.7.7.3>
 <1246818181.00135809.1246804804@10.7.7.3>
 <1246825383.00135846.1246812602@10.7.7.3>
 <1246825385.00135854.1246814404@10.7.7.3>
 <1246830930.00135868.1246819202@10.7.7.3>
 <1246830933.00135875.1246820402@10.7.7.3>
 <1246908182.00136258.1246896003@10.7.7.3>
 <1246911786.00136277.1246900203@10.7.7.3>
 <1246915383.00136290.1246904409@10.7.7.3>
Matthew Dillon wrote:
>        tty              da0              cpu
>  tin  tout    KB/t    tps   MB/s  us ni sy in id
>    0    11    0.50  17511   8.55   0  0 15  0 85   bs=512
>    0    11    1.00  16108  15.73   0  0 12  0 87   bs=1024
>    0    11    2.00  14758  28.82   0  0 11  0 89   bs=2048
>    0    11    4.00  12195  47.64   0  0  7  0 93   bs=4096
>    0    11    8.00   8026  62.70   0  0  5  0 95   bs=8192    << MB/s breakpt
>    0    11   16.00   4018  62.78   0  0  4  0 96   bs=16384
>    0    11   32.00   2025  63.28   0  0  2  0 98   bs=32768   << id breakpt
>    0    11   64.00   1004  62.75   0  0  1  0 99   bs=65536
>    0    11  128.00    506  63.25   0  0  1  0 99   bs=131072

As I have written before, my SSD continues to improve speed up to a 512 KB
transaction size, and maybe further; I haven't tested beyond that.

> Random seek/read
>
>        tty              da0              cpu
>  tin  tout    KB/t    tps   MB/s  us ni sy in id
>    0    11    0.50    189   0.09   0  0  0  0 100   bs=512
>    0    11    1.00    184   0.18   0  0  0  0 100   bs=1024
>    0    11    2.00    177   0.35   0  0  0  0 100   bs=2048
>    0    11    4.00    175   0.68   0  0  0  0 100   bs=4096
>    0    11    8.00    172   1.34   0  0  0  0 100   bs=8192
>    0    11   16.00    166   2.59   0  0  0  0 100   bs=16384
>    0    11   32.00    159   4.97   0  0  1  0  99   bs=32768
>    0    11   64.00    142   8.87   0  0  0  0 100   bs=65536
>    0    11  128.00    117  14.62   0  0  0  0 100   bs=131072
>                       ^^^  ^^^^^
>                 note TPS rate and MB/s
>
> Which is the more important tuning variable: efficiency of linear
> reads, or saving re-seeks by buffering more data? If you didn't choose
> saving re-seeks, you lose.
>
> To go from 16K to 32K requires saving 5% of future re-seeks to break even.
> To go from 32K to 64K requires saving 11% of future re-seeks.
> To go from 64K to 128K requires saving 18% of future re-seeks.
> (at least with this particular disk)
>
> At the point where the block size exceeds 32768, if you aren't saving
> re-seeks through locality of reference in the additional cached data,
> you lose. If you are saving re-seeks, you win. CPU caches do not enter
> into the equation at all.
>
> For most filesystems the re-seeks being saved depend on the access
> pattern. For example, if you are doing an ls -lR or a find, the re-seek
> pattern will be related to inode and directory lookups.
> The number of inodes which fit in a cluster_read(), assuming reasonable
> locality of reference, will wind up determining the performance.
>
> However, as the buffer size grows, the total number of bytes you are
> able to cache becomes the dominant factor in calculating the re-seek
> efficiency. I don't have a graph for that but, ultimately, it means
> that reading very large blocks (i.e. 1MB) with a non-linear access
> pattern is bad because most of the additional data cached will never
> be used before the memory winds up being re-used to cache some other
> cluster.

You are mixing two completely different things. I was never talking about
file system block size, and I do not dispute that a 16/32K file system
block size may be quite effective in most cases. I was talking about the
maximum _disk_transaction_ size. It is not the same thing. When the file
system needs only a small amount of data, or the file itself is small,
there is definitely no need to read or write more than one small FS block.
But when the file system predicts that a large read-ahead will be
effective, or it has a lot of write-back data, there is no reason not to
transfer the contiguous blocks in one big disk transaction. Splitting it
only increases command overhead at every layer, and makes it possible for
the drive to be interrupted between those operations to do some very long
seek.

-- 
Alexander Motin
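[Editorial note: Dillon's break-even percentages can be checked directly
against the tps column of his random seek/read table. Doubling the
transfer size drops the achievable seek rate, so the larger size must
save at least the fraction 1 - tps_large/tps_small of future re-seeks to
come out ahead. A minimal sketch, using only the tps values quoted above:]

```python
# Break-even re-seek savings implied by the "Random seek/read" table:
# a doubled transfer size pays off only if the extra cached data saves
# at least the fraction of seeks lost to the lower per-seek rate.
tps = {16384: 166, 32768: 159, 65536: 142, 131072: 117}  # from the table

def break_even(small: int, large: int) -> float:
    """Fraction of future re-seeks the larger transfer size must save."""
    return 1.0 - tps[large] / tps[small]

for small, large in [(16384, 32768), (32768, 65536), (65536, 131072)]:
    print(f"{small // 1024}K -> {large // 1024}K: "
          f"{break_even(small, large):.1%}")
```

This reproduces Dillon's figures to within a point: roughly 4-5%, 11%,
and 18% for the three doublings.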
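[Editorial note: Motin's command-overhead argument can be sketched with a
toy model. The per-command overhead figure below is an illustrative
assumption, not a value measured in this thread; only the ~63 MB/s media
rate comes from Dillon's linear-read table. Splitting one contiguous
transfer into many MAXPHYS-sized transactions multiplies that fixed
per-command cost:]

```python
import math

# Toy model: time to move one contiguous region as N fixed-size
# disk transactions.  PER_CMD_OVERHEAD_US (command setup, completion
# interrupt, per-layer bookkeeping) is an assumed illustrative value.
PER_CMD_OVERHEAD_US = 100.0   # assumption, not measured
MEDIA_RATE_MBS = 63.0         # sequential rate from the linear-read table

def transfer_time_us(total_bytes: int, max_xfer: int) -> float:
    ncmds = math.ceil(total_bytes / max_xfer)
    media_us = total_bytes / (MEDIA_RATE_MBS * 1e6) * 1e6
    return ncmds * PER_CMD_OVERHEAD_US + media_us

# A 1 MB read-ahead issued as 64K transactions vs one-quarter as many 512K:
for max_xfer in (65536, 524288):
    print(f"max xfer {max_xfer >> 10}K: "
          f"{transfer_time_us(1 << 20, max_xfer):.0f} us")
```

Under these assumptions the 64K split issues 16 commands instead of 2,
adding pure overhead to the same media time; the effect grows with any
seek a competing request slips in between the split transactions.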