Date: Fri, 8 Jan 2010 02:15:10 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Alexander Motin <mav@freebsd.org>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: svn commit: r201658 - head/sbin/geom/class/stripe
Message-ID: <20100108013737.S56162@delplex.bde.org>
In-Reply-To: <4B450F30.20705@FreeBSD.org>
References: <201001061712.o06HCICF087127@svn.freebsd.org> <9bbcef731001060938k2b0014a2m15eef911b9922b2c@mail.gmail.com> <4B44D8FA.2000608@FreeBSD.org> <9bbcef731001061103u33fd289q727179454b21ce18@mail.gmail.com> <4B450F30.20705@FreeBSD.org>

On Thu, 7 Jan 2010, Alexander Motin wrote:

> Ivan Voras wrote:
>> Yes, my experience which lead to the post was mostly on UFS which,
>> while AFAIK it does read-ahead, it still does it serially (I think
>> this is implied by your experiments with NCQ and ZFS vs UFS) - so in
>> any case only 2 drives are hit with 64k stripe size at any moment in
>> time.
>
> I do not think it is true. On system with default MAXPHYS I've made
> gstripe with 64K block of 4 equal drives with 108MB/s of maximal read
> speed. Reads with dd from large pre-written file on UFS shown:
>
> vfs.read_max=8 (default) - 235090074 bytes/sec
> vfs.read_max=16 - 378385148 bytes/sec
> vfs.read_max=32 - 386620109 bytes/sec

Maybe I'm wrong about it being limited by MAXPHYS.  'racluster' is
limited by MAXPHYS, but 'maxra' (vfs.read_max) is not, and these
interact confusingly.

BTW, vfs.read_max has bogus units -- fs blocks (bsize not fsize for ffs
IIRC).  The default of 8 works very badly when the fs block size is
small (512 say).  In my version, the units are DEV_BSIZE blocks and the
default is the default MAXPHYS/DEV_BSIZE (should be MAXPHYS/DEV_BSIZE).

> I've put some printfs into the clustering read code and found enough
> read-ahead there. So it works.
>
> One thing IMHO would be nice to see there is the alignment of the
> read-ahead requests to the array stripe size/offset. Dirty hack I've
> tried there, reduced number of requests to the array components by 30%.

ffs thinks that bsize alignment is adequate.  It doesn't try to align
files any more than that.  Then for sequential reads from the beginning
of the file, vfs read clustering tries to read MAXPHYS bytes at a time,
so it perfectly preserves any initial misalignment.  I'm not sure what
happens for large random reads.  Does seeking outside of the read-ahead
reset the alignment to the seek point?  It shouldn't, if alignment done
by the file system is to work right.  However, vfs should re-align if
the file system or user i/o doesn't, so that all of its reads of
mnt_iosize_max bytes start on an alignment boundary.

Bruce
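
[Editor's illustration, not part of the original message.]  The point
about preserved misalignment can be sketched with a small counting
model.  Everything in it is an assumption made for illustration: the
64K stripe size is taken from the thread, 128K stands in for
MAXPHYS/mnt_iosize_max, and pieces(), count_preserving() and
count_realigned() are hypothetical helpers, not kernel functions.  The
model only counts how many stripe-sized pieces each request is split
into; it is not the actual vfs clustering code.

```c
#include <assert.h>
#include <stdint.h>

#define STRIPE_SIZE	(64u * 1024)	/* assumed stripe size (64K, as in the thread) */
#define MAX_IO		(128u * 1024)	/* assumed stand-in for MAXPHYS/mnt_iosize_max */

/* Number of stripe-sized pieces that the request [off, off + len) touches. */
uint64_t
pieces(uint64_t off, uint64_t len)
{
	assert(len > 0);
	return ((off + len - 1) / STRIPE_SIZE - off / STRIPE_SIZE + 1);
}

/*
 * Read [start, end) in MAX_IO-sized requests beginning at start, so any
 * initial misalignment is carried into every later request.
 */
uint64_t
count_preserving(uint64_t start, uint64_t end)
{
	uint64_t n = 0, off = start, len;

	while (off < end) {
		len = end - off < MAX_IO ? end - off : MAX_IO;
		n += pieces(off, len);
		off += len;
	}
	return (n);
}

/*
 * Same range, but each request stops at the next MAX_IO boundary, so
 * only the first request can be short and misaligned.
 */
uint64_t
count_realigned(uint64_t start, uint64_t end)
{
	uint64_t n = 0, off = start, len;

	while (off < end) {
		/* Distance to the next boundary; a full MAX_IO if already aligned. */
		len = MAX_IO - off % MAX_IO;
		if (len > end - off)
			len = end - off;
		n += pieces(off, len);
		off += len;
	}
	return (n);
}
```

Under these assumptions, a 1MB sequential read starting 16K into the
stripe costs 24 component requests when the misalignment is preserved
(each 128K request straddles three stripes) and 17 when the reads are
re-aligned.  A read that starts on a boundary costs 16 either way.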