Date: Wed, 8 May 1996 14:39:40 +1000
From: Bruce Evans <bde@zeta.org.au>
To: koshy@india.hp.com, stesin@elvisti.kiev.ua
Cc: current@freebsd.org, hackers@freebsd.org
Subject: Re: lmbench IDE anomaly
Message-ID: <199605080439.OAA23473@godzilla.zeta.org.au>

>i.e# ./lmdd if=/dev/DEVICE bs=BLOCKSIZE count=16MEG/BLOCKSIZE of=internal

>Throughput for one lmdd reader process and two simultaneous lmdd readers are
>given below).

>                        Per process             Per process
>Device  blocksize       KB/s    blocksize       KB/s
>~~~~~~  ~~~~~~~~~       ~~~~    ~~~~~~~~~       ~~~~
>                -- SCSI DISK --
>                --single reader--
>rsd0a   bs=1024         653.74  bs=8192         1312.31
>                        682.66                  1268.36
>                        677.52                  1361.95
>                --two readers--
>rsd0a   bs=1024         424.27  bs=8192         805.69
>                        424.24                  807.64
>                        --                      812.82

>Looks like changing the block size for the read can double throughput.

This is normal for small block sizes on SCSI disks. On my P133 ncr'810
system with a slow Toshiba MK537FB drive (which BTW still breaks
everything unless SCSI_NCR_DFLT_TAGS is defined as 0 using option
FAILSAFE or directly (it breaks things slightly more in -current than
in 2.1R if this option isn't used)), the speeds for a single process
are:

        blocksize   KB/s
        ---------   ----
              512    180
             1024    334
             2048    678
             4096   1139
             8192   2034
            16384   2480
            32768   2544
            65536   2528

I.e., for block sizes smaller than 8K, the speed is approximately
proportional to the block size (because SCSI command overhead doesn't
depend much on the block size and is very large).

>Also, two readers yield better throughput than a single reader process.

Perhaps this is because there is some overlap for the command
overheads. The command for process 2 will usually arrive while the
i/o for process 1 is in progress, so the drive may be able to process
most of it before it can be executed.

>                -- IDE DISK --
>                --single reader--
>rwd0a   bs=1024         839.05  bs=8192         2392.08 (!!)
>                        841.53                  2402.42 (!!)
>                        841.85                  2402.45 (!!)

I'm surprised that the larger block size is so much faster. The
command overhead is much lower for IDE.

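As a rough check on the proportionality claim, the Toshiba MK537FB table earlier in this message can be converted into time per request. This is only a minimal sketch: it assumes that KB means 1024 bytes and that each lmdd read maps to one request of the given block size, and the names `TOSHIBA`, `per_request_ms` and `request_times` are illustrative, not part of lmbench.

```python
# Block size in bytes -> measured KB/s, copied from the Toshiba MK537FB
# table in this message.
TOSHIBA = {
    512: 180,
    1024: 334,
    2048: 678,
    4096: 1139,
    8192: 2034,
    16384: 2480,
    32768: 2544,
    65536: 2528,
}


def per_request_ms(block_bytes, kb_per_s):
    """Time for one request in milliseconds.

    Assumption: 1 KB = 1024 bytes and one request per block.
    """
    return (block_bytes / 1024) / kb_per_s * 1000.0


def request_times(table):
    """Map each block size to its per-request time, in ascending order."""
    return {bs: per_request_ms(bs, rate) for bs, rate in sorted(table.items())}


if __name__ == "__main__":
    for bs, ms in request_times(TOSHIBA).items():
        print(f"{bs:6d} bytes  {ms:6.2f} ms per request")
```

Under these assumptions the time per request stays near 3 ms up to 2048 bytes and is still under 4 ms at 8192 bytes, then roughly doubles with each doubling of the block size from 16384 upward. That pattern is what one would expect from a large fixed per-command cost followed by a transfer-rate ceiling of about 2.5 MB/s.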
>                --two readers--
>rwd0a   bs=1024         199.38  bs=8192         251.83
>                        218.38                  237.95
>                        220.68                  238.50

>The read rates for the single reader case are fantastic, however
>disaster seems to strike when two readers access the same device

There is no possibility for overlapping of command overheads because
commands are serialized in the driver. I think the slowdown is to be
expected. The large command overhead for slow drives like your SCSI
drive and my Toshiba probably results in each process taking turns
reading the same block out of the drive's cache. OTOH, for faster
drives like my Quantum XPG and any IDE drive, one of the processes
apparently gets far enough ahead of the other to defeat the drive's
caching. This causes a 26x per-process slowdown for the XPG.

>So I looked at the block device.

>                --single reader--
>wd0a    bs=1024         199.80  bs=8192         796.07
>                        200.04                  795.06
>                --two readers--
>wd0a    bs=1024         200.04  bs=8192         795.60
>                        200.33                  795.20

>Hmm, block size makes a huge difference still. Is this to
>be expected? Also the two reader case and the single reader case

It's a bit unexpected. A (too-small) block size of 2048 is always
used for physical reads for the block device. Thus the speed is
limited to that of the raw device with a block size of 2048. The
(lack of) speed of my Toshiba /dev/sd0 is almost independent of the
block size.

Bruce