From owner-freebsd-hackers Tue May 7 21:43:01 1996
Return-Path: owner-hackers
Received: (from root@localhost)
        by freefall.freebsd.org (8.7.3/8.7.3) id VAA22147
        for hackers-outgoing; Tue, 7 May 1996 21:43:01 -0700 (PDT)
Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.2.228.19])
        by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id VAA22138
        Tue, 7 May 1996 21:42:54 -0700 (PDT)
Received: (from bde@localhost)
        by godzilla.zeta.org.au (8.6.12/8.6.9) id OAA23473;
        Wed, 8 May 1996 14:39:40 +1000
Date: Wed, 8 May 1996 14:39:40 +1000
From: Bruce Evans
Message-Id: <199605080439.OAA23473@godzilla.zeta.org.au>
To: koshy@india.hp.com, stesin@elvisti.kiev.ua
Subject: Re: lmbench IDE anomaly
Cc: current@freebsd.org, hackers@freebsd.org
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>i.e# ./lmdd if=/dev/DEVICE bs=BLOCKSIZE count=16MEG/BLOCKSIZE of=internal
>Throughput for one lmdd reader process and two simultaneous lmdd readers are
>given below).

>                     Per process           Per process
>Device   blocksize   KB/s      blocksize   KB/s
>~~~~~~   ~~~~~~~~~   ~~~~      ~~~~~~~~~   ~~~~
>               -- SCSI DISK --
>              --single reader--
>rsd0a    bs=1024     653.74    bs=8192     1312.31
>                     682.66                1268.36
>                     677.52                1361.95
>               --two readers--
>rsd0a    bs=1024     424.27    bs=8192      805.69
>                     424.24                 807.64
>                     --                     812.82

>Looks like changing the block size for the read can double throughput.

This is normal for small block sizes on SCSI disks.  On my P133 ncr'810
system with a slow Toshiba MK537FB drive (which BTW still breaks
everything unless SCSI_NCR_DFLT_TAGS is defined as 0 using option
FAILSAFE or directly (it breaks things slightly more in -current than in
2.1R if this option isn't used)), the speeds for a single process are:

        blocksize    KB/s
        ---------    ----
              512     180
             1024     334
             2048     678
             4096    1139
             8192    2034
            16384    2480
            32768    2544
            65536    2528

I.e., for block sizes smaller than 8K, the speed is approximately
proportional to the block size (because SCSI command overhead doesn't
depend much on the block size and is very large).
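
The proportionality can be checked by turning each row of the Toshiba
table into an average time per read request.  The following is only a
sketch of that arithmetic: it assumes that 1 KB means 1024 bytes in the
KB/s column, and the names SPEEDS, ms_per_request and request_times are
made up for the illustration.

```python
# Single-process read speeds from the Toshiba table above:
# (block size in bytes, KB/s).
SPEEDS = [(512, 180), (1024, 334), (2048, 678), (4096, 1139),
          (8192, 2034), (16384, 2480), (32768, 2544), (65536, 2528)]


def ms_per_request(block_size, kb_per_sec):
    """Average time per read request in milliseconds.

    Assumption: 1 KB = 1024 bytes in the KB/s figures.
    """
    requests_per_sec = kb_per_sec * 1024 / block_size
    return 1000.0 / requests_per_sec


def request_times(speeds=SPEEDS):
    """Return a list of (block size, ms per request) pairs."""
    return [(bs, ms_per_request(bs, kbs)) for bs, kbs in speeds]


if __name__ == "__main__":
    for bs, ms in request_times():
        print(f"{bs:6d}  {ms:6.2f} ms/request")
```

Under that assumption the time per request stays between about 2.8 and
3.0 ms for block sizes of 512 to 2048 and has only grown to about 3.9 ms
at 8192, which is what a large, nearly fixed per-command cost would look
like.  At 65536 it is about 25 ms, so the transfer itself dominates.
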

>Also, two readers yield better throughput than a single reader process.

Perhaps this is because there is some overlap for the command overheads.
The command for process 2 will usually arrive while the i/o for process 1
is in progress, so the drive may be able to process most of it before it
can be executed.

>               -- IDE DISK --
>              --single reader--
>rwd0a    bs=1024     839.05    bs=8192     2392.08 (!!)
>                     841.53                2402.42 (!!)
>                     841.85                2402.45 (!!)

I'm surprised that the larger block size is so much faster.  The command
overhead is much lower for IDE.

>               --two readers--
>rwd0a    bs=1024     199.38    bs=8192      251.83
>                     218.38                 237.95
>                     220.68                 238.50

>The read rates for the single reader case are fantastic, however
>disaster seems to strike when two readers access the same device

There is no possibility for overlapping of command overheads because
commands are serialized in the driver.

I think the slowdown is to be expected.  The large command overhead for
slow drives like your SCSI drive and my Toshiba probably results in each
process taking turns reading the same block out of the drive's cache.
OTOH, for faster drives like my Quantum XPG and any IDE drive, one of the
processes apparently gets far enough ahead of the other to defeat the
drive's caching.  This causes a 26x per-process slowdown for the XPG.

>So I looked at the block device.

>              --single reader--
>wd0a     bs=1024     199.80    bs=8192      796.07
>                     200.04                 795.06
>               --two readers--
>wd0a     bs=1024     200.04    bs=8192      795.60
>                     200.33                 795.20

>Hmm, block size makes a huge difference still.  Is this to
>be expected?  Also the two reader case and the single reader case

It's a bit unexpected.  A (too-small) block size of 2048 is always used
for physical reads for the block device.  Thus the speed is limited to
that of the raw device with a block size of 2048.  The (lack of) speed of
my Toshiba /dev/sd0 is almost independent of the block size.

Bruce
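
For reference, the one-reader versus two-reader comparison discussed
above can be reduced to two numbers per case: the aggregate rate of both
readers and the per-process slowdown.  This is only a sketch.  It assumes
that the quoted figures are per-process KB/s for repeated runs, that
averaging them is reasonable, and that the unpaired 812.82 belongs to the
bs=8192 column; RUNS and compare are names made up for the illustration.

```python
from statistics import mean

# Per-process KB/s copied from the quoted raw-device tables:
# (single-reader runs, two-reader runs).
RUNS = {
    ("rsd0a", 1024): ([653.74, 682.66, 677.52], [424.27, 424.24]),
    ("rsd0a", 8192): ([1312.31, 1268.36, 1361.95], [805.69, 807.64, 812.82]),
    ("rwd0a", 1024): ([839.05, 841.53, 841.85], [199.38, 218.38, 220.68]),
    ("rwd0a", 8192): ([2392.08, 2402.42, 2402.45], [251.83, 237.95, 238.50]),
}


def compare(single, dual, readers=2):
    """Summarize one device/block-size case.

    Assumption: every listed value is a per-process rate, so the
    aggregate for `readers` processes is readers * mean(dual).
    """
    one = mean(single)
    per_process = mean(dual)
    return {
        "single": one,
        "aggregate": readers * per_process,
        "slowdown": one / per_process,
    }


if __name__ == "__main__":
    for (dev, bs), (single, dual) in RUNS.items():
        r = compare(single, dual)
        print(f"{dev} bs={bs}: single {r['single']:.0f} KB/s, "
              f"two readers {r['aggregate']:.0f} KB/s total, "
              f"per-process slowdown {r['slowdown']:.1f}x")
```

With these assumptions the SCSI disk delivers more in total with two
readers than with one at both block sizes, while the IDE disk delivers
less in total, with a per-process slowdown of roughly 4x at bs=1024 and
roughly 10x at bs=8192.
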