From owner-cvs-all Mon Sep 20 9:34:23 1999
Delivered-To: cvs-all@freebsd.org
Received: from proxy2.ba.best.com (proxy2.ba.best.com [206.184.139.14])
    by hub.freebsd.org (Postfix) with ESMTP
    id 9C84814C07; Mon, 20 Sep 1999 09:34:18 -0700 (PDT)
    (envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com ([209.157.86.2])
    by proxy2.ba.best.com (8.9.3/8.9.2/best.out) with ESMTP
    id JAA03934; Mon, 20 Sep 1999 09:26:16 -0700 (PDT)
Received: (from dillon@localhost)
    by apollo.backplane.com (8.9.3/8.9.1)
    id JAA82119; Mon, 20 Sep 1999 09:26:15 -0700 (PDT)
    (envelope-from dillon)
Date: Mon, 20 Sep 1999 09:26:15 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199909201626.JAA82119@apollo.backplane.com>
To: Peter Wemm
Cc: Jesper Skriver, Poul-Henning Kamp, cvs-committers@FreeBSD.org,
    cvs-all@FreeBSD.org
Subject: Re: User block device access (was: cvs commit:
    src/sys/miscfs/specfs spec_vnops.c src/sys/sys vnode.h
    src/sys/kern vfs_subr.c)
References: <19990920093358.75A841CC5@overcee.netplex.com.au>
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk

:          raw    rawcpu   block   blkcpu   file   filecpu
:bsize     MB/s   sec      MB/s    sec      MB/s   sec
:----------------------------------------------------------
:1024k     21.4    0.16    12.5     9.09    13.7    3.11
:512k      21.3    0.15    12.2     9.23    13.6    3.12
:256k      21.6    0.17    12.4     9.08    13.6    2.89
:128k      21.6    0.18    12.4     8.80    13.6    2.54
:64k       21.4    0.22    12.3     8.82    13.6    2.58
:32k       21.3    0.36    12.4     8.82    15.5    2.62
:16k       21.6    0.68    12.4     8.99    15.5    2.65
:8k        20.0    1.28    12.2     9.18    16.8    2.69
:4k        19.1    2.52    12.4     9.60    17.2    2.79
:2k        12.6    4.89    12.3    10.56    17.2    3.37
:1k         7.4    9.96    12.3    11.87    17.2    4.82
:0.5k       4.2   19.67    11.9    14.58    17.3    7.41
:
:Notice three things-
:- raw (character) IO nosedives in throughput with smaller block sizes and its
:  cpu cost goes through the roof
:- block IO throughput remains fairly constant with block size but cpu usage
:  is fairly high (but is cheaper than raw IO at small bsize with dd).
:- file IO (a fs with the same size file in the partition) is significantly
:  faster than block IO and *increases* throughput as the block size goes down.
:  file IO is never slower than block IO to the same disk zone and is cheaper
:  in cpu cost.
:
:The obvious question is.. why isn't block IO implemented the same way as

This is very odd.  Well, so many people are getting these numbers that
there must be something going on.  I've already committed everything in
my tree except for the vnode->v_lastr changes; perhaps those are
responsible for my better numbers.

Peter, what is the cpu utilization during your tests?  Is the cpu being
maxed out by block I/O or is there still idle time?

Also, what is the low level transfer size (T/S from iostat) during the
block I/O op?  Is it 2K?  If it's 2K then that's probably the problem.
Try running the test on a real partition such as da0a that has a normal
filesystem blocksize programmed into it (usually 8K) and see if that
narrows the numbers.

It may make sense to increase the default block size from 2K to 8K for
performance reasons.

:read() ends up going to the device.  ie: zap the caching aspects of bio and
:make specfs use the same access methods for "block" devices as read() uses
:to get to the devices in a filesystem.
:
:Wouldn't this achieve the goals of all parties?  bio would be dramatically
:simplified as it has no caching or coherency issues to deal with, and there
:would still be mmap/unaligned read/buffering/coherency/etc provided by the
:VM system, and it would make bdevs faster in the process.

If I understand correctly, the answer is no.  One of the big points of
the block device is to cache the data.

:confusion that arises ("I fsck rdaXXX but mount daXXX right?"), I think it
:would be better to rename slightly.  ie: rdaXXX becomes daXXX (char devices
:are already mountable if I recall correctly), and the old daXXX devices
:become bdaXXX or something else.
:Then you end up with all user exposure to
:the raw devices for everything from fsck, mount, etc, gives us a chance to
:renumber so bmaj == cmaj, and still allows block access "out of the way"
:for things like mmaping an INN cyclic news spool and still get the required
:caching.

I think the simple solution here is to make the utility programs that
access these devices complain when you give them the wrong type.  fsck
already does this.

:-Peter
:--
:Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au

-Matt
Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message