From owner-cvs-all Mon Sep 20 2:34: 9 1999
Delivered-To: cvs-all@freebsd.org
Received: from overcee.netplex.com.au (overcee.netplex.com.au [202.12.86.7])
	by hub.freebsd.org (Postfix) with ESMTP id 4EB9914DBF;
	Mon, 20 Sep 1999 02:33:59 -0700 (PDT)
	(envelope-from peter@netplex.com.au)
Received: from netplex.com.au (localhost [127.0.0.1])
	by overcee.netplex.com.au (Postfix) with ESMTP id 75A841CC5;
	Mon, 20 Sep 1999 17:33:58 +0800 (WST)
	(envelope-from peter@netplex.com.au)
X-Mailer: exmh version 2.0.2 2/24/98
To: Jesper Skriver
Cc: Poul-Henning Kamp, Matthew Dillon, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: User block device access (was: cvs commit: src/sys/miscfs/specfs spec_vnops.c src/sys/sys vnode.h src/sys/kern vfs_subr.c)
In-reply-to: Your message of "Mon, 20 Sep 1999 10:46:01 +0200." <19990920104601.B75298@skriver.dk>
Date: Mon, 20 Sep 1999 17:33:58 +0800
From: Peter Wemm
Message-Id: <19990920093358.75A841CC5@overcee.netplex.com.au>
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk

Jesper Skriver wrote:
> On Sun, Sep 19, 1999 at 10:05:04PM +0200, Poul-Henning Kamp wrote:
> > In message , Matthew Jacob writes:
> > >
> > >Okay, then. Really, seriously, though- if we're all stuck arguing a
> > >major issue from different viewpoints for lack of < 1K$ equipment, this is
> > >an easy problem to solve from the K$ point of view (hadn't thought about
> > >customs- I guess I just can't express mail these puppies, can I? :-))
> >
> > You know, only Matt Dillon thought this was a hardware issue, I don't
> > think it is.
>
> I see the same as Poul-Henning
>
> # time dd if=/dev/rda0 of=/dev/null bs=8k count=10000
> 10000+0 records in
> 10000+0 records out
> 81920000 bytes transferred in 6.293370 secs (13016873 bytes/sec)
> 6.30s real 0.04s user 0.46s system
> # time dd if=/dev/da0 of=/dev/null bs=8k count=10000
> 10000+0 records in
> 10000+0 records out
> 81920000 bytes transferred in 12.496958 secs (6555195 bytes/sec)
> 12.51s real 0.02s user 3.26s system

        raw   rawcpu  block  blkcpu  file   filecpu
bsize   MB/s  sec     MB/s   sec     MB/s   sec
----------------------------------------------------
1024k   21.4    0.16  12.5     9.09  13.7     3.11
 512k   21.3    0.15  12.2     9.23  13.6     3.12
 256k   21.6    0.17  12.4     9.08  13.6     2.89
 128k   21.6    0.18  12.4     8.80  13.6     2.54
  64k   21.4    0.22  12.3     8.82  13.6     2.58
  32k   21.3    0.36  12.4     8.82  15.5     2.62
  16k   21.6    0.68  12.4     8.99  15.5     2.65
   8k   20.0    1.28  12.2     9.18  16.8     2.69
   4k   19.1    2.52  12.4     9.60  17.2     2.79
   2k   12.6    4.89  12.3    10.56  17.2     3.37
   1k    7.4    9.96  12.3    11.87  17.2     4.82
 0.5k    4.2   19.67  11.9    14.58  17.3     7.41

Notice three things-
- raw (character) IO nosedives in throughput with smaller block sizes
  and its cpu cost goes through the roof
- block IO throughput remains fairly constant with block size but cpu
  usage is fairly high (but is cheaper than raw IO at the smallest bsize
  with dd).
- file IO (a fs with the same size file in the partition) is
  significantly faster than block IO and *increases* throughput as the
  block size goes down.  file IO is never slower than block IO to the
  same disk zone and is cheaper in cpu cost.

The obvious question is: why isn't block IO implemented the same way
that read() ends up going to the device?  ie: zap the caching aspects of
bio and make specfs use the same access methods for "block" devices as
read() uses to get to the devices in a filesystem.  Wouldn't this
achieve the goals of all parties?
bio would be dramatically simplified, as it would have no caching or
coherency issues to deal with, and there would still be mmap/unaligned
read/buffering/coherency/etc provided by the VM system, and it would
make bdevs faster in the process.

(This was tested without Matt's patches to use vmio)

Regarding the different names of devices (rda0s1e vs da0s1e) and the
confusion that arises ("I fsck rdaXXX but mount daXXX right?"), I think
it would be better to rename slightly.  ie: rdaXXX becomes daXXX (char
devices are already mountable if I recall correctly), and the old daXXX
devices become bdaXXX or something else.  Then all user exposure is to
the raw devices for everything from fsck to mount, etc.  It gives us a
chance to renumber so bmaj == cmaj, and it still allows block access
"out of the way" for things like mmapping an INN cyclic news spool while
still getting the required caching.

FWIW: Pentium III (450-MHz), mem = 256M.  System in use running X and
not getting any advantage from in-core caching:
107M Active, 72M Inact, 57M Wired, 11M Cache, 17M Buf, 1292K Free

ahc0: irq 15 at device 17.0 on pci0
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
da0: Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 8761MB (17942584 512 byte sectors: 255H 63S/T 1116C)

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message
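
For anyone wanting to repeat the block-size sweep in the table above on
their own hardware, here is a rough sketch of one way to collect similar
numbers.  It is not the method used for the table (which, per the text,
was gathered with dd); it is a Python stand-in written under several
assumptions: the function names timed_read and sweep are made up for
illustration, "MB/s" is taken to mean 10^6 bytes per second, and the cpu
column is taken to be user + system time of the reading process.

```python
import os
import resource
import time


def timed_read(path, bsize, total_bytes):
    """Read up to total_bytes from path in bsize-sized read() calls.

    Returns (bytes_read, mb_per_sec, cpu_sec).  Hypothetical helper:
    MB/s is assumed to mean 10**6 bytes/sec, and cpu_sec is assumed to be
    user + system time of this process.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        ru0 = resource.getrusage(resource.RUSAGE_SELF)
        t0 = time.monotonic()
        done = 0
        while done < total_bytes:
            # Raw devices generally need sector-multiple requests, so pick
            # total_bytes as a multiple of bsize when reading one.
            chunk = os.read(fd, min(bsize, total_bytes - done))
            if not chunk:  # end of file or device
                break
            done += len(chunk)
        elapsed = time.monotonic() - t0
        ru1 = resource.getrusage(resource.RUSAGE_SELF)
    finally:
        os.close(fd)
    cpu = (ru1.ru_utime - ru0.ru_utime) + (ru1.ru_stime - ru0.ru_stime)
    mb_per_sec = done / elapsed / 1e6 if elapsed > 0 else float("inf")
    return done, mb_per_sec, cpu


def sweep(path, bsizes, total_bytes):
    """Run timed_read once per block size; return a list of result rows."""
    rows = []
    for bs in bsizes:
        done, mbs, cpu = timed_read(path, bs, total_bytes)
        rows.append((bs, done, mbs, cpu))
    return rows
```

Calling sweep("/dev/rda0", [512, 1024, 8192], 80 * 1024 * 1024) as root
would give one row per block size for the raw device; pointing it at the
block device or at a file gives the other columns.  Note that repeated
reads of the same file or block device may be served from memory, so the
amount read needs to be large relative to RAM (or the cache otherwise
defeated) for the numbers to reflect the disk.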