Date:      Wed, 24 Jun 1998 08:44:37 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        hackers@FreeBSD.ORG
Subject:   Re: Heads up:  block devices to disappear!
Message-ID:  <199806240844.BAA22345@usr08.primenet.com>
In-Reply-To: <199806240432.VAA23173@kithrup.com> from "Sean Eric Fagan" at Jun 23, 98 09:32:48 pm

> Actually, I should have followed up myself - a big question is:  will all of
> the benefits of block devices still be present?  If so, fine.
> 
> If not, this should be more carefully thought out.  I admit that the benefits
> are fairly small -- but there are some advantages of block devices over
> character devices.

I agree.

Like Sean, I am unconvinced of the utility of removing the block devices,
even though I am aware of the code simplifications that would result.


I would be *much* more prone to saying "remove the *character* devices",
since the character devices' uncached behaviour can be achieved with two
compares and a bypass in kernel space ("is the read/write block aligned?
Is the size to be read/written a multiple of the block size?").
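
As a rough illustration of the "two compares", here is a minimal sketch
in C.  The function name and parameters are hypothetical and are not
taken from any FreeBSD source; the sketch only shows that the bypass
decision reduces to an alignment test and a size test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not actual kernel code: returns true when a
 * transfer could skip the buffer cache, i.e. when both the starting
 * offset and the length are multiples of the block size.
 * block_size must be nonzero.
 */
static bool
can_bypass_cache(uint64_t offset, size_t length, size_t block_size)
{
	if (offset % block_size != 0)	/* compare 1: start is block aligned */
		return false;
	if (length % block_size != 0)	/* compare 2: whole number of blocks */
		return false;
	return true;
}
```

Any request that fails either test would take the cached path instead.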


Specifically, here is Julian's whiteboard drawing:

     b              c
     |              |
     v              |
------------  ---   |
block device  VN    |
------------  ---   |
-----------------   |
buffer cache        |
-----------------   v
-----------------------
character device
-----------------------
-----------------------
Device driver
-----------------------
-----------------------
Hardware
-----------------------


My main objection is non-seekable devices.  Specifically, there is
a reliance on a linear buffer cache for things like DAT and floppy tape
drives.  Well, there is, unless someone has ripped it out of the driver.

The need for the user space kludge in the floppy tape case comes from the
fact that there is no two-buffer block buffer with which the tape drive
could be kept actively re-writing a block until it was fed more data.


I also think there are issues to be considered for a vnconfig'ed device
that is supposedly emulating an atomic block size larger than the physical
block size (the vn device doesn't do this correctly right now for CDROM
images, which are my particular worry).

There are potentially some historical tape compatibility issues;
specifically, it is wrong to assume that all tapes end in a full block.
Some tapes (QIC-11 and QIC-24 tapes written on SCO and similar machines
using Archive/Computone based controller drivers, and a number of NCR
tape controller drivers acting directly) will not actually have a full
block written at the end of the tape.  There is a need to be able to
read partial blocks, and, in some cases, to write partial blocks for
nominally blocked media.

I had no end of trouble, a while back, attempting to read tapes written
on a Sun4 machine on an NCR machine (ie: this is not a historical strawman).
The NCR machine was incapable of dealing with partial blocks.  I would hate
to see FreeBSD gain the same problem: not all tapes can be rewritten using
"dd conv=osync" to form full blocks.  If the tapes already exist, you could
be screwed.

There are a number of performance issues with the MSDOSFS, especially in
the block boundary spanning case:

                 ---------
                    1K
                 ---------
---------------------  ---------------------
         4K                     4K
---------------------  ---------------------

These are dealt with in the MACH MSDOSFS implementation by use of the
buffer cache and partially populated pages, with knowledge of the
underlying physical block size.  This implies the use of cached data
for speed in the read-before-write case (see the CMU MSDOSFS paper
for details, or merely benchmark FreeBSD on MSDOSFS vs Linux on
MSDOSFS; be sure to adjust the sync/async parameters accordingly, so
you can not claim an unfair test).
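
The spanning case in the diagram can be worked through with a small
sketch.  The struct and function below are hypothetical names used only
for illustration; they compute which blocks a write touches and whether
its head or tail lands mid-block, which is where a read-before-write
(or a cache hit) is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the blocks covered by one write. */
struct span {
	uint64_t first_block;	/* index of first block touched */
	uint64_t last_block;	/* index of last block touched */
	bool head_partial;	/* write does not start on a block boundary */
	bool tail_partial;	/* write does not end on a block boundary */
};

/* length and block_size must both be nonzero. */
static struct span
write_span(uint64_t offset, size_t length, size_t block_size)
{
	struct span s;
	uint64_t end = offset + length;	/* one past the last byte written */

	s.first_block = offset / block_size;
	s.last_block = (end - 1) / block_size;
	s.head_partial = (offset % block_size) != 0;
	s.tail_partial = (end % block_size) != 0;
	return s;
}
```

For a 1K write starting 3.5K into the first 4K block, this reports
blocks 0 and 1 with both ends partial, so both 4K blocks must be
populated before the write unless the data is already cached.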


I also think that requiring devices to be written in block size increments
by user space programs is a mistake -- specifically, database programs
that tend to take over raw partitions may depend on non-block aligned I/O
and fragment gathering... *particularly* for the sections of the disk
being utilized for transaction logging.


If the features of the block devices are not lost, *specifically*:

o	the ability for it to automatically gather short writes
o	the ability to utilize the buffer cache to reduce the size
	of read-before-write read increments
o	the ability to double-buffer non-seekable devices, transparently,
	in the kernel, without the use of a user space "helper" program
	like "ft"
o	any other feature of block devices that anyone else can think
	up that I couldn't

then I would not object; but again, like Sean, I don't feel that this
has been demonstrated.
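
To make the first item above concrete, here is a minimal sketch of
gathering short writes into full blocks.  Everything in it is an
assumption made for illustration (the 512 byte block size, the struct,
the callback); it is not how the FreeBSD buffer cache is implemented.

```c
#include <stddef.h>
#include <string.h>

#define BLKSZ 512	/* assumed block size, for illustration only */

/* Hypothetical gather buffer: collects short writes into one block. */
struct gather {
	unsigned char buf[BLKSZ];
	size_t used;		/* bytes currently held in buf */
	void (*flush)(const unsigned char *block, void *arg);
	void *arg;		/* passed through to flush */
};

/* Example flush callback: counts full blocks handed to the "device". */
static void
count_flush(const unsigned char *block, void *arg)
{
	(void)block;
	++*(int *)arg;
}

/* Accept a write of any size; call flush only with full blocks. */
static void
gather_write(struct gather *g, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len > 0) {
		size_t room = BLKSZ - g->used;
		size_t n = len < room ? len : room;

		memcpy(g->buf + g->used, p, n);
		g->used += n;
		p += n;
		len -= n;
		if (g->used == BLKSZ) {	/* block is full: write it out */
			g->flush(g->buf, g->arg);
			g->used = 0;
		}
	}
}
```

The caller never has to know the block size; only whole blocks reach
the flush routine, and any remainder stays buffered for the next write.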

Simpler is not necessarily better, unless you want to be a teaching tool
and potentially little else.  Soft Updates is an example of a situation
where additional complexity provides an undeniable "win" that could be
had in no other fashion.  Let's make sure that the block device interface
is not in the same category before summarily executing it.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
