From owner-freebsd-scsi  Thu May  1 19:31:25 1997
Date: Thu, 1 May 1997 19:31:20 -0700 (PDT)
From: dan@math.berkeley.edu (Dan Strick)
Message-Id: <199705020231.TAA24685@math.berkeley.edu>
To: freeBSD-arch@FreeBSD.ORG, freeBSD-scsi@FreeBSD.ORG
Subject: Re: cvs commit: src/sys/scsi sd.c
Cc: dan@math.berkeley.edu

> 1) What should sd and other devices drivers "see" by the time they receive
> the I/O?  Should b_blkno always be in terms of the logical block size of
> the device or should the device driver be responsible for scaling?
>	...

There are really two different problems here.  One is how we communicate
the device address to the device driver and the other is what we do if
the requested I/O operation does not begin and end at block boundaries.

The main limitations of the current convention, passing a block number
in b_blkno, are that it forces applications using the device to know
what the block size is and that it does not allow specification of an
arbitrary byte address.  The most straightforward fix is to change the
block number to a byte number, but this would force significant changes
in other parts of the OS, and there may be some value in having the OS
know something about device block sizes.
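
To make the boundary problem concrete, here is a rough sketch of how
a byte range maps onto whole device blocks.  The names (blk_span,
span_for_range) are made up for illustration and are not existing
kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the device blocks a byte range touches. */
struct blk_span {
    uint64_t first_blk;  /* first device block touched */
    uint64_t nblks;      /* number of device blocks touched */
    uint32_t head_skip;  /* bytes in the first block before the data */
    uint32_t tail_pad;   /* bytes in the last block after the data */
};

/* Map the byte range [off, off + len) onto blocks of blksize bytes. */
static struct blk_span
span_for_range(uint64_t off, uint64_t len, uint32_t blksize)
{
    struct blk_span s;
    uint64_t end;

    assert(blksize != 0 && len != 0);
    end = off + len;
    s.first_blk = off / blksize;
    s.head_skip = (uint32_t)(off % blksize);
    /* Round the end up to a block boundary, then count blocks. */
    s.nblks = (end + blksize - 1) / blksize - s.first_blk;
    s.tail_pad = (uint32_t)((blksize - end % blksize) % blksize);
    return s;
}
```

A request is block aligned exactly when head_skip and tail_pad are
both zero; anything else needs a partial-block read or a
read-modify-write somewhere.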

Therefore I recommend adding a b_bytoff field to the buffer structure
and a device block size field to some appropriate per-device structure.
The block size would default to 512 bytes so that existing code
wouldn't need to be changed.  Byte-addressed devices could use a
block size of 1 and ignore the byte offset.  Since the OS would know
the block size, it could be extended to provide more general data
blocking services (e.g. a generalized physio routine that knows how
to read or modify just a part of a physical device block).  Programs
could exchange block size information with each other and the kernel
in a general way and the device address could be made to mean something
for tapes operating in a variable record size mode.
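
A minimal sketch of what this might look like follows.  Only b_blkno
and the proposed b_bytoff come from the discussion above; the struct
names, the d_blksize field, the "0 means default" convention and the
helper functions are assumptions made for illustration, not the real
struct buf or any existing driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for the kernel buffer structure. */
struct xbuf {
    int64_t  b_blkno;   /* device address, in device blocks */
    uint32_t b_bytoff;  /* proposed: byte offset within that block */
    uint32_t b_bcount;  /* transfer length in bytes */
};

/* Hypothetical per-device structure carrying the block size. */
struct xdev {
    uint32_t d_blksize; /* device block size; 0 is taken to mean 512 */
};

/* Existing code that never sets a block size gets the 512-byte default. */
static uint32_t
xdev_blksize(const struct xdev *d)
{
    return d->d_blksize != 0 ? d->d_blksize : 512;
}

/* Absolute byte address described by (b_blkno, b_bytoff). */
static uint64_t
xbuf_byte_addr(const struct xbuf *bp, const struct xdev *d)
{
    uint32_t bs = xdev_blksize(d);

    assert(bp->b_blkno >= 0 && bp->b_bytoff < bs);
    return (uint64_t)bp->b_blkno * bs + bp->b_bytoff;
}

/* True if the transfer begins and ends on device block boundaries. */
static int
xbuf_is_aligned(const struct xbuf *bp, const struct xdev *d)
{
    uint32_t bs = xdev_blksize(d);

    return bp->b_bytoff == 0 && bp->b_bcount % bs == 0;
}
```

With a block size of 1 every transfer is trivially aligned, which is
why a byte-addressed device can ignore b_bytoff entirely.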

In practice, nearly all disk applications will want block sizes
that are a power of two, so special code might be written to
optimize for that specific situation, but the general case should be
implemented anyway.  It only takes a little extra code and that
code won't be used unless it is needed.  (Hint: a nonzero buffer size
is a power of two iff ((size-1) & size) == 0.)
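
As a sketch of the fast path next to the general case, the code below
splits a byte address into a block number and a byte offset.  The
power-of-two test is written with a bitwise AND, and the function and
struct names are hypothetical.  A real driver would presumably compute
the shift once at attach time rather than on every request.

```c
#include <assert.h>
#include <stdint.h>

/* A nonzero size is a power of two iff it has exactly one bit set. */
static int
is_pow2(uint32_t size)
{
    return size != 0 && ((size - 1) & size) == 0;
}

/* Hypothetical (block number, byte offset) pair. */
struct blk_addr {
    uint64_t blkno;
    uint32_t bytoff;
};

/* Split a byte address into a block number and an offset within it. */
static struct blk_addr
split_byte_addr(uint64_t addr, uint32_t blksize)
{
    struct blk_addr a;

    assert(blksize != 0);
    if (is_pow2(blksize)) {
        /* Fast path: shift and mask instead of divide and modulo. */
        uint32_t shift = 0;

        while ((1u << shift) != blksize)
            shift++;
        a.blkno = addr >> shift;
        a.bytoff = (uint32_t)(addr & (blksize - 1));
    } else {
        /* General case: works for any nonzero block size. */
        a.blkno = addr / blksize;
        a.bytoff = (uint32_t)(addr % blksize);
    }
    return a;
}
```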

Several years ago I did something like this for a SunOS SCSI disk
driver and was quite pleased with the results.  The OS would seldom
do I/O not aligned on a file system block boundary (8 KB) and
consequently there was essentially no performance problem
(and CD-ROMs with a 2K block size worked!).

Dan Strick
dan@math.berkeley.edu