From owner-freebsd-hackers  Sat Dec 13 03:06:22 1997
Return-Path: <owner-freebsd-hackers>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id DAA13978
          for hackers-outgoing; Sat, 13 Dec 1997 03:06:22 -0800 (PST)
          (envelope-from owner-freebsd-hackers)
Received: from home.gtcs.com (home.gtcs.com [206.54.69.238])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id DAA13964
          for <hackers@FreeBSD.ORG>; Sat, 13 Dec 1997 03:06:11 -0800 (PST)
          (envelope-from bruce@gtcs.com)
Received: from gtcs.com (localhost.gtcs.com [127.0.0.1])
	by home.gtcs.com (8.8.5/8.8.5) with ESMTP id EAA20340;
	Sat, 13 Dec 1997 04:02:11 -0700 (MST)
Disposition-Notification-To: bgingery@gtcs.com
X-Comment1: in most cases both the Return-Receipt and Delivery-Notification
X-Comment2: requests are part of an ongoing poll to determine what clients
X-Comment3: and MTAs respond to the headers.
Message-Id: <199712131102.EAA20340@home.gtcs.com>
Date: Sat, 13 Dec 1997 04:02:08 -0700 (MST)
From: bgingery@gtcs.com
Reply-To: bgingery@gtcs.com
Subject: Re: blocksize on devfs entries (and related) 
To: mike@smith.net.au
cc: hackers@FreeBSD.ORG
In-Reply-To: <199712130848.TAA01888@word.smith.net.au>
MIME-Version: 1.0
Content-Type: TEXT/plain; CHARSET=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On 13 Dec, Mike Smith wrote:
[munch]
-} >      2.  physical layout (sect/track, tracks/cyl) also needs to
-} >         be stored for any DASD.  Also any OTHER known info which
-} >         may be used to optimize the filesystem building process for
-} >         the device, such as rotational speed, seek timing ..  If
-} >         this is not stored with driver info in the devfs, then
-} >         some pointer or common reference point should be made to
-} >         the "file entry" that contains the info.
-} 
-} Physical layout is a joke, and has been for many years.  This 
-} suggestion costs you a lot of credibility.

    Physical layout *is* a joke for devices that do their own controller
   mapping.  But I have yet to see anything cast in silicon that never
   needs rework and leaves no room for innovation.  Whenever we throw
   away information, it's gone - PERIOD, unless it's stored *somewhere*.
   Now, the physical layout is (at least theoretically) pollable for
   some devices.  What I'm saying is don't throw away the POSSIBILITY
   of storing and using this information in an orderly way.

    Call it "cost of credibility" if you will, and point to me as an
   anachronism if you wish to flame.  The "Maintenance Free" automotive
   battery and the "never-LLF IDE drive" still have a lot in common -
   and both could have been produced by a certain company I'd choose
   not to mention by name, as its marketing hype consistently does NOT
   match reality.

    I've already suffered some drivers that PRESUME the rotational speed
   and then optimize a filesystem for that supposed rotational vs bus
   speed, with a resulting significant decrease in performance from a
   bad interleave.  I have no trouble at all extrapolating that to
   the loss of the traditional disktab ns, nt, nc, sc, su, sk, cs, hs,
   and il values for even POTENTIAL use.  To me that's no wiser than
   ignoring (or eliminating) the rm value.
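
   As a rough illustration only: a per-device record carrying those
   values might look like the sketch below.  The field names follow the
   disktab(5) capability names; the struct itself, the function names,
   and the "0 means not recorded" convention are assumptions made for
   this example, not anything that exists in devfs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry record; fields named after disktab(5) capabilities. */
struct disk_geom {
    uint32_t ns;  /* sectors per track */
    uint32_t nt;  /* tracks per cylinder */
    uint32_t nc;  /* total cylinders */
    uint32_t sc;  /* sectors per cylinder; 0 = derive as ns * nt */
    uint64_t su;  /* sectors per unit; 0 = derive as sc * nc */
    uint32_t sk;  /* sector skew per track */
    uint32_t cs;  /* sector skew per cylinder */
    uint32_t hs;  /* head-switch time, microseconds */
    uint32_t il;  /* sector interleave (n:1); 0 = not recorded (assumption) */
    uint32_t rm;  /* rotation speed, rpm; 0 = not recorded (assumption) */
};

/* Fill in the derived totals when they were not supplied explicitly. */
void geom_fill_defaults(struct disk_geom *g)
{
    assert(g != NULL);
    if (g->sc == 0)
        g->sc = g->ns * g->nt;
    if (g->su == 0)
        g->su = (uint64_t)g->sc * g->nc;
}

/*
 * Time for one full rotation in microseconds, or 0 when the speed was
 * never recorded, so a caller can tell "unknown" apart from a guess.
 */
uint32_t geom_rotation_usec(const struct disk_geom *g)
{
    assert(g != NULL);
    return g->rm ? 60000000u / g->rm : 0;
}
```

   The point of returning 0 for an unrecorded rm is that a filesystem
   builder can skip rotational tuning instead of presuming a speed.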

NO DISKTAB INTEGRATION WITH DEVFS?

   Perhaps they don't belong in devfs stored values.  Yet these values
   are DEVICE dependent, not filesystem dependent, hence I raised the
   question.  When re-inventing device handling, it's wise, perhaps,
   to not ignore portions of it, even if the answer is "leave that part
   alone".
   
[munch]
-} >      3.  If at the controller level it is possible to concatenate
-} >         or RAID join devices, that information needs to be stored
-} >         for the device.  If this is intrinsic to the device driver
-} >         or the physical device - no matter.
[followed up with]
-} An upper layer may well want to take advantage of, or precautions in
-} light of, the construction of the extent(s) with which it is presented.
   

-} >      6.  When a device is opened ro, if the underlying hardware has
-} >         ANY indication that it's a ro open, then if it is later upgraded
-} >         there should at least be a hook for it to be notified that it
-} >         has been upgraded.  Current state (ro/rw) should be available
-} >         to user processes without "testing it by opening a write file"
-} >         to a filesystem (or even raw device). 
-} 
-} The RO->RW upgrade notification is a contentious issue, but one that 
-} definitely needs thinking through.  How would you suggest it be 
-} handled?  Should the standard be to reopen the device, or pass a 
-} special ioctl, or add a new device entrypoint?

   To me, an IOCTL and flag would be tighter than different entry
   points for ro vs rw.  Not necessarily tighter code, but tighter
   management.
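
   A minimal userland sketch of the ioctl-plus-flag idea follows.  All
   of it is hypothetical: the command codes XDIOCGMODE and XDIOCUPGRADE,
   struct xdev, and the on_upgrade hook are names invented for this
   example and are not real FreeBSD interfaces.  It only models the
   control flow: one entry point, a mode flag that can be queried
   without attempting a write, and a hook called on the RO->RW change.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes and mode values; not real FreeBSD ioctls. */
enum { XDIOCGMODE = 1, XDIOCUPGRADE = 2 };
enum { XD_RO = 0, XD_RW = 1 };

struct xdev {
    int mode;                           /* current access mode flag */
    int hw_write_protect;               /* nonzero: media cannot be written */
    void (*on_upgrade)(struct xdev *);  /* driver hook, may be NULL */
    int upgrades_seen;                  /* counter used by the sample hook */
};

/* Sample hook: just counts how often the driver was told of an upgrade. */
void xdev_count_upgrade(struct xdev *dev)
{
    dev->upgrades_seen++;
}

/* Single entry point; returns 0 on success or an errno value. */
int xdev_ioctl(struct xdev *dev, int cmd, int *arg)
{
    assert(dev != NULL);
    switch (cmd) {
    case XDIOCGMODE:            /* report ro/rw without opening for write */
        if (arg == NULL)
            return EINVAL;
        *arg = dev->mode;
        return 0;
    case XDIOCUPGRADE:          /* request RO -> RW */
        if (dev->hw_write_protect)
            return EROFS;
        if (dev->mode == XD_RW)
            return 0;           /* already read-write, nothing to notify */
        dev->mode = XD_RW;
        if (dev->on_upgrade != NULL)
            dev->on_upgrade(dev);
        return 0;
    default:
        return ENOTTY;
    }
}
```

   The hook fires only on an actual RO->RW transition, so a repeated
   upgrade request does not notify the driver twice.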

[munch poorly stated description and accurate answer Re: vnode fs]

-} >   Yet, why deny these the optimization information which will allow
-} >   them to map (within the constraints of their architecture) a new
-} >   filesystem for best throughput, if it's actually available.
-} 
-} Because any "optimisation information" that you could pass them would 
-} be wrong.  Any optimisation attempting to operate based on incorrect 
-} parameters can only be a pessimisation. 

    Why must it be wrong?


-} >   Now let me raise some additional questions --
-} > 
-} > 
-} >        Should a DASD be mappable ONLY with horizontal slices?
-} >    With what we're all doing today, it seems that taking a certain
-} >    number of cylinders for slices is best - but other access methods
-} >    may find an underlying physical structure more convenient if
-} >    a slice specifies a range of heads and cylinders that do NOT
-} >    presume that all heads/cylinders from starting to ending according
-} >    to physical layout are part of the same slice.  It may be quite
-} >    convenient to have a cluster of heads across physical devices
-} >    forming a logical device or slice, without fully dedicating those
-} >    physical devices to that use.
-} 
-} This is a nonsense question in the context of ZBR and "logical extent" 
-} devices (eg. SCSI, ATAPI, most ATA devices).

      Okay, but it isn't nonsense for devices which expose that info
      and allow direct control.  Across the Free *n?x lines we're seeing
      more and more antique resurrections - even if that's the ONLY
      place this could be used accurately.
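
      For a drive that really does expose its geometry, a slice given
      as a range of heads AND cylinders could be sketched as below.
      The struct and function names are made up for this example, and
      it assumes a fixed number of sectors per track, so it does not
      apply to ZBR or logical-extent devices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slice: inclusive ranges of cylinders and heads. */
struct chs_slice {
    uint32_t cyl_first, cyl_last;
    uint32_t head_first, head_last;
};

/*
 * Classic CHS to linear sector number.  nt = tracks per cylinder,
 * ns = sectors per track; sector numbers are 1-based.
 */
uint64_t chs_to_lba(uint32_t c, uint32_t h, uint32_t s,
                    uint32_t nt, uint32_t ns)
{
    assert(s >= 1 && s <= ns && h < nt);
    return ((uint64_t)c * nt + h) * ns + (s - 1);
}

/* Nonzero when track (c, h) belongs to the slice. */
int slice_contains(const struct chs_slice *sl, uint32_t c, uint32_t h)
{
    return c >= sl->cyl_first && c <= sl->cyl_last &&
           h >= sl->head_first && h <= sl->head_last;
}

/* Total sectors in the slice, given ns sectors per track. */
uint64_t slice_sectors(const struct chs_slice *sl, uint32_t ns)
{
    return (uint64_t)(sl->cyl_last - sl->cyl_first + 1) *
           (sl->head_last - sl->head_first + 1) * ns;
}
```

      Note that such a slice is not one contiguous run of linear
      sector numbers: the heads left out of the range sit between its
      tracks in the usual CHS ordering.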

-} 
-} >        And, I'll mention again, DISK formats are not the only
-} >    random-access mass-storage formats on the horizon!  I'm guessing
-} >    that for speed of inclusion into product lines, all will emulate
-} >    a disk drive - but that may not be the most efficient way of using
-} >    them (in fact, probably not).  They also can be expected to have
-} >    "direct access" methods according to their physical architecture,
-} >    with some form of tree-access the MOST efficient!
-} 
-} In most cases, the internal architecture of the device will be 
-} optimised for two basic operations; retrieval of large contiguous 
-} extents, and read/write of small randomly scattered regions.
-} 
-} Data access patterns are unlikely to change radically, particularly 
-} given the momentum that modern systems have.  I'll let you work out 
-} what the two above are, and why they are so common.  But trust me, they 
-} are.

     Oh, you needn't point out the obvious.  I'm not talking about that.
     
     I'm talking about handling more and more-varied informational
     content and relationships, AND the potential of non-magnetic,
     optical DASD devices, whether they're raw stabilized gel or 
     impregnated porous glass - both of which are on the horizon; the
     former in the US, and (I understand, partially by reading between
     the lines), the latter in Israel.  Raw storage capacity with
     approximately chip density (speaking of 3-d space taken) is too
     big a leap to stay out of the reach of "Free *n?x using users".

     Except for "delay lines", silicon-based domain-migration technology
     hasn't proven very useful, yet it set some thinking going which
     SHOULD prove useful in optical solid state memories. Just because
     you see "Data access patterns .. unlikely to change radically"
     doesn't mean that we won't see it happening.  Of course we're 
     going to keep those two traditional data transfer modes.  I'm
     pointing out that just those TWO modes aren't the be-all and
     end-all, especially when we start eliminating rotational and seek
     delays from devices used for the same purpose.

	Bruce Gingery	<bgingery@gtcs.com>