Date: Sat, 13 Dec 1997 04:02:08 -0700 (MST)
From: bgingery@gtcs.com
Subject: Re: blocksize on devfs entries (and related)
To: mike@smith.net.au
cc: hackers@FreeBSD.ORG
Message-Id: <199712131102.EAA20340@home.gtcs.com>
In-Reply-To: <199712130848.TAA01888@word.smith.net.au>

On 13 Dec, Mike Smith wrote:

[munch]

-} > 2. physical layout (sect/track, tracks/cyl) also needs to
-} >    be stored for any DASD.  Also any OTHER known info which
-} >    may be used to optimize the filesystem building process for
-} >    the device, such as rotational speed, seek timing ..  If
-} >    this is not stored with driver info in the devfs, then
-} >    some pointer or common reference point should be made to
-} >    the "file entry" that contains the info.
-}
-} Physical layout is a joke, and has been for many years.  This
-} suggestion costs you a lot of credibility.

Physical layout *is* a joke for devices that do their own controller
mapping.
Yet I have yet to see anything cast in silicon that never needs any
rework, or that leaves no possibility of innovation.  Whenever we throw
away information, it's gone - PERIOD, unless it's stored *somewhere*.

Now, the physical layout is (at least theoretically) pollable for some
devices.  What I'm saying is don't throw away the POSSIBILITY of storing
and using this information in an orderly way.

Call it "cost of credibility" if you will, and point to me as an
anachronism if you wish to flame.  The "Maintenance Free" automotive
battery and the "never-LLF IDE drive" still have a lot in common - and
both could have been produced by a certain company I'd choose not to
mention by name, as its marketing hype consistently does NOT match
reality.

I've already suffered some drivers that PRESUME the rotational speed and
then optimize a filesystem for that supposed rotational vs bus speed,
with a resulting significant decrease in performance from a bad
interleave.  I don't have any problem whatsoever in extrapolating that
to a loss of traditional disktab ns, nt, nc, sc, su, sk, cs, hs, and il
values for even POTENTIAL use.  To me it's no wiser than ignoring (or
eliminating) the rm value.

NO DISKTAB INTEGRATION WITH DEVFS?

Perhaps they don't belong in devfs stored values.  Yet these values are
DEVICE dependent, not filesystem dependent, hence I raised the question.
When re-inventing device handling, it's wise, perhaps, to not ignore
portions of it, even if the answer is "leave that part alone".

[munch]

-} > 3. If at the controller level it is possible to concatenate
-} >    or RAID join devices, that information needs to be stored
-} >    for the device.  If this is intrinsic to the device driver
-} >    or the physical device - no matter.

[followed up with]

-} An upper layer may well want to take advantage of, or precautions in
-} light of, the construction of the extent(s) with which it is presented.

-} > 6.
-} >    When a device is opened ro, if the underlying hardware has
-} >    ANY indication that it's a ro open, then if it is later upgraded
-} >    there should at least be a hook for it to be notified that it
-} >    has been upgraded.  Current state (ro/rw) should be available
-} >    to user processes without "testing it by opening a write file"
-} >    to a filesystem (or even raw device).
-}
-} The RO->RW upgrade notification is a contentious issue, but one that
-} definitely needs thinking through.  How would you suggest it be
-} handled?  Should the standard be to reopen the device, or pass a
-} special ioctl, or add a new device entrypoint?

To me, an IOCTL and flag would be tighter than different entry points
for ro vs rw.  Not necessarily tighter code, but tighter management.

[munch poorly stated description and accurate answer Re: vnode fs]

-} > Yet, why deny these the optimization information which will allow
-} > them to map (within the constraints of their architecture) a new
-} > filesystem for best throughput, if it's actually available.
-}
-} Because any "optimisation information" that you could pass them would
-} be wrong.  Any optimisation attempting to operate based on incorrect
-} parameters can only be a pessimisation.

Why must it be wrong?

-} > Now let me raise some additional questions --
-} >
-} >
-} > Should a DASD be mappable ONLY with horizontal slices?
-} > With what we're all doing today, it seems that taking a certain
-} > number of cylinders for slices is best - but other access methods
-} > may find an underlying physical structure more convenient if
-} > a slice specifies a range of heads and cylinders that do NOT
-} > presume that all heads/cylinders from starting to ending according
-} > to physical layout are part of the same slice.  It may be quite
-} > convenient to have a cluster of heads across physical devices
-} > forming a logical device or slice, without fully dedicating those
-} > physical devices to that use.
-}
-} This is a nonsense question in the context of ZBR and "logical extent"
-} devices (eg. SCSI, ATAPI, most ATA devices).

Okay, but not those which expose that info and allow direct control.
Across the Free *n?x lines we're seeing more and more antique
resurrections - even if that's the ONLY place this could be used
accurately.

-}
-} > And, I'll mention again, DISK formats are not the only
-} > random-access mass-storage formats on the horizon!  I'm guessing
-} > that for speed of inclusion into product lines, all will emulate
-} > a disk drive - but that may not be the most efficient way of using
-} > them (in fact, probably not).  They also can be expected to have
-} > "direct access" methods according to their physical architecture,
-} > with some form of tree-access the MOST efficient!
-}
-} In most cases, the internal architecture of the device will be
-} optimised for two basic operations; retrieval of large contiguous
-} extents, and read/write of small randomly scattered regions.
-}
-} Data access patterns are unlikely to change radically, particularly
-} given the momentum that modern systems have.  I'll let you work out
-} what the two above are, and why they are so common.  But trust me, they
-} are.

Oh, you needn't point out the obvious.  I'm not talking about that.  I'm
talking about handling more and more-varied informational content and
relationships, AND the potential of non-magnetic, optical DASD devices,
whether they're raw stabilized gel or impregnated porous glass - both of
which are on the horizon; the former in the US, and (I understand,
partially by reading between the lines), the latter in Israel.

Raw storage capacity with approximately chip density (speaking of 3-d
space taken) is too big a leap to stay out of the reach of "Free *n?x
using users".  Except for "delay lines", silicon-based domain-migration
technology hasn't proven very useful, yet it set some thinking going
which SHOULD prove useful in optical solid state memories.

Just because you see "Data access patterns .. unlikely to change
radically" doesn't mean that we won't see it happening.  Of course we're
going to keep those two traditional data transfer modes.  I'm pointing
out that just those TWO such modes aren't the be-all and end-all,
especially when we start eliminating rotational and seek delays from
devices used for the same purpose.

Bruce Gingery