From owner-freebsd-hackers  Fri Mar 29 09:01:52 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id JAA01912
          for hackers-outgoing; Fri, 29 Mar 1996 09:01:52 -0800 (PST)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id JAA01905
          for <freebsd-hackers@FreeBSD.ORG>; Fri, 29 Mar 1996 09:01:47 -0800 (PST)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id JAA05680; Fri, 29 Mar 1996 09:57:47 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199603291657.JAA05680@phaeton.artisoft.com>
Subject: Re: fdisk and partition info
To: julian@ref.tfs.com (JULIAN Elischer)
Date: Fri, 29 Mar 1996 09:57:46 -0700 (MST)
Cc: terry@lambert.org, bde@zeta.org.au, freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199603290645.WAA25956@ref.tfs.com> from "JULIAN Elischer" at Mar 28, 96 10:45:04 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> I have a design and (some) code for exactly this, except that it's the
> other way around.. the disk handling methods call devfs, and not
> the other way around..
>
> > CONTROLLER 0 probes for devices
> >   DEVICE 1 found
> >   *REGISTER DEVICE 1 WITH DEVFS, NAME "/dev/dsk/c0d0"
> >     -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
> >        "BSD slice driver" TRIES TO RECOGNIZE DEVICE
> >        "BSD slice driver" FAILS TO RECOGNIZE DEVICE
> >     <- "BSD slice driver" SAYS NO
> >     -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
> >        "DOS partition driver" TRIES TO RECOGNIZE DEVICE
> >        "DOS partition driver" RECOGNIZES DEVICE

???

I thought that was what I was describing...

controller -> devfs [ -> logical driver -> devfs ... ] -> mount

Or are you saying that the devfs doesn't process the registration
events locally?  If so, what causes a disk handling method to act
on a device of a format it recognizes?
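
To make the chain above concrete, here is a minimal user-space sketch
of the "devfs asks each logical driver in turn" step.  Every name in
it (struct blkdev, ld_register, devfs_register, the two probe
functions) is made up for illustration and is not existing kernel
code, and the BSD probe is deliberately simplified to look for the
label magic at the start of the buffer:

```c
#include <stddef.h>
#include <string.h>

#define MAX_DRIVERS	8		/* hypothetical table size */

struct blkdev {				/* hypothetical device handle */
	const char	*name;		/* e.g. "/dev/dsk/c0d0" */
	unsigned char	sector0[512];	/* first sector, already read */
};

/* A logical driver returns nonzero if it recognizes the device. */
typedef int (*ld_probe_t)(const struct blkdev *);

struct logical_driver {
	const char	*desc;
	ld_probe_t	probe;
};

static struct logical_driver	drivers[MAX_DRIVERS];
static int			ndrivers;

/* Add a logical driver to the list devfs will consult. */
int
ld_register(const char *desc, ld_probe_t probe)
{
	if (ndrivers == MAX_DRIVERS)
		return (-1);
	drivers[ndrivers].desc = desc;
	drivers[ndrivers].probe = probe;
	ndrivers++;
	return (0);
}

/*
 * Called on device registration: ask each driver, in registration
 * order, whether it wants the device.  Returns the first driver
 * that says yes, or NULL if nobody recognizes it.
 */
const struct logical_driver *
devfs_register(const struct blkdev *dev)
{
	int	i;

	for (i = 0; i < ndrivers; i++)
		if (drivers[i].probe(dev))
			return (&drivers[i]);
	return (NULL);
}

/* A DOS partition table sector ends in 0x55 0xAA. */
int
dos_probe(const struct blkdev *dev)
{
	return (dev->sector0[510] == 0x55 && dev->sector0[511] == 0xAA);
}

/* Simplification: match the disklabel magic at offset 0 only. */
int
bsd_probe(const struct blkdev *dev)
{
	static const unsigned char magic[4] = { 0x57, 0x45, 0x56, 0x82 };

	return (memcmp(dev->sector0, magic, sizeof(magic)) == 0);
}
```

A driver that claims the device would then register its own logical
devices the same way, which is the "[ -> logical driver -> devfs ... ]"
loop in the line above.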

> > > >1)	The hierarchy could get large fast.  For instance, a device
> > > 
> > > No kidding.  There are already about 512+64 possible devices for the
> > > slice and partition layers.
> > 
> > Right... that's why you would use directories for population.
> 
> using my scheme you only have devices for disk partitions that exist.
> that is fewer devices than exist at the moment..

Same here.  The difference is that there would be explicit nodes
for manipulating partitioning, with some common interface, like:

#define	DIOCISDEV	_IO('D', 1)			/* use node as device*/
#define	DIOCGETAP	_IOWR('D', 4, struct diskapart)	/* get allowable*/
#define	DIOCSETAP	_IOW('D', 5, struct diskapart)	/* set type*/
#define	DIOCREADP	_IOWR('D', 2, struct diskparti)	/* get partition data*/
#define	DIOCWRITEP	_IOW('D', 3, struct diskparti)	/* set partition data*/

Respectively, these tell devfs that the directory node is going to be
used as a device (replace dir vnode with dev vnode in fd), get the
current (and allowable) logical devices at the given node, set or
delete the active logical device, read the partition data, and set
the partition data.
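
The two structures are not spelled out anywhere in this message, so
here is one possible layout, inferred only from the fields the
pseudo-code uses (index, active, desc, flags, bias, length).  The
field types, DA_DESCLEN, the sector units, and the diskparti_overlap
helper are all assumptions:

```c
#include <stdint.h>

#define DA_DESCLEN	32	/* hypothetical description length */

/* One allowable logical driver (partitioning type) at a node. */
struct diskapart {
	int		index;		/* in: which entry to fetch */
	int		active;		/* nonzero if currently applied */
	unsigned int	flags;		/* allowable spanning strategies */
	char		desc[DA_DESCLEN];	/* human-readable name */
};

/* One potential or existing partition under the active driver. */
struct diskparti {
	int		index;		/* in: which entry to fetch */
	int		active;		/* nonzero if the entry exists */
	uint64_t	bias;		/* start offset, in sectors */
	uint64_t	length;		/* size, in sectors */
};

/*
 * Return nonzero if two active entries cover any common sector.
 * Inactive entries never overlap anything.
 */
int
diskparti_overlap(const struct diskparti *a, const struct diskparti *b)
{
	if (!a->active || !b->active)
		return (0);
	return (a->bias < b->bias + b->length &&
	    b->bias < a->bias + a->length);
}
```

A partitioning program could run a check like diskparti_overlap
before issuing the write ioctl, though the logical driver would still
have to validate the entry itself.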

For instance, here's a pseudo-code partitioning program that will
work with all divisions of devices into logical extents, assuming
logical device nodes are exported in some hierarchical fashion, and
that you can "pick" devices to operate on via hierarchy traversal
of /dev/dsk:

	struct allow_list {
		struct diskapart	item;
		struct allow_list	*next;
	};

	struct part_list {
		struct diskparti	item;
		struct part_list	*next;
	};

	struct allow_list	*atop = NULL;
	struct part_list	*ptop = NULL;

	{
		struct allow_list	**alpp = &atop;
		struct allow_list	*alp;
		struct part_list	**plpp = &ptop;
		struct part_list	*plp;
		struct diskapart	dai;	/* diskapart item*/
		struct diskparti	dpi;	/* diskparti item*/
		int			fd;
		int			i;
		int			choice;

		/* open physical device*/
		fd = open( "/dev/dsk/c0d0", O_RDWR, 0);

		/* use as device instead of looking up inferior nodes*/
		ioctl( fd, DIOCISDEV, 0);

		/*
		 * get allowable logical drivers (partitioning types);
		 * put them on a linked list.
		 */
		dai.index = 0;
		while( ioctl( fd, DIOCGETAP, &dai) == 0) {
			*alpp = malloc( sizeof(struct allow_list));
			memcpy( &(*alpp)->item, &dai, sizeof(struct diskapart));
			(*alpp)->next = NULL;
			alpp = &((*alpp)->next);
			dai.index++;	/* assumes caller advances index*/
		}

		... disallow if item.active is true ...

		/* display choices to user... option base 1*/
		for( i = 1, alp = atop; alp != NULL; i++, alp = alp->next) {
			printf( "%d: %s\n", i, alp->item.desc);
		}

		/*
		 * get user choice of partitioning to apply
		 * (into "choice"); as in display above, option base
		 * is 1.
		 */
		...


		/* locate structure for choice...*/
		for( i = choice, alp = atop; i > 1; i--, alp = alp->next)
			continue;

		/* set as active (will write empty structures to disk)*/
		alp->item.active = 1;
		ioctl( fd, DIOCSETAP, &alp->item);

		/*
		 * get potential partition information... there will
		 * be one per item allowed by the logical device;
		 * put them on a linked list.  Active will be non-zero
		 * for those which exist.
		 */
		dpi.index = 0;
		while( ioctl( fd, DIOCREADP, &dpi) == 0) {
			*plpp = malloc( sizeof(struct part_list));
			memcpy( &(*plpp)->item, &dpi, sizeof(struct diskparti));
			(*plpp)->next = NULL;
			plpp = &((*plpp)->next);
			dpi.index++;	/* assumes caller advances index*/
		}

	keep_partitioning:
		/*
		 * Use information from allowable ranges and user input
		 * to pick active partitions; because allowable space
		 * may be reduced, we will need to refresh the list
		 * via ioctl after a change operation... ie: for an
		 * empty DOS partition entry, bias = 0, length = 2G
		 * (or limit of available space, whichever is smaller).
		 * The flags field of "diskapart" structure indicates
		 * allowable spanning strategies for a given region
		 * (ie: if partitions can start on cylinder 45 if there
		 * isn't one from 1-44 already, etc.).
		 */

		/* get menu item*/
		if( ... done)
			goto done;

		/* handle partition creation/deletion for selected*/
		...


		/* write partition entry*/
		ioctl( fd, DIOCWRITEP, &plp->item);

		/* refresh potentials list*/
		...

		goto keep_partitioning;

		/* NOTREACHED*/

	done:		/* got here on "exit" from partitioning menu... */
		...

	}


> minor numbers are assigned in increasing non-recurring sequence.
> first is 0, 2nd is 1 etc.
> if you repartition a drive and the last device was 6 but you delete device 3
> when you re-make it you get 7..
> there is a little hash-table to associate minor numbers to 'disk_section'
> structures.
> each disk_section has the usual start, length stuff and in addition a pointer
> to its 'parent' device, and a method that is called if non-0
> the method() can dynamically re-calculate a request.

Hmmmm... where do you store this monotonically increasing number on disk?
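
For reference, this is roughly how I read the scheme being described;
it is my own sketch, not Julian's code.  The names (ds_create,
ds_lookup, ds_destroy, ds_translate) and the hash size are invented,
each section has a single parent, and the per-section method is left
out.  Note that the counter lives only in memory, which is why I ask
where it would be kept on disk:

```c
#include <stddef.h>
#include <stdlib.h>

#define DS_HASHSIZE	16		/* hypothetical; power of two */
#define DS_HASH(m)	((m) & (DS_HASHSIZE - 1))

struct disk_section {
	int			minor;
	unsigned long		start;		/* offset within parent */
	unsigned long		length;
	struct disk_section	*parent;	/* NULL for the whole disk */
	struct disk_section	*hnext;		/* hash chain */
};

static struct disk_section	*ds_hash[DS_HASHSIZE];
static int			ds_nextminor;	/* only ever increases */

/* Create a section; it always gets a minor never handed out before. */
struct disk_section *
ds_create(struct disk_section *parent, unsigned long start,
    unsigned long length)
{
	struct disk_section	*ds = malloc(sizeof(*ds));

	if (ds == NULL)
		return (NULL);
	ds->minor = ds_nextminor++;
	ds->start = start;
	ds->length = length;
	ds->parent = parent;
	ds->hnext = ds_hash[DS_HASH(ds->minor)];
	ds_hash[DS_HASH(ds->minor)] = ds;
	return (ds);
}

/* Map a minor number back to its section, or NULL if deleted. */
struct disk_section *
ds_lookup(int minor)
{
	struct disk_section	*ds;

	for (ds = ds_hash[DS_HASH(minor)]; ds != NULL; ds = ds->hnext)
		if (ds->minor == minor)
			return (ds);
	return (NULL);
}

/* Unlink a section from the hash and free it; its minor is retired. */
void
ds_destroy(struct disk_section *ds)
{
	struct disk_section	**pp = &ds_hash[DS_HASH(ds->minor)];

	while (*pp != NULL && *pp != ds)
		pp = &(*pp)->hnext;
	if (*pp == ds)
		*pp = ds->hnext;
	free(ds);
}

/* Turn a section-relative block number into one on the whole disk. */
unsigned long
ds_translate(const struct disk_section *ds, unsigned long blkno)
{
	for (; ds != NULL; ds = ds->parent)
		blkno += ds->start;
	return (blkno);
}
```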

> a single disk_section can have multiple parents
> which would be the case to simulate the CCD driver.
> 
> geeze I gotta finish this code.. I feel so guilty..

8-).

>  as far as naming hierarchy.. I was going to keep tacking things on the end
>  sd0/whole
>  sd0/part1
>  sd0/part2/whole
>  sd0/part2/BSDa
>  sd0/part2/BSDb
>  etc.
>  but really that's just a decision taken at the time of the devfs call
>  and purely a string..
>  it'd be a simple edit to make it:
>  sd0
>  sd0s1
>  sd0s2
>  sd0s2a
>  sd0s2b

Agreed.   Names are only a convenience for humans, who actually
should not have to think of devices at this level anyway, only of
cutting the devices up and the usable pie pieces that result...
the pseudo-code above does not deal with setting up spanning
sets or anything, but as long as you stole the first sector/
cylinder/whatever as a tag (and put order information there as
well), they could self-assemble as a result of callbacks, and the
code would not be that difficult.  You would need to have a
logical driver class associated with the allowable logical
drivers, and DIOCREADP and DIOCWRITEP would be applicable only
in case of LD_CLASS_PART, not LD_CLASS_SPAN (which would require
its own control ioctls).
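
As a rough illustration of the self-assembly idea, suppose each
member of a spanning set carries a small tag in its stolen first
sector, and a callback hands each tag to the span driver as the
member shows up.  The tag layout, the names (struct span_tag,
struct span_set, span_attach) and the member limit are assumptions
made up for this sketch:

```c
#include <stddef.h>

#define SPAN_MAXMEMBERS	8	/* hypothetical limit */

/* Tag stored on each member device. */
struct span_tag {
	unsigned int	set_id;	/* which spanning set this belongs to */
	unsigned int	order;	/* position within the set, 0-based */
	unsigned int	count;	/* total members in the set */
};

/* Set under assembly; must start out zero-filled. */
struct span_set {
	unsigned int	set_id;
	unsigned int	count;
	unsigned int	present;
	const char	*member[SPAN_MAXMEMBERS];	/* by order */
};

/*
 * Callback for each member found, in any order.  Returns 1 when the
 * set has just become complete, 0 if members are still missing, and
 * -1 if the tag is invalid or conflicts with what is already there.
 */
int
span_attach(struct span_set *set, const struct span_tag *tag,
    const char *devname)
{
	if (tag->count == 0 || tag->count > SPAN_MAXMEMBERS ||
	    tag->order >= tag->count)
		return (-1);
	if (set->present == 0) {
		set->set_id = tag->set_id;
		set->count = tag->count;
	} else if (set->set_id != tag->set_id || set->count != tag->count)
		return (-1);
	if (set->member[tag->order] != NULL)
		return (-1);		/* slot already filled */
	set->member[tag->order] = devname;
	set->present++;
	return (set->present == set->count);
}
```

Once span_attach returns 1, the span driver would register the
assembled logical device with devfs just like any other.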

> > You could do the same thing with a flat namespace, if you were willing
> > to parse the devices into semantic units and build them up one
> > character concatenation at a time.
> 
>  which I would do I think

I was thinking that human usability wasn't a factor -- the human
usability is a matter for the interface program, so making things
unambiguous for the interface program seemed to be a higher
priority.  Obviously, as long as it's parsable from a directory
listing into a hierarchy internally, it's irrelevant, except as
regards the amount of work required to implement a usable interface.
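
To show the kind of work the interface program takes on with a flat
namespace, here is a small sketch that parses names of the
"sd0s2a" form back into their parts.  struct devname and
devname_parse are made-up names, and the grammar (driver letters,
unit, optional "s" plus slice number, optional partition letter a-h)
is my assumption about the naming, not a specification:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct devname {
	char	driver[8];	/* e.g. "sd" */
	int	unit;		/* e.g. 0 */
	int	slice;		/* e.g. 2, or -1 if none */
	char	part;		/* e.g. 'a', or '\0' if none */
};

/* Parse a flat device name; return 0 on success, -1 if malformed. */
int
devname_parse(const char *s, struct devname *dn)
{
	size_t	n = 0;
	char	*end;

	/* driver name: leading letters, followed by the unit number */
	while (isalpha((unsigned char)s[n]))
		n++;
	if (n == 0 || n >= sizeof(dn->driver) ||
	    !isdigit((unsigned char)s[n]))
		return (-1);
	memcpy(dn->driver, s, n);
	dn->driver[n] = '\0';
	dn->unit = (int)strtol(s + n, &end, 10);

	/* optional slice: 's' followed by a number */
	dn->slice = -1;
	if (*end == 's' && isdigit((unsigned char)end[1]))
		dn->slice = (int)strtol(end + 1, &end, 10);

	/* optional partition letter, 'a' through 'h' */
	dn->part = '\0';
	if (*end >= 'a' && *end <= 'h')
		dn->part = *end++;

	return (*end == '\0' ? 0 : -1);
}
```

With a directory hierarchy the same information falls out of the
path components, so this parsing step goes away.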

> > I think that for a minor change in the kernel (30 or less lines of
> > additional code), you could get what you wanted from a hierarchy
> > much easier.  Note that cloning devices (I'll assume we will go to
> > cloning for pty's eventually...)
> 
> well possibly.. don't know how yet..
> I'd also like to implement an "active directory"
> where a 'lookup' is fielded by the driver..

Hmmm... as in "master" and "slave" directories for pty allocation?

The reason I phrased things the way I did on the cloning (using the
directory as the control device to get an assignment) is that once
you get an assignment, you could reserve it until EITHER the control
device was closed OR the allocated device master was opened.  This
eliminates the potential race on allocation for iterate/open on an
"active directory", and saves you from badly behaved programs at
the same time (ie: crash in the middle of the open procedure, etc.).
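
A toy model of that reservation rule, to show there is not much to
it.  The slot table, the state names and the three functions are all
invented for illustration, and "ctl" stands for whatever identifies
one open of the control device:

```c
#include <stddef.h>

#define NPTY	4	/* hypothetical number of clonable units */

enum pty_state { PTY_FREE, PTY_RESERVED, PTY_OPEN };

struct pty_slot {
	enum pty_state	state;
	int		owner;	/* control open holding the reservation */
};

static struct pty_slot	ptys[NPTY];

/* Control device asks for an assignment; returns a unit or -1. */
int
clone_assign(int ctl)
{
	int	i;

	for (i = 0; i < NPTY; i++)
		if (ptys[i].state == PTY_FREE) {
			ptys[i].state = PTY_RESERVED;
			ptys[i].owner = ctl;
			return (i);
		}
	return (-1);
}

/* Master side opened: the reservation turns into a real open. */
int
clone_open_master(int unit, int ctl)
{
	if (unit < 0 || unit >= NPTY ||
	    ptys[unit].state != PTY_RESERVED || ptys[unit].owner != ctl)
		return (-1);
	ptys[unit].state = PTY_OPEN;
	ptys[unit].owner = -1;
	return (0);
}

/*
 * Control device closed (including by a crash): drop any unit that
 * was reserved through it but never opened.  Open units are kept.
 */
void
clone_close_ctl(int ctl)
{
	int	i;

	for (i = 0; i < NPTY; i++)
		if (ptys[i].state == PTY_RESERVED && ptys[i].owner == ctl)
			ptys[i].state = PTY_FREE;
}
```

Because a unit is handed out and marked in one step, two programs
cannot both walk the directory and pick the same free device.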

I think the "active directory" idea could be useful elsewhere (ie: for
something like /dev/fd/* iteration), so don't throw it out.  8-).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.