Date: Wed, 22 Jan 2003 20:59:18 +0100 (MET)
Message-Id: <200301221959.h0MJxIiW088157@uriah.heep.sax.de>
Mime-Version: 1.0
X-Newsreader: knews 1.0b.1
Reply-To: joerg_wunsch@uriah.heep.sax.de (Joerg Wunsch)
Organization: Private BSD site, Dresden
X-Phone: +49-351-2012 669
X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E
References: <20030122121739.GA758@cicely8.cicely.de>
    <25074.1043238359@critter.freebsd.dk>
    <20030122124403.GB758@cicely8.cicely.de>
    <20030122135404.B70341@uriah.heep.sax.de>
    <20030122130644.GA52953@tara.freenix.org>
From: j@uriah.heep.sax.de (Joerg Wunsch)
Subject: Re: vinum root [Was: I want a sysctl kern.allow_shooting_into_my_foot!]
X-Original-Newsgroups: local.freebsd.current
To: freebsd-current@freebsd.org
Content-Transfer-Encoding: 8bit
Content-Type: multipart/mixed; boundary="=-=-=__Z6c+YyEsi+3q+W1ZhZc2+8GQT__=-=-="
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

--=-=-=__Z6c+YyEsi+3q+W1ZhZc2+8GQT__=-=-=

Ollivier Robert wrote:

>> Oh, i should add: in my case, it's loaded before mounting the
>> root (root is on vinum).

> And how did you achieved this ?  I thought vinum isn't able to do
> that...

Well, the patch for -current is currently only sitting here on my
machine(s).  Greg wanted to review it before i commit it, but i'll
append it in case someone else wants to have a look at it.  Under
-current, the generic part of the kernel (namely mountroot() and its
related functions) is already clean enough that no changes are needed
there; only vinum needs a patch.

Under -stable, the patch is already in the tree.  Since mountroot()
there employs a very simple scheme to derive the dev_t of the root
device from the given device name, a patch to it was required as well.
All this is not very clean, but it works (see below).

The basic concept is that you need to have the loader load the vinum
module for you, and vinum needs to be told to configure itself early.

Under -current (with the patch), put the following into
/boot/loader.conf:

    vinum_load="YES"
    vinum.autostart="YES"

Your /etc/fstab needs to have /dev/vinum/root (or whatever you name
it) for the / filesystem; the loader will read this file, and pass the
device name as the default root device to mountroot().  mountroot()
then asks the drivers (by the trick of calling the undocumented event
handler for dev_clone) to get a dev_t for the given name.
Alternatively, any name entered after boot -a will be resolved the
same way.

So what the patch does is:

 . implement the logic to start vinum early
 . implement an event handler for dev_clone so vinum will get asked,
   too

Under 4.x, put the following into loader.conf:

    vinum_load="YES"
    vinum.drives="/dev/da0 /dev/da1 /dev/da2"
    vinum.root="root"

The logic to have vinum auto-scan the available disks is not yet
there, so you explicitly need to name the devices to scan.  (This part
of the patch has been implemented first here under -current, but can
perhaps be MFC'ed, too.  The vinum.drives approach will also still
work with the -current patch but is less convenient.)

Also, since mountroot() in 4.x has no way to ask the drivers to
translate the given name into the corresponding dev_t, the trick with
vinum.root is used: if this environment variable is set, vinum will
preset the variable rootdev to the appropriate dev_t, provided the
volume it names has been found.  The generic code will trust this
value if the major # of the driver, as figured out from the root
device name, matches the major # of the preset rootdev.  I.e., it
still gets /dev/vinum/root from the loader, strips the /dev/ (not
needed at all inside the kernel), then scans until the first digit or
slash, yielding "vinum" in that case.  Now, if the major # of the
preset rootdev matches the major # of vinum, the value will be taken.
If rootdev has not been set, the traditional approach of deriving the
minor # from the unit #, slice #, and partition name will be taken
(using a hardcoded algorithm which is independent of the actual
driver).  All this gives the illusion that mountroot() would know how
to handle /dev/vinum/root. ;-)

Of course, what you cannot do is to boot -a, then enter an invalid
name (so you'll get asked again), and then enter ufs:/dev/vinum/root:
the previous invalid name has destroyed the preset rootdev value.  At
that point, you either need to abort or to enter a valid
slice/partition.

The biggest problem of all this is, of course, the bootstrapping step.
The bootstrap still needs an `a' partition in order to read at least
/boot/loader etc. from.  The solution is to produce a faked overlay
`a' partition that sits at exactly the point where the corresponding
vinum subdisk of the root device is located.  Another solution would
be to set up a mini-root that only contains a boot/ directory, and use
that one for partition `a', but that'll only cause other trouble (like
"make install" not doing the right thing).  While /boot/loader could
perhaps be taught how to read something from a vinum volume, there's
always the problem of how to get at /boot/loader itself, so i have no
other idea for this.  A script could be provided that creates the
faked `a' entry.

-- 
cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)

--=-=-=__Z6c+YyEsi+3q+W1ZhZc2+8GQT__=-=-=
Content-Type: text/plain
Content-Disposition: attachment; filename="vinumrootpatch"

Index: vinum.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/vinum/vinum.c,v
retrieving revision 1.49
diff -u -u -r1.49 vinum.c
--- vinum.c 1 Apr 2002 21:30:36 -0000 1.49
+++ vinum.c 13 Jan 2003 21:13:33 -0000
@@ -66,6 +66,8 @@
 STATIC int vinum_modevent(module_t mod, modeventtype_t type, void *unused);
+STATIC void vinum_clone(void *arg, char *name, int namelen, dev_t *dev);
+
 struct _vinum_conf vinum_conf;  /* configuration information */
 dev_t vinum_daemon_dev;
@@ -79,6 +81,10 @@
 void
 vinumattach(void *dummy)
 {
+    int i, rv;
+    char *cp, *cp1, *cp2, **drives, *drivep;
+    size_t alloclen;
+
     /* modload should prevent multiple loads, so this is worth a panic */
     if ((vinum_conf.flags & VF_LOADED) != 0)
         panic("vinum: already loaded");
@@ -135,6 +141,63 @@
     bzero(SD, sizeof(struct sd) * INITIAL_SUBDISKS);
     vinum_conf.subdisks_allocated = INITIAL_SUBDISKS;  /* number of sd slots allocated */
     vinum_conf.subdisks_used = 0;  /* and number in use */
+
+    EVENTHANDLER_REGISTER(dev_clone, vinum_clone, 0, 1000);
+
+    /*
+     * See if the loader has passed us any of the
+     * autostart options.
+     */
+    cp = drivep = NULL;
+#ifndef VINUM_AUTOSTART
+    if ((cp = getenv("vinum.autostart")) != NULL) {
+        freeenv(cp);
+        cp = NULL;
+#endif
+        rv = kernel_sysctlbyname(&thread0, "kern.disks",
+                                 NULL, NULL,
+                                 NULL, 0,
+                                 &alloclen);
+        if (rv)
+            log(LOG_NOTICE,
+                "sysctlbyname(\"kern.disks\") failed, rv = %d\n",
+                rv);
+        else {
+            drivep = malloc(alloclen, M_TEMP, M_WAITOK);
+            (void)kernel_sysctlbyname(&thread0, "kern.disks",
+                                      drivep, &alloclen,
+                                      NULL, 0,
+                                      NULL);
+            goto start;
+        }
+#ifndef VINUM_AUTOSTART
+    } else
+#endif
+    if ((cp = getenv("vinum.drives")) != NULL) {
+      start:
+        for (cp1 = cp? cp: drivep, i = 0, drives = 0;
+             *cp1 != '\0';
+             i++) {
+            cp2 = cp1;
+            while (*cp1 != '\0' && *cp1 != ',' && *cp1 != ' ')
+                cp1++;
+            if (*cp1 != '\0')
+                *cp1++ = '\0';
+            drives = realloc(drives,
+                             (unsigned long)((i + 1) * sizeof(char *)),
+                             M_TEMP, M_WAITOK);
+            drives[i] = cp2;
+        }
+        if (i == 0)
+            goto bailout;
+        rv = vinum_scandisk(drives, i);
+        if (rv)
+            log(LOG_NOTICE, "vinum_scandisk() returned %d\n", rv);
+      bailout:
+        freeenv(cp);
+        free(drives, M_TEMP);
+        free(drivep, M_TEMP);
+    }
 }
 /*
@@ -490,6 +553,25 @@
         return 0;  /* err on the size of conservatism */
     return size;
+}
+
+void
+vinum_clone(void *arg, char *name, int namelen, dev_t *dev)
+{
+    struct volume *vol;
+    int i;
+
+    if (*dev != NODEV)
+        return;
+    if (strncmp(name, "vinum/", sizeof("vinum/") - 1) != 0)
+        return;
+
+    name += sizeof("vinum/") - 1;
+    if ((i = find_volume(name, 0)) == -1)
+        return;
+
+    vol = &VOL[i];
+    *dev = vol->dev;
 }
 /* Local Variables: */
Index: vinumhdr.h
===================================================================
RCS file: /home/ncvs/src/sys/dev/vinum/vinumhdr.h,v
retrieving revision 1.27
diff -u -u -r1.27 vinumhdr.h
--- vinumhdr.h 12 May 2002 20:49:41 -0000 1.27
+++ vinumhdr.h 13 Jan 2003 20:20:32 -0000
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include
 #endif
 #include
 #include
Index: vinumio.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/vinum/vinumio.c,v
retrieving revision 1.78
diff -u -u -r1.78 vinumio.c
--- vinumio.c 22 Jan 2003 14:06:46 -0000 1.78
+++ vinumio.c 22 Jan 2003 19:38:55 -0000
@@ -50,32 +50,21 @@
 int
 open_drive(struct drive *drive, struct thread *td, int verbose)
 {
-    struct nameidata nd;
     struct cdevsw *dsw;  /* pointer to cdevsw entry */
-    int error;
     if (drive->flags & VF_OPEN)  /* open already, */
         return EBUSY;  /* don't do it again */
-    NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, drive->devicename,
-        curthread);
-    error = namei(&nd);
-    if (error)
-        return (error);
-    if (!vn_isdisk(nd.ni_vp, &error)) {
-        NDFREE(&nd, 0);
-        return (error);
-    }
-    drive->dev = udev2dev(nd.ni_vp->v_rdev->si_udev, 0);
-    NDFREE(&nd, 0);
-
-    if (drive->dev == NULL)  /* didn't find anything */
-        return ENODEV;
+    drive->dev = getdiskbyname(drive->devicename);
+    if (drive->dev == NODEV)  /* didn't find anything */
+        return ENOENT;
     drive->dev->si_iosize_max = DFLTPHYS;
     dsw = devsw(drive->dev);
-    if (dsw == NULL)
+    if (dsw == NULL)  /* sanity, should not happen */
         drive->lasterror = ENOENT;
+    else if ((dsw->d_flags & D_DISK) == 0)
+        drive->lasterror = ENOTBLK;
     else
         drive->lasterror = (dsw->d_open) (drive->dev, FWRITE | FREAD, 0, NULL);
@@ -145,11 +134,7 @@
 int
 init_drive(struct drive *drive, int verbose)
 {
-    if (drive->devicename[0] != '/') {
-        drive->lasterror = EINVAL;
-        log(LOG_ERR, "vinum: Can't open drive without drive name\n");
-        return EINVAL;
-    }
+
     drive->lasterror = open_drive(drive, curthread, verbose); /* open the drive */
     if (drive->lasterror)
         return drive->lasterror;

--=-=-=__Z6c+YyEsi+3q+W1ZhZc2+8GQT__=-=-=--
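
As an illustration of the 4.x name handling described in the message
(strip the /dev/, then scan until the first digit or slash), here is a
minimal user-space sketch.  It is not part of the patch and not the
actual mountroot() code: the function name root_driver_name() and its
buffer interface are made up for this example, and it assumes that any
"ufs:" prefix has already been removed from the name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not in the FreeBSD tree):
 * copy the driver-name part of a root device path into buf.
 * Returns the length of the driver name, or -1 if it does not fit.
 */
int
root_driver_name(const char *path, char *buf, size_t buflen)
{
    size_t n;

    assert(path != NULL && buf != NULL);

    /* Strip a leading "/dev/"; it is not needed inside the kernel. */
    if (strncmp(path, "/dev/", sizeof("/dev/") - 1) == 0)
        path += sizeof("/dev/") - 1;

    /* Scan until the first digit or slash (or the end of the string). */
    for (n = 0; path[n] != '\0' && path[n] != '/' &&
        !isdigit((unsigned char)path[n]); n++)
        ;

    /* Leave room for the terminating NUL. */
    if (n >= buflen)
        return (-1);
    memcpy(buf, path, n);
    buf[n] = '\0';
    return ((int)n);
}
```

Called with "/dev/vinum/root" this yields "vinum", and with
"/dev/da0s1a" it yields "da"; the resulting name is what would then be
used to look up the driver's major # for the comparison against the
preset rootdev.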