Date: Mon, 23 Aug 2010 17:12:01 -0600 (MDT)
From: "M. Warner Losh"
To: marcelm@juniper.net
Cc: freebsd-arch@FreeBSD.org
Subject: Re: RFC: enhancing the root mount logic

Marcel Moolenaar writes:
: All,
:
: In embedded products, software is possibly installed as an image onto
: an actual storage device. This means that mounting the storage device
: as root is not enough to have a usable root file system. The rough
: draft below is an idea to enhance the root mount from having ad-hoc
: quirks to a well-defined and recursive mechanism to allow a wide
: range of use cases.
:
: The root mount logic is recursive as follows:
: 1.  The kernel mounts devfs as root (as it is now).
: 2.
:     The kernel will re-mount root by virtue of reading a file, called
:     /.mount.conf, in the current root file system and following the
:     directives in it. devfs synthesizes the contents of this file.
:
: At each iteration, the kernel will:
: 1.  move the devfs mount from /dev in the old file system to /dev in
:     the new file system.
: 2.  As per the directives or unconditionally, the kernel will re-mount
:     the old root file system under /.mount (or some other name) within
:     the new file system.
:
: devfs will synthesize the contents of /.mount.conf as per the kernel
: configuration and tunables. The administrator (or install process)
: will create and populate /.mount.conf for all other cases.
:
: Directives in /.mount.conf are envisioned to be something like:
:
:     {FS}:{MOUNTPOINT}    e.g. ufs:/dev/da0
:         a root mount alternative. The order of the alternatives in
:         the file determines the priority.
:
:     .ask
:         a root mount alternative that asks the operator to specify
:         what the root mount should be.
:
:     .wait N    e.g. .wait 5
:         wait at most N seconds for a root mount alternative to
:         succeed. If an alternative does not succeed within that
:         time, move on to the next alternative.
:
:     .onfail {panic|reboot|retry|continue}
:         Tells the kernel what to do in case it can't successfully
:         complete the root mount as directed to.
:
: The .wait directive works better (probably) if we have events that
: signify the arrival of a file system or device special file, so that
: we can wait for at most N seconds after the last event. This also
: allows us to wait for a separate interval between events.
:
: As an example, consider:
:
:     [devfs] /.mount.conf:
:         ufs:/dev/da0
:         .ask
:         .wait 5
:         .onfail panic
:
:     [ufs:/dev/da0] /.mount.conf
:         md0:/images/OS-image-1.0.iso
:         unionfs:/jail/freebsd-8-stable
:         .wait 0
:         .onfail continue
:
: In the example, the kernel will mount devfs, read /.mount.conf and
: wait at most 5 seconds to mount the UFS on /dev/da0.
: If that fails, the kernel will ask (once) and panic in case of failure.
:
: If the UFS root mount succeeded, the kernel will re-mount devfs
: underneath /dev. Since this is the first non-devfs root file system,
: the kernel will not re-mount the old root under /.mount.
:
: Since there's a /.mount.conf on the UFS, the kernel will read it
: and repeat the process. First it'll try and mount the OS image
: in /images/OS-image-1.0.iso and if it's not present will try to
: mount some -stable 8 chroot using unionfs (not necessarily a
: real-world example here :-) If either fails, the kernel will
: continue booting using the current root file system. Assuming that
: the image is present, the kernel will re-mount root, move devfs
: underneath /dev in the MD root and remount ufs:/dev/da0 under
: /.mount in the MD root. This gives the following picture:
:
:     /          md0:[ufs:/dev/da0]/images/OS-image-1.0.iso
:     /.mount    ufs:/dev/da0
:     /dev       devfs
:
:
: Things not explicitly touched upon:
:   o  root mount options
:   o  directives to instruct the kernel what to run as the initial
:      process to eliminate the rather ad-hoc hardcoding. E.g.:
:          .init /sbin/init
:          .init /sbin/init.old
:
: Is this something that people feel is worth fleshing out and
: prototyping?

This sounds very interesting. If kept simple, I could see how this
would make my life a lot easier.

However, all this scripting sounds a bit like a very simple shell in
the kernel. What advantages are there to this approach vs having the
ability to run a simple shell script or executable and "pivot" the
root to a new location? And how do you emulate the mount_foo programs
for foo filesystems? Some of them do weird things that might not
translate well into the kernel...

As you can see, I'm torn about how I feel about the idea. For simple
cases, I think it is great, but as complexity builds, I become less
sure. What if that iso image was compressed? What if I had a software
RAID of disks or flash devices? What about crypto?
I know I can handle those cases in /bin/sh, but will each new one
require more code in the kernel?

What would df and/or mount tell you about the now-hidden file systems?

Warner
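
To make the directive format quoted above concrete, here is a minimal
sketch of how a /.mount.conf with those four kinds of lines might be
parsed. It is written in Python purely for illustration (a kernel
implementation would be C), and everything beyond the directive names
is an assumption: the function name parse_mount_conf, the dict layout,
the defaults for .wait and .onfail, treating .wait as applying to the
whole file, and rejecting unknown directives are not part of the
proposal.

```python
# Hypothetical sketch: parse the /.mount.conf directives from the proposal.
# Names, defaults, and error handling are illustrative assumptions only.

ONFAIL_ACTIONS = {"panic", "reboot", "retry", "continue"}


def parse_mount_conf(text):
    """Return a dict describing the directives found in `text`."""
    conf = {
        "alternatives": [],  # root mount alternatives, in priority order
        "wait": 0,           # seconds; default is an assumption
        "onfail": "panic",   # default is an assumption
    }
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("."):
            parts = line.split()
            name, args = parts[0], parts[1:]
            if name == ".ask" and not args:
                # Ask the operator; keeps its position among alternatives.
                conf["alternatives"].append(("ask",))
            elif name == ".wait" and len(args) == 1 and args[0].isdigit():
                conf["wait"] = int(args[0])
            elif (name == ".onfail" and len(args) == 1
                  and args[0] in ONFAIL_ACTIONS):
                conf["onfail"] = args[0]
            else:
                raise ValueError("bad directive: %r" % line)
        else:
            # {FS}:{MOUNTPOINT}, split on the first colon only.
            fs, sep, target = line.partition(":")
            if not sep or not fs or not target:
                raise ValueError("bad mount alternative: %r" % line)
            conf["alternatives"].append(("mount", fs, target))
    return conf
```

Feeding it the [devfs] example from the proposal (ufs:/dev/da0, .ask,
.wait 5, .onfail panic) yields two alternatives in file order, a wait
of 5 and an onfail action of "panic". Even this toy version needs a
new branch for every directive, which is the growth concern raised
above.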