From: "Teske, Devin"
To: "Chad J. Milios"
Cc: FreeBSD Hackers, Devin Teske
Reply-To: Devin Teske
Subject: Re: Announcing: nuOS 0.0.9.1b1 - a whole NEW FreeBSD distro, NOT a fork
Date: Mon, 8 Jul 2013 22:12:39 +0000
Message-ID: <13CA24D6AB415D428143D44749F57D7201FB7302@ltcfiswmsgmb21>
References: <51D9E499.103@nuos.org> <13CA24D6AB415D428143D44749F57D7201FB6177@ltcfiswmsgmb21> <51DB085A.9040701@nuos.org>
In-Reply-To: <51DB085A.9040701@nuos.org>
List-Id: Technical Discussions relating to FreeBSD

On Jul 8, 2013, at 11:43 AM, Chad J. Milios wrote:

> On 07/08/13 00:12, Teske, Devin wrote:
>> On Jul 7, 2013, at 2:58 PM, Chad J. Milios wrote:
>> [snip]
>>
>>> /etc is now a ZFS dataset of its own
>>> How did we do it?
>>> Decades of conventional wisdom says /etc must be on /.
>>> Check it out, discuss the whys and the trade-offs.
>> Well, I see in nu_install on GitHub how you're doing it:
>>
>> You added:
>>
>>     init_script="/boot/init.sh"
>>
>> to /boot/loader.conf, which -- among other things -- does these two interesting things (variable names changed to make things more clear):
>>
>>     zfs rollback -r $zfs/swap/host@blank
>> NOTE: $zfs is equal to $( /bin/kenv vfs.root.mountfrom ) minus the leading "zfs:"
>>
>> and
>>
>>     zfs mount $zpool/etc
>> NOTE: $zpool is equal to $zfs from above, leading up to (but not including) the first slash (/).
>>
>> Cute. Have to say I wasn't aware of the init_script feature of loader.conf(5). Not bad.
>
> We also had to put one file into the etc directory on the / "beneath" the /etc mount so that /sbin/init can read it before /etc is mounted. There were two or three ways we could do that and each has a tradeoff.
>

I've been bitten by that.

Getting access to that file that's "beneath" once you've booted the system can be ... less than easy.

I'm interested in your cost/benefit points of having /etc a separate filesystem.

On the face of it, I want to say that "/etc" is (or at least contains) the "core identity" of the machine (and to a lesser extent -- because this is BSD after all -- /usr/local/etc). In my mind, /etc and /usr/local/etc *are* the machine (metaphorically speaking), so the merits of having it as a separate filesystem are weighed against your desired topology.
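
The two NOTE lines quoted above amount to plain sh parameter expansion. A minimal sketch, assuming the loader variable looks like "zfs:pool/path/to/root"; the function names and the sample value are hypothetical and not taken from nu_install:

```shell
# Strip the leading "zfs:" from a vfs.root.mountfrom value.
# On a running FreeBSD system the argument would come from:
#     /bin/kenv vfs.root.mountfrom
root_dataset() {
	printf '%s\n' "${1#zfs:}"
}

# Keep everything up to (but not including) the first slash.
# A dataset name with no slash is returned unchanged.
pool_of() {
	printf '%s\n' "${1%%/*}"
}
```

With a made-up value, `zfs=$(root_dataset zfs:tank/os/root)` followed by `zpool=$(pool_of "$zfs")` would leave "tank/os/root" in $zfs and "tank" in $zpool.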

If you want a bunch of machines to look and/or act the same, then a shared /etc is precisely what you want. However, without allowing minor changes (a la ZFS clone/snapshot or by way of UnionFS), you'll quickly find that the only way to cope is with role-based scripting in /etc/rc.conf (it is after all a shell script) or complicated abstraction layers (for example, using netgraph eiface devices with the jail-name inside them so that rc.conf can have jail-specific ifconfig_* lines). But I digress.

I think the better solution to your loading of files "beneath" the eventual /etc filesystem is to throw away the ZFS snapshot/clone method and instead move to a UnionFS approach for /etc.

If you use UnionFS for your /etc, then what you do is, for each of the machines on which you want *that* /etc to appear, something like:

    (as root) mount_unionfs -o below /etc /other/etc

Now /other/etc (assuming it was empty before) looks exactly like /etc.

Pros: With "rm -f <file>; rm -W <file>" (in /other/etc) you can reclaim a file from the underlying /etc. ZFS does not allow you to revert a single file (you can revert the entire volume or filesystem, but not a single file).

Cons: The advantage of having /etc as a ZFS filesystem is probably going to be the compressratio. Using something like lzjb compression on your /etc directory is beneficial (not as beneficial as, say, /var/log, but since it holds mostly text files, /etc should compress nicely). But... if you *really* need to compress your /etc (that is to say, you're hard-up enough for space that you need the little savings that you'll gain from compressing /etc), then you're also hard-up enough that you should just set compression on the entire filesystem (nullifying your need to make /etc a separate filesystem -- /etc would get the compression feature from the underlying root filesystem; whatever that is -- zfs filesystem, zpool, zvol, etc.).
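
To make the "role-based scripting in /etc/rc.conf" point above concrete, here is a rough sketch of what that tends to look like. The role names, the set_role_vars helper, and the choice of nginx_enable and postgresql_enable are illustrative assumptions; how a machine learns its own role (a kenv variable, a per-host file, and so on) is left open:

```shell
# Hypothetical helper that could live in a shared /etc/rc.conf.
# Everything common goes first; per-role differences go in the case.
set_role_vars() {
	sshd_enable="YES"
	case "$1" in
	web)
		nginx_enable="YES"
		;;
	db)
		postgresql_enable="YES"
		;;
	esac
}
```

A shared rc.conf would then call something like `set_role_vars "$role"` once $role is known. It works, but every per-machine difference ends up as another branch in a file that all machines share, which is the coping cost described above.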

So again, UnionFS looks like a win unless you *really* want to set separate filesystem features for /etc that you don't set elsewhere.

Were you perhaps after a zfs-/etc for some other reason? Because there are other reasons that I'm not getting into. For example, using sysutils/zxfer to make backups of the /etc directory of an entire cloud of machines to a single host. If you don't have /etc as a separate filesystem (and all you want is /etc) then a ZFS stream is of course out of the question and you'll have to resort to rsync. I personally think zxfer is more efficient than rsync but I haven't done the calculations yet to prove it (but it feels like it -- incremental snapshot transfers are pretty darned quick).

> What we did (mv /etc/login.conf.db /boot/etc; ln -s ../boot/etc/login.conf.db /etc/login.conf.db) has the undesirable effect that one must remember (or be reminded, or have it automated) to run cap_mkdb anytime /etc is rolled to a different snapshot or a backup is restored to it (if that changes login.conf).
>

*nods*

> With our customers at ccsys.com we have a proprietary management thing in userland (and you could lose out on that important event hook if you used anything other than our interface for zfs rollbacks and restoring backups, which we forbid).

Interesting.

> Since our goals at nuos.org are different, I'd like to implement that trigger somewhere better, ideally as-needed and as immediate as possible.
>
> If anyone with more intimate knowledge of how and exactly when login.conf.db gets accessed has any thoughts... It could be a disaster for an admin to think their /etc is in a certain state and have that one file be out of sync.
> If better minds could chip in, I'm wondering if we're better off editing /sbin/init to run init_script _before_ loading the daemon class from login.conf.db (or explain why that's a bad idea) or if I should just add some sort of hook to run cap_mkdb right when needed using a DTrace script or auditd?
>

That's an interesting aspect of the boot process I hadn't noticed before (having not used init_script before). I would think that this should be filed as a PR. Seems to me that the init_script should fire first -- but (and this is a guess) it may need to bootstrap the user that the init_script runs as (presumably needing to load the daemon class for said user). While there may be good reason, it certainly violates a principle (one might be astonished to learn that init_script does not run with only its own dependencies required).

> Does anyone think this issue is moot? (Can't we just document this particular specific "gotcha" instance? I don't think so, I abhor any "gotcha" that deviates from behavior people expect from "upstream" fbsd.) Does anyone agree it's important we come as close to perfect a solution as we can?

Thanks for bringing up the issue with init_script. We should look to fix it so that it can handle the use-case you identified (using it to bootstrap a separate /etc).

> Is a separate /etc even worth it to people?

Depends. Everybody? Certainly not. Some? Sure. See above example-cases.

> Should I scrap that feature because of this issue?

It sounds like you contorted yourself working around a deficiency in it (a POLA violation in that it has unforeseen dependencies). At the very least, I would think that init could have a fall-back if the file can't be loaded.

Are you putting anything besides the default daemon-class definition in your login.conf "beneath" your true /etc?
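
Short of changing init, one low-tech way to catch the out-of-sync case is to record a checksum of login.conf whenever the database is rebuilt and compare it later. This is only a sketch under assumptions: the stamp file and the function names are hypothetical, something (an rc script, cron, a rollback wrapper) still has to call it, and a file modification time would not do here because a rollback can restore an *older* login.conf:

```shell
# Checksum of a file's contents; cksum(1) is POSIX and prints "sum size".
conf_sum() {
	cksum < "$1"
}

# True (0) when the stamp is missing or no longer matches the config.
db_is_stale() {
	conf=$1 stamp=$2
	[ -f "$stamp" ] || return 0
	[ "$(conf_sum "$conf")" != "$(cat "$stamp")" ]
}

# Run the given rebuild command on the config when it is stale,
# then record the new checksum. Usage: sync_db conf stamp command...
sync_db() {
	conf=$1 stamp=$2
	shift 2
	if db_is_stale "$conf" "$stamp"; then
		"$@" "$conf" && conf_sum "$conf" > "$stamp"
	fi
}
```

On a real system the call might look like `sync_db /etc/login.conf /boot/etc/login.conf.sum cap_mkdb`, where the stamp path is made up for illustration. This narrows the window in which the two files disagree; it does not address the ordering of init_script inside /sbin/init.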

> I think we can tighten this up so there are no twisted ankles and no one falling in this rare-case but certainly potential manhole. (The manhole I'm talking about is login.conf and login.conf.db being out of sync because the latter is a symlink to /boot/etc and someone might roll back to a more restrictive login.conf and think they're covered without running cap_mkdb again, but their login.conf.db is actually out of sync and less restrictive in a way that burns them.)
>

Sorry you had to work around that -- you should have filed a PR.

> Devin, thank you IMMENSELY for bsdinstall and especially bsdconfig. I use them both at work and they make life so much better. And thank you for the simplification using kenv. I was unaware of it.

On a side-note, I didn't write bsdinstall -- I'm going to maintain it, but I wrote bsdconfig ^_^ (smiles)

Thank you very much for your appreciation. Certainly a labor of love and I'm happy that others have kicked the wheels at least.

-- 
Devin