Date: Sat, 9 Feb 2019 20:20:28 -0800 (PST)
From: "Rodney W. Grimes" <freebsd-rwg@pdx.rh.CN85.dnsmgr.net>
To: Cy Schubert <Cy.Schubert@cschubert.com>
Cc: cem@freebsd.org, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: nosh init system
Message-ID: <201902100420.x1A4KSxA064573@pdx.rh.CN85.dnsmgr.net>
In-Reply-To: <201902100136.x1A1aXXv039736@slippy.cwsent.com>
> In message <CAG6CVpWXOA6r_aJcefxQBu2QZxprf1ZpDoTb4eb2JSwWsE2m+g@mail.gmail.com>,
> Conrad Meyer writes:
> > Hi Cy,
> >
> > On Sat, Feb 9, 2019 at 3:35 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > > I don't see what's so "incredibly fragile" about rc(8). That's not to
> > > say there aren't better solutions, like SMF.
> >
> > Maybe "incredibly" as a choice of adjective is inappropriate. I think
> > we (you, me, and ngie@) can all agree it is somewhat fragile, and
> > there are things SMF/systemd/nosh get right that rc(8) does not
> > (today). Anyway, your next paragraph goes on to be a good start at
> > describing some of rc's fragility. :-)
> >
> > > Where rc(8) falls down is any port or a customer's (user of FreeBSD) rc
> > > script could fail hosing the boot or worse hosing the system*. Where a
> > > solution like SMF solves the problem is that should a service which
> > > other services depend on fail, only that branch of the startup tree
> > > would fail.
> >
> > Right; that's a great example.
> >
> > > In that scenario, if a service fails but sshd start, a
> > > sysadmin would still be able to login remotely to resolve the problem.
> > > So in this regard rc(8) is at a disadvantage.
> > >
> > > We could address the above paragraph by starting sshd earlier during
> > > boot thereby allowing the opportunity to fix remotely.
> >
> > I don't think that is really sufficient without substantially
> > modifying init+rc to be closer to something like systemd or SMF,
> > anyway. And then we'd rather just have something like SMF :-).
>
> I'd rather see SMF but a number felt a CDDL licensed init was
> unacceptable -- except for the fact that SMF doesn't replace init.
>
> >
> > As soon as *any* rc service fails to start (signal, non-zero exit,
> > stop_boot), rc(8) exits non-zero, causing init(8) to go to single
> > user. All service state is thrown away with rc(8) exit, but any rc.d
> > "services" that managed to start before boot failed are not
> > terminated. Even if an admin manages to log in and fix the
> > configuration, re-starting rc(8) restarts the runcom process from
> > scratch, as if nothing had already been done, without first stopping
> > anything that was already running. The only safe, reproducible way to
> > re-start rc(8) is to fully reboot the system.

It -should- be safe to restart rc, as rc scripts should check whether
the item they are being asked to start is already running. rc scripts
that lack this check are defective and should be fixed.

You should be able to invoke /etc/rc.d/foo start as many times
as you want in a row and only get 1 instance of foo, with the
other starts returning "foo already running".

Same with stop.

>
> It wasn't that way 10-15 years ago. It's evolved to become this. Even
> if we stay with rc(8), quickly cobbling together a patch isn't going to
> fix it long term. Whether we use another init, an add-on like SMF, or
> make rc(8) more robust, it will not be fixed by a simple tweak here or
> there.

Much gets broken in the name of new features, sadly.

>
> >
> > E.g., the major pain point we run into repeatedly with restarted boot
> > is that cleanvar / cleartmp run again. This breaks ld-elf.so.hints
> > cache (anything linking /usr/local libraries -- hope your admin is
> > running base sshd and not openssh-portable!) as well as wiping out
> > /var/run pid files (breaking "already running?" rc pid checks). As a
> > result, services get double-started.
> >
> > Cleanvar could maybe be improved to avoid this problem -- e.g., we
> > could coordinate with the kernel to set a per-boot, persisted flag
> > that cleanvar has completed, even if rc(8) exits -- but the broad class
> > of problems would remain (rc.d autostart is stateful, but any partial
> > failure destroys all state).
>
> This needs more than improving cleanvar or some other script. It's like
> whack-a-mole. (The rest of this not specifically talking to you
> Conrad.) This is why every one to two months this topic comes up again
> and again and again. It's a pain point. (And also the shiny new object
> syndrome.) Various people suggest their favourite init(8) replacement
> and the bikeshed starts up again.

Shiny new things also come with shiny new problems. I would often
rather repair a broken old something than get a new shiny something,
as I know the defects of the ratty old something.

> To avoid bikeshedding this to death again, we enumerated two issues so
> far. Let's continue to list issues. I also think that this should be a
> BSDCan devsummit whiteboard topic where we list issues in one column
> and next to it we list possible solutions, after listing all the issues
> first. And finally if this is too large for one person to work on,
> assign the various issues to willing developers.

We do not need to wait for BSDCan; there are more of us here on this
list than at any dev summit.

>
> One final thought. init(8) and rc(8) requirements for desktop/laptop,
> server, embedded, and mobile are probably different enough that their
> requirements may compete with each other. Some embedded applications
> may desire a simple rc(8) whereas server or desktop a heavier weight
> solution.

It is rather simple to just drop the whole rc.d and rewrite /etc/rc
for the embedded situation, going back to the 4.3 era. Though we might
want to go over to the rc mailing list?

> Cy Schubert <Cy.Schubert@cschubert.com>

--
Rod Grimes                                                 rgrimes@freebsd.org
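
The repeatable start/stop behaviour described in this message (invoke
/etc/rc.d/foo start several times, get one instance of foo) could be
sketched roughly as below. This is only an illustration, not how rc.subr
implements it: the service "foo", the PIDFILE location, the helper names
foo_running/foo_start/foo_stop, and the message strings are all
hypothetical, and a background sleep stands in for the real daemon.

```shell
#!/bin/sh
# Hypothetical sketch of an idempotent start/stop pair. None of these
# names come from rc.subr; "foo" and PIDFILE are placeholders.

PIDFILE="${PIDFILE:-${TMPDIR:-/tmp}/foo_sketch.pid}"

# Succeeds only if the pid file exists and names a live process.
foo_running() {
    [ -r "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Starts foo unless it is already running; repeated calls are harmless.
foo_start() {
    if foo_running; then
        echo "foo already running (pid=$(cat "$PIDFILE"))"
        return 0
    fi
    # Stand-in for the real daemon. Output is detached so callers that
    # capture foo_start's output do not block on the background job.
    sleep 300 >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    echo "Starting foo."
}

# Stops foo if it is running; repeated calls are harmless.
foo_stop() {
    if ! foo_running; then
        echo "foo not running"
        return 0
    fi
    kill "$(cat "$PIDFILE")"
    rm -f "$PIDFILE"
    echo "Stopping foo."
}
```

Note that the check is only as good as the pid file: if something removes
it while foo is still alive, foo_running reports false and a second start
launches a duplicate, which is the double-start problem described in the
quoted text about cleanvar wiping /var/run.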