Date: Sun, 10 Feb 2019 15:50:50 -0500
From: Warner Losh <imp@bsdimp.com>
To: Cy Schubert <Cy.Schubert@cschubert.com>
Cc: Garrett Cooper <yaneurabeya@gmail.com>, "Conrad E. Meyer" <cem@freebsd.org>, "Rodney W. Grimes" <freebsd-rwg@pdx.rh.cn85.dnsmgr.net>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: nosh init system
Message-ID: <CANCZdfor-Xai_HgBSLv%2B0-ZToz18xYOz%2BV_Dy_A27EerRpU_xQ@mail.gmail.com>
In-Reply-To: <201902101631.x1AGV1sa026790@slippy.cwsent.com>
References: <yaneurabeya@gmail.com> <43C091FC-18ED-49DF-A488-784DC2329D52@gmail.com> <201902101631.x1AGV1sa026790@slippy.cwsent.com>

On Sun, Feb 10, 2019, 11:34 AM Cy Schubert <Cy.Schubert@cschubert.com> wrote:

> In message <43C091FC-18ED-49DF-A488-784DC2329D52@gmail.com>, Enji Cooper writes:
> > On Feb 9, 2019, at 20:20, Rodney W. Grimes <freebsd-rwg@pdx.rh.cn85.dnsmgr.net> wrote:
> > >
> > >> In message <CAG6CVpWXOA6r_aJcefxQBu2QZxprf1ZpDoTb4eb2JSwWsE2m+g@mail.gmail.com>, Conrad Meyer writes:
> > >>> Hi Cy,
> > >>>
> > >>>> On Sat, Feb 9, 2019 at 3:35 PM Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > >>>> I don't see what's so "incredibly fragile" about rc(8). That's not to
> > >>>> say there aren't better solutions, like SMF.
> > >>>
> > >>> Maybe "incredibly" as a choice of adjective is inappropriate. I think
> > >>> we (you, me, and ngie@) can all agree it is somewhat fragile, and
> > >>> there are things SMF/systemd/nosh get right that rc(8) does not
> > >>> (today). Anyway, your next paragraph goes on to be a good start at
> > >>> describing some of rc's fragility. :-)
> > >>>
> > >>>> Where rc(8) falls down is any port or a customer's (user of FreeBSD) rc
> > >>>> script could fail hosing the boot or worse hosing the system*. Where a
> > >>>> solution like SMF solves the problem is that should a service which
> > >>>> other services depend on fail, only that branch of the startup tree
> > >>>> would fail.
> > >>>
> > >>> Right; that's a great example.
> > >>>
> > >>>> In that scenario, if a service fails but sshd starts, a
> > >>>> sysadmin would still be able to login remotely to resolve the problem.
> > >>>> So in this regard rc(8) is at a disadvantage.
> > >>>>
> > >>>> We could address the above paragraph by starting sshd earlier during
> > >>>> boot thereby allowing the opportunity to fix remotely.
> > >>>
> > >>> I don't think that is really sufficient without substantially
> > >>> modifying init+rc to be closer to something like systemd or SMF,
> > >>> anyway. And then we'd rather just have something like SMF :-).
> > >>
> > >> I'd rather see SMF but a number felt a CDDL licensed init was
> > >> unacceptable -- except for the fact that SMF doesn't replace init.
> > >>
> > >>>
> > >>> As soon as *any* rc service fails to start (signal, non-zero exit,
> > >>> stop_boot), rc(8) exits non-zero, causing init(8) to go to single
> > >>> user. All service state is thrown away with rc(8) exit, but any rc.d
> > >>> "services" that managed to start before boot failed are not
> > >>> terminated. Even if an admin manages to log in and fix the
> > >>> configuration, re-starting rc(8) restarts the runcom process from
> > >>> scratch, as if nothing had already been done, without first stopping
> > >>> anything that was already running. The only safe, reproducible way to
> > >>> re-start rc(8) is to fully reboot the system.
> > >
> > > It -should- be safe to restart rc, as rc scripts should check to
> > > see if the item they are being requested to start is already running;
> > > rc scripts that fail to have this check are defective and should be
> > > fixed. You should be able to invoke /etc/rc.d/foo start as many
> > > times as you want in a row and only get 1 instance of foo, with the
> > > other starts returning "foo already running". Same with stop.
> >
> > I'm not sure if Conrad is referring to the Isilon way of restarting
> > services. If so, the Isilon parallel start process would effectively wipe
> > the slate clean and restart everything if interrupted, which (because of
> > the nature of cleanvar, etc.) would wipe out any and all pidfiles,
> > resulting in a weird set of services which fail to start on the next run
> > through.
> >
> > In short, I think the fact that Isilon didn't mount tmpfs to /var/run was
> > begging for pain, as it's a directory one should only set up once at boot.
> >
> Regardless of whether they use tmpfs or not, services should be
> constructed in a manner such that it should still work if the customer
> chooses not to use tmpfs.

Correct. If we require this, that's a bug.

> This also goes for those who mount /usr separately like I do (which has
> saved my bacon as recently as a couple of weeks ago). A change made to
> one of the RC scripts assumed /usr was on rootfs. (When I raised the
> issue the reply was "you should /usr on / anyway.") My point is that we
> assume our way of setting up a server is the only way and we bulldoze.
> In reality FreeBSD and prior to that commercial UNIX were set up
> variously. It's only since Linux became so popular that it has been
> assumed that one size fits all.
>
> These are two examples of why this approach doesn't work. POLA is
> painful.

This would also be a bug. I'd just fix the bug. I know people don't want to
think of these things, but we still support separate filesystems. Saying not
to run that way is lame and unhelpful.

> > That being said, there are other pseudo services that aren't necessarily
> > idempotent. If they run twice, the second run could result in breakage to
> > other dependent services run after them.
>
> Cleanvar being the focus of much of our discussion should be able to
> determine it has run before.
>
> I'm purposely not discussing implementation details.

Yeah. That's also a sloppy bug. In this case, there is no concept of
restarting... we want to run it only once... maybe that is the real bug
here: we don't adequately have a way to express that notion.

Of course the bigger issue is that this is the sort of thing you want to be
100% sure is done before anything that depends on it runs. When you have a
complicated topology like our start graph, that makes doing stuff in
parallel hard.
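
To make the two notions in this thread concrete (a step that must run at most once per boot, and a failure that should take down only its own branch of the start graph), here is a minimal Python sketch. It is an illustration under stated assumptions, not how rc(8), rcorder(8), or SMF is implemented: the function run_boot, the done set, and the service names and dependencies in the demo are all hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_boot(deps, actions, run_once, done):
    """Start services in dependency order (hypothetical helper).

    deps:     {service: set of services it requires}
    actions:  {service: callable returning True on success}
    run_once: names that may execute at most once per boot
    done:     set of run-once names already completed; updated in place
    Returns (started, skipped) as lists of names.
    """
    started, skipped, blocked = [], [], set()
    for name in TopologicalSorter(deps).static_order():
        if blocked & deps.get(name, set()):
            # A requirement failed or was skipped: prune only this branch.
            blocked.add(name)
            skipped.append(name)
        elif name in run_once and name in done:
            # Already satisfied on an earlier pass; never repeat it.
            continue
        elif actions[name]():
            started.append(name)
            if name in run_once:
                done.add(name)
        else:
            blocked.add(name)
            skipped.append(name)
    return started, skipped


# Demo with made-up services; "database" fails on the first pass.
calls = []


def ok(name):
    def action():
        calls.append(name)
        return True
    return action


def broken():
    return False


deps = {
    "cleanvar": set(),
    "syslogd": {"cleanvar"},
    "sshd": {"cleanvar"},
    "database": {"cleanvar"},
    "webapp": {"database"},
}
actions = {name: ok(name) for name in deps}
actions["database"] = broken

done = set()
print(run_boot(deps, actions, {"cleanvar"}, done))  # sshd still starts

actions["database"] = ok("database")                # admin fixes the config
print(run_boot(deps, actions, {"cleanvar"}, done))  # cleanvar is not re-run
```

In this sketch the second pass simply calls every ordinary service's start action again; a real rc.d script would be expected to notice that its service is already running, as described earlier in the thread. The sketch also runs everything serially, so it sidesteps the parallel-start problem rather than solving it.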

Warner

--
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com>
> FreeBSD UNIX: <cy@FreeBSD.org> Web: http://www.FreeBSD.org
>
> The need of the many outweighs the greed of the few.
>
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"