Date:      Sat, 9 Oct 2010 19:30:44 -0700
From:      Garrett Cooper <gcooper@FreeBSD.org>
To:        Devin Teske <dteske@vicor.com>
Cc:        Brandon Gooch <jamesbrandongooch@gmail.com>, freebsd-hackers@freebsd.org
Subject:   Re: sysrc -- a sysctl(8)-like utility for managing /etc/rc.conf et. al.
Message-ID:  <AANLkTinCxd9aDbr7GifEdeOpdcW4d+4ZcQWF0TK_mC=8@mail.gmail.com>
In-Reply-To: <238E0B24-AA12-4684-9651-84DA665BE893@vicor.com>
References:  <1286397912.27308.40.camel@localhost.localdomain> <AANLkTikoohMo5ng-RM3tctTH__P6cqhQpm=FPhSE9mMg@mail.gmail.com> <51B4504F-5AA4-47C5-BF23-FA51DE5BC8C8@vicor.com> <AANLkTim=BLkd229vdEst8U0ugpq3UsHPxjZZp2qaJxH-@mail.gmail.com> <238E0B24-AA12-4684-9651-84DA665BE893@vicor.com>

Trimming out some context...

On Sat, Oct 9, 2010 at 3:39 PM, Devin Teske <dteske@vicor.com> wrote:
>

...

> Should this really be set to something other than 0 or 1 by the
> end-user's environment? This would simplify a lot of return/exit
> calls...
>
> A scenario that I envision that almost never arises, but...
> Say someone wanted to call my script but wanted to mask it to always
> return with success (why? I dunno... it's conceivable though).
> Example: (this should be considered ugly -- because it is)
> FAILURE=0 && sysrc foo && reboot

But then someone could do sysrc foo || : && reboot, or more simply
sysrc foo; reboot

Perhaps you meant env FAILURE=0 sysrc foo && reboot ?

$ cat failure.sh
#!/bin/sh
echo "FAILURE: $FAILURE"
$ FAILURE=0 && sh failure.sh
FAILURE:
$ env FAILURE=0 sh failure.sh
FAILURE: 0
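The distinction the transcript above shows can be made explicit: a plain assignment (even chained with &&) creates an unexported shell variable, while a prefix assignment or env(1) places the value in the child's environment for that one invocation. A small sketch (the script name is illustrative):

```shell
#!/bin/sh
# "VAR=x && cmd" sets an unexported shell variable; "VAR=x cmd" (like
# env VAR=x cmd) exports it into that one child's environment.
unset FAILURE
cat > failure_demo.sh <<'EOF'
#!/bin/sh
echo "FAILURE: $FAILURE"
EOF
FAILURE=0 && sh failure_demo.sh   # prints "FAILURE: "  (not exported)
FAILURE=1 sh failure_demo.sh      # prints "FAILURE: 1" (one-shot export)
rm -f failure_demo.sh
```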

> Efficacy:
> The `reboot' rvalue of '&&' will always execute because FAILURE.
> I don't really know why I got into the practice of writing scripts this
> way... most likely a foregone conclusion that seemed like a good idea at
> one time but never really amounted to anything substantive (in fact, it
> should perhaps be considered heinous).
> I agree... a productionized version in the base distribution should lack
> such oddities. The script should do:
> SUCCESS=0
> FAILURE=1
> and be done with it.
> Though, I've been sometimes known to follow the habits of C-programming
> and instead do:
> EXIT_SUCCESS=0
> EXIT_FAILURE=1
> (real macros defined by system includes; though in C-land they aren't 0/1
> but rather 0/-1 IIRC)
> I just found it redundant to say:
> exit $EXIT_SUCCESS
> and shorter/more-succinct to say:
> exit $SUCCESS

Understood :). I try to avoid sysexits just because bde@ wasn't too
happy with some C code I posted in a review.

...

> I borrow my argument-documentation style from 15+ years of perl
> programming. I think it's all just personal preference. Personally, I
> like to jam it all on one line specifically so that I can do a quick
> mark, then "?function.*name" to jump up to the definition-line, "yy"
> (for yank-yank; copies current line into buffer), then jump back to my
> mark, "p" for paste, then replace the variables with what I intend to
> pass in for the particular call.
> Using vi for years teaches interesting styles -- packing a list of
> keywords onto a single line to grab/paste elsewhere is just one of
> those little things you learn.

Understood. There really isn't any defined shell style in FreeBSD, but
it would be nice if there were...

...

> The first ": dependency checks ..." is just a note to myself. I used ":"
> syntax to make it stand-out differently than the "#" syntax. Not to menti=
on
> that when I go through my scripts (well, the ones that are intended for
> functioning within an embedded environment at least) I expect to see a ca=
ll
> to "depend()" before a) each/every function and b) each/every large
> contiguous block of code (or at least the blocks that look like they are
> good candidates for re-use in other scripts).
> The second usage (": function") aids in finding the function declaration
> among the usages. See, in Perl, I can simply search for "sub" preceding t=
he
> function name. In C, I tend to split the return type from the function na=
me
> and ensure that the function name always starts in column-1 so I can sear=
ch
> for "^funcname" to go to the declaration opposed to the usages/references=
.
> In BASH, `function' is a valid keyword and you can say "function funcname=
 (
> ) BLOCK" but unfortunately in good ol' bourne shell, "function" is not an
> understood keyword, ... but really liking this keyword, I decided to make
> use of it in bourne shell by-way-of simply making it a
> non-executed-expression (preceded it with ":" and terminated it with ";")=
.

Yeah, that's one of the nicer readability points that would be helpful
in POSIX. Unfortunately none of the other shell code in FreeBSD [that
I've seen] is written that way, so it would look kind of out of place.
But I understand your reasoning...
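The ": function" trick described above can be sketched as a runnable fragment (the function name and body are illustrative):

```shell
#!/bin/sh
# ":" is the shell's no-op builtin, so this marker line executes
# harmlessly while making the definition searchable as
# "function dependency_check", even though plain sh lacks bash's
# `function` keyword.
: function dependency_check ;
dependency_check()
{
    # a depend() call would precede real work in the style described;
    # the body here is illustrative
    command -v awk > /dev/null || return 1
}
dependency_check && echo "awk found"
```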

>
> {
>
>         local fd=$1
>
>         [ $# -gt 1 ] || return ${FAILURE-1}
>
> While working at IronPort, Doug (my tech lead) has convinced me that
> constructs like:
>
> if [ $# -le 1 ]
> then
>    return ${FAILURE-1}
> fi
>
> Never did understand why folks insisted on splitting the if/then syntax
> (or while/do or for/do etc.) into multiple lines. I've always found that
> putting the semi-colon in there made it easier to read.

Well, I [personally] prefer the semi-colon, but I can see merits with
the other format because spacing between the expressions, the
semi-colon, etc. is variable, so code gets inconsistent over time
(looking back I've noticed that even my code has become inconsistent
in that manner). Either way can easily be searched and extracted via
sed or awk.

> Are a little more consistent and easier to follow than:
>
> [ $# -gt 1 ] || return ${FAILURE-1}
>
> Because some folks have a tendency to chain shell expressions, i.e.
>
> I agree with you that any more than one is excessive.
> I've often tried to emulate the C-expression "bool ? if-true : else"
> using:
> ( bool && if-true ) || else
> but it's just not clean-looking.
> I still like the simple elegance of "expr || if-false" and "expr &&
> if-true" ... but again, only perhaps since my first love is Perl (of
> which I've programmed 15+ years), and statements like that are rampant
> in Perl, perhaps because the ol' Perl cookbooks of historical right
> advocate their usage in such a manner.

I know. Perl was my first language after C, but that was only 4 years
ago, and I only used it extensively for a year or so.
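One caveat worth noting about the "( bool && if-true ) || else" emulation quoted above: unlike C's ternary, the else branch also fires whenever the if-true branch itself fails. A minimal demonstration:

```shell
#!/bin/sh
# Unlike C's "bool ? a : b", with "bool && a || b" the b branch also
# runs whenever a itself returns non-zero.
flag=yes
[ "$flag" = yes ] && false || echo "else ran even though flag=yes"
```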

> Ah, coolness. command(1) is new to me just now ^_^

Yeah.. I was looking for something 100% portable after I ran into
issues with writing scripts for Solaris :).

...

> I originally had been programming in tests for '!' and 'in', but in
> POSIX bourne-shell, they aren't defined (though understood) in the
> keyword table (so type(1) balks in bourne-shell while csh and bash do
> respond to '!' and 'in' queries).
> Since you've pointed out command(1)... I now have a way of checking '!'.
> Though unfortunately, "command -v", like type(1), also does not like
> "in" (in bourne-shell at least).

Hmmm... interesting.

> I never understood why people don't trust the tools they are using...
> `[' is very very similar (if not identical) to test(1)

$ md5 /bin/\[ /bin/test
MD5 (/bin/[) = b4199bea7980ecac7af225af14ae555f
MD5 (/bin/test) = b4199bea7980ecac7af225af14ae555f

Looks the same to me :). On FreeBSD and Linux (and I'm sure other
OSes), if done properly test(1) and [(1) should be hardlinks to the
same file.
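A hash-free way to make the same comparison is test's -ef primary (supported by FreeBSD sh, bash, and dash, though not required by POSIX), which is true when two paths resolve to the same device and inode:

```shell
#!/bin/sh
# -ef compares device/inode pairs, so it detects hardlinks (or the
# same file) directly; which answer you get depends on how your OS
# packages test(1) and [(1).
if [ /bin/test -ef /bin/'[' ]; then
    echo "same file"
else
    echo "separate binaries"
fi
```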

> [ "..." ] is the same thing as [ -n "..." ] or test -n "..."
> [ ! "..." ] is the same things as [ -z "..." ] or test -z "..."
> I'll never understand why people have to throw an extra letter in there a=
nd
> then compare it to that letter.

I ran into issues using ! on Solaris ksh recently (not using test),
and I agree that your example below is more straightforward and
readable than the other examples I've dealt with in the past.

> If the variable expands to nothing, go ahead and let it. I've traced
> every possible expansion of variables when used in the following manner:
> [ "$VAR" ] ...
> and it never fails. If $VAR is anything but null, the entire expression
> will evaluate to true.
> Again... coming from 15+ years of perl has made my eyes read the
> following block of code:
> if [ "$the_network_is_enabled" ]; then
> aloud in my head as "if the network is enabled, then ..." (not too far
> of a stretch)... which has a sort of quintessential humanized logic to
> it, don't you think?
> Now, contrast that with this block:
> if [ "x$the_network_is_enabled" = x ]; then
> (one might verbalize that in their head as "if x plus `the network is
> enabled' is equal to x, then" ... which is more clear?)

Yet, it's more complicated than that. I use the x because some
versions of test(1) are more braindead than others and interpret the
string as an option, not as an argument. I suppose the other way to
ameliorate that though is to swap the static string and the value
which needs to be expanded. But that's also counterintuitive if you
read it out loud, and it's also against the recommendation of my
college professor (when dealing with assignment and tests... but
that's more of an artifact of beginning C than anything else).
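For a conformant test(1) the plain form is safe: POSIX defines the one-argument case as "true if the argument is non-empty", so even an operator-looking value is treated as a string. A quick check:

```shell
#!/bin/sh
# POSIX test with one argument returns true iff the argument is
# non-empty, even when the value looks like an operator such as -n.
var="-n"
[ "$var" ] && echo "non-empty"
var=""
[ "$var" ] || echo "empty"
```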

> Yet, if I don't leave out the implied "-n" or "-z", is it more
> acceptable? For instance...
> if [ -n "$the_network_is_enabled" ]; then
> But that would require the reader (performing intonation in their heads
> as they read the code) to innately _know_ that "-n" is "this is
> non-null" (where "this" is the rvalue to the keyword).

I wouldn't sweat it so much though. I just tested out the string with
dashes and it passed all of the cases mentioned above (our version of
test seems a bit less error prone than some of the others I've run
across).

...

> Wouldn't it be better to declare this outside of the loop (I'm not
> sure how optimal it is to place it inside the loop)?
>
> I'm assuming you mean the "local d" statement. There's no restriction
> that says you have to put your variable declarations at the beginning
> of a block (like in C -- even if only within a superficial block { in
> the middle of nowhere } ... like that).

Correct. My issue was just how a shell interpreter would act on the
local declaration. I need to do more digging in that area to determine
how ours works vs bash vs whatever.

> Meanwhile, in Perl, it's quite a difference to scope it to the loop
> rather than the block. So, it all depends on whichever _looks_ nicer to
> you ^_^

Sure, and perl has the my keyword too :).

> =(
> I made the switch to using [ "..." ] (implied "-n") and [ ! "..." ]
> (implied "-z") long ago because they intonate in my head so darned well
> ("!" becoming "NOT" of course).

No worries. We established above that this isn't an issue.

> Ah, another oddity of my programming style.
> I often experienced people ripping whole blocks or whole functions out
> of my scripts and re-using them in their own scripts...
> So I adopted this coding practice where... whenever I anticipated people
> doing this (usually I only anticipate people ripping whole functions), I
> wanted the blocks of code to still be semi-functional.
> So what you're seeing is that every time I rely on the global "progname"
> within a re-usable code construct (a function for example), I would use
> special parameter-expansion syntaxes that allow a fall-back default
> value that was sensible ($0 in this case).
> So outside of functions within the script, you'll see:
> $progname
> -- the global is used explicitly without fallback (because people
> ripping out a block in the main source should be smart enough to know
> to check the globals section at the top)
> meanwhile, in a function:
> ${progname:-$0}
> So that if they ripped said function into their own code and neglected
> to define progname, the fallback default would be $0, which is expanded
> by the shell always to be the first word (words being separated by any
> character of $IFS) of the invocation line.

Well, right... but if someone's taking the value out of context and
you acted on the value in a different way, then they really shouldn't
be copy-pasting your code without understanding your intent :).
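The fallback idiom described above can be sketched in isolation (the function name and messages are illustrative):

```shell
#!/bin/sh
# Inside a reusable function, ${progname:-$0} degrades gracefully to
# $0 when whoever ripped the function out never defined the global.
warn()
{
    echo "${progname:-$0}: $*" >&2
}
warn "global not yet set"   # falls back to $0
progname="sysrc"
warn "global now set"       # uses the global
```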

> Too true...
> I was being ULTRA pedantic in my embedded-environment testing. ^_^
> Taking measures to test with different shells even... sh, bash, csh,
> pdksh, zsh, etc. etc. etc. (glad to report that the script is ultra
> portable)

Fair enough :).

...

> I would probably just point someone to a shell manual, as available
> options and behavior may change, and behavior shouldn't (but
> potentially could) vary between versions of FreeBSD.
>
> I just checked "man 1 sh" on FreeBSD-8.1, and it did have copious
> documentation on special expansion syntaxes. (beautiful!)... so you're
> right, we could just point them at a sh(1) man-page.
> I somehow had it ingrained in my mind that the sh(1) man-page was
> lacking while the bash(1) texinfo pages were the only places to find
> documentation on the special expansion syntaxes. I'm glad to see they
> are fully documented in FreeBSD these days (even back to 4.11, which I
> checked just now).

Yeah. GNU likes infopages, but even those sometimes lack critical data
(and that's one of the positive points for using FreeBSD).

...

> IIRC I've run into issues doing something similar to this in the past,
> so I broke up the local declarations on 2+ lines.
>
> I find that the issue only arises when you do something funky where you
> need to know the return status after the assignment. `local' will
> always return with success, so if you need to test the error status
> after an assignment with local, you'll never get it. In those cases,
> it's best to use local just to define the variable and then assign in
> another step, from which you can get the return status of the command
> executed within.
> For example:
> local foo="$( some command )"
> if [ $? -ne 0 ]; then
> ...
> will never fire because local always returns true.
> Meanwhile,...
> local foo
> foo="$( some command )"
> if [ $? -ne 0 ]; then
> ...
> will work as expected (if "some command" returns error status, then the
> if-block will fire).

I understand, along with this case:

$ cat test_scoping
#!/bin/sh

foo() {
    for i in a b c d; do
        echo $i
    done
}

i=2
foo
echo $i
[gcooper@bayonetta
/scratch/ltp/testcases/open_posix_testsuite/conformance/interfaces/aio_return]$
sh test_scoping
a
b
c
d
d

If someone didn't understand scoping in Bourne shell they would think
that i is local to foo.

My consideration was more over:

local i=
local j=

$ sh test_local.sh

$ cat test_local.sh
#!/bin/sh
foo() {
	local i=a # <- here
	local j=b # <- and there
}
foo
echo $i $j

But if it works with all cases you have tested, then by all means please use it.
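The masking behavior described in the quoted text is easy to demonstrate (note that `local`, while supported by FreeBSD sh, bash, and dash, is not strictly POSIX):

```shell
#!/bin/sh
# `local` is itself a command that returns 0, so an assignment folded
# into the declaration hides the command substitution's exit status;
# declaring first and assigning second preserves it.
f()
{
    local a="$(false)"
    echo "combined: $?"     # local's own status: 0
    local b
    b="$(false)"
    echo "separate: $?"     # the substitution's status: 1
}
f
```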

...

> I think you'll find (quite pleasantly) that if you intonate the lines...
> "rc_conf_files [is non-null] OR return failure"
> "varname [is non-null] OR return failure"
> Sounds a lot better/cleaner than the intonation of the suggested
> replacement:
> "if x plus rc_conf_files expands to something that is not equal to x OR x
> plus the expansion of varname is not x then return failure"
> Not to mention that if the checking of additional arguments is
> required, a single new line of similar appearance is added... whereas
> if you wanted to expand the suggested replacement to handle another
> argument, you'd have to add another "-o" case to the "[ ... ]" block,
> which causes the line to be pushed further to the right, requiring
> something like one of the two following solutions:
> if [ "x$rc_conf_files" = x -o "x$varname" = x -o "x$third" = x ]
> then
> ...
> or (slightly better)
> if [ "x$rc_conf_files" = x -o \
>      "x$varname" = x -o \
>      "x$third" = x ]
> then
> ...
> But then again... you're lacking something very important in both of
> those that you don't get with the original syntax ([ "$blah" ] || return
> ...)... clean diff outputs! and clean CVS differentials... and clean
> RCS...
> Let's say that the sanity checks need to be expanded to test yet
> another variable. In the original syntax, the diff would be one line:
> + [ "$third" ] || return ${FAILURE-1}
> Otherwise, the diff is uglier (in my humble opinion):
> - if [ "x$rc_conf_files" = x -o "x$varname" = x ]
> + if [ "x$rc_conf_files" = x -o "x$varname" = x -o "x$third" = x ]
> Make sense?
> I think looking at CVS diffs where only a single line is added to check
> a new variable is much cleaner than a code-block which must be erased
> and rewritten every time the test is expanded.

Yeah... perforce does a worse job in this department when it comes to
merges and deletions :/. Got what you mean...
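The one-guard-per-line style argued for above, as a runnable sketch (the variable names mirror the quoted discussion, and FAILURE follows the script's convention):

```shell
#!/bin/sh
# Each sanity check is its own line, so adding a new one is a one-line
# diff rather than an edit to a growing "-o" chain.
FAILURE=1
sanity_check()
{
    [ "$rc_conf_files" ] || return ${FAILURE-1}
    [ "$varname" ]       || return ${FAILURE-1}
}
rc_conf_files="/etc/rc.conf /etc/rc.conf.local"
varname="hostname"
sanity_check && echo "checks passed"
```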

> $ . /etc/defaults/rc.conf
> $ echo $rc_conf_files
> /etc/rc.conf /etc/rc.conf.local
> $ grep -q foo /etc/rc.local
> grep: /etc/rc.local: No such file or directory
>
> Good catch! I missed that ^_^

Np :).

> Being pedantic, I would capitalize the P in permission to match
> EACCES's output string.
>
> But, I actually copied the error verbatim from what the shell produces
> if you actually try the command.
> So... if you remove the check (if [ ! -w $file ] ... ... ...) and try
> the script as non-root, you'll get exactly that error message (with
> lower-case 'p' on 'permission denied').
> It wouldn't make sense for my script to use upper-case 'P' unless the
> bourne-shell is patched to do the same.
> I'm simply fundamentally producing the same error message as the shell
> save for one difference... I try to detect the error before running
> into it simply so I can throw a spurious newline before the error...
> causing the output to more accurately mimic what sysctl(8) produces in
> the same exact case (the case where a non-root user with insufficient
> privileges tries to modify an MIB). Give it a shot...
> $ sysctl security.jail.set_hostname_allowed=1
> security.jail.set_hostname_allowed: 1
> sysctl: security.jail.set_hostname_allowed: Operation not permitted
> If I don't test for lack of write permissions first, and throw the
> error out with a preceding new-line, the result would be:
> $ sysrc foo=bar
> foo: barsysrc: cannot create /etc/rc.conf: permission denied
> Rather than:
> $ sysrc foo=bar
> foo: bar
> sysrc: cannot create /etc/rc.conf: permission denied

I'm not sure which version you're using, but it looks like mine uses
strerror(3):

$ touch /etc/rc.conf
touch: /etc/rc.conf: Permission denied
$ > /etc/rc.conf
cannot create /etc/rc.conf: Permission denied
$ echo $SHELL
/bin/sh
$ uname -a
FreeBSD bayonetta.local 9.0-CURRENT FreeBSD 9.0-CURRENT #9 r211309M:
Thu Aug 19 22:50:36 PDT 2010
root@bayonetta.local:/usr/obj/usr/src/sys/BAYONETTA  amd64

*shrugs*

...

> I'll investigate lockf, however I think it's one of those things that
> you just live with (for example... what happens if two people issue a
> sysctl(8) call at the exact same time ... whoever gets there last sets
> the effective value).

There's a difference though. Most of sysctl(9) is locked with mutexes
of various flavors; this method, however, is lock-free.

> You'll notice that I do all my work in memory...
> If the buffer is empty, I don't write out the buffer.
> Much in the way that an in-line sed (with -i for example) will also
> check the memory contents before writing out the changes.
> Since error-checking is performed, there's no difference between doing
> this on a temporary file (essentially the memory buffer is the
> temporary file -- save for weird scenarios where memory fails you --
> but then you have bigger problems than possibly wiping out your
> rc.conf file -- like perhaps scribbling on the disk in new and
> wonderful ways during memory corruption).
> Also, since the calculations are done in memory and the read-in is
> decidedly different than the write-out (read: not performed as a
> single command), if two scripts operated simultaneously, here's what
> would happen:
> script A reads rc.conf(5)
> script B does the same
> script A operates on in-memory buffer
> script B does the same
> script A writes out new rc.conf from modified memory buffer
> script B does the same
> whoever does the last write will have their contents preserved. The
> unlucky first-writer will have his contents overwritten.
> I do not believe the kernel will allow the two writes to intertwine
> even if firing at the exact same precise moment. I do believe that one
> will block until the other finishes (we could verify this by looking
> at perhaps the bourne-shell's '>' redirect operator to see if it
> flock's the file during the redirect, which it may, or perhaps such
> things are at lower levels).

Even then, my concern was more about the atomicity of the operation
than anything else. If person A modifies the file, then person B
modifies it simultaneously, and for whatever reason person B finishes
before person A, and person A's changes are written out to disk,
there's not much that can be done (otherwise we'd need a database, but
then that's smelling a lot like Windows registries, and those are a
bi^%& to recover, if at all possible).

I care more about the corruption case because that's a problem if the
contents written out to disk get partially written (script killed,
process interrupted, out of disk space, etc), or worse, the results
get interleaved from process A and process B :/.

There are some tricks that can be employed with test(1) (-nt, -ot),
but it's probably just easier to use lockf when writing out the file
because you're in a critical section of the script.
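A portable complement to the lockf(1) suggestion is to stage the new contents in a temporary file and rename it into place; rename(2) is atomic within a filesystem, so a killed or interrupted writer never leaves a half-written rc.conf. A sketch (the file names are illustrative):

```shell
#!/bin/sh
# Write the new contents to a staging file first, then atomically
# swap it in; readers see the old file or the new one, never a
# partial write.
conf="./rc.conf.example"
tmp="$conf.tmp.$$"
printf 'hostname="demo"\n' > "$tmp" || { rm -f "$tmp"; exit 1; }
mv "$tmp" "$conf"   # rename(2): atomic on the same filesystem
cat "$conf"
rm -f "$conf"
```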

...

> ^_^
> Well, I see getopt is an external dependency (bad) while getopts
> appears to be a builtin.

The only plus-side to getopt is that it allows for double-dashed
arguments from what I've read (at least that was the Linux version),
but I avoid it because its implementation varies.

> I'll have a looksie and see, but I find the case statement to be very
> readable as it is.

But getopts does the shifting and junk for you and that's why I
suggested it *shrug*... just like getopt vs optparse in python, but
that's a different ball of wax.
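A minimal getopts sketch of the suggestion (the option letters are illustrative, not sysrc's actual flags):

```shell
#!/bin/sh
# getopts is a builtin and does the OPTARG/OPTIND bookkeeping that a
# hand-rolled case loop has to redo itself.
verbose=0
file=""
while getopts vf: flag; do
    case "$flag" in
    v) verbose=1 ;;
    f) file="$OPTARG" ;;
    *) echo "usage: $0 [-v] [-f file] name ..." >&2; exit 1 ;;
    esac
done
shift $((OPTIND - 1))
echo "verbose=$verbose file=$file args=$*"
```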

> No more confusing than sysctl(8), which does the same thing as I did (I
> was in fact mimicking sysctl(8) in this behavior).

...

Well, the output is different depending on the context; example:

$ sysctl dev.uhid.0.%parent=blah
sysctl: oid 'dev.uhid.0.%parent' is read only
$ sysctl debug.minidump=0
debug.minidump: 1
sysctl: debug.minidump: Operation not permitted
$ sudo sysctl debug.minidump=0
debug.minidump: 1 -> 0

So the messages vary, but it looks like I missed the newline with the
eprintf call you made above in sysrc_set in my first pass, so I
wouldn't worry about this comment.

> Not a screw-up....
> Since what appears between $( ... ) (back-ticks too `...`) is read
> using readline(3), any leading whitespace is ignored.
> I'm using this technique to split the line because it was too long to
> be accommodated fully within an 80-character-wide terminal window with
> tab-width set to 8 (what nearly everybody defaults to these days).

Ok, sounds good -- just a bit harder to scan with the eye initially.

>    And now some more important questions:
>
>    1. What if I do: sysrc PS1 :) (hint: variables inherited from the
> shell really shouldn't end up in the output / be queried)?
>
> Great question... hadn't thought of that.
> I could perhaps use a set(1) flag to clear the environment variables
> prior to calling source_rc_confs. That seems to be a prudent thing to
> do (or if not via the set(1) built-in, via pruning the list of current
> variables and using the unset built-in to kill them off in a
> for/in/do loop).

Ok -- sounds good!
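One way to sketch the cleanup idea, using env -i rather than set/unset and a stand-in file for source_rc_confs (both illustrative): run the query under a scrubbed environment so variables inherited from the user's shell, such as PS1, cannot leak into the results.

```shell
#!/bin/sh
# env -i starts the child with an empty environment, so the parent's
# exported SECRET never reaches the sourced config query.
printf 'hostname="demo"\n' > rc.conf.example
SECRET="leaky"; export SECRET
env -i /bin/sh -c '. ./rc.conf.example; echo "hostname=$hostname SECRET=$SECRET"'
rm -f rc.conf.example
```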

...

> The `-n' is already covered (see usage).
> I do agree `-a' is both warranted and highly useful (provides system
> administrator a snapshot of what /etc/rc sees at boot after performing a
> source_rc_confs -- great for either trouble-shooting boot problems or
> taint-checking everything before a reboot).

Oooh -- cool (I'll have to look closer next time for `-n' :)..)!

-a is helpful, but could become a bit tricky, esp. when some rc.d
scripts live in /usr/local/etc/rc.d (can they live elsewhere? I don't
remember offhand..) and don't necessarily have the same constraints as
rc.conf does... maybe some markup or external metadata would need to
be added to the scripts to deal with configuration information.

One thing that would be nice is mapping variables to humanized
descriptions for the less understood values, but at that point it
might be wise to point someone to a manpage for the service they're
tweaking.

> Well now....
> If you really want to support ALL those possibilities... I _did_ have a
> more complex routine which caught them all (each and every one), but it
> wasn't quite as clean ^_^
> If you really want me to break out the nuclear reactor, I'll work it
> back in from one of the predecessors of this script, which was 1,000+
> lines of code.
> However, I found that the need to catch such esoteric conditions was
> far outweighed by the need to simplify the script and make a cleaner
> approach.
> Yes, the rc.conf(5) scripts (whether we're talking about /etc/rc.conf,
> /etc/rc.conf.local, or ones that are appended by the end-user) can be
> quite complex beasts...
> And we could see things like this...
> foo=bar; bar=baz; baz=123
> And the script would not be able to find the correct instance that
> needs to be replaced to get "bar" to be some new value.
> My nuclear-physics-type script could handle those instances (using sed
> to reach into the line and replace only the bar portion and retain the
> existing foo and baz declarations).
> What would you prefer though? Something that is cleaner, more readable,
> easier to digest, more efficient, and has fewer dependencies, or one
> that is more robust but may require a degree to digest?

    Fair enough :P; I would clearly advertise the limitations of the
tool so it doesn't turn into a kitchen-sink utility like pkg_install
and sysinstall have become :/.. otherwise people love to add features
to pieces of code that shouldn't really have those features.
Thanks!
-Garrett


