Date: Thu, 24 Feb 2005 12:33:51 +0200
From: Maxim Sobolev <sobomax@portaone.com>
To: Garance A Drosihn <drosih@rpi.edu>
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Bug in #! processing - One More Time
Message-ID: <421DAD8F.6000704@portaone.com>
In-Reply-To: <p06210225be4307a39100@[128.113.24.47]>
References: <200410020349.i923nG8v021675@northstar.hetzel.org>
    <20041002052856.GE17792@nexus.dglawrence.com>
    <p0611041fbd848f6aa55d@[128.113.24.47]>
    <20041002233542.GL714@nexus.dglawrence.com>
    <p0620076ebe2490ccdc00@[128.113.24.47]>
    <p06210225be4307a39100@[128.113.24.47]>

Garance A Drosihn wrote:
> Sometimes it's the simplest little changes which can suck the
> life out of you...  I am aware that this is a trivial issue,
> but now that I've figured out what is really going on, I am
> not sure what the "best" fix would be.
>
> To recap some history:
>
> a) In Jan 2000, someone sent in a PR that perl documentation
>    (including the famous "Camel" book from O'Reilly) claims
>    that users can start a script with the line:
>
>        #!/bin/sh -- # -*- perl -*- -p
>
>    to avoid a variety of issues when writing cross-platform
>    scripts.  Ignore the question of "but why?" for the moment,
>    it *is* documented by perl (and in books on some other
>    scripting languages).  He proposed a fix, and that was
>    committed to src/sys/kern/imgact_shell.c as revision 1.21
>    on Feb 15 2000 (predating 4.0-release).  It was MFC'ed
>    into release 3.5 on March 20, 2000.
>
>    The PR is:
>        http://www.FreeBSD.org/cgi/query-pr.cgi?pr=16393
>
>    NOTE: People *do* use this "feature".
>    Counter: This feature doesn't actually work on recent
>             releases of Redhat Linux.  I don't know about
>             other linuxes.
>
> b) In 2002, some other user updated that PR saying that the
>    new behavior wasn't quite right either.  I assume nothing
>    much was done at the time, but he spent time to collect
>    a lot of details (which will be given below).
>
> c) In 2004, after 5.3-release, the issue came up again.  I assume
>    that is in another PR, but I haven't checked.  In any case,
>    kern/imgact_shell.c was changed to remove that special
>    processing for '#', after discussion in -current.  The change
>    was committed to HEAD (6.x) on October 31st as revision 1.27.
>    It was MFC'ed to 5.3-stable on November 8th.
>
>    This broke scripts which depended on the special-handling of
>    '#', but the conclusion in -current was that /bin/sh should
>    handle such processing (if it wanted to), and not execve().
>
> d) In January I was finally bitten by this running 6.x-current,
>    and a friend of mine happened to get hit by it at the same
>    time running 5.3-stable.  So I wrote up a quick fix and did
>    some minimal testing.  I posted that to -current on Jan 31st,
>    but I didn't want to commit it until I did more testing,
>    which I wanted to do *after* I brought my systems up-to-date.
>
> e) On January 29th, sobomax committed an "unrelated" fix to
>    kern/imgact_shell.c, except that it just happened to bring back
>    the special '#' processing which had been removed in October...
>
> f) I updated my systems, did extensive testing of my patch, and
>    committed it once I was confident it worked in all situations.
>    However, I didn't notice that the shell was no longer even
>    *seeing* the parameters after '#' (I had tested that part
>    back in #d), so it turns out the key loop that I had added
>    was never actually getting triggered.
>
>    I committed it to 6.x-current last week.
>
> g) On Monday I get ready to MFC the change to 5.3 (ahead of the
>    rush to beat the code-freeze!).  But... the damn thing does
>    NOT work right in some common situations!!  WTF?!?
>
> So, I figure out all the above history, and I locally modify
> kern/imgact_shell.c to again remove the special '#'-processing.
> I go to fix my patch to /bin/sh, and I realize...
>
> There is no simple, "make everyone happy" fix for it.  Sigh.
>
> The problem is in the way the execve() system call passes all
> arguments to the shell.
> Given a shell script named /tmp/list_args.pl,
> which starts out as:
>     #!/bin/sh -x -- # -*- perl -*- -p
>
> and is executed via:
>     /tmp/list_args.pl aaa bbb
>
> What /bin/sh sees for arguments are:
>     arg[0] == '-x'
>     arg[1] == '--'
>     arg[2] == '#'
>     arg[3] == '-*-'
>     arg[4] == 'perl'
>     arg[5] == '-*-'
>     arg[6] == '-p'
>     arg[7] == '/tmp/list_args.pl'
>     arg[8] == 'aaa'
>     arg[9] == 'bbb'
>
> The problem is that /bin/sh has no way of knowing where the
> "shebang-line options" end, and the "command-line options" start.
> (or does it?  I couldn't think of any reliable way, given that
> the '#' could be followed by any totally arbitrary strings).
>
> Going back to the follow-up to PR 16393, part of the challenge
> with fixing this is that many other OS's do *not* break up the
> options on the shebang line the way FreeBSD does.
> From the PR:
>
>    Given a file called '/tmp/x2' with shebang line:
>        #!/tmp/interp -a -b -c #dee eee
>
>    If /tmp/x2 is exec'd, the operating system runs /tmp/interp
>    with the following arguments:
>
>    Solaris 8:
>        args: "/tmp/interp" "-a" "/tmp/x2"
>
>    Tru64 4.0:
>        args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
>    FreeBSD 2.2.7:
>        args: "/tmp/interp" "-a" "-b" "-c" "#dee" "eee" "/tmp/x2"
>
>    FreeBSD 4.0:
>        args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"
>
>    Linux 2.4.12:
>        args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
>
>    Linux 2.2.19:
>        args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
>    Irix 6.5:
>        args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"
>
>    HPUX 11.00:
>        args: "/tmp/x2" "-a -b -c #dee eee" "/tmp/x2"
>
>    AIX 4.3:
>        args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
>    Mac OS X:
>        args: "interp" "-a -b -c #dee eee" "/tmp/x2"
>
>    The most common behavior is:
>        argv[0]: full path of interpreter
>        argv[1]: all remaining args, coalesced into one string
>        argv[2]: The file exec'd.
>
> The change committed back in 2000 made the comment: "This complies
> to POSIX 1003.2, in that Posix says the implementation is free to
> choose whatever it likes.".
> I actually like the idea that FreeBSD
> splits up the arguments from the shebang-line, but that leaves us
> with the problem of figuring out shebang-options from user-specified
> options given on the command-line.
>
> As I see it, we have the following choices to fix this:
>
> 1) MFC the January 31st change to kern/imgact_shell.c to 5.3-stable,
>    as it is.  This means we haven't fixed the problem that people
>    complained about in 2002 and again in 2004.  And I still think
>    it is "not appropriate" for the execve() system call to be deciding
>    what '#' means on that line.  The biggest advantage is that this
>    means 5.4-release will behave exactly the same as 3.5 through
>    5.3-release have behaved.
>
> 2) Remove '#'-processing from kern/imgact_shell.c, and remove my
>    change to bin/sh/options.c (which doesn't work right once we
>    do that).  This breaks shell-scripts which use the feature as
>    documented by perl (and other scripting languages), and fixes
>    the problem people complained about in 2002/2004.
>
> 3) Change kern/imgact_shell.c to process shebang options the same
>    way other (non-BSD?) operating systems do.  By that I mean:
>    send the entire string as arg[1], and let the scripting
>    language sort it out.  This is an incompatible change from
>    FreeBSD 5.3 to 5.4, but would make us "more consistent"
>    with other operating systems.
>
> 4) Provide some way for /bin/sh to find out where the shebang
>    options end, and the user-specified options begin.  This could
>    make everyone happy, but it's more work and right now (this
>    close to 5.4-release) that wouldn't make me particularly happy...
>
> Or we could do #1 for now, and plan to do #4 after 5.4-release.
> Or do #1 now in 5.3, and go with some incompatible change (#2
> or #3) only in 6.x-current.
>
> What do people think?
> I know this is a mind-numbingly trivial
> issue to care about, but I figured that if I just went ahead
> with any particular solution, someone would be irritated with me
> and assume I must not have understood "the issues".  They will
> then commit yet *another* change which undoes whatever I did,
> while they fix something they feel that I broke.
>
> And if nothing else, this is proof that one can't just blindly
> MFC some change, no matter how trivial it seems.

I would vote for doing #3, together with the respective /bin/sh
changes, and MFCing them into 5.4. We don't have that many shell
scripts that rely on the previous functionality: the ones in the base
system (if any) can be easily fixed, while the ones in /usr/ports can
be conditionalized on OSVERSION. Removing yet another superfluous
difference between FreeBSD and other systems out there is a good thing,
especially considering that the BSD way creates serious problems that
can't be resolved without changing semantics anyway.

-Maxim
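
For readers following the thread, the argument-passing policies from the
PR table can be modeled in a few lines. This is only a rough sketch in
Python, not the kernel code: the function name build_argv and the mode
names are hypothetical, argv[0] is always shown as the full interpreter
path, and the FreeBSD 4.0 rule is approximated as "cut the line at the
first '#'", which may differ from imgact_shell.c in edge cases.

```python
def build_argv(shebang_line, script_path, user_args, mode):
    """Model the argv an interpreter might receive for a '#!' script.

    mode (hypothetical names, chosen for this sketch):
      "split"      - split options on whitespace (FreeBSD 2.2.7 row)
      "split_hash" - as "split", but drop everything from the first '#'
                     (approximation of the FreeBSD 4.0 row)
      "coalesce"   - pass all options as one string (Linux 2.4.12 row,
                     and option #3 in the message)
    """
    if not shebang_line.startswith("#!"):
        raise ValueError("not a shebang line")
    parts = shebang_line[2:].strip().split(None, 1)
    interp = parts[0]
    opt_string = parts[1] if len(parts) > 1 else ""

    if mode == "split":
        opts = opt_string.split()
    elif mode == "split_hash":
        opts = opt_string.split("#", 1)[0].split()
    elif mode == "coalesce":
        opts = [opt_string] if opt_string else []
    else:
        raise ValueError("unknown mode: " + mode)

    return [interp] + opts + [script_path] + list(user_args)


if __name__ == "__main__":
    line = "#!/bin/sh -x -- # -*- perl -*- -p"
    for mode in ("split", "split_hash", "coalesce"):
        print(mode, build_argv(line, "/tmp/list_args.pl", ["aaa", "bbb"], mode))

    # With "split", two different invocations yield the same argv, so the
    # interpreter cannot tell where the shebang options end. The second
    # case uses a contrived script literally named "-*-".
    a = build_argv(line, "/tmp/list_args.pl", ["aaa", "bbb"], "split")
    b = build_argv("#!/bin/sh -x -- #", "-*-",
                   ["perl", "-*-", "-p", "/tmp/list_args.pl", "aaa", "bbb"],
                   "split")
    print("ambiguous:", a == b)
```

In the "coalesce" model the script path sits at a fixed position (argv[2]
when the shebang line carries options, argv[1] when it carries none), so
the boundary is recoverable, at the cost of the interpreter having to
split the option string itself.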