From owner-freebsd-bugs Thu May 14 11:03:55 1998
Date: Thu, 14 May 1998 11:00:01 -0700 (PDT)
Message-Id: <199805141800.LAA08266@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.ORG
From: woods@zeus.leitch.com (Greg A. Woods)
Subject: Re: bin/6557: /bin/sh && IFS
Reply-To: woods@zeus.leitch.com (Greg A. Woods)
Sender: owner-freebsd-bugs@FreeBSD.ORG

The following reply was made to PR bin/6557; it has been noted by GNATS.

From: woods@zeus.leitch.com (Greg A. Woods)
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: Re: bin/6557: /bin/sh && IFS
Date: Thu, 14 May 1998 14:00:38 -0400 (EDT)

[ On Wed, May 13, 1998 at 02:00:02 (-0700), Martin Cracauer wrote: ]
> Subject: Re: bin/6557: /bin/sh && IFS
>
> Hm, Solaris' ksh and sh don't agree completely (Solaris 2.6/SPARC):

Actually with your example the original Bourne Shell is the odd man
out.  Ksh-88i, Ksh93, ash (both NetBSD & FreeBSD), and pdksh-5.2.13 all
behave similarly with your example (which is I think the one that gets
right down to the meat of the problem).

I've finally found the rationale in POSIX 1003.2 Draft 11.2 that talks
about this, and it does seem to make a certain amount of sense, though
it introduces strange magic that can lead to very unexpected results:

    Copyright (c) 1991 IEEE.  All rights reserved.
    This is an unapproved IEEE Standards Draft, subject to change.

    BEGIN_RATIONALE

    3.6.5.1  Field Splitting Rationale.
    (This subclause is not a part of P1003.2)

    The operation of field splitting using IFS as described in earlier
    drafts was based on the way the KornShell splits words, but is
    incompatible with other common versions of the shell.  However,
    each has merit, and so a decision was made to allow both.  If the
    IFS variable is unset, or is <space><tab><newline>, the operation
    is equivalent to the way the System V shell splits words.  Using
    characters outside the <space><tab><newline> set yields the
    KornShell behavior, where each of the non-<space><tab><newline>
    characters is significant.  This behavior, which affords the most
    flexibility, was taken from the way the original awk handled field
    splitting.

    The (3) rule can be summarized as a pseudo ERE:

        (s*ns*|s+)

    where s is an IFS white-space character and n is a character in the
    IFS that is not white space.  Any string matching that ERE delimits
    a field, except that the s+ form does not delimit fields at the
    beginning or the end of a line.  For example, if IFS is , the
    string

        red,whiteblue

    yields the three colors as the delimited fields.

    END_RATIONALE

> Hm, so what are the arguments to `for` (or to any command)?
>
> As far as I can tell, they are
> - not parameter expansion
> - not command substitution
> - not arithmetic expansion
>
> The paragraph above says that only results of these expansions and
> substitutions are subject to field splitting. What kind of
> substitution or expansion are command arguments a result of?

Command arguments are not a valid concept here at all.  A deep and dark
alley full of many horrors awaits anyone trying to think of things in
those terms.

Another section from P1003.2D11.2 may clear the fog (and also gives
concrete reasons for siding with Korn on these mechanisms):

    Copyright (c) 1991 IEEE.  All rights reserved.
    This is an unapproved IEEE Standards Draft, subject to change.

    BEGIN_RATIONALE

    3.6.0.1  Word Expansions Rationale.
    (This subclause is not a part of P1003.2)

    IFS is used for performing field splitting on the results of
    parameter and command substitution; it is not used for splitting
    all fields.  Previous versions of the shell used it for splitting
    all fields during field splitting, but this has severe problems
    because the shell can no longer parse its own script.  There are
    also important security implications caused by this behavior.  All
    useful applications of IFS use it for parsing input of the read
    utility and for splitting the results of parameter and command
    substitution.  New versions of the shell have fixed this bug, and
    POSIX.2 requires the corrected behavior.

    The rule concerning expansion to a single field requires that if
    foo=abc and bar=def, that

        "$foo""$bar"

    expands to the single field

        abcdef

    The rule concerning empty fields can be illustrated by:

        $ unset foo
        $ set $foo bar '' xyz "$foo" abc
        $ for i
        > do
        >     echo "-$i-"
        > done
        -bar-
        --
        -xyz-
        --
        -abc-

    Step (1) indicates that Tilde Expansion, Parameter Expansion,
    Command Substitution, and Arithmetic Expansion are all processed
    simultaneously as they are scanned.  For example, the following is
    valid arithmetic:

        x=1
        echo $(( $(echo 3)+$x ))

    An earlier draft stated that Tilde Expansion preceded the other
    steps, but this is not the case in known historical
    implementations; if it were, and a referenced home directory
    contained a $ character, expansions would result within the
    directory name.

    END_RATIONALE

If that didn't quite do it, then perhaps this will (the actual rules
that appear before the above quoted rationale).  This next section also
answers your last question about the empty field (i.e. pdksh is wrong):

    Copyright (c) 1991 IEEE.  All rights reserved.
    This is an unapproved IEEE Standards Draft, subject to change.

    3.6  Word Expansions

    This clause describes the various expansions that are performed on
    words.  Not all expansions are performed on every word, as
    explained in the following subclauses.

    Tilde expansions, parameter expansions, command substitutions,
    arithmetic expansions, and quote removals that occur within a
    single word expand to a single field.  It is only field splitting
    or pathname expansion that can create multiple fields from a single
    word.  The single exception to this rule is the expansion of the
    special parameter @ within double-quotes, as is described in 3.5.2.

    The order of word expansion shall be as follows:

     (1)  Tilde Expansion (see 3.6.1), Parameter Expansion (see 3.6.2),
          Command Substitution (see 3.6.3), and Arithmetic Expansion
          (see 3.6.4) shall be performed, beginning to end.  [See item
          (5) in 3.3.]

     (2)  Field Splitting (see 3.6.5) shall be performed on fields
          generated by step (1) unless IFS is null.

[[NOTE:  there's a minor inconsistency in the above vs. the rationale
quoted first in this message, specifically the earlier rationale stated
"If the IFS variable is unset, or is <space><tab><newline>, the
operation is equivalent to the way the System V shell splits words."
which would imply more magic happens than the above actual rule allows.
Hopefully nobody's implemented the extra magic given in the rationale.]]

     (3)  Pathname Expansion (see 3.6.6) shall be performed, unless
          set -f is in effect.

     (4)  Quote Removal (see 3.6.7) shall always be performed last.

    The expansions described in this clause shall occur in the same
    shell environment as that in which the command is executed.

    If the complete expansion appropriate for a word results in an
    empty field, that empty field shall be deleted from the list of
    fields that form the completely expanded command, unless the
    original word contained single-quote or double-quote characters.

    The $ character is used to introduce parameter expansion, command
    substitution, or arithmetic evaluation.
    If an unquoted $ is followed by a character that is either not
    numeric, the name of one of the special parameters (see 3.5.2), a
    valid first character of a variable name, a left curly brace ({),
    or a left parenthesis, the result is unspecified.

--
                                                        Greg A. Woods

+1 416 443-1734      VE3TCP
Planix, Inc. ; Secrets of the Weird
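
The rules quoted above can be tried out with a short script.  What
follows is only a sketch for a POSIX-style sh (ash, ksh, or similar),
not part of the PR or the draft.  The variable names, the IFS value of
space plus comma, and the sample string are all assumptions chosen for
illustration.

```shell
#!/bin/sh
# Sketch of the quoted word-expansion rules.  Every variable name and
# sample value here is illustrative and not taken from the PR.

# Field splitting applies to the result of an unquoted expansion.
IFS=' ,'                        # assumed IFS: space (white space) and comma
colors='  red  , white blue'    # assumed sample input
set -- $colors                  # unquoted, so the value is split on IFS
split_count=$#
split_fields="$1|$2|$3"

# A literal word in the script is not split, even though it contains
# a comma and the comma is in IFS.
set -- red,white
literal_count=$#

# A null IFS disables field splitting of expansion results.
IFS=''
set -- $colors
null_ifs_count=$#

# Unsetting IFS returns to the default (space, tab, newline).
unset IFS

# Adjacent quoted expansions within one word yield a single field.
foo=abc bar=def
set -- "$foo""$bar"
joined_count=$#
joined=$1

# An empty field is kept only when the original word contained quotes:
# unquoted $foo vanishes, while '' and "$foo" each stay as empty fields.
unset foo
set -- $foo bar '' xyz "$foo" abc
empty_count=$#

printf 'split on IFS:     %s field(s): %s\n' "$split_count" "$split_fields"
printf 'literal word:     %s field(s)\n' "$literal_count"
printf 'null IFS:         %s field(s)\n' "$null_ifs_count"
printf 'adjacent quotes:  %s field(s): %s\n' "$joined_count" "$joined"
printf 'empty-field rule: %s field(s)\n' "$empty_count"
```

A shell that follows the quoted rules should report 3 fields for the
split case, 1 for the literal word, 1 under a null IFS, 1 for the
adjacent quoted expansions, and 5 for the empty-field case.  A shell
that still applies IFS to every word, as the rationale says older
shells did, would be expected to report 2 for the literal word instead.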