From owner-freebsd-current@FreeBSD.ORG Thu Oct 30 03:36:56 2003
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E5A3E16A4CE
	for ; Thu, 30 Oct 2003 03:36:56 -0800 (PST)
Received: from mailhub.fokus.fraunhofer.de (mailhub.fokus.fraunhofer.de [193.174.154.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7A9D943FAF
	for ; Thu, 30 Oct 2003 03:36:55 -0800 (PST)
	(envelope-from brandt@fokus.fraunhofer.de)
Received: from beagle (beagle [193.175.132.100])h9UBWkP04683;
	Thu, 30 Oct 2003 12:32:46 +0100 (MET)
Date: Thu, 30 Oct 2003 12:32:46 +0100 (CET)
From: Harti Brandt
To: Terry Lambert
In-Reply-To: <3FA0EEFD.431DD759@mindspring.com>
Message-ID: <20031030120925.K80335@beagle.fokus.fraunhofer.de>
References: <3F9F4FE6.29C4E178@mindspring.com> <3FA0EEFD.431DD759@mindspring.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: Jordan K Hubbard
cc: current@FreeBSD.org
Subject: Re: Anyone object to the following change in libc?
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Thu, 30 Oct 2003 11:36:57 -0000

On Thu, 30 Oct 2003, Terry Lambert wrote:

TL>Harti Brandt wrote:
TL>> TL>Paragraph 6 of:
TL>> TL>
TL>> TL>     http://www.opengroup.org/onlinepubs/007904975/functions/sscanf.html
TL>> TL>
TL>> TL>Implies that the lack of characters in the string following the
TL>> TL>conversion, due to failure in assignment, should result in an
TL>> TL>"Input failure". Note also that stdio.h defines EOF as -1.
TL>>
TL>> I fail to locate this paragraph. This interpretation would also imply
TL>> that scanf() always needs to return -1 whenever it cannot match a format
TL>> specifier.
TL>
TL>     The fscanf() functions shall execute each directive of the
TL>     format in turn.
TL>     If a directive fails, as detailed below, the
TL>     function shall return. Failures are described as input
TL>     failures (due to the unavailability of input bytes) or
TL>     matching failures (due to inappropriate input).
TL>
TL>It comes down to how you interpret the NUL byte at the end of the
TL>sscanf() input string. Is it an EOF? Or is it an unavailability of
TL>input bytes? The answer to the question picks which return value
TL>is correct.

Section 7.19.6.7 of N843 states: "Reaching the end of the string is
equivalent to encountering end-of-file for the fscanf function."
Unfortunately this is missing in POSIX, but obviously implied by their
reference to ISO. The next paragraph states: "The sscanf function returns
the value of the macro EOF if an input failure occurs before any
conversion." Again: do we have a conversion? We have! Should we return
EOF? No.

TL>
TL>
TL>> TL>I think it can be interpreted either way, still.
TL>>
TL>> You miss the section about RETURN VALUE: EOF is returned on a read error.
TL>> This is not an input error.
TL>
TL>How do I distinguish a "return value is -1 as an error result" from
TL>"return value is -1 as an EOF result"?

Well, I suppose that's the intention of having scanf() set errno when
it returns -1 in POSIX. Unfortunately POSIX fails to describe the error
codes. This is possibly fodder for the aardvark.

TL>
TL>
TL>> You should also read the very 1st paragraph. This clearly states that
TL>> ISO is the primary source of information and the ISO text is a lot
TL>> cleaner.
TL>
TL>No, that's not what it actually states; here's the paragraph:
TL>
TL>     The functionality described on this reference page is
TL>     aligned with the ISO C standard. Any conflict between
TL>     the requirements described here and the ISO C standard
TL>     is unintentional. This volume of IEEE Std 1003.1-2001
TL>     defers to the ISO C standard.
TL>
TL>It says that any conflicts are unintentional, and their intent was
TL>to use different language for no good reason, rather than just
TL>copying it verbatim and removing any doubt. It does *NOT* say
TL>that no conflicts exist.

Yes. But I take the last sentence to mean that ISO C takes over in
case a conflict exists.

TL>
TL>Also: In this context, which is IEEE 1003.1-2001, Issue 6, "the
TL>ISO C standard" refers to "c89", which is the version of the C
TL>standard that was in effect at the time that SVID IV was defined.

Line 107 of Austin TC-1:

"The c89 utility (which specified a compiler for the C Language specified by the
108 ISO/IEC 9899: 1990 standard) has been replaced by a c99 utility (which specifies a compiler for
109 the C Language specified by the ISO/IEC 9899: 1999 standard)."

TL>If you need clarification on this issue, you should download the
TL>currently available version of the NIST/PCTS, which specifically
TL>requires you to compile with a c89 compiler, not one more recent.
TL>The same is true of The Open Group test suites which are available
TL>on the Internet.
TL>
TL>The version of the ISO C standard you are quoting from is *NOT*
TL>the c89 version.

Our sscanf() claims conformance to C99. So if we change the behaviour we
have to remove this claim.

TL>This makes interpretation ambiguous, since the test you are
TL>specifically referencing to get the 0 result is text that was
TL>added to the next version of the standard to clarify it.
TL>
TL>
TL>> I think it makes no sense to classify
TL>>
TL>>     sscanf("123", "%*d%d", ...
TL>>
TL>> as an error, but
TL>>
TL>>     sscanf("123", "%d%d", ...
TL>>
TL>> not, does it? Also at least Solaris 9 returns -1 but fails to set
TL>> errno. Which is simply a bug.
TL>
TL>It makes no sense to do conversions without assignment in the
TL>first place (IMO).

[... Stuff about sense removed (I was talking about what return code makes
sense, not whether calling sscanf makes sense) ...]
TL>In any case, we are practically guaranteed that returning -1, as
TL>all other UNIX-like OS's currently do, would result in less source
TL>code breaking.

No coder in his right mind should have written code that depends on
this behaviour given the moot formulations in the classical books, man
pages and pre-C99 standards. Also note that the reason for this change
request was that configuration scripts break, not applications. If
applications break they should be fixed.

TL>In other words, conformance level has historically been dictated
TL>by what code is not broken, not what is technically permitted by
TL>the standards, if you language-lawyer them to death.
TL>
TL>To put it in IETF terms: "Be conservative in what you generate,
TL>and generous in what you accept".

This does not apply here because you cannot return -1 and 0 at the
same time. Adhering to a cleanly written standard and breaking a handful
of badly written autoconf scripts is clearly better than adhering to
undocumented historical behaviour. What will we do if Solaris 10 returns 0
in the above case? Change our code back?

harti
--
harti brandt,
http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
brandt@fokus.fraunhofer.de, harti@freebsd.org