Date: Thu, 20 Jan 2005 23:21:37 +0100
From: Joerg Wunsch
To: current@FreeBSD.ORG
Message-ID: <20050120222137.GE30862@uriah.heep.sax.de>
In-Reply-To: <20050120214406.GA70088@nagual.pp.ru>
Cc: Andrey Chernov,
    bde@FreeBSD.ORG
Subject: Re: Implementation errors in strtol()

As Andrey Chernov wrote:

> Errno may be set in case of error with not documented errno. Thats
> how I read it, but I may miss something.

I read that a bit differently.

> > Still, my major point was that "0x" sequences are falsely rejected as

> It clearly should be rejected with EINVAL in case base == 16,
> because 0 alone is not valid HEX sequence

Nope.  "0" alone is a completely valid hexadecimal number,
representing the value 0.  Conversion has to start at the 0 (as it is
valid) and stop at the x.  The string "0x" simply means there is *no*
optional 0x prefix, just a number 0 without a prefix (followed by a
letter that cannot be converted, so it has to be passed back through
endptr as the final string).

> > conversion errors, and that strings consisting solely of a plus or
> > minus sign should not throw an error either, as I read the C standard.

> +- may produce EINVAL, as POSIX says.

Where?

Again, I'd value the C standard higher than Posix.  C says they form a
valid subject sequence.  Thus, no error may be flagged.

I don't have Posix at hand, but SUSPv2 completely follows the C
standard, with the only addition that EINVAL might be flagged in the
case of a conversion error.  Since a subject sequence consisting of a
sign only (+ or -) is not an empty subject sequence, no conversion
error is permissible.  This implies EINVAL must not be set.

> In general please don't forget that strtol(), atol() etc. supposed
> to parse user input and _detect_ syntax errors, it is their
> purpose. If they not do it or do it in half, each program forced to
> use its own parser instead.

This is no excuse for violating standards.
If a user feels that a single sign must not be interpreted as a valid
0, they indeed have to apply their own checks.  It would be no use to
them if FreeBSD's strtoul (erroneously) flagged it as an error, while
any standard-compliant implementation would not fulfill their
expectations anyway.

As a demonstration, consider this test program:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

const char *s[] = { "", "+", "-", "0x", "-0x" };

int
main(void)
{
	size_t i;
	char *p;
	long l;

	for (i = 0; i < sizeof s / sizeof s[0]; i++) {
		errno = 0;	/* so a stale value is not mistaken for an error */
		l = strtol(s[i], &p, 16);
		/* p - s[i] is a ptrdiff_t; cast it to match %u */
		printf("\"%s\" -> %ld, len %u, errno %d\n",
		       s[i], l, (unsigned)(p - s[i]), errno);
	}
	return 0;
}

Below are the results for Solaris 8, FreeBSD 5, Linux 2.x, and HP-UX
10.20.

helios% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 1, errno 0
"-0x" -> 0, len 2, errno 0

j@uriah 1259% ./foo
"" -> 0, len 0, errno 22
"+" -> 0, len 0, errno 22
"-" -> 0, len 0, errno 22
"0x" -> 0, len 0, errno 22
"-0x" -> 0, len 0, errno 22

j@lux 344% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 1, errno 0
"-0x" -> 0, len 2, errno 0

j@king 105% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 0, errno 0
"-0x" -> 0, len 0, errno 0

It's quite obvious that every other system differs from FreeBSD here.
(OK, HP-UX doesn't throw EINVAL at all, even for clearly inconvertible
strings.  But then, it's a pretty old system, more than ten years
old.)

As the Posix/SUSPv2 standards say ``may set to EINVAL'', I'd even go
with the majority and not set EINVAL for an empty string, even though
it technically constitutes a conversion error according to the C
standard.  It's quite pointless to handle a single plus or minus sign
differently than an empty string, and as both the C and Posix/SUSP
standards mandate that +/- must not cause a conversion error, just
don't flag the error for the empty string either.

-- 
cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
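
For a caller who does want an empty string or a lone sign rejected, the
"own checks" mentioned above could look roughly like the sketch below.
It is only an illustration: parse_hex_long() is a hypothetical helper
name, not part of any libc, and it relies solely on the end pointer
(which the C standard requires to equal the input pointer when no
conversion is performed) rather than on EINVAL, which the results above
show is not set consistently across systems.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (an assumption for illustration, not a libc API):
 * parse a base-16 long and fail when strtol() consumed nothing, whether
 * or not the platform sets EINVAL in that case.
 * Returns 0 on success and -1 on failure.  On success, *out holds the
 * value and, if end is non-NULL, *end points at the first unconsumed
 * character.
 */
int
parse_hex_long(const char *s, long *out, const char **end)
{
	char *p;
	long v;

	errno = 0;
	v = strtol(s, &p, 16);
	if (p == s)		/* no conversion: "", "+", "-", or garbage */
		return -1;
	if (errno == ERANGE)	/* out of range, result was clamped */
		return -1;
	*out = v;
	if (end != NULL)
		*end = p;
	return 0;
}
```

With this, "", "+" and "-" are rejected everywhere, because no
conversion leaves the end pointer at the start of the string.  Whether
"0x" is accepted as 0 with "x" left over still depends on the
underlying strtol(), as the four outputs above illustrate.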