Date:      Thu, 20 Jan 2005 23:21:37 +0100
From:      Joerg Wunsch <freebsd-current@uriah.heep.sax.de>
To:        current@FreeBSD.ORG
Cc:        bde@FreeBSD.ORG
Subject:   Re: Implementation errors in strtol()
Message-ID:  <20050120222137.GE30862@uriah.heep.sax.de>
In-Reply-To: <20050120214406.GA70088@nagual.pp.ru>
References:  <20050120192324.GA30862@uriah.heep.sax.de> <20050120205501.GA69123@nagual.pp.ru> <20050120211449.GC30862@uriah.heep.sax.de> <20050120214406.GA70088@nagual.pp.ru>

As Andrey Chernov wrote:

> Errno may be set in case of error with not documented errno. Thats
> how I read it, but I may miss something.

I read that a bit differently.

> > Still, my major point was that "0x" sequences are falsely rejected as

> It clearly should be rejected with EINVAL in case base == 16,
> because 0 alone is not valid HEX sequence

Nope.  "0" alone is a completely valid hexadecimal number,
representing the value 0.  Conversion has to start at the 0 (as it is
not invalid), and to stop at the x.  The string "0x" simply means
there is *no* optional 0x prefix, but just a number 0 without a prefix
(followed by a letter that cannot be converted, so it has to be passed
as final string).

> > conversion errors, and that strings consisting solely of a plus or
> > minus sign should not throw an error either, as I read the C standard.

> +- may produce EINVAL, as POSIX says.

Where?

Again, I'd value the C standard higher than Posix.  C says they form a
valid subject sequence.  Thus, no error may be flagged.

I don't have Posix at hand, but SUSPv2 completely follows the C
standard, with the only addition that EINVAL might be flagged in the
case of a conversion error.  A subject sequence consisting of a sign
only (+ or -) does not constitute an empty subject sequence, so no
conversion error is permissible.  This implies EINVAL must not be
set.

> In general please don't forget that strtol(), atol() etc. supposed
> to parse user input and _detect_ syntax errors, it is their
> purpose. If they not do it or do it in half, each program forced to
> use its own parser instead.

This is no excuse for violating standards.  If a user feels that a
single sign must not be interpreted as a valid 0, they indeed have to
apply their own checks.  It would be no use to them if FreeBSD's
strtoul (erroneously) flagged it as an error, while any
standard-compliant implementation would not fulfill their expectations
anyway.

As a demonstration, consider this test program:

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

const char *s[] = {
	"", "+", "-", "0x", "-0x"
};

int
main(void)
{
	size_t i;
	char *p;
	long l;

	for (i = 0; i < sizeof s / sizeof s[0]; i++) {
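		errno = 0;	/* strtol() never clears errno itself */
		l = strtol(s[i], &p, 16);
		/* p - s[i] is a ptrdiff_t: number of characters consumed */
		printf("\"%s\" -> %ld, len %td, errno %d\n",
		    s[i], l, p - s[i], errno);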
	}
	return 0;
}

Below are the results for Solaris 8, FreeBSD 5, Linux 2.x, and HP-UX
10.20, in that order.

helios% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 1, errno 0
"-0x" -> 0, len 2, errno 0
j@uriah 1259% ./foo
"" -> 0, len 0, errno 22
"+" -> 0, len 0, errno 22
"-" -> 0, len 0, errno 22
"0x" -> 0, len 0, errno 22
"-0x" -> 0, len 0, errno 22
j@lux 344% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 1, errno 0
"-0x" -> 0, len 2, errno 0
j@king 105% ./foo
"" -> 0, len 0, errno 0
"+" -> 0, len 0, errno 0
"-" -> 0, len 0, errno 0
"0x" -> 0, len 0, errno 0
"-0x" -> 0, len 0, errno 0

It's quite obvious that every other system differs from FreeBSD here.
(OK, HP-UX doesn't throw EINVAL at all, even for clearly inconvertible
strings.  But then, it's a pretty old system, more than ten years old.)

As the Posix/SUSPv2 standards say ``may set to EINVAL'', I'd even go
with the majority and not set EINVAL for an empty string, even though
it technically constitutes a conversion error according to the C
standard.  It's quite pointless to handle a single plus or minus sign
differently from an empty string.  Since both the C and Posix/SUSP
standards mandate that +/- must not cause a conversion error, we
should simply not flag the error for the empty string either.

-- 
cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)


