Date: Thu, 3 Jun 2010 15:10:42 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Garrett Cooper <yanefbsd@gmail.com>
Cc: freebsd-bugs@freebsd.org
Subject: Re: kern/147226: read(fd, buffer, len) returns -1 immediately, if len >=2147483648
Message-ID: <20100603140402.J27549@delplex.bde.org>
In-Reply-To: <201005312310.o4VNA3ss070152@freefall.freebsd.org>
References: <201005312310.o4VNA3ss070152@freefall.freebsd.org>

On Mon, 31 May 2010, Garrett Cooper wrote:

> The following reply was made to PR kern/147226; it has been noted by GNATS.

The following reply is only to the addresses in the header mangled by
GNATS, so it might be lost by GNATS as usual:

> > From: Bruce Cran <bruce@cran.org.uk>
> > To: bug-followup@FreeBSD.org, eugene.kharitonov@gmail.com
> > Cc:
> > Subject: Re: kern/147226: read(fd, buffer, len) returns -1 immediately, if len >=2147483648
> > Date: Mon, 31 May 2010 16:21:05 +0100
> >
> > This actually looks like a 64-bit bug.
> > http://opengroup.org/onlinepubs/007908775/xsh/read.html says that up to
> > SSIZE_MAX bytes must be accepted, whereas FreeBSD only accepts up to
> > INT_MAX bytes.
>
> The point being that SSIZE_MAX is INT_MAX on 32-bit archs and LONG_MAX
> on 64-bit archs.

Yes, the point is that SSIZE_MAX is only broken on 64-bit arches.  It is
supposed to give the limit for read() and write() (but not for much else
(1)), but the limit is actually INT_MAX, which differs from SSIZE_MAX on
broken arches.  The POSIX rationale makes it clear that SSIZE_MAX gives
the actual limit and that the actual limit may be significantly less than
the maximum of the type used to pass the value (ssize_t), but the POSIX
spec conflicts with this, at least in the old 2001 draft7:

% Spec:
% (1)
% 9110 {SSIZE_MAX}
% 9111        Maximum value of an object of type ssize_t.
% 9112        Minimum Acceptable Value: {_POSIX_SSIZE_MAX}
% (2)
% 13001 XSI   The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}]. The
% 13002       type useconds_t shall be an unsigned integer type capable of storing values at least in the range

Here (2) is correct but redundant since (1) requires more, but (1) is
incorrect since it requires SSIZE_MAX to be the maximum of the type, while
it is the maximum of the range that matters, and there may be no raw
object (signed integer) type that has that maximum.
On the broken arches, it happens that such a type exists, but it is not
used due to ABI considerations.  The rationale explicitly allows making
SSIZE_MAX smaller so as to give the actual maximum without requiring
mangling of the ABI to limit it to the actual maximum or mangling of the
actual maximum to make it match the ABI.

% Rationale:
% 8548 ssize_t   This is intended to be a signed analog of size_t. The wording is such that an
% 8549           implementation may either choose to use a longer type or simply to use the signed
% 8550           version of the type that underlies size_t. All functions that return ssize_t (read( )
% 8551           and write( )) describe as ``implementation-defined'' the result of an input exceeding
% 8552           {SSIZE_MAX}. It is recognized that some implementations might have ints that
% 8553           are smaller than size_t. A conforming application would be constrained not to
% 8554           perform I/O in pieces larger than {SSIZE_MAX}, but a conforming application
% 8555           using extensions would be able to use the full range if the implementation
% 8556           provided an extended range, while still having a single type-compatible interface.

There is no corresponding rationale for SSIZE_MAX.

(1) Here is a complete list of APIs documented by the old draft as being
affected by the SSIZE_MAX limit.  Note that it is much smaller than the
list of APIs that use ssize_t.

    mq_receive(), msgrcv()
    read(), pread()
    readlink()
    write(), pwrite()
    strfmon() (a bogus (2) limit on the `size_t maxsize' arg).

(2) This limit is intended to limit the buffer size to a value that can be
returned by strfmon().
This is possible since strfmon() returns ssize_t, but bogus since the same
problem affects interfaces like snprintf() to a much larger extent, and
there is no problem and thus should be no error unless a too-large value
actually needs to be returned, but strfmon() can never usefully want to
return a too-large value, unlike snprintf() which only almost never wants
to return one -- suppose someone has somehow obtained an enormous buffer
(one of size > SSIZE_MAX) and passes its size to strfmon() -- then who is
strfmon() to reject this buffer just because the format _might_ be even
more preposterous so as to generate a result larger than SSIZE_MAX?  (The
behaviour is undefined if the passed size differs from the actual size,
even if bytes beyond the end of the buffer would not be accessed by a
naive implementation for this call, but strfmon() cannot easily detect
this error.)

For snprintf(), it is useful to be able to return results much larger than
the 20 characters or so needed for printing the maximum useful monetary
value (~= the global GDP), but snprintf()'s API is not pointlessly
typedefed and in any case limits on the buffer size have no effect on the
returned size -- the returned size can easily want to exceed (unsigned)
SIZE_MAX even if the buffer size is 0, by using enough %.*'s in the format
to reach SIZE_MAX.

Later versions of POSIX: in the latest public version (2004) found by
google:
- no change in the spec for {SSIZE_MAX}.  Since the rationale cannot
  change, the spec is still broken.
- no change in the spec for snprintf()'s return value.  It still specifies
  the impossible, by requiring snprintf() to return the number of bytes
  that would be written in all cases, but this is impossible if the number
  would exceed INT_MAX.  This bug is inherited from C99.

Bruce