From owner-svn-src-all@FreeBSD.ORG Thu Aug 22 10:51:31 2013 Return-Path: Delivered-To: svn-src-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2FF6F9C8; Thu, 22 Aug 2013 10:51:31 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id A58DB21C9; Thu, 22 Aug 2013 10:51:30 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 0D4F44241AC; Thu, 22 Aug 2013 20:51:17 +1000 (EST) Date: Thu, 22 Aug 2013 20:51:15 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Jilles Tjoelker Subject: Re: svn commit: r254600 - head/lib/libutil In-Reply-To: <20130821213755.GA8052@stack.nl> Message-ID: <20130822185756.Y1297@besplex.bde.org> References: <201308211646.r7LGk6eV051215@svn.freebsd.org> <5214F72B.7070006@freebsd.org> <20130821190309.GB52908@omg> <20130821202725.GA4991@stack.nl> <20130821212413.GC52908@omg> <20130821213755.GA8052@stack.nl> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=bpB1Wiqi c=1 sm=1 tr=0 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=aw4-S10CjX0A:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=H7VN2zDOXXgA:10 a=10tJtSuNjYdlZKnmVusA:9 a=nWxIxC1GMPgnsk2h:21 a=zFEI9HLo3Qmo5DcZ:21 a=CjuIK1q_8ugA:10 Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, Sergey Kandaurov , Andrey Chernov , src-committers@FreeBSD.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Aug 2013 10:51:31 -0000 On Wed, 21 Aug 2013, Jilles Tjoelker wrote: > On Thu, Aug 22, 2013 at 01:24:13AM +0400, Sergey Kandaurov wrote: >> On Wed, Aug 21, 2013 at 10:27:25PM +0200, Jilles Tjoelker wrote: >>> On Wed, Aug 21, 2013 at 11:03:10PM +0400, Sergey Kandaurov wrote: >>>> On Wed, Aug 21, 2013 at 09:21:47PM +0400, Andrey Chernov wrote: >>>>> On 21.08.2013 20:46, Sergey Kandaurov wrote: >>>>>> number = strtoumax(buf, &endptr, 0); >>>>>> >>>>>> + if (number == UINTMAX_MAX && errno == ERANGE) { >>>>>> + return (-1); >>>>>> + } > >>>>> You need to reset errno before strtoumax() call (errno = 0), because any >>>>> of previous functions may left it as ERANGE. This also has a bogus test for UINTMAX_MAX wheich prevents input of the number UINTMAX_MAX on systems where the limit for the function (UINT64_T_MAX) is the same as the limit for strtoumax(). This test is a wrong way of doing the errno thing. This also has a high density of style bugs: - excessive braces - extra blank line expand_number() remains a very badly designed and implemented function. Its design errors start with its name. It doesn't expand numbers. It converts strings to numbers. The strtoumax() family is missing this bug and others. Its design errors continue with it taking a uint64_t arg and not a uint64_t arg. Its implementation errors begin with broken range checking. The above is an incomplete fix for this. Unless uintmax_t is the same as uint64_t, there are still the following bugs: - the return value of strotumax() is assigned to a uint64_t. This may overflow. After overflow, the bogus (number == UINTMAX_MAX) check has no effect, since when overflow is possible 'number' cannot equal UINTMAX_MAX. - values that exceed the maximum supported (UINT64_MAX) are not detected. They are silently truncated to uint64_t. >>>> Thanks for pointing out. >>>> Does the patch look good? No. It adds many more style bugs... >>>> Index: expand_number.c >>>> =================================================================== >>>> --- expand_number.c (revision 254600) >>>> +++ expand_number.c (working copy) >>>> @@ -53,6 +53,8 @@ >>>> unsigned shift; >>>> char *endptr; >>>> >>>> + errno = 0; >>>> + Another extra blank line. It is worse than most, since it separates the initialization for calling stroumax() from the call. >>>> number = strtoumax(buf, &endptr, 0); >>>> This extra blank line separates the call from the error checking for the function. It was added in the commit that started this thread. The previous version had a bogus extra blank line to separate the result of the error checking from the call. >>>> if (number == UINTMAX_MAX && errno == ERANGE) { > >>> This may cause the function to set errno=0 if it is successful, which is >>> not allowed for standard library functions from C and POSIX. There may >>> be a problem not only if expand_number() is standardized but also if it >>> is used in the implementation of a standard library function. The best >>> solution is to save and restore errno around this (if [ERANGE] is >>> detected, that is a valid errno value to keep). This is painful, but seems to be necessary. >>> In an application it is acceptable to set errno=0 without further ado. > >> What about this change? >> It changes errno only if it was modified from zero in strtoumax(). >> Unconditionally restoring errno after strtoumax() unless ERANGE is >> probably not good as strtoumax() might set different errno (e.g. EINVAL) >> and we might want to keep it as well. Please correct me, if I'm wrong. > >> Index: lib/libutil/expand_number.c >> =================================================================== >> --- lib/libutil/expand_number.c (revision 254600) >> +++ lib/libutil/expand_number.c (working copy) >> @@ -50,15 +50,22 @@ >> expand_number(const char *buf, uint64_t *num) >> { >> uint64_t number; >> + int saved_errno; Style bugs: - declarations more unsorted than before. Previously, only endptr was unsorted. - the variable for this is normally names serrno. libutil mostly follows this convention -- it has several serrno's and no saved_errno's, but it does have some error's (error = errno; ...), mainly in pidfile.c. Only login_cap.c is notkkk careful about clobbering errno to 0. >> unsigned shift; >> char *endptr; >> >> + saved_errno = errno; >> + errno = 0; >> + >> number = strtoumax(buf, &endptr, 0); >> >> if (number == UINTMAX_MAX && errno == ERANGE) { >> return (-1); >> } Style bugs, as above. >> >> + if (errno == 0) >> + errno = saved_errno; >> + Yet another style bug. >> if (endptr == buf) { >> /* No valid digits. */ >> errno = EINVAL; >> Another style bug is the placement of the error checking. Normally the endptr check is done before the range check. > > This looks good to me. This looks not so good to me. Some of the other bugs in the old version: % /* % * Convert an expression of the following forms to a uint64_t. % * 1) A positive decimal number. % * 2) A positive decimal number followed by a 'b' or 'B' (mult by 1). % * 3) A positive decimal number followed by a 'k' or 'K' (mult by 1 << 10). % * 4) A positive decimal number followed by a 'm' or 'M' (mult by 1 << 20). % * 5) A positive decimal number followed by a 'g' or 'G' (mult by 1 << 30). % * 6) A positive decimal number followed by a 't' or 'T' (mult by 1 << 40). % * 7) A positive decimal number followed by a 'p' or 'P' (mult by 1 << 50). % * 8) A positive decimal number followed by a 'e' or 'E' (mult by 1 << 60). This garbage was copied from a similar comment in dd/args.c. The number is actually neither positive nor decimal, especially in dd where there is a variant of this function that returns negative numbers. Another design error in this function is that it can't return negative numbers. % */ % This bogus blank line detaches the comment from the code that it describes, resulting in the comment describing null code. % int % expand_number(const char *buf, uint64_t *num) % { % uint64_t number; % unsigned shift; % char *endptr; % % number = strtoumax(buf, &endptr, 0); This supports non-positive non-decimal numbers. The support for non-decimal numbers is more intentional (it is given by the base 0 in the call). Negative numbers are only supported by strtoumax() allowing a minus sign and returning a nonnegative number equal to 1 larger than UINTMAX_MAX less the number without the minus sign. % % if (endptr == buf) { % /* No valid digits. */ % errno = EINVAL; % return (-1); % } Most of the bugs except ones in the comment are missing in dd/args.c: % static uintmax_t % get_num(const char *val) % { % uintmax_t num, mult, prevnum; By using variables of the same type that strtoumax() returns, it avoids overflow bugs in the overflow checking... % char *expr; % % errno = 0; This was fixed 14 years ago in dd. Perhaps earlier in my version. For variations on the bugs, see the 4.4Lite version. It was missing this errno setting. Since this is an application, it can clobber errno like this. % num = strtouq(val, &expr, 0); This is still broken. It unnecessarily assumes that uquad_t = uintmax_t. 4.4BSD-Lite uses strtoul() here, and using strtouq() is a wrong fix. % if (errno != 0) /* Overflow or underflow. */ % err(1, "%s", oper); The error handling is simpler since this is not a library function. The comment about underflow is wrong. Underflow can't happen. What the comment means by underflow is "overflow, with a negative sign". 4.4BSD-Lite was missing this bug. But 4.4BSD-Lite had a wrong range check (num == ULONG_MAX, essentially the same as expand_number() now). % 4.4BSD-Lite was missing this bug (extra blank line). % if (expr == val) /* No valid digits. */ % errx(1, "%s: illegal numeric value", oper); - this code, with its more specific error message, was broken by a POSIX bug. It is now unreachable (EINVAL is returned if no number is found, and for some other cases like an invalid base which can't happen for the above call). It is not bad to break it becauses its message is of low quality in a different way than the strerror(errno) message from err(). There are no lawyers here. strerror(EINVAL) is just too generic. % [...] dd's function is very careful to avoid overflow in shifts and multiplications. expand_number() just allows things to overflow. Since the variables are unsigned, overflow just causes garbage results. Not good enough for a library function. dd's function also supports multipliers. The overflow avoidance is more complication for multiplication than for shifting. Multipliers would be useful in a library function too. expand_number() doesn't support them. Another of its design errors is that it does less than dd's functions, so it cannot be used to replace them. % } % % /* % * Convert an expression of the following forms to an off_t. This is the % * same as get_num(), but it uses signed numbers. % * % * The major problem here is that an off_t may not necessarily be a intmax_t. % */ % static off_t % get_off_t(const char *val) % { % intmax_t num, mult, prevnum; % char *expr; % % errno = 0; % num = strtoq(val, &expr, 0); % if (errno != 0) /* Overflow or underflow. */ % err(1, "%s", oper); % % if (expr == val) /* No valid digits. */ % errx(1, "%s: illegal numeric value", oper); % ... % } dd also has this misimplemented function. It is supposed to fix the problem that get_num() only returns unsigned values. But it is misimplemented by making a copy of get_num() and changing some unsigned types to signed types. The result is mostly worse than casting the result of get_num(). The careful overflow checking no longer works, since it is actually not so careful and does things like checking for overflow after overflow has already occurred. Overflow already occurring gives underfined behaviour with signed types. So the duplicated function differs from just casting the result of get_num() as follows: - it gives undefined behaviour on overflow - strtoq() clamps to QUAD_MAX_/MIN where strtouq() clamps to UQUAD_MAX/MIN(). This mainly amplifies the bugs seeking to large offsets. The kernel supports 64-bit unsigned offsets for some defice files and these are needed for /dev/kmem. But QUAD_MAX is smaller than the necessary offsets, so oseek and iseek in bytes are broken. You have to trick dd into working by using non-byte offsets. Some tricks (bogus casts/conversions in various places) are needed anyway since signed 64-bit off_t can't hold the necessary offsets. The function is still missing checks that quad_t is not too large for offsets. My fix for get_off_t() is to just make it (off_t)get_num() with a check that the result is actually representable as an off_t. This only depends on 2's complement magic. To start fixing expand_number(): /* Fix comment here. */ int expand_number(const char *buf, uint64_t *num) { char *endptr; uintmax_t num; uint64_t number; unsigned shift; int serrno; serrno = errno; errno = 0; num = strtoumax(buf, &endptr, 0); if (num > UINT64_MAX) errno = ERANGE; if (errno != 0) return (-1); errno = serrno; number = num; ... } This depends on the POSIX bug for detecting the no-digits case. Fortunately, the early error handling of this function is simple and compatible with that of strtoumax(), so we don't need to translate strtoumax()'s errors except for adjusting its range error. Similarly in dd: /* Fix comment here. */ static uintmax_t get_num(const char *val) { char *expr; uintmax_t num, mult, prevnum; errno = 0; num = strtoumax(val, &expr, 0); if (errno != 0) err(1, "%s", oper); ... } /* * Convert an expression of the above form to an off_t. This version does * not handle negative numbers perfectly. It assumes 2's complement and * maybe nonnegative multipliers. I hope perfect handling is not necessary * for dd. */ static off_t get_off_t(const char *val) { uintmax_t num; num = get_num(val); if (num != (u_quad_t)(off_t)num) errx(1, "%s: offset too large", oper); return ((off_t)num); } Bruce