From owner-svn-src-all@FreeBSD.ORG  Sun Jul 17 23:09:09 2011
Return-Path: <owner-svn-src-all@FreeBSD.ORG>
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4F33A1065670;
	Sun, 17 Jul 2011 23:09:09 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au
	[211.29.132.187])
	by mx1.freebsd.org (Postfix) with ESMTP id C142B8FC1A;
	Sun, 17 Jul 2011 23:09:08 +0000 (UTC)
Received: from c122-106-165-191.carlnfd1.nsw.optusnet.com.au
	(c122-106-165-191.carlnfd1.nsw.optusnet.com.au [122.106.165.191])
	by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	p6HN94QO030466
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 18 Jul 2011 09:09:05 +1000
Date: Mon, 18 Jul 2011 09:09:04 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Ryan Stone <rstone@freebsd.org>
In-Reply-To: <201107172108.p6HL8Gkd090278@svn.freebsd.org>
Message-ID: <20110718080428.N3939@besplex.bde.org>
References: <201107172108.p6HL8Gkd090278@svn.freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org,
	src-committers@freebsd.org
Subject: Re: svn commit: r224150 - head/sbin/fdisk
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
	user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
	<mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
	<mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 17 Jul 2011 23:09:09 -0000

On Sun, 17 Jul 2011, Ryan Stone wrote:

> Log:
>  The MBR uses a 32-bit unsigned integer to store the size of a slice, but
>  fdisk(1) internally uses a signed int.  Should a user attempt to specify
>  a slice containing more than 2^31 - 1 sectors, an error will be reported
>  on systems with sizeof(long) == 4 and the slice size will be silently
>  truncated on systems with sizeof(long) > 4.
>
>  Instead use an unsigned long to store the slice size in fdisk(1).  This
>  allows the user to specify a slice size up to the maximum permitted by
>  the MBR on-disk format and does not have any problems with silent
>  truncation should the use specify an slice size larger than 2^32 on systems
>  with sizeof(long) > 4.
>
>  Submitted by:	Mark Johnston (markjdb AT gmail DOT com)
>  MFC after:	2 weeks
>
> Modified:
>  head/sbin/fdisk/fdisk.c
>
> Modified: head/sbin/fdisk/fdisk.c
> ==============================================================================
> --- head/sbin/fdisk/fdisk.c	Sun Jul 17 20:49:38 2011	(r224149)
> +++ head/sbin/fdisk/fdisk.c	Sun Jul 17 21:08:16 2011	(r224150)
> @@ -108,9 +108,9 @@ typedef struct cmd {
>     char		cmd;
>     int			n_args;
>     struct arg {
> -	char	argtype;
> -	int	arg_val;
> -	char	*arg_str;
> +	char		argtype;
> +	unsigned long	arg_val;
> +	char *		arg_str;
>     }			args[MAX_ARGS];
> } CMD;

There was no need to further break the style.

"unsigned long" is spelled u_long in KNF, partly to avoid excessive
indentation like the above, and this spelling was used in some but not
all parts of this file.  But no one knows what the correct indentation
for the above is -- it shouldn't be just 1 or more tabs, since the
file is highly non-KNF, starting with it using a 4-space primary
indentation.

The "*" before arg_str in the above used to be correctly placed.

>
> @@ -990,7 +990,7 @@ parse_config_line(char *line, CMD *comma
> 	    if (isalpha(*cp))
> 		command->args[command->n_args].argtype = *cp++;
> 	    end = NULL;
> -	    command->args[command->n_args].arg_val = strtol(cp, &end, 0);
> +	    command->args[command->n_args].arg_val = strtoul(cp, &end, 0);
>  	    if (cp == end || (!isspace(*end) && *end != '\0')) {
>  		char ch;
>  		end = cp;
>

Also, arg_val is never used as a u_long.  We risk truncation when it is
blindly assigned to variables whose type is never u_long.  It is mostly
assigned to variables of type uint ("uint" is an archaic SysV spelling
of the BSD u_int which is a spelling of unsigned int.  This style bug
was imported from Mach and never fixed).  Since uint is unsigned and ints
have 32 bits, things now work up to UINT32_MAX but no further.  arg_val
is also blindly assigned to a variable of type int, the partition number,
and to a variable of type u_char (the partition type), but since these
values nonsense unless they are small, there is no problem with overflow
near INT32_MAX or UINT32_MAX.

The above still has the usual amount of error checking for the range after
strto*() returns: none.  We don't really care since values larger than
UINT32_MAX don't work for other reasons.  On 32-bit systems, a value of
UINT32_MAX will be returned in overflow cases.  Then we use that value
when we should abort.  On 64-bit systems, values larger than UINT32_MAX
may be returned.  Then assignment to arg_val now doesn't overflow, but
assignment to the uints overflows, giving fail-unsafe behaviour where
i386 has relatively fail-safe behaviour.

Elsewhere, fdisk uses a horrible mixture of int, uint, u_int, u_int32_t,
uint32_t, unsigned long, u_long, long long and u_int64_t.  Sometimes
the type is imposed by a system data structure, but mostly these choices
of types are nonsense or anachronisms.  They might as well almost all
be u_int, since only 32-bit partition offsets and sizes work.  More
careful old code uses u_long, since that was the largest possible and
ints might be 16 bits.  Newer code should us uint32_t.

u_int64_t is used just once, to avoid overflow in conversion from
sectors to MB.  This use is reasonable, but then to print the value
it must be cast to uintmax_t.  It would be better to use uintmax_t for
the value throughout.

The long long abomination is only just once, as an accumulator in a
home made version of strtoul() named decimal().  decimal() actually
has some range checking, but is otherwise much worse than strtoul().
It allows values up to UINT32_MAX, but returns values in an int *.
Callers often want the value in a uint or a u_char.  If they passed a
pointer to their variable, then the type error would be detected, but
such errors are "fixed" by subverting decimal() using the Decimal()
macro (Decimal() calls decimal() with a pointer to a temporary variable
with the matching wrong type 'int', and then blindly assigns the
temporary variable to the actual variable).  The subversion was in
rev.1.1 in 1993, so it apparently came from Mach.

Anyway, decimal() works up to UINT32_MAX although it gives undefined
behaviour above INT_MAX, since the overflow on assignment to the
temporary variable combined with the implementation-defined conversion
of the temporary variable to uint happens to be benign on all supported
arches (it gives back the original uint32_t value).  The type of the
accumulator has to be larger than the type of the maximum value supported
so that the range checking works after the limit of the range has been
exceeded.  Otherwise, the type of the accumulator only needs to be
uint32_t.  strtoul() uses a similar accumulator of type u_long and
makes this work by checking the range before a modified limit is
exceeded.

decimal() is used for most interactive input, including for partition
offsets and sizes.  The bug fixed in this commit must have been rarely
noticed since it only affects command-line input.  The corresponding
bugs for interactive input are merely all the benign overflows above
INT_MAX and the horrible choices of types which mark it hard to see
that all the overflows are benign.

See the r217714 commit thread for more details about bogusness in
decimal().

Bruce