Date: Wed, 7 Jun 2006 10:41:40 +1000 (EST) From: Bruce Evans <bde@zeta.org.au> To: Xin LI <delphij@delphij.net> Cc: freebsd-arch@freebsd.org Subject: Re: getbsize(3): Convert blocksizep to be unsigned long? Message-ID: <20060607090850.U4310@delplex.bde.org> In-Reply-To: <1149557729.6740.3.camel@spirit> References: <1149557729.6740.3.camel@spirit>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 6 Jun 2006, Xin LI wrote: > When I was twiddling df(1)'s code I found that getbsize(3) accepts > blocksizep as long *. Because that the manual page says: > > "The memory referenced by blocksizep is filled in with block size, in > bytes." > > I think it makes no sense for the number to be negative. Is it > reasonable to apply the attached patch? No. This mistake has been committed and backed out once too often already (in 2002). The previous interface chance was from "long *" to "size_t *". This was a larger change since it also increased the size of the pointed-to object on 64-bit machines, thus changing the ABI very unnecessarily. Using unsigned values gives sign extension bugs and breaks overflow checks that can be done in principle for sign values but not unsigned ones. df and statfs() are good examples of the sign extension bugs. statfs() used to supply correctly signed types (i.e., only signed ones), and naive users of statfs() like df still depend on this, but statfs() now supplies a mixture of signed and unsigned types, giving sign extension bugs or requiring lots of casts to avoid them when the types are mixed or unsigned values are subtracted. getbsize() still supports a correctly signed type. I use the following fix for the type error in df.c: %%% Index: df.c =================================================================== RCS file: /home/ncvs/src/bin/df/df.c,v retrieving revision 1.62 diff -u -2 -r1.62 df.c --- df.c 4 Jun 2004 09:30:51 -0000 1.62 +++ df.c 4 Jun 2004 10:15:16 -0000 @@ -363,13 +364,14 @@ prtstat(struct statfs *sfsp, struct maxwidths *mwp) { - static u_long blocksize; - static int headerlen, timesthrough = 0; - static const char *header; - int64_t used, availblks, inodes; + static long blocksize; + static int timesthrough; + const char *header; + int64_t availblks, inodes, used; + int headerlen; if (++timesthrough == 1) { mwp->mntfrom = imax(mwp->mntfrom, (int)strlen("Filesystem")); if (hflag) { - header = " Size"; + header = " Size"; mwp->total = mwp->used = mwp->avail = (int)strlen(header); @@ -440,5 +442,5 @@ update_maxwidths(struct maxwidths *mwp, const struct statfs *sfsp) { - static u_long blocksize = 0; + static long blocksize; int dummy; %%% The patch also fixes some style bugs and has part of a fix for a bug in the output formatting. "long blocksize" is now passed to fsbtoblk() which expects "u_long bs". The interface to fsbtoblk() should be changed too, but there is no actual problem due to the block size easily fitting in a signed long and fsbtoblk() defending against sign botches internally: % static intmax_t % fsbtoblk(int64_t num, uint64_t fsbs, u_long bs) % { % % if (fsbs != 0 && fsbs < bs) % return (num / (intmax_t)(bs / fsbs)); % else % return (num * (intmax_t)(fsbs / bs)); % } Note that `num' needs to be signed here since negative block counts can happen for f_bavail, and it is actually signed. It also needs to be divided by a signed value (with a positive sign) so that it doesn't get promoted to a huge unsigned value when numerator is a small signed value and the type of the denominator is unsigned and longer than the type of the numerator. This was broken in rev.1.52 when the type of fsbs (f_bsize in struct statfs) was broken from long to uint64_t. Rev.1.52 also broken type type for bs (blocksize) from long to u_long, but since u_long is smaller than uint64_t on all supported arches, this made no additional difference to the type pf (bs / fsbs) or (fsbsd / bss) -- that type is always uint64_t. Rev.1.56 works around these sign extension bugs by casting the denominator to int64_t. Rev.1.62 cleans up or down the interface of fsbtoblk() a little. Before rev.1.52, was a macro that operated only on signed types and just worked. Macros can work on "any" types provided that they don't mix unsigned types with signed ones and/or don't overflow. Overflow is avoided by not doing the multiplication (num * fsbs) first. Even this overflow avoidance is now bogus, since off_t has the same size as the block counts (all are 64 bits) so overflow can only occur if the args are garbage, so fsbtoblk() could be the even simpler macro: #define fsbtoblk(num, fsbs, bs) ((num) * (fsbs) / (bs)) provided there are no sign extension botches (the block sizes must be signed if num is signed). Casting things to signed types to avoid the sign extension bugs is worse than using signed types throughout, since it is uglier and strictly needs overflow checks for every conversion (cases where the extra bit provided by unsigned types is actually needed might cause overflow), and would require adding casts to interfaces that could be type-generic modulo signedness, and would generate larger code (see the commit log to rev.1.62). OTOH, casting everything to floating point would work well -- everything would become signed, integral values up to about 2^53 would be represented exactly, and larger values would never cause overflow and would be represented exactly enough. Befor
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060607090850.U4310>