Date:      Wed, 23 May 2012 18:57:25 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        gnn@freebsd.org
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, "David E. O'Brien" <obrien@freebsd.org>, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r235797 - head/contrib/gcc
Message-ID:  <20120523173439.I902@besplex.bde.org>
In-Reply-To: <861umcx7oe.wl%gnn@neville-neil.com>
References:  <201205221818.q4MII7lk019626@svn.freebsd.org> <20120523050739.H3621@besplex.bde.org> <861umcx7oe.wl%gnn@neville-neil.com>

On Tue, 22 May 2012 gnn@freebsd.org wrote:

> At Wed, 23 May 2012 06:05:06 +1000 (EST),
> Bruce Evans wrote:
>> [... excessive quoting deleted]
>
>> Now I prefer the old way, after fixing the bugs found by switching.
>> It finds more bugs under FreeBSD, and is bug for bug compatible with
>> distribution gcc and probably with other systems' headers under
>> !FreeBSD.  Except under FreeBSD, I prefer %q to be an error.  The above
>> shows it only being used 12 times in /sys, with most uses of it being
>> bugs.  Fixing these bugs would leave about 1 correct use of it -- for
>> printing the quad_t in unreachable code in msdosfs.  This use would be
>> easy to avoid too (just cast to uintmax_t).  Uses in userland are
>> hopefully easy to fix and avoid too.  In an old userland, there were
>> only about 50, with most in netstat/inet6.c, tcopy.c, ftpd, fsirand,
>> and contrib'ed code.  Of course you can't make it an error for the
>> contrib'ed code.
>
> Instead of replying to the original commit I'll just add to the chain.

Please don't quote the whole thing, especially when not replying to any
of it.

> Note that this change broke buildworld:
>
> src/usr.sbin/ppp/throughput.c: In function 'throughput_disp':
> src/usr.sbin/ppp/throughput.c:119: warning: format '%6qu' expects type 'long unsigned int', but argu
>
> And several more.

Of course ppp has several style bugs and type mismatches, else its brokenness
wouldn't have been exposed:
- the variables have the bogus type "unsigned long long"
- they are printed with the mismatched format "%6qu", although there is a
   perfectly bogus format "%6llu" that matches them exactly.
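
A minimal sketch of the matched case, for illustration only.  This is not
the ppp code; format_octets() and its parameter names are hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from ppp: format a counter of type
 * unsigned long long into buf.  "%6llu" matches the argument type
 * exactly, so a format checker has nothing to complain about.  Using
 * "%6qu" with the same argument is the kind of mismatch that the
 * warning quoted above reports.
 */
int
format_octets(char *buf, size_t len, unsigned long long octets)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%6llu", octets));
}
```

A caller would pass a buffer and its size, e.g.
format_octets(buf, sizeof(buf), count), and get back the number of
characters that the formatted value needs.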

Another point is that libc printf doesn't really support %q.  It just
maps %q to %ll and accesses the arg using va_arg(ap, long long).  Thus
it is technically correct in userland for the format checker to also
map %q to %ll and disallow printing int64_t with each in the LP64 case.
This is also bug for bug compatible with the gcc distribution and
perhaps with other OS's.  However the kernel printf is more careful.
It doesn't map %q to %ll, and it accesses %q args using va_arg(ap, quad_t).
Thus it is technically correct in the kernel for the format checker to also
_not_ map %q to %ll and _allow_ printing int64_t with %q.
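
The difference in how the argument is fetched can be sketched roughly as
follows.  This is not the libc or kernel source; fetch_q_arg() and
my_quad_t are hypothetical, and my_quad_t is assumed to be int64_t as a
stand-in for quad_t:

```c
#include <stdarg.h>
#include <stdint.h>

/* Assumption: stand-in for quad_t so the sketch builds anywhere. */
typedef int64_t my_quad_t;

/*
 * Hypothetical sketch of fetching the argument for a %q conversion.
 * kernel_style == 0: treat %q as %ll and fetch a long long (the libc
 * behaviour described above).
 * kernel_style != 0: fetch the quad type itself (the kernel behaviour
 * described above).
 * The caller must pass an argument of the type that will be fetched.
 */
intmax_t
fetch_q_arg(int kernel_style, ...)
{
	va_list ap;
	intmax_t val;

	va_start(ap, kernel_style);
	if (kernel_style)
		val = va_arg(ap, my_quad_t);
	else
		val = va_arg(ap, long long);
	va_end(ap);
	return (val);
}
```

On LP64 both branches read 64 bits, but the types differ (long versus
long long), and that type difference is all the format checker looks at.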

However, %q should never be used.  The history of the kernel printf
shows that I objected to adding %q support to it in 1999, but not
strongly enough to keep it out.  I should have objected more to adding
support for %ll.  This was slightly before C99 came out.  %j support
was needed before then but was not added until 2002.  I had prepared
for killing long long and quad_t in the kernel by removing many uses
of them.  When the %q mistake was committed, there were just 90
instances of 'long long' in /sys.  61 of these were in the i386
FPU emulator.  20 were in other code that I didn't care about and/or
was soon to go away:
     alpha: 6 (half wrong and/or easy to avoid)
     coda: 3 (all wrong and/or easy to avoid)
     svr4: 4 (all wrong and/or easy to avoid)
     vinum: 4 (all easy to avoid?)
     powerpc: 3 (cloned from wrong ones for alpha)

The remaining 9 were:
     3 that I was responsible for in i386 clock code and a copy for pc98
       (to avoid overflow in multiplication)
     2 that I was responsible for in i386 basic typedefs (only used for non-gcc)
     1 that I was responsible for in sys/param.h
       (to avoid overflow in shift)
     3 in ntp code (all wrong and easy to avoid.  The code wanted precisely
       64 bits and assumed that long long gave that.  The type that gives
       64 bits, int64_t, has been standard in FreeBSD since FreeBSD-2.  It
       came from 4.4BSD-Lite1.)
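
A minimal sketch of asking for exactly 64 bits instead of assuming that
long long gives them.  scale_count() and its parameters are hypothetical
and are not the clock or ntp code referred to above:

```c
#include <limits.h>
#include <stdint.h>

/*
 * The fixed-width types are exactly 64 bits wherever they exist;
 * long long is only guaranteed to be at least 64 bits.
 */
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

/*
 * Hypothetical helper: multiply two 32-bit values without overflow by
 * widening one operand to 64 bits before the multiplication.
 */
uint64_t
scale_count(uint32_t count, uint32_t freq)
{
	return ((uint64_t)count * freq);
}
```

Without the cast the multiplication would be done in 32 bits and wrap
before the result is widened.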

While counting the above, I noticed that in the middle of 1999 just
before the %q changes, there were about 700 references to int64_t (now
counting ones in comments).  Just 1 year earlier, there were only about
300.  The count of long longs didn't change much in that time.  For
quad_t, the raw counts were 419 in the middle of 1998 and 441 in the
middle of 1999.  About half were in nfs (183) and libkern (64).  Now
/sys has and oldnfs has 58.  For long long, the raw count is 1577.
So, 13 years after C99 stdint.h made long long unnecessary and after
mostly avoiding long long when it was only a gcc misfeature, the use
of long long has exploded by a factor of about 1577/91 = 17 :-(.
The use of int64_t has also exploded: its raw count is 66730 (!);
it has exploded by a factor of about 66730/700 = 95.  int32_t is only
used 46715 times.  I fear that many of the uses of fixed width types
are wrong too.  The fixed widths become part of both APIs and ABIs
so they will be hard to change to support 128-bit things.  off_t is
only used 2223 times.  It has the opposite problem that its API is
perfect and supports it having any width, but you can't actually change
its width easily since bugware will have assumptions that it is precisely
64 bits, just like old bugware assumed that it and long are precisely 32
bits, but multiplied by a bloat factor of 95 or so.  intmax_t is only used
1090 times.  It has the same expansion problems for ABIs that long used to
have.
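
For illustration, code that makes no assumption about the width of off_t
might look like the sketch below.  format_offset() is a hypothetical
helper, and it assumes only that off_t is a signed integer type that fits
in intmax_t:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: print an off_t without assuming that it is 32
 * or 64 bits wide.  The cast to intmax_t plus the 'j' length modifier
 * keeps working if the width of off_t changes.
 */
int
format_offset(char *buf, size_t len, off_t off)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%jd", (intmax_t)off));
}
```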

Bruce


