Date:      Wed, 8 Jun 2016 23:15:39 +0200
From:      Dimitry Andric <dim@FreeBSD.org>
To:        Gerald Pfeifer <gerald@pfeifer.com>
Cc:        Andreas Tobler <andreast@FreeBSD.org>, freebsd-toolchain@freebsd.org
Subject:   Re: Duplicate OPT_ entries in gcc/options.h
Message-ID:  <75411813-0C9B-4CEF-BEE4-8B26DD8346F7@FreeBSD.org>
In-Reply-To: <alpine.LSU.2.20.1606082038000.2798@anthias.pfeifer.com>
References:  <alpine.LSU.2.20.1606082038000.2798@anthias.pfeifer.com>

On 08 Jun 2016, at 21:11, Gerald Pfeifer <gerald@pfeifer.com> wrote:
>
> I got a user report, and could reproduce this, that building
> GCC (lang/gcc, but also current HEAD, so probably pretty much
> any version) with FreeBSD 11 and LANG = en_US.UTF-8 we get
> conflicting entries in $BUILDDIR/gcc/options.h such as
>
>  OPT_d = 135,                               /* -d */
>  OPT_D = 136,                               /* -D */
>  OPT_d = 137,                               /* -d */
>  OPT_D = 138,                               /* -D */
>  OPT_d = 141,                               /* -d */
>  OPT_D = 142,                               /* -D */
>  OPT_d = 143,                               /* -d */
>
> Using LANG = en_US (without UTF-8), everything works fine.
>=20
> Any ideas what might be going on here?  (This is done via
> AWK scripts from what I can tell, does this trigger any
> ideas?)

It is definitely something caused by our awk in base, in any case.
First opt-gather.awk is run to generate a flat list of all options:

  /usr/bin/awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-gather.awk \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/ada/gcc-interface/lang.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/fortran/lang.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/go/lang.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/java/lang.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/lto/lang.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/c-family/c.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/common.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/fused-madd.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/i386/i386.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/rpath.opt \
    /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/freebsd.opt > tmp-optionlist

Then opt-functions.awk, opt-read.awk and opth-gen.awk are run together to
process optionlist into options.h:

  /usr/bin/awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-functions.awk \
    -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-read.awk \
    -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opth-gen.awk < optionlist > options.h

If I run the first step using LANG=C, or without any LANG setting, both
optionlist and options.h are as expected.  If I run the first step using
LANG=en_US.UTF-8, the optionlist is sorted differently.  For example, the
"good" optionlist has the uppercase D options first, and much later the
lowercase d options:

  D^\C ObjC C++ ObjC++ Joined Separate MissingArgError(macro name missing after %qs)^\-D<macro>[=<val>]   Define a <macro> with <val> as its value.  If just <macro> is given, <val> is taken to be 1
  D^\Driver Joined Separate
  D^\Fortran Joined Separate
  ... much later in the file, after all options starting with an uppercase letter ...
  d^\C ObjC C++ ObjC++ Joined
  d^\Common Joined^\-d<letters>   Enable dumps from specific passes of the compiler
  d^\Fortran Joined
  d^\Java Separate SeparateAlias Alias(foutput-class-dir=)
The "bad" optionlist has the upper and lower case d options sorted
together:

  d^\C ObjC C++ ObjC++ Joined
  D^\C ObjC C++ ObjC++ Joined Separate MissingArgError(macro name missing after %qs)^\-D<macro>[=<val>]   Define a <macro> with <val> as its value.  If just <macro> is given, <val> is taken to be 1
  d^\Common Joined^\-d<letters>   Enable dumps from specific passes of the compiler
  D^\Driver Joined Separate
  defsym=^\Driver JoinedOrMissing
  defsym^\Driver Separate
  d^\Fortran Joined
  D^\Fortran Joined Separate
  d^\Java Separate SeparateAlias Alias(foutput-class-dir=)
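
If the later scripts expect records with the same option name to be
adjacent (an assumption, since those scripts are not quoted here), this
interleaving would explain the repeated OPT_d and OPT_D entries.  A
minimal sketch for spotting such interleaving in a generated optionlist
follows.  The helper name check_contiguous is hypothetical, and the field
separator is assumed to be the 0x1C byte that is displayed as ^\ in the
listings above.

```shell
# Report option names whose records are not all adjacent.
# Input: optionlist-style records of the form "name<0x1C>flags...".
# check_contiguous is a hypothetical helper, not part of the GCC build.
check_contiguous() {
    awk 'BEGIN { FS = "\034" }   # assumed separator, shown as ^\ above
         {
             # A name that reappears after a different name has
             # intervened means its records were split apart.
             if ($1 != prev && ($1 in seen))
                 print "split run: " $1
             seen[$1] = 1
             prev = $1
         }' "$@"
}

# Three records in the "bad" order (d, D, d):
printf 'd\034C\nD\034C\nd\034Common\n' | check_contiguous
```

It can also be given a file name, as in "check_contiguous optionlist".
A list in which every name forms one contiguous run produces no output.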

Note that GNU awk does *not* show this difference: it produces the same
optionlist file with LANG=C and with LANG=en_US.UTF-8.

opt-gather.awk's sorting function looks like this:

  function sort(ARRAY, ELEMENTS)
  {
          for (i = 2; i <= ELEMENTS; ++i) {
                  for (j = i; ARRAY[j-1] > ARRAY[j]; --j) {
                          temp = ARRAY[j]
                          ARRAY[j] = ARRAY[j-1]
                          ARRAY[j-1] = temp
                  }
          }
          return
  }

So I am assuming that the ARRAY[j-1] > ARRAY[j] comparison works
differently in our awk, depending on the LANG settings.  No idea when
that changed, though, if it changed at all...
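
One way to test this assumption on a given system is to run the same
insertion sort over a few of the option names from the lists above under
different locales.  The sketch below is not taken from the GCC sources:
the wrapper name sort_names is hypothetical, and the local variables and
the "j > 1" guard are additions to the quoted function.  It also assumes
that the en_US.UTF-8 locale is installed.

```shell
# Sort three option names with the insertion sort quoted above and print
# them on one line.  The first argument is the locale to run awk under.
# sort_names is a hypothetical helper for illustration only.
sort_names() {
    printf 'd\nD\ndefsym\n' | LC_ALL="$1" awk '
        # Same algorithm as in opt-gather.awk, with local variables
        # (i, j, temp) and an explicit lower bound on j added.
        function sort(ARRAY, ELEMENTS,    i, j, temp) {
            for (i = 2; i <= ELEMENTS; ++i) {
                for (j = i; j > 1 && ARRAY[j-1] > ARRAY[j]; --j) {
                    temp = ARRAY[j]
                    ARRAY[j] = ARRAY[j-1]
                    ARRAY[j-1] = temp
                }
            }
        }
        { names[NR] = $0 }
        END {
            sort(names, NR)
            for (k = 1; k <= NR; k++)
                printf "%s%s", names[k], (k < NR ? " " : "\n")
        }'
}

sort_names C             # byte order: D d defsym
sort_names en_US.UTF-8   # may differ, depending on the awk in use
```

If the second call prints a different order than the first, the string
comparison in that awk follows the locale's collation, which POSIX permits
for awk.  In that case, running the opt-gather.awk step with LC_ALL=C, like
the LANG=C run described above, should give the expected optionlist.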

-Dimitry




