Date:      Wed, 9 Aug 2017 12:22:20 -0700
From:      Bryan Drewery <bdrewery@FreeBSD.org>
To:        David Wolfskill <david@catwhisker.org>, "current@freebsd.org" <current@freebsd.org>
Cc:        sjg@freebsd.org
Subject:   Re: Apparent race in buildworld (head/amd64, r322214 -> r322304)
Message-ID:  <2d6ecd49-2bcc-0c24-8854-63079c0eef6b@FreeBSD.org>
In-Reply-To: <20170809175724.GC1244@albert.catwhisker.org>
References:  <20170809120436.GY1244@albert.catwhisker.org> <689a8aa1-c8a3-a8e3-bc01-8bec5c212b41@FreeBSD.org> <20170809175724.GC1244@albert.catwhisker.org>

> 	/usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s


On 8/9/2017 10:57 AM, David Wolfskill wrote:
> On Wed, Aug 09, 2017 at 10:49:04AM -0700, Bryan Drewery wrote:
>> ...
>>> on one machine, but the other never had an issue.  On the "failing" one,
>>> a re-start of the buildworld completed (apparently) successfully.
>>
>> Yeah, I've gotten reports of this one for years.  I fixed a few problems
>> with it in the past but something else must have crept in.
>
> Or I just got "lucky." :-)
>
>> I don't believe it is related to META_MODE though.
>
> Fair enough; I pointed it out just in case it might be relevant.  (I try
> to avoid hiding possibly-relevant information when I'm trying to work
> with someone to solve a problem.  I know that's weird, but... :-} )
>
>> The last time I fixed this (AFAIK) it was related to an early error
>> being ignored.  I'll review your log to see if I can find anything like
>> that.
>
> Cool.  FWIW, the scheduler will see 8 cores on each machine, so the
> "make buildworld" will have been "make -j16 buildworld" (on each).
>
>> ....
>

This should fix it:
https://people.freebsd.org/~bdrewery/patches/gcc_s-install-race.diff

The problem has consistently been, from your reports, that gcc_s is
being installed to WORLDTMP *while* something is trying to link to it.

> --- gnu/lib/libgcc__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/gnu/lib/libgcc/_libinstall
> --- kerberos5/lib/libhx509__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/kerberos5/lib/libhx509/keyset.So
> --- secure/lib/libssl__L ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
>
>
> Building /common/S3/obj/usr/src/world32/usr/src/gnu/lib/libgcc/_libinstall
> --- lib/ncurses/ncursesw__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncursesw/nc_panel.po
> --- lib/ncurses/ncurses__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncurses/comp_parse.po
> --- lib/ncurses/ncursesw__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncursesw/resizeterm.po
> --- lib/libc++__L ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
>
> --- lib/libgcc_s__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/lib/libgcc_s/_libinstall
> --- kerberos5/lib/libwind__L ---
> --- obj ---
> --- secure/lib/libcrypto__L ---
> --- all_subdir_secure/lib/libcrypto/engines/libatalla ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
> cc: error: linker command failed with exit code 1 (use -v to see invocation)
> --- all_subdir_secure/lib/libcrypto/engines/libsureware ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s



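By default, install(1) unlinks the existing target and then copies the new
file into place, so there is a window in which the library does not exist.
With PRECIOUSLIB set we also get install's -S flag, which copies to a
temporary file and then renames it over the target, making the
installation atomic.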
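
To make the difference concrete, here is a minimal sketch of the two
strategies. It is not install(1)'s actual code; the function names
install_unsafe and install_safe are made up for illustration, and modes,
ownership and file flags are ignored. The first function mirrors the
unlink-then-copy behavior; the second mirrors the temporary-file-plus-rename
approach that -S is documented to use.

```python
import os
import shutil
import tempfile


def install_unsafe(src, dst):
    """Unlink-then-copy: dst is missing or partial between the two steps."""
    if os.path.exists(dst):
        os.unlink(dst)
    # A concurrent linker looking for dst here would not find it.
    shutil.copyfile(src, dst)


def install_safe(src, dst):
    """Copy to a temporary file next to dst, then rename it over dst."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temporary file must be on the same filesystem as dst for the
    # rename to be atomic, so create it in dst's own directory.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".inst.")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
        # rename(2) semantics: readers see either the old or the new file.
        os.replace(tmp, dst)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With the second form, a parallel link step that opens the library at any
point sees a complete file, either the previous copy or the new one.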

Note the patch is not what I will commit. At Isilon we changed our
install to always use -S for library installation, but not to force schg
on.  I am considering making that change the default, to use -S for all
libraries.


-- 
Regards,
Bryan Drewery




