Date: Wed, 9 Aug 2017 12:22:20 -0700
From: Bryan Drewery <bdrewery@FreeBSD.org>
To: David Wolfskill <david@catwhisker.org>, "current@freebsd.org" <current@freebsd.org>
Cc: sjg@freebsd.org
Subject: Re: Apparent race in buildworld (head/amd64, r322214 -> r322304)
Message-ID: <2d6ecd49-2bcc-0c24-8854-63079c0eef6b@FreeBSD.org>
In-Reply-To: <20170809175724.GC1244@albert.catwhisker.org>
References: <20170809120436.GY1244@albert.catwhisker.org> <689a8aa1-c8a3-a8e3-bc01-8bec5c212b41@FreeBSD.org> <20170809175724.GC1244@albert.catwhisker.org>

> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s

On 8/9/2017 10:57 AM, David Wolfskill wrote:
> On Wed, Aug 09, 2017 at 10:49:04AM -0700, Bryan Drewery wrote:
>> ...
>>> on one machine, but the other never had an issue. On the "failing" one,
>>> a re-start of the buildworld completed (apparently) successfully.
>>
>> Yeah, I've gotten reports of this one for years. I fixed a few problems
>> with it in the past but something else must have creeped in.
>
> Or I just got "lucky." :-)
>
>> I don't believe it is related to META_MODE though.
>
> Fair enough; I pointed it out just in case it might be relevant. (I try
> to avoid hiding possibly-relevant information when I'm trying to work
> with someone to solve a problem. I know that's weird, but... :-} )
>
>> The last time I fixed this (AFAIK) it was related to an early error
>> being ignored. I'll review your log to see if I can find anything like
>> that.
>
> Cool. FWIW, the scheduler will see 8 cores on each machine, so the
> "make buildworld" will have been "make -j16 buildworld" (on each).
>
>> ....
>

This should fix it:
https://people.freebsd.org/~bdrewery/patches/gcc_s-install-race.diff

The problem has consistently been, from your reports, that gcc_s is
being installed to WORLDTMP *while* something is trying to link to it.

> --- gnu/lib/libgcc__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/gnu/lib/libgcc/_libinstall
> --- kerberos5/lib/libhx509__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/kerberos5/lib/libhx509/keyset.So
> --- secure/lib/libssl__L ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
>
>
> Building /common/S3/obj/usr/src/world32/usr/src/gnu/lib/libgcc/_libinstall
> --- lib/ncurses/ncursesw__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncursesw/nc_panel.po
> --- lib/ncurses/ncurses__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncurses/comp_parse.po
> --- lib/ncurses/ncursesw__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncursesw/resizeterm.po
> --- lib/libc++__L ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
>
> --- lib/libgcc_s__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/lib/libgcc_s/_libinstall
> --- kerberos5/lib/libwind__L ---
> --- obj ---
> --- secure/lib/libcrypto__L ---
> --- all_subdir_secure/lib/libcrypto/engines/libatalla ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
> cc: error: linker command failed with exit code 1 (use -v to see invocation)
> --- all_subdir_secure/lib/libcrypto/engines/libsureware ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s

By default, 'install' unlinks the existing file and then copies the new
file into place, so there is a window in which the library does not
exist. With PRECIOUSLIB set, install is run with the -S flag, which
makes the installation atomic.

Note that this patch is not what I will commit. At Isilon we changed
our install to always use -S for library installation, without forcing
schg on.
I am considering making that change the default, to use -S for all
libraries.

-- 
Regards,
Bryan Drewery
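
For illustration, the difference between the two install strategies
described above can be sketched in Python. This is a minimal sketch of
the general technique only: it is not the install(1) implementation and
not the contents of the patch. The function names
install_unlink_then_copy and install_safe and the ".inst." temporary
prefix are hypothetical, and it assumes a POSIX filesystem where a
rename within one directory is atomic.

```python
import os
import shutil
import tempfile


def install_unlink_then_copy(src, dst):
    """Illustrative default-style install: remove the target, then copy."""
    if os.path.exists(dst):
        os.unlink(dst)
    # Race window: dst does not exist here, so a concurrent link step
    # looking for it (e.g. -lgcc_s) can fail with "cannot find".
    shutil.copyfile(src, dst)


def install_safe(src, dst):
    """Illustrative safe copy: write a temporary file, then rename it."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temporary file lives in the target directory so the final
    # rename stays on one filesystem (hypothetical ".inst." prefix).
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".inst.")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
        # Replaces dst in one step; readers see the old or the new file.
        os.replace(tmp, dst)
    except BaseException:
        # On failure the existing target is left untouched.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With the second function, a concurrent reader that opens dst sees either
the old file or the new one, never a missing file, which is the property
the failing link steps in the log need.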