Date: Mon, 18 Jun 2018 18:27:27 -0400
From: Li-Wen Hsu <lwhsu@freebsd.org>
To: Mark Millard <marklmi@yahoo.com>
Cc: Bryan Drewery <bdrewery@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject: Re: A head buildworld race visible in the ci.freebsd.org build history
Message-ID: <CAKBkRUxAfXi81yw93ejcJVpXQ0JetaACFtuS8tFprQvMeWx75A@mail.gmail.com>
In-Reply-To: <BCD47660-EE57-490C-90E8-83FC3B720B09@yahoo.com>
References: <74EAD684-0E0B-453A-B746-156777CF604A@yahoo.com>
 <1884103f-d1fb-aca6-2edd-062e11d05617@FreeBSD.org>
 <BCD47660-EE57-490C-90E8-83FC3B720B09@yahoo.com>

On Mon, Jun 18, 2018 at 5:04 PM Mark Millard via freebsd-toolchain <freebsd-toolchain@freebsd.org> wrote:
>
> On 2018-Jun-18, at 12:42 PM, Bryan Drewery <bdrewery at FreeBSD.org> wrote:
>
> > On 6/15/2018 10:55 PM, Mark Millard wrote:
> >> In watching ci.freebsd.org builds I've seen a notable
> >> number of one time failures, such as (example from
> >> powerpc64):
> >>
> >> --- all_subdir_lib/libufs ---
> >> ranlib -D libufs.a
> >> ranlib: fatal: Failed to open 'libufs.a'
> >> *** [libufs.a] Error code 70
> >>
> >> where the next build works despite the change being
> >> irrelevant to whatever ranlib complained about.
> >>
> >> Other builds failed similarly:
> >>
> >> --- all_subdir_lib/libbsm ---
> >> ranlib -D libbsm_p.a
> >> ranlib: fatal: Failed to open 'libbsm_p.a'
> >> *** [libbsm_p.a] Error code 70
> >>
> >> and:
> >>
> >> --- kerberos5/lib__L ---
> >> ranlib -D libgssapi_spnego_p.a
> >> --- libgssapi_spnego.a ---
> >> ranlib -D libgssapi_spnego.a
> >> --- libgssapi_spnego_p.a ---
> >> ranlib: fatal: Failed to open 'libgssapi_spnego_p.a'
> >> *** [libgssapi_spnego_p.a] Error code 70
> >>
> >> and so on.
> >>
> >>
> >> It is not limited to powerpc64. For example, for aarch64
> >> there are:
> >>
> >> --- libpam_exec.a ---
> >> building static pam_exec library
> >> ar -crD libpam_exec.a `NM='nm' NMFLAGS='' lorder pam_exec.o | tsort -q`
> >> ranlib -D libpam_exec.a
> >> ranlib: fatal: Failed to open 'libpam_exec.a'
> >> *** [libpam_exec.a] Error code 70
> >>
> >> and:
> >>
> >> --- all_subdir_lib/libusb ---
> >> ranlib -D libusb.a
> >> ranlib: fatal: Failed to open 'libusb.a'
> >> *** [libusb.a] Error code 70
> >>
> >> and:
> >>
> >> --- all_subdir_lib/libbsnmp ---
> >> ranlib: fatal: Failed to open 'libbsnmp.a'
> >> --- all_subdir_lib/ncurses ---
> >> --- all_subdir_lib/ncurses/panelw ---
> >> --- panel.pico ---
> >> --- all_subdir_lib/libbsnmp ---
> >> *** [libbsnmp.a] Error code 70
> >>
> >>
> >> Even amd64 gets such:
> >>
> >> --- libpcap.a ---
> >> ranlib -D libpcap.a
> >> ranlib: fatal: Failed to open 'libpcap.a'
> >> *** [libpcap.a] Error code 70
> >>
> >> and:
> >>
> >>
> >> --- libkafs5.a ---
> >> ranlib: fatal: Failed to open 'libkafs5.a'
> >> --- libkafs5_p.a ---
> >> ranlib: fatal: Failed to open 'libkafs5_p.a'
> >> --- cddl/lib__L ---
> >> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lbaselib.c:60:26: note: include the header <ctype.h> or explicitly provide a declaration for 'toupper'
> >> --- kerberos5/lib__L ---
> >> *** [libkafs5_p.a] Error code 70
> >>
> >> make[5]: stopped in /usr/src/kerberos5/lib/libkafs5
> >> --- libkafs5.a ---
> >> *** [libkafs5.a] Error code 70
> >>
> >> and:
> >>
> >>
> >> --- lib__L ---
> >> ranlib -D libclang_rt.asan_cxx-i386.a
> >> ranlib: fatal: Failed to open 'libclang_rt.asan_cxx-i386.a'
> >> *** [libclang_rt.asan_cxx-i386.a] Error code 70
> >>
> >>
> >> (Notice the variability in what .a the ranlib's fail for.)
> >>
> >
> > I looked at this a few days ago and don't believe it's actually a build
> > race. I think there is something wrong with the ar/ranlib on that system
> > or something else. I've found no evidence of concurrent building of the
> > .a files in question.
>
> Looking at a bunch of the failures, spanning multiple
> FreeBSD-head-*-build types of builds, I see only:
>
> NODE_LABELS bhyve_host butler1.nyi.freebsd.org jailer jailer_fast
> NODE_NAME butler1.nyi.freebsd.org
>
> for the failures that I looked at.
>
> So your "on that system" might well be correct.

Thanks for the insight. The build is done in an 11.1-R jail on a -CURRENT
host. butler1.nyi is running r333388 (as a canary), while the other builders
are mostly running r328278. I upgraded a few others and it seems they can
reproduce the issue, so I have now downgraded all the build slaves to
r328278 until we find the root cause.

Li-Wen

-- 
Li-Wen Hsu <lwhsu@FreeBSD.org>
https://lwhsu.org
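
The triage described in this thread (collect the ranlib failures, then check which node each failing build ran on) can be sketched as a small script. This is only a minimal illustration, not anything used on ci.freebsd.org: the function names `summarize` and `tally_by_node` are hypothetical, and it assumes each build's log text contains a `NODE_NAME <host>` line in the same form as quoted above, which may not hold for the real console logs.

```python
import re
from collections import Counter

# Matches the failure line seen in the quoted logs, e.g.
#   ranlib: fatal: Failed to open 'libufs.a'
RANLIB_FAIL = re.compile(r"ranlib: fatal: Failed to open '([^']+)'")

# Assumption: the log text includes a line such as
#   NODE_NAME butler1.nyi.freebsd.org
NODE_NAME = re.compile(r"^\s*NODE_NAME\s+(\S+)", re.MULTILINE)


def summarize(log_text):
    """Return (node name or None, list of archives ranlib failed to open)."""
    node = NODE_NAME.search(log_text)
    archives = RANLIB_FAIL.findall(log_text)
    return (node.group(1) if node else None, archives)


def tally_by_node(logs):
    """Count builds with at least one ranlib open failure, per node."""
    counts = Counter()
    for text in logs:
        node, archives = summarize(text)
        if archives:
            counts[node] += 1
    return counts
```

If every failing build tallies to a single node while other nodes show none, that points at the host rather than at a race in the build itself, which is the pattern reported above for butler1.nyi.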