Date: Mon, 15 Nov 2021 15:43:49 -0800
From: Mark Millard via freebsd-current <freebsd-current@freebsd.org>
To: freebsd-current <freebsd-current@freebsd.org>, "freebsd-arm@freebsd.org" <arm@freebsd.org>
Subject: Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)
Message-ID: <9BF4F65B-6437-4D88-AF34-9BCFBF90D6F3@yahoo.com>
In-Reply-To: <E7C678B0-B0E1-4802-9362-9C2C92558202@yahoo.com>
References: <2CA61249-321C-45AA-9755-597146AB8E9F@yahoo.com> <65AA4BCD-EC4B-4A19-B750-C7FC6E5ADDF5@yahoo.com> <E7C678B0-B0E1-4802-9362-9C2C92558202@yahoo.com>
On 2021-Nov-15, at 13:13, Mark Millard <marklmi@yahoo.com> wrote:

> On 2021-Nov-15, at 12:51, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On 2021-Nov-15, at 11:31, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> I updated from (shown on a system that I've not updated yet):
>>>
>>> # uname -apKU
>>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 main-n250455-890cae197737-dirty: Thu Nov 4 13:43:17 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400040 1400040
>>>
>>> to:
>>>
>>> # uname -apKU
>>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400042 1400042
>>>
>>> and then updated /usr/ports/ and started poudriere-devel based builds of
>>> the ports I'd set up to use. However my last round of port builds from
>>> a general update of /usr/ports/ were on 2021-10-23 before either of the
>>> above.
>>>
>>> I've had at least two files that seem to be corrupted, where a later part
>>> of the build hits problematical file(s) from earlier build activity. For
>>> example:
>>>
>>> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character ignored [-Wnull-character]
>>> <U+0000>
>>> ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character ignored [-Wnull-character]
>>> <U+0000><U+0000>
>>> ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character ignored [-Wnull-character]
>>> <U+0000><U+0000><U+0000>
>>> ^
>>> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character ignored [-Wnull-character]
>>> <U+0000><U+0000><U+0000><U+0000>
>>> ^
>>> . . .
>>>
>>> Removing the xorgproto-2021.4 package and rebuilding via
>>> poudriere-devel did not get a failure of any ports dependent
>>> on it.
>>>
>>> This was from a use of:
>>>
>>> # poudriere jail -j13_0R-CA7 -i
>>> Jail name: 13_0R-CA7
>>> Jail version: 13.0-RELEASE-p5
>>> Jail arch: arm.armv7
>>> Jail method: null
>>> Jail mount: /usr/obj/DESTDIRs/13_0R-CA7-poud
>>> Jail fs:
>>> Jail updated: 2021-11-04 01:48:49
>>> Jail pkgbase: disabled
>>>
>>> but another not-investigated example was from:
>>>
>>> # poudriere jail -j13_0R-CA72 -i
>>> Jail name: 13_0R-CA72
>>> Jail version: 13.0-RELEASE-p5
>>> Jail arch: arm64.aarch64
>>> Jail method: null
>>> Jail mount: /usr/obj/DESTDIRs/13_0R-CA72-poud
>>> Jail fs:
>>> Jail updated: 2021-11-04 01:48:01
>>> Jail pkgbase: disabled
>>>
>>> (so no 32-bit COMPAT involved). The apparent corruption
>>> was in a different port (autoconf, noticed by the
>>> build of automake failing via config reporting
>>> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
>>> being rejected).
>>>
>>> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
>>> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
>>> system versions.
>>>
>>> The media is an Optane 960 in the PCIe slot of a HoneyComb
>>> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
>>> used in order to have bectl, not redundancy.
>>>
>>> The ThreadRipper 1950X (so amd64) port builds did not give
>>> evidence of such problems based on the updated system. (Also
>>> Optane media in a PCIe slot, also root on ZFS.) But the
>>> errors seem rare enough to not be able to conclude much.
>>
>> For aarch64 targeting aarch64 there was also this
>> explicit corruption notice during the poudriere(-devel)
>> bulk build:
>>
>> . . .
>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .........
>> pkg-static: Fail to extract /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma library error: Corrupted input data
>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
>>
>> Failed to install the following 1 package(s): /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
>> *** Error code 1
>> Stop.
>> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
>>
>> I'm not yet to the point of retrying after removing
>> arm-none-eabi-gcc-8.4.0_3 : other things are being built.
>
>
> Another context with my prior general update of /usr/ports/
> and the matching port builds: Back then I used USE_TMPFS=all
> but the failure is based on USE_TMPFS="data" instead. So:
> lots more I/O.
>

None of the 3 corruptions repeated during bulk builds that
retried the builds that generated the files. All of the
ports that failed by hitting the corruptions in what they
depended on built fine in the retries.

For reference:

I'll note that, back when I was using USE_TMPFS=all , I also
did some separate bulk -a test runs, both aarch64 (Cortex-A72)
native and Cortex-A72 targeting Cortex-A7 (armv7). None of
those showed evidence of file corruptions. In general I've
not had previous file corruptions with this system. (There
was a little more than 245 GiBytes swap, which covered the
tmpfs needs when they were large.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
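
Since the XvMC.h case showed up as a run of null characters at the very start of the file, one way to look for other affected files before a dependent build trips over them might be a scan like the one below. This is only a sketch under assumptions: the function names, the 4-byte threshold, and the `.h` suffix filter are hypothetical choices, not anything from the builds above, and the report does not say whether the autoconf.m4f or lto1 corruptions had this same NUL-prefix form, so it may not catch them.

```python
import os


def leading_nul_count(path, limit=4096):
    """Count the NUL bytes at the start of a file, reading at most `limit` bytes."""
    with open(path, "rb") as f:
        head = f.read(limit)
    # lstrip removes only the leading run of NUL bytes.
    return len(head) - len(head.lstrip(b"\0"))


def find_suspect_files(root, min_nuls=4, suffixes=(".h",)):
    """Walk `root` and return sorted paths of files that begin with NUL bytes.

    `min_nuls` and `suffixes` are illustrative defaults (assumptions):
    text headers should never start with NUL, so even a short run is suspect.
    """
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                if leading_nul_count(path) >= min_nuls:
                    suspects.append(path)
            except OSError:
                # Unreadable entries (dangling symlinks, permissions) are skipped.
                continue
    return sorted(suspects)
```

For example, `find_suspect_files("/usr/local/include")` run inside the affected jail would list headers that start with at least four NUL bytes. An empty result says nothing about corruption elsewhere in a file.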