From nobody Sat Nov 20 06:20:40 2021
From: Mark Millard via freebsd-current
X-Original-From: Mark Millard
Reply-To: marklmi@yahoo.com
To: freebsd-current, "freebsd-arm@freebsd.org"
Subject: Re: aarch64(?) poudiere-devel based builds seem to get fairly-rare corrupted files after recent system update(s?)
Date: Fri, 19 Nov 2021 22:20:40 -0800
Message-Id: <0006EB30-B9F9-465A-8B9A-A0C03899CEFC@yahoo.com>
In-Reply-To: <4B591638-4693-4403-8549-88D7A1D9D669@yahoo.com>
References: <2CA61249-321C-45AA-9755-597146AB8E9F@yahoo.com>
 <65AA4BCD-EC4B-4A19-B750-C7FC6E5ADDF5@yahoo.com>
 <9BF4F65B-6437-4D88-AF34-9BCFBF90D6F3@yahoo.com>
 <4B591638-4693-4403-8549-88D7A1D9D669@yahoo.com>
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On 2021-Nov-18, at 12:15, Mark Millard wrote:

> On 2021-Nov-17, at 11:17, Mark Millard wrote:
>
>> On 2021-Nov-15, at 15:43, Mark Millard wrote:
>>
>>> On 2021-Nov-15, at 13:13, Mark Millard wrote:
>>>
>>>> On 2021-Nov-15, at 12:51, Mark Millard wrote:
>>>>
>>>>> On 2021-Nov-15, at 11:31, Mark Millard wrote:
>>>>>
>>>>>> I updated from (shown a system that I've not updated yet):
>>>>>>
>>>>>> # uname -apKU
>>>>>> FreeBSD CA72_4c8G_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #18 main-n250455-890cae197737-dirty: Thu Nov 4 13:43:17 PDT 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400040 1400040
>>>>>>
>>>>>> to:
>>>>>>
>>>>>> # uname -apKU
>>>>>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #19 main-n250667-20aa359773be-dirty: Sun Nov 14 02:57:32 PST 2021 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400042 1400042
>>>>>>
>>>>>> and then updated /usr/ports/ and started poudriere-devel based builds of
>>>>>> the ports I'd set up to use.
>>>>>> However my last round of port builds from
>>>>>> a general update of /usr/ports/ was on 2021-10-23 before either of the
>>>>>> above.
>>>>>>
>>>>>> I've had at least two files that seem to be corrupted, where a later part
>>>>>> of the build hits problematical file(s) from earlier build activity. For
>>>>>> example:
>>>>>>
>>>>>> /usr/local/include/X11/extensions/XvMC.h:1:1: warning: null character ignored [-Wnull-character]
>>>>>>
>>>>>> ^
>>>>>> /usr/local/include/X11/extensions/XvMC.h:1:2: warning: null character ignored [-Wnull-character]
>>>>>>
>>>>>> ^
>>>>>> /usr/local/include/X11/extensions/XvMC.h:1:3: warning: null character ignored [-Wnull-character]
>>>>>>
>>>>>> ^
>>>>>> /usr/local/include/X11/extensions/XvMC.h:1:4: warning: null character ignored [-Wnull-character]
>>>>>>
>>>>>> ^
>>>>>> . . .
>>>>>>
>>>>>> Removing the xorgproto-2021.4 package and rebuilding via
>>>>>> poudriere-devel did not get a failure of any ports dependent
>>>>>> on it.
>>>>>>
>>>>>> This was from a use of:
>>>>>>
>>>>>> # poudriere jail -j13_0R-CA7 -i
>>>>>> Jail name: 13_0R-CA7
>>>>>> Jail version: 13.0-RELEASE-p5
>>>>>> Jail arch: arm.armv7
>>>>>> Jail method: null
>>>>>> Jail mount: /usr/obj/DESTDIRs/13_0R-CA7-poud
>>>>>> Jail fs:
>>>>>> Jail updated: 2021-11-04 01:48:49
>>>>>> Jail pkgbase: disabled
>>>>>>
>>>>>> but another not-investigated example was from:
>>>>>>
>>>>>> # poudriere jail -j13_0R-CA72 -i
>>>>>> Jail name: 13_0R-CA72
>>>>>> Jail version: 13.0-RELEASE-p5
>>>>>> Jail arch: arm64.aarch64
>>>>>> Jail method: null
>>>>>> Jail mount: /usr/obj/DESTDIRs/13_0R-CA72-poud
>>>>>> Jail fs:
>>>>>> Jail updated: 2021-11-04 01:48:01
>>>>>> Jail pkgbase: disabled
>>>>>>
>>>>>> (so no 32-bit COMPAT involved).
>>>>>> The apparent corruption
>>>>>> was in a different port (autoconf, noticed by the
>>>>>> build of automake failing via config reporting
>>>>>> /usr/local/share/autoconf-2.69/autoconf/autoconf.m4f
>>>>>> being rejected).
>>>>>>
>>>>>> /usr/obj/DESTDIRs/13_0R-CA7-poud/ and
>>>>>> /usr/obj/DESTDIRs/13_0R-CA72-poud/ and the like track the
>>>>>> system versions.
>>>>>>
>>>>>> The media is an Optane 960 in the PCIe slot of a HoneyComb
>>>>>> (16 Cortex-A72's). The context is a root on ZFS one, ZFS
>>>>>> used in order to have bectl, not redundancy.
>>>>>>
>>>>>> The ThreadRipper 1950X (so amd64) port builds did not give
>>>>>> evidence of such problems based on the updated system. (Also
>>>>>> Optane media in a PCIe slot, also root on ZFS.) But the
>>>>>> errors seem rare enough to not be able to conclude much.
>>>>>
>>>>> For aarch64 targeting aarch64 there was also this
>>>>> explicit corruption notice during the poudriere(-devel)
>>>>> bulk build:
>>>>>
>>>>> . . .
>>>>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3: .........
>>>>> pkg-static: Fail to extract /usr/local/libexec/gcc/arm-none-eabi/8.4.0/lto1 from package: Lzma library error: Corrupted input data
>>>>> [CA72_ZFS] Extracting arm-none-eabi-gcc-8.4.0_3... done
>>>>>
>>>>> Failed to install the following 1 package(s): /packages/All/arm-none-eabi-gcc-8.4.0_3.pkg
>>>>> *** Error code 1
>>>>> Stop.
>>>>> make: stopped in /usr/ports/sysutils/u-boot-orangepi-plus-2e
>>>>>
>>>>> I'm not yet to the point of retrying after removing
>>>>> arm-none-eabi-gcc-8.4.0_3 : other things are being built.
>>>>
>>>>
>>>> Another context with my prior general update of /usr/ports/
>>>> and the matching port builds: Back then I used USE_TMPFS=all
>>>> but the failure is based on USE_TMPFS="data" instead. So:
>>>> lots more I/O.
>>>>
>>>
>>> None of the 3 corruptions repeated during bulk builds that
>>> retried the builds that generated the files.
>>> All of the
>>> ports that failed by hitting the corruptions in what they
>>> depended on, built fine in the retries.
>>>
>>> For reference:
>>>
>>> I'll note that, back when I was using USE_TMPFS=all, I also
>>> did some separate bulk -a test runs, both aarch64 (Cortex-A72)
>>> native and Cortex-A72 targeting Cortex-A7 (armv7). None of
>>> those showed evidence of file corruptions. In general I've
>>> not had previous file corruptions with this system. (There
>>> was a little more than 245 GiBytes swap, which covered the
>>> tmpfs needs when they were large.)
>>
>>
>> I set up a contrasting test context and got no evidence of
>> corruptions in that context. (Note: the 3 bulk builds
>> total to around 24 hrs of activity for the 3 examples
>> of 460+ ports building.) So, for the Cortex-A72 system,
>
> I set up a UFS on Optane (U.2 via M.2 adapter) context and
> also got no evidence of corruptions in that context (same
> hardware and a copy of the USB3 SSD based system). The
> sequence of 3 bulks took somewhat over 18 hrs using the
> Optane.
>
>> root on UFS on portable USB3 SSD: no evidence of corruptions
> Also:
> root on UFS on Optane U.2 via M.2: no evidence of corruptions
>> vs.:
>> root on ZFS on Optane in PCIe slot: solid evidence of 3 known corruptions
>>
>> Both had USE_TMPFS="data" in use. The same system build
>> had been installed and booted for both tests.
>>
>> The evidence of corruptions is rare enough for this not to
>> be determinative, but it is suggestive.
>>
>> Unfortunately, ZFS vs. UFS and Optane-in-PCIe vs. USB3 are
>> not differentiated by this test result.
>>
>> There is also the result that I've not seen evidence of
>> corruptions on the ThreadRipper 1950X (amd64) system.
>> Again, not determinative, but suggestive, given how rare
>> the corruptions seem to be.
>
> So far the only things unique to the observed corruptions are:
>
> root on ZFS context (vs.
> root on UFS)
> and:
> Optane in a PCIe slot (but no contrasting ZFS case tested)
>
> The PCIe slot does not seem to me to be likely to be contributing.
> So this seems to be suggestive of a ZFS problem.
>
> A contributing point might be that the main [so: 14] system was
> built via -mcpu=cortex-a72 for execution on a Cortex-A72 system.
>
> [I previously ran into a USB subsystem mishandling of keeping
> things coherent for the weak memory ordering in this sort of
> context. That issue was fixed. But back then I was lucky enough
> to be able to demonstrate fails vs. works by adding an
> appropriate instruction to FreeBSD in a few specific places
> (more than necessary as it turned out). Someone else determined
> where the actual mishandling was that covered all required
> places. My generating that much information in this context
> seems unlikely.]

I started a retry of root-on-ZFS with the Optane-in-PCIe-slot
media and it got its first corruption (in a different place,
2nd bulk build this time). The use of the corrupted file
reports:

configure:13269: cc -o conftest -Wall -Wextra -fsigned-char -Wdeclaration-after-statement -O2 -pipe -mcpu=cortex-a53 -g -fstack-protector-strong -fno-strict-aliasing -DUSE_MEMORY_H -I/usr/local/include -mcpu=cortex-a53 -fstack-protector-strong conftest.c -L/usr/local/lib -logg >&5
In file included from conftest.c:27:
In file included from /usr/local/include/ogg/ogg.h:24:
In file included from /usr/local/include/ogg/os_types.h:154:
/usr/local/include/ogg/config_types.h:1:1: warning: null character ignored [-Wnull-character]

^
/usr/local/include/ogg/config_types.h:1:2: warning: null character ignored [-Wnull-character]

^
/usr/local/include/ogg/config_types.h:1:3: warning: null character ignored [-Wnull-character]

^
. . .
/usr/local/include/ogg/config_types.h:1:538: warning: null character ignored [-Wnull-character]
. . . (nulls) . . .

So: 538 such null bytes.
Thus, another example of something like a page of nulls being
written out when ZFS is in use.

audio/gstreamer1-plugins-ogg also failed via referencing the file
during its build.

(The bulk run is still going and there is one more bulk run to go.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)
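
For reference, the null-byte signature reported above (XvMC.h and config_types.h both starting with a run of NUL bytes) could be searched for across an installed tree instead of waiting for a later build to trip over it. What follows is only a minimal sketch under stated assumptions, not something used in the original testing: the function names and the min_run threshold are hypothetical, and it only detects files that begin with NUL bytes. It would not detect other kinds of damage, such as the "Lzma library error: Corrupted input data" case.

```python
import os


def leading_nul_count(path, chunk_size=65536):
    """Return the number of consecutive NUL bytes at the start of a file."""
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file: everything read so far was NUL
                # (or the file was empty, giving 0).
                return count
            stripped = chunk.lstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Hit the first non-NUL byte, so the leading run ends here.
                return count


def find_nul_prefixed_files(root, min_run=16):
    """Yield (path, run_length) for regular files under root that start
    with at least min_run NUL bytes.

    min_run is an arbitrary, hypothetical threshold meant to skip files
    that only happen to start with a few zero bytes.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                run = leading_nul_count(path)
            except OSError:
                # Unreadable file: ignore it rather than abort the scan.
                continue
            if run >= min_run:
                yield path, run
```

A possible use would be to loop over find_nul_prefixed_files("/usr/local/include") inside the builder's environment after a bulk run and print each (path, run_length) pair. Text headers should normally report nothing. Binary files can legitimately start with zero bytes, so hits outside text files would need a manual look.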