Date: Sat, 28 Oct 2023 18:25:18 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Glen Barber <gjb@FreeBSD.org>
Cc: Colin Percival <cperciva@tarsnap.com>, freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: 15-aarch64-RPI-snap
Message-ID: <6CF6E677-CF8F-4DE9-9781-754003FCE0B6@yahoo.com>
In-Reply-To: <183A9CD0-42DB-4A0C-982D-FC6D3980163A@yahoo.com>
References: <0100018b6a9d257c-b35e4157-ba97-4aa7-988c-aba797c6d2ca-000000@email.amazonses.com> <ACBCBC83-DD61-4E0A-89DC-9DDD1B71B8DE@freebsd.org> <13B64416-4334-4070-8588-71F7D938350B@yahoo.com> <3B40F89C-7E5E-427F-A7A1-2D37CCC06A6F@yahoo.com> <183A9CD0-42DB-4A0C-982D-FC6D3980163A@yahoo.com>
On Oct 28, 2023, at 09:40, Mark Millard <marklmi@yahoo.com> wrote:

> On Oct 27, 2023, at 23:00, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On Oct 27, 2023, at 22:24, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> On Oct 27, 2023, at 21:34, Glen Barber <gjb@FreeBSD.org> wrote:
>>>
>>>>>> . . .
>>>>>>                   ^
>>>>>> ./offset.inc:16:19: error: null character ignored [-Werror,-Wnull-character]
>>>>>> <U+0000><U+0000><U+0000> . . . <U+0000><U+0000>#undef _SA
>>>>>>                   ^
>>>
>>> Are the above from a ZFS file system? UFS? Something else?
>>>
>>> Back in 2021-Nov (15..21) I had problems where ZFS was leading
>>> to blocks of such on aarch64, not specifically RPi*'s, various
>>> files but not the same ones from test to test. When I updated
>>> past some zfs updates on the 23rd the problem stopped.
>>>
>>> I also have notes from 2022-Mar (19..22) about replicating
>>> another example problem someone was having with files ending
>>> up with such blocks of bytes but the testing was on the
>>> ThreadRipper 1950X. (The replication showed that ccache did
>>> not need to be involved since I've never used it.) Again
>>> ZFS was part of the environment that got the replication.
>>> Mark Johnson fixed sys/contrib/openzfs/module/zfs/dnode.c
>>> during this and my ability to replicate the issue then
>>> stopped when I tested the patch.
>>>
>>> Which ever file system it is that holds the bad bytes, some
>>> attempted testing for repeatability of the problem could
>>> be of interest, some of that being on aarch64 but not on
>>> RPi*'s, some of it not on aarch64 at all. But it might take
>>> information about the context to know better what/how to
>>> test. That could include information about both the host and
>>> the jail OS versions if such is involved.
>>
>> Those last notes are likely too generic, in that normally
>> official buildworld buildkernel activity is done on amd64
>> for all target platforms (last I knew). (Not that running
>> such builds on other platforms would be a bad problem-scope
>> isolation test.)
>>
>> Any notes that help delimit what sort of test context
>> would be a reasonable partial replication of the original
>> context could prove useful.
>>
>>> . . .
>
> If the file system is ZFS, I'll note that main [so: 15] already has
> a zpool feature that is not part of openzfs-2.2 and so not part of
> releng/14.0 or stable/14 . So what zpool features are enabled could
> be relevant to problems that only happen in main and might need to
> be involved in efforts to replicate the problem.
>
> But I've not evaluated if redaction_list_spill would be likely to
> possibly be involved for the specific type of file corruptions.

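For anyone attempting the kind of repeatability testing discussed above, one way to look for the same signature (blocks of null bytes inside what should be text files) is to scan a source or object tree after a build. The following is only a rough sketch, not something used in this thread: the function names `nul_runs` and `scan_tree` and the 64-byte threshold are illustrative assumptions, and a hit is only a hint, since some files legitimately contain runs of zero bytes.

```python
import os
import re
from typing import Iterator, List, Tuple


def nul_runs(data: bytes, min_run: int = 64) -> List[Tuple[int, int]]:
    """Return (offset, length) for each run of at least min_run NUL bytes.

    min_run is an arbitrary illustrative threshold, not a value from the thread.
    """
    pattern = re.compile(rb"\x00{" + str(min_run).encode() + rb",}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]


def scan_tree(root: str, min_run: int = 64) -> Iterator[Tuple[str, int, int]]:
    """Walk root and yield (path, offset, length) for each long NUL run found."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, devices, and anything else that is not a plain file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                continue  # unreadable file: ignore rather than abort the scan
            for offset, length in nul_runs(data, min_run):
                yield (path, offset, length)
```

Calling `list(scan_tree("/some/build/tree"))` (the path is a placeholder) after each test build and comparing which files are reported from run to run would show whether the affected files vary between runs, as they did in the 2021-Nov case described above.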
I'll note that the upstream openzfs master commit for the data
corruption issue:

"Zpool can start allocating from metaslab before TRIMs have completed"

was on 2023-Oct-12, so not long ago. If the official builds use ZFS
and TRIM but are based on a system version that predates FreeBSD
picking up that commit, then there is a known ZFS data corruption
issue present in the official build environment.

Since port->package builds are based on a HOST/JAIL such as:

Host OSVERSION: 1500000
Jail OSVERSION: 1500002

or:

Host OSVERSION: 1500000
Jail OSVERSION: 1400097

but the Host kernel is the one in use (with the Host kernel commit
not identified), it could have such an issue.

(Because of such issues, I wish that Host OSVERSION related commit
identification was also reported for the package builds. Presuming
ZFS use, I also wish that the zpool features enabled were reported
for similar reasons.)

===
Mark Millard
marklmi at yahoo.com
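
As an illustration of how little the Host/Jail OSVERSION pairs above pin down, here is a small sketch that splits an OSVERSION value into its parts and checks whether the host value is older than the jail value. It assumes the usual modern layout of the number (major * 100000 + minor * 1000 + a bump counter); the names `decode_osversion` and `host_predates_jail` are made up for the example. Note that an OSVERSION only identifies a version bump, not a commit, so it cannot by itself say whether a particular fix is in the running host kernel.

```python
from typing import NamedTuple


class OSVersion(NamedTuple):
    major: int  # e.g. 15 for main at the time of this thread
    minor: int  # e.g. 0
    bump: int   # counter incremented on notable changes within that version


def decode_osversion(value: int) -> OSVersion:
    """Split an OSVERSION value, assuming major*100000 + minor*1000 + bump."""
    return OSVersion(value // 100000, (value // 1000) % 100, value % 1000)


def host_predates_jail(host: int, jail: int) -> bool:
    """True when the host OSVERSION is numerically older than the jail's."""
    return host < jail


# The two Host/Jail pairs quoted in the message above.
for host, jail in ((1500000, 1500002), (1500000, 1400097)):
    print(decode_osversion(host), decode_osversion(jail),
          host_predates_jail(host, jail))
```

For the first pair the host kernel is at least two bumps behind the jail's world; for the second the host is newer than the jail, but in both cases the actual host commit stays unknown.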