Date: Thu, 28 Nov 2024 08:31:40 -0800
From: Mark Millard <marklmi@yahoo.com>
To: Andriy Gapon <avg@FreeBSD.org>
Cc: Dag-Erling Smørgrav <des@FreeBSD.org>, Konstantin Belousov <kib@freebsd.org>, Dimitry Andric <dim@freebsd.org>, "jah@freebsd.org" <jah@freebsd.org>, dougm@freebsd.org, Alan Somers <asomers@freebsd.org>, Mark Johnston <markj@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>, Guido Falsi <mad@madpilot.net>, Yasuhiro Kimura <yasu@freebsd.org>, ports@freebsd.org
Subject: Re: port binary dumping core on recent head in poudriere [tmpfs corruptions involving blocks of zeros that should not be all zeros]
Message-ID: <0654A56E-08C7-42CC-A6D8-63C85120C1D8@yahoo.com>
In-Reply-To: <5e37b8a5-2bd2-49b5-9746-674bd26ad770@FreeBSD.org>
References: <38658C0D-CA33-4010-BBE1-E68D253A3DF7@FreeBSD.org> <1004a753-9a3c-4aa2-bfa8-4a0c471fe3ea@madpilot.net> <D14FF56C-506F-4168-91BC-1F10937B943F@yahoo.com> <E77AF0C3-5210-41C7-B8B8-02A8E22DB23D@yahoo.com> <A2820AEA-AB92-425F-AE91-2AF9629B3020@yahoo.com> <0690CFB1-6A6D-4B63-916C-BAB7F6256000@yahoo.com> <3660625A-0EE8-40DA-A248-EC18C734718C@yahoo.com> <865xoa2t6f.fsf@ltc.des.dev> <69A2E921-F5E3-40D2-977D-0964EE27349A@FreeBSD.org> <4AE5B316-D7EB-4290-8D52-7FBF244EA7A4@FreeBSD.org> <Z0XPPKtlLTMYeJS-@kib.kiev.ua> <33D56E3E-6476-48E8-B115-B906629B8AF5@yahoo.com> <65d47ca6-b0b9-4c03-9e36-d0f2cf6b4937@FreeBSD.org> <86zflj1t6b.fsf@ltc.des.dev> <5e37b8a5-2bd2-49b5-9746-674bd26ad770@FreeBSD.org>
On Nov 28, 2024, at 04:19, Andriy Gapon <avg@FreeBSD.org> wrote:

> On 28/11/2024 13:42, Dag-Erling Smørgrav wrote:
>> Andriy Gapon <avg@FreeBSD.org> writes:
>>> FWIW, I am not sure if it's relevant but I am seeing a similar pattern
>>> of corruption on tmpfs although in a different context, on FreeBSD
>>> 13.3.
>> Not relevant at all. In this case the file is not actually corrupted
>> but `install(1)` skips over some of it when copying because `SEEK_DATA`
>> is implemented incorrectly.
>
> Still could be relevant...
> I don't know the "true state" of my corrupted files, I only observe the consequences. And the files get some post-processing, then they are uploaded and originals are removed. So, the problem could be not during the write phase, but during the read phase of post-processing.

First, an FYI on why I started with 2bed60 instead of a page boundary: 2bed60 was the start of .got.plt, which is what was involved in the program crash. In every case, it seems likely that the whole page containing that start was zero, whether or not the zeros should have begun at the page start. The page start is just not what I was focused on for reporting.

So I expect that a "tail of page is all zero but should not be, start of page was normal (not all zero)" problem would be a distinct problem. Or are you always seeing the problem as a full page of zeros instead of just the tail of that page (that should not be all zero)?

In Dag-Erling's wording, "this case" refers to the context I was gathering investigative data for, not your context, as I understand it.

[I've referenced: https://lists.freebsd.org/archives/freebsd-fs/2024-November/003855.html ]

As for: "The writes are done by appending variable sized records to a file. There are no seeks or overwrites.": Am I to interpret that as:

) New file with just sequential writes that are variable sized?

vs.

) Appending to a pre-existing file?
  (That would involve seeking and typically merging new data with old data from the original last-page-with-data and writing that update back out.)

===
Mark Millard
marklmi at yahoo.com
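
For anyone wanting to check which of the two patterns they have, here is a minimal sketch of the distinction drawn above (whole page zeroed vs. only the tail of a page zeroed). It assumes 4096-byte pages and assumes a known-good copy of the same page is available for comparison; the names `PAGE_SIZE`, `page_bounds` and `classify_damage` are hypothetical, made up for illustration, and are not taken from any tool used in this thread.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; adjust for the platform in question


def page_bounds(offset, page_size=PAGE_SIZE):
    """Return (start, end) file offsets of the page containing `offset`."""
    start = offset - (offset % page_size)
    return start, start + page_size


def classify_damage(good, suspect):
    """Compare one known-good page against the suspect copy of the same page.

    Both arguments are bytes objects of equal length (one page each).
    """
    if len(good) != len(suspect):
        raise ValueError("pages must have the same length")
    if good == suspect:
        return "match"
    if not any(suspect):
        # Every byte of the suspect page is zero, but the good page is not.
        return "whole page zeroed"
    # First position at which the two copies disagree.
    first_diff = next(i for i, (g, s) in enumerate(zip(good, suspect)) if g != s)
    if not any(suspect[first_diff:]):
        # The start of the page is intact; everything from the first
        # difference to the end of the page is zero.
        return "tail zeroed"
    return "other difference"
```

With the 4 KiB assumption, `page_bounds(0x2bed60)` gives the page from 0x2be000 up to (not including) 0x2bf000, so 2bed60 sits 0xd60 bytes into its page. Feeding each page of a good file and of the suspect file to `classify_damage` would then show whether a given case is the full-page pattern or the tail-only one.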