Date:      Thu, 28 Nov 2024 08:31:40 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        Dag-Erling Smørgrav <des@FreeBSD.org>, Konstantin Belousov <kib@freebsd.org>, Dimitry Andric <dim@freebsd.org>, "jah@freebsd.org" <jah@freebsd.org>, dougm@freebsd.org, Alan Somers <asomers@freebsd.org>, Mark Johnston <markj@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>, Guido Falsi <mad@madpilot.net>, Yasuhiro Kimura <yasu@freebsd.org>, ports@freebsd.org
Subject:   Re: port binary dumping core on recent head in poudriere [tmpfs corruptions involving blocks of zeros that should not be all zeros]
Message-ID:  <0654A56E-08C7-42CC-A6D8-63C85120C1D8@yahoo.com>
In-Reply-To: <5e37b8a5-2bd2-49b5-9746-674bd26ad770@FreeBSD.org>
References:  <38658C0D-CA33-4010-BBE1-E68D253A3DF7@FreeBSD.org> <1004a753-9a3c-4aa2-bfa8-4a0c471fe3ea@madpilot.net> <D14FF56C-506F-4168-91BC-1F10937B943F@yahoo.com> <E77AF0C3-5210-41C7-B8B8-02A8E22DB23D@yahoo.com> <A2820AEA-AB92-425F-AE91-2AF9629B3020@yahoo.com> <0690CFB1-6A6D-4B63-916C-BAB7F6256000@yahoo.com> <3660625A-0EE8-40DA-A248-EC18C734718C@yahoo.com> <865xoa2t6f.fsf@ltc.des.dev> <69A2E921-F5E3-40D2-977D-0964EE27349A@FreeBSD.org> <4AE5B316-D7EB-4290-8D52-7FBF244EA7A4@FreeBSD.org> <Z0XPPKtlLTMYeJS-@kib.kiev.ua> <33D56E3E-6476-48E8-B115-B906629B8AF5@yahoo.com> <65d47ca6-b0b9-4c03-9e36-d0f2cf6b4937@FreeBSD.org> <86zflj1t6b.fsf@ltc.des.dev> <5e37b8a5-2bd2-49b5-9746-674bd26ad770@FreeBSD.org>

On Nov 28, 2024, at 04:19, Andriy Gapon <avg@FreeBSD.org> wrote:

> On 28/11/2024 13:42, Dag-Erling Smørgrav wrote:
>> Andriy Gapon <avg@FreeBSD.org> writes:
>>> FWIW, I am not sure if it's relevant but I am seeing a similar pattern
>>> of corruption on tmpfs although in a different context, on FreeBSD
>>> 13.3.
>> Not relevant at all.  In this case the file is not actually corrupted
>> but `install(1)` skips over some of it when copying because `SEEK_DATA`
>> is implemented incorrectly.
>
> Still could be relevant...
> I don't know the "true state" of my corrupted files, I only observe the
> consequences.  And the files get some post-processing, then they are
> uploaded and originals are removed.  So, the problem could be not during
> the write phase, but during the read phase of post-processing.


First, an FYI on why I started with 2bed60 instead of a page
boundary:

2bed60 was the start of .got.plt, which is what was involved
in the program crash. In every case, it seems likely that the
whole page containing that start was zero, no matter if it
should have been at the page start or not. The page start is
just not what I was focused on for reporting.

So I expect that a "tail of page is all zero but should not be,
start of page is normal and not all zero" case would be a
distinct problem.

Or are you always seeing the problem as a full page of
zeros instead of just the tail of that page (that should not
be all zero)?
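
To make the full-page vs. tail-of-page distinction concrete, here is a
minimal Python sketch (not code used in this investigation) that compares
a known-good copy of a file against a suspect copy, page by page. It
assumes 4 KiB pages and that a good copy is available; the function name
and the "full"/"tail"/"other" labels are invented for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages; adjust for the system in question


def classify_zeroed_pages(good, suspect, page_size=PAGE_SIZE):
    """Compare two byte strings page by page.

    Returns a list of (page_offset, kind, first_bad_offset) for each
    page that differs, where kind is:
      "full"  - the whole suspect page is zeros
      "tail"  - the suspect page matches up to some offset and is zeros
                from there to the end of the page
      "other" - any other mismatch (first_bad_offset is None if the
                page lengths differ)
    """
    results = []
    for start in range(0, max(len(good), len(suspect)), page_size):
        g = good[start:start + page_size]
        s = suspect[start:start + page_size]
        if g == s:
            continue
        if len(g) != len(s):
            results.append((start, "other", None))
            continue
        # Index of the first byte that differs within this page.
        first_diff = next(i for i in range(len(g)) if g[i] != s[i])
        if any(s[first_diff:]):
            kind = "other"   # mismatch is not a run of zeros to page end
        elif not any(s):
            kind = "full"    # entire page is zeros
        else:
            kind = "tail"    # good prefix, zeros from first_diff onward
        results.append((start, kind, start + first_diff))
    return results
```

A "full" result at the page containing 2bed60 would match what I
described; a "tail" result would point at the other kind of problem.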

In Dag-Erling's wording, "this case" refers to the context I
was gathering investigative data for, not your context, as
I understand it.
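
For reference, the failure mode Dag-Erling describes can be sketched as
follows. This is not install(1)'s actual code, just a minimal hole-aware
copy loop in Python, assuming a platform where os.SEEK_DATA and
os.SEEK_HOLE are available. The buggy_seek helper is hypothetical: it
stands in for a SEEK_DATA that reports data as starting later than it
really does, so the copy silently leaves zeros where data should be even
though the source file is intact.

```python
import errno
import os
import tempfile


def copy_data_regions(src_fd, dst_fd, size, seek=os.lseek):
    """Copy only the regions that seek(..., SEEK_DATA) reports as data.

    Anything the seek skips is left as a hole in dst (reads as zeros).
    """
    os.ftruncate(dst_fd, size)
    pos = 0
    while pos < size:
        try:
            data_start = seek(src_fd, pos, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no data at or after pos
                break
            raise
        if data_start >= size:
            break
        hole_start = seek(src_fd, data_start, os.SEEK_HOLE)
        off = data_start
        while off < hole_start:
            chunk = os.pread(src_fd, min(hole_start - off, 1 << 20), off)
            if not chunk:
                break
            off += os.pwrite(dst_fd, chunk, off)
        pos = hole_start


def buggy_seek(fd, pos, whence, skip=4096):
    """Hypothetical broken SEEK_DATA: reports data `skip` bytes too late."""
    result = os.lseek(fd, pos, whence)
    return result + skip if whence == os.SEEK_DATA else result


def demo():
    payload = b"\x01" * (3 * 4096)
    with tempfile.TemporaryFile() as src, \
         tempfile.TemporaryFile() as ok, \
         tempfile.TemporaryFile() as bad:
        src.write(payload)
        src.flush()
        copy_data_regions(src.fileno(), ok.fileno(), len(payload))
        copy_data_regions(src.fileno(), bad.fileno(), len(payload),
                          seek=buggy_seek)
        return (os.pread(ok.fileno(), len(payload), 0),
                os.pread(bad.fileno(), len(payload), 0))
```

With the hypothetical buggy seek, the copy comes out with its first page
all zeros while the source stays correct.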

[I've referenced:
https://lists.freebsd.org/archives/freebsd-fs/2024-November/003855.html
]

As for: "The writes are done by appending variable sized
records to a file. There are no seeks or overwrites.": Am
I to interpret that as:

A) New file with just sequential writes that are variable
   sized?

vs.

B) Appending to a pre-existing file? (That would involve
   seeking and typically merging new data with old data
   from the original last-page-with-data and writing that
   update back out.)
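
The difference between the two readings is where the first new byte
lands. A minimal Python sketch, assuming 4 KiB pages (all names here are
hypothetical and only for illustration): in the new-file case the writes
start at offset 0, while in the append case they start wherever the old
data ended, which is usually partway into a page that already holds data.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def write_new_file(path, records):
    """New file: create or truncate, then write sequentially from 0."""
    with open(path, "wb") as f:
        for r in records:
            f.write(r)


def append_to_existing(path, records):
    """Append to an existing file; return the offset where new data starts."""
    with open(path, "ab") as f:
        start = f.seek(0, os.SEEK_END)
        for r in records:
            f.write(r)
    return start


def bytes_into_last_page(offset, page_size=PAGE_SIZE):
    """How far into a page the append begins (0 means page aligned)."""
    return offset % page_size


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "records.bin")
        write_new_file(path, [b"a" * 3000, b"b" * 2000])
        start = append_to_existing(path, [b"c" * 100])
        return start, bytes_into_last_page(start), os.path.getsize(path)
```

A nonzero bytes_into_last_page() value means the append has to update a
page that already contains old data, which is the merge step mentioned
above.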



===
Mark Millard
marklmi at yahoo.com



