Date:      Thu, 28 Nov 2024 13:16:16 -0500 (EST)
From:      "Sean C. Farley" <scf@FreeBSD.org>
To:        freebsd-ports@FreeBSD.org, freebsd-current@FreeBSD.org
Subject:   Re: port binary dumping core on recent head in poudriere [tmpfs corruptions involving blocks of zeros that should not be all zeros]
Message-ID:  <813bef1e-0189-27b2-9ee1-8ebb57a82296@FreeBSD.org>
In-Reply-To: <0690CFB1-6A6D-4B63-916C-BAB7F6256000@yahoo.com>
References:  <aa597431-54a8-4cde-8d4f-b75040b59bae@madpilot.net> <E4616829-D2DE-4EAF-B971-1EDA8B447F13@FreeBSD.org> <7c9c3cf5-bbd1-4642-8d04-33aa07a4db02@madpilot.net> <9df256a8-c6ed-46d9-b955-fc2657c12d36@madpilot.net> <5c502054-7353-4a1e-8350-c403482e9c0d@madpilot.net> <a203a89f-2eb7-4220-8dfb-648cd46fc6bb@madpilot.net> <3127C3BA-FC93-4636-ADDB-89518DE9C60D@FreeBSD.org> <86ed2zsp6l.fsf@ltc.des.dev> <5f24a570-26e0-4c0a-817f-591a234fd07b@madpilot.net> <5918C6A1-8FDB-40CA-8C86-EB7B7BE75A2E@yahoo.com> <86ed2zc8r5.fsf@ltc.des.dev> <45098ccf-4dc6-426c-849a-c923805d6723@madpilot.net> <F64DB4E9-A210-4E1F-B333-C597F3DBED54@yahoo.com> <38658C0D-CA33-4010-BBE1-E68D253A3DF7@FreeBSD.org> <1004a753-9a3c-4aa2-bfa8-4a0c471fe3ea@madpilot.net> <D14FF56C-506F-4168-91BC-1F10937B943F@yahoo.com> <E77AF0C3-5210-41C7-B8B8-02A8E22DB23D@yahoo.com> <A2820AEA-AB92-425F-AE91-2AF9629B3020@yahoo.com> <0690CFB1-6A6D-4B63-916C-BAB7F6256000@yahoo.com>

On Mon, 25 Nov 2024, Mark Millard wrote:

> On Nov 25, 2024, at 18:05, Mark Millard <marklmi@yahoo.com> wrote:
>
>> Top posting going in a different direction that
>> established a way to control the behavior in my
>> context . . .
>
> For folks new to the discoveries: the context here
> is poudriere bulk builds, for USE_TMPFS=all vs.
> USE_TMPFS=no . My test context is amd64 on a
> 7950X3D system with 192 GiBytes of RAM. Others have
> other contexts, including an Intel system.

I have been seeing some odd behavior from Firefox as well as with
poudriere builds on my system.  Both touch tmpfs: I have set up /tmp
as tmpfs, which Firefox uses, and poudriere runs with USE_TMPFS=all.

The system has been an experiment, for me, with undervolting.  I have
been attributing any flakiness to the undervolting, but I have since
reduced the undervolt a lot and the instability has not changed: it has
stayed rare.  I have lost count of how many times I have run memtest86
on this system.

System setup:
- FreeBSD 14.2-STABLE
- i7-14700K (latest BIOS which *should* fix Intel power-related bugs)
- 128 GiB RAM
- ZFS (mirrored drives)
- 2 encrypted swap partitions (64 GiB each, lightly used)
- Lightly undervolted (-0.06 offset to Global Core SVID Voltage)
- /tmp is tmpfs
- ${HOME}/.cache is tmpfs
- Poudriere:
   - USE_TMPFS=all
   - ccache
   - jail version in sync with host
   - /usr/ports is mounted with nullfs

I have wondered if it was swap-related, but recently I noticed a build
failure with games/veloren-weekly where swap was available but zero
bytes were used.  The system was under little load at the time, so
undervolting is less likely to have been the cause.

Build failure:
-----------------------------

portpicker = { path = '/wrkdirs/usr/ports/games/veloren-weekly/work/portpicker-rs-df6b37872f3586ac3b21d08b56c8ec7cd92fb172' }
===>   Updating Cargo.lock
error: checksum for `windows_x86_64_msvc v0.42.2` changed between lock files

this could be indicative of a few possible errors:

     * the lock file is corrupt
     * a replacement source in use (e.g., a mirror) returned a different checksum
     * the source itself may be corrupt in one way or another

unable to verify that `windows_x86_64_msvc v0.42.2` is the same as when the lockfile was generated

*** Error code 101

-----------------------------

The build finished successfully after I restarted it.
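
A minimal sketch of one way to look for the symptom named in the subject
line (blocks of zeros where real data should be), assuming the corruption
shows up as aligned, page-sized runs of zero bytes.  The 4096-byte block
size and the helper name find_zero_blocks are my assumptions, not
something established in this thread, and a hit is only a hint: many
binaries and archives legitimately contain all-zero blocks.

```python
PAGE_SIZE = 4096  # assumption: corruption granularity is one 4 KiB page


def find_zero_blocks(path, block_size=PAGE_SIZE):
    """Return byte offsets of aligned blocks in `path` that are entirely zero."""
    offsets = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # Only full-size blocks count; a short trailing block is ignored.
            if len(block) == block_size and not any(block):
                offsets.append(offset)
            offset += len(block)
    return offsets
```

Running this over a suspect file from a failed USE_TMPFS=all build and
over the same file from a USE_TMPFS=no build would show whether the
tmpfs copy has extra all-zero blocks.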

>> I changed USE_TMPFS=all to USE_TMPFS=no :
>>
>> USE_TMPFS=all gets the failure

*snip*

>> vs.
>> USE_TMPFS=no works just fine
>>
>> So it is a FreeBSD system error associated with
>> use of tmpfs .
>
> Recent work on tmpfs includes:
>
> Mon, 09 Sep 2024
> • git: 8fa5e0f21fd1 - main - tmpfs: Account for whiteouts during rename/rmdir Jason A. Harmening
> Fri, 04 Oct 2024
> • git: 75734c4360fc - main - tmpfs: check residence in data_locked Doug Moore
> Sun, 13 Oct 2024
> • git: ec22e705c266 - main - tmpfs: remove duplicate flags check in tmpfs_rmdir Alan Somers
> Thu, 24 Oct 2024
> • git: db08b0b04dec - main - tmpfs_vnops: move swap work to swap_pager Doug Moore
>
> swap_pager (given the reference to it above):
>
> Tue, 08 Oct 2024
>    • git: d0b225d16418 - main - swap_pager: use iterators in swp_pager_meta_build Doug Moore
> Fri, 11 Oct 2024
>    • git: 1107834090be - main - swap_pager: swapoff detecting object death Doug Moore
> Thu, 24 Oct 2024
>    • git: 34951b0b9e78 - main - swap_pager: move scan_all_shadowed, use iterators Doug Moore
>    • git: 02e85d1c8a41 - main - swap_pager: fix assert in seek_data Doug Moore
>    • git: faa9356f97d2 - main - swap_pager: fix seek_hole assert Doug Moore
> Sat, 26 Oct 2024
>    • git: 39f6d1e7f835 - main - swap_pager: iter in haspage, lookup, getpages Doug Moore
> Wed, 13 Nov 2024
>    • git: d11d407aee48 - main - swap_pager: Ensure that swapoff puts swapped-in pages in page queues Mark Johnston
>
> I do not know at this time when the corruptions started. The
> above is only suggestive.

Thank you for listing those.

I need to find some time to look over those changes, although I am no
kernel guru by a long shot.  However, I see now that much more
knowledgeable people are already looking at the issue on the
freebsd-current mailing list.

Sean
-- 
scf@FreeBSD.org