Date: Tue, 26 Nov 2024 15:38:04 +0200
From: Konstantin Belousov
To: Dimitry Andric
Cc: Dag-Erling Smørgrav, Mark Millard, "jah@freebsd.org", dougm@freebsd.org, Alan Somers, Mark Johnston, FreeBSD Current, Guido Falsi, Yasuhiro Kimura, ports@freebsd.org
Subject: Re: port binary dumping core on recent head in poudriere [tmpfs corruptions involving blocks of zeros that should not be all zeros]
References: <38658C0D-CA33-4010-BBE1-E68D253A3DF7@FreeBSD.org> <1004a753-9a3c-4aa2-bfa8-4a0c471fe3ea@madpilot.net> <0690CFB1-6A6D-4B63-916C-BAB7F6256000@yahoo.com> <3660625A-0EE8-40DA-A248-EC18C734718C@yahoo.com> <865xoa2t6f.fsf@ltc.des.dev> <69A2E921-F5E3-40D2-977D-0964EE27349A@FreeBSD.org>
 <4AE5B316-D7EB-4290-8D52-7FBF244EA7A4@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
In-Reply-To: <4AE5B316-D7EB-4290-8D52-7FBF244EA7A4@FreeBSD.org>

On Tue, Nov 26, 2024 at 01:58:19PM +0100, Dimitry Andric wrote:
> On 26 Nov 2024, at 13:32, Dimitry Andric wrote:
> >
> > On 26 Nov 2024, at 11:19, Dag-Erling Smørgrav wrote:
> >>
> >> Mark Millard writes:
> >>> From inside a bulk -i where I did a manual make command
> >>> after it built and installed libsass.so.1.0.0 . The
> >>> manual make produced a /wrkdirs/ :
> >>> [...]
> >>> So the original creation looks okay. But . . .
> >>> [...]
> >>> So: The later, staged copy is a bad copy. Both are in the
> >>> tmpfs. So copying to the staging area makes a corrupted
> >>> copy inside the same tmpfs. After that, further copies of
> >>> staging's bad copy can be expected to be messed up.
> >>
> >> This and the fact that it happens on 14 and 15 but not on 13 strongly
> >> suggests an issue with `copy_file_range(2)`, since `install(1)` in 14 and
> >> 15 (but not in 13) now uses `copy_file_range(2)` if at all possible.
> >>
> >> My educated guess is that hole detection doesn't work reliably for files
> >> that have had holes filled while memory-mapped, so `copy_file_range(2)`
> >> thinks there is a hole where there isn't one and skips some of the data
> >> when `install(1)` uses it to copy the library from `${WRKSRC}` to
> >> `${STAGEDIR}`. This may or may not be specific to tmpfs.
> >>
> >> You may want to try applying the attached patch to your FreeBSD 14 and
> >> 15 jails. It prevents `cp(1)` and `install(1)` from trying to use
> >> `copy_file_range(2)`.
> >
> > Yes, tmpfs is indeed the culprit (or at least involved). I have had USE_TMPFS=localbase in my poudriere.conf for a long time, since otherwise my build machine would run out of memory very quickly, so I didn't encounter any issues.
> >
> > Now I changed it to USE_TMPFS=yes, rebuilt only textproc/libsass and textproc/sassc, and then after reinstalling those packages:
> >
> > $ /usr/local/bin/sassc
> > Segmentation fault
>
> And after applying Dag-Erling's patch to disable copy_file_range for cp and install, it works correctly again. So indeed there might be an issue in tmpfs seeking for data.

Could you try the following?

commit f4b848946a131dab260b44eab2cfabceb82bee0c
Author: Konstantin Belousov
Date:   Tue Nov 26 15:34:56 2024 +0200

    tmpfs: do not skip pages searching for data

    If the iterator finds invalid page at the requested pindex in
    swap_pager_seek_data(), the current code only looks at the swap blocks
    to search for data.  This is not correct, valid pages may appear at
    the higher indexes still.
diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
index db925f4ae7f6..390b2c10d680 100644
--- a/sys/vm/swap_pager.c
+++ b/sys/vm/swap_pager.c
@@ -2503,12 +2503,9 @@ swap_pager_seek_data(vm_object_t object, vm_pindex_t pindex)
 	VM_OBJECT_ASSERT_RLOCKED(object);
 	vm_page_iter_init(&pages, object);
 	m = vm_page_iter_lookup_ge(&pages, pindex);
-	if (m != NULL) {
-		if (!vm_page_any_valid(m))
-			m = NULL;
-		else if (pages.index == pindex)
-			return (pages.index);
-	}
+	if (m != NULL && pages.index == pindex)
+		return (pages.index);
+
 	swblk_iter_init_only(&blks, object);
 	swap_index = swap_pager_iter_find_least(&blks, pindex);
 	if (swap_index == pindex)
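
For anyone who wants to check the symptom from userland, here is a minimal sketch of the property under discussion: a page written into a hole through a shared mapping should still be found by `lseek(2)` with `SEEK_DATA`. The helper name `seek_data_sees_mmap_write()` and its parameters are made up for illustration and are not part of the patch or the thread. It does not claim to recreate the exact page state that triggers the tmpfs problem; it only tests that a seek starting at offset 0 does not land past the written page.

```c
#define _GNU_SOURCE	/* exposes SEEK_DATA on glibc; harmless elsewhere */
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create a sparse file of
 * `npages` pages at `path`, fill page number `target` through a shared
 * mapping, then ask lseek(SEEK_DATA) for the first data at or after
 * offset 0.
 *
 * Returns 0 if the reported offset is at or before the written page,
 * 1 if the written page was skipped or no data was reported at all,
 * and -1 on bad arguments or a setup error.
 */
int
seek_data_sees_mmap_write(const char *path, off_t npages, off_t target)
{
	long pgsz = sysconf(_SC_PAGESIZE);
	off_t len, want, got;
	char *p;
	int fd, ret = -1;

	if (pgsz <= 0 || npages <= 0 || target < 0 || target >= npages)
		return (-1);
	len = npages * pgsz;
	want = target * pgsz;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return (-1);
	/* Extending with ftruncate() leaves the whole file as a hole. */
	if (ftruncate(fd, len) == -1)
		goto out;

	p = mmap(NULL, (size_t)len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	/* Fill exactly one page of the hole via the mapping. */
	memset(p + want, 0xa5, (size_t)pgsz);
	if (msync(p, (size_t)len, MS_SYNC) == -1) {
		(void)munmap(p, (size_t)len);
		goto out;
	}
	if (munmap(p, (size_t)len) == -1)
		goto out;

	/*
	 * A file system without hole support may report data at offset 0,
	 * which is still acceptable.  An offset past `want`, or a failure
	 * such as ENXIO, means the written page was not seen.
	 */
	got = lseek(fd, 0, SEEK_DATA);
	ret = (got >= 0 && got <= want) ? 0 : 1;
out:
	(void)close(fd);
	(void)unlink(path);
	return (ret);
}
```

Calling it with a path on the tmpfs mount in question, for example `seek_data_sees_mmap_write("/tmp/probe.bin", 8, 5)`, and getting 1 back would point at the data seek rather than at `copy_file_range(2)` itself.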