From: Mateusz Guzik
Date: Sat, 15 Apr 2023 22:30:13 +0200
Subject: Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
To: FreeBSD User
Cc: Cy Schubert, Mark Millard, Charlie Li, Pawel Jakub Dawidek, dev-commits-src-main@freebsd.org, Current FreeBSD
In-Reply-To: <20230415175218.777d0a97@thor.intern.walstatt.dynvpn.de>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On 4/15/23, FreeBSD User wrote:
> On Sat, 15 Apr 2023 07:36:25 -0700
> Cy Schubert wrote:
>
>> In message <20230415115452.08911bb7@thor.intern.walstatt.dynvpn.de>,
>> FreeBSD User writes:
>> > On Thu, 13 Apr 2023 22:18:04 -0700
>> > Mark Millard wrote:
>> >
>> > > On Apr 13, 2023, at 21:44, Charlie Li wrote:
>> > >
>> > > > Mark Millard wrote:
>> > > >> FYI: in my original report for a context that has never had
>> > > >> block_cloning enabled, I reported BOTH missing files and
>> > > >> file content corruption in the poudriere-devel bulk build
>> > > >> testing. This predates:
>> > > >> https://people.freebsd.org/~pjd/patches/brt_revert.patch
>> > > >> but had the changes from:
>> > > >> https://github.com/openzfs/zfs/pull/14739/files
>> > > >> The files were missing from packages installed to be used
>> > > >> during a port's build. No other types of examples of missing
>> > > >> files happened. (But only 11 ports failed.)
>> > > > I also don't have block_cloning enabled. "Missing files" prior to
>> > > > brt_revert may actually be present, but as the corruption also
>> > > > messes with the file(1) signature, some tools like ldconfig
>> > > > report them as missing.
>> > >
>> > > For reference, the specific messages that were not explicit
>> > > null-byte complaints were (some shown with a little context):
>> > >
>> > >
>> > > ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - not found
>> > > ===> Installing existing package /packages/All/libxml2-2.10.3_1.pkg
>> > > [CA72_ZFS] Installing libxml2-2.10.3_1...
>> > > [CA72_ZFS] Extracting libxml2-2.10.3_1: .......... done
>> > > ===> py39-lxml-4.9.2 depends on shared library: libxml2.so - found (/usr/local/lib/libxml2.so)
>> > > . . .
>> > > [CA72_ZFS] Extracting libxslt-1.1.37: .......... done
>> > > ===> py39-lxml-4.9.2 depends on shared library: libxslt.so - found (/usr/local/lib/libxslt.so)
>> > > ===> Returning to build of py39-lxml-4.9.2
>> > > . . .
>> > > ===> Configuring for py39-lxml-4.9.2
>> > > Building lxml version 4.9.2.
>> > > Building with Cython 0.29.33.
>> > > Error: Please make sure the libxml2 and libxslt development packages are installed.
>> > >
>> > >
>> > > [CA72_ZFS] Extracting libunistring-1.1: .......... done
>> > > ===> libidn2-2.3.4 depends on shared library: libunistring.so - not found
>> > >
>> > >
>> > > [CA72_ZFS] Extracting gmp-6.2.1: .......... done
>> > > ===> mpfr-4.2.0,1 depends on shared library: libgmp.so - not found
>> > >
>> > >
>> > > ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
>> > > ===> Installing existing package /packages/All/gmp-6.2.1.pkg
>> > > [CA72_ZFS] Installing gmp-6.2.1...
>> > > the most recent version of gmp-6.2.1 is already installed
>> > > ===> nettle-3.8.1 depends on shared library: libgmp.so - not found
>> > > *** Error code 1
>> > >
>> > >
>> > > autom4te: error: need GNU m4 1.4 or later: /usr/local/bin/gm4
>> > >
>> > >
>> > > checking for GNU M4 that supports accurate traces... configure: error: no acceptable m4 could be found in $PATH.
>> > > GNU M4 1.4.6 or later is required; 1.4.16 or newer is recommended.
>> > > GNU M4 1.4.15 uses a buggy replacement strstr on some systems.
>> > > Glibc 2.9 - 2.12 and GNU M4 1.4.11 - 1.4.15 have another strstr bug.
>> > >
>> > >
>> > > ld: error: /usr/local/lib/libblkid.a: unknown file type
>> > >
>> > >
>> > > ===
>> > > Mark Millard
>> > > marklmi at yahoo.com
>> > >
>> >
>> > Hello
>> >
>> > what is the current status of fixing/mitigating this disastrous bug?
>> > Especially for those with the new option enabled on ZFS pools. Any advice?
>> >
>> > In an act of precaution (or call it panic) I shut down several servers to
>> > prevent irreversible damage to databases and data storage. On one host
>> > with /usr/ports residing on ZFS we always see errors on the same files
>> > created while staging (using portmaster, which leaves the system with
>> > uninstalled software, i.e. www/apache24 in our case). Deleting the work
>> > folder doesn't seem to change anything, even after scrubbing the entire
>> > pool (RAIDZ1 pool); it is unknown why the same files are always
>> > corrupted. Same with devel/ruby-gems.
>> >
>> > Poudriere has been shut down for the time being to avoid further issues.
>> >
>> > Is there any advice on how to proceed, apart from preserving the boxes
>> > by shutting them down?
>> >
>> > Thank you ;-)
>> > oh
>> >
>> > --
>> > O. Hartmann
>>
>> With an up-to-date tree + pjd@'s "Fix data corruption when cloning embedded
>> blocks. #14739" patch I didn't have any issues, except for email messages
>> with corruption in my sent directory, nowhere else. I'm still investigating
>> the email messages issue. IMO one is generally safe to run poudriere on the
>> latest ZFS with the additional patch.
>>
>> My tests of the additional patch concluded that it resolved my last
>> problems, except for the sent email problem I'm still investigating. I'm
>> sure there's a simple explanation for it, i.e. the email thread was
>> corrupted by the EXDEV regression which cannot be fixed by anything, even
>> reverting to the previous ZFS -- the data in those files will remain
>> damaged regardless.
>>
>> I cannot speak to the others who have had poudriere and other issues. I
>> never had any problems with poudriere on top of the new ZFS.
>>
>> WRT reverting block_cloning pools to without, your only option is to back up
>> your pool and recreate it without block_cloning. Then restore your data.
>>
>
> All right, I interpret the answer to mean that I need the most recent source
> tree (and an accordingly built and installed OS) AND a patch that isn't
> officially committed?
>
> On one box I'm running:
>
> FreeBSD 14.0-CURRENT #8 main-n262175-5ee1c90e50ce: Sat Apr 15 07:57:16 CEST 2023 amd64
>
> The box is crashing while trying to update ports with the well-known issue:
>
> Panic String: VERIFY(!zil_replaying(zilog, tx)) failed
>
> At the moment all alarm bells are ringing and I have lost track of what has
> been patched and already committed, and what is still only available as a
> patch under evaluation or unofficially posted here.
>
> Regarding the EXDEV issue: for poudriere or ports trees on ZFS, what do I
> have to do to ensure that those datasets are clean? The OS should detect file
> corruption, but in my case the box is crashing :-(
>
> I scrubbed several times, but this seems to be the action of a helpless and
> desperate man ... ;-/
>
> Greetings
>

Using block cloning is still not safe, but somewhere in this thread pjd
posted a patch that keeps it operational for already cloned files without
adding new ones.

Anyhow, as vishwin@ indicated, there was data corruption *unrelated* to
block cloning which also came with the import. I narrowed it down:
https://github.com/openzfs/zfs/issues/14753

That said, I'm now testing a kernel which does not do block cloning and
does not have the other problematic commit; we will see if things work.

--
Mateusz Guzik