Date:      Wed, 12 Apr 2023 10:37:59 -0700
From:      Cy Schubert <Cy.Schubert@cschubert.com>
To:        Charlie Li <vishwin@freebsd.org>
Cc:        Rick Macklem <rick.macklem@gmail.com>, Martin Matuska <mm@freebsd.org>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID:  <00780E30-8E72-4746-B651-8A9A048C9EE4@cschubert.com>
In-Reply-To: <64e4af2a-5273-6219-c146-f867160f09ac@freebsd.org>
References:  <202304031513.333FD6qw014903@gitrepo.freebsd.org> <CAM5tNy45XwDNGK27i_Z_96H-sLDXXHuaZbSQ=E7507eCiCvgJw@mail.gmail.com> <20230403235851.84C0467@slippy.cwsent.com> <CAM5tNy6TMoXAKyfWq_psEjK0zy9j%2B=7yzp1vRirAfTdXBxabSQ@mail.gmail.com> <CAM5tNy64HTeC8%2BOT_SHg1osnKKAH3_qQJkyWFuOy-LDAFVzu%2BA@mail.gmail.com> <20230404052811.DA2172C1@slippy.cwsent.com> <7c75b934-cb0a-b32e-bc19-b1e15e8cf3aa@freebsd.org> <20230409154042.0685a273@cschubert.com> <ba938b23-a6d0-f673-ffc8-b3d9d59e53a4@freebsd.org> <E3DD3607-887C-48C4-9031-5204DD84E6A5@cschubert.com> <a99a20b9-c348-89f6-db37-604f72002da4@freebsd.org> <707e4671-d746-aa23-e340-6eb8f50f78c6@freebsd.org> <20230409205826.7802259d@cschubert.com> <4e85eb84-f0cc-2f8c-d3d9-1e016ede042a@freebsd.org> <20230410165406.51bcd958@cschubert.com> <70739834-4eea-db30-63be-556bcfd881a1@freebsd.org> <D62F34CB-69D0-46FE-89C9-9BD2536DBFC5@cschubert.com> <464cc8cd-2bf6-b7e5-3823-89227d842458@freebsd.org> <64e4af2a-5273-6219-c146-f867160f09ac@freebsd.org>

On April 12, 2023 10:22:25 AM PDT, Charlie Li <vishwin@freebsd.org> wrote:
>Charlie Li wrote:
>> Cy Schubert wrote:
>>> On April 12, 2023 8:51:09 AM PDT, Charlie Li <vishwin@freebsd.org> wrote:
>>>> Cy Schubert wrote:
>>>>> I have a "sandbox" pool, called t, used for /usr/obj and ports wrkdirs, and other writes I can easily recreate on my laptop. Here are the results of my tests.
>>>>>
>>>>> Method:
>>>>>
>>>>> Initially I copied my /usr/obj from my two build machines (one amd64.amd64 and an i386.i386) to my "sandbox" zpool.
>>>>>
>>>>> Next, with block_cloning disabled I did cp -R of the /usr/obj test files. Then a diff -qr. The source and target directories were the same.
>>>>>
>>>>> Next, I cleaned up (rm -rf) the target directory to prepare for the
>>>>> block_cloning-enabled test.
>>>>>
>>>>> Next, I did zpool checkpoint t. After this, zpool upgrade t. Pool t now has block_cloning enabled.
>>>>>
>>>>> I repeated the cp -R test from above followed by a diff -qr. Almost
>>>>> every file was different. The pool was corrupted.
>>>>>
>>>>> I restored the pool with the following, removing the corruption:
>>>>>
>>>>> slippy# zpool export t
>>>>> slippy# zpool import --rewind-to-checkpoint t
>>>>> slippy#
>>>>>
>>>>> It is recommended that people avoid upgrading their zpools until the
>>>>> problem is fixed.
>>>>>
>>>> As of af7624ed3145, I just did this with an md(4)-backed test pool, though with the second `cp -R` landing in a separate dataset, created and destroyed for each test. No corruption either way. However, my poudriere builds still output/package corrupted files (particularly those with null characters), probably after install(1) invocations (not cp(1)).
>>>>
>>>
>>> You need to copy from/to the same dataset to reproduce the problem. Copying from a source dataset to a different dataset will avoid block_cloning.
>>>
>> Got the corruption now.
>>
>Clarify: no corruption without block_cloning, corruption with.
>
>What is still a mystery to me is how corruption happens even without block_cloning in the poudriere scenario. cp(1)/install(1) always happen within the same dataset, as in this test.
>

This is because your pool has previously corrupted blocks. Even though you backed up the old pool, created a new pool without block_cloning and restored your data, the backup contained corrupted blocks from your old pool, so they were restored as is. ZFS can only fix corruption if the checksum says it's corrupt. As far as ZFS was concerned at the time, those blocks were not corrupted. You will need to delete the files with corruption and recreate them.

Even after this regression is fixed and people build and install a fixed kernel, whatever was corrupted will remain until the corrupted files are either removed and recreated or fixed manually.
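
Where a known-good copy of a tree exists (a fresh build, or a copy taken before the upgrade), something along the lines of the diff -qr check from my test can list what still needs to be recreated. A minimal sketch, assuming such a reference tree is available; list_differing and its two arguments are names chosen here for illustration only:

```shell
#!/bin/sh
# list_differing REF TARGET  (hypothetical helper)
# Print the relative path of every regular file under REF whose copy
# under TARGET is missing or is not byte-for-byte identical.
# Assumes file names do not contain newlines.
list_differing() {
    ref=$1
    target=$2
    # The cd happens in a subshell, so relative REF/TARGET paths stay valid.
    (cd "$ref" && find . -type f) | while IFS= read -r rel; do
        if ! cmp -s "$ref/$rel" "$target/$rel"; then
            printf '%s\n' "${rel#./}"
        fi
    done
}
```

Files that exist only under TARGET are not reported, so this is narrower than diff -qr. It only answers "which reference files did not survive intact".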

This regression will have long-lasting effects.

As Kirk McKusick has reiterated many times, back in the old days people didn't trust EXT*FS because of the data corruption they experienced. Sadly, ZFS will need to earn people's trust back again. This is unfortunate.


-- 
Cheers,
Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX:  <cy@FreeBSD.org>  Web:  https://FreeBSD.org
NTP:                     <cy@nwtime.org>    Web:  https://nwtime.org
                                                    e^(i*pi)+1=0

Pardon the typos. Small keyboard in use.


