Date:      Wed, 12 Apr 2023 18:46:50 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        FreeBSD User <freebsd@walstatt-de.de>
Cc:        Charlie Li <vishwin@freebsd.org>, Cy Schubert <Cy.Schubert@cschubert.com>,  Rick Macklem <rick.macklem@gmail.com>, Martin Matuska <mm@freebsd.org>, src-committers@freebsd.org,  dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 2a58b312b62f - main - zfs: merge openzfs/zfs@431083f75
Message-ID:  <CAGudoHGapcbDeB_NU26-MXkDk6APL84Tf_crxHVneNbhhze0fw@mail.gmail.com>
In-Reply-To: <20230412182813.63180c6a@thor.intern.walstatt.dynvpn.de>
References:  <202304031513.333FD6qw014903@gitrepo.freebsd.org> <20230403231444.CF48911F@slippy.cwsent.com> <20230403232549.73E331A2@slippy.cwsent.com> <CAM5tNy45XwDNGK27i_Z_96H-sLDXXHuaZbSQ=E7507eCiCvgJw@mail.gmail.com> <20230403235851.84C0467@slippy.cwsent.com> <CAM5tNy6TMoXAKyfWq_psEjK0zy9j%2B=7yzp1vRirAfTdXBxabSQ@mail.gmail.com> <CAM5tNy64HTeC8%2BOT_SHg1osnKKAH3_qQJkyWFuOy-LDAFVzu%2BA@mail.gmail.com> <20230404052811.DA2172C1@slippy.cwsent.com> <7c75b934-cb0a-b32e-bc19-b1e15e8cf3aa@freebsd.org> <20230409154042.0685a273@cschubert.com> <ba938b23-a6d0-f673-ffc8-b3d9d59e53a4@freebsd.org> <E3DD3607-887C-48C4-9031-5204DD84E6A5@cschubert.com> <a99a20b9-c348-89f6-db37-604f72002da4@freebsd.org> <707e4671-d746-aa23-e340-6eb8f50f78c6@freebsd.org> <20230409205826.7802259d@cschubert.com> <4e85eb84-f0cc-2f8c-d3d9-1e016ede042a@freebsd.org> <20230410165406.51bcd958@cschubert.com> <70739834-4eea-db30-63be-556bcfd881a1@freebsd.org> <20230412182813.63180c6a@thor.intern.walstatt.dynvpn.de>

On 4/12/23, FreeBSD User <freebsd@walstatt-de.de> wrote:
> On Wed, 12 Apr 2023 11:51:09 -0400
> Charlie Li <vishwin@freebsd.org> wrote:
>
>> Cy Schubert wrote:
>> > I have a "sandbox" pool, called t, used for /usr/obj and ports wrkdirs,
>> > and other writes I can easily recreate on my laptop. Here are the
>> > results of my tests.
>> >
>> > Method:
>> >
>> > Initially I copied my /usr/obj from my two build machines (one
>> > amd64.amd64 and an
>> > i386.i386) to my "sandbox" zpool.
>> >
>> > Next, with block_cloning disabled I did cp -R of the /usr/obj test
>> > files. Then a diff -qr.
>> > The source and target directories were the same.
>> >
>> > Next, I cleaned up (rm -rf) the target directory to prepare for the
>> > block_clone enabled test.
>> >
>> > Next, I did zpool checkpoint t. After this, zpool upgrade t. Pool t now
>> > has block_cloning
>> > enabled.
>> >
>> > I repeated the cp -R test from above followed by a diff -qr. Almost
>> > every file was different. The pool was corrupted.
>> >
>> > I restored the pool with the following, which removed the corruption:
>> >
>> >
>> > slippy# zpool export t
>> > slippy# zpool import --rewind-to-checkpoint t
>> > slippy#
>> >
>> > It is recommended that people avoid upgrading their zpools until the
>> > problem is fixed.
>> >
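The copy-then-compare method quoted above can be sketched as a small sh function. This is only an illustration of the cp -R / diff -qr steps, not the script Cy actually ran; the name copy_and_compare and the exit code 2 are assumptions made for the example.

```shell
#!/bin/sh
# Sketch of the quoted test: copy a tree, then compare it to the source.
# copy_and_compare is a hypothetical helper name, not an existing tool.
copy_and_compare() {
    src=$1
    dst=$2
    # cp -R into an existing directory would nest the source inside it
    # and make the comparison meaningless, so refuse (assumed exit code 2).
    if [ -e "$dst" ]; then
        echo "target $dst already exists" >&2
        return 2
    fi
    # Recursive copy, as in the quoted method.
    cp -R "$src" "$dst" || return 2
    # diff -qr prints one line per differing or missing file and exits
    # non-zero when the two trees differ.
    diff -qr "$src" "$dst"
}
```

Usage would be something like copy_and_compare /usr/obj /t/obj-copy, run once before and once after enabling block_cloning; the target path here is made up.
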
>> As of af7624ed3145, I just did this with an md(4)-backed test pool,
>> though with the second `cp -R` landing in a separate dataset, created
>> and destroyed for each test. No corruption either way. However, my
>> poudriere builds still output/package corrupted files (particularly
>> those with null characters), probably after install(1) invocations (not
>> cp(1)).
>>
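Since the corrupted files mentioned above are described as containing null characters, one rough way to look for them is to flag files that are expected to be text but contain NUL bytes. The sketch below assumes that symptom; has_nul and list_nul_files are hypothetical helper names, and binaries such as ELF objects or images legitimately contain NUL bytes, so the name pattern should match only text files.

```shell
#!/bin/sh
# has_nul and list_nul_files are hypothetical helper names.
has_nul() {
    # tr strips NUL bytes; cmp -s then compares the stripped stream with
    # the original file. Identical means the file holds no NUL bytes.
    # A file that cannot be read also fails the comparison and is reported.
    if tr -d '\000' < "$1" | cmp -s - "$1"; then
        return 1
    fi
    return 0
}

list_nul_files() {
    # $1: directory to walk
    # $2: name pattern for files that are expected to be plain text
    find "$1" -type f -name "$2" | while IFS= read -r f; do
        if has_nul "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, list_nul_files work/stage '*.rb' would print every staged Ruby source that contains a NUL byte.
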
>
> I still have corrupt files on the /usr/ports tree (located on ZFS, with
> feature@block_cloning  active):
>
> [...]
> Installing man pages and online manual
> mkdir /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24
> cd /usr/ports/www/apache24/work/httpd-2.4.57/docs/manual && cp -rp * /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24
> install  -m 0644 /usr/ports/www/apache24/files/no-accf.conf /usr/ports/www/apache24/work/stage/usr/local/etc/apache24/Includes/
> install -m 0644 /usr/ports/www/apache24/files/README_modules.d /usr/ports/www/apache24/work/stage/usr/local/etc/apache24/modules.d/
> /usr/bin/strip /usr/ports/www/apache24/work/stage/usr/local/libexec/apache24/mod_*.so
> /bin/rm -f /usr/ports/www/apache24/work/stage/usr/local/share/apache24/build/ecp.???????? 2>/dev/null
> install  -m 555 /usr/ports/www/apache24/work/httpd-2.4.57/support/check_forensic /usr/ports/www/apache24/work/stage/usr/local/sbin
> ====> Compressing man pages (compress-man)
> ===> Staging rc.d startup script(s)
> ===>  Installing for apache24-2.4.57
> ===>   Registering installation for apache24-2.4.57
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
>
> www/apache24 now ALWAYS hits this corruption, even after scrubbing the
> pool.
>
> This one is the same in my case:
>
> [...]
>
> cd /usr/ports/devel/ruby-gems/work/stage/usr/local/ && /usr/bin/find -ds lib/ruby/gems/3.1/doc/ ! -type d >> /usr/ports/devel/ruby-gems/work/.PLIST.mktmp
> ====> Compressing man pages (compress-man)
> ===>>> Starting check for runtime dependencies
> ===>>> Gathering dependency list for devel/ruby-gems from ports
> ===>>> Dependency check complete for devel/ruby-gems
>
> ===>>> All >> rubygem-addressable-2.8.1 >> devel/ruby-gems (3/27)
>
> ===>  Installing for ruby31-gems-3.4.10
> ===>   Registering installation for ruby31-gems-3.4.10 as automatic
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> pkg-static: pkg_checksum_hash_sha256_file(read failed): Input/output error
> *** Error code 1
>
> Stop.
> make[1]: stopped in /usr/ports/devel/ruby-gems
>
>
> Pool is then marked corrupt (was scrubbed after the last corruption):
>
> [...]
>
>   pool: POOL00
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
>   scan: scrub in progress since Wed Apr 12 18:07:02 2023
>         1.45T scanned at 2.01G/s, 139G issued at 193M/s, 13.2T total
>         0B repaired, 1.02% done, 19:49:53 to go
> config:
>
>         NAME            STATE     READ WRITE CKSUM
>         POOL00          ONLINE       0     0     0
>           raidz1-0      ONLINE       0     0     0
>             gpt/pool00  ONLINE       0     0     0
>             gpt/pool01  ONLINE       0     0     0
>             gpt/pool02  ONLINE       0     0     0
>             gpt/pool03  ONLINE       0     0     0
>
> errors: 22 data errors, use '-v' for a list
>
> [...]
>
> errors: Permanent errors have been detected in the following files:
>
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/optparse/lib/optionparser.rb
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/optparse.rb
>
> /usr/ports/www/apache24/work/stage/usr/local/www/apache24/icons/small/blank.gif
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/resolver/molinillo.rb
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/left.gif
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/right.gif
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/down.gif
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/pixel.gif
>
> /usr/ports/devel/ruby-gems/work/stage/usr/local/lib/ruby/site_ruby/3.1/rubygems/tsort.rb
>
> /usr/ports/www/apache24/work/stage/usr/local/share/doc/apache24/images/up.gif
>
>
> --
> O. Hartmann
>
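To turn a `zpool status -v` report like the one quoted above into a plain list of affected files (for example, to feed into a cleanup or re-copy step), a filter along these lines may help. It is a sketch only: list_damaged_files is a hypothetical helper name, and it keeps just the absolute paths, skipping entries that zpool prints as dataset:<0x...> object references when it cannot resolve a file name.

```shell
#!/bin/sh
# list_damaged_files is a hypothetical helper name.
# Reads `zpool status -v` output on stdin and prints the absolute paths
# listed after the "Permanent errors" header, one per line.
list_damaged_files() {
    awk '
        /Permanent errors have been detected/ { seen = 1; next }
        seen {
            sub(/^[ \t]+/, "")      # drop leading indentation
            if ($0 ~ /^\//) print   # keep only absolute paths
        }
    '
}
```

Usage: zpool status -v POOL00 | list_damaged_files
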

https://github.com/openzfs/zfs/pull/14739/files

Here is a fix you can apply on top of sys/contrib/openzfs. I have no
idea how much it will help, though.

-- 
Mateusz Guzik <mjguzik gmail.com>


