Date:      Tue, 18 Apr 2023 09:46:41 +0200
From:      Martin Matuska <mm@FreeBSD.org>
To:        Mateusz Guzik <mjguzik@gmail.com>, Pawel Jakub Dawidek <pjd@freebsd.org>
Cc:        freebsd-current@freebsd.org, Glen Barber <gjb@freebsd.org>
Subject:   Re: another crash and going forward with zfs
Message-ID:  <8ea96ab9-89c5-d05a-95d1-ef71663f28bb@FreeBSD.org>
In-Reply-To: <CAGudoHEWFNcdrFcK30wLSN8+56+K4CfqwUDsvb1+ZwS1Gt4NXg@mail.gmail.com>
References:  <CAGudoHH8vurcn4ydavi-xkGHYA6DVfOQF1mEEXkwPvGUTjKZNA@mail.gmail.com> <48e02888-c49f-ab2b-fc2d-ad6db6f0e10b@dawidek.net> <CAGudoHEWFNcdrFcK30wLSN8+56+K4CfqwUDsvb1+ZwS1Gt4NXg@mail.gmail.com>

Btw, I am open to setting up pre-merge stress testing.

I will check whether I can use the hourly-billed amd64 and arm64 cloud 
boxes at Hetzner with FreeBSD. Otherwise, there are monthly-billed 
boxes as well.
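
A rough sketch of what an hourly pre-merge run on such a box could look 
like (the time budget and scratch directory are made-up placeholders):

    # run the OpenZFS ztest stress harness for an hour,
    # keeping its vdev files on local scratch space
    mkdir -p /scratch/ztest
    ztest -V -T 3600 -f /scratch/ztest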

Cheers,
mm

On 17. 4. 2023 22:14, Mateusz Guzik wrote:
> On 4/17/23, Pawel Jakub Dawidek <pjd@freebsd.org> wrote:
>> On 4/18/23 03:51, Mateusz Guzik wrote:
>>> After the bugfixes got committed I decided to zpool upgrade, set
>>> sysctl vfs.zfs.bclone_enabled=1 and run poudriere against the pool
>>> for testing purposes (see the sketch after the backtrace). I very
>>> quickly got a new crash:
>>>
>>> panic: VERIFY(arc_released(db->db_buf)) failed
>>>
>>> cpuid = 9
>>> time = 1681755046
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0a90b8e5f0
>>> vpanic() at vpanic+0x152/frame 0xfffffe0a90b8e640
>>> spl_panic() at spl_panic+0x3a/frame 0xfffffe0a90b8e6a0
>>> dbuf_redirty() at dbuf_redirty+0xbd/frame 0xfffffe0a90b8e6c0
>>> dmu_buf_will_dirty_impl() at dmu_buf_will_dirty_impl+0xa2/frame 0xfffffe0a90b8e700
>>> dmu_write_uio_dnode() at dmu_write_uio_dnode+0xe9/frame 0xfffffe0a90b8e780
>>> dmu_write_uio_dbuf() at dmu_write_uio_dbuf+0x42/frame 0xfffffe0a90b8e7b0
>>> zfs_write() at zfs_write+0x672/frame 0xfffffe0a90b8e960
>>> zfs_freebsd_write() at zfs_freebsd_write+0x39/frame 0xfffffe0a90b8e980
>>> VOP_WRITE_APV() at VOP_WRITE_APV+0xdb/frame 0xfffffe0a90b8ea90
>>> vn_write() at vn_write+0x325/frame 0xfffffe0a90b8eb20
>>> vn_io_fault_doio() at vn_io_fault_doio+0x43/frame 0xfffffe0a90b8eb80
>>> vn_io_fault1() at vn_io_fault1+0x161/frame 0xfffffe0a90b8ecc0
>>> vn_io_fault() at vn_io_fault+0x1b5/frame 0xfffffe0a90b8ed40
>>> dofilewrite() at dofilewrite+0x81/frame 0xfffffe0a90b8ed90
>>> sys_write() at sys_write+0xc0/frame 0xfffffe0a90b8ee00
>>> amd64_syscall() at amd64_syscall+0x157/frame 0xfffffe0a90b8ef30
>>> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0a90b8ef30
>>> --- syscall (4, FreeBSD ELF64, write), rip = 0x103cddf7949a, rsp = 0x103cdc85dd48, rbp = 0x103cdc85dd80 ---
>>> KDB: enter: panic
>>> [ thread pid 95000 tid 135035 ]
>>> Stopped at      kdb_enter+0x32: movq    $0,0x9e4153(%rip)
>>>
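>>> For reference, the trigger sequence was roughly the following (the
>>> pool and jail names here are placeholders):
>>>
>>> zpool upgrade tank                # picks up the new block_cloning feature flag
>>> sysctl vfs.zfs.bclone_enabled=1   # enable the cloning code path
>>> poudriere bulk -j main -a         # write-heavy parallel package builds
>>>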
>>> The posted 14.0 schedule plans to branch stable/14 on May 12, and one
>>> cannot bet on the feature getting beaten into production shape by
>>> that time. Given the non-block_cloning and even non-zfs bugs which
>>> are likely to come out, I think this makes the feature a non-starter
>>> for said release.
>>>
>>> I note:
>>> 1. the current problems did not make it into stable branches.
>>> 2. there was block_cloning-related data corruption (fixed) and there may
>>> be more
>>> 3. there was unrelated data corruption (see
>>> https://github.com/openzfs/zfs/issues/14753), sorted out by reverting
>>> the problematic commit in FreeBSD, not yet sorted out upstream
>>>
>>> As such, people's data may already be partially hosed as is.
>>>
>>> Consequently the proposed plan is as follows:
>>> 1. whack the block cloning feature for the time being, but make sure
>>> pools which upgraded to it can be mounted read-only
>>> 2. run ztest and whatever other stress testing on FreeBSD, along with
>>> restoring openzfs CI -- I can do the first part, and I'm sure pho
>>> will not mind running some tests of his own
>>> 3. recommend people create new pools and restore data from backup; if
>>> restoring from backup is not an option, tar or cp (not zfs send) from
>>> the read-only mount, as sketched below
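>>>
>>> A rough sketch of that recovery path (pool names and mount points are
>>> made up, not a tested procedure):
>>>
>>> zpool import -o readonly=on tank              # damaged pool, read-only
>>> zpool create newtank da1                      # fresh pool without the feature
>>> tar -C /tank -cf - . | tar -C /newtank -xf -  # copy data, avoiding zfs send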
>>>
>>> Block cloning beaten into shape would use block_cloning_v2 or
>>> whatever else; the key point is that the current feature name would
>>> be considered bogus (not blocking RO import though) to prevent RW
>>> usage of the current pools with it enabled.
>>>
>>> Comments?
>> Correct me if I'm wrong, but to my understanding there were zero
>> problems with block cloning when it wasn't in use, and none now that
>> it is disabled.
>>
>> The reason I introduced the vfs.zfs.bclone_enabled sysctl was exactly
>> to avoid a mess like this and to give us more time to sort all the
>> problems out, while making it easy for people to try the feature.
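>>
>> For example, on a system with the default in place the cloning path
>> stays dormant:
>>
>> # sysctl vfs.zfs.bclone_enabled
>> vfs.zfs.bclone_enabled: 0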
>>
>> If there is no plan to revert the whole import, I don't see what value
>> removing just block cloning will bring, given that it is now disabled
>> by default and didn't cause any problems while disabled.
>>
> The feature definitely was not properly stress tested, and trying to
> do so keeps running into panics. Given the complexity of the feature I
> would expect there are many bugs lurking, some of them possibly
> related to the on-disk format. Not having to deal with any of this can
> be arranged as described above, and that is imo the most sensible
> route given the timeline for 14.0.
>


