From nobody Tue Apr 18 07:46:41 2023
Message-ID: <8ea96ab9-89c5-d05a-95d1-ef71663f28bb@FreeBSD.org>
Date: Tue, 18 Apr 2023 09:46:41 +0200
Subject: Re: another crash and going forward with zfs
From: Martin Matuska <mm@FreeBSD.org>
To: Mateusz Guzik, Pawel Jakub Dawidek
Cc: freebsd-current@freebsd.org, Glen Barber
References: <48e02888-c49f-ab2b-fc2d-ad6db6f0e10b@dawidek.net>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Btw. I am open to setting up pre-merge stress testing. I will check whether I can use the hourly-billed amd64 and arm64 cloud boxes at Hetzner with FreeBSD; there are monthly-billed ones as well.

Cheers,
mm

On 17. 4. 2023 22:14, Mateusz Guzik wrote:
> On 4/17/23, Pawel Jakub Dawidek wrote:
>> On 4/18/23 03:51, Mateusz Guzik wrote:
>>> After the bugfixes got committed I decided to zpool upgrade and set
>>> sysctl vfs.zfs.bclone_enabled=1 vs poudriere for testing purposes. I
>>> very quickly got a new crash:
>>>
>>> panic: VERIFY(arc_released(db->db_buf)) failed
>>>
>>> cpuid = 9
>>> time = 1681755046
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0a90b8e5f0
>>> vpanic() at vpanic+0x152/frame 0xfffffe0a90b8e640
>>> spl_panic() at spl_panic+0x3a/frame 0xfffffe0a90b8e6a0
>>> dbuf_redirty() at dbuf_redirty+0xbd/frame 0xfffffe0a90b8e6c0
>>> dmu_buf_will_dirty_impl() at dmu_buf_will_dirty_impl+0xa2/frame 0xfffffe0a90b8e700
>>> dmu_write_uio_dnode() at dmu_write_uio_dnode+0xe9/frame 0xfffffe0a90b8e780
>>> dmu_write_uio_dbuf() at dmu_write_uio_dbuf+0x42/frame 0xfffffe0a90b8e7b0
>>> zfs_write() at zfs_write+0x672/frame 0xfffffe0a90b8e960
>>> zfs_freebsd_write() at zfs_freebsd_write+0x39/frame 0xfffffe0a90b8e980
>>> VOP_WRITE_APV() at VOP_WRITE_APV+0xdb/frame 0xfffffe0a90b8ea90
>>> vn_write() at vn_write+0x325/frame 0xfffffe0a90b8eb20
>>> vn_io_fault_doio() at vn_io_fault_doio+0x43/frame 0xfffffe0a90b8eb80
>>> vn_io_fault1() at vn_io_fault1+0x161/frame 0xfffffe0a90b8ecc0
>>> vn_io_fault() at vn_io_fault+0x1b5/frame 0xfffffe0a90b8ed40
>>> dofilewrite() at dofilewrite+0x81/frame 0xfffffe0a90b8ed90
>>> sys_write() at sys_write+0xc0/frame 0xfffffe0a90b8ee00
>>> amd64_syscall() at amd64_syscall+0x157/frame 0xfffffe0a90b8ef30
>>> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0a90b8ef30
>>> --- syscall (4, FreeBSD ELF64, write), rip = 0x103cddf7949a, rsp = 0x103cdc85dd48, rbp = 0x103cdc85dd80 ---
>>> KDB: enter: panic
>>> [ thread pid 95000 tid 135035 ]
>>> Stopped at kdb_enter+0x32: movq $0,0x9e4153(%rip)
>>>
>>> The posted 14.0 schedule plans to branch stable/14 on May 12, and
>>> one cannot bet on the feature getting beaten into production shape
>>> by that time. Given whatever non-block_cloning and even non-zfs bugs
>>> are likely to come out, I think this makes the feature a non-starter
>>> for said release.
>>>
>>> I note:
>>> 1. the current problems did not make it into stable branches.
>>> 2. there was block_cloning-related data corruption (fixed) and there
>>> may be more
>>> 3. there was unrelated data corruption (see
>>> https://github.com/openzfs/zfs/issues/14753), sorted out by
>>> reverting the problematic commit in FreeBSD, not yet sorted out
>>> upstream
>>>
>>> As such, people's data may be partially hosed as is.
>>>
>>> Consequently, the proposed plan is as follows:
>>> 1. whack the block cloning feature for the time being, but make sure
>>> pools which upgraded to it can be mounted read-only
>>> 2. run ztest and whatever other stress testing on FreeBSD, along
>>> with restoring the OpenZFS CI -- I can do the first part, and I'm
>>> sure pho will not mind running some tests of his own
>>> 3. recommend people create new pools and restore data from backup;
>>> if restoring from backup is not an option, tar or cp (not zfs send)
>>> from the read-only mount
>>>
>>> block cloning beaten into shape would use block_cloning_v2 or
>>> whatever else; the key point is that the current feature name would
>>> be considered bogus (not blocking RO import though) to prevent RW
>>> usage of the current pools with it enabled.
>>>
>>> Comments?
>> Correct me if I'm wrong, but from my understanding there were zero
>> problems with block cloning when it wasn't in use or now disabled.
>>
>> The reason I introduced the vfs.zfs.bclone_enabled sysctl was exactly
>> to avoid a mess like this and give us more time to sort all the
>> problems out, while making it easy for people to try it.
>>
>> If there is no plan to revert the whole import, I don't see what
>> value removing just block cloning will bring if it is now disabled by
>> default and didn't cause any problems when disabled.
>>
> The feature definitely was not properly stress tested, and trying to
> do so now keeps running into panics. Given the complexity of the
> feature I would expect there are many bugs lurking, some of them
> possibly related to the on-disk format. Not having to deal with any of
> this can be arranged as described above and is imo the most sensible
> route given the timeline for 14.0.
>
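
[Editor's note: the recovery path proposed in step 3 of the quoted plan could be sketched roughly as below. The pool name `tank`, the mountpoints, and the destination pool `newpool` are hypothetical placeholders; this is an illustrative sketch under those assumptions, not a tested or endorsed procedure.]

```shell
# Keep block cloning disabled while the bugs are sorted out
# (vfs.zfs.bclone_enabled is the sysctl discussed in this thread).
sysctl vfs.zfs.bclone_enabled=0

# Re-import the affected pool read-only so nothing can write to it.
zpool export tank
zpool import -o readonly=on tank

# Copy the data out with tar rather than zfs send, since a stream
# replicated with zfs send could carry over any on-disk damage.
# /tank and /newpool are assumed mountpoints of the old and new pools.
tar -C /tank -cf - . | tar -C /newpool -xpf -
```

The `-o readonly=on` import property is what makes step 1 of the plan workable: a pool whose feature flags are considered bogus for read-write use could still be opened for data extraction.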