Date: Fri, 17 May 2024 15:52:56 -0400
From: mike tancsa <mike@sentex.net>
To: Warner Losh
Cc: FreeBSD-STABLE Mailing List
Subject: Re: Open ZFS vs FreeBSD ZFS boot issues (resolved sort of)
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On 5/16/2024 10:38 AM, Warner Losh wrote:


On Thu, May 16, 2024 at 8:14 AM mike tancsa <mike@sentex.net> wrote:
I have a strange edge case I am trying to work around.  I have a
customer's legacy VM which is RELENG_11 on ZFS.  There is some
corruption that won't clear on a bunch of directories, so I want to
re-create it from backups. I have done this many times in the past but
this one is giving me grief. Normally I do something like this on my
backup server (RELENG_13)

truncate -s 100G file.raw
mdconfig -f file.raw
gpart create -s gpt md0
gpart add -t freebsd-boot -s 512k md0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 md0
gpart add -t freebsd-swap -s 2G md0
gpart add -t freebsd-zfs md0
zpool create -d -f -o altroot=/mnt2 -o feature@lz4_compress=enabled \
    -o cachefile=/var/tmp/zpool.cache myZFSPool /dev/md0p3

I'm surprised you don't specifically create compatibility with some older
standard and then maybe add compression. But I'd start there: create
one that doesn't use lz4_compress (it's not read-only compatible,
meaning the old boot loader has to 100% implement it faithfully).

Hi Warner,

    I thought -d would give me the lowest common denominator. Looking at the updated man page for zpool create, I hadn't realized there are these handy-dandy compatibility files listing all the supported features!
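
For anyone following along, here is a rough sketch of how to check whether a given feature is listed in one of those compatibility files. It assumes the format described in zpool-features(7): feature names separated by whitespace or commas, with # starting a comment. The function name feature_in_compat is made up for this example and is not part of any ZFS tool.

```sh
#!/bin/sh
# feature_in_compat FILE FEATURE
# Exit status 0 if FEATURE is listed in compatibility file FILE, non-zero otherwise.
# Hypothetical helper; assumes names are separated by whitespace or commas
# and that '#' starts a comment.
feature_in_compat() {
    file=$1
    feature=$2
    sed 's/#.*//' "$file" |      # drop comments
        tr ', \t' '\n\n\n' |     # put one name per line
        grep -qxF -- "$feature"  # exact, whole-name match
}

# Example (the path is the one used with zpool create in this message):
# feature_in_compat /usr/share/zfs/compatibility.d/freebsd-11.2 lz4_compress \
#     && echo "listed" || echo "not listed"
```

The match is on the whole name, so asking for "lz4" does not match "lz4_compress".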

Trying with 

    zpool create -o compatibility=/usr/share/zfs/compatibility.d/freebsd-11.2 -o altroot=/mnt2 -o cachefile=/var/tmp/zpool.cache myZFSPool /dev/md0p3

and the pmbr and gptzfsboot from RELENG_12 still give the same error.

However, if I also copy over /boot/loader, /boot/zfsloader, and /boot/lua from RELENG_12, I am able to boot. No idea why that is the case, but I think this is "solved enough" for me. Hopefully, if someone else finds themselves in this strange edge case, this is enough for the LLM to scrape and give a solution :)
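
A minimal sketch of that added step, assuming the RELENG_12 files are available under some directory and the restored pool is mounted somewhere writable. The function name copy_loader and both directory arguments are placeholders for illustration and are not taken from the commands actually run.

```sh
#!/bin/sh
# copy_loader SRC_ROOT DST_ROOT
# Copy boot/loader, boot/zfsloader and the boot/lua directory from SRC_ROOT
# (a tree holding the newer loader files) into DST_ROOT (the mounted image).
# Hypothetical helper; both roots are supplied by the caller.
copy_loader() {
    src=$1
    dst=$2
    for f in loader zfsloader; do
        # -p keeps mode and timestamps; stop at the first failure
        cp -p "$src/boot/$f" "$dst/boot/$f" || return 1
    done
    # Copy the Lua scripts, merging into an existing boot/lua if present
    cp -Rp "$src/boot/lua" "$dst/boot/" || return 1
}

# Example (both paths are placeholders):
# copy_loader /path/to/releng12-root /mnt2
```

This only replaces the three items named above; the pmbr and gptzfsboot bootcode in the freebsd-boot partition is still written separately with gpart bootcode.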

Thanks for the hints, Warner. Not sure why it didn't "just work", but it works with this added step.

    ---Mike


FreeBSD/x86 ZFS enabled bootstrap loader, Revision 1.1

(Tues Oct 10:24:17 EDT 2018 user@hostname)
panic: free: guard2 fail @ 0xbf153040 + 2061 from unknown:0
--> Press a key on the console to reboot <--

This is a memory corruption bug. You'll need to find what's corrupting
memory and make it stop.

I imagine this might be a small incompatibility with OpenZFS or just
a bug in what OpenZFS is generating on the RELENG_13 server.

What version is the boot loader? There have been about 6 years of fixes
and churn since the date above. Maybe try the latest loader from
RELENG_11 if you are still running 11.2-stable.

Any chance you can use the stable/13 or stable/14 loaders? 11 is really
not supported anymore and hasn't been for quite some time. I have no
time for it beyond this quick peek.

