From: Warner Losh
Date: Wed, 22 May 2024 14:51:26 -0600
Subject: Re: Open ZFS vs FreeBSD ZFS boot issues (resolved sort of)
To: mike tancsa
Cc: FreeBSD-STABLE Mailing List
References: <4c331e4f-75c0-4124-bb11-84568e91ca61@sentex.net>
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Fri, May 17, 2024 at 1:52 PM mike tancsa wrote:

> On 5/16/2024 10:38 AM, Warner Losh wrote:
>
>> On Thu, May 16, 2024 at 8:14 AM mike tancsa wrote:
>>
>>> I have a strange edge case I am trying to work around. I have a
>>> customer's legacy VM which is RELENG_11 on ZFS. There is some
>>> corruption that won't clear on a bunch of directories, so I want to
>>> re-create it from backups. I have done this many times in the past
>>> but this one is giving me grief. Normally I do something like this
>>> on my backup server (RELENG_13)
>>>
>>> truncate -s 100G file.raw
>>> mdconfig -f file.raw
>>> gpart create -s gpt md0
>>> gpart add -t freebsd-boot -s 512k md0
>>> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 md0
>>> gpart add -t freebsd-swap -s 2G md0
>>> gpart add -t freebsd-zfs md0
>>> zpool create -d -f -o altroot=/mnt2 -o feature@lz4_compress=enabled \
>>>     -o cachefile=/var/tmp/zpool.cache myZFSPool /dev/md0p3
>>
>> I'm surprised you don't specifically create compatibility with some
>> older standard and then maybe add compression. But I'd start there:
>> create one that doesn't use lz4_compress (it's not read-only
>> compatible, meaning the old boot loader has to 100% implement it
>> faithfully).
>
> Hi Warner,
>
>     I thought -d would make the LCD. But looking at the updated man
> pages for zpool create, I didn't realize there are these handy-dandy
> files with all the supported features!
>
> Trying with
>
>     zpool create -o compatibility=/usr/share/zfs/compatibility.d/freebsd-11.2 \
>         -o altroot=/mnt2 -o cachefile=/var/tmp/zpool.cache myZFSPool /dev/md0p3
>
> and the pmbr and gptzfsboot from RELENG_12 still gives the same error
>
> However, if I copy over from RELENG_12 /boot/loader and /boot/zfsloader
> and /boot/lua I am able to boot. No idea why that is the case, but....
> I think this is "solved enough" for me and hopefully if someone else
> finds themselves in this strange edge case, this is enough for the LLM
> to scrape and give a solution :)
>
> Thanks for the hints Warner. Not sure why it didn't "just work" but it
> works with this added step.

OK. That tells me that the new code that supposedly creates images
compatible with old ZFS doesn't quite do so: it differs in some minor,
usually trivial way that, in this case, keeps the old boot loader from
loading the pool.

Copying the 12.x bootloader should just work with 11.x kernels. The
handoff protocols are the same (ish: there's one difference with UEFI
that would bite you there, but this is BIOS).

Warner

>     ---Mike
>
>>> FreeBSD/x86 ZFS enabled bootstrap loader, Revision 1.1
>>>
>>> (Tues Oct 10:24:17 EDT 2018 user@hostname)
>>> panic: free: guard2 fail @ 0xbf153040 + 2061 from unknown:0
>>> --> Press a key on the console to reboot <--
>>
>> This is a memory corruption bug. You'll need to find what's corrupting
>> memory and make it stop.
>>
>> I imagine this might be a small incompatibility with OpenZFS or just
>> a bug in what OpenZFS is generating on the releng13 server.
>>
>> What version is the boot loader? There's been like 6 years of fixes
>> and churn since the date above. Maybe try the latest on RELENG_11 for
>> it if you are still running 11.2-stable.
>>
>> Any chance you can use the stable/13 or stable/14 loaders? 11 is
>> really not supported anymore and hasn't been for quite some time. I
>> have no time for it beyond this quick peek.


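For anyone landing here with the same problem, the sequence that ended up working in this thread can be sketched as a small shell script. This is a minimal sketch, not a tested recipe. The `run` dry-run helper, the `image`/`loader`/`all` arguments, and the `SRC12` path (a directory holding a RELENG_12 tree to take boot files from) are hypothetical and not from the thread. It also assumes that mdconfig attaches the image as md0, and that the backup has been restored with its root filesystem mounted under /mnt2 before the loader files are copied (the restore step is not shown in the thread).

```sh
#!/bin/sh
# Sketch of the sequence reported to work in this thread. By default it only
# prints the commands (dry run); set RUN=1 to execute them on a FreeBSD host.
set -eu

IMG="${IMG:-file.raw}"
MD="${MD:-md0}"                      # assumption: mdconfig attaches the image as md0
POOL="${POOL:-myZFSPool}"
ALTROOT="${ALTROOT:-/mnt2}"
SRC12="${SRC12:-/path/to/releng12}"  # hypothetical: root of a RELENG_12 tree
COMPAT=/usr/share/zfs/compatibility.d/freebsd-11.2

# run: hypothetical helper; echoes the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Create the image, partition it, install the RELENG_12 boot blocks, and
# create the pool restricted to the freebsd-11.2 feature set.
build_image() {
    run truncate -s 100G "$IMG"
    run mdconfig -f "$IMG"
    run gpart create -s gpt "$MD"
    run gpart add -t freebsd-boot -s 512k "$MD"
    run gpart bootcode -b "$SRC12/boot/pmbr" -p "$SRC12/boot/gptzfsboot" -i 1 "$MD"
    run gpart add -t freebsd-swap -s 2G "$MD"
    run gpart add -t freebsd-zfs "$MD"
    run zpool create -o compatibility="$COMPAT" -o altroot="$ALTROOT" \
        -o cachefile=/var/tmp/zpool.cache "$POOL" "/dev/${MD}p3"
}

# Run this only after the backup has been restored into the pool, so that
# $ALTROOT/boot exists (assumption: the restored root is mounted at $ALTROOT).
install_loader() {
    run cp -R "$SRC12/boot/loader" "$SRC12/boot/zfsloader" "$SRC12/boot/lua" \
        "$ALTROOT/boot/"
}

case "${1:-all}" in
    image)  build_image ;;
    loader) install_loader ;;
    all)    build_image; install_loader ;;
esac
```

Run without arguments, it prints the whole plan. With RUN=1 the two halves would be run separately (`image`, then restore the backup, then `loader`), since the copy only makes sense once the restored /boot is in place.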