Date: Thu, 16 May 2024 08:38:08 -0600
From: Warner Losh <imp@bsdimp.com>
To: mike tancsa <mike@sentex.net>
Cc: FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject: Re: Open ZFS vs FreeBSD ZFS boot issues
Message-ID: <CANCZdfpAXg_farsT3iypx8NGhOcuOWFUZnwbYG8sYAZoEzSmAw@mail.gmail.com>
In-Reply-To: <4c331e4f-75c0-4124-bb11-84568e91ca61@sentex.net>
References: <4c331e4f-75c0-4124-bb11-84568e91ca61@sentex.net>
On Thu, May 16, 2024 at 8:14 AM mike tancsa <mike@sentex.net> wrote:
> I have a strange edge case I am trying to work around. I have a
> customer's legacy VM which is RELENG_11 on ZFS. There is some
> corruption that won't clear on a bunch of directories, so I want to
> re-create it from backups. I have done this many times in the past, but
> this one is giving me grief. Normally I do something like this on my
> backup server (RELENG_13):
>
> truncate -s 100G file.raw
> mdconfig -f file.raw
> gpart create -s gpt md0
> gpart add -t freebsd-boot -s 512k md0
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 md0
> gpart add -t freebsd-swap -s 2G md0
> gpart add -t freebsd-zfs md0
> zpool create -d -f -o altroot=/mnt2 -o feature@lz4_compress=enabled \
>     -o cachefile=/var/tmp/zpool.cache myZFSPool /dev/md0p3

I'm surprised you don't specifically create compatibility with some older
standard and then maybe add compression. But I'd start there: create a
pool that doesn't use lz4_compress (it's not read-only compatible,
meaning the old boot loader has to implement it 100% faithfully).

You might also look at disabling one or both of hole_birth and embedded_data
as well. Those aren't 'read-only' compatible either.

But all of that is kinda speculative. I'm not aware of any bugs in this
area that were fixed, and all these options are in the features_for_read
list that is in the boot loader.

> Then zfs send -r backuppool | zfs recv myZFSPool
>
> I can then export / import myZFSPool without issue. I can even
> import and examine myZFSPool on the original RELENG_11 VM that is
> currently running. A checksum of all the files under /boot is
> identical.
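Warner's suggestion can be sketched concretely. This is a hypothetical
variant of the quoted procedure (file, pool, and mount names reused from
the message, not from anything Warner actually ran): with `zpool create -d`
every feature flag starts disabled, so simply never enabling
`lz4_compress` (or anything else) leaves the pool at the baseline v5000
on-disk format:

```shell
# Hedged sketch: same layout as the quoted procedure, but with no
# feature flags enabled at all (-d disables them, and we never enable
# lz4_compress, hole_birth, or embedded_data afterwards).
truncate -s 100G file.raw
md=$(mdconfig -f file.raw)                      # e.g. md0
gpart create -s gpt "$md"
gpart add -t freebsd-boot -s 512k "$md"
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 "$md"
gpart add -t freebsd-swap -s 2G "$md"
gpart add -t freebsd-zfs "$md"
zpool create -d -f -o altroot=/mnt2 \
    -o cachefile=/var/tmp/zpool.cache myZFSPool "/dev/${md}p3"
# every feature@ property should now report 'disabled'
zpool get all myZFSPool | grep 'feature@'
```

If the image then boots, features can be re-enabled one at a time to find
the one the old loader trips over.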
> But every time I try to boot it (KVM), it panics early:
>
> FreeBSD/x86 ZFS enabled bootstrap loader, Revision 1.1
>
> (Tues Oct 10:24:17 EDT 2018 user@hostname)
> panic: free: guard2 fail @ 0xbf153040 + 2061 from unknown:0
> --> Press a key on the console to reboot <--

This is a memory corruption bug. You'll need to find what's corrupting
memory and make it stop.

I imagine this might be a small incompatibility with OpenZFS, or just a
bug in what OpenZFS is generating on the RELENG_13 server.

> Through a bunch of pf rdrs and NFS mounts, I was able to do the same
> steps above on the live RELENG_11 image and do the zfs send/recv, and
> the image boots up no problem. Any ideas on how to work around this, or
> what the problem might be? The issue seems to be that I do the zfs recv
> on a RELENG_13 box. If I do the zfs recv on RELENG_11, it works, but
> takes forever (a LOT longer). zdb differences [1] below.
>
> The kernel is r339251 11.2-STABLE. I know this is a crazy old issue,
> but I'm hoping to at least learn something about ZFS as a result of
> going down this rabbit hole. I will, I think, just do the send|recv via
> RELENG_11 to get them up and running. They don't have the $ to get me
> to upgrade it all for them, and this is partly a favor to help them
> limp along a bit more...

What version is the boot loader? There have been like 6 years of fixes
and churn since the date above. Maybe use the latest RELENG_11 loader if
you are still running 11.2-stable.

Any chance you can use the stable/13 or stable/14 loaders? 11 is really
not supported anymore and hasn't been for quite some time. I have no
time for it beyond this quick peek.
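The loader suggestion can be sketched as follows. This is my hedged
reading, not commands from the thread: since `gpart bootcode` only
rewrites the boot partition, the image can be re-stamped with a newer
`/boot/gptzfsboot` from a stable/13 or stable/14 machine without touching
the pool itself:

```shell
# Hedged sketch: overwrite the freebsd-boot partition of the image with
# gptzfsboot from a newer branch, leaving the ZFS partition untouched.
md=$(mdconfig -f file.raw)                  # attach the image, e.g. md0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 "$md"
mdconfig -d -u "${md#md}"                   # detach when done
```

Booting with a 2018-era loader (as the panic banner shows) means six
years of loader fixes are missing regardless of how the pool was written.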
Warner

> ---Mike
>
>
> 1 zdb live pool
>
> ns9zroot:
>     version: 5000
>     name: 'livezroot'
>     state: 0
>     txg: 26872926
>     pool_guid: 15183996218106005646
>     hostid: 2054190969
>     hostname: 'customer-hostname'
>     com.delphix:has_per_vdev_zaps
>     vdev_children: 1
>     vdev_tree:
>         type: 'root'
>         id: 0
>         guid: 15183996218106005646
>         create_txg: 4
>         children[0]:
>             type: 'disk'
>             id: 0
>             guid: 15258031439924457243
>             path: '/dev/vtbd0p3'
>             whole_disk: 1
>             metaslab_array: 256
>             metaslab_shift: 32
>             ashift: 12
>             asize: 580889083904
>             is_log: 0
>             DTL: 865260
>             create_txg: 4
>             com.delphix:vdev_zap_leaf: 129
>             com.delphix:vdev_zap_top: 130
>     features_for_read:
>         com.delphix:hole_birth
>         com.delphix:embedded_data
>
>
> MOS Configuration:
>         version: 5000
>         name: 'fromBackupPool'
>         state: 0
>         txg: 2838
>         pool_guid: 1150606583960632990
>         hostid: 2054190969
>         hostname: 'customer-hostname'
>         com.delphix:has_per_vdev_zaps
>         vdev_children: 1
>         vdev_tree:
>             type: 'root'
>             id: 0
>             guid: 1150606583960632990
>             create_txg: 4
>             children[0]:
>                 type: 'disk'
>                 id: 0
>                 guid: 4164348845485675975
>                 path: '/dev/md0p3'
>                 whole_disk: 1
>                 metaslab_array: 256
>                 metaslab_shift: 29
>                 ashift: 12
>                 asize: 105221193728
>                 is_log: 0
>                 create_txg: 4
>                 com.delphix:vdev_zap_leaf: 129
>                 com.delphix:vdev_zap_top: 130
>         features_for_read:
>             com.delphix:hole_birth
>             com.delphix:embedded_data

Neither of these
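One observation worth making explicit (my sketch, not from the thread):
the features_for_read lists in the two zdb dumps are identical, so the
panic is unlikely to be a feature flag the old loader cannot read; the
dumps differ only in pool identity and geometry (guid, txg, asize,
metaslab_shift, DTL). A trivial check:

```shell
# features_for_read as reported by zdb for each pool, copied verbatim
# from the dumps above; the loader only has to understand this list.
live="com.delphix:hole_birth
com.delphix:embedded_data"
backup="com.delphix:hole_birth
com.delphix:embedded_data"
if [ "$live" = "$backup" ]; then
    echo "features_for_read: identical"
fi
```

This prints "features_for_read: identical", which points suspicion back
at loader bugs or at subtler on-disk differences rather than at feature
flags.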