Date: Thu, 28 Nov 2024 07:52:35 -0600
From: Alan Somers <asomers@freebsd.org>
To: Dennis Clarke <dclarke@blastwave.org>
Cc: Current FreeBSD <freebsd-current@freebsd.org>
Subject: Re: zpools no longer exist after boot
Message-ID: <CAOtMX2hKCYrx92SBLQOtekKiBWMgBy_n93ZGQ_NVLq=6puRhOg@mail.gmail.com>
In-Reply-To: <5798b0db-bc73-476a-908a-dd1f071bfe43@blastwave.org>
[-- Attachment #1 --]
On Thu, Nov 28, 2024, 7:06 AM Dennis Clarke <dclarke@blastwave.org> wrote:

>
> This is a baffling problem wherein two zpools no longer exist after
> boot. This is :
>
> titan# uname -apKU
> FreeBSD titan 15.0-CURRENT FreeBSD 15.0-CURRENT #1
> main-n273749-4b65481ac68a-dirty: Wed Nov 20 15:08:52 GMT 2024
> root@titan:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 amd64
> 1500027 1500027
> titan#
>
> titan# zpool list
> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP
> HEALTH  ALTROOT
> t0     444G  91.2G   353G        -         -    27%    20%  1.00x
> ONLINE  -
> titan#
>
> The *only* zpool that seems to exist in any reliable way is the little
> NVME based unit for booting. The other two zpools vanished and yet the
> devices exist just fine :
>
> titan#
> titan# camcontrol devlist
> <ST20000NM007D-3DJ103 SN03>           at scbus0 target 0 lun 0 (pass0,ada0)
> <ST20000NM007D-3DJ103 SN03>           at scbus1 target 0 lun 0 (pass1,ada1)
> <AHCI SGPIO Enclosure 2.00 0001>      at scbus2 target 0 lun 0 (ses0,pass2)
> <AHCI SGPIO Enclosure 2.00 0001>      at scbus6 target 0 lun 0 (ses1,pass3)
> <SAMSUNG MZVKW512HMJP-000L7 6L6QCXA7> at scbus7 target 0 lun 1
> (pass4,nda0)
> <FREEBSD CTLDISK 0001>                at scbus8 target 0 lun 0 (da0,pass5)
> titan#
> titan# nvmecontrol devlist
>  nvme0: SAMSUNG MZVKW512HMJP-000L7
>     nvme0ns1 (488386MB)
> titan#
> titan# zpool status t0
>   pool: t0
>  state: ONLINE
> status: Some supported and requested features are not enabled on the pool.
>         The pool can still be used, but some features are unavailable.
> action: Enable all features using 'zpool upgrade'. Once this is done,
>         the pool may no longer be accessible by software that does not
> support
>         the features. See zpool-features(7) for details.
>   scan: scrub repaired 0B in 00:00:44 with 0 errors on Wed Feb 7
> 09:56:40 2024
> config:
>
>         NAME      STATE   READ WRITE CKSUM
>         t0        ONLINE     0     0     0
>           nda0p3  ONLINE     0     0     0
>
> errors: No known data errors
> titan#
>
>
> Initially I thought the problem was related to cachefile being empty for
> these zpools. However if I set the cachefile to something reasonable
> then the cachefile property vanishes at a reboot. The file, of course,
> exists just fine :
>
> titan# zpool get cachefile proteus
> NAME     PROPERTY   VALUE  SOURCE
> proteus  cachefile  -      default
> titan#
> titan# zpool set cachefile="/var/log/zpool_cache" proteus
> titan# zpool get cachefile proteus
> NAME     PROPERTY   VALUE                 SOURCE
> proteus  cachefile  /var/log/zpool_cache  local
> titan# ls -ladb /var/log/zpool_cache
> -rw-r--r-- 1 root wheel 1440 Nov 28 11:45 /var/log/zpool_cache
> titan#
>
> So there we have 1440 bytes of data in that file.
>
> titan# zpool set cachefile="/var/log/zpool_cache" t0
> titan# zpool get cachefile t0
> NAME  PROPERTY   VALUE                 SOURCE
> t0    cachefile  /var/log/zpool_cache  local
> titan#
> titan# ls -ladb /var/log/zpool_cache
> -rw-r--r-- 1 root wheel 2880 Nov 28 11:46 /var/log/zpool_cache
> titan#
>
> Now we have 2 * 1440 bytes = 2880 bytes of some zpool cache data.
>
> titan# zpool set cachefile="/var/log/zpool_cache" leaf
> titan# zpool get cachefile leaf
> NAME  PROPERTY   VALUE                 SOURCE
> leaf  cachefile  /var/log/zpool_cache  local
> titan#
> titan# zpool get cachefile t0
> NAME  PROPERTY   VALUE                 SOURCE
> t0    cachefile  /var/log/zpool_cache  local
> titan#
> titan# zpool get cachefile proteus
> NAME     PROPERTY   VALUE                 SOURCE
> proteus  cachefile  /var/log/zpool_cache  local
> titan#
> titan# reboot
>
> From here on ... the only zpool that exists after boot is the local
> little NVME samsung unit.
>
> So here I can import those pools and then see that the cachefile
> property has been wiped out :
>
> titan#
> titan# zpool import proteus
> titan# zpool import leaf
> titan#
> titan# zpool list
> NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP
> HEALTH  ALTROOT
> leaf     18.2T   984K  18.2T        -         -     0%     0%  1.00x
> ONLINE  -
> proteus  1.98T   361G  1.63T        -         -     1%    17%  1.00x
> ONLINE  -
> t0        444G  91.2G   353G        -         -    27%    20%  1.00x
> ONLINE  -
> titan#
> titan# zpool get cachefile leaf
> NAME  PROPERTY   VALUE  SOURCE
> leaf  cachefile  -      default
> titan#
> titan# zpool get cachefile proteus
> NAME     PROPERTY   VALUE  SOURCE
> proteus  cachefile  -      default
> titan#
> titan# zpool get cachefile t0
> NAME  PROPERTY   VALUE  SOURCE
> t0    cachefile  -      default
> titan#
> titan# ls -l /var/log/zpool_cache
> -rw-r--r-- 1 root wheel 4960 Nov 28 11:52 /var/log/zpool_cache
> titan#
>
> The cachefile exists and seems to have grown in size.
>
> However a reboot will once again provide nothing but the t0 pool.
>
> Baffled.
>
> Any thoughts would be welcome.
>
> --
> --
> Dennis Clarke
> RISC-V/SPARC/PPC/ARM/CISC
> UNIX and Linux spoken

Do you have zfs_enable="YES" set in /etc/rc.conf? If not, then nothing will get imported.

Regarding the cachefile property, it's expected that "zpool import" will change it, unless you do "zpool import -o cachefile=whatever".
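
The rc.conf check asked about in the reply can be sketched as a small sh function. This is only an illustration under stated assumptions: the name check_zfs_enable is hypothetical (it is not a FreeBSD tool), it inspects only the single file it is given (so /etc/rc.conf.local and /etc/defaults/rc.conf are ignored), and it treats YES, TRUE, ON and 1 as true in any letter case, which is the convention rc.subr's checkyesno follows.

```shell
#!/bin/sh
# check_zfs_enable [FILE]
# Hypothetical helper, not part of FreeBSD: prints "enabled" if FILE
# (default /etc/rc.conf) sets zfs_enable to a true value, otherwise
# prints "disabled".
check_zfs_enable() {
    rcfile="${1:-/etc/rc.conf}"

    # A missing or unreadable file cannot enable anything.
    if [ ! -r "$rcfile" ]; then
        echo "disabled"
        return 0
    fi

    # rc.conf is plain sh syntax, so source it inside a command
    # substitution (a subshell) to keep its variables out of the
    # calling shell. An unset zfs_enable falls back to NO.
    value=$(
        unset zfs_enable
        . "$rcfile" >/dev/null 2>&1
        printf '%s' "${zfs_enable:-NO}"
    )

    # Same true values that rc.subr's checkyesno accepts.
    case "$value" in
        [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) echo "enabled" ;;
        *) echo "disabled" ;;
    esac
}
```

Calling check_zfs_enable /etc/rc.conf on the affected host would show whether the knob is set in that file. On FreeBSD itself, "sysrc zfs_enable" reports the effective value and "sysrc zfs_enable=YES" sets it. Separately, and as an assumption rather than something confirmed in this thread, the boot scripts probably read only the default cache file location, so a cachefile pointed at /var/log/zpool_cache may simply never be consulted at boot.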
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOtMX2hKCYrx92SBLQOtekKiBWMgBy_n93ZGQ_NVLq=6puRhOg>
