Date:      Thu, 28 Nov 2024 08:05:39 -0500
From:      Dennis Clarke <dclarke@blastwave.org>
To:        Current FreeBSD <freebsd-current@freebsd.org>
Subject:   zpools no longer exist after boot
Message-ID:  <5798b0db-bc73-476a-908a-dd1f071bfe43@blastwave.org>


This is a baffling problem wherein two zpools no longer exist after a
reboot. The system is:

titan# uname -apKU
FreeBSD titan 15.0-CURRENT FreeBSD 15.0-CURRENT #1 main-n273749-4b65481ac68a-dirty: Wed Nov 20 15:08:52 GMT 2024 root@titan:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG amd64 amd64 1500027 1500027
titan#

titan# zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
t0     444G  91.2G   353G        -         -    27%    20%  1.00x  ONLINE  -
titan#

The *only* zpool that seems to exist in any reliable way is the little
NVMe-based boot device. The other two zpools vanished, and yet the
underlying devices are present just fine:

titan#
titan# camcontrol devlist
<ST20000NM007D-3DJ103 SN03>        at scbus0 target 0 lun 0 (pass0,ada0)
<ST20000NM007D-3DJ103 SN03>        at scbus1 target 0 lun 0 (pass1,ada1)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus2 target 0 lun 0 (ses0,pass2)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus6 target 0 lun 0 (ses1,pass3)
<SAMSUNG MZVKW512HMJP-000L7 6L6QCXA7>  at scbus7 target 0 lun 1 (pass4,nda0)
<FREEBSD CTLDISK 0001>             at scbus8 target 0 lun 0 (da0,pass5)
titan#
titan# nvmecontrol devlist
  nvme0: SAMSUNG MZVKW512HMJP-000L7
     nvme0ns1 (488386MB)
titan#
titan# zpool status t0
  pool: t0
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not
        support the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:44 with 0 errors on Wed Feb  7 09:56:40 2024
config:

         NAME        STATE     READ WRITE CKSUM
         t0          ONLINE       0     0     0
           nda0p3    ONLINE       0     0     0

errors: No known data errors
titan#
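
Since the devices themselves show up, the ZFS labels on them can be
checked directly. Something along these lines, where the partition name
is only a guess and the real vdevs of the missing pools should be
substituted:

  zdb -l /dev/ada0p1        # dump the vdev labels, if any are present
  zpool import              # scan all devices for importable pools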


Initially I thought the problem was related to the cachefile property
being empty for these zpools. However, even if I set the cachefile to
something reasonable, the property vanishes at reboot. The file itself,
of course, exists just fine:

titan# zpool get cachefile proteus
NAME     PROPERTY   VALUE      SOURCE
proteus  cachefile  -          default
titan#
titan# zpool set cachefile="/var/log/zpool_cache" proteus
titan# zpool get cachefile proteus
NAME     PROPERTY   VALUE                 SOURCE
proteus  cachefile  /var/log/zpool_cache  local
titan# ls -ladb /var/log/zpool_cache
-rw-r--r--  1 root wheel 1440 Nov 28 11:45 /var/log/zpool_cache
titan#

So there we have 1440 bytes of data in that file.
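
For context on why the location of this file matters: the boot-time
import on FreeBSD is done by /etc/rc.d/zpool, which, as I read it, uses
only the first readable cache file from the default locations, roughly:

  # paraphrase of the import loop in /etc/rc.d/zpool, not verbatim
  for cachefile in /boot/zfs/zpool.cache /etc/zfs/zpool.cache; do
          if [ -r $cachefile ]; then
                  zpool import -c $cachefile -a -N
                  break
          fi
  done

Note that /var/log/zpool_cache is not one of those locations, so a
cache file placed there would never be consulted at boot in any case;
it is only a test of whether the property sticks.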

titan# zpool set cachefile="/var/log/zpool_cache" t0
titan# zpool get cachefile t0
NAME  PROPERTY   VALUE                 SOURCE
t0    cachefile  /var/log/zpool_cache  local
titan#
titan# ls -ladb /var/log/zpool_cache
-rw-r--r--  1 root wheel 2880 Nov 28 11:46 /var/log/zpool_cache
titan#

Now we have 2 * 1440 bytes = 2880 bytes of some zpool cache data.
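
If anyone wants to verify, zdb can dump the configurations held in an
alternate cache file:

  zdb -C -U /var/log/zpool_cache    # show the pool configs cached in that file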

titan# zpool set cachefile="/var/log/zpool_cache" leaf
titan# zpool get cachefile leaf
NAME  PROPERTY   VALUE                 SOURCE
leaf  cachefile  /var/log/zpool_cache  local
titan#
titan# zpool get cachefile t0
NAME  PROPERTY   VALUE                 SOURCE
t0    cachefile  /var/log/zpool_cache  local
titan#
titan# zpool get cachefile proteus
NAME     PROPERTY   VALUE                 SOURCE
proteus  cachefile  /var/log/zpool_cache  local
titan#
titan# reboot

From here on, the only zpool that exists after boot is the little local
Samsung NVMe unit.
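
A couple of boot-side sanity checks, for the record (nothing here is
pool-specific):

  sysrc -n zfs_enable                               # must be YES for rc.d/zpool and rc.d/zfs to run
  ls -l /boot/zfs/zpool.cache /etc/zfs/zpool.cache  # the default cache files the boot import reads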

So here I can import those pools and then see that the cachefile
property has been wiped out:

titan#
titan# zpool import proteus
titan# zpool import leaf
titan#
titan# zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
leaf     18.2T   984K  18.2T        -         -     0%     0%  1.00x  ONLINE  -
proteus  1.98T   361G  1.63T        -         -     1%    17%  1.00x  ONLINE  -
t0        444G  91.2G   353G        -         -    27%    20%  1.00x  ONLINE  -
titan#
titan# zpool get cachefile leaf
NAME  PROPERTY   VALUE      SOURCE
leaf  cachefile  -          default
titan#
titan# zpool get cachefile proteus
NAME     PROPERTY   VALUE      SOURCE
proteus  cachefile  -          default
titan#
titan# zpool get cachefile t0
NAME  PROPERTY   VALUE      SOURCE
t0    cachefile  -          default
titan#
titan# ls -l /var/log/zpool_cache
-rw-r--r--  1 root wheel 4960 Nov 28 11:52 /var/log/zpool_cache
titan#

The cachefile exists and seems to have grown in size.
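
What I have not checked yet is what the default cache file, the one the
boot-time import actually reads, thinks it knows. Something like this,
where the path is an assumption and /boot/zfs/zpool.cache is the other
candidate:

  zdb -C -U /etc/zfs/zpool.cache    # configs the boot-time import would see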

However, a reboot once again leaves nothing but the t0 pool.
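
If no explanation turns up, the blunt stopgap that comes to mind,
untested here, is to sidestep the cache file entirely and scan at boot:

  # /etc/rc.local -- untested stopgap, not a fix
  zpool import -a    # import every pool found by scanning devices
  zfs mount -a       # mount their datasets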

Baffled.

Any thoughts would be welcome.

--
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken


