Date: Thu, 27 Jan 2022 14:57:03 +0100
From: Ulrich Spörlein <uqs@freebsd.org>
To: freebsd-stable@freebsd.org
Subject: gptzfsboot can't boot from 4TB SSD
Message-ID: <YfKkr%2BmC6rMrRlF9@acme.spoerlein.net>
Hey folks,

I'm stumped on what I assume is a BIOS bug...

I upgraded a system from 2013, replacing a 60GB SSD with UFS (plus GELI and ZFS) with a shiny new 4TB Samsung SSD, on ZFS and with some parts of the pool encrypted. This was fine for half a year or so, but I noticed a stream of `zio_read error 5` from boot0 (I think?) sometimes during boot on the serial console. It came up fine anyway.

I did another installworld/installkernel dance yesterday and the system no longer boots. First it showed streams of that zio_read error, then it failed to load /tank/ROOT/default:/boot/kernel/kernel. Using the '?' command I could see / just fine, but I could get no combination to work to read inside directories. The manpage makes me think I should try /boot?, but maybe I should've added a space? I typed in /boot/kernel.old/kernel (!) and it started the spinner, but died shortly afterwards.

I put the SSD into a different system (from ca. 2014) and it boots up just fine. I put the old 60GB SSD back in the old system, and it also boots just fine.

OK, on the newer system, I re-wrote the bootcode with `gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0`, but that seems to have made things even worse on the old system. Now the tank pool isn't even found anymore (due to the protective MBR maybe?), all I get is some zio_read errors and (from memory, sorry):

    ZFS: i/o error - all block copies unavailable
    ZFS: can't read MOS of pool tank

Issuing '?' finds nothing anymore.

My hypothesis is that this worked initially because the loader sat below some specific LBA threshold, but with more data on the disk, every update moved it further back, and this triggers a BIOS bug. There was even a BIOS update from 2018 that I flashed, but it didn't fix any of this, only some Intel ME bugs. Sigh.
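One way to sanity-check the threshold hypothesis is plain arithmetic on the sector counts from the gpart output further down. This is only a sketch: the 2^32-sector boundary is an assumed candidate for the threshold (the point where an LBA stops fitting in 32 bits), not something confirmed for this BIOS, and the helper name `sectors_beyond_limit` is made up for illustration. The 512-byte sector size follows from gpart reporting 1024 sectors as 512K.

```python
# Sector counts copied from the `gpart show` output in this message.
SECTOR_SIZE = 512        # bytes; 1024 sectors are reported as 512K
LBA32_LIMIT = 2**32      # ASSUMED threshold: first LBA that needs more than 32 bits

partitions = [
    # (type, start LBA, size in sectors)
    ("freebsd-boot", 40, 1024),
    ("freebsd-swap", 2048, 33554432),
    ("freebsd-zfs", 33556480, 7780478976),
]


def sectors_beyond_limit(start, size, limit=LBA32_LIMIT):
    """Return how many sectors of a partition lie at or above `limit`."""
    end = start + size  # exclusive end LBA
    return max(0, end - max(start, limit))


for name, start, size in partitions:
    beyond = sectors_beyond_limit(start, size)
    print(f"{name}: {beyond} of {size} sectors at or above LBA 2^32 "
          f"({100 * beyond / size:.1f}%)")

# The same boundary expressed in bytes (2 TiB with 512-byte sectors).
print(f"boundary at byte offset {LBA32_LIMIT * SECTOR_SIZE}")
```

If the assumption holds, only the freebsd-zfs partition reaches past the boundary, which would fit blocks becoming unreadable once the pool starts placing boot-relevant data in the upper part of the disk.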
This is what it looks like:

    % gpart show
    =>          40  7814037088  ada0  GPT  (3.6T)
                40        1024     1  freebsd-boot  (512K)
              1064         984        - free -  (492K)
              2048    33554432     2  freebsd-swap  (16G)
          33556480  7780478976     3  freebsd-zfs  (3.6T)
        7814035456        1672        - free -  (836K)

So can this be a shortcoming in the BIOS with large drives? I had thought that only applied to boot0, not the loader itself.

I thought I could maybe boot from a USB stick and have it find the root of the pool, but the CMOS battery is dead, so I can't switch the boot drive unattended. I can't even turn on UEFI boot because, due to the battery, the setting won't stick. And this being an "industrial" PC, the CMOS battery is actually rechargeable but soldered onto the board, so I would have to get that fixed as well. Sigh.

Should I try to switch to UEFI? Would I have to move all 4T around and re-partition, or could I steal 256M from the swap partition?

Should I try with, gasp, GRUB2? I'm kinda stuck on GPT and BIOS here for a while, I think.

Thanks for reading all of that,
Uli
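On the idea of stealing 256M from swap, the sector arithmetic can be sketched as follows. It assumes the space is taken from the end of partition 2, so the freebsd-zfs partition keeps its start LBA and nothing in the pool has to move. The helper `carve_from_end` is a hypothetical name, and the sketch only computes offsets; it does not touch the disk.

```python
SECTOR_SIZE = 512          # bytes; 1024 sectors are reported as 512K
MIB = 1024 * 1024

# Values copied from the `gpart show` output in this message.
swap_start, swap_size = 2048, 33554432   # partition 2, freebsd-swap
zfs_start = 33556480                     # partition 3, freebsd-zfs


def carve_from_end(start, size, carve_sectors):
    """Shrink a partition by `carve_sectors` taken from its end.

    Returns (new size of the shrunk partition, start LBA of the freed space).
    """
    if carve_sectors >= size:
        raise ValueError("cannot carve more than the partition holds")
    new_size = size - carve_sectors
    return new_size, start + new_size


esp_sectors = 256 * MIB // SECTOR_SIZE   # 256 MiB expressed in sectors
new_swap_size, esp_start = carve_from_end(swap_start, swap_size, esp_sectors)

# The freed space must end exactly where the ZFS partition begins.
assert esp_start + esp_sectors == zfs_start

print(f"swap shrinks to {new_swap_size} sectors")
print(f"freed 256 MiB region: start LBA {esp_start}, {esp_sectors} sectors")
print(f"start is 1 MiB aligned: {esp_start % 2048 == 0}")
```

Under these assumptions the freed region lies well below the 2^32-sector mark and ends right where the pool begins, so the 4T of data would stay in place.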