Date: Tue, 1 May 2018 23:25:41 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
To: Warner Losh <imp@bsdimp.com>, Jan Knepper <jan@digitaldaemon.com>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, Richard Yao <ryao@gentoo.org>, Alan Somers <asomers@freebsd.org>
Subject: Re: Getting ZFS pools back.
Message-ID: <d68339d6-1abf-94ff-1763-51ac2ceb7637@digiware.nl>
In-Reply-To: <b24f256e-35f9-e557-de5d-719f0fe98f1b@digiware.nl>
References: <5f836c79-b379-f066-689b-1645e393c5e9@digiware.nl>
 <E3B39DFA-269A-4041-922E-38F0CF35CB9A@gentoo.org>
 <a7fb7ffc-fa5f-4031-c78a-20e7ba618566@digiware.nl>
 <CAOtMX2gpuc0ntoxqKJv3iw3x_Dcq99zpcmqE8g%2B2QiDtYPHmZQ@mail.gmail.com>
 <1645b168-4133-693c-2dd3-8e0606abb9c3@digiware.nl>
 <07576f68-f67e-3a22-7a50-ff261c9b3fff@digitaldaemon.com>
 <CANCZdfonKRcFKiV%2BCmCvAQ3O5h%2BuNBcWDW7oyxOhWMdmpDHEcw@mail.gmail.com>
 <7588abf8-16e4-8820-a0e5-e019a02a7bd6@digiware.nl>
 <b24f256e-35f9-e557-de5d-719f0fe98f1b@digiware.nl>

On 30/04/2018 12:37, Willem Jan Withagen wrote:
> On 29-4-2018 23:20, Willem Jan Withagen wrote:
>> On 29/04/2018 20:21, Warner Losh wrote:
>>>
>>>
>>> On Sun, Apr 29, 2018 at 11:57 AM, Jan Knepper <jan@digitaldaemon.com
>>> <mailto:jan@digitaldaemon.com>> wrote:
>>>
>>> On 04/29/2018 13:27, Willem Jan Withagen wrote:
>>>
>>> Trouble started when I installed (freebsd-update) 11.1 over a
>>> running 10.4. Which is sort of scarry?
>>>
>>> This does sounds 'scary' as I am planning to do this in the (near)
>>> future...
>>>
>>> Has anyone else experienced issues like this?
>>>
>>> Generally I do build the new system software on a running system,
>>> but then go to single user mode to perform the actual install.
>>>
>>> I have done many upgrades like that over 18 or so years and never
>>> seen or heard of an issue alike this.
>>>
>>>
>>> 11.x binaries aren't guaranteed to work with a 10.x kernel. So that's
>>> a bit of a problem. freebsd-update shouldn't have let you do that
>>> either.
>>>
>>> However, most 11.x binaries work well enough to at least bootstrap /
>>> fix problems if booted on a 10.x kernel due to targeted forward
>>> compatibility. You shouldn't count on it for long, but it generally
>>> won't totally brick your box. In the past, and I believe this is
>>> still true, they work well enough to compile and install a new kernel
>>> after pulling sources. The 10.x -> 11.x syscall changes are such that
>>> you should be fine. At least if you are on UFS.
>>
>> I have been doing those kind of this for years and years. Even
>> upgrading over NFS and stuff. Sometimes it is a bit too close to the
>> sun and things burn. But never crash this bad.
>>
>>> However, the ZFS ioctls and such are in the bag of 'don't
>>> specifically guarantee and also they change a lot' so that may be why
>>> you can't mount ZFS by UUID. I've not checked to see if there's
>>> specifically an issue here or not. The ZFS ABI is somewhat more
>>> fragile than other parts of the system, so you may have issues here.
>>>
>>> If all else fails, you may be able to PXE boot an 11 kernel, or boot
>>> off a USB memstick image to install a kernel.
>>
>> Tried just about replace everything in both the boot-partition (First
>> growing it to take > 64K gptzfsboot) and in /boot from the memstick.
>> But the error never went away.
>>
>> Never had ZFS die on me this bad, that I could not get it back.
>>
>>> Generally, while we don't guarantee forward compatibility (running
>>> newer binaries on older kernels), we've generally built enough
>>> forward compat so that things work well enough to complete the
>>> upgrade. That's why you haven't hit an issue in 18 years of
>>> upgrading. However, the velocity of syscall additions has increased,
>>> and we've gone from fairly stable (stale?) ABIs for UFS to a more
>>> dynamic one for ZFS where backwards compat is a bit of a crap shoot
>>> and forward compat isn't really there at all. That's likely why
>>> you've hit a speed bump here.
>>
>> Come to think of it, I did not do this step with freebsd-update, since
>> I was not at an official release yet. I was going to 11.1-RELEASE, to
>> be able to start using freebsd-update.
>>
>> So I don't think I did just do that.... But I tried so much yesterday.
>> Normally I would installkernel, reboot, installworld, mergemaster,
>> reboot for systems that are not up for freebsd-update.
>
> Right,
>
> The story gets even sadder .....
> Took the "spare" disk home, and just connected it to an older SuperMicro
> server I had lying about for Ceph tests. And lo and behold, it just boots.
>
> So that system got upgraded from: 10.2 -> 10.4 -> 11.1
> No complaints about anything.
>
> So now I'm inclined to point at older hardware with an old bios, which
> confused ZFS, or probably more precisely gptzfsboot.
>
> From dmidecode:
> System Information
> Manufacturer: Supermicro
> Product Name: H8SGL
> Version: 1234567890
> BIOS Information
> Vendor: American Megatrends Inc.
> Version: 3.5
> Release Date: 11/25/2013
> Address: 0xF0000
>
> We only have 1 of those, so further investigation, and or tinkering, in
> combo with the hardware will be impossible.

Today I found the messages below in my daily report of the server:

+NMI ISA 3c, EISA ff
+NMI ISA 3c, EISA ff
+NMI ISA 3c, EISA ff
+NMI ... going to debugger
+NMI ... going to debugger
+NMI ISA 3c, EISA ff
+NMI ISA 2c, EISA ff
+NMI ... going to debugger
+NMI ... going to debugger
+NMI ISA 2c, EISA ff
+NMI ISA 3c, EISA ff
+NMI ... going to debugger
+NMI ... going to debugger
+NMI ... going to debugger
+NMI ISA 3c, EISA ff
+NMI ... going to debugger

Could these things have anything to do with the problem I had with
trying to find the pools?

--WjW