Date: Tue, 8 Dec 2020 19:18:54 +0100
From: Michael Gmelin <freebsd@grem.de>
To: freebsd-current@freebsd.org
Subject: Re: KLD zfs.ko: depends on kernel - not available or version mismatch
Message-ID: <20201208191854.33fbb929@bsd64.grem.de>
In-Reply-To: <2B044A92-500F-4121-85DB-D486865C75B5@gmail.com>
References: <42AC7323-5AD6-401D-9A7D-F1D962EE5717@gmail.com> <X8+ebiETBdNpbRXt@phouka1.phouka.net> <2B044A92-500F-4121-85DB-D486865C75B5@gmail.com>
On Tue, 8 Dec 2020 19:10:26 +0100
Alban Hertroys <haramrae@gmail.com> wrote:

> > On 8 Dec 2020, at 16:40, John Kennedy <warlock@phouka.net> wrote:
> >
> > On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
> >> This seems to have gotten lost in the moderation queue, but after a
> >> week I am no closer to a solution, so here’s a resend:
> >>
> >> I’ve been trying to get a fresh world running (for the eventual
> >> purpose of running amdgpu against my recent graphics adapter), but
> >> I run into trouble with core loadable kernel modules, such as
> >> zfs.ko from the subject. It also happens with other modules that I
> >> tried randomly, for example, geom_mirror.ko.
> >>
> >> I updated to the latest current using svn up in /usr/src, then:
> >> make clean
> >> make buildworld kernel -j12
> >> shutdown -r now
> >>
> >> boot to single user mode
> >>
> >> kldload zfs
> >
> > I'm not sure you've provided enough information for a one-shot
> > armchair diagnosis, but some things seem factually wrong. For
> > example, my normal rebuild procedure is:
> >
> > cd /usr/src && make buildworld && make buildkernel
> > make installkernel
> > shutdown -r now
> >
> > cd /usr/src && mergemaster -pi
> > make installworld
> > mergemaster -Fi
> > make -DBATCH_DELETE_OLD_FILES delete-old
>
> Aha! So that’s how to prevent having to press ‘y’ for every
> deprecated file!
>
> > shutdown -r now
> >
> > cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
> >
> > (I'm on a desktop system here. You haven't described your setup.)
>
> This is also a desktop system.
>
> > You didn't say that you've installed the new kernel, which at least
> > starts you down the road towards a driver/kernel mismatch. You
> > presumably have a non-ZFS boot+root.
>
> I’m fairly sure I did, actually.
>
> Last time I checked, "make buildworld buildkernel" was equivalent to
> "make buildworld && make buildkernel", while "make kernel" is a
> shorthand for “make buildkernel && make installkernel”
>
> So, unless I’m mistaken, “make buildworld kernel” should be
> equivalent to your first two lines.
>
> Nevertheless, I retried without these assumptions, the result was the
> same. I forgot to “make delete-old” though, I rarely remember to do
> that…
>
> > Did you mess around with the ZFS from ports (ZoL -> ZoF)
> > at some point so you're not using the kernel's ZFS drivers? What
> > ZFS entries do you have in /etc/loader.conf, /etc/rc.conf, and some
> > of the variants that may get dragged in? (see rc.conf(5) for
> > possibilities)
>
> Nope, stock modules here.
>
> > At the bottom of your email, you say / is UFS and /usr is ZFS, but
> > I guess we have the extra fun of wondering what is under /usr on
> > your /? If you have a pre-ZFS /usr that is populated by something
> > now presumably very old (because all the new, current stuff went
> > onto ZFS /usr, now unavailable).
>
> There is no populated directory /usr on the UFS file-system. This
> install was created on a fresh NVME disk based on an existing install
> on a spinning platter. The install happened with /usr mounted at the
> ZFS file-system.
>
> I had to copy over several files from /etc and /usr/local/etc and
> re-installed the most important packages. This was admittedly a bit
> messy, it is possible that I forgot to copy something over.
> (Originally my intention was to dd the contents of the spinning disk
> over, but apparently that disk has a few wonky sectors, dd failed
> after a few device timeouts)
>
>
> I did sort-of manage to fix things, but recent kernels keep causing
> the same issue:
>
> I noticed that uname -a said I was at revision 366335, while I had
> the source tree up-to-date. For a test, I reverted back to that
> revision and went through: make buildworld make buildkernel
>
> Which broke on /usr/local/sys/drm-current-kmod, which I turned out to
> have installed through pkg. There have been changes to the linux_kpi
> shortly after above revision - probably what broke compatibility
> between HEAD and r366335.
>
> After removing that pkg, the kernel built and installed, world
> installed fine too and I have a working system again, with kernel and
> world in sync.
>
> So I tried again to move to HEAD:
>
> cd /usr/src
> svn up
> make buildworld -j12
> make buildkernel -j12
> make installkernel
> shutdown -r now
> <single user mode>
> mount -u /
> zpool import -Nf system (my /usr FS)
>
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
>
> >> Which results in dmesg messages:
> >>
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >
> > Be sure to check out /var/log/messages for extra issues. For
> > example, with the bug I mentioned below, I couldn't load my nvidia
> > driver and that manifested as one driver having issues because it
> > depended on another, which had the root of the problem.
>
> I forgot to look there. If I find anything suspicious there, I’ll let
> you know. That system doesn’t have a convenient mail client yet, so
> for now it’s copying output to files and scp-ing that to the Mac.
>
> >> I can load the zfs kernel module from kernel.old just fine:
> >>
> >> ZFS filesystem version: 5
> >> ZFS storage pool version: features support (5000)
> >
> > I kicked my more bleeding-edge system over from 12.2-rel (r366954)
> > up into 13.0-current (r367044, 1300123) on 2020/10/26. OpenZFS
> > kicked in 2020/8/24? I think the CFT was ~2018/8/21, not sure when
> > we had the OpenZFS ports. Current bumps the ABI version pretty
> > frequently so I'd think you'd have tripped across versioning issues
> > a long time ago if you had some drivers not being rebuilt.
>
> Having a conflict between kernel and world was what I was expecting
> too, but I can’t figure out what got me into that situation. For all
> I know, they should be in sync now, especially after I reverted the
> tree back to rev 366335 and made world again (acc. to above method).
>
> >
> >> This happens with any kernel module I’ve tried, such as
> >> geom_mirror and amdgpu (from ports/graphics/drm-current-kmod - the
> >> latter causes a kernel panic with kernel.old BTW).
> >>
> >> I’ve gone back as far as Oct 7 (before changes to
> >> kern/elf_load_obj.c off the top of my head), looked at mailing
> >> list archives and forums etc, all to no avail.
> >>
> >> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I
> >> had /etc/malloc.conf with the recommended symlink from UPDATING,
> >> but the same happens with that moved out of the way. Nothing seems
> >> to help.
> >>
> >> Do I need to go back further to get into a usable state or is
> >> there something else I should be doing?
> >
> > With very few exceptions (bug 250897, 2020/11/6), I've found
> > 13-current bootable since 10/26 (up through my current system, 13.0
> > r368388 (2020/12/6)). You obviously need to make sure that any extra
> > drivers you add in are compiled against the kernel, but ZFS is
> > typically one of those.
>
> I think we covered that.
>
> Thanks for the help and the pointers, but unfortunately the mystery
> remains.
>

Do you have anything in /boot/modules? (wild shot)

-m

--
Michael Gmelin