Date:      Tue, 8 Dec 2020 19:10:26 +0100
From:      Alban Hertroys <haramrae@gmail.com>
To:        John Kennedy <warlock@phouka.net>
Cc:        freebsd-current@freebsd.org
Subject:   Re: KLD zfs.ko: depends on kernel - not available or version mismatch
Message-ID:  <2B044A92-500F-4121-85DB-D486865C75B5@gmail.com>
In-Reply-To: <X8+ebiETBdNpbRXt@phouka1.phouka.net>
References:  <42AC7323-5AD6-401D-9A7D-F1D962EE5717@gmail.com> <X8+ebiETBdNpbRXt@phouka1.phouka.net>

> On 8 Dec 2020, at 16:40, John Kennedy <warlock@phouka.net> wrote:
>
> On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
>> This seems to have gotten lost in the moderation queue, but after a
>> week I am no closer to a solution, so here’s a resend:
>>
>> I’ve been trying to get a fresh world running (for the eventual
>> purpose of running amdgpu against my recent graphics adapter), but I
>> run into trouble with core loadable kernel modules, such as zfs.ko
>> from the subject. It also happens with other modules that I tried
>> randomly, for example, geom_mirror.ko.
>>
>> I updated to the latest current using svn up in /usr/src, then:
>> 	make clean
>> 	make buildworld kernel -j12
>> 	shutdown -r now
>>
>> boot to single user mode
>>
>> 	kldload zfs
>
> I'm not sure you've provided enough information for a one-shot armchair
> diagnosis, but some things seem factually wrong.  For example, my normal
> rebuild procedure is:
>
> 	cd /usr/src && make buildworld && make buildkernel
> 	make installkernel
> 	shutdown -r now
>
> 	cd /usr/src && mergemaster -pi
> 	make installworld
> 	mergemaster -Fi
> 	make -DBATCH_DELETE_OLD_FILES delete-old

Aha! So that’s how to prevent having to press ‘y’ for every deprecated
file!

> 	shutdown -r now
>
> 	cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
>
> (I'm on a desktop system here.  You haven't described your setup.)

This is also a desktop system.

> You didn't say that you've installed the new kernel, which at least starts
> you down the road towards a driver/kernel mismatch.  You presumably have a
> non-ZFS boot+root.

I’m fairly sure I did, actually.

Last time I checked, "make buildworld buildkernel" was equivalent to
"make buildworld && make buildkernel", while "make kernel" is a
shorthand for "make buildkernel && make installkernel".

So, unless I’m mistaken, "make buildworld kernel" should be equivalent
to your first two lines.

Nevertheless, I retried without these assumptions, and the result was
the same. I forgot to "make delete-old" though; I rarely remember to do
that…

>  Did you mess around with the ZFS from ports (ZoL -> ZoF)
> at some point so you're not using the kernel's ZFS drivers?  What ZFS
> entries do you have in /etc/loader.conf, /etc/rc.conf, and some of the
> variants that may get dragged in?  (see rc.conf(5) for possibilities)

Nope, stock modules here.

> At the bottom of your email, you say / is UFS and /usr is ZFS, but I guess we
> have the extra fun of wondering what is under /usr on your /?  If you have a
> pre-ZFS /usr that is populated by something now presumably very old (because
> all the new, current stuff went onto ZFS /usr, now unavailable).

There is no populated /usr directory on the UFS file-system. This install
was created on a fresh NVMe disk based on an existing install on a
spinning platter. The install happened with /usr mounted at the ZFS
file-system.

I had to copy over several files from /etc and /usr/local/etc and
re-installed the most important packages. This was admittedly a bit
messy; it is possible that I forgot to copy something over.
(Originally my intention was to dd the contents of the spinning disk
over, but apparently that disk has a few wonky sectors; dd failed after
a few device timeouts.)


I did sort-of manage to fix things, but recent kernels keep causing the
same issue:

I noticed that uname -a said I was at revision 366335, while I had the
source tree up-to-date. As a test, I reverted back to that revision and
went through:
	make buildworld
	make buildkernel

Which broke on /usr/local/sys/drm-current-kmod, which it turned out I
had installed through pkg. There have been changes to the linux_kpi
shortly after the above revision - probably what broke compatibility
between HEAD and r366335.

After removing that pkg, the kernel built and installed, world installed
fine too and I have a working system again, with kernel and world in
sync.
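
For reference, the revision comparison above (what the running kernel
reports versus what is checked out in /usr/src) can be scripted. This is
only a sketch: extract_rev is a hypothetical helper, and it assumes the
version string carries the svn revision as " rNNNNNN:" (optionally with a
trailing M for a locally modified tree). The sample string in the comment
is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: pull the svn revision out of a kernel version string.
# Assumes the revision appears as " r<digits>:" or " r<digits>M:".
# Prints nothing if no such pattern is found.
extract_rev() {
    printf '%s\n' "$1" | sed -n 's/.* r\([0-9][0-9]*\)M\{0,1\}:.*/\1/p'
}

# Example with a made-up version string:
#   extract_rev "FreeBSD 13.0-CURRENT #0 r366335: Thu Oct  1 12:00:00 CEST 2020"
# On the live system, compare the result with what "svn info /usr/src" says:
#   extract_rev "$(uname -v)"
```

If the two revisions differ right after an installkernel and reboot, the
kernel that is running is not the one that was just built.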

So I tried again to move to HEAD:

	cd /usr/src
	svn up
	make buildworld -j12
	make buildkernel -j12
	make installkernel
	shutdown -r now
	<single user mode>
	mount -u /
	zpool import -Nf system (my /usr FS)

KLD zfs.ko: depends on kernel - not available or version mismatch
linker_load_file: /boot/kernel/zfs.ko - unsupported file type


>> Which results in dmesg messages:
>>
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
> Be sure to check out /var/log/messages for extra issues.  For example, with
> the bug I mentioned below, I couldn't load my nvidia driver and that manifested
> as one driver having issues because it depended on another, which had the root
> of the problem.

I forgot to look there. If I find anything suspicious there, I’ll let
you know. That system doesn’t have a convenient mail client yet, so for
now it’s copying output to files and scp-ing that to the Mac.

>> I can load the zfs kernel module from kernel.old just fine:
>>
>> ZFS filesystem version: 5
>> ZFS storage pool version: features support (5000)
>
> I kicked my more bleeding-edge system over from 12.2-rel (r366954) up into
> 13.0-current (r367044, 1300123) on 2020/10/26.  OpenZFS kicked in 2020/8/24?
> I think the CFT was ~2018/8/21, not sure when we had the OpenZFS ports.
> Current bumps the ABI version pretty frequently so I'd think you'd have
> tripped across versioning issues a long time ago if you had some drivers not
> being rebuilt.

Having a conflict between kernel and world was what I was expecting too,
but I can’t figure out what got me into that situation. For all I know,
they should be in sync now, especially after I reverted the tree back to
rev 366335 and made world again (according to the above method).
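
One way to put a number on "in sync" is to compare the kernel's and the
userland's __FreeBSD_version, which FreeBSD's uname reports with -K and
-U respectively. A minimal sketch follows; versions_in_sync is a
hypothetical helper name. Matching values only show that kernel and
world come from sources with the same version bump, not necessarily from
the same revision.

```shell
#!/bin/sh
# Hypothetical helper: compare two __FreeBSD_version values.
# Prints a one-line verdict; returns 0 on a match and 1 otherwise.
versions_in_sync() {
    kernel_ver=$1
    world_ver=$2
    if [ "$kernel_ver" = "$world_ver" ]; then
        echo "in sync: $kernel_ver"
        return 0
    fi
    echo "mismatch: kernel=$kernel_ver world=$world_ver"
    return 1
}

# Only query the live system on FreeBSD, where uname supports -K and -U.
if [ "$(uname -s)" = "FreeBSD" ]; then
    versions_in_sync "$(uname -K)" "$(uname -U)" || true
fi
```

A mismatch here would point at kernel and world coming from different
builds; a match would suggest looking elsewhere, for example at which
kernel file was actually booted.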

>
>> This happens with any kernel module I’ve tried, such as geom_mirror and
>> amdgpu (from ports/graphics/drm-current-kmod - the latter causes a
>> kernel panic with kernel.old BTW).
>>
>> I’ve gone back as far as Oct 7 (before changes to kern/elf_load_obj.c
>> off the top of my head), looked at mailing list archives and forums
>> etc, all to no avail.
>>
>> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I had
>> /etc/malloc.conf with the recommended symlink from UPDATING, but the
>> same happens with that moved out of the way. Nothing seems to help.
>>
>> Do I need to go back further to get into a usable state or is there
>> something else I should be doing?
>
> With very few exceptions (bug 250897, 2020/11/6), I've found 13-current
> bootable since 10/26 (up through my current system, 13.0 r368388 (2020/12/6)).
> You obviously need to make sure that any extra drivers you add in are compiled
> against the kernel, but ZFS is typically one of those.

I think we covered that.

Thanks for the help and the pointers, but unfortunately the mystery
remains.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



