Date: Tue, 8 Dec 2020 19:18:54 +0100
From: Michael Gmelin
To: freebsd-current@freebsd.org
Subject: Re: KLD zfs.ko: depends on kernel - not available or version mismatch
Message-ID: <20201208191854.33fbb929@bsd64.grem.de>
In-Reply-To: <2B044A92-500F-4121-85DB-D486865C75B5@gmail.com>
References: <42AC7323-5AD6-401D-9A7D-F1D962EE5717@gmail.com> <2B044A92-500F-4121-85DB-D486865C75B5@gmail.com>

On Tue, 8 Dec 2020 19:10:26 +0100
Alban Hertroys wrote:

> > On 8 Dec 2020, at 16:40, John Kennedy wrote:
> >
> > On Tue, Dec 08,
> > 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
> >> This seems to have gotten lost in the moderation queue, but after a
> >> week I am no closer to a solution, so here’s a resend:
> >>
> >> I’ve been trying to get a fresh world running (for the eventual
> >> purpose of running amdgpu against my recent graphics adapter), but
> >> I run into trouble with core loadable kernel modules, such as
> >> zfs.ko from the subject. It also happens with other modules that I
> >> tried randomly, for example, geom_mirror.ko.
> >>
> >> I updated to the latest current using svn up in /usr/src, then:
> >> make clean
> >> make buildworld kernel -j12
> >> shutdown -r now
> >>
> >> boot to single user mode
> >>
> >> kldload zfs
> >
> > I'm not sure you've provided enough information for a one-shot
> > armchair diagnosis, but some things seem factually wrong. For
> > example, my normal rebuild procedure is:
> >
> > cd /usr/src && make buildworld && make buildkernel
> > make installkernel
> > shutdown -r now
> >
> > cd /usr/src && mergemaster -pi
> > make installworld
> > mergemaster -Fi
> > make -DBATCH_DELETE_OLD_FILES delete-old
>
> Aha! So that’s how to prevent having to press ‘y’ for every
> deprecated file!
>
> > shutdown -r now
> >
> > cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
> >
> > (I'm on a desktop system here. You haven't described your setup.)
>
> This is also a desktop system.
>
> > You didn't say that you've installed the new kernel, which at least
> > starts you down the road towards a driver/kernel mismatch. You
> > presumably have a non-ZFS boot+root.
>
> I’m fairly sure I did, actually.
>
> Last time I checked, “make buildworld buildkernel” was equivalent to
> “make buildworld && make buildkernel”, while “make kernel” is a
> shorthand for “make buildkernel && make installkernel”.
>
> So, unless I’m mistaken, “make buildworld kernel” should be
> equivalent to your first two lines.
>
> Nevertheless, I retried without these assumptions; the result was the
> same. I forgot to “make delete-old” though, I rarely remember to do
> that…
>
> > Did you mess around with the ZFS from ports (ZoL -> ZoF)
> > at some point so you're not using the kernel's ZFS drivers? What
> > ZFS entries do you have in /etc/loader.conf, /etc/rc.conf, and some
> > of the variants that may get dragged in? (see rc.conf(5) for
> > possibilities)
>
> Nope, stock modules here.
>
> > At the bottom of your email, you say / is UFS and /usr is ZFS, but
> > I guess we have the extra fun of wondering what is under /usr on
> > your /? If you have a pre-ZFS /usr that is populated by something
> > now presumably very old (because all the new, current stuff went
> > onto ZFS /usr, now unavailable).
>
> There is no populated directory /usr on the UFS file-system. This
> install was created on a fresh NVMe disk based on an existing install
> on a spinning platter. The install happened with /usr mounted at the
> ZFS file-system.
>
> I had to copy over several files from /etc and /usr/local/etc and
> re-installed the most important packages. This was admittedly a bit
> messy, it is possible that I forgot to copy something over.
> (Originally my intention was to dd the contents of the spinning disk
> over, but apparently that disk has a few wonky sectors, dd failed
> after a few device timeouts)
>
>
> I did sort-of manage to fix things, but recent kernels keep causing
> the same issue:
>
> I noticed that uname -a said I was at revision 366335, while I had
> the source tree up-to-date.
> For a test, I reverted back to that revision and went through:
>
>   make buildworld
>   make buildkernel
>
> Which broke on /usr/local/sys/drm-current-kmod, which it turned out I
> had installed through pkg. There have been changes to the linux_kpi
> shortly after the above revision - probably what broke compatibility
> between HEAD and r366335.
>
> After removing that pkg, the kernel built and installed, world
> installed fine too and I have a working system again, with kernel and
> world in sync.
>
> So I tried again to move to HEAD:
>
> cd /usr/src
> svn up
> make buildworld -j12
> make buildkernel -j12
> make installkernel
> shutdown -r now
>
> mount -u /
> zpool import -Nf system (my /usr FS)
>
> KLD zfs.ko: depends on kernel - not available or version mismatch
> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
>
> >> Which results in dmesg messages:
> >>
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >> KLD zfs.ko: depends on kernel - not available or version mismatch
> >> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
> >
> > Be sure to check out /var/log/messages for extra issues. For
> > example, with the bug I mentioned below, I couldn't load my nvidia
> > driver and that manifested as one driver having issues because it
> > depended on another, which had the root of the problem.
>
> I forgot to look there. If I find anything suspicious there, I’ll let
> you know. That system doesn’t have a convenient mail client yet, so
> for now it’s copying output to files and scp-ing that to the Mac.
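
The "depends on kernel ... version mismatch" message typically means the
module was built for a different kernel version than the one that is
actually running, so it may be worth comparing the two numbers directly
rather than trusting the build log. A minimal sketch, assuming the
sources are checked out in /usr/src; the helper names src_osreldate and
compare_osreldate are made up for illustration:

```sh
#!/bin/sh
# Hypothetical helper: print the __FreeBSD_version defined in a param.h.
# The default path assumes the source tree lives in /usr/src.
src_osreldate() {
    awk '$1 == "#define" && $2 == "__FreeBSD_version" { print $3; exit }' \
        "${1:-/usr/src/sys/sys/param.h}"
}

# Hypothetical helper: compare two version numbers and report the result.
# $1 = version of the running kernel, $2 = version of the source tree.
compare_osreldate() {
    if [ "$1" -eq "$2" ]; then
        echo "match: $1"
    else
        echo "MISMATCH: running kernel $1, source tree $2"
    fi
}
```

On the affected box this could be run as

    compare_osreldate "$(uname -K)" "$(src_osreldate)"

where uname -K prints the running kernel's __FreeBSD_version. A mismatch
right after installkernel and a reboot would hint that the kernel being
booted is not the one that was just installed, though that is only one
possible explanation.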
>
> >> I can load the zfs kernel module from kernel.old just fine:
> >>
> >> ZFS filesystem version: 5
> >> ZFS storage pool version: features support (5000)
> >
> > I kicked my more bleeding-edge system over from 12.2-rel (r366954)
> > up into 13.0-current (r367044, 1300123) on 2020/10/26. OpenZFS
> > kicked in 2020/8/24? I think the CFT was ~2018/8/21, not sure when
> > we had the OpenZFS ports. Current bumps the ABI version pretty
> > frequently so I'd think you'd have tripped across versioning issues
> > a long time ago if you had some drivers not being rebuilt.
>
> Having a conflict between kernel and world was what I was expecting
> too, but I can’t figure out what got me into that situation. For all
> I know, they should be in sync now, especially after I reverted the
> tree back to rev 366335 and made world again (acc. to above method).
>
> >
> >> This happens with any kernel module I’ve tried, such as
> >> geom_mirror and amdgpu (from ports/graphics/drm-current-kmod - the
> >> latter causes a kernel panic with kernel.old BTW).
> >>
> >> I’ve gone back as far as Oct 7 (before changes to
> >> kern/elf_load_obj.c off the top of my head), looked at mailing
> >> list archives and forums etc, all to no avail.
> >>
> >> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I
> >> had /etc/malloc.conf with the recommended symlink from UPDATING,
> >> but the same happens with that moved out of the way. Nothing seems
> >> to help.
> >>
> >> Do I need to go back further to get into a usable state or is
> >> there something else I should be doing?
> >
> > With very few exceptions (bug 250897, 2020/11/6), I've found
> > 13-current bootable since 10/26 (up through my current system, 13.0
> > r368388 (2020/12/6)). You obviously need to make sure that any extra
> > drivers you add in are compiled against the kernel, but ZFS is
> > typically one of those.
>
> I think we covered that.
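
One way to check that rather than assume it is to list any module files
that are older than the kernel they are supposed to match. A minimal
sketch; the function name stale_modules is invented here, and file
timestamps are only a rough proxy for "built against this kernel", so
treat a hit as a hint and not as proof:

```sh
#!/bin/sh
# Hypothetical helper: list *.ko files in a module directory whose
# modification time is not newer than that of the given kernel file.
# Usage: stale_modules KERNEL_FILE MODULE_DIR
stale_modules() {
    kernel=$1
    moddir=$2
    [ -f "$kernel" ] || { echo "no kernel file: $kernel" >&2; return 1; }
    [ -d "$moddir" ] || return 0    # no such directory, nothing to list
    find "$moddir" -type f -name '*.ko' ! -newer "$kernel" -print
}
```

Assuming the usual layout, this would be called as

    stale_modules /boot/kernel/kernel /boot/modules

since /boot/modules is where packages such as drm-current-kmod install
their modules. Anything it prints predates the installed kernel and is a
candidate for a rebuild.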
>
> Thanks for the help and the pointers, but unfortunately the mystery
> remains.
>

Do you have anything in /boot/modules? (wild shot)

-m

-- 
Michael Gmelin