From owner-freebsd-current@freebsd.org Tue Dec 8 18:10:32 2020
Subject: Re: KLD zfs.ko: depends on kernel - not available or version mismatch
From: Alban Hertroys
Date: Tue, 8 Dec 2020 19:10:26 +0100
To: John Kennedy
Cc: freebsd-current@freebsd.org
Message-Id: <2B044A92-500F-4121-85DB-D486865C75B5@gmail.com>
References: <42AC7323-5AD6-401D-9A7D-F1D962EE5717@gmail.com>
List-Id: Discussions about the use of FreeBSD-current

> On 8 Dec 2020, at 16:40, John Kennedy wrote:
>
> On Tue, Dec 08, 2020 at 08:56:25AM +0100, Alban Hertroys wrote:
>> This seems to have gotten lost in the moderation queue, but after a week I am no closer to a solution, so here’s a resend:
>>
>> I’ve been trying to get a fresh world running (for the eventual purpose of running amdgpu against my recent graphics adapter), but I run into trouble with core loadable kernel modules, such as zfs.ko from the subject. It also happens with other modules that I tried randomly, for example, geom_mirror.ko.
>>
>> I updated to the latest current using svn up in /usr/src, then:
>>     make clean
>>     make buildworld kernel -j12
>>     shutdown -r now
>>
>> boot to single user mode
>>
>>     kldload zfs
>
> I'm not sure you've provided enough information for a one-shot armchair
> diagnosis, but some things seem factually wrong.  For example, my normal
> rebuild procedure is:
>
>     cd /usr/src && make buildworld && make buildkernel
>     make installkernel
>     shutdown -r now
>
>     cd /usr/src && mergemaster -pi
>     make installworld
>     mergemaster -Fi
>     make -DBATCH_DELETE_OLD_FILES delete-old

Aha! So that’s how to prevent having to press ‘y’ for every deprecated file!

>     shutdown -r now
>
>     cd /usr/src && make -DBATCH_DELETE_OLD_FILES delete-old-libs
>
> (I'm on a desktop system here.  You haven't described your setup.)

This is also a desktop system.

> You didn't say that you've installed the new kernel, which at least starts
> you down the road towards a driver/kernel mismatch.  You presumably have a
> non-ZFS boot+root.

I’m fairly sure I did, actually.
Last time I checked, "make buildworld buildkernel" was equivalent to "make buildworld && make buildkernel", while "make kernel" is shorthand for "make buildkernel && make installkernel".

So, unless I’m mistaken, "make buildworld kernel" should be equivalent to your first two lines.

Nevertheless, I retried without these assumptions; the result was the same. I forgot to "make delete-old" though, I rarely remember to do that…

> Did you mess around with the ZFS from ports (ZoL -> ZoF)
> at some point so you're not using the kernel's ZFS drivers?  What ZFS
> entries do you have in /etc/loader.conf, /etc/rc.conf, and some of the
> variants that may get dragged in? (see rc.conf(5) for possibilities)

Nope, stock modules here.

> At the bottom of your email, you say / is UFS and /usr is ZFS, but I guess we
> have the extra fun of wondering what is under /usr on your /?  If you have a
> pre-ZFS /usr that is populated by something now presumably very old (because
> all the new, current stuff went onto ZFS /usr, now unavailable).

There is no populated /usr directory on the UFS file system. This install was created on a fresh NVMe disk based on an existing install on a spinning platter. The install happened with /usr mounted at the ZFS file system.

I had to copy over several files from /etc and /usr/local/etc and re-installed the most important packages. This was admittedly a bit messy; it is possible that I forgot to copy something over.
(Originally my intention was to dd the contents of the spinning disk over, but apparently that disk has a few wonky sectors; dd failed after a few device timeouts.)

I did sort-of manage to fix things, but recent kernels keep causing the same issue:

I noticed that uname -a said I was at revision 366335, while I had the source tree up-to-date.
For a test, I reverted back to that revision and went through:
    make buildworld
    make buildkernel

Which broke on /usr/local/sys/drm-current-kmod, which, it turned out, I had installed through pkg. There have been changes to the linux_kpi shortly after the above revision - probably what broke compatibility between HEAD and r366335.

After removing that pkg, the kernel built and installed, world installed fine too and I have a working system again, with kernel and world in sync.

So I tried again to move to HEAD:
    cd /usr/src
    svn up
    make buildworld -j12
    make buildkernel -j12
    make installkernel
    shutdown -r now

    mount -u /
    zpool import -Nf system (my /usr FS)

KLD zfs.ko: depends on kernel - not available or version mismatch
linker_load_file: /boot/kernel/zfs.ko - unsupported file type

>> Which results in dmesg messages:
>>
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>> KLD zfs.ko: depends on kernel - not available or version mismatch
>> linker_load_file: /boot/kernel/zfs.ko - unsupported file type
>
> Be sure to check out /var/log/messages for extra issues.  For example, with
> the bug I mentioned below, I couldn't load my nvidia driver and that manifested
> as one driver having issues because it depended on another, which had the root
> of the problem.

I forgot to look there. If I find anything suspicious there, I’ll let you know. That system doesn’t have a convenient mail client yet, so for now it’s copying output to files and scp-ing that to the Mac.
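
The /var/log/messages check suggested above could be scripted roughly as follows. This is only a minimal sketch: the function name kld_mismatches is hypothetical, and it assumes the log lines carry the same "KLD ...: depends on ... - not available or version mismatch" wording as the dmesg output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): list each distinct module that
# failed to load because a dependency was missing or had the wrong version.
# Assumes the message wording matches the dmesg lines quoted in this thread.
# Usage: kld_mismatches /var/log/messages
kld_mismatches() {
    # Keep only the KLD dependency lines, reduce each one to
    # "<module> <dependency>", and collapse the repeats.
    sed -n 's/.*KLD \([^:]*\): depends on \([^ ]*\) - not available or version mismatch.*/\1 \2/p' "$1" | sort -u
}
```

If every line it prints names "kernel" as the dependency, that is consistent with the modules in /boot/kernel having been built for a different kernel than the one currently running, rather than with one module being broken on its own.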

>> I can load the zfs kernel module from kernel.old just fine:
>>
>> ZFS filesystem version: 5
>> ZFS storage pool version: features support (5000)
>
> I kicked my more bleeding-edge system over from 12.2-rel (r366954) up into
> 13.0-current (r367044, 1300123) on 2020/10/26.  OpenZFS kicked in 2020/8/24?
> I think the CFT was ~2018/8/21, not sure when we had the OpenZFS ports.
> Current bumps the ABI version pretty frequently so I'd think you'd have
> tripped across versioning issues a long time ago if you had some drivers not
> being rebuilt.

Having a conflict between kernel and world was what I was expecting too, but I can’t figure out what got me into that situation. For all I know, they should be in sync now, especially after I reverted the tree back to rev 366335 and made world again (according to the above method).

>
>> This happens with any kernel module I’ve tried, such as geom_mirror and amdgpu (from ports/graphics/drm-current-kmod - the latter causes a kernel panic with kernel.old BTW).
>>
>> I’ve gone back as far as Oct 7 (before changes to kern/elf_load_obj.c off the top of my head), looked at mailing list archives and forums etc, all to no avail.
>>
>> I have / on UFS+J and /usr on ZFS and nothing in /etc/src.conf. I had /etc/malloc.conf with the recommended symlink from UPDATING, but the same happens with that moved out of the way. Nothing seems to help.
>>
>> Do I need to go back further to get into a usable state or is there something else I should be doing?
>
> With very few exceptions (bug 250897, 2020/11/6), I've found 13-current
> bootable since 10/26 (up through my current system, 13.0 r368388 (2020/12/6)).
> You obviously need to make sure that any extra drivers you add in are compiled
> against the kernel, but ZFS is typically one of those.

I think we covered that.

Thanks for the help and the pointers, but unfortunately the mystery remains.
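
The observation that uname reported r366335 while the source tree was newer can be turned into a quick post-reboot check. The following is a sketch only: both function names are hypothetical, and it assumes the kernel version string carries the svn revision as a token such as "r366335:", which may not hold for every build (a modified tree or a non-svn checkout would need a different pattern).

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread).
# Assumption: the kernel version string contains the svn revision as a
# separate token like "r366335:" (the exact format may differ per build).

# Print the revision number found in the given version string, or nothing.
rev_from_version() {
    printf '%s\n' "$1" | sed -n 's/.*[ (]r\([0-9][0-9]*\)[: )].*/\1/p'
}

# Succeed only when the version string carries exactly the expected revision.
# $1: kernel version string, $2: expected revision number
kernel_matches_rev() {
    running=$(rev_from_version "$1")
    [ -n "$running" ] && [ "$running" = "$2" ]
}

# Example on the affected machine (not run here):
#   kernel_matches_rev "$(uname -v)" "$(svn info --show-item revision /usr/src)" \
#       || echo "running kernel was not built from the checked-out tree"
```

A failed match right after "make installkernel" and a reboot would suggest that the booted kernel is not the freshly installed one, which is one way for modules in /boot/kernel to end up out of step with the running kernel.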

Alban Hertroys
-- 
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.