Date:      Sun, 31 Mar 2024 16:31:43 +0200
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        Alexander Leidinger <Alexander@leidinger.net>
Cc:        Mark Johnston <markj@freebsd.org>, Current <current@freebsd.org>, bnovkov@freebsd.org
Subject:   Re: Multiple issues with current (kldload failures, missing CTF stuff, pty issues, ...)
Message-ID:  <888637ab03455a459342ba611c09b627@Leidinger.net>
In-Reply-To: <b3f14ca5c6a771b6664639c5d10f14dd@Leidinger.net>
References:  <09ef22679b76cb2dbeace8e78bf9f80e@Leidinger.net> <Zgb2w-1W7oRzsFEX@nuc> <b3f14ca5c6a771b6664639c5d10f14dd@Leidinger.net>

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)

--=_c5498fd2cfe61cab96a94b8b1455a9d5
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII;
 format=flowed

On 2024-03-29 18:21, Alexander Leidinger wrote:
> On 2024-03-29 18:13, Mark Johnston wrote:
>> On Fri, Mar 29, 2024 at 04:52:55PM +0100, Alexander Leidinger wrote:
>>> Hi,
>>> 
>>> sources from 2024-03-11 work. Sources from 2024-03-25 and today don't
>>> work (see below for the issue). As the monthly stabilisation pass
>>> didn't find obvious issues, it is something related to my setup:
>>>  - not a generic kernel
>>>  - very modular kernel (as much as possible as a module)
>>>  - bind_now (a build without fails too, tested with clean /usr/obj)
>>>  - ccache (a build without fails too, tested with clean /usr/obj)
>>>  - kernel retpoline (build without in progress)
>>>  - userland retpoline (build without in progress)
>>>  - kernel build with WITH_CTF / DDB_CTF (next one to test if it isn't
>>> retpoline)
>>>  - -fno-builtin
>>>  - CPUFLAGS=native (except for stuff in /usr/src/sys/boot)
>>>  - malloc production
>>>  - COPTFLAGS= -O2 -pipe
>>> 
>>> The issue is that kernel modules load OK from the loader, but once it
>>> starts init, any module fails to load (e.g. via autodetection of
>>> hardware or rc.conf kld_list) with the message that the kernel and
>>> module versions are out of sync, and the module refuses to load.
>> 
>> What is the exact revision you're running?  There were some unrelated
>> changes to the kernel linker around the same time.
> 
> The working src is from 2024-03-11-094351 (GMT+0100).
> The failing src was fetched after Gleb's stabilization week message (and
> today's src, from before the sound stuff, still fails).
> 
> Retpoline wasn't the cause, next test is the CTF stuff in the kernel...

A rather obscure problem was causing this. The "last" BE had canmount
set to "on" instead of "noauto". I have no idea how this happened, but
as a result "zfs mount -a" mounted the "last" BE on top of the current
BE. This means that every module loaded after the zfs rc script had run
came from the old BE, so the error message about the kernel version
mismatch was correct. I found the issue while bisecting the tree:
suddenly the error message went away, but a new issue of missing dev
entries popped up (/dev was mounted correctly on the booting dataset,
but the last BE was mounted on top of it and /dev went empty...).
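
A minimal sketch of how such a state could be detected, assuming the
input is the tab-separated output of
"zfs list -H -o name,canmount -r rpool/ROOT". The function name and the
sample dataset names (rpool/ROOT/old, rpool/ROOT/new) are hypothetical
and only serve as illustration:

```python
def extra_automount_bes(zfs_list_output, booted_be):
    """Return datasets other than booted_be whose canmount is "on".

    Any dataset returned here would be mounted by "zfs mount -a" and
    could end up on top of the running BE.
    """
    offenders = []
    for line in zfs_list_output.splitlines():
        if not line.strip():
            continue
        # "zfs list -H" separates columns with a single tab.
        name, canmount = line.split("\t")
        if name != booted_be and canmount == "on":
            offenders.append(name)
    return offenders


# Hypothetical sample, not taken from the affected system.
SAMPLE = (
    "rpool/ROOT\toff\n"
    "rpool/ROOT/old\ton\n"
    "rpool/ROOT/new\ton\n"
)

if __name__ == "__main__":
    print(extra_automount_bes(SAMPLE, "rpool/ROOT/new"))
```

An empty result would mean that only the booted BE is eligible for
automatic mounting; anything else is a candidate for
"zfs set canmount=noauto".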

It looks to me like bectl was doing this (from "zpool history")...
2024-03-11.14:16:31 zpool set bootfs=rpool/ROOT/2024-03-11-094351 rpool
2024-03-11.14:16:31 zfs set canmount=noauto rpool/ROOT/2024-01-18-092730
2024-03-11.14:16:31 zfs set canmount=noauto rpool/ROOT/2024-02-10-144617
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-11-212006
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-16-082836
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-24-140211
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-24-140211_ok
2024-03-11.14:16:33 zfs set canmount=on rpool/ROOT/2024-03-11-094351
2024-03-11.14:16:33 zfs promote rpool/ROOT/2024-03-11-094351
2024-03-11.14:17:03 zfs destroy -r rpool/ROOT/2024-02-24-140211_ok

I surely didn't do the "zfs set canmount=..." for those by hand.
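
The effect of the history excerpt above can be replayed with a small
sketch like the following. It is an illustration only: the regular
expressions and function name are assumptions, it understands only
"zfs set canmount=..." and "zfs destroy" lines, and it treats
"destroy -r" as removing just the named dataset.

```python
import re

# Matches e.g. "2024-03-11.14:16:33 zfs set canmount=on rpool/ROOT/x"
SET_CANMOUNT = re.compile(r"^\S+ zfs set canmount=(\S+) (\S+)$")
# Matches e.g. "2024-03-11.14:17:03 zfs destroy -r rpool/ROOT/x"
DESTROY = re.compile(r"^\S+ zfs destroy (?:-\S+ )*(\S+)$")

HISTORY = """\
2024-03-11.14:16:31 zpool set bootfs=rpool/ROOT/2024-03-11-094351 rpool
2024-03-11.14:16:31 zfs set canmount=noauto rpool/ROOT/2024-01-18-092730
2024-03-11.14:16:31 zfs set canmount=noauto rpool/ROOT/2024-02-10-144617
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-11-212006
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-16-082836
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-24-140211
2024-03-11.14:16:32 zfs set canmount=noauto rpool/ROOT/2024-02-24-140211_ok
2024-03-11.14:16:33 zfs set canmount=on rpool/ROOT/2024-03-11-094351
2024-03-11.14:16:33 zfs promote rpool/ROOT/2024-03-11-094351
2024-03-11.14:17:03 zfs destroy -r rpool/ROOT/2024-02-24-140211_ok
"""


def replay_canmount(history_text):
    """Return {dataset: canmount} after applying the history lines in order."""
    state = {}
    for line in history_text.splitlines():
        line = line.strip()
        match = SET_CANMOUNT.match(line)
        if match:
            value, dataset = match.groups()
            state[dataset] = value
            continue
        match = DESTROY.match(line)
        if match:
            # Simplification: forget only the named dataset.
            state.pop(match.group(1), None)
    return state


if __name__ == "__main__":
    final = replay_canmount(HISTORY)
    print(sorted(name for name, value in final.items() if value == "on"))
```

For this excerpt, the replay leaves rpool/ROOT/2024-03-11-094351 as the
only dataset with canmount=on. The excerpt alone does not show whether
that value was ever reset to "noauto" when a later BE was activated.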

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netchild@FreeBSD.org  : PGP 0x8F31830F9F2772BF

--=_c5498fd2cfe61cab96a94b8b1455a9d5--


