Date: Mon, 22 Dec 2025 20:23:13 +0100
From: A FreeBSD User
To: Warner Losh
Cc: FreeBSD CURRENT
Subject: Re: CURRENT: havock: elf_load_section: truncated ELF file
Message-ID: <20251222201622.0993320f@thor.sb211.local>
References: <20251220141124.1606aa7c@thor.sb211.local> <20251220233127.2ad04793@thor.sb211.local>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On the day of the Lord, Sat, 20 Dec 2025 23:11:37 -0700,
Warner Losh wrote:

> On Sat, Dec 20, 2025 at 3:31 PM A FreeBSD User wrote:
>
> > On the day of the Lord, Sat, 20 Dec 2025 08:10:59 -0700,
> > Warner Losh wrote:
> >
> > > On Sat, Dec 20, 2025 at 6:12 AM A FreeBSD User wrote:
> > >
> > > > Hello,
> > > >
> > > > recently a small server running recent CURRENT, with a UFS-based
> > > > system SSD (NVMe) and a data graveyard based on RAID level 5 with
> > > > ZFS (attached to a Fujitsu HBA controller), got corrupted because
> > > > of "losing" a drive - this time the system reported TWO drives as
> > > > removed from a RAID level 5 - which is like a death sentence.
> > > >
> > > > I guess this is fallout from the recently changed timing
> > > > parameters of the CAM infrastructure (I can't find any notes on
> > > > this in man cam, so I feel lost).
> > > >
> > >
> > > Unlikely, but you can set this in the boot loader:
> > > kern.cam.tur_timeout=60
> > > kern.cam.inquiry_timeout=60
> > > kern.cam.modesense_timeout=60
> >
> > I'll check, thanks. Are these OIDs documented somewhere, to have them
> > at hand just in case? I searched the recent cam manpage ...
> >
>
> scsi.4:
> SYSCTL VARIABLES
>      The following variables are available as both sysctl(8) variables and
>      loader(8) tunables:
>
>      kern.cam.cam_srch_hi
>          Search above LUN 7 for SCSI3 and greater devices.
>
>      kern.cam.tur_timeout
>          Timeout, in ms, for the initial TESTUNITREADY command we send to
>          the devices during their initial probing. Defaults to 1s.
>          FreeBSD 15 and earlier set this to 60s.
>
>      kern.cam.inquiry_timeout
>          Timeout, in ms, for the initial INQUIRY command we send to the
>          devices during their initial probing. Defaults to 1s. FreeBSD 15
>          and earlier set this to 60s.
>
>      kern.cam.reportluns_timeout
>          Timeout, in ms, for the initial REPORTLUNS command we send to the
>          devices during their initial probing. Defaults to 50s.
>
>      kern.cam.modesense_timeout
>          Timeout, in ms, for the initial MODESENSE command we send to the
>          devices during their initial probing. Defaults to 1s. FreeBSD 15
>          and earlier set this to 60s.

Oh, I see, thank you for the hint.

oh

>
>
> > >
> > > and see if that works. You should see new errors on boot if this is
> > > the issue. Can you share a dmesg?
> > >
> > > I kinda doubt they'd cause the issues that you've had. If disks are
> > > gone, then there'd be different errors to what you are seeing, I'd
> > > think.
> > >
> > > To recover, your best bet is to use a USB stick from one of the
> > > releases or snapshots.
> >
> > In earlier times, when "make installkernel" and/or "make installworld"
> > crashed midair, some binaries in the installed tree were corrupted.
> > Since I run CURRENT, which moves at a tough pace at the moment, the
> > USB image I boot should be close to the CURRENT made via "make world"
> > ... I assume. I did so and had some problems with the new pkg concept
> > ... (working offline is a problem with the install-blob.txz ...)
> >
>
> Yuck. Sorry that was a source of trouble for you.
>
>
> > >
> > > Warner
> > >
> > >
> > > > A very disastrous side effect of this crash was the inability to
> > > > reboot the box (CURRENT pre-
> > > > 16.0-CURRENT #11 master-n282659-7f39d05b67ae: Sat Dec 20 09:35:32
> > > > CET 2025 amd64; the runtime system was from the 16th or 17th of
> > > > December).
> > > > After several tens of minutes I had to hard-reboot the box - with
> > > > obvious data loss on the system SSD. And here my problems start to
> > > > turn into a mess.
> > > >
> > > > After the first reboot I performed a fsck -fy, rebooted and
> > > > witnessed that jails didn't come up anymore and sshd didn't work.
> > > > So I installed the CURRENT that had already been compiled from
> > > > /usr/src prior to the crash, which is
> > > > "master-n282659-7f39d05b67ae" (the same as on the sibling box,
> > > > which is running great by the way, but has a different CPU and a
> > > > smaller RAID, though also a UFS-based system SSD and the same
> > > > HBA). So CURRENT seems to operate in general on similar hardware.
> > > >
> > > > After the second reboot with the old kernel the box in question
> > > > went into the debugger. Rebooting in single user mode and
> > > > performing fsck -fy revealed a lot of repairs on the first
> > > > partitions, /, /var, /usr. After a reboot I realized that most
> > > > services are now broken - jails do not start, sshd doesn't start,
> > > > and the whole system goes into multiuser but seems to have
> > > > serious problems.
> > > >
> > > > uname -a remains empty
> > > > cd /usr/src; make buildworld returns immediately with no output,
> > > > no further action
> > > > service ldconfig start also returns completely empty on console
> > > >
> > > > Several onboard/base tools simply return nothing.
> > > >
> > > > Trying "/rescue/sh" (install date indicates 20th of December, so
> > > > it is the latest) or even /rescue/vi (to edit the loader config
> > > > to change to boot.old) seems to give me the first indication that
> > > > something has gone terribly wrong:
> > > >
> > > > elf_load_section: truncated ELF file
> > > > Abort trap
> > > >
> > > > Checking /boot/kernel, /lib, /usr/lib, /bin and /sbin, they seem
> > > > to be intact (as far as I can check, all timestamps are 20th Dec
> > > > 2025, 9:48 UTC).
> > > >
> > > > Well, since this is not the first time I have run into problems
> > > > using CURRENT, the outage due to two lost ZFS drives after the
> > > > recent changes seems worth a note here.
> > > >
> > >
> > > Can you provide error messages at boot for this? You talk about
> > > fsck and about ZFS, so I'm a little confused as to your setup.
> >
> > No need to be confused: CURRENT crashed/froze after two of five HDDs
> > were reported as "removed" from a RAIDZ pool. The box hung forever.
> >
> > The OS resides on an SSD with UFS. After more than 30 min I had to
> > switch the box off and on physically.
> > So the UFS filesystem took a hit (journalling didn't fix it). ZFS
> > "healed" after the reboot and a check of the HDDs. The UFS SSD
> > didn't ...
> >
> >
> > I have spent a while now bringing everything back. The boot device is
> > now ZFS, too, and therefore obviously slower but somehow safe.
> >
> > The only issue I have now is a crash during reboot. While rebooting
> > and killing jails, the box drops into the kernel debugger ...
> >
> > Somehow I need to copy the picture I took off the box, since the
> > machine isn't connected to the net at the moment ...
> >
>
>
> > >
> > > Warner
> > >
> > >
> > > > The other question would be how to fix this: one strategy would
> > > > be to boot from an official image on a flash drive and try to
> > > > perform a "make installkernel installworld". Maybe there is
> > > > another way suggested by what I described above ...
> > > >
> > >
> > >
> > > > Thanks in advance,
> > > >
> > > > oh
> > > >
> > > >
> > > > --
> > > >
> > > > A FreeBSD user
> > > >
> >
> >
> > --
> > A FreeBSD user
> >

-- 
A FreeBSD user
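
P.S. For anyone who wants to look for binaries hit by "elf_load_section: truncated ELF file": as far as I understand, that message appears when a loadable segment claims more file bytes than the file actually holds. Below is a minimal sketch of such a check, not a verified tool. The function name find_truncated_loads is made up for illustration, and it assumes 64-bit little-endian ELF files (as on amd64) and looks only at PT_LOAD program headers.

```python
import struct
import sys

PT_LOAD = 1          # program header type for loadable segments
EHDR_SIZE = 64       # size of an ELF64 file header
PHDR_SIZE = 56       # size of an ELF64 program header


def find_truncated_loads(image: bytes):
    """Return (p_offset, p_filesz) for every PT_LOAD segment running past EOF.

    Hypothetical helper: handles 64-bit little-endian ELF only and raises
    ValueError for anything else or for a cut-off header.
    """
    if len(image) < EHDR_SIZE or image[:4] != b"\x7fELF":
        raise ValueError("not a complete ELF header")
    if image[4] != 2 or image[5] != 1:  # ELFCLASS64, ELFDATA2LSB
        raise ValueError("only 64-bit little-endian ELF is handled")

    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)

    bad = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if off + PHDR_SIZE > len(image):
            raise ValueError("program header table is cut off")
        (p_type,) = struct.unpack_from("<I", image, off)
        (p_offset,) = struct.unpack_from("<Q", image, off + 8)
        (p_filesz,) = struct.unpack_from("<Q", image, off + 32)
        # A segment is suspect when its file-backed bytes end beyond EOF.
        if p_type == PT_LOAD and p_filesz and p_offset + p_filesz > len(image):
            bad.append((p_offset, p_filesz))
    return bad


if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            data = f.read()
        try:
            status = "TRUNCATED" if find_truncated_loads(data) else "ok"
        except ValueError as exc:
            status = f"skipped ({exc})"
        print(f"{path}: {status}")
```

Saved under any name and run from a rescue environment that has Python, it takes the files to check as arguments, for example everything under /rescue or /bin. Timestamps alone say nothing about whether the file contents survived the fsck repairs, which is why a check of this kind seems worthwhile.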