Date: Tue, 13 May 2025 13:26:22 -0700
From: Pete Wright <pete@nomadlogic.org>
To: Colin Percival <cperciva@tarsnap.com>, freebsd-cloud@FreeBSD.org
Subject: Re: ena(4) tx timeout messages in dmesg
Message-ID: <8bc8c246-52bc-4fad-81d3-54f777893754@nomadlogic.org>
In-Reply-To: <01000196cb23eca0-d4b771c7-f4a9-4406-bc20-4f8b7dff09d3-000000@email.amazonses.com>
References: <fec4cb4f-2a36-4a3d-bf02-539fd1a1273c@nomadlogic.org> <2fe4e22b-acde-4a43-9359-bd6a4e028a37@nomadlogic.org> <01000196cb23eca0-d4b771c7-f4a9-4406-bc20-4f8b7dff09d3-000000@email.amazonses.com>

On 5/13/25 12:34, Colin Percival wrote:
> On 5/13/25 10:22, Pete Wright wrote:
>> So I've found an interesting pattern, the above messages get printed
>> to /var/log/messages and the dmesg buffer when i "su" to root
>> apparently:
>>
>> May 9 19:19:23 airflow-nfs su[66523]: ec2-user to root on /dev/pts/3
>> May 9 19:19:23 airflow-nfs kernel: Found a Tx that wasn't completed
>> on time, qid 2, index 593. 10 msecs have passed since last cleanup.
>> Missing Tx timeout value 5000 msecs.
>> May 9 19:19:23 airflow-nfs kernel: Found a Tx that wasn't completed
>> on time, qid 2, index 220. 1 msecs have passed since last cleanup.
>> Missing Tx timeout value 5000 msecs.
>> [...]
>>
>> I have no idea what that means, but certainly feels like an
>> interesting data-point. i'm ssh'ing as the ec2-user, then "su -" to
>> become root and as you can see from the timestamps something triggers
>> those log events. i'm not seeing any other occurrences of these log
>> messages outside of su'ing too. this is a very vanilla system, no
>> krb auth or other network interactions should happen when i become root.
>
> Ooh, very interesting, and points to something I had wondered about
> earlier.
> There should be a line 'hw.broken_txfifo="1"' in /boot/loader.conf; can you
> try removing that and see if the problem goes away? (In fact, it's a
> sysctl so you can flip it on and off without taking the system down.)
>
> If the system reproducibly prints that warning with broken_txfifo=1 and
> does not print the warning with broken_txfifo=0, we have the culprit.
> And I can just remove that from EC2 images; it's a workaround for an old
> emulation bug which *should* be long since fixed in all EC2 instance
> types.
>

oh interesting! cool i've toggled that sysctl knob:

# sysctl hw.broken_txfifo=0
hw.broken_txfifo: 1 -> 0
#

i did an initial test and it looks good so far, i'll let it soak for the
rest of the day today and check-in tomorrow.

thanks Colin!

-pete

-- 
Pete Wright
pete@nomadlogic.org
