Date:      Tue, 13 May 2025 13:26:22 -0700
From:      Pete Wright <pete@nomadlogic.org>
To:        Colin Percival <cperciva@tarsnap.com>, freebsd-cloud@FreeBSD.org
Subject:   Re: ena(4) tx timeout messages in dmesg
Message-ID:  <8bc8c246-52bc-4fad-81d3-54f777893754@nomadlogic.org>
In-Reply-To: <01000196cb23eca0-d4b771c7-f4a9-4406-bc20-4f8b7dff09d3-000000@email.amazonses.com>
References:  <fec4cb4f-2a36-4a3d-bf02-539fd1a1273c@nomadlogic.org> <2fe4e22b-acde-4a43-9359-bd6a4e028a37@nomadlogic.org> <01000196cb23eca0-d4b771c7-f4a9-4406-bc20-4f8b7dff09d3-000000@email.amazonses.com>


On 5/13/25 12:34, Colin Percival wrote:
> On 5/13/25 10:22, Pete Wright wrote:
>> So I've found an interesting pattern, the above messages get printed
>> to /var/log/messages and the dmesg buffer when i "su" to root
>> apparently:
>>
>> May  9 19:19:23 airflow-nfs su[66523]: ec2-user to root on /dev/pts/3
>> May  9 19:19:23 airflow-nfs kernel: Found a Tx that wasn't completed 
>> on time, qid 2, index 593. 10 msecs have passed since last cleanup. 
>> Missing Tx timeout value 5000 msecs.
>> May  9 19:19:23 airflow-nfs kernel: Found a Tx that wasn't completed 
>> on time, qid 2, index 220. 1 msecs have passed since last cleanup. 
>> Missing Tx timeout value 5000 msecs.
>> [...]
>>
>> I have no idea what that means, but it certainly feels like an
>> interesting data point.  i'm ssh'ing as the ec2-user, then "su -" to
>> become root, and as you can see from the timestamps something triggers
>> those log events. i'm not seeing any other occurrences of these log
>> messages outside of su'ing, either.  this is a very vanilla system; no
>> krb auth or other network interactions should happen when i become root.
> 
> Ooh, very interesting, and points to something I had wondered about 
> earlier.
> There should be a line 'hw.broken_txfifo="1"' in /boot/loader.conf; can you
> try removing that and see if the problem goes away?  (In fact, it's a sysctl
> so you can flip it on and off without taking the system down.)
> 
> If the system reproducibly prints that warning with broken_txfifo=1 and does
> not print the warning with broken_txfifo=0, we have the culprit.  And I can
> just remove that from EC2 images; it's a workaround for an old emulation bug
> which *should* be long since fixed in all EC2 instance types.
> 

oh interesting!  cool, i've toggled that sysctl knob:
# sysctl hw.broken_txfifo=0
hw.broken_txfifo: 1 -> 0
#

i did an initial test and it looks good so far; i'll let it soak for the
rest of the day today and check in tomorrow.  thanks Colin!
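
for the soak test, a rough way to compare before and after is to count the
warning lines in the log.  this is only a sketch: count_tx_timeouts is a
made-up helper name (not a FreeBSD tool), and it assumes the warnings land in
a syslog-style file such as /var/log/messages with the text quoted above.

```shell
#!/bin/sh
# Count ena(4) "Tx not completed on time" warnings in a syslog-style file.
# count_tx_timeouts is a hypothetical helper name used for illustration only.
count_tx_timeouts() {
    # Default to /var/log/messages (an assumption; pass another path if needed).
    logfile="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps this usable under "set -e".
    grep -c "Found a Tx that wasn't completed" "$logfile" || true
}
```

run "count_tx_timeouts /var/log/messages", then "su -", then run it again: if
the count stays flat with hw.broken_txfifo=0 and climbs with
hw.broken_txfifo=1, that matches the test Colin describes.  note that the
runtime sysctl change does not survive a reboot while the line is still in
/boot/loader.conf.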
-pete


-- 
Pete Wright
pete@nomadlogic.org


