Date:      Sun, 29 Nov 2020 15:20:15 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        David Wolfskill <david@catwhisker.org>, Jonathan Looney <jonlooney@gmail.com>, current@freebsd.org
Subject:   Re: Laptop exhibits erratic responsiveness
Message-ID:  <CAGudoHHevDxuvwLO2QMrMNfvG0vWSU=v9gHLbH0eMFDt8p0FWg@mail.gmail.com>
In-Reply-To: <X8OsLaVToS1V1zoX@albert.catwhisker.org>
References:  <X8JVn/PIPuLzsWuQ@albert.catwhisker.org> <CADrOrms8fmf9asttPw%2B4B%2BKL7A4svAKx0dDHSuDnEVvxGWiX0Q@mail.gmail.com> <X8OsLaVToS1V1zoX@albert.catwhisker.org>

On 11/29/20, David Wolfskill <david@catwhisker.org> wrote:
> On Sat, Nov 28, 2020 at 10:47:57AM -0500, Jonathan Looney wrote:
>> FWIW, I would try running lockstat on the box. (My supposition is that
>> the delay is due to a lock. That could be incorrect. Lockstat may
>> provide some clue as to whether this is a line of inquiry worth
>> pursuing.)
>> ....
>
> Thanks (again), Jonathan.
>
> So... I did that (during this morning's daily upgrade cycle); the
> results may be "of interest" to some.
>
> I have placed copies of the typescripts in:
>
> http://www.catwhisker.org/~david/FreeBSD/head/lockstat/
>
> I also scribbled a "README" in that same directory (though it doesn't
> seem to show up in the listing); it may be accessed via
>
> http://www.catwhisker.org/~david/FreeBSD/head/lockstat/README
>
> My prior message in this thread showed what I saw during a "ping albert"
> from the laptop while it was running head -- most RTTs were around 0.600
> ms, but some were notably longer, with at least one that was over 68
> seconds.
>
> So I did a "lockstat ping -c 64 albert" while the laptop was running
> stable/12@r368123 (as a reference point); it is probably boring. :-}
>
> Then (this morning), I tried a simple "lockstat sleep 600" on the laptop
> while it was running head@r368119 (and building head@r368143); we see
> the "lockstat" output in the "lockstat_head" file.
>
> It then occurred to me that trying a "lockstat ping albert" might be
> useful, so I fired up "lockstat ping -c 600 albert" -- which started up
> OK, and demonstrated some long RTTs about every 11 packets or so, but we
> see things come to a screeching halt with:
>
> ...
> 64 bytes from 172.16.8.13: icmp_seq=534 ttl=63 time=0.664 ms
> lockstat: dtrace_status(): Abort due to systemic unresponsiveness
> 64 bytes from 172.16.8.13: icmp_seq=535 ttl=63 time=9404.383 ms
>
> and we get no lockstat output. :-/
>
>
> Finally, as another "control," I ran similar commands from freebeast,
> while it was running head@r368119 (and building head@r368143).  Those
> results are in the "lockstat_freebeast" file.
>

According to the data you got, the entire kernel "freezes" every 11-12
seconds, so something is way off there.
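
One way to double-check that period from a saved ping typescript is to
list the slow replies and look at the spacing of their sequence numbers.
The following is only a minimal sketch: the function names and the
100 ms threshold are my own assumptions (normal RTTs in this thread are
around 0.6 ms), and it relies on ping sending roughly one packet per
second, so a gap of 11 in icmp_seq is only approximately 11 seconds.

```python
import re

# Matches reply lines such as:
#   64 bytes from 172.16.8.13: icmp_seq=534 ttl=63 time=0.664 ms
REPLY = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def slow_replies(lines, threshold_ms=100.0):
    """Return (icmp_seq, rtt_ms) for each reply slower than threshold_ms.

    threshold_ms is an assumed cutoff, not something taken from the thread.
    Lines that are not ping replies are ignored.
    """
    slow = []
    for line in lines:
        m = REPLY.search(line)
        if m and float(m.group(2)) > threshold_ms:
            slow.append((int(m.group(1)), float(m.group(2))))
    return slow


def gaps(slow):
    """Distance in sequence numbers between consecutive slow replies."""
    seqs = [seq for seq, _ in slow]
    return [b - a for a, b in zip(seqs, seqs[1:])]
```

Feeding it the lines of the typescript, for example
gaps(slow_replies(open(path))) with path being wherever the ping output
was saved, should show whether the stalls really are evenly spaced.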

Given that the bug seems to be reproducible, I think it would be best
if you just bisected to the offending commit.
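
For reference, the bookkeeping behind a bisection is just a binary
search over revisions. The sketch below is illustrative only: first_bad
and is_bad are hypothetical names, and is_bad stands for whatever manual
check you settle on (build and boot a kernel at that revision, run the
ping test, see whether the stalls show up). It assumes you have one
revision on head known to be good and one known to be bad, and that the
problem stays present in every revision after it first appears.

```python
def first_bad(good, bad, is_bad):
    """Return the earliest bad revision in the range (good, bad].

    good   -- a revision number known NOT to show the problem
    bad    -- a later revision number known to show the problem
    is_bad -- hypothetical callable: takes a revision number and returns
              True if that revision shows the problem
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # problem already present: look earlier
        else:
            good = mid   # still fine: look later
    return bad
```

Each step halves the range, so even a few thousand candidate revisions
need only a dozen or so test builds. Not every revision number
necessarily touches head, so in practice some midpoints may need to be
nudged to the nearest revision that does.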

-- 
Mateusz Guzik <mjguzik gmail.com>
