Date: Sun, 29 Nov 2020 15:20:15 +0100
From: Mateusz Guzik <mjguzik@gmail.com>
To: David Wolfskill <david@catwhisker.org>, Jonathan Looney <jonlooney@gmail.com>, current@freebsd.org
Subject: Re: Laptop exhibits erratic responsiveness
Message-ID: <CAGudoHHevDxuvwLO2QMrMNfvG0vWSU=v9gHLbH0eMFDt8p0FWg@mail.gmail.com>
In-Reply-To: <X8OsLaVToS1V1zoX@albert.catwhisker.org>
References: <X8JVn/PIPuLzsWuQ@albert.catwhisker.org> <CADrOrms8fmf9asttPw%2B4B%2BKL7A4svAKx0dDHSuDnEVvxGWiX0Q@mail.gmail.com> <X8OsLaVToS1V1zoX@albert.catwhisker.org>

On 11/29/20, David Wolfskill <david@catwhisker.org> wrote:
> On Sat, Nov 28, 2020 at 10:47:57AM -0500, Jonathan Looney wrote:
>> FWIW, I would try running lockstat on the box. (My supposition is
>> that the delay is due to a lock. That could be incorrect. Lockstat
>> may provide some clue as to whether this is a line of inquiry worth
>> pursuing.)
>> ....
>
> Thanks (again), Jonathan.
>
> So... I did that (during this morning's daily upgrade cycle); the
> results may be "of interest" to some.
>
> I have placed copies of the typescripts in:
>
> http://www.catwhisker.org/~david/FreeBSD/head/lockstat/
>
> I also scribbled a "README" in that same directory (though it doesn't
> seem to show up in the listing); it may be accessed via
>
> http://www.catwhisker.org/~david/FreeBSD/head/lockstat/README
>
> My prior message in this thread showed what I saw during a "ping albert"
> from the laptop while it was running head -- most RTTs were around 0.600
> ms, but some were notably longer, with at least one that was over 68
> seconds.
>
> So I did a "lockstat ping -c 64 albert" while the laptop was running
> stable/12@r368123 (as a reference point); it is probably boring. :-}
>
> Then (this morning), I tried a simple "lockstat sleep 600" on the laptop
> while it was running head@r368119 (and building head@r368143); we see
> the "lockstat" output in the "lockstat_head" file.
>
> It then occurred to me that trying a "lockstat ping albert" might be
> useful, so I fired up "lockstat ping -c 600 albert" -- which started up
> OK, and demonstrated some long RTTs about every 11 packets or so, but we
> see things come to a screeching halt with:
>
> ...
> 64 bytes from 172.16.8.13: icmp_seq=534 ttl=63 time=0.664 ms
> lockstat: dtrace_status(): Abort due to systemic unresponsiveness
> 64 bytes from 172.16.8.13: icmp_seq=535 ttl=63 time=9404.383 ms
>
> and we get no lockstat output. :-/
>
>
> Finally, as another "control," I ran similar commands from freebeast,
> while it was running head@r368119 (and building head@r368143). Those
> results are in the "lockstat_freebeast" file.
>

According to the data you got, the entire kernel "freezes" every 11-12
seconds, so something way off is going on there.

Given that the bug seems to be reproducible, I think it would be best if
you just bisected to the offending commit.

-- 
Mateusz Guzik <mjguzik gmail.com>
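
For anyone following along, the bisection suggested above could be sketched roughly as below. This is only an illustration, not a tested procedure from this thread: the function names, the 100 ms threshold, and the `is_bad` callback (which would have to build and boot a given revision and report whether the stalls show up) are all assumptions. The thread does not name a known-good head revision, so one would have to be found first.

```python
import re
from typing import Callable, List

# Matches reply lines such as:
#   64 bytes from 172.16.8.13: icmp_seq=535 ttl=63 time=9404.383 ms
RTT_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def slow_replies(ping_output: str, threshold_ms: float = 100.0) -> List[int]:
    """Return the icmp_seq numbers whose RTT exceeds threshold_ms.

    The 100 ms default is an arbitrary assumption; normal RTTs in the
    report were around 0.6 ms, so any much smaller cutoff would also do.
    """
    slow = []
    for line in ping_output.splitlines():
        match = RTT_RE.search(line)
        if match and float(match.group(2)) > threshold_ms:
            slow.append(int(match.group(1)))
    return slow


def bisect_revisions(good: int, bad: int,
                     is_bad: Callable[[int], bool]) -> int:
    """Return the first bad revision in the range (good, bad].

    Assumes `good` really is good, `bad` really is bad, and that every
    revision after the offending one is also bad. `is_bad` is a
    hypothetical callback supplied by the caller.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # the offending commit is at or before mid
        else:
            good = mid   # the offending commit is after mid
    return bad
```

The `slow_replies` helper could serve as the core of `is_bad`: run a fixed-length ping on the revision under test and call the revision bad if any reply is flagged. Each step halves the remaining range, so even a span of a few thousand revisions needs only about a dozen build-and-boot cycles.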