Date: Sun, 21 Apr 2019 12:05:12 -0600
From: Ian Lepore <ian@freebsd.org>
To: Warner Losh <imp@bsdimp.com>
Cc: Karl Denninger <karl@denninger.net>, Andrew Gierth <andrew@tao11.riddles.org.uk>, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, ticso@cicely.de
Subject: Re: insanely-high interrupt rates -- PARTIAL resolution (Pi2)
Message-ID: <26962ee10cf8c61416dde40d5e8c0c24400316f9.camel@freebsd.org>
In-Reply-To: <CANCZdfrVXMqpvsWqig0a21HQ7PyK-Y4Md7t9E-3YPE3d9E4e6w@mail.gmail.com>
References: <004ddba628b94b80845d8e509ddcb648d21fd6c9.camel@freebsd.org>
 <C68D7E6E-03C1-448F-8638-8BD1717DBF44@jeditekunum.com>
 <ac7d434f16f3a89f5ef247678d6becdbeded5c3f.camel@freebsd.org>
 <CE40E2B5-2244-4EF9-B67F-34A54D71E2E8@jeditekunum.com>
 <f60ea6d2-b696-d896-7bcb-ac628f41f7b8@denninger.net>
 <20190319161423.GH57400@cicely7.cicely.de>
 <52df098fdc0caf5de1879c93239534fffbd49b56.camel@freebsd.org>
 <40f57de2-2b25-3981-a416-b9958cc97636@denninger.net>
 <669892ac3fc37b0843a156c0ab102316829103fd.camel@freebsd.org>
 <663f2566-b035-7011-70eb-4163b41e6e55@denninger.net>
 <20190325164827.GL57400@cicely7.cicely.de>
 <3db9cf8a-68ee-e339-67bf-760ee51464fd@denninger.net>
 <fc17ac0f77832e840b9fffa9b1074561f1e766d8.camel@freebsd.org>
 <d96c7f42-f01b-8990-a558-ee92d631b51d@denninger.net>
 <dc56a8964cae942354cbe2b5b0620f2eebb569bb.camel@freebsd.org>
 <874l7fyrpr.fsf@news-spur.riddles.org.uk>
 <701e011f-3088-8ed4-4fbb-6fa93ac698f5@denninger.net>
 <aefa1d778e7684f71ffed49ce32ee80e2273d033.camel@freebsd.org>
 <67133e19-2be5-ccd1-2ded-008b36a866ec@denninger.net>
 <dd411c0bba7a78c35f1016ef2efa93f50b2ba68a.camel@freebsd.org>
 <CANCZdfrVXMqpvsWqig0a21HQ7PyK-Y4Md7t9E-3YPE3d9E4e6w@mail.gmail.com>
On Sun, 2019-04-21 at 12:00 -0600, Warner Losh wrote:
> On Sun, Apr 21, 2019 at 11:58 AM Ian Lepore <ian@freebsd.org> wrote:
>
> > On Wed, 2019-04-17 at 14:56 -0500, Karl Denninger wrote:
> > > On 4/9/2019 19:25, Ian Lepore wrote:
> > > > On Tue, 2019-04-09 at 09:55 -0500, Karl Denninger wrote:
> > > > > On 4/3/2019 11:48, Andrew Gierth wrote:
> > > > > > [...]
> > > >
> > > > I've just posted https://reviews.freebsd.org/D19871 for this.
> > > > Hopefully I'll get it committed in a day or so and merged to
> > > > 12-stable a few days after that.
> > > >
> > > > -- Ian
> > >
> > > I am running that now on a Pi2 and so far the load problem is gone
> > > but the spurious interrupt warnings are not....
> > >
> > > [...]
> > >
> > > On my bench without the I2c inputs connected (which do analog
> > > reads) I do NOT get the spurious interrupt prints.  With it
> > > connected I do.  The process that reads them is code that is
> > > running in both cases, but if it cannot find the I2c devices it
> > > logs the error but continues, so all it gets to is trying to open
> > > the unit, doesn't see it when probed, and gives up.
> > >
> > > It appears that I2c is an inherent part of the spurious interrupt
> > > thing still and while the timer issue appears to be fixed that
> > > doesn't resolve the other problem.
> > >
> > > Any ideas on how to track down exactly what is generating those
> > > warnings?
> >
> > After spending the whole day yesterday trying all the usual driver
> > techniques for eliminating spurious interrupts, I was unable to make
> > them go away completely, but I also convinced myself they're
> > harmless.
> >
> > I was a little surprised that the "read after write" technique
> > didn't work.  That is, after writing to the i2c control register to
> > clear all the interrupt-enable bits, read back that register.  In
> > theory, at least on normal arm chips, that ensures that the prior
> > write has reached the hardware before the read can proceed, so it's
> > a way to guarantee that the write has taken effect and the interrupt
> > can no longer be asserted, before returning from the interrupt
> > handler.  But, on the rpi chips even that doesn't work... you can
> > read back the register and verify the interrupt-enable bits are
> > cleared, and still after returning from the handler, it
> > re-interrupts immediately.
> >
> > If you stick in a nice long DELAY() after clearing the control
> > register, the spurious interrupts go away, but that's a horrible
> > fix.  It would be especially horrible for i2c devices that do a lot
> > of transfers, you'd end up with the delay time overwhelming the time
> > to do the actual transfers themselves.
> >
> > So, in r346489, I moved the reporting of the spurious transfers
> > under the bootverbose flag, so that normally you just won't see them
> > anymore, but we can still enable the reporting if we suspect some
> > device driver is behaving badly.  I'll mfc that change to 12-stable
> > after a few days.
>
> vmstat -i will also show if your system has an unusually high
> interrupt rate in general as well, and is preferable to spamming the
> console with printfs :)
>
> Warner

vmstat doesn't report spurious interrupts in any way, though.  I
considered making it do so as one of the possible fixes here, but it
turns out to be complicated... we need to do a bit of reworking of the
INTRNG code as it relates to interrupt counts.  For example, on x86 you
get this from vmstat -i:

    cpu0:timer                      42006521         80
    cpu1:timer                      32510560         62

But on arm, all timer interrupts are counted as belonging to the
generic_timer0 device.  When I tried to figure out how to split that
into per-cpu reporting like x86 does, I discovered what a mess the
intrstats stuff in INTRNG is right now.  So, a project for another day,
I guess.

-- Ian
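
For anyone not familiar with the "read after write" technique described in the quoted text, here is a minimal userland sketch of the pattern. Everything in it is an assumption made for illustration: the register index, the bit names, and the fake_regs array that stands in for memory-mapped hardware are all hypothetical, and none of it is taken from the actual rpi i2c driver or from r346489. It only shows the shape of the idiom (clear the interrupt-enable bits, then read the same register back before returning); as noted above, on the rpi chips the read-back succeeds and the interrupt still re-fires, which a simulation like this cannot reproduce.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
#define HYP_NREGS          4
#define HYP_REG_CONTROL    0            /* index of the control register */
#define HYP_CTL_ENABLE     (1u << 15)   /* controller enable bit */
#define HYP_CTL_INT_RX     (1u << 10)   /* interrupt-enable: receive */
#define HYP_CTL_INT_TX     (1u << 9)    /* interrupt-enable: transmit */
#define HYP_CTL_INT_DONE   (1u << 8)    /* interrupt-enable: transfer done */
#define HYP_CTL_INT_MASK   (HYP_CTL_INT_RX | HYP_CTL_INT_TX | HYP_CTL_INT_DONE)

/* Stand-in for device registers; a real driver would use bus_space accessors. */
static volatile uint32_t fake_regs[HYP_NREGS];

static uint32_t
hyp_reg_read(unsigned idx)
{
	assert(idx < HYP_NREGS);
	return (fake_regs[idx]);
}

static void
hyp_reg_write(unsigned idx, uint32_t val)
{
	assert(idx < HYP_NREGS);
	fake_regs[idx] = val;
}

/*
 * Clear every interrupt-enable bit, then read the control register back.
 * On real hardware the read is meant to force the preceding write to reach
 * the device before the handler returns.  The value read back is returned
 * so the caller can verify that the enable bits are really clear.
 */
static uint32_t
hyp_disable_interrupts(void)
{
	uint32_t ctl;

	ctl = hyp_reg_read(HYP_REG_CONTROL);
	hyp_reg_write(HYP_REG_CONTROL, ctl & ~HYP_CTL_INT_MASK);
	return (hyp_reg_read(HYP_REG_CONTROL));	/* the read-after-write */
}
```

A handler built this way would call hyp_disable_interrupts() as its last register access and could check that the returned value has no bits from HYP_CTL_INT_MASK set.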