Date: Fri, 12 Aug 2016 14:43:40 -0700
From: David Wolfskill <david@catwhisker.org>
To: John Baldwin <jhb@freebsd.org>
Cc: hackers@freebsd.org
Subject: Re: "ipmi0: KCS..." whines
Message-ID: <20160812214340.GZ1112@albert.catwhisker.org>
In-Reply-To: <2855524.PakqtZoDR6@ralph.baldwin.cx>
References: <20160811175409.GW1112@albert.catwhisker.org> <2855524.PakqtZoDR6@ralph.baldwin.cx>

On Fri, Aug 12, 2016 at 11:54:38AM -0700, John Baldwin wrote:
> ...
> So the issue is probably that the BMC controller on your box is sometimes
> slow in responding.  The completion code is the third byte of the reply
> we wait to read after sending a request to the BMC via KCS.  However, the
> first two bytes just echo back the request ID and command we asked for, so
> it may be that the BMC echoes those back right away without waiting for
> whatever work it needs to do to handle the request to complete, but doesn't
> send the completion code (the status of the request) until the request is
> fully processed.
>
> The driver is complaining that the BMC didn't respond with the completion
> code before its timeout expired.  The default timeout is MAX_TIMEOUT in
> sys/dev/ipmi/ipmivars.h which corresponds to 6 seconds.  It may be that
> occasionally some "background" task runs in the BMC OS that delays responses
> to handling commands.  It could also be that whatever work the BMC has to do
> to read this specific value is actually timing out or having issues in the
> hardware, etc.

I could easily modify the stress-test loop to run "date" after each
"ipmitool" invocation.  (Pity we don't seem to have a sub-second format
in strftime().)

So... I tried the above (interspersing "date" commands while running
"ipmitool dcmi power reading" in a loop within script(1)).  I did not
get a whine at 32 repetitions; I got one at 64.  The total elapsed time
was no more than 3 seconds (the difference between the last and first
timestamps was 2 seconds).

> You could try increasing the timeout in MAX_TIMEOUT (just increase '6' to
> however many seconds you want to tolerate), but keep in mind that the CPU
> sits and spins polling for a reply (though the cure may be worse than the
> disease).  You might also try polling this sensor less often.

That's one of the "odd things" -- based on the change that was committed
(locally) I would expect that we issue the "ipmitool dcmi power reading"
command (along with a handful of others) once every 59 seconds.

The complete list of such commands (fed to ipmitool via stdin) is:

dcmi power reading
sensor
raw 0x06 0x52 0x07 0x5b 0x01 0x92
raw 0x30 0x70 0x4b 0x00 0x03
exit

> We could maybe use ppsratecheck() to rate limit the errors, but that's
> sort of papering over the problem that the BMC is timing out the request.

Well, in fairness, that's probably doing a slightly less brute force bit
of "papering over" than the patch I had provided. :-}

> A larger option is to modify the IPMI driver to support interrupt-driven
> operation (and not just polled) in which case a longer timeout might not
> hurt so much (you at least wouldn't be spinning on the CPU for N seconds).
> ....

I wouldn't mind testing that, but I don't think I'm up to writing it.

Thanks!

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
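
For anyone repeating the stress test: a minimal sketch of the timing loop described above, with sub-second resolution instead of "date" output. It is an illustration only, not what was actually run. The names timed_runs, slowest, and main are hypothetical, and the sketch assumes ipmitool is on PATH and that the caller has permission to talk to the BMC.

```python
import subprocess
import time


def timed_runs(cmd, count, run=subprocess.run):
    """Run cmd `count` times; return a list of (start_offset, duration) in seconds.

    `run` is injectable so the loop can be exercised without a BMC.
    """
    samples = []
    t0 = time.monotonic()
    for _ in range(count):
        start = time.monotonic()
        # Output is discarded; only the per-invocation timing matters here.
        run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
        end = time.monotonic()
        samples.append((start - t0, end - start))
    return samples


def slowest(samples):
    """Return the (start_offset, duration) pair with the longest duration."""
    return max(samples, key=lambda s: s[1])


def main(count=64):
    # Assumed invocation: the command named in this thread, repeated `count` times.
    results = timed_runs(["ipmitool", "dcmi", "power", "reading"], count)
    for offset, duration in results:
        print(f"+{offset:9.3f}s  took {duration:.3f}s")
    offset, duration = slowest(results)
    print(f"slowest: {duration:.3f}s, started at +{offset:.3f}s")
```

Calling main() prints one line per invocation, so a single slow reply can be matched against the time of the "ipmi0: KCS" message in the kernel log and compared with the 6-second MAX_TIMEOUT.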