Date:      Fri, 12 Aug 2016 14:43:40 -0700
From:      David Wolfskill <david@catwhisker.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        hackers@freebsd.org
Subject:   Re: "ipmi0: KCS..." whines
Message-ID:  <20160812214340.GZ1112@albert.catwhisker.org>
In-Reply-To: <2855524.PakqtZoDR6@ralph.baldwin.cx>
References:  <20160811175409.GW1112@albert.catwhisker.org> <2855524.PakqtZoDR6@ralph.baldwin.cx>

On Fri, Aug 12, 2016 at 11:54:38AM -0700, John Baldwin wrote:
> ...
> So the issue is probably that the BMC controller on your box is sometimes
> slow in responding.  The completion code is the third byte of the reply
> we wait to read after sending a request to the BMC via KCS.  However, the
> first two bytes just echo back the request ID and command we asked for, so
> it may be that the BMC echoes those back right away without waiting for
> whatever work it needs to do to handle the request to complete, but doesn't
> send the completion code (the status of the request) until the request is
> fully processed.
>
> The driver is complaining that the BMC didn't respond with the completion
> code before its timeout expired.  The default timeout is MAX_TIMEOUT in
> sys/dev/ipmi/ipmivars.h which corresponds to 6 seconds.  It may be that
> occasionally some "background" task runs in the BMC OS that delays responses
> to handling commands.  It could also be that whatever work the BMC has to do
> to read this specific value is actually timing out or having issues in the
> hardware, etc.
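
To make the reply layout described above concrete, here is a minimal
userland sketch in Python. It is not the driver's code: the names
KcsReply, parse_kcs_reply and is_success are hypothetical, and the sample
bytes in the comment are made up for illustration. It assumes only what
the quoted text says (two echoed bytes, then the completion code), plus
the IPMI convention that a completion code of 0x00 means the command
completed normally.

```python
from typing import NamedTuple


class KcsReply(NamedTuple):
    """Hypothetical container for the fields of a KCS reply."""
    netfn_lun: int        # first byte: echoes the request's function/LUN
    command: int          # second byte: echoes the command that was sent
    completion_code: int  # third byte: status of the request
    data: bytes           # any remaining payload bytes


def parse_kcs_reply(raw: bytes) -> KcsReply:
    """Split a raw reply into its header bytes and payload."""
    if len(raw) < 3:
        # A reply that stops before the third byte never delivered a
        # completion code; that is the situation the driver complains about.
        raise ValueError("reply too short: no completion code received")
    return KcsReply(raw[0], raw[1], raw[2], bytes(raw[3:]))


def is_success(reply: KcsReply) -> bool:
    """Assumption: 0x00 is the 'completed normally' completion code."""
    return reply.completion_code == 0x00


# Illustrative (made-up) bytes: parse_kcs_reply(bytes([0x1C, 0x01, 0x00, 0xAA]))
```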

I could easily modify the stress-test loop to run "date" after each
"ipmitool" invocation.  (Pity we don't seem to have a sub-second format
in strftime().)

So... I tried the above (interspersing "date" commands while running
"ipmitool dcmi power reading" in a loop within script(1)).  I did not
get a whine at 32 repetitions; I got one at 64.

The total elapsed time was no more than 3 seconds (last timestamp -
first timestamp difference was 2 seconds).
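
One way to get sub-second timestamps for a loop like the one described
above is to drive it from Python rather than date(1). The following is
only a sketch: the function name stress is hypothetical, and the command
to run is passed in by the caller, so nothing here depends on how
ipmitool itself behaves. Each row records a wall-clock start time with
microsecond resolution and the elapsed time of that invocation, which can
then be lined up against the timestamps of any kernel messages.

```python
import subprocess
import time
from datetime import datetime


def stress(cmd, repetitions):
    """Run cmd repeatedly.

    Returns one (start_timestamp, elapsed_seconds, returncode) tuple per run.
    """
    rows = []
    for _ in range(repetitions):
        # Wall-clock start time with microsecond resolution.
        started = datetime.now().isoformat(timespec="microseconds")
        # Monotonic clock for the duration, so clock steps cannot skew it.
        t0 = time.monotonic()
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        rows.append((started, time.monotonic() - t0, result.returncode))
    return rows


# Possible use, matching the test described above:
#   for row in stress(["ipmitool", "dcmi", "power", "reading"], 64):
#       print(*row)
```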

> You could try increasing the timeout in MAX_TIMEOUT (just increase '6' to
> however many seconds you want to tolerate), but keep in mind that the CPU
> sits and spins polling for a reply (though the cure may be worse than the
> disease).  You might also try polling this sensor less often.
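
The trade-off described above (a longer timeout means more time spent
polling) can be illustrated with a generic poll-until-deadline loop. This
is a userland sketch, not the ipmi(4) driver's code: poll_until and its
parameters are hypothetical names, and where the quoted text says the
driver spins on the CPU, this version sleeps briefly between polls.

```python
import time


def poll_until(ready, timeout_s, interval_s=0.0001,
               clock=time.monotonic, sleep=time.sleep):
    """Poll ready() until it returns True or timeout_s seconds elapse.

    Returns (succeeded, number_of_polls).
    """
    deadline = clock() + timeout_s
    polls = 0
    while True:
        polls += 1
        if ready():
            return True, polls
        if clock() >= deadline:
            # Gave up: the caller waited the full timeout for nothing.
            return False, polls
        # Stand-in for the delay between polls; a spinning driver would
        # burn CPU here instead of sleeping.
        sleep(interval_s)
```

With a 6-second timeout, a reply that never arrives costs the caller the
full 6 seconds on every failed request, which is why raising the limit is
not free in a polled design.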

That's one of the "odd things" -- based on the change that was committed
(locally) I would expect that we issue the "ipmitool dcmi power reading"
command (along with a handful of others) once every 59 seconds.

The complete list of such commands (fed to ipmitool via stdin) is:

dcmi power reading
sensor
raw 0x06 0x52 0x07 0x5b 0x01 0x92
raw 0x30 0x70 0x4b 0x00 0x03
exit

> We could maybe use ppsratecheck() to rate limit the errors, but that's
> sort of papering over the problem that the BMC is timing out the request.

Well, in fairness, that's probably doing a slightly less brute force bit
of "papering over" than the patch I had provided. :-}
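
For readers unfamiliar with the idea, the kind of per-second rate limit
being discussed can be sketched in userland Python. This is only an
analogue of the concept, not the kernel's ppsratecheck() and not a
proposed patch; the class name PpsRateLimiter and its parameters are
hypothetical.

```python
import time


class PpsRateLimiter:
    """Allow at most max_per_sec events in each one-second window."""

    def __init__(self, max_per_sec, clock=time.monotonic):
        self.max_per_sec = max_per_sec
        self.clock = clock          # injectable so the logic can be tested
        self.window_start = None    # start time of the current window
        self.count = 0              # events seen in the current window

    def allow(self):
        """Return True if the caller may emit one more message now."""
        now = self.clock()
        if self.window_start is None or now - self.window_start >= 1.0:
            # A new one-second window begins; reset the counter.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.max_per_sec
```

A caller would wrap the message in "if limiter.allow(): ...", so the
first few timeouts in any second are still reported while a burst of them
is suppressed. As noted above, this hides the symptom and does nothing
about the BMC timing out.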

> A larger option is to modify the IPMI driver to support interrupt-driven
> operation (and not just polled) in which case a longer timeout might not
> hurt so much (you at least wouldn't be spinning on the CPU for N seconds).
> ....

I wouldn't mind testing that, but I don't think I'm up to writing it.

Thanks!

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.



