From owner-freebsd-hackers@freebsd.org Fri Aug 12 21:43:42 2016
Return-Path:
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
    by mailman.ysv.freebsd.org (Postfix) with ESMTP id 770C5BB805E
    for ; Fri, 12 Aug 2016 21:43:42 +0000 (UTC)
    (envelope-from david@catwhisker.org)
Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3])
    by mx1.freebsd.org (Postfix) with ESMTP id 6258F13BA
    for ; Fri, 12 Aug 2016 21:43:42 +0000 (UTC)
    (envelope-from david@catwhisker.org)
Received: by mailman.ysv.freebsd.org (Postfix)
    id 5E313BB805D; Fri, 12 Aug 2016 21:43:42 +0000 (UTC)
Delivered-To: hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
    by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DCB2BB805C
    for ; Fri, 12 Aug 2016 21:43:42 +0000 (UTC)
    (envelope-from david@catwhisker.org)
Received: from albert.catwhisker.org (mx.catwhisker.org [198.144.209.73])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (Client did not present a certificate)
    by mx1.freebsd.org (Postfix) with ESMTPS id 2FF0E13B9;
    Fri, 12 Aug 2016 21:43:41 +0000 (UTC)
    (envelope-from david@catwhisker.org)
Received: from albert.catwhisker.org (localhost [127.0.0.1])
    by albert.catwhisker.org (8.15.2/8.15.2) with ESMTP id u7CLheuK048316;
    Fri, 12 Aug 2016 21:43:40 GMT
    (envelope-from david@albert.catwhisker.org)
Received: (from david@localhost)
    by albert.catwhisker.org (8.15.2/8.15.2/Submit) id u7CLhecv048315;
    Fri, 12 Aug 2016 14:43:40 -0700 (PDT)
    (envelope-from david)
Date: Fri, 12 Aug 2016 14:43:40 -0700
From: David Wolfskill
To: John Baldwin
Cc: hackers@freebsd.org
Subject: Re: "ipmi0: KCS..."
 whines
Message-ID: <20160812214340.GZ1112@albert.catwhisker.org>
References: <20160811175409.GW1112@albert.catwhisker.org> <2855524.PakqtZoDR6@ralph.baldwin.cx>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="sBy3cog7RUybpTge"
Content-Disposition: inline
In-Reply-To: <2855524.PakqtZoDR6@ralph.baldwin.cx>
User-Agent: Mutt/1.6.1 (2016-04-27)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 12 Aug 2016 21:43:42 -0000

--sBy3cog7RUybpTge
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Aug 12, 2016 at 11:54:38AM -0700, John Baldwin wrote:
> ...
> So the issue is probably that the BMC controller on your box is sometimes
> slow in responding. The completion code is the third byte of the reply
> we wait to read after sending a request to the BMC via KCS. However, the
> first two bytes just echo back the request ID and command we asked for, so
> it may be that the BMC echoes those back right away without waiting for
> whatever work it needs to do to handle the request to complete, but doesn't
> send the completion code (the status of the request) until the request is
> fully processed.
>
> The driver is complaining that the BMC didn't respond with the completion
> code before its timeout expired. The default timeout is MAX_TIMEOUT in
> sys/dev/ipmi/ipmivars.h which corresponds to 6 seconds. It may be that
> occasionally some "background" task runs in the BMC OS that delays responses
> to handling commands. It could also be that whatever work the BMC has to do
> to read this specific value is actually timing out or having issues in the
> hardware, etc.

I could easily modify the stress-test loop to run "date" after each
"ipmitool" invocation.
(Pity we don't seem to have a sub-second format in strftime().)

So... I tried the above (interspersing "date" commands while running
"ipmitool dcmi power reading" in a loop within script(1)). I did not
get a whine at 32 repetitions; I got one at 64. The total elapsed time
was no more than 3 seconds (last timestamp - first timestamp difference
was 2 seconds).

> You could try increasing the timeout in MAX_TIMEOUT (just increase '6' to
> however many seconds you want to tolerate), but keep in mind that the CPU
> sits and spins polling for a reply (though the cure may be worse than the
> disease). You might also try polling this sensor less often.

That's one of the "odd things" -- based on the change that was committed
(locally) I would expect that we issue the "ipmitool dcmi power reading"
command (along with a handful of others) once every 59 seconds.

The complete list of such commands (fed to ipmitool via stdin) is:

dcmi power reading
sensor
raw 0x06 0x52 0x07 0x5b 0x01 0x92
raw 0x30 0x70 0x4b 0x00 0x03
exit

> We could maybe use ppsratecheck() to rate limit the errors, but that's
> sort of papering over the problem that the BMC is timing out the request.

Well, in fairness, that's probably doing a slightly less brute force bit
of "papering over" than the patch I had provided. :-}

> A larger option is to modify the IPMI driver to support interrupt-driven
> operation (and not just polled) in which case a longer timeout might not
> hurt so much (you at least wouldn't be spinning on the CPU for N seconds).
> ....

I wouldn't mind testing that, but I don't think I'm up to writing it.

Thanks!

Peace,
david
-- 
David H. Wolfskill david@catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

--sBy3cog7RUybpTge
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQF8BAEBCgBmBQJXrkMMXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w
ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRDQ0I3Q0VGOTE3QTgwMUY0MzA2NEQ3N0Ix
NTM5Q0M0MEEwNDlFRTE3AAoJEBU5zECgSe4XfvYH/jYhYC8o/NPzFrkTkHAZ2W1E
kqrJitnkPUSnqU5zuSW/usKCYvrWh9YGBeBTv1TsGzzYoAsCi8kRUqMAF/oJiFRd
vC1CBAvnUVqXkHvX1Nes8THnML0HtMW6VAiyx8to+oFshs2VKXJqI1iq5geFH8el
QaqIBuvBd0zu6DGCszmQxMq0VT3ls3qhgmUN/x1asBZ44X60h+n71taiEjvFzzRf
BqZPminCQcmPZx9CdNxIOu/jx+8r1W5hBAuc80r2DSkUS4VPBNQPRa4fm7KWvWoj
1VaSrryGgOq/Bb/fYKNWrbh7FBylhNIoD6J1yAaGQ32fuek6clYyayjD0Qqqywo=
=EpUB
-----END PGP SIGNATURE-----

--sBy3cog7RUybpTge--
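
The stress test described in the message above ran "ipmitool dcmi power reading" in a loop with "date" in between, which only gives one-second resolution. The following is a minimal sketch, not the script used in the thread, of one way to get a sub-second timestamp and a per-invocation duration for each run, so a single slow reply would stand out. The function name time_invocations and its return shape are hypothetical and chosen only for illustration.

```python
import subprocess
import time
from datetime import datetime


def time_invocations(argv, repetitions):
    """Run the command `argv` `repetitions` times, one after another.

    Returns one (timestamp, elapsed_seconds, exit_status) tuple per run.
    The timestamp is wall-clock time with millisecond resolution; the
    elapsed time is measured with a monotonic clock so it is unaffected
    by wall-clock adjustments.
    """
    results = []
    for _ in range(repetitions):
        stamp = datetime.now().isoformat(timespec="milliseconds")
        start = time.monotonic()
        # Discard the command's output; only timing and status are kept.
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        elapsed = time.monotonic() - start
        results.append((stamp, elapsed, proc.returncode))
    return results
```

Assuming ipmitool is installed and the caller may talk to the BMC, a call such as time_invocations(["ipmitool", "dcmi", "power", "reading"], 64) would mirror the 64-repetition run mentioned in the message; comparing the elapsed values against the 6-second MAX_TIMEOUT might show whether individual requests really approach the driver's limit.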