From owner-freebsd-hackers@freebsd.org Mon Aug 15 17:19:01 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 54371BBA34B for ; Mon, 15 Aug 2016 17:19:01 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 3751919C0 for ; Mon, 15 Aug 2016 17:19:01 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: by mailman.ysv.freebsd.org (Postfix) id 31890BBA348; Mon, 15 Aug 2016 17:19:01 +0000 (UTC) Delivered-To: hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 311E4BBA347 for ; Mon, 15 Aug 2016 17:19:01 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 08D4019B0 for ; Mon, 15 Aug 2016 17:19:01 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (c-73-231-226-104.hsd1.ca.comcast.net [73.231.226.104]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id E9912B915; Mon, 15 Aug 2016 13:18:59 -0400 (EDT) From: John Baldwin To: David Wolfskill Cc: hackers@freebsd.org Subject: Re: "ipmi0: KCS..." whines Date: Fri, 12 Aug 2016 17:29:58 -0700 Message-ID: <6661021.NZidrlQVOE@ralph.baldwin.cx> User-Agent: KMail/4.14.3 (FreeBSD/10.3-STABLE; KDE/4.14.3; amd64; ; ) In-Reply-To: <20160812214340.GZ1112@albert.catwhisker.org> References: <20160811175409.GW1112@albert.catwhisker.org> <2855524.PakqtZoDR6@ralph.baldwin.cx> <20160812214340.GZ1112@albert.catwhisker.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 15 Aug 2016 13:19:00 -0400 (EDT) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Aug 2016 17:19:01 -0000 On Friday, August 12, 2016 02:43:40 PM David Wolfskill wrote: > On Fri, Aug 12, 2016 at 11:54:38AM -0700, John Baldwin wrote: > > ... > > So the issue is probably that the BMC controller on your box is sometimes > > slow in responding. The completion code is the third byte of the reply > > we wait to read after sending a request to the BMC via KCS. However, the > > first two bytes just echo back the request ID and command we asked for, so > > it may be that the BMC echoes those back right away without waiting for > > whatever work it needs to do to handle the request to complete, but doesn't > > send the completion code (the status of the request) until the request is > > fully processed. > > > > The driver is complaining that the BMC didn't respond with the completion > > code before it's timeout expired. The default timeout is MAX_TIMEOUT in > > sys/dev/ipmi/ipmivars.h which corresponds to 6 seconds. It may be that > > occasionally some "background" task runs in the BMC OS that delays responses > > to handling commands. It could also be that whatever work the BMC has to do > > to read this specific value is actually timing out or having issues in the > > hardware, etc. > > I could easily modify the stress-test loop to run "date" after each > "ipmitool" invocation. (Pity we don't seem to have a sub-second format > in strftime().) > > So... I tried the above (interspersing "date" commands while running > "ipmitool dcmi power reading" in a loop within script(1)). I did not > get a whine at 32 repetitions; I got one at 64. > > The total elapsed time was no more than 3 seconds (last timestamp - > first timestamp difference was 2 seconds). Hmm, you might see what 'MAX_TIMEOUT' is in sys/dev/ipmi/ipmivars.h in your tree. It might also be worthwhile wrapping it in ()'s as in HEAD it is just a bare '6 * hz'. The code to wait for IBF doesn't look like it would break without the ()'s though. It was bumped from 3 seconds to 6 seconds back in 10-current in r253812, but perhaps your box has 3 seconds instead of 6? -- John Baldwin