From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: ambrisko@ironport.com, Eugene Grosbein
Subject: Re: FreeBSD IPMI driver problem
Date: Mon, 26 Sep 2011 10:00:25 -0400
Message-Id: <201109261000.25789.jhb@freebsd.org>
In-Reply-To: <4E801658.9030306@rdtc.ru>

On Monday, September 26, 2011 2:06:16 am Eugene Grosbein wrote:
> Hi!
>
> I use several SuperMicro boxes with integrated IPMI card.
> http://www.supermicro.com/products/system/1U/5016/SYS-5016T-MTF.cfm
>
> FreeBSD 8.2 sometimes hang in the past after panics so I use IPMI's watchdog
> and generally it works nice with 5 minute timeout. The card is detected as following:
>
> ipmi0: on isa0
> ipmi0: KCS mode found at io 0xca2 alignment 0x1 on isa
> ipmi0: IPMI device rev. 1, firmware rev. 1.07, version 2.0
> ipmi0: Number of channels 2
> ipmi0: Attached watchdog
>
> Sometimes ipmi driver issues "KCS errors" to system logs that I ignore
> as they seem harmless. However, one of my boxes suddenly rebooted with watchdog
> after following errors written to console:
>
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Command mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Command mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: Failed to reset watchdog
> ipmi0: KCS: Command mismatch
> ipmi0: KCS error: 01
>
> It seems, the driver lost ability to contact IPMI watchdog timer and that was the reason of reboot.
>
> What can be done to avoid such resets in the future?

Hmm, it looks like the IPMI BMC wedged in some fashion.  The driver tries to
reset the KCS interface when it encounters an error and from your log it
didn't unwedge even after several resets.  In that case there isn't a lot we
can do since we can't talk to the watchdog to turn it off.

-- 
John Baldwin