Message-ID: <5012F14F.7070204@FreeBSD.org>
Date: Fri, 27 Jul 2012 22:51:43 +0300
From: Andriy Gapon
To: Andrew Boyer
Cc: sbruno@FreeBSD.org, FreeBSD Stable Mailing List, John Baldwin
References: <1343350238.12294.10.camel@powernoodle.corp.yahoo.com> <23294764-F30B-4732-8C41-3F0ECA5F273C@averesystems.com>
In-Reply-To: <23294764-F30B-4732-8C41-3F0ECA5F273C@averesystems.com>
Subject: Re: IPMI hardware watchdogs Re: dell r420/r320 stable/9

on 27/07/2012 17:33 Andrew Boyer said the following:
>
> On Jul 26, 2012, at 8:50 PM, Sean Bruno wrote:
>
>> For the time being I had to revert the following from my stable/9 tree.
>> Otherwise I would get a kernel panic on shutdown from ipmi(4).
>>
>> http://svnweb.freebsd.org/base?view=revision&revision=237839
>> http://svnweb.freebsd.org/base?view=revision&revision=221121
>>
>
>
> On a somewhat related note: We noticed recently that you can't pet or disable
> the IPMI hardware watchdog once SCHEDULER_STOPPED() is true. This means it
> can fire unexpectedly while you're dumping core or rebooting, depending on
> how long the timeout was on the pet before the panic. The ipmi driver will
> need to process the command differently if the scheduler is stopped. I
> haven't had time to look at a fix yet.

Yeah, I noticed that, unlike most (all?) other watchdog drivers, where
re-arming the watchdog is a very basic operation such as a single I/O access,
the IPMI watchdog does more complex work that involves waiting on another
thread.
I think that this may be a little bit too much for a reliable watchdog driver.
At least, as you note, this definitely won't work in the panic case, where
only one thread is left running.
I guess that the driver should check for that case and perform the operation
directly instead of enqueueing a request and waiting for another thread to
execute it.

-- 
Andriy Gapon
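
To make the suggestion above a bit more concrete, here is a rough userspace
sketch of the idea in C. It is not the ipmi(4) code and not a proposed patch.
Every name in it (scheduler_stopped, struct wd_sim, wd_pet, wd_reset_polled,
wd_worker_run) is hypothetical, and the "hardware" is just a counter. The only
point is the branch: when the scheduler is stopped, reset the timer directly
instead of queueing a request for a worker thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's SCHEDULER_STOPPED() check. */
static bool scheduler_stopped = false;

/* Hypothetical simulated driver state; counters replace real hardware. */
struct wd_sim {
    int queued;     /* requests waiting for the worker thread */
    int hw_resets;  /* how many times the timer was actually reset */
};

/* Direct path: would poll the controller without sleeping. */
static void wd_reset_polled(struct wd_sim *wd)
{
    assert(wd != NULL);
    wd->hw_resets++;
}

/* Simulated worker thread body: drain the queue and reset the timer. */
static void wd_worker_run(struct wd_sim *wd)
{
    assert(wd != NULL);
    while (wd->queued > 0) {
        wd->queued--;
        wd_reset_polled(wd);
    }
}

/*
 * Pet the watchdog. Returns true if the timer was reset before returning,
 * false if the request was only queued for the worker.
 */
static bool wd_pet(struct wd_sim *wd)
{
    assert(wd != NULL);
    if (scheduler_stopped) {
        /* Panic or shutdown: no other thread will run, so do it here. */
        wd_reset_polled(wd);
        return true;
    }
    /* Normal case: hand off to the worker (a real driver would then wait). */
    wd->queued++;
    return false;
}
```

With scheduler_stopped false, wd_pet() only queues the request and the timer
is reset once wd_worker_run() drains the queue. With scheduler_stopped true,
wd_pet() resets the timer itself and nothing is left queued.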