From: Alfred Perlstein
Date: Tue, 19 Feb 2013 15:14:33 -0800
To: Andriy Gapon
Cc: "arch@freebsd.org", Poul-Henning Kamp
Subject: Re: request for preliminary review, enhanced watchdog.
Message-ID: <51240759.9030900@mu.org>
In-Reply-To: <5123FC0B.3020503@FreeBSD.org>
References: <511AE9C4.4030301@mu.org> <5123FC0B.3020503@FreeBSD.org>
List-Id: Discussion related to FreeBSD architecture

On 2/19/13 2:26 PM, Andriy Gapon wrote:
> on 13/02/2013 03:17 Alfred Perlstein said the following:
>> At work we've had some issues with superfluous watchdog timeouts firing.
>>
>> Since we use an ipmi/external watchdog the system is completely reset and we are
>> unable to gather metrics.
>>
>> I investigated the issue and then compared to what is offered by Linux and
>> decided to crib from their API such that we can benefit from an enhanced watchdog.
>>
>> I have a WIP at this time in a branch that I would hope people could weigh in on
>> and review as well as make technical suggestions.
> Alfred,
>
> I think that this is very useful work.
> Some comments below.
>
>> The branch is located here:
>> svn+ssh://svn.freebsd.org/base/user/alfred/ewatchdog
>>
>> The easy way to get changes:
>> svn log --stop-on-copy svn+ssh://svn.freebsd.org/base/user/alfred/ewatchdog
>>
>> 1) Support for pre-watchdog timeout. This means that so long as the kernel is
>> somewhat functional (callouts are working) we can trigger a configurable action
>> (panic,ddb,log) if the watchdog program is otherwise hung.
> I see where this can be useful.
> The unfortunate drawback which you mentioned is that the solution is
> "semi-reliable" - it won't help much if a hang is such that the callouts no
> longer fire.
> But it could be still desirable to obtain something for postmortem analysis even
> in that condition.

Yes. Exactly why I have done it so.

>
>> 2) Support for built-in software watchdog that has the same options
>> (panic,ddb,log) if the watchdog times out. This is useful for prototyping and
>> was done instead of using the SW_WATCHDOG in kern_clock.c because of the ease of
>> working the code into watchdog.c versus communication via the EVENTHANDLER api.
> I see why you chose (or had to choose) this option, but this is kind of
> unfortunate - more below.

Agreed.

>
>> 3) Support for Linux-like API. (WDIOC_GETTIMELEFT,
>> WDIOC_SETTIMEOUT, WDIOC_GETTIMEOUT, etc)
> I haven't looked at the complete Linux API, but from your quote above - what are
> the Linux and potential FreeBSD use-cases for the ioctls like GETTIMELEFT and
> GETTIMEOUT?

This would be for reporting purposes.

>
>> 4) Modifications to watchdogd(8):
>>  - Warn if the watchdog program takes too long.
>>  - Disable activation of the system watchdog so that one can test the
>>    watchdogd script without potentially rebooting the system.
>>  - Ability to log to syslog when scripts begin to time out.
>>  - When told to measure time, do not unconditionally nap for 'sleep' seconds,
>>    instead adjust the naptime by the elapsed time so as not to trigger the
>>    watchdog.
> I don't have anything to say about the userland part. In general these new
> things sound useful.
>
>> I've not yet hooked in the optional pre-timeout code into watchdogd(8) but plan
>> on doing so later in the week.
>>
>> It would be really helpful if we could decide on a way of selecting which
>> watchdogs to arm/fire and how to query them. I may adopt the Linux API unless
>> someone has alternative suggestions that make a strong enough case to forge our
>> own API.
> Again, I haven't examined Linux API, so I can't say much about it.
> The following is how I imagine our watchdog infrastructure.
>
> I think that we should have some quality and feature flags associated with
> various watchdog drivers (somewhat similarly to e.g. eventtimers), which would
> describe things like:
> - I am implemented in software or hardware
> - I am able to generate system reset
> - I am able to generate a "hard" debug event (NMI)
> - (for software wd) I work via NMIs or regular interrupts
>
> Then, I think that watchdogd should support at least two timeouts: for debug
> watchdog and reset watchdog. The ioctl interface should of course support
> setting timeouts per watchdog type.
> This way a user should be able to specify a timeout (e.g. 10 seconds) for a
> debug watchdog with an intent of dumping a core (or other debugging action) and
> a different timeout (say 60 seconds) for a reset watchdog, which should make
> sure in a fail-safe manner that a system doesn't get stuck in the debug/dump/etc
> code.
>
> Then, the kernel should auto-select the best watchdog driver for each of the
> watchdog classes. But sysctl interface should allow a user to override the
> selection in case that there are multiple drivers with sufficient capabilities.
>
> Also, and only partially related to your WIP, I think that it is long overdue
> that we got a software watchdog driven by (periodic) NMIs as opposed to
> SW_WATCHDOG (or your "callout" "watchdog" [in quotes only because it is not
> implemented as a real watchdog(9) driver, but is blended into the
> infrastructure]) that is driven by regular timer interrupts.
>
> My opinion is that such infrastructure could be more powerful and flexible (and
> reliable) than what you currently have in the branch. We could let a multitude
> of watchdog drivers co-exist and "cooperate" by ensuring that each of them does
> its special part of the overall job. Of course, it requires more work too.
>

As far as this API goes, it is close enough to the Linux API that it makes
sense for us to do this.

Do you think that the current work (plus some documentation that is due) is
good enough for a step forward and inclusion in the system?

-Alfred
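
The capability flags and per-class auto-selection proposed in the quoted text
could be sketched roughly as follows. This is only an illustration in C: the
flag names, struct wd_driver, its quality field, and wd_select() are all
hypothetical and are not part of watchdog(9) or of the ewatchdog branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability flags, modeled on the list in the quoted proposal. */
#define WD_CAP_HARDWARE 0x01u /* implemented in hardware (otherwise software) */
#define WD_CAP_RESET    0x02u /* able to generate a system reset */
#define WD_CAP_NMI      0x04u /* able to generate a "hard" debug event (NMI) */
#define WD_CAP_SW_NMI   0x08u /* software watchdog driven by NMIs */

/* Hypothetical per-driver description; not an existing kernel structure. */
struct wd_driver {
	const char *name;
	unsigned    caps;    /* bitwise OR of WD_CAP_* */
	int         quality; /* higher is preferred, in the spirit of eventtimers */
};

/*
 * Return the highest-quality driver that has every capability in
 * `required`, or NULL if no driver qualifies.  On a quality tie the
 * earlier table entry wins.
 */
static const struct wd_driver *
wd_select(const struct wd_driver *drivers, size_t n, unsigned required)
{
	const struct wd_driver *best = NULL;

	assert(drivers != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if ((drivers[i].caps & required) != required)
			continue; /* missing at least one required capability */
		if (best == NULL || drivers[i].quality > best->quality)
			best = &drivers[i];
	}
	return best;
}
```

Under these assumptions the kernel would call the selector once with
WD_CAP_RESET for the reset class and once with WD_CAP_NMI for the debug class,
each class keeping its own timeout. A sysctl override would simply bypass the
selection and name a driver directly.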