Subject: Re: making SW_WATCHDOG dynamic
To: karels@FreeBSD.org, freebsd-arch@freebsd.org
From: Andriy Gapon
Date: Wed, 27 Dec 2017 15:18:53 +0200
In-Reply-To: <201712261425.vBQEPMmQ007578@mail.karels.net>

On 26/12/2017 16:25, Mike Karels wrote:
> There is a kernel option, SW_WATCHDOG, which adds a low-level software
> watchdog in hardclock. By default, the kernel and watchdogd support
> only hardware-based watchdogs. There is also a callout-based software
> watchdog that can be enabled by watchdogd with an ioctl if --softwatchdog
> is specified, but watchdogd doesn't switch on its own. The SW_WATCHDOG
> option adds a lower-level software watchdog to the hardware-based mechanism,
> but it adds it unconditionally. I propose to include the SW_WATCHDOG
> facility by default, but enable it only if there is no hardware watchdog.

I think that this is a good idea. However, I would not necessarily tie
the software watchdog to the absence of a hardware watchdog. That is
probably a good default policy, but I would also allow the software
watchdog to be enabled or disabled explicitly (e.g. via a sysctl).

I also think that we should support enabling several watchdog timers with
different timeouts. Each of them can serve a different purpose.
E.g., a software or hardware NMI-sending watchdog can be used to get
diagnostic data out of a hung system, while a resetting watchdog can be
used to ensure fail-safe operation.

> I'm interested in any comments, suggestions, or background; feel free to
> mail me off the list. If there are multiple people interested, I'll
> forward messages to that group.
>
> I want to make the change because I have found SW_WATCHDOG quite useful
> at $JOB, and it's annoying to have to build a custom kernel just for this
> (not just once, but every time there is a kernel patch).

Makes sense.

> Also, I'm curious why we have two software watchdog facilities. The
> --softwatchdog facility has various options on expiration, such as
> printf/log/panic; I don't know why anything other than panic/reboot
> would be desirable, though. I already contacted some of the people who
> have left fingerprints on watchdog. Also, if anyone wants to review
> the code, let me know.

I guess that the second software watchdog was added to achieve what I
suggested above. Of course, it would have been nicer to re-use SW_WATCHDOG
for that purpose and to add more generic support for configuring multiple
watchdog timers with different timeouts. But I guess that adding a new
single-purpose software watchdog was much easier to do.

P.S. And maybe just using the second software watchdog would be good enough
for what you are doing?

-- 
Andriy Gapon
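
The policy discussed in this message (software watchdog on by default only
when there is no hardware watchdog, an explicit override, and several timers
with different timeouts and purposes) can be sketched in plain userland C.
This is only an illustration under stated assumptions: every name below
(wd_timer, wd_mode, wd_action, wd_enabled, wd_pat, wd_tick) is hypothetical
and is not part of the FreeBSD watchdog code, the "mode" field merely stands
in for a sysctl-style knob, and time is modeled as abstract ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the FreeBSD watchdog API. */

/* Ordered by severity so that actions can be compared with '>'. */
enum wd_action { WD_ACTION_NONE, WD_ACTION_NMI, WD_ACTION_RESET };

/* Stand-in for a sysctl-style knob: automatic policy or explicit on/off. */
enum wd_mode { WD_MODE_AUTO, WD_MODE_ON, WD_MODE_OFF };

struct wd_timer {
	const char	*name;
	bool		 software;	/* true for a software watchdog */
	enum wd_mode	 mode;
	unsigned	 timeout_ticks;	/* must be non-zero */
	unsigned	 elapsed_ticks;	/* ticks since the last pat */
	enum wd_action	 action;	/* what to do on expiry */
};

/*
 * Default (AUTO) policy: a software watchdog runs only when there is no
 * hardware watchdog.  An explicit ON or OFF setting overrides that.
 */
bool
wd_enabled(const struct wd_timer *t, bool have_hw_watchdog)
{
	switch (t->mode) {
	case WD_MODE_ON:
		return (true);
	case WD_MODE_OFF:
		return (false);
	default:
		return (!t->software || !have_hw_watchdog);
	}
}

/* "Pat" the watchdogs: restart the countdown of every timer. */
void
wd_pat(struct wd_timer *timers, size_t n)
{
	for (size_t i = 0; i < n; i++)
		timers[i].elapsed_ticks = 0;
}

/*
 * Advance all enabled timers by one tick.  Returns the most severe action
 * among the timers that expire on this tick; each timer fires at most once
 * between pats.
 */
enum wd_action
wd_tick(struct wd_timer *timers, size_t n, bool have_hw_watchdog)
{
	enum wd_action worst = WD_ACTION_NONE;

	for (size_t i = 0; i < n; i++) {
		struct wd_timer *t = &timers[i];

		assert(t->timeout_ticks > 0);
		if (!wd_enabled(t, have_hw_watchdog))
			continue;
		if (t->elapsed_ticks >= t->timeout_ticks)
			continue;	/* already fired since last pat */
		if (++t->elapsed_ticks == t->timeout_ticks &&
		    t->action > worst)
			worst = t->action;
	}
	return (worst);
}
```

With a short-timeout timer whose action is WD_ACTION_NMI and a longer-timeout
timer whose action is WD_ACTION_RESET, a hung system would first produce
diagnostic output and only later be reset, which is the split of purposes
described above.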