From: Warner Losh
Date: Sat, 4 Nov 2017 23:35:47 -0600
Subject: Re: svn commit: r325378 - head/sys/dev/ipmi
To: Peter Wemm
Cc: "svn-src-all@freebsd.org", Warner Losh, src-committers, "svn-src-head@freebsd.org"
In-Reply-To: <1595776.mmy5sTxHyV@overcee.wemm.org>
References: <201711040301.vA431wdY002757@repo.freebsd.org> <2932858.xKWtPkGhRe@overcee.wemm.org> <1595776.mmy5sTxHyV@overcee.wemm.org>

On Sat, Nov 4, 2017 at 11:19 PM, Peter Wemm wrote:

> On Saturday, November 04, 2017 11:03:55 PM Warner Losh wrote:
> > On Sat, Nov 4, 2017 at 10:50 PM, Peter Wemm wrote:
> > > On Saturday, November 04, 2017 03:01:58 AM Warner Losh wrote:
> > > > Author: imp
> > > > Date: Sat Nov 4 03:01:58 2017
> > > > New Revision: 325378
> > > > URL: https://svnweb.freebsd.org/changeset/base/325378
> > > >
> > > > Log:
> > > >   Make the startup timeout 0 seconds by default rather than 420s.
> > > >   This makes the default fail safe when watchdogd is disabled
> > > >   (which is also the default).
> > >
> > > We're still getting unanticipated reboots.
> > >
> > > I think what is happening is:
> > > 1) orderly reboot initiated.
> > > 2) By default, the watchdog code sets a 420 second timer, even with
> > >    no watchdogd.
> > > 3) reboot completes, system comes up.
> > > 4) A few minutes later, the pre-reboot 420 second timer expires and
> > >    *another* reboot happens.
> > >
> > > Setting hw.ipmi.on="0" in loader.conf stops this...
> > >
> > > eg: reboot at 4:41:47.. system comes back up, and later:
> > > ...
> > > Uptime: 322 Sun Nov 5 04:48:45 UTC 2017
> > > Uptime: 323 Sun Nov 5 04:48:46 UTC 2017
> > > Uptime: 324 Sun Nov 5 04:48:47 UTC 2017
> > > Stopping cron.
> > > Waiting for PIDS: 1004.
> > > Stopping sshd.
> > > Waiting for PIDS: 994.
> > > Stopping nginx.
> > > ...
> > > That's exactly 420 seconds after the original reboot, which matches
> > > the wd_shutdown_countdown timer that is still enabled.
> >
> > Good detective work. I suspect this will need to be opt-in as well...
> > Though the other option is to disable the watchdog on attach if we're
> > not enabling the early watchdog, which would give us a watchdog when
> > we hang on shutdown... I need to think this through.... Fix it early
> > with less protection by setting this to 0, or fix it later with more
> > protection, but perhaps odd behavior for some edge cases like
> > downgrade.
> >
> > In the meantime hw.ipmi.wd_shutdown_countdown=0 should also fix it.
> > Can you confirm that?
> >
> > Warner
>
> We have a number of obnoxious machines that take 5+ minutes in POST. The
> 7 minute timer is cutting it awfully close.
>
> However, what I'm more worried about: what if you're going to boot
> something other than FreeBSD? Or going into the BIOS to tweak something?
> If I break into the loader to pause booting, it'll just silently reboot
> out from under me a few minutes later.
> I don't see how this can be anything but opt-in by default. As it's a
> timer initiated by an orderly shutdown/reboot there should be plenty of
> time for an appropriate value to be safely set.
>
> Yes, setting the sysctl after boot did prevent the spurious reboot after
> the next boot-up.

OK. Given the edge cases aren't so edgy as I was originally thinking, I'm
inclined to agree here: both features have to be opt-in. Attempts at being
clever only work in a monoculture of FreeBSD where one is always moving
forward in versions and never back. There are problems with both of these
assumptions...

Sorry for what sounds like a lot of hassle to diagnose this.

Warner
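
The timing argument quoted in this thread (reboot at 4:41:47, services stopping again right after the 04:48:47 uptime line, 420 second countdown) can be checked with a few lines of Python. This is only a sketch under stated assumptions: the calendar date of the first reboot is assumed to be the same day as the logged Uptime lines, and all names in it are made up for illustration; none of them come from the ipmi driver.

```python
from datetime import datetime, timedelta

# Values quoted in the thread. The date of the first reboot is an
# assumption: the thread only gives the time of day (4:41:47).
REBOOT_STARTED = datetime(2017, 11, 5, 4, 41, 47)

# The shutdown watchdog countdown discussed in the thread (420 seconds).
WD_SHUTDOWN_COUNTDOWN = timedelta(seconds=420)

# Timestamp of the last "Uptime" line logged before services began
# stopping for the second, unexpected reboot.
SECOND_SHUTDOWN_SEEN = datetime(2017, 11, 5, 4, 48, 47)


def expected_expiry(armed_at, countdown):
    """Return when a countdown armed at `armed_at` would expire."""
    return armed_at + countdown


expiry = expected_expiry(REBOOT_STARTED, WD_SHUTDOWN_COUNTDOWN)
print("countdown would expire at:", expiry.time().isoformat())
print("matches the observed shutdown:", expiry == SECOND_SHUTDOWN_SEEN)
```

The same helper also illustrates the POST concern raised above: a machine that spends five or more minutes in POST has used up most of a seven minute countdown before the operating system starts booting.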