From: Warner Losh
Date: Sat, 4 Nov 2017 23:03:55 -0600
Subject: Re: svn commit: r325378 - head/sys/dev/ipmi
To: Peter Wemm
Cc: "svn-src-all@freebsd.org", Warner Losh, src-committers, "svn-src-head@freebsd.org"
In-Reply-To: <2932858.xKWtPkGhRe@overcee.wemm.org>
References: <201711040301.vA431wdY002757@repo.freebsd.org> <2932858.xKWtPkGhRe@overcee.wemm.org>
List-Id: SVN commit messages for the src tree for head/-current

On Sat, Nov 4, 2017 at 10:50 PM, Peter Wemm wrote:

> On Saturday, November 04, 2017 03:01:58 AM Warner Losh wrote:
> > Author: imp
> > Date: Sat Nov 4 03:01:58 2017
> > New Revision: 325378
> > URL: https://svnweb.freebsd.org/changeset/base/325378
> >
> > Log:
> >   Make the startup timeout 0 seconds by default rather than 420s. This
> >   makes the default fail safe when watchdogd is disabled (which is also
> >   the default).
>
> We're still getting unanticipated reboots.
>
> I think what is happening is:
> 1) orderly reboot initiated.
> 2) By default, the watchdog code sets a 420 second timer, even with no
> watchdogd.
> 3) reboot completes, system comes up.
> 4) A few minutes later, the pre-reboot 420 second timer expires and
> *another* reboot happens.
>
> Setting hw.ipmi.on="0" in loader.conf stops this...
>
> eg: reboot at 4:41:47.. system comes back up, and later:
> ...
> Uptime: 322 Sun Nov 5 04:48:45 UTC 2017
> Uptime: 323 Sun Nov 5 04:48:46 UTC 2017
> Uptime: 324 Sun Nov 5 04:48:47 UTC 2017
> Stopping cron.
> Waiting for PIDS: 1004.
> Stopping sshd.
> Waiting for PIDS: 994.
> Stopping nginx.
> ...
> That's exactly 420 seconds after the original reboot which matches the
> wd_shutdown_countdown timer that is still enabled.
>

Good detective work. I suspect this will need to be opt-in as well...
Though the other option is to disable the watchdog on attach if we're not
enabling the early watchdog, which would still give us a watchdog when we
hang on shutdown... I need to think this through.

The choice is: fix it early with less protection by setting this to 0, or
fix it later with more protection but perhaps odd behavior for some edge
cases like downgrade.

In the meantime, hw.ipmi.wd_shutdown_countdown=0 should also fix it. Can
you confirm that?

Warner
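
For reference, the timing argument in the quoted log can be checked with a few lines of Python. This is only a sketch of the arithmetic, not anything taken from the driver: it assumes the "reboot at 4:41:47" in Peter's message is UTC on the same day as the log lines, and the names `WD_SHUTDOWN_COUNTDOWN` and `watchdog_expiry` are made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the orderly reboot was issued at 04:41:47 UTC on the same day
# as the quoted "Uptime:" lines (Sun Nov 5 2017).
reboot_issued = datetime(2017, 11, 5, 4, 41, 47)

# Timestamp of the last "Uptime:" line before services start stopping again.
second_shutdown = datetime(2017, 11, 5, 4, 48, 47)

# Countdown value discussed in the thread, in seconds (hypothetical constant name).
WD_SHUTDOWN_COUNTDOWN = 420


def watchdog_expiry(armed_at, countdown_s):
    """Return the time at which a timer armed at `armed_at` would fire."""
    return armed_at + timedelta(seconds=countdown_s)


elapsed = (second_shutdown - reboot_issued).total_seconds()
expiry_matches = watchdog_expiry(reboot_issued, WD_SHUTDOWN_COUNTDOWN) == second_shutdown

print(f"elapsed between reboot and second shutdown: {elapsed:.0f}s")
print(f"matches a {WD_SHUTDOWN_COUNTDOWN}s timer armed at reboot: {expiry_matches}")
```

Under that assumption the gap is the full countdown, which is consistent with a timer armed at shutdown surviving the reboot and firing afterwards.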