From: Peter Wemm
To: Warner Losh
Cc: "svn-src-all@freebsd.org", Warner Losh, src-committers, "svn-src-head@freebsd.org"
Subject: Re: svn commit: r325378 - head/sys/dev/ipmi
Date: Sat, 04 Nov 2017 22:19:45 -0700
Message-ID: <1595776.mmy5sTxHyV@overcee.wemm.org>
List-Id: SVN commit messages for the src tree for head/-current

On Saturday, November 04, 2017 11:03:55 PM Warner Losh wrote:
> On Sat, Nov 4, 2017 at 10:50 PM, Peter Wemm wrote:
> > On Saturday, November 04, 2017 03:01:58 AM Warner Losh wrote:
> > > Author: imp
> > > Date: Sat Nov 4 03:01:58 2017
> > > New Revision: 325378
> > > URL: https://svnweb.freebsd.org/changeset/base/325378
> > >
> > > Log:
> > >   Make the startup timeout 0 seconds by default rather than 420s. This
> > >   makes the default fail safe when watchdogd is disabled (which is also
> > >   the default).
> >
> > We're still getting unanticipated reboots.
> >
> > I think what is happening is:
> > 1) An orderly reboot is initiated.
> > 2) By default, the watchdog code sets a 420 second timer, even with no
> >    watchdogd.
> > 3) The reboot completes and the system comes up.
> > 4) A few minutes later, the pre-reboot 420 second timer expires and
> >    *another* reboot happens.
> >
> > Setting hw.ipmi.on="0" in loader.conf stops this...
> >
> > eg: reboot at 4:41:47.. system comes back up, and later:
> > ...
> > Uptime: 322 Sun Nov 5 04:48:45 UTC 2017
> > Uptime: 323 Sun Nov 5 04:48:46 UTC 2017
> > Uptime: 324 Sun Nov 5 04:48:47 UTC 2017
> > Stopping cron.
> > Waiting for PIDS: 1004.
> > Stopping sshd.
> > Waiting for PIDS: 994.
> > Stopping nginx.
> > ...
> > That's exactly 420 seconds after the original reboot, which matches the
> > wd_shutdown_countdown timer that is still enabled.
>
> Good detective work. I suspect this will need to be opt-in as well... Though
> the other option is to disable the watchdog on attach if we're not enabling
> the early watchdog, which would give us a watchdog when we hang on
> shutdown... I need to think this through.... Fix it early with less
> protection by setting this to 0, or fix it later with more protection, but
> perhaps odd behavior for some edge cases like downgrade.
>
> In the meantime hw.ipmi.wd_shutdown_countdown=0 should also fix it. Can
> you confirm that?
>
> Warner

We have a number of obnoxious machines that take 5+ minutes in POST. The
7-minute timer is cutting it awfully close.

However, what I'm more worried about: what if you're going to boot something
other than FreeBSD? Or going into the BIOS to tweak something? If I break
into the loader to pause booting, it'll just silently reboot out from under me
a few minutes later. I don't see how this can be anything but opt-in by
default. As it's a timer initiated by an orderly shutdown/reboot, there should
be plenty of time for an appropriate value to be safely set.

Yes, setting the sysctl after boot did prevent the spurious reboot after the
next boot-up.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
UTF-8: for when a ' or ... just won’t do…
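
The "exactly 420 seconds" observation in the quoted log can be checked with a little arithmetic. The sketch below is only an illustration: it takes the reboot time (4:41:47) and the last Uptime line before the second shutdown (04:48:47) from the message and assumes both are UTC on the same day. The variable names are made up for this example and are not part of the ipmi driver.

```python
from datetime import datetime

# Timestamps taken from the message; assumed to be the same day, both UTC.
FMT = "%H:%M:%S"
first_reboot = datetime.strptime("04:41:47", FMT)      # orderly reboot initiated
second_shutdown = datetime.strptime("04:48:47", FMT)   # last Uptime line before "Stopping cron."

# Elapsed time between the original reboot and the unexpected second one.
elapsed = int((second_shutdown - first_reboot).total_seconds())

# Default countdown mentioned in the commit log (420s).
wd_shutdown_countdown = 420

print(elapsed, elapsed == wd_shutdown_countdown)
```

The elapsed time equals the 420 second default, which is consistent with the pre-reboot shutdown countdown still being armed after the system came back up.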