From owner-svn-src-all@freebsd.org Sun Apr 22 17:11:10 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7577CFAF386; Sun, 22 Apr 2018 17:11:10 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-io0-x233.google.com (mail-io0-x233.google.com [IPv6:2607:f8b0:4001:c06::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0A70186153; Sun, 22 Apr 2018 17:11:10 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-io0-x233.google.com with SMTP id f3-v6so15776456iob.13; Sun, 22 Apr 2018 10:11:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=1onh7fzpfGTYUS+kC1S8F4gQyK75PV3ThEue0BT8LDE=; b=eWOFQ3SzzVC7JRpbJQfWsbm2r/CO6kcDfVUZhq7qmJtVCS2PzZJaDZ+PXFc9ndEFTr QdkgVeV/LM4DRKi+kO4FwUSLDgbnaKCPuwfBqiKtQKAG/HMNr4rZjCvlzxdrVfG0AOMX P6fPL5jj/LPXwmHD64g2gHFLEM/j2889fHbViaCA7gKdhQiqJEXM/d/AbY+oiVLvRGnl 9hhwhZdm/921ioYBIF/d+xAto/9Pt7YihbBU24mWkkIKa9R1XUr1+wfXqvTKBdeR6K27 tsmiVYzeis81YRTXRa5aP8tjMS1rYqfnfdVLy9xy6z7GC+/87JYBvXKyschCgg2lUpFg ABww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=1onh7fzpfGTYUS+kC1S8F4gQyK75PV3ThEue0BT8LDE=; b=PZPSLWhrpkFVHKlZmz39sCUaswc1ivEgdgECeX2WHn2Rw94rb/uDe9WUDWsrpPBR3b rrAt/lr6O1brwALYVsDHXUDINquTffIZyKjm1MZYkTKMeYdEhrs/hzpoqeHmYedOPR0K ODhsfjtb4Z1OjNasYs4K/4grEK9rFugG0EnDZ3ET+j4Zcq96KMn4i75BgVOqBw9nbNSz ZL2kkN2LTR4M7nnZpTKgK6xwJR+iuuHogIU44UF4oUlrzYaTWS0SwQrOVdJf4ZgESY97 rink1yATRMooTPMPkExPZG5GZAB1wzs/USFgA2hl+Buf23CHOuwx1Cra8MitxXREbw/V E84g== X-Gm-Message-State: ALQs6tAS2PXYs2Ue8/GvSGEQ6GO2YTPLJNbjQOEVGD6TGpBt83A2m+oQ /vQcAb3Vc/un1vton//nMoY= X-Google-Smtp-Source: AIpwx4/FtUkZYLbRQ/hTFr8lZJc2uAW8NhjUelIbYEyu2z+SbJAoPhxQVhq77TAWaA/HMArewKe7fg== X-Received: by 2002:a6b:3985:: with SMTP id g127-v6mr17409061ioa.52.1524417069353; Sun, 22 Apr 2018 10:11:09 -0700 (PDT) Received: from raichu (toroon0560w-lp130-04-184-145-252-74.dsl.bell.ca. [184.145.252.74]) by smtp.gmail.com with ESMTPSA id t126-v6sm5014033iod.38.2018.04.22.10.11.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 22 Apr 2018 10:11:08 -0700 (PDT) Sender: Mark Johnston Date: Sun, 22 Apr 2018 13:11:06 -0400 From: Mark Johnston To: Jonathan Looney Cc: cem@freebsd.org, src-committers , svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r332860 - head/sys/kern Message-ID: <20180422171106.GB84833@raichu> References: <201804211705.w3LH50Dk056339@repo.freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Apr 2018 17:11:10 -0000 On Sat, Apr 21, 2018 at 04:33:33PM -0400, Jonathan Looney wrote: > On Sat, Apr 21, 2018 at 1:53 PM, Conrad Meyer wrote: > > > > I don't think this should be enabled by default. Can we leave it > > disabled by default and let consumers opt-in? > > I'm willing to change the default if there's a good reason or consensus for > that. However, it is not obvious to me that the default is actually wrong. > > I think its important that we remember where we are when we hit this code. > > By the time we hit this code, we've already panic'd (whether due to a > "true" panic or a violated assertion). At this point, the system is already > in an unknown/unexpected state and we want to preserve our ability to > troubleshoot and/or recover. (And, since we're running a system with > INVARIANTS, presumably the ability to troubleshoot is important.) > > We may well violate a second assertion simply because the system is in an > unknown/unexpected state. Once we do that, the question is whether we > should try to preserve the ability to troubleshoot, or should give up and > reset the system. > > All too often, my ability to debug assertion violations is hindered because > the system trips over yet another assertion while dumping the core. If we > skip the assertion, nothing bad happens. (The post-panic debugging code > already needs to deal with systems that are inconsistent, and it does a > pretty good job at it.) I think we make a decent effort to fix such problems as they arise, but you are declaring defeat on behalf of everyone. Did you make some effort to fix or report these issues before resorting to the more drastic measure taken here? > On the other hand, I really am not sure what you are worried might happen > if we skip checking assertions after we've already panic'd. As far as I can > tell, the likely worst case is that we hit a true panic of some kind. In > that case, we're no worse off than before. > > I think the one obvious exception is when we're purposely trying to > validate the post-panic debugging code. In that case, you can change the > sysctl/tunable to enable troubleshooting. What about a user whose test system panics and fails to dump? With assertions enabled, a developer has a better chance of spotting the problem. Now we need at least one extra round trip to the user to diagnose the problem, which may not be readily reproducible in the first place. > I would honestly appreciate someone explaining the dangers in disabling a > response to assertion violations after we've already panic'd and are simply > trying to troubleshoot, because they are not obvious to me. But, I could > simply be missing them. The assertions help identify code that is being executed during a dump when it shouldn't be. In general we rely on users to opt in to running INVARIANTS kernels because developers don't catch every single bug. With this change it's harder to be confident in the kernel dump code. (Or in any post-panic debugging code for that matter.) I dislike the change and would prefer the default to be inverted. At the very least I think we should print the assertion message rather than returning silently from kassert_panic().