From owner-svn-src-head@freebsd.org  Tue Apr 24 17:40:07 2018
Sender: Mark Johnston <markjdb@gmail.com>
Date: Tue, 24 Apr 2018 13:40:02 -0400
From: Mark Johnston <markj@freebsd.org>
To: "Jonathan T. Looney" <jtl@freebsd.org>
Cc: John Baldwin <jhb@freebsd.org>, cem@freebsd.org,
 src-committers <src-committers@freebsd.org>,
 svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r332860 - head/sys/kern
Message-ID: <20180424174002.GB27358@raichu>
References: <201804211705.w3LH50Dk056339@repo.freebsd.org>
 <CADrOrmvAxuoadBM==1EEbJc4PAPwtd-vPE4Tg-pM86CvwQnnwA@mail.gmail.com>
 <20180423180024.GC84833@raichu>
 <1739228.8pyHcvzasL@ralph.baldwin.cx>
 <CADrOrmunCxzSpBZe35XwX7ZSFyPuSEpvEraKmQP2MdSP8ZMEGw@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CADrOrmunCxzSpBZe35XwX7ZSFyPuSEpvEraKmQP2MdSP8ZMEGw@mail.gmail.com>
User-Agent: Mutt/1.9.4 (2018-02-28)

On Tue, Apr 24, 2018 at 01:24:30PM -0400, Jonathan T. Looney wrote:
> On Mon, Apr 23, 2018 at 6:04 PM, John Baldwin <jhb@freebsd.org> wrote:
> >
> > I think this is actually a key question.  In my experience to date I
> > have not encountered a large number of post-panic assertion failures.
> > Given that we already break all locks and disable assertions for locks
> > I'd be curious which assertions are actually failing.  My inclination
> > given my experiences to date would be to explicitly ignore those as we
> > do for locking if it is constrained set rather than blacklisting all of
> > them.  However, I would be most interested in seeing some examples of
> > assertions that are failing.
> 
> The latest example (the one that prompted me to finally commit this) is in
> lockmgr_sunlock_try(): 'panic: Assertion (*xp & ~LK_EXCLUSIVE_SPINNERS) ==
> LK_SHARERS_LOCK(1) failed at /usr/src/sys/kern/kern_lock.c:541'
> 
> I don't see any obvious recent changes that would have caused this, so this
> is probably a case where a change to another file suddenly made us trip
> over this assert.
> 
> And, that really illustrates my overall point:

Mine too. :)

Why is anything trying to acquire a lockmgr lock after a panic? What is
the stack? I suspect that CAM is completing non-dump CCBs after a panic,
which can cause deadlocks if the completion handler needs to perform a
TLB shootdown after destroying a mapping, for example. In fact, I had
forgotten that Isilon has some CAM patches which attempt to address this
because of the problems that such deadlocks had caused. I will work on
getting these reviewed and upstreamed.

> most assertions in
> general-use code have limited value after a panic.
>
> We expect developers to write high-quality assertions so we can catch bugs.
> This requires that they understand how their code will be used. However,
> once we've panic'd, many assumptions behind code change and the assertions
> are no longer valid. (And, sometimes, it is difficult for a developer to
> predict how these things will change in a panic situation.) We can either
> play whack-a-mole to modify assertions as we trip over them in our
> post-panic work, or we can switch to an opt-in model where we only check
> assertions which the developer actually intends to run post-panic.
> 
> Playing whack-a-mole seems like a Sisyphean task which will burn out
> developers and/or frustrate people who run INVARIANTS kernels. Switching to
> an opt-in model seems like the better long-term strategy.
> 
> Having said all of that, I am cognizant of at least two things:
> 1) Mark Johnston has done a lot of work in coredumps and thinks there are
> post-panic assertions that have value.
> 2) Until we have both agreement to switch our post-panic assertion paradigm
> and also infrastructure to allow developers to opt in, it probably is not
> wise to disable all assertions by default.
> 
> So, I will follow Mark's suggestions: I will change the default. I will
> also change the code so we print a limited number of failed assertions.

Thanks.

> However, I think that changing the post-panic assertion paradigm is an
> important conversation to have. We want people to run our INVARIANTS
> kernels. And, we want to get high-quality reports from those. I think we
> could better serve those goals by changing the post-panic assertion
> paradigm.
> 
> Jonathan
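
For readers following the thread, here is a rough userland sketch of the
two ideas discussed above: an opt-in model for post-panic assertions, and
printing only a limited number of failed assertions once the system has
already panicked. It is not the code from r332860 and it does not use the
kernel's KASSERT machinery. Every name in it (MODEL_ASSERT,
MODEL_ASSERT_POSTPANIC, POST_PANIC_REPORT_LIMIT, and the counters) is
hypothetical, and the "panic" is modelled by setting a flag rather than by
halting.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical model only: none of these names exist in FreeBSD.
 * A real panic() does not return; here it is modelled by a flag.
 */
#define POST_PANIC_REPORT_LIMIT 3

static bool panicked = false;     /* set once the first "panic" happens */
static int fatal_failures = 0;    /* failures treated as a panic */
static int ignored_failures = 0;  /* post-panic failures not treated as fatal */
static int printed_reports = 0;   /* how many of the ignored ones were printed */

static void
model_assert_failed(const char *expr, bool check_after_panic)
{
	if (!panicked || check_after_panic) {
		/* Before a panic, or when the author opted in: fatal. */
		fatal_failures++;
		panicked = true;
		fprintf(stderr, "panic: Assertion %s failed\n", expr);
		return;
	}

	/* After a panic and not opted in: count it, print only a few. */
	ignored_failures++;
	if (printed_reports < POST_PANIC_REPORT_LIMIT) {
		printed_reports++;
		fprintf(stderr, "ignored post-panic assertion: %s\n", expr);
	}
}

/* Default assertion: enforced only until the first panic. */
#define MODEL_ASSERT(cond) do {					\
	if (!(cond))						\
		model_assert_failed(#cond, false);		\
} while (0)

/* Opt-in assertion: the author intends it to hold after a panic too. */
#define MODEL_ASSERT_POSTPANIC(cond) do {			\
	if (!(cond))						\
		model_assert_failed(#cond, true);		\
} while (0)
```

In this model, a failing MODEL_ASSERT before any panic is fatal, later
failures of MODEL_ASSERT are counted but reported at most
POST_PANIC_REPORT_LIMIT times, and MODEL_ASSERT_POSTPANIC stays fatal in
both states. Whether the default should be "ignore" or "enforce" after a
panic is the open question in this thread; the sketch hard-codes "ignore"
only to show the opt-in side.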