Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 17 Dec 2012 14:02:53 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        Alfred Perlstein <alfred@freebsd.org>, src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>, svn-src-all@freebsd.org, Alfred Perlstein <bright@mu.org>, Gleb Smirnoff <glebius@freebsd.org>, Robert Watson <rwatson@freebsd.org>, Navdeep Parhar <np@freebsd.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-head@freebsd.org
Subject:   Re: svn commit: r244112 - head/sys/kern
Message-ID:  <CAJ-VmonBP5chxPROAQya8ckmLzJrMRK%2B2qD_SKpROJO=T71_Kw@mail.gmail.com>
In-Reply-To: <50CF92F0.5020904@FreeBSD.org>
References:  <201212110708.qBB78EWx025288@svn.freebsd.org> <50CBC285.7060307@mu.org> <20121215161414.V1029@besplex.bde.org> <201212171439.27297.jhb@freebsd.org> <50CF8CE7.4020906@mu.org> <50CF92F0.5020904@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 17 December 2012 13:47, Andriy Gapon <avg@freebsd.org> wrote:

> But you see, the following is still illogical _to me_.

And this is the core of the problem.

A lot of developers are interpreting the KASSERT() conditions as an
invariant condition that, if in any way enabled, should be completely
trusted, believed and cause an immediate panic().

However, we leave them out on shipping, production kernels. Why's
that? Because the contract here is "the code should never hit these
situations, so we don't bother checking."

It's totally understandable at this point why there's such a huge
amount of confusion here. On one hand we have a construct that allows
the developer to enforce correct behaviour and panic early if it gets
caught out; on the other hand we totally ignore all of that in
shipping, production kernels.

Why are they there, if we just ship production releases with
INVARIANTS disabled?

> If one believes that all KASSERTs are bogus then one should compile them out and
> just forget about them.

There's a lot of perfectly good reasons for leaving KASSERT()s there
in the source.

The main reason: because they enforce what the correct behaviour SHOULD be.

Even if the code doesn't fail the invariant check, it may do so in the future.

Why?

* because we're not perfect, not at all. The existing code may be subtly buggy.
* it protects those bits of code from changes in other areas of code
that can cause flow-on effects that trigger these kinds of invariant
checks.

> If one believes that at least some KASSERTs are such that the execution should
> not continue after them, then one should enable all of them and panic on all of
> them (better safe than sorry).

Yes, but we really should turn as many of those panic conditions into
"handle the invariant failure gracefully", since it may be
recoverable.

Again, right now GENERIC wouldn't panic if those conditions are met,
as right now we don't compile them in.


> If one is not satisfied with the above choices, then one should meticulously go
> over all (or at least some) of KASSERTs and make a decision which KASSERTs are
> bogus, which are true and which should be turned into something else (error
> checking, warnings, etc).

Again, look above. Those assertions mean more than "make sure the code
works right at the present moment."

However, I'm in the field of "defensive programming" - a lot of those
invariants likely could be dealt with inside the function itself.

Eg, something like this:

KASSERT(n >= 0 || n < max, ("n out of bounds"));
if (n < 0 || n >= max)
    return -EINVAL;

.. that kind of thing should be the norm.

In any case, _that_ is a separate discussion to all of this.

> The proposed solution simultaneously enables all KASSERTs, as if all of them
> were true, and doesn't panic on any KASSERT, as if neither of them is fatal
> (i.e. they all are bogus to one degree and another).

.. GENERIC doesn't panic on the KASSERTs, because they're not even
compiled in. Thus, the kernel doesn't panic (much) more or less than
before. See below.

> P.S.  I am not talking about the status quo because the status quo would be
> maintained even without the changes in question.

The status quo in _behaviour_ is:

* shipping kernels don't panic on an invariant check, because we don't
even compile them in;
* doing the invariant checks shouldn't affect future execution paths
(except for whatever changes occur because we're printing them out);
* if it crashed in GENERIC because it wasn't caught by a (not compiled
in) invariant check, it will very likely still crash with alfreds work
- we'll just get told about it beforehand.
* if it wouldn't crash in GENERIC after it wasn't caught by a (not
compiled in) invariant check, we'll still get a printout of the
assertion failure, then execution will continue.

The _behaviour_ doesn't change, outside of some extra console IO and
some potential for race conditions to occur "differently."



Adrian



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-VmonBP5chxPROAQya8ckmLzJrMRK%2B2qD_SKpROJO=T71_Kw>