Date:      Sun, 16 Dec 2012 15:32:13 -0800
From:      Navdeep Parhar <nparhar@gmail.com>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        Ian Lepore <freebsd@damnhippie.dyndns.org>, src-committers@freebsd.org, Peter Wemm <peter@wemm.org>, svn-src-all@freebsd.org, Andriy Gapon <avg@freebsd.org>, svn-src-head@freebsd.org, Peter Jeremy <peter@rulingia.com>
Subject:   Re: svn commit: r244112 - head/sys/kern
Message-ID:  <20121216233213.GA1451@itx>
In-Reply-To: <CAJ-Vmo=4HNhWYSgGBn%2BTae%2B6UO9dqnim_hvcabsCy8Nq-9=bOA@mail.gmail.com>
References:  <201212121658.49048.jhb@freebsd.org> <50C90567.8080406@FreeBSD.org> <50C909BD.9090709@mu.org> <50C91B32.4080904@FreeBSD.org> <20121215205202.GF1411@garage.freebsd.pl> <20121216040717.GG35245@server.rulingia.com> <CAGE5yCofnCKfJ8kMKrV8fmckPt_WOXc9PGnh3zuVUSGO-%2BrCRQ@mail.gmail.com> <1355634037.1198.115.camel@revolution.hippie.lan> <50CD7C1D.3020108@FreeBSD.org> <CAJ-Vmo=4HNhWYSgGBn%2BTae%2B6UO9dqnim_hvcabsCy8Nq-9=bOA@mail.gmail.com>

On Sun, Dec 16, 2012 at 09:23:13AM -0800, Adrian Chadd wrote:
> On 15 December 2012 23:45, Andriy Gapon <avg@freebsd.org> wrote:
> > on 16/12/2012 07:00 Ian Lepore said the following:
> >> The question here isn't whether aborting or continuing beyond that point
> >> is a good idea.  Some developer already made that choice by coding a
> >> KASSERT() instead of a panic().  The developer decided that a production
> >> machine should try to keep running at that point.
> >
> > Please don't perpetuate this argument.  The point of KASSERT is not that the
> > developer intended that the system should try to keep running in production.
> > The point is that (1) the KASSERT should not be hit in production as was
> > established in testing *and* (2) having all KASSERTs enabled in production is
> > too expensive.  That's all.
> 
> You can't possibly believe that once the kernel is in production,
> "testing" stops.
> 
> That's why Alfred and I want to make KASSERT() optionally just print
> that it happened and maybe add some further information, then
> continue.
> 
> It doesn't change the status quo with the default, GENERIC
> "production" kernel. It still crashes where it would normally crash
> (timing bugs otherwise.) It still won't crash where it wouldn't
> trigger a kassert. A shipping, production kernel doesn't have KASSERT
> enabled.
> 
> You may assert "assertions are supposed to crash", yet we ship with
> assertions disabled. Please, tell the software engineers here what you
> think that implies about what we think about those assertions. Let me
> give you a hint - if you ship with them disabled, they don't get run.
> So obviously we don't think there's a big enough problem to cause any
> real issues. Now, this may not be the case at all - in which case,
> those shouldn't be disabled in production kernels, for all the reasons
> everyone above has said. Yet, they're disabled.

It is correct (and standard practice) to ship with assertions disabled.
This is recognition of the fact that assertions and run time error
checks are two different beasts.  Anything expressed as a KASSERT really
should be ironclad by the time the code is deemed ready for production,
and time spent re-checking it at run time isn't worth it.

Creating a new class of checks for recoverable errors would be welcome:
just report the unexpected state and move on.  A subset of these - the
lightweight ones - could even be enabled in production.  Why not
introduce new macros for that?  Why change the meaning of a KASSERT?
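
To make the idea of a separate "report and move on" check concrete,
here is a rough userland sketch.  SOFT_CHECK, soft_check_report(),
soft_check_failures and clamp_len() are all hypothetical names made up
for illustration; none of them exist in the FreeBSD kernel, and a real
kernel version would presumably use printf(9) rather than stderr.

```c
#include <stdio.h>

/* Hypothetical counter: how many soft checks have failed so far. */
static unsigned long soft_check_failures;

/* Hypothetical reporter: log the failed check, then return to the caller. */
static void
soft_check_report(const char *expr, const char *msg, const char *file,
    int line)
{
	soft_check_failures++;
	fprintf(stderr, "soft check failed: %s (%s) at %s:%d\n",
	    expr, msg, file, line);
}

/*
 * SOFT_CHECK (hypothetical, not an existing macro): evaluates to 1 if
 * the condition holds; otherwise reports it and evaluates to 0.  It
 * never aborts, so the caller must have a recovery path.
 */
#define	SOFT_CHECK(exp, msg)						\
	((exp) ? 1 :							\
	    (soft_check_report(#exp, (msg), __FILE__, __LINE__), 0))

/* Example caller: a recoverable condition, handled by clamping. */
static size_t
clamp_len(size_t len, size_t max)
{
	if (!SOFT_CHECK(len <= max, "length exceeds buffer"))
		return (max);
	return (len);
}
```

The point of keeping this apart from KASSERT is that every SOFT_CHECK
site is written with a fallback in mind, which is not true of existing
assertions.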

> 
> The status quo _does not change_ by default.
> 

So now we have a knob that could be used to change the behaviour of all
the KASSERTs in the system; one that hints that it may be possible to
continue even if an assertion in the FreeBSD kernel doesn't hold (this
is the part that bothers me).  I know all the KASSERTs I've looked at
or written are genuine assertions -- the code simply wouldn't be able
to cope if they were violated.  You'd get NULL dereferences, or worse,
accesses to protected structures without the corresponding locks held,
etc.
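
As an illustration of what a genuine assertion looks like, here is a
small userland sketch.  HARD_ASSERT is a hypothetical stand-in for
KASSERT, and struct conn and conn_hold() are invented for the example;
the shape is what matters, not the names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for KASSERT: report and abort. */
#define	HARD_ASSERT(exp, msg) do {					\
	if (!(exp)) {							\
		fprintf(stderr, "assertion failed: %s\n", (msg));	\
		abort();						\
	}								\
} while (0)

/* Invented structure for the example. */
struct conn {
	int	refcnt;
};

/*
 * Take a reference on a connection.  The code after the assertion has
 * no fallback: if c were NULL and the assertion merely printed a
 * message, the very next statement would dereference NULL.
 */
static int
conn_hold(struct conn *c)
{
	HARD_ASSERT(c != NULL, "conn_hold: NULL conn");
	return (++c->refcnt);
}
```

Turning that assertion into a warning would not make the function any
safer; it would only move the failure one line further down.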

Regards,
Navdeep


