Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 15 Dec 2012 21:52:03 +0100
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Navdeep Parhar <np@FreeBSD.org>
Cc:        Adrian Chadd <adrian@FreeBSD.org>, Alfred Perlstein <alfred@FreeBSD.org>, John Baldwin <jhb@FreeBSD.org>, svn-src-all@FreeBSD.org, Alfred Perlstein <bright@mu.org>, Andriy Gapon <avg@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-head@FreeBSD.org
Subject:   Re: svn commit: r244112 - head/sys/kern
Message-ID:  <20121215205202.GF1411@garage.freebsd.pl>
In-Reply-To: <50C91B32.4080904@FreeBSD.org>
References:  <201212110708.qBB78EWx025288@svn.freebsd.org> <201212121046.43706.jhb@freebsd.org> <CAJ-Vmo=U04GX%2BZyKuzXLwV%2BPpzU6_dm5BCmL=DWfsmhTVAR%2BsA@mail.gmail.com> <201212121658.49048.jhb@freebsd.org> <50C90567.8080406@FreeBSD.org> <50C909BD.9090709@mu.org> <50C91B32.4080904@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--jkO+KyKz7TfD21mV
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Dec 12, 2012 at 04:02:58PM -0800, Navdeep Parhar wrote:
> On 12/12/12 14:48, Alfred Perlstein wrote:
> > On 12/12/12 2:29 PM, Andriy Gapon wrote:
> >> Now we get a new middle-ground: get both worse performance (because
> >> KASSERTs are compiled in) and a risk of harming your data (because
> >> KASSERTs no longer panic). The upside: there is no panic! There's just
> >> a log message (or etc). and chance to get more log messages because
> >> the insanity propagates. And a chance to lose your data (your
> >> customer's) - but I've already mentioned this. I am not sure that I
> >> like this kind of middle-ground.=20
> > I have a number of points here:
> >=20
> > The most important one being:
> > 1) without kassert you would still have the bug, just that it would be
> > unreported.
> >   The upside: there is no panic! There's **NO** log message (or etc).
> > and chance to get more log messages because the insanity propagates.
> >=20
> > Terrible!
> >=20
> > Let me explain that again:
> > If you don't compile in KASSERT, then it's not like the condition is
> > never going to happen.  Instead it will just be unreported.
>=20
> A KASSERT() really is for a condition that should never happen.  It is
> primarily useful during development and testing (and when the code is
> reworked or redesigned).  I agree with Andriy here -- a non-fatal assert
> shouldn't really exist.

I have sort of mixed feelings about this change, but in reality we have
three cases:

1. Fatal conditions that shouldn't happen, but may happen for some
   reason and we definiately want to stop running (corrupted file system
   metadata that can mess up the file system more badly). For those we have
   direct panic(9) calls.

2. Fatal conditions that cannot happen and for those we have KASSERT(9).

3. Non-fatal conditions that cannot happen, which we have no way to
   report more gracefully and we do it through KASSERT(9).

It is annoying to run INVARIANTS kernel and trigger 3. I had this
problem few times, for example in TCP/IP stack. It turned out to be
non-fatal and KASSERT(9) was there to understand the code better.
I'd much prefer to see it logged than to panic my box. Of course it is
also sometimes hard to judge if it is fatal or not.

I use plenty of assertions in my code to catch bugs early, but
assertions like any other part of the code might have bugs and during
rapid development they help a lot when the code around is changing a lot
and some earlier assumptions are no longer valid. Many of those
assertions are non-fatal.

For me this problem is pretty complex, because:

1. It would be nice to have a macro to test non-fatal conditions, but it
   is hard to tell sometimes if it is fatal or not. This macro could be
   also used when code is hard to understand and it needs some special
   workload to really figure out if something is possible or not.

2. Buggy assertions can be used for DoS attack. Lack of assertions can be
   used for much worse problems.

3. Logging non-fatal "assertions" might make them go unnoticed, but we
   currently don't enable kernel dumps by default, so when panic occurs
   user has no idea what has happend, especially if he is in X. Logging
   would give better chance to actually notice the problem currently.

4. We develop using HEAD, which can have bugs, also buggy assertions and
   it is annoying to trigger assertions not in your code when you try to
   work on something. But I'm talking here about test boxes, on my
   laptop I'd probably prefer panic too often than too rarely and
   corrupt my system.

In my opinion we should do the following:

1. Leave the option to make KASSERTs non-fatal, but log big fat warning
   that this is development feature and should not be used in production.

2. Introduce handy macro that would log the problem, but won't panic the
   box for non-fatal conditions, just like we do with LORs.

3. Enable kernel dumps by default. The main obstacle to do that was lack
   of a way to limit number of kernel dumps, which could lead to filling
   /var/ after hitting some panic/reboot cycle.

My contribution to solve this is implementation of 3:

	http://people.freebsd.org/~pjd/patches/savecore.patch

The first two should be easy to do.

--=20
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl

--jkO+KyKz7TfD21mV
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAlDM4vIACgkQForvXbEpPzQXLgCeLch1lhyOcSZdeleToPhoRe2/
588AoOCn0RN1zNo2+8p242SCNIO7z2pG
=oVGw
-----END PGP SIGNATURE-----

--jkO+KyKz7TfD21mV--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121215205202.GF1411>