Date: Sat, 15 Dec 2012 21:52:03 +0100 From: Pawel Jakub Dawidek <pjd@FreeBSD.org> To: Navdeep Parhar <np@FreeBSD.org> Cc: Adrian Chadd <adrian@FreeBSD.org>, Alfred Perlstein <alfred@FreeBSD.org>, John Baldwin <jhb@FreeBSD.org>, svn-src-all@FreeBSD.org, Alfred Perlstein <bright@mu.org>, Andriy Gapon <avg@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-head@FreeBSD.org Subject: Re: svn commit: r244112 - head/sys/kern Message-ID: <20121215205202.GF1411@garage.freebsd.pl> In-Reply-To: <50C91B32.4080904@FreeBSD.org> References: <201212110708.qBB78EWx025288@svn.freebsd.org> <201212121046.43706.jhb@freebsd.org> <CAJ-Vmo=U04GX%2BZyKuzXLwV%2BPpzU6_dm5BCmL=DWfsmhTVAR%2BsA@mail.gmail.com> <201212121658.49048.jhb@freebsd.org> <50C90567.8080406@FreeBSD.org> <50C909BD.9090709@mu.org> <50C91B32.4080904@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--jkO+KyKz7TfD21mV Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Dec 12, 2012 at 04:02:58PM -0800, Navdeep Parhar wrote: > On 12/12/12 14:48, Alfred Perlstein wrote: > > On 12/12/12 2:29 PM, Andriy Gapon wrote: > >> Now we get a new middle-ground: get both worse performance (because > >> KASSERTs are compiled in) and a risk of harming your data (because > >> KASSERTs no longer panic). The upside: there is no panic! There's just > >> a log message (or etc). and chance to get more log messages because > >> the insanity propagates. And a chance to lose your data (your > >> customer's) - but I've already mentioned this. I am not sure that I > >> like this kind of middle-ground.=20 > > I have a number of points here: > >=20 > > The most important one being: > > 1) without kassert you would still have the bug, just that it would be > > unreported. > > The upside: there is no panic! There's **NO** log message (or etc). > > and chance to get more log messages because the insanity propagates. > >=20 > > Terrible! > >=20 > > Let me explain that again: > > If you don't compile in KASSERT, then it's not like the condition is > > never going to happen. Instead it will just be unreported. >=20 > A KASSERT() really is for a condition that should never happen. It is > primarily useful during development and testing (and when the code is > reworked or redesigned). I agree with Andriy here -- a non-fatal assert > shouldn't really exist. I have sort of mixed feelings about this change, but in reality we have three cases: 1. Fatal conditions that shouldn't happen, but may happen for some reason and we definiately want to stop running (corrupted file system metadata that can mess up the file system more badly). For those we have direct panic(9) calls. 2. Fatal conditions that cannot happen and for those we have KASSERT(9). 3. Non-fatal conditions that cannot happen, which we have no way to report more gracefully and we do it through KASSERT(9). It is annoying to run INVARIANTS kernel and trigger 3. I had this problem few times, for example in TCP/IP stack. It turned out to be non-fatal and KASSERT(9) was there to understand the code better. I'd much prefer to see it logged than to panic my box. Of course it is also sometimes hard to judge if it is fatal or not. I use plenty of assertions in my code to catch bugs early, but assertions like any other part of the code might have bugs and during rapid development they help a lot when the code around is changing a lot and some earlier assumptions are no longer valid. Many of those assertions are non-fatal. For me this problem is pretty complex, because: 1. It would be nice to have a macro to test non-fatal conditions, but it is hard to tell sometimes if it is fatal or not. This macro could be also used when code is hard to understand and it needs some special workload to really figure out if something is possible or not. 2. Buggy assertions can be used for DoS attack. Lack of assertions can be used for much worse problems. 3. Logging non-fatal "assertions" might make them go unnoticed, but we currently don't enable kernel dumps by default, so when panic occurs user has no idea what has happend, especially if he is in X. Logging would give better chance to actually notice the problem currently. 4. We develop using HEAD, which can have bugs, also buggy assertions and it is annoying to trigger assertions not in your code when you try to work on something. But I'm talking here about test boxes, on my laptop I'd probably prefer panic too often than too rarely and corrupt my system. In my opinion we should do the following: 1. Leave the option to make KASSERTs non-fatal, but log big fat warning that this is development feature and should not be used in production. 2. Introduce handy macro that would log the problem, but won't panic the box for non-fatal conditions, just like we do with LORs. 3. Enable kernel dumps by default. The main obstacle to do that was lack of a way to limit number of kernel dumps, which could lead to filling /var/ after hitting some panic/reboot cycle. My contribution to solve this is implementation of 3: http://people.freebsd.org/~pjd/patches/savecore.patch The first two should be easy to do. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://tupytaj.pl --jkO+KyKz7TfD21mV Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlDM4vIACgkQForvXbEpPzQXLgCeLch1lhyOcSZdeleToPhoRe2/ 588AoOCn0RN1zNo2+8p242SCNIO7z2pG =oVGw -----END PGP SIGNATURE----- --jkO+KyKz7TfD21mV--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121215205202.GF1411>