From owner-svn-src-all@FreeBSD.ORG Sat Dec 15 06:04:31 2012 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 561AC25A; Sat, 15 Dec 2012 06:04:31 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au [211.29.132.194]) by mx1.freebsd.org (Postfix) with ESMTP id C3A708FC17; Sat, 15 Dec 2012 06:04:29 +0000 (UTC) Received: from c122-106-175-26.carlnfd1.nsw.optusnet.com.au (c122-106-175-26.carlnfd1.nsw.optusnet.com.au [122.106.175.26]) by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id qBF64HxN017455 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 15 Dec 2012 17:04:19 +1100 Date: Sat, 15 Dec 2012 17:04:17 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Alfred Perlstein Subject: Re: svn commit: r244112 - head/sys/kern In-Reply-To: <50CBC285.7060307@mu.org> Message-ID: <20121215161414.V1029@besplex.bde.org> References: <201212110708.qBB78EWx025288@svn.freebsd.org> <50C9271C.70803@mu.org> <20121213090215.GP97487@FreeBSD.org> <201212141149.07671.jhb@freebsd.org> <50CBC285.7060307@mu.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.0 cv=BrrFWvr5 c=1 sm=1 a=9gktFwppygsA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=M4roAWbnUW4A:10 a=mT7c94_YR1GvEF4OrkYA:9 a=CjuIK1q_8ugA:10 a=ggKEkdPFzXdPy3Qo:21 a=9UpuXnAP4qImuvuz:21 a=bxQHXO5Py4tHmhUgaywp5w==:117 X-Mailman-Approved-At: Sat, 15 Dec 2012 12:32:05 +0000 Cc: Adrian Chadd , src-committers@freebsd.org, John Baldwin , svn-src-all@freebsd.org, Alfred Perlstein , Andriy Gapon , Gleb Smirnoff , Robert Watson , Navdeep Parhar , svn-src-head@freebsd.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 15 Dec 2012 06:04:31 -0000 On Fri, 14 Dec 2012, Alfred Perlstein wrote: > On 12/14/12 4:12 PM, Robert Watson wrote: >> On Fri, 14 Dec 2012, John Baldwin wrote: >> >>> On Thursday, December 13, 2012 4:02:15 am Gleb Smirnoff wrote: >>>> On Wed, Dec 12, 2012 at 04:53:48PM -0800, Alfred Perlstein wrote: A> The >>>> problem again is that not all the KASSERTS are inviolable, if you A> want >>>> to do a project to split them, then please do, it would really be A> >>>> helpful, as for now, they are a mis-mash of death/warnings and there are >>>> A> at least three vendors who approve of this as well as 3 long term A> >>>> committers that approved my change (not including Adrian). >>>> >>>> Can you show examples of not inviolable KASSERTs? >>> >>> There are none. They are all assertions for a reason. However, in my Not even one whose existence is a bug? :-) >>> experience at several large consumers of FreeBSD, no one wants to run with >>> INVARIANTS in production. Not because we don't want panics (believe me, >>> Yahoo! gets plenty of panics even with INVARIANTS disabled), but because >>> the performance overhead, and redefining INVARIANTS into printf doesn't >>> address that at all. I mostly don't use INVARIANTS, but once used Debugger() a lot. Now I only want to debug my own code and sometimes turn on INVARIANTS to try to do this, but it usually exposes more bugs not in my own code so I turn it off again :-). >> In the past, FYI, the two major INVARIANTS hits were un-inlining of >> locking, and UMA using additional global locking to perform heavier-weight >> sanity checking. For me, at least, the former wasn't such a problem, but >> the latter was extremely measurable. I have occasionally wondered if we >> should have another option name for heavier-duty sanity checking, as is the >> case with WITNESS, SOCKBUF_DEBUG, etc, so that INVARIANTS overhead can be >> made tolerable for more real-world installations. As you observe, our >> invariants tests are generally things where you catch critical problems in >> much more debuggable states, rather than sources of additional fragility, >> and it would be great if we could leave more in so that bug reports were >> able to contain more information. I think that KASSERT() is used too much: - the heavyweight uses that you mention - many lightweight unconditional panics in old code were turned into conditional panics using KASSERT(). So production code that turns off INVARIANTS no longer gets the benefit of the sanity checking for these. - newer code tends to use KASSERT() even for things that could reasonably be lightweight unconditional panics. So production code that turns off INVARIANTS never got the benefit of the sanity checking for these. - not many unconditional panics (with non-lightweight checking for them) can't be left in production code, so there were a limited number of them, with a few pushed under DIAGNOSTIC. In particular, ones for debugging didn't grow unboundedly. Now, KASSERT()s grow unboundedly and the cost of enabling INVARIANTS grows unboundedly. - the source code becomes unreadable, with half of it tending to consist ot misformatted KASSERT()s SHOUTING IN UPPER CASE. The latter bug is missing in assert(3). assert(3) has a better designed API in other respects too. For example it supplies the bloat for __func__, __FILE__ and __LINE__ whether you want this or not. This bloat was intentionally left out for KASSERT(). But many uses of KASSERT() supply this bloat in the caller, where it also bloat the source code. > The KTR system offers an interesting reference for a model that allows us to > make a compile time decision about which areas to trace. > > There used to be a DIAGNOSTIC option for the more heavy checks, this is > either removed or just not used these days. Again, using a mask like KTR > might be a win if we bring back the equivalent of heavy weight DIAGNOSTIC > option. DIGANOSTIC still exists. It is rarely used, and tends to be only used for heavyweight checks that are too buggy or too heavyweight or just too old to put in KASSERT(). > Often I've been guilty of putting KASSERT(ptr != NULL) checks into the code > too. Those are really just to make it less painful when hitting that bug, You mean to make it _more_ painful when hitting that bug. > so > instead of having to do a lot of homework to figure out the fault address and > line number, I can just get a pretty printed message under INVARIANTS. Maybe > those "pretty null checks" need to go under another option, perhaps > DIAGNOSTIC, or another .. ??PRETTY_PANIC. Null pointer panics give restartable page faults, with the fault address printed by trap_fatal() and the page fault often easy to restart using a debugger. trap_fatal() calls the debugger, so the page fault isn't completely transparent. This corresponds to gdb in userland stopping on the signal for the page fault. You might have to use "handle SIGFOO Stop/Print/Pass" to get back to the faulting instruction. In the kernel, continuing from trap_fatal() gives this behaviour. The page fault will repeat endlessly unless you fix it up. You can try various fixes until you find one that doesn't fault, or let it fault a few times to look at the context in different ways. If panic() is used for null pointers, then you get an unrestartable panic(), with the relevant context several frames back where it is hard to see and much more difficult to restart from (practically impossible now that panic() stops CPUs, etc., though it used to be possible to unwind the frame in some cases). The null pointer check doesn't even detect the common case where the pointer is &p->bar where p is null. Now the bad pointer is a small offset from a null pointer (normally at a small address). It would take an earlier check of p to detect this. > Anyhow, I've always been a huge fan of FreeBSD due the additional debugging > checks it offered. I was the one that suggested splassert() back in the day > when we would continually find issues with spl calls. Additionally I found SPLASSERT() was almost never used. The only uses that I can find of it are all in netipsec. > doing filesystem work a ton easier with DEBUG_VFS_LOCKS under FreeBSD than > under Darwin which at the time did not have such a feature. This was more useful I think :-). The main reason that SPLASSERT() was useless was because spl was a stable API that went away at about the same time that SPLASSERT() was added (SPLASSERT() was implemented about 6 months before SMPng killed spl). Bruce