From: Archie Cobbs
Message-Id: <199901160512.VAA07999@bubba.whistle.com>
Subject: Automated debug sanity checkers
To: freebsd-current@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Date: Fri, 15 Jan 1999 21:12:07 -0800 (PST)

I was thinking about the DIAGNOSTICS replacement macros and had a
random thought...

Suppose you're sitting in front of a ddb (or better yet gdb) prompt
because your kernel has just crashed for who knows what reason.

What do you do to debug this? You start looking at variables, memory,
etc. for anything funny going on. Several times we've spent hours
going through a crash dump to find, for example, that a process was
on two queues, or some mbuf was mangled, etc.

The thought is that it would be really easy to help automate this
process, by doing the following:

 1. Define a new kernel option INCLUDE_SANITY_CHECKS (or whatever)

 2. When this is defined, all the various FreeBSD kernel submodules
    (VM, networking, device drivers, etc.) would include a function
    that exhaustively runs sanity checks -- i.e., validations that all
    the assumptions in the code are true -- for that particular
    submodule. This means checking all queues, flags, whatever.

 3. The function is required to only READ memory, not modify it.
    It can report any inconsistencies, though, obviously.

 4. The function is linked into a linker set SANITY_SET(...) or
    whatever

Then by simply calling this function from the debugger you can much
more quickly narrow down the problem (and hopefully fix it before
you get tired and go to sleep :-)

Moreover, since the function is running post-mortem, it can do very
detailed checks that would otherwise take way too long. E.g., check
every mbuf, every queue entry, check the filesystem, etc. Basically
an "fsck" for the kernel memory.

Is this something that people would be motivated enough to make
"official" FreeBSD kernel good housekeeping policy?

-Archie

___________________________________________________________________________
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com
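
A minimal userland sketch of steps 1-4 above, in C, just to make the
shape of the idea concrete. Everything in it is an assumption for
illustration: the proc_stub and mbuf_stub structures, the checker
names, and run_sanity_checks() are hypothetical, and a plain static
table stands in for the SANITY_SET linker set so the example builds
outside a kernel. Each checker only reads its data (the tables are
const) and returns the number of inconsistencies it found.

```c
#include <stddef.h>
#include <stdio.h>

/* In a kernel this would come from the config option (step 1);
 * it is defined here only so the sketch builds on its own. */
#define INCLUDE_SANITY_CHECKS 1

#ifdef INCLUDE_SANITY_CHECKS

/* Hypothetical checker signature: read-only, returns the number
 * of inconsistencies found (steps 2 and 3). */
typedef int (*sanity_check_t)(void);

struct sanity_entry {
    const char     *name;   /* submodule name, for reporting */
    sanity_check_t  check;
};

/* Toy stand-in for a process: flags saying which queues hold it. */
struct proc_stub { int pid; int on_runq; int on_sleepq; };

static const struct proc_stub procs[] = {
    { 1, 1, 0 },
    { 2, 0, 1 },
    { 3, 1, 1 },    /* deliberately broken: on two queues at once */
};

static int proc_sanity(void)
{
    int bad = 0;
    for (size_t i = 0; i < sizeof(procs) / sizeof(procs[0]); i++) {
        if (procs[i].on_runq && procs[i].on_sleepq) {
            printf("proc: pid %d is on two queues\n", procs[i].pid);
            bad++;
        }
    }
    return bad;
}

/* Toy stand-in for an mbuf: data length must fit the buffer. */
struct mbuf_stub { int len; int cap; };

static const struct mbuf_stub mbufs[] = {
    {  64, 128 },
    { 200, 128 },   /* deliberately broken: length exceeds capacity */
};

static int mbuf_sanity(void)
{
    int bad = 0;
    for (size_t i = 0; i < sizeof(mbufs) / sizeof(mbufs[0]); i++) {
        if (mbufs[i].len < 0 || mbufs[i].len > mbufs[i].cap) {
            printf("mbuf[%zu]: len %d, cap %d\n",
                   i, mbufs[i].len, mbufs[i].cap);
            bad++;
        }
    }
    return bad;
}

/* Stand-in for the linker set (step 4): a real kernel would have each
 * submodule register itself instead of listing everything here. */
static const struct sanity_entry sanity_set[] = {
    { "proc", proc_sanity },
    { "mbuf", mbuf_sanity },
};

/* The single entry point one would call from the debugger. */
int run_sanity_checks(void)
{
    int total = 0;
    for (size_t i = 0; i < sizeof(sanity_set) / sizeof(sanity_set[0]); i++) {
        int bad = sanity_set[i].check();
        printf("%s: %d inconsistencies\n", sanity_set[i].name, bad);
        total += bad;
    }
    return total;
}

#endif /* INCLUDE_SANITY_CHECKS */
```

With something along these lines compiled in, a "call run_sanity_checks()"
at the gdb prompt would walk every registered checker and print whatever
looks wrong. Returning a count rather than panicking keeps the checkers
safe to run against an already-crashed kernel.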