Date: Tue, 1 Apr 2008 12:57:06 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: stable@FreeBSD.org
Subject: Q&A on textdumps (fwd)
Message-ID: <20080401125534.D94491@fledge.watson.org>

Dear all,

I've now completed the MFC of basic textdump support to 7.0. Once I've had
a chance to ping Brooks about it, either he or I will MFC support for
ddb.conf, which allows configuring textdump and debugging scripts
automatically at boot. I've attached a Q&A post I made to current@ after
committing textdump support to HEAD, and you can also consult textdump(4)
and ddb(4) for more information.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

---------- Forwarded message ----------
Date: Sun, 30 Dec 2007 13:11:29 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
To: current@freeBSD.org
Subject: Q&A on textdumps

Dear all,

I've received a few textdump-related questions that I thought I'd share my
answers to.

(1) What information is in a textdump?

The textdump is stored as a tarfile with several subfiles in it:

    config.txt  - Kernel configuration, if compiled into kernel
    ddb.txt     - Captured DDB output, if present
    msgbuf.txt  - Kernel message buffer
    panic.txt   - Kernel panic message, if there was a panic
    version.txt - Kernel version string

It is easy to add new files to textdumps, so if there's some easily
extractable kernel state that you feel should go in there, drop me an
e-mail and/or send a patch.

(2) Is there any information in a textdump that can't be acquired using
    kgdb and other available dump analysis tools?

In principle no, as normal dumps include all kernel memory, and textdumps
operate by inspecting kernel memory using DDB, capturing only small but
presumably relevant parts. However, there are some important differences in
approach that mean that textdumps can be used in ways that regular dumps
can't easily be:

- DDB textdumps are very small. Including a full debugging session, kernel
  message buffer, and kernel configuration, my textdumps are frequently
  around 100k uncompressed.
  This makes it possible to use them on very small machines, store them
  for an extended period, e-mail them around, etc, in a way that you can't
  currently do with kernel memory dumps. This improved usability will
  (hopefully) improve our bug and crash management.

- DDB is a specialized debugging tool with intimate knowledge of the
  kernel, and there are types of data trivially extracted with DDB that
  are awkward or quite difficult to extract using kgdb or other currently
  available dump analysis tools. Locking, waiting, and process information
  are examples of where automatic extraction is currently only possible
  with DDB, and one of the reasons many developers prefer to begin any
  diagnosis with an interactive DDB session.

- DDB textdumps can be used without the exact source tree, kernel
  configuration, built kernel, and debug symbols, as they interpret rather
  than save the pages of memory. They're even an architecture-independent
  file format so you don't need a cross-debugger. Having that additional
  context is useful (ability to map symbol+offset to line of code), but
  you can actually go a remarkable way without it, especially looking at
  the results in a PR potentially years later.

(3) What do I lose by using textdumps?

To be clear, there are also some important things that textdumps can't do
-- principally, a textdump doesn't contain all kernel memory, so your
textdump output is all you have. If you need to extract detailed structure
information for something DDB doesn't understand, or that you don't think
of in advance or during a DDB session, then there's nothing to fall back on
except configuring a textdump or regular dump and waiting for the panic to
happen again.

(4) When should I use textdumps?

Minidumps remain the default in 7.x and 8.x, and full dumps remain the
default in 6.x and earlier. Textdumps must be specifically enabled by the
administrator to be used.

DDB is an excellent live debugging tool whose use has been limited to
situations where there is an accessible video console, or more ideally
serial or firewire console to a second box, and generally requiring an
experienced developer to be available to drive debugging. There are many
problems that can be pretty much instantly understood with a couple of DDB
commands, so these limitations impacted debugging effectiveness.

The goal of adding DDB capture output, scripting, and textdumps was to
broaden the range of situations in which DDB could be used: now it is
usable more easily for post-mortem analysis, no console or second machine
is required, and a developer can install, or even e-mail, a script of DDB
commands to run automatically. Developers can simply define a few scripts
to handle various DDB cases, such as panic, and get a nice debugging bundle
to look at later.

When I'm debugging network stack problems, I typically want a fairly small
set of DDB commands to be run by the user, and the output sent back, and
now it will go from "Read the chapter on kernel debugging, set up a serial
console, run the following commands, copy and paste from your serial
console -- oh, you don't have a serial console, perhaps hand-copy these
fields or use a digital camera" to "run the following ddb(8) command and
when the box reboots, send me the tarball in /var/crash".

I anticipate that textdumps will see use when developers are exchanging
e-mail with users reporting problems and trying to gather concise summaries
of information about a crash with minimum downtime and maximum portability,
in embedded environments where dumping kernel memory to flash is tricky, or
in order to save a transcript of an interactive DDB session when testing
new features locally.

Another interesting advantage of textdumps is that it's easy to inspect
them for confidential/identifying information and mask or purge it.
When someone sends out a kernel memory dump, it potentially contains a lot
of sensitive information, and most people (including me) would have
difficulty making sure all sensitive information was purged safely.

(5) I want to collect DDB output, but still need memory dumps -- can I do
    both?

Yes and no.

Yes, you can use the DDB output capture buffer and scripting without using
a textdump, as the capture buffer is stored in kernel memory. You can print
it using kgdb, and we should probably add that capability to ddb(8) also.
End your script with "call doadump; reset" but leave out "textdump set".
For example:

    ddb script kdb.enter.panic="capture on;show pcpu;trace;ps;show locks;alltrace;show alllocks;show lockedvnods;call doadump;reset"

No, because you must pick one of the three dump layouts (dump, minidump,
textdump) to write to the swap partition -- you can't write out all three
and then decide which to extract later. In principle this could be changed
so that we actually write out a textdump section and a full/minidump, but
that's not implemented.

(6) I have a serial console, so I don't need textdumps. Can I still use DDB
    scripting to manage a crash?

Yes. You can set up scripts in exactly the same way as with textdumps, only
omit the textdump bits and end with a "reset" to reboot the system when
done. That way you can extract the results from the serial console log.
For example:

    ddb script kdb.enter.panic="show pcpu;trace;show locks;ps;alltrace;show alllocks;show lockedvnods;reset"

(7) I'm in DDB and I suddenly realize I want to save the output, and I
    haven't configured textdumps. What do I do?

As with normal dumps, you must previously have configured support for a
dump partition. These days, that is done automatically whenever you have
swap configured on the box, so unless you're in single-user mode or don't
have swap configured, you should be able to do the following:

Schedule a textdump using the "textdump set" command.

Turn on DDB output capture using "capture on", run your commands of
interest, and turn it off using "capture off".

Type "call doadump" to dump memory, and "reset" to reboot.

(8) The buffer is small, can I pick and choose what DDB output is captured?

The capture buffer does have a size limit, so you might find you want to
explore interactively at first to figure out what information to save. Then
you can turn it on and off around output to capture with "capture on" and
"capture off". Each time you turn capture back on, new output is appended
after any existing output. If you decide you want to clear the buffer, you
can use "capture reset" to do that, and you can check the status of the
buffer using "capture status".

You can also increase the buffer size by setting the
debug.ddb.capture.bufsize sysctl to a larger size. The sysctl will
automatically round up to the next textdump blocksize.

(9) Can I continue the kernel after doing a textdump?

No. As with kernel memory dumps, textdumps invoke the storage controller
dumper routine, which may hose up state in the device driver preventing its
use after the dump is generated. However, if you do plan to continue from
DDB, just use DDB output capture without a textdump. You can then extract
the contents of the DDB buffer using the debug.ddb.capture.data sysctl.

_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
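
As a footnote to question (1): because a textdump is an ordinary tarfile of
text members, it can be inspected with a few lines of script. The Python
sketch below is only an illustration, not part of the textdump tooling. It
assumes the member names match the list in (1) exactly, and the helper names
(read_textdump, summarize, build_sample_textdump) and the sample data are
made up for the example.

```python
import io
import tarfile

# Member names as listed in the answer to question (1).
EXPECTED_MEMBERS = ("config.txt", "ddb.txt", "msgbuf.txt", "panic.txt", "version.txt")


def read_textdump(fileobj):
    """Return {member name: decoded text} for the regular files in a textdump tar."""
    contents = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            data = archive.extractfile(member).read()
            # Kernel output is not guaranteed to be valid UTF-8, so replace bad bytes.
            contents[member.name] = data.decode("utf-8", errors="replace")
    return contents


def summarize(contents):
    """Report which expected members are present, plus the panic message if any."""
    lines = []
    for name in EXPECTED_MEMBERS:
        state = "present" if name in contents else "absent"
        lines.append(f"{name}: {state}")
    if "panic.txt" in contents:
        lines.append("panic: " + contents["panic.txt"].strip())
    return "\n".join(lines)


def build_sample_textdump():
    """Build a small in-memory tar standing in for a real textdump (made-up data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, text in (("panic.txt", "example panic message\n"),
                           ("version.txt", "example version string\n")):
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(summarize(read_textdump(build_sample_textdump())))
```

To try it on a real textdump, open the tarball saved under /var/crash in
binary mode and pass the file object to read_textdump() in place of the
sample. The exact file name depends on how the dump was saved on your
system.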