Date: Thu, 3 Apr 2008 16:38:03 -0500
From: Brooks Davis <brooks@freebsd.org>
To: Robert Watson <rwatson@freebsd.org>
Cc: stable@freebsd.org
Subject: Re: Q&A on textdumps (fwd)
Message-ID: <20080403213803.GA39213@lor.one-eyed-alien.net>
In-Reply-To: <20080401125534.D94491@fledge.watson.org>
References: <20080401125534.D94491@fledge.watson.org>

On Tue, Apr 01, 2008 at 12:57:06PM +0100, Robert Watson wrote:
>
> Dear all,
>
> I've now completed the MFC of basic textdump support to 7.0. Once I've had
> a chance to ping Brooks about it, either he or I will MFC support for
> ddb.conf, which allows configuring textdump and debugging scripts
> automatically at boot. I've attached a Q&A post I made to current@ after
> committing textdump support to HEAD, and you can also consult textdump(4)
> and ddb(4) for more information.

I've MFC'd support for reading ddb.conf at boot.

-- Brooks

> Thanks,
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>
> ---------- Forwarded message ----------
> Date: Sun, 30 Dec 2007 13:11:29 +0000 (GMT)
> From: Robert Watson <rwatson@FreeBSD.org>
> To: current@freeBSD.org
> Subject: Q&A on textdumps
>
>
> Dear all,
>
> I've received a few textdump-related questions that I thought I'd share my
> answers to.
>
> (1) What information is in a textdump?
>
> The textdump is stored as a tarfile with several subfiles in it:
>
> config.txt  - Kernel configuration, if compiled into kernel
> ddb.txt     - Captured DDB output, if present
> msgbuf.txt  - Kernel message buffer
> panic.txt   - Kernel panic message, if there was a panic
> version.txt - Kernel version string
>
> It is easy to add new files to textdumps, so if there's some easily
> extractable kernel state that you feel should go in there, drop me an
> e-mail and/or send a patch.
>
> (2) Is there any information in a textdump that can't be acquired using
> kgdb and other available dump analysis tools?
>
> In principle no, as normal dumps include all kernel memory, and textdumps
> operate by inspecting kernel memory using DDB, capturing only small but
> presumably relevant parts.
> However, there are some important differences
> in approach that mean that textdumps can be used in ways that regular dumps
> can't easily be:
>
> - DDB textdumps are very small. Including a full debugging session, kernel
>   message buffer, and kernel configuration, my textdumps are frequently
>   around 100k uncompressed. This makes it possible to use them on very small
>   machines, store them for an extended period, e-mail them around, etc, in a
>   way that you can't currently do with kernel memory dumps. This improved
>   usability will (hopefully) improve our bug and crash management.
>
> - DDB is a specialized debugging tool with intimate knowledge of the
>   kernel, and there are types of data trivially extracted with DDB that are
>   awkward or quite difficult to extract using kgdb or other currently
>   available dump analysis tools. Locking, waiting, and process information
>   are examples of where automatic extraction is currently only possible with
>   DDB, and one of the reasons many developers prefer to begin any diagnosis
>   with an interactive DDB session.
>
> - DDB textdumps can be used without the exact source tree, kernel
>   configuration, built kernel, and debug symbols, as they interpret rather
>   than save the pages of memory. They're even an architecture-independent
>   file format so you don't need a cross-debugger. Having that additional
>   context is useful (ability to map symbol+offset to line of code), but you
>   can actually go a remarkable way without it, especially looking at the
>   results in a PR potentially years later.
>
> (3) What do I lose by using textdumps?
>
> To be clear, there are also some important things that textdumps can't do
> -- principally, a textdump doesn't contain all kernel memory, so your
> textdump output is all you have.
> If you need to extract detailed structure
> information for something DDB doesn't understand, or that you don't think
> of in advance or during a DDB session, then there's nothing to fall back on
> except configuring a textdump or regular dump and waiting for the panic to
> happen again.
>
> (4) When should I use textdumps?
>
> Minidumps remain the default in 7.x and 8.x, and full dumps remain the
> default in 6.x and earlier. Textdumps must be specifically enabled by the
> administrator to be used.
>
> DDB is an excellent live debugging tool whose use has been limited to
> situations where there is an accessible video console, or more ideally
> serial or firewire console to a second box, and generally requiring an
> experienced developer to be available to drive debugging. There are many
> problems that can be pretty much instantly understood with a couple of DDB
> commands, so these limitations impacted debugging effectiveness.
>
> The goal of adding DDB capture output, scripting, and textdumps was to
> broaden the range of situations in which DDB could be used: now it is
> usable more easily for post-mortem analysis, no console or second machine
> is required, and a developer can install, or even e-mail, a script of DDB
> commands to run automatically. Developers can simply define a few scripts
> to handle various DDB cases, such as panic, and get a nice debugging bundle
> to look at later.
>
> When I'm debugging network stack problems, I typically want a fairly small
> set of DDB commands to be run by the user, and the output sent back, and
> now it will go from "Read the chapter on kernel debugging, set up a serial
> console, run the following commands, copy and paste from your serial
> console -- oh, you don't have a serial console, perhaps hand-copy these
> fields or use a digital camera" to "run the following ddb(8) command and
> when the box reboots, send me the tarball in /var/crash".
>
> I anticipate that textdumps will see use when developers are exchanging
> e-mail with users reporting problems and trying to gather concise summaries
> of information about a crash with minimum downtime and maximum portability,
> in embedded environments where dumping kernel memory to flash is tricky, or
> in order to save a transcript of an interactive DDB session when testing
> new features locally.
>
> Another interesting advantage of textdumps is that it's easy to inspect
> them for confidential/identifying information and mask or purge it. When
> someone sends out a kernel memory dump, it potentially contains a lot of
> sensitive information, and most people (including me) would have difficulty
> making sure all sensitive information was purged safely.
>
> (5) I want to collect DDB output, but still need memory dumps -- can I do
> both?
>
> Yes and no.
>
> Yes, you can use the DDB output capture buffer and scripting without using
> a textdump, as the capture buffer is stored in kernel memory. You can print
> it using kgdb, and we should probably add that capability to ddb(8) also.
> End your script with "call doadump; reset" but don't "textdump set".
> For example:
>
>   ddb script kdb.enter.panic="capture on;show pcpu;trace;ps;show locks;alltrace;show alllocks;show lockedvnods;call doadump;reset"
>
> No, because you must pick one of the three dump layouts (dump, minidump,
> textdump) to write to the swap partition -- you can't write out all three
> and then decide which to extract later. In principle this could be changed
> so that we actually write out a textdump section and a full/minidump, but
> that's not implemented.
>
> (6) I have a serial console so don't need textdumps, can I still use DDB
> scripting to manage a crash?
>
> Yes. You can set up scripts in exactly the same way as with textdumps, only
> omit the textdump bits and end with a "reset" to reboot the system when
> done. That way you can extract the results from the serial console log.
> I.e.,
>
>   ddb script kdb.enter.panic="show pcpu;trace;show locks;ps;alltrace;show alllocks;show lockedvnods;reset"
>
> (7) I'm in DDB and I suddenly realize I want to save the output, and I
> haven't configured textdumps. What do I do?
>
> As with normal dumps, you must previously have configured support for a
> dump partition. These days, that is done automatically whenever you have
> swap configured on the box, so unless you're in single-user mode or don't
> have swap configured, you should be able to do the following:
>
> Schedule a textdump using the "textdump set" command.
>
> Turn on DDB output capture using "capture on", run your commands of
> interest, and turn it off using "capture off".
>
> Type "call doadump" to dump memory, and "reset" to reboot.
>
> (8) The buffer is small, can I pick and choose what DDB output is captured?
>
> The capture buffer does have a size limit, so you might find you want to
> explore interactively at first to figure out what information to save.
> Then you can turn it on and off around output to capture with "capture on"
> and "capture off". Each time you turn capture back on, new output is
> appended after any existing output.
>
> If you decide you want to clear the buffer, you can use "capture reset" to
> do that, and you can check the status of the buffer using "capture status".
>
> You can also increase the buffer size by setting the
> debug.ddb.capture.bufsize sysctl to a larger size. The sysctl will
> automatically round up to the next textdump blocksize.
>
> (9) Can I continue the kernel after doing a textdump?
>
> No. As with kernel memory dumps, textdumps invoke the storage controller
> dumper routine, which may hose up state in the device driver preventing its
> use after the dump is generated.
>
> However, if you do plan to continue from DDB, just use DDB output capture
> without a textdump. You can then extract the contents of the DDB buffer
> using the debug.ddb.capture.data sysctl.
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
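
Since answer (1) describes the textdump as a tarfile with a handful of
text subfiles, here is a minimal sketch of how one might check which of
those subfiles a given textdump contains once it has been recovered. It is
an illustration only and not something from the thread: the helper names
(summarize_textdump, make_sample_textdump) are hypothetical, the sample
archive is synthetic, and the only facts taken from the message are the
five subfile names.

```python
import io
import tarfile

# Subfile names as listed in answer (1) of the Q&A.
EXPECTED = ["config.txt", "ddb.txt", "msgbuf.txt", "panic.txt", "version.txt"]


def summarize_textdump(fileobj):
    """Return {subfile name: decoded text} for a textdump tar archive."""
    contents = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip anything that is not a regular file
            data = tar.extractfile(member).read()
            contents[member.name] = data.decode("utf-8", errors="replace")
    return contents


def make_sample_textdump():
    """Build a tiny in-memory tar mimicking the layout (synthetic data)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        samples = [
            ("panic.txt", "example panic message\n"),
            ("version.txt", "example version string\n"),
        ]
        for name, text in samples:
            payload = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    dump = summarize_textdump(make_sample_textdump())
    for name in EXPECTED:
        status = "present" if name in dump else "absent"
        print(f"{name}: {status}")
```

To try it on a real textdump, one would open the tarball saved in
/var/crash in binary mode and pass the file object to
summarize_textdump(); the exact file name there is not given in the
message above, so treat any particular name as an assumption.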