Date: Fri, 23 Jun 2006 16:25:58 +0300
From: Giorgos Keramidas <keramida@freebsd.org>
To: Kostik Belousov <kostikbel@gmail.com>
Cc: freebsd-doc@freebsd.org
Subject: Re: [patch] deadlock debugging
Message-ID: <20060623132558.GD7062@gothmog.pc>
In-Reply-To: <20060607084346.GA21391@deviant.kiev.zoral.com.ua>
References: <20060607084346.GA21391@deviant.kiev.zoral.com.ua>
On 2006-06-07 11:43, Kostik Belousov <kostikbel@gmail.com> wrote:
> Reports of the deadlocks are reccurrent topic on the current-
> and stable- lists. Many of us have to repeat the instructions
> on how to provide the useful initial bug report from them.
>
> Please, comment proposed addition to the kernel debugging
> chapter of the developer handbook.

Hi Kostik,

> Obviously, I am not an english native speaker. Your corrections
> for both factual material and grammar/style are very much
> welcome !
>
> P.S. I'm not on the list, do not remove CC: to me on replying.

Ok :)

This seems like a useful addition to the developer's handbook, but
I have some minor comments.  See inline text below:

> Index: en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml
> ===================================================================
> RCS file: /usr/local/arch/ncvs/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml,v
> retrieving revision 1.64
> diff -u -r1.64 chapter.sgml
> --- en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml 5 Jan 2006 20:03:34 -0000 1.64
> +++ en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml 7 Jun 2006 08:39:20 -0000
> @@ -821,6 +821,41 @@
>       on any configured console driver, including a serial
>       console.</para>
>    </sect1>
> +
> +  <sect1 id="kerneldebug-deadlocks">
> +    <title>Debugging the Deadlocks</title>

`Debugging Kernel Deadlocks' is probably a better title here, since
the deadlocks discussed here occur in the kernel and `the Deadlocks'
doesn't really make this as obvious as I'd probably want it to be.

> +    <para>You may experience so called deadlocks, the situation where
> +      system stops doing useful work. To provide the useful bug report
> +      in this situation, you shall use ddb as described above. Please,
> +      include the output of <command>ps</command> and
> +      <command>trace</command> for suspected processes in the
> +      report.</para>

This paragraph has a few minor syntax buglets.
English is not my native language, but I would probably rewrite this as:

|     <para>Modern &os; releases have been extended with support for
|       Symmetric Multiprocessing (SMP).  To support highly parallel
|       processing, the &os; kernel uses many internal locking and
|       synchronization primitives to allow multiple kernel threads
|       to run concurrently on systems that can support such a mode of
|       operation.  Bugs in the use of these internal locking
|       mechanisms can lead to a situation where two or more kernel
|       threads compete for the same resources and block
|       indefinitely waiting for each other.  When this happens, the
|       system may become unstable and either crash or
|       appear to <quote>hang</quote>.  Such a hang is called a
|       <quote>deadlock</quote>.</para>
|
|     <para>Debugging a deadlock can be tricky,
|       but &os; provides some tools that may assist you in tracking
|       down the problem or collecting information about the deadlock
|       when it occurs.</para>
|
|     <para>One of these tools is the kernel debugger,
|       <application>DDB</application>, which you can use as described
|       in the previous sections to collect useful information for
|       such a bug.
|       <application>DDB</application> commands that are
|       particularly useful for debugging a
|       deadlock are:</para>
|
|     <itemizedlist>
|       <listitem><para><command>ps</command></para></listitem>
|       <listitem><para><command>trace</command></para></listitem>
|     </itemizedlist>
|
|     <para>Use the <command>ps</command> command to list all the
|       processes and then use <command>trace</command> on processes
|       suspected of having caused the deadlock.</para>
|
|     <para>Other commands that can provide useful information for
|       tracking down the cause of a deadlock are:</para>
|
|     <itemizedlist>
|       <listitem><para><command>show allpcpu</command></para></listitem>
|       <listitem><para><command>show alllocks</command></para></listitem>
|       <listitem><para><command>show lockedvnods</command></para></listitem>
|     </itemizedlist>
|
|     <para>Useful information about what each process was doing at
|       the time the deadlock occurred can be listed with:</para>
|
|     <itemizedlist>
|       <listitem><para><command>where <replaceable>PID</replaceable></command></para></listitem>
|     </itemizedlist>
|
|     <para>The output of the <command>where</command> command tends
|       to be very useful for the processes listed in the output of
|       the <command>show</command> commands.</para>
|
|     <para>To obtain meaningful backtraces for threaded processes,
|       use <command>thread <replaceable>thread-id</replaceable></command> first, to switch to
|       the correct thread, and then get a backtrace
|       with <command>where</command>.</para>

Does this version look ok to you?  I can handle the merging of this
change with your initial diff/patch.

> +    <para>If possible, consider doing further investigation. Receipt
> +      below is especially usefull if you suspect deadlock occurs in the
> +      VFS layer. Add the options
> +      <programlisting>makeoptions DEBUG=-g
> + options INVARIANTS
> + options INVARIANT_SUPPORT
> + options WITNESS
> + options DEBUG_LOCKS
> + options DEBUG_VFS_LOCKS
> + options DIAGNOSTIC</programlisting>
> +
> +      to the kernel config.
> +      When deadlock occurs, in addition to the
> +      output of the <command>ps</command> command, provide information
> +      from the <command>show allpcpu</command>, <command>show
> +      alllocks</command> and <command>show
> +      lockedvnods</command>. More, please provide output of the
> +      <command>where pid</command> for each process id mentioned in
> +      the output of the <command>show</command> commands.
> +    </para>
> +
> +    <para>For threaded processes, to obtain meaningful backtraces, use
> +      <command>thread thread-id</command> to switch to the thread
> +      stack, and do backtrace with <command>where</command>.</para>
> +  </sect1>
>  </chapter>

This part is also nice, but IMHO it would be even nicer if we could
expand it a bit more.  How about something like this?

|     <!-- On reproducing a deadlock and `doing further investigation' -->
|
|     <para>Deadlocks are nasty bugs because they are hard
|       to reproduce reliably.  Their occurrence depends on specific
|       timing, synchronization, system load and many more factors.
|       Since reproducing a bug is sometimes a crucial part of
|       gathering all the necessary information, you may have to spend
|       some time investigating the deadlock.  Naturally, this is not
|       always possible for production systems, but if you can
|       reproduce the deadlock on a test system that can afford
|       to stay off-line for extended periods of time, then consider
|       staying inside <application>DDB</application> while you are
|       investigating the deadlock further.</para>
|
|     <para>A serial console can be extremely helpful in collecting
|       <application>DDB</application> output.</para>
|
|     <para>If it is impossible to set up a serial console
|       (e.g.
|       because you cannot find or afford a second system to
|       configure as a testbed), emulators like
|       <filename role="port">emulators/qemu</filename>,
|       <filename role="port">emulators/vmware2</filename> or
|       <filename role="port">emulators/bochs</filename> may prove a
|       very efficient way of debugging kernel issues, like a
|       deadlock.</para>

Part #2 ...

|     <!-- On kernel options that are useful for debugging locking problems. -->
|
|     <para>Apart from the usual kernel options that are useful for
|       debugging kernel problems, there are some options that are
|       particularly useful and targeted at debugging locking
|       problems.  These options are:</para>
|
|     <programlisting> options INVARIANTS
| options INVARIANT_SUPPORT
| options WITNESS
| options DEBUG_LOCKS
| options DEBUG_VFS_LOCKS
| options DIAGNOSTIC</programlisting>

Any help in expanding these parts (especially the second one) is more
than welcome :-)
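
As a possible starting point for the second part, the options from
this thread could be collected into a custom kernel configuration
file.  The following is only a sketch under a few assumptions: the
file name and ident DEADLOCKDEBUG are made up for illustration,
GENERIC is assumed as the base configuration, and the KDB and DDB
lines are added on the assumption that the debugger is not already
compiled in.  The remaining lines are the ones from the original
patch.

```
# Hypothetical file: sys/<arch>/conf/DEADLOCKDEBUG (name chosen for illustration)
include         GENERIC                 # assumption: start from the stock kernel
ident           DEADLOCKDEBUG

makeoptions     DEBUG=-g                # build the kernel with debugging symbols

# Assumption: make sure the in-kernel debugger is available for the DDB commands
options         KDB
options         DDB

# Options listed in the original patch
options         INVARIANTS              # extra run-time consistency checks
options         INVARIANT_SUPPORT       # support code required by INVARIANTS
options         WITNESS                 # track lock order, report order reversals
options         DEBUG_LOCKS
options         DEBUG_VFS_LOCKS
options         DIAGNOSTIC
```

These checks, WITNESS in particular, add noticeable overhead, so a
kernel built this way is probably better suited to a test machine
than to a production system.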