Date: Thu, 30 Aug 2012 11:12:20 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Matt Burke <mattblists@icritical.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: Killing processes from DDB
Message-ID: <20120830081220.GJ33100@deviant.kiev.zoral.com.ua>
In-Reply-To: <503F0BA2.6060107@icritical.com>
References: <503F0BA2.6060107@icritical.com>
On Thu, Aug 30, 2012 at 07:43:46AM +0100, Matt Burke wrote:
> Is it possible to forcibly kill process from DDB which are unkillable from
> userland? My understanding is the 'kill' command is effectively the same as
> the userland version, so perhaps a process could be terminated by invoking
> an OOM handler or something?

Processes can only be terminated at safe points, where kernel code
explicitly checks for termination conditions and which are known not to
hold kernel resources.

Yes, the kill command from ddb just kills the process, i.e. it sends a
signal to it, and the handling of that signal is subject to normal signal
delivery.

>
>
> I just had a VirtualBox instance crash and hog 100% CPU on my desktop:
>
> mattb 36939 100.0 13.6 2577328 2276108 ?? I 6:13AM 2:28.44
> /usr/local/lib/virtualbox/VirtualBox
>
> I kill -9 it
>
> mattb 36939 100.0 13.6 2577328 2275804 ?? T 6:13AM 3:10.89
> /usr/local/lib/virtualbox/VirtualBox
>
> Note it's moved to 'stop' state for some reason, yet is still eating 100%
> cpu time
>
> # procstat -k 36939
>   PID    TID COMM             TDNAME           KSTACK
> 36939 227509 VirtualBox       -                <running>
> 36939 227836 VirtualBox       -                mi_switch
> thread_suspend_switch thread_single exit1 sigexit postsig ast doreti_ast

Stop state indicates that the process is stopped or being stopped. The
latter is your case. The process has one thread executing the exit1()
kernel function, which terminates the process. In the course of its work,
the function notifies all other threads of the exiting process that they
shall terminate ASAP at the next safe point.

According to the procstat output, there is another thread in the process
which seems to be executing in the kernel. My guess is that it loops
somewhere, not reaching any checkpoints for termination.

>
>
> Could this be the trigger - 9.0 binary (from pkgng) against 9.1?
>
> $ procstat -b 1 36939
>   PID COMM                OSREL PATH
>     1 init               901000 /sbin/init
> 36939 VirtualBox         900044 /usr/local/lib/virtualbox/VirtualBox
>
>
> I couldn't even kill it with "dtrace -n 'pid$target:::' -p 36939 -l" -
> which so far has proven reliable in killing anything:
>
> # dtrace -n 'pid$target:::' -p 2021 -l       <--- unimportant proc
> Bus error: 10 (core dumped)
> # dtrace -n 'pid$target:::' -p 2044 -l       <--- unimportant proc
> Bus error: 10 (core dumped)
> # dtrace -n 'pid$target:::' -p 36939 -l      <--- virtualbox hangs dtrace
> ^C
>
> I couldn't truss the process or use gcore to get a dump, so my only option
> was a reboot. Does anyone have any suggestions on a course of action in
> case this happens again? I can't get a kernel dump since the machine
> doesn't have enough swap (small SSDs)

The way to debug the issue is to break into ddb on the console and get a
backtrace for the spinning thread, then continue, then break again and get
another backtrace. Do it several times to see where the code spins.

It is impossible to even start guessing what is wrong without seeing the
backtrace. Still, recompiling VB could be a good idea, since the VB kernel
module uses non-stable KPI and KBI, thus what you see might be just a build
issue.
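The safe-point rule described in the reply can be illustrated with a small
userland analogy in C. This is only a sketch under stated assumptions, not
kernel code: struct worker, worker_main() and stops_on_request() are
hypothetical names, exit_requested stands in for the notification that
exit1() sends to the other threads, and demo_over is an escape hatch that
exists only so the demo can join its thread (a thread stuck in the kernel
has no such hatch).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of one thread in an exiting process. */
struct worker {
    bool checks_safe_point;     /* does the loop ever poll exit_requested? */
    atomic_bool exit_requested; /* analogue of "the process is exiting" */
    atomic_bool demo_over;      /* demo-only escape hatch so we can join */
    atomic_bool finished;       /* set by the worker just before it returns */
};

static void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    for (;;) {
        /* Safe point: nothing is held here, so stopping is harmless. */
        if (w->checks_safe_point && atomic_load(&w->exit_requested))
            break;
        /* Not part of the analogy: lets the demo end a stuck worker. */
        if (atomic_load(&w->demo_over))
            break;
        sleep_ms(1); /* stand-in for one unit of work */
    }
    atomic_store(&w->finished, true);
    return NULL;
}

/* Returns true if the worker honoured the exit request within ~200 ms. */
static bool stops_on_request(bool checks_safe_point)
{
    struct worker w;
    pthread_t tid;
    bool honoured = false;

    w.checks_safe_point = checks_safe_point;
    atomic_init(&w.exit_requested, false);
    atomic_init(&w.demo_over, false);
    atomic_init(&w.finished, false);

    if (pthread_create(&tid, NULL, worker_main, &w) != 0)
        return false;

    /* Ask the worker to exit, then poll for up to 200 iterations. */
    atomic_store(&w.exit_requested, true);
    for (int i = 0; i < 200 && !honoured; i++) {
        sleep_ms(1);
        honoured = atomic_load(&w.finished);
    }

    /* Release a worker that ignored the request, then clean up. */
    atomic_store(&w.demo_over, true);
    pthread_join(tid, NULL);
    return honoured;
}
```

With these definitions, stops_on_request(true) should return true, because
the worker polls the flag at its safe point, while stops_on_request(false)
should return false, which mirrors a thread that never reaches a
termination check. Build with pthread support (for example cc -pthread).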
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120830081220.GJ33100>