Date:      Sat, 20 Feb 2010 12:04:35 +0000 (UTC)
From:      "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
To:        Mikolaj Golub <to.my.trociny@gmail.com>
Cc:        freebsd-net@freebsd.org
Subject:   Re: mpd has hung
Message-ID:  <20100220115850.T27327@maildrop.int.zabbadoz.net>
In-Reply-To: <20100220112639.L27327@maildrop.int.zabbadoz.net>
References:  <20100217132632.GA756@crete.org.ua> <4B7D5D95.20007@gmx.com> <86bpflqr5b.fsf@zhuzha.ua1> <20100220112639.L27327@maildrop.int.zabbadoz.net>

On Sat, 20 Feb 2010, Bjoern A. Zeeb wrote:

> On Fri, 19 Feb 2010, Mikolaj Golub wrote:
>
>> On Thu, 18 Feb 2010 17:32:37 +0200 Nikos Vassiliadis wrote:
>> 
>>> On 2/17/2010 3:26 PM, Alexander Shikoff wrote:
>>>> Hello All,
>>>> 
>>>> I have mpd 5.3 running on 8.0-RC1 as a PPPoE server (now only 5 clients).
>>>> Today the mpd process hung and I cannot kill it with signal -9, and I cannot
>>>> access its console via telnet.
>>>> 
>>>> State of process in `top` output is STOP:
>>>> 73551 root          2  44    0 29588K  5692K STOP    6   0:32  0.00% mpd5
>>>> 
>>>> # procstat -kk 73551
>>>>    PID    TID COMM             TDNAME           KSTACK
>>>> 73551 100233 mpd5             -                mi_switch+0x16f 
>>>> sleepq_wait+0x42 _cv_wait+0x111 flowtable_flush+0x51 if_detach+0x2f2 
>>>> ng_iface_shutdown+0x1e ng_rmnode+0x167 ng_apply_item+0xef7 
>>>> ng_snd_item+0x2ce ngc_send+0x1d2 sosend_generic+0x3f6 kern_sendit+0x13d 
>>>> sendit+0xdc sendto+0x4d syscall+0x1da Xfast_syscall+0xe1
>>>> 73551 100502 mpd5             -                mi_switch+0x16f 
>>>> thread_suspend_switch+0xc6 thread_single+0x1b6 exit1+0x72 sigexit+0x7c 
>>>> postsig+0x306 ast+0x279 doreti_ast+0x1f
>>>> 
>>>> Is there a way to stop the process without rebooting the whole system?
>>>> Thanks in advance!
>>>> 
>>>> P.S. I'm ready for experiments with it before tonight, but I cannot
>>>> force system to crash in order to get crash dump right now.
>>>> 
>>> 
>>> It's probably too late now, but are you sure that nobody pressed
>>> CTRL-Z while in the mpd console?
>>> 
>>> CTRL-Z will send SIGSTOP to the process and the process will
>>> stop. While stopped, all processing stops (including receiving
>>> SIGKILL; you cannot kill it, and the signals are queued). You
>>> have to send SIGCONT for the process to continue.
>> 
>> We were discussing this problem with Alexander in another
>> (Russian/Ukrainian-speaking) mailing list, and it looks like the problem
>> is the following.
>> 
>> The mpd5 thread was detaching the ng interface, and in flowtable_flush() it
>> slept in cv_wait, waiting for the flowclean_cycles variable to be updated.
>> It should have been awakened by the flowcleaner thread, but that thread got
>> stuck in an endless loop, supposedly in
>> flowtable_clean_vnet()/flowtable_free_stale(), I think because of an
>> inconsistent state of some lists (iface?) due to if_detach being in
>> progress.
>
> I have patches that are out for review.

I am not sure if they apply cleanly as they are broken out of the tail
side of a larger patchset.

If you are not using VIMAGEs you could ignore the ones I marked with (*).

http://people.freebsd.org/~bz/20100216-10-ft-cv.diff
http://people.freebsd.org/~bz/20100216-11-ft-debugging.diff
http://people.freebsd.org/~bz/20100216-12-ft-cleanup.diff	(*)
http://people.freebsd.org/~bz/20100216-13-ft-ll-cleanup.diff
http://people.freebsd.org/~bz/20100216-18-ft-free.diff		(*)

If you are still seeing the hang and have DDB support in your kernel,
then break into the debugger and save the complete output of
 	ddb> ps
before rebooting.

Regards,
Bjoern

-- 
Bjoern A. Zeeb         It will not break if you know what you are doing.
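
For readers following the quoted analysis: the hang it describes has the
usual condition-variable shape, where a waiter sleeps until a counter
changes and depends on another thread to change it and wake the waiter.
Below is a minimal user-space sketch of that shape using POSIX threads. It
is not the kernel code: clean_lock, clean_cv, clean_cycles, cleaner() and
flush_wait() are hypothetical stand-ins, the "stuck" cleaner simply returns
without finishing its pass, and the timeout exists only so the demo
terminates. The stack trace above shows the mpd5 thread sleeping in
_cv_wait with no wakeup arriving, so it stayed asleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the state named in the thread. */
static pthread_mutex_t clean_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  clean_cv   = PTHREAD_COND_INITIALIZER;
static unsigned long   clean_cycles;   /* plays the role of flowclean_cycles */
static bool            cleaner_stuck;  /* simulates a pass that never finishes */

/* Cleaner: finish one pass, bump the counter, wake all waiters. */
static void *cleaner(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&clean_lock);
    if (!cleaner_stuck) {
        clean_cycles++;
        pthread_cond_broadcast(&clean_cv);
    }
    /* When stuck, the counter is never updated and nobody is woken. */
    pthread_mutex_unlock(&clean_lock);
    return NULL;
}

/*
 * Waiter: remember the current cycle count, then sleep until it changes.
 * Returns 0 if the cleaner completed a pass, or the wait error
 * (ETIMEDOUT) if the counter never moved within timeout_ms.
 */
static int flush_wait(bool stuck, long timeout_ms)
{
    pthread_t tid;
    struct timespec deadline;
    unsigned long start;
    bool updated;
    int error = 0;

    pthread_mutex_lock(&clean_lock);
    cleaner_stuck = stuck;
    start = clean_cycles;
    pthread_mutex_unlock(&clean_lock);

    int rc = pthread_create(&tid, NULL, cleaner, NULL);
    assert(rc == 0);
    (void)rc;

    /* Absolute deadline, as pthread_cond_timedwait() expects. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&clean_lock);
    /* Re-check the predicate on every wakeup; stop on timeout or error. */
    while (clean_cycles == start && error == 0)
        error = pthread_cond_timedwait(&clean_cv, &clean_lock, &deadline);
    updated = (clean_cycles != start);
    pthread_mutex_unlock(&clean_lock);

    pthread_join(tid, NULL);
    return updated ? 0 : error;
}
```

Calling flush_wait(false, 1000) returns 0 once the cleaner completes its
pass; flush_wait(true, 100) returns ETIMEDOUT because the counter never
changes. Without a timeout, the second case would sleep indefinitely.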


