Date: Thu, 20 Mar 2014 20:46:52 -0300
From: Marcelo Gondim <gondim@bsdinfo.com.br>
To: FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
Subject: Re: sshd with zombie process on FreeBSD 10.0-STABLE - workaround
Message-ID: <532B7DEC.7010809@bsdinfo.com.br>
In-Reply-To: <201403201058.38555.jhb@freebsd.org>
References: <53016D97.5030909@bsdinfo.com.br> <CAN6yY1uucfkdXxkCF30w1Q9vffRpDLxM90Sz1XVbdn5W69vQMg@mail.gmail.com> <5329D81E.7040709@bsdinfo.com.br> <201403201058.38555.jhb@freebsd.org>

On 20/03/14 11:58, John Baldwin wrote:
> On Wednesday, March 19, 2014 1:47:10 pm Marcelo Gondim wrote:
>> On 19/03/14 13:01, Kevin Oberman wrote:
>>> On Wed, Mar 19, 2014 at 6:00 AM, Marcelo Gondim <gondim@bsdinfo.com.br> wrote:
>>>> Hi all,
>>>>
>>>> Until a proper fix appears, I wrote the script below and put it in
>>>> crontab to automatically kill the zombie sshd processes.
>>>>
>>>> the_walking_dead.sh:
>>>>
>>>> #!/bin/sh
>>>> kill -9 `ps afx|grep sshd|grep unknown|awk '{print $1}'`
>>>>
>>>>
>>>> Put this in /etc/crontab:
>>>>
>>>> 00 1 * * * root the_walking_dead.sh
>>>>
>>>>
>>> If 'kill -9' works, the process is not really a zombie. It simply still
>>> has a socket open and is waiting for it to be closed before exiting.
>>>
>>> You might take a look at network sockets with sockstat(1) and see if you
>>> can get any indication of why these sockets are not being closed. It may
>>> be that the issue is not sshd but some other issue in the OS leaving
>>> sockets open.
>>>
>> Hi Kevin,
>>
>> My ps -afx below:
>>
>> [...]
>> 42139 - Is 0:00.01 sshd: unknown [priv] (sshd)
>> 42140 - Z 0:00.01 <defunct>
>> 42141 - IW 0:00.00 sshd: unknown [pam] (sshd)
>> 58445 - Is 0:00.01 sshd: unknown [priv] (sshd)
>> 58446 - Z 0:00.02 <defunct>
>> 58447 - IW 0:00.00 sshd: unknown [pam] (sshd)
>> 65635 - Is 0:00.01 sshd: vinicius [priv] (sshd)
>> 65636 - Z 0:00.01 <defunct>
>> [...]
>>
>> # sockstat | grep 42140
>> #
>>
>> # sockstat | grep 58446
>> #
>>
>> # sockstat | grep 65636
>> #
>>
>> No associated socket with zombie process.
> Do a pstree. I bet the zombies are children of the other processes that
> are stuck on a socket as Kevin described.
>
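The parent of each zombie can also be read straight from ps(1), without a separate pstree tool. A minimal sketch, assuming the standard `pid`, `ppid`, `stat` and `command` keywords of ps; the function name `zombie_parents` is made up for illustration:

```sh
#!/bin/sh
# zombie_parents: read "PID PPID STAT COMMAND" lines on stdin and print
# "zombie_pid parent_pid" for every process whose state starts with Z.
# A header line is ignored automatically, because "STAT" does not start
# with Z.
zombie_parents() {
    awk '$3 ~ /^Z/ { print $1, $2 }'
}

# Typical use on a live system:
#   ps -ax -o pid,ppid,stat,command | zombie_parents
```

If the second column points at the "sshd: unknown [priv]" processes, the zombies are simply children that their stuck parent has not reaped yet.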
# ps afx|grep sshd |grep unk
10948 - Is 0:00.02 sshd: unknown [priv] (sshd)
10955 - IW 0:00.00 sshd: unknown [pam] (sshd) <====
11701 - Is 0:00.02 sshd: unknown [priv] (sshd)
11704 - IW 0:00.00 sshd: unknown [pam] (sshd)
25450 - Is 0:00.01 sshd: unknown [priv] (sshd)
25452 - IW 0:00.00 sshd: unknown [pam] (sshd)
41193 - Is 0:00.02 sshd: unknown [priv] (sshd)
41196 - IW 0:00.00 sshd: unknown [pam] (sshd)
42193 - Is 0:00.02 sshd: unknown [priv] (sshd)
42195 - IW 0:00.00 sshd: unknown [pam] (sshd)
80638 - Is 0:00.02 sshd: unknown [priv] (sshd)
80640 - IW 0:00.00 sshd: unknown [pam] (sshd)
81484 - Is 0:00.02 sshd: unknown [priv] (sshd)
81486 - IW 0:00.00 sshd: unknown [pam] (sshd)
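The same filter can be written as a single awk call that also guards the kill, so the cron job does nothing when no process matches. This is only a sketch of the workaround, not a fix; the function names `stale_sshd_pids` and `kill_stale_sshd` are made up for illustration:

```sh
#!/bin/sh
# stale_sshd_pids: read `ps afx` style lines on stdin and print the PID
# (first column) of every line whose command contains "sshd: unknown".
# The [u] in the pattern keeps this awk process from matching its own
# command line in the ps listing.
stale_sshd_pids() {
    awk '/sshd: [u]nknown/ { print $1 }'
}

# kill_stale_sshd: hypothetical wrapper that sends SIGKILL only when the
# PID list is non-empty, so kill(1) is never called without arguments.
kill_stale_sshd() {
    pids=$(ps afx | stale_sshd_pids)
    [ -n "$pids" ] && kill -9 $pids
    return 0
}
```

Running `ps afx | stale_sshd_pids` on its own first shows which PIDs would be killed.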
With procstat I could see the socket as follows:
# procstat -f 10955
  PID COMM    FD T V FLAGS    REF OFFSET PRO NAME
10955 sshd  text v r r-------   -      - -   /usr/sbin/sshd
10955 sshd   cwd v d r-------   -      - -   /
10955 sshd  root v d r-------   -      - -   /
10955 sshd     0 v c rw------   6      0 -   /dev/null
10955 sshd     1 v c rw------   6      0 -   /dev/null
10955 sshd     2 v c rw------   6      0 -   /dev/null
10955 sshd     3 s - rw---n--   2      0 TCP 186.xxx.xx.2:22 186.xxx.xx.8:57035
10955 sshd     5 p - rw------   2      0 -   -
10955 sshd     6 s - rw------   2      0 UDS -
10955 sshd     7 p - rw------   1      0 -   -
10955 sshd     8 s - rw------   2      0 UDS -
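To check the other stuck processes the same way, the TCP line can be picked out of the `procstat -f` output. A minimal sketch, assuming the ten-column layout shown above with the local and remote addresses on one line; the function name `tcp_sockets` is made up for illustration:

```sh
#!/bin/sh
# tcp_sockets: read `procstat -f` output on stdin and print
# "pid fd local_addr remote_addr" for every TCP socket descriptor.
# Column 4 is the descriptor type (s = socket), column 9 the protocol,
# columns 10 and 11 the local and remote endpoints.
tcp_sockets() {
    awk '$4 == "s" && $9 == "TCP" { print $1, $3, $10, $11 }'
}

# Typical use on a live system:
#   procstat -f 10955 | tcp_sockets
```

The remote address printed for each PID shows which client the stuck connection belongs to.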
I do not understand why these connections remain stuck on
FreeBSD 10.0.
I'll try this sysctl: net.inet.tcp.delayed_ack=0
