Date: Thu, 20 Mar 2014 20:46:52 -0300
From: Marcelo Gondim <gondim@bsdinfo.com.br>
To: FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
Subject: Re: sshd with zombie process on FreeBSD 10.0-STABLE - workaround
Message-ID: <532B7DEC.7010809@bsdinfo.com.br>
In-Reply-To: <201403201058.38555.jhb@freebsd.org>
References: <53016D97.5030909@bsdinfo.com.br> <CAN6yY1uucfkdXxkCF30w1Q9vffRpDLxM90Sz1XVbdn5W69vQMg@mail.gmail.com> <5329D81E.7040709@bsdinfo.com.br> <201403201058.38555.jhb@freebsd.org>

On 20/03/14 11:58, John Baldwin wrote:
> On Wednesday, March 19, 2014 1:47:10 pm Marcelo Gondim wrote:
>> On 19/03/14 13:01, Kevin Oberman wrote:
>>> On Wed, Mar 19, 2014 at 6:00 AM, Marcelo Gondim <gondim@bsdinfo.com.br> wrote:
>>>> Hi all,
>>>>
>>>> Until a real solution appears, I wrote the script below and put it in
>>>> crontab to automatically kill zombie sshd processes.
>>>>
>>>> the_walking_dead.sh:
>>>>
>>>> #!/bin/sh
>>>> kill -9 `ps afx|grep sshd|grep unknown|awk '{print $1}'`
>>>>
>>>>
>>>> Put this in /etc/crontab:
>>>>
>>>> 00 1 * * * root the_walking_dead.sh
>>>>
>>>>
>>> If 'kill -9' works, the process is not really a zombie. It simply still has
>>> a socket open and is waiting for it to be closed before exiting.
>>>
>>> You might take a look at network sockets with sockstat(1) and see if you
>>> can get any indication of why these sockets are not being closed. It may be
>>> that the issue is not sshd but some other issue in the OS leaving sockets
>>> open.
>>>
>> Hi Kevin,
>>
>> My ps -afx below:
>>
>> [...]
>> 42139  -  Is       0:00.01 sshd: unknown [priv] (sshd)
>> 42140  -  Z        0:00.01 <defunct>
>> 42141  -  IW       0:00.00 sshd: unknown [pam] (sshd)
>> 58445  -  Is       0:00.01 sshd: unknown [priv] (sshd)
>> 58446  -  Z        0:00.02 <defunct>
>> 58447  -  IW       0:00.00 sshd: unknown [pam] (sshd)
>> 65635  -  Is       0:00.01 sshd: vinicius [priv] (sshd)
>> 65636  -  Z        0:00.01 <defunct>
>> [...]
>>
>> # sockstat | grep 42140
>> #
>>
>> # sockstat | grep 58446
>> #
>>
>> # sockstat | grep 65636
>> #
>>
>> No socket is associated with the zombie processes.
> Do a pstree. I bet the zombies are children of the other processes that
> are stuck on a socket as Kevin described.
>

# ps afx|grep sshd |grep unk
10948  -  Is       0:00.02 sshd: unknown [priv] (sshd)
10955  -  IW       0:00.00 sshd: unknown [pam] (sshd)   <====
11701  -  Is       0:00.02 sshd: unknown [priv] (sshd)
11704  -  IW       0:00.00 sshd: unknown [pam] (sshd)
25450  -  Is       0:00.01 sshd: unknown [priv] (sshd)
25452  -  IW       0:00.00 sshd: unknown [pam] (sshd)
41193  -  Is       0:00.02 sshd: unknown [priv] (sshd)
41196  -  IW       0:00.00 sshd: unknown [pam] (sshd)
42193  -  Is       0:00.02 sshd: unknown [priv] (sshd)
42195  -  IW       0:00.00 sshd: unknown [pam] (sshd)
80638  -  Is       0:00.02 sshd: unknown [priv] (sshd)
80640  -  IW       0:00.00 sshd: unknown [pam] (sshd)
81484  -  Is       0:00.02 sshd: unknown [priv] (sshd)
81486  -  IW       0:00.00 sshd: unknown [pam] (sshd)

With procstat I could see the socket as follows:

# procstat -f 10955
  PID COMM   FD T V FLAGS    REF OFFSET PRO NAME
10955 sshd text v r r-------   -      - -   /usr/sbin/sshd
10955 sshd  cwd v d r-------   -      - -   /
10955 sshd root v d r-------   -      - -   /
10955 sshd    0 v c rw------   6      0 -   /dev/null
10955 sshd    1 v c rw------   6      0 -   /dev/null
10955 sshd    2 v c rw------   6      0 -   /dev/null
10955 sshd    3 s - rw---n--   2      0 TCP 186.xxx.xx.2:22 186.xxx.xx.8:57035
10955 sshd    5 p - rw------   2      0 -   -
10955 sshd    6 s - rw------   2      0 UDS -
10955 sshd    7 p - rw------   1      0 -   -
10955 sshd    8 s - rw------   2      0 UDS -

I do not understand why these connections remain locked in FreeBSD 10.0.

I'll try this sysctl: net.inet.tcp.delayed_ack=0
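
For anyone who wants to repeat the checks above without the grep chain, here is a rough sketch, not a tested fix. The function names are made up for illustration, and the awk field positions assume the default FreeBSD "ps -ax" layout (PID TT STAT TIME COMMAND) and the "procstat -f" layout shown above. It selects only "sshd: unknown" processes, so the grep process itself and <defunct> lines are never passed to kill, and kill is skipped when nothing matches.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, column positions are assumed
# from the ps/procstat output quoted in this thread.

# Read `ps -ax` output on stdin and print the PIDs of pre-authentication
# sshd processes ("sshd: unknown ..."). Zombie (<defunct>) lines and any
# grep process are skipped because their command field is not "sshd:".
stuck_sshd_pids() {
    awk '$5 == "sshd:" && $6 == "unknown" { print $1 }'
}

# Read `ps -ax -o pid,ppid,stat,command` output on stdin and print
# "zombie_pid parent_pid" for every process whose state starts with Z.
zombie_parents() {
    awk '$3 ~ /^Z/ { print $1, $2 }'
}

# Print the TCP socket lines from `procstat -f` for each stuck sshd.
show_stuck_sockets() {
    ps -ax | stuck_sshd_pids | while read -r pid; do
        procstat -f "$pid" | awk '$9 == "TCP"'
    done
}

# Same workaround as the_walking_dead.sh, but never runs kill with an
# empty argument list.
kill_stuck_sshd() {
    pids=$(ps -ax | stuck_sshd_pids)
    if [ -n "$pids" ]; then
        # Word splitting of the PID list is intended here.
        kill -9 $pids
    fi
}
```

Usage would be something like "ps -ax -o pid,ppid,stat,command | zombie_parents" to confirm which process owns each zombie, then show_stuck_sockets to see the remote addresses, and kill_stuck_sshd from cron only as a stopgap. If the script is run from /etc/crontab it should be referenced by its absolute path, since cron's PATH is limited.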