From: John Baldwin
To: Karl Pielorz
Cc: freebsd-hackers@freebsd.org
Subject: Re: Stuck CLOSED sockets / sshd / zombies...
Date: Wed, 2 Apr 2014 14:05:56 -0400
Message-Id: <201404021405.56878.jhb@freebsd.org>

On Wednesday, April 02, 2014 12:55:43 pm Karl Pielorz wrote:
>
> --On 2 April 2014 11:30:39 -0400 John Baldwin wrote:
>
> >> # ps ax | grep 4344
> >> ps axl | grep 4344
> >>     0  4344   895   0  20  0 84868 6944 urdlck Is    -   0:00.01 sshd:
> >> unknown [priv] (sshd)
> >
> > Can you get 'procstat -k 4344' to see where this process is stuck?
>
> Sure,
>
> "
> # procstat -k 4344
>   PID    TID COMM             TDNAME           KSTACK
>  4344 100068 sshd             -                mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep umtxq_sleep do_rw_rdlock
> __umtx_op_rw_rdlock amd64_syscall Xfast_syscall
> "

Yes, that is waiting on a pthread read lock as the Xen guys noted.

> >>    22  4345  4344   0  20  0     0    0 -      Z     -   0:00.00
> >>     0  4346  4344   0  21  0 84868 6952 sbwait I     -   0:00.00 sshd:
> >> unknown [pam] (sshd)
> >
> > 'procstat -f' and 'procstat -k' for this process might also be useful.
>
> Ok, think you mean PID 4346?
>
> "
> # procstat -f 4346
>   PID COMM   FD T V FLAGS    REF OFFSET PRO NAME
>  4346 sshd text v r r-------   -      - -   /usr/sbin/sshd
>  4346 sshd  cwd v d r-------   -      - -   /
>  4346 sshd root v d r-------   -      - -   /
>  4346 sshd    0 v c rw------   6      0 -   /dev/null
>  4346 sshd    1 v c rw------   6      0 -   /dev/null
>  4346 sshd    2 v c rw------   6      0 -   /dev/null
>  4346 sshd    3 s - rw---n--   2      0 TCP 192.168.0.138:22
> 192.168.0.45:54588
>  4346 sshd    5 p - rw------   2      0 -   -
>  4346 sshd    6 s - rw------   2      0 UDS -
>  4346 sshd    7 p - rw------   1      0 -   -
>  4346 sshd    8 s - rw------   2      0 UDS -
> "
>
> "
> # procstat -k 4346
>   PID    TID COMM             TDNAME           KSTACK
>  4346 100100 sshd             -                mi_switch
> sleepq_catch_signals sleepq_wait_sig _sleep sbwait soreceive_generic
> dofileread kern_readv sys_read amd64_syscall Xfast_syscall
> "

Grr, I guess that's what I should have expected.  I was sort of hoping to
be able to see which socket it was blocked on.

Can you run 'kgdb' as root (no args), then do 'proc 4346' and 'bt'?  If you
are familiar with gdb, walk up to the frame that is in sys_read and do
'p *uap' so we can see which fd is being read.

-- 
John Baldwin