Date: Fri, 11 Apr 2014 13:39:54 +0100
From: Karl Pielorz
To: Konstantin Belousov
Cc: freebsd-hackers@freebsd.org
Subject: Re: Stuck CLOSED sockets / sshd / zombies...
Message-ID: <211BD03C086DDB1A07FDF036@Mail-PC.tdx.co.uk>
In-Reply-To: <20140410184855.GP21331@kib.kiev.ua>

Ok, rebuilt a debug world (with your rtld-elf patch), installed it - reproduced the
issue, and ran up gdb on a 'urdlck' stuck sshd, and got the trace below.

Fingers crossed,

-Karl

ps. When running up gdb I get a number of these errors (having checked, I've
always got these - I just didn't notice before as they scroll past right at
the top of the output from gdb starting up):

"
Attaching to program: /usr/sbin/sshd, process 2220
warning: current_sos: Can't read pathname for load map: Bad address
warning: current_sos: Can't read pathname for load map: Bad address
"

I'm presuming they can be ignored?

---

"
[Switching to Thread 804006400 (LWP 100083/sshd)]
_umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
37 RSYSCALL_ERR(_umtx_op)
(gdb) bt
#0 _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
#1 0x00000008038e304f in __thr_rwlock_rdlock (rwlock=0x803afb480, flags=, tsp=) at /usr/src/lib/libthr/thread/thr_umtx.c:277
#2 0x00000008038ea22c in _thr_rtld_rlock_acquire (lock=0x803afb480) at thr_umtx.h:196
#3 0x000000080064f9a2 in rlock_acquire (lock=0x80085fe00, lockstate=0x7fffffffc058) at /usr/src/libexec/rtld-elf/rtld_lock.c:197
#4 0x00000008006498c9 in _rtld_bind (obj=0x800662000, reloff=13008) at /usr/src/libexec/rtld-elf/rtld.c:675
#5 0x00000008006470cd in _rtld_bind_start () at /usr/src/libexec/rtld-elf/amd64/rtld_start.S:121
#6 0x000000000041072c in grace_alarm_handler (sig=0) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:378
#7
#8 _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
#9 0x00000008038e304f in __thr_rwlock_rdlock (rwlock=0x803afb480, flags=, tsp=) at /usr/src/lib/libthr/thread/thr_umtx.c:277
#10 0x00000008038ea22c in _thr_rtld_rlock_acquire (lock=0x803afb480) at thr_umtx.h:196
#11 0x000000080064f9a2 in rlock_acquire (lock=0x80085fe00, lockstate=0x7fffffffc668) at /usr/src/libexec/rtld-elf/rtld_lock.c:197
#12 0x00000008006498c9 in _rtld_bind (obj=0x800662000, reloff=9240) at /usr/src/libexec/rtld-elf/rtld.c:675
#13 0x00000008006470cd in
_rtld_bind_start () at /usr/src/libexec/rtld-elf/amd64/rtld_start.S:121
#14 0x000000000042f9dd in sshpam_sigchld_handler (sig=) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/auth-pam.c:152
#15
#16 0x000000080064a1c4 in dlclose (handle=0x800696c00) at /usr/src/libexec/rtld-elf/rtld.c:4136
#17 0x0000000800ede121 in openpam_destroy_chain (chain=0x8040634e0) at /usr/src/lib/libpam/libpam/../../../contrib/openpam/lib/libpam/openpam_load.c:92
#18 0x0000000800ede0bc in openpam_destroy_chain (chain=0x804063460) at /usr/src/lib/libpam/libpam/../../../contrib/openpam/lib/libpam/openpam_load.c:109
#19 0x0000000800ede0bc in openpam_destroy_chain (chain=0x8040633e0) at /usr/src/lib/libpam/libpam/../../../contrib/openpam/lib/libpam/openpam_load.c:109
#20 0x0000000800ede051 in openpam_clear_chains (policy=0x80401a6c8) at /usr/src/lib/libpam/libpam/../../../contrib/openpam/lib/libpam/openpam_load.c:128
#21 0x0000000800eda9e7 in pam_end (pamh=0x80401a6c0, status=) at /usr/src/lib/libpam/libpam/../../../contrib/openpam/lib/libpam/pam_end.c:83
#22 0x000000000042e15d in sshpam_cleanup () at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/auth-pam.c:614
#23 0x000000000041d58f in do_cleanup (authctxt=0x80401a600) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/session.c:2732
#24 0x000000000041064f in ssh_cleanup_exit (i=255) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:2545
#25 0x0000000000428f83 in mm_request_receive (sock=, m=) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/monitor_wrap.c:153
#26 0x0000000000427e26 in monitor_read (pmonitor=0x804022220, ent=0x6465a0, pent=0x7fffffffd240) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/monitor.c:593
#27 0x0000000000427b49 in monitor_child_preauth (_authctxt=, pmonitor=0x804022220) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/monitor.c:387
#28 0x000000000040fd15 in main (ac=, av=) at /usr/src/secure/usr.sbin/sshd/../../../crypto/openssh/sshd.c:679
"