From: Mel Flynn
To: freebsd-hackers@freebsd.org
Date: Mon, 15 Jun 2009 13:53:05 -0800
Subject: How best to debug locking/scheduler problems
Message-Id: <200906151353.06630.mel.flynn+fbsd.hackers@mailing.thruhere.net>
User-Agent: KMail/1.11.4 (FreeBSD/8.0-CURRENT; KDE/4.2.4; i386; ; )

Hi,

I'm trying to get to the bottom of a bug with getpeername() and certain kde4 applications, which is probably as low-level as libthr and the scheduler. From browsing various related files in sys/kern, it seems KTR is a good bet to get the information needed, yet it isn't really well supported in userland. For one, from reading ktr(9) and alq(9) I've got no clue how to retrieve the lock info or filter it in userland, other than by logging console output(?).
Gdb is useless, as the process doesn't give the information gdb wants and gdb just hangs in wait. ktrace also does not provide anything, as there are no more syscalls being made, so I'll have to get to the bottom of this by tracing and filtering.

Short description of the problem:
a process never gets out of mi_switch and remains locked even when init tries to shut it down.

% procstat -t 4283
  PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
 4283 100215 kdeinit4         -                  0  128 lock    *unp_mtx
% procstat -k 4283
  PID    TID COMM             TDNAME           KSTACK
 4283 100215 kdeinit4         -                mi_switch turnstile_wait _mtx_lock_sleep uipc_peeraddr kern_getpeername getpeername syscall Xint0x80_syscall
% ps -ww 4283
  PID  TT  STAT      TIME COMMAND
 4283  ??  T      0:00.38 kdeinit4: kdeinit4: kio_http http local:/tmp/ksocket-mel/klauncherxJ1635.slave-socket local:/tmp/ksocket-mel/plasmayC1653.slave-socket (kdeinit4)
% ls -l /tmp/ksocket-mel/
total 2
-rw-rw-r--  1 mel  wheel  62 Jun 14 22:55 KSMserver__0
srw-------  1 mel  wheel   0 Jun 14 22:55 kdeinit4__0
srwxrwxr-x  1 mel  wheel   0 Jun 14 22:55 klauncherxJ1635.slave-socket

-- 
Mel
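
To make the "filtering" part concrete, here is a minimal userland sketch that picks threads blocked on a mutex out of `procstat -k` output by looking for a given frame (turnstile_wait by default) in the kernel stack. It is only an illustration, not a KTR replacement: the helper names (blocked_threads, procstat_kstack) are made up for this example, and the parser assumes the column layout shown above with TDNAME as a single token such as "-".

```python
import subprocess


def blocked_threads(procstat_text, frame="turnstile_wait"):
    """Return (pid, tid, comm, stack) for threads whose kernel stack contains `frame`."""
    hits = []
    for line in procstat_text.splitlines():
        fields = line.split()
        # Skip the header and blank lines: data rows start with numeric PID and TID.
        if len(fields) < 5 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        pid, tid, comm = int(fields[0]), int(fields[1]), fields[2]
        # Assumption: TDNAME (fields[3]) is one token, so the stack starts at fields[4].
        stack = fields[4:]
        if frame in stack:
            hits.append((pid, tid, comm, stack))
    return hits


def procstat_kstack(pid):
    """Hypothetical helper: fetch live `procstat -k <pid>` output (FreeBSD only)."""
    result = subprocess.run(
        ["procstat", "-k", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Sample taken from the procstat -k output quoted in this message.
SAMPLE = """\
  PID    TID COMM             TDNAME           KSTACK
 4283 100215 kdeinit4         -                mi_switch turnstile_wait _mtx_lock_sleep uipc_peeraddr kern_getpeername getpeername syscall Xint0x80_syscall
"""

if __name__ == "__main__":
    for pid, tid, comm, stack in blocked_threads(SAMPLE):
        print(pid, tid, comm, " <- ".join(stack))
```

On a live system, blocked_threads(procstat_kstack(4283)) would apply the same filter to the current state of the process. This only shows where a thread is blocked; it says nothing about which thread holds unp_mtx, which is the part that still needs kernel-side tracing.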