From: Nate Eldredge
Date: Wed, 24 Feb 2010 14:18:42 -0800 (PST)
To: Peter Steele
Cc: "freebsd-hackers@freebsd.org"
Subject: RE: ntpd hangs under FBSD 8

On Mon, 22 Feb 2010, Peter Steele wrote:

>> Just out of curiosity, can you attach to the process via gdb and get a
>> backtrace? This smells like a locked pthread_join I hit in my own code
>> a few weeks ago
>
> I'm not using the debug version of ntpd so the backtrace isn't too
> useful, but here's what I get:
>
> (gdb) bt
> #0 0x0000000800d52bfc in select () from /lib/libc.so.7
> #1 0x0000000000425273 in ?? ()
> #2 0x000000000040540e in ?? ()
> #3 0x0000000800580000 in ?? ()
> #4 0x0000000000000000 in ?? ()

I bet ntpd doesn't call select() in all that many places. Instead of going
to all this trouble to build a debugging libc, you could just grep for
select() and place breakpoints on all occurrences. (It might also be
obvious from looking at them which one is the offender.)

Also, since a system call is causing the trouble, you might learn
something from truss or ktrace.

-- 
Nate Eldredge
nate@thatsmathematics.com
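
A rough sketch of the grep step suggested above, assuming the ntpd sources are unpacked somewhere locally. The function name find_select_calls and its directory argument are hypothetical and not part of ntpd; the pattern only catches call-like uses of select followed by an opening parenthesis, so it may miss calls made through a macro or wrapper.

```shell
#!/bin/sh
# Hypothetical helper (not part of ntpd): print file:line:text for every
# line in the C sources under a directory that looks like a select() call.
find_select_calls() {
    src_dir=$1
    # The leading (^|[^[:alnum:]_]) keeps names such as pselect from matching.
    # /dev/null is passed as an extra file so grep always prints the file name,
    # even when find hands it only a single source file.
    find "$src_dir" -name '*.c' -exec \
        grep -nE '(^|[^[:alnum:]_])select[[:space:]]*\(' /dev/null {} +
}
```

In a build of ntpd that has debug symbols, each file:line pair it reports can be given to gdb's break command. For the system-call route, truss -p <pid> attaches to a running process, and ktrace -p <pid> followed by kdump records and prints a trace; both are worth checking against the man pages on the system in question.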