From: Peter Steele <psteele@maxiscale.com>
To: "freebsd-hackers@freebsd.org"
Date: Mon, 22 Feb 2010 09:44:01 -0600
Subject: RE: ntpd hangs under FBSD 8

> Just out of curiosity, can you attach to the process via gdb and get a
> backtrace? This smells like a locked pthread_join I hit in my own code a few
> weeks ago

I'm not using the debug version of ntpd so the backtrace isn't too useful, but here's what I get:

(gdb) bt
#0  0x0000000800d52bfc in select () from /lib/libc.so.7
#1  0x0000000000425273 in ?? ()
#2  0x000000000040540e in ?? ()
#3  0x0000000800580000 in ?? ()
#4  0x0000000000000000 in ?? ()

The trace continues for 700+ entries, but the first entry is useful enough. select() takes a timeout parameter, and every time I take a backtrace the process is sitting in this same select() call, so it looks like ntpd is calling it with an infinite timeout. One of these was running all weekend, in fact, and it's still stuck.

Curiously, the problem only happens when we run the command from code via a system() call. If I run the same command interactively, it never hangs:

# /usr/sbin/ntpd -g -q
ntpd: time set +28845.997063s

The same code that runs this command does not hang when we run it on a FreeBSD 7 box.

I think I'm going to have to build the debug version of ntpd and try to debug it. Definitely something weird going on.
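
To illustrate the select() behaviour described above, here is a minimal C sketch. It is not ntpd's code: wait_readable() and its timeout_ms parameter are hypothetical names made up for this example. It only shows the difference between passing select() a NULL timeout, which can block indefinitely, and passing a bounded one.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper (an assumption for illustration, not taken from ntpd):
 * wait until fd becomes readable.
 *
 * timeout_ms <  0: pass a NULL timeout, so select() blocks until the
 *                  descriptor is ready or a signal interrupts it.
 * timeout_ms >= 0: bound the wait to roughly that many milliseconds.
 *
 * Returns 1 if fd is readable, 0 on timeout, -1 on error (errno is set
 * by select()).
 */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    struct timeval *tvp = NULL;   /* NULL tells select() to wait without limit */

    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    int n = select(fd + 1, &rfds, NULL, NULL, tvp);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

With a negative timeout_ms, a process waiting on a descriptor that never becomes readable would show up in gdb parked in select(), much like frame #0 in the backtrace above. Whether that is what ntpd is actually doing here still needs the debug build to confirm.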