From: "Christophe Yayon"
To: freebsd-hackers@freebsd.org
Date: Tue, 12 Jul 2005 21:33:08 +0200 (CEST)
Message-ID: <61087.192.168.42.2.1121196788.squirrel@webmail.nbux.com>
Subject: nagios and pthreads

Hi all,

I know that we have already discussed this problem, but is there any solution for it?

--- Here is the relevant section from the Nagios website:

"FreeBSD and threads. On FreeBSD there's a native user-level implementation of threads called 'pthread' and there's also an optional ports collection 'linuxthreads' that uses kernel hooks. Some folks from Yahoo! have reported that using the pthread library causes Nagios to pause under heavy I/O load, causing some service check results to be lost. Switching to linuxthreads seems to help this problem, but not fix it. The lock happens in liblthread's __pthread_acquire() - it can't ever acquire the spinlock. It happens when the main thread forks to execute an active check. On the second fork to create the grandchild, the grandchild is created by fork, but never returns from liblthread's fork wrapper, because it's stuck in __pthread_acquire(). Maybe some FreeBSD users can help out with this problem."

---

I have just upgraded to 5.4-STABLE, but I still encounter the problem. Sometimes a forked Nagios child process consumes 100% of the CPU. I have heard that there may be a problem with libc_r, reported by Luigi Rizzo on this list on 06/22/2005, but there has been no news since that date.

My workaround is a cron job that runs every hour, checks whether there is a bad nagios process, and kills it. I know it's very ugly.

Do you have any solution, or is there something I could do to get more trace information when it happens? Sorry, but I am not familiar with ktrace-like tools.

Could someone help me to help the Nagios community on FreeBSD? ;-)

Thanks in advance.
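
P.S. For illustration, the hourly cron workaround described above could be sketched roughly like this in Python. This is only a sketch under assumptions: the names `find_runaway` and `main`, the 90% CPU threshold, and the rule that a "bad" process is one named `nagios` whose parent is also named `nagios` are all hypothetical choices made for the example, not anything defined by Nagios or FreeBSD.

```python
import os
import signal
import subprocess
import sys


def find_runaway(ps_output, threshold=90.0, name="nagios"):
    """Return PIDs of forked `name` children at or above `threshold` %CPU.

    `ps_output` is the text printed by `ps -axo pid,ppid,pcpu,comm`.
    The "child of another nagios process" rule is an assumption.
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        try:
            rows.append((int(parts[0]), int(parts[1]),
                         float(parts[2]), parts[3].strip()))
        except ValueError:
            continue  # ignore lines that do not parse cleanly

    nagios_pids = {pid for pid, _, _, comm in rows if comm == name}
    return [pid for pid, ppid, cpu, comm in rows
            if comm == name and ppid in nagios_pids and cpu >= threshold]


def main(threshold):
    # LC_ALL=C keeps the %CPU column in "99.0" form regardless of locale.
    env = dict(os.environ, LC_ALL="C")
    out = subprocess.run(["ps", "-axo", "pid,ppid,pcpu,comm"],
                         capture_output=True, text=True,
                         check=True, env=env).stdout
    for pid in find_runaway(out, threshold):
        print("killing runaway nagios child", pid)
        os.kill(pid, signal.SIGKILL)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(float(sys.argv[1]))
    else:
        print("usage: nagios_watchdog.py <cpu-threshold-percent>")
```

Run from cron with the threshold as its only argument (the script name `nagios_watchdog.py` is likewise hypothetical). Note that `ps` reports a decaying average for %CPU, so a short legitimate burst could in principle cross the threshold; this only hides the symptom and does nothing about the stuck spinlock itself.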