From: Dan Nelson
To: Ivan Voras
Cc: freebsd-hackers@freebsd.org
Date: Fri, 28 Jan 2011 09:25:06 -0600
Subject: Re: Namecache lock contention?
Message-ID: <20110128152505.GP75125@dan.emsphone.com>
X-OS: FreeBSD 8.2-PRERELEASE

In the last episode (Jan 28), Ivan Voras said:
> I have this situation on a PHP server:
>
> 36623 www    1  76    0   237M 30600K *Name   6   0:14 47.27% php-cgi
> 36638 www    1  76    0   237M 30600K *Name   3   0:14 46.97% php-cgi
> 36628 www    1 105    0   237M 30600K *Name   2   0:14 46.88% php-cgi
> 36627 www    1 105    0   237M 30600K *Name   0   0:14 46.78% php-cgi
> 36639 www    1 105    0   237M 30600K *Name   5   0:14 46.58% php-cgi
> 36643 www    1 105    0   237M 30600K *Name   7   0:14 46.39% php-cgi
> 36629 www    1  76    0   237M 30600K *Name   1   0:14 46.39% php-cgi
> 36642 www    1 105    0   237M 30600K *Name   2   0:14 46.39% php-cgi
> 36626 www    1 105    0   237M 30600K *Name   5   0:14 46.19% php-cgi
> 36654 www    1 105    0   237M 30600K *Name   7   0:13 46.19% php-cgi
> 36645 www    1 105    0   237M 30600K *Name   1   0:14 45.75% php-cgi
> 36625 www    1 105    0   237M 30600K *Name   0   0:14 45.56% php-cgi
> 36624 www    1 105    0   237M 30600K *Name   6   0:14 45.56% php-cgi
> 36630 www    1  76    0   237M 30600K *Name   7   0:14 45.17% php-cgi
> 36631 www    1 105    0   237M 30600K RUN     4   0:14 45.17% php-cgi
> 36636 www    1 105    0   237M 30600K *Name   3   0:14 44.87% php-cgi
>
> It looks like periodically most or all of the php-cgi processes are
> blocked in "*Name" for long enough that "top" notices, then continue,
> probably in a "thundering herd" way. From grepping inside /sys the most
> likely suspect seems to be something in the namecache, but I can't find
> exactly a symbol named "Name" or string beginning with "Name" that would
> be connected to a lock.

My guess would be:

kern/vfs_cache.c:151 static struct rwlock cache_lock;
kern/vfs_cache.c:152 RW_SYSINIT(vfscache, &cache_lock, "Name Cache");

The CACHE_*LOCK() macros in vfs_cache.c use cache_lock, so you've got lots
of possible contention points.
procstat -ka and/or dtrace might help you determine exactly where.

-- 
	Dan Nelson
	dnelson@allantgroup.com
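
To put a number on how many processes are blocked at a given instant, the
STATE column of the top output quoted above can be tallied. A minimal Python
sketch, assuming the twelve-column layout seen in the quote (STATE is the
eighth whitespace-separated field) and a state string without embedded
spaces; count_states and SAMPLE are illustrative names, not part of any
FreeBSD tool, and SAMPLE holds only a few of the quoted lines:

```python
from collections import Counter

# Column layout assumed from the quoted top(1) output:
# PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
STATE_COLUMN = 7


def count_states(top_lines):
    """Tally the STATE column of top(1) process lines."""
    counts = Counter()
    for line in top_lines:
        fields = line.split()
        # Process rows start with a numeric PID; skip headers and blanks.
        if len(fields) <= STATE_COLUMN or not fields[0].isdigit():
            continue
        counts[fields[STATE_COLUMN]] += 1
    return counts


# A few rows copied from the quoted output, for illustration only.
SAMPLE = """\
36623 www 1 76 0 237M 30600K *Name 6 0:14 47.27% php-cgi
36638 www 1 76 0 237M 30600K *Name 3 0:14 46.97% php-cgi
36631 www 1 105 0 237M 30600K RUN 4 0:14 45.17% php-cgi
36636 www 1 105 0 237M 30600K *Name 3 0:14 44.87% php-cgi
""".splitlines()

if __name__ == "__main__":
    for state, n in count_states(SAMPLE).most_common():
        print(state, n)
```

Running this over repeated top snapshots would show whether the "*Name"
count rises and falls together across processes, which is what a
thundering-herd pattern on a single lock would look like.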