Date: Sat, 01 Jan 2011 16:36:24 -0800
From: Julian Elischer <julian@freebsd.org>
To: Kostik Belousov
Cc: Alexander Best, Beat Gätzi, current@freebsd.org
Subject: Re: Suddenly slow lstat syscalls on CURRENT from Juli

On 1/1/11 9:26 AM, Kostik Belousov wrote:
> On Sat, Jan 01, 2011 at 05:59:10PM +0100, Beat Gätzi wrote:
>> On 01.01.2011 17:46, Kostik Belousov wrote:
>>> On Sat, Jan 01, 2011 at 05:42:58PM +0100, Beat Gätzi wrote:
>>>> On 01.01.2011 17:12, Kostik Belousov wrote:
>>>>> On Sat, Jan 01, 2011 at 05:00:56PM +0100, Beat Gätzi wrote:
>>>>>> On 01.01.2011 16:45, Kostik Belousov wrote:
>>>>>>> Check the output of sysctl kern.maxvnodes and vfs.numvnodes. I suspect
>>>>>>> they are quite close or equal. If yes, consider increasing maxvnodes.
>>>>>>> Another workaround, if you have a huge nested directory hierarchy, is
>>>>>>> to set vfs.vlru_allow_cache_src to 1.
>>>>>> Thanks for the hint. kern.maxvnodes and vfs.numvnodes were equal:
>>>>>> # sysctl kern.maxvnodes vfs.numvnodes
>>>>>> kern.maxvnodes: 100000
>>>>>> vfs.numvnodes: 100765
>>>>>>
>>>>>> I've increased kern.maxvnodes and the problem was gone until
>>>>>> vfs.numvnodes reached the value of kern.maxvnodes again:
>>>>>> # sysctl kern.maxvnodes vfs.numvnodes
>>>>>> kern.maxvnodes: 150000
>>>>>> vfs.numvnodes: 150109
>>>>> The processes should be stuck in the "vlruwk" state; that can be
>>>>> checked with ps or '^T' on the terminal.
>>>> Yes, there are various processes in the "vlruwk" state.
>>>>
>>>>>> As the directory structure is quite huge on this server, I've set
>>>>>> vfs.vlru_allow_cache_src to one now.
>>>>> Did it help?
>>>> No, it doesn't look like setting vfs.vlru_allow_cache_src helped. The
>>>> problem was gone when I increased kern.maxvnodes, until vfs.numvnodes
>>>> reached that level. I've stopped all running daemons, but numvnodes
>>>> doesn't decrease.
>>> Stopping the daemons would not decrease the count of cached vnodes.
>>> What you can do is call unmount on the filesystems. Presumably the
>>> filesystems are busy and the unmount will fail, but it will force
>>> freeing the vnodes that are not used by any process.
>> That freed around 1500 vnodes. At the moment vfs.numvnodes doesn't
>> increase rapidly and the server is usable. I will keep an eye on it to
>> see if I run into the same problem again.
> That is too small an amount of vnodes to be freed for a typical system,
> and it feels like a real vnode leak. It would be helpful if you tried
> to identify the load that causes the situation to occur.
>
> You are on UFS, right?

Try running sockstat to a file and looking to see what is open; it could
just be a normal leak.
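
For anyone else chasing the same symptoms, here is a minimal sketch of the
checks and workarounds mentioned in the quoted exchange, assuming a stock
FreeBSD sysctl(8) and ps(1) and a root shell; the 200000 figure is only a
placeholder, not a recommendation:

  # Compare the vnode limit with the current count, as done above.
  sysctl kern.maxvnodes vfs.numvnodes
  # Look for processes blocked on vnode reclamation (the "vlruwk" wait channel).
  ps -ax -o pid,wchan,comm | grep vlruwk
  # Workarounds discussed in the thread: raise the limit, and/or allow
  # reclaiming vnodes that are still namecache sources.
  sysctl kern.maxvnodes=200000
  sysctl vfs.vlru_allow_cache_src=1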
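
And a rough illustration of the sockstat suggestion, with placeholder file
names and an arbitrary interval; comparing two snapshots taken some time
apart is one way to see whether descriptors keep piling up:

  # Snapshot the open sockets now and again later, then compare the two.
  sockstat > /tmp/sockstat.before
  sleep 3600
  sockstat > /tmp/sockstat.after
  diff /tmp/sockstat.before /tmp/sockstat.after | less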