From: Eric Anderson
Date: Tue, 14 Aug 2007 22:23:56 -0500
To: Gore Jarold
Cc: freebsd-fs@freebsd.org
Subject: Re: help needed - tuning a filesystem for rm and cp ? (more details)
Message-ID: <46C271CC.8080008@freebsd.org>
In-Reply-To: <924334.71021.qm@web63006.mail.re1.yahoo.com>

On 08/14/07 16:47, Gore Jarold wrote:
>>> Here is a question for any and all out there reading ...
>>> what would you expect would happen to a system
>>> that was constantly maxing out this value, sometimes
>>> on a sustained basis, while the activity that caused
>>> it went on uninterrupted ?
>>>
>>> I am seeing the system halt ... is it reasonable to
>>> think that maxing that value out on a regular,
>>> sustained basis would cause a system to halt ?
>>>
>>> (6.2-release running on a 4 GB memory p4 xeon ... does
>>> nothing but fileserver duties)
>>
>> If you have a lot of meta-data IO (which you seem to have), then it's
>> possible that the system is incredibly busy doing disk accesses, and
>> waiting on IO from storage. When you say 'halt' does that mean you
>> can't log in to it, and eventually it comes back alive, or does that
>> mean it is locked up in a way that never recovers?
>
> I never wait around to find out if it comes back ...
>
> It no longer responds to pings, so I assume it is
> actually crashed.
>
> Also, it should be noted that this system (and its
> large, >4TB filesystem) has quotas enabled.
>
> So perhaps a better question would be:
>
> Any comments on frequent, sustained maxing out of
> dirhash on a quota'd filesystem ?

If you can, it would be best to break into the debugger and get a core.
I can't think of how dirhash would affect quotas really, but I suppose
it might be possible. You might try bumping the dirhash setting way up,
maybe to 20MB or so.

I think it might also be worthwhile to run a script (cron maybe) every
minute that logs the current use of dirhash, and maybe a ps -auxwl, with
a date stamp, so you can see when your peaks occur, what was running,
etc., just before a crash. Might also look into bsdsar - it might help
with some of this too.

Eric
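
For reference, a minimal sketch of the kind of per-minute logging script suggested above. It assumes the dirhash counters are read through the vfs.ufs.dirhash_mem and vfs.ufs.dirhash_maxmem sysctls; the script name, log path, and function names are hypothetical and only there for illustration. Each run appends one date-stamped block and syncs it to disk, so the last snapshot before a crash has a better chance of surviving.

```python
#!/usr/bin/env python3
"""Append a date-stamped dirhash + ps snapshot to a log (intended for cron)."""
import os
import subprocess
import sys
from datetime import datetime

# Assumption: these sysctl OIDs report current dirhash memory use and its limit.
SYSCTLS = ["vfs.ufs.dirhash_mem", "vfs.ufs.dirhash_maxmem"]


def run(cmd):
    """Return the command's stdout, or a short bracketed note if it fails."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                errors="replace", check=True)
        return result.stdout
    except (OSError, subprocess.CalledProcessError) as exc:
        return "[{} failed: {}]\n".format(" ".join(cmd), exc)


def snapshot(runner=run, now=None):
    """Build one log entry: date stamp, dirhash sysctls, then the process list."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    parts = ["=== {} ===\n".format(stamp)]
    parts.append(runner(["sysctl"] + SYSCTLS))
    parts.append(runner(["ps", "-auxwl"]))
    return "".join(parts)


def main(argv):
    entry = snapshot()
    if len(argv) < 2:
        # No log path given: print the snapshot instead of writing a file.
        sys.stdout.write(entry)
        return
    with open(argv[1], "a") as log:
        log.write(entry)
        log.flush()
        os.fsync(log.fileno())  # push the entry to disk before a possible crash


if __name__ == "__main__":
    main(sys.argv)
```

To run it every minute, a crontab entry along the lines of `* * * * * /usr/local/bin/python3 /root/dirhash_watch.py /var/log/dirhash-watch.log` would do (both paths are placeholders). The ps output is large, so the log will grow quickly and needs rotating or pruning.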