From owner-freebsd-fs@FreeBSD.ORG Tue Apr 11 15:12:36 2006
From: Eric Anderson <anderson@centtech.com>
To: Nicolas KOWALSKI
Cc: freebsd-fs@freebsd.org
Date: Tue, 11 Apr 2006 10:12:21 -0500
Subject: Re: [patch] giant-less quotas for UFS
Message-ID: <443BC755.1080905@centtech.com>

Nicolas KOWALSKI wrote:
> Eric Anderson writes:
>
>> Nicolas KOWALSKI wrote:
>>> Eric Anderson writes:
>>>
>>>> Nicolas KOWALSKI wrote:
>>>>> Eric Anderson writes:
>>>>>
>>>>>> Nicolas KOWALSKI wrote:
>>>>>>> Yes, this is exactly what is happening. To add some precision,
>>>>>>> some students here use calculation applications that allocate a
>>>>>>> lot of disk space, usually more than their allowed home quotas;
>>>>>>> when by mistake they launch these apps in their home directories
>>>>>>> instead of their workstations' dedicated space, it brings the
>>>>>>> server to its knees on the NFS client side.
>>>>>> When you say 'to its knees' - what do you mean exactly? How many
>>>>>> clients do you have, how much memory is on the server, and how
>>>>>> many nfsd threads are you using? What kind of load average do you
>>>>>> see during this (on the server)?
>>>>> Sorry for the imprecision.
>>>>> The server is a dual-Xeon 2.8GHz with 2GB of RAM, using an Ultra320
>>>>> SCSI controller and 76GB disks. It is accessed over NFS by ~100
>>>>> Unix (Linux, Solaris) clients, and over Samba by ~15 Windows XP
>>>>> machines. The network connection is Gigabit Ethernet.
>>>>> During slowdowns, it is only from the NFS clients' point of view
>>>>> that the server does not respond. For example, a simple 'ls' in my
>>>>> home directory is normally almost immediate, but when the server
>>>>> slows down it can take up to 2 minutes. On the server, the load
>>>>> average goes to 0.5, compared to a usual maximum of 0.15-0.20. top
>>>>> shows the nfsd processes in the "biowr" state, but nothing is
>>>>> really being written, because the quota system blocks any further
>>>>> writes by the user exceeding his/her quota.
>>>>>
>>>> In this case (which is what I suspected), try bumping up your nfsd
>>>> threads to 128. I set mine very high (I have around 1000 clients),
>>>> and I can say there aren't really ill effects besides a bit of
>>>> memory usage (which you have plenty of). I suspect increasing the
>>>> threads will neutralize this problem for you.
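For the archives: the daemon count is nfsd's -n flag, so in
/etc/rc.conf the change would look roughly like the lines below. I'm
writing this from memory, so double-check it against nfsd(8) on your
release:

    nfs_server_enable="YES"
    # -u and -t serve UDP and TCP; -n sets the number of daemons
    nfs_server_flags="-u -t -n 128"

You'd then restart the NFS server processes (or reboot) for the new
count to take effect.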
>>> Using 128 nfsd threads, I stressed the server by running, on an NFS
>>> client, a small C program writing continuously to a file, so that
>>> the user "biguser" (account stored on /export/home2) exceeds his
>>> quota.
>>> It half-works: during the test, users working on another disk
>>> (/export/home) did not see any difference, but users working on the
>>> same disk as "biguser" (/export/home2) were almost halted.
>>> So this is better, because before everybody was halted, but there is
>>> still a problem.
>>> Any other tips?
>> Watch gstat during the testing, and see if the disk that holds the
>> full partition is really busy. I'm betting it's thrashing the disk,
>> continually checking for free space. I don't think there's any way
>> to avoid that.
>
> Mh, I did not find this "gstat" tool on the system or in the ports;
> perhaps it is in >= 5.x? (The server is running 4.11-p15.)
>
> It is sad that I cannot do anything about it: such a server pulled
> down by a single NFS client. :-(

*sigh* - I should really pay more attention to the beginning of the
thread. I thought you were on 5.x, so my mistake. You'll need to use
iostat to see what's busy with your disks. I strongly recommend going
to 6.x if you can.

Eric

-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------
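P.S. For anyone who wants to reproduce Nicolas's test, the filler
program is roughly like this. This is my own reconstruction, not his
actual code - the filename is made up, and the point is just that
write(2) starts failing with EDQUOT once the quota is hit:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	char buf[65536];
	int fd;

	/* Append to a scratch file in the quota'd home directory. */
	fd = open("quota-filler.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
	if (fd < 0) {
		perror("open");
		return (1);
	}
	memset(buf, 'x', sizeof(buf));

	/*
	 * Write continuously. Once the quota is exceeded, every
	 * write(2) fails with EDQUOT, but we keep trying - that
	 * retry loop is what hammers the nfsd threads on the server.
	 */
	for (;;) {
		if (write(fd, buf, sizeof(buf)) < 0)
			perror("write");
	}
}

Run it as "biguser" on an NFS client while watching the server with
"iostat 1" (gstat being 5.x and later) to see whether the disk holding
the full filesystem is the busy one.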