From owner-freebsd-fs@FreeBSD.ORG Tue Apr 11 17:43:17 2006
From: Nicolas KOWALSKI
To: freebsd-fs@FreeBSD.org
Date: Tue, 11 Apr 2006 19:43:10 +0200
Subject: Re: [patch] giant-less quotas for UFS
In-Reply-To: <443BC755.1080905@centtech.com>
References: <20060329152608.GB1375@deviant.kiev.zoral.com.ua> <20060410144904.GC1408@deviant.kiev.zoral.com.ua> <443A7C8E.4020203@centtech.com> <443A8842.6060802@centtech.com> <443A97F9.8090601@centtech.com> <443B8EC1.8080004@centtech.com> <443BC755.1080905@centtech.com>
Eric Anderson writes:

> Nicolas KOWALSKI wrote:
>> Eric Anderson writes:
>>
>>> Nicolas KOWALSKI wrote:
>>>> Eric Anderson writes:
>>>>
>>>>> Nicolas KOWALSKI wrote:
>>>>>> Eric Anderson writes:
>>>>>>
>>>>>>> Nicolas KOWALSKI wrote:
>>>>>>>> Yes, this is exactly what is happening. To be more precise, some
>>>>>>>> students here use calculation applications that allocate a lot of
>>>>>>>> disk space, usually more than their allowed home quotas; when by
>>>>>>>> mistake they launch these applications in their home directories
>>>>>>>> instead of their workstation's dedicated space, the server is
>>>>>>>> brought to its knees on the NFS client side.
>>>>>>>
>>>>>>> When you say 'to its knees' - what do you mean exactly? How many
>>>>>>> clients do you have, how much memory is on the server, and how many
>>>>>>> nfsd threads are you using? What kind of load average do you see
>>>>>>> during this (on the server)?
>>>>>>
>>>>>> Sorry for the imprecision.
>>>>>> The server is a dual-Xeon 2.8GHz with 2GB of RAM, using an Ultra320
>>>>>> SCSI controller and 76GB SCSI-3 disks. It is accessed over NFS by
>>>>>> ~100 Unix (Linux, Solaris) clients, and over Samba by ~15 Windows XP
>>>>>> machines. The network connection is gigabit Ethernet.
>>>>>> During slowdowns, the server stops responding only from the NFS
>>>>>> clients' point of view. For example, a simple 'ls' in my home
>>>>>> directory is normally almost immediate, but during a slowdown it can
>>>>>> take up to 2 minutes. On the server, the load average rises to 0.5,
>>>>>> compared to a usual maximum of 0.15-0.20. top shows the nfsd
>>>>>> processes in the "biowr" state, but nothing is actually written,
>>>>>> because the quota system blocks any further writes by the user
>>>>>> exceeding his/her quota.
>>>>>
>>>>> In this case (which is what I suspected), try bumping your nfsd
>>>>> threads up to 128. I set mine very high (I have around 1000 clients),
>>>>> and I can say there are no real ill effects besides a bit of memory
>>>>> usage (which you have plenty of). I suspect increasing the threads
>>>>> will neutralize this problem for you.
>>>>
>>>> Using 128 nfsd threads, I stressed the server by running, on an NFS
>>>> client, a small C program that writes continuously to a file, so that
>>>> the user "biguser" (whose account is stored on /export/home2) exceeds
>>>> his quota.
>>>> It half-works: during the test, users working on another disk
>>>> (/export/home) did not see any difference, but users working on the
>>>> same disk as "biguser" (/export/home2) were almost halted.
>>>> So this is better, because before, everybody was halted, but there is
>>>> still a problem.
>>>> Any other tips?
>>>
>>> Watch gstat during the testing, and see if the disk that holds the
>>> full partition is really busy. I'm betting it's thrashing the disk,
>>> continually checking for free space. I don't think there's any way to
>>> avoid that.
>>
>> Mh, I did not find this "gstat" tool on the system or in the ports;
>> perhaps it is in >= 5.x? (The server is running 4.11-p15.)
>> It is sad that I cannot do anything about it: such a server brought
>> down by a single NFS client. :-(
>
> *sigh* - I should really pay more attention to the beginning of the
> thread. I thought you were on 5.x, so my mistake. You'll need to use
> iostat to see what's busy with your disks.

OK, I'll check with that. Thanks.

> I strongly recommend going to 6.x if you can..

Yep, it's planned, but only in a few months, when all the students and
teachers have gone to the beach. ;-)
Just for my knowledge, what will improve the situation? Better locking?
If I read the start of the thread correctly, it looks like the current
quota system is still under the Giant lock, so...

-- 
Nicolas