From owner-freebsd-fs@FreeBSD.ORG Tue Apr 11 17:43:17 2006
From: Nicolas KOWALSKI
To: freebsd-fs@FreeBSD.org
Date: Tue, 11 Apr 2006 19:43:10 +0200
Subject: Re: [patch] giant-less quotas for UFS
In-Reply-To: <443BC755.1080905@centtech.com>
References: <20060329152608.GB1375@deviant.kiev.zoral.com.ua> <20060410144904.GC1408@deviant.kiev.zoral.com.ua> <443A7C8E.4020203@centtech.com> <443A8842.6060802@centtech.com> <443A97F9.8090601@centtech.com> <443B8EC1.8080004@centtech.com> <443BC755.1080905@centtech.com>
Eric Anderson writes:

> Nicolas KOWALSKI wrote:
>> Eric Anderson writes:
>>
>>> Nicolas KOWALSKI wrote:
>>>> Eric Anderson writes:
>>>>
>>>>> Nicolas KOWALSKI wrote:
>>>>>> Eric Anderson writes:
>>>>>>
>>>>>>> Nicolas KOWALSKI wrote:
>>>>>>>> Yes, this is exactly what is happening. To be more precise, some
>>>>>>>> students here use calculation applications that allocate a lot of
>>>>>>>> disk space, usually more than their allowed home quotas; when by
>>>>>>>> mistake they launch these applications in their home directories
>>>>>>>> instead of their workstation's dedicated space, the server is
>>>>>>>> brought to its knees on the NFS client side.
>>>>>>>
>>>>>>> When you say 'to its knees' - what do you mean exactly? How many
>>>>>>> clients do you have, how much memory is on the server, and how many
>>>>>>> nfsd threads are you using? What kind of load average do you see
>>>>>>> during this (on the server)?
>>>>>>
>>>>>> Sorry for the imprecision.
>>>>>> The server is a dual-Xeon 2.8GHz with 2GB of RAM, using an Ultra320
>>>>>> SCSI controller and 76GB SCSI-3 disks. It is accessed over NFS by
>>>>>> ~100 Unix (Linux, Solaris) clients, and over Samba by ~15 Windows XP
>>>>>> machines. The network connection is gigabit Ethernet.
>>>>>> During slowdowns, the server stops responding only from the NFS
>>>>>> clients' point of view. For example, a simple 'ls' in my home
>>>>>> directory is normally almost immediate, but during a slowdown it can
>>>>>> take up to 2 minutes. On the server, the load average rises to 0.5,
>>>>>> compared to a usual maximum of 0.15-0.20. top shows the nfsd
>>>>>> processes in the "biowr" state, but nothing is actually written,
>>>>>> because the quota system blocks any further writes by the user
>>>>>> exceeding his/her quota.
>>>>>
>>>>> In this case (which is what I suspected), try bumping your nfsd
>>>>> threads up to 128. I set mine very high (I have around 1000 clients),
>>>>> and I can say there are no real ill effects besides a bit of memory
>>>>> usage (which you have plenty of). I suspect increasing the threads
>>>>> will neutralize this problem for you.
>>>>
>>>> Using 128 nfsd threads, I stressed the server by running, on an NFS
>>>> client, a small C program that writes continuously to a file, so that
>>>> the user "biguser" (whose account is stored on /export/home2) exceeds
>>>> his quota.
>>>> It half-works: during the test, users working on another disk
>>>> (/export/home) did not see any difference, but users working on the
>>>> same disk as "biguser" (/export/home2) were almost halted.
>>>> So this is better, because before, everybody was halted, but there is
>>>> still a problem.
>>>> Any other tips?
>>>
>>> Watch gstat during the testing, and see if the disk that holds the
>>> full partition is really busy. I'm betting it's thrashing the disk,
>>> continually checking for free space. I don't think there's any way to
>>> avoid that.
>>
>> Mh, I did not find this "gstat" tool on the system or in the ports;
>> perhaps it is in >= 5.x? (The server is running 4.11-p15.)
>> It is sad that I cannot do anything about it: such a server brought
>> down by a single NFS client. :-(
>
> *sigh* - I should really pay more attention to the beginning of the
> thread. I thought you were on 5.x, so my mistake. You'll need to use
> iostat to see what's busy with your disks.

OK, I'll check with that. Thanks.

> I strongly recommend going to 6.x if you can..

Yep, it's planned, but only in a few months, when all the students and
teachers have gone to the beach. ;-)
Just for my knowledge, what will improve the situation? Better locking?
If I read the start of the thread correctly, it looks like the current
quota system is still under the Giant lock, so...

-- 
Nicolas