From: Mike Thomas
Date: Fri, 16 Jul 2004 12:50:53 -0500
To: freebsd-current@freebsd.org
Subject: nfsd problems with FreeBSD 5.2.1
Message-ID: <40F8157D.5040104@cems.umn.edu>

Hello,

Alright folks, I'm in serious need of help/advice. I'm running FreeBSD 5.2.1 (-current) with a kernel/buildworld run yesterday (7/16/2004) on a dual Xeon 3.06 GHz with hyperthreading enabled.
The machine also has 2 GB of RAM and a SCSI RAID array on an Intel storage RAID controller (iir0). The machine functions as a NIS client for accounts, with home directories NFS-mounted from a Solaris 9 machine. Its primary function is as a mail server, and what it shares out over NFS is the spool folder (/var/mail, in this case). I know all about the dangers of sharing out a mail spool; I don't need, or want, a lecture about proper operating procedures in this case. It's for legacy purposes and will be going away in due time. Anyway, it's with this mount that I am experiencing these NFS problems.

Now, to the nitty gritty. I am seeing periodic spikes from one of the nfsd children, from about 10% of the CPU (via top) to 100% of the CPU. During these spikes, even if the spike only reaches 40-50% of the CPU, the machine becomes debilitatingly slow and stops responding to all other commands. Even issuing an 'ls' is difficult, let alone doing anything productive. In top, the nfsd state alternates between biowr, biord, and *Giant (yeah, it is even requesting Giant locks).

I have recompiled every single ounce of software that operates on /var/mail to use only fcntl locking (procmail/postfix/uw-imap (there's a patch by Red Hat to do that)) so that it is NFS-friendly.

Here's what I've tried, to see if it made any difference. First, all mounts of /var/mail from other servers were using UDP; they have all been switched to TCP with an rsize and wsize of 1024. I've tried 4096 and 8192, neither of which makes any difference. All clients are specifically forced to use NFSv3. I have also tried switching between soft and hard mounts, also with no difference in these spikes. I also tried switching back to the 4BSD scheduler to see if that might have been the issue, but that didn't appear to make any difference either, though the max load average I was seeing stayed a bit lower with ULE as opposed to the 4BSD scheduler.

So, I'm really at the end of my rope right now. I have no idea what to do or what could be causing this. Any advice would be great, thanks.

--Mike
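
For readers unfamiliar with the fcntl locking mentioned in the post: fcntl (POSIX) locks are the kind an NFS client can generally pass on to the server's lock manager, which is why mail software on an NFS-shared spool is often rebuilt to use them. The following is only a minimal sketch of the idea in Python, not the code used by procmail, postfix, or uw-imap. The function name append_with_fcntl_lock and the demo file name are hypothetical, chosen purely for illustration.

```python
import fcntl
import os
import tempfile


def append_with_fcntl_lock(path, message):
    """Append bytes to path while holding an exclusive fcntl (POSIX) lock.

    Hypothetical helper for illustration only.
    """
    # Open for appending; create the file if it does not exist yet.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        # fcntl.lockf() wraps fcntl-style locking. Without LOCK_NB it
        # blocks until the exclusive lock on the whole file is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            os.write(fd, message)
            os.fsync(fd)  # flush to stable storage before unlocking
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Demo on a throwaway local file; a real spool would live in /var/mail.
    with tempfile.TemporaryDirectory() as tmp:
        mbox = os.path.join(tmp, "demo-mbox")
        append_with_fcntl_lock(mbox, b"From demo@example.invalid\n\nhello\n")
        print(os.path.getsize(mbox))
```

Whether such a lock is actually honored across hosts depends on the NFS lock daemons running on both client and server, so the sketch shows only the local API usage, not a guarantee of cross-host safety.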