From: "Andreas Feid"
Date: Wed, 28 Jul 2010 08:52:01 +0200
To: krad, freebsd-questions@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: possible NFS lockups
Message-ID: <20100728065201.234030@gmx.net>

I have a few remarks and questions. What happens when the system is in
this state? Does access to the mount fail and then recover after a
while, or do you need to remount? Under normal conditions access should
be restored automatically. The error message by itself indicates a busy
server and should clear up after a while, as you have seen. How
frequently do you see the error: once per hour, once per day?
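To put numbers on the frequency question, here is a rough Python sketch that pairs the "not responding" and "is alive again" messages quoted below and reports how long each outage lasted. It assumes the kernel messages also reach a syslog file (for example /var/log/messages) with the usual "Mon DD HH:MM:SS host kernel:" prefix. The function name `nfs_outages` and the `year` parameter are my own invention, not part of any existing tool.

```python
import re
from datetime import datetime

# Assumed syslog layout (timestamp, host, tag, then the kernel message):
#   "<Mon DD HH:MM:SS> <host> kernel: nfs server HOST:/path: not responding"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*"
    r"nfs server (?P<server>\S+): (?P<event>not responding|is alive again)\s*$"
)


def nfs_outages(lines, year=2010):
    """Pair 'not responding' / 'is alive again' events per server.

    Returns a list of (server, start, end, seconds) tuples. syslog
    timestamps carry no year, so the caller supplies one via `year`.
    """
    pending = {}   # server -> time of the first unanswered 'not responding'
    outages = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not an NFS status line
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        server = m["server"]
        if m["event"] == "not responding":
            # Keep only the earliest report of an ongoing outage.
            pending.setdefault(server, ts)
        elif server in pending:
            start = pending.pop(server)
            outages.append((server, start, ts, (ts - start).total_seconds()))
    return outages
```

Feeding it the lines of the log file (for example `nfs_outages(open("/var/log/messages"))`) gives one tuple per outage, which makes it easy to see whether the hangs cluster at particular times of day.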

If you say filer, I assume you are talking about a NetApp filer. It
might be worth taking a perfstat when the error happens, while the
condition exists. I think dtrace will not really help, since this looks
like a server issue to me. As the filer is used to store mail, I assume
we are talking about qmail or a similar environment with a huge number
of small files, so I would like to know what the directory structure
looks like on the filer. If possible, get a perfstat and send me the
directory structure off-list, and I will have a look.

-Andreas

-------- Original Message --------
> Date: Tue, 27 Jul 2010 20:55:42 +0100
> From: krad
> To: freebsd-hackers@freebsd.org, FreeBSD Questions
> Subject: Re: possible NFS lockups

> On 27 July 2010 16:29, krad wrote:
>
> > I have a production mail system with an NFS backend. Every now and again we
> > see NFS die on a particular head end. However, it doesn't die across all
> > the nodes. This suggests to me there isn't an issue with the filer itself, and
> > the stats from the filer concur with that.
> >
> > The symptoms are lines like this appearing in dmesg:
> >
> > nfs server 10.44.17.138:/vol/vol1/mail: not responding
> > nfs server 10.44.17.138:/vol/vol1/mail: is alive again
> >
> > Trussing df, it seems to hang on getfsstat; this is presumably when it tries
> > the NFS mounts.
> >
> > e.g.
> >
> > __sysctl(0xbfbfe224,0x2,0xbfbfe22c,0xbfbfe230,0x0,0x0) = 0 (0x0)
> > mmap(0x0,1048576,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1746583552 (0x681ac000)
> > mmap(0x682ac000,344064,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1747632128 (0x682ac000)
> > munmap(0x681ac000,344064) = 0 (0x0)
> > getfsstat(0x68201000,0x1270,0x2,0xbfbfe960,0xbfbfe95c,0x1) = 9 (0x9)
> >
> > I have played with mount options a fair bit but they don't make much
> > difference.
> > This is what they are set to at present:
> >
> > 10.44.17.138:/vol/vol1/mail /mail/0 nfs rw,noatime,tcp,acdirmax=320,acdirmin=180,acregmax=320,acregmin=180 0 0
> >
> > When this locking is occurring, I find that if I do a showmount or mount
> > 10.44.17.138:/vol/vol1/mail again under another mount point, I can access
> > it fine.
> >
> > One thing I have just noticed is that lockd and statd always seem to have
> > died when this happens. Restarting does not help.
> >
> > I find all this a bit perplexing. Can anyone offer any help as to why this
> > might be happening? I have dtrace compiled into the kernel if that could
> > help with debugging.
>
> Sorry, I missed a bit of critical info:
>
> # uname -a
> FreeBSD X 8.1-STABLE FreeBSD 8.1-STABLE #2: Mon Jul 26 16:10:19 BST 2010 root@mk-pimap-7.b2b.uk.tiscali.com:/usr/obj/usr/src/sys/DTRACE i