From owner-freebsd-hackers@FreeBSD.ORG Tue Jul 27 19:55:44 2010
Date: Tue, 27 Jul 2010 20:55:42 +0100
From: krad
To: freebsd-hackers@freebsd.org, FreeBSD Questions
Subject: Re: possible NFS lockups
List-Id: Technical Discussions relating to FreeBSD

On 27 July 2010 16:29, krad wrote:

> I have a production mail system with an NFS backend. Every now and again we
> see NFS die on a particular head end. However, it doesn't die across all
> the nodes. This suggests to me there isn't an issue with the filer itself, and
> the stats from the filer concur with that.
>
> The symptoms are lines like this appearing in dmesg:
>
> nfs server 10.44.17.138:/vol/vol1/mail: not responding
> nfs server 10.44.17.138:/vol/vol1/mail: is alive again
>
> Trussing df, it seems to hang on getfsstat; this is presumably when it tries
> the NFS mounts,
>
> e.g.
>
> __sysctl(0xbfbfe224,0x2,0xbfbfe22c,0xbfbfe230,0x0,0x0) = 0 (0x0)
> mmap(0x0,1048576,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1746583552 (0x681ac000)
> mmap(0x682ac000,344064,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1747632128 (0x682ac000)
> munmap(0x681ac000,344064) = 0 (0x0)
> getfsstat(0x68201000,0x1270,0x2,0xbfbfe960,0xbfbfe95c,0x1) = 9 (0x9)
>
> I have played with the mount options a fair bit, but they don't make much
> difference. This is what they are set to at present:
>
> 10.44.17.138:/vol/vol1/mail /mail/0 nfs rw,noatime,tcp,acdirmax=320,acdirmin=180,acregmax=320,acregmin=180 0 0
>
> When this lockup is occurring, I find that if I run showmount, or mount
> 10.44.17.138:/vol/vol1/mail again under another mount point, I can access
> it fine.
>
> One thing I have just noticed is that lockd and statd always seem to have
> died when this happens. Restarting them does not help.
>
> I find all this a bit perplexing. Can anyone offer any insight into why this
> might be happening? I have DTrace compiled into the kernel if that could
> help with debugging.

Sorry, I missed a bit of critical info:

# uname -a
FreeBSD X 8.1-STABLE FreeBSD 8.1-STABLE #2: Mon Jul 26 16:10:19 BST 2010 root@mk-pimap-7.b2b.uk.tiscali.com:/usr/obj/usr/src/sys/DTRACE i
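
Since df blocks in getfsstat while the NFS mount is stalled, a check that touches the mount in-process can hang in the same way. The following is a minimal sketch, assuming Python 3 is available on the head end, that probes a mount point from a child process with a timeout, so the monitoring process itself never blocks. The helper name probe_mount and the 5-second default are illustrative assumptions, not part of any FreeBSD tool; /mail/0 is the mount point from the fstab line quoted above.

```python
import subprocess
import sys

# Child command: a single statvfs() call, which blocks if the NFS server
# behind the path is not responding.
_CHILD = "import os, sys; os.statvfs(sys.argv[1])"


def probe_mount(path, timeout=5.0):
    """Return 'ok', 'error', or 'timeout' for a statvfs() on path.

    probe_mount is a hypothetical helper written for this illustration.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not wait after kill(): a child stuck in an uninterruptible
        # NFS call may not exit until the server answers again.
        proc.kill()
        return "timeout"
    return "ok" if rc == 0 else "error"


if __name__ == "__main__":
    # Probe the mount points given as arguments, defaulting to /mail/0.
    for mount_point in sys.argv[1:] or ["/mail/0"]:
        print(mount_point, probe_mount(mount_point))
```

Run periodically, something like this could log the time of each "timeout" result for comparison with the "not responding" lines in dmesg and with the moment lockd and statd disappear. It only reports that the mount is stalled; it does not explain why.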