Date: Tue, 27 Jul 2010 20:55:42 +0100
From: krad <kraduk@googlemail.com>
To: freebsd-hackers@freebsd.org, FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: possible NFS lockups
Message-ID: <AANLkTi=3LKv4DkaX_yHo5WfXK33YGYSAaOvqh5mjSVTV@mail.gmail.com>
In-Reply-To: <AANLkTinUVKByfTX%2Bf9DOQ97jh43VPVSug_=BDpJ9PB0z@mail.gmail.com>
References: <AANLkTinUVKByfTX%2Bf9DOQ97jh43VPVSug_=BDpJ9PB0z@mail.gmail.com>

On 27 July 2010 16:29, krad <kraduk@googlemail.com> wrote:

> I have a production mail system with an NFS backend. Every now and again we
> see NFS die on a particular head end. However, it doesn't die across all
> the nodes. This suggests to me there isn't an issue with the filer itself,
> and the stats from the filer concur with that.
>
> The symptoms are lines like this appearing in dmesg:
>
> nfs server 10.44.17.138:/vol/vol1/mail: not responding
> nfs server 10.44.17.138:/vol/vol1/mail: is alive again
>
> Trussing df, it seems to hang on getfsstat; this is presumably when it
> tries the NFS mounts,
>
> e.g.
>
> __sysctl(0xbfbfe224,0x2,0xbfbfe22c,0xbfbfe230,0x0,0x0) = 0 (0x0)
> mmap(0x0,1048576,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1746583552 (0x681ac000)
> mmap(0x682ac000,344064,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1747632128 (0x682ac000)
> munmap(0x681ac000,344064) = 0 (0x0)
> getfsstat(0x68201000,0x1270,0x2,0xbfbfe960,0xbfbfe95c,0x1) = 9 (0x9)
>
> I have played with the mount options a fair bit, but they don't make much
> difference. This is what they are set to at present:
>
> 10.44.17.138:/vol/vol1/mail /mail/0 nfs rw,noatime,tcp,acdirmax=320,acdirmin=180,acregmax=320,acregmin=180 0 0
>
> When this locking is occurring, I find that if I do a showmount or mount
> 10.44.17.138:/vol/vol1/mail again under another mount point, I can access
> it fine.
>
> One thing I have just noticed is that lockd and statd always seem to have
> died when this happens. Restarting does not help.
>
> I find all this a bit perplexing. Can anyone offer any help as to why this
> might be happening? I have DTrace compiled into the kernel if that could
> help with debugging.

Sorry, I missed a bit of critical info:

# uname -a
FreeBSD X 8.1-STABLE FreeBSD 8.1-STABLE #2: Mon Jul 26 16:10:19 BST 2010 root@mk-pimap-7.b2b.uk.tiscali.com:/usr/obj/usr/src/sys/DTRACE i
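
Since df blocks when an NFS mount stops answering, one way to check which mount point is stuck without hanging the shell is to do the file system stat call in a throwaway child process and give up after a timeout. The following is only a rough sketch of that idea, not something from the original report: the function name probe_mount, the 5 second default and the three result strings are all made up for illustration, and it uses Python's os.statvfs() on a single path rather than the getfsstat call that df makes.

```python
import subprocess
import sys

# Child program: one statvfs() call on the path passed as its first argument.
# (Hypothetical helper for illustration; not part of any FreeBSD tool.)
_CHILD = "import os, sys; os.statvfs(sys.argv[1])"


def probe_mount(path, timeout=5.0):
    """Return 'ok', 'error' or 'timeout' for one statvfs() on path."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked on a hard NFS mount may not
        # exit until the server answers, so we deliberately do not wait
        # for it again after sending the kill.
        proc.kill()
        return "timeout"
    # A non-zero exit means statvfs() raised (e.g. the path does not exist).
    return "ok" if rc == 0 else "error"
```

On the machine described above, calling probe_mount("/mail/0") from a cron job or a monitoring check would be one way to record when the mount stops responding, which could then be lined up against the "not responding" and "is alive again" messages in dmesg.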