Date: Wed, 10 Mar 2010 00:19:30 +0100
From: Adrenalin <adrenalinup@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Doug Rabson <dfr@freebsd.org>, current@freebsd.org, freebsd-current@freebsd.org, pav@freebsd.org, Kip Macy <kmacy@freebsd.org>
Subject: Re: hang in rpccon from interrupting NFS operations (Re: pointyhat panic)
Message-ID: <f027bef41003091519h4e2b0827i28b95cc55701076c@mail.gmail.com>
In-Reply-To: <Pine.GSO.4.63.0906212119310.2430@muncher.cs.uoguelph.ca>
References: <1242075474.72992.118.camel@hood.oook.cz> <3c1674c90906151408n6febec56m140b089b694f6e13@mail.gmail.com> <20090616073353.GZ33280@droso.net> <200906160812.04284.jhb@freebsd.org> <4A3E234F.6050403@FreeBSD.org> <Pine.GSO.4.63.0906212119310.2430@muncher.cs.uoguelph.ca>

Hi,

I would like to know whether this bug has been fixed in FreeBSD 8.0-RELEASE, since I have already hit it three times on a busy box that uses NFS heavily (with lots of files).

Unfortunately my processes are not compiled with debug symbols (so I cannot get a backtrace), but all of the php-cgi processes are stuck in the "rpccon" state, just as described here. I cannot kill them and I cannot reboot cleanly; a manual restart is required.

FreeBSD g4.torrentsmd.com 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov 21 15:02:08 UTC 2009 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

 4063 www  1  52  0 82576K 26320K rpccon  3  1:40  0.00% php-cgi
 4078 www  1  48  0 83600K 26768K rpccon  1  1:37  0.00% php-cgi
 4129 www  1  52  0 83600K 26740K rpccon  1  1:31  0.00% php-cgi
 4159 www  1  55  0 82832K 26216K rpccon  0  1:24  0.00% php-cgi
 4184 www  1  54  0 90768K 34104K rpccon  0  1:16  0.00% php-cgi
 4174 www  1  50  0 82832K 23396K rpccon  0  1:15  0.00% php-cgi
 4258 www  1  55  0 82064K 24224K rpccon  1  1:06  0.00% php-cgi

I believe the error was triggered when this was logged:

Mar 9 20:00:31 sv kernel: nfs server s:/path/pah/paf: not responding
Mar 9 20:00:36 sv last message repeated 23 times

My fstab looks like this (I use the -b flag):

sv:/path/pah/paf /path/fap/hap/afh nfs rw,-b 0 0

Since it's a remote box, I'm afraid of botching a kernel rebuild to -STABLE, and I'm not even sure it would help. Do you have any suggestions?

Thank you.
Nicu.

On Mon, Jun 22, 2009 at 2:21 AM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
>
> On Sun, 21 Jun 2009, Kris Kennaway wrote:
>
>> Got another deadlock after upgrading. Again, busy NFS volume, and ^C'ing
>> a recursive find hung in rpccon state:
>>
>> db> bt 89596
>> Tracing pid 89596 tid 102493 td 0xffffff0089260000
>> sched_switch() at sched_switch+0x17c
>> mi_switch() at mi_switch+0x21d
>> sleepq_switch() at sleepq_switch+0x123
>> sleepq_timedwait() at sleepq_timedwait+0x4d
>> _sleep() at _sleep+0x301
>> clnt_reconnect_call() at clnt_reconnect_call+0x5d3
>> nfs_request() at nfs_request+0x225
>> nfs_statfs() at nfs_statfs+0x197
>> __vfs_statfs() at __vfs_statfs+0x28
>> kern_fstatfs() at kern_fstatfs+0x286
>> fstatfs() at fstatfs+0x34
>> syscall() at syscall+0x1af
>> Xfast_syscall() at Xfast_syscall+0xd0
>> --- syscall (397, FreeBSD ELF64, fstatfs), rip = 0x800726dcc, rsp =
>> 0x7fffffffe1a8, rbp = 0x1000 ---
>>
>> These are mounted with intr, I'll try disabling that next.
>>
> There are two sleeps in clnt_rc.c. One of them optionally does a PCATCH
> and returns when interrupted via ^C, but the other one (which it is
> sleeping on above), doesn't. I've emailed Kris a small patch that
> changes that for him to test.
>
> If anyone else wants to test the patch, just email me for a copy, rick
>
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>