Date: Thu, 6 Oct 2005 01:38:53 -0400
From: Kris Kennaway <kris@obsecurity.org>
To: Don Lewis <truckman@FreeBSD.org>
Cc: mi+mx@aldan.algebra.com, re@FreeBSD.org, current@FreeBSD.org, openoffice@FreeBSD.org
Subject: Re: 6.0 hangs (while building OOo)
Message-ID: <20051006053853.GA58630@xor.obsecurity.org>
In-Reply-To: <200510050220.j952KD25025940@gw.catspoiler.org>
References: <200510041328.05637.mi%2Bmx@aldan.algebra.com> <200510050220.j952KD25025940@gw.catspoiler.org>

On Tue, Oct 04, 2005 at 07:20:13PM -0700, Don Lewis wrote:
> On 4 Oct, Mikhail Teterin wrote:
> > On Tuesday 04 October 2005 13:08, Don Lewis wrote:
> >> Hung trying to lock a vnode ...
> >>
> >> What other processes are in the D state, and what is their wchan info?
> >
> > mi@roo:~ (301) ps -lax | awk 'match($10, "D")'
> > 0     2     0   0  -8 0    0    8 -      DL ??  0:06,50 [g_event]
> > 0     3     0   0  -8 0    0    8 -      DL ??  0:39,71 [g_up]
> > 0     4     0   0  -8 0    0    8 -      DL ??  0:31,21 [g_down]
> > 0     5     0   0   8 0    0    8 -      DL ??  0:00,00 [thread taskq]
> > 0     6     0   0   8 0    0    8 -      DL ??  0:00,00 [kqueue taskq]
> > 0     7     0   0  96 0    0    8 idle   DL ??  0:00,00 [aic_recovery0]
> > 0     8     0   0  96 0    0    8 idle   DL ??  0:00,00 [aic_recovery0]
> > 0     9     0   0  96 0    0    8 idle   DL ??  0:00,00 [aic_recovery1]
> > 0    10     0   0 -16 0    0    8 ktrace DL ??  0:00,00 [ktrace]
> > 0    39     0   0 -16 0    0    8 -      DL ??  0:09,21 [yarrow]
> > 0    44     0   0   8 0    0    8 usbevt DL ??  0:00,01 [usb0]
> > 0    45     0   0   8 0    0    8 usbtsk DL ??  0:00,00 [usbtask]
> > 0    46     0   0  96 0    0    8 idle   DL ??  0:00,00 [aic_recovery1]
> > 0    47     0   0  -8 0    0    8 -      DL ??  0:00,91 [fdc0]
> > 0    49     0   0 -16 0    0    8 psleep DL ??  0:03,51 [pagedaemon]
> > 0    50     0   0  20 0    0    8 psleep DL ??  0:00,00 [vmdaemon]
> > 0    51     0   0 171 0    0    8 pgzero DL ?? 12:19,32 [pagezero]
> > 0    52     0   0 -16 0    0    8 psleep DL ??  0:06,55 [bufdaemon]
> > 0    53     0   0  20 0    0    8 syncer DL ??  1:00,40 [syncer]
> > 0    54     0   0  -4 0    0    8 vlruwt DL ??  0:03,16 [vnlru]
> > 0    55     0   0 -64 0    0    8 -      DL ??  0:11,48 [schedcpu]
> > 0   115     0   0  -8 0    0    8 mdwait DL ??  0:05,75 [md7]
> > 0 45773 45771   0  -4 0 1740 1208 ufs    D  p1  0:00,32 dmake
> > 0 45806 45788 350  -4 0 1548  632 ufs    D  p1  0:00,00 /bin/tcsh -fc zipdep.pl -u -j ../../../
> > 0 65072 64985 271  -4 0 1248  480 ufs    D  p1  0:00,00 /bin/tcsh -fc if ( -e ../../../unxfbsd.p
> > 0 65327  8694   0  -4 0 1432  908 ufs    D+ p2  0:02,05 find work/ -name provider.o
>
> Mikhail and I have been looking at this offline and have discovered the
> following:
>
>     The wedged processes are waiting for vnode locks in the file
>     name lookup path for the access() and lstat() syscalls.
>
>     There are two locked directories that are wedging these
>     processes.
>
>     We don't know what threads are holding the locks on these
>     directories, but we do know that it is none of the threads
>     associated with these processes, so it is not a classic deadlock
>     problem.

'show lockedvnods' doesn't help? There is code in -current that saves
stack traces when lockmgr locks are acquired, when DEBUG_LOCKS is
enabled - except it sometimes panics while trying to save the trace
because of a code bug. I remind jeffr about this on a more-or-less
daily basis, but he hasn't yet had time to commit the fix he has. It
may still be useful if this is easily reproducible.

> This problem appears to be some sort of vnode lock leak.

Leaked lockmgr locks usually panic when the thread exits.

Kris
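
The quoted ps listing mixes idle kernel threads with the four wedged user processes, which all sleep on the "ufs" wait channel. As a rough sketch only, the filter from the message can be narrowed to a single wait channel. The helper name blocked_on is hypothetical, and the field positions (wait channel in field 9, state in field 10) are an assumption taken from the column layout of the listing above; other ps versions may differ.

```sh
#!/bin/sh
# Hypothetical helper, not part of the original message.
# Reads `ps -lax` output on stdin and prints the PIDs of processes that are
# in disk wait (state starts with "D") on the wait channel named in $1.
# Assumes the layout shown in the quoted listing: wait channel is field 9,
# state is field 10, PID is field 2.
blocked_on() {
    awk -v wchan="$1" '$9 == wchan && $10 ~ /^D/ { print $2 }'
}

# Example: ps -lax | blocked_on ufs
```

Run against the listing above, this would leave only the dmake, tcsh and find processes, which are the ones stuck in the name lookup path.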