Date: Sat, 30 Nov 2002 19:56:42 +0100 (CET)
From: Michal Mertl <mime@traveller.cz>
To: current@freebsd.org
Subject: Re: system locks with vnode backed md(4)
Message-ID: <Pine.BSF.4.41.0211301949200.88595-200000@prg.traveller.cz>
[-- Attachment #1 --]
OK, I got another one. DDB output is attached. I tried all kinds of
operations to trigger it: I had 3 md devices mounted from the same
directory, ran my ports.tgz torture test in 2 of them, and had
'find . -inum 1231231' running in the root file system. One find finished
successfully, but then the system finally locked up.
> Yeah, vnode locks and other lockmgr locks don't show up in 'show locks',
> since only SMPng locking primitives are tracked by WITNESS.
I see.
> There are a fair number of vnode locking deadlock scenarios that are
> unavoidable where we rely on grabbing vnode locks out of the directory
> structure lock order. This occurs for vnode-backed md devices, quotas,
> and UFS1 extended attributes, and probably some other situations. I
> suspect that Terry is correct that operations on the vnode backing file
> storage directory are triggering the problem, since that increases the
> chances that a vnode lock "race to root" will occur from both the file
> system backed into the md device, and for the md backing vnodes during
> blocking I/O. If you can avoid directory operations on the md backing
> directory, that would probably be one way to avoid triggering the bug.
I'm afraid this doesn't sound too good to me.
> Seeing it reproduced would probably confirm that this is the case. On the
> other hand, there may be other deadlocks in the vnode/ufs/md code that can
> be more easily corrected than this general VFS problem, so details there
> would be very useful.
Maybe the attached output will allow someone to track something down.
PS: Sorry if you have problems with the attachment; I find them difficult
to read myself. (I'm receiving the digest of this list. Isn't there a
possibility to have it in MIME form, like Bugtraq and others?)
--
Michal Mertl
mime@traveller.cz
[-- Attachment #2 --]
db> show lockedvnods
Locked vnodes
0xc3ef8b90: tag ufs, type VDIR, usecount 1, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 1993
ino 6593, on dev ad0s1a (4, 4)
0xc3efea68: tag ufs, type VREG, usecount 2, writecount 0, refcount 0, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 599
ino 6596, on dev ad0s1a (4, 4)
0xc4ac7818: tag devfs, type VCHR, usecount 1297, writecount 0, refcount 30, flags (VV_OBJBUF), lock type devfs: EXCL (count 1) by pid 8
0xc574f250: tag ufs, type VREG, usecount 2, writecount 1, refcount 8, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 1919
ino 7, on dev ad0s2e (4, 10)
0xc3f3c940: tag ufs, type VREG, usecount 2, writecount 1, refcount 225, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 439
ino 3, on dev ad0s2e (4, 10)
0xc559d940: tag ufs, type VREG, usecount 0, writecount 0, refcount 0, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 1965
ino 4, on dev ad0s2e (4, 10)
0xc5c84cb8: tag ufs, type VREG, usecount 2, writecount 1, refcount 426, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 810
ino 5, on dev ad0s2e (4, 10)
0xc4a5a000: tag ufs, type VDIR, usecount 0, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 1990
ino 96988, on dev md0 (4, 12)
0xc444f128: tag ufs, type VDIR, usecount 1, writecount 0, refcount 2, lock type ufs: EXCL (count 1) by pid 1964
ino 14330, on dev md0 (4, 12)
0xc4454818: tag ufs, type VNON, usecount 1, writecount 0, refcount 0, lock type ufs: EXCL (count 1) by pid 1964
ino 14336, on dev md0 (4, 12)
0xc5183de0: tag ufs, type VDIR, usecount 0, writecount 0, refcount 2, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 1992
ino 5090, on dev md1 (4, 14)
db> show locks
exclusive sleep mutex Giant r = 0 (0xc0399660) locked @ ../../../kern/kern_intr.c:534
db> ps
pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
1993 c4819938 e1797000 0 1737 1993 0004002 norm[SLPQ newbuf c03ede10][SLP] find
1992 c481b3b0 e179c000 0 547 1992 0004002 norm[SLPQ getblk ce3eabb8][SLP] rm
1990 c4819b10 e1798000 0 1962 1990 0004002 norm[SLPQ getblk ce2d1ba4][SLP] cp
1965 c4819000 e174e000 0 1964 1964 0006002 norm[SLPQ newbuf c03ede10][SLP] gzip
1964 c481b938 e179f000 0 436 1964 0004002 norm[SLPQ biord ce2f743c][SLP] tar
1962 c4b57000 e17c3000 0 1958 1962 2004002 norm[SLPQ pause e17c3000][SLP] csh
1958 c4b571d8 e17c4000 1001 1779 1958 0004102 norm[SLPQ wait c4b571d8][SLP] su
1919 c48193b0 e175e000 0 0 0 0000204 norm[SLPQ newbuf c03ede10][SLP] md2
1779 c4819588 e175f000 1001 1778 1779 2004002 norm[SLPQ pause e175f000][SLP] tcsh
1778 c4b59000 e17cb000 1001 1775 360 0000100 norm[CVQ select c039cbe4][SLP] sshd
1775 c4b59588 e1804000 0 360 360 0000100 norm[SLPQ sbwait c3f00e64][SLP] sshd
1737 c4b591d8 e17cc000 0 1736 1737 2004002 norm[SLPQ pause e17cc000][SLP] csh
1736 c48191d8 e174f000 1001 458 1736 0004102 norm[SLPQ wait c48191d8][SLP] su
810 c4b57ce8 e17ca000 0 0 0 0000204 norm[SLPQ newbuf c03ede10][SLP] md1
599 c4b57b10 e17c9000 0 597 599 0004002 norm[SLPQ newbuf c03ede10][SLP] systat
597 c4b57760 e17c7000 0 415 597 2004002 norm[SLPQ pause e17c7000][SLP] csh
547 c4819760 e1760000 0 545 547 2004002 norm[SLPQ pause e1760000][SLP] csh
545 c403ace8 e1711000 1001 512 545 0004102 norm[SLPQ wait c403ace8][SLP] su
512 c481b760 e179e000 1001 511 512 2004002 norm[SLPQ pause e179e000][SLP] tcsh
511 c3e4a1d8 e04dd000 1001 508 360 0000100 norm[CVQ select c039cbe4][SLP] sshd
508 c481b000 e179a000 0 360 360 0000100 norm[SLPQ sbwait c3eff764][SLP] sshd
458 c3e4d760 e051e000 1001 457 458 2004002 norm[SLPQ pause e051e000][SLP] tcsh
457 c3e4a3b0 e04de000 1001 454 360 0000100 norm[CVQ select c039cbe4][SLP] sshd
454 c3d27b10 e04d2000 0 360 360 0000100 norm[SLPQ sbwait c3eff964][SLP] sshd
439 c3e4d3b0 e051c000 0 0 0 0000204 norm[SLPQ newbuf c03ede10][SLP] md0
436 c403a000 e170a000 0 435 436 2004002 norm[SLPQ pause e170a000][SLP] csh
435 c3e4dce8 e0521000 1001 427 435 0004102 norm[SLPQ wait c3e4dce8][SLP] su
427 c3e4d938 e051f000 1001 426 427 2004002 norm[SLPQ pause e051f000][SLP] tcsh
426 c3e4db10 e0520000 1001 423 360 0000100 norm[CVQ select c039cbe4][SLP] sshd
423 c403a1d8 e170b000 0 360 360 0000100 norm[SLPQ sbwait c3eff264][SLP] sshd
422 c403a3b0 e170c000 0 1 422 0004002 norm[SLPQ ttyin c3fd8610][SLP] getty
421 c403a588 e170d000 0 1 421 0004002 norm[SLPQ ttyin c3fd8a10][SLP] getty
420 c403a760 e170e000 0 1 420 0004002 norm[SLPQ ttyin c3fd8e10][SLP] getty
419 c403a938 e170f000 0 1 419 0004002 norm[SLPQ ttyin c3fd9210][SLP] getty
418 c403ab10 e1710000 0 1 418 0004002 norm[SLPQ ttyin c3efa210][SLP] getty
417 c3e4a588 e04df000 0 1 417 0004002 norm[SLPQ ttyin c3e47c10][SLP] getty
416 c3e4ab10 e04e2000 0 1 416 0004002 norm[SLPQ ttyin c3e47810][SLP] getty
415 c3e4a760 e04e0000 0 1 415 0004102 norm[SLPQ wait c3e4a760][SLP] login
405 c3e4a000 e04dc000 0 1 405 0000000 norm[SLPQ nanslp c03c8e34][SLP] cron
360 c3e4a938 e04e1000 0 1 360 0000100 norm[CVQ select c039cbe4][SLP] sshd
210 c3d27ce8 e04d3000 0 1 210 0000000 norm[CVQ select c039cbe4][SLP] syslogd
33 c3e4ace8 e04e3000 0 0 0 0000204 norm[SLPQ vlruwt c3e4ace8][SLP] vnlru
9 c3e4d000 e051a000 0 0 0 0000204 norm[SLPQ newbuf c03ede10][SLP] syncer
8 c3e4d1d8 e051b000 0 0 0 0000204 norm[SLPQ wdrain c03ede0c][SLP] bufdaemon
7 c3cc0588 d781b000 0 0 0 000020c norm[SLPQ pgzero c03ef2a4][SLP] pagezero
6 c3cc0760 d7852000 0 0 0 0000204 norm[SLPQ psleep c03ef2bc][SLP] vmdaemon
5 c3cc0938 d7853000 0 0 0 0000204 norm[SLPQ psleep c03a9718][SLP] pagedaemon
32 c3cc0b10 d7854000 0 0 0 0000204 new [IWAIT] irq8: rtc
31 c3cc0ce8 d7855000 0 0 0 0000204 new [IWAIT] irq0: clk
30 c3d27000 e04cc000 0 0 0 0000204 new [IWAIT] irq4: sio0
29 c3d271d8 e04cd000 0 0 0 0000204 norm[IWAIT] swi0: tty:sio
28 c3d273b0 e04ce000 0 0 0 0000204 norm[CPU 0] irq1: atkbd0
27 c3d27588 e04cf000 0 0 0 0000204 new [IWAIT] irq15: ata1
26 c3d27760 e04d0000 0 0 0 0000204 norm[IWAIT] irq14: ata0
25 c3d27938 e04d1000 0 0 0 0000204 new [IWAIT] irq5: pcm0
24 c120e1d8 d6640000 0 0 0 0000204 norm[IWAIT] irq10: rl0 bktr3+++
23 c120e3b0 d6641000 0 0 0 0000204 new [IWAIT] irq6: bktr2 bktr5++
22 c120e588 d6642000 0 0 0 0000204 new [IWAIT] irq7: bktr1 bktr4++
21 c120e760 d6643000 0 0 0 0000204 new [IWAIT] irq11: bktr0 bktr7*
20 c120e938 d6644000 0 0 0 0000204 new [IWAIT] irq13:
19 c120eb10 d6645000 0 0 0 0000204 new [IWAIT] swi5: acpitaskq
18 c120ece8 d6646000 0 0 0 0000204 new [IWAIT] swi3: cambio
17 c3cc0000 d7818000 0 0 0 0000204 new [IWAIT] swi2: camnet
16 c3cc01d8 d7819000 0 0 0 0000204 new [IWAIT] swi5: task queue
15 c3cc03b0 d781a000 0 0 0 0000204 norm[SLPQ sleep c03b5040][SLP] random
4 c1207000 d65cb000 0 0 0 0000204 norm[SLPQ g_down c03934b0][SLP] g_down
3 c12071d8 d6638000 0 0 0 0000204 norm[SLPQ g_up c03934ac][SLP] g_up
2 c12073b0 d6639000 0 0 0 0000204 norm[SLPQ g_events c03934a4][SLP] g_event
14 c1207588 d663a000 0 0 0 0000204 new [IWAIT] swi4: vm
13 c1207760 d663b000 0 0 0 000020c norm[RUNQ] swi6: tty:sio clock
12 c1207938 d663c000 0 0 0 0000204 norm[IWAIT] swi1: net
11 c1207b10 d663d000 0 0 0 000020c norm[Can run] idle
1 c1207ce8 d663e000 0 0 1 0004200 norm[SLPQ wait c1207ce8][SLP] init
10 c120e000 d663f000 0 0 0 0000204 norm[CVQ ktrace c03c50c4][SLP] ktrace
0 c0394720 c04ff000 0 0 0 0000200 norm[SLPQ sched c0394720][SLP] swapper
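Since lockmgr locks do not appear in 'show locks', one way to read a dump like the one above is to join the 'show lockedvnods' holders against the 'ps' wait messages by pid. The following is a minimal sketch of that join, not an existing tool: the function names (parse_locked_vnodes, parse_ps, correlate) are hypothetical, and the regular expressions assume the exact line layout shown in this attachment. The sample input is copied from two entries of the dump.

```python
import re

# Assumed layout: "0x...: tag ufs, type VDIR, ... by pid N"
VNODE_RE = re.compile(r"^(0x[0-9a-f]+): tag (\w+), type (\w+),.*by pid (\d+)$")
# Assumed layout of the follow-up line: "ino N, on dev NAME (maj, min)"
DEV_RE = re.compile(r"^ino (\d+), on dev (\w+)")
# Assumed layout of a sleeping process in 'ps':
# pid proc addr uid ppid pgrp flag stat[QUEUE wmesg wchan][SLP] cmd
PS_RE = re.compile(
    r"^(\d+)\s+[0-9a-f]+\s+[0-9a-f]+\s+\d+\s+\d+\s+\d+\s+[0-9a-f]+\s+"
    r"\w+\s*\[(?:SLPQ|CVQ)\s+(\w+)\s+[0-9a-f]+\]\[SLP\]\s+(.+)$"
)


def parse_locked_vnodes(text):
    """Return one dict per locked vnode: address, type, holder pid, device."""
    holders = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = VNODE_RE.match(line)
        if m:
            current = {"vnode": m.group(1), "type": m.group(3),
                       "pid": int(m.group(4)), "dev": None}
            holders.append(current)
            continue
        m = DEV_RE.match(line)
        if m and current is not None:
            current["dev"] = m.group(2)
            current = None
    return holders


def parse_ps(text):
    """Map pid -> (wmesg, cmd) for every sleeping process line that matches."""
    procs = {}
    for raw in text.splitlines():
        m = PS_RE.match(raw.strip())
        if m:
            procs[int(m.group(1))] = (m.group(2), m.group(3))
    return procs


def correlate(vnode_text, ps_text):
    """List (pid, cmd, wmesg, vnode type, dev) for each vnode lock holder."""
    procs = parse_ps(ps_text)
    rows = []
    for h in parse_locked_vnodes(vnode_text):
        wmesg, cmd = procs.get(h["pid"], ("?", "?"))
        rows.append((h["pid"], cmd, wmesg, h["type"], h["dev"]))
    return rows


SAMPLE_VNODES = """\
0xc3ef8b90: tag ufs, type VDIR, usecount 1, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by pid 1993
ino 6593, on dev ad0s1a (4, 4)
0xc444f128: tag ufs, type VDIR, usecount 1, writecount 0, refcount 2, lock type ufs: EXCL (count 1) by pid 1964
ino 14330, on dev md0 (4, 12)
"""

SAMPLE_PS = """\
1993 c4819938 e1797000 0 1737 1993 0004002 norm[SLPQ newbuf c03ede10][SLP] find
1964 c481b938 e179f000 0 436 1964 0004002 norm[SLPQ biord ce2f743c][SLP] tar
"""

for row in correlate(SAMPLE_VNODES, SAMPLE_PS):
    print(row)
```

Applied to the full dump, this kind of join makes it easier to see which lock holders are themselves asleep and on what wait message (for example newbuf, getblk, or biord in the 'ps' listing above). It only reorganizes what DDB printed and does not by itself identify the lock cycle.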
