Date: Sat, 30 Nov 2002 11:15:32 -0800
From: Kirk McKusick <mckusick@beastie.mckusick.com>
To: Sean Kelly <smkelly@zombie.org>
Cc: current@FreeBSD.ORG
Subject: Re: UFS Snapshot deadlock
Message-ID: <200211301915.gAUJFW59081565@beastie.mckusick.com>
In-Reply-To: Your message of "Wed, 30 Oct 2002 03:57:52 CST." <20021030095752.GA1868@edgemaster.zombie.org>
Your deadlock should now be fixed.
Kirk McKusick
=-=-=-=-=
From: Kirk McKusick <mckusick@FreeBSD.org>
Date: Fri, 29 Nov 2002 23:27:12 -0800 (PST)
To: cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: cvs commit: src/sys/ufs/ffs ffs_snapshot.c
X-FreeBSD-CVS-Branch: HEAD
mckusick 2002/11/29 23:27:12 PST
Modified files:
sys/ufs/ffs ffs_snapshot.c
Log:
Fix two deadlocks in snapshots:
1) Release the snapshot file lock while suspending the system. Otherwise
a process trying to read the lock may block on its containing directory
preventing the suspension from completing. Thanks to Sean Kelly
<smkelly@zombie.org> for finding this deadlock.
2) Replace some bdwrite's with bawrite's so as not to fill all the
buffers with dirty data. The buffers could not be cleaned as the
snapshot vnode was locked hence the system could deadlock when
making snapshots of really massive filesystems. Thanks to
Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp> for figuring
this out.
Sponsored by: DARPA & NAI Labs.
Revision  Changes    Path
1.51      +7 -2      src/sys/ufs/ffs/ffs_snapshot.c
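As a rough illustration of the second fix, the toy model below contrasts delayed writes with asynchronous writes against a fixed buffer pool. It is only a sketch under stated assumptions: `blocks_written_before_stall`, `write_policy`, and the pool sizes are hypothetical names and values, not code from ffs_snapshot.c. The model assumes nothing can flush dirty buffers while the snapshot is being built, and that an asynchronous write completes before its buffer is needed again.

```c
#include <assert.h>

/* Hypothetical policy names; they only mirror the idea of bdwrite vs. bawrite. */
enum write_policy { DELAYED_WRITE, ASYNC_WRITE };

/*
 * Toy model of a buffer pool with nbuf buffers while a snapshot writes
 * nblocks blocks. Returns how many blocks are written before the writer
 * stalls because no clean buffer is left.
 *
 * Assumption: the flusher cannot clean buffers (the snapshot vnode is
 * locked), so a delayed-write buffer stays dirty for the whole run.
 */
int
blocks_written_before_stall(int nbuf, int nblocks, enum write_policy policy)
{
	int dirty = 0;		/* buffers parked dirty, waiting for a flusher */
	int written;

	assert(nbuf > 0 && nblocks >= 0);

	for (written = 0; written < nblocks; written++) {
		if (dirty == nbuf)
			break;	/* pool exhausted and nobody can clean it */
		if (policy == DELAYED_WRITE)
			dirty++;	/* mark dirty and leave it for later */
		/*
		 * ASYNC_WRITE: the I/O is started right away; the model
		 * treats the buffer as clean again once it completes.
		 */
	}
	return (written);
}
```

With 8 buffers and 100 blocks, the delayed policy stalls after 8 writes in this model, while the asynchronous policy gets through all 100.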
=-=-=-=-=-=
Date: Wed, 30 Oct 2002 03:57:52 -0600
From: Sean Kelly <smkelly@zombie.org>
To: current@FreeBSD.ORG
Subject: UFS Snapshot deadlock
While playing with UFS snapshots on a UFS2 filesystem I mounted
specifically for this purpose, I encountered a little problem. It seems I
have processes deadlocked on each other.
Steps to repeat:
/# mount /dev/ad2a /mnt ; cd /mnt
/dev/ad2a on /mnt (ufs, local, soft-updates, multilabel) # UFS2
/mnt# cd /mnt; mount -u -o snapshot /mnt/snapshot /mnt
*switch vtys*
/# cd /mnt; ls -l
*ls deadlocks*
*I get bored and ^C the mount on the other vty about 30 minutes later*
/mnt# ls
*this ls deadlocks too*
For the record, /mnt was a new filesystem. It had *nothing* in it. No
directories or anything.
So now, I've got these:
 UID   PID  PPID CPU PRI NI VSZ RSS MWCHAN STAT TT    TIME COMMAND
   0  1133   669   0  -4  0 692 548 ufs    D+   v1 0:00.00 ls
1001   939   856   0  -4  0 696 560 ufs    D+   v2 0:00.00 ls -l
   0   937     1   0  -4  0 560 336 ufs    D    v1 0:00.65 mount -u -o snapshot /mnt/snapshot /mnt
Now for some numbers.
db> trace 937
mi_switch(c71aab60,50,c03375c6,c7,c03ad2f8) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,1000040,600,e6,3a9) at acquire+0xa7
lockmgr(c75098dc,1010002,c7509818,c71aab60,e5b076a8) at lockmgr+0x2f7
vop_stdlock(e5b076c4,e5b076e0,c021e306,e5b076c4,0) at vop_stdlock+0x2c
ufs_vnoperate(e5b076c4,0,c033dd28,e5b076e0,c01ba4a5) at ufs_vnoperate+0x18
vn_lock(c7509818,10002,c71aab60,815,c7509818) at vn_lock+0xd6
vget(c7509818,2,c71aab60,470,0) at vget+0xd6
ffs_sync(c74c5400,1,c726a780,c71aab60,c74f1000) at ffs_sync+0x126
vfs_write_suspend(c74c5400,c74ffcb8,d351f08c,1,c2c06e80) at vfs_write_suspend+0x70
ffs_snapshot(c74c5400,bfbffd1d,70,c033990d,252) at ffs_snapshot+0xa48
ffs_mount(c74c5400,c745ce80,bfbff000,e5b07bf0,c71aab60) at ffs_mount+0x548
vfs_mount(c71aab60,c6d2b780,c745ce80,1010000,bfbff000) at vfs_mount+0x85e
mount(c71aab60,e5b07d14,c03590ba,409,4) at mount+0xb8
syscall(2f,2f,2f,bfbfeffc,bfbff9f4) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
db> trace 939
mi_switch(c74260d0,50,c03375c6,c7,1cc) at mi_switch+0x158
msleep(c74ffd7c,c03a9688,50,c034f732,0) at msleep+0x3b4
acquire(c74ffd7c,1000040,600,e6,3ab) at acquire+0xa7
lockmgr(c74ffd7c,1010002,c74ffcb8,c74260d0,e5bfd83c) at lockmgr+0x2f7
vop_stdlock(e5bfd858,e5bfd874,c021e306,e5bfd858,246) at vop_stdlock+0x2c
ufs_vnoperate(e5bfd858,246,0,c74f1000,0) at ufs_vnoperate+0x18
vn_lock(c74ffcb8,10002,c74260d0,7f,3) at vn_lock+0xd6
vget(c74ffcb8,10002,c74260d0,7f,c74260d0) at vget+0xd6
ufs_ihashget(c74cce00,3,2,e5bfd98c,e5bfd8f0) at ufs_ihashget+0xd2
ffs_vget(c74c5400,3,2,e5bfd98c,e5bfd994) at ffs_vget+0x44
ufs_lookup(e5bfdac0,e5bfdafc,c0207a24,e5bfdac0,e5bfdc3c) at ufs_lookup+0xdae
ufs_vnoperate(e5bfdac0,e5bfdc3c,e5bfdc50,3ab,c74260d0) at ufs_vnoperate+0x18
vfs_cache_lookup(e5bfdb70,e5bfdb9c,c020bd39,e5bfdb70,c7509818) at vfs_cache_lookup+0x2e4
ufs_vnoperate(e5bfdb70,c7509818,e5bfdc50,e5bfdb5c,c74260d0) at ufs_vnoperate+0x18
lookup(e5bfdc28,0,c033d6ad,a4,c74260d0) at lookup+0x309
namei(e5bfdc28,c03ade38,c03ade10,c03b42a0,0) at namei+0x1e0
lstat(c74260d0,e5bfdd14,c03590ba,409,2) at lstat+0x52
syscall(2f,2f,2f,80d3200,80d1040) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x805838b, esp = 0xbfbff3dc, ebp = 0xbfbff468 ---
db> trace 1133
mi_switch(c6d31680,50,c03375c6,c7,2) at mi_switch+0x158
msleep(c75098dc,c03a9358,50,c034f732,0) at msleep+0x3b4
acquire(c75098dc,1000040,600,e6,46d) at acquire+0xa7
lockmgr(c75098dc,1030002,c7509818,c6d31680,e3887ad0) at lockmgr+0x2f7
vop_stdlock(e3887aec,e3887b08,c021e306,e3887aec,0) at vop_stdlock+0x2c
ufs_vnoperate(e3887aec,0,c033e1ac,360,c01e3af0) at ufs_vnoperate+0x18
vn_lock(c7509818,20002,c6d31680,e3887b5c,c6d31680) at vn_lock+0xd6
lookup(e3887c28,0,c033d6ad,a4,c6d31680) at lookup+0x8e
namei(e3887c28,c03ade38,c03ade10,c03b42a0,0) at namei+0x1e0
stat(c6d31680,e3887d14,c03590ba,409,2) at stat+0x52
syscall(2f,2f,2f,80d3080,80d1000) at syscall+0x22e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (188, FreeBSD ELF32, stat), eip = 0x80583b3, esp = 0xbfbff4dc, ebp = 0xbfbff568 ---
db> x/x 0xc74ffd7c, 20
0xc74ffd7c: c03a9688 1200440 0 1
0xc74ffd8c: 500001 c034f732 6 3a9
0xc74ffd9c: c74ffd7c c6be9500 c74c5400 0
0xc74ffdac: 0 c74ffdac 968 c74ffcb8
0xc74ffdbc: 0 0 1 0
0xc74ffdcc: 0 0 0 0
0xc74ffddc: ffffffff c0370e80 c033dd98 c033dd98
0xc74ffdec: 30000 c74cb734 c7508010 c03acfd8
db> x/x 0xc75098dc, 10
0xc75098dc: c03a9358 1200440 0 3
0xc75098ec: 500001 c034f732 6 3ab
0xc75098fc: c75098dc c6be9500 c74c5400 0
0xc750990c: 0 c750990c 93c c7509818
(gdb) list *( ufs_lookup+0xdae)
0xc02bd86e is in ufs_lookup (/usr/src/sys/ufs/ufs/ufs_lookup.c:602).
597     } else if (dp->i_number == dp->i_ino) {
598             VREF(vdp);      /* we want ourself, ie "." */
599             *vpp = vdp;
600     } else {
601             error = VFS_VGET(pdp->v_mount, dp->i_ino, LK_EXCLUSIVE, &tdp);
602             if (error)
603                     return (error);
604             if (!lockparent || !(flags & ISLASTCN)) {
605                     VOP_UNLOCK(pdp, 0, td);
606                     cnp->cn_flags |= PDIRUNLOCK;
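
One way to read the three traces together is as a wait-for graph. The sketch below encodes which vnode each process is sleeping on, taken from the vn_lock() arguments in the traces, and then follows the chain of holders to look for a cycle. The holders are an assumption, not something the traces print: pid 939 is taken to hold the directory vnode c7509818 because it is still inside VFS_VGET() at ufs_lookup.c:601, before the VOP_UNLOCK(pdp) at line 605, and pid 937 is taken to hold the vnode c74ffcb8 that 939 is waiting for. All type and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PID (-1)

struct vnode_lock {
	unsigned long vnode;	/* vnode address as shown in the traces */
	int holder;		/* pid ASSUMED to hold its lock */
};

struct waiter {
	int pid;
	unsigned long wants;	/* vnode passed to vn_lock() in the trace */
};

/* Waiting side: read directly from the vn_lock() frames. */
const struct waiter trace_waiters[] = {
	{ 937,  0xc7509818UL },	/* mount, via ffs_sync */
	{ 939,  0xc74ffcb8UL },	/* ls -l, via ufs_lookup -> ffs_vget */
	{ 1133, 0xc7509818UL },	/* ls, via lookup */
};

/* Holding side: inferred, see the assumptions above. */
const struct vnode_lock trace_locks[] = {
	{ 0xc7509818UL, 939 },
	{ 0xc74ffcb8UL, 937 },
};

static int
holder_of(const struct vnode_lock *locks, size_t nlocks, unsigned long vnode)
{
	for (size_t i = 0; i < nlocks; i++)
		if (locks[i].vnode == vnode)
			return (locks[i].holder);
	return (NO_PID);
}

static const struct waiter *
find_waiter(const struct waiter *waiters, size_t nwaiters, int pid)
{
	for (size_t i = 0; i < nwaiters; i++)
		if (waiters[i].pid == pid)
			return (&waiters[i]);
	return (NULL);
}

/*
 * Returns 1 if start_pid waits, directly or through other waiters, on a
 * lock that start_pid itself holds; returns 0 otherwise.
 */
int
in_wait_cycle(const struct vnode_lock *locks, size_t nlocks,
    const struct waiter *waiters, size_t nwaiters, int start_pid)
{
	int cur = start_pid;

	assert(locks != NULL && waiters != NULL);

	/* A chain longer than the number of waiters cannot reach start_pid. */
	for (size_t step = 0; step < nwaiters; step++) {
		const struct waiter *w = find_waiter(waiters, nwaiters, cur);
		int holder;

		if (w == NULL)
			return (0);	/* cur is not blocked */
		holder = holder_of(locks, nlocks, w->wants);
		if (holder == NO_PID)
			return (0);	/* lock is free */
		if (holder == start_pid)
			return (1);	/* chain leads back to the start */
		cur = holder;
	}
	return (0);
}
```

Under these assumptions, calling in_wait_cycle() with the two tables reports a cycle for pids 937 and 939, while pid 1133 is merely queued behind the directory lock and is not part of the cycle itself.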
--
Sean Kelly | PGP KeyID: 77042C7B
smkelly@zombie.org | http://www.zombie.org
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
