Date:      Tue, 3 Feb 2004 16:56:23 +0100 (CET)
From:      Lukas Ertl <le@FreeBSD.org>
To:        freebsd-geom@FreeBSD.org
Subject:   vinum and GEOM deadlock situation
Message-ID:  <20040203164816.X616@korben.in.tern>

Hi,

I'm running into a deadlock situation with the following scenario:

I have a vinum RAID5 volume spanning several disks mounted. When I pull out
one of the disks, all I/O hangs shortly afterwards.

I managed to identify the deadlock, but haven't come up with a fix yet.

Let's see.  Here's the backtrace of the vinum process:

(kgdb) defproc 512
  512 c685da50 e3eac000    0     1   512  000200  1  vinum        g_waitfor_event c6852200
 frame 0 at 0xe3e865a4: ebp e3e865ec, eip 0xc04e20ba <mi_switch+550>:   add    $0x4,%esp
 frame 1 at 0xe3e865ec: ebp e3e86610, eip 0xc04e19ad <msleep+933>:      add    $0x4,%esp
 frame 2 at 0xe3e86610: ebp e3e86638, eip 0xc04b0873 <g_waitfor_event+123>:add    $0x14,%esp
 frame 3 at 0xe3e86638: ebp e3e86660, eip 0xc04af9c7 <disk_destroy+55>: movl   $0x0,(%esi)
 frame 4 at 0xe3e86660: ebp e3e86674, eip 0xc043e971 <dacleanup+89>:    push   $0xc0678500
 frame 5 at 0xe3e86674: ebp e3e868e8, eip 0xc042ffa2 <camperiphfree+66>:add    $0x4,%esp
 frame 6 at 0xe3e868e8: ebp e3e868f4, eip 0xc042fd07 <cam_periph_release+31>:add    $0x4,%esp
 frame 7 at 0xe3e868f4: ebp e3e86918, eip 0xc043e531 <daclose+325>:     mov    $0x0,%edx
 frame 8 at 0xe3e86918: ebp e3e86934, eip 0xc04af344 <g_disk_access+360>:mov    %eax,0xfffffff0(%ebp)
 frame 9 at 0xe3e86934: ebp e3e86964, eip 0xc04b3256 <g_access_rel+466>:mov    %eax,0xffffffe8(%ebp)
 frame 10 at 0xe3e86964: ebp e3e869a0, eip 0xc04b171d <g_slice_access+349>:lea    0xfffffff4(%ebp),%esp
 frame 11 at 0xe3e869a0: ebp e3e869d0, eip 0xc04b3256 <g_access_rel+466>:mov    %eax,0xffffffe8(%ebp)
 frame 12 at 0xe3e869d0: ebp e3e86a0c, eip 0xc04b171d <g_slice_access+349>:lea    0xfffffff4(%ebp),%esp
 frame 13 at 0xe3e86a0c: ebp e3e86a3c, eip 0xc04b3256 <g_access_rel+466>:mov    %eax,0xffffffe8(%ebp)
 frame 14 at 0xe3e86a3c: ebp e3e86a70, eip 0xc04aecc4 <g_dev_close+172>:mov    %eax,%edi
 frame 15 at 0xe3e86a70: ebp e3e86a94, eip 0xc6780dfb <close_locked_drive+123>:mov    %eax,%edi
 frame 16 at 0xe3e86a94: ebp e3e86aa4, eip 0xc6780d62 <close_drive+38>: add    $0x4,%esp
 frame 17 at 0xe3e86aa4: ebp e3e86ac8, eip 0xc6781798 <daemon_save_config+324>:add    $0x8,%esp
 frame 18 at 0xe3e86ac8: ebp e3e86ad8, eip 0xc677f7e2 <vinum_daemon+478>:jmp    0xc677f89f <vinum_daemon+667>
 frame 19 at 0xe3e86ad8: ebp e3e86ae0, eip 0xc677f9c4 <vinum_finddaemon+68>:mov    $0x0,%edx
 frame 20 at 0xe3e86ae0: ebp e3e86af8, eip 0xc67828bd <vinum_super_ioctl+1505>: jmp    0xc67828c9 <vinum_super_ioctl+1517>
 frame 21 at 0xe3e86af8: ebp e3e86b44, eip 0xc67820fe <vinumioctl+58>:  jmp    0xc67822d4 <vinumioctl+528>
 frame 22 at 0xe3e86b44: ebp e3e86b70, eip 0xc04ad2ea <spec_ioctl+242>: mov    %eax,%esi
 frame 23 at 0xe3e86b70: ebp e3e86b7c, eip 0xc04acbef <spec_vnoperate+19>:leave
 frame 24 at 0xe3e86b7c: ebp e3e86c34, eip 0xc052f20f <vn_ioctl+383>:   add    $0x4,%esp
 frame 25 at 0xe3e86c34: ebp e3e86cec, eip 0xc04fc6e8 <ioctl+892>:      add    $0x14,%esp
 frame 26 at 0xe3e86cec: ebp e3e86d40, eip 0xc060e297 <syscall+535>:    mov    %eax,%ebx

As you can see, it ends up sleeping in g_waitfor_event()+123, polling for
the event to complete:

328             do
329                     tsleep(ep, PRIBIO, "g_waitfor_event", hz);
330             while (!(ep->flag & EV_DONE));

So, what is the g_event thread doing?

(kgdb) defproc 2
    2 c685da50 e1a5e000    0     0     0  000204  1  g_event      GEOM topology c069dc58
 frame 0 at 0xe1a38c50: ebp e1a38c98, eip 0xc04e20ba <mi_switch+550>:   add    $0x4,%esp
 frame 1 at 0xe1a38c98: ebp e1a38cb0, eip 0xc04bfba9 <cv_wait+429>:     movl   $0xe4,(%esp,1)
 frame 2 at 0xe1a38cb0: ebp e1a38cc8, eip 0xc04e11ec <_sx_xlock+100>:   decl   0x48(%ebx)
 frame 3 at 0xe1a38cc8: ebp e1a38cfc, eip 0xc04b0352 <one_event+66>:    add    $0x28,%esp
 frame 4 at 0xe1a38cfc: ebp e1a38d04, eip 0xc04b0549 <g_run_events+9>:  test   %eax,%eax
 frame 5 at 0xe1a38d04: ebp e1a38d1c, eip 0xc04b12c9 <g_event_procbody+61>:mov    0xc06ab164,%esi
 frame 6 at 0xe1a38d1c: ebp e1a38d34, eip 0xc04cb5f0 <fork_exit+156>:   push   $0x325

It blocks at one_event()+66, waiting for the topology lock:

170             g_topology_lock();

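OK, and here's the problem: the topology lock was grabbed in
g_dev_close(), which you can see in the backtrace of the vinum process,
and it is still held while disk_destroy() sleeps in g_waitfor_event().
So the vinum process holds the topology lock while waiting for an event to
complete, and the g_event thread can't run that event because it needs the
topology lock first.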
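
To make the cycle concrete, here is a minimal user-space sketch in C. It is
not kernel code: topology, struct event, event_thread_step() and close_path()
are hypothetical stand-ins, an atomic flag replaces the sx lock, and a poll
limit replaces the unbounded loop so that the model terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GEOM topology lock (an sx lock in the kernel). */
static atomic_flag topology = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for a queued event; done models the EV_DONE flag. */
struct event {
    bool done;
};

/* Non-blocking acquire: true on success, false if the lock is already held. */
static bool topology_trylock(void)
{
    return !atomic_flag_test_and_set(&topology);
}

static void topology_unlock(void)
{
    atomic_flag_clear(&topology);
}

/*
 * Models one pass of the event thread: it must take the topology lock before
 * it can run the event. Returns false where the real thread would block.
 */
static bool event_thread_step(struct event *ep)
{
    if (!topology_trylock())
        return false;
    ep->done = true;
    topology_unlock();
    return true;
}

/*
 * Models the close path: post an event and poll for completion, optionally
 * while still holding the topology lock. The real wait loop has no retry
 * limit; max_polls exists only so that this model terminates.
 */
static bool close_path(bool hold_topology, int max_polls)
{
    struct event ev = { .done = false };
    bool locked;

    assert(max_polls > 0);
    locked = hold_topology && topology_trylock();

    for (int i = 0; i < max_polls && !ev.done; i++)
        event_thread_step(&ev);    /* give the "event thread" a turn */

    if (locked)
        topology_unlock();
    return ev.done;
}
```

Under these assumptions, close_path(true, 5) never sees the event complete,
while close_path(false, 5) sees it complete on the first poll. That suggests
the wait would have to happen without the topology lock held, though whether
that is safe in the real close path is a separate question.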

Any ideas?

regards,
le

-- 
Lukas Ertl                           http://mailbox.univie.ac.at/~le/
le@FreeBSD.org                       http://people.freebsd.org/~le/


