From owner-freebsd-stable@FreeBSD.ORG  Thu Dec 27 17:47:08 2012
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id A8B0411B
 for <freebsd-stable@freebsd.org>; Thu, 27 Dec 2012 17:47:08 +0000 (UTC)
 (envelope-from longwitz@incore.de)
Received: from dss.incore.de (dss.incore.de [195.145.1.138])
 by mx1.freebsd.org (Postfix) with ESMTP id 4ADC88FC08
 for <freebsd-stable@freebsd.org>; Thu, 27 Dec 2012 17:47:08 +0000 (UTC)
Received: from inetmail.dmz (inetmail.dmz [10.3.0.3])
 by dss.incore.de (Postfix) with ESMTP id 5ED075C346;
 Thu, 27 Dec 2012 18:47:07 +0100 (CET)
X-Virus-Scanned: amavisd-new at incore.de
Received: from dss.incore.de ([10.3.0.3])
 by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024)
 with LMTP id 3UNQndxsVKYv; Thu, 27 Dec 2012 18:47:05 +0100 (CET)
Received: from mail.incore (fwintern.dmz [10.0.0.253])
 by dss.incore.de (Postfix) with ESMTP id ABBDF5C342;
 Thu, 27 Dec 2012 18:47:05 +0100 (CET)
Received: from bsdlo.incore (bsdlo.incore [192.168.0.84])
 by mail.incore (Postfix) with ESMTP id A057B50851;
 Thu, 27 Dec 2012 18:47:05 +0100 (CET)
Message-ID: <50DC8999.8000708@incore.de>
Date: Thu, 27 Dec 2012 18:47:05 +0100
From: Andreas Longwitz <longwitz@incore.de>
User-Agent: Thunderbird 2.0.0.19 (X11/20090113)
MIME-Version: 1.0
To: Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: FS hang with suspfs when creating snapshot on a UFS + GJOURNAL
 setup
References: <50DC30F6.1050904@incore.de> <20121227133355.GI82219@kib.kiev.ua>
In-Reply-To: <20121227133355.GI82219@kib.kiev.ua>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Cc: freebsd-stable@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Dec 2012 17:47:08 -0000

Konstantin Belousov wrote:
> On Thu, Dec 27, 2012 at 12:28:54PM +0100, Andreas Longwitz wrote:
>> On a FreeBSD 8-Stable machine with UFS + GJOURNAL (no SU) I observed the
>> same behaviour as described for UFS+SU+J in
>>   lists.freebsd.org/pipermail/freebsd-current/2012-January/030937.html.
>>
>> The snapshot was initiated by amanda with
>>   dump 3ubLshf 64 1048576 0 - /dev/mirror/gm0p10.journal (pid 50347)
>> and the process creating the snapshot is
>>   /sbin/mksnap_ffs /home /home/.snap/dump_snapshot (pid 50350).
>>
>> The process 50350 hangs and also all following processes that tried to
>> access the home partition, some of them work, but don't come to an end,
>> e.g. a shell after (Ctrl T):
>>   load: 0.61  cmd: sh 52670 [suspfs] 43.46r 0.00u 0.00s 0% 2272k.
>>
>> All write I/O's to all gjournaled partitions are stopped. Under normal
>> circumstances the snapshot is taken in five seconds, so I have definitiv
>> not the problem I have described in
>>   lists.freebsd.org/pipermail/freebsd-geom/2012-May/005246.html.
>>
>> .....
>>
>> It seems there is a deadlock on the suspfs lock, but I could not figure
>> out who holds this lock.
>> Any hints how to get better diagnostic information for next time the
>> error occurs are welcome.
> 
> The
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> provides the instructions.
> 
> The suspfs is owned by the snapshot creator. The question is, where is it
> blocked.

Thanks for answer.

In the meantime I can reproduce the problem and got some more
information. It looks that there is a deadlock between the two processes
with pid 18 (g_journal switcher) and pid 7126 (/sbin/mksnap_ffs /home
/home/.snap/my_snapshot):

0    18     0   0  45  0     0    16 snaplk DL    ??    4:40,34
         [g_journal switcher]
0  7126  1933   0  50  0  5836  1176 suspfs D     ??    0:00,44
         /sbin/mksnap_ffs /home /home/.snap/my_snapshot

procstat -t 18 -->
  PID    TID COMM               TDNAME           CPU  PRI STATE   WCHAN
   18 100076 g_journal switcher g_journal switch   0  129 sleep   snaplk
procstat  -t 7126 -->
  PID    TID COMM               TDNAME           CPU  PRI STATE   WCHAN
 7126 100157 mksnap_ffs       -                    1  134 sleep   suspfs
procstat -kk 18 -->
  PID    TID COMM             TDNAME           KSTACK
   18 100076 g_journal switcher g_journal switch mi_switch+0x186
      sleepq_wait+0x42 __lockmgr_args+0x49b ffs_copyonwrite
      +0x19a ffs_geom_strategy+0x1b5 bufwrite+0xe9 ffs_sbupdate+0x12a
      g_journal_ufs_clean+0x3e g_journal_switcher+0xe5e fork
      _exit+0x11f fork_trampoline+0xe
procstat -kk 7126 -->
  PID    TID COMM             TDNAME           KSTACK
 7126 100157 mksnap_ffs       -                mi_switch+0x186
      sleepq_wait+0x42 _sleep+0x373 vn_start_write+0xdf ffs_s
      napshot+0xe2b ffs_mount+0x65a vfs_donmount+0xdc5 nmount+0x63
      amd64_syscall+0x1f4 Xfast_syscall+0xfc

>From DDB:
db> show lockedvnods
Locked vnodes

0xffffff012281a938: tag ufs, type VREG
    usecount 1, writecount 0, refcount 3339 mountedhere 0
    flags (VV_SYSTEM)
    lock type snaplk: EXCL by thread 0xffffff000807a470 (pid 7126)
 with exclusive waiters pending
        ino 23552, on dev mirror/gm0p10.journal

db> show lockedbufs
buf at 0xffffff81efa73320
b_flags = 0xa0000020<remfree,vmio,cache>
b_error = 0, b_bufsize = 8192, b_bcount = 8192, b_resid = 0
b_bufobj = (0xffffff00080564c8), b_data = 0xffffff81f659e000, b_blkno =
12000, b_dep = 0
b_npages = 2, pages(OBJ, IDX, PA): (0xffffff000805c000, 0x5dc,
0x4a78000),(0xffffff000805c000, 0x5dd, 0x4a79000)
lock type bufwait: EXCL by thread 0xffffff0002bd5000 (pid 18)

buf at 0xffffff81f0485070
b_flags = 0x80000020<remfree,cache>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xffffff012281aa50), b_data = 0xffffff8207896000, b_blkno =
462144, b_dep = 0
b_npages = 4, pages(OBJ, IDX, PA): (0, 0xffffff8207896, 0x118bd9000),(0,
0xffffff8207897, 0x118bda000),(0, 0xffffff820
7898, 0x118bdb000),(0, 0xffffff8207899, 0x118bdc000)
lock type bufwait: EXCL by thread 0xffffff000807a470 (pid 7126)

buf at 0xffffff81f0df9f98
b_flags = 0xa0000020<remfree,vmio,cache>
b_error = 0, b_bufsize = 2048, b_bcount = 2048, b_resid = 0
b_bufobj = (0xffffff00080564c8), b_data = 0xffffff8217ad2000, b_blkno =
128, b_dep = 0
b_npages = 1, pages(OBJ, IDX, PA): (0xffffff000805c000, 0x10, 0x4a7a000)
lock type bufwait: EXCL by thread 0xffffff0002bd5000 (pid 18)

db> alltrace (pid 18 and 7126)

Tracing command g_journal switcher pid 18 tid 100076 td 0xffffff0002bd5000
sched_switch() at sched_switch+0xde
mi_switch() at mi_switch+0x186
sleepq_wait() at sleepq_wait+0x42
__lockmgr_args() at __lockmgr_args+0x49b
ffs_copyonwrite() at ffs_copyonwrite+0x19a
ffs_geom_strategy() at ffs_geom_strategy+0x1b5
bufwrite() at bufwrite+0xe9
ffs_sbupdate() at ffs_sbupdate+0x12a
g_journal_ufs_clean() at g_journal_ufs_clean+0x3e
g_journal_switcher() at g_journal_switcher+0xe5e
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8242ca8cf0, rbp = 0 ---

Tracing command mksnap_ffs pid 7126 tid 100157 td 0xffffff000807a470
sched_switch() at sched_switch+0xde
mi_switch() at mi_switch+0x186
sleepq_wait() at sleepq_wait+0x42
_sleep() at _sleep+0x373
vn_start_write() at vn_start_write+0xdf
ffs_snapshot() at ffs_snapshot+0xe2b
ffs_mount() at ffs_mount+0x65a
vfs_donmount() at vfs_donmount+0xdc5
nmount() at nmount+0x63
amd64_syscall() at amd64_syscall+0x1f4
Xfast_syscall() at Xfast_syscall+0xfc
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x18069e35c, rsp =
0x7fffffffe358, rbp = 0x7fffffffedc7 ---

There are a lot of other - but later started than pid 7126 - processes
sitting on the suspfs lock, most of them hang on a close with a stack
like this:

Tracing command cat pid 17726 tid 100387 td 0xffffff012e4bd470
sched_switch() at sched_switch+0xde
mi_switch() at mi_switch+0x186
sleepq_wait() at sleepq_wait+0x42
_sleep() at _sleep+0x373
vn_start_write() at vn_start_write+0xdf
vn_close() at vn_close+0x5b
vn_closefile() at vn_closefile+0x5a
_fdrop() at _fdrop+0x23
closef() at closef+0x45
kern_close() at kern_close+0x163
amd64_syscall() at amd64_syscall+0x1f4
Xfast_syscall() at Xfast_syscall+0xfc
--- syscall (6, FreeBSD ELF64, close), rip = 0x18073f07c, rsp =
0x7fffffffeb68, rbp = 0 ---


If more information is necessary to catch this problem, I can try to
reproduce the problem with a suitable debug kernel.

-- 
Andreas Longwitz