Date:      Wed, 9 May 2012 16:20:11 GMT
From:      Andreas Longwitz <longwitz@incore.de>
To:        freebsd-geom@FreeBSD.org
Subject:   Re: kern/164252: [geom] gjournal overflow
Message-ID:  <201205091620.q49GKBqW036236@freefall.freebsd.org>

The following reply was made to PR kern/164252; it has been noted by GNATS.

From: Andreas Longwitz <longwitz@incore.de>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: kern/164252: [geom] gjournal overflow
Date: Wed, 09 May 2012 18:12:51 +0200

 The panic "gjournal overflow" is caused by design problems in snapshots
 and/or gjournal on big disks (1 TB or more). Each journaled partition is
 served by its own kernel thread "g_journal", and a single kernel thread
 "g_journal switcher" is responsible for all journaled partitions. Aside
 from mount/umount, the g_journal switcher and the ffs snapshot code are
 the only kernel threads that use vfs_write_suspend() and hold the lock
 "suspwt".
 After starting a snapshot on a big disk with
        mksnap_ffs /backup/.snap/snapshot
 the lock "suspwt" is acquired and held for a long time. Some seconds
 later the g_journal switcher tries to switch the journal of the backup
 partition and blocks on the "suspwt" lock. It therefore cannot service
 the other journaled partitions anymore; it must wait until the snapshot
 releases the "suspwt" lock.
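
 The effect of a single switcher thread can be illustrated with a toy
 model. This is only a sketch under the assumption that the switcher
 handles partitions strictly one after another; the mount points /var and
 /home and their 0.1 s suspend times are hypothetical, and 1081.0 s is
 roughly the stall reported for /backup.

```python
def switch_start_times(partitions):
    """Return {name: start time in seconds} for a switcher that serves
    the given (name, suspend_seconds) pairs strictly in order."""
    clock = 0.0
    starts = {}
    for name, suspend_seconds in partitions:
        starts[name] = clock
        # The switcher cannot move on until this suspend has finished.
        clock += suspend_seconds
    return starts

# Hypothetical workload: /backup is blocked behind the snapshot.
starts = switch_start_times([("/backup", 1081.0), ("/var", 0.1), ("/home", 0.1)])
for name, start in starts.items():
    print(f"{name}: switch begins after {start:.1f}s")
```

 In this model every partition queued behind /backup inherits its full
 delay, even though its own suspend would take only a fraction of a second.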
 
 On my test server (FreeBSD 8.3) /backup is mounted on /dev/mirror/gm2p2
 (1.8 TB), and kern.geom.journal.debug=1 gives the following output:
 
 May  9 09:59:47 : Data has been copied.
 May  9 09:59:47 : Msync time of /backup: 0.031111s
 May  9 09:59:47 : Sync time of /backup: 0.086015s
 May  9 09:59:47 : Cache flush time: 0.000705s
 May  9 09:59:47 : BIO_FLUSH time of mirror/gm2p2: 0.015049s
 May  9 10:04:12 : Journal mirror/gm2p2 71% full, forcing journal switch.
 May  9 10:17:48 : Suspend time of /backup: 1080.955120s
 May  9 10:17:48 : Starting copy of journal.
 May  9 10:17:48 : Cache flush time: 0.013182s
 May  9 10:17:48 : Cache flush time: 0.027241s
 May  9 10:17:48 : Switch time of mirror/gm2p2: 0.206213s
 May  9 10:17:48 : Entire switch time: 1081.589788s
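
 To pull the timings out of such debug output, a small parser is enough.
 The following is a minimal sketch; the regular expression is an assumption
 derived only from the line shapes shown above, not from the gjournal
 source, and the sample lines are copied from this log.

```python
import re

# Sample lines copied from the kern.geom.journal.debug=1 output above.
LOG = """\
May  9 10:04:12 : Journal mirror/gm2p2 71% full, forcing journal switch.
May  9 10:17:48 : Suspend time of /backup: 1080.955120s
May  9 10:17:48 : Switch time of mirror/gm2p2: 0.206213s
May  9 10:17:48 : Entire switch time: 1081.589788s
"""

# Matches "<label> time[ of <target>]: <seconds>s" after the timestamp.
TIMING = re.compile(
    r": (?P<label>[A-Za-z_ ]+? time)(?: of (?P<target>\S+))?: (?P<secs>\d+\.\d+)s$"
)

def parse_timings(text):
    """Return a list of (label, target, seconds) tuples found in the log."""
    found = []
    for line in text.splitlines():
        m = TIMING.search(line)
        if m:
            found.append((m.group("label"), m.group("target"), float(m.group("secs"))))
    return found

timings = parse_timings(LOG)
suspend = next(secs for label, _, secs in timings if label == "Suspend time")
print(f"Suspend time: {suspend:.1f}s = {suspend / 60:.1f} min")
```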
 
 The critical "Suspend time" was 18 minutes, with no I/O on any other
 partition. The same test with some I/O on any other partition
 triggers the panic immediately, because my journal is not big enough to
 hold 18 minutes of I/O.
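
 The arithmetic behind this can be sketched as follows. The journal size
 (2 GiB) and the sustained write rate (4 MiB/s) are hypothetical values
 chosen for illustration; only the suspend time is taken from the log. The
 model assumes a constant write rate and that the panic occurs once the
 journal is full.

```python
def seconds_until_full(journal_bytes, write_rate_bps, fill_fraction=0.0):
    """Seconds until the journal is full at a constant write rate,
    starting from the given fill fraction (0.0 = empty)."""
    free_bytes = journal_bytes * (1.0 - fill_fraction)
    return free_bytes / write_rate_bps

def overflows(journal_bytes, write_rate_bps, suspend_seconds, fill_fraction=0.0):
    """True if the journal fills up before the suspend is over."""
    return seconds_until_full(journal_bytes, write_rate_bps, fill_fraction) < suspend_seconds

GIB = 1024 ** 3
MIB = 1024 ** 2

journal = 2 * GIB          # hypothetical journal size
rate = 4 * MIB             # hypothetical write rate in bytes per second
suspend = 1080.955120      # "Suspend time of /backup" from the log

time_left = seconds_until_full(journal, rate)
print(f"journal full after {time_left:.0f}s, suspend lasts {suspend:.0f}s")
print("overflow" if overflows(journal, rate, suspend) else "ok")
```

 With these assumed numbers the journal fills in well under the 18 minutes
 that the switcher is stalled.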
 The same problem occurs when removing the snapshot: the g_journal switcher
 waits for 10 minutes on the "ufs" lock.
 The same test on a 1 TB disk reduces the "Suspend time" to 190 seconds for
 the snapshot and 12 seconds for the removal.
 
 My conclusion is that taking a snapshot (as used by dump -L) on a
 journaled partition is not safe when the "Suspend time" for the biggest
 journaled partition is more than about 20 seconds.
 
 -- 
 Dr. Andreas Longwitz
 
 Data Service GmbH
 Beethovenstr. 2A
 23617 Stockelsdorf
 Lübeck District Court (Amtsgericht Lübeck), HRB 318 BS
 Managing directors: Wilfried Paepcke, Dr. Andreas Longwitz, Josef Flatau
 


