Date: Wed, 9 May 2012 16:20:11 GMT
From: Andreas Longwitz <longwitz@incore.de>
To: freebsd-geom@FreeBSD.org
Subject: Re: kern/164252: [geom] gjournal overflow
Message-ID: <201205091620.q49GKBqW036236@freefall.freebsd.org>

The following reply was made to PR kern/164252; it has been noted by GNATS.

From: Andreas Longwitz <longwitz@incore.de>
To: bug-followup@freebsd.org
Cc:
Subject: Re: kern/164252: [geom] gjournal overflow
Date: Wed, 09 May 2012 18:12:51 +0200

 The panic "gjournal overflow" is caused by design problems of snapshot
 and/or gjournal on big disks (1 TB or more).

 Each journaled partition is served by a kernel thread g_journal, and
 there is one kernel thread, g_journal switcher, responsible for all
 journaled partitions. Aside from mount/umount, the g_journal switcher and
 snapshot for ffs are the only kernel threads that use vfs_write_suspend()
 and hold the lock "suspwt".

 After starting a snapshot on a big disk with
     mksnap_ffs /backup/.snap/snapshot
 the lock "suspwt" is taken and held for a long time. A few seconds later
 the g_journal switcher tries to switch the journal of the backup
 partition and blocks on the "suspwt" lock. It can therefore no longer
 handle the other journaled partitions; it must wait until the snapshot
 releases the "suspwt" lock.

 On my test server (FreeBSD 8.3) /backup is mounted on /dev/mirror/gm2p2
 (1.8 TB) and kern.geom.journal.debug=1 gives

 May 9 09:59:47 : Data has been copied.
 May 9 09:59:47 : Msync time of /backup: 0.031111s
 May 9 09:59:47 : Sync time of /backup: 0.086015s
 May 9 09:59:47 : Cache flush time: 0.000705s
 May 9 09:59:47 : BIO_FLUSH time of mirror/gm2p2: 0.015049s
 May 9 10:04:12 : Journal mirror/gm2p2 71% full, forcing journal switch.
 May 9 10:17:48 : Suspend time of /backup: 1080.955120s
 May 9 10:17:48 : Starting copy of journal.
 May 9 10:17:48 : Cache flush time: 0.013182s
 May 9 10:17:48 : Cache flush time: 0.027241s
 May 9 10:17:48 : Switch time of mirror/gm2p2: 0.206213s
 May 9 10:17:48 : Entire switch time: 1081.589788s

 The critical "Suspend time" was 18 minutes with no I/O on any other
 partitions.
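
 The "Suspend time" entries in the debug output above can be extracted
 mechanically. What follows is only a minimal sketch in Python, assuming
 the log lines look exactly like the ones quoted in this message. The
 names suspend_times and risky_suspends are hypothetical helpers, and the
 20-second limit is the rough figure from the conclusion of this message,
 not a gjournal constant.

```python
import re

# Matches lines such as
# "May 9 10:17:48 : Suspend time of /backup: 1080.955120s".
# The format is assumed from the debug output quoted in this message.
SUSPEND_RE = re.compile(r"Suspend time of (\S+): ([0-9.]+)s")

# Rough limit taken from the conclusion of this message (about 20 seconds).
# This is an assumption for illustration, not a value defined by gjournal.
SAFE_SUSPEND_SECONDS = 20.0


def suspend_times(log_text):
    """Return (mount_point, seconds) for every 'Suspend time' entry."""
    return [(m.group(1), float(m.group(2)))
            for m in SUSPEND_RE.finditer(log_text)]


def risky_suspends(log_text, limit=SAFE_SUSPEND_SECONDS):
    """Return only the entries whose suspend time exceeds the limit."""
    return [(mp, secs) for mp, secs in suspend_times(log_text) if secs > limit]


sample = """\
May 9 10:04:12 : Journal mirror/gm2p2 71% full, forcing journal switch.
May 9 10:17:48 : Suspend time of /backup: 1080.955120s
May 9 10:17:48 : Entire switch time: 1081.589788s
"""

for mount_point, seconds in risky_suspends(sample):
    print(f"{mount_point}: suspended {seconds:.1f}s ({seconds / 60:.1f} min)")
```

 Run against the log excerpt, this reports the single /backup entry,
 since its suspend time is far above the assumed limit.
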

 The same test with some I/O on any other partition triggers the panic
 immediately, because my journal is not big enough to hold 18 minutes of
 I/O.

 The same problem occurs when removing the snapshot: the g_journal
 switcher waits for 10 minutes on the "ufs" lock.

 The same test on a 1 TB disk reduces the "Suspend time" to 190 seconds
 for the snapshot and 12 seconds for the removal.

 My conclusion is that a snapshot (as used for dump -L) on a journaled
 partition is not safe when the "Suspend time" for the biggest journaled
 partition is more than about 20 seconds.

 --
 Dr. Andreas Longwitz

 Data Service GmbH
 Beethovenstr. 2A
 23617 Stockelsdorf
 Amtsgericht Lübeck, HRB 318 BS
 Geschäftsführer: Wilfried Paepcke, Dr. Andreas Longwitz, Josef Flatau