Date: Thu, 25 Jun 2009 18:05:40 -0000 (UTC)
From: "Peter Wood" <peter@alastria.net>
To: stable@freebsd.org
Subject: ZFS Assertion Fault with FreeBSD 7.2
Message-ID: <751c8204f14ff89116fba32be000eae5.squirrel@webmail.alastria.net>
Good Evening,

This is a heads up really; I think I've got this sorted. I'm writing this as my system backs up data to another array in case it all explodes.

This afternoon I was performing some MPEG4 encoding with ffmpeg; the source file and destination file were both located on the same ZFS partition. Part way through the encode the ffmpeg process went into the "zfs:lo" state and hung, and every process that attempted to browse the partition "data/domains" hung immediately. I attempted to reboot the machine to restore normality, but the system stuck halfway through shutting down. In the end a hard power off was needed to shut the machine down.

Upon reboot, during the ZFS rc.d init script, I saw the following:

    panic: solaris assert: 0 == dmu_bonus_hold(zfsvfs->z_os, *oid, NULL, &dbp), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c, line: 472

Apologies for any single-character errors; that's typed from an image.

Through diagnosis I was able to determine the error was being caused by a mirrored zpool called "store". I booted into single user mode and ran /etc/rc.d/hostname and /etc/rc.d/hostid. Working from the ZFS rc.d script, I was able to run "zfs volinit" with no issues; the panic was reproducible on "zfs mount -a". I then mounted each mount point one by one until I found the one causing the issue: "store/sara/unix/Maildir". It is a compressed volume, otherwise nothing custom.

Following my ancient UFS logic I attempted to mount it read only. This worked and spat out the following kernel warning:

    Solaris: WARNING: ZFS replay transaction error 30, dataset store/sara/unix/Maildir, seq 0x77001, txtype 5

To aid diagnosis, and because I'd damaged the rc environment while debugging, I rebooted, went single user, and mounted the whole of store read only. However, this time the warning did not show.
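
The one-by-one mounting described above could be scripted roughly as follows. This is only a sketch, not the commands actually typed at the console: the function name mount_one_by_one and the MOUNT_CMD variable are made up for illustration, and it assumes the dataset names are passed in parent-first order. Because a kernel panic cannot be caught from a script, each name is printed before its mount is attempted, so the last name on the console is the suspect.

```shell
#!/bin/sh
# Sketch: mount ZFS datasets one at a time, announcing each one first.
# MOUNT_CMD is a hypothetical override (not a ZFS feature); it defaults to
# "zfs mount" and can be set to "zfs mount -o ro" for a read-only attempt.
: "${MOUNT_CMD:=zfs mount}"

mount_one_by_one() {
    for ds in "$@"; do
        # Print before mounting: if the kernel panics, this is the last
        # line shown and it names the dataset being mounted at the time.
        echo "mounting: $ds"
        # $MOUNT_CMD is left unquoted on purpose so that a value such as
        # "zfs mount -o ro" is split into separate words.
        $MOUNT_CMD "$ds" || { echo "failed: $ds"; return 1; }
    done
}
```

From single user mode it might be called as `mount_one_by_one $(zfs list -H -o name -t filesystem -r store)`, with `MOUNT_CMD="zfs mount -o ro"` set beforehand to try everything read only.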
I am currently copying the entirety of "store" to "data". I was planning to attempt remounting all of the volume's mount points read/write once the backup is done.

Is there anything else that I should be doing to a) attempt to ensure my data structures are now okay and b) help find the problem? I understand a) will probably prevent b), but the data is too important to risk, sorry. A scrub of the volume came to mind as a double check.

Any thoughts are greatly appreciated. Apologies if this email comes out badly; my email is on this server, so I'm webmailing and scrounging through mqueues on the upstream.

Peter.

-- 
Peter Wood :: peter@alastria.net