Date: Tue, 10 Apr 2012 14:34:00 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: Rumen Telbizov <telbizov@gmail.com>
Cc: freebsd-stable@FreeBSD.org
Subject: Re: ZFS: can't read MOS
Message-ID: <4F841AA8.3030602@FreeBSD.org>
In-Reply-To: <CAENR%2B_X6gb5TB01i3FTfq_zD=RyFUGfLAWwA56SNm6Gqf_49iw@mail.gmail.com>
References: <CAENR%2B_X6gb5TB01i3FTfq_zD=RyFUGfLAWwA56SNm6Gqf_49iw@mail.gmail.com>

on 09/04/2012 21:50 Rumen Telbizov said the following:
> Hello everyone,
>
> I have a FreeBSD 8.2-STABLE (Aug 30, 2011) machine on ZFS that I am having
> issues with and could use some help.
>
> In a nutshell, this machine has been running fine for about a year and a
> half, but after a recent power outage (complete colo blackout) I can't boot
> off the ZFS pool any more.
> Here's the error I get (attached screenshot as well):
>
> ZFS: i/o error - all block copies unavailable
> ZFS: can't read MOS
> ZFS: unexpected object set type 0
> ZFS: unexpected object set type 0
>
> FreeBSD/x86 boot
> Default: zroot:/boot/kernel/kernel
> boot: ZFS: unexpected object set type 0
>
> I've been searching the net high and low for an actual solution but all the
> threads end up nowhere.
> I hope I can get some clue here. Thanks in advance.
Not sure whether the following will be of any help to you, but the
${SRC}/tools/tools/zfsboottest utility can help diagnose and debug such
issues from userland (without requiring a reboot).
See also a small nitpick below.
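A rough sketch of how that might look, not a verified recipe. It assumes the source tree lives in /usr/src, that the pool's vdev is /dev/mfid0p3 (taken from the zpool status output quoted below), and that the tool accepts vdev device paths as arguments; please check the source for the exact usage. SRC, DEV and the RUN dry-run wrapper are names made up for this example.

```shell
#!/bin/sh
# Sketch only: build zfsboottest and point it at the pool's vdev.
# SRC, DEV and RUN are illustrative names, not part of the tool.
SRC="${SRC:-/usr/src}"        # assumed location of the FreeBSD source tree
DEV="${DEV:-/dev/mfid0p3}"    # zroot's vdev according to zpool status
RUN="${RUN-echo}"             # prints each command; invoke with RUN= to execute

zfsboottest_steps() {
    $RUN cd "${SRC}/tools/tools/zfsboottest" &&
    $RUN make &&
    # Assumption: device paths are passed as plain arguments.
    $RUN ./zfsboottest "${DEV}"
}

zfsboottest_steps
```

By default this only prints the three commands, so they can be reviewed before running anything against the pool.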
> Here's the relevant hardware configuration of this box (serves as a backup
> box).
>
> - SuperMicro 4U + another 4U totalling 48 x 2TB disks
> - Hardware raid LSI 9261-8i holding both shelves giving 1 mfid0 device
> to the OS
> - Hardware raid 60 -- 6 x 8 raid6 groups
> - ZFS with gptzfsboot installed on the "single" mfid0 device. Partition
> table is:
>
> [root@mfsbsd /zroot/etc]# gpart show -l
> =>           34  140554616765  mfid0  GPT  (65T)
>              34           128      1  (null)  (64k)
>             162      33554432      2  swap  (16G)
>        33554594  140521062205      3  zroot  (65T)
>
>
>
> - boot device is: vfs.root.mountfrom="zfs:zroot" (as per loader.conf)
> - zpool status is:
>
> [root@mfsbsd /zroot/etc]# zpool status
>   pool: zroot
>  state: ONLINE
>   scan: scrub canceled on Mon Apr 9 09:48:14 2012
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zroot       ONLINE       0     0     0
>           mfid0p3   ONLINE       0     0     0
>
> errors: No known data errors
>
>
>
> - zpool get all:
>
> [root@mfsbsd /zroot/etc]# zpool get all zroot
> NAME   PROPERTY       VALUE                 SOURCE
> zroot  size           65T                   -
> zroot  capacity       36%                   -
> zroot  altroot        -                     default
> zroot  health         ONLINE                -
> zroot  guid           3339338746696340707   default
> zroot  version        28                    default
> *zroot bootfs         zroot                 local*
> zroot  delegation     on                    default
> zroot  autoreplace    off                   default
> zroot  cachefile      -                     default
> zroot  failmode       wait                  default
> zroot  listsnapshots  on                    local
> zroot  autoexpand     off                   default
> zroot  dedupditto     0                     default
> zroot  dedupratio     1.00x                 -
> zroot  free           41.2T                 -
> zroot  allocated      23.8T                 -
> zroot  readonly       off                   -
>
>
> Here's what happened chronologically:
>
> - Savvis Toronto blacked out completely for 31 minutes
> - After power was restored this machine came up with the above error
> - I managed to PXE boot into mfsbsd successfully and managed to import
> the pool and access actual data/snapshots - no problem
> - Shortly after another reboot the hardware raid controller complained
> that it has lost its configuration and now sees only half of the disks
> as foreign good and the rest as foreign bad. BIOS didn't see any boot
> device.
> - Spent some time on the phone with LSI and managed to restore the
> hardware RAID by basically removing any and all configuration, making
> disks unconfigured good and recreating the array in exactly the same way
> as I created it in the beginning BUT with the important exception that I
> did NOT initialize the array.
> - After this I was back to square one where I could see all the data
> without any loss (via mfsbsd) but cannot boot off the volume any more.
> - First thing I tried was to restore the boot loader without any luck:
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 mfid0p1
> - Then out of desperation, took zfsboot, zfsloader, gptzfsboot from
> 9.0-RELEASE and replaced them in /boot,
> reinitialized again - no luck
> - Currently running zdb -ccv zroot to check for any corruption - I am
> afraid this will take forever since I have *23.8T* used space. No errors
> yet
> - One thing I did notice is that zdb zroot returned the metaslab
> information line by line very slowly (10-15 seconds a line). I don't know
> if it's related.
> - Another thing I tried (saw that in a thread) without any difference
> whatsoever was:
>
> # cd src/sys/boot/i386/zfsboot
> # make clean; make cleandir
> # make obj ; make depend ; make
> # cd i386/loader
You probably wanted to do this in i386/zfsloader rather than i386/loader.
> # make install
> # cd /usr/src/sys/boot/i386/zfsboot
> # make install
> # sysctl kern.geom.debugflags=16
> # dd if=/boot/zfsboot of=/dev/da0 count=1
> # dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024
> # reboot
>
>
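For illustration only, the rebuild and boot code steps quoted above could be sketched as follows, with the install done from i386/zfsloader. This assumes the source tree is in /usr/src and that the boot disk is the GPT-partitioned mfid0 from the gpart output; gpart bootcode takes the whole-disk geom as its argument, with -i selecting the partition index. SRC, DISK and the RUN dry-run wrapper are hypothetical names for this example.

```shell
#!/bin/sh
# Sketch only; SRC, DISK and RUN are illustrative names.
SRC="${SRC:-/usr/src}"     # assumed location of the FreeBSD source tree
DISK="${DISK:-mfid0}"      # whole-disk geom from the gpart output, not mfid0p1
RUN="${RUN-echo}"          # prints each command; invoke with RUN= to execute

rebuild_zfsloader() {
    # Build and install the ZFS-aware loader from i386/zfsloader.
    $RUN cd "${SRC}/sys/boot/i386/zfsloader" &&
    $RUN make obj &&
    $RUN make depend &&
    $RUN make &&
    $RUN make install
}

reinstall_bootcode() {
    # -b writes the protective MBR; -p with -i 1 writes gptzfsboot
    # into partition index 1 of the disk.
    $RUN gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 "${DISK}"
}

rebuild_zfsloader
reinstall_bootcode
```

As written it only echoes the commands, which makes it easy to compare them against what was actually typed before touching the disk.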
> At this point I am contemplating how to evacuate all the data from there or
> better yet put some USB flash to boot from.
> I could provide further details/execute commands if needed. Any help would
> be appreciated.
>
--
Andriy Gapon
