Message-ID: <4F841AA8.3030602@FreeBSD.org>
Date: Tue, 10 Apr 2012 14:34:00 +0300
From: Andriy Gapon
To: Rumen Telbizov
Cc: freebsd-stable@FreeBSD.org
Subject: Re: ZFS: can't read MOS

on 09/04/2012 21:50 Rumen Telbizov said the following:
> Hello everyone,
>
> I have a ZFS FreeBSD 8.2-STABLE (Aug 30, 2011) that I am having issues with
> and might use some help.
>
> In a nutshell, this machine has been running fine for about a year and a
> half but after a recent power outage (complete colo blackout) I can't boot
> off the ZFS pool any more.
> Here's the error I get (attached screenshot as well):
>
> ZFS: i/o error - all block copies unavailable
> ZFS: can't read MOS
> ZFS: unexpected object set type 0
> ZFS: unexpected object set type 0
>
> FreeBSD/x86 boot
> Default: zroot:/boot/kernel/kernel
> boot: ZFS: unexpected object set type 0
>
> I've been searching the net high and low for an actual solution but all the
> threads end up nowhere.
> I hope I can get some clue here. Thanks in advance.

Not sure if the following could be of any help to you, but the
${SRC}/tools/tools/zfsboottest utility can help with diagnosing and
debugging such issues from userland (without requiring a reboot).

See also a small nitpick below.

> Here's the relevant hardware configuration of this box (serves as a backup
> box).
>
>    - SuperMicro 4U + another 4U totalling 48 x 2TB disks
>    - Hardware raid LSI 9261-8i holding both shelves giving 1 mfid0 device
>      to the OS
>    - Hardware raid 60 -- 6 x 8 raid6 groups
>    - ZFS with gptzfsboot installed on the "single" mfid0 device.
>      Partition table is:
>
> [root@mfsbsd /zroot/etc]# gpart show -l
> =>          34  140554616765  mfid0  GPT  (65T)
>             34           128      1  (null)  (64k)
>            162      33554432      2  swap  (16G)
>       33554594  140521062205      3  zroot  (65T)
>
>    - boot device is: vfs.root.mountfrom="zfs:zroot" (as per loader.conf)
>    - zpool status is:
>
> [root@mfsbsd /zroot/etc]# zpool status
>   pool: zroot
>  state: ONLINE
>   scan: scrub canceled on Mon Apr  9 09:48:14 2012
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zroot       ONLINE       0     0     0
>           mfid0p3   ONLINE       0     0     0
>
> errors: No known data errors
>
>    - zpool get all:
>
> [root@mfsbsd /zroot/etc]# zpool get all zroot
> NAME   PROPERTY       VALUE                SOURCE
> zroot  size           65T                  -
> zroot  capacity       36%                  -
> zroot  altroot        -                    default
> zroot  health         ONLINE               -
> zroot  guid           3339338746696340707  default
> zroot  version        28                   default
> *zroot  bootfs         zroot                local*
> zroot  delegation     on                   default
> zroot  autoreplace    off                  default
> zroot  cachefile      -                    default
> zroot  failmode       wait                 default
> zroot  listsnapshots  on                   local
> zroot  autoexpand     off                  default
> zroot  dedupditto     0                    default
> zroot  dedupratio     1.00x                -
> zroot  free           41.2T                -
> zroot  allocated      23.8T                -
> zroot  readonly       off                  -
>
> Here's what happened chronologically:
>
>    - Savvis Toronto blacked out completely for 31 minutes
>    - After power was restored this machine came up with the above error
>    - I managed to PXE boot into mfsbsd successfully and managed to import
>      the pool and access actual data/snapshots - no problem
>    - Shortly after another reboot the hardware raid controller complained
>      that it has lost its configuration and now sees only half of the
>      disks as foreign good and the rest as foreign bad. BIOS didn't see
>      any boot device.
>    - Spent some time on the phone with LSI and managed to restore the
>      hardware RAID by basically removing any and all configuration, making
>      disks unconfigured good and recreating the array in exactly the same
>      way as I created it in the beginning BUT with the important exception
>      that I did NOT initialize the array.
>    - After this I was back to square one where I could see all the data
>      without any loss (via mfsbsd) but cannot boot off the volume any more.
>    - First thing I tried was to restore the boot loader without any luck:
>      gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 mfid0p1
>    - Then out of desperation, took zfsboot, zfsloader, gptzfsboot from
>      9.0-RELEASE and replaced them in /boot, reinitialized again - no luck
>    - Currently running zdb -ccv zroot to check for any corruptions - I am
>      afraid this will take forever since I have *23.8T* used space. No
>      errors yet
>    - One thing I did notice is that zdb zroot returned the metaslab
>      information line by line very slowly (10-15 seconds a line). I don't
>      know if it's related.
>    - Another thing I tried (saw that in a thread) without any difference
>      whatsoever was:
>
> # cd src/sys/boot/i386/zfsboot
> # make clean; make cleandir
> # make obj ; make depend ; make
> # cd i386/loader

You probably wanted to do this in i386/zfsloader.

> # make install
> # cd /usr/src/sys/boot/i386/zfsboot
> # make install
> # sysctl kern.geom.debugflags=16
> # dd if=/boot/zfsboot of=/dev/da0 count=1
> # dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024
> # reboot
>
> At this point I am contemplating how to evacuate all the data from there or
> better yet put some USB flash to boot from.
> I could provide further details/execute commands if needed. Any help would
> be appreciated.

-- 
Andriy Gapon
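
A side note on the quoted gpart bootcode attempt: gpart(8) takes the name of the whole-disk GEOM as the final argument of the bootcode subcommand, while the command as quoted names the partition mfid0p1. That may only be a typo in the mail and not what was typed at the prompt. Below is a minimal dry-run sketch, assuming the disk is mfid0 and the 64k boot partition has index 1, as the quoted gpart show output suggests. It only prints the command and changes nothing on disk.

```sh
#!/bin/sh
# Dry-run sketch: build and print the bootcode command, do not execute it.
# Assumptions, taken from the quoted `gpart show -l` output:
#   disk  = mfid0  (whole-disk GEOM, not a partition such as mfid0p1)
#   index = 1      (the 64k partition, presumably of type freebsd-boot)
disk=mfid0
index=1

# -b writes the protective MBR, -p/-i write gptzfsboot into partition $index.
cmd="gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i $index $disk"
echo "$cmd"
```

Whether this changes the outcome on the machine described is not known from the thread; it only shows the argument form that gpart expects.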