Date:      Sat, 29 Feb 2020 23:16:30 -0800
From:      David Christensen <dpchrist@holgerdanske.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: ZFS i/o error on boot unable to start system
Message-ID:  <a2c79a9a-7cc5-4131-7b34-92a4c22955ae@holgerdanske.com>
In-Reply-To: <fcc9f93f-3680-d000-840c-a9be86a53ceb@holgerdanske.com>
References:  <eb8f8f32fcf5559774daf3a772a1ad2e.squirrel@webmail.harte-lyne.ca> <fcc9f93f-3680-d000-840c-a9be86a53ceb@holgerdanske.com>

On 2020-02-28 11:04, David Christensen wrote:
> On 2020-02-28 05:51, James B. Byrne via freebsd-questions wrote:
>> I have reported this on the forums as well.
>>
>> FreeBSD-12.1p2
>> raidz2 on 4x8TB HDD (reds)
>> root on zfs
>>
>> We did a hot restart of this host this morning and received the
>> following on the console:
>>
>> ZFS: i/o error - all block copies unavailable
>> ZFS: failed to read pool zroot directory object
>> gptzfsboot: failed to mount default pool zroot
>>
>> FreeBSD/x86 boot
>> ZFS: i/o error - all block copies unavailable
>> ZFS: can't find dataset 0
>> Default: zroot/<0x0>
>> boot:
>>
>> What has happened?  How do I get this system back up and online?
>>
>> My first thought is that, in modifying rc.conf to change some IPv4
>> address assignments, I may have inadvertently done something else that
>> caused this.  I cannot think of any other changes made since the system
>> was last restarted at noon yesterday.
>>
>> This is an urgent matter.  Any help is gratefully welcomed.
> 
> So, you have a desktop computer with four Western Digital Red 8 TB SATA 
> hard disk drives.  You installed FreeBSD-12.1-RELEASE-amd64 and ended up 
> with one ZFS RAIDZ2 pool with everything in it -- boot, root, usr, var, 
> tmp, home, whatever.  You have since upgraded to 12.1-p2.  Yesterday, 
> you edited /etc/rc.conf and now the system will not boot.
> 
> 
> The most likely explanation is that you broke rc.conf.
> 
> 
> One possible solution would be to boot a rescue shell or live system,
> import the RAIDZ2 pool, and fix rc.conf.  Be sure to export the pool 
> each time you are done editing and before attempting to boot it.
> 
> 
> Let us know how it works out.
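
For reference, the rescue steps suggested above could look roughly like
the sketch below.  It is only an illustration under assumptions that do
not come from the original post: the stock installer layout with the
root dataset at zroot/ROOT/default, /mnt as a scratch altroot, and the
hypothetical helper names run and rescue_rc_conf.  It only prints the
commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: edit rc.conf on a ZFS root pool from a live/rescue system.
# Assumed (not from the original post): the stock installer layout with
# the root dataset at zroot/ROOT/default, and /mnt as a scratch altroot.
POOL="${POOL:-zroot}"
ALTROOT="${ALTROOT:-/mnt}"
ROOTFS="${ROOTFS:-$POOL/ROOT/default}"

# Print each command unless DRY_RUN=0, so the steps can be reviewed first.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rescue_rc_conf() {
    # Import under an alternate root so the pool's mountpoints do not
    # shadow the live system; -f because the pool was last used by the
    # installed system and was never exported.
    run zpool import -f -R "$ALTROOT" "$POOL"
    # The installer's root dataset is typically canmount=noauto, so it
    # is mounted by hand here.
    run zfs mount "$ROOTFS"
    run "${EDITOR:-vi}" "$ALTROOT/etc/rc.conf"
    # Export before attempting to boot from the pool again.
    run zpool export "$POOL"
}
```

Calling rescue_rc_conf as-is lists the four commands for review;
DRY_RUN=0 rescue_rc_conf would actually run them as root.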

I put my operating systems on a single device -- typically a 2.5" SATA 
SSD, but sometimes a USB 3.0 flash drive (which tends to work the best 
in USB 2.0 ports).  I use BIOS boot, MBR partitioning, ZFS boot, GELI 
random key swap, and GELI passphrase ZFS root.  My bulk data is on 3.5" 
SATA HDDs, each with GPT, one large partition, and GELI, fed into a ZFS 
pool; one data pool per computer.  All of my tower and rack computers 
have 2.5" racks for the system drive.  One has racks for the 3.5" data 
drives (the others are internal).  The goal is to be able to mix and 
match computers, system drives, and data pools as required.


I migrated two 2.5" SSD system disks and two sets of pool drives to 
different computers this evening.


Initially, both computers failed to boot with the OP's message "ZFS: i/o 
error - all block copies unavailable".


Swapping the drives allowed one computer to boot.


The other produced a white screen of death at some late point in the 
FreeBSD boot loader (third stage?).  Unracking the data drives and 
resetting the CMOS settings to defaults allowed it to boot.  (My guess 
is that the white screen of death was caused by incorrect CMOS Setup 
video settings.)


I then shut down, racked the data drives, and rebooted.  The device node 
names had changed: the system drive was no longer ada0 (it was ada2), 
and root GELI was broken.  The solution was to reverse the order of the 
SATA port connections, unrack the data drives, boot, shut down, rack the 
data drives, and boot again.  (My guess is that /boot/zfs/zpool.cache is 
involved?)


David


