Date:      Fri, 6 Nov 2015 09:00:04 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        "Eugene M. Zheganin" <emz@norma.perm.ru>
Cc:        FreeBSD <freebsd-stable@freebsd.org>
Subject:   Re: unable to boot a healthy zfs pool: all block copies unavailable
Message-ID:  <CAOtMX2gB_-pygSRGtaHK%2BtEHHJsAxSx4uce4Di5uAwaPbwH8KQ@mail.gmail.com>
In-Reply-To: <563C406F.3090003@norma.perm.ru>
References:  <563BAE37.2090205@norma.perm.ru> <563BD121.4020404@FreeBSD.org> <563C406F.3090003@norma.perm.ru>

On Thu, Nov 5, 2015 at 10:53 PM, Eugene M. Zheganin <emz@norma.perm.ru> wrote:
> Hi.
>
> On 06.11.2015 02:58, Andriy Gapon wrote:
>>
>> It could be that your BIOS is not able to read past 1TB (512 * INT_MAX). That
>> seems to be a rather common problem for consumer motherboards.
>> Here is an example of how it looked for me:
>> https://people.freebsd.org/~avg/IMAG1099.jpg
>> Fortunately, it wasn't a root pool that got the error.
> Mine looks quite different: yours shows the pool info, while mine shows
> a 'BTX halted' message: http://zhegan.in/files/cannot-read-MOS.jpg . I'm
> running the latest BIOS for this motherboard (Gigabyte Z77P-D3, updated
> yesterday, though the BIOS itself still dates from 2012). If this is
> still a BIOS-related bug, what workaround can I use: re-slice the disk
> and create the root pool inside the first TB, right?
>
> Thanks.
> Eugene.

I notice that my 10.2-RELEASE VM prints the same message about "all
block copies unavailable" and then continues to boot just fine.  So I
wonder if that part is just a red herring.  There is another possibility
here: I have seen a bug where ZFS attempts to open the root pool's
vdevs by path (e.g. ada0p3) but can't find them because disks have been
replaced and no longer have their old devnames.  So vdev_geom searches
through the list of geom providers looking for any provider with the
correct ZFS GUID.  Normally it would find the right devname (e.g.
ada1p3).  But sometimes, because the disks are partitioned, it will
find the wrong provider first (e.g. the whole disk, ada1).  Since ZFS
has labels at both the beginning and the end of each vdev, vdev_geom
will see the label at the end of ada1 (really, it's the label at the
end of ada1p3, but it shares the same LBA that a label at the end of
ada1 would) and think that it opened ada1 successfully.
vdev_geom_open will then return, and at some later point another part
of ZFS will fail to read the MOS, and your boot will fail.
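
To make the aliasing concrete, here is a minimal Python sketch of the
label arithmetic.  It relies on the ZFS on-disk layout of four 256 KiB
labels per vdev, two at the front and two at the back, with the device
size rounded down to a multiple of the label size.  The disk and
partition sizes are made up for illustration, and the sketch assumes the
partition ends exactly where the disk ends.  A real GPT disk reserves a
few sectors at the end for the backup table, so whether the offsets
actually line up depends on how the rounding works out.

```python
# Sketch of the ZFS on-disk label layout: four 256 KiB labels per vdev,
# two at the front (L0, L1) and two at the back (L2, L3).
LABEL_SIZE = 256 * 1024
LABEL_COUNT = 4


def label_offsets(size):
    """Return the byte offsets of L0..L3 relative to the start of a device."""
    usable = size - size % LABEL_SIZE  # size is rounded down to a label multiple
    offsets = []
    for index in range(LABEL_COUNT):
        if index < LABEL_COUNT // 2:
            offsets.append(index * LABEL_SIZE)
        else:
            offsets.append(usable - (LABEL_COUNT - index) * LABEL_SIZE)
    return offsets


def absolute_label_offsets(start, size):
    """Same offsets, expressed relative to the start of the whole disk."""
    return [start + offset for offset in label_offsets(size)]


# Hypothetical geometry, chosen only to illustrate the aliasing.
DISK_SIZE = 100 * 2**30             # whole disk ("ada1" in the text above)
PART_START = 4 * 2**30              # where the ZFS partition ("ada1p3") begins
PART_SIZE = DISK_SIZE - PART_START  # assumption: partition ends at end of disk

disk_labels = absolute_label_offsets(0, DISK_SIZE)
part_labels = absolute_label_offsets(PART_START, PART_SIZE)

# Offsets where a probe of the whole disk lands on the partition's labels.
aliased = sorted(set(disk_labels) & set(part_labels))

if __name__ == "__main__":
    print("whole-disk label offsets:", disk_labels)
    print("partition label offsets: ", part_labels)
    print("aliased offsets:         ", aliased)
```

With this geometry the two front labels differ while the two trailing
labels coincide, which is why a GUID probe of the whole disk can appear
to succeed.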

If this is the case, then there are three possible solutions:
1) Fix vdev_geom.  I'm currently testing a patch to do just that.
2) With power off, shuffle disks around until the boot disks have the
same devnames that they had the last time you successfully booted.  If
this is a SATA-only computer, swapping cables between different mobo
ports should be enough.
3) Boot from your USB stick and carefully (oh so carefully!) erase the
ZFS labels at the end of the boot disk.  Don't touch the labels at the
beginning.  If your boot pool is mirrored, it should be sufficient to
erase the labels on one disk only.
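
For option 3 it may help to work out exactly which sectors hold the
trailing labels before touching anything.  The following is a rough,
read-only Python sketch, assuming the standard 256 KiB ZFS label size
and a media size and sector size that you supply yourself (for example
as reported by diskinfo -v).  The function name and the example size
are hypothetical.  It only does arithmetic and never opens a device.

```python
LABEL_SIZE = 256 * 1024   # size of one ZFS vdev label
TRAILING_LABELS = 2       # L2 and L3 sit at the end of the device


def trailing_label_region(media_size, sector_size=512):
    """Return (first_lba, sector_count) covering the two trailing labels.

    Read-only arithmetic; nothing here touches a disk.
    """
    if media_size < 4 * LABEL_SIZE:
        raise ValueError("device too small to hold four ZFS labels")
    if LABEL_SIZE % sector_size != 0:
        raise ValueError("sector size must divide the label size")
    # ZFS rounds the device size down to a multiple of the label size.
    usable = media_size - media_size % LABEL_SIZE
    start = usable - TRAILING_LABELS * LABEL_SIZE
    return start // sector_size, TRAILING_LABELS * LABEL_SIZE // sector_size


if __name__ == "__main__":
    # Hypothetical 100 GiB device with 512-byte sectors.
    first_lba, count = trailing_label_region(100 * 2**30, 512)
    print(f"trailing labels: {count} sectors starting at LBA {first_lba}")
```

Double-check the result against the real media size of the disk you
intend to modify; everything before first_lba, including the labels at
the beginning, should be left alone.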

-Alan
