Date:      Tue, 27 Dec 2016 00:22:25 +0100
From:      Stefan Bethke <stb@lassitu.de>
To:        freebsd-hackers@freebsd.org
Cc:        Andriy Gapon <avg@freebsd.org>, John Baldwin <jhb@FreeBSD.org>
Subject:   Re: on BIOS problems with disks larger than 2 TB
Message-ID:  <3BF31AE6-BA3D-498A-9203-500C75F957C5@lassitu.de>
In-Reply-To: <490347865.SvN7iQoFWI@ralph.baldwin.cx>
References:  <6cec427b-4df1-50f0-3014-a96e5f8210f5@FreeBSD.org> <490347865.SvN7iQoFWI@ralph.baldwin.cx>

> On 12.08.2016 at 21:18, John Baldwin <jhb@FreeBSD.org> wrote:
>
> On Tuesday, August 02, 2016 04:35:23 PM Andriy Gapon wrote:
>>
>> There are some BIOSes out there that do not properly support disks
>> larger than 2TB and cause boot problems if there is any data required
>> for boot at offsets larger than 2 TB (TiB, rather).
>>
>> The most typical victim is the ZFS boot if a boot pool includes disk
>> areas beyond 2TB, because a kernel, or zfsloader or any configuration
>> files required by the loader may end up in those "inaccessible" areas.
>>
>> It's obvious why 2TiB is a magic value here:
>> 2^32 * 512 = 2^41 = 2 * 2^40
>> So the problem seems to happen when an LBA is treated as a 32-bit
>> integer (unsigned).
>>
>> I happen to own one of the affected systems and I have done some more
>> investigation.  As far as I can see, the only actual problem in my case
>> is that a disk size in 512-byte sectors is reported modulo 2^32 by
>> INT 13h AH=48h.  If I "fix up" the parameter, then everything else
>> (i.e. actual data reads) seems to work just fine after that.
>>
>> I suspect that a large subclass of other problematic systems may have
>> exactly the same problem.
>>
>> Does anyone have an idea about how we could auto-detect and
>> auto-correct that problem?
>> Would that be worth the trouble at all, given the gradual de-orbiting
>> of BIOS systems?
>
> Hmm, I'm not sure how easy it is to handle this case (i.e. how do you
> know if an LBA beyond the size is really legit due to truncation vs
> coming from corrupted metadata).  Related is that tsoome's bcache stuff
> wants to know where the end of the disk is (to avoid reading off the
> end), so just ignoring the size is not easy.
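
To make the arithmetic quoted above concrete, here is a minimal Python
sketch of the 32-bit LBA limit and of a sector count reported modulo
2^32.  The function names and the 3 TB example disk are my own
assumptions for illustration; nothing here is taken from the loader.

```python
SECTOR_SIZE = 512          # bytes per sector, as in the quoted arithmetic
LBA32_SECTORS = 1 << 32    # sectors addressable with a 32-bit unsigned LBA


def lba32_limit_bytes(sector_size=SECTOR_SIZE):
    """Largest byte offset range reachable with a 32-bit LBA (2 TiB at 512 B)."""
    return LBA32_SECTORS * sector_size


def truncated_sector_count(true_sectors):
    """Sector count a BIOS would report if it kept only the low 32 bits."""
    return true_sectors % LBA32_SECTORS


def candidate_true_sizes(reported_sectors, max_wraps=4):
    """Sizes consistent with a reported count that wrapped 0..max_wraps times.

    This only enumerates possibilities; it cannot tell which one is real.
    """
    return [reported_sectors + k * LBA32_SECTORS for k in range(max_wraps + 1)]
```

For a hypothetical 3 TB disk (5859375000 sectors), the truncated count
is 1564407704 sectors, so the disk would appear to be well under 1 TB.
The candidate list shows why the size alone is ambiguous: some
independent piece of information is needed to pick the right multiple.
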

Having just been bitten by this, I would very much welcome an early
indication that the BIOS is deficient.

I have two systems (Asus P6-P8H61E) whose BIOS seems to be limited to
2 TB.  For about two years, everything seemed to be fine, until the
latest make world, when the new loader, kernel, and modules suddenly
ended up too far back on the disk:
All buffers synced.
Uptime: 32d4h27m58s
re0: link state changed to DOWN
re0: link st/boot/config: -DhS115200

ZFS: i/o error - all block copies unavailable
Invalid format

FreeBSD/x86 boot
Default: tank/be/default:/boot/kernel/kernel
boot:
ZFS: i/o error - all block copies unavailable
Invalid format

Of course, the systems are remote and I can't easily access them
physically.

Luckily, I did manage to load the old loader and kernel, and could
bring the system up again, but I will need to try to update the BIOS on
the machine, or even create a root ZFS pool that is far enough forward
on the main disk.

If the BIOS limitation cannot be worked around, gptboot/gptzfsboot
should at least try to read (for example) the backup GPT.  This way,
they could emit a warning that parts of the disk are not accessible
through the BIOS, and that future boots might suddenly stop working.
If I had known that the BIOS had this problem when I was setting up
these systems, I could have easily created a root pool and a separate
data pool, instead of just a root pool.
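
As a rough illustration of such a check, the following Python sketch
compares the sector count reported by the BIOS with the backup-header
LBA stored in the primary GPT header.  It is only a sketch under stated
assumptions: the function names and result strings are hypothetical, it
assumes the backup header sits in the last sector of the disk, and it
skips the header CRC validation that real boot code would need.

```python
import struct

GPT_SIGNATURE = b"EFI PART"   # signature at offset 0 of a GPT header
LBA32_SECTORS = 1 << 32       # sectors addressable with a 32-bit LBA


def gpt_backup_lba(header_sector):
    """Return the backup-header LBA from a primary GPT header, or None."""
    if len(header_sector) < 40 or header_sector[:8] != GPT_SIGNATURE:
        return None
    # Offset 24: LBA of this header; offset 32: LBA of the backup header.
    _my_lba, alternate_lba = struct.unpack_from("<QQ", header_sector, 24)
    return alternate_lba


def check_bios_size(reported_sectors, header_sector):
    """Classify the BIOS-reported size against what the GPT implies."""
    backup = gpt_backup_lba(header_sector)
    if backup is None:
        return "no-gpt"
    expected = backup + 1     # assumption: backup header is the last sector
    if reported_sectors >= expected:
        return "ok"
    if (expected - reported_sectors) % LBA32_SECTORS == 0:
        return "truncated"    # differs by a multiple of 2^32: likely wrapped
    return "mismatch"         # smaller for some other reason
```

A "truncated" result would be the case for printing a warning; actually
trying to read the backup header through the BIOS would then show
whether only the reported size is wrong or the reads fail as well.
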


Stefan

-- 
Stefan Bethke <stb@lassitu.de>   Fon +49 151 14070811