Date:      Tue, 07 Mar 2017 09:04:29 +0200
From:      Toomas Soome <tsoome@me.com>
To:        Lawrence Stewart <lstewart@freebsd.org>
Cc:        Andriy Gapon <avg@freebsd.org>, freebsd-fs@freebsd.org, Toomas Soome <tsoome@freebsd.org>
Subject:   Re: svn commit: r308089 - in head
Message-ID:  <CCB18F77-A9C3-4D22-82A3-9DD84DF783F9@me.com>
In-Reply-To: <c4cc03d0-d26e-f7c0-8399-d65f2aa0c5ef@freebsd.org>
References:  <201610291409.u9TE9WXJ020650@repo.freebsd.org> <c4cc03d0-d26e-f7c0-8399-d65f2aa0c5ef@freebsd.org>

> On 7 March 2017, at 7:25, Lawrence Stewart <lstewart@freebsd.org> wrote:
>
> Hi Andriy,
>
> On 30/10/2016 01:09, Andriy Gapon wrote:
>> Author: avg
>> Date: Sat Oct 29 14:09:32 2016
>> New Revision: 308089
>> URL: https://svnweb.freebsd.org/changeset/base/308089
>>
>> Log:
>>  zfsbootcfg: a simple tool to set next boot (one time) options for zfsboot
>>
>>  (gpt)zfsboot will read one-time boot directives from a special ZFS pool
>>  area.  The area was previously described as "Boot Block Header", but
>>  currently it is known as Pad2, marked as reserved and is zeroed out on
>>  pool creation.  The new code interprets data in this area, if any, using
>>  the same format as boot.config.  The area is immediately wiped out.
>>  Failure to parse the directives results in a reboot right after the
>>  cleanup.  Otherwise the boot sequence proceeds as usual.
>>
>>  zfsbootcfg writes zfsboot arguments specified on its command line to the
>>  Pad2 area of a disk identified by vfs.zfs.boot.primary_pool and
>>  vfs.zfs.boot.primary_vdev kenv variables that are set by loader during
>>  boot.  Please see the manual page for more.
>>
>>  Thanks to all who reviewed, contributed and made suggestions!  There are
>>  many potential improvements to the feature, please see the review for
>>  details.
>>
>>  Reviewed by:	wblock (docs)
>>  Discussed with:	jhb, tsoome
>>  MFC after:	3 weeks
>>  Relnotes:	yes
>>  Differential Revision: https://reviews.freebsd.org/D7612
>>
>> Added:
>>  head/sbin/zfsbootcfg/
>>  head/sbin/zfsbootcfg/Makefile   (contents, props changed)
>>  head/sbin/zfsbootcfg/zfsbootcfg.8   (contents, props changed)
>>  head/sbin/zfsbootcfg/zfsbootcfg.c   (contents, props changed)
>> Modified:
>>  head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
>>  head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c
>>  head/sbin/Makefile
>>  head/sys/boot/i386/common/drv.c
>>  head/sys/boot/i386/common/drv.h
>>  head/sys/boot/i386/gptzfsboot/Makefile
>>  head/sys/boot/i386/zfsboot/Makefile
>>  head/sys/boot/i386/zfsboot/zfsboot.c
>>  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h
>>  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c
>>  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>  head/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h
>>
> [snip]
>> @@ -634,7 +712,39 @@ main(void)
>>     primary_spa = spa;
>>     primary_vdev = spa_get_primary_vdev(spa);
>>
>> -    if (zfs_spa_init(spa) != 0 || zfs_mount(spa, 0, &zfsmount) != 0) {
>> +    nextboot = 0;
>> +    rc  = vdev_read_pad2(primary_vdev, cmd, sizeof(cmd));
>> +    if (vdev_clear_pad2(primary_vdev))
>> +	printf("failed to clear pad2 area of primary vdev\n");
>> +    if (rc == 0) {
>> +	if (*cmd) {
>> +	    /*
>> +	     * We could find an old-style ZFS Boot Block header here.
>> +	     * Simply ignore it.
>> +	     */
>> +	    if (*(uint64_t *)cmd != 0x2f5b007b10c) {
>> +		/*
>> +		 * Note that parse() is destructive to cmd[] and we also want
>> +		 * to honor RBX_QUIET option that could be present in cmd[].
>> +		 */
>> +		nextboot =3D 1;
>> +		memcpy(cmddup, cmd, sizeof(cmd));
>> +		if (parse()) {
>> +		    printf("failed to parse pad2 area of primary vdev\n");
>> +		    reboot();
>> +		}
>> +		if (!OPT_CHECK(RBX_QUIET))
>> +		    printf("zfs nextboot: %s\n", cmddup);
>> +	    }
>> +	    /* Do not process this command twice */
>> +	    *cmd = 0;
>> +	}
>> +    } else
>> +	printf("failed to read pad2 area of primary vdev\n");
>> +
>
> I've just taken Allan Jude's & co-conspirators' work for a spin that
> allows gptzfsboot to boot from a geli + ZFS partition. Everything is
> working amazingly well, but I see the above "failed to read pad2 area of
> primary vdev" message on every boot.
>
> It doesn't appear to cause any problems per se and the system
> boots/works fine. I assume that message is printed to signal an
> unexpected situation though, so figured I'd get in touch to get your
> thoughts.
>
> I installed the KVM-based virtual machine system manually from the live
> shell of:
>
> FreeBSD-12.0-CURRENT-amd64-20170301-r314495-disc1.iso
>
> The partitioning is very simple:
>
> gpart create -s gpt /dev/vtbd0
> gpart add -t freebsd-boot -a 8 -b 40 -s 512k vtbd0
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd0
> gpart add -t freebsd-zfs -b 2088 vtbd0
>
>  root# gpart show
>  =>      40  83886000  vtbd0  GPT  (40G)
> 	  40      1024      1  freebsd-boot  (512K)
> 	1064      1024         - free -  (512K)
> 	2088  83883952      2  freebsd-zfs  (40G)
>
> geli was inited/attached to vtbd0p2 and the zpool was created with command:
>
> zpool create -o altroot=/tmp/zroot -o cachefile=/tmp/zpool.cache -O
> checksum=skein -O compression=lz4 <pool> vtbd0p2.eli
>
> i.e. the entire pool including bootfs is using skein for checksumming
> and lz4 for compression.
>
> I hit another boot bug using skein previously which Toomas (CCed) fixed,
> and am wondering if this issue might also be related to the skein
> implementation.
>
> I haven't tested if the zfsbootcfg functionality works for fear that the
> printf is indicating a low level problem with the zpool. I can test
> potentially destructive things and break the pool though if that would
> be helpful.
>
> Any thoughts?
>
> Cheers,
> Lawrence


The problem with having a pool on a geli-encrypted partition is that all
reads from that partition have to go through a geli-aware read() function,
and the same is true for writes (which is important for the nextboot
feature). For gptzfsboot/zfsboot this means the disk reads and writes would
need to go through the geli-aware functions; we cannot issue “pure” disk
I/O directly.
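
To illustrate the layering described above, here is a minimal sketch in C.
It is not the gptzfsboot code: the in-memory disk, struct dsk, raw_read(),
raw_write(), dsk_read(), dsk_write() and the XOR transform are all
hypothetical stand-ins, and the XOR only marks the place where geli would
decrypt or encrypt a sector. The point is that callers such as a pad2
reader would use dsk_read()/dsk_write() and never the raw functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NSECTORS    4

/* Hypothetical in-memory "disk" holding the on-disk (possibly encrypted) bytes. */
static uint8_t raw_disk[NSECTORS * SECTOR_SIZE];

/* Hypothetical device descriptor; is_geli marks an encrypted provider. */
struct dsk {
    int     is_geli;
    uint8_t key;    /* toy key; real geli uses a proper cipher such as AES-XTS */
};

/* "Pure" disk read: copies sectors verbatim, knows nothing about encryption. */
static int raw_read(uint64_t lba, void *buf, size_t nsec)
{
    assert(buf != NULL);
    if (lba > NSECTORS || nsec > NSECTORS - lba)
        return -1;
    memcpy(buf, raw_disk + lba * SECTOR_SIZE, nsec * SECTOR_SIZE);
    return 0;
}

/* "Pure" disk write: stores the bytes exactly as given. */
static int raw_write(uint64_t lba, const void *buf, size_t nsec)
{
    assert(buf != NULL);
    if (lba > NSECTORS || nsec > NSECTORS - lba)
        return -1;
    memcpy(raw_disk + lba * SECTOR_SIZE, buf, nsec * SECTOR_SIZE);
    return 0;
}

/* Toy transform standing in for geli; XOR is its own inverse. */
static void toy_crypt(const struct dsk *d, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= d->key;
}

/* Encryption-aware read: fetch raw sectors, then decrypt if needed. */
static int dsk_read(const struct dsk *d, uint64_t lba, void *buf, size_t nsec)
{
    int rc = raw_read(lba, buf, nsec);

    if (rc == 0 && d->is_geli)
        toy_crypt(d, buf, nsec * SECTOR_SIZE);
    return rc;
}

/* Encryption-aware write: encrypt a private copy, then store it. */
static int dsk_write(const struct dsk *d, uint64_t lba, const void *buf,
    size_t nsec)
{
    const uint8_t *src = buf;
    uint8_t tmp[SECTOR_SIZE];

    for (size_t i = 0; i < nsec; i++) {
        memcpy(tmp, src + i * SECTOR_SIZE, SECTOR_SIZE);
        if (d->is_geli)
            toy_crypt(d, tmp, SECTOR_SIZE);
        if (raw_write(lba + i, tmp, 1) != 0)
            return -1;
    }
    return 0;
}
```

In this sketch, a sector written with dsk_write() reads back intact through
dsk_read(), while raw_read() of the same sector returns the transformed
bytes, which is the kind of data a geli-unaware reader would be handed.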

rgds,
toomas



