Date:      Tue, 7 Mar 2017 23:33:52 +1100
From:      Lawrence Stewart <lstewart@freebsd.org>
To:        Toomas Soome <tsoome@me.com>
Cc:        Andriy Gapon <avg@freebsd.org>, freebsd-fs@freebsd.org, Toomas Soome <tsoome@freebsd.org>, allanjude@freebsd.org
Subject:   Failed to read pad2 area of primary vdev [was: Re: svn commit: r308089 - in head]
Message-ID:  <5497b4a0-836a-668c-f18c-f2adb5b93f7a@freebsd.org>
In-Reply-To: <814E1C65-23E3-42A1-8093-8008DF188506@me.com>
References:  <201610291409.u9TE9WXJ020650@repo.freebsd.org> <c4cc03d0-d26e-f7c0-8399-d65f2aa0c5ef@freebsd.org> <CCB18F77-A9C3-4D22-82A3-9DD84DF783F9@me.com> <9f0b2f93-04b8-b90b-3cb5-13b8539b9171@freebsd.org> <814E1C65-23E3-42A1-8093-8008DF188506@me.com>

On 07/03/2017 19:48, Toomas Soome wrote:
> 
>> On 7 March 2017, at 10:25, Lawrence Stewart <lstewart@freebsd.org> wrote:
>>
>> On 07/03/2017 18:04, Toomas Soome wrote:
>>>
>>>> On 7 March 2017, at 7:25, Lawrence Stewart <lstewart@freebsd.org> wrote:
>>>>
>>>> Hi Andriy,
>>>>
>>>> On 30/10/2016 01:09, Andriy Gapon wrote:
>>>>> Author: avg
>>>>> Date: Sat Oct 29 14:09:32 2016
>>>>> New Revision: 308089
>>>>> URL: https://svnweb.freebsd.org/changeset/base/308089
>>>>>
>>>>> Log:
>>>>> zfsbootcfg: a simple tool to set next boot (one time) options for
>>>>> zfsboot
>>>>>
>>>>> (gpt)zfsboot will read one-time boot directives from a special ZFS pool
>>>>> area.  The area was previously described as "Boot Block Header", but
>>>>> currently it is known as Pad2, marked as reserved and is zeroed out on
>>>>> pool creation.  The new code interprets data in this area, if any, using
>>>>> the same format as boot.config.  The area is immediately wiped out.
>>>>> Failure to parse the directives results in a reboot right after the
>>>>> cleanup.  Otherwise the boot sequence proceeds as usual.
>>>>>
>>>>> zfsbootcfg writes zfsboot arguments specified on its command line to the
>>>>> Pad2 area of a disk identified by vfs.zfs.boot.primary_pool and
>>>>> vfs.zfs.boot.primary_vdev kenv variables that are set by loader during
>>>>> boot.  Please see the manual page for more.
>>>>>
>>>>> Thanks to all who reviewed, contributed and made suggestions!  There are
>>>>> many potential improvements to the feature, please see the review for
>>>>> details.
>>>>>
>>>>> Reviewed by: wblock (docs)
>>>>> Discussed with: jhb, tsoome
>>>>> MFC after: 3 weeks
>>>>> Relnotes: yes
>>>>> Differential Revision: https://reviews.freebsd.org/D7612
>>>>>
>>>>> Added:
>>>>> head/sbin/zfsbootcfg/
>>>>> head/sbin/zfsbootcfg/Makefile   (contents, props changed)
>>>>> head/sbin/zfsbootcfg/zfsbootcfg.8   (contents, props changed)
>>>>> head/sbin/zfsbootcfg/zfsbootcfg.c   (contents, props changed)
>>>>> Modified:
>>>>> head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
>>>>> head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c
>>>>> head/sbin/Makefile
>>>>> head/sys/boot/i386/common/drv.c
>>>>> head/sys/boot/i386/common/drv.h
>>>>> head/sys/boot/i386/gptzfsboot/Makefile
>>>>> head/sys/boot/i386/zfsboot/Makefile
>>>>> head/sys/boot/i386/zfsboot/zfsboot.c
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>>> head/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h
>>>>>
>>>> [snip]
>>>>> @@ -634,7 +712,39 @@ main(void)
>>>>>    primary_spa = spa;
>>>>>    primary_vdev = spa_get_primary_vdev(spa);
>>>>>
>>>>> -    if (zfs_spa_init(spa) != 0 || zfs_mount(spa, 0, &zfsmount) != 0) {
>>>>> +    nextboot = 0;
>>>>> +    rc  = vdev_read_pad2(primary_vdev, cmd, sizeof(cmd));
>>>>> +    if (vdev_clear_pad2(primary_vdev))
>>>>> +        printf("failed to clear pad2 area of primary vdev\n");
>>>>> +    if (rc == 0) {
>>>>> +        if (*cmd) {
>>>>> +            /*
>>>>> +             * We could find an old-style ZFS Boot Block header here.
>>>>> +             * Simply ignore it.
>>>>> +             */
>>>>> +            if (*(uint64_t *)cmd != 0x2f5b007b10c) {
>>>>> +                /*
>>>>> +                 * Note that parse() is destructive to cmd[] and we also want
>>>>> +                 * to honor RBX_QUIET option that could be present in cmd[].
>>>>> +                 */
>>>>> +                nextboot = 1;
>>>>> +                memcpy(cmddup, cmd, sizeof(cmd));
>>>>> +                if (parse()) {
>>>>> +                    printf("failed to parse pad2 area of primary vdev\n");
>>>>> +                    reboot();
>>>>> +                }
>>>>> +                if (!OPT_CHECK(RBX_QUIET))
>>>>> +                    printf("zfs nextboot: %s\n", cmddup);
>>>>> +            }
>>>>> +            /* Do not process this command twice */
>>>>> +            *cmd = 0;
>>>>> +        }
>>>>> +    } else
>>>>> +        printf("failed to read pad2 area of primary vdev\n");
>>>>> +
>>>>
>>>> I've just taken Allan Jude's & co-conspirators' work for a spin that
>>>> allows gptzfsboot to boot from a geli + ZFS partition. Everything is
>>>> working amazingly well, but I see the above "failed to read pad2 area of
>>>> primary vdev" message on every boot.
>>>>
>>>> It doesn't appear to cause any problems per se and the system
>>>> boots/works fine. I assume that message is printed to signal an
>>>> unexpected situation though, so figured I'd get in touch to get your
>>>> thoughts.
>>>>
>>>>
>>>>
>>>> I installed the KVM-based virtual machine system manually from the live
>>>> shell of:
>>>>
>>>> FreeBSD-12.0-CURRENT-amd64-20170301-r314495-disc1.iso
>>>>
>>>>
>>>>
>>>> The partitioning is very simple:
>>>>
>>>> gpart create -s gpt /dev/vtbd0
>>>> gpart add -t freebsd-boot -a 8 -b 40 -s 512k vtbd0
>>>> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd0
>>>> gpart add -t freebsd-zfs -b 2088 vtbd0
>>>>
>>>> root# gpart show
>>>> =>      40  83886000  vtbd0  GPT  (40G)
>>>>   40      1024      1  freebsd-boot  (512K)
>>>> 1064      1024         - free -  (512K)
>>>> 2088  83883952      2  freebsd-zfs  (40G)
>>>>
>>>>
>>>>
>>>> geli was inited/attached to vtbd0p2 and the zpool was created with
>>>> command:
>>>>
>>>> zpool create -o altroot=/tmp/zroot -o cachefile=/tmp/zpool.cache -O
>>>> checksum=skein -O compression=lz4 <pool> vtbd0p2.eli
>>>>
>>>> i.e. the entire pool including bootfs is using skein for checksumming
>>>> and lz4 for compression.
>>>>
>>>>
>>>>
>>>> I hit another boot bug using skein previously which Toomas (CCed) fixed,
>>>> and am wondering if this issue might also be related to the skein
>>>> implementation.
>>>>
>>>> I haven't tested if the zfsbootcfg functionality works for fear that the
>>>> printf is indicating a low level problem with the zpool. I can test
>>>> potentially destructive things and break the pool though if that would
>>>> be helpful.
>>>>
>>>> Any thoughts?
>>>>
>>>> Cheers,
>>>> Lawrence
>>>
>>>
>>> The problem with having a pool on a geli-encrypted partition is that all
>>> reads from such a partition have to go through a geli-aware
>>> read() function, and the same is true for writes (which is important
>>> for the nextboot feature). For gptzfsboot/zfsboot this means that
>>> disk reads/writes would need to go through the geli-aware functions;
>>> we cannot issue “pure” disk I/O directly.
>>
>> [+Allan]
>>
>> Presumably that functionality exists given that the geli support Allan
>> added to gptzfsboot is able to read loader and loader is able to read
>> everything in /boot from the geli-encrypted ZFS pool?
> 
> 
> The problem is deeper. The idea behind nextboot is to provide
> recovery from a failed boot, so if you set a nextboot dataset and
> attempt to boot from it, you need to do two things: 1. detect the
> nextboot config, so that you can actually use it, and 2. reset it
> as early as possible, because later you may not have a chance.
> 
> This means gptzfsboot has to read the config to know where to load
> zfsloader from, and gptzfsboot has to reset the config, so that if
> anything goes wrong, the next boot falls back to the “normal” boot.
> So either gptzfsboot has to know how to deal with geli when handling
> nextboot, or with geli you just cannot use the nextboot config.
> 
> There is a similar issue with using the boot block area in the zfs pool
> label: to store and use gptzfsboot in the pool label boot area, either
> boot1 has to know how to read geli, or geli must be able to leave the
> boot block area unencrypted, or we just cannot use that area [with
> geli]. All in all, it is another example of the chicken-and-egg problem :)

Thanks to both you and Andriy for the detailed explanation.

To clarify the current state of affairs as I understand it:

- zfsbootcfg will set parameters correctly in the pool's Pad2 location
when the system is booted and running, i.e. zfsbootcfg is safe to use
in this scenario and won't scribble in any places it shouldn't.

- The support in gptzfsboot for these zfsbootcfg Pad2 parameters does
not know to decrypt first, so it reads "garbage" and harmlessly gives up,
i.e. any parameters which have been set by zfsbootcfg are completely
ignored.

If I've got that right, then I guess the printf is harmless and safe
to ignore, and I just shouldn't expect zfsbootcfg to work as advertised
until such time as someone figures out if/how to add the necessary
support to gptzfsboot?
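
To make sure I'm reading the diff correctly, here is a minimal userland
sketch of the control flow as I understand it: read Pad2, wipe it
regardless of the read result, and only then decide whether there is a
directive to act on. Everything named sim_* or NB_*, plus handle_nextboot()
and the "readable" flag, is hypothetical and exists only for this
illustration; the real code uses vdev_read_pad2()/vdev_clear_pad2() against
the disk. Modelling the geli case as a plain read failure is my assumption,
based only on the fact that the "failed to read" branch is the one that
prints on my system.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_PAD2_SIZE 8192                      /* assumed size, illustrative only */
#define OLD_BOOT_HEADER_MAGIC 0x2f5b007b10cULL  /* constant taken from the r308089 diff */

enum nextboot_result {
    NB_READ_FAILED,  /* rc != 0: the branch that prints "failed to read pad2 ..." */
    NB_EMPTY,        /* area is zeroed, nothing to do */
    NB_OLD_HEADER,   /* old-style ZFS Boot Block header, ignored */
    NB_DIRECTIVE     /* cmd holds a one-time directive for parse() */
};

/* Hypothetical in-memory stand-in for the primary vdev. */
struct sim_vdev {
    char pad2[SIM_PAD2_SIZE];
    int  readable;   /* 0 models a boot block that cannot do I/O through geli */
};

static int
sim_read_pad2(struct sim_vdev *vd, char *buf, size_t len)
{
    assert(len <= SIM_PAD2_SIZE);
    if (!vd->readable)
        return (-1);
    memcpy(buf, vd->pad2, len);
    return (0);
}

static int
sim_clear_pad2(struct sim_vdev *vd)
{
    if (!vd->readable)
        return (-1);
    memset(vd->pad2, 0, sizeof(vd->pad2));
    return (0);
}

/* Mirrors the order of operations in the diff: read, wipe, then inspect. */
static enum nextboot_result
handle_nextboot(struct sim_vdev *vd, char *cmd, size_t len)
{
    uint64_t magic;
    int rc;

    assert(len >= sizeof(magic));
    rc = sim_read_pad2(vd, cmd, len);
    (void)sim_clear_pad2(vd);   /* the wipe is attempted even if the read failed */
    if (rc != 0)
        return (NB_READ_FAILED);
    if (cmd[0] == '\0')
        return (NB_EMPTY);
    memcpy(&magic, cmd, sizeof(magic));
    if (magic == OLD_BOOT_HEADER_MAGIC) {
        cmd[0] = '\0';          /* do not process this buffer again */
        return (NB_OLD_HEADER);
    }
    return (NB_DIRECTIVE);
}
```

With readable set to 1, a directive stored in pad2 is returned once and
the area is empty on the following call. With readable set to 0, the
function returns NB_READ_FAILED and the stored directive is left untouched,
which is the "parameters are ignored" behaviour I described above.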

Cheers,
Lawrence


