Date: Tue, 07 Mar 2017 10:48:16 +0200
From: Toomas Soome <tsoome@me.com>
To: Lawrence Stewart <lstewart@freebsd.org>
Cc: Andriy Gapon <avg@freebsd.org>, freebsd-fs@freebsd.org, Toomas Soome <tsoome@freebsd.org>, allanjude@freebsd.org
Subject: Re: svn commit: r308089 - in head
Message-ID: <814E1C65-23E3-42A1-8093-8008DF188506@me.com>
In-Reply-To: <9f0b2f93-04b8-b90b-3cb5-13b8539b9171@freebsd.org>
References: <201610291409.u9TE9WXJ020650@repo.freebsd.org> <c4cc03d0-d26e-f7c0-8399-d65f2aa0c5ef@freebsd.org> <CCB18F77-A9C3-4D22-82A3-9DD84DF783F9@me.com> <9f0b2f93-04b8-b90b-3cb5-13b8539b9171@freebsd.org>

> On 7 March 2017, at 10:25, Lawrence Stewart <lstewart@freebsd.org> wrote:
>
> On 07/03/2017 18:04, Toomas Soome wrote:
>>
>>> On 7 March 2017, at 7:25, Lawrence Stewart <lstewart@freebsd.org> wrote:
>>>
>>> Hi Andriy,
>>>
>>> On 30/10/2016 01:09, Andriy Gapon wrote:
>>>> Author: avg
>>>> Date: Sat Oct 29 14:09:32 2016
>>>> New Revision: 308089
>>>> URL: https://svnweb.freebsd.org/changeset/base/308089
>>>>
>>>> Log:
>>>>   zfsbootcfg: a simple tool to set next boot (one time) options for zfsboot
>>>>
>>>>   (gpt)zfsboot will read one-time boot directives from a special ZFS pool
>>>>   area.  The area was previously described as "Boot Block Header", but
>>>>   currently it is known as Pad2, marked as reserved and is zeroed out on
>>>>   pool creation.  The new code interprets data in this area, if any, using
>>>>   the same format as boot.config.  The area is immediately wiped out.
>>>>   Failure to parse the directives results in a reboot right after the
>>>>   cleanup.  Otherwise the boot sequence proceeds as usual.
>>>>
>>>>   zfsbootcfg writes zfsboot arguments specified on its command line to the
>>>>   Pad2 area of a disk identified by vfs.zfs.boot.primary_pool and
>>>>   vfs.zfs.boot.primary_vdev kenv variables that are set by loader during
>>>>   boot.  Please see the manual page for more.
>>>>
>>>>   Thanks to all who reviewed, contributed and made suggestions!  There are
>>>>   many potential improvements to the feature, please see the review for
>>>>   details.
>>>>
>>>>   Reviewed by: wblock (docs)
>>>>   Discussed with: jhb, tsoome
>>>>   MFC after: 3 weeks
>>>>   Relnotes: yes
>>>>   Differential Revision: https://reviews.freebsd.org/D7612
>>>>
>>>> Added:
>>>>   head/sbin/zfsbootcfg/
>>>>   head/sbin/zfsbootcfg/Makefile   (contents, props changed)
>>>>   head/sbin/zfsbootcfg/zfsbootcfg.8   (contents, props changed)
>>>>   head/sbin/zfsbootcfg/zfsbootcfg.c   (contents, props changed)
>>>> Modified:
>>>>   head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
>>>>   head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c
>>>>   head/sbin/Makefile
>>>>   head/sys/boot/i386/common/drv.c
>>>>   head/sys/boot/i386/common/drv.h
>>>>   head/sys/boot/i386/gptzfsboot/Makefile
>>>>   head/sys/boot/i386/zfsboot/Makefile
>>>>   head/sys/boot/i386/zfsboot/zfsboot.c
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>>>   head/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h
>>>>
>>> [snip]
>>>> @@ -634,7 +712,39 @@ main(void)
>>>>      primary_spa = spa;
>>>>      primary_vdev = spa_get_primary_vdev(spa);
>>>>
>>>> -    if (zfs_spa_init(spa) != 0 || zfs_mount(spa, 0, &zfsmount) != 0) {
>>>> +    nextboot = 0;
>>>> +    rc = vdev_read_pad2(primary_vdev, cmd, sizeof(cmd));
>>>> +    if (vdev_clear_pad2(primary_vdev))
>>>> +        printf("failed to clear pad2 area of primary vdev\n");
>>>> +    if (rc == 0) {
>>>> +        if (*cmd) {
>>>> +            /*
>>>> +             * We could find an old-style ZFS Boot Block header here.
>>>> +             * Simply ignore it.
>>>> +             */
>>>> +            if (*(uint64_t *)cmd != 0x2f5b007b10c) {
>>>> +                /*
>>>> +                 * Note that parse() is destructive to cmd[] and we also want
>>>> +                 * to honor RBX_QUIET option that could be present in cmd[].
>>>> +                 */
>>>> +                nextboot = 1;
>>>> +                memcpy(cmddup, cmd, sizeof(cmd));
>>>> +                if (parse()) {
>>>> +                    printf("failed to parse pad2 area of primary vdev\n");
>>>> +                    reboot();
>>>> +                }
>>>> +                if (!OPT_CHECK(RBX_QUIET))
>>>> +                    printf("zfs nextboot: %s\n", cmddup);
>>>> +            }
>>>> +            /* Do not process this command twice */
>>>> +            *cmd = 0;
>>>> +        }
>>>> +    } else
>>>> +        printf("failed to read pad2 area of primary vdev\n");
>>>> +
>>>
>>> I've just taken Allan Jude's & co-conspirators' work for a spin that
>>> allows gptzfsboot to boot from a geli + ZFS partition. Everything is
>>> working amazingly well, but I see the above "failed to read pad2 area of
>>> primary vdev" message on every boot.
>>>
>>> It doesn't appear to cause any problems per se and the system
>>> boots/works fine. I assume that message is printed to signal an
>>> unexpected situation though, so figured I'd get in touch to get your
>>> thoughts.
>>>
>>> I installed the KVM-based virtual machine system manually from the live
>>> shell of:
>>>
>>> FreeBSD-12.0-CURRENT-amd64-20170301-r314495-disc1.iso
>>>
>>> The partitioning is very simple:
>>>
>>> gpart create -s gpt /dev/vtbd0
>>> gpart add -t freebsd-boot -a 8 -b 40 -s 512k vtbd0
>>> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd0
>>> gpart add -t freebsd-zfs -b 2088 vtbd0
>>>
>>> root# gpart show
>>> =>      40  83886000  vtbd0  GPT  (40G)
>>>         40      1024      1  freebsd-boot  (512K)
>>>       1064      1024         - free -  (512K)
>>>       2088  83883952      2  freebsd-zfs  (40G)
>>>
>>> geli was inited/attached to vtbd0p2 and the zpool was created with command:
>>>
>>> zpool create -o altroot=/tmp/zroot -o cachefile=/tmp/zpool.cache -O
>>> checksum=skein -O compression=lz4 <pool> vtbd0p2.eli
>>>
>>> i.e. the entire pool including bootfs is using skein for checksumming
>>> and lz4 for compression.
>>>
>>> I hit another boot bug using skein previously which Toomas (CCed) fixed,
>>> and am wondering if this issue might also be related to the skein
>>> implementation.
>>>
>>> I haven't tested if the zfsbootcfg functionality works for fear that the
>>> printf is indicating a low level problem with the zpool. I can test
>>> potentially destructive things and break the pool though if that would
>>> be helpful.
>>>
>>> Any thoughts?
>>>
>>> Cheers,
>>> Lawrence
>>
>> The problem with having a pool on a geli encrypted partition is that all
>> the reads done on such a partition have to go through a geli aware read()
>> function, and the same is true for writes (which is important for the
>> nextboot feature). So what it means for gptzfsboot/zfsboot is that we
>> would need to have the disk reads/writes go through the geli aware
>> functions and we can not issue “pure” disk io directly.
>
> [+Allan]
>
> Presumably that functionality exists given that the geli support Allan
> added to gptzfsboot is able to read loader and loader is able to read
> everything in /boot from the geli-encrypted ZFS pool?

The problem is deeper. The idea behind nextboot is to provide recovery
from a failed boot, so if you set a nextboot dataset and attempt to boot
from it, two things have to happen: 1. the nextboot config has to be
detected, so that it can actually be used, and 2. it has to be reset as
early as possible, because later there may not be a chance.

So gptzfsboot has to read out the config to know where to load zfsloader
from, and gptzfsboot has to reset the config, so that if anything goes
wrong, the next boot falls back to the “normal” boot. This means that
either gptzfsboot has to know how to deal with geli in the context of
handling nextboot, or, with geli, you just can not use the nextboot config.
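
To make the ordering above concrete, here is a minimal sketch in C of "read the directive, wipe it immediately, and do both through whatever I/O layer the device needs". It is not the zfsboot code: `struct pad_io`, `nextboot_fetch()`, `PAD2_SIZE` and the XOR-based `struct xor_dev` are all hypothetical names invented for this illustration, and the XOR transform only stands in for geli so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAD2_SIZE 64	/* hypothetical size, kept small for the sketch */

/*
 * Hypothetical I/O hooks.  An unencrypted pool would plug in plain disk
 * routines here; an encrypted one would plug in routines that decrypt on
 * read and encrypt on write.
 */
struct pad_io {
	int (*read)(void *dev, void *buf, size_t len);
	int (*write)(void *dev, const void *buf, size_t len);
	void *dev;
};

/* Toy stand-in for an encrypted provider: bytes are stored XORed with key. */
struct xor_dev {
	uint8_t key;
	uint8_t raw[PAD2_SIZE];
};

static int
xor_read(void *dev, void *buf, size_t len)
{
	struct xor_dev *d = dev;
	uint8_t *p = buf;

	if (len > PAD2_SIZE)
		return (-1);
	for (size_t i = 0; i < len; i++)
		p[i] = d->raw[i] ^ d->key;
	return (0);
}

static int
xor_write(void *dev, const void *buf, size_t len)
{
	struct xor_dev *d = dev;
	const uint8_t *p = buf;

	if (len > PAD2_SIZE)
		return (-1);
	for (size_t i = 0; i < len; i++)
		d->raw[i] = p[i] ^ d->key;
	return (0);
}

/*
 * Fetch the one-time directive into out[] and wipe the on-disk copy before
 * the caller acts on it.  The wipe is attempted even if the read failed, so
 * a bad directive cannot survive into the following boot.  Returns 0 on
 * success; out[] is an empty string when no directive was stored.
 */
static int
nextboot_fetch(struct pad_io *io, char *out, size_t outlen)
{
	static const char zeros[PAD2_SIZE];
	int rc, wrc;

	assert(outlen >= PAD2_SIZE);
	rc = io->read(io->dev, out, PAD2_SIZE);
	wrc = io->write(io->dev, zeros, PAD2_SIZE);
	if (rc != 0 || wrc != 0)
		return (-1);
	out[PAD2_SIZE - 1] = '\0';
	return (0);
}
```

A caller would fill in `struct pad_io` once, with plain or encryption-aware hooks depending on the provider, and call `nextboot_fetch()` before loading anything else. In this sketch a failed wipe is treated as an error, which is a choice made for the example only; the committed code quoted earlier prints a message and carries on.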

A similar issue exists with using the boot block area in the zfs pool
label: to be able to store and use gptzfsboot in the pool label boot area,
either boot1 has to know how to read geli, or geli must be able to leave
the boot block area unencrypted, or we just can not use that area [with
geli].

All in all, it is another example of the chicken and egg problem :)

rgds,
toomas