Date: Tue, 07 Mar 2017 09:04:29 +0200
From: Toomas Soome <tsoome@me.com>
To: Lawrence Stewart <lstewart@freebsd.org>
Cc: Andriy Gapon <avg@freebsd.org>, freebsd-fs@freebsd.org, Toomas Soome <tsoome@freebsd.org>
Subject: Re: svn commit: r308089 - in head
Message-ID: <CCB18F77-A9C3-4D22-82A3-9DD84DF783F9@me.com>
In-Reply-To: <c4cc03d0-d26e-f7c0-8399-d65f2aa0c5ef@freebsd.org>
References: <201610291409.u9TE9WXJ020650@repo.freebsd.org> <c4cc03d0-d26e-f7c0-8399-d65f2aa0c5ef@freebsd.org>
> On 7 March 2017, at 7:25, Lawrence Stewart <lstewart@freebsd.org> wrote:
>
> Hi Andriy,
>
> On 30/10/2016 01:09, Andriy Gapon wrote:
>> Author: avg
>> Date: Sat Oct 29 14:09:32 2016
>> New Revision: 308089
>> URL: https://svnweb.freebsd.org/changeset/base/308089
>>
>> Log:
>>   zfsbootcfg: a simple tool to set next boot (one time) options for zfsboot
>>
>>   (gpt)zfsboot will read one-time boot directives from a special ZFS pool
>>   area.  The area was previously described as "Boot Block Header", but
>>   currently it is know as Pad2, marked as reserved and is zeroed out on
>>   pool creation.  The new code interprets data in this area, if any, using
>>   the same format as boot.config.  The area is immediately wiped out.
>>   Failure to parse the directives results in a reboot right after the
>>   cleanup.  Otherwise the boot sequence proceeds as usual.
>>
>>   zfsbootcfg writes zfsboot arguments specified on its command line to the
>>   Pad2 area of a disk identified by vfs.zfs.boot.primary_pool and
>>   vfs.zfs.boot.primary_vdev kenv variables that are set by loader during
>>   boot.  Please see the manual page for more.
>>
>>   Thanks to all who reviewed, contributed and made suggestions!  There are
>>   many potential improvements to the feature, please see the review for
>>   details.
>>
>>   Reviewed by: wblock (docs)
>>   Discussed with: jhb, tsoome
>>   MFC after: 3 weeks
>>   Relnotes: yes
>>   Differential Revision: https://reviews.freebsd.org/D7612
>>
>> Added:
>>   head/sbin/zfsbootcfg/
>>   head/sbin/zfsbootcfg/Makefile (contents, props changed)
>>   head/sbin/zfsbootcfg/zfsbootcfg.8 (contents, props changed)
>>   head/sbin/zfsbootcfg/zfsbootcfg.c (contents, props changed)
>> Modified:
>>   head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
>>   head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c
>>   head/sbin/Makefile
>>   head/sys/boot/i386/common/drv.c
>>   head/sys/boot/i386/common/drv.h
>>   head/sys/boot/i386/gptzfsboot/Makefile
>>   head/sys/boot/i386/zfsboot/Makefile
>>   head/sys/boot/i386/zfsboot/zfsboot.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h
>>
> [snip]
>> @@ -634,7 +712,39 @@ main(void)
>>      primary_spa = spa;
>>      primary_vdev = spa_get_primary_vdev(spa);
>>
>> -    if (zfs_spa_init(spa) != 0 || zfs_mount(spa, 0, &zfsmount) != 0) {
>> +    nextboot = 0;
>> +    rc = vdev_read_pad2(primary_vdev, cmd, sizeof(cmd));
>> +    if (vdev_clear_pad2(primary_vdev))
>> +        printf("failed to clear pad2 area of primary vdev\n");
>> +    if (rc == 0) {
>> +        if (*cmd) {
>> +            /*
>> +             * We could find an old-style ZFS Boot Block header here.
>> +             * Simply ignore it.
>> +             */
>> +            if (*(uint64_t *)cmd != 0x2f5b007b10c) {
>> +                /*
>> +                 * Note that parse() is destructive to cmd[] and we also want
>> +                 * to honor RBX_QUIET option that could be present in cmd[].
>> +                 */
>> +                nextboot = 1;
>> +                memcpy(cmddup, cmd, sizeof(cmd));
>> +                if (parse()) {
>> +                    printf("failed to parse pad2 area of primary vdev\n");
>> +                    reboot();
>> +                }
>> +                if (!OPT_CHECK(RBX_QUIET))
>> +                    printf("zfs nextboot: %s\n", cmddup);
>> +            }
>> +            /* Do not process this command twice */
>> +            *cmd = 0;
>> +        }
>> +    } else
>> +        printf("failed to read pad2 area of primary vdev\n");
>> +
>
> I've just taken Allan Jude's & co-conspirators' work for a spin that
> allows gptzfsboot to boot from a geli + ZFS partition. Everything is
> working amazingly well, but I see the above "failed to read pad2 area of
> primary vdev" message on every boot.
>
> It doesn't appear to cause any problems per se and the system
> boots/works fine. I assume that message is printed to signal an
> unexpected situation though, so figured I'd get in touch to get your
> thoughts.
>
>
>
> I installed the KVM-based virtual machine system manually from the live
> shell of:
>
> FreeBSD-12.0-CURRENT-amd64-20170301-r314495-disc1.iso
>
>
>
> The partitioning is very simple:
>
> gpart create -s gpt /dev/vtbd0
> gpart add -t freebsd-boot -a 8 -b 40 -s 512k vtbd0
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd0
> gpart add -t freebsd-zfs -b 2088 vtbd0
>
> root# gpart show
> =>      40  83886000  vtbd0  GPT  (40G)
>         40      1024      1  freebsd-boot  (512K)
>       1064      1024         - free -  (512K)
>       2088  83883952      2  freebsd-zfs  (40G)
>
>
>
> geli was inited/attached to vtbd0p2 and the zpool was created with command:
>
> zpool create -o altroot=/tmp/zroot -o cachefile=/tmp/zpool.cache -O
> checksum=skein -O compression=lz4 <pool> vtbd0p2.eli
>
> i.e. the entire pool including bootfs is using skein for checksumming
> and lz4 for compression.
>
>
>
> I hit another boot bug using skein previously which Toomas (CCed) fixed,
> and am wondering if this issue might also be related to the skein
> implementation.
>
> I haven't tested if the zfsbootcfg functionality works for fear that the
> printf is indicating a low level problem with the zpool. I can test
> potentially destructive things and break the pool though if that would
> be helpful.
>
> Any thoughts?
>
> Cheers,
> Lawrence

The problem with having the pool on a geli-encrypted partition is that all
reads done on such a partition have to go through a geli-aware read()
function, and the same is true for writes (which is important for the
nextboot feature).

So what it means for gptzfsboot/zfsboot is that we would need to have the
disk reads/writes go through the geli-aware functions, and we cannot issue
“pure” disk I/O directly.

rgds,
toomas
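
To make the point above concrete, here is a rough sketch in C of why the two I/O paths cannot be mixed. Everything in it is an assumption for illustration only: struct toy_disk, raw_read(), raw_write(), layered_read() and layered_write() are hypothetical names, not functions from zfsboot or geli, and the XOR transform is only a stand-in for real encryption. It shows that data written through the transforming layer reads back correctly only through that same layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTOR 512u
#define TOY_SECTORS 4u

/* Hypothetical in-memory "disk"; stands in for the real provider. */
struct toy_disk {
    uint8_t data[TOY_SECTORS * TOY_SECTOR];
    int     encrypted;  /* nonzero: contents are stored transformed */
    uint8_t key;        /* toy key; XOR is NOT what geli actually does */
};

/* "Pure" disk read: returns the bytes exactly as stored on the media. */
int raw_read(const struct toy_disk *d, size_t off, void *buf, size_t len)
{
    if (off > sizeof(d->data) || len > sizeof(d->data) - off)
        return -1;
    memcpy(buf, d->data + off, len);
    return 0;
}

/* "Pure" disk write: stores the bytes exactly as given. */
int raw_write(struct toy_disk *d, size_t off, const void *buf, size_t len)
{
    if (off > sizeof(d->data) || len > sizeof(d->data) - off)
        return -1;
    memcpy(d->data + off, buf, len);
    return 0;
}

/* Stand-in transform; applying it twice restores the original bytes. */
void toy_xform(const struct toy_disk *d, uint8_t *buf, size_t len)
{
    assert(d != NULL && buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= d->key;
}

/* Layer-aware read: raw read followed by the inverse transform. */
int layered_read(const struct toy_disk *d, size_t off, void *buf, size_t len)
{
    int rc = raw_read(d, off, buf, len);

    if (rc == 0 && d->encrypted)
        toy_xform(d, (uint8_t *)buf, len);
    return rc;
}

/* Layer-aware write: transform a private copy, then raw write it. */
int layered_write(struct toy_disk *d, size_t off, const void *buf, size_t len)
{
    uint8_t tmp[TOY_SECTOR];

    if (len > sizeof(tmp))
        return -1;
    memcpy(tmp, buf, len);
    if (d->encrypted)
        toy_xform(d, tmp, len);
    return raw_write(d, off, tmp, len);
}
```

In this toy model, a string stored with layered_write() comes back intact from layered_read(), while raw_read() on the same offset returns the transformed bytes. A reader that bypasses the layer would therefore not see the directives that were written through it, and a raw write would be misread by anything that goes through the layer.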