Subject: Re: svn commit: r308089 - in head
From: Toomas Soome
Date: Tue, 07 Mar 2017 09:04:29 +0200
To: Lawrence Stewart
Cc: Andriy Gapon, freebsd-fs@freebsd.org, Toomas Soome
References: <201610291409.u9TE9WXJ020650@repo.freebsd.org>

> On 7 March 2017, at 7:25, Lawrence Stewart wrote:
>
> Hi Andriy,
>
> On 30/10/2016 01:09, Andriy Gapon wrote:
>> Author: avg
>> Date: Sat Oct 29 14:09:32 2016
>> New Revision: 308089
>> URL: https://svnweb.freebsd.org/changeset/base/308089
>>
>> Log:
>>   zfsbootcfg: a simple tool to set next boot (one time) options for zfsboot
>>
>>   (gpt)zfsboot will read one-time boot directives from a special ZFS pool
>>   area.  The area was previously described as "Boot Block Header", but
>>   currently it is known as Pad2, marked as reserved and is zeroed out on
>>   pool creation.  The new code interprets data in this area, if any, using
>>   the same format as boot.config.  The area is immediately wiped out.
>>   Failure to parse the directives results in a reboot right after the
>>   cleanup.  Otherwise the boot sequence proceeds as usual.
>>
>>   zfsbootcfg writes zfsboot arguments specified on its command line to the
>>   Pad2 area of a disk identified by vfs.zfs.boot.primary_pool and
>>   vfs.zfs.boot.primary_vdev kenv variables that are set by loader during
>>   boot.  Please see the manual page for more.
>>
>>   Thanks to all who reviewed, contributed and made suggestions!
>>   There are many potential improvements to the feature, please see the
>>   review for details.
>>
>> Reviewed by: wblock (docs)
>> Discussed with: jhb, tsoome
>> MFC after: 3 weeks
>> Relnotes: yes
>> Differential Revision: https://reviews.freebsd.org/D7612
>>
>> Added:
>>   head/sbin/zfsbootcfg/
>>   head/sbin/zfsbootcfg/Makefile (contents, props changed)
>>   head/sbin/zfsbootcfg/zfsbootcfg.8 (contents, props changed)
>>   head/sbin/zfsbootcfg/zfsbootcfg.c (contents, props changed)
>> Modified:
>>   head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs.h
>>   head/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_pool.c
>>   head/sbin/Makefile
>>   head/sys/boot/i386/common/drv.c
>>   head/sys/boot/i386/common/drv.h
>>   head/sys/boot/i386/gptzfsboot/Makefile
>>   head/sys/boot/i386/zfsboot/Makefile
>>   head/sys/boot/i386/zfsboot/zfsboot.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
>>   head/sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h
>>
> [snip]
>> @@ -634,7 +712,39 @@ main(void)
>>      primary_spa = spa;
>>      primary_vdev = spa_get_primary_vdev(spa);
>>
>> -    if (zfs_spa_init(spa) != 0 || zfs_mount(spa, 0, &zfsmount) != 0) {
>> +    nextboot = 0;
>> +    rc = vdev_read_pad2(primary_vdev, cmd, sizeof(cmd));
>> +    if (vdev_clear_pad2(primary_vdev))
>> +        printf("failed to clear pad2 area of primary vdev\n");
>> +    if (rc == 0) {
>> +        if (*cmd) {
>> +            /*
>> +             * We could find an old-style ZFS Boot Block header here.
>> +             * Simply ignore it.
>> +             */
>> +            if (*(uint64_t *)cmd != 0x2f5b007b10c) {
>> +                /*
>> +                 * Note that parse() is destructive to cmd[] and we also want
>> +                 * to honor RBX_QUIET option that could be present in cmd[].
>> +                 */
>> +                nextboot = 1;
>> +                memcpy(cmddup, cmd, sizeof(cmd));
>> +                if (parse()) {
>> +                    printf("failed to parse pad2 area of primary vdev\n");
>> +                    reboot();
>> +                }
>> +                if (!OPT_CHECK(RBX_QUIET))
>> +                    printf("zfs nextboot: %s\n", cmddup);
>> +            }
>> +            /* Do not process this command twice */
>> +            *cmd = 0;
>> +        }
>> +    } else
>> +        printf("failed to read pad2 area of primary vdev\n");
>> +
>
> I've just taken Allan Jude's & co-conspirators' work for a spin that
> allows gptzfsboot to boot from a geli + ZFS partition. Everything is
> working amazingly well, but I see the above "failed to read pad2 area of
> primary vdev" message on every boot.
>
> It doesn't appear to cause any problems per se and the system
> boots/works fine. I assume that message is printed to signal an
> unexpected situation though, so figured I'd get in touch to get your
> thoughts.
>
>
>
> I installed the KVM-based virtual machine system manually from the live
> shell of:
>
> FreeBSD-12.0-CURRENT-amd64-20170301-r314495-disc1.iso
>
>
>
> The partitioning is very simple:
>
> gpart create -s gpt /dev/vtbd0
> gpart add -t freebsd-boot -a 8 -b 40 -s 512k vtbd0
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd0
> gpart add -t freebsd-zfs -b 2088 vtbd0
>
> root# gpart show
> =>      40  83886000  vtbd0  GPT  (40G)
>         40      1024      1  freebsd-boot  (512K)
>       1064      1024         - free -  (512K)
>       2088  83883952      2  freebsd-zfs  (40G)
>
>
>
> geli was inited/attached to vtbd0p2 and the zpool was created with command:
>
> zpool create -o altroot=/tmp/zroot -o cachefile=/tmp/zpool.cache -O
> checksum=skein -O compression=lz4 vtbd0p2.eli
>
> i.e. the entire pool including bootfs is using skein for checksumming
> and lz4 for compression.
>
>
>
> I hit another boot bug using skein previously which Toomas (CCed) fixed,
> and am wondering if this issue might also be related to the skein
> implementation.
>
> I haven't tested if the zfsbootcfg functionality works for fear that the
> printf is indicating a low level problem with the zpool. I can test
> potentially destructive things and break the pool though if that would
> be helpful.
>
> Any thoughts?
>
> Cheers,
> Lawrence

The problem with having a pool on a geli-encrypted partition is that all
reads done on such a partition have to go through the geli-aware read()
function, and the same is true for writes (which is important for the
nextboot feature). For gptzfsboot/zfsboot this means the disk
reads/writes would need to go through the geli-aware functions; we
cannot issue "pure" disk I/O directly.

rgds,
toomas
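
The point about raw versus geli-aware I/O can be illustrated with a small
toy model. This is only a sketch under loud assumptions: struct toy_disk,
raw_read(), raw_write(), aware_read() and aware_write() are hypothetical
names invented for the example, and the XOR "cipher" is a stand-in for
real encryption. None of this is the gptzfsboot or geli code. It only
shows why a block written through the encrypting layer cannot be
interpreted when it is read back with plain disk I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u
#define DISK_SECTORS 16u

/* Hypothetical in-memory disk; holds ciphertext when 'encrypted' is set. */
struct toy_disk {
	uint8_t data[DISK_SECTORS * SECTOR_SIZE];
	int     encrypted;	/* nonzero: an encrypting layer sits on top */
	uint8_t key;		/* toy key, NOT a real cipher key */
};

/* Check that [sector, sector + nsect) lies inside the toy disk. */
static int
range_ok(size_t sector, size_t nsect)
{
	return (nsect <= DISK_SECTORS && sector <= DISK_SECTORS - nsect);
}

/* "Pure" disk read: returns whatever bytes are stored on the disk. */
int
raw_read(const struct toy_disk *d, size_t sector, void *buf, size_t nsect)
{
	assert(d != NULL && buf != NULL);
	if (!range_ok(sector, nsect))
		return (-1);
	memcpy(buf, d->data + sector * SECTOR_SIZE, nsect * SECTOR_SIZE);
	return (0);
}

/* "Pure" disk write: stores the caller's bytes unchanged. */
int
raw_write(struct toy_disk *d, size_t sector, const void *buf, size_t nsect)
{
	assert(d != NULL && buf != NULL);
	if (!range_ok(sector, nsect))
		return (-1);
	memcpy(d->data + sector * SECTOR_SIZE, buf, nsect * SECTOR_SIZE);
	return (0);
}

/* Toy per-sector transform (XOR is its own inverse); stand-in only. */
static void
toy_crypt(uint8_t *buf, size_t sector, uint8_t key)
{
	uint8_t pad = (uint8_t)(key ^ (uint8_t)sector);

	for (size_t i = 0; i < SECTOR_SIZE; i++)
		buf[i] ^= pad;
}

/* Encryption-aware read: raw read followed by per-sector decryption. */
int
aware_read(const struct toy_disk *d, size_t sector, void *buf, size_t nsect)
{
	if (raw_read(d, sector, buf, nsect) != 0)
		return (-1);
	if (d->encrypted)
		for (size_t i = 0; i < nsect; i++)
			toy_crypt((uint8_t *)buf + i * SECTOR_SIZE,
			    sector + i, d->key);
	return (0);
}

/* Encryption-aware write: encrypt a copy of each sector, then raw write. */
int
aware_write(struct toy_disk *d, size_t sector, const void *buf, size_t nsect)
{
	uint8_t tmp[SECTOR_SIZE];

	assert(d != NULL && buf != NULL);
	if (!range_ok(sector, nsect))
		return (-1);
	for (size_t i = 0; i < nsect; i++) {
		memcpy(tmp, (const uint8_t *)buf + i * SECTOR_SIZE,
		    SECTOR_SIZE);
		if (d->encrypted)
			toy_crypt(tmp, sector + i, d->key);
		if (raw_write(d, sector + i, tmp, 1) != 0)
			return (-1);
	}
	return (0);
}
```

In this model, a directive stored with aware_write() comes back intact
from aware_read(), while raw_read() on the same sector returns bytes that
no parser would recognise. The same reasoning applies in the other
direction: a raw write into an area that is later read through the
decrypting layer would also come back as garbage.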