Date: Sat, 6 Jun 2020 08:52:01 -0700 From: Conrad Meyer <cem@freebsd.org> To: Xin LI <d@delphij.net> Cc: freebsd-current <freebsd-current@freebsd.org> Subject: Re: HEADSUP: GEOM label may be broken [Was Re: svn commit: r361838 - in head/sys/geom: . label] Message-ID: <CAG6CVpWreHKkL=X5h1DVX8oNiP=SNEneYizLU9NrrpMvMoFLMA@mail.gmail.com> In-Reply-To: <d630a3fd-32a2-dba5-757c-c7445d407855@delphij.net> References: <202006051612.055GCL11009491@repo.freebsd.org> <d630a3fd-32a2-dba5-757c-c7445d407855@delphij.net>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Xin Li, Thank you for the report and diagnosis. Sorry for the breakage. I have reverted the change and hope to address the issues you have identified. There seem to be two issues identified: 1. When ZFS attaches to a partition of a disk, the /dev/diskid label disappears. 2. geli(8) attach of symlink labels doesn't work (1) is interesting, as it does not seem to happen to UFS or swap consumers: # mount /dev/gpt/rootfs on / (ufs, local, noatime, soft-updates) # swapctl -l Device: 1024-blocks Used: /dev/vtbd0p2 1048576 0 # ls -l /dev/diskid/DISK-BHYVE-2613-8EFD-BAF4 /dev/gpt/{rootfs,swapfs} lrwxr-xr-x 1 root wheel 8 May 22 20:18 /dev/diskid/DISK-BHYVE-2613-8EFD-BAF4 -> ../vtbd0 lrwxr-xr-x 1 root wheel 10 May 22 20:18 /dev/gpt/rootfs -> ../vtbd0p3 lrwxr-xr-x 1 root wheel 10 May 22 20:18 /dev/gpt/swapfs -> ../vtbd0p2 I wonder what is different in the ZFS case here. (2) # ls -l /dev/gpt/testingeli lrwxr-xr-x 1 root wheel 8 Jun 6 08:33 /dev/gpt/testingeli -> ../md0s1 # geli init /dev/gpt/testingeli Enter new passphrase: Reenter new passphrase: ... # geli attach /dev/gpt/testingeli Enter passphrase: geli: Provider gpt/testingeli is invalid. geli: There was an error with at least one provider. So at least the latter problem is straightforward to resolve. I have a patch I'm testing locally now, and will upload to phabricator shortly. Best regards, Conrad On Sat, Jun 6, 2020 at 1:17 AM Xin Li <delphij@delphij.net> wrote: > > I just spent quite some time to revive my laptop. TL;DR: if you are > using /dev/diskid or /dev/gptid labels and GELI, please wait until > things settled. > > === > > On 6/5/20 9:12 AM, Conrad Meyer wrote: > > Author: cem > > Date: Fri Jun 5 16:12:21 2020 > > New Revision: 361838 > > URL: https://svnweb.freebsd.org/changeset/base/361838 > > > > Log: > > geom_label: Use provider aliasing to alias upstream geoms > > > > For synthetic aliases (just pseudonyms inferred from metadata like GPT or > > UFS labels, GPT UUIDs, etc), use the GEOM provider aliasing system to create > > a symlink to the real device instead of creating an independent device. > > This makes it more clear which labels and devices correspond, and we can > > safely have multiple labels to a single device accessed at once. > > > > The confusingly named geom_label on-disk construct continues to behave > > identically to how it did before. > > > > This requires teaching GEOM's provider aliasing about the possibility > > that aliases might be added later in time, and GEOM's devfs interaction > > layer not to worry about existing aliases during retaste. > > > > Discussed with: imp > > Relnotes: sure, if we don't end up reverting it > > This would break (and the effect can be quite persistent and hard to > repair, see explanation below) existing configuration as some GEOM > classes are not converted to support the new way of device > representation, so I'd like to request this change be reverted for now > until these are fixed. > > Consider the following configuration, where one have a hard drive > partitioned with GPT, like: > > => 40 1953525088 ada1 GPT (932G) > 40 262144 1 efi (128M) > 262184 8388608 2 freebsd-zfs (4.0G) > 8650792 67108864 3 freebsd-swap (32G) > 75759656 1877765472 4 freebsd-zfs (895G) > > Now, the first ZFS pool is created as root pool. ZFS gets an exclusive > hold of 'ada1p2' despite the pool is carefully created to use > /dev/diskid/<DISKSERIAL>p2 instead of ada1p2. > > ZFS writes the new device path to vdev label. For ZFS, this doesn't > matter much as it always checks the label. > > However, this will prevent GEOM from properly creating > /dev/diskid/<DISKSERIAL>. > > In order to prevent accidentally writing data to wrong disk, for "raw" > disk partitions it's usually a good idea to reference them with labels, > either by /dev/diskid or /dev/gptid. With /dev/ada1p2 exclusively > accessed by ZFS, the /dev/diskid/<DISKSERIAL> representing the disk is > now gone. > > And to make the situation even worse, simply changing the partition > reference to the corresponding /dev/gptid/<uuid> doesn't really work > either, when one is encrypting partitions individually. In the example > above, adap4 is an encrypted partition and with the alias change, one > can no longer "geli attach" them via /dev/gptid/<uuid>. They can still > be attached by the "canonical" path (/dev/ada1p4), but for the swap > partition that would completely defeat the purpose of using label (to > prevent accidentally writing to the wrong disk). > > For my case, reverting to an older kernel is not sufficient to fix the > configuration, because ZFS is recording the "canonical" device path > (/dev/ada1p2) and /dev/diskid label for this disk disappeared somewhat > permanently (I would have to find a USB drive somewhere to fix the root > pool to use the "right" device path). > > > Differential Revision: https://reviews.freebsd.org/D24968 > [...] > > Modified: head/sys/geom/label/g_label.c > > ============================================================================== > > --- head/sys/geom/label/g_label.c Fri Jun 5 16:05:09 2020 (r361837) > > +++ head/sys/geom/label/g_label.c Fri Jun 5 16:12:21 2020 (r361838) > > @@ -344,18 +345,16 @@ g_label_taste(struct g_class *mp, struct g_provider *p > > { > > struct g_label_metadata md; > > struct g_consumer *cp; > > + struct g_class *clsp; > > struct g_geom *gp; > > int i; > > + bool changed; > > > > g_trace(G_T_TOPOLOGY, "%s(%s, %s)", __func__, mp->name, pp->name); > > g_topology_assert(); > > > > G_LABEL_DEBUG(2, "Tasting %s.", pp->name); > > > > - /* Skip providers that are already open for writing. */ > > - if (pp->acw > 0) > > - return (NULL); > > - > > if (strcmp(pp->geom->class->name, mp->name) == 0) > > return (NULL); > > > > @@ -391,9 +390,16 @@ g_label_taste(struct g_class *mp, struct g_provider *p > > if (md.md_provsize != pp->mediasize) > > break; > > > > + /* Skip providers that are already open for writing. */ > > + if (pp->acw > 0) { > > + g_access(cp, -1, 0, 0); > > + goto end; > > + } > > + > > (Is this still necessary when the eventual provider would be the real one?) > > I think symlink aliasing is a the right direction but we need to be > really careful to not break existing and legitimate usage. Since this > also breaks GELI when using with labels, I'd like to request that this > change be backed out for now until the consumers are correctly fixed. > > Cheers, >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAG6CVpWreHKkL=X5h1DVX8oNiP=SNEneYizLU9NrrpMvMoFLMA>