Date:      Mon, 14 Dec 2015 10:39:12 -0800
From:      John Baldwin <jhb@freebsd.org>
To:        Ian Lepore <ian@freebsd.org>
Cc:        Alexey Dokuchaev <danfe@freebsd.org>, "Andrey V. Elsukov" <ae@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r292058 - head/sbin/geom/class/part
Message-ID:  <1982430.YSjTPepPxF@ralph.baldwin.cx>
In-Reply-To: <1449940829.1358.154.camel@freebsd.org>
References:  <201512101037.tBAAbDMq065138@repo.freebsd.org> <20151212121209.GA60800@FreeBSD.org> <1449940829.1358.154.camel@freebsd.org>

On Saturday, December 12, 2015 10:20:29 AM Ian Lepore wrote:
> On Sat, 2015-12-12 at 12:12 +0000, Alexey Dokuchaev wrote:
> > On Thu, Dec 10, 2015 at 08:42:01PM +0300, Andrey V. Elsukov wrote:
> > > On 10.12.15 20:05, Ian Lepore wrote:
> > > > On Thu, 2015-12-10 at 10:37 +0000, Andrey V. Elsukov wrote:
> > > > > Author: ae
> > > > > Date: Thu Dec 10 10:37:12 2015
> > > > > New Revision: 292058
> > > > > URL: https://svnweb.freebsd.org/changeset/base/292058
> > > > > 
> > > > > Log:
> > > > > >   Remove a note about damaged PMBR. Now GPT will be detected
> > > > > >   automatically with such corruption.
> > > > >   
> > > > >   MFC after:	1 month
> > > > 
> > > > > Will all of these changes add up to it being impossible to make a
> > > > > device NOT be recognized as gpt once it has had gpt on it?
> > > > > It's typical to dd some zeroes to the start of a volume to clean
> > > > > out old info, but if geom is going to aggressively resurrect a
> > > > > purposely-nuked GPT based on the backup info (which is hard to find
> > > > > and dd over by hand) this is going to add up to a lot of
> > > > > frustration for those of us who have to frequently work with
> > > > > regenerating sdcard and CF images.
> > 
> > +1, I'm also used to "dd'ing zeros" trick.
> >  
> > > > If you want to make device to not be recognized as GPT, you should
> > > > use 'gpart destroy -F <device>' this will destroy first two sectors
> > > > where PMBR and primary GPT header are located, also it will destroy
> > > > the last sector with backup GPT.
> > 
> > > While this is technically more accurate, "geom destroy" never worked
> > > for me without googling or reading the manpage because of missing -F
> > > switch, I think.  Filling first few sectors with zeros worked for so
> > > many years and people got used to that.  Would it perhaps make sense
> > > to add debug message when backup GPT is being used?
> > 
> 
> I spent much of the last week fighting with "geom destroy" and trying
> to prevent the resurrection of old geoms during the creation of new
> ones.  It's a Big Mess and it doesn't really work well at all.  I came
> to the conclusion that it's not geom destroy that needs a force flag so
> much as geom create, where it would mean "it is okay to replace any
> existing geom with the new one."
> 
> For example if you have a freebsd slice da0s1 that contains freebsd
> partitions within it, and you geom destroy -F da0 then that slice and
> the partitions within it disappear.  Then as soon as you create a new
> geom for the device and add a slice that happens to fall on the same
> sector that da0s1 used to live at, suddenly that whole geom tree comes
> back to life and your next command to create a new geom da0s1 fails
> because it already exists.
> 
> With a task like formatting and populating an sd card with a script,
> you have to deal with whatever data is on the sd card when it's first
> inserted.  You don't know where existing geoms might be, and it's quite
> burdensome to write script code to figure that out and do a recursive
> destroy -F.  (Actually, a "geom destroy -R -F" to recursively clean
> everything would be quite nice.)
> 
> I eventually worked around the problem by using the no-commit feature
> to do all the work in the sort of virtual space that creates, then
> commit everything after the whole volume is laid out.  That process is
> another Big Mess, because it turns out you have to commit each geom
> individually.
> 
> Just committing the top level doesn't recurse and that creates insanity
> because now you've got uncommitted nested geoms that are somehow locked
> such that anything you try to do results in permission errors with no
> clue why root doesn't have permission to do commands that usually work
> fine.  (I literally had to add printfs to sys/geom code and reboot with
> that kernel to figure out what was wrong.)
> 
> Maybe it shouldn't be possible to commit an outer geom if it contains
> uncommitted nested geoms.  Or maybe commit should have a -R flag to
> recurse automatically as well (but that would have to be implemented on
> the kernel side, unless there's some way to query from userland about
> whether a geom is committed or not).

A big +1 to this.  At ${JOB--} we had a wonky script that tried gpart
destroy -F but then still had to dd zeros at both the start and end of
the disk to cope with various edge cases.  All we really wanted was a
'gpart create -f' that would work reliably.
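
For illustration only, here is a rough sketch of the kind of belt-and-braces
wipe described above.  It is not the actual script: the function name
wipe_plan and the choice of 34 sectors (PMBR, GPT header, and a standard
128-entry table on 512-byte sectors) are assumptions, and the sector size and
sector count would have to come from something like diskinfo(8).  It only
prints the commands rather than running them, and it does not address nested
geoms that get re-tasted later.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are assumptions, not an existing tool).
# Prints, but does not execute, the commands that clear partitioning metadata
# at both ends of a disk.
#
# usage: wipe_plan DEVICE SECTOR_SIZE TOTAL_SECTORS
wipe_plan() {
    dev=$1
    secsize=$2
    total=$3
    # Assumption: 34 sectors covers the PMBR, the GPT header and a standard
    # 128-entry table at the start; the backup copy at the end needs 33.
    n=34
    echo "gpart destroy -F ${dev}"
    # Zero the first n sectors (PMBR and primary GPT).
    echo "dd if=/dev/zero of=/dev/${dev} bs=${secsize} count=${n}"
    # Zero the last n sectors (backup GPT table and header).
    echo "dd if=/dev/zero of=/dev/${dev} bs=${secsize} seek=$((total - n)) count=${n}"
}

# Example: a device with 1000000 sectors of 512 bytes.
wipe_plan da0 512 1000000
```

For the example call, the last line printed seeks to sector 999966, so the
final 34 sectors of the device would be zeroed.  Printing the plan first makes
it easy to eyeball the offsets before pointing dd at a real disk.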

The reason this commit triggered this message is paranoia that the more
aggressive recovery here would now resurrect a GPT "mid-flight" when
partitioning a disk, causing the process to fail.  Part of the problem is
that partitioning a disk is not atomic.  You almost want a way to detach
all consumers from a provider and then disable tasting (except for adding
explicitly created objects) until you are done creating the new layout.

Something like:

   gpart start_relayout foo  # implies destroy -F
   gpart create -s bar foo
   gpart add blah blah
   ...
   gpart end_relayout foo

(Yes, those command names are ugly, but you hopefully get the idea.)

-- 
John Baldwin


