From: John Baldwin
To: Ian Lepore
Cc: Alexey Dokuchaev, "Andrey V. Elsukov", src-committers@freebsd.org,
    svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r292058 - head/sbin/geom/class/part
Date: Mon, 14 Dec 2015 10:39:12 -0800
Message-ID: <1982430.YSjTPepPxF@ralph.baldwin.cx>
In-Reply-To: <1449940829.1358.154.camel@freebsd.org>

On Saturday, December 12, 2015 10:20:29 AM Ian Lepore wrote:
> On Sat, 2015-12-12 at 12:12 +0000, Alexey Dokuchaev wrote:
> > On Thu, Dec 10, 2015 at 08:42:01PM +0300, Andrey V. Elsukov wrote:
> > > On 10.12.15 20:05, Ian Lepore wrote:
> > > > On Thu, 2015-12-10 at 10:37 +0000, Andrey V. Elsukov wrote:
> > > > > Author: ae
> > > > > Date: Thu Dec 10 10:37:12 2015
> > > > > New Revision: 292058
> > > > > URL: https://svnweb.freebsd.org/changeset/base/292058
> > > > >
> > > > > Log:
> > > > >   Remove a note about damaged PMBR. Now GPT will be detected
> > > > >   automatically with such corruption.
> > > > >
> > > > >   MFC after: 1 month
> > > >
> > > > Will all of these changes add up to it being impossible to make a
> > > > device NOT be recognized as gpt once it has had gpt on it?
> > > > It's typical to dd some zeroes to the start of a volume to clean
> > > > out old info, but if geom is going to aggressively resurrect a
> > > > purposely-nuked GPT based on the backup info (which is hard to
> > > > find and dd over by hand) this is going to add up to a lot of
> > > > frustration for those of us who have to frequently work with
> > > > regenerating sdcard and CF images.
> >
> > +1, I'm also used to "dd'ing zeros" trick.
> >
> > > If you want to make device to not be recognized as GPT, you should
> > > use 'gpart destroy -F ' this will destroy first two sectors where
> > > PMBR and primary GPT header are located, also it will destroy the
> > > last sector with backup GPT.
> >
> > While this is technically more accurate, "geom destroy" never worked
> > for me without googling or reading the manpage because of missing -F
> > switch, I think. Filling first few sectors with zeros worked for so
> > many years and people got used to that. Would it perhaps make sense
> > to add debug message when backup GPT is being used?
> >
>
> I spent much of the last week fighting with "geom destroy" and trying
> to prevent the resurrection of old geoms during the creation of new
> ones. It's a Big Mess and it doesn't really work well at all. I came
> to the conclusion that it's not geom destroy that needs a force flag so
> much as geom create, where it would mean "it is okay to replace any
> existing geom with the new one."
>
> For example if you have a freebsd slice da0s1 that contains freebsd
> partitions within it, and you geom destroy -F da0 then that slice and
> the partitions within it disappear. Then as soon as you create a new
> geom for the device and add a slice that happens to fall on the same
> sector that da0s1 used to live at, suddenly that whole geom tree comes
> back to life and your next command to create a new geom da0s1 fails
> because it already exists.
>
> With a task like formatting and populating an sd card with a script,
> you have to deal with whatever data is on the sd card when it's first
> inserted. You don't know where existing geoms might be, and it's quite
> burdensome to write script code to figure that out and do a recursive
> destroy -F. (Actually, a "geom destroy -R -F" to recursively clean
> everything would be quite nice.)
>
> I eventually worked around the problem by using the no-commit feature
> to do all the work in the sort of virtual space that creates, then
> commit everything after the whole volume is laid out. That process is
> another Big Mess, because it turns out you have to commit each geom
> individually.
>
> Just committing the top level doesn't recurse and that creates insanity
> because now you've got uncommitted nested geoms that are somehow locked
> such that anything you try to do results in permission errors with no
> clue why root doesn't have permission to do commands that usually work
> fine. (I literally had to add printfs to sys/geom code and reboot with
> that kernel to figure out what was wrong.)
>
> Maybe it shouldn't be possible to commit an outer geom if it contains
> uncommitted nested geoms. Or maybe commit should have a -R flag to
> recurse automatically as well (but that would have to be implemented on
> the kernel side, unless there's some way to query from userland about
> whether a geom is committed or not).

A big +1 to this. At ${JOB--} we had a wonky script that tried
gpart destroy -F but then still had to dd zeros at both the start and
end of the disk to cope with various edge cases. All we really wanted
was a 'gpart create -f' that would work reliably. The reason this commit
triggered this message is paranoia that the more aggressive recovery
here would now resurrect a GPT "mid-flight" when partitioning a disk,
causing the process to fail.

Part of the problem is that partitioning a disk is not atomic. You
almost want a way to detach all consumers from a provider and then
disable tasting (except for adding explicitly created objects) until you
are done creating the new layout. Something like:

gpart start_relayout foo  # implies destroy -F
gpart create -s bar foo
gpart add blah blah
...
gpart end_relayout foo

(Yes, those command names are ugly, but you hopefully get the idea.)

-- 
John Baldwin
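
For anyone who has not had to write the "dd zeros at both the start and end of the disk" workaround, here is a rough sketch of the arithmetic behind it. It is not the script mentioned above. It assumes the standard GPT layout: PMBR in sector 0, primary header in sector 1, a default-sized table of 128 entries of 128 bytes each after that, and the backup table and backup header in the last sectors of the disk. The helper name gpt_wipe_plan, the device names, and the sizes are made up for illustration, and the function only prints the dd commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch: print the dd commands that would zero both GPT copies.
# gpt_wipe_plan is a hypothetical helper, not part of gpart(8).
# usage: gpt_wipe_plan <device> <mediasize_bytes> <sectorsize_bytes>
gpt_wipe_plan() {
    dev=$1
    mediasize=$2
    secsize=$3

    # Total number of sectors on the device.
    total=$((mediasize / secsize))

    # Assumption: default-sized partition table, 128 entries of 128 bytes
    # (16384 bytes), rounded up to a whole number of sectors.
    entries=$(( (16384 + secsize - 1) / secsize ))

    # Start of disk: PMBR + primary header + primary entry array.
    head_secs=$((2 + entries))
    # End of disk: backup entry array + backup header (last sector).
    tail_secs=$((entries + 1))

    echo "dd if=/dev/zero of=/dev/${dev} bs=${secsize} count=${head_secs}"
    echo "dd if=/dev/zero of=/dev/${dev} bs=${secsize} seek=$((total - tail_secs)) count=${tail_secs}"
}

# Example with made-up numbers: a 1 GiB card with 512-byte sectors.
gpt_wipe_plan da0 1073741824 512
```

The media size and sector size would come from the device itself (diskinfo(8) reports both). A table created with more than 128 entries would need larger ranges, and none of this touches nested geoms, so it does not address the da0s1 resurrection case described in the quoted text.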