From owner-freebsd-arm@freebsd.org  Fri Jul 15 13:36:34 2016
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
Subject: Re: Bizarre clone attempt failures on Raspberry Pi2...
From: Paul Mather <paul@gromit.dlib.vt.edu>
In-Reply-To: <bec46aff-a4d5-9c4d-49d0-78534b13f719@denninger.net>
Date: Fri, 15 Jul 2016 09:36:26 -0400
Cc: freebsd-arm@freebsd.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <E01579F5-9562-4E51-9CFB-EA510460A4C8@gromit.dlib.vt.edu>
References: <548783e1-9047-68f7-5f50-449db684d602@denninger.net>
 <d2eb4035-e494-1a7b-98e5-2aa87efe0763@denninger.net>
 <EDE65B12-4961-4CEF-8AE9-BFDA4FD508A5@gromit.dlib.vt.edu>
 <5475ea53-ae22-2634-6f2a-5737d1b0e308@denninger.net>
 <398ae56c-8893-f188-c210-cf7f19ccf433@denninger.net>
 <1468518953.72182.219.camel@freebsd.org>
 <7a91fc79-1c85-fac8-aa3f-db90592f3f44@denninger.net>
 <bec46aff-a4d5-9c4d-49d0-78534b13f719@denninger.net>
To: Karl Denninger <karl@denninger.net>
X-Mailer: Apple Mail (2.3124)

On Jul 14, 2016, at 11:36 PM, Karl Denninger <karl@denninger.net> wrote:

> On 7/14/2016 13:27, Karl Denninger wrote:
>> On 7/14/2016 12:55, Ian Lepore wrote:
>> No there wasn't.  It was a blank (brand new) card the first time around;
>> it had a MSDOS filesystem on it (as do all new cards) but *no*
>> BSD-specific geom anything on it.
>>> To reliably create a new layout regardless of what may be present
>>> already on the media, you have two choices:
>>>
>>> 1 - dd zeroes to the entire device
>>> 2 - use the "no commit" feature of gpart
>> Actually in the case at hand #1 isn't impractical since I really only
>> care about the first 100MB or so being zeroed.  The reason is that my
>> boot block (the MSDOS fs) is ~50MB and the label is obviously next, so
>> if we zero the first 100MB we're fine.
>>
>> And in fact that does work.
>>> When you pass no '-f <flags>' to a gpart command, it automatically adds
>>> the "-f C" (commit) flag behind your back.  There is no "don't commit"
>>> flag, so (this is surrealistically crazy...) what you're supposed to do
>>> is pass an invalid flag, which it won't complain about, in order to
>>> prevent it from automatically adding that 'C' flag you didn't even
>>> realize existed.  (This is where *I* curse whoever coded this mess.)
>>>
>>> When you don't commit, the changes take place in a sort of 'virtual
>>> workspace' and nothing on the physical disk changes until you do a
>>> "gpart commit" (or "gpart undo" to discard the changes).  Making all
>>> this much less cool than it's sounding right now, there is no automatic
>>> recursion for commit and undo... if you create a bunch of nested stuff
>>> (a slice, a geom within that slice, partitions within that geom), then
>>> you have to commit all the pending new geoms *in reverse order of how
>>> they were created*.
>>>
>>> So, using da0 (since it's shorter to type), the sequence goes like:
>>>
>>> gpart destroy -f x -F da0
>>> gpart create -f x  -s MBR da0
>>> gpart add -f x     -t \!12 -s 64M -a 4M da0
>>> gpart add -f x     -t freebsd -a 4M da0
>>> gpart destroy -f x -F da0s2
>>> gpart create -f x  -s BSD da0s2
>>> gpart add -f x     -t freebsd-ufs da0s2
>>> gpart commit da0s2
>>> gpart commit da0
>>> newfs_msdos /dev/da0s1
>>> newfs -U /dev/da0s2a
>>>
>>> And that reliably creates a fresh rpi-style layout regardless of what
>>> was on the media before you started.
>> Ok, I will try this, BUT I suspect it's still screwed (blind) because
>> when I zeroed the front of the disk I got a "correct" partition layout
>> but after populating it what I get still hangs after it mounts root in
>> the same place.  The way to prevent the alignment issue from coming up
>> is to specify a "-b" switch on the "add", giving you a block offset.
>> "-b 64" is sufficient; now if the system tries to "taste" da0s2 it will
>> fail (as it does for the card that is running) but "tasting" da0s2a
>> succeeds.
>>> Now, to address the question of the filesystem existing at da0s2 versus
>>> da0s2a, the difference is alignment.  Making things even more
>>> confusing, alignment (if you don't specify it) sometimes changes based
>>> on the type and brand of usb sdcard reader you're using and the fake
>>> geometry values it reports to the system.  (A USB reader almost always
>>> reports different fake geometry than a native sd slot would on a
>>> machine with non-USB based sd support.)
>> Yes, I understand that; if the alignment matches thus the "a" partition
>> starts at offset zero then you can actually reference that (although
>> length might be wrong) with the base device.  After all, what it really
>> does is look at the blocks to see if the magic number is good, and if so
>> it tries to read and process it.
>>
>> But this doesn't explain why, after getting a layout that's correct (by
>> writing zeros to the front of the card first, so anything that "might"
>> be there isn't) and copying all the file structure over (which facially
>> not only appears to be correct but the loader finds and loads the
>> kernel, AND the root filesystem mounts!) the system hangs, apparently
>> just before init gets started.
>>
>> If init can't be found you should get a complaint (been there, done
>> that) on the console but there is no complaint of any sort.
>>
>> I've gotten through the bad structure issue on the SD card, and am now
>> left with "why does it hang on boot -- with no error or other indication
>> of what the problem is" after the kernel loads *and* the root filesystem
>> mounts?
>>
> Found it.
>
> Apparently the current code *requires* the label be set on the msdos
> partition.  If it's not then not only does it not mount (which shouldn't
> matter post-boot as the loader is supposed to pass the dtb file, it is
> specified in the config file without any sort of path prefix, and thus
> once the kernel has loaded it should not matter if the dos partition is
> actually mounted or not) *but* the boot process hangs without any
> indication of why!
>
> So, you must do newfs_msdos -L MSDOSBOOT -F 16 {device}
>
> If the "-L" is missing you're hosed; the system facially appears to be
> just fine but while the loader comes up and so does the kernel, it hangs
> without ever proceeding -- and without any sort of error message
> indicating that it is unable to mount something it needs.


You have to do that because the device entry in the stock /etc/fstab is
/dev/msdosfs/MSDOSBOOT.  The /dev/msdosfs part indicates it's using
MS-DOS labels.  In other words, this is just the same sort of failure
you were getting when you weren't labelling the UFS partition as
"rootfs".  Labelling the file system properly "fixes" the issue, as you
would expect.
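
As a rough sketch of what "labelling properly" amounts to, the helper
below only prints the two newfs commands that would apply the labels
mentioned in this thread.  The function name label_cmds is made up for
illustration, the s1/s2a slice names are assumed from the da0 layout
quoted above, and "-F 16" is simply copied from your command.

```shell
#!/bin/sh
# label_cmds DISK: print (do not run) the commands that create the two
# file systems with the labels discussed in this thread.
# Assumptions: DISK uses the s1 (FAT) / s2a (UFS) layout quoted above;
# "label_cmds" is a hypothetical helper, not part of FreeBSD.
label_cmds() {
    disk=$1
    # FAT boot partition, visible afterwards as /dev/msdosfs/MSDOSBOOT
    printf 'newfs_msdos -L MSDOSBOOT -F 16 /dev/%ss1\n' "$disk"
    # UFS root with soft updates, visible afterwards as /dev/ufs/rootfs
    printf 'newfs -U -L rootfs /dev/%ss2a\n' "$disk"
}
```

Running "label_cmds da0" prints the commands for review.  Piping that
output to sh on the FreeBSD machine would actually run them, which
destroys whatever is on those partitions.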

It's misleading to say the code "requires" labels.  That's just the way
the distribution images are currently set up.  I have an older Pi that
predates the current distribution images and just uses /dev/mmcsd0...
device names in /etc/fstab.  Both approaches work fine.  You just need
to make sure the devices you specify in /etc/fstab will actually exist
when it comes time to mount the corresponding file system.

If you stop using labels in your /etc/fstab then you won't have problems
when those labels are missing.  If the labels are missing, the
/dev/{msdosfs,ufs} devices will not be present and the system will drop
to single-user mode, because file systems that are neither "late" nor
"noauto" can't be accessed via their device nodes when the system tries
to mount them.  When that happens and you don't have a serial console
enabled, it is hard to remedy the situation.
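
To catch this before rebooting a freshly cloned card, something along
these lines could be used.  It is only a sketch: check_fstab_devices is
a hypothetical helper, and it assumes the usual whitespace-separated
fstab fields and that only the "noauto" and "late" options exempt an
entry from being mounted during boot.

```shell
#!/bin/sh
# check_fstab_devices [FSTAB]: print the device field of every entry
# whose /dev node does not exist right now, skipping comments, non-/dev
# entries (tmpfs, NFS, ...) and entries marked "noauto" or "late".
# Hypothetical helper; FSTAB defaults to /etc/fstab.
check_fstab_devices() {
    fstab=${1:-/etc/fstab}
    while read -r dev mnt fstype opts _rest || [ -n "$dev" ]; do
        case $dev in
            ''|\#*) continue ;;   # blank line or comment
            /dev/*) ;;            # a device node we can test for
            *) continue ;;        # not backed by a /dev node
        esac
        case ",$opts," in
            *,noauto,*|*,late,*) continue ;;  # not mounted in early boot
        esac
        [ -e "$dev" ] || printf '%s\n' "$dev"
    done < "$fstab"
}
```

An empty result means every entry that will be mounted at boot has its
device node.  A line such as /dev/msdosfs/MSDOSBOOT in the output means
the label is missing from the file system.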

If a file system does not need to be mounted as part of booting (as you
suggest for /boot/msdos) then you should probably flag it with the
"noauto" option in /etc/fstab or remove it from /etc/fstab entirely.

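The "noauto" change could be sketched like this.  The add_noauto name is
hypothetical, the fstab layout is assumed to be the usual
whitespace-separated one, and the result goes to standard output rather
than being written back to the file.

```shell
#!/bin/sh
# add_noauto MOUNTPOINT [FSTAB]: print FSTAB with ",noauto" appended to
# the options of the entry mounted at MOUNTPOINT (unless already set).
# Hypothetical helper.  awk rebuilds the edited line with single spaces
# between fields; all other lines are printed unchanged.
add_noauto() {
    awk -v mnt="$1" '
        $1 !~ /^#/ && $2 == mnt && $4 !~ /(^|,)noauto(,|$)/ {
            $4 = $4 ",noauto"
        }
        { print }
    ' "${2:-/etc/fstab}"
}
```

For example, "add_noauto /boot/msdos" prints the file with that entry
changed, so the result can be inspected before it replaces /etc/fstab.
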
I think the problem you were having is that you weren't copying all the
required attributes of the file systems in question when cloning your SD
cards, given your /etc/fstab setup.  It sounds like you've fixed that
now.

Cheers,

Paul.

>
> I can clone cards now.
>
> -- 
> Karl Denninger
> karl@denninger.net <mailto:karl@denninger.net>
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/