Date:      Fri, 15 Jul 2016 12:33:14 -0400
From:      Paul Mather <paul@gromit.dlib.vt.edu>
To:        Ian Lepore <ian@freebsd.org>
Cc:        Karl Denninger <karl@denninger.net>, freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: Bizarre clone attempt failures on Raspberry Pi2...
Message-ID:  <D9A9D485-4564-4F58-8B6D-564D4C86229E@gromit.dlib.vt.edu>
In-Reply-To: <1468597885.72182.286.camel@freebsd.org>
References:  <548783e1-9047-68f7-5f50-449db684d602@denninger.net> <d2eb4035-e494-1a7b-98e5-2aa87efe0763@denninger.net> <EDE65B12-4961-4CEF-8AE9-BFDA4FD508A5@gromit.dlib.vt.edu> <5475ea53-ae22-2634-6f2a-5737d1b0e308@denninger.net> <398ae56c-8893-f188-c210-cf7f19ccf433@denninger.net> <1468518953.72182.219.camel@freebsd.org> <7a91fc79-1c85-fac8-aa3f-db90592f3f44@denninger.net> <bec46aff-a4d5-9c4d-49d0-78534b13f719@denninger.net> <E01579F5-9562-4E51-9CFB-EA510460A4C8@gromit.dlib.vt.edu> <60b6e156-981e-9fbd-b68c-0daae1961286@denninger.net> <04391154-A38E-46CD-B570-B2BECFD19022@gromit.dlib.vt.edu> <d1aba096-e645-04df-dfda-5a9284250960@denninger.net> <1468597885.72182.286.camel@freebsd.org>

On Jul 15, 2016, at 11:51 AM, Ian Lepore <ian@freebsd.org> wrote:

> On Fri, 2016-07-15 at 09:44 -0500, Karl Denninger wrote:
>> On 7/15/2016 09:22, Paul Mather wrote:
>>
>>> On Jul 15, 2016, at 9:44 AM, Karl Denninger <karl@denninger.net>
>>> wrote:
>>>
>>>> On 7/15/2016 08:36, Paul Mather wrote:
>>>>> On Jul 14, 2016, at 11:36 PM, Karl Denninger <
>>>>> karl@denninger.net> wrote:
>>>>>
>>>>>> Found it.
>>>>>>
>>>>>> Apparently the current code *requires* the label be set on the
>>>>>> msdos partition.  If it's not, then not only does it not mount
>>>>>> (which shouldn't matter post-boot, as the loader is supposed to
>>>>>> pass the dtb file, it is specified in the config file without
>>>>>> any sort of path prefix, and thus once the kernel has loaded it
>>>>>> should not matter if the dos partition is actually mounted or
>>>>>> not) *but* the boot process hangs without any indication of why!
>>>>>>
>>>>>> So, you must do newfs_msdos -L MSDOSBOOT -F 16 {device}
>>>>>>
>>>>>> If the "-L" is missing you're hosed; the system facially appears
>>>>>> to be just fine but while the loader comes up and so does the
>>>>>> kernel, it hangs without ever proceeding -- and without any sort
>>>>>> of error message indicating that it is unable to mount something
>>>>>> it needs.
>>>>> You have to do that because the device entry in the stock
>>>>> /etc/fstab is /dev/msdosfs/MSDOSBOOT.  The /dev/msdosfs part
>>>>> indicates it's using ms-dos labels.  In other words, this is
>>>>> just the same sort of failure you were getting when you weren't
>>>>> labelling the UFS partition as "rootfs".  Labelling the file
>>>>> system properly "fixes" the issue, as you would expect.
>>>>>
>>>>> It's a misnomer to say the code "requires" labels.  It's just
>>>>> that's the way the distribution images are currently set up.  I
>>>>> have an older Pi that predates the current distribution images
>>>>> that just uses /dev/mmcsd0... device names in /etc/fstab.  Both
>>>>> approaches work fine.  You just need to make sure the devices
>>>>> you specify in /etc/fstab will actually exist when it comes
>>>>> time to mount the corresponding file system.
>>>> Except that if the root filesystem doesn't mount you get an error,
>>>> and thus you can figure out what's going on.  What excuse is there
>>>> for not printing an error message if a mount fails, and if
>>>> something in /etc/fstab fails to mount what's with hanging the
>>>> machine?  I've had disks be unavailable before on Intel
>>>> architecture machines (it happens when disks fail) and the result
>>>> is an error on the failure to mount but, unless it's the root
>>>> volume, the system still comes up.
>>>
>>> Are you sure you don't get an error?  When I forgot to label rootfs
>>> recently when I cloned an SD card I got an error displayed on the
>>> serial console.  I didn't get an error on the HDMI screen console.
>> You get an error if rootfs is not labelled on the HDMI screen (as
>> root fails to mount.) There is *no* error on an HDMI screen if the
>> msdosfs is not labeled.
>>> As I've mentioned before directly, FreeBSD/arm acts like
>>> console="comconsole,vidconsole" is in effect.  This means that
>>> during /etc/rc boot processing, you'll only get output on
>>> comconsole (except for kernel messages, which seem to go to both).
>>> That's been my experience in FreeBSD in general.
>>>
>>> I dimly recall folks on here saying U-Boot doesn't currently
>>> enable/support USB keyboards, so there's not really much you can do
>>> to fix it interactively if you fail to boot the OS and hence enable
>>> USB keyboard support via FreeBSD.  That's not a problem if you use
>>> a serial console, which is supported by U-Boot.
>> Well, that's not true if the kernel is loaded.  Once the kernel loads
>> a usb keyboard works.
>>>
>>> I'm not sure comparisons with Intel architecture machines are
>>> entirely appropriate as they use a different boot
>>> environment/mechanism.  Still, I stand by the fact that I've always
>>> got an error message on the serial console when disks on my
>>> FreeBSD/arm system have failed to mount at boot.  (It used to
>>> happen regularly with an external USB drive I had that took a long
>>> time to probe, and I ended up having to put a kern.cam.boot_delay
>>> in /boot/loader.conf to avoid the system dropping into single-user
>>> mode when doing a reboot.)
>>>
>>>>> If you stop using labels in your /etc/fstab then you won't have
>>>>> problems when those labels are missing.  If the labels are
>>>>> missing, the /dev/{msdosfs,ufs} devices will not be present and
>>>>> the system will drop to single-user mode because non-late,
>>>>> non-noauto file systems can't be accessed via their device nodes
>>>>> when attempting to mount them.  When that happens and you don't
>>>>> have a serial console enabled then you have problems
>>>>> remediating the situation.
>>>>>
>>>>> If a file system is not needed to mount as part of booting (as
>>>>> you suggest for /boot/msdos) then you should probably flag it
>>>>> with the "noauto" option in /etc/fstab or remove it from
>>>>> /etc/fstab entirely.
>>>>>
>>>>> I think the problem you were having is not copying all the
>>>>> required attributes of the file systems in question when
>>>>> cloning your SD cards, given your /etc/fstab setup.  It sounds
>>>>> like you've fixed that, now.
>>>> Again, if it dropped to single user mode *and said it was doing so*
>>>> or if there was an error message on the console when the filesystem
>>>> failed to mount I would have found this in a reasonable period of
>>>> time.  It wasn't that rough to do so with the ufs label once I knew
>>>> the filesystem was failing to mount, which was discernible from the
>>>> console output.
>>>>
>>>> Not printing an error when things error out is rude at best, and
>>>> when that error is going to prevent the system from coming up this
>>>> darn well ought to show up where one with a monitor plugged in can
>>>> see it, eh?
>>>>
>>>> There was literally no indication at all as to what was going on
>>>> and since gpart does not show filesystem labels for *either* BSD
>>>> labeled slices OR msdos figuring out what was different between the
>>>> two proved to be a bit troublesome.  IMHO at least the failure to
>>>> display an error message in this circumstance ought to be corrected.
>>>
>>> See above re: serial console vs. video console.
>>>
>>> As for the labels, these are file system labels and not partition
>>> labels.  The big clue is in the device name in /etc/fstab.  (The
>>> "-l" option to "gpart show" will only show labels "[f]or
>>> partitioning schemes that support partition labels".  That's
>>> reasonable, IMHO, as partitions are not the same as file systems
>>> and gpart is concerned with partitions.)  In my experience,
>>> complaints about not being able to access /dev/ufs/something means
>>> you forgot to label a UFS file system as "something" when you made
>>> it. :-)
>>>
>>> Cheers,
>>>
>>> Paul.
>>
>> Understood, but the issue here is that there's no indication without
>> a serial console that you have anything wrong -- the system appears
>> to have simply hung.
>>
>> The quick fix is to put "failok" (or noauto) in the default
>> /etc/fstab entry for the dos filesystem, since it is not necessary
>> for that filesystem to be mounted at all on a running machine.  If
>> there is a policy reason to leave it accessible (and there's a
>> fairly-clean argument that there is) then "failok" might be
>> preferable to "noauto", but either way forcing a filesystem that is
>> not necessary to be accessible or the system fails to come up and
>> does not give any indication of same on what many users will have
>> accessible to them is facially wrong.
>>
>> These devices are thought of as "appliances" by many and as such the
>> model of USB keyboard + HDMI (e.g. TV or monitor) is entirely
>> reasonable, and IMHO FreeBSD ought to, when possible, make that a
>> viable option.  It both is and can be provided the kernel loads, but
>> the defaults in pre-built configurations right now preclude that.
>>
>
> I'm having a hard time understanding how a problem report got generated
> about all this, or how any of it is anything other than "Karl
> misconfigured his system."
>
> The downloadable system images work correctly.  You made a local change
> (formatted new media) and depending on how you want to look at it,
> either you didn't format correctly or you didn't make your config files
> match the way you formatted, and that made your system stop working.
> It doesn't mean there is anything wrong about the way the downloadable
> images are generated.
>
> Changing fstab in the distributed images so that a failure to mount a
> filesystem becomes a non-error seems like a bad idea to me.  The only
> way that problem happens with a downloaded image is if the image wasn't
> burned successfully, and that doesn't seem like something that needs to
> just get papered over just because in your use-case you don't really
> need the filesystem that failed to mount.
>
> A PR about the fact that it hung without visibly reporting an error may
> be appropriate.  A PR that says we should just paper over the error
> because you don't care about it doesn't seem appropriate.


Maybe it should be filed as a "feature request" rather than a "bug."
Does Bugzilla support the distinction?

I agree with Ian that this is not a bug, in the sense that anyone
installing from the distributed images will never trigger it on their
install media.
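
For anyone who does clone onto freshly formatted media, a check along
these lines before the first boot might catch the mismatch.  This is
only a sketch: check_fstab is a name I made up (it is not part of
FreeBSD), and it assumes the new card is attached to a machine where
its label devices would show up under /dev and where no other disk
already carries the same labels.  It skips entries flagged noauto,
failok or late, on the assumption that those won't stop the boot.

```sh
#!/bin/sh
# Hypothetical helper: list fstab entries that would be mounted at
# boot but whose device node does not currently exist.
check_fstab() {
    fstab="${1:-/etc/fstab}"
    while read -r dev mnt fstype opts rest; do
        case "$dev" in
            ''|\#*) continue ;;   # skip blank lines and comments
            /dev/*) ;;            # only device-node entries are checked
            *) continue ;;        # e.g. tmpfs, procfs
        esac
        case ",$opts," in
            *,noauto,*|*,failok,*|*,late,*) continue ;;
        esac
        [ -e "$dev" ] || echo "missing: $dev (mounted on $mnt)"
    done < "$fstab"
}
```

With the new card's root file system mounted on /mnt (again, just an
assumed location), "check_fstab /mnt/etc/fstab" would print one line
per entry whose device node is absent, and nothing if all is well.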

It is reasonable to file a feature request to omit /boot/msdos as a
mandatory mount.  I think when I first was using FreeBSD/arm on my
Raspberry Pi it wasn't mounted, but then that predates the current
distribution images.  Now it is.  I can see arguments either way and the
current setting makes sense to me.  I think Warner and Ian hit the nail
on the head that the real issue is the lack of output on the video
console during /etc/rc processing.
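
For reference, the change being asked for would amount to something
like the following in /etc/fstab.  I'm writing the stock entry from
memory, so treat the mount options and the dump/pass fields as
assumptions; only the device name and the mount point come from this
thread.  "failok" and "noauto" are the fstab(5) options Karl mentioned.

```
# Entry as it stands (sketch): the mount is mandatory at boot
/dev/msdosfs/MSDOSBOOT  /boot/msdos  msdosfs  rw,noatime          0  0

# Alternative 1 (pick one): still try to mount, but ignore a failure
#/dev/msdosfs/MSDOSBOOT  /boot/msdos  msdosfs  rw,noatime,failok  0  0

# Alternative 2 (pick one): don't mount at boot; mount by hand if needed
#/dev/msdosfs/MSDOSBOOT  /boot/msdos  msdosfs  rw,noatime,noauto  0  0
```

With "noauto" the file system stays listed, so "mount /boot/msdos"
still works by hand when you need to get at the boot files.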

Incidentally, does setting console="vidconsole" in /boot/loader.conf
fix the problem of a lack of /etc/rc messages for those who are using an
HDMI monitor as their primary/only console?  If so, there may also be a
case for making that the default if the assumption is that a minority of
people will be using a serial console.  (Not a fair assumption right
now, IMHO, but perhaps a fair one going forward as FreeBSD/arm becomes
Tier 1.)
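
To be concrete, the setting I have in mind is just the one line below.
I haven't tried it on a Pi, so whether the loader and kernel there
honour it is exactly what I'm asking; take it as a sketch and nothing
more.

```
# /boot/loader.conf
# Untested on FreeBSD/arm: request the video console only, in the hope
# that /etc/rc output then appears on the HDMI screen.
console="vidconsole"
```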

Cheers,

Paul.



