Date: Fri, 15 Jul 2016 12:33:14 -0400
From: Paul Mather <paul@gromit.dlib.vt.edu>
To: Ian Lepore <ian@freebsd.org>
Cc: Karl Denninger <karl@denninger.net>, freebsd-arm <freebsd-arm@freebsd.org>
Subject: Re: Bizarre clone attempt failures on Raspberry Pi2...
Message-ID: <D9A9D485-4564-4F58-8B6D-564D4C86229E@gromit.dlib.vt.edu>
In-Reply-To: <1468597885.72182.286.camel@freebsd.org>
References: <548783e1-9047-68f7-5f50-449db684d602@denninger.net>
 <d2eb4035-e494-1a7b-98e5-2aa87efe0763@denninger.net>
 <EDE65B12-4961-4CEF-8AE9-BFDA4FD508A5@gromit.dlib.vt.edu>
 <5475ea53-ae22-2634-6f2a-5737d1b0e308@denninger.net>
 <398ae56c-8893-f188-c210-cf7f19ccf433@denninger.net>
 <1468518953.72182.219.camel@freebsd.org>
 <7a91fc79-1c85-fac8-aa3f-db90592f3f44@denninger.net>
 <bec46aff-a4d5-9c4d-49d0-78534b13f719@denninger.net>
 <E01579F5-9562-4E51-9CFB-EA510460A4C8@gromit.dlib.vt.edu>
 <60b6e156-981e-9fbd-b68c-0daae1961286@denninger.net>
 <04391154-A38E-46CD-B570-B2BECFD19022@gromit.dlib.vt.edu>
 <d1aba096-e645-04df-dfda-5a9284250960@denninger.net>
 <1468597885.72182.286.camel@freebsd.org>

On Jul 15, 2016, at 11:51 AM, Ian Lepore <ian@freebsd.org> wrote:

> On Fri, 2016-07-15 at 09:44 -0500, Karl Denninger wrote:
>> On 7/15/2016 09:22, Paul Mather wrote:
>>
>>> On Jul 15, 2016, at 9:44 AM, Karl Denninger <karl@denninger.net>
>>> wrote:
>>>
>>>> On 7/15/2016 08:36, Paul Mather wrote:
>>>>> On Jul 14, 2016, at 11:36 PM, Karl Denninger
>>>>> <karl@denninger.net> wrote:
>>>>>
>>>>>> Found it.
>>>>>>
>>>>>> Apparently the current code *requires* the label be set on the
>>>>>> msdos partition. If it's not then not only does it not mount
>>>>>> (which shouldn't matter post-boot as the loader is supposed to
>>>>>> pass the dtb file, it is specified in the config file without
>>>>>> any sort of path prefix, and thus once the kernel has loaded it
>>>>>> should not matter if the dos partition if actually mounted or
>>>>>> not) *but* the boot process hangs without any indication of
>>>>>> why!
>>>>>>
>>>>>> So, you must do newfs_msdos -L MSDOSBOOT -F 16 {device}
>>>>>>
>>>>>> If the "-L" is missing you're hosed; the system facially
>>>>>> appears to be just fine but while the loader comes up and so
>>>>>> does the kernel, it hangs without ever proceeding -- and
>>>>>> without any sort of error message indicating that it is unable
>>>>>> to mount something it needs.
>>>>> You have to do that because the device entry in the stock
>>>>> /etc/fstab is /dev/msdosfs/MSDOSBOOT. The /dev/msdosfs part
>>>>> indicates it's using ms-dos labels. In other words, this is
>>>>> just the same sort of failure you were getting when you weren't
>>>>> labelling the UFS partition as "rootfs". Labelling the file
>>>>> system properly "fixes" the issue, as you would expect.
>>>>>
>>>>> It's a misnomer to say the code "requires" labels. It's just
>>>>> that's the way the distribution images are currently set up. I
>>>>> have an older Pi that predates the current distribution images
>>>>> that just uses /dev/mmcsd0... device names in /etc/fstab. Both
>>>>> approaches work fine. You just need to make sure the devices
>>>>> you specify in /etc/fstab will actually exist when it comes
>>>>> time to mount the corresponding file system.
>>>> Except that if the root filesystem doesn't mount you get an error,
>>>> and thus you can figure out what's going on. What excuse is there
>>>> for not printing an error message if a mount fails, and if
>>>> something in /etc/fstab fails to mount what's with hanging the
>>>> machine? I've had disks be unavailable before on Intel
>>>> architecture machines (it happens when disks fail) and the result
>>>> is an error on the failure to mount but, unless it's the root
>>>> volume, the system still comes up.
>>>
>>> Are you sure you don't get an error? When I forgot to label rootfs
>>> recently when I cloned an SD card I got an error displayed on the
>>> serial console. I didn't get an error on the HDMI screen console.
>> You get an error if rootfs is not labelled on the HDMI screen (as
>> root fails to mount.) There is *no* error on an HDMI screen if the
>> msdosfs is not labeled.
>>> As I've mentioned before directly, FreeBSD/arm acts like
>>> console="comconsole,vidconsole" is in effect. This means that
>>> during /etc/rc boot processing, you'll only get output on
>>> comconsole (except for kernel messages, which seem to go to both).
>>> That's been my experience in FreeBSD in general.
>>>
>>> I dimly recall folks on here saying U-Boot doesn't currently
>>> enable/support USB keyboards, so there's not really much you can do
>>> to fix it interactively if you fail to boot the OS and hence enable
>>> USB keyboard support via FreeBSD. That's not a problem if you use
>>> a serial console, which is supported by U-Boot.
>> Well, that's not true if the kernel is loaded. Once the kernel loads
>> a usb keyboard works.
>>>
>>> I'm not sure comparisons with Intel architecture machines is
>>> entirely appropriate as they use a different boot
>>> environment/mechanism. Still, I stand by the fact that I've always
>>> got an error message on the serial console when disks on my
>>> FreeBSD/arm system have failed to mount at boot. (It used to
>>> happen regularly with an external USB drive I had that took a long
>>> time to probe, and I ended up having to put a kern.cam.boot_delay
>>> in /boot/loader.conf to avoid the system dropping into single-user
>>> mode when doing a reboot.)
>>>
>>>>> If you stop using labels in your /etc/fstab then you won't have
>>>>> problems when those labels are missing. If the labels are
>>>>> missing, the /dev/{msdosfs,ufs} devices will not be present and
>>>>> the system will drop to single-user mode because none-late,
>>>>> non-noauto file systems can't be accessed via their device nodes
>>>>> when attempting to mount them. When that happens and you don't
>>>>> have a serial console enabled then you have problems
>>>>> remediating the situation.
>>>>>
>>>>> If a file system is not needed to mount as part of booting (as
>>>>> you suggest for /boot/msdos) then you should probably flag it
>>>>> with the "noauto" option in /etc/fstab or remove it from
>>>>> /etc/fstab entirely.
>>>>>
>>>>> I think the problem you were having is not copying all the
>>>>> required attributes of the file systems in question when
>>>>> cloning your SD cards, given your /etc/fstab setup. It sounds
>>>>> like you've fixed that, now.
>>>> Again, if it dropped to single user mode *and said it was doing so*
>>>> or if there was an error message on the console when the
>>>> filesystem failed to mount I would have found this in a reasonable
>>>> period of time. It wasn't that rough to do so with the ufs label
>>>> once I knew the filesystem was failing to mount, which was
>>>> discernible from the console output.
>>>>
>>>> Not printing an error when things error out is rude at best, and
>>>> when that error is going to prevent the system from coming up this
>>>> darn well ought to show up where one with a monitor plugged in can
>>>> see it, eh?
>>>>
>>>> There was literally no indication at all as to what was going on
>>>> and since gpart does not show filesystem labels for *either* BSD
>>>> labeled slices OR msdos figuring out what was different between
>>>> the two proved to be a bit troublesome. IMHO at least the failure
>>>> to display an error message in this circumstance ought to be
>>>> corrected.
>>>
>>> See above re: serial console vs. video console.
>>>
>>> As for the labels, these are file system labels and not partition
>>> labels. The big clue is in the device name in /etc/fstab. (The
>>> "-l" option to "gpart show" will only show labels "[f]or
>>> partitioning schemes that support partition labels". That's
>>> reasonable, IMHO, as partitions are not the same as file systems
>>> and gpart is concerned with partitions.) In my experience,
>>> complaints about not being able to access /dev/ufs/something means
>>> you forgot to label a UFS file system as "something" when you made
>>> it. :-)
>>>
>>> Cheers,
>>>
>>> Paul.
>>
>> Understood, but the issue here is that there's no indication without
>> a serial console that you have anything wrong -- the system appears
>> to have simply hung.
>>
>> The quick fix is to put "failok" (or noauto) in the default
>> /etc/fstab entry for the dos filesystem, since it is not necessary
>> for that filesystem to be mounted at all on a running machine. If
>> there is a policy reason to leave it accessible (and there's a
>> fairly-clean argument that there is) then "failok" might be
>> preferable to "noauto", but either way forcing a filesystem that is
>> not necessary to be accessible or the system fails to come up and
>> does not give any indication of same on what many users will have
>> accessible to them is facially wrong.
>>
>> These devices are thought of as "appliances" by many and as such the
>> model of USB keyboard + HDMI (e.g. TV or monitor) is entirely
>> reasonable, and IMHO FreeBSD ought to, when possible, make that a
>> viable option. It both is and can be provided the kernel loads, but
>> the defaults in pre-built configurations right now preclude that.
>>
>
> I'm having a hard time understanding how a problem report got
> generated about all this, or how any of it is anything other than
> "Karl misconfigured his system."
>
> The downloadable system images work correctly. You made a local
> change (formatted new media) and depending on how you want to look at
> it, either you didn't format correctly or you didn't make your config
> files match the way you formatted, and that made your system stop
> working. It doesn't mean there is anything wrong about the way the
> downloadable images are generated.
>
> Changing fstab in the distributed images so that a failure to mount a
> filesystem becomes a non-error seems like a bad idea to me. The only
> way that problem happens with a downloaded image is if the image
> wasn't burned successfully, and that doesn't seem like something that
> needs to just get papered over just because in your use-case you
> don't really need the filesystem that failed to mount.
>
> A PR about the fact that it hung without visibly reporting an error
> may be appropriate. A PR that says we should just paper over the
> error because you don't care about it doesn't seem appropriate.

Maybe it should be filed as a "feature request" rather than a "bug."
Does Bugzilla support the distinction?

I agree with Ian that this is not a bug in the sense that anyone
installing from the distributed images will never trigger it on their
install media.

It is reasonable to file a feature request to omit /boot/msdos as a
mandatory mount. I think when I first was using FreeBSD/arm on my
Raspberry Pi it wasn't mounted, but then that predates the current
distribution images. Now it is. I can see arguments either way and the
current setting makes sense to me. I think Warner and Ian hit the nail
on the head that the real issue is the lack of output on the video
console during /etc/rc processing.

Incidentally, does setting console="vidconsole" in /boot/loader.conf
fix the problem of a lack of /etc/rc messages for those who are using an
HDMI monitor as their primary/only console? If so, there may also be a
case for making that the default if the assumption is that a minority of
people will be using a serial console. (Not a fair assumption right
now, IMHO, but perhaps a fair one going forward as FreeBSD/arm becomes
Tier 1.)

Cheers,

Paul.