From: Paul Mather
To: Karl Denninger
Cc: freebsd-arm@freebsd.org
Subject: Re: Bizarre clone attempt failures on Raspberry Pi2...
Date: Fri, 15 Jul 2016 09:36:26 -0400
List-Id: "Porting FreeBSD to ARM processors."

On Jul 14, 2016, at 11:36 PM, Karl Denninger wrote:

> On 7/14/2016 13:27, Karl Denninger wrote:
>> On 7/14/2016 12:55, Ian Lepore wrote:
>> No there wasn't. It was a blank (brand new) card the first time around;
>> it had a MSDOS filesystem on it (as do all new cards) but *no*
>> BSD-specific geom anything on it.
>>> To reliably create a new layout regardless of what may be present
>>> already on the media, you have two choices:
>>>
>>>  1 - dd zeroes to the entire device
>>>  2 - use the "no commit" feature of gpart
>> Actually in the case at hand #1 isn't impractical since I really only
>> care about the first 100MB or so being zeroed. The reason is that my
>> boot block (the MSDOS fs) is ~50MB and the label is obviously next, so
>> if we zero the first 100MB we're fine.
>>
>> And in fact that does work.
>>> When you pass no '-f ' to a gpart command, it automatically adds
>>> the "-f C" (commit) flag behind your back. There is no "don't commit"
>>> flag, so (this is surrealistically crazy...) what you're supposed to do
>>> is pass an invalid flag, which it won't complain about, in order to
>>> prevent it from automatically adding that 'C' flag you didn't even
>>> realize existed. (This is where *I* curse whoever coded this mess.)
>>>
>>> When you don't commit, the changes take place in a sort of 'virtual
>>> workspace' and nothing on the physical disk changes until you do a
>>> "gpart commit" (or "gpart undo" to discard the changes). Making all
>>> this much-less-cool than it's sounding right now, there is no automatic
>>> recursion for commit and undo... if you create a bunch of nested stuff
>>> (a slice, a geom within that slice, partitions within that geom), then
>>> you have to commit all the pending new geoms *in reverse order of how
>>> they were created*.
>>>
>>> So, using da0 (since it's shorter to type), the sequence goes like:
>>>
>>>  gpart destroy -f x -F da0
>>>  gpart create -f x -s MBR da0
>>>  gpart add -f x -t \!12 -s 64M -a 4M da0
>>>  gpart add -f x -t freebsd -a 4M da0
>>>  gpart destroy -f x -F da0s2
>>>  gpart create -f x -s BSD da0s2
>>>  gpart add -f x -t freebsd-ufs da0s2
>>>  gpart commit da0s2
>>>  gpart commit da0
>>>  newfs_msdos /dev/da0s1
>>>  newfs -U /dev/da0s2a
>>>
>>> And that reliably creates a fresh rpi-style layout regardless of what
>>> was on the media before you started.
>> Ok, I will try this, BUT I suspect it's still screwed (blind) because
>> when I zeroed the front of the disk I got a "correct" partition layout
>> but after populating it what I get still hangs after it mounts root in
>> the same place. The way to prevent the alignment issue from coming up
>> is to specify a "-b" switch on the "add", giving you a block offset.
>> "-b 64" is sufficient; now if the system tries to "taste" da0s2 it will
>> fail (as it does for the card that is running) but "tasting" da0s2a
>> succeeds.
>>> Now, to address the question of the filesystem existing at da0s2 versus
>>> da0s2a, the difference is alignment. Making things even more
>>> confusing, alignment (if you don't specify it) sometimes changes based
>>> on the type and brand of usb sdcard reader you're using and the fake
>>> geometry values it reports to the system. (A USB reader almost always
>>> reports different fake geometry than a native sd slot would on a
>>> machine with non-USB based sd support.)
>> Yes, I understand that; if the alignment matches thus the "a" partition
>> starts at offset zero then you can actually reference that (although
>> length might be wrong) with the base device. After all, what it really
>> does is look at the blocks to see if the magic number is good, and if so
>> it tries to read and process it.
>>
>> But this doesn't explain why, after getting a layout that's correct (by
>> writing zeros to the front of the card first, so anything that "might"
>> be there isn't) and copying all the file structure over (which facially
>> not only appears to be correct but the loader finds and loads the
>> kernel, AND the root filesystem mounts!) the system hangs, apparently
>> just before init gets started.
>>
>> If init can't be found you should get a complaint (been there, done
>> that) on the console but there is no complaint of any sort.
>>
>> I've gotten through the bad structure issue on the SD card, and am now
>> left with "why does it hang on boot -- with no error or other indication
>> of what the problem is" after the kernel loads *and* the root filesystem
>> mounts?
>>
> Found it.
>
> Apparently the current code *requires* the label be set on the msdos
> partition. If it's not then not only does it not mount (which shouldn't
> matter post-boot as the loader is supposed to pass the dtb file, it is
> specified in the config file without any sort of path prefix, and thus
> once the kernel has loaded it should not matter if the dos partition is
> actually mounted or not) *but* the boot process hangs without any
> indication of why!
>
> So, you must do newfs_msdos -L MSDOSBOOT -F 16 {device}
>
> If the "-L" is missing you're hosed; the system facially appears to be
> just fine but while the loader comes up and so does the kernel, it hangs
> without ever proceeding -- and without any sort of error message
> indicating that it is unable to mount something it needs.

You have to do that because the device entry in the stock /etc/fstab is
/dev/msdosfs/MSDOSBOOT. The /dev/msdosfs part indicates it's using
ms-dos labels. In other words, this is just the same sort of failure
you were getting when you weren't labelling the UFS partition as
"rootfs". Labelling the file system properly "fixes" the issue, as you
would expect.
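
As a rough illustration of that point (a sketch only, not something that
ships with the images): the sh function below lists the /etc/fstab
entries whose device nodes are missing. The name check_fstab_devices and
its optional second argument, a directory prefix for checking against a
staged tree, are made up for this example.

```sh
#!/bin/sh
# check_fstab_devices FSTAB [DEVROOT]
# Hypothetical helper: print every fstab entry that would be mounted
# automatically during boot but whose device node does not exist.
# DEVROOT (empty by default) is an assumed prefix prepended to each
# device path, so the check can also be run against a staged directory.
check_fstab_devices() {
    fstab=$1
    devroot=${2:-}
    while read -r dev mnt fstype opts rest; do
        case $dev in
            ''|\#*) continue ;;              # blank line or comment
        esac
        case ",$opts," in
            *,noauto,*|*,late,*) continue ;; # skipped in the early mount pass
        esac
        case $dev in
            /dev/*) ;;                       # only look at device paths
            *) continue ;;                   # e.g. tmpfs, md
        esac
        [ -e "$devroot$dev" ] || echo "missing: $dev (for $mnt)"
    done < "$fstab"
}
```

Run on the Pi itself (from single-user mode, say) as
"check_fstab_devices /etc/fstab"; any line it prints names a device that
the fstab expects but that is not there, such as a label that was never
set with newfs_msdos -L.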

It's a misnomer to say the code "requires" labels. It's just that's the
way the distribution images are currently set up. I have an older Pi
that predates the current distribution images that just uses
/dev/mmcsd0... device names in /etc/fstab. Both approaches work fine.
You just need to make sure the devices you specify in /etc/fstab will
actually exist when it comes time to mount the corresponding file
system.

If you stop using labels in your /etc/fstab then you won't have problems
when those labels are missing. If the labels are missing, the
/dev/{msdosfs,ufs} devices will not be present and the system will drop
to single-user mode because non-late, non-noauto file systems can't be
accessed via their device nodes when attempting to mount them. When
that happens and you don't have a serial console enabled then you have
problems remediating the situation.

If a file system is not needed to mount as part of booting (as you
suggest for /boot/msdos) then you should probably flag it with the
"noauto" option in /etc/fstab or remove it from /etc/fstab entirely.

I think the problem you were having is not copying all the required
attributes of the file systems in question when cloning your SD cards,
given your /etc/fstab setup. It sounds like you've fixed that, now.

Cheers,

Paul.

>
> I can clone cards now.
>
> --
> Karl Denninger
> karl@denninger.net
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/