Date: Thu, 19 Mar 2020 01:20:02 -0700 From: Mark Millard <marklmi@yahoo.com> To: Toomas Soome <tsoome@me.com>, freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org> Subject: Re: head -r538966 on OrangePi+ 2ed: boot loader crashes when USB drive is present at power-on/boot: its a misaligned access by code from -r354746 Message-ID: <30A666F0-8773-41B7-AD37-7E319AA510CA@yahoo.com> In-Reply-To: <863312E1-4216-49BA-A623-CAC85F123655@yahoo.com> References: <863312E1-4216-49BA-A623-CAC85F123655@yahoo.com>
next in thread | previous in thread | raw e-mail | index | archive | help
[I built with -DDEBUG -DDISK_DEBUG -DPART_DEBUG and show with some extra output as well. It shows that a misaligned access causes the crash. The access in question is from head -r354746 code additions.] On 2020-Mar-18, at 13:36, Mark Millard <marklmi at yahoo.com> wrote: > Without a USB drive present at power-on or > boot, the OPi+2e boots fine. (The USB drives > involved have a partition holding a ufs file > system and a partition holding a swap/aging > space.) >=20 > In all cases below, /boot/ is from the microsd > card. But the intended configuration is for > vfs.root.mountfrom to be used to direct the > stages after kernel-booting to the USB drive. >=20 > The output sequence related to the crash when > the USB drive is present looks like: >=20 > QUOTE > . . . > END QUOTE I'm replacing the original quote with better information. First I quote the definition of dos_partition for reference: struct dos_partition { unsigned char dp_flag; /* bootstrap flags */ unsigned char dp_shd; /* starting head */ unsigned char dp_ssect; /* starting sector */ unsigned char dp_scyl; /* starting cylinder */ unsigned char dp_typ; /* partition type */ unsigned char dp_ehd; /* end head */ unsigned char dp_esect; /* end sector */ unsigned char dp_ecyl; /* end cylinder */ uint32_t dp_start; /* absolute starting sector = number */ uint32_t dp_size; /* partition size in sectors */ }; Note that access to dp_start or dp_size requires address%4=3D=3D0 alignment but the other fields do not. This is important for &dp[i].dp_start reported in the below (and then dp[i].dp_start is accessed in the code). I also had it report dp, which shows that dp has dp%4!=3D0 as well. Here is with -DDEBUG and -DDISK_DEBUG -DPART_DEBUG in place for the code (with some extra debug output added): FreeBSD/armv7 U-Boot loader, Revision 1.3 signature: version =3D 1 checksum =3D 0x98de198b sc entry =3D 0xbdf8cb01 addresses info: _etext (sdata) =3D 0x4204fdb4 _edata =3D 0x4205ebb8 __sbss_start =3D 0x4205ec28 __sbss_end =3D 0x4205ec28 __sbss_start =3D 0x4205ec28 _end =3D 0x42062aa0 syscall entry =3D 0xbdf8cb01 DRAM: 2048MB Number of U-Boot devices: 2 U-Boot env: loaderdev not set, will probe all devices. stor_init(): storage devices found: 2 Found U-Boot device: disk Probing all <unknown> devices... Checking unit=3D0 slice=3D<auto> partition=3D<auto>...disk_open: = disk0s0: unit 0, slice 0, partition -2 =3D> 0x420631c0 stor_readdev(): reading blk=3D0 size=3D1 @ 0x420633c0 stor_readdev(): reading blk=3D64 size=3D1 @ 0x420636c0 stor_readdev(): reading blk=3D1 size=3D1 @ 0x420636c0 ptable_open: BEFORE NDOSPART loop #1 ptable_open: dp=3D0x4206357e ptable_open: IN NDOSPART loop #1 after 1st if ptable_open: IN NDOSPART loop #1: &dp[i].dp_start=3D0x42063586 data abort pc : [<42009350>] lr : [<42009348>] reloc pc : [<ce07d350>] lr : [<ce07d348>] sp : b9f649e8 ip : 4205e200 fp : b9f64a18 r10: 42063640 r9 : b9f64a50 r8 : 420633c0 r7 : 420636c0 r6 : 4200818c r5 : 4205cd95 r4 : 42063586 r3 : bdf8cdf7 r2 : 0000000a r1 : 01c28000 r0 : 000000ee Flags: nZCv IRQs off FIQs off Mode SVC_32 Code: e08f0000 eb0114af e5d801c2 e35000ee (05940000)=20 Resetting CPU ... resetting ... My extra messages are from: # svnlite diff /usr/src/stand/common/part.c Index: /usr/src/stand/common/part.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- /usr/src/stand/common/part.c (revision 358966) +++ /usr/src/stand/common/part.c (working copy) @@ -715,18 +715,26 @@ * start sector 1. After DOSPTYP_PMBR, there may be other = partitions. * UEFI compliant PMBR has no other partitions. */ + DPRINTF("BEFORE NDOSPART loop #1"); + DPRINTF("dp=3D%p", (void*)dp); for (i =3D 0; i < NDOSPART; i++) { if (dp[i].dp_flag !=3D 0 && dp[i].dp_flag !=3D 0x80) { DPRINTF("invalid partition flag %x", = dp[i].dp_flag); goto out; } + DPRINTF("IN NDOSPART loop #1 after 1st if"); + DPRINTF("IN NDOSPART loop #1: &dp[i].dp_start=3D%p", = (void*)&dp[i].dp_start); #ifdef LOADER_GPT_SUPPORT if (dp[i].dp_typ =3D=3D DOSPTYP_PMBR && dp[i].dp_start = =3D=3D 1) { + DPRINTF("BEFORE table->type assignment for PMBR = detected"); + DPRINTF("table=3D%p for PMBR detected", = (void*)table); table->type =3D PTABLE_GPT; DPRINTF("PMBR detected"); } + DPRINTF("IN NDOSPART loop #1 after 2nd if"); #endif } + DPRINTF("AFTER NDOSPART loop #1"); #ifdef LOADER_GPT_SUPPORT if (table->type =3D=3D PTABLE_GPT) { table =3D ptable_gptread(table, dev, dread); Note that "IN NDOSPART loop #1 after 2nd if" is never reported. Nor is "BEFORE table->type assignment for PMBR detected". The condition in the 2nd if never completes its evaluation because of the misaligned access involved in dp[i].dp_start being evaluated. > Stopping it earlier and using: >=20 > setenv loaderdev mmc 0 > boot >=20 > avoids the problem because it avoids "probing" the > USB drive at the stage indicated above. Yep: The above loop is then not evaluated. > But the > boot then actually uses: >=20 > vfs.root.mountfrom=3D"ufs:/dev/gpt/BPIM3root" >=20 > in the /boot/loader.conf on the microsd card. > ufs:/dev/gpt/BPIM3root is a reference to the ufs > partition on the USB media. (With the mountfrom > commented out the microsd card is bootable by > itself.) So later the USB drive is put to use > successfully when the initial probing is avoided. >=20 > Looking in the /usr/src/stand/uboot/common/main.c > code shows: >=20 > static int > probe_disks(int devidx, int load_type, int load_unit, int load_slice, > int load_partition) > { > . . . > if (load_unit =3D=3D -1) { > printf(" Probing all %s devices...\n", = device_typename(load_type)); > /* Try each disk of given type in succession until one = works. */ > for (unit =3D 0; unit < UB_MAX_DEV; unit++) { > currdev.dd.d_unit =3D = uboot_diskgetunit(load_type, unit); > if (currdev.dd.d_unit =3D=3D -1) > break; > print_disk_probe_info(); > open_result =3D devsw[devidx]->dv_open(&f, = &currdev); > if (open_result =3D=3D 0) { > printf(" good.\n"); > return (0); > } > printf("\n"); > } > return (-1); > } > . . . > } >=20 > So it appears that the crash is during the code involved > for the line: >=20 > open_result =3D devsw[devidx]->dv_open(&f, &currdev); >=20 > Note that the boot attempt reported "Probing all <unknown> > devices..." so device_typename(load_type) came up with > "<unknown>". (I've no clue if that is significant to the > issue or not.) Looks like the below paragraph was junk. Sorry. > It appeared that /usr/src/stand/usb/storage/umass_loader.c > and its umass_disk_open and umass_disk_open_sub might > be involved and then code from the likes of: > /usr/src/sys/dev/usb/ --such as from usb_msctest.c for > usb_msc_read_capacity. (I stopped looking around > there: well outside areas I know how to interpret.) > For reference, I used the OPi+2e u-boot material from > my poudriere-devel based port builds: >=20 > # ls -ldT /usr/local/share/u-boot/u-boot-orangepi-plus-2e/* > -rw-r--r-- 1 root wheel 503 Oct 26 19:12:16 2019 = /usr/local/share/u-boot/u-boot-orangepi-plus-2e/README > -rw-r--r-- 1 root wheel 199 Oct 26 19:12:16 2019 = /usr/local/share/u-boot/u-boot-orangepi-plus-2e/boot.scr > -rw-r--r-- 1 root wheel 66 Oct 26 19:12:16 2019 = /usr/local/share/u-boot/u-boot-orangepi-plus-2e/metadata > -rw-r--r-- 1 root wheel 471250 Oct 26 19:12:16 2019 = /usr/local/share/u-boot/u-boot-orangepi-plus-2e/u-boot-sunxi-with-spl.bin >=20 >=20 >=20 > (The USB drives are USB SSDs.) =3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar)
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?30A666F0-8773-41B7-AD37-7E319AA510CA>