Date:      Mon, 11 Jan 2021 22:42:11 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Gordon Bergling <gbe@freebsd.org>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: PR 252541: Early kernel panic on RPi4B (Too many early devmatch mappings)
Message-ID:  <7655A4A0-B74E-41B5-8E93-8F39CD462A81@yahoo.com>
In-Reply-To: <784263FD-D17C-4CA5-991E-FE93E3E584F3@yahoo.com>
References:  <X/y5YbRUMOyn4Hwl@lion.0xfce3.net> <7C6DC946-B7B6-42C8-A8B9-0471ED7B77AA@yahoo.com> <F0031010-EBB0-4DDE-B9D1-20A0F161E4EA@yahoo.com> <784263FD-D17C-4CA5-991E-FE93E3E584F3@yahoo.com>

On 2021-Jan-11, at 18:10, Mark Millard <marklmi at yahoo.com> wrote:

> On 2021-Jan-11, at 16:19, Mark Millard <marklmi at yahoo.com> wrote:
>
>>
>>
>> On 2021-Jan-11, at 14:23, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>
>>> On 2021-Jan-11, at 12:47, Gordon Bergling <gbe at freebsd.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am currently investigating PR 252541 (Too many early devmatch mappings) [1].
>>>>
>>>> The kernel panic happens on the RPi4B. Has anyone successfully booted a revision
>>>> on the RPi4B after:
>>>>
>>>> ---------------------------------------------------------------------------------
>>>> commit e83fdf8bb391579fa422d34663cd8c1f82a00dc0
>>>> Author:     Chuck Tuffli <chuck@FreeBSD.org>
>>>> AuthorDate: 2021-01-08 22:36:37 +0000
>>>> Commit:     Chuck Tuffli <chuck@FreeBSD.org>
>>>> CommitDate: 2021-01-08 22:41:45 +0000
>>>>
>>>> fix big-endian platforms after 6733401935f8
>>>>
>>>> The NVMe byte-swap routines for big-endian platforms used memcpy() to
>>>> move the unaligned 64-bit value into a temp register to byte swap it.
>>>> Instead of introducing a dependency, manually byte-swap the values in
>>>> place.
>>>> ---------------------------------------------------------------------------------
>>>>
>>>> --Gordon
>>>>
>>>> [1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252541
>>>
>>> I do my own builds but I'm running based on 19cca0b9613d
>>> with CommitDate 2021-01-09 16:21:33 -0800 :
>>>
>>> # ~/fbsd-based-on-what-freebsd-main.sh mm-src
>>> 19cca0b9613d7c3058e41baf0204245119732235
>>> CommitDate: 2021-01-09 16:21:33 -0800
>>> 5d333ee67ac3 19cca0b9613d (HEAD -> mm-src) mm-src snapshot for mm's patched build in git context.
>>> FreeBSD RPi4B 13.0-CURRENT FreeBSD 13.0-CURRENT mm-src-c255807-g5d333ee67ac3 GENERIC-NODBG  arm64 aarch64 1300134 1300134
>>>
>>> In other words, the history spanning e83fdf8bb391 and
>>> later is:
>>>
>>> * aio: fix the tests when ZFS is not available	Alan Somers	46 hours	2	-0/+5
>>> * linuxkpi: Fix the "error: unknown type name 'u32'" compilation issue when	Neel Chauhan	47 hours	1	-0/+1
>>> * netmap: vtnet: stop krings during interface reset	Vincenzo Maffione	48 hours	1	-1/+7
>>> * netmap: refactor netmap_reset	Vincenzo Maffione	2 days	1	-45/+20
>>> * netmap: iflib: fix asserts in netmap_fl_refill()	Vincenzo Maffione	2 days	1	-1/+2
>>> * netmap: iflib: stop krings during interface reset	Vincenzo Maffione	2 days	2	-1/+10
>>> * fileargs: add tests	Mariusz Zaborski	2 days	3	-0/+625
>>> * tcp: don't use KTLS socket option on listening sockets	Michael Tuexen	4 days	1	-0/+10
>>> * arm: revert MAXDSIZ change from 202aea9c82ea	Kyle Evans	2 days	1	-1/+1
>>> * kevent(2): Bugfix for wrong EVFILT_TIMER timeouts	Jan Kokemüller	2 days	1	-1/+1
>>> * ldd: renumber executable type constants	Ed Maste	2 days	1	-2/+2
>>> * diff: honour flags with -q	Ed Maste	2 days	2	-1/+13
>>> * sysctl: improve debug.kdb.panic_str description	Warner Losh	2 days	1	-1/+1
>>> * last(1): Add EXAMPLES section	Fernando Apesteguía	2 days	1	-4/+22
>>> * man(1): Bump .Dd	Fernando Apesteguía	2 days	1	-1/+1
>>> * man(1): Add EXAMPLES section	Fernando Apesteguía	2 days	1	-0/+35
>>> * mvneta: Acquire the softc lock before clearing the MIB	Mark Johnston	2 days	1	-0/+2
>>> * Add fib lookup testing module.	Alexander V. Chernikov	2 days	2	-0/+548
>>> * Bring DPDK route lookups to FreeBSD.	Alexander V. Chernikov	2 days	17	-0/+6030
>>> * Fix LINT kernel build after 01f2e864f79584c0cd250a8e7cfb501a9985768a.	Hans Petter Selasky	3 days	1	-1/+4
>>> * certctl: factor out certname resolution	Kyle Evans	3 days	1	-2/+17
>>> * certctl: replace hardcoded uses of /usr/local	Kyle Evans	3 days	1	-2/+3
>>> * fix big-endian platforms after 6733401935f8	Chuck Tuffli	3 days	1	-5/+9
>>>
>>> The RPi4B is an 8 GiByte one, booted directly from a USB3 SSD,
>>> no microsd card involved. I can boot either u-boot style or
>>> UEFI/ACPI style from the same media, just switching config.txt
>>> content.
>>>
>>> I do not have MMCCAM or the like:
>>>
>>> # more /usr/fbsd/mm-src/sys/arm64/conf/GENERIC-NODBG
>>> #
>>> # GENERIC -- Custom configuration for the arm64/aarch64
>>> #
>>>
>>> include "GENERIC"
>>>
>>> ident   GENERIC-NODBG
>>>
>>> makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
>>>
>>> options         ALT_BREAK_TO_DEBUGGER
>>>
>>> options         KDB                     # Enable kernel debugger support
>>>
>>> # For minimum debugger support (stable branch) use:
>>> #options        KDB_TRACE               # Print a stack trace for a panic
>>> options         DDB                     # Enable the kernel debugger
>>>
>>> # Extra stuff:
>>> #options        VERBOSE_SYSINIT=0       # Enable verbose sysinit messages
>>> #options        BOOTVERBOSE=1
>>> #options        BOOTHOWTO=RB_VERBOSE
>>> #options        KTR
>>> #options        KTR_MASK=KTR_TRAP
>>> ##options       KTR_CPUMASK=0xF
>>> #options        KTR_VERBOSE
>>>
>>> # Disable any extra checking for. . .
>>> nooptions       DEADLKRES               # Enable the deadlock resolver
>>> nooptions       INVARIANTS              # Enable calls of extra sanity checking
>>> nooptions       INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
>>> nooptions       WITNESS                 # Enable checks to detect deadlocks and cycles
>>> nooptions       WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
>>> nooptions       DIAGNOSTIC
>>> nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones
>>> nooptions       BUF_TRACKING
>>> nooptions       FULL_BUF_TRACKING
>>>
>>
>> Looks like the message is from a KASSERT that does nothing
>> unless INVARIANTS/INVARIANT_SUPPORT is enabled.
>>
>> Unfortunately, artifacts.ci.freebsd.org has not started
>> getting git-based main builds yet. Normally I'd support
>> an official debug kernel from there and see if I could
>> repeat the problem.
>>
>> So I've made my own debug kernel build for things as they
>> are in my context and it reproduced the problem:
>>
>> Hit [Enter] to boot immediately, or any other key for command prompt.
>> Booting [/boot/kernel/kernel]...
>> Using DTB provided by EFI at 0x7ef0000.
>> EFI framebuffer information:
>> addr, size     0x3e2fe000, 0x7e9000
>> dimensions     1920 x 1080
>> stride         1920
>> masks          0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
>> ---<<BOOT>>---
>> panic: Too many early devmap mappings
>> cpuid = 0
>> time = 1
>> KDB: stack backtrace:
>> (null)() at 0xffff000000116980
>>        pc = 0xffff000000772af4  lr = 0xffff000000116980
>>        sp = 0xffff0000011f1320  fp = 0xffff0000011f1520
>>
>> (null)() at 0xffff000000464710
>>        pc = 0xffff000000116980  lr = 0xffff000000464710
>>        sp = 0xffff0000011f1530  fp = 0xffff0000011f1590
>>
>> (null)() at 0xffff0000004644b4
>>        pc = 0xffff000000464710  lr = 0xffff0000004644b4
>>        sp = 0xffff0000011f15a0  fp = 0xffff0000011f1650
>>
>> (null)() at 0xffff0000007e9838
>>        pc = 0xffff0000004644b4  lr = 0xffff0000007e9838
>>        sp = 0xffff0000011f1660  fp = 0xffff0000011f1660
>>
>> (null)() at 0xffff00000076f744
>>        pc = 0xffff0000007e9838  lr = 0xffff00000076f744
>>        sp = 0xffff0000011f1670  fp = 0xffff0000011f1690
>>
>> (null)() at 0xffff000000782904
>>        pc = 0xffff00000076f744  lr = 0xffff000000782904
>>        sp = 0xffff0000011f16a0  fp = 0xffff0000011f16c0
>>
>> (null)() at 0xffff0000002896b0
>>        pc = 0xffff000000782904  lr = 0xffff0000002896b0
>>        sp = 0xffff0000011f16d0  fp = 0xffff0000011f1790
>>
>> (null)() at 0xffff0000007d9bb0
>>        pc = 0xffff0000002896b0  lr = 0xffff0000007d9bb0
>>        sp = 0xffff0000011f17a0  fp = 0xffff0000011f1820
>>
>> (null)() at 0xffff00000028b814
>>        pc = 0xffff0000007d9bb0  lr = 0xffff00000028b814
>>        sp = 0xffff0000011f1830  fp = 0xffff0000011f1840
>>
>> (null)() at 0xffff00000039e448
>>        pc = 0xffff00000028b814  lr = 0xffff00000039e448
>>        sp = 0xffff0000011f1850  fp = 0xffff0000011f1870
>>
>> (null)() at 0xffff0000004af2ac
>>        pc = 0xffff00000039e448  lr = 0xffff0000004af2ac
>>        sp = 0xffff0000011f1880  fp = 0xffff0000011f18b0
>>
>> (null)() at 0xffff00000077ef90
>>        pc = 0xffff0000004af2ac  lr = 0xffff00000077ef90
>>        sp = 0xffff0000011f18c0  fp = 0xffff0000011f1a00
>>
>> (null)() at 0xffff00000000089c
>>        pc = 0xffff00000077ef90  lr = 0xffff00000000089c
>>        sp = 0xffff0000011f1a10  fp = 0x0000000000000000
>>
>> KDB: enter: panic
>> [ thread pid 0 tid 0 ]
>> Stopped at      0xffff0000004aeeb4
>> db> dump
>> Cannot dump: no dump device specified.
>> db>
>
>
> I stuck in some printf's showing figures in hexadecimal:
>
> . . .
> ---<<BOOT>>---
> pmap_mapdev early_boot: akva_devmap_vaddr: ffff007fff816000 size: 1000
> pmap_mapdev early_boot: va: ffff007fff815000 VM_MAX_KERNEL_ADDRESS: ffff008000000000 L2_SIZE: 200000
> panic: Too many early devmap mappings
> cpuid = 0
> . . .
>
> For reference:
>
> #if defined(__aarch64__) || defined(__riscv)
>        if (early_boot) {
> printf("pmap_mapdev early_boot: akva_devmap_vaddr: %jx size: %jx\n",
> (uintmax_t) akva_devmap_vaddr, (uintmax_t) size);
>                akva_devmap_vaddr = trunc_page(akva_devmap_vaddr - size);
>                va = akva_devmap_vaddr;
> printf("pmap_mapdev early_boot: va: %jx VM_MAX_KERNEL_ADDRESS: %jx L2_SIZE: %jx\n",
> (uintmax_t) va, (uintmax_t) VM_MAX_KERNEL_ADDRESS, (uintmax_t) L2_SIZE);
>                KASSERT(va >= VM_MAX_KERNEL_ADDRESS - L2_SIZE,
>                    ("Too many early devmap mappings"));
>        } else
> #endif
>
> So (hexadecimal):
>
> VM_MAX_KERNEL_ADDRESS - L2_SIZE == ffff007fffe00000
>
> and so va < VM_MAX_KERNEL_ADDRESS - L2_SIZE:
>
> ffff007fff815000 < ffff007fffe00000
>
> by:
>
> ffff007fffe00000-ffff007fff815000 == 5eb000
>
> I've not done anything to track down a relationship to
> e83fdf8bb391 .

The bisect point appears to make no sense, in that the
change was:

diff --git a/sys/dev/nvme/nvme.h b/sys/dev/nvme/nvme.h
index 67d02ba73fd8..b28a8d4348db 100644
--- a/sys/dev/nvme/nvme.h
+++ b/sys/dev/nvme/nvme.h
@@ -2042,16 +2042,20 @@ static inline void
 nvme_device_self_test_swapbytes(struct nvme_device_self_test_page *s __unused)
 {
 #if _BYTE_ORDER != _LITTLE_ENDIAN
-	uint64_t failing_lba;
-	uint32_t r;
+	uint8_t *tmp;
+	uint32_t r, i;
+	uint8_t b;
 
 	for (r = 0; r < 20; r++) {
 		s->result[r].poh = le64toh(s->result[r].poh);
 		s->result[r].nsid = le32toh(s->result[r].nsid);
 		/* Unaligned 64-bit loads fail on some architectures */
-		memcpy(&failing_lba, s->result[r].failing_lba, sizeof(failing_lba));
-		failing_lba = le64toh(failing_lba);
-		memcpy(s->result[r].failing_lba, &failing_lba, sizeof(failing_lba));
+		tmp = s->result[r].failing_lba;
+		for (i = 0; i < 4; i++) {
+			b = tmp[i];
+			tmp[i] = tmp[7-i];
+			tmp[7-i] = b;
+		}
 	}
 #endif
 }


This seems to only matter for the nvme device handling and
it appears to be a no-op for little endian contexts, both
before and after the change. So far as I know nvme is not
involved in the failing context and the code will be
compiled for little endian for the failing context.

The commits just-before and just-after also seem unlikely
candidates.

One of the reasons I prefer to test with artifacts.ci
debug kernels is that it avoids things like my use of
-mcpu=cortex-a72 that my normal procedures are set up
for. Also, my context was an amd64->aarch64 cross-build
instead of being an aarch64 native build.

For reference, my aarch64 debug kernel config file,
cortex-A72 src.conf like file, make.conf like file, and
the script that runs the cortex-A72 debug kernel build
look like:

# more sys/arm64/conf/GENERIC-DBG
#
# GENERIC -- Custom configuration for the arm64/aarch64
#

include "GENERIC"

ident   GENERIC-DBG

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

options         ALT_BREAK_TO_DEBUGGER

options         KDB                     # Enable kernel debugger support

# For minimum debugger support (stable branch) use:
options         KDB_TRACE               # Print a stack trace for a panic
options         DDB                     # Enable the kernel debugger

# Extra stuff:
#options        VERBOSE_SYSINIT=0       # Enable verbose sysinit messages
#options        BOOTVERBOSE=1
#options        BOOTHOWTO=RB_VERBOSE
#options        KTR
#options        KTR_MASK=KTR_TRAP|KTR_PROC
##options       KTR_CPUMASK=0xF
#options        KTR_VERBOSE

# Enable any extra checking for. . .
options         DEADLKRES               # Enable the deadlock resolver
options         INVARIANTS              # Enable calls of extra sanity checking
options         INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
options         WITNESS                 # Enable checks to detect deadlocks and cycles
options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
options         DIAGNOSTIC
options         MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones
options         BUF_TRACKING
options         FULL_BUF_TRACKING


# more ~/src.configs/src.conf.cortexA72dbg-clang-bootstrap.amd64-host
TO_TYPE=aarch64
TOOLS_TO_TYPE=${TO_TYPE}
#
KERNCONF=GENERIC-DBG
TARGET=arm64
.if ${.MAKE.LEVEL} == 0
TARGET_ARCH=${TO_TYPE}
.export TARGET_ARCH
.endif
#
#WITH_CROSS_COMPILER=
WITH_SYSTEM_COMPILER=
WITH_SYSTEM_LINKER=
#
WITH_LIBCPLUSPLUS=
#WITH_LLD_BOOTSTRAP=
WITHOUT_BINUTILS_BOOTSTRAP=
WITH_ELFTOOLCHAIN_BOOTSTRAP=
#Disables avoiding bootstrap: WITHOUT_LLVM_TARGET_ALL=
WITH_LLVM_TARGET_AARCH64=
WITH_LLVM_TARGET_ARM=
WITHOUT_LLVM_TARGET_MIPS=
WITHOUT_LLVM_TARGET_POWERPC=
WITHOUT_LLVM_TARGET_RISCV=
WITHOUT_LLVM_TARGET_X86=
#WITH_CLANG_BOOTSTRAP=
WITH_CLANG=
WITH_CLANG_IS_CC=
WITH_CLANG_FULL=
WITH_CLANG_EXTRAS=
WITH_LLD=
WITH_LLD_IS_LD=
WITHOUT_BINUTILS=
WITH_LLDB=
#
WITH_BOOT=
WITHOUT_LIB32=
#
#
WITHOUT_WERROR=
#WERROR=
#MALLOC_PRODUCTION=
WITHOUT_MALLOC_PRODUCTION=
WITH_ASSERT_DEBUG=
WITH_LLVM_ASSERTIONS=
#
# Avoid stripping but do not control host -g status as well:
DEBUG_FLAGS+=
#
WITH_REPRODUCIBLE_BUILD=
WITH_DEBUG_FILES=
#
XCFLAGS+= -mcpu=cortex-a72
XCXXFLAGS+= -mcpu=cortex-a72
# There is no XCPPFLAGS but XCPP gets XCFLAGS content.
ACFLAGS.arm64cpuid.S+=  -mcpu=cortex-a72+crypto
ACFLAGS.aesv8-armx.S+=  -mcpu=cortex-a72+crypto
ACFLAGS.ghashv8-armx.S+=        -mcpu=cortex-a72+crypto

The ~/src.configs/make.conf is just comments.

# more ~/sys_build_scripts.amd64-host/make-cortexA72-debug-clang-bootstrap-amd64-host.sh
kldload -n filemon && \
script ~/sys_typescripts/typescript_make_cortexA72_debug_clang_bootstrap-amd64-host-$(date +%Y-%m-%d:%H:%M:%S) \
env __MAKE_CONF="/root/src.configs/make.conf" SRCCONF="/dev/null" SRC_ENV_CONF="/root/src.configs/src.conf.cortexA72dbg-clang-bootstrap.amd64-host" \
WITH_META_MODE=yes \
MAKEOBJDIRPREFIX="/usr/obj/cortexA72dbg_clang/arm64.aarch64" \
make $*


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



