Date:      Thu, 22 Feb 2024 18:10:56 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 277211] panic: Unhandled external data abort - handle_el1h_sync - --- exception, esr 0x96000410 - wait_fw_init - mlx5_load_one
Message-ID:  <bug-277211-227-U4WQbMzm8T@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-277211-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-277211-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277211

--- Comment #5 from John Baldwin <jhb@FreeBSD.org> ---
Ah, looks like the dmesg from Dave does actually include this patch as it has
this line of output:

mlx5_core0: translate 0x14082000000 -> 0x24082000000

That looks correct, but unfortunately, we only display the ranges in
bootverbose for FDT, not ACPI.  The patch below fixes the pcib driver to always
log the ranges, which would be useful to confirm the window:

diff --git a/sys/dev/pci/pci_host_generic.c b/sys/dev/pci/pci_host_generic.c
index 386b8411d29a..46b84ff3004b 100644
--- a/sys/dev/pci/pci_host_generic.c
+++ b/sys/dev/pci/pci_host_generic.c
@@ -83,6 +83,7 @@ pci_host_generic_core_attach(device_t dev)
        uint64_t phys_base;
        uint64_t pci_base;
        uint64_t size;
+       const char *range_descr;
        char buf[64];
        int domain, error;
        int flags, rid, tuple, type;
@@ -179,6 +180,7 @@ pci_host_generic_core_attach(device_t dev)
                switch (FLAG_TYPE(sc->ranges[tuple].flags)) {
                case FLAG_TYPE_PMEM:
                        sc->has_pmem = true;
+                       range_descr = "prefetch";
                        flags = RF_PREFETCHABLE;
                        type = SYS_RES_MEMORY;
                        error = rman_manage_region(&sc->pmem_rman,
@@ -186,12 +188,14 @@ pci_host_generic_core_attach(device_t dev)
                        break;
                case FLAG_TYPE_MEM:
                        flags = 0;
+                       range_descr = "memory";
                        type = SYS_RES_MEMORY;
                        error = rman_manage_region(&sc->mem_rman,
                           pci_base, pci_base + size - 1);
                        break;
                case FLAG_TYPE_IO:
                        flags = 0;
+                       range_descr = "I/O port";
                        type = SYS_RES_IOPORT;
                        error = rman_manage_region(&sc->io_rman,
                           pci_base, pci_base + size - 1);
@@ -219,6 +223,10 @@ pci_host_generic_core_attach(device_t dev)
                        error = ENXIO;
                        goto err_rman_manage;
                }
+               if (bootverbose)
+                       device_printf(dev,
+                           "PCI addr: 0x%jx, CPU addr: 0x%jx, Size: 0x%jx, Type: %s\n",
+                           pci_base, phys_base, size, range_descr);
        }

        return (0);

That said, it seems like the translation is correct given the prefetch window
used for the pcib1 bridge between pcib0 and the mlx5 device:

pcib1: <PCI-PCI bridge> at device 0.0 on pci0
pcib1:   domain            0
pcib1:   secondary bus     1
pcib1:   subordinate bus   1
pcib1:   memory decode     0x30000000-0x301fffff
pcib1:   prefetched decode 0x14080000000-0x14083ffffff

And this allocation of mlx5's BAR:

        map[10]: type Prefetchable Memory, range 64, base 0x14082000000, size 25, enabled
pcib1: allocated prefetch range (0x14082000000-0x14083ffffff) for rid 10 of pci0:1:0:0


It is odd for a register BAR to be in a prefetch BAR.  It might be good to see
a verbose dmesg from before to see how the bridge and the mlx5 BAR were
previously configured.

-- 
You are receiving this mail because:
You are the assignee for the bug.


