Date:      Fri, 12 Jan 2024 13:53:15 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Doug Rabson <dfr@rabson.org>
Cc:        Current FreeBSD <freebsd-current@freebsd.org>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: 15 & 14: ram_attach vs. its using regions_to_avail vs. "bus_alloc_resource" can lead to: panic("ram_attach: resource %d failed to attach", rid)
Message-ID:  <D5D3DDFC-8183-4F0B-98BB-FEB3E8392B09@yahoo.com>
In-Reply-To: <CACA0VUjwCcXTq5m=S8_Mj6pTpiyZ7=v7um=NxsR3Uj1c-_vuYQ@mail.gmail.com>
References:  <3CB904C2-D983-4EF7-84D3-6BDED0700B08.ref@yahoo.com> <3CB904C2-D983-4EF7-84D3-6BDED0700B08@yahoo.com> <CACA0VUjwCcXTq5m=S8_Mj6pTpiyZ7=v7um=NxsR3Uj1c-_vuYQ@mail.gmail.com>

On Jan 12, 2024, at 09:57, Doug Rabson <dfr@rabson.org> wrote:

> On Sat, 30 Sept 2023 at 08:47, Mark Millard <marklmi@yahoo.com> wrote:
> ram_attach is based on regions_to_avail but that is a problem for
> its later bus_alloc_resource use --and that can lead to:
>
> panic("ram_attach: resource %d failed to attach", rid);
>
> Unfortunately, the known example is use of EDK2 on RPi4B
> class systems, not what is considered the supported way.
> The panic happens for main [so: 15] and will happen once
> the cortex-a72 handling in 14.0-* is in a build fixed by:
>
>     • git: 906bcc44641d - releng/14.0 - arm64: Fix errata workarounds that depend on smccc Andrew Turner
>
> The lack of the fix leads to an earlier panic as stands.
>
>
> sys/kern/subr_physmem.c 's regions_to_avail is based on ignoring
> phys_avail and using only hwregions and exregions. In other words,
> in part:
>
>  * Initially dump_avail and phys_avail are identical.  Boot time memory
>  * allocations remove extents from phys_avail that may still be included
>  * in dumps.
>
> This means that early, dedicated memory allocations are treated
> as available for general use by regions_to_avail . The distinction
> is visible in the  boot -v output in that:
>
> real memory  = 3138154496 (2992 MB)
> Physical memory chunk(s):
> 0x00000000200000 - 0x0000002b7fffff, 727711744 bytes (177664 pages)
> 0x0000002ce3a000 - 0x0000003385ffff, 111304704 bytes (27174 pages)
> 0x000000338c0000 - 0x000000338c6fff, 28672 bytes (7 pages)
> 0x00000033a30000 - 0x00000036efffff, 55377920 bytes (13520 pages)
> 0x000000372e0000 - 0x0000003b2fffff, 67239936 bytes (16416 pages)
> 0x00000040000000 - 0x000000bb3dcfff, 2067648512 bytes (504797 pages)
> avail memory = 3027378176 (2887 MB)
>
> does not list the wider:
>
> 0x00000040000000 - 0x000000bfffffff
>
> because of phys_avail . But the earlier dump based on hwregions and
> exregions shows:
>
> Physical memory chunk(s):
>   0x001d0000 - 0x001effff,     0 MB (     32 pages)
>   0x00200000 - 0x338c6fff,   822 MB ( 210631 pages)
>   0x33920000 - 0x3b2fffff,   121 MB (  31200 pages)
>   0x40000000 - 0xbfffffff,  2048 MB ( 524288 pages)
> Excluded memory regions:
>   0x001d0000 - 0x001effff,     0 MB (     32 pages) NoAlloc
>   0x2b800000 - 0x2ce39fff,    22 MB (   5690 pages) NoAlloc
>   0x33860000 - 0x338bffff,     0 MB (     96 pages) NoAlloc
>   0x33920000 - 0x33a2ffff,     1 MB (    272 pages) NoAlloc
>   0x36f00000 - 0x372dffff,     3 MB (    992 pages) NoAlloc
>
> which indicates:
>
>   0x40000000 - 0xbfffffff
>
> is available as far as it is concerned.
>
> (Note some code works/displays in terms of: 0x40000000 - 0xc0000000
> instead.)
>
> For aarch64 , sys/arm64/arm64/nexus.c has a nexus_alloc_resource
> that is used as bus_alloc_resource . It ends up rejecting the
> RPi4B boot via using the result of the call in ram_attach:
>
>                 if (bus_alloc_resource(dev, SYS_RES_MEMORY, &rid, start, end,
>                     end - start, 0) == NULL)
>                         panic("ram_attach: resource %d failed to attach", rid);
>
> as shown by the just-prior start/end pair sequence messages:
>
> ram0: reserving memory region:   200000-2b800000
> ram0: reserving memory region:   2ce3a000-33860000
> ram0: reserving memory region:   338c0000-338c7000
> ram0: reserving memory region:   33a30000-36f00000
> ram0: reserving memory region:   372e0000-3b300000
> ram0: reserving memory region:   40000000-c0000000
> panic: ram_attach: resource 5 failed to attach
>
> I do not see anything about this that looks inherently RPi*
> specific for possibly ending up with an analogous panic. So
> I expect the example is sufficient context to identify a
> problem is present, despite EDK2 use not being normal for
> RPi4B's and the like as far as FreeBSD is concerned.
>
> I'm not quite clear why phys_avail changes

Do not be confused by a common label being used for distinct
data. Note "phys_avail" vs. "hwregions" below, despite both
being printed under the label "Physical memory chunk(s):" :

static void
cpu_startup(void *dummy)
{
        vm_paddr_t size;
        int i;

        printf("real memory  = %ju (%ju MB)\n", ptoa((uintmax_t)realmem),
            ptoa((uintmax_t)realmem) / 1024 / 1024);

        if (bootverbose) {
                printf("Physical memory chunk(s):\n");
                for (i = 0; phys_avail[i + 1] != 0; i += 2) {
                        size = phys_avail[i + 1] - phys_avail[i];
                        printf("%#016jx - %#016jx, %ju bytes (%ju pages)\n",
                            (uintmax_t)phys_avail[i],
                            (uintmax_t)phys_avail[i + 1] - 1,
                            (uintmax_t)size, (uintmax_t)size / PAGE_SIZE);
                }
        }
. . .

vs.

physmem_dump_tables(int (*prfunc)(const char *, ...) __printflike(1, 2))
{
        size_t i;
        int flags;
        uintmax_t addr, size;
        const unsigned int mbyte = 1024 * 1024;

        prfunc("Physical memory chunk(s):\n");
        for (i = 0; i < hwcnt; ++i) {
                addr = hwregions[i].addr;
                size = hwregions[i].size;
                prfunc("  0x%08jx - 0x%08jx, %5ju MB (%7ju pages)\n", addr,
                    addr + size - 1, size / mbyte, size / PAGE_SIZE);
        }
        prfunc("Excluded memory regions:\n");
        for (i = 0; i < excnt; ++i) {
                addr  = exregions[i].addr;
                size  = exregions[i].size;
                flags = exregions[i].flags;
                prfunc("  0x%08jx - 0x%08jx, %5ju MB (%7ju pages) %s %s\n",
                    addr, addr + size - 1, size / mbyte, size / PAGE_SIZE,
                    (flags & EXFLAG_NOALLOC) ? "NoAlloc" : "",
                    (flags & EXFLAG_NODUMP)  ? "NoDump" : "");
        }
. . .

In other words, phys_avail does not change. phys_avail and hwregions
serve different purposes and can hold distinct values, but both are
involved overall, and a net result has to be generated from them.
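
As a rough illustration of how the hwregions/exregions view yields the
range list that ram_attach walks, here is a minimal userland sketch. It
is not the kernel's regions_to_avail (which also deals with page
rounding, flags, and size limits). The names struct range,
subtract_ranges, and rpi4b_example_avail are hypothetical, and the
addresses are copied from the quoted boot -v tables, converted to
exclusive end addresses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a half-open physical range [start, end). */
struct range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/*
 * Subtract the excluded ranges from the hardware ranges.
 * Assumes each input array is sorted by start address and has no
 * internal overlaps. Writes at most outmax results; returns the count.
 */
static size_t
subtract_ranges(const struct range *hw, size_t hwcnt,
    const struct range *ex, size_t excnt,
    struct range *out, size_t outmax)
{
	size_t n = 0;

	assert(hw != NULL && ex != NULL && out != NULL);
	for (size_t h = 0; h < hwcnt; h++) {
		uint64_t cur = hw[h].start;
		uint64_t end = hw[h].end;

		for (size_t e = 0; e < excnt && cur < end; e++) {
			if (ex[e].end <= cur)
				continue;	/* exclusion is below cur */
			if (ex[e].start >= end)
				break;		/* sorted: no more overlap */
			if (ex[e].start > cur && n < outmax) {
				/* Gap before this exclusion is usable. */
				out[n].start = cur;
				out[n].end = ex[e].start;
				n++;
			}
			cur = ex[e].end;	/* skip past the exclusion */
		}
		if (cur < end && n < outmax) {
			/* Tail of the hardware range after the last hole. */
			out[n].start = cur;
			out[n].end = end;
			n++;
		}
	}
	return (n);
}

/* Feed in the hwregions/exregions values from the quoted dump. */
static size_t
rpi4b_example_avail(struct range *out, size_t outmax)
{
	static const struct range hw[] = {
		{ 0x001d0000, 0x001f0000 },
		{ 0x00200000, 0x338c7000 },
		{ 0x33920000, 0x3b300000 },
		{ 0x40000000, 0xc0000000 },
	};
	static const struct range ex[] = {
		{ 0x001d0000, 0x001f0000 },
		{ 0x2b800000, 0x2ce3a000 },
		{ 0x33860000, 0x338c0000 },
		{ 0x33920000, 0x33a30000 },
		{ 0x36f00000, 0x372e0000 },
	};

	return (subtract_ranges(hw, sizeof(hw) / sizeof(hw[0]),
	    ex, sizeof(ex) / sizeof(ex[0]), out, outmax));
}
```

With the quoted tables as input, this produces six ranges that line up
with the "ram0: reserving memory region" messages. The last one is
40000000-c0000000 (index 5, matching the resource number in the panic
message), whereas the final phys_avail chunk only runs to 0xbb3dcfff.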

> and why that is triggered by the 906bcc44641d commit. I'm wondering if
> it makes sense to arrange for ram_attach to happen before acpi, e.g.
> using BUS_PASS_ORDER_FIRST?

===
Mark Millard
marklmi at yahoo.com



