Date:      Fri, 8 May 2020 18:53:24 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        FreeBSD Current <freebsd-current@FreeBSD.org>
Cc:        Konstantin Belousov <kib@freebsd.org>
Subject:   CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ?
Message-ID:  <0d7db402-621e-cc6b-2918-2078f63e2a9b@FreeBSD.org>


I have a reproducible panic with a custom kernel built without "options NUMA"
while using the amdgpu driver from the linuxkpi-based drm:

panic: address 41ec00000 beyond the last segment

I did some quick debugging and the panic happens when the Xorg server tries to
access a frame buffer (or something like that).  There is a page fault that gets
satisfied by ttm with a fictitious page.

The stack trace is:
#11 0xffffffff808031a3 in panic (fmt=0xffffffff8119a998 <cnputs_mtx>
"5\003ʀ\377\377\377\377") at /usr/devel/git/motil/sys/kern/kern_shutdown.c:839
#12 0xffffffff80bbc552 in pmap_enter (pmap=<optimized out>, va=34504441856,
m=<optimized out>, prot=<optimized out>, flags=<optimized out>, psind=<optimized
out>) at /usr/devel/git/motil/sys/amd64/amd64/pmap.c:6035
#13 0xffffffff80b288be in vm_fault_populate (fs=<optimized out>) at
/usr/devel/git/motil/sys/vm/vm_fault.c:519
#14 vm_fault_allocate (fs=<optimized out>) at
/usr/devel/git/motil/sys/vm/vm_fault.c:1032
#15 vm_fault (map=<optimized out>, vaddr=<optimized out>, fault_type=<optimized
out>, fault_flags=<optimized out>, m_hold=<optimized out>) at
/usr/devel/git/motil/sys/vm/vm_fault.c:1342
#16 0xffffffff80b26e7e in vm_fault_trap (map=0xfffffe0017cd39e8,
vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0,
signo=0xfffffe00a810dbc4, ucode=0xfffffe00a810dbc0) at
/usr/devel/git/motil/sys/vm/vm_fault.c:589
#17 0xffffffff80bcf89c in trap_pfault (frame=0xfffffe00a810dc00,
usermode=<optimized out>, signo=<optimized out>, ucode=0xffffffff80853250
<putchar>) at /usr/devel/git/motil/sys/amd64/amd64/trap.c:821
#18 0xffffffff80bceeec in trap (frame=0xfffffe00a810dc00) at
/usr/devel/git/motil/sys/amd64/amd64/trap.c:34


The line number in pmap_enter() is incorrect, I guess because of optimizations.
The assert seems to be reached via pmap_enter -> CHANGE_PV_LIST_LOCK_TO_PHYS ->
PHYS_TO_PV_LIST_LOCK -> pa_index().

The panic is correct in that the page is fictitious and its physical address is
beyond the end of real physical memory.
It seems that the NUMA version of PHYS_TO_PV_LIST_LOCK() is aware of such pages,
but the !NUMA one is not.
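
To illustrate the suspected difference, here is a simplified user-space model of
the two lookups.  It is only a sketch: all model_* names, the segment table and
the constants are made up for illustration and are not the real pmap.c
definitions.  The "strict" variant mimics a lookup that asserts the address lies
within the last physical segment; the "tolerant" variant mimics one that sends
out-of-range (fictitious) addresses to a dummy lock instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's actual values. */
#define MODEL_PDRSHIFT   21             /* hash at 2 MB granularity */
#define MODEL_NLOCKS     64             /* size of the hashed lock array */
#define MODEL_DUMMY_LOCK MODEL_NLOCKS   /* extra slot for out-of-range pages */

struct model_seg {
	uint64_t start;                 /* first byte of the segment */
	uint64_t end;                   /* one past the last byte */
};

/* Made-up layout of "real" memory ending at 16 GB. */
static const struct model_seg model_segs[] = {
	{ 0x0000000000001000ULL, 0x000000000009f000ULL },
	{ 0x0000000000100000ULL, 0x0000000400000000ULL },
};
#define MODEL_NSEGS (sizeof(model_segs) / sizeof(model_segs[0]))

static uint64_t
model_last_pa(void)
{
	assert(MODEL_NSEGS > 0);
	return (model_segs[MODEL_NSEGS - 1].end);
}

/*
 * Strict lookup: refuses addresses beyond the last segment.  Returning
 * false here stands in for the place where the kernel would panic.
 */
static bool
model_lock_index_strict(uint64_t pa, size_t *idx)
{
	if (pa > model_last_pa())
		return (false);
	*idx = (size_t)((pa >> MODEL_PDRSHIFT) % MODEL_NLOCKS);
	return (true);
}

/*
 * Tolerant lookup: addresses beyond the last segment (e.g. fictitious
 * pages backing device memory) all share one dummy lock slot.
 */
static size_t
model_lock_index_tolerant(uint64_t pa)
{
	if (pa > model_last_pa())
		return (MODEL_DUMMY_LOCK);
	return ((size_t)((pa >> MODEL_PDRSHIFT) % MODEL_NLOCKS));
}
```

With this model, the address from the panic message (0x41ec00000) is rejected by
model_lock_index_strict() but maps to MODEL_DUMMY_LOCK in
model_lock_index_tolerant(), which is the behaviour I would expect the !NUMA
case to need as well.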

-- 
Andriy Gapon


