Date: Sun, 25 Aug 2019 17:30:34 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Rebecca Cran <rebecca@bsdio.com>, markj@freebsd.org
Cc: FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)
Message-ID: <20190825143034.GO71821@kib.kiev.ua>
In-Reply-To: <9e94aea8-7d63-0f9e-2f1e-c1492e9dc455@bsdio.com>
References: <6e5687b2-ab3f-a570-37ab-72c8a9776167@bsdio.com>
 <20190824203305.GF71821@kib.kiev.ua>
 <d7200dbc-62b3-fd86-ca61-32d559987338@bsdio.com>
 <20190824230801.GK71821@kib.kiev.ua>
 <f15ba651-28ef-d9db-3646-ab8cb49b3d18@bsdio.com>
 <20190825062407.GL71821@kib.kiev.ua>
 <9e94aea8-7d63-0f9e-2f1e-c1492e9dc455@bsdio.com>

On Sun, Aug 25, 2019 at 07:17:20AM -0600, Rebecca Cran wrote:
> On 2019-08-25 00:24, Konstantin Belousov wrote:
> > What are the panic messages ?
>
> Fatal trap 18: integer divide fault while in kernel mode
> instruction pointer = 0x20:0xffffffff80f1027c
> stack pointer = 0x28:0xffffffff845809f0
> frame pointer = 0x28:0xffffffff84580a00
> code segment = base 0x0, limit 0xffffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 0 ()
> trap number = 18
> panic: integer divide fault
> cpuid = 0
> time = 1
>
> > What is the source line ?
>
> (gdb) info line *0xffffffff80f1027c
> Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
> 0xffffffff80f10267 <vm_domainset_iter_first+151>
> and ends at 0xffffffff80f1027f <vm_domainset_iter_first+175>.

There was one more source line I asked about.

So what happens, IMO, is that for memory-less domains ds_cnt is zero
because ds_mask is zero, which causes the exception on divide.

You can try the following combined patch, but I really dislike the fact
that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).
I would prefer for kmem_malloc_domainset(DOMAINSET_FIXED(unpopulated domain))
to fail with a NULL result, and then I would manually fall back to
DOMAINSET_PREF().

OTOH, I think the chunk for mp_realloc_pcpu() is the final fix.
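
The diagnosis above can be illustrated with a small userland model. This is only a sketch, assuming that the set's count is derived from its mask and that the iterator reduces a counter modulo that count; struct toy_domainset and the toy_* functions are hypothetical names and do not reproduce the code in vm_domainset.c. With an empty mask the count is zero, so an unguarded modulo would be an integer divide by zero, which is what trap 18 reports.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a domain set: a bitmask of eligible domains
 * and the number of bits set in it. Not the kernel's struct domainset.
 */
struct toy_domainset {
	uint64_t mask;
	int cnt;
};

/* Count the bits set in v. */
static int
toy_popcount(uint64_t v)
{
	int n = 0;

	while (v != 0) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return (n);
}

/*
 * Assumption for this model: the effective set is the requested domains
 * restricted to the domains that actually have memory.
 */
static struct toy_domainset
toy_domainset_make(uint64_t requested, uint64_t populated)
{
	struct toy_domainset ds;

	ds.mask = requested & populated;
	ds.cnt = toy_popcount(ds.mask);
	return (ds);
}

/*
 * Round-robin pick: return the index of the (iter % cnt)-th set bit.
 * The cnt == 0 check is the guard; without it the modulo below
 * would divide by zero for an empty set.
 */
static int
toy_domainset_pick(const struct toy_domainset *ds, unsigned int iter)
{
	unsigned int k;
	int d;

	if (ds->cnt == 0)
		return (-1);
	k = iter % (unsigned int)ds->cnt;
	for (d = 0; d < 64; d++) {
		if ((ds->mask >> d) & 1) {
			if (k == 0)
				return (d);
			k--;
		}
	}
	return (-1);
}
```

In this model, a fixed request for a domain without memory yields an empty set, and the caller sees -1 instead of a fault.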

diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index b38c688f8b4..2c3dc8744f6 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -402,6 +402,8 @@ mp_realloc_pcpu(int cpuid, int domain)
 		return;
 	m = vm_page_alloc_domain(NULL, 0, domain,
 	    VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ);
+	if (m == NULL)
+		return;
 	na = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
 	pagecopy((void *)oa, (void *)na);
 	pmap_qenter((vm_offset_t)&__pcpu[cpuid], &m, 1);
@@ -481,10 +483,10 @@ native_start_all_aps(void)
 		    M_ZERO);
 		mce_stack = (char *)kmem_malloc(PAGE_SIZE, M_WAITOK | M_ZERO);
 		nmi_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
 		dbg_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
-		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_FIXED(domain),
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_PREF(domain),
 		    DPCPU_SIZE, M_WAITOK | M_ZERO);
 		bootSTK = (char *)bootstacks[cpu] +
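
The behaviour wished for in the message (a fixed-domain allocation that returns NULL for an unpopulated domain, followed by a manual retry with a preferred-domain policy) might look roughly like the userland sketch below. Everything here is an assumption made for illustration: toy_alloc_policy(), the TOY_FIXED and TOY_PREF policies, and the toy_has_memory table are hypothetical and are not kernel interfaces, and calloc() merely stands in for the real allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical policies: FIXED never leaves the domain, PREF may. */
enum toy_policy { TOY_FIXED, TOY_PREF };

#define	TOY_NDOMAINS	4

/* Assumed topology for the example: domains 1 and 3 have no memory. */
static const int toy_has_memory[TOY_NDOMAINS] = { 1, 0, 1, 0 };

/*
 * Allocate size bytes under the given policy. On success, store the
 * domain that served the request in *used. A FIXED request for an
 * unpopulated domain fails with NULL instead of faulting.
 */
static void *
toy_alloc_policy(enum toy_policy policy, int domain, size_t size, int *used)
{
	int d, i;

	if (domain < 0 || domain >= TOY_NDOMAINS)
		return (NULL);
	if (toy_has_memory[domain]) {
		*used = domain;
		return (calloc(1, size));
	}
	if (policy == TOY_FIXED)
		return (NULL);
	/* PREF: try the remaining domains in order after the preferred one. */
	for (i = 1; i < TOY_NDOMAINS; i++) {
		d = (domain + i) % TOY_NDOMAINS;
		if (toy_has_memory[d]) {
			*used = d;
			return (calloc(1, size));
		}
	}
	return (NULL);
}

/* Caller-side pattern: try FIXED first, fall back to PREF on NULL. */
static void *
toy_alloc_fixed_then_pref(int domain, size_t size, int *used)
{
	void *p;

	p = toy_alloc_policy(TOY_FIXED, domain, size, used);
	if (p == NULL)
		p = toy_alloc_policy(TOY_PREF, domain, size, used);
	return (p);
}
```

The patch above takes the simpler route of using DOMAINSET_PREF directly, which avoids the need for such a retry at the call sites.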