From: Konstantin Belousov
To: Rebecca Cran, markj@freebsd.org
Cc: FreeBSD Current
Date: Sun, 25 Aug 2019 17:30:34 +0300
Subject: Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)
Message-ID: <20190825143034.GO71821@kib.kiev.ua>
In-Reply-To: <9e94aea8-7d63-0f9e-2f1e-c1492e9dc455@bsdio.com>

On Sun, Aug 25, 2019 at 07:17:20AM -0600, Rebecca Cran wrote:
> On 2019-08-25 00:24, Konstantin Belousov wrote:
> > What are the panic messages ?
>
> Fatal trap 18: integer divide fault while in kernel mode
> instruction pointer = 0x20:0xffffffff80f1027c
> stack pointer = 0x28:0xffffffff845809f0
> frame pointer = 0x28:0xffffffff84580a00
> code segment = base 0x0, limit 0xffffff, type 0x1b
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 0 ()
> trap number = 18
> panic: integer divide fault
> cpuid = 0
> time = 1
>
> > What is the source line ?
>
> (gdb) info line *0xffffffff80f1027c
> Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
> 0xffffffff80f10267
>    and ends at 0xffffffff80f1027f .

There was one more source line I asked about.

So what happens, IMO, is that for memory-less domains ds_cnt is zero
because ds_mask is zero, and the division by ds_cnt raises the exception.

You can try the following combined patch, but I really dislike the fact
that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).
I would prefer kmem_malloc_domainset(DOMAINSET_FIXED(unpopulated domain))
to fail with a NULL result, so that I could manually fall back to
DOMAINSET_PREF().

OTOH, I think the chunk for mp_realloc_pcpu() is the final fix.

diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index b38c688f8b4..2c3dc8744f6 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -402,6 +402,8 @@ mp_realloc_pcpu(int cpuid, int domain)
 		return;
 	m = vm_page_alloc_domain(NULL, 0, domain,
 	    VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ);
+	if (m == NULL)
+		return;
 	na = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
 	pagecopy((void *)oa, (void *)na);
 	pmap_qenter((vm_offset_t)&__pcpu[cpuid], &m, 1);
@@ -481,10 +483,10 @@ native_start_all_aps(void)
 		    M_ZERO);
 		mce_stack = (char *)kmem_malloc(PAGE_SIZE, M_WAITOK | M_ZERO);
 		nmi_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
 		dbg_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
-		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_FIXED(domain),
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_PREF(domain),
 		    DPCPU_SIZE, M_WAITOK | M_ZERO);
 		bootSTK = (char *)bootstacks[cpu] +
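
To illustrate the diagnosis (an empty mask gives a zero count, and the
round-robin step then divides by it), here is a minimal user-space sketch.
It is not the kernel code: line 102 of vm_domainset.c is not quoted in this
thread, and every sketch_* name, the 8-domain limit and the -1 return
convention are assumptions made up for the example. The last function models
the behaviour wished for above, where a fixed request on an unpopulated
domain fails cleanly and the caller falls back to a preferred-style choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXDOMAINS 8	/* hypothetical limit for this sketch */

/* Simplified stand-in for a domain set: a bitmask plus its popcount. */
struct sketch_domainset {
	uint32_t mask;	/* bit d set => domain d may be used */
	int cnt;	/* number of bits set in mask (domains 0..7) */
};

/* Build a set from the requested domains, keeping only populated ones. */
struct sketch_domainset
sketch_make_set(uint32_t requested, uint32_t populated)
{
	struct sketch_domainset ds = { requested & populated, 0 };

	for (int d = 0; d < SKETCH_MAXDOMAINS; d++)
		if (ds.mask & (1u << d))
			ds.cnt++;
	return (ds);
}

/*
 * Round-robin choice of a domain from the set.  Returns -1 for an empty
 * set instead of evaluating iter % 0, which is an integer divide fault.
 */
int
sketch_pick(const struct sketch_domainset *ds, unsigned iter)
{
	int want;

	if (ds->cnt == 0)
		return (-1);
	want = (int)(iter % (unsigned)ds->cnt);
	for (int d = 0; d < SKETCH_MAXDOMAINS; d++) {
		if ((ds->mask & (1u << d)) == 0)
			continue;
		if (want-- == 0)
			return (d);
	}
	return (-1);	/* not reached while cnt matches mask */
}

/*
 * Try the fixed domain first; if it has no memory, fall back to any
 * populated domain, roughly what a "preferred" policy would allow.
 */
int
sketch_pick_fixed_then_pref(int domain, uint32_t populated, unsigned iter)
{
	struct sketch_domainset ds;
	int d;

	assert(domain >= 0 && domain < SKETCH_MAXDOMAINS);
	ds = sketch_make_set(1u << domain, populated);
	d = sketch_pick(&ds, iter);
	if (d != -1)
		return (d);
	ds = sketch_make_set(~0u, populated);
	return (sketch_pick(&ds, iter));
}
```

With domains 0 and 2 populated (mask 0x5), a fixed request for domain 1
yields an empty set, so sketch_pick() reports failure and the fallback picks
among the populated domains instead of trapping.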