From: Michał Stanek
Date: Sun, 17 May 2015 13:10:02 +0200
To: Stanislav Sedov
CC: freebsd-current@freebsd.org, freebsd-arm@freebsd.org
Subject: Re: UMA initialization failure with 48 core ARM64
Message-ID: <5558770A.70903@semihalf.com>
In-Reply-To: <2A6C7643-0C10-4451-B547-9D50EA6809B8@freebsd.org>

On 2015-05-16 00:42, Stanislav Sedov wrote:
>
>> On May 15, 2015, at 11:30 AM, Michał Stanek wrote:
>>
>> Hi,
>>
>> I am experiencing an early failure of UMA on an ARM64 platform with 48
>> cores enabled. I get a kernel panic during initialization of VM. Here is
>> the boot log (lines with 'MST:' are my own debug printfs).
>>
>> Copyright (c) 1992-2015 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>> The Regents of the University of California. All rights reserved.
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 11.0-CURRENT #333 52fd91e(smp_48)-dirty: Fri May 15 18:26:56 CEST
>> 2015
>> mst@arm64-prime:/usr/home/mst/freebsd_v8/obj_kernel/arm64.aarch64/usr/home/mst/freebsd_v8/kernel/sys/THUNDER-88XX
>> arm64
>> FreeBSD clang version 3.6.0 (tags/RELEASE_360/final 230434) 20150225
>> MST: in vm_mem_init()
>> MST: in vmem_init() with param *vm == kernel_arena
>> MST: in vmem_xalloc() with param *vm == kernel_arena
>> MST: in vmem_xalloc() with param *vm == kmem_arena
>> panic: mtx_lock() of spin mutex (null) @
>> /usr/home/mst/freebsd_v8/kernel/sys/kern/subr_vmem.c:1165
>> cpuid = 0
>> KDB: enter: panic
>> [ thread pid 0 tid 0 ]
>> Stopped at 0xffffff80001f4f80:
>>
>> The kernel boots fine when MAXCPU is set to 30 or lower, but the error
>> above always appears when it is set to a higher value.
>>
>> The panic is triggered by a KASSERT in __mtx_lock_flags(), which is called
>> via the macro VMEM_LOCK(vm) in vmem_xalloc(). This is line 1143 in
>> subr_vmem.c (the log shows a different line number due to added printfs).
>> It looks like the lock belongs to 'kmem_arena', which is uninitialized at
>> this point (kmeminit() has not been called yet).
>>
>> While debugging, I tried modifying VM code as a quick workaround. I
>> replaced the number of cores with 1 wherever mp_ncpus, mp_maxid or MAXCPU
>> (and others) are read. This, I believe, limits UMA per-cpu caches to just
>> one, while the rest of the OS (scheduler, etc.) sees all 48 cores.
>> In addition, I changed UMA_BOOT_PAGES in sys/vm/uma_int.h to 512 (default
>> was 64).
>> With these tweaks, I got a successful (but not really stable) boot with 48
>> cores. Of course these are dirty hacks and a proper solution is needed.
>>
>> I am a bit surprised that the kernel fails with MAXCPU==48, as the amd64
>> arch has this value set to '256' and I have read posts that other platforms
>> with even more cores have worked fine. Perhaps I need to tweak some other
>> VM parameters, apart from UMA_BOOT_PAGES (AKA vm.boot_pages), but I am not
>> sure how.
>>
>> I included a full stacktrace and a more verbose log (with UMA_DEBUG macros
>> enabled) in the attachment. There is also a diff of the hacks I used while
>> debugging.
>>
>>
>
> Hi, Michal!
>
> It looks like the log attachment didn't make it through the mailing list.
> Can you please resend it?
>
> The panic suggests that a mutex was left uninitialized...
>
> --
> ST4096-RIPE
>
>
>

Yes, you're right: kmem_arena's mutex is used before it is initialized. I do
not know why increasing MAXCPU causes this behavior.

Here is the stack trace at the point of the panic:

db_stack_trace
db_command
db_command_loop
db_trap
kdb_trap
handle_el1h_sync
vpanic
kassert_panic
__mtx_lock_flags
vmem_xalloc
vmem_bt_alloc
keg_alloc_slab
keg_fetch_slab
zone_fetch_slab
zone_import
zone_alloc_item
bt_fill
vmem_xalloc
vmem_alloc
kmem_init_zero_region
vm_mem_init
mi_startup
virtdone

Diff of the hacks in UMA:

diff --git a/sys/kern/kern_malloc.c b/sys/kern/kern_malloc.c
index aef1e4e..be225fb 100644
--- a/sys/kern/kern_malloc.c
+++ b/sys/kern/kern_malloc.c
@@ -874,7 +874,7 @@ malloc_uninit(void *data)
 	 * Look for memory leaks.
 	 */
 	temp_allocs = temp_bytes = 0;
-	for (i = 0; i < MAXCPU; i++) {
+	for (i = 0; i < 1; i++) {
 		mtsp = &mtip->mti_stats[i];
 		temp_allocs += mtsp->mts_numallocs;
 		temp_allocs -= mtsp->mts_numfrees;
diff --git a/sys/kern/subr_vmem.c b/sys/kern/subr_vmem.c
index 80940be..89d62ed 100644
--- a/sys/kern/subr_vmem.c
+++ b/sys/kern/subr_vmem.c
@@ -665,7 +665,8 @@ vmem_startup(void)
 	 * CPUs to attempt to allocate new tags concurrently to limit
 	 * false restarts in UMA.
 	 */
-	uma_zone_reserve(vmem_bt_zone, BT_MAXALLOC * (mp_ncpus + 1) / 2);
+	//mst look here
+	uma_zone_reserve(vmem_bt_zone, BT_MAXALLOC * (1 + 1) / 2);
 	uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
 #endif
 }
diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index b96c421..6382437 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -98,6 +98,14 @@ __FBSDID("$FreeBSD$");
 #include
 #endif
 
+//mst: override some defines
+#undef curcpu
+#define curcpu 0
+#undef CPU_FOREACH
+#define CPU_FOREACH(i) \
+	for ((i) = 0; (i) <= 0; (i)++) \
+		if (!CPU_ABSENT((i)))
+
 /*
  * This is the zone and keg from which all zones are spawned. The idea is that
  * even the zone & keg heads are allocated from the allocator, so we use the
@@ -1228,6 +1236,7 @@ keg_small_init(uma_keg_t keg)
 
 	if (keg->uk_flags & UMA_ZONE_PCPU) {
 		u_int ncpus = mp_ncpus ? mp_ncpus : MAXCPU;
+		ncpus = 1;
 
 		keg->uk_slabsize = sizeof(struct pcpu);
 		keg->uk_ppera = howmany(ncpus * sizeof(struct pcpu),
@@ -1822,7 +1831,7 @@ uma_startup(void *bootmem, int boot_pages)
 #endif
 	args.name = "UMA Zones";
 	args.size = sizeof(struct uma_zone) +
-	    (sizeof(struct uma_cache) * (mp_maxid + 1));
+	    (sizeof(struct uma_cache) * (0 + 1));
 	args.ctor = zone_ctor;
 	args.dtor = zone_dtor;
 	args.uminit = zero_init;
@@ -3301,7 +3310,7 @@ uma_zero_item(void *item, uma_zone_t zone)
 {
 
 	if (zone->uz_flags & UMA_ZONE_PCPU) {
-		for (int i = 0; i < mp_ncpus; i++)
+		for (int i = 0; i < 1; i++)
 			bzero(zpcpu_get_cpu(item, i), zone->uz_size);
 	} else
 		bzero(item, zone->uz_size);
@@ -3465,7 +3474,7 @@ sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS)
 	 */
 	bzero(&ush, sizeof(ush));
 	ush.ush_version = UMA_STREAM_VERSION;
-	ush.ush_maxcpus = (mp_maxid + 1);
+	ush.ush_maxcpus = (0 + 1);
 	ush.ush_count = count;
 	(void)sbuf_bcat(&sbuf, &ush, sizeof(ush));
 
@@ -3509,7 +3518,7 @@ sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS)
 			 * accept the possible race associated with bucket
 			 * exchange during monitoring.
 			 */
-			for (i = 0; i < (mp_maxid + 1); i++) {
+			for (i = 0; i < (0 + 1); i++) {
 				bzero(&ups, sizeof(ups));
 				if (kz->uk_flags & UMA_ZFLAG_INTERNAL)
 					goto skip;
diff --git a/sys/vm/uma_int.h b/sys/vm/uma_int.h
index 11ab24f..b5b5a05 100644
--- a/sys/vm/uma_int.h
+++ b/sys/vm/uma_int.h
@@ -107,7 +107,7 @@
 #define UMA_SLAB_MASK	(PAGE_SIZE - 1)	/* Mask to get back to the page */
 #define UMA_SLAB_SHIFT	PAGE_SHIFT	/* Number of bits PAGE_MASK */
 
-#define UMA_BOOT_PAGES		64	/* Pages allocated for startup */
+#define UMA_BOOT_PAGES		512	/* Pages allocated for startup */
 
 /* Max waste percentage before going to off page slab management */
 #define UMA_MAX_WASTE		10

And lastly, the more verbose log:

Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #336 52fd91e(smp_48)-dirty: Fri May 15 18:57:05 CEST 2015
mst@arm64-prime:/usr/home/mst/freebsd_v8/obj_kernel/arm64.aarch64/usr/home/mst/freebsd_v8/kernel/sys/THUNDER-88XX arm64
FreeBSD clang version 3.6.0 (tags/RELEASE_360/final 230434) 20150225
MST: in vm_mem_init()
Creating uma keg headers zone and keg.
UMA: UMA Kegs(0xffffff8000d1b140) size 256(256) flags 0x20000000 ipers 15 ppera 1 out 0 free 0
Filling boot free list.
Creating uma zone headers zone and keg.
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
alloc_slab: Allocating a new slab for UMA Kegs
UMA: UMA Zones(0xffffff8000d1b000) size 1856(1856) flags 0x20000000 ipers 2 ppera 1 out 0 free 0
Creating slab and hash zones.
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: UMA Slabs(0xffffffc0789fe000) size 112(112) flags 0x20000000 ipers 35 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: UMA RCntSlabs(0xffffffc0789fe740) size 120(120) flags 0x20000000 ipers 33 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: UMA Hash(0xffffffc0789fd000) size 256(256) flags 0x20000000 ipers 15 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 4 Bucket(0xffffffc0789fd740) size 32(32) flags 0x10000040 ipers 124 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 6 Bucket(0xffffffc0789fc000) size 48(48) flags 0x10000040 ipers 83 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 8 Bucket(0xffffffc0789fc740) size 64(64) flags 0x10000040 ipers 62 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 12 Bucket(0xffffffc0789fb000) size 96(96) flags 0x10000040 ipers 41 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 16 Bucket(0xffffffc0789fb740) size 128(128) flags 0x10000040 ipers 31 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 32 Bucket(0xffffffc0789fa000) size 256(256) flags 0x10000040 ipers 15 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: 64 Bucket(0xffffffc0789fa740) size 512(512) flags 0x10000040 ipers 7 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA decided we need offpage slab headers for keg: 128 Bucket, calculated wastedspace = 912, maximum wasted space allowed = 409, calculated ipers = 4, new wasted space = 0
INTERNAL: Allocating one item from UMA Hash(0xffffffc0789fd000)
alloc_slab: Allocating a new slab for UMA Hash
UMA: 128 Bucket(0xffffffc0789f9000) size 1024(1024) flags 0x10000148 ipers 4 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA decided we need offpage slab headers for keg: 256 Bucket, calculated wastedspace = 1936, maximum wasted space allowed = 409, calculated ipers = 2, new wasted space = 0
INTERNAL: Allocating one item from UMA Hash(0xffffffc0789fd000)
UMA: 256 Bucket(0xffffffc0789f9740) size 2048(2048) flags 0x10000148 ipers 2 ppera 1 out 0 free 0
UMA startup complete.
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: vmem btag(0xffffffc0789f7000) size 56(56) flags 0x80000080 ipers 71 ppera 1 out 0 free 0
alloc_slab: Allocating a new slab for vmem btag
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: VM OBJECT(0xffffffc0789f7740) size 256(256) flags 0x20 ipers 15 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
alloc_slab: Allocating a new slab for UMA Kegs
UMA: RADIX NODE(0xffffffc0789f5000) size 144(144) flags 0x80000080 ipers 27 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: MAP(0xffffffc0789f5740) size 240(240) flags 0x20 ipers 16 ppera 1 out 0 free 0
alloc_slab: Allocating a new slab for MAP
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: KMAP ENTRY(0xffffffc0789f2000) size 128(128) flags 0x800000c0 ipers 31 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: MAP ENTRY(0xffffffc0789f2740) size 128(128) flags 0 ipers 31 ppera 1 out 0 free 0
INTERNAL: Allocating one item from UMA Zones(0xffffff8000d1b000)
alloc_slab: Allocating a new slab for UMA Zones
INTERNAL: Allocating one item from UMA Kegs(0xffffff8000d1b140)
UMA: VMSPACE(0xffffffc0789f1000) size 384(384) flags 0x20 ipers 10 ppera 1 out 0 free 0
Allocating one item from MAP(0xffffffc0789f5740)
INTERNAL: Allocating one item from MAP(0xffffffc0789f5740)
Allocating one item from KMAP ENTRY(0xffffffc0789f2000)
INTERNAL: Allocating one item from KMAP ENTRY(0xffffffc0789f2000)
alloc_slab: Allocating a new slab for KMAP ENTRY
MST: in vmem_init() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kernel_arena
Allocating one item from vmem btag(0xffffffc0789f7000)
INTERNAL: Allocating one item from vmem btag(0xffffffc0789f7000)
Allocating one item from vmem btag(0xffffffc0789f7000)
INTERNAL: Allocating one item from vmem btag(0xffffffc0789f7000)
alloc_slab: Allocating a new slab for vmem btag
MST: in vmem_xalloc() with param *vm == kmem_arena
panic: mtx_lock() of spin mutex (null) @ /usr/home/mst/freebsd_v8/kernel/sys/kern/subr_vmem.c:1165
cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at 0xffffff80001f4f80:
db>

Best regards,
Michal Stanek