From: Andriy Gapon
Date: Mon, 18 Oct 2010 17:18:01 +0300
To: Giovanni Trematerra
Cc: alc@freebsd.org, freebsd-current@freebsd.org
Subject: Re: panic in uma_startup for many-core amd64 system
Message-ID: <4CBC5719.1020807@freebsd.org>
References: <4C9B9B9C.6000807@freebsd.org> <4CBBEBDF.3060905@freebsd.org>

on 18/10/2010 16:40 Giovanni Trematerra said the following:
> On Mon, Oct 18, 2010 at 8:40 AM, Andriy Gapon wrote:
>> on 23/09/2010 21:25 Andriy Gapon said the following:
>>>
>>> Jeff,
>>>
>>> just for kicks I tried to emulate a machine with 64 logical CPUs using
>>> the qemu-devel port:
>>> qemu-system-x86_64 -smp sockets=4,cores=8,threads=2 ...
>>>
>>> It seems that FreeBSD agreed to recognize only the first 32 CPUs, but it
>>> panicked anyway.
>>>
>>> Here's a backtrace:
>>> #34 0xffffffff804fe7f5 in zone_alloc_item (zone=0xffffffff80be1554,
>>> udata=0xffffffff80be1550, flags=1924) at /usr/src/sys/vm/uma_core.c:2506
>>> #35 0xffffffff804ff35d in hash_alloc (hash=0xffffff001ffdb030) at
>>> /usr/src/sys/vm/uma_core.c:483
>>> #36 0xffffffff804ff642 in keg_ctor (mem=Variable "mem" is not available.
>>> ) at /usr/src/sys/vm/uma_core.c:1396
>>> #37 0xffffffff804fe91b in zone_alloc_item (zone=0xffffffff80a1f300,
>>> udata=0xffffffff80be1b60, flags=2) at /usr/src/sys/vm/uma_core.c:2544
>>> #38 0xffffffff804ff92e in zone_ctor (mem=Variable "mem" is not available.
>>> ) at /usr/src/sys/vm/uma_core.c:1832
>>> #39 0xffffffff804ffca4 in uma_startup (bootmem=0xffffff001ffac000, boot_pages=48)
>>> at /usr/src/sys/vm/uma_core.c:1741
>>> #40 0xffffffff80514822 in vm_page_startup (vaddr=18446744071576817664) at
>>> /usr/src/sys/vm/vm_page.c:360
>>> #41 0xffffffff805060c5 in vm_mem_init (dummy=Variable "dummy" is not available.
>>> ) at /usr/src/sys/vm/vm_init.c:118
>>> #42 0xffffffff803258b9 in mi_startup () at /usr/src/sys/kern/init_main.c:253
>>> #43 0xffffffff8017177c in btext () at /usr/src/sys/amd64/amd64/locore.S:81
>>> [[[
>>> Note:
>>> 1. Frame numbers are high because the backtrace is obtained via gdb remotely
>>> connected to qemu and also there is a bunch of extra frames from DDB, etc.
>>> 2. Line numbers in uma_core.c won't match those in the FreeBSD tree, because
>>> I've been doing some unrelated hacking in the file.
>>> ]]]
>>>
>>> The problem seems to be with the creation of the "UMA Zones" zone and keg.
>>> Because of the large number of processors, the size argument in the following
>>> snippet is set to a value of 4480:
>>>
>>>     args.name = "UMA Zones";
>>>     args.size = sizeof(struct uma_zone) +
>>>         (sizeof(struct uma_cache) * (mp_maxid + 1));
>>>
>>> Because of this, keg_ctor() calls keg_large_init():
>>>
>>>     else if ((keg->uk_size+UMA_FRITM_SZ) >
>>>         (UMA_SLAB_SIZE - sizeof(struct uma_slab)))
>>>             keg_large_init(keg);
>>>     else
>>>             keg_small_init(keg);
>>>
>>> keg_large_init() sets the UMA_ZONE_OFFPAGE and UMA_ZONE_HASH flags for this keg.
>>> This leads to hash_alloc() being invoked from keg_ctor():
>>>
>>>     if (keg->uk_flags & UMA_ZONE_HASH)
>>>             hash_alloc(&keg->uk_hash);
>>>
>>> But the problem is that the "UMA Hash" zone is not created yet and thus the
>>> call leads to the panic. The "UMA Hash" zone is the last of the system zones
>>> created.
>>>
>>> Not sure what the proper fix here could/should be.
>>> Would it work to simply not set the UMA_ZONE_HASH flag when UMA_ZFLAG_INTERNAL
>>> is set?
>>>
>>>
>>> And some final calculations.
>>> On the test system sizeof(struct uma_cache) is 128 bytes and (mp_maxid + 1) is 32,
>>> so the per-CPU caches alone already take 128 * 32 = 4096 bytes, which equals
>>> UMA_SLAB_SIZE = PAGE_SIZE = 4096.
>>>
>>
>> Here is a simple solution that seems to work:
>> http://people.freebsd.org/~avg/uma-many-cpus.diff
>> Not sure if it's the best we can do.
>>
>
> I don't know if it makes sense; I only want to raise a flag.
> Is it safe to call kmem_malloc() before bucket_init() during
> uma_startup() to reserve room for CPU caches?

Hmm, not sure what exactly you mean.

> Reading the top uma_int.h comment, it seems that the best way to
> handle this issue
> would be to implement and allow for dynamic slab sizes.

Again, not sure if I follow you; I don't see the relation between per-cpu caches
and dynamic slab size.

> I'm also afraid that memory footprint will be larger than now.

Of course, but only by sizeof(pointer) per zone.

-- 
Andriy Gapon
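
The size arithmetic quoted in this message can be checked with a small
standalone C sketch. The constants 128, 32, 4096 and 4480 are the values
reported above for the test system. The 384-byte size of struct uma_zone is
only implied by them (4480 - 4096), and the helper names zone_item_size(),
needs_large_init() and check_reported_numbers() are hypothetical; none of them
exist in uma_core.c. UMA_FRITM_SZ and sizeof(struct uma_slab) are not given in
the message, so the sketch passes 0 for both, which is the most favorable case
for keg_small_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values reported in the message for the qemu test system. */
#define REPORTED_CACHE_SIZE 128u  /* sizeof(struct uma_cache) */
#define REPORTED_NCPUS      32u   /* mp_maxid + 1 */
#define REPORTED_SLAB_SIZE  4096u /* UMA_SLAB_SIZE = PAGE_SIZE */
#define REPORTED_ITEM_SIZE  4480u /* args.size for "UMA Zones" */

/*
 * Mirrors the quoted expression:
 *   sizeof(struct uma_zone) + sizeof(struct uma_cache) * (mp_maxid + 1)
 */
static size_t
zone_item_size(size_t zone_hdr, size_t cache_size, size_t ncpus)
{
	return zone_hdr + cache_size * ncpus;
}

/*
 * Mirrors the quoted keg_ctor() test:
 *   (uk_size + UMA_FRITM_SZ) > (UMA_SLAB_SIZE - sizeof(struct uma_slab))
 * fritm_sz and slab_hdr stand in for values the message does not state.
 */
static bool
needs_large_init(size_t item_size, size_t fritm_sz, size_t slab_size,
    size_t slab_hdr)
{
	return item_size + fritm_sz > slab_size - slab_hdr;
}

/* Re-derives the numbers from the message and asserts they are consistent. */
void
check_reported_numbers(void)
{
	size_t caches = (size_t)REPORTED_CACHE_SIZE * REPORTED_NCPUS;
	/* Implied size of struct uma_zone without the per-CPU caches. */
	size_t zone_hdr = REPORTED_ITEM_SIZE - caches;

	/* The per-CPU caches alone fill one slab. */
	assert(caches == REPORTED_SLAB_SIZE);
	assert(zone_item_size(zone_hdr, REPORTED_CACHE_SIZE,
	    REPORTED_NCPUS) == REPORTED_ITEM_SIZE);
	/* Even with zero overheads the item does not fit in one slab. */
	assert(needs_large_init(REPORTED_ITEM_SIZE, 0, REPORTED_SLAB_SIZE, 0));
}
```

Calling check_reported_numbers() from a test program passes silently if the
figures agree. Even with both overheads taken as zero, the 4480-byte item does
not fit in a 4096-byte slab, which is consistent with keg_ctor() taking the
keg_large_init() path seen in the backtrace.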