From owner-freebsd-current@FreeBSD.ORG Fri May 15 18:30:44 2015
Date: Fri, 15 May 2015 20:30:35 +0200
Subject: UMA initialization failure with 48 core ARM64
From: Michał Stanek
To: freebsd-current@freebsd.org, freebsd-arm@freebsd.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=001a11345984485e700516230aeb
List-Id: Discussions about the use of FreeBSD-current

--001a11345984485e700516230aeb
Content-Type: text/plain; charset=UTF-8

Hi,

I am experiencing an early failure of UMA on an ARM64 platform with 48 cores
enabled. The kernel panics during VM initialization. Here is the boot log
(lines with 'MST:' are my own debug printfs).

Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #333 52fd91e(smp_48)-dirty: Fri May 15 18:26:56 CEST 2015
    mst@arm64-prime:/usr/home/mst/freebsd_v8/obj_kernel/arm64.aarch64/usr/home/mst/freebsd_v8/kernel/sys/THUNDER-88XX arm64
FreeBSD clang version 3.6.0 (tags/RELEASE_360/final 230434) 20150225
MST: in vm_mem_init()
MST: in vmem_init() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kmem_arena
panic: mtx_lock() of spin mutex (null) @ /usr/home/mst/freebsd_v8/kernel/sys/kern/subr_vmem.c:1165
cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at 0xffffff80001f4f80:

The kernel boots fine when MAXCPU is set to 30 or lower, but the error above
always appears when it is set to a higher value.

The panic is triggered by a KASSERT in __mtx_lock_flags(), which is called
through the macro VMEM_LOCK(vm) in vmem_xalloc(). This is line 1143 in
subr_vmem.c (the log shows a different line number because of the added
printfs). It looks like the lock belongs to 'kmem_arena', which is
uninitialized at this point (kmeminit() has not been called yet).

While debugging, I tried modifying the VM code as a quick workaround.
I replaced the number of cores with 1 wherever mp_ncpus, mp_maxid or MAXCPU
(and others) are read. This, I believe, limits the UMA per-CPU caches to just
one, while the rest of the OS (scheduler, etc.) sees all 48 cores. In
addition, I changed UMA_BOOT_PAGES in sys/vm/uma_int.h to 512 (the default
was 64). With these tweaks, I got a successful (but not really stable) boot
with 48 cores. Of course these are dirty hacks and a proper solution is
needed.

I am a bit surprised that the kernel fails with MAXCPU==48, as the amd64
arch has this value set to 256 and I have read posts reporting that other
platforms with even more cores work fine. Perhaps I need to tweak some other
VM parameters, apart from UMA_BOOT_PAGES (AKA vm.boot_pages), but I am not
sure how.

I included a full stacktrace and a more verbose log (with UMA_DEBUG macros
enabled) in the attachment. There is also a diff of the hacks I used while
debugging.

Best regards,
Michal Stanek

--001a11345984485e700516230aeb
Content-Type: text/plain; charset=US-ASCII; name="smp_uma_WA.diff"
Content-Disposition: attachment; filename="smp_uma_WA.diff"

diff --git a/sys/kern/kern_malloc.c b/sys/kern/kern_malloc.c
index aef1e4e..be225fb 100644
--- a/sys/kern/kern_malloc.c
+++ b/sys/kern/kern_malloc.c
@@ -874,7 +874,7 @@ malloc_uninit(void *data)
 	 * Look for memory leaks.
 	 */
 	temp_allocs = temp_bytes = 0;
-	for (i = 0; i < MAXCPU; i++) {
+	for (i = 0; i < 1; i++) {
 		mtsp = &mtip->mti_stats[i];
 		temp_allocs += mtsp->mts_numallocs;
 		temp_allocs -= mtsp->mts_numfrees;
diff --git a/sys/kern/subr_vmem.c b/sys/kern/subr_vmem.c
index 80940be..89d62ed 100644
--- a/sys/kern/subr_vmem.c
+++ b/sys/kern/subr_vmem.c
@@ -665,7 +665,8 @@ vmem_startup(void)
 	 * CPUs to attempt to allocate new tags concurrently to limit
 	 * false restarts in UMA.
 	 */
-	uma_zone_reserve(vmem_bt_zone, BT_MAXALLOC * (mp_ncpus + 1) / 2);
+	//mst look here
+	uma_zone_reserve(vmem_bt_zone, BT_MAXALLOC * (1 + 1) / 2);
 	uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
 #endif
 }
diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index b96c421..6382437 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -98,6 +98,14 @@ __FBSDID("$FreeBSD$");
 #include <vm/memguard.h>
 #endif
 
+//mst: override some defines
+#undef curcpu
+#define	curcpu	0
+#undef	CPU_FOREACH
+#define	CPU_FOREACH(i)							\
+	for ((i) = 0; (i) <= 0; (i)++)				\
+		if (!CPU_ABSENT((i)))
+
 /*
  * This is the zone and keg from which all zones are spawned.  The idea is that
  * even the zone & keg heads are allocated from the allocator, so we use the
@@ -1228,6 +1236,7 @@ keg_small_init(uma_keg_t keg)
 
 	if (keg->uk_flags & UMA_ZONE_PCPU) {
 		u_int ncpus = mp_ncpus ? mp_ncpus : MAXCPU;
+		ncpus = 1;
 
 		keg->uk_slabsize = sizeof(struct pcpu);
 		keg->uk_ppera = howmany(ncpus * sizeof(struct pcpu),
@@ -1822,7 +1831,7 @@ uma_startup(void *bootmem, int boot_pages)
 #endif
 	args.name = "UMA Zones";
 	args.size = sizeof(struct uma_zone) +
-	    (sizeof(struct uma_cache) * (mp_maxid + 1));
+	    (sizeof(struct uma_cache) * (0 + 1));
 	args.ctor = zone_ctor;
 	args.dtor = zone_dtor;
 	args.uminit = zero_init;
@@ -3301,7 +3310,7 @@ uma_zero_item(void *item, uma_zone_t zone)
 {
 
 	if (zone->uz_flags & UMA_ZONE_PCPU) {
-		for (int i = 0; i < mp_ncpus; i++)
+		for (int i = 0; i < 1; i++)
 			bzero(zpcpu_get_cpu(item, i), zone->uz_size);
 	} else
 		bzero(item, zone->uz_size);
@@ -3465,7 +3474,7 @@ sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS)
 	 */
 	bzero(&ush, sizeof(ush));
 	ush.ush_version = UMA_STREAM_VERSION;
-	ush.ush_maxcpus = (mp_maxid + 1);
+	ush.ush_maxcpus = (0 + 1);
 	ush.ush_count = count;
 	(void)sbuf_bcat(&sbuf, &ush, sizeof(ush));
 
@@ -3509,7 +3518,7 @@ sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS)
 			 * accept the possible race associated with bucket
 			 * exchange during monitoring.
 			 */
-			for (i = 0; i < (mp_maxid + 1); i++) {
+			for (i = 0; i < (0 + 1); i++) {
 				bzero(&ups, sizeof(ups));
 				if (kz->uk_flags & UMA_ZFLAG_INTERNAL)
 					goto skip;
diff --git a/sys/vm/uma_int.h b/sys/vm/uma_int.h
index 11ab24f..b5b5a05 100644
--- a/sys/vm/uma_int.h
+++ b/sys/vm/uma_int.h
@@ -107,7 +107,7 @@
 #define UMA_SLAB_MASK	(PAGE_SIZE - 1)	/* Mask to get back to the page */
 #define UMA_SLAB_SHIFT	PAGE_SHIFT	/* Number of bits PAGE_MASK */
 
-#define UMA_BOOT_PAGES		64	/* Pages allocated for startup */
+#define UMA_BOOT_PAGES		512	/* Pages allocated for startup */
 
 /* Max waste percentage before going to off page slab management */
 #define UMA_MAX_WASTE	10

--001a11345984485e700516230aeb--
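
The workaround in the message forces UMA to size its per-CPU caches for a
single CPU. A rough userland sketch of why that changes early memory use is
given here, assuming each zone header is a fixed part plus one cache entry
per possible CPU id (0..mp_maxid). The byte sizes and the names
ZONE_BASE_SIZE, CACHE_SIZE, PAGE_BYTES, zone_item_size() and
items_per_page() are hypothetical placeholders, not kernel values or kernel
functions, and the sketch does not establish the cause of the panic.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sizes, for illustration only. The real values of
 * sizeof(struct uma_zone) and sizeof(struct uma_cache) depend on the
 * kernel build and are not given in the message.
 */
#define ZONE_BASE_SIZE	256u
#define CACHE_SIZE	32u
#define PAGE_BYTES	4096u

/* Assumed layout: fixed zone part plus one cache entry per CPU id 0..mp_maxid. */
size_t
zone_item_size(unsigned int mp_maxid)
{
	return (ZONE_BASE_SIZE + (size_t)CACHE_SIZE * (mp_maxid + 1));
}

/* Whole items that fit in one page; 0 means one item needs more than a page. */
size_t
items_per_page(size_t item_size, size_t page_bytes)
{
	assert(item_size > 0 && page_bytes > 0);
	return (page_bytes / item_size);
}
```

With these placeholder numbers, zone_item_size(0) is 288 bytes and 14 such
items fit in a 4096-byte page, while zone_item_size(47) is 1792 bytes and
only 2 fit. Under this assumption, the same number of zones created during
boot would need more startup pages as the CPU count grows.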