Date:      Fri, 15 May 2015 20:30:35 +0200
From:      Michał Stanek <mst@semihalf.com>
To:        freebsd-current@freebsd.org, freebsd-arm@freebsd.org
Subject:   UMA initialization failure with 48 core ARM64
Message-ID:  <CAMiGqYiOwN8onw1%2BreP9y9fiQ5pvpaLwpFmqJoX5HTBLQtCfRQ@mail.gmail.com>

--001a11345984485e700516230aeb
Content-Type: text/plain; charset=UTF-8

Hi,

I am experiencing an early failure of UMA on an ARM64 platform with 48
cores enabled. I get a kernel panic during initialization of VM. Here is
the boot log (lines with 'MST:' are my own debug printfs).

Copyright (c) 1992-2015 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-CURRENT #333 52fd91e(smp_48)-dirty: Fri May 15 18:26:56 CEST 2015
    mst@arm64-prime:/usr/home/mst/freebsd_v8/obj_kernel/arm64.aarch64/usr/home/mst/freebsd_v8/kernel/sys/THUNDER-88XX arm64
FreeBSD clang version 3.6.0 (tags/RELEASE_360/final 230434) 20150225
MST: in vm_mem_init()
MST: in vmem_init() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kernel_arena
MST: in vmem_xalloc() with param *vm == kmem_arena
panic: mtx_lock() of spin mutex (null) @ /usr/home/mst/freebsd_v8/kernel/sys/kern/subr_vmem.c:1165
cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at      0xffffff80001f4f80:

The kernel boots fine when MAXCPU is set to 30 or lower, but the error
above always appears when it is set to a higher value.

The panic is triggered by a KASSERT in __mtx_lock_flags(), which is reached
through the VMEM_LOCK(vm) macro in vmem_xalloc(). In the unmodified
subr_vmem.c this is line 1143 (the log shows a different line number because
of my added printfs).
It looks like the lock belongs to 'kmem_arena', which is still uninitialized
at this point (kmeminit() has not been called yet).
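
The wording "spin mutex (null)" fits an all-zero lock: the name pointer is
NULL and the class bits select entry 0 of the lock class table. Below is a
minimal userspace sketch of that idea. The struct, the mask, the class codes
and the table order (spin mutex first) are simplified assumptions for
illustration and are not the kernel's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a kernel lock object. */
struct model_lock {
	const char	*name;	/* stays NULL until the lock is initialized */
	unsigned int	 flags;	/* low bit selects the lock class */
};

#define	MODEL_CLASS_MASK	0x1u
#define	MODEL_CLASS_SPIN	0u	/* assumption: class 0 is "spin mutex" */
#define	MODEL_CLASS_SLEEP	1u

static const char *model_classes[] = { "spin mutex", "sleep mutex" };

/* Report the class a lock appears to have, judging only by its flag bits. */
static const char *
model_class_name(const struct model_lock *l)
{
	return (model_classes[l->flags & MODEL_CLASS_MASK]);
}

/* Name printed in a diagnostic; an uninitialized lock has none. */
static const char *
model_lock_name(const struct model_lock *l)
{
	return (l->name != NULL ? l->name : "(null)");
}

/* What proper initialization would do: set a name and a sleep class. */
static void
model_lock_init(struct model_lock *l, const char *name)
{
	l->name = name;
	l->flags = MODEL_CLASS_SLEEP;
}
```

With a zero-filled struct model_lock, model_class_name() returns "spin mutex"
and model_lock_name() returns "(null)". After model_lock_init() the same lock
reports "sleep mutex" and its real name.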

While debugging, I tried modifying the VM code as a quick workaround. I
replaced the CPU count with 1 wherever mp_ncpus, mp_maxid or MAXCPU (and
similar values) are read. This, I believe, limits the UMA per-CPU caches to
just one, while the rest of the OS (scheduler, etc.) sees all 48 cores.
In addition, I changed UMA_BOOT_PAGES in sys/vm/uma_int.h to 512 (default
was 64).
With these tweaks, I got a successful (but not really stable) boot with 48
cores. Of course these are dirty hacks and a proper solution is needed.
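
For reference, the uma_startup() hunk in the attached diff sizes each zone as
sizeof(struct uma_zone) + sizeof(struct uma_cache) * (mp_maxid + 1), so the
per-zone footprint grows linearly with the CPU count. Here is a rough sketch
of that arithmetic. The byte sizes and the page size are made-up placeholders
and not the real structure sizes on arm64. This does not by itself explain
the panic; it only shows which quantity the workaround shrinks.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only to make the arithmetic concrete. */
#define	MODEL_ZONE_HDR		256u	/* stand-in for sizeof(struct uma_zone) */
#define	MODEL_CACHE_SIZE	64u	/* stand-in for sizeof(struct uma_cache) */
#define	MODEL_PAGE_SIZE		4096u	/* assumed page size */

/* Zone header plus one per-CPU cache for each CPU id in 0..maxid. */
static size_t
model_zone_size(unsigned int maxid)
{
	return (MODEL_ZONE_HDR + (size_t)MODEL_CACHE_SIZE * (maxid + 1));
}

/* Pages needed to hold one zone of that size, rounded up. */
static size_t
model_zone_pages(unsigned int maxid)
{
	return ((model_zone_size(maxid) + MODEL_PAGE_SIZE - 1) /
	    MODEL_PAGE_SIZE);
}
```

With these placeholder numbers, a single-CPU zone is 320 bytes, a 48-CPU zone
(maxid 47) is 3328 bytes, and a 256-CPU zone (maxid 255) is 16640 bytes.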

I am a bit surprised that the kernel fails with MAXCPU==48 as the amd64
arch has this value set to '256' and I have read posts that other platforms
with even more cores have worked fine. Perhaps I need to tweak some other
VM parameters, apart from UMA_BOOT_PAGES (AKA vm.boot_pages), but I am not
sure how.

I included a full stacktrace and a more verbose log (with UMA_DEBUG macros
enabled) in the attachment. There is also a diff of the hacks I used while
debugging.

Best regards,
Michal Stanek

--001a11345984485e700516230aeb
Content-Type: text/plain; charset=US-ASCII; name="smp_uma_WA.diff"
Content-Disposition: attachment; filename="smp_uma_WA.diff"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_i9pxcjzk2

diff --git a/sys/kern/kern_malloc.c b/sys/kern/kern_malloc.c
index aef1e4e..be225fb 100644
--- a/sys/kern/kern_malloc.c
+++ b/sys/kern/kern_malloc.c
@@ -874,7 +874,7 @@ malloc_uninit(void *data)
 	 * Look for memory leaks.
 	 */
 	temp_allocs = temp_bytes = 0;
-	for (i = 0; i < MAXCPU; i++) {
+	for (i = 0; i < 1; i++) {
 		mtsp = &mtip->mti_stats[i];
 		temp_allocs += mtsp->mts_numallocs;
 		temp_allocs -= mtsp->mts_numfrees;
diff --git a/sys/kern/subr_vmem.c b/sys/kern/subr_vmem.c
index 80940be..89d62ed 100644
--- a/sys/kern/subr_vmem.c
+++ b/sys/kern/subr_vmem.c
@@ -665,7 +665,8 @@ vmem_startup(void)
 	 * CPUs to attempt to allocate new tags concurrently to limit
 	 * false restarts in UMA.
 	 */
-	uma_zone_reserve(vmem_bt_zone, BT_MAXALLOC * (mp_ncpus + 1) / 2);
+	//mst look here
+	uma_zone_reserve(vmem_bt_zone, BT_MAXALLOC * (1 + 1) / 2);
 	uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
 #endif
 }
diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index b96c421..6382437 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -98,6 +98,14 @@ __FBSDID("$FreeBSD$");
 #include <vm/memguard.h>
 #endif
 
+//mst: override some defines
+#undef curcpu
+#define	curcpu	0
+#undef	CPU_FOREACH
+#define	CPU_FOREACH(i)							\
+	for ((i) = 0; (i) <= 0; (i)++)				\
+		if (!CPU_ABSENT((i)))
+
 /*
  * This is the zone and keg from which all zones are spawned.  The idea is that
  * even the zone & keg heads are allocated from the allocator, so we use the
@@ -1228,6 +1236,7 @@ keg_small_init(uma_keg_t keg)
 
 	if (keg->uk_flags & UMA_ZONE_PCPU) {
 		u_int ncpus = mp_ncpus ? mp_ncpus : MAXCPU;
+		ncpus = 1;
 
 		keg->uk_slabsize = sizeof(struct pcpu);
 		keg->uk_ppera = howmany(ncpus * sizeof(struct pcpu),
@@ -1822,7 +1831,7 @@ uma_startup(void *bootmem, int boot_pages)
 #endif
 	args.name = "UMA Zones";
 	args.size = sizeof(struct uma_zone) +
-	    (sizeof(struct uma_cache) * (mp_maxid + 1));
+	    (sizeof(struct uma_cache) * (0 + 1));
 	args.ctor = zone_ctor;
 	args.dtor = zone_dtor;
 	args.uminit = zero_init;
@@ -3301,7 +3310,7 @@ uma_zero_item(void *item, uma_zone_t zone)
 {
 
 	if (zone->uz_flags & UMA_ZONE_PCPU) {
-		for (int i = 0; i < mp_ncpus; i++)
+		for (int i = 0; i < 1; i++)
 			bzero(zpcpu_get_cpu(item, i), zone->uz_size);
 	} else
 		bzero(item, zone->uz_size);
@@ -3465,7 +3474,7 @@ sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS)
 	 */
 	bzero(&ush, sizeof(ush));
 	ush.ush_version = UMA_STREAM_VERSION;
-	ush.ush_maxcpus = (mp_maxid + 1);
+	ush.ush_maxcpus = (0 + 1);
 	ush.ush_count = count;
 	(void)sbuf_bcat(&sbuf, &ush, sizeof(ush));
 
@@ -3509,7 +3518,7 @@ sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS)
 			 * accept the possible race associated with bucket
 			 * exchange during monitoring.
 			 */
-			for (i = 0; i < (mp_maxid + 1); i++) {
+			for (i = 0; i < (0 + 1); i++) {
 				bzero(&ups, sizeof(ups));
 				if (kz->uk_flags & UMA_ZFLAG_INTERNAL)
 					goto skip;
diff --git a/sys/vm/uma_int.h b/sys/vm/uma_int.h
index 11ab24f..b5b5a05 100644
--- a/sys/vm/uma_int.h
+++ b/sys/vm/uma_int.h
@@ -107,7 +107,7 @@
 #define UMA_SLAB_MASK	(PAGE_SIZE - 1)	/* Mask to get back to the page */
 #define UMA_SLAB_SHIFT	PAGE_SHIFT	/* Number of bits PAGE_MASK */
 
-#define UMA_BOOT_PAGES		64	/* Pages allocated for startup */
+#define UMA_BOOT_PAGES		512	/* Pages allocated for startup */
 
 /* Max waste percentage before going to off page slab management */
 #define UMA_MAX_WASTE	10
--001a11345984485e700516230aeb--


