From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...
Date: Fri, 28 Jul 2017 11:19:55 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.0-STABLE
X-Bugzilla-Keywords: patch
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: nbe@renzel.net
X-Bugzilla-Status: New

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #182 from Nils Beyer ---
In order to track down these compilation errors, I did what AMD support
requested: cleared CMOS by removing all cables and the battery, and set VCORE
statically to 1.36250V.

Then I started a new, fresh poudriere run.

And guess what, after 1733 built ports (1 failed - "ghc"), my system panicked:

------------------------------------------------------------------------------
root@asbach:/var/crash/#kgdb -c vmcore.0 /usr/lib/debug/boot/kernel/kernel.debug
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
spin lock 0xffffffff81dc8b50 (smp rendezvous) held by 0xfffff801325ea560 (tid 102081) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 6
KDB: stack backtrace:
#0 0xffffffff80aada97 at kdb_backtrace+0x67
#1 0xffffffff80a6bb76 at vpanic+0x186
#2 0xffffffff80a6b9e3 at panic+0x43
#3 0xffffffff80a4cf71 at _mtx_lock_spin_cookie+0x311
#4 0xffffffff81042dc1 at smp_targeted_tlb_shootdown+0x101
#5 0xffffffff81042cac at smp_masked_invltlb+0x4c
#6 0xffffffff80eced91 at pmap_invalidate_all+0x211
#7 0xffffffff80ed936a at pmap_advise+0x49a
#8 0xffffffff80d60c26 at vm_map_madvise+0x2c6
#9 0xffffffff80d6534e at sys_madvise+0x7e
#10 0xffffffff80ee0394 at amd64_syscall+0x6c4
#11 0xffffffff80ec392b at Xfast_syscall+0xfb
Uptime: 4h4m31s
Dumping 5426 out of 32665 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /usr/lib/debug/boot/kernel/zfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/zfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/opensolaris.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/opensolaris.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linprocfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linprocfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linux_common.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linux_common.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/tmpfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/tmpfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/vmm.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/vmm.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/ums.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/ums.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/pflog.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/pflog.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/pf.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/pf.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linux.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linux.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/linux64.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/linux64.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/nullfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/nullfs.ko.debug
Reading symbols from /usr/lib/debug/boot/kernel/fdescfs.ko.debug...done.
Loaded symbols for /usr/lib/debug/boot/kernel/fdescfs.ko.debug
#0  doadump (textdump=) at pcpu.h:222
222     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump (textdump=) at pcpu.h:222
#1  0xffffffff80a6b6f1 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80a6bbb0 in vpanic (fmt=, ap=)
    at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80a6b9e3 in panic (fmt=)
    at /usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80a4cf71 in _mtx_lock_spin_cookie (c=, v=,
    tid=18446735289348100096, opts=, file=, line=)
    at /usr/src/sys/kern/kern_mutex.c:672
#5  0xffffffff81042dc1 in smp_targeted_tlb_shootdown (mask={__bits = 0xfffffe085f03b780},
    vector=244, pmap=, addr1=, addr2=0)
    at /usr/src/sys/x86/x86/mp_x86.c:1470
#6  0xffffffff81042cac in smp_masked_invltlb (mask={__bits = 0xfffffe085f03b7b0}, pmap=)
    at /usr/src/sys/x86/x86/mp_x86.c:1504
#7  0xffffffff80eced91 in pmap_invalidate_all (pmap=0xfffff8017f9ff138)
    at /usr/src/sys/amd64/amd64/pmap.c:1662
#8  0xffffffff80ed936a in pmap_advise (pmap=, sva=35436597248,
    eva=35436597248, advice=5)
    at /usr/src/sys/amd64/amd64/pmap.c:6189
#9  0xffffffff80d60c26 in vm_map_madvise (map=, start=35436552192,
    end=35436597248, behav=)
    at /usr/src/sys/vm/vm_map.c:2291
#10 0xffffffff80d6534e in sys_madvise (td=, uap=)
    at /usr/src/sys/vm/vm_mmap.c:705
#11 0xffffffff80ee0394 in amd64_syscall (td=0xfffff802bb419000, traced=0)
    at subr_syscall.c:135
#12 0xffffffff80ec392b in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:396
#13 0x00000008020502fa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
------------------------------------------------------------------------------

I raised the voltage by 0.05V to 1.41250V as suggested by AMD tech support,
and will try another fresh poudriere run now.

At least that panic is something new. Is it caused by a flaky CPU or by a
software bug?
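
A few of the raw values in the backtrace can be cross-checked with simple arithmetic. The sketch below is not part of the original report; it only converts the decimal `tid` argument from frame #4 to hex and computes the size of the madvise range from frame #9. The 4096-byte page size and all variable names are assumptions made for illustration.

```python
# Values copied from the kgdb backtrace and the kernel message buffer.
TID_ARG = 18446735289348100096      # tid= in frame #4 (_mtx_lock_spin_cookie)
SYSCALL_TD = 0xfffff802bb419000     # td= in frame #11 (amd64_syscall)
LOCK_HOLDER = 0xfffff801325ea560    # "held by" pointer in the message buffer
MADVISE_START = 35436552192         # start= in frame #9 (vm_map_madvise)
MADVISE_END = 35436597248           # end= in frame #9

PAGE_SIZE = 4096  # assumption: 4 KiB base pages on amd64


def summarize():
    """Return the derived values as a dict so they are easy to inspect."""
    length = MADVISE_END - MADVISE_START
    return {
        "tid_hex": hex(TID_ARG),
        "tid_matches_syscall_td": TID_ARG == SYSCALL_TD,
        "tid_is_lock_holder": TID_ARG == LOCK_HOLDER,
        "madvise_length_bytes": length,
        "madvise_pages": length // PAGE_SIZE,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

If this reading is right, the `tid` argument is the same pointer as `td` in frame #11, so it appears to identify the thread that was spinning on the lock, not the holder (tid 102081) named in the message buffer. The madvise call covered 11 pages, and `advice=5` should correspond to MADV_FREE in FreeBSD's `sys/mman.h`, although that mapping is stated here from memory and is worth confirming against the header.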