From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...
Date: Mon, 24 Jul 2017 08:10:58 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.0-STABLE

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #92 from Konstantin Belousov ---
(In reply to Don Lewis from comment #90)

Yes, the coredumping message appears because the object backing the shared page entry is initialized with only a single page, so an attempt to read from the second page cannot be satisfied without backing physical memory.

From what I see in the AMD support forums and reddit threads, the issue is not diagnosed yet and AMD is silent about it. The strangest thing I found was a claim that the CPU sometimes executes instructions from %rip+0x40 bytes instead of %rip. That would explain Dillon's fix but would probably have no effect on the FreeBSD trampoline layout, unless some more weirdness is in place.

If the problem is indeed in hardware (I hope so) and AMD is able to identify and fix it, I very much dislike the global change to the AMD64 native VA layout. My concerns are due to the USRSTACK value leaking to tools and becoming part of the ABI.
For instance, I added kern.proc..sigtramp so that debuggers and unwinders like libunwind can avoid using a pre-defined value for the trampoline base to detect signal frames, but some tools are not converted, and old binaries cannot be fixed. There is a similar concern for old libc's setproctitle(3). Etc.

I suggest trying a different approach for implementing your workaround: if a matching CPU is detected, decrement sv_usrstack and sv_shared_page_base by PAGE_SIZE. I expect that the image activator is parametrized by struct sysentvec enough to make this work; if not, I will fix it. For the Linux 64-bit emulation, a similar adjustment for the Linux ABI sysentvec should be done at module init.

It is a shame that AMD is silent and does not provide errata or notifications of problems for their flagship CPUs.
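
The suggested adjustment (decrement sv_usrstack and sv_shared_page_base by PAGE_SIZE when a matching CPU is detected) could look roughly like the userland sketch below. This is not kernel code and not the committed fix: struct sysentvec_sketch, SKETCH_PAGE_SIZE and sysentvec_apply_workaround() are hypothetical names that stand in for the real struct sysentvec fields, and CPU detection is reduced to a boolean supplied by the caller, since the comment does not say how the match would be made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for the sketch; the kernel would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical stand-in for the two struct sysentvec fields that the
 * comment proposes to adjust. The real structure has many more members.
 */
struct sysentvec_sketch {
	uint64_t sv_usrstack;         /* top of the user stack */
	uint64_t sv_shared_page_base; /* base address of the shared page */
};

/*
 * Lower both addresses by one page when the caller reports a matching
 * CPU. How the CPU is matched is left to the caller (an assumption).
 */
static void
sysentvec_apply_workaround(struct sysentvec_sketch *sv, bool cpu_matches)
{
	if (!cpu_matches)
		return;

	/* Guard against underflow before subtracting a page. */
	assert(sv->sv_usrstack >= SKETCH_PAGE_SIZE);
	assert(sv->sv_shared_page_base >= SKETCH_PAGE_SIZE);

	sv->sv_usrstack -= SKETCH_PAGE_SIZE;
	sv->sv_shared_page_base -= SKETCH_PAGE_SIZE;
}
```

The point of adjusting per-sysentvec values rather than the global constants is that unaffected CPUs keep the existing layout, so tools and old binaries that hard-code USRSTACK only see a change on machines that need the workaround.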