From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...
Date: Tue, 25 Jul 2017 13:30:58 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #121 from Nils Beyer ---
(In reply to Don Lewis from comment #119)
> This gives mfence() some memory loads to wait for, which allows the data to be
> migrated from the core A cache. With this change, I no longer get any segfaults.

Confirmed - with that change, I haven't gotten any segfaults in 500 passes.
Though, there is a discrepancy in how many passes each core has completed:

---------------------------------------------------------------------------
[...]
412: Tue Jul 25 15:19:00 CEST 2017: OK
405: Tue Jul 25 15:19:01 CEST 2017: OK
402: Tue Jul 25 15:19:01 CEST 2017: OK
420: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
406: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
414: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
409: Tue Jul 25 15:19:02 CEST 2017: OK
413: Tue Jul 25 15:19:02 CEST 2017: OK
423: Tue Jul 25 15:19:02 CEST 2017: OK
397: Tue Jul 25 15:19:02 CEST 2017: OK
411: Tue Jul 25 15:19:02 CEST 2017: OK
401: Tue Jul 25 15:19:02 CEST 2017: OK
421: Tue Jul 25 15:19:02 CEST 2017: OK
438: Tue Jul 25 15:19:02 CEST 2017: OK
427: Tue Jul 25 15:19:02 CEST 2017: OK
406: Tue Jul 25 15:19:02 CEST 2017: OK
---------------------------------------------------------------------------

In my eyes, each core is performing the same workload and should therefore be
at the same pass number. Maybe I'm completely wrong. But haven't you observed
that, too?

> Ryzen bug? Just more aggressive prefetching? I don't know ...

It's a rather difficult question: if CPU A executes something without
segfaults, and CPU B throws segfaults using the same executable, does that
automatically mean that CPU B is doing something wrong? Or does it rather mean
that CPU B is not 100% compatible with CPU A and therefore needs an appropriate
executable?

I ask because I wonder whether that's something that should be reported to AMD
tech support - particularly because I have an open ticket there...
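
To put a number on the pass-count discrepancy in the log excerpt above, a small script can pull out the leading pass number of each line and report the range. This is only a sketch and is not part of the original test: the function names are made up for illustration, and because the log is truncated ("[...]"), it only summarizes the lines that are actually shown.

```python
import re

# Log excerpt copied from the message above; format is "<pass>: <timestamp>: OK".
LOG = """\
412: Tue Jul 25 15:19:00 CEST 2017: OK
405: Tue Jul 25 15:19:01 CEST 2017: OK
402: Tue Jul 25 15:19:01 CEST 2017: OK
420: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
406: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
414: Tue Jul 25 15:19:01 CEST 2017: OK
410: Tue Jul 25 15:19:01 CEST 2017: OK
409: Tue Jul 25 15:19:02 CEST 2017: OK
413: Tue Jul 25 15:19:02 CEST 2017: OK
423: Tue Jul 25 15:19:02 CEST 2017: OK
397: Tue Jul 25 15:19:02 CEST 2017: OK
411: Tue Jul 25 15:19:02 CEST 2017: OK
401: Tue Jul 25 15:19:02 CEST 2017: OK
421: Tue Jul 25 15:19:02 CEST 2017: OK
438: Tue Jul 25 15:19:02 CEST 2017: OK
427: Tue Jul 25 15:19:02 CEST 2017: OK
406: Tue Jul 25 15:19:02 CEST 2017: OK
"""


def pass_numbers(log):
    """Return the leading pass number of every line that ends in ': OK'."""
    pattern = re.compile(r"^(\d+): .*: OK$", re.MULTILINE)
    return [int(match.group(1)) for match in pattern.finditer(log)]


def summarize(passes):
    """Summarize how far apart the reported pass numbers are."""
    lowest, highest = min(passes), max(passes)
    return {
        "lines": len(passes),
        "min": lowest,
        "max": highest,
        "spread": highest - lowest,
    }


if __name__ == "__main__":
    print(summarize(pass_numbers(LOG)))
```

For the 19 lines shown, this gives a minimum of 397, a maximum of 438, and a spread of 41 passes.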