From: Mike Tancsa
Organization: Sentex Communications
To: Konstantin Belousov, current@freebsd.org, amd64@freebsd.org
Subject: Re: Ryzen public erratas
Date: Wed, 13 Jun 2018 16:41:02 -0400
Message-ID: <2838dcda-f117-6732-bf12-70618a81a1d7@sentex.net>
In-Reply-To: <20180613103535.GP2493@kib.kiev.ua>
References: <20180613103535.GP2493@kib.kiev.ua>
Content-Type: text/plain; charset=utf-8
List-Id: Discussions about the use of FreeBSD-current

On 6/13/2018 6:35 AM, Konstantin Belousov wrote:
> Today I noted that AMD published the public errata document for Ryzens,
> https://developer.amd.com/wp-content/resources/55449_1.12.pdf
>
> Some of the issues listed there looks quite relevant to the potential
> hangs that some people still experience with the machines. I wrote
> a script which should apply the recommended workarounds to the erratas
> that I find interesting.
>
> To run it, kldload cpuctl, then apply the latest firmware update to your
> CPU, then run the following shell script. Comments indicate the errata
> number for the workarounds.

Hi,
    tl;dr: The microcode changes seem to fix a hard lockup I was able to
reliably reproduce back in Feb.

The BIOS on my AMD box is pretty up to date. I think it has the same
microcode as what's in the ports. x86info -a shows

root@ryzenbsd11:/home/mdtancsa # x86info -a | grep -i microc
Microcode patch level: 0x8001137
root@ryzenbsd11:/home/mdtancsa #

After running the microcode update, it reports the same patch level:

root@ryzenbsd11:/home/mdtancsa # /usr/local/etc/rc.d/microcode_update onestart
Updating CPU Microcode...
Done.
root@ryzenbsd11:/home/mdtancsa # x86info -a | grep -i microc
Microcode patch level: 0x8001137
root@ryzenbsd11:/home/mdtancsa #

However, the dmesg after the microcode update adds this line

  AMD Extended Feature Extensions ID EBX=0x1007

CPU: AMD Ryzen 5 1600X Six-Core Processor (3593.36-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  Features=0x178bfbff
  Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD Features2=0x35c233ff
  Structured Extended Features=0x209c01a9
  XSAVE Features=0xf
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics

I ran the script

root@ryzenbsd11:/home/mdtancsa # cat fix.sh
#!/bin/sh

# Enable workarounds for erratas listed in
# https://developer.amd.com/wp-content/resources/55449_1.12.pdf

# 1057, 1109
sysctl machdep.idle_mwait=0
sysctl machdep.idle=hlt

for x in /dev/cpuctl*; do
        # 1021
        cpucontrol -m '0xc0011029|=0x2000' $x
        # 1033
        cpucontrol -m '0xc0011020|=0x10' $x
        # 1049
        cpucontrol -m '0xc0011028|=0x10' $x
        # 1095
        cpucontrol -m '0xc0011020|=0x200000000000000' $x
        echo $x
done
root@ryzenbsd11:/home/mdtancsa # sh ./fix.sh
machdep.idle_mwait: 1 -> 0
machdep.idle: acpi -> hlt
/dev/cpuctl0
/dev/cpuctl1
/dev/cpuctl10
/dev/cpuctl11
/dev/cpuctl2
/dev/cpuctl3
/dev/cpuctl4
/dev/cpuctl5
/dev/cpuctl6
/dev/cpuctl7
/dev/cpuctl8
/dev/cpuctl9
root@ryzenbsd11:/home/mdtancsa #

Using a FreeBSD stable from back in Feb, I was able to crash Ryzen and
Epyc based systems
(https://lists.freebsd.org/pipermail/freebsd-stable/2018-February/088439.html)
by generating a lot of traffic between the hypervisor and guests, e.g.
start 3 guests in bhyve (amd64) and run combos of iperf3 between them.
The same tests on an Intel based box ran just fine. It would not take
too long before the box would hard lock, i.e. blank screen, no crash
dump etc. With the latest microcode update, I have been running the
same sort of tests and so far so good.
I will let them run overnight to see if things are now stable on STABLE.

    ---Mike

>
> Please report the results. If the script helps, I will code the kernel
> change to apply the workarounds.
>
> #!/bin/sh
>
> # Enable workarounds for erratas listed in
> # https://developer.amd.com/wp-content/resources/55449_1.12.pdf
>
> # 1057, 1109
> sysctl machdep.idle_mwait=0
> sysctl machdep.idle=hlt
>
> for x in /dev/cpuctl*; do
>         # 1021
>         cpucontrol -m '0xc0011029|=0x2000' $x
>         # 1033
>         cpucontrol -m '0xc0011020|=0x10' $x
>         # 1049
>         cpucontrol -m '0xc0011028|=0x10' $x
>         # 1095
>         cpucontrol -m '0xc0011020|=0x200000000000000' $x
> done

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
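
A side note on the masks in the script: each `|=` value sets one bit in the named MSR. The sketch below only decodes the mask values that appear in the script into bit positions. It does not touch any MSR and does not need cpuctl. The helper name bits_of is hypothetical, made up for this illustration, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original fix.sh): print the bit
# positions that are set in a mask, lowest bit first.
bits_of() {
    _m=$(( $1 ))        # shell arithmetic accepts hex constants such as 0x2000
    _bit=0
    _out=""
    while [ "$_bit" -lt 63 ]; do
        if [ $(( (_m >> _bit) & 1 )) -eq 1 ]; then
            _out="${_out:+$_out }$_bit"
        fi
        _bit=$(( _bit + 1 ))
    done
    echo "$_out"
}

# Erratum number and mask pairs, copied from the script in this thread.
for pair in 1021:0x2000 1033:0x10 1049:0x10 1095:0x200000000000000; do
    erratum=${pair%%:*}
    mask=${pair#*:}
    echo "erratum $erratum: mask $mask sets bit(s) $(bits_of "$mask")"
done
```

For the masks above this reports bit 13 for 1021, bit 4 for 1033 and 1049, and bit 57 for 1095, which can be compared against the bit numbers in the workaround text of the errata document.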