Date: Wed, 17 Jan 2018 15:10:30 -0800 (PST)
From: Don Lewis
Subject: Re: Ryzen issues on FreeBSD ?
To: Mike Tancsa
Cc: Pete French, freebsd-stable@freebsd.org

On 17 Jan, Mike Tancsa wrote:
> On 1/17/2018 3:39 PM, Don Lewis wrote:
>> On 17 Jan, Mike Tancsa wrote:
>>> On 1/17/2018 8:43 AM, Pete French wrote:
>>>>
>>>> Are you running the latest STABLE? There were some patches for Ryzen
>>>> which went in, I believe, and might affect the stability. Specifically
>>>> the changes to stop it locking up when executing code in the top page?
>>>
>>> Hi,
>>> I was testing with RELENG_11 as of 2 days ago. The fix seems to be there
>>>
>>> # sysctl -A hw.lower_amd64_sharedpage
>>> hw.lower_amd64_sharedpage: 1
>>>
>>> Would love to find a class of motherboard that pushes its "You don't need
>>> to dork around with any BIOS settings. It just works. Oh, and we have a
>>> hardware watchdog too".... ipmi would be stellar.
>>
>> The shared page change fixed the random lockup and silent reboot problem
>> for me. I've got a 1700X eight core CPU and a Gigabyte X370 Gaming 5. I
>> did have to RMA my CPU (it was an early one) because it had the problem
>> with random segfaults that seemed to be triggered by process migration
>> between CPU cores. I still haven't switched over to using it for
>> package builds because I see more random fallout than on my older
>> package builder.
>> I'm not blaming the hardware for that at this point
>> because I see a lot of the same issues on my older machine, but less
>> frequently.
>>
>> One thing to watch (though it should be less critical with a six core
>> CPU) is VRM cooling. I removed the stupid plastic shroud over the VRM
>> sink on my motherboard so that it gets some more airflow.
>
> Thanks! I will confirm the cooling. I tried just now looking at the CPU
> FAN control in the BIOS and upped it to "turbo" from the default. Does
> amdtemp.ko work with your chipset? Nothing on mine unfortunately, so I
> can't tell from the OS if it's running hot.
>
> Is there a way to see if your CPU is old and has that bug? I haven't
> seen any segfaults on the few dozen buildworlds I have done. So far it's
> always been a total lockup and not a crash with RELENG_11.
>
> x86info v1.31pre
> Found 12 identical CPUs
> Extended Family: 8 Extended Model: 0 Family: 15 Model: 1 Stepping: 1
> CPU Model (x86info's best guess): AMD Zen Series Processor (ZP-B1)
> Processor name string (BIOS programmed): AMD Ryzen 5 1600 Six-Core
> Processor

My original CPU had a date code of 1708SUT (8th week of 2017 I think),
and the replacement has a date code of 1733SUS. There's a humungous
discussion thread here where date codes are discussed. As I recall, the
first replacement parts shipped had date codes somewhere in the mid
20's, but I think AMD was still hand screening parts at that point. My
replacement came in a sealed box, so it wasn't hand screened and AMD
probably was able to screen for this problem in their production test.
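
For anyone comparing date codes, here is a minimal sketch of how the two
codes above could be split into year and week. It assumes the reading
guessed at above (first two digits are the year, next two the week of
that year); that layout is not confirmed by AMD documentation here, and
the names DateCode and parse_date_code are made up for illustration. It
only decodes the code and says nothing about whether a given part is
affected.

```python
import re
from typing import NamedTuple


class DateCode(NamedTuple):
    """Hypothetical container for a decoded date code."""
    year: int
    week: int
    suffix: str


def parse_date_code(code: str) -> DateCode:
    """Split a code such as '1708SUT' into year, week and trailing letters.

    Assumption: the code is laid out as YYWW followed by letters, with YY
    meaning 20YY. This is the interpretation guessed at in the message,
    not a documented format.
    """
    match = re.fullmatch(r"(\d{2})(\d{2})([A-Z]*)", code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized date code: {code!r}")
    year = 2000 + int(match.group(1))
    week = int(match.group(2))
    if not 1 <= week <= 53:
        raise ValueError(f"week out of range in date code: {code!r}")
    return DateCode(year, week, match.group(3))


if __name__ == "__main__":
    # The two date codes mentioned in the message.
    for code in ("1708SUT", "1733SUS"):
        decoded = parse_date_code(code)
        print(f"{code}: week {decoded.week} of {decoded.year}")
```

The date code is stamped on the CPU package itself, so it has to be read
off the heat spreader; the sketch only helps with interpreting it.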