From: Stefan Blachmann
Date: Mon, 27 Aug 2018 23:30:25 +0200
Subject: Re: Ryzen Build Problem
To: Meowthink
Cc: Mitchell, freebsd-hackers@freebsd.org

It is remarkable that AMD's list contains only "brands" like Crucial
and the like, but not a single first-party manufacturer.
Why? Because the first-party manufacturers do not sell bad memory, for
simple reputation reasons.

The question of where all the B-grade chips end up, the ones the memory
manufacturers reject for use under their own brand, is an old taboo in
the industry.
My personal impression is that these are dumped via these third-party
memory module manufacturers.

The typical gamer/overclocker customer, unaware of this, will readily
explain away problems on her non-ECC system, equipped with memory chips
rejected by the original manufacturer, as "the usual Windows crashes".
Consumers will even happily take the fancy "coolers" on the modules as
a "sign of quality and worthiness", when their actual function is to
hide the crap inside.

Thus my personal advice:
Do not use memory modules from third-party manufacturers.
The time and data you lose do not justify the savings from buying
from remarketers of B-grade parts.
Only buy first-party memory modules, i.e. Samsung, Hynix, Micron etc.
(If you really insist on using third-party modules, take Kingston, who
have a comparatively short history of using unreliable chips compared
to other "brands".)

On 8/27/18, Meowthink wrote:
> Hi frank,
>
> On 8/27/18, Mitchell wrote:
>>
>> Hi Meowthink:
>>
>> I'm planning a Home Build, and I came across an issue which might apply
>> to your design.
>>
>> Some AMD CPUs are designed for Over-Clocking automatically. But when I
>> investigated Memory Compatibility I saw that some Memory wasn't.
>
> Many Intel CPUs are turbo boost enabled, also. I think it's safe to
> trust these designs. They'll communicate with memory at a steady clock
> rate, which is provided by the SPD chips on the DIMMs.
>
> Ryzens are known to have compatibility issues with memory. An easier
> way is to choose a module which is in the qualified list, "QVL".
>
>>
>> The "AMD Ryzen 5 2400G" looks like it can Over-Clock itself when it
>> feels safe to do so.
>>
>> But the "Crucial 16GB DDR4-2400 EUDIMM CL17" seems to be classified as
>> Server Memory, which could mean it's designed for a single speed. I
>> couldn't find more details about Crucial Memory Over-Clocking.
>>
>> The Crucial Web Pages do feature a Help Facility which might enable you
>> to check further if you input all your system details.
>>
>
> That was a mistake from months ago. What I care about is ECC.
> I knew Ryzens (1x00) are ECC enabled. Then I was misled when checking
> the mobo's specification, as Asrock didn't mention Raven Ridge (2x00G)
> at that time. I thought my build with the 2400G would get ECC, but
> sadly not.
> Now Asrock say this on their website:
>
> - AMD Ryzen series CPUs (Raven Ridge) support DDR4
> 3200+(OC)/2933(OC)/2667/2400/2133 non-ECC, un-buffered memory*
>
> *For Ryzen Series CPUs (Raven Ridge), ECC is only supported with PRO CPUs.
>
> In the end I got my system running, but without ECC enabled.
>
>> I'm no expert here. This will be my first Home Build attempt and I
>> haven't even started yet. You probably need a 2nd and 3rd opinion on
>> this topic. I'm just hoping my contribution will prompt further comments
>> from FreeBSD people with more know-how than I've got.
>>
>> Yours truly: Frank Mitchell
>>
>
> You are welcome.
>
> Cheers,
> meowthink
>
>> On 27/08/18 09:13, Phil Norman wrote:
>>> Hi.
>>>
>>> I have a similar setup: Ryzen 3 and Fatal1ty X370 mini-ITX. I had some
>>> trouble with instability, although my problems weren't panics, but
>>> rather two issues. One was random lockups (with no evidence left in
>>> logs), but I *think* this was down to an inadequately cooled graphics
>>> card.
>>>
>>> The other problem I had was with USB. I got quite a spam of log messages
>>> about the USB reinitialisation. However, eventually I figured out that
>>> the problem didn't occur if I booted the system from a completely
>>> powered-down state. That is, use the physical switch on the PSU to cut
>>> power entirely, re-enable, then boot from that state. Since then I've
>>> had 67 days of uninterrupted uptime, with no USB issues at all.
>>>
>>> It sounds like your problem is different, but trying a boot-from-cold
>>> might be worthwhile, just in case ASRock have a consistent problem in
>>> this regard.
>>>
>>> Cheers,
>>> Phil
>>>
>>> On 26 August 2018 at 13:20, Meowthink wrote:
>>>
>>>> Hello all,
>>>>
>>>> Recently I tried to build up a Ryzen system and run FreeBSD on it.
>>>> CPU: AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
>>>> Mobo: Asrock Fatal1ty AB350 Gaming-ITX/ac ( with up-to-date BIOS with
>>>> PinnaclePI-AM4_1.0.0.4, microcode 0x810100b )
>>>> Mem: 2x Crucial 16GB DDR4-2400 EUDIMM CL17 ( ECC Unregistered but ECC
>>>> actually won't work :( )
>>>>
>>>> But the system is unstable - it can't last a few days even when it
>>>> is nearly idle. The system panics even at midnight.
>>>> It almost always panics while or after I build something large.
>>>> Surprisingly, I didn't encounter any user program faults, bad
>>>> binaries built etc., only panics.
>>>>
>>>> Then I tried lots of BIOS settings, e.g. SMT, C6 idle current,
>>>> underclocking RAM, but none seems to have any effect.
>>>> It passes memtest86 V7.5 without error, and various benchmarks
>>>> under Windows, thus I think the problem is not in the hardware but
>>>> in the software.
>>>>
>>>> In the mean time, I realized that the rate of irqs from xhci0 is too
>>>> high - it's about 1998/s. I found [1] and tried to MFC r331665. It
>>>> didn't fix the problem though, but disabling that bluetooth module
>>>> stops the irq storm, after all.
>>>>
>>>> Then the system lasts much longer before a panic. It can eventually
>>>> compile the ports tree, build the world, scrub the zpool, all done
>>>> without annoying reboots.
>>>> Then I assume this is related to [2]? So I also tried cpuctl, binding
>>>> all processes to 2-7.
>>>> But the problem is still there, only the chance has become very low.
>>>> It still panics occasionally, after idling a week or stressing a few
>>>> hours - stress seems to raise the chance of a panic, but differently
>>>> by type. Things like llvm will always build, but gcc will cause a
>>>> panic every few passes.
>>>>
>>>> The system was 11.2 but then moved on to stable/11 (r337906
>>>> currently). I've got the last 10 coredumps saved but my kernel isn't
>>>> compiled as debug. So I'll put some backtraces from core.txt.? at
>>>> the end.
>>>>
>>>> I really want to eliminate this problem. Could someone guide me on
>>>> how to figure out the problem? What should I try next?
>>>>
>>>> Best regards,
>>>> Meowthink
>>>>
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to
>> "freebsd-hackers-unsubscribe@freebsd.org"
>>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>
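
The quoted report mentions an interrupt rate of about 1998/s from
xhci0. One way to spot such a source is to scan the output of
"vmstat -i" for rates above some limit. What follows is only a minimal
sketch, assuming the usual three-column layout (interrupt, total,
rate). The function names, the threshold of 1000/s and the sample
text are made up for illustration; only the xhci0 rate is taken from
the report.

```python
def parse_vmstat_i(text):
    """Return {source name: rate per second} from `vmstat -i` style text.

    Assumes each data row ends with two integer columns (total, rate).
    """
    rates = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row and the trailing "Total" row.
        if len(parts) < 3 or parts[0] in ("interrupt", "Total"):
            continue
        # Skip anything whose last two columns are not plain integers.
        if not (parts[-2].isdigit() and parts[-1].isdigit()):
            continue
        name = " ".join(parts[:-2])
        rates[name] = int(parts[-1])
    return rates


def storm_candidates(text, threshold=1000):
    """Return the sorted names of sources whose rate is >= threshold."""
    return sorted(
        name for name, rate in parse_vmstat_i(text).items() if rate >= threshold
    )


# Hypothetical sample; irq numbers and totals are invented.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                           2          0
cpu0:timer                      12345678        100
irq264: xhci0                  246000000       1998
Total                          258345680       2098
"""

if __name__ == "__main__":
    for name in storm_candidates(SAMPLE):
        print("possible interrupt storm:", name)
```

On a live system the text would come from running "vmstat -i" rather
than from SAMPLE, and a sensible threshold depends on the hardware.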