Date: Tue, 28 Aug 2018 23:47:20 +0800
From: Meowthink <meowthink@gmail.com>
To: "karu.pruun" <karu.pruun@gmail.com>
Cc: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: Help diagnose my Ryzen build problem (in progress)
Message-ID: <CABnABoamgeDUMBXvGwHzgjKrQvHSXC8o3wVRhtu5hFsiLV%2BEaw@mail.gmail.com>

Hi Peeter,

On 8/28/18, karu.pruun <karu.pruun@gmail.com> wrote:
> On Mon, Aug 27, 2018 at 6:07 PM Meowthink <meowthink@gmail.com> wrote:
>
>> >> Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
>> >> Ryzen 5 2400G's model is 11h.
>> >>
>> >> On the microcode. It shall be updated through UEFI/BIOS updates. I
>> >> think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
>> >> 0x810100b.
>> >>
>> >> Seems like ... the only thing I can do is sit down and wait?
>> >
>> > The revision
>> >
>> > https://svnweb.freebsd.org/base/head/sys/x86/x86/cpu_machdep.c?r1=336763&r2=336762&pathrev=336763
>> >
>> > works around the mwait issue, i.e. it sets
>> >
>> > sysctl machdep.idle_mwait=0
>> > sysctl machdep.idle=hlt
>> >
>>
>> I think that shall not apply to 2400G, which is model 11h not 1h.
>> Here're what I have now:
>>
>> machdep.idle: acpi
>> machdep.idle_available: spin, mwait, hlt, acpi
>> machdep.idle_apl31: 0
>> machdep.idle_mwait: 1
>>
>> > Now it may or may not relate to your problem, but it appears that
>> > Ryzen 2400G also has another issue with HLT, see the DragonFly bug
>> > report
>> >
>> > https://bugs.dragonflybsd.org/issues/3131
>> >
>>
>> Thanks a lot for that info.
>> It's much easier to prove your problem, since it's reproducible. But
>> mine was so random to catch...
>> Anyway, it seems like the IRET issue [1] is still not fixed? I'm
>> highly doubt that my issue is this related because my system became
>> significantly more stable since I stop that irq storm from bluetooth
>> module - Though it still panics occasionally.
>> So could anybody tell, what's the difference between FreeBSD
>> workaround [2] and the DragonflyBSD one?
>>
>> > which AMD is aware of and is possibly working on, but it may not have
>> > appeared in the errata yet. The bug report says that until this is
>> > fixed, the workaround is to also disable HLT in cpu_idle. I am not
>> > sure what is the correct value for the sysctl on FreeBSD, perhaps
>> >
>> > sysctl machdep.idle=0
>> >
>> > or some other value?
>>
>> In the meantime, I have this microcode
>>
>> # cpucontrol -m 0x8b /dev/cpuctl0
>> MSR 0x8b: 0x00000000 0x0810100b
>>
>> Hence I should use mwait?
>> Still don't know what should I set. Any idea?
>
>
> If I was you, I'd play around with the sysctls mentioned above and see
> if it helps. Start with disabling both mwait and hlt, perhaps
>
> machdep.idle=spin
> machdep.idle_mwait=0
>
> (assuming that 'spin' means hlt will not used) and then if that does
> not lead to a panic, try enabling mwait. I can't test 2400G since I
> don't have it any more. I booted FreeBSD a couple of times but did not
> run it over long periods of time.

It works! After hours and hours of different stress tests, I got 8 copies
of gcc built without any problem.
But it costs a lot of power, and the fan becomes very annoying. So I
don't think I'll test long-term stability in this state.

machdep.idle: acpi -> spin - adds ~5W, maybe some deeper C states are disabled?
machdep.idle_mwait: 1 -> 0 - adds another ~50W, the CPUs never sleep.

I tried setting machdep.idle_mwait to 1, or machdep.idle to mwait. Both
failed with panics when I started building gcc pass by pass.
I'm pretty sure mwait causes the problem, as I once got a panic
immediately after issuing the sysctl command (the 2nd dump below).

So my next step will be hlt. That will still take some time, though.

>
> Cheers
>
> Peeter
>
> --
>

Cheers,
meowthink

------------------------------------------------------------------------
machdep.idle=mwait

panic: ffs_syncvnode: syncing truncated data.
cpuid = 7
KDB: stack backtrace:
#0 0xffffffff80b414b7 at kdb_backtrace+0x67
#1 0xffffffff80afa9e7 at vpanic+0x177
#2 0xffffffff80afa863 at panic+0x43
#3 0xffffffff80dcddc4 at ffs_syncvnode+0x5a4
#4 0xffffffff80dcc915 at ffs_fsync+0x25
#5 0xffffffff810ffcb2 at VOP_FSYNC_APV+0x82
#6 0xffffffff80bc3a62 at sched_sync+0x412
#7 0xffffffff80abd813 at fork_exit+0x83
#8 0xffffffff80f5cc7e at fork_trampoline+0xe

------------------------------------------------------------------------
machdep.idle_mwait=1

Fatal trap 9: general protection fault while in kernel mode
cpuid = 7; apic id = 07
instruction pointer     = 0x20:0xffffffff80e094fe
stack pointer           = 0x0:0xfffffe081e5df9e0
frame pointer           = 0x0:0xfffffe081e5dfa50
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 17 (dom0)
trap number             = 9
panic: general protection fault
cpuid = 7
KDB: stack backtrace:
#0 0xffffffff80b414b7 at kdb_backtrace+0x67
#1 0xffffffff80afa9e7 at vpanic+0x177
#2 0xffffffff80afa863 at panic+0x43
#3 0xffffffff80f7c14f at trap_fatal+0x35f
#4 0xffffffff80f7b70e at trap+0x5e
#5 0xffffffff80f5bccc at calltrap+0x8
#6 0xffffffff80e07a17 at vm_pageout+0x87
#7 0xffffffff80abd813 at fork_exit+0x83
#8 0xffffffff80f5cc7e at fork_trampoline+0xe
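
------------------------------------------------------------------------
For stepping through the idle configurations discussed above one at a
time (spin, then hlt, then mwait), something like the following might
help. It is only a rough sketch: the function names idle_cmds and
apply_idle, the configuration labels and the DRY_RUN variable are made
up for this example. The sysctl names and values are just the ones
quoted in this thread, and nothing here is known to be the correct
workaround for the 2400G.

```shell
#!/bin/sh
# Sketch only: the labels spin/hlt/mwait/acpi are names chosen for this
# script; the sysctl names and values are the ones quoted in this thread.

# Print the sysctl commands for one configuration, one per line.
idle_cmds() {
    case "$1" in
        spin)  echo "sysctl machdep.idle_mwait=0"
               echo "sysctl machdep.idle=spin" ;;
        hlt)   echo "sysctl machdep.idle_mwait=0"
               echo "sysctl machdep.idle=hlt" ;;
        mwait) echo "sysctl machdep.idle=mwait" ;;
        acpi)  echo "sysctl machdep.idle_mwait=1"
               echo "sysctl machdep.idle=acpi" ;;
        *)     echo "unknown configuration: $1" >&2
               return 1 ;;
    esac
}

# Apply a configuration. DRY_RUN defaults to 1, which only prints the
# commands; set DRY_RUN=0 (as root) to really run them.
apply_idle() {
    cmds=$(idle_cmds "$1") || return 1
    echo "$cmds" | while read -r cmd; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $cmd"
        else
            $cmd
        fi
    done
}
```

With the default dry run, "apply_idle hlt" only prints the two sysctl
commands it would issue. The "acpi" label puts back the values the box
started with (machdep.idle=acpi, machdep.idle_mwait=1).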