Date:      Mon, 27 Aug 2018 14:56:45 +0200
From:      Johannes Lundberg <johalun0@gmail.com>
To:        Meowthink <meowthink@gmail.com>
Cc:        Freebsd hackers list <freebsd-hackers@freebsd.org>, freebsd-stable@freebsd.org
Subject:   Re: Help diagnose my Ryzen build problem
Message-ID:  <CAECmPwvmQ1a36=5V0PtKcrVgosAyRaKmsRq2CMMwTwfqsLYHuA@mail.gmail.com>
In-Reply-To: <CABnABoZA4DUOFfr7JdbbBAWxak3=ge6zX0HXtu1RffQH7tSb2Q@mail.gmail.com>
References:  <CABnABoZA4DUOFfr7JdbbBAWxak3=ge6zX0HXtu1RffQH7tSb2Q@mail.gmail.com>

On Sun, Aug 26, 2018 at 1:27 PM Meowthink <meowthink@gmail.com> wrote:

> Hello all,
>
> Recently I tried to build up a Ryzen system and run FreeBSD on it.
> CPU:  AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
> Mobo: Asrock Fatal1ty AB350 Gaming-ITX/ac ( with up-to-date BIOS with
> PinnaclePI-AM4_1.0.0.4, microcode 0x810100b )
> Mem:  2x Crucial 16GB DDR4-2400 EUDIMM CL17 ( ECC Unregistered but ECC
> actually won't work :( )
>
> But the system is unstable: it can't last a few days even when nearly
> idle. It panics even at midnight. It almost always panics while or after I
> build something large. Surprisingly, I haven't encountered user program
> faults, bad binaries, etc., only panics.
>
> Then I tried lots of BIOS settings, e.g. SMT, C6 idle current, and
> underclocking the RAM, but none seems to have any effect.
> It passes memtest86 V7.5 without errors, as well as various benchmarks
> under Windows. Thus I think the problem is not in the hardware but in
> the software.
>
> In the meantime, I realized that the rate of IRQs from xhci0 is too
> high: about 1998/s. I found [1] and tried to MFC r331665. That
> didn't fix the problem, but disabling the Bluetooth module
> stopped the IRQ storm.
>
> Then the system lasted much longer before panicking. It could eventually
> compile the ports tree, build world, and scrub the zpool, all without
> annoying reboots.
> I then assumed this is related to [2], so I also tried cpuctl, binding
> all processes to CPUs 2-7.
> But the problem is still there; only the chance has become very low. It
> still panics occasionally, after idling for a week or under stress for a
> few hours. Stress seems to raise the chance of a panic, but by how much
> depends on the type of load: things like llvm always build, but gcc causes
> a panic every few passes.
>
> The system was 11.2 but then moved on to stable/11 (r337906
> currently). I've saved the last 10 coredumps, but my kernel isn't
> compiled with debugging. So I'll put some backtraces from core.txt.? at
> the end.
>
> I really want to eliminate this problem. Could someone guide me on how to
> figure it out? What should I try next?
>
> Best regards,
> Meowthink
>

Hi

I have a similar setup and also experience random hard resets without a
kernel dump. FreeBSD can usually run for days; Windows 10 (a fresh install,
I think) doesn’t last more than a few minutes before a BSOD.

Windows 10 came with the box when I bought it. I can't run Windows Update
or anything else since it hangs so soon.
The box came with an earlier mobo+CPU that I replaced with an ASRock+Ryzen.
The earlier AMD Kaveri was stable in both Windows (I think) and FreeBSD.

On occasion I get a kernel panic directly in early boot. See attached image.

I am running cpu microcode update in rc.conf.
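
For anyone who wants to measure the xhci0 interrupt rate mentioned in the quoted message over an interval of their own choosing, here is a minimal sketch. It assumes `vmstat -i` prints one interrupt source per line with the total and the rate as the last two columns; the function names and the sample text are made up for illustration and are not output from either of our machines.

```python
def parse_vmstat_i(text):
    """Return {source_name: total_count} from `vmstat -i` style output."""
    totals = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)  # name, total, rate
        if len(parts) != 3:
            continue
        name, total, _rate = parts
        if not total.isdigit():
            continue  # skip the column header line
        totals[name.strip()] = int(total)
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source present in both samples."""
    first = parse_vmstat_i(before)
    second = parse_vmstat_i(after)
    return {
        name: (second[name] - first[name]) / seconds
        for name in second
        if name in first
    }


if __name__ == "__main__":
    # Hypothetical samples taken 10 seconds apart (not real measurements).
    sample_a = "interrupt total rate\nirq16: xhci0 1000 1998\n"
    sample_b = "interrupt total rate\nirq16: xhci0 20980 1998\n"
    for source, rate in interrupt_rates(sample_a, sample_b, 10.0).items():
        print(f"{source}: {rate:.0f}/s")
```

On a live system the two samples would come from running `vmstat -i` twice with a sleep in between.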



>
> [1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224886
> [2] https://reviews.freebsd.org/D11780
>
> Backtraces newer - older:
> ------------------------------------------------------------------------
> Panic while compiling gcc:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80afa5fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790,
>     eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff822950a8 in arc_change_state ()
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
> #9  0xffffffff8229328b in arc_access () at time.h:145
> #10 0xffffffff82296232 in arc_write_done (zio=0xfffff8065f886410)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6169
> #11 0xffffffff82334cbe in zio_done (zio=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4032
> #12 0xffffffff8233070c in zio_execute (zio=0xfffff8065f886410)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768
> #13 0xffffffff80b52cc4 in taskqueue_run_locked (queue=0xfffff8000d9e6e00)
>     at /usr/src/sys/kern/subr_taskqueue.c:463
> #14 0xffffffff80b53e28 in taskqueue_thread_loop (arg=<value optimized out>)
>     at /usr/src/sys/kern/subr_taskqueue.c:755
> #15 0xffffffff80abd813 in fork_exit (
>     callout=0xffffffff80b53d90 <taskqueue_thread_loop>,
>     arg=0xfffff8000d967030, frame=0xfffffe081e962ac0)
>     at /usr/src/sys/kern/kern_fork.c:1072
> #16 0xffffffff80f5cc7e in fork_trampoline ()
>     at /usr/src/sys/amd64/amd64/exception.S:972
> #17 0x0000000000000000 in ?? ()
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic when shutting down:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80afa5fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081ed30700, eva=0)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081ed30700, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081ed30700)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff80dfe4ad in vm_object_terminate (object=0xfffff805bf66d5a0)
>     at /usr/src/sys/vm/vm_object.c:768
> #9  0xffffffff80dfd0f8 in vm_object_deallocate (object=0x0)
>     at /usr/src/sys/vm/vm_object.c:677
> #10 0xffffffff80df3189 in _vm_map_unlock (map=<value optimized out>,
>     file=<value optimized out>, line=<value optimized out>)
>     at /usr/src/sys/vm/vm_map.c:2939
> #11 0xffffffff80df7be2 in vm_map_remove (map=0xfffff80018673000, start=4096,
>     end=140737488351232) at /usr/src/sys/vm/vm_map.c:3137
> #12 0xffffffff80df2e49 in vmspace_exit (td=0xfffff80039ec0620)
>     at /usr/src/sys/vm/vm_map.c:337
> #13 0xffffffff80ab72b9 in exit1 (td=0xfffff80039ec0620,
>     rval=<value optimized out>, signo=<value optimized out>)
>     at /usr/src/sys/kern/kern_exit.c:401
> #14 0xffffffff80ab6ced in sys_sys_exit (td=<value optimized out>,
>     uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:180
> #15 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff80039ec0620, traced=0)
>     at subr_syscall.c:132
> #16 0xffffffff80f5c5ad in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #17 0x00000008028d034a in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while running only my single-threaded Python script:
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80afa5fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081f635ec0, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081f635ec0)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798,
>     v=<value optimized out>) at /usr/src/sys/kern/kern_rwlock.c:977
> #9  0xffffffff80bbca92 in bufobj_invalbuf (bo=<value optimized out>, flags=1,
>     slpflag=1017770744, slptimeo=<value optimized out>)
>     at /usr/src/sys/kern/vfs_subr.c:1609
> #10 0xffffffff80bbf8be in vgonel (vp=0xfffff8053ca9f1d8)
>     at /usr/src/sys/kern/vfs_subr.c:1655
> #11 0xffffffff80bbbcc4 in vnlru_free_locked (count=1, mnt_op=0x0)
>     at /usr/src/sys/kern/vfs_subr.c:1227
> #12 0xffffffff80bbbe14 in getnewvnode_reserve (count=1)
>     at /usr/src/sys/kern/vfs_subr.c:1287
> #13 0xffffffff82327fb4 in zfs_zget (zfsvfs=0xfffff80076574000, obj_num=34941,
>     zpp=0xfffffe081f6362a8)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1122
> #14 0xffffffff823421ad in zfs_dirent_lookup (dzp=0xfffff804ff94e420,
>     name=0xfffffe081f6363e0 "filename.ext", zpp=0xfffffe081f6362a8, flag=2)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
> #15 0xffffffff82342267 in zfs_dirlook (dzp=0xfffff804ff94e420,
>     name=<value optimized out>, zpp=0xfffffe081f636360)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
> #16 0xffffffff8235a4ef in zfs_lookup ()
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
> #17 0xffffffff8235ac1e in zfs_freebsd_lookup (ap=0xfffffe081f636548)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4956
> #18 0xffffffff810fe89c in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>,
>     a=0xfffffe081f636548) at vnode_if.c:195
> #19 0xffffffff80ba8d56 in vfs_cache_lookup (ap=<value optimized out>)
>     at vnode_if.h:80
> #20 0xffffffff810fe77c in VOP_LOOKUP_APV (vop=<value optimized out>,
>     a=0xfffffe081f636610) at vnode_if.c:127
> #21 0xffffffff80bb2761 in lookup (ndp=0xfffffe081f636748) at vnode_if.h:54
> #22 0xffffffff80bb1c29 in namei (ndp=0xfffffe081f636748)
>     at /usr/src/sys/kern/vfs_lookup.c:448
> #23 0xffffffff80bc8238 in kern_statat (td=0xfffff8013ba1b620,
>     flag=<value optimized out>, fd=-100,
>     path=0x80332c910 <Address 0x80332c910 out of bounds>,
>     pathseg=UIO_USERSPACE, sbp=0xfffffe081f636900, hook=0)
>     at /usr/src/sys/kern/vfs_syscalls.c:2023
> #24 0xffffffff80bc817d in sys_stat (td=<value optimized out>,
>     uap=0xfffff8013ba1bb58) at /usr/src/sys/kern/vfs_syscalls.c:1978
> #25 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff8013ba1b620, traced=0)
>     at subr_syscall.c:132
> #26 0xffffffff80f5c5ad in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #27 0x0000000801a5b9ca in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while using mplayer
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80af91cb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80af95f1 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80af9433 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7a13f in trap_fatal (frame=0xfffffe081f70d380, eva=0)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7a199 in trap_pfault (frame=0xfffffe081f70d380, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f79974 in trap (frame=0xfffffe081f70d380)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5a00c in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff8088c030 in hdac_stream_start (dev=<value optimized out>,
>     child=<value optimized out>, dir=0, stream=1, buf=1889533952, blksz=2048,
>     blkcnt=2) at /usr/src/sys/dev/sound/pci/hda/hdac.c:1927
> #9  0xffffffff8088437d in hdaa_channel_start (ch=<value optimized out>)
>     at hdac_if.h:84
> #10 0xffffffff80887e0d in hdaa_channel_trigger (obj=<value optimized out>,
>     data=0xfffff8007102c480, go=1)
>     at /usr/src/sys/dev/sound/pci/hda/hdaa.c:2161
> #11 0xffffffff80893b8e in chn_trigger (c=0xfffff80071058400, go=1)
>     at channel_if.h:131
> #12 0xffffffff8089751b in chn_notify (c=0xfffff80071058400,
>     flags=<value optimized out>) at /usr/src/sys/dev/sound/pcm/channel.c:2281
> #13 0xffffffff808b697f in vchan_trigger (obj=<value optimized out>,
>     data=<value optimized out>, go=1)
>     at /usr/src/sys/dev/sound/pcm/vchan.c:171
> #14 0xffffffff80893b8e in chn_trigger (c=0xfffff80071057c00, go=1)
>     at channel_if.h:131
> #15 0xffffffff8089de10 in dsp_ioctl (i_dev=<value optimized out>,
>     cmd=<value optimized out>, arg=0xfffffe081f70d8d0 "\003",
>     mode=<value optimized out>, td=<value optimized out>)
>     at /usr/src/sys/dev/sound/pcm/dsp.c:1733
> #16 0xffffffff809c5b38 in devfs_ioctl_f (fp=0xfffff802c5563c80,
>     com=2147766288, data=0xfffffe081f70d8d0, cred=0xfffff8004c482500,
>     td=0xfffff802f24ac000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
> #17 0xffffffff80b5c00d in kern_ioctl (td=0xfffff802f24ac000, fd=51,
>     com=2147766288, data=<value optimized out>) at file.h:323
> #18 0xffffffff80b5bd2c in sys_ioctl (td=0xfffff802f24ac000,
>     uap=0xfffff802f24ac538) at /usr/src/sys/kern/sys_generic.c:745
> #19 0xffffffff80f7b1c8 in amd64_syscall (td=0xfffff802f24ac000, traced=0)
>     at subr_syscall.c:132
> #20 0xffffffff80f5a8ed in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #21 0x0000000801fb94aa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
>
> ------------------------------------------------------------------------
> Panic while idle; it seems cron jobs triggered ZFS ARC cleanup.
>
> #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> 230     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
> #1  0xffffffff80af95fb in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:383
> #2  0xffffffff80af9a21 in vpanic (fmt=<value optimized out>,
>     ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
> #3  0xffffffff80af9863 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:707
> #4  0xffffffff80f7b13f in trap_fatal (frame=0xfffffe081ee186e0, eva=201697507)
>     at /usr/src/sys/amd64/amd64/trap.c:877
> #5  0xffffffff80f7b199 in trap_pfault (frame=0xfffffe081ee186e0, usermode=0)
>     at pcpu.h:230
> #6  0xffffffff80f7a974 in trap (frame=0xfffffe081ee186e0)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5a5bc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff80ad596e in free (addr=0xfffff802472af200,
>     mtp=0xffffffff825bfc00) at /usr/src/sys/kern/kern_malloc.c:583
> #9  0xffffffff8232a667 in zfs_inactive (vp=<value optimized out>,
>     cr=<value optimized out>, ct=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4333
> #10 0xffffffff82332a1d in zfs_freebsd_inactive (ap=<value optimized out>)
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:5364
> #11 0xffffffff810ff6b2 in VOP_INACTIVE_APV (vop=<value optimized out>,
>     a=0xfffffe081ee18858) at vnode_if.c:1955
> #12 0xffffffff80bbd7bc in vinactive (vp=0xfffff803ae8b3760,
>     td=0xfffff803ae23b620) at vnode_if.h:807
> #13 0xffffffff80bbdcc7 in vputx (vp=0xfffff803ae8b3760, func=1)
>     at /usr/src/sys/kern/vfs_subr.c:2688
> #14 0xffffffff80bc5180 in sys_fchdir (td=0xfffff803ae23b620,
>     uap=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:724
> #15 0xffffffff80f7c1c8 in amd64_syscall (td=0xfffff803ae23b620, traced=0)
>     at subr_syscall.c:132
> #16 0xffffffff80f5ae9d in fast_syscall_common ()
>     at /usr/src/sys/amd64/amd64/exception.S:494
> #17 0x00000008008a99aa in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
> (kgdb)
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>
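
The five quoted backtraces fault in different places (ZFS ARC, VM object teardown, an rwlock, the HDA sound driver, and free()). A quick way to pull that out of a pile of core.txt files is to take the frame directly below calltrap in each one. This is only a rough sketch: the regular expression assumes the kgdb frame layout shown above, and the function name is invented for illustration.

```python
import re

# Matches kgdb frames such as "#8  0xffffffff822950a8 in arc_change_state ()".
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+ in (\w+) \(")


def faulting_function(backtrace):
    """Return the function in the frame just below calltrap, or None."""
    frames = FRAME_RE.findall(backtrace)
    for (_, name), (_, next_name) in zip(frames, frames[1:]):
        if name == "calltrap":
            return next_name
    return None


if __name__ == "__main__":
    # Excerpt of the first quoted backtrace (frames #6 to #8).
    excerpt = """\
> #6  0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
>     at /usr/src/sys/amd64/amd64/trap.c:415
> #7  0xffffffff80f5bccc in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:231
> #8  0xffffffff822950a8 in arc_change_state ()
>     at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
"""
    print(faulting_function(excerpt))
```

The leading "> " quote markers do not matter, since the pattern is searched for anywhere in the text.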


