Date:      Sun, 26 Aug 2018 19:20:00 +0800
From:      Meowthink <meowthink@gmail.com>
To:        freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org
Subject:   Help diagnose my Ryzen build problem
Message-ID:  <CABnABoZA4DUOFfr7JdbbBAWxak3=ge6zX0HXtu1RffQH7tSb2Q@mail.gmail.com>

Hello all,

Recently I tried to build up a Ryzen system and run FreeBSD on it.
CPU:  AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
Mobo: ASRock Fatal1ty AB350 Gaming-ITX/ac (up-to-date BIOS with
PinnaclePI-AM4_1.0.0.4, microcode 0x810100b)
Mem:  2x Crucial 16GB DDR4-2400 EUDIMM CL17 (ECC unregistered, but ECC
doesn't actually work :( )

But the system is unstable: it can't last more than a few days, even
when nearly idle. It panics even at midnight. It almost always panics
while or after I build something large. Surprisingly, I haven't
encountered any user program faults, bad binaries, etc., only panics.

Then I tried lots of BIOS settings, e.g. SMT, C6 idle current, and
underclocking the RAM, but none seemed to have any effect.
It passes memtest86 V7.5 without errors, as well as various benchmarks
under Windows, so I think the problem is in software rather than
hardware.

In the meantime, I noticed that the rate of IRQs from xhci0 is too
high, about 1998/s. I found [1] and tried to MFC r331665. That didn't
fix the problem, but disabling the Bluetooth module did stop the IRQ
storm.
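
A rate like the one above can be measured by taking two snapshots of the
cumulative interrupt counters and dividing the difference by the interval.
The following is only a minimal sketch: it assumes the three-column
"interrupt / total / rate" layout printed by `vmstat -i`, and the source
names (such as `irq256: xhci0`) and all numbers in the sample snapshots are
made up for illustration, not taken from this machine.

```python
def parse_totals(text):
    """Map each interrupt source to its cumulative count.

    Assumes lines of the form '<source name>  <total>  <rate>'.
    """
    totals = {}
    for line in text.splitlines():
        # Split from the right so source names containing spaces stay intact.
        parts = line.rsplit(None, 2)
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            totals[parts[0]] = int(parts[1])
    return totals


def irq_rates(before, after, seconds):
    """Interrupts per second for each source present in both snapshots."""
    old, new = parse_totals(before), parse_totals(after)
    return {name: (new[name] - old[name]) / seconds
            for name in new if name in old}


# Made-up snapshots taken 10 seconds apart (not real output from this machine).
BEFORE = """\
interrupt                          total       rate
irq256: xhci0                     100000       1998
cpu0:timer                         50000       1000
"""
AFTER = """\
interrupt                          total       rate
irq256: xhci0                     119980       1998
cpu0:timer                         60000       1000
"""

for name, rate in sorted(irq_rates(BEFORE, AFTER, 10).items()):
    print(f"{name}: {rate:.0f}/s")
```

Diffing two snapshots gives the current rate, whereas the rate column of a
single snapshot is an average since boot and can hide a storm that started
recently.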

After that, the system lasts much longer before panicking. It can
finally compile the ports tree, build world, and scrub the zpool, all
without annoying reboots.
So I assumed this is related to [2], and also tried cpuctl, binding
all processes to CPUs 2-7.
But the problem is still there; only the chance has become very low. It
still panics occasionally, after idling for a week or after a few hours
of stress. Stress seems to raise the chance of a panic, but by how much
depends on the type of load.
Things like llvm always build, but gcc causes a panic every few
passes.

The system was on 11.2 but has since moved to stable/11 (currently
r337906). I have the last 10 coredumps saved, but my kernel isn't
compiled with debugging. So I'll put some backtraces from core.txt.? at
the end.

I really want to eliminate this problem. Could someone guide me on how
to figure out what is wrong? What should I try next?

Best regards,
Meowthink

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224886
[2] https://reviews.freebsd.org/D11780

Backtraces, newest to oldest:
------------------------------------------------------------------------
Panic while compiling gcc:

#0  doadump (textdump=<value optimized out>) at pcpu.h:230
230     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
#1  0xffffffff80afa5fb in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:383
#2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:707
#4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790,
    eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
#5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0)
    at pcpu.h:230
#6  0xffffffff80f7b984 in trap (frame=0xfffffe081e962790)
    at /usr/src/sys/amd64/amd64/trap.c:415
#7  0xffffffff80f5bccc in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff822950a8 in arc_change_state ()
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
#9  0xffffffff8229328b in arc_access () at time.h:145
#10 0xffffffff82296232 in arc_write_done (zio=0xfffff8065f886410)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6169
#11 0xffffffff82334cbe in zio_done (zio=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4032
#12 0xffffffff8233070c in zio_execute (zio=0xfffff8065f886410)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768
#13 0xffffffff80b52cc4 in taskqueue_run_locked (queue=0xfffff8000d9e6e00)
    at /usr/src/sys/kern/subr_taskqueue.c:463
#14 0xffffffff80b53e28 in taskqueue_thread_loop (arg=<value optimized out>)
    at /usr/src/sys/kern/subr_taskqueue.c:755
#15 0xffffffff80abd813 in fork_exit (
    callout=0xffffffff80b53d90 <taskqueue_thread_loop>,
    arg=0xfffff8000d967030, frame=0xfffffe081e962ac0)
    at /usr/src/sys/kern/kern_fork.c:1072
#16 0xffffffff80f5cc7e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:972
#17 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while shutting down:

#0  doadump (textdump=<value optimized out>) at pcpu.h:230
230     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
#1  0xffffffff80afa5fb in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:383
#2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:707
#4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081ed30700, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:877
#5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081ed30700, usermode=0)
    at pcpu.h:230
#6  0xffffffff80f7b984 in trap (frame=0xfffffe081ed30700)
    at /usr/src/sys/amd64/amd64/trap.c:415
#7  0xffffffff80f5bccc in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff80dfe4ad in vm_object_terminate (object=0xfffff805bf66d5a0)
    at /usr/src/sys/vm/vm_object.c:768
#9  0xffffffff80dfd0f8 in vm_object_deallocate (object=0x0)
    at /usr/src/sys/vm/vm_object.c:677
#10 0xffffffff80df3189 in _vm_map_unlock (map=<value optimized out>,
    file=<value optimized out>, line=<value optimized out>)
    at /usr/src/sys/vm/vm_map.c:2939
#11 0xffffffff80df7be2 in vm_map_remove (map=0xfffff80018673000, start=4096,
    end=140737488351232) at /usr/src/sys/vm/vm_map.c:3137
#12 0xffffffff80df2e49 in vmspace_exit (td=0xfffff80039ec0620)
    at /usr/src/sys/vm/vm_map.c:337
#13 0xffffffff80ab72b9 in exit1 (td=0xfffff80039ec0620,
    rval=<value optimized out>, signo=<value optimized out>)
    at /usr/src/sys/kern/kern_exit.c:401
#14 0xffffffff80ab6ced in sys_sys_exit (td=<value optimized out>,
    uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:180
#15 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff80039ec0620, traced=0)
    at subr_syscall.c:132
#16 0xffffffff80f5c5ad in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008028d034a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while running only my single-threaded Python script:

#0  doadump (textdump=<value optimized out>) at pcpu.h:230
230     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
#1  0xffffffff80afa5fb in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:383
#2  0xffffffff80afaa21 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3  0xffffffff80afa863 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:707
#4  0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952)
    at /usr/src/sys/amd64/amd64/trap.c:877
#5  0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081f635ec0, usermode=0)
    at pcpu.h:230
#6  0xffffffff80f7b984 in trap (frame=0xfffffe081f635ec0)
    at /usr/src/sys/amd64/amd64/trap.c:415
#7  0xffffffff80f5bccc in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798,
    v=<value optimized out>) at /usr/src/sys/kern/kern_rwlock.c:977
#9  0xffffffff80bbca92 in bufobj_invalbuf (bo=<value optimized out>, flags=1,
    slpflag=1017770744, slptimeo=<value optimized out>)
    at /usr/src/sys/kern/vfs_subr.c:1609
#10 0xffffffff80bbf8be in vgonel (vp=0xfffff8053ca9f1d8)
    at /usr/src/sys/kern/vfs_subr.c:1655
#11 0xffffffff80bbbcc4 in vnlru_free_locked (count=1, mnt_op=0x0)
    at /usr/src/sys/kern/vfs_subr.c:1227
#12 0xffffffff80bbbe14 in getnewvnode_reserve (count=1)
    at /usr/src/sys/kern/vfs_subr.c:1287
#13 0xffffffff82327fb4 in zfs_zget (zfsvfs=0xfffff80076574000, obj_num=34941,
    zpp=0xfffffe081f6362a8)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1122
#14 0xffffffff823421ad in zfs_dirent_lookup (dzp=0xfffff804ff94e420,
    name=0xfffffe081f6363e0 "filename.ext", zpp=0xfffffe081f6362a8, flag=2)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
#15 0xffffffff82342267 in zfs_dirlook (dzp=0xfffff804ff94e420,
    name=<value optimized out>, zpp=0xfffffe081f636360)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#16 0xffffffff8235a4ef in zfs_lookup ()
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
#17 0xffffffff8235ac1e in zfs_freebsd_lookup (ap=0xfffffe081f636548)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4956
#18 0xffffffff810fe89c in VOP_CACHEDLOOKUP_APV (vop=<value optimized out>,
    a=0xfffffe081f636548) at vnode_if.c:195
#19 0xffffffff80ba8d56 in vfs_cache_lookup (ap=<value optimized out>)
    at vnode_if.h:80
#20 0xffffffff810fe77c in VOP_LOOKUP_APV (vop=<value optimized out>,
    a=0xfffffe081f636610) at vnode_if.c:127
#21 0xffffffff80bb2761 in lookup (ndp=0xfffffe081f636748) at vnode_if.h:54
#22 0xffffffff80bb1c29 in namei (ndp=0xfffffe081f636748)
    at /usr/src/sys/kern/vfs_lookup.c:448
#23 0xffffffff80bc8238 in kern_statat (td=0xfffff8013ba1b620,
    flag=<value optimized out>, fd=-100,
    path=0x80332c910 <Address 0x80332c910 out of bounds>,
    pathseg=UIO_USERSPACE, sbp=0xfffffe081f636900, hook=0)
    at /usr/src/sys/kern/vfs_syscalls.c:2023
#24 0xffffffff80bc817d in sys_stat (td=<value optimized out>,
    uap=0xfffff8013ba1bb58) at /usr/src/sys/kern/vfs_syscalls.c:1978
#25 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff8013ba1b620, traced=0)
    at subr_syscall.c:132
#26 0xffffffff80f5c5ad in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:494
#27 0x0000000801a5b9ca in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while using mplayer

#0  doadump (textdump=<value optimized out>) at pcpu.h:230
230     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
#1  0xffffffff80af91cb in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:383
#2  0xffffffff80af95f1 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3  0xffffffff80af9433 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:707
#4  0xffffffff80f7a13f in trap_fatal (frame=0xfffffe081f70d380, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:877
#5  0xffffffff80f7a199 in trap_pfault (frame=0xfffffe081f70d380, usermode=0)
    at pcpu.h:230
#6  0xffffffff80f79974 in trap (frame=0xfffffe081f70d380)
    at /usr/src/sys/amd64/amd64/trap.c:415
#7  0xffffffff80f5a00c in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff8088c030 in hdac_stream_start (dev=<value optimized out>,
    child=<value optimized out>, dir=0, stream=1, buf=1889533952, blksz=2048,
    blkcnt=2) at /usr/src/sys/dev/sound/pci/hda/hdac.c:1927
#9  0xffffffff8088437d in hdaa_channel_start (ch=<value optimized out>)
    at hdac_if.h:84
#10 0xffffffff80887e0d in hdaa_channel_trigger (obj=<value optimized out>,
    data=0xfffff8007102c480, go=1)
    at /usr/src/sys/dev/sound/pci/hda/hdaa.c:2161
#11 0xffffffff80893b8e in chn_trigger (c=0xfffff80071058400, go=1)
    at channel_if.h:131
#12 0xffffffff8089751b in chn_notify (c=0xfffff80071058400,
    flags=<value optimized out>) at /usr/src/sys/dev/sound/pcm/channel.c:2281
#13 0xffffffff808b697f in vchan_trigger (obj=<value optimized out>,
    data=<value optimized out>, go=1)
    at /usr/src/sys/dev/sound/pcm/vchan.c:171
#14 0xffffffff80893b8e in chn_trigger (c=0xfffff80071057c00, go=1)
    at channel_if.h:131
#15 0xffffffff8089de10 in dsp_ioctl (i_dev=<value optimized out>,
    cmd=<value optimized out>, arg=0xfffffe081f70d8d0 "\003",
    mode=<value optimized out>, td=<value optimized out>)
    at /usr/src/sys/dev/sound/pcm/dsp.c:1733
#16 0xffffffff809c5b38 in devfs_ioctl_f (fp=0xfffff802c5563c80,
    com=2147766288, data=0xfffffe081f70d8d0, cred=0xfffff8004c482500,
    td=0xfffff802f24ac000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
#17 0xffffffff80b5c00d in kern_ioctl (td=0xfffff802f24ac000, fd=51,
    com=2147766288, data=<value optimized out>) at file.h:323
#18 0xffffffff80b5bd2c in sys_ioctl (td=0xfffff802f24ac000,
    uap=0xfffff802f24ac538) at /usr/src/sys/kern/sys_generic.c:745
#19 0xffffffff80f7b1c8 in amd64_syscall (td=0xfffff802f24ac000, traced=0)
    at subr_syscall.c:132
#20 0xffffffff80f5a8ed in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:494
#21 0x0000000801fb94aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while idle; it seems cron jobs triggered ZFS ARC cleanup.

#0  doadump (textdump=<value optimized out>) at pcpu.h:230
230     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=<value optimized out>) at pcpu.h:230
#1  0xffffffff80af95fb in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:383
#2  0xffffffff80af9a21 in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:776
#3  0xffffffff80af9863 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:707
#4  0xffffffff80f7b13f in trap_fatal (frame=0xfffffe081ee186e0, eva=201697507)
    at /usr/src/sys/amd64/amd64/trap.c:877
#5  0xffffffff80f7b199 in trap_pfault (frame=0xfffffe081ee186e0, usermode=0)
    at pcpu.h:230
#6  0xffffffff80f7a974 in trap (frame=0xfffffe081ee186e0)
    at /usr/src/sys/amd64/amd64/trap.c:415
#7  0xffffffff80f5a5bc in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff80ad596e in free (addr=0xfffff802472af200,
    mtp=0xffffffff825bfc00) at /usr/src/sys/kern/kern_malloc.c:583
#9  0xffffffff8232a667 in zfs_inactive (vp=<value optimized out>,
    cr=<value optimized out>, ct=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4333
#10 0xffffffff82332a1d in zfs_freebsd_inactive (ap=<value optimized out>)
    at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:5364
#11 0xffffffff810ff6b2 in VOP_INACTIVE_APV (vop=<value optimized out>,
    a=0xfffffe081ee18858) at vnode_if.c:1955
#12 0xffffffff80bbd7bc in vinactive (vp=0xfffff803ae8b3760,
    td=0xfffff803ae23b620) at vnode_if.h:807
#13 0xffffffff80bbdcc7 in vputx (vp=0xfffff803ae8b3760, func=1)
    at /usr/src/sys/kern/vfs_subr.c:2688
#14 0xffffffff80bc5180 in sys_fchdir (td=0xfffff803ae23b620,
    uap=<value optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:724
#15 0xffffffff80f7c1c8 in amd64_syscall (td=0xfffff803ae23b620, traced=0)
    at subr_syscall.c:132
#16 0xffffffff80f5ae9d in fast_syscall_common ()
    at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008008a99aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb)
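
The eva values in the trap_fatal frames above are printed in decimal, which
makes them hard to compare. The sketch below converts them to hex and
applies a rough classification. The thresholds are assumptions chosen for
illustration (upper half of the 64-bit address space treated as kernel, and
anything below one 4096-byte page treated as a NULL pointer plus a small
offset); this is a reading aid, not a diagnosis.

```python
# Fault addresses (eva) copied from the trap_fatal frames, newest to oldest.
EVAS = {
    "compiling gcc": 18446735309538549504,
    "shutting down": 0,
    "python script": 952,
    "mplayer": 0,
    "idle": 201697507,
}

KERNEL_HALF = 1 << 63   # assumption: upper half of the address space is kernel
PAGE_SIZE = 4096        # amd64 base page size


def classify(eva):
    """Very rough guess at what kind of pointer was dereferenced."""
    if eva >= KERNEL_HALF:
        return "kernel-range address"
    if eva < PAGE_SIZE:
        return "NULL pointer plus small offset"
    return "low address outside the kernel range"


for label, eva in EVAS.items():
    print(f"{label:15s} eva={eva:#018x}  {classify(eva)}")
```

Read this way, two of the five faults are plain NULL dereferences and a
third is a small offset from NULL, while the remaining two are an unmapped
kernel-range address and a low address that kernel code would not normally
dereference.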


