From: Meowthink
Date: Sun, 26 Aug 2018 19:20:00 +0800
Subject: Help diagnose my Ryzen build problem
To: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org

Hello all,

Recently I built a Ryzen system to run FreeBSD on.

CPU: AMD Ryzen 5 2400G with Radeon Vega Graphics (0x810f10)
Mobo: ASRock Fatal1ty AB350 Gaming-ITX/ac (with up-to-date BIOS with PinnaclePI-AM4_1.0.0.4, microcode 0x810100b)
Mem: 2x Crucial 16GB DDR4-2400 EUDIMM CL17 (ECC unregistered, but ECC doesn't actually work :( )

But the system is unstable: it can't last a few days even when it is nearly idle. It panics even at midnight. It almost always panics while or after I build something large. Surprisingly, I haven't encountered user program faults, bad binaries being built, etc., only panics.

Then I tried lots of BIOS settings, e.g. SMT, C6 idle current, underclocking the RAM, but none seemed to have any effect. The system passes memtest86 V7.5 without error, as well as various benchmarks under Windows, thus I think the problem is not in the hardware but in the software.

In the meantime, I realized that the rate of IRQs from xhci0 is too high: about 1998/s. I found [1] and tried to MFC r331665. It didn't fix the problem, but disabling the Bluetooth module did stop the IRQ storm. After that the system lasts much longer before it panics.
It can now compile the ports tree, build world, and scrub the zpool, all without annoying reboots.

So I assume this is related to [2]? I also tried cpuctl, binding all processes to CPUs 2-7. But the problem is still there, only the chance of a panic has become very low. It still panics occasionally, after idling for a week or after a few hours of stress. Stress seems to raise the chance of a panic, but it depends on the type of load: things like llvm always build, but gcc causes a panic every few passes.

The system was 11.2 but has since moved to stable/11 (currently r337906). I have the last 10 coredumps saved, but my kernel isn't compiled with debugging. So I'll put some backtraces from core.txt.? at the end.

I really want to eliminate this problem. Could someone guide me on how to track it down? What should I try next?

Best regards,
Meowthink

[1] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224886
[2] https://reviews.freebsd.org/D11780

Backtraces, newest to oldest:

------------------------------------------------------------------------
Panic while compiling gcc:

#0 doadump (textdump=) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081e962790, eva=18446735309538549504) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081e962790, usermode=0) at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081e962790) at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff822950a8 in arc_change_state () at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1800
#9 0xffffffff8229328b in arc_access () at time.h:145
#10 0xffffffff82296232 in arc_write_done (zio=0xfffff8065f886410) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:6169
#11 0xffffffff82334cbe in zio_done (zio=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:4032
#12 0xffffffff8233070c in zio_execute (zio=0xfffff8065f886410) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1768
#13 0xffffffff80b52cc4 in taskqueue_run_locked (queue=0xfffff8000d9e6e00) at /usr/src/sys/kern/subr_taskqueue.c:463
#14 0xffffffff80b53e28 in taskqueue_thread_loop (arg=) at /usr/src/sys/kern/subr_taskqueue.c:755
#15 0xffffffff80abd813 in fork_exit (callout=0xffffffff80b53d90, arg=0xfffff8000d967030, frame=0xfffffe081e962ac0) at /usr/src/sys/kern/kern_fork.c:1072
#16 0xffffffff80f5cc7e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:972
#17 0x0000000000000000 in ?? ()
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic when shutting down:

#0 doadump (textdump=) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081ed30700, eva=0) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081ed30700, usermode=0) at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081ed30700) at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80dfe4ad in vm_object_terminate (object=0xfffff805bf66d5a0) at /usr/src/sys/vm/vm_object.c:768
#9 0xffffffff80dfd0f8 in vm_object_deallocate (object=0x0) at /usr/src/sys/vm/vm_object.c:677
#10 0xffffffff80df3189 in _vm_map_unlock (map=, file=, line=) at /usr/src/sys/vm/vm_map.c:2939
#11 0xffffffff80df7be2 in vm_map_remove (map=0xfffff80018673000, start=4096, end=140737488351232) at /usr/src/sys/vm/vm_map.c:3137
#12 0xffffffff80df2e49 in vmspace_exit (td=0xfffff80039ec0620) at /usr/src/sys/vm/vm_map.c:337
#13 0xffffffff80ab72b9 in exit1 (td=0xfffff80039ec0620, rval=, signo=) at /usr/src/sys/kern/kern_exit.c:401
#14 0xffffffff80ab6ced in sys_sys_exit (td=, uap=) at /usr/src/sys/kern/kern_exit.c:180
#15 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff80039ec0620, traced=0) at subr_syscall.c:132
#16 0xffffffff80f5c5ad in fast_syscall_common () at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008028d034a in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while running only my single-threaded Python script:

#0 doadump (textdump=) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=) at pcpu.h:230
#1 0xffffffff80afa5fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80afaa21 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80afa863 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7c1a9 in trap_pfault (frame=0xfffffe081f635ec0, usermode=0) at pcpu.h:230
#6 0xffffffff80f7b984 in trap (frame=0xfffffe081f635ec0) at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5bccc in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798, v=) at /usr/src/sys/kern/kern_rwlock.c:977
#9 0xffffffff80bbca92 in bufobj_invalbuf (bo=, flags=1, slpflag=1017770744, slptimeo=) at /usr/src/sys/kern/vfs_subr.c:1609
#10 0xffffffff80bbf8be in vgonel (vp=0xfffff8053ca9f1d8) at /usr/src/sys/kern/vfs_subr.c:1655
#11 0xffffffff80bbbcc4 in vnlru_free_locked (count=1, mnt_op=0x0) at /usr/src/sys/kern/vfs_subr.c:1227
#12 0xffffffff80bbbe14 in getnewvnode_reserve (count=1) at /usr/src/sys/kern/vfs_subr.c:1287
#13 0xffffffff82327fb4 in zfs_zget (zfsvfs=0xfffff80076574000, obj_num=34941, zpp=0xfffffe081f6362a8) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1122
#14 0xffffffff823421ad in zfs_dirent_lookup (dzp=0xfffff804ff94e420, name=0xfffffe081f6363e0 "filename.ext", zpp=0xfffffe081f6362a8, flag=2) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:187
#15 0xffffffff82342267 in zfs_dirlook (dzp=0xfffff804ff94e420, name=, zpp=0xfffffe081f636360) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:238
#16 0xffffffff8235a4ef in zfs_lookup () at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1658
#17 0xffffffff8235ac1e in zfs_freebsd_lookup (ap=0xfffffe081f636548) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4956
#18 0xffffffff810fe89c in VOP_CACHEDLOOKUP_APV (vop=, a=0xfffffe081f636548) at vnode_if.c:195
#19 0xffffffff80ba8d56 in vfs_cache_lookup (ap=) at vnode_if.h:80
#20 0xffffffff810fe77c in VOP_LOOKUP_APV (vop=, a=0xfffffe081f636610) at vnode_if.c:127
#21 0xffffffff80bb2761 in lookup (ndp=0xfffffe081f636748) at vnode_if.h:54
#22 0xffffffff80bb1c29 in namei (ndp=0xfffffe081f636748) at /usr/src/sys/kern/vfs_lookup.c:448
#23 0xffffffff80bc8238 in kern_statat (td=0xfffff8013ba1b620, flag=, fd=-100, path=0x80332c910, pathseg=UIO_USERSPACE, sbp=0xfffffe081f636900, hook=0) at /usr/src/sys/kern/vfs_syscalls.c:2023
#24 0xffffffff80bc817d in sys_stat (td=, uap=0xfffff8013ba1bb58) at /usr/src/sys/kern/vfs_syscalls.c:1978
#25 0xffffffff80f7d1d8 in amd64_syscall (td=0xfffff8013ba1b620, traced=0) at subr_syscall.c:132
#26 0xffffffff80f5c5ad in fast_syscall_common () at /usr/src/sys/amd64/amd64/exception.S:494
#27 0x0000000801a5b9ca in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while using mplayer:

#0 doadump (textdump=) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=) at pcpu.h:230
#1 0xffffffff80af91cb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80af95f1 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80af9433 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7a13f in trap_fatal (frame=0xfffffe081f70d380, eva=0) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7a199 in trap_pfault (frame=0xfffffe081f70d380, usermode=0) at pcpu.h:230
#6 0xffffffff80f79974 in trap (frame=0xfffffe081f70d380) at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5a00c in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff8088c030 in hdac_stream_start (dev=, child=, dir=0, stream=1, buf=1889533952, blksz=2048, blkcnt=2) at /usr/src/sys/dev/sound/pci/hda/hdac.c:1927
#9 0xffffffff8088437d in hdaa_channel_start (ch=) at hdac_if.h:84
#10 0xffffffff80887e0d in hdaa_channel_trigger (obj=, data=0xfffff8007102c480, go=1) at /usr/src/sys/dev/sound/pci/hda/hdaa.c:2161
#11 0xffffffff80893b8e in chn_trigger (c=0xfffff80071058400, go=1) at channel_if.h:131
#12 0xffffffff8089751b in chn_notify (c=0xfffff80071058400, flags=) at /usr/src/sys/dev/sound/pcm/channel.c:2281
#13 0xffffffff808b697f in vchan_trigger (obj=, data=, go=1) at /usr/src/sys/dev/sound/pcm/vchan.c:171
#14 0xffffffff80893b8e in chn_trigger (c=0xfffff80071057c00, go=1) at channel_if.h:131
#15 0xffffffff8089de10 in dsp_ioctl (i_dev=, cmd=, arg=0xfffffe081f70d8d0 "\003", mode=, td=) at /usr/src/sys/dev/sound/pcm/dsp.c:1733
#16 0xffffffff809c5b38 in devfs_ioctl_f (fp=0xfffff802c5563c80, com=2147766288, data=0xfffffe081f70d8d0, cred=0xfffff8004c482500, td=0xfffff802f24ac000) at /usr/src/sys/fs/devfs/devfs_vnops.c:791
#17 0xffffffff80b5c00d in kern_ioctl (td=0xfffff802f24ac000, fd=51, com=2147766288, data=) at file.h:323
#18 0xffffffff80b5bd2c in sys_ioctl (td=0xfffff802f24ac000, uap=0xfffff802f24ac538) at /usr/src/sys/kern/sys_generic.c:745
#19 0xffffffff80f7b1c8 in amd64_syscall (td=0xfffff802f24ac000, traced=0) at subr_syscall.c:132
#20 0xffffffff80f5a8ed in fast_syscall_common () at /usr/src/sys/amd64/amd64/exception.S:494
#21 0x0000000801fb94aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)

------------------------------------------------------------------------
Panic while idle; it seems cron jobs triggered ZFS ARC cleanup:

#0 doadump (textdump=) at pcpu.h:230
230 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0 doadump (textdump=) at pcpu.h:230
#1 0xffffffff80af95fb in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:383
#2 0xffffffff80af9a21 in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:776
#3 0xffffffff80af9863 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:707
#4 0xffffffff80f7b13f in trap_fatal (frame=0xfffffe081ee186e0, eva=201697507) at /usr/src/sys/amd64/amd64/trap.c:877
#5 0xffffffff80f7b199 in trap_pfault (frame=0xfffffe081ee186e0, usermode=0) at pcpu.h:230
#6 0xffffffff80f7a974 in trap (frame=0xfffffe081ee186e0) at /usr/src/sys/amd64/amd64/trap.c:415
#7 0xffffffff80f5a5bc in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8 0xffffffff80ad596e in free (addr=0xfffff802472af200, mtp=0xffffffff825bfc00) at /usr/src/sys/kern/kern_malloc.c:583
#9 0xffffffff8232a667 in zfs_inactive (vp=, cr=, ct=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4333
#10 0xffffffff82332a1d in zfs_freebsd_inactive (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:5364
#11 0xffffffff810ff6b2 in VOP_INACTIVE_APV (vop=, a=0xfffffe081ee18858) at vnode_if.c:1955
#12 0xffffffff80bbd7bc in vinactive (vp=0xfffff803ae8b3760, td=0xfffff803ae23b620) at vnode_if.h:807
#13 0xffffffff80bbdcc7 in vputx (vp=0xfffff803ae8b3760, func=1) at /usr/src/sys/kern/vfs_subr.c:2688
#14 0xffffffff80bc5180 in sys_fchdir (td=0xfffff803ae23b620, uap=) at /usr/src/sys/kern/vfs_syscalls.c:724
#15 0xffffffff80f7c1c8 in amd64_syscall (td=0xfffff803ae23b620, traced=0) at subr_syscall.c:132
#16 0xffffffff80f5ae9d in fast_syscall_common () at /usr/src/sys/amd64/amd64/exception.S:494
#17 0x00000008008a99aa in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal
(kgdb)
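
Since every backtrace above follows the same pattern (trap_fatal, trap_pfault, trap, calltrap, then the code that actually faulted), a small script could help compare the saved core.txt.? files side by side. The following is only a minimal sketch, assuming Python 3 and that the kgdb output keeps the "#N 0x... in function (" frame layout shown above; faulting_frame is a hypothetical helper name, not part of kgdb or FreeBSD. It reports the first frame after calltrap and the eva value from trap_fatal in hexadecimal.

```python
import re

# One kgdb frame, e.g.
#   #8 0xffffffff822950a8 in arc_change_state () at .../arc.c:1800
# Frames without an address (such as "#0 doadump") are skipped on purpose.
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+in\s+(\w+)\s*\(")

# The faulting address that trap_fatal() receives, printed in decimal by kgdb.
EVA_RE = re.compile(r"\beva=(\d+)")


def faulting_frame(backtrace):
    """Return (function, eva_hex) for one kgdb backtrace.

    function is the name in the first frame after calltrap, i.e. the
    code that was running when the trap was taken. eva_hex is the eva
    argument converted to hexadecimal. Either value is None when the
    corresponding text is not found.
    """
    names = [name for _, name in FRAME_RE.findall(backtrace)]
    function = None
    if "calltrap" in names:
        idx = names.index("calltrap")
        if idx + 1 < len(names):
            function = names[idx + 1]
    match = EVA_RE.search(backtrace)
    eva_hex = hex(int(match.group(1))) if match else None
    return function, eva_hex


if __name__ == "__main__":
    # Shortened sample in the same layout as the backtraces above.
    sample = (
        "#4 0xffffffff80f7c14f in trap_fatal (frame=0xfffffe081f635ec0, eva=952) at trap.c:877\n"
        "#7 0xffffffff80f5bccc in calltrap () at exception.S:231\n"
        "#8 0xffffffff80af57ad in __rw_wlock_hard (c=0xfffff80016f8f798, v=) at kern_rwlock.c:977\n"
    )
    print(faulting_frame(sample))
```

Run over each saved backtrace, this gives one line per panic, which makes it easier to see whether the faults cluster in one subsystem or are scattered (ZFS, VM, VFS and sound in the dumps above).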