From owner-freebsd-fs@freebsd.org Wed Mar 31 08:12:59 2021
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 224292] processes are hanging in state ufs / possible deadlock in file system
Date: Wed, 31 Mar 2021 08:12:59 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: CURRENT
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: sigsys@gmail.com
X-Bugzilla-Status: Open
X-Bugzilla-Assigned-To: fs@FreeBSD.org
List-Id: Filesystems

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224292

--- Comment #18 from sigsys@gmail.com ---
(In reply to Konstantin Belousov from comment #17)

This sure seems to have helped. I was about to report that the problem is most
likely gone since it hadn't happened in a while (despite running kyua in a loop
for hours) after getting that patch series.

But then it happened again with chrome this time and I got a dump. Dunno if
running "sync" would have unwedged the whole thing since I made it panic
instead.

There were two threads from two processes looping and doing crazy I/O: a chrome
process and a zsh process.

zsh thread backtrace:

#0  sched_switch (td=td@entry=0xfffffe00aa35ce00, flags=, flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2147
#1  0xffffffff80c1f4c9 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:542
#2  0xffffffff80c6f929 in sleepq_switch (wchan=wchan@entry=0xfffffe00097da0a8, pri=92, pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:608
#3  0xffffffff80c6f7fe in sleepq_wait (wchan=, pri=) at /usr/src/sys/kern/subr_sleepqueue.c:659
#4  0xffffffff80c1e9e6 in _sleep (ident=ident@entry=0xfffffe00097da0a8, lock=, lock@entry=0xfffffe000863b0c0, priority=priority@entry=92, wmesg=, sbt=sbt@entry=0, pr=pr@entry=0, flags=256) at /usr/src/sys/kern/kern_synch.c:221
#5  0xffffffff80cd5214 in bwait (bp=0xfffffe00097da0a8, pri=92 '\\', wchan=) at /usr/src/sys/kern/vfs_bio.c:5020
#6  bufwait (bp=bp@entry=0xfffffe00097da0a8) at /usr/src/sys/kern/vfs_bio.c:4433
#7  0xffffffff80cd285a in bufwrite (bp=0xfffffe00097da0a8, bp@entry=) at /usr/src/sys/kern/vfs_bio.c:2305
#8  0xffffffff80f01789 in bwrite (bp=) at /usr/src/sys/sys/buf.h:430
#9  ffs_update (vp=vp@entry=0xfffff80004c61380, waitfor=waitfor@entry=1) at /usr/src/sys/ufs/ffs/ffs_inode.c:204
#10 0xffffffff80f2f98a in ffs_syncvnode (vp=vp@entry=0xfffff80004c61380, waitfor=, waitfor@entry=1, flags=, flags@entry=0) at /usr/src/sys/ufs/ffs/ffs_vnops.c:447
#11 0xffffffff80f0f91d in softdep_prelink (dvp=dvp@entry=0xfffff80004c61380, vp=vp@entry=0x0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:3417
#12 0xffffffff80f3fee3 in ufs_makeinode (mode=33188, dvp=0xfffff80004c61380, vpp=0xfffffe00aae0a9d8, cnp=, callfunc=) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2741
#13 0xffffffff80f3bfa4 in ufs_create (ap=0xfffffe00aae0a8a8) at /usr/src/sys/ufs/ufs/ufs_vnops.c:213
#14 0xffffffff8118a31d in VOP_CREATE_APV (vop=0xffffffff81b63158 , a=a@entry=0xfffffe00aae0a8a8) at vnode_if.c:244
#15 0xffffffff80d15233 in VOP_CREATE (dvp=, vpp=0xfffffe00aae0a9d8, cnp=0xfffffe00aae0aa00, vap=0xfffffe00aae0a7f0) at ./vnode_if.h:133
#16 vn_open_cred (ndp=ndp@entry=0xfffffe00aae0a968, flagp=flagp@entry=0xfffffe00aae0aa94, cmode=cmode@entry=420, vn_open_flags=, vn_open_flags@entry=0, cred=0xfffff80048d42e00, fp=0xfffff8010aeabc30) at /usr/src/sys/kern/vfs_vnops.c:285
#17 0xffffffff80d14f6d in vn_open (ndp=, ndp@entry=0xfffffe00aae0a968, flagp=, flagp@entry=0xfffffe00aae0aa94, cmode=, cmode@entry=420, fp=) at /usr/src/sys/kern/vfs_vnops.c:202
#18 0xffffffff80d08999 in kern_openat (td=0xfffffe00aa35ce00, fd=-100, path=0x8002fd420 , pathseg=UIO_USERSPACE, flags=34306, mode=) at /usr/src/sys/kern/vfs_syscalls.c:1142
#19 0xffffffff810c5803 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:205
#20 amd64_syscall (td=0xfffffe00aa35ce00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1156
#21
#22 0x00000008004f223a in ?? ()

chrome thread backtrace:

#0  cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1475
#1  0xffffffff8108afe9 in ipi_nmi_handler () at /usr/src/sys/x86/x86/mp_x86.c:1432
#2  0xffffffff810c4256 in trap (frame=0xfffffe0009848f30) at /usr/src/sys/amd64/amd64/trap.c:201
#3
#4  vtpci_legacy_notify_vq (dev=, queue=0, offset=16) at /usr/src/sys/dev/virtio/pci/virtio_pci_legacy.c:485
#5  0xffffffff80a45417 in VIRTIO_BUS_NOTIFY_VQ (dev=0xfffff8000362fb00, queue=0, offset=16) at ./virtio_bus_if.h:144
#6  vq_ring_notify_host (vq=0xfffffe0063e27000) at /usr/src/sys/dev/virtio/virtqueue.c:834
#7  virtqueue_notify (vq=0xfffffe0063e27000, vq@entry=0xfffff8004de6f600) at /usr/src/sys/dev/virtio/virtqueue.c:439
#8  0xffffffff80a538c0 in vtblk_startio (sc=sc@entry=0xfffff8000362f100) at /usr/src/sys/dev/virtio/block/virtio_blk.c:1123
#9  0xffffffff80a53bed in vtblk_strategy (bp=0xfffff8004de6f600) at /usr/src/sys/dev/virtio/block/virtio_blk.c:571
#10 0xffffffff80b4bcfc in g_disk_start (bp=) at /usr/src/sys/geom/geom_disk.c:473
#11 0xffffffff80b4f147 in g_io_request (bp=0xfffff80021d33c00, cp=, cp@entry=0xfffff8000398ce80) at /usr/src/sys/geom/geom_io.c:589
#12 0xffffffff80b5b1a9 in g_part_start (bp=0xfffff8004e974900) at /usr/src/sys/geom/part/g_part.c:2332
#13 0xffffffff80b4f147 in g_io_request (bp=0xfffff8004e974900, cp=) at /usr/src/sys/geom/geom_io.c:589
#14 0xffffffff80cd284c in bstrategy (bp=0xfffffe0008ac5388) at /usr/src/sys/sys/buf.h:442
#15 bufwrite (bp=0xfffffe0008ac5388) at /usr/src/sys/kern/vfs_bio.c:2302
#16 0xffffffff80f01789 in bwrite (bp=0x0) at /usr/src/sys/sys/buf.h:430
#17 ffs_update (vp=vp@entry=0xfffff80139495000, waitfor=waitfor@entry=1) at /usr/src/sys/ufs/ffs/ffs_inode.c:204
#18 0xffffffff80f2f98a in ffs_syncvnode (vp=vp@entry=0xfffff80139495000, waitfor=, waitfor@entry=1, flags=, flags@entry=0) at /usr/src/sys/ufs/ffs/ffs_vnops.c:447
#19 0xffffffff80f0f86f in softdep_prelink (dvp=dvp@entry=0xfffff80139495000, vp=vp@entry=0xfffff8013c8328c0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:3417
#20 0xffffffff80f3d797 in ufs_remove (ap=0xfffffe00aabdfa20) at /usr/src/sys/ufs/ufs/ufs_vnops.c:1011
#21 0xffffffff8118bf90 in VOP_REMOVE_APV (vop=0xffffffff81b63158 , a=a@entry=0xfffffe00aabdfa20) at vnode_if.c:1540
#22 0xffffffff80d0a468 in VOP_REMOVE (dvp=0x0, vp=0xfffff8013c8328c0, cnp=) at ./vnode_if.h:802
#23 kern_funlinkat (td=0xfffffe00aa6e3100, dfd=dfd@entry=-100, path=0x8288d40e0 , fd=, fd@entry=-200, pathseg=pathseg@entry=UIO_USERSPACE, flag=, flag@entry=0, oldinum=0) at /usr/src/sys/kern/vfs_syscalls.c:1927
#24 0xffffffff80d0a138 in sys_unlink (td=0xfffff8000362fb00, uap=) at /usr/src/sys/kern/vfs_syscalls.c:1808
#25 0xffffffff810c5803 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:205
#26 amd64_syscall (td=0xfffffe00aa6e3100, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1156
#27
#28 0x000000080e40d17a in ?? ()

syncer backtrace:

#0  sched_switch (td=td@entry=0xfffffe00a5e29100, flags=, flags@entry=260) at /usr/src/sys/kern/sched_ule.c:2147
#1  0xffffffff80c1f4c9 in mi_switch (flags=flags@entry=260) at /usr/src/sys/kern/kern_synch.c:542
#2  0xffffffff80c6f929 in sleepq_switch (wchan=wchan@entry=0xffffffff81fa9550 , pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:608
#3  0xffffffff80c6fe3b in sleepq_timedwait (wchan=wchan@entry=0xffffffff81fa9550 , pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:690
#4  0xffffffff80ba34b0 in _cv_timedwait_sbt (cvp=0xffffffff81fa9550 , lock=0xffffffff81fa9520 , sbt=, pr=, pr@entry=0, flags=0, flags@entry=256) at /usr/src/sys/kern/kern_condvar.c:312
#5  0xffffffff80d036dc in sched_sync () at /usr/src/sys/kern/vfs_subr.c:2739
#6  0xffffffff80bcb9a0 in fork_exit (callout=0xffffffff80d03090 , arg=0x0, frame=0xfffffe006a491c00) at /usr/src/sys/kern/kern_fork.c:1077
#7

It seems like some kind of livelock involving ERELOOKUP loops. I can only guess
though; softupdates is way too complicated for me.

That's with cb0dd7e122b8936ad61a141e65ef8ef874bfebe5 merged. This kernel has
some local changes and I'm a little bit worried that they might be the problem,
but I think it's unlikely. The problem happens pretty rarely and that's the
only -CURRENT install on UFS that I'm working with, so that's the best that
I've got.

That's with a virtio disk backed by a ZFS volume on bhyve, BTW.