Date: Sun, 21 Feb 2021 18:51:38 +0000 From: bugzilla-noreply@freebsd.org To: fs@FreeBSD.org Subject: [Bug 244048] mksnap_ffs hangs machine (12.1 regression over 11.3) Message-ID: <bug-244048-3630-RlsBk1Xrgy@https.bugs.freebsd.org/bugzilla/> In-Reply-To: <bug-244048-3630@https.bugs.freebsd.org/bugzilla/> References: <bug-244048-3630@https.bugs.freebsd.org/bugzilla/>
next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=244048 --- Comment #6 from ml@netfence.it --- After some investigation this is what I found. (Notice I'm no kernel expert, so I just hope I'm not saying stupid things). The thread originating from mksnap_ffs is stuck in softdep_check_suspend, sleeping on mp->mnt_secondary_writes ("secwr" for userland utilities). Full backtrace: #0 sched_switch (td=0xfffff8000237e760, newtd=0xfffff8000212c760, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2143 #1 0xffffffff805a5294 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:452 #2 0xffffffff805f272b in sleepq_switch (wchan=0xfffff80004120a00, pri=119) at /usr/src/sys/kern/subr_sleepqueue.c:626 #3 0xffffffff805f25c3 in sleepq_wait (wchan=0xfffff80004120a00, pri=119) at /usr/src/sys/kern/subr_sleepqueue.c:705 #4 0xffffffff805a4a6b in _sleep (ident=0xfffff80004120a00, lock=<optimized out>, priority=631, wmesg=0xffffffff80921d1b "secwr", sbt=0, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:217 #5 0xffffffff807b6324 in softdep_check_suspend (mp=0xfffff80004120000, devvp=0xfffff800041ff780, softdep_depcnt=15, softdep_accdepcnt=50188, secondary_writes=1, secondary_accwrites=106) at /usr/src/sys/ufs/ffs/ffs_softdep.c:14299 #6 0xffffffff807c19a6 in ffs_sync (mp=0xfffff80004120000, waitfor=4) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:1620 #7 0xffffffff8067c8bf in vfs_write_suspend (mp=0xfffff80004120000, flags=0) at /usr/src/sys/kern/vfs_vnops.c:1864 #8 0xffffffff8079c0b9 in ffs_snapshot (mp=0xfffff80004120000, snapfile=0xfffff80004214780 "A\003\215\200\377\377\377\377\070\232\254\200\377\377\377\377 \b\361\035") at /usr/src/sys/ufs/ffs/ffs_snapshot.c:430 #9 0xffffffff807bfe5a in ffs_mount (mp=<unavailable>) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:479 #10 0xffffffff80661a54 in vfs_domount_update (td=0xfffff80080ac9401, vp=<optimized out>, fsflags=<optimized out>, optlist=<optimized out>) at /usr/src/sys/kern/vfs_mount.c:1037 #11 vfs_domount (td=0xfffff80080ac9401, fstype=<optimized out>, fspath=<optimized out>, fsflags=<optimized out>, optlist=0xfffffe000053aa38) at /usr/src/sys/kern/vfs_mount.c:1191 #12 0xffffffff80660b27 in vfs_donmount (td=0xfffff8000237e760, fsflags=2166784, fsoptions=0xfffff80004108600) at /usr/src/sys/kern/vfs_mount.c:726 #13 0xffffffff80660312 in sys_nmount (td=0xfffff8000237e760, uap=0xfffff8000237eb20) at /usr/src/sys/kern/vfs_mount.c:431 #14 0xffffffff808418b7 in syscallenter (td=0xfffff8000237e760) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:144 #15 amd64_syscall (td=0xfffff8000237e760, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1163 #16 <signal handler called> #17 0x00000008002dcb9a in ?? () I *think* it should be awakened by softdep_flush thread, in function process_worklist_item. Alas, this is stuck waiting for a buffer in bufspace_wait. Full backtrace: #0 sched_switch (td=0xfffff8000425f760, newtd=0xfffff800040df000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2143 #1 0xffffffff805a5294 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:452 #2 0xffffffff805f272b in sleepq_switch (wchan=0xffffffff80a0a8b8 <bdomain+33208>, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:626 #3 0xffffffff805f25c3 in sleepq_wait (wchan=0xffffffff80a0a8b8 <bdomain+33208>, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:705 #4 0xffffffff805a4a6b in _sleep (ident=0xffffffff80a0a8b8 <bdomain+33208>, lock=<optimized out>, priority=96, wmesg=0xffffffff808e5676 "newbuf", sbt=0, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:217 #5 0xffffffff8064f6ff in bufspace_wait (bd=0xffffffff80a02700 <bdomain>, vp=0xfffff800042145a0, gbflags=<optimized out>, slpflag=<optimized out>, slptimeo=<optimized out>) at /usr/src/sys/kern/vfs_bio.c:773 #6 0xffffffff8064bbfc in getnewbuf (vp=<optimized out>, slpflag=0, slptimeo=0, maxsize=32768, gbflags=0) at /usr/src/sys/kern/vfs_bio.c:3284 #7 0xffffffff80649155 in getblkx (vp=0xfffff800042145a0, blkno=<optimized out>, size=32768, slpflag=0, slptimeo=0, flags=0, bpp=0xfffffe00005853d8) at /usr/src/sys/kern/vfs_bio.c:4022 #8 0xffffffff8064b905 in getblk (vp=<unavailable>, blkno=<unavailable>, size=<unavailable>, slpflag=<unavailable>, slptimeo=<unavailable>, flags=<unavailable>) at /usr/src/sys/kern/vfs_bio.c:3802 #9 0xffffffff807c69b5 in readindir (vp=<unavailable>, lbn=<unavailable>, daddr=56528, bpp=0xfffffe0000585478) at /usr/src/sys/ufs/ufs/ufs_bmap.c:111 #10 0xffffffff807c6468 in ufs_bmaparray (vp=0xfffff800042145a0, bn=-6049804, bnp=0xfffffe0000585518, nbp=<optimized out>, runp=<optimized out>, runb=0x0) at /usr/src/sys/ufs/ufs/ufs_bmap.c:266 #11 0xffffffff807d3975 in ufs_strategy (ap=<optimized out>) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2309 #12 0xffffffff808b9911 in VOP_STRATEGY_APV (vop=0xffffffff80aca540 <ufs_vnodeops>, a=0xfffffe0000585570) at vnode_if.c:2279 #13 0xffffffff80647114 in VOP_STRATEGY (vp=<unavailable>, bp=0xfffffe000090b5c0) at ./vnode_if.h:940 #14 bufstrategy (bo=<optimized out>, bp=0xfffffe000090b5c0) at /usr/src/sys/kern/vfs_bio.c:4999 #15 0xffffffff80648b1e in bstrategy (bp=<optimized out>) at /usr/src/sys/sys/buf.h:419 #16 breadn_flags (vp=<optimized out>, blkno=<optimized out>, size=<optimized out>, rablkno=0x0, rabsize=0x0, cnt=0, cred=0x0, flags=0, ckhashfunc=0x0, bpp=0xfffffe0000585780) at /usr/src/sys/kern/vfs_bio.c:2181 #17 0xffffffff807982b8 in ffs_balloc_ufs2 (vp=<optimized out>, startoffset=<optimized out>, size=<optimized out>, cred=0xfffff8000211b000, flags=<optimized out>, bpp=0xfffffe0000585820) at /usr/src/sys/ufs/ffs/ffs_balloc.c:894 #18 0xffffffff8079fb22 in ffs_snapblkfree (fs=0xfffffe0011c76000, devvp=<optimized out>, bno=48416024, size=32768, inum=5, vtype=VREG, wkhd=0xfffffe0000585950) at /usr/src/sys/ufs/ffs/ffs_snapshot.c:1790 #19 0xffffffff80790d16 in ffs_blkfree (ump=0xfffff80004116800, fs=0xfffffe0011c76000, devvp=0xfffff800041ff780, bno=48416024, size=32768, inum=5, vtype=VREG, dephd=0xfffffe0000585950, key=2) at /usr/src/sys/ufs/ffs/ffs_alloc.c:2602 #20 0xffffffff807baa4f in indir_trunc (freework=0xfffff800048ed480, dbn=<optimized out>, lbn=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:8259 #21 0xffffffff807ba93b in indir_trunc (freework=0xfffff800048ed480, dbn=<optimized out>, lbn=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:8240 #22 0xffffffff807ad4f9 in handle_workitem_indirblk (freework=<optimized out>) at /usr/src/sys/ufs/ffs/ffs_softdep.c:7875 #23 handle_workitem_freeblocks (freeblks=0xfffff800048eda00, flags=512) at /usr/src/sys/ufs/ffs/ffs_softdep.c:7970 #24 0xffffffff807b5ba1 in process_worklist_item (mp=0xfffff80004120000, target=10, flags=512) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1806 #25 0xffffffff807a1e92 in softdep_process_worklist (mp=0xfffff80004120000, full=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1600 #26 0xffffffff807a580f in softdep_flush (addr=0xfffff80004120000) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1402 #27 0xffffffff80556f8c in fork_exit (callout=0xffffffff807a5720 <softdep_flush>, arg=0xfffff80004120000, frame=0xfffffe0000585c00) at /usr/src/sys/kern/kern_fork.c:1080 #28 <signal handler called> The above explains why mksnap_ffs is halted; however the whole machine is hanged. It seems almost any user thread (e.g. an "ls" *on a different filesystem*) is stuck in bufspace_wait. Neither buf_daemon thread, nor its child bufspace_daemon are stuck (they are running their normal loop). So far I wasn't able to pinpoint what changed between 11.4 and 12.1 to cause this. Any hint appreciated. -- You are receiving this mail because: You are the assignee for the bug.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-244048-3630-RlsBk1Xrgy>
