Date: Wed, 12 Jul 2017 22:12:25 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 220693] head -r320570 & -r320760 (e.g.): ufs snapshot creation broken & leads to fsck -B related SSD-trim "freeing free block" panics; more
Message-ID: <bug-220693-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220693

Bug ID: 220693
Summary: head -r320570 & -r320760 (e.g.): ufs snapshot creation broken & leads to fsck -B related SSD-trim "freeing free block" panics; more
Product: Base System
Version: CURRENT
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: markmi@dsl-only.net

See also the exchange of list submittals associated with:

https://lists.freebsd.org/pipermail/freebsd-current/2017-July/066505.html

and:

https://lists.freebsd.org/pipermail/freebsd-current/2017-July/066508.html

I freely quote material from these without attribution here. . .

Basic context material . . .

As I remember, the reporting folks happened to be using non-debug/non-invariant kernel builds. Multiple TARGET_ARCH's were involved: 32-bit and 64-bit, little-endian and big-endian.

The basic create-snapshot test that fails:

After a short pause with disk activity, the same sorts of errors are logged when using "mksnap_ffs /.snap2" where .snap2 did not previously exist

The type of messages was (e.g.):

g_vfs_done():ada0s3a[READ(offset=6050375794688, length=32768)]error = 5
Jul 7 00:10:24 toshi kernel

Note the huge offset: such is true of the messages in general. Also, the messages are from the kernel and its nmount related snapshot creation activity, not from the user-space program.

The original list-notice was about dump (and its snapshot creation) but the issue is not specific to dump.

fsck -B related panic material. . .

My original context for this: 32-bit powerpc.

<Prior failed multi-user boot from system problem leaves root (only) file system not marked clean so fsck -B will actually do something below>

boot -s (so: single user mode)

# The next 3 lines are the content of a generic, manually-run script.
mount -u /
mount -a -t ufs (but there is no other file system)
swapon -a (there is a swap partition)

# fsck -B

That "fsck -B" caused the same kinds of lines reported by Michael Butler, happening as fsck makes a snapshot for the background processing to use.

After the g_vfs_done lines was text like (typed in from an example camera picture):

** //.snap/fsck_snapshot
** Last Mount on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
Reclaimed: 0 directories, 1 files, 22680 fragments
780914 files, 4797127 used, 19552199 free (443479 frags, 3288590 blocks, 1.8% fragmentation)

***** FILE SYSTEM MARKED CLEAN *****

But waiting a while always leads to a panic that looks like the following example:

(Note: context is an SSD with trim enabled)
(typed in from camera picture)

panic: ffs_blkfree_cq: freeing free block
cpuid = 2 (varies, of course)
time = (varies)
KDB: stack backtrace (stack addresses can vary: just an example here)
0xd23b17e0: at kdb_backtrace+0x5c
0xd23b1850: at vpanic+0x1e8
0xd23b18c0: at panic+0x54
0xd23b1910: at ffs_blkfree_cq+0x278
0xd23b1980: at ffs_blkfree_trim_task+0x60
0xd23b19b0: at taskqueue_run_locked+0x10
0xd23b1a10: at taskqueue_thread_loop+0x174
0xd23b1a50: at fork_exit+0xf4
0xd23b1a80: at fork_trampoline+0xc
KDB: enter: panic
[ thread pid 0 tid 1000082 ]
Stopped at kdb_enter+0x70: addi r0,r0,0x0

I've tried this on a powerpc64 and it works the same, complete with the "freeing free block" issue.

I've also had the problem with a normal multi-user boot that initiated a fsck -B automatically in a context where the SSD had not been marked clean.

To avoid this and fix such file systems I've been booting with "boot -s" and using "fsck -F" from the single-user command prompt.
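
For screening logs for g_vfs_done lines like the one quoted earlier, a minimal Python sketch follows. It assumes the message layout matches that single quoted example. The names parse_g_vfs_done and beyond_media are hypothetical, and so is the 1 TiB media size in the usage line, since the partition size is not stated in this report.

```python
import re

# Layout assumed from the one example line quoted in this report.
G_VFS_DONE = re.compile(
    r"g_vfs_done\(\):(?P<dev>[^\[]+)"
    r"\[(?P<op>\w+)\(offset=(?P<offset>-?\d+), length=(?P<length>\d+)\)\]"
    r"error = (?P<errno>\d+)"
)


def parse_g_vfs_done(line):
    """Return the fields of a g_vfs_done message as a dict, or None if absent."""
    m = G_VFS_DONE.search(line)
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "errno": int(m.group("errno")),
    }


def beyond_media(rec, mediasize):
    """True if the request would end past mediasize bytes (caller-supplied)."""
    return rec["offset"] + rec["length"] > mediasize


# Usage: the example line from this report, checked against a
# hypothetical 1 TiB media size (not a figure taken from the report).
example = ("g_vfs_done():ada0s3a[READ(offset=6050375794688, "
           "length=32768)]error = 5")
rec = parse_g_vfs_done(example)
print(rec["dev"], rec["op"], rec["offset"], beyond_media(rec, 2**40))
```

The media size to compare against would have to come from the system under test; the check only flags requests that could not fit on a device of the given size.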

Unfortunately, two other problems with major consequences in my context limit the svn range that I can cover for this activity. The problem version ranges are:

-r319722 through -r320651 (fixed by -r320652)

(actually this is why I had originally used "boot -s" in what I report above: I could get to a shell prompt that way instead of crashing before any login prompt; the crashes left the file system in need of repair)

-r320509 through -r320561 (fixed by -r320570)

So I was using -r320570 to avoid one of the two problems, now with a trial patch for what was later fixed in -r320652.

I do not know if the problem was present back before -r319722 or before -r320509.

-- 
You are receiving this mail because:
You are the assignee for the bug.