Date: Sun, 26 Oct 2014 09:51:06 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 194606] New: filesystem deadlock on 10.1 and head when TRIM enabled at unmount after r268815, MFC of 268205
Message-ID: <bug-194606-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=194606

            Bug ID: 194606
           Summary: filesystem deadlock on 10.1 and head when TRIM enabled at
                    unmount after r268815, MFC of 268205
           Product: Base System
           Version: 10.1-RC2
          Hardware: Any
                OS: Any
            Status: Needs Triage
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: madpilot@FreeBSD.org
                CC: imp@FreeBSD.org

While performing some tests with nanobsd, FreeBSD 10.1-RC3 on alix hardware,
I discovered a lockup when unmounting filesystems.

This hardware is a small motherboard using a CF card as main storage. I
usually enable TRIM support on these.

NanoBSD mounts filesystems read-only, and I use scripts to mount/unmount
filesystems when changes need to be saved.

I have seen a deadlock when unmounting. With a debugging kernel I got this:

root@qtest:~ [0]# umount /cfg
panic: detach with active requests
KDB: stack backtrace:
db_trace_self_wrapper(c0968053,c08ea7f0,c2d48800,c23d6bc8,c0536a16,...) at db_trace_self_wrapper+0x2d/frame 0xc23d6b98
kdb_backtrace(c09639e1,c09fa7e8,c095761d,c23d6c54,c095761d,...) at kdb_backtrace+0x30/frame 0xc23d6c00
vpanic(c09fa682,100,c095761d,c23d6c54,c23d6c54,...) at vpanic+0x80/frame 0xc23d6c24
kassert_panic(c095761d,c09575b3,c2d7acc0,4c7,c2d7acc0,...) at kassert_panic+0xe9/frame 0xc23d6c48
g_detach(c2d7acc0,4,c095725c,1c2,c09c8d5c,...) at g_detach+0x1d3/frame 0xc23d6c64
g_wither_washer(c09f7df4,0,c0956544,124,0,...) at g_wither_washer+0x109/frame 0xc23d6c90
g_run_events(0,c23d6d08,c095d42a,3dc,0,...) at g_run_events+0x40/frame 0xc23d6ccc
fork_exit(c05c4e60,0,c23d6d08) at fork_exit+0x7f/frame 0xc23d6cf4
fork_trampoline() at fork_trampoline+0x8/frame 0xc23d6cf4
--- trap 0, eip = 0, esp = 0xc23d6d40, ebp = 0 ---
KDB: enter: panic
[ thread pid 12 tid 100006 ]
Stopped at      kdb_enter+0x3d: movl    $0,kdb_why
db>

I played around with ddb and discovered this:

db> show geom 0xc2e98b40
consumer: 0xc2e98b40
  class:    VFS (0xc09c8d5c)
  geom:     ffs.ada0s3 (0xc3293600)
  provider: ada0s3 (0xc2e7e200)
  access:   r0w0e0
  flags:    0x0030
  nstart:   19
  nend:     18

This shows nstart != nend, while g_detach asserts that they are equal.

Going up the chain of providers, I find that its providers also have
nstart - nend == 1:

db> show geom 0xc2e9b7c0
consumer: 0xc2e9b7c0
  class:    PART (0xc09c96b0)
  geom:     ada0 (0xc2e7e780)
  provider: ada0 (0xc2e7e500)
  access:   r2w0e0
  flags:    0x0030
  nstart:   1430
  nend:     1429

db> show geom 0xc2e7e500
provider: ada0 (0xc2e7e500)
  class:        DISK (0xc09c8890)
  geom:         ada0 (0xc2e7e580)
  mediasize:    4017807360
  sectorsize:   512
  stripesize:   0
  stripeoffset: 0
  access:       r2w0e0
  flags:        (0x0030)
  error:        0
  nstart:       2085
  nend:         2084
  consumer: 0xc2e9a700 (ada0), access=r0w0e0, flags=0x0030
  consumer: 0xc2e9b480 (ada0), access=r0w0e0, flags=0x0030
  consumer: 0xc2e9b7c0 (ada0), access=r2w0e0, flags=0x0030

Having no idea how to debug further, I started testing various revisions and
finally discovered that the commit that broke it is r268815, which MFCed
r268205.

Disabling TRIM on the filesystem also "fixes" the problem, which seems to
confirm that this change is involved.

Since this depends on hardware support for TRIM, I have been unable to
reproduce it in VirtualBox. I'm sorry I'm unable to produce a test case.

I'm CCing imp, who committed r268815, hoping he can provide some more insight
into this.

This also affects head, obviously.

I'm available for any further testing or information needed.

Thanks in advance.

-- 
You are receiving this mail because:
You are the assignee for the bug.
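
For readers following the ddb output above, the condition that g_detach trips over can be illustrated with a small user-space sketch. This is not the kernel code: the struct and function names (io_counters, request_start, request_done, requests_in_flight, detach_is_safe) are hypothetical stand-ins that only mirror the nstart/nend counters printed by "show geom", under the assumption that nstart counts requests issued and nend counts completions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters that "show geom" prints.
 * This is NOT the kernel's struct g_consumer or struct g_provider. */
struct io_counters {
    unsigned nstart;   /* requests issued */
    unsigned nend;     /* requests whose completion has come back */
};

/* A request is issued: count it as started. */
void request_start(struct io_counters *c)
{
    c->nstart++;
}

/* A request completes: count it as ended. */
void request_done(struct io_counters *c)
{
    c->nend++;
}

/* Number of requests issued but not yet completed. */
unsigned requests_in_flight(const struct io_counters *c)
{
    return c->nstart - c->nend;
}

/* Detaching is only safe when every issued request has completed,
 * i.e. when the two counters match. */
bool detach_is_safe(const struct io_counters *c)
{
    return c->nstart == c->nend;
}
```

With the values from the VFS consumer (nstart 19, nend 18), requests_in_flight() yields 1 and detach_is_safe() is false, which matches the "detach with active requests" panic: one request was started and its completion never came back.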