Date: Tue, 07 Mar 2017 19:44:19 +0100
From: Harry Schmalzbauer <freebsd@omnilan.de>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Konstantin Belousov <kostikbel@gmail.com>, "kib@FreeBSD.org" <kib@freebsd.org>, Mark Johnston <markj@freebsd.org>, FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905]
Message-ID: <58BEFF83.9010906@omnilan.de>
In-Reply-To: <58BEAAAC.4090303@omnilan.de>
References: <57A79E24.8000100@omnilan.de>
 <YQBPR01MB0401201977AEA8A803F27B23DD1A0@YQBPR01MB0401.CANPRD01.PROD.OUTLOOK.COM>
 <57A83C78.1070403@omnilan.de>
 <20160809060213.GA67664@raichu>
 <57A9A6C0.9060609@omnilan.de>
 <YTOPR01MB0412B2A08F1A3C1A3B2EB160DD1E0@YTOPR01MB0412.CANPRD01.PROD.OUTLOOK.COM>,
 <20160812123950.GO83214@kib.kiev.ua>
 <YTXPR01MB018919BE87B12E458144E218DD140@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>,
 <57B8793E.4070004@omnilan.de>
 <YTXPR01MB018948D19A0A9BB5FAC3D5BADDE60@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
 <58BEAAAC.4090303@omnilan.de>

Regarding Harry Schmalzbauer's message of 07.03.2017 13:42 (localtime):
…
> Something ufs related seems to have tightened the unionfs locking
> problem in stable/11. Now the machine instantaneously panics during
> boot after mounting root with Rick's latest patch.
>
> Unfortunately I don't have SWAP available on that machine (yet), but
> maybe this is a hint for anybody.
>
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00982220e0
> vpanic() at vpanic+0x186/frame 0xfffffe0098222160
> kassert_panic() at kassert_panic+0x126/frame 0xfffffe00982221d0
> witness_assert() at witness_assert+0x35a/frame 0xfffffe0098222230
> __lockmgr_args() at __lockmgr_args+0x517/frame 0xfffffe00982222d0
> vop_stdunlock() at vop_stdunlock+0x3b/frame 0xfffffe00982222f0
> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfffffe0098222320
> unionfs_unlock() at unionfs_unlock+0x112/frame 0xfffffe0098222390
> VOP_UNLOCK_APV() at VOP_UNLOCK_APV+0xe0/frame 0xfffffe00982223c0
> unionfs_nodeget() at unionfs_nodeget+0x3ef/frame 0xfffffe0098222470
> unionfs_domount() at unionfs_domount+0x518/frame 0xfffffe00982226b0
> vfs_donmount() at vfs_donmount+0xe37/frame 0xfffffe00982228f0
> sys_nmount() at sys_nmount+0x72/frame 0xfffffe0098222930
> amd64_syscall() at amd64_syscall+0x2f9/frame 0xfffffe0098222ab0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0098222ab0
> --- syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x80086ecea, rsp =
> 0x7fffffffe318, rbp = 0x7fffffffeca0 ---

New discovery: Rick's latest patch causes a panic only with KDB. If I
compile a kernel without WITNESS and KDB, the machine boots fine!
Also, it's at least not so easy anymore to trigger the deadlock :-) .
I need to do more testing, but so far Rick's approach seems very
promising :-) .

Unfortunately I can't provide a fix or a suggestion as to why the KDB
kernel panics and the non-KDB one doesn't, just the vague guess that the
additional locking checks (KASSERT?), which prevent more damage, are not
in place in the non-debug kernel.
So I guess I'm in dangerous waters, but it definitely is a highly
appreciated improvement for me and my very best bet for now (neither
eliminating unionfs nor holding off 11 updates were real options for me,
especially because unionfs doesn't really work well on 10.3 either; it
just doesn't lead to deadlocks in as many environments)!

I tried the non-debug kernel because I browsed old unionfs discussions
and, in desperation, gave Attilio Rao's patch a try, since I couldn't
remember why I hadn't kept it locally:
https://people.freebsd.org/~attilio/unionfs_nodeget4.patch
(he tried to solve unionfs problems for RELENG_9 back in 2012:
https://lists.freebsd.org/pipermail/freebsd-stable/2012-November/070358.html)

It's still true that his patch leads to a panic only with a debugging
kernel. The same patch without KDB allows the machine to boot and start
squid. But the result is the same as with plain r314856: the system
deadlocks reproducibly.
Also, the trace with his patch looks identical to the plain r314856
unionfs panic.

So I hope Rick or someone else can pick up the latest patch and polish
it to make KDB kernels happy :-)
I can offer a small donation if that helps!
Of course, I'll also provide KDB info if needed/helpful.

thanks,

-harry