Date: Mon, 5 Sep 2016 21:21:42 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Harry Schmalzbauer <freebsd@omnilan.de>
Cc: Konstantin Belousov <kostikbel@gmail.com>, FreeBSD Stable <freebsd-stable@freebsd.org>, Mark Johnston <markj@freebsd.org>, "kib@FreeBSD.org" <kib@FreeBSD.org>
Subject: Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905]
Message-ID: <YTXPR01MB018948D19A0A9BB5FAC3D5BADDE60@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <57B8793E.4070004@omnilan.de>
References: <57A79E24.8000100@omnilan.de>
 <YQBPR01MB0401201977AEA8A803F27B23DD1A0@YQBPR01MB0401.CANPRD01.PROD.OUTLOOK.COM>
 <57A83C78.1070403@omnilan.de>
 <20160809060213.GA67664@raichu>
 <57A9A6C0.9060609@omnilan.de>
 <YTOPR01MB0412B2A08F1A3C1A3B2EB160DD1E0@YTOPR01MB0412.CANPRD01.PROD.OUTLOOK.COM>,
 <20160812123950.GO83214@kib.kiev.ua>
 <YTXPR01MB018919BE87B12E458144E218DD140@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>,
 <57B8793E.4070004@omnilan.de>

Harry Schmalzbauer <freebsd@omnilan.de> wrote:
>Regarding Rick Macklem's message of 18.08.2016 02:03 (localtime):
>> Kostik wrote:
>> [stuff snipped]
>>> insmntque() performs the cleanup on its own, and that default cleanup is
>>> not suitable for the situation. I think that insmntque1() would better fit
>>> your requirements; you need to move the common code into a helper. It
>>> seems that unionfs_ins_cached_vnode() cleanup could reuse it.
>> <https://lists.freebsd.org>
>> I've attached an updated patch (untested like the last one). This one creates a
>> custom version of insmntque_stddtr() that first calls unionfs_noderem() and then
>> does the same stuff as insmntque_stddtr(). This looks like it does the required
>> stuff (unionfs_noderem() is what the unionfs VOP_RECLAIM() does).
>> It switches the node back to using its own v_vnlock that is exclusively locked,
>> among other things.
>
>Thanks a lot, today I gave it a try.
>
>With this patch, one reproducible panic can still be easily triggered:
>I have directory A unionfs_mounted under directory B.
>Then I mount_unionfs the same directory A below another directory C.
>panic: __lockmgr_args: downgrade a recursed lockmgr nfs @
>/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905
>Result is this backtrace, hardly helpful I guess:
>
>#1  0xffffffff80ae5fd9 in kern_reboot (howto=260) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
>#2  0xffffffff80ae658b in vpanic (fmt=<value optimized out>, ap=<value
>optimized out>)
>    at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
>#3  0xffffffff80ae63c3 in panic (fmt=0x0) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
>#4  0xffffffff80ab7ab7 in __lockmgr_args (lk=<value optimized out>,
>flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value
>optimized out>,
>    pri=<value optimized out>, timo=<value optimized out>, file=<value
>optimized out>, line=<value optimized out>)
>    at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_lock.c:992
>#5  0xffffffff80ba510c in vop_stdlock (ap=<value optimized out>) at
>lockmgr.h:98
>#6  0xffffffff8111932d in VOP_LOCK1_APV (vop=<value optimized out>,
>a=<value optimized out>) at vnode_if.c:2087
>#7  0xffffffff80a18cfc in unionfs_lock (ap=0xfffffe007a3ba6a0) at
>vnode_if.h:859
>#8  0xffffffff8111932d in VOP_LOCK1_APV (vop=<value optimized out>,
>a=<value optimized out>) at vnode_if.c:2087
>#9  0xffffffff80bc9b93 in _vn_lock (vp=<value optimized out>,
>flags=66560, file=<value optimized out>, line=<value optimized out>) at
>vnode_if.h:859
>#10 0xffffffff80a18460 in unionfs_readdir (ap=<value optimized out>) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1531
>#11 0xffffffff81118ecf in VOP_READDIR_APV (vop=<value optimized out>,
>a=<value optimized out>) at vnode_if.c:1822
>#12 0xffffffff80bc6e3b in kern_getdirentries (td=<value optimized out>,
>fd=<value optimized out>, buf=0x800c3d000 <Address 0x800c3d000 out of
>bounds>,
>    count=<value optimized out>, basep=0xfffffe007a3ba980, residp=0x0)
>at vnode_if.h:758
>#13 0xffffffff80bc6bf8 in sys_getdirentries (td=0x0,
>uap=0xfffffe007a3baa40) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_syscalls.c:3940
>#14 0xffffffff80fad6b8 in amd64_syscall (td=<value optimized out>,
>traced=0) at subr_syscall.c:135
>#15 0xffffffff80f8feab in Xfast_syscall () at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:396
>#16 0x0000000000452eea in ?? ()
>Previous frame inner to this frame (corrupt stack?)

Ok, I finally got around to looking at this and the panic() looks like a pretty
straightforward bug in the unionfs code.
- In unionfs_readdir(), it does a vn_lock(..LK_UPGRADE) and then later in the
  code vn_lock(..LK_DOWNGRADE) if it did the upgrade. (At line#1531 as noted in the
  backtrace.)
- In unionfs_lock(), it sets LK_CANRECURSE when it is the rootvp and LK_EXCLUSIVE.
  (So it allows recursive acquisition in this case.)
--> Then it would call vn_lock(..LK_DOWNGRADE), which would panic if it has recursed.

Now, I'll admit unionfs_lock() is too obscure for me to understand, but...
Is it necessary to vn_lock(..LK_DOWNGRADE) or can unionfs_readdir() just return
with the vnode exclusively locked?
(It would be easy to change the code to avoid the vn_lock(..LK_DOWNGRADE) call
when it has done the vn_lock(..LK_EXCLUSIVE) after vn_lock(..LK_UPGRADE) fails.)

rick

>I ran your previous patch for some time.
>Similarly, mounting one directory below a 2nd mountpoint crashed the
>machine (forgot to config dumpdir, so can't compare backtrace with the
>current patch).
>Otherwise, at least with the previous patch, I haven't had any other
>panic for about one week.
>
>Thanks,
>
>-Harry