From: Rick Macklem
To: Harry Schmalzbauer
CC: Konstantin Belousov, FreeBSD Stable, Mark Johnston, "kib@FreeBSD.org"
Subject: Re: unionfs bugs, a partial patch and some comments [Was: Re: 1-BETA3 Panic: __lockmgr_args: downgrade a recursed lockmgr nfs @ /usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905]
Date: Mon, 5 Sep 2016 21:21:42 +0000

Harry Schmalzbauer wrote:
>Regarding Rick Macklem's message of 18.08.2016 02:03 (localtime):
>> Kostik wrote:
>> [stuff snipped]
>>> insmntque() performs the cleanup on its own, and that default cleanup is not
>>> suitable for the situation.
>>> I think that insmntque1() would better fit your requirements; you need to move
>>> the common code into a helper. It seems that unionfs_ins_cached_vnode() cleanup
>>> could reuse it.
>>
>> I've attached an updated patch (untested like the last one). This one creates a
>> custom version of insmntque_stddtr() that first calls unionfs_noderem() and then
>> does the same stuff as insmntque_stddtr(). This looks like it does the required
>> stuff (unionfs_noderem() is what the unionfs VOP_RECLAIM() does).
>> It switches the node back to using its own v_vnlock that is exclusively locked,
>> among other things.
>
>Thanks a lot, today I gave it a try.
>
>With this patch, one reproducible panic can still be easily triggered:
>I have directory A unionfs_mounted under directory B.
>Then I mount_unionfs the same directory A below another directory C.
>panic: __lockmgr_args: downgrade a recursed lockmgr nfs @
>/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1905
>Result is this backtrace, hardly helpful I guess:
>
>#1 0xffffffff80ae5fd9 in kern_reboot (howto=260) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:366
>#2 0xffffffff80ae658b in vpanic (fmt=, ap=<value optimized out>)
>    at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:759
>#3 0xffffffff80ae63c3 in panic (fmt=0x0) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_shutdown.c:690
>#4 0xffffffff80ab7ab7 in __lockmgr_args (lk=,
>flags=, ilk=, wmesg=<value optimized out>,
>    pri=, timo=, file=<value optimized out>, line=)
>    at /usr/local/share/deploy-tools/RELENG_11/src/sys/kern/kern_lock.c:992
>#5 0xffffffff80ba510c in vop_stdlock (ap=) at
>lockmgr.h:98
>#6 0xffffffff8111932d in VOP_LOCK1_APV (vop=,
>a=) at vnode_if.c:2087
>#7 0xffffffff80a18cfc in unionfs_lock (ap=0xfffffe007a3ba6a0) at
>vnode_if.h:859
>#8 0xffffffff8111932d in VOP_LOCK1_APV (vop=,
>a=) at vnode_if.c:2087
>#9 0xffffffff80bc9b93 in _vn_lock (vp=,
>flags=66560, file=, line=) at
>vnode_if.h:859
>#10 0xffffffff80a18460 in unionfs_readdir (ap=) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/fs/unionfs/union_vnops.c:1531
>#11 0xffffffff81118ecf in VOP_READDIR_APV (vop=,
>a=) at vnode_if.c:1822
>#12 0xffffffff80bc6e3b in kern_getdirentries (td=,
>fd=, buf=0x800c3d000
><Address 0x800c3d000 out of bounds>,
>    count=, basep=0xfffffe007a3ba980, residp=0x0)
>at vnode_if.h:758
>#13 0xffffffff80bc6bf8 in sys_getdirentries (td=0x0,
>uap=0xfffffe007a3baa40) at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/kern/vfs_syscalls.c:3940
>#14 0xffffffff80fad6b8 in amd64_syscall (td=,
>traced=0) at subr_syscall.c:135
>#15 0xffffffff80f8feab in Xfast_syscall () at
>/usr/local/share/deploy-tools/RELENG_11/src/sys/amd64/amd64/exception.S:396
>#16 0x0000000000452eea in ?? ()
>Previous frame inner to this frame (corrupt stack?)

Ok, I finally got around to looking at this and the panic() looks like a pretty
straightforward bug in the unionfs code.
- In unionfs_readdir(), it does a vn_lock(..LK_UPGRADE) and then later in the code
  vn_lock(..LK_DOWNGRADE) if it did the upgrade. (At line#1531 as noted in the
  backtrace.)
- In unionfs_lock(), it sets LK_CANRECURSE when it is the rootvp and LK_EXCLUSIVE.
  (So it allows recursive acquisition in this case.)
--> Then it would call vn_lock(..LK_DOWNGRADE), which would panic if it has recursed.

Now, I'll admit unionfs_lock() is too obscure for me to understand, but...
Is it necessary to vn_lock(..LK_DOWNGRADE) or can unionfs_readdir() just return
with the vnode exclusively locked?
(It would be easy to change the code to avoid the vn_lock(..LK_DOWNGRADE) call when
it has done the vn_lock(..LK_EXCLUSIVE) after vn_lock(..LK_UPGRADE) fails.)

rick

>I ran your previous patch for some time.
>Similarly, mounting one directory below a 2nd mountpoint crashed the
>machine (forgot to config dumpdir, so can't compare backtrace with the
>current patch).
>Otherwise, at least with the previous patch, I haven't had any other
>panic for about one week.
>
>Thanks,
>
>-Harry
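
The change floated in the reply above (skip the downgrade when the exclusive lock came from the fallback path rather than from a successful upgrade) can be illustrated with a small user-space toy. This is only a sketch under stated assumptions: it is not the unionfs code and not the kernel's lockmgr. The names toy_lock, toy_get_exclusive(), toy_downgrade() and toy_finish() are hypothetical, the "upgrade" is assumed to succeed whenever the toy lock is held shared, and a return value of false stands in for the panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode lock; NOT the kernel's lockmgr. */
enum toy_state { TOY_UNLOCKED, TOY_SHARED, TOY_EXCLUSIVE };

struct toy_lock {
	enum toy_state state;
	int recursed;	/* extra exclusive acquisitions beyond the first */
};

/* How the caller ended up holding the lock exclusively. */
enum toy_how { TOY_HOW_UPGRADED, TOY_HOW_RELOCKED };

/*
 * Models "try an upgrade, fall back to a plain exclusive request".
 * Assumption: the upgrade works only from the shared state; any other
 * state takes the fallback, which recurses if already exclusive.
 */
static enum toy_how
toy_get_exclusive(struct toy_lock *lk)
{
	assert(lk != NULL);
	if (lk->state == TOY_SHARED) {
		lk->state = TOY_EXCLUSIVE;	/* upgrade succeeded */
		return (TOY_HOW_UPGRADED);
	}
	if (lk->state == TOY_EXCLUSIVE)
		lk->recursed++;			/* recursive acquisition */
	else
		lk->state = TOY_EXCLUSIVE;
	return (TOY_HOW_RELOCKED);
}

/* Returns false where the real lock would panic on a recursed downgrade. */
static bool
toy_downgrade(struct toy_lock *lk)
{
	assert(lk != NULL);
	if (lk->state != TOY_EXCLUSIVE || lk->recursed > 0)
		return (false);
	lk->state = TOY_SHARED;
	return (true);
}

/*
 * End of the modelled readdir. With always_downgrade set, this mimics
 * downgrading unconditionally; otherwise it downgrades only after a
 * real upgrade and leaves a relocked vnode exclusively locked.
 */
static bool
toy_finish(struct toy_lock *lk, enum toy_how how, bool always_downgrade)
{
	if (always_downgrade || how == TOY_HOW_UPGRADED)
		return (toy_downgrade(lk));
	return (true);
}
```

In this toy, starting from an exclusively held lock and finishing with always_downgrade set reaches the modelled panic, while finishing with it cleared returns with the lock still exclusive. Whether returning exclusively locked is acceptable to callers of the real unionfs_readdir() is exactly the open question in the message, and the toy does not answer it.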