From: Kostik Belousov
To: Attilio Rao
Date: Tue, 22 Jul 2008 20:05:40 +0300
Message-ID: <20080722170540.GA17123@deviant.kiev.zoral.com.ua>
References: <4884F992.7090008@cs.duke.edu> <20080722154825.GZ17123@deviant.kiev.zoral.com.ua> <3bbf2fe10807220954q60ee6747x40076e39884daf19@mail.gmail.com>
In-Reply-To: <3bbf2fe10807220954q60ee6747x40076e39884daf19@mail.gmail.com>
Cc: freebsd-current@freebsd.org, Andrew Gallatin
Subject: Re: reproducible "panic: share->excl"
List-Id: Discussions about the use of FreeBSD-current

On Tue, Jul 22, 2008 at 06:54:04PM +0200, Attilio Rao wrote:
> 2008/7/22, Kostik Belousov:
> > On Mon, Jul 21, 2008 at 05:03:14PM -0400, Andrew Gallatin wrote:
> > > I can panic today's -current reliably (or hang it with
> > > WITNESS/INVARIANTS disabled). When it crashes, I see
> > > the appended panic messages.
> > >
> > > It seems to be 100% reproducible on my box (AMD64 x2,
> > > 512MB ram, UFS2). If anybody savvy in this area would
> > > like to reproduce it, I've left the program at ~gallatin/ahunt.c
> > > on freefall. Compile it, and run it as:
> > > ./a.out -mmbfileinit -madvise=/var/tmp/zot -random -size=95536
> > > -touch=4096 -rewrite=2
> > >
> > >
> > > Cheers,
> > >
> > > Drew
> > >
> > > PS: Here is a serial console log from the panic:
> >
> > ...
> >
> >
> > > login: shared lock of (lockmgr) ufs @ kern/vfs_subr.c:2044
> > > while exclusively locked from kern/vfs_vnops.c:593
> > > panic: share->excl
> > > cpuid = 1
> > > KDB: enter: panic
> > > [thread pid 1702 tid 100149 ]
> > > Stopped at kdb_enter+0x3d: movq $0,0x639958(%rip)
> > > db> tr
> > > Tracing pid 1702 tid 100149 td 0xffffff000d08f000
> > > kdb_enter() at kdb_enter+0x3d
> > > panic() at panic+0x176
> > > witness_checkorder() at witness_checkorder+0x137
> > > __lockmgr_args() at __lockmgr_args+0xc74
> > > ffs_lock() at ffs_lock+0x8c
> > > VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
> > > _vn_lock() at _vn_lock+0x47
> > > vget() at vget+0x7b
> > > vnode_pager_lock() at vnode_pager_lock+0x146
> > > vm_fault() at vm_fault+0x1e2
> > > trap_pfault() at trap_pfault+0x128
> > > trap() at trap+0x395
> > > calltrap() at calltrap+0x8
> > > --- trap 0xc, rip = 0xffffffff8079f2bd, rsp = 0xfffffffe58c2f7b0, rbp = 0xfffffffe58c2f830 ---
> > > copyin() at copyin+0x3d
> > > ffs_write() at ffs_write+0x2f8
> > > VOP_WRITE_APV() at VOP_WRITE_APV+0x10b
> > > vn_write() at vn_write+0x23f
> > > dofilewrite() at dofilewrite+0x85
> > > kern_writev() at kern_writev+0x60
> > > write() at write+0x54
> > > syscall() at syscall+0x1dd
> > > Xfast_syscall() at Xfast_syscall+0xab
> > > --- syscall (4, FreeBSD ELF64, write), rip = 0x8007296ec, rsp = 0x7fffffffe158, rbp = 0x7fffffffe210 ---
> > > db> show locks
> > > exclusive sleep mutex vnode interlock r = 0 (0xffffff000d0dc0c0) locked @ vm/vnode_pager.c:1199
> > > exclusive sx user map r = 0 (0xffffff000d054360) locked @ vm/vm_map.c:3115
> > > exclusive lockmgr bufwait r = 0 (0xfffffffe5047f278) locked @ kern/vfs_bio.c:1783
> > > exclusive lockmgr ufs r = 0 (0xffffff000d0dc098) locked @ kern/vfs_vnops.c:593
> > > db>
> >
> >
> > Essentially, you tried to write part of a region mmapped from the
> > file back to that same file.
> > VOP_WRITE() is called with the vnode exclusively locked, while
> > the fault handler tries to lock the vnode in shared mode
> > to page in.
> >
> > The following change fixed it for me.
> > Attilio, would it make sense to consider LK_CANRECURSE | LK_SHARED as
> > a request for the exclusive lock when the current thread already holds
> > the exclusive lock instead? I think this would be a proper solution.
>
> I don't like this kind of magic and encoding in lockmgr.
> I think that the better thing to do here is to recurse the exclusive
> lock as you pass to vget().

It could be argued that lockmgr is black magic as a whole. On the other
hand, I had to use VOP_ISLOCKED() and manually construct the lock request,
while all the needed information is at hand inside lockmgr. Moreover, I
believe that doing an implicit shared->exclusive request upgrade in this
situation (exclusively locked by curthread, LK_CANRECURSE present) is right.

> Also note that without WITNESS the code will return EDEADLK in this
> case while traditionally what would have happened is that the lockmgr
> would have to be downgraded silently, but as you can expect this is a
> very dangerous practice.

Fully agree.
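
To make the proposed behaviour concrete, here is a minimal userland sketch of the decision being discussed. It is a toy model and not lockmgr code: every toy_* name, the flag values, and the result codes are hypothetical and chosen only for illustration. It assumes just what the thread states: a shared request by the thread that already holds the lock exclusively yields EDEADLK today, and under the proposal the same request with LK_CANRECURSE set would be satisfied by recursing on the exclusive lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags for this sketch; NOT the values from <sys/lockmgr.h>. */
enum {
	TOY_LK_SHARED     = 0x1,
	TOY_LK_EXCLUSIVE  = 0x2,
	TOY_LK_CANRECURSE = 0x4
};

/* Hypothetical outcomes of a lock request in the toy model. */
enum toy_result {
	TOY_GRANT_SHARED,	/* shared lock granted */
	TOY_GRANT_EXCL,		/* exclusive lock granted */
	TOY_RECURSE_EXCL,	/* owner recursed on its exclusive lock */
	TOY_EDEADLK,		/* owner asked again without LK_CANRECURSE */
	TOY_WOULD_BLOCK		/* another thread holds a conflicting lock */
};

struct toy_lock {
	const void *excl_owner;	/* thread holding it exclusively, or NULL */
	unsigned recurse;	/* extra exclusive acquisitions by the owner */
	unsigned sharers;	/* number of shared holders */
};

/*
 * Decide what happens when thread `td` requests the lock with `flags`.
 * Simplification: any self-request without TOY_LK_CANRECURSE is reported
 * as TOY_EDEADLK; the real lockmgr may behave differently for exclusive
 * self-requests.
 */
enum toy_result
toy_lock_request(struct toy_lock *lk, const void *td, int flags)
{
	assert(td != NULL);

	if (lk->excl_owner == td) {
		if (flags & TOY_LK_CANRECURSE) {
			/*
			 * The proposal: even a shared request is treated as
			 * an exclusive one and satisfied by recursion.
			 */
			lk->recurse++;
			return (TOY_RECURSE_EXCL);
		}
		return (TOY_EDEADLK);
	}
	if (lk->excl_owner != NULL)
		return (TOY_WOULD_BLOCK);
	if (flags & TOY_LK_EXCLUSIVE) {
		if (lk->sharers != 0)
			return (TOY_WOULD_BLOCK);
		lk->excl_owner = td;
		return (TOY_GRANT_EXCL);
	}
	lk->sharers++;
	return (TOY_GRANT_SHARED);
}
```

In this model the write() path corresponds to a first TOY_LK_EXCLUSIVE request, and the page fault inside copyin() corresponds to a second request by the same thread. With TOY_LK_SHARED alone the second request gets TOY_EDEADLK; with TOY_LK_SHARED | TOY_LK_CANRECURSE it gets TOY_RECURSE_EXCL, so the caller would not need to check VOP_ISLOCKED() and build the request by hand.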