From owner-freebsd-fs@FreeBSD.ORG Wed Jul 27 17:56:10 2011
Date: Wed, 27 Jul 2011 13:56:08 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: John Baldwin
Cc: Attilio Rao, freebsd-fs@freebsd.org
Message-ID: <1847245041.1083168.1311789368340.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <201107270840.57104.jhb@freebsd.org>
Subject: Re: Does msdosfs_readdir() require an exclusively locked vnode

John Baldwin wrote:
> On Tuesday, July 26, 2011 9:56:37 pm Attilio Rao wrote:
> > 2011/7/26 Kostik Belousov:
> > > On Tue, Jul 26, 2011 at 10:07:28AM -0400, Rick Macklem wrote:
> > >> Kostik Belousov wrote:
> > >> > On Mon, Jul 25, 2011 at 07:22:40PM -0400, Rick Macklem wrote:
> > >> > > Hi,
> > >> > >
> > >> > > Currently both NFS servers set the vnode lock LK_SHARED, and
> > >> > > so do the local syscalls (at least, that's how it looks by
> > >> > > inspection).
> > >> > >
> > >> > > Peter Holm just posted me this panic, where a test for an
> > >> > > exclusive vnode lock fails in msdosfs_readdir():
> > >> > >
> > >> > > KDB: stack backtrace:
> > >> > > db_trace_self_wrapper(c0efa6f6,c71627f8,c79230b0,c0f2ef29,f19154b8,...) at db_trace_self_wrapper+0x26
> > >> > > kdb_backtrace(c7f20b38,f19154fc,c0d586d5,f191550c,c7f20ae0,...) at kdb_backtrace+0x2a
> > >> > > vfs_badlock(c101b180,f191550c,c1055580,c7f20ae0) at vfs_badlock+0x23
> > >> > > assert_vop_elocked(c7f20ae0,c0ee5f4f,c09f3213,8,0,...) at assert_vop_elocked+0x55
> > >> > > pcbmap(c7966e00,0,f191560c,f1915618,f191561c,...) at pcbmap+0x45
> > >> > > msdosfs_readdir(f1915960,c0f4b343,c7f20ae0,f1915940,0,...) at msdosfs_readdir+0x528
> > >> > > VOP_READDIR_APV(c101b180,f1915960,2,f1915a68,c7923000,...) at VOP_READDIR_APV+0xc5
> > >> > > nfsrvd_readdir(f1915b64,0,c7f20ae0,c7923000,f1915a68,...) at nfsrvd_readdir+0x38e
> > >> > > nfsrvd_dorpc(f1915b64,0,c7923000,c842a200,4,...) at nfsrvd_dorpc+0x1f79
> > >> > > nfssvc_program(c7793800,c842a200,c0f24d67,492,0,...) at nfssvc_program+0x40f
> > >> > > svc_run_internal(f1915d14,c09d9a98,c73dfa80,f1915d28,c0ef1130,...) at svc_run_internal+0x952
> > >> > > svc_thread_start(c73dfa80,f1915d28,c0ef1130,3a5,c7e4b2c0,...) at svc_thread_start+0x10
> > >> > > fork_exit(c0bed7d0,c73dfa80,f1915d28) at fork_exit+0xb8
> > >> > > fork_trampoline() at fork_trampoline+0x8
> > >> > > --- trap 0x804c12e, eip = 0xc, esp = 0x33, ebp = 0x1 ---
> > >> > > pcbmap: 0xc7f20ae0 is not exclusive locked but should be
> > >> > > KDB: enter: lock violation
> > >> > >
> > >> > > So, does anyone know if msdosfs_readdir() really requires an
> > >> > > LK_EXCLUSIVE locked vnode, or is the ASSERT_VOP_ELOCKED() in
> > >> > > pcbmap() too strong?
> > >> >
> > >> > Yes, msdosfs currently requires all vnode locks to be
> > >> > exclusive. One of the reasons is that each denode (the
> > >> > msdosfs-private vnode data) carries the FAT entries cache, and
> > >> > this cache is updated even by operations that do not modify the
> > >> > vnode from the VFS point of view.
> > >> >
> > >> > The locking regime is enforced by getnewvnode() initializing
> > >> > the vnode lock with the LK_NOSHARE flag, and by the msdosfs
> > >> > code not calling VN_LOCK_ASHARE() on the newly instantiated
> > >> > vnode.
> > >> >
> > >> > My question is, was the vnode in question locked at all?
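(For illustration, a minimal sketch of the opt-in described above. The
"examplefs" names and the surrounding error handling are invented for
the example; getnewvnode(), VN_LOCK_ASHARE(), and the LK_NOSHARE
behaviour are as described in the thread.)

	struct vnode *vp;
	int error;

	/*
	 * getnewvnode() initializes the vnode lock with LK_NOSHARE, so
	 * every new vnode starts out exclusive-only.
	 */
	error = getnewvnode("examplefs", mp, &examplefs_vnodeops, &vp);
	if (error != 0)
		return (error);

	/*
	 * A file system that can tolerate shared vnode locks clears the
	 * flag explicitly.  msdosfs deliberately skips this call, because
	 * even nominally read-only operations such as VOP_READDIR()
	 * update the per-denode FAT cache and so need LK_EXCLUSIVE.
	 */
	VN_LOCK_ASHARE(vp);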
> > >> I think the problem is that I do a LK_DOWNGRADE. From a quick
> > >> look at __lockmgr_args(), it doesn't check LK_NOSHARE for a
> > >> LK_DOWNGRADE.
> > >>
> > >> Maybe __lockmgr_args() should have something like:
> > >>     if (op == LK_DOWNGRADE && (lk->lock_object.lo_flags & LK_NOSHARE))
> > >>         return (0); /* noop */
> > >> after the
> > >>     if (op == LK_SHARED && (lk->lock_object.lo_flags & LK_NOSHARE))
> > >>         op = LK_EXCLUSIVE;
> > >> lines?
> > >
> > > The RELENG_7 lockmgr does not check the NOSHARE flag on
> > > downgrade, but I agree with the essence of your proposal.
> >
> > As long as the difference in semantics from the old lockmgr is
> > correctly called out in the documentation (and possibly in
> > comments), I'm fine with this change.
>
> I think it is a bug in the LK_NOSHARE implementation if the old
> lockmgr() didn't silently nop downgrade requests when LK_NOSHARE was
> set. :) We should definitely fix it to ignore downgrades for
> LK_NOSHARE.
>
By the way, I think that __lockmgr_args() in -current doesn't check for
LK_NOSHARE on a LK_DOWNGRADE either; that is what pho@ was testing when
he found the problem.

At this point, I believe that the new NFS server (for which I have a
patch, now being tested by pho@, that avoids the LK_DOWNGRADE) is the
only place that is broken. (compute_cn_lkflags() only sets LK_SHARED if
MNT_LOOKUP_SHARED is set, and the only LK_DOWNGRADE I see is in
vfs_cache.c, when cn_lkflags == LK_SHARED. The rest are in file systems
that handle LK_SHARED locked vnodes, from what I can see at a glance.)

So it isn't a difference between the old and current behaviour, just a
suggestion that adding a check to __lockmgr_args() might be a nice
safety belt for the future, since __lockmgr_args() already checks for
the LK_SHARED case.

rick, who will get the fix for the new NFS server to re@ soon
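(For reference, a sketch of how the two checks discussed above would sit
together near the top of __lockmgr_args() in sys/kern/kern_lock.c. The
surrounding lines paraphrase the real code rather than quoting it
exactly; only the second test is the proposed addition.)

	op = (flags & LK_TYPE_MASK);

	/* Existing behaviour: LK_NOSHARE turns shared requests exclusive. */
	if (op == LK_SHARED && (lk->lock_object.lo_flags & LK_NOSHARE))
		op = LK_EXCLUSIVE;

	/*
	 * Proposed: an LK_NOSHARE lock is only ever held exclusive, so
	 * treat a downgrade as a successful no-op.  The lock simply stays
	 * exclusive, which satisfies assertions such as the
	 * ASSERT_VOP_ELOCKED() in pcbmap().
	 */
	if (op == LK_DOWNGRADE && (lk->lock_object.lo_flags & LK_NOSHARE))
		return (0);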