From owner-freebsd-fs@FreeBSD.ORG Wed Aug 23 20:06:56 2006 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C684316A4E0 for ; Wed, 23 Aug 2006 20:06:56 +0000 (UTC) (envelope-from Tor.Egge@cvsup.no.freebsd.org) Received: from pil.idi.ntnu.no (pil.idi.ntnu.no [129.241.107.93]) by mx1.FreeBSD.org (Postfix) with ESMTP id 87C2F43D4C for ; Wed, 23 Aug 2006 20:06:55 +0000 (GMT) (envelope-from Tor.Egge@cvsup.no.freebsd.org) Received: from cvsup.no.freebsd.org (c2h5oh.idi.ntnu.no [129.241.103.69]) by pil.idi.ntnu.no (8.13.6/8.13.1) with ESMTP id k7NK6r4d028075 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Wed, 23 Aug 2006 22:06:54 +0200 (MEST) Received: from localhost (localhost [127.0.0.1]) by cvsup.no.freebsd.org (8.13.4/8.13.4) with ESMTP id k7NK6pSA065827; Wed, 23 Aug 2006 20:06:53 GMT (envelope-from Tor.Egge@cvsup.no.freebsd.org) Date: Wed, 23 Aug 2006 20:06:46 +0000 (UTC) Message-Id: <20060823.200646.74691347.Tor.Egge@cvsup.no.freebsd.org> To: bde@zeta.org.au From: Tor Egge In-Reply-To: <20060824031104.B64391@delplex.bde.org> References: <20060823110808.GD64800@deviant.kiev.zoral.com.ua> <20060823.154718.126633648.Tor.Egge@cvsup.no.freebsd.org> <20060824031104.B64391@delplex.bde.org> X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned-By: mimedefang.idi.ntnu.no, using CLAMD X-SMTP-From: Sender=, Relay/Client=c2h5oh.idi.ntnu.no [129.241.103.69], EHLO=cvsup.no.freebsd.org X-Scanned-By: MIMEDefang 2.48 on 129.241.107.38 X-Scanned-By: mimedefang.idi.ntnu.no, using MIMEDefang 2.48 with local filter 16.42-idi X-Filter-Time: 0 seconds Cc: freebsd-fs@freebsd.org Subject: Re: Deadlock between nfsd and snapshots. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Aug 2006 20:06:56 -0000 > I found some cases where i_flag, timestamps and probably other parts of the > vnode seem to be accessed without any proper locks (maybe a refcount) :-(. > See another (private) reply and below. > > I don't understand how the vnode interlock can help much for interoperation > with code which uses vnode locks. Aren't these locks almost independent of > each other, so if the vnode lock is already held then acquiring the vnode > interlock wouldn't block? So to protect iflag with the interlock you > would have to add the interlock to mounds of code that just uses the vnode > lock now. the vnode interlock is used to resolve between those holding a shared lock. There are 4 locking protocols for the protection of i_flags and the timestamps: 1. Previous locking protocol: Giant 2. Current locking protocol: vnode lock. This breaks down when processes with a shared vnode lock perform changes (e.g. sets flags or timestamps). 3. Proposed locking protocol: exclusive vnode lock or shared vnode lock plus the vnode interlock. 4. Alternate locking protocol: vnode interlock. Further code changes might be needed for this to work properly, e.g. IN_MODIFIED might have to be set before making changes to the inode instead of aftewards. > I understand this locking :-). ffs_sync() actually uses only uses the vnode > interlock to access i_flag. I think this is intentionally quick and not > quite right -- there is a comment a few lines before it saying that we depend > on a mntvnode lock to keep things stable enough for a quick test. The scope > of the comment is unclear. I think the quick test is only good enough for > sync(2). The correct unoptimized version of ffs_sync() would have the vnode lock before checking i_flag. When this loop executes as part of suspending the file system, MNTK_SUSPEND is set. No code is executing within regions protected by vn_start_write() and new calls to vn_start_write() will fail or block. Calls to vn_start_secondary_write() might succeed but will then trigger a retry of the whole vnode sync loop. While not having the vnode lock when checking i_flag in ffs_sync() opens a race, the conseqeuences are limited due to other locking and the restart. - Tor Egge