Date: Thu, 24 Aug 2006 01:39:48 +1000 (EST) From: Bruce Evans <bde@zeta.org.au> To: Kostik Belousov <kostikbel@gmail.com> Cc: freebsd-fs@freebsd.org, Tor Egge <Tor.Egge@cvsup.no.freebsd.org> Subject: Re: Deadlock between nfsd and snapshots. Message-ID: <20060824003453.M63627@delplex.bde.org> In-Reply-To: <20060823044043.GA64800@deviant.kiev.zoral.com.ua> References: <20060821.132151.41668008.Tor.Egge@cvsup.no.freebsd.org> <20060822175540.V58720@delplex.bde.org> <20060822130743.GL56637@deviant.kiev.zoral.com.ua> <20060822.214638.74697110.Tor.Egge@cvsup.no.freebsd.org> <20060823044043.GA64800@deviant.kiev.zoral.com.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 23 Aug 2006, Kostik Belousov wrote: > On Tue, Aug 22, 2006 at 09:46:38PM +0000, Tor Egge wrote: >>> 2. All places that currently set IN_ACCESS, instead would increment >>> i_accessed using the atomic ops. ufs_itimes shall update i_access >>> under some mutex if i_accessed is greater than zero. >> >> Protecting the existing i_flag and the timestamps with the vnode >> interlock when the current thread only has a shared vnode lock should >> be sufficient to protect against the races, removing the need for #3, >> #4 and #4 below. You asked about this in a later reply (the one with the patch). This seems wrong to me. I think MNT_ILOCK() (like you used) is sufficient, but you should just use a nonblocking vn_start_write() to avoid knowing about the internals of vn_start_write(). If the shared (or whatever) vnode lock is insufficient, then there are much larger, much older bugs. Inodes are accessed a lot with just the the vnode lock, and the vnode interlock here won't affect races with most other accesses. >> What's left is avoiding setting IN_MODIFIED when it's unsafe, to >> protect against the deadlock. > > So, I will do the following: > > 1. Protect both setting and reading inode times and i_flag with vnode > interlock. This shall be done through all the sys/ufs/*/* code. The patch doesn't change so much. Most places shouldn't need changing. > 2. Modify ufs_itimes: >> If neither IN_CHANGE nor IN_UPDATE is set then it might be unsafe to >> set IN_MODIFIED since the file system might be suspended or in the >> process of being suspended with the vnode sync loop in ffs_sync() >> having iterated past the vnode. > In other words, if IN_CHANGE or IN_UPDATE are already set, I can > safely convert IN_ACCESS into IN_MOD. Not quite. IN_ACCESS can also be handled if IN_MODIFIED is already set, even if neither IN_CHANGE nor IN_UPDATE is set. All this is when the file system is suspended; otherwise we can do anything to the inode. In all cases, we depend on the inode not changing underneath us. I think ordinary vnode locking gives that, just like it did before suspension existed. > Otherwise, I shall implemented the algorithm below. Suspending/suspended > checks need to take MNT_ILOCK. >> ... >> When ufs_itimes() cannot set IN_MODIFIED then it has to either risk >> losing the access time update or use some mechanism to defer it (e.g. >> set IN_LAZYMOD or a new flag and let process_deferred_inactive() set >> IN_MODIFIED after the file system has been resumed). >> > BTW, shall the test for MNT_RDONLY in the ufs_itimes moved earlier ? Probably not. It is supposed to be earlier, but is misplaced to work around bugs in the MNT_RDONLY case (this case should never set a flag or a timestamp, but in fact does set flags to work around bugs elsewhere). > 3. Add the process_deferred_lazymod procedure, called from ffs_snapshot > before proc_deferred_inactive, that shall convert IN_LAZYMOD | IN_ACCESS > into IN_MODIFIED. To be safe, the proc_def_lazymod needs vn_start_write braces. I think you should just ignore IN_ACCESS when it cannot be converted to a timestamp, or use IN_LAZYMOD. ufs_inactive() needs to do something with IN_LAZYMOD other than blindly turn it into IN_MODIFIED (but I believe the problem case is unreachable in -current -- see other mail). ffs_update() clears IN_MODIFIED. I now think this and other clearings in ffs_update() are safe. vnode locking should make it safe to change the inode, and it doesn't matter if the file system is suspended (or being suspended) provided IN_MODIFIED is not set when it shouldn't be. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060824003453.M63627>