Date: Sat, 30 Nov 2002 15:15:43 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Robert Watson <rwatson@freebsd.org>
Cc: Michal Mertl <mime@traveller.cz>, current@freebsd.org
Subject: Re: system locks with vnode backed md(4)
Message-ID: <3DE9469F.CAC6CB82@mindspring.com>
References: <Pine.NEB.3.96L.1021130130417.77446C-100000@fledge.watson.org>
Robert Watson wrote:
> On Sat, 30 Nov 2002, Michal Mertl wrote:
> > I'm now unable to make it dead-lock again.  Yet it happened quite easily.
> > I had more md backing files in the same directory at the beginning (to
> > test Terry's suspicion mentioned in thread 'jail' on hackers@).
>
> I've noticed that chroot() environments tend to make existing deadlock
> opportunities more likely.  I'm not quite sure why that is. :-)

Lock to parent.  It's the same reason you can lock up if you use
automount, with all the automount mount points happening in the same
subdirectory.

> There are a fair number of vnode locking deadlock scenarios that are
> unavoidable where we rely on grabbing vnode locks out of the directory
> structure lock order.  This occurs for vnode-backed md devices, quotas,
> and UFS1 extended attributes, and probably some other situations.  I
> suspect that Terry is correct that operations on the vnode backing file
> storage directory are triggering the problem, since that increases the
> chances that a vnode lock "race to root" will occur from both the file
> system backed into the md device, and for the md backing vnodes during
> blocking I/O.

See other postings.  The "race to root" is the one I was originally
commenting on.  I'm not sure that it applies in this case; I think this
case might be the "out of memory to create new soft dependencies" case,
where you can end up holding a lock on a buffer that needs to be flushed
to recover memory, until you can satisfy the request to create a
dependency (a starvation deadlock).  The "race to root" is a "deadly
embrace" deadlock.

> If you can avoid directory operations on the md backing
> directory, that would probably be one way to avoid triggering the bug.

Yes.  By placing each vnconfig'ed device in its own subdirectory, you
avoid them.  There's still a window when your host OS does its own
traversal, but that's (effectively) a "whole FS lock", so it doesn't
trigger a problem.

> Seeing it reproduced would probably confirm that this is the case.

It's a pain.  Without a box I could wipe and turn into a scratch box, I
wasted a couple of days trying to reproduce it, with little luck.  I
think it requires reproducing the failing box in detail, which I wasn't
willing to do (hence the workaround).

> On the
> other hand, there may be other deadlocks in the vnode/ufs/md code that
> can be more easily corrected than this general VFS problem, so details
> there would be very useful.

There are a number of them; they are all a pain.  It's really tempting
to just refactor the code so that all locking occurs at the same logical
layer, without locks being held across function calls.  That'd be a heck
of a lot of work, though... probably worth it, in the end.

-- Terry
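
For illustration, a minimal userspace C sketch of the "deadly embrace"
pattern described above: two threads take the same pair of locks in
opposite order, standing in for two code paths racing up toward a common
directory vnode.  The lock and thread names are hypothetical stand-ins;
the real kernel paths hold vnode locks rather than pthread mutexes, but
the shape of the deadlock is the same.

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /*
     * Stand-ins for two vnode locks: the md backing file's directory,
     * and a directory inside the file system mounted on the md device.
     * (Hypothetical names for illustration only.)
     */
    static pthread_mutex_t backing_dir_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t md_fs_dir_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Path A: lookup in the md-mounted fs, then I/O to the backing file. */
    static void *
    path_a(void *arg)
    {
            (void)arg;
            pthread_mutex_lock(&md_fs_dir_lock);    /* holds lock 1 */
            sleep(1);                               /* widen the race window */
            pthread_mutex_lock(&backing_dir_lock);  /* blocks: B holds lock 2 */
            pthread_mutex_unlock(&backing_dir_lock);
            pthread_mutex_unlock(&md_fs_dir_lock);
            return (NULL);
    }

    /* Path B: directory op on the backing dir, then touches the md fs. */
    static void *
    path_b(void *arg)
    {
            (void)arg;
            pthread_mutex_lock(&backing_dir_lock);  /* holds lock 2 */
            sleep(1);
            pthread_mutex_lock(&md_fs_dir_lock);    /* blocks: A holds lock 1 */
            pthread_mutex_unlock(&md_fs_dir_lock);
            pthread_mutex_unlock(&backing_dir_lock);
            return (NULL);
    }

    int
    main(void)
    {
            pthread_t a, b;

            pthread_create(&a, NULL, path_a, NULL);
            pthread_create(&b, NULL, path_b, NULL);
            pthread_join(a, NULL);  /* never returns: deadly embrace */
            pthread_join(b, NULL);
            printf("not reached\n");
            return (0);
    }

Enforcing a single acquisition order for the two locks, or never holding
the first lock across the operation that takes the second (the
refactoring Terry mentions), removes the cycle.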