Date: Thu, 25 Feb 2010 14:10:27 -0800
From: "Jake Holland" <jholland@fastsoft.com>
To: <freebsd-stable@freebsd.org>
Subject: RE: vfs deadlock during panic?
Message-ID: <D13CB108B048BD47B69C0CA1E0B5C03201113407@hq-es.FASTSOFT.COM>
In-Reply-To: <20100225090502.GA84181@icarus.home.lan>
References: <D13CB108B048BD47B69C0CA1E0B5C0320111335C@hq-es.FASTSOFT.COM> <20100225073317.GA82327@icarus.home.lan> <D13CB108B048BD47B69C0CA1E0B5C03201113360@hq-es.FASTSOFT.COM> <20100225090502.GA84181@icarus.home.lan>
> Are you sure one of the filesystems on the disk isn't corrupt?
> There's been reports of this problem in the past, but AFAIR it
> doesn't manifest itself in this manner.

Ah, thanks. Your comment spurred me to search for 'VOP_LOCK1_APV lock order reversal', instead of 'freebsd hang on panic', and I see now that this has been reported several times. I read a bunch of the threads, but it looks like there's no solution yet. But you're right that nobody else seems to be complaining about a rare hang-on-panic problem, either.

Anyway, I didn't see any of the threads that mentioned file system corruption, with the possible exception of http://lists.freebsd.org/pipermail/freebsd-current/2010-January/014786.html, which said that running fsck was what triggered the LORs. So I'm assuming this was a mis-remembered detail, unless you've got a better reference, and I'll take a rain check on re-installing everything on a new disk, for now.

But thanks for the comment, I do appreciate it, and it helped me realize what I should follow up on.

I guess my next step is to try to fix the vfs locking. I think I'll see what happens if I use a sx_lock instead of a mtx for BO_MTX to guard the block, so it won't care so much what the underlying file system does during vnode operations, for the file access. I assume that won't work, but maybe it's a start towards understanding what I do need.

The mounting one looks trickier, because the vn_lock looks rather confusing, and I'm really not sure what to do about the Giant dependencies it seems to have. But I guess maybe I'll see if there's a way to defer some of these operations to a working thread or something.

Not sure if I'll actually have the time to go that deep on this issue, and I'm unfortunately not certain it'll solve the hanging panic problem. I guess I can see why nobody fixed it yet. Oh well.

Thanks again for the suggestion. Maybe in light of the alternative it would be worth at least trying that separate disk idea after all. I have already seen something very similar on at least 2 different machines with different disks, but they came from the same dump/restore image, so maybe if it's because of fs corruption, there's a shared reason behind it.
