Date: Sun, 15 Mar 1998 21:55:24 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: karl@mcs.net (Karl Denninger)
Cc: tlambert@primenet.com, hackers@FreeBSD.ORG
Subject: Re: Odd problem we're seeing here
Message-ID: <199803152155.OAA13550@usr06.primenet.com>
In-Reply-To: <19980314185135.47922@mcs.net> from "Karl Denninger" at Mar 14, 98 06:51:35 pm
> What I'm trying to understand is why you're doing these in the order you are
> (instead of locking first, then doing the LEASE).
>
> I'm willing to attempt these (actually, only the one in the tty_tty.c file
> applies to most normal operations, correct?) but I'd like to *understand*
> how and why the order is as it is, given that it still looks "backwards"
> to me in terms of operation sequence.

OK.

If you look at the code around vn_read and vn_write, vs. the code around
vn_rdwr, you will see a pattern begin to emerge.

In all cases, when vn_read or vn_write is called, the vp is locked.

When vn_rdwr is called, it's not clear whether or not the vp is locked.
The answer is "it depends on who is doing the calling".

One of several problems here is order of operation; I happen to think
it's the most important of them:

If it's locked, you want to call VOP_LEASE before calling vn_rdwr,
because you know whether you are reading or writing before you call it.

If it's not locked, you want to call VOP_LEASE in vn_rdwr itself, after
the lock is asserted.

The problem in the second case becomes obvious when you go to try to add
the code to actually implement it.

For right now, I've punted. You don't sleep in the VOP_LEASE call path
with a partially complete operation that for whatever reason needs
atomicity, so it's not really dangerous to call it on unlocked vp's,
unless you have the possibility of another scheduling context active in
there at the same time. This can only happen in:

1)	SMP, if the kernel is reentrant and depending on object locks

2)	Kernel threading, if the kernel is thread reentrant and depending
	on object locks

3)	Kernel preemption, as an implementation detail for realtime
	processing, where the preemption depends on object locks

The punt is OK in this case, because of the other places vn_rdwr is used.
The only real fix you could apply would be to get rid of vn_rdwr entirely.
I have not made it more broken than it was.
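
The two orderings described above can be sketched in userland C. This is only a mock under stated assumptions, not the kernel code: struct mock_vnode, mock_lock, mock_unlock, mock_lease, mock_io, locked_caller_path and unlocked_caller_path are all hypothetical names invented for illustration, and nothing here uses the real VOP_LEASE or vn_rdwr signatures. It just records the order of steps so the two cases can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; NOT the FreeBSD kernel definitions. */
enum rw { RW_READ, RW_WRITE };
enum step { STEP_LOCK, STEP_LEASE, STEP_IO, STEP_UNLOCK };

#define MAX_STEPS 8

struct mock_vnode {
	bool locked;                 /* is the mock vnode lock held? */
	enum step trace[MAX_STEPS];  /* order in which steps happened */
	size_t ntrace;
};

static void record(struct mock_vnode *vp, enum step s)
{
	assert(vp->ntrace < MAX_STEPS);
	vp->trace[vp->ntrace++] = s;
}

static void mock_lock(struct mock_vnode *vp)
{
	assert(!vp->locked);
	vp->locked = true;
	record(vp, STEP_LOCK);
}

static void mock_unlock(struct mock_vnode *vp)
{
	assert(vp->locked);
	vp->locked = false;
	record(vp, STEP_UNLOCK);
}

/* Stand-in for the lease call; it needs to know read vs. write. */
static void mock_lease(struct mock_vnode *vp, enum rw dir)
{
	(void)dir;
	record(vp, STEP_LEASE);
}

/* Stand-in for the actual transfer; insists that the lock is held. */
static void mock_io(struct mock_vnode *vp, enum rw dir)
{
	(void)dir;
	assert(vp->locked);
	record(vp, STEP_IO);
}

/*
 * Case 1: the caller already holds the lock and already knows the
 * direction, so the lease call is made before the read/write routine.
 */
void locked_caller_path(struct mock_vnode *vp, enum rw dir)
{
	assert(vp->locked);
	mock_lease(vp, dir);
	mock_io(vp, dir);
}

/*
 * Case 2: the caller does not hold the lock, so the read/write routine
 * takes it and makes the lease call only after the lock is asserted.
 */
void unlocked_caller_path(struct mock_vnode *vp, enum rw dir)
{
	mock_lock(vp);
	mock_lease(vp, dir);
	mock_io(vp, dir);
	mock_unlock(vp);
}
```

In both paths of this mock the lease step ends up after the lock and before the I/O; the only difference is who is responsible for making the call. The sketch deliberately leaves out the hard part mentioned above, namely a single routine that cannot tell which kind of caller it has.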

Right now, this means NFS is unstable in SMP (at least it has been
empirically so on my SMP box), and the patches for additional lease
calling at least follow the dictum "above all else, do no harm", and
*should* help in the UP case.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message