From: Terry Lambert
Message-Id: <199803152155.OAA13550@usr06.primenet.com>
Subject: Re: Odd problem we're seeing here
To: karl@mcs.net (Karl Denninger)
Date: Sun, 15 Mar 1998 21:55:24 +0000 (GMT)
Cc: tlambert@primenet.com, hackers@FreeBSD.ORG
In-Reply-To: <19980314185135.47922@mcs.net> from "Karl Denninger" at Mar 14, 98 06:51:35 pm

> What I'm trying to understand is why you're doing these in the order you are
> (instead of locking first, then doing the LEASE).
>
> I'm willing to attempt these (actually, only the one in the tty_tty.c file
> applies to most normal operations, correct?) but I'd like to *understand*
> how and why the order is as it is, given that it still looks "backwards"
> to me in terms of operation sequence.

OK.

If you look at the code around the vn_read and vn_write, vs. the code
around vn_rdwr, you will see a pattern begin to emerge.

In all cases, when vn_read or vn_write is called, the vp is locked.
When vn_rdwr is called, it's not clear whether or not the vp is locked.
The answer is "it depends on who is doing the calling".

One of several problems here is order of operation; I happen to think
it's the most important of them:

	If it's locked, you want to call VOP_LEASE before calling
	vn_rdwr, because you know whether you are reading or writing
	before you call.

	If it's not locked, you want to call VOP_LEASE in vn_rdwr
	itself, after the lock is asserted.

The problem in the second becomes obvious when you go to try to add
the code to actually implement it. For right now, I've punted.

You don't sleep with a partially complete operation that, for whatever
reason, needs atomicity in the VOP_LEASE call path, so it's not really
dangerous to call it on unlocked vps, unless you have the possibility
of another scheduling context active in there at the same time. This
can only happen in:

1)	SMP, if the kernel is reentrant and depending on object locks

2)	Kernel threading, if the kernel is thread reentrant and
	depending on object locks

3)	Kernel preemption, as an implementation detail for realtime
	processing, where the preemption depends on object locks

The punt is OK in this case, because of the other places vn_rdwr is
used. The only real fix you could apply would be to get rid of vn_rdwr
entirely. I have not made it more broken than it was.

Right now, this means NFS is unstable in SMP (at least it has been
empirically so on my SMP box), and the patches for additional lease
calling at least follow the dictum "above all else, do no harm", and
*should* help in the UP case.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
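
As an illustration of the two orderings described in the message, here
is a minimal user-space sketch in C. It is not kernel code and not the
actual patch: every name in it (mock_vnode, mock_lock, mock_lease,
mock_rdwr, the caller_locked flag) is a hypothetical stand-in invented
for this example. It also assumes that, in the caller-locked case, the
caller takes the lease after acquiring its own lock; the message does
not settle that detail. Each step appends one character to a trace so
the order of operations can be inspected.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins only; none of these are FreeBSD kernel interfaces. */

enum mock_rw { MOCK_READ, MOCK_WRITE };

struct mock_vnode {
    int  locked;     /* 1 while the mock lock is held */
    char trace[64];  /* one character per event, in order */
};

/* Append one event character to the trace, keeping it NUL-terminated. */
static void note(struct mock_vnode *vp, char c)
{
    size_t n = strlen(vp->trace);
    if (n + 1 < sizeof(vp->trace)) {
        vp->trace[n] = c;
        vp->trace[n + 1] = '\0';
    }
}

static void mock_lock(struct mock_vnode *vp)   { vp->locked = 1; note(vp, 'L'); }
static void mock_unlock(struct mock_vnode *vp) { vp->locked = 0; note(vp, 'U'); }

/* Stand-in for the lease call: 'E' if taken under the lock, 'e' if not. */
static void mock_lease(struct mock_vnode *vp, enum mock_rw rw)
{
    (void)rw;  /* a real lease call would need to know read vs. write */
    note(vp, vp->locked ? 'E' : 'e');
}

/* Stand-in for the actual transfer; it must run with the lock held. */
static void mock_io(struct mock_vnode *vp, enum mock_rw rw)
{
    assert(vp->locked);
    note(vp, rw == MOCK_READ ? 'R' : 'W');
}

/*
 * Stand-in for a vn_rdwr-like helper.  If the caller already holds the
 * lock, the caller is assumed to have taken the lease as well.  If not,
 * the helper locks first and only then takes the lease.
 */
static void mock_rdwr(struct mock_vnode *vp, enum mock_rw rw, int caller_locked)
{
    if (!caller_locked) {
        mock_lock(vp);
        mock_lease(vp, rw);
    }
    mock_io(vp, rw);
    if (!caller_locked)
        mock_unlock(vp);
}
```

A caller that already holds the lock would call mock_lock, mock_lease,
mock_rdwr(vp, MOCK_WRITE, 1) and mock_unlock in that order, leaving the
trace "LEWU". A caller that does not hold the lock would just call
mock_rdwr(vp, MOCK_READ, 0), leaving "LERU". In both cases the lease is
recorded as 'E', that is, taken while the lock is held.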