From owner-freebsd-hackers Sat Feb 20 14:52:54 1999
Delivered-To: freebsd-hackers@freebsd.org
Received: from smtp02.primenet.com (smtp02.primenet.com [206.165.6.132])
	by hub.freebsd.org (Postfix) with ESMTP id B98CB11A71
	for ; Sat, 20 Feb 1999 14:52:35 -0800 (PST)
	(envelope-from tlambert@usr08.primenet.com)
Received: (from daemon@localhost)
	by smtp02.primenet.com (8.8.8/8.8.8) id PAA20948;
	Sat, 20 Feb 1999 15:52:34 -0700 (MST)
Received: from usr08.primenet.com(206.165.6.208) via SMTP by
	smtp02.primenet.com, id smtpd020893; Sat Feb 20 15:52:27 1999
Received: (from tlambert@localhost)
	by usr08.primenet.com (8.8.5/8.8.5) id PAA19845;
	Sat, 20 Feb 1999 15:52:21 -0700 (MST)
From: Terry Lambert
Message-Id: <199902202252.PAA19845@usr08.primenet.com>
Subject: Re: Panic in FFS/4.0 as of yesterday
To: dfr@nlsystems.com (Doug Rabson)
Date: Sat, 20 Feb 1999 22:52:20 +0000 (GMT)
Cc: dillon@apollo.backplane.com, freebsd-hackers@FreeBSD.ORG
In-Reply-To: from "Doug Rabson" at Feb 20, 99 09:35:40 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> I always thought that the vnodes were locked that way during lookup to
> avoid more serious problems but I have never done the analysis to figure
> it out.  Certainly there are some tricky cases in the way that lookup is
> used to prepare for a subsequent create or rename (but that isn't the
> issue here I think).

See the rename code.

> If it works, then changing lookup to not require locks on both vnodes at
> the same time would be a good thing.  One of the reasons that NFS doesn't
> have proper node locks is that a dead NFS server can lead to a hung
> machine through a lock cascade from the NFS mount point.

The correct way to do this, IMO, is a back-off/retry, which would unlock
the lock and queue the operation for a retry that reacquires the lock.

I swear I saw code in NFS to do this.  Maybe it was pre-4.4BSD.

> > Maybe.  The difference is that the I/O topology is not known at the
> > higher levels where the I/O gets queued, so it would be more difficult
> > to calculate what the async limit should be in a scaleable way.
>
> Understood.  I played with a few non-solutions, limiting i/o on a mount
> point and on a vnode to an arbitrary limit, but wasn't able to make a real
> difference to the responsiveness of the test.
>
> It does seem wrong that a single writer process can generate arbitrary
> amounts of latency (essentially only bounded by the number of available
> buffers) for other clients on the same drive.  Ideally the driver should be
> able to propagate its 'queue full' signals up to the bio system, but I
> can't see a way of doing that easily in the current code.

If the queue gets full, then the disk is busy, in which case you could
convert to using sync writes for all pending directory operations.

It should be pretty easy to do this kludge by (1) deciding how many
waiters is "too many", and (2) checking whether the mount is async.
This would avoid propagating the changes up.

I really think, though, that the correct fix is to flag the async
writes for directory data in the code, and then, when you do the lower
level queue insertion, insert them ahead, per my other posting.

I personally like the name "EXPEDITE" for the flag.  8-).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my
present or previous employers.
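
As a minimal user-space illustration of the back-off/retry idea above,
the fragment below uses POSIX mutexes in place of vnode locks.  The
helper name lock_pair() is made up for this sketch, and a real kernel
version would also have to re-run the lookup after dropping the first
lock, since the vnodes may have changed underneath it.

#include <pthread.h>
#include <sched.h>

/*
 * Back-off/retry sketch: take the first lock, try the second without
 * blocking, and if that fails drop the first lock and retry, so two
 * threads locking the same pair in opposite orders can never deadlock.
 */
static void
lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	for (;;) {
		pthread_mutex_lock(a);
		if (pthread_mutex_trylock(b) == 0)
			return;			/* got both locks */
		pthread_mutex_unlock(a);	/* back off */
		sched_yield();			/* let the other side run */
	}
}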
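
A rough sketch of the sync-write kludge, with the threshold, the waiter
count, and the helper name all invented for illustration rather than
taken from the real mount or buffer code:

/*
 * Hypothetical decision point where a directory write is queued: if
 * the mount is async and "too many" I/Os are already waiting on the
 * device, do this write synchronously instead, so directory operations
 * can't hide behind an arbitrarily deep async backlog.
 */
#define MY_TOO_MANY_WAITERS	64	/* arbitrary "too many" cutoff */

static int
my_dirwrite_should_be_sync(int mount_is_async, int queued_waiters)
{
	if (mount_is_async && queued_waiters > MY_TOO_MANY_WAITERS)
		return (1);	/* fall back to a synchronous write */
	return (0);		/* keep the normal async behaviour */
}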
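
And a sketch of the "EXPEDITE" idea: tag directory-data writes with a
flag and, at the low-level queue insertion, put flagged buffers ahead
of ordinary async data while keeping them FIFO among themselves.  The
structure, flag name, and queue discipline here are illustrative only,
not the actual buf queue code:

#include <sys/queue.h>

/* Illustrative buffer with a hypothetical EXPEDITE-style flag. */
#define MY_B_EXPEDITE	0x0001

struct my_buf {
	int			flags;
	TAILQ_ENTRY(my_buf)	link;
};
TAILQ_HEAD(my_bufq, my_buf);

/*
 * Expedited buffers go after any expedited buffers already queued but
 * ahead of all ordinary buffers; everything else goes to the tail.
 */
static void
my_bufq_insert(struct my_bufq *q, struct my_buf *bp)
{
	struct my_buf *it;

	if ((bp->flags & MY_B_EXPEDITE) == 0) {
		TAILQ_INSERT_TAIL(q, bp, link);
		return;
	}
	TAILQ_FOREACH(it, q, link) {
		if ((it->flags & MY_B_EXPEDITE) == 0) {
			TAILQ_INSERT_BEFORE(it, bp, link);
			return;
		}
	}
	TAILQ_INSERT_TAIL(q, bp, link);
}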
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message