From owner-freebsd-hackers Wed Dec 16 22:54:00 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id WAA14142
          for freebsd-hackers-outgoing; Wed, 16 Dec 1998 22:52:19 -0800 (PST)
          (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from smtpott1.nortel.ca (smtpott1.nortel.ca [192.58.194.78])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id WAA14135
          for ; Wed, 16 Dec 1998 22:52:16 -0800 (PST)
          (envelope-from Andrew.Atrens.atrens@nortelnetworks.com)
Received: from zcars01t by smtpott1; Thu, 17 Dec 1998 01:51:53 -0500
Received: from wmerh01z.ca.nortel.com by zcars01t;
          Thu, 17 Dec 1998 01:50:16 -0500
Received: from nortel.ca (atrens@nortel.ca@wmerh01z)
          by wmerh01z.ca.nortel.com with ESMTP (8.7.1/8.7.1) id BAA06053;
          Thu, 17 Dec 1998 01:50:15 -0500 (EST)
Message-ID: <3678AC2F.2DC85099@nortel.ca>
Date: Thu, 17 Dec 1998 02:01:04 -0500
From: "Andrew Atrens"
Reply-To: "Andrew Atrens"
Organization: Nortel Networks ( formerly Bell-Northern Research )
X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 3.0-CURRENT i386)
X-Accept-Language: en
MIME-Version: 1.0
To: hackers@FreeBSD.ORG
CC: David G Andersen, Karl Denninger, bright@hotjobs.com
Subject: Re: yup, found it (NFS)
References:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Whooops! Looks like the Nortel (Microsoft Exchange) gateway had some fun
with my email address on my FreeBSD box (that I use from home)...

Sorry guys, the message below is actually from _me_. Please don't bother
Mr. MacPherson; he must rue the day I chose to use my name (as account
name) on this box...

Andrew.

"Macpherson, Andrew (A.)
[EXCHANGE:HAL02:HM00-I:NT]" wrote:
>
> On Wed, 16 Dec 1998, David G Andersen wrote:
>
> > Date: Wed, 16 Dec 1998 22:23:53 -0700 (MST)
> > From: David G Andersen
> > To: Karl Denninger
> > Cc: bright@hotjobs.com, hackers@FreeBSD.ORG
> > Subject: Re: yup, found it (NFS)
> >
> > Lo and behold, Karl Denninger once said:
> > >
> > > On Wed, Dec 16, 1998 at 11:51:39PM -0500, Alfred Perlstein wrote:
> > > > On Wed, 16 Dec 1998, Karl Denninger wrote:
> > > >
> > > > > Remove the intr for now.  If that fixes it then at least we have
> > > > > hard proof of where it is.
> >
> > It does.  You may wish to look at PR kern/8732, which we opened about a
> > month ago on exactly this topic.
>
> Yep, I got bit by this while using amd. Updating amd seemed to reduce the
> frequency of the freezes; however, one type of freeze was very easy to
> reproduce. While editing a file on an NFS partition with XEmacs, the
> system would consistently lock when XEmacs attempted to auto-save the
> document... it was doing a write to an NFS disk from a SIGALRM handler.
> Alfred's pine behaviour sounds like it might be similar.
>
> David suggested I toast my nfsiod's and since then the system's been
> rock-solid.
>
> As for mount options, I have `intr' enabled...
>
> I wonder if this PR is one of the deadlocks that Matt Dillon referred to
> in his recent mail to the list...
>
> Andrew
>
> >
> > > > > cause.  This of course assumes you mount executable directories (very
> > > > > common in clusters) across NFS.
> >
> > Interesting.  We didn't bump into this one, but my test program didn't
> > check for it - only for the buffer flushing.
> >
> > > > > Certainly the expected execution path is basically the same, and I can
> > > > > *trigger it* with a SIGINT to a running process which happens to have some
> > > > > of its working set paged out at the time it receives the signal (ouch!)
> > > >
> > > > That doesn't seem very good at all.  Is this second case for all
> > > > NFS mounts? or only intr mounts?
> >
> > If it's like the bug we found (which I'd wager), it's probably for intr
> > mounts.  Like we mention in the PR, the problem seems to be related to the
> > change from sleep to an interruptible tsleep.
> >
> > > What I want to know is whether a "ro,soft" mount has the same
> > > vulnerability.  We use them around here for things like mounting
> > > the Usenet spool.
> >
> > Nope.  Soft doesn't seem to affect it (at least, the last time I tested
> > it).  Another cheap fix is to not run any nfsiods, preventing the
> > asynchronous flush from occurring in the first place.
> >
> > We've been hounding on this PR for a while (that's kern/8732. :), and
> > would love to see a resolution for it.  If someone wants to suggest the
> > proper behavior, I'm more than happy to start drudging up a fix.
> >
> >    -Dave
> >
> > --
> > work: danderse@cs.utah.edu                     me:  angio@pobox.com
> >       University of Utah                            http://www.angio.net/
> >       Department of Computer Science
> >
> > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > with "unsubscribe freebsd-hackers" in the body of the message
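
For anyone who wants to poke at the access pattern described in the thread
(an editor auto-saving by calling write() from a SIGALRM handler), a minimal
C sketch follows. It is an illustration only: it is not the test program
David mentions, and nothing in it comes from PR kern/8732. The helper name
autosave_from_alarm(), the 50 ms timer interval and the "autosave" payload
are all assumptions made for the example. Pointed at a file on a local disk
it simply completes; whether it provokes the reported freeze on an `intr'
NFS mount with nfsiods running has not been verified here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* State shared with the handler; sig_atomic_t keeps the accesses signal-safe. */
static volatile sig_atomic_t save_fd = -1;
static volatile sig_atomic_t ticks_seen = 0;
static volatile sig_atomic_t saves_done = 0;

/* SIGALRM handler: mimics an editor auto-save by writing from signal context.
 * write() and fsync() are both on the POSIX async-signal-safe list. */
static void on_alarm(int signo)
{
    static const char msg[] = "autosave\n";   /* assumed payload */
    (void)signo;
    ticks_seen++;
    if (save_fd >= 0 &&
        write(save_fd, msg, sizeof msg - 1) == (ssize_t)(sizeof msg - 1)) {
        fsync(save_fd);
        saves_done++;
    }
}

/* Hypothetical helper: perform `ticks` timer-driven writes to `path`.
 * Returns the number of successful handler writes, or -1 on setup failure. */
int autosave_from_alarm(const char *path, int ticks)
{
    struct sigaction sa, old_sa;
    struct itimerval tv, off;
    sigset_t block, prev, wait_mask;
    int fd;

    assert(path != NULL && ticks > 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    save_fd = fd;
    ticks_seen = 0;
    saves_done = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) < 0) {
        close(fd);
        return -1;
    }

    /* Keep SIGALRM blocked except inside sigsuspend(), so no tick is lost. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &prev);
    wait_mask = prev;
    sigdelset(&wait_mask, SIGALRM);

    memset(&tv, 0, sizeof tv);
    tv.it_interval.tv_usec = 50000;            /* assumed 50 ms period */
    tv.it_value.tv_usec = 50000;
    setitimer(ITIMER_REAL, &tv, NULL);

    while (ticks_seen < ticks)
        sigsuspend(&wait_mask);                /* handler runs here */

    /* Disarm, discard any still-pending SIGALRM, then restore old state. */
    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);
    sa.sa_handler = SIG_IGN;
    sigaction(SIGALRM, &sa, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &prev, NULL);

    save_fd = -1;
    close(fd);
    return (int)saves_done;
}
```

To try it, call autosave_from_alarm() from a small main() with a path of
your choosing (for instance a scratch file on the mount under test) and a
handful of ticks; the return value is the number of writes the handler
completed.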